Finding Alpha Users

Today I created two parallel domains to test ways of driving new users to Noodlix. Both domains feature a simple landing page suitable for advertising. Users can read a brief synopsis of what the service aims to provide and then sign up for the site’s newsletter.

Site A: http://noodlix.com/

Noodlix Splash Page

Site B: http://countlessminds.com/

Countless Minds Splash Page

Under the hood, both sites *are* the same service. There are some minor CSS and domain name differences, but the content is identical. I created a Facebook Ad and two Google AdWords campaigns. So far I’ve had some interest from people I know directly, but until now I hadn’t promoted the idea very hard online.

On the backend, these forms just submit an email message, notifying me when someone has subscribed (or unsubscribed). I set an initial deadline of December 1 for the Alpha group. The plan is to iterate with whoever joins between now and the end of my NMotion prelaunch class.
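
For the curious, the backend really is that bare-bones. Here’s a rough sketch of the shape of it in Python (Flask and smtplib chosen purely for illustration; the route, addresses, and SMTP details are all hypothetical, not the actual implementation):

```python
import smtplib
from email.message import EmailMessage

from flask import Flask, request

app = Flask(__name__)

NOTIFY_ADDRESS = "me@example.com"  # placeholder address, not the real one

@app.route("/subscribe", methods=["POST"])
def subscribe():
    # The landing-page form posts a single email field; we just relay it.
    subscriber = request.form["email"]
    msg = EmailMessage()
    msg["Subject"] = "New alpha subscriber"
    msg["From"] = NOTIFY_ADDRESS
    msg["To"] = NOTIFY_ADDRESS
    msg.set_content(f"{subscriber} just subscribed.")
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
    return "Thanks! You're on the list."
```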

Posted in Development

Global Development Capacity

I’ve been experimenting recently on building tools to help programmers coordinate effort in new ways. While thinking about my core design problems this evening, I decided to calculate a rough estimate of the world’s development capacity in terms of raw source code produced per day.

Assumptions

  • 18 million software developers
  • 60 WPM – average typing speed per developer
  • 2 hours – time spent in a flow state per developer per day
  • 4 TiB HDD costs $116
    • http://www.newegg.com/Product/Product.aspx?Item=N82E16822178338

Calculations

  • 5 bytes/sec – maximum output bandwidth per developer
  • 35.2 KiB – maximum contribution per developer per day
  • 0.59 TiB – maximum global contribution per day
  • $17.10 – Cost to store all of our productive work each day
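
These numbers are easy to sanity-check in a few lines of Python (assuming the usual convention of 5 characters per word for typing speed):

```python
DEVELOPERS = 18e6     # assumed worldwide developer count
WPM = 60              # average typing speed
CHARS_PER_WORD = 5    # standard WPM convention
FLOW_HOURS = 2        # time in flow per developer per day
HDD_TIB = 4           # consumer drive capacity in TiB
HDD_COST = 116        # USD, per the Newegg listing above

bytes_per_sec = WPM * CHARS_PER_WORD / 60          # 5.0 B/s per developer
daily_per_dev = bytes_per_sec * FLOW_HOURS * 3600  # 36,000 B = ~35.2 KiB
daily_global = daily_per_dev * DEVELOPERS
daily_global_tib = daily_global / 2**40            # ~0.59 TiB

print(f"{daily_per_dev / 2**10:.1f} KiB per developer per day")
print(f"{daily_global_tib:.2f} TiB per day worldwide")
print(f"${daily_global_tib / HDD_TIB * HDD_COST:.2f} to store each day's output")
print(f"{daily_global / 86400 / 1e6:.1f} MB/sec around the clock")
```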

Conclusions

Software is eating the world at an average rate of perhaps 7.5 MB/sec.

A source code “blockchain” of *all* of our productive work completed each week would very nearly fit on a single 4 TiB HDD (about 4.1 TiB at the maximum rate). Within 10 years, we’ll be able to store the combined output of roughly 6 months of work on a single consumer drive (assuming ~100 TiB capacity and little change in productive output per day). Now, I’m not saying that we can fit all of the derivatives of that work on a single computer, but in terms of raw text produced per day, it seems quite doable.

What this means for our productivity is an interesting thought. With our bandwidth to the computer being so low, it seems obvious that we should be constantly looking for ways to leverage and learn from the work done by others. GitHub, Stack Overflow, and Slack are steps in the right direction, but I think it would be awesome if we had a global source code pool. I imagine such a system would provide a collective heartbeat of the development community. What do you think?

Posted in Development, Information Technology, Noodlix

Quick Pronunciation Guide for Base 256

Here’s a pronunciation guide for base 256 that I came up with during a Computer Science lecture circa 2002.

Definitions

Starting Consonants { b, g, h, j, k, l, n, p, q, r, s, t, v, w, y, z }
Vowels { a, e, i, o }
Ending Consonants { c, d, f, m }

Encoding Process

We encode each byte value as a single syllable. This makes expressing values natural and easy, because each syllable maps directly to a single *value*.

Letters are incremented in much the same way as numbers would be. The letters are given in ascending order above, so “bac”==0 and “zom”==255.

The starting consonant encodes the high-order nybble, and the vowel and ending consonant together encode the low-order nybble (the vowel supplies the high two bits, the ending consonant the low two). Taken together, the three letters encode a single byte.
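
Here’s a small Python sketch of the scheme (the helper names are mine, but the letter tables are exactly the ones defined above):

```python
STARTS = "bghjklnpqrstvwyz"  # high-order nybble (16 starting consonants)
VOWELS = "aeio"              # bits 3-2 of the low-order nybble
ENDS = "cdfm"                # bits 1-0 of the low-order nybble

def encode_byte(value: int) -> str:
    hi, lo = value >> 4, value & 0xF
    return STARTS[hi] + VOWELS[lo >> 2] + ENDS[lo & 0x3]

def decode_syllable(s: str) -> int:
    return (STARTS.index(s[0]) << 4) | (VOWELS.index(s[1]) << 2) | ENDS.index(s[2])

assert encode_byte(0) == "bac" and encode_byte(255) == "zom"
assert decode_syllable("jim") == 59
```

Chaining syllables lets you read whole byte strings aloud; 0xDEAD, for example, comes out as “wofsod”.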

Full Listing

0. bac
1. bad
2. baf
3. bam
4. bec
5. bed
6. bef
7. bem
8. bic
9. bid
10. bif
11. bim
12. boc
13. bod
14. bof
15. bom
16. gac
17. gad
18. gaf
19. gam
20. gec
21. ged
22. gef
23. gem
24. gic
25. gid
26. gif
27. gim
28. goc
29. god
30. gof
31. gom
32. hac
33. had
34. haf
35. ham
36. hec
37. hed
38. hef
39. hem
40. hic
41. hid
42. hif
43. him
44. hoc
45. hod
46. hof
47. hom
48. jac
49. jad
50. jaf
51. jam
52. jec
53. jed
54. jef
55. jem
56. jic
57. jid
58. jif
59. jim
60. joc
61. jod
62. jof
63. jom
64. kac
65. kad
66. kaf
67. kam
68. kec
69. ked
70. kef
71. kem
72. kic
73. kid
74. kif
75. kim
76. koc
77. kod
78. kof
79. kom
80. lac
81. lad
82. laf
83. lam
84. lec
85. led
86. lef
87. lem
88. lic
89. lid
90. lif
91. lim
92. loc
93. lod
94. lof
95. lom
96. nac
97. nad
98. naf
99. nam
100. nec
101. ned
102. nef
103. nem
104. nic
105. nid
106. nif
107. nim
108. noc
109. nod
110. nof
111. nom
112. pac
113. pad
114. paf
115. pam
116. pec
117. ped
118. pef
119. pem
120. pic
121. pid
122. pif
123. pim
124. poc
125. pod
126. pof
127. pom
128. qac
129. qad
130. qaf
131. qam
132. qec
133. qed
134. qef
135. qem
136. qic
137. qid
138. qif
139. qim
140. qoc
141. qod
142. qof
143. qom
144. rac
145. rad
146. raf
147. ram
148. rec
149. red
150. ref
151. rem
152. ric
153. rid
154. rif
155. rim
156. roc
157. rod
158. rof
159. rom
160. sac
161. sad
162. saf
163. sam
164. sec
165. sed
166. sef
167. sem
168. sic
169. sid
170. sif
171. sim
172. soc
173. sod
174. sof
175. som
176. tac
177. tad
178. taf
179. tam
180. tec
181. ted
182. tef
183. tem
184. tic
185. tid
186. tif
187. tim
188. toc
189. tod
190. tof
191. tom
192. vac
193. vad
194. vaf
195. vam
196. vec
197. ved
198. vef
199. vem
200. vic
201. vid
202. vif
203. vim
204. voc
205. vod
206. vof
207. vom
208. wac
209. wad
210. waf
211. wam
212. wec
213. wed
214. wef
215. wem
216. wic
217. wid
218. wif
219. wim
220. woc
221. wod
222. wof
223. wom
224. yac
225. yad
226. yaf
227. yam
228. yec
229. yed
230. yef
231. yem
232. yic
233. yid
234. yif
235. yim
236. yoc
237. yod
238. yof
239. yom
240. zac
241. zad
242. zaf
243. zam
244. zec
245. zed
246. zef
247. zem
248. zic
249. zid
250. zif
251. zim
252. zoc
253. zod
254. zof
255. zom

Posted in Development

Amblyopia Tetris

News of an ‘Amblyopia Tetris’ game hit the media a few months ago. Since then I’ve kicked around the idea of making a version of it in JavaScript. Today I ran across a baseline implementation of Tetris, so I went ahead and made the dream a reality! Enjoy!

The preview below shows how the game appears to someone wearing red/green 3D glasses. The right eye sees only the active piece, while the left eye sees the placed pieces.

I modified the version of Tetris posted by Jake Gordon a couple of years ago. His original post can be found here: JavaScript Tetris. You can find my fork of the project on GitHub.

Posted in Development, Games, Ideas

Processor Basics

I ran across a post today that begged for a couple of answers.

The Question

What I don’t understand is why there’s such reluctance towards making the die bigger. I understand the benefits of making the process smaller, but there has to be a limit somewhere, and it would seem easier/more cost effective to just make the die bigger. Who’s idea was it to have this completely arbitrary general die size and attempt to always stuff more transistors onto it?

If you have a 5mm x 5mm die with 500 million 90nm transistors, you can either halve the process size to 45nm to get 1 billion transistors or you could just double the die size to 10mm x 10mm… Aside from heat and power consumption, why wouldn’t that work?

Answers

Let’s start with the assumptions:

  1. Die Size is “completely arbitrary”: false.
  2. It would be more cost-effective to make larger dies: false.
  3. There is a reluctance to making the die bigger: false.
  4. There has to be a limit to transistor feature size: true.
  5. It would be easier to make larger dies for a given design: false.
  6. Halving the process from 90 nm to 45 nm would double transistor count: false.
  7. “Doubling” the die size from 5 mm x 5 mm to 10 mm x 10 mm would double the transistor count: false.
  8. The poster understands the benefits of making the process smaller: false.
  9. “Someone” decided to make die size consistent: true.

Die Size is Not Arbitrary

Processors are manufactured by etching features into large wafers measuring approximately 100 mm to 300 mm in diameter. Intel, Samsung, and TSMC announced a transition to 450 mm wafers in 2008. The cost of the equipment required to work with wafers dominates whether it is economically feasible to increase wafer size. As we will see below, decreasing die size reduces waste and improves yield, so manufacturers have significant pressure to reduce die size.

Constraints on Wafer Size

  1. Wafers are circular.
  2. Standard wafer size as of 2011 is 300 mm in diameter.
  3. Wafer Area: 70,686 mm^2 (pi x (300/2 mm)^2).
  4. Equipment for etching wafers is expensive (measured in billions of dollars).
  5. Standardizing on wafer size allows manufacturers to control costs.

Constraints on Die Size

  1. Processor dies are square.
  2. Large dies waste more surface area for a given wafer.
  3. 1 Huge Die: A 300 mm wafer can fit a single large die of 45,000 mm^2 (the largest inscribed square: about 212 mm on a side, with a 300 mm diagonal).
  4. The wafer has 70,686 mm^2 of available area, so this results in a waste of 36.3 percent.
  5. A 16×16 mm die has a 256 mm^2 area.
  6. We can fit 240 256-mm^2 dies onto a single 300 mm wafer (4×4 + 4×8 + 4×12 + 12×12).
  7. 240 Dies: Requires 61,440 mm^2, which results in a waste of 13.1 percent.

Result: You can approximate a circle more effectively with smaller dies, resulting in fewer wasted materials.
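
The waste figures are easy to reproduce (a quick sketch of the arithmetic above):

```python
import math

WAFER_DIAMETER = 300  # mm
wafer_area = math.pi * (WAFER_DIAMETER / 2) ** 2  # ~70,686 mm^2

# One huge die: the largest square inscribed in the wafer.
big_die = WAFER_DIAMETER**2 / 2                   # 45,000 mm^2
print(f"single-die waste: {1 - big_die / wafer_area:.1%}")    # 36.3%

# 240 dies of 16 mm x 16 mm each.
small_dies = 240 * 16 * 16                        # 61,440 mm^2
print(f"240-die waste:    {1 - small_dies / wafer_area:.1%}") # 13.1%
```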

Yields

  1. The analysis above assumes perfect manufacturing. This is not realistic.
  2. In the case of the single large die, any imperfection will cause the entire chip to fail.
  3. Define the probability of a surface defect as 0.01% per mm^2 etched (just for illustrative purposes).

Single-Die Wafer

Continuing the example above, we etch 45,000 mm^2 to create our monolithic chip. Unfortunately, our error rate in etching dooms us to failure: on average we expect to find 4.5 defects per chip etched, and any defect results in a wasted chip. Treating the expected defect count deterministically, our etching process produces chips with a 100 percent failure rate (a Poisson model would still put survival at only about 1 percent).

  1. 45,000 mm^2 x 0.0001 / mm^2 = 4.5 errors per wafer
  2. 1 bad die per wafer / 1 die per wafer = 100% failure rate

240-Die Wafer

This scenario is much more realistic. For a 256-mm^2 chip we observe a worst-case failure rate of 2.9% on the same manufacturing hardware, so clearly smaller dies are much more economical. The worst-case scenario is when each defect lands in a separate die (multiple defects per die actually improve our chip failure rate). At most 7 dies will fail.

  1. 61,440 mm^2 x 0.0001 / mm^2 = 6.1 errors per wafer
  2. 7 bad dies per wafer / 240 dies per wafer = 2.9% failure rate

Result: Smaller dies reduce the impact of etching defects, resulting in higher yields.
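
And the yield arithmetic, using the same worst-case model (the Poisson line is my aside for comparison, not part of the original numbers):

```python
import math

DEFECT_RATE = 0.0001  # defects per mm^2, the illustrative figure from above

def worst_case_failure(die_area_mm2: float, dies_per_wafer: int) -> float:
    # Worst case: every defect lands in a different die.
    defects = die_area_mm2 * dies_per_wafer * DEFECT_RATE
    return min(math.ceil(defects), dies_per_wafer) / dies_per_wafer

def poisson_yield(die_area_mm2: float) -> float:
    # Probability that a single die has zero defects.
    return math.exp(-DEFECT_RATE * die_area_mm2)

print(f"{worst_case_failure(45_000, 1):.1%} failure (one huge die)")  # 100.0%
print(f"{worst_case_failure(256, 240):.1%} failure (240 small dies)") # 2.9%
print(f"{poisson_yield(256):.1%} of small dies survive (Poisson)")    # ~97.5%
```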

Larger Dies Are Not More Economical

For a given wafer size, decreasing die size improves both yield and cost. Increasing die size reduces yield. A manufacturing process with 99.99% reliability will fail to produce a working 45,000 mm^2 chip 100% of the time. The same process will produce working 256-mm^2 chips 97.1% of the time.

The Industry Has Increased Both Die and Wafer Size

  1. Die size has been increasing over time.
  2. According to Wikipedia, the die size of the Intel 4004 chip was 12 mm^2.
  3. The first Pentium was 294 mm^2.
  4. Since then it seems Intel has tried to keep die size under 300 mm^2 as a cost-savings measure.
  5. Intel and other manufacturers are currently transitioning to 450 mm wafers, but retooling is expensive.
  6. The increase in die area from the 4004 to the Pentium was 24-fold.
  7. The increase in transistor count from the 4004 to the Pentium was 1,347-fold.
  8. Pentium transistor density: 3,100,000 transistors / 294 mm^2 = 10,544 transistors per mm^2
  9. 4004 transistor density: 2,300 transistors / 12 mm^2 = 192 transistors per mm^2
  10. Increase in transistor density: 10544/192 = 54.9-fold

Result: Transistor count, physical die size, and transistor density have all been increasing since the introduction of the Intel 4004.

The Limit to CMOS Feature Size is About 5 nm

  1. Individual atoms measure in picometers – about 100 to 500 – or 0.1 to 0.5 nm.
  2. Individual transistors need at least 10 atoms to function.
  3. 10 x 0.5 nm = 5 nm is the minimum feature size for transistors to function.
  4. Further improvements require a new class of computation device (1+ transistors per atom instead of atoms per transistor).

Result: Without fundamental improvements in design, we will not be able to scale transistors smaller than 5 nm.

Math Background on Chip Fabrication

Doubling in 2D

Whenever we work in two dimensions, we have to be careful about what we mean when we say “double”. If you double the area of a square, you only increase the length of each side by a factor of about 1.414 (since 1.414 x 1.414 ≈ 1.999). If instead we double the length of each side, then we are quadrupling the area!

Doubling Die Dimensions

  1. In the example above, we have a mythical CPU with 500e6 transistors packed into a 25 mm^2 area.
  2. This results in a 20e6 / mm^2 transistor density.
  3. “Doubling” the die size results in an area of 100 mm^2.
  4. (20e6 / mm^2) * 100 mm^2 = 2e9
  5. Packing transistors at the same density results in a CPU with 2 billion transistors (4x the original).

Result: Doubling die dimensions quadruples transistor count and die area.

Halving Feature Size

  1. In the case of halving the feature size from 90 nm to 45 nm we see a similar effect.
  2. Note that feature size is generally not transistor size – but it should be proportional given the same design.
  3. 90 nm x 90 nm = 8100 nm^2.
  4. 45 nm x 45 nm = 2025 nm^2.
  5. Size difference between 90 nm and 45 nm transistors = 8100 / 2025 = 4.
  6. 90 nm CPU: 25 mm^2 / (8100 nm^2) = 3e9 elements maximum
  7. 45 nm CPU: 25 mm^2 / (2025 nm^2) = 1.2e10 elements maximum
  8. The ratio of transistors to the limit imposed by feature size should match.
  9. 90 nm CPU: 500e6 / 3e9 = 0.1667 – this is our ratio
  10. 45 nm CPU @ 1 billion transistors: 1e9 / 1.2e10 = 0.0833 – this is not our ratio!
  11. 45 nm CPU @ 2 billion transistors: 2e9 / 1.2e10 = 0.1667 – matches!

Result: Halving feature size quadruples transistor count for a given die size.
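
Both results fall out of a couple of lines (same mythical 500-million-transistor, 25 mm^2 CPU as above):

```python
transistors = 500e6
die_area = 25.0                    # mm^2 (5 mm x 5 mm)
density = transistors / die_area   # 20e6 transistors per mm^2

# Doubling die dimensions (5x5 mm -> 10x10 mm) quadruples the area.
print(density * 10 * 10)           # 2e9 transistors, 4x the original

# Halving feature size (90 nm -> 45 nm) quadruples the element slots per mm^2.
slots_90nm = 1e12 / 90**2          # 1 mm^2 = 1e12 nm^2, divided by feature area
slots_45nm = 1e12 / 45**2
print(slots_45nm / slots_90nm)     # 4.0
```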

Sources

  1. The Question Source: http://www.overclock.net/general-processor-discussions/802618-nanometer-processor-picometer-processor.html
  2. Intel Wafer: http://download.intel.com/pressroom/images/centrino/Wafer_V2.tif
  3. Intel Transistor Counts: http://www.intel.com/pressroom/kits/events/moores_law_40th/
  4. Chart of Transistor Count: http://download.intel.com/pressroom/images/events/moores_law_40th/Transistor_Count_bar_chart.jpg
  5. Wafer Size: http://arstechnica.com/hardware/news/2008/05/intel-samsung-tsmc-to-hold-hands-and-jump-to-new-wafer-size.ars
  6. Wikipedia Transistor Count: http://en.wikipedia.org/wiki/Transistor_count
  7. CMOS Scale Limit: http://ieeexplore.ieee.org/Xplore/login.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F4531469%2F4539996%2F04540004.pdf%3Farnumber%3D4540004&authDecision=-203
  8. Atom Length: http://hypertextbook.com/facts/MichaelPhillip.shtml
  9. 10-Atom Transistor: http://discovermagazine.com/2009/jan/051
  10. Intel i7 Launch Materials: http://www.intel.com/pressroom/archive/releases/2008/20081117comp_sm.htm
  11. Core i7 Die Image: http://download.intel.com/pressroom/kits/corei7/images/Nehalem_Die_Shot_3.jpg
  12. Transistor Image: http://www.intel.com/newsroom/kits/22nm/gallery/gallery.htm
Posted in Computing

iGoWin

I started playing Go recently and wanted a decent Windows program to use. A friend of mine showed me iGoWin. I noticed after downloading it that the installer was pretty basic. The program was written in 1997-98 and uses the outdated .hlp help file format. While still quite usable in Windows 7 64-bit, it was clear that the installer and help files needed some TLC.

To that end, I have recompiled the help files in .chm format and packaged them with a full-fledged installer (created using NSIS). The original author (David Fotland) would have to recompile the program to link to the updated .chm files.

iGoWin is a free version of “The Many Faces of Go”. If you like using it, please consider buying the full version. I’ve included a link to David’s retail product page below.

iGoWin

Provided below are links to the original download page where I obtained iGoWin, as well as a direct link to the self-extracting package.

iGoWin

SHA-1 Hashes:

  • 57488a984d7ec2a8e4805d584933b99f93fcfea9: iGoWin_1_0_0.exe
  • f53895fc1818d2dbd12c8c838dce6b22029874a9: igowin.exe (Program)
  • 5d9a4f7de219fcf62ea7a9df1b2aabb8695ce7f7: igowin.exe (Installer)

Package Contents:

  • 43383371eda4c4d9892a5098fc86d7d845abb17e: Grain.bmp
  • edb04c8b981220cb945eb942fc14ce7ca0352444: iGoWin.chm
  • f53895fc1818d2dbd12c8c838dce6b22029874a9: igowin.exe
  • 52428e803920c08e44d455daf92359576c0cd708: Igowin.hlp
  • ac756443e5162e590be76c189b0cbcb2fad38519: iGoWin64.ico
  • cce5ad03df1aeda019eb759858c0902f578d2bd8: ReadMe.txt
  • ace4fab947324913abc3a2d0fe3f0ddb54b872e9: Tutor.chm
  • 8acdf156a1b39512001489a1baddbfe980afe738: Tutor.hlp
  • 491d74e7b31fdec3c7f4b0821a022ef042fde0f5: Uninstall.exe
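
If you want to verify a download, here’s a quick standard-library Python script (save it as, say, sha1sum.py) that prints hashes in the same format as the lists above:

```python
import hashlib
import sys

def sha1_of(path: str) -> str:
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__":
    # Usage: python sha1sum.py iGoWin_1_0_0.exe
    for path in sys.argv[1:]:
        print(f"{sha1_of(path)}: {path}")
```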
Posted in Development, Games

Serving the Past Future

I added a link to the Time Traveler Convention to my notes a while back. Figured I’d propagate it here too.

Original Link: Time Traveler Convention
The Time Traveler Convention
May 7, 2005, 10:00pm EDT (08 May 2005 02:00:00 UTC)
(event starts at 8:00pm)
East Campus Courtyard, MIT
42:21:36.025°N, 71:05:16.332°W
(42.360007,-071.087870 in decimal degrees)
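
As a quick check, the decimal coordinates match the DMS form (a trivial conversion sketch):

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float) -> float:
    return degrees + minutes / 60 + seconds / 3600

print(f"{dms_to_decimal(42, 21, 36.025):.6f} N")  # 42.360007
print(f"{dms_to_decimal(71, 5, 16.332):.6f} W")   # 71.087870
```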

Posted in News

Analogs and Parallels

I’ve been reading a book titled Dreaming in Code recently, which is a rather extensive case study of the development of an open-source project named Chandler. Many of its lessons and observations about programming in general have hit home with my own experience, like the fact that many programmers would rather program than eat or sleep. In fact, it is 3:13 AM right now and I’ve been reading a bit about Knuth’s Literate Programming effort, something I am sad to say I don’t know nearly enough about. Evidently the faculty at UNL don’t put much stock in his CWEB language, at least not at the undergraduate level. Either way, I’ve never written a program using it.

I often wonder what the distinction is between mere random behavior and intelligence. Good ideas are usually born from the ashes of old ideas (formulas, conjectures, and theorems in the parlance of mathematics). Two unrelated ideas can be combined in a moment of ingenuity into something awesome. I do not believe in predetermination or true chance. But of course, you would expect me to say that.

I started a post a while back (which I haven’t finished yet) named “Semantic Comments.” When I came across Knuth’s Literate Programming effort earlier this morning, I was struck by a sense of déjà vu, but with a twist. Knuth strives to define meaning in English, a language accessible only to humans. I would prefer to define meaning in terms accessible to both humans and machines. It bothers me that comments are seen as an afterthought, something that the computer shouldn’t have to deal with. It just seems that we lack the precision to express our thoughts accurately.

I started this post with the desire to discuss the Turing Test and online chat bots. Somehow in the last 50 years, despite rapid improvements in memory and processor speeds, a truly intelligent conversation with a machine still eludes us. Some people, like Ray Kurzweil, would argue that we just haven’t reached a sufficient point of maturity in hardware to emulate human brains. Why must we resort to emulation, however? Are we not clever enough to figure out a solution without reverse-engineering ourselves?

Anyone who has spent more than 30 seconds with a modern chat bot will clearly notice a complete lack of depth. Projects like A.L.I.C.E. can seem to be responsive and mildly entertaining, but there is a definite sense that you are talking to a brick wall. My gut feeling is that people who write chat bots focus too much on the details. The bot has to be able to remember your name, or parse a sentence like “I hate you” and respond with a quip. Where do these requirements come from? In their effort to write a “convincing” chat bot, these authors miss the point: creating an intelligent system that can think and respond to your actions.

If I write 2+2=4, almost anyone with a basic education will understand what I mean. Language is built upon successive levels of knowledge, all starting with interpretations of the real world. We know what the word “apple” means depending upon the context. But a chat bot operates inherently out of context. It would be as though someone stuck you in a dark room, translated Chinese to you one word at a time, and gave you no access to the external world.

So rather than focus on the arcane methods of sentence parsing and learning models, perhaps what chat bots need is a good dose of reality. Nothing less than a whole solution will suffice. You can’t eat Chinese food with a toaster.

Posted in Computing, Ideas, News, Research

Code Mountain

I came up with an idea this afternoon for a source code visualizer for enterprise solutions. It would provide a source code cloud built from files and their dependencies. Each cloud would be a logical block (function, if-statement, loop, etc.), and the user would be able to zoom in and out to any level as needed. You would then work on an application as a whole rather than on a set of files. Everything would still be saved out to standard files, but the entire application would be accessible in a visual manner that wouldn’t require lots of commands to open and close said files.

Basically, I want to be able to quickly move between parts of a project using something like the Mac OS X Exposé feature, but without subjecting myself to Mac OS X. I also want seamless integration with VIM. CloudVIM, if you will…

The app would also tie into another idea I had a while back: multi-history versioning for files. I want to be able to view the state of a file at any given point during my editing session, even if that means branching into multiple edit histories. Combining these ideas into CloudVIM would allow a developer to easily navigate to any source code at any time within a specified timespan. Coupled with SVN for storage, I think this would provide a powerful debugging model.

We need a good use for 1 TB hard disks anyway.

Posted in Development, Ideas, News

Airspeed Velocity of an Unladen Swallow

European: 24 mph

http://www.style.org/unladenswallow/

Megan (one of my co-workers) was quoting Monty Python lines this afternoon. Figured I’d dig up a proof of the airspeed velocity of an unladen swallow from Google.

Posted in Logic, Movies, News