Analogs and Parallels

I’ve been reading a book titled Dreaming in Code recently, which is a rather extensive case study of the development of an open-source project named Chandler. Many of its lessons and references to programming in general have hit home with my experience, like the fact that many programmers would rather program than eat or sleep. In fact it is 3:13 AM right now and I’ve been reading a bit about Knuth’s Literate Programming effort - something that, I am sad to say, I don’t know nearly enough about. Evidently the faculty at UNL doesn’t put much stock in his CWEB language, or at least not at the undergraduate level. Either way, I’ve never written a program using it.

I often wonder what the distinction between mere random behavior and intelligence is. Good ideas are usually born from the ashes of old ideas (formulas, conjectures, and theorems in the parlance of mathematics). Two unrelated ideas can be combined in a moment of ingenuity into something awesome. I do not believe in predetermination or true chance. But of course, you would expect me to say that.

I started a post a while back (which I haven’t finished yet) named “Semantic Comments.” When I happened to come across Knuth’s Literate Programming effort earlier this morning, I was struck by a sense of déjà vu, but with a twist. Knuth strives to define meaning in English - a language accessible only to humans. I would prefer to define meaning in terms accessible to both humans and machines. It bothers me that comments are seen as an afterthought - something that the computer shouldn’t have to deal with. It just seems that we lack the precision to express our thoughts accurately.

I started this post with the desire to discuss the Turing Test and online chat bots. Somehow in the last 50 years, despite rapid improvements in memory and processor speeds, a truly intelligent conversation with a machine still eludes us. Some people, like Ray Kurzweil, would argue that we just haven’t reached a sufficient point of maturity in hardware to emulate human brains. Why must we resort to emulation, however? Are we not clever enough to figure out a solution without reverse-engineering ourselves?

Anyone who has spent more than 30 seconds with a modern chat bot will clearly notice a complete lack of depth. Projects like A.L.I.C.E. can seem to be responsive and mildly entertaining, but there is a definite sense that you are talking to a brick wall. My gut feeling is that people who write chat bots focus too much on the details. The bot has to be able to remember your name, or parse a sentence like “I hate you” and respond with a quip. Where do these requirements come from? In their effort to write a “convincing” chat bot, these authors miss the point: creating an intelligent system that can think and respond to your actions.

If I write 2+2=4, most anyone with a basic education will understand what I mean. Language is built upon successive levels of knowledge - all starting with interpretations of the real world. We know what the word “Apple” means depending upon the context. But constructing a chat bot is inherently out of context. It would be as though someone stuck you in a dark room and translated Chinese to you one word at a time and gave you no access to the external world.

So rather than focus on the arcane methods of sentence parsing and learning models, perhaps what chat bots need is a good dose of reality. Nothing less than a whole solution will suffice. You can’t eat Chinese food with a toaster.

2008 Tidings

I realized today that I haven’t published an article here since November.  Apparently between work and the holidays, December skipped right off the radar.  Sadly, I am uninspired to write anything at present.

Code Mountain

I came up with an idea this afternoon for a source code visualizer for Enterprise solutions. Basically it would provide a source code cloud built from files and their dependencies. Each cloud would be a logical block (function, if-statement, loop, etc.) and the user would be able to zoom in/out to any level as needed. You would then work on an application as a whole rather than on a set of files. Everything would still be saved out to standard files, but the entire application would be accessible in a visual manner that wouldn’t require lots of commands to open and close said files.
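As a rough sketch of what "logical blocks you can zoom into" might look like in practice, the snippet below uses Python's standard ast module to pull the nested blocks out of a piece of source. Python is only a stand-in language here; the outline function, the BLOCK_TYPES list, and the sample code are all made up for illustration and are not part of any existing tool.

```python
import ast

# Node types treated as "logical blocks" in this sketch (an assumption;
# a real visualizer would likely want a richer set).
BLOCK_TYPES = (ast.FunctionDef, ast.ClassDef, ast.If, ast.For, ast.While)


def outline(source):
    """Return (depth, kind, line) tuples for each logical block, outermost first."""
    blocks = []

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, BLOCK_TYPES):
                blocks.append((depth, type(child).__name__, child.lineno))
                visit(child, depth + 1)  # nested blocks sit one zoom level deeper
            else:
                visit(child, depth)

    visit(ast.parse(source), 0)
    return blocks


# Hypothetical sample input, used only to exercise the sketch.
SAMPLE = '''
def total(items):
    result = 0
    for item in items:
        if item > 0:
            result += item
    return result
'''

for depth, kind, line in outline(SAMPLE):
    print("  " * depth + f"{kind} (line {line})")
```

Each depth value would correspond to one zoom level in the cloud, so the editor could show just the depth-0 blocks for a whole project and expand them on demand.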

Basically I want to be able to quickly move between parts of a project using something like the Mac OS X Exposé feature, but without subjecting myself to Mac OS X. I also want seamless integration with VIM. CloudVIM if you will…

The app would also tie into another idea I had a while back: multi-history versioning for files.  Basically I want to be able to view the state of a file at any given point during my editing session, even if that means branching into multiple edit histories.  Combining these ideas into CloudVIM would allow a developer to easily navigate to any source code at any time within a specified timespan.  Coupled with SVN for storage this would provide a powerful debugging model I think.
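A minimal sketch of the multi-history idea, assuming the simplest possible design: every edit stores a full snapshot of the file along with a pointer to the state it was made from, so editing after an undo starts a new branch instead of throwing the old one away. The EditHistory class and its method names are hypothetical.

```python
class EditHistory:
    """Tree of file snapshots; editing after an undo starts a new branch."""

    def __init__(self, initial_text=""):
        # Each entry is (parent_id, text); the root state has no parent.
        self.states = [(None, initial_text)]
        self.current = 0

    def edit(self, new_text):
        """Record a new snapshot as a child of the current state."""
        self.states.append((self.current, new_text))
        self.current = len(self.states) - 1
        return self.current

    def undo(self):
        """Step back to the parent state; no snapshot is discarded."""
        parent, _ = self.states[self.current]
        if parent is not None:
            self.current = parent
        return self.current

    def checkout(self, state_id):
        """Jump to any recorded state, on any branch."""
        if not 0 <= state_id < len(self.states):
            raise IndexError("unknown state id")
        self.current = state_id

    def text(self):
        """Contents of the file at the current state."""
        return self.states[self.current][1]

    def branches(self, state_id):
        """Ids of the states that were edited directly from state_id."""
        return [i for i, (parent, _) in enumerate(self.states) if parent == state_id]


history = EditHistory("int x;")
a = history.edit("int x = 1;")   # first line of history
history.undo()
b = history.edit("long x;")      # second branch from the same starting point
history.checkout(a)              # the first branch is still reachable
print(history.text())
print(history.branches(0))
```

Storing whole snapshots is wasteful compared to storing diffs, but it keeps the sketch short and makes jumping to an arbitrary state trivial.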

We need a good use for 1TB hard disks anyway.

Dust Bunnies

I exorcised the demons from my laptop over the Thanksgiving break. Turns out that running an Athlon 64 at 2.0GHz with 3+ years of dust collection is a bad idea. I had been forced to run everything with the power savings set to maximum (which underclocks my CPU to 800MHz). Running anything CPU intensive would ultimately lead to programs crashing as the poor CPU attempted not to DIAF.

Airspeed Velocity of an Unladen Swallow

European: 24 mph

http://www.style.org/unladenswallow/

Megan (one of my co-workers) was repeating Monty Python lines this afternoon. Figured I’d dig up a proof of the airspeed velocity of an unladen swallow from Google.

Decode Challenge

I recently ran across a rather insecure form of “encryption” which really qualifies more as obfuscation. That’s all I’m going to say. I split it across multiple lines, but there are no line breaks in the code.

QUJDREVGR0hJSktMTU5P
UFFSU1RVVldYWVoKMTIz
NDU2Nzg5MAoyCjMKNQo3
CjExCjEzCjE3CjE5CjIzCjI5Cl
RoZSBxdWljayBicm93biBmb
3gganVtcGVkIG92ZXIgdGhl
IGxhenkgZG9nLgpUaGUgb
WVhbmluZyBvZiBsaWZlLC
B0aGUgVW5pdmVyc2UsIG
FuZCBldmVyeXRoaW5nIGlz
IDQyLgpJIDwzIGNoZWVzZS4=

Anti-Viruses

I followed an article today on Warden, the anti-cheating software used by Blizzard Entertainment, Inc., which eventually led me to http://en.wikipedia.org/wiki/Polymorphic_code. The brief synopsis there got me thinking about the way in which anti-virus software currently works. I’m reasonably convinced that writing monolithic packages to combat swarms of viruses is a misguided idea.

So perhaps what we need is an immune system composed of anti-viruses. These programs would seek to wage war against viruses but in a manner that avoids the pitfalls of virus propagation (primarily DDoS outbreaks). They would actively seek out insecure systems, infect them with the cure, and then altruistically deactivate themselves.

I’m certainly not the first person to suggest this (and I’m reasonably certain that all attempts at such a solution have failed thus far). But I am convinced that fighting viruses with anti-viruses is the best long-term strategy. Currently we have software packages which build huge lists of blacklisted programs to guard against infections. There are a couple of problems with this approach: 1) it isn’t scalable, 2) it’s targetable, and 3) it only allows the cure to be applied after the fact.

Scalability

As the number of viruses in the wild continues to grow, the database for detection will also grow. Eventually there will be so many potential exploits that scanning your system for an infection will take days (even with improvements in system performance). The best way to combat this trend is to develop general-purpose anti-viruses which morph and adapt to combat new threats. The key here is that you’d be running smart agents which could achieve much greater efficiency than a scan-and-ban solution can.

Targeting

More recent virus scanners do implement ‘heuristic’ scanning modes, but they are still susceptible to the concerns I’ve listed above. In order to wreak havoc on your system, a virus writer needs only to disable Symantec or McAfee and then he/she has free rein over your bits. By distributing the cure across multiple anti-viruses we could make it harder for virus writers to create countermeasures (because they’d be subject to the same constraints that anti-virus companies are now - namely that they’d have to build a database of anti-viruses to deactivate).

Deployment

The biggest hurdle to anti-viruses is deployment of the cure in an ethical, rationed manner (so the anti-virus doesn’t end up causing just as much harm as the virus it is neutralizing). I’ll have to think about that. It doesn’t matter what your intent is if the outcome is the same.

Power Corrupts?

Well, I finally decided to extract my copy of WordPress and get things rolling tonight. I spent a good number of hours a couple of weekends ago getting software updated on my dedicated server and removing the Plesk admin interface that had been pre-installed about 2 years ago. As I was configuring my copy of WordPress, I stopped to ponder why I need to have control over where I host my files…

It isn’t as though hosting my automagic musings on another server would really be all that bad. I wouldn’t have to manage the software updates or keep track of bugs and hackers. Yet despite the risk of screwing things up, I still prefer having control over my data. I pay a monthly hosting fee to have access to a Linux system which I’m free to do with as I wish. I enjoy that freedom.

Short Version: Programmers <3 control.

First Post

I’ve started to familiarize myself with the WordPress interface. Next up: create a new template and post some info worth reading.