Archive for the ‘Ideas’ Category

Analogs and Parallels

Saturday, January 12th, 2008

I’ve been reading a book titled Dreaming in Code recently, a rather extensive case study of the development of an open-source project named Chandler. Many of its lessons and references to programming in general have hit home with my experience: for instance, the fact that many programmers would rather program than eat or sleep. In fact it is 3:13 AM right now and I’ve been reading a bit about Knuth’s Literate Programming effort – something that I am sad to say I don’t know nearly enough about. Evidently the faculty at UNL doesn’t put much stock in his CWEB language, or at least not at the undergraduate level. Either way, I’ve never written a program using it.

I often wonder what the distinction between mere random behavior and intelligence is. Good ideas are usually born from the ashes of old ideas (formulas, conjectures, and theorems in the parlance of mathematics). Two unrelated ideas can be combined in a moment of ingenuity into something awesome. I do not believe in predetermination or true chance. But of course, you would expect me to say that.

I started a post a while back (which I haven’t finished yet) named “Semantic Comments.” When I happened to come across Knuth’s Literate Programming effort earlier this morning, I was struck by a sense of déjà vu, but with a twist. Knuth strives to define meaning in English – a language accessible only to humans. I would prefer to define meaning in terms accessible to both humans and machines. It bothers me that comments are seen as an afterthought – something that the computer shouldn’t have to deal with. It just seems that we lack the precision to express our thoughts accurately.

I started this post with the desire to discuss the Turing Test and online chat bots. Somehow in the last 50 years, despite rapid improvements in memory and processor speeds, a truly intelligent conversation with a machine still eludes us. Some people, like Ray Kurzweil, would argue that we just haven’t reached a sufficient point of maturity in hardware to emulate human brains. Why must we resort to emulation, however? Are we not clever enough to figure out a solution without reverse-engineering ourselves?

Anyone who has spent more than 30 seconds with a modern chat bot will clearly notice a complete lack of depth. Projects like A.L.I.C.E. can seem to be responsive and mildly entertaining, but there is a definite sense that you are talking to a brick wall. My gut feeling is that people who write chat bots focus too much on the details. The bot has to be able to remember your name, or parse a sentence like “I hate you” and respond with a quip. Where do these requirements come from? In their effort to write a “convincing” chat bot, these authors miss the point: creating an intelligent system that can think and respond to your actions.
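To make the complaint concrete, here is a minimal sketch of the kind of detail-driven bot I mean. It is not how A.L.I.C.E. is actually implemented; the rules, the responses, and the reply function are all made up for illustration. It remembers nothing and understands nothing: it matches text and fills in a template.

```python
import re

# Hypothetical rules for illustration: (pattern, response template).
# These are not taken from A.L.I.C.E. or any real chat bot.
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Nice to meet you, {0}."),
    (re.compile(r"\bi hate you\b", re.IGNORECASE), "I am sure you say that to all the bots."),
]
FALLBACK = "Tell me more."


def reply(message):
    """Return the response of the first rule whose pattern matches the message."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Captured groups (such as the name) are pasted into the template.
            return template.format(*match.groups())
    # Nothing matched, so the bot has nothing to say about the message.
    return FALLBACK
```

Anything outside the rule list, such as asking what an apple is, falls straight through to the fallback, which is the brick wall in a nutshell.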

If I write 2+2=4, most anyone with a basic education will understand what I mean. Language is built upon successive levels of knowledge – all starting with interpretations of the real world. We know what the word “Apple” means depending upon the context. But constructing a chat bot is inherently out of context. It would be as though someone stuck you in a dark room and translated Chinese to you one word at a time and gave you no access to the external world.

So rather than focus on the arcane methods of sentence parsing and learning models, perhaps what chat bots need is a good dose of reality. Nothing less than a whole solution will suffice. You can’t eat Chinese food with a toaster.

Code Mountain

Wednesday, November 28th, 2007

I came up with an idea this afternoon for a source code visualizer for Enterprise solutions. Basically it would provide a source code cloud built from files and their dependencies. Each cloud would be a logical block (function, if-statement, loop, etc.) and the user would be able to zoom in/out to any level as needed. You would then work on an application as a whole rather than on a set of files. Everything would still be saved out to standard files, but the entire application would be accessible in a visual manner that wouldn’t require lots of commands to open and close said files.
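A rough sketch of the zoom idea, assuming for the sake of illustration that the project is written in Python so the standard ast module can find the logical blocks. The outline function and its max_depth parameter are hypothetical names; a real visualizer would need a parser per language and an actual display.

```python
import ast

# Node types treated as "logical blocks" in this sketch (an assumption).
BLOCK_TYPES = (ast.FunctionDef, ast.ClassDef, ast.If, ast.For, ast.While)


def outline(source, max_depth):
    """List logical blocks down to max_depth, indented by nesting level."""
    lines = []

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, BLOCK_TYPES):
                if depth < max_depth:
                    name = getattr(child, "name", "")
                    label = type(child).__name__ + (" " + name if name else "")
                    lines.append("  " * depth + label + " (line %d)" % child.lineno)
                    visit(child, depth + 1)
            else:
                # Not a block itself, but it may contain blocks further down.
                visit(child, depth)

    visit(ast.parse(source), 0)
    return lines
```

Calling outline with max_depth of 1 gives the zoomed-out view (top-level functions and classes only), and raising max_depth zooms in on the loops and if-statements inside them.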

Basically I want to be able to quickly move between parts of a project using something like the Mac OS X Exposé feature, but without subjecting myself to Mac OS X. I also want seamless integration with Vim. CloudVIM if you will…

The app would also tie into another idea I had a while back: multi-history versioning for files. Basically I want to be able to view the state of a file at any given point during my editing session, even if that means branching into multiple edit histories. Combining these ideas into CloudVIM would allow a developer to easily navigate to any source code at any time within a specified timespan. Coupled with SVN for storage, this would provide a powerful debugging model, I think.
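The multi-history idea can be sketched as a tree of revisions rather than a single undo stack. This is only a toy under my own assumptions: the Revision and History classes and their method names are hypothetical, each revision stores the whole file text, and nothing here talks to SVN.

```python
class Revision:
    """One saved state of a file, linked to the state it was edited from."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []


class History:
    """Edit history kept as a tree, so undoing and re-editing creates a branch."""

    def __init__(self, initial=""):
        self.root = Revision(initial)
        self.current = self.root

    def edit(self, text):
        # A new revision always hangs off the current one; older branches stay.
        revision = Revision(text, parent=self.current)
        self.current.children.append(revision)
        self.current = revision
        return revision

    def undo(self):
        # Step back to the parent, if there is one. Nothing is discarded.
        if self.current.parent is not None:
            self.current = self.current.parent
        return self.current

    def checkout(self, revision):
        # Jump to any state ever recorded, on any branch.
        self.current = revision
```

Editing, undoing, and editing again leaves two children under the same parent, and either one can be checked out later. That is the part a plain undo stack throws away.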

We need a good use for 1TB hard disks anyway.

Anti-Viruses

Thursday, November 15th, 2007

I followed an article today on Warden, the anti-cheating software used by Blizzard Entertainment, Inc., which eventually led me to http://en.wikipedia.org/wiki/Polymorphic_code. The brief synopsis there got me thinking about the way in which anti-virus software currently works. I’m reasonably convinced that writing monolithic packages to combat swarms of viruses is a misguided idea.

So perhaps what we need is an immune system composed of anti-viruses. These programs would seek to wage war against viruses but in a manner that avoids the pitfalls of virus propagation (primarily DDoS outbreaks). They would actively seek out insecure systems, infect them with the cure, and then altruistically deactivate themselves.

I’m certainly not the first person to suggest this (and I’m reasonably certain that all attempts at such a solution have failed thus far). But I am convinced that fighting viruses with anti-viruses is the best long-term strategy. Currently we have software packages which build huge lists of blacklisted programs to guard against infections. There are a couple of problems with this approach: 1) it isn’t scalable, 2) it’s targetable, and 3) it only allows the cure to be applied after the fact.

Scalability

As the number of viruses in the wild continues to grow, the database for detection will also grow. Eventually there will be so many potential exploits that scanning your system for an infection will take days (even with improvements in system performance). The best way to combat this trend is to develop general-purpose anti-viruses which morph and adapt to combat new threats. The key here is that you’d be running smart agents which could achieve much greater efficiency than a scan-and-ban solution can.
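Here is a deliberately naive sketch of the scan-and-ban approach, to show where the scaling worry comes from. The signature names and byte patterns are invented for illustration, and real scanners are presumably cleverer than one substring search per signature, but the database still has to grow with every new threat.

```python
def scan(data, signatures):
    """Return the names of all known signatures found in the given bytes.

    `signatures` maps a (hypothetical) virus name to a byte pattern. Every
    signature is checked against the data, so the work for each file grows
    with the size of the database.
    """
    return [name for name, pattern in signatures.items() if pattern in data]


# Made-up signature database for illustration only.
SIGNATURES = {
    "example.worm": b"\xde\xad\xbe\xef",
    "example.trojan": b"EVIL_PAYLOAD",
}
```

A file containing the bytes EVIL_PAYLOAD would be flagged as example.trojan, while a threat that is not yet in the table passes straight through, which is the after-the-fact problem from the list above.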

Targeting

More recent virus scanners do implement ‘heuristic’ scanning modes, but they are still susceptible to the concerns I’ve listed above. In order to wreak havoc on your system, a virus-writer needs only to disable Symantec or McAfee and then he/she has free rein over your bits. By distributing the cure across multiple anti-viruses we could make it harder for virus writers to create countermeasures (because they’d be subject to the same constraints that anti-virus companies are now – namely that they’d have to build a database of anti-viruses to deactivate).

Deployment

The biggest hurdle to anti-viruses is deployment of the cure in an ethical, rationed manner (so the anti-virus doesn’t end up causing just as much harm as the virus it is neutralizing). I’ll have to think about that. It doesn’t matter what your intent is if the outcome is the same.
