Making sense of chaos

(Warning: I have managed to make this post rambling and cryptic at the same time. I will probably post a full-length article on this.)

I tried to add my site to the Open Directory Project. The Open Directory Project aims to be the definitive catalog of the web. It is being built by volunteer editors. Once my site has been approved, you should be able to find it under Computers: Internet: On the Web: Weblogs: Personal.

Or should it have been Society: People: Personal Homepages?
Or perhaps Regional: Asia: India: Maharashtra: Localities: Mumbai: Society and Culture: Personal Homepages? I may make a lot of posts about economics and technology, so perhaps I should put my site either here or here?
The problem is not that the people at the Open Directory Project have chosen their categories badly, but that it is impossible to come up with a classification that satisfies everyone.
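To make the problem concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the function names are made up, and this is not how the directory actually stores anything. It files one site under the three paths mentioned above, first the way a strict tree would (pick one path, lose the others), then as a flat set of labels that keeps them all.

```python
# Illustrative only: the paths are the ones quoted above; the function
# names are hypothetical and not part of any real directory software.
CANDIDATE_PATHS = [
    "Computers: Internet: On the Web: Weblogs: Personal",
    "Society: People: Personal Homepages",
    "Regional: Asia: India: Maharashtra: Localities: Mumbai: "
    "Society and Culture: Personal Homepages",
]


def as_tree_entry(paths):
    """A strict tree files a site under exactly one path (here, the first)."""
    return paths[0]


def as_tags(paths):
    """A tag-style index keeps every label from every plausible path."""
    tags = set()
    for path in paths:
        tags.update(part.strip() for part in path.split(":"))
    return tags


tree_entry = as_tree_entry(CANDIDATE_PATHS)
tags = as_tags(CANDIDATE_PATHS)

print(tree_entry)
print(sorted(tags))
```

The tree answer throws away Mumbai and Personal Homepages entirely, while the tag answer keeps everything but no longer tells you where to look first. Neither is wrong; they just satisfy different minds.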

People have fought wars over issues of identity and classification. Are Ahmediyas Muslims? Who is a Hindu? Can an Italian be an Indian at the same time? Is it possible to be an Indian and an American simultaneously?
These problems arise because reality is too complex to be categorised using language, but the human mind is somehow able to cope with this complexity. Because of this, we are always dissatisfied with any classification.
So when someone mentions ‘Sachin’, an Indian mind does not go: Sports – Cricket – Cricketers – Indian players – Current. It immediately knows who Sachin is. This classification ability is adaptive as well as dynamic. Two months ago, to locate ‘Surajlata’, an Indian might have had to go through a process similar to the one above. After the Indian women’s hockey team’s performance, the categorisation will perhaps be swifter. For that matter, twenty years ago ‘Sachin’ would have meant someone completely different (a famous child actor).
The human mind can adapt to changes in reality much faster than any formal classification system can. The rather strange classification of weblogs (Computers: Internet: On the Web: Weblogs: Personal) in the Open Directory is perhaps a historical legacy. Once a structure has been hardcoded, it is difficult to change.
It is to the credit of Tim Berners-Lee that he designed the hyperlinked web as something that would closely mirror the way the mind thinks.
Now, the web is chaotic, with free-format text all over the place. One way to help a machine make sense of this chaos is to write web documents in a machine-readable format. This is the XML approach. It is unlikely to work outside a tightly defined context, for the same reasons the Open Directory Project is confusing. A more interesting approach is that of the semantic web. Google has gone a frighteningly long way in constructing tools for the semantic web. Google Glossary and Google Sets are attempts to extract ‘sense’ out of the structure of HTML pages. If they succeed, they will be well on the way to a machine that passes the Turing test. A machine that has access to the entire internet and has developed ‘understanding’ would be a direct competitor to God.