Thursday, June 14, 2007

Semantic Search

I'm not an expert at web search, but here goes: some rants and ramblings on semantic web search.

I watched a program this week with a well-known philosopher. It was mostly about technology and media, including social networking sites and so on.

One question asked during the program was whether semantic web search would soon be possible, and when exactly it is likely to happen. The philosopher's response was that he didn't think semantic search would ever take off and that it is basically dead in the water. His argument was that the context and meaning of certain words differ from one person to the next.

That is true, but maybe semantic search does not have to mean searching in one general context that is assumed to hold for everybody. It could instead mean searching in a specific context that the search engine understands to belong to that person: his perceptions and beliefs, formed by life experiences, human contact, environment, country, culture, tradition and so on.

One thing that I suspect is rarely used in web search is the verb. Most searches use only nouns, but the context of a noun can differ enormously if it is not accompanied by a verb. A verb would go a long way toward putting things into a more specific context, though not yet a personalized one.

Steve Yegge blogs about the differences between "verb" and "noun" thinking from the perspective of a programming language. You could say that programming languages are, in a way, a means of communicating with a machine, of expressing ideas and so on.

Anyway, as I said, I have no idea to what extent search engines currently use verbs or contextualize searches to be more specific. They might consider search history as one way of improving hits, but this is not very reliable, as our priorities and contexts can change very rapidly.

Regarding implementations of such a search engine... It would be a search engine as it exists today, with the added difference that user interaction (with user profiling) would attach a context indication to particular pages. I don't think it is necessary to define all contexts before classifying anything. If you work with neural networks, for example, the computer has no idea what it is doing, but the end result of each calculation comes close to what is expected.

It would be a great idea for research: tie a neural network in at both ends of a search engine and see what comes out. The difficulty with this neural network is of course how to heuristically define numbers based on the page... Or rather, how to encode the content of the page in such a way that, together with the input words and the user profile, the end result is a particular score.
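To make the encoding question a bit more concrete, here is a rough sketch of what "page + words + profile in, score out" could look like. Everything in it is my own assumption: the names `encode` and `TinyScorer`, the bucket count `DIM`, the trick of hashing words into buckets, and the single hidden layer. The weights are random and untrained, so the scores mean nothing yet; it only shows the shape of the idea.

```python
import math
import random
import zlib

DIM = 8  # number of buckets per text vector (arbitrary, assumed value)


def encode(text, dim=DIM):
    """Turn text into numbers: hash each word into a bucket and count it."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


class TinyScorer:
    """One hidden layer with random, untrained weights (illustrative only)."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def score(self, page, query, profile):
        # Concatenate the three encoded inputs into one input vector.
        x = encode(page) + encode(query) + encode(profile)
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        z = sum(w * h for w, h in zip(self.w2, hidden))
        return 1.0 / (1.0 + math.exp(-z))  # squash to a score between 0 and 1


scorer = TinyScorer(n_inputs=3 * DIM)
result = scorer.score(
    page="how to repair a bicycle chain at home",
    query="repair bicycle",
    profile="cycling commuting amsterdam",
)
```

Training would mean nudging `w1` and `w2` whenever a user accepts or rejects a result, which is exactly the part I have no good answer for.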

Another approach is to focus more on the verbs: count their occurrences and take that as a contextual factor.
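A minimal sketch of the verb-counting idea, under some loud assumptions: the verb list `VERBS` is hand-picked and hypothetical (a real system would need a part-of-speech tagger to recognize verbs), and `context_score` is just one naive way to turn the counts into a factor.

```python
import re
from collections import Counter

# Hypothetical, hand-picked verb list; stands in for a real part-of-speech tagger.
VERBS = {"buy", "repair", "drive", "rent", "sell"}


def verb_counts(text):
    """Count how often each known verb occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in VERBS)


def context_score(query, page):
    """Add up the page's occurrences of every verb used in the query."""
    page_counts = verb_counts(page)
    return sum(page_counts[verb] for verb in verb_counts(query))


page = "How to repair a car: repair the brakes before you drive."
print(context_score("repair car", page))  # the page mentions "repair" twice
print(context_score("buy car", page))     # the page never mentions "buy"
```

Both queries share the noun "car", but only the one with the matching verb gets a non-zero score for this page.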

Perhaps the most limiting thing in search is that the search itself is badly expressed in words? I have a certain contextual idea of the things that I am looking for... What is the best way to tell a machine to go looking for that particular context? We could store users' profiles, focus on verbs and all of that, but what about location or approximate location?

Some search engines provide advanced searches, and this may be very helpful in this regard. In order to get anywhere, I guess it makes sense to include psychologists and anthropologists in the discussion to understand thought, expression and context better. There may be ways to convert these things into different forms to gain a more meaningful dialogue with a machine.

People mostly consider semantic search to be "teaching the machine": punishing it when the results are not what you are looking for, rewarding it when it is exactly on the mark. But if the context differs from one person to the next, there is a never-ending cycle of punishment and the machine just gets confused. Some things that are in the same context for everybody will get very high search ranks. But searching should be more effective than that. It should also aim to expose the niches.
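The "confused machine" can be shown with some simple bookkeeping. This is a sketch under my own assumptions: `FeedbackStore`, the users, the query and the page names are all made up, and a reward is simply +1 or -1.

```python
from collections import defaultdict


class FeedbackStore:
    """Keeps one shared score per (query, page) and one score per user."""

    def __init__(self):
        self.global_score = defaultdict(float)
        self.user_score = defaultdict(float)

    def record(self, user, query, page, reward):
        # reward is +1 for "on the mark" and -1 for "not what I wanted".
        self.global_score[(query, page)] += reward
        self.user_score[(user, query, page)] += reward


store = FeedbackStore()
# Two hypothetical users who mean different things by the same word.
store.record("alice", "jaguar", "car-dealer-page", +1)
store.record("bob", "jaguar", "car-dealer-page", -1)

print(store.global_score[("jaguar", "car-dealer-page")])         # 0.0
print(store.user_score[("alice", "jaguar", "car-dealer-page")])  # 1.0
print(store.user_score[("bob", "jaguar", "car-dealer-page")])    # -1.0
```

In the shared score the reward and the punishment cancel out, so the machine learns nothing. Kept per person, the same two signals are perfectly clear.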
