Tuesday, December 16, 2008

Wisdom: Rule mining

Artificial Intelligence is very likely to gain a lot more traction in the coming decade; I think it has already started. A.I. is not a science that is solely concerned with rebuilding the human brain or just a couple of cognitive aspects. It is also concerned with questions about the interaction of agents within a society or organization. In that sense, it tries to combine individual decisions and individual cognitive abilities with the cognitive abilities and behaviour of that organization as a whole. A.I. is already a multi-disciplinary field of science with strong links to computer science, mathematics, philosophy, psychology, cognitive science, anthropology, management studies and possibly a couple more :). I'd like to start making the claim that A.I. isn't actually a domain of computer science. I see it primarily as the science of converting problems from other spaces (psychological, behavioural, analytic, business) into expressive models that can run on computers. So it does share a lot of "knowledge" with the computer sciences, but for A.I. that is only the last step. C.S., on the other hand, has many topics that are solely about how to run something on a computer faster or more efficiently, and those remain constrained to the elements of C.S. itself.

Thus, in other words, A.I. uses Computer Science as a means to offer a model or simulation of reality. Computers are good tools for this, since they have the capacity to process mountains of information in simple steps.

A big challenge of A.I. is actually the A. part. This A. part deals with computers, which only accept an explicit and deterministic language, something that we're not exactly used to. A computer was designed (although it doesn't always behave that way :) to be 100% deterministic. Every cause and effect must be clear. In other words, every cause or event needs to have the intended effects, and every observed effect must have a perfectly explainable cause. Challenges still abound in cases where events are not received, or where effects occur that seem inappropriate to the current context (the system is acting weird).

Yet the world doesn't always act in a deterministic way, and we don't use the same deterministic language within the same organization, not even within a single relationship. As soon as someone tries to impose a single perspective about "how the world works" on an organization, it somehow starts to fight back. Slightly different interpretations work better in different contexts. Computers can't deal with that, though, since they are not truly context-sensitive.

In such cases, strategies based on fuzzier representations of data sets can work better. The problem with those fuzzy strategies is that a computer can't derive rules from them. So it doesn't actually gain any knowledge, other than a mathematical representation of how something works.

A very interesting academic exercise, though, would be an attempt to mine for rules in fuzzy data sets. Suppose that a system finds that customer A likes videos A, B and C, and customer B likes videos D, C and B. What are the common properties of those videos, and how important are they for purchasing decisions? Can we actually profile customers in non-mathematical terms this way and make statements like: "customer A likes action movies, except not with Keanu Reeves as the main actor"?
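As a minimal sketch of what such mining could look like: intersect the attribute sets of everything a customer liked and see what survives. The movie catalogue, attribute tags and customer likes below are all invented illustrations, not real data.

```python
# Hypothetical catalogue: each video tagged with a few symbolic attributes.
MOVIES = {
    "A": {"genre:action", "actor:keanu_reeves"},
    "B": {"genre:action", "actor:bruce_willis"},
    "C": {"genre:action", "actor:bruce_willis", "decade:1990s"},
    "D": {"genre:drama", "actor:bruce_willis"},
}

def shared_attributes(liked):
    """Intersect the attribute sets of all liked movies: whatever is
    common to every one of them is a candidate 'rule' about the customer."""
    return set.intersection(*(MOVIES[m] for m in liked))

# Customer A likes A, B, C -> every liked movie is an action movie.
print(shared_attributes(["A", "B", "C"]))  # {'genre:action'}

# Customer B likes D, C, B -> the common thread is the actor, not the genre.
print(shared_attributes(["D", "C", "B"]))  # {'actor:bruce_willis'}
```

Of course, this only works because the attributes were hand-labelled in advance; the hard part the text describes is getting from raw preference data to such symbolic tags in the first place.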

Establishing such rules requires a lot of knowledge about the concepts that the computer is dealing with. As an example, SVD (Singular Value Decomposition) is an algorithm that can be used to analyze preferences or "likeness" of books or videos, but it cannot state anything in our language about those concepts. If it were possible to construct phrases from such an analysis, then the computer could also use that knowledge to develop (executable) rule sets.
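The gap can be made concrete with a tiny example: SVD finds latent structure in a ratings matrix, but the resulting factors are unnamed columns of numbers. The matrix below (rows = customers, columns = videos A..D) is invented for illustration.

```python
import numpy as np

ratings = np.array([
    [5.0, 4.0, 5.0, 1.0],   # customer 1
    [4.0, 5.0, 4.0, 2.0],   # customer 2
    [1.0, 2.0, 1.0, 5.0],   # customer 3
])

U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

# Rank-2 reconstruction: a compact numerical model of the preferences.
approx = (U[:, :2] * s[:2]) @ Vt[:2, :]
print(np.round(approx, 1))

# But each latent dimension is just numbers. Nothing here says
# "dimension 1 means 'likes action movies'"; attaching such labels is
# exactly the missing step the text points at.
print(np.round(Vt[:2, :], 2))
```

The decomposition predicts preferences well, yet produces no statement a person (or a rule engine) could read.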

Or maybe we shouldn't start with analysis in the first place, but with rule sets: develop a hypothesis and then test, through the mathematical analysis, to what degree it is true?
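A sketch of that hypothesis-first direction: state a rule up front, then measure how far the observed data backs it. The records, the attribute names and the confidence measure below are invented assumptions for the sake of the example.

```python
# Hypothetical observation records for one customer.
records = [
    {"genre": "action", "liked": True},
    {"genre": "action", "liked": True},
    {"genre": "action", "liked": False},
    {"genre": "drama",  "liked": False},
    {"genre": "drama",  "liked": True},
]

def confidence(records, condition, conclusion):
    """Fraction of records matching `condition` that also satisfy
    `conclusion` -- a crude 'to what degree is this rule true' score."""
    matching = [r for r in records if condition(r)]
    if not matching:
        return None   # no experience on the topic: cannot verify either way
    return sum(conclusion(r) for r in matching) / len(matching)

# Hypothesis: "this customer likes action movies".
score = confidence(records,
                   condition=lambda r: r["genre"] == "action",
                   conclusion=lambda r: r["liked"])
print(score)  # 2 of 3 action records were liked -> 0.666...
```

Note that the `None` branch corresponds to the first outcome in the list further below: no evidence at all is different from evidence against.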

The ability of a computer to switch between sets of rules and mere "analytical processing", even though it was not programmed to do so in the first place, should be a very important area of research for the future. Learning for human beings is also about assimilating "knowledge statements" from our peers and then testing whether those statements hold against our experience of reality.

The outcome can be:
  • No experience on the topic, so I cannot verify whether it is true or false (insufficient evidence to validate your claim).
  • Insufficient knowledge to parse the statement (I don't know what you mean, could you rephrase that please?).
  • That sounds interesting. Indeed, I have some evidence that suggests your claim is true. Can you give me more examples?
A rule should probably also be given a strength rating. We'll often find outcomes that contradict a rule. In those cases we could be missing an "except-if" clause, or an "and-A-and-B" condition where we failed to observe B: it was true most of the time, except in the last case, where it was false.
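One way to sketch such an "except-if" discovery: split the records a rule applies to into confirming and contradicting cases, and look for attribute values that appear only among the counterexamples. Everything below (records, attributes, the uniqueness heuristic) is an invented illustration.

```python
# Hypothetical records: the 'action' rule holds except for one movie.
records = [
    {"actor": "bruce_willis", "genre": "action", "liked": True},
    {"actor": "bruce_willis", "genre": "action", "liked": True},
    {"actor": "keanu_reeves", "genre": "action", "liked": False},
]

def refine(records, condition):
    """Return the rule's strength plus attribute values seen only in
    contradicting cases -- candidates for an 'except-if' clause."""
    matching = [r for r in records if condition(r)]
    confirming = [r for r in matching if r["liked"]]
    contradicting = [r for r in matching if not r["liked"]]
    strength = len(confirming) / len(matching)
    pos = {(k, v) for r in confirming for k, v in r.items() if k != "liked"}
    neg = {(k, v) for r in contradicting for k, v in r.items() if k != "liked"}
    return strength, neg - pos

strength, exceptions = refine(records, lambda r: r["genre"] == "action")
print(strength)    # 0.666... -- the rule holds two times out of three
print(exceptions)  # {('actor', 'keanu_reeves')} -- candidate except-if clause
```

With more data the same mechanism would surface the missing "and-B" conditions as well: any attribute that separates confirming from contradicting cases is a candidate refinement.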

How do you design a rule-based program that isn't as explicit and hard as, for example, Prolog, but more like a "soft-rule" program: one that accepts statements that are generally true but not necessarily always, and where the computer can verify for itself the strength of those claims, as well as form new ones based on observed data?
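A toy sketch of what one such "soft rule" might look like: a statement accepted as generally true, whose strength is revised as observations confirm or contradict it. The Laplace-style smoothing used here is one assumed design choice, not an established algorithm.

```python
class SoftRule:
    """A statement that is generally, but not necessarily always, true."""

    def __init__(self, description):
        self.description = description
        self.confirmed = 0
        self.contradicted = 0

    def observe(self, held):
        """Record one observation: did the rule hold this time?"""
        if held:
            self.confirmed += 1
        else:
            self.contradicted += 1

    def strength(self):
        # Start at 0.5 (no evidence either way) and drift toward the
        # observed confirmation rate as evidence accumulates.
        return (self.confirmed + 1) / (self.confirmed + self.contradicted + 2)

rule = SoftRule("customers who like action movies will like video C")
print(rule.strength())            # 0.5 -- no evidence yet

for held in [True, True, True, False]:
    rule.observe(held)
print(round(rule.strength(), 2))  # 0.67 -- generally true, not always
```

Unlike a hard Prolog clause, a contradicting observation doesn't make the rule fail; it just lowers a strength that queries could then weigh.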
