Wednesday, January 27, 2010

What knowledge may actually be about...

I've been cycling in the rain today and suddenly got this awesome thought train. Nothing much to do with rain, but I guess that the cooling characteristics of the cold rain and the cold weather probably caused some superconductivity somewhere. I can't go into too much detail about what I've come up with to resolve the general and (as I argue) incorrect assumptions below, but I can at least discuss the problems I perceive that might restrict the ability to develop more intelligent behaviours in computer programs. :)

To give you a starting frame of thought: I'm arguing that many artificial intelligence reasoning models, neural networks and whatever else is out there rest on questionable assumptions about the origin, extent and representation of knowledge. Here is why.

Most computer programs function by taking some parameters from an environment, a file or whatever as input, and based on that they perform some action. A program without input or output is not a computer program. The general idea is thus that by applying some function F(X) to a range of input parameters X, one can derive the action or actions Y that should most logically ensue. The diagram here helps to illustrate this further. It is a depiction of a so-called Markov diagram. These are often used in A.I. to reason about and between states. One can move from state to state along the transitions, but the arrows also restrict movement: you cannot jump to just any other state. The diagram need not be statically defined a priori; there are sometimes ways to dynamically extend it at execution time, usually determined by a set of rules. M here is the entire world, consisting of the possible transitions or relations R, the states S (which are more of a placeholder in this case) and the measurable parameters T from which you determine the current state S.
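To make this a bit more concrete, here is a minimal sketch of such a setup. The thermostat-like states, thresholds and action names are all hypothetical choices of mine, just to illustrate the structure of M with its states S, allowed transitions R, measurable parameters T and a fixed mapping F(X).

```python
# A minimal sketch (all names/values hypothetical) of the world M above:
# states S, allowed transitions R (the arrows), measurable parameters T
# from which the current state is determined, and a fixed policy F(X).

STATES = {"idle", "heating", "cooling"}

# R: the arrows of the Markov diagram; movement is restricted to these.
TRANSITIONS = {
    "idle": {"heating", "cooling"},
    "heating": {"idle"},
    "cooling": {"idle"},
}

def current_state(temperature: float) -> str:
    """Determine the state S from a measured parameter T (here: temperature)."""
    if temperature < 18.0:
        return "heating"
    if temperature > 24.0:
        return "cooling"
    return "idle"

def F(percepts: dict) -> str:
    """The fixed mapping F(X): from a snapshot of percepts X to an action Y."""
    state = current_state(percepts["temperature"])
    return {"idle": "do_nothing",
            "heating": "turn_heater_on",
            "cooling": "open_window"}[state]

print(F({"temperature": 15.0}))  # -> turn_heater_on
```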

The above model works very well for a set of discrete parameters, since they always fit into one single state. For continuous parameters, however, it becomes more difficult, because if every single value were considered, the number of states in the model would become infinite. Hence one either needs classification to reduce the number of states, some kind of 'probability distribution' where one is actually a little bit in all states at once, or a fuzzy reasoning method that reasons between just the two states S1 and S2 closest to the current 'continuous' state.
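As a sketch of that last option, here is one way to compute fuzzy memberships in the two nearest states. The state centres and the linear interpolation between them are assumptions of mine for illustration, not a prescribed method.

```python
# Fuzzy membership in the two states S1 and S2 closest to a continuous
# value. The state centres are hypothetical and chosen for illustration.

STATE_CENTRES = {"cold": 10.0, "comfortable": 21.0, "hot": 30.0}

def fuzzy_memberships(value: float) -> dict:
    """Return membership degrees for the two nearest states, summing to 1."""
    # Pick the two centres closest to the measured value.
    (s1, c1), (s2, c2) = sorted(STATE_CENTRES.items(),
                                key=lambda kv: abs(kv[1] - value))[:2]
    lo, hi = min(c1, c2), max(c1, c2)
    v = max(lo, min(hi, value))          # clamp values outside the segment
    w1 = 1.0 - abs(v - c1) / (hi - lo)   # linear interpolation towards S1
    return {s1: round(w1, 2), s2: round(1.0 - w1, 2)}

print(fuzzy_memberships(18.0))  # -> {'comfortable': 0.73, 'cold': 0.27}
```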

The huge assumption made here is that all information leading to some desired action can be derived from a snapshot of the external world, just by inspecting some range of percepts X. Reasoning and intelligence are then reduced to the application of some function F(X), even though this can be quite a complicated affair when there is a possibly very large number of percepts X or a wide range of possible actions Y. The complexity due to the dimensionality of the problem is irrelevant for this conclusion, however. Even more worrying is that there is only one single assignment of intention or purpose in this entire system. The purpose is the definition of the function F(X) itself, and it is difficult to change once the system has been constructed or trained... Hmmm... :). So one cannot expect such a system to generalize a bit and then specialize further into other things.

So, to recap... having a function F(X) implies a direct, causal relationship between (a range of) perception(s) and a most logical (range of) action(s), which only holds if the purpose is fixed and pre-defined and doesn't change. Or worse... if the context does not change. Also, knowledge is then only defined as the function that needs to be applied to the percepts in order to derive the action Y; it is not contextual knowledge about what is required in which context. In effect, the only way to make the system better able to deal with situations is to measure more and more in that single snapshot and then try to fit the function F(X) to those measurements.
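A toy example of why the snapshot alone is not enough: the same percepts can call for different actions depending on the current purpose, so a single fixed F(X) cannot capture both. The contexts and thresholds below are, again, made up purely for illustration.

```python
# The same snapshot of percepts X leads to different actions Y depending
# on the context/purpose, so a single context-free F(X) cannot cover both.

def act(percepts: dict, context: str) -> str:
    """Context-dependent choice: the percepts alone do not determine Y."""
    temperature = percepts["temperature"]
    if context == "save_energy":
        return "do_nothing" if temperature > 15.0 else "turn_heater_on"
    if context == "keep_comfortable":
        return "turn_heater_on" if temperature < 20.0 else "do_nothing"
    raise ValueError(f"unknown context: {context}")

snapshot = {"temperature": 17.0}
print(act(snapshot, "save_energy"))       # -> do_nothing
print(act(snapshot, "keep_comfortable"))  # -> turn_heater_on
```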

However, instead of assuming that it is the states that embed the information, one should consider that it is the transitions that allow one to collect the real knowledge about events, since transitions embody relationships. A transition holds the real information: some state S1, the next state S2, the difference in the percepts, the action(s) that was undertaken and whether S2 is actually a better state to be in than S1, from which one can derive whether the action was desirable or not, given a certain context. Note that this doesn't yet require much information about a goal state Sg. One might as easily conclude that there is no direct need to know about the eventual goal state or to have a real conception of what it looks like; at this point we can still assume we're slowly and gradually moving towards the goal state, and there's still a capability to change direction or impact the environment to reach the goal anyway (but this is getting too complicated at this point).
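To show what such transition-centred knowledge might look like, here is a small sketch that records S1, the action, S2, the change in percepts and the context, and judges desirability without any reference to a goal state Sg. The field names and the desirability rules are my own illustrative assumptions.

```python
# Knowledge stored in the transitions rather than the states: each record
# keeps S1, the action taken, S2, the observed change in percepts and the
# context, and judges whether S2 was a better state than S1 in that context.

from dataclasses import dataclass

@dataclass
class Transition:
    s1: str                 # state before the action
    action: str             # action that was undertaken
    s2: str                 # state after the action
    percept_delta: float    # observed change in the relevant percept
    context: str            # the context in which this happened

    def desirable(self) -> bool:
        """Was S2 a better state to be in than S1, given the context?"""
        if self.context == "keep_comfortable":
            # Moving towards 'idle' (comfortable) counts as progress.
            return self.s2 == "idle" and self.s1 != "idle"
        if self.context == "save_energy":
            return self.action == "do_nothing"
        return False

t = Transition(s1="heating", action="turn_heater_on", s2="idle",
               percept_delta=+3.5, context="keep_comfortable")
print(t.desirable())  # -> True, without ever referring to a goal state Sg
```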

This is why I think that discarding the idea of a causal relationship between percepts and the most likely or desired action should be a first objective on the way to more intelligent systems. Knowledge is more likely some kind of imagination or dreaming of the effect of actions from any given 'general state', together with the kinds of changes in perception by which one can measure such progress, all of this framed within a particular context. This should allow systems to reason with shorter horizons, or to reason at different levels with different horizons, but always within a specific and particular context at each level. The definition of how many transitions you allow before drawing conclusions, or of the extent of correlation between events, is then part of the game and part of the experience...
