Tuesday, May 12, 2009

Why RBMs are so strangely weird

I'm getting quite obsessed with RBMs for some strange reason. There's a strange simplicity to the RBM: an elegant method for learning through contrastive divergence and a surprising ability to model many different things. Current research shows that RBMs certainly have limitations, but here we go and try to expand on that.

An RBM is a strange kind of neural network. Artificial neural networks as we usually know them propagate a signal in the forward direction only, but RBMs work in both the forward and the backward direction. In a sense, you could say that this is a little like our own minds: when we observe something, we use the details from the input signal to enrich the actual observation, but at the same time we use information from our experience to enrich, or anticipate, what is being observed. I reckon that if we relied only on the observed state, that state wouldn't be nearly as rich as our mentally induced state, which blends our experience with our observations.
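To make the two directions concrete, here's a minimal numpy sketch of an RBM doing a forward (recognition) pass, a backward (expectation) pass over the same weights, and one step of the CD-1 contrastive divergence update. The layer sizes, learning rate and random weights are made-up illustrative assumptions, not anything from this post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy sizes and random initial weights (assumptions).
n_visible, n_hidden = 6, 4
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # one shared weight matrix
b_v = np.zeros(n_visible)  # visible biases
b_h = np.zeros(n_hidden)   # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v):
    # "Observation" direction: visible -> hidden activation probabilities.
    return sigmoid(v @ W + b_h)

def backward(h):
    # "Expectation" direction: hidden -> visible, reusing the SAME
    # weights transposed -- this is what makes the RBM bidirectional.
    return sigmoid(h @ W.T + b_v)

# One CD-1 (contrastive divergence) step on a single binary input:
v0 = rng.integers(0, 2, size=n_visible).astype(float)
h0 = forward(v0)
h0_sample = (rng.random(n_hidden) < h0).astype(float)  # stochastic hidden state
v1 = backward(h0_sample)   # reconstruction: what the RBM "expects" to see
h1 = forward(v1)

lr = 0.1  # assumed learning rate
W += lr * (np.outer(v0, h0) - np.outer(v1, h1))  # positive minus negative phase
b_v += lr * (v0 - v1)
b_h += lr * (h0 - h1)
```

The key point is that `backward` is not a separate trained decoder: it's the same weight matrix run in reverse, so recognition and expectation are two uses of one model.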

Here's a video that might blow your mind, or not... It's a presentation by Giulio Tononi, which I found very compelling. His theory isn't a search for the quantity of neurons required to become conscious, or for the localization of consciousness within the brain; it's a theory about the most effective organization of neurons within a network for that network to exhibit consciousness.

Here's where I paid close attention. Apparently, a network in which all neurons are connected to all others is useless. And a very large network with a high number of local connections can be good at something, but it doesn't have the right properties for consciousness either. The best arrangement is a network of specialized neurons, connected in patches, with occasional long-range connections to other parts of the mesh. Much of the work is about quantifying consciousness: if such a quantification is in step with actual consciousness, one can use it to search for more effective ways of building neural networks or machines.

This "patchiness" property suggests that blindly connecting neurons together isn't the most effective way to build a network. A highly regular network makes the system act like a general on/off machine, losing its specificity of function. Neurons that are too sparsely connected make it behave like a collection of independent classifiers, which isn't good either.
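The "patches with occasional long-range links" arrangement is essentially a small-world topology, which can be sketched as a ring lattice with random rewiring (in the spirit of Watts–Strogatz). The sizes and rewiring probability below are hypothetical choices for illustration only.

```python
import random

random.seed(1)

# Assumed parameters: 20 neurons, each wired to its 4 nearest
# neighbours (local "patches"), with a 10% chance per edge of being
# rewired into a random long-range connection.
n, k, p = 20, 4, 0.1

edges = set()
for i in range(n):
    for j in range(1, k // 2 + 1):
        target = (i + j) % n              # local, patch-like connection
        if random.random() < p:           # occasionally rewire long-range
            target = random.randrange(n)
            while target == i or (min(i, target), max(i, target)) in edges:
                target = random.randrange(n)
        edges.add((min(i, target), max(i, target)))

# 'edges' now describes a mostly-local mesh with a few shortcuts:
# specialized neighbourhoods, plus the long connections "now and then".
```

Mostly-local wiring keeps functional specialization, while the few long-range shortcuts keep the whole mesh integrated; either extreme (fully regular or fully random) loses one of the two properties.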

Most NNs and RBMs build their theories around having some number of neurons or elements connected evenly to other layers, and then learning a kind of "weight" from one element to another. Putting more neurons into a layer generally makes the network more effective, but the improvement is asymptotic.

I wonder whether it's possible to develop a theory, complementary to this theory of the quantity of consciousness, that allows a neural network to shape its own topology, or whether such theories at least provide better rules for constructing networks. One good guess would be to observe the biological growth and connection-shaping of a brain, or of simpler structures, and then look for the patterns that evolve in the generation of such a network.

Finally, the most interesting words of the hypothesis:

Implications of the hypothesis

The theory entails that consciousness is a fundamental quantity, that it is graded, that it is present in infants and animals, and that it should be possible to build conscious artifacts.

This is a huge implication. In order to understand it, go back to the start of this post: consciousness == experiencing things. As said before, our observations carry detail which is processed by itself, but which is also completed by previous experiences. Our actual experience is therefore not just the observations we make, but the sum of those observations plus memories, evoked emotions, and so on. In a way, what we observe causes us to feel aroused, or to have some kind of feeling, and seeing a similar thing again later may cause us to experience the actual observation and the previous experience (memory) at the same time. It's very likely that not all experiences are consciously lived, in the sense that we're not aware of every experience we could have; many experiences probably sit just below the surface of consciousness as a kind of potential or stochastic possibility, waiting to be activated by changes in the temporal context.

For example, rapid changes in our direct observations can cause instant changes in behaviour. This implies that besides observing the world like a thoughtless camera consuming light rays and sound waves, we also experience the world as a set of stochastic possibilities. The easiest way to demonstrate this is the idea of movement, of intent, of impact and likely effect.

The phrase "I'm standing at a train station and I see a train coming towards me" contains a huge amount of information: the recognition of the train in the first place, the experience that it's moving towards you because it grows larger, the knowledge that the train runs on the tracks you're standing next to, the knowledge that train stations are places where trains stop, and your intent to get on the train. Just spelling out how much knowledge we apply to such a simple situation shows how we accept our consciousness as the most normal thing on earth, which it certainly is not.

Well, so why are RBMs so strange in this sense? Because old-school neural networks don't have these properties: RBMs can both recognize things and fantasize them back. There are certainly current limitations. In previous posts I've argued that we perhaps shouldn't limit the definition to "consciousness == when humans think or experience". With a broader definition, one can also consider machines or A.I.s that are extremely effective in one particular area of functioning, and that might just be conscious in that relevant area without having any kind of consciousness of the things around them. The definition of consciousness used here is a dangerous one, however, since it shouldn't be confused with behaviour, which it certainly is not.
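The "fantasizing back" is usually done by alternating Gibbs sampling: start from noise and bounce between the hidden and visible layers until the visible state settles into something the model finds plausible. A minimal sketch, with assumed toy sizes and random (untrained) weights; a trained RBM would produce fantasies resembling its training data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed toy dimensions and untrained random weights (illustration only).
n_visible, n_hidden = 8, 3
W = rng.normal(0.0, 0.5, size=(n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Draw a binary state from activation probabilities.
    return (rng.random(p.shape) < p).astype(float)

# Start from pure noise and alternate the two directions:
v = sample(np.full(n_visible, 0.5))
for _ in range(100):
    h = sample(sigmoid(v @ W + b_h))    # recognition: infer hidden features
    v = sample(sigmoid(h @ W.T + b_v))  # generation: "fantasize" a visible state
```

After enough alternations, `v` is a sample from whatever distribution the weights encode, which is exactly the sense in which the same network both recognizes and imagines.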

Food for thought...
