2010.07.30
This week has been about researching the next steps for the Germination X design, principally looking at how we can use the game to demonstrate and exploit some of the work on the Lirec project. I’ve had a speculative look at how we could use companions to fit the theme of a permaculture game, and ended up with some strange combination of bickering Greek gods and plant spirits.
I’ve also been playing with the AgentMind code from our partners at Heriot-Watt and got some of my own AI agents running and interacting with objects in a test world. At the other end, the client game code is now free from Flash dependencies, so it should be possible to target HTML5 canvas and Flash from the same code.
I’ve also switched from svn to git for everything and moved to gitorious. Germination X is here and all the rest of my mess is now located here.
2010.01.06
A particle filter is a technique used in computer vision to estimate the state of a system, given noisy data from fallible sensors. The underlying idea is called a hidden Markov model, and looks something like this:

The assumption is that any state of the system is dependent on its previous state (and thus all previous states), and that this state is something we can never know directly, only via observations. There are two very different sources of inaccuracy or noise. One originates in the state change process, as it’s assumed we can never have a complete model of this (a good bet in the case of human actions, for example). The other source of noise is in the observation process itself, which comes from the way the sensors work. This is more predictable, and filters of this type are built to allow you to account for it.
Particle filters maintain a multitude of hypotheses of the hidden state of the system at the same time. They attempt to model state changes in some basic way, for instance the velocity of a moving object. They also model the observation process, for example a distance/angle reading of an object in x,y space. Each time a new observation arrives, the system grades each particle’s simulated observation against the incoming one and weights them accordingly.

This is a frame from some particle filter code I’ve written which is tracking an object as described above. The line is pointing to the current estimate, which is based on readings from a radar-like sensor. I’ve told the system that the heading sensor is less reliable than the distance sensor, and so the particles are spread out in a vague crescent shape accordingly. This shape is called the probability density function (or pdf), and it’s a strength of particle systems that they model complex pdfs such as this effectively.
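The predict, weight and resample loop described above can be sketched roughly as follows. This is not the code behind the frame above, just a minimal illustration in plain Python: the function names, the constant-velocity motion model, the Gaussian scoring and all the noise values are assumptions made for the example, with the sensor sitting at the origin.

```python
import math
import random

def predict(particles, process_noise):
    """Advance each (x, y, vx, vy) particle one step, jittering its velocity."""
    moved = []
    for x, y, vx, vy in particles:
        vx += random.gauss(0.0, process_noise)
        vy += random.gauss(0.0, process_noise)
        moved.append((x + vx, y + vy, vx, vy))
    return moved

def observe(x, y):
    """Radar-like reading from a sensor at the origin: (distance, heading)."""
    return math.hypot(x, y), math.atan2(y, x)

def likelihood(error, sigma):
    """Unnormalised Gaussian score for one sensor error."""
    return math.exp(-0.5 * (error / sigma) ** 2)

def angle_error(a, b):
    """Signed difference between two angles, wrapped to [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

def weigh(particles, reading, dist_sigma, heading_sigma):
    """Grade each particle's simulated reading against the real one."""
    dist, heading = reading
    weights = []
    for p in particles:
        p_dist, p_heading = observe(p[0], p[1])
        w = likelihood(p_dist - dist, dist_sigma)
        w *= likelihood(angle_error(p_heading, heading), heading_sigma)
        weights.append(w + 1e-300)  # keep the total above zero
    total = sum(weights)
    return [w / total for w in weights]

def estimate(particles, weights):
    """Weighted mean position of all hypotheses."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return x, y

def resample(particles, weights):
    """Draw a new population, favouring the well-graded particles."""
    return random.choices(particles, weights=weights, k=len(particles))

def run_demo(steps=20, n=1000, seed=1):
    """Track a made-up object; returns (estimated position, true position)."""
    random.seed(seed)
    truth = [5.0, 0.0, 0.0, 0.5]        # x, y, vx, vy (illustrative values)
    dist_sigma, heading_sigma = 0.2, 0.3  # heading trusted less than distance
    particles = [(random.uniform(0, 10), random.uniform(-5, 15),
                  random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
    est = None
    for _ in range(steps):
        truth[0] += truth[2]
        truth[1] += truth[3]
        d, h = observe(truth[0], truth[1])
        reading = (d + random.gauss(0, dist_sigma),
                   h + random.gauss(0, heading_sigma))
        particles = predict(particles, 0.05)
        weights = weigh(particles, reading, dist_sigma, heading_sigma)
        est = estimate(particles, weights)
        particles = resample(particles, weights)
    return est, (truth[0], truth[1])
```

Making `heading_sigma` larger than `dist_sigma` is what lets particles survive along an arc at roughly the right distance, which is one way to get a crescent-shaped spread like the one in the frame.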
2009.11.30
I spent last week in Budapest; the first half was a Lirec consortium meeting. We spent Tuesday morning finding out about dog and human behaviour at the Department of Ethology at the Eötvös Loránd University. I think the most fascinating part for me was the area of human understanding of dog vocalisation, as we have mutually developed a complex communication system with dogs over the last 100,000 years, with very little in the way of what we usually call language.
I also spent a couple of days at the wonderful kibu (or kitchen budapest) meeting up with Gabor and Agoston, doing a presentation about groworld and the resilients project with Nik, and talking a lot about fluxus.

2009.11.11
I’m flitting around a lot between projects… Back on appearance models for the Lirec project, this is a small slice of face space. The plots represent images of the individuals in different lighting conditions, to see how the lighting affects the spread of the data. One of the images for each individual is shown at the top, along with its symbol. The axes refer to only 3 of the 600 dimensions in the face space I’m using for this; I’ve picked some good ones so you can see how the individuals are clustered.

User identification happens simply by finding the closest known face to the one the camera can see. Actually, currently the face classifier cheats by finding the closest average centre of the known faces for each individual, but looking at these plots I don’t think this is a good approach as the ‘blobs’ aren’t very spherical.
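The difference between the two approaches shows up when one person’s samples form a streak rather than a round blob. Here is a small sketch in plain Python, using made-up 2D vectors in place of real face-space coordinates; the function names and the data are assumptions for the example, not the classifier’s actual code.

```python
import math

def distance(a, b):
    """Euclidean distance between two face-space vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Average centre of one individual's known faces."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def classify_by_centroid(face, known):
    """The 'cheat': compare against each individual's average centre."""
    return min(known, key=lambda name: distance(face, centroid(known[name])))

def classify_by_nearest(face, known):
    """Compare against every known face and take the closest one."""
    return min(known,
               key=lambda name: min(distance(face, v) for v in known[name]))

# Toy data: alice's samples are stretched out (say, by lighting changes),
# bob's form a tight cluster.
known = {
    "alice": [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [6.0, 0.0], [8.0, 0.0]],
    "bob": [[8.0, 2.0], [8.2, 2.2], [7.8, 1.8]],
}
new_face = [8.0, 0.6]

print(classify_by_centroid(new_face, known))  # bob
print(classify_by_nearest(new_face, known))   # alice
```

The new face sits right at the end of alice’s streak, but her average centre is far away in the middle of it, so the centroid version picks bob instead.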
This is also (rather shamefully) my first go with gnuplot which I’m liking a lot.
2009.11.10
Making myself look ridiculous as usual, but this works better than I had expected:
It’s taking the vector in face space between example smile and frown expressions and then projecting the new face it doesn’t know about onto it (the dot product in multiple dimensions) to give a value for how smiley or how frowny a face is. I’m calibrating it with my own expressions for the moment, but it does seem to work on other people to some extent. More data would make it more robust, but the theory seems good! The code is here and here.
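As a rough illustration of that projection, here is a sketch in plain Python. The function names are made up, and dividing by the axis length squared (so that the frown example scores 0 and the smile example scores 1) is an assumption for the example rather than something the linked code necessarily does.

```python
def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def expression_score(face, frown, smile):
    """Project a face onto the frown-to-smile axis in face space.

    Returns 0.0 at the frown example and 1.0 at the smile example;
    anything off the axis is ignored by the projection.
    """
    axis = subtract(smile, frown)
    return dot(subtract(face, frown), axis) / dot(axis, axis)

# Toy 3D face space: only the first dimension separates the two examples.
frown = [1.0, 0.0, 2.0]
smile = [3.0, 0.0, 2.0]
unknown = [2.0, 5.0, 2.0]

print(expression_score(unknown, frown, smile))  # 0.5
```

The unknown face differs a lot from both examples in the second dimension, but since that direction is perpendicular to the axis it has no effect on the score.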
2009.11.06
Parametrising gormlessness:

I’m trying to parametrise expressions, which involves making faces at a web camera all day. The bars along the top visualise the parameters for the face model that my face generates, and the rather scary image next to my face is the result of putting these numbers back into the face model to synthesise my face. The training data is not really that great, but it seems to be able to represent expressions more or less so I’m hoping with a little bit more work I can get an estimation on what expression you are pulling.
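For a linear, eigenface-style face model, the round trip from image to parameters and back can be sketched like this. It is a simplified illustration in plain Python, assuming the model is a mean face plus a small set of orthonormal basis vectors; the names `analyse` and `synthesise` and the tiny 3-pixel "images" are invented for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def analyse(face, mean, basis):
    """Turn an image into model parameters (one number per basis vector)."""
    centred = [f - m for f, m in zip(face, mean)]
    return [dot(centred, vector) for vector in basis]

def synthesise(params, mean, basis):
    """Put the parameters back into the model to rebuild an image."""
    face = list(mean)
    for p, vector in zip(params, basis):
        face = [f + p * v for f, v in zip(face, vector)]
    return face

mean = [1.0, 1.0, 1.0]
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # orthonormal, but only 2 of 3 dims
face = [3.0, 0.0, 5.0]

params = analyse(face, mean, basis)
rebuilt = synthesise(params, mean, basis)
print(params)   # [2.0, -1.0]
print(rebuilt)  # [3.0, 0.0, 1.0]
```

The rebuilt image only matches the original in the directions the basis covers; everything else falls back to the mean, which is one reason a synthesised face can look rather off when the training data is thin.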
I’m not terribly confident, but a day or so’s work should provide an answer either way.
2009.09.11
I’ve been starting to get back into the Lirec project again this week, starting off by wrapping all my C++ vision code in Python in order to script it. This has sped the research work up already, as I’ve been able to script some initial experiments on expression recognition in a couple of days. I’m currently using the Yale Face Database as my training data for the expression appearance model, but it’s not very good as there aren’t really enough faces. I’ve registered for the AR face database and the PIE database from Carnegie Mellon too.
There is a limit to how much development of the competencies (the low level things the robots need to do) really furthers Lirec’s research aims, so I’m also looking for places where robots already form long term companionships with humans – there are some surprising cases cropping up in the military which need further investigation.
2009.08.06
This gang of motley characters are the eigenfaces expressed through time, so I can see what kind of changes each vector represents in my eigenface-space. If you look closely they tend to express lighting changes, expressions, face shape and pose (head rotation), often jumbled together in some strange form. The next target is to separate out these ‘modes of change’ into clean vectors, so you only see expression, or only lighting changes etc. in each vector. Then it becomes possible to build an appearance model which understands an incoming image in ways which are useful, e.g. ‘This is a face which looks like Bob, he seems to be smiling, and the light is coming from the left’. Well, that’s the theory anyway.
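One simple way to express a single vector through time is to add it to the mean face with a coefficient that sweeps from negative to positive. The sketch below assumes that is roughly what the animation does; the function name, the linear sweep and the 2-pixel "images" are all made up for illustration.

```python
def sweep_mode(mean, vector, amplitude, n_frames):
    """Frames of mean + t * amplitude * vector, with t going from -1 to 1.

    n_frames must be at least 2.
    """
    frames = []
    for i in range(n_frames):
        t = -1.0 + 2.0 * i / (n_frames - 1)
        frames.append([m + t * amplitude * v for m, v in zip(mean, vector)])
    return frames

mean = [10.0, 20.0]
vector = [1.0, -1.0]  # a stand-in for one eigenface

for frame in sweep_mode(mean, vector, amplitude=2.0, n_frames=3):
    print(frame)
# [8.0, 22.0]
# [10.0, 20.0]
# [12.0, 18.0]
```

The middle frame is the mean face itself, and the two ends show what the vector does when pushed in each direction.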
2009.08.05
I now have two methods of face identification. In an attempt to apply more method to my madness, I’ve been compiling images to use in benchmark tests to find out which one is better, and by how much. I’ve used the Yale Face Database B, which has ten people in lots of lighting conditions, giving the algorithms 4 images of each person in good lighting to train on and the rest to recognise, to find where each one breaks down.

On the left is the faceident program, which uses raw differencing on the face image pixels (basic stuff); on the right is the new faceclassifier program, which uses the eigenfaces approach, a trained appearance model. The subjects should be numbered 0-4 from left to right, there are 40 images of each one, in increasingly difficult lighting.
The difference is not too staggering: the faceclassifier is 9% better than faceident (56% vs 46% correct). However, faceident is about as good as I can get that approach, while there is lots of room for tuning the faceclassifier. I need to try using different face databases for training (currently it’s using Dr Libor Spacek’s one I was playing with earlier) and also methods of projecting away things we are not interested in from the faces we want to recognise, such as pose, lighting and expression. Having benchmarks like this will help immensely for this process too, so I can compare iterations.
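A benchmark of this shape boils down to a train/test split plus a hit count. The following is a minimal sketch in plain Python, not the actual test scripts: the `nearest_neighbour` function is just a stand-in for either program, and the one-dimensional "images" are invented so the result can be checked by hand.

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(image, train):
    """Stand-in classifier: the subject owning the closest training image."""
    return min(train,
               key=lambda s: min(distance(image, t) for t in train[s]))

def split(dataset, n_train):
    """First n_train images per subject (good lighting) train, the rest test."""
    train = {s: imgs[:n_train] for s, imgs in dataset.items()}
    test = [(s, img) for s, imgs in dataset.items() for img in imgs[n_train:]]
    return train, test

def accuracy(classify, train, test):
    """Fraction of test images assigned to the right subject."""
    hits = sum(1 for subject, image in test
               if classify(image, train) == subject)
    return hits / len(test)

# Toy data, ordered from easy to hard lighting for each subject.
dataset = {
    "a": [[0.0], [1.0], [2.0], [6.0]],
    "b": [[10.0], [11.0], [9.0], [4.0]],
}
train, test = split(dataset, n_train=2)
score = accuracy(nearest_neighbour, train, test)
print(f"{score:.0%} correct")  # 50% correct
```

Because `accuracy` takes the classifier as an argument, the same split can be reused for each iteration of an algorithm, which keeps the comparison fair.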