Uncanny valley and the biology of mind

This story is about Sophie Faber, the main character in the adventure game "Die Stadt Noah", which was developed by students at HDM and other institutions in Stuttgart. Headed by Thomas Fuchsmann, and with the help of many others, a team of at times up to 50 artists, project managers and software developers created this 3-D adventure.

At the core of playing such a 3-D game is immersion: the ability of a game to draw the player in so deeply that the borders between game character and player start to get fuzzy and the player somehow merges with the character. But this requires the game character - Sophie in our case - to be absolutely believable, or players will not be able to experience immersion.

Valentin Schwind - 3-D modeling expert and graphics artist (see valisoft) - and Norman Pohl - expert game software developer - were responsible for the game character Sophie Faber and describe the steps needed to achieve a believable character. Sophie did not only receive her looks; she was also animated with the help of an inverse kinematics system.

A seemingly endless number of modeling steps is necessary to achieve perfection. The steps and the associated tools are all described in the thesis "Modellierung, Darstellung und interaktive Animierung virtueller Charaktere in 3D-Echtzeitanwendungen - Die Entstehung eines Charakters und die dynamische Animierung eines Biped-Systems für eine interaktive 3D-Welt" by Valentin Schwind and Norman Pohl. The following pictures and explanations were taken from the thesis with permission. The picture below shows the bone model used for animating face and hair during rigging - which looks like a piece of art in itself, at least to my mind. And there are some surprising principles which make it even harder to create a believable character. The character must be detailed and artistically satisfying while still being usable for the high-speed, real-time 3-D rendering that games require.

Unfortunately I cannot show all the intermediate stages from the wire model to the final result. Sculpting, texturing etc. all create very interesting artefacts which would be worth showing. I just hope that the authors consider turning the thesis into a book. Gamers and game developers would certainly be interested in learning more about the currently available technology, but also about the limits of character creation.

I am going to talk about:

The evolution of Sophie Faber
The Uncanny Valley
The biology of mind reading

Sophie Faber

The Noah team described the main game character as a young female police officer, Caucasian, with fair skin and slightly red hair. She is introverted, brave and intelligent. Her voice is soft and calm, clear and not high-pitched. She looks clean and seems to be a sports freak as well. She lost her parents at a very young age and was raised by the city where she now works as a dam-protection officer responsible for data security. She has just started her job with the dam police and feels that she needs to prove herself. (Noah-team, thesis pg.29)

A number of visual concepts were developed, and over time the character became a bit more dynamic, which created some tension with the original concepts. The pictures show various concept studies of Sophie. The last picture shows the final version of Sophie, which was modeled and animated during the thesis work.

There have been many game characters before Sophie, e.g. Super Mario and Lara Croft. They differ hugely in the level of detail and the realism achieved. Some come close to reality, some are comic-like characters. And sometimes it is the more realistic ones that have trouble allowing immersion. The reason for this is the "uncanny valley" effect.

The Uncanny Valley

According to the authors, 3-D computer games are currently on their way into the "uncanny valley". The uncanny valley is an effect that appears once artificial characters become "almost" completely realistic. The more natural a game character looks, the more critical the brain seems to become. This leads to the strange effect that comic-like characters appear more human-like (and therefore allow better immersion) than more realistic ones. Once a character tries to look realistic, only "fully realistic" will do for the brain (Schwind Thesis, pg. 16). This in turn causes huge problems for game developers, because animating the characters makes the problem significantly worse.

Currently game hardware is simply unable to do all the things necessary to achieve complete realism in game characters. Schwind mentions a number of problems at the end of the thesis: motion capture, for example, is currently not able to capture the many fine movements within a human face. 3-D models are currently covered with a 2-D texture that does not take into account the body parts moving below the surface. These movements, which affect the surface reflections etc., would have to be controlled by a physics engine. Hair and skin are also problem areas which cannot be solved today in animated 3-D games.

The thesis contains a photograph of an Indonesian girl to show the difference between model and reality. But is a photograph REALLY a good thing to compare a computer-generated face against?

The uncanny valley is certainly a product of human psychology and biology. But it could even be a product of culture. Isn't a picture of a face the ultimate in realism? It looks that way, certainly to those of us who look at pictures of faces every day. But is a picture REALLY a realistic representation of a face? A look at the history of photography and movies should make us a bit careful here. A photograph of a face is a heavy transformation from 3-D to 2-D with several other distortions (color etc.) added. Some faces suffer from being photographed, as some actors had to learn when the movies were invented. Actors who were successful at the theatre flopped completely in movies - their faces were not "movie-capable"; in other words, the transformation did not do justice to their faces. Others had a so-called "movie face".

The fact that we think a photograph of a human face IS realistic is the result of a cultural adaptation process in our brain. Just as our brain can get used to estimating distances in two-dimensional pictures (or with one eye only), or to covering the blind spot in our eye, it makes us believe the photograph is realistic. We have simply got used to being permanently exposed to media. But what if you show pictures of faces to people who have never been exposed to media before - will they take the picture for the human just as easily? I am not sure about it.

The biology of mind

By a happy coincidence I read the book "Mind Wide Open" by Steven Johnson this summer. Johnson gives an overview of the current state of brain research. I wasn't too impressed except for a few notable facts. The thing I remember best was the section on "mind reading". (Almost) everybody can read the minds of other people. Now this has nothing to do with parapsychology or high-tech (at least not yet). It is the simple fact that the multi-layer distributed system that makes up our brain has been built over a long time. Some of the functional areas developed early, some rather late, e.g. the neo-cortex - seat of logic and abstraction and a rather slow buddy as well. The emergence of human beings as social creatures has had an impact on the development of our brain: it developed a functional area that deals with interpreting other humans at high speed. It was very important to understand how a partner or enemy would react. Minuscule changes in body or facial expression - far too small and fast for the neo-cortex to notice - were successfully interpreted by the amygdala and allowed our ancestors to be prepared, e.g. for a pending attack.

I said that almost all people can read minds. Actually, empirical research has shown that a rather large percentage of people (I believe it was beyond 10 percent) have serious problems interpreting the signs from other people. The most extreme form is called autism, but there are many lighter forms (Asperger's, e.g.) and they are much more common. Imagine being unable to understand the tiny and fast reactions of the person opposite you to something you said. Did you hurt his feelings? Did he like what you said? Is his smile genuine or faked? It must be extremely hard to live like that, even though research has also shown that people with a mind-reading disorder are frequently highly intelligent and - surprise - gather at universities, where a lack of social skills can be compensated for with extreme concentration on scientific topics.

So when I learned about the "uncanny valley" I wasn't really all that surprised. Johnson had explained, for example, how to differentiate a genuine smile expressing comfort from a faked one: the faked smile does not include the muscles around the eyes - your eyes don't smile with the rest of the face. This is immediately detected by the amygdala and reported. The neo-cortex is much slower and is still looking at the other parts of the face that seem to smile. The overall impression in your brain when it sees a fake smile is: somebody is smiling at me - so why do I feel uncomfortable, why do I sense something strange or wrong? It then depends on whether you are willing to accept the uncomfortable feeling and discover the reason for it, or whether you suppress the signals from your amygdala and go with the neo-cortex alone.

Translated to the uncanny valley: when you look at a close-to-realistic game character your neo-cortex is satisfied, but some agent in your brain calls "fake, fake, fake, something is wrong", perhaps by causing a vague, uncomfortable feeling. And this can ruin the intended immersive effect.