No disrespect to Kinect, but I have always thought that a system meant to capture the full diversity of human movement in 3D space would probably need X, Y, and Z sensors in optimised locations: one to the left and one to the right, one in front and one behind, and possibly one above. That way you could build up a true 3D representation. How can a sensor located only in front of you accurately detect movement as seen from the side? A lot of extrapolation would be required, and in some situations a 50/50 assumption would have to be made.
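To make that concrete, here is a rough Python sketch of the kind of fusion I mean. The coordinates and function names are entirely mine, nothing to do with any real Kinect API: the point is just that the front sensor's weakest axis (depth) is exactly the side sensor's strongest.

```python
# Toy sketch, assuming two depth sensors in made-up coordinates.
# The front sensor sees x and y in its image plane but must infer depth z;
# a sensor on the subject's right sees z and y in-plane but must infer x.
# Combining them removes the guesswork on each sensor's weak axis.

def fuse(front, side):
    """front, side: (x, y, z) estimates of one joint from each sensor."""
    fx, fy, fz = front   # x, y measured directly; z is the weak depth axis
    sx, sy, sz = side    # z, y measured directly; x is the weak depth axis
    # Take each coordinate from the sensor that observes it in-plane,
    # averaging y, which both observe directly.
    return (fx, (fy + sy) / 2, sz)

# With only the front reading, a tracker would have to trust fz -- the
# extrapolated depth -- which is exactly where the 50/50 guesses creep in.
print(fuse((0.3, 1.2, 0.9), (0.28, 1.18, 0.75)))
```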
For the sitting dilemma, they could perhaps prime the system with a body profile that says: ignore the lower half of the avatar below a certain point and track only what is above it. Then again, for a driving game the feet need tracking too.
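Something like this, very roughly; the joint names are invented for illustration and this is not any real SDK:

```python
# Rough sketch of the "primer" idea: tell the tracker up front which joints
# matter, with an override for games that still need the feet.

UPPER_BODY = {"head", "shoulders", "elbows", "hands", "spine", "hips"}
LEGS = {"knees", "feet"}

def joints_to_track(seated, needs_feet=False):
    """Return the set of joints the tracker should bother with."""
    if not seated:
        return UPPER_BODY | LEGS          # standing: track everything
    tracked = set(UPPER_BODY)             # seated: ignore the lower half
    if needs_feet:                        # e.g. a driving game's pedals
        tracked |= {"feet"}
    return tracked

print(joints_to_track(seated=True, needs_feet=True))
```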
The other way of doing this is to put a suit with broadcasting sensors on the person, but that defeats the point of not having to attach anything (a controller or otherwise) to the subject.