I agree that there is a lot of bashing and misunderstanding regarding these motion capture technologies, so I'd like to add some details to this thread to clarify the hardware and software behind them. I don't work for any of the three companies, so I don't have any inside information, but from my experience with these kinds of technologies, I think what I describe below is likely close to what is actually going on.
First off, the WM+ uses a 3 degree-of-freedom (DOF) accelerometer in the wiimote and a 3-DOF gyro in the add-on module to estimate the controller's position and pose. They estimate the wiimote's position and orientation by integrating the accelerometer reading once for velocity and then once more for position, and integrating the gyro reading once for orientation. Because of the inherent bias in the accelerometer and gyro, they correct the position and orientation using the infrared camera's observation of the sensor bar. What is likely to happen when the wiimote is pointed away from the sensor bar is that the estimate will start to degrade and drift until the sensor bar is observed again, at which point the wiimote can correct its estimation parameters.
Pros: Can be implemented using an Extended Kalman Filter or a Particle Filter (can be computed very quickly). 1:1 wiimote movement in game.
Cons: The IR camera’s field of view is fairly limited, so the estimation parameters can slowly drift if the IR camera is pointed away from the sensor bar for a long time. Only the wiimote’s position and orientation are estimated.
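To make the dead-reckoning-plus-correction idea concrete, here is a toy 1-D Python sketch (my own illustration, not Nintendo's code): a biased accelerometer is integrated twice, and an occasional "IR fix" is blended in to pull the position estimate back toward the truth. A real tracker would run a full 6-DOF EKF over position and orientation; the bias, gain, and timing values below are made up.

```python
# Toy 1-D dead-reckoning loop with intermittent drift correction.
# All numbers are illustrative; a real tracker would be a 6-DOF EKF.
import numpy as np

def track(accel, bias, ir_positions, dt=0.01, gain=0.5):
    """Integrate biased accelerometer readings; when an IR fix is
    available (not None), blend it in to pull the estimate back."""
    pos_est, vel_est = 0.0, 0.0
    history = []
    for a, ir in zip(accel, ir_positions):
        a_meas = a + bias              # sensor reading includes bias
        vel_est += a_meas * dt         # first integration: velocity
        pos_est += vel_est * dt        # second integration: position
        if ir is not None:             # sensor-bar fix: correct drift
            pos_est += gain * (ir - pos_est)
        history.append(pos_est)
    return history

n = 1000
accel = np.zeros(n)                    # controller actually held still
true_pos = 0.0
# IR fix only every 50 steps (camera sees the sensor bar intermittently)
ir = [true_pos if i % 50 == 0 else None for i in range(n)]
with_fix = track(accel, bias=0.2, ir_positions=ir)
no_fix = track(accel, bias=0.2, ir_positions=[None] * n)
print(f"drift without fixes: {abs(no_fix[-1]):.2f} m")
print(f"drift with fixes:    {abs(with_fix[-1]):.2f} m")
```

Even a small uncorrected bias blows up quadratically in position (here roughly 10 m of drift over 10 seconds), which is why the periodic camera observation is essential.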
Sony's solution is fairly similar to the WM+ in that they use a 3-DOF accelerometer and a 3-DOF gyro in the wand. The difference is that there is no IR camera on the wand; instead, a single color camera (sitting on top of the TV) looks at the glowing orb on the wand. The way they calculate the position and orientation is pretty much the same, except the camera's observation of the orb is used for the correction step.
Pros: Can be implemented using an Extended Kalman Filter or a Particle Filter (can be computed very quickly). Also, the color camera's field of view is larger than that of the wiimote's IR camera. 1:1 wand movement in game. Augmented reality (the sword, stop sign demo, etc.).
Cons: The wand's position and orientation can drift when the orb is occluded. Only the wand's position and orientation are estimated.
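To show why the orb makes such a strong correction observation, here is a hedged pinhole-camera sketch: because the orb's physical size is known, its apparent pixel radius gives depth directly via similar triangles. The focal length, orb radius, and principal point below are assumed values for illustration, not Sony's actual calibration.

```python
# Hedged sketch: recovering a 3-D orb position from one camera frame
# with a pinhole model. All calibration values are made up.
ORB_RADIUS_M = 0.02      # assumed physical orb radius (2 cm)
FOCAL_PX = 600.0         # assumed focal length in pixels
CX, CY = 320.0, 240.0    # assumed principal point (640x480 image)

def orb_position(u, v, r_px):
    """(u, v) = orb centre in pixels, r_px = orb radius in pixels."""
    z = FOCAL_PX * ORB_RADIUS_M / r_px        # similar triangles: depth
    x = (u - CX) * z / FOCAL_PX               # back-project centre
    y = (v - CY) * z / FOCAL_PX
    return x, y, z

# Orb seen dead centre with a 12-pixel radius -> straight ahead, 1 m out
print(orb_position(320.0, 240.0, 12.0))   # → (0.0, 0.0, 1.0)
```

The upshot: a single frame of the orb yields a full 3-D position fix (not just a bearing), so the filter can correct accelerometer drift whenever the orb is visible.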
Natal is a very different solution from the ones employed by both Nintendo and Sony. They run a camera fusion algorithm and use the result to perform segmentation (separating the humans from one another and from the background). Afterward, they fit a skeleton model to each segmented human volume (mesh) and track the body movement of each person in the cameras' field of view. In addition, they use a microphone array to process and segment human speech (I'm not very sure about this part since I don't have a lot of experience with speech recognition).
Pros: Full body movement and voice recognition. Traditional controller buttons can be remapped to specific actions from specific body parts (this is not easy and will require some experimentation from game developers) or to verbal commands. Augmented reality.
Cons: Computationally expensive (can be laggy). Full body movement and human activity recognition can be tricky. Occlusion (obviously, the cameras are not going to be able to pick you out if you hide behind a couch or a table). Body movement estimation is locally optimal.
Now, for the people who are saying that the Eyetoy has been doing the same things Natal is doing: that is just not true. The Eyetoy is a single RGB camera, so there is a limit to what can be done with its video stream, given that we don't have unlimited computational power to analyze and extract information from monocular video. Plus, lighting conditions pose a serious problem for RGB cameras (they work poorly in dark environments). I personally think each of these technologies has its own merits, and it will be fun to see how they perform when they are released.