The biggest weakness of the image-processing camera approach is that it's inherently 2-D. Say you're swinging a golf club toward the screen a la Wii Sports golf, with the camera mounted near the screen. The camera can't really measure the arc of your swing the way the 3-D Wii remote can, because it only sees a flat projection of it, unless you face away from the screen or place the camera so that the plane of your swing is perpendicular to its line of sight.
Now, of course, you can make some reasonable assumptions (the user's arm is likely not growing or shrinking during the movement) -- but even then I don't think you can easily tell whether it's moving TOWARD or AWAY from the camera. And trying to get two cameras set up to solve that problem, then keep them in sync and have proper lighting from two different angles so both cameras can see what's going on... the mind begins to boggle from a consumer application standpoint. (I was never able to get the PS2 EyeToy lit properly in my home theatre without washing out the screen and/or being unable to see myself well enough to play! Either my head was lit or my hands, never both, because of where I could place the light sources.)
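To put the toward-or-away problem in concrete terms, here's a rough sketch in Python. It assumes the camera sees only the flat (x, y) offset of the hand from the shoulder, that the arm stays straight with a known fixed length, and it ignores perspective entirely. The function name and the numbers are made up for illustration, not taken from any real tracking system.

```python
import math

def depth_candidates(arm_length, x, y):
    """Return the two possible depth offsets of the hand from the shoulder.

    Hypothetical helper: assumes a straight arm of fixed length and a
    camera that reports only the flat (x, y) offset, with no perspective.
    """
    planar_sq = x * x + y * y
    if planar_sq > arm_length ** 2:
        raise ValueError("hand appears farther from the shoulder than the arm is long")
    # Pythagoras gives the size of the depth offset, but not its sign.
    z = math.sqrt(arm_length ** 2 - planar_sq)
    return (z, -z)

# Example: a 0.7 m arm whose hand appears 0.3 m right of and 0.2 m above the shoulder.
toward, away = depth_candidates(0.7, 0.3, 0.2)
print(toward, away)
```

Both answers fit what the camera sees equally well, so something else (a second view, or a sensor in your hand) has to break the tie.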
As primitive as the Wiimote will seem when the next generation of motion controls comes along, it's much, much better at telling what the heck you're doing than a single camera can be. Subtle motions, shifts in the player's position, variations in player height -- all of that is quite nicely handled or filtered out as needed by a handheld device capturing real-time 3-D positional, angle and speed data.
Anything like visual motion capture is much, much more demanding on the player as well as the hardware, and will generally be much more laggy and imprecise with today's technology -- image noise and uneven lighting in, bad control out.
Now, to go beyond the Wiimote to a more full-body experience, maybe some sort of "accelerometer suit" that could map out your body's movements at key joints would work -- in fact, I think there's a company marketing a cheaper motion-capture technology based on that very idea. If THAT could be brought down to a reliable and inexpensive home level it could be very fun to mess around with. But still not pick-up-and-play for the average human being -- who wants to put on a full-body device just to play a game?