r505Matt said:
NJ5 said:
r505Matt said:
I love how some people here are talking about speech recognition, speech synthesis and facial recognition software as if they were experts on the software involved.

Tell me what speech synthesis software even comes near to the quality of Milo's speech...

I've seen many and they're nowhere near as good. Yes, they are usable for blind people to read web pages, but they don't sound like a real person reading, much less a person talking with emotion (which is what Milo sounds like).

 

I don't know anything about speech synthesis, so I didn't make any comments on it. I was more interested in talking about speech recognition and gesture/emotion recognition, both of which I believe are possible for Milo. My argument does work against me, though: while I might know a bit about speech recognition, I am by no means an expert.

But since we're on speech synthesis, I don't think that's what they're trying to do with Milo (yet). I do think that what Milo is saying is scripted, but I don't think the experience is scripted. That fits with the illusion Molyneux talks about, and that's okay so long as the experience itself is immersive as a result.

This is obvious, but we won't really know until it's released, or shortly beforehand.

I’m currently working at a company that deals heavily with voice recognition, and I have worked directly with it in the past, although I am far from an expert and my focus is in a completely different segment of the company (because they’re moving away from voice-based systems). Our sales team will brag to new clients about a 75% recognition rate for a voice-based system with a very simple call-flow, because so few systems can achieve that success level; and this requires more processing power than you could dedicate to the task on the Xbox 360 and still have a game running at the same time. If you’re wondering, human levels of recognition are in the 97.5% range.

Bleeding-edge voice recognition is so primitive it makes the worst Wiimote waggle look amazingly precise.