Professional Full Body Motion Capture

Professional motion capture systems such as Vicon’s offer precision “down to 0.5 mm of translation and 0.5 degrees of rotation,” but achieving this level of accuracy requires a large capture space and multiple cameras that start at $3,000 apiece. While this may be some of the best tracking available, it has limitations. The cameras depend on line of sight, and many smaller institutions can neither accommodate nor afford such a system. Moreover, for tongue motion capture this approach is not an option at all, because the tongue is hidden inside the mouth.

Figure 4. A Vicon Motion Capture System

Audio Pronunciation without Visual Motion Feedback

Traditional speech therapy has never had a good way to show a person what they are doing incorrectly when they mispronounce something. For example, if someone is asked to say “Ta” and says “Da” instead, we can infer that the problem is most likely tongue placement, but we cannot actually know what is going on inside the mouth without an MRI, ultrasound, or some other potentially expensive equipment. One of the easiest and best alternatives is to give feedback on the sound produced, and Rosetta Stone does this remarkably well. Rosetta Stone bases its feedback solely on the audio qualities of what is spoken, using pitch, tone, and other metrics to produce a score. The program chooses the words to be practiced and does not let the user request specific words; it simply moves on once the word has been said correctly.

Figure 5. Rosetta Stone
