- Interactions: hand pointer movements, plus press and grip events, useful for controlling a cursor, buttons and other UI
- User Viewer: visual representation of the users currently visible to the Kinect sensor. Uses different colors to indicate different user states
- Background Removal: “Green screen” image stream for a single person at a time
- Skeleton: standard skeleton data such as tracking state, joint positions, joint orientations, etc.
- Sensor Status: events corresponding to sensor connection/disconnection
This is enough functionality to write a compelling application, but it doesn’t represent the whole range of Kinect sensor capabilities. In this article I will show you, step by step, how to extend the WebserverBasics-WPF sample (see C# code in CodePlex or documentation in MSDN), available from the Kinect Toolkit Browser, so that web applications can respond to speech commands, with the active speech grammar configurable by the web client.
A solution containing the full, final sample code is available on CodePlex. To compile this sample, you will also need the Microsoft.Samples.Kinect.Webserver component (available via CodePlex and the Toolkit Browser) and the Microsoft.Kinect.Toolkit components (available via the Toolkit Browser).
So, What Functionality Are We Implementing?
More specifically, on the server side we will:
- Create a speech recognition engine
- Bind the engine to the Kinect sensor’s audio stream, updating the binding whenever a sensor gets connected or disconnected
- Allow a web client to specify the speech grammar to be recognized
- Forward speech recognition events generated by the engine to the web client
- Register a factory for the speech stream handler with the Kinect webserver
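To make the division of responsibilities concrete, here is a minimal, hardware-free sketch of how the five steps above might fit together. It is not the sample's actual code. Every type and member name in it (ISpeechEngine, FakeSpeechEngine, SpeechStreamHandler, OnSensorChanged, OnClientGrammar, and the factory list) is a hypothetical stand-in. The real speech recognizer, Kinect audio stream and webserver registration are replaced by plain .NET types so that the flow can be followed without a sensor attached.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a speech recognition engine.
public interface ISpeechEngine
{
    event Action<string> PhraseRecognized;
    void SetInput(string audioSourceId);            // null detaches the engine from any audio source
    void LoadGrammar(IEnumerable<string> phrases);  // replaces the active grammar
}

// Fake engine for illustration: it "recognizes" a phrase only if it is bound
// to an audio source and the phrase is part of the active grammar.
public sealed class FakeSpeechEngine : ISpeechEngine
{
    private readonly HashSet<string> grammar = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private string input;

    public event Action<string> PhraseRecognized;

    public void SetInput(string audioSourceId) { this.input = audioSourceId; }

    public void LoadGrammar(IEnumerable<string> phrases)
    {
        this.grammar.Clear();
        this.grammar.UnionWith(phrases);
    }

    // Simulates a phrase arriving on the audio stream.
    public void Hear(string phrase)
    {
        var handler = this.PhraseRecognized;
        if (this.input != null && this.grammar.Contains(phrase) && handler != null)
        {
            handler(phrase);
        }
    }
}

// Hypothetical handler that ties the engine to sensor changes and to one web client.
public sealed class SpeechStreamHandler
{
    private readonly ISpeechEngine engine;

    public SpeechStreamHandler(ISpeechEngine engine, Action<string> sendToClient)
    {
        this.engine = engine;                          // step 1: the engine is created and handed in
        this.engine.PhraseRecognized += sendToClient;  // step 4: forward recognition events to the client
    }

    // Step 2: rebind (or unbind, when newSensorId is null) on sensor connection changes.
    public void OnSensorChanged(string newSensorId) { this.engine.SetInput(newSensorId); }

    // Step 3: the web client chooses which phrases are recognized.
    public void OnClientGrammar(IEnumerable<string> phrases) { this.engine.LoadGrammar(phrases); }
}

public static class Program
{
    public static void Main()
    {
        // Step 5: a plain list stands in for the webserver's handler factory registry.
        var factories = new List<Func<Action<string>, SpeechStreamHandler>>();
        var engine = new FakeSpeechEngine();
        factories.Add(send => new SpeechStreamHandler(engine, send));

        var handler = factories[0](phrase => Console.WriteLine("to client: " + phrase));
        handler.OnClientGrammar(new[] { "play", "stop" });

        engine.Hear("play");                  // ignored: no sensor is bound yet
        handler.OnSensorChanged("sensor-1");
        engine.Hear("play");                  // forwarded to the client
        engine.Hear("rewind");                // ignored: not in the client's grammar
        handler.OnSensorChanged(null);
        engine.Hear("stop");                  // ignored: sensor disconnected
    }
}
```

In this sketch only the second call to Hear("play") reaches the client, because a phrase is forwarded only while a sensor is bound and the phrase is part of the grammar the client supplied. Keeping the engine behind an interface is a choice made for this illustration; it lets the sensor binding and grammar handling be exercised without hardware.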
While I'll also be highlighting this on the Channel 9 Coding4Fun Kinect Gallery next week, I thought it extra cool...