CodeBaby and Red Shift to provide speech interaction with online characters.
Red Shift Company has internally developed speech recognition technology that has been used to support interaction with digital characters, what the company calls “virtual personalities” (SSN, September 2009, p. 29). The company has previously said that its speech recognition is based on biologically salient components of speech, extracting features closer to the way the ear works, and claims this differs from the processing used by other speech recognition software (e.g., mel cepstrum coefficients).
Red Shift’s RASR is a speech recognition software product that operates through a web browser. Red Shift announced in April that it has entered into a strategic partnership with CodeBaby, a provider of online customer experience solutions, to allow speech recognition in communicating with
CodeBaby’s 3D interactive digital characters, web-based “personal assistants” that today operate within web browsers on PCs, with potential future plans to move to mobile devices. The characters (CIVAs, CodeBaby Interactive Virtual Assistants) are designed to engage online customers and eLearners, and the company indicates that its solution is being used by Fortune 1000 companies as well as midsized clients. (“3D” refers to rendering with a 3D look, not the type of 3D that requires specialized glasses.)
CodeBaby analytics allow clients to track and tune conversations to achieve results that include lead capture, cross-sell/upsell, customer service, online learning, and other web tasks, according to the company. The joint solution will allow users to engage in two-way voice conversations with the avatars. Previously, interaction with the CodeBaby digital characters was through text or pointing and clicking. The speech recognition is performed in the network.
Wally Griffin, CEO, and Joel Nyquist, product manager, Red Shift, along with Tony Delollis, CTO, CodeBaby, explained the joint effort in an interview with Bill Meisel. Delollis indicated that there are at least two specific requests for proposals from large companies that they are currently addressing. A typical application addressed by the joint solution might be a kiosk at a travel site such as a train station where customers could check schedules and buy tickets.
The interaction with the avatars in a typical kiosk application tends to be highly structured, with a series of questions (a “tree”), allowing the speech recognition to be focused at each node of the tree and thus more accurate. Because there is a screen, the structuring is less annoying than in an IVR system that prompts vocally for all the options, Delollis noted, since all choices are presented immediately on the screen and the customer can speak or click a choice immediately. The engaging “personality” of the 3D characters also encourages the customer to continue with the automation. The voice responses of the digital characters are recorded by actors, maintaining the personality of the avatars. The speech recognition, in addition to allowing a more conversational interaction, also satisfies ADA (accessibility) requirements. Griffin noted that CodeBaby’s trees tend to be relatively “flat” and are designed not to frustrate.
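The node-focused recognition described above can be sketched as a simple dialog tree in which each node exposes only its own small set of choices, so the recognizer need only distinguish among those phrases. This is a hypothetical illustration of the general technique, not Red Shift’s or CodeBaby’s actual software; all names are invented for the example.

```python
# Illustrative sketch of a tree-structured kiosk dialog: each node
# restricts the active vocabulary, so recognition at that node only
# has to distinguish among a handful of phrases (all names hypothetical).

class DialogNode:
    def __init__(self, prompt, choices=None):
        self.prompt = prompt              # text shown/spoken by the avatar
        self.choices = choices or {}      # phrase -> next DialogNode

    def active_vocabulary(self):
        """The only phrases the recognizer must consider at this node."""
        return list(self.choices)

    def advance(self, recognized_phrase):
        """Follow the branch matching the recognized (or clicked) choice."""
        return self.choices.get(recognized_phrase)

# A flat, two-level tree for the train-station kiosk example.
buy = DialogNode("Which destination?")
schedule = DialogNode("Which route's schedule?")
root = DialogNode("Welcome! What would you like to do?",
                  {"buy a ticket": buy, "check schedules": schedule})

# At the root, the recognizer is constrained to just two phrases:
print(root.active_vocabulary())   # ['buy a ticket', 'check schedules']
print(root.advance("buy a ticket").prompt)   # Which destination?
```

Because every choice is also displayed on screen, the same tree can drive both spoken and clicked input, which is why a shallow, flat structure feels less constraining than a voice-only IVR menu.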
The companies plan to encourage customer interaction with speech-enabled virtual assistants on general PCs as well as kiosks, motivating PC users to use a microphone through the potential entertainment value and ease of use of the avatars. Still later, CodeBaby and Red Shift plan on moving to mobile devices, although today’s mobile browsers inhibit the use of speech interaction, they indicated. A specialized app may be required in the short term for