Patent References
Apparatus and method for lip-synching animation
Authoring and use systems for sound synchronized animation
Video telephone system
Multimedia interface and method for computer system
Video and audio multiplex transmission system
Word dependent N-best search method
Speech animation and inflection system
Animated electronic meeting place
Speech dialogue system for facilitating improved human-computer interaction
Interactive man-machine interface for simulating human emotions

Application: No. 638061 filed on 04/25/1996
US Classes: 704/270.1 (Speech assisted network), 704/258 (Synthesis)
Examiners: Primary: Hudspeth, David; Assistant: Opsasnick, Michael N.
International Class: G10L 005/02

Abstract
A speech signal distribution system includes a transmitting subsystem and one or more receiving subsystems. The transmitting subsystem has a text to speech converter for converting text into a data stream of formant parameters. A supplemental parameter generator inserts into the data stream supplemental data, including linguistic boundary data indicating which parameters in the stream of formant parameters are associated with predefined linguistic boundaries in the text. In one preferred embodiment, the boundary data indicates which formant parameters in the data stream are associated with sentence boundaries. In addition, the supplemental parameter generator optionally inserts the text, lip position data corresponding to phonemes in the text, and voice setting data into the data stream. The resulting data stream is compressed and transmitted to the receiving subsystems. The receiving subsystem receives the transmitted compressed data stream, decompresses the data stream to regenerate the full data stream, and splits off the supplemental data. The formant data is buffered until boundary data is received indicating that a full sentence, or other linguistic unit, has been received.
Then the formant data is processed by an audio signal generator that converts the formant parameters into an audio speech signal in accordance with a vocal tract model. Voice settings in the supplemental data are passed to the audio signal generator, which modifies audio signal generation accordingly. Lip position data in the supplemental data may be processed by an animation program to generate animated pictures of a person speaking.
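
The data flow described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the record layout (JSON objects with a hypothetical "kind" field), the use of zlib for compression, and the names build_stream, Receiver, and synthesize. The synthesize callback is a stand-in for the audio signal generator; no vocal tract model is implemented.

```python
import json
import zlib

# Hypothetical record kinds; the abstract does not define a wire format.
FORMANT = "formant"    # one frame of formant parameters
BOUNDARY = "boundary"  # marks the end of a sentence (or other linguistic unit)
VOICE = "voice"        # voice settings for the audio signal generator
LIPS = "lips"          # lip position data for an animation program


def build_stream(sentences, voice=None):
    """Transmitter side: interleave formant frames with supplemental records,
    then compress the resulting stream (zlib is an assumed codec)."""
    records = []
    if voice is not None:
        records.append({"kind": VOICE, "data": voice})
    for frames in sentences:
        for frame in frames:
            records.append({"kind": FORMANT, "data": frame})
        # Boundary record: tells the receiver a full sentence has been sent.
        records.append({"kind": BOUNDARY, "unit": "sentence"})
    return zlib.compress(json.dumps(records).encode("utf-8"))


class Receiver:
    """Receiver side: decompress, split off supplemental data, and buffer
    formant frames until a boundary record arrives."""

    def __init__(self, synthesize):
        self.synthesize = synthesize  # placeholder for the audio generator
        self.pending = []             # formant frames of the current sentence
        self.voice = {}               # most recent voice settings
        self.lips = []                # lip data, kept for an animation program

    def feed(self, compressed):
        records = json.loads(zlib.decompress(compressed).decode("utf-8"))
        outputs = []
        for rec in records:
            kind = rec["kind"]
            if kind == FORMANT:
                self.pending.append(rec["data"])
            elif kind == VOICE:
                self.voice = rec["data"]
            elif kind == LIPS:
                self.lips.append(rec["data"])
            elif kind == BOUNDARY:
                # A full unit has arrived: hand the buffered frames and the
                # current voice settings to the audio generator.
                outputs.append(self.synthesize(self.pending, self.voice))
                self.pending = []
        return outputs
```

In this sketch, a caller would pass build_stream a list of sentences, each a list of formant frames, and hand the compressed bytes to Receiver.feed. Frames that arrive without a following boundary record stay in pending until a later call to feed delivers the boundary, which mirrors the buffering behavior the abstract describes.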