From c78d18eed115f53c3c2b7970c6a776450bdbe738 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Joris=20Pelgr=C3=B6m?=
Date: Thu, 13 Jul 2023 00:24:11 +0200
Subject: [PATCH] Document sending Assist pipeline speech data (#1803)

---
 docs/voice/pipelines/index.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/voice/pipelines/index.md b/docs/voice/pipelines/index.md
index 8ca2284f..cdd5fcb8 100644
--- a/docs/voice/pipelines/index.md
+++ b/docs/voice/pipelines/index.md
@@ -49,3 +49,14 @@ The following events can be emitted:
 | `tts-end` | End of text to speech | audio only | `media_id` - Media Source ID of the generated audio
`url` - URL to the generated audio
`mime_type` - MIME type of the generated audio
| | `error` | Error in pipeline | On error | `code` - Error code
`message` - Error message |
+## Sending speech data
+
+After starting a pipeline with `stt` as the first stage of the run and receiving an `stt-start` event, speech data can be sent over the WebSocket connection as binary data. Audio should be sent as soon as it is available, with each chunk prefixed by a single byte containing the `stt_binary_handler_id`.
+
+For example, if `stt_binary_handler_id` is `1` and the audio chunk is `a1b2c3`, the message would be (in hex):
+
+```
+01a1b2c3
+```
+
+To indicate the end of the speech data, send a binary message containing only a single byte with the `stt_binary_handler_id`.
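
The framing described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the patch: the function names `frame_audio_chunk`, `end_of_speech_message`, and `frame_stream` are hypothetical, and the sketch only builds the binary payloads. Actually sending them is left to whichever WebSocket client is in use, and it assumes the handler ID fits in one byte, as the text implies.

```python
from typing import Iterable, Iterator


def frame_audio_chunk(handler_id: int, chunk: bytes) -> bytes:
    """Prefix an audio chunk with the single-byte stt_binary_handler_id."""
    if not 0 <= handler_id <= 255:
        raise ValueError("handler_id must fit in a single byte")
    return bytes([handler_id]) + chunk


def end_of_speech_message(handler_id: int) -> bytes:
    """Build the end marker: a message containing only the handler ID byte."""
    return frame_audio_chunk(handler_id, b"")


def frame_stream(handler_id: int, chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield one binary message per audio chunk, then the end marker."""
    for chunk in chunks:
        # Skip empty chunks (an assumption of this sketch) so that they
        # are not mistaken for the end-of-speech marker.
        if chunk:
            yield frame_audio_chunk(handler_id, chunk)
    yield end_of_speech_message(handler_id)


if __name__ == "__main__":
    # Reproduces the example from the text: handler ID 1, chunk a1b2c3.
    print(frame_audio_chunk(1, bytes.fromhex("a1b2c3")).hex())  # 01a1b2c3
    print(end_of_speech_message(1).hex())  # 01
```

Each value yielded by `frame_stream` would be sent as one binary WebSocket message, in order, as soon as the corresponding audio is available.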