Add stt-vad-start and stt-vad-end pipeline events

2025-07-20 15:56:30 +00:00 · 2023-08-17 16:35:15 -05:00 · 6941f6e1d7
commit 6941f6e1d7
parent 68b6b21b2b
1 changed file with 13 additions and 11 deletions
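
Note: both new events carry a `timestamp` attribute, given in milliseconds after the start of the audio stream, so a client can derive the length of the detected voice command by subtracting one from the other. The following is a minimal sketch of that idea, not the project's implementation. It assumes each event has already been decoded into a dict with a `type` key and a `data` dict holding the attributes; that envelope shape and the helper name `voice_command_duration_ms` are hypothetical and are not shown in this diff.

```python
from typing import Iterable, Optional


def voice_command_duration_ms(events: Iterable[dict]) -> Optional[int]:
    """Return the time between stt-vad-start and stt-vad-end, in milliseconds.

    Assumption: each event looks like {"type": <name>, "data": {...}}.
    Returns None if either event is missing or has no timestamp.
    """
    start = None
    for event in events:
        kind = event.get("type")
        timestamp = event.get("data", {}).get("timestamp")
        if kind == "stt-vad-start":
            # Remember when the voice command began.
            start = timestamp
        elif kind == "stt-vad-end" and start is not None and timestamp is not None:
            # Both timestamps are relative to the start of the audio stream.
            return timestamp - start
    return None


# Hypothetical event sequence for illustration only.
sample_events = [
    {"type": "stt-start", "data": {"engine": "example-engine"}},
    {"type": "stt-vad-start", "data": {"timestamp": 320}},
    {"type": "stt-vad-end", "data": {"timestamp": 1820}},
    {"type": "stt-end", "data": {"stt_output": {"text": "turn on the light"}}},
]

print(voice_command_duration_ms(sample_events))
```

Because the table marks both events as "audio only", a text-only pipeline run would never emit them, and the helper returns `None` in that case.
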
--- a/docs/voice/pipelines/index.md
+++ b/docs/voice/pipelines/index.md
@@ -37,17 +37,19 @@ The following input fields are available:
 As the pipeline runs, it emits events back over the WebSocket connection.
 The following events can be emitted:

-| Name            | Description                 | Emitted    | Attributes                                                                                              |
-|-----------------|-----------------------------|------------|---------------------------------------------------------------------------------------------------------|
-| `run-start`     | Start of pipeline run       | always     | `pipeline` - ID of the pipeline<br />`language` - Language used for pipeline<br />`runner_data` - Extra WebSocket data: <ul><li>`stt_binary_handler_id` is the prefix to send speech data over.</li><li>`timeout` is the max run time for the whole pipeline.</li></ul>                         |
-| `run-end`       | End of pipeline run         | always     |                                                                                                         |
-| `stt-start`     | Start of speech to text     | audio only | `engine`: STT engine used<br />`metadata`: incoming audio metadata
-| `stt-end`       | End of speech to text       | audio only | `stt_output` - Object with `text`, the detected text.
-| `intent-start`  | Start of intent recognition | always     | `engine` - [Agent](/docs/intent_conversation_api) engine used<br />`language`: Processing language. <br /> `intent_input` - Input text to agent |
-| `intent-end`    | End of intent recognition   | always     | `intent_output` - [conversation response](/docs/intent_conversation_api#conversation-response)          |
-| `tts-start`     | Start of text to speech     | audio only | `engine` - TTS engine used<br />`language`: Output language.<br />`voice`: Output voice. <br />`tts_input`: Text to speak. |
-| `tts-end`       | End of text to speech       | audio only | `media_id` - Media Source ID of the generated audio<br />`url` - URL to the generated audio<br />`mime_type` - MIME type of the generated audio<br /> |
-| `error`         | Error in pipeline           | On error     | `code` - Error code<br />`message` - Error message |
+| Name            | Description                 | Emitted    | Attributes                                                                                                                                                                                                                                                              |
+|-----------------|-----------------------------|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `run-start`     | Start of pipeline run       | always     | `pipeline` - ID of the pipeline<br />`language` - Language used for pipeline<br />`runner_data` - Extra WebSocket data: <ul><li>`stt_binary_handler_id` is the prefix to send speech data over.</li><li>`timeout` is the max run time for the whole pipeline.</li></ul> |
+| `run-end`       | End of pipeline run         | always     |                                                                                                                                                                                                                                                                         |
+| `stt-start`     | Start of speech to text     | audio only | `engine`: STT engine used<br />`metadata`: incoming audio metadata                                                                                                                                                                                                      |
+| `stt-vad-start` | Start of voice command      | audio only | `timestamp`: milliseconds after the start of the audio stream                                                                                                                                                                                                           |
+| `stt-vad-end`   | End of voice command        | audio only | `timestamp`: milliseconds after the start of the audio stream                                                                                                                                                                                                           |
+| `stt-end`       | End of speech to text       | audio only | `stt_output` - Object with `text`, the detected text.                                                                                                                                                                                                                   |
+| `intent-start`  | Start of intent recognition | always     | `engine` - [Agent](/docs/intent_conversation_api) engine used<br />`language`: Processing language. <br /> `intent_input` - Input text to agent                                                                                                                         |
+| `intent-end`    | End of intent recognition   | always     | `intent_output` - [conversation response](/docs/intent_conversation_api#conversation-response)                                                                                                                                                                          |
+| `tts-start`     | Start of text to speech     | audio only | `engine` - TTS engine used<br />`language`: Output language.<br />`voice`: Output voice. <br />`tts_input`: Text to speak.                                                                                                                                              |
+| `tts-end`       | End of text to speech       | audio only | `media_id` - Media Source ID of the generated audio<br />`url` - URL to the generated audio<br />`mime_type` - MIME type of the generated audio<br />                                                                                                                   |
+| `error`         | Error in pipeline           | On error   | `code` - Error code<br />`message` - Error message                                                                                                                                                                                                                      |

 ## Sending speech data