Manages a single bidirectional streaming session with Nova Sonic.

This class handles the event protocol for sending and receiving audio/text
over the InvokeModelWithBidirectionalStream API. Sessions are created
via :meth:`ChatBedrockNovaSonic.create_session` and should be used as an
async context manager.

Example::

    async with model.create_session(system_prompt="Be helpful.") as session:
        await session.send_audio_chunk(audio_bytes)
        async for event in session.receive_events():
            handle(event)
NovaSonicSession(
self,
client: Any,
model_id: str,
*,
system_prompt: Optional[str] = None,
voice_id: str = 'matthew',
max_tokens: int = 1024,
temperature: float = 0.7,
top_p: float = 0.9,
input_sample_rate: int = DEFAULT_INPUT_SAMPLE_RATE,
output_sample_rate: int = DEFAULT_OUTPUT_SAMPLE_RATE,
audio_media_type: str = DEFAULT_AUDIO_MEDIA_TYPE,
endpointing_sensitivity: Optional[str] = None
)

| Name | Type | Default | Description |
|---|---|---|---|
| client* | Any | required | The |
| model_id* | str | required | The Nova Sonic model identifier. |
| system_prompt | Optional[str] | None | Optional system prompt for the conversation. |
| voice_id | str | 'matthew' | Voice identifier for audio output. |
| max_tokens | int | 1024 | Maximum tokens for inference. |
| temperature | float | 0.7 | Sampling temperature. |
| top_p | float | 0.9 | Top-p sampling parameter. |
| input_sample_rate | int | DEFAULT_INPUT_SAMPLE_RATE | Sample rate for input audio in Hz. |
| output_sample_rate | int | DEFAULT_OUTPUT_SAMPLE_RATE | Sample rate for output audio in Hz. |
| audio_media_type | str | DEFAULT_AUDIO_MEDIA_TYPE | Media type for audio data. |
| endpointing_sensitivity | Optional[str] | None | Turn-detection sensitivity (HIGH/MEDIUM/LOW). Nova 2 Sonic only. |
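For raw 16-bit mono PCM audio (an assumption for illustration; the actual format is determined by audio_media_type), the byte size of a fixed-duration chunk follows directly from the sample rate. A minimal sketch, using 16 kHz as an illustrative rate rather than the actual value of DEFAULT_INPUT_SAMPLE_RATE:

```python
def chunk_bytes(sample_rate_hz: int, chunk_ms: int, bytes_per_sample: int = 2) -> int:
    # samples per chunk times bytes per sample, assuming mono audio
    return sample_rate_hz * chunk_ms // 1000 * bytes_per_sample

# e.g. 20 ms of 16 kHz 16-bit mono PCM
print(chunk_bytes(16000, 20))  # 640 bytes
```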
Start the bidirectional streaming session.
Sends the session start, prompt start, and system prompt events in the required order.
Send a text message as user input.
Start an audio input stream.
Call this before sending audio chunks. When done, call
:meth:`end_audio_input`.
Send a chunk of audio data to the stream.
End the current audio input stream.
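The three audio methods above form a fixed sequence: start the input stream, push chunks, then end it. A sketch of that call order, using a hypothetical stub in place of a real NovaSonicSession (the real class requires a Bedrock client and an open stream):

```python
import asyncio

class StubSession:
    """Hypothetical stand-in for NovaSonicSession that records the call order."""

    def __init__(self):
        self.calls = []

    async def start_audio_input(self):
        self.calls.append("start_audio_input")

    async def send_audio_chunk(self, chunk: bytes):
        self.calls.append(f"chunk:{len(chunk)}")

    async def end_audio_input(self):
        self.calls.append("end_audio_input")

async def stream_audio(session, chunks):
    # start_audio_input must precede any send_audio_chunk call;
    # end_audio_input closes the audio input stream when done.
    await session.start_audio_input()
    for chunk in chunks:
        await session.send_audio_chunk(chunk)
    await session.end_audio_input()

session = StubSession()
asyncio.run(stream_audio(session, [b"\x00" * 320, b"\x00" * 320]))
print(session.calls)
```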
Receive and yield parsed events from the model.
Yields dictionaries with the following possible structures:
{"type": "text", "role": str, "text": str} for text output{"type": "audio", "audio": bytes} for audio output{"type": "content_start", "role": str, ...} for content start{"type": "content_end"} for content endSignal end of input to the model.
Sends promptEnd and sessionEnd events and closes the
input stream. The output stream remains readable so that
:meth:`receive_events` can continue to yield responses.
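Because receive_events yields plain dictionaries keyed by "type", a consumer is typically a dispatch loop. A sketch using a fake async generator in place of a live session (the event shapes match those documented above; the generator and its contents are assumptions for illustration):

```python
import asyncio

async def fake_receive_events():
    # Stand-in for session.receive_events(), yielding the documented shapes.
    yield {"type": "content_start", "role": "assistant"}
    yield {"type": "text", "role": "assistant", "text": "Hello!"}
    yield {"type": "audio", "audio": b"\x00\x01"}
    yield {"type": "content_end"}

async def consume(events):
    transcript, audio = [], bytearray()
    async for event in events:
        if event["type"] == "text":
            transcript.append(event["text"])
        elif event["type"] == "audio":
            audio.extend(event["audio"])  # raw audio bytes at output_sample_rate
        # content_start / content_end mark turn boundaries
    return "".join(transcript), bytes(audio)

text, audio = asyncio.run(consume(fake_receive_events()))
```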
End the session and close the stream.
Sends prompt end and session end events (if not already sent), then marks the session as inactive.