Whether to stream the response to the client. false (default): if the value is omitted or set to false, a non-streaming response is returned. In a non-streaming response, the full reply is returned in one piece once it is complete, and the client does not need to concatenate any content. true: partial message deltas are sent as they are generated. A streaming response delivers the model's output to the client in real time, and the client must assemble the final reply according to the type of each message.
streaming: bool = False
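
The difference between the two modes on the client side can be sketched as follows. This is a minimal illustration, not the API's actual client: the helper name `assemble_reply`, the event shape `{"type": ..., "content": ...}`, and the message type names `"delta"` and `"done"` are all assumptions made for the example, since the real message types are not listed here.

```python
from typing import Iterable, Union


def assemble_reply(response: Union[str, Iterable[dict]], streaming: bool = False) -> str:
    """Return the final reply text for either response mode.

    Hypothetical event shape (an assumption, not the documented format):
    {"type": "delta", "content": "..."} for partial text, {"type": "done"} at the end.
    """
    if not streaming:
        # Non-streaming: the full reply arrives at once, nothing to concatenate.
        return str(response)

    parts = []
    for event in response:
        kind = event.get("type")
        if kind == "delta":
            # Partial message delta: collect the text fragment in arrival order.
            parts.append(event.get("content", ""))
        elif kind == "done":
            # Hypothetical end-of-stream marker: stop reading.
            break
        # Any other message type is ignored in this sketch.
    return "".join(parts)
```

With `streaming=True`, the caller passes an iterable of events and receives the concatenated text; with the default `streaming=False`, the reply is returned unchanged. A real client would likely handle additional message types (for example errors or metadata) instead of ignoring them.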