Maximum number of retries after the initial generation attempt.
Retries use exponential backoff and trigger on transient errors:
RateLimitError, APIConnectionError (including its APITimeoutError
subclass), 5xx responses (including those that surface as
httpx.HTTPStatusError rather than typed SDK errors), and underlying
transport errors (httpx.TimeoutException, httpx.TransportError).
A value of None or 0 disables retries, so only the initial attempt is made.
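
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: TransientError is a hypothetical stand-in for the error types listed above, and call_with_retries, base_delay, max_delay, and the injectable sleep parameter are assumed names chosen for the example. The real backoff constants and any jitter are not stated in this description.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for the transient errors listed above."""


def call_with_retries(fn, max_retries, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call fn(), retrying on TransientError with exponential backoff.

    base_delay and max_delay are assumed values for illustration only.
    """
    # None and 0 both mean "no retries": only the initial attempt runs.
    retries = max_retries or 0
    for attempt in range(retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == retries:
                # Retry budget exhausted; surface the last error to the caller.
                raise
            # Exponential backoff: the delay doubles each retry, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

With max_retries=3, a call that keeps failing is attempted four times in total (one initial attempt plus three retries), waiting 0.5, 1.0, and 2.0 seconds between attempts under the assumed defaults. Non-transient exceptions propagate immediately without a retry.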