Top-K limits how many candidate tokens the model considers at each generation step. A top-K of 1 always selects the single most probable token in the model's vocabulary (also called greedy decoding), while a top-K of 3 samples the next token from among the 3 most probable tokens, with temperature controlling how the choice among them is weighted.
topK: number
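
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the model's actual implementation: `sampleTopK`, its `logits` input, and the injectable `rng` parameter are hypothetical names introduced here, and it assumes temperature is applied as a softmax scale over the K surviving candidates.

```typescript
// Illustrative sketch only: sampleTopK is a hypothetical helper, not part of any SDK.
function sampleTopK(
  logits: number[],
  topK: number,
  temperature: number = 1.0,
  rng: () => number = Math.random, // injectable so the draw can be made deterministic
): number {
  if (logits.length === 0) throw new Error("logits must not be empty");
  if (!Number.isInteger(topK) || topK < 1) {
    throw new Error("topK must be a positive integer");
  }

  // Rank token ids by logit, highest first, and keep only the K best.
  const candidates = logits
    .map((logit, id) => ({ id, logit }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, Math.min(topK, logits.length));

  // topK = 1 (or a non-positive temperature) reduces to greedy decoding.
  if (candidates.length === 1 || temperature <= 0) return candidates[0].id;

  // Temperature-scaled softmax weights over the surviving candidates only.
  const maxLogit = candidates[0].logit;
  const weights = candidates.map((c) =>
    Math.exp((c.logit - maxLogit) / temperature),
  );
  const total = weights.reduce((sum, w) => sum + w, 0);

  // Draw one candidate in proportion to its weight.
  let threshold = rng() * total;
  for (let i = 0; i < candidates.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) return candidates[i].id;
  }
  return candidates[candidates.length - 1].id; // guard against rounding error
}

// Example: token 1 has the highest logit, followed by tokens 3 and 4.
const exampleLogits = [0.1, 2.0, -1.0, 1.5, 0.3];
const greedyToken = sampleTopK(exampleLogits, 1); // always token 1
const sampledToken = sampleTopK(exampleLogits, 3, 0.8); // one of tokens 1, 3, 4
```

In this sketch, tokens outside the top K can never be chosen regardless of temperature; temperature only reshapes the probabilities among the K candidates that remain.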