Previously, we treated `input exceeded context window` as a streaming error and retried on it. Retrying is pointless here: the same input exceeds the context window on every attempt, so the outcome never changes. This PR surfaces the error to the client without retrying and also sends a token count event to indicate that the context window is full.

<img width="650" height="125" alt="image" src="https://github.com/user-attachments/assets/c26b1213-4c27-4bfc-90f4-51a270a3efd5" />
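
For reviewers, the intended control flow can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `StreamError`, `ContextWindowExceeded`, `TokenCountEvent`, `ErrorEvent`, and `run_turn` are all hypothetical, as are the retry limit and the order in which the two events are emitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


class StreamError(Exception):
    """Hypothetical transient streaming failure; a retry may succeed."""


class ContextWindowExceeded(Exception):
    """Hypothetical error: the input does not fit, so a retry cannot help."""


@dataclass
class TokenCountEvent:
    total_tokens: int
    context_window: int

    @property
    def is_full(self) -> bool:
        return self.total_tokens >= self.context_window


@dataclass
class ErrorEvent:
    message: str


Event = Union[TokenCountEvent, ErrorEvent]


def run_turn(
    attempt: Callable[[], None],
    context_window: int,
    max_retries: int = 3,  # assumed limit, for illustration only
) -> List[Event]:
    """Run one attempt loop and return the events sent to the client."""
    events: List[Event] = []
    retries = 0
    while True:
        try:
            attempt()
            return events
        except ContextWindowExceeded as err:
            # Not retryable: report a full context window, then surface the error.
            events.append(TokenCountEvent(context_window, context_window))
            events.append(ErrorEvent(str(err)))
            return events
        except StreamError as err:
            # Retryable: try again until the retry budget is used up.
            if retries >= max_retries:
                events.append(ErrorEvent(str(err)))
                return events
            retries += 1
```

The point of the split is that only `StreamError` consumes the retry budget, while `ContextWindowExceeded` returns after a single attempt.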