perf: optimize token streaming with balanced approach (#635)

- Replace setTimeout(10ms) with queueMicrotask for immediate processing
- Add minimal 3ms setTimeout for rendering to maintain readable UX
- Reduces per-token delay while preserving streaming experience
- Add performance test to verify optimization works correctly
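
The scheduling change described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this commit: `emitToken`, `RenderFn`, and `RENDER_DELAY_MS` are hypothetical names, and the real change may structure the callbacks differently. It only shows the idea of handling each token in a microtask and deferring the render by a short timer.

```typescript
// Illustrative sketch only: `emitToken`, `RenderFn`, and RENDER_DELAY_MS are
// hypothetical names, not identifiers taken from the commit.
type RenderFn = (token: string) => void;

// The "minimal 3ms" rendering delay mentioned in the commit message.
const RENDER_DELAY_MS = 3;

function emitToken(token: string, render: RenderFn): void {
  // A microtask runs as soon as the current call stack unwinds,
  // so handling is no longer held back by a 10 ms timer.
  queueMicrotask(() => {
    // A short timer keeps a small gap before each token is rendered.
    setTimeout(() => render(token), RENDER_DELAY_MS);
  });
}

// Usage: nothing is rendered synchronously; tokens appear in arrival order
// once the short timers fire.
const rendered: string[] = [];
for (const token of ["Hel", "lo", "!"]) {
  emitToken(token, (t) => rendered.push(t));
}
```

Timers scheduled with the same delay fire in the order they were created, so the sketch keeps tokens in order without extra bookkeeping.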

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
Tomas Cupr
2025-04-25 19:49:38 +02:00
committed by GitHub
parent d401283a41
commit 4760aa1eb9
4 changed files with 142 additions and 19 deletions

@@ -60,7 +60,7 @@ function createFunctionCall(
     id: `fn_${Math.random().toString(36).slice(2)}`,
     call_id: `call_${Math.random().toString(36).slice(2)}`,
     arguments: JSON.stringify(args),
-  };
+  } as ResponseFunctionToolCallItem;
 }
// ---------------------------------------------------------------------------