This is a simple proxy I use to run non-streaming evals (like BFCL multi_turn) against the vLLM server's streaming request/response path. Start the vLLM server as usual, run the proxy (python proxy.py), and point BFCL at http://localhost:8001/v1 instead of http://localhost:8000/v1 to exercise the streaming path.
This means you can start vLLM once and run BFCL twice, once non-streaming and once streaming, by just changing OPENAI_BASE_URL, to verify basic correctness of the streaming reasoning and tool-call parsers.
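The core of such a proxy is the merge step: forward the non-streaming request upstream with stream=True, then fold the streamed deltas back into a single non-streaming message. The sketch below illustrates that merge for OpenAI-style chat completion chunks; the function name and chunk shapes are assumptions for illustration, not the actual proxy.py code.

```python
def merge_deltas(chunks):
    """Fold OpenAI-style streaming delta dicts into one assistant message.

    Content strings are concatenated; tool calls are accumulated by their
    `index` field, since names and argument strings arrive in fragments.
    """
    message = {"role": "assistant", "content": "", "tool_calls": []}
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            message["content"] += delta["content"]
        for tc in delta.get("tool_calls", []):
            idx = tc["index"]
            # Grow the tool_calls list as new indices appear.
            while len(message["tool_calls"]) <= idx:
                message["tool_calls"].append(
                    {"id": "", "type": "function",
                     "function": {"name": "", "arguments": ""}})
            slot = message["tool_calls"][idx]
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function", {})
            if fn.get("name"):
                slot["function"]["name"] += fn["name"]
            if fn.get("arguments"):
                slot["function"]["arguments"] += fn["arguments"]
    return message


# Example: two content fragments plus a tool call split across two chunks.
chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "id": "c1",
         "function": {"name": "get_weather", "arguments": '{"city":'}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"SF"}'}}]}}]},
]
msg = merge_deltas(chunks)
```

After merging, msg["content"] is "Hello" and the single tool call carries the reassembled arguments string '{"city":"SF"}'. The by-index accumulation matters because a model can emit several parallel tool calls, each streamed as interleaved fragments.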
The entire script was written by Gemini in one shot, but it has worked so far in basic testing.