@ianmcook
Last active May 14, 2025 20:37
Pipe Arrow IPC stream from curl to Python

First, start an HTTP server that serves Arrow IPC stream data. You can use one of the server examples in HTTP GET Arrow Data: Simple Examples, or simply start a Python HTTP server in the directory that contains an Arrow IPC stream file (named file.arrows in this example):

python -m http.server 8008

Download the attached Python script, script.py. You might need to run chmod +x script.py to make it executable.

Run this command at the terminal to fetch the Arrow IPC stream from the server and pipe it to the Python script:

curl -s http://localhost:8008/file.arrows | ./script.py

The script prints the first record batch in the stream to the terminal.

#!/usr/bin/env python3
import sys

import pyarrow as pa


def main():
    print("First record batch in stream:")
    # Read raw bytes from stdin (sys.stdin.buffer), not decoded text.
    with pa.ipc.open_stream(sys.stdin.buffer) as stream:
        print(stream.read_next_batch())


if __name__ == "__main__":
    main()