Last active
October 8, 2024 11:49
Get multiple files from S3 at a time into a single Pandas DataFrame with AWS SDK for Pandas (awswrangler)
import awswrangler as wr

# Read multiple JSON files under one S3 prefix into a single DataFrame
# Paths are: s3://<bucket>/yyyy/mm/dd/filename.json
# Ref: https://aws-sdk-pandas.readthedocs.io/en/stable/tutorials/003%20-%20Amazon%20S3.html#2.3.2-Reading-JSON-by-prefix
# lines=True parses each file as JSON Lines (one JSON object per line)
df = wr.s3.read_json("s3://<bucket>/prefix/", lines=True).reset_index(drop=True)
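
With the `yyyy/mm/dd` layout noted above, a date range maps to one prefix per day. The following is a minimal sketch of that idea, not part of the original gist: `daily_prefixes` and `read_days` are hypothetical helper names, and the bucket name is a placeholder. It assumes every day in the range has at least one file, since `wr.s3.read_json` may raise an error for an empty prefix.

```python
from datetime import date, timedelta


def daily_prefixes(bucket: str, start: date, end: date) -> list[str]:
    """Hypothetical helper: one s3://<bucket>/yyyy/mm/dd/ prefix per day, inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    prefixes = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        prefixes.append(f"s3://{bucket}/{day:%Y/%m/%d}/")
    return prefixes


def read_days(bucket: str, start: date, end: date):
    """Hypothetical helper: read each daily prefix and stack the results."""
    # Imported lazily so daily_prefixes() works without these packages installed.
    import awswrangler as wr
    import pandas as pd

    # One read per prefix; assumes each prefix holds JSON Lines files.
    frames = [wr.s3.read_json(p, lines=True) for p in daily_prefixes(bucket, start, end)]
    # ignore_index=True renumbers rows, like reset_index(drop=True) in the snippet above.
    return pd.concat(frames, ignore_index=True)
```

For example, `read_days("my-bucket", date(2024, 10, 7), date(2024, 10, 8))` would read everything under the two daily prefixes for those dates, given valid AWS credentials.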