Steps to Creating a Parquet Data Repository from WRDS

This guide will help you download WRDS data and store it locally as Parquet files using the db2pq Python package. No administrator privileges are required for these steps. As a side-effect, after running the following, you will have a working installation of Python on your system.

Notes

Credentials may be stored in a .pgpass file for future use
No administrator privileges are required
Start small and work up to bigger files. If you are far from the WRDS server or have a slow internet connection, download times can be longer.

🖥️ Where to run these commands

Run commands in:

Terminal (macOS or Linux)
PowerShell (Windows)

1. Install `uv`

macOS and Linux

Open Terminal and run:

curl -LsSf https://astral.sh/uv/install.sh | sh

(If you don't have curl, you can use wget -qO- instead.)

After installation, restart Terminal or run:

source ~/.zshrc

Verify installation:

uv --version

Windows

Open PowerShell:
- Click Start Menu
- Type: PowerShell
- Click Windows PowerShell (or PowerShell)
Paste and run:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Close PowerShell and open a new one.
Verify installation:

uv --version

2. Create a script folder

Create a folder where you will store scripts.

Example:

mkdir ~/scripts
cd ~/scripts

3. Create a data directory

This should be a directory where you are comfortable storing a few gigabytes of data.

Example:

mkdir ~/data

⚠️ Some WRDS tables can be large (multiple GB). Ensure you have sufficient disk space for any tables you might want to download. Note that Parquet files are generally a fraction of the size of the original WRDS data (PostgreSQL tables or SAS files) due to compression.

4. Create a `.env` file

Create a file named .env (no extension) inside your script folder.

touch .env

5. Add configuration to `.env`

Edit the .env file and add:

WRDS_ID=your_wrds_id
DATA_DIR=~/data

Replace:

your_wrds_id with your WRDS login
~/data with the location of the data directory you created above

Windows example:

DATA_DIR=C:\Users\YourName\data

6. Initialize project and install dependencies

From inside your script folder:

uv init
uv add db2pq

This creates a Python project and virtual environment in the current folder.

7. Run Python

Start a Python session:

uv run python

You should see some information about Python on your system followed by the Python prompt (>>>) where you can enter Python commands. Here's what I see:

Python 3.14.3 (main, Mar  3 2026, 15:46:55) [Clang 21.1.4 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

8. Download WRDS data

Suppose you want crsp.wrds_dailyindexret_query, where crsp is the PostgreSQL schema (SAS library) and wrds_dailyindexret_query is the table (this appears to be the "new" version of crsp.dsi).

In Python, run:

from db2pq import wrds_update_pq
wrds_update_pq("wrds_dailyindexret_query", "crsp")

You may be prompted to:

Enter your WRDS password (first time only)
Approve a 2FA request (e.g., on your phone)

9. Example output

Enter your WRDS PostgreSQL password:
Saved WRDS PostgreSQL credentials to /Users/username/.pgpass.
Updated crsp.wrds_dailyindexret_query is available.
Beginning file download at 2026-03-31 20:51:24 UTC.
Writing crsp.wrds_dailyindexret_query: 100.0% ▕██████████████████████████████████████▏ ETA <1s
Completed file download at 2026-03-31 20:51:26 UTC.
'/Users/igow/Dropbox/far_data/crsp/wrds_dailyindexret_query.parquet'

10. Result

You now have the crsp.wrds_dailyindexret_query table stored locally as a Parquet file.

11. Download other tables

You can replace "wrds_dailyindexret_query" and "crsp" in wrds_update_pq() with any table you have access to on WRDS. Even fairly large tables can be quite quick to download. For example, crsp.msf_v2 is 1.26 GB in WRDS PostgreSQL, but downloads in ~30 seconds for me. The resulting Parquet file is 207 MB.

>>> wrds_update_pq("msf_v2", "crsp")
Updated crsp.msf_v2 is available.
Beginning file download at 2026-03-30 15:14:33 UTC.
100% ▕██████████████████████████████████████▏ (00:00:31.25 elapsed)     
Completed file download at 2026-03-30 15:15:06 UTC.
'/Users/igow/data/crsp/msf_v2.parquet'

iangow/db2pq.md

Select an option

No results found

Select an option

No results found

Steps to Creating a Parquet Data Repository from WRDS

Notes

🖥️ Where to run these commands

1. Install `uv`

macOS and Linux

Windows

2. Create a script folder

3. Create a data directory

4. Create a `.env` file

5. Add configuration to `.env`

6. Initialize project and install dependencies

7. Run Python

8. Download WRDS data

9. Example output

10. Result

11. Download other tables

iangow/db2pq.md

Steps to Creating a Parquet Data Repository from WRDS

Notes

🖥️ Where to run these commands

1. Install uv

macOS and Linux

Windows

2. Create a script folder

3. Create a data directory

4. Create a .env file

5. Add configuration to .env

6. Initialize project and install dependencies

7. Run Python

8. Download WRDS data

9. Example output

10. Result

11. Download other tables

1. Install `uv`

4. Create a `.env` file

5. Add configuration to `.env`