@willingc
Created April 27, 2026 15:52
Data flow for notebook files

AI Data Flows Privacy Audit: Notebook Files

Notebooks are stored in two main file types:

  • .ipynb files, used by Jupyter
  • .py files, used by marimo

This analysis focuses on .ipynb files.

.ipynb files are JSON files that contain a notebook's code and data, along with the notebook's metadata and the output of its cells.
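To make that layout concrete, here is a minimal sketch that walks a small hand-written notebook and reports where metadata, code, and outputs sit. The example data follows the nbformat 4 layout but is purely illustrative, and `summarize_notebook` is a hypothetical helper, not part of Jupyter.

```python
import json


def summarize_notebook(nb: dict) -> dict:
    """Report where metadata, code, and outputs sit in a notebook dict."""
    cells = nb.get("cells", [])
    code_cells = [c for c in cells if c.get("cell_type") == "code"]
    return {
        "metadata_keys": sorted(nb.get("metadata", {})),
        "code_cells": len(code_cells),
        "cells_with_output": sum(1 for c in code_cells if c.get("outputs")),
    }


# Hand-written example data in the nbformat 4 layout (illustrative only).
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print('hello')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hello\n"]}
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
    ],
}

# Round-trip through JSON text, the same form an .ipynb file has on disk.
nb = json.loads(json.dumps(example))
summary = summarize_notebook(nb)
print(summary)
```

Because everything lives in one JSON document, anything that reads the file sees the code, the metadata, and the saved outputs together.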

Several tools operate on .ipynb files: nbstripout removes cell output, nbconvert converts notebooks to other formats such as .py files, and papermill runs notebooks with parameters.
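The idea behind removing output can be sketched with the standard library alone. This is not nbstripout's implementation, only an illustration of the concept under the assumption of an nbformat 4 layout; `strip_outputs` is a hypothetical helper and the email address in the example output is made up.

```python
import copy
import json


def strip_outputs(nb: dict) -> dict:
    """Return a copy of the notebook with code-cell outputs cleared."""
    cleaned = copy.deepcopy(nb)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Illustrative notebook whose saved output contains personal data.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["print(user_email)"],
            "outputs": [
                {
                    "output_type": "stream",
                    "name": "stdout",
                    "text": ["alice@example.com\n"],
                }
            ],
        }
    ],
}

cleaned = strip_outputs(notebook)
cleaned_text = json.dumps(cleaned)
print("alice@example.com" in cleaned_text)
```

Note that this only clears outputs. The cell source and the notebook metadata are left in place, so they still travel with the file.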

1. System Diagram

img.png

2. Data Flow Analysis

| Data Flow | Source | Destination | Encrypted? | Logged? | Priority |
|---|---|---|---|---|---|
| User input | Notebook app | ipynb | No | ? | Low |
| Notebook metadata | Notebook app | ipynb | No | ? | Medium |
| Notebook output | Notebook app | ipynb | No | ? | High |
| Shared ipynb notebook | People, example repos | computer | No | ? | High |
| Pairing with LLMs | Notebook app and LLM | ipynb | No | ? | High |
| Storing the notebook | Notebook | data store, GitHub | No | ? | High |