@rcarroll901
Last active May 13, 2021 05:54
Dropbox <-> Python

Python <-> Dropbox

The purpose of this document is to describe how to securely link Python code to a shared Dropbox folder using the Dropbox API's built-in authentication.


Preface (Optional)

Resource: Documentation for Python Developers

Note: All of the steps below also apply when connecting through Dropbox's other SDKs and interfaces, such as HTTP, Java, JavaScript, etc.

Note: If the data is not shared (i.e. you own it and no one else has access), the steps below still work, but there is a slightly better way of accessing your data that more closely follows the Principle of Least Privilege (POLP). Choose "App folder" instead of "Full Dropbox" in Step 2 of the "Enabling Access to Dropbox" section below. Once that section is completed, go to the "All Files" directory in your Dropbox and move the data that you want to access into the Apps/<YOUR_APP_NAME> folder, which is created automatically when the app is created.

The above approach is slightly better because it restricts access to a single folder instead of your whole Dropbox account (which is what the instructions below grant). Unfortunately, shared folders do not support this level of granular permissions management and require "Full Dropbox" access. Granting full access to your Dropbox account is an acceptable trade-off because that access is protected by the "Access Key" (the generated access token), a long, random secret that your code uses to authenticate. If you keep your Access Key safe, just like you would a password, your data and files stay protected. Tips on keeping the key secure are included later in this document.


Step 1: Enabling Access to Dropbox

  1. Go to the App Console in Dropbox

  2. Select "Create app" if you do not have any "full access" apps (i.e. apps that are not restricted to a specific App folder)

    • "Choose an API" -> "Scoped access"
    • "Choose the type of access you need" -> "Full Dropbox"
    • "Name your app" -> whatever you want (but it must be unique within all of Dropbox)
  3. Click on "Permissions".

  4. Make sure the following permissions are checked:

    • files.metadata.write
    • files.metadata.read
    • files.content.write
    • files.content.read
  5. Go back to "Settings" (where you were before clicking on "Permissions")

  6. Scroll down to the "OAuth 2" section and, below "Generated access token", click "Generate".

  7. A generated key of random digits and letters will appear below the button. This is your "password" and should be protected with every possible precaution. Password managers like LastPass are a good option for storing information like this.



Step 2: Connecting Python -> Dropbox (Securely!)

Resource: Notes on Using Pipenv

To use the "Access Key" that we got from Dropbox in a secure way, we need to ensure that the access key is NEVER COMMITTED TO GITHUB IN CODE OR A SECRETS FILE... EVER. In other words, if you want to store your access key in a file, make sure that file is listed in the .gitignore before you even paste the key into it.
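
Because the order matters here (ignore first, paste second), a small pre-flight check can help. The sketch below is a minimal, hypothetical helper (`env_file_is_ignored` and `check_repo` are illustrative names, not part of any library) that scans the text of a .gitignore for a literal .env entry. It deliberately does not interpret Git's full pattern syntax (wildcards, negation), so treat a False result as a prompt to look more closely rather than as proof; `git check-ignore .env` is the authoritative check.

```python
from pathlib import Path


def env_file_is_ignored(gitignore_text, name=".env"):
    """Return True if a line of .gitignore text literally names `name`.

    Simplification (assumption): only exact entries such as ".env" or
    "/.env" are recognized; wildcards and negated patterns are ignored.
    """
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lstrip("/") == name:
            return True
    return False


def check_repo(root="."):
    """Check the .gitignore in `root`, assumed to be the repository root."""
    gitignore = Path(root) / ".gitignore"
    if not gitignore.exists():
        return False
    return env_file_is_ignored(gitignore.read_text())
```

Calling `check_repo()` from the repository root before creating the secrets file gives a quick yes/no answer.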

Motivation: Our goal is to (1) access a private shared file in our Dropbox account and (2) avoid using the Access Key in any code whatsoever.

Keeping Your Secrets a Secret:

More detail is provided in the "Notes on Using Pipenv", but to summarize, we need to implement the following steps to accomplish both of our goals above:

  1. If you are using git version control, create a .gitignore file in the root of your repository if one does not already exist, and add .env to it. This entry should ALWAYS be in the .gitignore from the very beginning of a project. It prevents the .env file from being committed to GitHub, even if you explicitly try to add it (git refuses unless you force it with git add -f).

  2. Create a .env file in your root directory (alongside the Pipfile and Pipfile.lock) and paste the following text inside (except with your actual access key):

    DROPBOX_ACCESS_KEY=STOPlookingATmyACCESSkey
    
  3. Now, we are going to load our DROPBOX_ACCESS_KEY variable into the environment with pipenv.

    foo@bar:~$ pipenv shell
    Loading .env environment variables...
    Launching subshell in virtual environment...
     . ~/repo-WcdiAtXE/bin/activate
    foo@bar:~$ echo $DROPBOX_ACCESS_KEY
    STOPlookingATmyACCESSkey

    Note the "Loading .env environment variables..." line below the $ pipenv shell command. All Python code that we run from this shell, whether with $ python3 example_script.py or in a Jupyter notebook, now has access to these environment variables (detailed below). If we exit the pipenv environment (with $ exit) and then run $ echo $DROPBOX_ACCESS_KEY again, the output is blank.

  4. Authenticate and pull the data into your Python script and/or Jupyter notebook. This step assumes the dropbox package is installed in your virtual environment: if it is already listed in the Pipfile/Pipfile.lock, $ pipenv sync installs it along with all the other listed packages; otherwise, install it from PyPI with $ pipenv install dropbox. Once the dropbox package is installed, we can put it all together:

    import os
    import dropbox
    
    access_token = os.environ['DROPBOX_ACCESS_KEY']
    dbx = dropbox.Dropbox(access_token)
    
    # print directories/files in Dropbox root directory
    for entry in dbx.files_list_folder('').entries:
        print(entry.name)

    Note that we imported the os module and used os.environ (a dictionary-like mapping keyed by environment variable name) to grab our variables from the pipenv environment.
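
One wrinkle with os.environ['DROPBOX_ACCESS_KEY']: if the script is run outside the pipenv environment, the lookup fails with a bare KeyError that gives no hint about the cause. A minimal sketch of a friendlier lookup follows. `load_dropbox_token` is a hypothetical helper name, and its `environ` parameter is an assumption added only so the function can be exercised without touching the real environment.

```python
import os


def load_dropbox_token(environ=None, name="DROPBOX_ACCESS_KEY"):
    """Return the token stored under `name`, or raise with a hint."""
    if environ is None:
        environ = os.environ
    # Treat a missing variable and an empty/whitespace-only one the same way.
    token = environ.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set. Run this script inside `pipenv shell` "
            "(or via `pipenv run`) so that .env is loaded."
        )
    return token
```

With this in place, the client could be built as dropbox.Dropbox(load_dropbox_token()).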




Appendix: Using the Dropbox API for Data Science

Reading Dataframes from Dropbox (Inspired by Arnau Villoro)

import io
import os

import dropbox
import pandas as pd

def read_df_from_dropbox(path):
    # authenticate with the token loaded from the environment
    token = os.environ['DROPBOX_ACCESS_KEY']
    dbx = dropbox.Dropbox(token)

    # download the file's contents into memory (nothing is written to disk)
    _, res = dbx.files_download(path)

    # parse the CSV bytes into a dataframe, using the first column as the index
    with io.BytesIO(res.content) as stream:
        df = pd.read_csv(stream, index_col=0)

    return df

Writing Dataframes to Dropbox (Inspired by Max Halford)

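import os

import dropbox

def write_df_to_dropbox(dataframe, path):
    # authenticate with the token loaded from the environment
    token = os.environ['DROPBOX_ACCESS_KEY']
    dbx = dropbox.Dropbox(token)

    # serialize the dataframe to CSV bytes in memory (the index is not written)
    df_string = dataframe.to_csv(index=False)
    db_bytes = bytes(df_string, 'utf8')

    # upload, replacing any existing file at `path`
    dbx.files_upload(
        f=db_bytes,
        path=path,
        mode=dropbox.files.WriteMode.overwrite
    )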
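
One thing to watch when pairing the two helpers: the reader uses index_col=0 while the writer uses index=False, so a dataframe written by one and read back by the other returns with its first data column promoted to the index. The sketch below reproduces this in memory, without Dropbox, using made-up sample data. `to_csv_bytes` and `from_csv_bytes` are hypothetical stand-ins for the serialization and parsing steps of the functions above.

```python
import io

import pandas as pd


def to_csv_bytes(dataframe, index):
    """Serialize as write_df_to_dropbox does, with a selectable `index`."""
    return dataframe.to_csv(index=index).encode("utf-8")


def from_csv_bytes(content):
    """Parse as read_df_from_dropbox does (first column becomes the index)."""
    with io.BytesIO(content) as stream:
        return pd.read_csv(stream, index_col=0)


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})  # made-up sample data

# index=False: column "a" is consumed as the index on the way back in
mismatched = from_csv_bytes(to_csv_bytes(df, index=False))

# index=True: the original index is written out and restored on read
matched = from_csv_bytes(to_csv_bytes(df, index=True))

print(list(mismatched.columns))
print(list(matched.columns))
```

If round trips matter, one option is to make the two settings agree, for instance by writing with index=True or by dropping index_col=0 on read.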