The purpose of this document is to describe how to securely link Python code to a shared Dropbox folder using the Dropbox API's built-in authentication.
Resource: Documentation for Python Developers
Note: All of the steps below can also be used to grant access through other SDKs, such as HTTP, Java, JavaScript, etc.
Note: If the data is not shared (i.e. you own it and no one else has access), the steps below still work, but there is a slightly better approach that aligns more closely with the Principle of Least Privilege (POLP). Simply choose "App folder" instead of "Full Dropbox" in Step 3 of the "Enabling Access to Dropbox" section below. Once that section is completed, go to the "All Files" directory in your Dropbox and move the data that you want to access into the Apps/<YOUR_APP_NAME> folder, which is created automatically when the app is created.
This approach is slightly better because it restricts access to a single folder instead of your whole Dropbox account (which is what the instructions below do). Unfortunately, shared folders do not support this level of granular permissions management and require "Full Dropbox" access. Granting full access to your Dropbox account is an acceptable risk because access is protected by the "Access Key", a long, randomly generated secret that your code uses to authenticate. If you keep your Access Key safe, just as you would a password, your data and files remain safe as well. Tips on keeping the key secure are included later in this document.
Enabling Access to Dropbox
- Go to the App Console in Dropbox.
- Select "Create app" if you do not have any "full access" apps (i.e. apps that are not restricted to a specific App folder):
  - "Choose an API" -> "Scoped access"
  - "Choose the type of access you need" -> "Full Dropbox"
  - "Name your app" -> whatever you want (but it must be unique within all of Dropbox)
- Click on "Permissions".
- Make sure the following permissions are checked:
  - files.metadata.write
  - files.metadata.read
  - files.content.write
  - files.content.read
- Go back to "Settings" (where you were before clicking on "Permissions").
- Scroll down to the "OAuth 2" section and, below "Generated access token", click "Generate".
- A generated key of random digits and letters will appear below the button. This is your "password" and should be protected with every possible precaution. Password managers like LastPass are a good option for storing information like this.
Resource: Notes on Using Pipenv
To use the "Access Key" that we got from Dropbox securely, we need to ensure that it is NEVER committed to GitHub, whether in code or in a secrets file. In other words, if you want to add your access key to a file, make sure that file is listed in the .gitignore before you even paste the key into that file.
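As a quick safeguard, the rule above can be checked from Python before the key is ever saved. The following is only a minimal sketch: the helper name `env_file_is_ignored` is hypothetical, and it looks for a literal `.env` (or `/.env`) line rather than implementing full gitignore pattern matching. For an authoritative answer, `git check-ignore .env` asks git itself.

```python
from pathlib import Path


def env_file_is_ignored(repo_root, env_name=".env"):
    """Return True if `env_name` appears as its own pattern in .gitignore.

    Hypothetical helper: a literal match only, not full gitignore semantics.
    """
    gitignore = Path(repo_root) / ".gitignore"
    if not gitignore.exists():
        return False
    for line in gitignore.read_text().splitlines():
        pattern = line.strip()
        # Skip blank lines and comments.
        if not pattern or pattern.startswith("#"):
            continue
        # Accept the plain name and the root-anchored form.
        if pattern in (env_name, "/" + env_name):
            return True
    return False


if __name__ == "__main__":
    if env_file_is_ignored("."):
        print(".env is listed in .gitignore")
    else:
        print("WARNING: add .env to .gitignore before saving your key")
```

Running this from the repository root before creating the secrets file gives an early warning if the ignore rule is missing.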
Motivation: Our goal is to (1) access a private shared file in our Dropbox account and (2) avoid writing the Access Key into any code whatsoever.
More detail is provided in the "Notes on Using Pipenv", but to summarize, we need to implement the following steps to accomplish both of our goals above:
- If you are using git version control, create a .gitignore file in the root of your repository if one does not already exist and add .env to it. This entry should ALWAYS be in the .gitignore from the very beginning of a project. It prevents the .env file from being committed to GitHub, even if you explicitly try to add it (git refuses unless you force it with git add -f).
- Create a .env file in your root directory (alongside the Pipfile and Pipfile.lock) and paste the following text inside (except with your actual access key):

      DROPBOX_ACCESS_KEY=STOPlookingATmyACCESSkey
- Now we are going to load our DROPBOX_ACCESS_KEY variable into the environment with pipenv.

      foo@bar:~$ pipenv shell
      Loading .env environment variables...
      Launching subshell in virtual environment...
       . ~/repo-WcdiAtXE/bin/activate
      foo@bar:~$ echo $DROPBOX_ACCESS_KEY
      STOPlookingATmyACCESSkey

  Note the "Loading .env environment variables..." line below the $ pipenv shell command. Now all Python code that we run from this directory using $ python3 example_script.py or in a Jupyter notebook will have access to these environment variables (detailed below). If we were to exit the pipenv environment (with $ exit) and then run $ echo $DROPBOX_ACCESS_KEY again, it would be blank.
- Authenticate and pull the data into your Python script and/or Jupyter notebook. This step assumes that either (1) the dropbox package is already listed in the Pipfile/Pipfile.lock, which simply requires $ pipenv sync to install all the listed packages, or (2) if dropbox is not already listed in the Pipfile, it should be installed from PyPI by the user with $ pipenv install dropbox. Once the dropbox package is installed into your virtual environment, we can use it all together:

      import os
      import dropbox

      access_token = os.environ['DROPBOX_ACCESS_KEY']
      dbx = dropbox.Dropbox(access_token)

      # print directories/files in Dropbox root directory
      for entry in dbx.files_list_folder('').entries:
          print(entry.name)

  Note that we imported the os module and use os.environ (which behaves like a Python dictionary keyed on your environment variables!) to grab our variables from the pipenv environment.
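Because os.environ behaves like a dictionary, a missing variable raises a bare KeyError, which can be confusing when the script is run outside the pipenv environment. One possible way to make that failure clearer is sketched below. The helper name `get_dropbox_token` and its `environ` parameter are hypothetical, introduced only for illustration and so that the lookup can be tried with a plain dictionary.

```python
import os


def get_dropbox_token(environ=None, name="DROPBOX_ACCESS_KEY"):
    """Return the token stored under `name`, or raise a descriptive error.

    Hypothetical helper; `environ` defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    # .get() avoids a KeyError; strip() drops stray whitespace or newlines.
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set. Run this script from inside `pipenv shell` "
            "so that the .env file is loaded."
        )
    return token
```

With this in place, the client could be created as `dbx = dropbox.Dropbox(get_dropbox_token())`, and a forgotten `pipenv shell` produces a readable message instead of a traceback ending in KeyError.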
Reading Dataframes from Dropbox (Inspired by Arnau Villoro)
    import io
    import os

    import dropbox
    import pandas as pd


    def read_df_from_dropbox(path):
        # authenticate with the token loaded from the environment
        token = os.environ['DROPBOX_ACCESS_KEY']
        dbx = dropbox.Dropbox(token)

        # download the file contents into memory (nothing is saved to disk)
        _, res = dbx.files_download(path)

        # read the bytes into a dataframe, using the first column as the index
        with io.BytesIO(res.content) as stream:
            df = pd.read_csv(stream, index_col=0)
        return df
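The key idea in read_df_from_dropbox is that the downloaded bytes never touch the disk: io.BytesIO wraps them in a file-like object that a CSV parser can read directly. The sketch below illustrates the same idea using only the standard library, so it can be tried without a Dropbox account or pandas. The function name `rows_from_csv_bytes` is hypothetical, and the sample bytes stand in for `res.content`.

```python
import csv
import io


def rows_from_csv_bytes(content):
    """Parse CSV bytes held in memory into a list of row dictionaries."""
    # BytesIO makes the bytes look like an open binary file.
    with io.BytesIO(content) as stream:
        # TextIOWrapper decodes the binary stream so csv can read text lines.
        text = io.TextIOWrapper(stream, encoding="utf-8", newline="")
        return list(csv.DictReader(text))


if __name__ == "__main__":
    sample = b"id,value\n1,a\n2,b\n"  # stand-in for res.content
    for row in rows_from_csv_bytes(sample):
        print(row["id"], row["value"])
```

pandas.read_csv accepts the same kind of in-memory stream, which is why the function above can pass the BytesIO object straight to it.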
Writing Dataframes to Dropbox (Inspired by Max Halford)
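    import os

    import dropbox


    def write_df_to_dropbox(dataframe, path):
        # authenticate with the token loaded from the environment
        token = os.environ['DROPBOX_ACCESS_KEY']
        dbx = dropbox.Dropbox(token)

        # serialize the dataframe to CSV (without the index) and encode it as bytes
        df_string = dataframe.to_csv(index=False)
        db_bytes = bytes(df_string, 'utf8')

        # upload the bytes, overwriting any existing file at `path`
        dbx.files_upload(
            f=db_bytes,
            path=path,
            mode=dropbox.files.WriteMode.overwrite
        )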
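Both helper functions take a Dropbox path. The Dropbox API expects file and folder paths to start with a forward slash, while the root folder itself is the empty string (as in the files_list_folder('') call earlier). A small normalizer can guard against malformed-path errors. This is only a sketch under those assumptions: the name `to_dropbox_path` is hypothetical, and it does not handle the API's `id:` or `ns:` path forms.

```python
def to_dropbox_path(path):
    """Normalize a user-supplied path into the slash-prefixed form.

    Hypothetical helper; does not handle "id:..." or "ns:..." paths.
    """
    # Use forward slashes and drop empty segments and surrounding whitespace.
    parts = [p for p in path.strip().replace("\\", "/").split("/") if p]
    # The root folder is the empty string; everything else starts with "/".
    return "/" + "/".join(parts) if parts else ""
```

For example, `write_df_to_dropbox(df, to_dropbox_path("shared/results.csv"))` would upload to /shared/results.csv. Note that an upload needs a file path, so the empty root string is only meaningful for listing folders.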