Last active
August 5, 2021 19:57
Local Blob Upload
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Local Blob Upload Settings and Execution\n",
    "## Setting up the local machine and installing required packages\n",
    "1. Install the Azure Blob Storage client library for Python:\n",
    "```bash\n",
    "pip install azure-storage-blob\n",
    "```\n",
    "2. Get the connection string for the storage account that holds the blob container you plan to write to:\n",
    "   \n",
    "   - In the Azure portal (portal.azure.com/), go to `Storage accounts` > `YOUR STORAGE ACCOUNT` > `Access keys` and copy either the `key1 Connection string` OR the `key2 Connection string`.\n",
    "3. Set the environment variable:\n",
    "   - Environment variable name (must match exactly): `AZURE_STORAGE_CONNECTION_STRING`\n",
    "   - Windows:\n",
    "      1. Click Start, then click Control Panel.\n",
    "         - The Control Panel opens.\n",
    "      2. Click User Accounts.\n",
    "      3. Click User Accounts again.\n",
    "      4. In the Task side pane on the left, click Change my environment variables.\n",
    "         - The Environment Variables dialog box opens.\n",
    "      5. Click New, enter `AZURE_STORAGE_CONNECTION_STRING` as the variable name and the connection string from step 2 as the value, then click OK.\n",
    "   - Linux:\n",
    "      1. Append the export line to your shell startup file, replacing the placeholder with the connection string from step 2, then open a new terminal so the variable takes effect:\n",
    "         - bash:\n",
    "           ```bash\n",
    "           echo 'export AZURE_STORAGE_CONNECTION_STRING=\"YOUR_CONNECTION_STRING_ACCESS_KEY_FROM_ABOVE_STEP\"' >> ~/.bashrc\n",
    "           ```\n",
    "         - zsh:\n",
    "           ```bash\n",
    "           echo 'export AZURE_STORAGE_CONNECTION_STRING=\"YOUR_CONNECTION_STRING_ACCESS_KEY_FROM_ABOVE_STEP\"' >> ~/.zshrc\n",
    "           ```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import Packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os  # Used to access environment variables\n",
    "from azure.storage.blob import BlobServiceClient, __version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define variable names, such as the local file path and the desired name on the blob:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "fileToBeUploaded = './featureMatrix.csv'  # Local path to the input CSV (example: './featureMatrix.csv'); a CSV/Parquet file written out by a Python script can also be uploaded\n",
    "container_name = 'dealinput'  # Name of the container to write to (example: 'dealinput')\n",
    "blob_dir = 'AEDW_Updates_Raw/together'  # Path inside the container where the CSV will be uploaded (example: 'AEDW_Updates_Raw')\n",
    "newBlobFileName = 'newFileUpload.csv'  # Name the file should have on the blob (example: 'newCSVonBlob.csv')\n",
    "completeBlobPathWithFileName = blob_dir + '/' + newBlobFileName  # Full blob path: directory + '/' + file name"
   ]
  },
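  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: a more forgiving way to build the blob path\n",
    "A minimal sketch, not part of the original workflow: joining `blob_dir` and `newBlobFileName` with `+ '/' +` produces a doubled or leading slash when `blob_dir` ends with `/` or is empty. The helper name `build_blob_path` is hypothetical; it only strips stray slashes from each part before joining them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_blob_path(blob_dir, file_name):\n",
    "    # Hypothetical helper: drop leading/trailing '/' from each part and skip empty parts\n",
    "    parts = [p.strip('/') for p in (blob_dir, file_name) if p and p.strip('/')]\n",
    "    return '/'.join(parts)\n",
    "\n",
    "# Works with or without a trailing slash on the directory, and with an empty directory\n",
    "print(build_blob_path('AEDW_Updates_Raw/together/', 'newFileUpload.csv'))\n",
    "print(build_blob_path('', 'newFileUpload.csv'))"
   ]
  },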
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Make sure your environment variable is being read:\n",
    "Uncommenting the `print` line should print something like \"DefaultEndpointsProtocol=...\". The value contains your account key, so avoid leaving it in saved cell output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')\n",
    "if connect_str is None:  # os.getenv returns None when the variable is missing\n",
    "    raise RuntimeError('AZURE_STORAGE_CONNECTION_STRING is not set')\n",
    "# print(connect_str)"
   ]
  },
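  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: check the connection string without printing the key\n",
    "A hedged sketch of one way to confirm the variable looks right without echoing the secret: split the `Key=Value;Key=Value` string and look only at the account name. The function name `connection_string_fields` and the example string (account `myaccount`, key `abc123==`) are made-up placeholders, not values from this project."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def connection_string_fields(connect_str):\n",
    "    # Split on ';' into pairs, then on the first '=' only,\n",
    "    # because base64 account keys can end with '=' padding\n",
    "    fields = {}\n",
    "    for pair in connect_str.split(';'):\n",
    "        if pair:\n",
    "            key, _, value = pair.partition('=')\n",
    "            fields[key] = value\n",
    "    return fields\n",
    "\n",
    "# Placeholder string for illustration; pass connect_str to check your own setup\n",
    "example = 'DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=abc123==;EndpointSuffix=core.windows.net'\n",
    "fields = connection_string_fields(example)\n",
    "print('Account name:', fields.get('AccountName'))\n",
    "print('Account key present:', 'AccountKey' in fields)"
   ]
  },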
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Connect to the Blob Service Client and specify the blob container and path + file name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the BlobServiceClient object which will be used to get a blob client\n",
    "connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')\n",
    "blob_service_client = BlobServiceClient.from_connection_string(connect_str)\n",
    "\n",
    "# Create a blob client for the target container, using the full blob path (directory + file name) as the blob name\n",
    "blob_client = blob_service_client.get_blob_client(container=container_name, blob=completeBlobPathWithFileName)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Use the blob client to write the file to the blob:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the local file in binary mode and stream it to the blob.\n",
    "# By default upload_blob raises ResourceExistsError if the blob already exists.\n",
    "with open(fileToBeUploaded, \"rb\") as data:\n",
    "    print('uploading to blob...')\n",
    "    blob_client.upload_blob(data)\n",
    "print('done!')"
   ]
  },
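  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional: re-running the upload\n",
    "A minimal sketch for repeated runs, assuming `blob_client` is an `azure.storage.blob.BlobClient` like the one created above. The wrapper name `upload_file` is hypothetical; it only forwards the `overwrite` flag of `upload_blob`, which replaces an existing blob instead of raising an error when set to `True`. The example call is commented out so running this cell uploads nothing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def upload_file(blob_client, local_path, overwrite=False):\n",
    "    # Hypothetical wrapper around BlobClient.upload_blob\n",
    "    with open(local_path, 'rb') as data:\n",
    "        # overwrite=True replaces an existing blob; overwrite=False keeps the default behavior\n",
    "        blob_client.upload_blob(data, overwrite=overwrite)\n",
    "\n",
    "# Example call, using the names defined earlier in this notebook:\n",
    "# upload_file(blob_client, fileToBeUploaded, overwrite=True)"
   ]
  },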
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### File should now be located in the specified blob container!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python (ret)",
   "language": "python",
   "name": "ret"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}