| @echo off | |
| REM "ftype /?" explains all of this assoc and ftype and PATHEXT usage | |
| REM https://docs.python.org/2/using/windows.html for more info around the subject. | |
| REM set PythonDIR to your python 2 or 3 install path; e.g. the folder with python.exe in it. | |
| set PythonDIR=C:\Users\IBM_ADMIN\rcs\python-2.7.9 | |
| set PATH=%PythonDIR%;%PythonDIR%\Scripts;%PATH% | |
| set PYTHONPATH=%PythonDIR%\Lib;%PythonDIR%\Lib\site-packages;%PythonDIR%\DLLs; | |
| set PATHEXT=%PATHEXT%;.PY;.PYW |
| start_time = time.time() | |
| #pdf = pd.read_csv("http://192.168.0.2:8001/pp-monthly-update-new-version.csv",names=cnames) | |
| pdf = pd.read_csv("/mnt/nwdrive/Backup/datasets/pp-complete.txt",names=cnames) | |
| elapsed_time = time.time() - start_time | |
| print(elapsed_time)  # time.time() differences are plain floats (seconds), not timedelta objects | |
| hours, rem = divmod(elapsed_time, 3600) | |
| minutes, seconds = divmod(rem, 60) | |
| print("{:0>2}:{:0>2}:{:05.2f}".format(int(hours),int(minutes),seconds)) |
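The timing pattern above can be wrapped in a small helper. This is a minimal sketch, not the author's code: `format_elapsed` is a hypothetical name, and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic clock intended for measuring intervals. The `sum(range(...))` call is only a stand-in for the slow `read_csv` call.

```python
import time


def format_elapsed(seconds):
    """Format a duration given in seconds as HH:MM:SS.ss (hypothetical helper)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return "{:0>2}:{:0>2}:{:05.2f}".format(int(hours), int(minutes), secs)


start = time.perf_counter()
sum(range(100_000))  # stand-in for the slow call being timed
elapsed = time.perf_counter() - start  # a plain float, in seconds

print(format_elapsed(elapsed))
print(format_elapsed(3725.5))  # 01:02:05.50
```

Because the elapsed value is a float, it can be passed straight to `divmod`; no `.total_seconds()` call is needed.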
| import time | |
| start_time = time.time() | |
| df = None | |
| count = 0 | |
| for chunk in pd.read_csv("/mnt/nwdrive/Backup/datasets/pp-complete.txt",names=cnames, chunksize=10000): | |
|     # we are going to append to each table by group | |
|     # we are not going to create indexes at this time | |
|     # but we *ARE* going to create (some) data_columns | |
|     if df is None: | |
|         df = dd.from_pandas(chunk,npartitions=1) |
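The idea behind `chunksize` is that the file is consumed a fixed number of rows at a time instead of being loaded whole. The sketch below illustrates that idea with the standard library only; it is a stand-in for, not a description of, what pandas does internally. `read_in_chunks` is a hypothetical helper and the sample data is made up.

```python
import csv
import io
from itertools import islice


def read_in_chunks(handle, chunksize):
    """Yield lists of at most `chunksize` parsed CSV rows (hypothetical helper)."""
    reader = csv.reader(handle)
    while True:
        chunk = list(islice(reader, chunksize))
        if not chunk:
            break
        yield chunk


# Made-up data standing in for the large file on disk.
sample = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")

chunk_sizes = [len(chunk) for chunk in read_in_chunks(sample, chunksize=2)]
print(chunk_sizes)  # [2, 2, 1]
```

Only one chunk is held in memory at a time, which is why the pattern suits files that do not fit in RAM.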
| import pandas as pd | |
| import dask.dataframe as dd | |
| from distributed import Client | |
| client = Client('192.168.0.7:8786') | |
| strcnames = """transaction | |
| price | |
| transfer_date | |
| postcode | |
| property_type | |
| newly_built |
| #Find count of words in the text file | |
| #import regex module | |
| import re | |
| #import add from operator module | |
| from operator import add | |
| #Read a text file and create RDD lines | |
| lines = sc.textFile("wordtxt.txt") | |
| #count total no of lines | |
| print('number of lines in file: %d' % lines.count()) | |
| #add up lengths of each line |
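For readers without a running SparkContext, the same counting steps can be tried in plain Python. This is only an analogue under stated assumptions: a list stands in for the RDD, `functools.reduce` with `operator.add` stands in for an RDD reduce, the `\w+` pattern is one possible definition of a word, and the sample lines are made up.

```python
import re
from functools import reduce
from operator import add

# Made-up sample standing in for the contents of the text file.
lines = ["the quick brown fox", "", "jumps over the lazy dog"]

# Count the lines.
line_count = len(lines)

# Map each line to its length, then add the lengths up.
total_length = reduce(add, map(len, lines), 0)

# Split every line into words with a regex and count them.
words = [word for line in lines for word in re.findall(r"\w+", line)]
word_count = len(words)

print(line_count, total_length, word_count)
```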
| #Basic Non Empty line count example | |
| #Read a text file and create RDD lines | |
| lines = sc.textFile("wordtxt.txt") | |
| #Use transformation function filter to create notEmptyLines from lines | |
| notEmptyLines = lines.filter(lambda line:len(line)>0) | |
| #Execute action count to save the count of non empty lines to count | |
| #And print it | |
| count = notEmptyLines.count() | |
| print(count) | |
| #Sample output |
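Spark transformations such as `filter` are lazy, and nothing is computed until an action such as `count` runs. Python's built-in `filter` behaves in a loosely similar way, so the split can be sketched without Spark. The sample lines are made up, and the comparison is an analogy rather than a description of Spark internals.

```python
# Made-up sample standing in for the lines of the text file.
lines = ["first line", "", "second line", "", ""]

# Lazy step: builds an iterator, no line is inspected yet.
not_empty_lines = filter(lambda line: len(line) > 0, lines)

# Consuming the iterator plays the role of the action.
count = sum(1 for _ in not_empty_lines)
print(count)  # 2
```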
| Script Series Topics | |
| 25-40 Pyspark | |
| 41-50 Data Analytics & Big Data | |
| 51-60 Data Science, Machine Learning & Graph Analytics | |
| 61-70 Miscellaneous | |
| 70-80 Raspberry Pi | |
| 80-90 Web Development | |
| 90-100 Deep Learning |
| import pickle | |
| import sys | |
| import os | |
| import json | |
| from ttp import ttp | |
| picklefile = sys.argv[1] | |
| jsonfile = picklefile.replace(".pickle",".json") | |
| tweets = None | |
| with open(picklefile,"rb") as pf: |
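The rest of this script is not shown in the listing, so the following is a sketch of a pickle-to-JSON conversion under stated assumptions, not a reconstruction of it. The `records` list is a hypothetical stand-in for the pickled tweets, a temporary directory replaces the command-line path, and `os.path.splitext` is swapped in for `str.replace` so that only the final extension is changed.

```python
import json
import os
import pickle
import tempfile

# Hypothetical stand-in for the pickled tweets.
records = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]

with tempfile.TemporaryDirectory() as tmpdir:
    picklefile = os.path.join(tmpdir, "tweets.pickle")
    with open(picklefile, "wb") as pf:
        pickle.dump(records, pf)

    # splitext only touches the final extension, unlike str.replace,
    # which would also rewrite ".pickle" inside a directory name.
    jsonfile = os.path.splitext(picklefile)[0] + ".json"

    with open(picklefile, "rb") as pf:
        loaded = pickle.load(pf)
    with open(jsonfile, "w") as jf:
        json.dump(loaded, jf)

    with open(jsonfile) as jf:
        round_tripped = json.load(jf)

print(round_tripped == records)
```

This only works when the pickled objects are JSON-serializable (dicts, lists, strings, numbers); only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code.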
| import numpy as np | |
| # Tuple Example | |
| mytuple = ('abc',np.arange(0,3,0.2),2.5) | |
| print("Tuple:",mytuple) | |
| print("Tuple Index 2:",mytuple[2]) | |
| # List Example | |
| mylist = ['abc','def','ghij'] | |
| mylist.append('klm') | |
| print("List1:",mylist) | |
| mylist2 = [1,2,3] |
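The practical difference between the two containers is mutability: a list can be changed in place, a tuple cannot. A minimal sketch with made-up values (NumPy is left out to keep it dependency-free):

```python
mytuple = ("abc", 2.5)
mylist = ["abc", "def"]

# Lists can grow in place.
mylist.append("ghi")

# Assigning to a tuple element raises TypeError.
try:
    mytuple[0] = "xyz"
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# Concatenating two lists builds a new list and leaves mylist unchanged.
combined = mylist + [1, 2, 3]

print(mylist, tuple_is_mutable, combined)
```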
| APP_NAME = "" | |
| CONSUMER_KEY = "" | |
| CONSUMER_SECRET = "" | |
| access_token = "" | |
| access_token_secret = "" |