First, import the Python libraries we need.
import pandas as pd
import numpy as np
from datetime import datetime
You can use pd.read_csv to read a CSV file. pandas also has functions for reading SQL tables (pd.read_sql) and Excel files (pd.read_excel).
content = pd.read_csv('content.csv', header=None, names=['content_id', 'video_name'])
content.head()
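As a small illustration of what header=None and names do, here is a sketch that reads an in-memory stand-in for a headerless CSV. The two rows and the name demo_content are made up for the example and are not taken from content.csv.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for a CSV file that has no header row.
raw = io.StringIO("1,intro_video\n2,second_video\n")

# header=None tells pandas the first line is data, not column names;
# names supplies the column labels instead.
demo_content = pd.read_csv(raw, header=None, names=['content_id', 'video_name'])
print(demo_content)
```

Without header=None, pandas would treat the first data row as the column names.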
Here is another table. The file has no header row, so we pass the column names as a list through the names argument.
share_content = pd.read_csv('sharecontentviabluetoothevent.csv', header=None,
names=['id', 'time', 'ip', 'alt', 'lat', 'lon', 'device_id', 'peer_id', 'content_id', 'app_id'])
share_content.head()
Here is how to sort the rows by a column.
# sort_values replaces the older DataFrame.sort, which current pandas no longer provides
share_content_sorted = share_content.sort_values('device_id')
share_content_sorted.head()
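Sorting can also take several columns and a direction for each. The sketch below uses a tiny made-up frame (demo, by_device and by_device_time are illustrative names) rather than the real event data.

```python
import pandas as pd

# Made-up data: four events from three devices.
demo = pd.DataFrame({'device_id': [3, 1, 2, 1], 'time': [40, 30, 20, 10]})

# Sort by a single column, ascending by default.
by_device = demo.sort_values('device_id')

# Sort by device_id ascending, then by time descending within each device.
by_device_time = demo.sort_values(['device_id', 'time'], ascending=[True, False])
print(by_device_time)
```

sort_values returns a new DataFrame and leaves the original untouched, which is why the result is assigned to a new variable.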
event = pd.read_csv('sharesawboevent.csv', header=None,
names=['id', 'time', 'ip', 'alt', 'lat', 'lon', 'device_id', 'peer_id', 'app_id'])
event.head()
Join two tables with merge. Here each share event is matched to its video name through the content_id column.
share_content_merged = share_content.merge(content, on='content_id')
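By default merge does an inner join, so rows whose key has no match in the other table are dropped. The sketch below shows the difference from a left join on made-up data; shares, videos, inner and left are illustrative names only.

```python
import pandas as pd

# Made-up tables: content_id 9 has no entry in videos.
shares = pd.DataFrame({'device_id': ['a', 'b', 'c'], 'content_id': [1, 2, 9]})
videos = pd.DataFrame({'content_id': [1, 2], 'video_name': ['intro', 'second']})

# Default how='inner': only rows with a matching content_id survive.
inner = shares.merge(videos, on='content_id')

# how='left': keep every row of shares; unmatched rows get NaN for video_name.
left = shares.merge(videos, on='content_id', how='left')
print(len(inner), len(left))
```

If the row count shrinks after a merge, comparing against a left join like this is a quick way to see which keys failed to match.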
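Python's datetime library can convert a timestamp to a datetime object. fromtimestamp expects seconds, and the time column appears to hold milliseconds, hence the division by 1000.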
date = datetime.fromtimestamp(share_content_merged.time[0]/1000)
Notice that you can select a column, e.g. share_content_merged['time'], and map a Python function over it.
share_content_merged['time_proc'] = share_content_merged['time'].map(lambda x: datetime.fromtimestamp(x / 1000))
share_content_merged.ip = share_content_merged.ip.fillna('')
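As an alternative to mapping a Python function over the time column, pandas can convert a whole column of epoch values at once with pd.to_datetime. This is a sketch on made-up values, assuming the timestamps are milliseconds since the Unix epoch. Note one difference: pd.to_datetime with unit='ms' yields UTC-based times, while datetime.fromtimestamp uses the machine's local time zone.

```python
import pandas as pd

# Made-up millisecond timestamps (assumed to be milliseconds since the Unix epoch).
demo = pd.DataFrame({'time': [0, 1_500_000_000_000]})

# Vectorized conversion: no Python-level loop or lambda needed.
demo['time_proc'] = pd.to_datetime(demo['time'], unit='ms')
print(demo)
```

The resulting column has a datetime dtype, so accessors such as demo['time_proc'].dt.hour become available.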
To access a row by position, use share_content_merged.iloc[0].
You can then pull individual values out of that row, e.g. share_content_merged.iloc[0].time_proc
(this gives the datetime; .time still holds the raw timestamp).
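The row access described above can be sketched on a one-row made-up frame. The names demo and first are illustrative, and bracket access is used here because it works for any column name, including ones that clash with existing attributes.

```python
import pandas as pd

# Made-up single event with a raw millisecond timestamp.
demo = pd.DataFrame({'time': [1_500_000_000_000], 'video_name': ['intro']})
demo['time_proc'] = pd.to_datetime(demo['time'], unit='ms')

# iloc[0] returns the first row as a Series indexed by column name.
first = demo.iloc[0]

raw_value = first['time']        # the raw millisecond timestamp
parsed_value = first['time_proc']  # the converted datetime value
print(raw_value, parsed_value)
```

iloc selects by position, so iloc[0] is always the first row even after sorting or filtering has changed the index labels.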