@titipata
Last active August 29, 2015 14:22
Sawbo notebook for N' Pai

First, we load the Python libraries we need.

import pandas as pd
import numpy as np
from datetime import datetime

You can use pd.read_csv to read a CSV file. pandas also has functions to read SQL tables (pd.read_sql) and Excel files (pd.read_excel).

content = pd.read_csv('content.csv', header=None, names=['content_id', 'video_name'])
content.head()
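
As a small sketch of what header=None and names do, the example below reads a made-up two-row CSV from memory instead of from disk. The rows and the name demo_content are invented for illustration and are not taken from content.csv.

```python
import io
import pandas as pd

# Hypothetical file contents: two data rows and no header line
raw = io.StringIO("1,handwashing\n2,malaria_prevention\n")

# header=None treats the first line as data rather than column labels;
# names supplies the column labels ourselves
demo_content = pd.read_csv(raw, header=None, names=['content_id', 'video_name'])
print(demo_content)
```

Without header=None, pandas would take the first data row as the column names and the table would be one row short.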

Here is another table. We pass the column names as a list through the names argument.

share_content = pd.read_csv('sharecontentviabluetoothevent.csv', header=None, 
                            names=['id', 'time', 'ip', 'alt', 'lat', 'lon', 'device_id', 'peer_id', 'content_id', 'app_id'])
share_content.head()

Here is how you sort a table by one of its columns.

# DataFrame.sort() was removed from newer pandas versions; sort_values() replaces it
share_content_sorted = share_content.sort_values('device_id')
share_content_sorted.head()
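
A minimal sketch of sorting on a toy table, assuming a recent pandas where sort_values is the sorting method. The frame demo and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy table standing in for share_content
demo = pd.DataFrame({'device_id': [3, 1, 2], 'content_id': [10, 11, 12]})

# Ascending sort (the default); the original row labels travel with the rows
by_device = demo.sort_values('device_id')

# Descending sort
by_device_desc = demo.sort_values('device_id', ascending=False)

print(by_device)
print(by_device_desc)
```

sort_values returns a new DataFrame and leaves demo unchanged, which is why the notebook assigns the result to a new variable.
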

The share event table is loaded the same way. Note that it has no content_id column.

event = pd.read_csv('sharesawboevent.csv', header=None, 
                    names=['id', 'time', 'ip', 'alt', 'lat', 'lon', 'device_id', 'peer_id', 'app_id'])
event.head()

To join two tables on a column they share, use merge:

share_content_merged = share_content.merge(content, on='content_id')
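
To see what the merge does, here is a sketch on two tiny made-up tables (shares and videos are hypothetical stand-ins for share_content and content). By default merge performs an inner join, so rows whose content_id has no match are dropped; how='left' keeps them.

```python
import pandas as pd

# Hypothetical stand-ins for share_content and content
shares = pd.DataFrame({'device_id': ['a', 'b', 'c'], 'content_id': [1, 2, 9]})
videos = pd.DataFrame({'content_id': [1, 2],
                       'video_name': ['handwashing', 'malaria_prevention']})

# Default inner join: the share with content_id 9 has no match and is dropped
inner = shares.merge(videos, on='content_id')

# Left join: every share is kept; the unmatched one gets NaN for video_name
left = shares.merge(videos, on='content_id', how='left')

print(inner)
print(left)
```

If the merged table has fewer rows than share_content, unmatched content_id values are a likely cause.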

Python also has a datetime library that converts a timestamp to a datetime object. The time column is in milliseconds, so we divide by 1000 to get seconds first.

date = datetime.fromtimestamp(share_content_merged.time[0]/1000)
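
One thing worth knowing: datetime.fromtimestamp without a tz argument returns the time in the local timezone of the machine running the code. The sketch below uses a made-up millisecond timestamp (ms is a hypothetical value, not from the data) and passes tz=timezone.utc so the result is the same everywhere.

```python
from datetime import datetime, timezone

# Hypothetical timestamp in milliseconds since the Unix epoch
ms = 1433116800000

# Divide by 1000 to get seconds; tz=timezone.utc makes the result
# independent of the machine's local timezone
utc_time = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Without tz, the same instant is expressed in local time
local_time = datetime.fromtimestamp(ms / 1000)

print(utc_time)
print(local_time)
```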

Notice that you can select a column, e.g. share_content_merged['time'], and map a Python function over it.

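# Convert every millisecond timestamp to a datetime and store it in a new column
share_content_merged['time_proc'] = share_content_merged['time'].map(lambda x: datetime.fromtimestamp(x/1000))
# Replace missing IP addresses with an empty string
share_content_merged.ip = share_content_merged.ip.fillna('')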
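
As an alternative sketch, pandas can do the same conversion on the whole column at once with pd.to_datetime and unit='ms'. The frame demo and its values are made up, and the column name time_vec is invented here; note that pd.to_datetime yields UTC-based times, whereas datetime.fromtimestamp without tz yields local times.

```python
import pandas as pd

# Hypothetical toy table with millisecond timestamps and one missing IP
demo = pd.DataFrame({'time': [1433116800000, 1433120400000],
                     'ip': ['10.0.0.1', None]})

# Vectorized conversion of the whole column; utc=True marks the result as UTC
demo['time_vec'] = pd.to_datetime(demo['time'], unit='ms', utc=True)

# fillna('') replaces the missing IP with an empty string
demo['ip'] = demo['ip'].fillna('')

print(demo)
```

On large tables the vectorized call is usually faster than mapping a Python lambda over each value.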

To access a single row, use share_content_merged.iloc[0]. You can then pull individual values from that row, e.g. share_content_merged.iloc[0].time_proc gives the datetime, while share_content_merged.iloc[0].time gives the raw timestamp.
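
A short sketch of row access on a made-up table (demo and its values are hypothetical). iloc selects by position, and itertuples is one way to loop over all rows.

```python
import pandas as pd

# Hypothetical toy table with a raw and a processed time column
demo = pd.DataFrame({'time': [1433116800000, 1433120400000],
                     'device_id': ['a', 'b']})
demo['time_proc'] = pd.to_datetime(demo['time'], unit='ms')

# iloc[0] returns the first row as a Series indexed by column name
first = demo.iloc[0]
raw_ms = first['time']          # the raw millisecond timestamp
when = first['time_proc']       # the converted timestamp

# Loop over all rows; each row is a namedtuple with one field per column
devices = [row.device_id for row in demo.itertuples(index=False)]

print(raw_ms, when, devices)
```

Bracket access such as first['time'] is a little safer than attribute access such as first.time, because some column names can collide with existing Series attributes.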
