wc
Counts the number of lines (with `-l`). For example, to count the files in the current directory:
ls -1 | wc -l
grep
A command-line utility that searches the input files for a pattern. See more in the history of grep. It's intuitive if you remember it as g|re|p (global regular expression print).
:)
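For instance, searching a file for a pattern, case-insensitively (the file name and contents below are made up for illustration):

```shell
# Create a sample file, then search it for "error", ignoring case
printf 'ok\nError: disk full\nok\n' > sample.log
grep -i 'error' sample.log
# prints: Error: disk full
```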
##Access Django on Amazon EC2
To run the development server on EC2, bind it to the instance's public DNS:
python manage.py runserver <public_dns>:8000
Then access it from a local browser at:
http://<public_dns>:8000/
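Depending on the setup, if `DEBUG` is off, the public DNS may also need to be listed in `ALLOWED_HOSTS` in settings.py before the page will load (the hostname below is a placeholder):

```python
# settings.py -- allow requests addressed to the EC2 public DNS
ALLOWED_HOSTS = ['ec2-12-34-56-78.compute-1.amazonaws.com']
```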
First, download git-completion.bash and git-prompt.sh:
curl https://raw.githubusercontent.com/git/git/master/contrib/completion/git-prompt.sh -o ~/.git-prompt.sh
curl -OL http://github.com/git/git/raw/master/contrib/completion/git-completion.bash
Then rename git-completion.bash to .git-completion.bash:
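The rename, plus the usual step of sourcing both scripts at shell startup, can be sketched as follows (assuming the files were downloaded to the current directory; ~/.bash_profile is one common place to source them):

```shell
# Rename the completion script to a dotfile in the home directory
mv git-completion.bash ~/.git-completion.bash

# Load both scripts at shell startup
echo 'source ~/.git-completion.bash' >> ~/.bash_profile
echo 'source ~/.git-prompt.sh' >> ~/.bash_profile
```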
After applying pubmed_parser to the PubMed Open Access dataset, I want to create a JSON file with a nested dictionary structure, i.e. {index: {col_name1: "", col_name2: ""}, ...}
So I select a random subset of my dataframe, remove the unused column, and use orient='index'
to convert the dataframe to a JSON file:
import numpy as np
import pandas as pd

pubmed_map = pd.read_csv('pubmed_city_cartodb.csv')
pubmed_map = pubmed_map.drop('Unnamed: 0', axis=1)  # remove the leftover index column
n_sel = np.random.randint(0, len(pubmed_map), size=100)  # random subset of 100 rows
pubmed_map.iloc[n_sel].to_json('pubmed_subset.json', orient='index')
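As a quick sanity check of what orient='index' produces, here is the conversion on a toy dataframe (the column names are placeholders):

```python
import json

import pandas as pd

# Toy frame standing in for the PubMed subset
df = pd.DataFrame({'col_name1': ['a', 'b'], 'col_name2': ['c', 'd']})

# orient='index' keys the JSON by row label: {"0": {"col_name1": "a", ...}, "1": {...}}
out = json.loads(df.to_json(orient='index'))
print(out['0'])  # {'col_name1': 'a', 'col_name2': 'c'}
```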
Here is a suggestion on how to run a PySpark notebook from Amazon EC2:
IPYTHON_OPTS="notebook --ip=* --no-browser" ~/spark-1.2.0-bin-hadoop1/bin/pyspark --master local[4] --driver-memory 4g --executor-memory 4g
For help, we can do something like:
spark-1.2.0-bin-hadoop1/bin/pyspark --help
##Training word2vec using gensim
###Train word2vec model
full_text = [sentence.split() for sentence in preprocess_text] # each element is a list of words in a sentence
num_features = 500 # Word vector dimensionality
min_word_count = 40 # Minimum word count
num_workers = 4 # Number of threads to run in parallel