Vicki Boykis veekaybee

@veekaybee
veekaybee / gist:b7d1184c63c10887ef83
Last active August 5, 2020 01:46
Installing mpltools in IPython via Anaconda

[mpltools](http://tonysyu.github.io/mpltools/index.html) is a great library for making beautiful charts in Python, in the style of R's ggplot. Here are some examples. Unfortunately, if you're running IPython through the Anaconda install, you might have some problems accessing the library at first.

If you run `pip install mpltools`, it will install into your system Python 2.7. But the IPython notebook in Anaconda uses a different Python, as `which python` shows: `/Users/yourname/anaconda/bin/python`

To see where mpltools is installed, you can run this in the interpreter:
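One way to check this, as a minimal sketch and not necessarily the commands the author used: ask the running interpreter which executable it is and where it would import a package from. The helper name `module_location` is made up for illustration.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a module would be imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Which Python is running this code (system Python or Anaconda's)?
print(sys.executable)

# Where would this interpreter load mpltools from? Prints None if it cannot see it.
print(module_location("mpltools"))
```

If the second line prints `None` inside the notebook but a path in the system interpreter, the package went into a different Python than the one the notebook uses.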

@veekaybee
veekaybee / mkdwn.md
Last active August 29, 2015 14:24
If you write a lot of stuff in Word, Markdown might be a better option for you

Markdown is a markup language, like HTML. If you use Word or HTML to write specs and documentation, Markdown may be a better, more lightweight option for you. Formatting something in Markdown can take much less time than wrangling Word, and if your development team agrees to run it on a server, all your documents will live in one central repository instead of sitting on your computer.

That said, there is a slight learning curve to Markdown if you've never used a markup language before.

Here are the recommendations I've come across:

  • Markdown does not auto-generate tables of contents. You have to do it yourself.
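Building a table of contents by hand can also be scripted. The sketch below is one possible approach, not something from the original notes: it scans for `#`-style headings and emits a nested list of links. The function name `build_toc` is hypothetical, and the anchor rule (lowercase, punctuation dropped, spaces turned into hyphens) only approximates what GitHub-style renderers generate, so check the links against your own renderer.

```python
import re

FENCE = "`" * 3  # marker that opens and closes a fenced code block


def build_toc(markdown_text):
    """Build a nested Markdown list linking to each '#'-style heading."""
    toc_lines = []
    in_fence = False
    for line in markdown_text.splitlines():
        # Skip anything inside fenced code blocks, where '#' is usually a comment
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = re.match(r"^(#{1,6})\s+(.+?)\s*#*\s*$", line)
        if not match:
            continue
        level = len(match.group(1))
        title = match.group(2).strip()
        # Approximate anchor: lowercase, strip punctuation, hyphenate spaces
        slug = re.sub(r"[^\w\s-]", "", title.lower())
        slug = re.sub(r"\s+", "-", slug.strip())
        toc_lines.append(f"{'  ' * (level - 1)}- [{title}](#{slug})")
    return "\n".join(toc_lines)


sample = "# Intro\n\nSome text.\n\n## About me\n\nMore text.\n"
print(build_toc(sample))
```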
from sklearn.preprocessing import LabelEncoder

# Fit the encoder, then map each class label to its integer code
le = LabelEncoder()
le.fit(['a', 'b', 'c', 'c'])
dict(zip(le.classes_, range(len(le.classes_))))
# Output: {'a': 0, 'b': 1, 'c': 2}
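For comparison, the same label-to-integer mapping can be sketched without scikit-learn, assuming the labels are sortable (`LabelEncoder` also stores its classes in sorted order). The helper `label_mapping` is hypothetical and only illustrates the idea.

```python
def label_mapping(labels):
    """Map each distinct label to an integer index, in sorted order."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


mapping = label_mapping(["a", "b", "c", "c"])
print(mapping)  # {'a': 0, 'b': 1, 'c': 2}
```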
@veekaybee
veekaybee / nltk-corpora.md
Created March 17, 2017 02:09
Deep-diving into NLTK corpora

# What is NLTK? 

A natural-language processing library written in Python, used for tons of applications, including analyzing [movie and restaurant reviews](http://crowdsourcing-class.org/assignments/downloads/pak-paroubek.pdf). 
More on that [here](https://github.com/nltk/nltk/wiki/Sentiment-Analysis).

[Examples](http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/) of how to do sentiment analysis in Python. 
Note that the tweets here are hand-labelled for sentiment.
@veekaybee
veekaybee / privacy.md
Last active February 1, 2020 13:33
A work-in-progress post on how to protect your data and privacy online

Work-in-progress

How to protect your data and privacy online for the average user

Table of Contents

  1. Introduction and Motivation
     1a. About me
  2. Ad profiling: What can be tracked
  3. Government tracking: What can be tracked
  4. Low-effort
  5. Medium-effort
@veekaybee
veekaybee / wholesome-data-science.md
Last active August 16, 2019 06:40
Wholesome data science.

Wholesome Data Science

Data science has had a really bad reputation recently. Between Facebook's privacy violations, facial scanning at kiosks in restaurants, and racism in algorithms, stories about surveillance, invasion of privacy, and unethical algorithms are dominating the news.

These cases are really important to make public, study, and prevent. But it's just as important to collect examples of good use cases of data science (that are not hyperbolized or PR fluff) so we can focus on those as an industry, and learn about what makes them work, as well.

Have some? Make some? Feel free to leave a comment or edit.

Examples

Keybase proof

I hereby claim:

  • I am veekaybee on github.
  • I am veekaybee (https://keybase.io/veekaybee) on keybase.
  • I have a public key ASC1BmRUMCaXHMnJ2DzEnxIyypbZqJmYGJIbCxhhrrSZKgo

To claim this, I am signing this object:

@veekaybee
veekaybee / distance.md
Last active December 30, 2021 15:41
Different Distance Measures

Jaccard Similarity

import numpy
import typing
 
a = [1,2,3,4,5,11,12]
b = [2,3,4,5,6,8,9]

cats = ["calico", "tabby", "tom"]
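A minimal sketch of Jaccard similarity for the two lists above, assuming they are treated as sets (duplicates ignored): the size of the intersection divided by the size of the union. The function name `jaccard_similarity` and the choice to return 1.0 for two empty inputs are illustrative assumptions, not taken from the original.

```python
def jaccard_similarity(first, second):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(first), set(second)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical
        return 1.0
    return len(set_a & set_b) / len(union)


a = [1, 2, 3, 4, 5, 11, 12]
b = [2, 3, 4, 5, 6, 8, 9]

# Shared elements are {2, 3, 4, 5}; the union has 10 elements
print(jaccard_similarity(a, b))  # 0.4
```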
"com.lihaoyi" %% "os-lib" % "0.7.8"
// Clone my static site repo, loop through posts and get all files as a single file
val wd = os.pwd / "_posts"
val sd = os.Path("/Users/vicki/IdeaProjects/scalding/scalding-repl")
// Concatenates all the files
os.write.over(
wd / "posts.md",