Skip to content

Instantly share code, notes, and snippets.

@aparrish
aparrish / understanding-word-vectors.ipynb
Last active May 8, 2025 14:50
Understanding word vectors: A tutorial for "Reading and Writing Electronic Text," a class I teach at ITP. (Python 2.7) Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@samuelsmal
samuelsmal / pyspark_udf_filtering.py
Created October 11, 2016 14:10
PySpark DataFrame filtering using a UDF and Regex
from pyspark.sql.functions import udf
from pyspark.sql.types import BooleanType
def regex_filter(x):
regexs = ['.*ALLYOURBASEBELONGTOUS.*']
if x and x.strip():
for r in regexs:
if re.match(r, x, re.IGNORECASE):
return True
Byobu is a suite of enhancements to tmux, as a command line
tool providing live system status, dynamic window management,
and some convenient keybindings:
F1 * Used by X11 *
Shift-F1 Display this help
F2 Create a new window
Shift-F2 Create a horizontal split
Ctrl-F2 Create a vertical split
Ctrl-Shift-F2 Create a new session
@jlln
jlln / separator.py
Last active November 9, 2023 19:59
Efficiently split Pandas Dataframe cells containing lists into multiple rows, duplicating the other column's values.
def splitDataFrameList(df,target_column,separator):
''' df = dataframe to split,
target_column = the column containing the values to split
separator = the symbol used to perform the split
returns: a dataframe with each entry for the target column separated, with each element moved into a new row.
The values in the other columns are duplicated across the newly divided rows.
'''
def splitListToRows(row,row_accumulator,target_column,separator):
split_row = row[target_column].split(separator)
KEYBINDINGS
byobu keybindings can be user defined in /usr/share/byobu/keybindings/ (or within .screenrc if byobu-export was used). The common key bindings
are:
F2 - Create a new window
F3 - Move to previous window
F4 - Move to next window
@subfuzion
subfuzion / global-gitignore.md
Last active June 11, 2025 02:02
Global gitignore

There are certain files created by particular editors, IDEs, operating systems, etc., that do not belong in a repository. But adding system-specific files to the repo's .gitignore is considered a poor practice. This file should only exclude files and directories that are a part of the package that should not be versioned (such as the node_modules directory) as well as files that are generated (and regenerated) as artifacts of a build process.

All other files should be in your own global gitignore file:

  • Create a file called .gitignore in your home directory and add any filepath patterns you want to ignore.
  • Tell git where your global gitignore file is.

Note: The specific name and path you choose aren't important as long as you configure git to find it, as shown below. You could substitute .config/git/ignore for .gitignore in your home directory, if you prefer.

@bsweger
bsweger / useful_pandas_snippets.md
Last active June 14, 2025 19:01
Useful Pandas Snippets

Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)

@jvns
jvns / interview-questions.md
Last active April 17, 2025 16:25
A list of questions you could ask while interviewing

A lot of these are outright stolen from Edward O'Campo-Gooding's list of questions. I really like his list.

I'm having some trouble paring this down to a manageable list of questions -- I realistically want to know all of these things before starting to work at a company, but it's a lot to ask all at once. My current game plan is to pick 6 before an interview and ask those.

I'd love comments and suggestions about any of these.

I've found questions like "do you have smart people? Can I learn a lot at your company?" to be basically totally useless -- everybody will say "yeah, definitely!" and it's hard to learn anything from them. So I'm trying to make all of these questions pretty concrete -- if a team doesn't have an issue tracker, they don't have an issue tracker.

I'm also mostly not asking about principles, but the way things are -- not "do you think code review is important?", but "Does all code get reviewed?".

@MrDrews
MrDrews / Default.sublime-theme
Created April 22, 2013 13:41
Solarized (light) -- Complement the stock Solarized (light) theme in sublime text 3 by placing this `Default.sublime-theme` inside the `Packages/User` folder. It will recolor the sidebar.
[
{
"class": "sidebar_container",
// $base02: #073642
"layer0.tint": [7,54,66],
"layer0.opacity": 1.0,
"layer0.draw_center": false,
"layer0.inner_margin": [0, 0, 1, 0],
"content_margin": [0, 0, 1, 0]
@jshaw
jshaw / byobuCommands
Last active July 7, 2025 07:59
Byobu Commands
Byobu Commands
==============
byobu Screen manager
Level 0 Commands (Quick Start)
------------------------------
<F2> Create a new window