Skip to content

Instantly share code, notes, and snippets.

View paulochf's full-sized avatar
⌨️

Paulo Haddad paulochf

⌨️
View GitHub Profile
@paulochf
paulochf / ipython_notebook_large_width.py
Last active November 14, 2024 11:23
IPython/Jupyter Notebook enlarge/change cell width
from IPython.display import display, HTML
display(HTML(data="""
<style>
div#notebook-container { width: 95%; }
div#menubar-container { width: 65%; }
div#maintoolbar-container { width: 99%; }
</style>
"""))
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@vancluever
vancluever / gnome-tracker-disable.md
Last active June 15, 2025 14:49
GNOME Tracker Disable

Disabling GNOME Tracker and Other Info

GNOME's tracker is a CPU and privacy hog. There's a pretty good case as to why it's neither useful nor necessary here: http://lduros.net/posts/tracker-sucks-thanks-tracker/

After discovering it chowing 2 cores, I decided to go about disabling it.

Directories

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@whophil
whophil / jupyter.service
Last active October 8, 2024 00:42 — forked from doowon/jupyter_systemd
A systemd script for running a Jupyter notebook server.
# After Ubuntu 16.04, Systemd becomes the default.
# It is simpler than https://gist.github.com/Doowon/38910829898a6624ce4ed554f082c4dd
[Unit]
Description=Jupyter Notebook
[Service]
Type=simple
PIDFile=/run/jupyter.pid
ExecStart=/home/phil/Enthought/Canopy_64bit/User/bin/jupyter-notebook --config=/home/phil/.jupyter/jupyter_notebook_config.py

Applied Functional Programming with Scala - Notes

Copyright © 2016-2018 Fantasyland Institute of Learning. All rights reserved.

1. Mastering Functions

A function is a mapping from one set, called a domain, to another set, called the codomain. A function associates every element in the domain with exactly one element in the codomain. In Scala, both domain and codomain are types.

val square : Int => Int = x => x * x
@tomquisel
tomquisel / prefit_voting_classifier.py
Last active August 29, 2024 17:40
Version of scikit-learn's VotingClassifier that uses prefit models rather than requiring a refit.
class VotingClassifier(object):
"""Stripped-down version of VotingClassifier that uses prefit estimators"""
def __init__(self, estimators, voting='hard', weights=None):
self.estimators = [e[1] for e in estimators]
self.named_estimators = dict(estimators)
self.voting = voting
self.weights = weights
def fit(self, X, y, sample_weight=None):
raise NotImplementedError
@dusenberrymw
dusenberrymw / spark_tips_and_tricks.md
Last active January 10, 2025 07:36
Tips and tricks for Apache Spark.

Spark Tips & Tricks

Misc. Tips & Tricks

  • If values are integers in [0, 255], Parquet will automatically compress to use 1 byte unsigned integers, thus decreasing the size of saved DataFrame by a factor of 8.
  • Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
  • Pay particular attention to the number of partitions when using flatMap, especially if the following operation will result in high memory usage. The flatMap op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (i.e. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of flatMap to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
@5agado
5agado / Pandas and Seaborn.ipynb
Created February 20, 2017 13:33
Data Manipulation and Visualization with Pandas and Seaborn — A Practical Introduction
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@jnothman
jnothman / resample.py
Last active February 22, 2017 23:53
Scikit-learn resampling as CV wrapper
import numpy as np
class Resample(object):
def __init__(self, cv, method='under'):
self.cv = cv
self.method = method
def split(self, X, y, **kwargs):
for train_idx, test_idx in self.cv.split(X, y, **kwargs):
counts = np.bincount(y[train_idx]) # assumes y are from {0, 1..., n_classes-1}