This gist shows how to create a GIF screencast using only free OS X tools: QuickTime, ffmpeg, and gifsicle.
To capture the video (filesize: 19MB) using the free "QuickTime Player" application:
Roll your own iPython Notebook server with Amazon Web Services (EC2) using their Free Tier.
The map shows traffic accidents recorded in Oslo, Norway, for the year 2013.
The Leaflet Markercluster plugin is wonderful. Since the markerclusters are divIcons you can put whatever you want inside them using the iconCreateFunction. I wanted my clusters to reveal more information than just the marker count and figured a pie chart would do the job. So I told the iconCreateFunction to do some D3 magic and this is the result.
The example is a bit more complicated than necessary due to how my dataset is structured. But if you take a look at the defineClusterIcon() function you'll see that I use d3.nest() to build a dataset for the pie chart based on a given property from all the cluster's children. Then I pass this dataset over to the bakeThePie() function together with instructions on how to style the chart. The function returns svg markup which in turn is placed inside the divIcon.
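The grouping step can be sketched outside of D3. The gist itself is JavaScript and uses d3.nest(); the snippet below is only a Python analogue of that one step, counting a cluster's children per value of a chosen property. The function name build_pie_data, the sample records, and the "severity" property are hypothetical and do not come from the original code.

```python
from collections import Counter


def build_pie_data(children, key):
    """Group cluster children by one property and count each group."""
    counts = Counter(child[key] for child in children)
    # Sort by group name so the slice order is stable between redraws.
    return [{"key": k, "count": counts[k]} for k in sorted(counts)]


# Hypothetical cluster children; "severity" is an assumed property name.
children = [
    {"severity": "minor"},
    {"severity": "serious"},
    {"severity": "minor"},
]
pie_data = build_pie_data(children, "severity")
```

Each entry in pie_data corresponds to one slice; a drawing function such as bakeThePie() would then turn those counts into arcs.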
Feel free to suggest improvements.
Q: what book should i use to learn ML?
A: use several, and find the one that speaks to you.
the list below assumes you know a bit of math but
are not very mathematical, and are interested in learning
enough to be practical. that is, it is not at the
mathematical level of MIJ's alleged list
(cf. https://news.ycombinator.com/item?id=1055389 )
FAQ:
where are some fun datasets to play with?
1. CMU:
http://lib.stat.cmu.edu/datasets/
2. UCI:
a) MLR@UCI (machine learning repository / machine learning archive )
Disclaimer: The majority of this list was created pre-COVID. Many other organizations are likely hiring remote now.
The documentation for how to deploy a pipeline with extra, non-PyPi, pure Python packages on GCP is missing some detail. This gist shows how to package and deploy an external pure-Python, non-PyPi dependency to a managed dataflow pipeline on GCP.
TL;DR: Your external package needs to be a Python (source/binary) distro properly packaged and shipped alongside your pipeline. It is not enough to only specify a tar file with a setup.py.
Your external package must have a proper setup.py. What follows is an example setup.py for our ETL package. This is used to package version 1.1.1 of the etl library. The library requires 3 native PyPi packages to run. These are specified in the install_requires field. This package also ships with custom external JSON data, declared in the package_data section. Last, the setuptools.find_packages function searches for all available packages and returns that
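
A setup.py along the lines described above might look like the following sketch. Only the package name, the version 1.1.1, and the use of install_requires, package_data, and setuptools.find_packages come from the text. The three dependency names and the data/*.json layout are placeholders, since the actual values are not shown here.

```python
"""Sketch of a setup.py for the etl package (several values are placeholders)."""
import setuptools

# Placeholder dependencies: the text says three PyPi packages are required
# but does not name them, so these entries are assumptions for illustration.
REQUIRED_PACKAGES = [
    "requests",
    "python-dateutil",
    "pyyaml",
]

SETUP_KWARGS = dict(
    name="etl",
    version="1.1.1",
    install_requires=REQUIRED_PACKAGES,
    # Discover every package (directory with an __init__.py) under this folder.
    packages=setuptools.find_packages(),
    # Hypothetical layout: JSON files stored in etl/data/.
    package_data={"etl": ["data/*.json"]},
)

if __name__ == "__main__":
    setuptools.setup(**SETUP_KWARGS)
```

Running python setup.py sdist next to this file should produce a source distribution tarball under dist/, which is the kind of properly packaged artifact the TL;DR asks for.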
# data from http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/population-distribution-demography/geostat
# Originally seen at http://spatial.ly/2014/08/population-lines/
# So, this blew up on both Reddit and Twitter. Two bugs fixed (southern Spain was a mess,
# and some countries were missing -- measure twice, submit once, damnit), and two silly superfluous lines removed after
# @hadleywickham pointed that out. Also, switched from geom_segment to geom_line.
# The result of the code below can be seen at http://imgur.com/ob8c8ph
library(tidyverse)