Heather Miller heathermiller

@heathermiller
heathermiller / wordCount.md
Last active June 8, 2020 18:30
Word count in Unison

Given a phrase, count the occurrences of each word in that phrase.

For the purposes of this exercise you can expect that a word will always be one of:

  1. A number composed of one or more ASCII digits (e.g. "0" or "1234") OR
  2. A simple word composed of one or more ASCII letters (e.g. "a" or "they") OR
  3. A contraction of two simple words joined by a single apostrophe (e.g. "it's" or "they're")
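The three word forms above map naturally onto a regular expression. The following is a minimal Python sketch rather than a Unison solution; the name `count_words` is hypothetical, and lowercasing the input is an assumption, since the counting rules are cut off in this preview.

```python
import re
from collections import Counter

# One alternative per word form. The contraction comes first so that
# "it's" is matched whole instead of being split into "it" and "s".
WORD_PATTERN = re.compile(r"[a-z]+'[a-z]+|[a-z]+|[0-9]+")


def count_words(phrase):
    """Count occurrences of each word in the phrase.

    Assumption: counting is case-insensitive, so the phrase is lowercased
    first. The preview above does not state this rule.
    """
    return dict(Counter(WORD_PATTERN.findall(phrase.lower())))
```

For example, `count_words("It's 1234 it's they")` yields a count of 2 for `"it's"` and 1 each for `"1234"` and `"they"`.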

When counting words you can assume the following rules:

heathermiller / phoneNumber.md
Last active June 8, 2020 18:30
Phone number cleanup in Unison

Clean up user-entered phone numbers so that they can be sent SMS messages.

The North American Numbering Plan (NANP) is a telephone numbering system used by many countries in North America, such as the United States, Canada, and Bermuda. All NANP countries share the same international country code: 1.

NANP numbers are ten-digit numbers consisting of a three-digit Numbering Plan Area code, commonly known as the area code, followed by a seven-digit local number. The first three digits of the local number are the exchange code, and the remaining four digits are the unique subscriber number.

The format is usually represented as

(NXX)-NXX-XXXX
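One way to read "clean up" is to strip punctuation and an optional leading country code, then validate what remains. The Python sketch below takes that reading; it is not a Unison solution, and the name `clean_phone_number` is hypothetical. Because the preview is cut off before the format is explained, the sketch assumes the usual NANP convention that `N` is a digit from 2 through 9 and `X` is any digit.

```python
import re
from typing import Optional


def clean_phone_number(raw: str) -> Optional[str]:
    """Return the ten NANP digits, or None if the input is not a valid number."""
    # Drop everything that is not a digit: spaces, dots, dashes, parentheses, "+".
    digits = re.sub(r"\D", "", raw)

    # An eleven-digit number is accepted only if it starts with country code 1.
    if len(digits) == 11 and digits[0] == "1":
        digits = digits[1:]
    if len(digits) != 10:
        return None

    # Assumption: the N positions (first digit of the area code and of the
    # exchange code) must be 2 through 9.
    if digits[0] in "01" or digits[3] in "01":
        return None
    return digits
```

With these assumptions, `clean_phone_number("+1 (613)-995-0253")` returns `"6139950253"`, while `clean_phone_number("(023) 456-7890")` returns `None`.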
heathermiller / ResistorColors.md
Created June 3, 2020 19:15
ResistorColors in Unison

If you want to build something using a Raspberry Pi, you'll probably use resistors. For this exercise, you need to know two things about them:

  • Each resistor has a resistance value.
  • Resistors are small, so small in fact that if you printed the resistance value on them, it would be hard to read. To get around this problem, manufacturers print color-coded bands onto the resistors to denote their resistance values. Each band has a position and a numeric value. For example, if they printed a brown band (value 1) followed by a green band (value 5), it would translate to the number 15.

In this exercise you are going to create a helpful program so that you don't have to remember the values of the bands. The program will take color names as input and output a two-digit number, even if the input is more than two colors!

The band colors are encoded as follows:

  • Black: 0
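The lookup can be sketched in a few lines of Python (not Unison). The color list in this preview is cut off after black; the text itself confirms only black (0), brown (1), and green (5). The remaining entries in the table below are taken from the standard electronic color code and are an assumption about what the full exercise lists. The names `COLOR_VALUES` and `band_value` are hypothetical.

```python
# Black, brown, and green are stated in the text. The other entries are
# assumed from the standard electronic color code.
COLOR_VALUES = {
    "black": 0, "brown": 1, "red": 2, "orange": 3, "yellow": 4,
    "green": 5, "blue": 6, "violet": 7, "grey": 8, "white": 9,
}


def band_value(colors):
    """Combine the first two color bands into a number; extra bands are ignored."""
    # Unpacking raises ValueError when fewer than two colors are supplied,
    # and the lookup raises KeyError for an unknown color name.
    first, second = (COLOR_VALUES[color.lower()] for color in colors[:2])
    return first * 10 + second
```

For instance, `band_value(["brown", "green"])` gives 15, and so does `band_value(["brown", "green", "violet"])`, since only the first two bands are used.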
heathermiller / LeapYear.md
Last active June 3, 2020 18:17
LeapYear in Unison

Given a year, report if it is a leap year.

The tricky thing here is that a leap year in the Gregorian calendar occurs:

on every year that is evenly divisible by 4
  except every year that is evenly divisible by 100
    unless the year is also evenly divisible by 400
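The nested rule above collapses into a single boolean expression. A minimal Python sketch (not Unison), with the hypothetical name `is_leap_year`:

```python
def is_leap_year(year: int) -> bool:
    """Apply the Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

Under this rule 1996 and 2000 are leap years, while 1900 and 2015 are not.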

Keybase proof

I hereby claim:

  • I am heathermiller on github.
  • I am heathermiller (https://keybase.io/heathermiller) on keybase.
  • I have a public key ASALpaTiD166UKSkiDrVKvw4PXxQTuzix94UXFcY-wETBgo

To claim this, I am signing this object:

heathermiller / file-info-results.txt
Created July 6, 2017 08:03
Usage of implicits across files
Average percent of files using implicits: 23.35670325912466
---
spark
Total # Scala files: 2760
Total # files using implicits: 163
Percent of files using implicits: 5.905797101449275%
---
incubator-predictionio
Total # Scala files: 404
heathermiller / implicit-usage.txt
Last active July 5, 2017 16:02
Usage of implicits in Scala
93.33333333333333% of top 120 Scala GitHub projects make use of implicits.
Only 6.666666666666667% of top 120 Scala GitHub projects don't use implicits at all
83.33333333333334% of top 120 Scala GitHub projects use implicit defs
Total number of projects: 120
Number of projects not using implicits at all:
8
Number of projects using only implicit vals:
heathermiller / timeusage.md
Created April 14, 2017 13:05
Time Usage Assignment Instructions
heathermiller / desc.txt
Last active September 28, 2016 16:32
Desc
My colleague Heather Miller and I have been discussing a new project that would
focus on increasing the reliability and performance of applications based on
Apache’s “Spark” engine for big data processing. The programming model we aim to
improve seeks to achieve parallelism via distribution, by transmitting
computations (closures) to a collection of sites where distributed data resides.
The work we have in mind would have two areas of focus: (i) design,
implementation, and evaluation of programming models that make this paradigm of
shipping computations to distributed data more robust and usable, and less
error-prone (e.g., to avoid races, memory leaks, etc.) and (ii) design,
implementation, and evaluation of tools for analyzing and refactoring of Spark
heathermiller / CS4240.txt
Created September 25, 2016 13:05
CS4240
CS 6240: Parallel Data Processing
This course covers techniques for managing and analyzing very large data sets,
with an emphasis on approaches that scale out effectively as more compute nodes
are added. Principles of distributed data management and strategies for
problem-driven data partitioning are introduced through a selection of design
patterns from various application domains, including graph analysis, databases,
text processing, and data mining. Coursework includes hands-on programming
experience with modern big-data processing technology such as MapReduce, Spark,
HBase, and Cloud Computing. (This selection is subject to change as technology