jpvelez/intro_r_tutorial.md

WHAT IS THIS?

This is a step-by-step tutorial for getting started with R, a powerful programming language for data analysis and visualization. It is aimed at near complete beginners. You'll basically want to be comfortable with spreadsheets and with using your computer's command line.

I slapped this together quickly, so expect some weirdness. Feel free to email me with comments or questions at jpvelez | at | gmail.com

I learned the following stuff using the UCLA Statistics Department's great R tutorials, so check those out: https://www.ats.ucla.edu/stat/r/learning_modules.htm

GETTING DATA

First, we need to get some data to analyze. We'll be using a dataset of NYC school SAT scores (nyc_schools_sat_scores_clean.csv), which is attached to this gist. Download the data to your computer and pop it into Excel to examine it.

Every row in this dataset represents an NYC school that had MORE THAN FIVE students take the SAT exam in 2010. The columns include a school's "DBN" number (a unique id for every school), its name, the number of students who took the SAT that year, and the mean reading, math, and writing scores of those students. Think of each row as a school, and each column as one of that school's attributes.

NOTE: the attached csv file is a cleaned version of the raw data available on the NYC data portal: https://nycopendata.socrata.com/

A number of schools in the original data had 's' values instead of numbers in the SAT reading, math, and writing score columns. According to the data portal, these schools had fewer than 5 students take the SAT, so those scores have been suppressed (hence the 's') to protect their anonymity. To simplify things, I've gone ahead and removed those rows from the dataset. If you don't use the clean data, the following tutorial won't work.

FIRING UP R

Ok, use the command line to navigate to the directory where you saved the data, and type "R" (a capital R on most systems) to fire up R. You can also use the official R console, but then you'll need to explicitly set your working directory to the directory where the SAT data lives. I won't cover that. Google it. Don't be lazy.

READING CSV FILES INTO R

Now, we need to read our data into R so we can do stuff with it. For that, we use the read.csv() function:

sat_scores = read.csv('nyc_school_sat_scores_clean.csv')

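Here the read.csv function reads in the csv, which needs to be located in the working directory, and returns your data in a dataframe object that is then saved to a variable named sat_scores. The filename in quotes has to match the name of the file you downloaded exactly, so double-check it if R complains that it can't find the file.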
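If you want to check that read.csv() behaves the way you expect before pointing it at the real file, here's a minimal sketch. It writes a tiny stand-in csv (three rows copied from the dataset) to a temporary file and reads it back. The names csv_path and demo_scores are made up for this example.

```r
# Write a tiny stand-in csv to a temporary file (three rows from the dataset).
csv_path = tempfile(fileext = ".csv")
writeLines(c(
  "dbn,school_name,testers,reading,math,writing",
  "01M292,Henry Street School for International Studies,31,391,425,385",
  "01M448,University Neighborhood High School,60,394,419,387",
  "01M450,East Side Community High School,69,418,431,402"
), csv_path)

# Read it back in, exactly like we did with the real file.
demo_scores = read.csv(csv_path)

names(demo_scores)   # the column names, taken from the first line of the file
nrow(demo_scores)    # how many rows (schools) were read in
```

If the row count or the column names look off, the file probably isn't where R thinks it is, or it isn't really comma-separated.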

R DATA TYPES: DATAFRAMES AND VECTORS

What is a dataframe object, you ask? In technicalese, it's a data structure that makes it easy to store and access tabular data with named columns. Think of it as a spreadsheet or table you can do stuff with.

To see the contents of your dataframe, just type it in:

sat_scores

names(sat_scores)

Throw your dataframe into this names function to see what columns of data are in there. This is the same thing as the column names in the first row of a spreadsheet.
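Printing a whole dataframe gets unwieldy fast, so here's a rough sketch of a few other ways to peek at one. The dataframe demo is a hand-built stand-in (the first four rows of the dataset), not something the tutorial creates elsewhere.

```r
# A small hand-built dataframe with the same columns as the SAT data.
demo = data.frame(
  dbn = c("01M292", "01M448", "01M450", "01M458"),
  school_name = c("Henry Street School for International Studies",
                  "University Neighborhood High School",
                  "East Side Community High School",
                  "SATELLITE ACADEMY FORSYTH ST"),
  testers = c(31, 60, 69, 26),
  reading = c(391, 394, 418, 385),
  math = c(425, 419, 431, 370),
  writing = c(385, 387, 402, 378)
)

dim(demo)       # number of rows, then number of columns
head(demo, 2)   # just the first 2 rows
str(demo)       # each column's name, type, and first few values
```

The same three functions work on sat_scores once you've read it in.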

The other big type of object in R is a vector. A vector is basically just a list of values that are all the same type. It could be a list of text, or of numbers, but it's usually numbers.

From here on out, type in the code first, try to understand what it does, and then read the description.

vector = c(1, 2, 3, 4)

This is how you make a new vector and save it to a variable named vector. If you don't save dataframes or vectors to variables, you can't use them later.

vector

This is how you inspect the contents of your new vector.
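Here's a small sketch of what you can do with a vector once you have one. The variable name scores is just for illustration, and the four numbers are reading scores taken from the first rows of the dataset.

```r
scores = c(391, 394, 418, 385)   # four mean reading scores

length(scores)   # how many values are in the vector
scores[2]        # the second value (R starts counting at 1)
scores + 10      # arithmetic is applied to every value at once
scores > 390     # comparisons give back a vector of TRUE/FALSE values
mean(scores)     # the average of the values
```

That TRUE/FALSE trick is what makes the row filtering in the next section work.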

SUBSETTING DATA

Now we're going to slice and dice the data in our dataframe. This is called 'subsetting' data.

sat_scores$reading

This is the easiest way of accessing all the data in one of your columns. The style is dataframe$column. You punch this in, and the computer will return a vector of all the values in that column, in this case, all of the mean SAT reading scores for NYC schools (that had more than 5 students take the SAT in 2010).

reading_scores = sat_scores$reading

You can keep the data in the reading column, i.e. the vector of mean reading scores, by saving it to a new variable, just like above.

sat_scores[, 2]

You can also select columns using brackets like this. Actually, these brackets let you select both columns and rows. The first 'slot' in the brackets, before the comma, lets you specify what rows you want. We want all of them, so leave that blank. The second slot lets you specify which columns you want. So this code will get you a vector of school names, because school_name is column 2. If you don't remember a column's number (or name), use the names() function.

sat_scores[, c(2, 3, 4)]

You can select multiple columns. The way you do this is by putting a vector of the column numbers you want in that second column slot.

sat_scores[, 'school_name']

You can also specify what columns you want by using their names. The names must have quotes around them, because technically these are 'strings', or text objects. If you don't put quotes around it, R thinks you're talking about a variable. If you haven't used that variable anywhere, it'll get pissy and throw an error at you.

sat_scores[, c('school_name', 'reading')]

You can also specify multiple columns using their names by putting them in a vector, just like we did for the column numbers. This code will return a new, two-column dataframe of school names and reading scores. Every school will be in this new dataframe, because you left the first 'slot' in the brackets blank.
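One thing that trips people up: asking for one column gives you a vector, while asking for several gives you a dataframe. Here's a minimal sketch of the difference, using a hand-built three-row dataframe called demo (a stand-in, not the real sat_scores).

```r
demo = data.frame(
  school_name = c("Henry Street School for International Studies",
                  "University Neighborhood High School",
                  "East Side Community High School"),
  reading = c(391, 394, 418),
  math = c(425, 419, 431)
)

one_col  = demo[, "reading"]                     # one column: a plain vector
two_cols = demo[, c("school_name", "reading")]   # several columns: a dataframe
still_df = demo[, "reading", drop = FALSE]       # drop = FALSE keeps a one-column dataframe

class(one_col)
class(two_cols)
class(still_df)
```

Use drop = FALSE when you want a table back no matter how many columns you picked.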

Let's try to filter out rows now.

sat_scores[sat_scores$math > 350, ]

This code will return a new dataframe that will contain only those rows (i.e. those schools) which had math scores ABOVE 350. This new dataframe will have all the columns - school name, number of testers, reading scores, etc - because you left the second slot blank, but it will only include schools that scored above 350 in math. In other words, this command says "get me every school that had math scores greater than 350." For some silly reason, you can't just write [math > 350, ] because R doesn't know which dataframe that column belongs to. Maybe you have several dataframes with columns named 'math'. So you need to specify which column you're talking about by writing [sat_scores$math > 350, ], the same syntax you used to access the reading scores above.
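What's going on under the hood: the comparison produces a vector of TRUE/FALSE values, one per row, and the brackets keep the rows marked TRUE. Here's a rough sketch on a three-row stand-in dataframe (demo and keep are names invented for this example; the schools and scores come from the dataset).

```r
demo = data.frame(
  school_name = c("Henry Street School for International Studies",
                  "Satellite Academy High School",
                  "New Explorations into Sci, Tech and Math HS"),
  math = c(425, 336, 583)
)

keep = demo$math > 350   # one TRUE/FALSE per row
keep                     # TRUE FALSE TRUE

demo[keep, ]             # only the rows where keep is TRUE
sum(keep)                # counts the TRUEs, i.e. how many schools passed the filter
```

Writing demo[demo$math > 350, ] does the same thing in one step.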

Now let's subset on both rows and columns.

sat_scores[sat_scores$math > 350, c(2, 4, 5)]

This code says "get me every row where math score > 350, but only show me the data in columns 2, 4, and 5." In other words, take our sat_scores table and spit out a new table that only shows the school name, reading, and math scores of schools that had mean math scores above 350. Got it?

sat = sat_scores[sat_scores$math != 's' ,]

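This shows you another way you can subset rows. This code says "return every row that DOES NOT have an 's' value in its math column." != stands for DOES NOT EQUAL, while == stands for EQUAL. On the clean data no row has an 's', so sat ends up with the same rows as sat_scores; the charts below use it. If you want to start with the raw, not-cleaned data from the NYC data portal, you could use this code to remove schools that have suppressed scores, i.e. 's' strings in many of their columns. Keep in mind that R reads a column containing 's' values as text, so with the raw data you'd also need to convert the score columns to numbers before summarizing or charting them.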
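Here's a minimal sketch of that raw-data cleanup, under a couple of assumptions: the dataframe raw and its school names are made up (the real raw file isn't shown here), and the score column is read in as plain text rather than as a factor.

```r
# Hypothetical raw data: one school has its math score suppressed with an 's'.
raw = data.frame(
  school_name = c("School A", "School B", "School C"),
  math = c("425", "s", "370"),
  stringsAsFactors = FALSE   # keep math as text, not a factor
)

clean = raw[raw$math != "s", ]        # drop the suppressed rows
clean$math = as.numeric(clean$math)   # turn the remaining text into numbers

mean(clean$math)                      # now the usual number functions work
```

If the column came in as a factor instead, convert it with as.numeric(as.character(...)), because calling as.numeric() directly on a factor returns its internal level codes rather than the scores.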

Alright, so now you can filter tables and access the data in their columns. That's nice, but a big vector (list) of numbers isn't very helpful. It doesn't give us insight. We need a way to summarize some of the data.

SUMMARIZING DATA

summary(sat_scores)

The summary function does just that. For columns that contain numbers, it prints summary statistics like the smallest, largest, median, and mean number in the list. For columns that contain text, like school_name, it counts how many times each unique text string occurs, as long as R read the column in as a factor (the default before R 4.0). In newer versions of R, text columns are read in as plain character vectors, and summary just reports their length and type. You can also use this function on individual vectors - summary(sat_scores$reading) or summary(reading_scores) - not just entire dataframes.
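If you only want one of those numbers, each has its own function. A quick sketch, using six reading scores copied from the first rows of the dataset (the variable name reading is just for this example):

```r
reading = c(391, 394, 418, 385, 314, 568)

summary(reading)   # min, quartiles, median, mean, and max in one go

min(reading)       # smallest value
max(reading)       # largest value
median(reading)    # middle value once the scores are sorted
mean(reading)      # average
```

The median is less sensitive than the mean to a few very high or very low schools, which is why it's handy to look at both.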

It's time for a little data viz. R makes it stupid simple to generate charts. Let's start with a histogram, which is a great way to visualize the distribution of the data in a single column.

hist(sat$math)

The hist function takes in a vector and returns a histogram chart. It won't work if you feed it an entire dataframe - hist(sat_scores) - you need to specify a column. Remember, whenever you type data_frame$column, the computer returns a vector of all the values in the column, which then get fed into the hist function.

hist(sat$reading)

So this will show you the distribution of mean reading scores across NYC schools. Notice that a lot of them cluster between 350 and 450. This is a low score, and consistent with the median values we got from the summary function. Takeaway: NYC schools aren't doing very well.

hist(sat$writing)

You can do this for writing and math as well.
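The default chart is pretty bare. Here's a rough sketch of how you might label it and change the number of bars. The vector math holds eight scores from the first rows of the dataset; the title and axis label are just examples.

```r
math = c(425, 419, 431, 370, 532, 583, 401, 608)

# main sets the title, xlab the x-axis label.
# breaks is a suggestion for how many bars to use; R may adjust it.
hist(math, breaks = 5, main = "Mean SAT math scores", xlab = "Math score")

# plot = FALSE skips the drawing and just returns the numbers behind the chart.
h = hist(math, plot = FALSE)
h$breaks   # where each bar starts and ends
h$counts   # how many schools fall in each bar
```

The same arguments work with hist(sat$math) on the full data.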

VISUALIZING DATA

Now let's make a scatterplot. These let us see two variables at once, and examine whether there's a relationship between the two.

plot(math ~ reading, sat)

This plot function looks similar to hist, but it's a little peculiar. First, it has two arguments or 'slots'. Instead of specifying columns with the dataframe$column syntax as you've been doing, the first argument tells R which columns to plot, and the second argument tells R which dataframe these columns belong to. Also there's a weird ~ in the first argument. Basically the (math ~ reading, ..) code says I want a scatterplot with math scores on the y axis and reading scores on the x axis: (y ~ x, ..). I think of it as "math mashed up with reading."

OK, so we have a scatterplot! Two observations: 1. most schools cluster between 350-450 on BOTH their math and reading scores, which is consistent with our histograms and summaries. 2. schools that have higher math scores tend to have higher reading scores. They tend to move together, that's why you see the dots moving up and to the right. That means there's an association between math and reading scores. Cool! If we didn't see that pattern, if the dots were all over the place, then there would be no association.
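If you want a single number for "how strongly do these move together", the cor() function computes the correlation between two vectors. A minimal sketch, using the reading and math scores from the first eight rows of the dataset (so the result won't match what you get on the full data):

```r
reading = c(391, 394, 418, 385, 314, 568, 411, 630)
math    = c(425, 419, 431, 370, 532, 583, 401, 608)

# Correlation runs from -1 to 1. Values near 1 mean the two scores rise
# together, values near 0 mean no linear association.
r = cor(math, reading)
r
```

On the real data you'd write cor(sat$math, sat$reading).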

library(lattice)

Lattice is a library that has functions that make fancier graphs than the ones that come built into R. Use the library() function to load it into R so you can use some of them.

xyplot(math ~ reading, sat)

xyplot is lattice's equivalent to the plot() function. It works the same way, but gives you pretty colors.

So we've got some charts. That's great. Before the end, I will tease you with a tiny bit of stats.

RUNNIN' STATS ON DATA

fit = lm(math ~ reading, sat)

This lm() function runs a linear regression on the data we visualized with a scatterplot. Very crudely, it tries to measure to what extent there's a linear relationship between math and reading scores. A linear relationship means "as reading goes up, math goes up at a steady rate." The scatterplot suggested that schools with higher math scores tend to have higher reading scores; this is a more rigorous way of capturing that relationship.

abline(fit)

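This function takes the linear regression object generated above and adds a 'line of best fit' to our scatterplot. abline() draws on charts made with the built-in plot() function, not on lattice charts, so if xyplot() was the last chart you made, re-run plot(math ~ reading, sat) first.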
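To see what the regression actually found, you can poke at the fit object. Here's a rough sketch on an eight-row stand-in for sat (the first eight rows of the dataset), so the numbers will differ from a fit on the full data.

```r
sat = data.frame(
  reading = c(391, 394, 418, 385, 314, 568, 411, 630),
  math    = c(425, 419, 431, 370, 532, 583, 401, 608)
)

fit = lm(math ~ reading, sat)

coef(fit)                # the intercept and the slope of the line of best fit
summary(fit)$r.squared   # share of the variation in math explained by reading (0 to 1)

# Predicted mean math score for a school with a mean reading score of 450.
predict(fit, data.frame(reading = 450))

# Draw the scatterplot with base graphics, then add the fitted line on top.
plot(math ~ reading, sat)
abline(fit)
```

The slope tells you how many points the mean math score changes, on average, for each extra point of mean reading score.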

dbn	school_name	testers	reading	math	writing
01M292	Henry Street School for International Studies	31	391	425	385
01M448	University Neighborhood High School	60	394	419	387
01M450	East Side Community High School	69	418	431	402
01M458	SATELLITE ACADEMY FORSYTH ST	26	385	370	378
01M515	Lower East Side Preparatory High School	154	314	532	314
01M539	New Explorations into Sci, Tech and Math HS	47	568	583	568
01M650	CASCADES HIGH SCHOOL	35	411	401	401
01M696	BARD HIGH SCHOOL EARLY COLLEGE	138	630	608	630
02M047	AMERICAN SIGN LANG ENG DUAL	11	405	415	385
02M288	FOOD AND FNANCE HIGH SCHOOL	50	422	412	407
02M294	HIGH SCHOOL FOR HIST AND COMM	51	382	364	366
02M296	High School of Hospitality Management	43	397	415	391
02M298	PACE HIGH SCHOOL	71	424	448	423
02M300	Urban Assembly School of Design and Construction	49	405	446	415
02M303	The Facing History School	59	381	373	377
02M305	Urban Assembly Academy of Government and Law	48	411	406	411
02M308	LOWER MANHATTAN ARTS ACADEMY	35	409	381	412
02M313	The James Baldwin School	45	421	419	394
02M316	Urban Assembly School of Business for Young Women	52	401	409	391
02M374	GRAMERCY ARTS HIGH SCHOOL	49	395	376	386
02M400	HIGH SCHOOL ENVRNMNTL STUDIES	216	465	480	448
02M407	Institute for Collaborative Education	42	484	478	472
02M408	Professional Performng Arts School	69	495	465	499
02M411	Baruch College Campus High School	96	523	583	528
02M412	New York City Laboratory School Collab Studies	108	561	597	567
02M413	SCHOOL OF THE FUTURE	79	475	488	466
02M414	NEW YORK CITY MUSEUM SCHOOL	90	454	448	435
02M416	ELEANOR ROOSEVELT HIGH SCHOOL	122	555	596	567
02M418	Millennium High School	140	512	554	523
02M419	LANDMARK SCHOOL	47	369	370	359
02M420	HIGH SCHOOL HLTH PROF HUMAN	250	446	458	440
02M425	HIGH SCHOOL LEADERSHIP PUB SVC	74	419	429	406
02M429	LEGACY SCH INTEGRATED STUDIES	31	379	356	354
02M439	MANHATTAN VILLAGE ACADEMY	90	465	479	472
02M440	Bayard Rustin High School Humanities	136	387	394	379
02M449	VANGUARD HIGH SCHOOL	54	367	395	373
02M459	MANHATTAN INTERNATIONAL HIGH SCHOOL	31	418	463	415
02M460	WASHINGTON IRVING HIGH SCHOOL	163	381	385	368
02M475	STUYVESANT HIGH SCHOOL	804	674	735	678
02M489	HS of Economics And Finance	147	451	503	453
02M500	UNITY HIGH SCHOOL	31	371	369	368
02M519	TALENT UNLIMITED HIGH SCHOOL	124	465	454	461
02M520	MURRY BERGTRAUM HIGH SCHOOL	287	411	439	396
02M529	Jacqueline Kennedy Onassis High School	88	420	424	398
02M531	NEW YORK CITY PUB SCH REP COMP	32	415	405	397
02M542	MANHATTAN BRIDGE HIGH SCHOOL	31	345	380	325
02M543	NEW DESIGN HIGH SCHOOL	74	390	385	387
02M544	INDEPENDENCE HIGH SCHOOL	16	404	419	395
02M545	Dual Language and Asian Studies High School	47	416	612	419
02M550	Liberty High School Academy for Newcomers	32	343	465	362
02M551	New York Harbor High School	54	372	369	368
02M560	City as School	57	468	441	434
02M565	URBAN ACADEMY	34	513	460	502
02M570	Satellite Academy High School	16	400	336	376
02M575	Manhattan Comprehensive Night Day High School	141	355	475	339
02M580	Richard R Green High School of Teaching	79	425	429	419
02M586	HARVEY MILK SCHOOL	11	429	382	416
02M600	HIGH SCHOOL FASHION INDUSTRIES	301	419	420	413
02M605	HUMANITIES PREPARATORY ACADEMY	37	433	405	417
02M615	Chelsea Career and Technical Education High School	61	393	412	376
02M620	NORMAN THOMAS HIGH SCHOOL	152	378	387	370
02M625	HS of Graphic Commnctn And Art	125	370	378	360
02M630	HIGH SCHOOL OF ART AND DESIGN	212	427	423	418
02M655	LIFE SCIENCES SECONDARY SCHOOL	71	399	389	383
02M690	SCHOOL FOR THE PHYSICAL CITY	14	382	404	359
03M283	MANHATTAN THEATRE LAB HIGH SCHOOL	23	406	378	391
03M299	High School For Arts, Imagination And Inquiry	57	364	378	357
03M307	Urban Assembly School For Media Studies	53	374	363	377
03M415	WADLEIGH SECONDARY SCHOOL	68	376	373	371
03M470	LOUIS D BRANDEIS HIGH SCHOOL	170	357	357	345
03M479	BEACON SCHOOL	237	573	563	575
03M485	LAGUARDIA HIGH SCH MUSIC ART	594	558	555	567
03M494	Martin Luther King Jr HS Arts and Technology	60	407	425	397
03M505	Edward A. Reynolds West Side High School	18	410	407	388
03M541	MANHATTAN/HUNTER COLL HS SCI	80	481	525	469
03M860	FREDERICK DOUGLAS ACADEMY II	33	414	410	406
04M409	COALITION SCH SOCIAL CHANGE	39	390	372	383
04M435	Manhattan Center for Science and Math	312	485	531	475
04M495	PARK EAST HIGH SCHOOL	46	360	374	362
04M555	CENTRAL PARK SECONDARY SCHOOL	35	362	378	356
04M610	YOUNG WOMEN LEADERSHIP SCHOOL	48	451	445	470
04M635	Academy of Environment Science Secondary School	39	380	385	354
04M680	HERITAGE SCHOOL	47	369	385	361
04M695	URBAN PEACE ACADEMY	23	366	382	362
05M285	HARLEM RENAISSANCE HIGH SCHOOL	19	402	394	367
05M304	MOTT HALL HIGH SCHOOL	63	424	421	410
05M369	URBAN ASSEMBLY SCHOOL FOR THE PERFORMIN	34	349	370	363
05M469	CHOIR ACADEMY OF HARLEM	16	420	369	401
05M499	FREDERICK DOUGLASS ACADEMY	216	465	481	466
05M670	THURGOOD MARSHALL ACADEMY	52	423	434	404
05M685	Bread & Rose Intergrated Arts High School	54	364	378	359
05M692	High School For Math Science Engineering City Coll	106	592	627	575
06M462	HIGH SCHOOL FOR INTL BUS/FIN	56	383	384	379
06M463	HS for Media and Communications-George Washingto	75	366	376	353
06M467	High School for Law and Public Service	106	388	388	378
06M468	HIGH SCHOOL FOR HLTH CARER/SER	28	385	417	373
06M540	A Philip Randolph Campus High School	193	436	451	421
06M552	GREGORIO LUPERON PREP SCHOOL	60	342	384	333
07X221	SOUTH BRONX PREPARATORY	60	399	393	382
07X334	INTERNATIONAL COMMUNITY HIGH SCHOOL	33	322	335	327
07X381	BRONX HAVEN HIGH SCHOOL	15	335	342	372
07X427	Community High School Social Justice	50	372	351	359
07X473	Mott Haven Village Preparatory High School	48	345	352	349
07X495	UNIVERSITY HEIGHTS HIGH SCHOOL	66	401	388	396
07X500	HOSTOS-LINCOLN ACADEMY SCIENCE	62	460	455	457
07X520	FOREIGN LANGUAGE ACADEMY	73	418	429	415
07X527	Bronx Leadership Academy II High School	55	377	381	379
07X547	NEW EXPLORERS HIGH SCHOOL	36	382	376	376
07X548	Urban Assembly School for Careers in Sports	58	381	423	384
07X551	Bronx Academy of Letters	51	416	406	435
07X600	ALFRED E SMITH HIGH SCHOOL	53	367	372	354
07X655	Samuel Gompers Vocational Technical High School	105	370	385	354
07X670	Health Opportunities High School	81	400	391	387
08X282	YOUNG WOMEN'S LEADERSHIP SCH BRONX CMP	56	395	407	388
08X293	Renaissance High School Music Theater T	47	389	375	375
08X295	Gateway School for Environmental Research & Tech	31	377	367	368
08X305	Pablo Neruda Academy Architecture & World Studies	54	359	358	338
08X312	MILLENNIUM ART ACADEMY	41	380	389	371
08X332	HOLCOMBE L. RUCKER SCHOOL OF COMMUNITY	34	374	362	366
08X377	BRONX COMMUNITY HIGH SCHOOL (X377)	10	416	393	414
08X405	HERBERT H LEHMAN HIGH SCHOOL	448	414	433	397
08X408	YABC AT HERBERT H. LEHMAN HIGH SCHOOL	8	358	343	385
08X452	The Bronx Guild High School	36	381	361	352
08X519	Felisa Rincon de Gautier Inst for Law & Pub Policy	43	373	376	367
08X530	BANANA KELLY HIGH SCHOOL	48	377	382	368
08X540	HIGH SCHOOL CMTY RESEARCH LRN	29	356	376	368
08X560	BRONX ACADEMY HIGH SCHOOL	19	373	354	364
08X650	Jane Addams Vocational High School	152	370	377	363
09X227	Bronx Expeditionary Learning High School	46	371	358	343
09X231	Eagle Academy for Young Men	66	391	399	386
09X239	Urban Assembly Academy History & Citizenship	22	359	380	371
09X250	EXIMIUS COLLEGE PREP ACADEMY	44	401	383	387
09X252	MOTT HALL BRONX HIGH SCHOOL	63	395	398	399
09X260	Bronx Center for Science and Mathematics	73	447	495	454
09X263	VALIDUS PREPARATORY ACADEMY	90	359	347	356
09X276	LEADERSHIP INSTITUTE	21	362	380	358
09X297	MORRIS ACADEMY FOR COLLABORATIVE STUDIE	22	388	352	378
09X329	DREAMYARD PREPARATORY SCHOOL	46	394	404	389
09X403	Bronx International High School	37	335	335	340
09X404	HIGH SCHOOL FOR EXCELLENCE	32	366	373	371
09X412	BRONX HIGH SCHOOL OF BUSINESS	35	369	397	366
09X413	HIGH SCHOOL MEDICAL SCIENCES	98	423	450	435
09X414	Jonathan Levin HS for Media and Communications	81	347	353	351
09X505	BRONX SCH LAW GOVMT & JUSTICE	87	413	390	420
09X517	Frederick Douglass Academy III Secondary School	44	399	414	406
09X525	Bronx Leadership Academy High School	116	402	397	396
09X543	HIGH SCHOOL VIOLIN AND DANCE	26	332	379	337
10X141	RIVERDALE KINGSBRIDGE ACADEMY	101	475	479	470
10X213	BRONX ENGINEERING & TECH ACADEMY	36	374	402	364
10X225	Theatre Arts Production Company School	53	382	375	363
10X237	Marie Curie HS for Nur, Med and Allied Health Prof	48	373	358	366
10X243	WEST BRONX ACADEMY FOR FUTURE	32	365	382	375
10X268	Kingsbridge International High School	40	313	316	296
10X284	BRONX SCHOOL OF LAW & FINANCE	54	404	397	383
10X319	Providing Urban Learners Success in Education HS	10	349	346	364
10X342	International School of Liberal Arts	46	333	336	285
10X368	Information Technology Academy	56	418	411	398
10X433	HIGH SCHOOL TEACHING PROFESSNS	50	386	385	381
10X434	Belmont Preparatory High School	65	358	370	346
10X437	Fordham High School for the Arts	42	377	375	375
10X438	Fordham Leadership Academy for Business Technolog	71	375	385	373
10X439	Bronx High School of Law and Community Service	57	389	400	382
10X440	DeWitt Clinton High School	438	425	440	417
10X442	BRONX HIGH SCHOOL OF MUSIC	38	421	432	439
10X445	BRONX HIGH SCHOOL OF SCIENCE	683	632	685	643
10X475	JOHN F KENNEDY HIGH SCHOOL	87	364	375	358
10X477	MARBLE HILL SCH INTL STUDIES	70	406	429	403
10X478	YABC AT JOHN F. KENNEDY HIGH SCHOOL	12	389	383	364
10X546	BRONX THEATER HIGH SCHOOL	50	406	389	388
10X549	DISCOVERY HIGH SCHOOL	31	340	340	351
10X660	Grace Dodge Vocational High School	93	380	385	369
10X667	YABC AT GRACE DODGE HIGH SCHOOL	13	367	378	358
10X696	HS of American Studies at Lehman College	74	635	630	619
11X249	Bronx Health School High School 11x249	52	363	366	364
11X253	High School For Writing & Communication Arts	57	393	363	387
11X265	BRONX LAB SCHOOL	70	356	370	355
11X270	Academy for Scholarship and Entrepreneurship	37	373	376	362
11X275	High School of Computers and Technology	56	399	402	373
11X288	Columbus Institute for Math and Science	90	432	464	424
11X290	BRONX ACAD OF HEALTH CAREERS	46	369	363	380
11X299	ASTOR COLLEGIATE ACADEMY	38	404	400	388
11X415	Christopher Columbus High School	114	354	373	350
11X418	Bronx High School for the Visual Arts	54	419	403	392
11X425	EVANDER CHILDS HIGH SCHOOL	7	344	306	350
11X455	HARRY S TRUMAN HIGH SCHOOL	201	380	384	379
11X513	NEW WORLD HIGH SCHOOL	43	331	384	341
11X514	Bronxwood Peparatory Academy	21	359	389	350
11X541	GLOBAL ENTERPRISE HIGH SCHOOL	43	372	359	373
11X542	PELHAM PREPARATORY ACADEMY	92	427	416	424
11X544	High School for the Contemporary Arts	40	359	340	338
11X545	BRONX AEROSPACE HIGH SCHOOL	47	393	395	389
12X245	NEW DAY ACADEMY	24	368	349	358
12X248	METROPOLITAN HIGH SCHOOL	42	340	361	335
12X251	EXPLORATIONS ACADEMY	34	364	364	344
12X262	Bronx High School Performance & Stage	41	383	360	387
12X271	EAST BRONX ACADEMY FOR THE FUTURE	22	415	390	397
12X278	PEACE & DIVERSITY ACAD HIGH SCHOOL	24	408	400	401
12X400	Morris High School	9	369	374	360
12X428	YABC at Monroe Academy	11	386	345	388
12X446	Schomburg Satellite Academy	14	354	366	364
12X480	BRONX REGIONAL HIGH SCHOOL	12	359	366	359
12X550	HIGH SCHOOL OF WORLD CULTURES	50	291	333	291
12X680	Bronx Coalition Community High School	31	372	366	360
12X682	Fannie Lou Hamer Freedom High School	75	333	326	330
12X684	WINGS ACADEMY	58	386	397	383
12X690	MONROE CAMPUS	47	355	369	366
12X692	MONROE ACAD VISUAL ARTS DESGN	70	346	341	356
13K265	Dr Susan S McKinney Secondary School of the Arts	46	387	369	369
13K350	Urban Assembly School of Music and Art	59	394	371	371
13K412	BROOKLYN COMMUNITY HIGH SCHOOL	43	393	356	377
13K419	Science Skills Center High School	104	419	421	409
13K430	BROOKLYN TECHNICAL HIGH SCHOOL	1047	588	652	581
13K483	Urban Assembly School of Law and Justice	104	414	415	412
13K499	Acorn Association of Community Organization Return	103	367	361	360
13K509	FREEDOM ACADEMY	40	416	386	408
13K553	BROOKLYN ACADEMY HIGH SCHOOL	19	403	384	377
13K575	Bedford Stuyvesant Street Academy High School	20	366	380	362
13K595	BEDFORD ACADEMY HIGH SCHOOL	50	453	469	441
13K605	George Westinghouse Vocational Technical HS	87	390	360	373
13K616	Brooklyn High Sch for Leadership Community Svc	9	323	281	310
13K670	BENJAMIN BANNEKER ACADEMY	168	476	477	445
14K071	JHS 071 JUAN MOREL CAMPOS	59	362	369	349
14K322	FOUNDATIONS ACADEMY	28	373	377	359
14K404	ACADEMY FOR YOUNG WRITERS	70	377	366	365
14K449	BROOKLYN LATIN SCHOOL	50	536	534	527
14K454	GREEN SCHOOL: AN ACADEMY FOR ENVIRONMEN	52	381	381	366
14K474	Progress High School For Professional Careers	102	365	375	373
14K477	School for Legal Studies	82	385	390	384
14K478	Enterprise, Business and Technology High School	80	360	390	354
14K488	BROOKLYN PREP HIGH SCHOOL	44	397	377	387
14K558	Williamsburg High School for Architecture & Design	44	399	393	367
14K561	WILLIAMSBURG PREP	58	375	386	369
14K610	AUTOMOTIVE HIGH SCHOOL	73	353	359	340
14K685	El Puente Academy for Peace and Justice	16	375	386	394
15K429	BROOKLYN SCHOOL GLOBAL STUDIES	50	374	365	374
15K448	Brooklyn Secondary Sch for Collaborative Studies	63	385	380	375
15K462	Secondary School for Law, Journalism and Research	50	406	409	393
15K463	SECONDARY SCHOOL JOURNALISM	75	393	372	370
15K464	SECONDARY SCHOOL FOR RESEARCH	65	373	395	368
15K497	SCHOOL INTERNATIONAL STUDIES	45	402	414	382
15K519	COBBE HILL SCHOOL AMERICAN STD	44	365	358	360
15K520	PACIFIC HIGH SCHOOL	9	356	352	343
15K529	WEST BROOKLYN COMMUNITY HIGH SCHOOL	11	363	382	381
15K530	METROPOLITAN CORPORATE ACADEMY	17	370	348	348
15K656	Brooklyn High School of the Arts	71	416	417	425
15K698	South Brooklyn Community High School	17	387	371	368
16K393	Frederick Douglass Academy IV Seconday School	29	429	421	424
16K455	BOYS AND GIRLS HIGH SCHOOL	143	369	367	366
16K498	Acorn High School for Social Justice	34	367	364	364
17K382	ACADEMY FOR COLLEGE PREPARATION CAREER	50	401	412	404
17K408	ACADEMY OF HOSPITALITY AND TOURISM	31	400	405	399
17K440	Prospect Heights High School	12	388	373	374
17K489	W E B DUBOIS HIGH SCHOOL	17	381	344	360
17K524	International High School @Prospect Hgt	32	302	339	325
17K528	HIGH SCHOOL FOR GLOBAL CITIZENSHIP	81	382	387	381
17K531	SCHOOL FOR HUMAN RIGHTS	21	357	391	343
17K533	SCHOOL FOR DEMOCRACY AND LEADERSHIP	25	358	378	372
17K537	HS FOR YOUTH CMMTY DEVLPMT AT ERASMUS	71	361	376	365
17K539	HS FOR SERVICE AND LEARNING AT ERASMUS	54	371	366	381
17K543	SCIENCE TECH RESEARCH HS AT ERASMUS	68	445	425	415
17K544	INTERNATIONAL ARTS BUSINESS HS	49	392	390	381
17K546	HIGH SCHOOL FOR PUBLIC SERVICE	83	412	427	420
17K547	Brooklyn High School for Science and the Environ	45	391	396	388
17K548	BROOKLYN HS MUSIC AND THEATER	37	385	351	381
17K568	BROWNSVILLE ACADEMY HIGH SCHOOL	20	356	358	366
17K590	Medgar Evers College Preparatory High School	113	459	482	452
17K600	Clara Barton High School	282	419	413	411
17K625	Paul Robeson High School	121	355	358	348
18K415	SAMUEL J TILDEN HIGH SCHOOL	42	328	341	338
18K500	CANARSIE HIGH SCHOOL	128	372	373	364
18K515	SOUTH SHORE HIGH SCHOOL	29	365	352	336
18K578	BROOKLYN BRIDGE ACADEMY OF S SHORE ED CO	20	389	358	369
18K635	OLYMPUS ACADEMY	24	361	365	365
19K409	EAST NEW YORK FAMILY ACADEMY	61	435	428	414
19K420	FRANKLIN K LANE HIGH SCHOOL	140	353	379	341
19K502	FDNY HIGH SCH DOR FIRE & LIFE	29	376	375	363
19K504	HIGH SCHOOL FOR CIVIL RIGHTS	23	369	372	380
19K507	PERFORMING ARTS & TECH HIGH SCHOOL	49	371	363	356
19K510	World Acad for Total Community Health	28	370	371	376
19K615	East New York High School of Transit Technology	184	413	419	403
19K659	CYPRESS HILLS COLLEGIATE PREPARATORY SCH	50	398	402	378
19K660	William H Maxwell Voc High School	49	355	367	355
20K445	NEW UTRECHT HIGH SCHOOL	317	409	471	407
20K485	High School of Telecommunication Arts	221	450	471	440
20K490	FORT HAMILTON HIGH SCHOOL	581	416	486	409
20K505	Franklin D Roosevelt High School	385	387	492	377
20K658	YABC at Franklin D. Roosevelt High School	33	363	443	369
21K337	International High School at LaFayette	35	355	395	355
21K344	Rachel Carson High School for Coastal Studies	66	422	459	418
21K348	HIGH SCHOOL OF SPORTS MANAGEMENT	33	370	427	381
21K400	LAFAYETTE HIGH SCHOOL	40	327	367	315
21K410	ABRAHAM LINCOLN HIGH SCHOOL	379	386	429	385
21K525	EDWARD R MURROW HIGH SCHOOL	686	472	503	468
21K540	JOHN DEWEY HIGH SCHOOL	349	414	465	396
21K620	William E. Grady Vocational Technical High School	56	385	393	361
21K690	BROOKLYN STUDIO SECONDARY SCH	86	426	419	418
21K728	Liberation Diploma Plus High School	13	402	351	380
22K405	Midwood High School at Brooklyn College	725	493	543	491
22K425	JAMES MADISON HIGH SCHOOL	641	449	478	445
22K495	SHEEPSHEAD BAY HIGH SCHOOL	192	380	416	380
22K535	LEON M GOLDSTEIN HIGH SCHOOL	239	543	578	559
22K555	Brooklyn College Academy High School	109	459	460	446
23K493	BROOKLYN COLLEGIATE	37	394	372	375
23K514	Frederick Douglass Academy VII High School	50	416	396	414
23K643	Brooklyn Democracy Academy	18	347	371	323
23K645	EBC High School for Public Service - East New York	24	376	358	368
23K646	ASPIRATIONS DIPLOMA PLUS HIGH SCHOOL	35	349	329	341
23K647	METROPOLITAN DIPLOMA PLUS HIGH SCHOOL	12	385	378	391
23K697	TEACHERS PREPARATORY SCHOOL	59	396	389	389
24Q264	ACADEMY OF FINANCE AND ENTERPRISE	65	407	432	417
24Q267	High School of Applied Communication	67	411	420	407
24Q299	BARD HIGH SCHOOL EARLY COLLEGE II	42	545	548	541
24Q455	NEWTOWN HIGH SCHOOL	251	379	427	372
24Q485	GROVER CLEVELAND HIGH SCHOOL	269	398	431	391
24Q520	MIDDLE COLLEGE HIGH SCHOOL	32	386	402	373
24Q530	INTERNATL HS@LA GUARDIA COMM C	75	355	428	359
24Q550	HIGH SCHOOL FOR ARTS/BUSINESS	127	363	372	363
24Q560	ROBERT F WAGNER JR SECONDARY	49	420	440	421
24Q600	Queens Vocational & Tech High School	130	415	428	401
24Q610	Aviation Career & Technical Education High School	289	444	494	427
24Q744	VOYAGES PREPARATORY HIGH SCHOOL	7	416	413	364
25Q263	FLUSHING INTERNATIONAL HIGH SCHOOL	57	325	415	311
25Q281	EAST WEST SCHOOL OF INTERNATIONAL STUDIE	50	438	485	442
25Q285	WORLD JOURNALISM PREPARATORY SCHOOL	56	429	425	421
25Q425	JOHN BOWNE HIGH SCHOOL	302	391	423	389
25Q460	FLUSHING HIGH SCHOOL	278	372	398	366
25Q525	Townsend Harris High School at Queens College	273	637	644	642
25Q540	QUEENS ACADEMY	22	400	395	391
25Q670	Robert F Kennedy Community High School	61	449	471	462
25Q792	North Queens Community High School	22	434	422	422
26Q415	BENJAMIN CARDOZO HIGH SCHOOL	697	486	551	492
26Q430	FRANCIS LEWIS HIGH SCHOOL	808	457	530	454
26Q435	MARTIN VAN BUREN HIGH SCHOOL	343	402	411	398
26Q495	BAYSIDE HIGH SCHOOL	732	472	531	467
26Q566	Queens High School-Teaching Liberal Arts Sciences	144	438	455	433
27Q260	FREDERICK DOUGLASS ACADEMY VI	65	393	390	387
27Q262	CHANNEL VIEW SCHOOL FOR RESEARCH	63	455	423	420
27Q400	AUGUST MARTIN HIGH SCHOOL	102	377	372	362
27Q410	BEACH CHANNEL HIGH SCHOOL	114	393	390	380
27Q465	FAR ROCKAWAY HIGH SCHOOL	31	350	369	360
27Q475	RICHMOND HILL HIGH SCHOOL	397	380	401	380
27Q480	JOHN ADAMS HIGH SCHOOL	302	399	424	404
27Q650	HS for Construction Trades, Engineering and Arch	87	421	464	424
28Q338	SATELLITE ACADEMY	22	346	351	338
28Q440	FOREST HILLS HIGH SCHOOL	627	468	491	462
28Q470	JAMAICA HIGH SCHOOL	153	393	400	391
28Q505	HILLCREST HIGH SCHOOL	498	385	400	385
28Q620	Thomas Edison Vocational-Technical High School	529	455	490	439
28Q680	Queens Gateway to Health Sciences Secondary Schoo	89	525	532	515
28Q687	QUEENS HS FOR SCIENCE YORK COL	99	613	650	612
28Q690	High School For Law Enforcement And Public Safety	62	408	429	412
29Q248	QUEENS PREPARATORY ACADEMY	46	372	388	381
29Q259	PATHWAYS COLLEGE PREPARATORY SCHOOL	42	406	415	394
29Q265	EXCELSIOR PREPARATORY HIGH SCHOOL	55	394	358	388
29Q272	GEORGE WASHINGTON CARVER HS	68	425	435	418
29Q283	PREPARATORY ACADEMY FOR WRITERS	18	413	402	388
29Q420	Springfield Gardens High School	20	429	402	403
29Q492	MATH SCIENCE RESEARCH TECH MAG	59	405	430	405
29Q494	LAW GOVERNMENT SERVICE MAGNET	60	406	412	391
29Q496	BUSINESS COMPUTER APPLICATION	52	388	389	380
29Q498	Humanities and Arts Magnet High School	45	383	371	370
30Q445	WILLIAM C BRYANT HIGH SCHOOL	322	405	456	408
30Q450	LONG ISLAND CITY HIGH SCHOOL	296	418	434	408
30Q501	Frank Sinatra School of the Arts	153	506	495	499
30Q502	INFORMATION TECHNOLOGY HIGH SCHOOL	140	440	452	424
30Q555	NEWCOMERS HIGH SCHOOL	147	343	447	346
30Q575	ACADEMY OF AMERICAN STUDIES	120	502	505	515
30Q580	BACCALAUREATE SCH GLOBAL EDUC	64	560	587	570
31R080	MICHAEL J PETRIDES SCHOOL	86	508	523	502
31R440	NEW DORP HIGH SCHOOL	282	431	446	434
31R445	PORT RICHMOND HIGH SCHOOL	237	437	442	425
31R450	CURTIS HIGH SCHOOL	373	439	441	430
31R455	TOTTENVILLE HIGH SCHOOL	698	459	482	467
31R460	SUSAN E WAGNER HIGH SCHOOL	531	463	482	457
31R470	CONCORD HIGH SCHOOL	10	474	431	397
31R600	RALPH MCKEE VOC-TECH HIGH SCH	48	411	429	386
31R605	STATEN ISLAND TECHNICAL HIGH SCHOOL	287	638	673	617
32K403	ACADEMY OF ENVIRONMENTAL LEADERSHIP	36	382	369	374
32K545	EBC High School for Public Service - Bushwick	82	389	390	376
32K549	BUSHWICK HS FOR SOCIAL JUSTICE	72	363	364	358
32K552	Academy of Urban Planning	49	355	361	353
32K554	ALL CITY LEADERSHIP SECONDARY	29	394	420	395
32K556	Bushwick Leaders High School for Academic Excel	30	357	345	351
75R025	South Richmond High School	10	407	421	400
76K460	John Jay High School	9	390	381	398
79X490	PHOENIX SCHOOL	7	404	423	416

ryanbriones · 2012-08-08T20:25:41Z

If you have a new R installation or haven't installed lattice you will need to run:

install.packages('lattice')

Be warned, this may pop up a window that gets hidden behind your console. Once you've chosen a mirror, the install will continue.

ryanbriones · 2012-08-08T20:29:38Z

Also, based on Juan's work, I cobbled together a version of this demo that uses clojure's incanter package to accomplish the same work. Incanter is a port of R's statistics and charting functions to clojure.

https://gist.github.com/3292129

jpvelez · 2012-08-08T20:31:27Z

Thanks for the cherry on top, Ryan!

jpvelez · 2012-08-09T16:15:40Z

I've gone through and cleaned up the tutorial, so it should be way more readable now.