I have been keeping track of various ultimate championships and tournament results. One of these projects is keeping track of gold, silver and bronze winners of Finnish Ultimate Frisbee Championships (http://hartti.com/sm-ultimate/). This data is collected from various sources including old association newsletters (Frisbari archives are here https://www.ultimate.fi/ultimate/frisbari/) as well as more recently from Pelikone (https://www.ultimate.fi/pelikone/?view=statistics&list=teamstandings) managed by Finnish Flying Disc Association.
The data stored on a static Web page is not really useful so I decided to transfer the data to Google sheets (https://docs.google.com/spreadsheets/d/1eXInN-59gvlEqL7coizdXIocLe2YMUyc2ZqQGdG7K0k/edit#gid=1804348668) and then try out Neo4J as a platform to study the data further. (Note, that unless you have started sharing the Google Sheet file you are trying to access you will encounter some funky errors, so remember to share the file the first place :-)
Note, that Google Sheets' pivot table functionality (https://docs.google.com/spreadsheets/d/1eXInN-59gvlEqL7coizdXIocLe2YMUyc2ZqQGdG7K0k/edit#gid=1660829346) is more useful than Neo4j to create a medal table (a table is more readable than a list - see the Neo4J example below) and also the tools in Pivot Table aso allow easy filtering of results based on division, type of championships and limiting the results to include only certain placings (if I have time in the future to update the raw data with also other placings, not just medal positions…)
Also as my domain model is quite simple, the strengths of Neo4j are not utilized and Pivot Table can be used to create almost all of the interesting visualizations. It might also make sense to create a node representing the division (open, women, juniors, etc.) instead of having it listed in the relationship between the squad and the year (and year might not be that useful as a node… go figure…)
The source data curently contains only open division results for outdoor championships and the placings only for 3 first placings. If the source data is updated to include also other divisions and other placings these setup cypher scripts need to be updated to only include one of the divisions to keep the number of nodes and relationships at bay
load csv with headers from "https://docs.google.com/spreadsheets/d/1eXInN-59gvlEqL7coizdXIocLe2YMUyc2ZqQGdG7K0k/gviz/tq?tqx=out:csv&sheet=0" as line
with line
merge (t:Team {name: line.Team, city: line.City})
with line, t
FOREACH(ignoreMe IN CASE WHEN trim(line.OfficialName) <> "" THEN [1] ELSE [] END | SET t.official_name = line.OfficialName)
FOREACH(ignoreMe IN CASE WHEN trim(line.Founded) <> "" THEN [1] ELSE [] END | SET t.founded = toInteger(line.Founded))
load csv with headers from "https://docs.google.com/spreadsheets/d/1eXInN-59gvlEqL7coizdXIocLe2YMUyc2ZqQGdG7K0k/gviz/tq?tqx=out:csv&sheet=Sheet2" as line
merge (s:Squad {name: line.Squad})
with line, s
merge (y:Year {year: toInteger(line.Year)})
with line, s, y
create (s)-[:PLACING_IN_YEAR {placing: toInteger(line.Placing), division: line.Division, type: line.Type}]->(y)
with line, s
match (t:Team {name: line.Team})
merge (s)-[:BELONGS_TO]->(t)
The first and quite natural use case is finding the top three teams from a certain year. As the data will in the future also contain games from different divisions (like women, juniors), both indoor and outdoor games as well as also placings of all teams (not just the first 3 teams) we need to limit the results in the query. You can play with the following query by changing the year in the first line of the query (you can also change the other parameters but those have less effect as the data currently contains only open outdoor results for the first 3 teams.
match (y:Year {year: 1998})-[p:PLACING_IN_YEAR {division: "open", type: "outdoor"}]-(s:Squad)--(t:Team)
where p.placing <= 3
return y.year, p.placing, s.name, t.official_name
order by p.placing
Which Finnish team has the most medals? Which Finnish team has the most gold medals (= Finnish Championships)? Medal table providers answer for that. In this situation, we are not grouping the results using the name of the squad, as that could change from one year to another. Instead, we will group the results using the Team names so that for example Espoo Ultimate Club gets credited both for the wins by Otso as well as Liquidisc (both are names for the open division team of EUC)
match (y:Year)-[p:PLACING_IN_YEAR]-(s:Squad)--(t:Team)
where p.placing <= 3
with t, count(t) as totalMedals
match (y:Year)-[p:PLACING_IN_YEAR]-(s:Squad)--(t:Team)
where p.placing <= 3
return t.name, totalMedals, p.placing, count(t.name) as medalCount
order by totalMedals desc, p.placing asc