Skip to content

Instantly share code, notes, and snippets.

@jaklinger
Created October 2, 2019 13:21
Show Gist options
  • Save jaklinger/759b6d23d379b95fca16650baf495e60 to your computer and use it in GitHub Desktop.
Save jaklinger/759b6d23d379b95fca16650baf495e60 to your computer and use it in GitHub Desktop.
MySQL co-occurences from a link table (example treats docs as binarised vector of skills)
CREATE TEMPORARY TABLE tmp_skills_a
PRIMARY key doc_id
SELECT doc_id, skill_id
FROM skills_link_table
GROUP BY doc_id, skill_id;
CREATE TEMPORARY TABLE tmp_skills_b
PRIMARY key doc_id
SELECT doc_id, skill_id
FROM skills_link_table
GROUP BY doc_id, skill_id;
SELECT skill_a, skill_b, COUNT(*) cnt
FROM (SELECT a.skill_id AS skill_a, b.skill_id AS skill_b
FROM tmp_skills_a a
JOIN tmp_skills_b b ON a.doc_id = b.doc_id
WHERE a.skill_id > b.skill_id) t
GROUP BY skill_a, skill_b;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment