Created
September 29, 2014 03:17
--Syntax
SELECT * FROM TF_IDF(
    ON TF
    (
        ON {table|view|(query)} PARTITION BY docid
        [FORMULA('bool'|'log'|'augment'|'normal')]
    ) AS TF PARTITION BY term
    [ON (SELECT term, COUNT(DISTINCT docid) FROM input_table
         GROUP BY term) AS docperterm PARTITION BY term]
    [ON (SELECT COUNT(DISTINCT(docid)) FROM input_table)
         AS doccount DIMENSION]
    [ON (SELECT DISTINCT(term) AS term,
         IDF FROM tf_idf_output_table) AS IDF PARTITION BY term]
);
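
--Example (a sketch only, not part of the original syntax listing).
--Assumptions: tokenized_docs is a hypothetical input table with one row
--per (docid, term) occurrence. Only the doccount input is supplied, so
--the optional docperterm and IDF inputs shown in brackets above are left
--out. FORMULA('normal') is one of the four listed options, picked here
--arbitrarily.
SELECT * FROM TF_IDF(
    ON TF
    (
        ON tokenized_docs PARTITION BY docid
        FORMULA('normal')
    ) AS TF PARTITION BY term
    ON (SELECT COUNT(DISTINCT(docid)) FROM tokenized_docs)
        AS doccount DIMENSION
);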