| Subject | Document count |
|---|---|
| math | 334932 |
| astro-ph | 223437 |
| cond-mat | 212384 |
| cs | 132338 |
| hep-ph | 130788 |
| hep-th | 116499 |
| physics | 99881 |
| quant-ph | 80888 |

| heading | frequency |
|---|---|
| proof | 2930621 |
| lemma | 1706821 |
| theorem | 1700430 |
| references | 1351260 |
| abstract | 1193933 |
| introduction | 1117555 |
| proposition | 1059776 |
| definition | 972999 |
| remark | 888243 |

    \newcount\index
    \newcount\sum
    \def\esum#1{
      \index=#1
      \sum=0
      \loop
        \advance\sum by \index
        \ifnum\index>2
        \advance\index by -2
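
The macro preview above is cut off before the loop closes, so its exact behaviour cannot be confirmed from the listing. As a cross-check of what it appears to compute, here is a minimal Rust sketch that assumes the loop ends with `\repeat` immediately after the decrement. The Rust function name `esum` simply mirrors the macro; everything about the loop's tail is an assumption.

```rust
/// Sums `start`, `start - 2`, `start - 4`, ... and stops after adding the
/// first value that is not greater than 2.
/// Assumption: the truncated TeX loop closes with `\repeat` right after
/// `\advance\index by -2`.
pub fn esum(start: i64) -> i64 {
    let mut index = start;
    let mut sum = 0;
    loop {
        // \advance\sum by \index
        sum += index;
        // \ifnum\index>2 \advance\index by -2 (assumed: \repeat)
        if index > 2 {
            index -= 2;
        } else {
            break;
        }
    }
    sum
}
```

Under these assumptions, `esum(10)` evaluates to 30 (10 + 8 + 6 + 4 + 2) and `esum(7)` to 16 (7 + 5 + 3 + 1).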

| heading | frequency |
|---|---|
| proof | 2464628 |
| lemma | 1380622 |
| theorem | 1254064 |
| references | 1213025 |
| abstract | 1057178 |
| introduction | 955218 |
| proposition | 876742 |
| remark | 694222 |
| definition | 686827 |

    //! Convert arXiv's OAI harvested XML files into a lookup table for classification labels
    // Step 0. Prerequisite: download all needed arXiv metadata via OAI, e.g.
    //```
    // $ pip install git+http://github.com/bloomonkey/oai-harvest.git#egg=oaiharvest
    // $ mkdir metadata/arxiv; cd metadata/arxiv
    // $ oai-reg add arxiv http://export.arxiv.org/oai2?verb=Identify
    // $ oai-harvest arxiv --until 2018-09-09
    //```
    // endpoint documentation at: https://arxiv.org/help/oa
    use jwalk::WalkDir;
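
The listing stops at the first import, so the actual conversion code is not shown. The following is a minimal sketch of the idea only, assuming each harvested record carries a standard OAI-PMH `<header>` with one `<identifier>` and one or more `<setSpec>` elements. The function names `next_element` and `collect_labels` are hypothetical, and the naive string scanning stands in for a real XML parser; headers with attributes (for example deleted records) are not matched.

```rust
use std::collections::HashMap;

/// Returns the text between the first `<tag>` and `</tag>` in `xml`,
/// together with the remainder of `xml` after the closing tag.
/// Hypothetical helper: only matches tags written without attributes.
fn next_element<'a>(xml: &'a str, tag: &str) -> Option<(&'a str, &'a str)> {
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);
    let start = xml.find(&open)? + open.len();
    let end = start + xml[start..].find(&close)?;
    Some((&xml[start..end], &xml[end + close.len()..]))
}

/// Adds one entry per record header: identifier -> list of setSpec labels.
/// Assumption: labels are taken from `<setSpec>`; the original code may use
/// a different field.
pub fn collect_labels(xml: &str, table: &mut HashMap<String, Vec<String>>) {
    let mut rest = xml;
    while let Some((header, after)) = next_element(rest, "header") {
        if let Some((id, _)) = next_element(header, "identifier") {
            let mut labels = Vec::new();
            let mut specs = header;
            while let Some((spec, tail)) = next_element(specs, "setSpec") {
                labels.push(spec.trim().to_string());
                specs = tail;
            }
            table.insert(id.trim().to_string(), labels);
        }
        rest = after;
    }
}
```

Calling `collect_labels` once per harvested file with a shared `HashMap` would accumulate the lookup table; walking the directory tree (which the original does with `jwalk`) is left out of the sketch.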

| word | frequency |
|---|---|
| figure | 3290488 |
| theorem | 3052607 |
| section | 2802295 |
| lemma | 2408488 |
| table | 1544961 |
| proposition | 1334759 |
| and | 1031640 |
| corollary | 476062 |
| appendix | 416964 |

    #!/usr/bin/env perl
    # Applies cutoffs to the very noisy 250 MB mathml_statistics.txt
    # which was generated by llamapun over arXMLiv 08.2018.
    #
    # It rewrites to a CSV file, throwing out all known erroneous markup:
    # - discard all SVG-associated markup (wrongly in MathML)
    # - discard all (non-math) HTML-associated markup (wrongly in MathML)
    # - discard all XMath-associated markup (wrongly in MathML)
    # - reduce noise from uninteresting values (numbers with known units, hex colors, open-ended id schemes, etc)
    #
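
Only the header comment of the script is visible, so the real cutoff rules are unknown. The sketch below illustrates the kind of filter the comment describes, in Rust rather than Perl. The denylists are short, illustrative and incomplete, the `XM` prefix test for XMath elements and the `min_frequency` threshold are assumptions, and the function names are hypothetical.

```rust
/// Illustrative, incomplete denylists (assumptions, not the script's lists).
const SVG_NAMES: &[&str] = &["svg", "g", "path", "rect", "circle", "line", "polygon"];
const HTML_NAMES: &[&str] = &["span", "div", "p", "a", "img", "table", "tr", "td"];

/// Extracts the element name from a key such as `math@class[ltx_Math]`.
fn element_name(key: &str) -> &str {
    key.split(|c| c == '@' || c == '[').next().unwrap_or(key)
}

/// Keeps an entry only if it is frequent enough and its element name is not
/// on a denylist. XMath elements are assumed to start with `XM`.
pub fn keep_entry(key: &str, frequency: u64, min_frequency: u64) -> bool {
    let name = element_name(key);
    let is_svg = SVG_NAMES.contains(&name);
    let is_html = HTML_NAMES.contains(&name);
    let is_xmath = name.starts_with("XM");
    frequency >= min_frequency && !is_svg && !is_html && !is_xmath
}

/// Writes the surviving entries as CSV text. Keys are written as-is; a key
/// containing a comma would need quoting, which this sketch does not do.
pub fn to_csv(entries: &[(&str, u64)], min_frequency: u64) -> String {
    let mut out = String::from("name@attr[value],frequency\n");
    for &(key, frequency) in entries {
        if keep_entry(key, frequency, min_frequency) {
            out.push_str(&format!("{},{}\n", key, frequency));
        }
    }
    out
}
```

Value-level noise reduction (units, hex colors, id schemes) is not covered here, since the listing gives no detail on how those values are normalised.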

| name@attr[value] | frequency |
|---|---|
| mo | 390704 |
| mi | 317263 |
| mrow | 265247 |
| mi@href | 230061 |
| math@display | 108952 |
| math@class | 108952 |
| math | 108952 |
| math@alttext | 108952 |
| math@class[ltx_Math] | 108944 |

    time: 0.026; rss: 58MB parsing
    time: 0.000; rss: 58MB attributes injection
    time: 0.000; rss: 58MB garbage collect incremental cache directory
    time: 0.000; rss: 58MB recursion limit
    time: 0.000; rss: 58MB crate injection
    time: 0.000; rss: 58MB plugin loading
    time: 0.000; rss: 58MB plugin registration
    time: 0.000; rss: 58MB background load prev dep-graph
    time: 0.003; rss: 58MB pre ast expansion lint checks
    time: 1.662; rss: 237MB expand crate