Last active
February 28, 2020 18:00
Top textual 4-grams within 15 words of an inline citation from arXiv (arXMLiv 08.2019)
4-gram | frequency | |
---|---|---|
see e g [cite] | 340651 | |
can be found in | 197421 | |
be found in [cite] | 130873 | |
see for example [cite] | 93473 | |
in the case of | 86786 | |
in the context of | 80782 | |
is given by [cite] | 73337 | |
shown in fig ref | 65890 | |
with respect to the | 63965 | |
we refer to [cite] | 60033 | |
the results of [cite] | 59107 | |
on the other hand | 58902 | |
as a function of | 58393 | |
in terms of the | 56654 | |
refer the reader to | 56415 | |
see for instance [cite] | 52874 | |
in the sense of | 51688 | |
we refer the reader | 51363 | |
NUM of ref [cite] | 51155 | |
the proof of [cite] | 47736 | |
as shown in [cite] | 46642 | |
was shown in [cite] | 46442 | |
in the literature [cite] | 44990 | |
the proof of theorem | 44929 | |
as well as the | 44485 | |
it was shown in | 44075 | |
in the proof of | 43169 | |
is consistent with the | 41887 | |
found in ref [cite] | 40783 | |
in agreement with the | 37872 | |
e g ref [cite] | 37504 | |
the reader to [cite] | 37372 | |
in the presence of | 37266 | |
in this paper we | 36716 | |
given in ref [cite] | 36013 | |
is based on the | 35873 | |
discussed in ref [cite] | 34797 | |
as described in [cite] | 33364 | |
it has been shown | 33312 | |
is similar to the | 33170 | |
if and only if | 32776 | |
as discussed in [cite] | 31685 | |
see e g ref | 31349 | |
can be used to | 31318 | |
[cite] and references therein | 30404 | |
proof of theorem ref | 29978 | |
it is well known | 29902 | |
as in ref [cite] | 29860 | |
we have the following | 29705 | |
shown in figure ref | 29666 | |
state of the art | 29328 | |
shown in ref [cite] | 29272 | |
is shown in [cite] | 28927 | |
is given in [cite] | 28501 | |
the proof of the | 28324 | |
in this section we | 28083 | |
the case of the | 27814 | |
is given by the | 27663 | |
in the next section | 27370 | |
NUM in ref [cite] | 27344 | |
can be written as | 27273 | |
the authors of [cite] | 27155 | |
it is shown in | 27081 | |
described in ref [cite] | 26624 | |
be found in ref | 26505 | |
in good agreement with | 26494 | |
the sense of [cite] | 26262 | |
is the same as | 26255 | |
in detail in [cite] | 24601 | |
[cite] for more details | 24232 | |
theorem NUM in [cite] | 23784 | |
for the case of | 23774 | |
the results in [cite] | 23665 | |
in the framework of | 23566 | |
is proved in [cite] | 23154 | |
proof of theorem NUM | 23056 | |
pointed out in [cite] | 23037 | |
taken from ref [cite] | 22999 | |
was proved in [cite] | 22550 | |
of the order of | 22463 | |
see [cite] for a | 21893 | |
for example in [cite] | 21788 | |
the work of [cite] | 21787 | |
presented in ref [cite] | 21531 | |
to the case of | 21498 | |
is in agreement with | 21416 | |
reported in ref [cite] | 21356 | |
with the results of | 21269 | |
a function of the | 21136 | |
it follows from [cite] | 21055 | |
be found in the | 21021 | |
is related to the | 21016 | |
e g refs [cite] | 20833 | |
the size of the | 20822 | |
results of ref [cite] | 20515 | |
section NUM of [cite] | 20356 | |
similar to the one | 20279 | |
good agreement with the | 20187 | |
is shown in fig | 20068 | |
be written as [cite] | 19963 | |
e g in [cite] | 19928 | |
as shown in fig | 19666 | |
eqs ref and ref | 19407 | |
in section ref we | 19402 | |
has been studied in | 19285 | |
proposed in ref [cite] | 19271 | |
a special case of | 19248 | |
similar to that of | 19158 | |
obtained in ref [cite] | 18879 | |
used in ref [cite] | 18775 | |
our previous work [cite] | 18686 | |
the proof of lemma | 18605 | |
refer to [cite] for | 18600 | |
is equivalent to the | 18353 | |
the presence of a | 18352 | |
has been shown in | 18077 | |
is one of the | 18000 | |
in contrast to the | 17965 | |
a factor of NUM | 17956 | |
is well known [cite] | 17951 | |
by a factor of | 17920 | |
as pointed out in | 17916 | |
in the form of | 17909 | |
the result of [cite] | 17907 | |
the same as in | 17848 | |
between NUM and NUM | 17567 | |
the main result of | 17530 | |
is similar to that | 17461 | |
theorem NUM of [cite] | 17366 | |
in the absence of | 17338 | |
the context of the | 17030 | |
see e g refs | 16942 | |
as explained in [cite] | 16562 | |
see [cite] for details | 16443 | |
at the end of | 16336 | |
reader is referred to | 16330 | |
which is consistent with | 16290 | |
see also ref [cite] | 16272 | |
the same as the | 16243 | |
are given in [cite] | 16243 | |
it is possible to | 16168 | |
are given by [cite] | 16142 | |
in the same way | 16119 | |
in the spirit of | 15941 | |
by means of the | 15939 | |
fig NUM of [cite] | 15849 | |
a generalization of the | 15814 | |
the results of the | 15812 | |
the case of a | 15670 | |
the existence of a | 15637 | |
in the study of | 15628 | |
ref ref and ref | 15603 | |
with the help of | 15549 | |
on the basis of | 15531 | |
is known to be | 15499 | |
a wide range of | 15478 | |
studied in ref [cite] | 15331 | |
the order of NUM | 15288 | |
we will use the | 15224 | |
from NUM to NUM | 15187 | |
the properties of the | 15155 | |
listed in table ref | 15071 | |
are shown in fig | 15052 | |
this completes the proof | 15006 | |
was introduced in [cite] | 14978 | |
in this case the | 14939 | |
of the proof of | 14908 | |
a consequence of the | 14864 | |
been studied in [cite] | 14796 | |
described in detail in | 14715 | |
in the paper [cite] | 14626 | |
beyond the scope of | 14543 | |
an important role in | 14529 | |
as in the proof | 14523 | |
fig NUM of ref | 14347 | |
the authors in [cite] | 14304 | |
to that of the | 14254 | |
the same way as | 14242 | |
see [cite] for more | 14212 | |
the definition of the | 14190 | |
as well as in | 14188 | |
the results of ref | 14168 | |
the details of the | 14135 | |
for more details see | 14110 | |
the reader is referred | 14054 | |
may be found in | 14034 | |
given in table ref | 14027 | |
is a consequence of | 13990 | |
are consistent with the | 13990 | |
is expected to be | 13963 | |
as defined in [cite] | 13948 | |
one of the most | 13928 | |
the rest of the | 13919 | |
it can be shown | 13914 | |
the framework of the | 13896 | |
shown in fig NUM | 13895 | |
is well known see | 13854 | |
this is consistent with | 13755 |
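
For readers who want to reproduce counts of this shape, here is a minimal Python sketch. It is not the code that produced the table above. The regular expression, the function names `tokenize` and `ngrams_near_citations`, and the reading of "within 15 words" as "every token of the 4-gram lies at most 15 tokens away from some `[cite]` token" are all assumptions made for illustration.

```python
import re
from collections import Counter

CITE = "[cite]"

# Hypothetical tokenizer: citation commands, \ref-style commands, numbers, plain words.
TOKEN = re.compile(r"\\cite\w*\{[^}]*\}|\\(?:eq)?ref\{[^}]*\}|\d+(?:\.\d+)?|[A-Za-z]+")


def tokenize(text):
    """Lowercase words; map citations to [cite], \\ref to ref, numbers to NUM."""
    tokens = []
    for match in TOKEN.finditer(text):
        tok = match.group(0)
        if tok.startswith("\\cite"):
            tokens.append(CITE)
        elif tok.startswith("\\"):
            tokens.append("ref")
        elif tok[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok.lower())
    return tokens


def ngrams_near_citations(tokens, n=4, window=15):
    """Count n-grams whose tokens all lie within `window` tokens of a citation."""
    cite_positions = [i for i, tok in enumerate(tokens) if tok == CITE]
    counts = Counter()
    for start in range(len(tokens) - n + 1):
        end = start + n - 1
        if any(start >= c - window and end <= c + window for c in cite_positions):
            counts[" ".join(tokens[start:start + n])] += 1
    return counts


sample = r"See e.g. \cite{a} and Theorem 2 in \cite{b}."
counts = ngrams_near_citations(tokenize(sample))
for gram, freq in counts.most_common():
    print(gram, "|", freq)
```

Summing the per-document counters over a corpus and sorting by frequency would give a table in the same layout as the one above, though the exact numbers depend on tokenization details that are not documented here.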
- What's `NUM`, e.g. in "from NUM to NUM"?
- Very few of those indicate any polarity, e.g.
  - are consistent with the
  - is a consequence of
  - in the spirit of
  - in the same way
- All numeric literals get substituted with `NUM` in my tokenization, similarly for `ref` being a substitute for LaTeX `\ref` numbers.
- I agree that most do not indicate polarity,
  - but a range of them indicate "kinds of certainty", even when phrased neutrally:
    - `is well known` - old & tried result, well-accepted
    - `was proved in`, `it is shown in` - certain by virtue of carrying its own proof (common to math)
    - `our previous work` - explicitly claiming credit & disclosing personal bias
I stand by my conclusion that ngrams are just the wrong tool to study inline citations, but there's a lot that one could imagine done on the sentence level.
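
As a rough illustration of what a sentence-level pass might look like, the sketch below tags already-split sentences with the "kinds of certainty" listed above. The label names (`well-accepted`, `proved`, `self-citation`), the `CERTAINTY_CUES` table, and the function `tag_sentence` are hypothetical, and the input is assumed to have citations already replaced by `[cite]`.

```python
import re

# Hypothetical cue table built from the phrases discussed in this thread.
CERTAINTY_CUES = {
    "is well known": "well-accepted",
    "was proved in": "proved",
    "it is shown in": "proved",
    "our previous work": "self-citation",
}


def tag_sentence(sentence):
    """Return sorted certainty labels for a sentence that contains a citation."""
    if "[cite]" not in sentence:
        return []
    # Reduce the sentence to lowercase words so punctuation does not block a match.
    words = " ".join(re.findall(r"[a-z]+", sentence.lower()))
    padded = f" {words} "
    labels = {label for cue, label in CERTAINTY_CUES.items() if f" {cue} " in padded}
    return sorted(labels)


sentences = [
    "It is well known [cite] that the bound is tight.",
    "As in our previous work [cite], it is shown in [cite] that the limit exists.",
    "It is well known that the bound is tight.",
]
for s in sentences:
    print(tag_sentence(s), "|", s)
```

Working on whole sentences means a cue can be matched wherever it falls relative to the citation, which a fixed 4-gram window cannot do. A phrase list this small is only a starting point.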