@rg3915
Last active April 28, 2018 16:52
Reads the .tex files in a pages/ folder and counts the occurrences of \\cite{foo} or \\citeonline{bar}, grouped by foo or bar.
'''
Reads the .tex files in the projeto/document/pages/ folder
and counts the occurrences of \\cite{foo} or \\citeonline{bar},
grouped by foo or bar.
'''
import csv
import os
import re
from collections import Counter

path = './'
# path = 'projeto/document/pages/'

# Collect the lines of every .tex file in the folder.
lines = []
for file in os.listdir(path):
    if file.endswith('.tex'):
        # os.path.join works whether or not path ends with a slash.
        filename = os.path.join(path, file)
        with open(filename, 'r') as f:
            lines.extend(f.readlines())

# Regular expression: group 1 is the optional "online", group 2 is the key.
regex_expression = r'\\cite(online)?\{(\w+)\}'

# Join all the list items into a single text.
lines_joined = ''.join(lines)

# findall returns a list of (group 1, group 2) tuples, one per match.
words_tuple = re.findall(regex_expression, lines_joined)

# New list with only the citation keys taken from the tuples.
words = [word for _, word in words_tuple]

# Count the keys and sort them by descending frequency.
result = Counter(words).most_common()

# Save result.tex
with open('result.tex', 'w') as f:
    f.write('\\begin{table}\n')
    f.write('\\begin{tabular}{ll}\n')
    for item in result:
        f.write('%s & %s \\\\\n' % item)
    f.write('\\end{tabular}\n')
    f.write('\\end{table}\n')

# Save result.csv (newline='' lets the csv module control line endings).
with open('result.csv', 'w', newline='') as f:
    w = csv.writer(f)
    # Header row: 'Nome' (name), 'Quantidade' (count).
    w.writerow(('Nome', 'Quantidade'))
    for item in result:
        w.writerow(item)
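The pattern in the script has two capturing groups, so `re.findall` returns one tuple per match and the key has to be unpacked afterwards. As a minimal sketch of an alternative, assuming a non-capturing group `(?:online)?` is acceptable, `findall` can return the keys directly. The names `CITE_PATTERN` and `count_citations` are hypothetical and not part of the gist.

```python
import re
from collections import Counter

# Non-capturing (?:online)? so that findall yields only the citation key.
# CITE_PATTERN and count_citations are illustrative names, not from the gist.
CITE_PATTERN = re.compile(r'\\cite(?:online)?\{(\w+)\}')


def count_citations(text):
    """Return (key, count) pairs sorted by descending count."""
    return Counter(CITE_PATTERN.findall(text)).most_common()


sample = r'See \cite{foo}, \citeonline{bar} and again \citeonline{foo}.'
print(count_citations(sample))  # [('foo', 2), ('bar', 1)]
```

Like the original pattern, `\w+` only matches a single key made of letters, digits and underscores, so keys with hyphens or comma-separated lists such as `\cite{foo,bar}` are not counted.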
Nome,Quantidade
foo,4
baz,3
bar,2
\begin{table}
\begin{tabular}{ll}
foo & 4 \\
baz & 3 \\
bar & 2 \\
\end{tabular}
\end{table}
Lorem ipsum \\cite{foo} dolor sit amet, consectetur \\citeonline{foo} adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna \\cite{bar} aliqua. Ut enim ad minim veniam,
quis nostrud \\citeonline{bar} exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis \\cite{baz} aute irure dolor in reprehenderit in voluptate velit \\citeonline{foo} esse
cillum dolore eu fugiat nulla pariatur \\citeonline{baz}. Excepteur \\cite{foo} sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum \\citeonline{baz}.
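The counts in the outputs above can be checked against this sample text. The sketch below applies the script's pattern to the sample held in a string rather than read from disk, which is an assumption made to keep it self-contained; the names `sample_text` and `rows` are illustrative. The sample writes each command with two backslashes, and the pattern still matches because it only needs one backslash immediately before `cite`.

```python
import re
from collections import Counter

# The sample text, copied verbatim into a raw string (double backslashes kept).
sample_text = r'''Lorem ipsum \\cite{foo} dolor sit amet, consectetur \\citeonline{foo} adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna \\cite{bar} aliqua. Ut enim ad minim veniam,
quis nostrud \\citeonline{bar} exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis \\cite{baz} aute irure dolor in reprehenderit in voluptate velit \\citeonline{foo} esse
cillum dolore eu fugiat nulla pariatur \\citeonline{baz}. Excepteur \\cite{foo} sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum \\citeonline{baz}.'''

# Same pattern as the script: group 1 is "online" (or empty), group 2 is the key.
matches = re.findall(r'\\cite(online)?\{(\w+)\}', sample_text)
result = Counter(key for _, key in matches).most_common()
print(result)  # [('foo', 4), ('baz', 3), ('bar', 2)]

# The table rows as the script would write them to result.tex.
rows = ['%s & %s \\\\' % item for item in result]
print('\n'.join(rows))
```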