- ################################### Wikidata ##################################
? ------ --
+ ############################# Wikidata ################################
-
- # wikidatawiki.balanced_revisions.20k_2015.json is check into the repo
-
- datasets/wikidatawiki.autolabeled_revisions.20k_2015.json: \
- 		datasets/wikidatawiki.balanced_revisions.20k_2015.json
- 	cat $< | \
- 	./utility autolabel --host=https://wikidata.org \
- ############################### Hungarian Wikipedia ###########################
? --
+ ############################# Hungarian Wikipedia ################################
? +++++
  datasets/huwiki.sampled_revisions.40k_2016.json:
  	wget -qO- http://quarry.wmflabs.org/run/79645/output/0/json-lines?download=true > $@
  datasets/huwiki.autolabeled_revisions.40k_2016.json: \
  		datasets/huwiki.sampled_revisions.40k_2016.json
- ############################# Norwegian Wikipedia #############################
+ ############################# Norwegian Wikipedia ################################
? +++
  datasets/nowiki.sampled_revisions.100k_2015.json:
  	wget -qO- https://quarry.wmflabs.org/run/67250/output/0/json-lines?download=true > $@
  datasets/nowiki.autolabeled_revisions.100k_2015.json: \
  		datasets/nowiki.sampled_revisions.100k_2015.json
  	cat $< | \
  ############################# Persian Wikipedia ################################
+
+ datasets/fawiki.sampled_revisions.2.20k_2015.json:
+ 	wget -qO- http://quarry.wmflabs.org/run/59580/output/0/json-lines?download=true > $@
+
+ datasets/fawiki.autolabeled_revisions.2.20k_2015.json: \
+ 		datasets/fawiki.sampled_revisions.2.20k_2015.json
+ 	cat $< | \
+ 	./utility autolabel --host=https://fa.wikipedia.org \
+ 		--trusted-groups=sysop,oversight,bot,rollbacker,checkuser,abusefilter,bureaucrat,flow-bot \
# License: MIT
import gzip


# Scan the gzipped clickstream dump for rows whose i-th
# tab-separated field equals name.
def search(name, i):
    result = []
    with gzip.open('clickstream-enwiki-2017-12.tsv.gz', 'rb') as f:
        for line in f:
            line = line.decode('utf-8').replace('\n', '')
            if line.split('\t')[i] == name:
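The `search` function above is cut off mid-loop. The following is a minimal sketch of how such a scan might be completed, assuming the intent is to collect every row whose chosen column equals `name`. The helper `search_lines`, the `path` and `column` parameters, and the choice to return the split rows are assumptions, not part of the original snippet.

```python
import gzip


def search_lines(lines, name, column):
    """Return the split rows whose column-th tab-separated field equals name.

    `lines` is any iterable of UTF-8 encoded byte strings (hypothetical helper).
    """
    result = []
    for raw in lines:
        fields = raw.decode('utf-8').rstrip('\n').split('\t')
        # Assumption: skip rows that are too short instead of raising IndexError.
        if column < len(fields) and fields[column] == name:
            result.append(fields)
    return result


def search(path, name, column):
    """Scan a gzipped TSV file; gzip.open in 'rb' mode yields byte lines."""
    with gzip.open(path, 'rb') as f:
        return search_lines(f, name, column)
```

Splitting the matching logic from the file handling keeps the filter usable on any iterable of byte lines, for example a small in-memory list when checking the column index.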
amsa@C235:~/editquality$ python differ.py "Japanese Wikipedia"
- ########################### Japanese Wikipedia ################################
+ ############################# Japanese Wikipedia ################################
? ++
-
  # From https://quarry.wmflabs.org/query/9927
  datasets/jawiki.sampled_revisions.40k_2016.json:
  	wget -qO- https://quarry.wmflabs.org/run/89016/output/0/json-lines?download=true > $@
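The `-`, `+`, and `?` markers in this output match the format produced by Python's `difflib.ndiff`. The contents of `differ.py` are not shown here, so the following is only a sketch of how output in this style could be generated; `ndiff_lines` is a hypothetical name.

```python
import difflib


def ndiff_lines(old_lines, new_lines):
    """Return ndiff output for two lists of lines, without trailing newlines.

    Each returned line starts with a two-character marker:
    '- ' (only in old), '+ ' (only in new), '  ' (in both),
    or '? ' (a hint line pointing at intraline changes).
    """
    return [line.rstrip('\n') for line in difflib.ndiff(old_lines, new_lines)]
```

Passing the old and new versions of a Makefile section, each read with `readlines()`, would yield lines in the same marker style as the listing above.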
import pymysql.cursors
import json

connection = pymysql.connect(host='localhost',
                             user='wikiuser',
                             password='secret service',
                             db='wikidb',
                             cursorclass=pymysql.cursors.DictCursor)
# range(20881, 1, -1)
for i in [142]:
# License: MIT
import pywikibot
import re
import urllib2
from pywikibot import pagegenerators

site = pywikibot.Site('en')
generator = pagegenerators.SearchPageGenerator(
    'insource:/\| *journal *= *.+Cochrane/', site=site, namespaces=[0])
gen = pagegenerators.PreloadingGenerator(generator)
https://hy.wikipedia.org/wiki/Գլխավոր_էջ 711702
https://hy.wikipedia.org/wiki/Սպասարկող:Որոնել 476598
https://hy.wikipedia.org/wiki/- 223200
https://hy.wikipedia.org/wiki/Սպասարկող:Մասնակցիմուտք 131273
https://hy.wikipedia.org/wiki/Սպասարկող:Վերջինփոփոխությունները 109813
https://hy.wikipedia.org/wiki/Հայաստան 96958
https://hy.wikipedia.org/wiki/Սպասարկող:CreateAccount 80968
https://hy.wikipedia.org/wiki/Սպասարկող:Book 69286
https://hy.wikipedia.org/wiki/Հովհաննես_Թումանյան 66342
https://hy.wikipedia.org/wiki/Երևան 52996
# License: MIT
import pywikibot
import sys

with open('nick_fixes.txt', 'r') as f:
    cases = f.read().split('\n')
sites = {'wikidata': pywikibot.Site('wikidata', 'wikidata')}
ok = True
fixes = [
    ['== Share your experience and feedback as a Wikimedian in this global survey ==', ['<ref>']],