@morrisalp
Last active June 24, 2023 21:05
Get all page names in a given Wiktionary category (e.g. "English lemmas") using the MediaWiki Action API.
import requests

def pages_in_wiktionary_category(category_name, language='en'):
    """Yield the title of every page in a Wiktionary category."""
    url = f'https://{language}.wiktionary.org/w/api.php'
    # Passing params lets requests URL-encode the category name.
    params = {'action': 'query', 'list': 'categorymembers',
              'cmtitle': f'Category:{category_name}',
              'cmlimit': 500, 'format': 'json'}
    while True:
        obj = requests.get(url, params=params).json()
        for x in obj['query']['categorymembers']:
            yield x['title']
        if 'continue' not in obj:
            break
        # Send the continuation token(s) back to fetch the next batch.
        params.update(obj['continue'])
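
The continuation loop above can be exercised without network access. The following is a minimal sketch, not part of the original gist: `iter_category_titles`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names, and the two-page response is made-up data that only imitates the shape of a `categorymembers` reply.

```python
from itertools import islice

def iter_category_titles(fetch, category_name):
    """Yield titles, following 'continue' tokens until none remain.

    `fetch` is a hypothetical callable that takes a params dict and
    returns the decoded JSON response as a dict.
    """
    params = {
        'action': 'query',
        'list': 'categorymembers',
        'cmtitle': f'Category:{category_name}',
        'cmlimit': 500,
        'format': 'json',
    }
    while True:
        obj = fetch(params)
        for member in obj['query']['categorymembers']:
            yield member['title']
        if 'continue' not in obj:
            break
        # Merge the continuation token(s) into the next request.
        params = {**params, **obj['continue']}

# Made-up two-page response standing in for the real API (assumption).
FAKE_PAGES = {
    None: {
        'query': {'categorymembers': [{'title': 'apple'}, {'title': 'bee'}]},
        'continue': {'cmcontinue': 'page|2', 'continue': '-||'},
    },
    'page|2': {
        'query': {'categorymembers': [{'title': 'cat'}]},
    },
}

def fake_fetch(params):
    # Select the fake page by the continuation token, if any.
    return FAKE_PAGES[params.get('cmcontinue')]

titles = list(iter_category_titles(fake_fetch, 'English lemmas'))
# islice stops the generator early, so later pages are never requested.
first_two = list(islice(iter_category_titles(fake_fetch, 'English lemmas'), 2))
```

Because the function is a generator, wrapping a call to the real `pages_in_wiktionary_category('English lemmas')` in `islice` the same way should fetch only as many batches as are needed, which is useful for large categories.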