@ramalho
Created September 25, 2013 20:29
Cleans up the list of chemical elements taken from the editable source of the Wikipedia article http://en.wikipedia.org/wiki/List_of_elements, printing only the atomic number and name of each element.
# Read the saved wikitext source of the article
with open('List_of_Elements.wiki', encoding='utf-8') as file:
    text = file.read()
# print(len(text), 'chars in the text')
lines = text.split('\n')  # split by newlines
# print(len(lines), 'lines in the text')
for line in lines:
    cells = line.split('||')  # cells in a table row are separated by '||'
    # skip lines that are not element rows: the third cell must be a wiki link
    if len(cells) < 3 or not cells[2].startswith(' [['):
        continue
    atomic_number = cells[0].replace('|', '').strip()
    element_name = cells[2].replace('[[', '').replace(']]', '').strip()
    # piped link such as [[Target|Label]]: keep the displayed label
    if '|' in element_name:
        element_name = element_name.split('|')[1]
    print(atomic_number, element_name)
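To try the parsing logic without downloading the article source, the same steps can be wrapped in a function and run on a few lines of text. This is only a sketch: the function name `parse_elements` and the `SAMPLE` rows are hypothetical, written to mimic the `||`-separated row layout the script expects, and are not copied from the actual article.

```python
def parse_elements(text):
    """Return (atomic_number, element_name) pairs found in wikitext table rows.

    Hypothetical helper: applies the same filtering and cleanup as the
    script above, but collects the results instead of printing them.
    """
    results = []
    for line in text.split('\n'):
        cells = line.split('||')
        # keep only rows whose third cell starts with a wiki link
        if len(cells) < 3 or not cells[2].startswith(' [['):
            continue
        atomic_number = cells[0].replace('|', '').strip()
        element_name = cells[2].replace('[[', '').replace(']]', '').strip()
        # piped link [[Target|Label]]: keep the displayed label
        if '|' in element_name:
            element_name = element_name.split('|')[1]
        results.append((atomic_number, element_name))
    return results


# Made-up sample rows (an assumption about the row format, not real article data)
SAMPLE = """! Z !! Sym !! Element
|-
| 1 || H || [[Hydrogen]] || 1 || 1
|-
| 80 || Hg || [[Mercury (element)|Mercury]] || 12 || 6"""

for number, name in parse_elements(SAMPLE):
    print(number, name)
```

Header rows and `|-` separators are skipped because they either have fewer than three `||`-separated cells or no link in the third cell.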