Created
November 18, 2015 20:08
-
-
Save brenns10/b3184ee45391c1361529 to your computer and use it in GitHub Desktop.
Determine Ordering of Alphabet
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
Code to return the alphabet in which a list of words is sorted. | |
Imagine that you are given a list of words (eg: best friends forever), along | |
with the claim that these words are lexicographically sorted for *some* | |
alphabet. Your job is to determine whether this is true, and if it is, return | |
an example of such an alphabet. | |
This module implements an algorithm to solve this problem. It creates a graph | |
of dependencies between letters, performs a topological sort, and then returns | |
this ordering of letters (very simple). You need NetworkX (pip install | |
networkx, or use your distribution's packaage manager). Either run within the | |
Python interpreter (call it with a list of words), or run on the command line | |
(python alphabet.py), entering each word on its own line and hidding Ctrl-D | |
followed by enter (EOF) on its own line to terminate the list. | |
""" | |
import networkx as nx | |
def determine_alphabet(words): | |
""" | |
Return an alphabet in which the word list is sorted lexicographically. | |
:param words: a list of "words" that are sorted | |
:return: an ordered list of characters in the alphabet | |
""" | |
# a -> b means a comes before b in the alphabet | |
chardeps = nx.DiGraph() | |
# make sure every character in every word is a node in the graph. | |
for word in words: | |
chardeps.add_nodes_from(word) | |
# for each pair (1, 2), (2, 3), (3, 4), ... | |
for w1, w2 in zip(words, words[1:]): | |
# Find the first character in the word that is different. | |
for c1, c2 in zip(w1, w2): | |
if c1 != c2: | |
break | |
# If the characters are the same, they must have been the same words. | |
if c1 == c2: | |
continue | |
chardeps.add_edge(c1, c2) | |
# Problem is only solvable if the resulting graph is a DAG. | |
start = words[0][0] | |
if not nx.algorithms.is_directed_acyclic_graph(chardeps): | |
raise ValueError('Word list is not sorted according to any alphabet.') | |
# Do a topoligical sort -- ordering respects all constraints. | |
sort = nx.algorithms.topological_sort(chardeps, [start]) | |
# Any characters in the remainder have no constraints on them, and so they | |
# can be put into the alphabet in an arbitrary order. | |
remainder = set(chardeps.nodes()).difference(sort) | |
sort = sort + list(remainder) | |
return sort | |
if __name__ == '__main__': | |
import sys | |
words = [l.strip() for l in sys.stdin.readlines()] | |
try: | |
alphabet = determine_alphabet(words) | |
for character in alphabet: | |
print(character) | |
except ValueError: | |
print('That word list is not sorted according to any alphabet.') |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment