@serrasqueiro
Last active November 2, 2024 19:02
Pangrams in Portuguese
  1. Zwölf Boxkämpfer jagten Eva quer über die große Straße von Sylt.
    • This translates to:
      • Twelve boxers chased Eva across the big street of Sylt.
    • It includes the umlauts (ä, ö, ü) and the Eszett (ß), but not every letter of the German alphabet: c and h do not appear.
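
A coverage claim like this is easy to test mechanically. The sketch below is a minimal check, not part of the gist: `missing_letters` and `GERMAN_ALPHABET` are hypothetical names, and the alphabet is assumed to be the 26 basic Latin letters plus ä, ö, ü and ß. Run on the sentence above, it reports that c and h are absent, while the umlauts and ß are all present.

```python
import string

# Assumption: the German alphabet is taken as a-z plus the umlauts and Eszett.
GERMAN_ALPHABET = string.ascii_lowercase + "äöüß"

SENTENCE = "Zwölf Boxkämpfer jagten Eva quer über die große Straße von Sylt."


def missing_letters(sentence: str, alphabet: str) -> list[str]:
    """Return the letters of `alphabet` that never occur in `sentence`."""
    present = set(sentence.lower())
    return sorted(set(alphabet) - present)


print(missing_letters(SENTENCE, GERMAN_ALPHABET))  # → ['c', 'h']
```

An empty list would mean the sentence is a full pangram for the chosen alphabet.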
  1. Li que ex-juíza turca vê fãs à beça em show de punk gay.
    • Simple; includes an example of each diacritic, but not of every accented letter.
  2. Vejo galã sexy pôr quinze kiwis à força em baú achatado.
    • 45 letters; includes an example of each diacritic, but not of every accented letter.
  3. Só juíza chata não vê câmera frágil e dá kiwi à ré sexy que pôs ações em baú.
    • 59 letters, including every accented letter.
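
The letter counts and the accent coverage noted in the list can be verified with a short sketch. It is illustrative only: `letter_count`, `accented_letters` and `PT_ACCENTED` are hypothetical names, and the set of accented letters is an assumption (the twelve used in modern Portuguese spelling).

```python
import unicodedata

# Assumption: the accented letters of modern Portuguese orthography.
PT_ACCENTED = set("áâãàçéêíóôõú")

SECOND = "Vejo galã sexy pôr quinze kiwis à força em baú achatado."
THIRD = ("Só juíza chata não vê câmera frágil e dá kiwi à ré sexy "
         "que pôs ações em baú.")


def letter_count(sentence: str) -> int:
    """Count alphabetic characters, ignoring spaces, hyphens and punctuation."""
    composed = unicodedata.normalize("NFC", sentence)
    return sum(ch.isalpha() for ch in composed)


def accented_letters(sentence: str) -> set[str]:
    """Return the Portuguese accented letters that occur in `sentence`."""
    composed = unicodedata.normalize("NFC", sentence.lower())
    return {ch for ch in composed if ch in PT_ACCENTED}


print(letter_count(SECOND), letter_count(THIRD))       # → 45 59
print(sorted(PT_ACCENTED - accented_letters(SECOND)))  # letters the 2nd one lacks
print(sorted(PT_ACCENTED - accented_letters(THIRD)))   # → []
```

Normalizing to NFC first keeps each accented letter as a single code point, so the counts do not depend on how the file was saved.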
#!/usr/bin/env python
# show.py -- (c)2024 Henrique Moreira

""" Show Western European pangrams
"""

import unicodedata
# Simpler transliteration, but works better for German (e.g. for 'ß')
from unidecode import unidecode

LANG = [
    "pt",
    "de",
]

WHOT = {
    "pt": "portuguese",  # Português
    "de": "german",      # Deutsch
}


def main():
    """ Show and illustrate pangrams """
    for language in LANG:
        desc = WHOT[language]
        print(f"Showing pangram for {language}: {desc}")
        fname = f"pangram-{language}.md"
        show_pangram(language, fname)


def strip_acc(astr: str) -> str:
    """ Removes accents.
    Referenced at:
      https://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-normalize-in-a-python-unicode-string
    """
    # NFD splits each accented letter into base letter + combining mark ('Mn')
    astr = ''.join(achr for achr in unicodedata.normalize('NFD', astr)
                   if unicodedata.category(achr) != 'Mn')
    return astr


def show_pangram(lang: str, fname: str):
    """ Reads 'fname' and prints its last pangram without accents """
    with open(fname, "r", encoding="utf-8") as fdin:
        astr = pangram(fdin.read(), lang)
    print("-", astr, end="\n\n")
    return True


def pangram(text, lang=""):
    """ Returns the last top-level (non-indented) line, stripped of accents """
    last = [line for line in text.splitlines() if line.strip() and line[0] != ' '][-1]
    astr = strip_acc(last)
    # To convert best-effort:
    #   astr.encode("ASCII", errors='ignore').decode("ASCII")
    # Just ensure we got pure ASCII:
    rec = astr.encode("ASCII", errors='ignore').decode("ASCII")
    if astr != rec:
        # Some character survived NFD stripping (e.g. German 'ß')
        print("# Warning NFD for language (ISO 639-1 code):", lang)
        print("# last:", unidecode(last))
        return "?"
    return astr


if __name__ == "__main__":
    main()
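
The ASCII round-trip in `pangram()` matters mostly for German. NFD decomposition splits ä, ö and ü into a base letter plus a combining mark, but ß has no decomposition, so it survives the stripping and the script takes the warning branch, falling back to `unidecode`. The sketch below illustrates this with the standard library only; `strip_marks` is a hypothetical stand-in for the script's `strip_acc`.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks (category 'Mn') after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


pt = strip_marks("ações em baú")
de = strip_marks("Zwölf Boxkämpfer, große Straße")

print(pt, pt.isascii())  # → acoes em bau True
print(de, de.isascii())  # → Zwolf Boxkampfer, große Straße False
```

For Portuguese, every accented letter decomposes, so the stripped text is pure ASCII and is returned as-is.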