- Zwölf Boxkämpfer jagten Eva quer über die große Straße von Sylt.
- This translates to:
- Twelve boxers chased Eva across the big street of Sylt.
- It includes the umlauts (ä, ö, ü) and the Eszett (ß), and every basic letter of the German alphabet except c and h, so it falls just short of a complete pangram.
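
The coverage of a candidate pangram can be checked mechanically. The following is a minimal sketch, assuming the German alphabet is taken to be the 26 basic Latin letters plus ä, ö, ü and ß; the helper name `missing_letters` is hypothetical and not part of the gist. For the sentence above it reports that c and h do not occur.

```python
import string
import unicodedata

# Assumption: "German alphabet" means a-z plus the umlauts and the Eszett.
GERMAN_ALPHABET = string.ascii_lowercase + "äöüß"


def missing_letters(sentence, alphabet=GERMAN_ALPHABET):
    """Return the letters of `alphabet` that never occur in `sentence`."""
    # NFC keeps umlauts as single precomposed characters so that the
    # set comparison works even if the input arrived in decomposed form.
    present = set(unicodedata.normalize("NFC", sentence.lower()))
    return sorted(set(alphabet) - present)


sentence = "Zwölf Boxkämpfer jagten Eva quer über die große Straße von Sylt."
# A commonly cited variant of the same sentence, shown for comparison.
variant = "Zwölf Boxkämpfer jagen Viktor quer über den großen Sylter Deich."

print(missing_letters(sentence))
print(missing_letters(variant))
```

An empty list means the sentence covers the whole alphabet passed in; swapping in another alphabet string reuses the same check for other languages.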
Last active
November 2, 2024 19:02
Pangrams in Portuguese
pangrama.md
- Li que ex-juíza turca vê fãs à beça em show de punk gay.
- Simple; includes an example of each accent mark, but not of every accented letter.
- Vejo galã sexy pôr quinze kiwis à força em baú achatado.
- 45 letters; includes an example of each accent mark, but not of every accented letter.
- Só juíza chata não vê câmera frágil e dá kiwi à ré sexy que pôs ações em baú.
- 59 letters, including every accented letter.
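
The letter counts and accent coverage quoted above can be reproduced with a short script. This is a sketch under one stated assumption: that the accented letters of modern Portuguese are á, à, â, ã, é, ê, í, ó, ô, õ, ú and ç. The names `PT_ACCENTED`, `letter_count` and `missing_accented` are hypothetical.

```python
import unicodedata

# Assumption: the accented letters used in modern Portuguese spelling.
PT_ACCENTED = set("áàâãéêíóôõúç")

PANGRAMS = [
    "Li que ex-juíza turca vê fãs à beça em show de punk gay.",
    "Vejo galã sexy pôr quinze kiwis à força em baú achatado.",
    "Só juíza chata não vê câmera frágil e dá kiwi à ré sexy que pôs ações em baú.",
]


def letter_count(sentence):
    """Count alphabetic characters, ignoring spaces, hyphens and punctuation."""
    composed = unicodedata.normalize("NFC", sentence)
    return sum(1 for ch in composed if ch.isalpha())


def missing_accented(sentence):
    """Return the accented letters from PT_ACCENTED absent from `sentence`."""
    composed = unicodedata.normalize("NFC", sentence.lower())
    return sorted(PT_ACCENTED - set(composed))


for sentence in PANGRAMS:
    missing = "".join(missing_accented(sentence)) or "none"
    print(f"{letter_count(sentence)} letters, missing accented letters: {missing}")
```

Only the third sentence leaves the missing set empty, which matches its description as the one that includes every accented letter.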
#!/usr/bin/env python
# show.py -- (c)2024 Henrique Moreira
""" Show Western European pangrams
"""

import unicodedata
# unidecode is more basic, but handles German better (e.g. it transliterates ß)
from unidecode import unidecode

# Language codes to process, in order
LANG = [
    "pt",
    "de",
]

# Language code -> English name of the language
WHOT = {
    "pt": "portuguese",  # Português
    "de": "german",      # Deutsch
}


def main():
    """ Show and illustrate pangrams """
    for language in LANG:
        desc = WHOT[language]
        print(f"Showing pangram for {language}: {desc}")
        fname = f"pangram-{language}.md"
        show_pangram(language, fname)


def strip_acc(astr: str) -> str:
    """ Removes accents.
    Referenced at:
    https://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-normalize-in-a-python-unicode-string
    """
    # NFD splits each accented letter into base letter + combining mark;
    # dropping category 'Mn' (nonspacing marks) leaves the base letters.
    astr = ''.join(achr for achr in unicodedata.normalize('NFD', astr)
                   if unicodedata.category(achr) != 'Mn')
    return astr


def show_pangram(lang: str, fname: str):
    """ Reads 'fname' and prints its last pangram with accents removed. """
    with open(fname, "r", encoding="utf-8") as fdin:
        astr = pangram(fdin.read(), lang)
    print("-", astr, end="\n\n")
    return True


def pangram(text, lang=""):
    """ Returns the last non-indented, non-empty line, stripped of accents. """
    last = [line for line in text.splitlines() if line.strip() and line[0] != ' '][-1]
    astr = strip_acc(last)
    # Best-effort conversion: drop whatever is still not ASCII.
    # If anything was dropped, NFD alone was not enough (e.g. German ß).
    rec = astr.encode("ASCII", errors='ignore').decode("ASCII")
    if astr != rec:
        print("# Warning NFD for language (ISO 639-1 code):", lang)
        print("# last:", unidecode(last))
        return "?"
    return astr


if __name__ == "__main__":
    main()
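
The warning branch in `pangram()` exists because NFD decomposition only removes combining marks, and ß has no decomposition. The sketch below illustrates that behaviour using the standard library only. The `FALLBACK` table and the `to_ascii` helper are hypothetical stand-ins for what the script delegates to `unidecode`; they are not part of show.py.

```python
import unicodedata


def strip_acc(astr):
    """Same technique as the script: NFD, then drop nonspacing marks."""
    decomposed = unicodedata.normalize("NFD", astr)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Hypothetical fallback table for letters that NFD cannot decompose.
FALLBACK = {"ß": "ss", "ẞ": "SS"}


def to_ascii(astr):
    """Strip accents, then map the remaining special letters by table."""
    return "".join(FALLBACK.get(ch, ch) for ch in strip_acc(astr))


print(strip_acc("über"))              # umlaut removed: plain ASCII
print(strip_acc("Straße"))            # ß survives, so this is still not ASCII
print(strip_acc("Straße").isascii())
print(to_ascii("große Straße"))
```

A hand-written table like this only covers the letters listed in it, which is one reason to prefer a general transliteration library for anything beyond a quick check.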