@mdboom
Created May 16, 2013 02:07
A PLY lexer that will cause a UnicodeDecodeError with ply-3.4
# -*- coding: utf-8 -*-
"""
Having any non-ascii content in this file is enough to get it to fail.
Jsme rytíři, kteří říkají "ni"
"""
tokens = (
    'NAME', 'NUMBER',
)
literals = ['=','+','-','*','/', '(',')']
# Tokens
t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'
def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t
t_ignore = " \t"
def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")
def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)
# Build the lexer
import ply.lex as lex
lex.lex()
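
The class of error named in the description can be reproduced without PLY. The following is a minimal sketch, assuming (the gist does not confirm this) that the failure comes from the module source being re-read and decoded as ASCII instead of UTF-8. The variable names are illustrative only and no PLY code is involved.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the lexer module's source text: it contains
# the same non-ASCII docstring line as the file above.
source = '# -*- coding: utf-8 -*-\n"""\nJsme rytíři, kteří říkají "ni"\n"""\n'

# On disk the file is stored as UTF-8 bytes.
raw = source.encode("utf-8")

# Decoding those bytes as ASCII fails on the first non-ASCII byte.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
else:
    error = None

# Decoding with the encoding declared on the first line round-trips cleanly.
decoded = raw.decode("utf-8")

print(type(error).__name__)
print(decoded == source)
```

If this assumption holds, any code path that opens the file without an explicit encoding under an ASCII default would fail in the same way, regardless of what the lexer rules contain.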