Last active
December 22, 2017 12:48
A fast datetime parser written in Python with minimal validations.
import datetime


def parse_datetime(s):
    """Parse a datetime string with minimal validations.

    `parse_datetime` takes a string with format yyyy-mm-dd HH:MM:SS.zzzzzz
    and returns a `datetime.datetime` object. Any prefix of that format
    that includes at least the full date is accepted as well; so a string
    like "2017-12-22" will return a datetime with hour, mins, secs and
    microsecs set to zero. Shorter prefixes such as "2017" leave month and
    day at zero, which `datetime.datetime` rejects.

    The microsecs part may have any length, but it will be truncated if
    there are more than 6 digits. Any non-digit character is considered a
    separator; so "xx 2017x12x22 13%50%45 %%", for example, is a valid
    date. Actual validation of date parts is left to `datetime.datetime`'s
    constructor (raises ValueError on invalid input).

    Args:
        s (str): the input string.

    Returns:
        A `datetime.datetime` object.
    """
    s = s + ' '  # trailing separator so the last run of digits is flushed
    parts = [0, 0, 0, 0, 0, 0, 0]
    i, left = 0, 0
    is_sep = True
    for right in range(len(s)):
        c = s[right]
        if c >= '0' and c <= '9':
            if is_sep:
                # first digit of a new field
                is_sep = False
                left = right
        else:
            if not is_sep:
                # a field just ended at s[left:right]
                is_sep = True
                v = int(s[left:right])
                d = right - left
                if i == 6:
                    # scale the fraction to exactly 6 digits; `//` keeps v
                    # an int (`/` yields a float on Python 3, which
                    # datetime.datetime rejects)
                    v = v * 10 ** (6 - d) if d <= 6 else v // 10 ** (d - 6)
                parts[i] = v
                i += 1
                if i >= 7:
                    break
    return datetime.datetime(*parts)
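
For comparison, the behaviour described in the docstring can also be sketched with a regular expression. This is not the gist's implementation and is probably slower than the hand-written loop; the name `parse_datetime_re` is hypothetical and exists only for this illustration. It assumes the same rules: every run of digits is one field, anything else is a separator, missing trailing fields are zero, and the seventh field is scaled or truncated to 6 digits.

```python
import datetime
import re


def parse_datetime_re(s):
    """Hypothetical regex-based counterpart of parse_datetime (sketch only)."""
    # Each maximal run of ASCII digits is a field; keep at most 7 of them.
    runs = re.findall(r'[0-9]+', s)[:7]
    parts = [int(r) for r in runs]
    if len(runs) == 7:
        # Right-pad the fraction to 6 digits, then cut off any extras:
        # "5" becomes 500000 and "1234567" becomes 123456.
        parts[6] = int(runs[6].ljust(6, '0')[:6])
    # Fields missing from the end of the input default to zero.
    parts += [0] * (7 - len(parts))
    # Range checks are left to the constructor (ValueError on bad input).
    return datetime.datetime(*parts)


print(parse_datetime_re("2017-12-22"))                 # 2017-12-22 00:00:00
print(parse_datetime_re("xx 2017x12x22 13%50%45 %%"))  # 2017-12-22 13:50:45
print(parse_datetime_re("2017-12-22 13:50:45.5"))      # 2017-12-22 13:50:45.500000
```

Padding and slicing the digit string avoids the power-of-ten arithmetic, at the cost of an extra string allocation for the fraction field.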