@jvanasco
Created October 19, 2018 22:37
After too much testing, including writing custom HTML parsers and checking CPython-generated bytecode, this is the fastest general-purpose way I could come up with to upgrade an HTML doc's images into AMP. If you expect a lot of images in a document, it would be faster to write a second function which ensures unique `imgs` and eliminates duplic…
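import re

# matches any <img ...> tag, case-insensitively
RE_img = re.compile("""<img[^>]+>""", re.I)


def upgrade(
    html,
    img_layout=None,  # STRING
    img_fallback=None,  # STRING
    img_noscript=None,  # BOOL
    img_noloading=None,  # BOOL
    img_default_height=None,  # must be a STRING, never an INT
    img_default_width=None,  # must be a STRING, never an INT
):
    imgs = RE_img.findall(html)
    # these pieces are identical for every image, so build them once
    tag_end = ' noloading >' if img_noloading else '>'
    img_fallback = "<div fallback>%s</div>" % img_fallback if img_fallback else ''
    for _img in imgs:
        # work out how many trailing characters close the original tag
        _r_idx = -1  # ">" default
        if _img[-3:] == " />":
            _r_idx = -3
        elif _img[-2:] == "/>":
            _r_idx = -2
        elif _img[-2:] == " >":
            _r_idx = -2
        # "<img ..." becomes "<amp-img ...", with the closing characters dropped
        _tag_new = "<amp-" + _img[1:_r_idx]
        # defaults are only added when the attribute is not already on the tag
        if img_layout and (' layout=' not in _tag_new):
            _tag_new += ' layout="' + img_layout + '"'
        if img_default_width and (' width=' not in _tag_new):
            _tag_new += ' width="' + img_default_width + '"'
        if img_default_height and (' height=' not in _tag_new):
            _tag_new += ' height="' + img_default_height + '"'
        _tag_new += tag_end
        _tag_new += img_fallback
        if img_noscript:
            # keep the original <img> for clients without JavaScript
            _tag_new += "<noscript>%s</noscript></amp-img>" % _img
        else:
            _tag_new += '</amp-img>'
        # str.replace rewrites every verbatim copy of this tag at once;
        # caveat: with img_noscript, a tag repeated verbatim in the document
        # is matched again inside its <noscript> copy on a later pass
        html = html.replace(_img, _tag_new)
    return html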
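
The description mentions a second function that keeps `imgs` unique. Below is a minimal sketch of that idea, not the author's implementation: `upgrade_unique` and `_to_amp` are hypothetical names, and only the `img_layout` and `img_noscript` options are covered to keep it short. Handling each distinct tag once should also avoid re-wrapping the copy of the tag that `img_noscript` leaves inside `<noscript>`.

```python
import re

RE_IMG = re.compile(r"<img[^>]+>", re.I)


def _to_amp(img, img_layout=None, img_noscript=False):
    """Build the <amp-img> replacement for one <img> tag (hypothetical helper)."""
    # strip whichever closer the tag uses; every regex match ends with ">",
    # so the last candidate always matches and `body` is always set
    for closer in (" />", "/>", " >", ">"):
        if img.endswith(closer):
            body = img[1:-len(closer)]
            break
    tag = "<amp-" + body
    if img_layout and " layout=" not in tag:
        tag += ' layout="' + img_layout + '"'
    tag += ">"
    if img_noscript:
        # keep the original tag for clients without JavaScript
        tag += "<noscript>" + img + "</noscript>"
    return tag + "</amp-img>"


def upgrade_unique(html, img_layout=None, img_noscript=False):
    """Rewrite each distinct <img> tag once (assumed variant of `upgrade`)."""
    # dict.fromkeys drops repeated tags while keeping first-seen order
    for img in dict.fromkeys(RE_IMG.findall(html)):
        # str.replace rewrites every verbatim copy of this tag in one pass
        html = html.replace(img, _to_amp(img, img_layout, img_noscript))
    return html


doc = '<p><img src="a.png"><img src="a.png"><img src="b.png" /></p>'
print(upgrade_unique(doc, img_layout="responsive"))
```

Whether this is actually faster than the loop above depends on how often tags repeat in a document; for documents with few or no duplicates the extra dictionary is pure overhead, so it is worth timing on real input before switching.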