Create, load, edit and save HTML code using Python

Working with HTML Documents in Python – Creating, Editing, and Saving

This GitHub Gist repository contains the Python code examples referenced in the Working with Documents chapter of the Aspose.HTML for Python via .NET documentation. The gists demonstrate practical ways to load, read, create, modify, and save HTML documents programmatically using the Aspose.HTML for Python via .NET API.

Topics Covered

This Gist demonstrates the basics of creating, manipulating, and saving documents using the Aspose.HTML library for Python via .NET. The code snippets illustrate how to:

  • Create a new, empty HTML document.
  • Load an HTML document from a string variable, local file, URL, or memory stream.
  • Load and save SVG documents, allowing you to manipulate scalable vector graphics using the DOM.
  • Edit the HTML DOM (Document Object Model) by creating new elements and adding them to the document structure.
  • Save HTML in MHTML or Markdown formats.
  • Save edited documents to your local drive, ensuring all changes and associated files are saved correctly.
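The load, edit, and save steps listed above can be combined into one short script. The following is a minimal sketch that relies only on calls appearing in the snippets of this gist (`HTMLDocument(content, base_uri)`, `create_element`, `create_text_node`, `append_child`, `save`). The function name `build_and_save` and the output file name are hypothetical, and the import guard is an assumption added so the script degrades gracefully when the package is not installed.

```python
import os

try:
    import aspose.html as ah  # installed with: pip install aspose-html-net
except ImportError:
    ah = None  # the package is not available in this environment


def build_and_save(html_code, save_path):
    """Load HTML from a string, append a paragraph, and save the result.

    Hypothetical helper for illustration; not part of the Aspose.HTML API.
    """
    if ah is None:
        raise RuntimeError("aspose-html-net is not installed")
    # Make sure the target folder exists before saving
    os.makedirs(os.path.dirname(save_path) or ".", exist_ok=True)
    # Load the document from a string; "." is the base URI for relative links
    document = ah.HTMLDocument(html_code, ".")
    # Create a new <p> element with a text node and attach it to <body>
    p = document.create_element("p")
    p.append_child(document.create_text_node("Added with the DOM API"))
    document.body.append_child(p)
    # Save the edited document to disk
    document.save(save_path)
    return save_path


# Run the workflow only when the library could be imported
if ah is not None:
    build_and_save("<p>Hello, World!</p>", os.path.join("output", "workflow-sketch.html"))
```

The individual snippets further down show each step in isolation, including loading from a file, URL, or stream instead of a string.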

About Aspose.HTML for Python via .NET

Aspose.HTML for Python via .NET is a cross-platform API for creating, editing, and converting HTML, SVG, EPUB, MHTML, and Markdown files. It offers headless browser capabilities, full DOM and CSS control, and conversion to formats like PDF, DOCX, XPS, or images—without relying on external software.

Prerequisites

  • Python 3.5+
  • .NET Core / .NET 5+ runtime
  • Windows, macOS, or Linux
  • Aspose.HTML for Python via .NET installed from PyPI
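Before running the examples, it may help to confirm the Python-side prerequisites. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the check assumes the installed package exposes a top-level `aspose` module (as the `import aspose.html` lines in this gist suggest). It does not verify the .NET runtime.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 5), package="aspose"):
    """Report whether the interpreter and package prerequisites look satisfied."""
    return {
        # Compare only major/minor against the required minimum version
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec returns None when the top-level package cannot be found
        "package_found": importlib.util.find_spec(package) is not None,
    }


status = check_environment()
print("Python {}.{} OK: {}".format(sys.version_info[0], sys.version_info[1], status["python_ok"]))
print("aspose package found: {}".format(status["package_found"]))
```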

How to Use These Examples

  1. Install the Aspose.HTML Python package:
     pip install aspose-html-net
  2. Clone or download this gist to your local machine.
  3. Configure input/output paths, data directories, and font folders if needed.
  4. Run the example in your environment.
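Most snippets in this gist repeat the same path setup, so the configuration step can be factored into a small helper. This is a sketch using only the standard library; the function name `prepare_paths` is hypothetical, and its default folder and file names are taken from the examples below (`data`, `output`, `document.html`, `document-edited.html`).

```python
import os


def prepare_paths(input_dir="data", output_dir="output",
                  input_name="document.html", output_name="document-edited.html"):
    """Return (input_path, output_path) and create the output folder if missing."""
    # exist_ok=True makes the call safe when the folder already exists
    os.makedirs(output_dir, exist_ok=True)
    input_path = os.path.join(input_dir, input_name)
    output_path = os.path.join(output_dir, output_name)
    return input_path, output_path


if __name__ == "__main__":
    input_path, output_path = prepare_paths()
    print(input_path)
    print(output_path)
```

The returned paths can then be passed to the document constructor and to `document.save(...)` as in the snippets that follow.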

Related Resources

Aspose.HTML for Python via .NET – Work with HTML Documents
# Create and save SVG document using Python
# Learn more: https://docs.aspose.com/html/python-net/save-html-document/
import os
import aspose.html.dom.svg as ahsvg
# Define the output directory and document path
output_dir = 'output'
document_path = os.path.join(output_dir, 'save-html-to-svg.svg')
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)
# Prepare SVG code
svg_code = """
<svg xmlns='http://www.w3.org/2000/svg' height='400' width='300'>
<path stroke="#a06e84" stroke-width="3" fill="#74aeaf" d="
M 150,50 L 150, 300
M 120,100 L 150,50 L 180, 100
M 110,150 L 150,90 L 190, 150
M 90,220 L 150,130 L 210, 220
M 70,300 L 150,190 L 230, 300
M 110,310 L 150,240 L 190, 310
" />
</svg>
"""
# Initialize an SVG instance from the content string
document = ahsvg.SVGDocument(svg_code, '.')
# Save SVG
document.save(document_path)
# Create an empty HTML document using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import os
import aspose.html as ah
# Setup an output directory and prepare a path to save the document
output_dir = "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
save_path = os.path.join(output_dir, "document-empty.html")
# Initialize an empty HTML document
document = ah.HTMLDocument()
# Work with the document here...
# Save the document to a file
document.save(save_path)
# Create HTML from a string using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import os
import aspose.html as ah
# Prepare HTML code
html_code = "<p>Hello, World!</p>"
# Setup output directory
output_dir = "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
# Initialize a document from the string variable
document = ah.HTMLDocument(html_code, ".")
# Save the document to disk
document.save(os.path.join(output_dir, "create-html-from-string.html"))
# Create and add new HTML elements using Python
# Learn more: https://docs.aspose.com/html/python-net/edit-html-document/
import os
import aspose.html as ah
# Define output directory and file paths
output_dir = "output/"
save_path = os.path.join(output_dir, "edit-document.html")
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)
# Create an instance of an HTML document
document = ah.HTMLDocument()
# Create a style element and set the teal color for elements with class "col"
style = document.create_element("style")
style.text_content = ".col { color: teal }"
# Find the document <head> element and append the <style> element
head = document.get_elements_by_tag_name("head")[0]
head.append_child(style)
# Create a paragraph <p> element with class "col"
p = document.create_element("p")
p.class_name = "col"
# Create a text node
text = document.create_text_node("Edit HTML document")
# Append the text node to the paragraph
p.append_child(text)
# Append the paragraph to the document <body>
document.body.append_child(p)
# Save the HTML document to a file
document.save(save_path)
# Create an HTML document using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import os
import aspose.html as ah
# Prepare the output path to save a document
output_dir = "output/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
document_path = os.path.join(output_dir, "create-new-document.html")
# Initialize an empty HTML document
with ah.HTMLDocument() as document:
    # Create a text node and add it to the document
    text = document.create_text_node("Hello, World!")
    document.body.append_child(text)
    # Save the document to a file
    document.save(document_path)
# Edit HTML document using DOM Tree in Python
# Learn more: https://docs.aspose.com/html/python-net/edit-html-document/
import os
import aspose.html as ah
# Define the output directory and file path
output_dir = "output/"
output_path = os.path.join(output_dir, "edit-document-tree.html")
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)
# Create an instance of an HTML document
document = ah.HTMLDocument()
# Access the document <body> element
body = document.body
# Create a paragraph element <p>
p = document.create_element("p")
# Set the id attribute
p.set_attribute("id", "my-paragraph")
# Create a text node
text = document.create_text_node("The Aspose.Html.Dom namespace provides an API for representing and interfacing with HTML, XML, or SVG documents.")
# Add the text to the paragraph
p.append_child(text)
# Attach the paragraph to the document body
body.append_child(p)
# Save the HTML document to a file
document.save(output_path)
# How to set inline CSS styles in an HTML element using Python
# Learn more: https://docs.aspose.com/html/python-net/edit-html-document/
import os
import aspose.html as ah
import aspose.html.rendering.pdf as rp
# Define the content of the HTML document
content = "<p>Edit inline CSS using Aspose.HTML for Python via .NET</p>"
# Create an instance of an HTML document with specified content
document = ah.HTMLDocument(content, ".")
# Find the paragraph element and set a style attribute
paragraph = document.get_elements_by_tag_name("p")[0]
paragraph.set_attribute("style", "font-size: 150%; font-family: arial; color: teal")
# Save the HTML document to a file
output_dir = "output/"
os.makedirs(output_dir, exist_ok=True)
html_path = os.path.join(output_dir, "edit-inline-css.html")
document.save(html_path)
# Create an instance of the PDF output device and render the document to this device
pdf_path = os.path.join(output_dir, "edit-inline-css.pdf")
with rp.PdfDevice(pdf_path) as device:
    document.render_to(device)
# Edit HTML with internal CSS using Python
# Learn more: https://docs.aspose.com/html/python-net/edit-html-document/
import os
import aspose.html as ah
import aspose.html.rendering.pdf as rp
# Define the content of the HTML document
content = "<div><h1>Internal CSS</h1><p>An internal CSS is used to define a style for a single HTML page</p></div>"
# Create an instance of an HTML document with specified content
document = ah.HTMLDocument(content, ".")
# Create a <style> element and define internal CSS rules
style = document.create_element("style")
style.text_content = (
    ".frame1 { margin-top:50px; margin-left:50px; padding:25px; width:360px; height:90px; "
    "background-color:#82011a; font-family:arial; color:#fff5ee;} \r\n"
    ".frame2 { margin-top:-70px; margin-left:160px; text-align:center; padding:20px; width:360px; "
    "height:100px; background-color:#ebd2d7;}"
)
# Find the <head> element and append the style element
head = document.get_elements_by_tag_name("head")[0]
head.append_child(style)
# Find the <h1> element and apply the "frame1" class
header = document.get_elements_by_tag_name("h1")[0]
header.class_name = "frame1"
# Update the style using the style attribute directly
header.set_attribute("style", "font-size: 200%; text-align: center;")
# Find the paragraph <p> element and apply the "frame2" class and inline styles
paragraph = document.get_elements_by_tag_name("p")[0]
paragraph.class_name = "frame2"
paragraph.set_attribute("style", "color: #434343; font-size: 150%; font-family: verdana;")
# Save the HTML document to a file
output_dir = "output/"
os.makedirs(output_dir, exist_ok=True)
html_path = os.path.join(output_dir, "edit-internal-css.html")
document.save(html_path)
# Create an instance of the PDF output device and render the document to this device
pdf_path = os.path.join(output_dir, "edit-internal-css.pdf")
with rp.PdfDevice(pdf_path) as device:
    document.render_to(device)
# Load HTML from a file using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import os
import aspose.html as ah
# Setup directories and define paths
output_dir = "output/"
input_dir = "data/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
document_path = os.path.join(input_dir, "document.html")
save_path = os.path.join(output_dir, "document-edited.html")
# Initialize a document from a file
document = ah.HTMLDocument(document_path)
# Work with the document
# Save the document to a file
document.save(save_path)
# Load HTML from a stream using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import os
import io
import aspose.html as ah
# Prepare an output path for saving the document
output_dir = "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
# Wrap the HTML content in a binary stream (BytesIO rather than StringIO)
content_stream = io.BytesIO(b"<p>Hello, World!</p>")
base_uri = "."
# Initialize a document from the content stream
document = ah.HTMLDocument(content_stream, base_uri)
# Save the document to disk
document.save(os.path.join(output_dir, "load-from-stream.html"))
# Load HTML from a URL using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import aspose.html as ah
# Load a document from the specified web page
document = ah.HTMLDocument("https://docs.aspose.com/html/files/aspose.html")
# Write the document content to the output stream
print(document.document_element.outer_html)
# Load SVG from a string using Python
# Learn more: https://docs.aspose.com/html/python-net/create-a-document/
import io
import aspose.html.dom.svg as ahsvg
# Prepare the SVG content as a string
svg_content = "<svg xmlns='http://www.w3.org/2000/svg'><circle cx='50' cy='50' r='40'/></svg>"
base_uri = "."
# Wrap the UTF-8 encoded string in a byte stream and initialize an SVG document from it
content_stream = io.BytesIO(svg_content.encode('utf-8'))
document = ahsvg.SVGDocument(content_stream, base_uri)
# Write the document content to the output stream
print(document.document_element.outer_html)
# Save the document to disk
document.save("load-from-stream.svg")
# Save HTML to a file using Python
# Learn more: https://docs.aspose.com/html/python-net/save-html-document/
import os
import aspose.html as ah
# Prepare an output path for saving the document
output_dir = "output/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
document_path = os.path.join(output_dir, "save-html-document.html")
# Initialize an empty HTML document
with ah.HTMLDocument() as document:
    # Create a text node and add it to the document
    text = document.create_text_node("Hello, World!")
    document.body.append_child(text)
    # Save HTML to a file
    document.save(document_path)
# Save HTML as Markdown using Python
# Learn more: https://docs.aspose.com/html/python-net/save-html-document/
import os
import aspose.html as ah
import aspose.html.saving as sav
# Prepare paths to the source HTML file and the output Markdown file
data_dir = "data"
output_dir = "output/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
input_path = os.path.join(data_dir, "document.html")
output_path = os.path.join(output_dir, "html-to-markdown.md")
# Load the HTML document from a file
document = ah.HTMLDocument(input_path)
# Save the document to Markdown format
document.save(output_path, sav.HTMLSaveFormat.MARKDOWN)
# Save HTML as MHTML using Python
# Learn more: https://docs.aspose.com/html/python-net/save-html-document/
import os
import aspose.html as ah
import aspose.html.saving as sav
# Define the output directory and document path
output_dir = 'output'
document_path = os.path.join(output_dir, 'save-html-to-mhtml.mht')
# Ensure the output directory exists
os.makedirs(output_dir, exist_ok=True)
# Prepare a simple HTML file with a linked document
with open('document.html', 'w') as file:
    file.write("<p>Hello, World!</p>"
               "<a href='linked-file.html'>linked file</a>")
# Prepare a simple linked HTML file
with open('linked-file.html', 'w') as file:
    file.write("<p>Hello, linked file!</p>")
# Load the "document.html" into memory
with ah.HTMLDocument('document.html') as document:
    # Save the document to MHTML format
    document.save(document_path, sav.HTMLSaveFormat.MHTML)
# Save HTML with linked resources using Python
# Learn more: https://docs.aspose.com/html/python-net/save-html-document/
import os
import aspose.html as ah
import aspose.html.saving as sav
# Prepare an output path for the document
output_dir = "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
document_path = os.path.join(output_dir, "save-with-linked-file.html")
# Prepare a simple HTML file with a linked document
with open(document_path, "w") as file:
    file.write("<p>Hello, World!</p>" +
               "<a href='linked.html'>linked file</a>")
# Prepare a simple linked HTML file
with open(os.path.join(output_dir, "linked.html"), "w") as file:
    file.write("<p>Hello, linked file!</p>")
# Load the "save-with-linked-file.html" into memory
document = ah.HTMLDocument(document_path)
# Create a save options instance
options = sav.HTMLSaveOptions()
# max_handling_depth controls how deep linked resources are followed while saving this instance
# A value of '0' cuts off all other linked HTML files; with '1', as set here, the 'linked.html' file is saved to the output folder as well
options.resource_handling_options.max_handling_depth = 1
# Save the document with the save options
output_path = os.path.join(output_dir, "save-with-linked-file_out.html")
document.save(output_path, options)
# Edit HTML body content and get modified document as a string using Python
# Learn more: https://docs.aspose.com/html/python-net/edit-html-document/
import aspose.html as ah
# Create an instance of an HTML document
document = ah.HTMLDocument()
# Write the content of the HTML document to the console
print(document.document_element.outer_html) # output: <html><head></head><body></body></html>
# Set the content of the body element
document.body.inner_html = "<p>HTML is the standard markup language for Web pages.</p>"
# Find the document <p> element
p = document.get_elements_by_tag_name("p")[0]
# Write the content of the <p> element to the console
print(p.inner_html) # output: HTML is the standard markup language for Web pages.
# Write the updated content of the HTML document to the console
print(document.document_element.outer_html) # output: <html><head></head><body><p>HTML is the standard markup language for Web pages.</p></body></html>