XML Pretty Print Using Python – with examples
Our guides are based on hands-on testing and verified sources. Each article is reviewed for accuracy and updated regularly to ensure current, reliable information.Read our editorial policy.
You want to take an ugly one‑liner XML string and make it human‑readable. You can do this in Python with a few lines of code.
You will typically use xml.dom.minidom, the xml.etree.ElementTree API, lxml, or BeautifulSoup. Each option reads raw XML and writes a formatted representation.
For example:
import xml.dom.minidom
raw = '<root><item id="1"><name>Item1</name><value>10</value></item><item id="2"><name>Item2</name><value>20</value></item></root>'
dom = xml.dom.minidom.parseString(raw)
print(dom.toprettyxml(indent=" "))Output:
<?xml version="1.0" ?>
<root>
<item id="1">
<name>Item1</name>
<value>10</value>
</item>
<item id="2">
<name>Item2</name>
<value>20</value>
</item>
</root>This quick example uses the DOM module’s toprettyxml() method. The indent parameter lets you pick the indentation string, and each line ends with \n by default.
The result is a neat, indented XML document. If you just want to format XML quickly without writing code, you can also use this XML Pretty Printer to clean up raw XML instantly in your browser.
Why pretty print XML?
Ugly XML is hard to read. Computers do not care about whitespace, but humans do. When you work with configuration files, REST responses, RSS feeds, or large data dumps, well‑formatted XML lets you quickly see element hierarchies.
On the other hand, printing raw XML to the console yields a long line of tags and attributes. Most of the time, you need a helper to insert newlines and indentation.
You have several choices in Python, each with different levels of simplicity, dependencies, and flexibility. If you often work with XML beyond pretty printing, browse these XML tools
for formatting, validation, and related developer tasks.
Overview of XML Pretty Print Methods
| Method & module | Key idea | Remarks |
|---|---|---|
xml.dom.minidom.toprettyxml() |
Parses the XML into a DOM tree and calls toprettyxml() to insert indentation and line breaks. |
Added indentation string and newline separator are configurable. Extra blank lines may appear; you can strip them. |
xml.etree.ElementTree.indent() (Python 3.9+) |
Modifies an Element or ElementTree in place by appending whitespace to elements so that ElementTree.tostring() or .write() output is indented. |
Accepts a space argument for indentation and an optional level argument to indent subtrees at an initial depth. |
| Custom indent helper (pre‑3.9) | Recursively adds newline and spaces to elem.text and elem.tail attributes to simulate pretty printing, then serializes using ET.tostring(). |
Works on all Python versions but modifies the element tree in place. |
lxml.etree.tostring(..., pretty_print=True) |
The lxml library extends ElementTree and provides a pretty_print flag for etree.tostring() that returns nicely formatted XML. |
Requires installing lxml. etree.indent() (available since lxml 4.5) provides finer control. |
bs4.BeautifulSoup(...).prettify() |
Uses BeautifulSoup’s HTML/XML parser to build a tree and then outputs an indented representation with prettify(). |
Suitable when you already use BeautifulSoup for parsing; adds an XML declaration in most cases and may split text nodes across lines. |
Let’s go through each option. You will see the long way and the neat way side by side.
1. Using xml.dom.minidom
xml.dom.minidom is part of the Python standard library. It implements a minimal DOM API for XML. You start by parsing a string or file into a DOM and then call toprettyxml().
The method returns a string with line breaks and indentation. The indent parameter controls the indent string, while newl controls line breaks.
For example:
from xml.dom import minidom
def pretty_minidom(xml_str, indent=" "):
dom = minidom.parseString(xml_str)
xml_with_ws = dom.toprettyxml(indent=indent)
# Remove empty lines inserted by minidom
return "\n".join(line for line in xml_with_ws.split("\n") if line.strip())
raw = '<root><child><value>42</value></child></root>'
print(pretty_minidom(raw))Output:
<?xml version="1.0" ?>
<root>
<child>
<value>42</value>
</child>
</root>You need to call strip() to avoid extra blank lines because toprettyxml() inserts empty text nodes for elements that only contain child elements.
This overhead may matter when printing large trees. Also, building a full DOM tree consumes memory; for huge documents, the iterative ElementTree API is faster.
2. Using xml.etree.ElementTree
Python 3.9 added an indent() function to xml.etree.ElementTree. This helper adds whitespace to a subtree so that subsequent serialization looks pretty.
According to the documentation, indent() appends a whitespace string (two spaces by default) for each indentation level; you can change the space argument and set an initial level.
For example:
import xml.etree.ElementTree as ET
def pretty_etree(xml_str, space=" "):
root = ET.fromstring(xml_str)
ET.indent(root, space=space) # modifies in place
return ET.tostring(root, encoding="unicode")
raw = '<catalog><book id="bk101"><title>XML Developer</title><price>44.95</price></book></catalog>'
print(pretty_etree(raw))Output:
<catalog>
<book id="bk101">
<title>XML Developer</title>
<price>44.95</price>
</book>
</catalog>This approach is neat because it works on both the root element and the entire ElementTree. The indentation is applied in place, so you can call tree.write() afterwards to write directly to a file with indentation.
If your Python version is older than 3.9, the next section shows how to achieve the same effect manually.
3. ElementTree custom indent
Before Python 3.9, ElementTree had no built‑in indent function. You needed to write a helper that recurses over the element tree and adjusts the text and tail fields.
The algorithm is simple: insert a newline and indentation before each child, and update the tail of each element to align its closing tag.
Here is a custom implementation:
import xml.etree.ElementTree as ET
def indent_custom(elem, level=0, indent_size=" "):
i = "\n" + level * indent_size
if len(elem): # has children
if not elem.text or not elem.text.strip():
elem.text = i + indent_size
for child in elem:
indent_custom(child, level + 1, indent_size)
if not elem.tail or not elem.tail.strip():
elem.tail = i
else: # no children
if level and (not elem.tail or not elem.tail.strip()):
elem.tail = i
xml_str = '<root><item><value>10</value></item><item><value>20</value></item></root>'
root = ET.fromstring(xml_str)
indent_custom(root)
print(ET.tostring(root, encoding='unicode'))Output:
<root>
<item>
<value>10</value>
</item>
<item>
<value>20</value>
</item>
</root>Notice that the closing tags line up with the content above them; this comes from adjusting tail attributes. This helper works in any Python version.
But please, do not modify shared element trees in place if other code expects unmodified whitespace — side effects can break parsers that rely on elem.text and elem.tail content. So, think twice before breaking your XML tree into a single line.
4. Using lxml.etree
The lxml library implements the ElementTree API in C and adds many features. When you call etree.tostring() with the pretty_print=True argument, lxml inserts newlines and indentations automatically. This returns a byte string; decode it to get Unicode.
For example:
from lxml import etree
def pretty_lxml(xml_str, encoding='unicode'):
element = etree.fromstring(xml_str)
return etree.tostring(element, pretty_print=True, encoding=encoding)
raw = '<root><section><title>Intro</title><content>Hi</content></section></root>'
print(pretty_lxml(raw))Output:
<root>
<section>
<title>Intro</title>
<content>Hi</content>
</section>
</root>This is shorter than building a DOM and trimming blank lines. lxml also provides an indent() function for finer control. It behaves like xml.etree.ElementTree.indent() but requires lxml ≥ 4.5.
Most of the time, using pretty_print=True is enough. Remember to install lxml via pip (pip install lxml) if it is not available by default.
5. Using BeautifulSoup
BeautifulSoup is primarily an HTML parser, but it also handles XML when you specify the ‘xml’ parser. It constructs a parse tree and has a prettify() method that adds newlines and indentation for readability.
The documentation demonstrates that running prettify() on a soup object prints nested tags with extra indentation.
Example:
from bs4 import BeautifulSoup
def pretty_soup(xml_str):
soup = BeautifulSoup(xml_str, 'xml')
return soup.prettify()
raw = '<root><item id="1"><name>Item1</name><value>10</value></item><item id="2"><name>Item2</name><value>20</value></item></root>'
print(pretty_soup(raw))Output:
<?xml version="1.0" encoding="utf-8"?>
<root>
<item id="1">
<name>
Item1
</name>
<value>
10
</value>
</item>
<item id="2">
<name>
Item2
</name>
<value>
20
</value>
</item>
</root>BeautifulSoup inserts an XML declaration and splits text nodes onto their own lines. This makes nested tags clear, but you might not want extra line breaks inside name or value elements.
You can convert the tree back to a single line with str(soup) if you need unformatted XML. Also, using BeautifulSoup just to pretty print is heavier than using the built‑in libraries. But if you already parse XML with BeautifulSoup, prettify() is handy.
Additional considerations
- Encoding and declarations: The DOM and BeautifulSoup methods emit an XML declaration (). lxml does not unless you set xml_declaration=True or call etree.tostring() on an entire ElementTree. If your downstream tool requires a declaration, add it explicitly.
- Attribute order: Since Python 3.8, both minidom and ElementTree preserve the original attribute order. If you rely on a specific order, ensure you use a recent Python version or lxml, which also preserves order.
- Large files: For huge XML documents, avoid building full DOM trees. Use xml.etree.ElementTree.iterparse() or lxml.iterparse() to stream through the file and write formatted output incrementally. You can pretty print subtrees by calling ET.indent() on each element before writing it out.
- Untrusted input: Parsing XML from unknown sources can expose you to XML external entity (XXE) attacks. The Python documentation warns that you should use safe parser configurations. When security matters, use defusedxml or disable external entity loading.
Conclusion
You saw that Python offers several simple ways to pretty print XML. The standard‑library xml.dom.minidom module exposes a toprettyxml() method with an indent parameter.
You can also indent trees with xml.etree.ElementTree.indent() in Python 3.9+, or write a custom helper for older versions. Third‑party solutions such as lxml (tostring(…, pretty_print=True)) and BeautifulSoup’s prettify() provide alternatives. Each technique reads raw XML and writes a formatted representation.
For example:
from xml.dom import minidom
raw = '<root><item id="1"><name>Item1</name><value>10</value></item><item id="2"><name>Item2</name><value>20</value></item></root>'
dom = minidom.parseString(raw)
print(dom.toprettyxml(indent=" "))Output:
<?xml version="1.0" ?>
<root>
<item id="1">
<name>Item1</name>
<value>10</value>
</item>
<item id="2">
<name>Item2</name>
<value>20</value>
</item>
</root>That’s it for the XML part. Thanks for reading.
Happy coding!


