Python: Why lxml's etree.tostring method returns bytes

When using the lxml library in Python, I found the API a little strange. For example, when I use the etree.tostring method to print the content of parsed HTML, it returns bytes rather than a string.

from lxml import etree
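Here is a minimal sketch of the kind of call I mean. The markup in `sample` and the variable names are made up for illustration, and the fallback import is only there so the snippet also runs where lxml is not installed (the standard library's ElementTree documents the same bytes-versus-unicode behaviour for tostring).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library, which exposes
    # fromstring/tostring with the same bytes-by-default behaviour.
    import xml.etree.ElementTree as etree

# Hypothetical, well-formed sample markup
sample = "<html><body><p>Hello</p></body></html>"

root = etree.fromstring(sample)

# With no encoding argument, tostring serializes to bytes
result = etree.tostring(root)

print(type(result))
print(result)
```

For a full page parsed from a file, the bytes would start with the document's doctype rather than with the html tag, but the result is a bytes object either way.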

It prints something like this:

b'<!DOCTYPE html PUBLIC "-//W3C//DTD ...

To print it as a string, we need to decode the bytes with an encoding.
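A small sketch of the decoding step, under the same assumptions as before (hypothetical sample markup, standard-library fallback if lxml is missing). I pass the encoding explicitly so that the one used to serialize and the one used to decode are guaranteed to match.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard library behaves the same way here
    import xml.etree.ElementTree as etree

# Hypothetical sample markup
root = etree.fromstring("<html><body><p>Hello</p></body></html>")

# Serialize to UTF-8 bytes, then decode those bytes back into a str
raw = etree.tostring(root, encoding="utf-8")
text = raw.decode("utf-8")

print(type(raw).__name__)
print(type(text).__name__)
print(text)
```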


But the method name indicates that it should return a string. Why this design? According to the documentation, the method returns a string when encoding="unicode" is specified; otherwise it returns bytes.

Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml").

print(etree.tostring(html, encoding="unicode"))
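To check the difference between the two return types, here is a rough comparison on a made-up sample (the variable names are mine, and the standard-library fallback is again only an assumption for environments without lxml). It also tries the "text" method mentioned in the documentation excerpt.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard library behaves the same way here
    import xml.etree.ElementTree as etree

# Hypothetical sample markup
root = etree.fromstring("<html><body><p>Hello</p></body></html>")

as_bytes = etree.tostring(root)                     # default: bytes
as_text = etree.tostring(root, encoding="unicode")  # str, no decode needed

# method="text" keeps only the text content and drops the tags
text_only = etree.tostring(root, method="text", encoding="unicode")

print(type(as_bytes).__name__)
print(type(as_text).__name__)
print(text_only)
```

So the name tostring is only accurate when encoding="unicode" is passed; with any other encoding, or with none, the result is an encoded byte string.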

See more

How to install lxml for Python 3.4.3 on Windows

Start parsing XML with Python and lxml: How to parse XML with Python and lxml