A fast implementation of the HTML 5 parsing spec for Python. Parsing is
done in C using a variant of the gumbo parser. The gumbo parse tree is
then transformed into an lxml tree, also in C, yielding parse times that
can be a thirtieth of the html5lib parse times, a speedup of up to 30x.
This differs, for instance, from the gumbo Python bindings, where the
initial parsing is done in C but the transformation into the final
tree is done in Python.
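
The speed claim above can be checked on your own input with a small timing
harness. The following is a minimal sketch, not part of the package: the
helper names best_time and speedup and the sample document are hypothetical,
and the comparison only runs if both html5lib and html5-parser are
installed. It assumes both libraries expose a parse function that accepts
an HTML string.

```python
import timeit


def best_time(parse_fn, html, repeat=5):
    """Return the fastest of `repeat` single-run timings, in seconds."""
    return min(timeit.repeat(lambda: parse_fn(html), number=1, repeat=repeat))


def speedup(slow_fn, fast_fn, html, repeat=5):
    """Return the ratio of slow_fn's best time to fast_fn's best time."""
    slow = best_time(slow_fn, html, repeat)
    fast = best_time(fast_fn, html, repeat)
    # Guard against a timer reading of exactly zero for trivial inputs.
    return float("inf") if fast == 0 else slow / fast


if __name__ == "__main__":
    # Hypothetical sample input; substitute a document you care about.
    sample = (
        "<!DOCTYPE html><html><body>"
        + "<p>Hello <b>world</b></p>" * 2000
        + "</body></html>"
    )
    try:
        import html5lib
        from html5_parser import parse
    except ImportError:
        print("Install html5lib and html5-parser to run the comparison.")
    else:
        ratio = speedup(html5lib.parse, parse, sample)
        print(f"html5-parser was about {ratio:.1f}x faster on this input")
```

The measured ratio depends heavily on the document and the machine, so a
result different from the 30x figure is to be expected.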


2018-12-04 - Norbert Preining <>
html5-parser (0.4.5-1) unstable; urgency=medium
  [ Norbert Preining ]
  * New upstream version 0.4.5
  [ Nicholas D Steeves ]
  * Switch Vcs from alioth to salsa.
  * Add versioning (>= 3.8.0) to python3-lxml Build-Depends.  When
    backporting 0.4.4-1 this dependency was discovered; see and
    docs/index.rst.  Thanks to Chris Lamb in Bug #889854 for recommending
    that this versioned dependency be made explicit. (Closes: #902629)
  * Drop X-Python-Version: >= 2.6, because it is no longer needed.
    Python-2.7.9 has been the default Python since old-stable (jessie).

2018-02-26 - Norbert Preining <>
html5-parser (0.4.4-1) unstable; urgency=medium
  * New upstream version 0.4.4
  * bump standards version, no changes necessary

2017-08-13 - Norbert Preining <>
html5-parser (0.4.3-1) unstable; urgency=medium
  * First release (Closes: #870077)
    thanks to Steve Langasek for python3 support

