python-html5-parser - fast, standards-compliant, C-based HTML 5 parser for Python

Property              Value
Distribution          Debian 10 (Buster)
Repository            Debian Main amd64
Package filename      python-html5-parser_0.4.5-1_amd64.deb
Package name          python-html5-parser
Package version       0.4.5
Package release       1
Package architecture  amd64
Package type          deb
Category              python
License               -
Maintainer            Norbert Preining
Download size         119.45 KB
Installed size        461.00 KB

A fast implementation of the HTML 5 parsing spec for Python. Parsing is
done in C using a variant of the gumbo parser. The gumbo parse tree is
then transformed into an lxml tree, also in C, yielding parse times that
can be a thirtieth of those of html5lib, a speedup of up to 30x.
This differs, for instance, from the gumbo Python bindings, where the
initial parsing is done in C but the transformation into the final
tree is done in Python.
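
As a rough illustration of the description above, the following sketch parses a small document and reads a value from the resulting lxml tree. It assumes the module is importable as html5_parser and exposes a parse() function that returns the root element, as described in the upstream documentation; extract_title is a hypothetical helper introduced only for this example, and the import is guarded so the snippet still loads where the package is not installed.

```python
# Guarded import: html5_parser is assumed to be provided by the
# python-html5-parser package (module name taken from upstream docs).
try:
    from html5_parser import parse
except ImportError:
    parse = None  # package not installed on this system


def extract_title(html):
    """Return the text of the first <title> element, or None if absent.

    Hypothetical helper for illustration; not part of the package.
    """
    if parse is None:
        raise RuntimeError("html5_parser is not installed")
    root = parse(html)               # lxml element for the <html> root
    titles = root.xpath("//title")   # ordinary lxml XPath query
    return titles[0].text if titles else None


if __name__ == "__main__" and parse is not None:
    # Tag soup is acceptable input: the HTML 5 algorithm repairs it.
    print(extract_title("<title>Hello</title><p>Unclosed paragraph"))
```

Because the result is an ordinary lxml tree, the usual lxml calls (xpath, find, iter) should apply to it without any conversion step.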


Package                               Version  Architecture  Repository
python-html5-parser_0.4.5-1_i386.deb  0.4.5    i386          Debian Main
python-html5-parser                   -        -             -


Dependency      Version constraint
libc6           >= 2.14
libxml2         >= 2.7.4
python          << 2.8
python          >= 2.7~
python-chardet  -
python-lxml     -
python:any      << 2.8
python:any      >= 2.7~
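
Taken together, the two python constraints (>= 2.7~ and << 2.8) restrict the package to the Python 2.7 series. A minimal sketch of that check follows; it compares only the major and minor version numbers and deliberately ignores full Debian version semantics such as the tilde and epochs. The function name satisfies_python_dependency is hypothetical.

```python
import sys


def satisfies_python_dependency(version_info):
    """Return True if the interpreter version lies in [2.7, 2.8).

    Simplified: only (major, minor) is compared, which approximates the
    Debian constraints 'python >= 2.7~' and 'python << 2.8'.
    """
    major_minor = tuple(version_info[:2])
    return (2, 7) <= major_minor < (2, 8)


if __name__ == "__main__":
    # Report whether the running interpreter falls in the accepted range.
    print(satisfies_python_dependency(sys.version_info))
```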


Type            URL
Binary Package  python-html5-parser_0.4.5-1_amd64.deb
Source Package  html5-parser

Install Howto

  1. Update the package index:
    $ sudo apt-get update
  2. Install the python-html5-parser deb package:
    $ sudo apt-get install python-html5-parser




Changelog

2018-12-04 - Norbert Preining
html5-parser (0.4.5-1) unstable; urgency=medium
  [ Norbert Preining ]
  * New upstream version 0.4.5
  [ Nicholas D Steeves ]
  * Switch Vcs from alioth to salsa.
  * Add versioning (>= 3.8.0) to python3-lxml Build-Depends.  When
    backporting 0.4.4-1 this dependency was discovered; see and
    docs/index.rst.  Thanks to Chris Lamb in Bug #889854 for recommending
    that this versioned dependency be made explicit. (Closes: #902629)
  * Drop X-Python-Version: >= 2.6, because it is no longer needed.
    Python-2.7.9 has been the default Python since old-stable (jessie).

2018-02-26 - Norbert Preining
html5-parser (0.4.4-1) unstable; urgency=medium
  * New upstream version 0.4.4
  * bump standards version, no changes necessary

2017-08-13 - Norbert Preining
html5-parser (0.4.3-1) unstable; urgency=medium
  * First release (Closes: #870077)
    thanks to Steve Langasek for python3 support

See Also

Package                               Description
python-html5lib_1.0.1-1_all.deb       HTML parser/tokenizer based on the WHATWG HTML5 specification
python-htmlmin_0.1.12-1_all.deb       HTML Minifier
python-htmltmpl_1.22-10.1_all.deb     Templating engine for separation of code and HTML
python-htseq_0.11.2-1_amd64.deb       Python high-throughput genome sequencing read analysis utilities
python-httmock_1.3.0-1_all.deb        Mocking library for python-requests
python-http-parser_0.8.3-3_amd64.deb  http request/response parser
python-httpbin_0.5.0+dfsg-2_all.deb   HTTP request and response service
python-httplib2_0.11.3-2_all.deb      comprehensive HTTP client library written for Python
python-httpretty_0.9.5-3_all.deb      HTTP client mock - Python 2.x
python-humanfriendly_4.18-1_all.deb   Python library to make user friendly text interfaces
python-humanize_0.5.1-3_all.deb       Python Humanize library (Python 2)
python-hunspell_0.5.5-1_amd64.deb     Python 2 binding for Hunspell
python-hupper_1.5-1_all.deb           Integrated process monitor for developing servers (Python 2)
python-hurry.filesize_0.9-2_all.deb   human readable file sizes or anything sized in bytes - Python 2.x
python-hy_0.12.1-2_all.deb            Lisp (s-expression) based frontend to Python