lxml.html.soupparser module
External interface to the BeautifulSoup HTML parser.
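
lxml.html.soupparser._parse_doctype_declaration(string, pos=0, endpos=9223372036854775807)
Matches zero or more characters at the beginning of the string. This is a private helper; the signature and description match those of a compiled regular expression's match method, so the pattern is fixed and is not an argument.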

lxml.html.soupparser.convert_tree(beautiful_soup_tree, makeelement=None)
Convert a BeautifulSoup tree to a list of Element trees.
Returns a list instead of a single root Element to support HTML-like soup with more than one root element.
You can pass a different Element factory through the makeelement keyword.
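A minimal sketch of the multi-root case described above, assuming both lxml and the beautifulsoup4 package (imported as bs4) are installed. The markup string and variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from lxml.html.soupparser import convert_tree

# Two sibling <p> elements and no enclosing <html>: soup with several roots.
soup = BeautifulSoup("<p>first</p><p>second</p>", "html.parser")

# convert_tree returns a list, one lxml Element per top-level element.
roots = convert_tree(soup)

for element in roots:
    print(element.tag, element.text)
```

Because the result is a list, callers that expect a single document root need to pick an element or wrap the list themselves.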

lxml.html.soupparser.fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)
Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
Returns the root <html> Element of the tree.
You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
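As a hedged illustration of fromstring, the sketch below parses a small HTML string with the default settings. It assumes lxml and beautifulsoup4 are installed; the sample markup is made up for the example.

```python
from lxml.html.soupparser import fromstring

# Sample input; the <p> element is deliberately left unclosed.
data = (
    "<html><head><title>Demo</title></head>"
    "<body><p>Hello <b>soup</b></body></html>"
)

# With no keywords, the default BeautifulSoup class and the
# default lxml.html element factory are used.
root = fromstring(data)

print(root.tag)
print(root.findtext(".//title"))
print(root.find(".//b").text)
```

The returned object is an ordinary lxml Element, so the usual find, findtext, and iteration methods are available on it.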
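
lxml.html.soupparser.handle_entities(repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string with the replacement repl. The signature and description match those of a compiled regular expression's sub method, so the pattern is fixed and is not an argument.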
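The (repl, string, count=0) signature is the same as that of a compiled pattern's sub method in Python's re module. The sketch below uses only the standard library to show how such a bound method behaves; the pattern, the replace_named_entities name, and the to_character callback are hypothetical stand-ins, not the module's actual implementation.

```python
import re
from html.entities import name2codepoint

# Hypothetical stand-in: the bound ``sub`` method of a compiled pattern
# that matches named entities such as "&amp;". The real pattern used by
# the module may differ.
replace_named_entities = re.compile(r"&(\w+);").sub


def to_character(match):
    """Return the character for a known entity name, else the original text."""
    name = match.group(1)
    if name in name2codepoint:
        return chr(name2codepoint[name])
    return match.group(0)


text = "Fish &amp; chips &copy; 2024 &bogus;"

# count=0 (the default) replaces every occurrence.
converted = replace_named_entities(to_character, text)

# count=1 replaces only the leftmost occurrence.
first_only = replace_named_entities(to_character, text, count=1)

print(converted)
print(first_only)
```

Passing a callable as repl lets each match decide its own replacement, which is why unknown names can be left untouched.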

lxml.html.soupparser.parse(file, beautifulsoup=None, makeelement=None, **bsargs)
Parse a file into an ElementTree using the BeautifulSoup parser.
You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
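A short sketch of parse, assuming lxml and beautifulsoup4 are installed. An in-memory io.StringIO object stands in for a real file here so that the example is self-contained; the markup is illustrative.

```python
import io

from lxml.html.soupparser import parse

# A file-like object with a read() method stands in for an open file.
source = io.StringIO(
    "<html><body><p>From a file-like object</p></body></html>"
)

# parse returns an ElementTree; getroot() gives the root Element.
tree = parse(source)
root = tree.getroot()

print(root.tag)
print(root.findtext(".//p"))
```

Unlike fromstring, which returns the root Element directly, parse wraps the result in an ElementTree, so getroot() is needed to reach the elements.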