lxml.html.soupparser module
External interface to the BeautifulSoup HTML parser.
- lxml.html.soupparser._parse_doctype_declaration(string, pos=0, endpos=9223372036854775807)
Matches zero or more characters at the beginning of the string.
- lxml.html.soupparser.convert_tree(beautiful_soup_tree, makeelement=None)[source]
Convert a BeautifulSoup tree to a list of Element trees.
Returns a list instead of a single root Element to support HTML-like soup with more than one root element.
You can pass a different Element factory through the makeelement keyword.
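As a rough illustration of the multi-root case, the sketch below assumes the `beautifulsoup4` package is installed and builds the soup with Python's built-in `html.parser`. The sample markup and variable names are made up for this example.

```python
from bs4 import BeautifulSoup
from lxml.html import soupparser

# Two sibling paragraphs with no enclosing <html>: a soup with two roots.
soup = BeautifulSoup("<p>first</p><p>second</p>", "html.parser")

# convert_tree returns a list of Elements rather than a single root.
roots = soupparser.convert_tree(soup)
root_tags = [element.tag for element in roots]
print(root_tags)
```

Each item in the returned list is a detached Element, so the caller decides how to attach them to a larger tree.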
- lxml.html.soupparser.fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)[source]
Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
Returns the root <html> Element of the tree.
You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
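A minimal sketch of typical use, assuming `beautifulsoup4` is installed and the default parser and Element factory are left in place. The unclosed markup is an invented sample, chosen because tolerating such input is the usual reason to reach for this parser.

```python
from lxml.html import soupparser

# Deliberately unclosed tags; the sample string is illustrative only.
broken_html = "<p>Hello <b>world"

# fromstring returns the root Element, whose tag is "html".
root = soupparser.fromstring(broken_html)
bold = root.find(".//b")
print(root.tag)
```

The result is an ordinary lxml Element, so `find`, `findall`, and XPath queries work on it as they would on a tree from lxml's own parsers.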
- lxml.html.soupparser.handle_entities(repl, string, count=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
- lxml.html.soupparser.parse(file, beautifulsoup=None, makeelement=None, **bsargs)[source]
Parse a file into an ElementTree using the BeautifulSoup parser.
You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
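The following sketch shows one way `parse` might be called, assuming `beautifulsoup4` is installed. An in-memory `io.StringIO` stands in for a real file so the example is self-contained. The markup is a made-up sample.

```python
import io

from lxml.html import soupparser

# A file-like object standing in for an HTML file on disk.
source = io.StringIO("<html><body><p>hi</p></body></html>")

# parse returns an ElementTree; getroot() gives the root Element.
tree = soupparser.parse(source)
root = tree.getroot()
paragraphs = root.findall(".//p")
print(root.tag, len(paragraphs))
```

Unlike `fromstring`, the return value here is the tree wrapper, so a call to `getroot()` is needed before working with elements.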