Module soupparser

External interface to the BeautifulSoup HTML parser.
Classes
  _PseudoTag
Functions
 
fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)
Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
 
parse(file, beautifulsoup=None, makeelement=None, **bsargs)
Parse a file into an ElementTree using the BeautifulSoup parser.
 
convert_tree(beautiful_soup_tree, makeelement=None)
Convert a BeautifulSoup tree to a list of Element trees.
 
_parse(source, beautifulsoup, makeelement, **bsargs)
 
_parse_doctype_declaration(...)
match(string[, pos[, endpos]]) --> match object or None. Matches zero or more characters at the beginning of the string
 
_convert_tree(beautiful_soup_tree, makeelement)
 
_init_node_converters(makeelement)
 
handle_entities(...)
sub(repl, string[, count = 0]) --> newstring Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
 
unichr(i) -> character
Return a string of one character with ordinal i; 0 <= i < 256.
 
unescape(string)
Variables
  _DECLARATION_OR_DOCTYPE = (<class 'bs4.element.Declaration'>, ...
  __package__ = 'lxml.html'
Function Details

fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)

Parse a string of HTML data into an Element tree using the BeautifulSoup parser.

Returns the root <html> Element of the tree.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
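
A minimal sketch of calling fromstring on malformed markup, assuming lxml and the bs4 package are both installed. The sample markup and variable names are illustrative only, and the exact shape of the repaired tree may vary with the installed BeautifulSoup version.

```python
from lxml.html import soupparser

# Deliberately sloppy markup: the <b> tag is never closed.
broken_html = "<html><body><p>Hello <b>world</p></body></html>"

# Parse with the default BeautifulSoup class and the default
# lxml.html element factory.
root = soupparser.fromstring(broken_html)

# As described above, the result is the root <html> element.
print(root.tag)

# With the default factory the elements come from lxml.html,
# so helpers such as text_content() are available.
print(root.text_content())
```

The returned object is an ordinary lxml element, so the usual lxml tools such as find() or xpath() can be applied to it afterwards.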

parse(file, beautifulsoup=None, makeelement=None, **bsargs)

Parse a file into an ElementTree using the BeautifulSoup parser.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
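
A short sketch of parse reading from a file-like object, assuming lxml and bs4 are installed. An in-memory io.StringIO stands in for a real file here, and the sample document is made up for illustration.

```python
import io

from lxml.html import soupparser

# An in-memory stand-in for an HTML file on disk (illustrative content).
html_file = io.StringIO(
    "<html><head><title>Demo</title></head><body><p>Hi</body></html>"
)

# parse() returns an ElementTree rather than a bare element,
# so call getroot() to reach the root element.
tree = soupparser.parse(html_file)
root = tree.getroot()

print(root.tag)
print(root.findtext(".//title"))
```

Compared with fromstring, the extra ElementTree wrapper is useful when the document as a whole is needed, for example to serialize it again.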

convert_tree(beautiful_soup_tree, makeelement=None)

Convert a BeautifulSoup tree to a list of Element trees.

Returns a list instead of a single root Element to support HTML-like soup with more than one root element.

You can pass a different Element factory through the makeelement keyword.
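
To illustrate the list return value, here is a sketch that converts an existing BeautifulSoup tree with two top-level elements. It assumes lxml and bs4 are installed; the fragment and the choice of the "html.parser" backend are illustrative assumptions, not requirements of convert_tree.

```python
from bs4 import BeautifulSoup

from lxml.html import soupparser

# A soup with two root-level elements and no enclosing <html>.
soup = BeautifulSoup("<p>first</p><p>second</p>", "html.parser")

# convert_tree() returns a list, one entry per root element.
roots = soupparser.convert_tree(soup)

for element in roots:
    print(element.tag, element.text)
```

This route is handy when the BeautifulSoup tree has already been built or modified elsewhere and only the conversion to lxml elements is needed.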


Variables Details

_DECLARATION_OR_DOCTYPE

Value:
(<class 'bs4.element.Declaration'>, <class 'bs4.element.Doctype'>)