
Module soupparser

External interface to the BeautifulSoup HTML parser.
Functions
 
fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)
Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
 
parse(file, beautifulsoup=None, makeelement=None, **bsargs)
Parse a file into an ElementTree using the BeautifulSoup parser.
 
convert_tree(beautiful_soup_tree, makeelement=None)
Convert a BeautifulSoup tree to a list of Element trees.
 
_parse(source, beautifulsoup, makeelement, **bsargs)
 
_convert_tree(beautiful_soup_tree, makeelement)
 
_convert_children(parent, beautiful_soup_tree, makeelement)
 
_append_text(parent, element, text)
 
handle_entities(...)
sub(repl, string[, count = 0]) --> newstring
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
 
unescape(string)
Variables
  __doc__ = """External interface to the BeautifulSoup HTML pars...
  __package__ = 'lxml.html'
Function Details

fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)

Parse a string of HTML data into an Element tree using the BeautifulSoup parser.

Returns the root <html> Element of the tree.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
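
A minimal sketch of calling fromstring on malformed markup, assuming lxml and a BeautifulSoup package are installed. The sample markup and the variable names are illustrative only and are not part of the module.

```python
from lxml.html import soupparser

# Deliberately broken markup (unclosed tags); illustrative only.
broken_html = "<p>Hello <b>world"

# fromstring() hands the string to BeautifulSoup and converts the result.
# Per the description above, the returned Element is the <html> root.
root = soupparser.fromstring(broken_html)

# Collect the text of the whole tree using the standard Element API.
text = "".join(root.itertext())
print(root.tag, repr(text))
```

The beautifulsoup and makeelement keywords are left at their defaults here, so the standard BeautifulSoup class and the lxml.html factory are used.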

parse(file, beautifulsoup=None, makeelement=None, **bsargs)

Parse a file into an ElementTree using the BeautifulSoup parser.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
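
A hedged sketch of parse, assuming lxml and a BeautifulSoup package are installed. It uses an in-memory file-like object in place of a real file; that parse accepts any object with a read() method is an assumption not stated in the text above, and the sample document is made up.

```python
import io

from lxml.html import soupparser

# A file-like object stands in for a real file (assumption: parse()
# accepts an object with a read() method as well as a file name).
source = io.StringIO(
    "<html><head><title>Demo</title></head><body><p>text</body></html>"
)

tree = soupparser.parse(source)    # an ElementTree wrapping the document
root = tree.getroot()              # the root Element of that tree
title = tree.findtext(".//title")  # look up the <title> text
print(root.tag, title)
```

Unlike fromstring, which returns the root Element directly, parse returns the enclosing ElementTree, so getroot() is needed to reach the element.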

convert_tree(beautiful_soup_tree, makeelement=None)

Convert a BeautifulSoup tree to a list of Element trees.

Returns a list instead of a single root Element to support HTML-like soup with more than one root element.

You can pass a different Element factory through the makeelement keyword.
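
A minimal sketch of convert_tree on a fragment with two top-level elements. It assumes that the installed BeautifulSoup is the bs4 package and that its built-in "html.parser" backend is used; the fragment and variable names are illustrative.

```python
from bs4 import BeautifulSoup  # assumption: bs4 is the installed BeautifulSoup
from lxml.html import soupparser

# A fragment with two top-level elements and no single root.
soup = BeautifulSoup("<p>one</p><div>two</div>", "html.parser")

# convert_tree() returns a list, so both top-level elements are kept.
elements = soupparser.convert_tree(soup)
tags = [el.tag for el in elements]
print(tags)
```

This is useful when the soup was already built or modified with BeautifulSoup and only the conversion to lxml Elements is needed.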


Variables Details

__doc__

Value:
"""External interface to the BeautifulSoup HTML parser.
"""