|
bytes
str(object='') -> string
|
|
unicode
str(object='') -> string
|
|
HtmlMixin
|
|
_MethodFunc
An object that represents a method on an element as a function;
the function takes either an element or an HTML string. It
returns whatever the function normally returns, or if the function
works in-place (and so returns None) it returns a serialized form
of the resulting document.
|
|
HtmlComment
|
|
HtmlElement
|
|
HtmlProcessingInstruction
|
|
HtmlEntity
|
|
HtmlElementClassLookup
A lookup scheme for HTML Element classes.
|
|
FormElement
Represents a <form> element.
|
|
FieldsDict
|
|
InputGetter
An accessor that represents all the input fields in a form.
|
|
InputMixin
Mix-in for all input elements (input, select, and textarea)
|
|
TextareaElement
<textarea> element. You can get the name with .name and
get/set the value with .value
|
|
SelectElement
<select> element. You can get the name with .name.
|
|
MultipleSelectOptions
Represents all the selected options in a <select multiple> element.
|
|
RadioGroup
This object represents several <input type=radio> elements
that have the same name.
|
|
CheckboxGroup
Represents a group of checkboxes (<input type=checkbox>) that
have the same name.
|
|
CheckboxValues
Represents the values of the checked checkboxes in a group of
checkboxes with the same name.
|
|
InputElement
Represents an <input> element.
|
|
LabelElement
Represents a <label> element.
|
|
HTMLParser
An HTML parser that is configured to return lxml.html Element
objects.
|
|
XHTMLParser
An XML parser that is configured to return lxml.html Element
objects.
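The form-related classes listed above (FormElement, FieldsDict, InputGetter, TextareaElement, SelectElement) are usually reached through a parsed form rather than instantiated directly. The following is a minimal sketch, assuming a small hand-written form; the markup and field names are made up for illustration.

```python
import lxml.html

# Hypothetical form markup, used only for this illustration.
form = lxml.html.fromstring(
    '<form action="/search" method="get">'
    '<input type="text" name="q" value="old">'
    '<textarea name="notes">hello</textarea>'
    '<select name="lang">'
    '<option value="en">English</option>'
    '<option value="de" selected>German</option>'
    '</select>'
    '</form>'
)

# form.fields is a FieldsDict: a dict-like view of field names and values.
field_names = sorted(form.fields.keys())
form.fields["q"] = "new"          # writes through to the <input> element

# form.inputs is an InputGetter: it returns the element (or group) itself.
query = form.inputs["q"].value
notes = form.inputs["notes"].value
lang = form.inputs["lang"].value
lang_options = form.inputs["lang"].value_options
```

Here `form.fields` gives values and `form.inputs` gives the underlying elements, which is useful when you need attributes such as `value_options` on a select.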
|
|
|
|
_iter_css_urls(...)
finditer(string[, pos[, endpos]]) --> iterator.
Return an iterator over all non-overlapping matches for the
RE pattern in string. For each match, the iterator returns a
match object. |
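The entry above is the bound `finditer` method of a compiled regular expression. As a rough sketch of how such a method behaves, the example below uses a hypothetical pattern for CSS `url(...)` references; it is an assumption for illustration and not the pattern lxml itself compiles.

```python
import re

# Hypothetical pattern (NOT lxml's own): captures the target of url(...).
css_url_re = re.compile(r"url\(\s*['\"]?([^'\")\s]+)['\"]?\s*\)", re.I)

css = "body { background: url('a.png'); } h1 { background: url(b.png) }"

# finditer yields one match object per non-overlapping match, left to right.
matches = list(css_url_re.finditer(css))
urls = [m.group(1) for m in matches]
positions = [m.start(1) for m in matches]
```

Each match object also carries the position of the captured group, which is what makes in-place rewriting of the matched URLs possible.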
|
|
|
_iter_css_imports(...)
finditer(string[, pos[, endpos]]) --> iterator.
Return an iterator over all non-overlapping matches for the
RE pattern in string. For each match, the iterator returns a
match object. |
|
|
|
_parse_meta_refresh_url(...)
search(string[, pos[, endpos]]) --> match object or None.
Scan through string looking for a match, and return a corresponding
match object instance. Return None if no position in the string matches. |
|
|
|
|
|
_transform_result(typ,
result)
Convert the result back into the input type. |
|
|
|
|
|
_looks_like_full_html_unicode(...)
match(string[, pos[, endpos]]) --> match object or None.
Matches zero or more characters at the beginning of the string |
|
|
|
_looks_like_full_html_bytes(...)
match(string[, pos[, endpos]]) --> match object or None.
Matches zero or more characters at the beginning of the string |
|
|
|
document_fromstring(html,
parser=None,
ensure_head_body=False,
**kw) |
|
|
|
fragments_fromstring(html,
no_leading_text=False,
base_url=None,
parser=None,
**kw)
Parses several HTML elements, returning a list of elements. |
|
|
|
fragment_fromstring(html,
create_parent=False,
base_url=None,
parser=None,
**kw)
Parses a single HTML element; it is an error if there is more than
one element, or if anything but whitespace precedes or follows the
element. |
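The difference between `fragments_fromstring` and `fragment_fromstring` is easiest to see side by side. This is a small sketch with made-up input strings; it relies on `fragment_fromstring` raising `lxml.etree.ParserError` when it finds more than one element.

```python
from lxml import etree
from lxml.html import fragment_fromstring, fragments_fromstring

two_paragraphs = "<p>one</p><p>two</p>"

# Several top-level elements: returned as a list.
parts = fragments_fromstring(two_paragraphs)
part_tags = [el.tag for el in parts]

# Exactly one element: returned directly.
single = fragment_fromstring("<p>one</p>")

# More than one element is an error unless a parent is created.
try:
    fragment_fromstring(two_paragraphs)
    raised = False
except etree.ParserError:
    raised = True

# create_parent=True wraps the fragments in a new <div>.
wrapped = fragment_fromstring(two_paragraphs, create_parent=True)
```

Using `create_parent=True` is a convenient way to accept arbitrary fragments while still getting a single element back.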
|
|
|
fromstring(html,
base_url=None,
parser=None,
**kw)
Parse the html, returning a single element/document. |
|
|
|
parse(filename_or_url,
parser=None,
base_url=None,
**kw)
Parse a filename, URL, or file-like object into an HTML document
tree. Note: this returns a tree, not an element. Use
parse(...).getroot() to get the document root. |
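To illustrate the note about trees versus elements, here is a short sketch that parses from an in-memory file-like object; the document content is invented for the example, and a real call would more often pass a filename or URL.

```python
import io
import lxml.html

# A file-like object stands in for a filename or URL in this sketch.
source = io.BytesIO(
    b"<html><head><title>Demo</title></head><body><p>hi</p></body></html>"
)

tree = lxml.html.parse(source)   # an element tree, not an element
root = tree.getroot()            # the <html> element

title = root.find(".//title").text
```

By contrast, `fromstring` returns an element directly, so no `getroot()` call is needed there.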
|
|
|
|
|
|
|
submit_form(form,
extra_values=None,
open_http=None)
Helper function to submit a form. Returns a file-like object, as from
urllib.urlopen(). This object also has a .geturl() function,
which shows the URL if there were any redirects. |
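The `open_http` parameter makes it possible to try `submit_form` without touching the network. The sketch below assumes the callable receives the form method, the target URL, and the list of (name, value) pairs; the stand-in function `fake_open_http` and the example URL are hypothetical.

```python
import lxml.html

form = lxml.html.fromstring(
    '<form action="http://example.com/search" method="post">'
    '<input type="text" name="q" value="lxml">'
    '</form>'
)

captured = {}

def fake_open_http(method, url, values):
    """Hypothetical stand-in for the default opener: records the request."""
    captured["method"] = method
    captured["url"] = url
    captured["values"] = list(values)
    return None  # a real opener would return a file-like response

# extra_values are appended to the values collected from the form itself.
lxml.html.submit_form(form, extra_values={"page": "2"}, open_http=fake_open_http)
```

Swapping in a custom opener like this is also how you would route the request through a different HTTP client.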
|
|
|
|
|
html_to_xhtml(html)
Convert all tags in an HTML tree to XHTML by moving them to the
XHTML namespace. |
|
|
|
xhtml_to_html(xhtml)
Convert all tags in an XHTML tree to HTML by removing their
XHTML namespace. |
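Both conversion functions modify the tree in place. A minimal round-trip sketch, using an invented one-paragraph document:

```python
from lxml.html import (
    XHTML_NAMESPACE,
    document_fromstring,
    html_to_xhtml,
    xhtml_to_html,
)

doc = document_fromstring("<p>hi</p>")

# Move every tag into the XHTML namespace (in place).
html_to_xhtml(doc)
tags_as_xhtml = [el.tag for el in doc.iter()]

# Strip the namespace again (in place).
xhtml_to_html(doc)
tags_as_html = [el.tag for el in doc.iter()]

prefix = "{%s}" % XHTML_NAMESPACE
```

After `html_to_xhtml`, tags are stored in lxml's `{namespace}name` form, so plain tag lookups such as `find("body")` no longer match until the namespace is removed again.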
|
|
|
__str_replace_meta_content_type(...)
sub(repl, string[, count = 0]) --> newstring
Return the string obtained by replacing the leftmost non-overlapping
occurrences of pattern in string by the replacement repl. |
|
|
|
__bytes_replace_meta_content_type(...)
sub(repl, string[, count = 0]) --> newstring
Return the string obtained by replacing the leftmost non-overlapping
occurrences of pattern in string by the replacement repl. |
|
|
|
tostring(doc,
pretty_print=False,
include_meta_content_type=False,
encoding=None,
method='html',
with_tail=True,
doctype=None)
Return an HTML string representation of the document. |
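A short sketch of how the `encoding` and `method` parameters affect the result, using an invented fragment:

```python
import lxml.html

el = lxml.html.fromstring("<p>hi<br>there</p>")

# Default: a byte string, serialized with HTML rules.
as_bytes = lxml.html.tostring(el)

# encoding="unicode" returns a text string instead of bytes.
as_text = lxml.html.tostring(el, encoding="unicode")

# method="xml" serializes empty elements in XML style.
as_xml = lxml.html.tostring(el, method="xml", encoding="unicode")
```

The HTML method writes void elements such as `<br>` without a closing slash, while the XML method writes them as `<br/>`.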
|
|
|
open_in_browser(doc,
encoding=None)
Open the HTML document in a web browser, saving it to a temporary
file to open it. Note that this does not delete the file after
use. This is mainly meant for debugging. |
|
|
|
|
|
basestring = str, bytes
|
|
XHTML_NAMESPACE = 'http://www.w3.org/1999/xhtml'
|
|
_rel_links_xpath = descendant-or-self::a[@rel]|descendant-or-s...
|
|
_options_xpath = descendant-or-self::option|descendant-or-self...
|
|
_forms_xpath = descendant-or-self::form|descendant-or-self::x:...
|
|
_class_xpath = descendant-or-self::*[@class and contains(conca...
|
|
_id_xpath = descendant-or-self::*[@id=$id]
|
|
_collect_string_content = string()
|
|
_label_xpath = //label[@for=$id]|//x:label[@for=$id]
|
|
_archive_re = re.compile(r'[^ ]+')
|
|
find_rel_links = _MethodFunc('find_rel_links', copy=False)
|
|
find_class = _MethodFunc('find_class', copy=False)
|
|
make_links_absolute = _MethodFunc('make_links_absolute', copy=...
|
|
resolve_base_href = _MethodFunc('resolve_base_href', copy=True)
|
|
iterlinks = _MethodFunc('iterlinks', copy=False)
|
|
rewrite_links = _MethodFunc('rewrite_links', copy=True)
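The _MethodFunc variables above expose element methods as module-level functions that also accept an HTML string. The sketch below passes a string each time; the markup and base URL are invented for illustration.

```python
import lxml.html

html = '<div><p class="note">see <a href="/docs">docs</a></p></div>'

# Methods that return a value pass it through unchanged.
notes = lxml.html.find_class(html, "note")
links = [link for (_el, _attr, link, _pos) in lxml.html.iterlinks(html)]

# In-place methods return None on an element, so the function form
# returns the serialized document instead (a string, matching the input type).
absolute = lxml.html.make_links_absolute(html, "http://example.com/")
```

Passing the base URL positionally keeps it as an argument to the method itself, which keeps this sketch independent of how keyword arguments are forwarded when the string is parsed.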
|
|
html_parser = HTMLParser()
|
|
xhtml_parser = XHTMLParser()
|
|
__package__ = 'lxml.html'
|