HtmlMixin
_MethodFunc: An object that represents a method on an element as a function; the function takes either an element or an HTML string.
HtmlComment
HtmlElement
HtmlProcessingInstruction
HtmlEntity
HtmlElementClassLookup: A lookup scheme for HTML Element classes.
FormElement: Represents a <form> element.
FieldsDict
InputGetter: An accessor that represents all the input fields in a form.
InputMixin: Mix-in for all input elements (input, select, and textarea).
TextareaElement: <textarea> element.
SelectElement: <select> element.
MultipleSelectOptions: Represents all the selected options in a <select multiple> element.
RadioGroup: This object represents several <input type=radio> elements that have the same name.
CheckboxGroup: Represents a group of checkboxes (<input type=checkbox>) that have the same name.
CheckboxValues: Represents the values of the checked checkboxes in a group of checkboxes with the same name.
InputElement: Represents an <input> element.
LabelElement: Represents a <label> element.
HTMLParser
XHTMLParser
XHTML_NAMESPACE =
_rel_links_xpath = descendant-or-self::a[@rel]|descendant-or-s
_options_xpath = descendant-or-self::option|descendant-or-self
_forms_xpath = descendant-or-self::form|descendant-or-self::x:
_class_xpath = descendant-or-self::*[@class and contains(conca
_id_xpath = descendant-or-self::*[@id=$id]
_collect_string_content = string()
_css_url_re = re.compile(r'
_css_import_re = re.compile(r'@import "
_label_xpath = //label[@for=$id]|//x:label[@for=$id]
_archive_re = re.compile(r'
find_rel_links = _MethodFunc('find_rel_links', copy=False)
find_class = _MethodFunc('find_class', copy=False)
make_links_absolute = _MethodFunc('make_links_absolute', copy=
resolve_base_href = _MethodFunc('resolve_base_href', copy=True)
iterlinks = _MethodFunc('iterlinks', copy=False)
rewrite_links = _MethodFunc('rewrite_links', copy=True)
html_parser = HTMLParser()
xhtml_parser = XHTMLParser()
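The module-level names bound to _MethodFunc (find_class, iterlinks, rewrite_links, and so on) can be called as plain functions, with either an HTML string or an element as the first argument. A minimal sketch, assuming Python 3 and an installed lxml; the markup and the add_prefix helper are made up for illustration:

```python
from lxml import html

markup = '<div><p class="intro">Hi</p><a href="about.html">About</a></div>'

def add_prefix(href):
    # Hypothetical rewrite rule, used only for this illustration.
    return 'http://example.com/' + href

# String input: the string is parsed first, then the method is applied.
intro_texts = [el.text for el in html.find_class(markup, 'intro')]
rewritten_markup = html.rewrite_links(markup, add_prefix)

# Element input: rewrite_links is declared with copy=True, so it works on
# a copy and the element passed in keeps its original href.
doc = html.fromstring(markup)
rewritten_doc = html.rewrite_links(doc, add_prefix)
original_href = doc.find('.//a').get('href')
rewritten_href = rewritten_doc.find('.//a').get('href')
```

When a string goes in and the method itself returns nothing (as with rewrite_links), the function hands back the modified markup as a string rather than an element.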
fragments_fromstring: Parses several HTML elements, returning a list of elements. The first item in the list may be a string (though leading whitespace is removed). If no_leading_text is true, leading text is an error and the result is always a list of elements only. base_url sets the document's base_url attribute (and the tree's docinfo.URL).
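A short sketch of the behaviour described above, using lxml.html.fragments_fromstring under Python 3; the sample markup is invented for illustration:

```python
from lxml import etree, html

parts = html.fragments_fromstring('Hello <b>bold</b> and <i>italic</i>')
leading_text = parts[0]                # a plain string, not an element
tags = [el.tag for el in parts[1:]]    # the remaining items are elements

# With no_leading_text=True the result holds elements only ...
only_elements = html.fragments_fromstring(
    '<p>one</p><p>two</p>', no_leading_text=True)

# ... and leading text is reported as a parser error.
try:
    html.fragments_fromstring('Hello <b>bold</b>', no_leading_text=True)
    leading_text_rejected = False
except etree.ParserError:
    leading_text_rejected = True
```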
fragment_fromstring: Parses a single HTML element; it is an error if there is more than one element, or if anything but whitespace precedes or follows the element. If create_parent is true (or is a tag name), a parent node is created to wrap the HTML in a single element. base_url sets the document's base_url attribute (and the tree's docinfo.URL).
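The single-element rule and the create_parent option can be sketched as follows, assuming Python 3 and lxml.html.fragment_fromstring; the markup is illustrative only:

```python
from lxml import etree, html

# Exactly one element: returned as-is.
single = html.fragment_fromstring('<p>Hello <b>world</b></p>')

# Text plus an element is only accepted when a parent is created;
# passing a tag name chooses the wrapper element.
wrapped = html.fragment_fromstring('Hello <b>world</b>', create_parent='section')

# More than one top-level element without create_parent is an error.
try:
    html.fragment_fromstring('<p>one</p><p>two</p>')
    multiple_rejected = False
except etree.ParserError:
    multiple_rejected = True
```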
fromstring: Parse the HTML, returning a single element/document. This tries to minimally parse the chunk of text, without knowing whether it is a fragment or a document. base_url sets the document's base_url attribute (and the tree's docinfo.URL).
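To illustrate the "fragment or document" guess, here is a small sketch with lxml.html.fromstring under Python 3. The inputs are invented, and the wrapper tag in the last case reflects current lxml behaviour rather than anything stated on this page:

```python
from lxml import html

# A complete document comes back as its <html> root.
full = html.fromstring(
    '<html><head><title>T</title></head><body><p>x</p></body></html>')

# A lone element comes back as that element.
single = html.fromstring('<p>one</p>')

# Several top-level elements are wrapped in a made-up container
# so that a single element can still be returned.
several = html.fromstring('<p>one</p><p>two</p>')
child_tags = [el.tag for el in several]
```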
parse: Parse a filename, URL, or file-like object into an HTML document tree. Note: this returns a tree, not an element. Use parse(...).getroot() to get the document root. You can override the base URL with the base_url keyword. This is most useful when parsing from a file-like object.
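A sketch of parsing from a file-like object with an explicit base_url, assuming Python 3 and lxml.html.parse; the in-memory document and the example.com URL are placeholders:

```python
import io

from lxml import html

source = io.BytesIO(
    b'<html><body><a href="page.html">link</a></body></html>')

# parse() returns a tree; getroot() gives the <html> element.
tree = html.parse(source, base_url='http://example.com/docs/')
root = tree.getroot()

# The base URL supplied above is what relative links resolve against.
root.make_links_absolute(root.base_url)
hrefs = [a.get('href') for a in root.iter('a')]
```

A file-like object carries no location of its own, which is why supplying base_url matters most in this case.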
submit_form: Helper function to submit a form. Returns a file-like object, as from urllib.urlopen(). This object also has a .geturl() function, which shows the URL if there were any redirects.

You can use this like:

    form = doc.forms[0]
    form.inputs['foo'].value = 'bar' # etc
    response = submit_form(form)
    doc = parse(response).getroot()
    doc.make_links_absolute(response.geturl())

To change the HTTP requester, pass a function as the open_http keyword argument that opens the URL for you. The function must have the following signature:

    open_http(method, URL, values)

The method is one of 'GET' or 'POST', the URL is the target URL as a string, and the values are a sequence of (name, value) tuples with the form data.
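The open_http hook can be exercised without any network access. The sketch below assumes Python 3 and lxml.html.submit_form; recording_open_http is a hypothetical stand-in that records the request instead of sending it, and the form markup is invented:

```python
from lxml import html

doc = html.fromstring(
    '<form action="http://example.com/search" method="post">'
    '<input type="text" name="q" value="">'
    '</form>'
)
form = doc.forms[0]
form.inputs['q'].value = 'lxml'

calls = []

def recording_open_http(method, url, values):
    # Hypothetical replacement for a real HTTP client: it stores the
    # request details and returns a placeholder instead of a response.
    calls.append((method, url, list(values)))
    return 'placeholder response'

response = html.submit_form(form, open_http=recording_open_http)
```

A real replacement would return a file-like response object; the string here only keeps the sketch self-contained.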
tostring: Return an HTML string representation of the document.

Note: the 'include_meta_content_type' argument exists purely for compatibility and does not serve any purpose.

The encoding argument controls the output encoding (defaults to ASCII, with &#...; character references for any characters outside of ASCII).

The method argument defines the output method. It defaults to 'html', but can also be 'xml' for XHTML output, or 'text' to serialise to plain text without markup. Note that you can pass the builtin unicode type as the encoding argument to serialise to a unicode string.

Example:

    >>> from lxml import html
    >>> root = html.fragment_fromstring('<p>Hello<br>world!</p>')
    >>> html.tostring(root)
    '<p>Hello<br>world!</p>'
    >>> html.tostring(root, method='html')
    '<p>Hello<br>world!</p>'
    >>> html.tostring(root, method='xml')
    '<p>Hello<br/>world!</p>'
    >>> html.tostring(root, method='text')
    'Helloworld!'
    >>> html.tostring(root, method='text', encoding=unicode)
    u'Helloworld!'
Element: Create a new HTML Element. This can also be used for XHTML documents.
Generated by Epydoc 3.0 on Wed Jan 7 20:35:47 2009 | http://epydoc.sourceforge.net |