Instances clean the document of each of the possible offending
elements. The cleaning is controlled by attributes; you can
override attributes in a subclass, or set them in the constructor.
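The two ways of configuring an instance described above can be sketched with a small stand-in class. `SketchCleaner` and `StyleStrippingCleaner` are hypothetical names used only for illustration; this is not the library's own class, and only the attribute names `scripts`, `style`, and `add_nofollow` come from this page.

```python
class SketchCleaner:
    """Hypothetical stand-in that mimics attribute-driven configuration."""

    # Class-level defaults, mirroring a few of the documented attributes.
    scripts = True
    style = False
    add_nofollow = False

    def __init__(self, **options):
        # Accept only names that already exist as class attributes,
        # so a misspelled option fails loudly instead of being ignored.
        for name, value in options.items():
            if not hasattr(type(self), name):
                raise TypeError(f"Unknown option: {name}={value!r}")
            setattr(self, name, value)


class StyleStrippingCleaner(SketchCleaner):
    # Override a default for every instance of the subclass.
    style = True


# Override a default for a single instance through the constructor.
per_instance = SketchCleaner(add_nofollow=True)
per_subclass = StyleStrippingCleaner()
```

A subclass suits a policy that is reused in many places, while constructor arguments suit one-off adjustments.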
allow_follow(self, anchor)
    Override to suppress rel="nofollow" on some anchors.
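As a rough illustration of how such a hook might be consulted, the sketch below uses the standard library's `xml.etree.ElementTree` instead of the library's own element classes. The free functions `allow_follow` and `add_nofollow`, the `TRUSTED_HOSTS` set, and the host names are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical set of hosts whose links should stay followable.
TRUSTED_HOSTS = {"example.org"}


def allow_follow(anchor):
    """Return True when rel="nofollow" should NOT be added to this anchor."""
    host = urlsplit(anchor.get("href", "")).hostname
    return host in TRUSTED_HOSTS


def add_nofollow(root):
    """Add rel="nofollow" to every anchor that allow_follow does not exempt."""
    for anchor in root.iter("a"):
        if allow_follow(anchor):
            continue
        rel = anchor.get("rel", "").split()
        if "nofollow" not in rel:
            rel.append("nofollow")
        anchor.set("rel", " ".join(rel))


root = ET.fromstring(
    '<div><a href="http://example.org/a">trusted</a>'
    '<a href="http://spam.example/b">other</a></div>'
)
add_nofollow(root)
```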
allow_embedded_url(self, el, url)
    Decide whether a URL that was found in an element's attributes or text
    is configured to be accepted or rejected.
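One plausible reading of this decision, assuming it is driven by the `host_whitelist` and `whitelist_tags` attributes listed on this page, is sketched below. The function takes a tag name instead of an element to stay self-contained, and the scheme check and the host names are assumptions of this example.

```python
from urllib.parse import urlsplit


def allow_embedded_url(tag, url, host_whitelist=(),
                       whitelist_tags=("embed", "iframe")):
    """Sketch: accept a URL only for whitelisted tags and whitelisted hosts."""
    # Only tags named in whitelist_tags may carry an embedded URL at all.
    if whitelist_tags is not None and tag not in whitelist_tags:
        return False
    parts = urlsplit(url)
    # Assumption: anything other than plain web URLs is rejected.
    if parts.scheme not in ("http", "https"):
        return False
    return parts.hostname in host_whitelist


# Hypothetical host used only for the demonstration.
hosts = ("video.example",)
accepted = allow_embedded_url("iframe", "https://video.example/x", hosts)
rejected_host = allow_embedded_url("iframe", "https://other.example/x", hosts)
rejected_tag = allow_embedded_url("object", "https://video.example/x", hosts)
```

With the default empty `host_whitelist`, this sketch rejects every URL.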
kill_conditional_comments(self, doc)
    IE conditional comments basically embed HTML that the parser
    doesn't normally see. We can't allow anything like that, so
    we'll kill any comments that could be conditional.
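The idea can be illustrated on a raw string with two regular expressions. This is a minimal sketch that works on text instead of a parsed tree; both patterns and the string-based function are assumptions of this example, not the library's implementation.

```python
import re

# Matches a whole HTML comment and captures its body.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.S)
# Heuristic for a conditional-comment opener such as "[if lt IE 9]>".
CONDITIONAL_RE = re.compile(r"\[if\s+.*?\]\s*>", re.I | re.S)


def kill_conditional_comments(html):
    """Drop comments whose body looks conditional; keep ordinary comments."""
    def replace(match):
        if CONDITIONAL_RE.search(match.group(1)):
            return ""
        return match.group(0)
    return COMMENT_RE.sub(replace, html)


sample = (
    "<p>kept</p><!-- plain note -->"
    "<!--[if lt IE 9]><script>alert(1)</script><![endif]-->"
)
cleaned = kill_conditional_comments(sample)
```

The script hidden inside the conditional comment is removed together with the comment, while the ordinary comment is left alone.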
_kill_elements(self, doc, condition, iterate=None)
_substitute_comments(...)
    sub(repl, string[, count = 0]) --> newstring
    Return the string obtained by replacing the leftmost non-overlapping
    occurrences of pattern in string by the replacement repl.
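The docstring shown here is that of the `sub` method of a compiled regular expression, which suggests the attribute is a bound `sub` of some comment-matching pattern. A minimal sketch, assuming the pattern targets CSS-style `/* ... */` comments:

```python
import re

# Assumed pattern: CSS-style /* ... */ comments, matched across newlines.
substitute_comments = re.compile(r"/\*.*?\*/", re.S).sub

style = "width: expre/* hidden */ssion(alert(1))"
# Calling the bound method as sub(repl, string) removes every comment.
stripped = substitute_comments("", style)
```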
_has_sneaky_javascript(self, style)
    Depending on the browser, stuff like e x p r e s s i o n(...)
    can get interpreted, or expre/* stuff */ssion(...). This
    checks for attempts to do stuff like this.
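A check of this kind can be sketched by normalizing the style text before searching it. The normalization steps and the two substrings tested below are assumptions chosen to cover the obfuscations named above; they are not claimed to match the library's exact list.

```python
import re

_COMMENTS = re.compile(r"/\*.*?\*/", re.S)
_WHITESPACE = re.compile(r"\s+")


def has_sneaky_javascript(style):
    """Sketch: detect script-bearing CSS hidden by comments or spacing."""
    normalized = _COMMENTS.sub("", style)          # expre/* x */ssion
    normalized = normalized.replace("\\", "")      # backslash escapes
    normalized = _WHITESPACE.sub("", normalized)   # e x p r e s s i o n
    normalized = normalized.lower()
    return "javascript:" in normalized or "expression(" in normalized


spaced = has_sneaky_javascript("width: e x p r e s s i o n(alert(1))")
commented = has_sneaky_javascript("width: expre/* stuff */ssion(alert(1))")
harmless = has_sneaky_javascript("color: red")
```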
Inherited from object: __delattr__, __format__, __getattribute__,
__hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__,
__sizeof__, __str__, __subclasshook__
scripts = True
javascript = True
comments = True
style = False
inline_style = None
links = True
meta = True
page_structure = True
processing_instructions = True
embedded = True
frames = True
forms = True
annoying_tags = True
remove_tags = None
allow_tags = None
kill_tags = None
remove_unknown_tags = True
safe_attrs_only = True
safe_attrs = frozenset(['abbr', 'accept', 'accept-charset', 'a...
add_nofollow = False
host_whitelist = ()
whitelist_tags = set(['embed', 'iframe'])
_tag_link_attrs = {'a': 'href', 'applet': ['code', 'object'], ...
__qualname__ = 'Cleaner'