Commit 245c21c

Fix doctest failures
This reworks the doctests to run and pass in Python 3. Fixes #357
1 parent cabd665 commit 245c21c
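
Most of the changes below drop the ``u`` prefix from expected doctest output. A minimal sketch of why that matters, assuming Python 3 and only the standard-library ``doctest`` module: doctest compares the printed ``repr`` of each result with the expected text, and Python 3 never prints a ``u`` prefix for ``str``. The helper ``run_doctest`` and the two snippets are illustrative names, not part of Bleach.

```python
import doctest


def run_doctest(source):
    """Run one doctest snippet and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Arguments: string, globs, name, filename, lineno
    test = parser.get_doctest(source, {}, "snippet", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are of interest here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


# Expected output written the Python 2 way (hypothetical snippet).
OLD_STYLE = ">>> 'an example'\nu'an example'\n"
# Expected output written the Python 3 way (hypothetical snippet).
NEW_STYLE = ">>> 'an example'\n'an example'\n"

print(run_doctest(OLD_STYLE))  # (1, 1) on Python 3: the u prefix never appears
print(run_doctest(NEW_STYLE))  # (0, 1) on Python 3
```

Under Python 2 the result would depend on whether the value is ``unicode`` or ``str``, which is why expected output written for one major version tends to fail on the other.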


2 files changed, 55 additions and 55 deletions

docs/clean.rst

Lines changed: 23 additions & 23 deletions
@@ -63,10 +63,10 @@ For example:
 >>> import bleach

 >>> bleach.clean(
-...     u'<b><i>an example</i></b>',
+...     '<b><i>an example</i></b>',
 ...     tags=['b'],
 ... )
-u'<b>&lt;i&gt;an example&lt;/i&gt;</b>'
+'<b>&lt;i&gt;an example&lt;/i&gt;</b>'


 The default value is a relatively conservative list found in
@@ -106,12 +106,12 @@ For example:
 >>> import bleach

 >>> bleach.clean(
-...     u'<p class="foo" style="color: red; font-weight: bold;">blah blah blah</p>',
+...     '<p class="foo" style="color: red; font-weight: bold;">blah blah blah</p>',
 ...     tags=['p'],
 ...     attributes=['style'],
 ...     styles=['color'],
 ... )
-u'<p style="color: red;">blah blah blah</p>'
+'<p style="color: red;">blah blah blah</p>'


 As a dict
@@ -135,11 +135,11 @@ and "class" for any tag (including "a" and "img"):
 ... }

 >>> bleach.clean(
-...     u'<img alt="an example" width=500>',
+...     '<img alt="an example" width=500>',
 ...     tags=['img'],
 ...     attributes=attrs
 ... )
-u'<img alt="an example">'
+'<img alt="an example">'


 Using functions
@@ -161,19 +161,19 @@ For example:
 ...     return name[0] == 'h'

 >>> bleach.clean(
-...     u'<a href="http://example.com" title="link">link</a>',
+...     '<a href="http://example.com" title="link">link</a>',
 ...     tags=['a'],
 ...     attributes=allow_h,
 ... )
-u'<a href="http://example.com">link</a>'
+'<a href="http://example.com">link</a>'


 You can also pass a callable as a value in an attributes dict and it'll run for
 attributes for specified tags:

 .. doctest::

->>> from urlparse import urlparse
+>>> from six.moves.urllib.parse import urlparse
 >>> import bleach

 >>> def allow_src(tag, name, value):
@@ -185,13 +185,13 @@ attributes for specified tags:
 ...     return False

 >>> bleach.clean(
-...     u'<img src="http://example.com" alt="an example">',
+...     '<img src="http://example.com" alt="an example">',
 ...     tags=['img'],
 ...     attributes={
 ...         'img': allow_src
 ...     }
 ... )
-u'<img alt="an example">'
+'<img alt="an example">'


 .. versionchanged:: 2.0
@@ -223,12 +223,12 @@ For example, to allow users to set the color and font-weight of text:
 >>> styles = ['color', 'font-weight']

 >>> bleach.clean(
-...     u'<p style="font-weight: heavy;">my html</p>',
+...     '<p style="font-weight: heavy;">my html</p>',
 ...     tags=tags,
 ...     attributes=attrs,
 ...     styles=styles
 ... )
-u'<p style="font-weight: heavy;">my html</p>'
+'<p style="font-weight: heavy;">my html</p>'


 Default styles are stored in ``bleach.sanitizer.ALLOWED_STYLES``.
@@ -252,7 +252,7 @@ For example, this sets allowed protocols to http, https and smb:
 ...     '<a href="smb://more_text">allowed protocol</a>',
 ...     protocols=['http', 'https', 'smb']
 ... )
-u'<a href="smb://more_text">allowed protocol</a>'
+'<a href="smb://more_text">allowed protocol</a>'


 This adds smb to the Bleach-specified set of allowed protocols:
@@ -265,7 +265,7 @@ This adds smb to the Bleach-specified set of allowed protocols:
 ...     '<a href="smb://more_text">allowed protocol</a>',
 ...     protocols=bleach.ALLOWED_PROTOCOLS + ['smb']
 ... )
-u'<a href="smb://more_text">allowed protocol</a>'
+'<a href="smb://more_text">allowed protocol</a>'


 Default protocols are in ``bleach.sanitizer.ALLOWED_PROTOCOLS``.
@@ -284,10 +284,10 @@ and invalid markup. For example:
 >>> import bleach

 >>> bleach.clean('<span>is not allowed</span>')
-u'&lt;span&gt;is not allowed&lt;/span&gt;'
+'&lt;span&gt;is not allowed&lt;/span&gt;'

 >>> bleach.clean('<b><span>is not allowed</span></b>', tags=['b'])
-u'<b>&lt;span&gt;is not allowed&lt;/span&gt;</b>'
+'<b>&lt;span&gt;is not allowed&lt;/span&gt;</b>'


 If you would rather Bleach stripped this markup entirely, you can pass
@@ -298,10 +298,10 @@ If you would rather Bleach stripped this markup entirely, you can pass
 >>> import bleach

 >>> bleach.clean('<span>is not allowed</span>', strip=True)
-u'is not allowed'
+'is not allowed'

 >>> bleach.clean('<b><span>is not allowed</span></b>', tags=['b'], strip=True)
-u'<b>is not allowed</b>'
+'<b>is not allowed</b>'


 Stripping comments (``strip_comments``)
@@ -317,10 +317,10 @@ By default, Bleach will strip out HTML comments. To disable this behavior, set
 >>> html = 'my<!-- commented --> html'

 >>> bleach.clean(html)
-u'my html'
+'my html'

 >>> bleach.clean(html, strip_comments=False)
-u'my<!-- commented --> html'
+'my<!-- commented --> html'


 Using ``bleach.sanitizer.Cleaner``
@@ -353,7 +353,7 @@ Trivial Filter example:
 .. doctest::

 >>> from bleach.sanitizer import Cleaner
->>> from html5lib.filters.base import Filter
+>>> from bleach.html5lib_shim import Filter

 >>> class MooFilter(Filter):
 ...     def __iter__(self):
@@ -371,7 +371,7 @@ Trivial Filter example:
 >>> cleaner = Cleaner(tags=TAGS, attributes=ATTRS, filters=[MooFilter])
 >>> dirty = 'this is cute! <img src="http://example.com/puppy.jpg" rel="nofollow">'
 >>> cleaner.clean(dirty)
-u'this is cute! <img rel="moo" src="moo">'
+'this is cute! <img rel="moo" src="moo">'


 .. Warning::

docs/linkify.rst

Lines changed: 32 additions & 32 deletions
@@ -80,12 +80,12 @@ For example, you could add a ``title`` attribute to all links:
 >>> from bleach.linkifier import Linker

 >>> def set_title(attrs, new=False):
-...     attrs[(None, u'title')] = u'link in user text'
+...     attrs[(None, 'title')] = 'link in user text'
 ...     return attrs
 ...
 >>> linker = Linker(callbacks=[set_title])
 >>> linker.linkify('abc http://example.com def')
-u'abc <a href="http://example.com" title="link in user text">http://example.com</a> def'
+'abc <a href="http://example.com" title="link in user text">http://example.com</a> def'


 This would set the value of the ``rel`` attribute, stomping on a previous value
@@ -96,21 +96,21 @@ an external link:

 .. doctest::

->>> from urlparse import urlparse
+>>> from six.moves.urllib.parse import urlparse
 >>> from bleach.linkifier import Linker

 >>> def set_target(attrs, new=False):
-...     p = urlparse(attrs[(None, u'href')])
+...     p = urlparse(attrs[(None, 'href')])
 ...     if p.netloc not in ['my-domain.com', 'other-domain.com']:
-...         attrs[(None, u'target')] = u'_blank'
-...         attrs[(None, u'class')] = u'external'
+...         attrs[(None, 'target')] = '_blank'
+...         attrs[(None, 'class')] = 'external'
 ...     else:
-...         attrs.pop((None, u'target'), None)
+...         attrs.pop((None, 'target'), None)
 ...     return attrs
 ...
 >>> linker = Linker(callbacks=[set_target])
 >>> linker.linkify('abc http://example.com def')
-u'abc <a class="external" href="http://example.com" target="_blank">http://example.com</a> def'
+'abc <a class="external" href="http://example.com" target="_blank">http://example.com</a> def'


 Removing Attributes
@@ -127,17 +127,17 @@ sanitizing attributes.)
 >>> def allowed_attrs(attrs, new=False):
 ...     """Only allow href, target, rel and title."""
 ...     allowed = [
-...         (None, u'href'),
-...         (None, u'target'),
-...         (None, u'rel'),
-...         (None, u'title'),
-...         u'_text',
+...         (None, 'href'),
+...         (None, 'target'),
+...         (None, 'rel'),
+...         (None, 'title'),
+...         '_text',
 ...     ]
 ...     return dict((k, v) for k, v in attrs.items() if k in allowed)
 ...
 >>> linker = Linker(callbacks=[allowed_attrs])
 >>> linker.linkify('<a style="font-weight: super bold;" href="http://example.com">link</a>')
-u'<a href="http://example.com">link</a>'
+'<a href="http://example.com">link</a>'


 Or you could remove a specific attribute, if it exists:
@@ -147,15 +147,15 @@ Or you could remove a specific attribute, if it exists:
 >>> from bleach.linkifier import Linker

 >>> def remove_title(attrs, new=False):
-...     attrs.pop((None, u'title'), None)
+...     attrs.pop((None, 'title'), None)
 ...     return attrs
 ...
 >>> linker = Linker(callbacks=[remove_title])
 >>> linker.linkify('<a href="http://example.com">link</a>')
-u'<a href="http://example.com">link</a>'
+'<a href="http://example.com">link</a>'

 >>> linker.linkify('<a title="bad title" href="http://example.com">link</a>')
-u'<a href="http://example.com">link</a>'
+'<a href="http://example.com">link</a>'


 Altering Attributes
@@ -177,14 +177,14 @@ Example of shortening link text:
 ...     if not new:
 ...         return attrs
 ...     # _text will be the same as the URL for new links
-...     text = attrs[u'_text']
+...     text = attrs['_text']
 ...     if len(text) > 25:
-...         attrs[u'_text'] = text[0:22] + u'...'
+...         attrs['_text'] = text[0:22] + '...'
 ...     return attrs
 ...
 >>> linker = Linker(callbacks=[shorten_url])
 >>> linker.linkify('http://example.com/longlonglonglonglongurl')
-u'<a href="http://example.com/longlonglonglonglongurl">http://example.com/lon...</a>'
+'<a href="http://example.com/longlonglonglonglongurl">http://example.com/lon...</a>'


 Example of switching all links to go through a bouncer first:
@@ -196,7 +196,7 @@ Example of switching all links to go through a bouncer first:

 >>> def outgoing_bouncer(attrs, new=False):
 ...     """Send outgoing links through a bouncer."""
-...     href_key = (None, u'href')
+...     href_key = (None, 'href')
 ...     p = urlparse(attrs.get(href_key, None))
 ...     if p.netloc not in ['example.com', 'www.example.com', '']:
 ...         bouncer = 'http://bn.ce/?destination=%s'
@@ -205,10 +205,10 @@ Example of switching all links to go through a bouncer first:
 ...
 >>> linker = Linker(callbacks=[outgoing_bouncer])
 >>> linker.linkify('http://example.com')
-u'<a href="http://example.com">http://example.com</a>'
+'<a href="http://example.com">http://example.com</a>'

 >>> linker.linkify('http://foo.com')
-u'<a href="http://bn.ce/?destination=http%3A//foo.com">http://foo.com</a>'
+'<a href="http://bn.ce/?destination=http%3A//foo.com">http://foo.com</a>'


 Preventing Links
@@ -230,7 +230,7 @@ write the following callback:
 ...         return attrs
 ...     # If the TLD is '.py', make sure it starts with http: or https:.
 ...     # Use _text because that's the original text
-...     link_text = attrs[u'_text']
+...     link_text = attrs['_text']
 ...     if link_text.endswith('.py') and not link_text.startswith(('http:', 'https:')):
 ...         # This looks like a Python file, not a URL. Don't make a link.
 ...         return None
@@ -239,10 +239,10 @@ write the following callback:
 ...
 >>> linker = Linker(callbacks=[dont_linkify_python])
 >>> linker.linkify('abc http://example.com def')
-u'abc <a href="http://example.com">http://example.com</a> def'
+'abc <a href="http://example.com">http://example.com</a> def'

 >>> linker.linkify('abc models.py def')
-u'abc models.py def'
+'abc models.py def'


 .. _Crate: https://crate.io/
@@ -261,13 +261,13 @@ For example, this removes any ``mailto:`` links:
 >>> from bleach.linkifier import Linker

 >>> def remove_mailto(attrs, new=False):
-...     if attrs[(None, u'href')].startswith(u'mailto:'):
+...     if attrs[(None, 'href')].startswith('mailto:'):
 ...         return None
 ...     return attrs
 ...
 >>> linker = Linker(callbacks=[remove_mailto])
 >>> linker.linkify('<a href="mailto:[email protected]">mail janet!</a>')
-u'mail janet!'
+'mail janet!'


 Skipping links in specified tag blocks (``skip_tags``)
@@ -308,7 +308,7 @@ instance.

 >>> linker = Linker(skip_tags=['pre'])
 >>> linker.linkify('a b c http://example.com d e f')
-u'a b c <a href="http://example.com" rel="nofollow">http://example.com</a> d e f'
+'a b c <a href="http://example.com" rel="nofollow">http://example.com</a> d e f'


 .. autoclass:: bleach.linkifier.Linker
@@ -340,11 +340,11 @@ For example, using all the defaults:

 >>> cleaner = Cleaner(tags=['pre'])
 >>> cleaner.clean('<pre>http://example.com</pre>')
-u'<pre>http://example.com</pre>'
+'<pre>http://example.com</pre>'

 >>> cleaner = Cleaner(tags=['pre'], filters=[LinkifyFilter])
 >>> cleaner.clean('<pre>http://example.com</pre>')
-u'<pre><a href="http://example.com">http://example.com</a></pre>'
+'<pre><a href="http://example.com">http://example.com</a></pre>'


 And passing parameters to ``LinkifyFilter``:
@@ -362,7 +362,7 @@ And passing parameters to ``LinkifyFilter``:
 ... )
 ...
 >>> cleaner.clean('<pre>http://example.com</pre>')
-u'<pre>http://example.com</pre>'
+'<pre>http://example.com</pre>'


 .. autoclass:: bleach.linkifier.LinkifyFilter
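
Both files switch the ``urlparse`` import to ``six.moves.urllib.parse`` so the same doctest line resolves on Python 2 and Python 3. As a rough sketch of what that indirection amounts to, assuming only the standard library and no ``six`` dependency, the same effect can be approximated with a guarded import. This is an illustration, not the approach the commit takes.

```python
try:
    # Python 3 location of urlparse
    from urllib.parse import urlparse
except ImportError:
    # Python 2 location of urlparse
    from urlparse import urlparse

# The callbacks in the docs inspect the network location of a link.
parts = urlparse('http://example.com/puppy.jpg')
print(parts.netloc)  # example.com
print(parts.path)    # /puppy.jpg
```

The ``six.moves`` form keeps each doctest to a single import line, which is likely why the docs prefer it over a ``try``/``except`` block.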
