https://github.com/python/cpython
Revision 122541beceeccce4ef8a9bf739c727ccdcbf2f28 authored by Raymond Hettinger on 13 May 2014, 04:56:33 UTC, committed by Raymond Hettinger on 13 May 2014, 04:56:33 UTC
* Repair the broken link to norobots-rfc.txt.

* HTTP response codes >= 500 are treated as a failed read rather than as
not found.  Not found means that we can assume the entire site is allowed.  A 5xx
server error tells us nothing.

* A successful read() or parse() updates the mtime (which is defined to be "the
  time the robots.txt file was last fetched").

* The can_fetch() method returns False unless we've had a read() with a 2xx or
4xx response.  This avoids false positives in the case where a user calls
can_fetch() before calling read().

* I don't see any easy way to test this patch without hitting internet
resources that might change or without use of mock objects that wouldn't
provide much reassurance.
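
The can_fetch() and mtime behavior described above can be tried offline. The sketch below is illustrative and is not part of the patch: it assumes a Python 3 interpreter whose urllib.robotparser includes this change, feeds rules through parse() instead of read() so that no network access is needed, and uses made-up example.com URLs and rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical URLs and rules, for illustration only.
PUBLIC_URL = "http://example.com/page"
PRIVATE_URL = "http://example.com/private/data"
RULES = ["User-agent: *", "Disallow: /private/"]

rp = RobotFileParser()

# No read() or parse() has happened yet, so nothing is known about the
# site and can_fetch() answers False instead of a false positive.
before_parse = rp.can_fetch("*", PUBLIC_URL)

# A successful parse() records the fetch time (mtime) and loads the rules.
rp.parse(RULES)
public_allowed = rp.can_fetch("*", PUBLIC_URL)
private_allowed = rp.can_fetch("*", PRIVATE_URL)
mtime_recorded = rp.mtime() > 0

print("before parse:", before_parse)
print("public allowed:", public_allowed)
print("private allowed:", private_allowed)
print("mtime recorded:", mtime_recorded)
```

Going through parse() sidesteps the testing problem raised in the last bullet only partly: it exercises the rule handling and the mtime update, but not the HTTP status handling inside read().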
1 parent 73308d6
Issue 21469: Mitigate risk of false positives with robotparser.
File                  Mode        Size
RPM
ACKS                  -rw-r--r--  21.3 KB
HISTORY               -rw-r--r--  1.1 MB
NEWS                  -rw-r--r--  331.9 KB
Porting               -rw-r--r--  1.9 KB
README                -rw-r--r--  1.4 KB
README.AIX            -rw-r--r--  5.0 KB
README.coverity       -rw-r--r--  845 bytes
README.valgrind       -rw-r--r--  4.3 KB
SpecialBuilds.txt     -rw-r--r--  10.9 KB
coverity_model.c      -rw-r--r--  2.9 KB
gdbinit               -rw-r--r--  4.7 KB
indent.pro            -rw-r--r--  557 bytes
python-config.in      -rw-r--r--  2.0 KB
python-config.sh.in   -rw-r--r--  2.9 KB
python-wing3.wpr      -rw-r--r--  555 bytes
python-wing4.wpr      -rw-r--r--  835 bytes
python-wing5.wpr      -rw-r--r--  835 bytes
python.man            -rw-r--r--  13.5 KB
python.pc.in          -rw-r--r--  293 bytes
svnmap.txt            -rw-r--r--  4.1 MB
valgrind-python.supp  -rw-r--r--  8.2 KB
vgrindefs             -rw-r--r--  500 bytes
