Revision - a156e09 - Merged revisions 60481,60485,60489-60492,60494-60496,60498- [...] - origin: https://github.com/python/cpython

Staging
v0.8.1

visit type:

https://github.com/python/cpython

09 December 2020, 11:06:31 UTC

Revision a156e09b19cd239176a9316ddca7641784eea99e authored by Christian Heimes on 16 February 2008, 07:38:31 UTC, committed by Christian Heimes on 16 February 2008, 07:38:31 UTC

Merged revisions 60481,60485,60489-60492,60494-60496,60498-60499,60501-60503,60505-60506,60508-60509,60523-60524,60532,60543,60545,60547-60548,60552,60554,60556-60559,60561-60562,60569,60571-60572,60574,60576-60583,60585-60586,60589,60591,60594-60595,60597-60598,60600-60601,60606-60612,60615,60617,60619-60621,60623-60625,60627-60629,60631,60633,60635,60647,60650,60652,60654,60656,60658-60659,60664-60666,60668-60670,60672,60676,60678,60680-60683,60685-60686,60688,60690,60692-60694,60697-60700,60705-60706,60708,60711,60714,60720,60724-60730,60732,60736,60742,60744,60746,60748,60750-60751,60753,60756-60757,60759-60761,60763-60764,60766,60769-60770,60774-60784,60787-60845 via svnmerge from

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r60790 | raymond.hettinger | 2008-02-14 10:32:45 +0100 (Thu, 14 Feb 2008) | 4 lines

  Add diagnostic message to help figure-out why SocketServer tests occasionally crash
  when trying to remove a pid that in not in the activechildren list.
........
  r60791 | raymond.hettinger | 2008-02-14 11:46:57 +0100 (Thu, 14 Feb 2008) | 1 line

  Add fixed-point examples to the decimal FAQ
........
  r60792 | raymond.hettinger | 2008-02-14 12:01:10 +0100 (Thu, 14 Feb 2008) | 1 line

  Improve rst markup
........
  r60794 | raymond.hettinger | 2008-02-14 12:57:25 +0100 (Thu, 14 Feb 2008) | 1 line

  Show how to remove exponents.
........
  r60795 | raymond.hettinger | 2008-02-14 13:05:42 +0100 (Thu, 14 Feb 2008) | 1 line

  Fix markup.
........
  r60797 | christian.heimes | 2008-02-14 13:47:33 +0100 (Thu, 14 Feb 2008) | 1 line

  Implemented Martin's suggestion to clear the free lists during the garbage collection of the highest generation.
........
  r60798 | raymond.hettinger | 2008-02-14 13:49:37 +0100 (Thu, 14 Feb 2008) | 1 line

  Simplify moneyfmt() recipe.
........
  r60810 | raymond.hettinger | 2008-02-14 20:02:39 +0100 (Thu, 14 Feb 2008) | 1 line

  Fix markup
........
  r60811 | raymond.hettinger | 2008-02-14 20:30:30 +0100 (Thu, 14 Feb 2008) | 1 line

  No need to register subclass of ABCs.
........
  r60814 | thomas.heller | 2008-02-14 22:00:28 +0100 (Thu, 14 Feb 2008) | 1 line

  Try to correct a markup error that does hide the following paragraph.
........
  r60822 | christian.heimes | 2008-02-14 23:40:11 +0100 (Thu, 14 Feb 2008) | 1 line

  Use a static and interned string for __subclasscheck__ and __instancecheck__ as suggested by Thomas Heller in #2115
........
  r60827 | christian.heimes | 2008-02-15 07:57:08 +0100 (Fri, 15 Feb 2008) | 1 line

  Fixed repr() and str() of complex numbers. Complex suffered from the same problem as floats but I forgot to test and fix them.
........
  r60830 | christian.heimes | 2008-02-15 09:20:11 +0100 (Fri, 15 Feb 2008) | 2 lines

  Bug #2111: mmap segfaults when trying to write a block opened with PROT_READ
  Thanks to Thomas Herve for the fix.
........
  r60835 | eric.smith | 2008-02-15 13:14:32 +0100 (Fri, 15 Feb 2008) | 1 line

  In PyNumber_ToBase, changed from an assert to returning an error when PyObject_Index() returns something other than an int or long.  It should never be possible to trigger this, as PyObject_Index checks to make sure it returns an int or long.
........
  r60837 | skip.montanaro | 2008-02-15 20:03:59 +0100 (Fri, 15 Feb 2008) | 8 lines

  Two new functions:

    * place_summary_first copies the regrtest summary to the front of the file
      making it easier to scan quickly for problems.

    * count_failures gets the actual count of the number of failing tests, not
      just a 1 (some failures) or 0 (no failures).
........
  r60840 | raymond.hettinger | 2008-02-15 22:21:25 +0100 (Fri, 15 Feb 2008) | 1 line

  Update example to match the current syntax.
........
  r60841 | amaury.forgeotdarc | 2008-02-15 22:22:45 +0100 (Fri, 15 Feb 2008) | 8 lines

  Issue #2115: __slot__ attributes setting was 10x slower.
  Also correct a possible crash using ABCs.

  This change is exactly the same as an optimisation
  done 5 years ago, but on slot *access*:
  http://svn.python.org/view?view=rev&rev=28297
........
  r60842 | amaury.forgeotdarc | 2008-02-15 22:27:44 +0100 (Fri, 15 Feb 2008) | 2 lines

  Temporarily let these tests pass
........
  r60843 | kurt.kaiser | 2008-02-15 22:56:36 +0100 (Fri, 15 Feb 2008) | 2 lines

  ScriptBinding event handlers weren't returning 'break'. Patch 2050, Tal Einat.
........
  r60844 | kurt.kaiser | 2008-02-15 23:25:09 +0100 (Fri, 15 Feb 2008) | 4 lines

  Configured selection highlighting colors were ignored; updating highlighting
  in the config dialog would cause non-Python files to be colored as if they
  were Python source; improve use of ColorDelagator.  Patch 1334. Tal Einat.
........
  r60845 | amaury.forgeotdarc | 2008-02-15 23:44:20 +0100 (Fri, 15 Feb 2008) | 9 lines

  Re-enable tests, they were failing since gc.collect() clears the various freelists.
  They still remain fragile.

  For example, a call to assertEqual currently does not make any allocation
  (which surprised me at first).
  But this can change when gc.collect also deletes the numerous "zombie frames"
  attached to each function.
........

1 parent 71316b0

Files
Changes

Permalinks

Tip revision: a156e09b19cd239176a9316ddca7641784eea99e authored by Christian Heimes on 16 February 2008, 07:38:31 UTC
Merged revisions 60481,60485,60489-60492,60494-60496,60498-60499,60501-60503,60505-60506,60508-60509,60523-60524,60532,60543,60545,60547-60548,60552,60554,60556-60559,60561-60562,60569,60571-60572,60574,60576-60583,60585-60586,60589,60591,60594-60595,60597-60598,60600-60601,60606-60612,60615,60617,60619-60621,60623-60625,60627-60629,60631,60633,60635,60647,60650,60652,60654,60656,60658-60659,60664-60666,60668-60670,60672,60676,60678,60680-60683,60685-60686,60688,60690,60692-60694,60697-60700,60705-60706,60708,60711,60714,60720,60724-60730,60732,60736,60742,60744,60746,60748,60750-60751,60753,60756-60757,60759-60761,60763-60764,60766,60769-60770,60774-60784,60787-60845 via svnmerge from

Tip revision: a156e09

gzip.py

"""Functions that read and write gzipped files.

The user of the file doesn't have to worry about the compression,
but random access is not allowed."""

# based on Andrew Kuchling's minigzip.py distributed with the zlib module

import struct, sys, time
import zlib
import builtins

__all__ = ["GzipFile","open"]

FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 1, 2, 4, 8, 16

READ, WRITE = 1, 2

def U32(i):
    """Return i as an unsigned integer, assuming it fits in 32 bits.

    If it's >= 2GB when viewed as a 32-bit unsigned int, return a long.
    """
    if i < 0:
        i += 1 << 32
    return i

def LOWU32(i):
    """Return the low-order 32 bits of an int, as a non-negative int."""
    return i & 0xFFFFFFFF

def write32(output, value):
    output.write(struct.pack("<l", value))

def write32u(output, value):
    # The L format writes the bit pattern correctly whether signed
    # or unsigned.
    output.write(struct.pack("<L", value))

def read32(input):
    return struct.unpack("<l", input.read(4))[0]

def open(filename, mode="rb", compresslevel=9):
    """Shorthand for GzipFile(filename, mode, compresslevel).

    The filename argument is required; mode defaults to 'rb'
    and compresslevel defaults to 9.

    """
    return GzipFile(filename, mode, compresslevel)

class GzipFile:
    """The GzipFile class simulates most of the methods of a file object with
    the exception of the readinto() and truncate() methods.

    """

    myfileobj = None
    max_read_chunk = 10 * 1024 * 1024   # 10Mb

    def __init__(self, filename=None, mode=None,
                 compresslevel=9, fileobj=None):
        """Constructor for the GzipFile class.

        At least one of fileobj and filename must be given a
        non-trivial value.

        The new class instance is based on fileobj, which can be a regular
        file, a StringIO object, or any other object which simulates a file.
        It defaults to None, in which case filename is opened to provide
        a file object.

        When fileobj is not None, the filename argument is only used to be
        included in the gzip file header, which may includes the original
        filename of the uncompressed file.  It defaults to the filename of
        fileobj, if discernible; otherwise, it defaults to the empty string,
        and in this case the original filename is not included in the header.

        The mode argument can be any of 'r', 'rb', 'a', 'ab', 'w', or 'wb',
        depending on whether the file will be read or written.  The default
        is the mode of fileobj if discernible; otherwise, the default is 'rb'.
        Be aware that only the 'rb', 'ab', and 'wb' values should be used
        for cross-platform portability.

        The compresslevel argument is an integer from 1 to 9 controlling the
        level of compression; 1 is fastest and produces the least compression,
        and 9 is slowest and produces the most compression.  The default is 9.

        """

        # guarantee the file is opened in binary mode on platforms
        # that care about that sort of thing
        if mode and 'b' not in mode:
            mode += 'b'
        if fileobj is None:
            fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
        if filename is None:
            if hasattr(fileobj, 'name'): filename = fileobj.name
            else: filename = ''
        if mode is None:
            if hasattr(fileobj, 'mode'): mode = fileobj.mode
            else: mode = 'rb'

        if mode[0:1] == 'r':
            self.mode = READ
            # Set flag indicating start of a new member
            self._new_member = True
            self.extrabuf = b""
            self.extrasize = 0
            self.name = filename
            # Starts small, scales exponentially
            self.min_readsize = 100

        elif mode[0:1] == 'w' or mode[0:1] == 'a':
            self.mode = WRITE
            self._init_write(filename)
            self.compress = zlib.compressobj(compresslevel,
                                             zlib.DEFLATED,
                                             -zlib.MAX_WBITS,
                                             zlib.DEF_MEM_LEVEL,
                                             0)
        else:
            raise IOError("Mode " + mode + " not supported")

        self.fileobj = fileobj
        self.offset = 0

        if self.mode == WRITE:
            self._write_gzip_header()

    @property
    def filename(self):
        import warnings
        warnings.warn("use the name attribute", DeprecationWarning)
        if self.mode == WRITE and self.name[-3:] != ".gz":
            return self.name + ".gz"
        return self.name

    def __repr__(self):
        s = repr(self.fileobj)
        return '<gzip ' + s[1:-1] + ' ' + hex(id(self)) + '>'

    def _init_write(self, filename):
        self.name = filename
        self.crc = zlib.crc32("")
        self.size = 0
        self.writebuf = []
        self.bufsize = 0

    def _write_gzip_header(self):
        self.fileobj.write(b'\037\213')             # magic header
        self.fileobj.write(b'\010')                 # compression method
        try:
            # RFC 1952 requires the FNAME field to be Latin-1. Do not
            # include filenames that cannot be represented that way.
            fname = self.name.encode('latin-1')
            if fname.endswith(b'.gz'):
                fname = fname[:-3]
        except UnicodeEncodeError:
            fname = b''
        flags = 0
        if fname:
            flags = FNAME
        self.fileobj.write(chr(flags).encode('latin-1'))
        write32u(self.fileobj, int(time.time()))
        self.fileobj.write(b'\002')
        self.fileobj.write(b'\377')
        if fname:
            self.fileobj.write(fname + b'\000')

    def _init_read(self):
        self.crc = zlib.crc32("")
        self.size = 0

    def _read_gzip_header(self):
        magic = self.fileobj.read(2)
        if magic != b'\037\213':
            raise IOError('Not a gzipped file')
        method = ord( self.fileobj.read(1) )
        if method != 8:
            raise IOError('Unknown compression method')
        flag = ord( self.fileobj.read(1) )
        # modtime = self.fileobj.read(4)
        # extraflag = self.fileobj.read(1)
        # os = self.fileobj.read(1)
        self.fileobj.read(6)

        if flag & FEXTRA:
            # Read & discard the extra field, if present
            xlen = ord(self.fileobj.read(1))
            xlen = xlen + 256*ord(self.fileobj.read(1))
            self.fileobj.read(xlen)
        if flag & FNAME:
            # Read and discard a null-terminated string containing the filename
            while True:
                s = self.fileobj.read(1)
                if not s or s==b'\000':
                    break
        if flag & FCOMMENT:
            # Read and discard a null-terminated string containing a comment
            while True:
                s = self.fileobj.read(1)
                if not s or s==b'\000':
                    break
        if flag & FHCRC:
            self.fileobj.read(2)     # Read & discard the 16-bit header CRC


    def write(self,data):
        if self.mode != WRITE:
            import errno
            raise IOError(errno.EBADF, "write() on read-only GzipFile object")

        if self.fileobj is None:
            raise ValueError("write() on closed GzipFile object")
        if len(data) > 0:
            self.size = self.size + len(data)
            self.crc = zlib.crc32(data, self.crc)
            self.fileobj.write( self.compress.compress(data) )
            self.offset += len(data)

    def read(self, size=-1):
        if self.mode != READ:
            import errno
            raise IOError(errno.EBADF, "read() on write-only GzipFile object")

        if self.extrasize <= 0 and self.fileobj is None:
            return b''

        readsize = 1024
        if size < 0:        # get the whole thing
            try:
                while True:
                    self._read(readsize)
                    readsize = min(self.max_read_chunk, readsize * 2)
            except EOFError:
                size = self.extrasize
        else:               # just get some more of it
            try:
                while size > self.extrasize:
                    self._read(readsize)
                    readsize = min(self.max_read_chunk, readsize * 2)
            except EOFError:
                if size > self.extrasize:
                    size = self.extrasize

        chunk = self.extrabuf[:size]
        self.extrabuf = self.extrabuf[size:]
        self.extrasize = self.extrasize - size

        self.offset += size
        return chunk

    def _unread(self, buf):
        self.extrabuf = buf + self.extrabuf
        self.extrasize = len(buf) + self.extrasize
        self.offset -= len(buf)

    def _read(self, size=1024):
        if self.fileobj is None:
            raise EOFError("Reached EOF")

        if self._new_member:
            # If the _new_member flag is set, we have to
            # jump to the next member, if there is one.
            #
            # First, check if we're at the end of the file;
            # if so, it's time to stop; no more members to read.
            pos = self.fileobj.tell()   # Save current position
            self.fileobj.seek(0, 2)     # Seek to end of file
            if pos == self.fileobj.tell():
                raise EOFError("Reached EOF")
            else:
                self.fileobj.seek( pos ) # Return to original position

            self._init_read()
            self._read_gzip_header()
            self.decompress = zlib.decompressobj(-zlib.MAX_WBITS)
            self._new_member = False

        # Read a chunk of data from the file
        buf = self.fileobj.read(size)

        # If the EOF has been reached, flush the decompression object
        # and mark this object as finished.

        if buf == b"":
            uncompress = self.decompress.flush()
            self._read_eof()
            self._add_read_data( uncompress )
            raise EOFError('Reached EOF')

        uncompress = self.decompress.decompress(buf)
        self._add_read_data( uncompress )

        if self.decompress.unused_data != b"":
            # Ending case: we've come to the end of a member in the file,
            # so seek back to the start of the unused data, finish up
            # this member, and read a new gzip header.
            # (The number of bytes to seek back is the length of the unused
            # data, minus 8 because _read_eof() will rewind a further 8 bytes)
            self.fileobj.seek( -len(self.decompress.unused_data)+8, 1)

            # Check the CRC and file size, and set the flag so we read
            # a new member on the next call
            self._read_eof()
            self._new_member = True

    def _add_read_data(self, data):
        self.crc = zlib.crc32(data, self.crc)
        self.extrabuf = self.extrabuf + data
        self.extrasize = self.extrasize + len(data)
        self.size = self.size + len(data)

    def _read_eof(self):
        # We've read to the end of the file, so we have to rewind in order
        # to reread the 8 bytes containing the CRC and the file size.
        # We check the that the computed CRC and size of the
        # uncompressed data matches the stored values.  Note that the size
        # stored is the true file size mod 2**32.
        self.fileobj.seek(-8, 1)
        crc32 = read32(self.fileobj)
        isize = U32(read32(self.fileobj))   # may exceed 2GB
        if U32(crc32) != U32(self.crc):
            raise IOError("CRC check failed")
        elif isize != LOWU32(self.size):
            raise IOError("Incorrect length of data produced")

    def close(self):
        if self.mode == WRITE:
            self.fileobj.write(self.compress.flush())
            # The native zlib crc is an unsigned 32-bit integer, but
            # the Python wrapper implicitly casts that to a signed C
            # long.  So, on a 32-bit box self.crc may "look negative",
            # while the same crc on a 64-bit box may "look positive".
            # To avoid irksome warnings from the `struct` module, force
            # it to look positive on all boxes.
            write32u(self.fileobj, LOWU32(self.crc))
            # self.size may exceed 2GB, or even 4GB
            write32u(self.fileobj, LOWU32(self.size))
            self.fileobj = None
        elif self.mode == READ:
            self.fileobj = None
        if self.myfileobj:
            self.myfileobj.close()
            self.myfileobj = None

    def __del__(self):
        try:
            if (self.myfileobj is None and
                self.fileobj is None):
                return
        except AttributeError:
            return
        self.close()

    def flush(self,zlib_mode=zlib.Z_SYNC_FLUSH):
        if self.mode == WRITE:
            # Ensure the compressor's buffer is flushed
            self.fileobj.write(self.compress.flush(zlib_mode))
        self.fileobj.flush()

    def fileno(self):
        """Invoke the underlying file object's fileno() method.

        This will raise AttributeError if the underlying file object
        doesn't support fileno().
        """
        return self.fileobj.fileno()

    def isatty(self):
        return False

    def tell(self):
        return self.offset

    def rewind(self):
        '''Return the uncompressed stream file position indicator to the
        beginning of the file'''
        if self.mode != READ:
            raise IOError("Can't rewind in write mode")
        self.fileobj.seek(0)
        self._new_member = True
        self.extrabuf = b""
        self.extrasize = 0
        self.offset = 0

    def seek(self, offset, whence=0):
        if whence:
            if whence == 1:
                offset = self.offset + offset
            else:
                raise ValueError('Seek from end not supported')
        if self.mode == WRITE:
            if offset < self.offset:
                raise IOError('Negative seek in write mode')
            count = offset - self.offset
            chunk = bytes(1024)
            for i in range(count // 1024):
                self.write(chunk)
            self.write(bytes(count % 1024))
        elif self.mode == READ:
            if offset < self.offset:
                # for negative seek, rewind and do positive seek
                self.rewind()
            count = offset - self.offset
            for i in range(count // 1024):
                self.read(1024)
            self.read(count % 1024)

    def readline(self, size=-1):
        if size < 0:
            size = sys.maxsize
            readsize = self.min_readsize
        else:
            readsize = size
        bufs = []
        while size != 0:
            c = self.read(readsize)
            i = c.find(b'\n')

            # We set i=size to break out of the loop under two
            # conditions: 1) there's no newline, and the chunk is
            # larger than size, or 2) there is a newline, but the
            # resulting line would be longer than 'size'.
            if (size <= i) or (i == -1 and len(c) > size):
                i = size - 1

            if i >= 0 or c == b'':
                bufs.append(c[:i + 1])    # Add portion of last chunk
                self._unread(c[i + 1:])   # Push back rest of chunk
                break

            # Append chunk to list, decrease 'size',
            bufs.append(c)
            size = size - len(c)
            readsize = min(size, readsize * 2)
        if readsize > self.min_readsize:
            self.min_readsize = min(readsize, self.min_readsize * 2, 512)
        return b''.join(bufs) # Return resulting line

    def readlines(self, sizehint=0):
        # Negative numbers result in reading all the lines
        if sizehint <= 0:
            sizehint = sys.maxsize
        L = []
        while sizehint > 0:
            line = self.readline()
            if line == b"":
                break
            L.append(line)
            sizehint = sizehint - len(line)

        return L

    def writelines(self, L):
        for line in L:
            self.write(line)

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if line:
            return line
        else:
            raise StopIteration


def _test():
    # Act like gzip; with -d, act like gunzip.
    # The input file is not deleted, however, nor are any other gzip
    # options or features supported.
    args = sys.argv[1:]
    decompress = args and args[0] == "-d"
    if decompress:
        args = args[1:]
    if not args:
        args = ["-"]
    for arg in args:
        if decompress:
            if arg == "-":
                f = GzipFile(filename="", mode="rb", fileobj=sys.stdin)
                g = sys.stdout
            else:
                if arg[-3:] != ".gz":
                    print("filename doesn't end in .gz:", repr(arg))
                    continue
                f = open(arg, "rb")
                g = builtins.open(arg[:-3], "wb")
        else:
            if arg == "-":
                f = sys.stdin
                g = GzipFile(filename="", mode="wb", fileobj=sys.stdout)
            else:
                f = builtins.open(arg, "rb")
                g = open(arg + ".gz", "wb")
        while True:
            chunk = f.read(1024)
            if not chunk:
                break
            g.write(chunk)
        if g is not sys.stdout:
            g.close()
        if f is not sys.stdin:
            f.close()

if __name__ == '__main__':
    _test()

Showing with 0 additions and 0 deletions (0 / 0 diffs computed)

Computing file changes ...