Revision 7d0fef56d8eaac6309a66cb8c6ba6fd96f8c8a94 authored by Victor Stinner on 20 August 2020, 11:28:49 UTC, committed by GitHub on 20 August 2020, 11:28:49 UTC
* bpo-40204: Allow pre-Sphinx 3 syntax in the doc (GH-21844)

Enable Sphinx 3.2 "c_allow_pre_v3" option and disable the
c_warn_on_allowed_pre_v3 option to make the documentation compatible
with Sphinx 2 and Sphinx 3.

(cherry picked from commit 423e77d6de497931585d1883805a9e3fa4096b0b)

* bpo-40204: Fix Sphinx sytanx in howto/instrumentation.rst (GH-21858)

Use generic '.. object::' to declare markers, rather than abusing
'..  c:function::' which fails on Sphinx 3.

(cherry picked from commit 43577c01a2ab49122db696e9eaec6cb31d11cc81)

* bpo-40204: Fix duplicates in the documentation (GH-21857)

Fix two Sphinx 3 issues:

Doc/c-api/buffer.rst:304: WARNING: Duplicate C declaration, also defined in 'c-api/buffer'.
Declaration is 'PyBUF_ND'.

Doc/c-api/unicode.rst:1603: WARNING: Duplicate C declaration, also defined in 'c-api/unicode'.
Declaration is 'PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)'.

(cherry picked from commit 46d10b1237c67ff8347f533eda6a5468d098f7eb)

* bpo-40204: Add :noindex: in the documentation (GH-21859)

Add :noindex: to duplicated documentation to fix "duplicate object
description" errors.

For example, fix this Sphinx 3 issue:

Doc/library/configparser.rst:1146: WARNING: duplicate object
description of configparser.ConfigParser.optionxform, other instance
in library/configparser, use :noindex: for one of them

(cherry picked from commit d3ded080482beae578faa704b13534a62d066f9f)

* bpo-40204, doc: Fix syntax of C variables (GH-21846)

For example, fix the following Sphinx 3 errors:

Doc/c-api/buffer.rst:102: WARNING: Error in declarator or parameters
Invalid C declaration: Expected identifier in nested name. [error at 5]
  void \*obj
  -----^

Doc/c-api/arg.rst:130: WARNING: Unparseable C cross-reference: 'PyObject*'
Invalid C declaration: Expected end of definition. [error at 8]
  PyObject*
  --------^

The modified documentation is compatible with Sphinx 2 and Sphinx 3.

(cherry picked from commit 474652fe9346382dbf793f20b671eb74668bebde)

* bpo-40204: Fix reference to terms in the doc (GH-21865)

Sphinx 3 requires to refer to terms with the exact case.

For example, fix the Sphinx 3 warning:

Doc/library/pkgutil.rst:71: WARNING: term Loader not found in case
sensitive match.made a reference to loader instead.

(cherry picked from commit bb0b08540cc93e56f3f1bde1b39ce086d9e35fe1)

* bpo-40204: Fix duplicated productionlist names in the doc (GH-21900)

Sphinx 3 disallows having more than one productionlist markup with
the same name. Simply remove names in this case, since names are not
shown anyway. For example, fix the Sphinx 3 warning:

Doc/reference/introduction.rst:96: duplicate token description
of *:name, other instance in reference/expressions

(cherry picked from commit 1abeda80f760134b4233608e2c288790f955b95a)
(cherry picked from commit 8f88190af529543c84d5dc78f19abbfd73335cf4)
1 parent 34889a5
lnotab_notes.txt
All about co_lnotab, the line number table.

Code objects store a field named co_lnotab.  This is an array of unsigned bytes
disguised as a Python bytes object.  It is used to map bytecode offsets to
source code line #s for tracebacks and to identify line number boundaries for
line tracing. Because of internals of the peephole optimizer, it's possible
for lnotab to contain bytecode offsets that are no longer valid (for example
if the optimizer removed the last line in a function).

The array is conceptually a compressed list of
    (bytecode offset increment, line number increment)
pairs.  The details are important and delicate, best illustrated by example:

    byte code offset    source code line number
        0                   1
        6                   2
       50                   7
      350                 207
      361                 208

Instead of storing these numbers literally, we compress the list by storing only
the difference from one row to the next.  Conceptually, the stored list might
look like:

    0, 1,  6, 1,  44, 5,  300, 200,  11, 1
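As a quick check of the arithmetic, the conceptual list can be reproduced from the table of rows. This is only an illustrative sketch: `to_increments` is a hypothetical helper name, not something defined in CPython, and it starts from offset 0 and line 0 as the conceptual list does.

```python
# Hypothetical helper (not part of CPython): turn (offset, line) rows
# into the flat list of increments described in the text.
def to_increments(rows):
    """Return [offset delta, line delta, ...] for consecutive rows."""
    flat = []
    prev_offset = prev_line = 0
    for offset, line in rows:
        flat += [offset - prev_offset, line - prev_line]
        prev_offset, prev_line = offset, line
    return flat


# The example table from the text: (byte code offset, source line number).
rows = [(0, 1), (6, 2), (50, 7), (350, 207), (361, 208)]
print(to_increments(rows))
# prints [0, 1, 6, 1, 44, 5, 300, 200, 11, 1]
```

The output is the same uncompressed list as shown above, including the two increments (300 and 200) that do not fit in a single byte.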

The above doesn't really work, but it's a start.  An unsigned byte (byte code
offset) can't hold negative values or values larger than 255, and a signed byte
(line number) can't hold values larger than 127 or less than -128; the above
example contains two such values (300 and 200).  (Note that before 3.6, the
line number increment was also encoded as an unsigned byte.)  So we make two
tweaks:

 (a) there's a deep assumption that byte code offsets increase monotonically,
 and
 (b) if byte code offset jumps by more than 255 from one row to the next, or if
 source code line number jumps by more than 127 or less than -128 from one row
 to the next, more than one pair is written to the table.

In case (b), there's no way to know from looking at the table later how many
pairs were written.  That's the delicate part.  A user of co_lnotab desiring to
find the source line number corresponding to a bytecode address A should do
something like this:

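    def addr2line(co_lnotab, A):
        lineno = addr = 0
        # co_lnotab is a flat bytes object: walk it two bytes at a time.
        for addr_incr, line_incr in zip(co_lnotab[0::2], co_lnotab[1::2]):
            addr += addr_incr
            if addr > A:
                return lineno
            if line_incr >= 0x80:
                line_incr -= 0x100  # reinterpret as a signed byte
            lineno += line_incr
        return lineno  # A falls in the last line of the table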

(In C, this is implemented by PyCode_Addr2Line().)  In order for this to work,
when the addr field increments by more than 255, the line # increment in each
pair generated must be 0 until the remaining addr increment is < 256.  So, in
the example above, assemble_lnotab in compile.c should not (as was actually done
until 2.2) expand 300, 200 to
    255, 255, 45, 45,
but to
    255, 0, 45, 127, 0, 73.
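
The splitting rule can be sketched in Python as follows. This is not the code of assemble_lnotab; `encode_pair`, `encode_lnotab` and `addr2line` are hypothetical names used only for illustration, and the sketch assumes address increments are never negative and that the line count starts at 0, as in the lookup loop above.

```python
def encode_pair(addr_incr, line_incr):
    """Split one (addr, line) increment into byte-sized pairs."""
    pairs = []
    # Spend the address increment first, with line increments of 0,
    # until the remainder fits in an unsigned byte.
    while addr_incr > 255:
        pairs.append((255, 0))
        addr_incr -= 255
    # Then spend the line increment in signed-byte-sized steps.
    while line_incr > 127:
        pairs.append((addr_incr, 127))
        addr_incr = 0
        line_incr -= 127
    while line_incr < -128:
        pairs.append((addr_incr, -128))
        addr_incr = 0
        line_incr += 128
    pairs.append((addr_incr, line_incr))
    return pairs


def encode_lnotab(rows):
    """Encode (offset, line) rows as a flat bytes table."""
    table = []
    prev_offset = prev_line = 0
    for offset, line in rows:
        for a, l in encode_pair(offset - prev_offset, line - prev_line):
            table += [a, l & 0xFF]  # store the line step as one byte
        prev_offset, prev_line = offset, line
    return bytes(table)


def addr2line(co_lnotab, target):
    """Look up the line for a bytecode address, as in the loop above."""
    lineno = addr = 0
    for addr_incr, line_incr in zip(co_lnotab[0::2], co_lnotab[1::2]):
        addr += addr_incr
        if addr > target:
            return lineno
        if line_incr >= 0x80:
            line_incr -= 0x100
        lineno += line_incr
    return lineno


print(encode_pair(300, 200))
# prints [(255, 0), (45, 127), (0, 73)]

table = encode_lnotab([(0, 1), (6, 2), (50, 7), (350, 207), (361, 208)])
print([addr2line(table, a) for a in (0, 6, 100, 349, 350, 361)])
# prints [1, 2, 7, 7, 207, 208]
```

Because the (255, 0) pair carries no line increment, an address such as 100 or 349 still resolves to line 7, which is the property the rule is meant to preserve.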

The above is sufficient to reconstruct line numbers for tracebacks, but not for
line tracing.  Tracing is handled by PyCode_CheckLineNumber() in codeobject.c
and maybe_call_line_trace() in ceval.c.

*** Tracing ***

To a first approximation, we want to call the tracing function when the line
number of the current instruction changes.  Re-computing the current line for
every instruction is a little slow, though, so each time we compute the line
number we save the bytecode indices where it's valid:

     *instr_lb <= frame->f_lasti < *instr_ub

is true so long as execution does not change lines.  That is, *instr_lb holds
the first bytecode index of the current line, and *instr_ub holds the first
bytecode index of the next line.  As long as the above expression is true,
maybe_call_line_trace() does not need to call PyCode_CheckLineNumber().  Note
that the same line may appear multiple times in the lnotab, either because the
bytecode jumped more than 255 indices between line number changes or because
the compiler inserted the same line twice.  Even in that case, *instr_ub holds
the first index of the next line.
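
A rough Python sketch of the bounds computation may help here. It is not a transcription of PyCode_CheckLineNumber(); `check_line_number` is a hypothetical name, `first_lineno` defaults to 0 to match the lookup loop earlier in these notes, and `sys.maxsize` is an assumed stand-in for "no upper bound" on the last line.

```python
import bisect
import sys


def check_line_number(co_lnotab, lasti, first_lineno=0):
    """Return (line, instr_lb, instr_ub) for bytecode index lasti."""
    # Record the final line number seen at each distinct offset.
    line_at = {0: first_lineno}
    addr, line = 0, first_lineno
    for addr_incr, line_incr in zip(co_lnotab[0::2], co_lnotab[1::2]):
        if line_incr >= 0x80:
            line_incr -= 0x100  # reinterpret as a signed byte
        addr += addr_incr
        line += line_incr
        line_at[addr] = line
    # Keep only offsets where the line really changes, so repeated
    # entries for the same line do not end the range early.
    starts = []
    for offset, ln in line_at.items():
        if not starts or starts[-1][1] != ln:
            starts.append((offset, ln))
    offsets = [offset for offset, _ in starts]
    i = bisect.bisect_right(offsets, lasti) - 1
    lower, ln = starts[i]
    upper = offsets[i + 1] if i + 1 < len(offsets) else sys.maxsize
    return ln, lower, upper


# The table from the first example, with the 300, 200 jump split up.
table = bytes([0, 1, 6, 1, 44, 5, 255, 0, 45, 127, 0, 73, 11, 1])
line, instr_lb, instr_ub = check_line_number(table, 320)
print(line, instr_lb, instr_ub)
# prints 7 50 350
print(instr_lb <= 320 < instr_ub)
# prints True
```

Offset 305 appears in the table only because the address jump was split, so it is skipped when building the line starts and the upper bound for line 7 stays at 350, the first index of the next line.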

However, we don't *always* want to call the line trace function when the above
test fails.

Consider this code:

1: def f(a):
2:    while a:
3:       print(1)
4:       break
5:    else:
6:       print(2)

which compiles to this:

  2           0 SETUP_LOOP              26 (to 28)
        >>    2 LOAD_FAST                0 (a)
              4 POP_JUMP_IF_FALSE       18

  3           6 LOAD_GLOBAL              0 (print)
              8 LOAD_CONST               1 (1)
             10 CALL_FUNCTION            1
             12 POP_TOP

  4          14 BREAK_LOOP
             16 JUMP_ABSOLUTE            2
        >>   18 POP_BLOCK

  6          20 LOAD_GLOBAL              0 (print)
             22 LOAD_CONST               2 (2)
             24 CALL_FUNCTION            1
             26 POP_TOP
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

If 'a' is false, execution will jump to the POP_BLOCK instruction at offset 18
and the co_lnotab will claim that execution has moved to line 4, which is wrong.
In this case, we could instead associate the POP_BLOCK with line 5, but that
would break jumps around loops without else clauses.

We fix this by only calling the line trace function for a forward jump if the
co_lnotab indicates we have jumped to the *start* of a line, i.e. if the current
instruction offset matches the offset given for the start of a line by the
co_lnotab.  For backward jumps, however, we always call the line trace function,
which lets a debugger stop on every evaluation of a loop guard (which usually
won't be the first opcode in a line).
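
The rule can be tried out on the example with a small simulation. This is a sketch under stated assumptions, not the code of maybe_call_line_trace(): `line_events` is a hypothetical name, the line starts are read off the disassembly above, and the executed offsets are written out by hand for each path.

```python
import bisect

# Line starts read off the disassembly above: (offset, source line).
LINE_STARTS = [(0, 2), (6, 3), (14, 4), (20, 6)]
OFFSETS = [offset for offset, _ in LINE_STARTS]


def line_events(executed_offsets):
    """Return the lines for which a line trace event would fire."""
    events = []
    prev = -1  # marker meaning "no previous instruction yet"
    for lasti in executed_offsets:
        i = bisect.bisect_right(OFFSETS, lasti) - 1
        instr_lb, line = LINE_STARTS[i]
        # Fire at the start of a line, or after any backward jump.
        if lasti == instr_lb or lasti < prev:
            events.append(line)
        prev = lasti
    return events


# 'a' is false: POP_JUMP_IF_FALSE at offset 4 jumps forward to 18.
print(line_events([0, 2, 4, 18, 20, 22, 24, 26, 28, 30]))
# prints [2, 6]

# 'a' is true: BREAK_LOOP at offset 14 leaves the loop for offset 28.
print(line_events([0, 2, 4, 6, 8, 10, 12, 14, 28, 30]))
# prints [2, 3, 4]

# Hypothetical path: a backward jump from offset 16 to the loop guard.
print(line_events([16, 2]))
# prints [2]
```

In the first run the forward jump to offset 18 lands in the middle of the range for line 4, so no spurious line 4 event is reported. In the last run offset 2 is not the start of line 2, but the jump is backward, so the event fires anyway.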

Why do we set f_lineno when tracing, and only just before calling the trace
function?  Well, consider the code above when 'a' is true.  If stepping through
this with 'n' in pdb, you would stop at line 1 with a "call" type event, then
line events on lines 2, 3, and 4, then a "return" type event -- but because the
code for the return actually falls in the range of the "line 6" opcodes, you
would be shown line 6 during this event.  This is a change from the behaviour in
2.2 and before, and I've found it confusing in practice.  By setting and using
f_lineno when tracing, one can report a line number different from that
suggested by f_lasti on this one occasion where it's desirable.