Merged
35 commits
d5085db
Inline values work in progress
markshannon Feb 3, 2024
0d88de4
Change layout of PyDictValue to make is suitable for appending to end…
markshannon Feb 4, 2024
d922903
Add copy_values function
markshannon Feb 21, 2024
c03e6cb
Tidy up _PyObject_Init
markshannon Feb 21, 2024
621ea3b
Inline values -- Work in progress
markshannon Feb 21, 2024
67daab5
Get inline values closer to working
markshannon Feb 23, 2024
34c7171
Further fixing
markshannon Feb 23, 2024
c237e79
Remove debug field
markshannon Feb 23, 2024
f80befa
Remove unused function
markshannon Feb 23, 2024
a0c11e4
Merge branch 'main' into inline-values
markshannon Feb 24, 2024
bc1ebc8
Two tweaks
markshannon Feb 24, 2024
0dc68a4
Rename dict-or-values to managed-dict
markshannon Feb 24, 2024
084519c
Update object layout doc
markshannon Feb 24, 2024
3744f32
Add news
markshannon Feb 24, 2024
75ee5a0
Fix a couple of compilation errors on Windows
markshannon Feb 24, 2024
162764c
Update gdb support
markshannon Feb 24, 2024
82ece1b
Specialize LOAD_ATTR for (uninitialized) managed dicts
markshannon Feb 24, 2024
6a762ed
Allow extra allocation in JIT for tests.
markshannon Feb 24, 2024
9dbc8dd
Fix error from merge
markshannon Feb 24, 2024
4a1f7b7
Fix another mis-merge and update generated files
markshannon Feb 24, 2024
09121f9
Fix formatting
markshannon Feb 24, 2024
c384b05
Fix up free threading
markshannon Feb 24, 2024
ee5cf2a
Merge branch 'main' into inline-values
markshannon Feb 24, 2024
005b42b
Remove incorrect assertion
markshannon Feb 24, 2024
48d849e
Merge branch 'main' into inline-values
markshannon Feb 24, 2024
682217c
Remove some debug code
markshannon Feb 24, 2024
0ff1709
Fix off by one error.
markshannon Feb 24, 2024
895a944
Simplify PyObject_ClearManagedDict
markshannon Feb 25, 2024
1b4302a
Merge branch 'main' into inline-values
markshannon Mar 5, 2024
ecd4204
Merge branch 'main' into inline-values
markshannon Mar 26, 2024
c05d01d
Get tests passing for free-threaded build
markshannon Mar 26, 2024
d441d7b
Review comment and compiler warnings
markshannon Mar 27, 2024
3395bc2
Update stats gathering.
markshannon Mar 27, 2024
ccceffb
Better type safety. Fewer downcasts, some more upcasts
markshannon Mar 28, 2024
7700177
Merge branch 'main' into inline-values
markshannon Apr 2, 2024
Update object layout doc
markshannon committed Feb 24, 2024
commit 084519c0a4bb48148ad894eb8a000920e55fc1c4
91 changes: 75 additions & 16 deletions Objects/object_layout.md
@@ -16,7 +16,45 @@ Since the introduction of the cycle GC, there has also been a pre-header.
Before 3.11, this pre-header was two words in size.
It should be considered opaque to all code except the cycle GC.

## 3.11 pre-header
### 3.13

In 3.13, the values array is embedded into the object, so there is no
need for a values pointer (it is just a fixed offset into the object).
So the pre-header is these two fields:

* weakreflist
* dict_pointer

If the object has no physical dictionary, then the ``dict_pointer``
is set to `NULL`.


<details>
<summary> 3.12 </summary>

### 3.12

In 3.12, the pointer to the list of weak references is added to the
pre-header. In order to make space for it, the ``dict`` and ``values``
pointers are combined into a single tagged pointer:

* weakreflist
* dict_or_values

If the object has no physical dictionary, then the ``dict_or_values``
has its low bit set to one, and points to the values array.
If the object has a physical dictionary, then the ``dict_or_values``
has its low bit set to zero, and points to the dictionary.

The untagged form is chosen for the dictionary pointer, rather than
the values pointer, to enable the (legacy) C-API function
`_PyObject_GetDictPtr(PyObject *obj)` to work.
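
The low-bit tagging described above can be sketched as follows. This is a toy model that uses plain integers in place of addresses; the names `TAG_VALUES`, `tag_values` and `decode` are hypothetical and are not part of CPython.

```python
# Toy model of the 3.12 ``dict_or_values`` tagged pointer.
# Addresses are plain integers; nothing here touches real CPython memory.

TAG_VALUES = 1  # low bit set: the word points to the values array


def tag_values(values_addr: int) -> int:
    """Encode an aligned values-array address as a tagged word."""
    assert values_addr & TAG_VALUES == 0, "address must be aligned"
    return values_addr | TAG_VALUES


def decode(word: int) -> tuple[str, int]:
    """Return the kind of pointer the word holds and the untagged address."""
    if word & TAG_VALUES:
        return "values", word & ~TAG_VALUES
    # Untagged form: the word can be used directly as a dictionary pointer.
    return "dict", word
```

Because the dictionary pointer is stored untagged, code that reads the word as a plain dictionary pointer needs no masking step, which is the property the legacy C-API function relies on.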
</details>

<details>
<summary> 3.11 </summary>

### 3.11

In 3.11 the pre-header was extended to include pointers to the VM managed ``__dict__``.
The reason for moving the ``__dict__`` to the pre-header is that it allows
@@ -33,27 +71,49 @@ The values pointer refers to the ``PyDictValues`` array which holds the
values of the object's attributes.
Should the dictionary be needed, then ``values`` is set to ``NULL``
and the ``dict`` field points to the dictionary.
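
A minimal sketch of that switch from values array to dictionary, as a toy Python model rather than CPython's implementation. The class `PreHeader311`, its `keys` list and the method `materialize_dict` are hypothetical names, and the model assumes every value slot has already been filled.

```python
class PreHeader311:
    """Toy model of the 3.11 ``dict``/``values`` pair (not CPython's code)."""

    def __init__(self, keys):
        self.keys = list(keys)                 # attribute names (assumed known up front)
        self.values = [None] * len(self.keys)  # stands in for the PyDictValues array
        self.dict = None                       # no dictionary exists yet

    def materialize_dict(self):
        """Build the dictionary on demand and give up the values array."""
        if self.dict is None:
            self.dict = dict(zip(self.keys, self.values))
            self.values = None                 # models "values is set to NULL"
        return self.dict


pre = PreHeader311(["x", "y"])
pre.values[0] = 1
pre.values[1] = 2
materialized = pre.materialize_dict()
```

After `materialize_dict()` runs, only the `dict` field is live, matching the description above.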
</details>

## 3.12 pre-header
## Layout of a "normal" Python object

In 3.12, the pointer to the list of weak references is added to the
pre-header. In order to make space for it, the ``dict`` and ``values``
pointers are combined into a single tagged pointer:
A "normal" Python object is one that doesn't inherit from a builtin
class and doesn't have slots.

### 3.13

In 3.13 the values are embedded into the object, as follows:

* weakreflist
* dict_or_values
* GC 1
* GC 2
* ob_refcnt
* ob_type
* Inlined values:
  * Flags
  * values 0
  * values 1
  * ...
  * Insertion order bytes

If the object has no physical dictionary, then the ``dict_or_values``
has its low bit set to one, and points to the values array.
If the object has a physical dictionary, then the ``dict_or_values``
has its low bit set to zero, and points to the dictionary.
This has all the advantages of the layout used in 3.12, plus:
* Access to values is even faster as there is one less load
* Fast access is mostly maintained when the `__dict__` is materialized
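
To make the "fixed offset" idea concrete, the sketch below turns the 3.13 field list above into word offsets. It assumes one machine word per entry, which the document does not state, and it places the object pointer at `ob_refcnt`, as the accompanying Graphviz diagram does. `FIELDS_313` and `word_offsets` are hypothetical names.

```python
# Field order copied from the 3.13 list above. One machine word per entry is
# an assumption of this sketch, not something the document states.
FIELDS_313 = [
    "weakreflist",
    "dict_or_values",
    "GC 1",
    "GC 2",
    "ob_refcnt",
    "ob_type",
    "values flags",
    "values 0",
    "values 1",
]


def word_offsets(fields, pointee="ob_refcnt"):
    """Map each field to its offset, in words, from the field the object pointer refers to."""
    base = fields.index(pointee)
    return {name: index - base for index, name in enumerate(fields)}


offsets = word_offsets(FIELDS_313)
```

Under these assumptions the pre-header fields get negative offsets and `values 0` gets a constant positive one, so reaching a value does not require first loading a separate values pointer.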

The untagged form is chosen for the dictionary pointer, rather than
the values pointer, to enable the (legacy) C-API function
`_PyObject_GetDictPtr(PyObject *obj)` to work.
![Layout of "normal" object in 3.13](./object_layout_313.png)

For objects with opaque parts defined by a C extension,
the layout is much the same as for 3.12.

![Layout of "full" object in 3.13](./object_layout_full_313.png)

## Layout of a "normal" Python object in 3.12:

<details>
<summary> 3.12 </summary>

### 3.12:

In 3.12, the header and pre-header form the entire object for "normal"
Python objects:

* weakreflist
* dict_or_values
@@ -62,9 +122,6 @@ the values pointer, to enable the (legacy) C-API function
* ob_refcnt
* ob_type

For a "normal" Python object, one that doesn't inherit from a builtin
class or have slots, the header and pre-header form the entire object.

![Layout of "normal" object in 3.12](./object_layout_312.png)

There are several advantages to this layout:
@@ -79,4 +136,6 @@ The full layout object, with an opaque part defined by a C extension,
and `__slots__` looks like this:

![Layout of "full" object in 3.12](./object_layout_full_312.png)
</details>


1 change: 1 addition & 0 deletions Objects/object_layout_312.gv
@@ -20,6 +20,7 @@ digraph ideal {
shape = none
label = <<table border="0" cellspacing="0">
<tr><td><b>values</b></td></tr>
<tr><td border="1">Insertion order</td></tr>
<tr><td port="0" border="1">values[0]</td></tr>
<tr><td border="1">values[1]</td></tr>
<tr><td border="1">...</td></tr>
Binary file modified Objects/object_layout_312.png
45 changes: 45 additions & 0 deletions Objects/object_layout_313.gv
@@ -0,0 +1,45 @@
digraph ideal {

rankdir = "LR"


object [
shape = none
label = <<table border="0" cellspacing="0">
<tr><td><b>object</b></td></tr>
<tr><td port="w" border="1">weakrefs</td></tr>
<tr><td port="dv" border="1">dict pointer</td></tr>
<tr><td border="1" >GC info 0</td></tr>
<tr><td border="1" >GC info 1</td></tr>
<tr><td port="r" border="1" >refcount</td></tr>
<tr><td port="h" border="1" >__class__</td></tr>
<tr><td border="1">values flags</td></tr>
<tr><td port="0" border="1">values[0]</td></tr>
<tr><td border="1">values[1]</td></tr>
<tr><td border="1">...</td></tr>
<tr><td border="1">Insertion order</td></tr>
</table>>

]

class [
shape = none
label = <<table border="0" cellspacing="0">
<tr><td><b>class</b></td></tr>
<tr><td port="head" bgcolor="lightgreen" border="1">...</td></tr>
<tr><td border="1" bgcolor="lightgreen">dict_offset</td></tr>
<tr><td border="1" bgcolor="lightgreen">...</td></tr>
<tr><td port="k" border="1" bgcolor="lightgreen">cached_keys</td></tr>
</table>>
]

keys [label = "dictionary keys"; fillcolor="lightgreen"; style="filled"]
NULL [ label = " NULL"; shape="plain"]
object:w -> NULL
object:h -> class:head
object:dv -> NULL
class:k -> keys

oop [ label = "pointer"; shape="plain"]
oop -> object:r
}
Binary file added Objects/object_layout_313.png
25 changes: 25 additions & 0 deletions Objects/object_layout_full_313.gv
@@ -0,0 +1,25 @@
digraph ideal {

rankdir = "LR"


object [
shape = none
label = <<table border="0" cellspacing="0">
<tr><td><b>object</b></td></tr>
<tr><td port="w" border="1">weakrefs</td></tr>
<tr><td port="dv" border="1">dict pointer</td></tr>
<tr><td border="1" >GC info 0</td></tr>
<tr><td border="1" >GC info 1</td></tr>
<tr><td port="r" border="1" >refcount</td></tr>
<tr><td port="h" border="1" >__class__</td></tr>
<tr><td border="1">opaque (extension) data </td></tr>
<tr><td border="1">...</td></tr>
<tr><td border="1">__slot__ 0</td></tr>
<tr><td border="1">...</td></tr>
</table>>
]

oop [ label = "pointer"; shape="plain"]
oop -> object:r
}
Binary file added Objects/object_layout_full_313.png