Pytrees
Transforms need to walk values. A string is one leaf. A tuple is a container. A registered dataclass can be a container too.
A value viewed this way, as containers nested around leaves, is called a pytree. autoform uses pytrees so
transforms can apply per-leaf logic to user-defined structures.
autoform relies on Optree’s pytree utilities
for traversal and registration.
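The container/leaf view can be sketched in a few lines of plain Python. This is a toy model, not Optree's implementation: `toy_leaves` is a hypothetical helper that treats tuples, lists, and dicts as containers and everything else, including a string, as a single leaf.

```python
def toy_leaves(value):
    """Return the leaves of a nested value in left-to-right order (toy model)."""
    if isinstance(value, (tuple, list)):
        return [leaf for child in value for leaf in toy_leaves(child)]
    if isinstance(value, dict):
        # Sorted keys give a deterministic traversal order.
        return [leaf for key in sorted(value) for leaf in toy_leaves(value[key])]
    # Anything else, including a string, is one leaf.
    return [value]


nested = ("dna", {"draft": "short", "notes": ["a", "b"]})
leaves = toy_leaves(nested)
```

Here `leaves` holds the four strings in traversal order, while the tuple, dict, and list contribute only structure.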
Registration
Without pytree registration, a custom object is opaque. batch
cannot know which fields are batched. pullback cannot
route cotangents into the right fields.
With registration, the object becomes part of the same tree machinery as tuples and dictionaries.
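The difference registration makes can be illustrated with a toy registry. The names `TOY_REGISTRY`, `Note`, and `leaves_of` are hypothetical and exist only for this sketch; Optree's real registry is more involved and is keyed by namespace as well as class.

```python
TOY_REGISTRY = {}  # hypothetical registry: class -> function returning its children


class Note:
    def __init__(self, topic, draft):
        self.topic = topic
        self.draft = draft


def leaves_of(value):
    """Collect leaves; a value whose class is not registered is one opaque leaf."""
    flatten = TOY_REGISTRY.get(type(value))
    if flatten is None:
        return [value]
    return [leaf for child in flatten(value) for leaf in leaves_of(child)]


note = Note(topic="dna", draft="short")
before = leaves_of(note)  # Note is not registered yet: the whole object is one leaf

TOY_REGISTRY[Note] = lambda n: (n.topic, n.draft)
after = leaves_of(note)  # the fields are now visible as separate leaves
```

Before registration a traversal has nothing to descend into; afterwards it reaches `topic` and `draft` individually, which is what a per-field transform needs.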
PYTREE_NAMESPACE
autoform reserves PYTREE_NAMESPACE.
Register project dataclasses in that namespace so autoform and project code
agree on the same tree rules.
import optree
import autoform as af

@optree.dataclasses.dataclass(namespace=af.PYTREE_NAMESPACE)
class State:
    topic: str
    draft: str

state = State(topic="dna", draft="short")
upper = optree.tree_map(str.upper, state, namespace=af.PYTREE_NAMESPACE)
assert upper == State(topic="DNA", draft="SHORT")
That is the canonical pattern: use the same PYTREE_NAMESPACE
when registering dataclasses with Optree’s dataclass decorator
and when calling Optree pytree utilities.
Static Metadata Fields
Not every dataclass field should be a transform leaf. Use
optree.dataclasses.field(pytree_node=False) for static metadata that belongs
to the object but should not be mapped, batched, or receive feedback.
import optree
import autoform as af

@optree.dataclasses.dataclass(namespace=af.PYTREE_NAMESPACE)
class F32Array:
    values: tuple[float, ...]
    shape: tuple[int, ...] = optree.dataclasses.field(pytree_node=False)
    dtype: str = optree.dataclasses.field(default="float32", pytree_node=False)

array = F32Array(values=(1.0, 2.0), shape=(2,))
scaled = optree.tree_map(lambda x: x * 10, array, namespace=af.PYTREE_NAMESPACE)
assert scaled == F32Array(values=(10.0, 20.0), shape=(2,))
values is a pytree child, so transforms and Optree utilities can walk it.
shape and dtype are metadata, so they are stored in the tree structure and
preserved when the object is rebuilt. This is useful for array shapes, dtypes,
model names, backends, and other configuration that should remain fixed while
the leaves change.
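What "stored in the tree structure" means can be sketched with the standard-library `dataclasses` module. This is an illustration under stated assumptions, not how Optree represents tree specs: the `metadata={"static": True}` marker, `Tagged`, `split_fields`, and `rebuild` are all hypothetical names invented for this example.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Tagged:
    values: tuple
    # metadata={"static": True} is this sketch's own marker, not an Optree option.
    dtype: str = dataclasses.field(default="float32", metadata={"static": True})


def split_fields(obj):
    """Return (children, structure); static fields live inside the structure."""
    children, static = {}, {}
    for f in dataclasses.fields(obj):
        target = static if f.metadata.get("static") else children
        target[f.name] = getattr(obj, f.name)
    structure = (type(obj), tuple(children), tuple(static.items()))
    return children, structure


def rebuild(structure, children):
    """Rebuild an object from new children plus the static fields in the structure."""
    cls, _child_names, static_items = structure
    return cls(**children, **dict(static_items))


tagged = Tagged(values=(1.0, 2.0))
children, structure = split_fields(tagged)
doubled = rebuild(structure, {"values": tuple(v * 2 for v in children["values"])})
```

Only `values` is handed to the transformation; `dtype` travels with the structure, so two objects that differ only in `dtype` have different structures even when their children match.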
Manual Flatten / Unflatten
For non-dataclass classes, register the class with explicit flatten and unflatten functions.
import optree
import autoform as af

class State:
    def __init__(self, topic: str, draft: str):
        self.topic = topic
        self.draft = draft

def flatten_state(state: State):
    children = (state.topic, state.draft)
    metadata = None
    return children, metadata

def unflatten_state(metadata, children):
    topic, draft = children
    return State(topic=topic, draft=draft)

optree.register_pytree_node(State, flatten_state, unflatten_state, namespace=af.PYTREE_NAMESPACE)

state = State(topic="dna", draft="short")
upper = optree.tree_map(str.upper, state, namespace=af.PYTREE_NAMESPACE)
assert upper.topic == "DNA"
assert upper.draft == "SHORT"
The flatten rule returns two things:
children: values that Optree and autoform should walk recursively;
metadata: hashable static data stored in the tree spec and passed back to the unflatten rule.
optree.dataclasses.field(pytree_node=False) is the dataclass form of putting a
field in metadata instead of children.
The unflatten rule receives the metadata and transformed children, then rebuilds the object.
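To show how the two rules cooperate when the metadata is not `None`, here is a stdlib-only sketch of a single-level map. `Measurement`, its rules, and `toy_tree_map` are hypothetical names for illustration; a real tree map would also recurse into children that are themselves containers.

```python
class Measurement:
    """Hypothetical container: `values` are children, `unit` is static metadata."""

    def __init__(self, values, unit):
        self.values = values
        self.unit = unit


def flatten_measurement(m):
    children = tuple(m.values)
    metadata = m.unit  # hashable, and never passed to the mapped function
    return children, metadata


def unflatten_measurement(metadata, children):
    return Measurement(values=tuple(children), unit=metadata)


def toy_tree_map(fn, obj, flatten, unflatten):
    """Flatten, apply fn to each child, then rebuild with the original metadata."""
    children, metadata = flatten(obj)
    return unflatten(metadata, tuple(fn(child) for child in children))


scaled = toy_tree_map(
    lambda x: x * 10,
    Measurement(values=(1.0, 2.0), unit="ms"),
    flatten_measurement,
    unflatten_measurement,
)
```

The mapped function sees only the numbers; the unit passes through the metadata slot unchanged and reappears on the rebuilt object.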
Method-Bearing Pytrees
A registered class can also have methods. The fields are pytree leaves; the methods are ordinary Python behavior.
This makes an object a transform-visible container: fields hold the values that transforms can walk, while methods provide the callable surface that builds the text-space program.
import optree
import autoform as af

@optree.dataclasses.dataclass(namespace=af.PYTREE_NAMESPACE)
class Explainer:
    instruction: str
    style: str

    def __call__(self, topic: str) -> str:
        return af.format("{}\nstyle: {}\ntopic: {}", self.instruction, self.style, topic)
The placement of the object matters:
As an explicit input or output, transforms can see the registered fields.
As a closed-over value, the object behaves like trace-time configuration.
Use the explicit-input form when batch,
pullback, or pushforward
should act on the module fields.
The object-oriented module recipe applies the same dataclass pattern to method-bearing module objects.
Transform Behavior
Once State is a pytree, an IR can accept and return it. Transforms see the
leaves:
batch can vectorize over State(topic=[...], draft=[...]).
pullback can return field-shaped feedback such as State(topic="be more specific", draft="too terse").
while_loop can carry structured state as long as the body input and output structures match.
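The batched form, one object whose fields each hold a list, can be related to per-example objects with a small stdlib sketch. This says nothing about how autoform's batch is implemented; `Row` and `unstack` are hypothetical names used only to show the shape of the data.

```python
import dataclasses


@dataclasses.dataclass
class Row:
    topic: object
    draft: object


def unstack(batched):
    """Turn one Row holding equal-length lists into a list of per-example Rows."""
    names = [f.name for f in dataclasses.fields(batched)]
    columns = [getattr(batched, name) for name in names]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("all batched fields must have the same length")
    # zip(*columns) yields one tuple of field values per example, in field order.
    return [type(batched)(*row) for row in zip(*columns)]


batched = Row(topic=["dna", "rna"], draft=["short", "long"])
examples = unstack(batched)
```

Each element of `examples` has the same fields as the batched object, with one value per field instead of a list.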
Leaf Guidelines
Use leaves that are ordinary values or traced values:
strings, ints, floats, bools;
schema outputs;
other registered pytrees;
values produced by autoform primitives.
Avoid leaves that are runtime resources or trace-local implementation details:
open files;
sockets;
closures;
tracers leaked from another trace.
Pytrees describe structure. They do not make a value serializable, replayable, or safe to mutate.
Schemas are adjacent but different: a schema describes structured LM output. A
pytree describes how autoform walks user data. See Schemas for
schema output.