Reverse-mode AD (adjoint)
The adjoint of the Fortran source code is constructed using a source-to-source and line-by-line approach, transforming the target (to be transformed) routine into three different routines, one of which computes the adjoints of the variables by reversing the order of computation of the target routine.
This is implemented in PSyclone by parsing the source code file containing the target routine, and eventually the routines it calls, transforming it into a PSyIR AST and applying reverse-mode automatic differentiation transformations to the nodes thus obtained. The resulting PSyIR tree is then written to Fortran source code.
Reverse-mode transformations
Several transformations, to be applied on PSyIR nodes, have been implemented.
In reverse-mode, all of them follow the naming convention
ADReverse[PSyIRNodeSubclass]Trans.
The one of most interest for the user is the
ADReverseContainerTrans class and its apply method.
After parsing the Fortran code file containing the target routine, an
ADReverseContainerTrans instance should be applied to it to perform
automatic differentiation.
The ADReverseContainerTrans.apply method in turn applies an
ADReverseRoutineTrans to the target routine, which goes line-by-line through
the statements found in the Routine node, applying
ADReverse[PSyIRNodeSubclass]Trans to the statements, etc.

Reverse-mode AD transformation call graph
Container transformation
- class psyclone.autodiff.transformations.ADReverseContainerTrans
A class for automatic differentation transformation of Container nodes in reverse-mode. This is the transformation to apply on the PSyIR AST generated from a source.
- apply(container, routine_name, dependent_vars, independent_vars, reversal_schedule, options=None)
Applies the transformation, returning a new container with routine definitions for both motions using the reverse-mode of automatic differentiation.
Options:- bool ‘jacobian’: whether to generate the Jacobian routine. Defaults to False.- bool ‘verbose’ : toggles explanatory comments. Defaults to False.- bool ‘simplify’: True to apply simplifications after applying AD transformations. Defaults to True.- int ‘simplify_n_times’: number of time to apply simplification rules to BinaryOperation nodes. Defaults to 5.- bool ‘inline_operation_adjoints’: True to inline all possible operation adjoints definitions. Defaults to True.- Parameters:
container (
psyclone.psyir.nodes.Container) – Container Node to the transformed.routine_name (Str) – name of the Routine to be transformed.
dependent_vars (List[Str]) – list of dependent variables names to be differentiated.
independent_vars (List[str]) – list of independent variables names to differentiate with respect to.
reversal_schedule (
psyclone.autodiff.ADReversalSchedule) – reversal schedule for routined called inside the one to transform (and inside them, etc.).options (Optional[Dict[Str, Any]]) – a dictionary with options for transformations, defaults to None.
- Returns:
a copied and modified container with all necessary Routine definitions.
- Return type:
psyclone.psyir.nodes.Container
As can be seen, the required arguments include the PSyIR [File]Container
node obtained by parsing and transforming the source code, the names of the
target routine,
dependent variables (to be differentiated)
and independent variables (to differentiate with
respect to), as well as the reversal schedule and
eventual transformation options.
The transformation returns a PSyIR Container node containing four
Routine definitions for:
the advancing (original) motion,
the recording motion, which records overwritten values to the tape,
the returning motion, which recovers values from the tape and computes the adjoints of the independent variables,
the reversing motion, which combines the two precedent recording and returning motions and is the one to call in order to differentiate.
If some other routine is called by the target one, the returned Container
node also contains four definitions for its different motions.
Target routine
The target routine is the Fortran routine in which to differentiate the
dependent variables with respect to the
independent variables.
The routines it may call will also be differentiated iff they can be found in
the [File]Container being transformed.
Dependent variables
The dependent variables are those to differentiate. Their intent in the target routine must be compatible with their values being returned, ie. they cannot be intent(in) arguments of the target routine.
Independent variables
The independent variables are those to differentiate with respect to. Their intent in the target routine must be compatible with their values being provided as arguments, ie. they cannot be intent(out) arguments of the target routine.
Reversal schedules
Reversal schedules (see Griewank and Walther[1] chapter 12.2, p.265)
specify the way a transformed routine may call other
transformed routines.
They are implemented as 3 subclasses of ADReversalSchedule.
As an example let us consider the target routine foo calling subroutines
bar and qux.
subroutine foo(x, y)
call bar(x, y)
call qux(x, y)
end subroutine foo
And let us denote:
advancing
routine the original,recording
routine the one recording values to the tape,returning
routine the one computing the adjoints,reversing
routine the one combining the two preceding. Call it to differentiate.
Split reversal schedule
In split reversal, children (called) routines follow the recording or returning motion of their parent (calling) routine.
By doing so, the computational complexity is kept as low as possible but values stored to the tape in the recording motion of the called routines need to be kept until they are called in returning motion, thus using a possibly large amount of memory.

Split reversal schedule
Joint reversal schedule
In joint reversal, all children (called) routines advance without recording when their parent (calling) routine is recording and reverse (record then immendiatly return) when their parent routine is returning.
On the one hand, this reversal schedule uses a smaller tape overall, as the values used in adjoining the called routines do no need to be stored longer than for them to be reversed. On the other hand, called subroutines computations are repeated, with increases the computational complexity of the adjoint program.

Joint reversal schedule
“Link” reversal schedule
A third possibility is to specify strong or weak links for each caller-called pair of routines, where strong links behave as in split reversal and weak links as in joint reversal.
Below is an illustration of our toy example with foo-bar a strong link
and foo-qux a weak link.

Link reversal schedule with foo-bar a strong link and foo-qux a weak link
Value tape
Prevalues of overwritten variables are recorded and restored from a value tape,
implemented as a static array.
The transformations themselves employ an ADValueTape to generate recording
and restoring statements to and from the value tape array.
- class psyclone.autodiff.tapes.ADValueTape(name, datatype)
A class for recording and recovering function values in reverse-mode automatic differentiation. The prevalues of references are recorded. Based on static arrays storing a single type of data rather than a LIFO stack. Provides methods to create the PSyIR assignments for recording and restoring operations.
- Parameters:
name – name of the value_tape (after a prefix).
datatype (Union[
psyclone.psyir.symbols.ScalarType,psyclone.psyir.symbols.ArrayType]) – datatype of the elements of the value_tape.
- Raises:
TypeError – if name is of the wrong type.
TypeError – if datatype is of the wrong type.
NotImplementedError – if datatype is not of type ‘ScalarType’.
- record(reference)
Add the reference as last element of the value_tape and return the Assignment node to record the prevalue to the tape.
- Parameters:
reference (
psyclone.psyir.nodes.Reference) – reference whose prevalue should be recorded.- Raises:
TypeError – if reference is of the wrong type.
TypeError – if the intrinsic of reference’s datatype is not the same as the intrinsic of the value_tape’s elements datatype.
NotImplementedError – if the reference’s datatype is ArrayType.
- Returns:
an Assignment node for recording the prevalue of the reference as last element of the value_tape.
- Return type:
psyclone.psyir.nodes.Assignment
- restore(reference)
Restore the last element of the value_tape if it is the symbol argument and return the Assignment node to restore the prevalue to the variable.
- Parameters:
reference (
psyclone.psyir.symbols.DataSymbol) – reference whose prevalue should be restored from the value_tape.- Raises:
TypeError – if reference is of the wrong type.
- Returns:
an Assignment node for restoring the prevalue of the reference from the last element of the value_tape.
- Return type:
psyclone.psyir.nodes.Assignment
Generating adjoints
The transformations applied to generate adjoints are detailled below. They mostly follow the guidelines found in Griewank and Walther[1] chapter 6.2, pp.125-126.
Internally, the transformations used are ADReverseAssignmentTrans,
ADReverseOperationTrans and ADReverseCallTrans, depending on the PSyIR
node being transformed.
These all return two separate lists of PSyIR statements, used respectively in
extending the recording and returning routines being generated.
Adjoints of operations
Adjoints of unary operations
Advancing motion |
Recording motion |
Returning motion |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note: some of these adjoints computations,
explicitly those for SQRT, EXP, TAN and ABS,
could reuse the (post)value of f before restoring its prevalue from the
value tape rather than recompute it (see Griewank and Walther[1]
table 4.8, p.68). This is not implemented yet.
Adjoints of binary operations
Advancing motion |
Recording motion |
Returning motion |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note: some of these adjoints computations,
explicitly those for / and **
could reuse the (post)value of f before restoring its prevalue from the
value tape rather than recompute it (see Griewank and Walther[1]
table 4.8, p.68). This is not implemented yet.
The cases detailled above are the simpler ones, of assigning the result of an operation to a variable.
When composed operations are present, an adjoint variable is declared for the adjoint of the operation itself and used to increment the adjoints of its operands.
The transformation option inline_operation_adjoints allows the user to
choose whether these operation adjoints should be substituted in further
computations of adjoints as a postprocessing step,
iff they only appear once on the RHS of an assignment.
As an example, consider the following computation involving composed operations and the associated adjoints computations, without and with substitution. Note: taping assignments are omitted below.
Composed operation |
Adjoints, without substitution |
Adjoints, with substitution |
|---|---|---|
|
|
|
Adjoints of iterative assignments
In the case of iterative assignments ie. where the LHS variable of the assignment is also present on the RHS, additional care must be taken to avoid incorrect computations of the LHS adjoint by assigning to it last rather than incrementing its value as in the general case detailled above (see Griewank and Walther[1] chapter 5.1, p.93).
As an example consider the following adjoint:
Advancing motion |
Recording motion |
Returning motion |
|---|---|---|
|
|
|
Adjoints of calls to subroutines
The adjoints of calls to subroutines depend on the reversal schedule that is used.
Whether the prevalues of the arguments are recorded and restored from the tape depend on their intent in the called subroutine, which determines whether their value might be overwritten by it or not.
Operations as subroutine call arguments are also transformed.
Split reversal schedule
Advancing motion |
Recording motion |
Returning motion |
|---|---|---|
|
[ |
[ |
[ |
[ |
|
|
|
Joint reversal schedule
Advancing motion |
Recording motion |
Returning motion |
|---|---|---|
|
[ |
[ |
[ |
[ |
|
|
|