# Static scheduling infrastructure

Scheduling is a common concern in hardware design, for example in high-level
synthesis flows targeting an FSM+Datapath execution model (“static HLS”). This
document gives an overview of, and provides rationale for, the infrastructure in
the `circt::scheduling`

namespace. At its core, it defines an **extensible
problem model** that acts as an interface between **clients** (i.e. passes that
have a need to schedule a graph-like IR) and reusable **algorithm**
implementations.

This infrastructure aims to provide:

- a library of ready-to-use problem definitions and schedulers for clients to hook into.
- an API to make algorithm implementations comparable and reusable.
- a mechanism to extend problem definitions to model additional concerns and constraints.

## Getting started ¶

Let’s walk through a simple example. Assume we want to *schedule* the
computation in the entry block of a function such as `@foo(...)`

in the listing
below. This means we want to assign integer *start times* to each of the
*operations* in this untimed IR.

```
func @foo(%a1 : i32, %a2 : i32, %a3 : i32, %a4 : i32) -> i32 {
%0 = arith.addi %a1, %a2 : i32
%1 = arith.addi %0, %a3 : i32
%2:3 = "more.results"(%0, %1) : (i32, i32) -> (i32, i32, i32)
%3 = arith.addi %a4, %2#1 : i32
%4 = arith.addi %2#0, %2#2 : i32
%5 = arith.addi %3, %3 : i32
%6 = "more.operands"(%3, %4, %5) : (i32, i32, i32) -> i32
return %6 : i32
}
```

Our only constraint is that an operation can start *after* its operands have
been computed. The operations in our source IR are unaware of time, so we need
to associate them with a suitable *operator type*. Operator types are an
abstraction of the target architecture onto which we want to schedule the source
IR. Here, the only *property* we need to model is their *latency*. Let’s assume
that additions take 1 time step, the operations in the dummy `more.`

dialect
take 3 time steps. As the return operation just passes control back to the
caller, we assume a latency of 0 time steps for it.

### Boilerplate ¶

The scheduling infrastructure currently has three toplevel header files.

```
//...
#include "circt/Scheduling/Problems.h"
#include "circt/Scheduling/Algorithms.h"
#include "circt/Scheduling/Utilities.h"
//...
using namespace circt::scheduling;
```

### Constructing a problem instance ¶

Our stated goal requires solving an acyclic scheduling problem without resource
constraints, represented by the `Problem`

class in the scheduling
infrastructure. We need to construct an *instance* of the problem, which serves
as a container for the problem *components* as well as their properties. The
MLIR operation passed as an argument to the `get(...)`

method is used to emit
diagnostics.

```
auto prob = Problem::get(func);
```

Then, we set up the operator types with the latencies as discussed in the introduction. Operator types are identified by string handles.

```
auto retOpr = prob.getOrInsertOperatorType("return");
prob.setLatency(retOpr, 0);
auto addOpr = prob.getOrInsertOperatorType("add");
prob.setLatency(addOpr, 1);
auto mcOpr = prob.getOrInsertOperatorType("multicycle");
prob.setLatency(mcOpr, 3);
```

Next, we register all operations that we want to consider in the problem instance, and link them to one of the operator types.

```
auto &block = func.getBlocks().front();
for (auto &op : block) {
prob.insertOperation(&op);
if (isa<func::ReturnOp>(op))
prob.setLinkedOperatorType(&op, retOpr);
else if (isa<arith::AddIOp>(op))
prob.setLinkedOperatorType(&op, addOpr);
else
prob.setLinkedOperatorType(&op, mcOpr);
}
```

Note that we do not have to tell the instance about the *dependences* between
the operations in this simple example because the problem model automatically
includes the SSA def-use-edges maintained by MLIR. However, we often have to
consider additional dependences that are not represented by value flow, such as
memory dependences. For these situations, so-called
auxiliary
dependences between operations are inserted explicitly into the problem:
`prob.insertDependence(srcOp, destOp)`

.

### Scheduling ¶

Before we attempt to schedule, we invoke the `check()`

method, which ensures
that the constructed instance is complete and valid. For example, the check
would capture if we had forgot to set an operator type’s latency. We dump the
instance to visualize the dependence graph.

```
auto checkRes = prob.check();
assert(succeeded(checkRes));
dumpAsDOT(prob, "sched-problem.dot");
```

We use a simple list scheduler, available via the `Algorithms.h`

header, to
compute a solution for the instance.

```
auto schedRes = scheduleASAP(prob);
assert(succeeded(schedRes));
```

### Working with the solution ¶

The solution is now stored in the instance, and we invoke the problem’s
`verify()`

method to ensure that the computed start times adhere to the
precedence constraint we stated earlier, i.e. operations start after their
operands have computed their results. We can also convince ourselves of that by
dumping the instance and inspecting the solution.

```
auto verifRes = prob.verify();
assert(succeeded(verifRes));
dumpAsDOT(prob, "sched-solution.dot");
```

To inspect the solution programmatically, we can query the instance in the
following way. Note that by convention, all getters in the problem classes
return `Optional<T>`

values, but as we have already verified that the start
times for registered operations are set, we can directly dereference the values.

```
for (auto &op : prob.getOperations())
llvm::dbgs() << *prob.getStartTime(&op) << "\n";
```

And that’s it! For a more practical example, have a look at the
`AffineToPipeline`

pass.

## Extensible problem model ¶

### Theory and terminology ¶

Scheduling problems come in many flavors and variants in the context of hardware
design. In order to make the scheduling infrastructure as modular and flexible
as CIRCT itself, it is build on the following idea of an *extensible problem
model*:

An *instance* is comprised of *components* called *operations*, *dependences*
and *operator types*. Operations and dependences form a graph structure and
correspond to the source IR to be scheduled. Operator types encode the
characteristics of the target IR. The components as well as the instance can be
annotated with *properties*. Properties are either *input* or *solution*
properties, based on whether they are supplied by the client, or computed by the
algorithm. The values of these properties are subject to the *input constraints*
and *solution constraints*, which are a first-class concern in the model and are
intended to be strictly enforced before respectively after scheduling.

Concrete problem definitions derived from this model share the same representation of the components, but differ in their sets of properties (and potentially distinction of input and solution properties) and input and solution constraints. Hence, we tie together properties and constraints to model a specific scheduling problem. Extending one (or more!) parent problems means inheriting or adding properties, and redefining the constraints (as these don’t always compose automatically).

A key benefit of this approach is that these problem definitions provide a
reliable contract between the clients and algorithms, making it clear which
information needs to be provided, and what kind of solution is to be expected.
Clients can therefore choose a problem definition that fits their needs, and
algorithms can *opt-in* to accepting a specific subset of problems, which they
can solve efficiently. Extensibility is ensured because new problem definitions
can be added to the infrastructure (or inside a specific lowering flow, or even
out-of-tree) without adapting any existing users.

### Implementation ¶

See Problems.h / Problems.cpp.

#### Problem definitions ¶

The `Problem`

class is currently the base of the problem hierarchy. Several
extended problems are
currently defined via
virtual multiple inheritance. Upon construction, a `containingOp`

is passed to
instances. This MLIR operation is currently only used to emit diagnostics, and
has no semantic meaning beyond that.

#### Components ¶

The infrastructure uses the following representation of the problem components.

Operations are just `mlir::Operation *`

s.

We distinguish two kinds of dependences, *def-use* and *auxiliary*. Def-use
dependences are part of the SSA graph maintained by MLIR, and can distinguish
specific result and operand numbers. As we expect any relevant graph-like input
IR to use this MLIR facility, instances automatically consider these edges
between registered operations. Auxiliary dependences, in contrast, only specify
a source and destination operation, and have to be explicitly added to the
instance by the client, e.g. for control or memory dependences. The
`detail::Dependence`

class abstracts the differences between both kinds, in
order to offer a uniform API to iterate over dependences and query their
properties.

Lastly, operator types are identified by `mlir::StringAttr`

s, in order to give
clients maximum flexibility in modeling their operator library. This may change
in the future, when a CIRCT-wide concept to model physical properties of
hardware emerges.

#### Properties ¶

Properties can involve arbitrary data types, as long as these can be stored in
maps. Problem classes offer public getter and setter methods to access a given
components properties. Getters return optional values, in order to indicate if a
property is unset. For example, the signature of the method the queries the
computed start time is `Optional<unsigned> getStartTime(Operation *op)`

.

#### Constraints ¶

Clients call the virtual `Problem::check()`

method to test any input
constraints, and `Problem::verify()`

to test the solution constraints. Problem
classes are expected to override them as needed. There are no further
restrictions of how these methods are implemented, but it is recommended to
introduce helper methods that test a specific aspect and can be reused in
extended problems. In addition, it makes sense to check/verify the properties in
an order that avoids redundant tests for the presence of a particular property
as well as redundant iteration over the problem components.

## Available problem definitions ¶

*See the linked Doxygen docs for more details.*

- Problem: A basic, acyclic problem at the root of the problem hierarchy. Operations are linked to operator types, which have integer latencies. The solution comprises integer start times adhering to the precedence constraints implied by the dependences.
- CyclicProblem:
Cyclic extension of
`Problem`

. Its solution solution can be used to construct a pipelined datapath with a fixed, integer initiation interval, in which the execution of multiple iterations/samples/etc. may overlap. Operator types are assumed to be fully pipelined. - SharedOperatorsProblem: A resource-constrained scheduling problem that corresponds to multiplexing multiple operations onto a pre-allocated number of fully pipelined operator instances.
- ModuloProblem: Models an HLS classic: Pipeline scheduling with limited resources.
- ChainingProblem:
Extends
`Problem`

to consider the accumulation of physical propagation delays on combinational paths along SSA dependences.

NB: The classes listed above each model a *trait*-like aspect of scheduling.
These can be used as-is, but are also intended for mixing and matching, even
though we currently do not provide definitions for all possible combinations in
order not to pollute the infrastructure. For example, the `ChainingProblem`

may
be of limited use standalone, but can serve as a parent class for a future
chaining-enabled modulo scheduling problem.

## Available schedulers ¶

- ASAP list scheduler
(
`ASAPScheduler.cpp`

): Solves the basic`Problem`

with a worklist algorithm. This is mostly a problem-API demo from the viewpoint of an algorithm implementation. - Linear programming-based schedulers
(
`SimplexSchedulers.cpp`

): Solves`Problem`

,`CyclicProblem`

and`ChainingProblem`

optimally, and`SharedOperatorsProblem`

/`ModuloProblem`

with simple (not state-of-the-art!) heuristics. This family of schedulers shares a tailored implementation of the simplex algorithm, as proposed by de Dinechin. See the sources for more details and literature references. - Integer linear programming-based scheduler
(
`LPSchedulers.cpp`

): Demo implementation for using an ILP solver via the OR-Tools integration.

## Utilities ¶

See
`Utilities.h`

:

- Topological graph traversal
- DFA to compute combinational path delays
- DOT dump

## Adding a new problem ¶

*See e.g.
#2233, which added the
ChainingProblem.*

- Decide where to add it. Guideline: If it is trait-like and similar to the
existing problem mentioned above, add it to
`Problems.h`

. If the model is specific to your use-case, it is best to start out in locally in your dialect/pass. - Declare the new problem class and inherit
*virtually*from the relevant superclasses (at least`Problem`

). - Define additional properties (private), and the corresponding public
getters/setters. Getters return
`Optional<T>`

values, to indicate an unset state.- Note that dependence properties are somewhat expensive to store, making it
desirable that clients and algorithms expect and handle the unset state.
This should be clearly documented. Example:
`distance`

property in`CyclicProblem`

.

- Note that dependence properties are somewhat expensive to store, making it
desirable that clients and algorithms expect and handle the unset state.
This should be clearly documented. Example:
- Redefine the
`getProperties(*)`

methods to get dumping support. These should consider any properties the new class adds, plus properties defined in the superclass(es). - Redefine
`check()`

(input constraints) and`verify()`

(solution constraints). If possible, follow the design used in the existing problem classes. - Write a couple of “positive” testcases, as well as at least one error test for
each input/solution constraint, as validated by
`check()`

/`verify()`

. See`TestPasses.cpp`

for inspiration. The testing situation will hopefully improve with #2760.

## Adding a new scheduler ¶

*See e.g.
#2650, which added a
scheduler for the CyclicProblem.*

- Schedulers should opt-in to specific problems by providing entry points for
the problem subclasses they support. Example:
`LogicalResult awesomeScheduler(Problem &prob); LogicalResult awesomeScheduler(CyclicProblem &prob);`

- Schedulers can expect that the input invariants were enforced by a
`check()`

-call in the client, and must compute a solution that complies with the solution constraints when the client calls the problem’s`verify()`

method. - Schedulers can live anywhere. If a new algorithm is not entirely
dialect/pass-specific and supports problems defined in
`Problems.h`

, it should offer entry points in`Algorithms.h`

. - Objectives are not part of the problem signature. Therefore, if an algorithm supports optimizing for different objectives, clients should be able to select one via the entry point(s).