HW Dialect Rationale
This document describes various design points of the hw dialect as well as a global perspective on the hw, comb, and sv dialects, why they are the way they are, and current status. This follows in the spirit of other MLIR Rationale docs. For more information about the other dialects, please see the Comb Dialect Rationale and SV Dialect Rationale.
General Introduction
SystemVerilog is an industry standard language for hardware design and verification, is known by a large number of engineers who write it manually, and is an important interchange format between EDA tools. However, while it is ubiquitous, SystemVerilog is not easy to generate or transform. Furthermore, it is non-trivial for compiler tools to generate high-quality human-readable SystemVerilog.
The hw, comb, and sv dialects attempt to address these problems with several major contributions:
- The hw dialect provides unifying structure and abstractions that are useful for a wide range of hardware modeling problems. It allows other dialects to “mix in” with it to provide higher level functionality. hw is roughly akin to the “std” dialect in MLIR (but better curated).
- The comb dialect provides a common set of operations for combinational logic. This dialect is designed to allow easy analysis and transformation.
- The sv dialect provides direct access to a wide variety of SystemVerilog constructs, including behavioral constructs, syntactic sugar constructs, and even idioms like ifdef blocks.
- The circt project provides a high quality implementation and a number of useful compiler passes for analyzing and transforming these dialects, and a SystemVerilog emitter that generates pretty output.
The combination of these capabilities provides a useful suite of functionality for compiler tools that want to generate high quality SystemVerilog.
Introduction to the hw Dialect
The hw dialect defines a set of common functionality, such as hw.module and hw.instance for representing hardware modules, as well as common types (e.g. hw.array<xx>) and attributes. It is not designed to model SystemVerilog or any other hardware design language directly: it doesn’t contain combinational or sequential operations and does not have “connect” semantics. Instead, it is designed to be a flexible and extensible substrate that may be extended with higher level dialects mixed into it (such as sv, comb, and, in the future, seq).
hw Type System
TODO: Describe inout types. Analogy to lvalues vs rvalues. Array indices for both forms. Arrays, structs, moving UnpackedArray to SV someday.
CLEANUP: The InOut type is defined in the hw dialect, but logically lives at the sv dialect level. sv provides connects, wires and other syntactic constructs that work with the inout type. These aren’t necessary for combinational logic, but are nonetheless pretty useful when generating Verilog.
enum Type
Enum types have the property that the bit width of the type is the minimum necessary to hold the tag values. Tag values are either explicit or sequentially numbered in tag order from 0. Enum tags are unsigned values.
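The width rule above can be modeled in a few lines. This is a minimal Python sketch, not CIRCT's implementation: the function name `enum_bit_width` and its arguments are hypothetical, and treating an enum whose largest tag value is 0 as zero-width is an assumption of the sketch rather than something the text specifies.

```python
def enum_bit_width(tags, explicit_values=None):
    """Return the minimum number of bits needed to hold every tag value.

    `tags` is the list of tag names in declaration order. When
    `explicit_values` is None the tags are numbered sequentially from 0;
    otherwise it supplies one unsigned value per tag.
    """
    if explicit_values is None:
        values = list(range(len(tags)))
    else:
        if len(explicit_values) != len(tags):
            raise ValueError("need exactly one explicit value per tag")
        values = list(explicit_values)
    if any(value < 0 for value in values):
        raise ValueError("enum tags are unsigned values")
    # int.bit_length() is the number of bits needed for an unsigned value.
    # Assumption of this sketch: a largest value of 0 yields a width of 0.
    return max(values, default=0).bit_length()


print(enum_bit_width(["A", "B", "C"]))                   # sequential tags 0..2
print(enum_bit_width(["X", "Y"], explicit_values=[1, 8]))  # explicit tags
```

Three sequential tags take the values 0, 1 and 2, so two bits suffice. An explicit tag value of 8 forces four bits even though there are only two tags.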
union Type
Union types contain a single data element (which may be an aggregate). They optionally have an offset per variant which allows non-SV layouts.
hw.module and hw.instance
The basic structure of a hardware design is made up of an “instance tree” of “modules” and “instances” that refer to them. There are loose analogies to software programs which have corresponding “functions” and “calls” (but there are also major differences, see “Instance paths” below).
A module can be a full definition (hw.module), a declaration of an external module whose signature is known but whose body is provided separately (hw.module.extern), or a declaration of an external module with a known signature that can/will be generated in the future on demand (hw.module.generated).
A simple example module looks like this (many more can be found in the testsuite):
hw.module @two_and_three(%in: i4) -> (twoX: i4, threeX: i4) {
  %0 = comb.add %in, %in : i4
  %1 = comb.add %in, %0 : i4
  hw.output %0, %1 : i4, i4
}
The signature of a module has these major components:
- A symbol name which specifies the MLIR name for the module (@two_and_three in the example above). This is what connects instances to modules in a stable way.
- A list of input ports, each of which has a type (%in: i4 in the example above). Each input port is available as an SSA value through a block argument in the entry block of an hw.module, allowing them to be used within its body. “inout” ports are modeled as inputs with an !hw.inout<T> type. Input port names are prefixed with a % because they are available as SSA values in the body.
- A list of result port names and types (twoX: i4 and threeX: i4 in the example above). In an hw.module definition, the values for the results are provided by the operands to the hw.output terminator in the body block. The names of result ports are not prefixed with % because they are not MLIR SSA values.
- A list of module “parameters”, which provide parametric polymorphism capabilities (somewhat similar to C++ templates) for modules. These are described in more detail in the “Parameterized Modules” section below.
- The verilogName attribute can be used to override the name for an external module. TODO: we should eliminate this in the future and just use the symbol.
- Other ad-hoc attributes. The hw dialect is intended to allow open extensibility by other dialects. Ad-hoc attributes put on hw dialect modules should be namespace qualified according to the dialect they come from to avoid conflicts.
This definition is fairly close to the Verilog family, but there are some notable differences, for example:
- We split output ports from input ports, and use hw.output instead of connects to specify the results. This allows better SSA dataflow analysis from the hw.output, which is useful for inter-module analyses.
- We allow arbitrary types for module ports. The hw dialect is generally designed to be extensible by other dialects, and thus being permissive here is useful. That said, the Verilog exporter does not support arbitrary user-defined types.
- The comb dialect in particular does not use signed integer types, but does support zero-width integer types. Modules in the hw dialect support both of these. Zero width ports and wires are omitted (printed as comments) when generating Verilog.
GlobalRefOp
The GlobalRefOp operation (hw.globalRef) can be used to identify the unique instance path of an operation globally. hw.globalRef can be used to attach nonlocal annotations in the FIRRTL dialect and also for metadata emission.
hw.globalRef defines a symbol and contains a list of module local hw.innerNameRef symbols to define the instance path.
In the following example, @glbl_B_M1 specifies instance “h1” in module @A, followed by instance “M1” in module @B. hw.globalRef can define a unique instance path, and each element along the way carries an attribute circt.globalRef, pointing to the global op. Thus instances participating in nonlocal paths are readily apparent.
hw.globalRef @glbl_B_M1 [#hw.innerNameRef<@A::@inst_1>, #hw.innerNameRef<@B::@memInst>]
hw.globalRef @glbl_D_M1 [#hw.innerNameRef<@A::@inst_0>, #hw.innerNameRef<@C::@inst>, #hw.innerNameRef<@D::@memInst>]
hw.globalRef @glbl_D_M2 [#hw.innerNameRef<@A::@SF>, #hw.innerNameRef<@F::@symA>]
hw.globalRef @glbl_D_M3 [#hw.innerNameRef<@A::@SF>, #hw.innerNameRef<@F::@symB>]
hw.module @D() -> () {
  hw.instance "M1" sym @memInst @FIRRTLMem() -> () {circt.globalRef = [#hw.globalNameRef<@glbl_D_M1>]}
}
hw.module @B() -> () {
  hw.instance "M1" sym @memInst @FIRRTLMem() -> () {circt.globalRef = [#hw.globalNameRef<@glbl_B_M1>]}
}
hw.module @C() -> () {
  hw.instance "m" sym @inst @D() -> () {circt.globalRef = [#hw.globalNameRef<@glbl_D_M1>]}
}
hw.module @A() -> () {
  hw.instance "h1" sym @inst_1 @B() -> () {circt.globalRef = [#hw.globalNameRef<@glbl_B_M1>]}
  hw.instance "h2" sym @inst_0 @C() -> () {circt.globalRef = [#hw.globalNameRef<@glbl_D_M1>]}
  %c0 = hw.constant 0 : i1
  %2 = hw.instance "ab" sym @SF @F (a1: %c0: i1) -> (a2 : i1) {circt.globalRef = [#hw.globalNameRef<@glbl_D_M2>, #hw.globalNameRef<@glbl_D_M3>]}
}
hw.module.extern @F(%a1: i1 {hw.exportPort = @symA, circt.globalRef = [#hw.globalNameRef<@glbl_D_M2>]}) -> (a2: i1 {hw.exportPort = @symB, circt.globalRef = [#hw.globalNameRef<@glbl_D_M3>]}) attributes {}
sv.verbatim "{{0}}" { symbols = [@glbl_D_M1] }
sv.verbatim "{{0}}" { symbols = [@glbl_B_M1] }
Instance paths
An IR for Hardware is different than an IR for Software in a very important way: while each function in a software program usually compiles into one blob of binary code no matter how many times it is called, each instance in a hardware design is typically fully instantiated, because different instances turn into different gates. The consequence of this is that the instance tree is really a compression mechanism that is eventually elaborated away.
This compression approach has major advantages: it is much better for memory and compile time to represent a single definition of a hardware block than the (possibly thousands or millions) of concrete instances that will eventually be required. However, hardware engineers often do need to reason about and control the different instances in some cases (e.g. providing physical layout constraints for one instance but not the rest).
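To make the “compression” point concrete, here is a small Python sketch that compares the number of instance operations stored in the IR with the number of concrete instances after full elaboration. The design, the module names and both helper functions are hypothetical and exist only for illustration; they do not correspond to any CIRCT API.

```python
from functools import lru_cache

# Hypothetical design: module name -> modules instantiated in its body.
DESIGN = {
    "Top": ["Tile"] * 4,
    "Tile": ["Core"] * 4,
    "Core": ["ALU"] * 4,
    "ALU": [],
}


def count_instance_ops(design):
    """Count the instance operations stored in the compressed IR."""
    return sum(len(children) for children in design.values())


def count_elaborated_instances(design, top):
    """Count concrete module instances once `top` is fully elaborated."""

    @lru_cache(maxsize=None)
    def count(module):
        # One for this instance, plus everything instantiated beneath it.
        return 1 + sum(count(child) for child in design[module])

    return count(top)


print(count_instance_ops(DESIGN))                 # instance ops in the IR
print(count_elaborated_instances(DESIGN, "Top"))  # concrete instances
```

With three levels of four-way instantiation, the IR holds 12 instance operations while the elaborated design contains 85 module instances. The gap grows geometrically with depth.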
TODO: Bake out a design for instance path references, an equivalent to the FIRRTL dialect InstanceGraph type, etc.
Parameterized Modules
The hw dialect supports parametric “compile-time” polymorphism for modules. This allows for metaprogramming along the instance tree, guaranteed “instantiation time” optimizations and code generation, further enables the “IR compression” benefits of using instances in the first place, and enables the generation of parameters in generated Verilog (which can increase the perceived readability of the generated code).
Parameters are declared on modules (including generated and external ones) with angle brackets: each parameter has a name and type, and can optionally have a default value. Instances of a parameterized module provide a value for each parameter (even defaulted ones) in the same order:
// This module has two parameters "p1" and "p2".
hw.module.extern @parameterized<p1: i42 = 17, p2: i1>(%in: i8) -> (out: i8)
hw.module @UseParameterized(%a: i8) -> (ww: i8) {
  // The instance refers to the module symbol and supplies every parameter in order.
  %r0 = hw.instance "inst" @parameterized<p1: i42 = 17, p2: i1 = 1>(in: %a: i8) -> (out: i8)
  hw.output %r0 : i8
}
This approach makes analysis and transformation of the IR simple, predictable, and efficient: because the parameter lists on instances and on modules always line up, they are indexable by integers (instead of strings), intermodule analysis is straightforward (no filling in of default values etc.), and Verilog generation is always predictable: a parameter is omitted at the instance site when the instance value and the module default are the same (e.g. in the example above, p1 is not printed at the instance site because it is the same as the default).
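The emission rule can be sketched as follows. This is an illustrative Python model, not the actual ExportVerilog code: the function `emit_instance_parameters`, its data layout and the `.name(value)` output strings are assumptions made for the example.

```python
def emit_instance_parameters(module_params, instance_values):
    """Return the parameter overrides an emitter would print for an instance.

    `module_params` is a list of (name, default) pairs in declaration order,
    with default None when the module declares no default value.
    `instance_values` is the positional list of values on the instance.
    """
    if len(module_params) != len(instance_values):
        raise ValueError("an instance must supply a value for every parameter")
    overrides = []
    # Parameters line up positionally, so no lookup by name is needed.
    for (name, default), value in zip(module_params, instance_values):
        # A value equal to the module default is left implicit in the output.
        if default is None or value != default:
            overrides.append(f".{name}({value})")
    return overrides


# Mirrors the example above: p1 defaults to 17, p2 has no default.
params = [("p1", 17), ("p2", None)]
print(emit_instance_parameters(params, [17, 1]))
print(emit_instance_parameters(params, [42, 1]))
```

The first call prints only the override for p2, because the instance value of p1 matches the default. The second call prints both.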
The sv dialect provides the sv.localparam operation, which is used for naming constants. These may be derived from module parameters or may just be nicely named constants intended to improve readability. This is part of the sv dialect (not the hw dialect) because it only makes sense as a concept when generating Verilog.
Compatibility with classic Verilog
Note that typed parameters are a SystemVerilog extension; many Verilog-only tools do not support them. If you need compatibility with Verilog-only tools, please use i32 integer, f64, and string parameters only. All module parameter definitions should have default values for maximum tool compatibility as well, even if they are always overridden.
TODO: CIRCT could do parameter legalization to convert parameters smaller or larger than 32-bits to 32-bits when generating Verilog-only output.
Valid Parameter Expression Attributes
The following attributes may be used as expressions involving parameters at an instance site or in the default value for a parameter declaration on a module:
- IntegerAttr/FloatAttr/StringAttr constants may be used as simple leaf values.
- The #hw.param.decl.ref attribute is used to refer to the value of a parameter in the current module. This is valid in most positions where a parameter attribute is used, except in the default value for a module parameter.
- The #hw.param.expr operator allows combining other parameter expressions into an expression tree. Expression trees have important canonicalization rules to ensure important cases are canonicalized to uniquable representations.
- #hw.param.verbatim<"some string"> may be used to provide an opaque blob of textual Verilog that is uniqued by its string contents. This is intended as a general “escape hatch” that allows frontend authors to express anything Verilog can, even if first-class IR support doesn’t exist yet. CIRCT does not provide any checking to ensure that this is correct or safe, and assumes it is a single expression; to be safe, parenthesize the string contents if it is not. This should eventually support substitutions like sv.verbatim.expr.
Because parameter expressions are MLIR attributes, they are immortal values that are uniqued based on their structure. This has several important implications, including:
- A parameter reference (#hw.param.decl.ref) to a parameter x doesn’t know what module it is in. The verifier checks that parameter expressions are valid within the body of a module, and that the types line up between the parameter reference and the declaration (after all, two different modules can have two different parameters named x with different types).
- We want to depend on MLIR canonicalizing and uniquing the pointer address of attributes in a predictable way to ensure that further derived uniqued objects (e.g. a parameterized integer type) are also uniqued correctly. For example, we do not want the types hw.int<x+1> and hw.int<1+x> to turn into different types. See the Parameter Expression Canonicalization section below for more details.
- Whereas the rest of the hw dialect is generally open for extension, the current grammar of attribute expressions is closed: you have to hack the HW dialect verifier and VerilogEmitter to add new kinds of valid expressions. This is considered a limitation; we’d like to move to an attribute interface at some point that would allow dialect-defined attributes. For example, this would allow moving the hw.param.verbatim attribute down to the sv dialect.
Note that there is no parameter expression equivalent for comb.sub: (sub x, y) is represented with (add x, (mul y, -1)), which makes maintaining canonical form simpler and more consistent.
Parameter Expression Canonicalization
As mentioned above, it is important to canonicalize parameter expressions. This slightly reduces memory usage, but more importantly ensures that equivalent parameter expressions are pointer equivalent: we don’t want x+1 and 1+x to be different, because that would cause everything derived from them to be as well.
On the other hand, we expect to support a lot of weird expressions over time (at least the full complement that Verilog supports) and canonicalizing arbitrary expressions in a predictable way is untenable. As such, we support canonicalizing a fixed set of expressions predictably: more may be added in the future.
This set includes:
- Constant folding: parameter expressions with all integer constant operands are folded to their corresponding result.
- Constant identities are simplified, e.g. p1 & 0 into 0, p1 * 1 into p1, p1 << 0 into p1, p1 >> 0 into p1, p1 / 1 into p1 and p1 % 1 into 0.
- Constant operand merging: any constant operands in associative operations are merged into a single operand and moved to the right, e.g. (add 4, x, 2) into (add x, 6).
- Fully associative operators flatten subexpressions, e.g. (add x, (add y, z)) into (add x, y, z).
- We simplify affine expressions into a sum of products representation, pulling additions out of products, e.g. (a+b)*c*d into (a*c*d + b*c*d).
- Operands of fully-associative expressions are put into a stable order, at least for the case of affine expressions involving constant integers and named parameters. For example p2+p1 turns into p1+p2 reliably. The actual ordering moves subexpressions to the start of the list (more complex ones first) followed by verbatims, followed by parameter references, followed by constants, each group sorted w.r.t. each other.
- Common operand factoring for adds, e.g. (a+b+a) into (2*a + b) and (a*4 + a) into (a*5).
- Shift left by constant is canonicalized into multiply to compose correctly with affine expression canonicalization, e.g. (shl x, 1) into (mul x, 2).
- As mentioned above, sub is not supported as a ParamExprAttr; it is represented as multiply by -1. This allows it to trivially compose with affine expression canonicalizations.
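A few of these rules can be illustrated with a toy canonicalizer for add expressions. This Python sketch is a simplified model and not the CIRCT implementation: the tuple encoding and the function `canonicalize_add` are hypothetical, and it only covers flattening, constant merging, dropping a zero constant, and ordering of parameter references. Factoring, sum-of-products and the other rules are left out.

```python
def canonicalize_add(expr):
    """Canonicalize a nested 'add' over parameter names and integer constants.

    Expressions are ints (constants), strs (parameter references), or
    tuples of the form ("add", operand, ...).
    """
    params = []
    constant = 0

    def visit(node):
        nonlocal constant
        if isinstance(node, int):
            constant += node            # constant operand merging
        elif isinstance(node, str):
            params.append(node)
        elif isinstance(node, tuple) and node and node[0] == "add":
            for operand in node[1:]:    # flatten nested adds
                visit(operand)
        else:
            raise TypeError(f"unsupported expression: {node!r}")

    visit(expr)
    params.sort()                       # stable order for parameter references
    # Constants go last; a merged constant of 0 is an identity and is dropped.
    operands = params + ([constant] if constant != 0 else [])
    if not operands:
        return 0
    if len(operands) == 1:
        return operands[0]
    return ("add", *operands)


print(canonicalize_add(("add", 4, "x", 2)))
print(canonicalize_add(("add", "x", ("add", "y", "z"))))
print(canonicalize_add(("add", 1, "x")) == canonicalize_add(("add", "x", 1)))
```

The last line shows the property that matters: x+1 and 1+x reduce to the same structure, so anything uniqued on top of them ends up identical too.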
clog2 Parameter Expression
A clog2 parameter expression opcode is provided, which calculates the ceiling of log base 2 of its argument. This is useful for calculating things like the minimum bitwidth needed to address memory of a parameterized size.
Note that clog2(0) is 0, which follows the Verilog spec.
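For reference, the arithmetic can be written out in a few lines of Python. This is a sketch of the math only, not of how CIRCT evaluates the opcode, and rejecting negative inputs is a choice made here for simplicity.

```python
def clog2(value):
    """Ceiling of log base 2 of a non-negative integer, with clog2(0) == 0."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if value <= 1:
        return 0
    # For value >= 2, ceil(log2(value)) equals the bit length of value - 1.
    return (value - 1).bit_length()


# Address width for a memory with `depth` entries.
for depth in (0, 1, 2, 3, 1024, 1025):
    print(depth, clog2(depth))
```

A memory with 1024 entries needs 10 address bits, while 1025 entries need 11.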
Using parameters in the body of a module
Parameters are not SSA values, so they cannot directly be used within the body of the module. Just like you use hw.constant to project a constant integer value into the SSA domain, you can use hw.param.value to project a parameter expression, like so:
hw.module @M1<param1: i1>(%clock : i1, ...) {
  ...
  %param1 = hw.param.value i1 = #hw.param.decl.ref<"param1">
  ...
  sv.if %param1 {  // Compile-time conditional on parameter.
    sv.fwrite "Only happens when the parameter is set\n"
  }
  ...
}
Alternatively, to project them with a specific name, you can use the sv.localparam declaration like so:
hw.module @M1<param1: i1>(%clock : i1, ...) {
  ...
  %param1 = sv.localparam : i1 { value = #hw.param.decl.ref<"param1">: i1 }
  ...
  sv.if %param1 {  // Compile-time conditional on parameter.
    sv.fwrite "Only happens when the parameter is set\n"
  }
  ...
}
Using sv.localparam is helpful when you’re looking to produce specifically pretty Verilog for human consumption. The optimizer won’t fold aggressively around these names.
Parameterized Types
TODO: Not done yet.
Answers to other common questions
During the design work on parameterized modules, we had several proposals for alternative designs and a lot of discussion about them. See in particular these discussions at the open design meetings:
- September 15, 2021: discussion about using SSA values vs attributes for expressions, whether parameters should just be a “special kind of port” etc.
- September 22, 2021: discussions on expression canonicalization, parameterized type casting and other topics.
This section tries to condense some of those discussions into key points:
Why do instances repeat default parameters from modules?
As described above, the full set of module parameters are specified on an instance, even if some have default values. The reason for this is that we want the IR to be simple and efficient to analyze by the compiler: keeping (and verifying that) instance parameters are in canonical form means that we can index them with integers instead of names (just like module input and result ports), and intermodule analysis/optimization doesn’t have to handle default values as a special case. Instead they are just a matter for frontends and the Verilog exporter to care about.
Why model parameters with attributes instead of SSA values?
It seems unfortunate to replicate some parts of the comb dialect (e.g. comb.add) as attributes rather than just reusing the existing operations. Such a design has historical familiarity (e.g. LLVM’s ConstantExpr class) which led to a bunch of complexity in LLVM that would have been better avoided (and yes, there are much better designs for LLVM’s purposes than what it has now).
All that said, using attributes is the right thing for a number of reasons:
- This arithmetic happens at metaprogramming time; these ops do not turn into hardware. It is important and useful to be able to know that structurally.
- We need to verify parameter expressions are valid for the module they are defined in: it isn’t generally ok for the verifier of the hw.instance op to walk an arbitrary amount of IR to check that an SSA value is valid as a parameter.
- We need to support parameterized types like !hw.int<n>: because MLIR types are immortal and uniqued, they can refer to attributes but cannot refer to SSA values (which may be destroyed).
- Operations need to be able to compute their own type without creating other operations. For example, we need to compute that the result type of comb.concat %a, %b : (i1, !hw.int<n>) is !hw.int<n+1> without introducing a new comb.add node to “add one to n”.
- In practice, comb ops and the canonicalizations that apply to them have very different goals than the canonicalizations we apply to parameter expressions.
Type declarations
Type declaration IR
Type Scope Operation
A Type Scope declares a single region with a single block that contains type declarations. Type scopes provide a means to group type declarations. Type scopes have a symbol and are themselves symbol tables, so they may be looked up, and type declarations may be looked up within them.
Type Declaration Operation
A Type Declaration declares a symbolic name for a type. It consists of:
- A symbolic name, which can be referred to in the IR by a Type Alias Type.
- A type attribute, which contains the underlying type the name refers to
- An optional string attribute, which specifies a name to give the type in the output. If not specified, the symbolic name is used in the output.
Type Alias Type
A Type Alias refers to a type declaration symbolically. It consists of:
- A symbolic reference to a Type Scope and Type Declaration within the scope
- A copy of the underlying type, cached in type storage
- A copy of the canonical type, cached in type storage
Type declaration SystemVerilog output
In ExportVerilog, Type Scopes may be split into their own output file, included at the top of multiple split output files, or included via a header file.
Each Type Declaration will be emitted according to the SystemVerilog spec, section 6.18, User-defined types. For example:
typedef logic mytype;
Type canonicalization
We take an approach to type canonicalization similar to Clang.
To implement this, CIRCT has an extra layer of indirection in the ODS declarations of types that may be targeted by Type Declarations. This layer makes it possible to generically let operations declare their operands and results to be of a specific type, and also accept type declarations that canonicalize to that type.
The HW dialect types are declared in HWTypesImpl.td, which declares the actual type and is used to generate the C++ wrappers. To make these types usable in ODS operations in a generic way that supports type declarations, wrapper types are declared in HWTypes.td, which uses helpers to define dialect types that may be either the declared type, or a type declaration whose canonical type is the declared type.
In order to “see through” type declarations and get at the canonical type, a set of helper functions are added to the hardware dialect: hw::type_isa, hw::type_cast, and hw::type_dyn_cast. These should generally be used instead of the MLIR Type::isa, Type::cast, and Type::dyn_cast, because they transparently support type declarations.
As the Clang docs state: “The only hard part here is remembering not to use the isa/cast/dyn_cast operations.”
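The idea behind these helpers can be modeled in Python. The sketch below is an analogy rather than the C++ implementation: the classes `IntType`, `ArrayType` and `TypeAlias`, the `make_alias` constructor and the scope name are all hypothetical, and only the helper names `type_isa` and `type_dyn_cast` are borrowed from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    width: int


@dataclass(frozen=True)
class ArrayType:
    element: object
    size: int


@dataclass(frozen=True)
class TypeAlias:
    # Symbolic reference, plus cached copies of the types it stands for.
    scope: str
    name: str
    inner: object      # underlying type, which may itself be an alias
    canonical: object  # what the alias ultimately resolves to


def make_alias(scope, name, inner):
    """Build an alias and cache its canonical type at construction time."""
    canonical = inner.canonical if isinstance(inner, TypeAlias) else inner
    return TypeAlias(scope, name, inner, canonical)


def type_isa(ty, cls):
    """Like isinstance, but sees through aliases in constant time."""
    if isinstance(ty, TypeAlias):
        ty = ty.canonical
    return isinstance(ty, cls)


def type_dyn_cast(ty, cls):
    """Return the canonical type if it is a `cls`, otherwise None."""
    if isinstance(ty, TypeAlias):
        ty = ty.canonical
    return ty if isinstance(ty, cls) else None


byte = make_alias("types", "byte", IntType(8))
octet = make_alias("types", "octet", byte)   # an alias of an alias
print(isinstance(octet, IntType), type_isa(octet, IntType))
```

The plain `isinstance` check fails on the alias while `type_isa` succeeds, which is exactly the pitfall the Clang quote warns about.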
Open Issues
Duplicated type in Type Alias Type IR
In order to support Type canonicalization, the Type Alias Type keeps a copy of the underlying type, as well as what it canonicalizes to, cached in Type storage. This allows the helpers like hw::type_isa to efficiently query the underlying type or canonical type in constant time.
Ideally, the Type Alias Type would only contain a symbolic reference to the declaration, and at construction time (during parsing or when built programmatically) look up and cache the underlying type and its canonical type.
Unfortunately, this is not currently possible with the MLIR parser API. There is some discussion about this on Discourse, and it seems like a good enhancement. This is tracked in issue #1642. Until then, we must duplicate the type in the IR.
Symbols and Visibility
Verilog has a broad notion of what can be named outside the context of its declaration. This is compounded by the many tools which have additional source files which refer to Verilog names (e.g. tcl files). However, we do not want to require that every wire, register, instance, localparam, port, etc which can be named not be touched by passes. We want only entities marked as public facing to impede transformation.
For this reason, wires, registers, and instances may optionally define a symbol. When the symbol is defined, the entity is considered part of the visible interface and should be preserved in transformation. Entities without a symbol defined are considered private and may be changed by transformation.
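As a rough illustration of this rule, the Python sketch below models a cleanup transformation that is free to delete unused private entities but must keep anything that carries a symbol. The dictionary layout and the function `remove_unused_private_entities` are hypothetical and do not correspond to an actual CIRCT pass.

```python
def remove_unused_private_entities(entities, used_names):
    """Drop entities that are unused and carry no symbol.

    `entities` is a list of dicts with a "name" key and an optional "sym"
    key; `used_names` is the set of names still referenced in the module.
    """
    kept = []
    for entity in entities:
        is_public = entity.get("sym") is not None
        # Symbol-bearing entities are part of the visible interface and are
        # preserved even when nothing inside the module refers to them.
        if is_public or entity["name"] in used_names:
            kept.append(entity)
    return kept


wires = [
    {"name": "w0"},                      # private and unused: removable
    {"name": "w1", "sym": "@keep_me"},   # public: must be preserved
    {"name": "w2"},                      # private but still used
]
print([w["name"] for w in remove_unused_private_entities(wires, {"w2"})])
```

Only w0 is removed: w1 survives because of its symbol, and w2 survives because it is still used.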
Implementation constraints
Currently, MLIR restricts symbol resolution to looking in and downward through any nested symbol tables when resolving symbols. This assumption has implications for verification, the pass manager, and threading. Until symbol references are more general, the SV and HW dialects do not define symbol tables for modules. Therefore, wires, registers, and interfaces exist in the same namespace as modules. We encourage prefixing these names to avoid conflicts with modules. The symbol names on these entities have no bearing on the output Verilog; each of these entities has a defined way to assign its name (SSA value name for wires and regs, a non-optional string for instances).
As MLIR symbol support improves, it is desired to move to per-module symbol tables and to unify names with symbol names.
Ports
Module ports are remotely nameable entities in Verilog, but are not easily named with symbols. A suggested workaround is to attach a wire to a port and use its symbol for remote references. Instance ports have a similar problem.
Future Directions
There are many possible future directions that we anticipate tackling, when and if the need arises:
More support for IR
Many in the CIRCT community are interested in adding first-class support for parametric modules – similar but more general than SystemVerilog module parameters. It isn’t clear yet whether this should be part of the HW dialect or something higher level.
Separate from a “good” representation of parametric modules, the SV dialect could grow direct support for representing the SystemVerilog functionality in this space, including even things like “generate” blocks.
EDA Tool-specific Subdialects
The EDA tool ecosystem is filled with a wide range of tools with different capabilities – for example see this table for one compilation of different systems and their capabilities. As such, we expect that the day will come where a frontend wants to generate fancy features for some modern systems, but cannot afford to break compatibility with other ecosystem tools.
Given the design of the HW/SV dialects, there is no need to resort to “lowest common denominator” approach here: we can allow frontends to generate “fancy” features, then use progressive lowering when dealing with tools that can’t handle them. This can also allow IP providers to decide what flavor of features they want to provide to their customers (or provide multiple different choices).
SystemVerilog Parser
As the SV dialect grows out, it becomes natural to think about building a high quality parser that reads SystemVerilog source code and parses it into the SV dialect. Such functionality could be very useful to help build tooling for the SystemVerilog ecosystem.
Such a parser should follow clang-style principles of producing high quality diagnostics, preserving source location information, being built as a library, etc.