This is a technical write-up of my first contribution to the LLVM RISC-V backend (specifically on the V extension). If you have any questions to the blog post, please feel free to contact me.
- Introduction
- What are Zve and Zvl sub-extensions for?
- Main goals for compiler support on Zve, Zvl
- LLVM implementation sketch
- Wrap up
Introduction
Back in first month in Sifive, right after I was on-board, as an LLVM & compiler developer I think it is best to get familiar with RVV by putting my hands on it. So I asked Kai if there is anything I can put my hands on to get familiar with RISC-V and what we are doing in the upstream. At that time, the v-spec v1.0-rc1 just came out, and the compiler needs to support the added Zve
and Zvl
. Therefore I went on to take the task and worked on it.
Before explaining anything technical, I want to thank my teammate Kai for giving me pointer to files. I want to thank my teammate Zakk for giving me pointers across the code base, alongside with many useful suggestions on how can I do this contribution. He also contributed commits on zve
and requirement checks in RISCVISAInfo
. I also want to thank my teammate Craig for reviewing through my patches. I really appreciate their help!
What are Zve
and Zvl
sub-extensions for?
These are sub-extension regarding the RISC-V Vector extension. They are used to specify the actual architecture, which the code should be compiled onto.
The Zve
sub-extension specifies the minimum vector register length required and what types of data type are supported.

The Zvl
sub-extension specifies the minimum vector register length required. The RVV instruction set allows different VLEN configurations.
Dependencies between the extensions
We need canonical representations for these extensions, as zvl128b_zvl256b
is equivalent to zvl_64b_zvl_256b
(and also solely zvl256b
). In this sense we should derive implications for these sub-extension specifications.
We shouldn’t actually see them as “implications”, but rather “dependence” when to support such architecture. For example, zve32f
is zve32x
with 32-bit floating-point support, which requires zve32f
to “require” (depend) on zve32x
.
We can simply sketch out the dependence for the Zve
sub-extension. For the Zvl
sub-extension, the longer length specifications simply depends on shorter length specifications. Additionally, the Zve
sub-extension also depends on their corresponding Zvl
length.

The standard V extension and its implication to Zve
, Zvl
Per the v-spec document:
The V vector extension requires Zvl128b.
The V extension requires the scalar processor implements the F and D extensions, and implements all vector floating-point instructions for floating-point operands with EEW=32 or EEW=64 (including widening instructions and conversions between FP32 and FP64).
In conclusion, V depends on zve64d
and zvl128l
, so on with their subsequent dependencies.
Main goals for compiler support on Zve
, Zvl
Let vector register length information be captured
For vector-related optimizations, they will need vector register information to legalize the transformation. So the compiler will need to have vector register length respect the zvl*b
specified.
Restrict intrinsic functions based on architecture specified
Intrinsic functions should be limited upon the architecture specified. For example, if we have zve32x
, then we (the compiler) should not be able to compile RVV intrinsics with EEW = 64 or floating point operands.
Macro information for application usage
For applications to be aware of the vector architecture it is running upon, it suppose to leverage on some macro and program accordingly when the macro returns with different values. As the pull request in riscv-c-api-doc proposed:
Introduce
__riscv_v_min_vlen
,__riscv_v_max_eew
and__riscv_v_max_eew_fp
macro, for let programer easier to get vector related information, like:– What’s the minimal VLEN value guaranteed in current architecture extension?
– What’s the maximal available EEW guaranteed in current architecture extension?
– What’s the maximal available EEW for floating-point operation guaranteed in current architecture extension?
So the compiler implementation will need to have those API added for the users.
Besides the 2 main goals above, the compiler also needs to guarantee canonical output for the architecture string of these extensions. Meaning we need some place for the compiler to deal with the dependency implications of these sub-extensions.
LLVM implementation sketch
This was my first time submitting patches. So I didn’t have a clear inside-out planning on how should the commits should be. I was modifying all over the code base until things start to work.
Unify logic in RISCVISAInfo
Prerequisite work to support sub-extension implication is needed before the actual sub-extension addition. We need mechanism for “multi-character extension” parsing and places for dependency check and extension implication (satisfy their requirements).
Please check out the patches for detail:
⚙ D109215 [RISCV] Fix arch string parsing for multi-character extensions
⚙ D112359 [RISCV] Unify depedency check and extension implication parsing logics
Add sub-extension zvl
, assert vector register length with it
riscv-v-vector-bits-min
was the option to specify minimum vector register length before Zvl
was out, the option writes to RISC-V’s target transform info, allowing to contain vector register width information. The addition of Zvl
will need to make sure legality between the option.
Please check out the patch for detail:
⚙ D108694 [RISCV] Add the zvl extension according to the v1.0 spec
Add sub-extension zve
, restrict intrinsics with it
Since Zve
depends on their corresponding Zvl
specifications, this addition is rightfully after the previous step. Since we have macro defined for application usage, we can also leverage them to restrict the RVV intrinsic builtin definitions inside riscv_vector.h
(the header file for RVV intrinsic builtin-s).
Please checkout the patches for detail:
⚙ D112408 [RISCV] Add the zve extension according to the v1.0 spec
⚙ D112986 [Clang][RISCV] Restrict rvv builtins with zve macros
Update minimum requirement for vector intrinsics
Now that we have the Zve
extensions, zve32x
becomes the new minimum requirement to include vector instructions. We have successfully added the two sub-extensions!
Please checkout the patch for detail:
⚙ D112613 [Clang][RISCV] Change TARGET_BUILTIN to require zve32x for vector instruction
Pointer to files
RISCVISAInfo.cpp
interacts with architecture string parsing- Extension definitions are inside
RISCV.td
.
- Extension definitions are inside
RISCVVEmitter.cpp
generates the header file:riscv_vector.h
- Instruction definitions are inside
riscv_vector.td
.
- Instruction definitions are inside
RISCVAsmParser.cpp
parses architecture string for assembly-s.lib/Basic/Targets/RISCV.cpp
defines the macro-s in ELF (Executable and Linkable Format)
Test cases when changing architecture string:
llvm/test/MC/RISCV/attribute-arch.s
llvm/test/CodeGen/RISCV
clang/test/Driver/riscv-arch.c
clang/test/Preprocessor/riscv-target-features.c
clang/test/CodeGen/RISCV
Wrap up
I am happy to have the chance to participate the development for this new RISC-V evolution. Chances like this is why I chose to join SiFive at the very first point. I like how the company tries to grow upon building up the RISC-V ecosystem.
For others that have already broken the entry barrier to become an LLVM developer, you will never need to join anywhere to start developing. RISC-V and LLVM is in the open and welcoming everyone to help out.