//This is a DFiant-equivalent design of https://github.com/tommythorn/silice-examples/blob/master/cpu.ice
import DFiant.*
class CPU(using DFC) extends DFDesign:
  val leds = DFBits(8) <> OUT
  val rf = DFBits(32).X(16) <> VAR init Vector(
    0, 1, 1, 0, 100, 0, 1
  ).padTo(16, 0).map(_.toBits(32))
  // A add, B blt
  val code = DFBits(32).X(32) const Vector(
    h"32'0A556", // r5, r5, r6
    h"32'0A312", // r3 = r1 + r2
    h"32'0A120", // r1 = r2
    h"32'0A230", // r2 = r3
    h"32'0B034" // if r3 < r4: pc = 0
  ).padTo(32, h"32'0")
  object Insn extends DFFields:
    val brtarget = DFUInt(16) <> FIELD
    val opcode = DFBits(4) <> FIELD
    val rd, rs, rt = DFBits(4) <> FIELD
  val pc = DFUInt(32) <> VAR init 0
  val insn = Insn <> VAR //bubble init
  val wb_addr = DFBits(4) <> VAR //bubble init
  val wb_data = DFBits(32) <> VAR //bubble init
  insn := code(pc.bits(4, 0)).pipe.as(Insn)
  val rs_data = rf(insn.rs).pipe
  val rt_data = rf(insn.rt).pipe
  val rs_data_fw = if (insn.rs == wb_addr && wb_data.isValid) wb_data else rs_data
  val rt_data_fw = if (insn.rt == wb_addr && wb_data.isValid) wb_data else rt_data
  if (insn.opcode == h"A" && insn.rd != h"0") wb_addr := insn.rd
  else wb_addr := ?
  wb_data := rs_data_fw.uint + rt_data_fw.uint
  rf(wb_addr) := wb_data
  if (insn.opcode == h"B" && rs_data_fw.uint < rt_data_fw.uint && insn.brtarget.isValid)
    pc := insn.brtarget
    //flush by forcing bubbles in the pipeline
    insn := ?
    wb_data := ?
    wb_addr := ?
  else pc := pc + 1
  if (sim.inSimulation)
    val cycle = DFUInt(32) <> VAR init 0
    if (cycle >= 80) sim.finish()
    cycle := cycle + 1
    if (wb_addr.isValid)
      sim.report(msg"$cycle WB $pc:$insn $rs_data_fw,$rt_data_fw $wb_data -> r$wb_addr")
    else
      sim.report(msg"$cycle WB $pc:$insn $rs_data_fw,$rt_data_fw")
end CPU
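For readers who want to check what the five-instruction program computes, here is a rough reference model in plain Scala (not DFiant). It is only a sketch under stated assumptions: the `Insn` fields are assumed to be packed MSB-first in declaration order (which matches the comments next to the instruction words), instructions execute one at a time with no pipeline, and the model halts once `pc` walks past the five program words, whereas the design above keeps fetching the zero-padded entries. The names `CpuModel`, `decode`, `State`, and `run` are hypothetical.

```scala
object CpuModel {
  // Assumed layout, MSB first: brtarget(16) | opcode(4) | rd(4) | rs(4) | rt(4)
  final case class Insn(brtarget: Int, opcode: Int, rd: Int, rs: Int, rt: Int)

  def decode(word: Int): Insn = Insn(
    brtarget = (word >>> 16) & 0xFFFF,
    opcode   = (word >>> 12) & 0xF,
    rd       = (word >>> 8) & 0xF,
    rs       = (word >>> 4) & 0xF,
    rt       = word & 0xF
  )

  // Same instruction words and initial register values as the design above.
  val program: Vector[Int] = Vector(0x0A556, 0x0A312, 0x0A120, 0x0A230, 0x0B034)
  val initialRegs: Vector[Int] = Vector(0, 1, 1, 0, 100, 0, 1).padTo(16, 0)

  final case class State(pc: Int, regs: Vector[Int])

  // Sequential execution; stops when pc leaves the program (a modelling assumption).
  def run(maxSteps: Int = 1000): State = {
    var st = State(0, initialRegs)
    var steps = 0
    while (steps < maxSteps && st.pc < program.length) {
      val i = decode(program(st.pc))
      st = i.opcode match {
        case 0xA => // add; writes to r0 are dropped, as in the design
          val sum = st.regs(i.rs) + st.regs(i.rt)
          val regs = if (i.rd != 0) st.regs.updated(i.rd, sum) else st.regs
          State(st.pc + 1, regs)
        case 0xB if st.regs(i.rs) < st.regs(i.rt) => // blt taken (values stay small and non-negative)
          State(i.brtarget, st.regs)
        case _ => // untaken branch or unknown opcode: fall through
          State(st.pc + 1, st.regs)
      }
      steps += 1
    }
    st
  }
}
```

Calling `CpuModel.run()` walks a Fibonacci-style loop: `r3` grows until it is no longer below `r4` (100), and `r5` counts the loop iterations.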
Thank you for this example.
The implicit valid/ready everywhere is convenient and your pipeline abstraction is definitely an improvement over Chisel. How well this is optimized would be my first question.
I think

> DFiant automatically balances pipelined value access, so no values from various stages are accessed without balancing.

is what had me confused at first. For example, in line 40 `insn` comes from several stages up (I'd have to count them, as it's not obvious) and is combined (indirectly) with `rs_data`, which is at a different distance. I can see how this can work, but it certainly feels a bit magic, and the effective pipeline depth isn't explicit.
> How well this is optimized would be my first question.

That's a good question, but currently I have no answer since there is some work to be done. The goal is to only generate the handshaking signals if they are required. The generated code is not only optimized for logic (and possible performance characteristics), but also for readability.
> is what had me confused at first. For example, in line 40 `insn` comes from several stages up (I'd have to count them, as it's not obvious) and is combined (indirectly) with `rs_data`, which is at a different distance. I can see how this can work, but it certainly feels a bit magic, and the effective pipeline depth isn't explicit.

What's cool about DFiant is that pipelining is just a constraint. The compiler can automatically add pipeline stages if I tell it to. Balancing is different: DFiant keeps your code correct if you add `.pipe` tags and forget to balance them yourself. Another cool thing is that you can print out the code after the balancing stage. So if I have code like `(x - x.prev) * (y - y.prev).pipe`, and `x` and `y` come from the same path, then the implicit join of the arithmetic operation forces the compiler to balance the pipe to maintain correctness. When you print the code after the balancing stage, it looks like `(x - x.prev).pipe * (y - y.prev).pipe`. Notice that there is a difference between `.prev` and `.pipe`: `.prev` is part of the function, whereas `.pipe` is just a constraint.
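To make the balancing argument concrete, here is a toy model in plain Scala, not DFiant code and not how the compiler is implemented. It assumes a token stream can be modelled as a `Vector[Option[Int]]` where `None` stands for a bubble, that `pipe` only adds one slot of latency, and that `prev` delays by one token with an initial value. All names (`PipeBalanceModel`, `tokens`, `prev`, `pipe`, `lift2`, `valid`) are hypothetical.

```scala
object PipeBalanceModel {
  type Stream = Vector[Option[Int]] // None models a bubble

  def tokens(xs: Int*): Stream = xs.toVector.map(x => Option(x))

  // prev: functional one-token delay with an initial value (changes the result).
  def prev(s: Stream, init: Int): Stream = Option(init) +: s.dropRight(1)

  // pipe: latency only, modelled as one leading bubble (should not change the result).
  def pipe(s: Stream): Stream = None +: s

  // Slot-by-slot join of two streams; a bubble on either side yields a bubble.
  def lift2(a: Stream, b: Stream)(f: (Int, Int) => Int): Stream =
    a.zip(b).map {
      case (Some(x), Some(y)) => Some(f(x, y))
      case _                  => None
    }

  // Drop bubbles to look only at the produced values.
  def valid(s: Stream): Vector[Int] = s.flatten

  val x: Stream = tokens(1, 3, 6, 10)
  val y: Stream = tokens(2, 4, 8, 16)
  val dx: Stream = lift2(x, prev(x, 0))(_ - _) // x - x.prev
  val dy: Stream = lift2(y, prev(y, 0))(_ - _) // y - y.prev

  val reference: Vector[Int]  = valid(lift2(dx, dy)(_ * _))             // no pipe at all
  val unbalanced: Vector[Int] = valid(lift2(dx, pipe(dy))(_ * _))       // only one side delayed
  val balanced: Vector[Int]   = valid(lift2(pipe(dx), pipe(dy))(_ * _)) // both sides delayed
}
```

In this model the balanced product carries the same values as the unpiped reference, only one slot later, while the unbalanced product pairs each `dx` token with the wrong `dy` token. That misalignment is the kind of error the balancing stage is described as preventing.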
A few notes:
- `DFBits` (bit aggregation) and `DFUInt` (unsigned arithmetic) are distinct types. Conversion between the two is fairly simple. So the program counter is typically a `DFUInt`, while a register in the register file is a `DFBits`.
- `DFBits(32).X(16)` is a 16-element vector of 32-bit dataflow values.
- By assigning `?` to a dataflow variable, we define it as a bubble. Any logical/arithmetic operation with a bubble always yields a bubble, unless a condition with `isValid` is used to guard an if-statement.
- `pc := pc + 1` is equivalent to `pc := pc.prev + 1`: since `pc` was not assigned another value earlier in the scope, the previous dataflow value is used, which creates an incremental accumulation.
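
The bubble and accumulation notes can be sketched in plain Scala as well. This is an analogy rather than DFiant semantics: a bubble is assumed to behave like `None`, an `isValid` guard like `isDefined`, and a bubble write-back address is assumed never to match in the forwarding mux. The names `BubbleModel`, `add`, `forward`, and `pcTrace` are hypothetical.

```scala
object BubbleModel {
  type Token = Option[Int] // None models a bubble (the `?` value)

  // Arithmetic with a bubble on either side yields a bubble.
  def add(a: Token, b: Token): Token = for (x <- a; y <- b) yield x + y

  // Forwarding mux in the spirit of rs_data_fw: use the in-flight write-back value
  // only when it targets the register being read and is valid.
  def forward(rs: Int, regValue: Token, wbAddr: Token, wbData: Token): Token =
    if (wbAddr.contains(rs) && wbData.isDefined) wbData else regValue

  // pc := pc + 1 read as pc := pc.prev + 1, starting from init 0.
  def pcTrace(steps: Int): Vector[Int] = Vector.iterate(0, steps)(_ + 1)
}
```

For instance, `BubbleModel.add(Some(1), None)` stays a bubble, while `BubbleModel.forward(3, Some(7), Some(3), Some(42))` picks the forwarded value because the addresses match and the data is valid.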