- no global mem, every task and process can only access its local state
- communication happens through pipes
- when targeting fpgas its size is the size of a single block ram
- customisation of lowering is possible in a separate constraint file
- autoclocking
- we want to clock proc fsms at higher freqs to not keep seqvs starved most of the time
- still can be clocked with one main clock and each item has internal clock enable counter (???)
process
- may contain loops
- may contain state
- can be lowered to a state machine (never pipeline)
- multiport memory can be impled as 1R1W
- may stop (reach terminal state)
- must not be nested
- may contain div
- may contain operations with unpredictable latency
sequence
- must not contain loops/state
- "inplace" mutation can be expressed as comb logic and is thus not considered state
- must not contain TDM memory
- must not contain div with runtime divisor
- can be lowered to (auto) pipeline (head-fsm-tail-pipeline or just pipeline)
- should only perform auto pipelining on builtins with predictable latency (mul, simd ops) (no dynamic div!)
- head gathers inputs (first stage, blocking reads), tail computes with it (nonblocking reads are allowed)
- head pushes data to tail only when every sink has a slot for push
- used to express pipelineable logic
- either explicitly by ||| stage separators
- this shouldnt exist. pipelining should be automatic and timing driven.
- idk if autopiping is doable even via hacks to vendor synth tools
- yosys doesnt provide timing info?
- or implicitly for @maps or muls
- blocking reads from pipes can only be in first stage
- pipeline fires when all buffer sinks have slots (in presence of blocking sends)
- is_valid bit at each stage
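The firing rule above (blocking reads only in the head, the head fires only when every buffer sink has a slot, an is_valid bit per stage) can be sketched as a small behavioural model. This is only an illustration in Python, not the intended lowering: the function name `step` and the tuple layout are hypothetical, and the model assumes the tail always advances and ignores accounting for items already in flight.

```python
def step(stages, head_input, sinks_have_slot):
    """Advance a modelled pipeline by one cycle.

    stages: list of (is_valid, value) tuples, index 0 is the head stage.
    head_input: value available to the blocking read, or None if nothing is there.
    sinks_have_slot: iterable of bools, one per buffer sink with a blocking send.
    Returns (new_stages, output); output is the (is_valid, value) leaving the tail.
    """
    # The head fires only if its input is present and every sink has room.
    fire = head_input is not None and all(sinks_have_slot)
    output = stages[-1]
    # A bubble (is_valid = 0) enters the pipeline when the head does not fire.
    new_head = (1, head_input) if fire else (0, None)
    return [new_head] + stages[:-1], output
```

A sink without a free slot therefore only inserts bubbles at the head; stages already in the tail keep moving, which is why the tail's sends never need to block in this model.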
function
- parameters can be either
- direct (by value): T
- indirect (by reference): out T ("write only") or inout T ("read write")
- name(arg1:T, arg2:out T, arg3:inout T)
- must express only combinational logic
- can be used in both sequences and processes
clock
- globally defined statements for io procs
- clock clk_name: 133Mhz
- @wait_cycles(n) waits n cycles in io process
- @switch_to(clk_iden) switches to another clock at runtime
- this construct makes clock derivation less problematic
for in
for i in 0..n
- n may be dynamic in processes
- n must be static in sequences
for k in array_ref
- iterate over items in array
pin
- chip IO stuff
- default must be specified for out and inout
pin in
pin out
pin inout
@read_pin
@write_pin
io process
- user defined period
- loop gets invoked at every specified kth time point
- runtime switchable poll rate
- allows to negotiate faster rate on links
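One way to picture a runtime-switchable poll rate driven from a single main clock is a clock-enable counter. The sketch below is a Python behavioural model under that assumption; the class `ClockEnable` and its methods are hypothetical names, and the period is counted in main-clock cycles (at least 1).

```python
class ClockEnable:
    """Counts main-clock cycles and signals when the io loop body should run."""

    def __init__(self, period):
        self.period = period  # main-clock cycles between two invocations
        self.count = 0

    def set_period(self, period):
        # Runtime switch of the poll rate; the counter restarts from zero.
        self.period = period
        self.count = 0

    def tick(self):
        """Advance one main-clock cycle; return True on every period-th cycle."""
        self.count += 1
        if self.count >= self.period:
            self.count = 0
            return True
        return False
```

With a period of 3 the loop body runs on every third cycle; calling `set_period(2)` after a link negotiation makes it run on every second cycle instead.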
bridge process
- connection to verilog modules
struct
- the structs
union
- the unions
enum
- enums
graph
- specify connectivity between seqvs and processes
- arbitrary connections, cycles are ok
- procs (fsm) can wait arbitrarily long until slot is available in buffer sinks
- seqvs can wait arbitrary long until all buffer sinks with blocking pushes have space (reduces runrate)
iN
- arbitrary width unsigned integer
- signed ints are in two's complement format
- same as [i1;N]
- shr, shl, plus, neg, etc.
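As a sanity check of the two's complement reading of iN, here is a small Python sketch. The helper names (`wrap`, `as_signed`, `neg`, `to_bits`) are hypothetical, and the LSB-first bit order in `to_bits` is an assumption; the notes only say that iN is the same as [i1;N].

```python
def wrap(value, n):
    """Keep only the low n bits, as an n-bit register would."""
    return value & ((1 << n) - 1)

def as_signed(bits, n):
    """Interpret an n-bit pattern as a two's complement signed integer."""
    bits = wrap(bits, n)
    return bits - (1 << n) if bits >> (n - 1) else bits

def neg(bits, n):
    """Two's complement negation, truncated to n bits."""
    return wrap(-bits, n)

def to_bits(bits, n):
    """View an n-bit value as a list of single bits (assumed LSB first)."""
    return [(bits >> i) & 1 for i in range(n)]
```

For example, the 4-bit pattern 1111 reads as 15 when unsigned and as -1 when interpreted as signed.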
[T;n]
- arrays of length n of T items
- @map(item, fun_ref) enables simd operations
- annotations #[impl(...)] to request particular impl
- can be either lutram, bram or bkram
- lutram: ram by registers
- bram: ram by block ram
- bkram: should synthesise as banked ram (one bram = one bank) with conflict minimisation
- seqvs must not block on mem accs so lowering is dependent on inferred port number
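To make "banked ram with conflict minimisation" concrete, the sketch below checks which banks would be hit more than once in a single cycle. It is a Python illustration only: the low-order interleaving (`addr % n_banks`) is an assumed mapping, not something the notes fix, and both function names are hypothetical.

```python
def bank_of(addr, n_banks):
    """Assumed mapping: consecutive addresses go to consecutive banks."""
    return addr % n_banks

def conflicts(addrs, n_banks):
    """Return the set of banks hit by more than one access in the same cycle."""
    seen, clash = set(), set()
    for a in addrs:
        b = bank_of(a, n_banks)
        # First hit on a bank is recorded; any later hit marks a conflict.
        (clash if b in seen else seen).add(b)
    return clash
```

With four banks, accesses to addresses 0, 1, 2, 3 are conflict free, while 0 and 4 both land in bank 0 and would need to be serialised or remapped.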
- pipes
- monodirectional fifo
- single producer & single consumer
- guaranteed reads and test reads
- any pipe is either buffer or stream
buffer
- producer stalls when no slots available
- @try_rcv -> (T, i1), if data item present, consume it
- @rcv -> T, blocking read
- @send, blocking send
stream
- old values get dropped on overfill
- @try_rcv -> (T, i1), if data item present, consume it
- @rcv -> T, blocking read
- @send, non blocking send
- <X> in T: read only pipe of Ts (X can be either buffer or stream)
- <X> out T: write only pipe of Ts (X can be either buffer or stream)
- used to connect sequences and processes
- @mk_pipe(capacity) used for creation of both buffers and streams
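The difference between buffer and stream pipes can be shown with a small behavioural model. This is a Python sketch, not the planned implementation: the class `Pipe` and its method names are hypothetical, a failed `send` stands in for "producer stalls", and dropping the oldest item is one reading of "old values get dropped on overfill".

```python
from collections import deque

class Pipe:
    """Single-producer, single-consumer FIFO modelled in software."""

    def __init__(self, capacity, kind):
        assert kind in ("buffer", "stream")
        assert capacity > 0
        self.capacity = capacity
        self.kind = kind
        self.items = deque()

    def has_slot(self):
        return len(self.items) < self.capacity

    def send(self, item):
        """Return True if accepted; False means a buffer producer must stall."""
        if not self.has_slot():
            if self.kind == "buffer":
                return False       # buffer: producer waits for a free slot
            self.items.popleft()   # stream: the oldest value is dropped
        self.items.append(item)
        return True

    def try_rcv(self):
        """Model of @try_rcv -> (T, i1): consume an item if one is present."""
        if self.items:
            return self.items.popleft(), 1
        return None, 0
```

A blocking @rcv would correspond to retrying `try_rcv` every cycle until the second element of the result is 1.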
- how do we clock graphs?
- non io entities are all clocked from the same source (fmax?)
- doesnt seem reasonable to do different clocking for main logic, but it looks like a must for io procs
- how do we lower io procs?
- is serdes vs non serdes different?
- i want them to be able to switch poll rate to allow for negotiation of transfer rates
- how should we do cdc for io procs ?
- theres a psram block in gw1nr. it is possible to make MEM2RW and make burst loads from psram on one port, and have another for other tasks. it is unclear how to do the same thing in ddl.
- io proc as a psram controller ; two stream pipes of depth one, one for sending commands and another for sending back data ; few clock cycles longer than impl in verilog
nesting by indentation instead of brackets
process STM (arg1: i1) -- only direct parameters, () may be omitted
    mut state: MyEnum
    mut mem: [i1;32]
    let some_const: i1
    init
        -- init exprs go into init block
        -- it will be called only once
        state = MyEnum::Uninit
        mem = @zeroed()
        some_const = 0
    loop label
        match state
            Pattern1 =>
                continue label
            Pattern2 => -- match also accepts nonincreasing indentation
            -- some_action
            break label
-- only direct parameters, must contain parameters
sequence Exmpl (arg1: buffer in i4, arg2: buffer in i4, arg3: buffer out i4)
    -- stage 1
    let val1 = @rcv(arg1) -- only first stage can contain blocking reads
    |||
    -- stage 2
    let (val2, is_valid) = @try_rcv(arg2) -- can only contain nonblocking reads
    |||
    -- stage 3
    let res: i4 = val1 * val2 -- multiplication may extend the pipeline
    @send(arg3, res) -- blocking send; the pipeline head has already checked that the sink has a free slot, so this never actually blocks
function name (arg1: i1, arg2: inout [i8;8])
    arg2[0] += arg1 -- this will be lowered differently based on whether it is used in process or sequence
pin out led_enable: i1 = 0
pin in data_pin: i1
clock ex1 = 12*10**6
io process LedBlinker (arg1: stream out i1)
    mut counter: u32
    init
        counter = 12*10**6
        @switch_to(ex1)
    loop
        counter -= 1
        if counter == 0
        then
            counter = 12*10**6
            led_enable = !led_enable
        let smth = @read_pin(data_pin)
        @send(arg1, smth) -- non blocking send, because sink is a stream