This contains notes about using an AMD Alveo™ U200 Data Center Accelerator Card (Active).
The card was second hand and has been installed in slot 2 of a HP Z6 G4 workstation, with the 8-pin aux power cable connected. This is a PCIe 3 x16 slot connected to the CPU. The slot 2 BIOS settings were the following, with Hot Plug enabled since other FPGA boards had previously been used in the slot:
BIOS setting | Options (* = selected) |
---|---|
Slot 2 PCI Express x16 | Disable / *Enable |
Slot 2 Option Rom Download | Disable / *Enable |
Slot 2 Limit PCIe Speed | *Auto / Gen1 (2.5Gbps) / Gen2 (5Gbps) / Gen3 (8Gbps) |
Slot 2 Bifurcation | *Auto / x8x8 / x4x4x4x4 |
Slot 2 Intel VROC NVMe Raid | *Disable / Enable |
Slot 2 Hot Plug | Disable / *Enable |
Slot 2 Hot Plug Buses | 0 / *8 / 16 / 32 / 64 / 128 |
Slot 2 Resizable Bars | *Disable / Enable |
PCIe Training Reset is also enabled in the BIOS; it had previously been enabled when investigating failures of some other FPGA designs to enumerate as a PCIe endpoint.
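The effect of the Hot Plug and Hot Plug Buses settings above can be checked from Linux once the pciehp driver has claimed the slot. A minimal sketch, assuming the slot is registered under sysfs and using the root port BDF reported by dump_pci_info_pciutils later in these notes:

```
# List the hot plug slots the kernel has registered and which bus each maps to
ls /sys/bus/pci/slots/
cat /sys/bus/pci/slots/*/address
# Show the bus range and slot capabilities behind the slot 2 root port
sudo lspci -vv -s 30:00.0 | grep -E 'Bus:|HotPlug'
```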
The first micro USB cable tried didn't fit fully, and the JTAG port couldn't be detected. The Alveo Elongated USB Cable is described as:
U200/U250 Micro-USB connector location requires elongated USB for proper connection on Active cards due to blocking from heat spreader. Elongated USB cables can be used for both active and passive cards, but is a must have for active cards.
Found a different micro USB cable which, while not appearing to have a longer exposed connector, had a thinner plastic shell. That allowed the JTAG port to be detected as:
Bus 003 Device 008: ID 0403:6011 Future Technology Devices International, Ltd FT4232H Quad HS USB-UART/FIFO IC
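A quick sanity check that the FT4232H JTAG interface has enumerated, using the 0403:6011 ID from the lsusb line above:

```
# Confirm the FTDI FT4232H is present
lsusb -d 0403:6011
# Optionally list its interfaces; the FT4232H exposes four channels, one of
# which typically carries JTAG on Xilinx/AMD boards
lsusb -v -d 0403:6011 2>/dev/null | grep -E 'bNumInterfaces|bInterfaceNumber'
```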
The target part is xcu200-fsgd2104-2-e. From Alveo and Kria the corresponding standalone FPGA is an XCVU9P. Some documentation, such as the Virtex UltraScale+ Devices Available GT Quads table from the UltraScale+ Devices Integrated Block for PCI Express Product Guide (PG213), doesn't list the U200, so the corresponding standalone FPGA has to be searched for instead.
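The part can also be confirmed as known to the installed Vivado release from a batch-mode Tcl query. A minimal sketch, assuming vivado is on the PATH; the board-part query only returns something if the U200 board files have been installed:

```
cat > check_u200_part.tcl <<'EOF'
# List the xcu200 part variants known to this Vivado install
puts [get_parts xcu200*]
# List any installed U200 board parts (empty if the board files aren't installed)
puts [get_board_parts -quiet *u200*]
EOF
vivado -mode batch -source check_u200_part.tcl
```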
After initially fitting the card, booted into Windows 11 Pro for Workstations Version 24H2.
Device Manager shows a PCI Serial Port in slot 2, but doesn't find a compatible driver.
It is shown as a Gen 3 x16 device:
PS C:\Users\mr_halfword> C:\Users\mr_halfword\Git-projects\fpga_sio\multiple_boards\report_pcie_links.ps1
Name ExpressSpecVersion MaxLinkSpeed MaxLinkWidth CurrentLinkSpeed CurrentLinkWidth
---- ------------------ ------------ ------------ ---------------- ----------------
Standard NVM Express Controller 2 3 4 3 4
Mellanox ConnectX-3 PRO VPI (MT04103) Network Adapter 2 3 8 3 8
Intel(R) Ethernet Connection X722 for 10GBASE-T 2 1 1 1 1
Intel(R) Ethernet Connection X722 for 10GBASE-T #2 2 1 1 1 1
Intel(R) Ethernet Connection X722 for 1GbE 2 1 1 1 1
High Definition Audio Controller 2 3 16 1 4
NVIDIA Quadro K620 2 3 16 1 4
PCI Serial Port 2 3 16 3 16
Mellanox ConnectX-4 Adapter 2 3 16 3 16
Booted into openSUSE Leap 15.5. dump_pci_info_pciutils shows the following for the serial port PCIe endpoint:
linux@DESKTOP-BVUMP11:~/fpga_sio/software_tests/eclipse_project/bin/release> dump_info/dump_pci_info_pciutils 1c9d
domain=0000 bus=31 dev=00 func=00 rev=00
vendor_id=1c9d (Vendor 1c9d) device_id=0101 (Device 0101) subvendor_id=1c9d subdevice_id=0007
iommu_group=81
physical_slot=2-2
control: I/O- Mem+ BusMaster- ParErr+ SERR+ DisINTx-
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
bar[0] base_addr=97400000 size=200000 is_IO=0 is_prefetchable=0 is_64=1
Capabilities: [40] Power Management
Capabilities: [48] Message Signaled Interrupts
Capabilities: [70] PCI Express v2 Express Endpoint, MSI 0
Link capabilities: Max speed 8 GT/s Max width x16
Negotiated link status: Current speed 8 GT/s Width x16
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 1024 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap: Port # 0 ASPM not supported
L0s Exit Latency More than 4 μs
L1 Exit Latency More than 64 μs
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
domain=0000 bus=30 dev=00 func=00 rev=04
vendor_id=8086 (Intel Corporation) device_id=2030 (Sky Lake-E PCI Express Root Port A)
iommu_group=51
driver=pcieport
physical_slot=2
control: I/O+ Mem+ BusMaster+ ParErr+ SERR+ DisINTx+
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
Capabilities: [40] Bridge subsystem vendor/device ID
Capabilities: [60] Message Signaled Interrupts
Capabilities: [90] PCI Express v2 Root Port, MSI 0
Link capabilities: Max speed 8 GT/s Max width x16
Negotiated link status: Current speed 8 GT/s Width x16
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 256 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 5 ASPM not supported
L0s Exit Latency 256 ns to less than 512 ns
L1 Exit Latency 8 μs to less than 16 μs
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise-
Slot #2 PowerLimit 0.000W Interlock- NoCompl-
Capabilities: [e0] Power Management
There is no driver bound. The PCI vendor 1c9d isn't known to the PCI libraries. The PCI SIG Member Companies search reports this ID belongs to http://www.illumina.com/. Their website says they provide innovative sequencing and array technologies for medical research. Can't see any obvious products which make use of a U200.
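For reference, the vendor ID lookup can be reproduced from the command line. A sketch; the pci.ids path varies between distributions:

```
# Check the local pci.ids database for vendor 1c9d
grep -i '^1c9d' /usr/share/pci.ids /usr/share/hwdata/pci.ids 2>/dev/null
# List any devices from that vendor currently enumerated
lspci -nn -d 1c9d:
```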
This was the initial design created, to prove a bitstream could be created with the no-cost Vivado license, before the card was bought.
Programmed the bitstream while openSUSE Leap 15.5 was booted, and bind_xilinx_devices_to_vfio.sh was able to bind to the PCIe endpoint of the loaded bitstream. I.e. hot plug worked.
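If a newly loaded bitstream isn't picked up automatically, the endpoint can be removed and the bus rescanned from sysfs without a reboot. A minimal sketch using the BDF reported above:

```
# Remove the stale endpoint before re-programming over JTAG ...
echo 1 | sudo tee /sys/bus/pci/devices/0000:31:00.0/remove
# ... program the new bitstream in the Vivado hardware manager, then rescan
echo 1 | sudo tee /sys/bus/pci/rescan
```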
display_identified_pcie_fpga_designs reports zero for the User access build timestamp:
~/fpga_sio/software_tests/eclipse_project/bin/release> identify_pcie_fpga_design/display_identified_pcie_fpga_designs
Opening device 0000:31:00.0 (10ee:903f) with IOMMU group 22
Enabled bus master for 0000:31:00.0
Design AS02MC04_enum:
PCI device 0000:31:00.0 rev 00 IOMMU group 22 physical slot 2-2
DMA bridge bar 1 memory size 0x1000
Channel ID addr_alignment len_granularity num_address_bits
H2C 0 1 1 64
C2H 0 1 1 64
User access build timestamp : 00000000 - 00/00/2000 00:00:00
Based upon the output of parse_bitstream_file, the build had failed to enable the user access build timestamp.
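One way to insert it is via the BITSTREAM.CONFIG.USR_ACCESS bitstream property, set to TIMESTAMP before the bitstream is written. A minimal batch-mode sketch; the checkpoint and bitstream file names are assumptions:

```
cat > set_usr_access.tcl <<'EOF'
# Open the routed checkpoint, request the build timestamp in the USR_ACCESS
# register, and rewrite the bitstream
open_checkpoint ./U200_enum_routed.dcp
set_property BITSTREAM.CONFIG.USR_ACCESS TIMESTAMP [current_design]
write_bitstream -force ./U200_enum.bit
EOF
vivado -mode batch -source set_usr_access.tcl
```

The same property can instead be set in the design's XDC so that project builds pick it up automatically.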
After changing the design to insert the user access build timestamp and allocate a different identity:
identify_pcie_fpga_design/display_identified_pcie_fpga_designs
Opening device 0000:31:00.0 (10ee:903f) with IOMMU group 22
Enabled bus master for 0000:31:00.0
Design U200_enum:
PCI device 0000:31:00.0 rev 00 IOMMU group 22 physical slot 2-2
DMA bridge bar 1 memory size 0x1000
Channel ID addr_alignment len_granularity num_address_bits
H2C 0 1 1 64
C2H 0 1 1 64
User access build timestamp : 34B37034 - 06/09/2025 23:00:52
dump_info/dump_pci_info_pciutils
domain=0000 bus=31 dev=00 func=00 rev=00
vendor_id=10ee (Xilinx Corporation) device_id=903f (Device 903f) subvendor_id=0002 subdevice_id=001c
iommu_group=22
driver=vfio-pci
physical_slot=2-2
control: I/O- Mem+ BusMaster- ParErr- SERR+ DisINTx-
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
bar[0] base_addr=97410000 size=1000 is_IO=0 is_prefetchable=0 is_64=0
bar[1] base_addr=97400000 size=10000 is_IO=0 is_prefetchable=0 is_64=0
Capabilities: [40] Power Management
Capabilities: [70] PCI Express v2 Express Endpoint, MSI 0
Link capabilities: Max speed 8 GT/s Max width x16
Negotiated link status: Current speed 8 GT/s Width x16
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 1024 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 0 ASPM not supported
L0s Exit Latency More than 4 μs
L1 Exit Latency More than 64 μs
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
domain=0000 bus=30 dev=00 func=00 rev=04
vendor_id=8086 (Intel Corporation) device_id=2030 (Sky Lake-E PCI Express Root Port A)
iommu_group=51
driver=pcieport
physical_slot=2
control: I/O+ Mem+ BusMaster+ ParErr+ SERR+ DisINTx+
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
Capabilities: [40] Bridge subsystem vendor/device ID
Capabilities: [60] Message Signaled Interrupts
Capabilities: [90] PCI Express v2 Root Port, MSI 0
Link capabilities: Max speed 8 GT/s Max width x16
Negotiated link status: Current speed 8 GT/s Width x16
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 256 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 5 ASPM not supported
L0s Exit Latency 256 ns to less than 512 ns
L1 Exit Latency 8 μs to less than 16 μs
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise-
Slot #2 PowerLimit 0.000W Interlock- NoCompl-
Capabilities: [e0] Power Management
In a Block Design tried to create an x8x8 design, with two DMA/Bridge Subsystem for PCI Express IP blocks which share the same Utility Buffer for the PCIe refclk:

The design synthesised, but implementation failed with errors about conflicting placement.
The first error was:
[DRC REQP-1963] connects_too_many_BUFG_GT_SYNC_loads: The IBUFDS_GTE4 U200_enum_i/util_ds_buf_0/U0/USE_IBUFDS_GTE4.GEN_IBUFDS_GTE4[0].IBUFDS_GTE4_I (ODIV2 pin) is driving more than one BUFG_GT_SYNC load, which is an unroutable situation. Optimization may not have been able to merge BUFG_GT_SYNC cells because of differing control pin connections.
Help with BUFDS_GTE4 in Ultrascale+ describes issues with device constraints.
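To iterate on the failing check without a full implementation run, the DRC can be re-run in isolation against a post-synthesis or post-opt checkpoint. A sketch; the checkpoint name is an assumption:

```
cat > rerun_refclk_drc.tcl <<'EOF'
# Re-run just the REQP-1963 check that failed during implementation
open_checkpoint ./U200_x8x8_synth.dcp
report_drc -checks {REQP-1963}
EOF
vivado -mode batch -source rerun_refclk_drc.tcl
```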
For the placement options for the DMA/Bridge Subsystem for PCI Express IP:
- PCIe Block Location of X1Y2 allows a max width of x16 with QUADs 224 to 227
- PCIe Block Location of X1Y4 allows a max width of x8 with QUADs 230 or 231
The PCIe constraints for the U200 use:
- QUADs 224 to 227 for the PCIe lanes
- QUAD 226 for the PCIe refclk
TABLE: Virtex UltraScale+ Devices Available GT Quads (XCVU9P) shows for an XCVU9P in the FSGD2104 package:
PCIE Blocks | Quads with Max Link Width X16 Support | Quads with Max Link Width X8 Support | Quads with Max Link Width X4 Support |
---|---|---|---|
X1Y2 | GTY_Quad_228, GTY_Quad_227 | GTY_Quad_226, GTY_Quad_225 | GTY_Quad_224 |
X1Y4 | GTY_Quad_233, GTY_Quad_232 | GTY_Quad_231, GTY_Quad_230 | GTY_Quad_229 |
UltraScale+ Device Packaging and Pinouts Product Specification User Guide (UG575) Figure 1-122: XCVU9P Banks in FSGD2104 Package shows the "PCIE4 X1Y2 (tandem)" and "PCIE4 X1Y4" blocks are on different sides of an SLR crossing. GT Locations in the UltraScale+ Devices Integrated Block for PCI Express Product Guide (PG213) contains:
A GT Quad is comprised of four GT lanes. When selecting GT Quads for the PCIe IP, AMD recommends that you use the GT Quad most adjacent to the PCIe hard block. While this is not required, it improves place, route, and timing for the design.
- Link widths of x1, x2, and x4 require one bonded GT Quad and should not split lanes between two GT Quads.
- A link width of x8 requires two adjacent GT Quads that are bonded and are in the same SLR.
- A link width of x16 requires four adjacent GT Quads that are bonded and are in the same SLR.
The SYSMON, Configuration, PCIe, Interlaken, and 100GE Integrated Blocks section in UG575 has:
Note: Do not connect the integrated block for PCIe to transceiver channels through an SLR crossing. For further details, refer to the Placement Rules section of the UltraScale Devices Gen3 Integrated Block for PCI Express Product Guide (PG156) and UltraScale+ Devices Integrated Block for PCI Express Product Guide (PG213). Blocks with an additional (Tandem) label support Tandem configuration.
The U200 has:
- QUADs 224 to 227 connected to PCIe lanes
- QUAD 230 connected to QSFP1
- QUAD 231 connected to QSFP0
Perhaps that is why the DMA/Bridge Subsystem for PCI Express IP is showing fewer options for the PCIe Block Location and QUADs than would be expected for an XCVU9P in the FSGD2104 package.
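One way to cross-check which PCIe hard block and GTY sites the xcu200 part actually exposes is to query any open design for the part. A sketch using a checkpoint from the earlier build; the file name and the site name patterns are assumptions:

```
cat > list_u200_pcie_gt_sites.tcl <<'EOF'
# Query the device sites visible to Vivado for the xcu200 part
open_checkpoint ./U200_enum_routed.dcp
puts "PCIe hard blocks: [get_sites PCIE4*]"
puts "GTY channel count: [llength [get_sites GTYE4_CHANNEL_X*Y*]]"
EOF
vivado -mode batch -source list_u200_pcie_gt_sites.tcl
```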
In the Alveo Product Details in the Alveo U200 and U250 Data Center Accelerator Cards Data Sheet (DS962):
- Figure U200/U250 Block Diagram has the text:
PCIe x 16 or PCIe x8 (2)
- However, the Alveo U200/U250 Accelerator Card Product Details table has: PCIe Interface: Gen3 x16
Table Alveo U200/U250 Features in Alveo U200 and U250 Accelerator Cards User Guide (UG1289) has:
Gen1, 2, or 3 up to x16 and Dual Gen4 x8 compatible
The VU9P doesn't have a "PCIe Gen3 x16/Gen4 x8" block, so the mention of "Dual Gen4 x8 compatible" doesn't seem correct.