@Foadsf
Created January 29, 2026 10:15

Radxa Dragon Q6A - Hexagon NPU Troubleshooting Guide

Status: 🔴 UNRESOLVED - Root cause identified, awaiting fix
Last Updated: January 2026
Hardware: Radxa Dragon Q6A (8GB RAM)
OS: Ubuntu 24.04 (RadxaOS)
Kernel: 6.18.2-1-qcom


Objective

Goal: Run Llama 3.2 1B (and other AI models) on the Radxa Dragon Q6A using the onboard Qualcomm Hexagon NPU for hardware-accelerated inference.

Expected Outcome: Achieve ~12 tokens/second inference using the 12 TOPS NPU, as demonstrated in Radxa's official documentation.

Actual Outcome: Error 14001 - "Failed to create device" when attempting to initialize the NPU via QNN/QAIRT SDK.


Hardware Overview

Radxa Dragon Q6A Specifications

| Component | Specification |
|---|---|
| SoC | Qualcomm QCS6490 |
| CPU | Octa-core Kryo 670 (1×2.7GHz + 3×2.4GHz + 4×1.9GHz) |
| GPU | Adreno 643 |
| NPU | Hexagon 770 DSP with Tensor Accelerator |
| AI Performance | Up to 12 TOPS (INT8) |
| RAM | 8GB LPDDR5 |
| DSP Architecture | Hexagon V68 |

Qualcomm AI Stack

┌─────────────────────────────────────────────────────────────┐
│                    User Application                          │
├─────────────────────────────────────────────────────────────┤
│  QAIRT SDK (QNN / SNPE / Genie)                             │
├─────────────────────────────────────────────────────────────┤
│  libQnnHtp.so  │  libGenie.so  │  libQnnHtpV68Skel.so      │
├─────────────────────────────────────────────────────────────┤
│  FastRPC (libcdsprpc.so / libxdsprpc.so)                    │
├─────────────────────────────────────────────────────────────┤
│  cdsprpcd daemon                                             │
├─────────────────────────────────────────────────────────────┤
│  Kernel: qcom_fastrpc driver                                 │
├─────────────────────────────────────────────────────────────┤
│  /dev/fastrpc-cdsp  │  /dev/fastrpc-adsp                    │
├─────────────────────────────────────────────────────────────┤
│  Hexagon DSP Firmware (cdsp.mbn / adsp.mbn)                 │
└─────────────────────────────────────────────────────────────┘

The Journey

Phase 1: Initial Setup

  1. Installed RadxaOS (Ubuntu 24.04 Noble) on the Dragon Q6A
  2. Set up Docker for containerized AI development
  3. Downloaded QAIRT SDK container from Radxa's Docker Hub:
    docker pull radxazifeng278/qairt-npu-9075:v1.1

Phase 2: First Attempt - Error 14001

Downloaded the pre-compiled Llama 3.2 1B model from ModelScope:

pip3 install modelscope --break-system-packages
modelscope download --model radxa/Llama3.2-1B-1024-qairt-v68 --local ./Llama3.2-1B-1024-qairt-v68

Attempted inference:

cd Llama3.2-1B-1024-qairt-v68
export LD_LIBRARY_PATH="$(pwd)"
./genie-t2t-run -c ./htp-model-config-llama32-1b-gqa.json -p '<prompt>'

Result:

[ERROR] "Failed to create device: 14001"
[ERROR] "Device Creation failure"
Failure to initialize model.

Phase 3: Systematic Debugging

Error 14001 indicates that the QNN HTP backend failed to create a device context for the Hexagon DSP. This prompted a systematic investigation of the entire stack, from the userspace libraries down to the kernel.


Solved Issues

✅ Issue 1: Missing FastRPC Libraries

Symptom: QNN couldn't find FastRPC transport libraries.

Solution: Install the official Qualcomm FastRPC packages:

sudo apt update
sudo apt install qcom-fastrpc1 libcdsprpc1

This installs:

  • /usr/lib/aarch64-linux-gnu/libcdsprpc.so.1
  • /usr/lib/aarch64-linux-gnu/libxdsprpc.so
  • /usr/bin/cdsprpcd
  • Systemd services: cdsprpcd.service, adsprpcd.service

✅ Issue 2: cdsprpcd Daemon Not Running

Symptom: FastRPC transport layer couldn't communicate with DSP.

Solution: Enable and start the CDSP RPC daemon:

sudo systemctl enable cdsprpcd.service
sudo systemctl start cdsprpcd.service
sudo systemctl status cdsprpcd.service

✅ Issue 3: Skeleton Libraries Not Found

Symptom: DSP couldn't load the HTP skeleton library.

Solution: Copy skeleton libraries to DSP search paths:

# Create DSP library directories
sudo mkdir -p /usr/lib/dsp/cdsp /usr/lib/dsp/adsp /dsp

# Copy skeleton library (from QAIRT SDK or downloaded model)
sudo cp libQnnHtpV68Skel.so /usr/lib/dsp/cdsp/
sudo cp libQnnHtpV68Skel.so /usr/lib/dsp/adsp/
sudo cp libQnnHtpV68Skel.so /dsp/

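# Set permissions on every copy, not only the cdsp one
sudo chmod 644 /usr/lib/dsp/cdsp/libQnnHtpV68Skel.so \
    /usr/lib/dsp/adsp/libQnnHtpV68Skel.so \
    /dsp/libQnnHtpV68Skel.so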

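Set the environment variable. Unlike PATH, the FastRPC loader expects the directories in ADSP_LIBRARY_PATH to be separated by semicolons, so keep the value quoted:

export ADSP_LIBRARY_PATH="/usr/lib/dsp/cdsp;/usr/lib/dsp/adsp;/dsp"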
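
To confirm that the skeleton library is actually reachable through the search path, a small check along these lines may help. This is only a sketch: `find_skel_dirs` is a hypothetical helper name, not part of any Qualcomm tool, and it splits on both `;` and `:` so it works whichever separator the variable uses.

```shell
# Hypothetical helper: report which entries of a DSP search path contain a file.
# Runs in a subshell (note the parentheses) so the IFS and globbing changes
# do not leak into the calling shell.
find_skel_dirs() (
    search_path=$1
    lib_name=$2
    IFS=';:'   # accept either separator style
    set -f     # do not glob-expand path entries
    for dir in $search_path; do
        [ -n "$dir" ] || continue
        if [ -f "$dir/$lib_name" ]; then
            printf 'found: %s\n' "$dir"
        else
            printf 'missing: %s\n' "$dir"
        fi
    done
)
```

Typical use would be `find_skel_dirs "$ADSP_LIBRARY_PATH" libQnnHtpV68Skel.so`; any `missing:` line points at a directory that still needs a copy of the library.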

✅ Issue 4: Device Permissions

Symptom: Permission denied accessing /dev/fastrpc-* devices.

Solution: Add user to the render group:

sudo usermod -aG render $USER
# Log out and back in, or:
newgrp render

Verify permissions:

ls -la /dev/fastrpc*
# Should show: crw-rw-r--+ ... fastrpc:render
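
Reading the `ls` output by eye does not show whether the current shell has picked up the new group membership. A minimal sketch of a direct check follows; `check_rw` is a hypothetical helper name, and it only tests what the current process is allowed to do.

```shell
# Hypothetical helper: report whether the current user can read and write a node.
check_rw() {
    node=$1
    if [ ! -e "$node" ]; then
        echo "absent: $node"
    elif [ -r "$node" ] && [ -w "$node" ]; then
        echo "ok: $node"
    else
        echo "denied: $node"
    fi
}
```

For example, `check_rw /dev/fastrpc-cdsp` printing `denied:` right after `usermod` usually means the session has not been restarted yet.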

✅ Issue 5: Docker Container Configuration

Symptom: NPU not accessible from Docker containers.

Solution: Run Docker with full device access:

docker run --privileged -it \
  -v /dev:/dev \
  -v "$(pwd)":/workspace \
  radxazifeng278/qairt-npu:v1.0 /bin/bash

Note: The -9075 variant container is for QCS9075, not QCS6490. Use the base qairt-npu:v1.0 image for Dragon Q6A.


Root Cause Analysis

The Critical Error

After resolving all userspace issues, the kernel logs reveal the actual problem:

sudo dmesg | grep -i fastrpc

Output:

[    5.544594] qcom,fastrpc a300000.remoteproc:glink-edge.fastrpcglink-apps-dsp.-1.-1: no reserved DMA memory for FASTRPC
[    5.566430] qcom,fastrpc 3700000.remoteproc:glink-edge.fastrpcglink-apps-dsp.-1.-1: no reserved DMA memory for FASTRPC
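
When sharing logs, it can be handy to reduce these lines to the remoteproc addresses that reported the message. The filter below is a sketch that assumes the message format shown above; `fastrpc_dma_failures` is a hypothetical name.

```shell
# Hypothetical filter: read kernel log text on stdin and print the remoteproc
# address of every fastrpc instance that logged the missing-DMA-memory message.
fastrpc_dma_failures() {
    sed -n 's/.*qcom,fastrpc \([0-9a-f]*\)\.remoteproc.*no reserved DMA memory for FASTRPC.*/\1/p'
}
```

It is meant to be used as `sudo dmesg | fastrpc_dma_failures`, printing one address per affected remoteproc.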

What This Means

The Linux kernel's qcom_fastrpc driver requires a reserved DMA memory region defined in the device tree. This memory region is used for:

  1. Shared buffers between CPU and DSP
  2. DMA transfers for model weights and activations
  3. Communication buffers for RPC calls

Device Tree Analysis

The current device tree does define reserved memory for the DSPs:

sudo dmesg | grep "reserved mem"
[    0.000000] OF: reserved mem: 0x0000000081800000..0x00000000835fffff (30720 KiB) nomap non-reusable cdsp-secure-heap@81800000
[    0.000000] OF: reserved mem: 0x0000000084800000..0x0000000086ffffff (40960 KiB) nomap non-reusable adsp@84800000
[    0.000000] OF: reserved mem: 0x0000000087000000..0x0000000088dfffff (30720 KiB) nomap non-reusable cdsp@87000000

However, the fastrpc node is missing its own memory-region property that would link it to a DMA pool.
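
The missing property can be checked directly in the live device tree, where every node is a directory and every property a file. The sketch below assumes the nodes are named `fastrpc` (as the kernel device names above suggest); `report_fastrpc_memory_region` is a hypothetical helper name.

```shell
# Hypothetical helper: list fastrpc nodes in a device tree directory and say
# whether each one carries a memory-region property.
# The optional argument defaults to the live device tree exposed by the kernel.
report_fastrpc_memory_region() (
    dt_root=${1:-/sys/firmware/devicetree/base}
    find "$dt_root" -type d -name 'fastrpc*' 2>/dev/null | LC_ALL=C sort |
    while IFS= read -r node; do
        if [ -e "$node/memory-region" ]; then
            printf 'has memory-region: %s\n' "${node#"$dt_root"/}"
        else
            printf 'no memory-region: %s\n' "${node#"$dt_root"/}"
        fi
    done
)
```

Running `report_fastrpc_memory_region` with no argument inspects the booted device tree; passing a directory lets you point it at an unpacked copy instead.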

Required Device Tree Fix

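The fastrpc node needs a memory-region property pointing to a shared-dma-pool region. The example below is modeled on working configurations; its base address and size are illustrative and must not overlap any reservation that already exists on the board: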

reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    fastrpc_mem: fastrpc@89000000 {
        compatible = "shared-dma-pool";
        reg = <0x0 0x89000000 0x0 0x2000000>;  /* 32MB */
        no-map;
    };
};

/* In the fastrpc node: */
fastrpc {
    compatible = "qcom,fastrpc";
    memory-region = <&fastrpc_mem>;
    /* ... */
};
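
Before choosing an address for such a pool, it is worth checking it against the regions the kernel already reports. The arithmetic can be sketched in shell as follows; `regions_overlap` is a hypothetical helper, and it only knows about the regions you pass to it, so reservations not listed in dmesg would still need to be checked separately.

```shell
# Hypothetical helper: succeed (exit 0) if two memory regions overlap.
# Arguments: start_a size_a start_b size_b, decimal or 0x-prefixed hex.
regions_overlap() {
    a_start=$(($1))
    a_end=$(($1 + $2))   # exclusive end
    b_start=$(($3))
    b_end=$(($3 + $4))   # exclusive end
    [ "$a_start" -lt "$b_end" ] && [ "$b_start" -lt "$a_end" ]
}
```

For instance, the 32MB example pool can be compared with the cdsp@87000000 region (30720 KiB, i.e. 0x1e00000 bytes) via `regions_overlap 0x89000000 0x2000000 0x87000000 0x1e00000`, which returns non-zero because that region ends at 0x88dfffff.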

CMA Configuration

Current CMA (Contiguous Memory Allocator) allocation:

cat /proc/meminfo | grep -i cma
CmaTotal:          32768 kB
CmaFree:           24568 kB

The 32MB CMA pool exists but isn't being used by fastrpc due to the missing device tree configuration.
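
To track CMA usage across reboots or configuration changes, the two meminfo fields can be condensed into a single line. This is a small sketch; `cma_summary` is a hypothetical name, and it assumes the `CmaTotal`/`CmaFree` fields are reported in kB as shown above.

```shell
# Hypothetical helper: read /proc/meminfo text on stdin and print the CMA pool
# size in MiB plus the amount currently in use in kB.
cma_summary() {
    awk '/^CmaTotal:/ { total = $2 }
         /^CmaFree:/  { free = $2 }
         END { printf "total=%dMiB used=%dkB\n", total / 1024, total - free }'
}
```

Use it as `cma_summary < /proc/meminfo`; with the values above, 32768 kB total and 24568 kB free correspond to a 32 MiB pool with 8200 kB in use.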


Current Status

What Works ✅

| Component | Status | Verification |
|---|---|---|
| DSP Firmware | ✅ Loaded | cat /sys/class/remoteproc/remoteproc*/state → "running" |
| Remoteproc | ✅ Running | Both ADSP (remoteproc0) and CDSP (remoteproc1) active |
| Device Nodes | ✅ Present | /dev/fastrpc-cdsp, /dev/fastrpc-adsp exist |
| Permissions | ✅ Correct | User in render group, ACLs configured |
| FastRPC Libraries | ✅ Installed | libcdsprpc.so.1, libxdsprpc.so present |
| cdsprpcd Service | ✅ Running | Systemd service active |
| Skeleton Libraries | ✅ Deployed | libQnnHtpV68Skel.so in DSP paths |

What Fails ❌

| Component | Status | Error |
|---|---|---|
| FastRPC DMA | ❌ No Memory | "no reserved DMA memory for FASTRPC" |
| QNN Device Creation | ❌ Error 14001 | "Failed to create device" |
| NPU Inference | ❌ Blocked | Cannot establish CPU↔DSP communication |

Error Progression

Error 14001 (QNN)
    └─→ Error 4000 (Transport)
        └─→ Error 1002 (DspTransport.openSession)
            └─→ Error 0x80000600 (AEE_EUNSUPPORTED)
                └─→ Kernel: "no reserved DMA memory for FASTRPC"

Diagnostic Commands

Complete System Check Script

#!/bin/bash
echo "=== Radxa Dragon Q6A NPU Diagnostic ==="

echo -e "\n--- System Info ---"
uname -a
cat /etc/os-release | grep PRETTY_NAME

echo -e "\n--- Kernel Command Line ---"
cat /proc/cmdline

echo -e "\n--- Remoteproc Status ---"
for rproc in /sys/class/remoteproc/remoteproc*; do
    name=$(cat "$rproc/name" 2>/dev/null || echo "unknown")
    state=$(cat "$rproc/state" 2>/dev/null || echo "unknown")
    echo "  $(basename "$rproc"): $name = $state"
done

echo -e "\n--- FastRPC Devices ---"
ls -la /dev/fastrpc* 2>/dev/null || echo "  No fastrpc devices found!"

echo -e "\n--- DSP Reserved Memory ---"
sudo dmesg | grep -i "reserved mem" | grep -E "(cdsp|adsp|fastrpc)"

echo -e "\n--- FastRPC Kernel Messages ---"
sudo dmesg | grep -i fastrpc

echo -e "\n--- CMA Memory ---"
cat /proc/meminfo | grep -i cma

echo -e "\n--- cdsprpcd Service ---"
systemctl status cdsprpcd.service --no-pager 2>/dev/null || echo "  Service not found"

echo -e "\n--- FastRPC Libraries ---"
ldconfig -p | grep -E "(cdsprpc|xdsprpc|fastrpc)"

echo -e "\n--- DSP Skeleton Libraries ---"
ls -la /usr/lib/dsp/cdsp/ /usr/lib/dsp/adsp/ /dsp/ 2>/dev/null

echo -e "\n--- User Groups ---"
groups

echo -e "\n--- DSP Firmware ---"
ls -la /lib/firmware/qcom/qcs6490/radxa/dragon-q6a/*.mbn 2>/dev/null

Individual Checks

# Check remoteproc status
cat /sys/class/remoteproc/remoteproc0/state  # ADSP
cat /sys/class/remoteproc/remoteproc1/state  # CDSP

# Check firmware versions
cat /sys/class/remoteproc/remoteproc0/firmware
cat /sys/class/remoteproc/remoteproc1/firmware

# Check fastrpc kernel module
lsmod | grep fastrpc
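grep FASTRPC "/boot/config-$(uname -r)"

# Check device tree for fastrpc (the nodes sit below the remoteproc glink-edge
# nodes, so search recursively instead of listing only the top level)
find /sys/firmware/devicetree/base -iname '*fastrpc*'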

Workarounds Attempted

❌ Increasing CMA Size

Attempted to increase CMA via kernel parameter:

# In boot config, add:
cma=256M

Result: CMA size increased but fastrpc still reports "no reserved DMA memory" because it needs a dedicated memory-region property, not just a larger CMA pool.
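
When repeating this experiment, first confirm that the parameter really reached the kernel, since a boot entry edited in the wrong place is easy to miss. A minimal sketch, with `cmdline_cma` as a hypothetical helper name:

```shell
# Hypothetical helper: read a kernel command line on stdin and print the value
# of the cma= parameter, or nothing if it is absent.
cmdline_cma() {
    tr ' ' '\n' | sed -n 's/^cma=//p'
}
```

`cmdline_cma < /proc/cmdline` should print `256M` after the change; empty output means the parameter was not applied.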

❌ Using Different Docker Images

Tried multiple Docker images:

  • radxazifeng278/qairt-npu-9075:v1.1 (wrong SoC)
  • radxazifeng278/qairt-npu:v1.0 (correct but same kernel issue)

Result: Same error - the issue is at the kernel/device-tree level, not containerization.

❌ Manual Memory Mapping

Attempted to manually configure memory regions via kernel parameters.

Result: The fastrpc driver looks specifically for a memory-region phandle in the device tree, and this cannot be overridden at runtime.

❌ Using rsetup Overlays

Checked available device tree overlays via rsetup:

ls /boot/dtbo/*.dtbo* | grep -i -E "(npu|cdsp|fastrpc)"

Result: No NPU/FastRPC-specific overlays available for Dragon Q6A.


Community Resources

Official Documentation

Community Channels

Related Kernel Patches


Contributing

How You Can Help

  1. If you have a working Dragon Q6A NPU setup, please share:

    • Your kernel version (uname -a)
    • Your system image version
    • Output of diagnostic commands above
    • Your /proc/device-tree fastrpc node contents
  2. If you're a kernel/DT developer:

    • Help identify the correct device tree fix
    • Submit patches to Radxa's kernel repository
  3. Report to Radxa:

Issue Template for Radxa

**Title**: NPU Error 14001 - "no reserved DMA memory for FASTRPC"

**Hardware**: Radxa Dragon Q6A (8GB)
**OS**: RadxaOS Ubuntu 24.04
**Kernel**: 6.18.2-1-qcom

**Problem**:
NPU inference fails with error 14001 due to missing DMA memory reservation
for the fastrpc driver.

**Kernel Error**:

qcom,fastrpc a300000.remoteproc:glink-edge.fastrpcglink-apps-dsp.-1.-1: no reserved DMA memory for FASTRPC


**Expected**:
The fastrpc device tree node should have a `memory-region` property
pointing to a `shared-dma-pool` reserved memory region.

**Reference**:
https://gist.github.com/[YOUR_GIST_ID]

Appendix

Environment Details

OS: Ubuntu 24.04.3 LTS (Noble)
Kernel: 6.18.2-1-qcom aarch64
Boot: systemd-boot (embloader)
Firmware: QCM6490.LE.1.0-00376-STD.PROD-1
CDSP Firmware: CDSP.HT.2.5.c3-00103-KODIAK-1
ADSP Firmware: ADSP.HT.5.5.c8-00217-KODIAK-1

Key File Locations

| Purpose | Path |
|---|---|
| DSP Firmware | /lib/firmware/qcom/qcs6490/radxa/dragon-q6a/ |
| FastRPC Libraries | /usr/lib/aarch64-linux-gnu/ |
| DSP Skeleton Libs | /usr/lib/dsp/cdsp/, /dsp/ |
| Boot Config | /boot/efi/loader/entries/RadxaOS-*.conf |
| Device Tree Overlays | /boot/dtbo/ |

Version History

| Date | Change |
|---|---|
| 2026-01 | Initial document - Root cause identified |

License

This document is released under CC BY-SA 4.0. Please share improvements back to the community.


If this document helped you or you found a solution, please contribute back!
