tiobily

Reputation Activity

  1. Like
    tiobily got a reaction from qq20739111 in Radxa Cubie A7A/A7Z - Allwinner a733   
    @qq20739111 thanks, official community = GitHub? https://github.com/radxa-docs/docs/

    I asked Claude to summarize my NPU project:

    We use the vendor's ACUITY toolkit inside a Docker container (ubuntu-npu:v2.0.10.1) to convert ONNX → quantized .nb files.

      • Pipeline: onnxsim → pegasus_import → pegasus_quantize → pegasus_export, uint8 quantization, deployed to ~/npu_models/ via push.sh.
      • Runtime: custom npu_server.c (pre-allocated input buffer, no per-call mmap) that the Python app talks to.
      • Hard lesson #1, quantization: only pure Conv+BN+ReLU survives uint8. Attention, SE blocks, hard-swish, and LayerNorm all collapse to constant outputs, so no MobileNetV3+ and no transformers.
      • Hard lesson #2, concurrency hang: NPU IRQs get lost when camera ISP DMA runs in parallel (shared memory bus). Fix: suppress GStreamer buffer copies during inference. Never STREAMOFF/ON the sunxi-vin driver: instant kernel crash.
      • Result: 12 models running (9 NPU + 3 CPU) at ~40 ms/inference.
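    To illustrate the "pre-allocated input buffer, no per-call mmap" idea from the runtime bullet: the actual npu_server.c and its wire format are not shown in the post, so the framing protocol, buffer size, and function names below are my assumptions. The sketch keeps one long-lived buffer and reads each request directly into it with recv_into, instead of allocating or mapping memory per inference call. A mock echo "server" thread stands in for the NPU so the example is self-contained.

    ```python
    # Hypothetical sketch, NOT the post's real protocol: a persistent server
    # loop with a buffer allocated once and reused for every request.
    import socket
    import struct
    import threading

    FRAME_BYTES = 224 * 224 * 3  # assumed uint8 input tensor size

    def serve(conn: socket.socket) -> None:
        buf = bytearray(FRAME_BYTES)   # allocated once, reused per call
        view = memoryview(buf)
        while True:
            hdr = conn.recv(4)          # 4-byte big-endian length prefix
            if len(hdr) < 4:
                break
            (n,) = struct.unpack("!I", hdr)
            got = 0
            while got < n:              # read straight into the buffer
                r = conn.recv_into(view[got:n])
                if r == 0:
                    return
                got += r
            # a real server would hand `buf` to the NPU here; we reply
            # with a checksum so the round trip is observable
            conn.sendall(struct.pack("!I", sum(buf) & 0xFFFFFFFF))

    def infer(conn: socket.socket, frame: bytes) -> int:
        conn.sendall(struct.pack("!I", len(frame)) + frame)
        return struct.unpack("!I", conn.recv(4))[0]

    if __name__ == "__main__":
        a, b = socket.socketpair()
        threading.Thread(target=serve, args=(b,), daemon=True).start()
        print(infer(a, bytes(FRAME_BYTES)))  # prints 0 (all-zero frame)
    ```

    The design point is the single `bytearray` plus `recv_into`: per-call allocation and per-call mmap disappear, which is what the C server reportedly does on the NPU side.
    
    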