Hello, I'm also looking for a way to test the NPU. For now, I ran these tests with Ollama built with Vulkan support on an Orange Pi 6 Plus with 32 GB of RAM.
# Ollama Benchmark Results
- Date: Mon Nov 10 10:19:17 PM CET 2025
- System: Linux 6.6.89-cix aarch64
- Benchmark Script: ./obench.sh
- Runs per model: 3
- Total models: 18
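
For anyone who wants to reproduce the per-run numbers and the average without the script, here is a minimal Python sketch. It is not the content of `./obench.sh`. It assumes the timing summary has the shape printed by `ollama run --verbose`, with a line such as `eval rate: 11.78 tokens/s`. The helper names `parse_eval_rate` and `average_eval_rate` are hypothetical, and the `prompt eval rate` value in the sample text is made up for illustration.

```python
import re
from typing import List, Optional

# Matches the generation-speed line, e.g. "eval rate:   11.78 tokens/s".
# Anchoring at the start of the line skips the separate "prompt eval rate:" line.
EVAL_RATE_RE = re.compile(r"^eval rate:\s+([\d.]+) tokens/s", re.MULTILINE)


def parse_eval_rate(verbose_output: str) -> Optional[float]:
    """Return the eval rate in tokens/s, or None if no such line is present."""
    match = EVAL_RATE_RE.search(verbose_output)
    return float(match.group(1)) if match else None


def average_eval_rate(rates: List[float]) -> float:
    """Arithmetic mean of the per-run eval rates."""
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Illustrative text only; the prompt eval rate below is a placeholder.
    sample = (
        "prompt eval rate:     99.00 tokens/s\n"
        "eval rate:            11.78 tokens/s\n"
    )
    print(parse_eval_rate(sample))  # 11.78

    # The three llama3.2:latest runs reported in the results.
    print(f"{average_eval_rate([11.78, 11.66, 11.79]):.2f}")  # 11.74
```

Returning `None` when the line is missing lets a caller tell a failed run apart from a slow one.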
## Results
### llama3.2:latest
```
Running benchmark 3 times using model: llama3.2:latest
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 11.78 tokens/s |
| 2 | 11.66 tokens/s |
| 3 | 11.79 tokens/s |
|**Average Eval Rate**| 11.74 tokens/second |
```
### huihui_ai/deepseek-r1-abliterated:7b
```
Running benchmark 3 times using model: huihui_ai/deepseek-r1-abliterated:7b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 5.93 tokens/s |
| 2 | 6.07 tokens/s |
| 3 | 6.09 tokens/s |
|**Average Eval Rate**| 6.03 tokens/second |
```
### huihui_ai/huihui-moe-abliterated:1.5b
```
Running benchmark 3 times using model: huihui_ai/huihui-moe-abliterated:1.5b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 21.13 tokens/s |
| 2 | 21.77 tokens/s |
| 3 | 21.00 tokens/s |
|**Average Eval Rate**| 21.30 tokens/second |
```
### granite4:1b
```
Running benchmark 3 times using model: granite4:1b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 8.63 tokens/s |
| 2 | 8.59 tokens/s |
| 3 | 8.59 tokens/s |
|**Average Eval Rate**| 8.60 tokens/second |
```
### huihui_ai/qwen3-abliterated:4b
```
Running benchmark 3 times using model: huihui_ai/qwen3-abliterated:4b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 8.09 tokens/s |
| 2 | 7.67 tokens/s |
| 3 | 7.91 tokens/s |
|**Average Eval Rate**| 7.89 tokens/second |
```
### huihui_ai/qwen3-abliterated:1.7b
```
Running benchmark 3 times using model: huihui_ai/qwen3-abliterated:1.7b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 13.81 tokens/s |
| 2 | 14.30 tokens/s |
| 3 | 13.54 tokens/s |
|**Average Eval Rate**| 13.88 tokens/second |
```
### mistral-small:22b
```
Running benchmark 3 times using model: mistral-small:22b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 1.27 tokens/s |
| 2 | 1.27 tokens/s |
| 3 | 1.26 tokens/s |
|**Average Eval Rate**| 1.26 tokens/second |
```
### granite4:3b
```
Running benchmark 3 times using model: granite4:3b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 12.85 tokens/s |
| 2 | 12.73 tokens/s |
| 3 | 12.85 tokens/s |
|**Average Eval Rate**| 12.81 tokens/second |
```
### granite4:350m
```
Running benchmark 3 times using model: granite4:350m
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 32.37 tokens/s |
| 2 | 32.57 tokens/s |
| 3 | 32.14 tokens/s |
|**Average Eval Rate**| 32.36 tokens/second |
```
### qwen2.5:0.5b
```
Running benchmark 3 times using model: qwen2.5:0.5b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 46.67 tokens/s |
| 2 | 45.30 tokens/s |
| 3 | 46.07 tokens/s |
|**Average Eval Rate**| 46.01 tokens/second |
```
### qwen3-embedding:0.6b
```
Running benchmark 3 times using model: qwen3-embedding:0.6b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
BENCHMARK FAILED
```
### granite3.2-vision:latest
```
Running benchmark 3 times using model: granite3.2-vision:latest
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 15.79 tokens/s |
| 2 | 15.83 tokens/s |
| 3 | 15.81 tokens/s |
|**Average Eval Rate**| 15.81 tokens/second |
```
### llava-llama3:latest
```
Running benchmark 3 times using model: llava-llama3:latest
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 6.27 tokens/s |
| 2 | 6.29 tokens/s |
| 3 | 6.26 tokens/s |
|**Average Eval Rate**| 6.27 tokens/second |
```
### huihui_ai/gpt-oss-abliterated:20b
```
Running benchmark 3 times using model: huihui_ai/gpt-oss-abliterated:20b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 4.07 tokens/s |
| 2 | 4.27 tokens/s |
| 3 | 4.31 tokens/s |
|**Average Eval Rate**| 4.21 tokens/second |
```
### huihui_ai/llama3.2-abliterate:1b
```
Running benchmark 3 times using model: huihui_ai/llama3.2-abliterate:1b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 31.15 tokens/s |
| 2 | 31.24 tokens/s |
| 3 | 31.32 tokens/s |
|**Average Eval Rate**| 31.23 tokens/second |
```
### huihui_ai/gemma3-abliterated:1b
```
Running benchmark 3 times using model: huihui_ai/gemma3-abliterated:1b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 28.00 tokens/s |
| 2 | 28.53 tokens/s |
| 3 | 28.22 tokens/s |
|**Average Eval Rate**| 28.25 tokens/second |
```
### huihui_ai/qwen3-vl-abliterated:4b
```
Running benchmark 3 times using model: huihui_ai/qwen3-vl-abliterated:4b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 7.59 tokens/s |
| 2 | 7.08 tokens/s |
| 3 | 6.99 tokens/s |
|**Average Eval Rate**| 7.22 tokens/second |
```
### huihui_ai/qwen3-abliterated:8b
```
Running benchmark 3 times using model: huihui_ai/qwen3-abliterated:8b
| Run | Eval Rate (Tokens/Second) |
|-----|-----------------------------|
| 1 | 5.55 tokens/s |
| 2 | 5.41 tokens/s |
| 3 | 5.30 tokens/s |
|**Average Eval Rate**| 5.42 tokens/second |
```
## Summary
| Model | Status | Notes |
|-------|--------|-------|
| llama3.2:latest | ✅ Completed | - |
| huihui_ai/deepseek-r1-abliterated:7b | ✅ Completed | - |
| huihui_ai/huihui-moe-abliterated:1.5b | ✅ Completed | - |
| granite4:1b | ✅ Completed | - |
| huihui_ai/qwen3-abliterated:4b | ✅ Completed | - |
| huihui_ai/qwen3-abliterated:1.7b | ✅ Completed | - |
| mistral-small:22b | ✅ Completed | - |
| granite4:3b | ✅ Completed | - |
| granite4:350m | ✅ Completed | - |
| qwen2.5:0.5b | ✅ Completed | - |
| qwen3-embedding:0.6b | ❌ Failed | Benchmark failed, no eval rate reported |
| granite3.2-vision:latest | ✅ Completed | - |
| llava-llama3:latest | ✅ Completed | - |
| huihui_ai/gpt-oss-abliterated:20b | ✅ Completed | - |
| huihui_ai/llama3.2-abliterate:1b | ✅ Completed | - |
| huihui_ai/gemma3-abliterated:1b | ✅ Completed | - |
| huihui_ai/qwen3-vl-abliterated:4b | ✅ Completed | - |
| huihui_ai/qwen3-abliterated:8b | ✅ Completed | - |
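
The status column could also be derived from the per-model blocks rather than filled in separately, so that a block containing `BENCHMARK FAILED` is reported as failed. The following is a rough Python sketch under that assumption. It relies only on the report layout shown above (`### model` headings, an `Average Eval Rate` row on success); the function name `summarize` is hypothetical and this is not how `./obench.sh` builds its summary.

```python
import re
from typing import Dict


def summarize(report: str) -> Dict[str, str]:
    """Map each '### model' heading to 'Completed' or 'Failed'."""
    statuses: Dict[str, str] = {}
    # Split on level-3 headings; the first chunk is the preamble, so drop it.
    sections = re.split(r"^### ", report, flags=re.MULTILINE)[1:]
    for section in sections:
        name, _, body = section.partition("\n")
        # Cut at the next level-2 heading so a trailing summary is not scanned.
        body = re.split(r"^## ", body, flags=re.MULTILINE)[0]
        # A block counts as failed if it says so or has no average row.
        failed = "BENCHMARK FAILED" in body or "Average Eval Rate" not in body
        statuses[name.strip()] = "Failed" if failed else "Completed"
    return statuses


if __name__ == "__main__":
    # Two blocks in the same layout as the report above.
    report = (
        "## Results\n"
        "### qwen2.5:0.5b\n"
        "| 1 | 46.67 tokens/s |\n"
        "|**Average Eval Rate**| 46.01 tokens/second |\n"
        "### qwen3-embedding:0.6b\n"
        "BENCHMARK FAILED\n"
        "## Summary\n"
    )
    for model, status in summarize(report).items():
        print(f"| {model} | {status} |")
```

Checking for the missing average row as well as the explicit failure message covers runs that stop without printing either.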