Cubietruck freeze after 1-3 days with 5.23 Xenial (uboot problem?)


apollon77

I have an SD card here, so preparing that and using it makes sense ...

But I then need to run that test on the one machine where u-boot crashes on mainline; I cannot use the new Cubietruck because it has not frozen so far. With the additional SD card I can do it that way for short periods of time (DB values are cached for some hours if the DB server goes down ... but overnight will not work for now :-) ).

 

Ingo F


Since we are testing the DRAM settings in U-Boot, the choice of kernel does not really matter. For the sake of convenience, the Orange Pi PC testing had been done via a special FEL boot package: https://github.com/ssvb/lima-memtester/releases/tag/20151207-orange-pi-pc-fel-test

I can easily prepare something like this for the Cubietruck too.

That would be great. I would happily test DRAM clock settings more efficiently.


Status: Prepared a new SD card with Armbian 5.25 Legacy 3.4.113, and only ran apt-get update/upgrade afterwards.

 

u-boot 5.25 (original unchanged) time to freeze:

Try 1: approx. 10 secs

Try 2: approx. 30 secs

Try 3: approx. 10 mins

Try 4: approx. 10 mins

Try 5: approx. 9 mins

 

@tpm8: can you provide a u-boot binary built from the same base as 5.25 but with the unchanged clock speed, so that we can make sure this really is the only difference? Or should I use your November u-boot?

I compiled mainline u-boot yesterday (13/02/2017) with the unmodified Cubietruck defconfig (CONFIG_DRAM_CLK=432), and another version with the reduced clock speed (CONFIG_DRAM_CLK=384) that is used in Armbian.
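
For anyone who wants to prepare the same two variants, here is a minimal sketch of deriving them from one defconfig, assuming the defconfig is plain `KEY=value` text. The helper name `set_option` and the sample defconfig excerpt are hypothetical; only `CONFIG_DRAM_CLK` and the two clock values come from this thread.

```python
import re

def set_option(defconfig_text: str, key: str, value: str) -> str:
    """Return defconfig_text with `key` set to `value`.

    Replaces an existing `key=...` line, or appends one if the key is absent.
    """
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(defconfig_text):
        # A function replacement avoids any backslash processing of `line`.
        return pattern.sub(lambda _match: line, defconfig_text)
    if defconfig_text and not defconfig_text.endswith("\n"):
        defconfig_text += "\n"
    return defconfig_text + line + "\n"

# Hypothetical excerpt; a real defconfig has many more lines.
base = "CONFIG_ARM=y\nCONFIG_DRAM_CLK=432\n"

variant_432 = base
variant_384 = set_option(base, "CONFIG_DRAM_CLK", "384")
print(variant_384)
```

The helper only edits text; building u-boot from the resulting defconfig is a separate step and is not shown here.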

 

I guess it's the same version that is used in armbian 5.25 (but can't be sure).

 

Please find attached a zip with both binaries.

cubietruck_uboot_binary_20170213.zip


I spent some time over the last few days running tests. Here are the first results with the official 5.25 u-boot from Armbian:

 

  • cubietruck1 -> 2 lima-memtester runs without a freeze (but it froze before, and was more stable in operation with u-boot 5.25 than with 5.20)
  • cubietruck2 -> 2 lima-memtester runs without a crash (but it froze before, and was much more stable in operation with u-boot 5.20 than with 5.24/5!)
  • cubietruck3 -> freeze after 10 mins of lima-memtester, multiple tries, see above (mostly at the start of the "Checkerboard" tests)
  • cubietruck4 -> reproducible freeze at lima-memtester start at "got 100MB, trying mlock ...", multiple tries
  • cubietruck5 -> 2 lima-memtester runs without a freeze

I am now focusing my tests on "Cubietruck3" to begin with, and tried the tpm8-u-boot-384 there. Result: freeze in the second lima-memtester loop ... so, interestingly, later than the 10 mins before, but it still froze quickly.

 

I am now testing the tpm8-u-boot-432 on that "Cubietruck3"; it is currently in lima-memtester loop #4 (also with 1000MB instead of 100MB) and still running. I can test the whole day. I moved the InfluxDB to an Intel NUC (also because of 64-bit).

 

Further plan is:

- let it run with 432-u-boot for a while

- try 432-u-boot on "Cubietruck4" and see if limatester starts at least :-)

- test 432-u-boot also on "Cubietruck5" to see if this works too

 

Ingo F


Freezing is not a very good symptom. Normally one would see data corruption getting reported by lima-memtester (and a glowing red background with the cube spinning animation). You could also try to increase DCDC3 voltage (from 1.25V to 1.30V) and check if this helps to improve reliability.


Freezing is the symptom this whole thread is about. I have no serial console and no monitor on that Cubietruck, so I only have the SSH console.

Additionally I use one of the LEDs with "heartbeat" - when it freezes it also stops blinking ...

How do I increase DCDC3 voltage?

 

With tpm8-u-boot at 432 MHz, cubietruck3 has now been running lima-memtester with 1000MB for 10h (update: 20h).


Quick test results from me with DCDC3=1.30V (CONFIG_AXP_DCDC3_VOLT=1300).

Again it seems stable with 432 MHz (lima-memtester still running).

Two immediate freezes/crashes of lima-memtester with 384 MHz:

root@cubietruck:~# ./lima-memtester 1536M
This is a simple textured cube demo from the lima driver and
a memtester. Both combined in a single program. The mali400
hardware is only used to stress RAM in the background. But
this happens to significantly increase chances of exposing
memory stability related problems.

Kernel driver is version 14
Detected 1 Mali-400 GP Cores.
Detected 2 Mali-400 PP Cores.
FB: 1920x1080@32bpp at 0x44001000 (0x00FD2000)
Using dual buffered direct rendering to FB.

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 1536MB (1610612736 bytes)
got  1536MB (1610612736 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : setting   0


Mali: permit MALI_IOC_MEM_MAP_EXT ioctl for framebuffer (paddr=0x44000000, size=16777216)
<6>mali: use config clk_div 1
<6>mali: clk_div 1
Mali: mali clock set completed, clock is  312000000 Hz
<6>mali: use config clk_div 1
<6>mali: clk_div 1
Mali: mali clock set completed, clock is  312000000 Hz
Mali: Mali device driver loaded
<1>Unable to handle kernel paging request at virtual address ff725170
<1>Unable to handle kernel paging request at virtual address bb42bac4

$$$$

<1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   55.933111] Unable to handle kernel NULL pointer dereference at virtual address 00000000
<1>Unable to handle kernel paging request at virtual address ff00ed1c
[   55.954559] Unable to handle kernel paging request at virtual address ff00ed1c
<1>Unable to handle kernel paging request at virtual address ff25c738
<1>pgd = ee624000
<1>[ff25c738] *pgd=00000000
<0>Internal error: Oops: 5 [#1] PREEMPT SMP ARM
<d>Modules linked in: mali ump bnep cpufreq_userspace sunxi_ir hidp rfcomm hci_uart bluetooth [last unloaded: scsi_wait_scan]
CPU: 0    Not tainted  (3.4.113-sun7i #40)
PC is at do_bad_area+0x34/0x8c
LR is at do_translation_fault+0x6c/0x9c
pc : [<c0019880>]    lr : [<c0019954>]    psr: 200f0193
sp : ee4de0b8  ip : ee4de000  fp : c0b4ff24
r10: 00000000  r9 : 00000001  r8 : ee4de188
r7 : ff25c738  r6 : 00000005  r5 : ff25c738  r4 : ee4de188
r3 : ff25c540  r2 : ee4de188  r1 : 00000005  r0 : ee4de188
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 6e62406a  DAC: 00000015

PC: 0xc0019800:
9800  eb02d986 e3500000 1affffc3 eaffff67 e3a01048 e28d0020 eb0f6f98 e28d1020
9820  eaffffc3 e3a01048 e28d0020 eb0f6f93 e28d1020 eaffffde e1a03005 e1a02006
9840  e1a01009 e1a00007 eb1c0315 e92d40f0 e24dd00c e1a04002 e1a05000 e1a0300d
9860  e3c3cd7f e5923040 e3ccc03f e1a06001 e313000f e59c300c 0a000006 e1a00002
9880  e59371f8 ebfffebb e3500000 0a00000c e28dd00c e8bd80f0 e58d2004 e3a0c001
98a0  e1a02001 e340c003 e1a01000 e58dc000 e1a00003 e3a0300b ebfffeba e28dd00c
98c0  e8bd80f0 e1a03004 e1a02006 e1a01005 e1a00007 eb1c02f2 e92d4010 ebffffda
98e0  e3a00000 e8bd8010 e35004bf 3a000023 e92d4070 e1a0c002 e5922040 e1a03000

LR: 0xc00198d4:
98d4  eb1c02f2 e92d4010 ebffffda e3a00000 e8bd8010 e35004bf 3a000023 e92d4070
98f4  e1a0c002 e5922040 e1a03000 e1a0e001 e312000f 0a00000d e3081ffc e34c10a3
9914  e1a04aa0 e7e02a50 e5916028 e1a01184 e0865001 ee120f10 e7952102 e3c00dff
9934  e3c0003f e3520000 e2400480 1a000005 e1a00003 e1a0200c e1a0100e ebffffbd
9954  e3a00000 e8bd8070 e7963184 e0801001 e7803184 e5953004 e5813004 ee071f3a
9974  f57ff04f e3a00000 e8bd8070 eafffedc e1a00600 ea02fa99 e92d4ff8 e1a04000
9994  e302092c e34c008d eb1c03d0 e1a00004 eb026003 e2144002 18bd8ff8 e302c1ec
99b4  e34cc0a9 e59ca004 e35a0000 da00004f e30f2b4c e34c20a2 e30d3414 e34c30b3

SP: 0xee4de038:
e038  ee4dfe74 ee4dfdd0 ff723bd8 ff000000 ff000000 ff000000 ff000000 ff000000
e058  c0019880 200f0193 ffffffff ee4de0a4 ee4de188 c000e798 ee4de188 00000005
e078  ee4de188 ff25c540 ee4de188 ff25c738 00000005 ff25c738 ee4de188 00000001
e098  00000000 c0b4ff24 ee4de000 ee4de0b8 c0019954 c0019880 200f0193 ffffffff
e0b8  00000000 00000000 ff000000 ff0007f9 ff007fc8 ff004000 ff25c738 ff019954
e0d8  ffa3177c ff000005 c00198e8 c00083bc 00000000 00000000 00000000 00000000
e0f8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e118  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

IP: 0xee4ddf80:
df80  0016f880 b64de070 00000001 c0000000 eeb4ff24 c071a508 0016f880 000198d8
dfa0  00000000 c000ec20 00000000 000007f9 c0007fc8 c0004000 ff25c738 c0019954
dfc0  c0a3177c 00000005 c00198e8 c00083bc 7fffffff 00000000 00000000 b670de3c
dfe0  00000000 b670db40 00039ac4 00039ad4 800f0010 b670db4c 00000000 00000000
e000  ff000000 ff010001 ff000000 ff25c540 ffa34608 ff000000 ff000015 ff00f940
e020  ee25c540 ee25d500 ee16f880 ee4de000 c0a18940 c0a2c658 ee4dfe74 ee4dfdd0
e040  00723bd8 00000000 00000000 00000000 00000000 00000000 00019880 000f0193
e060  ffffffff ee4de0a4 ee4de188 c000e798 ee4de188 00000005 ee4de188 ff25c540

FP: 0xc0b4fea4:
fea4  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fec4  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fee4  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff04  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff24  00000000 00000000 00000000 c07c5f64 ef004000 c07c5ed4 c09299c4 ef005600
ff44  00000000 00000000 0000003e 00000006 c07c5f64 c0929a44 eea30c00 00000001
ff64  00000000 0000003e 00000007 00000000 00000000 00000000 00000000 00000000
ff84  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

R0: 0xee4de108:
e108  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e128  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e148  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e168  00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798
e188  ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738
e1a8  ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880
e1c8  200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000
e1e8  ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000

R2: 0xee4de108:
e108  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e128  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e148  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e168  00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798
e188  ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738
e1a8  ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880
e1c8  200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000
e1e8  ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000

R3: 0xff25c4c0:
c4c0  ******** ******** ******** ******** ******** ******** ******** ********
c4e0  ******** ******** ******** ******** ******** ******** ******** ********
c500  ******** ******** ******** ******** ******** ******** ******** ********
c520  ******** ******** ******** ******** ******** ******** ******** ********
c540  ******** ******** ******** ******** ******** ******** ******** ********
c560  ******** ******** ******** ******** ******** ******** ******** ********
c580  ******** ******** ******** ******** ******** ******** ******** ********
c5a0  ******** ******** ******** ******** ******** ******** ******** ********

R4: 0xee4de108:
e108  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e128  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e148  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e168  00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798
e188  ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738
e1a8  ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880
e1c8  200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000
e1e8  ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000

R5: 0xff25c6b8:
c6b8  ******** ******** ******** ******** ******** ******** ******** ********
c6d8  ******** ******** ******** ******** ******** ******** ******** ********
c6f8  ******** ******** ******** ******** ******** ******** ******** ********
c718  ******** ******** ******** ******** ******** ******** ******** ********
c738  ******** ******** ******** ******** ******** ******** ******** ********
c758  ******** ******** ******** ******** ******** ******** ******** ********
c778  ******** ******** ******** ******** ******** ******** ******** ********
c798  ******** ******** ******** ******** ******** ******** ******** ********

R7: 0xff25c6b8:
c6b8  ******** ******** ******** ******** ******** ******** ******** ********
c6d8  ******** ******** ******** ******** ******** ******** ******** ********
c6f8  ******** ******** ******** ******** ******** ******** ******** ********
c718  ******** ******** ******** ******** ******** ******** ******** ********
c738  ******** ******** ******** ******** ******** ******** ******** ********
c758  ******** ******** ******** ******** ******** ******** ******** ********
c778  ******** ******** ******** ******** ******** ******** ******** ********
c798  ******** ******** ******** ******** ******** ******** ******** ********

R8: 0xee4de108:
e108  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e128  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e148  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
e168  00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798
e188  ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738
e1a8  ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880
e1c8  200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000
e1e8  ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000
<1>Unable to handle kernel paging request at virtual address 004de074
<1>pgd = ee624000
<1>[004de074] *pgd=00000000


OK, first results from me on Cubietruck3 for 384 MHz with DCDC3 at 1.3V. To compare with earlier results I started lima-memtester with 100MB.

I don't know if it is "more reliable" ... it ran for around 1.5h and froze in the 5th loop, but some test errors occurred:

Sun Feb 19 15:48:22 UTC 2017
This is a simple textured cube demo from the lima driver and
a memtester. Both combined in a single program. The mali400
hardware is only used to stress RAM in the background. But
this happens to significantly increase chances of exposing
memory stability related problems.

Kernel driver is version 14
Detected 1 Mali-400 GP Cores.
Detected 2 Mali-400 PP Cores.
FB: 1920x1080@32bpp at 0x44001000 (0x00FD2000)
Using dual buffered direct rendering to FB.

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 100MB (104857600 bytes)
got  100MB (104857600 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : testing   7FAILURE: possible bad address line at offset 0x05319ae0.
Skipping to next test...
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing   6WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok

Loop 2:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing  26WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x015db20c (checkerboard).
  Bit Spread          : ok
  Bit Flip            : testing  72WRITE FAILURE: 0xfffffdff != 0xffff01ff at offset 0x006f92f8 (bitflip).
  Walking Ones        : ok
  Walking Zeroes      : ok

Loop 3:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing   4WRITE FAILURE: 0x55555555 != 0x5555ee55 at offset 0x017d6b00 (checkerboard).
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok

Loop 4:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing   3WRITE FAILURE: 0xaaaaaaaa != 0xaaaaf6aa at offset 0x010f42c0 (checkerboard).
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok

Loop 5:
  Stuck Address       : testing  13FAILURE: possible bad address line at offset 0x00217340.
Skipping to next test...
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : setting  10
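
To see which data bits flipped in the WRITE FAILURE lines of this log, the expected and observed values can be XORed. This is a minimal sketch; the function names are illustrative, and the regular expression assumes the line format exactly as it appears in the log above.

```python
import re

# Matches e.g. "WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20"
FAILURE_RE = re.compile(
    r"FAILURE: 0x([0-9a-fA-F]+) != 0x([0-9a-fA-F]+) at offset 0x([0-9a-fA-F]+)"
)

def flipped_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

def analyse(log: str) -> list[tuple[int, list[int]]]:
    """Return (offset, flipped bit positions) for every value mismatch in log."""
    results = []
    for m in FAILURE_RE.finditer(log):
        a, b, offset = (int(g, 16) for g in m.groups())
        results.append((offset, flipped_bits(a, b)))
    return results

# Three of the failure lines from the log above.
log = """\
  Checkerboard        : testing   6WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
  Checkerboard        : testing  26WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x015db20c (checkerboard).
  Bit Flip            : testing  72WRITE FAILURE: 0xfffffdff != 0xffff01ff at offset 0x006f92f8 (bitflip).
"""

for offset, bits in analyse(log):
    print(f"offset 0x{offset:08x}: flipped bits {bits}")
```

For all five value mismatches in this log, the differing bits fall within bits 8 to 15, i.e. the second-lowest byte of the 32-bit word. The "possible bad address line" messages carry no value pair and are skipped by the sketch.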


u-boot at 432 MHz with DCDC3 at 1.3V ran 20 loops without any errors or failures.

Now I have installed the official u-boot 5.20 and will also give it an overnight test to see if it is as stable as the 432 tpm8-u-boot :-)

 

Tomorrow evening: test 432 u-boot on cubietruck4 (to see if start now works) and cubietruck5 (was stable so far)


Freezes and memory corruption errors are very likely two different types of problems (and both are bad). If DCDC3 voltage is insufficient, then we get freezes. If the clock speed is too high (or too low), then we get memory corruption errors. Yes, the clock speed also affects freezes to a smaller extent.
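
The distinction between the two symptoms can be turned into a rough triage of captured lima-memtester logs. This is a minimal sketch under stated assumptions: a log containing `FAILURE` is counted as data corruption, and a log whose last line is an unfinished progress line ("setting N" or "testing N") is counted as cut off mid-test, which may point to a freeze but could also be a dropped SSH session. The names `Outcome` and `triage` are illustrative and not part of lima-memtester.

```python
import re
from dataclasses import dataclass

# A progress line that was never completed, e.g. "Checkerboard : setting  10"
PROGRESS_RE = re.compile(r":\s*(setting|testing)\s+\d+\s*$")

@dataclass
class Outcome:
    corruption: bool  # at least one memtester FAILURE line was logged
    truncated: bool   # output stops mid-test, which may indicate a freeze

def triage(log: str) -> Outcome:
    """Classify a captured memtester log with two simple heuristics."""
    lines = [ln for ln in log.splitlines() if ln.strip()]
    corruption = any("FAILURE" in ln for ln in lines)
    truncated = bool(lines) and PROGRESS_RE.search(lines[-1]) is not None
    return Outcome(corruption, truncated)

# Abridged from the 384 MHz run posted earlier in this thread.
sample = """\
Loop 5:
  Stuck Address       : testing  13FAILURE: possible bad address line at offset 0x00217340.
Skipping to next test...
  Random Value        : ok
  Checkerboard        : setting  10
"""
print(triage(sample))
```

Keeping the two flags separate matters because a single run can show both symptoms, as the abridged sample does.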


Hm ... but the results here show freezes more connected to the clock speed than to the DCDC3 voltage ...


Now it's getting weird ...

 

A cubietruck that had no problems so far (cubietruck5) which seems to work well with 384-u-boot seems to have problems with 432 :-(

 

The Cubietruck froze after 2-5h with the tpm8-normal-432 u-boot ...

I am now testing with 432-DCDC3 and will post the result ...

It seems to me that, for whatever reason, some boards work better with 432 and some better with 384 ... And then it would be problematic to provide one u-boot for all of them :-(


Freezes are more connected to DCDC3 voltage because, if it is high enough, the freezes usually disappear and you only have to deal with data corruption.

 

There is one more trick. DRAM configuration is not deterministic because ZQ calibration is run at the initialization time. You can get the current results of ZQ calibration and a lot of other settings by running the a10-meminfo tool from https://github.com/ssvb/a10-dram-tools

This may explain why sometimes you have a reliable board and sometimes it fails. ZQ calibration usually differs between cold start and warm reboot. It is also possible to bypass the ZQ calibration step and hardcode ZQ settings directly via the CONFIG_DRAM_ZQ parameter. This way the DRAM controller configuration will become deterministic. Some extra information can be found at https://linux-sunxi.org/User:Ssvb/pcDuino2_with_HYNIX_DDR3_reliability_test
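
Since the calibration results can differ between boots, it may help to capture the a10-meminfo output after each boot and compare the captures. This is a minimal sketch, assuming the tool prints `key = value` lines; the function names are illustrative and the two inputs are abridged sample values.

```python
def parse_meminfo(text: str) -> dict[str, str]:
    """Parse `key = value` lines of a10-meminfo output into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '=' (shell prompts, blank lines)
            settings[key.strip()] = value.strip()
    return settings

def diff_meminfo(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Abridged sample captures from two boots (illustrative input).
run_a = parse_meminfo("""\
dram_clk          = 432
dram_zq           = 0x7b (0x5294a00)
dqs_gating_delay  = 0x06060606
""")
run_b = parse_meminfo("""\
dram_clk          = 432
dram_zq           = 0x7b (0x5294a00)
dqs_gating_delay  = 0x05060606
""")
print(diff_meminfo(run_a, run_b))
```

Values are kept as strings on purpose, so entries such as `0x7b (0x5294a00)` are compared exactly as printed.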


The question is if this is the goal ... I would like to support a generic usable setting to be used by Armbian ... As soon as I start with using Device specific settings I'm out for the "Armbian default u-boot" ... yes my devices may be more stable ... hm ...

In order to fix the problem you need to find the problem first. And it's not about Armbian, other distros that use mainline u-boot will be affected by the same settings and algorithms.

Again, output of a10-meminfo before executing the stability tests may help in finding or ruling out certain possible causes for the instability.


This is a10-meminfo output with mainline u-boot @432 MHz and DCDC3=1.3V

Passed lima-memtester running about 8h (2 complete loops) without problems.

root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk          = 432
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x05060606
active_windowing  = 0


It's better to provide info like this:

This was output with mainline u-boot @xxxMHz y.yV and it froze during the tests

This was output with mainline u-boot @xxxMHz y.yV and it didn't freeze but displayed errors during the lima-memtester run

This was output with mainline u-boot @xxxMHz y.yV and it passed the test running for x hours


This is a10-meminfo output with mainline u-boot @432 MHz and default DCDC3 (1.25v). This is the mainline u-boot default defconfig for cubietruck.

 

Passed lima-memtester running about 12h (3 complete loops) without problems.

root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk          = 432
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x06060606
active_windowing  = 0


This is a10-meminfo output with mainline u-boot @384 MHz and DCDC3=1.3V.

lima-memtester immediately freezes.

<1>Unable to handle kernel NULL pointer dereference at virtual address 00000124
<1>pgd = ee034000
<1>[00000124] *pgd=6ebb7831, *pte=00000000, *ppte=00000000
<0>Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[  252.758001] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
<d>Modules linked in:[  252.765391] Modules linked in: mali mali ump ump bnep bnep cpufreq_userspace cpufreq_userspace sunxi_ir sunxi_ir hidp hidp rfcomm rfcomm hci_uart hci_uart bluetooth bluetooth [last unloaded: scsi_wait_scan] [last unloaded: scsi_wait_scan]

root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk          = 384
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x05050505
active_windowing  = 0


Welcome everyone and glad to find this great forum :)

 

I have two boards and one of them has a similar problem. With the old debian-server-ct-nand-v1.0.img it is rock solid, but with Armbian Debian Jessie mainline it throws exceptions during boot (and freezes).

I compiled u-boot from the Armbian source with various memory clocks (360-432 MHz) without any difference (the clock changed, but the board still did not boot successfully).

I also tried to set ZQ (from 7B to 7D, as it is the only "visible" difference between the original Debian and Armbian memory settings), but it isn't changing (according to a10-meminfo). Any idea what could be the problem or what my next step should be?

 

"Good" board (which is working with Armbian) has GT8UB512V memory ICs while the "bad" one has SKHynix.


Ok, Results from "cubietruck5":

 

1.) armbian u-boot 5.20 (432 DRAM speed)

freeze after 9h limatester 1000MB

root@cubietruck:~# a10-meminfo
dram_clk          = 432
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x06060606
active_windowing  = 0

2.) armbian u-boot 5.25 (384 DRAM speed)

no freeze in 24h with limatester 1000MB

root@cubietruck:~# a10-meminfo
dram_clk          = 384
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x05050505
active_windowing  = 0

I will do the complete test with my problem device tomorrow, I think.

But as you see, the values are exactly the same as in the post from tpm8 above for a problematic device. I fear I will find the same.


Second Device (cubietruck3), my problematic one:

 

1.) armbian u-boot 5.20 (432 DRAM speed)

no freeze with limatester 1000MB

root@cubietruck:~# a10-meminfo
dram_clk          = 432
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x06060605
active_windowing  = 0

2.) armbian u-boot 5.25 (384 DRAM speed)

memory errors with limatester 1000MB and freeze/auto-restart with limatester 100MB

root@cubietruck:~# a10-meminfo
dram_clk          = 384
mbus_clk          = 300
dram_type         = 3
dram_rank_num     = 1
dram_chip_density = 8192
dram_io_width     = 16
dram_bus_width    = 32
dram_cas          = 9
dram_zq           = 0x7b (0x5294a00)
dram_odt_en       = 0
dram_tpr0         = 0x42d899b7
dram_tpr1         = 0xa090
dram_tpr2         = 0x22a00
dram_tpr3         = 0x0
dram_emr1         = 0x4
dram_emr2         = 0x10
dram_emr3         = 0x0
dqs_gating_delay  = 0x05050505
active_windowing  = 0

Memory errors were:

pagesize is 4096
pagesizemask is 0xfffff000
want 1000MB (1048576000 bytes)
got  1000MB (1048576000 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address       : testing   2FAILURE: possible bad address line at offset 0x1b05c300.
Skipping to next test...
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing   5WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x1873ab40 (checkerboard).
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok

Loop 2:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : testing   1WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x0a3c8b40 (checkerboard).
  Bit Spread          : ok
  Bit Flip            : ok
...