apollon77 Posted February 14, 2017 Author Posted February 14, 2017 I have an SD card here, so preparing and using that makes sense ... But I then need to run the test on the one machine where u-boot crashes on mainline; I cannot use the new Cubietruck because it has not frozen there so far. With an additional SD card I can do it that way for short periods of time (DB values are cached for some hours if the DB server goes down ... but overnight will not work for now :-) ). Ingo F
tpm8 Posted February 14, 2017 Posted February 14, 2017 Quote: "Since we are testing the DRAM settings in U-Boot, the choice of kernel does not really matter. For the sake of convenience, the Orange Pi PC testing had been done via a special FEL boot package: https://github.com/ssvb/lima-memtester/releases/tag/20151207-orange-pi-pc-fel-test I can easily prepare something like this for the Cubietruck too." That would be great. I would happily test DRAM clock settings more efficiently.
apollon77 Posted February 14, 2017 Author Posted February 14, 2017 Status: Prepared a new SD card with Armbian 5.25 Legacy 3.4.113, only did apt-get update/upgrade afterwards. u-boot 5.25 (original, unchanged). Time to freeze:
Try 1: approx. 10 secs
Try 2: approx. 30 secs
Try 3: approx. 10 mins
Try 4: approx. 10 mins
Try 5: approx. 9 mins
@tpm8: can you provide a binary u-boot which has the same base as 5.25 but with the unchanged clock speed, so that we can make sure that this is really the only difference? Or should I use your November u-boot?
tpm8 Posted February 14, 2017 Posted February 14, 2017 Quote (apollon77): "Status: Prepared a new SD card with Armbian 5.25 Legacy 3.4.113, only did apt-get update/upgrade afterwards. u-boot 5.25 (original, unchanged). Time to freeze: Try 1: approx. 10 secs, Try 2: approx. 30 secs, Try 3: approx. 10 mins, Try 4: approx. 10 mins, Try 5: approx. 9 mins. @tpm8: can you provide a binary u-boot which has the same base as 5.25 but with the unchanged clock speed, so that we can make sure that this is really the only difference? Or should I use your November u-boot?"
I compiled mainline u-boot yesterday (13/02/2017) with the unmodified Cubietruck defconfig (CONFIG_DRAM_CLK=432), plus another version with the reduced clock speed (CONFIG_DRAM_CLK=384) that is used in Armbian. I guess it's the same version that is used in Armbian 5.25 (but I can't be sure). Please find attached a zip with both binaries. cubietruck_uboot_binary_20170213.zip
apollon77 Posted February 14, 2017 Author Posted February 14, 2017 OK, I will test with the reduced speed first, and I plan to do it on at least 2-3 of my Cubietrucks ... just to see whether it is really "rare" or not. Then I can decide where I can test the original clock-speed one. So stay tuned ...
apollon77 Posted February 18, 2017 Author Posted February 18, 2017 I spent some time over the last few days running tests. Here are the first results with the official 5.25 u-boot from Armbian:
cubietruck1 -> 2 limatester runs without freeze (but it froze before and was more stable in operation with u-boot 5.25 than with 5.20)
cubietruck2 -> 2 limatester runs without crash (but it froze before and was much more stable in operation with u-boot 5.20 than with 5.24/5!)
cubietruck3 -> freeze after 10 mins of limatester, multiple tries, see above (mostly at the start of the "Checkerboard" tests)
cubietruck4 -> reproducible freeze at limatester start at "got 100MB, trying mlock ...", multiple tries
cubietruck5 -> 2 limatester runs without freeze
For now I am focusing my tests on "Cubietruck3" and tried the tpm8-u-boot-384 there. Result: freeze in the second limatester loop ... so interestingly later than the 10 mins before, but it still froze quickly. Now I am testing the tpm8-u-boot-432 on that "Cubietruck3"; it is currently in limatester loop #4 (also with 1000MB instead of 100MB) and still running. I can test the whole day. I moved the InfluxDB to an Intel NUC (also because of 64-bit). Further plan:
- let it run with 432-u-boot for a while
- try 432-u-boot on "Cubietruck4" and see if limatester at least starts :-)
- test 432-u-boot also on "Cubietruck5" to see if this works too
Ingo F
ssvb Posted February 18, 2017 Posted February 18, 2017 Freezing is not a very good symptom. Normally one would see data corruption getting reported by lima-memtester (and a glowing red background with the cube spinning animation). You could also try to increase DCDC3 voltage (from 1.25V to 1.30V) and check if this helps to improve reliability.
apollon77 Posted February 18, 2017 Author Posted February 18, 2017 Freezing is the symptom this whole thread is about. I have no serial console and no monitor on that Cubietruck, so I only have the SSH console. Additionally I use one of the LEDs with "heartbeat"; when the board freezes, the LED also stops blinking ... How do I increase the DCDC3 voltage? With the tpm8 u-boot at 432MHz, cubietruck 3 has now been running limatester 1000MB for 20h.
ssvb Posted February 18, 2017 Posted February 18, 2017 You can set CONFIG_AXP_DCDC3_VOLT=1300 in the configs/Cubietruck_defconfig file before compiling U-Boot.
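For anyone who wants to script this change instead of editing the file by hand, here is a minimal sketch in Python. It only rewrites `NAME=value` lines in defconfig-style text; the helper name `set_defconfig_option` and the two-line sample input are made up for illustration, and the real `configs/Cubietruck_defconfig` contains many more options.

```python
import re


def set_defconfig_option(text: str, name: str, value: str) -> str:
    """Return defconfig text with option `name` set to `value`.

    An existing `name=...` line or a `# name is not set` line is replaced;
    if neither exists, the option is appended at the end.
    (Hypothetical helper, not part of U-Boot.)
    """
    new_line = f"{name}={value}"
    pattern = re.compile(
        rf"^(?:{re.escape(name)}=.*|# {re.escape(name)} is not set)$",
        re.MULTILINE,
    )
    if pattern.search(text):
        # Use a function as replacement so `new_line` is inserted literally.
        return pattern.sub(lambda _m: new_line, text, count=1)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + new_line + "\n"


# Illustrative input only, not the real defconfig contents.
sample = "CONFIG_ARM=y\nCONFIG_DRAM_CLK=432\n"
patched = set_defconfig_option(sample, "CONFIG_DRAM_CLK", "384")
patched = set_defconfig_option(patched, "CONFIG_AXP_DCDC3_VOLT", "1300")
print(patched)
```

To apply it to a real tree you would read `configs/Cubietruck_defconfig`, pass its contents through the helper, and write the result back before building U-Boot.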
apollon77 Posted February 18, 2017 Author Posted February 18, 2017 @tpm8: could you please compile a 384 u-boot with that option ... then I can retest that too
tpm8 Posted February 19, 2017 Posted February 19, 2017 Quote (apollon77): "@tpm8: could you please compile a 384 u-boot with that option ... then I can retest that too" Here we go: the zip contains both the 384MHz and the 432MHz u-boot at DCDC3=1.30V (CONFIG_AXP_DCDC3_VOLT=1300). cubietruck_uboot_binary_DCDC3_13v_20170219.zip
tpm8 Posted February 19, 2017 Posted February 19, 2017 Quick test results from me with DCDC3=1.30V (CONFIG_AXP_DCDC3_VOLT=1300). Again it seems stable with 432 MHz (lima-memtester still running). Two immediate freezes/crashes of lima-memtester with 384 MHz:
root@cubietruck:~# ./lima-memtester 1536M
This is a simple textured cube demo from the lima driver and a memtester. Both combined in a single program. The mali400 hardware is only used to stress RAM in the background. But this happens to significantly increase chances of exposing memory stability related problems.
Kernel driver is version 14
Detected 1 Mali-400 GP Cores.
Detected 2 Mali-400 PP Cores.
FB: 1920x1080@32bpp at 0x44001000 (0x00FD2000)
Using dual buffered direct rendering to FB.
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 1536MB (1610612736 bytes)
got 1536MB (1610612736 bytes), trying mlock ...locked.
Loop 1: Stuck Address : setting 0 Mali: permit MALI_IOC_MEM_MAP_EXT ioctl for framebuffer (paddr=0x44000000, size=16777216) <6>mali: use config clk_div 1 <6>mali: clk_div 1 Mali: mali clock set completed, clock is 312000000 Hz <6>mali: use config clk_div 1 <6>mali: clk_div 1 Mali: mali clock set completed, clock is 312000000 Hz Mali: Mali device driver loaded <1>Unable to handle kernel paging request at virtual address ff725170 <1>Unable to handle kernel paging request at virtual address bb42bac4 $$$$ <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 55.933111] Unable to handle kernel NULL pointer dereference at virtual address 00000000 <1>Unable to handle kernel paging request at virtual address ff00ed1c [ 55.954559] Unable to handle kernel paging request at virtual address ff00ed1c <1>Unable to handle kernel paging request at virtual address ff25c738 <1>pgd = ee624000 <1>[ff25c738] *pgd=00000000 <0>Internal error: Oops: 5 [#1] PREEMPT SMP ARM <d>Modules linked in: mali ump bnep cpufreq_userspace sunxi_ir hidp rfcomm hci_uart bluetooth [last unloaded: scsi_wait_scan] CPU: 0 Not tainted (3.4.113-sun7i #40) PC is at do_bad_area+0x34/0x8c LR is at do_translation_fault+0x6c/0x9c pc : [<c0019880>] lr : [<c0019954>] psr: 200f0193 sp : ee4de0b8 ip : ee4de000 fp : c0b4ff24 r10: 00000000 r9 : 00000001 r8 : ee4de188 r7 : ff25c738 r6 : 00000005 r5 : ff25c738 r4 : ee4de188 r3 : ff25c540 r2 : ee4de188 r1 : 00000005 r0 : ee4de188 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 6e62406a DAC: 00000015 PC: 0xc0019800: 9800 eb02d986 e3500000 1affffc3 eaffff67 e3a01048 e28d0020 eb0f6f98 e28d1020 9820 eaffffc3 e3a01048 e28d0020 eb0f6f93 e28d1020 eaffffde e1a03005 e1a02006 9840 e1a01009 e1a00007 eb1c0315 e92d40f0 e24dd00c e1a04002 e1a05000 e1a0300d 9860 e3c3cd7f e5923040 e3ccc03f e1a06001 e313000f e59c300c 0a000006 e1a00002 9880 e59371f8 ebfffebb e3500000 0a00000c e28dd00c e8bd80f0 e58d2004 e3a0c001 98a0 e1a02001 
e340c003 e1a01000 e58dc000 e1a00003 e3a0300b ebfffeba e28dd00c 98c0 e8bd80f0 e1a03004 e1a02006 e1a01005 e1a00007 eb1c02f2 e92d4010 ebffffda 98e0 e3a00000 e8bd8010 e35004bf 3a000023 e92d4070 e1a0c002 e5922040 e1a03000 LR: 0xc00198d4: 98d4 eb1c02f2 e92d4010 ebffffda e3a00000 e8bd8010 e35004bf 3a000023 e92d4070 98f4 e1a0c002 e5922040 e1a03000 e1a0e001 e312000f 0a00000d e3081ffc e34c10a3 9914 e1a04aa0 e7e02a50 e5916028 e1a01184 e0865001 ee120f10 e7952102 e3c00dff 9934 e3c0003f e3520000 e2400480 1a000005 e1a00003 e1a0200c e1a0100e ebffffbd 9954 e3a00000 e8bd8070 e7963184 e0801001 e7803184 e5953004 e5813004 ee071f3a 9974 f57ff04f e3a00000 e8bd8070 eafffedc e1a00600 ea02fa99 e92d4ff8 e1a04000 9994 e302092c e34c008d eb1c03d0 e1a00004 eb026003 e2144002 18bd8ff8 e302c1ec 99b4 e34cc0a9 e59ca004 e35a0000 da00004f e30f2b4c e34c20a2 e30d3414 e34c30b3 SP: 0xee4de038: e038 ee4dfe74 ee4dfdd0 ff723bd8 ff000000 ff000000 ff000000 ff000000 ff000000 e058 c0019880 200f0193 ffffffff ee4de0a4 ee4de188 c000e798 ee4de188 00000005 e078 ee4de188 ff25c540 ee4de188 ff25c738 00000005 ff25c738 ee4de188 00000001 e098 00000000 c0b4ff24 ee4de000 ee4de0b8 c0019954 c0019880 200f0193 ffffffff e0b8 00000000 00000000 ff000000 ff0007f9 ff007fc8 ff004000 ff25c738 ff019954 e0d8 ffa3177c ff000005 c00198e8 c00083bc 00000000 00000000 00000000 00000000 e0f8 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e118 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 IP: 0xee4ddf80: df80 0016f880 b64de070 00000001 c0000000 eeb4ff24 c071a508 0016f880 000198d8 dfa0 00000000 c000ec20 00000000 000007f9 c0007fc8 c0004000 ff25c738 c0019954 dfc0 c0a3177c 00000005 c00198e8 c00083bc 7fffffff 00000000 00000000 b670de3c dfe0 00000000 b670db40 00039ac4 00039ad4 800f0010 b670db4c 00000000 00000000 e000 ff000000 ff010001 ff000000 ff25c540 ffa34608 ff000000 ff000015 ff00f940 e020 ee25c540 ee25d500 ee16f880 ee4de000 c0a18940 c0a2c658 ee4dfe74 ee4dfdd0 e040 00723bd8 00000000 00000000 00000000 
00000000 00000000 00019880 000f0193 e060 ffffffff ee4de0a4 ee4de188 c000e798 ee4de188 00000005 ee4de188 ff25c540 FP: 0xc0b4fea4: fea4 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 fec4 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 fee4 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff04 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff24 00000000 00000000 00000000 c07c5f64 ef004000 c07c5ed4 c09299c4 ef005600 ff44 00000000 00000000 0000003e 00000006 c07c5f64 c0929a44 eea30c00 00000001 ff64 00000000 0000003e 00000007 00000000 00000000 00000000 00000000 00000000 ff84 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 R0: 0xee4de108: e108 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e128 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e148 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e168 00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798 e188 ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738 e1a8 ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880 e1c8 200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000 e1e8 ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000 R2: 0xee4de108: e108 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e128 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e148 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e168 00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798 e188 ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738 e1a8 ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880 e1c8 200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000 e1e8 ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 
00000000 R3: 0xff25c4c0: c4c0 ******** ******** ******** ******** ******** ******** ******** ******** c4e0 ******** ******** ******** ******** ******** ******** ******** ******** c500 ******** ******** ******** ******** ******** ******** ******** ******** c520 ******** ******** ******** ******** ******** ******** ******** ******** c540 ******** ******** ******** ******** ******** ******** ******** ******** c560 ******** ******** ******** ******** ******** ******** ******** ******** c580 ******** ******** ******** ******** ******** ******** ******** ******** c5a0 ******** ******** ******** ******** ******** ******** ******** ******** R4: 0xee4de108: e108 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e128 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e148 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e168 00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798 e188 ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738 e1a8 ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880 e1c8 200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000 e1e8 ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000 R5: 0xff25c6b8: c6b8 ******** ******** ******** ******** ******** ******** ******** ******** c6d8 ******** ******** ******** ******** ******** ******** ******** ******** c6f8 ******** ******** ******** ******** ******** ******** ******** ******** c718 ******** ******** ******** ******** ******** ******** ******** ******** c738 ******** ******** ******** ******** ******** ******** ******** ******** c758 ******** ******** ******** ******** ******** ******** ******** ******** c778 ******** ******** ******** ******** ******** ******** ******** ******** c798 ******** ******** ******** ******** ******** ******** ******** ******** R7: 0xff25c6b8: c6b8 ******** ******** ******** ******** ******** ******** ******** ******** 
c6d8 ******** ******** ******** ******** ******** ******** ******** ******** c6f8 ******** ******** ******** ******** ******** ******** ******** ******** c718 ******** ******** ******** ******** ******** ******** ******** ******** c738 ******** ******** ******** ******** ******** ******** ******** ******** c758 ******** ******** ******** ******** ******** ******** ******** ******** c778 ******** ******** ******** ******** ******** ******** ******** ******** c798 ******** ******** ******** ******** ******** ******** ******** ******** R8: 0xee4de108: e108 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e128 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e148 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 e168 00000000 00000000 c0019880 200f0193 ffffffff ee4de1bc ee4de2a0 c000e798 e188 ee4de2a0 00000005 ee4de2a0 ff25c540 ee4de2a0 ff25c738 00000005 ff25c738 e1a8 ee4de2a0 00000001 00000000 c0b4ff24 ee4de000 ee4de1d0 c0019954 c0019880 e1c8 200f0193 ffffffff 00000000 00000000 00000000 000007f9 c0007fc8 c0004000 e1e8 ff25c738 c0019954 c0a3177c 00000005 c00198e8 c00083bc 00000000 00000000 <1>Unable to handle kernel paging request at virtual address 004de074 <1>pgd = ee624000 <1>[004de074] *pgd=00000000
apollon77 Posted February 19, 2017 Author Posted February 19, 2017 OK, first results from me on Cubietruck3 for 384MHz-DCDC3-13v: To compare with earlier results I started limatester with 100MB. I don't know if it is "more reliable" ... it ran for around 1.5h and froze in the 5th loop, but some errors occurred during testing:
Sun Feb 19 15:48:22 UTC 2017
This is a simple textured cube demo from the lima driver and a memtester. Both combined in a single program. The mali400 hardware is only used to stress RAM in the background. But this happens to significantly increase chances of exposing memory stability related problems.
Kernel driver is version 14
Detected 1 Mali-400 GP Cores.
Detected 2 Mali-400 PP Cores.
FB: 1920x1080@32bpp at 0x44001000 (0x00FD2000)
Using dual buffered direct rendering to FB.
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 100MB (104857600 bytes)
got 100MB (104857600 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address : testing 7FAILURE: possible bad address line at offset 0x05319ae0.
  Skipping to next test...
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 6WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  Walking Ones : ok
  Walking Zeroes : ok
Loop 2:
  Stuck Address : ok
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 26WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x015db20c (checkerboard).
  Bit Spread : ok
  Bit Flip : testing 72WRITE FAILURE: 0xfffffdff != 0xffff01ff at offset 0x006f92f8 (bitflip).
  Walking Ones : ok
  Walking Zeroes : ok
Loop 3:
  Stuck Address : ok
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 4WRITE FAILURE: 0x55555555 != 0x5555ee55 at offset 0x017d6b00 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  Walking Ones : ok
  Walking Zeroes : ok
Loop 4:
  Stuck Address : ok
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 3WRITE FAILURE: 0xaaaaaaaa != 0xaaaaf6aa at offset 0x010f42c0 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  Walking Ones : ok
  Walking Zeroes : ok
Loop 5:
  Stuck Address : testing 13FAILURE: possible bad address line at offset 0x00217340.
  Skipping to next test...
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : setting 10
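The data-mismatch lines in the log above can be compared bit by bit. The following is a small Python sketch that XORs the expected and the actual value of each "WRITE FAILURE" line to show which bits differ. The function name `flipped_bits` is made up for illustration, and treating "byte 1 of the 32-bit word" as a hint about one physical DRAM byte lane is only an assumption, not something the tool reports.

```python
import re

# Matches lines such as:
#   WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
FAILURE_RE = re.compile(
    r"FAILURE: (0x[0-9a-fA-F]{8}) != (0x[0-9a-fA-F]{8}) at offset (0x[0-9a-fA-F]+)"
)


def flipped_bits(log: str):
    """Return (offset, xor_mask, byte_indexes) for every data-mismatch line.

    byte_indexes lists which bytes of the 32-bit word differ (0 = bits 0..7).
    """
    results = []
    for expected, actual, offset in FAILURE_RE.findall(log):
        mask = int(expected, 16) ^ int(actual, 16)
        byte_indexes = [i for i in range(4) if (mask >> (8 * i)) & 0xFF]
        results.append((offset, mask, byte_indexes))
    return results


# The five mismatch lines copied from the log in this post.
log = """
Checkerboard : testing 6WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
Checkerboard : testing 26WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x015db20c (checkerboard).
Bit Flip : testing 72WRITE FAILURE: 0xfffffdff != 0xffff01ff at offset 0x006f92f8 (bitflip).
Checkerboard : testing 4WRITE FAILURE: 0x55555555 != 0x5555ee55 at offset 0x017d6b00 (checkerboard).
Checkerboard : testing 3WRITE FAILURE: 0xaaaaaaaa != 0xaaaaf6aa at offset 0x010f42c0 (checkerboard).
"""

for offset, mask, byte_indexes in flipped_bits(log):
    print(f"offset {offset}: xor mask 0x{mask:08x}, differing bytes {byte_indexes}")
```

For these five lines every XOR mask falls into bits 8 to 15, so only byte 1 of the word is ever wrong. That pattern may be worth mentioning when reporting results, although it does not by itself say what the cause is.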
apollon77 Posted February 19, 2017 Author Posted February 19, 2017 u-boot 432MHz-DCDC3-13v is now running in its 5th loop without any errors or failures. So it seems that the power option brings no real change: on this Cubietruck 432 works and 384 freezes.
apollon77 Posted February 19, 2017 Author Posted February 19, 2017 u-boot 432MHz-DCDC3-13v ran 20 loops without any errors or failures. Now I have installed the official u-boot 5.20 and am giving it an overnight test too, to see if it is as stable as the 432 tpm8 u-boot :-) Tomorrow evening: test the 432 u-boot on cubietruck4 (to see if the start now works) and cubietruck5 (was stable so far).
ssvb Posted February 20, 2017 Posted February 20, 2017 Quote (apollon77): "OK, first results from me on Cubietruck3 for 384MHz-DCDC3-13v: To compare with earlier results I started limatester with 100MB. I don't know if it is "more reliable" ... it ran for around 1.5h and froze in the 5th loop, but some errors occurred during testing:"
Sun Feb 19 15:48:22 UTC 2017
This is a simple textured cube demo from the lima driver and a memtester. Both combined in a single program. The mali400 hardware is only used to stress RAM in the background. But this happens to significantly increase chances of exposing memory stability related problems.
Kernel driver is version 14
Detected 1 Mali-400 GP Cores.
Detected 2 Mali-400 PP Cores.
FB: 1920x1080@32bpp at 0x44001000 (0x00FD2000)
Using dual buffered direct rendering to FB.
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 100MB (104857600 bytes)
got 100MB (104857600 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address : testing 7FAILURE: possible bad address line at offset 0x05319ae0.
  Skipping to next test...
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 6WRITE FAILURE: 0x55555555 != 0x5555d755 at offset 0x01437f20 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  Walking Ones : ok
  Walking Zeroes : ok
Freezes and memory corruption errors are very likely two different types of problems (and both are bad). If DCDC3 voltage is insufficient, then we get freezes. If the clock speed is too high (or too low), then we get memory corruption errors. Yes, the clock speed also affects freezes to a smaller extent.
apollon77 Posted February 20, 2017 Author Posted February 20, 2017 Quote (ssvb): "Freezes and memory corruption errors are very likely two different types of problems (and both are bad). If DCDC3 voltage is insufficient, then we get freezes. If the clock speed is too high (or too low), then we get memory corruption errors. Yes, the clock speed also affects freezes to a smaller extent." Hm ... but the results here show freezes more connected to the clock speed than to the DCDC3 voltage ...
apollon77 Posted February 20, 2017 Author Posted February 20, 2017 Now it's getting weird ... A Cubietruck that had no problems so far (cubietruck5), and which seems to work well with the 384 u-boot, seems to have problems with 432 :-( It froze after 2-5h with the tpm8-normal-432 u-boot ... I am now testing with 432-DCDC3 and will post the result ... To me it seems that, for whatever reason, some boards work better with 432 and some better with 384 ... And then it would be problematic to provide one u-boot for all of them :-(
ssvb Posted February 20, 2017 Posted February 20, 2017 Freezes are more connected to DCDC3 voltage because if it is high enough, then the freezes usually disappear and you only have to deal with just data corruption alone. There is one more trick. DRAM configuration is not deterministic because ZQ calibration is run at the initialization time. You can get the current results of ZQ calibration and a lot of other settings by running the a10-meminfo tool from https://github.com/ssvb/a10-dram-tools This may explain why sometimes you have a reliable board and sometimes it fails. ZQ calibration usually differs between cold start and warm reboot. It is also possible to bypass the ZQ calibration step and hardcode ZQ settings directly via the CONFIG_DRAM_ZQ parameter. This way the DRAM controller configuration will become deterministic. Some extra information can be found at https://linux-sunxi.org/User:Ssvb/pcDuino2_with_HYNIX_DDR3_reliability_test
apollon77 Posted February 20, 2017 Author Posted February 20, 2017 The question is whether this is the goal ... I would like to support a generically usable setting for Armbian ... As soon as I start using device-specific settings, I'm out for the "Armbian default u-boot" ... yes, my devices may be more stable ... hm ...
apollon77 Posted February 21, 2017 Author Posted February 21, 2017 Cubietruck5: Quote: "The cubietruck froze after 2-5h with tpm8-normal-432 u-boot ... I test now with 432-DCDC3 and will post the result ..." It has now been running for 12h without errors or freezes, so here the DCDC3 change worked. I will now test 384-DCDC3 to see if it is more stable.
zador.blood.stained Posted February 21, 2017 Posted February 21, 2017 Quote (apollon77): "The question is whether this is the goal ... I would like to support a generically usable setting for Armbian ... As soon as I start using device-specific settings, I'm out for the "Armbian default u-boot" ... yes, my devices may be more stable ... hm ..." In order to fix the problem you need to find the problem first. And it's not about Armbian; other distros that use mainline u-boot will be affected by the same settings and algorithms. Again, the output of a10-meminfo before executing the stability tests may help in finding or ruling out certain possible causes for the instability.
tpm8 Posted February 21, 2017 Posted February 21, 2017 This is the a10-meminfo output with mainline u-boot @432 MHz and DCDC3=1.3V. Passed lima-memtester running about 8h (2 complete loops) without problems.
root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk = 432
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x05060606
active_windowing = 0
zador.blood.stained Posted February 21, 2017 Posted February 21, 2017 Quote (tpm8): "This is a10-meminfo output with mainline u-boot @432 MHz and DCDC3=1.3V" It's better to provide info like this:
This was the output with mainline u-boot @xxxMHz y.yV and it froze during the tests
This was the output with mainline u-boot @xxxMHz y.yV and it didn't freeze but displayed errors during the lima-memtester run
This was the output with mainline u-boot @xxxMHz y.yV and it passed the test running for x hours
tpm8 Posted February 22, 2017 Posted February 22, 2017 This is the a10-meminfo output with mainline u-boot @432 MHz and default DCDC3 (1.25V). This is the mainline u-boot default defconfig for the Cubietruck. Passed lima-memtester running about 12h (3 complete loops) without problems.
root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk = 432
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x06060606
active_windowing = 0
tpm8 Posted February 22, 2017 Posted February 22, 2017 This is the a10-meminfo output with mainline u-boot @384 MHz and DCDC3=1.3V. lima-memtester immediately freezes.
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000124
<1>pgd = ee034000
<1>[00000124] *pgd=6ebb7831, *pte=00000000, *ppte=00000000
<0>Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 252.758001] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
<d>Modules linked in:[ 252.765391] Modules linked in: mali mali ump ump bnep bnep cpufreq_userspace cpufreq_userspace sunxi_ir sunxi_ir hidp hidp rfcomm rfcomm hci_uart hci_uart bluetooth bluetooth [last unloaded: scsi_wait_scan] [last unloaded: scsi_wait_scan]
root@cubietruck:~/a10-dram-tools# ./a10-meminfo
dram_clk = 384
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x05050505
active_windowing = 0
werdy Posted February 22, 2017 Posted February 22, 2017 Hello everyone, glad to have found this great forum. I have two boards and one of them has a similar problem. With the old debian-server-ct-nand-v1.0.img it is rock stable, but with Armbian Debian Jessie mainline it throws exceptions during boot (and freezes). I compiled u-boot from the Armbian source with various memory clocks (360-432 MHz) without any difference (the clock changed, but no successful boot). I also tried to set ZQ (from 7B to 7D, as it is the only "visible" difference between the original Debian and the Armbian memory settings), but it isn't changing (according to a10-meminfo). Any idea what the problem could be, or what my next step should be? The "good" board (which works with Armbian) has GT8UB512V memory ICs, while the "bad" one has SKHynix.
apollon77 Posted February 22, 2017 Author Posted February 22, 2017 OK, results from "cubietruck5":
1.) Armbian u-boot 5.20 (432 DRAM speed): freeze after 9h of limatester 1000MB
root@cubietruck:~# a10-meminfo
dram_clk = 432
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x06060606
active_windowing = 0
2.) Armbian u-boot 5.25 (384 DRAM speed): no freeze in 24h with limatester 1000MB
root@cubietruck:~# a10-meminfo
dram_clk = 384
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x05050505
active_windowing = 0
I will probably do the complete test with my problem device tomorrow. But as you can see, the values are exactly the same as in the post from tpm8 above for a problematic device. I fear I will find the same.
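Comparing these dumps by eye is error-prone, so here is a minimal Python sketch that parses two a10-meminfo outputs and lists only the settings that differ. The helper names `parse_meminfo` and `diff_meminfo` are made up for illustration; the sketch assumes the plain `key = value` line format shown in this thread and keeps all values as strings.

```python
def parse_meminfo(text: str) -> dict:
    """Parse `key = value` lines (as printed by a10-meminfo) into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not a setting
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_meminfo(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# The two cubietruck5 dumps from this post (u-boot 5.20 and u-boot 5.25).
run_520 = """
dram_clk = 432
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x06060606
active_windowing = 0
"""

# Identical except for the two values that changed in the 5.25 dump.
run_525 = run_520.replace("dram_clk = 432", "dram_clk = 384").replace(
    "dqs_gating_delay = 0x06060606", "dqs_gating_delay = 0x05050505"
)

for key, (old, new) in diff_meminfo(parse_meminfo(run_520), parse_meminfo(run_525)).items():
    print(f"{key}: {old} -> {new}")
```

For the two dumps above only `dram_clk` and `dqs_gating_delay` differ; everything else, including `dram_zq`, is identical.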
apollon77 Posted February 24, 2017 Author Posted February 24, 2017 Second device (cubietruck3), my problematic one:
1.) Armbian u-boot 5.20 (432 DRAM speed): no freeze with limatester 1000MB
root@cubietruck:~# a10-meminfo
dram_clk = 432
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x06060605
active_windowing = 0
2.) Armbian u-boot 5.25 (384 DRAM speed): memory errors with limatester 1000MB and freeze/auto-restart with limatester 100MB
root@cubietruck:~# a10-meminfo
dram_clk = 384
mbus_clk = 300
dram_type = 3
dram_rank_num = 1
dram_chip_density = 8192
dram_io_width = 16
dram_bus_width = 32
dram_cas = 9
dram_zq = 0x7b (0x5294a00)
dram_odt_en = 0
dram_tpr0 = 0x42d899b7
dram_tpr1 = 0xa090
dram_tpr2 = 0x22a00
dram_tpr3 = 0x0
dram_emr1 = 0x4
dram_emr2 = 0x10
dram_emr3 = 0x0
dqs_gating_delay = 0x05050505
active_windowing = 0
The memory errors were:
pagesize is 4096
pagesizemask is 0xfffff000
want 1000MB (1048576000 bytes)
got 1000MB (1048576000 bytes), trying mlock ...locked.
Loop 1:
  Stuck Address : testing 2FAILURE: possible bad address line at offset 0x1b05c300.
  Skipping to next test...
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 5WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x1873ab40 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  Walking Ones : ok
  Walking Zeroes : ok
Loop 2:
  Stuck Address : ok
  Random Value : ok
  Compare XOR : ok
  Compare SUB : ok
  Compare MUL : ok
  Compare DIV : ok
  Compare OR : ok
  Compare AND : ok
  Sequential Increment: ok
  Solid Bits : ok
  Block Sequential : ok
  Checkerboard : testing 1WRITE FAILURE: 0xaaaaaaaa != 0xaaaa55aa at offset 0x0a3c8b40 (checkerboard).
  Bit Spread : ok
  Bit Flip : ok
  ...