Posted

After flashing Armbian 5.20 to my SD card and running apt-get upgrade, the board fails to boot with the new kernel.

 

The last thing I get from the console is:

Image lacks image_size field, assuming 16MiB
## Loading init Ramdisk from Legacy Image at 45300000 ...
   Image Name:   uInitrd
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    3845164 Bytes = 3.7 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 45000000
   Booting using the fdt blob at 0x45000000
   reserving fdt memory region: addr=45000000 size=200000
   reserving fdt memory region: addr=41010000 size=10000
   reserving fdt memory region: addr=41020000 size=800
   reserving fdt memory region: addr=40100000 size=4000
   reserving fdt memory region: addr=40104000 size=1000
   reserving fdt memory region: addr=40105000 size=1000
   Loading Ramdisk to b6b0f000, end b6eb9c2c ... OK
   Loading Device Tree to 44fec000, end 44fffddb ... OK

Starting kernel ...

[mmc]: MMC Device 2 not found
[mmc]: mmc 2 not find, so not exit
INFO:    BL3-1: Next image address = 0x41080000
INFO:    BL3-1: Next image spsr = 0x3c9
Loading, please wait...
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ...

Followed soon after by

Begin: Running /scripts/local-premount ... [   36.407826] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:4]
[   36.418567] Call trace:
[   36.421371] Call trace:
[   36.424151] Call trace:
[   36.426959] Call trace:
[   36.430509] Call trace:
[   36.433307] Call trace:
[   36.436100] Call trace:
[   36.438892] Call trace:
[   36.441685] Call trace:
[   36.444480] Call trace:
[   36.447285] Call trace:
[   36.450078] Call trace:
[   36.452869] Call trace:
[   36.455662] Call trace:
[   36.458452] Call trace:
[   36.461243] Call trace:
[   36.464035] Call trace:
[   36.466830] Call trace:
[   36.469624] Call trace:
[   36.472416] Call trace:
[   36.475208] Call trace:
[   36.477999] Call trace:
[   36.480790] Call trace:
[   36.483585] Call trace:
[   36.486380] Call trace:
[   36.489171] Call trace:
[   36.491960] Call trace:
[   36.494750] Call trace:
[   36.497539] Call trace:
[   36.500333] Call trace:
[   36.503128] Call trace:
[   36.505923] Call trace:
[   36.508717] Call trace:
[   36.511516] Call trace:
[   36.514311] Call trace:
[   36.517107] Call trace:
[   36.519900] Call trace:
[   36.522694] Call trace:
[   36.525490] Call trace:
[   36.528287] Call trace:
[   36.531083] Call trace:
[   36.533879] Call trace:
[   36.536676] Call trace:
[   36.539472] Call trace:
[   36.542264] Call trace:
[   36.545060] Call trace:
[   36.547856] Call trace:
[   36.550649] Call trace:
[   36.553753] Call trace:
[   36.556552] Call trace:
[   36.559350] Call trace:
[   36.562141] Call trace:
[   36.564964] Call trace:
[   36.567759] Call trace:
[   36.570553] Call trace:
[   36.573347] Call trace:
[   36.576148] Call trace:
[   36.578945] Call trace:
[   36.581736] Call trace:
[   36.584527] Call trace:
[   36.587320] Call trace:
[   36.590117] Call trace:
[   36.592909] Call trace:
[   36.595702] Call trace:
[   36.598492] Call trace:
[   36.601284] Call trace:
[   36.604076] Call trace:
[   36.606873] Call trace:
[   36.609667] Call trace:
[   36.612494] Call trace:
[   36.615286] Call trace:
[   36.618081] Call trace:
[   36.620907] Call trace:
[   36.623700] Call trace:
[   36.626492] Call trace:
[   63.959820] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:4]
[   63.970537] Call trace:

It then repeats continually.

 

If I flash the old image back, it boots fine; after running apt-get upgrade again, it fails.

 

Google found this: https://www.bountysource.com/issues/38404155-pine64-boot-issue-cpu-stuck but it doesn't have an answer yet.

 

 

Posted

I am using Armbian (Server 5.24, Legacy 3.10.104) on the Pine64 and it is a great system. Unfortunately, I also encounter this boot problem: about every third time, it fails to boot (repeating "Call trace: ...").

 

Is there anything I can do?

I have already checked the power supply.

Can I use other versions of the dtb files or the kernel?

 

Thanks a lot.

Posted

I am currently running ARMBIAN 5.25 stable Ubuntu 16.04.2 LTS 3.10.104-pine64 (server) and also had this problem after I applied updates unless I had a monitor attached.

To resolve the startup issue, I changed from DHCP to a static IP address. It starts up every time now.

Posted

Thanks for your advice. Unfortunately, I have no choice but to use the DHCP client in my production environment.

I would appreciate any other ideas.

Posted

Well, things have improved.

I installed the latest Armbian Debian Jessie Server release and after applying all of the updates, I ended up with a more stable system.

I did still have some startup issues, but I found that changing the disp_mode parameter from 720p60 to 480p or 480i in /boot/armbianEnv.txt fixes them.
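For anyone who wants to script this change, here is a minimal Python sketch. It assumes /boot/armbianEnv.txt is a plain list of key=value lines (as the disp_mode, verbosity and extraargs entries mentioned in this thread suggest). The helper name `set_env_value` is made up for this example and is not part of Armbian.

```python
def set_env_value(text: str, key: str, value: str) -> str:
    """Return `text` with `key=value` set, replacing an existing line or appending one."""
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before the first "=", so values containing "=" are safe.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "verbosity=7\ndisp_mode=720p60\n"
    print(set_env_value(sample, "disp_mode", "480i"), end="")
```

On a real board you would read the file, pass its contents through the function, and write the result back as root, ideally after making a backup copy.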

I left the default setting for DHCP in /etc/network/interfaces.

 

I also have another system running Armbian Ubuntu Server with Mate installed. This was also having startup issues without a static IP set.

Now, after the recent updates which I applied today, this system works with dhcp and the disp_mode parm of 720p60 or 1080p60.

 

These are running on a Pine64+ 2GB.

Posted
On 3/6/2017 at 7:04 AM, dano said:

Well, things have improved.

I installed the latest Armbian Debian Jessie Server release and after applying all of the updates, I ended up with a more stable system.

I did still have some startup issues, but I found that changing the disp_mode parameter from 720p60 to 480p or 480i in /boot/armbianEnv.txt fixes them.

I left the default setting for DHCP in /etc/network/interfaces.

I also have another system running Armbian Ubuntu Server with Mate installed. This was also having startup issues without a static IP set.

Now, after the recent updates which I applied today, this system works with dhcp and the disp_mode parm of 720p60 or 1080p60.

These are running on a Pine64+ 2GB.

I wanted to say thanks for the suggestion, it's helped me too. 

 

I started my Pine64+ 2GB today for the first time, and after failing to get openSUSE running, I tried Armbian. I too was having issues with the board not booting sometimes, but it seems that setting the display resolution (which I never use anyway) to 480i helped. I've done a few reboots since I made the change and none of them got stuck. Thank you! :)

Posted

Unfortunately, the Pine64 continues to fail to boot with the current Armbian version.
I finally ran a few tests on the boot problem, using different SD cards.
I used the current Ubuntu server version (Armbian_5.25_Pine64_Ubuntu_xenial_default_3.10.104, upgraded on 11.04.17), with only Ethernet and power connected (no HDMI, no USB device, ...). Power was supplied through the pin headers (not micro USB), and Ethernet used DHCP. I used three different Pine64+ boards with 2GB of RAM for my tests.

 

Results:
The boot problem (the Pine64 does not boot completely) continued to occur, at a frequency that depended on the SD card used:

- With the 4GB cards, the Pine64 booted in only 46% of cases (90 tests).
- With the 8GB cards, the Pine64 booted in 90% of cases (60 tests).
- With the 32GB cards, the Pine64 booted in 50% of cases (60 tests).

 

The boot console shows the following behavior, similar to christf's post:

Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.27.1
[   14.723973] [DISP] disp_device_attached_and_enable,line:159:[/sbin/fsck.ext4 (1) -- /dev/mmcblk0p1] fsck.ext4 -a -C0 /dev/mmcblk0p1 
/dev/mmcblk0p1: clean, 47357/216832 files, 366370/915968 blocks
done.
attched ok, mgr0<-->device1, type=4, mode=5

 

And then the following repeats continually:

[   40.681832] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:1:35]
[   40.693596] Modules linked in:
[   40.701616] 
[   40.707811] CPU: 0 PID: 35 Comm: kworker/0:1 Not tainted 3.10.104-pine64 #2
[   40.720179] Workqueue: events start_work
[   40.729239] task: ffffffc078b1a4c0 ti: ffffffc078b74000 task.ti: ffffffc078b74000
[   40.742218] PC is at __do_softirq+0xb4/0x2d8
[   40.751573] LR is at __do_softirq+0x30/0x2d8
[   40.760848] pc : [<ffffffc0000b5fc4>] lr : [<ffffffc0000b5f40>] pstate: 40000145
.... 

 

 

I hope someone can give me an idea of how to solve the problem.

Thanks a lot.

 

BOOTFail.log

Posted

Thanks a lot for your support. I have installed the kernel headers (apt install linux-headers-pine64_5.27.170414_arm64.deb), and unfortunately there is not much improvement. In the first 20 boot tests with a 4GB card, the boot success rate was 50%, compared with 46% with the old headers.

InstallKernel.txt

Posted

Sorry, I picked the wrong package.

Now I have installed the linux-image package and carried out the first boot tests with a 4GB SD card.

And the result was great: 90% success. My Pine64 started up almost every time.

I will continue the tests over the next few days.

Thank you, you have done me a big favor.

Posted
15 minutes ago, linda said:

And the result was great: 90% success. My Pine64 started up almost every time.

I will continue the tests over the next few days.

Please provide a lockup log with this new kernel too; maybe it will help improve the reliability further.

Posted

I have now done another 120 boot attempts with the 4GB SD cards and got the following results:
- The boot success rate was 80%.
- The lockup logs show results similar to before, but slightly different (see the attached files):

 

For example

One boot failure showed the following behavior:

[   14.834515] Freeing unused kernel memory: 524K (ffffffc0009fc000 - ffffffc000a7f000)
Loading, please wait...
starting version 229
[   15.262882] [DISP] disp_device_attached_and_enable,line:159:attched ok, mgr0<-->device1, type=4, mode=5
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.27.1
[/sbin/fsck.ext4 (1) -- /dev/mmcblk0p1] fsck.ext4 -a -C0 /dev/mmcblk0p1  
/dev/mmcblk0p1: recovering journal
/dev/mmcblk0p1: clean, 47463/216832 files, 368808/915968 blocks
done.
[   41.122365] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:1:35]

 

One boot failure started with:

[   14.378982] Freeing unused kernel memory: 524K (ffffffc0009fc000 - ffffffc000a7f000)
Loading, please wait...
starting version 229
[   36.758677] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:1:35]

 

And other boot failures followed this boot output:

[   14.405592] Freeing unused kernel memory: 524K (ffffffc0009fc000 - ffffffc000a7f000)
starting version 229
[   14.487651] hub 3-0:1.0: USB hub found
[   14.496552] hub 3-0:1.0: 1 port detected
[   14.505686] scene_lock_init name=ohci_standby
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.27.1
[   14.916161] [DISP] disp_device_attached_and_enable,line:159:attched ok, mgr0<-->device1, type=4, mode=5
[/sbin/fsck.ext4 (1) -- /dev/mmcblk0p1] fsck.ext4 -a -C0 /dev/mmcblk0p1  
/dev/mmcblk0p1: recovering journal
/dev/mmcblk0p1: clean, 47374/216832 files, 347125/915968 blocks
done.
[   40.716572] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:1:35]
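To compare failures like the three above more systematically, a small Python sketch such as the following could extract, from each captured console log, the last kernel timestamp seen before the first soft lockup report and the timestamp of the report itself. This is only an illustration: the function name `lockup_gap` is made up, and it assumes timestamps appear at the start of a line in the `[   14.378982]` form shown in these logs.

```python
import re

# Kernel timestamp at the start of a console line, e.g. "[   14.378982] ..."
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def lockup_gap(log: str):
    """Return (last_timestamp_before_lockup, lockup_timestamp), or None if no lockup is found."""
    last_seen = None
    for line in log.splitlines():
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # initramfs messages carry no timestamp
        stamp = float(match.group(1))
        if "BUG: soft lockup" in line:
            return (last_seen, stamp)
        last_seen = stamp
    return None


if __name__ == "__main__":
    sample = """\
[   14.378982] Freeing unused kernel memory: 524K (ffffffc0009fc000 - ffffffc000a7f000)
Loading, please wait...
starting version 229
[   36.758677] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:1:35]
"""
    before, at = lockup_gap(sample)
    print(f"last message at {before:.2f}s, lockup reported at {at:.2f}s, gap {at - before:.2f}s")
```

Lines without a timestamp are skipped, so the reported gap is only an upper bound on how long the console was really silent.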

 

 

I hope these lockup logs help in troubleshooting. Thanks a lot!!

BOOTFail_2017-04-15-B1.log

BOOTFail_2017-04-15-B2.log

BOOTFail_2017-04-15-C1.log

Posted (edited)

I did some more tests with additional SD cards: another brand of 8GB card, a 4GB card, and one 16GB card.

The results were much the same as before: about 80% of boots succeeded.


After equipping my Pine boards with a reset switch, I did the following test:
if the board has not booted within 50 seconds, press the reset switch.
With the 5.27 image, the boards reached a boot success rate of 99% after the second reset.

Summary:
- 260 boot attempts (power on)
- 204 succeeded (78%)
- 251 succeeded after one reset (96%)
- 258 succeeded after two resets (99%)
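The arithmetic behind this summary can be checked with a short Python sketch. It is my own illustration rather than anything from the test setup: the function name `stage_rates` is made up, and the only inputs are the counts listed above. Besides the cumulative rates, it derives how many of the still-failing boards each reset recovered.

```python
def stage_rates(attempts: int, cumulative: list[int]):
    """For each stage, return (recovered, tried, percent) where `tried` is the
    number of boots that had not yet succeeded before that stage."""
    rates = []
    remaining = attempts
    previous = 0
    for total in cumulative:
        recovered = total - previous
        rates.append((recovered, remaining, 100.0 * recovered / remaining))
        remaining = attempts - total
        previous = total
    return rates


if __name__ == "__main__":
    attempts = 260
    cumulative = [204, 251, 258]  # power on, after one reset, after two resets
    labels = ["power on", "first reset", "second reset"]
    for label, total, (recovered, tried, pct) in zip(
        labels, cumulative, stage_rates(attempts, cumulative)
    ):
        print(
            f"{label}: {recovered}/{tried} succeeded ({pct:.1f}%), "
            f"cumulative {total}/{attempts} ({100.0 * total / attempts:.1f}%)"
        )
```

Read this way, the first reset recovered 47 of the 56 failed boots and the second reset recovered 7 of the remaining 9, so each retry succeeds at roughly the same rate as a cold power-on.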

Are there any more tests I could run or any information I could provide to support you?

 

Will the 5.27 image become available via a standard apt upgrade?
If so, when do you think this will happen?

 

Edited by linda
Posted
1 hour ago, linda said:

Are there any more tests I could run or any information I could provide to support you?

 

I did some more tests and looked more closely at the stack traces.

If I understand it correctly it actually hangs somewhere here, most likely in __setup_irq or shortly after it:

[   40.941570] [<ffffffc000083dc0>] el1_irq+0x80/0xe4
[   40.950287] [<ffffffc000125844>] __setup_irq+0x318/0x3e0
[   40.959557] [<ffffffc000125a84>] request_threaded_irq+0xe0/0x124
[   40.969620] [<ffffffc00041280c>] disp_sys_register_irq+0x88/0x98
[   40.979698] [<ffffffc000420610>] disp_hdmi_enable+0x1d4/0x278
[   40.989485] [<ffffffc000414540>] disp_device_attached_and_enable+0x1bc/0x1d4
[   41.000742] [<ffffffc0004146f8>] bsp_disp_device_switch+0xbc/0xe4

In theory, completely disabling the display driver should help in headless use cases, but implementing this needs some rework.
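For anyone comparing several lockup logs, a small Python sketch like the following can pull the symbol names out of call trace lines such as the ones quoted above, which makes it easier to see whether every failure passes through the same display driver functions. It is only an illustration: the function name `parse_frames` and the regular expression are assumptions about the line format, based on the trace shown here.

```python
import re

# Matches frames like: [<ffffffc000125844>] __setup_irq+0x318/0x3e0
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(log: str):
    """Return (symbol, offset, size) tuples for every call trace frame found in `log`."""
    frames = []
    for line in log.splitlines():
        match = FRAME_RE.search(line)
        if match:
            _addr, symbol, offset, size = match.groups()
            frames.append((symbol, int(offset, 16), int(size, 16)))
    return frames


if __name__ == "__main__":
    sample = """\
[   40.941570] [<ffffffc000083dc0>] el1_irq+0x80/0xe4
[   40.950287] [<ffffffc000125844>] __setup_irq+0x318/0x3e0
[   40.959557] [<ffffffc000125a84>] request_threaded_irq+0xe0/0x124
"""
    for symbol, offset, size in parse_frames(sample):
        print(f"{symbol} at offset {offset:#x} of {size:#x}")
```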

 

1 hour ago, linda said:

Will the 5.27 image become available via a standard apt upgrade?

Yes

 

1 hour ago, linda said:

If so, when do you think this will happen?

 

1-2 months

 

1 hour ago, linda said:

Are there any more tests I could run or any information I could provide to support you?

 

I don't think so. It's not easy to debug, and hopefully the mainline kernel will soon be good enough for everyday use (at least headless/server), so we can forget about the BSP kernel.

Posted

I would not mind disabling the display driver. I am running headless anyway.

Is there any easy way to disable the display driver?

 

Posted
11 minutes ago, linda said:

Is there any easy way to disable the display driver?

 

Disabling it in the kernel config is the easiest way.

Reading a Device Tree property to allow disabling it without recompilation would be the correct way. I'll try to implement this before the update and test whether it actually resolves the issue.

Posted

I've been having problems with the kernel boot stalling on my headless Pine64+ with Armbian 5.25. It occurs 95% of the time after an unclean shutdown (e.g. power loss). By contrast, it occurs only 5% of the time after a graceful shutdown/reboot, as the CPU usually locks up after fsck has finished, and fsck is not needed after a clean shutdown. Debug output to the console seems to increase the likelihood of a crash substantially.

 

So I've added "extraargs=loglevel=1" to /boot/armbianEnv.txt to override "verbosity=7", which seems to be reinstated upon boot and results in the kernel booting with loglevel=7 after an unclean shutdown. I'm now seeing CPU lockups as rarely as with a graceful shutdown (i.e. about 5%). There seems to be a race condition influenced by kernel boot time (which is marginally longer when console logging is at debug level).
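To confirm which log level the kernel actually booted with, one option is to look at the kernel command line (available in /proc/cmdline on a running Linux system). The Python sketch below parses such a string. The helper name `cmdline_loglevel` and the sample command line are made up for illustration, and it assumes that when loglevel= appears more than once, the last occurrence is the one that takes effect.

```python
def cmdline_loglevel(cmdline: str):
    """Return the last loglevel=N value on a kernel command line, or None if absent."""
    level = None
    for token in cmdline.split():
        if token.startswith("loglevel="):
            value = token[len("loglevel="):]
            if value.isdigit():
                level = int(value)  # assumption: a later occurrence overrides an earlier one
    return level


if __name__ == "__main__":
    # Hypothetical command line, not copied from a real Pine64 boot.
    sample = "console=ttyS0,115200 root=/dev/mmcblk0p1 loglevel=7 loglevel=1"
    print(cmdline_loglevel(sample))
```

On the board itself, the same function could be fed the contents of /proc/cmdline after both a clean reboot and a power-loss reboot to see whether the override is applied in both cases.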

 

Obviously if other people are seeing high failure rate regardless of kernel console logging level, my findings are not relevant to their issues, but I figured this might help someone...

Posted

Hi Zador,

 

I'm still observing this behavior with the Jessie-based Armbian 5.36, after having to run update-initramfs -u...

 

Before that, I had no problem with reboots. After that, I have to unplug and replug the power supply for the system to boot.

 

I've just tried the 5.37 beta and the problem persists...

 

Do you have any hint on the reason this is happening only after updating initramfs?

 

Thanks.

Posted
10 hours ago, ZupoLlask said:

Do you have any hint on the reason this is happening only after updating initramfs?

Because this is most likely a race condition and it may depend on initramfs and kernel size and alignment, CPU speed, threads and kworkers being assigned to different CPU cores and other unpredictable things.

Posted

I brought a TTL adaptor from my company's lab and now I can confirm I'm having precisely the same problem described in the OP.

 

I also found this thread, which may help others better understand this issue: https://github.com/longsleep/build-pine64-image/issues/51

 

As I doubt I have a problem with the power supply I'm using, for now I decided to test another USB cable.

After that, I'll properly test all the components involved if I need to...

Posted

I've just tried to reproduce the problem with a 15cm cable... It still happens easily.

 

In my case, I don't think it's related to the power supply, but I'll try to clear that up in the lab.

 

If it were, it should happen consistently when several USB devices load the power supply while the 4 CPU cores are at 100% each, which never occurs.

This topic is now closed to further replies.