
OOM killer invoked but lots of free memory!! OPI pc, 4.7.3 kernel.


rick


So I thought I had a stable system with my OPI PC on the 4.7.3-sun8i kernel (for BTRFS support), but the OOM killer keeps silently triggering. I can see it on the serial terminal and in the logs.

So I have done some troubleshooting. I added the following to /etc/sysctl.conf:

# try a value that will actually use the swap
vm.swappiness=10
# ensure that some memory is always available, held back from file caching
vm.min_free_kbytes=64000
# kill the task that requested the memory, not the task with the most memory to give up, to see if there is an obvious trigger culprit
vm.oom_kill_allocating_task=1
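To confirm the kernel actually picked these values up (for example after reloading the file with `sysctl -p` as root), the live settings can be read back from /proc/sys/vm. This is only a minimal sketch; the function name `show_vm_tunables` is made up for illustration.

```shell
# Print the live value of each tunable set in /etc/sysctl.conf above.
# Reads /proc/sys directly, so no root is needed.
show_vm_tunables() {
    for key in swappiness min_free_kbytes oom_kill_allocating_task; do
        file="/proc/sys/vm/$key"
        if [ -r "$file" ]; then
            printf 'vm.%s = %s\n' "$key" "$(cat "$file")"
        else
            # Not Linux, or the tunable does not exist on this kernel.
            printf 'vm.%s = (not readable)\n' "$key"
        fi
    done
}

show_vm_tunables
```

If a value printed here differs from the one in the file, the setting was not applied.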

 

So, what I found is that the system's memory usually looks like this when running my system restore script:

              total        used        free      shared  buff/cache   available
Mem:        1026284      195292      123140        3284      707852      577196
Swap:       1097724       23340     1074384

So it is not a memory shortage. It is triggered by random things that are not using much memory, like:

[13457.612306] Out of memory (oom_kill_allocating_task): Kill process 6285 (sessionclean) score 0 or sacrifice child
[13457.622666] Killed process 6285 (sessionclean) total-vm:1460kB, anon-rss:76kB, file-rss:0kB, shmem-rss:0kB
[15248.539294] Out of memory (oom_kill_allocating_task): Kill process 6593 (phpquery) score 0 or sacrifice child
[15248.549794] Killed process 6593 (phpquery) total-vm:1460kB, anon-rss:104kB, file-rss:0kB, shmem-rss:0kB
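To see at a glance which processes get killed and how little memory they held, the "Killed process" lines can be boiled down with sed. A rough sketch, assuming the log format shown above (other kernel versions word these lines differently); `summarize_oom_kills` is a made-up name.

```shell
# Read kernel log text on stdin and print one line per OOM kill:
# process name, pid, and anonymous RSS in kB.
# Assumes the "Killed process PID (NAME) ... anon-rss:NkB" layout shown above.
summarize_oom_kills() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*anon-rss:\([0-9][0-9]*\)kB.*/\2 pid=\1 anon-rss=\3kB/p'
}

# Usage (dmesg may need root on some systems):
#   dmesg | summarize_oom_kills
```

Lines that do not match, such as the "Out of memory ... Kill process" announcements, are skipped, so each kill shows up once.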
 

It has killed sh, watch, ls, ... all small potatoes. Sometimes a big one like apache or mysql too.

 

Any ideas? Kernel bug or something I can fix?

 

It does seem to like to crash during this command, which comes from a script that restores a folder under /var/www from a remote tgz. I have been running the restore over and over to crash the system with it. It only OOMs sometimes:

(ssh "$destuser" "cat $destdir/www" | pv | unpigz | tar -xp) || abort

 

 

Any ideas? I tried to figure it out myself but I am in over my head so.....

 

 


These OPI PC boards might have to be abandoned as unstable. I can't get this one to run stable with the 4.7.3 kernel, and it crashes on boot with 4.8.3.

 

I have replaced the power supply with a 3A buck-boost device and fat leads that ensure a steady 5.15V at the input and 5.05V at the USB ports. It does not seem to be a power issue, and the power seems to have been good before too (4.93V-5.07V at the USB port output).

 

Ok, so I made my btrfs root fs a raid1, ran the restore command above, and started to get this under the extra load:

 

[ 3641.664542] kernel BUG at lib/dynamic_queue_limits.c:26!                                                                                             
[ 3641.669931] Internal error: Oops - BUG: 0 [#1] SMP ARM                                                                                               
[ 3641.675133] Modules linked in: xt_multiport evdev ir_lirc_codec lirc_dev sunxi_cir sun8i_ths cpufreq_dt uio_pdrv_genirq uio thermal_sys ip6t_REJECT s
[ 3641.721723] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.7.3-sun8i #5                                                                   
[ 3641.729390] Hardware name: Allwinner sun8i Family                                                                                                    
[ 3641.734166] task: c0d05f00 ti: c0d00000 task.ti: c0d00000                                                                                            
[ 3641.739653] PC is at dql_completed+0x100/0x17c                                                                                                       
[ 3641.744176] LR is at sun8i_emac_poll+0x26c/0x6a0                                                                                                     
[ 3641.748867] pc : [<c0562de0>]    lr : [<c06353dc>]    psr: 800e0113                                                                                  
[ 3641.748867] sp : c0d01e14  ip : 00000000  fp : 00015600                                                                                              
[ 3641.760456] r10: 00000040  r9 : ef061518  r8 : ef061000                                                                                              
[ 3641.765748] r7 : 00000001  r6 : 02853a24  r5 : 00000000  r4 : 02853a24                                                                               
[ 3641.772341] r3 : 00000000  r2 : 00000c00  r1 : 00000066  r0 : ee00c680                                                                               
[ 3641.778939] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none                                                                        
[ 3641.786145] Control: 10c5387d  Table: 50d6006a  DAC: 00000051                                                                                        
[ 3641.791959] Process swapper/0 (pid: 0, stack limit = 0xc0d00210)                                                                                     
[ 3641.798033] Stack: (0xc0d01e14 to 0xc0d02000)                                                                                                        
[ 3641.802462] 1e00:                                              ee00c600 00000000 00000066                                                            
[ 3641.810759] 1e20: 00000001 ef061000 ef061518 c06353dc 00000000 ef6ac280 00000040 c0d080cc                                                            
[ 3641.819057] 1e40: ef061518 2ea4a000 c0b2dee8 c0c62b80 00000100 ef061518 c0635170 0005194e                                                            
[ 3641.827353] 1e60: 0000012c c0d02100 00000040 c0d01e90 2ea4a000 c076513c ef6acb80 c0c62b80                                                            
[ 3641.835650] 1e80: c0d49faf c0d03380 c0b2dee8 c0b31c0c c0d01e90 c0d01e90 c0d01e98 c0d01e98                                                            
[ 3641.843946] 1ea0: 00000001 00000000 00000003 c0d00000 c0d0208c c0d02080 00000100 c0d02080                                                            
[ 3641.852244] 1ec0: 40000003 c0124218 00000000 ef1e8000 c0d01ec8 c0d4ba40 0000000a 0005194d                                                            
[ 3641.860541] 1ee0: c0d02100 00200000 c0d0250c ffffe000 00000000 00000000 00000001 ef008000                                                            
[ 3641.868836] 1f00: f0803000 c0d49d5b c0d0250c c01245f0 c0c604bc c016b6e4 c0d1e368 c0d02848                                                            
[ 3641.877136] 1f20: f080200c c0d01f50 f0802000 c010148c c0108160 600e0013 ffffffff c0d01f84                                                            
[ 3641.885437] 1f40: c0c61578 c0d02504 c0d49d5b c010be14 00000001 00000000 00000000 c0119680                                                            
[ 3641.893737] 1f60: c0d00000 c0d0249c 00000000 00000000 c0c61578 c0d02504 c0d49d5b c0d0250c                                                            
[ 3641.902037] 1f80: 600e0013 c0d01fa0 c010815c c0108160 600e0013 ffffffff 00000051 00000000                                                            
[ 3641.910337] 1fa0: c0d00000 c015ca20 c0d01fb4 c0d01fa8 00000000 ffffffff 00000000 c0c00c7c                                                            
[ 3641.918637] 1fc0: ffffffff ffffffff 00000000 c0c00690 00000000 c0c48a30 c0d4b294 c0d02480                                                            
[ 3641.926934] 1fe0: c0c48a2c c0d079e4 4000406a 410fc075 00000000 4000807c 00000000 00000000                                                            
[ 3641.935267] [<c0562de0>] (dql_completed) from [<c06353dc>] (sun8i_emac_poll+0x26c/0x6a0)                                                             
[ 3641.943492] [<c06353dc>] (sun8i_emac_poll) from [<c076513c>] (net_rx_action+0x1f4/0x2e4)                                                             
[ 3641.951715] [<c076513c>] (net_rx_action) from [<c0124218>] (__do_softirq+0xfc/0x218)                                                                 
[ 3641.959589] [<c0124218>] (__do_softirq) from [<c01245f0>] (irq_exit+0xb8/0x118)                                                                      
[ 3641.967026] [<c01245f0>] (irq_exit) from [<c016b6e4>] (__handle_domain_irq+0x60/0xb4)                                                                
[ 3641.974982] [<c016b6e4>] (__handle_domain_irq) from [<c010148c>] (gic_handle_irq+0x48/0x8c)                                                          
[ 3641.983457] [<c010148c>] (gic_handle_irq) from [<c010be14>] (__irq_svc+0x54/0x70)                                                                    
[ 3641.991038] Exception stack(0xc0d01f50 to 0xc0d01f98)                                                                                                
[ 3641.996163] 1f40:                                     00000001 00000000 00000000 c0119680                                                            
[ 3642.004458] 1f60: c0d00000 c0d0249c 00000000 00000000 c0c61578 c0d02504 c0d49d5b c0d0250c                                                            
[ 3642.012747] 1f80: 600e0013 c0d01fa0 c010815c c0108160 600e0013 ffffffff                                                                              
[ 3642.019459] [<c010be14>] (__irq_svc) from [<c0108160>] (arch_cpu_idle+0x38/0x3c)                                                                     
[ 3642.026982] [<c0108160>] (arch_cpu_idle) from [<c015ca20>] (cpu_startup_entry+0x1b8/0x214)                                                           
[ 3642.035380] [<c015ca20>] (cpu_startup_entry) from [<c0c00c7c>] (start_kernel+0x390/0x39c)                                                            
[ 3642.043674] Code: e3520000 1a000002 e1a03005 eaffffdd (e7f001f2)                                                                                     
[ 3642.049843] ---[ end trace e15ecc4671d2fcf3 ]---                                                                                                     
[ 3642.054528] Kernel panic - not syncing: Fatal exception in interrupt                                                                                 
[ 3642.060989] CPU3: stopping                                                                                                                           
[ 3642.063818] CPU: 3 PID: 8407 Comm: unpigz Tainted: G      D W       4.7.3-sun8i #5                                                                   
[ 3642.071483] Hardware name: Allwinner sun8i Family                                                                                                    
[ 3642.076301] [<c010ea00>] (unwind_backtrace) from [<c010b350>] (show_stack+0x10/0x14)                                                                 
[ 3642.084172] [<c010b350>] (show_stack) from [<c05381e4>] (dump_stack+0x84/0x98)                                                                       
[ 3642.091519] [<c05381e4>] (dump_stack) from [<c010d888>] (handle_IPI+0x170/0x190)                                                                     
[ 3642.099035] [<c010d888>] (handle_IPI) from [<c01014cc>] (gic_handle_irq+0x88/0x8c)                                                                   
[ 3642.106722] [<c01014cc>] (gic_handle_irq) from [<c010c110>] (__irq_usr+0x50/0x80)                                                                    
[ 3642.114302] Exception stack(0xd0df1fb0 to 0xd0df1ff8)                                                                                                
[ 3642.119427] 1fa0:                                     0003e715 000000c6 00057212 0004727e                                                            
[ 3642.127721] 1fc0: 00000012 00055bcf 0005515d 00043d05 00098650 00098eb0 bee3d264 00098120                                                            
[ 3642.136011] 1fe0: 00055bd3 bee3d0f0 00000001 b6eee882 200e0030 ffffffff                                                                              
[ 3642.142696] CPU2: stopping                                                                                                                           
[ 3642.145492] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W       4.7.3-sun8i #5                                                                   
[ 3642.153157] Hardware name: Allwinner sun8i Family                                                                                                    
[ 3642.157956] [<c010ea00>] (unwind_backtrace) from [<c010b350>] (show_stack+0x10/0x14)                                                                 
[ 3642.165820] [<c010b350>] (show_stack) from [<c05381e4>] (dump_stack+0x84/0x98)                                                                       
[ 3642.173165] [<c05381e4>] (dump_stack) from [<c010d888>] (handle_IPI+0x170/0x190)                                                                     
[ 3642.180680] [<c010d888>] (handle_IPI) from [<c01014cc>] (gic_handle_irq+0x88/0x8c)                                                                   
[ 3642.188367] [<c01014cc>] (gic_handle_irq) from [<c010be14>] (__irq_svc+0x54/0x70)                                                                    
[ 3642.195946] Exception stack(0xef0a5f88 to 0xef0a5fd0)                                                                                                
[ 3642.201074] 5f80:                   00000001 00000000 00000000 c0119680 ef0a4000 c0d0249c                                                            
[ 3642.209368] 5fa0: 00000000 00000000 c0c61578 c0d02504 c0d49d5b c0d0250c e6a93acc ef0a5fd8                                                            
[ 3642.217651] 5fc0: c010815c c0108160 60030013 ffffffff                                                                                                
[ 3642.222793] [<c010be14>] (__irq_svc) from [<c0108160>] (arch_cpu_idle+0x38/0x3c)                                                                     
[ 3642.230313] [<c0108160>] (arch_cpu_idle) from [<c015ca20>] (cpu_startup_entry+0x1b8/0x214)                                                           
[ 3642.238697] [<c015ca20>] (cpu_startup_entry) from [<4010156c>] (0x4010156c)                                                                          
[ 3642.245730] CPU1: stopping                                                                                                                           
[ 3642.248526] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D W       4.7.3-sun8i #5                                                                   
[ 3642.256190] Hardware name: Allwinner sun8i Family                                                                                                    
[ 3642.260988] [<c010ea00>] (unwind_backtrace) from [<c010b350>] (show_stack+0x10/0x14)                                                                 
[ 3642.268854] [<c010b350>] (show_stack) from [<c05381e4>] (dump_stack+0x84/0x98)                                                                       
[ 3642.276199] [<c05381e4>] (dump_stack) from [<c010d888>] (handle_IPI+0x170/0x190)                                                                     
[ 3642.283713] [<c010d888>] (handle_IPI) from [<c01014cc>] (gic_handle_irq+0x88/0x8c)                                                                   
[ 3642.291400] [<c01014cc>] (gic_handle_irq) from [<c010be14>] (__irq_svc+0x54/0x70)                                                                    
[ 3642.298980] Exception stack(0xef0a3f88 to 0xef0a3fd0)                                                                                                
[ 3642.304109] 3f80:                   00000001 00000000 00000000 c0119680 ef0a2000 c0d0249c                                                            
[ 3642.312403] 3fa0: 00000000 00000000 c0c61578 c0d02504 c0d49d5b c0d0250c 00000000 ef0a3fd8                                                            
[ 3642.320686] 3fc0: c010815c c0108160 60070013 ffffffff                                                                                                
[ 3642.325826] [<c010be14>] (__irq_svc) from [<c0108160>] (arch_cpu_idle+0x38/0x3c)                                                                     
[ 3642.333345] [<c0108160>] (arch_cpu_idle) from [<c015ca20>] (cpu_startup_entry+0x1b8/0x214)                                                           
[ 3642.341726] [<c015ca20>] (cpu_startup_entry) from [<4010156c>] (0x4010156c)                                                                          
[ 3642.348759] Rebooting in 10 seconds..   


Ok, so I discovered and installed OPI-monitor.

 

When top/free look like the following, OPI-monitor's graph says available memory is ZERO!!!!! That would explain the OOM invocations.

 

              total        used        free      shared  buff/cache   available
Mem:        1026284      109608      613652       13444      303024      680576
Swap:       1097724           0     1097724
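As far as I know, free takes its "available" column from the MemAvailable line in /proc/meminfo, so reading that file directly is one way to cross-check which tool is right. A minimal sketch; the function name `mem_available_pct` is made up, and it assumes the kernel exports MemAvailable.

```shell
# Read /proc/meminfo-style text on stdin and print MemAvailable as a
# whole-number percentage of MemTotal (fraction truncated).
mem_available_pct() {
    awk '$1 == "MemTotal:"     { total = $2 }
         $1 == "MemAvailable:" { avail = $2 }
         END {
             if (total > 0) printf "%d\n", avail * 100 / total
             else           print "unknown"
         }'
}

# Usage:
#   mem_available_pct < /proc/meminfo
```

If this prints a healthy percentage while the graph still shows zero, the graph is probably reading something other than MemAvailable.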
 

 

How can this be? Any ideas? Fixes?
