Matthias Posted January 6, 2020 Edit: Issue is solved, see here.

It's an application problem, but after an extensive Google search this does not seem to be a common Samba problem, so it might be an integration problem with the environment Samba is running in.

The setup: Armbian 19.11.6 (Debian Buster) running the current kernel on a NanoPi M4V2, offering network shares via Samba. The clients are Windows 10 and a BananaPi that also runs Armbian (Debian Buster, 4.19.y kernel).

The observation: When downloading large files from the Samba share to Windows 10, the download speed very often drops suddenly to zero and stays there indefinitely. This is best observed with files of several GB, as it usually takes some seconds until the problem appears. There seems to be a correlation with the download speed: over Ethernet it happens more often than over slower WiFi. I also increased the Samba log level, but I could not detect anything special in the Samba logs at the moment the rate drops, and there are no significant lines in /var/log/messages or /var/log/syslog. The transfer just stops. The same behavior can be observed when mounting the share on the BananaPi and copying large files.

My first guess was a generally unstable network connection, as some fixes in this area have been committed to Armbian recently. But when stressing the network via iperf3, I could not detect any variation in the transfer speed in either direction.

When writing data to the Samba shares I do not observe any problems. I was able to upload about 30GB without any significant drop in speed.

I also took the working smb.conf from my BananaPi, which serves files flawlessly via Samba, adapted it and ran tests, but I observed the same behavior. So I assume there is a problem with the way Samba accesses the files or uses the network stack in the Armbian environment. But as this is only speculation and I did not find any further indicators for the source of the problem, I hope that somebody else has stumbled over the same thing and maybe already has a solution for it.

If somebody would like to reproduce it: Take a fresh install, install Samba, and add a share for a directory on the SD card in smb.conf (no further changes in the config file). Then copy a file of about 3GB to the share and try to download it again. You will be impressed by the transfer speed of more than 100MB/s (as long as the file is cached in RAM on the NanoPi M4V2), but then the transfer suddenly stops.
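The reproduction steps above could be scripted roughly as follows. This is only a sketch under assumptions: the share name `smbtest`, the directory `/srv/smbtest` and both helper function names are made up for illustration and do not come from the thread, and `bs=1M` assumes GNU dd.

```shell
#!/bin/sh
# Hypothetical share directory; the thread only says "a directory on the SD card".
SHARE_DIR="${SHARE_DIR:-/srv/smbtest}"

# Print a minimal share definition for the given directory.
# The share name "smbtest" is an assumption, not taken from the thread.
share_stanza() {
    cat <<EOF
[smbtest]
   path = $1
   read only = no
EOF
}

# Create a file of the given size (in MiB) filled with random data.
make_test_file() {
    dd if=/dev/urandom of="$1" bs=1M count="$2" 2>/dev/null
}

# Possible use (commented out so that nothing runs by accident):
#   On the board:
#     sudo apt install samba
#     sudo mkdir -p "$SHARE_DIR"
#     share_stanza "$SHARE_DIR" | sudo tee -a /etc/samba/smb.conf
#     sudo systemctl restart smbd
#   On a Linux client:
#     make_test_file ./test.bin 3072   # about 3GB, as in the post
#   Then copy test.bin to the share and download it again.
```

Access to the share still needs a Samba user (for example one added with `smbpasswd -a`). For the network baseline mentioned in the post, iperf3 is usually run as `iperf3 -s` on one machine and `iperf3 -c <host>` on the other, with `-R` added to test the reverse direction.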
Igor Posted January 7, 2020 Strange. This has to be diagnosed, but I doubt there is something generally wrong with Armbian, at least in the areas where we are doing something. My Helios4 storage runs at full speed in both directions all of the time (but using the NFS protocol), regardless of the file size (except that tiny files do not reach full speed due to slow mechanical disks). It also never stops. We observed some low-level troubles on the NanoPi M4V2, but they have been fixed. There could still be more troubles out there to be found.
Matthias (Author) Posted January 7, 2020 I can check NFS, that's a good idea. Then I get a better impression of how strongly the issue is related to Samba. If Samba works on one board family and not on another, the only difference relevant to Samba I can think of is the kernel. Or did I miss something?
Matthias (Author) Posted January 8, 2020 Checked NFS: It works fine. I was not able to observe any drops in transfer speed like those mentioned above.
piter75 Posted January 9, 2020 @Matthias can you repeat your test with Samba after applying the following command:

sudo ethtool -K eth0 tx off rx off

I remember discovering a dying network connection on my Rock Pi 4 and finding this issue: https://github.com/MichaIng/DietPi/issues/2028 After that I started adding the above command to a script in /etc/network/if-up.d/ on every install with Ansible and had no issues anymore.

Today I tested a Samba share on the NanoPi M4V2 with a clean image (kernel 5.4.8). I successfully sent an 8GiB file to the share, but the transfer died after 1.7GiB when downloading from the share. I disabled offloading with the above command and successfully downloaded the whole file.
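A hook script of the kind described above might look like the following. This is a minimal sketch, not piter75's actual script: the file name `/etc/network/if-up.d/disable-offload`, the `TARGET_IFACE` and `DRY_RUN` variables and the function name are assumptions. It relies on ifupdown passing the interface name in `IFACE`, and on `tx` and `rx` being ethtool's short names for TX and RX checksum offload.

```shell
#!/bin/sh
# Hypothetical hook: /etc/network/if-up.d/disable-offload (name is an assumption).
# Scripts in this directory are called with the interface name in $IFACE.

# Interface to treat; eth0 is the name used in the thread.
TARGET_IFACE="${TARGET_IFACE:-eth0}"

# Disable checksum offloading when the target interface comes up.
# With DRY_RUN=1 the command is printed instead of executed.
run_hook() {
    # Ignore every other interface (lo, wlan0, ...) and an unset IFACE.
    [ "${IFACE:-}" = "$TARGET_IFACE" ] || return 0
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ethtool -K $IFACE tx off rx off"
    else
        ethtool -K "$IFACE" tx off rx off
    fi
}

run_hook
```

The file has to be executable (`chmod +x`) to be picked up. Running it by hand with `IFACE=eth0 DRY_RUN=1` shows what it would do without touching the interface.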
Matthias (Author) Posted January 10, 2020 18 hours ago, piter75 said: @Matthias can you repeat your test with samba after applying the following command: I remember discovering dying network connection on my Rock Pi 4 and finding this issue: https://github.com/MichaIng/DietPi/issues/2028 After that I started adding the above command to a script in /etc/network/if-up.d/ on every install with ansible and had no issues anymore. Today I tested samba share on Nano Pi M4V2 with clean image (kernel 5.4.8). I successfully sent 8GiB file to a share but the transfer died after 1.7GiB when downloading from the share. I disabled offloading with the above command and successfully downloaded the whole file.

That's good news, thank you for that hint, Piter. And it is good to see that my issue is reproducible on more than one board. I also read about that a couple of weeks ago. But as that was the time when we were all struggling with the stability of the networking, it had no effect and I dropped the idea. I can give it another try this evening on the current state of the networking. Hopefully it has the effect I'm looking for.
Matthias (Author) Posted January 10, 2020 Ok @piter75, looks like things are a little more complicated.

First of all I must apologise: My "fresh install" had at least one customisation I did not mention: the kernel had support for ZFS (via zfs-dkms). With this installation, Piter's use of ethtool does not bring any improvement. Then I remembered something: ZFS has built-in support for SMB and NFS sharing. I don't know why they mixed that in, but it seems to be a relic from Solaris. So just to make sure, I reinstalled the system without ZFS support.

The following observations were made using the "ZFS-less" system connected to a Windows 10 PC via gigabit Ethernet:

Copying a lot of data via SMB to and from the share hosted on the NanoPi without "sudo ethtool -K eth0 tx off rx off": After some gigabytes the transfer speed drops to zero and the NanoPi becomes unresponsive. It looked like the green LED changed its flashing pattern so that it was flashing faster, but I might be wrong.

Doing the same with "sudo ethtool -K eth0 tx off rx off": I successfully copied about 20GB to and from the NanoPi. No problems detected.

So I think this change brings an improvement. And combining Samba and ZFS does not seem to be a good idea, for whatever reason, at least not for the versions I used (kernel 5.4.10, Samba 4.9.5, zfs-dkms 8.2.3).

Shall we add the call of ethtool to the build process of Armbian images for the NanoPi M4V2?
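When comparing runs with and without the ethtool call, it can help to confirm the offload state before each transfer. The sketch below is an assumption-laden helper, not something used in the thread: both function names are made up, and it assumes the usual `ethtool -k` output in which each feature appears as a `name: on|off` line.

```shell
#!/bin/sh
# Read `ethtool -k <iface>` output on stdin and print the state ("on" or "off")
# of the feature named in $1. Assumes one "name: state" pair per line.
feature_state() {
    awk -v f="$1:" '$1 == f { print $2; exit }'
}

# Succeed only if both RX and TX checksum offload are reported as off.
checksum_offload_disabled() {
    _features=$(cat)
    [ "$(printf '%s\n' "$_features" | feature_state rx-checksumming)" = "off" ] &&
    [ "$(printf '%s\n' "$_features" | feature_state tx-checksumming)" = "off" ]
}

# Possible use on the board (commented out so that nothing runs by accident):
#   if ethtool -k eth0 | checksum_offload_disabled; then
#       echo "checksum offload is disabled on eth0"
#   fi
```

Note the lower-case `-k`, which only queries the settings, in contrast to the upper-case `-K` used to change them.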
Matthias (Author) Posted January 12, 2020 I need to retract my theory about ZFS interfering with Samba and causing trouble. I cannot reproduce it anymore. For testing with NFS I had installed the NFS client for Windows 10, which seemed to interfere a lot with SMB. I don't know why, but after uninstalling it everything was fine and I was able to download and upload large amounts of data, with offloading disabled, of course. This is true for transferring data from and to the SD card.

When using a hard drive connected to the SATA hat I get stalled network connections, broken file systems (ZFS and BTRFS two-disk mirrors that show uncorrectable errors without any forced reboot) and kernel panics. But that seems to be a different story (broken hardware? temperature?) and does not have a direct connection to Samba or the network connection.

Long story short: I consider the Samba problem solved and need to continue examining my hardware setup.
piter75 Posted January 12, 2020 On 1/10/2020 at 11:16 PM, Matthias said: Shall we add the call of ethtool to the build process of Armbian images for the NanoPi M4V2?

I have taken another route with this PR. I shortened the TX programmable burst length (PBL) on all rk3399 boards, as all of them are plagued by the same issue. With this change the transfers should be stable with the most commonly used MTU of 1500, while hardware offloading can stay enabled. Higher MTUs would require further shortening of the PBL.

18 hours ago, Matthias said: When using a hard drive connected to the SATA hat I get stopping network connections, broken file systems (ZFS and BTRFS two-disk-mirrors that show uncorrectable errors without any forced reboot) and kernel panics.

I must admit that I suspect some timing issues with RAM on the M4V2, as I also sometimes (not often) experience kernel panics with my dev unit. The other one, a server with an NVMe hat, is running rock solid, but it has a different u-boot configuration. I will definitely look into this issue.
Matthias (Author) Posted January 12, 2020 1 hour ago, piter75 said: I must admit that I suspect some timing issues with RAM on M4V2 as I also, sometimes - not often, experience kernel panics with my dev unit. The other one that is a server with NVMe hat is running solid stable but it has a different u-boot configuration. I will definitely look into this issue.

That would explain some things. For example: during scrubbing, a mirrored BTRFS reported broken CRCs for the same block of data on both disks. I observed that more than once within 40GB of data. The chances of this happening by accident are of course very low. But if the data gets corrupted at the source (RAM), this could easily happen. Configuring the RAM timing is way beyond my knowledge, so I'll stop here and wait for any improvements. As always: if there is something you would like to get tested, don't hesitate to contact me.