Problem with large files


blue-v

Dear all,

 

I have a strange problem handling large files, and I have run out of ideas.

 

System:

Odroid C4 (running from SD-card)

OS: 5.7.15-meson64 #20.08 SMP PREEMPT

PRETTY_NAME="Debian GNU/Linux 10 (buster)"

 

Attached to the system are a few USB disks: 

3x 8TB, 1x 4TB, 1x 3TB, 1x 2TB 

All filesystems: ext4 (except the remote NAS filesystem, which is ZFS - see below) 

3 of these disks are encrypted via truecrypt 7.1

 

On the 3 TB disk (ext4, unencrypted) there is a zipped dd image of a hard disk.

That file is about 760 GB in size.

 

Now I tried to copy this file to a different location.

1st try: rsync to a NAS filer (ZFS, OmniOS, NFSv4) -> The rsync process goes to "D" state after a couple of GB

2nd try: scp to the same NAS -> same result. Copies a couple of gigabytes and then freezes

3rd to 5th try: mounted the NAS via NFSv4 and used rsync, scp and cp to the mountpoint -> same result

6th and 7th try: Tried to copy to a local (truecrypted) 8TB disk using rsync and cp -> same result

All these attempts stopped after transferring a couple of gigabytes, and the involved processes then showed "defunct" state.

The transferred size differs every time and is between 40 and 200 GB.
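
For anyone trying to catch the moment such a copy hangs, here is a minimal Python sketch, assuming a Linux /proc filesystem, that lists processes currently in "D" (uninterruptible sleep) or "Z" (zombie, which ps shows as "defunct") state. The function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
import os
from typing import List, Tuple


def parse_stat_state(stat_line: str) -> Tuple[int, str, str]:
    """Return (pid, command name, state letter) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the LAST closing parenthesis.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_processes_in_states(states=("D", "Z"), proc_root="/proc") -> List[Tuple[int, str, str]]:
    """Scan proc_root and return all processes whose state is in `states`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                item = parse_stat_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable while scanning
        if item[2] in states:
            found.append(item)
    return found
```

Calling find_processes_in_states() in a loop while the copy runs (for example once per second) would show whether rsync or cp is stuck in "D" or has become a zombie. It only reports the state; it does not say what the process is waiting on.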

 

What I tried to pinpoint the problem:

Tried to read the big file:

- md5sum <big_imagefile> -> works!

- dd if=<file> of=/dev/null bs=100M -> works!

Tried to write a big file:

- dd if=/dev/zero of=<file_on_8TB_truecrypted_disk> bs=100M count=10000 -> works!

Tried to copy using dd:

- dd if=<the_image_file> of=<file_on_truecrypted_disk> bs=100M -> works! Checksum is ok.

  So I finally managed to copy the file to a different location using dd.
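
The dd workaround above (copy in fixed 100 MiB blocks, then compare checksums) can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, MD5 is used because md5sum was used above, and it does not explain why cp, scp and rsync hang.

```python
import hashlib
import os

DEFAULT_BLOCK = 100 * 1024 * 1024  # same block size as dd bs=100M


def copy_in_blocks(src, dst, block_size=DEFAULT_BLOCK):
    """Copy src to dst block by block; return the MD5 hex digest of the data read."""
    digest = hashlib.md5()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(block_size)
            if not block:
                break  # end of file
            digest.update(block)
            fout.write(block)
        fout.flush()
        os.fsync(fout.fileno())  # push the data to the disk before returning
    return digest.hexdigest()


def md5_of_file(path, block_size=DEFAULT_BLOCK):
    """Return the MD5 hex digest of a file, read block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(src, dst, block_size=DEFAULT_BLOCK):
    """Copy src to dst and return True if the destination checksum matches."""
    source_md5 = copy_in_blocks(src, dst, block_size)
    return source_md5 == md5_of_file(dst, block_size)
```

Note that re-reading the destination right after writing may be served from the page cache, so for a real check it is safer to verify after remounting the disk or from another machine.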

 

But why do cp, scp and rsync fail?

 

Additional information:

During the process I ran several experiments with the swap space (disabled it, used a swap file on the SD card, etc.); nothing helped.

In the log files there is not the smallest hint of a problem.

Also, watching memory usage during the copy with the "free" command did not show any unusual state.

 

System limits:

root@odroidc4:/var# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11363
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11363
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
 

No overclocking or anything like that. Just a plain Armbian installation. No funny "tuning".

 

I also have the same problem with some other applications.

For example rmlint, for eliminating duplicates, runs for a while (30 minutes?) and then freezes.

After that I connected the USB disk to a different Linux system and rmlint worked fine.

 

Fun fact: I'm pretty sure I originally copied the large file from a "full size" Linux system to the Odroid via rsync a couple of months ago.

 

Any ideas what the problem could be? 

 

Thank you!

 

  Lothar

 


27 minutes ago, blue-v said:

Just a plain Armbian installation. No funny "tuning".

 

27 minutes ago, blue-v said:

OS: 5.7.15-meson64 #20.08 SMP PREEMPT


An almost 3-year-old build is way out of any "warranty" ;) Did you try the same with the latest image?


No, I did not (yet).

But even on an "old" system, copying large files should not be a problem, I think.

 

Or asked the other way round: Is there a known fix in the newer versions that addresses this problem?

 

Lothar


1 hour ago, blue-v said:

Or asked the other way round: Is there a known fix in the newer versions that addresses this problem?

This question is irrelevant.  No one really knows the answer.

 

1 hour ago, Igor said:

Did you try the same with latest image?

What Igor is really saying is that if you want anyone to spend their time helping with this, you need to reproduce it on current code.  If this were a common problem, it would already be fixed, as many others would have reported something like it.  So it is either something due to your older code or something specific to your environment.  Those are difficult to diagnose, and when you are asking people to volunteer their time to help you, they are only going to do that if it is reasonable to do and you have done everything to narrow the scope of the problem to something reproducible by someone else.

 

My first thought, given the limited information provided, is a common cause of mysterious errors: power issues.  How is all of this hardware powered?  SBCs are notorious for having poor power supplies, and under load they suffer voltage drops that cause mysterious problems (especially with USB devices drawing some of that power).


Dear SteeMan,

thank you for your suggestion. 

I'm aware of the difficulty of reproducing a problem on an old OS, and I will upgrade and check as soon as possible.

My hope was that someone has had a similar effect and could provide some hints.

 

The power topic is a nice idea! Just what I was looking for: A new idea.

I'll check. The system is not in reach for me at the moment, but it will be in a few hours. 

 

As soon as there are new conclusions, I'll report.

 

Thank you!

 

  Lothar


42 minutes ago, blue-v said:

My hope was that someone has had a similar effect and could provide some hints.


That is a legitimate hope, but even with the latest and greatest OS it's hard enough.

Probably nobody on the planet is running the kernel you run. Next: Armbian is installed on a certain % of Odroid devices. A small % of their users understand what you are talking about, a very small % of those have time to listen, a few might answer (with a general hint as such), and a terribly small number can actually give a usable hint (in decent time) or, which is close to impossible, leave everything and dedicate a whole week to resolving this problem. For a kernel that is completely outdated, not 99 but 100% of people will look the other way.

 

I hope the problem you are describing was fixed a long time ago. If not, remember there are a lot of bugs, wishes and ideas, and very few people.

Good luck!


Igor, you lost that bet - I know about at least one other installation. Don't ask me how I know 🙂  Anyway.

I now upgraded to bullseye 6.1.11-meson64 #23.02.2 SMP PREEMPT.

 

And the problem...

 

 

 

... seems to be gone!

 

The C4 copied the large file to the NAS using rsync via NFS on the first attempt.

I will do some more testing over the next few days,

including a run of rmlint, which never worked before on big directories.

 

Thank you for your help!

 

  Lothar


11 hours ago, blue-v said:

I now upgraded to bullseye 6.1.11-meson64 #23.02.2 SMP PREEMPT.

 

And the problem...

... seems to be gone!


Yes, this is exactly what we hope you do first, before even thinking about complaining.


In several years many problems get fixed, so going down this path once again, just for you, because you don't run the latest kernel ... is plain stupid. I know you agree. We are also pre-programmed: if you don't run the latest kernel, we ignore that user / report, as we are already way too small to address issues on the latest kernel and there is nothing we can do about it.

 

We have to at least try to prevent users from wasting time in such a grotesque manner. That's the point of "update first!" ;)
