software raid on OPI PC +


nedoskiv


Hello

 

I was wondering how to make software RAID 1 possible, using the integrated eMMC and another card placed in the slot.

So far I have used it on Slackware x86 without any issues, but there I tell LILO to boot from /dev/md0, for example.

I wonder where I should make adjustments here (in Armbian). Can someone point me in the right direction?


Can someone point me in the right direction?

 

Sure: Forget about RAID 1. :)

 

RAID 1 is only about availability ('business continuity'), and if you want to achieve that with the cheapest SBC possible you have already chosen the wrong device/approach. Also, HDDs are different from flash-based media, and most RAID 1 implementations simply suck (they hold every piece of data twice and aren't able to figure out which copy is good and which one is bad).


Sure: Forget about RAID 1. :)

 

RAID 1 is only about availability ('business continuity'), and if you want to achieve that with the cheapest SBC possible you have already chosen the wrong device/approach. Also, HDDs are different from flash-based media, and most RAID 1 implementations simply suck (they hold every piece of data twice and aren't able to figure out which copy is good and which one is bad).

 

I get your point, but I'm going to dig into it further. If you know where I can tell Armbian which partition to boot from, that would be very helpful.


I get your point, but I'm going to dig into it further. If you know where I can tell Armbian which partition to boot from, that would be very helpful.

  • Here are two threads that deal with extensive performance testing of flash-based media: official one, collecting results. The eMMC on those Oranges is magnitudes faster than the average SD card. By using RAID 1 you will ruin performance for absolutely no reason.
  • Flash-based media will die sooner or later (as will all hardware). The symptoms don't resemble those of a dying HDD, you can't run 'smartctl -a /dev/sda', and /dev/md0 is absolutely useless until you combine it with a good fs like ZFS or btrfs that implements end-to-end data integrity.
  • If you start to care about data integrity and want to do it correctly, you can also use a btrfs/zfs mirror. It's way better than mdraid anyway and able to actually use the redundancy (mdraid can't do that; it just reports a failing HDD, which doesn't apply to SBCs since there are no HDDs involved!). With a zfs/btrfs mirror, checksums are written and read, so data corruption can not only be detected but also corrected.
  • Since btrfs lives inside the kernel, it's not advised to use it with the legacy kernel (3.4.x is horribly outdated here). With Ubuntu Xenial it should be possible to use 'native' zfs, but I have failed to get it up and running so far (just one single try yesterday, and this was on an installation where the SD card started to fail -- yeah, I ran into exactly this just recently).
  • So by using the mainline kernel, all you would have to do is add another device to a btrfs mirror and get fault-tolerant RAID 1 afterwards (no idea about zfs yet).
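As a rough sketch of that last bullet (not a tested recipe): on a mainline kernel, turning a single-device btrfs filesystem into a mirror usually comes down to `btrfs device add` followed by a balance that converts data and metadata to the raid1 profile. The helper name `btrfs_mirror_plan`, the mount point and the partition name are assumptions made up for illustration. The function only prints the commands so they can be reviewed before anyone runs them as root.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands that would turn an existing
# single-device btrfs filesystem into a two-device RAID 1 mirror.
# btrfs_mirror_plan is a hypothetical helper; $1 is the mount point,
# $2 the partition to add (both are placeholders you must adapt).
btrfs_mirror_plan() {
    mnt=$1
    dev=$2
    # Step 1: add the second device to the mounted filesystem.
    echo "btrfs device add $dev $mnt"
    # Step 2: rewrite data (-d) and metadata (-m) with the raid1 profile.
    echo "btrfs balance start -dconvert=raid1 -mconvert=raid1 $mnt"
    # Step 3: show which profiles are in use afterwards.
    echo "btrfs filesystem df $mnt"
}

# Example: review the plan for adding an SD card partition to the root fs.
btrfs_mirror_plan / /dev/mmcblk0p2
```

Printing instead of executing is a deliberate choice here: a balance rewrites every block, so it is worth reading the plan twice on flash media.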

But the whole approach just ignores what's important, decreasing performance and gaining nothing. Such an SBC is a single point of failure on its own. If I wanted to compensate for that, I would buy a second SBC and write a script that uses btrfs/zfs snapshots and transfers them periodically to the 2nd device, so that in case of a failure the 2nd device can take over. This needs testing, since the most important part is booting the device (no LILO, no GRUB, we're talking about u-boot here).
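A minimal sketch of what one round of such a script could look like with btrfs, assuming the data lives on a btrfs subvolume and the second board is reachable over ssh. `snapshot_send_plan`, the snapshot directory, the host name and the paths are all hypothetical. The function prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the commands for one snapshot-and-transfer round.
# snapshot_send_plan is a hypothetical helper. Arguments:
#   $1 subvolume to snapshot      $2 directory holding snapshots
#   $3 name of the new snapshot   $4 ssh target of the second board
#   $5 receive directory there    $6 optional parent snapshot name
snapshot_send_plan() {
    subvol=$1; snapdir=$2; name=$3; remote=$4; dest=$5; parent=${6:-}
    # btrfs send only accepts read-only snapshots, hence -r.
    echo "btrfs subvolume snapshot -r $subvol $snapdir/$name"
    if [ -n "$parent" ]; then
        # Incremental transfer: only the difference to the parent is sent.
        echo "btrfs send -p $snapdir/$parent $snapdir/$name | ssh $remote btrfs receive $dest"
    else
        # First run: send the full snapshot.
        echo "btrfs send $snapdir/$name | ssh $remote btrfs receive $dest"
    fi
}

# Example: second round, incremental against the previous snapshot.
snapshot_send_plan / /snapshots root-2 backup@opi2 /backup root-1
```

The snapshot directory has to be on the same btrfs filesystem as the subvolume, and the parent snapshot must already exist on the receiving side for the incremental form to work.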

 

Edit: The last sentence above is the most important one. It's all about booting: with a RAID 1 made of eMMC and SD card, the primary bootloader has to stay on the SD card (since it has the higher boot priority). To make this somehow fault-tolerant, you would need to set up everything on the SD card, then copy/clone the card to the eMMC and adjust the u-boot configuration there to use the local eMMC rootfs partition (different UUID). And there's a reason no more information is available on this: the whole attempt doesn't make that much sense. :)
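To make "adjust the u-boot configuration" a bit more concrete, here is a hedged sketch of swapping the root= argument in the text of a boot.cmd. `set_root_arg` is a hypothetical helper, and the sample bootargs line in the example is made up; a real boot.cmd may look different. The function works on stdin/stdout only, so nothing under /boot is modified by it.

```shell
#!/bin/sh
# Sketch: rewrite the root= kernel argument in boot.cmd text.
# set_root_arg is a hypothetical helper; it reads boot.cmd text on stdin
# and prints the modified text, so the original file is never touched.
# Assumption: the new value contains no '#' or '&' (true for UUID=...
# and /dev/... values), because '#' is used as the sed delimiter.
set_root_arg() {
    new_root=$1
    # Replace everything after "root=" up to the next space or quote.
    sed "s#root=[^ \"]*#root=$new_root#"
}

# Example with a made-up bootargs line:
printf 'setenv bootargs "console=tty1 root=UUID=1234-abcd rootwait"\n' |
    set_root_arg /dev/md0

# After writing the result back to boot.cmd, the thread's mkimage call
# regenerates the script that u-boot actually reads:
#   mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr
```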


Thanks for your reply

 

For me it is important to have some kind of physical data storage security. I have never checked the health of my RAID 1 disks; I just check whether one of them has been kicked out of the array, and if so, I replace it. On the Orange Pi it will be impossible for me to replace the integrated eMMC, but the fact that one MMC is kicked out of the RAID array will trigger my attention to save the data on it. You are also right about the decreased performance, but that is a small price for me to pay for data security.

 

So far I'm stuck due to lack of knowledge. To start, I decided to run a test: I looked into the boot directory and found boot.cmd and boot.scr; inside them a variable is set that defines the rootfs.

 

I decided to convert the existing partition to a RAID array and try to boot it. The first step was to install mdadm. After it was installed, something drew my attention:

 

 

Unpacking mdadm (3.3.2-5+deb8u1) ...

Processing triggers for systemd (215-17+deb8u4) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up mdadm (3.3.2-5+deb8u1) ...
update-initramfs: deferring update (trigger activated)
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Processing triggers for initramfs-tools (0.120+deb8u2) ...
update-initramfs: /boot/initrd.img-3.4.112-sun8i has been altered.
update-initramfs: Cannot update. Override with -t option.
 
 
............
 
 
When I saw it messing with the initrd, I decided to reboot the system to see how it works, and it failed to boot.
 
When I restored /boot/initrd.img-3.4.112-sun8i, it booted successfully.
Does anyone have an idea what happened?
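The thread never establishes why the regenerated initrd failed to boot, so the following only sketches the workaround described above: keep a copy of the boot file before a package install touches it, so it can be restored. `backup_file` and the `.orig` suffix are hypothetical choices, not an Armbian convention.

```shell
#!/bin/sh
# Sketch: keep a copy of a boot file before a package touches it.
# backup_file is a hypothetical helper; it copies "$1" to "$1.orig"
# unless that backup already exists, and prints the backup path.
backup_file() {
    src=$1
    bak="$src.orig"
    if [ ! -e "$bak" ]; then
        # -p preserves mode and timestamps of the original file.
        cp -p "$src" "$bak" || return 1
    fi
    echo "$bak"
}

# Usage on the board (as root), not executed here:
#   backup_file /boot/initrd.img-3.4.112-sun8i
#   apt-get install mdadm
# Restoring is then a plain copy back:
#   cp -p /boot/initrd.img-3.4.112-sun8i.orig /boot/initrd.img-3.4.112-sun8i
```

Refusing to overwrite an existing backup means a second run cannot replace the known-good copy with an already altered file.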

the fact that one MMC is kicked out of the RAID array will trigger my attention to save the data on it.

 

Nope, that's not how you experience data corruption/loss with flash storage. It's silent data corruption, and 'good old' mdraid is simply the wrong approach since it's a 100 percent waste/useless.

 

On flash storage you'll experience silent data corruption and using mdraid will prevent you from taking notice. That's all you get. You need end to end data integrity. Only then added redundancy becomes useful.

 

The traditional mdraid approach with RAID 1 is to store everything on two devices, not care about data integrity, and read the data back from either one or the other device. If data corruption has occurred, you'll only know when it's already too late.

 

With a zfs or btrfs mirror, data will be written to two devices (or more if you like, but then behaviour differs a lot between both implementations), and metadata (checksums to detect data corruption) will also be written to both devices. Then you set up a regular 'scrub' task that reads data and metadata from both devices. As soon as a single block on either of the two devices is not clean, zfs/btrfs marks the block as bad and creates a new safe block. It's able to detect and repair data corruption in an automated fashion, and you also get warned when a device starts to fail.
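For reference, a small sketch of what such a scrub task would invoke. `scrub_command` is a hypothetical helper that only prints the command for the given filesystem type; the mount point and the pool name in the example are placeholders.

```shell
#!/bin/sh
# Sketch: print the scrub command for a given filesystem type.
# scrub_command is a hypothetical helper; $1 is "btrfs" or "zfs",
# $2 is a mount point (btrfs) or a pool name (zfs).
scrub_command() {
    case $1 in
        # -B keeps btrfs scrub in the foreground until it has finished.
        btrfs) echo "btrfs scrub start -B $2" ;;
        zfs)   echo "zpool scrub $2" ;;
        *)     echo "unsupported filesystem: $1" >&2; return 1 ;;
    esac
}

# Example: what a periodic job would run for a btrfs root and a zfs pool.
scrub_command btrfs /
scrub_command zfs tank
```

The results can be inspected afterwards with `btrfs scrub status /` or `zpool status`; how often to run a scrub on flash media is a judgment call the thread does not settle.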

 

Using mdraid with RAID 1 is 100 percent useless in this situation. Using RAID when you're after 'data security' is also absolutely wrong. It's just a waste of resources. Better check snapshot features and implement some sort of backup strategy.


 

But the whole approach just ignores what's important, decreasing performance and gaining nothing. Such an SBC is a single point of failure on its own. If I wanted to compensate for that, I would buy a second SBC and write a script that uses btrfs/zfs snapshots and transfers them periodically to the 2nd device, so that in case of a failure the 2nd device can take over.

 

I can only approve this statement.

In fact I have been using RAID 0 on my main PC for about 10 years.

It does improve performance.

It doesn't seem to double the failure rate, because each drive only reads and writes half of the data.

The only failure I had was from an add-on SATA controller card.

Luckily I was able to find the same one, but that important point is totally ignored in RAID reliability discussions.

Another important point is that the RAID should be recognised by the various operating systems you plan to use.

10 years ago Mandriva was the only one :)


OK, I made it work and will describe it now (how some of it works is still unclear to me).

To follow along you need to have set up a software RAID 1 before; I won't explain how to do that in general.

To make it clear: when I say eMMC, I mean the integrated card; when I say SD card, I mean the external memory card.

 

Let's name this INCOMPLETE HOWTO - RAID 1 on Orange Pi PC+

 

I'm using an 8GB SD card, and the integrated eMMC is also 8GB.

 

  • The first thing is to download the Armbian image and copy it to your external SD card.
  • After that you can configure your network interfaces etc. and install mdadm. When it asks which arrays to search for, enter "all" (it is the default).
  • When you are done with that, use nand-sata-install to install the whole distribution to your eMMC.
  • While your system is running from the SD card, mount the eMMC and open the /boot/boot.cmd on it; there is one "setenv bootarg" in it. Inside it is root=UUID........ Copy that, then open boot.cmd on your SD card and replace the value there. This makes the OPi boot from the SD card but mount the eMMC as the root file system. Do not forget, after editing boot.cmd, to execute: mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr
  • Reboot; you should be running from the eMMC now.
  • Run fdisk on /dev/mmcblk0 (this should be the SD card). Change the partition type from Linux native to Linux raid autodetect (code FD).
  • For RAID 1 to work, make sure the SD card partition is smaller than or equal in size to the partition on the eMMC. This is the place to correct that problem: use fdisk to shrink it if needed (google it to find out how).
  • Let's create the array: mdadm --create /dev/md0 --level 1 --metadata=0.90 --raid-devices=2 /dev/mmcblk0p1 missing
  • Execute and fix errors: e2fsck /dev/md0
  • Execute: resize2fs /dev/md0
  • Execute and fix errors: e2fsck /dev/md0
  • You should be able to mount /dev/md0 now. Do it and edit boot/boot.cmd: replace root=UUID ....... with root=/dev/md0 (do not forget to execute mkimage with the proper paths).
  • Edit etc/fstab (replace /dev/mmcbl........ with /dev/md0).
  • Reboot.
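The size requirement in the list above can be checked before creating the array. This is a minimal sketch: `fits_in_mirror` is a hypothetical helper that compares two sizes in bytes, and the device names in the usage comment follow the HOWTO (SD card as /dev/mmcblk0p1, eMMC as /dev/mmcblk1p1), which may differ on another setup.

```shell
#!/bin/sh
# Sketch: check that the partition the array is created on (SD card)
# is not larger than the partition that will be added later (eMMC).
# fits_in_mirror is a hypothetical helper; both arguments are sizes in bytes.
fits_in_mirror() {
    first_bytes=$1
    second_bytes=$2
    # Succeeds (exit 0) when the first partition fits into the second.
    [ "$first_bytes" -le "$second_bytes" ]
}

# Usage on the board (as root), not executed here:
#   sd=$(blockdev --getsize64 /dev/mmcblk0p1)
#   emmc=$(blockdev --getsize64 /dev/mmcblk1p1)
#   fits_in_mirror "$sd" "$emmc" || echo "shrink the SD card partition first"
```

The order matters because the array is first created on the SD card partition alone; a device added later has to be at least as large as the array.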

You should now be running from the SD card, using RAID 1 as rootfs. Check it with the mount command and look for a line like this:

 

/dev/md0 on / type ext4 (rw,noatime,nodiratime,errors=remount-ro,commit=600)
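Instead of reading the mount output by eye, the root device can be picked out of it. A small sketch, assuming the classic "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)" output format shown above; `root_device` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: extract the device mounted on / from `mount` output.
# root_device is a hypothetical helper reading the output on stdin.
root_device() {
    # Fields of a mount line: DEVICE on MOUNTPOINT type FSTYPE (OPTIONS).
    # If / appears more than once (e.g. an early "rootfs" entry), the
    # last entry wins, since it is the one mounted on top.
    awk '$2 == "on" && $3 == "/" { dev = $1 } END { if (dev != "") print dev }'
}

# Usage on the board, not executed here:
#   [ "$(mount | root_device)" = "/dev/md0" ] && echo "rootfs is on the array"
```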

 

Now is the time to run fdisk and change the eMMC partition type to Linux raid autodetect. After that:

 

mdadm --manage /dev/md0 --add /dev/mmcblk1p1

 

Now the array needs to resync; you can view the progress with:

cat /proc/mdstat
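Since the plan is to notice when a member gets kicked out of the array, the same file can be checked by a script. A hedged sketch: `md_degraded` is a hypothetical helper that looks for the member status field in /proc/mdstat text, which reads [UU] for a healthy two-device mirror and [U_] or [_U] when one member is missing.

```shell
#!/bin/sh
# Sketch: detect a degraded array from /proc/mdstat text on stdin.
# md_degraded is a hypothetical helper; it exits 0 when at least one
# array reports a missing member, and 1 otherwise.
md_degraded() {
    # Match a bracket that holds only U/_ characters and at least one '_'.
    # Resync progress bars like [=>......] do not match this pattern.
    grep -Eq '\[[U_]*_[U_]*\]'
}

# Usage on the board, not executed here:
#   md_degraded < /proc/mdstat && echo "an array is running degraded"
```

Note that right after creating the array with "missing", and until the eMMC partition has been added, this check reports a degraded array by design.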

 

-------------------------- bug ----------------------------

This is the place to mention a bug in Armbian on the Orange Pi PC Plus.

I opened a new topic for it, but keep in mind that when you reboot the system properly (using the reboot command, for example), it will resync the array on startup. That does not happen if you unplug the power, for example.

https://forum.armbian.com/index.php/topic/2945-boot-bug-on-opi-pc-plus/


Hmm... again: mdraid with raid 1 and flash media is a really 'great' idea since it can NOT cope with what will happen: Silent data corruption AKA bit rot. Everything explained in detail: https://raid.wiki.kernel.org/index.php/Detecting,_querying_and_testing#Simulating_data_corruption
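What mdraid does offer is a consistency check that counts blocks where the two copies differ; on RAID 1 it cannot tell which copy is the correct one. A sketch of reading that counter, assuming the standard md sysfs layout; `md_mismatch_count` and its optional sysfs-root argument (only there to make the function testable) are hypothetical.

```shell
#!/bin/sh
# Sketch: read the mismatch counter that an md "check" run leaves behind.
# md_mismatch_count is a hypothetical helper; $1 is the array name
# (e.g. md0), $2 an optional sysfs root (defaults to /sys).
md_mismatch_count() {
    md=$1
    sysfs=${2:-/sys}
    cat "$sysfs/block/$md/md/mismatch_cnt"
}

# Usage on the board (as root), not executed here:
#   echo check > /sys/block/md0/md/sync_action   # start a check
#   cat /proc/mdstat                             # wait until it is done
#   md_mismatch_count md0                        # non-zero: copies differ
```

A non-zero value only says that the copies disagree somewhere, which is exactly the limitation described in the post above.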

 

Neither eMMC nor SD cards will 'fail' in the way we know from spinning rust (HDD), it's different and needs different strategies.

 

With mdraid bit rot could only be fought using RAID levels 5 or 6 (and those shouldn't be used with those dorky legacy kernels we unfortunately have to use on most of our boards since mdraid code contained a bug for almost 5 years that remained undetected and was fixed just recently, so maybe not in any of the 3.4 or 3.10 kernels we use with legacy OS images).

 

Anyone thinking about RAID 1 on Armbian devices: Use at least a zfs or btrfs mirror, don't rely on mdraid, the latter is a 100% fail here.


That is as far as I can go for now. My RAID 1 boots just once; after that it becomes read-only. The system boots from it again only if I make a change to its filesystem (copy a file, etc.) or make it writable using mdadm. I deleted the whole array and partition, created them again, and copied the rootfs onto it: no success. I tried to read up on this via Google and found nothing.

Until that is somehow solved, I guess I'm gonna check that btrfs mirror.


gonna check that btrfs mirror. 

 

You'll find some basic information here: 

It's highly recommended to use btrfs only with the most recent kernels, since all the code lives inside the kernel. With older kernels (especially 3.x) you'll suffer from countless bugs that are all resolved upstream.

 

Basically, you boot your device with a vanilla image from the SD card, create one ext4 and one btrfs partition on the eMMC, transfer data to both partitions, and then boot from there. Now prepare the SD card and add the btrfs partition on the SD card to the mirror. The remaining issues are the interesting stuff if you're interested in 'disaster recovery': how to get your board up and running if the eMMC fails for whatever reason.

 

Unlike an mdraid RAID 1, which is absolutely useless when combining an SD card with eMMC, a btrfs mirror might make some sense, but that's only due to checksumming and implementing end-to-end data integrity (so the btrfs mirror will detect and even correct corrupted data while mdraid won't take notice at all -- it's the wrong concept here). So if you really care about what you've written ('data security'), using the mainline kernel, btrfs/zfs features (snapshots!) and doing backups (as in 'send snapshots incrementally to another device') is the way to go. Create redundancy with another SBC lying around that holds a copy of your data.
