Bananapi M1 + SSD since 2018, now stuck in read-only filesystem


Go to solution Solved by arox,

Posted

Hello everyone. I have been running a home dashboard (an apache2 + PHP website) on this Banana Pi since 2018, when I installed a new SSD in it. I have now noticed that I can't open the dashboard anymore, and when I checked the logs I saw that the drive had been remounted read-only.

I have started copying my data off it, but I hope I won't need to replace the SSD.

Can anyone advise me on how to check and, if possible, repair the existing drive?

 

Below are some logs.

 

Thank you

 

root@bananapi:/var/log# dmesg | grep sda
[    2.876748] sd 0:0:0:0: [sda] 234441648 512-byte logical blocks: (120 GB/112 GiB)
[    2.876842] sd 0:0:0:0: [sda] Write Protect is off
[    2.876852] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    2.876992] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    2.879218]  sda: sda1
[    2.881067] sd 0:0:0:0: [sda] Attached SCSI disk
[    5.535028] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[    6.807928] EXT4-fs (sda1): re-mounted. Opts: commit=600,errors=remount-ro
[2962391.488423] EXT4-fs error (device sda1): ext4_validate_block_bitmap:376: comm apache2: bg 578: bad block bitmap checksum
[2962391.500592] Aborting journal on device sda1-8.
[2962391.501346] EXT4-fs (sda1): Remounting filesystem read-only
[2962391.507343] EXT4-fs error (device sda1) in ext4_writepages:2884: IO failure
[2969323.575865] EXT4-fs error (device sda1): ext4_remount:5257: Abort forced by user

 

root@bananapi:/var/log# cat /etc/fstab
# <file system>                                 <mount point>   <type>  <options>                                                       <dump>  <pass>
tmpfs                                           /tmp            tmpfs   defaults,nosuid                                                 0       0
UUID=62fc7248-9a57-4024-90d9-b4767bd2c697       /media/mmcboot  ext4    defaults,noatime,nodiratime,commit=600,errors=remount-ro,x-gvfs-hide    0    1
/media/mmcboot/boot                             /boot           none    bind                                                            0       0
UUID=9fb21562-a6fa-4b60-8453-bcf5bdda898a       /               ext4    defaults,noatime,nodiratime,commit=600,errors=remount-ro,x-gvfs-hide    0    1
root@bananapi:/var/log# mount -o remount, rw /
mount: cannot remount rw read-write, is write-protected

 

root@bananapi:/var/log# fdisk -l
Disk /dev/ram0: 4 MiB, 4194304 bytes, 8192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/ram1: 4 MiB, 4194304 bytes, 8192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/ram2: 4 MiB, 4194304 bytes, 8192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/ram3: 4 MiB, 4194304 bytes, 8192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/mmcblk0: 14.4 GiB, 15476981760 bytes, 30228480 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xf7477067

Device         Boot Start      End  Sectors  Size Id Type
/dev/mmcblk0p1       8192 29926175 29917984 14.3G 83 Linux


Disk /dev/sda: 111.8 GiB, 120034123776 bytes, 234441648 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xce16e6cf

Device     Boot Start       End   Sectors   Size Id Type
/dev/sda1        2048 234441647 234439600 111.8G 83 Linux


Disk /dev/zram0: 50 MiB, 52428800 bytes, 12800 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zram1: 498.8 MiB, 523026432 bytes, 127692 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

 

Posted

When that sort of thing happens to me, I have to plug the disk into another computer and run fsck with the force (-f) option.

 

It is no fun, and be aware that fsck may have to remove half your files if the cause is a disk failure... I don't know how else to force an fsck if you cannot drop to a shell during the boot sequence.

 

You could try: mount -o remount,rw /

 

But it surely will fail.

 

Since you use a BPI M1, I guess the boot loader is on the SD card. If you still have a valid root fs on the SD card, you could modify the boot loader configuration to mount the root fs from the SD card.
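
Before and after any repair attempt, it helps to confirm whether the kernel currently has the root fs mounted read-only. A minimal sketch, assuming a Linux system where /proc/mounts is available; the function name mount_state is made up for this example:

```shell
#!/bin/sh
# Print "ro" or "rw" for a mount point, given a table in /proc/mounts format
# on stdin (fields: device, mount point, type, options, dump, pass).
# Prints "unknown" if the mount point is not listed.
mount_state() {
    awk -v mp="$1" '
        $2 == mp {
            # Compare each option exactly, so "errors=remount-ro" is not
            # mistaken for the "ro" flag.
            n = split($4, opts, ",")
            state = "rw"
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") state = "ro"
            found = state   # the last matching line wins
        }
        END { print (found == "" ? "unknown" : found) }
    '
}
```

Usage would be something like `mount_state / < /proc/mounts`, which should print ro while the filesystem is in the state shown in the dmesg output of the first post.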

Posted
root@bananapi:~# mount -o remount, rw /
mount: cannot remount rw read-write, is write-protected


I tried it, but it doesn't work.

I have copied almost all my files. Only the big MySQL database is not backed up yet, because phpMyAdmin doesn't load. I will try to copy it somehow through PuTTY before I shut the board down and check the disk.

What do you recommend regarding a "live image"? I could burn a distro you recommend onto another SD card and boot from it with the SSD connected, so that hopefully I can run the disk check commands on the same unit. All my other PCs run Windows.
Thank you

Posted

Of course, that is the simplest and therefore the best solution. Flashing a new SD card takes some time, but at least you should be able to repair your disk in two minutes next time.

 

For the distro, anything that boots and has up-to-date ext4 support should do the trick. (That just means not a five-year-old distro.)

 

 

  • Solution
Posted

If you are not familiar with Unix/Linux:

 

Do

- fsck /dev/sda1

If it seems to think nothing needs to be done, don't believe it; run:

- fsck -f /dev/sda1

When it asks you to confirm clearing inodes, say "y". You don't have much choice anyway.

If it starts asking to remove blocks, you may get worried, but you have little choice but to agree.

If the number of removals becomes large, you may get very upset.

If it finishes and leaves a number of nameless files in lost+found, you have a problem.

But generally there are just a couple of inodes and some pointers/counters that need to be fixed.

 

In theory you should run a second fsck to be sure the fs is repaired. If that one still needs to take action, your disk or controller has a problem. (The board or the disk?)

 

Before swapping SD cards and trying to boot from the disk, do a test mount:

- mount /dev/sda1 /mnt

- cat /mnt/etc/os-release (to reassure yourself)

- umount /mnt (or you risk damaging your fs again)
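
fsck also reports what it did through its exit status, which is a sum of flag values documented in fsck(8). The helper below is only a sketch for decoding that number when the check is run from a rescue system; describe_fsck_status is a hypothetical name, not part of any package:

```shell
#!/bin/sh
# Translate an fsck exit status (a bit mask, per fsck(8)) into readable text.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msgs=""
    for pair in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user" \
        "128:shared-library error"
    do
        bit=${pair%%:*}     # number before the colon
        text=${pair#*:}     # description after the colon
        if [ $(( status & bit )) -ne 0 ]; then
            msgs="${msgs:+$msgs; }$text"
        fi
    done
    echo "$msgs"
}
```

For example, right after `fsck -f /dev/sda1` on the unmounted partition, `describe_fsck_status $?` would print "no errors" for a clean second pass. A status with the 4 flag set means errors were left uncorrected, which is the case to worry about.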

 

 

Posted

Yes, run "fsck" on the drive, but you probably want to use "fsck -y", which will just attempt to fix things without asking permission at each step. Also do this with the filesystem unmounted; otherwise you risk damaging the filesystem.

 

Also, you may want to add "fsck.repair=yes" to your kernel arguments. That will direct systemd to automatically attempt repairs like these. It may make things worse with failing hardware, but it will generally do the right thing for users who want things to just work.

 

https://www.linux.org/docs/man8/systemd-fsck.html
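
To see whether these options are already active on a running system, the kernel command line can be read from /proc/cmdline. A small sketch; cmdline_param is a hypothetical helper name, and fsck.mode=force is the other documented systemd-fsck option, which forces a check at boot:

```shell
#!/bin/sh
# Print the value of a kernel command-line parameter (e.g. fsck.repair),
# reading the command line from stdin; print "unset" if it is absent.
cmdline_param() {
    tr ' ' '\n' | awk -F= -v key="$1" '
        # Keep everything after "key=", even if the value contains "=".
        $1 == key { val = substr($0, length(key) + 2); seen = 1 }
        END { print (seen ? val : "unset") }
    '
}
```

`cmdline_param fsck.repair < /proc/cmdline` prints the current value, or unset. Where the arguments are configured depends on the image; on Armbian this is often an extraargs= line in /boot/armbianEnv.txt, but treat that as an assumption to verify on an image as old as this one.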

Posted
8 hours ago, tparys said:

Yes, run "fsck" on the drive, but you probably want to use "fsck -y", which will just attempt to fix things without asking permission at each step. Also do this with the filesystem unmounted; otherwise you risk damaging the filesystem.

 

Also, you may want to add "fsck.repair=yes" to your kernel arguments. That will direct systemd to automatically attempt repairs like these. It may make things worse with failing hardware, but it will generally do the right thing for users who want things to just work.

 

https://www.linux.org/docs/man8/systemd-fsck.html

 

fsck.repair=yes is normally set by distros.

 

Nevertheless, it seems to me that fs corruption sometimes goes unnoticed with ext4? So a force option is needed, but "force" is a bad idea in an automatic procedure because it may remove half the files. A manual procedure is necessary so that the administrator can back up what can be saved.

 

I have also noticed that the boot procedure may perform a full boot with a corrupted fs. Then the problem is likely to get worse. Also, if the root fs is write-locked, there is no point in starting all services, which will only hinder system maintenance: the boot sequence should drop to a maintenance shell if the remount rw fails.

 

(I say "seems to me" or "noticed" because when it happens to me on a system with an SSD, it is my desktop or my file server that is unavailable, and I am more in a hurry to solve the problem than to investigate it. With a headless server, or without a working main desktop, that is not easy either.)

Posted

Thank you guys. I'm currently away from home; I can connect to it remotely, but I can't physically change anything (such as swapping SD cards).

 

I will try to burn the same Armbian distro onto a new SD card when I get home, but until then, I have just tried fsck and it seems it fixed a few things.

 

Do you think it is safe to reboot it now?

 

root@bananapi:~# fsck -f /dev/sda1
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
/dev/sda1: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'alarm_20210418T162929_0313.jpg' in /var/www/html/security_camera_storage/cam2 (268645) has deleted/unused inode 311335.  Clear<y>? yes
Entry 'alarm_20210418T162929_0690.jpg' in /var/www/html/security_camera_storage/cam2 (268645) has deleted/unused inode 311336.  Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 365658
Connect to /lost+found<y>? yes
Inode 365658 ref count is 2, should be 1.  Fix<y>? yes
Pass 5: Checking group summary information
Block bitmap differences:  -(18199362--18199491) -(18222129--18222259) -(18258312--18258464) -(18264714--18264858) -(18271789--18271933) -(18274837--18274981) -(18276125--18276269) -(18276878--18277014) -(18278829--18278969) -(18280586--18280732) -(18284169--18284311) -(18322306--18322440) -(18371739--18371873) -(18388110--18388244) -(18405315--18405454) -(18414482--18414603) -(18908826--18908905)
Fix<y>? yes
Free blocks count wrong for group #577 (6958, counted=7038).
Fix<y>? yes
Free blocks count wrong (6116773, counted=3943820).
Fix<y>? yes
Free inodes count wrong (7078589, counted=7056170).
Fix<y>? yes

/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: ***** REBOOT SYSTEM *****
/dev/sda1: 275670/7331840 files (5.3% non-contiguous), 25361130/29304950 blocks

 

And, as you recommended, I ran fsck a second time:

 

root@bananapi:~# fsck -f /dev/sda1
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (3936310, counted=3943820).
Fix<y>? yes
Free inodes count wrong (7056168, counted=7056170).
Fix<y>? yes

/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: ***** REBOOT SYSTEM *****
/dev/sda1: 275670/7331840 files (5.3% non-contiguous), 25361130/29304950 blocks

 

root@bananapi:~# mount dev/sda1 /mnt
mount: special device dev/sda1 does not exist
root@bananapi:~# mount /dev/sda1 /mnt
mount: /dev/sda1 is already mounted or /mnt busy
       /dev/sda1 is already mounted on /
       /dev/sda1 is already mounted on /var/log.hdd
root@bananapi:~# cat /mnt/etc/os-release
cat: /mnt/etc/os-release: No such file or directory
root@bananapi:~# umount /mnt
umount: /mnt: not mounted

 

I have rebooted the system and will post back later if it comes up successfully. If it doesn't, I will have to check it physically tonight when I get home.

 

Thank you very much for your help.

Posted

Guys, THANK YOU again :D It booted up successfully and I have my dashboard up and running again, without even touching the Banana Pi board :D Everything was fixed remotely with your help.

This board has been running flawlessly 24/7 for about 5 years. In 2018 I fitted a new SSD; before that it ran with the OS installed on an old HDD, which failed after a year or two. Banana Pi rocks! Very stable.

Though, I think I will start using a USB flash drive for the temporary security camera files. I already have an NVR that records the camera footage 24/7, but I set up my cameras to upload a photo to the Banana Pi's FTP server on every motion detection, just to be able to view ALL the daily movements around my house WITH JUST A SCROLL from left to right. It's much easier than logging in to the NVR and fast-forwarding the video.

It's also what I want most: to have all the information centralized in one dashboard. I have the cameras, the photovoltaic panel production, house energy consumption, heat-pipe solar water heater tank monitoring, control of my HVAC unit, office lights, LAN devices... and for all of those I have alerts: if some values are off or some devices are offline, I get a Pushbullet alert on my phone.

What I haven't done yet (I have the device, I just don't have the time) is a backup for the alerting system in case my house has no internet connection. I purchased a 3G SIM module that I will attach to an Arduino board, which will constantly query the Banana Pi for its status. If the Banana Pi responds and has no unsent alerts, it will do nothing, but if the Banana Pi doesn't respond, I will get an SMS alert that the dashboard is offline. The same Arduino can also "ask for a status" from all my other Arduino sensors; it just needs the router to work, not an actual internet connection :)

 

 

Hme dashboard.jpg

Posted

Hum !

 

It is a bit worrying that your second fsck complained. I do not understand why you couldn't unmount the fs, or why /dev/sda1 was not present at first and then appeared the second time. Did you run fsck with the fs mounted ro?

The fs is probably repaired now, but you should check your logs and keep an up-to-date backup for some time. There are commands to check the disk errors recorded by the firmware (but not controller or cable errors). Anyway, I have experienced your problem many times and still use the board and cards without trouble.

Yes, the BPI M1 with Armbian was and still is a good solution. That is why I still use it although it is completely out of date. But I got the same sort of fs corruption (triggered by USB bugs?) with Raspbian on an RPi 4.

Posted
On 4/20/2021 at 7:48 PM, arox said:

Hum !

 

It is a bit worrying that your second fsck complained. I do not understand why you couldn't unmount the fs, or why /dev/sda1 was not present at first and then appeared the second time. Did you run fsck with the fs mounted ro?

The fs is probably repaired now, but you should check your logs and keep an up-to-date backup for some time. There are commands to check the disk errors recorded by the firmware (but not controller or cable errors). Anyway, I have experienced your problem many times and still use the board and cards without trouble.

Yes, the BPI M1 with Armbian was and still is a good solution. That is why I still use it although it is completely out of date. But I got the same sort of fs corruption (triggered by USB bugs?) with Raspbian on an RPi 4.

 

The fs was still mounted; I hadn't run umount before running fsck, but it was in read-only mode.

To check the disk I installed smartmontools and ran the short self-test on it; the result is below. I haven't googled it yet, but do you guys have a script that can send an alert when it detects a SMART error on the drive?

 

root@bananapi:~# sudo smartctl -t short -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [armv7l-linux-4.19.62-sunxi] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KINGSTON SA400S37120G
Serial Number:    50026B7682034AC1
LU WWN Device Id: 5 0026b7 682034ac1
Firmware Version: SBFK71E0
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   Unknown(0x0ff8) (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Apr 22 11:09:10 2021 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (65535) seconds.
Offline data collection
capabilities:                    (0x79) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   000   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       27101
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       147
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
167 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
169 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       6
170 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       7
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       46138104
181 Program_Fail_Cnt_Total  0x0032   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       143
194 Temperature_Celsius     0x0022   075   069   000    Old_age   Always       -       25 (Min/Max 23/31)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
231 Temperature_Celsius     0x0000   071   071   000    Old_age   Offline      -       29
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       74821
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       47642
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       1600
244 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       704
245 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       760
246 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       4153344

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Thu Apr 22 11:11:10 2021
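
On the question of alerting: smartd, which ships with smartmontools, is the usual way to monitor a drive and send mail on problems. For a custom dashboard, a cron-friendly sketch could filter the attribute table instead. The function name smart_warnings and the list of watched attributes are choices made for this example, not an official list:

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print one line for each watched
# attribute whose RAW_VALUE (10th column) is not zero.
# Returns 1 if anything was reported, 0 otherwise.
smart_warnings() {
    awk '
        BEGIN {
            # Referencing an element is enough to create it in awk.
            watch["Reallocated_Sector_Ct"]
            watch["Reallocated_Event_Count"]
            watch["Reported_Uncorrect"]
            watch["Program_Fail_Cnt_Total"]
            watch["Erase_Fail_Count_Total"]
            watch["UDMA_CRC_Error_Count"]
        }
        ($2 in watch) && ($10 + 0 != 0) {
            printf "%s raw value is %s\n", $2, $10
            bad = 1
        }
        END { exit bad }
    '
}
```

Run it as `smartctl -A /dev/sda | smart_warnings`; for the table above it prints nothing and returns 0, because all watched raw values are 0. The result of the short self-test started by -t short has to be read afterwards with `smartctl -l selftest /dev/sda`.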

 

 

On 4/20/2021 at 8:19 PM, arox said:

BTW

 

A possible cause of repeated fs corruption is the power supply. (Or power cables).

I power the Banana board from a PC power source, it should have plenty of power to run the board and hdd. But I will check that too, as it's been running for 5+ years 24/7 .. exactly  like the Banana M1 board 

Posted
4 hours ago, sibianul said:

I power the Banana board from a PC power source, it should have plenty of power to run the board and hdd. But I will check that too, as it's been running for 5+ years 24/7 .. exactly  like the Banana M1 board 

"it should have plenty of power to run the board and hdd"

 

This is a recurring question on this forum. The voltage available to the board always depends on the source voltage and also on the resistance of the cable(s), connector contacts and internal circuits, on protection diodes and fuses (and on the PSU's age, which is difficult to evaluate)... When the board draws too much current, the total resistance can make the input voltage drop below the required minimum, for a time too short to see on a multimeter but long enough to cause all sorts of hardware failures.
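
As a rough illustration of that voltage drop, Ohm's law gives V_board = V_supply - I x R. The numbers in the usage line are invented for the example, not measured on a BPI M1, and board_voltage is a hypothetical helper:

```shell
#!/bin/sh
# Estimate the voltage that reaches the board: V_board = V_supply - I * R,
# where R is the total series resistance of cable, connectors and
# protection parts. Arguments: supply volts, current in amps, resistance in ohms.
board_voltage() {
    awk -v v="$1" -v i="$2" -v r="$3" 'BEGIN { printf "%.2f\n", v - i * r }'
}
```

`board_voltage 5.0 2.0 0.25` prints 4.50: a quarter of an ohm in cable and connectors is enough to lose half a volt at 2 A.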

 

If you never changed anything and never moved the cables, forget about that. But if you replugged the power connector of the BPI M1, be sure to use the one next to the SATA connector, and check that half the strands of your cable are not cut (as has happened to me many times).

"PC power source": a PC power supply has plenty of power, but a USB port does not, since it has current limiting and protection circuitry. I do not think your board could boot if the SBC and the disk were powered through a single USB port.

 

As tparys said, you should also check that fsck is requested at boot (kernel parameter "fsck.repair=yes" on "modern" Linux, though yours, like mine, is perhaps a little less modern), because sooner or later you will face a power outage that the filesystem journal cannot handle.

Posted

I soldered the 5 V wires from the PSU to the Banana Pi board; I think there were some pads dedicated to soldering the DC input. I didn't use the USB connector of the board.

 

The PSU also powers some LED strips and a router in my office. You know, if you join the black and green wires in the PSU connector, it stays on forever :)

 

The Banana Pi has had this "read-only" problem with this SSD for the first time now, after 3 years of running 24/7, so the issue is not a recurring one; it had never happened before. We'll see whether it happens more often from now on. I hope not.

"If you never changed anything and never moved the cables, forget about that": no, I haven't touched the cables or the Pi, and the PSU is also connected to a UPS. There was no mains power failure and no reboot.

I'm happy it works OK now. I don't know what the cause was, but I'll worry only if it happens more often :) Anyway, I will change a few things as I mentioned, because the cameras save many photos on the SSD, but they are temporary and are automatically deleted after 7 days. That information is not important and could be saved on a flash drive; if it fails, I'll throw it straight in the bin and mount another one. The SSD also holds the MySQL database with the sensor data, which is more important.

If it happens again, I'll post back for sure.

 

Have a nice evening. Thank you again.
