nettings

  1. Clicked on "answer this question" in the hope that it marks the issue as solved. If not, can a friendly admin please delete this post and mark the thread as resolved?
  2. Ok, it looks like dead hardware. If you are seeing something similar, check out this post where a user has a picture of a burnt 12v regulator on the SATA: https://forum.odroid.com/viewtopic.php?t=44517 Next, check the schematics kindly provided by Hardkernel: https://dn.odroid.com/S905X3/ODROID-HC4/odroid-hc4_rev1.0_20200807.pdf You want to look for U21 and U24 in the index. As you can see, they lead to test pads TP10 and TP11, which are accessible on the underside of the board. You can measure them against ground. As expected, on my board that doesn't run 3.5" drives any more in either port, both 12v lines are dead. No visible damage to the regulators though. I'm guessing the EN (pin 4) line is something like an "on" switch. So for a brief moment hope rises, maybe it's a software issue after all. But no: the EN line is shared with a number of other regulators, and they all provide output voltage. Since [TheBug] on #armbian recommended it, I also measured the output pins directly - one is at input voltage, one at something around 3.3v (that would be EN), and all others are dead. Case closed (for now).
  3. I took down a second HC4 and inserted the SD card from the system that first showed the problems. Now one SATA drive shows up, which fortunately lets me run manual backups of my critical systems. BUT: when I booted the second HC4 back into its original SD card, THE SECOND SATA PORT WAS LOST, apparently permanently. So it looks like once you've had that problem, the hardware is altered (damaged?) permanently. EXPLETIVE!
  4. I just restored petitboot, installed Hardkernel's Ubuntu 20.04.4 LTS running
     Linux odroid 4.9.277-82 #1 SMP PREEMPT Fri Feb 18 14:35:13 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux,
     and the disks are still dead:
     [ 5.481026] libata version 3.00 loaded.
     [ 5.483585] ahci 0000:01:00.0: version 3.0
     [ 5.483611] ahci 0000:01:00.0: enabling device (0000 -> 0003)
     [ 5.484107] ahci 0000:01:00.0: SSS flag set, parallel bus scan disabled
     [ 5.490856] ahci 0000:01:00.0: AHCI 0001.0200 32 slots 2 ports 6 Gbps 0x3 impl IDE mode
     [ 5.498893] ahci 0000:01:00.0: flags: 64bit ncq sntf stag led clo pmp pio slum part ccc sxs
     [ 5.508783] ahci 0000:01:00.0: enabling bus mastering
     [ 5.512956] input: Generic USB Keyboard as /devices/platform/ff500000.dwc3/xhci-hcd.0.auto/usb1/1-2/1-2:1.0/0003:040B:2000.0001/input/input1
     [ 5.515231] meson-spicc ffd13000.spi: registered master spi0
     [ 5.517738] spi spi0.0: setup mode 0, 8 bits/w, 100000000 Hz max --> 0
     [ 5.517876] meson-spicc ffd13000.spi: registered child spi0.0
     [ 5.520185] scsi host0: ahci
     [ 5.527998] scsi host1: ahci
     [ 5.528292] ata1: SATA max UDMA/133 abar m512@0xfc700000 port 0xfc700100 irq 69
     [ 5.533998] ata2: SATA max UDMA/133 abar m512@0xfc700000 port 0xfc700180 irq 69
     [ 5.580805] hid-generic 0003:040B:2000.0001: input,hidraw0: USB HID v1.10 Keyboard [Generic USB Keyboard] on usb-xhci-hcd.0.auto-2/input0
     [ 5.598057] input: Generic USB Keyboard as /devices/platform/ff500000.dwc3/xhci-hcd.0.auto/usb1/1-2/1-2:1.1/0003:040B:2000.0002/input/input2
     [ 5.664938] hid-generic 0003:040B:2000.0002: input,hidraw1: USB HID v1.10 Mouse [Generic USB Keyboard] on usb-xhci-hcd.0.auto-2/input1
     [ 5.854694] ata1: SATA link down (SStatus 0 SControl 300)
     [ 6.166670] ata2: SATA link down (SStatus 0 SControl 300)
     I'm beginning to believe there is permanent damage to the board when it comes to powering 3.5" spindles. Inserting a 2.5" notebook spindle, it powers up immediately.
But I've seen at least two other people who've observed this issue in connection with a distribution update. Hmmm. I have two more HC4 here, so it would be easy (if costly) to see what happens, but they are running in production (as was the dead one :o( )
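For anyone comparing their own logs: the interesting part of the dmesg output is the pair of "SATA link down" lines. Here is a minimal sketch that pulls those lines out of a dmesg dump and decodes the SStatus value. It assumes libata prints both registers in hexadecimal and that the low four bits of SStatus are the DET (device detection) field, which is how I read the SATA register layout. The function name `decode_links` and the wording of the DET descriptions are my own, not anything from the kernel.

```python
import re

# Matches libata messages such as:
#   ata1: SATA link down (SStatus 0 SControl 300)
#   ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
LINK_RE = re.compile(
    r"(ata\d+): SATA link (up|down).*?"
    r"\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)"
)

# Assumed meaning of the DET field (SStatus bits 3:0); descriptions are paraphrased.
DET_MEANING = {
    0x0: "no device detected, no PHY communication",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}


def decode_links(dmesg_text):
    """Return one dict per 'SATA link up/down' line found in dmesg_text."""
    results = []
    for match in LINK_RE.finditer(dmesg_text):
        port, state, sstatus_hex, scontrol_hex = match.groups()
        sstatus = int(sstatus_hex, 16)   # assumption: printed as hex
        scontrol = int(scontrol_hex, 16)
        det = sstatus & 0xF              # device detection field
        results.append({
            "port": port,
            "state": state,
            "sstatus": sstatus,
            "scontrol": scontrol,
            "det": det,
            "det_meaning": DET_MEANING.get(det, "unknown"),
        })
    return results


if __name__ == "__main__":
    sample = (
        "[ 5.854694] ata1: SATA link down (SStatus 0 SControl 300)\n"
        "[ 6.166670] ata2: SATA link down (SStatus 0 SControl 300)\n"
    )
    for link in decode_links(sample):
        print(link["port"], link["state"], "->", link["det_meaning"])
```

With SStatus 0 on both ports, the controller sees no device at all, which fits a drive that never powers up rather than one that fails link negotiation.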
  5. The issue persists with the latest nightly edge kernel (Linux hildegard 6.1.1-meson64 #trunk.0099 SMP PREEMPT Wed Dec 28 09:02:09 UTC 2022 aarch64 GNU/Linux).
  6. Many people are having the same problem (I used to, but then it disappeared). There is no conclusive solution, and many devs have problems replicating it. Search around, some people have suggested some types of SD cards are to blame, or particular ways of booting. Personally, I get the feeling it has somehow to do with uptime - while trying to debug a SATA issue, my machine always rebooted cleanly, whereas whenever I did regular updates after a month of uptime or so, there's a high chance it'll hang and require a power cycle (though not always). But it's somewhat elusive.
  7. Correction: when I said 5 1/4" drives above, I actually meant 3 1/2". And instead of 3 1/2", I meant notebook-sized two-and-a-half. So these are not some dinosaur drives... Guess I'm getting old (and I do have an 8" floppy lying around somewhere, but I can't find it :o])
  8. No real surprise, but Armbian_22.05.4_Odroidhc4_bullseye_current_5.10.123.img does not work, either. Tried both with original PSU (delivers 4A at 15V), and another one that does 5A. Even the first drive to be plugged in doesn't wake up, so I think I can rule out PSU issues.
  9. Fresh installation from Armbian_22.11.1_Odroidhc4_bullseye_current_5.19.17.img, still no SATA. https://paste.armbian.com/awasopagay Fresh installation from even older Armbian_22.08.7_Odroidhc4_bullseye_current_5.19.17.img, still no SATA. https://paste.armbian.com/qewucawuxu This latter one is definitely older than the last-known-good configuration. I'm beginning to suspect that either some persistent firmware upgrade did indeed happen, or permanent damage has been done to the hardware. Still unable to get any large 5 1/4" platter to be spun up or even a link recognized.
  10. Tried "echo on > control" in all /sys/bus/pci/devices/0000:01:00.0/**/power directories which were previously on "auto", no difference.
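In case someone wants to repeat the power management experiment without clicking through every directory by hand, this is roughly what I did, written as a small Python sketch. It walks a device directory, finds every `power/control` file that currently reads `auto`, and writes `on` to it. The function name `force_power_on` is made up for illustration; on a real system it has to run as root, and the setting is lost at the next reboot.

```python
import os


def force_power_on(device_root):
    """Write 'on' to each power/control file under device_root that reads 'auto'.

    Returns the sorted list of files that were changed.
    """
    changed = []
    # os.walk does not follow symlinked subdirectories by default,
    # which avoids looping through sysfs cross-links.
    for dirpath, _dirnames, filenames in os.walk(device_root):
        if os.path.basename(dirpath) != "power" or "control" not in filenames:
            continue
        path = os.path.join(dirpath, "control")
        with open(path) as handle:
            current = handle.read().strip()
        if current == "auto":
            with open(path, "w") as handle:
                handle.write("on")
            changed.append(path)
    return sorted(changed)


if __name__ == "__main__":
    # Path taken from the post above; adjust for your own controller.
    for path in force_power_on("/sys/bus/pci/devices/0000:01:00.0"):
        print("set to on:", path)
```

As said, this made no difference on my board, but it at least rules out runtime power management on the controller and its ports.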
  11. FWIW, I just diffed the device trees, and they are identical except for whitespace. So that's not it, either.
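A whitespace-insensitive comparison like the one described can be sketched as follows. This assumes both device trees are already available as text (for example decompiled with `dtc -I dtb -O dts`, which is my assumption about the workflow, not something stated in the post). The helper names `normalize` and `whitespace_insensitive_diff` are hypothetical.

```python
import difflib


def normalize(text):
    """Collapse runs of whitespace and drop empty lines."""
    lines = []
    for line in text.splitlines():
        collapsed = " ".join(line.split())
        if collapsed:
            lines.append(collapsed)
    return lines


def whitespace_insensitive_diff(a_text, b_text):
    """Return unified diff lines between two texts, ignoring whitespace changes."""
    return list(
        difflib.unified_diff(
            normalize(a_text),
            normalize(b_text),
            fromfile="a",
            tofile="b",
            lineterm="",
        )
    )


if __name__ == "__main__":
    left = "sata {\n\tstatus = \"okay\";\n};\n"
    right = "sata {\n    status   =   \"okay\";\n\n};\n"
    diff = whitespace_insensitive_diff(left, right)
    print("identical apart from whitespace" if not diff else "\n".join(diff))
```

An empty result means the two trees differ only in indentation and blank lines, which is what I saw.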
  12. Maybe it is a power issue after all, but one related to the onboard voltage regulators. This issue came up on a different hardware platform, and it looks somewhat related: The corresponding DTD looks completely different though. I'm now comparing two HC4 with two large 5 1/4" drives each. Both are at 5.19.17-meson64. The broken one runs Armbian bullseye and was upgraded two days ago. https://paste.armbian.com/fecepoxoda The intact one runs Armbian buster and hasn't been upgraded in at least a month. http://paste.armbian.com/hayamolexi The only difference I can see (besides the SATA issue) is that the broken one has issues with the RTC - IIUC that's because it's coupled to the OLED module, which only the buster one has.
  13. I have tried downgrading to various previous kernels (5.10.x, which used to work) without success. Testing a few drives, there is one consistent pattern: all 3 1/2 inch drives, whether SSD or spindle, power up. None of the 5 1/4 inch drives (all spindles) do. That would make me suspect a power issue, as large platters draw much larger currents than notebook drives or SSDs, if not for the fact that this was clearly induced by an update. Is there some setting on the controller that would limit current, or a kind of negotiation phase with the drive electronics before the power lines are even switched on? Next I'm going to play with the controller power management knobs... The controller is an ASMedia Technology Inc. ASM1061 SATA IDE Controller (rev 02). Looking elsewhere on the internet, it seems it has a long history of quirky and unreliable behaviour... I wonder if it's best to give up on the HC4...
  14. @ChrisO, have you figured out a way to determine the HC4 board revision? And I wonder: is there any part of the firmware that gets flashed into some persistent memory rather than uploaded at boot? Which might account for the difference?
  15. I can add a few observations:
      * The issue is apparently not specific to SSDs. I lost access to both my drives, and they are spinning rust (Toshiba 8TB).
      * Interestingly, if I do hot-swap in an SSD (only for debugging and against the recommendations of the odroid folks, who say don't hotswap), the SSD is recognized in either SATA port.
      * The issue is also unrelated to Samsung - I tested two SSDs successfully, and both were Samsungs.
      * This is not a power supply issue (as was hinted at elsewhere in odroid forums, and which I was suspecting myself given that spindles fail where more energy-efficient SSDs succeed). I'm using the original power supply, and cross-checked with a more powerful one with same voltage and polarity.
      * The issue persists in 5.19.17-0025 or 6.0.13-0047 from current, or 6.1.0-0064 from edge (example armbianmonitor output with 6.1 is at https://paste.armbian.com/ewabuparak, but the relevant SATA snippets are always the same).
      * Downgrading to 5.10.16 also didn't solve the issue for me (same as reported by GuestMan in the original thread). For the record, here is the corresponding armbianmonitor output (largely identical): https://paste.armbian.com/ofumoyivoz
      I haven't dared to downgrade to legacy 4.x kernels, but would do so if advised. I'm however pretty sure that I was running a 5.1x.y kernel before the apt upgrade and it worked, but unfortunately I didn't note the last-known-good version number. I usually do update regularly, so my guess is the previous, known-good kernel would have been no older than mid-September 2022.
      Question: If I switch kernels using the "Stable/Nightly" and "Other" settings in armbian-config, will this always pull in all relevant firmware/device tree/bootloader components? Or is there something that persists from the initial apt upgrade which broke things? I don't see how any userspace components could play a role here, but I may be missing something...
Side note: can someone please remove the "one post per day" restriction for vetted accounts? I understand how spam is an issue, but I hope I have proved myself as a constructive user, and this limitation is just massively counter-productive when trying to collectively debug an issue here. Keep in mind that I might also be able to help other users with my modest expertise, which this restriction also effectively prevents. And before Igor's fuse gets lit again: I'm not demanding anybody's time, and I'm willing to contribute my own resources to pinpoint this issue. Also, I did make a donation to Armbian when I started using it, I just refuse to buy myself out of posting restrictions with an additional subscription. That said, rest assured that the work of the Armbian maintainers is very much appreciated. I would love to see Armbian thrive with a sustainable business model, but taking frustration out on bug reporters is not the way to go. My 0.02 euros... Correction: turns out the restriction is automatically lifted after the second post, but this was kinda non-obvious...