I run Arch Linux on a Ryzen 7 3700X with 32 GB of RAM and a Gigabyte motherboard with an updated BIOS.

A few weeks ago my computer started crashing (the screen would freeze) soon after login or even at boot, about 50% of the time. I was lazy, so when it crashed I just force-rebooted it with the power button. Then the crashes became more common until my system wouldn’t even boot.

So I reinstalled, and I had some trouble generating dracut bundles because some zstd compression was corrupted. After booting the freshly installed OS, it would crash again right before the login prompt should show up. Switching kernels (from hardened to zen) fixed the problem. Then I installed basic apps (browsers, office, crypto stuff, Steam, etc.). I rebooted, and when I typed the password for my encrypted root it was rejected as wrong (I’m sure I typed it correctly).

I have no idea wtf went wrong with my system. I have almost the same setup on my laptop (hardened, btrfs, LUKS-encrypted drives, systemd-boot, etc.) and it works great. And I never experienced any crashes on a live USB on my PC.

I ran some random test (it’s PassMark MemTest86 v9.3 Pro) from my MediCat USB. Right now it’s 92% finished with 1070 errors. This just can’t be good :(
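
For context, each error a memory tester reports is a location where the value read back differed from the value written. A minimal Python sketch of that write/read-back idea follows. The function names and buffer size are made up for illustration, and a user-space script like this only touches whatever memory the OS hands it, so it is no substitute for MemTest86.

```python
def fill_pattern(size: int, pattern: int) -> bytearray:
    """Allocate `size` bytes and fill every byte with `pattern` (0-255)."""
    return bytearray([pattern]) * size


def count_mismatches(buf: bytearray, pattern: int) -> int:
    """Count bytes that no longer hold the value that was written."""
    return len(buf) - buf.count(pattern)


if __name__ == "__main__":
    # Walk a few classic bit patterns over a 16 MiB buffer (size is arbitrary).
    for pattern in (0x00, 0xFF, 0x55, 0xAA):
        buf = fill_pattern(16 * 1024 * 1024, pattern)
        print(f"pattern {pattern:#04x}: {count_mismatches(buf, pattern)} mismatches")
```

On healthy RAM every pattern should report zero mismatches; a real tester repeats this over nearly all physical memory with many more patterns.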

Now I will play with some BIOS settings (like disabling XMP), reflash another BIOS version, maybe switch the SSD… I will also try other distros, but I can’t daily drive them. Arch gives me a ton of flexibility and I don’t want to lose it. Maybe NixOS or Gentoo, but Gentoo doesn’t use systemd by default (I want to use Mullvad as my VPN and their app requires it).

Do you maybe know what could be wrong and how to fix it? Thank you for reading this post and thank you very much for answering.

I don’t know if this is an Arch bug or if something is wrong with my system. If this is not the right community to ask, please direct me to the right one (just please not Reddit).

Edit: I ran memtest again without one RAM stick and it gave me no errors! Thank you for all the help and suggestions :)

Edit: I also tried only the faulty RAM stick and the PC wouldn’t even boot.

Edit: Booting the PC with only the faulty RAM stick corrupted my BIOS… I guess I will have to reflash the BIOS anyway.

  • xan1242@lemmy.ml
    11 months ago

    FWIW I’ve also had memory issues with XMP.

    Turns out that ASUS firmware is omega pepega and decided to go against AMD’s specifications even for XMP profiles.

    CLDO VDDP was stuck at the same voltage as SOC. Per AMD, it has to be no higher than VSOC - 0.1 V.

    So, after manually setting that, and other VDDP and VDDG voltages, it magically started working perfectly.
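
    The rule above can be written as a quick sanity check. This is only a sketch of the constraint as stated in this comment (VDDP at most VSOC - 0.1 V); the function name and the example readings are hypothetical, so plug in the values your own firmware reports.

```python
def vddp_within_limit(vddp: float, vsoc: float, margin: float = 0.1) -> bool:
    """Return True if CLDO VDDP is at most VSOC minus the margin (all in volts)."""
    # Small tolerance so float rounding does not reject exact boundary values.
    return vddp <= vsoc - margin + 1e-9


# Hypothetical readings, not values taken from this thread.
print(vddp_within_limit(1.10, 1.10))  # VDDP stuck at the SOC voltage -> False
print(vddp_within_limit(0.90, 1.10))  # VDDP below VSOC - 0.1 V -> True
```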

    So do check voltages anyway even if you found a bad stick. Mine endured through the crappy firmware thanks to it being Samsung B-die.

    Also check this for more info in general (I recommend it even if you won’t OC; the memtest section alone is huge):

    https://github.com/integralfx/MemTestHelper/blob/oc-guide/DDR4%20OC%20Guide.md

    I tested with OCCT to find even more errors, so either do that in a mini Windows environment or run one of the Linux memory tests to check the memory some more. Memtest86+ isn’t enough.
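
    As one possible Linux-side check, here is a small sketch that wraps the `memtester` command-line tool, assuming it is installed and on PATH. The wrapper function names are made up for illustration; the only thing taken from memtester itself is its `memtester <size> <loops>` invocation and its zero exit status on a clean run.

```python
import shutil
import subprocess


def memtester_command(size_mib: int, loops: int = 1) -> list[str]:
    """Build the argument list for `memtester <size>M <loops>`."""
    if size_mib <= 0 or loops <= 0:
        raise ValueError("size_mib and loops must be positive")
    return ["memtester", f"{size_mib}M", str(loops)]


def run_memtester(size_mib: int, loops: int = 1) -> bool:
    """Run memtester and return True if it exits with status 0 (no errors)."""
    if shutil.which("memtester") is None:
        raise FileNotFoundError("memtester is not installed or not on PATH")
    result = subprocess.run(memtester_command(size_mib, loops))
    return result.returncode == 0
```

    Calling `run_memtester(4096, 2)` as root would test 4 GiB for two passes (both numbers are arbitrary). A user-space tool cannot test memory the kernel is already using, so treat a clean run as one more data point rather than proof.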