
I freshly installed Ubuntu 24.04.2 on a client. The system freezes after some time: keyboard, mouse, and network become unavailable. Ping and SSH are not possible, and a hard reboot is required. The error is reproducible (although I do not know what triggers it).

My hardware is: Intel i5-8400T 1.7 GHz; RAM: 8 GB; VGA compatible controller: Intel Corporation CoffeeLake-S GT2

My software is Ubuntu 24.04.2 (newest LTS) with all updates, kernel 6.11.0-28-generic.

With journalctl -b -1 -p err I see the following error messages (in the last block, the system is still running):

*Jun 29 09:08:33 Waage002-Aquado-PC kernel: x86/cpu: SGX disabled by BIOS.
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT,Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (3>
Jun 29 09:08:34 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 09:08:36 Waage002-Aquado-PC gdm3[910]: Gdm: on_display_added: assertion 'GDM_IS_REMOTE_DISPLAY (display)' failed
Jun 29 09:08:47 Waage002-Aquado-PC gdm-password][1801]: gkr-pam: unable to locate daemon control file
Jun 29 09:08:48 Waage002-Aquado-PC gdm3[910]: Gdm: on_display_added: assertion 'GDM_IS_REMOTE_DISPLAY (display)' failed
Jun 29 09:08:48 Waage002-Aquado-PC systemd[1819]: Failed to start app-gnome-gnome\x2dkeyring\x2dsecrets-2055.scope - Application launched by gnome-session-bina>
Jun 29 09:08:48 Waage002-Aquado-PC systemd[1819]: Failed to start app-gnome-xdg\x2duser\x2ddirs-2074.scope - Application launched by gnome-session-binary.
Jun 29 09:08:51 Waage002-Aquado-PC gdm3[910]: Gdm: on_display_removed: assertion 'GDM_IS_REMOTE_DISPLAY (display)' failed

Jun 29 11:00:06 Waage002-Aquado-PC kernel: x86/cpu: SGX disabled by BIOS.
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240>
Jun 29 11:00:06 Waage002-Aquado-PC kernel: ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
Jun 29 11:00:09 Waage002-Aquado-PC gdm3[951]: Gdm: on_display_added: assertion 'GDM_IS_REMOTE_DISPLAY (display)' failed

still running:
Jun 29 11:05:49 Waage002-Aquado-PC systemd[1825]: Failed to start app-gnome-gnome\x2dkeyring\x2dsecrets-2074.scope - Application launched by gnome-session-binary.
Jun 29 11:05:49 Waage002-Aquado-PC systemd[1825]: Failed to start app-gnome-xdg\x2duser\x2ddirs-2093.scope - Application launched by gnome-session-binary.
Jun 29 11:05:51 Waage002-Aquado-PC systemd[1825]: Failed to start app-gnome-spice\x2dvdagent-2267.scope - Application launched by gnome-session-binary.
Jun 29 11:05:51 Waage002-Aquado-PC systemd[1825]: Failed to start app-gnome-user\x2ddirs\x2dupdate\x2dgtk-2328.scope - Application launched by gnome-session-binary.
Jun 29 11:05:52 Waage002-Aquado-PC gdm3[946]: Gdm: on_display_removed: assertion 'GDM_IS_REMOTE_DISPLAY (display)' failed*

Some additional information from dmesg (I do not know whether it is related):

[ 0.101402] x86/cpu: SGX disabled by BIOS.
...
[ 5.252132] loop11: detected capacity change from 0 to 1136
[ 5.524255] EDAC ie31200: No ECC support
[ 5.536489] intel_pmc_core INT33A1:00: initialized
[ 5.567579] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.567592] No Local Variables are initialized for Method [WMBB]
[ 5.567595] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.567596] Arg0: 00000000b1ca0ff5 Integer 0000000000000000
[ 5.567603] Arg1: 00000000350539ce Integer 0000000000000125
[ 5.567608] Arg2: 000000001529b098 Buffer(4) 00 00 00 00
[ 5.567618] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.567713] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.567719] No Local Variables are initialized for Method [WMBB]
[ 5.567722] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.567723] Arg0: 000000007b94f451 Integer 0000000000000000
[ 5.567729] Arg1: 00000000afc1e84b Integer 0000000000000125
[ 5.567734] Arg2: 00000000b3d68b6c Buffer(4) 01 00 00 00
[ 5.567743] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.567832] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.567839] No Local Variables are initialized for Method [WMBB]
[ 5.567841] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.567843] Arg0: 0000000091668680 Integer 0000000000000000
[ 5.567848] Arg1: 0000000033096f58 Integer 0000000000000125
[ 5.567853] Arg2: 00000000af0a7718 Buffer(4) 02 00 00 00
[ 5.567862] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.567951] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.567957] No Local Variables are initialized for Method [WMBB]
[ 5.567960] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.567961] Arg0: 00000000b3d68b6c Integer 0000000000000000
[ 5.567966] Arg1: 00000000afc1e84b Integer 0000000000000125
[ 5.567971] Arg2: 000000007b94f451 Buffer(4) 03 00 00 00
[ 5.567980] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.568072] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.568078] No Local Variables are initialized for Method [WMBB]
[ 5.568080] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.568083] Arg0: 000000005d14a3c5 Integer 0000000000000000
[ 5.568089] Arg1: 00000000eeaaf3b9 Integer 0000000000000125
[ 5.568094] Arg2: 000000000e5b0338 Buffer(4) 04 00 00 00
[ 5.568103] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.568192] ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [SMDN] at bit offset/length 32/16 exceeds size of target Buffer (32 bits) (20240322/dsopcode-198)
[ 5.568198] No Local Variables are initialized for Method [WMBB]
[ 5.568201] Initialized Arguments for Method [WMBB]: (3 arguments defined for method invocation)
[ 5.568202] Arg0: 000000007b94f451 Integer 0000000000000000
[ 5.568207] Arg1: 00000000afc1e84b Integer 0000000000000125
[ 5.568212] Arg2: 00000000b3d68b6c Buffer(4) 05 00 00 00
[ 5.568222] ACPI Error: Aborting method \GSA1.WMBB due to previous error (AE_AML_BUFFER_LIMIT) (20240322/psparse-529)
[ 5.568235] gigabyte-wmi DEADBEEF-2001-0000-00A0-C90629100000: No temperature sensors usable
[ 5.569821] mei_me 0000:00:16.0: enabling device (0000 -> 0002)
[ 5.578790] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
...
[ 5.674865] 0x000000000000-0x000001000000 : "BIOS"
[ 6.211312] i915 0000:00:02.0: [drm] Found COFFEELAKE (device ID 3e92) display version 9.00 stepping N/A
[ 6.211997] i915 0000:00:02.0: [drm] VT-d active for gfx access
[ 6.224616] i915 0000:00:02.0: vgaarb: deactivate vga console
[ 6.224674] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[ 6.225647] snd_hda_intel 0000:00:1f.3: enabling device (0000 -> 0002)
[ 6.226388] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 6.228377] intel_tcc_cooling: Programmable TCC Offset detected
[ 6.230535] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 6.262223] audit: type=1400 audit(1751200135.509:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name=4D6F6E676F444220436F6D70617373 pid=548 comm="apparmor_parser"
...

Does anybody have an idea how to fix the freeze or how to investigate further? Help is very much appreciated!

1 Answer


I can't be sure, but those error messages are probably harmless. Buggy ACPI isn't uncommon, and those Gdm errors might just mean "GDM just blindly tried to launch an optional service you don't have installed".
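One way to test the "probably harmless" guess is to check whether the boot that froze logged anything the healthy boot did not. Here is a minimal Python sketch, assuming the default journalctl line format shown in the question; the function names are made up for illustration and the digit masking is a crude heuristic, not anything journalctl provides.

```python
import re

# Matches the default journalctl short format:
#   "Jun 29 09:08:33 hostname source[pid]: message"
_PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^:\[\s]+)(?:\[\d+\])?:\s*(.*)$"
)


def normalize(line):
    """Return (source, message) without timestamp, hostname, and PID, or None."""
    match = _PREFIX.match(line.strip())
    if not match:
        return None
    source, message = match.groups()
    # Mask digit runs so per-boot IDs (e.g. scope numbers) do not count as differences.
    return source, re.sub(r"\d+", "N", message)


def only_in_first(first_boot_lines, second_boot_lines):
    """Normalized messages that appear in the first log but never in the second."""
    first = {n for n in map(normalize, first_boot_lines) if n}
    second = {n for n in map(normalize, second_boot_lines) if n}
    return sorted(first - second)
```

Feed it the lines from `journalctl -b -1 -p err --no-pager` (the boot that froze) and `journalctl -b 0 -p err --no-pager` (the current boot). If the result is empty, every error from the frozen boot also shows up in a boot that runs fine, which supports treating those errors as noise rather than the cause.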

When a system freezes, the first thing I do is determine whether it's the whole system that's frozen or just the display.

  1. Make sure the SSH daemon is running
  2. SSH in while it's not frozen to verify it works (i.e. as a control)
  3. SSH in while it is frozen
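If you would rather script the check from a second machine than type ssh by hand, something like the following Python sketch would do. It only tests whether a TCP connection to the SSH port can be opened; the function name and the default timeout are my own choices, not part of any tool.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Run it once while the machine is healthy as the control, then again during a freeze. Keep in mind that a successful connect only shows that the kernel's network stack is still answering; you still need an actual SSH login to confirm that userland is alive.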

If you can SSH in while it's frozen, it might still be a driver bug, but it's not serious enough to seize up the whole system, so the next step would be to do things like:

  1. See if sudo dmesg shows anything interesting
  2. Look at whatever the GNOME Wayland equivalent to /var/log/Xorg.0.log is (nVidia+Kubuntu 24.04.2 here. I'm still on X11)
  3. Try kill or kill -9 on your compositor, or use whatever the GNOME Wayland equivalent to sudo systemctl restart sddm.service is, to get a sense of how firmly the compositor is wedged (e.g. whether a userland restart without a kernel reboot can get things back).
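For the dmesg step, "anything interesting" can be narrowed down a bit by filtering for strings the kernel commonly prints when something has gone badly wrong. This is a rough Python sketch; the pattern list is my own assumption, is deliberately case-sensitive, and is nowhere near exhaustive.

```python
import re

# Assumed, non-exhaustive markers of kernel trouble; extend for your hardware.
# Case-sensitive on purpose so "ACPI BIOS Error (bug)" and "debug:" do not match.
SUSPICIOUS = re.compile(
    r"GPU HANG|BUG:|Oops|Call Trace|soft lockup|hard LOCKUP|blocked for more than"
)


def suspicious_lines(dmesg_text):
    """Return the lines of dmesg output that contain one of the markers."""
    return [line for line in dmesg_text.splitlines() if SUSPICIOUS.search(line)]
```

Save the output of sudo dmesg to a file over SSH while the display is wedged and pass its contents to suspicious_lines. An empty result doesn't prove the kernel is fine; it just means none of these particular markers showed up.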

If you can't SSH in while it's frozen, then you've got the Linux equivalent to the Windows Blue Screen of Death and it's either dying hardware or a kernel bug (probably a driver bug).

  1. Boot https://memtest.org/ off a USB stick and confirm that it goes through a full test run without finding memory errors or freezing the system.
  2. Use a tool like smartctl as appropriate to check if any of your hard drives have logged signs of going bad. (e.g. sudo smartctl -a /dev/sda will read out the logs for your first SATA hard drive. For SATA, look for things like the Offline_Uncorrectable or Reported_Uncorrect lines having a RAW_VALUE greater than 0, or entries under "SMART Self-test log structure revision number 1" whose status is neither "Completed without error" nor interrupted... the latter being something like "you powered the system down while a self-test was running in the background".)
  3. If you get to this stage, it's probably a kernel bug (usually a driver) and the fix is most likely to upgrade (or possibly downgrade) either your kernel or, for external drivers like the nVidia drivers or VirtualBox's extensions or CDEmu's VHBA, possibly just those.
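The smartctl check in step 2 can be automated if you have several drives to go through. Below is a minimal Python sketch that assumes the usual ten-column ATA attribute table that smartctl -A prints (ID#, ATTRIBUTE_NAME, ..., RAW_VALUE); NVMe drives report health in a different layout, and the function name is hypothetical.

```python
WATCHED = frozenset({"Offline_Uncorrectable", "Reported_Uncorrect"})


def failing_attributes(smartctl_output, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes with raw value > 0."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        # Skip raw values that are not plain integers rather than guess at them.
        if name in watched and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found
```

Pass it the text printed by sudo smartctl -A /dev/sda. An empty dict means neither watched counter is above zero, which is reassuring but not a guarantee that the drive is healthy.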

(eg. Back on kernel version 6.1.x, this Arch user found their USB WiFi dongle's driver had a system-freezing bug and, while it wasn't system-freezing, back on 22.04 LTS, I found that VirtualBox and my mother's laptop's AMD GPU drivers were giving UBSAN errors with the HWE kernel that was getting installed by default, so we downgraded to the kernel that came with 22.04.0.)

ssokolow