Hi everyone.
I am a long-time user of Ubuntu MATE, which I think is great, but I just signed up to the community today to see if I can get some help with a strange problem I am having.
Intermittently, when I boot up my laptop, I get errors like the ones below and cannot boot when they occur:
journalctl -b -1 | grep -i long
Jun 07 10:26:30 dell systemd-udevd[516]: seq 2959 '/module/nvidia' is taking a long time
Jun 07 10:26:30 dell systemd-udevd[516]: seq 3425 '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C14:02' is taking a long time
Jun 07 10:26:30 dell systemd-udevd[516]: seq 3269 '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:10/PNP0C09:00/INT3403:00' is taking a long time
Jun 07 10:26:30 dell systemd-udevd[516]: seq 3248 '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/PNP0C14:01' is taking a long time
Jun 07 10:26:30 dell systemd-udevd[516]: seq 3458 '/devices/pci0000:00/0000:00:04.0' is taking a long time
journalctl -b -1 | grep -i sda2.device
Jun 07 10:26:59 dell systemd[1]: dev-sda2.device: Job dev-sda2.device/start timed out.
Jun 07 10:26:59 dell systemd[1]: Timed out waiting for device dev-sda2.device.
Jun 07 10:26:59 dell systemd[1]: dev-sda2.device: Job dev-sda2.device/start failed with result 'timeout'.
Jun 07 10:28:29 dell systemd[1]: dev-sda2.device: Job dev-sda2.device/start timed out.
Jun 07 10:28:29 dell systemd[1]: Timed out waiting for device dev-sda2.device.
Jun 07 10:28:29 dell systemd[1]: dev-sda2.device: Job dev-sda2.device/start failed with result 'timeout'.
Sometimes it boots up normally with no errors of that kind at all.
I seem to have had these sorts of errors since installing the latest kernel:
/var/log/dpkg.log.1:2022-05-28 12:40:07 status installed linux-image-4.15.0-180-generic:amd64 4.15.0-180.189
I have tried booting from the previous kernel, though, and still had the same problem.
To fix the problem I boot into rescue mode and run an fsck on /dev/sda1 (/boot/efi);
sometimes it is marked as dirty and sometimes it is not.
Then I run
grub-install /dev/sda
update-grub
and after that it usually boots up OK.
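For reference, the rescue steps are roughly these (typed from memory, so treat the exact options as approximate):

fsck.vfat -a /dev/sda1   # check/repair the EFI partition; this is where the dirty bit is sometimes reported
grub-install /dev/sda    # reinstall GRUB
update-grub              # regenerate /boot/grub/grub.cfg
reboot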
My hardware is a Dell Inspiron 7560
My OS is
cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
My disk layout is like so:
fdisk -l /dev/sda
Disk /dev/sda: 238.5 GiB, 256060514304 bytes, 500118192 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 8C7A1912-3565-4853-9E63-55749D58E708
Device Start End Sectors Size Type
/dev/sda1 2048 43007 40960 20M EFI System
/dev/sda2 43008 2140159 2097152 1G Linux filesystem
/dev/sda3 2140160 500118158 497977999 237.5G Linux filesystem
lsblk -f
sda
├─sda1 vfat 0275-84DA /boot/efi
├─sda2 ext4 0650ed2e-5153-459a-9eac-1e026f8e2d54 /boot
└─sda3 crypto_LUKS d3851cb9-79e1-4e8d-af6b-1e2032234039
└─sda3_crypt LVM2_member 7F1rhm-U6qV-oMd4-effZ-cZy8-O0s1-8AS5xU
├─ub-root ext4 f29e8f45-765c-4551-9a93-f5715fcc5458 /
├─ub-home ext4 21244b90-fa13-44a3-8cb0-879331d2ae6b /home
├─ub-e ext4 1941ef6f-7447-486a-8a88-4d620590ab3e /e
└─ub-swap swap 3b470346-9672-4711-837c-ee0090dc4841 [SWAP]
sda3 is encrypted, and on the encrypted device I have a VG called "ub" and my LVs for root, swap, home, e, etc.
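In case it is relevant, the /etc/crypttab entry that unlocks sda3 is the usual single line mapping the LUKS UUID to sda3_crypt, roughly like this (the luks,discard options are from memory, so don't take them as exact):

sda3_crypt UUID=d3851cb9-79e1-4e8d-af6b-1e2032234039 none luks,discard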
ls -l /dev/mapper/*
crw------- 1 root root 10, 236 Jun 8 07:29 /dev/mapper/control
lrwxrwxrwx 1 root root 7 Jun 8 07:29 /dev/mapper/sda3_crypt -> ../dm-0
lrwxrwxrwx 1 root root 7 Jun 8 07:29 /dev/mapper/ub-e -> ../dm-3
lrwxrwxrwx 1 root root 7 Jun 8 07:29 /dev/mapper/ub-home -> ../dm-2
lrwxrwxrwx 1 root root 7 Jun 8 07:29 /dev/mapper/ub-root -> ../dm-1
lrwxrwxrwx 1 root root 7 Jun 8 07:29 /dev/mapper/ub-swap -> ../dm-4
# mount | grep sda
/dev/sda2 on /boot type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
Currently my hard disk is a:
Samsung 860 EVO M.2 SATA III internal SSD (non-NVMe type)
I previously had an additional 2.5" 1TB drive as well, with my LVs spread across both disks, and it seems that either the second disk, its cable, its connection to the motherboard, or even the bus it is on has gone faulty.
So I had to do a bare-metal restore from a backup onto one disk.
Originally the boot disk was seen as /dev/sdb, and now with a single disk it is /dev/sda.
To do the bare-metal restore, I booted from a live ISO image on a USB stick and manually repartitioned /dev/sda with a GPT disklabel,
created the LVs on the encrypted sda3 partition, ran mkfs.vfat on /dev/sda1 for /boot/efi and mkfs.ext4 on /dev/sda2 for /boot,
then edited /etc/fstab to correct the UUIDs for the mount points to match the newly created filesystems, updated the UUID in the /etc/crypttab file too, and ran grub-install /dev/sda and update-grub (a rough sketch of the commands is below).
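Roughly, the partitioning and filesystem creation from the live session looked like this (reconstructed from memory, so sizes and some options are approximate):

gdisk /dev/sda                      # recreate the GPT partitions (EFI, /boot, LUKS)
cryptsetup luksFormat /dev/sda3
cryptsetup open /dev/sda3 sda3_crypt
pvcreate /dev/mapper/sda3_crypt
vgcreate ub /dev/mapper/sda3_crypt
lvcreate -L <size> -n root ub       # and likewise for home, e and swap
mkfs.vfat /dev/sda1                 # /boot/efi
mkfs.ext4 /dev/sda2                 # /boot
mkfs.ext4 /dev/mapper/ub-root       # and the other LVs (mkswap for the swap LV)

Then I restored the backup onto the new filesystems, chrooted in, and did the fstab/crypttab edits and grub-install/update-grub mentioned above.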
At that stage I should have been able to boot up, but it failed with similar errors, so I booted from the live ISO again, changed /etc/fstab to use the /dev/mapper/ub-root (and other) LV names, and then it booted okay.
grep -v ^# /etc/fstab
/dev/mapper/ub-root / ext4 errors=remount-ro 0 1
/dev/sda2 /boot ext4 defaults 0 2
UUID=0275-84DA /boot/efi vfat umask=0077 0 0
/dev/mapper/ub-home /home ext4 defaults 0 2
/dev/mapper/ub-e /e ext4 defaults 0 2
/dev/mapper/ub-swap none swap sw 0 0
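The only UUID left in there is the one for /boot/efi, which I double-checked against blkid, e.g.:

sudo blkid /dev/sda1   # reports UUID="0275-84DA", matching the fstab entry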
I also removed the UUID related to the resume-from-hibernate entry in the /etc/default/grub file and ran update-grub:
GRUB_DEFAULT=0
#GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR="`lsb_release -i -s 2> /dev/null || echo Debian`"
## GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
## GRUB_CMDLINE_LINUX_DEFAULT="UUID=50f603eb-4b97-4027-a652-dc275336bc6f"
# corrected swap UUID after BMR restore on 3-jun-2022
# GRUB_CMDLINE_LINUX_DEFAULT="UUID=3b470346-9672-4711-837c-ee0090dc4841"
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX=""
But as mentioned, sometimes it won't boot.
I also still had the cable from the removed second drive connected to the motherboard, and I suspected it might be causing intermittent problems with devices on the PCI bus "taking a long time",
so later I removed the cable, and it looked like the problem had gone away, but a couple of days later I had the same boot problem again.
I am at a loss as to what is causing the problem.
It might be an intermittent hardware fault, it might be some sort of software bug, or I might have configured my disk layout since the bare-metal restore in a way that makes it prone to this problem.
At this stage I am considering reinstalling from scratch with 20.04 or 22.04 LTS, but that means I would have to manually install various programs and set up my desktop with the various customisations I have made.
But I suppose I ought to be upgrading anyway, because 18.04 LTS is now end of life.
Has anyone got any ideas as to what my problem might be, please?
Could it be an intermittent hardware problem or some sort of software issue, or have I misconfigured my disk layout in some way?