Troubleshooting Corrupted Disk or Partitions

Backup and Restore

Back Up an Entire Partition

  • Back up the entire partition

dd if=/dev/sda2 of=~/backup_sda2.img bs=4M status=progress

[root@sysrescue ~]# dd if=/dev/sda2 of=~/backup_sda2.img bs=4M status=progress
23+1 records in
23+1 records out
98566144 bytes (99 MB, 94 MiB) copied, 0.450139 s, 219 MB/s
  • Verify the backup image

ls -lh ~/backup_sda2.img

[root@sysrescue ~]# ls -lh ~/backup_sda2.img
-rw-r--r-- 1 root root 94M Mar 25 13:59 /home/root/backup_sda2.img
  • Mount and verify

mkdir ~/mnt_backup

mount -o loop ~/backup_sda2.img ~/mnt_backup

ls ~/mnt_backup

[root@sysrescue ~]# mkdir ~/mnt_backup

[root@sysrescue ~]# mount -o loop ~/backup_sda2.img ~/mnt_backup

[root@sysrescue ~]# ls ~/mnt_backup

EFI  grub  initrd  ng-microcode.cpio  vmlinuz
  • Unmount the image, then compress it for storage

umount ~/mnt_backup

gzip ~/backup_sda2.img
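
Listing the file size confirms that the image exists, but not that it matches the partition. A minimal sketch of a byte-for-byte comparison follows, assuming the partition is unmounted and has not changed since the dd run. The helper name verify_image is illustrative and not part of the original procedure; it accepts either the raw image or the gzip-compressed one.

```shell
# verify_image SRC IMG
# Hypothetical helper: succeeds only if IMG (raw or .gz) is a
# byte-for-byte copy of SRC.
verify_image() {
    src=$1
    img=$2
    case $img in
        # Decompress to stdout and compare the stream with the source.
        *.gz) gunzip -c "$img" | cmp -s - "$src" ;;
        # Compare the raw image directly.
        *)    cmp -s "$img" "$src" ;;
    esac
}

# Example, run as root:
#   verify_image /dev/sda2 ~/backup_sda2.img.gz && echo "backup matches"
```

The function returns the exit status of cmp, so it can be used directly in an if statement or before deleting any other copy of the data.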

Create a missing partition

Creating Partitions with fdisk on /dev/sda

1. Open fdisk

fdisk /dev/sda

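2. Create Each Partition

Note: the hex codes in these steps (ef02, 06, 83, 82) are MBR/gdisk-style codes. On a GPT disk label such as this one, the fdisk t command expects a type number, alias, or UUID instead; press L at the type prompt to list the valid values.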

Partition 1: BIOS Boot (2M)

  • Press n to create a new partition  

  • Partition number: 1  

  • First sector: 2048  

  • Last sector: 6143  

  • Type: Default  

  • Change its type to BIOS boot (ef02):

    • Press t  

    • Enter 1 (partition number)  

    • Enter ef02 (hex code)  

Partition 2: FAT16 (98.6M) - zpe_grub1

  • Press n  

  • Partition number: 2  

  • First sector: 8192  

  • Last sector: 200703  

  • Type: Default  

  • Change its type to FAT16 (06):  

    • Press t  

    • Enter 2 (partition number)  

    • Enter 06 (hex code)  

Partition 3: EXT4 (101M) - zpe_cnf

  • Press n  

  • Partition number: 3  

  • First sector: 203125  

  • Last sector: 400390  

  • Type: Default  

Partition 4: BTRFS (4.9G) - zpe_rootfs

  • Press n  

  • Partition number: 4  

  • First sector: 401408  

  • Last sector: 10168319  

  • Type: Default  

  • Change its type to Linux Root (83):  

    • Press t  

    • Enter 4 (partition number)  

    • Enter 83 (hex code)  

Partition 5: EXT4 (99.6M) - zpe_user

  • Press n  

  • Partition number: 5  

  • First sector: 10170368  

  • Last sector: 10364927  

  • Type: Default  

Partition 6: Linux Swap (64M) - zpe_swap

  • Press n  

  • Partition number: 6  

  • First sector: 10366976  

  • Last sector: 10491903  

  • Type: Default  

  • Change its type to Linux Swap (82):  

    • Press t  

    • Enter 6 (partition number)  

    • Enter 82 (hex code)  

Partition 7: FAT16 (500M) - zpe_grub2

  • Press n  

  • Partition number: 7  

  • First sector: 10493952  

  • Last sector: 11470847  

  • Type: Default  

  • Change its type to FAT16 (06):  

    • Press t  

    • Enter 7 (partition number)  

    • Enter 06 (hex code)  

Partition 8: EXT4 (26.1G) - zpe_var

  • Press n  

  • Partition number: 8  

  • First sector: 11472896  

  • Last sector: 62531583  

  • Type: Default  

3. Write Changes and Exit

After creating all partitions, write changes to the disk:  

Press w to save and exit  

4. Verify the Partitions

fdisk -l /dev/sda

root@nodegrid:~# fdisk -l /dev/sda
Disk /dev/sda: 29.82 GiB, 32017047552 bytes, 62533296 sectors
Disk model: Apacer 32GB SDM7
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3C2AFE07-2E0D-4AFB-8C77-B6FEBC1618F2

Device        Start      End  Sectors  Size Type
/dev/sda1      2048     6143     4096    2M BIOS boot
/dev/sda2      8192   200703   192512   94M Linux filesystem
/dev/sda3    203125   400390   197266 96.3M Linux filesystem
/dev/sda4    401408 10168319  9766912  4.7G EFI System
/dev/sda5  10170368 10364927   194560   95M Linux filesystem
/dev/sda6  10366976 10491903   124928   61M Linux filesystem
/dev/sda7  10493952 11470847   976896  477M EFI System
/dev/sda8  11472896 62531583 51058688 24.3G Linux filesystem
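
The sizes in the step headings are in decimal megabytes (as reported by parted), while fdisk prints binary units, so the same partition appears as 98.6M in one place and 94M in another. The arithmetic can be sketched as follows, assuming the 512-byte logical sector size shown above. The helper names part_bytes and is_aligned are illustrative only.

```shell
SECTOR_SIZE=512   # logical sector size reported by fdisk for this disk

# part_bytes START END: size in bytes of a partition spanning
# sectors START..END inclusive.
part_bytes() {
    echo $(( ($2 - $1 + 1) * SECTOR_SIZE ))
}

# is_aligned START: succeeds if START falls on a 1 MiB boundary
# (2048 sectors of 512 bytes).
is_aligned() {
    [ $(( $1 % 2048 )) -eq 0 ]
}

part_bytes 8192 200703     # sda2: prints 98566144
is_aligned 8192   && echo "sda2 start is 1 MiB aligned"
is_aligned 203125 || echo "sda3 start is not 1 MiB aligned"
```

The 98566144 bytes computed for sda2 equal 94 MiB and about 98.6 MB, and match the byte count that dd reported when backing up the partition.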

Format a Partition

  • Format FAT16 partitions:

    mkfs.vfat -F16 /dev/sda2
    mkfs.vfat -F16 /dev/sda7

  • Format EXT4 partitions:

    mkfs.ext4 /dev/sda3
    mkfs.ext4 /dev/sda5
    mkfs.ext4 /dev/sda8

  • Format BTRFS partition:

    mkfs.btrfs /dev/sda4

  • Format Swap partition:

    mkswap /dev/sda6
    swapon /dev/sda6
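
Because formatting the wrong device destroys its contents, it can help to print the full list of commands and review it before running anything. The following is a dry-run sketch that only echoes the commands listed above; the helper name mkfs_cmd and the short type names (fat16, ext4, btrfs, swap) are illustrative.

```shell
# mkfs_cmd FSTYPE DEVICE: print the format command this guide uses
# for FSTYPE. Nothing is executed.
mkfs_cmd() {
    case $1 in
        fat16) echo "mkfs.vfat -F16 $2" ;;
        ext4)  echo "mkfs.ext4 $2" ;;
        btrfs) echo "mkfs.btrfs $2" ;;
        swap)  echo "mkswap $2" ;;
        *)     echo "unknown file system type: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the commands for the layout above without touching the disk.
while read -r dev fstype; do
    mkfs_cmd "$fstype" "$dev"
done <<'EOF'
/dev/sda2 fat16
/dev/sda3 ext4
/dev/sda4 btrfs
/dev/sda5 ext4
/dev/sda6 swap
/dev/sda7 fat16
/dev/sda8 ext4
EOF
```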

Restore a Partition

  • If required, restore the partition from the image file

dd if=~/backup_sda2.img of=/dev/sda2 bs=4M status=progress

root@nodegrid:~# dd if=~/backup_sda2.img of=/dev/sda2 bs=4M status=progress
23+1 records in
23+1 records out
98566144 bytes (99 MB, 94 MiB) copied, 0.484602 s, 203 MB/s
  • Or restore it from the compressed file

gunzip -c ~/backup_sda2.img.gz | dd of=/dev/sda2 bs=4M status=progress

root@nodegrid:~# gunzip -c ~/backup_sda2.img.gz | dd of=/dev/sda2 bs=4M status=progress
23+1 records in
23+1 records out
98566144 bytes (99 MB, 94 MiB) copied, 0.484602 s, 203 MB/s
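
dd does not check that the image fits the target before it starts writing, so a size check beforehand is a cheap safeguard when the partition has been recreated. A minimal sketch, assuming GNU stat and the util-linux blockdev tool are available; the helper names size_of and image_fits are illustrative.

```shell
# size_of PATH: size in bytes of a block device or a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        stat -c %s "$1"
    fi
}

# image_fits IMG TARGET: succeeds if IMG is no larger than TARGET.
# Returns 2 if either size cannot be determined.
image_fits() {
    img_size=$(size_of "$1") || return 2
    target_size=$(size_of "$2") || return 2
    [ "$img_size" -le "$target_size" ]
}

# Example, run as root:
#   image_fits ~/backup_sda2.img /dev/sda2 && echo "image fits"
```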

Collecting Information

List Block Devices and Partitions

  • Shows information about file systems, UUIDs, and mount points.

lsblk -f /dev/sda

[root@sysrescue ~]# lsblk -f /dev/sda
NAME FSTYPE FSVER LABEL     UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sda
├─sda1
│
├─sda2
│    vfat   FAT16 ZPE_BOOT1 A1C8-1290
├─sda3
│    ext4   1.0             438e59f0-df4f-43f0-b273-7d10f491a528
├─sda4
│    btrfs                  c6c50953-a4b1-49c2-bce6-022a4ffa56cf
├─sda5
│    ext4   1.0             4379c837-f093-426c-81c1-28601b16b242
├─sda6
│    swap   1               2ef6ecaf-009b-4a6c-be32-d86e55b62c6b
├─sda7
│    vfat   FAT16 ZPE_BOOT2 BD0C-04C6
└─sda8
     ext4   1.0             ec1964ab-b713-4781-91ef-c3a8a34a1043
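
A partition whose FSTYPE column is empty has no recognizable file system signature, which is a quick hint that it needs to be formatted or restored. The following sketch filters the raw lsblk output for such partitions; the helper name parts_without_fs is illustrative. On this layout sda1 (BIOS boot) is expected to appear, because it never carries a file system.

```shell
# parts_without_fs: read "NAME TYPE FSTYPE" rows on stdin (lsblk raw
# format) and print the partitions that have no file system signature.
parts_without_fs() {
    awk '$2 == "part" && $3 == "" { print $1 }'
}

# Example on the live system:
#   lsblk -rno NAME,TYPE,FSTYPE /dev/sda | parts_without_fs
```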

Check Partitions

  • Check partition table

parted -l

[root@sysrescue ~]# parted -l
Model: ATA Apacer 32GB SDM7 (scsi)
Disk /dev/sda: 32.0GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system     Name        Flags
 1      1049kB  3146kB  2097kB                  ptable      bios_grub
 2      4194kB  103MB   98.6MB  fat16           zpe_grub1
 3      104MB   205MB   101MB   ext4            zpe_cnf
 4      206MB   5206MB  5001MB  btrfs           zpe_rootfs  boot, esp
 5      5207MB  5307MB  99.6MB  ext4            zpe_user
 6      5308MB  5372MB  64.0MB  linux-swap(v1)  zpe_swap
 7      5373MB  5873MB  500MB   fat16           zpe_grub2   boot, esp
 8      5874MB  32.0GB  26.1GB  ext4            zpe_var
  • Check the partition layout

fdisk -l /dev/sda

[root@sysrescue ~]# fdisk -l /dev/sda
Disk /dev/sda: 29.82 GiB, 32017047552 bytes, 62533296 sectors
Disk model: Apacer 32GB SDM7
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3C2AFE07-2E0D-4AFB-8C77-B6FEBC1618F2

Device        Start      End  Sectors  Size Type
/dev/sda1      2048     6143     4096    2M BIOS boot
/dev/sda2      8192   200703   192512   94M Linux filesystem
/dev/sda3    203125   400390   197266 96.3M Linux filesystem
/dev/sda4    401408 10168319  9766912  4.7G EFI System
/dev/sda5  10170368 10364927   194560   95M Linux filesystem
/dev/sda6  10366976 10491903   124928   61M Linux filesystem
/dev/sda7  10493952 11470847   976896  477M EFI System
/dev/sda8  11472896 62531583 51058688 24.3G Linux filesystem

Check Partition Issues

  • For ext2/ext3/ext4 partitions. The -n option checks the file system without repairing it and only reports the errors it finds.

fsck -n /dev/sda3

[root@sysrescue ~]# fsck -n /dev/sda3
fsck from util-linux 2.40.2
e2fsck 1.47.1 (20-May-2024)
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/sda3 has been mounted 1 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda3: 3059/24672 files (1.4% non-contiguous), 38683/98305 blocks
  • For btrfs partitions. btrfs check runs in read-only mode unless --repair is given.

btrfs check /dev/sda4

[root@sysrescue ~]# btrfs check /dev/sda4
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: c6c50953-a4b1-49c2-bce6-022a4ffa56cf
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 1312673792 bytes used, no error found
total csum bytes: 1194052
total tree bytes: 89964544
total fs tree bytes: 84262912
total extent tree bytes: 4145152
btree space waste bytes: 13795699
file data blocks allocated: 1222709248
referenced 3302412288
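
When fsck -n is run from a script, its exit status carries the result of the check. The status is a bit mask documented in the fsck manual page, and it can be decoded as in the sketch below. The helper name describe_fsck_status is illustrative, and the decoding applies to fsck, not to btrfs check.

```shell
# describe_fsck_status CODE: decode the fsck exit status bit mask.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( code & 1 ))   -ne 0 ] && echo "file system errors corrected"
    [ $(( code & 2 ))   -ne 0 ] && echo "system should be rebooted"
    [ $(( code & 4 ))   -ne 0 ] && echo "file system errors left uncorrected"
    [ $(( code & 8 ))   -ne 0 ] && echo "operational error"
    [ $(( code & 16 ))  -ne 0 ] && echo "usage or syntax error"
    [ $(( code & 32 ))  -ne 0 ] && echo "checking canceled by user request"
    [ $(( code & 128 )) -ne 0 ] && echo "shared-library error"
    return 0
}

# Example on the live system:
#   fsck -n /dev/sda3; describe_fsck_status $?
```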

Retrieve SMART Information

  • Displays all SMART information about the disk.

smartctl -a -x /dev/sda

  • Check disk health status

smartctl -H -A /dev/sda

[root@sysrescue ~]# smartctl -H -A /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.63-1-lts] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
SMART Attributes Data Structure revision number: 0
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1958
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1597
163 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       258
164 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       190
166 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
167 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
168 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
175 Program_Fail_Count_Chip 0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1588
194 Temperature_Celsius     0x0022   005   005   030    Old_age   Always   FAILING_NOW 95 (Min/Max 50/95)
231 Unknown_SSD_Attribute   0x0012   100   100   000    Old_age   Always       -       94
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       9822465098
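
In the sample output above the overall health result is PASSED, yet attribute 194 (Temperature_Celsius) is marked FAILING_NOW, so the attribute table is worth scanning as well. The sketch below prints every attribute whose WHEN_FAILED column is not "-"; the helper name failing_attributes is illustrative, and the column positions assume the table format shown above.

```shell
# failing_attributes: read `smartctl -A` output on stdin and print the
# ID, name, and WHEN_FAILED value of every attribute that has failed.
# Attribute rows start with a numeric ID and have a 0x... flag in column 3;
# WHEN_FAILED is column 9.
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && $9 != "-" { print $1, $2, $9 }'
}

# Example on the live system:
#   smartctl -A /dev/sda | failing_attributes
```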
  • Runs a long self-test on the disk.

smartctl -t long /dev/sda

[root@sysrescue ~]# smartctl -t long /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.63-1-lts] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 6 minutes for test to complete.
Test will complete after Tue Mar 25 19:11:51 2025 UTC
Use smartctl -X to abort test.
  • Get the self-test results.

smartctl -l selftest /dev/sda

[root@sysrescue ~]# smartctl -l selftest /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.63-1-lts] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1958         -
# 2  Offline             Completed without error       00%         0         -
# 3  Offline             Completed: unknown failure    00%     28513         20203
# 4  Offline             Completed without error       00%         0         -
# 5  Vendor (0x4e)       Completed without error       00%         8         -
# 6  Offline             Unknown status (0xa)          00%      1511         -
# 7  Offline             Completed without error       00%         0         -
# 8  Vendor (0xe8)       Unknown status (0xe)          110%        78         -
# 9  Offline             Completed without error       00%     40960         -
#10  Offline             Completed without error       00%         0         -
#11  Vendor (0x64)       Unknown status (0xc)          150%        78         -
#12  Offline             Completed without error       00%         0         -
#13  Reserved (0x08)     Completed without error       00%       104         -
#14  Vendor (0xe7)       Completed: electrical failure 130%     31528         3147776
#15  Offline             Completed without error       00%         0         -
#16  Reserved (0x08)     Completed without error       00%         0         -
#17  Offline             Completed without error       00%       768         -
#18  Offline             Completed without error       00%         0         -
#19  Vendor (0x6b)       Completed without error       00%      2056         -
#20  Offline             Unknown status (0xa)          00%     24807         -
#21  Offline             Completed without error       00%         0         -
2 of 2 failed self-tests are outdated by newer successful extended offline self-test # 1
  • Display the error log of the disk.

smartctl -l error /dev/sda

[root@sysrescue ~]# smartctl -l error /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.63-1-lts] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 4
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 4 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  31 54 00 40 61 6f 3c

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  58 53 65 00 08 48 00 d8  42d+17:19:25.696  [RESERVED]
  00 00 00 00 00 00 00 00  18d+05:37:40.864  NOP [Abort queued commands]
  53 00 00 a0 e7 61 d8 00      00:08:44.389  [RESERVED]
  00 42 65 00 08 00 00 44      00:00:00.000  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 00 00 00 00 00 00  17d+15:38:49.216  NOP [Abort queued commands]

Error 3 occurred at disk power-on lifetime: 8 hours (0 days + 8 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  00 03 00 00 00 00 00  Error:

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 84 00 00 00 00 00 00  12d+14:55:19.040  NOP [Reserved subcommand] [OBS-ACS-2]
  6f b6 00 fd 00 40 61 00      01:50:01.900  [RESERVED]
  08 5d b8 ba 64 00 08 e7  24d+20:31:52.320  DEVICE RESET
  00 00 00 00 00 00 00 08  31d+01:39:14.560  NOP [Abort queued commands]

Error 2 occurred at disk power-on lifetime: 100 hours (4 days + 4 hours)
  When the command that caused the error occurred, the device was in a reserved state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  00 5a 00 00 00 a0 e7

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 05 b8 ba 64 00 08 e7      00:00:00.000  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 00 00 00 00 00 08  31d+01:39:14.560  NOP [Abort queued commands]
  00 00 00 00 00 00 00 00      11:59:48.992  NOP [Abort queued commands]
  00 50 00 00 00 00 00 40      00:00:00.000  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 08 00 70 00 80 64      04:36:20.790  NOP [Abort queued commands]

Error 1 occurred at disk power-on lifetime: 48300 hours (2012 days + 12 hours)
  When the command that caused the error occurred, the device was in a vendor specific state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  00 a0 00 00 00 00 00  Device Fault

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 a0 e7 61 b8 ba 64 00      00:00:00.008  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 08 00 00 00 00 64      00:00:00.000  NOP [Abort queued commands]
  a0 00 00 00 00 00 00 00  36d+06:10:44.071  PACKET
  00 a0 e7 05 b8 ba 64 00      00:00:00.008  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 08 00 00 00 00 34      00:00:00.000  NOP [Abort queued commands]

Error 0 occurred at disk power-on lifetime: 22528 hours (938 days + 16 hours)
  When the command that caused the error occurred, the device was in a reserved state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  a0 00 e7 61 d8 53 65

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 00 08 00 00 00 00 65      00:00:00.000  NOP [Abort queued commands]
  a0 00 00 00 00 00 00 00  12d+20:49:35.975  PACKET
  00 a0 e7 05 44 42 65 00      00:00:00.008  NOP [Reserved subcommand] [OBS-ACS-2]
  00 02 08 00 00 00 00 93      00:00:00.000  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 00 00 00 00 00 00      13:58:51.648  NOP [Abort queued commands]

Check for Bad Blocks

  • Scans the disk for bad sectors. By default, badblocks performs a read-only test.

badblocks /dev/sda

Check Data Checksums in BTRFS

  • Runs the check in read-only mode and verifies the data blocks against their checksums.

btrfs check --readonly --check-data-csum /dev/sda4

[root@sysrescue ~]# btrfs check --readonly --check-data-csum  /dev/sda4
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: c6c50953-a4b1-49c2-bce6-022a4ffa56cf
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking csums against data
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 1312673792 bytes used, no error found
total csum bytes: 1194052
total tree bytes: 89964544
total fs tree bytes: 84262912
total extent tree bytes: 4145152
btree space waste bytes: 13795699
file data blocks allocated: 1222709248
referenced 3302412288

List Files with Checksum Errors

  • Search the kernel log for BTRFS error messages.

dmesg | grep -i btrfs | grep -i error
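
The kernel log names the affected inode rather than the file path, so a second step is needed to get from the log to a file name. The sketch below extracts the root and inode numbers from checksum failure messages. It assumes messages of the form "csum failed root 5 ino 257 off ...", which can vary between kernel versions; the helper name csum_error_inodes and the mount point /mnt/rootfs are hypothetical.

```shell
# csum_error_inodes: read kernel log lines on stdin and print the unique
# "root inode" pairs named in BTRFS checksum failures.
# Assumed message format: "... csum failed root 5 ino 257 off ...".
csum_error_inodes() {
    sed -n 's/.*csum failed root \(-\{0,1\}[0-9][0-9]*\) ino \([0-9][0-9]*\).*/\1 \2/p' | sort -u
}

# Example on the live system, with the file system mounted at a
# hypothetical mount point:
#   dmesg | csum_error_inodes
#   btrfs inspect-internal inode-resolve 257 /mnt/rootfs
```

The full dmesg output is piped in here instead of the grep result, since checksum failures may be logged at warning level and would not contain the word "error".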

Repair Partitions

Repair ext2/ext3/ext4 Partitions

fsck -y /dev/sda3

fsck -y /dev/sda5

fsck -y /dev/sda8
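
Repairing a mounted file system can cause further damage, so it is worth confirming that each partition is unmounted first. The following sketch checks /proc/mounts before each repair and only prints what it would do; the helper name is_mounted is illustrative, and the device list assumes the ext4 partitions shown in the lsblk output.

```shell
# is_mounted DEVICE [MOUNTS_FILE]: succeeds if DEVICE appears as a mount
# source in MOUNTS_FILE (default: /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Dry run over the ext4 partitions: report which ones could be repaired.
for dev in /dev/sda3 /dev/sda5 /dev/sda8; do
    if is_mounted "$dev"; then
        echo "skipping $dev: unmount it first"
    else
        echo "would run: fsck -y $dev"   # replace echo with the real command
    fi
done
```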

Repair FAT16 Partitions

fsck.vfat -a /dev/sda2

fsck.vfat -a /dev/sda7

Repair BTRFS Partitions

btrfs check --repair /dev/sda4

Repair SWAP Partitions

swapoff /dev/sda6

mkswap /dev/sda6

swapon /dev/sda6