Two drives have failed in your Synology SHR-1 array. DSM reports the storage pool as Crashed and states that data cannot be recovered. This article covers SHR-1 data recovery after a double drive failure, and the situation is more nuanced than DSM’s message suggests. Whether your data is recoverable depends not on the count of failed drives alone, but on how they failed and what state the surviving drives are in.

Find Your Situation First
Before reading further, use this table to locate your specific scenario and skip directly to the relevant section:
| What you observe | Likely situation | Recovery outlook | Go to |
|---|---|---|---|
| Both drives appear in OS, superblocks readable | Logical failure — drives physically intact | Possible | Level 1 → Level 2 |
| One drive readable, one not detected | One logical + one physical failure | Partial | Level 2 → Level 3 |
| Second drive failed during rebuild after first | Classic double failure — partial data loss likely | Partial | Level 2 |
| One or both drives click, grind, not detected | Physical failure — heads or platters | Lab required | Level 3 |
Why Two Failures Break SHR-1 Mathematically
SHR-1 uses a single parity set — conceptually equivalent to RAID 5 in terms of fault tolerance. For each stripe across the array, one drive holds the XOR parity of the others. When one drive fails, any missing data block can be recalculated: take the parity block and XOR it against all surviving data blocks, and the missing value is recovered. This works because there is one unknown and one equation.
With two drives failed, there are two unknowns per stripe and still only one parity equation. The math has no solution. No software, regardless of how it is marketed, can reconstruct two independently unknown values from a single XOR relationship. This is not a limitation of any particular tool — it is a property of the parity scheme itself.
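The one-equation argument can be checked with a few lines of shell arithmetic. The sketch below is a toy model, assuming a stripe of three one-byte data blocks plus one parity byte; the values and variable names are made up for illustration and say nothing about real chunk sizes or layouts.

```shell
# Toy stripe (illustrative values): three data bytes and their XOR parity.
d0=0x3c; d1=0xa5; d2=0x5a
parity=$(( $d0 ^ $d1 ^ $d2 ))

# One member lost (d1): XOR the parity with the survivors to get it back.
recovered_d1=$(( parity ^ $d0 ^ $d2 ))
printf 'd1 recovered: 0x%02x\n' "$recovered_d1"

# Two members lost (d1 and d2): the only thing left is the value of d1 XOR d2.
d1_xor_d2=$(( parity ^ $d0 ))

# Count how many (x, y) byte pairs satisfy that single equation.
x=0; count=0
while [ "$x" -lt 256 ]; do
    y=$(( x ^ d1_xor_d2 ))          # every x has exactly one matching y
    [ $(( x ^ y )) -eq "$d1_xor_d2" ] && count=$(( count + 1 ))
    x=$(( x + 1 ))
done
printf 'pairs consistent with the parity: %d\n' "$count"
```

With one member missing there is a single answer. With two missing, every possible byte value for the first drive has a matching value for the second, so the parity alone cannot tell the real pair from the other candidates.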
What software can do is recover data from stripes where only one of the two failed drives contributed. If your files happen to be stored on stripes that do not touch both failed drives, they are recoverable. Files that span stripes involving both failed members are not. The proportion of recoverable data depends on the size and distribution of the failed drives within the array — which is why partial recovery is often possible even when full recovery is not.
SHR-2 stores two independent parity sets per stripe — analogous to RAID 6 — and tolerates any two simultaneous drive failures without data loss. However, SHR-2 is not immune to this scenario: three simultaneous failures break its parity math in the same way two failures break SHR-1. If you are running SHR-2 and have lost three drives, the same logic below applies. For a comparison of SHR-1 and SHR-2 fault tolerance in practice, see our article on SHR/SHR-2 recovery after Synology hardware failure.
How the Two Drives Failed Matters
Not all double failures are equivalent. The sequence and nature of the failures determine what recovery is possible:
Simultaneous failure — power event or controller
Both drives stopped responding at the same moment. The data on each drive is likely physically intact — the array simply lost quorum. This is the most recoverable version of a double failure because neither drive’s data was partially overwritten.
Second drive failed during rebuild
The first drive failed, the array ran degraded, a rebuild started — and the second drive failed partway through. New parity was being written to the replacement drive as the second original drive died. Stripes that were rebuilt before the second failure are intact; stripes that were being rebuilt at the time of the second failure are corrupted. Partial recovery is typically possible.
First drive failed long ago, unnoticed
The array ran in degraded mode — without redundancy — for an extended period, and then the second drive failed. If the first failure went undetected, there may have been additional writes to degraded stripes in the interim. Recovery chances depend entirely on the physical state of the second failed drive. See our article on recognizing early signs of hard drive failure for how to catch this earlier.
Assess the Physical State of Each Drive
Before any software attempt, determine whether the drives are physically readable. Power down the NAS cleanly if it is still running — do not force-reboot, and do not insert a replacement drive, which would trigger DSM to attempt another rebuild. Connect each drive individually to a recovery machine.
If any drive produces audible clicking, grinding, or repeated failed spin-up attempts — power it off immediately and do not attempt further reads. Each spin-up cycle on a drive with damaged heads increases platter damage. Go directly to Level 3.
For drives that power up silently, run mdadm --examine on each device partition:
    mdadm --examine /dev/sdb3
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    Array UUID : 4b2f8e1a:7c3d9f02:1a4b8c3d:9e2f7b01
    Name : DiskStation:2
    Creation Time : Fri Mar 10 14:22:31 2023
    Raid Level : raid5
    Raid Devices : 3
    Avail Dev Size : 7813959680
    Array Size : 15627862016
    Used Dev Size : 7813931008
    Data Offset : 2048 sectors
    Super Offset : 8 sectors
    Unused Space : before=1968 sectors, after=0 sectors
    State : clean, degraded
    Device UUID : 9d3f1c2a:4e8b7f03:2c1d9e4b:7a3f8c01
    Update Time : Mon Jun 2 09:14:22 2025
    Bad Block Log : 512 entries available at offset 16 sectors
    Checksum : 3f8a1b2c - correct
    Events : 247
Look for three things: the Array UUID must match across all drives — if it does not, a drive is from a different array or its superblock is corrupt. The Events counter should be similar across members — a large divergence indicates one drive missed a significant number of write operations. The State field will show clean, degraded on a drive that was part of an array that lost a member.
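Comparing these fields by eye across several drives is error-prone, so a small helper can pull them out. The sketch below is one possible approach, assuming the "Name : value" layout shown in the output above; md_field is a hypothetical helper name, not part of mdadm.

```shell
# md_field NAME
# Reads `mdadm --examine` output on stdin and prints the value of one field.
# Assumes fields are printed as "Name : value" (hypothetical helper).
md_field() {
    awk -F' : ' -v key="$1" '
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
        name == key { print $2; exit }'
}

# Example use on real members (needs root, so left commented out here):
# for part in /dev/sdb3 /dev/sdc3 /dev/sdd3; do
#     out=$(mdadm --examine "$part")
#     printf '%s uuid=%s events=%s\n' "$part" \
#         "$(printf '%s\n' "$out" | md_field 'Array UUID')" \
#         "$(printf '%s\n' "$out" | md_field 'Events')"
# done
```

One line per member makes a mismatched UUID or an Events counter that lags far behind the others easy to spot.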
Also open the built-in S.M.A.R.T. monitor in RS RAID Retrieve at this point. Reallocated Sector Count and Pending Sectors on a surviving drive matter: a drive with accumulating bad sectors during an already-degraded period may have produced silent read errors that affected data integrity before the second failure occurred. For background on interpreting S.M.A.R.T. data, see our article on hard drive failure signs.
Three Levels of Recovery
Level 1 is manual assembly from the terminal. Only attempt this path if mdadm --examine returns valid superblocks with matching UUIDs on all drives. The terminal approach gives you direct control over every step but has hard limits: it requires all drives to be physically present and readable, it has no fallback when superblocks are damaged, and it cannot indicate in advance which files are corrupted. Image every drive with ddrescue before running any of the commands below, and work exclusively from the images.
Step 1 — Image each drive before anything else
    # Install ddrescue if not present
    apt-get install -y gddrescue

    # Image each drive to a separate file; run once per member
    ddrescue -d -r3 /dev/sdb /mnt/images/sdb.img /mnt/images/sdb.log
    ddrescue -d -r3 /dev/sdc /mnt/images/sdc.img /mnt/images/sdc.log
    ddrescue -d -r3 /dev/sdd /mnt/images/sdd.img /mnt/images/sdd.log
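The -d flag makes ddrescue read the drive with direct disc access, bypassing the kernel cache, and -r3 tells it to retry unreadable sectors three times before marking them as failed. The .log file is ddrescue’s mapfile: it records which areas have been read and allows the imaging to be resumed if interrupted. Do not skip this step. A drive with hidden bad sectors will degrade further under the sustained read load of array reconstruction.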
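Once imaging finishes, the mapfile also tells you how much of each drive could not be read. The sketch below is a rough way to total that, assuming the usual GNU ddrescue mapfile layout: comment lines starting with #, one current-status line, then hexadecimal position/size/status triples where a - status marks bad sectors. bad_bytes is a hypothetical helper name.

```shell
# bad_bytes MAPFILE
# Sums the sizes of all blocks marked "-" (bad sector) in a ddrescue mapfile.
# The function body runs in a subshell so its variables do not leak.
bad_bytes() (
    total=0
    seen_status=0
    while read -r pos size status; do
        case "$pos" in
            ''|'#'*) continue ;;           # skip blank and comment lines
        esac
        if [ "$seen_status" -eq 0 ]; then  # first data line is the current-status line
            seen_status=1
            continue
        fi
        [ "$status" = "-" ] && total=$(( total + $size ))
    done < "$1"
    echo "$total"
)

# Example: bad_bytes /mnt/images/sdb.log
```

A result of 0 means the image is a complete copy. Any other value is the number of bytes that stayed unreadable after the retries, and an image with a large total is a candidate for Level 3 rather than further software attempts.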
Step 2 — Verify superblocks across all members
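    # The md superblock sits on the data partition, not at the start of the disk image.
    # Attach each image read-only (-r) with a partition scan (-P), examine partition 3, then detach.
    for img in sdb sdc sdd; do
        dev=$(losetup -r -P --find --show "/mnt/images/$img.img")
        mdadm --examine "${dev}p3"
        losetup -d "$dev"
    done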
Confirm three things before proceeding. The Array UUID must be identical across all images — a mismatch means a drive is from a different array or its superblock is corrupt. The Events counter should be close across all members — a divergence of hundreds or thousands means one drive missed a significant sequence of writes. The Raid Devices count must match the expected number of members.
Step 3 — Force assembly from image files
    # Attach each image as a loop device; -P scans the partition table so /dev/loopNp3 exists
    losetup -P /dev/loop0 /mnt/images/sdb.img
    losetup -P /dev/loop1 /mnt/images/sdc.img
    losetup -P /dev/loop2 /mnt/images/sdd.img

    # Force assembly: specify the data partitions, not the whole device
    mdadm --assemble --force /dev/md0 /dev/loop0p3 /dev/loop1p3 /dev/loop2p3

    # Check assembly result
    cat /proc/mdstat
    mdadm --detail /dev/md0
--force tells mdadm to assemble the array even when the metadata on some members is out of date, which it would otherwise refuse to do. The resulting array is degraded. Stripes that involved both failed drives contain incorrect parity-reconstructed data, and files on those stripes will be corrupt or zero-byte when copied. Files on stripes that touched only one failed drive are intact and readable. There is no way to know from the outside which files fall into which category before attempting to copy them.
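Reading the assembly result out of /proc/mdstat can be scripted. The sketch below is a minimal example, assuming the common mdstat layout in which the line after the array name ends with a member map such as [U_U]; missing_members is a hypothetical helper, and the sample text in the comment is illustrative rather than captured from a real system.

```shell
# missing_members MDNAME
# Reads /proc/mdstat text on stdin. Prints "inactive" if the array did not start,
# otherwise the number of "_" (missing) slots in its [U_U]-style member map.
missing_members() {
    awk -v md="$1" '
        $1 == md {
            found = 1
            if ($3 == "inactive") { print "inactive"; exit }
            next
        }
        found && match($0, /\[[U_]+\]/) {
            map = substr($0, RSTART + 1, RLENGTH - 2)
            print gsub(/_/, "", map)
            exit
        }'
}

# Example: missing_members md0 < /proc/mdstat
# Illustrative input:
#   md0 : active raid5 loop2p3[2] loop0p3[0]
#         15627862016 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [U_U]
```

A result of 1 is the expected degraded state after a forced assembly. "inactive" means the array did not start and the terminal path has reached its limit.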
Step 4 — Activate LVM and mount read-only
    # Activate LVM volume group on the assembled array
    pvscan
    vgchange -ay

    # List logical volumes to find the correct device path
    lvs

    # Mount read-only; never mount rw on a damaged array
    mount /dev/vg1/volume_1 /mnt/data -o ro
Always mount with -o ro. On ext4, add noload as well (-o ro,noload), because a plain read-only mount still replays the journal and therefore writes to the array. Writing to a force-assembled array in this state will extend corruption into previously intact stripes. Once mounted, verify directory structure and file sizes before starting any copy operation. Corrupted files surface as read errors, I/O errors, or zero-byte output during copy; there is no prior indicator.
Step 5 — Copy with error handling
    # Copy with rsync and keep a full log; rsync reports unreadable files and carries on
    rsync -av /mnt/data/ /mnt/destination/ 2>&1 | tee /mnt/rsync.log

    # Review what was skipped
    grep -i "error\|failed\|cannot" /mnt/rsync.log
When rsync hits a file it cannot read, it reports the error, moves on to the next file, and returns a non-zero exit status at the end, which maximises the total amount of data recovered. The --ignore-errors option is not needed for this: it only controls whether --delete proceeds after I/O errors. Review the log afterward to identify which files could not be copied; these are the ones that landed on stripes involving both failed drives.
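Besides the log, rsync’s exit status summarises how the copy went: 0 means everything transferred, 23 means a partial transfer due to errors, and 24 means some source files vanished during the run. The sketch below wraps that in a small function; rsync_verdict is a hypothetical helper name. Because the copy command pipes into tee, the status has to be read from the first element of the pipeline (in bash, ${PIPESTATUS[0]}) rather than from $?.

```shell
# rsync_verdict EXIT_CODE
# Translates the rsync exit codes most relevant to a damaged source into a short verdict.
rsync_verdict() {
    case "$1" in
        0)  echo "complete" ;;
        23) echo "partial: some files could not be read" ;;
        24) echo "partial: some source files vanished" ;;
        *)  echo "failed (rsync exit code $1)" ;;
    esac
}

# Example (bash), run directly after the rsync | tee pipeline:
#   rsync_verdict "${PIPESTATUS[0]}"
```

A verdict of "partial" is the normal outcome here: it means the copy ran to the end and the log contains the list of files that fell on unrecoverable stripes.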
When to stop and move to Level 2:
- mdadm --examine shows mismatched UUIDs between drives
- --force assembly fails or produces an inactive array in /proc/mdstat
- LVM activation finds no volume group
- The filesystem mounts but shows an empty or corrupted directory tree
Any of these means the superblock or LVM metadata is too damaged for manual recovery. RS RAID Retrieve’s RAID Constructor handles these cases.
Level 2 uses RS RAID Retrieve, which handles the full sequence (S.M.A.R.T. assessment, drive imaging, array reconstruction, LVM activation, and filesystem access) within a single application on Windows, Linux, or macOS. It covers two distinct reconstruction paths depending on the state of the mdadm superblocks.
S.M.A.R.T. assessment — before any read operation
Connect all drives to the recovery machine and open the built-in S.M.A.R.T. monitor before scanning. The attributes that matter most in this scenario are Reallocated Sector Count (ID 05), Current Pending Sector Count (ID C5), and Uncorrectable Sector Count (ID C6). Non-zero values on any of these — especially on drives that appear to be the “healthy” survivors — indicate sectors that were already failing during the degraded period before the second drive died. A surviving drive with elevated pending sectors is a drive that may produce read errors partway through a multi-hour reconstruction scan.
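The same three attributes can be cross-checked from a terminal. The sketch below swaps in smartctl from smartmontools instead of the built-in monitor, and assumes the standard ten-column attribute table printed by smartctl -A, where IDs are shown in decimal (5, 197, and 198 correspond to hex 05, C5, and C6). smart_flags is a hypothetical helper name.

```shell
# smart_flags
# Reads `smartctl -A` output on stdin and prints NAME=RAW for attributes 5, 197 and 198
# whose raw value is non-zero. Prints nothing if all three are zero.
smart_flags() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
        if ($10 + 0 > 0) print $2 "=" $10
    }'
}

# Example (needs root and smartmontools):
#   smartctl -A /dev/sdb | smart_flags
```

Any output at all for a supposedly healthy survivor is a reason to image that drive before running a reconstruction scan against it.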
Drive imaging — protect originals from scan stress
For any drive with elevated S.M.A.R.T. values, use RS RAID Retrieve’s built-in imaging function to create a sector-level image before the reconstruction scan. The imager makes multiple passes over problematic sectors, logs unreadable areas with a sector map, and produces a complete image file that represents the best possible read of that drive. All subsequent steps — array reconstruction, LVM activation, filesystem scan — run against the static image file rather than the live drive. This prevents the drive from deteriorating further under the read load of a full array scan, which on a multi-terabyte SHR configuration can take several hours.
Automatic array reconstruction — when superblocks are intact
RS RAID Retrieve scans all connected drives and images for mdadm superblock signatures. When found, it reads the Array UUID, RAID level, member device roles, stripe block size, and event counters across all members. It then reconstructs the SHR volume topology — mdadm array → LVM Physical Volume → Volume Group → Logical Volume → Btrfs or ext4 filesystem — without requiring a valid quorum and without writing anything to the source drives. For a double failure where both drives are physically readable and superblocks are intact, this path typically requires no manual input.
The reconstruction operates in degraded mode: stripes where both failed drives contributed cannot be reconstructed from parity, and the corresponding files are flagged as inaccessible. Stripes where only one of the two failed drives contributed are reconstructed from the remaining member data and parity — those files are fully recoverable. The program marks inaccessible files in the directory tree before the copy step, so you know the recovery scope before committing to a destination.
RAID Constructor — when superblock metadata is damaged
If automatic detection produces no result — because superblocks are partially overwritten, corrupted by a firmware event, or missing due to a failed drive that was partially written during rebuild — switch to RAID Constructor. This mode allows manual specification of all array parameters, bypassing the superblock entirely.
First, identify the filesystem offset using the built-in HEX editor. Open each drive or image in the HEX editor and search for the ASCII marker LABELONE. This string is the LVM physical volume label, and in SHR and SHR-2 configurations it marks the beginning of the volume data area. The sector immediately preceding the LABELONE sector, which will be filled with zeros, is the offset sector. Note its sector number: this is the value to enter as the Offset parameter in RAID Constructor.
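The search for the LVM label signature LABELONE can also be done from a shell when working with image files. The sketch below is a slow but simple scan, assuming 512-byte sectors, a signature located at the very start of its sector, and a source that is the data partition (for example a loop partition device) rather than the whole disk. find_label_sector and its default 4096-sector window are hypothetical choices for illustration.

```shell
# find_label_sector IMAGE [LIMIT]
# Scans the first LIMIT sectors (default 4096) of IMAGE and prints the number of the
# first sector that begins with "LABELONE". Returns non-zero if none is found.
find_label_sector() (
    img=$1
    limit=${2:-4096}
    s=0
    while [ "$s" -lt "$limit" ]; do
        # Read one sector, keep its first 8 bytes, drop NUL bytes so the shell can compare.
        sig=$(dd if="$img" bs=512 skip="$s" count=1 2>/dev/null | head -c 8 | tr -d '\000')
        if [ "$sig" = "LABELONE" ]; then
            echo "$s"
            exit 0
        fi
        s=$(( s + 1 ))
    done
    exit 1
)

# Example:
#   label=$(find_label_sector /dev/loop0p3) && echo "offset sector: $(( label - 1 ))"
```

The offset sector is the printed sector number minus one. Treat the result as a cross-check of the value found in the HEX editor, since the scan window and the sector size are assumptions.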
Enter the following parameters in RAID Constructor:
| Parameter | SHR-1 value | SHR-2 value |
|---|---|---|
| RAID type | RAID 5 | RAID 6 / Left synchronous (P+Q) |
| Block size | 64 KB | 64 KB |
| Bytes per sector | 512 | 512 |
| Drive order | Match original NAS bay sequence, bay 1 first | Same as SHR-1 |
| Offset | Sector number from LABELONE search (per drive) | Same as SHR-1 |
If the original drive order is unknown, determine it by trial: add drives to the Selected Disks list in different sequences and observe the reconstructed filesystem tree after each change. A correct order produces a recognisable directory structure; an incorrect order produces garbage or an empty tree. The offset value must be set individually for each physical drive or image — it can differ between members if drives were of different capacities in the original SHR configuration.
If one of the failed drives cannot be connected at all — mechanically dead, not detected — use Add empty disk to insert a placeholder at the correct position in the drive list. RS RAID Retrieve treats the placeholder as a fully unreadable member: stripes where that drive contributed are reconstructed from parity using the remaining members (recovering data from those stripes), and stripes where the placeholder’s parity contribution is also needed are flagged as unrecoverable. This is the maximum possible recovery from a configuration with one physically inaccessible drive.
Filesystem scan and selective file recovery
Once the array is reconstructed — automatically or via RAID Constructor — RS RAID Retrieve scans the Btrfs or ext4 filesystem on the Logical Volume. The scan traverses the filesystem tree, identifies intact and damaged regions, and builds a complete directory listing. Files and folders whose data blocks intersect with unrecoverable stripes are marked before the copy step begins — no trial-and-error copy required to discover what is accessible.
Select the files and folders to recover, specify a destination on a separate drive with sufficient free space, and start the copy. Source drives and images are accessed read-only throughout the entire operation. For guidance on the LVM layer between the mdadm array and the filesystem, see our article on LVM structure and operation.
Level 3 is a professional recovery lab. Software recovery, at any level, depends on the drive’s read/write heads being able to deliver sector data to the host controller. When the head stack is mechanically damaged, the voice coil actuator has seized, or the spindle bearing has failed, the drive cannot serve read requests regardless of the software configuration. A lab replaces the head stack in a cleanroom environment (ISO Class 5 or better), adjusts head alignment to the specific platter, and uses firmware-level tools to extract sector data from areas that the drive’s own electronics would refuse to read.
The output of a lab engagement is a set of sector-level image files, one per drive. These images contain the best possible read of each drive’s surface, including sectors recovered from physically damaged zones that produce I/O errors under normal conditions. Once received, these images are used directly in RS RAID Retrieve: the RAID Constructor reads the superblocks (or uses the LABELONE offset method if superblocks are damaged), reconstructs the SHR array topology, and proceeds with filesystem scan and recovery exactly as described in Level 2.
The double-failure parity limitation applies to lab images the same way it applies to physically connected drives — stripes involving both failed members remain mathematically unrecoverable regardless of how cleanly the sector data was extracted. What the lab adds is access to sectors that software tools running on a live degraded drive would have missed due to read errors, which can meaningfully increase the proportion of recoverable files.
For a broader overview of when professional recovery is warranted and what the process involves, see our article on recovering data from failed hard drives.
Go directly to Level 3 if you observe any of these
- Clicking, grinding, or buzzing sound from any drive on power-up
- Drive not detected in BIOS/UEFI or in lsblk output at all
- Drive detected but mdadm --examine returns I/O errors rather than superblock data
- Drive surface temperature rises to abnormal levels within minutes of connection
- S.M.A.R.T. Reallocated Sector Count (ID 05) in the hundreds, or Spin Retry Count (ID 0A) increasing on each power cycle
Do not attempt software recovery on a mechanically failing drive. Each read pass adds wear to already-damaged head surfaces and platters. Power off and contact a lab before attempting any further access.
A double failure in SHR-1 sits at the boundary between software and physical recovery. Where exactly that boundary falls in your case depends on the state of the drives, not on the capabilities of the tools. Two physically intact drives with intact superblocks give software a real path to partial recovery. Two mechanically failed drives go straight to a lab. Most real-world double failures land somewhere between those extremes — which is why assessing drive state before choosing a recovery path matters more here than in any other scenario in this series.









