TrueNAS Pool Import and Recovery: Fixing UNAVAIL, FAULTED, and Missing Devices

The moment a TrueNAS pool refuses to import is not the moment to start guessing. ZFS gives you a lot of useful information about why a pool will not come online, and the recovery path is dictated by which failure mode you are in — not by your nerves. This guide walks the most common reasons a pool fails to import on TrueNAS SCALE or CORE, what the device states actually mean, and which zpool commands to reach for in which order.

Before anything else: do not run destructive commands you do not understand on a pool you care about. zpool labelclear, zpool destroy, and dd to a pool member can take a recoverable situation and end it. When in doubt, stop and read.

Read the state before you touch anything

Open a shell on the TrueNAS host (System Settings → Shell, or SSH in) and run:

zpool import

With no pool name, this scans connected devices for importable pools and prints what it finds — pool name, GUID, state, and per-vdev status. This is non-destructive. It does not import anything; it just tells you what is there.

You will see one of a handful of high-level states for the pool itself:

ONLINE: the pool is healthy and importable.
DEGRADED: the pool is importable, but at least one vdev is running with reduced redundancy (e.g., one disk missing from a mirror or RAIDZ vdev). Per the OpenZFS docs, a degraded pool continues to function. How many further failures the affected vdev can absorb depends on its layout: a two-way mirror or RAIDZ1 vdev that has already lost a disk cannot lose another.
FAULTED: the pool cannot be opened. For a top-level vdev this usually means more devices are missing or failed than the vdev's redundancy can cover (e.g., 2 missing disks in a single-parity RAIDZ1).
UNAVAIL: ZFS knows the pool or vdev exists but cannot access it. Often this is a transport issue (HBA not detecting drives, cable, backplane) rather than a data issue.

The per-device states map onto the same vocabulary. The OpenZFS zpool-status man page is the authoritative reference for what each state means and how they combine at the vdev level. Read the actual output instead of jumping to conclusions.
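When several pools are visible, it can help to boil the scan down to one line per pool before reading the details. The following is a minimal sketch, assuming the scan output labels each pool with `pool:` and `state:` lines; the function name `pool_states` is made up for illustration.

```shell
# Sketch: print "<poolname> <state>" for every pool found in
# `zpool import` output supplied on stdin. Read-only: it only parses text.
pool_states() {
  awk '$1 == "pool:"  { name = $2 }
       $1 == "state:" { print name, $2 }'
}
```

Typical use would be `zpool import 2>&1 | pool_states`. The full output still matters, since the per-device lines are what tell you which disk is the problem.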

The most common failure modes

Missing or renumbered devices

By far the most common cause of “my pool won’t import” on a home TrueNAS box is that one or more drives are no longer being detected by the host:

A SATA cable came loose during case work.
A drive was moved to a different SATA port, and the on-disk label is fine but the system enumerated devices in a different order.
An HBA was reseated, swapped, or re-flashed and is no longer presenting the same drives.

ZFS does not care about device names. It identifies pool members by the labels written on the disks themselves. Run:

zpool import -d /dev/disk/by-id

On SCALE (and any Linux-based ZFS), pointing the import scan at /dev/disk/by-id (or /dev/disk/by-partuuid) makes ZFS look at stable identifiers and is the right starting point when device names are suspect. On CORE the equivalent device tree lives under /dev/gptid.
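If you keep one set of notes for both SCALE and CORE boxes, the choice of scan directory can be made from the kernel name. This is a small sketch under that assumption; `import_scan_dir` is a hypothetical helper, and the optional argument exists only so the logic can be exercised without switching machines.

```shell
# Sketch: choose the stable device directory for the import scan.
# Uses `uname -s` unless a kernel name is passed explicitly.
import_scan_dir() {
  case ${1:-$(uname -s)} in
    Linux)   echo /dev/disk/by-id ;;   # TrueNAS SCALE and other Linux hosts
    FreeBSD) echo /dev/gptid ;;        # TrueNAS CORE
    *)       echo "unsupported platform" >&2; return 1 ;;
  esac
}
```

The scan itself would then be `zpool import -d "$(import_scan_dir)"`, which is still only a scan and imports nothing.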

If zpool import shows the pool but a device is UNAVAIL with no apparent reason, the device is almost certainly not visible to the OS — confirm with lsblk on SCALE or camcontrol devlist on CORE before assuming the drive itself is dead.
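To get a short list of the members worth chasing, the config section of the scan can be filtered for anything that is not healthy. The sketch below assumes each config line has the device name in the first column and its state in the second; `non_online_devices` is an illustrative name, not a ZFS tool.

```shell
# Sketch: list vdevs and devices reported as UNAVAIL, FAULTED, OFFLINE or
# REMOVED in `zpool import` output on stdin. Header lines such as "state:"
# are skipped because their first field ends in a colon.
non_online_devices() {
  awk '$1 !~ /:$/ && $2 ~ /^(UNAVAIL|FAULTED|OFFLINE|REMOVED)$/ { print $1, $2 }'
}
```

Run it as `zpool import 2>&1 | non_online_devices`, then look for each listed name in `lsblk -d -o NAME,SERIAL,MODEL` on SCALE or `camcontrol devlist` on CORE. A device that ZFS lists but the OS does not show points at cabling, power, or the HBA rather than at the data.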

Pool last imported by another system (force import required)

If the pool was previously imported by a different system (the classic case: you swapped the boot SSD and rebuilt TrueNAS without exporting first), ZFS will refuse to import without -f:

zpool import -f <poolname>

This is safe only if you are certain no other live system is currently holding the pool. Importing a pool that another host has open is one of the few ways to genuinely corrupt ZFS. If both systems are connected to the same disks (rare in homelab; more common with SAS shared-disk setups), shut the other one down first and confirm.
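Before reaching for -f, it is worth confirming that this is actually the case you are in. The sketch below looks for the status text that OpenZFS prints when a pool was last used by a different host; the exact wording is an assumption that may differ between versions, and `last_accessed_elsewhere` is a hypothetical helper name.

```shell
# Sketch: succeed (exit 0) if `zpool import` output on stdin says the pool
# was last accessed by another system. It only reads text and never imports.
last_accessed_elsewhere() {
  grep -qi 'last accessed by another system'
}
```

For example, `zpool import 2>&1 | last_accessed_elsewhere && echo "force import needed: confirm the other host is powered off first"`. The check tells you why ZFS is refusing. It cannot tell you whether the other system is still running, so that part remains a manual step.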

Drives present, but vdev is DEGRADED or FAULTED

If zpool import shows the pool with one device offline and the pool as DEGRADED, you can usually import the pool and run it on reduced redundancy while you replace the failed device. The OpenZFS RAIDZ documentation is explicit about what each RAIDZ level can survive: RAIDZ1 tolerates one disk loss per vdev, RAIDZ2 tolerates two, RAIDZ3 tolerates three.

If the vdev is FAULTED because more disks are missing than its parity level allows, do not import with -f. Stop, work out which physical disks are missing, and get them back online before importing. The pool state in this situation is usually recoverable if the missing drives are intact and just disconnected — it is not recoverable if you start writing to it half-mounted.
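The arithmetic behind DEGRADED versus FAULTED is simple enough to write down. The following is a sketch of that rule only, not something ZFS exposes: `vdev_can_import` is an invented helper, the mirror case assumes an N-way mirror survives N-1 losses, and the width defaults to a two-way mirror.

```shell
# Sketch: usage: vdev_can_import <kind> <missing-disks> [mirror-width]
# Returns 0 if the vdev should still be openable, 1 if too many disks are gone.
vdev_can_import() {
  kind=$1
  missing=$2
  width=${3:-2}
  case $kind in
    raidz1) tolerated=1 ;;
    raidz2) tolerated=2 ;;
    raidz3) tolerated=3 ;;
    mirror) tolerated=$((width - 1)) ;;   # assumption: N-way mirror survives N-1 losses
    stripe) tolerated=0 ;;                # single-disk or striped vdev has no redundancy
    *) echo "unknown vdev kind: $kind" >&2; return 2 ;;
  esac
  [ "$missing" -le "$tolerated" ]
}
```

For instance, `vdev_can_import raidz1 2` fails, which matches the single-parity example earlier: two missing disks in RAIDZ1 means the missing drives have to come back before the pool can.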

The mapping from disk topology to fault tolerance is the same one covered in ZFS Pool Design: RAIDZ vs Mirrors — if you are surprised by what your pool can or cannot survive at import time, this is the post to read before the next pool you build.

Recent host crash or unclean shutdown

ZFS is designed to survive abrupt power loss without corrupting the pool, thanks to its transactional copy-on-write design and the ZIL (ZFS Intent Log). It is unusually robust here. What does occasionally happen is that the last few transactions before the crash are not on disk, and the most recent uberblock points to state that is incomplete.

For these cases, zpool import supports two last-resort recovery flags:

-F: attempt to roll the pool back to a recent consistent transaction group (TXG), discarding the last few transactions. Per the OpenZFS zpool-import man page, this is the documented recovery mode for a pool that otherwise cannot be imported.
-X: used together with -F, lets the rollback consider older TXGs, at the cost of potentially discarding more recent writes. Read the man page before using this; the trade-off is real.

Adding -n to -F (with or without -X) turns the command into a dry run that reports whether recovery is possible without changing anything on disk. Always run with -n first:

zpool import -F -n <poolname>

If the dry run says recovery is possible and you accept the rollback, drop the -n. If the dry run says no recovery is possible at the current TXG, escalate before trying -X.
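One way to make the "dry run first" habit hard to skip is to wrap it. This is a minimal sketch, assuming a shell on the TrueNAS host with `zpool` on the PATH; `rewind_dry_run` is a hypothetical name, and it deliberately has no code path that runs the real rollback.

```shell
# Sketch: run ONLY the dry-run form of the rewind import for a pool.
# The -n flag is hard-coded so this helper can never discard transactions.
rewind_dry_run() {
  [ -n "$1" ] || { echo "usage: rewind_dry_run <poolname>" >&2; return 2; }
  echo "Dry run only: zpool import -F -n $1"
  zpool import -F -n "$1"
}
```

Read what the dry run reports, decide whether the rollback is acceptable, and only then type the real `zpool import -F <poolname>` by hand.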

Read-only import for data extraction

If the pool comes online but you do not trust its long-term state, or you suspect hardware on the verge of failing more, import it read-only and copy critical data off to a different pool or external drive first:

zpool import -o readonly=on <poolname>

The TrueNAS documentation covers read-only imports as part of the pool import workflow, and there is a corresponding option in the SCALE web UI under Storage → Import Pool. A read-only import lets you mount datasets and pull data off without ZFS writing any new metadata to the pool. It is the right first move any time you are not sure whether the import itself will worsen the situation.
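The extraction itself is usually three steps: import read-only, copy, export. The sketch below only prints those commands so they can be reviewed before anything runs. It assumes datasets use their default mountpoints and appear under /mnt/<poolname> when the pool is imported with -R /mnt, that rsync is available, and that the destination sits on different disks; `rescue_plan` and the destination path are illustrative.

```shell
# Sketch: print (not run) a read-only rescue sequence for a pool.
# usage: rescue_plan <poolname> <destination-dir>
rescue_plan() {
  pool=$1
  dest=$2
  if [ -z "$pool" ] || [ -z "$dest" ]; then
    echo "usage: rescue_plan <poolname> <destination-dir>" >&2
    return 2
  fi
  # -o readonly=on: ZFS writes nothing to the pool.
  # -R /mnt: temporary mount root, so datasets land under /mnt/<poolname>.
  echo "zpool import -o readonly=on -R /mnt $pool"
  # Copy the most important data first in case the hardware degrades further.
  echo "rsync -a /mnt/$pool/ $dest/"
  echo "zpool export $pool"
}
```

Calling `rescue_plan tank /mnt/backup/tank-rescue` prints the three commands in order. Encrypted datasets would additionally need their keys loaded before they can be read, which this sketch does not cover.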

What not to do

A short list of things that turn recoverable pools into unrecoverable ones:

zpool labelclear on a pool member you might still need. It wipes the vdev label ZFS uses to identify the disk as part of the pool.
zpool destroy at the prompt of an error you have not understood. Destroy is for pools you are intentionally retiring.
dd if=/dev/zero of a suspected-bad disk before you have copied data off other healthy disks.
Importing a pool on two hosts simultaneously because you got impatient. Shut one down first.

When in doubt: capture the zpool import output, photograph the rack and the drive LEDs, and ask in the TrueNAS forum before running write commands.
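Capturing the output is easier to do consistently with a tiny helper. This is a sketch, assuming you save to a location that is not on the affected pool (the boot device or a USB stick, for example); `capture_scan` and the default directory name are made up for illustration.

```shell
# Sketch: save whatever arrives on stdin to a timestamped file and print
# the file's path. usage: zpool import 2>&1 | capture_scan [output-dir]
capture_scan() {
  out_dir=${1:-./pool-recovery-notes}
  mkdir -p "$out_dir" || return 1
  out_file="$out_dir/zpool-import-$(date +%Y%m%d-%H%M%S).txt"
  cat > "$out_file" || return 1
  printf '%s\n' "$out_file"
}
```

A timestamped copy from before any recovery attempt gives forum helpers, or a recovery service, the original state to work from rather than the state after several experiments.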

Replacing a failed drive after import

Once the pool is imported (DEGRADED is fine), replace the failed device with zpool replace <poolname> <old-device> <new-device>, or use the TrueNAS web UI under Storage → Pools → the affected vdev → Replace. Resilver kicks off automatically. Monitor its progress with zpool status, and work through the scrub and S.M.A.R.T. monitoring guide so that the next failing drive is caught before it tests the new one.
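For a quick check on whether the resilver is still running, the `scan:` line of zpool status is the part to watch. The sketch below classifies that line; it assumes the usual "resilver in progress" and "resilvered ..." wording, and `resilver_status` is an illustrative name.

```shell
# Sketch: classify the scan line(s) of `zpool status` output on stdin as
# "in-progress", "finished" (a resilver has completed), or "other".
resilver_status() {
  awk '$1 == "scan:" {
         if ($0 ~ /resilver in progress/) print "in-progress"
         else if ($2 == "resilvered")     print "finished"
         else                             print "other"
       }'
}
```

Use it as `zpool status <poolname> | resilver_status`. Once it reports finished, check the full zpool status output for the error count before treating the pool as healthy again.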

When to escalate

There are recovery scenarios where the right move is not the next zpool flag but a professional data recovery service: multiple-disk physical failures, a pool that imports but reports persistent metadata corruption on scrub, or any situation where you have already tried -FX and it failed. The further you experiment with destructive flags, the lower the odds a recovery service can help.

For most home users this is also when the snapshot and replication strategy you put in place months ago — the one with a second NAS at another location — is worth its weight in gold. A failed import on the primary becomes “restore from secondary,” not “lose the family photo archive.”

Next steps

ZFS Pool Design: RAIDZ vs Mirrors — what your pool’s redundancy actually buys you when a drive fails.
TrueNAS Snapshot and Replication Strategy — the second layer that turns “pool gone” into “restore from elsewhere.”
TrueNAS Scrub and S.M.A.R.T. Disk Health — the routine practice that catches the problems that cause failed imports before they cause failed imports.
