Cannot stop scrubbing. Process at 450%!

Status
Not open for further replies.

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
A scrub is scheduled for the 1st of every month.
But this time it has already been running for 3 days and is currently at 450%, reporting "65TB out of 14.6TB" :(
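
As a rough sanity check on those figures (a sketch only; `scrub_percent` is a hypothetical helper, not a ZFS tool), dividing the amount scanned by the pool size reproduces the reported progress of roughly 450%:

```shell
# scrub_percent SCANNED TOTAL
# Hypothetical helper: prints SCANNED/TOTAL as a whole-number percentage.
scrub_percent() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.0f\n", s / t * 100 }'
}

# Figures from the scrub status: 65TB scanned out of 14.6TB.
scrub_percent 65 14.6   # prints 445
```

A value above 100 means more data has been scanned than the reported total.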

I tried aborting the process via SSH console with "zpool scrub -s zraid1", but all it does is reboot and continue scrubbing.

Any ideas? :confused:

b39e9bf71d1fa3fb1b409d8bfd9bf2b0.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Oh boy, sounds like a corrupted pool. No ZFS or disk errors prior to this event?

To stop the scrub, use zpool scrub -s zraid1.
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
Ericloewe said: "To stop the scrub, use zpool scrub -s zraid1."
As I wrote, I already tried this, but after execution of this command, the NAS just reboots and continues scrubbing.

Ericloewe said: "No ZFS or disk errors prior to this event?"
There is a drive with some "unreadable (pending) sectors" and "2 Offline uncorrectable sectors", but this should not affect the entire pool.
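
Those two counters correspond to SMART attributes 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable). A minimal sketch for pulling just those raw values out of `smartctl -A` output, assuming smartmontools' usual ten-column attribute table; `bad_sector_counts` is a hypothetical helper and `/dev/ada0` is only a placeholder device name:

```shell
# bad_sector_counts: reads `smartctl -A` output on stdin and prints
# NAME=RAW_VALUE for attributes 197 and 198 only.
# Assumes the raw value is the 10th whitespace-separated column.
bad_sector_counts() {
    awk '$1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Usage (placeholder device name, adjust to the actual disk):
#   smartctl -A /dev/ada0 | bad_sector_counts
```

Non-zero values here describe the individual disk; they do not by themselves say anything about the state of the pool.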

The pool itself seems to be fine. Everything is working as it should.
Well, except that scrubbing does not come to an end.

Is there any other way to stop the scrubbing process or tell the system not to continue after reboot?
It looks like it cannot execute the stop command but immediately reboots.

By the way, I'm running FreeNAS-9.3-STABLE-201509282017
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Bluebrain said: "As I wrote, I already tried this, but after execution of this command, the NAS just reboots and continues scrubbing."
Bah, completely missed that line. Any tracebacks? What's the hardware?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
ECC memory? Or not?

Either way, most likely there's some damage to the pool, and what you're really going to need to do is to copy all that data off the pool, build a new pool, and then put it back.

If you don't have ECC memory, you WILL ABSOLUTELY WANT TO RUN A VERY EXTENDED MEMORY TEST because that's the most likely way for a pool to become corrupt.

There are no tools to fix corruption on a ZFS pool. The ZFS strategy is to avoid introducing things that could corrupt a pool, and to protect against disk-error-induced corruptions through data redundancy. Once you're corrupt, it's a bad situation.
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
AMD E-350 motherboard
non-ECC 8GB mem
2 x (5 x 2TB HDD)

I have no way to back up the data anywhere else.
It's 16TB of usable storage space, of which 80% is used.

Can you give me a recommendation on how to test the memory?
e.g. memtest86 on a bootable medium?!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Some 4TB drives would probably be cheaper than 8's ...
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
When I built my NAS, I started with 5 2TB disks and built a RAIDZ1 (RAID5-like, 4+1).
Later, I added another 5 2TB disks in the same configuration and expanded the original pool.

... as you can see in the screenshot in the first post.
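
For reference, the nominal capacity of that layout can be estimated as below. This is only a sketch: `raidz_usable_tb` is a hypothetical helper, and it ignores ZFS metadata overhead and the TB/TiB difference, so the real figure is somewhat lower.

```shell
# raidz_usable_tb DISKS_PER_VDEV PARITY DISK_TB VDEVS
# Hypothetical helper: nominal usable capacity in TB (whole numbers only).
raidz_usable_tb() {
    echo $(( ($1 - $2) * $3 * $4 ))
}

# Two RAIDZ1 vdevs, each with five 2TB disks:
raidz_usable_tb 5 1 2 2   # prints 16
```

Each vdev gives up one disk to parity, so only 8 of the 10 disks count toward usable space.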

Is there any way to do the following?
1.) delete some content to get the pool below 50% used space
2.) move all the content onto the first 5 disks (raidz1-0) (is this possible?)
3.) remove the disks of raidz1-1
4.) build a new pool with the released/detached disks
5.) copy content from old pool to new
6.) destroy the old pool
7.) expand the new pool with the released/detached disks from the old pool
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
2)-3) are not possible with ZFS. Addition of a virtual device ("vdev") is a nonreversible operation.
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
Memtest86 showed no errors.

I just did a fresh install on a brand-new USB flash drive, followed by a manual update and a restore of the config backup.

No change. :(

I recorded the output when I ran "zpool scrub -s zraid1":
https://youtu.be/DJsoBuqaH2k?t=8s :eek:

At the beginning you can see:
"panic: ..."
44c19aa39104076c65b87ca710ea0d95.png
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
I just noticed that the output shown in the video was saved in /data/crash

Dump attached.
 

Attachments

  • textdump.tar.rar
    14.1 KB · Views: 190

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
This looks like classic zpool corruption. :(

We see this fairly frequently on RAIDZ1 pools and on pools that are known to be corrupted. Sorry, but the only solution I am aware of is to destroy and recreate the zpool.
 

Bluebrain

Cadet
Joined
Dec 28, 2012
Messages
7
Looks like it! :(

I've tried so many things, but nothing helped.
Scrubbing is still running.
However, at least data is not lost.

At the moment I'm copying all my stuff to an 8TB and a 3TB drive, and I've deleted some stuff so that it all fits onto these 2 drives.
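
A rough estimate of how much had to go, using only the nominal figures from this thread (16TB usable at 80% full, 8TB + 3TB of target drives) and ignoring filesystem overhead; `tb_to_free` is a hypothetical helper:

```shell
# tb_to_free POOL_TB USED_FRACTION TARGET_TB
# Hypothetical helper: prints how many TB exceed the target capacity
# (0.0 if everything already fits).
tb_to_free() {
    awk -v p="$1" -v u="$2" -v t="$3" 'BEGIN {
        excess = p * u - t
        if (excess < 0) excess = 0
        printf "%.1f\n", excess
    }'
}

# 16TB pool at 80% used, copied to 8TB + 3TB = 11TB of drives:
tb_to_free 16 0.8 11   # prints 1.8
```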

I'm still trying to figure out how to migrate/move the jail to another pool/volume.
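
One commonly used approach for moving a dataset between pools is a recursive snapshot followed by `zfs send` piped into `zfs receive`. The sketch below only prints the commands for review rather than running them. The dataset names `zraid1/jails` and `newpool/jails`, the snapshot name `migrate`, and the helper `plan_dataset_copy` are all assumptions, not taken from this system. On a pool that panics when the scrub is stopped, a send may well hit the same damage, so treat this as something to try, not a guaranteed fix.

```shell
# plan_dataset_copy SRC_DATASET DST_DATASET SNAPSHOT_NAME
# Hypothetical helper: prints the commands instead of executing them.
plan_dataset_copy() {
    src="$1"; dst="$2"; snap="$3"
    # Recursive snapshot of the source dataset and its children.
    echo "zfs snapshot -r ${src}@${snap}"
    # Replication stream into the target; -u leaves it unmounted.
    echo "zfs send -R ${src}@${snap} | zfs receive -u ${dst}"
}

# Assumed names, adjust before use:
plan_dataset_copy zraid1/jails newpool/jails migrate
```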
 