Wiping select files/dataset

Joined
Feb 13, 2022
Messages
4
Hello everybody!

I would like to wipe (== overwrite with zero / random) a dataset in my general purpose NAS-Pool. Temporarily moving the pool to a new drive and then dd if=/dev/urandom of=/dev/adaX is not an option. Is there a zfs-way of clearing a specific dataset? Otherwise: is there a way to get access to the information where a file is located on disk so that I can write a script to go through all files in the given set and selectively randomize them?
Thank you for reading through this!

PS: This might be a bad idea or unsafe - don't care. Please don't convince me to buy a tertiary set of HDDs or anything.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I would like to wipe (== overwrite with zero / random) a dataset in my general purpose NAS-Pool.

Okay. Except, a dataset is not a unit of individual storage. It's just like a mega-directory, sharing the pool's space.

Is there a zfs-way of clearing a specific dataset?

No.

Otherwise: is there a way to get access to the information where a file is located on disk so that I can write a script to go through all files in the given set and selectively randomize them?

To answer this question with the mother of all bad answers I've ever written on this forum, yes, you can use ZDB to find the sectors for a file and you could overwrite them. This would require some effort and could be difficult, or even capital-D Difficult. You could then scrub the pool and ZFS would notify you that the files were corrupt/lost at the next scrub.
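
For readers wondering what "find the sectors" means in practice, here is a rough sketch of the arithmetic only, not a tested wiping procedure. It assumes the block pointers that zdb prints for an object contain DVAs of the form vdev:offset:asize (vdev in decimal, offset and asize in hex), and that the vdev is a single disk or a mirror, where the data area starts 4 MiB into the device after the front labels and boot block. The function name and the sample DVA are made up for illustration. RAIDZ, gang blocks, and embedded blocks need different handling and are not covered, and on systems where the pool lives on a partition the result is relative to that partition, not the whole disk.

```python
# Assumption: on a single-disk or mirror vdev, allocatable space starts after
# two 256 KiB front labels plus a 3.5 MiB boot block, i.e. 4 MiB in total.
VDEV_LABEL_START_SIZE = 0x400000


def dva_to_byte_range(dva):
    """Convert a DVA string 'vdev:offset:asize' into
    (vdev_id, first_byte_on_device, length_in_bytes).

    The vdev id is parsed as decimal; offset and asize are parsed as hex.
    """
    vdev_text, offset_text, asize_text = dva.split(":")
    vdev_id = int(vdev_text, 10)
    offset = int(offset_text, 16)
    asize = int(asize_text, 16)
    # Shift past the labels and boot block to get a device-relative offset.
    return vdev_id, offset + VDEV_LABEL_START_SIZE, asize


# Hypothetical DVA, not taken from a real pool.
vdev_id, start, length = dva_to_byte_range("0:1c8e00:200")
print(f"vdev {vdev_id}: bytes {start:#x}..{start + length - 1:#x}")
```

Every copy of a block has its own DVA, so a file with copies=2, or metadata with ditto blocks, would list more than one range per block.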

More generally, you might instead want to make certain that there are no snapshots or other references to the file (hard links, etc.), remove the file, and then fill the pool to 100% with a zero-filled file. Depending on your tolerance for maybe not quite getting every single bit of data everywhere ever, this is considered relatively safe by comparison. Due to the way ZFS pads blocks, intermediate metadata writes could potentially leave some sectors not overwritten.
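
The fill step could be scripted roughly as follows. This is a minimal sketch, not a vetted tool: the function name, the scratch file name, and the `max_bytes` safety cap are all invented for illustration. It writes random data rather than zeros, on the assumption that compression is enabled on the dataset, in which case all-zero blocks are stored as holes and never reach the disk. Quotas or reservations may stop the fill before the pool itself is full, and a pool at 100% will misbehave for everything else using it.

```python
import errno
import os


def fill_free_space(directory, chunk_size=1 << 20, max_bytes=None):
    """Write random data into a scratch file under `directory` until the
    filesystem reports it is full (or `max_bytes` is reached), then delete
    the scratch file. Returns the number of bytes written."""
    path = os.path.join(directory, "wipe-fill.tmp")  # hypothetical file name
    written = 0
    # O_EXCL: refuse to clobber an existing file with the same name.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        while max_bytes is None or written < max_bytes:
            size = chunk_size
            if max_bytes is not None:
                size = min(chunk_size, max_bytes - written)
            try:
                # os.write may write fewer bytes than requested; count
                # only what actually went out.
                written += os.write(fd, os.urandom(size))
            except OSError as exc:
                if exc.errno in (errno.ENOSPC, errno.EDQUOT):
                    break  # out of space or over quota: the fill is done
                raise
        try:
            os.fsync(fd)  # best effort: push the data out before deleting
        except OSError:
            pass
    finally:
        os.close(fd)
        os.remove(path)
    return written
```

Calling it with a small `max_bytes` first, on a scratch directory, is a cheap way to check that it behaves before letting it run until the write fails.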

@Arwen has also written a resource on related topics.

 
Joined
Feb 13, 2022
Messages
4
Thank you for the quick reply, I'm very grateful for that!

Okay. Except, a dataset is not a unit of individual storage. It's just like a mega-directory, sharing the pool's space.
Yes, sorry, I understand that. It would have been more accurate to say "wipe all files stored under a dataset".

To answer this question with the mother of all bad answers I've ever written on this forum, yes, you can use ZDB to find the sectors for a file and you could overwrite them. This would require some effort and could be difficult, or even capital-D Difficult. You could then scrub the pool and ZFS would notify you that the files were corrupt/lost at the next scrub.
This sounds like the most fun, to be honest. I like tinkering with stuff. It's a homelab, and anything that actually matters has double backups in two remote locations.

I don't quite get the man page for zdb. Should the command
Code:
zdb -d pool/dataset
not give me basic, non-verbose info about the dataset? With an f added, it should dump file object information. I just get "can't open directory", but I don't know what directory I should point zdb to.

Do you have any more cautions for going down that path? You seem to have actual experience with ZFS. This won't happen overnight. I might report back here after exam period with my findings and a message of success or failure. Thank you again.

@Arwen has also written a resource on related topics.
Thank you for the link. I already skimmed through it while researching a solution.

Greetings and have a good day!
 

jgreco

Resident Grinch
I get the uneasy feeling of handing matches to a kid, but I'll hint at

zdb -C -U /data/zfs/zpool.cache

That's all you get from me though. :smile:
 
Joined
Feb 13, 2022
Messages
4
Thank you, Sir! It shall be tested on a toy dataset living inside a VM. That way, if I burn the house down, it'll only be my toy house.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Semi-controlled demolition of residential structures via igneous means.
 