Failing drive with no redundancy

Status
Not open for further replies.

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
Hi,

I stumble along to keep my server up and working, but now I have a drive failing. I managed to run several smartctl tests, and as much else as I could, to see what's going on. I have attempted to "Replace" the drive, but after 18 hours of resilvering it's still at 12% and goes up and down.

It was just saying I had a few bad sectors, but now it says hardware failure.

What should I do now? It is an external drive, so once I unplug it, what will happen to my pool? Assuming I can't recover what's on the drive, will those files just disappear?

Thanks!
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
You need to provide a more complete system description. No redundancy, meaning a single-drive pool? Then you are likely looking at data loss. You may be able to recover your data for a few hundred dollars using a recovery service.

Why would you configure a NAS like this?
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
System was built like this due to my inexperience... I'm in way over my head! But hopefully I can improve on it going forward, if I can figure out how to.

Yes, I believe it is a single drive pool: all drives linked to provide one storage location.

I expect data loss due to my poor implementation - is there anything I can do to clean the system up after that loss?

FreeNAS-11.2-BETA3, running off a USB stick
Apps all running off a 120GB SSD

CPU: Intel - Core i5-6600K 3.5GHz Quad-Core Processor
CPU Cooler: Deepcool - GAMMAXX 300 55.5 CFM CPU Cooler
Motherboard: Gigabyte - GA-Z170M-D3H Micro ATX LGA1151 Motherboard
Memory: G.Skill - Ripjaws 4 series 16GB (2 x 8GB) DDR4-2400 Memory
Storage: 2x Western Digital - Red 4TB 3.5" 5400RPM Internal Hard Drive
         1x Seagate 3TB
         1x Seagate 3TB External (failing drive)
Case: Thermaltake - Core V21 MicroATX Mini Tower Case
Power Supply: Corsair - CXM 750W 80+ Bronze Certified Semi-Modular ATX Power Supply
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
System was built like this due to my inexperience... I'm in way over my head! But hopefully I can improve on it going forward, if I can figure out how to.

Been there, done that.

Yes, I believe it is a single drive pool: all drives linked to provide one storage location.

Go to your shell and type:

zpool status

It will list your pools and show how each one is configured. For instance, mine is a raidz3. See below:
Code:
root@mellonas:~ # zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:17 with 0 errors on Thu Sep 27 03:45:17 2018
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da0p2     ONLINE       0     0     0
            da1p2     ONLINE       0     0     0

errors: No known data errors

  pool: raid
 state: ONLINE
  scan: resilvered 1.76T in 0 days 05:48:26 with 0 errors on Mon Sep 17 14:28:49 2018
config:

        NAME        STATE     READ WRITE CKSUM
        raid        ONLINE       0     0     0
          raidz3-0  ONLINE       0     0     0
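
For anyone who would rather check this programmatically, here is a minimal Python sketch that reads `zpool status` text and lists each pool's top-level vdevs, i.e. the entries indented one step under the pool name. A top-level vdev named mirror-N or raidzN-N has redundancy; a bare disk at that level has none. The helper names `top_level_vdevs` and `is_redundant` are made up for this example, the second pool (`tank`) in the sample is hypothetical, and the parsing relies only on the indentation seen in the output above, so it will not cope with every layout (log, cache, or spare devices, for instance).

```python
# Sample text: the freenas-boot section follows the output above (trimmed);
# the "tank" pool is a hypothetical example of a pool made of two bare disks.
SAMPLE = """\
  pool: freenas-boot
 state: ONLINE
config:

\tNAME          STATE     READ WRITE CKSUM
\tfreenas-boot  ONLINE       0     0     0
\t  mirror-0    ONLINE       0     0     0
\t    da0p2     ONLINE       0     0     0
\t    da1p2     ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
config:

\tNAME        STATE     READ WRITE CKSUM
\ttank        ONLINE       0     0     0
\t  ada0p2    ONLINE       0     0     0
\t  ada1p2    ONLINE       0     0     0

errors: No known data errors
"""


def top_level_vdevs(status_text):
    """Return {pool name: [top-level vdev names]} parsed from status text."""
    pools = {}
    current = None
    in_config = False
    base = 0
    for raw in status_text.splitlines():
        line = raw.expandtabs(8)
        stripped = line.strip()
        if stripped.startswith("pool:"):
            current = stripped.split(":", 1)[1].strip()
            pools[current] = []
            in_config = False
        elif stripped.startswith("NAME"):
            # The header row sits at the same indentation as the pool name.
            in_config = True
            base = len(line) - len(line.lstrip())
        elif in_config:
            if not stripped:
                in_config = False  # a blank line ends the config table
                continue
            indent = len(line) - len(line.lstrip()) - base
            if indent == 2:  # one step in from the pool name
                pools[current].append(stripped.split()[0])
    return pools


def is_redundant(vdev_name):
    """Mirror and raidz vdevs carry redundancy; a bare disk does not."""
    return vdev_name.startswith(("mirror", "raidz"))


for pool, vdevs in top_level_vdevs(SAMPLE).items():
    verdict = "redundant" if all(is_redundant(v) for v in vdevs) else "NO redundancy"
    print(f"{pool}: {vdevs} -> {verdict}")
```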



Memory G.Skill - Ripjaws 4 series 16GB (2 x 8GB) DDR4-2400 Memory

I assume that's non-ECC, right? If so, keep in mind that ZFS holds data in RAM before writing it to disk, so a memory error there can end up on the pool as corrupted data. Your call.

I expect data loss due to my poor implementation - is there anything I can do to clean the system up after that loss?

First thing: Save your config.

Second: Don't use external drives

Third: (That's a long one)

I looked at the specs for that motherboard, and they say:

Code:
  1. 1 x M.2 Socket 3 connector (Socket 3, M key, type 2242/2260/2280 SATA & PCIe x4/x2/x1 SSD support)
  2. 3 x SATA Express connectors
  3. 6 x SATA 6Gb/s connectors


Does it really have 10 SATA ports? If so (and you have the cash), you could add new HDDs to that box, create a second zpool, copy all your data over, and try to save something.

Search the forum for the article by @cyberjock with information on ZFS. You will learn a lot.

Good luck!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
You may be able to recover your data for several thousand dollars (not "a few hundred") using a recovery service.
Fixed that for you--nobody's going to do data recovery on ZFS for a few hundred dollars.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I’d let the resilver continue FWIW.

Zpool status please
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
I looked, and my MB does have 10 slots for drives. I am thinking I should just pick up a couple more drives and put the time in to improve the system. Unfortunately my micro-ATX case does not have space for any more drives. Is there a way to mount them externally via a SATA connector?

Ram is not ECC - Is it worth the upgrade?

Resilver is ongoing but does not appear to be making any progress... still sitting at 12.33%

Code:
  pool: MediaVolume
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Sep 28 07:47:57 2018
        2.00T scanned at 1.66G/s, 700G issued at 582M/s, 5.95T total
        140G resilvered, 11.49% done, 0 days 02:37:58 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        MediaVolume                                     ONLINE     296     0     0
          gptid/54d78ed2-a2aa-11e6-a98f-1c1b0d129ba2    ONLINE       0     0     0
          replacing-1                                   ONLINE     320     0   272
            da1p2                                       ONLINE     124 1.89K   360
            gptid/b998e2f7-c1a7-11e8-a2a5-1c1b0d129ba2  ONLINE       0     0   598
          gptid/410a162c-c0d9-11e8-81d9-1c1b0d129ba2    ONLINE       0     0     0
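
The READ, WRITE and CKSUM columns are where the trouble shows up in that output. As a rough illustration, the Python sketch below pulls out every row with a non-zero counter. The function name `devices_with_errors` is invented for this example, and the counters are deliberately kept as strings because values such as 1.89K are abbreviated by zpool.

```python
# Config table copied from the zpool status output above.
STATUS = """\
NAME                                            STATE     READ WRITE CKSUM
MediaVolume                                     ONLINE     296     0     0
  gptid/54d78ed2-a2aa-11e6-a98f-1c1b0d129ba2    ONLINE       0     0     0
  replacing-1                                   ONLINE     320     0   272
    da1p2                                       ONLINE     124 1.89K   360
    gptid/b998e2f7-c1a7-11e8-a2a5-1c1b0d129ba2  ONLINE       0     0   598
  gptid/410a162c-c0d9-11e8-81d9-1c1b0d129ba2    ONLINE       0     0     0
"""


def devices_with_errors(config_text):
    """Return {name: (read, write, cksum)} for rows with any non-zero counter."""
    flagged = {}
    for line in config_text.splitlines():
        fields = line.split()
        # Data rows have exactly five columns; skip the header row.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, _state, read, write, cksum = fields
        if (read, write, cksum) != ("0", "0", "0"):
            flagged[name] = (read, write, cksum)
    return flagged


for name, counters in devices_with_errors(STATUS).items():
    print(name, counters)
```

On this sample, the rows flagged are the pool itself, the replacing-1 group, the old disk da1p2, and the new disk being resilvered; the other two disks report no errors.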


My hard drive approach has been kind of fragmented and random: I started with a WD Red 4TB and the Seagate external, then added a Seagate IronWolf 3TB, then another WD Red 4TB. What's the best way to clean things up? My theory would be to add drives two at a time, but I have not done that. Would adding a 10TB drive and switching to raid0 be a good idea?
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
I looked, and my MB does have 10 slots for drives. I am thinking I should just pick up a couple more drives and put the time in to improve the system. Unfortunately my micro-ATX case does not have space for any more drives. Is there a way to mount them externally via a SATA connector?

Maybe it's time to think about upgrading your chassis. I'd plan for a new volume and copy all the data over. That might save you a lot of headaches and time getting your data back. Read this to learn and plan what to do: https://forums.freenas.org/index.ph...pool-zil-and-l2arc-for-noobs.7775/#post-31234

Ram is not ECC - Is it worth the upgrade?

Some people say it's a must, some don't. I'm using ECC, and you can draw your own conclusions: https://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/#post-76620

Resilver is ongoing but does not appear to be making any progress... still sitting at 12.33%

Not a good sign. Get your back-up ready.

Would adding a 10TB drive and switching to raid0 be a good idea?

No. Also take a look at https://forums.freenas.org/index.php?threads/terminology-and-abbreviations-primer.28174/ for terminology. FreeNAS uses ZFS, so there is no raid0 as such here; the nearest ZFS equivalent is a stripe, which has no redundancy.

What's the best way to clean things up?

Well...

First: read the material from @cyberjock to better understand your options.

Second: look at your currently used space and try to build a pool to back up your data. It could even be on a temporary box just for that.

Third: remove that external USB drive, and never use one again.

Fourth: recreate your pool and copy the data back.

Having mixed-size disks is OK, but within a vdev ZFS will treat every disk as if it were the size of the smallest one. Also, after reading cyberjock's post, you will see that you can replace drives to grow your pool later.
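
To put rough numbers on the mixed-size point, here is a small Python sketch that estimates the usable space of a single vdev from its disk sizes. It is a back-of-the-envelope approximation that ignores ZFS metadata, padding and the TB/TiB difference, and the function name `usable_tb` is made up for this example. "stripe" here means the disks are simply added side by side with no redundancy.

```python
def usable_tb(disks_tb, layout):
    """Approximate usable space (TB) for one group of disks, ignoring overhead."""
    n = len(disks_tb)
    smallest = min(disks_tb)
    if layout == "stripe":
        # No redundancy: every disk contributes its full size.
        return sum(disks_tb)
    if layout == "mirror":
        # Every disk holds the same data, limited by the smallest disk.
        return smallest
    parity = {"raidz1": 1, "raidz2": 2, "raidz3": 3}[layout]
    if n <= parity:
        raise ValueError(f"{layout} needs more than {parity} disk(s)")
    # Each disk counts as the smallest one; 'parity' disks' worth goes to parity.
    return (n - parity) * smallest


# Drive sizes taken from the system description earlier in the thread.
print(usable_tb([4, 4, 3], "stripe"))   # all three internal drives, no redundancy
print(usable_tb([4, 4, 3], "raidz1"))   # same drives, one-disk fault tolerance
print(usable_tb([4, 4], "mirror"))      # the two 4TB Reds as a mirror
```

With two 4TB drives and one 3TB drive in a raidz1, the extra 1TB on each of the larger drives goes unused until the 3TB drive is replaced with a larger one.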

Good luck.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
External drives are kind of notorious... Before giving up on the drive, I'd look to see if the case can be "shucked", and the failing drive installed on one of the internal SATA connectors, even if you just rig it up temporarily. You may get lucky.
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
External drives are kind of notorious... Before giving up on the drive, I'd look to see if the case can be "shucked", and the failing drive installed on one of the internal SATA connectors, even if you just rig it up temporarily. You may get lucky.
I will definitely give that a try! It won't mess anything up if I disconnect it?

Thanks!
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
I grabbed an extra 4TB Red drive, so now I have two which I can hopefully use to fix things. I've been reading lots, but it would be great if someone could give me the best approach. I have one of the new Red drives trying to resilver, but it doesn't seem to be working: it won't ever get above 13%, and it seems to drop down to 9% and then work back up.
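
One way to tell "stalled" from "restarting" is to compare two `zpool status` snapshots taken some time apart: the scan line carries a "since" timestamp, which would be expected to change if the resilver starts over. The Python sketch below is only an illustration. The first snapshot is the scan section posted earlier in the thread, the second snapshot is hypothetical, and the names `resilver_state` and `compare` are invented for this example.

```python
import re
from datetime import datetime

SCAN_RE = re.compile(r"resilver in progress since (?P<since>.+)")
DONE_RE = re.compile(r"(?P<pct>\d+(?:\.\d+)?)% done")

# Scan section from the zpool status output posted earlier in the thread.
FIRST = """\
  scan: resilver in progress since Fri Sep 28 07:47:57 2018
        2.00T scanned at 1.66G/s, 700G issued at 582M/s, 5.95T total
        140G resilvered, 11.49% done, 0 days 02:37:58 to go
"""

# Hypothetical later snapshot, made up to show a restart.
SECOND = """\
  scan: resilver in progress since Sat Sep 29 01:12:03 2018
        1.50T scanned at 1.20G/s, 554G issued at 450M/s, 5.95T total
        110G resilvered, 9.10% done, 0 days 03:30:00 to go
"""


def resilver_state(status_text):
    """Return (start time, percent done), or None if no resilver is running."""
    since = SCAN_RE.search(status_text)
    done = DONE_RE.search(status_text)
    if not since or not done:
        return None
    # Assumes English day and month names, as in the output above.
    started = datetime.strptime(since.group("since").strip(),
                                "%a %b %d %H:%M:%S %Y")
    return started, float(done.group("pct"))


def compare(before, after):
    """Classify what happened between two resilver_state() results."""
    if after[0] != before[0]:
        return "restarted"      # new start time: the resilver began again
    if after[1] > before[1]:
        return "progressing"
    return "stalled"


print(compare(resilver_state(FIRST), resilver_state(SECOND)))
```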

I would like to try taking the Seagate out of its case and hooking it up directly via SATA to see if that helps, but will disconnecting it cause any issues?

If the direct SATA connection doesn't work, what's the next step? The one Red drive will have some data on it from the failed resilvering process. I'm trying hard to learn all of this, but it's a ton to take in!

Thanks for your help!
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
So the question we can't answer is... Is it the disk itself, or the external enclosure? Since you have it re-silvering, but not making progress, my guess, and it's only a guess, is that you should be able to shut the system down and disconnect the drive without causing more harm.

Understand, things are bad. Your data is likely gone. What we're trying to do, is see if we can get it back.

See if you can shuck the external drive out of its case. At 3TB, this should be a 3.5 inch drive, and shucking should be possible. If it's a modern 2.5 inch external drive, it likely has the USB circuitry right on the drive, and it won't be possible to connect it to SATA. The nice thing about ZFS & FreeNAS is that, if the problem is the USB HCI & bridge board in the external case, the system will identify the drive when it is connected to the internal SATA port, and place it in the pool. I'm not sure if re-silvering will resume or not.
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
Thanks, I'll give it a shot and see what happens. I haven't lost all of my data, right? Just what's on that drive? Is there any way to purge the corrupt data?


 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
There is no redundancy in that pool, so if you cannot get the drive back you will lose all your data.
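
The reasoning can be sketched in a few lines of Python: a pool needs every one of its top-level vdevs, and a vdev made of a single disk is gone as soon as that disk is. This is a deliberately simplified model (a partly readable disk may still give up some files), and the names `vdev_survives` and `pool_survives` are invented for this example.

```python
# Number of failed disks each vdev type can tolerate; mirrors are handled
# separately because they survive as long as one copy remains.
TOLERANCE = {"disk": 0, "raidz1": 1, "raidz2": 2, "raidz3": 3}


def vdev_survives(kind, n_disks, n_failed):
    """True if a vdev of this kind still works with n_failed disks gone."""
    if kind == "mirror":
        return n_failed < n_disks
    return n_failed <= TOLERANCE[kind]


def pool_survives(vdevs):
    """vdevs: list of (kind, n_disks, n_failed). Every vdev must survive."""
    return all(vdev_survives(kind, n, failed) for kind, n, failed in vdevs)


# The pool in this thread, as shown by zpool status: three single-disk
# top-level vdevs, one of which has failed.
media_volume = [("disk", 1, 0), ("disk", 1, 1), ("disk", 1, 0)]
print(pool_survives(media_volume))        # the striped pool with one dead disk

# For comparison: a two-way mirror with one dead disk.
print(pool_survives([("mirror", 2, 1)]))
```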
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
Wow ok I didn’t realize it was that bad. Most of the files are still functioning.

Can I copy them off the server then rebuild it properly and copy back?


 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
If you are able to copy the data off the pool, then you should do that.
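
A minimal Python sketch of that kind of salvage copy, assuming Python is available on the machine doing the copying: it walks the source tree, copies every file it can read, and records the ones that fail with an I/O error instead of stopping at the first bad file. The function name `salvage` and the paths in the usage comment are hypothetical; standard tools such as rsync or cp would do the same job.

```python
import os
import shutil


def salvage(src_root, dst_root):
    """Copy every readable file from src_root to dst_root.

    Returns a list of (path, error message) for files that could not be copied.
    """
    failed = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                shutil.copy2(src, os.path.join(target_dir, name))
            except OSError as exc:
                # Unreadable sectors typically surface as an I/O error here;
                # note the file and carry on with the rest.
                failed.append((src, str(exc)))
    return failed


# Example usage (hypothetical paths; adjust to your own pool and backup disk):
#   failed = salvage("/mnt/MediaVolume", "/mnt/backup")
#   for path, error in failed:
#       print("could not copy", path, error)
```

The list of failures doubles as a record of which files were lost, which is useful when rebuilding later.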
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
If you are able to copy the data off the pool, then you should do that.

OK, well, shucking the case didn't seem to fix the issue. I'm still getting these errors:
  • CRITICAL: Sept. 30, 2018, 5:26 p.m. - Device: /dev/da1 [SAT], 2528 Currently unreadable (pending) sectors
  • CRITICAL: Sept. 30, 2018, 5:26 p.m. - Device: /dev/da1 [SAT], 2528 Offline uncorrectable sectors
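
For keeping track of whether those counts are growing, here is a short Python sketch that pulls the device and sector counts out of alert lines in the format shown above. The regular expression is written against these two lines only, so treat it as an assumption rather than a description of every alert FreeNAS can produce; the function name `bad_sector_counts` is invented for this example.

```python
import re

ALERT_RE = re.compile(
    r"Device: (?P<device>\S+) \[SAT\], (?P<count>\d+) (?P<kind>.+?) sectors"
)

# The two alert lines quoted above.
ALERTS = [
    "CRITICAL: Sept. 30, 2018, 5:26 p.m. - Device: /dev/da1 [SAT], "
    "2528 Currently unreadable (pending) sectors",
    "CRITICAL: Sept. 30, 2018, 5:26 p.m. - Device: /dev/da1 [SAT], "
    "2528 Offline uncorrectable sectors",
]


def bad_sector_counts(lines):
    """Return {device: {kind of problem: sector count}} from alert lines."""
    counts = {}
    for line in lines:
        match = ALERT_RE.search(line)
        if match:
            per_device = counts.setdefault(match.group("device"), {})
            per_device[match.group("kind")] = int(match.group("count"))
    return counts


print(bad_sector_counts(ALERTS))
```

Running it again on later alerts and comparing the numbers gives a quick indication of whether the drive is getting worse.
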
It does appear that I should be able to save a bunch of the content on the drive. However, as I have been reading cyberjock's slideshow, I am starting to wonder if I have what it takes to put this together with redundancy. It seems to be incredibly complicated, especially having to figure out how to rebuild after this crash.

Might be time to switch the server to Windows or something easier, as my needs are fairly limited.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
what it takes to put this together with redundancy.. seems to be incredibly complicated
Not at all:
  • Have more than one drive
  • Go to the Volume Manager
  • Add your drives
  • Pick a layout that provides redundancy (mirror or RAIDZn)
  • Profit!
 

shanco

Dabbler
Joined
Dec 20, 2017
Messages
12
I can handle that, but what do I do to recover or format the drives and the pool that has failed?
 