SSD Cache Proposal

Status
Not open for further replies.

jfbron

Cadet
Joined
Feb 17, 2013
Messages
1
Is it possible to use an SSD for both the operating system and as a cache for the ZFS file system?
It is currently advised to use a USB stick for the operating system, which loads into memory. Good performance requires a lot of memory:
1 to 2 GB for each TB of storage is suggested, as a cache for the file system.
It would be cheaper to use an SSD with 3 or 4 partitions:
Partition 1: Current version of the FreeNAS operating system.
Partition 2: Used for upgrading to a new version of FreeNAS.
Partition 3: Configured operating system, used for a quick start. No scan for new hardware required.
Partition 4: Used as a cache for the ZFS file system.
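
For a sense of scale, the 1 to 2 GB of RAM per TB of storage rule of thumb quoted above can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the function name and the 8 TB example pool are made up for illustration, and the ratio is the suggestion from this post, not a hard requirement.

```python
def suggested_ram_gb(storage_tb, gb_per_tb_low=1.0, gb_per_tb_high=2.0):
    """Return the (low, high) RAM range in GB for a pool of storage_tb terabytes.

    The 1 and 2 GB-per-TB defaults are the rule of thumb quoted in the post.
    """
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    return storage_tb * gb_per_tb_low, storage_tb * gb_per_tb_high


# Hypothetical 8 TB pool, used only as an example.
low, high = suggested_ram_gb(8)
print(f"8 TB pool: {low:.0f} to {high:.0f} GB of RAM")
```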

On a cold start the operating system starts from partition 1, to configure all hardware.
The configured operating system is written to partition 3. This partition will be the new active partition.
On the next boot, the system boots from partition 3, since there is no need to reconfigure the system.

When a new version is available, it is placed in partition 2. This partition is then marked as the active partition.
The configuration settings are also copied to this partition.
On reboot, the system will boot from partition 2, and once it has been tested the user can make the changes permanent by copying partition 2 to partition 1.
(This is a new item to add to the control panel.)
It should also be possible to revert to the previous version by doing the opposite and copying partition 1 to partition 2.

A cold start will always start from partition 1, unless a new operating system is installed, in which case the cold boot will be from partition 2.
A Warm start always starts from partition 3, skipping the configuration of new hardware.

When starting from Partition 1 or 2, the cache (Partition 4) will be cleared.
When starting from Partition 3, the cache will still be valid, so there is no need to clear it. This will also result in a faster boot.
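
The cold/warm start rules proposed above can be sketched as a small decision function. This is a hypothetical model of the proposal only, not how FreeNAS actually boots; the names `Start`, `choose_boot_partition` and `cache_must_be_cleared` are invented for the illustration.

```python
from enum import Enum


class Start(Enum):
    COLD = "cold"
    WARM = "warm"


def choose_boot_partition(start, new_version_pending):
    """Return the partition number (1, 2 or 3) the proposal would boot from."""
    if start is Start.WARM:
        # Warm start: boot the already configured system in partition 3.
        return 3
    # Cold start: partition 2 if a new OS version was installed, else partition 1.
    return 2 if new_version_pending else 1


def cache_must_be_cleared(boot_partition):
    """Per the proposal, the cache in partition 4 is only kept when booting from 3."""
    return boot_partition in (1, 2)


for start, pending in [(Start.COLD, False), (Start.COLD, True), (Start.WARM, False)]:
    part = choose_boot_partition(start, pending)
    print(start.value, "start, new version pending:", pending,
          "-> partition", part, "| clear cache:", cache_must_be_cleared(part))
```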

Any user change to the hardware (new drivers, etc.) will be treated in the same way as a new version of the operating system,
so it is always possible to undo these changes.

Partition 4 is the most important for caching. The ZFS file system uses a lot of memory for caching file system information.
This information is only changed when writing, so most of this information is static, ideal for SSD Caching.

For read and write caching, memory is preferred, but an SSD can be a good alternative, since its data is still valid after a power failure.
Using an SSD as a write cache is a good alternative to a battery backup on a hardware controller, to avoid data loss in case of a power failure.

Please comment on this proposal. I think it can improve performance for a minimal investment (a small SSD will do in most cases).

Hannes
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I'm about to burst your bubble... you ready?

The FreeNAS installation already creates 4 partitions. Since you can only have a max of 4, your idea wouldn't work without a significant rewrite. This is also why the unused disk space on a USB key can't be used. Also, if you read the forums, you'll find that 2 of the partitions are for the "current" and "previous" installations. So when you do an upgrade, if it won't boot with the new version you can always go back to the previous one. Each time you upgrade to a new version it overwrites the "previous" installation and makes that partition the new partition to boot from. :) Cool huh?
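
The "current"/"previous" scheme described above can be sketched as a two-slot structure. This is a toy model of the behavior as described in this post, not FreeNAS code; the `BootSlots` class and the version labels "A", "B" and "C" are placeholders made up for the example.

```python
from dataclasses import dataclass


@dataclass
class BootSlots:
    versions: list  # two slots holding an installed version label (or None)
    active: int     # index (0 or 1) of the slot the system boots from

    def upgrade(self, new_version):
        """Overwrite the non-active ("previous") slot and boot from it next."""
        inactive = 1 - self.active
        self.versions[inactive] = new_version
        self.active = inactive

    def rollback(self):
        """Go back to the other slot, e.g. when the new version won't boot."""
        self.active = 1 - self.active

    @property
    def current(self):
        return self.versions[self.active]


slots = BootSlots(versions=["A", None], active=0)
slots.upgrade("B")    # "B" lands in the spare slot and becomes the boot target
slots.rollback()      # the new version misbehaves, so boot "A" again
print(slots.current)
```

Note that in this model a later upgrade after a rollback overwrites the slot that held the failed version, so the known-good installation is never the one being replaced.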

ZFS' read cache is called the L2ARC. It's not a dummy cache though; it is actually quite sophisticated and doesn't seem to behave like Windows' caching system (which just caches everything it possibly can at all times). Unfortunately, using an L2ARC requires RAM to maintain a table of what is on the L2ARC. So if you are already short on RAM this could make things much worse performance-wise.
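
To get a rough feel for the RAM cost of that table, here is a back-of-the-envelope sketch. The per-record header size and the average record size are assumptions chosen for illustration (the real figures depend on the ZFS version and the workload), and the function name is made up.

```python
def l2arc_header_ram_bytes(l2arc_bytes, avg_record_bytes, header_bytes_per_record):
    """Estimate the RAM needed to index an L2ARC of the given size.

    One in-RAM header is assumed per cached record; header_bytes_per_record
    is an input here, not a documented ZFS constant.
    """
    records = l2arc_bytes // avg_record_bytes
    return records * header_bytes_per_record


GIB = 2 ** 30
KIB = 2 ** 10

# Assumed example: 128 GiB L2ARC, 8 KiB average records, 200-byte headers.
ram = l2arc_header_ram_bytes(128 * GIB, 8 * KIB, 200)
print(f"{ram / GIB:.3f} GiB of RAM just for the L2ARC index")
```

Under these assumed numbers the index alone would take about 3 GiB, which is a large share of the RAM on a small server.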

ZFS' write cache is called the ZIL. Just like the read cache, it's not a dummy write cache; it is actually quite sophisticated and I guarantee it doesn't work like you think. ;) It doesn't actually save data to the ZIL to free up RAM and then make the commits to the zpool later. The ZIL is basically a copy of the write cache that is in RAM. If it's not in RAM it's not in the ZIL, and vice versa. So people who buy a 128GB SSD for their ZIL and think they can cache a whole boatload of data on their server with 6GB of RAM are completely and utterly wrong.

Also, assumptions like "When starting from Partition 3, the cache will still be valid, so no need to clear it. Will also result in a faster boot" are very, very bad ones to make. That would be a quick way to corrupt data if your assumption is false. No way I'd ever recommend something like that during bootup. The fact that this is a bad idea is exactly why no system anywhere has ever attempted to implement a system like this. Not to mention that if the cache were somehow corrupted, a reboot wouldn't necessarily clear the issue. Even if this weren't a bad idea, implementing it would require pretty extensive rewriting of ZFS, so I wouldn't count on that feature being added anytime soon. Most everyone is far more interested in seeing encryption added, along with the other features Oracle hasn't made open source. Honestly, I believe implementing this feature would break POSIX in a major way (which may be why no system I'm familiar with has ever attempted to add this feature).

You definitely are a "think outside the box" kind of person, and that can be very useful. But you also should consider why your ideas may not have been implemented before. Personally, I've found that if you think you came up with some game changing idea usually it means:

1. Someone already thought of it and it's a very bad idea.
2. Someone already thought of it and implemented it.
3. Someone already thought up a better idea and your idea actually sucks compared to their idea.
4. Someone already thought of it and there's a limitation that makes implementing the idea impossible.

Welcome to FreeNAS. You will definitely have a blast learning the ins and outs of using it.
 

shtirmuz

Cadet
Joined
Feb 20, 2013
Messages
3
Are you saying that it is better not to move the ZIL to an SSD, and that it is enough to increase the amount of memory in the server?
Will performance increase if I place the system partition on a mirror of SSDs?
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
This post made me think: is there a future possibility of partitioning an SSD into an L2ARC partition and a ZIL partition? Modern eMLC SSDs are generally 128GB+ and a ZIL only needs a small partition.

Not sure which of cyberjock's 4 categories this falls into :D
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
shtirmuz,

What I'm saying is that unless you are running a server with extremely high usage for the majority of the time the server is up, a ZIL is a waste of money. Sun's official statement was that a ZIL is really good for databases and other situations where writes are very close to how databases operate. Hint: If your files are >64KB then you are NOT in "that" scenario. I seriously doubt anyone who uses FreeNAS also runs a database. If you are awesome enough to be an admin of a huge database where a ZIL would be desired, you are probably running FreeBSD.

In a strange twist, Sun had also said that L2ARCs are best for database uses and little else. Yet there are lots of people who are running ZILs and L2ARCs because they just don't know better.

noprobs,

I'm pretty sure the answer is "no" because your 1 SSD is only 1 /dev. The ZIL and L2ARC require direct access to the /dev. So unless you somehow can split out a single drive into 2 separate devices then there's not any way to make it work.
 