Sort of unsurprising how much chipsets diverge in how they handle power states, among other things.
How does he get raidz2 to spin down without busting the raidset? Putting drives into sleep states isn't usually good for in-CPU (software) ZFS, is it? Is the L2ARC doing the heavy lifting here?
Good comments about ECC memory in the feedback discussion too.
I've found ZFS to be extremely forgiving of hard drives with random slow response times. As long as you get a response within the OS I/O timeout period, it's simply a matter of blocking I/O until the drives spin up. This forgiveness can honestly cause a lot of issues on production systems with a drive that wants to half-fail rather than fail outright.
From memory, that timeout is on the order of 30-60s.
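If you want to sanity-check those timeouts on a particular box, something like this rough sketch works (assuming Linux with OpenZFS; parameter names and defaults vary a bit between versions):

    #!/usr/bin/env python3
    """Print the kernel block-layer timeout for each disk plus the ZFS
    'deadman' thresholds, to sanity-check whether a spun-down drive has
    time to wake up before anything gives up on it."""
    from pathlib import Path

    def read(path):
        try:
            return Path(path).read_text().strip()
        except OSError:
            return "n/a"

    # Per-device SCSI command timeout, in seconds (commonly 30 by default).
    for dev in sorted(Path("/sys/block").glob("sd*")):
        print(f"{dev.name}: kernel I/O timeout = {read(dev / 'device' / 'timeout')}s")

    # ZFS deadman thresholds, in milliseconds: how long an outstanding I/O
    # may hang before ZFS starts logging/acting on it.
    params = Path("/sys/module/zfs/parameters")
    for name in ("zfs_deadman_ziotime_ms", "zfs_deadman_synctime_ms"):
        print(f"{name} = {read(params / name)}")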
An L2ARC likely works quite well for a home NAS setup, allowing the drives to stay spun down most of the time.
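For anyone who hasn't set one up: an L2ARC is just a cache vdev added to an existing pool. A minimal sketch of how that might look (pool and device names here are made up; adjust for your system):

    #!/usr/bin/env python3
    """Sketch: attach an SSD as an L2ARC cache device so many reads can be
    served without waking the spinning disks. Pool/device names are
    hypothetical."""
    import subprocess

    POOL = "tank"                                # hypothetical pool name
    CACHE_DEV = "/dev/disk/by-id/nvme-example"   # hypothetical SSD

    # Add the SSD as a cache (L2ARC) vdev; this is non-destructive and the
    # device can later be removed again with `zpool remove`.
    subprocess.run(["zpool", "add", POOL, "cache", CACHE_DEV], check=True)

    # Make sure the pool's datasets are allowed to use the secondary cache.
    subprocess.run(["zfs", "set", "secondarycache=all", POOL], check=True)

    # Watch the cache device fill up and start serving reads.
    subprocess.run(["zpool", "iostat", "-v", POOL], check=True)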
Strangely enough, I also built (about 10 years ago now) a home NAS utilizing a bunch of 2.5" 1TB Seagate drives. I would not repeat the experiment, as the performance downsides simply weren't worth the space/power savings.
Then again, I also built a ZFS pool out of daisy chained USB hubs and 256 (255?) free vendor schwag USB thumb drives. Take any advice with a grain of salt.
> home NAS utilizing a bunch of 2.5" 1TB Seagate drives. I would not repeat the experiment, as the performance downsides simply weren't worth the space/power savings.
5400rpm drives? How many, and how bad was the performance?
Late reply, but I believe these were the 4200rpm version - they had to be the "thin" drives to fit into my weird hot-swap bays vs. the "thick" drives you would typically see in datacenter use. Basically this limited you to laptop drives.
I recall these having pretty similar performance to quality USB drives at the time - just more consistent. They maxed out somewhere around 6-8MB/sec for large writes and 25-30MB/sec for sustained reads.
This was at a time when consumer 7200rpm SATA models were hitting probably close to 120MB/sec sustained reads. Time flies and my memory isn't what it used to be, but I'm fairly sure those numbers are ballpark accurate. I still have a half dozen of these lying around the house somewhere; I should track them down one day.
I recently made a RAID5 from a pack of brand new WD10JUCTs (1TB, 5400rpm, 512e, 3Gb/s!, but! CMR) and under Windows softraid they showed up to 500MB/s of sequential reads/writes. Sure, when I installed them in an R720 with a PERC that doesn't understand 512e/4kn drives and then slapped VMDKs on it... things got slow, but not 25-30MB/s.
And my experience with different RAIDs says RPM doesn't really matter: you get roughly N x Usable_Stripe_Count MB/s in sequential access anyway (N being single-disk throughput), and with enough spindles you get decent access times too.
Probably there was a problem somewhere else; 30MB/s is basically USB2 speed, and even 4200rpm drives should show better numbers.
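Back-of-the-envelope version of that scaling claim (the numbers here are purely illustrative, not measurements):

    # Rough sequential-throughput estimate for striped RAID: large reads and
    # writes scale with the number of data-bearing spindles.
    def raid_seq_estimate(per_disk_mb_s, total_disks, parity_disks):
        data_disks = total_disks - parity_disks
        return per_disk_mb_s * data_disks

    # e.g. six 5400rpm drives at ~100 MB/s each in RAID5 (one parity disk):
    print(raid_seq_estimate(100, 6, 1))   # ~500 MB/s, in line with the figure above

    # whereas even very slow 4200rpm spindles should aggregate well past the
    # 25-30 MB/s reported for the laptop-drive pool:
    print(raid_seq_estimate(30, 6, 1))    # ~150 MB/s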
That particular build worked fine, it was just really slow. It was a very ill-advised plan to begin with - I was using 5.25" -> 2.5" hot-swap bay converters for the 2.5" laptop drives. The idea at the time was, I guess, less noise and power, but mostly it was just to say I did it.
Other than being extremely slow, it lasted as my home NAS for 5 or 6 years until I outgrew it.
These days I run a 250TB ZFS pool as my "main" pool, utilizing the 8TB Seagate SMR drives of yesteryear. I have had very few problems with them, but it was a very careful pool design after many years of running ZFS on production systems. So far I've had a few drive failures (as expected), and we're probably ticking closer to a decade than I'd like to admit. It performs far better than I need it to, as it's very much WORM-style storage for backups and media.
yup. the problem is really with the SMR drives, where they can (seemingly) hang for minutes at a time as they flush out the buffer track. ordinary spin-down isn't really a problem; as long as the drives spin up within a reasonable amount of time, ZFS won't drop the disk from the array.
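If you want to see whether a pool is actually hitting those multi-second stalls, OpenZFS keeps per-vdev counters of "slow" I/Os (anything over zio_slow_io_ms, which I believe defaults to 30s). A crude way to watch them (a sketch, assuming OpenZFS 0.8 or newer and a hypothetical pool name):

    #!/usr/bin/env python3
    """Sketch: poll `zpool status -s` to watch the per-vdev slow-I/O counters.
    Handy for telling a pool that briefly blocks on spin-up apart from an SMR
    drive that stalls for minutes while rewriting shingled zones."""
    import subprocess
    import time

    POOL = "tank"   # hypothetical pool name

    while True:
        out = subprocess.run(["zpool", "status", "-s", POOL],
                             capture_output=True, text=True, check=True).stdout
        print(out)
        time.sleep(60)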
ZFS is designed for HDD-based systems, after all. actually it works somewhat poorly for SSDs in general - a lot of the design and tuning decisions were made under assumptions of HDD-level disk latency and aren't necessarily optimal when you can just go read the data straight off the SSD!
however, tons and tons of spin-up cycles are not good for HDDs. Aggressive idle timeouts for power management were famously the problem with the WD Green series (wdidle3.exe lol). Best practice is to leave the drives spinning all the time; it's better for the drives and doesn't consume all that much power overall. Failing that, I would at least consider something like a 1-hour timeout.
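If you do want spin-down anyway, a long standby timer plus keeping an eye on the SMART cycle counters is the usual compromise. Roughly like this (device names are hypothetical, needs root; note hdparm's odd -S encoding):

    #!/usr/bin/env python3
    """Sketch: set a ~1 hour standby (spin-down) timeout on each data disk and
    print the SMART counters that show how hard the drives are being cycled.
    hdparm -S encoding: 241-251 means (value - 240) * 30 minutes, so 242 = 1h."""
    import subprocess

    DISKS = ["/dev/sda", "/dev/sdb"]   # hypothetical device names

    for disk in DISKS:
        # 1-hour idle timeout before the drive spins itself down.
        subprocess.run(["hdparm", "-S", "242", disk], check=True)

        # Start_Stop_Count and Load_Cycle_Count are the attributes that balloon
        # when idle/park timers are too aggressive (the WD Green problem).
        smart = subprocess.run(["smartctl", "-A", disk],
                               capture_output=True, text=True, check=True).stdout
        for line in smart.splitlines():
            if "Start_Stop_Count" in line or "Load_Cycle_Count" in line:
                print(disk, line.strip())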
However, block-level striping like ZFS/BTRFS/Storage Spaces is not very good for spinning down anyway. Essentially every file has to hit every disk, so you have to spin up the whole array. An L2ARC with an SSD behind it might be able to serve a lot of these requests, but as soon as any block isn't in cache you will probably be spinning up all the disks shortly after (unless it's literally one block).
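As a sanity check on how much the cache actually absorbs, the ARC/L2ARC hit counters are exposed via a kstat file on Linux; a rough way to read them (field names as I recall them from current OpenZFS):

    #!/usr/bin/env python3
    """Sketch: compute ARC and L2ARC hit rates from /proc/spl/kstat/zfs/arcstats.
    If the miss rates are non-trivial, the spinning disks will get woken up
    regularly no matter how the pool is laid out. Linux/OpenZFS only."""

    def arcstats():
        stats = {}
        with open("/proc/spl/kstat/zfs/arcstats") as f:
            for line in f.readlines()[2:]:      # skip the two kstat header lines
                name, _type, value = line.split()
                stats[name] = int(value)
        return stats

    s = arcstats()
    print(f"ARC hit rate:   {100.0 * s['hits'] / max(s['hits'] + s['misses'], 1):.1f}%")
    print(f"L2ARC hit rate: {100.0 * s['l2_hits'] / max(s['l2_hits'] + s['l2_misses'], 1):.1f}%")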
Unraid is better at this since it does file-level striping - newer releases can even use ZFS as a backend, but a file always lives on a single Unraid volume, so with 1-disk ZFS pools underneath you will only be spinning up one disk. This can also be combined with ZFS ARC/L2ARC, or Unraid may have its own setup for tiering hot data on cache drives or hot-data drives.
(1-disk ZFS pools as Unraid volumes fit the consumer use-case very nicely imo, and that's going to be my advice for friends and family setting up NASs going forward. If ZFS loses any vdev from the pool the whole pool dies, so you want to add at least 2-disk mirrors if not 4-disk RAIDZ vdevs; but since Unraid works at a file-stripe level (with file mirroring) you just add extra disks and let it manage the file layout (and mirrors/balancing). Also, if you lose a disk, you only lose those files (or mirrors of files) while all the other files remain intact - you don't lose 1/8th of every file or whatever - and that's a failure mode that aligns a lot better with consumer expectations/needs and consumer-level janitoring. And you still retain all the benefits of ZFS in terms of ARC caching, file integrity, etc. It's not without flaws: in the naive case, performance degrades to 1- or 2-disk read speeds (since 1 file is on 1 disk, with e.g. 1 mirror copy) and writes will probably be 1-disk speed; a file or volume/image cannot exceed the size of a single disk and must have sufficient contiguous free space; and snapshots/versioning will consume more data than block-level versioning, etc. All the usual consequences of having 1 file backed by 1 disk apply. But for "average" use-cases it seems pretty ideal, and ZFS is an absolutely rock-stable backend for Unraid to throw files into.)
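Outside of Unraid (which wires this up through its own tooling), the one-pool-per-disk idea can be approximated by hand, roughly like this (pool names and device paths are hypothetical):

    #!/usr/bin/env python3
    """Sketch of the one-pool-per-disk layout: each drive becomes an
    independent single-vdev pool, so losing a drive only loses the files on
    that pool. Names/paths below are made up."""
    import subprocess

    DISKS = {
        "bay1": "/dev/disk/by-id/ata-example-disk-1",
        "bay2": "/dev/disk/by-id/ata-example-disk-2",
    }

    for pool, dev in DISKS.items():
        # One single-disk pool per drive; ashift=12 for 4K-sector drives.
        subprocess.run(["zpool", "create", "-o", "ashift=12", pool, dev], check=True)
        # Checksums still detect corruption, but with no redundancy in the
        # pool ZFS can only report it, not repair it (that's the trade-off).
        subprocess.run(["zfs", "set", "compression=lz4", pool], check=True)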
anyway it's a little surprising that having a bunch of individual disks gave you problems with ZFS. I run 8x8TB shucked drives (looking to upgrade soon) in RAIDZ2 and I get basically 8x single-disk speed over 10gbe, ZFS amortizes out the performance very nicely. But there are definitely risks/downsides, and power costs, to having a ton of small drives, agreed. Definitely use raidz or mirrors for sure.