That's actually fascinating. What was the latency overhead between local and remote storage back in those days? Do you think LeftHand, EqualLogic, and some of those other early clustered "scale-out" solutions were inspired by that deployment model?
I did give one presentation on the general design, and it is clear that many of the European Usenet providers implemented this or something similar. The bit that's interesting to me is that this was really an early implementation of object-based storage, before anyone had really identified that as a thing.
So, more or less, the wall was high concurrency that led to high contention, lots of random I/O, and platter heads flipping every which way?
Well, there were numerous things killing performance. At the time, it was common for providers to use some external DAS solution, and Newshosting was based on external Infortrend DAS units (3U, 16 drives) with a RAID controller. Now on one hand that's great and all, but with a 64KB stripe size, and the average Usenet article being anywhere from 500KB to 1MB in size, reading a single article dragged a bunch of drives into seeking for that one retrieval, and this sucked. The SAN-based solutions used by other providers were also typically RAID5-based, usually with a better stripe size like 256KB, so this wasn't killing them quite as badly.
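To put rough numbers on that, here's a back-of-the-envelope sketch, not anything from Diablo or the actual shelf config; the 15-data-drive figure is just my assumption for a 16-drive RAID5 shelf:

```python
# Hypothetical illustration: how many spindles one article read drags into
# seeking on a striped array, assuming the article is laid out contiguously.
STRIPE_SIZE = 64 * 1024   # 64KB stripe, as on those DAS units
DATA_DRIVES = 15          # assumption: 16-drive shelf, one drive's worth of parity

def drives_touched(article_bytes: int, stripe: int = STRIPE_SIZE) -> int:
    stripes = -(-article_bytes // stripe)   # ceiling division
    return min(stripes, DATA_DRIVES)

for size_kb in (500, 768, 1024):
    print(size_kb, "KB ->", drives_touched(size_kb * 1024), "drives at 64KB stripe,",
          drives_touched(size_kb * 1024, 256 * 1024), "at 256KB")
```

At a 64KB stripe nearly the whole shelf ends up seeking for a single article, while a 256KB stripe keeps it to a handful of drives, which is roughly why the SAN crowd hurt less.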
All the while, the bottleneck was the FC connection to the shelves? :P
That's unclear. As you know, bandwidth from a shelf decreases dramatically as the percentage of random I/O (meaning "requiring seeks" for everyone else in the audience here) increases. I'm sure some SAN vendors got rich from selling their crap to Usenet providers and I'm sure some providers may even have desperately paid for high bandwidth FC.
So let me outline this for you a bit. I rewrote Diablo (a provider-class Usenet transit and service package that we maintain here), which takes an incoming NNTP article stream and stores the articles on its spool drives. A spool server would have 24 independent UFS/FFS filesystems storing these, so an article retrieval would only ever involve a single drive, and you could have up to 24 concurrent retrievals going on without blocking (of course statistics mean it never quite worked out that awesomely, but half that was common).
You would then have 11 of these systems in a rack, stacked. You would then need to know where an article was going to be found, so I wrote a hashing algorithm and modified Diablo's spool access code to take a Message-ID, run it through MD5, and then use portions of the resulting digest. Hash(Message-ID) would return an integer, in this example an int between 0 and 10, so that a front end would be able to predict that Message-ID <123456@freenas.org> would be on spool host 6. So you just ask spool host 6 for the article, and spool host 6 retrieves it from one of its 24 FFS filesystems, probably using only a single seek or maybe two.
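A minimal sketch of that lookup in Python, assuming my own digest slices and host/filesystem counts purely for illustration (the real thing lives in Diablo's spool code in C, and the exact bits used were different):

```python
import hashlib

NUM_SPOOL_HOSTS = 11    # 11 spool servers in the rack, numbered 0-10
NUM_FILESYSTEMS = 24    # independent UFS/FFS filesystems per spool host

def spool_host(message_id: str) -> int:
    # One portion of the MD5 digest picks which host holds the article...
    digest = hashlib.md5(message_id.encode()).digest()
    return int.from_bytes(digest[0:4], "big") % NUM_SPOOL_HOSTS

def spool_filesystem(message_id: str) -> int:
    # ...and another portion picks the filesystem (drive) within that host.
    digest = hashlib.md5(message_id.encode()).digest()
    return int.from_bytes(digest[4:8], "big") % NUM_FILESYSTEMS

mid = "<123456@freenas.org>"
print("host", spool_host(mid), "filesystem", spool_filesystem(mid))
```

Any front end can run the same hash and go straight to the right box and the right drive, so nobody ever has to ask around to find where an article lives.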
Now if you're paying attention, you'll notice that without RAID of some sort, a disk failure would render parts of the overall spool irretrievable. We had several ways of dealing with that. One was that there's so much spool activity that a large provider like Newshosting really needed two or three sets of spools anyway, so you just run multiple spools (redundant array of inexpensive servers!) and use a DIFFERENT portion of the MD5 bitfield for the hash on the second spool, so that the failure of a disk or of a whole spool server doesn't result in a hot spot developing on the second spool. That way, <123456@freenas.org> hashes to spool host 6 on the first spool set, spool host 10 on the second spool set, and spool host 3 on the third spool set.
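The multi-spool trick is just the same hash fed from a different chunk of the digest per spool set; something like this, where the offsets are hypothetical and the only point is that the sets decorrelate:

```python
import hashlib

NUM_SPOOL_HOSTS = 11
SET_OFFSETS = [0, 4, 8]   # hypothetical: a different digest slice per spool set

def spool_host_for_set(message_id: str, spool_set: int) -> int:
    digest = hashlib.md5(message_id.encode()).digest()
    off = SET_OFFSETS[spool_set]
    return int.from_bytes(digest[off:off + 4], "big") % NUM_SPOOL_HOSTS

mid = "<123456@freenas.org>"
print([spool_host_for_set(mid, s) for s in range(3)])
# Lose a disk or a whole host on one set and the affected articles are spread
# across DIFFERENT hosts on the other sets, so no single box becomes a hot spot.
```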
Eliminating seeks by knowing where you are likely to find stuff was one of the key technical elements that lit off the Usenet retention wars. Another was the move to inexpensive SATA mass storage, and then just an overall design that allowed for massive scalability. It's funny, because I sometimes run across "discussion" threads on web forums or Reddit where people talk about what they think happened, and much of it is just people talking out their arse, with no idea of what was actually going on.
The Retention Wars were great. For some years, providers watched each other like hawks and would try to out-retention each other. Because it was assumed to be costly to add retention, if you had 1100 days of retention and your competitor expanded to 1400, the competitor sort of expected it would take you several hundred days to fill out those 300 days of additional retention if you added it. But one of my little NNTP extensions was MAXAGE, which would "hide" articles older than MAXAGE, and one of my clients used this to competitor-deflating advantage by simply having all that storage ready to go and matching the competitor the day after their retention increase announcement, just by twiddling MAXAGE 300 days higher. This caused much consternation on some of the discussion forums as to how they had magically increased retention overnight, since everyone THOUGHT the mechanics of expanding retention were well understood.
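The mechanism itself is about as simple as it sounds; a toy sketch of the idea, not the actual extension code:

```python
# MAXAGE idea: the spool can hold far more than the advertised retention;
# readers just never see anything older than MAXAGE until you raise it.
def visible(article_age_days: float, maxage_days: int) -> bool:
    return article_age_days <= maxage_days

maxage = 1100                 # advertised retention
print(visible(1250, maxage))  # False: on disk, but hidden from readers
maxage += 300                 # competitor announces more; twiddle MAXAGE up
print(visible(1250, maxage))  # True: "retention" jumps overnight
```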