There’s always a first time
Sometimes it’s great, and sometimes not so great.
Advice to the SAN-lorn
I’m asking StorageMojo.com readers to help this gentleman with his first SAN. I’ll kick off with my take after his letter. He didn’t ask to be anonymous, but out of respect for his heartfelt plea for help, I won’t identify him.
Should there be anyone who has cause why this couple should not be united in marriage, they must speak now or forever hold their peace.
I’m buying a SAN. It’s to make our poor overloaded database server go a bit faster. We’re also going to store our growing archive of 30 million jpgs on it.
After spending what seems like the last 6 months reading white papers, I’ve decided to go for five shiny new LeftHand Networks HP DL320s nodes. 3 nodes have 12×15K SAS drives for the database, 2 nodes have 12×750GB drives for the storage.
We’re a small company, and this is by far our biggest purchase ever – we can’t afford to get it wrong!
Oh how I wish there was a bit more off-the-record chat about this stuff… It seems every word I’ve read in the past months has been written by a vendor. I’ve never read about a problem with anything. Do all SANs work perfectly all the time, or are there often problems? Has anyone ever been disappointed with their SAN purchase?
Anyway – my reasons for choosing the LeftHand stuff are:
- (They claim) random IO performance scales linearly as you add nodes (e.g. 3 nodes = ~5,000 IO/sec, 6 nodes = ~10,000 IO/sec, 9 nodes = ~15,000 IO/sec)
- Unlimited capacity scaling by adding nodes
- Snapshots, remote copy, easy management, thin provisioning etc.
- Low initial costs (~$25,000 per node) – when I say “low”, I guess I mean “almost impossibly high, but lower than the big players”.
Please uncle StorageMojo, am I doing the right thing? Maybe if you posted this on your most excellent blog, your readers might have some advice?
Readers, hear this man’s plea! Please respond in the comments.
The StorageMojo take
Dear USM, you appear to be suffering from anticipatory buyer’s remorse. I commend you. Far better to suffer it now than after your check has cleared. That said, all I’ve heard about LeftHand is that their stuff works well. It does seem a little pricey, though.
I can assure you that many people have been disappointed in their SAN purchases. How to ensure you aren’t the next one is the question. You are thinking about the future and the growth of your application, both good things. Here are some questions.
You didn’t detail how you decided that storage was the problem. I’ll assume you’ve looked at a faster server, more RAM and tuning the database. You don’t mention much about your workload. Thirty million jpegs sounds like a photo-sharing application. Depending on how big the average jpeg is and how visitors use the system, you might be more bandwidth limited than IOPS bound. Do you understand how much load GigE will handle? How much server overhead will be generated by the iSCSI protocol? Are you buying TOE-based HBAs?
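A quick back-of-the-envelope sketch of that bandwidth question, in Python; the average jpeg size and request rate here are purely illustrative assumptions, so plug in your own numbers:

```python
# Rough check: does jpeg traffic threaten a GigE link?
# AVG_JPEG_BYTES and HITS_PER_SEC are illustrative assumptions.

GIGE_USABLE = 1e9 / 8 * 0.8      # ~100 MB/s after framing/protocol overhead
AVG_JPEG_BYTES = 100 * 1024      # assume ~100 KB per image
HITS_PER_SEC = 200               # assumed front-end request rate

needed = AVG_JPEG_BYTES * HITS_PER_SEC
print(f"jpeg traffic: {needed / 1e6:.0f} MB/s of ~{GIGE_USABLE / 1e6:.0f} MB/s usable")
# ~20 MB/s fits comfortably on one GigE link; it's the random reads
# behind cache misses, measured in IOPS, that usually bite first.
```

If your average image is ten times bigger, or the hit rate is ten times higher, the answer flips, which is the point of running the arithmetic before buying.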
StorageMojo readers, please comment. I’ll be interested to hear what you have to say myself.
PS: the subject line of his email was “Dear Uncle StorageMojo” – which got me started on the whole “advice to the lovelorn” theme.
I’ve heard similarly good things about EqualLogic… from their users, that is. As for things going wrong with SANs… um, yeah, they can bork you badly. I don’t know if LeftHand will let you demo… but I’d definitely do a conditional PO if they won’t.
As Robin mentioned, it sounds like more of a bandwidth-intensive app than an IOPS-intensive one. Are the 15K trays for a metadata DB, and the SATA trays for the actual photo storage?
At $25K a node, I’d seriously look at the SunFire X4500 to replace the SATA trays. Put your photo DB/FS right on those puppies and partition at the DB level. They’re about $24,000 for 24TB if you qualify for the Start-Up Essentials program.
Let me go into a little more detail.
>I’ll assume you’ve looked at a faster server,
>more RAM and tuning the database.
Our database server has plenty of CPU horsepower, running under 25% most of the time. It currently has 14 direct-attached 15K SCSI disks in RAID 10. As our database grows larger than the available RAM, the disk subsystem starts to take a hammering. Windows performance monitor shows disk IO/sec pegged at about 1,700, and the sec/transfer figure grows. It’s gotten bad several times over the past year, and adding RAM is an easy fix. Unfortunately the machine is now full of 2GB DIMMs, and although upgrading to 4GB DIMMs is possible, it’s a really expensive way to buy us another few months.
Tuning the database is definitely on the to-do list, but with a hell of a lot of code and only two vastly overstretched developers, it’s not a quick or reliable solution.
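For readers following along, a rough sanity check on that 1,700 IO/sec figure; the per-spindle IOPS number is a rule-of-thumb assumption, not a measured value:

```python
# Ceiling estimate for 14 direct-attached 15K disks in RAID 10.
# ~180 random IOPS per 15K spindle is an assumed rule of thumb.

DRIVES = 14
IOPS_PER_DRIVE = 180
READ_FRAC, WRITE_FRAC = 0.75, 0.25   # mix from the workload description below

raw = DRIVES * IOPS_PER_DRIVE                    # ~2,520 spindle IOPS
# In RAID 10 each host write costs two spindle I/Os (both mirror halves).
effective = raw / (READ_FRAC + 2 * WRITE_FRAC)
print(f"effective host IOPS: {effective:.0f}")   # ~2,000
# Perfmon pegging near 1,700 IO/sec is consistent with a saturated array.
```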
>You don’t mention much about your workload.
OK so we have a database server that handles about 2,000 transactions per second, with 75% read, 25% write. This will double within the year. The database is currently about 80GB in size. When the SAN is in place, I may introduce a second database server.
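As a sizing sketch against LeftHand’s claimed scaling (and it is only the vendor’s claim), the arithmetic looks like this; the I/Os-per-transaction ratio is an assumption you should replace with a measured figure:

```python
# How many nodes cover next year's load, taking the vendor's
# linearity claim (3 nodes ~ 5,000 IO/sec) at face value?
import math

TPS = 2_000
GROWTH = 2.0                 # "will double within the year"
IOS_PER_TXN = 1.0            # assumed; measure yours with perfmon

needed = TPS * GROWTH * IOS_PER_TXN        # ~4,000 IO/sec
per_node = 5_000 / 3                       # claimed per-node rate
print(f"nodes needed: {math.ceil(needed / per_node)}")   # 3
```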
>Thirty million jpegs sounds like a photo-sharing application.
>Depending on how big the average jpeg is and how visitors
>use the system, you might be more bandwidth limited than
>IOPS bound.
Don’t worry about performance when thinking about the jpeg archive requirement – actually serving the static content to the web is handled by content cache servers. 200 hits/sec is reduced to 10 disk reads/sec. All we need out of the capacity storage requirement is an easily expandable modular pool that will replicate to an identical set-up off-site. We currently use about 5TB, again to double within a year.
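The implied cache hit rate is worth writing down, because back-end load is very sensitive to it (the first two figures are from the letter; the what-if rates are hypothetical):

```python
# 200 front-end hits/sec -> 10 disk reads/sec implies a 95% hit rate.
hits, disk_reads = 200, 10
print(f"hit rate: {1 - disk_reads / hits:.0%}")          # 95%

# What-if: a small drop in hit rate multiplies SAN read load.
for hr in (0.95, 0.90, 0.80):
    print(f"{hr:.0%} hit rate -> {hits * (1 - hr):.0f} disk reads/sec")
```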
>Do you understand how much load GigE will handle?
>How much server overhead will be generated by the
>iSCSI protocol? Are you buying TOE-based HBAs?
The two database servers will be the only performance-intensive users of the SAN. I’m going to attach them with 10Gb Ethernet cards (the Chelsio S310E-CX looks good). The whole lot will plug into a switch with 48 gig ports and a couple of 10 gig ports. Choosing a switch is another conundrum… Will a $20,000 Cisco 4948 be any better than a $2,000 Dell 6248?
>At $25K a node, I’d seriously look at the
>SunFire X4500 to replace the SATA trays.
>They’re about $24,000 for 24TB if you
>qualify for the Start-Up Essentials program.
The X4500 looks interesting… Unfortunately we don’t qualify for Start-Up Essentials because we’re not based in the USA.
Keep up the great work!!!
Most of my post is quick background for people who haven’t been reading “white papers” for 6 months.
This is the best information to come along since:
1) Sam’s SAN Diary – a 34 part series circa 2003. A bit dated technically but the process is still the same. If you nail the process, only the “bells and whistles” change. The key comment Sam made was “Test, Test, Test!”. Easy to say, hard to do.
http://www.varbusiness.com/showArticle.jhtml?articleId=18839308&printableArticle=true
2) Ken Gibson’s “Storage Thoughts” Blog posts about:
2a) “Seeing What’s Next, Fourth and Final Part” – the series
http://www.storagethoughts.blogspot.com/
2b) “Array Vendor Chart”
http://storagethoughts.blogspot.com/2006/10/array-vendor-chart.html
3) StorageMojo post “Cool Data, Cold Cache”
http://storagemojo.com/?p=370#comments
“Nigel’s comment” which led to this post:
“The MySpace storage monster”
http://blogs.rupturedmonkey.com/?p=53
Some of the Information in this post about the MySpace configuration is interesting because it applies to “Information High Availability” and “The Speed Limit of the Information Universe” both of which are important for your IT shop. Sounds like you have a good handle on both of those.
Basically, the new options today focus on making the IT shop Information Centric rather than Technology Centric. This is done by virtualizing servers and Storage. This frees Storage from being bound to a specific “Enabling” Technology, i.e., it becomes vendor-free.
The second big change is the number of channels into the Information offered by this virtualization. And the ease with which they can be added.
Bandwidth on Demand.
I’ve been doing some SAN research myself and LeftHand Networks was on the top of my list as well. Key factors that set them apart:
Scalability: The fact that they can seamlessly add nodes to a cluster of storage bricks on-the-fly to increase performance (more spindles) or capacity, without downtime, is truly remarkable. Looking under the hood, their MPIO load balancing isn’t your typical round-robin or least-active scheme. LeftHand’s MPIO plug-in allows them to map the data locations at the server and do parallel reads and writes directly to the nodes that hold the data. Other vendors like EqualLogic have to do routing table lookups on every I/O to find out which node has the data, which slows things down and limits performance scalability.
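To make the distinction concrete, here is a toy sketch of client-side data locality; the region size and layout function are hypothetical stand-ins, not LeftHand’s actual placement scheme:

```python
# Toy illustration: the host holds a layout map, so each I/O is
# dispatched straight to the owning node with one local lookup.

REGION_SIZE = 256 * 2**20                 # assume 256 MB placement regions
NODES = ["node-a", "node-b", "node-c"]    # hypothetical cluster

def dispatch(offset: int) -> str:
    """Return the node owning this byte offset -- no extra network hop."""
    region = offset // REGION_SIZE
    return NODES[region % len(NODES)]

print(dispatch(0))              # node-a
print(dispatch(300 * 2**20))    # node-b (second region)
```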
HA: LeftHand is the only vendor that I know of that can sustain node-level failures with their clustering software. I believe this is important because as you grow the cluster, the chances of a node-level failure increase. Other vendors claim that this isn’t an issue if you have “hardened nodes” (redundant controllers, etc.), but from my experience, array-level failures do occur: processor faults, mid-plane failures, network port failures, software hangs, etc.
Hardware: This is probably the most interesting aspect of their solution. LeftHand uses what most people consider servers as storage nodes. My first thought was: where does the storage come from if it’s a server? As I dug into this I realized many of the new servers on the market are loaded with disks. The HP DL320S supports 12 drives, which is a common drive density among purpose-built storage arrays. The good news here is that the hardware is designed, manufactured and supported by top-tier vendors like HP and IBM. You don’t have to worry about one-off, proprietary hardware arrays and how long they will be around. You also have the choice to use LeftHand’s hardware monitoring software (included in the base offering), or use the same hardware monitoring software that you use for your servers.
I don’t know about you, but I’m tired of the Evil Machine Companies of the world making our company constantly pull out our checkbook for proprietary hardware that becomes obsolete every couple of years and forces our hand to do another forklift upgrade. With LeftHand, at least you know you’re buying an industry-standard server that will be around for a while, and you have the flexibility to re-purpose it or build out different tiers of storage clusters as newer hardware comes to market.
John
I’m Marc Farley and I work for EqualLogic, so obviously I have a bias in favor of our technology solutions. I think it’s fair to say that John Worcester, who commented above, is not a customer of ours and probably does not speak from experience about our products.
The comment about routing and I/O performance is telling. If you have multiple systems of ours working together in a cluster (actually it’s more like a grid), some I/Os will probably be routed between systems and some won’t. If we used an off-the-shelf PC server architecture, where routing, storage targets, RAID and all other functions run as applications requiring context switches between user space and kernel space, I can see where this might be a problem. But we don’t use a PC architecture and all the processing baggage it implies; we have a multi-processor architecture developed specifically for solving storage and I/O problems. Our customers almost universally tell me our performance is beyond their needs and expectations, and when they add additional systems they are pleasantly surprised to find that their performance gets better.
People don’t use PCs for network routing because special-purpose technologies do the job much better. Storage is no different. As an example, look into how component swapping works with an integrated storage solution like EqualLogic and how it works with a software+PC product bundle from a company like LeftHand. There is a great deal of comfort in having a total solution that supports hot spares and hot swapping and includes call-home features into the vendor’s support network. These “little details” are more than just stress relief; they can make the difference between a disaster and business as usual.
Where system failures are concerned, people are advised to do their homework and look at what kinds of failures are needed to cause a software+PC bundle to fail versus what kinds of failures are needed to cause an integrated storage solution like EqualLogic’s to go down. If you do the failure-analysis work, you’ll see that the integrated solution is far more robust in handling many more failures. If you need to protect against a site disaster, then use replication, something that is bundled with our solutions at no additional cost.
We encourage prospects to try our systems and compare them head-to-head with LeftHand, EMC, HP, Network Appliance and anybody else.
Marc
I’m John Spiers and I work for LeftHand Networks, and I also have a bias, but will try my best to state the facts in an impartial way.
Dear customer, here is what you will find with LeftHand that you can’t find with alternative solutions like EqualLogic:
1. Synchronous replication across any number of nodes, clusters or sites. This gives customers the flexibility to build out many HA configurations that can accommodate multiple node- and site-level failures (see the sketch after this list).
2. Support for enterprise-class, high-performance x86-based storage servers.
3. Patent-pending MPIO-DSM load balancing with host data-locality awareness.
I would be happy to walk through these things directly at any time.
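On point 1, a minimal sketch of what a synchronous-replication write path looks like in general; this illustrates the technique, not LeftHand’s actual implementation, and the node names are hypothetical:

```python
# Synchronous replication: acknowledge the host's write only after
# every replica has persisted it, so any single node can fail without
# losing acknowledged data. Replica set and persist() are stand-ins.
from concurrent.futures import ThreadPoolExecutor

REPLICAS = ["node-1", "node-2"]          # assumed 2-way replica set

def persist(node: str, block: bytes) -> bool:
    # Stand-in for shipping the block to `node` and forcing it to disk.
    return True

def replicated_write(block: bytes) -> bool:
    with ThreadPoolExecutor(max_workers=len(REPLICAS)) as pool:
        acks = list(pool.map(lambda n: persist(n, block), REPLICAS))
    return all(acks)                      # ack the host only on full success

assert replicated_write(b"page 42")
```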
John Worcester and Marc mentioned a key technology that differentiates LeftHand from the rest of the storage clustering pack. With “store-forwarding” clustering architectures, the fraction of I/Os that must be routed grows with the number of nodes in the cluster. The probability of an I/O being located on the node it was sent to is equal to one divided by the number of nodes (1/n), assuming volumes span all nodes. Imagine many hosts performing random I/O operations on a large cluster of nodes, and the nodes having to resolve and forward a large percentage of the I/Os. The aggregate latency of millions of simultaneous I/O operations being forwarded can severely degrade performance, limit scalability, and may even saturate some Ethernet switches. LeftHand’s MPIO-DSM eliminates the routing table lookup and forwarding operation by having a data layout map at the client so that I/Os are sent directly to the node where the data resides.
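The arithmetic behind that claim is easy to show; this just restates the 1/n argument above, it is not a benchmark:

```python
# In a store-and-forward cluster, an I/O lands on its owning node
# with probability 1/n, so (n-1)/n of requests take an extra hop.
for n in (3, 6, 9, 12):
    print(f"{n:>2} nodes: {(n - 1) / n:.0%} of I/Os forwarded")
# 3 nodes: 67% ... 12 nodes: 92%. A client-side layout map sends
# each I/O directly to the owner, so 0% are forwarded.
```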
Just so readers are not confused: LeftHand does not use PC technology; we leverage high-performance, enterprise-class storage servers with the latest dual- and quad-core processing technology, manufactured by IBM and HP. Examples:
http://h10010.www1.hp.com/wwpc/pscmisc/vac/us/en/ss/proliant/dl320s-benefits.html
http://www-03.ibm.com/systems/x/rack/x3650/index.html
These are the same servers big enterprises trust for their applications. They have redundant power supplies and hot-swap redundant cooling, power and hard disk drives for high availability. The hardware can also be monitored using the standard server-monitoring tools provided by the manufacturer, so that you can monitor your server and storage hardware with the same tools.
Let’s compare EqualLogic’s five-year-old Broadcom BCM1250 chipset with Intel’s latest dual-core server chipsets:
                        Intel Server Architecture    Broadcom BCM1250
Memory Bandwidth        21.3 GB/s                    6.5 GB/s
PCI Bus I/O Bandwidth   8 GB/s                       0.5 GB/s
CPU Clock Rate          3.7 GHz                      1 GHz
Other storage vendors are also moving to x86 based controllers and arrays, such as NetApp and their recent move towards AMD, and EMC’s move to Intel.
http://www.netapp.com/library/cs/amd.pdf
The last point here is that if the storage solution provides the capability to cluster many nodes together and span volumes across those nodes for maximum performance (more spindles working for each volume), then the storage solution should be able to sustain a node-level failure without losing access to data. Redundant controllers and power supplies do not provide enough redundancy when clustering many nodes together. There can be a mid-plane failure, a second drive failing in a RAID 5 set during a rebuild, batteries failing on cache controllers during a power outage, or a node disconnecting from the network for any reason. This is where LeftHand separates itself from the pack. LeftHand protects a customer’s data in all these scenarios.
John Spiers
I spent about a year researching iSCSI solutions because I did not want to get burned. My biggest determining factor was feedback from independent users, as I have a little distrust of fast-talking sales folks. The overwhelming majority were in favor of EqualLogic. In fact, there are many folks, including myself, who were turned off by LeftHand’s tactics of discrediting EqualLogic instead of focusing on the strengths of their own product, and so I ended up buying from EqualLogic. LeftHand just might have a better product, but so far it has not been proven by the market. Hondas and Toyotas speak for themselves in the auto industry, and I would expect the same from any storage vendor that intends to be a leader.
You’ve probably made your purchase already, but let me echo what Marc Farley noted earlier. As a third-party consultant in IT, we’ve been contracted by EqualLogic to create a channel program. As chief researcher on this project, I was able to do a deep dive on LeftHand, Compellent, NetApp, EMC and others who compete in the iSCSI space with EqualLogic. At the end of the day, I suggest you do as most EqualLogic folks will tell you: Get all the cake mixes in the kitchen and do a bake-off. You’ll be eating EqualLogic in thirty minutes.
I think the compelling factor is that as you scale, it actually works better/faster. Marc’s citation of his end users is echoed by exactly what I’ve heard. Storage Magazine notes that EqualLogic is the only iSCSI vendor with an over-90% “would buy this again” rating (93%) in the marketplace. This is vs. Sun, HP, NetApp and EMC (the next nearest was HP at 82%). Some secondary commentary from end users includes the ease of use – robust solutions managed by fewer individuals – and ease of scalability. Most also agree that the fact that off-the-shelf EqualLogic arrays come with everything you need out of the box makes your two-line EqualLogic quote much easier to digest than the confusing competitive quotes. Customers love EqualLogic. You most likely will as well.
Craig,
You seem to make a compelling case for EqualLogic customer loyalty. Can you address John Spiers’ claims, which seem to position LeftHand as a more scalable solution than EqualLogic? Specifically: increased I/O as the number of nodes increases, node-level failover, use of HP and IBM equipment that does away with proprietary trays, and their MPIO load balancing.
General side question: MPIO only works in Windows, right? Is LH really an enterprise-class solution?
Sounds like the major difference here is that LHN mirrors data across multiple nodes so it can withstand more faults. I would ask EQL and the other vendors mentioned whether they can withstand a whole storage system (or site) being offline without loss of data or access to data.