“Call home” support has been standard in large arrays for 15 years. But Nimble Storage has kicked it up a notch with their advanced telemetry data from installed systems. It gives new meaning to the term “after-sale support.”
Talk to me
Their system gathers configuration details and more: feature use (such as snapshots and backups), volume protection and application performance. The data is updated every 10 minutes.
Here’s a partial screen shot of representative data:
Nimble now has over 4TB of customer use data. Customers opt in to the program. Over 80% have.
Uses
When a problem is detected, Nimble’s software creates a trouble ticket. Then a human gets involved.
It might be as simple as having more Ethernet links on one of the active-passive storage controllers than on the other, causing asymmetrical performance. Perhaps a volume is not protected. Or the replication policy won’t meet RPO objectives.
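To make those three examples concrete, here is a minimal Python sketch of rule-style checks run against a telemetry report. Everything in it is an assumption for illustration: the `ArrayReport` and `Volume` types, their field names and the `find_issues` function are hypothetical, not Nimble's actual schema or software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Volume:
    name: str
    protected: bool  # hypothetical flag: volume has a protection schedule
    replication_interval_min: Optional[int] = None  # minutes between replicas
    rpo_min: Optional[int] = None  # recovery point objective, in minutes


@dataclass
class ArrayReport:
    array_id: str
    links_a: int  # Ethernet links on controller A (hypothetical field)
    links_b: int  # Ethernet links on controller B (hypothetical field)
    volumes: List[Volume] = field(default_factory=list)


def find_issues(report: ArrayReport) -> List[str]:
    """Return human-readable findings that could seed a trouble ticket."""
    issues = []
    # Unequal link counts mean performance changes after a controller failover.
    if report.links_a != report.links_b:
        issues.append(
            f"{report.array_id}: controllers have unequal Ethernet links "
            f"({report.links_a} vs {report.links_b})"
        )
    for vol in report.volumes:
        if not vol.protected:
            issues.append(f"{report.array_id}: volume {vol.name} is not protected")
        # Replicating less often than the RPO allows cannot meet the objective.
        if (
            vol.rpo_min is not None
            and vol.replication_interval_min is not None
            and vol.replication_interval_min > vol.rpo_min
        ):
            issues.append(
                f"{report.array_id}: volume {vol.name} replicates every "
                f"{vol.replication_interval_min} min, RPO is {vol.rpo_min} min"
            )
    return issues
```

Each finding is just a string here; in a real support pipeline it would presumably carry a severity and feed the ticketing step before a human gets involved.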
Email alerts flag issues to customers. Nimble support engineers can log in remotely for real-time troubleshooting.
But that’s not all!
Detailed information on usage allows customers to compare their usage to average usage. Nimble can also look at how customers with the most efficient utilization manage their systems, automating the documentation of best practices.
For example, backup: most customers are retaining snapshots for more than a month. Over 50% of customers replicate workloads for DR.
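The comparison against average usage can be sketched in a few lines of Python. This is only an illustration: the `compare_to_fleet` function and the idea of comparing a single metric, such as snapshot retention in days, are assumptions and not a description of how Nimble computes its comparisons.

```python
from statistics import mean
from typing import Dict, List


def compare_to_fleet(own: float, fleet: List[float]) -> Dict[str, float]:
    """Compare one customer's metric with the same metric across all customers."""
    if not fleet:
        raise ValueError("fleet must contain at least one value")
    avg = mean(fleet)
    # Share of fleet values strictly below this customer's value.
    below = sum(1 for value in fleet if value < own)
    return {
        "fleet_average": avg,
        "difference": own - avg,
        "percentile": 100.0 * below / len(fleet),
    }
```

Calling `compare_to_fleet(30, [10, 20, 30, 40])`, with the numbers read as hypothetical retention periods in days, reports a fleet average of 25, a difference of 5 and a percentile of 50.0.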
The StorageMojo take
This is what 21st century support should look like. The best infrastructure is invisible – until it breaks – and the best support keeps the infrastructure from breaking.
The bad news: customers don’t want to buy and manage storage arrays. The good news: they want fast and reliable access to their data.
The Apple model of “it just works” breaks down if the applications are too complex, as many data center applications are. But that doesn’t require vendors to throw all the load on customers.
Automating the capture, review and disposition of system data gives a vendor important advantages:
- Perceived reliability goes up, a fact established with early phone-home experience.
- A stronger customer relationship makes follow-on sales easier and is a competitive barrier.
- The “virtual user group” of shared data enables users to get smarter, faster using their Nimble arrays.
- The real-time remote troubleshooting gives customers help when they need it most – not 4 hours later.
Courteous comments welcome, of course. What other support strategies have you experienced that either worked well – or didn’t? If you want to learn more about Nimble, I did a video white paper on Nimble’s architecture last year.
There are a few startup storage companies at the moment (Pure Storage, Tintri, Tegile and Nimble Storage, to name a few). They all provide “disruptive” storage solutions with pricing that undercuts the incumbents.
I think it’s great to see maturity developing among these startups, beyond the “Gee Whiz” technology story and beyond the sale, to look at what customers require on an ongoing basis.
Interesting times for the incumbents. Will they wait to see if these upstarts fail (hopefully not) or start buying them out?
I was told this Nimble feature uploads twice a day, not real time.
Jacob,
I believe you are correct. An engineer from Nimble sets us both straight. It is the remote login by a support engineer that is real time.
Robin
We do in fact receive certain configuration, HW and SW health-check information as well as capacity and performance metrics every 5-7 minutes from the array. This all feeds into the same back-end mechanisms that drive our monitoring and automation.
Robin – great overview! One minor note: actually over 90% of our customers opt-in to the automated support program (we call it “proactive wellness”), given all the benefits.
NetApp does this already. I wouldn’t classify this as disruptive.
Random, NetApp’s says:
Maybe they do it and don’t say it, but it looks like they focus on system health and do not handle configuration issues like the MPIO problem. Also giving a customer tools to do their own modeling is hardly the same as the support organization looking at it themselves. In my experience it is a rare customer who actually tries to model their infrastructure. Instead they use a combination of rule of thumb and squeaky wheel lube in practice.
With Nimble’s focus on small and medium enterprises they are disruptive for existing vendors in that market.
At Starboard Storage Systems we also use remote login to the customer’s system as part of our phone home process. It is real-time support like this that really makes the customer feel valued.
Interesting. I would argue that it doesn’t compare to the DTrace-based analytics on the Oracle ZFS Storage Appliance. They offer real-time information and troubleshooting right to the storage admin, without having to log in to another site or get someone from support on the phone. They also, of course, have phone home and automatically create tickets and such. I made a small demo video (https://blogs.oracle.com/si/entry/oracle_zfssa_hybrid_storage_pool1) that covers more of the hybrid storage pool design than the detailed level of analytics, but you do get a quick glimpse of some of the questions you can ask, both real-time and historical. How many IOPS is an individual VM getting? What is the read/write ratio of that VM? What is the latency of that VM? What is the latency of a particular ESX server or Oracle DB server? How many IOPS of this VM are coming from cache… and it goes on and on and on. True power to make intelligent decisions about your storage. Robin, I would be happy to give you a tour of the ZFSSA analytics anytime.
Darius,
Those are two very different things. What you are talking about is reporting capabilities around what is happening to the various data components (volumes, VMs, NICs, etc.) from a performance perspective. Nimble Storage does that as well. What this post was about is the extension of that into a fully proactive monitoring and support infrastructure to take things to the next level. Reporting performance is definitely a great capability, which is why Nimble has that built right into the interface. But having support be notified of a potential problem before it even arises, and have support automatically send out a detailed resolution is unparalleled. And that’s only the beginning…
Hmm EMC arrays used to do this and it was part of the support contract requirements back in the early days of EMC. A modem was required by some support contracts. EMC Engineers would occasionally show up with parts at the site, and customers didn’t even know there were issues yet. Not sure how this is disruptive though. Hitachi, NetApp, EMC, and others have had failure-only services like this for quite some time. Even third party (well sort-of) providers like Vion do this today already.
Analytics is an entirely different ballgame. Bring on the ‘fishworks’ style ZFS tools in other arrays, industry… PLEASE!