The Next Big Things

by Robin Harris on Thursday, 11 March, 2010

The Wall Street Journal recently published a ranking of the top 50 venture-backed companies – and storage got its share.

#2 on the list: Fusion-io, a company StorageMojo has followed from the early days. They’ve closed major deals with IBM, HP, Dell and Samsung. From the WSJ profile:

Fusion-io has a weight-loss program for data-fat Web businesses, and its “storage accelerator” has been embraced by Facebook and MySpace — and a thousand other companies looking to do more with less in a bad economy, co-founder David Flynn says. “Our datacenter uses half the power [and] half the floor space.” . . . It aims to displace storage arrays from the likes of EMC Corp. and competes with flash makers from Micron Technology Inc. to Intel Corp.

#20 is Silver Peak Systems whose appliances accelerate data transfer. From the WSJ profile:

Using proprietary algorithms and the power of multi-core chips, it’s helping customers like Google Inc., AT&T Corp. and Visa Inc. shuttle data between datacenters at a rate of one gig per second, often to backup locations. That’s “a thousand times faster than what was done historically,” says founder David Hughes. . . . He took techniques of “wide-area-network optimization,” used mainly to move data between branch offices, and super-sized for the datacenter. Today, Silver Peak, which has raised $60 million over four rounds, is growing and profitable.

#26 is Metaweb Technologies, whose product, Freebase, is an open database of online information. The big problem with massive data is search – and we’re just at the beginning of the massive data era. From the WSJ profile:

Metaweb . . . rejects freewheeling keywords (Google’s currency), because their multiple meanings can sometimes baffle computers. . . . It’s better, Metaweb believes, to organize information into people, places and things, or what it calls “entities.” . . . After five years of toil, Metaweb and its collaborators have created 12 million entities and mapped how they relate to each other.

#34 is Schooner Information Technology, which builds acceleration appliances for popular software like MySQL. From the WSJ profile:

The machines are also optimized to run the popular free middleware programs MySQL, an open-source database program, and Memcached, a Web site memory-caching system. The approach boosts speed by 10 times, which lets datacenters cut costs in half by dropping server numbers, power use and real estate needs. Appliances for other applications glisten in Schooner’s future. . . . Early customers include Flixster and Plaxo Inc.

#35 is Vidyo, which produces HD-quality video-conferencing apps and services via personal computers. From the WSJ profile:

With a new video-conferencing architecture built on an emerging standard called scalable coding, it can deliver high-quality video without annoying pauses to multiple users via their computers for just cents per minute.

#41 is Force10 Networks, if you can call a company with $200 million in revenue a startup. Their 10Gig Ethernet switches and routers are popular with Internet datacenter folks.

#43 is Akorri Networks whose focus is software for managing virtualized servers and the entire network environment inside large datacenters. Founded by Rich Corley, formerly of Pirus. From the WSJ profile:

Akorri . . . built a management system that both understands virtual machines and helps pros manage datacenters as a single system, rather than as individual components. The tools pinpoint problems quickly so datacenters can better utilize the equipment they have, get competing parts of the system to play better together, and boost overall performance. . . .

The StorageMojo take
Why are those networking companies on a StorageMojo list? Because bandwidth and storage are fungible at about the 80% level. If you have lots of bandwidth you need less local storage – and the reverse is true.

Faster and more abundant data means more storage, local and cloud. What makes this fun is that what “lots” means keeps growing.

When I got my first personal hard drive – 30 whopping MB – I couldn’t imagine ever filling it. Now I ignore 2 GB thumb drives because they’re too small. I routinely generate 5-10 GB files.

I’m ahead of the curve, but not by much. People like moving pictures and sharing. Businesses like understanding their customers. Massive data storage helps us do both.

Courteous comments welcome, of course. I’ve done work for Fusion-io. Learn more with this Video White Paper, my own contribution to our growing storage and bandwidth needs.

nate March 12, 2010 at 10:54 am

I don’t know if I agree that if you have a lot of bandwidth you need less storage. If you have something like dedicated circuits/leased lines where you’re paying a flat rate, maybe that is true.

But as someone who has access to a lot of bandwidth, actually USING that bandwidth is very expensive. Far more expensive than hosting storage locally. And the rate we pay is pretty good given the volume of data we do run through the pipes.

Latency is also of course a huge issue when dealing with storage over a WAN.

Robert Ross March 12, 2010 at 11:36 am

Bandwidth is the Achilles’ heel of cloud computing (aka remote storage). Whether for remote SBC/App, VDI, or storage applications, the installed base of integrated services (voice+data) and the prevalence of asymmetric (“async”) topologies from broadband providers are proving to be a significant barrier to achieving performance. I’m working on some new buzz lines, such as “Bandwidth is cheaper than labor.” See if this computes…

VDI, remote App/SBC, cloud storage should all result in reduced costs at either salary, hardware or outsourcing level. This in a growth environment; status quo should work but ROI calc should be longer. Choose salary…

Reduce payroll and burden by $36,000 per year for one help desk position. I choose entry level on purpose. This is even better if specialized talents can be replaced. Build ISP-style redundant bandwidth for an incremental increase in telecommunications expenses of $12,000 per year (this is the capex and opex for a Cisco VXR and 30 to 40 mbps of BGP-backed Internet – full duplex, not async). Problem solved. This is a bit oversimplified, but the margins should survive more involved analysis. The cost per mbps is declining. Labor is not.

Choose outsourced tech support or additional hardware costs and run another set of numbers. It still works out nearly every time.
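The back-of-envelope comparison in this comment can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the $36,000 and $12,000 figures quoted above; the function and variable names are hypothetical, and a real analysis would include costs the comment deliberately leaves out.

```python
def net_annual_savings(labor_saved: float, added_bandwidth_cost: float) -> float:
    """Yearly savings after paying for the extra bandwidth (both in dollars)."""
    return labor_saved - added_bandwidth_cost


# Figures quoted in the comment: one entry-level help desk position
# versus the incremental yearly cost of redundant, dedicated bandwidth.
labor_saved = 36_000
added_bandwidth_cost = 12_000

savings = net_annual_savings(labor_saved, added_bandwidth_cost)
print(f"Net annual savings: ${savings:,.0f}")
```

Swapping in outsourced support or extra hardware costs for `labor_saved` is the "another set of numbers" the comment suggests running.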

It's appalling the number of IT managers who are not paying attention to this, all in the name of consolidated billing. I'm not kidding: in the past 2.5 years our suggestions of additional, dedicated data-only bandwidth from an alternative carrier have been rejected nearly 100% of the time with "We want all of our telecom on one bill."

nate March 13, 2010 at 9:04 am

Was thinking about this a bit more, and taking an average of 250 MB/second that my NAS does (peaks out at around 500 MB/s – disk limitation, not NAS), which is 2,000 Mbit/s, if you plugged that into some *really* cheap bandwidth at $20/mbit, that’s still $40,000/month in bandwidth costs, not even taking into account the cost of the storage on the other side. At my previous company we paid about $110/mbit for low traffic (sub 10 Mbps), on a top tier provider in a data center.
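The arithmetic behind that $40,000 figure can be checked with a short Python sketch. It assumes "MB" means megabytes, that bandwidth is billed per Mbit/s per month, and that the link is sized for the sustained rate; the function name is made up for illustration.

```python
def monthly_bandwidth_cost(throughput_mb_per_s: float, price_per_mbit: float) -> float:
    """Monthly cost in dollars of a link sized for a sustained throughput.

    throughput_mb_per_s: sustained rate in megabytes per second
    price_per_mbit: dollars per Mbit/s per month (assumed billing model)
    """
    mbit_per_s = throughput_mb_per_s * 8  # 1 byte = 8 bits
    return mbit_per_s * price_per_mbit


average_cost = monthly_bandwidth_cost(250, 20)  # average NAS load, cheap rate
peak_cost = monthly_bandwidth_cost(500, 20)     # peak NAS load, cheap rate
print(f"Average load: ${average_cost:,.0f}/month")
print(f"Peak load:    ${peak_cost:,.0f}/month")
```

The same function shows how quickly the bill grows at the $110/mbit rate mentioned for the previous company.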

Cost aside, the latency involved in doing such a thing over the WAN would cause the apps to grind to a halt in many cases.

Bandwidth is getting cheaper, but local storage is getting cheaper at a much faster rate.

I ranted a bit on this topic last year: marketing departments advertising more bandwidth than is actually available, leading to poor user experiences and unrealistic expectations –

Robert Ross March 13, 2010 at 11:29 am

Cheap local storage can be misleading. Given the broad scope of data management that comes with it (online, nearline, versioning and DR), a fully utilized enterprise can probably manage this cost-effectively, but a 50-seat law firm probably cannot. Dealing with a data explosion from eDiscovery and other litigation support requirements exceeds any reasonable IT budget for most of these outfits. A simple example, but the reality for SMEs is that no one is deleting anything. And they usually have competing file systems – mail, shares and maybe a nascent document management system. That being so, retraining behavior is disruptive and prone to resistance and failure. Acquiring local, scalable storage brings capital expense and additional overhead for expertise.

I do not mean to suggest remote storage could ever compete on raw performance, but 100mbps pipes are accessible at price points I believe are still below the TCO of local enterprise class storage.

. R

Robert Ross March 13, 2010 at 11:59 am

FWIW – If AT&T, Verizon, et al hadn’t effectively killed the 1996 Telecom Act, I think your bandwidth at the edge problem would largely be solved today. I did a lot of work with CLEC/CAPs, municipalities and utilities in the 90s on bandwidth solutions. The popular business model at the time was ~$125 per household for voice, data and video. I think that’s been achieved, and subsequently the paucity of bandwidth at the edge is a function of maxed revenue streams. There is little reason for big telecom to invest in edge capacity because they don’t have to; you are already spending as much as the industry believes you will tolerate. The argument about core upgrade costs is valid, but God’s own green acre of WAN capacity is still out there from the massive builds of the 90s; augmented by the deployment of WDM. I believe the costs would be negligible in a competitive environment.

Disclaimer: I’m a fan of fast and dumb WAN. Intelligence at the core for service differentiation inflates core costs, and though rate limiting can have beneficial applications, I believe telecom uses it to justify going forward cost models for subsidy, and will ultimately prioritize traffic for rate tiers. Not necessarily a good thing.

I think the FCC has the bit between their teeth this time. Maybe something will come of it.

Mark Preston March 15, 2010 at 9:06 am

First, I won’t dispute the ‘cost’ of bandwidth itself. I will say that calculating a business service cost of bandwidth should not be at the rails of the particular tier, in this case storage. Most services run at low average utilization and can have significant off-peak to peak daily/weekly/etc workload profiles. In addition tiered architectures can have or be designed to minimize traffic between tiers and therefore optimize (lower) the cost of bandwidth. Capacity planning can identify and mitigate the cost of bandwidth.

