One of the highlights of the Gestalt IT tour was a half day on Cisco’s UCS and associated products. But this was the real deal: an experienced and technical Cisco presenter going deep for a crowd of skeptical IT pros. Digging into the details would reveal the essential value proposition of UCS.
I’ll summarize what I heard first. This isn’t a transcription and there is interpretation and interpolation. If there is something wrong in the descriptions please comment so I can update. There are links to Cisco sources for more detail.
The UCS case
UCS is focused on reducing operating expenses, not capital expenses, since OPEX is the rising cost: it is proprietary software and support services that breed cost.
Customers are caught on a treadmill:
- Vendors "simplify" by adding layers and offering services to enable the simplification
- The result is a complex management stack and high costs
- Enterprises can't easily scale because of legacy systems, which drives application costs higher
- Management complexity drives vendor revenues
Sounds something like a StorageMojo critique. Bravo!
The UCS solution is to move to a private cloud powered by VCE – VMware, Cisco and EMC – and other vendors as appropriate: an application-centric unified fabric that ties network and compute resources together under centralized control.
The building blocks of the UCS are:
- UCS manager – the device manager
- UCS fabric interconnect – 20 & 40 port FCoE switches
- UCS fabric manager is the management tool for storage networking across all Cisco SAN and unified fabrics
- Fabric extenders connect the UCS blade chassis to the switch and simplify cabling, management and diagnostics
All these components are designed to work with both physical and virtual resources.
UCS components
The basic Cisco components of the UCS are:
- UCS manager: Cisco UCS Manager implements policy-based management of the server and network resources. Network, storage, and server administrators create service profiles, which allow the manager to configure the servers, adapters, and fabric extenders with the appropriate isolation, quality of service (QoS), and uplink connectivity. It also provides APIs for integration with existing data center systems management tools; an XML interface allows the system to be monitored or configured by upper-level systems management tools (a minimal sketch of this XML interface appears below).
- UCS fabric interconnect: networking and management for attached blades and chassis over 10 GigE and FCoE. All attached blades are part of a single management domain. Deployed in redundant pairs, the 20-port and 40-port models offer centralized management with Cisco UCS Manager software and virtual machine optimized services with support for VN-Link.
- Cisco Fabric Manager: manages storage networking across all Cisco SAN and unified fabrics with control of FC and FCoE. Offers unified discovery of all Cisco Data Center 3.0 devices as well as task automation and reporting. Enables IT to tune quality-of-service (QoS) levels, and provides performance monitoring, federated reporting, troubleshooting tools, and discovery and configuration automation.
- Fabric extenders: connect the fabric to the blade server enclosure over 10 Gigabit Ethernet, simplifying diagnostics, cabling, and management. The fabric extender is similar to a distributed line card and also manages the chassis environment (power supplies, fans and blades), so separate chassis management modules are not required. Each UCS chassis can support up to two fabric extenders for redundancy.
Here is a simplified figure of the components, courtesy of Cisco.
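Since the XML interface is what makes UCS Manager scriptable, here is a minimal sketch of driving it over HTTPS. It assumes the documented aaaLogin/configResolveDn methods and the /nuova endpoint; the address, credentials and distinguished name are placeholders, so verify the details against Cisco's XML API reference for your release.

```python
# Minimal sketch of the UCS Manager XML API over HTTPS.
# Method and attribute names (aaaLogin, configResolveDn, outCookie) follow
# Cisco's published XML API, but treat the specifics as assumptions and check
# them against the XML API reference for your UCS Manager release.
import xml.etree.ElementTree as ET
import requests  # third-party: pip install requests

UCSM_URL = "https://ucsm.example.com/nuova"   # placeholder UCS Manager address

def xml_call(body: str) -> ET.Element:
    """POST an XML request to UCS Manager and return the parsed response."""
    resp = requests.post(UCSM_URL, data=body, verify=False, timeout=30)
    resp.raise_for_status()
    return ET.fromstring(resp.text)

# 1. Authenticate and capture the session cookie from the response.
login = xml_call('<aaaLogin inName="admin" inPassword="password" />')
cookie = login.get("outCookie")

# 2. Resolve a managed object by its distinguished name, here a chassis.
chassis = xml_call(
    f'<configResolveDn cookie="{cookie}" dn="sys/chassis-1" inHierarchical="false" />'
)
print(ET.tostring(chassis, encoding="unicode"))

# 3. Close the session so the login is not left dangling.
xml_call(f'<aaaLogout inCookie="{cookie}" />')
```

The same pattern – authenticate, issue query or config methods, log out – is what upper-level management tools use to integrate with UCS Manager.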
Cisco value-add
The presentation noted several unique Cisco features:
- Memory extension. Cisco blade servers support up to 48 DIMMs due to a custom mux/demux chip they developed. Enables a 96 GB server using low-cost 2 GB DIMMs.
- Hypervisor bypass. Bypass the softswitch to go direct to the NIC using Single Root I/O Virtualization (SR-IOV) – which is part of the PCI spec, not a Cisco exclusive (see the first sketch after this list).
- Exceptional automation. Blades carry a lot of state – MAC addresses, BIOS settings, VLANs – and that state is captured in an XML file, so you configure a blade once and reapply the profile to any compatible blade. Settings can be applied either to a blade or to a particular slot, which removes much sysadmin drudgery (see the second sketch below).
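On the hypervisor-bypass point, SR-IOV is generic PCI plumbing rather than anything Cisco-specific, so the idea can be illustrated with the standard Linux mechanism. A minimal sketch, assuming a hypothetical interface name eth0 and a NIC and kernel that expose SR-IOV through sysfs (run as root):

```python
# Illustrative only, not Cisco-specific: carve SR-IOV virtual functions out of
# a physical NIC via the standard Linux sysfs interface. Each VF can then be
# passed through to a guest, bypassing the hypervisor's software switch.
from pathlib import Path

IFACE = "eth0"  # hypothetical interface name; adjust for your host
dev = Path(f"/sys/class/net/{IFACE}/device")

# How many virtual functions does the NIC advertise?
total_vfs = int((dev / "sriov_totalvfs").read_text())
print(f"{IFACE} supports up to {total_vfs} virtual functions")

# Enable up to 4 VFs (or fewer if the NIC supports fewer). Requires root and
# SR-IOV support in both the NIC firmware and the kernel driver.
(dev / "sriov_numvfs").write_text(str(min(4, total_vfs)))
```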
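And to make the configure-once idea concrete, here is a toy sketch of a service-profile-like XML document being applied to different slots. The schema and the apply_profile helper are invented for illustration; they are not the real UCS service-profile format or API.

```python
# Toy illustration of the "profile carries the identity" idea: server identity
# (MACs, VLANs, BIOS settings) lives in an XML profile that can be re-applied
# to any compatible blade or slot. This schema is invented for illustration
# and is NOT the actual UCS service-profile format.
import xml.etree.ElementTree as ET

profile = ET.Element("serviceProfile", name="web-tier")
ET.SubElement(profile, "mac", value="00:25:b5:00:00:0a")
ET.SubElement(profile, "vlan", id="120")
ET.SubElement(profile, "bios", setting="turbo", value="enabled")

def apply_profile(p: ET.Element, slot: int) -> None:
    """Stand-in for pushing the profile to a manager; here we just report it."""
    print(f"applying profile '{p.get('name')}' to slot {slot}")
    for item in p:
        print(f"  {item.tag}: {item.attrib}")

# The same profile follows the workload: apply it to slot 3 today, and to a
# replacement blade in slot 7 tomorrow without reconfiguring anything by hand.
apply_profile(profile, slot=3)
```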
VMware/Cisco/EMC: VCE
That was just the Cisco-owned UCS story. But VMware and storage are needed to create a virtualized infrastructure. Enter, in this instantiation, EMC and the Vblock. Cisco is also working with NetApp and probably others.
A Vblock is an engineered, tested, supported and validated package of components from the 3 vendors. You buy the package – which has some configuration flexibility – and you get a single support group, not finger pointing between 3 companies.
This is supposed to make for rapid implementation of new infrastructure along with the management advantages of UCS. Acadia, the new services company the 3 have put together, provides Build, Operate & Transfer (BOT – a new TLA?) services.
Sounds good. I can’t recall a similar level of advertised integration among 3 major vendors before.
The StorageMojo take
I’m just as mystified as ever about what they’re thinking.
The looming question is: do enough customers want to buy “unified” systems? This is, after all, only a distributed mainframe – and the mainframe spending percentage has been shrinking for decades.
Consolidation is inevitable: IT has standardized on a few platforms, and can standardize on a few suppliers. But do those suppliers need to be vertically integrated?
Only when it makes sense. Oracle/Sun has the better argument: when you know exactly what you want from your database, we’ll sell you an integrated appliance that will do exactly that. And it’s fine if you roll your own.
But those are industry-wide issues. There are UCS/VCE-specific issues as well:
- Cost. All the integration work among 3 different companies costs money. They aren’t replacing existing costs – they are adding costs. Without, in theory, charging more.
- Lock-in. UCS/Vblock is, effectively, a mainframe with a network backplane.
- Barriers to entry. Are there any? Cisco flagged hypervisor bypass and large memory support as unique value-add – and neither seems any more than a medium-term advantage.
- BOT? Build, Operate, Transfer. In theory Vblocks are easier and faster to install and manage. But customers are asking that Acadia BOT their new Vblocks. The customer benefit over current integrator practice? Lower BOT costs? Or?
- Price. The 3 most expensive IT vendors banding together?
- Longevity. Industry “partnerships” don’t have a good record of long-term success. Each of these companies has its own competitive stresses and financial imperatives, and while the stars may be aligned today, where will they be in 3 years? Unless Cisco is piloting an eventual takeover.
The enterprise IT industry is consolidating. HP, the world’s largest computer company, appears strong but is vulnerable – or at least John Chambers, Cisco’s CEO, thinks so.
Cisco, dominating network switches, needs new worlds to conquer. Large switches have been specialized blade servers – CPU and I/O – for decades, so why not take the next step?
But Cisco is responding not to customer demand, but to Google and Amazon. Their vast commodity infrastructures, linked by – horrors! – cheap unmanaged switches, are Cisco’s nightmare. If CFOs understood that much of IT could be migrated to that model over the next decade, Cisco’s margins and influence would be devastated.
Creative destruction, indeed!
Courteous comments welcome, of course.
Cost would be my primary concern. Wholly integrated systems should, in theory, initially perform well beyond fragmented systems, though given their integration certification requirements, I’d fear they could end up a generation or two behind.
Possibly VCE should have looked at the failings of ERPs, and how little changes can take the entire system down for hours or days. ERPs are in theory a great tool (apply accounting rules to all facets of a company), but as tweaks are made in one section, they break operations elsewhere.
As someone who’s experienced a chassis failure (not UCS), I’m reluctant to like them. Basic redundancy doesn’t cover failures, so I’ve grown to love the 3:1 option, where everything can run on 1, but there are at least 2 running and 1 on standby. [Having had chassis fail, I opted to switch back to 1U servers; losing one server vs. 10 is a lot more comforting.]
As much as we’d like to believe decisions are made on operating costs, most decisions are based on upfront costs plus maintenance costs; power and cooling is a small factor (unless you’re severely underweight prior to purchase). As these are the 3 platinums of equipment suppliers, I’d also be concerned about their ability to keep costs down for the purchaser, who might otherwise go with fractured systems.
My thoughts on the topic are too much for a comments box, but I have written about them here if you’re interested:
http://www.techopsguys.com/2010/02/27/cisco-ucs-networking-falls-short/ (Cisco responded to this one)
http://www.techopsguys.com/2009/08/17/fcoe-hype/
http://www.techopsguys.com/2009/11/03/the-new-ciscoemcvmware-alliance-the-vblock/
I rip into all of them. To summarize:
– I don’t like FCoE because it really isn’t converged, at least not yet, and I’m not convinced it will be in the future
– The vBlock is an overpriced, underpowered solution
– The networking stack in UCS is very weak, the technology has severe limitations, and their “memory extender” ASIC suffers from a couple of problems:
– it is limited to the Xeon 5500/5600 dual-socket systems, at least for the moment
– it comes at a fair added cost to the system, I understand
I like what the future holds for the Opteron 6100 platform myself; it really seems well designed and built for virtualization from a cost, performance and efficiency perspective.
http://www.techopsguys.com/2010/03/29/the-cougar-has-landed/
I think a 48-core system with 48 memory slots is a much better balance of resources; add to that no price premium for the 4-socket Opteron 6100 and you’ve got yourself a bad ass system.
Like most of Cisco’s products it falls far short. Though, at the same time, like most of their products they manage to generate some hype, perhaps enough to get the attention of some stupid CIOs out there to buy into their story. I’ve already talked with a service provider who is using UCS, and the cost structure they presented to us is 400%+ higher than our own for infrastructure alone (their hosting cost structure is about 800% higher).
It’s really been depressing to see so many people get their lips caught in Cisco’s fish hooks on their UCS platform, but oh well, I guess that just makes the services I provide to an organization just that much more valuable.
I don’t think technology stacks are all that great myself. I am not at all impressed with the vBlock, or the new HP stack, and it sounds like HDS is coming with a stack of their own too. I think it can make some sense in the small business world where you may have a few servers and some storage, but not in the cloud, not in the service provider, not in the enterprise space.
I have my own idea on a bad ass cloud computing platform, maybe one of these days I’ll have the time to write about it and diagram it out. Or not, don’t want to give away too many secrets.
You touched on the Oracle/Sun product line.
What are your thoughts on UCS, HDS’s Unified Compute Platform, and HP’s CIA compared to Oracle/Sun?
What about IBM? Are they in at all?
I have to agree that the Vblock is not impressive. So do a number of VMware employees, if you read the other prominent VM bloggers. I think the argument for the Vblock is that it is as close to a guaranteed configuration as you’ll find out there.
As for a standalone UCS solution, my shop is about to purchase one. We are a rackmount shop that wants to move to blades, so we did the math, features, reference checks, etc. on all the major players and felt that UCS fits our needs best. Forget all the mumbo jumbo and marketing hype. It can be simplified down to: a blade that connects to a top-of-rack switch instead of a blade that connects to a chassis-mounted switch.
All the vendors have proprietary technologies in their blade offerings so why be afraid because it’s Cisco and not HP or IBM?
I think the message delivered at your conference was the wrong one. Instead of focusing on vBlock, Cisco should have just focused on the merits of UCS by itself.
BTW, we are not doing FCoE here. We’ll break out into traditional 10 GbE and FC at the interconnect level when we implement.
Robin:
Thanks for coming to visit us in Boxborough and I appreciate the write-up on both UCS and VCE. If I could, I’d like to clarify a couple of things. Foremost, UCS can serve as infrastructure for all your application workloads. They do not need to be deployed in a private cloud, or even virtualized. In the vast majority of instances, if you are running a workload on an x86 server, you can move it to UCS and gain one or more of the cost and functional benefits we discussed. Second, UCS is available from Cisco and through a number of partners – it does not need to be purchased as part of a Vblock deployment.
Regards,
Omar Sultan
Cisco
John,
IBM GS sells everything to anybody. No worries there. HDS is an OEM vendor, so if they have something that will plug into HP, IBM and anyone else’s blade servers or pizza boxes, that is a win.
Ultimately, the model of a commodity hardware layer with a flexible software layer tying it together – which is what HP is doing with LeftHand, IBRIX and Polyserve – seems like it will take most of the market.
This isn’t a winner-take-all situation. Some people will choose integrated infrastructures, but as the roll-your-own folks get more of the features – the barriers to entry don’t seem that high – expect to see more people buy on price and flexibility.
Robin
I have worked with the Cisco office in my country on UCS market development, and from what I remember, they often pointed to FCoE as a very good selling point. After we had some workshops and presentations, I simply told them to switch over to some of the better things UCS offers and stop talking about FCoE.
FCoE is good as long as it stays inside a UCS system, but connecting a storage system with FCoE, forget it, no way. Simply, there is too much overhead in FCoE compared to standard FC.
But on the other hand, like I said, there are quite a few good things UCS offers:
– connectivity mezzanine cards – Palo, Menlo, Oplin – which combine FC and LAN connectivity inside one chip and offer hardware virtualization of NIC and FC adapters
– large memory support using proprietary mux/demux chips
– service profiles – very easy provisioning and management of UCS platform (similar to VMware Host Profiles)
Perhaps I’m just naive, but what’s the difference between an “FCoE” switch and an “Ethernet” switch? Do they also sell a special “iSCSI” switch?
Steve – it has to do with the types of frames it handles and such, and nobody I’m aware of has special iSCSI switches (or ever has had).
I was let down when I figured that one out. I rant about it in a link above from last year.
Robin,
Along with Omar, I want to thank you for your write-up and analysis. I would like to make a couple of corrections/clarifications on a few of your points about VCE/Vblock.
1 – Customers are not forced to purchase (or BOT) from Acadia. They are free to work with any partner/SI that has been certified to sell and support Vblocks. Acadia is just one potential source for engaging in a Vblock solution. In addition, customers that have the pieces of a Vblock already in place and wish to gain the single-support model can have their environment evaluated and certified as a Vblock through the VCE service organizations.
2 – You state that Vblocks were created as a response to Google/Amazon and not requested by customers. While it’s true that many customers are considering Google or Amazon for some computing needs, Vblock and the concept of Private Cloud were directly requested by customers who were not comfortable using those external services to operate their business needs. Security, lack of control, data loss, legal compliance, multi-tenancy leakage, lack of trust and organizational reasons have all been stated as reasons that many customers are opting to build Private Clouds today. And knowing that a shift to a Private Cloud model will require their IT Ops to work and be organized differently (because of the shifts in technology, primarily virtualization), they directly asked us to start building products (Vblock) that would align with where those organizational shifts needed to go. Vblock is the first step in creating products that better align with the changes being driven by the underlying technologies.
Regards,
Brian Gracely – Cisco – Sr.Manager, VCE Solution Architecture
Thanks, Nate. I had no idea FCoE wasn’t really Ethernet, or at least not compatible with my existing Ethernet gear. I was being a smartass about the iSCSI switch thing, since I was thinking the FCoE switches were just regular Ethernet switches with different firmware, but I guess not.
Steve,
FCoE requires a forklift upgrade of the entire Ethernet infrastructure: HBAs, switches, core – everything except the routers (over which FCoE will not work). “Data Center Ethernet” is what Cisco calls it; you’ll find it called “Data Center Bridging” by others. It’s reliable Ethernet with QoS and other features. It needs to be reliable, because Fibre Channel requires it. The QoS features are required for latency control, as FC won’t work well without it. You’ll also find some RDMA features in it, to make the IB folks happy.
Joe Kraska
San Diego CA
USA
Hi,
Does anyone have any idea of the security concerns with UCS? My shop is about to buy UCS for ERP.
Memory extension. Cisco blade servers support up to 48 DIMMs due to a custom mux/demux chip they developed. Enables a 96 GB server using low-cost 2 GB DIMMs.
Incorrect. We do not support this configuration. We use an 8GB DIMM kit that consists of two 4GB DIMMs, or a 16GB kit that consists of two 8GB DIMMs. So to get to 96GB you would use 12x 8GB kits.
I do not think implementing FCoE is disruptive or would require a forklift upgrade of the existing infrastructure. I think it is a mindset issue, since the Unified Fabric infrastructure can co-exist with the existing Ethernet network seamlessly. About the need for QoS: there is nothing you need to explicitly configure to ensure FC traffic does not get dropped, since the FCoE implementation natively ensures that FC traffic has dedicated bandwidth and priority on the Ethernet transport and hence never gets dropped.
-Sunil
Hi, great comments above, and congratulations to you all on sharing your views. I have been involved in a huge deployment of shared NetApp 10 GigE storage systems, around 18 petabytes, at a major customer here in Australia. We have dedicated backend storage connected via Cisco 10 GigE switching infrastructure, connected on the host side to VMware host farms and numerous other standalone hosts, mainly Sun. Our infrastructure connects via CIFS, NFS and iSCSI, and we run Oracle and MS SQL databases, which work fine on 10 GigE Ethernet, so I am confused as to why people would even consider FCoE. It takes me back to the bad old days of emulating block channels to enable moving to ESCON channels on mainframes (the overhead of emulation was horrific) – yes, I have been around the industry for a very long time, since 1974. It’s good to see that other vendors like Cisco, EMC and VMware have seen the value in a consolidated, end-to-end approach to virtualization, considering that NetApp has had storage virtualized for the past 5 years and integrates very well with Cisco and VMware. The above-mentioned infrastructure allows for new storage activations in less than 15 minutes.
If Cisco UCS is using the ASIC-on-motherboard approach to allow full memory loading at higher-than-800MHz speed, what are the competitors doing?
How is HP able to offer 1TB on a server – is that at 1333 MHz, 1066 MHz or 800 MHz?
Cisco UCS is able to offer 1066MHz at full memory loading using the “Catalina ASIC”, while the CS M2 supposedly can support full memory loading at 1333 MHz.
HP and others can use Netlist’s HyperCloud memory modules, but we haven’t heard from HP and others on qualification for NLST HyperCloud yet.
You can follow some of the conversation on this topic on the NLST yahoo board:
http://messages.finance.yahoo.com/Stocks_%28A_to_Z%29/Stocks_N/threadview?m=te&bn=51443&tid=23494&mid=23510&tof=1&frt=2#23510
Re: Cisco’s UCS
I work in Singapore for a big client. I heard lots of talk about the UCS hardware rocking the entire market, but it has not been adopted by many clients.
The product was designed and targeted at the HP c7000 enclosure market.
We use the HP c7000 enclosure, and most of its high-availability features, including FCoE integration, were in place three years back, prior to the launch of the Cisco UCS product.
I am not sure where to start. UCS is not number 2 for blade server technology. Cisco has a development team in-house with Intel in Beaverton, Oregon. Their blades are comparable in price to the equivalent IBM, Dell, or HP servers. Their fabric interconnect to the UCS frame uses FCoE, but you are not forced into using it. Your return on investment, in power and management alone, is at most 18 months.