This article can also be found in the Premium Editorial Download "Network Evolution: Data center fabric wars."
When it became clear that virtualization and the cloud would strain the data center network—with new east-west traffic patterns, extreme application workloads and the need for flexibility and convergence—all of the major vendors began battling to prove they had the best solution for the problem. For each of them, that solution was a complex and costly data center fabric that promised flat, non-blocking, end-to-end transport between any node on the network. And then software-defined networking (SDN) came and shook things up.
Suddenly, proponents of SDN promised that the control plane of the network would be decoupled and centralized, making it possible to build a network of dumb devices that could be granularly controlled down to the individual traffic stream. This technology could be used to spin up virtual network instances on demand and treat compute, storage and networking merely as pools of flexible resources. With that kind of manageability and flexibility, who needed cumbersome traditional networking architectures? In fact, who needed data center fabrics?
In truth, neither data-center network fabrics nor SDN models are completely proven, so in which option (if any) should users invest?
With SDN, no need for ‘rococo’ data center network fabrics
Once software-defined networking takes hold, there may be a need for Ethernet fabric, but it will be basic and exist only to serve the layers of SDN abstraction. By Brad Casemore
Let’s assume, for the sake of argument, that software-defined networking (SDN) fulfills its considerable promise and stakes claims not only to the data centers of large cloud-service providers, but also to those of large enterprises. Would there be any need for the Ethernet fabrics peddled by Cisco, Brocade, Juniper, and other established networking vendors?
There will still be a role for fabrics, but not for the relatively rococo fabrics the major vendors are positioning and selling today.
In certain key respects, SDN and network fabrics have much in common. Both emerged as potential solutions to the challenges posed by rampant data center virtualization and cloud computing, which brought new east-west traffic patterns and application workloads that strain the limits of longstanding architectures and technologies.
However, that’s where the similarities end and the differences begin. The fabric offerings of leading networking vendors, typified by standard and non-standard approaches to Ethernet multipathing, represent a linear progression in networking’s evolution. The networks are being flattened to varying degrees, and the shackles of Spanning Tree Protocol (STP) are being cast aside, but it’s still very much business as usual for the vendors and their customers. The latter are continuing to see their vendors of choice present network infrastructure composed of distributed control planes, vertically integrated switches, and proprietary extensions to industry-standard protocols and technologies.
SDN takes a different tack. The purpose of SDN is to enable network virtualization and network programmability through the decoupling of the network control plane from the data-forwarding plane. In SDN, the network control and intelligence gets pushed up and out to server-based software (a new realm for vendors’ proprietary value creation) and away from the underlying network hardware, which ideally will be fast, reliable, cheap, and relatively dumb.
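To make that split concrete, here is a minimal Python sketch of the idea. It is not an OpenFlow implementation: the `Switch` and `Controller` classes, their method names, and the address and port values are all hypothetical, invented purely for illustration. The switch does nothing but look up its table, while every decision lives in the server-side controller.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """A 'dumb' forwarding device (hypothetical): it only consults its table."""
    name: str
    flow_table: dict = field(default_factory=dict)  # destination -> output port

    def forward(self, dst):
        # Data plane: a pure table lookup with no local decision-making.
        return self.flow_table.get(dst)  # None means no matching flow


class Controller:
    """Hypothetical central control plane holding the network-wide view."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def install_flow(self, switch_name, dst, out_port):
        # Control plane: decide centrally, then program the device's table.
        self.switches[switch_name].flow_table[dst] = out_port


controller = Controller()
tor1 = Switch("tor1")
controller.register(tor1)
controller.install_flow("tor1", "10.0.0.5", 3)

print(tor1.forward("10.0.0.5"))  # 3
print(tor1.forward("10.0.0.9"))  # None
```

In this toy model the hardware could be swapped for any device that honors the same table, which is the property that lets it be fast, cheap, and relatively dumb.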
The data centers that adopt and implement SDN will not need the proprietary fabric networks proffered by the major vendors, which for the most part are architecturally and philosophically at odds with the Open Networking Foundation’s (ONF’s) vision of virtualized infrastructure (compute, storage, networking) managed as integrated resource pools in thrall to application workloads.
But—and here’s where things get interesting—they’ll still need fabrics. Given the nature of SDN, these emerging fabrics are likely to be relatively simple. They’ll be akin to physical networks that mirror the any-to-any connectivity of a chassis backplane. Eric Hanselman of the 451 Group says a network fabric is “a technology for extending chassislike functionality across multiple physical systems.” Likewise, Nicira CTO Martin Casado has written on his blog that “a fabric is a physical network which doesn’t constrain workload placement....a physical network that operates much as a backplane does within a network chassis.”
Both definitions seem apt for what SDN requires. In the long run, a base network fabric will emerge to serve the higher layers of SDN abstraction, but today’s proprietary fabrics are unlikely to fit the bill.
Now, the question is, who will provide this fabric? With value and margins gravitating from the physical network to server-based controllers and the applications that run on them, the major networking vendors will be disinclined to accept the role of subservient plumbers. They’re used to richer margins and more control (pardon the pun) over their destinies. They will accept this new role grudgingly, if at all.
Meanwhile, we’ll see other vendors fill the void with relatively cheap hardware that meets the needs of the SDN community. Already we can see the ONF launching a concerted bid to persuade purveyors of merchant silicon (Broadcom, Marvell, Intel’s Fulcrum) to deliver OpenFlow support in their chips. At the same time, original design manufacturers (ODMs) have begun to deliver bare-bones switches to customers at cloud-service providers and other major data centers. They’re following a trail blazed by server ODMs that similarly provided cheap and cheerful compute hardware for service-provider clientele looking to virtualized resource pools as a means of adapting and harnessing cloud computing.
It will take some time, but it’s not likely that complex fabrics from proprietary vendors will win out.
Data center fabrics vs. SDN? Neither works
Without doubt, there is a need for change in the data center, but network virtualization is a better answer than complex fabrics and SDN. By Ivan Pepelnjak
As with every new technology, proponents of OpenFlow and software-defined networking (SDN) keep promising to solve all the world’s problems. There are very valid use cases for OpenFlow and SDN, but data center fabric, defined as the networking gear providing end-to-end transport within a data center environment, is not necessarily one of them.
It’s important to first agree on the definition of SDN. While OpenFlow is well defined (although one has to wonder how open it is), SDN by itself is a meaningless term. After all, software has defined networking ever since IBM announced the 3705 Communications Controller in 1972. In fact, software has always controlled networks and large networks have always been at least partly monitored and configured by additional software. However, for the purpose of this discussion, we use the definition of SDN the Open Networking Foundation (ONF) proposed: an architecture where network control is decoupled from forwarding.
A number of similar architectures are already widely used in data center networks, and they’re not that successful. Cisco’s Virtual Switching System, Juniper’s Virtual Chassis and QFabric Network Node, and HP’s IRF all use central control-plane software that programs the forwarding tables in numerous switches. Yet all these architectures have a common trait: they don’t scale. The most that networking vendors have managed to do thus far is to control up to 10 top-of-rack switches or eight core switches with a single control plane. NEC has managed to do slightly better with its ProgrammableFlow controllers. Also, if you take a closer look at Google’s infamous G-scale network (the best example of production-grade OpenFlow we’ve seen so far), you’ll see a similar picture. Google uses OpenFlow to control a small number of devices that are physically close to the OpenFlow controller. Managing a dynamic environment with very fast feedback loops and a high rate of change from a central point simply doesn’t work.
Don’t get me wrong: I’m not saying today’s data center networks are perfect. In fact, they are riddled with scalability challenges, most of them stemming from the simplistic implementation of virtual networking by most hypervisor vendors, which use VLANs to create virtual networks. But instead of fixing the root cause of the problem by moving the complexity to the hypervisor, the networking industry keeps proposing a series of kludges, with SDN/OpenFlow being just one of them. Eventually they might manage to create flying pigs (and charge you extra for the jetpacks), but the hidden complexity of these solutions would be comparable to the complexity of the traditional voice networks. We only got to large-scale, low-cost voice when we stopped using voice switches and moved to peer-to-peer VoIP solutions like Skype.
So what might work?
Imagine a scenario (that large cloud providers like Amazon and Rackspace already use) where virtual networks are implemented with a MAC-over-IP solution (be it VXLAN, NVGRE or Nicira’s STT) or an IP-over-IP solution, and storage engineers use an IP-based solution (be it iSCSI, NFS or SMB 3.0). In this case, all the complexity is moved to the hypervisors, and it makes perfect sense to control that environment with OpenFlow because the controllers would be dealing with a very large number of independent uncoupled devices. This is what Nicira is doing with its Network Virtualization Platform. The data center network would have to provide just two services: end-to-end IP transport and lossless transport of specific traffic classes.
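As a rough illustration of what MAC-over-IP means in practice, the Python sketch below wraps a tenant’s Ethernet frame in a VXLAN header, using the 8-byte layout from the VXLAN specification (a flags byte with the VNI-valid bit, 24 reserved bits, a 24-bit virtual network identifier, and 8 reserved bits). The function names and the sample frame are assumptions made up for this example, and the outer UDP/IP headers that a hypervisor would add are deliberately left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid ID


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit virtual network ID."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni, inner_frame):
    """Prefix a tenant Ethernet frame with the VXLAN header.

    The result would be carried as the payload of an ordinary UDP/IP
    packet between hypervisors; those outer headers are omitted here.
    """
    return vxlan_header(vni) + inner_frame


def decapsulate(payload):
    """Return (vni, inner_frame) from a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]


frame = b"\x00" * 14 + b"tenant payload"  # placeholder inner Ethernet frame
packet = encapsulate(5001, frame)
vni, inner = decapsulate(packet)
print(vni, inner == frame)  # 5001 True
```

The physical switches in between see only IP packets exchanged by hypervisors, so adding a virtual network means picking a new identifier at the edge rather than touching switch configurations.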
In a well-designed data center network that provides pure IP transport, you no longer need to change the switch configurations every time you add a new virtual network or a new server. The only time you have to change the network configuration is when you add new capacity, and even then the existing tools some data-center switch vendors offer allow you to automate the process.
To summarize: Once we get rid of the VLAN stupidities, and move the virtual networking problem to where it belongs (the hypervisors), we no longer need complex data center fabrics and complex tools to manage their complexity. Existing large-scale IP networks work just fine and won’t benefit much from an SDN-like centralized control plane. On the other hand, having a decent provisioning tool for a large-scale IP network would be a huge benefit. We thus don’t need SDN/OpenFlow in data center fabrics; what we need is a Puppet/Chef-like tool to build and deploy them efficiently.
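What such a provisioning tool does can be sketched in a few lines of Python. This is only an illustration of the declarative approach that Puppet and Chef take, not their actual APIs: the function names, the inventory layout, and the configuration syntax are all invented here and do not correspond to any vendor CLI. The desired state is rendered from an inventory, compared with what is deployed, and only the switches that differ are flagged for change.

```python
def render_switch_config(name, loopback, uplinks):
    """Render one switch's config from declarative inputs.

    The output syntax is invented for illustration, not a vendor CLI.
    """
    lines = [f"hostname {name}", f"loopback {loopback}"]
    for port, peer in sorted(uplinks.items()):
        lines.append(f"interface {port} routed peer {peer}")
    return "\n".join(lines)


def plan_changes(desired, deployed):
    """Return names of switches whose rendered config differs from the deployed one."""
    return sorted(
        name for name, config in desired.items() if deployed.get(name) != config
    )


# Hypothetical inventory: two leaf switches, each with two routed uplinks.
inventory = {
    "leaf1": {"loopback": "10.255.0.1", "uplinks": {"eth1": "spine1", "eth2": "spine2"}},
    "leaf2": {"loopback": "10.255.0.2", "uplinks": {"eth1": "spine1", "eth2": "spine2"}},
}

desired = {name: render_switch_config(name, **spec) for name, spec in inventory.items()}
deployed = {"leaf1": desired["leaf1"]}  # leaf2 is new capacity, not yet configured

print(plan_changes(desired, deployed))  # ['leaf2']
```

Running the plan a second time after the new switch is configured yields an empty list, which is the idempotent behavior that makes this style of tool safe to run repeatedly.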
About the Authors
Brad Casemore provides consulting and advisory services to technology companies. He helps identify and pursue market opportunities in fast-growing and competitive areas. He also helps companies identify and build strategic business partnerships that support their objectives.
Ivan Pepelnjak, CCIE No. 1354, is a 25-year veteran of the networking industry. He has more than 10 years of experience in designing, installing, troubleshooting and operating large service provider and enterprise WAN and LAN networks and is chief technology advisor at NIL Data Communications, focusing on advanced IP-based networks and Web technologies.
This was first published in August 2012