My pool is a cruel mistress. Oh sure, to the public it appears to be sparkling and warm; nestled into a mature backyard landscape, it seems like an oasis on a trouble-free summer weekend. But in private, it's a tangled mix of crumbling 30-year-old infrastructure along with last year's fresh, bleeding-edge upgrades.
In theory, it should be getting better over time as replacement parts are more robust and sometimes higher tech. In reality, however, the pool is no more reliable than it was years ago. In fact, troubleshooting is more complex than ever with equipment, technologies and brands of different vintages all tied together with generic 2" schedule 40 PVC.
I recently spent an hour sweating and bent over the plumbing, fiddling once again with the suction manifold as I tried to isolate why the pump refused to prime. After much cursing, I finally sat, waiting patiently for water to flow. I imagined the water trying to pass through the maze of fittings -- flowing here, blocked there, re-routed somewhere else. Perhaps it was the hot Texas sun, or maybe I stood up too quickly, but I started imagining packets trying to flow through PVC pipes. Before long, I realized the pool was a great model for the challenges of troubleshooting SDN.
While the dream of an all-programmable network is indeed as enchanting as a summer breeze, like everything in life, the reality is more complex. Once we begin implementing SDN, we're going to end up with a collection of related parts: legacy technology, plus new controllers, plus yesterday's first-generation products that were once bleeding edge. Overarching all of this equipment is a new problem: unwinding dynamic, fleeting configurations -- the product of automation. When something goes wrong and we have to troubleshoot, we'll have to examine these configurations after the fact.
SDN troubleshooting means searching through automation's maze
Like the PVC on the pump pad, today's network configuration technology is the plumbing behind our infrastructure. Our current device configurations are unions, slip joints, T-connectors, reducers, etc. They're more or less fixed and designed to move traffic in a specific way, from point to point. SDN's rule-based policies are like valves with powered controllers. They enable automation-driven control on top of the fixed plumbing beneath. In combination with other controller-adapted gear, they have become truly autonomous.
The automation controller tries to make them all work together in harmony, but when that doesn't happen, troubleshooting involves considerable guesswork about the effective configuration that was in place at the time of failure. Worse, by the time an issue is even noted, ongoing configuration changes may have altered the picture, making the root cause hard to find.
When configurations get replaced by policy and automation
With SDN, one fundamental change above all others will drive this complexity: Configurations as we know them are going away. They're being replaced by policies. On devices themselves, they're either collapsing into a single layer as in Cisco's Application Centric Infrastructure model, or they will be created and managed by a new service layer on top, as is the case with VMware's NSX. Either way, we can't just SSH in and untangle a configuration.
Therein lies the challenge: Configurations are relatively stable. When we replace fixed configurations with policies empowered for autonomous change, it's like adding programmable valves to the system that will be activated outside the control of traditional static configurations. How will we untangle the effect when a configuration changes from minute to minute outside of our traditional control? When we're untangling a ticket from 48 hours ago, how will we re-assemble a snapshot of all the policies active at the time of the issue?
Magic packet time machine
With traditional networking, it's easy-ish to recreate an effective configuration. You back up your configurations every night and configure SNMP traps so your network configuration management solution automatically pulls backups after local changes. Looking at the configuration backups from any given day and time tells you exactly what all the rules were at that time.
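As a hypothetical sketch of why this works: with nightly (or trap-triggered) backups, answering "what were the rules at time T?" is just a matter of finding the latest snapshot taken at or before T. The function and sample data below are illustrative, not from any particular NCM product.

```python
# Illustrative sketch: recovering the effective configuration at a point
# in time from a sorted list of timestamped backups.
from bisect import bisect_right
from datetime import datetime

def config_at(backups, when):
    """backups: list of (timestamp, config_text), sorted by timestamp.
    Returns the config effective at `when`, or None if none existed yet."""
    times = [t for t, _ in backups]
    i = bisect_right(times, when)          # count of backups taken at or before `when`
    return backups[i - 1][1] if i else None

# Hypothetical nightly backups with one mid-cycle change captured.
backups = [
    (datetime(2015, 6, 1, 2, 0), "acl 10 permit 10.0.0.0/8"),
    (datetime(2015, 6, 2, 2, 0), "acl 10 permit 10.1.0.0/16"),
]
print(config_at(backups, datetime(2015, 6, 1, 14, 30)))
```

With stable configurations and one backup per change, this lookup is the whole story; the rest of the article is about why that stops being true.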
But with SDN, it's not so easy. With multiple layers of configurations, access and QoS policies, and automated network service delivery optimization, the change volume will be substantially higher. Backtracking a hypothetical packet's traversal from 1:41 a.m. last Friday will require the ability to recreate the computed configuration effects of all related elements.
What's needed is essentially a configuration time machine with a scroll bar you can scrub back and forth to not only recreate a snapshot, but watch the dynamic changes that produced that unique moment of configuration. With such a time machine, my guess is that we'd find many problem resolutions in the dynamics of temporarily colliding policies, not in preset, expected configuration states.
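One hedged sketch of how such a time machine might work: if every policy change were logged as a timestamped event, the set of policies in effect at any instant could be rebuilt by replaying the log up to that timestamp. Everything below -- the event format, policy names and contents -- is invented for illustration.

```python
# Illustrative "configuration time machine": replay a policy change log
# up to a chosen instant to reconstruct the effective policy set.
from datetime import datetime

def policies_at(events, when):
    """events: list of (timestamp, action, policy_id, policy) tuples,
    sorted by timestamp; action is 'add' or 'remove'.
    Returns a dict of the policies in effect at `when`."""
    state = {}
    for ts, action, pid, policy in events:
        if ts > when:
            break                      # scrub bar stops here
        if action == "add":
            state[pid] = policy
        else:
            state.pop(pid, None)
    return state

# Hypothetical controller event log from early Friday morning.
events = [
    (datetime(2015, 6, 5, 1, 30), "add", "qos-web", "prioritize tcp/443"),
    (datetime(2015, 6, 5, 1, 40), "add", "acl-db", "deny 10.2.0.0/16"),
    (datetime(2015, 6, 5, 1, 45), "remove", "acl-db", None),
]
# Scrub to 1:41 a.m.: the short-lived ACL was still in effect then,
# though a snapshot taken at 1:45 would never show it.
print(policies_at(events, datetime(2015, 6, 5, 1, 41)))
```

The point of the sketch is the scrub bar: the interesting answer at 1:41 a.m. differs from the state minutes later, which is exactly the kind of transient collision a static backup would miss.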
As for debugging the pool, it turned out I only needed to sit a spell -- somewhat defeated -- on my overturned, faded Homer bucket before hearing the familiar sound of a nearly overheated pump finally gulping on cool water from the manifold. Was it replacing the impeller, lubing the spider-valve or just patience that got the flow going again? Without a time machine, it's hard to tell.
About the author:
Patrick Hubbard is a head geek and senior technical-product marketing manager at SolarWinds. With 20 years of technical expertise and IT customer perspective, his networking management experience includes work with campus, data center, storage networks, VoIP and virtualization, with a focus on application and service delivery in both Fortune 500 companies and startups in the high tech, transportation, financial services and telecom industries. He can be reached at Patrick.Hubbard@solarwinds.com.