In a recent Packet Pushers Heavy Networking episode, Ethan and Greg discussed how difficult SD-WAN is, and why you shouldn’t outsource your SD-WAN to an MSP. So, how difficult is SD-WAN, really?

Now, this is of course going to depend on your organization’s level of skill, as well as what vendor you go with, but there are still some conclusions that we can come to.

Most SD-WAN solutions are operated by cloud-hosted SDN controllers, where the vendor has set up the virtual machines running the software for you. This greatly simplifies a lot of things that have been painful in the past. From a Cisco perspective, this is some of the pain that has been taken off your hands:

  • Controllers – Controllers are installed for you and backed up by Cisco
  • Software – Software is managed centrally, so you don’t need to log in to each device to update it
  • Traffic engineering – You can modify routing behavior without being an expert in, say, BGP
  • Certificates – Only devices with a valid certificate can join the overlay, and you don’t need your own Public Key Infrastructure (PKI)
  • Pre-Shared Keys (PSK) – Keys used for IPsec are rotated automatically without manual intervention

This means that it’s fairly simple to get started with SD-WAN. That unfortunately means that many organizations might go “SD-WAN is simple, let’s just get started”. This can lead to a suboptimal design where a lot of the technical debt gets inherited into the new design. Not a place you want to be in. Being an expert in SD-WAN, and a Network Architect, let me explain why you should get someone with expertise involved in your project:

Business outcomes – A design should always be driven by delivering business outcomes. You may or may not have someone with the competence, and time, to front C-level people and stakeholders, analyze where the business is going, and then translate that back to technology. Is the company going through a digital transformation? What are the cloud initiatives? Is the company growing organically or by acquisitions? Are there initiatives internally to improve user experience? Being an Architect is very different from being an Engineer. You may need some assistance in guiding your journey forward.

Design and guidance – How many routers should we have at each site? How many transports? Do we need MPLS or only internet? What are others doing? How do we connect to the cloud? Where do we put our security perimeter? How do we protect clients when every site has an internet transport? How do we do load sharing? How many VPNs do we need? How do we connect to the LAN? Using a routing protocol or a First Hop Routing Protocol (FHRP)? Even with SD-WAN, there are still a lot of design discussions to be had, and choices to be made. Unless you know the solution you are choosing, it’s likely you will neglect important factors or make choices based on the wrong information. Someone with experience should be involved in the design process to limit the number of poor choices made in the design phase.

Real world experience – When reading the marketing, everything looks simple: just send a router to site, run Zero Touch Provisioning (ZTP) and be done with it. What is required to run ZTP though? There are things that you need to be aware of. There are conditions to be met, such as having an internet transport, and that transport needs to get its IP address via DHCP. Trying to get a public IP via DHCP from a service provider (SP) is no easy task, let me tell you… Imagine having planned for an implementation, sending out routers to all branches, which could be thousands, only to realize that your design does not support ZTP. A person with experience will warn you in time. This person will also know what works, what doesn’t work, what bugs to avoid, and what the best software to run is.
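The two ZTP conditions mentioned above can be checked against a site inventory before any routers are shipped. The following is only a minimal sketch: the `Transport` and `Site` classes, the field names, and the `ztp_blockers` function are hypothetical names made up for illustration, not part of any vendor's tooling, and a real ZTP process is likely to have more prerequisites than these two.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transport:
    kind: str   # illustrative label, e.g. "internet" or "mpls"
    dhcp: bool  # True if the WAN interface gets its address via DHCP


@dataclass
class Site:
    name: str
    transports: List[Transport]


def ztp_blockers(site: Site) -> List[str]:
    """Return the reasons a site fails the two ZTP conditions from the text."""
    internet = [t for t in site.transports if t.kind == "internet"]
    if not internet:
        return ["no internet transport"]
    if not any(t.dhcp for t in internet):
        return ["internet transport has no DHCP-assigned address"]
    return []


# Made-up sample inventory, for illustration only.
sites = [
    Site("branch-1", [Transport("internet", dhcp=True)]),
    Site("branch-2", [Transport("mpls", dhcp=False)]),
    Site("branch-3", [Transport("internet", dhcp=False)]),
]

for site in sites:
    problems = ztp_blockers(site)
    print(site.name, "OK" if not problems else "; ".join(problems))
```

Running a check like this over the full branch list during the design phase is one way to find out that a site type cannot use ZTP before the hardware is in the mail.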

Best practices – How low should you go with those timers for convergence? What’s the best way of reaching your devices over their management IP? What’s the best way of designing your policies? Some of this may vary but much of it looks the same for each organization. There are timers that work well depending on use case and you can save a lot of time if you go straight for the “best practices” instead of trying to find these values yourself.

Getting rid of technical debt – When designing a new solution, what we often call greenfield, you have a chance of getting rid of some technical debt. This is often a “one moment in time” thing where if you miss the opportunity, it’s gone. The efficiency of your SD-WAN design depends on how many “one-offs” and snowflake site types you have. A good Architect can help you get rid of some of that technical debt, so that you minimize the number of different site types, and the number of creative workarounds out there. This will improve both workflow and operational efficiency.
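One rough way to see how many “one-offs” a design carries is to count how many sites share each site type. The sketch below assumes you already have a mapping from site name to site type. The `site_type_report` function, the site names, and the type labels are all invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def site_type_report(site_types: Dict[str, str]) -> Tuple[Counter, List[str]]:
    """Count sites per site type and list sites whose type is used only once."""
    counts = Counter(site_types.values())
    one_offs = sorted(
        site for site, site_type in site_types.items() if counts[site_type] == 1
    )
    return counts, one_offs


# Made-up sample data: site name -> site type label.
inventory = {
    "stockholm": "dual-router-mpls-inet",
    "gothenburg": "dual-router-mpls-inet",
    "malmo": "single-router-inet",
    "uppsala": "single-router-inet",
    "kiruna": "single-router-lte-special",
}

counts, one_offs = site_type_report(inventory)
print(len(counts), "site types;", "one-offs:", one_offs)
```

Every site in the one-off list is a candidate for either being folded into a standard site type or being documented as a deliberate exception.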

Now, you may just think I’m telling you all this because I’m working for a systems integrator, and that’s perfectly fine. I am just highlighting that while running SD-WAN is a lot easier than a traditional WAN, say based on DMVPN or something else, there are still a lot of considerations. Either you walk into the minefield yourself, or you get help avoiding the mines. It’s as simple as that.

Ethan and Greg’s discussion was more around the operational aspect of SD-WAN though. So how difficult is it to operate SD-WAN?

Not very, when things are working… Which is the case for most infrastructure that is working as intended. When things are working fine, you can get away with someone who is, say, CCNA-level in their knowledge. Where you need more expertise is when doing a redesign, adding components to the design, or performing more advanced troubleshooting. Why is my traffic not going to the preferred DC? Why is my local breakout not working? Why are my apps not being classified properly? I can’t get the tunnel to Zscaler to work. My policy is not having any effect, and I don’t know why. Why are my routes from the LAN not visible at other sites? It seems I just created a routing loop…

SD-WAN solutions use routing protocols, they use IPsec, they use some form of policies and they use the usual suspects such as Ethernet, ARP, IP, TCP, and UDP. Even with all the abstractions in place, you still need competency in these protocols to understand when something breaks or to be able to do a proper design. In fact, sometimes you even need more knowledge because SD-WAN is all about catering the network to your apps. You can get away with having mostly junior staff, but you also need someone more knowledgeable, either on staff, or as a consultant. Using TAC as your only partner for design and questions rarely works well, and is not how they are intended to be used.

So, how difficult is SD-WAN?

It depends… mic drop


2 thoughts on “How Difficult is SD-WAN?”

  • November 11, 2019 at 10:42 pm

    Precisely, Dan. To my understanding, in-depth knowledge of legacy routing protocols is much needed for SD-WAN, especially when we need to migrate a legacy WAN to SD-WAN.
    With only minimal knowledge of how the organization’s WAN is laid out today in terms of its architecture and functionality, and of where the business wants to see its WAN tomorrow in terms of scalability, security, cloud infrastructure etc., an SD-WAN implementation may become more of a pain than a smooth solution.

    • December 30, 2019 at 10:50 pm

      In my opinion, most SD-WAN implementations will only have basic routing between the Edge and the Campus switches (Edge sending a default route to the switch and Edge receiving a summary route from the campus). Redistribution still needs to be managed properly with prefix lists and route tagging. I think you’re correct that you still need basic to intermediate knowledge of BGP and OSPF, but we won’t be deploying a hugely complicated IGP across the core of the WAN, which is where a load of our current complexity is. This removes loads of routing design work like OSPF areas, BGP configuration (e.g. route reflectors), aggregation, designing for protocol scalability, and metric manipulation, which is what I would call in-depth routing protocol knowledge. The way I am thinking of it, it’s like a load of smaller ASes linking into the SD-WAN, with the overlay running its own custom routing protocol. This, I think, removes tons of complicated routing protocol design, as you’re not scaling the protocol across the whole network.

      We are likely to be deploying VMware VeloCloud or Silver Peak to replace our Cisco routers that are currently managed by a Managed Service Provider (a large, old national service provider). The current WAN runs BGP, OSPF, and EIGRP with huge amounts of redistribution to be managed all over the place. We also have a DMVPN network running over the MPLS with multiple VRF-Lite instances. There are multiple routing relationships within each VRF, adding to the complexity. This was all designed and deployed by the MSP who currently manages it. The majority of network issues and changes get raised up to the MSP network architecture team because of the complexity, and because the MSP’s Tier 1, 2 and 3 teams do not have the relevant knowledge.

      Think of the knowledge and design simplification in our network when moving to SD-WAN in the following topics, which are massive:
      – DMVPN and all the components such as mGRE, NHRP, PKI on Cisco IOS, SCEP
      – BGP routing. Removal of route reflectors on DMVPN hubs, and of the loads of prefix-lists everywhere that modify metrics
      – Removal of multiple VRF routing protocols
      – QoS policies. Simpler configuration around policing, shaping, CBWFQ and all the complications on Cisco IOS when implementing QoS

      I think SD-WAN is currently simple to manage compared to traditional routers; however, more features will be pushed into the products. SD-WAN is already showing features that are not possible in traditional routers, and the number of these features will only grow. I think Infrastructure as Code will grow out into SD-WAN, so we will manage SD-WAN networks differently.

