The next person I interviewed about the future of networking is my friend Pete Lumbis. Pete used to lead routing escalations in the TAC at Cisco and now works at Cumulus as an SE. Pete holds both a CCIE and a CCDE.
Daniel: The networking world is changing. What are the major changes coming up in the next few years that you think we will see?
Pete: Automation is the big thing these days, whether through APIs or abstraction tools like Ansible or Puppet. I think we will see a broader embrace of automation, but as a side effect we will have to start building networks that are more automation-friendly by creating fewer exceptions and one-offs. This touches on a larger point, which is the need to build systems and networks that are less fragile. Automation is less scary when you have an architecture that can tolerate some level of failure.
Daniel: What are the major skills that people in networking need to learn to stay ahead of the curve?
Pete: Fundamentals don’t change. ARP is ARP. MAC addresses are still 48 bits long. Understanding fundamentals will always be key. Beyond that, it’s going to be about talking to other parts of the tech organization. How do we have meaningful conversations with application, systems, and storage stakeholders so we can build all-routed networks? What can we do to stop creating poorly designed networks to solve problems in poorly designed applications? Failure to do this will just make the network team look bad and will drive our customers (server and app owners) to the public cloud. We don’t need to become Docker experts, but just like we know how email and voice operate over our pipes, we’ll need a better understanding of storage, containers, and applications.
Daniel: Is this drive for automation and SDN really that much different than anything we have seen in the past?
Pete: No, but what’s different is that they aren’t being created in a vacuum. On the automation front there have been incredible advances in the server space, driven by the need to rapidly spin servers up and down in the cloud. As a result, Ansible, Puppet, and Salt have really taken off. They have solved so many problems in the server space that it’s hard for networking to ignore them. There have been solutions like this in the past (EEM, Cisco Security Manager, NMS stations), but they all tried to solve the same problems without any kind of sharing. Now that we have open source tools to work from, a lot of the groundwork is already laid for us to build on top of.
On the SDN front it’s had some great marketing names in the past, like Application Aware Networking or AVVID. We’ve tried for years to give the network more dynamic control, but we’ve always failed because, again, we were solving problems in a vacuum, without talking to the stakeholders. We try to have the network guess what application is on the wire instead of just providing value to the application owner and identifying it before it enters the network. I think products like VMware’s NSX are smart because they move that edge closer to the application owner to get that “software defined” goodness. The SDN efforts that are the most successful will be easy for the server and apps people to consume and will provide direct value to them.
Daniel: What does the network engineer of the future look like?
Pete: Network engineers will have to become more of a jack of all trades. If we look at SREs (Site Reliability Engineers) or DevOps, the technical success there has come because the people in these roles can work across boundaries. They understand Linux and storage and databases and can span the architecture to solve issues. I see the network engineer of the future being a key member of those teams. The network engineer will be the expert at networking (just like the person next to them may be a Linux expert), but with a much broader skill set. This is possible because the networking industry is starting to coalesce around a few solutions. We have fewer protocols, we have simpler hardware platforms, and the hardware is all starting to look the same. It’s going to take the same energy as before, just applied differently.
Daniel: What does the network architect of the future look like?
Pete: The network architect of the future, like the network engineer of the future, will need to cross domains even more than they do today. The difference is that today we gather requirements and then build an incredibly bespoke network to suit those bespoke application needs. In the future we will build a simple network (think layer 3 datacenter Clos) and then work with application and systems owners to help them understand the rules and how to be successful in that environment. Beyond this, the network architect will have to take some lessons from the server space: selecting the best automation tool for the job, modeling, and automated change management and testing.
Daniel: How will the traditional networking skills such as knowledge of protocols be relevant in the future, if at all?
Pete: They will always be important, just less so. The fine-grained differences between OSPF and IS-IS will be less critical, but I have a conversation almost once a week about why VXLAN’s UDP encapsulation solves a lot of the problems with hashing and ECMP that GRE and IP-in-IP have had in the past. I don’t need to know every bit field, but that little bit of knowledge at the protocol level makes a huge difference in the architecture.
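Pete's hashing point is easy to see in a toy model. The sketch below uses a deliberately simplified link-selection "hash" (real switches use CRC- or XOR-fold functions over the outer 5-tuple), and all port and protocol choices are illustrative. It shows that a GRE outer header gives the ECMP hash nothing flow-specific to work with, while VXLAN's inner-flow-derived UDP source port spreads flows across equal-cost links.

```python
# Toy model of ECMP link selection for tunneled traffic.
# Real hardware hashes the outer 5-tuple with something like a CRC;
# a simple XOR-and-modulo stands in for that here.

LINKS = 4  # equal-cost paths between two switches

def ecmp_link(proto, sport, dport, links=LINKS):
    # Stand-in for a 5-tuple hash (source/dest IPs are omitted because
    # they are the same tunnel endpoints in both cases below).
    return (sport ^ dport ^ proto) % links

# Ten inner TCP flows between the same pair of tunnel endpoints.
inner_sports = [10000 + i for i in range(10)]

# GRE (IP protocol 47): the outer header has no ports at all, so every
# inner flow produces an identical outer header and lands on one link.
gre_links = {ecmp_link(47, 0, 0) for _ in inner_sports}

# VXLAN (UDP, protocol 17, dest port 4789): the sender derives the outer
# UDP source port from the inner flow, injecting per-flow entropy.
vxlan_links = {ecmp_link(17, 49152 + (s % 16384), 4789)
               for s in inner_sports}

print("GRE uses", len(gre_links), "of", LINKS, "links")      # 1 link
print("VXLAN uses", len(vxlan_links), "of", LINKS, "links")  # all 4 here
```

The same asymmetry holds with real hash functions: identical outer headers always collide onto one path, while varying outer source ports distribute across paths.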
Beyond this, as I mentioned earlier, there’s a protocol consolidation happening. BGP as the routing protocol. Segment routing for label switching. PIM is still around, but everyone seems to be excited about BIER (Bit Index Explicit Replication). The alphabet soup of Frame Relay, X.25, LDP, RSVP, and RIP is (hopefully) behind us.
Daniel: Is it still valuable to learn traditional networking skills?
Pete: Absolutely. Even on a team where skills are mixing, you still need to be able to build and troubleshoot the network. Understanding how routing works can let you build anycast networks for far less. Understanding MAC flooding can help explain duplicate packets or periodic slowness on the network. Vendors will always be there to help, but without the full picture of the environment that you have as the network engineer (or as the team member who specializes in networking), it will always take the vendor much longer to solve the problem.
Daniel: Do all network engineers have to become programmers?
Pete: Nope. I do think learning some basic concepts of programming is infinitely useful. Understand how to build a loop. Understand how an if/then statement works. Pick a language, even Bash scripting, and be able to write some terrible code in it to do your job. My poor programming skills have made my life WAY easier, but I told my current employer during the interview process that if they want me to program, I shouldn’t work there. No one should pay me for my terrible code. I can use a little Python, but I don’t know Go, Ruby, or Perl. If you talk to systems administrators, they have varying levels of scripting skill, but they are quick to point out that they are not programmers. I believe it’s similar for us.
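The "loop plus if/then" level Pete describes is roughly this much code. A hypothetical sketch (the device names and backup list are invented): loop over a device inventory and flag anything missing from last night's config backups.

```python
# Minimal "terrible but useful" operational script: find devices
# that did not show up in last night's config backups.
# All hostnames here are made up for illustration.

devices = ["leaf01", "leaf02", "spine01", "spine02"]
backed_up = {"leaf01", "spine01", "spine02"}

missing = []
for device in devices:           # the loop
    if device not in backed_up:  # the if/then
        missing.append(device)

print("missing backups:", missing)  # -> missing backups: ['leaf02']
```

Nothing more sophisticated than a loop and a conditional, but it replaces a tedious manual comparison.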
Daniel: How should network engineers find interesting projects to work on if everyone will build their networks in a similar fashion?
Pete: Move up, down, or laterally in the stack. Move up and learn how to provide more value to the business and the stakeholders. Move down the stack and get involved in open source or the IETF with protocol and technology creation, or move laterally and help other people build better products with your networking skills. I think Docker is a super cool technology, but networking in its first release was 100% designed around PAT. No self-respecting network engineer would have allowed that. As a result, the folks over at SocketPlane got acquired to help Docker fix its networking. Most people wouldn’t have thought of Docker as needing network expertise.
Daniel: Do you think we will see a move towards more standardized solutions where the customer buys compute, storage and networking in fixed packages?
Pete: I expect to see the opposite. The success of servers over the last 20-30 years has been that you can buy a server and run any OS on it, and that OS behaves mostly the same on every server. A few years ago VMware came along and said you can run VMs on any server and they’ll mostly look the same. Now Docker is saying you can run containers and they’ll mostly look the same. That ability to create small components with clean abstraction layers (hardware to OS, OS to application) has led to incredible innovation.
I expect to see only more of this. Multiple switch platforms that all look mostly the same (which we see with whitebox/britebox switches), operating systems that look similar (Cumulus Networks, Pica8, OpenSwitch), and then applications that can run on top (Quagga, GoBGP, BIRD). (Disclaimer: I work for Cumulus Networks, so I have a bias towards this future.)
When you have these clean lines, it doesn’t matter what gets plugged in. Whether it’s an EMC storage array, a Ceph storage node, or Nutanix, it’s just a place to park bits. There will always be a market for customers who want the turnkey approach, where you plug in a rack and it just works, but I expect a lot of those customers to move to the cloud in the next few years.
Daniel: What is the next big thing that people should prepare for?
Pete: Giving up control. If I knew the next big tech I’d be rich 🙂
But seriously, network engineers are control freaks. We love our boxes. We love our protocols. We love our SNMP stats (okay, no one loves SNMP). My point is that we’ve always had very precise control of the network. In order to build more cost-effective networks that do more, we have to give up some of that control. Again, using VMware’s NSX as an example, it’s a security/policy engine running in the hypervisor. In some shops the network team controls that; in others it’s driven by the server team. At Cumulus we are working with customers to run BGP on their servers to get rid of layer 2 and MLAG entirely. Again, this is only possible when we are willing to give up a little bit of control.
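The routing-on-the-host idea Pete mentions can be sketched as an FRR/Quagga-style configuration fragment. This is purely illustrative, not a Cumulus-endorsed design: the AS number, router ID, loopback address, and interface names are placeholders, and it assumes BGP unnumbered peering to two top-of-rack switches so the server's /32 loopback stays reachable over either link with no layer 2 or MLAG involved.

```
! Hypothetical routing-on-the-host fragment (FRR-style syntax).
! The server peers over both uplinks and announces its own loopback;
! redundancy comes from routing, not from MLAG or spanning tree.
router bgp 65101
 bgp router-id 10.0.0.101
 neighbor eth0 interface remote-as external
 neighbor eth1 interface remote-as external
 address-family ipv4 unicast
  network 10.0.0.101/32
```

Services then bind to the loopback address, and the fabric simply sees another ECMP-reachable /32.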
Daniel: How do you stop sipping from the firehose? It’s tough to keep up with automation, SDN, cloud and everything going on in the industry.
Pete: This is hard, and I struggle with it every day. The first part is to build simpler things. Try to limit the number of choices your stakeholders (servers and apps) get in order to give them lower costs (financial and emotional) and greater reliability. After that, find specific problems with clear goals and implement them. Learn Ansible by using it to build configuration templates. Learn AWS by building a VPN between your public and private clouds. It’s always hard to pick up a technology when you don’t know where you’re going with it. I’ve struggled with containers because, beyond understanding basic operations, I don’t have a broader problem that they solve for me. Be willing to walk away from something that’s “cool” or “hip” if it makes your life more difficult or doesn’t solve your problems.
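Pete's "learn Ansible by building configuration templates" suggestion can start about this small. A hypothetical playbook (the group name, template name, and paths are invented for illustration) that renders one config file per device from a Jinja2 template:

```yaml
# Hypothetical Ansible playbook: render a per-device config from a
# Jinja2 template. "switches" is an inventory group you would define;
# interfaces.j2 would loop over per-host variables (VLANs, uplinks).
- name: Build switch configs from a template
  hosts: switches
  gather_facts: false
  tasks:
    - name: Render interfaces.j2 into a per-device config file
      template:
        src: interfaces.j2
        dest: "configs/{{ inventory_hostname }}.conf"
      delegate_to: localhost
```

Generating files on localhost first, rather than pushing straight to devices, is a low-risk way to learn the tool: you can diff the rendered configs before anything touches the network.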
Thanks to Pete for some very insightful answers!