When building a VXLAN network, what are the considerations for choosing the underlay protocol such as OSPF, IS-IS, or BGP? You obviously want the design to be supported by your vendor of choice. Your staff should also be able to support the design, although I think it’s reasonable to expect a Network Engineer to have some level of knowledge of OSPF and BGP, so this should not be the main deciding factor. Let’s dive into the different protocols and walk through their characteristics and how they can be used as underlay protocols in a VXLAN network. I will compare OSPF to BGP, as IS-IS basically provides all the benefits of OSPF plus some additional ones, but it has less vendor support and is less well known to most Engineers.
OSPF
Protocol overview – OSPF is a link state protocol that builds a Link State Database (LSDB) and runs the Shortest Path First (SPF) algorithm based on Dijkstra’s work to calculate the shortest path. It relies on flooding Link State Advertisements (LSAs). All routers in an area need an identical LSDB.
Adjacencies and transmitting protocol packets – OSPF transmits packets directly over IP, using IP protocol 89. It can send packets both as unicast and multicast. When running in point-to-point mode, which is the common scenario in a VXLAN network, it uses only link-local multicast to transmit packets. OSPF establishes adjacencies with directly connected devices also running OSPF. In a leaf and spine network, the leaf forms an adjacency with all the spines, which is a relatively low number of devices, most likely two or four. The spine forms adjacencies with all the leafs.
Numbering – OSPF can run over unnumbered as well as numbered links. The ability to run over unnumbered links can save on management of IP addresses as only loopbacks need to be configured with IP addresses.
Scalability – There are different factors when it comes to scalability such as the number of routes supported and the number of devices supported. While OSPF is an IGP, and hence not designed to carry millions of routes, a VXLAN network will typically only carry loopbacks and possibly some link networks which makes the number of routes small. The main consideration for OSPF is the number of adjacencies and the flooding required. The number of adjacencies on the leaf will be low as it only connects to the spines. The number on spines will be higher but considering the number of ports on the spine, every spine will likely only connect to 30-40 leafs. This is a relatively high number of adjacencies but should still not pose a problem. There are service provider networks with hundreds or thousands of devices in a single area although the number of adjacencies per device tends to be lower.
Convergence – OSPF can be very fast to converge. Based on events like link down, it floods updates that trigger other devices to run SPF. In a datacenter environment where everything is connected without intermediate devices, Bidirectional Forwarding Detection (BFD) is not strictly necessary, but OSPF can be configured to leverage BFD. Let’s say that there is a network with two leafs, L1 and L2, and two spines, S1 and S2. Traffic is flowing between L1 and L2 via S1. The link from L1 to S1 goes down. S1 is obviously aware of this, floods the update, and runs SPF. L2 runs SPF based on the updated information and stops using the path towards S1, which was previously installed as one of its ECMP routes. Flooding the LSA, running SPF, and invalidating the route should easily be done in under a second.
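To make the L1/L2/S1/S2 scenario concrete, here is a minimal Python sketch of an ECMP-aware SPF run from L2's point of view, before and after the L1 to S1 link is removed. It is an illustration only: the `lsdb` dictionary, the link cost of 1, and the helper names `spf_next_hops` and `remove_link` are assumptions made for this example, and real OSPF implementations work on LSAs rather than a plain adjacency map.

```python
import heapq

def spf_next_hops(lsdb, source):
    """Dijkstra from `source`; returns {destination: (cost, set of first hops)}."""
    best = {source: (0, set())}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            # Leaving the source, the first hop is the neighbor itself;
            # further out, inherit the first hops used to reach `node`.
            hops = {neighbor} if node == source else best[node][1]
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, set(hops))
                heapq.heappush(queue, (new_cost, neighbor))
            elif new_cost == best[neighbor][0]:
                best[neighbor][1].update(hops)  # equal cost: add ECMP next hops
    return best

def remove_link(lsdb, a, b):
    """Simulate the flooded update: the link disappears in both directions."""
    lsdb[a].pop(b, None)
    lsdb[b].pop(a, None)

# Hypothetical two-leaf, two-spine fabric, every link with cost 1.
lsdb = {
    "L1": {"S1": 1, "S2": 1},
    "L2": {"S1": 1, "S2": 1},
    "S1": {"L1": 1, "L2": 1},
    "S2": {"L1": 1, "L2": 1},
}

before = spf_next_hops(lsdb, "L2")["L1"]  # two equal-cost next hops
remove_link(lsdb, "L1", "S1")
after = spf_next_hops(lsdb, "L2")["L1"]   # only the path via S2 remains
print(before, after)
```

The cost to reach L1 stays the same after the failure; only the set of next hops shrinks, which is why the traffic simply shifts to the remaining spine.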
Policy – OSPF does not offer much when it comes to traffic engineering or filtering. Because it’s a link state protocol, all devices must have the same view (LSDB) of the area. Generally speaking, there is not much need for policy in the underlay, though. It should just provide routing between loopbacks and be able to install Equal Cost Multi Path (ECMP) routes.
Support – OSPF is a protocol that should be available on any device as well as supported by vendors. It’s also a protocol that every Network Engineer should have some experience with. Setting it up as the underlay protocol in a VXLAN network is trivial. It’s a matter of only a few lines of configuration. It also does not require a lot of tweaking as opposed to BGP.
Ability to template – Because OSPF can run over unnumbered links, it is easy to make templates with very little unique configuration where the only thing that is different from underlay perspective is the loopbacks in use. It doesn’t get easier than this!
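As a rough illustration of how little differs per device, the sketch below renders an underlay from one shared template. The output is vendor-neutral pseudo-config, not any real CLI syntax, and the loopback pool (a documentation range), the hostnames, and the function name `render_underlay` are all assumptions made for this example.

```python
import ipaddress

# Assumption: loopbacks are taken from a documentation range.
LOOPBACK_POOL = ipaddress.ip_network("192.0.2.0/24")

# Vendor-neutral pseudo-config shared by every device; not a real CLI.
UNDERLAY_TEMPLATE = """\
hostname {hostname}
loopback0 address {loopback}/32
router-id {loopback}
{uplinks}"""

UPLINK_TEMPLATE = "uplink {port}: unnumbered loopback0, ospf area 0, point-to-point"

def render_underlay(hostname, index, uplink_count):
    """Render the underlay for one device; only hostname and loopback are unique."""
    loopback = LOOPBACK_POOL[index]
    uplinks = "\n".join(
        UPLINK_TEMPLATE.format(port=port) for port in range(1, uplink_count + 1)
    )
    return UNDERLAY_TEMPLATE.format(
        hostname=hostname, loopback=loopback, uplinks=uplinks
    )

leaf1 = render_underlay("leaf1", 1, uplink_count=2)
leaf2 = render_underlay("leaf2", 2, uplink_count=2)

# Count the lines that differ between the two rendered devices.
differing = sum(a != b for a, b in zip(leaf1.splitlines(), leaf2.splitlines()))
print(leaf1)
print("lines that differ between leaf1 and leaf2:", differing)
```

Because the uplinks are unnumbered, their lines are identical on every device; only the hostname and the loopback-derived lines change.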
Summary – OSPF works very well as the underlay protocol in a VXLAN network. It does what it was designed to do, which is to be an Interior Gateway Protocol.
BGP
Protocol overview – BGP is a path vector protocol that is best known for its usage on the internet but is also commonly used in enterprise networks as well as in, for example, MPLS VPNs.
Adjacencies and transmitting protocol packets – BGP establishes peering with neighbors that have been specifically configured. It does this over TCP port 179. It uses only unicast to transmit its packets.
Numbering – Because BGP uses only unicast, it can’t discover neighbors on its own the way OSPF can by leveraging multicast. BGP therefore requires configuring neighbors specifically, unless using a feature where BGP listens for neighbors within a certain IP range. BGP could be said to run over unnumbered links if using RFC 5549 (advertising IPv4 prefixes over IPv6 Link Local Address (LLA)), although this is not technically correct as an IPv6 link local address is still an IP address. It does remove the need for configuring an IPv4 address, though. Another consideration is that BGP also requires the use of an AS number.
Scalability – BGP being the protocol of choice on the internet should say something about its scalability. It supports millions of routes. Not something needed in an underlay, though. The number of neighbors is an interesting factor, as sending BGP updates to all neighbors becomes similar to flooding LSAs in OSPF. BGP is well proven in hyper scaler networks, so we can safely assume that it scales beyond what OSPF can do.
Convergence – BGP has notoriously slow default timers. For example, on Cisco devices it’s a keepalive of 60 seconds and a hold time of 180 seconds. This should not matter, though, as long as there is a link down event so the routing table can be updated and BGP updates generated. I would expect BGP to be slightly slower than OSPF, but sub-second convergence should still be possible. BGP also supports the use of BFD.
Policy – This is where BGP really shines. It has several attributes that can be manipulated such as local preference and it supports communities, filtering, and much more. Whether this is useful in an underlay is debatable.
Support – I would argue that OSPF is available on more devices than BGP, but BGP should be available in any product used to build a VXLAN network. BGP is a protocol that should be supported by vendors, although it can vary. OSPF should be better known to Engineers in general, but the use of BGP has also become more prominent in enterprise networks.
Ability to template – BGP is at a disadvantage here. It generally requires numbered links. It also requires the use of an ASN. Depending on the type of design, a unique ASN could be used per leaf. This means you need to assign several unique parameters, such as IP addresses and ASNs, to each device.
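To show how the unique parameters add up, here is a small Python sketch that allocates an ASN per leaf and a /31 per leaf-to-spine link. It assumes one common design (a shared spine ASN and a unique private ASN per leaf); the ASN base, the documentation-range link pool, and the function name `bgp_underlay_params` are illustrative assumptions, not a recommendation.

```python
import ipaddress

def bgp_underlay_params(leaves, spines, base_asn=64512,
                        link_pool="198.51.100.0/24"):
    """Allocate a unique ASN per leaf and a /31 per leaf-spine link."""
    # The generator raises StopIteration if the pool runs out of /31s.
    subnets = ipaddress.ip_network(link_pool).subnets(new_prefix=31)
    spine_asn = base_asn  # assumption: all spines share one ASN
    params = {}
    for offset, leaf in enumerate(leaves, start=1):
        neighbors = []
        for spine in spines:
            link = next(subnets)  # one numbered /31 per link
            neighbors.append({
                "peer": spine,
                "peer_asn": spine_asn,
                "peer_ip": str(link[0]),   # spine side of the link
                "local_ip": str(link[1]),  # leaf side of the link
            })
        params[leaf] = {"asn": base_asn + offset, "neighbors": neighbors}
    return params

params = bgp_underlay_params(["leaf1", "leaf2"], ["spine1", "spine2"])
for leaf, values in params.items():
    print(leaf, values["asn"], [n["local_ip"] for n in values["neighbors"]])
```

Every leaf ends up with its own ASN plus one local and one peer address per uplink, all of which must be tracked somewhere, whereas the unnumbered OSPF case only needs a loopback per device.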
Summary – BGP is a protocol that scales extremely well and is well proven in large networks. It does require more configuration and some nerd knobs to get it to behave as desired as the protocol of choice for an underlay.
Which protocol is better?
The whole point of this post is to compare and contrast, and to show that neither protocol is inherently better than the other. It’s my hope that posts like this one get you to think about the different aspects with an analytical mind rather than just building what others, such as hyper scalers, are doing. Remarks like “BGP scales better!” are often thrown out there, but a network where the spine has 32 ports and the leaf has 48 ports gives you 32 leafs with 1536 ports in total. I don’t know many enterprises that need more than 1536 bare metal servers. That’s a lot of VMs! The other argument I often see is that it’s easier to run one protocol instead of two. Is it really, though? Consider that OSPF is better known to most Engineers, that it creates a very clean separation of underlay and overlay, and that it requires no nerd knobs, as BGP does, to make it behave more like an IGP.
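The port arithmetic above can be sketched in a few lines of Python. This follows the same simplification as the text: every spine port connects to one leaf, and the 48 leaf ports are counted as server-facing, with uplinks assumed to be separate ports. The function name `fabric_capacity` is made up for this example.

```python
def fabric_capacity(spine_ports, leaf_server_ports):
    """Return (leaf count, total server-facing ports) for a two-tier fabric.

    Assumes each spine port connects to exactly one leaf, so the spine
    port count caps the number of leafs.
    """
    leaves = spine_ports
    server_ports = leaves * leaf_server_ports
    return leaves, server_ports

leaves, server_ports = fabric_capacity(spine_ports=32, leaf_server_ports=48)
print(leaves, server_ports)
```

In the same fabric, each spine would hold 32 adjacencies, one per leaf, which is in the range described in the scalability discussion for OSPF.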
I hope after reading this post that you have a good feel for what the protocols provide and that every network needs to be designed based on requirements.
Further reading
If you want to dive even deeper, I have provided some recommended reading here:
Is OSPF or IS-IS Good Enough for My Data Center?
Is EBGP Really Better than OSPF in Leaf-and-Spine Fabrics?
In Defense of OSPF In The Underlay (In Some Situations)
What I’ve learned about scaling OSPF in Datacenters
Good subject @daniel
Thanks!
Great topic. I’d like to see some comments on IS-IS in the future, as I’m running it on a SDA Fabric environment (DNA-C). Also just a slight grammar correction as adjacencies is misspelled.
Thanks, Bruno!
IS-IS is like the slightly better version of OSPF. Both v4 and v6 in one protocol, some better behavior in regards to flooding, scales a bit better, uses TLVs instead of different types of LSAs. It would be my preferred choice if my team is able to support it.
From reading this post I would say that OSPF should be the default choice for the underlay unless there is a good reason to use BGP. Interestingly, the current CCNA Official Cisco Guide textbook states in chapter 17, ‘Cisco Software-Defined Access (SDA)’: “The switches use the IS-IS protocol”
So would this be specific to the SDA architecture or is it a recommendation?
> BGP being the protocol of choice on the internet should say something about its scalability. It supports millions of routes. Not something needed in an overlay, though.
Small typo, I think you meant underlay.
Correct! Thanks!
A very interesting read, Daniel! Although I have never implemented VXLAN, I hope that in the future I will be better prepared for choosing the right underlay routing protocol.