Before diving into a new technology, it is always useful to understand the previous generation of technology, what its limitations were, and how the new technology intends to overcome them. In this post, let’s look at some of the challenges with L2-based networks and how VXLAN/EVPN can overcome them. Before starting, I want to balance the messaging a bit on the bad reputation that STP gets:
- Radia Perlman did an excellent job with what was available at that time.
- A lot of the bad reputation comes from a misunderstanding of the protocol.
- STP-based networks can run just fine but they are often misconfigured (related to the point above).
- Many issues come from misbehaving end user devices where protection mechanisms have not been implemented (see the point above).
- It’s natural for technologies to evolve as more compute becomes available and we gain experience.
Keep in mind that the original 802.1D standard was published in 1990. This was long before the internet was generally available and before our networks became critically important to us. At that time, we didn’t measure outages in seconds or even minutes. That said, let’s look at the limitations of a traditional L2 network.
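Convergence – In the original 802.1D standard, convergence was slow as ports had to go through blocking, listening, and learning before forwarding. With default timers, this could take up to 50 seconds: 20 seconds for max age to expire, followed by 15 seconds each in listening and learning. This was much improved in 802.1w (RSTP), which uses a synchronization process (a proposal/agreement handshake between neighboring switches) and can often converge within a couple of seconds or less.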
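The figure of around 50 seconds follows from the default 802.1D timers. Here is a minimal sketch of that arithmetic, assuming the default values (max age of 20 seconds, forward delay of 15 seconds). The function and variable names are only illustrative.

```python
# Default 802.1D timers in seconds (assumed defaults; they are configurable).
MAX_AGE = 20        # how long stored BPDU information is kept before aging out
FORWARD_DELAY = 15  # time spent in each of the listening and learning states


def worst_case_convergence(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Indirect failure: age out old information, then listening + learning."""
    return max_age + 2 * forward_delay


def port_up_delay(forward_delay=FORWARD_DELAY):
    """A port that just came up only goes through listening + learning."""
    return 2 * forward_delay


print(worst_case_convergence())  # 50
print(port_up_delay())           # 30
```

Lowering the timers shortens the outage, but the states are still walked through one after another, which is the step 802.1w improves on.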
VXLAN/EVPN networks are routed. This means we can leverage technologies like ECMP to have multiple paths installed. If a link fails, convergence is almost instantaneous as there is already another path in the RIB/FIB. This is typically faster than what an L2 network can provide.
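To make the ECMP idea concrete, here is a small sketch of how a flow could be mapped to one of several equal-cost next hops, and what happens when one of them fails. It is a simplification: real switches use vendor-specific hardware hash functions, so SHA-256 is only a stand-in, and the device names and addresses are hypothetical.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Hash the flow 5-tuple and pick one of the equal-cost next hops."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical leaf with two spine uplinks installed as equal-cost paths.
next_hops = ["spine1", "spine2"]
# Flow 5-tuple: source IP, destination IP, protocol, source port, destination port.
flow = ("10.0.0.10", "10.0.1.20", 6, 49152, 443)

before = pick_next_hop(flow, next_hops)

# Simulate a failure of the chosen uplink: the other path is already
# installed, so the flow simply hashes onto what is left.
surviving = [hop for hop in next_hops if hop != before]
after = pick_next_hop(flow, surviving)

print(before, "->", after)
```

The point of the sketch is that nothing has to be recomputed network-wide when the link fails. The alternate next hop was already there.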
Inefficient use of available links – An L2-based network using STP builds a tree topology. To keep the topology loop free, some links have to block traffic. As a result, not all links can be used, and the blocked links sit idle when they could have been carrying traffic. This can be seen as poor use of the funds invested in the network.
In a routed network using a Clos topology, also commonly called leaf and spine, there is no need to build a tree topology. All links can be fully utilized and installed as equally good paths using ECMP. This provides a better return on investment.
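The difference in link utilization can be illustrated with a toy topology of two spines and four leafs, where every leaf connects to every spine. The sketch below builds a loop-free tree with a breadth-first search and counts the leftover links. Note that this is only an approximation: real STP elects the Root and chooses ports based on bridge IDs and path costs, and the switch names are made up.

```python
from collections import deque

spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3", "leaf4"]
# Every leaf has one link to every spine: 8 links in total.
links = [(leaf, spine) for leaf in leaves for spine in spines]


def spanning_tree_links(root, links):
    """Breadth-first search from the root, keeping the link that first reaches each switch."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    visited = {root}
    tree = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer not in visited:
                visited.add(peer)
                tree.append((node, peer))
                queue.append(peer)
    return tree


forwarding = spanning_tree_links("spine1", links)
blocked = len(links) - len(forwarding)

print(f"Tree: {len(forwarding)} of {len(links)} links forwarding, {blocked} blocked")
print(f"Routed with ECMP: {len(links)} of {len(links)} links usable")
```

A tree over six switches can only ever use five links, so three of the eight links sit idle in this example, while the routed design can use all eight.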
Suboptimal forwarding – Because the topology is a tree rooted at the Root switch, there may be direct links between switches that can’t be used because STP has blocked them. Traffic must instead follow the tree towards the Root, even when a shorter path physically exists.
In a VXLAN/EVPN network, traffic can flow optimally as the topology is not a tree. There is only one intermediate hop, a spine, between any two leafs in the topology.
Lack of ECMP – There is no concept of ECMP in L2-based networks. As described before, if multiple links are available between two switches, all but one will have to be blocked. The only way of overcoming this is to bundle the links with link aggregation using, for example, LACP.
We have already established that VXLAN/EVPN supports ECMP.
Network scale – 802.1Q provides 12 bits for the VLAN ID, meaning that only 4096 VLAN IDs are available (4094 usable, as 0 and 4095 are reserved). This is not enough in a large DC environment. Note that there have been technologies such as Q-in-Q that were designed to overcome this limitation. Q-in-Q was mainly used in service provider networks, though.
With VXLAN, the VXLAN Network Identifier (VNI) is 24 bits, supporting up to roughly 16 million different networks, far more than 802.1Q can provide.
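The numbers are easy to verify, and it can help to see where the 24 bits actually sit. The sketch below computes both ID spaces and packs the 8-byte VXLAN header as laid out in RFC 7348 (flags, reserved bits, VNI, reserved bits). The helper name and the example VNI of 10100 are my own choices for illustration.

```python
import struct

VLAN_ID_BITS = 12
VNI_BITS = 24

print(2 ** VLAN_ID_BITS)  # 4096
print(2 ** VNI_BITS)      # 16777216


def build_vxlan_header(vni):
    """Pack the 8-byte VXLAN header: flags, 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI must fit in 24 bits")
    flags_word = 0x08 << 24  # I flag set (VNI is valid), reserved bits zero
    vni_word = vni << 8      # VNI in the upper 24 bits, low 8 bits reserved
    return struct.pack("!II", flags_word, vni_word)


header = build_vxlan_header(10100)
print(header.hex())  # 0800000000277400
```

The VNI value 10100 is 0x002774 in hex, which is visible in bytes five through seven of the header.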
There are of course more limitations, but these are some of the most prevalent ones. I hope you will join me in the next post, which will introduce the concepts of VXLAN and EVPN.