Sometimes a customer needs an L3 VPN between two locations where the same SP is not present. This can be on a national or international basis. It would be possible to buy an Internet circuit and run an overlay such as DMVPN, but what if the customer wants to buy an MPLS VPN circuit?
The customer could buy a VPN from SP1 in location1 and a VPN from SP2 in location2. The two SPs would then have to exchange traffic somehow to make the customer circuit end to end. The concept is shown in the following topology.
The customer connects to the PE of each of the SPs. The SPs need to interconnect at some common point, either through a public peering point such as an IX or with a private interconnect at a common location. The routers that connect to each other are called autonomous system border routers (ASBRs). There are three main options and a fourth option which combines two of the others.
Inter-AS Option A
Option A is the simplest way to interconnect the ASBRs. Each customer VRF requires either a physical interface or, more likely, a subinterface. Option A has the following characteristics.
- Each ASBR thinks the other is a CE
- One logical interface per VPN
- Link may use any supported PE-CE protocol
- Packets are sent unlabelled between the ASBRs
- QoS policies are negotiated and manually configured on the ASBRs
- The most secure and easy option to provision
- Does not scale well to a large number of VPNs
As the diagram shows, the LSP is between the PE and the ASBR within each SP; there is no end-to-end LSP. Packets are sent unlabelled between the ASBRs. Each ASBR considers the other one to be a CE, meaning that any PE-CE routing protocol is supported, such as static routes, an IGP or BGP. Do note that if BGP is used, the updates are sent as IPv4 updates and not VPNv4.
Option A is the simplest to use and requires the least amount of trust between the SPs. It works well when providing VPNs to another SP is rare. It can get cumbersome if two SPs have an agreement and exchange traffic for hundreds of VPNs between them. The number of BGP sessions between the ASBRs can become a scaling issue depending on the platform in use. One advantage of Option A is that the SPs do not need to use the same RT values since VPNv4 updates are not exchanged. There is also no need to disable the automatic RT filter.
Since the ASBR will generate a VPNv4 update to its local AS, there is no need to manipulate the next-hop or redistribute the ASBR link into the IGP. The next-hop will automatically be set to an address of the ASBR in the global table, otherwise there would be no reachability.
Another point to consider with Option A is that the ASBR will have to install all the routes into the RIB/FIB, which may also become a scaling factor together with the number of BGP sessions.
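To make the scaling concern a bit more concrete, here is a minimal Python sketch that counts what a single ASBR has to carry under Option A. The function name and the inputs are hypothetical, and it assumes that eBGP is the protocol used on every ASBR-to-ASBR subinterface and that every VPN has the same number of routes.

```python
def option_a_asbr_resources(num_vpns, routes_per_vpn):
    """Rough per-ASBR resource count for Inter-AS Option A.

    Assumption (not from any vendor documentation): eBGP is used as the
    routing protocol towards the other ASBR in every VRF, and every VPN
    carries the same number of routes.
    """
    return {
        "vrfs": num_vpns,                 # one VRF per customer VPN
        "subinterfaces": num_vpns,        # one logical interface per VPN
        "ebgp_ipv4_sessions": num_vpns,   # one IPv4 session per VRF
        "rib_fib_routes": num_vpns * routes_per_vpn,  # all routes are installed
    }


for vpns in (5, 500):
    print(vpns, option_a_asbr_resources(vpns, routes_per_vpn=200))
```

Everything grows linearly with the number of VPNs, which is fine for a handful of customers and painful for hundreds.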
Inter-AS Option B
Inter-AS Option B is a more scalable solution than Option A. It does not require any VRFs on the ASBRs; instead it uses an eBGP VPNv4 session to exchange VPNv4 updates. It has the following characteristics.
- Single interface to connect the ASBRs
- Packets are sent labelled between the ASBRs
- More complex to implement QoS
- No need for VRFs on the ASBR
- ASBRs must be directly connected
- Less secure and requires more trust between SPs
- Less granular traffic engineering and per customer control (maximum routes)
- Scales better than Option A
- Does not support BGP PIC Edge
There are a few different ways to implement Option B. As can be seen in the diagram, there is an end-to-end LSP, although in reality it consists of three LSPs that are stitched together.
How does the ASBR know what label to use when sending packets to the other ASBR? There is no IGP or LDP running on the link, so BGP is used to generate the labels. This means that both an eBGP VPNv4 session and an IPv4 session need to be set up between the ASBRs. Since the VPNv4 session is eBGP, the next-hop in updates sent to the other ASBR will be the local ASBR. Each ASBR needs to generate a label for the next-hop it uses when sending BGP updates to the other.
One point to consider for Option B is that the ASBR must store all the BGP updates although it will not install them into any VRF. This also means that automatic route target filtering needs to be disabled on the ASBR.
I mentioned that there are a few different methods to implement Option B. The first one is to set next-hop-self on the ASBR. Any PE in the local AS will then have the local ASBR as the next-hop, for which it will have a transport label through IGP + LDP. There will then be one LSP from the local PE to the local ASBR, one from the local ASBR to the remote ASBR, and one from the remote ASBR to the remote PE.
Another option is to have the next-hop remain unchanged when the local ASBR passes VPNv4 updates received from the remote ASBR on to its own AS. When the local PE receives the update, the next-hop will be the remote ASBR. This means that the link connecting the ASBRs should be redistributed into the IGP. LDP can then generate a label for it once it’s in the IGP. BGP will automatically install a /32 connected route for the eBGP peer on the ASBR when using labelled BGP. When this method is used, there are only two LSPs: one from the local PE to the remote ASBR and one from the remote ASBR to the remote PE.
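The difference between the two methods can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the router names, the `lsp_segments` function and the label numbers are all hypothetical, transport labels and PHP are ignored, and a router is assumed to allocate a new VPN label exactly when it sets the next-hop to itself.

```python
from itertools import count


def lsp_segments(path, next_hop_self, first_label=100):
    """Return the LSP segments for one VPNv4 route, in forwarding order.

    path:          routers in advertisement order, originating PE first.
    next_hop_self: routers that set the next-hop to themselves when they
                   re-advertise the route (and so allocate a new VPN label).
    Each segment is a tuple (ingress, egress, vpn_label).
    """
    labels = count(first_label)  # hypothetical label allocator
    next_hop, vpn_label = path[0], next(labels)
    segments = []
    for router in path[1:]:
        is_ingress_pe = router == path[-1]
        if is_ingress_pe or router in next_hop_self:
            # This router forwards towards the current next-hop: one LSP.
            segments.append((router, next_hop, vpn_label))
        if router in next_hop_self and not is_ingress_pe:
            next_hop, vpn_label = router, next(labels)
    return segments[::-1]


PATH = ["PE2", "ASBR2", "ASBR1", "PE1"]  # the route originates at PE2
cases = [
    ("next-hop-self on both ASBRs", {"ASBR1", "ASBR2"}),
    ("next-hop unchanged on ASBR1", {"ASBR2"}),
]
for name, rewriters in cases:
    segs = lsp_segments(PATH, rewriters)
    print(f"{name}: {len(segs)} LSPs")
    for ingress, egress, label in segs:
        print(f"  {ingress} -> {egress} (VPN label {label})")
```

In the second case ASBR2 still appears in `next_hop_self` because it advertises the route over eBGP, where the next-hop is set to the sender by default.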
As always, there are design considerations depending on whether the next-hop is kept unchanged or not. Every time the BGP next-hop is changed, a new VPN label is generated. In some scenarios with multiple ASBRs between the SPs, better load sharing can be achieved when using next-hop-self, because the local ASBR may only have sent a single best path into the local AS while being aware of multiple paths itself. If the local ASBR sets the next-hop to itself and uses multipath, it can choose between multiple paths to the remote ASBRs, achieving better load sharing. If the next-hop is not set to the local ASBR, the local PE will have a remote ASBR as its next-hop and be unaware that there are multiple paths in the remote AS. This could be worked around by using Add Path on the ASBR to have it send multiple paths. Setting the next-hop to self comes with other design considerations though, as I mentioned in the BGP convergence blog.
The final option, which may be used for load sharing, is to have several interfaces between the ASBRs and run eBGP multihop between them. This solution comes with a lot of caveats though. MPLS BGP forwarding is only supported on directly connected interfaces. To work around this, either LDP needs to be enabled or static label bindings need to be used, which makes the solution a lot more complex. There are also a lot of caveats depending on whether the ASBR interface is multiaccess or point-to-point. MPLS BGP forwarding must still be enabled on the interface even if static labels are used; otherwise the interface will not accept incoming labelled packets. Running LDP may be acceptable if the two ASBRs belong to the same organization but to different ASes.
With Option B (and C) there is a need to coordinate the RT values used for the customer(s). If the values have not been coordinated, routes may not be imported into the customer VRF or, in the worst case, they get imported into the wrong VRF. It is also possible to rewrite the RT values at the edge to work around this.
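A small Python sketch shows why uncoordinated RT values are dangerous and what a rewrite at the edge achieves. The VRF names, RT values and both helper functions are made up for illustration; real platforms implement the rewrite with their own policy constructs.

```python
def importing_vrfs(route_rts, vrf_import_rts):
    """Return the names of the VRFs whose import RTs match any RT on the route."""
    return sorted(
        name for name, imports in vrf_import_rts.items() if imports & route_rts
    )


def rewrite_rts(route_rts, rt_map):
    """Translate RTs at the AS edge; RTs without a mapping pass through unchanged."""
    return {rt_map.get(rt, rt) for rt in route_rts}


# Hypothetical values: SP1 tags customer A with 65001:100, while SP2 imports
# 65002:10 for customer A and happens to import 65001:100 for customer B.
sp2_vrfs = {"CUST_A": {"65002:10"}, "CUST_B": {"65001:100"}}
route_from_sp1 = {"65001:100"}

print(importing_vrfs(route_from_sp1, sp2_vrfs))  # ['CUST_B'], the wrong VRF

fixed = rewrite_rts(route_from_sp1, {"65001:100": "65002:10"})
print(importing_vrfs(fixed, sp2_vrfs))           # ['CUST_A']
```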
I briefly mentioned that BGP PIC Edge is not supported when using Option B but, as always, it depends… The traditional method to achieve load sharing and fast convergence in MPLS VPNs is to use a separate RD per VRF per PE. This creates a problem for BGP PIC Edge though: since the RD values are different, the routes are different and can therefore not be backups for each other. A workaround is to not use unique RD values and rely on Add Path instead.
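The RD problem is easy to see if the VPNv4 table is modelled as a dictionary keyed on (RD, prefix). The sketch below is a simplification with hypothetical RD values, prefixes and PE names; it assumes that both advertisements actually reach the router in question, which in the shared-RD case is what Add Path is for.

```python
from collections import defaultdict


def paths_per_nlri(advertisements):
    """Group next-hops by VPNv4 NLRI, i.e. by the (RD, IPv4 prefix) pair."""
    table = defaultdict(list)
    for rd, prefix, next_hop in advertisements:
        table[(rd, prefix)].append(next_hop)
    return dict(table)


# The same customer prefix advertised by two PEs (hypothetical values).
unique_rd = [("65001:1", "10.1.1.0/24", "PE1"), ("65001:2", "10.1.1.0/24", "PE2")]
shared_rd = [("65001:1", "10.1.1.0/24", "PE1"), ("65001:1", "10.1.1.0/24", "PE2")]

for name, adverts in (("unique RD", unique_rd), ("shared RD", shared_rd)):
    for nlri, next_hops in paths_per_nlri(adverts).items():
        backup = "backup available" if len(next_hops) > 1 else "no backup"
        print(name, nlri, next_hops, backup)
```

With unique RDs the router sees two unrelated routes with one path each; with a shared RD it sees one route with two paths, so one can back up the other.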
Inter-AS Option C
Inter-AS Option C is when VPNv4 updates are sent either between PEs or, more likely, between RRs in the different ASes. It is the most scalable solution but also the least secure. It has the following characteristics.
- End to end LSP
- Most scalable
- Labelled BGP between ASBRs or IGP + LDP
- VPNv4 between RRs (or PEs), not between ASBRs
- Must leak PE loopbacks between AS
- Requires the most trust and is the least secure
When an eBGP VPNv4 peering is enabled between the RRs, the next-hop must remain unchanged so that the RRs are not inserted into the packet flow. The ASBRs will run labelled BGP and leak PE loopbacks from the other AS so that a label to the next-hop can be found. The LSP will then be end to end and not stitched, as was the case with Option B. The PE loopbacks can either be redistributed into the IGP after being received from BGP, or they can be sent as labelled BGP routes if all the PEs have this address family (AFI) enabled. The latter option will have a deeper label stack than redistributing into the IGP though.
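The difference in label stack depth can be written down as a tiny Python sketch. The function and the mode names are hypothetical, PHP is ignored, and for the labelled BGP case I assume that the local ASBR is the BGP next-hop for the remote PE loopbacks.

```python
def ingress_label_stack(loopback_distribution):
    """Labels pushed by the ingress PE in Option C, outermost label first.

    loopback_distribution: "igp" if the remote PE loopbacks are redistributed
    into the IGP, "labelled-bgp" if they are carried to the PEs in labelled BGP.
    """
    if loopback_distribution == "igp":
        return ["LDP label for remote PE loopback", "VPN label"]
    if loopback_distribution == "labelled-bgp":
        return [
            "LDP label for BGP next-hop (local ASBR)",
            "BGP label for remote PE loopback",
            "VPN label",
        ]
    raise ValueError(f"unknown mode: {loopback_distribution}")


for mode in ("igp", "labelled-bgp"):
    stack = ingress_label_stack(mode)
    print(f"{mode}: {len(stack)} labels -> {stack}")
```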
It’s not likely that Option C will be deployed between two different SPs because of the level of trust required between them.
Inter-AS Option AB
There is also an Option AB, which combines the characteristics of Option A and Option B. It is also sometimes referred to as Option D. Option AB uses VRFs on the ASBR but it uses a single eBGP VPNv4 session to exchange the routes. Option AB sends unlabelled packets between the ASBRs, which lets it retain the positive attributes of Option A, such as per-VRF policies and QoS markings. It does however use a single BGP session, as in Option B, for better scalability. The VRFs on the ASBR need to be enabled for Option AB, as does the peer under the eBGP VPNv4 session on the ASBR. This means that the ASBR needs one logical interface per VRF between the ASBRs and one global interface for the eBGP VPNv4 session. Another positive aspect of Option AB is that the ASBR acts as a PE in that it can import a certain RT value and export with another RT value. In Option B, there was no local VRF configured, meaning that the RT values of the other AS would get carried in the update.
I will not dive into the details of Option AB as I would consider that out of scope for the CCDE. There is some next-hop trickery involved to be able to send VPNv4 updates but use unlabelled traffic between the ASBRs.
Since Option AB is basically the best of Option A and Option B, it makes sense to use it to connect two SPs. If the two networks belong to the same organization, Option C would be a viable option. It would also be reasonable to deploy Option A if there were only a few VRFs involved.
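This rule of thumb can be captured in a few lines of Python. It is a sketch of the reasoning in this post and nothing more: the function name is hypothetical, and the threshold for what counts as "a few VRFs" is a number I made up, so adjust it to your own platform and operations.

```python
def suggest_inter_as_option(same_organization, num_vrfs, few_vrfs=10):
    """Suggest an Inter-AS option following the rule of thumb in this post.

    few_vrfs is an arbitrary, hypothetical threshold for "only a few VRFs".
    """
    if same_organization:
        return "C"    # enough trust for an end-to-end LSP
    if num_vrfs <= few_vrfs:
        return "A"    # simple and secure, scaling is not yet a concern
    return "AB"       # per-VRF control with a single eBGP VPNv4 session


print(suggest_inter_as_option(same_organization=False, num_vrfs=3))
print(suggest_inter_as_option(same_organization=False, num_vrfs=300))
print(suggest_inter_as_option(same_organization=True, num_vrfs=300))
```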