This post looks at the pros and cons of BGP route reflection compared to running
an iBGP full mesh.
Because routes learned from an iBGP peer are not advertised to other iBGP peers,
there must be a full mesh of iBGP sessions inside the AS. This leads to scalability
issues. With N routers, each router has (N-1) iBGP neighbors and there are
(N*(N-1))/2 iBGP sessions in total. For a medium-sized ISP network with 100 routers
running BGP, this means 99 iBGP neighbors per router and 4950 iBGP sessions in total.
There are 4 routers in AS 2 which gives 3 iBGP neighbors and 6 iBGP sessions in total.
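The numbers above are easy to check with a few lines of Python. This is only a
sketch of the arithmetic; the function name full_mesh_counts is made up for
illustration.

```python
def full_mesh_counts(n: int) -> tuple[int, int]:
    """Return (neighbors per router, total sessions) for an iBGP full mesh of n routers."""
    if n < 1:
        raise ValueError("need at least one router")
    neighbors = n - 1            # every router peers with all the others
    sessions = n * (n - 1) // 2  # each session is shared by two routers
    return neighbors, sessions


for n in (4, 100, 300):
    neighbors, sessions = full_mesh_counts(n)
    print(f"{n} routers: {neighbors} neighbors each, {sessions} sessions in total")
```

The session count grows with the square of the number of routers, which is why
the full mesh stops being practical long before the network gets really large.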
Benefits of a full mesh:
- Optimal Traffic Forwarding
- Path Diversity
Optimal Traffic Forwarding:
Because all BGP-speaking routers are fully meshed, they receive iBGP updates
from all peers. If no attributes have been manipulated, the tiebreaker
will be the IGP metric to the next hop, so traffic will take the optimal path.
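To illustrate the tiebreaker, here is a heavily simplified sketch of best path
selection in Python. It only models local preference, AS path length and the IGP
metric to the next hop; the real decision process has more steps. The router
names and metric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str          # hypothetical name of the exit router
    local_pref: int = 100  # default value, i.e. no attribute manipulation
    as_path_len: int = 1


def best_path(paths, igp_metric):
    """Simplified selection: highest local preference, then shortest AS path,
    then lowest IGP metric to the next hop."""
    return min(
        paths,
        key=lambda p: (-p.local_pref, p.as_path_len, igp_metric[p.next_hop]),
    )


# Three exits for the same prefix, all attributes equal.
paths = [Path("R1"), Path("R2"), Path("R3")]
# IGP metrics as seen from the router making the decision (assumed values).
igp_from_r4 = {"R1": 30, "R2": 10, "R3": 20}

print(best_path(paths, igp_from_r4).next_hop)  # R2, the closest exit
```

With all other attributes equal, each router in the full mesh ends up using the
exit that is closest to itself.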
Path Diversity:
Due to the full mesh, a BGP-speaking router has multiple paths to choose
from. If it were connected to an RR, it would generally have only one path, the one
the RR decided was the best.
Because the BGP-speaking router has multiple paths, it can start using one of the
alternate paths if the current best one fails. BGP UPDATE messages are also sent
directly between the iBGP peers instead of passing through an additional router (RR).
The RR would have to process each update, and the packets would have to travel an
additional distance unless the RR is located in the same PoP as the routers.
If one BGP-speaking router fails, only the networks behind that router become
unreachable. If an RR fails, all networks that were reachable via
clients of that RR become unreachable.
Caveats of a full mesh:
- Lack of Scalability
- Management Overhead
- Duplication of Information
Lack of Scalability:
Having hundreds of BGP sessions on every router means a lot of BGP processing.
The number of incoming BGP updates would be massive.
This would put a great burden on the CPU/RP of the router. For really large networks
this could potentially be more than the router can handle. In a network with 300 routers
there would be 44850 iBGP sessions. The RIB-in would also be very large because of the
large number of peers.
Management Overhead:
Adding a new device to the network means reconfiguring all the existing devices.
Configurations would be very big considering all the lines needed to set up the
iBGP neighbors.
Duplication of Information:
For every external network there could potentially be multiple internal paths,
which uses a lot of RIB/FIB space on the devices. It does not make much sense
to install all paths into the RIB/FIB.
Benefits of Route Reflection:
- Reduced Operational Cost
- Reduced RIB-in Size
- Reduced Number of BGP Updates
- Incremental Deployability
The number of iBGP sessions needed is greatly reduced. A client only needs one session,
or preferably two for route reflector redundancy. A route reflection design needs
(K*(K-1))/2 + C sessions in total, where K is the number of route reflectors and C is
the number of clients, each peering with a single RR. The route reflectors still need
to be in full mesh with each other.
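The difference in session count can be sketched in Python. The sketch reads the
(K*(K-1))/2 + C figure as the total number of sessions when every client peers
with one RR. The rrs_per_client parameter is an addition of this sketch to cover
the redundant case where each client peers with two RRs, and the split of 100
routers into 2 RRs and 98 clients is an assumed example.

```python
def full_mesh_sessions(n: int) -> int:
    """Total iBGP sessions in a full mesh of n routers."""
    return n * (n - 1) // 2


def rr_sessions(k: int, c: int, rrs_per_client: int = 1) -> int:
    """Total iBGP sessions with k fully meshed RRs and c clients.

    rrs_per_client is an assumption of this sketch: 1 matches the
    K*(K-1)/2 + C formula, 2 models clients peering with two RRs.
    """
    return k * (k - 1) // 2 + c * rrs_per_client


print(full_mesh_sessions(100))                 # 4950
print(rr_sessions(k=2, c=98))                  # 99
print(rr_sessions(k=2, c=98, rrs_per_client=2))  # 197
```

Even with every client dual-homed to both RRs, the total stays far below the
4950 sessions of the full mesh.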
Reduced Operational Cost:
With a full mesh, adding a new device requires reconfiguring all the existing
devices. This requires operator intervention, which is an added cost. With route reflection,
only the new device and the RR it peers with need new configuration.
Reduced RIB-in Size:
RIB-in contains the unprocessed BGP information. After processing this information
the best paths are installed into the Loc-RIB. The RIB-in grows proportionally with
the number of neighbors that the router peers with. If there are n neighbors and p prefixes
then the router would have a RIB-in of size n * p. In a full mesh n is very high
but with route reflection n is only the number of RRs that the router peers with.
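A quick Python sketch of the n * p estimate follows. It treats n * p as an upper
bound that assumes every neighbor advertises a path for every prefix. The prefix
count of 500,000 is a hypothetical figure chosen for the example.

```python
def rib_in_entries(neighbors: int, prefixes: int) -> int:
    """Upper bound on RIB-in entries: one path per prefix from every neighbor."""
    return neighbors * prefixes


PREFIXES = 500_000  # hypothetical number of prefixes

# Full mesh of 100 routers: 99 neighbors per router.
print(rib_in_entries(99, PREFIXES))  # 49500000
# Route reflection: the client peers with two RRs only.
print(rib_in_entries(2, PREFIXES))   # 1000000
```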
Reduced Number of BGP Updates:
In a full mesh a router receives updates from N - 1 peers, where N is the number
of routers. This is a large number of updates. With route reflection the router only
receives updates from the route reflectors it peers with.
Incremental Deployability:
Route reflection does not require massive changes in the existing network the way
confederations do. It can be deployed incrementally and routers can be migrated to the
RR topology gradually. Not all routers need to be moved at once.
Caveats of Route Reflection:
- Prolonged Routing Convergence
- Potential Loops
- Reduced Path Diversity
- Suboptimal Routes
With a full mesh, a single router failure only impacts the networks behind
that router. If a route reflector fails, it affects all the networks that were
behind all of the route reflector's clients. To avoid single points of failure,
RRs are usually deployed in pairs.
Prolonged Routing Convergence:
In a full mesh every BGP update only travels a single hop. With route reflection
the number of hops is increased, and if the route reflectors are set up in a
hierarchical topology the update could travel through several RRs. Every RR
will add some processing delay and propagation delay before the update reaches
its destination.
Potential Loops:
In a topology where clients are connected to a single RR there should be no
data plane loops. When clients are connected to two RRs there is a risk
of a loop forming if the control plane topology does not match the physical
topology. Because of that it is important to try to match the two topologies.
Reduced Path Diversity:
In a full mesh, if there are multiple paths to an external network, all
paths are announced and the local router decides which one is the best.
With route reflection the RR decides which path is the best and announces
only that path. This leads to fewer paths being announced, which could lead
to longer convergence delays.
There are drafts for announcing more than one best path, which would help
with this issue. Some newer IOS releases support this feature.
Suboptimal Routes:
The RR will select a best path based on its own local routing information.
This could lead to routers using suboptimal paths because there may be
a shorter path available from a router's perspective, but this is not the
path that the RR has chosen. Therefore it’s important to consider where
the RRs are placed.
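The effect of the RR's point of view can be sketched in Python. The sketch
assumes all BGP attributes are equal, so that only the IGP metric to the exit
router decides. The router names PE1 and PE2 and the metric values are
hypothetical.

```python
def closest_exit(exits, igp_metric):
    """Pick the exit with the lowest IGP metric, all other attributes being equal."""
    return min(exits, key=lambda e: igp_metric[e])


exits = ["PE1", "PE2"]                    # two exits for the same prefix
igp_from_rr = {"PE1": 10, "PE2": 50}      # metrics as seen from the RR
igp_from_client = {"PE1": 40, "PE2": 5}   # metrics as seen from a client

# Full mesh: the client sees both paths and picks its own closest exit.
full_mesh_choice = closest_exit(exits, igp_from_client)

# Route reflection: the RR picks its best path and reflects only that one.
reflected = [closest_exit(exits, igp_from_rr)]
rr_client_choice = closest_exit(reflected, igp_from_client)

print(full_mesh_choice)   # PE2
print(rr_client_choice)   # PE1
```

The client ends up using PE1 at a metric of 40 although PE2 is only 5 away,
because PE2 was never announced to it. Placing the RR so that its view of the
IGP resembles that of its clients reduces this effect.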
This post took a look at the benefits and caveats of a fully meshed iBGP network
vs route reflection. Although scalability makes it almost impossible not to
go with route reflection, one should still consider the caveats of route reflection.
It’s important to consider the placement and the number of RRs in the topology.
This post is the first in a series of posts that will focus on CCDE topics.
6 thoughts on “iBGP – Fully meshed vs Route Reflection”
As always an educational post, a reflection. In scenarios where you have a diverse setup with different routers, such as the Catalyst 6500 (pre Sup-2T), the 6500 doesn’t support VPLS, and in that scenario the 6500 is not a good choice of RR. Something to take into account.
Glad that you are reading here. From what I’ve heard, the 7200 was a very commonly used RR platform earlier. It was the Swiss army knife of Cisco and supported most stuff, although in CPU.
The replacement these days seems to be ASR1k or a higher end ASR depending on how many routes you need.
The 6500/7600 has always been a good platform but it’s been really messy with which cards support which features regarding MPLS, QoS etc. For the 6500 SUP2T has definitely improved on a lot of these caveats.
Nice article. Maybe after reading this, it’s time to write about route reflectors in MPLS-VPN networks. Hope you can come up with a good article on it 🙂
Thanks in advance!
You have also mentioned Confederations. Can you include this beast in the big picture as well? Fully Meshed vs Route Reflection vs Confederations. Is it worth using Confederations for an ISP with 100 routers?
Will there be any potential loops in the data plane even though control plane loops are eliminated by the RR loop prevention mechanism of two optional non-transitive attributes (originator_id & cluster_list)?
Please clarify and thanks in advance.
There can absolutely be loops in the forwarding plane. Read this post from Ivan for an example: http://blog.ipspace.net/2013/10/can-bgp-route-reflectors-really.html
The key is that the logical topology should follow the physical topology to minimize the risk of loops.