Intro

Brian McGahan from INE introduced me to something interesting the other day:
the BGP wedgie. I had never heard the term before, although I had heard
of such things occurring. A BGP wedgie is when a BGP configuration can lead
to different end states depending on the order in which routes are sent. There is
actually an RFC for this: RFC 4264.

Peering relationships

To understand this RFC you need to have some knowledge of BGP and the different
kinds of peering relationships between service providers and customers.

Service providers are usually described as Tier 1 or Tier 2. A Tier 1 service provider
is one that does not need to buy transit. They have private peerings with other service
providers to reach all networks in the Default Free Zone (DFZ). That is the
theory, although in the real world it is difficult to tell who is Tier 1 and who is not.

Tier 2 service providers don’t have private peerings to reach all the networks so they
must buy transit from one or more Tier 1 service providers. This is a paid
service.

Service providers have different preferences for incoming routes. The most
preferred routes are those coming from customers. After that, it is preferred
to send traffic over private peerings, since in theory this should be cheaper than
transit. The least preferred option is to send traffic towards your transit.
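
To make the ordering concrete, here is a minimal Python sketch of how a provider might rank routes by the relationship they were learned over. The value 150 for customer routes matches the route-map used later in this post and 100 is the BGP default. The value 80 for transit, the AS numbers and the helper name best_route are assumptions for illustration only.

```python
# Illustrative local-preference values per relationship.
# 150 (customer) and 100 (default) match the lab in this post; 80 is assumed.
LOCAL_PREF = {"customer": 150, "peer": 100, "transit": 80}


def best_route(routes):
    """Pick the highest local preference, then the shortest AS path."""
    return max(
        routes,
        key=lambda r: (LOCAL_PREF[r["learned_from"]], -len(r["as_path"])),
    )


# Hypothetical routes to the same prefix, learned over different relationships.
routes = [
    {"learned_from": "transit", "as_path": [10, 1]},
    {"learned_from": "peer", "as_path": [20, 1]},
    {"learned_from": "customer", "as_path": [30, 40, 1]},
]

chosen = best_route(routes)
print(chosen["learned_from"])  # customer
```

Note that the customer route wins even though its AS path is the longest, because local preference is compared before AS path length.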

Why is my policy not working?

Assume that you are a customer buying capacity from two service providers.
You want to use one service provider as primary and one as secondary.
This is usually done by sending a community towards your secondary provider,
which then sets a lower local preference on your routes. Keep in mind that
providers will still have their best economic result in mind though. Take a
look at the following diagram.

Wedgie1

We will be configuring AS1. We want the network 1.1.1.0/24 to be reached via
AS4 as primary and via AS2 as secondary. We will use communities to achieve
this. We set up the primary path first.

This is the configuration of AS1 so far:

router bgp 1
 no synchronization
 bgp log-neighbor-changes
 neighbor 12.12.12.2 remote-as 2
 neighbor 12.12.12.2 description backup
 neighbor 12.12.12.2 shutdown
 neighbor 12.12.12.2 send-community
 neighbor 12.12.12.2 route-map set-backup out
 neighbor 14.14.14.4 remote-as 4
 neighbor 14.14.14.4 description primary
 no auto-summary
!
ip bgp-community new-format
!
route-map set-backup permit 10
 set community 2:50

The backup will be turned up later.
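
The AS2 side of this policy is not shown in this post. As a rough sketch of what it does, assuming AS2 maps the community 2:50 to local preference 50 (which matches the LocPrf seen in the AS2 output) and leaves everything else at the default of 100, the logic looks something like this in Python. The table and function names are hypothetical, and letting the lowest matching value win is my own choice.

```python
DEFAULT_LOCAL_PREF = 100  # BGP default when no policy sets a value

# Hypothetical provider-side policy table: community -> local preference.
COMMUNITY_LOCAL_PREF = {"2:50": 50}


def import_local_pref(communities):
    """Return the local preference assigned to a received route."""
    matches = [
        COMMUNITY_LOCAL_PREF[c] for c in communities if c in COMMUNITY_LOCAL_PREF
    ]
    # Assumption: if several known communities are attached, the lowest wins.
    return min(matches) if matches else DEFAULT_LOCAL_PREF


backup_pref = import_local_pref(["2:50"])
plain_pref = import_local_pref([])
print(backup_pref, plain_pref)  # 50 100
```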

From the perspective of AS2 we now have the correct path.

AS2#sh bgp ipv4 uni   
BGP table version is 2, local router ID is 23.23.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       23.23.23.3                             0 3 4 1 i
AS2#traceroute 1.1.1.1

Type escape sequence to abort.
Tracing the route to 1.1.1.1

  1 AS3 (23.23.23.3) 80 msec 36 msec 20 msec
  2 AS4 (34.34.34.4) 64 msec 56 msec 48 msec
  3 AS1 (14.14.14.1) 84 msec *  68 msec

Now the backup service is turned up.

AS1(config-router)#no nei 12.12.12.2 shut
AS1(config-router)#
%BGP-5-ADJCHANGE: neighbor 12.12.12.2 Up

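AS2 still prefers the correct path due to local preference. The backup path
has a local preference of 50, which is lower than the default of 100 on the path via AS3.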

AS2#sh bgp ipv4 uni   
BGP table version is 2, local router ID is 23.23.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.0/24       12.12.12.1               0     50      0 1 i
*>                  23.23.23.3                             0 3 4 1 i
AS2#traceroute 1.1.1.1

Type escape sequence to abort.
Tracing the route to 1.1.1.1

  1 AS3 (23.23.23.3) 84 msec 44 msec 20 msec
  2 AS4 (34.34.34.4) 56 msec 60 msec 44 msec
  3 AS1 (14.14.14.1) 100 msec *  100 msec

AS3 and AS4 have the following route-map to increase local pref for customer
routes.

AS3#sh route-map
route-map customer, permit, sequence 10
  Match clauses:
  Set clauses:
    local-preference 150
  Policy routing matches: 0 packets, 0 bytes

Now what happens if there is a failure between AS1 and AS4?
AS2 now only has one path available.

AS2#sh bgp ipv4 uni
BGP table version is 3, local router ID is 23.23.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       12.12.12.1               0     50      0 1 i

This is advertised to AS3, which sets local preference to 150.

AS3#sh bgp ipv4 uni
BGP table version is 4, local router ID is 34.34.34.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       23.23.23.2                    150      0 2 1 i

Now the primary circuit comes back. AS3 will prefer to go via AS2 because
that is a customer route.

AS3#sh bgp ipv4 uni
BGP table version is 4, local router ID is 34.34.34.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.0/24       34.34.34.4                             0 4 1 i
*>                  23.23.23.2                    150      0 2 1 i

We now have a BGP wedgie. The same BGP configuration has generated two
different outcomes depending on the order in which the routes were announced.
The only way of breaking the wedgie is now to stop announcing the backup, let
the network converge and then bring up the backup again. After doing that, AS2
has the correct path again.

AS2#sh bgp ipv4 uni
BGP table version is 5, local router ID is 23.23.23.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.1.1.0/24       12.12.12.1               0     50      0 1 i
*>                  23.23.23.3                             0 3 4 1 i

To describe what is actually happening, take a look at this diagram.

Wedgie2

The numbers describe the order in which the UPDATEs are sent. AS2 has two paths, but
the one directly to AS1 has a local pref of 50 due to AS1 using it as a backup,
so the best path for AS2 is the one via AS3. This means that AS2 does not send
the direct path to AS3, so AS3 has to use the path via AS4. This is the key.
Now what happens when the circuit between AS1 and AS4 fails?

Wedgie3

The key here is step 3, where AS2 sends its only current path to AS3. AS3 will then
set local preference to 150 because this is a customer route. Then the primary
circuit comes back.

Wedgie4

AS1 announces the network to AS4, and AS4 announces it to AS3. AS3 does NOT
advertise this path to AS2, because its best path is still the one via AS2 where
the local preference is 150, and BGP only advertises the best path. AS2 never
learns the path via AS3 and AS4, which means that the network cannot converge
to the primary path until the backup path has been removed.
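
The whole sequence can be replayed with a small Python sketch. This is a toy model and not a real BGP implementation: best path selection only looks at local preference and AS path length, and the local preference on the AS4 side of the AS3 to AS4 session is assumed to be the default, since the lab output does not show it. All names here (LOCAL_PREF, converge, replay and so on) are made up for the example.

```python
ORIGIN = 1  # AS1 originates 1.1.1.0/24

# (receiver, sender) -> local preference applied when the route is received.
LOCAL_PREF = {
    (2, 1): 50,   # backup community 2:50 from AS1
    (2, 3): 100,  # default
    (3, 2): 150,  # AS3 sees AS2 as a customer
    (3, 4): 100,  # default
    (4, 1): 150,  # AS4 sees AS1 as a customer
    (4, 3): 100,  # default (assumed, not shown in the lab output)
}

PRIMARY = frozenset((1, 4))
BACKUP = frozenset((1, 2))
CORE = {frozenset((2, 3)), frozenset((3, 4))}  # links that never go down


def best_path(asn, rib_in):
    """Return (neighbor, as_path) of the best route, or (None, None)."""
    if asn == ORIGIN:
        return None, ()  # locally originated, empty AS path
    routes = rib_in[asn]
    if not routes:
        return None, None
    nbr = max(routes, key=lambda n: (LOCAL_PREF[(asn, n)], -len(routes[n])))
    return nbr, routes[nbr]


def converge(links, rib_in):
    """Exchange UPDATEs over the active links until nothing changes."""
    # Routes learned over a link that is now down are withdrawn.
    for asn, routes in rib_in.items():
        for nbr in list(routes):
            if frozenset((asn, nbr)) not in links:
                del routes[nbr]
    changed = True
    while changed:
        changed = False
        for asn in rib_in:
            learned_from, path = best_path(asn, rib_in)
            for link in links:
                if asn not in link:
                    continue
                (nbr,) = link - {asn}
                advert = None
                # Only the best path is advertised, never back to the neighbor
                # it came from and never to an AS already in the path.
                if path is not None and nbr != learned_from and nbr not in path:
                    advert = (asn,) + path
                if rib_in[nbr].get(asn) != advert:
                    changed = True
                    if advert is None:
                        del rib_in[nbr][asn]
                    else:
                        rib_in[nbr][asn] = advert


def replay(link_states):
    """Apply a sequence of link states and return the final AS path at AS2."""
    rib_in = {asn: {} for asn in (1, 2, 3, 4)}
    for links in link_states:
        converge(links, rib_in)
    return best_path(2, rib_in)[1]


both = CORE | {PRIMARY, BACKUP}
primary_first = replay([CORE | {PRIMARY}, both])
backup_first = replay([CORE | {BACKUP}, both])
recovered = replay([CORE | {BACKUP}, both, CORE | {PRIMARY}, both])

print(primary_first)  # (3, 4, 1)
print(backup_first)   # (1,)
print(recovered)      # (3, 4, 1)
```

The final link state is identical in the first two runs, but AS2 ends up on the primary path in one and on the backup path in the other. The third run is the fix described earlier: drop the backup, let things converge, then bring the backup up again.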

Conclusion

BGP is a path vector protocol and sometimes the same configuration can
give different outcomes depending on the order in which updates are sent. Keep
this in mind when setting up BGP and try to learn as much as possible about
the peerings of your service providers.

BGP wedgies – Why isn’t my routing policy having effect?