I’m preparing a massive blog post on vPC in the context of VXLAN/EVPN, and while doing so I accidentally broke my lab. What a great learning experience! I thought I would share what happened and how to troubleshoot this scenario.

My topology looks like this:

Before I made any changes, there was full connectivity between these hosts, meaning that both bridging and routing were working. I then changed the loopback1 (NVE source interface) configuration of Leaf-1 and Leaf-2 to add a secondary IP. This was the initial configuration:

! Leaf-1
interface loopback1
  description VTEP
  ip address 203.0.113.1/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
! Leaf-2
interface loopback1
  description VTEP
  ip address 203.0.113.2/32
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

This then changed to:

! Leaf-1
interface loopback1
  description VTEP
  ip address 203.0.113.1/32
  ip address 203.0.113.12/32 secondary
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode
! Leaf-2
interface loopback1
  description VTEP
  ip address 203.0.113.2/32
  ip address 203.0.113.12/32 secondary
  ip router ospf UNDERLAY area 0.0.0.0
  ip pim sparse-mode

Note that they both use 203.0.113.12 as their secondary IP. This is a concept called Anycast VTEP, which I will explain further in the upcoming vPC post. After implementing this, Leaf-1 and Leaf-2 started advertising BGP updates using this IP as the next hop. In addition, they also added a Site of Origin (SoO) attribute for loop prevention. Below is an example of Leaf-1 advertising 0050.56ad.8506 as seen by Leaf-4:

Leaf4# show bgp l2vpn evpn 0050.56ad.8506
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.3:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[0]:[0.0.0.0]/216, version 8771
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 1 destination(s)
             Imported paths list: L2-10000
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2
      Originator: 192.0.2.3 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[32]:[198.51.100.11]/272, version 8769
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2 Router MAC:00ad.e688.1b08
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 3 destination(s)
             Imported paths list: Tenant1 L3-10001 L2-10000
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2 Router MAC:00ad.e688.1b08
      Originator: 192.0.2.3 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

Now, as Leaf-1 and Leaf-2 use the same Anycast VTEP, they are going to ignore updates from each other based on the SOO extended community. This can be seen in BGP event logs:

[M 27] [bgp] E_DEBUG [bgp_af_process_nlri:7438] (default) PFX: [L2VPN EVPN] Dropping prefix [3]:[0]:[32]:[203.0.113.12]/88 from peer 192.0.2.12, due to attribute error
[M 27] [bgp] E_DEBUG [bgp_af_process_nlri:7438] (default) PFX: [L2VPN EVPN] Dropping prefix [2]:[0]:[0]:[48]:[0050.56ad.8506]:[0]:[0.0.0.0]/112 from peer 192.0.2.12, due to attribute error
[M 27] [bgp] E_DEBUG [bgp_af_process_nlri:7438] (default) PFX: [L2VPN EVPN] Dropping prefix [2]:[0]:[0]:[48]:[0050.56ad.b4a4]:[0]:[0.0.0.0]/112 from peer 192.0.2.12, due to attribute error
[M 27] [bgp] E_DEBUG [bgp_af_process_nlri:7438] (default) PFX: [L2VPN EVPN] Dropping prefix [2]:[0]:[0]:[48]:[0050.56ad.8506]:[32]:[198.51.100.11]/144 from peer 192.0.2.12, due to attribute error
[M 27] [bgp] E_DEBUG [bgp_af_process_nlri:7438] (default) PFX: [L2VPN EVPN] Dropping prefix [5]:[0]:[0]:[24]:[198.51.100.0]/88 from peer 192.0.2.12, due to attribute error
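
To make the loop-prevention behaviour a bit more tangible, here is a minimal Python sketch of the kind of check that produces those drops. It is a toy model and not NX-OS code: the `EvpnRoute` class and the `accept` function are names I made up, and I'm assuming the local SoO is simply the Anycast VTEP value seen in the outputs (SOO:203.0.113.12:0).

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the local SoO matches the value seen in the BGP outputs,
# which is built from the shared Anycast VTEP IP.
LOCAL_SOO = "203.0.113.12:0"


@dataclass(frozen=True)
class EvpnRoute:
    """Hypothetical, heavily simplified view of a received EVPN route."""
    prefix: str
    soo: Optional[str]  # Site of Origin extended community, if present


def accept(route: EvpnRoute, local_soo: str = LOCAL_SOO) -> bool:
    """Reject any route that carries our own SoO (toy loop-prevention check)."""
    return route.soo != local_soo


# Route from the vPC peer: same SoO as ours, so it is dropped.
from_peer = EvpnRoute("[2]:[0]:[0]:[48]:[0050.56ad.8506]:[0]:[0.0.0.0]", "203.0.113.12:0")
# Route without a matching SoO: accepted.
from_elsewhere = EvpnRoute("[2]:[0]:[0]:[48]:[0000.0000.0001]:[0]:[0.0.0.0]", None)

print(accept(from_peer))       # False
print(accept(from_elsewhere))  # True
```

Because both leafs stamp the same SoO on everything they advertise, each one sees the other's routes as its own and discards them.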

As Leaf-1 and Leaf-2 ignore each other’s updates, can there be any connectivity between hosts connected to those leafs? For traffic between Server-2 (10.0.0.22) and Server-1 (198.51.100.11), let’s see if there is any route for Server-1 on Leaf-2:

Leaf2# show ip route vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

10.0.0.0/24, ubest/mbest: 1/0, attached
    *via 10.0.0.1, Vlan20, [0/0], 9w2d, direct
10.0.0.1/32, ubest/mbest: 1/0, attached
    *via 10.0.0.1, Vlan20, [0/0], 9w2d, local
10.0.0.22/32, ubest/mbest: 1/0, attached
    *via 10.0.0.22, Vlan20, [190/0], 4d00h, hmm
198.51.100.0/24, ubest/mbest: 1/0
    *via 203.0.113.4%default, [200/0], 01:03:12, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007104 encap: VXLAN
 
198.51.100.44/32, ubest/mbest: 1/0
    *via 203.0.113.4%default, [200/0], 01:03:12, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007104 encap: VXLAN
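
The lookup Leaf-2 does against this table can be approximated with a small longest-prefix-match sketch in Python. The prefixes are copied from the output above; the `lookup` helper and the next-hop labels are just my own illustration.

```python
import ipaddress

# Prefixes copied from the Leaf-2 Tenant1 table; values are informal labels.
ROUTES = {
    "10.0.0.0/24": "Vlan20 (direct)",
    "10.0.0.1/32": "Vlan20 (local)",
    "10.0.0.22/32": "Vlan20 (hmm)",
    "198.51.100.0/24": "203.0.113.4 (VXLAN)",
    "198.51.100.44/32": "203.0.113.4 (VXLAN)",
}


def lookup(dst, routes=ROUTES):
    """Return (prefix, next hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), routes[str(best)]


print(lookup("198.51.100.11"))  # ('198.51.100.0/24', '203.0.113.4 (VXLAN)')
print(lookup("198.51.100.44"))  # ('198.51.100.44/32', '203.0.113.4 (VXLAN)')
```

With no /32 for Server-1 in the table, the /24 is the best match, so the traffic heads for the VTEP at 203.0.113.4.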

There is no specific route for 198.51.100.11, but Leaf-2 does have a route for 198.51.100.0/24 via Leaf-4. Could we use that route to get to Server-1? Let’s try to ping Server-1 (198.51.100.11) from Server-2 (10.0.0.22) while also running tcpdump on Server-1:

sudo tcpdump -i ens160 icmp
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:23:08.605027 IP 10.0.0.22 > server1: ICMP echo request, id 24, seq 2, length 64
09:23:08.605186 IP server1 > 10.0.0.22: ICMP echo reply, id 24, seq 2, length 64

The packet is reaching Server-1 and it responds, but the reply never gets to Server-2. Let’s look at packet captures to see how far we get. First, the packet is sent from Server-2 to Leaf-2:

Frame 1: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens257, id 4
Ethernet II, Src: 00:50:56:ad:f4:8d, Dst: 00:01:00:01:00:01
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.11
Internet Control Message Protocol

Then the packet goes to Spine-1:

Frame 4: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens224, id 2
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:b3:fd:1b:08
Internet Protocol Version 4, Src: 203.0.113.12, Dst: 203.0.113.4
User Datagram Protocol, Src Port: 50959, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.11
Internet Control Message Protocol
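
As a quick sanity check on the frame sizes in these captures, the jump from 98 to 148 bytes is the VXLAN encapsulation overhead. Here is the arithmetic in Python, assuming an untagged outer Ethernet header and an outer IPv4 header without options (both match the dissection above):

```python
# Header sizes in bytes for VXLAN over IPv4.
OUTER_ETHERNET = 14  # untagged Ethernet II header
OUTER_IPV4 = 20      # IPv4 header without options
OUTER_UDP = 8
VXLAN_HEADER = 8

VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER

original_frame = 98  # frame size from Server-2 to Leaf-2
encapsulated_frame = original_frame + VXLAN_OVERHEAD

print(VXLAN_OVERHEAD)      # 50
print(encapsulated_frame)  # 148
```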

Then to Leaf-4:

Frame 2: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens161, id 0
Ethernet II, Src: 00:ad:b3:fd:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 203.0.113.12, Dst: 203.0.113.4
User Datagram Protocol, Src Port: 50959, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.11
Internet Control Message Protocol

From Leaf-4 towards the Anycast VTEP of 203.0.113.12 via Spine-1:

Frame 3: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens161, id 0
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:b3:fd:1b:08
Internet Protocol Version 4, Src: 203.0.113.4, Dst: 203.0.113.12
User Datagram Protocol, Src Port: 64786, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:e6:88:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.11
Internet Control Message Protocol

From Spine-1 towards Leaf-2:

Frame 5: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens224, id 2
Ethernet II, Src: 00:ad:b3:fd:1b:08, Dst: 00:ad:f3:bb:1b:08
Internet Protocol Version 4, Src: 203.0.113.4, Dst: 203.0.113.12
User Datagram Protocol, Src Port: 64786, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:e6:88:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.11
Internet Control Message Protocol

Notice that the outer Ethernet destination MAC is 00ad.f3bb.1b08, which belongs to Leaf-2, but the inner Ethernet destination MAC is 00ad.e688.1b08, which belongs to Leaf-1:

Leaf1# show nve interface 
Interface: nve1, State: Up, encapsulation: VXLAN
 VPC Capability: VPC-VIP-Only [notified]
 Local Router MAC: 00ad.e688.1b08
 Host Learning Mode: Control-Plane
 Source-Interface: loopback1 (primary: 203.0.113.1, secondary: 203.0.113.12)
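
Why the inner destination MAC matters can be sketched as a simple comparison on the receiving leaf. This is a simplified model and not how NX-OS is actually implemented; `l3vni_decap_action` is a name I made up, and the only inputs are the two Router MACs seen in this post.

```python
def normalize(mac):
    """Strip separators so 00ad.e688.1b08 and 00:ad:e6:88:1b:08 compare equal."""
    return mac.replace(":", "").replace(".", "").replace("-", "").lower()


def l3vni_decap_action(inner_dst_mac, local_router_mac):
    """Toy model: a leaf only routes frames addressed to its own Router MAC."""
    if normalize(inner_dst_mac) == normalize(local_router_mac):
        return "route"
    return "drop"


LEAF1_RMAC = "00ad.e688.1b08"
LEAF2_RMAC = "00ad.f3bb.1b08"

# Inner destination MAC from the capture towards Leaf-2.
inner_dst = "00:ad:e6:88:1b:08"

print(l3vni_decap_action(inner_dst, LEAF1_RMAC))  # route
print(l3vni_decap_action(inner_dst, LEAF2_RMAC))  # drop
```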

How did this happen? Since both Leaf-1 and Leaf-2 are sending their EVPN updates with the Anycast VTEP of 203.0.113.12 as next hop, Spine-1 has an ECMP route towards this destination:

Spine1# show ip route 203.0.113.12
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

203.0.113.12/32, ubest/mbest: 2/0
    *via 192.0.2.3, Eth1/1, [110/41], 5d00h, ospf-UNDERLAY, intra
    *via 192.0.2.4, Eth1/2, [110/41], 5d00h, ospf-UNDERLAY, intra
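
To get a feel for what an ECMP entry like this does to VXLAN traffic, here is a toy Python model that hashes the outer 5-tuple and picks one of the two next hops. The CRC32 hash is purely a stand-in that I chose for illustration; I don't know what hash the switch really uses, and `pick_next_hop` is a made-up name. The UDP source ports are the ones from the captures in this post.

```python
import zlib

# Next hops copied from the Spine-1 route output.
NEXT_HOPS = ["192.0.2.3 (Eth1/1)", "192.0.2.4 (Eth1/2)"]


def pick_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops=NEXT_HOPS):
    """Toy ECMP: hash the outer 5-tuple and pick a next hop by modulo.

    CRC32 is only a placeholder for whatever the hardware really does.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Same outer IPs and destination port, different UDP source ports.
for sport in (64786, 51894):
    hop = pick_next_hop("203.0.113.4", "203.0.113.12", 17, sport, 4789)
    print(sport, "->", hop)
```

The point is that the spine has no idea which leaf a given host sits behind. Any packet towards 203.0.113.12 can land on either leaf, depending only on how its outer header happens to hash.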

The packet can go to either leaf based on load sharing and this time it went to Leaf-2, which is not where Server-1 is connected. The packet is dropped:

As mentioned above, Spine-1 will load share towards both leafs sharing the Anycast VTEP. In my capture, the next packet hitting Spine-1 was actually forwarded towards Leaf-1. Server-1 then responded and sent the packet towards Leaf-1, which has no route for 10.0.0.22…

Leaf1# show ip route vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

198.51.100.0/24, ubest/mbest: 1/0, attached
    *via 198.51.100.1, Vlan10, [0/0], 9w6d, direct
198.51.100.1/32, ubest/mbest: 1/0, attached
    *via 198.51.100.1, Vlan10, [0/0], 9w6d, local
198.51.100.11/32, ubest/mbest: 1/0, attached
    *via 198.51.100.11, Vlan10, [190/0], 5d00h, hmm
198.51.100.44/32, ubest/mbest: 1/0
    *via 203.0.113.4%default, [200/0], 7w6d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007104 encap: VXLAN

The packet goes a bit further, but is dropped on Leaf-1 so it never makes it to Server-2:

The implementation of Anycast VTEP has broken all connectivity between anything connected to Leaf-1 and Leaf-2 as they are ignoring each other’s BGP updates. How do we fix this? That’s something I’ll cover in the vPC blog post!

Do we have any connectivity between Leaf-1 or Leaf-2 towards other leafs? Let’s see if Server-2 (10.0.0.22) can ping Server-4 (198.51.100.44) while also running a tcpdump on Server-4:

sudo tcpdump -i ens160 icmp
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), snapshot length 262144 bytes
08:26:37.086150 IP 10.0.0.22 > server4: ICMP echo request, id 35, seq 1, length 64
08:26:37.086212 IP server4 > 10.0.0.22: ICMP echo reply, id 35, seq 1, length 64
08:26:38.108020 IP 10.0.0.22 > server4: ICMP echo request, id 35, seq 2, length 64
08:26:38.108062 IP server4 > 10.0.0.22: ICMP echo reply, id 35, seq 2, length 64

The packet makes it to Server-4 and it responds, but that packet doesn’t make it back to Server-2. Now, why is that?

There’s a lot to uncover here. Everything works as expected until the packet reaches Leaf-4. Leaf-4 needs to route the packet and put the correct destination MAC in the inner Ethernet header. Let’s see what MAC it picks:

Leaf4# show forwarding nve l3 peers

slot  1
=======

EVPN configuration state: disabled, PeerVni Adj enabled
NVE cleanup transaction-id 0
tunnel_id    Peer_id  Peer_address   Interface       rmac         origin state del count
----------------------------------------------------------------------------------------
0xcb00710c  1225261060 203.0.113.12   nve1      00ad.e688.1b08 NVE        merge-done no    1  

It picks 00ad.e688.1b08. However, the route towards 10.0.0.22 looks like this:

Leaf4# show bgp l2vpn evpn 10.0.0.22
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.4:32787
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.f48d]:[32]:[10.0.0.22]/272, version 8816
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10002 10001
      Extcommunity: RT:65000:10001 RT:65000:10002 SOO:203.0.113.12:0 ENCAP:8
          Router MAC:00ad.f3bb.1b08
      Originator: 192.0.2.4 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 2 destination(s)
             Imported paths list: Tenant1 L3-10001
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10002 10001
      Extcommunity: RT:65000:10001 RT:65000:10002 SOO:203.0.113.12:0 ENCAP:8
          Router MAC:00ad.f3bb.1b08
      Originator: 192.0.2.4 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

Route Distinguisher: 192.0.2.6:3    (L3VNI 10001)
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.f48d]:[32]:[10.0.0.22]/272, version 8818
Paths: (1 available, best #1)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported from 192.0.2.4:32787:[2]:[0]:[0]:[48]:[0050.56ad.f48d]:[32]:[10.0.0.22]/272 
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10002 10001
      Extcommunity: RT:65000:10001 RT:65000:10002 SOO:203.0.113.12:0 ENCAP:8
          Router MAC:00ad.f3bb.1b08
      Originator: 192.0.2.4 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

Notice that the Router MAC here is 00ad.f3bb.1b08, which is the Router MAC of Leaf-2. Since both Leaf-1 and Leaf-2 are using the same Anycast VTEP of 203.0.113.12, there can only be one Router MAC associated with that VTEP. Leaf-4, by some mechanism, has selected the MAC address of Leaf-1, while Server-2 is behind Leaf-2. Leaf-2 receives this packet:

Frame 2: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens224, id 2
Ethernet II, Src: 00:ad:b3:fd:1b:08, Dst: 00:ad:f3:bb:1b:08
Internet Protocol Version 4, Src: 203.0.113.4, Dst: 203.0.113.12
User Datagram Protocol, Src Port: 51894, Dst Port: 4789
Virtual eXtensible Local Area Network
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:e6:88:1b:08
Internet Protocol Version 4, Src: 198.51.100.44, Dst: 10.0.0.22
Internet Control Message Protocol
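
To make the mismatch concrete, here is a small Python model of a peer table that can hold only one Router MAC per VTEP IP. I don't know how Leaf-4 really picks the MAC, so the "first route wins" rule below is an arbitrary assumption for illustration, and the function names are made up. The routes and MACs are taken from the outputs in this post.

```python
# (host IP, next-hop VTEP, Router MAC carried in the EVPN route)
ROUTES = [
    ("198.51.100.11", "203.0.113.12", "00ad.e688.1b08"),  # advertised by Leaf-1
    ("10.0.0.22", "203.0.113.12", "00ad.f3bb.1b08"),      # advertised by Leaf-2
]


def build_peer_table(routes):
    """One Router MAC per VTEP IP; 'first route wins' is an arbitrary toy rule."""
    table = {}
    for _host, vtep, rmac in routes:
        table.setdefault(vtep, rmac)
    return table


def inner_dst_mac(host, routes, table):
    """Inner destination MAC used for a host: looked up by VTEP, not by route."""
    for route_host, vtep, _rmac in routes:
        if route_host == host:
            return table[vtep]
    raise KeyError(host)


peers = build_peer_table(ROUTES)
print(peers)                                      # {'203.0.113.12': '00ad.e688.1b08'}
print(inner_dst_mac("10.0.0.22", ROUTES, peers))  # 00ad.e688.1b08
```

Two different Router MACs are advertised behind the same VTEP IP, but only one survives in the table, so traffic for one of the two leafs always goes out with the wrong inner destination MAC.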

Since the MAC address 00ad.e688.1b08 does not belong to Leaf-2, it will drop the packet. This is shown below:

As Leaf-4 currently thinks that the Router MAC for 203.0.113.12 is the MAC belonging to Leaf-1, there is no way to get connectivity to Server-2. Can we get Leaf-4 to pick Leaf-2’s Router MAC instead? We can achieve this by temporarily shutting down the BGP sessions on Leaf-1:

Leaf1(config)# router bgp 65000
Leaf1(config-router)# neighbor 192.0.2.11 
Leaf1(config-router-neighbor)# shutdown
Leaf1(config-router-neighbor)# exit
Leaf1(config-router)# neighbor 192.0.2.12
Leaf1(config-router-neighbor)# shutdown

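Leaf-1 sends a BGP NOTIFICATION message with the Cease code and the Administrative Shutdown subcode (defined in RFC 4486; RFC 8203 adds an optional shutdown communication) to signal that it is shutting down its sessions. It sends these towards the RRs: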

Frame 17: 87 bytes on wire (696 bits), 87 bytes captured (696 bits) on interface ens192, id 1
Ethernet II, Src: 00:ad:e6:88:1b:08, Dst: 00:ad:b3:fd:1b:08
Internet Protocol Version 4, Src: 192.0.2.101, Dst: 192.0.2.11
Transmission Control Protocol, Src Port: 179, Dst Port: 25072, Seq: 20, Ack: 39, Len: 21
Border Gateway Protocol - NOTIFICATION Message
    Marker: ffffffffffffffffffffffffffffffff
    Length: 21
    Type: NOTIFICATION Message (3)
    Major error Code: Cease (6)
    Minor error Code (Cease): Administratively Shutdown (2)
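
The message is small enough to build and parse by hand. Below is a short Python sketch based on the standard BGP header layout (16-byte marker, 2-byte length, 1-byte type) followed by the error code and subcode. The helper names are my own.

```python
import struct

MARKER = b"\xff" * 16
TYPE_NOTIFICATION = 3
CEASE = 6
ADMIN_SHUTDOWN = 2


def build_notification(code, subcode, data=b""):
    """Build a BGP NOTIFICATION: header (19 bytes) + code + subcode + data."""
    length = 19 + 2 + len(data)
    return MARKER + struct.pack("!HBBB", length, TYPE_NOTIFICATION, code, subcode) + data


def parse_notification(msg):
    """Parse a BGP NOTIFICATION and return its main fields as a dict."""
    length, msg_type, code, subcode = struct.unpack("!HBBB", msg[16:21])
    if msg[:16] != MARKER or msg_type != TYPE_NOTIFICATION:
        raise ValueError("not a BGP NOTIFICATION message")
    return {"length": length, "code": code, "subcode": subcode, "data": msg[21:length]}


msg = build_notification(CEASE, ADMIN_SHUTDOWN)
print(len(msg))                 # 21
print(parse_notification(msg))  # {'length': 21, 'code': 6, 'subcode': 2, 'data': b''}
```

The 21-byte length matches the capture, which tells us that no extra data follows the subcode.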

Leaf-1 also closes the TCP session by sending FIN to the RR:

The RRs then invalidate the routes that were known via Leaf-1 and send withdrawals to their clients:

Frame 20: 295 bytes on wire (2360 bits), 295 bytes captured (2360 bits) on interface ens161, id 0
Ethernet II, Src: 00:ad:b3:fd:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 192.0.2.11, Dst: 192.0.2.104
Transmission Control Protocol, Src Port: 179, Dst Port: 35150, Seq: 20, Ack: 39, Len: 229
Border Gateway Protocol - UPDATE Message
    Marker: ffffffffffffffffffffffffffffffff
    Length: 229
    Type: UPDATE Message (2)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 206
    Path attributes
        Path Attribute - MP_UNREACH_NLRI
            Flags: 0x90, Optional, Extended-Length, Non-transitive, Complete
            Type Code: MP_UNREACH_NLRI (15)
            Length: 202
            Address family identifier (AFI): Layer-2 VPN (25)
            Subsequent address family identifier (SAFI): EVPN (70)
            Withdrawn Routes
                EVPN NLRI: IP Prefix route
                    Route Type: IP Prefix route (5)
                    Length: 34
                    Route Distinguisher: 0001c00002030003 (192.0.2.3:3)
                    ESI: 00:00:00:00:00:00:00:00:00:00
                    Ethernet Tag ID: 0
                    IP prefix length: 24
                    IPv4 address: 198.51.100.0
                    IPv4 Gateway address: 0.0.0.0
                    MPLS Label Stack:  (withdrawn)
                EVPN NLRI: MAC Advertisement Route
                    Route Type: MAC Advertisement Route (2)
                    Length: 33
                    Route Distinguisher: 0001c00002038009 (192.0.2.3:32777)
                    ESI: 00:00:00:00:00:00:00:00:00:00
                    Ethernet Tag ID: 0
                    MAC Address Length: 48
                    MAC Address: 00:50:56:ad:2a:05
                    IP Address Length: 0
                    IP Address: NOT INCLUDED
                    1000 0000 0000 0000 0000 .... = MPLS Label 1: 524288
                EVPN NLRI: MAC Advertisement Route
                    Route Type: MAC Advertisement Route (2)
                    Length: 33
                    Route Distinguisher: 0001c00002038009 (192.0.2.3:32777)
                    ESI: 00:00:00:00:00:00:00:00:00:00
                    Ethernet Tag ID: 0
                    MAC Address Length: 48
                    MAC Address: 00:50:56:ad:85:06
                    IP Address Length: 0
                    IP Address: NOT INCLUDED
                    1000 0000 0000 0000 0000 .... = MPLS Label 1: 524288
                EVPN NLRI: MAC Advertisement Route
                    Route Type: MAC Advertisement Route (2)
                    Length: 33
                    Route Distinguisher: 0001c00002038009 (192.0.2.3:32777)
                    ESI: 00:00:00:00:00:00:00:00:00:00
                    Ethernet Tag ID: 0
                    MAC Address Length: 48
                    MAC Address: 00:50:56:ad:b4:a4
                    IP Address Length: 0
                    IP Address: NOT INCLUDED
                    1000 0000 0000 0000 0000 .... = MPLS Label 1: 524288
                EVPN NLRI: MAC Advertisement Route
                    Route Type: MAC Advertisement Route (2)
                    Length: 37
                    Route Distinguisher: 0001c00002038009 (192.0.2.3:32777)
                    ESI: 00:00:00:00:00:00:00:00:00:00
                    Ethernet Tag ID: 0
                    MAC Address Length: 48
                    MAC Address: 00:50:56:ad:85:06
                    IP Address Length: 32
                    IPv4 address: 198.51.100.11
                    1000 0000 0000 0000 0000 .... = MPLS Label 1: 524288
                EVPN NLRI: Inclusive Multicast Route
                    Route Type: Inclusive Multicast Route (3)
                    Length: 17
                    Route Distinguisher: 0001c00002038009 (192.0.2.3:32777)
                    Ethernet Tag ID: 0
                    IP Address Length: 32
                    IPv4 address: 203.0.113.12
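
If you want to double-check the Route Distinguisher values in a dissection like this, the 8-byte hex string is easy to decode yourself. The Python sketch below follows the three standard RD encodings (type 0, 1 and 2); `decode_rd` is just a helper name I picked.

```python
import ipaddress
import struct


def decode_rd(hex_rd):
    """Decode an 8-byte Route Distinguisher given as a hex string."""
    raw = bytes.fromhex(hex_rd)
    if len(raw) != 8:
        raise ValueError("a Route Distinguisher is exactly 8 bytes")
    (rd_type,) = struct.unpack("!H", raw[:2])
    if rd_type == 0:  # 2-byte ASN : 4-byte number
        asn, number = struct.unpack("!HI", raw[2:])
        return f"{asn}:{number}"
    if rd_type == 1:  # IPv4 address : 2-byte number
        ip, number = struct.unpack("!4sH", raw[2:])
        return f"{ipaddress.IPv4Address(ip)}:{number}"
    if rd_type == 2:  # 4-byte ASN : 2-byte number
        asn, number = struct.unpack("!IH", raw[2:])
        return f"{asn}:{number}"
    raise ValueError(f"unknown RD type {rd_type}")


print(decode_rd("0001c00002030003"))  # 192.0.2.3:3
print(decode_rd("0001c00002038009"))  # 192.0.2.3:32777
```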

Those routes are no longer valid on Leaf-4:

Leaf4# show bgp l2vpn evpn route-type 2
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.3:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.2a05]:[0]:[0.0.0.0]/216, version 8836
Paths: (1 available, best #0)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is invalid(deleted/dampened/history), not best reason: Neighbor Address, is deleted, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 SOO:203.0.113.12:0 ENCAP:8
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[0]:[0.0.0.0]/216, version 8837
Paths: (1 available, best #0)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is invalid(deleted/dampened/history), not best reason: Neighbor Address, is deleted, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.b4a4]:[0]:[0.0.0.0]/216, version 8838
Paths: (1 available, best #0)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is invalid(deleted/dampened/history), not best reason: Neighbor Address, is deleted, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000
      Extcommunity: RT:65000:10000 SOO:203.0.113.12:0 ENCAP:8
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.8506]:[32]:[198.51.100.11]/248, version 8839
Paths: (1 available, best #0)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is invalid(deleted/dampened/history), not best reason: Neighbor Address, is deleted, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.12 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 SOO:203.0.113.12:0 ENCAP:8
          MAC Mobility Sequence:00:2 Router MAC:00ad.e688.1b08
      Originator: 192.0.2.3 Cluster list: 192.0.2.2 

Leaf-4 selects a new Router MAC for 203.0.113.12:

Leaf4# show forwarding nve l3 peers

slot  1
=======

EVPN configuration state: disabled, PeerVni Adj enabled
NVE cleanup transaction-id 0
tunnel_id    Peer_id  Peer_address   Interface       rmac         origin state del count
----------------------------------------------------------------------------------------
0xcb00710c  1225261060 203.0.113.12   nve1      00ad.f3bb.1b08 NVE        merge-done no    1      

With the correct Router MAC, we should now have connectivity. This is the ping that’s running on Server-2 (10.0.0.22):

ping 198.51.100.44
PING 198.51.100.44 (198.51.100.44) 56(84) bytes of data.
64 bytes from 198.51.100.44: icmp_seq=41 ttl=62 time=5.68 ms
64 bytes from 198.51.100.44: icmp_seq=42 ttl=62 time=5.71 ms
64 bytes from 198.51.100.44: icmp_seq=43 ttl=62 time=6.02 ms
64 bytes from 198.51.100.44: icmp_seq=46 ttl=62 time=5.95 ms
64 bytes from 198.51.100.44: icmp_seq=47 ttl=62 time=6.25 ms
64 bytes from 198.51.100.44: icmp_seq=53 ttl=62 time=6.49 ms
64 bytes from 198.51.100.44: icmp_seq=55 ttl=62 time=6.00 ms
64 bytes from 198.51.100.44: icmp_seq=60 ttl=62 time=6.01 ms
64 bytes from 198.51.100.44: icmp_seq=61 ttl=62 time=5.17 ms
64 bytes from 198.51.100.44: icmp_seq=63 ttl=62 time=5.68 ms
64 bytes from 198.51.100.44: icmp_seq=64 ttl=62 time=6.21 ms
64 bytes from 198.51.100.44: icmp_seq=65 ttl=62 time=5.71 ms
64 bytes from 198.51.100.44: icmp_seq=68 ttl=62 time=5.57 ms
64 bytes from 198.51.100.44: icmp_seq=70 ttl=62 time=5.47 ms
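
Counting the gaps in the sequence numbers gives a rough idea of how many replies are missing. Here is a quick Python calculation using the sequence numbers copied from the ping output. It only looks at the window between the first and last reply shown.

```python
# icmp_seq values copied from the ping output.
RECEIVED = [41, 42, 43, 46, 47, 53, 55, 60, 61, 63, 64, 65, 68, 70]


def loss_in_window(seqs):
    """Return (lost, expected) between the first and last received sequence."""
    expected = seqs[-1] - seqs[0] + 1
    return expected - len(seqs), expected


lost, expected = loss_in_window(RECEIVED)
print(f"{lost} of {expected} lost ({lost / expected:.0%})")  # 16 of 30 lost (53%)
```

That is roughly every other packet, which is what you would expect if about half of the traffic ends up somewhere it gets dropped.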

At sequence number 41 the packets started making it. Notice how not every packet makes it, though. This is because the spine switches have IGP routes towards 203.0.113.12 via both Leaf-1 and Leaf-2. Depending on which leaf they forward the packet to, it will either make it or be dropped. When the packet makes it, it looks like this:

When the packet doesn’t make it, it looks like this:

What did we learn from all this?

  • When adding a secondary IP to the loopback used as the NVE source interface, EVPN will advertise all routes using the secondary IP as next hop.
  • A Site of Origin (SoO) community is added for loop prevention.
  • This will break all connectivity to the Anycast VTEPs until the full configuration with vPC has been implemented.
  • The Router MAC is a critical component in routing traffic between VNIs.
  • There are knobs to configure how EVPN behaves with Anycast VTEP (will be covered in the vPC post).
  • How to check the various hops and verify their forwarding.

See you in the vPC post when it’s done!

How Anycast VTEP Broke My Lab And What I Learned