In a previous post, I walked through how a packet gets bridged in a VXLAN/EVPN network. In this post, I’ll go through how a packet gets routed, that is, how it gets from one VNI to another VNI. The following topology will be used:

The lab has the following characteristics:

  • OSPF in the underlay.
  • Ingress replication for BUM traffic through the use of EVPN.
  • ARP suppression is enabled.

Server-2 initiates a ping towards Server-4:

Frame 562: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens257, id 4
Ethernet II, Src: 00:50:56:ad:f4:8d, Dst: 00:01:00:01:00:01
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.44
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0xd745 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [Response frame: 563]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.000701509 seconds]
    Data (40 bytes)

The destination MAC is 0001.0001.0001, which is the Anycast GW MAC configured on Leaf-2. As this MAC is used on the SVI for VLAN 20 of Leaf-2, the packet will need to be routed. Let’s check the routing table:

Leaf2# show ip route 198.51.100.44/24 longer-prefixes vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

198.51.100.0/24, ubest/mbest: 2/0
    *via 203.0.113.1%default, [200/0], 5w6d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007101 encap: VXLAN
 
    *via 203.0.113.4%default, [200/0], 5w6d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007104 encap: VXLAN
 
198.51.100.11/32, ubest/mbest: 1/0
    *via 203.0.113.1%default, [200/0], 5w6d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007101 encap: VXLAN
 
198.51.100.44/32, ubest/mbest: 1/0
    *via 203.0.113.4%default, [200/0], 5w6d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007104 encap: VXLAN
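
To make the lookup concrete, here is a minimal sketch of a longest-prefix match over the three prefixes in the output above. The `ROUTES` dictionary and the `longest_prefix_match` function are names I made up for illustration; a real switch does this in hardware, not with a Python loop.

```python
import ipaddress

# Prefixes and next-hop VTEPs copied from the Tenant1 output above.
ROUTES = {
    "198.51.100.0/24": ["203.0.113.1", "203.0.113.4"],
    "198.51.100.11/32": ["203.0.113.1"],
    "198.51.100.44/32": ["203.0.113.4"],
}


def longest_prefix_match(destination, routes):
    """Return (prefix, next_hops) for the most specific matching prefix."""
    address = ipaddress.ip_address(destination)
    candidates = [
        ipaddress.ip_network(prefix)
        for prefix in routes
        if address in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda network: network.prefixlen)
    return str(best), routes[str(best)]


print(longest_prefix_match("198.51.100.44", ROUTES))
# ('198.51.100.44/32', ['203.0.113.4'])
```

The /32 wins over the /24, so the packet towards Server-4 has a single next hop, 203.0.113.4.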

Notice that there are two paths for 198.51.100.0/24, which have been learned via RT5, and that there is a more specific route, 198.51.100.44/32, which has been learned via RT2:

Leaf2# show bgp l2vpn evpn 198.51.100.44 
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.6:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.7d68]:[32]:[198.51.100.44]/272, version 6444
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.4 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 ENCAP:8 Router MAC:00ad.7083.1b08
      Originator: 192.0.2.6 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 3 destination(s)
             Imported paths list: Tenant1 L3-10001 L2-10000
  AS-Path: NONE, path sourced internal to AS
    203.0.113.4 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10000 10001
      Extcommunity: RT:65000:10000 RT:65000:10001 ENCAP:8 Router MAC:00ad.7083.1b08
      Originator: 192.0.2.6 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

Notice where it says imported paths list and that it has been imported to Tenant1, which is the VRF that is associated with the L3 VNI 10001. It has also been imported into L2 VNI 10000. Also note the Router MAC of 00ad.7083.1b08.
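
If you want to pull the fields out of the bracketed route key programmatically, the sketch below splits it up. The field names reflect my reading of how NX-OS displays a Type-2 route (route type, ESI, Ethernet tag, MAC length, MAC, IP length, IP), and `parse_type2_key` is a hypothetical helper, not something that ships with any tool.

```python
import re


def parse_type2_key(key):
    """Split an NX-OS style EVPN Type-2 route key into named fields."""
    fields = re.findall(r"\[([^\]]*)\]", key)
    if len(fields) != 7 or fields[0] != "2":
        raise ValueError(f"not a Type-2 route key: {key!r}")
    _, esi, eth_tag, mac_len, mac, ip_len, ip = fields
    return {
        "route_type": 2,
        "esi": esi,
        "ethernet_tag": int(eth_tag),
        "mac_length": int(mac_len),
        "mac": mac,
        "ip_length": int(ip_len),
        "ip": ip,
    }


route = parse_type2_key("[2]:[0]:[0]:[48]:[0050.56ad.7d68]:[32]:[198.51.100.44]/272")
print(route["mac"], route["ip"])
# 0050.56ad.7d68 198.51.100.44
```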

The next-hop for this route is 203.0.113.4 so a lookup is done to find what paths are available:

Leaf2# show ip route 203.0.113.4
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

203.0.113.4/32, ubest/mbest: 2/0
    *via 192.0.2.1, Eth1/1, [110/81], 7w5d, ospf-UNDERLAY, intra
    *via 192.0.2.2, Eth1/2, [110/81], 7w5d, ospf-UNDERLAY, intra
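
With two equal-cost paths, the switch has to pick one per flow. The real hash is platform specific and not something I can show here, so the sketch below uses CRC32 purely as a stand-in; `pick_next_hop` is a hypothetical name. The point is only that the same flow always lands on the same link, and the result may well differ from what the lab switch chose.

```python
import zlib

# Equal-cost underlay next hops towards 203.0.113.4, from the output above.
NEXT_HOPS = [("192.0.2.1", "Eth1/1"), ("192.0.2.2", "Eth1/2")]


def pick_next_hop(src_ip, dst_ip, protocol, src_port, dst_port, next_hops):
    """Choose one next hop per flow; CRC32 stands in for the real hash."""
    flow = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(flow) % len(next_hops)]


# Outer header values as they appear in the capture of the encapsulated
# echo request (UDP is IP protocol 17, VXLAN uses destination port 4789).
choice = pick_next_hop("203.0.113.2", "203.0.113.4", 17, 54276, 4789, NEXT_HOPS)
print(choice)
```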

Leaf-2 encapsulates the packet and forwards it towards Spine-1:

Frame 564: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens225, id 5
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:7b:30:1b:08
Internet Protocol Version 4, Src: 203.0.113.2, Dst: 203.0.113.4
User Datagram Protocol, Src Port: 54276, Dst Port: 4789
Virtual eXtensible Local Area Network
    Flags: 0x0800, VXLAN Network ID (VNI)
    Group Policy ID: 0
    VXLAN Network Identifier (VNI): 10001
    Reserved: 0
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.44
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0xd745 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [No response seen]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.001872153 seconds]
    Data (40 bytes)

Notice that L3 VNI 10001 is used as this is symmetric IRB. Also notice that the inner destination MAC is set to 00ad.7083.1b08, which is the Router MAC of Leaf-4:

Leaf4# show int vlan 100 | i Ether
  Hardware is EtherSVI, address is  00ad.7083.1b08

Refer to the post on Advertising IPs in RT2 for more details on the Router MAC.
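
The ingress decision on Leaf-2 can be sketched roughly as follows. The values come from the outputs and captures above, but the data model (`HostRoute`, `VRF_ROUTES`, `ingress_route`) is my own simplification: it uses an exact-match dictionary instead of a real routing table and ignores everything except the fields discussed in this post.

```python
from dataclasses import dataclass

ANYCAST_GW_MAC = "0001.0001.0001"
LOCAL_ROUTER_MAC = "00ad.f3bb.1b08"  # Router MAC of Leaf-2
LOCAL_VTEP = "203.0.113.2"


@dataclass
class HostRoute:
    next_hop_vtep: str
    l3_vni: int
    remote_router_mac: str


# Modelled on the RT2 for 198.51.100.44: next hop, L3 VNI and Router MAC.
VRF_ROUTES = {
    "198.51.100.44": HostRoute("203.0.113.4", 10001, "00ad.7083.1b08"),
}


def ingress_route(dst_mac, dst_ip):
    """Return the encapsulation fields, or None if the frame is not routed."""
    if dst_mac != ANYCAST_GW_MAC:
        return None  # not addressed to the gateway, so it would be bridged
    route = VRF_ROUTES.get(dst_ip)
    if route is None:
        return None  # no host route in this simplified model
    return {
        "outer_src_ip": LOCAL_VTEP,
        "outer_dst_ip": route.next_hop_vtep,
        "vni": route.l3_vni,
        "inner_src_mac": LOCAL_ROUTER_MAC,
        "inner_dst_mac": route.remote_router_mac,
    }


encap = ingress_route("0001.0001.0001", "198.51.100.44")
print(encap["vni"], encap["inner_dst_mac"])
# 10001 00ad.7083.1b08
```

Note that the inner source MAC becomes the Router MAC of Leaf-2 and the inner destination MAC becomes the Router MAC of Leaf-4, which matches the inner Ethernet header in the capture.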

Spine-1 forwards the packet towards Leaf-4:

Frame 565: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens162, id 7
Ethernet II, Src: 00:ad:7b:30:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 203.0.113.2, Dst: 203.0.113.4
User Datagram Protocol, Src Port: 54276, Dst Port: 4789
Virtual eXtensible Local Area Network
    Flags: 0x0800, VXLAN Network ID (VNI)
    Group Policy ID: 0
    VXLAN Network Identifier (VNI): 10001
    Reserved: 0
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:ad:70:83:1b:08
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.44
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0xd745 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [No response seen]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.002762421 seconds]
    Data (40 bytes)
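
The encapsulated frames are 148 bytes on the wire while the original ping was 98 bytes. The sketch below builds the 8-byte VXLAN header for VNI 10001 and adds up the overhead, assuming an untagged outer Ethernet header and an IPv4 header without options, which is what the capture shows. The function names are mine.

```python
import struct


def build_vxlan_header(vni):
    """Pack the 8-byte VXLAN header: flags byte 0x08 (VNI valid) and a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits). Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", 0x08 << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from an 8-byte VXLAN header."""
    _, vni_word = struct.unpack("!II", header)
    return vni_word >> 8


# Outer Ethernet + outer IPv4 + outer UDP + VXLAN header sizes in bytes.
OUTER_ETHERNET, OUTER_IPV4, OUTER_UDP, VXLAN = 14, 20, 8, 8
overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN

header = build_vxlan_header(10001)
print(header.hex(), parse_vni(header), 98 + overhead)
# 0800000000271100 10001 148
```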

Leaf-4 decapsulates the packet. As the inner destination MAC is set to 00ad.7083.1b08, which is its own Router MAC, Leaf-4 will route the packet. Let’s check for 198.51.100.44 in the routing table:

Leaf4# show ip route 198.51.100.44/24 longer-prefixes vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

198.51.100.0/24, ubest/mbest: 1/0, attached
    *via 198.51.100.1, Vlan10, [0/0], 7w6d, direct
198.51.100.11/32, ubest/mbest: 1/0
    *via 203.0.113.1%default, [200/0], 7w4d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007101 encap: VXLAN

There is a route for 198.51.100.0/24 and this route is attached. This means that Leaf-4 will now check the ARP cache to get the MAC of Server-4:

Leaf4# show ip arp vrf Tenant1

Flags: * - Adjacencies learnt on non-active FHRP router
       + - Adjacencies synced via CFSoE
       # - Adjacencies Throttled for Glean
       CP - Added via L2RIB, Control plane Adjacencies
       PS - Added via L2RIB, Peer Sync
       RO - Re-Originated Peer Sync Entry
       D - Static Adjacencies attached to down interface

IP ARP Table for context Tenant1
Total number of entries: 1
Address         Age       MAC Address     Interface       Flags
198.51.100.44   00:02:53  0050.56ad.7d68  Vlan10    

Leaf-4 does an L2 lookup to find where to forward the packet:

Leaf4# show mac address-table address 0050.56ad.7d68 vni 10000
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan,
        (NA)- Not Applicable
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
*   10     0050.56ad.7d68   dynamic  NA         F      F    Eth1/3
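
The three lookups Leaf-4 just performed (route, ARP, MAC table) can be chained together in a small sketch. The tables are filled with the entries from the outputs above; the dictionaries and the `egress_lookup` function are hypothetical and leave out things a real switch does, such as triggering ARP when an entry is missing.

```python
import ipaddress

# Entries copied from the Leaf-4 outputs above.
CONNECTED = {"198.51.100.0/24": "Vlan10"}
SVI_VLAN = {"Vlan10": 10}
ARP = {"198.51.100.44": "0050.56ad.7d68"}
MAC_TABLE = {(10, "0050.56ad.7d68"): "Eth1/3"}


def egress_lookup(dst_ip):
    """Resolve a destination IP to (SVI, MAC, port) via route, ARP and MAC table."""
    address = ipaddress.ip_address(dst_ip)
    for prefix, svi in CONNECTED.items():
        if address in ipaddress.ip_network(prefix):
            mac = ARP.get(dst_ip)
            if mac is None:
                return None  # a real switch would send an ARP request here
            port = MAC_TABLE.get((SVI_VLAN[svi], mac))
            return svi, mac, port
    return None  # not an attached subnet in this simplified model


print(egress_lookup("198.51.100.44"))
# ('Vlan10', '0050.56ad.7d68', 'Eth1/3')
```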

Leaf-4 forwards the packet towards Server-4:

Frame 566: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens194, id 8
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:50:56:ad:7d:68
Internet Protocol Version 4, Src: 10.0.0.22, Dst: 198.51.100.44
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0xd745 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [Response frame: 567]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.003884583 seconds]
    Data (40 bytes)

This is shown visually below:

Server-4 responds to the ping from Server-2:

Frame 567: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens194, id 8
Ethernet II, Src: 00:50:56:ad:7d:68, Dst: 00:01:00:01:00:01
Internet Protocol Version 4, Src: 198.51.100.44, Dst: 10.0.0.22
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0xdf45 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [Request frame: 566]
    [Response time: 0,188 ms]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.004072854 seconds]
    Data (40 bytes)

The destination MAC is 0001.0001.0001, which is the Anycast GW MAC configured on Leaf-4. As this MAC is used on the SVI for VLAN 10 of Leaf-4, the packet will need to be routed. Let’s check the routing table:

Leaf4# show ip route 10.0.0.22/24 longer-prefixes vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

10.0.0.22/32, ubest/mbest: 1/0
    *via 203.0.113.2%default, [200/0], 7w3d, bgp-65000, internal, tag 65000, segid: 10001 tunnelid: 0xcb007102 encap: VXLAN

One thing to note is that there is only a single /32 route here. Why is there no /24? This is because I haven’t configured redistribute direct on Leaf-2, so it’s not originating an RT5. The RT2 is below:

Leaf4# show bgp l2vpn evpn 10.0.0.22
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.4:32787
BGP routing table entry for [2]:[0]:[0]:[48]:[0050.56ad.f48d]:[32]:[10.0.0.22]/272, version 1638
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    203.0.113.2 (metric 81) from 192.0.2.12 (192.0.2.2)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10002 10001
      Extcommunity: RT:65000:10001 RT:65000:10002 ENCAP:8 Router MAC:00ad.f3bb.1b08
      Originator: 192.0.2.4 Cluster list: 192.0.2.2 

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop
             Imported to 2 destination(s)
             Imported paths list: Tenant1 L3-10001
  AS-Path: NONE, path sourced internal to AS
    203.0.113.2 (metric 81) from 192.0.2.11 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10002 10001
      Extcommunity: RT:65000:10001 RT:65000:10002 ENCAP:8 Router MAC:00ad.f3bb.1b08
      Originator: 192.0.2.4 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

Note the Router MAC of 00ad.f3bb.1b08. Another thing to note here is that the Imported paths list only has the L3 VNI 10001 and not the L2 VNI 10002. Why? This is because Leaf-4 does not have VNI 10002 configured. Earlier we saw that Leaf-2 imported into VNI 10000 as it had this VNI configured. Why would it import routes also into the L2 VNI? This is because it uses this information for the ARP suppression cache:

Leaf2# show ip arp suppression-cache remote

Flags: + - Adjacencies synced via CFSoE
       L - Local Adjacency
       R - Remote Adjacency
       L2 - Learnt over L2 interface
       PS - Added via L2RIB, Peer Sync
       RO - Dervied from L2RIB Peer Sync Entry

Ip Address      Age      Mac Address    Vlan Physical-ifindex    Flags    Remote Vtep Addrs

198.51.100.44   00:00:13 0050.56ad.7d68   10 (null)              R        203.0.113.4 
198.51.100.11   00:00:13 0050.56ad.8506   10 (null)              R        203.0.113.1 
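
The import behaviour described above comes down to matching route targets. Here is a rough sketch, assuming each table imports the route target ASN:VNI that shows up in the Extcommunity lines. The configuration itself is not shown in this post, so the `LEAF2_IMPORTS` and `LEAF4_IMPORTS` dictionaries are assumptions, as is the `imported_tables` helper.

```python
def imported_tables(route_targets, local_imports):
    """Return the local tables whose import route target is carried by the route."""
    return [name for name, rt in local_imports.items() if rt in route_targets]


# Assumed import route targets, following the ASN:VNI pattern seen in the routes.
LEAF2_IMPORTS = {
    "Tenant1": "65000:10001",
    "L3-10001": "65000:10001",
    "L2-10000": "65000:10000",
    "L2-10002": "65000:10002",
}
LEAF4_IMPORTS = {
    "Tenant1": "65000:10001",
    "L3-10001": "65000:10001",
    "L2-10000": "65000:10000",
}

# Route targets carried by the RT2 for Server-4 and the RT2 for Server-2.
rt_server4 = {"65000:10000", "65000:10001"}
rt_server2 = {"65000:10001", "65000:10002"}

print(imported_tables(rt_server4, LEAF2_IMPORTS))
# ['Tenant1', 'L3-10001', 'L2-10000']
print(imported_tables(rt_server2, LEAF4_IMPORTS))
# ['Tenant1', 'L3-10001']
```

Both results line up with the Imported paths list in the two BGP outputs.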

Now back to the packet walk. Leaf-4 needs a route for 203.0.113.2:

Leaf4# show ip route 203.0.113.2
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

203.0.113.2/32, ubest/mbest: 2/0
    *via 192.0.2.1, Eth1/1, [110/81], 7w5d, ospf-UNDERLAY, intra
    *via 192.0.2.2, Eth1/2, [110/81], 7w5d, ospf-UNDERLAY, intra

Leaf-4 encapsulates the packet and sends it towards Spine-1:

Frame 547: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens161, id 0
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:b3:fd:1b:08
Internet Protocol Version 4, Src: 203.0.113.4, Dst: 203.0.113.2
User Datagram Protocol, Src Port: 49708, Dst Port: 4789
Virtual eXtensible Local Area Network
    Flags: 0x0800, VXLAN Network ID (VNI)
    Group Policy ID: 0
    VXLAN Network Identifier (VNI): 10001
    Reserved: 0
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:f3:bb:1b:08
Internet Protocol Version 4, Src: 198.51.100.44, Dst: 10.0.0.22
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0xdf45 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.004693453 seconds]
    Data (40 bytes)

Once again, notice that VNI 10001 is used which is the L3 VNI. Also note the inner destination MAC of 00ad.f3bb.1b08 which is the router MAC of Leaf-2:

Leaf2# show int vlan 100 | i Ether
  Hardware is EtherSVI, address is  00ad.f3bb.1b08
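
Wireshark prints MAC addresses with colons while NX-OS uses the dotted format, which makes them a bit awkward to compare by eye. A small helper like the one below (`to_dotted` is a name I picked) converts one into the other:

```python
def to_dotted(mac):
    """Convert a MAC such as 00:ad:f3:bb:1b:08 to the dotted form 00ad.f3bb.1b08."""
    digits = mac.lower().replace(":", "").replace("-", "").replace(".", "")
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


# Inner destination MAC from the capture versus the SVI address on Leaf-2.
print(to_dotted("00:ad:f3:bb:1b:08") == "00ad.f3bb.1b08")
# True
```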

Spine-1 forwards the packet towards Leaf-2:

Frame 559: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface ens224, id 2
Ethernet II, Src: 00:ad:b3:fd:1b:08, Dst: 00:ad:f3:bb:1b:08
Internet Protocol Version 4, Src: 203.0.113.4, Dst: 203.0.113.2
User Datagram Protocol, Src Port: 49708, Dst Port: 4789
Virtual eXtensible Local Area Network
    Flags: 0x0800, VXLAN Network ID (VNI)
    Group Policy ID: 0
    VXLAN Network Identifier (VNI): 10001
    Reserved: 0
Ethernet II, Src: 00:ad:70:83:1b:08, Dst: 00:ad:f3:bb:1b:08
Internet Protocol Version 4, Src: 198.51.100.44, Dst: 10.0.0.22
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0xdf45 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.005738551 seconds]
    Data (40 bytes)

Leaf-2 decapsulates the packet. As the inner destination MAC is set to 00ad.f3bb.1b08, Leaf-2 will route the packet. Let’s check for 10.0.0.22 in the routing table:

Leaf2# show ip route 10.0.0.22/24 longer-prefixes vrf Tenant1
IP Route Table for VRF "Tenant1"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

10.0.0.0/24, ubest/mbest: 1/0, attached
    *via 10.0.0.1, Vlan20, [0/0], 7w3d, direct

There is a route for 10.0.0.0/24 and it’s directly attached. Leaf-2 checks the ARP cache to get the MAC address of Server-2:

Leaf2# show ip arp vrf Tenant1

Flags: * - Adjacencies learnt on non-active FHRP router
       + - Adjacencies synced via CFSoE
       # - Adjacencies Throttled for Glean
       CP - Added via L2RIB, Control plane Adjacencies
       PS - Added via L2RIB, Peer Sync
       RO - Re-Originated Peer Sync Entry
       D - Static Adjacencies attached to down interface

IP ARP Table for context Tenant1
Total number of entries: 1
Address         Age       MAC Address     Interface       Flags
10.0.0.22       00:01:26  0050.56ad.f48d  Vlan20           

Leaf-2 does an L2 lookup to find where to forward the packet:

Leaf2# show mac address-table address 0050.56ad.f48d vni 10002
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan,
        (NA)- Not Applicable
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
*   20     0050.56ad.f48d   dynamic  NA         F      F    Eth1/3

Leaf-2 forwards the packet to Server-2:

Frame 563: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface ens257, id 4
Ethernet II, Src: 00:ad:f3:bb:1b:08, Dst: 00:50:56:ad:f4:8d
Internet Protocol Version 4, Src: 198.51.100.44, Dst: 10.0.0.22
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0
    Checksum: 0xdf45 [correct]
    [Checksum Status: Good]
    Identifier (BE): 17 (0x0011)
    Identifier (LE): 4352 (0x1100)
    Sequence Number (BE): 1 (0x0001)
    Sequence Number (LE): 256 (0x0100)
    [Request frame: 562]
    [Response time: 5,899 ms]
    Timestamp from icmp data: Mar  3, 2024 08:38:35.804470000 Romance Standard Time
    [Timestamp from icmp data (relative): 0.006600705 seconds]
    Data (40 bytes)

This is shown visually below:

The packet walk is complete! In this post we learned:

  • How VXLAN packets get routed in a symmetric IRB topology.
  • How the Router MAC is crucial for forwarding with symmetric IRB.
  • How, with symmetric IRB, both the ingress and egress VTEPs perform L2 and L3 lookups.
  • How the packet changes along the path to its destination.
  • How to check forwarding tables to see where to forward the packet.

I hope you were able to follow along and see you in the next one!

Routed Packet Walk in VXLAN/EVPN Network