Introduction

Modern networks need to be enabled for voice and video. These applications
do not tolerate a lot of loss before quality becomes unacceptable. This
requires us to build networks that are scalable, resilient and converge
quickly. This post will describe key points of building a network that
fulfills those requirements.

Hierarchical Network Design

It’s important to think of the network in terms of building blocks and
hierarchy. Define different building blocks like campus, small remote site,
medium remote site and large remote site so that not every new network
needs a unique design. Building a hierarchical network will:

  • Make it easier to understand, grow and troubleshoot
  • Create small fault domains and a clear separation of layer 2 and 3
  • Allow for load sharing and redundancy
  • Give deterministic traffic patterns and convergence

The key point is to not end up with a network looking like this:

Bad_design

Like anyone working in the real world I know that budget constraints, lack of
available fibre or many other factors can limit our network designs. We can’t
always win but we should make it clear to the company/management that we can’t
support voice and video unless we design the network as it should be. Do you want
to be on call for a network that is poorly designed and maybe you are the only
one that knows how it works? In Optimal Routing Design, Russ White talks about
the 2 AM test. If someone calls you up at 2 AM, do you know how your network works?
If you don’t, it’s a sign that it is too complex.

The different building blocks of a hierarchical network

Traditionally networks have been built in a three tier model. This model
consists of access, distribution and core. In smaller networks it may be
acceptable to have a layer that acts as both distribution and core and
this is called the collapsed core model.

Access Layer

The role of the access layer is to:

  • Provide connectivity into the network
  • Enforce security to prevent ARP/IP spoofing
  • Act as the trust boundary for the QoS model
  • Provide PoE for phones and access points

Distribution layer

The role of the distribution layer is to:

  • Aggregate wiring closets (access layer) and uplinks to core
  • Provide high availability, load sharing and QoS
  • Protect the core from high density peering and problems in access layer
  • Summarize routes towards the core and provide fast convergence
  • Provide first hop redundancy towards the access layer

Core Layer

The role of the core layer is to:

  • Provide connectivity between all the building blocks
  • Provide high performance and high availability
  • Aggregate the distribution layer
  • Help the network scale by being a separate layer

When do I need a core layer?

There is no textbook answer to this question but consider the following
topology:

2blocks

There are currently two building blocks. Every distribution layer switch has
three IGP neighbors and 3 links. What happens if we add another building
block?

3blocks

Each distribution switch went from three IGP peers to five. The total number
of links also went from 6 to 15. You can see that this gets out of hand
pretty quickly.
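
The numbers above follow from a full mesh between the distribution switches. Here is a minimal sketch of the arithmetic, assuming two distribution switches per building block and a link between every pair of distribution switches. The function name and parameters are my own, made up for illustration.

```python
def full_mesh_stats(blocks: int, switches_per_block: int = 2) -> tuple[int, int]:
    """Return (IGP neighbors per switch, total links) for a full mesh
    of distribution switches with no core layer in between."""
    n = blocks * switches_per_block   # total distribution switches
    neighbors = n - 1                 # every switch peers with all the others
    links = n * (n - 1) // 2          # one link per pair of switches
    return neighbors, links


for blocks in range(2, 6):
    neighbors, links = full_mesh_stats(blocks)
    print(f"{blocks} blocks: {neighbors} IGP neighbors per switch, {links} links")
```

Two blocks give 3 neighbors and 6 links, three blocks give 5 neighbors and 15 links. The link count grows with the square of the number of switches, which is the scaling problem a core layer removes.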

Different Campus Designs

There are a few common designs that can be used to build the campus access and
distribution layer. Which design fits best depends on whether there is a
need to span VLANs and how modern the equipment in your network is.

Layer 3 Distribution

Layer 3 distribution

This design has no VLANs spanning the switches. This is what we want to have
but usually there is some requirement that keeps us from doing this. It could be
some application, vMotion or maybe one common wireless network across the entire
campus. There is no layer 2 loop in this design which means we are not relying
on STP for convergence. Here are some points to consider for this design:

  • Tune CEF to avoid polarization leading to underusing links
  • Summarize routes towards the core
  • Don’t peer IGP across links unless you intend to use them
  • Set the trunk mode to on/nonegotiate
  • Hardcode ports toward users to access mode and enable PortFast
  • Configure root guard or BPDU guard towards users
  • Enable security features such as DHCP snooping and DAI

Layer 2 Distribution

Layer 2 distribution

It’s quite common that some VLANs need to span the campus. This means that there
must exist an L2 link between distribution switches. There is now a loop in the
topology so convergence is dependent on spanning tree. Some points to consider
for this design:

  • Tune CEF to avoid polarization leading to underusing links
  • Summarize routes towards the core
  • Don’t peer IGP across links unless you intend to use them
  • Set the trunk mode to on/nonegotiate
  • Hardcode ports toward users to access mode and enable PortFast
  • Configure root guard or BPDU guard towards users
  • Enable security features such as DHCP snooping and DAI
  • Align STP Root and HSRP primary on the same distribution switch
  • Put Root Guard on downlinks (facing access switches)
  • Put Loop Guard on uplinks (facing distribution switches)

Routed Access

Routed access

The routed access design has no layer 2 links. It’s all routing which means
convergence is fast, no links are blocking and equal cost routing can be used.
The drawback is that no VLANs can span the topology. If MPLS is enabled in the core
some VLANs could still be able to span through the use of EoMPLS, VPLS etc. Key
points to consider for this design:

  • How much more will routed access cost me?
  • Do I need the performance/convergence gain?
  • Do I have the need to span any VLANs?
  • How many routes do my access layer devices support?
  • Summarize routes towards the core
  • Summarize routes towards the access
  • Tune CEF to avoid polarization
  • Don’t peer IGP across links unless you intend to use them
  • Hardcode ports toward users to access mode and enable PortFast
  • Configure root guard or BPDU guard towards users
  • Enable security features such as DHCP snooping and DAI

Layer 2 Distribution with MLAG

Layer 2 MLAG

Newer designs can utilize features like stacking, VSS and vPC. This means that
VLANs can span access switches but there is no logical loop because MLAG is used.
This gives us the advantage of a layer 2 distribution without the disadvantage of
relying on spanning tree for convergence. With stacking or VSS there is no need to run
HSRP because the distribution layer is acting as one device. With vPC the two switches
keep separate control planes, so an FHRP is still configured. The key points are similar
to the layer 2 distribution:

  • Tune CEF to avoid polarization leading to underusing links
  • Summarize routes towards the core
  • Set the trunk mode to on/nonegotiate
  • Hardcode ports toward users to access mode and enable PortFast
  • Configure root guard or BPDU guard towards users
  • Enable security features such as DHCP snooping and DAI
  • Put Root Guard on downlinks (facing access switches)
  • Put Loop Guard on uplinks (facing distribution switches)

For a new design I would definitely go with some form of stacking or VSS, or vPC
if deploying Nexus switches. This gives us the flexibility of using layer 2 in distribution
without needing to rely on STP and FHRP for convergence.

Recommendations for Fast Convergence

  • Use only point to point interconnections
  • Use fiber between all devices for fast convergence (debounce timer)
  • Tune the carrier delay timer
  • When possible use IP configuration on physical interface over SVI

I did a separate post on Detecting Network Failure which goes into more detail
on detecting failure.

Why should physical interfaces be used over SVI? The following steps take place when
converging on a physical interface:

  1. Link Down
  2. Interface Down
  3. Routing Update

When using an SVI there are some additional steps however:

  1. Link Down
  2. Interface Down
  3. Autostate
  4. SVI Down
  5. Routing Update

With an SVI, when the link goes down the switch must check if there are any
other ports up with that VLAN configured. If there are, the SVI won’t be brought down.
Even if there aren’t, it takes time to go through all the interfaces before declaring
the SVI down. This can worsen convergence by a good 200 ms. If you do use an SVI
then make sure that it is point to point, meaning the VLAN is not allowed on any
links other than the link connecting the two switches.
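
To make the autostate check concrete, here is a simplified model of it. This is a sketch under my own assumptions and not how any switch actually implements it: the Port class and svi_is_up function are hypothetical names, and a real switch also looks at things like the spanning tree state of the port.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    vlans: set[int]   # VLANs allowed on this port
    link_up: bool


def svi_is_up(vlan: int, ports: list[Port]) -> bool:
    """Simplified autostate: the SVI stays up while at least one
    port carrying the VLAN still has link."""
    return any(p.link_up and vlan in p.vlans for p in ports)


# VLAN 100 is meant for the link between two switches, but it is
# also allowed on another trunk that is still up.
ports = [
    Port("uplink", {100}, link_up=False),            # the failed link
    Port("other-trunk", {100, 200}, link_up=True),   # VLAN 100 leaked here
]
print(svi_is_up(100, ports))       # True: SVI stays up, no routing update
print(svi_is_up(100, ports[:1]))   # False: SVI goes down, routing can react
```

The first case is why the VLAN should only be allowed on the one link. As long as any other port carries it, the SVI stays up and the routing protocol never sees the failure as an interface down event.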

Recommendations for Spanning Tree

  • Don’t span VLANs across switches unless necessary
  • Use RSTP or MST for best convergence
  • Even if you have no loops, STP is needed to protect against user side loops
  • STP can protect against misconfiguration or hardware failures creating loops

Layer 2 Hardening

Cisco recommends the following features for hardening layer 2 in a campus design:

Layer 2 hardening

I agree with most of this but there are some caveats.

One issue is with Root Guard. Why do we run Root Guard? To protect against another
switch dictating the bridging topology. I would set the root to a priority of 0 and the
secondary root to a priority of 4096. That should provide protection, and if you don’t
trust your employees not to mess up the network, that is an education problem or a
management problem. Use TACACS+ command authorization to restrict what user accounts
can do and remove potentially dangerous commands such as switchport trunk allowed vlan,
no router bgp, no router ospf and so on.
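
Here is a small sketch of why those priorities work. The root election picks the lowest bridge ID, which is the priority followed by the MAC address as a tie breaker. The switch names and MAC addresses below are made up for illustration, and the model ignores per-VLAN details such as the extended system ID.

```python
def bridge_id(priority: int, mac: str) -> tuple[int, int]:
    """Bridge ID as a comparable tuple: priority first, then MAC."""
    return priority, int(mac.replace(":", ""), 16)


def elect_root(bridges: dict[str, tuple[int, str]]) -> str:
    """Return the name of the bridge with the lowest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


bridges = {
    "dist-1": (0, "00:1a:00:00:00:01"),        # intended root
    "dist-2": (4096, "00:1a:00:00:00:02"),     # intended secondary root
    "access-1": (32768, "00:10:00:00:00:03"),  # default priority, lower MAC
}
print(elect_root(bridges))                     # dist-1

# If the root fails, the secondary takes over even though
# access-1 has the lowest MAC address.
remaining = {name: bid for name, bid in bridges.items() if name != "dist-1"}
print(elect_root(remaining))                   # dist-2
```

Note that a switch that is also configured with priority 0 and happens to have a lower MAC address would still win the tie, so the priorities protect against defaults and accidents rather than against every possible misconfiguration.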

So what is the issue with Root Guard? The STP root will also be the HSRP primary device.
Because the network is designed with Equal Cost Multi Paths (ECMP) some traffic may
arrive at the standby HSRP router. This is the network without blocking links:

Root Guard step 1

No issues so far except that the crosslink is being used, but it’s not a major deal.
But what happens if the link between the HSRP primary and standby fails?

Root Guard step 2

The access switches are sending superior BPDUs but the secondary distribution switch will
block the link due to Root Guard being implemented. This means that any traffic arriving
at the secondary distribution switch destined for the access layer switches will be
black holed. That is why I would not implement Root Guard towards the access layer.

Recommendations for Layer 3

Here are some recommendations for Layer 3:

  • Build triangles not squares for deterministic convergence
  • Use passive-interface default and only peer on links used for transit
  • Design the network with dual Layer 3 paths for resiliency
  • Summarize from the distribution to the core to cut down on flooding and Active queries
  • Tune CEF to avoid polarization of links

What happens if we design with squares?

Routing Square

If a device goes down the network has to rely on flooding of updates or LSAs before it
can converge. There is no secondary path that can be immediately installed.

But if the design is a triangle instead:

Routing Triangle

There are already dual paths so losing one won’t affect convergence and the other route
is already in the FIB so traffic can keep flowing.
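
The difference can be sketched by counting the equal cost next hops a distribution switch has towards a destination behind the core. The topologies below are my own assumption of what the two designs look like (D1/D2 are distribution switches, C1/C2 are core switches, X is a destination attached to both cores) and every link is assumed to have the same cost.

```python
from collections import deque


def hop_counts(links, source):
    """Build the adjacency map and BFS hop counts from source."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return adj, dist


def equal_cost_next_hops(links, source, dest):
    """Neighbors of source that lie on a shortest path to dest."""
    adj, dist_to_dest = hop_counts(links, dest)
    best = dist_to_dest[source]
    return sorted(n for n in adj[source] if dist_to_dest[n] == best - 1)


core = [("C1", "C2"), ("C1", "X"), ("C2", "X")]
triangle = core + [("D1", "C1"), ("D1", "C2"), ("D2", "C1"), ("D2", "C2"), ("D1", "D2")]
square = core + [("D1", "C1"), ("D2", "C2"), ("D1", "D2")]

print(equal_cost_next_hops(triangle, "D1", "X"))  # ['C1', 'C2']
print(equal_cost_next_hops(square, "D1", "X"))    # ['C1']
```

In the triangle D1 already has two next hops installed, so losing one uplink only removes a path. In the square the only alternative goes via D2 and is one hop longer, so it has to be learned and installed after the failure.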

Conclusion

There are many network designs out there. Learn the strengths and weaknesses of
different designs. Look at best practice designs from the vendors but don’t follow
them blindly. As I have shown sometimes recommendations will not work for all
scenarios.

Finding the right design depends on business needs, budget and what kind of applications
are running. Read more on campus design in BRKCRS-2031 and also look at the Cisco
Validated Network Designs.

13 thoughts on “Network Campus Design”

  • October 18, 2013 at 4:08 pm

    Very nice Daniel, I shall use this as a reference doc for any design work (I don’t do much). The “Layer 3 Distribution” model results in FHRP keepalives going via the Access Layer. I guess that is a drawback for the benefit of having no Layer 2 loop.

    Nick

  • October 18, 2013 at 9:45 pm

    What about keeping the VLAN database the same on all access switches and using VLAN allowed lists for control?

    Keeping one VLAN database will most likely simplify the network, otherwise I would need to maintain an Excel sheet to capture all access switches and their respective VLAN databases 🙂

    Also, if I have quick convergence requirements, I can opt for things like Flex Links and Resilient Ethernet Protocol (REP).

    Also, one common failure scenario is when you have only L2 links between the distribution switches in a collapsed core design and the core receives, for example, a default route from upstream. In that case the HSRP active device might stop receiving the default route from the core and effectively black hole the traffic if the default route is not tracked in HSRP/FHRP.

    Also, summarization is an excellent tool but it must be planned and designed well, or it can easily black hole traffic. Summarization can still cause recalculation from a routing protocol perspective due to metric adjustment if the more specific route that was providing the seed metric for the summary route fails.

    HTH…
    Deepak Arora

    • October 19, 2013 at 6:49 pm

      Good points. I will write about REP in the future and probably also about summarization.

  • October 19, 2013 at 6:57 pm

    Great article with a thorough explanation of each model. My favorite (when possible) is definitely “Layer 2 Distribution with MLAG”. Thanks!

    • October 20, 2013 at 8:50 am

      Thanks! More articles coming up on design and other topics for the CCDE.

  • October 23, 2013 at 10:35 am

    Great article! Thank you!

  • October 29, 2013 at 7:13 pm

    Good brief Daniel. A couple of things I would like to point out. For vPC an FHRP is still necessary since you have two control planes. OTV-like technologies between DCs, let’s say, require FHRP isolation. Also, in the RSTP case it will converge after some time, but an FHRP delay might be necessary for synchronization with RSTP. Also, I agree with Deepak, but we should keep in mind that to avoid summary metric oscillation, and to avoid PRC for IS-IS or full SPF for OSPF, we set the metric manually instead of taking it from the smallest component, or we create a virtual interface and assign it from that.

    Thanks.

  • October 25, 2014 at 11:47 pm

    Thanks for a great article! I’d like to make a small remark. You say you wouldn’t enable Loop Guard in an RSTP environment, since the latter is capable of detecting unidirectional links. Let me disagree with this statement, and here is why: in a converged network, RSTP sends BPDUs ONLY downstream out its Designated ports (same as in legacy 802.1D STP). This is a fundamental concept of the spanning tree mechanism. The only difference with RSTP is that each bridge generates and transmits its own BPDUs, no matter if it receives BPDUs on its root ports from the upstream bridges or not.

    I have built a lab with three switches connected in a triangle, with switch A as the root, and B and C as its downstream bridges. The port on B towards C got the Designated role, whilst the port on C towards B got Alternate. Now, to simulate a unidirectional link, I configured bpdufilter on the port on B, and as soon as C stopped hearing BPDUs from B on its blocked Alternate port, it slowly moved it to listening, learning, and, finally, forwarding state, thus effectively adding one more Designated port on the segment. Bingo, we have a loop!

    The conclusion is obvious: Loop Guard is mandatory even with RSTP. It would not allow the blocked port to transition to forwarding state as soon as it stops hearing BPDUs from the designated bridge on the segment in situations when the physical link stays up.

    • October 26, 2014 at 8:36 am

      Thanks Alex!

      Yes, without the BPDU it wouldn’t be able to detect the unidirectional link so to catch all possible scenarios, RSTP with loopguard and UDLD enabled should be able to catch “everything”.

      • October 26, 2014 at 12:48 pm

        That’s right! So, switch B would never know if the direction from C to B has failed, since BPDUs are never received on designated ports (except when the neighbor bridge C loses its Root port, but currently I’m speaking about the converged state). The good news is that no loop would develop as long as the opposite direction (from B to C) remains operational (since C hears B’s BPDUs on its blocked port).

        If I were you, I’d edit this sentence just to avoid readers’ confusion: “I would not use Loop Guard because RSTP can detect unidirectional links by receiving inferior BPDUs on a designated port.”

        Thanks 🙂
