Lately I have been working a lot with multicast, which is fun and challenging! Even if you have a good understanding of multicast, unless you work with it a lot, some concepts fall out of memory and others you only run into in real life and not in the lab. Here is a summary of some things I’ve noticed so far.
PIM Register
PIM Register messages are control plane messages sent from the First Hop Router (FHR) towards the Rendezvous Point (RP). They are unicast messages that encapsulate the multicast packets from the multicast source. There are some considerations here. Firstly, because these packets are sent from the FHR control plane to the RP control plane, they are not subject to any access list configured outbound on the FHR. I had a situation where I wanted to route the multicast locally but not send it outbound, and an outbound ACL did not stop the Register messages.
Even if the ACL had worked, care would have to be taken not to break the control plane between the FHR and the RP, or all multicast traffic for the group would be in jeopardy.
Because PIM Register messages are control plane messages, the RP has to process them in the control plane, which taxes the CPU. Depending on the rate at which the FHR is sending to the RP and the number of sources, this can be very stressful for the CPU. As a safeguard, the following command can be implemented:
ip pim register-rate-limit 20000
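This command is applied on FHRs and limits the rate of the PIM Register messages to 20 kbit/s. By default there is no limit, so set it to something that makes sense in your environment. The unit of the value can differ between platforms and software releases (some interpret it as packets per second), so check the command reference for the software you are running.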
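To get a feel for what a bit-rate limit means in packets, here is a minimal Python sketch of the arithmetic. It assumes the limit is counted in bits per second over the whole Register message; the function name and the 1250-byte message size are hypothetical and only there for illustration.

```python
def registers_per_second(limit_bps: int, register_size_bytes: int) -> float:
    """Approximate how many Register messages per second fit in a bit/s budget.

    Assumption: the limit counts every bit of the Register message.
    """
    if register_size_bytes <= 0:
        raise ValueError("register_size_bytes must be positive")
    return limit_bps / (register_size_bytes * 8)


if __name__ == "__main__":
    # 20000 bit/s limit with hypothetical 1250-byte Register messages
    print(registers_per_second(20000, 1250))  # 2.0
```

With large packets, a 20 kbit/s limit only lets a couple of Register messages through per second, which is usually enough to get the source registered while keeping the load on the RP small.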
Storm Control
If you have switches in your multicast environment, and most likely you will, implement storm control. If a loop forms, you don’t want an unlimited amount of broadcast and multicast traffic flooding your layer 2 domain. Combined with PIM Register, this can be a real killer for the control plane if your FHR is trying to register sources at a very high packet rate.
storm-control broadcast level pps 100
storm-control multicast level pps 1k
The above is just an example; you have to set it to something that fits your environment. Make sure to leave some room for more traffic than expected, but not so much that it hurts your devices if something goes wrong.
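One way to reason about a threshold is to start from the peak packet rate you expect and add headroom. The Python sketch below shows that arithmetic. The function names, the 50% headroom and the packet size are assumptions for illustration, not recommendations.

```python
import math


def storm_threshold_pps(expected_peak_pps: int, headroom: float = 0.5) -> int:
    """Expected peak rate plus fractional headroom, rounded up to whole pps."""
    if expected_peak_pps < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return math.ceil(expected_peak_pps * (1 + headroom))


def pps_to_mbps(pps: int, packet_size_bytes: int) -> float:
    """Bandwidth in Mbit/s for a packet rate at a fixed packet size."""
    return pps * packet_size_bytes * 8 / 1_000_000


if __name__ == "__main__":
    # Hypothetical expected peak of 600 pps with 50% headroom
    print(storm_threshold_pps(600))   # 900
    # A 1k pps threshold with hypothetical 1250-byte packets
    print(pps_to_mbps(1000, 1250))    # 10.0
```

Converting the pps threshold to bandwidth is a quick sanity check that the worst case allowed by storm control is still something your devices can handle.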
S,G Timeout
PIM Any Source Multicast (ASM) relies on an RP when setting up the flow between the multicast sender and receiver. The receiver will first join the (*,G) tree, which is rooted at the RP. After the receiver learns of the source, it can switch over to the source tree (S,G). The (S,G) mroute in the Multicast Routing Information Base (MRIB) has a default lifetime of 180 seconds. It can be beneficial to raise this timeout depending on the topology. Look at the following topology:
If something happens to the source that makes it go away for three minutes, the (S,G) state will time out. Let’s then say that the source comes back but the RP is not available. The FHR will then not be able to register the source, and no traffic can flow between the source and the receiver. If a higher timeout had been configured for the (S,G), the traffic would start flowing again when the source came back online. It’s not a very common scenario, but a higher timeout can be a reasonable safeguard for important multicast groups. The drawback of configuring it is that you will keep state for a longer time even if it is not needed.
ip pim sparse sg-expiry-timer <value>
The maximum timeout is 57600 seconds, which is 16 hours. Setting it to a couple of hours may cut you some slack if something happens to the RP. Be careful if you have a lot of groups running though, since every (S,G) entry is kept for that much longer.
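The timing argument can be put into a few lines of Python. This is a simplified sketch that assumes the (S,G) timer was last refreshed at the moment the source stopped sending. The function name is hypothetical, and the constants are the values mentioned in the text.

```python
DEFAULT_SG_EXPIRY_S = 180   # default (S,G) lifetime from the text
MAX_SG_EXPIRY_S = 57600     # maximum configurable value from the text


def sg_state_survives(outage_s: int, expiry_s: int = DEFAULT_SG_EXPIRY_S) -> bool:
    """True if the (S,G) entry still exists when the source comes back.

    Assumption: the timer was last refreshed when the source went silent.
    """
    return outage_s < expiry_s


if __name__ == "__main__":
    print(sg_state_survives(180))          # False: default timer has expired
    print(sg_state_survives(180, 7200))    # True: a two hour timer still holds state
    print(MAX_SG_EXPIRY_S / 3600)          # 16.0 hours
```

With the default, a three minute outage is enough to lose the state, while a timer of a couple of hours covers most short source outages.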
These are some important aspects but certainly not all. What lessons have you learned from deploying multicast?
Have you configured the S,G timeout in XR?
No, is it not available there?