Did some OSPF labbing yesterday. Ran into some interesting stuff. Imagine that you are running a hub-and-spoke Frame Relay network. All routers are using their main interface for communication. The hub router has its static Frame Relay mappings configured with the broadcast keyword but the spokes don’t. They are all running OSPF network type point-to-multipoint.
What will happen if we only have broadcast capability in one direction? The hub router sends multicast hellos out its main interface, reaching all the spokes. The surprising part is that the spokes reply with a unicast packet, using an immediate hello. The adjacency forms and everything is fine and dandy. However, after a while the hub declares the spokes dead. Why?
Since the spokes don’t have multicast capability, the hub never receives their periodic hello packets. Remember that hello packets are used as the keepalive mechanism, so after the 120 second dead interval the spokes are declared dead. Quite an interesting scenario, and it would be interesting as a trouble ticket at the CCIE lab.
If you, like me, had never heard of immediate hellos you can read this URL. Basically it’s a way of forming adjacencies and converging faster. Instead of only sending hello packets at HelloInterval, the router will immediately respond to incoming hello packets. The draft was written by 3 Huawei engineers.
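The dead timer behaviour on the hub can be sketched in a few lines of Python. This is only a toy model, not IOS behaviour captured from a router: it assumes that the single unicast immediate hello arrives at t=0, that the dead interval is 120 seconds, and that the function name and the 600 second observation window are made up for illustration.

```python
def neighbor_dead_time(hellos_received, dead_interval=120, horizon=600):
    """Return the time the hub declares the neighbor dead, or None.

    hellos_received: times (in seconds) at which a hello from the spoke
    actually reaches the hub. horizon is how long we watch the neighbor.
    """
    expires = None
    for t in sorted(hellos_received):
        if t > horizon:
            break
        if expires is not None and t >= expires:
            return expires  # dead timer ran out before this hello arrived
        expires = t + dead_interval  # every received hello restarts the timer
    if expires is not None and expires <= horizon:
        return expires
    return None


# Spoke without the broadcast keyword: only the unicast immediate hello gets through.
spoke_without_broadcast = neighbor_dead_time([0])

# Spoke with the broadcast keyword: a hello reaches the hub every 30 seconds.
spoke_with_broadcast = neighbor_dead_time(range(0, 601, 30))
```

In the first case the timer is started once and never refreshed, which is why the adjacency comes up and then drops. In the second case the timer is restarted every 30 seconds and never runs out.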
RIP timers are the most basic thing in the world, right? Even the command to set them is named timers basic… However, in some documentation it is not really clear what the difference is between the invalid and holddown timers. The default timers are 30 for updates, 180 for invalid, 180 for holddown and 240 for flush. I have heard, and seen described in official documentation, that when a route is in holddown it will not accept routes with a worse metric but will accept routes with a better metric. This is however not true. First let’s describe the different timers.
Updates – Updates are sent every 30 seconds by default to the address 224.0.0.9 (the RIPv2 multicast address).
Invalid – If there have not been any updates about the prefix for 180 seconds it is considered invalid and the route will be poisoned (route advertised with a metric of 16).
Holddown – The timer for holddown will be activated when the route goes into an invalid state. This is set to 180 by default.
Flush – This timer is set to 240 seconds. When a route is 240 seconds old it is flushed from the routing table.
So the holddown timer is used to stabilize the topology. Even better routes will be suppressed, which is not what some documentation says. Here is how I tested it.
I created a topology with 3 routers where the two outer routers both announced 184.108.40.206/32 to the middle router. I created an ACL on the middle router to filter all traffic from the router with the best route, so that the best route would become invalid. On the third router I used an offset-list to make the route worse. After the route became invalid I stopped sending the route with a worse metric and sent it with a better metric. However, the route is still not installed until the holddown timer has expired. If you manipulate the timers it is easier to see. I used 5 seconds for updates, 30 for invalid, 30 for holddown and a flush of 240. You will see that it takes 60s before the route gets installed.
If you use the standard timers, the holddown timer will not expire before the route is flushed: the 180 seconds of holddown only start counting after 180s, and then there is only 60s left until the route is flushed at 240s. Try this out for yourself and see if you get the same results as I did.
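The arithmetic behind those two cases can be written down as a small Python sketch. It follows the behaviour described above rather than any official specification: invalid and flush are both counted from the last update, holddown starts when the route goes invalid, and a replacement route is assumed to be accepted as soon as either holddown ends or the old route is flushed. The class and function names are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RipTimers:
    update: int = 30
    invalid: int = 180
    holddown: int = 180
    flush: int = 240


def timeline(timers):
    """Seconds after the last received update at which each event happens."""
    holddown_ends_at = timers.invalid + timers.holddown
    return {
        "invalid_at": timers.invalid,
        "holddown_ends_at": holddown_ends_at,
        "flushed_at": timers.flush,
        # Assumption: once the old route is flushed nothing is left to hold down.
        "replacement_accepted_at": min(holddown_ends_at, timers.flush),
    }


default_case = timeline(RipTimers())
lab_case = timeline(RipTimers(update=5, invalid=30, holddown=30, flush=240))
```

With the default timers holddown would end at 360s but the route is already flushed at 240s, so the holddown expiry is never seen. With the lab timers holddown ends at 60s, well before the flush, which matches the 60s delay in the test.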
Here is a good link describing the timers.
I did Vol1 RIP labs yesterday and I wanted to show you some cool stuff: how to do conditional default routing. This is lab stuff but some of it is definitely useful for real life deployments as well. I will be demonstrating RIP but the concepts are the same for other IGPs as well. I will show how to do it in two different ways. Let’s start out with the topology.
This network has two exit paths to the Internet, which are simulated by two routers with a loopback of 220.127.116.11. R2 has a static default route to ISP B which is the secondary exit. This route has an AD of 130 so it will only be used when the primary exit via RIP to ISP A is down. I have preconfigured the routers with addresses and the static route on R2 for ISP B. You can download the .net file and initial configs here.
Let’s start by enabling RIP on R1 and R2. No routes will be learned since we only have a local link between them. We will now configure R1 to advertise a default route.
Let’s check R2 if we can see the route.
Yes, it is there. This is not a very dynamic setup however. R1 will announce the route even if it loses the link to ISP A. Remember that RIP does not need to have a default route in its own RIB for it to announce it to neighbors. We will prove this by shutting down the link to ISP A.
The interface has been shut down. Is the route still available at R2?
Yes it is. We now have a blackhole. Traffic will reach R1 and then it will be blackholed. Let’s look at a way of solving this.
We can create a route-map and tie this to the advertisement of the default route. The route-map will check that the ISP A link (18.104.22.168/24) is in the routing table before advertising the default to R2. If the link to ISP A goes down the default should disappear. Let’s try this out.
We then shut down the interface on R1.
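The decision the route-map makes can be modelled in a couple of lines of Python. This is not router configuration, just a sketch of the logic: the default is advertised only while the ISP A link prefix is present in R1’s routing table. The function name and the prefixes (192.0.2.0/24 standing in for the ISP A link, 10.0.12.0/24 for the R1 to R2 link) are placeholders invented for the example.

```python
import ipaddress


def advertise_default(rib_prefixes, required_prefix):
    """Advertise the default only if the required prefix is in the RIB (exact match)."""
    required = ipaddress.ip_network(required_prefix)
    return required in {ipaddress.ip_network(p) for p in rib_prefixes}


# Hypothetical RIB contents before and after the ISP A interface goes down.
rib_link_up = ["10.0.12.0/24", "192.0.2.0/24"]
rib_link_down = ["10.0.12.0/24"]

default_when_up = advertise_default(rib_link_up, "192.0.2.0/24")
default_when_down = advertise_default(rib_link_down, "192.0.2.0/24")
```

When the interface goes down the connected prefix leaves the RIB, the check fails and the default is withdrawn.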
We check R2, before and after shutdown of interface to ISP A.
If we debug RIP on R1 we can see that it poisons the route and sends it to R2 (hop count 16).
Now we have a way of doing a conditional default if the interface goes down. How can we solve a situation where we have issues but the interface stays up? Maybe we are connected via a fibre converter, not an optimal solution but sometimes we don’t get to decide.
Time to get funky, IP SLA is our friend. The basic idea is still the same. We will create a dummy route in R1’s routing table. This dummy route will only be installed as long as R1 can ping ISP A’s IP address. If it can’t, it will remove the dummy route and stop announcing the default route. We start by configuring R1.
We have a prefix-list that matches the dummy route. We use IP SLA to ping the other side, which is more reliable than relying on link down on the interface. Then we have the dummy route that is tied to track 1. Track 1 tracks the status of the SLA ping. We then have a route-map that looks for the dummy route prefix. If this prefix is missing we will stop announcing the default route. Let’s look at R1 to see that the SLA is succeeding and that the dummy route is installed.
The reason we see some failures is that I had the interface shut down when I brought the SLA probe online. Let’s check that R2 still has a default route.
Indeed it does. Now let’s see what happens when R1 can’t ping ISP A. I will temporarily remove the IP on ISP A so that the interface still stays up but R1 won’t be able to ping any longer. We will run a debug of track to see what happens.
R1 can’t ping ISP A any longer so it stops announcing the default route, as we can see on R2.
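The chain of dependencies (SLA probe, track object, dummy static route, route-map, default route) is easier to follow as a toy model. The Python below is only a sketch of that chain and not IOS configuration. The class name and the dummy prefix 198.51.100.1/32 are invented for the example, and the track object is assumed to follow the probe result immediately with no delay timers.

```python
class ConditionalDefault:
    """Toy model of R1: SLA probe -> track -> dummy static route -> default route."""

    def __init__(self, dummy_prefix):
        self.dummy_prefix = dummy_prefix
        self.rib = set()

    def probe(self, reply_received):
        # The track object follows the probe result, and the dummy static
        # route is only installed while the track object is up.
        if reply_received:
            self.rib.add(self.dummy_prefix)
        else:
            self.rib.discard(self.dummy_prefix)

    def advertises_default(self):
        # The route-map only permits the default while the dummy route exists.
        return self.dummy_prefix in self.rib


r1 = ConditionalDefault("198.51.100.1/32")  # hypothetical dummy prefix
r1.probe(True)    # ISP A answers the ping
before = r1.advertises_default()
r1.probe(False)   # interface still up, but ISP A no longer answers
after = r1.advertises_default()
```

Note that nothing in the model looks at interface state. Only the probe result matters, which is the whole point of using SLA instead of relying on link down.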
So now we have a more dynamic way of sending a conditional default. You can create even more exciting scenarios than this I am sure. If you want to lab it up just download the files from the beginning of this post.
Sorry for the lack of updates lately but I’m deep into studies and when I’m not my days are still full. Rack rentals have been fully booked so I have been running Dynamips lately for labs. Here is some advice if you want to run OER/PfR in Dynamips.
You need at least 256MB RAM for your MC and border routers. Other routers will be fine with 128MB. If you don’t have enough memory you will get tracebacks and log messages about not being able to run Netflow.
You probably want to enable sparsemem to not need to run too many hypervisors. I use two hypervisors for the entire INE topology.
I saw some inconsistencies compared to the solution guide so I might try to redo the labs later with rack rentals. You can do most of the tasks in Dynamips though. Where you might run into trouble is when doing verification by simulating traffic. If you want to simulate HTTP by transferring an IOS image it is not as easy as with a real device.
Running the full INE topology on my laptop with a Core i5 and 4GB RAM only takes about 5% CPU. I have described in earlier posts how to achieve this.
If there is interest I might try to do a video on Dynamips on how I setup my topology and how I achieve the low load.
I’m doing labs in Dynamips now since rack rentals are fully booked for two weeks. I’m going through the Vol1 labs and I’m currently at the routing section. Most of the stuff is familiar but I need to do some practice on OER.