I’ve been working a lot with cloud networking lately. I will share some of my findings, as this is still quite new and documentation around some topics is poor, especially on the Azure side. Let me start with two statements that I have seen made about cloud networking:

Cloud networking is easy! – Not necessarily so. I’ll explain more.

We don’t need networking in cloud! – Wrong. You do, but in basic implementations it’s not visible to you.

This post will be divided into different areas describing the different components in cloud networking. You will see that there are many things in common between AWS and Azure.

System Routes

Within a VPC/VNET, there are system routes. If 10.0.0.0/22 was assigned to the VPC/VNET, there will be a system route along the lines of “10.0.0.0/22 local”. Subnets are then deployed in the VPC/VNET and there is full connectivity between them due to the system route. This route points to a virtual router which is the responsibility of AWS/Azure. Normally this router will have a “leg” in each subnet, at the first IP address of the subnet, for example 10.0.0.1 for the 10.0.0.0/24 subnet.
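To make the addressing concrete, here is a small Python sketch using the standard ipaddress module. It carves the example 10.0.0.0/22 into /24 subnets and prints the first address of each, which is where the virtual router normally sits. The /24 subnet size and the function name virtual_router_ip are my own choices for illustration, not anything AWS or Azure mandates.

```python
import ipaddress


def virtual_router_ip(subnet_cidr: str) -> str:
    """Return the first address after the network address of a subnet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return str(subnet.network_address + 1)


# Address space assigned to the VPC/VNET in the example above.
vpc = ipaddress.ip_network("10.0.0.0/22")

# Assumption for illustration: the space is split into /24 subnets.
subnets = list(vpc.subnets(new_prefix=24))

for subnet in subnets:
    print(subnet, "-> virtual router at", virtual_router_ip(str(subnet)))
```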

In AWS, system routes cannot be overridden. It doesn’t matter whether you try to add a static route or a longer (more specific) route, the system route always takes precedence.

In Azure, you CAN override system routes. This is done with UDR, User Defined Routing. While very useful, it can be confusing, and a little dangerous, that when several routes have the same prefix length, Azure chooses between them in the order below:

  1. User-defined route
  2. BGP route
  3. System route

This means that a BGP route of equal length will be preferred to the system route. I learned this the hard way: I was advertising 0.0.0.0/0 over BGP, and a host lost internet connectivity because it was following the BGP route instead of the system route.
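The selection logic can be sketched in a few lines of Python. This is only a model of the behaviour described above, assuming longest-prefix match is applied first and the source order (user-defined, BGP, system) only breaks ties between prefixes of equal length. The Route class, the select_route function and the next-hop labels are names I made up for the example.

```python
import ipaddress
from dataclasses import dataclass

# Lower number wins when prefix lengths tie (order taken from the list above).
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}


@dataclass(frozen=True)
class Route:
    prefix: str    # destination prefix, e.g. "0.0.0.0/0"
    source: str    # "user", "bgp" or "system"
    next_hop: str  # illustrative label only


def select_route(routes, destination):
    """Pick the longest matching prefix, breaking ties by route source."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            -ipaddress.ip_network(r.prefix).prefixlen,
            SOURCE_PRIORITY[r.source],
        ),
    )


routes = [
    Route("10.0.0.0/22", "system", "VnetLocal"),
    Route("0.0.0.0/0", "system", "Internet"),
    Route("0.0.0.0/0", "bgp", "VirtualNetworkGateway"),
]

# Internet-bound traffic follows the BGP default route, not the system one.
print(select_route(routes, "8.8.8.8"))
# Traffic inside the VNET still matches the more specific system route.
print(select_route(routes, "10.0.1.4"))
```

In this model the BGP-learned 0.0.0.0/0 wins over the system default route, which is the situation that cost my host its internet connectivity.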

Internet Connectivity

In AWS, to provide internet connectivity to a subnet, an Internet Gateway (IGW) must be attached to the VPC, or a NAT Gateway deployed in it, and the route table associated with the subnet must have a route towards the IGW or NAT Gateway. A subnet whose route table points at the IGW is often referred to as a public subnet.
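As a rough illustration, the check "does this subnet have a path to the internet" boils down to looking at the default route in the subnet's route table. The sketch below uses a plain list of dictionaries as a stand-in for a route table; the data layout, the function names and the example IDs are hypothetical. It only relies on the AWS convention that Internet Gateway IDs start with "igw-" and NAT Gateway IDs start with "nat-".

```python
def default_route_target(route_table):
    """Return the target of the 0.0.0.0/0 route, or None if there is none."""
    for route in route_table:
        if route["destination"] == "0.0.0.0/0":
            return route["target"]
    return None


def internet_path(route_table):
    """Describe how a subnet using this route table reaches the internet."""
    target = default_route_target(route_table)
    if target is None:
        return "no internet route"
    if target.startswith("igw-"):
        return "internet via IGW"
    if target.startswith("nat-"):
        return "internet via NAT Gateway"
    return "default route to something else"


# Hypothetical route tables; the IDs are made up for the example.
public_rt = [
    {"destination": "10.0.0.0/22", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0123456789abcdef0"},
]
isolated_rt = [
    {"destination": "10.0.0.0/22", "target": "local"},
]

print(internet_path(public_rt))
print(internet_path(isolated_rt))
```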

In Azure, internet connectivity is provided by default. My only guess as to why is that they want to let people get started easily even if they don’t have any networking knowledge. I don’t like this default though. Subnets shouldn’t have internet access unless I decide that they should. This also means that if you don’t want the subnet to have internet access, you have to play stupid tricks with ACLs, and by doing so you risk blocking access to Azure services. I don’t think this is a sensible default, and at the very least Azure should provide an easy mechanism to remove internet access for a subnet.

Virtual Router

As I described above, both in AWS and Azure there is a virtual router that lives in each and every subnet. Because broadcasts aren’t supported, there are some tricks behind the scenes, like answering ARP requests, in order for virtual machines to have connectivity between each other.

One thing I’ve noticed in Azure, which is kind of annoying, is that the virtual router does not reply to ICMP. This makes troubleshooting more difficult as you can’t ping the virtual router or see it as a hop in a traceroute.

The other thing with Azure is that the virtual router inserts itself into EVERY flow. I was very surprised by this, as I could not find it documented anywhere, including Azure’s official documentation. I discovered it when I had two devices talking BGP to each other: all the expected routes were there, and still traffic was being dropped somewhere.

After extensive troubleshooting I found something weird. I had two devices in the same subnet; let’s say A was 10.0.0.4 with MAC 0001.aaaa.aaaa and B was 10.0.0.5 with MAC 0001.bbbb.bbbb. I noticed that A’s ARP entry for B did not contain 0001.bbbb.bbbb. This was confusing as they are in the same subnet. Instead I saw a MAC address of 0123.4567.89ab. What was this MAC address? It turns out that the Azure virtual router acts as a “man in the middle”, replying to ARP requests with its own MAC. All traffic is then relayed through the virtual router.

Even though I had two devices on the same subnet, they were not communicating directly with each other. This broke my forwarding because the traffic was no longer only between these two devices; it was hitting Azure’s routing tables as well. I hadn’t put any routes into those because I shouldn’t have needed them. I then had to update Azure’s routing tables with static routes in order to get traffic through. I don’t really like how the virtual router inserts itself into every flow.
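If you want to spot this on your own devices, one simple approach is to look for a single MAC address answering for several IP addresses in the ARP table. The Python sketch below does that on a hand-written table using the example addresses from above; the table contents and the function name shared_macs are assumptions for illustration, and it assumes every neighbour resolves to the same router MAC.

```python
from collections import defaultdict


def shared_macs(arp_table):
    """Map each MAC that answers for more than one IP to those IPs."""
    by_mac = defaultdict(list)
    for ip, mac in arp_table.items():
        by_mac[mac].append(ip)
    return {mac: sorted(ips) for mac, ips in by_mac.items() if len(ips) > 1}


# Hypothetical ARP table as seen from device A (10.0.0.4).
arp_table_on_a = {
    "10.0.0.1": "0123.4567.89ab",  # the virtual router itself
    "10.0.0.5": "0123.4567.89ab",  # device B, answered by the virtual router
}

for mac, ips in shared_macs(arp_table_on_a).items():
    print(mac, "answers for", ", ".join(ips))
```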

VPN Gateways

Both AWS and Azure offer VPN gateways, whether they are called Virtual Network Gateway (Azure) or Virtual Private Gateway (AWS). AWS now also has the Transit Gateway. The AWS VGW/TGW is fairly straightforward: it’s fast to create, it supports BGP, and it’s easy to configure. The only caveat was that it didn’t do IKEv2 until just recently; IKEv2 is now supported. Make sure to enable route propagation to get your routes from the VGW into your route table. Keep in mind that it may not be possible to change some settings for the VGW after it has been created.

Azure, though, is the exact opposite. It does support IKEv2 and has done so for a long time, but that’s pretty much the only positive thing I can say about it. It takes a LONG time to create the VNG, often 30 minutes or so. Why, Azure, why?! When you create the VNG, you need to specify which VNET it belongs to. This can NOT be changed at a later stage. You also can’t change from a policy-based VNG to a dynamic VNG. You also have to select a SKU to size your VNG. Excuse me, I thought this was cloud?!

The worst part of Azure’s VNG, though, is how poorly documented active/active BGP is. I basically had to reverse engineer it to figure it out. First off, the VNG is deployed into a subnet that must be named GatewaySubnet. You need this subnet so that the VNG can get an IP address that you will later peer to. This means that you don’t get a shared subnet between Azure’s VNG and your VPN device, which is the case in AWS, where you get something like 169.254.0.0/30. Instead you have to run multi-hop eBGP to Azure, and not only that, you also need to peer from a loopback because you can’t configure Azure to have two BGP peers on your side. You also need to come up with a link network yourself, and then put a static route to Azure’s BGP peer IP over that link network. I could not find this documented anywhere.

Azure’s VNG is a job poorly done and I can’t imagine why they’ve designed it this way. It feels like someone lacking enough networking knowledge designed the VNG. I know this is probably not the case, but that’s just what it feels like.

Also keep in mind that you can’t start out with a VNG that is static and then move to dynamic. Considering how long it takes to create and delete VNGs, a change like that means you’ll likely lose connectivity to your on-premises environment for an hour or more.

Summary

I’ve tried to cover as many things as I could but I’m sure I’ve missed something. Networking in the cloud is different and not always as cloud-like as you would expect, especially in the case of Azure. While they do support overriding system routes, that’s pretty much the only advantage they have over AWS. It’s quite obvious that they are a couple of years behind AWS feature-wise. I can only hope that they seriously reconsider the way they deploy and configure the VNG. I hope you find some good information in this post, as it’s quite difficult to find the proper documentation, especially for Azure. Thanks for reading!

Lessons Learned in Cloud Networking – AWS vs Azure

18 thoughts on “Lessons Learned in Cloud Networking – AWS vs Azure”

  • April 1, 2019 at 3:23 pm

    Nice post! Haven’t tried to do much networking in other clouds. I’m quite familiar with how OpenStack is implemented in our setup, and here we have VMs on the same L2 (no relay in between). One tricky part you’d notice while doing networking there is that some OpenStack cloud providers have port security enabled on your network. It forces OpenStack to know about all IPs and MAC addresses for traffic to be passed through firewalls and bridges 🙂

    • April 1, 2019 at 4:53 pm

      Hey Mart,

      Nice to hear from you. Interesting info on OpenStack. Interesting to see if it will take off more in the enterprise space.

      • April 1, 2019 at 5:22 pm

        Indeed! Nice to see some life signs 🙂
        It’s still a lot of work to run, develop and update OpenStack. They are improving all the time, but I still think OpenStack doesn’t make sense money-wise unless you have like one rack of compute.

  • April 1, 2019 at 3:33 pm

    I would even go as far as saying it is the opposite of easy. Weird terminology, weird routing and switching behavior, magic stuff happening in the background… to a cloud networking newcomer it feels like a completely different world.
    And the best of it – bad documentation, info scattered in multiple blogposts by MS engineers. No real networking person in MS to ask or talk to.

    • April 1, 2019 at 5:39 pm

      Yes, the documentation is pretty bad and it’s confusing why they have to invent their own terminology. The Azure documentation is very lacking and the one that exists isn’t always that readable.

  • April 1, 2019 at 4:24 pm

    I am disappointed that a UDR (Azure) pointing a subnet to a VM will work only for subnets where the VM has interfaces, so I can’t send traffic to the VM’s loopback… Is it something specific to Azure only, or to cloud in general?

    • April 1, 2019 at 5:42 pm

      Can you explain the use case a bit more? What are you trying to achieve?

  • April 1, 2019 at 6:37 pm

    For me, the fact that in Azure BGP you have to put a host static route for the BGP peering address pointing to the tunnel was crazy. That kind of defeats the purpose of having BGP. If you are putting in a static route just for the peering address, then you might as well put in a route for all the VNETs that you have. The logic is very strange here.
    Also, whatever your peering address is gets advertised back to you by Azure, which is again pointless. I’d agree that the AWS implementation of the routing and VPN part is more clear and straightforward.

    • May 4, 2021 at 2:53 pm

      Hey Nizami,

      I wouldn’t necessarily agree with that (caveat: I work for Microsoft). The main difference I have seen is that AWS gateways work with traditional tunnels with IPs at both ends, while Azure VPN GWs work with unnumbered tunnels. In my (networking) experience, unnumbered tunnels are much more flexible and consume fewer IPs, and are used everywhere in provider-grade networks, but I agree that some network admins have trouble grasping the concept. That single static route is required to send the right BGP traffic through the right tunnel, but I don’t see how it would defeat the purpose of BGP: all BGP prefixes would be learnt on the BGP adjacency.

      Cheers,
      Jose

  • April 2, 2019 at 1:15 am

    Very good summary. It’s just missing GCP, but that’s another story.

    • April 2, 2019 at 7:24 am

      Haven’t had a chance to dive into GCP yet. They are not very big over here. It’s all about AWS and Azure.

  • April 2, 2019 at 4:55 am

    That is a great article, and definitely helps explain some of the behaviours I’ve seen in Azure and AWS. Thanks for posting!

    • April 2, 2019 at 7:26 am

      Thanks, Robert!

  • April 22, 2019 at 12:39 pm

    Thanks for sharing. I’ve worked a lot with AWS but have not had much exposure to Azure so this is a good comparison to know.

  • May 7, 2019 at 6:06 pm

    Great write up – I’ve been working in Azure for the last 8 months or so and have come to same conclusions. Luckily I haven’t had to deal with BGP on their VPN gateways! We have a Cisco CSR acting as a DMVPN hub for branch connectivity which was a fairly straightforward process to set up (after realizing that static routes need to point to local subnet gateway and that IP forwarding needed to be enabled on the VM NIC).

    What would you consider the best way to implement failover CSRs in Azure? I was thinking of replicating the approach they take with failover VPN gateways but don’t have too much experience with BGP.

    • June 1, 2019 at 10:05 am

      Thanks, John! I would look into using a Transit VNET design where you have two CSRs in the same availability set, or, if available, use availability zones. Then use a VNG in the VNETs to connect to the CSRs in the Transit VNET.

  • July 17, 2019 at 10:12 am

    Great article. One interesting point that I would like clarification. In your article you noted in the end of the virtual router section the following: “I then had to update Azure’s routing tables with static routes in order to get traffic through. I don’t really like how the virtual router inserts itself into every flow.”

    I get why Azure resorted to ARP resolution occurring via the MAC address of the virtual router, causing flows to transit the router. What I don’t get is your statement about having to add a static route into the Azure routing table. This should not be required even in the case you describe, as packets received by the router are in the “same” subnet and the router would resolve it via a “connected” subnet rule. Is that not the case?

    • July 26, 2019 at 3:40 pm

      The problem was that Azure, by “faking” the ARP reply, forces traffic through their virtual router, which means traffic is hitting Azure’s routing tables, even though the two devices are “directly connected”, that is, in the same subnet. I couldn’t find a way to not have traffic hit their routing tables.

