This article describes how to design software-defined WANs (SD-WANs) that connect on-premises datacenters with each other and with Azure. It presents an architecture that lets Azure customers use their existing investments in the platform by building efficient, global SD-WAN overlays on top of the Microsoft backbone.
Applicable scenarios
The recommendations in this article are vendor-agnostic and applicable to SD-WAN technologies that meet two basic prerequisites:
The SD-WAN overlay is implemented with tunnels that use Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) as the underlying transport, such as tunnel-mode IPsec Encapsulating Security Payload (ESP) with network address translation traversal (NAT traversal).
Border Gateway Protocol (BGP) v4 exchanges routes between the SD-WAN edge devices and the networks connected to the SD-WAN. No assumptions are made about the routing protocol that the SD-WAN edge devices use to exchange routing information.
You can use SD-WAN products that meet these prerequisites to achieve the following goals:
Connect Azure hub-and-spoke networks to SD-WANs that span cloud and on-premises facilities, with dynamic route exchange between Azure virtual networks and SD-WAN edge devices.
Optimize connectivity to Azure and to on-premises datacenters for branches that have local internet breakouts. The reach of the Microsoft backbone, combined with its capacity, resiliency, and cold potato routing policy, can make it a high-performance underlay for global SD-WANs.
Use the Microsoft backbone for all Azure-to-Azure traffic (cross region and cross geography).
Use existing multiprotocol label switching (MPLS) networks as high-performance underlays.
Switch from MPLS networks to internet connectivity in a phased approach that minimizes the effect on the business.
The following sections assume that you're familiar with the basics of the SD-WAN paradigm and the architecture of the Microsoft backbone. The Microsoft backbone interconnects Azure regions with each other and with the public internet.
Architecture
Organizations that have a global presence and a multiregion Azure footprint use multiple connectivity services to build their corporate networks and connect to the Microsoft backbone.
Dedicated connectivity services, such as MPLS IP virtual private networks (IPVPNs), are typically deployed at the largest sites.
Azure ExpressRoute circuits connect the Microsoft backbone to datacenter facilities by using the point-to-point connectivity model or directly to the MPLS network by using the any-to-any connectivity model.
Branch offices that only have internet connectivity might use IPsec VPNs to connect to the closest on-premises datacenter and use that datacenter's ExpressRoute connection to access Azure resources. Or they might use IPsec VPNs to directly connect to Azure hub-and-spoke networks.
SD-WAN projects can differ in the connectivity services that they intend to replace. Some organizations might want to continue to use dedicated links or MPLS for large facilities and deploy SD-WAN only to replace legacy internet-based IPsec VPNs in small sites. Other organizations might want to extend their SD-WAN to MPLS-connected sites and use the existing MPLS network as a high-performance underlay. Some organizations might also retire their MPLS network and build their entire corporate network as a logical overlay on top of public or shared underlays, such as the public internet and the Microsoft backbone.
The architecture supports all the scopes in this article and is based on the following principles:
SD-WAN devices are deployed as network virtual appliances (NVAs) in each Azure region's hub-and-spoke network and configured as SD-WAN hubs that terminate tunnels from on-premises sites.
SD-WAN devices in Azure are configured to establish tunnels with each other to create a fully meshed hub-to-hub overlay that efficiently transports traffic among Azure regions. This overlay also relays traffic between geographically distant on-premises sites on top of the Microsoft backbone.
SD-WAN devices are deployed in all on-premises sites that the SD-WAN solution covers and are configured to establish tunnels to the SD-WAN NVAs in the closest Azure region or regions. Different sites can use different underlay transport services, such as the public internet or ExpressRoute connectivity.
Traffic from a site routes to the SD-WAN NVAs in the closest Azure region, regardless of whether the destination is in Azure or in another on-premises site. The traffic then traverses the hub-to-hub overlay.
SD-WAN products can use proprietary protocols and features to create direct tunnels between two sites and achieve better performance than relaying traffic through SD-WAN NVAs in Azure.
The following diagram shows the high-level architecture of a global SD-WAN that uses the Microsoft backbone, the public internet, and dedicated ExpressRoute connections as underlays.
Connect Azure hub-and-spoke networks to SD-WANs
This section provides recommendations for deploying SD-WAN edge devices as NVAs in an existing hub-and-spoke Azure network.
SD-WAN NVAs in the hub virtual network
We recommend the hub-and-spoke topology for building scalable networks in an Azure region by using customer-managed virtual networks. The hub virtual network hosts shared components such as NVAs and native services that provide network functions, like firewalling, load balancing, and connectivity to on-premises sites via site-to-site VPNs or ExpressRoute. The hub virtual network is the logical location for SD-WAN NVAs because it centralizes shared network functions and provides consistent access to remote networks. These NVAs are non-Microsoft gateways that connect the hub to those remote networks.
Deploy SD-WAN NVAs in hub virtual networks in the following ways:
Use one network interface controller (NIC) for all SD-WAN traffic. You can add other NICs, such as a management NIC, to meet security and compliance requirements or to follow vendor guidelines for Azure deployments.
Attach the NIC used for SD-WAN traffic to a dedicated subnet. Define the size of the subnet based on the number of SD-WAN NVAs deployed to meet high availability (HA) and scale or throughput requirements. For more information, see Connect Azure hub-and-spoke networks to SD-WANs and Azure Route Server limits and design considerations.
Associate network security groups (NSGs) with the SD-WAN traffic NIC, either directly or at the subnet level, to allow connections from remote on-premises sites over the TCP/UDP ports that the SD-WAN solution uses.
Enable IP forwarding on the NIC used for SD-WAN traffic.
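The subnet sizing guidance above can be sketched as a small calculation. This is a minimal sketch, not vendor guidance: it assumes one SD-WAN traffic NIC per NVA, relies on Azure reserving five addresses in every subnet, and treats /29 as the smallest subnet that Azure accepts. The function name and the `spare` parameter are hypothetical.

```python
AZURE_RESERVED_IPS_PER_SUBNET = 5  # Azure reserves five addresses in every subnet
SMALLEST_AZURE_PREFIX = 29         # assumption: /29 is the smallest subnet Azure accepts


def smallest_subnet_prefix(nva_count: int, nics_per_nva: int = 1, spare: int = 0) -> int:
    """Return the longest IPv4 prefix length whose subnet fits the SD-WAN NICs."""
    if nva_count < 1:
        raise ValueError("nva_count must be at least 1")
    needed = nva_count * nics_per_nva + spare + AZURE_RESERVED_IPS_PER_SUBNET
    # Number of host bits required so that 2**host_bits >= needed.
    host_bits = (needed - 1).bit_length()
    return min(32 - host_bits, SMALLEST_AZURE_PREFIX)
```

Sizing for the largest number of NVAs that you expect to run, rather than the initial count, avoids having to re-address the subnet when you scale out.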
Route Server in the hub virtual network
Route Server automates route exchange between SD-WAN NVAs and the Azure software-defined networking (SDN) stack. Route Server supports BGP as a dynamic routing protocol. By establishing BGP adjacencies between Route Server and SD-WAN NVAs:
Route Server injects routes for all on-premises sites connected to the SD-WAN into the virtual network's route tables, and all Azure virtual machines (VMs) learn those routes.
Route Server propagates routes for all IP prefixes in the address space of virtual networks to all SD-WAN-connected sites.
Configure Route Server with the following requirements:
Deploy Route Server in a dedicated subnet in the hub virtual network. Set Route Server capacity based on the number of VMs in the hub-and-spoke network.
To enable dynamic route exchange for all spoke virtual networks, configure virtual network peering to allow the spoke virtual networks to use the hub virtual network's gateway and Route Server. For more information, see the Route Server FAQ.
Route Server and the SD-WAN NVAs attach to different subnets, so configure the BGP sessions between them to use eBGP multihop. You can set any hop count from two up to the maximum that the SD-WAN NVA supports. For more information about how to configure BGP adjacencies for Route Server, see Create and configure Route Server by using the Azure portal.
Configure two /32 static routes on the SD-WAN NVA for the BGP endpoints that Route Server exposes. This configuration ensures that the NVA's route table always contains routes for its multihop (not directly connected) BGP peers.
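The two static routes can be derived from the addresses involved. The following sketch is illustrative only: the subnet and peer addresses are hypothetical, and it assumes that the next hop is the first usable address of the NVA's own subnet, which Azure uses as the subnet's default gateway. The output is a vendor-neutral description that you would translate into the NVA's own configuration syntax.

```python
import ipaddress


def route_server_static_routes(nva_subnet: str, route_server_peer_ips: list[str]) -> list[dict]:
    """Build one /32 static route per Route Server BGP endpoint."""
    subnet = ipaddress.ip_network(nva_subnet)
    # Assumption: the first usable address in the subnet is the Azure default gateway.
    gateway = subnet.network_address + 1
    routes = []
    for ip in route_server_peer_ips:
        peer = ipaddress.ip_address(ip)
        if peer in subnet:
            # Route Server lives in its own dedicated subnet, so this would be a mistake.
            raise ValueError(f"{peer} is inside the NVA subnet {subnet}")
        routes.append({"prefix": f"{peer}/32", "next_hop": str(gateway)})
    return routes
```

For example, `route_server_static_routes("10.0.1.0/24", ["10.0.2.4", "10.0.2.5"])` yields two host routes that both point to 10.0.1.1.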
Route Server isn't in the data path. It's a control plane entity that propagates routes between the SD-WAN NVAs and the virtual network SDN stack. The Azure SDN stack handles the actual traffic forwarding between the SD-WAN NVAs and the VMs in the virtual network, as the following figure shows. To achieve this routing behavior, Route Server injects all routes that it learns from the SD-WAN NVAs by setting the next hop to the NVA's address.
Route Server doesn't support IPv6. This architecture is only for IPv4.
HA for SD-WAN NVAs with Route Server
Route Server has built-in HA. Two compute resources back a single Route Server instance, and Azure deploys these compute resources in different availability zones (in regions with availability zones) or in the same availability set (in regions without availability zones). As a result, a Route Server instance exposes two BGP endpoints, one endpoint for each compute resource. You achieve HA for the SD-WAN NVAs by deploying multiple instances in different availability zones or in the same availability set. Each SD-WAN NVA establishes two BGP sessions, one session for each endpoint that Route Server exposes.
This architecture doesn't rely on Azure load balancers. It has the following characteristics:
No public load balancers expose SD-WAN tunnel endpoints. Each SD-WAN NVA exposes its own tunnel endpoint. Remote peers establish multiple tunnels, with one tunnel for each SD-WAN NVA in Azure.
No internal load balancers are required to distribute traffic from Azure VMs across multiple SD-WAN edge devices. Route Server and the Azure SDN stack support equal-cost multi-path (ECMP) routing. If multiple edge devices announce a route for the same destination, Route Server injects multiple routes for that destination into the virtual network's route table, one for each edge device that announced it. Each route's next hop is the IP address of the edge device that announced it. The SDN stack then distributes traffic for that destination across all available next hops.
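The ECMP behavior can be modeled in a few lines. This is a simplified simulation, not the Azure implementation: the article doesn't specify the hashing scheme that the SDN stack uses, so the SHA-256 flow hash here is an assumption chosen only to make the example deterministic. All names and addresses are hypothetical.

```python
import hashlib


def build_route_table(advertisements):
    """Map each prefix to the list of NVA next hops that announced it."""
    table = {}
    for prefix, next_hop in advertisements:
        hops = table.setdefault(prefix, [])
        if next_hop not in hops:  # ignore duplicate announcements
            hops.append(next_hop)
    return table


def pick_next_hop(next_hops, flow):
    """Choose a next hop for a flow; packets of the same flow get the same hop."""
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]
```

Hashing on the flow keeps all packets of one connection on the same NVA in a given direction, while different connections spread across every NVA that announced the prefix.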
The following figure shows the resulting HA architecture.
N-active versus active-standby HA
When you use multiple SD-WAN NVAs and peer them with Route Server, BGP drives the failover. If an SD-WAN NVA goes offline, it stops advertising routes to Route Server. Route Server then withdraws the routes that it learned from that device from the virtual network's route table. As a result, if an SD-WAN NVA no longer provides connectivity to remote SD-WAN sites because of a fault in the device or in the underlay network, it no longer appears as a possible next hop toward those sites in the virtual network's route table. All the traffic goes to the remaining healthy devices. For more information about route propagation between SD-WAN NVAs and Route Server, see Routes advertised by a BGP peer to Route Server.
The following diagram shows this failover behavior.
BGP-driven failover and ECMP routing enable N-active HA architectures in which N devices process traffic concurrently. You can also implement active-passive architectures because Route Server honors the BGP AS Path attribute. If different SD-WAN edge devices announce routes for the same destinations with different AS Path lengths, the device that announces the shortest AS Path becomes the preferred next hop. If that device fails or withdraws some of its routes, Route Server falls back to the routes with longer AS Paths that other devices announce. AS Path is the only BGP attribute that SD-WAN NVAs can use to express a degree of preference for the routes that they announce to Route Server.
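The AS Path selection and failover logic can be sketched as follows. This is a simplified model under stated assumptions: it considers a single destination prefix, compares AS Path length only, and uses hypothetical NVA addresses and private AS numbers.

```python
def preferred_next_hops(advertisements: dict[str, list[int]]) -> list[str]:
    """Return the NVAs that announce the shortest AS Path for one prefix.

    `advertisements` maps an NVA IP address to the AS Path it announced.
    Several NVAs with equally short paths are all returned (N-active, ECMP).
    """
    if not advertisements:
        return []
    shortest = min(len(path) for path in advertisements.values())
    return sorted(ip for ip, path in advertisements.items() if len(path) == shortest)


# Active-passive: the standby NVA prepends its own AS number twice.
announcements = {
    "10.0.1.4": [65010],                # active
    "10.0.1.5": [65010, 65010, 65010],  # standby, prepended AS Path
}
active = preferred_next_hops(announcements)

# The active NVA fails and its route is withdrawn: the standby takes over.
del announcements["10.0.1.4"]
after_failure = preferred_next_hops(announcements)
```

In this sketch, `active` contains only 10.0.1.4 and `after_failure` contains only 10.0.1.5. If both NVAs announced paths of equal length, both would be returned and traffic would be shared between them.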
We recommend N-active HA architectures because they use resources optimally, with no standby SD-WAN NVAs, and they scale horizontally. To increase throughput, multiple NVAs can run in parallel, up to the maximum number of BGP peers that Route Server supports. The N-active HA model requires the SD-WAN NVAs to act as stateless, layer-3 routers because, when multiple tunnels to a site exist, TCP connections can be routed asymmetrically. The original and reply flows of the same TCP connection can traverse different tunnels and different NVAs. The following figure shows an example of an asymmetrically routed TCP connection. These routing asymmetries are possible for TCP connections initiated from either the virtual network or an on-premises site.
Consider active-passive HA architectures only when SD-WAN NVAs in Azure perform network functions that require routing symmetry, such as stateful firewall inspection. Where possible, avoid this approach because of its scalability implications. Running more network functions on SD-WAN NVAs increases resource consumption, and active-passive HA architectures allow only one NVA to process traffic at any point in time. As a result, the SD-WAN layer can only be scaled up to the largest Azure VM size that the SD-WAN NVA supports, not scaled out. Instead, implement stateful network functions that require routing symmetry on separate NVA clusters that rely on Azure Load Balancer for N-active HA.
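The reason that N-active designs require stateless NVAs can be shown with a toy model. The class below is hypothetical and represents a generic stateful device that admits a reply only if it forwarded the matching request. It isn't a model of any specific SD-WAN product.

```python
class StatefulNva:
    """Toy stateful device: admits a reply only if it saw the original request."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def forward_request(self, flow):
        # Record the flow as (src_ip, src_port, dst_ip, dst_port).
        self.sessions.add(flow)

    def accepts_reply(self, reply):
        # A reply matches a session when its endpoints are the request's, reversed.
        src_ip, src_port, dst_ip, dst_port = reply
        return (dst_ip, dst_port, src_ip, src_port) in self.sessions


nva_a = StatefulNva("nva-a")
nva_b = StatefulNva("nva-b")

request = ("10.1.0.4", 50000, "192.168.10.7", 443)
reply = ("192.168.10.7", 443, "10.1.0.4", 50000)

nva_a.forward_request(request)                    # request leaves Azure through nva-a
symmetric_ok = nva_a.accepts_reply(reply)         # reply returns through nva-a
asymmetric_dropped = not nva_b.accepts_reply(reply)  # reply returns through nva-b
```

When the reply returns through a different NVA than the request, the second device has no session state for it. A stateless layer-3 router forwards the reply regardless, which is why the N-active model depends on stateless forwarding.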
ExpressRoute connectivity considerations
This architecture supports a full SD-WAN approach, so you can build your corporate network as a logical overlay on top of the public internet and the Microsoft backbone. You can also use dedicated ExpressRoute circuits to address specific scenarios that are described in the following sections.
Scenario #1: ExpressRoute and SD-WAN coexistence
SD-WAN solutions can coexist with ExpressRoute connectivity when SD-WAN devices are deployed only in a subset of sites. For example, some organizations might deploy SD-WAN solutions as a replacement for traditional IPsec VPNs in sites that have internet connectivity only, and use MPLS services and ExpressRoute circuits for large sites and datacenters, as the following figure shows.
This coexistence scenario requires SD-WAN NVAs deployed in Azure to route traffic between sites connected to the SD-WAN and sites connected to ExpressRoute circuits. You can configure Route Server to propagate routes between ExpressRoute virtual network gateways and SD-WAN NVAs in Azure by enabling the AllowBranchToBranch feature. Route propagation between the ExpressRoute virtual network gateway and the SD-WAN NVAs occurs over BGP. Route Server establishes BGP sessions with the ExpressRoute virtual network gateway and with the SD-WAN NVAs, and it propagates to each peer the routes that it learns from the other peer. The platform manages the BGP sessions between Route Server and the ExpressRoute virtual network gateway. Users don't need to configure those sessions explicitly. They only need to enable the AllowBranchToBranch flag when they deploy Route Server.
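The effect of the AllowBranchToBranch flag on route propagation can be summarized in a short model. This is a conceptual sketch only: the function and dictionary keys are hypothetical, and the model covers just the relay of routes between the two kinds of peers, not the virtual network prefixes that are exchanged separately.

```python
def branch_routes_relayed(learned_from_expressroute, learned_from_sdwan, allow_branch_to_branch):
    """Return the routes that Route Server relays to each kind of peer."""
    if not allow_branch_to_branch:
        # Each peer's routes reach the virtual network only; nothing is relayed.
        return {"to_sdwan_nvas": [], "to_expressroute_gateway": []}
    return {
        "to_sdwan_nvas": list(learned_from_expressroute),
        "to_expressroute_gateway": list(learned_from_sdwan),
    }
```

With the flag enabled, SD-WAN NVAs learn the routes of ExpressRoute-connected sites and the ExpressRoute gateway learns the routes of SD-WAN-connected sites, which is what lets the two groups of sites reach each other through Azure.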
This SD-WAN and ExpressRoute coexistence scenario enables migrations from MPLS networks to SD-WAN. It provides a path between legacy MPLS sites and newly migrated SD-WAN sites and eliminates the need to route traffic through on-premises datacenters. Use this pattern during migrations and in scenarios that occur from company mergers and acquisitions to interconnect disparate networks.
Scenario #2: ExpressRoute as an SD-WAN underlay network
If your on-premises sites have ExpressRoute connectivity, you can configure SD-WAN devices to set up tunnels to the SD-WAN hub NVAs that run in Azure on top of ExpressRoute. You can use both ExpressRoute private peering and Microsoft peering.
Private peering
When you use ExpressRoute private peering as the underlay network, all on-premises SD-WAN sites establish tunnels to the SD-WAN hub NVAs in Azure. This scenario doesn't require route propagation between the SD-WAN NVAs and the ExpressRoute virtual network gateway, so you must configure Route Server with the AllowBranchToBranch flag set to false.
This approach requires proper BGP configuration on the customer- or provider-side routers that terminate the ExpressRoute connection. Microsoft Enterprise Edge routers (MSEEs) announce all the routes for the virtual networks that are connected to the circuit, either directly or through virtual network peering. To forward traffic destined for virtual networks through an SD-WAN tunnel, the on-premises site must learn those routes from the SD-WAN device, not from the ExpressRoute circuit.
As a result, the customer-side or provider-side routers that terminate the ExpressRoute connection must filter out the routes that they receive from Azure. The only routes in the underlay network should allow the on-premises SD-WAN devices to reach the SD-WAN hub NVAs in Azure. Customers who plan to use ExpressRoute private peering as an SD-WAN underlay network should verify that their routing devices support this configuration. This requirement is especially relevant for customers who don't control the edge devices used for ExpressRoute, such as when an MPLS carrier provides the ExpressRoute circuit on top of an IPVPN service.
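The filtering requirement can be expressed as a simple rule: keep a route received from Azure only if it's needed to reach an SD-WAN hub NVA. The following sketch assumes hypothetical prefixes and NVA addresses, and it illustrates the intent of the filter rather than any router's configuration syntax.

```python
import ipaddress


def filter_underlay_routes(received_prefixes, nva_endpoints):
    """Keep only the received prefixes that contain an SD-WAN hub NVA endpoint."""
    endpoints = [ipaddress.ip_address(e) for e in nva_endpoints]
    kept = []
    for prefix in received_prefixes:
        network = ipaddress.ip_network(prefix)
        if any(endpoint in network for endpoint in endpoints):
            kept.append(prefix)
    return kept
```

For example, if the circuit announces 10.0.0.0/16, 10.1.0.0/16, and 10.2.0.0/16 and the hub NVAs use 10.0.1.4 and 10.0.1.5, only 10.0.0.0/16 stays in the underlay. The on-premises site then learns the other prefixes from the SD-WAN device over the overlay.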
Microsoft peering
You can also use the ExpressRoute Microsoft peering as an underlay network for SD-WAN tunnels. In this scenario, the SD-WAN hub NVAs in Azure expose only public tunnel endpoints, which SD-WAN customer premises equipment (CPEs) in both internet-connected sites and ExpressRoute-connected sites use. The ExpressRoute Microsoft peering has more complex prerequisites than private peering, but we recommend this option as an underlay network for the following two reasons:
It doesn't require ExpressRoute virtual network gateways in the hub virtual network. It removes complexity, reduces cost, and lets the SD-WAN solution scale beyond the bandwidth limits of the gateway when you don't use ExpressRoute FastPath.
This approach provides a clear separation between overlay and underlay routes. MSEEs announce only the Microsoft network's public prefixes to the customer or provider edge. You can place those routes in a separate virtual routing and forwarding (VRF) instance and propagate them only to a perimeter network segment of the site's LAN. SD-WAN devices propagate the routes for the customer's corporate network in the overlay, including routes for virtual networks. Customers who consider this approach should verify that they can configure their routing devices accordingly or request the appropriate service from their MPLS carrier.
MPLS considerations
Migration from traditional MPLS corporate networks to more modern network architectures based on the SD-WAN paradigm requires significant effort and time. Use this architecture to implement phased migrations from MPLS to SD-WAN. The following sections describe two typical migration scenarios.
Phased MPLS decommissioning
Customers who want to build an SD-WAN on top of the public internet and the Microsoft backbone and completely decommission MPLS IPVPNs or other dedicated connectivity services can use the ExpressRoute and SD-WAN coexistence scenario during migration. In this scenario, SD-WAN-connected sites can reach sites connected to the legacy MPLS. After you migrate a site to the SD-WAN and deploy CPE devices, you can decommission its MPLS link. The site can access the entire corporate network through its SD-WAN tunnels to the closest Azure regions.
When all sites are migrated, you can decommission the MPLS IPVPN along with the ExpressRoute circuits that connect it to the Microsoft backbone. You no longer need ExpressRoute virtual network gateways and can deprovision them. The SD-WAN hub NVAs in each region become the only entry point into that region's hub-and-spoke network.
MPLS integration
Organizations that don't trust public and shared networks to provide the desired performance and reliability might decide to use an existing MPLS network as an enterprise-class underlay for specific sites or applications.
The ExpressRoute as an SD-WAN underlay scenario supports SD-WAN and MPLS integration. Prefer ExpressRoute Microsoft peering over private peering. When you use Microsoft peering, the MPLS network and the public internet become functionally equivalent underlays. They provide access to all of the SD-WAN tunnel endpoints that the SD-WAN hub NVAs in Azure expose. An SD-WAN CPE deployed in a site that has both internet and MPLS connectivity can establish multiple tunnels to the SD-WAN hubs in Azure on both underlays. The CPE can then route different connections through different tunnels based on application-level policies that the SD-WAN control plane manages.
Route Server routing preference
In both MPLS scenarios in the previous two sections, some branch sites can be connected to both the MPLS IPVPN and the SD-WAN. As a result, the Route Server instances deployed in the hub virtual networks can learn the same routes from ExpressRoute gateways and SD-WAN NVAs.
Use Route Server routing preference to control which path Route Server prefers and installs in the virtual networks' route tables.
Routing preference is useful when you can't use AS Path prepending. An example is MPLS IPVPN services that don't support custom BGP configurations. Depending on how the MPLS network aggregates routes, the level of control you have over the attributes of your MPLS routes, and your preference between SD-WAN and MPLS during the migration, you might need to force Route Server to prefer MPLS routes over SD-WAN routes or SD-WAN routes over MPLS routes.
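The decision that routing preference influences can be sketched as follows. This is a conceptual model under stated assumptions: the preference labels `"expressroute"`, `"sdwan"`, and `"as_path"` are illustrative names, not Route Server API values, and the model ignores every other step of BGP best-path selection.

```python
def preferred_routes(candidates, preference):
    """Pick the routes to install for one prefix learned from several sources.

    Each candidate is a dict with a "source" ("expressroute" or "sdwan")
    and an "as_path_len" integer.
    """
    if not candidates:
        return []
    if preference == "as_path":
        best = min(c["as_path_len"] for c in candidates)
        return [c for c in candidates if c["as_path_len"] == best]
    if preference not in ("expressroute", "sdwan"):
        raise ValueError(f"unknown preference: {preference}")
    matching = [c for c in candidates if c["source"] == preference]
    # Fall back to whatever is available if the preferred source has no route.
    return matching or list(candidates)
```

A source-based preference lets you favor MPLS or SD-WAN paths even when the MPLS carrier doesn't let you adjust AS Path lengths.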
Route Server limits and design considerations
Route Server is central to this architecture. It propagates routes between SD-WAN NVAs deployed in virtual networks and the underlying Azure SDN stack. It provides a BGP-based approach for running multiple SD-WAN NVAs for HA and horizontal scalability. When you design large SD-WANs based on this architecture, account for the scalability limits of Route Server.
The following sections provide guidance about scalability maximums and how to handle each limit.
Routes advertised by a BGP peer to Route Server
Route Server doesn't define an explicit limit for the number of routes that can be advertised to ExpressRoute virtual network gateways when the AllowBranchToBranch flag is set. However, ExpressRoute gateways further propagate the routes that they learn from Route Server to the ExpressRoute circuits that they connect to.
Azure limits the number of routes that ExpressRoute gateways can advertise to ExpressRoute circuits over private peering. When you design SD-WAN solutions based on the guidance in this article, ensure that SD-WAN routes don't reach this limit. If you reach the limit, the BGP sessions between ExpressRoute gateways and ExpressRoute circuits are dropped, and connectivity between virtual networks and remote networks connected via ExpressRoute is lost.
The total number of routes that ExpressRoute gateways advertise to circuits is the sum of the routes that they learn from Route Server and the prefixes that make up the Azure hub-and-spoke network's address space. To avoid outages from dropped BGP sessions, we recommend the following mitigations:
Use native SD-WAN device features (route summarization and filtering) to limit the number of routes announced to Route Server, if available.
Use Azure Monitor alerts to proactively detect spikes in the number of routes that ExpressRoute gateways announce. Monitor the Count of routes advertised to peer metric.
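The route budget and the effect of summarization can be estimated ahead of time. The following sketch uses Python's standard `ipaddress` module to merge contiguous prefixes and then computes how much of the limit is consumed. The `route_limit` value is a placeholder that you supply from the current Azure limits documentation, and the function names are hypothetical.

```python
import ipaddress


def summarize(prefixes):
    """Merge contiguous or overlapping prefixes into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


def route_budget_used(sdwan_routes, vnet_prefixes, route_limit):
    """Fraction of the advertised-route limit consumed on one gateway."""
    if route_limit <= 0:
        raise ValueError("route_limit must be positive")
    return (len(sdwan_routes) + len(vnet_prefixes)) / route_limit
```

For example, summarizing 192.168.0.0/24 through 192.168.3.0/24 produces the single prefix 192.168.0.0/22, so four announced routes become one. An alert threshold set well below a budget of 1.0 leaves time to react before the BGP sessions are dropped.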
BGP peers
Route Server can establish BGP sessions with up to a maximum number of BGP peers. This limit determines how many SD-WAN NVAs can establish BGP adjacencies with Route Server. It also defines the maximum aggregate throughput that can be supported across all SD-WAN tunnels. Only large SD-WANs are expected to reach this limit. No workaround exists beyond creating multiple hub-and-spoke networks that each has its own gateways and route servers.
Participating VMs
ExpressRoute virtual network gateways and Route Server configure the routes that they learn from their remote peers for all VMs in their own virtual network and in directly peered virtual networks. To protect Route Server from excessive resource consumption from routing updates to VMs, Azure defines a limit on the number of VMs in a single hub-and-spoke network. Adjust Route Server capacity based on the expected number of VMs in the hub virtual network that contains the Route Server and in all directly peered spoke virtual networks.
Contributors
Microsoft maintains this article. The following contributors wrote this article.
Principal authors:
- Federico Guerrini | Senior Cloud Solution Architect
- Khush Kaviraj | Cloud Solution Architect