Configuring and troubleshooting BGP can be complex. A BGP administrator must understand the various options involved in properly configuring BGP for scalable internetworking. This chapter introduces BGP terminology and concepts, and provides BGP configuration, verification, and troubleshooting techniques. The chapter also introduces route maps for manipulating BGP path attributes and filters for BGP routing updates.
BGP Terminology, Concepts, and Operation
This section provides an introduction to BGP and an explanation of various BGP terminology and concepts.
Autonomous Systems
To understand BGP, you first need to understand autonomous systems and how BGP differs from the other protocols discussed so far in this book.
One way to categorize routing protocols is by whether they are interior or exterior:
-
Interior gateway protocol (IGP)— A routing protocol that exchanges routing information within an autonomous system (AS). Routing Information Protocol (RIP), Open Shortest Path First (OSPF), Intermediate System-to-Intermediate System (IS-IS), and Enhanced Interior Gateway Routing Protocol (EIGRP) are examples of IGPs for IP.
-
Exterior gateway protocol (EGP)— A routing protocol that exchanges routing information between different autonomous systems. BGP is an example of an EGP.
BGP is an interdomain routing protocol (IDRP), which is also known as an EGP. All the routing protocols you have seen so far in this book are IGPs.
Figure 6-1 illustrates the concept of IGPs and EGPs.
Note | The term IDRP as used in this sense is a generic term, not the IDRP defined in ISO/IEC International Standard 10747, Protocol for the Exchange of Inter-Domain Routing Information Among Intermediate Systems to Support Forwarding of ISO 8473 PDUs. |
BGP Version 4 (BGP-4) is the latest version of BGP. It is defined in RFC 4271, A Border Gateway Protocol 4 (BGP-4). As noted in this RFC, the classic definition of an autonomous system is “a set of routers under a single technical administration, using an Interior Gateway Protocol (IGP) and common metrics to determine how to route packets within the autonomous system, and using an inter-autonomous system routing protocol to determine how to route packets to other autonomous systems.”
Autonomous systems might use more than one IGP, with potentially several sets of metrics. The important characteristic of an autonomous system from the BGP point of view is that the autonomous system appears to other autonomous systems to have a single coherent interior routing plan, and it presents a consistent picture of which destinations can be reached through it. All parts of the autonomous system must be connected to each other.
The Internet Assigned Numbers Authority (IANA) is the umbrella organization responsible for allocating autonomous system numbers. Regional Internet registries (RIRs) are nonprofit corporations established for the purpose of administration and registration of IP address space and autonomous system numbers. There are five RIRs, as follows:
-
African Network Information Centre (AfriNIC) is responsible for the continent of Africa.
-
Asia Pacific Network Information Centre (APNIC) administers the numbers for the Asia Pacific region.
-
American Registry for Internet Numbers (ARIN) has jurisdiction over assigning numbers for Canada, the United States, and several islands in the Caribbean Sea and North Atlantic Ocean.
-
Latin American and Caribbean IP Address Regional Registry (LACNIC) is responsible for allocation in Latin America and portions of the Caribbean.
-
Réseaux IP Européens Network Coordination Centre (RIPE NCC) administers the numbers for Europe, the Middle East, and Central Asia.
The autonomous system designator is a 16-bit number, with a range of 1 to 65,535. RFC 1930, Guidelines for Creation, Selection, and Registration of an Autonomous System (AS), provides guidelines for the use of autonomous system numbers. A range of autonomous system numbers, 64,512 to 65,535, is reserved for private use, much like the private IP addresses.
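The ranges above can be checked mechanically. The following is a minimal Python sketch that uses the 16-bit range and the private range exactly as stated in this section; the function and constant names are illustrative and are not part of any Cisco tool.

```python
# Ranges taken from the text above; all names here are illustrative.
AS_MIN, AS_MAX = 1, 65535
PRIVATE_AS_MIN, PRIVATE_AS_MAX = 64512, 65535


def is_private_as(asn: int) -> bool:
    """Return True if a 16-bit AS number falls in the private-use range."""
    if not AS_MIN <= asn <= AS_MAX:
        raise ValueError(f"{asn} is outside the 16-bit range {AS_MIN}-{AS_MAX}")
    return PRIVATE_AS_MIN <= asn <= PRIVATE_AS_MAX


if __name__ == "__main__":
    for asn in (64511, 64512, 65500):
        kind = "private" if is_private_as(asn) else "not private"
        print(asn, "is", kind)
```

A check like this could be used, for example, to confirm that every autonomous system number in a lab topology is private before the configuration is published.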
Note | All of the examples and exercises in this book use private autonomous system numbers, to avoid publishing autonomous system numbers belonging to an organization. |
Note | RFC 4893, BGP Support for Four-Octet AS Number Space, preparing for the anticipated depletion of BGP 16-bit autonomous system numbers, describes extensions to BGP to use a 32-bit autonomous system number. The Cisco document “Explaining 4-Octet Autonomous System (AS) Numbers for Cisco IOS,” available at http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6554/ps6599/white_paper_C11_516823.html, explains how the new numbering scheme can be implemented in Cisco routers. |
You need to use the IANA-assigned autonomous system number, rather than a private autonomous system number, only if your organization plans to use an EGP, such as BGP, to connect to a public network such as the Internet.
BGP Use Between Autonomous Systems
BGP is used between autonomous systems, as illustrated in Figure 6-2.
The main goal of BGP is to provide an interdomain routing system that guarantees the loop-free exchange of routing information between autonomous systems. BGP routers exchange information about paths to destination networks.
Note | BGP is a successor to Exterior Gateway Protocol (EGP). (Note the dual use of the EGP acronym.) The EGP protocol was developed to isolate networks from each other at the early stages of the Internet. |
There is a distinction between an ordinary autonomous system and one that has been configured with BGP to implement a transit policy. The latter is called an ISP or a service provider.
Many RFCs relate to BGP-4, including those listed in Table 6-1.
RFC | Title |
---|---|
RFC 1772 | An Application of BGP on the Internet |
RFC 1773 | Experience with the BGP-4 Protocol |
RFC 1774 | BGP-4 Protocol Analysis |
RFC 1930 | Guidelines for Creation, Selection, and Registration of an Autonomous System (AS) |
RFC 1997 | BGP Communities Attribute |
RFC 1998 | Application of the BGP Community Attribute in Multihome Routing |
RFC 2042 | Registering New BGP Attribute Types |
RFC 2385 | Protection of BGP Sessions via TCP MD5 Signature Option |
RFC 2439 | BGP Route Flap Damping |
RFC 2545 | Use of BGP-4 Multiprotocol Extensions for IPv6 Interdomain Routing |
RFC 2918 | Route Refresh Capability for BGP-4 |
RFC 3107 | Carrying Label Information in BGP-4 |
RFC 4223 | Reclassification of RFC 1863 to Historic |
RFC 4271 | A Border Gateway Protocol 4 (BGP-4) |
RFC 4364 | BGP/MPLS IP Virtual Private Networks (VPNs) (updated by RFC 4577, RFC 4684, RFC 5462) |
RFC 4456 | BGP Route Reflection: An Alternative to Full Mesh Internal BGP (IBGP) |
RFC 4577 | OSPF as the Provider/Customer Edge Protocol for BGP/MPLS IP Virtual Private Networks (VPNs) |
RFC 4684 | Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks (VPNs) |
RFC 4760 | Multiprotocol Extensions for BGP-4 |
RFC 4893 | BGP Support for Four-Octet AS Number Space |
RFC 5065 | Autonomous System Confederations for BGP |
RFC 5462 | Multiprotocol Label Switching (MPLS) Label Stack Entry: EXP Field Renamed to Traffic Class Field |
RFC 5492 | Capabilities Advertisement with BGP-4 |
Note | You can search for RFCs by number at http://www.rfc-editor.org/rfcsearch.html. |
BGP-4 has many enhancements over earlier protocols. It is used extensively on the Internet today to connect ISPs and to interconnect enterprises to ISPs.
BGP-4 and its extensions are the only acceptable version of BGP available for use on the public Internet. BGP-4 carries a network mask for each advertised network and supports both variable-length subnet mask (VLSM) and classless interdomain routing (CIDR). BGP-4 predecessors did not support these capabilities, which are currently mandatory on the Internet. When CIDR is used on a core router for a major ISP, the IP routing table, which is composed mostly of BGP routes, has more than 300,000 CIDR blocks. Not using CIDR at the Internet level would cause the IP routing table to have more than 2,000,000 entries. Using CIDR, and, therefore, BGP-4, prevents the Internet routing table from becoming too large for interconnecting millions of users.
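The effect of CIDR on table size can be illustrated with a small Python sketch. It uses the standard `ipaddress` module to aggregate four contiguous /24 networks into one CIDR block; the 10.1.x.0/24 prefixes are hypothetical and chosen only for illustration.

```python
import ipaddress

# Four hypothetical contiguous networks that a provider could advertise separately.
specific_routes = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(4)]

# collapse_addresses merges adjacent or overlapping networks into the
# smallest equivalent set of CIDR blocks.
aggregated = list(ipaddress.collapse_addresses(specific_routes))

print(len(specific_routes), "routes before aggregation")
print(len(aggregated), "route after aggregation:", aggregated[0])
```

Here the four specific routes are represented by the single block 10.1.0.0/22, which is the kind of reduction that keeps the Internet routing table manageable.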
Comparison with Other Scalable Routing Protocols
Table 6-2 compares some of BGP’s key characteristics to the other scalable IP routing protocols (including EIGRP and OSPF, which are discussed in this book).
Protocol | Interior or Exterior | Type | Hierarchy Required? | Metric |
---|---|---|---|---|
OSPF | Interior | Link state | Yes | Cost |
IS-IS | Interior | Link state | Yes | Metric |
EIGRP | Interior | Advanced distance vector | No | Composite |
BGP | Exterior | Path vector | No | Path vectors (attributes) |
As shown in Table 6-2, OSPF, IS-IS, and EIGRP are interior protocols, whereas BGP is an exterior protocol.
Chapter 1, “Routing Services,” discusses the characteristics of distance vector and link-state routing protocols. OSPF and IS-IS are link-state protocols, whereas EIGRP is an advanced distance vector protocol. BGP is also a distance vector protocol, with many enhancements; it is also called a path vector protocol.
Most link-state routing protocols, including OSPF and IS-IS, require a hierarchical design as the network expands, especially to support proper address summarization. For OSPF and IS-IS this hierarchical design is implemented by separating a large internetwork into smaller internetworks called areas. EIGRP and BGP do not require a hierarchical topology.
BGP works differently than IGPs. Internal routing protocols look at the path cost to get somewhere and choose the best path from one point in a corporate network to another based on certain metrics. RIP uses hop count and looks to cross the fewest Layer 3 devices to reach the destination network. OSPF uses cost, which on Cisco routers is based on bandwidth, as its metric. The IS-IS metric is typically based on bandwidth (but it defaults to 10 on all interfaces on Cisco routers). EIGRP uses a composite metric, with bandwidth and accumulated delay considered by default.
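To make the IGP metrics concrete, the following Python sketch computes an OSPF cost and an EIGRP composite metric. It assumes the commonly documented Cisco defaults, which are not spelled out in this section: an OSPF reference bandwidth of 100 Mbps, and the classic EIGRP formula with default K values (slowest bandwidth plus accumulated delay, scaled by 256). The function names are illustrative.

```python
def ospf_cost(bandwidth_bps: int, reference_bps: int = 100_000_000) -> int:
    """OSPF cost: reference bandwidth divided by interface bandwidth, minimum 1.

    The 100-Mbps reference is the assumed Cisco default.
    """
    return max(1, reference_bps // bandwidth_bps)


def eigrp_metric(min_bandwidth_kbps: int, total_delay_usec: int) -> int:
    """Classic EIGRP composite metric with default K values (assumed).

    Uses the slowest link bandwidth along the path (in kbps) and the
    accumulated delay (in microseconds, counted in tens of microseconds).
    """
    return 256 * (10_000_000 // min_bandwidth_kbps + total_delay_usec // 10)


if __name__ == "__main__":
    print("OSPF cost, T1 link:", ospf_cost(1_544_000))
    print("OSPF cost, Fast Ethernet:", ospf_cost(100_000_000))
    print("EIGRP metric, T1 link with 20,000 usec delay:",
          eigrp_metric(1544, 20_000))
```

Both calculations depend only on link characteristics, which is the contrast with BGP drawn in the next paragraph.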
In contrast, BGP does not look at speed for the best path. Rather, BGP is a policy-based routing protocol that allows an autonomous system to control traffic flow using multiple BGP attributes. Routers running BGP exchange network reachability information, called path vectors or attributes, including a list of the full path of BGP autonomous system numbers that a router should take to reach a destination network. BGP allows an organization to fully use all of its bandwidth by manipulating these path attributes.
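The list of autonomous system numbers carried with each route is also what keeps BGP loop free. The Python sketch below models two behaviors defined for BGP-4: a router prepends its own autonomous system number when it advertises a route to another autonomous system, and it discards any route whose path already contains its own number. The function names are illustrative, and the autonomous system numbers are private values chosen for the example.

```python
from typing import List, Optional


def advertise_to_ebgp_neighbor(local_as: int, as_path: List[int]) -> List[int]:
    """Prepend the local AS number before sending the path to another AS."""
    return [local_as] + as_path


def receive_update(local_as: int, as_path: List[int]) -> Optional[List[int]]:
    """Accept a path unless the local AS already appears in it (a loop)."""
    if local_as in as_path:
        return None
    return as_path


if __name__ == "__main__":
    # AS 65500 originates a network, so its own path starts empty.
    path = advertise_to_ebgp_neighbor(65500, [])
    # AS 65000 accepts the route and passes it on to AS 65250.
    path = advertise_to_ebgp_neighbor(65000, receive_update(65000, path))
    # AS 65250 accepts the route and offers it back to AS 65500.
    path = advertise_to_ebgp_neighbor(65250, receive_update(65250, path))
    print("Path offered back to AS 65500:", path)
    print("Accepted by AS 65500:", receive_update(65500, path) is not None)
```

Because AS 65500 finds its own number in the path, it rejects the route, so the advertisement cannot circulate back to its origin.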
Connecting Enterprise Networks to an ISP
Modern corporate IP networks connect to the global Internet, use the Internet for some of their data transport needs, and provide services via the Internet to customers and business partners. To meet these needs, systems from web servers to mainframes to workstations are required to be accessible from anywhere in the world.
Requirements that must be determined for connecting an enterprise to an ISP include the following:
-
Public IP address space— In the rare case that only one-way connectivity, from the clients to the Internet, is required, private IP addresses with Network Address Translation (NAT) are used, allowing clients on a private network to communicate with servers on the public Internet. Typically, though, two-way connectivity is needed, such that clients external to the enterprise network can access resources in the enterprise network. In this case, both public and private address space is needed, as is routing.
-
Enterprise-to-ISP connection link type and bandwidth— The type and bandwidth available depend on the ISP and may include leased line, Ethernet over fiber or copper, and various types of digital subscriber line (DSL) (also known as xDSL). The bandwidth provisioned should address the enterprise Internet connectivity requirements.
-
Routing protocol— Either static or dynamic routing.
-
Connection redundancy— The type of redundancy required for the enterprise network to ISP connectivity must be evaluated. Options include edge router redundancy, link redundancy, and ISP redundancy.
These requirements are discussed more fully in the following sections.
Public IP Address Space
Public IP addresses are used to translate client private addresses for those clients that need to access resources on the Internet. They are also used for enterprise servers that need to be accessible from the Internet; these servers are either configured with public addresses or with private addresses that are statically translated to public addresses. If the enterprise network needs to be independent of the selected ISPs, the enterprise should not use addresses from an ISP’s public address space but must instead acquire its own public address space from a regional Internet registry (RIR). Similarly, a public autonomous system number would be required, rather than using an autonomous system number assigned by the ISP.
Connection Link Type and Routing
Connecting an enterprise network to one or more ISPs requires routing information to be exchanged between them. How that routing information is exchanged depends on the requirements, such as the answer to the following questions:
-
Does the routing need to respond to the changes in the network topology, such as when a link goes down?
-
Will the enterprise network be connected to multiple ISPs?
-
Does the routing need to support one link to an ISP or multiple links, to one or multiple ISPs?
-
Is traffic load balancing over multiple links required?
-
Does the ISP offer only a transport capability for connecting the customer’s locations via Layer 2 technologies?
-
Which routing options does the ISP offer?
-
How much routing information needs to be exchanged with the ISP?
The following sections describe two examples of connecting an enterprise network to one or more ISPs, using Layer 2 circuit emulation and Layer 3 Multiprotocol Label Switching (MPLS) virtual private networks (VPNs). Following those sections, the use of static routes and BGP routing is explored.
Using Layer 2 Circuit Emulation
Layer 2 connectivity may be needed between two or more locations, such as in the following examples:
-
The locations include data centers with geographically distributed clusters that require Layer 2 connectivity to function properly.
-
The enterprise is in the process of migrating to a Layer 3 solution, but still requires Layer 2 connectivity.
-
The enterprise connects to another partner with which it requires Layer 2 connectivity.
As described in Chapter 2, “Configuring the Enhanced Interior Gateway Routing Protocol,” service providers offer many Layer 2 services, including Ethernet, Frame Relay, Point-to-Point Protocol (PPP), High-Level Data Link Control (HDLC), and ATM.
When MPLS VPNs were introduced, they provided a unified network for Layer 3 VPN services. For customers still wanting Layer 2 connections, Ethernet virtual LAN (VLAN) extensions across a metropolitan area or ATM services could be deployed. Any Transport over MPLS (AToM) was introduced to facilitate this Layer 2 connectivity across an MPLS backbone.
AToM enables sending Layer 2 frames across an MPLS backbone. It unifies Layer 2 and Layer 3 offerings over a common MPLS infrastructure. In AToM, virtual circuits (VCs) represent Layer 2 links, and MPLS labels identify VCs.
AToM allows ISPs that currently offer Layer 2 connectivity to customers with traditional offerings such as ATM, Frame Relay, and serial PPP services, and those specializing in Ethernet connectivity in metropolitan areas, to expand their offerings. These Layer 2 VPN services appeal to the ISP’s enterprise customers who may already run their own networks and desire only point-to-point connectivity between sites.
Figure 6-3 illustrates multiple sites of Company A connected via a Layer 2 MPLS VPN. This Layer 2 MPLS VPN provides a Layer 2 service across the backbone, where all of Company A’s edge routers are connected together on the same IP subnet. From Company A’s perspective, the ISP is providing a Layer 2 port (an Ethernet switch port). From the ISP’s perspective, it is connecting two ports together. There is no routing exchange between the ISP and Company A.
Using Layer 3 MPLS VPNs
Figure 6-4 illustrates multiple sites of Company A connected via a Layer 3 MPLS VPN. The Layer 3 MPLS VPN provides a Layer 3 service across the backbone, where all of Company A’s edge routers are connected to ISP edge routers. A separate IP subnet is used for the connection of each edge router.
MPLS VPNs are used when a customer has multiple locations that need to be interconnected through an ISP and does not want to use expensive Layer 2 technologies such as leased lines.
With MPLS VPNs, the ISP uses a common IP-based core network enhanced with MPLS technology to provide secure and manageable connectivity for different customers to their geographically dispersed sites. Traffic from different customers of the ISP shares the same physical infrastructure, but is tagged with MPLS labels so that the traffic cannot intermix.
When a customer uses MPLS VPN functionality, routing between the customer and ISP is required, to provide connectivity between the customer locations. The routing used depends on what the ISP supports, but options may include static routes or dynamic routing, including RIP, EIGRP, OSPF, IS-IS, or BGP. Although not typical, different locations could use different routing protocols.
The customer routers are configured for the IGP as if there is a corporate network between them. The ISP and the customer must agree on the IGP parameters; however, these are often governed by the ISP.
With an MPLS VPN deployment, the service provider can also offer Internet connectivity through the same MPLS core network, either through a special Internet VPN or through a global routing table. To exchange the Internet routing information, either BGP or default routes are used.
Using Static Routes
Configuring static routes between a customer’s edge router and an ISP is the simplest way to implement packet forwarding with an ISP. The specific routes configured need to be agreed upon with the ISP and they should not need to be changed very often, because this change would have to be done manually.
Static routes are typically used for Internet connectivity when a customer is connected through a single connection to an ISP, and where that customer can use a default route toward the ISP, as shown in Figure 6-5. The ISP deploys static routes that encompass all the customer’s public networks, and would typically also redistribute this information into its BGP routing protocol. Partial configurations of the routers are also shown in Figure 6-5.
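The requirement that the ISP static routes encompass all the customer’s public networks can be verified with a short Python sketch. The prefixes below come from the address ranges reserved for documentation and are hypothetical, as is the helper name `uncovered`; the actual routes would be those agreed upon with the ISP.

```python
import ipaddress
from typing import List

# Hypothetical customer public networks and the static route the ISP configures.
customer_networks = [
    ipaddress.ip_network("198.51.100.0/25"),
    ipaddress.ip_network("198.51.100.128/25"),
]
isp_static_routes = [ipaddress.ip_network("198.51.100.0/24")]

# The customer side needs only a default route toward the ISP.
customer_default_route = ipaddress.ip_network("0.0.0.0/0")


def uncovered(networks: List[ipaddress.IPv4Network],
              static_routes: List[ipaddress.IPv4Network]) -> List[ipaddress.IPv4Network]:
    """Return the customer networks that no ISP static route encompasses."""
    return [n for n in networks
            if not any(n.subnet_of(route) for route in static_routes)]


if __name__ == "__main__":
    print("Networks missing an ISP static route:",
          uncovered(customer_networks, isp_static_routes))
    print("Default route matches any destination:",
          ipaddress.ip_address("192.0.2.1") in customer_default_route)
```

An empty result means that every customer network is reachable through the ISP static routes. Any network listed would need an additional static route, which is the manual maintenance burden described above.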
Static routes have drawbacks, especially in terms of flexibility and adaptability. For example, if there were a change in a network topology (beyond a directly connected link failure), the static routes would not adapt. The static routes could be combined with Cisco IOS IP Service Level Agreements (SLAs) functionality (described in Chapter 5, “Implementing Path Control”), which could declare a static route down if a certain condition is met. However, IP SLAs cannot react to all changes in the topology in the Internet. Alternatively, dynamic BGP routing can be used.
Using BGP
BGP dynamically exchanges routing information, and thus reacts to topology changes including those changes beyond a customer-to-ISP link failure. Figure 6-6 shows a simple example with a customer provider edge router (CPE) connecting to the ISP provider edge (PE) router.
This option is the focus of the rest of this chapter.
Connection Redundancy
When connecting an enterprise network to an ISP, redundancy can be achieved by deploying redundant links, deploying redundant devices, and using redundant components within a device, such as a router.
The ISP connection can also be made redundant. A customer can be connected to a single ISP or to multiple ISPs. There are various names for these different types of connections:
-
With a connection to a single ISP when no link redundancy is used, the customer is single homed. If the ISP network fails, connectivity to the Internet is interrupted.
-
With a connection to a single ISP, redundancy can be achieved if two links toward the same ISP are used effectively. This is called being dual homed.
-
With connections to multiple ISPs, redundancy is built into the design. A customer connected to multiple ISPs is said to be multihomed, and is thus resistant to a single ISP failure.
-
To enhance the resiliency further with connections to multiple ISPs, a customer can have two links toward each ISP. This solution is called being dual multihomed.
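The four connection types can be summarized as a simple decision rule. The Python sketch below is illustrative only: it takes the number of links toward each ISP and returns the term used in this chapter. The function name and the input format are assumptions made for the example.

```python
from typing import List


def homing_type(links_per_isp: List[int]) -> str:
    """Classify ISP connectivity from the number of links toward each ISP."""
    if not links_per_isp or any(count < 1 for count in links_per_isp):
        raise ValueError("expected at least one ISP, each with at least one link")
    if len(links_per_isp) == 1:
        # One ISP: redundancy depends only on the number of links.
        return "single homed" if links_per_isp[0] == 1 else "dual homed"
    # Several ISPs: dual multihomed requires redundant links toward each ISP.
    if all(count >= 2 for count in links_per_isp):
        return "dual multihomed"
    return "multihomed"


if __name__ == "__main__":
    for design in ([1], [2], [1, 1], [2, 2]):
        print(design, "->", homing_type(design))
```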
Single-Homed ISP Connectivity
Figure 6-7 is an example of single-homed ISP connectivity, used in cases when a loss in Internet connectivity is not problematic for a customer.
In this case, the customer uses a single connection to a single ISP. The connection type depends on the ISP offering, and can be, for example, a leased line, xDSL, or Ethernet. A failure of the link results in no Internet connectivity.
As shown in Figure 6-7, single-homed Internet access does not require BGP. Rather, static routes are typically used, with a static default route from the customer to the ISP, and static routes in the ISP pointing toward customer networks (shown as Option 1 in the figure). If BGP is used (shown as Option 2 in the figure), the customer uses BGP to dynamically announce its public networks to the ISP, and the ISP announces only a default route to the customer, since that is sufficient to provide connectivity.
Dual-Homed ISP Connectivity
As illustrated in Figure 6-8, there are two options for dual homing when a customer has connections to a single ISP. Both links can be connected to one customer router (shown as Option 1 in the figure), or to enhance the resiliency further, the two links can terminate at separate routers in the customer’s network (shown as Option 2 in the figure). In either case, routing must be properly configured to allow both links to be used.
Depending on the SLA signed with the ISP, the routing deployed could achieve either of the following:
-
Primary and backup link functionality, where a single primary link is used to forward and receive traffic to and from the ISP, and the secondary link is used only when the first one fails
-
Load sharing between the links (achieved with Cisco Express Forwarding [CEF] switching)
In both cases, routing can be either static or dynamic (typically BGP).
Multihomed ISP Connectivity
Figure 6-9 illustrates a company connected to two different ISPs. The benefits of doing so include the following:
-
Resistance to a failure beyond a directly connected link to a single ISP.
-
Load sharing for different destination networks between ISPs, based on the network proximity.
-
Scalability of the solution, beyond two ISPs.
-
Achieving an ISP-independent solution. For example, although an ISP change would require an update to the routing and link configuration, and changing the link, the public IP address space used would remain the same.
Connections from different ISPs can terminate on the same router, or on different routers to further enhance the resiliency. The routing must be capable of reacting to dynamic changes; BGP is typically used.
Dual-Multihomed ISP Connectivity
Figure 6-10 illustrates a company that is dual multihomed, connected to two (or more) different ISPs with dual links per ISP. This configuration typically has multiple edge routers, one per ISP, and uses BGP.
Dual-multihomed connectivity includes all the benefits of multihomed connectivity, with enhanced resiliency.
Using BGP in an Enterprise Network
The Internet is a collection of autonomous systems that are interconnected to allow communication among them. BGP provides the routing between these autonomous systems.
Enterprises that want to connect to the Internet do so through one or more ISPs. If your organization has only one connection to one ISP, you probably do not need to use BGP. Instead you would use a default route. If you have multiple connections to one or to multiple ISPs, however, BGP might be appropriate because it allows manipulation of path attributes, facilitating selection of the optimal path.
When BGP is running between routers in different autonomous systems, it is called External BGP (EBGP). When BGP is running between routers in the same autonomous system, it is called Internal BGP (IBGP). Understanding how BGP works is important to avoid creating problems for your autonomous system as a result of running BGP. For example, enterprise autonomous system 65500 in Figure 6-11 is learning routes from both ISP-A and ISP-B via EBGP and is also running IBGP on all of its routers. Autonomous system 65500 learns about routes and chooses the best way to each one based on the configuration of the routers in the autonomous system and the BGP routes passed from the ISPs. If one of the connections to the ISPs goes down, traffic will be sent through the other ISP.
One of the routes that autonomous system 65500 learns from ISP-A is the route to 172.18.0.0/16. If that route is passed through autonomous system 65500 using IBGP and is mistakenly announced to ISP-B, ISP-B might decide that the best way to get to 172.18.0.0/16 is through autonomous system 65500, instead of through the Internet. Autonomous system 65500 would then be considered a transit autonomous system (an ISP); this would be a very undesirable situation. Autonomous system 65500 wants to have a redundant Internet connection, but does not want to act as a transit autonomous system between ISP-A and ISP-B. Careful BGP configuration is required to avoid this situation.
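One common way to avoid becoming a transit autonomous system is to announce to the ISPs only the routes that originate inside the enterprise, that is, routes whose autonomous system path is empty. The Python sketch below models that policy. It is not router configuration; the table contents, the ISP autonomous system numbers, and the enterprise prefix 192.0.2.0/24 are hypothetical.

```python
from typing import Dict, List


def routes_to_announce(bgp_table: Dict[str, List[int]]) -> List[str]:
    """Keep only prefixes with an empty AS path (originated locally)."""
    return sorted(prefix for prefix, as_path in bgp_table.items() if not as_path)


if __name__ == "__main__":
    # BGP table of AS 65500: prefix -> AS path as learned.
    bgp_table = {
        "172.18.0.0/16": [65000],  # learned from ISP-A (AS number assumed)
        "172.20.0.0/16": [65250],  # learned from ISP-B (prefix and AS number assumed)
        "192.0.2.0/24": [],        # the enterprise's own network (assumed)
    }
    print("Announced to both ISPs:", routes_to_announce(bgp_table))
```

With this policy, 172.18.0.0/16 is never offered to ISP-B, so ISP-B has no reason to route traffic for that network through autonomous system 65500.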
BGP Multihoming Options
As discussed earlier, multihoming is when an autonomous system has more than one connection to the Internet, via multiple ISPs (either multihoming or dual multihoming). Two typical reasons for multihoming are as follows:
-
To increase the reliability of the connection to the Internet— If one connection fails, the other connection remains available.
-
To increase the performance of the connection— Better paths can be used to certain destinations.
Note | Some Cisco documentation uses the term multihoming to refer to an autonomous system that has one or more connections to the Internet, whether via a single ISP or multiple ISPs. For consistency here, we distinguish between the various types of “homing” as described in the “Connection Redundancy” section, earlier in this chapter. |
The benefits of BGP are apparent when an autonomous system has multiple EBGP connections to either a single ISP (which is called dual homed) or to multiple ISPs (one of the multihomed types). Having multiple connections enables an organization to have redundant connections to the Internet so that if a single path becomes unavailable, connectivity can still be maintained.
An organization can be connected to either a single ISP or to multiple ISPs. A drawback to having all your connections to a single ISP is that connectivity issues in that single ISP can cause your autonomous system to lose connectivity to the Internet. By having connections to multiple ISPs, an organization gains the following benefits:
-
Has redundancy with the multiple connections
-
Is not tied into the routing policy of a single ISP
-
Has more paths to the same networks for better policy manipulation
A multihomed autonomous system will run EBGP with its external neighbors and might also run IBGP internally.
If an organization has determined that it will perform multihoming with BGP, three common ways to do this are as follows:
-
Each ISP passes only a default route to the autonomous system— The default route is passed to the internal routers.
-
Each ISP passes only a default route and provider-owned specific routes to the autonomous system— These routes can be passed to internal routers, or all internal routers in the transit path can run BGP and pass these routes between them.
-
Each ISP passes all routes to the autonomous system— All internal routers in the transit path run BGP and pass these routes between them.
The sections that follow describe these options in greater detail.
Multihoming with Default Routes from All Providers
The first multihoming option is to receive only a default route from each ISP. This configuration requires the least resources within the autonomous system because a default route is used to reach any external destinations. The autonomous system sends all of its routes to the ISPs, which process and pass the routes on to other autonomous systems.
In this scenario, both edge routers learn a default route from their attached ISP and propagate the default route into their routing domain using the local IGP. If a router within the autonomous system learns about multiple default routes using the local IGP, it installs the best default route into its routing table. From the perspective of this router, it takes the default route with the least-cost IGP metric. This IGP default route will route packets that are destined to the external networks to an edge router of this autonomous system, which is running EBGP with the ISPs. The edge router will use the BGP default route to reach all external networks.
The route that inbound packets take to reach the autonomous system is decided outside of the autonomous system (within the ISPs and other autonomous systems).
Regional ISPs that have multiple connections to national or international ISPs commonly implement this option. The regional ISPs do not need to use BGP for path manipulation. However, they do require the capability of adding new customers and the networks of the customers; BGP is ideal for this purpose. Consider if the regional ISP did not use BGP: Each time that the regional ISP needed to add a new set of networks, the customers would have to wait until the national ISPs added these networks to their BGP process and added static routes pointing to the regional ISP. Instead, by running EBGP with the national or international ISPs, the regional ISP needs only to add the new customer networks to its BGP process. These new networks automatically propagate across the Internet with minimal delay.
A customer that chooses to receive default routes from all providers must understand the following limitations of this option:
-
Path manipulation cannot be performed because only a single route is being received from each ISP.
-
Bandwidth manipulation is extremely difficult and can be accomplished only by manipulating the IGP metric of the default route.
-
Diverting some of the traffic from one exit point to another is challenging because all destinations are using the same default route for path selection.
Note | Policy-based routing, as described in Chapter 5, could be used to select the ISP to which specific traffic should be routed. |
Figure 6-12 illustrates an example. ISP autonomous system 65000 and ISP autonomous system 65250 send default routes into Enterprise autonomous system 65500. The ISP that a specific router within autonomous system 65500 uses to reach any external address is decided by the IGP metric that is used to reach the default route within the autonomous system. For example, if autonomous system 65500 uses RIP, Router C selects the route with the lowest hop count to the default route when sending packets to network 172.16.0.0. In this example, this appears to be suboptimal routing; the packets will go to ISP autonomous system 65250, but a more direct route is via ISP autonomous system 65000.
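The exit selection in this example can be sketched in a few lines of Python. The hop counts are hypothetical, because the figure is not reproduced here; the point is that the choice depends only on the IGP metric toward each default route, not on the destination address.

```python
from typing import Dict


def best_default_exit(igp_metric_to_exit: Dict[str, int]) -> str:
    """Pick the exit whose default route has the lowest IGP metric."""
    return min(igp_metric_to_exit, key=igp_metric_to_exit.get)


if __name__ == "__main__":
    # RIP hop counts from Router C to each edge router (assumed values).
    router_c_view = {
        "edge router toward AS 65000": 2,
        "edge router toward AS 65250": 1,
    }
    for destination in ("172.16.0.1", "192.0.2.1"):
        # The destination is never consulted: every external packet
        # follows the same default route.
        print(destination, "->", best_default_exit(router_c_view))
```

Every external destination leaves through the same edge router, which is why traffic to network 172.16.0.0 can take the longer path through autonomous system 65250.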
Multihoming with Default Routes and Partial Table from All Providers
In the second design option for multihoming, all ISPs pass default routes plus select specific routes to the autonomous system.
An enterprise that is running EBGP with an ISP and that wants a partial routing table generally receives the networks that the ISP and its other customers own. The enterprise can also receive the routes from any other autonomous system.
Major ISPs are assigned large blocks of addresses (for example, they may get 2000 to 10,000 CIDR blocks of IP addresses, depending on their needs) from their RIR (which in turn gets address blocks from the IANA). The ISPs reassign their address blocks to their customers. If the ISP passes this information to a customer that wants only a partial BGP routing table, the customer could pass this information to internal routers using IBGP and might redistribute these routes into its IGP.
The internal routers of the customer (these routers are not running BGP) could then receive these routes via redistribution. They would take the nearest exit point based on the best metric of specific networks instead of taking the nearest exit point based on the default route.
Acquiring a partial BGP table from each provider would be beneficial for specific routes because path selection will be more predictable than when using a default route.
Figure 6-13 illustrates an example. ISP A autonomous system 65000 and ISP B autonomous system 64900 send default routes and the routes that each ISP owns to Enterprise autonomous system 64500. The enterprise (autonomous system 64500) asked both providers to also send routes to networks in autonomous system 64520 because of the amount of traffic between autonomous system 64520 and autonomous system 64500.
By running IBGP between the internal routers within autonomous system 64500 (or at least those along the transit path within the autonomous system), autonomous system 64500 can choose the optimal path to reach the customer networks (autonomous system 64520 in this case). (Where IBGP should be run is described in detail in the upcoming “IBGP on All Routers in a Transit Path” section in this chapter.) The routes to autonomous system 64100 and to other autonomous systems (not shown in the figure) that are not specifically advertised to autonomous system 64500 by ISP A and ISP B are decided by the IGP metric that is used to reach the default route within the autonomous system; this may again result in suboptimal routing to these destinations.
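The reason a partial table makes path selection more predictable is longest-prefix matching: a specific route, when present, is preferred over the default route. The following Python sketch illustrates the idea. The prefix 192.0.2.0/24 is a hypothetical stand-in for a network in autonomous system 64520, and the `lookup` function is illustrative rather than a model of any router implementation.

```python
import ipaddress


def lookup(table, destination):
    """Longest-prefix match: the most specific matching route wins."""
    address = ipaddress.ip_address(destination)
    matches = [(network, exit_point) for network, exit_point in table
               if address in network]
    return max(matches, key=lambda match: match[0].prefixlen)[1]


table = [
    (ipaddress.ip_network("0.0.0.0/0"), "nearest exit by IGP metric"),
    # Hypothetical prefix standing in for a network in autonomous system 64520.
    (ipaddress.ip_network("192.0.2.0/24"), "ISP A"),
]

print(lookup(table, "192.0.2.10"))    # ISP A
print(lookup(table, "198.51.100.1"))  # nearest exit by IGP metric
```

Destinations covered by a specific route follow that route, and everything else falls back to the default route and the IGP metric.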
Multihoming with Full Routes from All Providers
In the third multihoming option, all ISPs pass all routes to the autonomous system, and IBGP is run on at least all the routers in the transit path in this autonomous system. This option allows the internal routers of the autonomous system to take the path through the best ISP for each route.
This configuration requires a lot of resources within the autonomous system because it must process all the external routes.
The autonomous system sends all of its routes to the ISPs, which process the routes and pass them to other autonomous systems.
Figure 6-14 illustrates an example. ISP A autonomous system 65000 and ISP B autonomous system 64900 send all routes into Enterprise autonomous system 64500. The ISP that a specific router within autonomous system 64500 uses to reach the external networks is determined by BGP. The routers in autonomous system 64500 can be configured to influence the path to certain networks. For example, Router A and Router B can influence the outbound traffic from autonomous system 64500.
BGP Path Vector Characteristics
Internal routing protocols announce a list of networks and the metrics to get to each network. In contrast, BGP routers exchange network reachability information, called path vectors, made up of path attributes, as illustrated in Figure 6-15. The path vector information includes a list of the full path of BGP autonomous system numbers (hop by hop) necessary to reach a destination network. Other attributes include the IP address to get to the next autonomous system (the next-hop attribute) and how the networks at the end of the path were introduced into BGP (the origin code attribute). The “BGP Attributes” section, later in this chapter, describes all the BGP attributes in detail.
This autonomous system path information is used to construct a graph of loop-free autonomous systems and to identify routing policies so that restrictions on routing behavior can be enforced based on the autonomous system path.
The BGP autonomous system path is guaranteed to always be loop free: A router running BGP does not accept a routing update that already includes its autonomous system number in the path list, because the update has already passed through its autonomous system, and accepting it again will result in a routing loop.
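The loop-prevention check amounts to a membership test on the autonomous system path. A minimal Python sketch, using a hypothetical `accept_update` function and illustrative autonomous system numbers:

```python
def accept_update(local_as, as_path):
    """Reject any update whose AS path already contains the local AS number."""
    return local_as not in as_path


# A router in autonomous system 64512 evaluates two incoming updates.
print(accept_update(64512, [64520, 64600, 64700]))  # True
print(accept_update(64512, [64520, 64512, 64700]))  # False
```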
BGP is designed to scale to huge internetworks, such as the Internet.
BGP allows routing-policy decisions to be applied to the path of BGP autonomous system numbers so that routing behavior can be enforced at the autonomous system level and to determine how data will flow through the autonomous system. These policies can be implemented for all networks owned by an autonomous system, for a certain CIDR block of network numbers (prefixes), or for individual networks or subnetworks. The policies are based on the attributes carried in the routing information and configured on the routers.
BGP specifies that a BGP router can advertise to its peers (neighbors) in neighboring autonomous systems only those routes that it uses. This rule reflects the hop-by-hop routing paradigm generally used throughout the current Internet. Some policies cannot be supported by the hop-by-hop routing paradigm. For example, BGP does not allow one autonomous system to send traffic to a neighboring autonomous system, intending that the traffic take a different route from that taken by traffic originating in that neighboring autonomous system. In other words, you cannot influence how a neighboring autonomous system will route your traffic, but you can influence how your traffic gets to a neighboring autonomous system. However, BGP can support any policy conforming to the hop-by-hop routing paradigm.
Because the current Internet uses only the hop-by-hop routing paradigm, and because BGP can support any policy that conforms to that paradigm, BGP is highly applicable as an inter-autonomous system routing protocol for the current Internet.
For example, in Figure 6-16, the following are some of the paths possible for autonomous system 64512 to reach networks in autonomous system 64700, through autonomous system 64520:
-
64520 64600 64700
-
64520 64600 64540 64550 64700
-
64520 64540 64600 64700
-
64520 64540 64550 64700
Autonomous system 64512 does not see all these possibilities. Autonomous system 64520 advertises to autonomous system 64512 only its best path, in this case 64520 64600 64700, the same way that IGPs announce only their best least-cost routes. This path is the only path through autonomous system 64520 that autonomous system 64512 sees. All packets that are destined for 64700 via 64520 take this path, because it is the autonomous system-by-autonomous system (hop-by-hop) path that autonomous system 64520 uses to reach the networks in autonomous system 64700. Autonomous system 64520 does not announce the other paths, such as 64520 64540 64600 64700, because it does not choose any of those paths as the best path, based on the BGP routing policy in autonomous system 64520.
Autonomous system 64512 does not learn of the second-best path, or any other paths from 64520, unless the best path through autonomous system 64520 becomes unavailable.
Even if autonomous system 64512 were aware of another path through autonomous system 64520 and wanted to use it, autonomous system 64520 would not route packets along that other path, because autonomous system 64520 selected 64520 64600 64700 as its best path, and all autonomous system 64520 routers will use that path as a matter of BGP policy. BGP does not let one autonomous system send traffic to a neighboring autonomous system, intending that the traffic take a different route from that taken by traffic originating in the neighboring autonomous system.
To reach the networks in autonomous system 64700, autonomous system 64512 can choose to use the path through autonomous system 64520 or it can choose to go through the path that autonomous system 64530 is advertising. Autonomous system 64512 selects the best path to take based on its own BGP routing policies.
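The advertisement behavior in this example can be sketched as follows. The sketch assumes, purely for illustration, that the routing policy in autonomous system 64520 prefers the shortest autonomous system path; the real decision depends on the full BGP route-selection process and on local policy. The names `candidate_paths` and `best_path` are hypothetical.

```python
# Paths through autonomous system 64520 to autonomous system 64700,
# as listed in the text for Figure 6-16.
candidate_paths = [
    [64520, 64600, 64700],
    [64520, 64600, 64540, 64550, 64700],
    [64520, 64540, 64600, 64700],
    [64520, 64540, 64550, 64700],
]


def best_path(paths):
    """Assumed policy for illustration: prefer the shortest AS path."""
    return min(paths, key=len)


# Only the best path is advertised to autonomous system 64512.
advertised = best_path(candidate_paths)
print(advertised)  # [64520, 64600, 64700]
```

Autonomous system 64512 receives a single path from autonomous system 64520, not the full list of candidates.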
When to Use BGP
BGP use in an autonomous system is most appropriate when the effects of BGP are well understood and at least one of the following conditions exists:
-
The autonomous system allows packets to transit through it to reach other autonomous systems (for example, it is a service provider).
-
The autonomous system has multiple connections to other autonomous systems.
-
Routing policy and route selection for traffic entering and leaving the autonomous system must be manipulated.
If an enterprise wants its traffic to be differentiated from its ISP’s traffic on the Internet, the enterprise must connect to its ISP using BGP. If, instead, an enterprise is connected to its ISP with a static route, traffic from that enterprise on the Internet is indistinguishable from traffic from the ISP.
BGP was designed to allow ISPs to communicate and exchange packets. These ISPs have multiple connections to one another and have agreements to exchange updates. BGP is the protocol that is used to implement these agreements between two or more autonomous systems. If BGP is not properly controlled and filtered, it has the potential to allow an outside autonomous system to affect the traffic flow to your autonomous system. For example, if you are a customer connected to ISP A and ISP B (for redundancy), you want to implement a routing policy to ensure that ISP A does not send traffic to ISP B via your autonomous system. You want to be able to receive traffic destined for your autonomous system through each ISP, but you do not want to waste valuable resources and bandwidth within your autonomous system to route traffic for your ISPs. This chapter focuses on how BGP operates and how to configure it properly to prevent this from happening.
When Not to Use BGP
BGP is not always the appropriate solution to interconnect autonomous systems. For example, if there is only one exit path from the autonomous system, a default or static route is appropriate. Using BGP will not accomplish anything except to use router CPU resources and memory. If the routing policy that will be implemented in an autonomous system is consistent with the policy implemented in the ISP autonomous system, it is not necessary or even desirable to configure BGP in that autonomous system. The only time BGP will be required is when the local policy differs from the ISP policy.
Do not use BGP if one or more of the following conditions exist:
-
A single connection to the Internet or another autonomous system
-
Lack of memory or processor power on edge routers to handle constant BGP updates
-
Limited understanding of route filtering and the BGP path-selection process
In these cases, use static or default routes instead, as discussed in Chapter 1.
BGP Characteristics
What type of protocol is BGP? Chapter 1 covers the characteristics of distance vector and link-state routing protocols. BGP is sometimes categorized as an advanced distance vector protocol, but it is actually a path vector protocol. BGP has many differences from standard distance vector protocols, such as RIP.
BGP uses TCP as its transport protocol, which provides connection-oriented reliable delivery. BGP therefore assumes that its communication is reliable and does not have to implement any retransmission or error-recovery mechanisms, as EIGRP does. BGP information is carried inside TCP segments using TCP port 179; these segments are carried inside IP packets. Figure 6-17 illustrates this concept.
Two routers speaking BGP establish a TCP connection with one another and exchange messages to open and confirm the connection parameters. These two routers are called BGP peer routers or BGP neighbors.
After the TCP connection is made, the routers exchange their full BGP routing tables (described later in the “BGP Tables” section). However, because the connection is reliable, BGP routers need to send only changes (incremental updates) after that. Periodic routing updates are not required on a reliable link, so triggered updates are used. BGP sends keepalive messages, similar to the hello messages sent by OSPF, IS-IS, and EIGRP.
OSPF and EIGRP have their own internal functions to ensure that update packets are explicitly acknowledged. These protocols use a one-for-one window so that if either OSPF or EIGRP has multiple packets to send, the next packet cannot be sent until an acknowledgment from the first update packet is received. This process can be very inefficient and cause latency issues if thousands of update packets must be exchanged over relatively slow serial links. However, OSPF and EIGRP rarely have thousands of update packets to send. For example, EIGRP can hold more than 100 networks in one EIGRP update packet, so 100 EIGRP update packets can hold up to 10,000 networks. Most organizations do not have 10,000 subnets.
BGP, on the other hand, has more than 300,000 networks (and growing) on the Internet to advertise, and it uses TCP to handle the acknowledgment function. TCP uses a dynamic window, which allows for up to 65,535 bytes (the largest value of the 16-bit window field, without window scaling) to be outstanding before it stops and waits for an acknowledgment. For example, if 1000-byte packets are being sent and the maximum window size is being used, BGP would have to stop and wait for an acknowledgment only when 65 packets had not been acknowledged.
Note | The CIDR report, at http://www.cidr-report.org/, is a good reference site to see the current size of the Internet routing tables and other related information. |
TCP is designed to use a sliding window, where the receiver sends an acknowledgment before the number of octets specified by the window has been received (such as at the halfway point of the sending window). This method allows any TCP application, such as BGP, to continue streaming packets without having to stop and wait, as OSPF or EIGRP would require.
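The arithmetic behind these comparisons can be checked with a short Python sketch. It assumes an unscaled TCP window, whose 16-bit window field tops out at 65,535 bytes; the function and constant names are hypothetical.

```python
# Assumption: no TCP window scaling, so the 16-bit window field limits
# the window to 2**16 - 1 bytes.
MAX_UNSCALED_WINDOW = 2**16 - 1


def packets_in_flight(window_bytes, packet_bytes):
    """Whole packets that can be outstanding before the sender must wait."""
    return window_bytes // packet_bytes


print(MAX_UNSCALED_WINDOW)                           # 65535
print(packets_in_flight(MAX_UNSCALED_WINDOW, 1000))  # 65

# EIGRP comparison from the text: 100 update packets, 100 networks each.
eigrp_networks = 100 * 100
print(eigrp_networks)                                # 10000
```

With a one-for-one window, by contrast, only a single update packet can be outstanding at a time.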
BGP Neighbor Relationships
No single router can handle communications with the tens of thousands of routers that run BGP and are connected to the Internet, representing more than 33,000 autonomous systems. A BGP router forms a direct neighbor relationship with a limited number of other BGP routers. Through these BGP neighbors, a BGP router learns of the paths through the Internet to reach any advertised network.
Any router that runs BGP is called a BGP speaker.
A BGP peer, also known as a BGP neighbor, is a BGP speaker that is configured to form a neighbor relationship with another BGP speaker for the purpose of directly exchanging BGP routing information with one another.
A BGP speaker has a limited number of BGP neighbors with which it peers and forms a TCP-based relationship, as illustrated in Figure 6-18. BGP peers can be either internal or external to the autonomous system.
Note | A BGP peer must be configured under the BGP process with a neighbor command. This command instructs the BGP process to establish a relationship with the neighbor at the address listed in the command and to exchange BGP routing updates with that neighbor. BGP configuration is described later, in the “Configuring BGP” section. |
External BGP Neighbors
Recall that when BGP is running between routers in different autonomous systems, it is called EBGP. Routers running EBGP are usually directly connected to each other, as shown in Figure 6-19.
An EBGP neighbor is a router running in a different autonomous system. An IGP is not run between the EBGP neighbors. For two routers to exchange BGP routing updates, the TCP reliable transport layer on each side must successfully pass the TCP three-way handshake before the BGP session can be established. Therefore, the IP address used in the neighbor command must be reachable without using an IGP. This can be accomplished by pointing at an address that can be reached through a directly connected network or by using static routes to that IP address. Generally, the neighbor address used is the address of the directly connected network.
An enterprise network can have a connection to one or several ISPs, and the ISPs themselves might be connected to several other ISPs. For each such connection between different autonomous systems, there is an EBGP session required between EBGP neighboring routers. In Figure 6-19, an EBGP relationship is established between Routers D and Y, and another EBGP relationship is established between Routers C and X.
There are several requirements for EBGP neighborship:
-
Different autonomous system number— EBGP neighbors must reside in different autonomous systems to be able to form an EBGP relationship.
-
Define neighbors— A TCP session must be established before starting BGP routing update exchanges.
-
Reachability— The IP addresses used in the neighbor command must be reachable; EBGP neighbors are usually directly connected.
Internal BGP Neighbors
Recall that when BGP is running between routers within the same autonomous system, it is called IBGP. IBGP is run within an autonomous system to exchange BGP information so that all internal BGP speakers have the same BGP routing information about outside autonomous systems and so this information can be passed to other autonomous systems.
There are several requirements for IBGP neighborship:
-
Same autonomous system number— IBGP neighbors must reside in the same autonomous system to be able to form an IBGP relationship.
-
Define neighbors— A TCP session must be established between neighbors before they start exchanging BGP routing updates.
-
Reachability— IBGP neighbors must be reachable. An IGP typically runs inside the autonomous system.
Routers running IBGP do not have to be directly connected to each other, as long as they can reach each other so that TCP handshaking can be performed to set up the BGP neighbor relationships. The IBGP neighbor can be reached by a directly connected network, static routes, or an internal routing protocol. Because multiple paths generally exist within an autonomous system to reach other routers, a loopback address is usually used in the BGP neighbor command to establish the IBGP sessions.
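The first requirement in each list, the comparison of autonomous system numbers, is what distinguishes the two session types. A minimal Python sketch, with a hypothetical `session_type` function and illustrative autonomous system numbers:

```python
def session_type(local_as, neighbor_as):
    """Same AS number means an IBGP session; different numbers mean EBGP."""
    return "IBGP" if local_as == neighbor_as else "EBGP"


print(session_type(65500, 65500))  # IBGP
print(session_type(65500, 65000))  # EBGP
```

The sketch covers only the autonomous system comparison; the neighbor definition and reachability requirements must still be met before a session can come up.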
For example, in Figure 6-20, Routers A, D, and C learn the paths to the external autonomous systems from their respective EBGP neighbors (Routers Z, Y, and X). If the link between Routers D and Y goes down, Router D must learn new routes to the external autonomous systems. Other BGP routers within autonomous system 65500 that were using Router D to get to external networks must also be informed that the path through Router D is unavailable. Those BGP routers within autonomous system 65500 need to have the alternative paths through Routers A and C in their BGP topology database.
As described in the next section, you must set up IBGP sessions between all routers in the transit path in autonomous system 65500 so that each router in the transit path within the autonomous system learns about paths to the external networks via IBGP.
IBGP on All Routers in a Transit Path
This section explains why IBGP route propagation requires all routers in the transit path in an autonomous system to run IBGP.
IBGP in a Transit Autonomous System
BGP was originally intended to run along the borders of an autonomous system, with the routers in the middle of the autonomous system ignorant of the details of BGP—hence the name Border Gateway Protocol. A transit autonomous system, such as autonomous system 65102 in Figure 6-21, is an autonomous system that routes traffic from one external autonomous system to another external autonomous system. As mentioned earlier, transit autonomous systems are typically ISPs. All routers in a transit autonomous system must have complete knowledge of external routes. Theoretically, one way to achieve this goal is to redistribute BGP routes into an IGP at the edge routers; however, this approach has problems.
Because the current Internet routing table is very large, redistributing all the BGP routes into an IGP is not a scalable way for the interior routers within an autonomous system to learn about the external networks. Another method that you can use is to run IBGP on all routers within the autonomous system.
IBGP in a Nontransit Autonomous System
A nontransit autonomous system, such as an organization that is multihoming with two ISPs, does not pass routes between the ISPs. To make proper routing decisions, however, the BGP routers within the autonomous system still require knowledge of all BGP routes passed to the autonomous system.
As discussed, BGP does not work in the same manner as IGPs. Because the designers of BGP could not guarantee that an autonomous system would run BGP on all routers, a method had to be developed to ensure that IBGP speakers could pass updates to one another while ensuring that no routing loops would exist.
To avoid routing loops within an autonomous system, BGP specifies that routes learned through IBGP are never propagated to other IBGP peers. Recall that the neighbor command enables BGP updates between BGP speakers. By default, each BGP speaker is assumed to have a neighbor statement for all other IBGP speakers in the autonomous system—this is known as full-mesh IBGP.
If the sending IBGP neighbor is not fully meshed with each IBGP router, the routers that are not peering with this router will have different IP routing tables than the routers that are peering with it. The inconsistent routing tables can cause routing loops or routing black holes, because the default assumption by all routers running BGP within an autonomous system is that each BGP router exchanges IBGP information directly with all other BGP routers in the autonomous system.
When all IBGP neighbors are fully meshed and a change is received from an external autonomous system, the receiving BGP router in the local autonomous system is responsible for informing all other IBGP neighbors of the change. IBGP neighbors that receive this update do not send it to any other IBGP neighbor because they assume that the sending IBGP neighbor is fully meshed with all other IBGP speakers and has sent each IBGP neighbor the update.
BGP Partial-Mesh and Full-Mesh Examples
The top network in Figure 6-22 illustrates IBGP update behavior in a partially meshed neighbor environment. Router B receives an EBGP update from Router A. Router B has two IBGP neighbors, Routers C and D, but does not have an IBGP neighbor relationship with Router E. Therefore, Routers C and D learn about any networks that were added or withdrawn behind Router B. Even if Routers C and D have IBGP neighbor sessions with Router E, they assume that the autonomous system is fully meshed for IBGP and do not replicate the update and send it to Router E. Sending the IBGP update to Router E is Router B’s responsibility because it is the router with firsthand knowledge of the networks in and beyond autonomous system 65101. So, Router E does not learn of any networks through Router B and does not use Router B to reach any networks in autonomous system 65101 or other autonomous systems behind autonomous system 65101.
In the lower portion of Figure 6-22, IBGP is fully meshed. When Router B receives an EBGP update from Router A, it updates all three of its IBGP peers, Router C, Router D, and Router E. OSPF, the IGP, is used to route the TCP segment containing the BGP update from Router A to Router E, because these two routers are not directly connected. The update is sent once to each neighbor and not duplicated by any other IBGP neighbor, which reduces unnecessary traffic. In fully meshed IBGP, each router assumes that every other internal router has a neighbor statement that points to each IBGP neighbor.
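The difference between the two halves of Figure 6-22 can be reproduced with a small Python sketch. It models only the rule that a route learned through IBGP is not passed on to other IBGP peers, so a route that Router B receives via EBGP reaches only Router B's direct IBGP neighbors. The `propagate` function and the session lists are illustrative, and the partial-mesh list assumes that Routers C and D do peer with Router E.

```python
def propagate(origin, ibgp_sessions):
    """Return the routers that learn a route which `origin` received via EBGP.

    IBGP-learned routes are never re-advertised to other IBGP peers, so only
    the direct IBGP neighbors of `origin` learn the route.
    """
    learned = {origin}
    for a, b in ibgp_sessions:
        if a == origin:
            learned.add(b)
        elif b == origin:
            learned.add(a)
    return learned


partial_mesh = [("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]
full_mesh = partial_mesh + [("B", "E"), ("C", "D")]

print(sorted(propagate("B", partial_mesh)))  # ['B', 'C', 'D']
print(sorted(propagate("B", full_mesh)))     # ['B', 'C', 'D', 'E']
```

In the partial mesh, Router E never learns the route even though it has sessions with Routers C and D.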
TCP and Full Mesh
TCP was selected as the transport layer for BGP because TCP can move a large volume of data reliably. With the very large full Internet routing table changing constantly, using TCP for windowing and reliability was determined to be the best solution, as opposed to developing a BGP one-for-one windowing capability like OSPF or EIGRP.
TCP sessions cannot be multicast or broadcast because TCP has to ensure the delivery of packets to each recipient. Because TCP cannot use broadcasting, BGP cannot use it either.
Because each IBGP router needs to send routes to all the other IBGP neighbors in the same autonomous system (so that they all have a complete picture of the routes sent to the autonomous system) and they cannot use broadcast, they must use fully meshed BGP (TCP) sessions. In other words, an IBGP neighbor relationship must be configured between each pair of routers.
When all routers running BGP in an autonomous system are fully meshed and have the same database as a result of a consistent routing policy, they can apply the same path-selection formula. The path-selection results will therefore be uniform across the autonomous system. Uniform path selection across the autonomous system means no routing loops and a consistent policy for exiting and entering the autonomous system.
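Because a session is needed between each pair of routers, a full mesh of n IBGP speakers requires n(n - 1)/2 sessions. The following Python sketch enumerates them for four routers; the router names and the `full_mesh_sessions` function are illustrative.

```python
from itertools import combinations


def full_mesh_sessions(routers):
    """List every IBGP session needed for a full mesh (one per router pair)."""
    return list(combinations(routers, 2))


routers = ["B", "C", "D", "E"]
sessions = full_mesh_sessions(routers)
print(len(sessions))  # 6
print(len(routers) * (len(routers) - 1) // 2)  # 6
```

The session count grows quadratically with the number of routers, which is the scaling concern that route reflectors and confederations address.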
Routing Issues If BGP Not on in All Routers in a Transit Path
Figure 6-23 illustrates how routing might not work if not all routers in a transit path are running BGP.
In this example, Routers A, B, E, and F are the only ones running BGP. Router B has an EBGP neighbor statement for Router A and an IBGP neighbor statement for Router E. Router E has an EBGP neighbor statement for Router F and an IBGP neighbor statement for Router B. Routers C and D are not running BGP. Routers B, C, D and E are running OSPF as their IGP.
Network 10.0.0.0 is owned by autonomous system 65101 and is advertised by Router A to Router B via an EBGP session. Router B advertises it to Router E via an IBGP session. Routers C and D never learn about this network because it is not redistributed into the local routing protocol (OSPF in this example), and Routers C and D are not running BGP. If Router E advertises this network to Router F in autonomous system 65103, and Router F starts forwarding packets to network 10.0.0.0 through autonomous system 65102, where would Router E send the packets?
Router E would send the packets to its BGP peer, Router B. To get to Router B, however, the packets must go through Router C or D, but those routers do not have an entry in their routing tables for network 10.0.0.0. Thus, if Router E forwards packets with a destination address in network 10.0.0.0 to either Router C or Router D, that router discards the packets.
Even if Routers C and D have a default route going to the exit points of the autonomous system (Routers B and E), there is a good chance that when Router E sends a packet for network 10.0.0.0 to Routers C or D, those routers may send it back to Router E, which will forward it again to Routers C or D, causing a routing loop. To solve this problem, BGP must be implemented on Routers C and D.
In conclusion, it is important to remember that all routers in the path between IBGP neighbors within an autonomous system, known as the transit path, must also be running BGP. These IBGP sessions must be fully meshed.
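The three outcomes described for Figure 6-23 (packets dropped, packets looping, and packets delivered) can be traced with a simplified hop-by-hop forwarding sketch in Python. The per-router tables, the `LOCAL` marker, and the `forward` function are hypothetical simplifications; real routers perform prefix lookups rather than exact matches.

```python
LOCAL = "local"  # hypothetical marker: the destination network is attached here


def forward(destination, start, tables, max_hops=8):
    """Follow next hops router by router; return the path and the outcome."""
    path = [start]
    router = start
    for _ in range(max_hops):
        next_hop = tables[router].get(destination)
        if next_hop is None:
            return path, "dropped"
        if next_hop == LOCAL:
            return path, "delivered"
        router = next_hop
        path.append(router)
    return path, "loop"


net = "10.0.0.0"
# Router C does not run BGP and has no route to 10.0.0.0.
tables = {"E": {net: "C"}, "C": {}, "B": {net: "A"}, "A": {net: LOCAL}}
print(forward(net, "E", tables))  # (['E', 'C'], 'dropped')

# Router C follows a default route that points back toward Router E.
tables["C"] = {net: "E"}
print(forward(net, "E", tables)[1])  # loop

# Router C runs BGP and knows that 10.0.0.0 is reached through Router B.
tables["C"] = {net: "B"}
print(forward(net, "E", tables))  # (['E', 'C', 'B', 'A'], 'delivered')
```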
BGP Synchronization
The BGP synchronization rule states that a BGP router should not use, or advertise to an external neighbor, a route learned by IBGP, unless that route is local or is learned from the IGP.
If there were a small enough number of BGP routes so that they could be redistributed into an IGP running in an autonomous system, IBGP would not be needed in every router in the transit path. However, synchronization would be needed to make sure that packets did not get lost. In the past, synchronization was on by default in the Cisco IOS. If synchronization is enabled and your autonomous system is passing traffic from one autonomous system to another, BGP should not advertise a route before all routers in your autonomous system have learned about the route via IGP. In other words, BGP and the IGP must be synchronized before networks learned from an IBGP neighbor can be used.
The modern Internet has too many routes in the BGP table to consider redistributing them into IGPs. The best practice is to not redistribute BGP into the IGP, but instead use IBGP on all routers in the transit path. In this case, synchronization is not needed. Therefore, it is now off by default in the Cisco IOS.
BGP synchronization is disabled by default in Cisco IOS Software Release 12.2(8)T and later. It was on by default in earlier Cisco IOS Software releases. With the default of synchronization disabled, BGP can use and advertise to external BGP neighbors routes learned from an IBGP neighbor that are not present in the local routing table. (BGP synchronization should only be disabled, though, if all routers in the transit path in the autonomous system are running full-mesh IBGP, for the reasons discussed in the previous section, or if the autonomous system is not a transit autonomous system.)
If synchronization is enabled, a router learning a route via IBGP waits until the IGP has propagated the route within the autonomous system and then advertises it to external peers. This ensures that all routers in the autonomous system are synchronized and can route traffic to the destinations that the autonomous system advertises to other autonomous systems. The BGP synchronization rule also ensures consistency of information throughout the autonomous system and avoids black holes within the autonomous system (for example, advertising a destination to an external neighbor when not all the routers within the autonomous system can reach the destination).
Having synchronization disabled allows the routers to carry fewer routes in IGP and allows BGP to converge more quickly because it can advertise the routes as soon as it learns them.
Synchronization should be enabled if there are routers in the BGP transit path in the autonomous system that are not running BGP (and therefore the routers do not have full-mesh IBGP within the autonomous system).
Figure 6-24 illustrates an example. Routers A, B, C, and D are all running IBGP and an IGP with each other (there is full-mesh IBGP). There are no matching IGP routes for the BGP routes (Routers A and B are not redistributing the BGP routes into the IGP). Routers A, B, C, and D have IGP routes to the internal networks of autonomous system 65500 but do not have IGP routes to external networks, such as 172.16.0.0.
Router F advertises 172.16.0.0 to Router B using EBGP. Router B advertises the route to 172.16.0.0 to the other routers in autonomous system 65500 using IBGP.
If synchronization is on in autonomous system 65500 in Figure 6-24, the following happens:
-
Router B uses the route to 172.16.0.0 and installs it in its routing table.
-
Routers A, C, and D do not use, or advertise to any external neighbors, the route to 172.16.0.0, because synchronization is on and they do not also learn it from their IGP.
-
Router E does not hear about 172.16.0.0. If Router E receives traffic destined for network 172.16.0.0, it does not have a route for that network and cannot forward the traffic.
In this scenario, Routers A, C, D, and E do not have network 172.16.0.0 in their routing table and therefore will drop any packets they receive for this network.
If synchronization is off (the default) in autonomous system 65500 in Figure 6-24, the following happens:
-
Router B uses the route to 172.16.0.0 and installs it in its routing table.
-
Routers A, C, and D use, and can advertise to external neighbors, the route to 172.16.0.0 that they receive via IBGP. Routers A, C, and D install this route in their routing tables (assuming, of course, that Routers A, C, and D can reach the next-hop address for 172.16.0.0).
-
Router E hears about 172.16.0.0 from Router A. Therefore, Router E has a route to 172.16.0.0 and can send traffic destined for that network.
In this scenario, Routers A, C, D, and E all have network 172.16.0.0 in their routing table and therefore will forward any packets they receive for this network.
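The two scenarios can be summarized as a single check applied to each route learned via IBGP. The Python sketch below is a simplification: it ignores next-hop reachability and locally originated routes, the `can_use_ibgp_route` function is hypothetical, and the internal network 10.1.0.0 is an assumed example of what the IGP in autonomous system 65500 carries.

```python
def can_use_ibgp_route(prefix, igp_routes, synchronization):
    """Apply the synchronization rule to a route learned via IBGP.

    With synchronization on, the route is usable (and can be advertised to
    external neighbors) only if the IGP also carries it. With synchronization
    off, the IGP is not consulted.
    """
    if not synchronization:
        return True
    return prefix in igp_routes


# Routers A, C, and D: the IGP carries only internal networks (assumed value).
igp_routes = {"10.1.0.0"}

print(can_use_ibgp_route("172.16.0.0", igp_routes, synchronization=True))   # False
print(can_use_ibgp_route("172.16.0.0", igp_routes, synchronization=False))  # True
```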
As described earlier, in modern autonomous systems, because the size of the Internet routing table is large, redistributing from BGP into an IGP is not scalable. Therefore, most modern autonomous systems run full-mesh IBGP and do not require synchronization. Some advanced BGP configuration methods, such as route reflectors and confederations, reduce the IBGP full-mesh requirements. (As mentioned earlier, route reflectors are discussed in Appendix C.)
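The two scenarios above reduce to a small rule that each IBGP router applies to the route it hears from Router B. The following is a minimal sketch of that rule only; the function and parameter names are hypothetical and are not part of any router software.

```python
def ibgp_route_usable(sync_enabled: bool,
                      learned_from_igp: bool,
                      next_hop_reachable: bool) -> bool:
    """Decide whether a router may use an IBGP-learned route
    (and advertise it to external neighbors)."""
    # A route whose next hop cannot be reached is never usable.
    if not next_hop_reachable:
        return False
    # With synchronization on, the route must also be known via the IGP.
    if sync_enabled and not learned_from_igp:
        return False
    return True


# Routers A, C, and D in Figure 6-24: no IGP route to 172.16.0.0.
print(ibgp_route_usable(sync_enabled=True, learned_from_igp=False,
                        next_hop_reachable=True))   # synchronization on
print(ibgp_route_usable(sync_enabled=False, learned_from_igp=False,
                        next_hop_reachable=True))   # synchronization off
```

With synchronization on, the first call reports that the route cannot be used; with synchronization off, the second call reports that it can, provided the next hop is reachable.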
BGP Tables
As shown in Figure 6-25, a router running BGP keeps its own table for storing BGP information received from and sent to other routers.
This table of BGP information is known by many names in various documents, including the following:
-
BGP table
-
BGP topology table
-
BGP topology database
-
BGP routing table
-
BGP forwarding database
It is important to remember that this BGP table is separate from the IP routing table in the router.
The router offers the best routes from the BGP table to the IP routing table and can be configured to share information between the two tables (by redistribution).
BGP also keeps a neighbor table containing a list of neighbors with which it has a BGP connection.
For BGP to establish an adjacency, you must configure it explicitly for each neighbor. BGP forms a TCP relationship with each of the configured neighbors and keeps track of the state of these relationships by periodically sending a BGP/TCP keepalive message.
Note | BGP sends BGP/TCP keepalives by default every 60 seconds. |
After establishing an adjacency, the neighbors exchange their best BGP routes. Each router collects all the routes learned from each neighbor with which it successfully established an adjacency and places them in its BGP forwarding database. The best routes for each network are selected from the BGP forwarding database using the BGP route-selection process (discussed in the section “The Route-Selection Decision Process,” later in this chapter) and then are offered to the IP routing table. (As described in the referenced section, one of the criteria for being selected as the best BGP route is that the next-hop IP address is reachable. Therefore, BGP routes with an unreachable next hop will not be propagated to other routers.)
Note | As described earlier, if synchronization is enabled, BGP will exchange routes only if those routes are also in the IP routing table. With the default of synchronization disabled, though, the routes do not have to be in the IP routing table for BGP to advertise them. |
Each router compares the offered BGP routes to any other possible paths to those networks in its IP routing table, and the best route, based on administrative distance, is installed in the IP routing table. EBGP routes (BGP routes learned from an external autonomous system) have an administrative distance of 20. IBGP routes (BGP routes learned from within the autonomous system) have an administrative distance of 200.
A router may have a best BGP route to a destination, but that route might not be installed in the IP routing table because it has a higher administrative distance than another route. That best BGP route will still be propagated to other BGP routers, though.
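The comparison by administrative distance can be sketched as follows. The EBGP and IBGP values come from the text above; the OSPF, EIGRP, and RIP values are the usual Cisco defaults and are included here only as assumptions for illustration. The function name is hypothetical.

```python
# Administrative distances. "ebgp" and "ibgp" are from the text above;
# the IGP values are the usual Cisco defaults (assumed for illustration).
ADMIN_DISTANCE = {
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
    "ibgp": 200,
}


def select_for_routing_table(offers):
    """Given the route sources offering a path to the same prefix,
    return the source whose route is installed (lowest distance wins)."""
    return min(offers, key=lambda source: ADMIN_DISTANCE[source])


# An IBGP best path loses to an OSPF route for the same prefix,
# although BGP still advertises its best path to its BGP neighbors.
print(select_for_routing_table(["ibgp", "ospf"]))
# An EBGP best path wins over the same OSPF route.
print(select_for_routing_table(["ebgp", "ospf"]))
```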
BGP Message Types
BGP defines the following message types, as described in this section:
-
Open
-
Keepalive
-
Update
-
Notification
Note | Keepalive messages have a length of 19 bytes. Other messages may be between 19 and 4096 bytes long. |
Open and Keepalive Messages
After a TCP connection is established, the first message sent by each side is an open message. If the open message is acceptable, a keepalive message confirming the open message is sent back by the side that received the open message.
When the open is confirmed, the BGP connection is established, and update, keepalive, and notification messages can be exchanged.
BGP peers initially exchange their full BGP routing tables. From then on, incremental updates are sent as the routing table changes. Keepalive packets are sent to ensure that the connection is alive between the BGP peers, and notification packets are sent in response to errors or special conditions.
An open message includes the following information:
-
Version— This 8-bit field indicates the message’s BGP version number. The highest common version that both routers support is used. BGP implementations today use the current version, BGP-4.
-
My autonomous system— This 16-bit field indicates the sender’s autonomous system number. The peer router verifies this information; if it is not the autonomous system number expected, the BGP session is torn down.
-
Hold time— This 16-bit field indicates the maximum number of seconds that can elapse between the successive keepalive or update messages from the sender. Upon receipt of an open message, the router calculates the value of the hold timer to use with this neighbor by using the smaller of its configured hold time (which has a default of 180 seconds) and the hold time received in the open message.
-
BGP router identifier (router ID)— This 32-bit field indicates the sender’s BGP identifier. The BGP router ID is an IP address assigned to that router and is determined at startup. The BGP router ID is chosen the same way the OSPF router ID is chosen: It is the highest active IP address on the router, unless a loopback interface with an IP address exists, in which case it is the highest such loopback IP address. Alternatively, the router ID can be statically configured, overriding the automatic selection.
-
Optional parameters— A length field indicates the total length of the optional parameters field in octets. These parameters are Type, Length, and Value (TLV)-encoded. An example of an optional parameter is session authentication.
BGP does not use any transport protocol-based keepalive mechanism to determine whether peers can be reached. Instead, BGP keepalive messages are exchanged between peers often enough to keep the hold timer from expiring. If the negotiated hold time interval is 0, periodic keepalive messages are not sent. Keepalive messages consist of only a message header and have a length of 19 bytes; they are sent every 60 seconds by default.
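To make the field sizes concrete, the following sketch packs a keepalive and a minimal open message. It assumes the common BGP message header defined in RFC 4271 (a 16-byte marker of all ones, a 2-byte length, and a 1-byte type, with type 1 for open and type 4 for keepalive), which this section does not spell out. The function names are hypothetical, and no optional parameters are encoded.

```python
import socket
import struct

MARKER = b"\xff" * 16          # 16-byte marker, all ones (RFC 4271)
TYPE_OPEN = 1
TYPE_KEEPALIVE = 4
HEADER_LEN = 19                # marker (16) + length (2) + type (1)


def bgp_header(total_length: int, msg_type: int) -> bytes:
    """Build the 19-byte header that starts every BGP message."""
    return MARKER + struct.pack("!HB", total_length, msg_type)


def keepalive() -> bytes:
    """A keepalive is a header with no body, so it is 19 bytes long."""
    return bgp_header(HEADER_LEN, TYPE_KEEPALIVE)


def open_message(my_as: int, hold_time: int, router_id: str,
                 version: int = 4) -> bytes:
    """Build an open message with an empty optional parameters field."""
    body = struct.pack(
        "!BHH4sB",
        version,                       # 8-bit version
        my_as,                         # 16-bit autonomous system number
        hold_time,                     # 16-bit hold time in seconds
        socket.inet_aton(router_id),   # 32-bit BGP router ID
        0,                             # optional parameters length
    )
    return bgp_header(HEADER_LEN + len(body), TYPE_OPEN) + body


def negotiated_hold_time(configured: int, received: int) -> int:
    """Use the smaller of the local hold time and the one in the open."""
    return min(configured, received)


print(len(keepalive()))
print(len(open_message(65500, 180, "10.1.1.1")))
print(negotiated_hold_time(180, 90))
```

A negotiated hold time of 0 would mean that periodic keepalives are not sent at all, as described above.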
Update Messages
An update message has information on one path only; multiple paths require multiple messages. All the attributes in the update message refer to that path, and the networks are those that can be reached through that path. An update message might include the following fields:
-
Withdrawn routes— A list of IP address prefixes for routes that are being withdrawn from service, if any.
-
Path attributes— The AS-path, origin, local preference, and so forth, as discussed in the next section. Each path attribute includes the attribute type, attribute length, and attribute value (TLV). The attribute type consists of the attribute flags, followed by the attribute type code.
-
Network layer reachability information (NLRI)— A list of networks (IP address prefixes and their prefix lengths) that can be reached by this path.
Notification Messages
A BGP router sends a notification message when it detects an error condition. The BGP router closes the BGP connection immediately after sending the notification message. Notification messages include an error code, an error subcode, and data related to the error.
BGP Attributes
BGP routers send BGP update messages about destination networks to other BGP routers. As described in the previous section, update messages can contain network layer reachability information, which is a list of one or more networks (IP address prefixes and their prefix lengths), and path attributes, which are a set of BGP metrics describing the path to these networks (routes). BGP uses the path attributes to determine the best path to the networks. The following are some terms defining how these attributes are implemented:
-
An attribute is either well-known or optional, mandatory or discretionary, and transitive or nontransitive. An attribute might also be partial.
-
Not all combinations of these characteristics are valid; path attributes fall into four separate categories: well-known mandatory, well-known discretionary, optional transitive, and optional nontransitive.
-
Only optional transitive attributes might be marked as partial.
These characteristics are described in the following sections.
Well-Known Attributes
A well-known attribute is one that all BGP implementations must recognize and propagate to BGP neighbors.
Note | If a well-known mandatory attribute is missing from an update message, a notification error is generated. This ensures that all BGP implementations agree on a standard set of attributes. |
There are two types of well-known attributes:
-
Well-known mandatory attribute— A well-known mandatory attribute must appear in all BGP update messages.
-
Well-known discretionary attribute— A well-known discretionary attribute does not have to be present in all BGP update messages. (In other words, it is recognized by all BGP implementations but does not have to be in every update message.)
Optional Attributes
Attributes that are not well-known are called optional. BGP routers that implement an optional attribute might propagate it to other BGP neighbors, depending on its meaning. Optional attributes are either transitive or nontransitive, as follows:
-
Optional transitive— BGP routers that do not implement an optional transitive attribute should pass it to other BGP routers untouched and mark the attribute as partial.
-
Optional nontransitive— BGP routers that do not implement an optional nontransitive attribute must delete the attribute and must not pass it to other BGP routers.
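These characteristics are carried in the attribute flags of each path attribute. The sketch below assumes the flag bit layout from RFC 4271 (optional 0x80, transitive 0x40, partial 0x20), which this chapter does not list; the function names are hypothetical. Whether a well-known attribute is mandatory or discretionary is defined per attribute type and cannot be read from the flags.

```python
# Attribute flag bits as defined in RFC 4271 (assumed, not from this text).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20


def category(flags: int) -> str:
    """Classify a path attribute from its flags byte."""
    optional = bool(flags & OPTIONAL)
    transitive = bool(flags & TRANSITIVE)
    if not optional:
        # Well-known attributes are always transitive.
        if not transitive:
            raise ValueError("well-known attributes must be transitive")
        return "well-known"
    return "optional transitive" if transitive else "optional nontransitive"


def forward_unrecognized(flags: int):
    """Flags to send onward for an optional attribute that this router
    does not implement, or None if the attribute must be dropped."""
    if not flags & OPTIONAL:
        raise ValueError("every router must recognize well-known attributes")
    if flags & TRANSITIVE:
        return flags | PARTIAL     # pass along, marked as partial
    return None                    # nontransitive: delete it


print(category(0x40))
print(category(0xC0))
print(category(0x80))
print(hex(forward_unrecognized(0xC0)))
print(forward_unrecognized(0x80))
```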
Defined BGP Attributes
The attributes defined by BGP include the following:
-
Well-known mandatory attributes: AS-path, next hop, and origin
-
Well-known discretionary attributes: local preference and atomic aggregate
-
Optional transitive attributes: aggregator and community
-
Optional nontransitive attribute: multiexit-discriminator (MED)
In addition, Cisco has defined a weight attribute for BGP. The weight is configured locally on a router and is not propagated to any other BGP routers.
The AS-path, next-hop, origin, local preference, community, MED, and weight attributes are discussed more fully in the following sections. The atomic aggregate attribute informs the neighbor autonomous system that the originating router has aggregated (summarized) the routes. The aggregator attribute specifies the BGP router ID and autonomous system number of the router that performed the route aggregation. Both of these attributes are discussed in Appendix C, as is BGP community configuration.
Note | Appendix C describes how BGP route summarization is configured, using both the network router configuration command (which is detailed in the “Defining the Networks That BGP Advertises” section later in this chapter) and the aggregate-address ip-address mask [summary-only] [as-set] router configuration command. |
Note | “The Route Selection Decision Process” section, later in this chapter, describes how the attributes are used to determine the best BGP path. |
The AS-Path Attribute
The AS-path attribute is the list of autonomous system numbers that a route has traversed to reach a destination, with the number of the autonomous system that originated the route at the end of the list.
The AS-path attribute is a well-known mandatory attribute. Whenever a route update passes through an autonomous system, the autonomous system number is prepended to that update (in other words, it is put at the beginning of the list) when it is advertised to the next EBGP neighbor.
In Figure 6-26, Router A in autonomous system 64520 advertises network 192.168.1.0. When that route traverses autonomous system 65500, Router C prepends its own autonomous system number to it. When the route to 192.168.1.0 reaches Router B, it has two autonomous system numbers attached to it. From Router B’s perspective, the path to reach 192.168.1.0 is (65500 64520).
Figure 6-26: Router C Prepends Its Own Autonomous System Number as It Passes Routes from Router A to Router B.
The same applies for 192.168.2.0 and 192.168.3.0. Router A’s path to 192.168.2.0 is (65500 65000)—it traverses autonomous system 65500 and then autonomous system 65000. Router C has to traverse path (65000) to reach 192.168.2.0 and path (64520) to reach 192.168.1.0.
BGP routers use the AS-path attribute to ensure a loop-free environment. If a BGP router receives a route in which its own autonomous system is part of the AS-path attribute, it does not accept the route.
Autonomous system numbers are prepended only by routers advertising routes to EBGP neighbors. Routers advertising routes to IBGP neighbors do not change the AS-path attribute.
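The prepending and loop-prevention rules can be sketched in a few lines. This is an illustration using the autonomous system numbers from Figure 6-26; the function names are hypothetical.

```python
def advertise(as_path, local_as, to_ebgp_neighbor):
    """Return the AS-path carried in an update sent to a neighbor.
    The local AS number is prepended only toward EBGP neighbors."""
    if to_ebgp_neighbor:
        return [local_as] + list(as_path)
    return list(as_path)


def accept(as_path, local_as):
    """Reject a route whose AS-path already contains the local AS."""
    return local_as not in as_path


# Router A (AS 64520) originates 192.168.1.0 and sends it to Router C.
path_at_c = advertise([], 64520, to_ebgp_neighbor=True)
# Router C (AS 65500) passes the route to Router B (AS 65000).
path_at_b = advertise(path_at_c, 65500, to_ebgp_neighbor=True)
print(path_at_b)

# If the route were later offered back to AS 64520, it would be refused.
print(accept(path_at_b, 64520))
print(accept(path_at_b, 65000))
```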
The Next-Hop Attribute
The BGP next-hop attribute is a well-known mandatory attribute that indicates the next-hop IP address that is to be used to reach a destination.
BGP, like IGPs, is a hop-by-hop routing protocol. However, unlike IGPs, BGP routes autonomous system by autonomous system, not router by router, and the default next-hop is the next autonomous system. The next-hop address for a network from another autonomous system is an IP address of the entry point of the next autonomous system along the path to that destination network.
Therefore, for EBGP, the next-hop address is the IP address of the neighbor that sent the update.
This is illustrated in Figure 6-27. Router A advertises 172.16.0.0 to Router B, with a next hop of 10.10.10.3, and Router B advertises 172.20.0.0 to Router A, with a next hop of 10.10.10.1. Therefore, Router A uses 10.10.10.1 as the next-hop attribute to get to 172.20.0.0, and Router B uses 10.10.10.3 as the next-hop attribute to get to 172.16.0.0.
For IBGP, however, the protocol states that the next hop advertised by EBGP should be carried into IBGP.
Because of this IBGP rule, Router B in Figure 6-27 advertises 172.16.0.0 to its IBGP peer Router C, with a next hop of 10.10.10.3 (Router A’s address). Therefore, Router C knows that the next hop to reach 172.16.0.0 is 10.10.10.3, not 172.20.10.1, as you might expect.
It is important, therefore, that Router C knows how to reach the 10.10.10.0 subnet, either via an IGP or a static route. Otherwise, it will drop packets destined for 172.16.0.0 because it will not be able to get to the next-hop address for that network.
The IBGP neighboring router performs a recursive lookup to find out how to reach the BGP next-hop address by using its IGP entries in the routing table. For example, Router C in Figure 6-27 learns in a BGP update about network 172.16.0.0/16 from the route source 172.20.10.1 (Router B), with a next hop of 10.10.10.3 (Router A). Router C installs the route to 172.16.0.0/16 in the routing table with a next hop of 10.10.10.3. Assuming that Router B announces network 10.10.10.0/24 using its IGP to Router C, Router C installs that route in its routing table with a next hop of 172.20.10.1. An IGP uses the source IP address of a routing update (route source) as the next-hop address, whereas BGP uses a separate field for each network to record the next-hop address. If Router C has a packet to send to 172.16.100.1, it looks up the network in the routing table and finds a BGP route with a next hop of 10.10.10.3. Because it is a BGP entry, Router C completes a recursive lookup in the routing table for a path to 10.10.10.3; there is an IGP route to network 10.10.10.0 in the routing table with a next hop of 172.20.10.1. Router C then forwards the packet destined for 172.16.100.1 to 172.20.10.1.
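The recursive lookup that Router C performs can be sketched with a toy routing table. This is a simplified model, not how a router implements forwarding; the table layout and function names are assumptions for illustration, and the addresses are those of Figure 6-27.

```python
import ipaddress


def longest_match(table, dest):
    """Return the (source, next_hop) entry of the most specific route
    that covers dest, or None if no route matches."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, entry) for net, entry in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def resolve(table, dest, max_depth=4):
    """Follow BGP next hops until a non-BGP route gives the address of
    the directly usable next hop. Returns None if resolution fails."""
    for _ in range(max_depth):
        entry = longest_match(table, dest)
        if entry is None:
            return None
        source, next_hop = entry
        if source != "bgp":
            return next_hop
        dest = next_hop            # recurse on the BGP next-hop address
    return None


# Router C's routing table in Figure 6-27.
router_c = {
    ipaddress.ip_network("172.16.0.0/16"): ("bgp", "10.10.10.3"),
    ipaddress.ip_network("10.10.10.0/24"): ("igp", "172.20.10.1"),
}
print(resolve(router_c, "172.16.100.1"))

# Without the IGP route to 10.10.10.0/24, the next hop cannot be resolved.
no_igp = {ipaddress.ip_network("172.16.0.0/16"): ("bgp", "10.10.10.3")}
print(resolve(no_igp, "172.16.100.1"))
```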
When running BGP over a multiaccess network such as Ethernet, a BGP router uses the appropriate address as the next-hop address (by changing the next-hop attribute) to avoid inserting additional hops into the path. This feature is sometimes called a third-party next hop.
For example, in Figure 6-28, assume that Routers B and C in autonomous system 65000 are running an IGP, so that Router B can reach network 172.30.0.0 via 10.10.10.2. Router B is also running EBGP with Router A. When Router B sends a BGP update to Router A about 172.30.0.0, it uses 10.10.10.2 as the next hop, not its own IP address (10.10.10.1). This is because the network among the three routers is a multiaccess network, and it makes more sense for Router A to use Router C as a next hop to reach 172.30.0.0, rather than making an extra hop via Router B.
Figure 6-28: Multiaccess Network: Router A Has 10.10.10.2 as the Next-Hop Attribute to Reach 172.30.0.0.
The third-party next-hop address issue also makes sense when you review it from an ISP perspective. A large ISP at a public peering point has multiple routers peering with different neighboring routers; it is not possible for one router to peer with every neighboring router at the major public peering points. For example, in Figure 6-28, Router B might peer with autonomous system 64520, and Router C might peer with autonomous system 64600. However, each router must inform the other IBGP neighbor of reachable networks from other autonomous systems. From the perspective of Router A, it must transit autonomous system 65000 to get to networks in and behind autonomous system 64600. Router A has a neighbor relationship with only Router B in autonomous system 65000. However, Router B does not handle traffic going to autonomous system 64600. Router B gets to autonomous system 64600 through Router C, 10.10.10.2, and Router B must advertise the networks for autonomous system 64600 to Router A, 10.10.10.3. Router B notices that Routers A and C are on the same subnet, so Router B tells Router A to install the autonomous system 64600 networks with a next hop of 10.10.10.2, not 10.10.10.1.
However, if the common medium between routers is a nonbroadcast multiaccess (NBMA) medium, complications might occur.
For example, in Figure 6-29, Routers A, B, and C are connected by Frame Relay. Router B can reach network 172.30.0.0 via 10.10.10.2. When Router B sends a BGP update to Router A about 172.30.0.0, it uses 10.10.10.2 as the next hop, not its own IP address (10.10.10.1). A problem arises if Routers A and C do not know how to communicate directly—in other words, if Routers A and C do not have a Frame Relay map entry to reach each other, Router A does not know how to reach the next-hop address on Router C.
Figure 6-29: NBMA Network: Router A Has 10.10.10.2 as the Next-Hop Attribute to Reach 172.30.0.0, but It Might Be Unreachable.
This behavior can be overridden in Router B by configuring it to advertise itself as the next-hop address for routes sent to Router A; this configuration is described later in the section “Changing the Next-Hop Attribute.”
Note | Routers can be configured to change the next-hop attribute in many scenarios, not only for NBMA networks, as described in the referenced section, later in this chapter. |
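The choice that Router B makes in Figures 6-28 and 6-29 can be sketched as follows. This is a simplified model under stated assumptions: the function name, its parameters, and the `advertise_self` flag are hypothetical and stand in for the override described above; they are not Cisco IOS syntax.

```python
import ipaddress


def advertised_next_hop(own_addr, current_next_hop, peer_addr,
                        shared_subnet, advertise_self=False):
    """Return the next-hop address a router places in an EBGP update.

    If the route's current next hop and the EBGP peer sit on the same
    shared subnet, the current next hop is kept (third-party next hop)
    unless the router is configured to advertise itself instead.
    """
    subnet = ipaddress.ip_network(shared_subnet)
    same_segment = (ipaddress.ip_address(current_next_hop) in subnet
                    and ipaddress.ip_address(peer_addr) in subnet)
    if same_segment and not advertise_self:
        return current_next_hop
    return own_addr


# Router B (10.10.10.1) reaches 172.30.0.0 via Router C (10.10.10.2)
# and advertises the route to Router A (10.10.10.3).
print(advertised_next_hop("10.10.10.1", "10.10.10.2", "10.10.10.3",
                          "10.10.10.0/24"))
# On an NBMA network where A cannot reach C directly, B advertises itself.
print(advertised_next_hop("10.10.10.1", "10.10.10.2", "10.10.10.3",
                          "10.10.10.0/24", advertise_self=True))
```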
The Origin Attribute
The origin is a well-known mandatory attribute that defines the origin of the path information. The origin attribute can be one of three values:
-
IGP— The route is interior to the originating autonomous system. This normally happens when a network command is used to advertise the route via BGP. An origin of IGP is indicated with an i in the BGP table.
-
EGP— The route is learned via EGP. This is indicated with an e in the BGP table. EGP is considered a historic routing protocol and is not supported on the Internet because it performs only classful routing and does not support CIDR.
-
Incomplete— The route’s origin is unknown or is learned via some other means. This usually occurs when a route is redistributed into BGP. (Redistribution is discussed in Chapter 4, “Manipulating Routing Updates,” and in Appendix C.) An incomplete origin is indicated with a ? in the BGP table.
The Local Preference Attribute
Local preference is a well-known discretionary attribute that indicates to routers in the autonomous system which path is preferred to exit the autonomous system.
A path with a higher local preference is preferred.
The term local refers to inside the autonomous system. The local preference attribute is sent only to IBGP neighbors; it is not passed to EBGP peers. Thus, local preference is an attribute that is configured on a router and exchanged only among routers within the same autonomous system. The default value for local preference on a Cisco router is 100.
Note | One way to reduce the IBGP mesh is to divide an autonomous system into multiple sub-autonomous systems and group them into a single confederation. To the outside world, the confederation looks like a single autonomous system. Each sub-autonomous system is fully meshed within itself, and has a few connections to other sub-autonomous systems in the same confederation. Even though the peers in different sub-autonomous systems have EBGP sessions with each other, they exchange routing information as if they were IBGP peers. Specifically, the next-hop, MED, and local preference attributes are preserved. When configuring a BGP confederation, you specify a confederation identifier. To the outside world, the group of sub-autonomous systems will look like a single autonomous system, with the confederation identifier as the autonomous system number. |
For example, in Figure 6-30, autonomous system 64520 receives updates about network 172.16.0.0 from two directions. Router A and Router B are IBGP neighbors. Assume that the local preference on Router A for network 172.16.0.0 is set to 200 and that the local preference on Router B for network 172.16.0.0 is set to 150. Because the local preference information is exchanged within autonomous system 64520, all traffic in autonomous system 64520 addressed to network 172.16.0.0 is sent to Router A as an exit point from autonomous system 64520.
The Community Attribute
BGP communities are one way to filter incoming or outgoing routes. BGP communities allow routers to tag routes with an indicator (the community) and allow other routers to make decisions based on that tag. Any BGP router can tag routes in incoming and outgoing routing updates, or when doing redistribution. Any BGP router can filter routes in incoming or outgoing updates or can select preferred routes based on communities (the tag).
BGP communities are used for destinations (routes) that share some common properties and, therefore, share common policies; routers act on the community rather than on individual routes. Communities are not restricted to one network or one autonomous system, and they have no physical boundaries.
Communities are optional transitive attributes. A router that does not understand communities passes them on unchanged to the next router. However, a router that does understand communities must be configured to propagate them; otherwise, communities are dropped by default.
Note | BGP community configuration is detailed in Appendix C. |
The MED Attribute
The MED attribute, also called the metric, is an optional nontransitive attribute.
Note | The MED was known as the inter–autonomous system attribute in BGP-3. |
Note | The MED attribute is called the metric in the Cisco IOS. In the output of the show ip bgp command for example, the MED is displayed in the metric column. |
The MED indicates to external neighbors the preferred path into an autonomous system. This is a dynamic way for an autonomous system to try to influence another autonomous system as to which way it should choose to reach a certain route if there are multiple entry points into the autonomous system.
A lower metric value is preferred.
Unlike local preference, the MED is exchanged between autonomous systems. The MED is sent to EBGP peers; those routers propagate the MED within their autonomous system, and the routers within the autonomous system use the MED, but do not pass it on to the next autonomous system. When the same update is passed on to another autonomous system, the metric will be set back to the default of 0.
It is important to note the difference between MED and local preference: MED influences inbound traffic to an autonomous system, whereas local preference influences outbound traffic from an autonomous system.
By default, a router compares the MED attribute only for paths from neighbors in the same autonomous system.
Because of the MED attribute, BGP is the only protocol that can affect the path used to send traffic into an autonomous system.
For example, in Figure 6-31, Router B has set the MED attribute to 150, and Router C has set the MED attribute to 200. When Router A receives update messages from Routers B and C (which include the path attributes), it picks Router B as the best next hop to get to autonomous system 65500, because the MED from Router B of 150 is less than the MED from Router C of 200.
Note | By default, the MED comparison is done only if the neighboring autonomous system is the same for all routes considered. For the router to compare metrics from neighbors coming from different autonomous systems, the bgp always-compare-med router configuration command must be configured on the router. |
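The MED comparison, including the default restriction to paths from the same neighboring autonomous system, can be sketched as follows. The dictionary keys and function name are hypothetical; the `always_compare_med` parameter models the effect of the bgp always-compare-med command mentioned in the note above.

```python
def prefer_by_med(path_a, path_b, always_compare_med=False):
    """Return the path preferred by MED, or None if MED does not decide.

    Each path is a dict with "neighbor_as" and "med" keys. By default
    the MEDs are compared only when both paths come from the same
    neighboring autonomous system.
    """
    same_as = path_a["neighbor_as"] == path_b["neighbor_as"]
    if not (same_as or always_compare_med):
        return None
    if path_a["med"] == path_b["med"]:
        return None
    # A lower MED is preferred.
    return path_a if path_a["med"] < path_b["med"] else path_b


# Figure 6-31: Routers B and C are both in autonomous system 65500.
via_b = {"name": "Router B", "neighbor_as": 65500, "med": 150}
via_c = {"name": "Router C", "neighbor_as": 65500, "med": 200}
print(prefer_by_med(via_b, via_c)["name"])

# Paths from different autonomous systems are not compared by default.
via_x = {"name": "Router X", "neighbor_as": 64600, "med": 50}
print(prefer_by_med(via_b, via_x))
print(prefer_by_med(via_b, via_x, always_compare_med=True)["name"])
```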
The Weight Attribute (Cisco Only)
The weight attribute is a Cisco-defined attribute used for the path-selection process. The weight attribute is configured locally and provides local routing policy only; it is not propagated to any BGP neighbors.
Routes with a higher weight are preferred when multiple routes to the same destination exist.
The weight can have a value from 0 to 65535. Paths that the router originates have a weight of 32768 by default, and other paths have a weight of 0 by default.
The weight attribute applies when using one router with multiple exit points out of an autonomous system, as compared to the local preference attribute, which is used when two or more routers provide multiple exit points.
In Figure 6-32, Routers B and C learn about network 172.20.0.0 from autonomous system 65250 and propagate the update to Router A. Router A has two ways to reach 172.20.0.0 and must decide which way to go. In the example, Router A is configured to set the weight of updates coming from Router B to 200 and the weight of those coming from Router C to 150. Because the weight for Router B’s updates is higher than the weight for Router C’s updates, Router A uses Router B as a next hop to reach 172.20.0.0.
The Route-Selection Decision Process
After BGP receives updates about different destinations from different autonomous systems, it decides which path to choose to reach each specific destination. Multiple paths might exist to reach a given network. These are kept in the BGP table. As paths for the network are evaluated, those determined not to be the best path are eliminated from consideration but kept in the BGP table in case the best path becomes inaccessible.
BGP chooses only a single best path to reach a specific destination.
Note | The “Multiple Path Selection” sidebar later in this chapter describes the use of the maximum-paths command to allow multiple paths to be kept in the IP routing table; but BGP still selects one best path for the BGP table. |
BGP is not designed to perform load balancing; paths are chosen because of policy, not based on bandwidth. The BGP selection process eliminates any multiple paths until a single best path is left.
The best path is submitted to the routing table manager process and is evaluated against any other routing protocols that can also reach that network. The route from the routing protocol with the lowest administrative distance is installed in the routing table.
BGP Route-Selection Process
The decision process is based on the attributes discussed earlier in the “BGP Attributes” section. When faced with multiple routes to the same destination, BGP chooses the best route for routing traffic toward the destination. A path is not considered if it is internal, synchronization is on, and the route is not synchronized (in other words, the route is not in the IGP routing table), or if the path’s next-hop address cannot be reached. Therefore, to choose the best route, BGP considers only synchronized routes with no autonomous system loops and a valid next-hop address. The following process summarizes how BGP chooses the best route on a Cisco router:
Only the best path is entered in the routing table and propagated to the router’s BGP neighbors.
Note | The route-selection decision process summarized here does not cover all cases, but it is sufficient for a basic understanding of how BGP selects routes. |
For example, suppose that there are seven paths to reach network 10.0.0.0. All paths have no autonomous system loops and valid next-hop addresses, so all seven paths proceed to Step 1, which examines the weight of the paths. All seven paths have a weight of 0, so they all proceed to Step 2, which examines the paths’ local preference. Four of the paths have a local preference of 200, and the other three have a local preference of 100, 100, and 150. The four with a local preference of 200 continue the evaluation process to the next step. The other three remain in the BGP table but are currently disqualified as the best path.
BGP continues the evaluation process until only a single best path remains. The single best path that remains is offered to the IP routing table as the best BGP path.
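The step-by-step elimination can be sketched as a sequence of filters. This is a simplified model under stated assumptions, not the complete Cisco algorithm: it applies the eligibility checks described above, then weight and local preference (Steps 1 and 2 in the example), followed by the commonly documented next criteria of locally originated routes, shortest AS-path, lowest origin code, lowest MED, and EBGP over IBGP. Later tie-breakers are omitted, MED is compared across all paths regardless of neighboring autonomous system, and all class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    next_hop_reachable: bool = True
    synchronized: bool = True


# Each step maps a path to a value where the smallest value wins.
STEPS = [
    ("highest weight", lambda p: -p.weight),
    ("highest local preference", lambda p: -p.local_pref),
    ("locally originated", lambda p: not p.locally_originated),
    ("shortest AS-path", lambda p: len(p.as_path)),
    ("lowest origin code", lambda p: ORIGIN_RANK[p.origin]),
    ("lowest MED", lambda p: p.med),
    ("EBGP over IBGP", lambda p: not p.ebgp),
]


def best_paths(paths, sync_enabled=False):
    """Return the paths that survive every step. More than one survivor
    means the tie-breakers omitted from this sketch would be needed."""
    candidates = [
        p for p in paths
        if p.next_hop_reachable
        and (p.ebgp or not sync_enabled or p.synchronized)
    ]
    for _, key in STEPS:
        if len(candidates) <= 1:
            break
        best = min(key(p) for p in candidates)
        candidates = [p for p in candidates if key(p) == best]
    return candidates


# The seven paths to 10.0.0.0 from the example: equal weight, so local
# preference leaves the four paths that have a value of 200.
prefs = [200, 200, 200, 200, 100, 100, 150]
seven = [Path(f"path{i + 1}", local_pref=lp) for i, lp in enumerate(prefs)]
print([p.name for p in best_paths(seven)])
```

Expressing each criterion as a key function keeps the order of the steps in one list, so the elimination loop itself stays the same when a criterion is added or removed.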