Troubleshooting OSPF
OSPF is arguably the most popular routing protocol used in large enterprise networks and in service provider networks. Troubleshooting problems related to the exchange of routing information is an essential skill for a network engineer who is involved in the implementation and maintenance of large networks that use OSPF as the IGP. To diagnose and resolve problems related to the exchange of routing information by use of the OSPF routing protocol, you must do the following:
-
Apply your knowledge of OSPF data structures to plan the gathering of necessary information as part of a structured approach to troubleshooting OSPF routing problems.
-
Apply your knowledge of the processes that OSPF uses to exchange network topology information within an area to interpret and analyze the information that is gathered during an OSPF troubleshooting process.
-
Apply your knowledge of the processes that OSPF uses to exchange network topology information between areas to interpret and analyze the information that is gathered during an OSPF troubleshooting process.
-
Use Cisco IOS commands to gather information from the OSPF data structures and track the flow of OSPF routing information to troubleshoot OSPF operation.
OSPF Data Structures
To troubleshoot IP connectivity problems caused by missing or incorrect routes in a network that uses OSPF as the routing protocol, you need to have a good understanding of the processes and data structures that OSPF uses to distribute, store and select routing information. For any routing protocol, the following high-level elements and processes can be discerned:
-
Reception of routing information from neighbors: With OSPF, routing information is not exchanged in the form of routes, but in the form of link-state advertisements (LSAs), which contain information about elements of the network topology, such as routers, neighbor relationships, connected subnets, subnets available in different areas, redistributing routers, and redistributed subnets.
-
Routing protocol data structures: OSPF stores the LSAs that it receives in a link-state database. Dijkstra’s shortest path first (SPF) algorithm is used to compute the shortest path in terms of cost, which is the OSPF metric, to each network, based on the information in the link-state database. In addition, several other data structures, such as an interface table, a neighbor table, and a Routing Information Base (RIB) are maintained.
-
Route injection or redistribution: Directly connected networks that are enabled for OSPF are advertised in the router’s LSA. Routes from other sources, such as other routing protocols or static routes, can also be imported into the link-state database and advertised by use of special LSAs.
-
Route selection and installation: OSPF will attempt to install the best routes that are computed using the SPF algorithm in the routing table. OSPF discerns three different types of routes: intra-area routes, interarea routes, and external routes. If two routes of different types for the same prefix are available for installation in the routing table, OSPF will prefer intra-area routes over interarea routes, and both these types will be preferred over external routes, regardless of the cost of the paths. If two equal-cost routes of the same type are available, they will both be selected for installation in the routing table.
-
Transmission of routing information to neighbors: Routing information is flooded to all routers in an area by passing LSAs from neighbor to neighbor using a reliable transport mechanism. Area Border Routers (ABRs) inject routing information from an area into the backbone area or, conversely, from the backbone area into the other areas that they are connected to.
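The route selection rule described in the list above can be sketched in a few lines of Python. This is a simplified illustration, not how Cisco IOS implements it: the `Candidate` class, the `select_routes` function, and the next-hop address of the external route are hypothetical names and values introduced only for this example.

```python
from dataclasses import dataclass

# Lower rank wins: intra-area beats interarea, which beats external,
# regardless of the cost of the paths.
TYPE_RANK = {"intra": 0, "inter": 1, "external": 2}


@dataclass(frozen=True)
class Candidate:
    """One computed path to a prefix (hypothetical structure for illustration)."""
    prefix: str
    route_type: str  # "intra", "inter", or "external"
    cost: int
    next_hop: str


def select_routes(candidates):
    """Return the paths that would be offered to the routing table for one prefix."""
    if not candidates:
        return []
    # Step 1: keep only the most preferred route type.
    best_rank = min(TYPE_RANK[c.route_type] for c in candidates)
    same_type = [c for c in candidates if TYPE_RANK[c.route_type] == best_rank]
    # Step 2: within that type, keep every path that ties for the lowest cost.
    best_cost = min(c.cost for c in same_type)
    return [c for c in same_type if c.cost == best_cost]


candidates = [
    Candidate("10.1.152.0/24", "inter", 2, "10.1.192.1"),
    Candidate("10.1.152.0/24", "inter", 2, "10.1.192.9"),
    # The external route has a lower cost but a less preferred type.
    Candidate("10.1.152.0/24", "external", 1, "10.1.192.5"),
]
selected = select_routes(candidates)
print([c.next_hop for c in selected])
```

Both interarea paths are kept because they are of the same type and have equal cost; the external path is discarded even though its cost is lower.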
OSPF stores its operational data, configured parameters, and statistics in four main data structures:
-
Interface table: This table lists all interfaces that have been enabled for OSPF. The directly connected subnets associated with these interfaces are included in the type 1 router LSA that the router injects into the OSPF link-state database for its area. When an interface is configured as a passive interface, it is still listed in the OSPF interface table, but no neighbor relationships are established on this interface.
-
Neighbor table: This table is used to keep track of all active OSPF neighbors. Neighbors are added to this table based on the reception of Hello packets, and they are removed when the OSPF dead time for a neighbor expires or when the associated interface goes down. OSPF goes through a number of states while establishing a neighbor relationship (also known as adjacency), and the neighbor table lists the current state for each individual neighbor.
-
Link-state database: This is the main data structure that OSPF uses to store all its network topology information. This database contains full topology information for the areas that a router connects to, and information about the paths available to reach networks and subnets in other areas or other autonomous systems. Because this database contains a wealth of network topology information, it is one of the most important data structures to gather information from when troubleshooting OSPF problems.
-
Routing Information Base: After executing the SPF algorithm, the results of this calculation are stored in the RIB. This information includes the best routes to each individual prefix in the OSPF network with their associated path costs. When the information in the link-state database changes, only a partial recalculation might be necessary (depending on the nature of the change), and routes might be added to or deleted from the RIB without the need for a full SPF recalculation. From the RIB, OSPF offers its routes to the IP routing table.
Note | Within the OSPF link-state database, the best path to each destination is determined based on the SPF (Dijkstra) algorithm. The collection of these best paths is referred to as the OSPF RIB. There is no separate physical data structure called the OSPF RIB. The best path to each destination is offered to be installed in the IP routing table. When there are alternatives, the IP process selects the path with the smallest administrative distance and installs it in the IP routing table. Since 2001, following the release of RFC 3222, many writings have referred to the IP routing table as the RIB (generic RIB rather than OSPF or BGP RIB). This term is easier to use than IP routing table, and it allows us to distinguish it from the FIB that CEF creates. Although the FIB is created based on the RIB, it is a separate data structure, and IP packet forwarding (a data plane task) in a Cisco router is performed using the FIB and the FIB adjacency table. |
The OSPF link-state database is used to store all the network topology information a router receives. There are separate (logical) databases for each OSPF area. In a stable situation, the database for each area will be identical on all routers in that area. An ABR has a database for each of the areas that it participates in and is responsible for exchanging network topology information between the databases of its connected areas and the backbone area. External routing information that was redistributed into OSPF from a different source is maintained in a separate section of the database that is not specific to any area. Figure 5-3 shows a network using OSPF and consisting of areas 0, 1, and 2. Table 5-2 lists the LSA types present in the database of each of the routers and the number of each LSA type. These numbers assume that no redistribution or summarization has been configured and that each link represents a single subnet.
Router | Type 1 | Type 2 | Type 3 |
---|---|---|---|
Router A | 2 | 1 | 5 |
Router B | 5 | 2 | 9 |
Router C | 3 | 1 | 4 |
Router D | 5 | 1 | 9 |
Router E | 2 | 0 | 5 |
In a multi-area OSPF network without any redistribution, only LSA type 1, type 2, and type 3 are used. They serve the following purposes:
-
Each router in an area generates a type 1 LSA that describes that router’s link state, including its directly connected subnets, connection types, and neighbors. Type 1 LSAs are not passed between areas.
-
By default, for each multiaccess (broadcast or nonbroadcast) type network, OSPF elects a designated router. If the network is a transit network (more than one router is connected to it), the designated router generates a type 2 LSA that describes the link state for that link, including its subnet and connected routers. Type 2 LSAs are not passed between areas.
-
For each subnet that an ABR can reach in a connected area, it will generate a type 3 LSA in the database of the backbone area 0, listing the subnet and its associated cost. For each subnet that it can reach in the backbone area, either directly or through another ABR, it will generate a type 3 LSA in the database of each connected area listing the subnet and its cost.
Based on these rules for the network depicted in Figure 5-3, you can calculate the number of elements in each area’s database:
-
Type 1: Area 1 contains two routers (A and B), and therefore its database will contain two type 1 entries. Area 0 contains three routers (B, C, and D), and therefore its database will contain three type 1 entries. Area 2 contains two routers, and its database will therefore contain two type 1 entries.
-
Type 2: Area 1 contains one transit Ethernet (broadcast) network, and therefore its database will contain one type 2 entry. There is a second Ethernet network in area 1, but this is a stub network and not a transit network. Area 0 also contains one transit Ethernet link, and therefore it will also contain one type 2 entry. A designated router is not elected for a point-to-point link. Area 2 has no transit broadcast or nonbroadcast type networks, and therefore no type 2 entries are generated.
-
Type 3: Area 1 contains two subnets. For each of these subnets, a type 3 entry will be generated in the area 0 database by Router B. Subsequently, Router D will generate a type 3 entry for each of these subnets in the area 2 database. Area 0 contains three subnets. Router B will generate a type 3 entry for each of those subnets in the area 1 database, and Router D will also generate three entries in the area 2 database. Area 2 contains two subnets, and therefore Router D will generate two type 3 entries in the area 0 database, and Router B will generate two type 3 entries in the area 1 database. In total, this means that the area 1 database contains 3 + 2 = 5 type 3 entries, the area 0 database contains 2 + 2 = 4 type 3 entries, and the area 2 database contains 2 + 3 = 5 type 3 entries. For each area database, the total equals the number of subnets available outside its area.
Therefore, the totals for each area database are the following:
-
Area 1: Two type 1, one type 2, and five type 3 entries
-
Area 0: Three type 1, one type 2, and four type 3 entries
-
Area 2: Two type 1, zero type 2, and five type 3 entries
The final step is to add the numbers for each individual router:
-
Router A: Router A carries only the area 1 database, and as a result, it has two type 1, one type 2, and five type 3 entries.
-
Router B: Router B is an ABR and carries the databases for both area 1 and area 0. Therefore, it has 2 + 3 = 5 type 1 entries, 1 + 1 = 2 type 2 entries, and 5 + 4 = 9 type 3 entries.
-
Router C: Router C carries only the area 0 database, which contains three type 1, one type 2, and four type 3 entries.
-
Router D: Router D is an ABR and carries the databases for area 0 and area 2. Therefore, it has 3 + 2 = 5 type 1, 1 + 0 = 1 type 2 and 4 + 5 = 9 type 3 entries.
-
Router E: Router E carries only the area 2 database, which contains two type 1, zero type 2, and five type 3 entries.
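The counting steps above can be reproduced with a short Python sketch. The per-area figures (routers, transit multiaccess networks, subnets) are taken from the worked example; the `AREAS` dictionary and the function names are hypothetical, and the sketch relies on the same assumptions as the text: no redistribution, no summarization, and one subnet per link.

```python
# Per-area facts from the worked example for Figure 5-3.
AREAS = {
    1: {"routers": {"A", "B"}, "transit_multiaccess": 1, "subnets": 2},
    0: {"routers": {"B", "C", "D"}, "transit_multiaccess": 1, "subnets": 3},
    2: {"routers": {"D", "E"}, "transit_multiaccess": 0, "subnets": 2},
}


def area_counts(area_id):
    """Return (type 1, type 2, type 3) LSA counts for one area database."""
    area = AREAS[area_id]
    type1 = len(area["routers"])          # one router LSA per router in the area
    type2 = area["transit_multiaccess"]   # one network LSA per transit multiaccess network
    total_subnets = sum(a["subnets"] for a in AREAS.values())
    type3 = total_subnets - area["subnets"]  # one summary LSA per subnet outside the area
    return (type1, type2, type3)


def router_counts(router):
    """Sum the counts of every area database that the router carries."""
    totals = [0, 0, 0]
    for area_id, area in AREAS.items():
        if router in area["routers"]:
            for i, count in enumerate(area_counts(area_id)):
                totals[i] += count
    return tuple(totals)


for name in "ABCDE":
    print("Router", name, router_counts(name))
```

The per-router tuples produced by `router_counts` correspond to the rows of Table 5-2. Comparing such predicted counts with what a router actually holds in its database is one way to spot a missing LSA.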
To troubleshoot, you often need to compare observed behavior against expected behavior, and when troubleshooting OSPF this means that you must predict which different types of LSAs you can expect each router to generate. Understanding the role and content of the most fundamental OSPF LSA types (type 1, type 2, and type 3) is essential to troubleshoot OSPF effectively.
OSPF Information Flow Within an Area
OSPF discovers neighbors through the transmission of periodic Hello packets. Two routers will become neighbors only if the following parameters match in the Hello packets:
-
Hello and dead timers: Two routers will only become neighbors if they use the same Hello and dead time. The default values for broadcast and point-to-point type networks are 10-second Hello and 40-second dead time. If these timers are changed on an interface of a router, the timers should be configured to match on all neighboring routers on that interface.
-
OSPF area number: Two routers will become neighbors on a link only if they both consider that link to be in the same area.
-
OSPF area type: Two routers will become neighbors only if they both consider the area to be the same type of area (normal, stub, or not-so-stubby area [NSSA]).
-
IP subnet and subnet mask: Two routers will not become neighbors if they are not on the same subnet. The exception to this rule is on a point-to-point link, where the subnet mask is not verified.
-
Authentication type and authentication data: Two routers will become neighbors only if they both use the same authentication type (null, clear text, or message digest 5 [MD5]). If they use authentication, the authentication data (password or hash value) also needs to match.
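As a rough checklist, the parameter comparison described in this list can be expressed as a Python sketch. The `HelloParams` class and the `hello_mismatches` function are hypothetical and do not model the real Hello packet format. The sketch also simplifies the point-to-point exception by skipping the subnet comparison entirely on point-to-point links.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    """Parameters that must agree between two neighbors (illustrative only)."""
    hello: int            # Hello interval in seconds
    dead: int             # dead time in seconds
    area: int
    area_type: str        # "normal", "stub", or "nssa"
    address: str          # interface address with prefix length, e.g. "10.1.192.2/30"
    auth_type: str        # "null", "clear", or "md5"
    auth_data: str = ""
    point_to_point: bool = False


def hello_mismatches(local, remote):
    """Return the names of the parameter groups that do not match."""
    problems = []
    if (local.hello, local.dead) != (remote.hello, remote.dead):
        problems.append("hello/dead timers")
    if local.area != remote.area:
        problems.append("area number")
    if local.area_type != remote.area_type:
        problems.append("area type")
    # Simplification: the subnet check is skipped on point-to-point links.
    if not (local.point_to_point and remote.point_to_point):
        local_net = ipaddress.ip_interface(local.address).network
        remote_net = ipaddress.ip_interface(remote.address).network
        if local_net != remote_net:
            problems.append("subnet or mask")
    if (local.auth_type, local.auth_data) != (remote.auth_type, remote.auth_data):
        problems.append("authentication")
    return problems


router_x = HelloParams(10, 40, 0, "normal", "10.1.192.10/30", "null")
router_y = HelloParams(10, 120, 0, "normal", "10.1.192.9/30", "null")
print(hello_mismatches(router_x, router_y))
```

In this made-up pair, only the dead time differs, so the timers are the single reported mismatch.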
If two routers do not list each other as neighbors on a link and the interfaces have been activated for OSPF on both sides, you must verify the parameters in the preceding list. A mismatch in any of these parameters will also show in the output of the debug ip ospf events command. After a router has received a Hello packet and registered the neighbor in its neighbor table, it will attempt to build a neighbor relationship or adjacency with the neighboring router and exchange topology information to synchronize their link-state databases. This process consists of several stages:
-
Attempt: This state is encountered only when unicast Hellos are used and a neighbor has been explicitly configured. When the router has sent a Hello to the configured neighbor but not received any Hello from the neighbor yet, the neighbor relationship is in the attempt state. This state is not used on point-to-point or broadcast type networks, which use multicast Hellos rather than unicast Hellos.
-
Init: This is the state that a neighbor is in if a Hello has been received from the neighbor but the neighbor is not listing this router in its neighbor list yet. This is a transitory state, and if a router is stuck in this state, this usually indicates that the neighbor is not receiving this router’s Hello packets correctly.
-
2-way: This is the state that a neighbor relationship is in when the router sees its own router ID listed in the active neighbor list in the Hello packets received from that neighbor. This is usually a transitory state. The only time when this is considered the normal state is on broadcast or nonbroadcast networks, between two routers that are neither the designated router (DR) nor the backup designated router (BDR) for the segment.
-
Exstart: This stage indicates that the routers are starting the database exchange process by establishing a master and slave relationship and determining the initial sequence number for the database description (DBD) packets.
-
Exchange: During the exchange stage, neighboring routers exchange database description packets to list the content of their link-state database to discover which entries each neighbor is missing. Exstart and exchange are transitory states, and if a neighbor is stuck in exstart or exchange state, this could indicate a mismatch in the maximum transmission unit (MTU) between the neighbors or a duplicate router ID.
-
Loading: During this stage, each of the two routers can request missing LSAs from the other router. This is a transitory state. If routers are stuck in this state, this could indicate packet or memory corruption, or in certain scenarios, an MTU mismatch between the neighbors.
-
Full: This is the normal final stage of OSPF adjacency establishment. This state indicates that the router and its neighbor have successfully synchronized their link-state databases.
Full is the normal state after establishing a neighbor relationship and indicates that the routers have synchronized their databases. 2-way is an acceptable final state for certain types of neighbor relationships (two non-DR, non-BDR routers on a broadcast or nonbroadcast network). Any other state is a transitory state, and if a router is stuck in one of these states for an extended period of time, this calls for further investigation. During this adjacency building process, two routers in the same area synchronize their link-state databases for that area, and when they have reached the “full” state, their databases for the area are identical.
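The interpretation of neighbor states summarized above can be captured in a small lookup, sketched here in Python. The function name, its parameters, and the state spellings are hypothetical; the hints are paraphrased from the state descriptions in this section and are starting points for investigation, not definitive diagnoses.

```python
# Likely causes when a neighbor stays in a transitory state,
# paraphrased from the state descriptions in this section.
STUCK_HINTS = {
    "attempt": "no Hello received yet from the explicitly configured neighbor",
    "init": "neighbor is probably not receiving this router's Hellos",
    "exstart": "possible MTU mismatch or duplicate router ID",
    "exchange": "possible MTU mismatch or duplicate router ID",
    "loading": "possible packet or memory corruption, or an MTU mismatch",
}


def assess_neighbor(state, multiaccess=False,
                    local_dr_or_bdr=False, neighbor_dr_or_bdr=False):
    """Classify a neighbor state as acceptable or worth investigating."""
    state = state.lower()
    if state == "full":
        return "ok: databases synchronized"
    if state == "2-way":
        # 2-way is a normal final state only between two routers that are
        # neither DR nor BDR on a broadcast or nonbroadcast network.
        if multiaccess and not local_dr_or_bdr and not neighbor_dr_or_bdr:
            return "ok: two non-DR, non-BDR routers on a multiaccess network"
        return "investigate: 2-way should be transitory for this neighbor"
    if state in STUCK_HINTS:
        return "investigate if persistent: " + STUCK_HINTS[state]
    raise ValueError("unknown OSPF neighbor state: " + state)


print(assess_neighbor("exstart"))
print(assess_neighbor("2-way", multiaccess=True))
```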
OSPF Information Flow Between Areas
ABRs play a key role in exchanging routing information between OSPF areas. Two routers can become neighbors on a link only if they are in the same area. When they exchange their databases during the initial database exchange, LSAs that belong to different areas are not exchanged. To distribute information about subnets that are available in a particular area to other areas, the ABR generates type 3 LSAs to inject the information into the area 0 database. Other ABRs use these type 3 LSAs to compute the best path to these subnets and then in turn inject the information into their connected areas by use of type 3 LSAs. The diagram in Figure 5-4 illustrates this process.
In the network shown in Figure 5-4, Router B, which is an ABR, executes the SPF algorithm using the area 1 database to compute the best path for each subnet that is available in area 1. Based on this computation, Router B generates a type 3 LSA for each of these subnets and injects these LSAs into the area 0 database. These type 3 LSAs include the cost that Router B has computed for each of these prefixes. Any other router in area 0, such as Router C or D, executes its own SPF algorithm based on the area 0 database. These routers then add the cost in the type 3 LSAs to their computed cost to Router B to find their total cost to the prefixes in area 1. After Router D has computed these costs, it generates type 3 LSAs for the prefixes from area 1 and injects these into the area 2 database. Any router in area 2, such as Router E, can now compute its own best path to Router D and add this cost to the cost advertised by Router D in the type 3 LSAs to find the total path cost to each prefix in area 1. When you are troubleshooting OSPF and tracking the advertisement of routes and their associated costs, it is important to understand this process. It enables you to know which routers are expected to generate the necessary type 3 LSAs and how the cost of the total path is calculated, and it helps you to track the flow of routing information from a router in one area to routers in different areas.
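The cost accumulation described above can be followed with a small worked example in Python. All cost values here are hypothetical (the figure's actual link costs are not given in the text), and `interarea_cost` is an illustrative helper, not an IOS function.

```python
def interarea_cost(cost_to_abr, lsa_metric):
    """Total cost = own SPF cost to the advertising ABR + metric in its type 3 LSA."""
    return cost_to_abr + lsa_metric


# Hypothetical costs, chosen only to make the arithmetic visible.
b_cost_to_prefix = 10  # Router B's SPF result in area 1; becomes its type 3 metric in area 0
c_cost_to_b = 1        # Router C's SPF cost to Router B within area 0
d_cost_to_b = 2        # Router D's SPF cost to Router B within area 0
e_cost_to_d = 5        # Router E's SPF cost to Router D within area 2

c_total = interarea_cost(c_cost_to_b, b_cost_to_prefix)
d_total = interarea_cost(d_cost_to_b, b_cost_to_prefix)

# Router D re-advertises the prefix into area 2 with its own total cost as the metric.
d_lsa_metric = d_total
e_total = interarea_cost(e_cost_to_d, d_lsa_metric)

print("C:", c_total, "D:", d_total, "E:", e_total)
```

With these assumed values, Router C reaches the prefix at cost 11, Router D at cost 12, and Router E at cost 17. When a route's metric looks wrong, recomputing it hop by hop in this way shows which ABR's advertisement or which intra-area path contributes the unexpected value.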
Cisco IOS OSPF Commands
The following commands enable you to gather information from the OSPF data structures:
-
show ip ospf interface: This command is used to display the interfaces that have been activated for OSPF. This list contains all the interfaces that have an IP address that is covered by one of the network statements under the OSPF configuration. This command displays a lot of detailed information for each interface. For a brief overview, issue the command show ip ospf interface brief.
-
show ip ospf neighbor: This command lists all neighbors that have been discovered by this router on its active OSPF interfaces and shows their current state.
-
show ip ospf database: This command displays the content of the OSPF link-state database. When the command is issued without any additional options, it will display a summary of the database, listing only the LSA headers. Using additional command options, specific LSAs can be selected, and the actual LSA content can be inspected.
-
show ip ospf statistics: This command can be used to view how often and when the SPF algorithm was last executed. This command can be helpful when diagnosing routing instability.
The following debug commands enable you to observe the transmission and reception of packets and the exchange of routing information:
-
debug ip routing: This command is not specific to the OSPF protocol, but displays any changes that are made to the routing table, such as installation or removal of routes. This can prove useful when troubleshooting routing protocol instabilities.
-
debug ip ospf packet: This command displays the transmission and reception of OSPF packets. Only the packet headers are displayed, not the content of the packets. This command is useful to verify whether Hellos are sent and received as expected.
-
debug ip ospf events: This command displays OSPF events. This includes reception and transmission of Hellos, but also the establishment of neighbor relationships and the reception or transmission of LSAs. This command can also provide clues (mismatched parameters such as timers, area number, and so on) as to why neighbor Hellos might be ignored.
-
debug ip ospf adj: This command displays events related to the adjacency building process and enables you to see a neighbor relationship transition from one state to the next. During troubleshooting, you can observe the transitions from one state to another, and possibly the state at which the relation gets stuck.
-
debug ip ospf monitor: This command monitors when the SPF algorithm is scheduled to run and displays the triggering LSA and a summary of the results after the SPF algorithm has completed. During troubleshooting, this command enables you to discover which LSA was received and triggered an SPF computation. For example, you can easily discover a flapping link.
These debug commands can generate a large amount of output, and proper care needs to be taken to prevent this from affecting the router’s performance. Logging debug output to buffers on the router rather than to the console can limit the impact of these commands.
Troubleshooting Example: Routing Problem in an OSPF Network
The network shown in Figure 5-5 is using OSPF as the routing protocol and is configured for multiple areas. When you examine the routing table on router CRO1, you find only a single entry for prefix 10.1.152.0/24: the path through router CSW1.
Example 5-10 displays the output of the show ip route command on router CRO1 for prefix 10.1.152.0.
Routing entry for 10.1.152.0/24
Known via "ospf 100", distance 110, metric 2, type inter area
Last update from 10.1.192.1 on FastEthernet0/0, 00:00:11 ago
Routing Descriptor Blocks:
* 10.1.192.1, from 10.1.220.252, 00:00:11 ago, via FastEthernet0/0
Route metric is 2, traffic share count is 1
This result is unexpected because Figure 5-5 shows that two equal-cost paths are available to CRO1, one through CSW1 and one through CSW2. To verify whether this problem is caused by an underlying Layer 1, Layer 2, or Layer 3 problem, execute a ping command to test the Layer 3 connectivity to router CSW2 using the FastEthernet 0/1 interface, as demonstrated in Example 5-11.
CRO1# ping 10.1.192.9
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.1.192.9, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
Because this ping succeeds, you can conclude that the Fast Ethernet link between router CRO1 and router CSW2 is operational at Layer 3 and below. Given that the link is functional but is not used to route packets, this is a routing problem and calls for an investigation of the operation of OSPF. There are several ways to troubleshoot this problem. The objective is to find out why the second, equal-cost path through router CSW2 is not installed in the routing table in addition to the entry through router CSW1. There are two main reasons why this could be happening. Either router CSW2 is not advertising subnet 10.1.152.0/24 to area 0, or it is advertising the route, but the cost to reach subnet 10.1.152.0/24 through router CSW2 from router CRO1 is considered to be worse than the cost through router CSW1. To find out whether router CSW2 is advertising subnet 10.1.152.0/24 to area 0, you can consult the OSPF database in router CRO1. It is expected that both routers CSW1 and CSW2 advertise a type 3 summary LSA for subnet 10.1.152.0/24. Example 5-12 shows the output from the show ip ospf database summary command for the lsa-id 10.1.152.0 on CRO1. For type 3 LSAs, the link-state identifier is the subnet part of the associated prefix.
CRO1# show ip ospf database summary 10.1.152.0
OSPF Router with ID (10.1.220.1) (Process ID 100)
Summary Net Link States (Area 0)
Routing Bit Set on this LSA
LS age: 201
Options: (No TOS-capability, DC, Upward)
LS Type: Summary Links(Network)
Link State ID: 10.1.152.0 (summary Network Number)
Advertising Router: 10.1.220.252
LS Seq Number: 80000001
Checksum: 0x1C97
Length: 28
Network Mask: /24 TOS: 0 Metric: 1
LS age: 136
Options: (No TOS-capability, DC, Upward)
LS Type: Summary Links(Network)
Link State ID: 10.1.152.0 (summary Network Number)
Advertising Router: 10.1.220.253
LS Seq Number: 80000001
Checksum: 0x169C
Length: 28
Network Mask: /24 TOS: 0 Metric: 1
The result of the show ip ospf database summary command for the lsa-id 10.1.152.0 on CRO1 shows two entries. One entry was generated by the router with router ID 10.1.220.252 (CSW1). The second entry was generated by the router with router ID 10.1.220.253 (CSW2). Both entries are advertised with an OSPF cost of 1. Therefore, the preference for the path to 10.1.152.0/24 via CSW1 must be based on the topology within area 0. Given that router CRO1 has a direct connection in area 0 to both router CSW1 and CSW2, there are only two plausible explanations for the fact that router CRO1 is not using the path via router CSW2. Either the direct path to router CSW2 is not used because routers CSW2 and CRO1 have not become neighbors, or the path is not used because the cost for interface FastEthernet 0/1 is higher than the cost for interface FastEthernet 0/0. The second option is less likely because the interfaces are of the same type and by default the OSPF cost is related to the interface bandwidth. To verify whether router CRO1 has established a proper neighbor relationship with router CSW2, the show ip ospf neighbor command was executed on CRO1, and Example 5-13 shows the results.
CRO1# show ip ospf neighbor
Neighbor ID Pri State Dead Time Address Interface
10.1.220.252 1 FULL/DR 00:00:33 10.1.192.1 FastEthernet0/0
The output in Example 5-13 shows that router CSW1 (router ID 10.1.220.252) is the only neighbor listed for router CRO1. Note that in the output of the command the first address is the router ID of the neighbor, whereas the second IP address in the Address column is the interface IP address of the neighbor. There could be several reasons why router CSW2 is not listed as a neighbor of CRO1. One reason could be that it is not sending Hellos. Another reason could be that the Hellos are received but ignored because of mismatched Hello parameters. A third explanation could be that the Hellos are sent but not received because interface FastEthernet 0/1 has not been activated for OSPF and therefore does not listen to the OSPF multicast group 224.0.0.5. To verify whether interface FastEthernet 0/1 has been enabled for OSPF, you would issue the show ip ospf interface brief command on CRO1, as demonstrated in Example 5-14.
CRO1# sh ip ospf interface brief
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Lo0 100 0 10.1.220.1/32 1 LOOP 0/0
Fa0/0 100 0 10.1.192.2/30 1 BDR 1/1
Because interface FastEthernet 0/1 is not listed in the output shown in Example 5-14, you can conclude that it has not been activated for OSPF. Unlike the show ip eigrp interfaces command, the show ip ospf interface command displays interfaces that are enabled for OSPF even when they are configured as passive interfaces. Therefore, the only possible explanation is that interface Fa0/1 has not been enabled for OSPF on router CRO1. Whether an interface is enabled for OSPF is controlled by configuration commands. Therefore, you would need to check the configuration to see why interface FastEthernet 0/1 has not been enabled. This is done using the show running-config command for the OSPF section, as demonstrated in Example 5-15.
CRO1# show running-config | section router ospf
router ospf 100
log-adjacency-changes
network 10.1.192.2 0.0.0.0 area 0
network 10.1.192.9 0.0.0.0 area 0
network 10.1.220.1 0.0.0.0 area 0
From the output shown in Example 5-15, it is obvious that a problem exists with one of the network statements. The statement network 10.1.192.9 0.0.0.0 area 0 matches IP address 10.1.192.9, which is not one of router CRO1’s IP addresses, but an IP address of router CSW2. This is clearly a configuration mistake, and the network statement needs to be replaced with the statement network 10.1.192.10 0.0.0.0 area 0 or some other network statement that matches IP address 10.1.192.10 (which is the IP address of router CRO1 on interface FastEthernet 0/1). The network statement is replaced to test the hypothesis that the misconfigured network statement was the cause of the problem. Next, you must verify the results of the change to confirm that the problem was solved. Strictly speaking, the only test that needs to be performed to confirm that the problem was solved is to execute the show ip route 10.1.152.0 255.255.255.0 command again to confirm that both paths through routers CSW1 and CSW2 now show up in the IP routing table. However, to demonstrate the impact of the changes on the OSPF data structures, the changes in the OSPF interface table and OSPF neighbor table are displayed in Example 5-16.
CRO1# show ip ospf interface brief
Interface PID Area IP Address/Mask Cost State Nbrs F/C
Lo0 100 0 10.1.220.1/32 1 LOOP 0/0
Fa0/1 100 0 10.1.192.10/30 1 BDR 1/1
Fa0/0 100 0 10.1.192.2/30 1 BDR 1/1
CRO1# show ip ospf neighbor
Neighbor ID Pri State Dead Time Address Interface
10.1.220.253 1 FULL/DR 00:00:39 10.1.192.9 FastEthernet0/1
10.1.220.252 1 FULL/DR 00:00:31 10.1.192.1 FastEthernet0/0
After we make the change to the OSPF network statement in router CRO1, as shown in Example 5-16, the interface table lists interface FastEthernet 0/1. This means that OSPF has now been enabled for the interface and OSPF packets are processed on interface FastEthernet 0/1. The router ID (10.1.220.253) and interface IP address (10.1.192.9) of router CSW2 are now listed in the neighbor table on the FastEthernet 0/1 interface. Finally, the output shown in Example 5-17 confirms that the path through router CSW2 has been installed in the routing table in addition to the path through router CSW1.
CRO1# show ip route 10.1.152.0 255.255.255.0
Routing entry for 10.1.152.0/24
Known via "ospf 100", distance 110, metric 2, type inter area
Last update from 10.1.192.9 on FastEthernet0/1, 00:00:29 ago
Routing Descriptor Blocks:
10.1.192.9, from 10.1.220.253, 00:00:29 ago, via FastEthernet0/1
Route metric is 2, traffic share count is 1
* 10.1.192.1, from 10.1.220.252, 00:00:29 ago, via FastEthernet0/0
Route metric is 2, traffic share count is 1
Note that the routing source for the entries (the address behind the word from) in the routing table lists the router IDs of routers CSW2 (10.1.220.253) and CSW1 (10.1.220.252), respectively. These routers are listed as the source of the route, because they generated the type 3 LSA entries from which these routes were calculated. This OSPF troubleshooting example illustrates the use of the Cisco IOS show commands to display the content of the OSPF data structures and how to leverage knowledge of these data structures and OSPF routing information flow to diagnose and resolve OSPF routing problems.
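The root cause in this example, a network statement that matches no local interface address, can be checked mechanically. The following Python sketch applies the wildcard-mask comparison to the addresses from Examples 5-15 and 5-16. The function names are hypothetical, and the 10.1.192.8 0.0.0.3 statement in the usage note is only an illustration of "some other network statement" that would also match.

```python
import ipaddress


def statement_matches(statement_addr, wildcard, interface_addr):
    """True if the interface address falls within a network statement.

    Bits set to 1 in the wildcard mask are ignored; all other bits must match.
    """
    base = int(ipaddress.IPv4Address(statement_addr))
    wild = int(ipaddress.IPv4Address(wildcard))
    addr = int(ipaddress.IPv4Address(interface_addr))
    care = ~wild & 0xFFFFFFFF
    return (base & care) == (addr & care)


def ospf_enabled(interfaces, statements):
    """Map each interface name to whether any network statement covers its address."""
    return {
        name: any(statement_matches(s_addr, s_wild, ip) for s_addr, s_wild, _area in statements)
        for name, ip in interfaces.items()
    }


# Interface addresses of CRO1 as shown in Example 5-16.
interfaces = {"Lo0": "10.1.220.1", "Fa0/0": "10.1.192.2", "Fa0/1": "10.1.192.10"}

# Network statements from Example 5-15 (address, wildcard mask, area).
broken = [("10.1.192.2", "0.0.0.0", 0), ("10.1.192.9", "0.0.0.0", 0), ("10.1.220.1", "0.0.0.0", 0)]
# The same statements with the corrected address.
fixed = [("10.1.192.2", "0.0.0.0", 0), ("10.1.192.10", "0.0.0.0", 0), ("10.1.220.1", "0.0.0.0", 0)]

print(ospf_enabled(interfaces, broken))
print(ospf_enabled(interfaces, fixed))
```

With the original statements, Fa0/1 is the only interface left uncovered, which agrees with Example 5-14. A less specific statement such as 10.1.192.8 0.0.0.3 would also cover 10.1.192.10, because the wildcard mask ignores the last two bits.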
Troubleshooting Route Redistribution
Ideally, no more than one interior (intra-autonomous system) routing protocol is used within an organization. However, organizational requirements such as partnerships, mergers, technology migrations, and changes in policy might impose usage of multiple routing protocols in a single enterprise network. In such situations, route redistribution between the different routing protocols is often necessary to achieve IP connectivity between the different parts of the network. Route redistribution adds an extra layer of complexity to a routed network. In addition to understanding each of the routing protocols involved, it is also important to understand the interactions between them. This understanding is vital to be able to diagnose and resolve problems such as suboptimal routing and routing feedback that can occur when route redistribution is implemented. As a network support engineer, you must know the data structures and processes that play a part in the exchange of routing information between different routing protocols. You must also master the Cisco IOS commands to gather information about the operation of a route redistribution process.
Route Injection and Redistribution Process
The most common way to distribute routing information is to select a single routing protocol and use it throughout the network. On each of the participating routers, interfaces are activated for the routing protocol, and the directly connected subnets associated with these interfaces are injected into the routing protocol data structures and distributed via the routing protocol update mechanisms. However, in some situations, you might have routes that are not local to the part of the network where the routing protocol operates, and you want to distribute these routes using the routing protocol. For instance, you might have a number of static routes that point to routers that are not using the same routing protocol, and you want to distribute the information about these subnets using your routing protocol.
Alternatively, two different routing protocols might be used in two different parts of the network, and you want to take the routes that are learned through one routing protocol and distribute them by means of the other protocol to all routers that participate in that second protocol. You can implement both scenarios by using redistribution. There are two ways for routes to be injected into a routing protocol:
-
Directly connected: These subnets can be injected by enabling the routing protocol on an interface. These routes are considered internal by the routing protocol.
-
External: These are subnets from a different source that are present in the routing table and can be redistributed by using the routing protocol’s update mechanisms. Because these routes were not originated by the routing protocol, they are considered external.
Older routing protocols, such as RIP, do not have the capability to mark routes as external in their routing update messages. Newer routing protocols, such as EIGRP, OSPF, and IS-IS, can mark routes as external and prefer internal routes over external routes. This is an important mechanism in the prevention of routing loops caused by route feedback.
The redistribution process takes routes from the routing table. As a result, only routes that are installed in the routing table can be redistributed. Routes are not transferred directly between the data structures of different routing protocols.
As an example of this process, consider a router that is running two routing protocols, OSPF and EIGRP, where EIGRP has been configured to redistribute the routes learned from OSPF. OSPF receives network topology information in the form of LSAs from its neighboring OSPF routers, places these in its link-state database, and executes the SPF algorithm to calculate the best routes to the subnets in the OSPF network. After calculating the best path for each prefix, OSPF offers these prefixes to the IP routing table. If no competing routes with a lower administrative distance are present for a prefix, the OSPF route is installed in the routing table. EIGRP, which is configured to redistribute OSPF, scans the routing table, takes any route that was installed there by OSPF, imports these routes into its topology table, marks them as external, and advertises them using its own update mechanisms.
In addition to routes that are marked as OSPF routes in the routing table, EIGRP takes any connected routes from the routing table that are enabled for OSPF (either by use of a network statement under the router ospf configuration or by use of the ip ospf process-id area area-number command on the interface). Although these routes are marked in the routing table as directly connected routes rather than OSPF routes, EIGRP treats them as OSPF routes.
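The selection logic described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of how Cisco IOS is implemented: the Route class, the function names, the competing static route for 10.1.160.0/24, and the 10.1.193.0/30 subnet are hypothetical and serve only to illustrate that routes are taken from the routing table rather than from the OSPF database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str    # "ospf", "static", "connected", ...
    distance: int  # administrative distance


def install_best(candidates):
    """Build a routing table: per prefix, keep the lowest administrative distance."""
    table = {}
    for route in candidates:
        current = table.get(route.prefix)
        if current is None or route.distance < current.distance:
            table[route.prefix] = route
    return table


def routes_to_redistribute(table, source_protocol, enabled_connected):
    """Pick routes installed by the source protocol, plus connected routes
    on which the source protocol is enabled."""
    picked = []
    for prefix, route in sorted(table.items()):
        if route.source == source_protocol:
            picked.append(prefix)
        elif route.source == "connected" and prefix in enabled_connected:
            picked.append(prefix)
    return picked


candidates = [
    Route("10.1.152.0/24", "ospf", 110),
    Route("10.1.160.0/24", "ospf", 110),
    Route("10.1.160.0/24", "static", 1),     # hypothetical competing static route
    Route("10.1.192.0/30", "connected", 0),  # OSPF is enabled on this interface
    Route("10.1.193.0/30", "connected", 0),  # hypothetical subnet without OSPF
]
table = install_best(candidates)
selected = routes_to_redistribute(table, "ospf", {"10.1.192.0/30"})
print(selected)
```

In this model, 10.1.160.0/24 is not selected even though OSPF knows a path to it, because the static route won the comparison on administrative distance and the OSPF route was never installed.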
Note that the redistribution process is the process that takes the routes from the routing table, not the process that installs the routes in the routing table. Therefore, redistribution is always configured under the “destination” protocol for the routing information. For example, when OSPF routes are redistributed into EIGRP, this is configured under the EIGRP process. When the route is taken from the routing table and imported into the other protocol’s data structures, a metric for the redistributing protocol needs to be attached to that route. This metric is not computed from the metric of the original protocol through the use of some formula. A starting metric, or seed metric, should be configured, which will then be attached to all redistributed routes by the router. If no seed metric is configured, a default value for the redistributing protocol is used. For some protocols, such as RIP and EIGRP, the default metric is the maximum possible value, which represents “infinity” or “unreachable.” This is important to know when troubleshooting redistribution issues because redistribution into these protocols will fail without explicit configuration of a seed metric. The seed metric can be configured using the default-metric command or along with each redistribution statement (as a parameter or within a route map).
Two important conditions must be met for a prefix learned from one protocol (using redistribution) to be successfully advertised through another protocol:
-
The route needs to be installed in the routing table: The route needs to be selected as the best route by the source protocol and, if routes from competing sources are present, the route will need to have a lower administrative distance than the competing routes.
-
A proper seed metric is assigned to the redistributed route: The route needs to be redistributed in the destination protocol data structures with a valid metric for the destination protocol.
Access lists and route maps can be used to influence the redistribution process further by filtering routes, manipulating the seed metric, or setting additional parameters, such as route type or tags for specific routes.
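The two conditions can be captured as a small check. The following Python sketch is illustrative only: the function names are hypothetical, None stands in for an "infinity" metric, and the preference of a metric on the redistribution statement over the default-metric value is an assumption of this sketch. The five-part EIGRP seed metric is the one that appears in the routing table output of the redistribution example later in this section.

```python
# Default seed metrics when nothing is configured. Per the text, RIP and EIGRP
# default to "infinity"; None stands in for an unreachable metric here.
UNREACHABLE = None
DEFAULT_SEED = {"rip": UNREACHABLE, "eigrp": UNREACHABLE}


def effective_seed_metric(destination, redistribute_metric=None, default_metric=None):
    """Return the seed metric that would be attached to a redistributed route.

    Assumption of this sketch: a metric given on the redistribution statement
    is preferred over the default-metric value.
    """
    if redistribute_metric is not None:
        return redistribute_metric
    if default_metric is not None:
        return default_metric
    return DEFAULT_SEED[destination]


def can_be_advertised(installed_in_routing_table, destination,
                      redistribute_metric=None, default_metric=None):
    """Both conditions must hold: route installed, and a valid seed metric."""
    seed = effective_seed_metric(destination, redistribute_metric, default_metric)
    return installed_in_routing_table and seed is not UNREACHABLE


# Bandwidth, delay, reliability, load, MTU
eigrp_seed = (64, 10000, 255, 1, 1500)

print(can_be_advertised(True, "eigrp"))                                  # no seed metric configured
print(can_be_advertised(True, "eigrp", redistribute_metric=eigrp_seed))  # both conditions met
print(can_be_advertised(False, "eigrp", default_metric=eigrp_seed))      # route not installed
```

Only the second call satisfies both conditions. The first fails on the seed metric and the third on routing table installation, which are the two places to look first when a redistributed prefix is missing.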
Verifying and Troubleshooting Route Propagation
When troubleshooting problems in a network that uses route redistribution, troubleshooting the actual redistribution process itself is often only a small part of the process. The main challenge is usually troubleshooting IP connectivity problems caused by redistribution, which involves the following elements:
-
Troubleshooting the source routing protocol: Routes can be redistributed only if they are present in the routing table of the redistributing router. If routes are not redistributed as expected, you first need to confirm that they are learned on the redistributing router via the source protocol. Next, you have to check that the route is installed in the routing table.
-
Troubleshooting route selection and installation: To be redistributed, a route needs to be selected and successfully installed in the routing table by the source protocol. When routes are redistributed between two routing protocols in both directions (mutual redistribution), routing information that originates in one routing domain can be redistributed into the other routing domain and eventually propagate back to a router that is also connected to the source domain. This happens when there is a topological loop in the network. If the returning route is accepted as a better route than the original source route, suboptimal routing can result. Worse, if that route is subsequently redistributed back into the source protocol, it might cause routing loops and routing instability. After diagnosing a suboptimal routing problem or routing loop, you can often solve it by changing the administrative distance or filtering routes to influence the route selection and installation process.
-
Troubleshooting the redistribution process: If routes are learned and installed in the routing table of the redistributing router, but not inserted into the data structures and advertisements of the redistributing protocol, you should verify the configuration of the redistribution process. Bad seed metrics, route filtering, or misconfigured routing protocol process or autonomous system numbers are common causes for the redistribution process to fail.
-
Troubleshooting the destination routing protocol: After the routing information is inserted into the destination protocol’s data structures, the routing information is propagated using that protocol’s routing update mechanisms. If the routing information is not properly distributed to all routers in the destination routing domain, you should troubleshoot the routing exchange mechanisms for the destination protocol. Each routing protocol has its own methods of exchanging routing information, including external routing information. Research the specific protocol that you are working with to find out if external routes are handled differently than internal routes. For example, OSPF external routes do not propagate into stub areas.
Diagnosing route redistribution problems mostly involves troubleshooting routing protocols and routing information exchange in general. To troubleshoot redistribution from one routing protocol to another, you need to be proficient with the troubleshooting toolkit for both the source and destination protocols. The redistribution process itself consists of interaction between the data structures of the involved routing protocols and the routing table. To troubleshoot it, use the commands that gather information from the routing protocol data structures, such as show ip ospf database to display the content of the OSPF link-state database or show ip eigrp topology to display the content of the EIGRP topology table. Detailed information about specific routes installed in the routing table can be gathered by use of the show ip route network mask command.
The debug ip routing command displays routes being installed in or removed from the routing table in real time. This command can be very powerful when you are troubleshooting routing loops or flapping routes caused by route redistribution. Another feature that can help diagnose suspected route instability is route profiling, which you enable with the ip route profile command in global configuration mode. With this feature enabled, the router tracks the number of routing table changes that occur in each 5-second sampling interval. This gives you an indication of the overall stability of the routing table without the need to enable a debug command. However, to find out which particular route or routes are causing the instability, enabling the debug ip routing command might still be necessary.
The output of the show ip route profile command might not be self-explanatory without consulting the command reference. Example 5-18 displays the output of the command after the feature has been enabled; it shows the frequency of routing table changes per 5-second sampling interval. In Example 5-18, the number 2 in the Prefix add column of row 20 indicates that there have been two 5-second intervals during which 20 or more (but fewer than 25) prefix adds occurred. Note that the number of changes in the forwarding path is the accumulation of the prefix-add, next-hop change, and pathcount change statistics.
Router# show ip route profile
---------------------------------------------------------------
Change/    Fwd-path   Prefix   Nexthop   Pathcount   Prefix
interval   change     add      Change    Change      refresh
---------------------------------------------------------------
0          87         87       89        89          89
1          0          0        0         0           0
2          0          0        0         0           0
3          0          0        0         0           0
4          0          0        0         0           0
5          0          0        0         0           0
10         0          0        0         0           0
15         0          0        0         0           0
20         2          2        0         0           0
25         0          0        0         0           0
Within the table drawn as the output of the show ip route profile command, the numeric value under each column is interpreted as follows:
-
Change/interval: Represents the frequency buckets. A Change/interval of 20, for example, represents the bucket that is incremented when a particular event occurs at least 20 times (but fewer than 25 times) in a sampling interval. It is common to see high counters for the Change/interval bucket for 0. This counter represents the number of sampling intervals in which there were no changes to the routing table. Route removals are not counted in the statistics, only route additions.
-
Fwd-path change: Number of changes in the forwarding path. This value represents the accumulation of Prefix add, Nexthop change, and Pathcount change.
-
Prefix add: A new prefix was added to the routing table.
-
Nexthop change: A prefix is not added or removed, but the next hop changes. This statistic is seen only with recursive routes that are installed in the routing table.
-
Pathcount change: The number of paths in the routing table has changed. This change is the result of an increase in the number of paths for an IGP.
-
Prefix refresh: Indicates standard routing table maintenance. The forwarding behavior was not changed.
When the network is stable, only the counters in the first row should increase, because this row represents the number of intervals during which no changes to the routing table occurred. When rows other than the first row increase in a situation that you thought to be stable, this could indicate a routing loop.
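When this output is collected repeatedly, for example as part of a baseline, the check for nonzero counters outside the first row can be automated. The following Python sketch parses output in the layout of Example 5-18 and reports which buckets other than 0 contain nonzero counters. The function and column names are hypothetical, and the parser assumes that every data row consists of exactly six numeric fields.

```python
COLUMNS = ["fwd_path_change", "prefix_add", "nexthop_change",
           "pathcount_change", "prefix_refresh"]

# Sample text copied from Example 5-18
SAMPLE = """\
Change/    Fwd-path   Prefix   Nexthop   Pathcount   Prefix
interval   change     add      Change    Change      refresh
0          87         87       89        89          89
1          0          0        0         0           0
2          0          0        0         0           0
3          0          0        0         0           0
4          0          0        0         0           0
5          0          0        0         0           0
10         0          0        0         0           0
15         0          0        0         0           0
20         2          2        0         0           0
25         0          0        0         0           0
"""


def parse_route_profile(text):
    """Map each Change/interval bucket to a dict of its five counters."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six numeric fields; headers and separators do not.
        if len(fields) == 6 and all(f.isdigit() for f in fields):
            bucket, *counts = (int(f) for f in fields)
            rows[bucket] = dict(zip(COLUMNS, counts))
    return rows


def unstable_buckets(rows):
    """Return buckets other than 0 in which any counter is nonzero."""
    return sorted(b for b, counts in rows.items()
                  if b != 0 and any(counts.values()))


rows = parse_route_profile(SAMPLE)
print(unstable_buckets(rows))
print(rows[20]["prefix_add"])
```

Because the counters are cumulative, a single snapshot shows only that bursts of changes happened at some point. Comparing two snapshots taken some time apart shows whether the rows other than the first are still increasing.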
Troubleshooting Example: Redistribution from OSPF to EIGRP
The example presented here has been designed to illustrate the redistribution process and the commands that you can use to verify it. The case does not revolve around a problem that needs to be diagnosed and resolved; it simply tracks a route originating from an OSPF network being redistributed into EIGRP. The output of various show commands is presented to demonstrate how a route can be tracked in a situation where redistribution is working correctly. To troubleshoot effectively, it is important to know the behavior of processes and protocols when everything is working correctly, so that it can be compared and contrasted against the behavior when something is malfunctioning. The situation is based on the network shown in Figure 5-6. Network 10.1.152.0/24 resides in area 1 of the OSPF network and is connected to the multilayer switches CSW1 and CSW2. Router CRO1 is participating in OSPF in area 0 to communicate with switches CSW1 and CSW2. Router CRO1 is also connected to router BRO1, and these two routers use EIGRP to exchange routing information. Redistribution has been configured using the commands shown in Figure 5-6.
To follow the routing information as it is passed between the various routing protocol data structures, Example 5-19 shows the main data structure of the source protocol, the OSPF database. Two type 3 summary LSAs have been received and stored in the database for network 10.1.152.0/24, one from switch CSW1 and one from switch CSW2.
CRO1# show ip ospf database | begin Summary
Summary Net Link States (Area 0)
Link ID ADV Router Age Seq# Checksum
10.1.152.0 10.1.220.252 472 0x8000003B 0x00A7D1
10.1.152.0 10.1.220.253 558 0x8000003B 0x00A1D6
<... further output omitted ...>
OSPF executes the SPF algorithm and calculates the cost of the path to 10.1.152.0 through switches CSW1 and CSW2. After computing the results, the lowest-cost path is selected for installation into the routing table if no competing routes with a lower administrative distance are present. Next, Example 5-20 shows the routing table for network 10.1.152.0/24 on router CRO1. Both paths through switch CSW1 and switch CSW2 have been installed in the routing table because their costs are identical. The routing table also shows that this route has been marked for redistribution by EIGRP, and the configured EIGRP seed metric is also listed.
CRO1# show ip route 10.1.152.0 255.255.255.0
Routing entry for 10.1.152.0/24
Known via "ospf 100", distance 110, metric 2, type inter area
Redistributing via eigrp 1
Advertised by eigrp 1 metric 64 10000 255 1 1500
Last update from 10.1.192.9 on FastEthernet0/1, 00:28:24 ago
Routing Descriptor Blocks:
10.1.192.9, from 10.1.220.253, 00:28:24 ago, via FastEthernet0/1
Route metric is 2, traffic share count is 1
* 10.1.192.1, from 10.1.220.252, 00:28:24 ago, via FastEthernet0/0
Route metric is 2, traffic share count is 1
The output in Example 5-21 shows the EIGRP topology table for network 10.1.152.0/24 on router CRO1, verifying that the route is being redistributed. Here, you can clearly see that the route was taken from the routing table and inserted into the topology table as an external route. The five components of the configured seed metric are listed. In addition, some extra parameters were attached to the route to mark that the route was originated by the OSPF protocol with process number 100 and was injected into EIGRP by the router with EIGRP router ID 10.1.220.1 (which is the local router, CRO1).
CRO1# show ip eigrp topology 10.1.152.0 255.255.255.0
IP-EIGRP (AS 1): Topology entry for 10.1.152.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 42560000
Routing Descriptor Blocks:
10.1.192.9, from Redistributed, Send flag is 0x0
Composite metric is (42560000/0), Route is External
Vector metric:
Minimum bandwidth is 64 Kbit
Total delay is 100000 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 0
External data:
Originating router is 10.1.220.1 (this system)
AS number of route is 100
External protocol is OSPF, external metric is 2
Administrator tag is 0 (0x00000000)
The external information that router CRO1 added to the route during redistribution is passed along to router BRO1 within the EIGRP routing updates. When you display the EIGRP topology table on router BRO1, as demonstrated in Example 5-22, the originating router and routing protocol are still visible. Clearly, this information can be useful when you are troubleshooting redistribution into EIGRP because it makes it easy to see which router is the source when you are seeing unexpected routes. The information that is attached to and exchanged along with the external routes is different for each routing protocol.
BRO1# show ip eigrp topology 10.1.152.0 255.255.255.0
IP-EIGRP (AS 1): Topology entry for 10.1.152.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 43072000
Routing Descriptor Blocks:
10.1.193.1 (Serial0/0/1), from 10.1.193.1, Send flag is 0x0
Composite metric is (43072000/42560000), Route is External
Vector metric:
Minimum bandwidth is 64 Kbit
Total delay is 120000 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 1
External data:
Originating router is 10.1.220.1
AS number of route is 100
External protocol is OSPF, external metric is 2
Administrator tag is 0 (0x00000000)
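The feasible distance values in Examples 5-21 and 5-22 can be reproduced from the vector metrics listed in the output. The following Python sketch assumes the classic EIGRP composite metric with default K values (K1 = K3 = 1, all others 0), which the example output does not state explicitly; under that assumption only the minimum bandwidth and the total delay contribute. The function name is hypothetical.

```python
def eigrp_composite_metric(min_bandwidth_kbit, total_delay_usec):
    """Classic EIGRP composite metric, assuming default K values.

    metric = 256 * (10^7 / minimum bandwidth in kbit/s
                    + total delay in tens of microseconds)
    """
    bandwidth_term = 10**7 // min_bandwidth_kbit
    delay_term = total_delay_usec // 10
    return 256 * (bandwidth_term + delay_term)


# CRO1: seed metric of 64 kbit/s and 100000 microseconds total delay
print(eigrp_composite_metric(64, 100000))

# BRO1: same minimum bandwidth, total delay grown to 120000 microseconds
print(eigrp_composite_metric(64, 120000))
```

The two results, 42560000 and 43072000, match the FD values shown on CRO1 and BRO1. The difference between them comes entirely from the 20000 microseconds of delay that BRO1 adds for its incoming interface, because the seed bandwidth of 64 kbit/s remains the minimum along the path.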
Finally, on router BRO1, EIGRP selects the best route to 10.1.152.0/24. The route is then installed in the IP routing table. The route is marked as an EIGRP external route and has a corresponding administrative distance of 170, as shown in Example 5-23. However, the external information that was present in the EIGRP topology table, such as the originating router and protocol, is not carried into the routing table.
BRO1# show ip route 10.1.152.0 255.255.255.0
Routing entry for 10.1.152.0/24
Known via "eigrp 1", distance 170, metric 43072000, type external
Redistributing via eigrp 1
Last update from 10.1.193.1 on Serial0/0/1, 00:00:35 ago
Routing Descriptor Blocks:
* 10.1.193.1, from 10.1.193.1, 00:00:35 ago, via Serial0/0/1
Route metric is 43072000, traffic share count is 1
Total delay is 120000 microseconds, minimum bandwidth is 64 Kbit
Reliability 255/255, minimum MTU 1500 bytes
Loading 3/255, Hops 1
This case illustrates the process that you can use to verify the various data structures that play a role in redistribution. It also shows why it is important to practice the use of these commands while creating a baseline of your network, so that you know the expected results in a working situation.