Thursday, May 26, 2011

Chapter 07: Troubleshooting Network Performance Issues (Part 03)

Troubleshooting Performance Issues on Routers

Diagnosing and resolving router performance problems is an important skill for network support engineers. Common causes of performance problems on routers are high CPU utilization and memory-allocation problems. Therefore, it is important to be able to recognize the typical symptoms associated with CPU or memory issues and to know the typical causes of these types of issues. This section prepares you to diagnose problems caused by high CPU utilization on routers using the Cisco IOS CLI, explains the typical symptoms and possible causes of memory-allocation failures, and offers guidelines for troubleshooting memory problems.

Troubleshooting High CPU Usage Issues on Routers

The CPU on a router performs two major tasks: forwarding packets and executing management and control plane processes. The CPU can become too busy either when it has many packets to forward or when a system process consumes a large amount of CPU time. For example, if the CPU is receiving many SNMP packets because of intensive network monitoring, it can become so busy processing all those packets that the other system processes cannot get access to CPU resources.

It is very important to understand when high CPU utilization is at a problematic level and when it is considered to be normal. In some cases, high CPU utilization is normal and does not cause network problems. If CPU utilization is high for a short period of time, it does not necessarily cause a problem, because it may merely be due to a short burst of network management requests or expected peaks of network traffic. If CPU utilization is consistently very high and packet forwarding or process performance on the router degrades, however, it is usually considered to be a problem and needs to be investigated.

When the router CPU is too busy to forward all packets as they arrive, the router might start to buffer packets, increasing latency, or even drop packets. This affects the application traffic passing through the router, and as a result, network performance will suffer. Also, because the CPU is spending most of its time on packet forwarding, control plane processes may not be able to get sufficient access to the CPU, which could lead to further disruptions because of failing routing or other control plane protocols.

A common symptom of a router CPU that is too busy is that the router fails to respond to certain service requests. In those situations, the router might exhibit the following behaviors:

  • Slow response to Telnet requests or to the commands that are issued in active Telnet sessions

  • Slow response to commands issued on the console

  • High latency on ping responses or too many ping timeouts

  • Failure to send routing protocol packets to other routers

The following are some of the most common router processes that could cause high CPU utilization:

  • ARP Input: High CPU utilization by the ARP Input process occurs if the router has to originate an excessive number of ARP requests. Multiple ARP requests for the same IP address are rate-limited to one request every 2 seconds, so excessive numbers of ARP requests can only occur if the router needs to originate ARP requests for many different IP addresses. This can happen if an IP route has been configured pointing to a broadcast interface. This causes the router to generate an ARP request for each IP address that is not reachable through a more specific route. An excessive number of ARP requests can also be caused by malicious network traffic. An indication of such traffic is the presence of a high number of incomplete ARP entries in the ARP table, similar to the one shown in Example 7-47.

    Example 7-47: The Output of show arp Has Several Incomplete Entries

    Router# show arp
    Protocol  Address          Age (min)  Hardware Addr   Type   Interface
    Internet  10.10.10.1              -   0013.1918.caae  ARPA   FastEthernet0/0
    Internet  10.16.243.249           0   Incomplete      ARPA
    Internet  10.16.243.250           0   Incomplete      ARPA
    Internet  10.16.243.251           0   Incomplete      ARPA
    Internet  10.16.243.252           0   Incomplete      ARPA
    Internet  10.16.243.253           0   Incomplete      ARPA
    Internet  10.16.243.254           0   Incomplete      ARPA

  • Net Background: The Net Background process runs whenever a buffer is required but is not available to a process or an interface. It uses the main buffer pool to provide the requested buffers. Net Background also manages the memory used by each process and cleans up freed-up memory. The symptoms of high CPU are increases in throttles, ignores, overruns, and resets on an interface; you can see these in the output of the show interfaces command.

  • TCP Timer: The TCP Timer process is responsible for TCP sessions running on the router. When the TCP Timer process uses a lot of CPU resources, this indicates that there are too many TCP peers (such as Border Gateway Protocol [BGP] peers). The show tcp statistics command (a sample is shown in Example 7-48) displays detailed TCP information.

    Example 7-48: The Output of show tcp statistics Displays Detailed TCP-Related Information

    Router# show tcp statistics
    Rcvd: 22771 Total, 152 no port
    0 checksum error, 0 bad offset, 0 too short
    4661 packets (357163 bytes) in sequence
    7 dup packets (860 bytes)
    0 partially dup packets (0 bytes)
    0 out-of-order packets (0 bytes)
    0 packets (0 bytes) with data after window
    0 packets after close
    0 window probe packets, 0 window update packets
    4 dup ack packets, 0 ack packets with unsend data
    4228 ack packets (383828 bytes)
    Sent: 22490 Total, 0 urgent packets
    16278 control packets (including 17 retransmitted)
    5058 data packets (383831 bytes)
    7 data packets (630 bytes) retransmitted
    0 data packets (0 bytes) fastretransmitted
    1146 ack only packets (818 delayed)
    0 window probe packets, 1 window update packets
    8 Connections initiated, 82 connections accepted, 65 connections established
    32046 Connections closed (including 27 dropped, 15979 embryonic dropped)
    24 total rxmt timeout, 0 connections dropped in rxmt timeout
    0 Keepalive timeout, 0 keepalive probe, 0 Connections dropped in keepalive

  • IP Background: This process is responsible for encapsulation type changes on an interface, the move of an interface to a new state (up or down), and change of IP address on an interface. The IP Background process modifies the routing table in accordance with the status of the interfaces and notifies all routing protocols of the status change of each IP interface.
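For the ARP Input case described above, the number of incomplete entries can be counted from captured show arp output. The following is a minimal Python sketch, not a Cisco tool: the function name incomplete_arp_entries is hypothetical, and it assumes the output has been saved as text in the column layout of Example 7-47.

```python
def incomplete_arp_entries(show_arp_output):
    """Return the IP addresses whose ARP entry is still Incomplete."""
    addresses = []
    for line in show_arp_output.splitlines():
        fields = line.split()
        # Assumed row layout: Internet <ip> <age> <mac or Incomplete> ARPA [interface]
        if len(fields) >= 4 and fields[0] == "Internet" and fields[3] == "Incomplete":
            addresses.append(fields[1])
    return addresses


# A shortened copy of the output in Example 7-47.
sample = """\
Protocol Address Age (min) Hardware Addr Type Interface
Internet 10.10.10.1 - 0013.1918.caae ARPA FastEthernet0/0
Internet 10.16.243.249 0 Incomplete ARPA
Internet 10.16.243.250 0 Incomplete ARPA
"""

print(incomplete_arp_entries(sample))
```

A long list of addresses returned by such a check would be a hint, not proof, that the router is resolving many unreachable destinations.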

To determine the CPU utilization on a router, issue the show processes cpu command. The output of this command shows how busy the CPU has been in the past 5 seconds, the past 1 minute, and the past 5 minutes. The output also shows the percentage of the available CPU time that each system process has used during these periods. In the output shown in Example 7-49, the CPU utilization for the last 5 seconds was 72 percent. Out of this total of 72 percent, 23 percent of the CPU time was spent in interrupt mode, which corresponds to switching of packets. On the same line of output, you can also see the average utilization for the last 1 minute (74 percent in this example), and the average utilization for the past 5 minutes (71 percent in this example).

Example 7-49: The show processes cpu Command Displays the Overall CPU Utilization and the CPU Utilization Due to Each Individual Process

Router# show processes cpu sorted
CPU utilization for five seconds: 72%/23%; one minute: 74%; five minutes: 71%
! 72%, 74%, and 71% indicate total CPU spent on processes and interrupts
! (packet switching). 23% indicates CPU spent on interrupts (packet switching)
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
62 3218415936 162259897 8149 65.08% 72.01% 68.00% 0 IP Input
183 47280 35989616 1 0.16% 0.08% 0.08% 0 RADIUS
47 432 223 2385 0.24% 0.03% 0.06% 0 SSH Process
2 9864 232359 42 0.08% 0.00% 0.00% 0 Load Meter
61 6752 139374 48 0.08% 0.00% 0.00% 0 CDP Protocol
33 14736 1161808 12 0.08% 0.01% 0.00% 0 Per-Second Jobs
73 12200 4538259 2 0.08% 0.01% 0.00% 0 SSS Feature Time
! Output omitted for brevity
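The arithmetic behind the summary line can be made explicit: the first figure is total utilization, the second is the interrupt share, and the difference is the time spent in processes. The sketch below is an illustration in Python under the assumption that the summary line has been captured as text; parse_cpu_summary and the dictionary keys are hypothetical names.

```python
import re

# Pattern for the summary line printed by show processes cpu.
CPU_LINE = re.compile(
    r"CPU utilization for five seconds: (\d+)%/(\d+)%; "
    r"one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_summary(line):
    """Split the summary line into total, interrupt, and process-level load."""
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError("not a CPU utilization summary line")
    total, interrupt, one_min, five_min = (int(g) for g in match.groups())
    return {
        "total_5s": total,
        "interrupt_5s": interrupt,
        # Process-level load is what remains after the interrupt share.
        "process_5s": total - interrupt,
        "one_minute": one_min,
        "five_minutes": five_min,
    }


summary = parse_cpu_summary(
    "CPU utilization for five seconds: 72%/23%; one minute: 74%; five minutes: 71%"
)
print(summary["process_5s"])  # 72 - 23 = 49
```

With the values from Example 7-49, 49 percent of the CPU was used at process level, which is consistent with IP Input being the dominant process in the table.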

Issue the show processes cpu history command to see the CPU utilization for the last 60 seconds, 60 minutes, and 72 hours. The output of this command provides ASCII graphical views of how busy the CPU has been. You can see if the CPU has been constantly busy or whether utilization has been spiking. CPU utilization spikes caused by a known network event or activity do not indicate problems, but if you see prolonged spikes that do not seem to correspond to any known network activity, you should investigate.
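The distinction between short spikes and prolonged high utilization can be illustrated with a small Python sketch. It does not parse the ASCII graph; it assumes the per-minute percentages have already been collected into a list by some other means. The function name, the sample values, and the 80 percent threshold are all hypothetical choices for illustration.

```python
def longest_run_above(samples, threshold):
    """Return the length of the longest run of consecutive samples above threshold."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value > threshold else 0
        longest = max(longest, current)
    return longest


# Hypothetical per-minute CPU percentages.
spiky = [20, 95, 25, 30, 90, 22, 18, 25]
sustained = [20, 85, 90, 92, 88, 91, 87, 30]

print(longest_run_above(spiky, 80))      # isolated peaks only
print(longest_run_above(sustained, 80))  # several consecutive busy minutes
```

A long run is only a reason to look closer; whether it is a problem still depends on whether it matches known network activity.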

Troubleshooting Switching Paths

To understand the different switching options and how they work, it is necessary to understand that there are different types of router platforms and that each of these platforms has its own behavior. For example, 2800 series routers are based on a single CPU, and all functions of the router can be executed by the Cisco IOS Software running on the main CPU. However, many of the functions can be offloaded to separate network modules that can be installed into these routers. 7600 series routers are based on special hardware that is responsible for all packet-forwarding actions, which means that the main CPU is not involved in processing of most packets. The task of packet forwarding (data plane) consists of two steps:

Step 1

Making a routing decision: The routing decision is made based on network topology information and all the configured policies. Information about network destinations, gathered by a routing protocol, and possible restrictions like access lists or policy-based routing (PBR) are used to decide where to send each packet.

Step 2

Switching the packet: Switching packets on a router (not to be confused with Layer 2 switching) involves moving a packet from an input buffer to an output buffer and rewriting the data link layer header of the frame to forward the packet to the next hop toward the final destination.

The data link layer addresses necessary to rewrite the frame are stored in different tables such as the ARP table, which lists the MAC addresses for known IP devices reachable via Ethernet interfaces. Usually routers discover the data link layer addresses to be used for a destination through an address resolution process that matches the Layer 3 address to the Layer 2 address of a next hop device.

There are three types of packet switching modes supported by Cisco routers:

  • Process switching

  • Fast switching

  • Cisco Express Forwarding (CEF)

The newest switching mode is CEF, and it is the default, preferred, and recommended switching mode. It is important to remember that the switching method used affects the router’s performance. To successfully troubleshoot problems related to the switching path, it is essential to understand which method is used and how it works. The switching method might be altered globally or per interface for several reasons:

  • During troubleshooting, to verify if the observed behavior is caused by the switching method

  • During debugging, to direct all packets to CPU for processing

  • Because some IOS features require a specific switching method

Process Switching

Process switching is the oldest mode. When using process switching to forward packets, the router strips the Layer 2 header from an incoming frame, looks up the Layer 3 network address in the routing table for the packet, and then sends the frame with a rewritten Layer 2 header, including a newly computed cyclic redundancy check (CRC), to the outgoing interface. All these operations are performed for each individual frame by the IP Input process that is running on the central CPU. Process switching is configured on an interface by disabling fast switching (and CEF) on that interface. Process switching is the most CPU-intensive method available on Cisco routers. It greatly degrades performance figures such as throughput, jitter, and latency. This method should be used only temporarily as a last resort during troubleshooting.


Note

To use process switching, fast switching must be disabled using this command:

Router(config-if)# no ip route-cache

Fast Switching

After performing a routing table lookup for the first packet destined for a particular IP network, the router also initializes the fast-switching cache that is used by the fast-switching process. When subsequent frames to that same destination arrive, a cache lookup is performed and the destination is found in the fast-switching cache. Then the frame is rewritten with the corresponding data link layer header that was stored in the cache, and the frame is sent to the outgoing interface. The interface processor computes the CRC for the frame. Because the cache is destination based, fast switching can provide load sharing on a per-destination basis. Fast switching is less processor intensive than process switching because it uses a cache entry created by the first packet sent to a particular destination. CPU utilization can still become high when fast switching is used if there is a high number of new flows per second. This can happen when a network attack rapidly generates many new flows.


Note

Fast switching is enabled using the following command:

Router(config-if)# ip route-cache
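The cache behavior described for fast switching can be illustrated with a toy model. The Python sketch below is a simplification and not how IOS is implemented: the class name, its counters, and the single hypothetical route are assumptions, and the cache is keyed by destination address only. It shows that the first packet to a destination takes the slow path and later packets are served from the cache, which is also why many new destinations per second keep the CPU busy.

```python
import ipaddress


class FastSwitchingModel:
    """Toy model: the first packet to a destination fills the cache."""

    def __init__(self, routes):
        # routes maps a prefix string to (egress interface, next hop).
        self.routes = {ipaddress.ip_network(p): v for p, v in routes.items()}
        self.cache = {}
        self.process_switched = 0
        self.fast_switched = 0

    def forward(self, destination):
        if destination in self.cache:
            # Cache hit: no routing table lookup is needed.
            self.fast_switched += 1
            return self.cache[destination]
        # Cache miss: full routing table lookup (longest prefix match).
        address = ipaddress.ip_address(destination)
        matches = [net for net in self.routes if address in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        self.process_switched += 1
        self.cache[destination] = self.routes[best]
        return self.cache[destination]


# Hypothetical route for illustration.
router = FastSwitchingModel({"10.11.1.0/24": ("FastEthernet0/1", "10.2.1.1")})
for _ in range(3):
    router.forward("10.11.1.1")
print(router.process_switched, router.fast_switched)
```

Three packets to the same destination result in one slow-path lookup and two cache hits; three packets to three different destinations would all take the slow path.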

Cisco Express Forwarding

Cisco Express Forwarding (CEF) is the default switching mode on Cisco routers. CEF is less CPU-intensive than fast switching or process switching. CEF is a highly scalable and resilient switching technique. When CEF is enabled, information used for packet forwarding purposes resides in the following two tables:

  • CEF Forwarding Information Base (FIB): A router that has CEF enabled uses the FIB to make IP destination prefix-based switching decisions. This table is updated after each network change, but only once, and contains all known routes. There is no need to build a route cache by first using process switching for some of the packets. Each change in the IP routing table triggers a similar change in FIB table because it contains all next-hop addresses associated with all network destinations.

  • CEF adjacency table: The adjacency table contains Layer 2 frame headers for all next hops used by the FIB. These addresses are used to rewrite frame headers for packets that are forwarded by a router.
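How the two tables work together can be sketched as a two-stage lookup: a longest-prefix match in the FIB yields a next hop, and the adjacency table supplies the Layer 2 rewrite for that next hop. The Python sketch below is a conceptual illustration only. The table contents are hypothetical (the values are borrowed from examples later in this section), and cef_lookup is not an IOS function.

```python
import ipaddress

# Hypothetical FIB: prefix -> next hop.
FIB = {
    "0.0.0.0/0": "10.14.14.19",
    "10.11.1.1/32": "10.2.1.1",
}

# Hypothetical adjacency table: next hop -> (interface, Layer 2 rewrite in hex).
ADJACENCY = {
    "10.2.1.1": ("FastEthernet0/1", "C40202640000C4010F5C00010800"),
}


def cef_lookup(destination, fib, adjacency):
    """Return (matched prefix, next hop, adjacency or None) for a destination."""
    address = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(p) for p in fib]
    matches = [net for net in matches if address in net]
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    best = max(matches, key=lambda net: net.prefixlen)
    next_hop = fib[str(best)]
    return str(best), next_hop, adjacency.get(next_hop)


print(cef_lookup("10.11.1.1", FIB, ADJACENCY))
print(cef_lookup("192.0.2.1", FIB, ADJACENCY))
```

In the second lookup the FIB entry exists but the adjacency is missing, which corresponds to the kind of case in which a packet cannot be CEF switched and has to be handed to the next switching layer.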

Both tables are built independently, and a change in one table does not lead to a change in the other. CEF is an efficient mechanism for traffic load balancing. In this case, both the FIB and the adjacency table contain multiple entries for a single network destination to reflect the multiple network paths toward it. It is important to note that there are several Cisco IOS features that require CEF to be enabled for their operation because they rely on the data structures that are built and maintained by CEF. Some of those features are as follows:

  • Network-Based Application Recognition (NBAR)

  • AutoQoS and Modular QoS CLI (MQC)

  • Frame Relay traffic shaping

  • Multiprotocol Label Switching (MPLS)

  • Class-based weighted random early detection


Note

CEF can be enabled and disabled globally using the command:

Router(config)# [no] ip cef

You can also enable or disable CEF on each interface individually using the command:

Router(config-if)# [no] ip route-cache cef

Generally, if CEF is disabled globally, it cannot be enabled on an interface, but if it is enabled globally, it can be disabled on a single interface.

Troubleshooting Process and Fast Switching

Example 7-50 shows sample output from the show ip interface command after disabling the default CEF packet-switching mode using the no ip cef command. In the output, you can see that fast switching is enabled for all packets (except for packets that are sent back to the same interface that they came in on), but CEF switching is disabled.

Example 7-50: show ip interface Command Output Shows That CEF Has Been Disabled

Router# show ip interface GigabitEthernet 0/0
GigabitEthernet0/0 is up, line protocol is up
<...output omitted...>
IP fast switching is enabled
IP fast switching on the same interface is disabled
IP Flow switching is disabled
IP CEF switching is disabled
IP Fast switching turbo vector
IP multicast fast switching is enabled
IP multicast distributed fast switching is disabled
IP route-cache flags are Fast
! Output omitted for brevity

If you also turn fast switching off, using the no ip route-cache command, and repeat the show ip interface command, the output will look similar to that shown in Example 7-51. As you can see, however, multicast fast switching is still enabled. This is because IP multicast routing is configured entirely separately from IP unicast routing, and there are separate configuration statements related to unicast and multicast operations. The no ip route-cache command applies only to unicast packets. To disable fast switching for multicast packets, the no ip mroute-cache command is used.

Example 7-51: show ip interface Command Output Reveals That Fast Switching Is Disabled

Router# show ip interface GigabitEthernet 0/0
GigabitEthernet0/0 is up, line protocol is up
<... output omitted ...>
IP fast switching is disabled
IP fast switching on the same interface is disabled
IP Flow switching is disabled
IP CEF switching is disabled
IP Fast switching turbo vector
IP multicast fast switching is enabled
IP multicast distributed fast switching is disabled
IP route-cache flags are Fast
! Output omitted for brevity

Disabling fast switching increases the load on the system CPU because every packet is processed by the IP Input process on the router CPU. In some situations, however, disabling fast switching might be necessary (for example, during troubleshooting of connectivity problems) to eliminate the use of the fast-switching cache and to allow processing of all packets by the router CPU.
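The status lines in Examples 7-50 and 7-51 can be reduced to a single verdict about the unicast switching path on an interface. The following Python sketch is an illustration that assumes the show ip interface output has been captured as text; switching_path is a hypothetical helper name. It matches whole lines so that the separate multicast fast-switching line is not mistaken for the unicast one.

```python
def switching_path(show_ip_interface_output):
    """Classify the unicast switching path from show ip interface output."""
    lines = [line.strip() for line in show_ip_interface_output.splitlines()]
    if "IP CEF switching is enabled" in lines:
        return "CEF"
    if "IP fast switching is enabled" in lines:
        return "fast switching"
    return "process switching"


# Relevant lines from Example 7-50 (CEF disabled, fast switching enabled).
example_7_50 = """\
IP fast switching is enabled
IP CEF switching is disabled
IP multicast fast switching is enabled
"""

# Relevant lines from Example 7-51 (fast switching also disabled).
example_7_51 = """\
IP fast switching is disabled
IP CEF switching is disabled
IP multicast fast switching is enabled
"""

print(switching_path(example_7_50))
print(switching_path(example_7_51))
```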

The show ip cache command displays the content of the fast-switching cache, as shown in Example 7-52. If fast switching is disabled on a particular interface, this cache will not have any network entries for that interface. The route cache is periodically cleared to remove stale entries and make room for new entries. This command is useful when troubleshooting because it shows that the fast-switching cache is initialized and populated with information for different network prefixes and associated outgoing interfaces.

Example 7-52: show ip cache Displays the Current Content of the Fast-Switching Cache

Router# show ip cache
IP routing cache 4 entries, 784 bytes
5 adds, 1 invalidates, 0 refcounts
Minimum invalidation interval 2 seconds, maximum interval 5 seconds,
quiet interval 3 seconds, threshold 0 requests
Invalidation rate 0 in last second, 0 in last 3 seconds
Last full cache invalidation occurred 00:11:31 ago

Prefix/Length Age Interface Next Hop
10.1.1.1/32 00:07:20 FastEthernet0/0 10.1.1.1
10.2.1.1/32 00:04:18 FastEthernet0/1 10.2.1.1
10.10.1.0/24 00:01:06 FastEthernet0/0 10.1.1.1
10.11.1.0/24 00:01:20 FastEthernet0/1 10.2.1.1

Troubleshooting CEF

CEF builds two main data structures for its operation: the FIB and the adjacency table. When troubleshooting CEF, you have to check both tables and correlate entries between them. The items that you should check and verify when troubleshooting CEF are as follows:

  • Is CEF enabled globally and per interface?

  • Is there a FIB entry for a given network destination?

  • Is there a next hop associated with this entry?

  • Is there an adjacency entry for this next hop?
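The checklist above is ordered: each question only makes sense once the previous one has been answered positively. A minimal Python sketch of that ordering follows. It assumes the answers have already been gathered by hand from the relevant show commands into a dictionary; the key names and the function name are hypothetical.

```python
def first_failed_cef_check(facts):
    """Return a description of the first failed check, or None if all pass."""
    checks = [
        ("cef_enabled_globally", "CEF is not enabled globally"),
        ("cef_enabled_on_interface", "CEF is not enabled on the interface"),
        ("fib_entry", "no FIB entry for the destination"),
        ("next_hop", "FIB entry has no next hop"),
        ("adjacency", "no adjacency entry for the next hop"),
    ]
    for key, problem in checks:
        # A missing answer is treated the same as a negative answer.
        if not facts.get(key):
            return problem
    return None


# Hypothetical findings: everything is in place except the adjacency.
findings = {
    "cef_enabled_globally": True,
    "cef_enabled_on_interface": True,
    "fib_entry": True,
    "next_hop": True,
    "adjacency": False,
}
print(first_failed_cef_check(findings))
```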

To find out whether CEF is enabled on a particular interface, issue the show ip interface command. As you can see in Example 7-53, the output clearly states whether CEF switching is enabled.

Example 7-53: To Check Whether CEF Is Enabled on an Interface, Use the show ip interface Command

Router# show ip interface GigabitEthernet 0/0
GigabitEthernet0/0 is up, line protocol is up
<... output omitted ...>
IP fast switching is enabled
IP fast switching on the same interface is disabled
IP Flow switching is disabled
IP CEF switching is disabled
IP Fast switching turbo vector
IP multicast fast switching is enabled
IP multicast distributed fast switching is disabled
IP route-cache flags are Fast
! Output omitted for brevity

If CEF is enabled on the router, you will see output similar to that shown in Example 7-54 after issuing the show ip cef command. This command displays the content of the FIB table, but it also reveals whether CEF is globally enabled or disabled on the router. All directly connected networks in the output are marked as attached in the Next Hop field. Network prefixes that are local to the router are marked as receive. The show ip cef command does not display the interfaces on which CEF is explicitly disabled.

Example 7-54: Use the show ip cef Command to Display the FIB

Router# show ip cef
Prefix Next Hop Interface
0.0.0.0/0 10.14.14.19 GigabitEthernet0/0
0.0.0.0/32 receive
10.14.14.0/24 attached GigabitEthernet0/0
10.14.14.0/32 receive
! Output omitted for brevity
10.14.14.252/32 receive
224.0.0.0/4 drop
224.0.0.0/24 receive
255.255.255.255/32 receive

In Example 7-54, the output shows that the router uses output interface GigabitEthernet0/0 and next hop 10.14.14.19/32 to reach 0.0.0.0/0 (the default route). You can also see what other destinations are associated with this interface/next-hop pair, using the show ip cef adjacency command for this interface and next-hop value, as shown in Example 7-55. This specific combination of output interface and next hop is used to reach two network destinations: the default route and a specific host destination (10.14.14.19/32), in this example.

Example 7-55: Checking the Adjacency Table for Gi0/0 and Next Hop 10.14.14.19

Router# show ip cef adjacency GigabitEthernet0/0 10.14.14.19 detail
IP CEF with switching (Table Version 24), flags=0x0
23 routes, 0 reresolve, 0 unresolved (0 old, 0 new), peak 0
2 instant recursive resolutions, 0 used background process
28 leaves, 22 nodes, 26516 bytes, 79 inserts, 51 invalidations
0 load sharing elements, 0 bytes, 0 references
universal per-destination load sharing algorithm, id 56F4BAB5
4(1) CEF resets, 2 revisions of existing leaves
Resolution Timer: Exponential (currently 1s, peak 1s)
1 in-place/0 aborted modifications
refcounts: 6223 leaf, 6144 node
Table epoch: 0 (23 entries at this epoch)
Adjacency Table has 13 adjacencies
0.0.0.0/0, version 22, epoch 0, cached adjacency 10.14.14.19
0 packets, 0 bytes
via 10.14.14.19, 0 dependencies, recursive
next hop 10.14.14.19, GigabitEthernet0/0 via 10.14.14.19/32
valid cached adjacency
10.14.14.19/32, version 11, epoch 0, cached adjacency 10.14.14.19
0 packets, 0 bytes
via 10.14.14.19, GigabitEthernet0/0, 1 dependency
next hop 10.14.14.19, GigabitEthernet0/0
valid cached adjacency

To see the adjacency table entries for this next hop, you use the show adjacency command. Note that, unlike show ip cef, this command does not include the ip keyword. The output of the show adjacency command for the Gi0/0 interface, beginning with the next-hop value of 10.14.14.19, is shown in Example 7-56. In this entry, you can see the full Layer 2 frame header associated with this next hop, which has been built through ARP. The Layer 2 MAC address for this next-hop IP address can also be checked in the ARP cache using the show ip arp command for the specific 10.14.14.19 address (also shown in Example 7-56).

Example 7-56: show adjacency Command Output

Router# show adjacency GigabitEthernet 0/0 detail | begin 10.14.14.19
Protocol Interface Address
IP GigabitEthernet0/0 10.14.14.9(5)
0 packets, 0 bytes
001200A2BC41001BD5F9E7C00800
ARP 03:19:39
Epoch: 0
[...]
Router# show ip arp 10.14.14.19
Protocol Address Age (min) Hardware Addr Type Interface
Internet 10.14.14.19 4 0012.009a.0c42 ARPA GigabitEthernet0/0

Be aware that the CPU might still process some packets, even if CEF is enabled. This can happen for reasons such as an incomplete adjacency table or when processing packets that need special handling by the main processor. You can gather information about the packets that are not switched with CEF by using the show cef not-cef-switched command, as shown in Example 7-57.

Example 7-57: Gathering Information About the Non-CEF-Switched Packets

Router# show cef not-cef-switched
CEF Packets passed on to next switching layer
Slot No_adj No_encap Unsupp'ted Redirect Receive Options Access Frag
RP 424260 0 5227416 67416 2746773 9 15620 0
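When these counters are collected repeatedly, it can help to turn the table into named values so that the largest reasons for punting stand out. The Python sketch below is an illustration that assumes the output has been captured as text in the layout of Example 7-57; parse_not_cef_switched is a hypothetical helper name.

```python
def parse_not_cef_switched(output):
    """Map each slot to a dictionary of counter name -> packet count."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    # The column names are on the row that starts with "Slot".
    header = next(fields for fields in rows if fields[0] == "Slot")
    counters = {}
    for fields in rows[rows.index(header) + 1:]:
        slot, values = fields[0], fields[1:]
        counters[slot] = dict(zip(header[1:], (int(v) for v in values)))
    return counters


# The output from Example 7-57.
sample_output = """\
CEF Packets passed on to next switching layer
Slot No_adj No_encap Unsupp'ted Redirect Receive Options Access Frag
RP 424260 0 5227416 67416 2746773 9 15620 0
"""

counters = parse_not_cef_switched(sample_output)
print(counters["RP"]["No_adj"])
print(max(counters["RP"], key=counters["RP"].get))
```

For the sample, the largest counter is the Unsupp'ted column, followed by Receive.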

IOS Tools to Analyze Packet Forwarding

Cisco IOS Software is a powerful operating system that has an embedded set of tools to assist in troubleshooting various networking problems. These tools enable network administrators to quickly and effectively find, isolate, and repair IP communication problems. The following series of steps shows you an example of a troubleshooting process that could be used to find problems related to the switching path used by a router. The example is based on the network shown in Figure 7-15. Be aware that the actual routers used for command outputs in this example do not have any problems. The aim is to show the Cisco IOS commands in action.

Figure 7-15: Network Diagram for the Step-by-Step CEF Troubleshooting Example

Step 1

First, try to find the problematic router along the path with the traceroute utility, as demonstrated in Example 7-58. Although the output seems normal, suppose that the traceroute command had shown a much higher delay or packet loss on router R2 compared to router R3. Such symptoms can lead you to suspect problems in router R2.

Example 7-58: Make Use of the traceroute Command to Find the Problematic Router

R1# traceroute 10.11.1.1
Type escape sequence to abort.
Tracing the route to 10.11.1.1
1 10.1.1.2 72 msec 56 msec 64 msec
2 10.2.1.1 76 msec 104 msec *

Step 2

Check the CPU utilization on router R2 for load due to packet processing, using the show processes cpu command, as shown in Example 7-59. In this example, there are no problems related to packet processing.

Example 7-59: Checking the CPU Utilization on R2

R2# show processes cpu | exclude 0.00
CPU utilization for five seconds: 4%/0%; one minute: 1%; five minutes: 1%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
2 3396 650 5224 0.08% 0.07% 0.10% 0 Load Meter
3 11048 474 23308 3.27% 0.51% 0.37% 0 Exec
99 13964 6458 2162 0.90% 0.66% 0.71% 0 DHCPD Receive
154 348 437 796 0.08% 0.09% 0.08% 0 CEF process

Step 3

Check the routing table for the corresponding destination prefix (in this example, 10.11.1.1), as shown in Example 7-60. In this example, the routing information is present.

Example 7-60: Display the Routing Table Entry for the Destination Under Investigation

R2# show ip route 10.11.1.1
Routing entry for 10.11.1.1/32
Known via "ospf 1", distance 110, metric 11, type intra area
Last update from 10.2.1.1 on FastEthernet0/1, 00:29:20 ago
Routing Descriptor Blocks:
* 10.2.1.1, from 10.11.1.1, 00:29:20 ago, via FastEthernet0/1
Route metric is 11, traffic share count is 1

Step 4

Find which switching mode is used by the router and on the interfaces involved in packet forwarding. Use the show ip cef command to find out whether CEF is enabled and to discover the egress interface for the destination under investigation, and then use the show ip interface command for that interface to see what type of switching is operational on it. This work is shown in Example 7-61 for the current example. In this example, CEF is enabled globally, and all involved interfaces are enabled for CEF switching.

Example 7-61: Find Out the Type of Switching Used on the Router and the Interfaces

R2# show ip cef
Prefix Next Hop Interface
0.0.0.0/0 drop Null0 (default route handler entry)
0.0.0.0/32 receive
10.1.1.0/24 attached FastEthernet0/0
10.1.1.0/32 receive
10.1.1.1/32 10.1.1.1 FastEthernet0/0
10.1.1.2/32 receive
10.1.1.255/32 receive
10.2.1.0/24 attached FastEthernet0/1
10.2.1.0/32 receive
10.2.1.1/32 10.2.1.1 FastEthernet0/1
10.2.1.2/32 receive
10.2.1.255/32 receive
10.10.1.1/32 10.1.1.1 FastEthernet0/0
10.11.1.1/32 10.2.1.1 FastEthernet0/1
224.0.0.0/4 drop
224.0.0.0/24 receive
255.255.255.255/32 receive

R2# show ip interface FastEthernet 0/0 | include CEF
IP CEF switching is enabled
IP CEF Fast switching turbo vector
IP route-cache flags are Fast, CEF

R2# show ip interface FastEthernet 0/1 | include CEF
IP CEF switching is enabled
IP CEF Fast switching turbo vector
IP route-cache flags are Fast, CEF

Step 5

Check the FIB entry for the routing information under investigation (in this case, 10.11.1.1), as shown in Example 7-62. The related adjacency entry shows interface FastEthernet0/1 with next hop 10.2.1.1.

Example 7-62: Display the FIB Entry for the Destination Under Investigation

R2# show ip cef 10.11.1.1 255.255.255.255
10.11.1.1/32, version 13, epoch 0, cached adjacency 10.2.1.1
0 packets, 0 bytes
via 10.2.1.1, FastEthernet0/1, 0 dependencies
next hop 10.2.1.1, FastEthernet0/1
valid cached adjacency

Step 6

Check the adjacency table for the next-hop value of the destination you are investigating, as shown in Example 7-63. In this case, the relevant adjacency is built using ARP.

Example 7-63: Use show adjacency to Discover the Layer 2 Value for Your Next Hop

R2# show adjacency FastEthernet0/1 detail
Protocol Interface Address
IP FastEthernet0/1 10.2.1.1(7)
203 packets, 307342 bytes
C40202640000C4010F5C00010800
ARP 02:57:43
Epoch: 0

Step 7

Check the ARP cache entry for the next hop, as shown in Example 7-64. You see that the MAC address information is present in the router. Based on this verification process, you can conclude that the routers in this example do not have any switching-related problems.

Example 7-64: Display the ARP Cache and Look for the Next-Hop Value

R2# show ip arp
Protocol Address Age (min) Hardware Addr Type Interface
Internet 10.2.1.1 67 c402.0264.0000 ARPA FastEthernet0/1
Internet 10.1.1.2 - c401.0f5c.0000 ARPA FastEthernet0/0
Internet 10.1.1.1 67 c400.0fe4.0000 ARPA FastEthernet0/0
Internet 10.2.1.2 - c401.0f5c.0001 ARPA FastEthernet0/1
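The hexadecimal string in the adjacency entry of Example 7-63 can be cross-checked against the ARP cache in Example 7-64. Assuming the string is a 14-byte Ethernet II header (destination MAC, source MAC, EtherType), the Python sketch below splits it into its fields; decode_ethernet_rewrite is a hypothetical helper name.

```python
def decode_ethernet_rewrite(hex_string):
    """Split a 14-byte Ethernet II rewrite string into its three fields."""
    if len(hex_string) != 28:
        raise ValueError("expected a 14-byte Ethernet II header (28 hex digits)")
    value = hex_string.lower()

    def dotted(mac):
        # Format 12 hex digits in the dotted style used by show ip arp.
        return ".".join(mac[i:i + 4] for i in range(0, 12, 4))

    return {
        "destination_mac": dotted(value[0:12]),
        "source_mac": dotted(value[12:24]),
        "ethertype": "0x" + value[24:28],
    }


# Rewrite string from the adjacency entry in Example 7-63.
header = decode_ethernet_rewrite("C40202640000C4010F5C00010800")
print(header)
```

The destination MAC c402.0264.0000 matches the ARP entry for next hop 10.2.1.1, the source MAC c401.0f5c.0001 matches the router's own FastEthernet0/1 address (10.2.1.2), and EtherType 0x0800 identifies IPv4.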

The steps shown can be used as a generic procedure for finding issues with CEF switching.
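The link between Step 6 and Step 7 can be checked mechanically. The following is a minimal sketch, assuming the hex string in Example 7-63 is a plain 14-byte Ethernet II header (destination MAC, source MAC, EtherType); the function name decode_ethernet_rewrite is hypothetical and not part of any Cisco tool.

```python
def decode_ethernet_rewrite(rewrite_hex: str) -> dict:
    """Split a 14-byte Ethernet II rewrite string into its three fields."""
    raw = bytes.fromhex(rewrite_hex)
    if len(raw) != 14:
        raise ValueError("expected a 14-byte Ethernet II header")

    def cisco_mac(mac_bytes: bytes) -> str:
        # Format six bytes in the dotted notation used by show ip arp.
        digits = mac_bytes.hex()
        return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

    return {
        "dst_mac": cisco_mac(raw[0:6]),    # MAC address of the next hop
        "src_mac": cisco_mac(raw[6:12]),   # MAC address of the outgoing interface
        "ethertype": "0x" + raw[12:14].hex(),
    }


# Rewrite string copied from Example 7-63.
header = decode_ethernet_rewrite("C40202640000C4010F5C00010800")
print(header)
```

Under that assumption, the destination MAC should equal the ARP entry for 10.2.1.1 in Example 7-64, and the source MAC should equal the router's own FastEthernet0/1 entry (10.2.1.2).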

Troubleshooting Router Memory Issues

Memory-allocation failure is the most common router memory issue. Memory-allocation failures happen when the router has used all available memory (temporarily or permanently), or when the memory has been fragmented into such small pieces that the router cannot find a usable available block. This can happen to the processor memory, which is used by Cisco IOS Software, or to the packet memory, which is used to buffer incoming and outgoing packets. Symptoms of memory-allocation failures include the following:

  • Messages such as %SYS-2-MALLOCFAIL: Memory allocation of 1028 bytes failed from 0x6015EC84, Pool Processor, alignment 0 appear in the router logs.

  • show commands generate no output.

  • Low on memory messages appear.

  • The message Unable to create EXEC - no memory or too many processes appears on the console.

When a router is low on memory, in some instances it is not even possible to use Telnet to connect to the router. When you get to this point, you need to get access to the console port to collect data for troubleshooting. When connecting to the console port, however, you might see the Unable to create EXEC - no memory or too many processes message. If you see this message, there is not even enough available memory to allow for a console connection.
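When these messages are collected on a syslog server, the fields of interest (requested size, caller address, and pool) can be pulled out with a regular expression. This is a minimal sketch, assuming the message is logged with ASCII hyphens and in the layout quoted above; parse_mallocfail is a hypothetical helper name.

```python
import re

# Pattern for the fixed part of a MALLOCFAIL message as quoted in the text.
MALLOCFAIL_RE = re.compile(
    r"%SYS-2-MALLOCFAIL: Memory allocation of (?P<size>\d+) bytes failed "
    r"from (?P<caller>0x[0-9A-Fa-f]+), [Pp]ool (?P<pool>[^,]+), "
    r"alignment (?P<alignment>\d+)"
)


def parse_mallocfail(line: str):
    """Return the message fields as a dict, or None if the line does not match."""
    match = MALLOCFAIL_RE.search(line)
    if match is None:
        return None
    return {
        "size": int(match["size"]),
        "caller": match["caller"],
        "pool": match["pool"].strip(),
        "alignment": int(match["alignment"]),
    }


sample = ("%SYS-2-MALLOCFAIL: Memory allocation of 1028 bytes failed "
          "from 0x6015EC84, Pool Processor, alignment 0")
parsed = parse_mallocfail(sample)
print(parsed)
```

The pool field indicates whether processor memory or packet (I/O) memory is affected, which determines which of the causes listed next is worth investigating first.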

Some of the main reasons for memory problems are as follows:

  • Memory size does not support the Cisco IOS Software image: First, check the Release Notes (available to registered customers only) or the IOS Upgrade Planner (available to registered customers only) for the minimum memory size for the Cisco IOS Software feature set and version that you are running. Make sure that you have sufficient memory in your router to support the software image. The actual memory requirements will vary based on protocols used, routing tables, and traffic patterns on the network.

  • Memory-leak bug: A memory leak occurs when a process requests or allocates memory and then fails to free (de-allocate) the memory when it is finished with that task. As a result, the memory block stays reserved until the router is reloaded. The show memory allocating-process totals command helps you identify how much memory is used and how much is free, as well as the per-process memory utilization of the router. Example 7-65 shows sample output from this command. Memory leaks are caused by bugs in the Cisco IOS code, and the only solution is to upgrade Cisco IOS Software on the device to a version that fixes the issue.

    Example 7-65: show memory allocating-process totals Command Output

    Router# show memory allocating-process totals
    Head Total (b) Used(b) Free(b) Lowest(b) Largest(b)
    Processor 62A2B2D0 183323952 26507580 156816372 155132764 154650100
    I/O ED900000 40894464 4957092 35937372 35887920 3590524
    Allocator PC Summary for: Processor
    PC Total Count Name
    0x6136A5A8 5234828 1 Init
    0x608E2208 3576048 812 TTY data
    0x6053ECEC 1557568 184 Process Stack
    0x61356928 1365448 99 Init
    ! Output omitted for brevity

  • Security-related problems: MALLOCFAIL errors can also be caused by a security issue, such as a worm or virus operating in your network. This is a likely cause, especially if there have not been any recent changes to the network, such as router IOS upgrades or configuration changes. You can often mitigate the effect of this type of problem by adding a number of configuration statements to your router, such as an access list that drops the traffic generated by the worm or virus. The Cisco Product Security Advisories and Notices page contains information on detection of the most likely causes and specific workarounds.

  • Memory-allocation failure at process = interrupt level: The error message identifies the cause. If the process is listed as <interrupt level>, as shown in the message that follows, the memory-allocation failure is being caused by a software problem:

    %SYS-2-MALLOCFAIL: Memory allocation of 68 bytes failed from
    0x604CEF48, pool Processor, alignment 0-Process= <interrupt level>,
    ipl= 3

    You can use the Bug Toolkit to search for a matching software bug ID (unique bug identification) for this issue. After you have identified the software bug, upgrade to a Cisco IOS Software version that contains the fix to resolve the problem.

  • Buffer-leak bug: When a process is finished using a buffer, the process should free the buffer. A buffer leak occurs when the code fails to free it. As a result, the buffer pool continues to grow as more and more packets are stuck in the buffers.
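Returning to the pool summary in Example 7-65, each pool line can be reduced to two ratios: how much of the pool is in use, and how large the largest free block is relative to all free memory. The sketch below assumes the seven-column layout shown in that example; summarize_pool is a hypothetical helper, and no particular threshold is implied by the source.

```python
def summarize_pool(line: str) -> dict:
    """Compute usage ratios from one pool line of the memory summary."""
    name, _head, total, used, free, _lowest, largest = line.split()
    total, used, free, largest = int(total), int(used), int(free), int(largest)
    return {
        "pool": name,
        # Share of the pool that is currently allocated.
        "used_pct": round(100 * used / total, 1),
        # Share of the free memory that is available as one contiguous block.
        "largest_free_pct": round(100 * largest / free, 1),
    }


# Processor pool line copied from Example 7-65.
processor = summarize_pool(
    "Processor 62A2B2D0 183323952 26507580 156816372 155132764 154650100"
)
print(processor)
```

A used percentage that keeps climbing between samples is consistent with a leak, while a largest free block that is small compared to the free total is consistent with the fragmentation described earlier.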

The show interfaces command displays statistics for all interfaces configured on the router. Example 7-66 displays sample output from this command. The output indicates that the interface input queue is wedged, which is a symptom of a buffer leak. The full input queue (76/75) warns of a buffer leak. Here, the values 76 and 75 represent the number of packets in the input queue and the maximum size of the input queue, respectively. In other words, the number of packets in the input queue is larger than the queue depth. This is called a wedged interface. When the input queue of an interface is wedged, the router no longer forwards traffic that enters the affected interface.

Example 7-66: show interfaces Command Output Displays a Full Input Queue, a Sign of Buffer Leak

Router# show interfaces
<...output omitted...>
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:58, output never, output hang never
Last clearing of "show interface" counters never
input queue 76/75, 1250 drops
Output queue 0/40, 0 drops;
! Output omitted for brevity
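The wedged condition described above is a simple comparison, so it can be checked across saved command output. This is a minimal sketch, assuming the "input queue size/max, N drops" format shown in Example 7-66 (other IOS releases format this line differently); input_queue_state is a hypothetical helper name.

```python
import re

QUEUE_RE = re.compile(r"input queue (\d+)/(\d+), (\d+) drops", re.IGNORECASE)


def input_queue_state(show_interfaces_text: str):
    """Return queue size, maximum, drops, and whether the queue is wedged."""
    match = QUEUE_RE.search(show_interfaces_text)
    if match is None:
        return None
    size, maximum, drops = (int(value) for value in match.groups())
    # Wedged: more packets in the input queue than the queue depth allows.
    return {"size": size, "max": maximum, "drops": drops, "wedged": size > maximum}


sample_output = """Last clearing of "show interface" counters never
input queue 76/75, 1250 drops
Output queue 0/40, 0 drops;"""
state = input_queue_state(sample_output)
print(state)
```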

The show buffers command displays statistics for the buffer pools on the router. The output in Example 7-67 reveals a buffer leak in the middle buffers pool. There are a total of 17602 middle buffers in the router, and only 11 are in the free list. This implies that some process takes all the buffers, but does not return them. Other symptoms of this type of buffer leak are %SYS-2-MALLOCFAIL error messages for the pool "processor" or "input/output (I/O)," based on the platform. Similar to a generic memory leak, a buffer leak is caused by a software bug, and the only solution is to upgrade Cisco IOS Software on the device to a version that fixes the issue.

Example 7-67: show buffers Command Output Indicates Buffer Leak

Router# show buffers
<...output omitted...>
Middle buffers, 600 bytes (total 17602, permanent 170):
11 in free list (10 min, 400 max allowed)
498598 hits, 148 misses, 671 trims, 657 created
0 failures (0 no memory)
! Output omitted for brevity
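The reasoning applied to Example 7-67 (many buffers in total, very few free) can be sketched as a small parser. It assumes the two-line pool header format shown in the example; buffer_pools is a hypothetical helper name, and deciding what counts as a leak is left to the reader because the source gives no threshold.

```python
import re

POOL_RE = re.compile(
    r"^(?P<name>[A-Za-z ]+?) buffers, (?P<size>\d+) bytes "
    r"\(total (?P<total>\d+), permanent (?P<permanent>\d+)\):\s+"
    r"(?P<free>\d+) in free list",
    re.MULTILINE,
)


def buffer_pools(show_buffers_text: str) -> list:
    """Return one summary dict per buffer pool found in the output."""
    pools = []
    for match in POOL_RE.finditer(show_buffers_text):
        total = int(match["total"])
        free = int(match["free"])
        pools.append({
            "name": match["name"],
            "total": total,
            "free": free,
            "in_use": total - free,
            # Buffers created on top of the permanent allocation.
            "above_permanent": total - int(match["permanent"]),
        })
    return pools


sample_output = """Middle buffers, 600 bytes (total 17602, permanent 170):
11 in free list (10 min, 400 max allowed)
498598 hits, 148 misses, 671 trims, 657 created
0 failures (0 no memory)"""
pools = buffer_pools(sample_output)
print(pools)
```

For the middle pool in the example, nearly every buffer is in use and the pool has grown far beyond its permanent size, which matches the leak pattern described in the text.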

BGP Memory Use

Cisco IOS has three main processes used by the Border Gateway Protocol (BGP):

  • BGP I/O: This process handles reading, writing, and executing of all BGP messages. This process is also the interface between TCP and BGP.

  • BGP router: This process is responsible for initiation of a BGP process, session maintenance, processing of incoming updates, sending of BGP updates, and updating the IP RIB (Routing Information Base) with BGP entries.

  • BGP scanner: This process performs periodic scans of the BGP RIB to update it as necessary, and it scans the IP RIB to ensure that all BGP next hops are valid.

The BGP router process consumes the majority of the memory used by BGP. The BGP router process uses memory to store the BGP RIB, IP RIB for BGP prefixes, and IP switching data structures for BGP prefixes. If you do not have enough memory to store this information, BGP cannot operate in a stable manner, and network reliability will be compromised. If you are using chassis-based routers, which distribute routing information to the line cards, you should not only check the memory availability for the route processor, but also the memory availability on the line cards. The show diag command displays the different types of cards present in your router and their respective amounts of memory, as demonstrated in Example 7-68. This command is useful to identify a lack of memory on the line cards when the router runs BGP.

Example 7-68: show diag Command Output

Router# show diag | I (DRAM|SLOT)
SLOT 0 (RP/LC 0 ): 1 Port SONET based SRP OC-12c/STM-4 Single Mode
DRAM size: 268435456 bytes
FrFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
ToFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
SLOT 2 (RP/LC 2 ): 12 Port Packet over E3
DRAM size: 67108864 bytes
FrFab SDRAM size: 67108864 bytes
ToFab SDRAM size: 67108864 bytes
SLOT 3 (RP/LC 3 ): 1 Port Gigabit Ethernet
DRAM size: 134217728 bytes
FrFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
ToFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
SLOT 5 (RP/LC 5 ): Route Processor
DRAM size: 268435456 bytes
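Because the sizes in Example 7-68 are reported in bytes, comparing line cards is easier after converting them. The following is a minimal sketch, assuming the filtered output format shown above; dram_per_slot is a hypothetical helper that reports only the main DRAM of each slot and ignores the FrFab and ToFab SDRAM lines.

```python
import re


def dram_per_slot(show_diag_text: str) -> dict:
    """Map each slot number to its main DRAM size in MiB."""
    slots = {}
    current_slot = None
    for line in show_diag_text.splitlines():
        slot = re.match(r"\s*SLOT (\d+)", line)
        if slot:
            current_slot = int(slot.group(1))
            continue
        # re.match anchors at the line start, so "FrFab SDRAM size" is skipped.
        dram = re.match(r"\s*DRAM size: (\d+) bytes", line)
        if dram and current_slot is not None:
            slots[current_slot] = int(dram.group(1)) // (1024 * 1024)
    return slots


sample_output = """SLOT 0 (RP/LC 0 ): 1 Port SONET based SRP OC-12c/STM-4 Single Mode
DRAM size: 268435456 bytes
FrFab SDRAM size: 134217728 bytes, SDRAM pagesize: 8192 bytes
SLOT 2 (RP/LC 2 ): 12 Port Packet over E3
DRAM size: 67108864 bytes
FrFab SDRAM size: 67108864 bytes
SLOT 5 (RP/LC 5 ): Route Processor
DRAM size: 268435456 bytes"""
dram = dram_per_slot(sample_output)
print(dram)
```

In this excerpt, the line card in slot 2 has a quarter of the route processor's DRAM, which is the kind of imbalance worth checking when the router carries a large BGP table.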

Summary

The main categories of application services are as follows:

  • Network classification

  • Application scalability

  • Application networking

  • Application acceleration

  • WAN acceleration

  • Application optimization

The recipe for application optimization is a four-step cycle that incrementally increases your understanding of network applications and allows you to progressively deploy measurable improvements and adjustments as required, as follows:

Step 1

Baseline application traffic.

Step 2

Optimize the network.

Step 3

Measure, adjust, and verify.

Step 4

Deploy new applications.

NetFlow efficiently provides a vital set of services for IP applications, including network traffic accounting, usage-based network billing, network planning, security DoS monitoring, and overall network monitoring. A flow is a unidirectional stream of packets, between a given source and a destination, that have several components in common. The seven fields that need to match for packets to be considered part of the same flow are as follows:

  • Source IP Address

  • Destination IP Address

  • Source Port (protocol dependent)

  • Destination Port (protocol dependent)

  • Protocol (Layer 3 or 4)

  • Type of Service (ToS) Value (differentiated services code point [DSCP])

  • Input Interface
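The seven-field match can be illustrated by treating the fields as a single lookup key. This is a conceptual sketch only, not how Cisco IOS implements the NetFlow cache; FlowKey, count_flows, and the sample packet values are all hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven fields that must match for packets to share a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    tos: int
    input_interface: str


def count_flows(packets) -> Counter:
    """Count packets per flow; each packet is a dict with the seven fields."""
    return Counter(FlowKey(**packet) for packet in packets)


request = dict(src_ip="10.1.1.1", dst_ip="10.11.1.1", src_port=40000,
               dst_port=80, protocol=6, tos=0, input_interface="Fa0/0")
# The reply swaps addresses and ports and arrives on another interface.
reply = dict(src_ip="10.11.1.1", dst_ip="10.1.1.1", src_port=80,
             dst_port=40000, protocol=6, tos=0, input_interface="Fa0/1")

flows = count_flows([request, request, reply])
print(len(flows))
```

The two request packets fall into one flow, while the reply forms a second flow because a flow is unidirectional.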

IP SLA is useful for performance measurement, monitoring, and network baselining. You can tie the results of the IP SLA operations to other features of your router, and trigger action based on the results of the probe. To implement IP SLA network performance measurement, you need to perform the following tasks:

  • Enable the IP SLA responder, if required.

  • Configure the required IP SLA operation type.

  • Configure any options available for the specified operation type.

  • Configure threshold conditions, if required.

  • Schedule the operation to run, and then let the operation run for a period of time to gather statistics.

  • Display and interpret the results of the operation using the Cisco IOS CLI or an NMS, with SNMP.

NBAR is another important tool for baselining and traffic classification purposes. NBAR is a classification engine that recognizes a wide variety of applications, including web-based and other difficult-to-classify protocols that utilize dynamic TCP/UDP port assignments. The simplest use of NBAR is baselining through protocol discovery.

The Cisco IOS SLB feature is a Cisco IOS-based solution that provides server load balancing. This feature allows you to define a virtual server that represents a cluster of real servers, known as a server farm. When a client initiates a connection to the virtual server, the Cisco IOS SLB load balances the connection to a chosen real server based on the configured load-balance algorithm or predictor.

Cisco AutoQoS is an automation tool for deploying QoS policies. The newer versions of Cisco AutoQoS have two phases. In the first phase, information is gathered and traffic is baselined to define traffic classes and volumes; this is called autodiscovery. The command auto discovery qos is entered at the interface configuration mode. You must let discovery run for a period of time that is appropriate for your baselining or monitoring needs. The auto qos command, which is also an interface configuration mode command, uses the information gathered by autodiscovery to apply QoS policies accordingly. The autodiscovery phase generates templates on the basis of the data collected. These templates are then used to create QoS policies. Finally, the policies are installed by AutoQoS on the interface.

For Cisco AutoQoS to work, certain requirements must be met, as follows:

  • CEF must be enabled on the interface.

  • The interface (or subinterface) must have an IP address configured.

  • For serial interfaces (or subinterfaces), the appropriate bandwidth must be configured.

  • On point-to-point serial interfaces, both sides must be configured with AutoQoS.

Some useful NetFlow troubleshooting commands are the following:

  • show ip cache flow

  • show ip flow export

  • show ip flow interface

  • debug ip flow export

Useful IP SLA troubleshooting commands include the following:

  • show ip sla monitor statistics

  • show ip sla monitor collection-statistics

  • show ip sla monitor configuration

  • debug ip sla monitor trace

Some useful NBAR troubleshooting commands are these:

  • show ip nbar port-map

  • show ip nbar protocol-discovery

  • debug ip nbar unclassified-port-stats

Some of the useful AutoQoS troubleshooting commands are as follows:

  • show auto qos interface

  • show auto discovery qos

Troubleshooting performance problems is a three-step process:

Step 1

Assessing whether the problem is technical in nature

Step 2

Isolating the performance problem to a device, link, or component

Step 3

Diagnosing and resolving the performance degradation at the component level

The following events cause spikes in the CPU utilization:

  • Processor-intensive Cisco IOS commands

  • Routing protocol update processing

  • SNMP polling

Some common interface and wiring problems are as follows:

  • No cable connected

  • Wrong port

  • Device has no power

  • Wrong cable type

  • Bad cable

  • Loose connections

  • Patch panels

  • Faulty media converters

  • Bad or wrong GBIC

A common symptom of a router CPU that is too busy is that the router fails to respond to certain service requests. In those situations, the router might exhibit the following behaviors:

  • Slow response to Telnet requests or to the commands issued in active Telnet sessions

  • Slow response to commands issued on the console

  • High latency on ping responses or too many ping timeouts

  • Failure to send routing protocol packets to other routers

When troubleshooting CEF, you always want to check and verify the following:

  • Is CEF enabled globally and per interface?

  • Is there a FIB entry for a given network destination?

  • Is there a next hop associated with this entry?

  • Is there an adjacency entry for this next hop?

Symptoms of memory-allocation failures include the following:

  • Messages such as %SYS-2-MALLOCFAIL: Memory allocation of 1028 bytes failed from 0x6015EC84, Pool Processor, alignment 0 display in the router logs.

  • Not getting any output from show commands.

  • Receiving Low on memory messages.

  • Receiving the message Unable to create EXEC - no memory or too many processes on the console.

Some of the main reasons for memory problems are as follows:

  • Memory size does not support the Cisco IOS Software image

  • Memory-leak bug

  • Security-related problems

  • Memory-allocation failure at process = interrupt level error message

  • Buffer-leak bug

