
Voice Issues and Requirements

This section describes issues that might affect voice quality and then introduces various mechanisms that can be used to improve the quality of voice in an integrated network.

Voice Quality Issues

Overall voice quality is a function of many factors, including delay, jitter, packet loss, and echo. This section discusses these factors and ways to minimize them.

Packet Delays

Packet delay can cause voice quality degradation. When designing networks that transport voice, you must understand and account for the network’s delay components. Correctly accounting for all potential delays ensures that overall network performance is acceptable.

The generally accepted limit for good-quality voice connection delay is 150 milliseconds (ms) one-way. As delays increase, the communication between two people falls out of sync (for example, they speak at the same time or both wait for the other to speak); this condition is called talker overlap. The ITU describes network delay for voice applications in Recommendation G.114; as shown in Table 8-3, this recommendation defines three bands of one-way delay.

Table 8-3: ITU G.114 Recommended Delays for One-way Voice Traffic

| Delay | Effect on Voice Quality |
|---|---|
| 0 to 150 ms | Acceptable for most user applications. |
| 151 to 400 ms | Acceptable provided that the organization is aware of the transmission time and its impact on the transmission quality of user applications. Note that this is the expected range for a satellite link. |
| Longer than 400 ms | Unacceptable for general network planning purposes; however, this limit is exceeded in some exceptional cases. |
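The three bands in Table 8-3 can be expressed as a small lookup. The following is a minimal sketch; the function name and the band labels are illustrative assumptions, not terms from G.114.

```python
def g114_band(one_way_delay_ms: float) -> str:
    """Classify a one-way delay into the three bands of Table 8-3.

    The returned labels are hypothetical shorthand for the table rows.
    """
    if one_way_delay_ms < 0:
        raise ValueError("delay cannot be negative")
    if one_way_delay_ms <= 150:
        return "acceptable"
    if one_way_delay_ms <= 400:
        return "acceptable if impact is understood"
    return "unacceptable for general planning"
```

For example, `g114_band(120)` falls in the first band, whereas a satellite path with a 300-ms one-way delay falls in the second.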

Voice packets are delayed if the network is congested because of poor network quality, underpowered equipment, heavy traffic, or insufficient bandwidth. Delay can be classified into two types: fixed network delay and variable network delay.

Fixed Network Delays

Fixed network delays result from delays in network devices and contribute directly to the overall connection delay. As shown in Figure 8-26, fixed delays have three components: propagation delay, serialization delay, and processing delay.

Figure 8-26: Fixed Delays Result from Delays in Network Devices

Propagation Delay

This form of delay, which is limited by the speed of light, can be ignored for most designs because it is relatively small compared to other types of delay. A popular estimate of 10 microseconds/mile or 6 microseconds/kilometer is used for estimating propagation delay.


Note

Propagation delay has a noticeable impact on the overall delay only on satellite links.
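As a rough illustration of the estimate above, the following sketch applies the 6 microseconds/kilometer rule of thumb to a path length. The function name and the sample distance are assumptions chosen for illustration.

```python
US_PER_KM = 6  # rule-of-thumb estimate quoted in the text

def propagation_delay_ms(distance_km: float) -> float:
    """Estimate one-way propagation delay in milliseconds."""
    return distance_km * US_PER_KM / 1000

# A hypothetical 4000-km terrestrial path adds about 24 ms one-way.
print(propagation_delay_ms(4000))  # 24.0
```

Even a long terrestrial path consumes only a small share of the 150-ms budget, which is why this component is often ignored.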

Serialization Delay

The higher the circuit speed, the less time it takes to place the bits on the circuit and the less the serialization delay. Serialization delay is a constant function of link speed and packet size. It is calculated by the following formula:

Serialization delay = (packet length)/(bit rate)

A large serialization delay occurs with slow links or large packets. Serialization delay is always predictable; for example, when using a 64-kbps link and 80-byte frame, the delay is exactly 10 ms.


Note

The previous example is calculated as follows:

  • 64 kbps = 64,000 bits/sec * 1 byte/8 bits = 8000 bytes/sec = 8000 bytes/1000 ms = 8 bytes/ms

  • Serialization delay = (packet length)/(bit rate) = (80 bytes)/(8 bytes/ms) = 10 ms


Note

Serialization delay is a factor only for slow-speed links up to 1 Mbps.
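The serialization formula is simple enough to check in a few lines. This is a minimal sketch; the function and parameter names are illustrative.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time to clock a packet onto a link: (packet length)/(bit rate)."""
    return packet_bytes * 8 * 1000 / link_bps

print(serialization_delay_ms(80, 64_000))    # 10.0, the example from the text
print(serialization_delay_ms(1500, 64_000))  # 187.5, a full-size data packet
```

The second call shows why large data packets are a concern on slow links: a single 1500-byte packet occupies a 64-kbps link for longer than the entire 150-ms one-way budget.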

Processing Delay

Processing delays include the following:

  • Coding, compression, decompression, and decoding delays: These delays depend on the algorithm used for these functions, which can be performed in either hardware or software. Using specialized hardware such as a DSP dramatically improves the quality and reduces the delay associated with different voice compression schemes.

  • Packetization delay: This delay results from the process of holding the digital voice samples until enough are collected to fill the packet or cell payload. In some compression schemes, the voice gateway sends partial packets to reduce excessive packetization delay.

Variable Network Delays

Variable network delay is more unpredictable and difficult to calculate than fixed network delay. As shown in Figure 8-27 and described in the following sections, the following three factors contribute to variable network delay: queuing delay, variable packet sizes, and dejitter buffers.

Figure 8-27: Variable Delays Can Be Unpredictable

Queuing Delay and Variable Packet Sizes

Queuing delay occurs when a voice packet is waiting on the outgoing interface for others to be serviced first. This waiting time is statistically based on the arrival of traffic; the more inputs, the more likely that contention is encountered for the interface. Queuing delay is also based on the size of the packet currently being serviced; larger packets take longer to transmit than do smaller packets. Therefore, a queue that combines large and small packets experiences varying lengths of delay.

Because voice should have absolute priority in the voice gateway queue, a voice frame should wait only for either a data frame that is already being sent or for other voice frames ahead of it. For example, assume that a 1500-byte data packet is queued before the voice packet. The voice packet must wait until the entire data packet is transmitted, which produces a delay in the voice path. If the link is slow (for example, 64 or 128 kbps), the queuing delay can be large: a 1500-byte packet alone takes about 187.5 ms to serialize at 64 kbps, so the total can exceed 200 ms and result in an unacceptable voice delay.

Link fragmentation and interleaving (LFI) is a solution for queuing delay situations. With LFI, the voice gateway fragments large packets into smaller equal-sized frames and interleaves them with small voice packets. Therefore, a voice packet does not have to wait until the entire large data packet is sent. LFI reduces voice delay and makes it more predictable. Configuring LFI fragmentation on a link results in a fixed delay (for example, 10 ms); however, be sure to set the fragment size so that only data packets, not voice packets, become fragmented. Figure 8-28 illustrates the LFI concept.

Figure 8-28: LFI Ensures That Smaller Packets Do Not Get Stuck Behind Larger Packets
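The fragment size that yields a target delay follows from the serialization formula. The sketch below assumes a 10-ms target and a 66-byte G.729 voice packet (20-byte payload plus 40 bytes of IP/UDP/RTP headers and a 6-byte Layer 2 header); the function names are hypothetical.

```python
def lfi_fragment_bytes(link_bps: int, target_delay_ms: int = 10) -> int:
    """Largest fragment that serializes within the target delay."""
    return link_bps * target_delay_ms // 8000

def voice_stays_whole(fragment_bytes: int, voice_packet_bytes: int) -> bool:
    """True if voice packets are small enough to avoid being fragmented."""
    return voice_packet_bytes <= fragment_bytes

fragment = lfi_fragment_bytes(64_000)          # 80 bytes at 64 kbps
print(fragment, voice_stays_whole(fragment, 66))
```

With these assumptions, a 64-kbps link gets 80-byte fragments, which leaves a 66-byte voice packet intact; a larger voice packet (for example, uncompressed G.711) would need a larger fragment size or a faster link.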
Dejitter Buffers

Because network congestion can occur at any point in a network, interface queues can be filled instantaneously, potentially leading to a difference in delay times between packets from the same voice stream.


Note

The dejitter buffer is also referred to as the playout delay buffer.

On the first talk spurt, dejitter buffers help provide smooth playback of voice traffic. Setting these buffers too low causes overflows and data loss, whereas setting them too high causes excessive delay.

Dejitter buffers reduce or eliminate delay variation by converting it to a fixed delay. However, dejitter buffers always add delay; the amount depends on the variance of the delay.

Dejitter buffers work most efficiently when packets arrive with almost uniform delay. Various QoS congestion avoidance mechanisms exist to manage delay and avoid network congestion; if there is no variance in delay, dejitter buffers can be disabled, reducing the constant delay.

Jitter

At the sending side, the originating voice gateway sends packets in a continuous stream, spaced evenly. Because of network congestion, improper queuing, or configuration errors, this steady stream can become lumpy; in other words, as shown in Figure 8-29, the delay between each packet can vary instead of remaining constant. This can be annoying to listeners.

Figure 8-29: Jitter Is the Variation in the Delay of Received Voice Packets

When a voice gateway receives a VoIP audio stream, it must compensate for the jitter it encounters. The mechanism that handles this function is the dejitter buffer (as mentioned previously in the “Dejitter Buffers” section), which must buffer the packets and then play them out in a steady stream to the DSPs, which convert them back to an analog audio stream.
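The trade-off between buffer depth and late packets can be seen in a toy playout model. This is a simplified sketch, not how any particular gateway implements its buffer: it assumes a fixed buffer depth, a 20-ms packet interval, and hypothetical arrival times.

```python
def late_packets(arrivals_ms, buffer_ms, interval_ms=20):
    """Return indexes of packets that arrive after their playout time.

    Playout of packet i is scheduled at
    first_arrival + buffer_ms + i * interval_ms.
    """
    base = arrivals_ms[0] + buffer_ms
    return [i for i, t in enumerate(arrivals_ms) if t > base + i * interval_ms]

# Hypothetical arrivals for packets sent at 0, 20, 40, 60, 80, 100 ms.
arrivals = [50, 72, 95, 145, 131, 150]
print(late_packets(arrivals, buffer_ms=20))  # [3]
print(late_packets(arrivals, buffer_ms=40))  # []
```

With a 20-ms buffer, the fourth packet misses its playout slot and would be discarded; a 40-ms buffer absorbs the same jitter at the cost of 20 ms of extra fixed delay for every packet.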

Packet Loss

Packet loss causes voice clipping and skips. Packet loss can occur because of congested links, improper network QoS configuration, poor packet buffer management on the routers, routing problems, and other issues in both the WAN and LAN. If queues become saturated, VoIP packets might be dropped, resulting in effects such as clicks or lost words. Losses occur if the packets are received out of range of the dejitter buffer, in which case the packets are discarded.

The industry-standard codec algorithms used in the Cisco DSP can use interpolation to correct for up to 30 ms of lost voice. The Cisco VoIP technology uses 20-ms samples of voice payload per VoIP packet. Therefore, only a single packet can be lost during any given time for the codec correction algorithms to be effective.
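The single-packet limit follows from comparing the longest run of consecutive losses with the 30-ms interpolation window. The sketch below is illustrative only; the function names and the list-of-booleans input format are assumptions.

```python
def longest_loss_burst(received):
    """Length of the longest run of consecutive lost packets (False entries)."""
    longest = current = 0
    for ok in received:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest

def concealable(received, packet_ms=20, max_concealment_ms=30):
    """True if every loss burst fits inside the interpolation window."""
    return longest_loss_burst(received) * packet_ms <= max_concealment_ms

print(concealable([True, False, True, True]))   # True: one lost packet, 20 ms
print(concealable([True, False, False, True]))  # False: two in a row, 40 ms
```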

Echo

In a voice telephone call, an echo occurs when callers hear their own words repeated.

Echo is a function of delay and magnitude. The echo problem grows with the delay (the later the echo is heard) and the loudness (higher amplitude). When timed properly, an echo can be reassuring to the speaker. But if the echo exceeds approximately 25 milliseconds, it can be distracting and cause breaks in the conversation.

The following voice network elements can affect echo:

  • Hybrid transformers: A typical telephone is a two-wire device, whereas trunk connections are four-wire; a hybrid transformer is used to interface between these connections. Hybrid transformers are often prime culprits for signal leakage between analog transmit and receive paths, causing echo. Echo is usually caused by a mismatch in impedance from the four-wire network switch conversion to the two-wire local loop or an impedance mismatch in a PBX.

  • Telephones: An analog telephone terminal itself presents a load to the PBX. This load should be matched to the output impedance of the source device (the FXS port). Some (typically inexpensive) telephones are not matched to the FXS port’s output impedance and are sources of echo. Headsets are particularly notorious for poor echo performance.

    When digital telephones are used, the point of digital-to-analog conversion occurs inside the telephone. Extending the digital transmission segments closer to the actual telephone decreases the potential for echo.


Note

The belief that adding voice gateways (routers) to a voice network creates echo is a common misconception. Digital segments of the network do not cause leaks; so, technically, voice gateways cannot be the source of echo. However, adding routers does add delay, which can make a previously imperceptible echo perceptible.

An echo canceller, shown in Figure 8-30, can be placed in the network to improve the quality of telephone conversation. An echo canceller is a component of a voice gateway; it reduces the level of echo leaking from the receive path into the transmit path.

Figure 8-30: Echo Cancellers Reduce the Echo Level

Echo cancellers are built into low-bit-rate codecs and operate on each DSP. By design, echo cancellers are limited by the total amount of time they wait for the reflected speech to be received. This is known as an echo trail or echo cancellation time and is usually between 16 and 32 milliseconds.

To understand how an echo canceller works, assume that a person in Toronto is talking to a person in Vancouver. When the speech of the person in Toronto hits an impedance mismatch or other echo-causing environment, it bounces back to that person, who can hear the echo several milliseconds after speaking.

Recall that the problem is at the other end of the call (called the tail circuit); in this example, the tail circuit is in Vancouver. To remove the echo from the line, the router in Toronto must keep an inverse image of the Toronto person’s speech for a certain amount of time. This is called inverse speech. The echo canceller in the router listens for sound coming from the person in Vancouver and subtracts the inverse speech of the person in Toronto to remove any echo.

The ITU-T defines an irritation zone of echo loudness and echo delay. A short echo (around 15 ms) does not have to be suppressed, whereas longer echo delays require strong echo suppression. Therefore, all networks that produce one-way time delays greater than 16 ms require echo cancellation. It is important to configure the appropriate echo cancellation time. If the echo cancellation time is set too low, callers still hear echo during the phone call. If the configured echo cancellation time is set too high, it takes longer for the echo canceller to converge and eliminate the echo.

Attenuating the signal below the noise level can also eliminate echo.
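The idea of keeping an image of the outgoing speech and subtracting it from the returning signal can be illustrated with an adaptive filter. The sketch below uses the normalized least-mean-squares (NLMS) algorithm, a common textbook technique chosen here for illustration; it is not necessarily what any Cisco DSP uses. The tap count, step size, and the synthetic echo path (a 5-sample delay with a gain of 0.5) are all assumptions.

```python
import random

def nlms_echo_cancel(sent, returned, taps=16, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the echo of `sent` from `returned`."""
    weights = [0.0] * taps   # adaptive estimate of the echo path
    history = [0.0] * taps   # most recent sent samples, newest first
    residual = []
    for x, d in zip(sent, returned):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        err = d - estimate   # what remains after removing the estimated echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * err * h / norm for w, h in zip(weights, history)]
        residual.append(err)
    return residual

def power(samples):
    return sum(s * s for s in samples) / len(samples)

random.seed(1)
toronto_speech = [random.uniform(-1.0, 1.0) for _ in range(2000)]
delay, gain = 5, 0.5         # hypothetical tail circuit: delayed, attenuated copy
echo = [0.0] * delay + [gain * s for s in toronto_speech[:-delay]]
residual = nlms_echo_cancel(toronto_speech, echo)
print(power(residual[-500:]) < 0.01 * power(echo[-500:]))
```

The number of taps plays the role of the echo cancellation time: the filter can only cancel echoes whose delay fits inside its history window, and a longer window takes longer to converge, which mirrors the too-low and too-high settings described above.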

Voice Coding and Compression

Voice communication over IP relies on voice that is coded and encapsulated into IP packets. This section provides an overview of the various codecs used in voice networks.


Note

The term codec can have the following two meanings:

  • A coder-decoder: An integrated circuit device that typically uses PCM to transform analog signals into a digital bit stream and digital signals back into analog signals.

  • A software algorithm: Used to compress and decompress speech or audio signals in VoIP, Frame Relay, and ATM.

Coding and Compression Algorithms

Each codec provides a certain quality of speech. Advances in technology have greatly improved the quality of compressed voice and have resulted in a variety of coding and compression algorithms:

  • PCM: The toll quality voice expected from the PSTN. PCM runs at 64 kbps and provides no compression, and therefore no opportunity for bandwidth savings.

  • Adaptive Differential Pulse Code Modulation (ADPCM): Provides three different levels of compression. Some fidelity is lost as compression increases. Depending on the traffic mix, cost savings generally run at 25 percent for 32-kbps ADPCM, 30 percent for 24-kbps ADPCM, and 35 percent for 16-kbps ADPCM.

  • Low-Delay Code Excited Linear Prediction Compression (LD-CELP): This algorithm models the human voice. Depending on the traffic mix, cost savings can be up to 35 percent for 16-kbps LD-CELP.

  • Conjugate Structure Algebraic Code Excited Linear Prediction Compression (CS-ACELP): Provides eight times the bandwidth savings over PCM. CS-ACELP is a more recently developed algorithm modeled after the human voice and delivers quality that is comparable to LD-CELP and 32-kbps ADPCM. Cost savings are approximately 40 percent for 8-kbps CS-ACELP.

  • Code Excited Linear Prediction Compression (CELP): Provides huge bandwidth savings over PCM. Cost savings can be up to 50 percent for 5.3-kbps CELP.

The following section details voice coding standards based on these algorithms.

Voice Coding Standards (Codecs)

The ITU has defined a series of standards for voice coding and compression:

  • G.711: Uses the 64-kbps PCM voice coding technique. G.711-encoded voice is already in the correct format for digital voice delivery in the PSTN or through PBXs. Most Cisco implementations use G.711 on LAN links because of its high quality, approaching toll quality.

  • G.726/G.727: G.726 uses the ADPCM coding at 40, 32, 24, and 16 kbps. ADPCM voice can be interchanged between packet voice and public telephone or PBX networks if the latter has ADPCM capability. G.727 is a specialized version of G.726; it includes the same bandwidths.

  • G.728: Uses the LD-CELP voice compression, which requires only 16 kbps of bandwidth. LD-CELP voice coding must be transcoded to a PCM-based coding before delivering to the PSTN.

  • G.729: Uses the CS-ACELP compression, which enables voice to be coded into 8-kbps streams. This standard has various forms, all of which provide speech quality similar to that of 32-kbps ADPCM.

    For example, in G.729a, the basic algorithm was optimized to reduce the computation requirements. In G.729b, voice activity detection (VAD) and comfort noise generation were added. G.729ab provides an optimized version of G.729b requiring less computation.

  • G.723.1: Uses a dual-rate coder for compressing speech at very low bit rates. Two bit rates are associated with this standard: 5.3 kbps using algebraic code-excited linear prediction (ACELP) and 6.3 kbps using Multipulse Maximum Likelihood Quantization (MPMLQ).

Sound Quality

Each codec provides a certain quality of speech. The perceived quality of transmitted speech depends on a listener’s subjective response.

The mean opinion score (MOS) is a common benchmark used to specify the quality of sound produced by specific codecs. To determine the MOS, a wide range of listeners judge the quality of a voice sample corresponding to a particular codec on a scale of 1 (bad) to 5 (excellent). The scores are averaged to provide the MOS for that sample. Table 8-4 shows the relationship between codecs and MOS scores; notice that MOS generally decreases with increased codec complexity.

Table 8-4: Voice Coding and Compression Results

| Algorithm | ITU Standard | Data Rate[1] | MOS Score |
|---|---|---|---|
| PCM | G.711 | 64 kbps | 4.1 |
| ADPCM | G.726/G.727 | 16/24/32/40 kbps | 3.85 or less |
| LD-CELP | G.728 | 16 kbps | 3.61 |
| CS-ACELP | G.729 | 8 kbps | 3.92 |
| ACELP/MPMLQ | G.723.1 | 6.3/5.3 kbps | 3.9/3.65 |

[1] Data rates shown are for digitized speech only. In addition to this coded digital stream payload, RTP, UDP, IP, and Layer 2 headers are needed.

The Perceptual Speech Quality Measurement (PSQM) is a newer, more objective measurement that is overtaking MOS scores as the industry quality measurement of choice for coding algorithms. PSQM is specified in ITU standard P.861. PSQM provides a rating on a scale of 0 to 6.5, where 0 is best and 6.5 is worst. PSQM is implemented in test equipment and monitoring systems. It compares the transmitted speech to the original input to produce a PSQM score for a test voice call over a particular packet network. Some PSQM test equipment converts the 0-to-6.5 scale to a 0-to-5 scale to correlate to MOS.

Codec Complexity, DSPs, and Voice Calls

A codec is a technology for compressing and decompressing data; it is implemented in DSPs. Some codec compression techniques require more processing power than others.

The number of calls supported depends on the DSP and the complexity of the codec used. For example, as illustrated in Table 8-5, the Cisco High-Density Packet Voice/Fax DSP Module (AS54-PVDM2-64) for Cisco voice gateways provides high-density voice connectivity supporting 24 to 64 channels (calls), depending on codec compression complexity.

Table 8-5: Codec Complexity and Calls per DSP on the AS54-PVDM2-64 Voice/Fax DSP Module

| Low Complexity (Maximum 64 Calls) | Medium Complexity (Maximum 32 Calls) | High Complexity (Maximum 24 Calls) |
|---|---|---|
| G.711 a-law | G.729a | G.723.1: 5.3/6.3 kbps |
| G.711 Mu-law | G.729ab | G.723.1a: 5.3/6.3 kbps |
| Fax Passthrough | G.726: 16/24/32 kbps | G.728 |
| Modem Passthrough | T.38 fax relay | Modem relay |
| Clear-channel codec | Cisco Fax Relay | Adaptive multirate narrow band: 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, and 12.2 kbps, and silence insertion descriptor |
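The per-module maximums in Table 8-5 translate directly into a sizing estimate. The following is a minimal sketch that assumes all calls use codecs of the same complexity class; the dictionary and function names are hypothetical.

```python
import math

# Maximum calls per AS54-PVDM2-64 module, taken from Table 8-5.
MAX_CALLS_PER_MODULE = {"low": 64, "medium": 32, "high": 24}

def modules_needed(concurrent_calls: int, complexity: str) -> int:
    """DSP modules required if every call uses the given complexity class."""
    return math.ceil(concurrent_calls / MAX_CALLS_PER_MODULE[complexity])

for level in ("low", "medium", "high"):
    print(level, modules_needed(48, level))
```

For 48 concurrent calls, one module suffices for G.711, whereas a medium- or high-complexity codec needs two.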

Bandwidth Considerations

Bandwidth availability is a key issue to consider when designing voice on IP networks. The amount of bandwidth per call varies greatly, depending on which codec is used and how many voice samples are required per packet. However, the best coding mechanism does not necessarily result in the best voice quality; for example, the better the compression, the worse the voice quality. The designer must decide which is more important: better voice quality or more efficient bandwidth consumption.

Reducing the Amount of Voice Traffic

Two techniques reduce the amount of traffic per voice call and therefore use available bandwidth more efficiently: cRTP and VAD.

Compressed Real-Time Transport Protocol

All voice packets encapsulated into IP consist of two components: the payload, which is the voice sample, and IP/UDP/RTP headers. Although voice samples are compressed by the DSP and can vary in size based on the codec used, the headers are a constant 40 bytes. When compared to the 20 bytes of voice samples in a G.729 call, the headers make up a considerable amount of overhead. As illustrated in Figure 8-31, cRTP compresses the headers to 2 or 4 bytes, thereby offering significant bandwidth savings. cRTP is sometimes referred to as RTP header compression. RFC 2508, Compressing IP/UDP/RTP Headers for Low-Speed Serial Links, describes cRTP.

Figure 8-31: RTP Header Compression
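The effect of header compression is easiest to see as the share of each packet taken up by headers. This sketch uses the figures from the text (40-byte headers, 2-byte compressed headers, 20-byte G.729 payload) and ignores the Layer 2 header; the function name is illustrative.

```python
def header_share(header_bytes: int, payload_bytes: int) -> float:
    """Fraction of the IP packet occupied by IP/UDP/RTP headers."""
    return header_bytes / (header_bytes + payload_bytes)

print(round(header_share(40, 20), 3))  # 0.667 without cRTP
print(round(header_share(2, 20), 3))   # 0.091 with 2-byte cRTP headers
```

Without compression, two-thirds of each G.729 packet is header; with cRTP, the header share drops to under one-tenth.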

Enabling compression on a low-bandwidth serial link can greatly reduce the network overhead and conserve WAN bandwidth if there is a significant volume of RTP traffic. In general, enable cRTP on slow links up to 768 kbps. However, cRTP is not recommended for higher-speed links because of its high CPU requirements.


Note

Because cRTP compresses VoIP calls on a link-by-link basis, all links on the path must be configured for cRTP.

Voice Activity Detection

On average, about 35 percent of a call is silence. In traditional voice networks, all voice calls use a fixed bandwidth of 64 kbps regardless of how much of the conversation is speech and how much is silence. When VoIP is used, this silence is packetized along with the conversation. VAD suppresses packets of silence, so instead of sending IP packets of silence, only IP packets of conversation are sent. Therefore, gateways can interleave data traffic with actual voice conversation traffic, resulting in more effective use of the network bandwidth.


Note

In some cases, Cisco recommends disabling VAD, such as when faxes are to be sent through the network. VAD can also degrade the call’s perceived quality, because when VAD is enabled, silence is replaced by comfort noise played to the listener by the device at the listener’s end of the network. If this causes problems, VAD should be disabled.
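A first-order estimate of the saving from VAD scales the per-call bandwidth by the fraction of the call that is speech. This sketch is a rough approximation under stated assumptions: it uses the 35 percent silence figure from the text, ignores any comfort-noise or silence-descriptor packets, and the function name is hypothetical.

```python
def average_bandwidth_with_vad(per_call_kbps: float,
                               silence_fraction: float = 0.35) -> float:
    """Long-run average bandwidth when silent intervals are not transmitted."""
    return per_call_kbps * (1 - silence_fraction)

# A G.729 call at 26.4 kbps (uncompressed headers) averages roughly 17 kbps.
print(round(average_bandwidth_with_vad(26.4), 2))
```

Because the saving is statistical, it is safer to treat it as an average across many calls than as a guarantee for any single call.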

Voice Bandwidth Requirements

When building voice networks, one of the most important factors to consider is bandwidth capacity planning. One of the most critical concepts to understand within capacity planning is how much bandwidth is used for each VoIP call.

Table 8-6 presents a selection of codec payload sizes and the required bandwidth without compression and with cRTP. The last column shows the number of uncompressed and compressed calls that can be made on a 512-kbps link.

Table 8-6: Voice Bandwidth Requirements

| Codec | Payload Size (Bytes) | Bandwidth (kbps) | Bandwidth with cRTP (kbps) | Number of Calls on a 512-kbps Link (No Compression/with cRTP) |
|---|---|---|---|---|
| G.711 (64 kbps) | 160 | 83 | 68 | 6/7 |
| G.726 (32 kbps) | 60 | 57 | 36 | 8/14 |
| G.726 (24 kbps) | 40 | 52 | 29 | 9/17 |
| G.728 (16 kbps) | 40 | 35 | 19 | 14/26 |
| G.729 (8 kbps) | 20 | 26 | 11 | 19/46 |
| G.723 (6.3 kbps) | 24 | 18 | 8 | 28/64 |
| G.723 (5.3 kbps) | 20 | 17 | 7 | 30/73 |

The following assumptions are made in Table 8-6’s bandwidth calculations:

  • IP/UDP/RTP headers are 40 bytes.

  • RTP header compression can reduce the IP/UDP/RTP headers to 2 or 4 bytes. Table 8-6 uses 2 bytes.

  • A Layer 2 header adds 6 bytes.

Table 8-6 uses the following calculations:

  • Voice packet size = (Layer 2 header) + (IP/UDP/RTP header) + (voice payload)

  • Voice packets per second (pps) = codec bit rate/voice payload size

  • Bandwidth per call = voice packet size * voice pps

For example, the following steps illustrate how to calculate the bandwidth required for a G.729 call (8-kbps codec bit rate) with cRTP and default 20 bytes of voice payload:

  • Voice packet size (bytes) = (Layer 2 header of 6 bytes) + (compressed IP/UDP/RTP header of 2 bytes) + (voice payload of 20 bytes) = 28 bytes

  • Voice packet size (bits) = (28 bytes) * 8 bits per byte = 224 bits

  • Voice packets per second = (8-kbps codec bit rate)/(8 bits/byte * 20 bytes) = (8-kbps codec bit rate)/(160 bits) = 50 pps

  • Bandwidth per call = voice packet size (224 bits) * 50 pps = 11.2 kbps

Result: The G.729 call with cRTP requires 11.2 kbps of bandwidth. This value is rounded down to 11 in Table 8-6.
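The three calculations above can be combined into one function and checked against the worked example. This is a minimal sketch using the stated assumptions (40-byte or 2-byte IP/UDP/RTP headers and a 6-byte Layer 2 header); the function and parameter names are illustrative.

```python
def bandwidth_kbps(codec_bps: int, payload_bytes: int,
                   ip_udp_rtp_bytes: int = 40, l2_bytes: int = 6) -> float:
    """Bandwidth per call = voice packet size * voice packets per second."""
    packet_bits = (l2_bytes + ip_udp_rtp_bytes + payload_bytes) * 8
    packets_per_second = codec_bps / (payload_bytes * 8)
    return packet_bits * packets_per_second / 1000

def calls_on_link(link_kbps: float, per_call_kbps: float) -> int:
    """Whole number of calls that fit on a link."""
    return int(link_kbps // per_call_kbps)

g729 = bandwidth_kbps(8000, 20)                           # about 26.4 kbps
g729_crtp = bandwidth_kbps(8000, 20, ip_udp_rtp_bytes=2)  # about 11.2 kbps
print(g729, g729_crtp)
print(calls_on_link(512, 11), calls_on_link(512, g729_crtp))  # 46 45
```

The call counts in the last column of Table 8-6 appear to be derived from the rounded bandwidth values: 512/11 gives the 46 calls shown for G.729 with cRTP, whereas the unrounded 11.2 kbps yields 45. When a link is sized close to its limit, the unrounded figure is the safer one to plan with.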

A more precise estimate of voice codec bandwidth can be obtained using the Cisco Voice Codec Bandwidth Calculator available at http://tools.cisco.com/Support/VBC/do/CodecCalc1.do.


Note

You must be a registered user on http://www.cisco.com/ to access this calculator.

Figure 8-32 shows a portion of the results of the Cisco Voice Codec Bandwidth Calculator for the G.729 codec. This calculation uses cRTP and includes 5 percent additional overhead to accommodate the bandwidth required for signaling.

Figure 8-32: Voice Codec Bandwidth Calculator Partial Output for G.729 Codec

Codec Design Considerations

Although it might seem logical from a bandwidth consumption standpoint to convert all calls to low-bit-rate codecs to save bandwidth and consequently decrease infrastructure costs, the designer should consider both the expected voice quality and the bandwidth consumption when choosing the optimum codec. The designer should also consider the disadvantages of strong voice compression, including signal distortion resulting from multiple encodings. For example, when a G.729 voice signal is tandem-encoded three times, the MOS score drops from 3.92 (very good) to 2.68 (unacceptable). Another drawback is the codec-induced delay with low-bit-rate codecs.

QoS for Voice

IP telephony places strict requirements on IP packet loss, packet delay, and delay variation (jitter). Therefore, QoS mechanisms on Cisco switches and routers are important throughout the network if voice traffic is sharing network resources with data traffic. Redundant devices and network links that provide quick convergence after network failures or topology changes are also important to ensure a highly available infrastructure. The following summarizes the process to determine whether to implement QoS in a network:

Step 1: Determine whether the WAN is congested (for example, whether application users perceive performance degradation).

Step 2: Determine the network goals and objectives, based on the mix of network traffic. The following are some possible objectives:

  • Establish a fair distribution of bandwidth allocation across all traffic types

  • Grant strict priority to voice traffic at the expense of less-critical traffic

  • Customize bandwidth allocation so that network resources are shared among all applications, each having specific bandwidth requirements

Step 3: Analyze the traffic types, and determine how to distinguish them.

Step 4: Review the available QoS mechanisms and determine which approach best addresses the requirements and goals.

Step 5: Configure the routers for the chosen QoS strategy and observe the results.

Figure 8-33 identifies some of the QoS mechanisms available, many of which were introduced in Chapter 4, “Designing Basic Campus and Data Center Networks,” and Chapter 5. The specifics of these mechanisms for voice are reviewed here, followed by a discussion of Call Admission Control (CAC). QoS practices in the Building Access Layer are also described. This section concludes with a discussion of AutoQoS.

Figure 8-33: QoS Mechanism

Bandwidth Provisioning

Bandwidth provisioning involves accurately calculating the required bandwidth for all applications, plus the required overhead. CAC should be used to avoid using more bandwidth than has been provisioned.

Signaling Techniques

The Resource Reservation Protocol (RSVP) allows bandwidth and other resources along the routing path to be reserved so that a certain level of quality is provided for delay-sensitive traffic. Other signaling techniques include Frame Relay’s Forward Explicit Congestion Notification and Backward Explicit Congestion Notification, and those used with the various ATM adaptation types.

Classification and Marking

Add a note herePacket classification is the process of partitioning traffic into multiple priority levels or classes of service. Information in the frame or packet header is inspected, and the frame’s priority is determined. Marking is the process of changing the priority or class of service (CoS) setting within a frame or packet to indicate its classification.

Add a note hereClassification is usually performed with access control lists (ACL), QoS class maps, or route maps, using various match criteria. Network-based application recognition, described in Chapter 2, “Applying a Methodology to Network Design,” can also be used for classification. Matches can be based on the following criteria:

  • Protocol, such as a stateful protocol or a Layer 4 protocol

  • Input port

  • IP precedence or differentiated services code point (DSCP)

  • Ethernet IEEE 802.1p CoS bits

Marking is done at Layer 3 or Layer 2:

  • Layer 3 marking changes the IP precedence bits or DSCP values in the IP packet to reflect the result of QoS classification.

  • For IEEE 802.1Q frames, the 3 user priority bits in the Tag field (commonly referred to as the 802.1p bits) are used as CoS bits for Layer 2 marking; eight classes of traffic are possible with these 3 bits. Cisco IP phones, for example, can classify and mark VoIP traffic using the 802.1p bits.
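The relationship between the Layer 3 marking fields can be sketched with a few bit operations. The DSCP occupies the upper 6 bits of the IP header's former Type of Service byte, and the IP precedence is the upper 3 bits of that same byte. The function names below are hypothetical; the value 46 is the standard DSCP for Expedited Forwarding (EF), which is commonly used for voice.

```python
def tos_byte_from_dscp(dscp):
    """Return the 8-bit ToS/DS byte that carries the given 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp << 2          # the lower 2 bits are not part of the DSCP


def ip_precedence_from_dscp(dscp):
    """Return the 3-bit IP precedence seen by a device that reads only those bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp >> 3          # the upper 3 bits of the DSCP


EF = 46  # Expedited Forwarding
print(hex(tos_byte_from_dscp(EF)), ip_precedence_from_dscp(EF))
```

A 3-bit field such as IP precedence or the 802.1p CoS bits can take the values 0 through 7, which is where the eight classes come from.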

Congestion Avoidance

Recall from Chapter 5 that congestion-avoidance techniques monitor network traffic loads so that congestion can be anticipated and avoided before it becomes problematic. Congestion-avoidance techniques allow packets from streams identified as being eligible for early discard (those with lower priority) to be dropped when the queue is getting full. Congestion-avoidance techniques provide preferential treatment for high-priority traffic under congestion situations while maximizing network throughput and capacity utilization and minimizing packet loss and delay.

Weighted random early detection (WRED) is the Cisco implementation of the random early detection (RED) mechanism. WRED extends RED by using the IP Precedence bits in the IP packet header to determine which traffic should be dropped; the drop-selection process is weighted by the IP precedence. Similarly, DSCP-based WRED uses the DSCP value in the IP packet header in the drop-selection process. Distributed WRED (DWRED) is an implementation of WRED for the Versatile Interface Processor (VIP). The DWRED feature is supported only on Cisco 7000 series routers with a Route Switch Processor-based RSP7000 interface processor and Cisco 7500 series routers with a VIP-based VIP2-40 or greater interface processor.
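The weighting can be illustrated with a minimal sketch of a RED-style drop curve, in which each IP precedence value has its own thresholds. The profile values and function names are hypothetical and chosen only to show the shape of the calculation: no drops below the minimum threshold, a linearly rising drop probability between the thresholds, and all packets dropped above the maximum threshold.

```python
def wred_drop_probability(avg_queue_depth, min_threshold, max_threshold,
                          mark_prob_denominator):
    """Drop probability for one packet, given the average queue depth."""
    if avg_queue_depth < min_threshold:
        return 0.0                       # queue is short: never drop
    if avg_queue_depth > max_threshold:
        return 1.0                       # queue is too long: always drop
    max_probability = 1.0 / mark_prob_denominator
    fraction = (avg_queue_depth - min_threshold) / (max_threshold - min_threshold)
    return fraction * max_probability    # linear ramp between the thresholds


# Hypothetical profiles: (min threshold, max threshold, mark probability denominator).
# A higher precedence gets a higher minimum threshold, so it is dropped later.
PROFILES = {0: (20, 40, 10), 5: (35, 40, 10)}


def drop_probability_for(precedence, avg_queue_depth):
    min_th, max_th, denominator = PROFILES[precedence]
    return wred_drop_probability(avg_queue_depth, min_th, max_th, denominator)


for precedence in (0, 5):
    print(precedence, drop_probability_for(precedence, 30))
```

With an average depth of 30 packets, precedence 0 traffic in this sketch already faces a small drop probability while precedence 5 traffic is not yet dropped at all.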

Traffic Policing and Shaping

Traffic shaping and traffic policing (the latter also referred to as committed access rate, or CAR) are similar mechanisms in that they both inspect traffic and take action based on the various characteristics of that traffic. These characteristics can be based on whether the traffic is over or under a given rate or based on some bits in the IP packet header, such as the DSCP or IP Precedence bits.

Policing either discards the packet or modifies some aspect of it, such as its IP Precedence or CoS bits, when the policing agent determines that the packet meets a given criterion. In comparison, traffic shaping attempts to adjust the transmission rate of packets that match a certain criterion. A shaper typically delays excess traffic by using a buffer or queuing mechanism to hold packets and shape the flow when the source’s data rate is higher than expected. For example, generic traffic shaping uses a weighted fair queue to delay packets to shape the flow, whereas Frame Relay traffic shaping uses a priority queue, a custom queue, or a FIFO queue, depending on how it is configured.
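The difference between the two can be sketched with a simple token bucket, assuming equal-sized packets and discrete time ticks. Both functions below are illustrative only and do not model any particular router feature: the policer drops packets that exceed the available tokens (a real policer could re-mark them instead), whereas the shaper holds them in a backlog and sends them in later ticks.

```python
def police(arrivals, rate, burst):
    """Drop packets beyond the token bucket; return (sent per tick, total dropped)."""
    tokens, sent, dropped = burst, [], 0
    for arriving in arrivals:
        tokens = min(burst, tokens + rate)   # refill, capped at the burst size
        out = min(arriving, tokens)
        tokens -= out
        dropped += arriving - out            # excess is discarded immediately
        sent.append(out)
    return sent, dropped


def shape(arrivals, rate, burst):
    """Buffer packets beyond the token bucket; return (sent per tick, final backlog)."""
    tokens, backlog, sent = burst, 0, []
    for arriving in arrivals:
        tokens = min(burst, tokens + rate)
        backlog += arriving                  # excess waits in the buffer
        out = min(backlog, tokens)
        tokens -= out
        backlog -= out
        sent.append(out)
    return sent, backlog


burst_of_traffic = [5, 0, 0, 1]              # packets arriving in each tick
print(police(burst_of_traffic, rate=2, burst=2))
print(shape(burst_of_traffic, rate=2, burst=2))
```

For the same input, the policer loses three packets while the shaper delivers all of them, at the cost of added delay for the packets that waited.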

Congestion Management: Queuing and Scheduling

Queuing is configured on outbound interfaces and is appropriate for cases in which WAN links are occasionally congested.

There are two types of queues: the hardware queue (also called the transmit queue or TxQ) and software queues. Software queues schedule packets into the hardware queue based on the QoS requirements and include the following types: weighted fair queuing (WFQ), priority queuing (PQ), custom queuing (CQ), class-based WFQ (CBWFQ), and low latency queuing (LLQ).

LLQ adds strict priority queuing to CBWFQ; in other words, LLQ is a combination of CBWFQ and PQ. Strict priority queuing allows delay-sensitive data, such as voice, to be dequeued and sent first (before packets in other queues are dequeued), thereby giving the delay-sensitive traffic preferential treatment over other traffic.

Figure 8-34 illustrates why LLQ is the preferred queuing mechanism for voice transport on integrated networks. The LLQ policing mechanism guarantees bandwidth for voice and gives it priority over other traffic, which is queued based on CBWFQ. LLQ reduces jitter in voice conversations.

Figure 8-34: With LLQ, Voice Traffic Achieves High Priority
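The dequeuing behavior of LLQ can be sketched as a strict-priority queue placed in front of a set of weighted class queues. This is a deliberately simplified model: the class name `LlqScheduler` is hypothetical, the weighted classes are served by weighted round robin rather than by the actual CBWFQ algorithm, and the policing of the priority queue is not modeled.

```python
from collections import deque


class LlqScheduler:
    """Strict-priority queue for voice plus weighted round robin for other classes."""

    def __init__(self, weights):
        self.priority = deque()
        self.classes = {name: deque() for name in weights}
        # Each class appears in the service cycle as many times as its weight.
        self._cycle = [name for name, weight in weights.items() for _ in range(weight)]
        self._position = 0

    def enqueue(self, packet, traffic_class=None):
        """Packets with no class go to the priority (voice) queue."""
        if traffic_class is None:
            self.priority.append(packet)
        else:
            self.classes[traffic_class].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if every queue is empty."""
        if self.priority:                     # voice is always served first
            return self.priority.popleft()
        for _ in range(len(self._cycle)):
            name = self._cycle[self._position]
            self._position = (self._position + 1) % len(self._cycle)
            if self.classes[name]:
                return self.classes[name].popleft()
        return None


scheduler = LlqScheduler({"bulk": 1, "web": 2})
for packet in ("b1", "b2"):
    scheduler.enqueue(packet, "bulk")
for packet in ("w1", "w2"):
    scheduler.enqueue(packet, "web")
scheduler.enqueue("v1")                       # a voice packet arrives last
print([scheduler.dequeue() for _ in range(5)])
```

Even though the voice packet arrives after the data packets, it is transmitted first, which is the property that keeps voice delay and jitter low.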

Link Efficiency

Link efficiency techniques, including LFI and compression, can be applied to WAN paths. Recall that LFI prevents small voice packets from being queued behind large data packets, which could lead to unacceptable delays on low-speed links. With LFI, the voice gateway fragments large packets into smaller equal-sized frames and interleaves them with small voice packets so that a voice packet does not have to wait until the entire large data packet is sent. LFI reduces voice delay and makes it more predictable.
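The effect of a large data packet on a slow link follows directly from the serialization delay, which is the packet size in bits divided by the link rate. The sketch below computes that delay and the largest fragment that fits within a target delay. The function names are hypothetical, and the 10 ms target is an assumption used for illustration.

```python
def serialization_delay_ms(size_bytes, link_bps):
    """Time needed to clock a packet of the given size onto the link."""
    return size_bytes * 8 * 1000 / link_bps


def max_fragment_bytes(link_bps, target_delay_ms):
    """Largest fragment whose serialization delay stays within the target."""
    return link_bps * target_delay_ms // 8000


link = 64_000  # a 64-kbps WAN link
print(serialization_delay_ms(1500, link))    # delay behind one 1500-byte packet
print(max_fragment_bytes(link, 10))          # fragment size for a 10 ms target
```

On a 64-kbps link, a voice packet stuck behind a single 1500-byte data packet would wait 187.5 ms, which alone exceeds the 150 ms one-way delay budget; fragmenting to 80 bytes bounds the wait at about 10 ms per fragment.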

Compression of voice packets includes both header compression and payload compression. Compressed RTP (cRTP) is used to compress the large IP/UDP/RTP headers. The various codecs described in the earlier “Voice Coding and Compression” section compress the payload (the voice).

CAC

CAC mechanisms extend the QoS capabilities to protect voice traffic from being negatively affected by other voice traffic by keeping excess voice traffic off the network. The CAC function should be performed during the call setup phase so that if no network resources are available, a message can be sent to the end user, or the call can be rerouted across a different network, such as the PSTN.

CAC is an essential component of any IP telephony system that includes multiple sites connected through an IP WAN. If the provisioned voice bandwidth in the WAN is fully utilized, subsequent calls must be rejected to avoid oversubscribing the WAN, which would cause the quality of all voice calls to degrade. CAC provides this function to guarantee good voice quality in a multisite deployment involving an IP WAN.

Location-Based CAC

The location feature in Cisco Unified Communications Manager lets you specify the maximum bandwidth available for calls to and from each location, thereby limiting the number of active calls and preventing the WAN from being oversubscribed.

For example, if a WAN link between two PBXs has only enough bandwidth to carry two VoIP calls, admitting a third call impairs the voice quality of all three calls. The queuing mechanisms that provide policing cause this problem: packets that exceed the configured or allowable rate are tail-dropped from the queue. The queuing mechanism cannot distinguish which IP packet belongs to which voice call; any packets that exceed the given arrival rate within a certain period are dropped. As a result, all three calls experience packet loss, and end users perceive clipped speech.

When CAC is implemented, the outgoing voice gateway detects that insufficient network resources are available for a call to proceed. The call is rejected, and the originating gateway must find another means of handling the call. In the absence of any specific configuration, the outgoing gateway provides the calling party with a reorder tone, which might cause the PSTN switch or PBX to announce that “All circuits are busy; please try your call again later.” The outgoing voice gateway can be configured for the following scenarios:

  • The call can be rerouted via an alternative packet network path, if such a path exists.

  • The call can be rerouted via the PSTN network path.

  • The call can be returned to the originating TDM switch with the reject cause code.

Figure 8-35 shows examples of a VoIP network with and without CAC.

Figure 8-35: Call Admission Control Keeps the Quality of Existing Calls

The upper diagram in Figure 8-35 illustrates a VoIP network without CAC. The WAN link between the two PBXs has the bandwidth to carry only two VoIP calls. In this example, admitting the third call impairs the voice quality of all three calls.

The lower diagram in Figure 8-35 illustrates a VoIP network with CAC. If the outgoing gateway detects that insufficient network resources are available to allow a call to proceed, the gateway automatically reroutes the third call to the PSTN, thereby maintaining the voice quality of the two existing calls.
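The admission decision in this example can be sketched as a simple bandwidth counter per location. The class and the bandwidth values below are hypothetical and do not represent how any particular call processing agent is implemented; the point is only that the decision is made at call setup, before any voice packets are sent.

```python
class Location:
    """Tracks voice bandwidth in use against a configured per-location maximum."""

    def __init__(self, max_voice_kbps):
        self.max_voice_kbps = max_voice_kbps
        self.in_use_kbps = 0

    def admit(self, call_kbps):
        """Admit the call only if it fits within the remaining bandwidth."""
        if self.in_use_kbps + call_kbps > self.max_voice_kbps:
            return False                 # caller must reroute (for example, to the PSTN)
        self.in_use_kbps += call_kbps
        return True

    def release(self, call_kbps):
        """Return the bandwidth of a finished call to the pool."""
        self.in_use_kbps = max(0, self.in_use_kbps - call_kbps)


CALL_KBPS = 80                           # assumed per-call allowance
branch = Location(max_voice_kbps=2 * CALL_KBPS)   # room for two calls

for call_number in (1, 2, 3):
    action = "admitted" if branch.admit(CALL_KBPS) else "rerouted to the PSTN"
    print(f"call {call_number}: {action}")
```

The third call is refused rather than allowed to degrade the first two, and it can be admitted again as soon as one of the existing calls ends and `release` is called.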

CAC with RSVP

CAC can also be implemented with RSVP. Cisco Unified Communications Manager Version 5.0 supports the Cisco RSVP Agent, which enables more efficient use of networks. The Cisco RSVP Agent provides an additional method to achieve CAC besides location-based CAC. RSVP can handle more complex topologies than location-based CAC, which supports only hub-and-spoke network topologies.

RSVP is an industry-standard signaling protocol that enables an application to reserve bandwidth dynamically across an IP network. RSVP, which runs over IP, was first introduced by the IETF in RFC 2205, Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification. Using RSVP, applications request a certain amount of bandwidth for a data flow across a network (for example, a voice call) and receive an indication of the outcome of the reservation based on actual resource availability. RSVP defines signaling messages that are exchanged between the source and destination devices for the data flow and that are processed by intermediate routers along the path. The RSVP signaling messages are encapsulated in IP packets that are routed through the network according to the existing routing protocols.

Not all routers on the path are required to support RSVP; the protocol is designed to operate transparently across RSVP-unaware nodes. On each RSVP-enabled router, the RSVP process intercepts the signaling messages and interacts with the QoS manager for the router interfaces involved in the data flow to “reserve” bandwidth resources. If the available resources anywhere along the path are not sufficient for the data flow, the routers send a signal indicating the failure to the application that originated the reservation request.

For example, a branch office router has a primary link with an LLQ provisioned for ten calls and a backup link that can accommodate two calls. RSVP can be configured on both router interfaces so that the RSVP bandwidth matches the LLQ bandwidth. The call processing agent at the branch can be configured to require RSVP reservations for all calls to or from other branches. Calls are admitted or rejected based on the outcome of the RSVP reservations, which automatically follow the path determined by the routing protocol. Under normal conditions (when the primary link is active), up to ten calls will be admitted; during failure of the primary link, only up to two calls will be admitted.
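The branch office example can be modeled with a small sketch in which a reservation succeeds only if every RSVP-enabled hop on the current path has enough bandwidth left, while RSVP-unaware hops are passed through transparently. This is not an implementation of RSVP signaling; the hop dictionaries, the `reserve` function, and the 24-kbps per-call figure are assumptions made for illustration.

```python
CALL_KBPS = 24  # assumed bandwidth reserved for each call


def reserve(path, kbps):
    """Reserve bandwidth on every RSVP-enabled hop, or on none of them."""
    rsvp_hops = [hop for hop in path if hop["rsvp"]]
    if any(hop["available_kbps"] < kbps for hop in rsvp_hops):
        return False                      # one hop lacks resources: reservation fails
    for hop in rsvp_hops:
        hop["available_kbps"] -= kbps
    return True


def admitted_calls(path, attempts):
    """Count how many of the attempted calls obtain a reservation."""
    return sum(1 for _ in range(attempts) if reserve(path, CALL_KBPS))


# Primary path: RSVP bandwidth matches an LLQ sized for ten calls.
primary = [{"rsvp": True, "available_kbps": 10 * CALL_KBPS},
           {"rsvp": False, "available_kbps": 0}]   # RSVP-unaware hop is ignored
# Backup path: RSVP bandwidth matches an LLQ sized for two calls.
backup = [{"rsvp": True, "available_kbps": 2 * CALL_KBPS}]

print(admitted_calls(primary, attempts=12))
print(admitted_calls(backup, attempts=12))
```

Because the reservation is evaluated against whichever path the routing protocol currently selects, the same twelve call attempts yield ten admitted calls over the primary link and only two over the backup link.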

Policies can typically be set within the call processing agent to determine what to do in the case of a CAC failure. For example, the call could be rejected, rerouted across the PSTN, or sent across the IP WAN as a best-effort call with a different DSCP marking.

Building Access Layer QoS Mechanisms for Voice

To provide high-quality voice and to take advantage of the full voice feature set, QoS mechanisms on Building Access layer switches include the following:

  • On 802.1Q trunks, the three 802.1p user priority bits in the Tag field are used as the CoS bits. Layer 2 CoS marking is performed on Layer 2 ports to which IP phones are connected.

  • Multiple egress queues provide priority queuing of RTP voice packet streams.

  • The ability to classify or reclassify traffic and establish a trust boundary. A trust boundary is the point within the network where markings are accepted; any markings made by devices outside the trust boundary can be overwritten at the trust boundary.

    Establishing a trust boundary means that the classification and marking processes can be done once, at the boundary; the rest of the network does not have to repeat the analysis. Ideally, the trust boundary is as close to end devices as possible, or even within the end devices. For example, a Cisco IP phone could be considered a trusted device because it marks voice traffic appropriately. However, a user’s PC would not usually be trusted because users could change markings, which they might be tempted to do in an attempt to increase the priority of their traffic.

  • Layer 3 awareness and the ability to implement QoS ACLs might be required if certain IP telephony endpoints are used, such as a PC running a software-based IP phone application that cannot benefit from an extended trust boundary.

These mechanisms protect voice from packet loss and delay stemming from oversubscription of aggregate links between switches, which might cause egress interface buffers to become full instantaneously. When voice packets are subject to drops, delay, and jitter, the user-perceivable effects include a clicking sound, harsh-sounding voice, extended periods of silence, and echo.

When deploying voice, it is recommended that two VLANs be enabled in the Building Access layer switch: a native VLAN for data traffic and a voice VLAN for voice traffic. Note that a voice VLAN in the Cisco IOS software is called an auxiliary VLAN under the Catalyst operating system. Separate voice and data VLANs are recommended for the following reasons:

  • Configuring RFC 1918 private addressing on phones on the voice (or auxiliary) VLAN conserves addresses and ensures that phones are not accessible directly via public networks. PCs and servers can be addressed with public addresses; however, voice endpoints should be addressed using private addresses.

  • QoS trust boundaries can be selectively extended to voice devices without extending the trust boundaries to PCs and other data devices.

  • VLAN access control and 802.1p tagging provide protection for voice devices from malicious internal and external network attacks such as worms, denial-of-service attacks, and attempts by data devices to gain access to priority queues via packet tagging.

  • Management and QoS configuration are simplified.


Note

It is also recommended that Building Access layer switches provide PoE (inline power) for the IP phones.

AutoQoS

The Cisco AutoQoS feature on routers and switches provides a simple, automatic way to enable QoS configurations in conformance with Cisco’s best-practice recommendations. Only one command is required; the router or switch then creates configuration commands to perform such things as classifying and marking VoIP traffic and then applying an LLQ queuing strategy on WAN links for that traffic. The configuration created by AutoQoS becomes part of the normal configuration file and therefore can be edited if required. The first phase of AutoQoS, available on routers in various versions of Cisco IOS Release 12.3, creates only configurations related to VoIP traffic.


Note

The Cisco Feature Navigator tool, available at http://www.cisco.com/go/fn, allows you to quickly find the Cisco IOS Software release (for routers) or Catalyst Operating System Software release (for switches) required for the features that you want to run on your network. For example, you can use this tool to determine the Cisco IOS release required to run AutoQoS on the routers in your network.

The second phase of AutoQoS is called AutoQoS Enterprise and includes support for all types of data. It configures the router with commands to classify, mark, and handle packets in up to 10 of the 11 QoS Baseline traffic classes (as described in Chapter 5). The Mission-Critical traffic class is the only one not defined, because it is specific to each organization. As with the earlier release, the commands created by AutoQoS Enterprise can be edited if required.


Note

Further information on AutoQoS can be found at http://www.cisco.com/en/US/tech/tk543/tk759/tk879/tsd_technology_support_protocol_home.html.


