Troubleshooting QoS Choppy Voice Issues
For Packet Voice to be a realistic replacement for standard public switched telephone network (PSTN) telephony services, the received quality of Packet Voice must be comparable to that of basic telephone services. This means consistently high-quality voice transmissions. Like other real-time applications, Packet Voice requires consistent bandwidth and is delay sensitive. For voice transmissions to be intelligible (not choppy) to the receiver, voice packets cannot be dropped, excessively delayed, or subjected to varying delay (otherwise known as jitter). This document describes various Quality of Service (QoS) considerations that help troubleshoot choppy voice issues. The main causes of choppy voice are lost and delayed voice packets.
Readers of this document should have knowledge of these topics:
Basic configuration of Packet Voice (VoIP, Voice over Frame Relay (VoFR), or Voice over ATM (VoATM), as your network requires).
Basic understanding of voice prioritization, fragmentation, different codecs and their bandwidth requirements.
The information in this document applies to all Cisco voice gateway software and hardware versions.
The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, ensure that you understand the potential impact of any command before using it.
For more information on document conventions, refer to the Cisco Technical Tips Conventions.
Choppy voice quality is caused by voice packets being either variably delayed or lost in the network. When a voice packet is delayed in reaching its destination, the destination gateway has a loss of real-time information. In this event, the destination gateway must predict the likely content of the missed packet. Because of this prediction, the received voice does not have the same characteristics as the transmitted voice, and it sounds robotic. If a voice packet is delayed beyond the prediction capability of the receiving gateway, the gateway leaves the real-time gap empty. With nothing to fill that gap at the receiving end, part of the transmitted speech is lost. This results in choppy voice. Many choppy voice issues are resolved by making sure that voice packets are not excessively delayed and, more importantly, not variably delayed. Sometimes, voice activity detection (VAD) adds front-end clipping to a voice conversation. This is another cause of choppy (or clipped) voice.
The various sections in this document show how to minimize instances of choppy voice. Most of these measures work by keeping the jitter introduced in your voice network to a minimum.
Before you consider applying any measures for minimizing jitter, provision sufficient network bandwidth to support real-time voice traffic. For example, an 80 kbps G.711 VoIP call (64 kbps payload + 16 kbps header) sounds poor over a 64 kbps link because at least 16 kbps of the traffic (20 percent of the packets) is dropped. The bandwidth requirements vary based on the codec used for compression. Different codecs have different payload and header requirements. The use of VAD also affects the bandwidth requirement. If Real-time Transport Protocol (RTP) header compression (cRTP) is used, it can further lower the bandwidth requirement.
For example, the bandwidth required for a voice call that uses the G.729 codec (default 20-byte payload) with cRTP is calculated as follows:
Voice payload + compressed (RTP header + User Datagram Protocol (UDP) header + IP header) + Layer 2 header
This is equivalent to:
20 bytes + compressed (12 bytes + 8 bytes + 20 bytes) + 4 bytes
This totals 28 bytes, since header compression reduces the combined IP, UDP, and RTP headers to a maximum of 4 bytes. At 50 packets per second (an 8 kbps codec rate), this yields 28 bytes x 8 bits x 50 = 11,200 bps, or 11.2 kbps.
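The arithmetic in this section can be checked with a short script. This is only a minimal sketch: the function names `call_bandwidth_kbps` and `excess_fraction` are hypothetical helpers introduced for illustration, and the G.711 figures assume a 160-byte payload at 50 packets per second with Layer 2 overhead ignored, which matches the 80 kbps example.

```python
def call_bandwidth_kbps(payload_bytes, header_bytes, l2_bytes, packets_per_second=50):
    """Bandwidth in kbps for one direction of a single voice call."""
    packet_bytes = payload_bytes + header_bytes + l2_bytes
    return packet_bytes * 8 * packets_per_second / 1000


def excess_fraction(offered_kbps, link_kbps):
    """Fraction of the offered traffic that does not fit on the link."""
    return max(0.0, (offered_kbps - link_kbps) / offered_kbps)


# G.729 with cRTP: 20-byte payload, 40-byte IP/UDP/RTP header compressed
# to 4 bytes, plus a 4-byte Layer 2 header.
g729_crtp = call_bandwidth_kbps(payload_bytes=20, header_bytes=4, l2_bytes=4)

# G.711 without cRTP: 160-byte payload (assumed), 40-byte IP/UDP/RTP header,
# Layer 2 overhead ignored to match the 80 kbps figure in the text.
g711 = call_bandwidth_kbps(payload_bytes=160, header_bytes=40, l2_bytes=0)

print(g729_crtp)                  # 11.2
print(g711)                       # 80.0
print(excess_fraction(g711, 64))  # 0.2, the 20 percent that a 64 kbps link drops
```

Changing `l2_bytes` or `packets_per_second` shows how the Layer 2 encapsulation and the packetization interval affect the per-call requirement.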
There are two important components in prioritizing voice. The first is classifying and marking interesting voice traffic. The second is prioritizing the marked interesting voice traffic. The two subsections here discuss various approaches to classifying, marking, and prioritizing voice.
Classification and Marking
In order to guarantee bandwidth for VoIP packets, a network device must be able to identify the packets in all the IP traffic that flows through it. Network devices use the source and destination IP address in the IP header, or the source and destination UDP port numbers in the UDP header, to identify VoIP packets. This identification and grouping process is called classification. It is the basis for providing any QoS.
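As a rough illustration of classification, the sketch below labels a packet as voice when its addresses or UDP ports match configured criteria. It is not how a router implements classification. The `Packet` type, the `classify` function, and the subnet 10.1.1.0/24 are hypothetical, and the UDP port range 16384 to 32767 is an assumption based on the range commonly used for RTP on Cisco voice gateways.

```python
import ipaddress
from dataclasses import dataclass

# Assumed values for illustration only; adjust them to your own network.
VOICE_UDP_PORTS = range(16384, 32768)                   # assumed RTP port range
VOICE_SUBNETS = [ipaddress.ip_network("10.1.1.0/24")]   # hypothetical voice subnet


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    protocol: str   # "udp", "tcp", ...
    src_port: int
    dst_port: int


def in_voice_subnet(address):
    """True if the address belongs to one of the configured voice subnets."""
    ip = ipaddress.ip_address(address)
    return any(ip in subnet for subnet in VOICE_SUBNETS)


def classify(packet):
    """Return "voice" if the packet matches by address or by UDP port, else "default"."""
    address_match = in_voice_subnet(packet.src_ip) or in_voice_subnet(packet.dst_ip)
    port_match = packet.protocol == "udp" and (
        packet.src_port in VOICE_UDP_PORTS or packet.dst_port in VOICE_UDP_PORTS
    )
    return "voice" if address_match or port_match else "default"


rtp = Packet("192.168.5.10", "192.168.9.20", "udp", 16390, 17002)
web = Packet("192.168.5.10", "192.168.9.20", "tcp", 51000, 80)
print(classify(rtp))  # voice
print(classify(web))  # default
```

Matching on addresses alone also catches non-voice traffic from the same hosts, so matching on the UDP port range is usually the more specific criterion.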
Packet classification can be processor intensive. Therefore, classification needs to be done as far out towards the edge of the network as possible. Because every hop still needs to make a determination on the treatment a packet should receive, you need to have a simpler, more efficient classification method in the network core. This simpler classification is achieved through marking or setting the Type of Service (ToS) byte in the IP header. The three most significant bits of the ToS byte are called IP Precedence bits. Most applications and vendors currently support setting and recognizing these three bits. Marking is evolving so that the six most significant bits of the ToS byte, called the Differentiated Services Code Point (DSCP), can be used. Refer to the Request for Comments (RFC).
Differentiated Services (DiffServ) is a model in which traffic is treated by intermediate systems with relative priorities based on the ToS field. Defined in RFC 2474 and RFC 2475, the DiffServ standard supersedes the original specification for defining packet priority described in RFC 791. DiffServ increases the number of definable priority levels by reallocating bits of an IP packet for priority marking. The DiffServ architecture defines the DiffServ field. It supersedes the ToS byte in IPv4 to make Per-Hop Behavior (PHB) decisions about packet classification and traffic conditioning functions such as metering, marking, shaping, and policing. In addition to the previously mentioned RFCs, RFC 2597 defines the Assured Forwarding (AF) classes. For more information on DSCP, refer to Implementing Quality of Service Policies with DSCP. This is a breakdown of the ToS byte and the DiffServ field:
ToS Byte – P2 P1 P0 T3 T2 T1 T0 CU
IP precedence: three bits (P2-P0), ToS: four bits (T3-T0), CU: one bit
DiffServ Field – DS5 DS4 DS3 DS2 DS1 DS0 ECN ECN
DSCP: six bits (DS5-DS0), ECN: two bits
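The two layouts describe the same eight bits, so one byte value can be read either way. The sketch below decodes a byte under both interpretations. The function name `decode_tos_byte` is hypothetical, and the sample value 0xB8 is chosen only for illustration.

```python
def decode_tos_byte(tos):
    """Split one ToS/DiffServ byte into its fields under both interpretations."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS byte must be in the range 0-255")
    return {
        # Original ToS byte layout: P2 P1 P0 T3 T2 T1 T0 CU
        "precedence": tos >> 5,          # three most significant bits
        "tos_bits": (tos >> 1) & 0x0F,   # four ToS bits
        "cu": tos & 0x01,                # one currently unused bit
        # DiffServ field layout: DS5..DS0 ECN ECN
        "dscp": tos >> 2,                # six most significant bits
        "ecn": tos & 0x03,               # two ECN bits
    }


fields = decode_tos_byte(0xB8)  # binary 10111000
print(fields["precedence"])  # 5
print(fields["dscp"])        # 46
print(fields["ecn"])         # 0
```

Because the three precedence bits are the top three DSCP bits, a device that only understands IP Precedence still reads a DSCP value of 46 as precedence 5.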
XXX00000 Bits 0, 1, 2 (DS5, DS4, DS3) are Precedence bits, where:
111 = Network Control = Precedence 7
110 = Internetwork Control = Precedence 6
101 = CRITIC/ECP = Precedence 5