This part describes the modulation methods used for conventional modems. It doesn't cover the high speed PCM methods (modulus conversion) sometimes used by 56k modems (V.90, V.92). But 56k modems also use the modulation methods described here.
Modulation is the conversion of a digital signal represented by binary digits (0 or 1) into an analog signal, something like a sine wave. The modulated signal consists of a pure sine wave "carrier" signal which is modified to convey information. A pure carrier sine wave, unchanging in frequency and voltage, provides no flow of information at all (except that a carrier is present). To make it convey information we modify (or modulate) this carrier. There are 3 basic types of modulation: frequency, amplitude, and phase. They will be explained next.
The simplest modulation method is frequency modulation. Frequency is measured in cycles per second (of a sine wave). It's the count of the number of times the sine wave shape repeats itself in a second. This is the same as the number of times it reaches its peak value during a second. The word "Hertz" (abbreviated Hz) is used to mean "cycles per second".
A simple example of frequency modulation is where one frequency means a binary 0 and another means a 1. For example, for some obsolete 300 baud modems 1070 Hz meant a binary 0 while 1270 Hz meant a binary 1. This was called "frequency shift keying". Instead of just two possible frequencies, more could be used to allow more information to be transmitted. If we had 4 different frequencies (call them A, B, C, and D) then each frequency could stand for a pair of bits. For example, to send 00 one would use frequency A. To send 01, use frequency B; for 10 use C; for 11 use D. In like manner, by using 8 different frequencies we could send 3 bits with each shift in frequency. Each time we double the number of possible frequencies we increase the number of bits it can represent by 1.
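The relation between the number of frequencies and the number of bits per shift can be sketched in a few lines of Python. This is only an illustration: the tone names A to D come from the example above, while the function names and the bit-to-tone table are made up for this sketch and don't describe any real modem.

```python
def bits_per_symbol(num_levels):
    """Whole bits carried by one symbol when there are num_levels choices."""
    if num_levels < 2 or num_levels & (num_levels - 1):
        raise ValueError("num_levels must be a power of two, at least 2")
    return num_levels.bit_length() - 1

# Hypothetical 4-tone scheme: each pair of bits selects one tone.
TONES = {"00": "A", "01": "B", "10": "C", "11": "D"}

def fsk_encode(bits):
    """Turn a string of 0s and 1s into the sequence of tones to send."""
    if len(bits) % 2:
        raise ValueError("need an even number of bits for 2 bits per tone")
    return [TONES[bits[i:i + 2]] for i in range(0, len(bits), 2)]

print(bits_per_symbol(2), bits_per_symbol(4), bits_per_symbol(8))
print(fsk_encode("00011011"))
```

Doubling the number of tones from 4 to 8 raises the result of bits_per_symbol() from 2 to 3, which is the "one more bit per doubling" rule stated above.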
Once one understands the frequency modulation example above, including the possibility of representing a few bits by a single shift in frequency, it's easier to understand both amplitude modulation and phase modulation. For amplitude modulation, one just changes the height (voltage) of the sine wave, analogous to changing the frequency of the sine wave. In a simple case there could be only 2 allowed amplitude levels, one representing a 0-bit and another representing a 1-bit. As explained for the case of frequency modulation, having more possible amplitudes will result in more information being transmitted per change in amplitude.
To change the phase of a sine wave at a certain instant of time, we stop sending the old sine wave and immediately begin sending a new sine wave of the same frequency and amplitude. If we started sending the new sine wave at the same voltage level (and slope) as existed when we stopped sending the old sine wave, there would be no change in phase (and no detectable change at all). But suppose that we started up the new sine wave at a different point on the sine wave curve. Then there would likely be a sudden voltage jump at the point in time where the old sine wave stopped and the new sine wave began. This is a phase shift and it's measured in degrees (deg.). A 0 deg. (or a 360 deg.) phase shift means no change at all while a 180 deg. phase shift just reverses the voltage (and slope) of the sine wave. Put another way, a 180 deg. phase shift just skips over a half-period (180 deg.) at the point of transition. Of course we could just skip over, say, 90 deg. or 135 deg. etc. As in the example for frequency modulation, the more possible phase shifts, the more bits a single shift in phase can represent.
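The effect of a phase shift on the voltage can be checked numerically. The following is a minimal sketch; the 1800 Hz carrier frequency and the sample time are arbitrary values picked for the illustration.

```python
import math

def carrier(t, freq_hz=1800.0, phase_deg=0.0, amplitude=1.0):
    """Voltage of a sine-wave carrier at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t + math.radians(phase_deg))

t = 0.0001                                # an arbitrary instant
before = carrier(t)                       # old sine wave
after_180 = carrier(t, phase_deg=180.0)   # new wave, shifted 180 deg.
after_360 = carrier(t, phase_deg=360.0)   # new wave, shifted 360 deg.

print(before, after_180, after_360)
```

At any instant the 180 deg. value is the negative of the unshifted one (the voltage is reversed), while the 360 deg. value equals the unshifted one, up to floating point rounding.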
Instead of just selecting either frequency, amplitude, or phase modulation, we may choose to combine modulation methods. Suppose that we have 256 possible frequencies and thus can send a byte (8 bits) for each shift in frequency (since 2 to the 8th power is 256). Suppose also that we have another 256 different amplitudes so that each shift in amplitude represents a byte. Also suppose there are 256 possible phase shifts. Then at certain points in time we may make a shift in all 3 things: frequency, amplitude and phase. This would send out 3 bytes for each such transition.
No modulation method in use today actually does this. It's not practical due to the relatively long time it would take to detect all 3 types of changes. The main problem is that frequent shifts in phase can make it appear that a shift in frequency has happened when it actually didn't.
To avoid this difficulty one may simultaneously change only the phase and amplitude (with no change in frequency). This is called phase-amplitude modulation. It is also called quadrature amplitude modulation (= QAM). The "quadrature" in the name refers to the two carrier components, 90 deg. apart in phase, whose amplitudes are combined to produce the signal. This method is used today for the common modem speeds of 14.4k, 28.8k, and 33.6k. The only significant case where this modulation method is not used today is for 56k modems. But even 56k modems (V.90) use only QAM (phase-amplitude modulation) in the direction from your PC out the telephone line. Sometimes even the other direction will fall back to QAM when line conditions are not good enough. Thus QAM (phase-amplitude modulation) still remains the most widely used method on ordinary telephone lines.
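To make the idea of phase-amplitude modulation concrete, here is a small sketch that treats each allowed combination of amplitude and phase as a point in the complex plane. The choice of 4 phases and 2 amplitudes is an assumption made for the illustration; real modem standards define their own, much larger, sets of points.

```python
import cmath
import math

def constellation(num_phases, amplitudes):
    """All allowed (amplitude, phase) combinations as complex numbers."""
    points = []
    for amp in amplitudes:
        for k in range(num_phases):
            points.append(cmath.rect(amp, 2 * math.pi * k / num_phases))
    return points

def nearest_symbol(received, points):
    """Index of the allowed point closest to a (noisy) received point."""
    return min(range(len(points)), key=lambda i: abs(received - points[i]))

# Hypothetical scheme: 4 phases x 2 amplitudes = 8 points = 3 bits per symbol.
points = constellation(4, (1.0, 2.0))
bits = int(math.log2(len(points)))
noisy = points[5] + 0.05          # point number 5 plus a little noise
print(len(points), bits, nearest_symbol(noisy, points))
```

The receiver's job is the nearest_symbol() step: it measures the amplitude and phase it actually got and picks the closest allowed combination. The more points there are, the closer together they sit and the less noise the line can tolerate.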
The "modulation" method used for speeds above 33.6k is entirely different than the common phase-amplitude modulation used at 33.6k and below. Since ordinary telephone calls are converted to digital signals at the local offices of the telephone company, the fastest speed that you can send digital data by an ordinary telephone call is the same speed that the telephone company uses over its digital portion of its network (for a phone call). What is this speed? Well, it's close to 64kbps. It's sometimes 64k and sometimes less if bits are "stolen" for signalling purposes. If the phone Co. knows that the link is not for voice, bits may not get stolen. The case of 64k will be presented and then it will be explained why the actual speed is lower (56k or less --often significantly less).
Thus 64k is the absolute top speed possible (not counting data compression) for an ordinary telephone call using the digital portion of the circuit that was designed to send digital encodings of the human voice. In order to use 64k, the modems need to either have direct access to the digital portion of the circuit or be able to determine the exact digital signal that generated a received analog signal (and conversely). This task is far too error prone if both sides of a telephone call have only an analog interface to the telephone company. But if one side has a digital interface, then it's possible (in one direction for V.90 and in both directions for V.92). Thus if your ISP has a digital interface to the phone company, the ISP may send out a certain digital signal over the phone lines toward your PC. The digital signal from the ISP gets converted to analog at the local telephone office near your PC's location (perhaps near your home). Then it's your modem's task to try to figure out exactly what that digital signal was. If it could do this, then transmission at 64k (the speed of the telephone company's digital signal) is possible in this direction.
What method does the telephone company use to digitally encode analog signals? It uses a method of sampling the amplitude of the analog signal at a rate of 8000 samples per second. Each sample amplitude is encoded as an 8-bit byte. (Note: 8 x 8000 = 64k.) This is called "Pulse Code Modulation" = PCM. These bytes are then sent digitally on the telephone company's digital circuits where many calls share a single circuit using a time-sharing scheme known as "time division multiplexing". Then finally at a local telephone office near your home, the digital signal is de-multiplexed, resulting in the same digital signal as was originally created by PCM. Then this signal is converted back to analog and sent to your home. This analog to digital conversion (and conversely) is done by telephone company hardware called a "codec" (coder/decoder). Each PCM 8-bit byte creates a certain amplitude of the analog signal. Your modem's task is to determine just what that PCM 8-bit byte was, based on the analog amplitude it detects.
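The 64k figure and the idea of time division multiplexing can be sketched as follows. This is a simplified illustration only: it interleaves one byte per call in turn and leaves out the framing and signalling bits that real telephone circuits add. The function names are made up for this sketch.

```python
SAMPLES_PER_SECOND = 8000
BITS_PER_SAMPLE = 8
PCM_BITS_PER_SECOND = SAMPLES_PER_SECOND * BITS_PER_SAMPLE   # 64000

def multiplex(calls):
    """Interleave the PCM bytes of several calls onto one shared stream."""
    return bytes(b for frame in zip(*calls) for b in frame)

def demultiplex(stream, num_calls):
    """Recover each call's own PCM bytes from the shared stream."""
    return [stream[i::num_calls] for i in range(num_calls)]

# Two hypothetical calls, three PCM samples (bytes) each.
calls = [b"\x01\x02\x03", b"\x11\x12\x13"]
shared = multiplex(calls)
print(PCM_BITS_PER_SECOND, shared.hex(), demultiplex(shared, 2) == calls)
```

After de-multiplexing, each call gets back exactly the bytes that were created for it by PCM, which is why the modem can hope to work out what those bytes were.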
This was originally called "modulus conversion". It's now often called "PCM"-something since it's just like encoding/decoding PCM but with the added problem of sampling at the precise time that the codec generated the analog voltage from the digital PCM code.
In order to determine the digital codes the telephone Co. used to create the analog signal, the modem must sample this analog signal amplitude at exactly the same points in time the phone Co. did when it created the analog signal. To do this an 8kHz clock timing signal is generated with help from a residual 4kHz signal on the analog phone line. The creation of amplitudes to go out to your home/office at 8k amplitudes/sec sort of creates a 4kHz signal. Suppose every other amplitude was of opposite polarity. Then there would be a 4kHz sine-like wave created. Each amplitude is in a sense an 8-bit symbol, and deciding when to sample the amplitudes is known as "symbol timing". The modem's task is to ensure that its 8kHz clock runs at precisely twice the speed of the 4kHz signal (which could drift slightly off 4kHz) and that the modem's clock is synchronized with that used by the telephone company's codec. The actual electronics may use much higher frequency clocks (dividing them down) and take more than a single sample. If you know how this synchronization works, let me know (if this is a recent Modem-HOWTO).
Now the encoding of amplitudes in PCM is not linear. At low amplitudes an increment of 1 in the PCM byte value represents a much smaller increment (delta) in analog signal amplitude than would be the case if the amplitude being sampled were much higher. Thus for low amplitudes it's difficult to distinguish between adjacent byte values. To make it easier to do this (for 56k modems) certain PCM codes representing very low amplitudes are not used. This gives a larger delta between possible amplitudes and makes correct detection of them by your modem easier. Thus half of the amplitude levels are not used (in the downstream direction) by V.90 or V.92. This is tantamount to each symbol (valid amplitude level) representing 7 bits instead of 8. This is where 56k comes from: 7 bits/symbol x 8k symbols/sec = 56k bps. Of course each amplitude symbol is actually generated by 8 bits but only 128 of the possible 256 byte values are actually used by the ISP sender. There is a code table mapping these 128 8-bit bytes to the 128 possible 7-bit values. It's not just a simple mapping like ignoring the last bit. Thus sending 7 normal data bytes (8 bits each) takes 8 of the above-mentioned bytes.
But it's a little more complicated than this. If the line conditions are not nearly perfect or if the direction is upstream (V.92 only), then even fewer possible levels (symbols) are used, resulting in speeds under 56k. Also due to US government rules prohibiting high power levels on phone lines, certain high amplitude levels can't be used, resulting in only about 53.3k at best for "56k" modems in the downstream direction.
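The arithmetic of 7 bits per symbol can be sketched like this: 7 data bytes are 56 bits, which fill exactly 8 symbols of 7 bits each. The code below only does this regrouping of bits. It does not reproduce the code table that maps each 7-bit value to one of the 128 PCM bytes actually used, and the function names are made up for the illustration.

```python
SYMBOLS_PER_SECOND = 8000
BITS_PER_SYMBOL = 7
BITS_PER_SECOND = SYMBOLS_PER_SECOND * BITS_PER_SYMBOL   # 56000

def bytes_to_symbols(data):
    """Regroup 8-bit bytes into 7-bit values (0..127), first bits first."""
    if len(data) % 7:
        raise ValueError("length must be a multiple of 7 bytes")
    value = int.from_bytes(data, "big")
    count = len(data) * 8 // 7
    return [(value >> (7 * (count - 1 - i))) & 0x7F for i in range(count)]

def symbols_to_bytes(symbols):
    """Inverse of bytes_to_symbols(): rebuild the original 8-bit bytes."""
    if len(symbols) % 8:
        raise ValueError("need a multiple of 8 symbols")
    value = 0
    for s in symbols:
        value = (value << 7) | s
    return value.to_bytes(len(symbols) * 7 // 8, "big")

data = bytes(range(1, 8))            # 7 ordinary data bytes
symbols = bytes_to_symbols(data)     # 8 values of 7 bits each
print(BITS_PER_SECOND, len(symbols), symbols_to_bytes(symbols) == data)
```

In a real modem each of these 7-bit values would then be looked up in the code table to find the 8-bit PCM byte to send.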
Note that the digital part of the telephone network is bi-directional. Two such circuits are used for a phone call, one in each direction. For V.90, the 56k signal is only used in one of these directions: from your ISP to your PC (called the "downstream" direction). For V.90, the other direction (upstream, from your home/office to the ISP) uses the conventional phase-amplitude modulation scheme with a maximum of 33.6kbps (and not 53.3kbps). For V.92, this upstream direction also uses the PCM method and supports up to 48 kbps. The analog portion of the circuit from your home/office to the nearest telephone Co. office was never intended to be bi-directional since it's only a single twisted pair. But due to sophisticated cancellation methods it's able to convey data simultaneously in both directions as explained in the next subsection. It's claimed that with V.92, it's almost impossible to get maximum throughput in both directions simultaneously due to the difficulties of bi-directional flow on a single circuit.
Modern modems are able to both send and receive signals simultaneously. One could call this "bidirectional" or "full duplex". This was once done by using one frequency for sending and another for receiving. Today, the same frequency is used for both sending and receiving. How this works is not easy to comprehend.
Most of the telephone system "main lines" are digital with two channels in use when you make a telephone call. What you say goes over one digital channel and what the other person says goes over the other (reverse) digital channel. Unfortunately, the part of the telephone system which goes to homes (and many offices) is not digital but only a single analog channel. If both modems were directly connected to the digital part of the phone system then bidirectional communication (sending and receiving at the same time) would be no problem because two channels would be available.
But the end portion of the signal path goes over just one circuit. How can there be two-way communication on it simultaneously? It works something like this. Suppose your modem is receiving a signal from the other modem and is not transmitting. Then there's no problem. But if your modem were to start transmitting (with the other received signal still flowing into your modem) it would drown out the received signal. If the transmitted signal was a "solid" voltage wave applied to the end of the line then there is no way any received signal could be present at that point.
But the transmitter has "internal impedance" and the transmitted signal applied to the end of the line is not solid (or strong enough) to completely eliminate the received signal coming from the other end. Thus while the voltage at the end of the line is mostly the stronger transmitted signal a small part of it is the desired received signal. All that is needed is to filter out this stronger transmitted signal and then what remains will be the signal from the other end which we want. To do this, one only needs to get the pure transmitted signal directly from the transmitter (before it's applied to the line) amplify it a determined amount, and then subtract it from the total signal present at the end of the line. Doing this in the receiver circuits leaves a signal which mostly came from the other end of the line.
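The subtraction just described can be sketched with a few made-up numbers. The gain of 0.8 and the sample values are assumptions chosen for the illustration, and the line is treated as ideal (no echoes or noise).

```python
def cancel_own_signal(line_voltage, transmitted, gain):
    """Subtract the scaled copy of our own transmitted signal."""
    return [v - gain * tx for v, tx in zip(line_voltage, transmitted)]

GAIN = 0.8                                    # hypothetical share of our signal on the line
transmitted = [1.0, -1.0, 1.0, 1.0, -1.0]     # strong local signal
far_end = [0.05, -0.02, 0.04, -0.05, 0.01]    # weak signal from the other modem

# What is actually present at our end of the line: mostly our own signal.
line_voltage = [GAIN * tx + rx for tx, rx in zip(transmitted, far_end)]

recovered = cancel_own_signal(line_voltage, transmitted, GAIN)
print(recovered)
```

What remains after the subtraction is the weak far-end signal, even though it was only a small part of the voltage on the line.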
An analog signal traveling down a line in one direction may encounter changes in the line that will cause part of the signal to echo back in the opposite direction. Since the same circuit is used for bi-directional flow of data, such echoes will result in garbled reception. One way to ameliorate this problem is to send training signals once in a while to determine the echo characteristic of the line. This makes it possible to predict the echoes that any given transmitted signal will cause. The predicted echo signal is then subtracted from the received signal. This cancels out the echoes.
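Here is a rough sketch of the train, predict, subtract idea. It assumes a very simple model in which the echo is a fixed mix of delayed, weakened copies of what was sent, and that training is done by sending a single pulse while the far end is silent. Real modems use more elaborate adaptive methods; the echo values and function names below are invented for the illustration.

```python
def convolve(signal, response):
    """Output of a line with the given echo response, same length as signal."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        for k, h in enumerate(response):
            if n - k >= 0:
                out[n] += h * signal[n - k]
    return out

TRUE_ECHO = [0.0, 0.0, 0.3, -0.1]   # hypothetical echo path, unknown to the modem

def echo_of(tx):
    """Simulated line: the echo that comes back for a transmitted signal."""
    return convolve(tx, TRUE_ECHO)

def train(length=8):
    """Send one pulse (far end silent) and record the echo that returns."""
    pulse = [1.0] + [0.0] * (length - 1)
    return echo_of(pulse)

def cancel(received, tx, estimate):
    """Subtract the echo predicted from what we transmitted."""
    predicted = convolve(tx, estimate)
    return [r - p for r, p in zip(received, predicted)]

tx = [1.0, -1.0, 0.5, 0.25, -0.75, 1.0]
far = [0.1, 0.2, -0.1, 0.0, 0.3, -0.2]
received = [f + e for f, e in zip(far, echo_of(tx))]
cleaned = cancel(received, tx, train())
print(cleaned)
```

With a perfect estimate the cleaned signal equals the far-end signal. On a real line the estimate is never perfect, which is one reason training has to be repeated once in a while.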