Monday, July 04, 2011

Digital Video Broadcasting... how it works...

I'm reading up on Digital Video Broadcasting standards. The DVB standard covers the delivery of digital video to your home. There are a couple of subtypes in this main category, which distinguish themselves by their error-correcting facilities (tailored to what's needed on the medium they're transmitted over), bandwidth, etc. The DVB-S standard is the one I'm most interested in. Digital Video Broadcasting is usually done with MPEG-2 transport streams. Suppose you have a video on your computer and a bit of music, along with some information on what else is available on your channel. The transport stream is composed by multiplexing all that information together (rather fast), creating one larger bitstream that can be transported using either DVB-T (terrestrial), DVB-S (satellite) or DVB-C (cable).
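As a rough sketch of that multiplexing step (the PIDs and the round-robin scheduling here are my own simplification, not taken from the spec; real muxing is driven by timing and bitrate budgets):

```python
# Toy multiplexer: interleave chunks from several elementary streams
# into one transport stream, tagging each chunk with its stream's PID.
from itertools import zip_longest

def mux(streams):
    """streams: dict of pid -> list of payload chunks."""
    ts = []
    for chunks in zip_longest(*streams.values()):
        for pid, chunk in zip(streams.keys(), chunks):
            if chunk is not None:
                ts.append((pid, chunk))  # one "packet" per chunk
    return ts

streams = {0x100: ["v0", "v1", "v2"],   # video
           0x101: ["a0", "a1"],         # audio
           0x102: ["epg"]}              # programme information
transport_stream = mux(streams)
```

The receiver side then filters on the PID it cares about to pull one programme back out of the combined stream.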

I had thought that analog TV would be more resilient to noise created in the atmosphere, but this is not necessarily the case. If you send a file over the ether composed of 0's and 1's, then any noise or interference in the stream may cause a bit to be misread, misinterpreted or missed entirely. Since playback of a file usually depends on all the bits being read correctly, this is where you can already run into huge problems: one flipped bit, or just a few, can render the entire stream unusable.
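To see how fragile that can be, flip a single bit in a zlib-compressed buffer (zlib here is just a handy stand-in for any compressed format; the point is that one bit of damage corrupts or kills the whole payload):

```python
import zlib

original = b"the quick brown fox jumps over the lazy dog " * 50
compressed = bytearray(zlib.compress(original))

# Flip one bit somewhere in the middle of the compressed stream.
compressed[len(compressed) // 2] ^= 0x01

try:
    recovered = zlib.decompress(bytes(compressed))
    damaged = recovered != original   # silently wrong output
except zlib.error:
    damaged = True                    # or outright decode failure
print("one bit flip broke the stream:", damaged)
```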

Unless.... you add error correction. But that increases the size of the entire stream... How then...? Well, the MPEG-TS doesn't carry nearly as much information as an analog video stream, because analog video isn't compressed at all. Compression works because you can remove some information without significantly reducing the quality of the experience (mp3 does exactly this for audio). For video, this means you can easily throw away a bit of the color information, because luminance (that which you'd see as black and white) is far more important for a person's perception of an image than chrominance.
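The savings from throwing away colour resolution are easy to quantify. Assuming 8 bits per sample, the 4:2:0 chroma subsampling that MPEG-2 typically uses halves the raw pixel data before any real compression even starts:

```python
# Bits per pixel: full-resolution colour vs 4:2:0 chroma subsampling.
bits = 8
full = 3 * bits                    # Y, Cb, Cr at every pixel: 24 bits
# 4:2:0: one Cb and one Cr sample are shared by a 2x2 block of pixels,
# so each pixel carries only a quarter of each chroma sample.
subsampled = bits + 2 * bits / 4   # 8 + 4 = 12 bits per pixel
print(full, subsampled)            # 24 vs 12: half the raw data
```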

Back to MPEG-2 however... digital compression standards rely on encoding the things that don't change only once, whereas 'motion' in the video typically requires you to encode a bit more about each spatial change. So a green screen that doesn't change a pixel is very easy and extremely cheap to transmit, whereas a fast-paced action movie may temporarily drop in quality a bit, because all the parts on screen are in motion all the time.
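A toy illustration of that idea, encoding only the pixels that changed between two frames (real MPEG-2 does motion-compensated prediction on 16x16 macroblocks, which is far more involved, but the payoff is the same: static content is nearly free):

```python
# Encode the current frame as a set of differences against the previous one.
def diff_frame(prev, cur):
    return {i: v for i, (p, v) in enumerate(zip(prev, cur)) if p != v}

def apply_diff(prev, diff):
    return [diff.get(i, p) for i, p in enumerate(prev)]

frame1 = [0] * 10                        # a "green screen"
frame2 = [0] * 10                        # identical frame -> empty diff
print(diff_frame(frame1, frame2))        # {} -- nothing to transmit

action = [9, 0, 0, 5, 0, 0, 0, 0, 0, 0]  # motion -> more to encode
print(diff_frame(frame1, action))
```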

Let's assume that we have some digitally encoded video+audio and that it is ready for transmission. For transmission over DVB-S, with all the error correction abilities we need at the receiver side, the huge file is packetized into chunks of 187 bytes, and a sync byte is attached to the start of each "packet", for 188 bytes in total. The interesting thing here is that this file may be rather regular in terms of how one byte and its neighbor relate to one another. Long runs of identical bytes can actually cause more reception problems at the rx side than channel noise does, because the receiver depends on variation in the signal (for clock recovery, among other things).
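The packetization itself is simple. A sketch, assuming the standard 0x47 sync byte and 188-byte packets (zero-padding the final partial packet is my own simplification, not the real mechanism):

```python
SYNC = 0x47
PAYLOAD = 187  # 188-byte packet minus the sync byte

def packetize(data: bytes):
    packets = []
    for i in range(0, len(data), PAYLOAD):
        chunk = data[i:i + PAYLOAD]
        chunk = chunk.ljust(PAYLOAD, b"\x00")  # pad the final packet
        packets.append(bytes([SYNC]) + chunk)
    return packets

pkts = packetize(b"\xAB" * 1000)
print(len(pkts), len(pkts[0]))  # 6 packets of 188 bytes each
```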

For this reason, each byte in the packet, excluding the sync byte, is XOR-ed with the output of a pseudo-random number generator (a simple one, that is). This means that some bits now turn on and others turn off, and this machine has a certain period over which it operates. The PRNG is reset after every 8 transport packets.
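A sketch of such an additive scrambler. DVB-S specifies the generator polynomial 1 + x^14 + x^15 with the initialisation sequence 100101010000000; the bit-per-byte packing below glosses over some framing details of the real spec:

```python
def prbs_bits(n):
    # 15-bit LFSR with taps at positions 14 and 15 (1 + x^14 + x^15).
    reg = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    out = []
    for _ in range(n):
        bit = reg[13] ^ reg[14]
        out.append(bit)
        reg = [bit] + reg[:-1]
    return out

def scramble(data: bytes) -> bytes:
    keystream = prbs_bits(8 * len(data))
    result = []
    for i, byte in enumerate(data):
        k = 0
        for b in keystream[8 * i:8 * i + 8]:
            k = (k << 1) | b
        result.append(byte ^ k)   # XOR: applying it twice descrambles
    return bytes(result)

data = b"\x00" * 16               # a long run of identical bytes...
print(scramble(data).hex())       # ...now looks pseudo-random
```

Because it's XOR rather than AND, the receiver recovers the original simply by running the same PRBS and scrambling again; an AND would destroy information irrecoverably.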

What we have now is already an interesting stream of information: nicely packetized and more resistant to some errors. Each packet is then fed through a "Reed-Solomon" encoder. This is an error-correcting encoder that appends 16 parity bytes to each packet (growing it to 204 bytes), giving the rx side the ability to correct up to 8 erroneous bytes per packet. So this is the first stage where we're adding additional information to the stream that is going to help us later on. Reed-Solomon (RS) is also used frequently in other mechanisms, like storage and data transfer for other applications (CDs, hard disks, etc.). Sometimes it's replaced by other algorithms like turbo codes (space missions, etc.). Just think... the images you're seeing from space, sent by those satellites, also use these schemes to ensure no bits get inverted or changed during the transfer.
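Real Reed-Solomon needs arithmetic over GF(2^8), which is a bit much for a sketch. To show the flavour of forward error correction, here's a Hamming(7,4) code instead (plainly not RS): it adds 3 parity bits to 4 data bits and can correct any single bit error.

```python
def hamming74_encode(d):
    # d: four data bits -> seven-bit codeword [p1, p2, d1, p3, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    c = c[:]
    # Each syndrome bit re-checks one parity group; together they
    # spell out the 1-indexed position of a single flipped bit.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c[pos - 1] ^= 1           # correct the flipped bit
    return [c[2], c[4], c[5], c[6]]

cw = hamming74_encode([1, 0, 1, 1])
cw[2] ^= 1                        # channel flips one bit
print(hamming74_decode(cw))       # [1, 0, 1, 1] recovered
```

RS works the same way in spirit, but on whole bytes: the 16 parity bytes let the decoder locate and repair up to 8 bad bytes per 204-byte packet.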

The next step after RS is some interleaving. Interleaving is a process where you shuffle parts of one packet with parts of other packets. The reason for doing this is that errors typically occur in bursts, not as isolated 'hit and run' errors. The bytes are shuffled out of their original positions before transmission, and the deinterleaver moves them back into place later. If an error burst occurred, the damage it caused is now spread out much more thinly (it didn't zero out 3 bytes in a row, but perhaps one byte in each of several packets). Thus, it makes the signal again more robust against interference and errors.
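DVB-S actually uses a convolutional (Forney) interleaver of depth 12, but a simple block interleaver shows the principle: a burst of consecutive errors on the channel ends up spread across distant positions after deinterleaving, where the RS decoder can mop them up one at a time.

```python
def interleave(data, rows, cols):
    # Write row by row, read column by column.
    assert len(data) == rows * cols
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(data, rows, cols):
    return [data[c * rows + r] for r in range(rows) for c in range(cols)]

data = list(range(12))
sent = interleave(data, rows=3, cols=4)
# A burst wipes out three consecutive symbols on the channel.
sent[0] = sent[1] = sent[2] = None
received = deinterleave(sent, rows=3, cols=4)
print([i for i, v in enumerate(received) if v is None])  # [0, 4, 8]
```

After deinterleaving, the three-symbol burst has turned into three isolated single-symbol errors, which is exactly the kind of damage the earlier error-correction stage handles well.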

After the interleaving, another forward error correction scheme is applied: convolutional coding, decoded at the receiver with the "Viterbi" algorithm. In the worst case (rate 1/2) this doubles the number of bits in the transmission stream; by puncturing the code, less redundancy can be sent. More bits mean higher bandwidth. The challenge is to fit the entire MPEG-2 stream within around 6 MHz of RF bandwidth, so both the original MPEG-2 stream bitrate as well as what happens after it matter a lot. If the convolutional coding can be less aggressive, slightly more MPEG-2 data can be sent in the channel before it uses up all the allocated (planned) bandwidth.
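A sketch of a rate-1/2 convolutional encoder using the constraint-length-7 generator polynomials (171 and 133 octal) that DVB-S specifies; the Viterbi decoder on the receiving end is a lot more code, so it's omitted here:

```python
G1, G2 = 0o171, 0o133   # generator polynomials, constraint length 7

def conv_encode(bits):
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & 0x7F        # 7-bit shift register
        out.append(bin(state & G1).count("1") & 1)  # parity of taps G1
        out.append(bin(state & G2).count("1") & 1)  # parity of taps G2
    return out

coded = conv_encode([1, 0, 1, 1, 0, 0, 1])
print(len(coded))  # 14: every input bit becomes two output bits
```

This is the "worst case doubles the bits" scenario; puncturing simply deletes some of these output bits according to a fixed pattern to trade robustness for throughput.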

The steps after this are 'baseband shaping' and 'I/Q modulation'. This means that the digital signal is mapped to an analog signal for transmission. A word you'll see here is "constellation". The kind used in DVB-S is quadrature phase shift keying (QPSK). This means that a sequence of 2 bits is taken together and mapped to one of four vectors in constellation space, 90 degrees apart (typically at 45, 135, 225 and 315 degrees). Keeping the points this far apart helps prevent errors, and you'd typically choose the constellation based on the amount of expected noise in the channel. DVB-C, the cable kind, has so little expected noise that it uses 64 positions in this constellation instead (64-QAM). That means 6 bits per symbol instead of 2, so in theory three times the data in the same bandwidth.
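A Gray-mapped QPSK mapper as a sketch (the exact bit-to-quadrant assignment below is an assumption for illustration; the standard pins this down precisely):

```python
import math

# Gray mapping: adjacent constellation points differ in only one bit,
# so the most likely symbol error costs only a single bit error.
TABLE = {(0, 0): 1 + 1j, (0, 1): 1 - 1j,
         (1, 1): -1 - 1j, (1, 0): -1 + 1j}

def qpsk_map(bits):
    assert len(bits) % 2 == 0
    scale = 1 / math.sqrt(2)   # normalise to unit energy per symbol
    return [TABLE[(bits[i], bits[i + 1])] * scale
            for i in range(0, len(bits), 2)]

symbols = qpsk_map([0, 0, 1, 1])
print(symbols)  # two symbols, 180 degrees apart
```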

Different analog video signals occupy 5-8 MHz of bandwidth. Expressed in Mbit/s, uncompressed digital PAL video equates to 216 Mbit/s, whereas MPEG-2 compressed PAL reduces that to 2.5-6 Mbit/s; the high end applies when there's lots of motion, the low end when there's little. Compressed HDTV is higher than that, on the order of 12-20 Mbit/s. However, this is measured against MPEG-2. H.264 encoding is roughly two to three times more efficient, which brings HD video back into reach for actual transmissions. The alternative would be to use different modulation techniques or to occupy a larger portion than 6 MHz in the transmission region.
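The 216 Mbit/s figure falls straight out of the ITU-R BT.601 sampling parameters behind digital PAL:

```python
# BT.601: luma sampled at 13.5 MHz, each chroma channel at 6.75 MHz,
# 8 bits per sample -> raw uncompressed bitrate.
luma = 13.5e6
chroma = 6.75e6
bits_per_sample = 8

rate = (luma + 2 * chroma) * bits_per_sample
print(rate / 1e6, "Mbit/s")   # 216.0 Mbit/s
```

Against that baseline, the 2.5-6 Mbit/s of an MPEG-2 stream is a reduction of well over 30:1 before any of the channel coding above is applied.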

Some issues still arise... DVB-S was specifically created for line-of-sight (LOS) conditions, away from reflective buildings and other interference sources. As soon as DVB-S is used for terrestrial transmissions, this could have a huge impact on video quality. Tests so far indicate this is not necessarily the case, and I reckon that with the circularly polarized antennas that FPV fliers, for example, are using for their analog video, the multipath issues that threaten DVB-S may well be reduced to a minimum.