[Read FitzGerald p202-203 in 5th Ed. and p151 in 6th Ed.]
When frame-delimiting characters may appear within the data itself, it is necessary to provide data transparency; ie the data must be ignored by the frame synchronisation system. An example of this would be a character-oriented protocol transmitting a “binary” file, eg a compiled program, which could easily contain the ETX code somewhere within it. In this case character (or byte) stuffing (more formally, DLE insertion) is used to achieve data transparency.
This method precedes the “true” STX and ETX characters with a data link escape (DLE) code. The transmitter then inserts (stuffs) an extra DLE before every DLE it finds in the data. The receiver knows (by receiving DLE-STX) that data transparency is in operation. If later in the frame it finds two DLEs in sequence it destuffs them (ie deletes one) and passes the other through as data. It therefore recognises the true end of frame as the only single DLE followed by an ETX.
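As a rough sketch (my own illustration, not FitzGerald's), the stuffing and destuffing rules might look like this in Python, assuming the standard ASCII control codes for DLE, STX and ETX:

    DLE, STX, ETX = 0x10, 0x02, 0x03  # ASCII data link escape, start/end of text

    def stuff(payload: bytes) -> bytes:
        """Frame a payload as DLE-STX ... DLE-ETX, doubling any DLE in the data."""
        body = bytearray()
        for b in payload:
            body.append(b)
            if b == DLE:          # stuff an extra DLE so the data can never
                body.append(DLE)  # imitate the DLE-ETX end-of-frame sequence
        return bytes([DLE, STX]) + bytes(body) + bytes([DLE, ETX])

    def destuff(frame: bytes) -> bytes:
        """Recover the payload, deleting one DLE of every stuffed pair."""
        body = frame[2:-2]        # strip the DLE-STX header and DLE-ETX trailer
        out, i = bytearray(), 0
        while i < len(body):
            out.append(body[i])
            i += 2 if body[i] == DLE else 1  # skip the stuffed duplicate
        return bytes(out)

    data = bytes([0x41, DLE, ETX, 0x42])     # data that happens to contain DLE and ETX
    assert destuff(stuff(data)) == data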
The need for a pair of characters at the start and end of each frame for frame synchronisation means that character-oriented transmission is relatively inefficient for binary data. Why? Because one set of binary digits has to precede the frame and another has to follow it (to signal the start and end of the frame). This can be an unacceptable overhead, particularly with small frames. These protocols are also very inefficient for transferring data that is not coded as 8-bit characters. Today the preferred protocols are usually bit-oriented, and these use a “flag” for framing.
This flag byte delimits the frame, but it may also appear in the data, so these protocols face the same need for data transparency. The flag byte is the binary pattern 01111110. There is an efficient method for hiding this pattern if it appears anywhere[1] in the data. Since it is the equivalent of the “character stuffing” used with character-oriented protocols, the method is called bit stuffing or (more formally) zero bit insertion.
Once the flag has been transmitted, the sender stuffs a ‘0’ bit into the bit stream after any occurrence of five consecutive ‘1’ bits in the data. Of course the sender knows when the frame has ended, so the flag pattern is appended to the end of the frame without bit stuffing taking place. The receiver counts ‘1’ bits as they arrive. If five consecutive ‘1’ bits are received, the next bit is checked. If it is a ‘0’ it is deleted (since it must have been a stuffed bit) and the count of consecutive ‘1’ bits starts again. If it is a ‘1’ then the following bits are checked to determine whether a flag is arriving.
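Again as an illustration only, the rule can be sketched in Python; for clarity this works on strings of ‘0’ and ‘1’ characters rather than on packed bytes:

    FLAG = "01111110"

    def bit_stuff(bits: str) -> str:
        """Insert a '0' after every run of five consecutive '1' bits."""
        out, run = [], 0
        for b in bits:
            out.append(b)
            run = run + 1 if b == "1" else 0
            if run == 5:          # five 1s in a row: stuff a 0 so the data
                out.append("0")   # can never look like the 01111110 flag
                run = 0
        return "".join(out)

    def bit_destuff(bits: str) -> str:
        """Delete the '0' that follows every run of five consecutive '1' bits."""
        out, run, i = [], 0, 0
        while i < len(bits):
            out.append(bits[i])
            run = run + 1 if bits[i] == "1" else 0
            i += 1
            if run == 5:          # the next bit must be a stuffed 0: drop it
                i += 1            # (a 1 here would signal a flag or an abort)
                run = 0
        return "".join(out)

    data = "0111111011111100"                 # data containing the flag pattern
    framed = FLAG + bit_stuff(data) + FLAG
    assert bit_destuff(framed[8:-8]) == data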
What can go wrong? With either character- or bit-oriented protocols, the frame delimiter can be destroyed by transmission errors, or a delimiter can be erroneously detected because of line errors. It is now appropriate to look at methods of detecting errors.
[Read FitzGerald p192-194 in 5th Ed. and p141-143 in 6th Ed.]
We have already determined that there are a number of different causes of transmission errors and that several techniques are used to minimise them. We saw that the use of digital lines provides the best solution. Today’s digital transmission networks have a very low probability of error, but even so there must be a mechanism to detect when a corruption has occurred.
The probability P of a bit corruption occurring in a transmission is the probability of a single bit being corrupted in a defined interval; it is called the bit error rate (BER). A transmission line with a BER of 1/1000 means that, on average, 1 bit in every 1000 transmitted will be corrupted.
If we are transmitting single 8-bit characters using asynchronous transmission with one start and one stop bit (ie 10 bits/character) then the probability of a character being in error is 1 - (1 - P)^10. If P = 1/1000, this is approximately 1/100. More simply, if 100 characters were sent in a contiguous block (ie 100*10 = 1000 bits were sent) then on average the block would contain 1 bit in error (if the BER is 1/1000). The probability of that error being in a particular character in the block is 1 character in 100, ie 1/100. Therefore, on average, 1 character in every 100 would need to be retransmitted because of error.
But note the effect of long frames. If the protocol or transmission method required that 100 characters be sent in one block over the above line, then each 1000-bit block would, on average, contain one bit error; more precisely, the probability of a block being in error and therefore requiring retransmission would be 1 - (1 - P)^1000, which is about 0.63. Why would this length of frame be inappropriate for a line with a BER of 1/1000? (These probabilities are calculated in the sketch below.)
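The figures above can be checked with a few lines of Python using the formula 1 - (1 - P)^n for a block of n transmitted bits:

    P = 1 / 1000                      # bit error rate (BER)

    def p_error(n_bits: int, p: float = P) -> float:
        """Probability that a block of n_bits contains at least one bit error."""
        return 1 - (1 - p) ** n_bits

    print(p_error(10))     # one 10-bit asynchronous character: ~0.00996, ie about 1/100
    print(p_error(1000))   # a 100-character (1000-bit) block:  ~0.632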
All detection methods require that some extra information be sent along with the data which can be checked by the receiver to determine if any of the data has changed. This means that precious bandwidth must be used to provide this ability, ie error detection and correction is a cause of inefficiency in the system. Generally speaking, the more detection information sent, the greater the chance of detection.
The three detection methods that you need to understand are Parity, Longitudinal Redundancy, and Cyclic Redundancy checking. All are explained in your text.
We have discussed detection by parity checking (a simple system used within asynchronous communication) and also noted that it fails to detect errors affecting two, or any even number of, bits in a character. Therefore the probability of detection via parity is very poor (about 50%).
Detection can be increased to around 98% when longitudinal (column or block) parity is used as well as transverse (row or character) parity. Here, characters are sent in a block followed by an extra character known as the block check (parity) character (BCC), whose value is determined in the same way as row parity but calculated down each column.
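As an illustration (not an example from the text), both checks can be sketched in Python, here assuming even parity over 7-bit ASCII characters:

    def parity_bit(value: int, bits: int = 7) -> int:
        """Even-parity bit for the low `bits` bits of value (row/character parity)."""
        return bin(value & ((1 << bits) - 1)).count("1") % 2

    def block_check_character(block: bytes) -> int:
        """Longitudinal (column) parity: even parity of each bit position
        across the whole block, collected into one block check character."""
        bcc = 0
        for ch in block:
            bcc ^= ch          # XOR accumulates even parity column by column
        return bcc

    msg = b"CHECKSUM"
    rows = [parity_bit(ch) for ch in msg]   # one transverse parity bit per character
    bcc = block_check_character(msg)        # one extra character for the whole block
    print(rows, format(bcc, "08b"))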
Unfortunately, within telecommunications it is more likely that bit errors will occur in bursts rather than with a random single-bit distribution. Parity methods do not provide reliable detection of burst errors. The most common alternative is based on the use of polynomial codes. A generator polynomial is used to produce a remainder which is appended to the transmitted message. This remainder is referred to as the frame check sequence (FCS) or the cyclic redundancy check (CRC) digits. The receiver divides by the same polynomial to check that its calculation agrees with the transmitter’s; if not, there has been an error and the frame will have to be retransmitted.
There are several commonly used generator polynomials in the CRC family. CRC-16 and CRC-CCITT are used extensively with WANs and CRC-32 is used in most LANs.
A generator polynomial of R bits will detect:
all single bit errors
all double bit errors
all odd number of bit errors
all error bursts < R bits long
most error bursts >= R bits long.
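As an illustration of how the division is done in software, the sketch below computes a CRC-CCITT value; the particular parameters (generator 0x1021, initial register value 0xFFFF, no bit reflection) are one common choice and are my assumption, not the text's:

    def crc16_ccitt(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
        """Return the 16-bit FCS: the remainder after dividing the message,
        treated as a polynomial over GF(2), by the generator polynomial."""
        crc = init
        for byte in data:
            crc ^= byte << 8                  # bring the next 8 message bits in
            for _ in range(8):
                if crc & 0x8000:              # top bit set: subtract (XOR) the generator
                    crc = ((crc << 1) ^ poly) & 0xFFFF
                else:
                    crc = (crc << 1) & 0xFFFF
        return crc

    frame = b"CHECKSUM"
    fcs = crc16_ccitt(frame)
    print(format(fcs, "04X"))   # transmitter appends this; receiver recomputes and compares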
Polynomial checking is a little difficult for mere mortals but it is quite easily implemented by software or hardware. A more easily understood method for us is the checksum technique.
Assume the transfer of the characters CHECKSUM and the use of decimal 255 as the divisor for generating a one-byte check character:
Character    Decimal value    Character sent and received    Decimal value
C            67               C                              67
H            72               H                              72
E            69               E                              69
C            67               C                              67
K            75               K                              75
S            83               S                              83
U            85               U                              85
M            77               M                              77
Sum          595              Sum                            595

The transmitter divides the sum by 255: 595 / 255 = 2, remainder 85. The remainder 85 is the decimal value of U, so U is appended to the block and transmitted as the check character. The receiver sums the characters it receives, performs the same division and compares its remainder with the value of the received check character; a mismatch means the block is in error.
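The same arithmetic in a short Python sketch of the table above (my own illustration):

    def check_character(block: bytes, divisor: int = 255) -> int:
        """One-byte check character: the remainder of the sum of the
        character values divided by the chosen divisor."""
        return sum(block) % divisor

    msg = b"CHECKSUM"
    check = check_character(msg)                 # 595 % 255 = 85, ie the character 'U'
    sent = msg + bytes([check])

    received = sent                              # assume an error-free transfer here
    ok = check_character(received[:-1]) == received[-1]
    print(chr(check), ok)                        # 'U', True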