000522-1
|
If you want to send data from one computer to another,
you need to mark the end of each data item. How can you, for example,
include the "." character in a string if "." (or some other character)
is used as the end-of-string mark?
Discuss different methods of handling this problem in protocols based on
ABNF, ASN.1 and XML, and their pros and cons.
|
Answer
|
- Put a length counter in front of the data. The data
can then contain anything. Main method in BER. Also used to some extent
in HTTP (2).
- Split the data into chunks, with a length counter in
front of each chunk. Again, anything can be included, but the sender
need not even know all the data before starting to send it. Also used
in BER and in the e-mail "chunking" method (1).
- Forbid certain characters in the data (1). If they
occur anyway, encode them in some special way. The most common
such ways are:
- Double all occurrences of the forbidden character.
Example: Encode 'His name is "John" today' as '"His name is ""John""
today"' (0.5).
- Put a special quoting character in front of forbidden
characters. Example: "John F. Nilsson" as "John\ F\.\
Nilsson". Used in e-mail (0.5).
- Encode using the hexadecimal or decimal value of
the character. Example: "Göran Åberg" as "G&#246;ran
&#197;berg" or "G%F6ran %C5berg". Used in HTML and many other
standards (0.5). An extreme variant of this is BASE64, where all
characters are encoded.
- Encode using a "name" of the character.
Example: "Göran Åberg" as "G&ouml;ran
&Aring;berg". (0.5)
- Let a line break indicate the end of a string, but
allow line breaks in the string if they are followed by linear white
space, as in e-mail headers. (0.5)
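The two length-counter methods at the top of the list can be sketched in a few lines of Python. This is only an illustration: the fixed 4-byte big-endian counter and the function names (frame, unframe, frame_chunks, unframe_chunks) are assumptions made for the example, not the actual length encoding of BER or HTTP.

```python
import struct

def frame(data):
    """Prefix data with a 4-byte big-endian length counter (assumed format)."""
    return struct.pack(">I", len(data)) + data

def unframe(buf):
    """Read one length-prefixed item; return (item, remaining bytes)."""
    (n,) = struct.unpack(">I", buf[:4])
    return buf[4:4 + n], buf[4 + n:]

def frame_chunks(chunks):
    """Prefix every non-empty chunk with its own length.

    A zero-length chunk marks the end of the item, so the sender
    does not need to know the total length in advance.
    """
    out = b"".join(frame(c) for c in chunks if c)
    return out + frame(b"")

def unframe_chunks(buf):
    """Read chunks up to the zero-length end marker; return (item, rest)."""
    parts = []
    while True:
        chunk, buf = unframe(buf)
        if not chunk:
            return b"".join(parts), buf
        parts.append(chunk)

# The payload may contain "." or any other byte, since nothing is reserved.
wire = frame(b"a.b.") + frame_chunks([b"His name ", b"is John."])
first, rest = unframe(wire)
second, rest = unframe_chunks(rest)
print(first, second, rest)
```

The cost of this freedom is that the receiver must trust the counters: a single wrong length desynchronizes everything that follows, whereas a delimiter-based format can usually resynchronize at the next delimiter.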
Some of these methods have special problems if the character
which needs to be encoded or the encoded variant is at the end of the
string to be transmitted.
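As a sketch of the doubling and quoting-character methods, including the end-of-string case, the following Python functions encode a string and decode it again. The function names and the choice of '"' as delimiter and '\' as quoting character are assumptions for the example, not taken from any particular standard.

```python
def quote_doubling(s):
    """Delimit s with '"' and double every '"' inside it."""
    return '"' + s.replace('"', '""') + '"'

def unquote_doubling(wire):
    """Parse one doubled-quote string at the start of wire; return (value, rest)."""
    if not wire.startswith('"'):
        raise ValueError("missing opening quote")
    out, i = [], 1
    while i < len(wire):
        if wire[i] == '"':
            if wire[i + 1:i + 2] == '"':   # doubled quote: one literal '"'
                out.append('"')
                i += 2
                continue
            return "".join(out), wire[i + 1:]  # single quote: end of string
        out.append(wire[i])
        i += 1
    raise ValueError("unterminated string")

def quote_backslash(s, special='"\\'):
    """Delimit s with '"' and put '\\' in front of every character in special."""
    return '"' + "".join("\\" + c if c in special else c for c in s) + '"'

def unquote_backslash(wire):
    """Parse one backslash-quoted string at the start of wire; return (value, rest)."""
    if not wire.startswith('"'):
        raise ValueError("missing opening quote")
    out, i = [], 1
    while i < len(wire):
        c = wire[i]
        if c == "\\":
            if i + 1 >= len(wire):
                raise ValueError("dangling quoting character")
            out.append(wire[i + 1])            # take the next character literally
            i += 2
        elif c == '"':
            return "".join(out), wire[i + 1:]  # unquoted delimiter: end of string
        else:
            out.append(c)
            i += 1
    raise ValueError("unterminated string")

print(quote_doubling('His name is "John" today'))
print(quote_backslash("John F. Nilsson", special=' .\\'))
```

A value that ends in the forbidden character, such as 'say "hi"', is sent as "say ""hi""" with three quotes in a row, and a value that ends in the quoting character must have that character quoted too. A decoder that only looks for the first delimiter, or that treats every '\"' as a quoted quote without tracking which backslashes are themselves quoted, gets these cases wrong. With doubling, two strings sent back to back with no separator between them also become ambiguous.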
|