IETF Munich 1997 notes

Here are notes on what happened during some of the Application area sessions at the Internet Engineering Task Force (IETF) meeting in Munich, August 1997. These are my personal notes, not official minutes. What I write below is often a quote of what someone said at the meeting, and not necessarily my own opinion on the issue.

By Jacob Palme, e-mail: jpalme@dsv.su.se, at the research group for CMC (Computer Mediated Communication) in the Department of Computer and Systems Sciences at Stockholm University and KTH.

Hypertext Transfer Protocol

Webdav - WWW Distributed Authoring and Versioning

A character set policy for the IETF

Drums - revision of the e-mail standards

Registry Information Database Exchange Formats and Protocols BOF

Sending HTML in e-mail - MHTML

Other related issues

Text on the back of a T-shirt

"We reject kings, presidents and voting.
We believe in rough consensus and running code."
- Dave Clark (1992)

Comment: This is a commonly quoted credo of the IETF, but it is not quite true. I have participated in several votes during IETF face-to-face working group meetings. However, special voting algorithms are then usually used. Example: "Who has read the draft?", "Who among those who have read the draft are of the opinion that...?" I believe that voting with normal voting algorithms is not suitable for IETF work, but that voting with specialised voting algorithms may in some cases be useful. Some time I will write a proposal about this.

Hypertext Transfer Protocol (HTTP)

The task of this group is to handle HTTP 1.1, which is already a proposed standard, and to advance it to draft standard. This means that the work on HTTP 1.1 is well advanced, and the things discussed are nitty-gritty details of unclear points or other problems with a standard that is being implemented. But such discussions are also interesting: you can recognize the typical problems which crop up at this stage of standards development.

The meeting rather rapidly went through a very long issue list. Most issues were related to special cases or unclarities in particular uses of particular header fields.

Wildcards for character set accepted

Is an asterisk (*) permitted as a wildcard for the character set?
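A minimal sketch of how a server might evaluate an asterisk in an Accept-Charset header value. It assumes that "*" matches every character set not named explicitly and that q=0 means "not acceptable". The function name charset_acceptable is hypothetical, and default character sets are ignored.

```python
def charset_acceptable(accept_charset: str, charset: str) -> bool:
    """Return True if `charset` is allowed by an Accept-Charset header value.

    Assumptions: "*" matches any charset that is not listed explicitly,
    and an entry with q=0 marks a charset as not acceptable.
    """
    wanted = charset.strip().lower()
    explicit = {}        # charset name -> q value
    wildcard_q = None    # q value of "*", if the header contains one
    for item in accept_charset.split(","):
        parts = [p.strip() for p in item.split(";")]
        name = parts[0].lower()
        if not name:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if name == "*":
            wildcard_q = q
        else:
            explicit[name] = q
    if wanted in explicit:
        return explicit[wanted] > 0
    return wildcard_q is not None and wildcard_q > 0
```

With this sketch, "iso-8859-1, *;q=0.5" accepts utf-8, while "iso-8859-1" alone does not.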

Content-Disposition

Should Content-Disposition (from e-mail standards RFC 1806) be used also in HTTP? Content-Disposition allows a sender to designate a body part as inline or attachment, and indicate a suggested file name.
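As an illustration of what the header carries, here is a small sketch using the e-mail package of the Python standard library. The file name notes.txt is made up for the example, and whether HTTP should use the header in the same way was exactly the open question.

```python
from email.message import Message

part = Message()
part["Content-Type"] = "text/plain"
# Mark the body part as an attachment and suggest a file name to the receiver.
part.add_header("Content-Disposition", "attachment", filename="notes.txt")

# A receiver can read back both pieces of information.
disposition = part.get_content_disposition()
suggested_name = part.get_filename()
```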

301-302 issue

The problem is that this has been changed from old practice to HTTP 1.1, and that the old practice is codified in a number of CGI scripts. These scripts, when used by new 1.1 servers, will continue to perform according to the old practice, but the clients expect the new practice, and so the connection does not work. Similar problems may occur in other cases where 1.1 has changed older practice.

This issue should at least be discussed in a revision of RFC 2145, even if there is no good solution.

One solution used by one server: Turn off 1.1 as soon as you are communicating with a proxy.

Use of redirects

Some web pages cause 5-6 redirects in sequence. This is not quite nice. Many walkers stop at redirections and do not follow them.

Link header

There is a conflict between HTML and HTTP in interpretation of the link header. Should this be discussed in the HTTP working group, or in the groups working on HTML standards? Is the semantics of the HTTP Link header exactly the same as that of the HTML Link element?

Caching CGI-s?

Should we add a note explaining how CGI output can be cached? History: Caching proxies started out not caching queries or anything with CGI-BIN in the URL, since such requests usually return different values on every use and are not worth caching. Servers learnt this, began to assume that such things are not cached, and did not put in the "don't cache" header. And then they began to ask for the standard to say that proxies should not cache such things!

Options

Options are insufficiently defined. Discussion seems to be converging. Internet draft with solution will be written. RFC numbers are used in the protocol for option negotiations.

Conservative vs. Optimistic age calculation?

Encourage synchronised clocks!

Security considerations

Advice on SHOULD and DON'T should be in the main text, not in security considerations, someone said.

Redirect

Redirect inherently has a privacy concern issue.

Requirements

A table of requirements like the ones in RFC 1122 and 1123 is wanted. A sample of such a table can be found in Draft-08. Example: What are the requirements for a pure origin server?

Go to draft standard?

To go to draft standard, you have to have at least two independent interoperable implementations. Collecting this data is difficult for such a long and complex standard as HTTP 1.1. There was a question whether two independent interoperable implementations meant two servers or two clients or what? Answer: Two independent servers and two independent clients which are all interoperable!

Webdav - WWW Distributed Authoring and Versioning

The goal of this group is to develop better methods to support authoring of WWW documents. Authoring requires:

Standard ways of submitting resources
Standard way of locking a resource to prevent simultaneous updating by more than one person (which otherwise might cause one new version to overwrite another new version produced simultaneously)
Associate links between documents and properties on them
Support containers of several documents
Support versions of the same document

This IETF working group is developing a way of associating a set of Properties with collections of web resources. Typical properties are "author" or "readonly". Each property has a name and can have a value. The syntax and semantics of the value are specified by the name. HTTP will be extended with facilities for getting and setting the properties of resources. This has some similarity to the META values which can occur in HTML heads. A Schema is a set of properties defined for use within a special set of resources. In particular, the Webdav group is going to define a schema named DAV.

Searching

This group is backing away from providing a general-purpose search feature. A standard for general-purpose search could be a task for a new IETF working group. The Webdav group will only provide an intentionally limited facility called FINDPROP. FINDPROP allows retrieval of which resources have certain properties in a collection, but does not provide more general-purpose search like searching on the contents of resources.

Depth parameter

It is possible to apply a method to one resource or to all resources in a collection. The depth parameter can have the value "0" (only this particular resource), "1" (all direct members of a collection) or "infinite" (all direct and indirect members of a collection). This is controversial, and will be moved to a separate document, with "0" as the only option in the base specification. Example: Copy and Move are very complex methods if applied to more than one single resource.
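The three depth values can be sketched as a selection over a collection tree. This is only an illustration: the function name select_resources and the dictionary representation of collections are assumptions, as is the choice to include the collection itself for depth "1" and "infinite". The tree is assumed to contain no cycles.

```python
from typing import Dict, List


def select_resources(tree: Dict[str, List[str]], root: str, depth: str) -> List[str]:
    """Return the resources a method would apply to for a given depth value.

    `tree` maps a collection to its direct members; a resource that is not
    a key in `tree` is treated as a plain (non-collection) resource.
    """
    if depth == "0":
        return [root]
    if depth == "1":
        return [root] + list(tree.get(root, []))
    if depth == "infinite":
        selected, stack = [], [root]
        while stack:
            current = stack.pop()
            selected.append(current)
            # Push members in reverse so they are visited in listed order.
            stack.extend(reversed(tree.get(current, [])))
        return selected
    raise ValueError(f"unknown depth value: {depth!r}")


docs = {
    "/docs/": ["/docs/a.html", "/docs/sub/"],
    "/docs/sub/": ["/docs/sub/b.html"],
}
```

The number of resources touched grows with each depth value, which hints at why Copy and Move become complex beyond depth "0".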

Locking

There is a need for a facility for an atomic operation to lock a set of resources simultaneously. Reason: Avoid the problem where someone asks for a lock on a set of resources and gets locks on only some of them. Difficult to implement. There is discussion whether to have a locking facility on a set of resources, or on only one at a time. Compromise: The need is there, try to get it working, if we cannot, we will have to accept that we could not meet this requirement. The problem: in HTTP, all commands operate on only one URI. There is no HTTP facility for performing an operation on more than one URI at the same time.

Possible solution: Special lock servers called arbitrators. But then before trying to lock a set of resources, you must find an arbitrator capable of handling locking of all resources in this set.

Note: An atomic operation is an operation which can only be performed in full or not at all. But it need not be performed exactly simultaneously on all the resources on which the atomic operation acts.
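The all-or-nothing requirement can be sketched with an in-memory lock table for a single server or arbitrator. The class LockTable and its methods are hypothetical names, not part of any Webdav draft, and the sketch ignores concurrency between callers and locking across several servers, which is where the real difficulty lies.

```python
from typing import Dict, Iterable, Optional


class LockTable:
    """In-memory lock table for one hypothetical server or arbitrator."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}   # resource URI -> lock owner

    def lock_all(self, uris: Iterable[str], owner: str) -> bool:
        """Lock every URI for `owner`, or lock none of them."""
        wanted = list(uris)
        # First pass: check only, so that a failure leaves no partial locks.
        for uri in wanted:
            holder = self._owners.get(uri)
            if holder is not None and holder != owner:
                return False
        # Second pass: all locks are free (or already ours), so take them all.
        for uri in wanted:
            self._owners[uri] = owner
        return True

    def owner_of(self, uri: str) -> Optional[str]:
        return self._owners.get(uri)


table = LockTable()
first = table.lock_all(["/a", "/b"], "alice")    # succeeds, both locked
second = table.lock_all(["/b", "/c"], "bob")     # fails, "/c" stays unlocked
```

The locks are taken one after another, not simultaneously, yet a caller sees either all of them or none, which matches the note above.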

Language variants?

There was a discussion about language variants. They seemed to be oriented towards a special construct for handling variants, rather than the simple solution of a "translation-of" link between objects, which we have chosen in Web4Groups. I tried to argue for our solution, but people seemed to think according to different models than we do.

Something for CoopWWW?

An observation: The issues in this IETF group seem to be of interest to people working on the BSCW and in the EU-funded research project CoopWWW.

A character set policy for the IETF

This meeting was a so-called BOF (Birds of a Feather) which in IETF means a meeting for which there is no IETF working group yet. Chairman: Harald T Alvestrand.

He started with a very good overview of the issues:

Texts in many languages get badly mangled if shown as plain ASCII.

IETF policy:

Charset must be declared! (exception: If the protocol specifies only one charset, but even then, tagging might be needed for storage.)
Single character set: If you prefer one and only one character set: Use ISO 10646 (which is not necessarily exactly the same as Unicode). IETF recommends ISO 10646 rather than Unicode.
UTF-8 is the encoding of ISO 10646, which should be used.
Language tags: Use RFC 1766, when you need to indicate the language of a text.

Why declare character set: It will take time until ISO 10646 is universally accepted, and we need a way out if ISO 10646 is found not to suffice at some time in the future.

Why ISO 10646: Richest today, good opportunities for further extension. It is well known, maintained and extended. Problem: The standard is unstable and will change in the future, but this can also be an advantage, since mistakes can be expected to be corrected. Another problem with ISO 10646 is that because the character set is so large, many implementations will only be capable of handling a subset of all the characters in ISO 10646. A method may then be needed to indicate which subset of ISO 10646 a computer can handle, and what to do when characters outside of this set are encountered.

Why UTF-8: One way is better than many ways. UTF-8 is backwards compatible with ASCII: ASCII data will look like normal ASCII. Disadvantage: Requires 8-bit clean channels and is a variable-length encoding.
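Both properties can be checked directly with Python's built-in codecs. The sample strings are only illustrations.

```python
ascii_text = "IETF Munich 1997"
# Pure ASCII text encodes to exactly the same bytes in UTF-8.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# A non-ASCII character takes more than one byte, and each of those bytes
# has the high bit set, which is why an 8-bit clean channel is required.
name = "Torbj\u00f8rn"                      # U+00F8 is the letter o with stroke
encoded = name.encode("utf-8")
lengths = [len(ch.encode("utf-8")) for ch in name]
high_bit_bytes = [b for b in encoded if b >= 0x80]
```

Here the eight characters of the name become nine bytes, which illustrates the variable-length property.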

UTF-1, UTF-2, UTF-file-system-safe are precursors or earlier names of UTF-8, these designations should not be used any more.

Why language tags: People can designate which language version to read, much processing, like indexing and sorting, depends on the language. RFC 1766 is recommended because the ISO standard 639 is not complete, it can only handle about 50 languages. Larger ISO schemes are in development. RFC 1766 is a flexible scheme under IETF control.

ISO 10646 is better maintained than ISO 639 because there is stronger industry pressure to get 10646 working. ISO 639 is handled by linguists who do not understand the urgency needed to get working standards reasonably fast.

Tagging methods:

Text headers
Saying that only UTF-8 is permitted in a certain protocol (no tagging needed)
MIBs: charset variable

What about names: Who sees them, who types them, who misunderstands them? Names are often used with, for example, characters particular to Norwegian in English-language texts. Examples: Torbjörn, Torbjørn. JP comment: The problem with names arises because you have one language and character encoding for a whole body part. If you can switch language and character encoding within a string, the name problem disappears. Also, 10646 might solve this problem since it allows all characters, so you never need to switch character set within a string to handle a name.

Problems: ISO 10646 has different characters which look the same. Case handling (upper and lower case) may not be well-defined for non-Latin characters, and sorting is a problem both because of case handling and for other reasons. Comparison of two strings is a problem. This could be solved by normalizing methods or rules.
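The comparison problem, and one possible normalizing rule, can be sketched as follows. The sketch uses the Unicode normalization form NFC from Python's unicodedata module, which is an assumption for illustration and not necessarily the method the meeting had in mind. The helper name normalized_equal is hypothetical.

```python
import unicodedata

composed = "Torbj\u00f6rn"       # o with diaeresis as one precomposed character
decomposed = "Torbjo\u0308rn"    # plain "o" followed by a combining diaeresis

# The two strings look the same on screen but differ code point by code point.
raw_equal = composed == decomposed


def normalized_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to the same normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


same_after_normalizing = normalized_equal(composed, decomposed)
```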

There was a long discussion about different variants of ISO 10646 and various problems with 10646 and about normalising of 10646 strings. There was only one hour allocated to this BOF, so many important issues never got to be discussed.

Here is a summary of what is agreed and not agreed:

We agree that the protocols should always tell you the character set of a piece of text. However, a special set name "unknown" is needed in special cases.
We did not agree whether the protocols should also always tell you the language of a piece of text.
We agree that the main recommended character set is ISO 10646 with the UTF-8 encoding.
Most of us agreed that it is OK, in particular protocols, to define that certain strings are only ISO 10646/UTF-8 and nothing else is allowed. This, of course, cannot be retrofitted on existing protocols.
We did not agree on the issue of allowing defaults. Is it permitted to specify that texts in a protocol are of a certain character set, encoding and language unless otherwise specified? Some liked this, some did not like it at all.

Registry Information Database Exchange Formats and Protocols BOF

This group is needed because the registries have not agreed on a joint format. This group will develop protocols for some basic information exchange between different registries based on different standards (whois++, X.500, etc.)

http://www.isi.edu/~davidk/ride.

Base documents

"RIDE classes", Rick Wesson, David Kessens, D. Shah, 03/26/1997, <draft-kessens-ride-classes-00.txt>: This document describes the attributes and classes that will be used in the internet Registry Information Database Exchange formats (RIDE). For now it is mostly limited to 'domain' and 'contact' classes since they were widely considered as most urgent. Encoding that will be used for the objects and ways to find and access objects are beyond the scope ofthis document. This will be addressed in the future in separate drafts.

Start of the meeting

We have to persuade the area directors that an IETF working group on this is needed, said Keith Moore.

Three registries are already in existence; in the future, hundreds of them are expected.

Updates on the classes draft

A host object has been added, and required/optional and single/multiple indications have been added.

Globally unique registry identifiers

There must be a way for objects to reference other objects in other registries. Every reference has a globally unique registry identifier, and a local identifier unique within this registry. The global registry id is the domain name. It can be registered in the DNS. Each server can be queried with the local identifier.
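A two-part reference of this kind could be modelled as below. The class name ObjectRef, the field names and the sample identifiers are assumptions for illustration only. The draft did not fix an encoding for references.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ObjectRef:
    registry: str   # globally unique registry id: a domain name
    local_id: str   # identifier unique within that registry

    def key(self) -> Tuple[str, str]:
        # Domain names compare case-insensitively, and a trailing dot is
        # only a notational variant; local identifiers are kept as given.
        return (self.registry.lower().rstrip("."), self.local_id)


ref_a = ObjectRef("Registry-A.example.", "JP1")
ref_b = ObjectRef("registry-a.example", "JP1")
ref_c = ObjectRef("registry-b.example", "JP1")
```

The same local identifier in two registries (ref_b and ref_c) names two different objects, which is why the registry part is needed.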

No central authority is needed for managing who is a registry and who is not, but the registries.int domain, managed by IANA, is used.

There was a controversy on what information to store in the DNS. Some want to put much info there, others not so much. The DNS might store server, protocol and protocol options, what kind of data is available, or it might only store how to find a registry and nothing more. There was a lot of discussion on this issue.

Requirements

Our main customers are the regional registration authorities. We are not defining a new registry system, just exchange of data between existing registries.

Strawman proposal for formats and protocols

Operations needed: Version negotiation, identification to the server, retrieval of the full dataset for a specified object type or all types, retrieval of data that has changed since last retrieval (optional).

Internationalization

Should non-ASCII characters be allowed in registry entries? Registries should be international, and retrievable by anyone anywhere. (Registries might store extra data in non-ASCII fields, but the registry must look, when accessed from other registries, as if it had only ASCII fields.)

Charter of this working group

Which data needs to be exchanged between registries, how to handle references between data stored in different registries, how to find Internet registries in the global Internet and how to obtain authoritative information on which protocols they use, what data formats should be used.

One person who said this group was not needed at all was booed down.

Sending HTML in e-mail - MHTML

A proposed standard for sending HTML in e-mail is out, and most of the major e-mail software vendors are busy implementing it (including Eudora, Microsoft, Netscape and others). Several problems have cropped up in the exact implementation of the proposed standard, and some bugs have been found, so we decided to develop a new proposed standard, which we hope to submit to the IESG for last call at the end of September 1997.

The most important issues were:

What should a mailer do when given an HTML document with faulty URLs in it

Example:

Content-Type: Text/html

...

<IMG SRC="foo bar">

Content-Type: Image/gif

Content-Location: foo bar

We decided to accept the illegal URL and repeat it, if necessary, in the Content-Location statement, too, as shown in the example above, rather than having to rewrite the HTML text. This is of course still illegal and not recommended.

Can you take base for relative URLs from other than the immediately surrounding heading?

Content-Type: Multipart/related

Content-Base: http://foo.bar/

Content-Type: Text/html

<IMG SRC="my-name">

Content-Type: Image/gif

Content-Location: my-name

Is the Content-Base on the Multipart/related to be used as a base for URLs in the sub-parts? RFC 2110 says no, draft-fielding-url-syntax-05 says yes. We decided to keep saying no, but to contact Fielding to ensure that both documents agree.
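The effect of the "no" answer can be sketched with urljoin from the Python standard library. The helper name resolve and the dictionary representation of a body part's headers are assumptions for illustration. The sketch looks only at the headers of the part itself.

```python
from typing import Dict
from urllib.parse import urljoin


def resolve(reference: str, part_headers: Dict[str, str]) -> str:
    """Resolve a URL found in a body part using only that part's own headers.

    A Content-Base on the surrounding Multipart/related is deliberately
    not consulted, which is the "no" answer described above.
    """
    base = part_headers.get("Content-Base") or part_headers.get("Content-Location") or ""
    return urljoin(base, reference)


# The HTML part has no base of its own, so "my-name" stays relative and
# matches the Content-Location of the image part as written.
unresolved = resolve("my-name", {"Content-Type": "Text/html"})

# Had the HTML part itself carried the Content-Base, the result would differ.
resolved = resolve("my-name", {"Content-Base": "http://foo.bar/"})
```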

MHTML at Washington December 1997 IETF meeting

We want to have an MHTML session at that meeting. It will probably discuss which features are implemented and not implemented, as a basis for going to draft standard.

Drums - revision of the e-mail standards

DRUMS is in the final stages of developing its documents, so the discussion was mostly on small, but not unimportant, technical details.

There was continued discussion on ABNF, the most used syntax specification language in the IETF, which is to become a separate standard and not part of RFC822. Should ABNF cater for ISO 10646/UTF-8 characters? If so, how?

Should we allow multiple "To:" lines? Eudora, on receipt, will ignore all "To:" lines except the first. 822 says that the behaviour if you get multiple "To:" lines is undefined. Same for "Cc:" and "Bcc:". Conclusion: Generate only one, accept multiple, and if you get multiple, handle them as one long concatenated string.
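On the receiving side, the conclusion could look like this sketch with the e-mail package of the Python standard library. The addresses are made up, and joining with a comma is an assumption about what "one long concatenated string" should mean for an address list.

```python
from email import message_from_string

raw = (
    "From: sender@example.org\n"
    "To: a@example.org\n"
    "To: b@example.org, c@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

# msg["To"] alone would return only the first field, so collect all of
# them with get_all() and treat them as one address list.
combined_to = ", ".join(msg.get_all("To", []))
```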

SMTP: Should there be any limit on length of lines or of e-mail addresses? This is related to the issue: Should we stay compatible with RFC822 or write what we think is best for the future?

When should the responses 452 and 552 be given to RCPT TO?

452 Requested action not taken: insufficient system storage

552 Requested mail action aborted: exceeded storage allocation

It is important that the server gives the client adequate information on whether the client should try again a few hours later, or abandon the attempt to send this message to this server.
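One way a client can draw that distinction is from the first digit of the reply code, following the general SMTP convention that 4yz replies are transient failures and 5yz replies are permanent ones. The sketch below illustrates that convention only, not a decision of the meeting, and the function name should_retry is hypothetical.

```python
def should_retry(reply_code: int) -> bool:
    """Decide from an SMTP failure reply whether to try again later.

    4yz replies signal a transient failure, 5yz replies a permanent one.
    Anything else is not a failure reply and is rejected here.
    """
    if 400 <= reply_code <= 499:
        return True
    if 500 <= reply_code <= 599:
        return False
    raise ValueError(f"not an SMTP failure reply: {reply_code}")
```

Under this convention, 452 tells the client to try again later and 552 tells it to give up.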

How should Message-ID be constructed to ensure global uniqueness? We agreed to give implementors freedom, but the document could describe different methods of achieving uniqueness.
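As one example of such a method, the Python standard library builds a Message-ID from the current time, the process id and a random number, followed by a domain. This is only one possible approach, shown for illustration, and the domain example.org is a placeholder.

```python
from email.utils import make_msgid

# Each call combines time, process id and randomness on the left-hand side,
# so two identifiers generated on the same host are still different.
first_id = make_msgid(domain="example.org")
second_id = make_msgid(domain="example.org")
```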

Allow "group: LWSP ;"? Example: "To: foo@bar, via postal mail: (Mary Smith);".

"free-form-name" -> "display-name".

Four-digit-year: Generate grammar, must be four-digit. Receive grammar: SHOULD be able to handle it. Two-digit years NN SHOULD be interpreted as "20NN" if NN < 59, "19NN" if NN > 60.

Forward versus resent

Forward means: I want to discuss this message with the new recipient or with the new recipient together with the original recipient.

Resent means: You are the person who should be the recipient of this message, not me.

Current practice is a mess. Whatever we decide will require changes of existing browsers.

Reply-To

Two current uses:

Replacement for "From:" to be used in replies, for example secretary requesting replies to the boss or the reverse.
To reroute replies to mailing lists instead of the author.

My opinion: Deprecate this. Other people mainly wanted choice 1 above, and said this was the original intention. Any deprecation would be done by a new standard defining two new replacements; this new standard is not the MSGFMT document.

Left to do for MSGFMT document

Examples.

Contributions desired.

Registry of e-mail headers

IANA registry is wanted. IANA wants rules what to accept. A mailing list for community review, reasonable headers accepted, area director decides in controversial cases. This is part of the DRUMS work.

We discussed how strong the control should be on new e-mail headers before acceptance in the registry, and concluded that control of the same kind as is presently used for registration of new MIME subtypes is suitable. Stronger control would cause the registry to be too little used, and the goal of the registry (to reduce the risk of synonyms and homonyms in header names) would not be fulfilled. Weaker control might get too many unreasonable headers registered.

A new version of the IETF draft on this issue is available.
