By Jacob Palme, e-mail: jpalme@dsv.su.se, at the research group for CMC (Computer Mediated Communication) in the Department of Computer and Systems Sciences at Stockholm University and KTH.
MHTML is the IETF working group for developing standards for sending HTML-formatted text in e-mail.
5.1 By exact matches, do we mean case-sensitive matches with no resolution, for example of "file%20name" to "file name"? Note: This should not be a problem if standards are adhered to, since spaces are not legal in URLs. However, it is accepted practice for Web browsers to accept many kinds of illegal URLs, and the two most widely used products both accept spaces in URLs in hyperlinks in HTML documents. How should such a URL be handled in the Content-Location statement? Should the space be converted to %20 (in which case the words about exact matching in mhtml-spec chapter 8.2.2 must be changed), or should it be put in illegal format in the Content-Location header, too?
5.2 Does this apply only to relative Content-Locations without any Content-Base? Should we say something about exactness of matchings when URL-s are resolved using a Content-Base? If so, what?
5.3 What about the case where the URL is relative and unresolvable in the header, but absolute in the HTML text? The present spec does not say what should be done in that case.
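The difference between the two readings of "exact match" in 5.1 can be sketched as follows. This is only an illustration, not text from mhtml-spec: the function names and the sample file names are hypothetical, and the "liberal" variant is just one possible interpretation (percent-decoding both sides before comparing).

```python
from urllib.parse import unquote


def exact_match(link: str, content_location: str) -> bool:
    """Strict reading: the two strings must be identical, character for character."""
    return link == content_location


def decoded_match(link: str, content_location: str) -> bool:
    """Hypothetical liberal reading: percent-decode both sides, then compare."""
    return unquote(link) == unquote(content_location)


# Hypothetical sample values: a link with a raw (illegal) space in the HTML text,
# and a Content-Location header in which the space was converted to %20.
link_in_html = "file name.gif"
header_value = "file%20name.gif"

print(exact_match(link_in_html, header_value))    # False: the strings differ
print(decoded_match(link_in_html, header_value))  # True: equal after decoding
print(exact_match("Picture.gif", "picture.gif"))  # False: comparison is case sensitive
```

Under the strict reading, a sending agent that converts the space to %20 in the header but leaves the HTML text untouched produces a message whose link no longer matches its body part.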
If there is both a Content-Base and a Content-Location header, which of them should take precedence in resolving URL-s in the HTML content?
Should the Content-Base and Content-Location be allowed in cases where they do not influence functionality, as a way of informing the reader that a body part was taken from a certain web location?
Any reason to remove this passage in RFC 2110 section 4.1:
These two headers may occur both inside and outside of a multipart/related part.
JP comment: The statement is true. The specific usage of Content-Base and Content-Location described in RFC 2110 SHOULD only occur inside Multipart/related, but these two headers can also occur as information to the reader that the body part is also available at a certain URL. And since Text/html can occur outside of Multipart/related (Multipart/related is only needed when the Text/html contains links to other body parts in the same message), Content-Base and Content-Location can also occur outside of Multipart/related, and in my opinion this text should not be removed. Possibly we could change the paragraph to the following.
These two headers may occur both inside and outside of a multipart/related part, but their usage for handling HTML links between body parts in a message SHOULD only occur inside Multipart/related.
Should we allow the same Content-Location on two body parts, if they resolve to different URLs (last paragraph of section 7 in mhtml-spec)?
Suggestion: Yes.
Suppose there are two body parts in a multipart/related. One of them has a Content-Base statement, the other does not.
Example:
Part 1:
   Content-Type: Text/html
   Content-Base: http://foo.net

   <IMG SRC="picture.gif">

Part 2:
   Content-Type: Image/gif
   Content-Location: picture.gif
In this case, should relative-to-absolute conversion take place on "picture.gif" in Part 1, so that it will not match the relative URL in Part 2?
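The consequence of converting first can be checked with a small sketch. It uses Python's standard `urljoin` as a stand-in for the relative-to-absolute conversion; the variable names are hypothetical, and the values are those of the example above.

```python
from urllib.parse import urljoin

content_base = "http://foo.net"   # Content-Base of Part 1
link_in_part1 = "picture.gif"     # IMG SRC in Part 1
location_part2 = "picture.gif"    # Content-Location of Part 2, which has no Content-Base

# Relative-to-absolute conversion of the link, using the Content-Base of Part 1.
resolved_link = urljoin(content_base, link_in_part1)
print(resolved_link)  # http://foo.net/picture.gif

# Comparing the unresolved strings gives a match ...
print(link_in_part1 == location_part2)  # True

# ... but after conversion the link no longer matches the relative
# Content-Location of Part 2, which cannot be resolved against any base.
print(resolved_link == location_part2)  # False
```

So if conversion is applied to Part 1 only, the image in Part 2 is not found, even though the two relative URLs are textually identical.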
Should the standard include the new chapter "13. Robustness Principle" as suggested in draft-ietf-mhtml-spec-07, or should this chapter be put into the informational draft draft-ietf-mhtml-info, or not be published at all?
Note: Compare the present work in the IETF DRUMS working group, where this kind of information is included, under the title "4. Obsolete Syntax", in the standard-to-be draft-ietf-drums-msg-fmt.
Every single subchapter in chapter 13. Robustness Principle is controversial, and we should decide for or against each of them (this applies whether this chapter goes into the standard or the informational document).
Should liberal implementations accept input where the type parameter is wrong or omitted?
Should liberal implementations accept input where the type parameter is not quoted?
Should liberal implementations accept input where the start parameter is not quoted with angle brackets?
Should liberal implementations accept and try to use, if necessary, Content-Base and Content-Location headers in multipart headings?
Any reason to change this passage in RFC 2110 section 4.1:
These two headers are valid only for exactly the content heading or message heading where they occurs and its text. They are thus not valid for the parts inside multipart headings, and are thus meaningless in multipart headings.
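As an illustration of what "liberal" acceptance could mean for the question above about a start parameter that is not quoted or lacks angle brackets, here is a minimal sketch. The function name and the sample Content-ID value are hypothetical, and the sketch does not claim to reflect what any existing implementation or the draft actually requires.

```python
def normalize_start(raw: str) -> str:
    """Strip optional surrounding double quotes, then optional angle brackets."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    if len(value) >= 2 and value[0] == "<" and value[-1] == ">":
        value = value[1:-1]
    return value


# Hypothetical variants of the same start parameter value, from the fully
# quoted form with angle brackets down to the bare form.
variants = [
    '"<part1@foo.net>"',
    "<part1@foo.net>",
    '"part1@foo.net"',
    "part1@foo.net",
]

# A liberal receiver would treat all four variants as the same Content-ID.
print({normalize_start(v) for v in variants})  # {'part1@foo.net'}
```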
Can some of the implementors who have executable code which can check examples provide better examples? By better examples I mean examples which are both correct and clarify the controversial points.
Is it time now to publish draft-ietf-mhtml-info-06.txt as an informational RFC?