--------------------------------------------------------------

Changes between draft-ietf-mhtml-rev-01.txt and
draft-ietf-mhtml-rev-02.txt:

--- The word "email" has been removed in many places, since this
standard can also be used when sending HTML in multipart MIME
format using protocols other than SMTP. Examples:

--- The title has been changed ---

from

   "MIME E-mail Encapsulation of Aggregate Documents, such as HTML
   (MHTML)"

to

   "MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)"

--- "To send such documents in MIME-formatted e-mail messages"
changed to "To send such documents in MIME-formatted messages"

--- Added text: "Also with other protocols such as HTTP or FTP, it
can sometimes be desirable to send several documents in one
aggregate document."

--- change "many mail programs" to "some receiving agents".

--- change "mail" to "message".

--- change "mailer" to "receiving agent".

--- change "mail" to "send".

--- other changes ---

The definitions of words like MUST, MUST NOT, etc. have been
removed and replaced by a reference to RFC 2119.

--- clarifying change ---

   These two headers are valid only for exactly the content heading
   or message heading where they occur and its text. They are thus
   not valid for the parts inside multipart headings. They are
   allowed, but cannot be used for resolution, when they occur in
   multipart headings.

to

   These two headers are valid for the content heading or message
   heading where they occur and its text. If they occur in multipart
   headings, they apply to its body parts only in that they can be
   used to derive a base for relative URIs in the body parts, but
   only if no such base is provided in the body part itself.

--- clarifying change ---

   These two headers may occur both inside and outside of a
   "multipart/related" part, but their usage for handling HTML links
   between body parts in a message SHOULD only occur inside
   "multipart/related".

to

   These two headers may occur on any message or content heading,
   but their usage for handling hyperlinks between body parts in a
   message SHOULD only occur inside the same "multipart/related".

--- change at the end of 4.1 ---

   where URI is at present (June 1996) restricted to the syntax for
   URLs as defined in Uniform Resource Locators [URL].

to

   where URI is restricted to the syntax for URLs as defined in
   Uniform Resource Locators [URL] until IETF specifies other kinds
   of URIs.

--- change in 4.4.1 ---

   Note that for the MHTML processing of matching URLs in body text
   to URL in Content-Location headers the value of the charset
   parameter is irrelevant, but it may be relevant for other
   purposes, and incorrect labeling MUST therefore be avoided.

changed to

   Note that for the MHTML processing of matching URLs in body text
   to URL in Content-Location headers the value of the charset
   parameter is irrelevant, but it may be relevant for other
   purposes, and incorrect labeling MUST therefore be avoided.
   Warning: Irrelevance of the charset parameter may not be true in
   the future, if different character encodings of the same
   non-English filename are used in HTML.

--- change in 4.4.2 ---

   Encoding as discussed in clause 4.4.1 MUST be done before such
   folding. After that, the folding can be done, using the algorithm
   defined in [URLBODY] section 3.1.

to

   Encoding as discussed in clause 4.4.1 MUST be done before such
   folding. This MUST include encoding of space characters, if any.
   After that, the folding can be done, using the algorithm defined
   in [URLBODY] section 3.1.

--- added text ---

   The URI in the Content-Location header may, but need not, refer
   to an object which is actually available globally for retrieval
   using this URI (after resolution of relative URIs). However, URIs
   in Content-Location headers (if absolute, or resolvable to
   absolute URIs) SHOULD still be globally unique.
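
The 4.4.2 change above requires that space characters be encoded
before a header value is folded. The following is a minimal sketch of
that encoding step only, assuming that the encoding of clause 4.4.1 is
the [MIME3] encoded-word form with "Q" encoding. The function names
and the example URI are hypothetical, and the folding algorithm of
[URLBODY] section 3.1 is deliberately not shown.

```python
from email.header import decode_header
from email.quoprimime import header_encode


def encode_location(uri, charset="us-ascii"):
    """Return uri as a single "Q" encoded-word; spaces become "_"."""
    return header_encode(uri.encode(charset), charset=charset)


def decode_location(value):
    """Undo encode_location, as a receiving agent might before matching."""
    return "".join(
        raw.decode(cs or "us-ascii") if isinstance(raw, bytes) else raw
        for raw, cs in decode_header(value)
    )


# Hypothetical URI containing a space character.
uri = "http://www.example.com/my file.gif"
encoded = encode_location(uri)

# The encoded value contains no literal space, so a later folding
# step cannot be confused by one.
print(encoded)
print(decode_location(encoded) == uri)
```

Because the space is encoded before any folding takes place, white
space that folding introduces later cannot be mistaken for part of
the URI.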
--- added text ---

   There MUST only be a single Content-Location header in each
   message or content-heading, and its value is a single URI. Note
   however, that both one Content-Location and one Content-ID or
   Message-ID header are allowed. In such a case, these will
   indicate two different, equally valid references for this body
   part, and any of them may be used in other body parts within one
   "multipart/related" to refer to this body part.

--- added text ---

   Handling of URIs containing inappropriate characters

   Some documents may contain URIs with characters that are
   inappropriate for an RFC 822 header, either because the URI
   itself has an incorrect syntax according to [URL] or the URI
   syntax standard has been changed to allow characters not
   previously allowed in MIME headers. These URIs cannot be sent
   directly in a message header. There are two approaches that can
   be taken when encountering such a URI as the text to be placed in
   a Content-Location or Content-Base header:

   a) In some situations, an implementation might be able to replace
      the URL with one that can be sent directly. This might be
      accomplished, for example, by using the encoding method of
      [URL] to replace inappropriate characters within the URL with
      ones encoded using the %nn encoding. This replacement MUST in
      that case be done both in the header and in the HTML text
      which has a hyperlink which is to match the header. Since the
      change is done in both places, a receiving agent need not
      decode it, and MUST NOT decode [URL]-encoding before matching
      hyperlinks to body parts.

   b) The URL might be encoded using the method described in
      [MIME3]. This replacement MUST only be done in the header, not
      in the HTML text. Receiving clients must decode the [MIME3]
      encoding in the heading before comparing hyperlinks in body
      text to URLs in Content-Location headers.

   With method (b), the charset parameter value "US-ASCII" SHOULD be
   used if the URL contains no octets outside of the 7-bit range.
   If such octets are present, the correct charset parameter value
   (derived e.g. from information about the HTML document the URL
   was found in) SHOULD be used. If this cannot be safely
   established, the value "UNKNOWN-8BIT" [RFC 1428] MUST be used.

   Note that for the MHTML processing of matching URLs in body text
   to URL in Content-Location headers the value of the charset
   parameter is irrelevant, but it may be relevant for other
   purposes, and incorrect labeling MUST therefore be avoided.
   Warning: Irrelevance of the charset parameter may not be true in
   the future, if different character encodings of the same
   non-English filename are used in HTML.

   Caution should be taken in using method (a), since, in general,
   this encoding cannot be applied safely to characters that are
   used for reserved purposes within the URL scheme. In addition,
   changing the HTML body which contains the URL might invalidate a
   message integrity check. Because of these problems, this method
   SHOULD only be used if it is performed in cooperation with the
   author/owner of the documents involved.

--- change ---

   Relative URIs inside contents of MIME body parts are resolved
   relative to a base URI. In order to determine this base URI, the
   first-applicable method in the following list applies.

to

   Relative URIs inside contents of MIME body parts are resolved
   relative to a base URI using the methods for resolving relative
   URIs described in [RELURL]. In order to determine this base URI,
   the first-applicable method in the following list applies.

--- change at the end of section 5 ---

   When the methods above do not yield an absolute URI the procedure
   in section 8.2 for matching relative URIs MUST be followed.

to

   (d) Step (b) and (c) can be repeated recursively on Content-Base
       and Content-Location headers in surrounding multi-part
       headings. However, a base from an absolute Content-Location
       in an inner heading takes precedence over a base from a
       Content-Base or a Content-Location in a surrounding heading.

   When the methods above do not yield an absolute URI, matching of
   two relative URIs against each other can still be done for
   matches within a multipart/related. This matching is done as if
   they had been given as base an imaginary URL "This_message:/",
   which exists for the sole purpose of resolving relative
   references within a multipart/related entity. This is also
   described in other words in section 8.2 below.

--- add ---

   Even though Content-Location and Content-Base can occur without
   multipart/related, this standard only covers their use for
   resolution of links between body parts inside one
   multipart/related. This standard does not cover links from one
   multipart/related to another multipart/related in a message
   containing multiple multipart/related objects.

--- add in section 6 ---

   (c) To send a document in a format which is preserved even if the
       object to which the hyperlinks refer through HTTP is later
       changed or deleted.

--- add in section 7 ---

   Two body parts in the same multipart/related can have the same
   relative URI as value of their Content-Location headers only if
   their headings contain different Content-Base headers, so that
   the absolute URI after resolution against the Content-Base header
   is different.

--- in section 8.1 change ---

   A body part, such as a text/html body part, may contain
   hyperlinks to objects which are included as other body parts in
   the same message and within the same "multipart/related" content.
   Often such linked objects are meant to be displayed inline to the
   reader of the main document; for example, objects referenced with
   the IMG tag in HTML 2.0 [HTML2]. New tags with this property are
   proposed in the ongoing development of HTML (example: applet,
   frame).

to

   A body part, such as a text/html body part, may contain
   hyperlinks to objects which are included as other body parts in
   the same message and within the same "multipart/related" content.
   Often such linked objects are meant to be displayed inline to the
   reader of the main document; for example, objects referenced with
   the src attribute of the IMG element in HTML 2.0 [HTML2]. New
   elements and attributes with this property are proposed in the
   ongoing development of HTML (example: applet, frame, profile,
   OBJECT, classid, codebase, data, SCRIPT). A sender might also
   want to send a set of HTML documents which the reader can
   traverse, and which are related with the attribute href of the A
   element.

--- add section 11 ---

Add IETF copyright clause.

--- in many places ---

"URL" has been changed to "URI" in many places. I hope I have got
it right!
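
As an illustration of the base-determination rules changed at the end
of section 5, the following is a minimal sketch, not a definitive
implementation. It assumes each heading is available as a plain
dictionary of header values, ordered from the innermost heading
outward; the function names and example URIs are hypothetical. A base
given inside the body part itself is not modeled. Because Python's
urljoin resolves only against schemes it knows, a placeholder base on
the reserved ".invalid" domain stands in for the imaginary
"This_message:/" URL.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: stand-in for the imaginary base "This_message:/".
FALLBACK_BASE = "http://this-message.invalid/"


def is_absolute(uri):
    """A URI is treated as absolute when it carries a scheme."""
    return bool(urlsplit(uri).scheme)


def find_base(headings):
    """Return the first applicable base, walking from inner to outer.

    An inner heading is examined before a surrounding one, so an
    absolute Content-Location in an inner heading wins over a
    Content-Base or Content-Location in a surrounding heading.
    """
    for heading in headings:
        base = heading.get("Content-Base")
        if base:
            return base
        location = heading.get("Content-Location")
        if location and is_absolute(location):
            return location
    return FALLBACK_BASE


def resolve(uri, headings):
    """Resolve uri against the base derived from the given headings."""
    return urljoin(find_base(headings), uri)


# Hypothetical message: an HTML part and an image part inside one
# multipart/related whose heading carries a Content-Base.
outer = {"Content-Base": "http://www.example.com/doc/"}
html_part = {"Content-Location": "http://www.example.com/doc/index.html"}
image_part = {"Content-Location": "images/logo.gif"}

link = resolve("images/logo.gif", [html_part, outer])
target = resolve(image_part["Content-Location"], [image_part, outer])
print(link == target)

# With no base anywhere, two relative URIs can still be matched.
print(resolve("a.gif", [{}]) == resolve("a.gif", [{}]))
```

Resolving both the hyperlink and the Content-Location value to the
same form before comparing them is what allows two relative URIs to
match even when no absolute base is available.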