I have just submitted two new IETF drafts from the MHTML working group on sending HTML via e-mail.
The two new drafts are:
draft-ietf-mhtml-spec-04.txt
and
draft-ietf-mhtml-info-04.txt
You can download them from the following anonymous FTP URLs:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-spec-04.txt
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-info-04.txt
This message summarizes the major changes to the documents. There are also a number of minor wording changes, which are not documented in this message.
An informational RFC [MHTML-INFO] will be published as a supplement to this standard. The informational RFC will discuss implementation methods and some implementation problems. Implementors are recommended to read this informational RFC when developing implementations of the MHTML standard.
In certain special cases this will not work if the original HTML document contains URIs as parameters to objects and applets. In such a case, it might be better to rewrite the document before sending it. This problem is discussed in more detail in the informational RFC which will be published as a supplement to this standard.
Some WWW applications hide passwords and tickets (access tokens to information which may not be available to everyone) and other sensitive information in hidden fields in the web documents or in on-the-fly constructed URLs. If a person gets such a document and forwards it via e-mail, the person may inadvertently disclose sensitive information.
[MHTML-INFO] J. Palme: "Sending HTML in E-mail, an informational supplement to RFC ???: MIME E-mail Encapsulation of Aggregate HTML Documents (MHTML)", to be published as an informational supplement to the MHTML standard.
The sections have been renumbered.
problems with rewriting of URIs
5. Problems with rewriting URIs when copying HTML documents

Sending of HTML-formatted messages is based on the assumption that an HTML document, together with in-line objects like images, applets and frames, can be copied into an e-mail message. Such copying may require rewriting of URIs containing references between the different message parts. The MHTML standard [MHTML] has been carefully prepared to allow existing web pages to be copied without such rewriting, through the use of the Content-Base and Content-Location MIME content heading fields.

There is however a problem if the source HTML document contains relative URIs in parameters to objects and applets, such as in the example below:

   From: foo1@bar.net
   To: foo2@bar.net
   Subject: A simple example
   Mime-Version: 1.0
   Content-Type: multipart/related; boundary="boundary-example-1";
                 type=Text/HTML
   Content-Base: "http://www.ietf.cnri.reston.va.us"

   --boundary-example-1
   Content-Type: Text/HTML; charset=US-ASCII

   ... text of the HTML document...
   <OBJECT CLASSID = "clsid:5220cb21-c88d-11cf-b347-00aa00a28331">
   <PARAM NAME="imageurl" VALUE="image.gif">
   </OBJECT>
   ...etc...

   --boundary-example-1
   Content-Location: "image.gif"
   Content-Type: IMAGE/GIF
   Content-Transfer-Encoding: BASE64

   R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
   ..etc...

   --boundary-example-1--

Only the object might know that the imageurl parameter is a relative URI. It is nearly impossible for the HTML parser to understand that the parameter is a relative URI. Simply searching for "image.gif" is not robust, as the string "image.gif" may be used elsewhere. URIs in scripts can have similar problems.

One might envisage even more difficult cases: an applet might take a parameter "subject" and another parameter "range", and when subject="auto" and range="1-5" it could compute, and try to use, auto1.gif, auto2.gif ... auto5.gif as relative URLs.
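The difficulty can be illustrated with a minimal Python sketch. It is not taken from the drafts: the class UriCollector, the function collect_uris, the short list of URI-carrying attributes and the file name logo.gif are all assumptions made for illustration. A generic parser can resolve the attributes that HTML defines as URIs against the Content-Base, but it has no way of knowing that the VALUE of the PARAM element is also a relative URI.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes that HTML itself defines as carrying URIs. This is a short,
# illustrative list chosen for the sketch, not one given in the drafts.
URI_ATTRIBUTES = {("img", "src"), ("a", "href"), ("frame", "src")}


class UriCollector(HTMLParser):
    """Collect the URIs that a generic HTML parser can recognise as URIs."""

    def __init__(self, content_base):
        super().__init__()
        self.content_base = content_base
        self.resolved = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        for name, value in attrs:
            if (tag, name) in URI_ATTRIBUTES and value is not None:
                self.resolved.append(urljoin(self.content_base, value))


def collect_uris(html, content_base):
    """Return the absolute form of every URI the parser could identify."""
    collector = UriCollector(content_base)
    collector.feed(html)
    collector.close()
    return collector.resolved


# logo.gif is a hypothetical extra image added to show a recognisable URI.
html = (
    '<IMG SRC="logo.gif">'
    '<OBJECT CLASSID="clsid:5220cb21-c88d-11cf-b347-00aa00a28331">'
    '<PARAM NAME="imageurl" VALUE="image.gif">'
    "</OBJECT>"
)
found = collect_uris(html, "http://www.ietf.cnri.reston.va.us/")
print(found)
```

In this sketch the IMG reference is resolved, while image.gif never appears in the result, because nothing in the markup says that the imageurl parameter holds a URI.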
Some implementation methods described in chapter 4 above, for example method 2 described in chapter 4.2, may require rewriting of the URIs in the HTML document. There is no perfect solution to this problem. One way of alleviating the problem is to produce the original document using only absolute URIs, preferably of the CID type, since they are more easily identifiable. Another way of alleviating the problem is to make all URIs and Content-Locations into simple relative URIs containing file names only (without paths, preferably using a file name format common to most platforms, i.e. 1-6 ASCII letters or digits, a period, and 1-3 extension ASCII letters or digits). An implementation using method 2 described in chapter 4.2 above can then just store the parts as files in an empty directory on the recipient computer with the Content-Locations as file names, and then turn the start HTML file over to a web browser, and need not rewrite the URIs at all. This simple variant of use of the MHTML standard is probably the most robust, and those implementors who can control the production of the HTML documents to be sent as e-mail are thus recommended to use this variant.
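The simple variant above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method of any particular implementation: the function name unpack_simple_mhtml, the sample message EXAMPLE and the file name index.htm are hypothetical, and the regular expression is one possible reading of the file name format described in the text.

```python
import os
import re
import tempfile
from email import message_from_string

# One reading of the format in the text: 1-6 ASCII letters or digits,
# a period, and an extension of 1-3 ASCII letters or digits.
SIMPLE_NAME = re.compile(r"[A-Za-z0-9]{1,6}\.[A-Za-z0-9]{1,3}")


def unpack_simple_mhtml(raw_message, target_dir):
    """Write each body part to target_dir, named by its Content-Location.

    Returns the file names written, in message order. Raises ValueError
    if a Content-Location is not a simple file name, since this variant
    depends on never having to rewrite a URI.
    """
    message = message_from_string(raw_message)
    written = []
    for part in message.walk():
        if part.is_multipart():
            continue  # only leaf parts carry content to store
        location = (part.get("Content-Location") or "").strip().strip('"')
        if not SIMPLE_NAME.fullmatch(location):
            raise ValueError("not a simple file name: %r" % location)
        payload = part.get_payload(decode=True)  # undoes base64 if present
        with open(os.path.join(target_dir, location), "wb") as handle:
            handle.write(payload)
        written.append(location)
    return written


# Hypothetical two-part message; the GIF data is a truncated stub.
EXAMPLE = "\n".join([
    "Mime-Version: 1.0",
    'Content-Type: multipart/related; boundary="boundary-example-1";'
    ' type="text/html"',
    "",
    "--boundary-example-1",
    "Content-Type: text/html; charset=US-ASCII",
    "Content-Location: index.htm",
    "",
    '<IMG SRC="image.gif">',
    "--boundary-example-1",
    "Content-Type: image/gif",
    "Content-Location: image.gif",
    "Content-Transfer-Encoding: base64",
    "",
    "R0lGODlhAQABAAAAACw=",
    "--boundary-example-1--",
    "",
])

with tempfile.TemporaryDirectory() as directory:
    names = unpack_simple_mhtml(EXAMPLE, directory)
    print(names, sorted(os.listdir(directory)))
```

Because the relative URI in the HTML part and the name of the stored file are identical, a browser opening index.htm from that directory finds image.gif without any rewriting. Rejecting any Content-Location that contains a path also keeps files from being written outside the target directory.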