Using HTML in E-mail

User Info
Sending HTML in E-mail, Receiving HTML in E-mail, Choice of E-Mail Program

Sending Tables in E-mail

The IETF Standard
Short summary, Test programs and implementation status, Mailing List, Published standards documents - Corrections to the published standards - Implementation advice, Meetings

Other information

Last update: 11-Jun-2005 by Jacob Palme E-mail: jpalme@dsv.su.se.


Short Summary of the MHTML Standard

The main idea of the MHTML standard is that you send an HTML document, together with in-line graphics, applets, etc., and also other linked documents if you so wish, in a MIME multipart/related body part. Links in the HTML to other included parts can be provided by CID (Content-ID) URLs or by any other kind of URI, and the linked body part is identified in its heading by either a Content-ID (linked to by CID URLs) or a Content-Location (linked to by any other kind of URL). (In fact, the header "Content-ID: <foo@bar>" can be seen as a special case of the header "Content-Location: cid:foo@bar".)
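As a rough illustration of the structure described above, the following sketch uses Python's standard email package to build a multipart/related message whose HTML part links to an inline image through a CID URL. The Content-ID value foo@bar, the subject line and the image bytes are placeholders invented for this example; they are not taken from the standard.

```python
from email.message import EmailMessage

# Placeholder content: the Content-ID "foo@bar" and the image bytes are
# assumptions made up for this illustration.
html = '<html><body><img src="cid:foo@bar" alt="logo"></body></html>'
gif_bytes = b"GIF89a"  # stand-in for real image data

msg = EmailMessage()
msg["Subject"] = "MHTML sketch"
msg.set_content(html, subtype="html")

# add_related() turns the message into multipart/related and attaches the
# image as a further body part that carries the given Content-ID header.
msg.add_related(gif_bytes, maintype="image", subtype="gif", cid="<foo@bar>")

parts = list(msg.iter_parts())
print(msg.get_content_type())
print(parts[1]["Content-ID"])
```

The HTML refers to the image as cid:foo@bar, while the image part itself is labelled with the bracketed form <foo@bar> in its Content-ID header.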

The Content-Location header identifies a URI for a content part; the part need not be universally retrievable using this URI.

The Content-Base header identifies a base URI for a content part, and for all objects within it which do not have their own Content-Base. (This header was defined in RFC 2110 but has been removed in RFC 2557.)

URIs in HTML-formatted messages and in Content-Location headers can be absolute or relative. If they are relative, and a base can be found, they are to be converted to absolute URIs before matching with other body parts. If no base can be found, then exact matching of the relative URIs in the HTML and the Content-Location of the linked parts is performed instead. The base can be found in a surrounding absolute Content-Location header.
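The matching rule in the preceding paragraph can be sketched as follows. This is only an illustration under simplifying assumptions: the function name find_part, the dictionary of parts and the example URLs are hypothetical, and the body parts are reduced to a mapping from Content-Location value to payload.

```python
from urllib.parse import urljoin


def find_part(href, parts, base=None):
    """Return the payload whose Content-Location matches href, or None.

    parts maps Content-Location header values to payloads (an assumed,
    simplified stand-in for the body parts of a multipart/related).
    """
    if base is not None:
        # A base was found: compare after converting both sides to absolute URIs.
        target = urljoin(base, href)
        for location, payload in parts.items():
            if urljoin(base, location) == target:
                return payload
        return None
    # No base available: fall back to exact string matching.
    return parts.get(href)


parts = {"images/logo.gif": "GIF-DATA"}
print(find_part("http://foo.bar/images/logo.gif", parts,
                base="http://foo.bar/index.html"))
print(find_part("images/logo.gif", parts))
```

With a base, the absolute reference in the HTML matches the relative Content-Location once both are resolved; without a base, only an identical relative string matches.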

An example of an e-mail message, as it might look in plain ASCII text and in HTML format.

Advice on How to Use HTML in E-mail

Some advice for e-mail users about what MHTML can do for them, and how they should best use it with some popular e-mail clients.

Test programs and Implementation Status Reports

An evaluation of implementation status in the spring of 2000:

A full test implementation 2093 of a mailer which sends pages, taken from the web, in MHTML format without changing the HTML text, using the Content-Location statement for references between body parts.

Charter of the IETF Working Group

An IETF working group on this issue was approved in April 1996, with an initial charter in 1996. The working group charter was revised in May 1997, and the new charter is available in HTML or plain text format. A proposal for a revised charter is available.

Mailing List Information:

For information on how to subscribe to, unsubscribe from, and view the archives of the MHTML-L@SEGATE.SUNET.SE mailing list, go to http://segate.sunet.se/archives/mhtml-l.html

Older archives (before 2002) can be found at http://segate.sunet.se/archives/mhtml.html

Published Standards and other RFCs

The proposed standards from this working group have been published as RFCs:

RFC 2387
E. Levinson, "The MIME Multipart/Related Content-type", August 1998. (Obsoletes RFC 2112) (Status: PROPOSED STANDARD)
RFC 2392
E. Levinson, "Content-ID and Message-ID Uniform Resource Locators", August 1998. (Obsoletes RFC 2111) (Status: PROPOSED STANDARD)
RFC 2557
J. Palme, A. Hopmann, N. Shelness, "MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)", March 1999. (Obsoletes RFC 2110) (Status: PROPOSED STANDARD)

Corrections to the published standards

RFC 2557 has several examples where there are no angle brackets <> around the value of the "start" parameter of the Content-Type: Multipart/related header. This is incorrect; there should be angle brackets around the value of the start parameter.

Wrong:

Content-Type: Multipart/related; boundary="boundary-example-1";
        type=Text/HTML; start=foo3*foo1@bar.net

Right:

Content-Type: Multipart/related; boundary="boundary-example-1";
        type=Text/HTML; start=<foo3*foo1@bar.net>

Advice on Implementing MHTML

Not yet published as an RFC:
Sending HTML in E-mail, an informational supplement to RFC 2110: MIME E-mail Encapsulation of Aggregate HTML Documents (MHTML), by Jacob Palme. It can be retrieved by anonymous FTP in plain text format from:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-info.txt
and in HTML format as:
http://dsv.su.se/jpalme/ietf/mhtml-info.html

Differences between RFC 2557 and RFC 2110

The specification has been changed to show that the formats described do not only apply to multipart MIME in email, but also to multipart MIME transferred through other protocols such as HTTP or FTP.

In order to agree with [RELURL], Content-Location headers in multipart Content-Headings can now be used as a base to resolve relative URIs in their component parts, but only if no base URI can be derived from the component part itself. Base URIs in Content-Location header fields in inner headings have precedence over base URIs in outer multipart headings.
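The precedence order described in the paragraph above might be sketched like this. The function name pick_base and its arguments are assumptions made for illustration: own_base stands for a base derived from the component part itself, and enclosing_locations lists the Content-Location values of the surrounding headings, innermost first.

```python
from urllib.parse import urlsplit


def pick_base(own_base, enclosing_locations):
    """Choose the base URI for resolving relative URIs in a component part.

    own_base: base derived from the part itself, or None.
    enclosing_locations: Content-Location values of the enclosing headings,
    ordered from the innermost heading to the outermost (assumed layout).
    """
    if own_base:
        # A base derived from the part itself always wins.
        return own_base
    for location in enclosing_locations:
        # Only an absolute URI (one with a scheme) can serve as a base.
        if location and urlsplit(location).scheme:
            return location
    return None


print(pick_base(None, ["http://inner.example/", "http://outer.example/"]))
```

Listing the headings innermost first means the loop naturally gives inner Content-Location fields precedence over outer ones.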

The Content-Base header, which was present in RFC 2110, has been removed. A conservative implementor may choose to accept this header in input for compatibility with implementations of RFC 2110, but MUST never send any Content-Base header, since this header is not any more a part of this standard.

A section 4.4.1 has been added, specifying how to handle the case of sending a body part whose URI does not agree with the correct URI syntax.

The handling of relative and absolute URIs for matching between body parts has been merged into a single description, by specifying that relative URIs which cannot be resolved otherwise should be handled as if they had been given the base URL "thismessage:/".
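The effect of the "thismessage:/" default can be sketched as below. This is a simplified illustration, not the resolution algorithm of the standard: the function names are hypothetical, and the fallback merely prefixes the relative URI with "thismessage:/" without normalizing dot segments.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_BASE = "thismessage:/"


def absolutize(uri, base=None):
    """Make uri absolute, falling back to the thismessage:/ base."""
    if urlsplit(uri).scheme:
        return uri  # already absolute (http:, cid:, ...)
    if base:
        return urljoin(base, uri)
    # Simplification: plain prefixing, no handling of "." or ".." segments.
    return DEFAULT_BASE + uri.lstrip("/")


def same_resource(html_ref, content_location, base=None):
    """Compare a URI in the HTML with a Content-Location after resolution."""
    return absolutize(html_ref, base) == absolutize(content_location, base)


print(absolutize("images/a.gif"))
print(same_resource("my-name", "my-name"))
```

Because every URI ends up absolute, a single comparison covers both the case where a real base exists and the case where none can be found.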

The "Security considerations" section has been extended.

Some syntactic errors in the examples have been corrected. In particular, all parameter values to Content-Type must be quoted; some of them were incorrectly left unquoted in the previous version.

Differences between RFC 2110 and draft-ietf-mhtml-rev-06.txt

The specification has been changed to show that the formats described do not only apply to multipart MIME in email, but also to multipart MIME transferred through other protocols such as HTTP or FTP.

In order to agree with [RELURL], Content-Location headers in multipart Content-Headings can now be used as a base to resolve relative URIs in their component parts, but only if no base URI can be derived from the component part itself. Base URIs in Content-Location header fields in inner headings have precedence over base URIs in outer multipart headings.

The Content-Base header, which was present in RFC 2110, has been removed. A conservative implementor may choose to accept this header in input for compatibility with implementations of RFC 2110, but MUST never send any Content-Base header, since this header is not any more a part of this standard.

A section 4.4.1 has been added, specifying how to handle the case of sending a body part whose URI does not agree with the correct URI syntax.

The handling of relative and absolute URIs for matching between body parts has been merged into a single description, by specifying that relative URIs which cannot be resolved otherwise should be handled as if they had been given the base URL "this_message:/".

Some quotes were missing around required parameters to the Content-Type header; these have been added in the examples.

IETF Meeting Notes

Los Angeles, April 1998

Full text of the minutes

Draft agenda

  1. Any last minute detail problems in our final drafts.
  2. Review our Informational RFC Draft on Implementation Issues.
  3. Questionnaire to implementors about implementation status.
  4. Issues related to requirements for advancement to DRAFT status for MHTML RFCs.

Summary

The standard is in IESG last call, so there was no discussion of its content at this meeting. What we did was to discuss implementation status. Most implementors today support all the MHTML formats in input, but only produce HTML with Content-ID in output. The reason for this is that Netscape at present only supports HTML with Content-ID in input, and all implementors want to produce messages which can be read with Netscape. Netscape is, however, working on a fuller implementation of MHTML, and when that has been released, other e-mailers can be expected to start using Content-Location.

We discussed the questionnaire to be sent to implementors. I was asked to change the questionnaire to distinguish between what implementors provide with their own code, and what they provide with code from other vendors. The reason for this is that we want to find if there are two independent implementations of each protocol feature, and if vendor X uses vendor Y code, then vendor X is not an independent implementation from vendor Y.

We discussed the informational draft. We will try to get people in the working group to check it, so that it can be forwarded. If possible, the standard should contain a reference to the informational draft.

Washington D.C., December 1997

Agenda

Available at URL http://dsv.su.se/jpalme/ietf/draft-agenda-2-dec-1997.txt.

Summary

The MHTML standard has now been a proposed standard for almost a year, and a revised proposed standard will soon be submitted. The work is in the final stages of getting the new proposed standard ready. Issues discussed at this meeting were:

Status of implementations: We are ready to begin asking implementors to supply information about which features they have implemented, using a questionnaire I have prepared and which is available on the net.

Many implementors have chosen to just ignore information they cannot handle in HTML-formatted messages, not even telling users that information has been suppressed. I am not very happy with this, but since this is a rather common way of handling HTML, the group did not see any possibility to restrict it in the standard. One can only hope for more responsible implementors.

Another issue is whether to allow multiple Content-Location header fields in one Content-Heading. Alex Hopmann argued rather well for allowing this. He said that the same web document is often referred to using different URLs. If multiple Content-Location fields are not allowed, one might have to send the same document more than once in different body parts. The group accepted his position. The problem is that Content-Location is used to derive the base, and if there is more than one, it is unclear which of them should be used. Alex suggested, and we decided, that if you have multiple Content-Location fields, you either have to supply a Content-Base header field, or the bases derived from the different Content-Location fields must all lead to the same resources.

We discussed the usefulness of adding a third alternative to the Save command in Web browsers. Today, the alternatives are usually text and source, and a third alternative might be MIME. This would be different from source in that also all inline objects like graphics and applets are saved, so that the whole document is saved in the file, not only the HTML part of it. Implementors will probably develop this, but we did not feel much need to mention it in the standard, since it does not influence the protocol over the line.

Everyone agreed that a change in the original of any of the body parts sent in a MIME message should not cause a change in the message as shown to its reader. Only changes in parts that are not sent, but merely referred to (through URLs or message/external-body), can cause a message to change after transmission.

We did decide to say that our standard does not cover the combination of multipart/related with message/external-body.

We discussed nesting of multipart/related within multipart/related. Alex argued that an inner multipart/related should be allowed to contain URLs referring to body parts in an outer multipart/related. I am not sure what our decision was on this issue; I hope the minute-taker will clarify this.

I presented some ideas for making multipart/alternative more readable, but the ideas were not very popular. I will submit them in writing to the list, hoping that careful reading of the texts may cause people to like my ideas more.

We decided to have a small editing meeting later during the week (just me, Alex Hopmann and Nick Shelness), then to submit a new draft, and aim at starting working group last call on the 5th of January and then, two weeks later, forward the new proposed standard to IESG. After that, Einar Stefferud said that the working group might go into hibernating status.

Munich, August 1997

A revision of RFC 2110 is under way and will be discussed at the IETF meeting in Munich, August 1997. An issue list for this meeting is available.

A description of the differences between version 4 and version 3 of RFC 2110 and the info document is available at URL: http://dsv.su.se/jpalme/ietf/draft-ietf-mhtml-04.news.html.

A proposed standard for sending HTML in e-mail is out, and most of the major e-mail software vendors are busy implementing it (including Eudora, Microsoft, Netscape and others). Several problems have cropped up in the exact implementation of the proposed standard, and some bugs have been found, so we decided to develop a new proposed standard, which we hope to submit to the IESG for last call at the end of September 1997.

The most important issues were:

What should a mailer do when given an HTML document with faulty URLs in it?

Example:
Content-Type: Text/html
...
<IMG SRC="foo bar">
 
Content-Type: Image/gif
Content-Location: foo bar

We decided to accept the illegal URL and repeat it, if necessary, in the Content-Location statement too, as shown in the example above, rather than having to rewrite the HTML text. This is of course still illegal and not recommended.

Can you take base for relative URLs from other than the immediately surrounding heading?

Example:
Content-Type: Multipart/related
Content-Base: http://foo.bar/
 
Content-Type: Text/html
<IMG SRC="my-name">
 
Content-Type: Image/gif
Content-Location: my-name

Is the Content-Base on the Multipart/related to be used as a base for URLs in the sub-parts? RFC 2110 says no, draft-fielding-url-syntax-05 says yes. We decided to keep saying no, but to contact Fielding to ensure that both documents agree.

Memphis, April 1997

There was a short informal meeting on this issue during the IETF meeting in Memphis, April 1997.

San Jose, December 1996

Issues for the IETF meeting in San Jose, December 1996.

Here are my personal notes (not official minutes) from that meeting:

The group more or less finished its work at this meeting. The proposed standard is expected to be approved by the IESG shortly.

We discussed some late suggestions for change in the mhtml drafts:

  1. In certain special cases when relative URIs are generated by applets, it will be difficult to implement the standard.
  2. If the multipart/related has as its main part a multipart/alternative, the intention of the type parameter to multipart/related to aid one-pass resolution may not work as well as intended.

We decided not to change the document because of these possible problems. The only change we discussed was changing references to IETF drafts into references to RFCs. The only problem here is the IETF draft on internationalization of the WWW. If this document does not soon become an RFC, we have to remove the reference to it in the SPEC.

We decided to keep the INFO document in an IETF draft stage for at least six months more, so that implementation experience can be gained. This means that the reference in the SPEC to the INFO document has to be removed.

One implementation is already complete, the one made by Mark K. Joseph. He has even implemented the complex case of a multipart/alternative inside a multipart/related. Half a dozen other implementations are coming soon, so interoperability tests can soon start.

It would be nice to have a set of test messages to test implementations of the standard. I promised to provide storage for such test cases on our HTTP or FTP servers, but who will develop the test messages? Test messages generated by real implementations have the advantage of aiding in interoperability testing, and are also less error-prone. Manually created test cases have the advantage of allowing the creation of very special combinations which no existing mail UA may yet be able to generate. I did not promise to develop such test cases.

Montreal, June 1996

Here is a list of issues from the Montreal, June 1996 meeting (version three).

Here is the agenda for that meeting (version two).

Here are minutes from that meeting.

The most important decision taken at the Montreal meeting was on the method for matching of URIs in HTML documents and in Content-Location statements. The Montreal meeting decided that this matching will be done after resolution of relative into absolute URIs, which is different from the draft which was input to the meeting.

Other decisions at the Montreal meeting were

  • to ask IETF to promote The MIME Multipart/Related Content-type by E. Levinson (ftp://nic.nordu.net/internet-drafts/draft-ietf-mhtml-related-00.txt) from experimental to proposed standard.
  • that Content-Base is only valid for exactly the body part where it occurs (not for subparts).
  • that Content-Location can be relative and need not always be resolved by Content-Base into an absolute URI.
  • that we will reference N. Freed and Keith Moore: "Definition of the URL MIME External-Body Access-Type", draft-ietf-mailext-acc-url-01.txt, November 1995 on how to encode URLs in e-mail headers.
  • that references in HTML documents to body parts outside the Content-Type: Multipart/related are allowed but not recommended.
  • not to implement any special support for fast rendering (the includes parameter and the catalogue body part).
  • that we will reference Harald Tveit Alvestrand, Edward Levinson: "The MIME Multipart/Related Content-type", <draft-levinson-multipart-related-00.txt>, January 1995 on the issue of the relation to the Content-Disposition header and not duplicate this in our document.
  • to give a clearer description of the relation between MIME encodings and HTML mappings of non-us-ascii characters.
  • that the informational document will be published, but not as rapidly as the proposed standard.
  • that we will very soon go to working group last call on the proposed standard.

Other related IETF Standards Work

See URL: http://dsv.su.se/jpalme/ietf/jp-ietf-home.html.