MIME standard header fields

This section is a summary of the common MIME headers and may be useful as a quick reference. It is not a definitive specification of MIME. In some cases the MIME parser allows documents which are not strictly valid according to the standard. For example, it does not insist on the presence of a MIME-Version header. All the standard MIME header fields are simply written to the logical tree as they appear in the MIME document. The MIME parser only takes special note of the Content-Type header field.

All MIME headers may include comments enclosed by parentheses as shown in the example for the MIME-Version header.

MIME header fields

MIME-Version

Example:

MIME-version: 1.0 (generated by my-application 1.2)

For a MIME document to conform with RFC 2045, this field is required in the top-level header with a value of 1.0. MIME-Version should not be specified on individual parts.

Content-Type

Content-Type is not required for a document to conform with RFC 2045, but a top-level Content-Type is required by the MIME parser. Content-Type defaults to text/plain. Content-Type defines the type of data in each part as a type/subtype. The MIME parser accepts most values for Content-Type and simply stores them in the logical tree. The only exceptions are:

  • The MIME parser rejects any Content-Type value with type = message.
  • The MIME parser assumes that a Content-Type value with type = multipart introduces a multipart MIME document and rejects such a value if it does not contain a valid boundary parameter. The value of the boundary parameter defines the separator between message parts in a multipart message. In a nested multipart message, a unique boundary value is needed for each nesting level.

Syntax:

Content-Type: type/subtype;parameter

Where type and subtype define the Content-Type and any optional parameters are delimited by semicolons.

Example 1:

Content-Type: multipart/related;type=text/xml

In example 1 the Content-Type is defined as multipart/related and also has an optional parameter definition (type=text/xml). Although this is syntactically correct, the message is rejected because there is no valid boundary parameter.

Example 2:

Content-Type: multipart/related;boundary=Boundary;type=text/xml

Example 2 shows a valid Content-Type definition, both in terms of syntax and semantics. The boundary value may optionally be enclosed in quotes. When it appears in the MIME body the value is preceded by the sequence '--' and care must be taken that the resulting value (in this example it would be --Boundary) cannot appear in the message body. If the message data is encoded as quoted-printable, you should have a boundary that includes a sequence such as "=_", which cannot appear in a quoted-printable body.
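
The two Content-Type rules above can be illustrated with a minimal Python sketch. It uses the standard library email package, not the MIME parser described in this topic, and the function name check_content_type is hypothetical. It only checks that a boundary parameter is present; it does not validate the boundary characters.

```python
from email.message import Message


def check_content_type(value):
    """Apply the two rules described above to a Content-Type header value.

    Hypothetical helper for illustration only. Returns a tuple of
    (type/subtype, boundary or None) or raises ValueError.
    """
    msg = Message()
    msg["Content-Type"] = value

    maintype = msg.get_content_maintype()
    if maintype == "message":
        raise ValueError("Content-Type with type = message is rejected")

    # get_boundary() returns None when there is no boundary parameter
    # and removes optional surrounding quotes when there is one.
    boundary = msg.get_boundary()
    if maintype == "multipart" and not boundary:
        raise ValueError("multipart Content-Type has no boundary parameter")

    return msg.get_content_type(), boundary


# Example 2 is accepted. In the body, parts would be separated by
# the boundary preceded by "--".
ctype, boundary = check_content_type(
    "multipart/related;boundary=Boundary;type=text/xml"
)
print(ctype, "--" + boundary)

# Example 1 is rejected because it has no boundary parameter.
try:
    check_content_type("multipart/related;type=text/xml")
except ValueError as err:
    print("rejected:", err)
```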

Some common Content-Type values are shown below. Any other values are allowed and simply stored in the logical tree.

Content-Type              Description
text/plain                Generally used for a typical mail or news message. text/richtext is also common.
text/xml                  Generally used with SwA (SOAP with Attachments).
application/octet-stream  Used where the message is of an unknown type and contains any kind of data as bytes.
application/xml           Used for application-specific XML data.
x-type                    Used for a non-standard content type. It must start with x-.
image/jpeg                Used for images. image/jpeg and image/gif are common image formats.
multipart/related         Used for multiple related parts in a message. Specifically used with SwA (SOAP with Attachments).
multipart/signed          Used for multiple related parts in a message, including a signature. Specifically used with S/MIME.
multipart/mixed           Used for multiple independent parts in a message.

Content-Transfer-Encoding

Optional. Many Content-Types are represented as 8-bit character or binary data. This could include XML, which typically uses UTF-8 or UTF-16 encoding. This type of data cannot be transmitted over some transport protocols and may be encoded to 7-bit.

The Content-Transfer-Encoding header field is used to indicate the type of transformation that has been used for encoding this type of data into a 7-bit format.

The only values allowed by the WS-I Basic Profile are:

  • 7bit - the default
  • 8bit
  • binary
  • base64
  • quoted-printable

The values 7bit, 8bit, and binary all effectively mean that no encoding took place. A MIME-conformant mail gateway might use one of these values to control how it handles the message, for example by encoding the message as 7bit before routing it over SMTP.

The values base64 and quoted-printable mean that the content has been encoded. The value quoted-printable means that only non-7-bit characters in the original are encoded and is intended to yield a document which is still human-readable. This setting is most likely to be used in conjunction with a Content-Type of text/plain.
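
As a rough illustration of how a receiver might act on the five values listed above, the following Python sketch maps each value to a decoding step. The function name decode_body is hypothetical, and the sketch relies on the standard library base64 and quopri modules rather than on the MIME parser described here.

```python
import base64
import quopri


def decode_body(body, cte="7bit"):
    """Return the original bytes of a part, given its Content-Transfer-Encoding.

    Hypothetical helper for illustration only. The default of 7bit mirrors
    the default value of the header field.
    """
    encoding = cte.strip().lower()

    # 7bit, 8bit, and binary all mean that no encoding took place.
    if encoding in ("7bit", "8bit", "binary"):
        return body

    if encoding == "base64":
        # By default b64decode() discards characters outside the alphabet,
        # which includes the line breaks in the encoded data.
        return base64.b64decode(body)

    if encoding == "quoted-printable":
        return quopri.decodestring(body)

    raise ValueError("unsupported Content-Transfer-Encoding: " + cte)


print(decode_body(b"d2hhdA==", "base64"))
print(decode_body(b"caf=E9", "quoted-printable"))
```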

Content-ID

Optional. This enables parts to be labeled and referenced from other parts of the message. These parts are typically referenced from part 0 (the first) of the message.

Content-Description

Optional. This enables parts to be described.

MIME encodings

This section provides a basic guide to the base64 and quoted-printable encodings. Refer to RFC 1521 for a definitive specification of MIME encodings.

base64

The original data is broken into groups of 3 octets. Each group is then treated as 4 concatenated 6-bit groups, each of which is translated into a single digit in the base64 alphabet. The base64 alphabet is A-Z, a-z, 0-9, +, and / (with A=0 and /=63).

Figure 1. base64 data transformation: how 8-bit data is broken down into 6-bit encoded data.

If fewer than 24 bits are available at the end of the data, the encoded data is padded using the “=” character. The maximum line length in the encoded data is 76 characters and line breaks (and any other characters not in the alphabet above) are ignored when decoding.
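
The grouping and padding rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production encoder: the name b64_encode is hypothetical, and the sketch does not wrap its output at 76 characters. The standard library function base64.b64encode is used only to cross-check the result.

```python
import base64

# The 64 digits of the base64 alphabet, in value order (A=0 ... /=63).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64_encode(data):
    """Encode bytes as base64 text (hypothetical helper, no line wrapping)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Treat up to 3 octets as one 24-bit number, zero-filling on the right.
        number = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Split the 24 bits into four 6-bit groups and look up each digit.
        digits = [ALPHABET[(number >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # 1 octet yields 2 digits plus "==", 2 octets yield 3 digits plus "=".
        keep = len(chunk) + 1
        out.append("".join(digits[:keep]) + "=" * (4 - keep))
    return "".join(out)


sample = b"what"
print(b64_encode(sample))
# Cross-check against the standard library implementation.
print(b64_encode(sample) == base64.b64encode(sample).decode("ascii"))
```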

Examples:

Input                         Output
Some data encoded in base64.  U29tZSBkYXRhIGVuY29kZWQgaW4gYmFzZTY0Lg==
life of brian                 bGlmZSBvZiBicmlhbg==
what                          d2hhdA==

quoted-printable

This encoding is only appropriate if most of the data comprises printable characters. Specifically, characters in the ranges 33-60 and 62-126 are usually represented by the corresponding ASCII characters. Control characters and 8-bit data must be represented by the sequence = followed by a pair of hex digits.

The standard ASCII space <SP> and horizontal tab <HT> represent themselves, unless they appear at the end of an encoded line (without a soft line break) in which case the equivalent hex format must be used (=20 and =09 respectively).

Line breaks in the data are represented by the RFC 822 line break sequence <CR><LF> and should be encoded as "=0D=0A" if binary data is being encoded.

As for base64, the maximum line length in the encoded data is 76 characters. An ‘=’ sign at the end of an encoded line (a ‘soft’ line break) is used to tell the decoder that the line is to be continued.
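
The rules above can be observed with the Python standard library quopri module. This is a minimal sketch for illustration; the sample values are made up, and the module is not the encoder used by the MIME parser described in this topic.

```python
import quopri

# An 8-bit character (0xE9) and a literal "=" are each written as "=" plus
# two hex digits. Other printable characters represent themselves.
print(quopri.encodestring(b"caf\xe9 = coffee"))

# A space at the end of a line is written in hex form.
print(quopri.encodestring(b"trailing space \n"))

# A long line is split with soft line breaks ("=" at the end of a line)
# so that no encoded line is longer than 76 characters.
long_line = b"x" * 200
encoded = quopri.encodestring(long_line)
print(max(len(line) for line in encoded.split(b"\n")))

# The decoder joins the lines again at each soft line break.
print(quopri.decodestring(encoded) == long_line)
```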

Related information
RFC 1521: MIME Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
RFC 822: Standard for the format of ARPA Internet text messages