HTML5 media and data URIs
In a recent email conversation, Slava Paperno of Cornell University, who builds applications for language teachers, brought something to my attention that I had been unaware of: the HTML5 audio and video elements accept data URIs as a source, as well as standard file URLs.
I must admit that I’m not sure why I wasn’t aware of this before, but sometimes things are staring you in the face and you refuse to see them, and this is one such occasion.
Data URI
A data URI takes the format:
data:[<MIME-type>][;base64],<data>
The MIME-type specifies what type of data the URI contains. So if you wanted to provide a data URI for an image you would specify it as follows:
The MIME type used here is image/png, indicating that the encoded data is a PNG image.
Audio
Similarly, if you wanted to encode an Ogg audio file, you would use the MIME type audio/ogg as follows:
If you want to specify an MP3 file, you would set the MIME type to audio/mpeg (the registered type for MP3; the non-standard audio/mp3 is also accepted by some browsers) and provide the encoded MP3 audio data string.
See an example of a data URI audio source.
Video
Providing a data URI for an encoded video works the same way: supply the MIME type that matches the encoded data. Data URIs can of course also be used in conjunction with the source element:
See an example of a data URI video source.
Encoding in base64
Of course, the code examples provided above don’t display the full data URI for any of the audio and video files, as they would fill the entire page! To encode the data I simply used PHP’s base64_encode function as follows:
Which was used by the HTML:
Of course you can use whatever you like to encode the files; this is just an example. And while you may neither want nor need to base64 encode your audio and video files (the reasons why you might base64 encode images may be just as relevant), it’s still good to know that you can, should the need arise.
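For readers not using PHP, the same encoding step can be sketched in Node.js. This is only an illustration: the helper names toDataUri and fileToDataUri are hypothetical, and the caller is assumed to know the correct MIME type for the file.

```javascript
const fs = require('fs');

// Hypothetical helper: build a base64 data URI from raw bytes and a MIME type.
function toDataUri(bytes, mimeType) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Hypothetical helper: read a file from disk and encode it the same way.
function fileToDataUri(path, mimeType) {
  return toDataUri(fs.readFileSync(path), mimeType);
}

// In-memory example: "OggS" is the four-byte marker that starts an Ogg page.
console.log(toDataUri(Buffer.from('OggS'), 'audio/ogg'));
// prints "data:audio/ogg;base64,T2dnUw=="
```

The resulting string can be placed directly in the src attribute of an audio, video or source element.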
The MIME type is the mechanism that tells the client what kind of document is being transmitted: the extension of a file name has no meaning on the web. It is, therefore, important that the server is correctly set up, so that the correct MIME type is transmitted with each document. Browsers often use the MIME type to determine what default action to take when a resource is fetched.
There are many kinds of documents, so there are many MIME types. In this article, we will list the most important ones for Web development, but you can find those for other document types in this dedicated article: Complete list of MIME types.
MIME types are not the only way to convey the document type information:
- Name suffixes are sometimes used, especially on Microsoft Windows systems. Not all operating systems consider these suffixes meaningful (especially Linux and Mac OS), and like an external MIME type, there is no guarantee they are correct.
- Magic numbers. The syntax of the different kinds of files often makes it possible to determine the type by looking at the structure. For example, each GIF file starts with the hexadecimal value 47 49 46 38 (the ASCII characters GIF8, the start of GIF87a or GIF89a) and each PNG file with 89 50 4E 47 [.PNG]. Not all types of files have magic numbers, so this is not a 100% reliable system either.
On the Web, only the MIME type is relevant, and it therefore has to be set carefully. Browsers and servers often use heuristics based on suffixes or magic numbers to define the MIME type, to check for coherence, or to find the correct MIME type when only a generic type has been provided.
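The magic-number heuristic can be illustrated with a small Node.js sketch. It only knows the two signatures quoted above, and the helper name sniffMagicNumber is hypothetical; real browsers and servers use far longer tables and additional rules.

```javascript
// Signatures quoted in the text: GIF starts with 47 49 46 38, PNG with 89 50 4E 47.
const SIGNATURES = [
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47] },
];

// Hypothetical helper: return a MIME type guess, or null when no signature matches.
function sniffMagicNumber(buffer) {
  for (const { mime, bytes } of SIGNATURES) {
    if (buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b)) {
      return mime;
    }
  }
  return null;
}

console.log(sniffMagicNumber(Buffer.from('GIF89a')));     // image/gif
console.log(sniffMagicNumber(Buffer.from('plain text'))); // null
```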
Syntax
General structure
type/subtype
The structure of a MIME type is very simple; it consists of a type and a subtype, two strings, separated by a '/'. No space is allowed. The type represents the category and can be a discrete or a multipart type. The subtype is specific to each type.
A MIME type is case-insensitive, but is traditionally written in lower case.
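As a rough sketch of these rules, the following Node.js function splits a bare type/subtype string and normalizes its case. The name parseMimeType is hypothetical, and the sketch deliberately ignores parameters such as a charset, which real Content-Type values may carry.

```javascript
// Hypothetical helper: split "type/subtype" and lower-case both halves.
function parseMimeType(value) {
  const match = /^([^\s\/]+)\/([^\s\/]+)$/.exec(value);
  if (!match) return null; // whitespace, or a missing or extra '/', is rejected here
  return { type: match[1].toLowerCase(), subtype: match[2].toLowerCase() };
}

console.log(parseMimeType('Text/HTML'));   // { type: 'text', subtype: 'html' }
console.log(parseMimeType('text / html')); // null
```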
Discrete types
text/plain
text/html
image/jpeg
image/png
audio/mpeg
audio/ogg
audio/*
video/mp4
application/octet-stream
…
Discrete types indicate the category of the document. A discrete type can be one of the following:
Type | Description | Example of typical subtypes |
---|---|---|
text | Represents any document that contains text and is theoretically human readable | text/plain , text/html , text/css, text/javascript |
image | Represents any kind of image. Videos are not included, though animated images (like animated GIFs) are described with an image type. | image/gif , image/png , image/jpeg , image/bmp , image/webp |
audio | Represents any kind of audio files | audio/midi , audio/mpeg, audio/webm, audio/ogg, audio/wav |
video | Represents any kind of video files | video/webm , video/ogg |
application | Represents any kind of binary data. | application/octet-stream , application/pkcs12 , application/vnd.mspowerpoint , application/xhtml+xml , application/xml , application/pdf |
For text documents without a specific subtype, text/plain should be used. Similarly, for binary documents without a specific or known subtype, application/octet-stream should be used.
Multipart types
multipart/form-data
multipart/byteranges
Multipart types indicate a category of document that is broken into distinct parts, often with different MIME types. They are a way to represent a composite document. With the exception of multipart/form-data, which is used in relation to HTML forms and the POST method, and multipart/byteranges, which is used in conjunction with the 206 Partial Content status message to send only a subset of a whole document, HTTP doesn't handle multipart documents in a specific way: the message is simply transmitted to the browser (which will likely propose a Save As window, not knowing how to display the document inline).
Important MIME types for Web developers
application/octet-stream
This is the default value for a binary file. As it really means unknown binary file, browsers usually don't automatically execute it, or even ask if it should be executed. They treat it as if the Content-Disposition header was set to the value attachment and propose a 'Save As' dialog.
text/plain
This is the default value for textual files. Even if it really means unknown textual file, browsers assume they can display it.
text/css
Any CSS files that have to be interpreted as such in a Web page must be served as text/css. Servers often don't recognize files with the .css suffix as CSS files and send them with the text/plain or application/octet-stream MIME type: in these cases, they won't be recognized as CSS files by most browsers and will be silently ignored. Special attention has to be paid to serving CSS files with the correct type.
text/html
All HTML content should be served with this type. Alternative MIME types for XHTML, like application/xhtml+xml, are mostly useless nowadays (HTML5 unified these formats).
Image types
Only a handful of image types are widely recognized and are considered Web safe, ready for use in a Web page:
MIME type | Image type |
---|---|
image/gif | GIF images (lossless compression, superseded by PNG) |
image/jpeg | JPEG images |
image/png | PNG images |
image/svg+xml | SVG images (vector images) |
There is discussion about adding WebP (image/webp) to this list, but as each new image type increases the size of a codebase and may introduce new security problems, browser vendors are cautious in accepting it.
Other kinds of images can be found in Web documents. For example, many browsers support icon image types for favicons or similar. In particular, ICO images are supported in this context with the image/x-icon MIME type.
Audio and video types
As with images, HTML doesn't define a set of supported types to use with the <audio> and <video> elements, so only a relatively small group of them can be used on the Web. The article Media formats supported by the HTML audio and video elements explains both the codecs and container formats which can be used.
The MIME types of such files mostly represent the container formats, and the most common ones in a Web context are:
MIME type | Audio or video type |
---|---|
audio/wave audio/wav audio/x-wav audio/x-pn-wav | An audio file in the WAVE container format. The PCM audio codec (WAVE codec "1") is often supported, but other codecs have more limited support (if any). |
audio/webm | An audio file in the WebM container format. Vorbis and Opus are the most common audio codecs. |
video/webm | A video file, possibly with audio, in the WebM container format. VP8 and VP9 are the most common video codecs used within it; Vorbis and Opus the most common audio codecs. |
audio/ogg | An audio file in the OGG container format. Vorbis is the most common audio codec used in such a container. |
video/ogg | A video file, possibly with audio, in the OGG container format. Theora is the usual video codec used within it; Vorbis is the usual audio codec. |
application/ogg | An audio or video file using the OGG container format. Theora is the usual video codec used within it; Vorbis is the usual audio codec. |
multipart/form-data
The multipart/form-data type can be used when sending the content of a completed HTML form from the browser to the server. As a multipart document format, it consists of different parts, delimited by a boundary (a string starting with a double dash '--'). Each part is an entity by itself, with its own HTTP headers; Content-Disposition, and Content-Type for file upload fields, are the most common (Content-Length is ignored, as the boundary line is used as the delimiter).
Content-Type: multipart/form-data; boundary=aBoundaryString
(other headers associated with the multipart document as a whole)

--aBoundaryString
Content-Disposition: form-data; name="myFile"; filename="img.jpg"
Content-Type: image/jpeg

(data)
--aBoundaryString
Content-Disposition: form-data; name="myField"

(data)
--aBoundaryString
(more subparts)
--aBoundaryString--
The following form:
<form action="http://localhost:8000/" method="post" enctype="multipart/form-data">
<input type="text" name="myTextField">
<input type="checkbox" name="myCheckBox"> Check
<input type="file" name="myFile">
<button>Send the file</button>
</form>
will send this message:
POST / HTTP/1.1
Host: localhost:8000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Content-Type: multipart/form-data; boundary=---------------------------8721656041911415653955004498
Content-Length: 465

-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myTextField"

Test
-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myCheckBox"

on
-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myFile"; filename="test.txt"
Content-Type: text/plain

Simple file.
-----------------------------8721656041911415653955004498--
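To make the layout of such a message concrete, here is a minimal Node.js sketch that serializes plain text fields in the same shape. The function name buildMultipartBody is hypothetical, the sketch does not handle file parts or escape field names, and a browser would pick its own boundary string.

```javascript
// Hypothetical helper: serialize simple text fields as a multipart/form-data body.
function buildMultipartBody(fields, boundary) {
  const lines = [];
  for (const [name, value] of Object.entries(fields)) {
    lines.push(`--${boundary}`); // each part starts with "--" plus the boundary
    lines.push(`Content-Disposition: form-data; name="${name}"`);
    lines.push('');              // blank line separates part headers from part data
    lines.push(value);
  }
  lines.push(`--${boundary}--`); // closing delimiter has a trailing "--"
  return lines.join('\r\n') + '\r\n';
}

const body = buildMultipartBody(
  { myTextField: 'Test', myCheckBox: 'on' },
  'aBoundaryString'
);
console.log(body);
```

The matching request header would be Content-Type: multipart/form-data; boundary=aBoundaryString, with the boundary given without the leading double dash.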
multipart/byteranges
The multipart/byteranges MIME type is used in the context of sending partial responses back to the browser. When the 206 Partial Content status code is sent, this MIME type is used to indicate that the document is composed of several parts, one for each of the requested ranges. Like other multipart types, the Content-Type uses the boundary directive to define the boundary string. Each of the different parts has a Content-Type header with the actual type of the document and a Content-Range header with the range it represents.
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Type: multipart/byteranges; boundary=3d6b6a416f9b5
Content-Length: 385

--3d6b6a416f9b5
Content-Type: text/html
Content-Range: bytes 100-200/1270

eta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="vieport" content
--3d6b6a416f9b5
Content-Type: text/html
Content-Range: bytes 300-400/1270

-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: "Open Sans", "Helvetica
--3d6b6a416f9b5--
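The Content-Range value of each part can be read with a short Node.js sketch like the one below. The helper name parseContentRange is hypothetical, and the sketch only accepts the fully numeric form shown in the example, not variants where the total length is unknown.

```javascript
// Hypothetical helper: parse "bytes <first>-<last>/<total>".
function parseContentRange(header) {
  const match = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(header.trim());
  if (!match) return null;
  const [first, last, total] = match.slice(1).map(Number);
  // Both ends of the range are inclusive, hence the +1.
  return { first, last, total, length: last - first + 1 };
}

console.log(parseContentRange('bytes 100-200/1270'));
// { first: 100, last: 200, total: 1270, length: 101 }
```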
Importance of setting the correct MIME type
Most web servers send unknown-type resources using the default application/octet-stream MIME type. For security reasons, most browsers do not allow setting a custom default action for such resources, forcing the user to store them to disk in order to use them. Some commonly seen incorrect server configurations happen with the following file types:
- RAR-encoded files. In this case, the ideal would be to set the true type of the encoded files; this is often not possible (as it may not be known to the server and these files may contain several resources of different types). In this case, configure the server to send the application/x-rar-compressed MIME type, although users may not have defined a useful default action for it.
- Audio and video files. Only resources with the correct MIME type will be recognized and played in <video> or <audio> elements. Be sure to use the correct type for audio and video.
- Proprietary file types. Pay particular attention when serving a proprietary file type. Avoid using application/octet-stream, as special handling will not be possible: most browsers do not allow defining a default behavior (like "Opening in Word") for this generic MIME type.
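On the server side, the usual fix is an explicit mapping from file suffix to MIME type, with application/octet-stream only as the last resort. The Node.js sketch below is an illustration with a deliberately tiny table; the names MIME_BY_EXTENSION and contentTypeFor are hypothetical, and a real server would rely on a much fuller list.

```javascript
const path = require('path');

// Small illustrative table, using types mentioned in this article.
const MIME_BY_EXTENSION = {
  '.html': 'text/html',
  '.css': 'text/css',
  '.png': 'image/png',
  '.wav': 'audio/wav',
  '.webm': 'video/webm',
  '.rar': 'application/x-rar-compressed',
};

// Hypothetical helper: choose the Content-Type value for a file name.
function contentTypeFor(filename) {
  const ext = path.extname(filename).toLowerCase();
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

console.log(contentTypeFor('style.CSS'));   // text/css
console.log(contentTypeFor('archive.rar')); // application/x-rar-compressed
console.log(contentTypeFor('data.bin'));    // application/octet-stream
```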
MIME sniffing
In the absence of a MIME type, or in some other cases where a client believes it is incorrectly set, browsers may conduct MIME sniffing, that is, guessing the correct MIME type by looking at the resource. Each browser performs this differently and under different circumstances. There are some security concerns with this practice, as some MIME types represent executable content and others do not. Servers can block MIME sniffing by sending the X-Content-Type-Options header along with the Content-Type header.
Data URI scheme
From Wikipedia, the free encyclopedia
The data URI scheme is a uniform resource identifier (URI) scheme that provides a way to include data in-line in web pages as if they were external resources. It is a form of file literal or here document. This technique allows normally separate elements such as images and style sheets to be fetched in a single Hypertext Transfer Protocol (HTTP) request, which may be more efficient than multiple HTTP requests.[1] Data URIs are sometimes referred to incorrectly as "data URLs". As of 2015, data URIs are fully supported by most major browsers, and partially supported in Internet Explorer and Microsoft Edge.[2]
Syntax
The syntax of data URIs was defined in Request for Comments (RFC) 2397, published in August 1998,[3] and follows the URI scheme syntax. A data URI consists of:
data:[<media type>][;base64],<data>
- The scheme, data. It is followed by a colon (:).
- An optional media type. If one is not specified, the media type of the data URI is assumed to be text/plain. It can contain an optional character set parameter, separated from the preceding part by a semicolon (;). A character set parameter comprises the label charset, an equals sign (=), and a value from the IANA list of official character set names.[4] If this parameter is not present, the character set of the content is assumed to be US-ASCII (ASCII).
- The optional base64 extension base64, separated from the preceding part by a semicolon. When present, this indicates that the data content of the URI is binary data, encoded in ASCII format using the Base64 scheme for binary-to-text encoding. Data URIs encoded in Base64 may contain whitespace for human readability.
- The data, separated from the preceding part by a comma (,). The data is a sequence of octets represented as characters. Permitted characters within a data URI are the ASCII characters for the lowercase and uppercase letters of the modern English alphabet, and the Arabic numerals. Octets represented by any other character must be percent-encoded, as in %26 for an ampersand (&).[5]
Examples
HTML
An HTML fragment embedding a picture of a small red dot:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
AAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO
9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />
CSS
A Cascading Style Sheets (CSS) rule that includes a background image:
ul.checklist li.complete {
padding-left: 20px;
background:white url('data:image/png;base64,iVBORw0KGgoAA\
AANSUhEUgAAABAAAAAQAQMAAAAlPW0iAAAABlBMVEUAAAD///+l2Z/dAAAAM0l\
EQVR4nGP4/5/h/1+G/58ZDrAz3D/McH8yw83NDDeNGe4Ug9C9zwz3gVLMDA/A6\
P9/AFGGFyjOXZtQAAAAAElFTkSuQmCC') no-repeat scroll left top;
}
<!-- Backslashes at end of line - to continue character string
at new line. -->
JavaScript
A JavaScript statement that opens an embedded subwindow, as for a footnote link:
window.open('data:text/html;charset=utf-8,' +
encodeURIComponent( // Escape for URL formatting
'<!DOCTYPE html>'+
'<html lang="en">'+
'<head><title>Embedded Window</title></head>'+
'<body><h1>42</h1></body>'+
'</html>'
)
);
Malware and phishing
The data URI can be utilized by criminals to construct attack pages that attempt to obtain usernames and passwords from unsuspecting web users. It can also be used to get around cross-site scripting restrictions, embedding the attack payload fully inside the address bar, where it can be hosted via URL shortening services rather than needing a full website that is owned by the criminal.
When to Base64 Encode Images (and When Not To)
Ever since Steve Souders started evangelizing web performance, it’s been pounded into our heads that extra HTTP requests add a lot of additional overhead, and that we should combine them if possible to dramatically decrease the load time of our web pages.
The practical implication of this has been to combine our JavaScript and CSS files, which is relatively easy and straightforward, but the harder question has been what to do with images.
Sprites
Image sprites are a concept taken from video games: the idea is to cram a ton of image assets into one file, and rearrange a “viewport” of sorts to view only specific pieces of that file at a time. For instance, a simple sprite that holds two images might have one viewport that only looks at the top half of the sprite (image #1), and another viewport that only looks at the bottom half (image #2).
On the web side of things, this means that those multiple requests have now been combined into one request. This is nice because it saves both the overhead of additional HTTP requests as well as the overhead of setting up an image’s file header each time.
But there are a few drawbacks with using image sprites:
- hard to maintain and update: without some tool to help, manually editing and putting together image sprites is quite a chore
- increased memory consumption (possibly very dramatic): this is often overlooked. The time to deliver the images is decreased at the expense of a bigger memory and CPU footprint, especially for large sprites and sprites with a lot of whitespace.
- bleedthrough: for sprites that don’t have much whitespace to separate images, there’s an increased chance of nearby images visibly bleeding through other elements, as in this case where bleedthrough was only seen on iOS (but looked fine on Chrome and Safari on the desktop). Note the bleedthrough under the CNN logo:
Data URIs and Base64 encoding
Data URIs (see this, this, and this) and Base64 encoding go hand in hand. This method allows you to embed images right in your HTML, CSS, or JavaScript.
Just like sprites, you save HTTP requests, but there are also some drawbacks:
- base64 encoding makes file sizes roughly 33% larger than their original binary representations, which means more data down the wire (this might be exceptionally painful on mobile networks)
- data URIs aren’t supported on IE6 or IE7
- base64 encoded data may possibly take longer to process than binary data (anyone want to do a study on this?) (again, this might be exceptionally painful for mobile devices, which have more limited CPU and memory) (side note: CSS background-images seem to actually be faster than img tags)
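The "roughly 33%" figure in the first drawback follows from how Base64 works: every 3 input bytes become 4 output characters, with the last group padded. A small Node.js sketch of that arithmetic, using a hypothetical helper named base64Length and a 1443-byte file as the example size:

```javascript
// Base64 emits 4 characters for every 3 input bytes, padding the final group.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const original = 1443; // bytes
const encoded = base64Length(original);
const overheadPercent = ((encoded / original - 1) * 100).toFixed(1);

console.log(encoded);         // 1924
console.log(overheadPercent); // 33.3
```

Any text wrapped around the encoded string, such as a CSS declaration, adds to this figure.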
The “33% larger” claim is generally accepted truth now, despite the fact that the effective overhead after gzipping varies widely depending on the type of content. This is exactly what I wanted to test, albeit in a pretty limited and nonscientific way.
Before I tested, I wanted to keep in mind a few unverified intuitions (which aren’t entirely my own, but seem to be ideas that are floating around out there). Here are a few questions I had before going to test:
- Is base64 encoding with gzipping roughly equal to the original filesize of the binary file?
- Is base64 encoding best for small images?
- Is base64 encoding best for small and simple icons and not good for pictures and photos?
- Is base64 encoding best when multiple files are merged together?
There’s something else I wanted to test: whether Gzipping binary image data made much difference. I know text compresses well, but is it even worth compressing JPEG files with Gzip, for instance?
I ran three tests: one with a set of small UI icons, one with a set of small photographs, and one with a set of the same photographs in a larger size. Though my tests were by no means extensive, they do show that care should be taken in making assumptions about base64.
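For anyone who wants to repeat this kind of comparison, the four measurements can be sketched in Node.js as below. This is not the script used for these tests: the helper name measure is hypothetical, it measures the bare base64 string rather than a CSS rule containing it, and random bytes stand in for real image data.

```javascript
const zlib = require('zlib');
const crypto = require('crypto');

// Hypothetical helper: report binary, gzipped, base64 and gzipped-base64 sizes in bytes.
function measure(bytes) {
  const base64 = Buffer.from(bytes.toString('base64'), 'ascii');
  return {
    binary: bytes.length,
    binaryGzipped: zlib.gzipSync(bytes).length,
    base64: base64.length,
    base64Gzipped: zlib.gzipSync(base64).length,
  };
}

// Stand-in data: random bytes behave roughly like already-compressed image data.
const sample = crypto.randomBytes(3000);
console.log(measure(sample));
```

With real files, sample would instead come from reading the image from disk.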
Just a note about the tables: they are comparing the binary form (original png or jpeg) with the base64 form as it would appear in a CSS stylesheet, and comparing each of those with their gzipped form, which is most likely how they would be sent down the wire. The CSS representation has a few practical declarations and looks something like this:
Ok, onto the tests!
Test #1: Five 16×16 icons from the Fugue Icon set (PNG)
File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped |
abacus | 1443 | 1179 | 2043 | 1395 |
acorn | 1770 | 1522 | 2478 | 1728 |
address-book–arrow | 763 | 810 | 1153 | 948 |
address-book–exclamation | 795 | 848 | 1199 | 988 |
address-book–minus | 734 | 781 | 1113 | 919 |
Total | 5,505 | 5,140 | 7,986 | 5,978 |
Combined file | (5,505) | (4,128) | 7,986 | 4,423 |
* All numbers are byte sizes.
** Numbers in parentheses represent actual but impractical data. Unfortunately, images cannot be combined and delivered together in their binary form.
Takeaways:
- The binaries are always smaller.
- Sometimes Gzipping makes the files larger.
- Gzipping the base64 version brings the filesize close to the size of the original binary, but this ignores the fact that the binaries get Gzipped as well. The Gzipped binaries (how they would be delivered to the client) are always smaller than the Gzipped base64 images.
- Combining files together dramatically reduces filesizes.
Practically, the developer has two options: deliver 5,140 bytes to the user in 5 separate HTTP requests, or 4,423 bytes in one HTTP request (CSS with base64 encoded image data). Base64 is the clear winner here, and seems to confirm that small icons compress extremely well.
Test #2: Five Flickr 75×75 Pictures (JPEG)
File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped |
1 | 6734 | 5557 | 9095 | 6010 |
2 | 5379 | 4417 | 7287 | 4781 |
3 | 25626 | 18387 | 34283 | 20103 |
4 | 7031 | 6399 | 9491 | 6702 |
5 | 5847 | 4655 | 7911 | 5077 |
Total | 50,617 | 39,415 | 68,067 | 42,673 |
Combined file | (50,617) | (36,838) | 68,067 | 40,312 |
Takeaways:
- (some of the same takeaways as Test #1)
- Separately, photos aren’t too much bigger when base64 encoded and Gzipped. It’s very much within reason.
Practically, the developer can deliver 39,415 bytes in 5 separate requests, or 40,312 bytes in 1 request. Not much filesize difference here, but 1 request seems preferable when we’re talking about 40 KB.
Test #3: Five Flickr 240×160 Pictures (JPEG)
File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped |
1 | 24502 | 23403 | 32789 | 23982 |
2 | 20410 | 19466 | 27333 | 19954 |
3 | 43833 | 36729 | 58561 | 38539 |
4 | 31776 | 31180 | 42485 | 31686 |
5 | 21348 | 20208 | 28581 | 20761 |
Total | 141,869 | 130,986 | 189,749 | 134,922 |
Combined file | (141,869) | (129,307) | 189,749 | 133,615 |
Takeaways:
- (some of the same takeaways as Test #1)
- Larger photos seem to bring the Gzipped binary and Gzipped base64 filesizes MUCH closer together, making the difference very minimal
The developer must choose between delivering 130,986 bytes in 5 HTTP requests, or 133,615 bytes in one HTTP request. Any good Souders follower would opt for the one request, BUT I would be careful here…
Caution: things aren’t always as they seem
There’s a huge caveat here: it may actually be more beneficial for perceived performance to deliver the images in 5 separate requests.
Why? Because 133,615 bytes is a lot to deliver all in one package to an end user who will be staring at blank placeholders for the duration. If the 5 base64 images all come in one request, that request will have to complete before ANYTHING is shown on the screen. All 5 images go from blank placeholders to almost immediately decoded from base64 and displayed in place.
Compare this with 5 requests that are most likely made in parallel and actually give a visual indicator to the user that actual image content is being downloaded, by showing parts of the images as they’re downloaded (you can also try a throwback to progressive JPEGs – really anything will be better than just a blank screen). That’s why it might actually be beneficial for perceived performance to just load images in the good old fashioned way. They will most likely load in parallel anyway, so the extra HTTP requests may actually not really make a difference. Not to mention it will be easier to let the browser manage the cache for each file instead of having to make your JavaScript manage your cache and prevent you from downloading an image that’s already stored away in localStorage or sessionStorage.
This being said, it’s generally advisable to put your common UI icons in base64 in your CSS, then let that whole chunk get cached by the browser. Those are usually clean vector icons as well, which seem to get compressed quite well (see Test #1).
But for image content, where there is nothing to be saved but HTTP requests, you should definitely think twice about base64 encoding to save requests. Yes, you will save a few HTTP requests, but you won’t really be saving bytes, and the user might actually think the experience is slower because they can’t see the image content as it’s being downloaded. Even if you shave off a few milliseconds of wait time, the perceived performance is what matters most.