
BASE64 data URIS & HTML media



HTML5 media and data URIs



In a recent email conversation, Slava Paperno of Cornell University, who builds applications for language teachers, brought something to my attention that I was unaware of: HTML5 audio and video accept data URIs as a source, as well as standard file URLs.
I must admit that I’m not sure why I wasn’t aware of this before, but sometimes things are staring you in the face and you refuse to see them, and this was one such occasion.

Data URI

A Data URI takes the format:


data:[<MIME-type>][;charset=<encoding>][;base64],<data>

The MIME-type specifies what type of data the URI contains. So if you wanted to provide a data URI for an image you would specify it as follows:


<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />

The MIME type used here is image/png indicating that the encoded data is a PNG image.

Audio

Similarly, if you wanted to encode an Ogg audio file, you would use the MIME type audio/ogg as follows:


<audio controls src="data:audio/ogg;base64,T2dnUwACAAAAAAAAAAA+..........+fm5nB6slBlZ3Fcha363d5ut7u3ni1rLoPf728l3KcK"></audio>

If you want to specify an MP3 file, you would simply set the MIME type to audio/mpeg (the registered type for MP3 audio) and provide the encoded MP3 audio data string.
See an example of a data URI audio source.

Video

Providing a data URI for an encoded video works in just the same way: supply the correct MIME type for the encoded data. Of course, data URIs can also be used in conjunction with the source element:


<video controls>
    <source type="video/webm" src="data:video/webm;base64,GkXfowEAAAAAAAAfQoaBAUL3gQFC8......jVOrhB9DtnVTrIMQTPc=">
    <source type="video/mp4" src="data:video/mp4;base64,AAAAHGZ0eXBtcDQyAAAAAG1wNDJpc29....../l/L+X8v5AAAAMgfDg==">
</video>

See an example of a data URI video source.

Encoding in base64

Of course the code examples provided above don’t display the full data URI for any of the audio and video files, as they would fill the entire page! To encode the data I simply used PHP’s base64_encode function as follows:


function getEncodedVideoString($type, $file) {
    // Build a data URI of the form data:video/<type>;base64,<encoded file contents>
    return 'data:video/' . $type . ';base64,' . base64_encode(file_get_contents($file));
}

Which was used by the HTML:


<video controls>
    <source type="video/webm" src="<?php echo getEncodedVideoString('webm', 'parrots-small.webm'); ?>">
    <source type="video/mp4" src="<?php echo getEncodedVideoString('mp4', 'parrots-small.mp4'); ?>">
</video>

Of course you can use whatever you like to encode the files; this is just an example. And while you may neither want nor need to base64 encode your audio and video files (the reasons why you might base64 encode images may be just as relevant here), it’s still good to know that you can, should the need arise.
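For comparison, the same encoding step can be sketched in JavaScript under Node.js. This is only an illustration: the function name toDataUri and its parameters are assumptions made for this example, and the function does not verify that the bytes really are in the format named by the MIME type.

```javascript
// Build a data URI from raw bytes.
// `mimeType` and `bytes` come from the caller; nothing is validated here.
function toDataUri(mimeType, bytes) {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Illustrative input: the four ASCII bytes "OggS" that open an Ogg stream.
const uri = toDataUri('audio/ogg', [0x4f, 0x67, 0x67, 0x53]);
console.log(uri); // data:audio/ogg;base64,T2dnUw==
```

The encoded prefix T2dnUw matches the start of the Ogg example shown earlier, which is a quick way to check that a data URI carries the container format its MIME type claims.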
The MIME type is the mechanism that tells the client what kind of document is being transmitted: the extension of a file name has no meaning on the web. It is therefore important that the server is set up correctly, so that the correct MIME type is transmitted with each document. Browsers often use the MIME type to determine what default action to take when a resource is fetched.
There are many kinds of documents, so there are many MIME types. This article lists the most important ones for Web development; types for other kinds of documents can be found in the dedicated article, Complete list of MIME types.
MIME types are not the only way to convey the document type information:
  • Name suffixes are sometimes used, especially on Microsoft Windows systems. Not all operating systems consider these suffixes meaningful (Linux and Mac OS in particular do not), and, as with an external MIME type, there is no guarantee that they are correct.
  • Magic numbers. The syntax of a file format often makes it possible to determine the type by looking at the structure. For example, each GIF file starts with the hexadecimal value 47 49 46 38 [GIF8] and each PNG file with 89 50 4E 47 [.PNG]. Not all types of files have magic numbers, so this is not a 100% reliable system either.
On the Web, only the MIME type is relevant, so it has to be set carefully. Browsers and servers often use heuristics based on suffixes or magic numbers to define the MIME type, to check for coherence, or to find the correct MIME type when only a generic type has been provided.
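A magic-number check of the kind described above might look like the following sketch. The table holds only the two signatures quoted in this article, and the names SIGNATURES and sniffMimeType are hypothetical; real browsers apply far more elaborate rules.

```javascript
// Leading byte signatures quoted in the text above.
const SIGNATURES = [
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47] }, // ".PNG"
];

// Return the MIME type whose signature matches the start of `data`,
// or null when no known signature matches.
function sniffMimeType(data) {
  for (const { mime, bytes } of SIGNATURES) {
    if (bytes.every((b, i) => data[i] === b)) return mime;
  }
  return null;
}

console.log(sniffMimeType(Uint8Array.of(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a)));
```

Returning null rather than a default type keeps the "no magic number found" case explicit, which matters because not every format has one.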

Syntax

General structure

type/subtype
The structure of a MIME type is very simple: it consists of a type and a subtype, two strings separated by a '/'. No space is allowed. The type represents the category and can be a discrete or a multipart type. The subtype is specific to each type.
A MIME type is case-insensitive, but is traditionally written all in lower case.
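The rules above (two strings, a single '/', no spaces, case-insensitive) can be expressed as a small parser. This is a simplified sketch: the name parseMimeType is an assumption, and parameters such as charset are deliberately not handled.

```javascript
// Split "type/subtype" into its two parts, normalised to lower case.
// Returns null for anything containing spaces, extra slashes, or parameters.
function parseMimeType(value) {
  const match = /^([^\s\/;]+)\/([^\s\/;]+)$/.exec(value);
  if (!match) return null;
  return { type: match[1].toLowerCase(), subtype: match[2].toLowerCase() };
}

console.log(parseMimeType('Text/HTML'));   // { type: 'text', subtype: 'html' }
console.log(parseMimeType('text / html')); // null
```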

Discrete types

text/plain
text/html
image/jpeg
image/png
audio/mpeg
audio/ogg
audio/*
video/mp4
application/octet-stream
…
Discrete types indicate the category of the document, which can be one of the following:
Type | Description | Example of typical subtypes
text | Represents any document that contains text and is theoretically human readable | text/plain, text/html, text/css, text/javascript
image | Represents any kind of image. Videos are not included, though animated images (like animated GIFs) are described with an image type. | image/gif, image/png, image/jpeg, image/bmp, image/webp
audio | Represents any kind of audio file | audio/midi, audio/mpeg, audio/webm, audio/ogg, audio/wav
video | Represents any kind of video file | video/webm, video/ogg
application | Represents any kind of binary data | application/octet-stream, application/pkcs12, application/vnd.mspowerpoint, application/xhtml+xml, application/xml, application/pdf
For text documents without specific subtype, text/plain should be used. Similarly for binary documents without specific or known subtype, application/octet-stream should be used.

Multipart types

multipart/form-data
multipart/byteranges
Multipart types indicate a category of document that is broken into distinct parts, often with different MIME types. They are a way to represent a composite document. With the exception of multipart/form-data, which is used with HTML forms and the POST method, and multipart/byteranges, which is used in conjunction with the 206 Partial Content status message to send only a subset of a whole document, HTTP doesn't handle multipart documents in a specific way: the message is simply transmitted to the browser (which will likely propose a Save As window, not knowing how to display the document inline).

Important MIME types for Web developers

application/octet-stream

This is the default value for a binary file. As it really means unknown binary file, browsers usually don't execute it automatically, or even ask whether it should be executed. They treat it as if the Content-Disposition header were set to attachment and propose a 'Save As' dialog.

text/plain

This is the default value for textual files. Even though it really means unknown textual file, browsers assume they can display it.
Note that text/plain does not mean any kind of textual data. If browsers expect a specific kind of textual data, they will likely not consider it a match. Specifically, if they download a text/plain file from a <link> element declaring a CSS file, they will not recognize it as a valid CSS file. The CSS MIME type text/css must be used.

text/css

Any CSS file that has to be interpreted as such in a Web page must be served as text/css. Servers often don't recognize files with the .css suffix as CSS files and send them with the text/plain or application/octet-stream MIME type: in these cases, they won't be recognized as CSS files by most browsers and will be silently ignored. Special attention has to be paid to serving CSS files with the correct type.

text/html

All HTML content should be served with this type. Alternative MIME types for XHTML, like application/xhtml+xml, are mostly useless nowadays (HTML5 unified these formats).

Image types

Only a handful of image types are widely recognized and are considered Web safe, ready for use in a Web page:
MIME type | Image type
image/gif | GIF images (lossless compression, superseded by PNG)
image/jpeg | JPEG images
image/png | PNG images
image/svg+xml | SVG images (vector images)
There is discussion to add WebP (image/webp) to this list, but as each new image type will increase the size of a codebase, this may introduce new security problems, so browser vendors are cautious in accepting it.
Other kinds of images can be found in Web documents. For example, many browsers support icon image types for favicons or similar. In particular, ICO images are supported in this context with the image/x-icon MIME type.

Audio and video types

As with images, HTML doesn't define a set of supported types for the <audio> and <video> elements, so only a relatively small group of them can be used on the Web. The article Media formats supported by the HTML audio and video elements explains both the codecs and the container formats that can be used.
The MIME type of such files mostly represents the container format. The most common ones in a Web context are:
MIME type | Audio or video type
audio/wave, audio/wav, audio/x-wav, audio/x-pn-wav | An audio file in the WAVE container format. The PCM audio codec (WAVE codec "1") is often supported, but other codecs have more limited support (if any).
audio/webm | An audio file in the WebM container format. Vorbis and Opus are the most common audio codecs.
video/webm | A video file, possibly with audio, in the WebM container format. VP8 and VP9 are the most common video codecs used within it; Vorbis and Opus the most common audio codecs.
audio/ogg | An audio file in the OGG container format. Vorbis is the most common audio codec used in such a container.
video/ogg | A video file, possibly with audio, in the OGG container format. Theora is the usual video codec used within it; Vorbis is the usual audio codec.
application/ogg | An audio or video file using the OGG container format. Theora is the usual video codec used within it; Vorbis is the usual audio codec.

multipart/form-data

The multipart/form-data type can be used when sending the content of a completed HTML form from the browser to the server. As a multipart document format, it consists of different parts, delimited by a boundary (a string starting with a double dash '--'). Each part is an entity in its own right, with its own HTTP headers: most commonly Content-Disposition, plus Content-Type for file upload fields (Content-Length is ignored, as the boundary line is used as the delimiter).
Content-Type: multipart/form-data; boundary=aBoundaryString
(other headers associated with the multipart document as a whole)

--aBoundaryString
Content-Disposition: form-data; name="myFile"; filename="img.jpg"
Content-Type: image/jpeg

(data)
--aBoundaryString
Content-Disposition: form-data; name="myField"

(data)
--aBoundaryString
(more subparts)
--aBoundaryString--

The following form:
<form action="http://localhost:8000/" method="post" enctype="multipart/form-data">
  <input type="text" name="myTextField">
  <label><input type="checkbox" name="myCheckBox"> Check</label>
  <input type="file" name="myFile">
  <button>Send the file</button>
</form>
will send this message:
POST / HTTP/1.1
Host: localhost:8000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Content-Type: multipart/form-data; boundary=---------------------------8721656041911415653955004498
Content-Length: 465

-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myTextField"

Test
-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myCheckBox"

on
-----------------------------8721656041911415653955004498
Content-Disposition: form-data; name="myFile"; filename="test.txt"
Content-Type: text/plain

Simple file.
-----------------------------8721656041911415653955004498--
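To make the layout of such a message concrete, the sketch below assembles a multipart/form-data body by hand. It rests on several assumptions: the function name buildMultipartBody and the field shape ({ name, filename, contentType, value }) are invented for this example, values are plain strings, and no escaping is applied to names or file names.

```javascript
// Assemble a multipart/form-data body: each part starts with "--boundary",
// carries its own headers, then a blank line, then the value.
// The closing delimiter is "--boundary--". Lines are separated by CRLF.
function buildMultipartBody(boundary, fields) {
  const lines = [];
  for (const field of fields) {
    lines.push(`--${boundary}`);
    let disposition = `Content-Disposition: form-data; name="${field.name}"`;
    if (field.filename !== undefined) disposition += `; filename="${field.filename}"`;
    lines.push(disposition);
    if (field.contentType !== undefined) lines.push(`Content-Type: ${field.contentType}`);
    lines.push('', field.value);
  }
  lines.push(`--${boundary}--`, '');
  return lines.join('\r\n');
}

const body = buildMultipartBody('aBoundaryString', [
  { name: 'myTextField', value: 'Test' },
  { name: 'myFile', filename: 'test.txt', contentType: 'text/plain', value: 'Simple file.' },
]);
console.log(body);
```

In practice a browser or an HTTP client library generates both the boundary and the body; building it manually is mainly useful for understanding or debugging what goes over the wire.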

multipart/byteranges

The multipart/byteranges MIME type is used in the context of sending partial responses back to the browser. When the 206 Partial Content status code is sent, this MIME type is used to indicate that the document is composed of several parts, one for each of the requested ranges. Like other multipart types, the Content-Type uses the boundary directive to define the boundary string. Each of the different parts has a Content-Type header with the actual type of the document and a Content-Range header with the range it represents.
HTTP/1.1 206 Partial Content
Accept-Ranges: bytes
Content-Type: multipart/byteranges; boundary=3d6b6a416f9b5
Content-Length: 385

--3d6b6a416f9b5
Content-Type: text/html
Content-Range: bytes 100-200/1270

eta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="vieport" content
--3d6b6a416f9b5
Content-Type: text/html
Content-Range: bytes 300-400/1270

-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: "Open Sans", "Helvetica
--3d6b6a416f9b5--
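The Content-Range value of each part can be read with a small parser such as the one sketched here. The function name parseContentRange is hypothetical, and only the "bytes start-end/total" form used in the example above is handled.

```javascript
// Parse a Content-Range value like "bytes 100-200/1270".
// Both ends of the range are inclusive, so the part length is end - start + 1.
function parseContentRange(header) {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(header.trim());
  if (!match) return null;
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (end < start) return null;
  return {
    start,
    end,
    length: end - start + 1,
    total: match[3] === '*' ? null : Number(match[3]), // "*" means size unknown
  };
}

console.log(parseContentRange('bytes 100-200/1270'));
// { start: 100, end: 200, length: 101, total: 1270 }
```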

Importance of setting the correct MIME type

Most web servers send unknown-type resources using the default application/octet-stream MIME type. For security reasons, most browsers do not allow setting a custom default action for such resources, forcing the user to store it to disk to use it. Some commonly seen incorrect server configurations happen with the following file types:
  • RAR-encoded files. In this case, the ideal would be to set the true type of the encoded files; this is often not possible (as it may not be known to the server and these files may contain several resources of different types). In that case, configure the server to send the application/x-rar-compressed MIME type, although users are unlikely to have defined a useful default action for it.
  • Audio and video files. Only resources with the correct MIME Type will be recognized and played in <video> or <audio> elements. Be sure to use the correct type for audio and video.
  • Proprietary file types. Pay particular attention when serving a proprietary file type. Avoid using application/octet-stream as special handling will not be possible: most browsers do not allow defining a default behavior (like "Opening in Word") for this generic MIME type.

MIME sniffing

In the absence of a MIME type, or in some other cases where a client believes the type is set incorrectly, browsers may conduct MIME sniffing, that is, guessing the correct MIME type by looking at the resource. Each browser performs this differently and under different circumstances. There are some security concerns with this practice, as some MIME types represent executable content and others do not. Servers can block MIME sniffing by sending the X-Content-Type-Options header (with the value nosniff) along with Content-Type.
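On the server side, the advice from the last two sections can be combined into one helper. This sketch assumes a hand-maintained lookup table; the names MIME_BY_EXTENSION and responseHeadersFor are hypothetical, the table is far from complete, and mapping .ogg to audio/ogg is itself an assumption, since Ogg files may also hold video.

```javascript
// A deliberately small extension-to-MIME-type table (illustrative only).
const MIME_BY_EXTENSION = {
  '.html': 'text/html',
  '.css': 'text/css',
  '.png': 'image/png',
  '.ogg': 'audio/ogg', // assumption: this server only stores Ogg audio
  '.webm': 'video/webm',
  '.mp4': 'video/mp4',
};

// Choose response headers for a file name: a specific Content-Type when the
// extension is known, the generic binary type otherwise, and nosniff always.
function responseHeadersFor(filename) {
  const dot = filename.lastIndexOf('.');
  const extension = dot === -1 ? '' : filename.slice(dot).toLowerCase();
  return {
    'Content-Type': MIME_BY_EXTENSION[extension] || 'application/octet-stream',
    'X-Content-Type-Options': 'nosniff',
  };
}

console.log(responseHeadersFor('style.css'));
```

With nosniff set, a stylesheet served under the wrong type is ignored rather than guessed at, so a table like this needs to cover every type the site actually serves.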


Data URI scheme

From Wikipedia, the free encyclopedia
The data URI scheme is a uniform resource identifier (URI) scheme that provides a way to include data in-line in web pages as if they were external resources. It is a form of file literal or here document. This technique allows normally separate elements such as images and style sheets to be fetched in a single Hypertext Transfer Protocol (HTTP) request, which may be more efficient than multiple HTTP requests.[1] Data URIs are sometimes referred to incorrectly as "data URLs".[citation needed] As of 2015, data URIs are fully supported by most major browsers, and partially supported in Internet Explorer and Microsoft Edge.[2]

Syntax

The syntax of data URIs was defined in Request for Comments (RFC) 2397, published in August 1998,[3] and follows the URI scheme syntax. A data URI consists of:

 data:[<media type>][;base64],<data>
  • The scheme, data. It is followed by a colon (:).
  • An optional media type. If one is not specified, the media type of the data URI is assumed to be text/plain. It can contain an optional character set parameter, separated from the preceding part by a semicolon (;). A character set parameter comprises the label charset, an equals sign (=), and a value from the IANA list of official character set names.[4] If this parameter is not present, the character set of the content is assumed to be US-ASCII (ASCII).
  • The optional base64 extension base64, separated from the preceding part by a semicolon. When present, this indicates that the data content of the URI is binary data, encoded in ASCII format using the Base64 scheme for binary-to-text encoding. Data URIs encoded in Base64 may contain whitespace for human readability.
  • The data, separated from the preceding part by a comma (,). The data is a sequence of octets represented as characters. Permitted characters within a data URI are the ASCII characters for the lowercase and uppercase letters of the modern English alphabet, and the Arabic numerals. Octets represented by any other character must be percent-encoded, as in %26 for an ampersand (&).[5]
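The four components listed above can be pulled apart with a short parser. This is a sketch for Node.js rather than a complete RFC 2397 implementation: the names percentDecode and parseDataUri are assumptions, and the media type is returned as an unparsed string.

```javascript
// Turn percent-encoded text into raw bytes ("%26" becomes the byte 0x26).
function percentDecode(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const hex = text.slice(i + 1, i + 3);
    if (text[i] === '%' && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      bytes.push(text.charCodeAt(i) & 0xff);
    }
  }
  return Buffer.from(bytes);
}

// Split a data URI into media type, base64 flag, and decoded bytes.
function parseDataUri(uri) {
  if (!uri.startsWith('data:')) return null;
  const comma = uri.indexOf(','); // the first comma ends the header
  if (comma === -1) return null;

  let header = uri.slice('data:'.length, comma);
  const isBase64 = /;base64$/i.test(header);
  if (isBase64) header = header.slice(0, -';base64'.length);

  let mediaType = header;
  if (header === '') mediaType = 'text/plain;charset=US-ASCII'; // documented default
  else if (header.startsWith(';')) mediaType = 'text/plain' + header;

  const payload = uri.slice(comma + 1);
  const data = isBase64
    ? Buffer.from(payload.replace(/\s+/g, ''), 'base64') // whitespace is allowed
    : percentDecode(payload);
  return { mediaType, isBase64, data };
}

console.log(parseDataUri('data:,Hello%2C%20World!').data.toString('latin1'));
// Hello, World!
```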

Examples

HTML

An HTML fragment embedding a picture of a small red dot:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
AAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO
9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />

CSS

Cascading Style Sheets (CSS) rule that includes a background image:
ul.checklist li.complete {
    padding-left: 20px;
    background: white url('data:image/png;base64,iVBORw0KGgoAA\
AANSUhEUgAAABAAAAAQAQMAAAAlPW0iAAAABlBMVEUAAAD///+l2Z/dAAAAM0l\
EQVR4nGP4/5/h/1+G/58ZDrAz3D/McH8yw83NDDeNGe4Ug9C9zwz3gVLMDA/A6\
P9/AFGGFyjOXZtQAAAAAElFTkSuQmCC') no-repeat scroll left top;
}
/* Backslashes at the end of a line continue the character string on the next line. */

JavaScript

JavaScript statement that opens an embedded subwindow, as for a footnote link:
window.open('data:text/html;charset=utf-8,' +
    encodeURIComponent( // Escape for URL formatting
        '<!DOCTYPE html>'+
        '<html lang="en">'+
        '<head><title>Embedded Window</title></head>'+
        '<body><h1>42</h1></body>'+
        '</html>'
    )
);

Malware and phishing

The data URI can be utilized by criminals to construct attack pages that attempt to obtain usernames and passwords from unsuspecting web users. It can also be used to get around cross-site scripting restrictions, embedding the attack payload fully inside the address bar and hosting it via URL shortening services rather than needing a full website that is owned by the criminal.

Introduction

Ever since Steve Souders started evangelizing web performance, it’s been pounded into our heads that extra HTTP requests add a lot of additional overhead, and that we should combine them if possible to dramatically decrease the load time of our web pages.
The practical implication of this has been to combine our JavaScript and CSS files, which is relatively easy and straightforward, but the harder question has been what to do with images.

Sprites

Image sprites are a concept taken from video games: the idea is to cram a ton of image assets into one file, and rearrange a “viewport” of sorts to view only specific pieces of that file at a time. For instance, a simple sprite that holds two images might have one viewport that only looks at the top half of the sprite (image #1), and another viewport that only looks at the bottom half (image #2).
On the web side of things, this means that those multiple requests have now been combined into one request. This is nice because it saves both the overhead of additional HTTP requests as well as the overhead of setting up an image’s file header each time.
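The "viewport" idea amounts to shifting one shared background image by a multiple of the tile size. The sketch below generates such a CSS rule, assuming a hypothetical sprite in which equally sized tiles are stacked vertically; the function name spriteRule and the file name sprite.png are invented for the example.

```javascript
// Generate a CSS rule that shows tile number `index` (0-based) of a sprite
// whose tiles are stacked top to bottom, each tileWidth x tileHeight pixels.
function spriteRule(selector, spriteUrl, index, tileWidth, tileHeight) {
  const offsetY = -index * tileHeight; // move the image up to reveal lower tiles
  return [
    `${selector} {`,
    `  background-image: url(${spriteUrl});`,
    `  background-position: 0 ${offsetY}px;`,
    `  width: ${tileWidth}px;`,
    `  height: ${tileHeight}px;`,
    '}',
  ].join('\n');
}

console.log(spriteRule('.icon-second', 'sprite.png', 1, 16, 16));
```

The width and height declarations are what crop the view to a single tile; without them, neighbouring tiles become visible.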
But there are a few drawbacks with using image sprites:
  • hard to maintain and update: without some tool to help, manually editing and putting together image sprites is quite a chore
  • increased memory consumption (possibly very dramatic): this is often overlooked. The time to deliver the images is decreased at the expense of a bigger memory and CPU footprint, especially for large sprites and sprites with a lot of whitespace.
  • bleedthrough: for sprites that don’t have much whitespace to separate images, there’s an increased chance of nearby images visibly bleeding through other elements, as in this case where bleedthrough was only seen on iOS (but looked fine on Chrome and Safari on the desktop). Note the bleedthrough under the CNN logo. [Figure: example of image sprite bleedthrough on Reddit]

Data URIs and Base64 encoding

Data URIs (see this, this, and this) and Base64 encoding go hand-in-hand. This method allows you to embed images right in your HTML, CSS, or JavaScript.
Just like sprites, you save HTTP requests, but there are also some drawbacks:
  • base64 encoding makes file sizes roughly 33% larger than their original binary representations, which means more data down the wire (this might be exceptionally painful on mobile networks)
  • data URIs aren’t supported on IE6 or IE7
  • base64 encoded data may possibly take longer to process than binary data (anyone want to do a study on this?) (again, this might be exceptionally painful for mobile devices, which have more limited CPU and memory) (side note: CSS background-images seem to actually be faster than img tags)
The “33% larger” claim is generally accepted truth now, despite the fact that the figure varies wildly depending on the type of content. This is exactly what I wanted to test, albeit in a pretty limited and nonscientific way.
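As a point of reference, the size of padded base64 output before any compression depends only on the input length: every 3 input bytes become 4 output characters. The sketch below (function names are illustrative) computes that figure, so differences between content types can only appear once gzip is applied.

```javascript
// Length of padded base64 output for a given number of input bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Relative size increase before compression, e.g. 0.333 for about 33%.
function base64Overhead(byteCount) {
  return base64Length(byteCount) / byteCount - 1;
}

console.log(base64Length(1443));                // 1924
console.log(base64Overhead(1443).toFixed(3));   // 0.333
```

For very small inputs the padding pushes the overhead a little above one third, which is one reason tiny files look slightly worse than the rule of thumb suggests.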
Before I tested, I wanted to keep in mind a few unverified intuitions (which aren’t entirely my own, but seem to be ideas that are floating around out there). Here are a few questions I had before testing:
  • Is base64 encoding with gzipping roughly equal to the original filesize of the binary file?
  • Is base64 encoding best for small images?
  • Is base64 encoding best for small and simple icons and not good for pictures and photos?
  • Is base64 encoding best when multiple files are merged together?
There’s something else I wanted to test: whether Gzipping binary image data made much difference. I know text compresses well, but is it even worth compressing JPEG files with Gzip, for instance?
I ran three tests: one with a set of small UI icons, one with a set of small photographs, and one with a set of the same photographs in a larger size. Though my tests were by no means extensive, they do show that care should be taken in making assumptions about base64.
Just a note about the tables: they are comparing the binary form (original png or jpeg) with the base64 form as it would appear in a CSS stylesheet, and comparing each of those with their gzipped form, which is most likely how they would be sent down the wire. The CSS representation has a few practical declarations and looks something like this:

.address-book--arrow {
  background-image: url(data:image/png;base64,...); /* base64 payload elided */
  width: 16px;
  height: 16px;
  background-repeat: no-repeat;
}
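The four columns used in the tables can be reproduced along the following lines in Node.js. This is a sketch and not the script used for the tests: the function name measureSizes is hypothetical, the CSS template only approximates the rule shown above, and gzip results depend on the compression level and library.

```javascript
const zlib = require('node:zlib');

// Measure one image four ways: raw binary, gzipped binary,
// a CSS rule embedding it as base64, and that CSS rule gzipped.
function measureSizes(selector, mimeType, bytes) {
  const binary = Buffer.from(bytes);
  const css =
    `${selector} {\n` +
    `  background-image: url(data:${mimeType};base64,${binary.toString('base64')});\n` +
    `  width: 16px;\n  height: 16px;\n  background-repeat: no-repeat;\n}\n`;
  return {
    binary: binary.length,
    binaryGzipped: zlib.gzipSync(binary).length,
    cssBase64: Buffer.byteLength(css),
    cssBase64Gzipped: zlib.gzipSync(css).length,
  };
}

// Placeholder input; a real run would pass the bytes of an image file.
console.log(measureSizes('.address-book--arrow', 'image/png', Buffer.alloc(64, 1)));
```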
Ok, onto the tests!

Test #1: Five 16×16 icons from the Fugue Icon set (PNG)

File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped
abacus | 1443 | 1179 | 2043 | 1395
acorn | 1770 | 1522 | 2478 | 1728
address-book--arrow | 763 | 810 | 1153 | 948
address-book--exclamation | 795 | 848 | 1199 | 988
address-book--minus | 734 | 781 | 1113 | 919
Total | 5,505 | 5,140 | 7,986 | 5,978
Combined file | (5,505) | (4,128) | 7,986 | 4,423
  * All numbers are byte sizes.
  ** Numbers in parentheses represent actual but impractical data. Unfortunately, images cannot be combined and delivered together in their binary form.
Takeaways:
  • The binaries are always smaller.
  • Sometimes Gzipping makes the files larger.
  • Gzipping the base64 version brings the filesize close to the size of the original binary, but this ignores the fact that the binaries get Gzipped as well. The Gzipped binaries (how they would be delivered to the client) are always smaller than the Gzipped base64 images
  • Combining files together dramatically reduces filesizes.
Practically, the developer has two options: deliver 5,140 bytes to the user in 5 separate HTTP requests, or 4,423 bytes in one HTTP request (CSS with base64 encoded image data). Base64 is the clear winner here, and seems to confirm that small icons compress extremely well.

Test #2: Five Flickr 75×75 Pictures (JPEG)

File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped
1 | 6734 | 5557 | 9095 | 6010
2 | 5379 | 4417 | 7287 | 4781
3 | 25626 | 18387 | 34283 | 20103
4 | 7031 | 6399 | 9491 | 6702
5 | 5847 | 4655 | 7911 | 5077
Total | 50,617 | 39,415 | 68,067 | 42,673
Combined file | (50,617) | (36,838) | 68,067 | 40,312
Takeaways:
  • (some of the same takeaways as Test #1)
  • Separately, photos aren’t too much bigger when base64 encoded and Gzipped. It’s very much within reason.
Practically, the developer can deliver 39,415 bytes in 5 separate requests, or 40,312 bytes in 1 request. Not much filesize difference here, but 1 request seems preferable when we’re talking about 40 KB.

Test #3: Five Flickr 240×160 Pictures (JPEG)

File | Binary | Binary Gzipped | CSS + Base64 | CSS + Base64 Gzipped
1 | 24502 | 23403 | 32789 | 23982
2 | 20410 | 19466 | 27333 | 19954
3 | 43833 | 36729 | 58561 | 38539
4 | 31776 | 31180 | 42485 | 31686
5 | 21348 | 20208 | 28581 | 20761
Total | 141,869 | 130,986 | 189,749 | 134,922
Combined file | (141,869) | (129,307) | 189,749 | 133,615
Takeaways:
  • (some of the same takeaways as Test #1)
  • Larger photos seem to bring the Gzipped binary and Gzipped base64 filesizes MUCH closer together, making the difference very minimal
The developer must choose between delivering 130,986 bytes in 5 HTTP requests, or 133,615 bytes in one HTTP request. Any good Souders follower would opt for the one request, BUT I would be careful here…

Caution: things aren’t always as they seem

There’s a huge caveat here: it may actually be more beneficial for perceived performance to deliver the images in 5 separate requests.
Why? Because 133,615 bytes is a lot to deliver all in one package to an end user who will be staring at blank placeholders for the duration. If the 5 base64 images all come in one request, that request will have to complete before ANYTHING is shown on the screen. All 5 images go from blank placeholders to almost immediately decoded from base64 and displayed in place.
Compare this with 5 requests that are most likely made in parallel and actually give a visual indicator to the user that actual image content is being downloaded, by showing parts of the images as they’re downloaded (you can also try a throwback to progressive JPEGs – really anything will be better than just a blank screen). That’s why it might actually be beneficial for perceived performance to just load images in the good old fashioned way. They will most likely load in parallel anyway, so the extra HTTP requests may actually not really make a difference. Not to mention it will be easier to let the browser manage the cache for each file instead of having to make your JavaScript manage your cache and prevent you from downloading an image that’s already stored away in localStorage or sessionStorage.
This being said, it’s generally advisable to put your common UI icons in base64 in your CSS, then let that whole chunk get cached by the browser. Those are usually clean vector icons as well, which seem to get compressed quite well (see Test #1).
But for image content, where there is nothing to be saved but HTTP requests, you should definitely think twice about base64 encoding to save requests. Yes, you will save a few HTTP requests, but you won’t really be saving bytes, and the user might actually think the experience is slower because they can’t see the image content as it’s being downloaded. Even if you shave off a few milliseconds of wait time, the perceived performance is what matters most.