Tuesday, July 31, 2007

Binary multipart POSTs in Javascript

We recently released a very slick Firefox extension at Wesabe. It was written by my colleague Tim Mason, but I helped figure out one small piece of it—namely, how to do binary multipart POSTs in Javascript—and since it involved many hours of hair-pulling for both of us, I thought I'd share the knowledge. (Tim says I should note that "this probably only works in Firefox version 2.0 and greater _and_ that it uses Mozilla specific calls that are only allowed for privileged javascript—basically only for extensions.")

One of the cool features of the plugin is the ability to take a snapshot of a full browser page and either save the snapshot to disk or upload it to Wesabe (so you can, for example, save the receipt for a web purchase along with that transaction in your account). The snapshot is uploaded to Wesabe via a standard multipart POST, the same way that a file is uploaded via a web form.

Tim was having trouble getting the POST to work with binary data at first, and he had other things to finish, so he wanted to just base-64-encode it and be done with it. I was reluctant to do that, as the upload would be significantly larger (about 137% of the original size). Also, Rails didn't automatically decode base-64-encoded file attachments. But Tim had other bugs to fix, so I submitted a patch to Rails to do the base-64 decoding. I was pretty proud of this patch until it was pointed out to me that RFC 2616 specifically disallows the use of Content-Transfer-Encoding in HTTP. Doh. It was also pointed out that base-64-encoding uploads is a colossal waste of bandwidth.
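
To make the size penalty concrete, here is a rough back-of-the-envelope sketch. It assumes MIME-style base-64 output (lines wrapped at 76 characters, each terminated by CRLF); the function name and the 57000-byte payload are made up for illustration.

```javascript
// Estimate the size of base-64 output for a payload of `byteCount` bytes.
// Assumption: when `lineLength` is given, output is wrapped at that many
// characters and every line (including the last) ends with CRLF.
function base64EncodedSize(byteCount, lineLength) {
  var chars = 4 * Math.ceil(byteCount / 3); // every 3 input bytes become 4 characters
  if (!lineLength) {
    return chars; // unwrapped base-64
  }
  var lines = Math.ceil(chars / lineLength);
  return chars + 2 * lines; // two extra bytes (CRLF) per line
}

var original = 57000; // hypothetical snapshot size in bytes
var unwrappedRatio = base64EncodedSize(original) / original;
var wrappedRatio = base64EncodedSize(original, 76) / original;
console.log(unwrappedRatio.toFixed(3), wrappedRatio.toFixed(3)); // prints: 1.333 1.368
```

So the encoding alone costs about a third more bytes, and the line breaks push it to roughly 137%.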

Since Tim was cramming to meet a hard(-ish) deadline set for the release of the plugin, I offered to lend my eyeballs to the binary post problem. This could be a very long story, but I'll just get to the point: you can read binary data into a Javascript string and dump it right out to a file just fine, but if you try to do any concatenation with that string, Javascript ends up munging it mercilessly. I'm not sure whether it is trying to interpret it as UTF-8 or if it terminates it as soon as it hits a null byte (which is what seemed to be happening), but regardless, doing "some string" + binaryData + "another string", as is necessary when putting together a multipart post, just does not work.
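
I never pinned down which of those two explanations was right, so take the following only as an illustration of the UTF-8 one. It is a small sketch for a current engine (it assumes TextEncoder and Uint8Array are available, which was not the case in Firefox 2), and it does not reproduce the original bug. It just shows how a string holding one byte per character changes size if something encodes it as UTF-8 on the way out.

```javascript
// A "binary string": one character per byte, values 0..255.
var binaryData = String.fromCharCode(0x89, 0x50, 0x00, 0xff);

// In a current engine, concatenation keeps every character, including the null.
var combined = "prefix" + binaryData + "suffix";

// Encoding as UTF-8 turns each character above 0x7F into two bytes,
// so the payload is no longer the bytes we started with.
var utf8Bytes = new TextEncoder().encode(binaryData);

// Taking the low 8 bits of each character gives back the original bytes.
var rawBytes = Uint8Array.from(binaryData, function (ch) {
  return ch.charCodeAt(0) & 0xff;
});

console.log(combined.length, utf8Bytes.length, rawBytes.length); // prints: 16 6 4
```

Four bytes in, six bytes out is enough to corrupt an image, which is why the binary part has to bypass any string-to-bytes conversion.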

The answer required employing a Rube Goldbergian system of input and output streams. The seed of the solution was found in this post, although that didn't explain how to mix in all of the strings needed for the post and MIME envelope. So here it is, in all its goriness:


var dataURL = this.canvas.toDataURL(this.getImageType()); // grab the snapshot as base64
var imgData = atob(dataURL.substring(13 + this.getImageType().length)); // convert to binary

var filenameTimestamp = (new Date().getTime());
var separator = "----------12345-multipart-boundary-" + filenameTimestamp;

// Javascript munges binary data when it undergoes string operations (such as concatenation), so we need
// to jump through a bunch of hoops with streams to make sure that doesn't happen

// create a string input stream with the form preamble
var prefixStringInputStream = Components.classes["@mozilla.org/io/string-input-stream;1"].createInstance(Components.interfaces.nsIStringInputStream);
var formData =
"--" + separator + "\r\n" +
"Content-Disposition: form-data; name=\"data\"; filename=\"snapshot_" + filenameTimestamp +
(this.getImageType() === "image/jpeg" ? ".jpg" : ".png") + "\"\r\n" +
"Content-Type: " + this.getImageType() + "\r\n\r\n";
prefixStringInputStream.setData(formData, formData.length);

// write the image data via a binary output stream, to a storage stream
var binaryOutputStream = Components.classes["@mozilla.org/binaryoutputstream;1"].createInstance(Components.interfaces.nsIBinaryOutputStream);
var storageStream = Components.classes["@mozilla.org/storagestream;1"].createInstance(Components.interfaces.nsIStorageStream);
storageStream.init(4096, imgData.length, null);
binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));
binaryOutputStream.writeBytes(imgData, imgData.length);
binaryOutputStream.close();

// write out the rest of the form to another string input stream
var suffixStringInputStream = Components.classes["@mozilla.org/io/string-input-stream;1"].createInstance(Components.interfaces.nsIStringInputStream);
formData =
"\r\n--" + separator + "\r\n" +
"Content-Disposition: form-data; name=\"description\"\r\n\r\n" + description + "\r\n" +
"--" + separator + "--\r\n";
suffixStringInputStream.setData(formData, formData.length);

// multiplex the streams together
var multiStream = Components.classes["@mozilla.org/io/multiplex-input-stream;1"].createInstance(Components.interfaces.nsIMultiplexInputStream);
multiStream.appendStream(prefixStringInputStream);
multiStream.appendStream(storageStream.newInputStream(0));
multiStream.appendStream(suffixStringInputStream);

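// post it (req is assumed to be an XMLHttpRequest created elsewhere;
// description, username and password are likewise defined outside this snippet)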
req.open("POST", "http://yoursite.com/upload_endpoint", true);
req.setRequestHeader("Accept", "*/*, application/xml");
req.setRequestHeader("Content-type", "multipart/form-data; boundary=" + separator);
req.setRequestHeader("Content-length", multiStream.available());
req.setRequestHeader("Authorization", "Basic " + btoa(username + ":" + password));
req.setRequestHeader("User-Agent", "YourUserAgent/1.0.0");
req.send(multiStream);


Update: Spaces removed from multipart boundary per Gijsbert's suggestion in the comments (thanks!).
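
For comparison, the same idea (keep the text parts and the binary part separate, and only join them at the byte level) can be sketched without the Mozilla stream classes. This is not the code from the extension. It assumes a runtime that provides atob, TextEncoder and Uint8Array, and the helper names, the tiny sample data URL and the "receipt" description are all made up for illustration.

```javascript
// Decode a base64 data URL ("data:<type>;base64,<payload>") into raw bytes.
function dataUrlToBytes(dataURL) {
  var binary = atob(dataURL.substring(dataURL.indexOf(",") + 1));
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i); // atob yields one character per byte
  }
  return bytes;
}

// Hypothetical helper: encode the text parts to bytes, then copy
// prefix + image + suffix into one buffer so the image bytes are never
// run through a string operation.
function buildMultipartBody(boundary, filename, contentType, imageBytes, description) {
  var encoder = new TextEncoder();
  var prefix = encoder.encode(
    "--" + boundary + "\r\n" +
    "Content-Disposition: form-data; name=\"data\"; filename=\"" + filename + "\"\r\n" +
    "Content-Type: " + contentType + "\r\n\r\n");
  var suffix = encoder.encode(
    "\r\n--" + boundary + "\r\n" +
    "Content-Disposition: form-data; name=\"description\"\r\n\r\n" + description + "\r\n" +
    "--" + boundary + "--\r\n");
  var body = new Uint8Array(prefix.length + imageBytes.length + suffix.length);
  body.set(prefix, 0);
  body.set(imageBytes, prefix.length);
  body.set(suffix, prefix.length + imageBytes.length);
  return body;
}

// Sample payload: six bytes, including a null byte and 0xFF.
var boundary = "----------12345-multipart-boundary-1";
var imageBytes = dataUrlToBytes("data:image/png;base64,iVBORwD/");
var body = buildMultipartBody(boundary, "snapshot_1.png", "image/png", imageBytes, "receipt");
console.log(imageBytes.length, body.length);
```

The resulting Uint8Array plays the role of the multiplexed stream above: in an environment whose XMLHttpRequest accepts typed arrays, it could be handed to send() with the same Content-type header.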