August 3, 2014

Hoardr: Flickr as a cloud backup service

After years of being relegated to nothing more than a container for wacky cat antics, it turns out the humble Gif is capable of more than meets the eye. Specifically, it can carry more than just image data: you can append whatever you want to a Gif and it will still display as normal. This has a fairly scary implication, since it means any Gif on the internet could theoretically be hiding malicious code or other files that you want nothing to do with. A slightly less nefarious and more interesting use case is getting files into or past a system that would not usually accept that filetype. And that’s what Hoardr does: it packs whatever files you want into Gifs and turns your Flickr account into a cloud backup service.

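Hoardr uses zip archives in particular because of how the two formats are read: a Gif decoder stops at the end-of-image marker, while a Zip reader looks for its index at the end of the file. Simply changing the file extension of a combined file from “.gif” to “.zip” allows it to be unzipped as usual.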

If you’d like to play around with this idea but want nothing to do with breaking Flickr’s ToS (keep reading for that), you can use the following commands.

Create a gif/zip combination through concatenation:

cat your_gif_file.gif your_zip_file.zip > output_file.gif

Change the gif into a readable zip:

mv output_file.gif output_file.zip

Unzip the archive to retrieve your files:

unzip output_file.zip
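
For anyone who would rather poke at this from a script, here is a rough Python sketch of the same three steps. It is not part of Hoardr (which is Ruby); the helper name `make_zip` and the contents of `test.txt` are made up for illustration, and the Gif bytes are the tiny sample shown further down in this post. Python's standard `zipfile` module copes with data sitting in front of the archive, so no rename is needed.

```python
import io
import zipfile

# The tiny 20x20 Gif from this post, written out as raw bytes.
TINY_GIF = (
    b"GIF89a\x14\x00\x14\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00!\xF9\x04"
    b"\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x14\x00\x14\x00\x00"
    b"\x02\x11\x84\x8F\xA9\xCB\xED\x0F\xA3\x9C\xB4\xDA\x8B\xB3"
    b"\xDE\xBC\xFB\xAF\x15\x00;"
)


def make_zip(files):
    """Build a zip archive in memory from a {name: bytes} mapping (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Equivalent of: cat your_gif_file.gif your_zip_file.zip > output_file.gif
combined = TINY_GIF + make_zip({"test.txt": b"This is a test file."})

# Equivalent of: unzip output_file.zip (reading the combined bytes directly)
with zipfile.ZipFile(io.BytesIO(combined)) as archive:
    names = archive.namelist()
    restored = archive.read("test.txt")

print(names, restored)
```

Writing `combined` to a file ending in ".gif" should give you something an image viewer opens and an unzip tool extracts.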

Why?

You’re probably wondering why or if this is at all worthwhile, and the truth is it’s more novelty than anything else. However, I have been in Mexico for the past two months and just recently filled up my hard drive. I need to make some room and don’t feel like spending any money. I could pay for a bit of space on a cloud storage service, but that doesn’t sound like fun. Also, there may be rules here against spending your taco budget on computer junk.

How

Here is a brief overview of why this trick works. Most of this is based on Wikipedia articles and short sections of the Gif and Zip specs. Let me know if something I’ve written here doesn’t seem right.

The Gif: Gif files are encoded with a few different data blocks that tell the decoder how the file should be handled. These include a header, a screen descriptor, the image data, some other stuff, and the trailer. This last one – the trailer – is the only block that really matters to our application. It delimits the end of the file, letting the decoder know the gif file is done and it can stop reading.

Shown as ASCII with the non-printable bytes escaped, the raw data for a mostly empty 20x20 Gif looks like this:

GIF89a\x14\x00\x14\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00!\xF9\x04
\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x14\x00\x14\x00\x00
\x02\x11\x84\x8F\xA9\xCB\xED\x0F\xA3\x9C\xB4\xDA\x8B\xB3
\xDE\xBC\xFB\xAF\x15\x00;

“GIF89a” is the header and the final “;” (0x3B) is the trailer. The “\x00” just before it is the terminator of the image data block.
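
To make the block layout a bit more concrete, here is a small Python sketch that reads the first two blocks of the sample above: the 6-byte header and the 7-byte screen descriptor that follows it. The field layout comes from the Gif spec; the variable names are my own.

```python
import struct

# The sample Gif from above, as raw bytes.
TINY_GIF = (
    b"GIF89a\x14\x00\x14\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00!\xF9\x04"
    b"\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x14\x00\x14\x00\x00"
    b"\x02\x11\x84\x8F\xA9\xCB\xED\x0F\xA3\x9C\xB4\xDA\x8B\xB3"
    b"\xDE\xBC\xFB\xAF\x15\x00;"
)

# Header: 3-byte signature "GIF" plus 3-byte version.
signature = TINY_GIF[:6]

# Logical screen descriptor: width and height as little-endian 16-bit values,
# a packed flags byte, the background colour index and the pixel aspect ratio.
width, height, packed, background, aspect = struct.unpack_from("<HHBBB", TINY_GIF, 6)

# Top bit of the flags: a global colour table follows the descriptor.
has_global_table = bool(packed & 0x80)
# Low three bits encode the table size as 2^(n + 1) entries.
table_entries = 2 ** ((packed & 0x07) + 1)

print(signature, width, height, has_global_table, table_entries)
print("last byte:", TINY_GIF[-1:])
```

For this file the descriptor reports a 20x20 canvas with a two-colour global table, and the last byte is the “;” trailer.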

Concatenating a zip archive to this gif and again viewing the ASCII formatted data results in the following:

GIF89a\x14\x00\x14\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00!\xF9
\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x14\x00\x14\x00\x00\x02
\x11\x84\x8F\xA9\xCB\xED\x0F\xA3\x9C\xB4\xDA\x8B\xB3\xDE\xBC
\xFB\xAF\x15\x00;PK\x03\x04\x14\x00\b\x00\b\x00%\xB0\xF8D\x00\x00
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\b\x00\x10\x00test.txtUX\f
\x00\x884\xD5Su\xC8\xD1S\xF5\x01\x14\x00\v\xC9\xC8,V\x00\xA2D\x85
\x92\xD4\xE2\x1\x85\xB4\xCC\x9CT=\x00PK\a\bG\xFEj\xEC\x14\x00\x00
\x00\x14\x00\x00\x00PK\x01\x02\x15\x03\x14\x00\b\x00\b\x00%\xB0
\xF8DG\xFEj\xEC\x14\x00\x00\x00\x14\x00\x00\x00\b\x00\f\x00\x00\x00
\x00\x00\x00\x00\x00@\xA4\x81\x00\x00\x00\x00test.txtUX\b\x00\x884
\xD5Su\xC8\xD1SPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00B\x00
\x00\x00Z\x00\x00\x00\x00\x00

The first chunk, from the header to the trailer, is exactly the same (“GIF89a” to the first semicolon). From there on out we have the data belonging to the zip archive. As explained earlier, the Gif decoder will only read this first chunk and completely ignore the rest. Cool!
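
If you want to see where a decoder would stop, the blocks can be walked one by one until the trailer turns up. The Python sketch below does that under a few assumptions: the input is a well-formed Gif, `gif_end_offset` is a name I made up, and the appended bytes are a stand-in rather than a real archive. Walking the blocks is safer than searching for the first “;”, since 0x3B can also occur inside colour tables or image data.

```python
import struct

TINY_GIF = (
    b"GIF89a\x14\x00\x14\x00\x80\x00\x00\xFF\xFF\xFF\x00\x00\x00!\xF9\x04"
    b"\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x14\x00\x14\x00\x00"
    b"\x02\x11\x84\x8F\xA9\xCB\xED\x0F\xA3\x9C\xB4\xDA\x8B\xB3"
    b"\xDE\xBC\xFB\xAF\x15\x00;"
)


def colour_table_bytes(packed):
    """Size in bytes of a colour table described by a packed flags byte."""
    return 3 * 2 ** ((packed & 0x07) + 1) if packed & 0x80 else 0


def gif_end_offset(data):
    """Return the offset just past the Gif trailer (hypothetical helper)."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a Gif")
    _width, _height, packed = struct.unpack_from("<HHB", data, 6)
    pos = 13 + colour_table_bytes(packed)  # header + screen descriptor + table
    while True:
        block = data[pos]
        if block == 0x3B:  # trailer: the decoder is done
            return pos + 1
        if block == 0x21:  # extension: introducer and label, then sub-blocks
            pos += 2
        elif block == 0x2C:  # image descriptor, optional local table, LZW size
            pos += 10 + colour_table_bytes(data[pos + 9]) + 1
        else:
            raise ValueError("unexpected block type at offset %d" % pos)
        while data[pos] != 0:  # each sub-block starts with its own length
            pos += data[pos] + 1
        pos += 1  # skip the zero-length block terminator


combined = TINY_GIF + b"PK\x03\x04 pretend this is a zip archive"
end = gif_end_offset(combined)
payload = combined[end:]  # everything a Gif decoder never looks at
print(end, payload[:4])
```

For the sample Gif the walk ends at offset 58, which is the full length of the original image; everything after that is the hidden payload.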

The Zip: Now for the zip file. We understand now that the concatenated data above can still be read as a Gif file, but you’re probably wondering how we can do the reverse and read it as a Zip file. And that’s actually the answer: Zips are read in reverse. How freaking cool. A Zip reader starts at the end of the file, where it looks for the End of Central Directory record (signature “PK\x05\x06”). That record points to the Central Directory (entries marked “PK\x01\x02”), which stores the name and byte offset of every file in the archive. Each offset leads to a local file header, which begins with the signature 0x04034b50, or “PK\x03\x04” as bytes, followed by that file’s compressed data. Just as the Gif decoder ignores all data past the trailer, the Zip reader ignores anything before the first local file header. The stored offsets are now off by the length of the Gif, but common tools such as unzip notice the extra leading bytes and compensate.
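
Reading from the back can be sketched in Python as well. The record layout below follows the Zip spec's End of Central Directory record; the function names, the sample file, and the 58-byte stand-in prefix are my own assumptions. The sketch also ignores archive comments and Zip64, so treat it as an illustration rather than a robust parser.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"


def find_eocd(data):
    """Scan backwards for the End of Central Directory record and unpack it."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("no End of Central Directory record found")
    # signature, disk numbers (2), entry counts (2), directory size,
    # directory offset, comment length: 22 bytes in total.
    fields = struct.unpack_from("<4sHHHHIIH", data, pos)
    return {
        "position": pos,
        "entries": fields[4],
        "directory_size": fields[5],
        "directory_offset": fields[6],
    }


def prepended_bytes(data):
    """How many bytes sit in front of the archive (0 for a plain zip)."""
    eocd = find_eocd(data)
    # The directory ends where the EOCD record starts, so its real start is
    # position - size; the stored offset is relative to the archive itself.
    return eocd["position"] - eocd["directory_size"] - eocd["directory_offset"]


# Build a small archive in memory, then put 58 bytes in front of it,
# the same length as the sample Gif in this post.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("test.txt", b"This is a test file.")
zip_bytes = buffer.getvalue()
prefix = b"GIF89a" + bytes(52)  # stand-in for a real Gif

print(prepended_bytes(zip_bytes))
print(prepended_bytes(prefix + zip_bytes))
```

The difference between where the directory actually sits and where the record says it sits is exactly the length of whatever was prepended, which is how a reader can correct the stored offsets.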

After we do the concatenation of Zip to Gif we are left with an awesome Jekyll and Hyde / Frankenstein file. Because one is read forwards and the other backwards, there is no problem interpreting it as either filetype. Tell your filesystem it’s a Gif and it has no problem playing or opening it; tell it it’s a Zip and you can unzip as usual.

[Screencast]

Hoardr

Hoardr itself is a simple Ruby class that takes advantage of the above trick and pushes the resulting Gif to a Flickr account. I decided on Flickr because they recently gave all users one terabyte of free space for photo storage – perfect for backing up a hard drive!

Some notes on how it works and usage:

Flickr limits photos to 200 megabytes. To account for this, Hoardr uploads up to 200 MB worth of files at a time from your specified directory. Any single file larger than this limit is ignored. What I would like to do is split larger files into pieces and spread them across multiple uploads. I haven’t had the time to implement this yet, as it would change the way archived files are logged so they can be rebuilt upon retrieval.
The files are encrypted with AES-256 before being added to the archive. The archive then gets concatenated with the provided Gif, and the image is sent to Flickr marked as private. You are asked for the encryption key on both archiving and downloading.
Filenames are hashed and stored in a local text file. This allows for easier retrieval later. The hashed filenames are also stored as tags on the uploaded image, allowing for searches if necessary.
Gifs used in the concatenation need to be at least 20x20 pixels. Flickr won’t accept anything smaller. I’ve been using the smallest files possible in the backups I’ve been making. You can of course use anything you want, but it’s nice to save a few more bytes since there’s a 200 MB file limit.
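
The batching and tagging described in these notes can be sketched roughly as follows. This is Python for illustration, not Hoardr's actual Ruby code. Several things here are assumptions: the function names, reading "200 megabytes" as 200 MiB, and SHA-256 as the hash (the post does not say which one is used). The size of the Gif and the zip overhead are ignored, and encryption is left out entirely.

```python
import hashlib

# Assumption: Flickr's "200 megabytes" is taken as 200 MiB here.
UPLOAD_LIMIT = 200 * 1024 * 1024


def plan_batches(sizes, limit=UPLOAD_LIMIT):
    """Group files into upload batches that each stay within the limit.

    sizes maps a filename to its size in bytes. Returns (batches, skipped),
    where skipped lists single files that are too large for any upload.
    """
    batches, skipped = [], []
    current, current_size = [], 0
    for name, size in sizes.items():
        if size > limit:
            skipped.append(name)  # too big on its own, so it is ignored
            continue
        if current and current_size + size > limit:
            batches.append(current)  # close the batch and start a new one
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches, skipped


def filename_tag(name):
    """Hash a filename so it can be logged locally and used as a photo tag."""
    # Assumption: SHA-256; any stable hash would serve the same purpose.
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


example = {"a.mov": 120, "b.psd": 90, "c.iso": 300, "d.txt": 50}
batches, skipped = plan_batches(example, limit=200)
print(batches, skipped)
print(filename_tag("a.mov"))
```

With the toy sizes and a limit of 200, the files end up in two batches and the oversized one is skipped. A real version would also need to leave room for the Gif itself in each upload.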

Next I plan on porting the functionality to OS X as a GUI app as a way of introducing myself to Swift.

If you want to use or contribute to Hoardr you can check out the source and documentation here.
