FFFFOUND! export script

FFFFOUND! is shutting down on 8th May. I haven’t used it for ages but I saved over 500 images in there a few years ago. I couldn’t find a backup tool that worked for me so I adapted one to make my own.

Andy Baio wrote a good post about how poorly the shutdown of FFFFOUND! has been handled, so let’s take that as read.

Screenshot of the webpage

FFFFOUND! has no API, so getting images off it involves scraping the site’s pages. I tried a couple of scripts, like the three-year-old ffffexport, which stopped abruptly after a few pages.
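For anyone unfamiliar with this kind of scraping, here is a minimal sketch of the paging loop such a script needs. It is not the code from any of the scripts mentioned here. The `fetch_page` callable is a hypothetical stand-in for whatever downloads and parses one listing page, and the offset-based paging with 25 items per page is an assumption, not something checked against the site.

```python
from typing import Callable, Iterator, List


def iter_items(fetch_page: Callable[[int], List[str]],
               page_size: int = 25) -> Iterator[str]:
    """Yield items from successive listing pages until one comes back empty.

    fetch_page is a hypothetical function that takes an offset and returns
    the items found on that page. page_size is an assumed value.
    """
    offset = 0
    while True:
        items = fetch_page(offset)
        if not items:
            # An empty page is treated as the end of the archive.
            break
        yield from items
        offset += page_size
```

Passing the fetcher in as an argument keeps the loop testable without touching the network, which helps when a script stops early and you need to find out why.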

I looked at some other old scripts. Some seemed unfinished and others only fetched the images. Only fetching the images doesn’t seem like enough: I want to know where they were originally, when they were saved, that kind of thing. Metadata!

This script by Aaron Scott Hildebrandt mostly worked for me but only fetched the images. So I started adapting that to do more. I also wanted:

  • To make HTML pages, a bit like those on FFFFOUND! itself, for browsing the images, including the information about the images.

  • All the data saved in a machine-readable format. I thought I might want to write code to upload the images to Pinboard (or wherever) at some point, and this would save having to scrape my new HTML pages all over again.
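To illustrate the first of those wishes, here is a rough sketch of turning a list of image records into a simple browsable page. It is not the actual script. The `render_page` function and the record keys (`filename`, `source_url`, `title`) are hypothetical names chosen for the example.

```python
import html


def render_page(records, heading="Saved images"):
    """Return a minimal HTML page listing each image with its metadata.

    Each record is assumed to be a dict with the hypothetical keys
    'filename', 'source_url' and 'title'.
    """
    items = []
    for rec in records:
        # Escape everything that came from the scraped site before it
        # goes into attributes or text.
        items.append(
            '<li><img src="{src}" alt=""><br>'
            '<a href="{url}">{title}</a></li>'.format(
                src=html.escape(rec["filename"]),
                url=html.escape(rec["source_url"]),
                title=html.escape(rec["title"]),
            )
        )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>{h}</title></head>\n<body><h1>{h}</h1>\n"
        "<ul>\n{items}\n</ul>\n</body></html>\n"
    ).format(h=html.escape(heading), items="\n".join(items))
```

Escaping matters here because page titles scraped from other people's sites can contain ampersands, quotes and angle brackets.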

Screenshot of the webpage

This was just going to be a really quick hack but I’ve ended up spending a whole Saturday on it. The script seems to work now. I’ve downloaded a couple of entire archives successfully. It creates pages that look like this second screenshot, plus a single JSON file including the URLs, page titles, and the local filenames of all the images.
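As a hedged sketch of what that JSON file makes possible, the following writes a list of records to disk and reads it back. The key names and both helper functions are made up for the example and may not match what the real script produces.

```python
import json
from pathlib import Path


def save_records(records, path):
    """Write the records to a single UTF-8 JSON file."""
    # ensure_ascii=False keeps non-ASCII titles readable in the file.
    text = json.dumps(records, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def load_records(path):
    """Read the records back, e.g. to re-upload the images elsewhere."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical record shape: source URL, page title, local filename.
example = [
    {
        "source_url": "http://example.com/post",
        "title": "An example page",
        "filename": "images/0001.jpg",
    }
]
```

Anything that later wants to re-use the archive only has to call `load_records` and loop over the result, with no HTML scraping involved.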

It’s been lovely to look through it all again. I have no memory of most of these images, from only a few years ago. It’s like finding an old shoebox of photos and cuttings.

I spent a lot of time struggling with character encoding to get this working. It’s still not perfect, but enough was enough. It would probably have been quicker to start from scratch. But that’s the wisdom-of-having-done-the-work talking. Hopefully this vaguely embarrassing code will be useful to someone else.
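One common way to cope with scraped text of unknown encoding is to try a few likely encodings in order and fall back to a lossy decode. This is a generic sketch, not what the script does. The `decode_bytes` name and the default list of encodings are assumptions for illustration.

```python
def decode_bytes(raw, encodings=("utf-8", "cp1252")):
    """Decode raw bytes by trying each candidate encoding in turn.

    The candidate list is an assumption. Adjust it to whatever the
    scraped pages are likely to use.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

The order matters: strict UTF-8 rejects most text in other encodings, so it is a reasonable first guess, while single-byte encodings accept almost anything and belong at the end.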

ffffound-export on GitHub

22 Apr 2017 at Twitter

  • 05:58am: @mildlydiverting I’ve nearly got a nice script working if you want me to grab your stuff too?
  • 01:15pm: @mildlydiverting Will do! Just struggling with character encodings at the moment. Grrr.
  • 04:45pm: “I won’t spend more than an hour fiddling with this code,” is what I said to myself half the weekend ago.
  • 08:22pm: A Python script for fetching all your https://t.co/RSx0PiLhG2 images, making pages for them, and saving data in JSON https://t.co/xyDmLSmVES
  • 08:23pm: There are various others (this is based on one) but I either couldn’t get them to work properly or wanted something different.
  • 08:24pm: The code is a mess and I’d do it all differently if starting from scratch. Just like every quick hacky project. Hey ho.