FFFFOUND! is shutting down on 8th May. I haven’t used it for ages but I saved over 500 images in there a few years ago. I couldn’t find a backup tool that worked for me so I adapted one to make my own.
Andy Baio wrote a good post about how poorly the shutdown of FFFFOUND! has been handled, so let’s take that as read.
FFFFOUND! has no API so getting images off it involves scraping the site’s pages. I tried a couple of scripts, like the three-year-old ffffexport which stopped abruptly after a few pages.
I looked at some other old scripts, some of which seemed unfinished and others only fetched the images. Only fetching the images doesn’t seem like enough — I want to know where they were originally, when they were saved, that kind of thing. Metadata!
To make HTML pages, a bit like those on FFFFOUND! itself, for browsing the images, including the information about the images.
All the data saved in a machine-readable format. I thought I might want to write code to upload the images to Pinboard (or wherever) at some point, and this would save having to scrape my new HTML pages all over again.
This was just going to be a really quick hack but I’ve ended up spending a whole Saturday on it. The script seems to works now. I’ve downloaded a couple of entire archives successfully. It creates pages that look like this second screenshot, plus a single JSON file including the URLs, page titles, and the local filenames of all the images.
It’s been lovely to look through it all again. I have no memory of most of these images, from only a few years ago. It’s like finding an old shoebox of photos and cuttings.
Getting this to work I spent a lot of time struggling with character encoding and it’s not perfect but enough was enough. It would probably have been quicker to start from scratch. But that’s the wisdom-of-having-done-the-work talking. Hopefully this vaguely embarrassing code will be useful to someone else.
Commenting is disabled on posts once they’re 30 days old.