FFFFOUND! export script

FFFFOUND! is shutting down on 8th May. I haven’t used it for ages but I saved over 500 images in there a few years ago. I couldn’t find a backup tool that worked for me so I adapted one to make my own.

Andy Baio wrote a good post about how poorly the shutdown of FFFFOUND! has been handled, so let’s take that as read.

FFFFOUND! has no API so getting images off it involves scraping the site’s pages. I tried a couple of scripts, like the three-year-old ffffexport which stopped abruptly after a few pages.

I looked at some other old scripts. Some seemed unfinished and others only fetched the images. Only fetching the images doesn’t seem like enough: I want to know where they were originally, when they were saved, that kind of thing. Metadata!

This script by Aaron Scott Hildebrandt mostly worked for me but only fetched the images. So I started adapting that to do more. I also wanted:

To make HTML pages, a bit like those on FFFFOUND! itself, for browsing the images, including the information about the images.
All the data saved in a machine-readable format. I thought I might want to write code to upload the images to Pinboard (or wherever) at some point, and this would save having to scrape my new HTML pages all over again.

This was just going to be a really quick hack but I’ve ended up spending a whole Saturday on it. The script seems to work now. I’ve downloaded a couple of entire archives successfully. It creates pages that look like this second screenshot, plus a single JSON file including the URLs, page titles, and the local filenames of all the images.
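
As a rough illustration of the kind of JSON file described above, here is a minimal sketch. It is not the script's actual format: the function names and the field names (`source_url`, `page_title`, `local_filename`) are assumptions made up for this example.

```python
import json


def build_record(source_url, page_title, local_filename):
    """Collect the metadata for one saved image (hypothetical field names)."""
    return {
        "source_url": source_url,
        "page_title": page_title,
        "local_filename": local_filename,
    }


def records_to_json(records):
    """Serialise all records as one JSON document.

    ensure_ascii=False keeps non-ASCII page titles readable in the file.
    """
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Placeholder values, not real FFFFOUND! data.
    records = [build_record("https://example.com/post", "An example page", "0001.jpg")]
    print(records_to_json(records))
```

Keeping everything in one list of flat records means a later upload script could loop over the file without having to scrape the generated HTML pages.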

It’s been lovely to look through it all again. I have no memory of most of these images, from only a few years ago. It’s like finding an old shoebox of photos and cuttings.

Getting this to work I spent a lot of time struggling with character encoding and it’s not perfect but enough was enough. It would probably have been quicker to start from scratch. But that’s the wisdom-of-having-done-the-work talking. Hopefully this vaguely embarrassing code will be useful to someone else.
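
One common way to cope with scraped text in unknown encodings is to try a few candidates in order and fall back to lossy decoding. This is only a sketch of that general approach, not what the script does: the function name and the default list of encodings are assumptions.

```python
def decode_best_effort(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes by trying each candidate encoding in turn.

    The candidate list is an assumption for illustration; pick encodings
    that match the pages actually being scraped.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: keep what we can, marking bad bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_best_effort("café".encode("utf-8")))
    print(decode_best_effort("café".encode("cp1252")))
```

The order matters: UTF-8 goes first because it rejects most text written in other encodings, whereas a single-byte encoding will accept almost any input and may silently produce the wrong characters.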

ffffound-export on GitHub

Saturday 22 April 2017, 9:00pm

22 Apr 2017 at Twitter

8:24pm: The code is a mess and I’d do it all differently if starting from scratch. Just like every quick hacky project. Hey ho.
8:23pm: There are various others (this is based on one) but I either couldn’t get them to work properly or wanted something different.
8:22pm: A Python script for fetching all your Ffffound.com images, making pages for them, and saving data in JSON github.com/philgyford/fff…
4:45pm: “I won’t spend more than an hour fiddling with this code,” is what I said to myself half the weekend ago.
1:15pm: @mildlydiverting Will do! Just struggling with character encodings at the moment. Grrr.
5:58am: @mildlydiverting I’ve nearly got a nice script working if you want me to grab your stuff too?
