Django Ditto and archiving your stuff

I recently made a collection of apps for the Django web framework, called Django Ditto. It’s for grabbing your photos from Flickr, tweets from Twitter and bookmarks from Pinboard, and mirroring them all on your own website.

It can save data about:

  • Flickr photos/videos
  • Flickr photosets
  • Pinboard bookmarks
  • Twitter tweets
  • Twitter favorites
  • Last.fm scrobbles (added Feb 2017)

Hopefully it’ll do more in the future.

It can optionally archive copies of your original Flickr photo and video files, and images attached to tweets. It can’t save videos from Twitter as the API no longer allows access to original video files, only a stream.

It can do this for multiple Flickr, Twitter and Pinboard accounts. Private photos, tweets and bookmarks are saved, but Ditto’s views and templates only display public ones; private items are only visible in the Django Admin screens.
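
As a rough illustration of the multiple-accounts and public/private split described above, here is a minimal sketch. It is not Ditto’s actual code: the `Account` and `Item` classes, the `public_items` and `items_for_account` helpers, and the example usernames are all hypothetical, and plain Python dataclasses stand in for what would be Django models in a real app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """One account on a service, e.g. a single Twitter user (hypothetical)."""
    service: str
    username: str


@dataclass(frozen=True)
class Item:
    """A saved photo, tweet or bookmark (hypothetical, not Ditto's real model)."""
    account: Account
    title: str
    is_private: bool = False


def public_items(items):
    """What a public-facing view would show: private items are filtered out."""
    return [item for item in items if not item.is_private]


def items_for_account(items, account):
    """Everything saved for one account, private or not, as an admin screen might list it."""
    return [item for item in items if item.account == account]


# Two accounts on the same service, each with its own saved items.
main = Account("twitter", "example_main")
other = Account("twitter", "example_other")

archive = [
    Item(main, "A public tweet"),
    Item(main, "A private tweet", is_private=True),
    Item(other, "Another public tweet"),
]

# Only the two public tweets would reach the public templates.
visible = public_items(archive)
```

The point of the sketch is that everything is stored, and the filtering happens only at display time, so nothing is lost if an item later becomes public.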

You can go and look at:

This is only of use to people who are developers, and who want to use Django. It’d be nice to make something like this that’s easily usable by non-coders. Maybe one day.

Why did I make this?

I’ve been intending to rebuild my own website for years but the site’s become so large and complicated that this is a daunting feat. When planning to re-make the whole thing in Django, rather than the current nest of Movable Type and PHP, it made sense to split the task into smaller chunks of work.

The first was to combine the aggregation of Flickr photos, tweets, etc. into one system, rather than the differing ways they’re sort-of captured at the moment. I found an existing Django app that did some of this, django-syncr, and started updating it in 2011 but soon ground to a halt, through a combination of work and boredom.

The following year I started again, with a new project, django-archivr, using the bits of django-syncr that I liked but, again, I couldn’t sustain the momentum for more than a couple of months.

Last year I started for a third time, with Django Ditto, and through a combination of greater free time and greater bloody-mindedness, I’ve got somewhere. There’s plenty I want to add, a prospect that doesn’t fill me with joy at the moment, as it’s been a grind, but this is a good start.

As with so many of my projects (like that Ansible stuff, or Twelescreen, or the Mappiness chart), I’ve overdone it. Ditto is probably way over-engineered, and I certainly spent longer than necessary refactoring various bits (which might have improved them).

For example, unlike the previous two abandoned projects, Django Ditto can archive multiple Flickr, Twitter and Pinboard accounts. I don’t need this functionality myself, but it seemed like it could be useful, and so I did the extra work. It’s the kind of feature that’s easier to build in from the start rather than retroactively. If you need it. Maybe someone will.

As ever, I feel the code is terrible, an embarrassment. I’m too much of a self-taught, self-doubting, solo programmer to have much confidence in my skills. But I’ve learned a lot, and some of that purposely; I want my personal projects to stretch me, and give me a chance to learn things that might be useful for paying work. This is the first project on which:

  • I’ve learned how to package a Python module for PyPI (which will mean being stricter with myself about release and version numbering).

  • I’ve written more tests than I ever have before (I’m sure some are awful, but still).

  • I’ve learned how to use tox to run the tests on multiple versions of Python and Django.

  • I’ve used Travis to run the tests when new code is pushed to GitHub.

  • I’ve used coverage locally, and Coveralls online, to see how much of my code is probably covered by the tests.

  • I’ve written documentation using Sphinx to go on Read The Docs, rather than rely on a single overly-long README.

  • I’ve made a demonstration site using the latest release.

None of which is groundbreaking, but it’s mostly new to me. It’s also often quite dull, more dull than it seems a personal, free-time project should be. But trying to improve one’s professional skills sometimes is dull, or frustrating — if it was all easy and fun I’m not sure I’d be learning as much. At least, that’s how I’m justifying spending days writing documentation for something no one else might ever use.

There’s another “Why?” question: “Why does anyone need to mirror all their tweets and photos etc on their own site?” Maybe if one’s Flickr photos only exist on Flickr, it makes sense to make copies of those precious memories. But does anyone need to copy all of their Tweets? Or bookmarks? Or YouTube favourites? Or Foursquare check-ins? Who’s going to look at or care about any of that? Maybe one’s “digital wake” or “digital exhaust” should remain as ephemeral as those metaphors suggest.

I’ve long believed that we should have control over our own digital “stuff”, no matter which commercial service we post it on. In my ideal world, I’ve often thought, everyone would have their own website that would contain copies of everything they posted elsewhere. You should own the photos and jokes and thoughts and videos and events you post onto platforms controlled by companies over which you have no control, that might suddenly vanish. These platforms provide great services, with network effects that achieve so much more than posting things solely on your own website could. But it feels wrong to me to give it all away, sending it into the corporate-controlled ether, without maintaining your own copy.

At least, that’s what I believed, 100%, a few years ago when I first started on all this. Since then, part of me has become less sure. Is it essential to have copies of all this stuff? And if so, does it need to be hosted online, on your website? If you can download a copy of your material, maybe that’s enough (e.g., Twitter’s downloadable archive of your tweets is pretty good).

I still firmly believe that all this stuff needs to be archived permanently somewhere, in a browsable way that’s as close to the original experience as possible, even if that’s difficult. Tweets may seem like ephemeral nonsense, but some of the most fascinating ancient discoveries are “unimportant” things that, at the time, no one would have thought worth keeping.

But is it important to archive all your own ephemeral digital stuff, and do so publicly? Running your own website is often a pain in the arse, and a website that relies on many external APIs and services even more so. Plenty of friends who also work in this field, perfectly capable of building their own website, have no desire to do so, or if they do, keep it as simple as possible.

But I’ve tried not to think too much about these doubts recently. Yes, it’s important to own your material, your self-expression. That’s what I keep telling myself, or I’d never have got this far. Stay on target. Stay on the bus. Archive everything. Publish it. Worry about whether it’s worthwhile later.
