Describing events in code

In my previous post I wrote about cataloguing 28 years’ worth of my visits to see movies, plays, gigs, exhibitions, etc. In this post I’m writing about some of the code structure, the decisions I made, and the changes I made as I went along. It might be useful or interesting to someone else, or help me remember why some things are how they are. Maybe the thought processes are interesting to people who don’t write code themselves.

The code is an “app” (a modular chunk of code) for the Django framework, called django-spectator. It’s in two parts: one for recording what books and periodicals I’ve read (used here) and one for recording events I’ve been to (used here).

The reading part was built to replicate some old PHP code I’d been using for years so I had a clear idea of how it should work. The events part was new and I ended up changing my mind about its structure while writing it. And then, once it was “finished” and I started using it, I realised there were still more things I hadn’t done well and needed changing. I hope no one else has been using the code during this time.

The code itself isn’t terribly advanced but I still find some of this interesting — the compromises and balances required in representing real-world objects and events in code, in a way that’s useful and, to the end user (me), usable. The balance would be different in many cases if this code was for a large commercial site, and/or for different users, instead of only being for my personal website.

Rather than litter this write-up with code samples, I’ll link to the code on GitHub, using the current commit at the time of writing, for anyone who wants more detail.

§ The basics

There are some things that remained fairly consistent throughout my changes, so I’ll describe them first. In the Django code these are all models, each one representing a database table. Each model represents a “thing” in the world, often an object, but maybe a single concept (like Event and EventRole, below).

Creators

The Creator model represents an individual or a group. e.g. “William Shakespeare” or “Belle and Sebastian”. Someone/people who wrote something, performed in something, directed something, etc. (On GitHub.) Each Creator has a kind field that indicates if it’s an individual or a group.

(A “field” represents a piece of information about a particular thing. e.g. Creators also have a field for name. Each field is a column in a database table, like a column in a spreadsheet.)

If we were getting complicated we could have People and Groups as separate models, with a Group containing zero or more People. This might be useful if you went to see the group Ben Folds Five one year and then the next saw Ben Folds on his own — you might want to represent that they are linked, or that you had now seen Ben Folds twice, once as part of a group. Or you might want to record every member of a group you’d seen.

But this is more complex than is useful to me and so I conflated the two, which is good enough almost all the time. There are some odd things… Is “Nick Cave & The Bad Seeds” a group, or an individual and a group? I probably change my mind over this kind of thing from one day to the next, but it’s not a big deal.

Creators are shared between both the reading and events parts of the code. So I could see David Byrne in concert and read a book by him, and they’re both linked to the same Creator object.

Creators have a single name field, which avoids these problems with using firstname and surname and means it can also be used if the Creator is a group. Because we might want to sort Creators-who-are-people by surname there’s also a separate field (on GitHub) that stores an automatically-generated “surname, firstname” style string in a manner that is good-enough-for-me but would, I imagine, contain a nightmare of edge cases in a real, international, application.
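As a rough illustration of that sort string, here is a minimal plain-Python sketch. It is not the code from django-spectator: the function name make_sort_name, the kind values, and the rule of moving a leading “The” to the end for groups are all my own assumptions, and it deliberately ignores the edge cases mentioned above.

```python
def make_sort_name(name: str, kind: str = "individual") -> str:
    """Return a string suitable for alphabetical sorting.

    Hypothetical helper for illustration only; real names need far
    more care than this.
    """
    name = name.strip()

    if kind != "individual":
        # Assumption: for groups, move a leading "The" to the end.
        if name.lower().startswith("the "):
            return f"{name[4:]}, {name[:3]}"
        return name

    parts = name.split()
    if len(parts) < 2:
        # Single-word names ("Prince") are left alone.
        return name

    # Naively treat the last word as the surname.
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


shakespeare = make_sort_name("William Shakespeare")
belle = make_sort_name("Belle and Sebastian", kind="group")
goats = make_sort_name("The Mountain Goats", kind="group")
```

Treating the last word as the surname is exactly the kind of shortcut that breaks on real data: “Vincent van Gogh” comes out as “Gogh, Vincent van”, which is arguably not where you’d look for him.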

Venues

A Venue is a place where an Event happened. A cinema, theatre, ferry, a street, etc. (On GitHub.)

Venues have a country field (because I wanted to see how many countries I’d been to events in), a latitude and a longitude (because I want to put these on maps), and an address (which I lazily populate with some barely-useful information using a Google API and code similar to this). (On GitHub.)

Pretty simple. Too simple, as it turns out below.

Events

The Event model records a visit to see something on a specific date at, optionally, a specific Venue. (On GitHub.) A Venue is optional for an Event because sometimes I knew when I’d been to something but I had no record or memory of where it was.

I can imagine adding an optional time field in future, if only so that multiple Events on the same day can be displayed in the correct order.

An Event can be one of several kinds: Cinema, Classical Concert, Comedy, Dance, Exhibition, Gig, Theatre or Other. This isn’t very flexibly done and, having input all my data, I now want to add “Talk”. Other people will have other requirements. Still, this lets us display particular kinds of Event grouped together, which feels important.

§ Events vs Works

That’s all good but we’ve yet to link Creators to Events. I realised there are two different ways this can be done.

The simplest is for zero or more Creators to be linked directly to an Event. For example, if you go to see The Mountain Goats play live, you’re actually seeing The Mountain Goats right there, at the event. That’s easy. We can link their Creator object to this particular Event object.

The only addition to this is that rather than linking Creators to Events directly, I’m using a “through model” (EventRole) that describes the link itself — how it should be described and what order it should be displayed in (on GitHub). So we can say that at this particular Event, The Mountain Goats were headlining, and so should be listed first, while Emmy the Great should be described as “Support” and appear down the bill.
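To make the “through model” idea concrete, here is a small sketch using plain Python dataclasses rather than real Django models. The field names role_name and role_order, the billing() method, and the date are assumptions of mine for illustration, not taken from the actual code.

```python
import datetime
from dataclasses import dataclass, field


@dataclass
class Creator:
    name: str
    kind: str = "individual"  # or "group"


@dataclass
class EventRole:
    """Describes the link between one Creator and one Event."""
    creator: Creator
    role_name: str = ""  # e.g. "Support"; empty for a headliner
    role_order: int = 1  # lower numbers are listed first


@dataclass
class Event:
    date: datetime.date
    kind: str
    roles: list[EventRole] = field(default_factory=list)

    def billing(self) -> list[str]:
        """Creators in display order, with their role if they have one."""
        ordered = sorted(self.roles, key=lambda r: r.role_order)
        return [
            f"{r.creator.name} ({r.role_name})" if r.role_name else r.creator.name
            for r in ordered
        ]


gig = Event(
    date=datetime.date(2000, 1, 1),  # placeholder date, not a real gig
    kind="gig",
    roles=[
        EventRole(Creator("Emmy the Great"), role_name="Support", role_order=2),
        EventRole(Creator("The Mountain Goats", kind="group"), role_order=1),
    ],
)
lineup = gig.billing()
```

The point of the extra object is that the description and the ordering belong to the link, not to the Creator: the same Creator could be “Support” at one Event and the headliner at another.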

Movies

However… if you go to see the movie Dunkirk you didn’t actually see the director Christopher Nolan or any of the actors, all of whom you might want to record in the event’s data, depending on your diligence/masochism. You could call the Event itself “Dunkirk” and add Christopher Nolan to it, as we did with The Mountain Goats. But what if you go to see Dunkirk again? You could do the same with a second Event but there’s no nice way of grouping both visits together other than by finding both Events with a name of “Dunkirk”. Which is OK, so long as you never see a play called “Dunkirk”, or that 1958 film of the same name.

So it seems we need a separate Movie model. We could link Creators (like Christopher Nolan) to this. And then link a Movie to an Event. So when you create a new Event object you can say “it featured this Dunkirk Movie”. We could then see all the Events connected to that one Movie. And we could list all the Christopher Nolan Movies we’ve seen.

We’ll also add a MovieRole “through model”, like we did with Events, to indicate that, in this case, Nolan was the “Director” and should be listed ahead of anyone else. Very good.

Plays

Now, what about plays? Maybe we could do similar here and have a Play model, with Creators linked directly to that. This doesn’t quite work though. If you go to see King Lear you’d make a new Play object and assign William Shakespeare to it as “Playwright”. And this was a production by the Royal Shakespeare Company so you could add them as a Creator too. And then credit Antony Sher as playing Lear. It’s all looking good…

…until a few years later, when you go to see King Lear performed by a local school, so you create a new Event, and add your “King Lear” Play to it… and realise that while it correctly lists Shakespeare as the playwright, it also lists the RSC and Antony Sher, neither of whom were at this performance in a school hall. Ah.

So maybe the Performance of a Play is a separate thing? The Event would have a Performance linked to it (with the RSC and Sher attached) and this would have the “King Lear” Play linked to it (with Shakespeare attached).

A chain of Event > Performance > Play. This seems to work.

But then… what if you thought Sher was so good that you went to see the RSC’s production several times? It doesn’t seem quite accurate enough to list all these Performances separately with no real relationship between them. So maybe we need a Production! A Production could have several Performances of the same Play. The Production would have the RSC attached while the Performances would have Sher attached, or his understudy if he was ill.

We’d have Event > Performance > Production > Play.

This seems reasonably accurate. It could be “better” though… what if you saw a version of King Lear that was adapted by someone else? Maybe you saw it translated into another language. So we’d need to have a way of having “versions” of the same Play, each with additional roles (e.g. “Translator”).

So this could be improved, and made more detailed, but… No, stop! It’s already too much you fool!

If this was a larger website, with lots more data, maybe this would be required. But for my purposes it’s too complicated, and it gets too fiddly for inputting the data. I did start off with the Event > Performance > Play structure but that was too much for my needs and so I simplified things.

I went back to the start of this process and ended up with a Play (King Lear by William Shakespeare) which can be linked to an Event (that features the RSC and, if I entered him, Antony Sher).

It’s not perfect, but it works OK for my needs. A compromise in favour of simplicity and ease-of-use.

Classical works and dance pieces

I don’t see many of these but I think I started off with a similar structure to Plays, having performances of particular works (like Music for 18 Musicians). While this would be required for some websites (or still be too simple) it was overkill for me. So I simplified it to having a ClassicalWork (e.g. with “Composer” Steve Reich) linked directly to an Event (with the performers attached to that). And similar for DancePieces.

More simplifying

After simplifying plays a bit, this is where I was: An Event could have one or more of these things attached to it, depending on whether it had a kind of “Cinema”, “Theatre”, “Classical Concert” or “Dance”:

Movie
Play
ClassicalWork
DancePiece

Each of those has its own “through model” linking Creators to it: MovieRole, PlayRole, ClassicalWorkRole, DancePieceRole.

This worked fine, for my needs. But I realised it was still unnecessarily restrictive and fiddly for two reasons.

First, what if I went to an Event that featured a performance of a classical work and then a piece of dance? In my system an Event of a particular kind could only have the matching type of work added to it. It seemed logical at first but it was an unnecessary restriction. So I removed it. Now, the Event kind is more of a guide as to how we could split Events up when displaying them. We can list “Cinema” Events separately from “Theatre” Events. And a “Cinema” Event could feature a DancePiece as well as a Movie. That’s fine. No one will die.

Second, having simplified how I represented plays, classical works and dance pieces (removing the Performance model between them and Events) they were all pretty similar to each other, and to Movies. It now seemed overly-complicated in the website’s admin pages to list forms for these different models separately. So I eventually ended up conflating them all into a single Work model, which has its own kind field (“Classical work”, “Dance piece”, “Movie” or “Play”). And a single WorkRole “through model” associating Creators to it. (On GitHub.)

This is much simpler. We have an Event which can optionally have Creators associated directly with it — people or groups who were actually there (e.g. musicians at a gig, actors at a play, film directors at an after-movie-discussion, etc.). And then the Event can optionally have a variety of Works linked to it, each of which can optionally have Creators associated with them (e.g. playwright, director, etc.).
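Sketched as plain Python dataclasses (again, not the real Django models), the simplified structure looks something like this. The single Role class standing in for both EventRole and WorkRole, the lowercase kind strings, the events_featuring() helper and the dates are all my own simplifications for illustration.

```python
import datetime
from dataclasses import dataclass, field


@dataclass
class Creator:
    name: str


@dataclass
class Role:
    """Stand-in for both EventRole and WorkRole in this sketch."""
    creator: Creator
    role_name: str = ""
    role_order: int = 1


@dataclass
class Work:
    title: str
    kind: str  # assumed values: "classicalwork", "dancepiece", "movie", "play"
    roles: list[Role] = field(default_factory=list)


@dataclass
class Event:
    date: datetime.date
    kind: str  # only a guide for display; it doesn't restrict the Works
    roles: list[Role] = field(default_factory=list)  # who was actually there
    works: list[Work] = field(default_factory=list)  # what was performed/shown


def events_featuring(work: Work, events: list[Event]) -> list[Event]:
    """All the Events at which this particular Work featured."""
    return [e for e in events if any(w is work for w in e.works)]


lear = Work(
    "King Lear", "play",
    roles=[Role(Creator("William Shakespeare"), "Playwright")],
)

# Placeholder dates; only the structure matters here.
rsc_visit = Event(
    datetime.date(2000, 1, 1), "theatre",
    roles=[Role(Creator("Royal Shakespeare Company"))],
    works=[lear],
)
school_visit = Event(datetime.date(2001, 1, 1), "theatre", works=[lear])

lear_visits = events_featuring(lear, [rsc_visit, school_visit])
```

Shakespeare is attached once, to the Work, so he shows up for both visits. The RSC is attached only to the Event they actually performed at, so the school production doesn’t inherit them.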

Looking back, it feels silly that I made things so complicated to start off with. This is the kind of rabbit hole you can end up going down when you want to accurately model the world. The real world is complex and often doesn’t map well to the more binary, logical world of code. It’s easy to over-do this process (particularly when working on your own!) and end up with a more “accurate” but unwieldy mess that’s hard to maintain, hard to work with, and unnecessarily fiddly for the end user to use. And it’s never quite right.

The new structure is certainly good enough for my needs, and it’s relatively simple to enter data. Finally.

Oh, but what about if you go to see a movie that’s based on a play? Isn’t that like a “performance” of a play? Or, what about if you see a filmed version of a live performance of a play? What even is that? A play? A movie? What is the Work there?

It’s never quite right. Compromise.

§ Venues

I mentioned Venues earlier and they seem pretty simple. A name and a location. A place where Events happen.

Changing names

However, as soon as I started entering data from 28 years ago into my site I realised the biggest difficulty… Venues change over time. I’m not sure why I didn’t think of this initially.

If you’re entering data about a visit to see a movie at the MGM Trocadero in 1995 you want it to appear on the site as being at the MGM Trocadero. But later, when you enter data about seeing a movie at the same cinema, that’s now known as the UGC Shaftesbury Avenue, you want that visit to say “UGC Shaftesbury Avenue”. And you don’t want the previous visit to change from being at “MGM Trocadero”. So, they have the different names, but they feel very much like the same Venue.

There is a slightly philosophical question here… What does it take to change one venue into an entirely new venue? Some options:

1. When it changes name and branding. e.g. from “MGM Trocadero” to “Virgin Trocadero”.
2. When it keeps the same name but has its interior reconstructed. e.g. it splits one screen into two or three.
3. When it changes name and branding and has its interior reconstructed. e.g. Cineworld Shaftesbury Avenue becoming Picturehouse Central.
4. When it’s roughly the same location, with the same company, but in an entirely different building. e.g. the RSC performed in the temporary Courtyard Theatre (on the site of The Other Place) for a few years while its nearby theatres were redeveloped. That was then replaced by a new The Other Place on the same spot.
5. When it temporarily moves to a new location. e.g. the Almeida theatre “moved” to near King’s Cross while its permanent building was being renovated.
6. When it permanently moves to a new location. e.g. The Odeon in Colchester was replaced with a new Odeon Colchester round the corner from the previous version in 2002.

I’ve tended to treat the first three of these cases as being the same venue over time. Even the radical transformation of the Trocadero cinema into Picturehouse Central feels, just about, like going to the same venue. On the other hand options 4-6 feel like they create separate venues.

So, given options 1-3, we need a way to keep track of a Venue’s different names over time. I didn’t think of that when I started.

The “proper” way to do this would be, I think, to have a separate model like VenueName which would have fields for name, start_date, end_date and a link to a Venue object. Any Event that occurred at a Venue would be displayed using the VenueName related to the date.
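For the sake of comparison, here is a sketch of what that lookup might involve, in plain Python rather than Django. The name_on() helper is hypothetical, open-ended dates are represented with None as an assumption of mine, and the changeover date is invented purely for illustration; I don’t know when the real name changes happened, which is rather the point.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class VenueName:
    name: str
    # None means "unknown or open-ended" in this sketch.
    start_date: Optional[datetime.date] = None
    end_date: Optional[datetime.date] = None


def name_on(names: list[VenueName], day: datetime.date, fallback: str = "") -> str:
    """Return the name that was in use on the given day."""
    for vn in names:
        starts_ok = vn.start_date is None or vn.start_date <= day
        ends_ok = vn.end_date is None or day <= vn.end_date
        if starts_ok and ends_ok:
            return vn.name
    return fallback


# The changeover date here is made up, not the cinema's real history.
names = [
    VenueName("MGM Trocadero", end_date=datetime.date(1996, 12, 31)),
    VenueName("UGC Shaftesbury Avenue", start_date=datetime.date(1997, 1, 1)),
]

name_in_1995 = name_on(names, datetime.date(1995, 6, 1))
name_later = name_on(names, datetime.date(2003, 6, 1))
```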

However, this requires knowing exactly, or even roughly, when a venue changed its name, and adds to the complexity of data entry. This is one of those things it’d be worth doing for a bigger site, with more users, that had to be more robust. But for me it seemed like overkill.

I ended up with another compromise. Each Event has a field for venue_name. When creating a new Event the current name of the linked Venue is copied to that field. So, assuming Events are entered in chronological order, you can change the name of a Venue as you work forward through time, and each Event will bear the “current” name. No matter how often you change a Venue’s name, the venue_name saved with existing Events won’t change.
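The copy-on-create behaviour can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how the Django code does it: using a dataclass __post_init__ hook is my own stand-in, and the days and months in the dates are placeholders.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class Venue:
    name: str


@dataclass
class Event:
    date: datetime.date
    venue: Optional[Venue] = None  # a Venue is optional for an Event
    venue_name: str = ""

    def __post_init__(self) -> None:
        # Snapshot the Venue's name once, when the Event is created.
        # Later changes to venue.name don't touch this copy.
        if self.venue is not None and not self.venue_name:
            self.venue_name = self.venue.name


cinema = Venue("MGM Trocadero")
first_visit = Event(datetime.date(1995, 1, 1), venue=cinema)

# Working forward through time: rename the Venue, then add the later visit.
cinema.name = "UGC Shaftesbury Avenue"
second_visit = Event(datetime.date(2000, 1, 1), venue=cinema)
```

Both Events still point at the same Venue object, so they can be grouped together, but each displays the name it was created with.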

The only downside is that if you need to change the historical venue name for several Events in the past, it requires manually editing all of their venue_names, rather than only editing (or creating) a single VenueName object with the relevant dates. But, again, it’s Good Enough for my needs, and keeps things fairly simple. You can see the name changing over time on the Picturehouse Central page.

Screens and theatres

There’s yet another question about what a Venue is… is it an entire building? Or a particular screen, theatre, hall, etc. within the building?

For example, is the National Theatre a single venue or are its Olivier, Lyttelton and Dorfman Theatres separate venues? This seems like more of an issue for theatres than cinemas — a performance in the Barbican’s 1,156-seat Theatre is a very different experience to one in its 200-seat Pit. But you could say similar for cinemas — you may want to record whether you saw a film in a cinema’s giant main screen or its poky 30-seater.

And, close to home for me, is the Barbican’s Cinema One (large, in one building) the same venue as the Barbican’s Cinemas Two and Three, which are in a connected-but-different building up the road? They have the same branding and website but feel quite different. And then are those Cinemas Two and Three a different venue to the Cinemas Two and Three that were in a different Barbican building again until they closed a few years ago?

I don’t have good answers for any of these. It seems even less clear-cut than the “When does a venue become a new venue?” question. Sometimes I’ve made separate Venues for things like this, other times I’ve created one Venue and noted on each Event which theatre it was in (e.g.).

§ Conclusion

That’s how I got to where I am with all this. The site works fine and I love having all that historical data in it. The compromises I’ve reached work for me and aren’t too annoying. For other uses there would be different compromises required.

There are still a few things that, inevitably, aren’t quite right, aside from everything mentioned above:

The “reading” and “events” parts of django-spectator are separate, only sharing the concept of Creators. Which means that if I read the print version of King Lear and then see the play performed, these are separate things in the database: a Publication and a Work (play). It’d be nice for these to be the same thing, or related somehow.
If I go to see several short plays at one Event, I can only attach Creators to either each individual Play or to the Event as a whole. There’s no way to indicate which Play they were acting in, or directed. (e.g.) I’d need the more complicated Event > Performance > Play structure I retreated from above.
Another part of my site records every time I listen to a piece of music, via Last.fm using my django-ditto code. It would be nice if a Creator I’ve seen live could be linked to the occasions I’ve listened to the music.
If I go to an art exhibition this is currently an Event with a title and some optional Creators attached. And so if I go to the same exhibition multiple times there’s no direct relation between these visits. Maybe an exhibition should be a new kind of Work with its own Creators (e.g. “Artist”, “Curator”, etc.)?

These are all edge cases — there are always edge cases — and they don’t bother me too much. Yet. But they’re the kinds of things I’d bear in mind if I was making something like this for different people with their own uses and needs.

As it is this is all working, and full of data, and is very pleasing.