
Blogging with Version Control

Reverse chronological order

I’ve been musing for a while now on the way blog posts are typically presented: in reverse chronological order. This format has never truly made sense to me, and it does not reflect the way good writing and thinking happen. This post outlines why I believe applying version control to blogging (and the wider world of digital publishing) may lead to more relevant, more complete writing that encourages continuous learning.

When we start learning about a topic we are naïve and lack context for the information we consume. It is through questioning our knowledge, updating incorrect beliefs and making further connections that we truly expand our understanding. Often we feel we reach a point of sufficient understanding that we are able to write down and share our learnings with others. Writers publish their work, receive feedback, and the post is added to an archive. This post is essentially another sheet of paper, stacked on top of an existing pile.

The main issue with the ‘pile’ system is that this post is eventually buried beneath more recent pieces of writing; there is no incentive for revisiting or updating the work. Even worse, if an author does decide to unearth the piece and make some major changes, those who read the original piece are not made aware of these alterations. The sorting order is static.

Implementing version control

In software, a practice known as ‘version control’ is used to keep track of modifications to the codebase. You are likely familiar with the general concept of revisions, i.e. version 1, version 2, etc. Some writers may even use this type of system already, in an informal way. As it relates to blogging, I believe the semantic versioning specification formalises the process such that an author can more easily create dynamic, comprehensive writing. This specification uses version numbers in the form MAJOR.MINOR.PATCH, e.g. 1.2.4. By redefining these versions slightly from their use in the software world, it’s possible to create a system for versioning your writing and, as a result, a more intelligent publishing workflow.

Although this system can be modified in many ways, I believe the following framework is a good starting point:

  1. First number, MAJOR: This is the highest level of update. Starting a fresh blog post, we essentially have a version number of 0.0.0. When a post is first published, this is version 1, or 1.0.0. From this initial publish date, only a post with significant changes will qualify as a MAJOR update. This would include such modifications as adding multiple meaningful sections of content or significant changes in opinions/perspectives expressed. A reader of version 1.X.X should feel that version 2.X.X is different enough that the post should be read again from the beginning.
  2. Second number, MINOR: A minor change moves 1.2.3 to version 1.3.0. It represents changes like adding additional links to a resources section, editing a few sentences for clarity, adding footnotes or a preface. These are the kinds of changes a good writer should be looking for—they will typically come from understanding more about the complexities of a topic. If you find yourself making these types of updates over a relatively long period (say, a year) you should review the current piece against the original publication; enough minor changes may eventually warrant a major update. In some cases, you may even need to take all of these pieces and rebuild an essay from the ground up.
  3. Third number, PATCH: This number is reserved for tiny changes like spelling errors, formatting issues and generally cleaning up the more technical details of the post. Most of this type of work will already be completed but occasionally these things do slip through the cracks. As you can likely infer from the above examples, this change would update version 1.6.12 to 1.6.13.
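The three update types above can be sketched as a small JavaScript helper. This is only an illustration: the function name `bumpVersion` and the update labels 'major', 'minor' and 'patch' are my own assumptions, not part of any specification or library.

```javascript
// Hypothetical helper: bump a MAJOR.MINOR.PATCH string by update type.
// A higher-level bump resets the lower-level numbers to zero.
function bumpVersion(version, update) {
  const [major, minor, patch] = version.split('.').map(Number);
  switch (update) {
    case 'major': return `${major + 1}.0.0`;
    case 'minor': return `${major}.${minor + 1}.0`;
    case 'patch': return `${major}.${minor}.${patch + 1}`;
    default: throw new Error(`Unknown update type: ${update}`);
  }
}

console.log(bumpVersion('0.0.0', 'major'));  // 1.0.0 (first publication)
console.log(bumpVersion('1.2.3', 'minor'));  // 1.3.0
console.log(bumpVersion('1.6.12', 'patch')); // 1.6.13
```

Resetting the lower numbers on a MINOR or MAJOR bump matches the examples in the list, where 1.2.3 becomes 1.3.0 rather than 1.3.3.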

Adding the metadata

From here, it is a matter of implementing a viable system for your site. As an amateur in web development, I can only offer some preliminary ideas, and the process is likely to differ considerably depending on the technology stack you use to serve your publication, but I thought it would still be worthwhile to outline some basic thoughts.

Some people working with markdown files will have YAML frontmatter, as I do with my Eleventy build. With this frontmatter, I would simply add a line for the version data as follows:

---
title: Blogging with Version Control
date: 2021-08-10
version: 1.2.4
---

In Eleventy, this metadata can be read and used in templates directly; you could, for instance, expose it on the page so that the version number is shown publicly.

With regard to a sorting algorithm, version strings cannot simply be sorted alphabetically, because 1.12.0 would then come before 1.2.0. One workaround is to temporarily add an offset to each part so that every part has the same number of digits, sort the resulting strings, and then remove the offset again. Something like the following JavaScript would work:

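// Offset each part by 100 so all parts have equal width and sort correctly as strings
// (this only holds while every part is below 900), then remove the offset again.
let arr = ['1.0.0', '1.0.6', '1.12.0', '2.0.3', '2.1.0', '2.1.3', '3.0.10'];
arr = arr.map( a => a.split('.').map( n => +n+100 ).join('.') ).sort()
         .map( a => a.split('.').map( n => +n-100 ).join('.') );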

Note that in this case the array is mapped back to a versioned format, which would not be necessary if you simply want to create a collection of sorted items. A slightly modified version of this code could be used as a filter in Eleventy which, when combined with something like the code below, should be able to create a collection sorted by version numbers:

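eleventy.addCollection("writingVersioned", function(collection) {
    // Subtracting version strings yields NaN, so compare numerically part by part (highest first).
    const parts = (post) => String(post.data.version).split(".").map(Number);
    const cmp = (x, y) => (y[0] - x[0]) || (y[1] - x[1]) || (y[2] - x[2]);
    return collection.getFilteredByGlob("./src/site/writing/*.md")
        .sort((a, b) => cmp(parts(a), parts(b)));
});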
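As an alternative to the offset trick, the comparison itself can be pulled out into a standalone function that compares the three parts as numbers. The sketch below is a minimal version of that idea; `compareVersions` is a hypothetical name of my own, and it assumes every version is a plain MAJOR.MINOR.PATCH string with no extra labels.

```javascript
// Hypothetical helper: compare two MAJOR.MINOR.PATCH strings numerically.
// Returns a negative number if a < b, a positive number if a > b, and 0 if equal.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    // Treat a missing part as 0, so '1.2' compares like '1.2.0'.
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

const versions = ['2.1.0', '1.12.0', '3.0.10', '1.0.6', '2.0.3', '1.0.0', '2.1.3'];
const ascending = [...versions].sort(compareVersions);
const descending = [...versions].sort((a, b) => compareVersions(b, a));

console.log(ascending);
// [ '1.0.0', '1.0.6', '1.12.0', '2.0.3', '2.1.0', '2.1.3', '3.0.10' ]
```

Inside a collection, the same function could be passed to `.sort()` as `(a, b) => compareVersions(b.data.version, a.data.version)` to list the highest version first. Unlike the offset approach, it places no limit on how large each part can grow.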

The last remaining piece is then to determine how you want to balance the priority of a change against a publish date. You may choose to give a MINOR change priority over a piece published more recently, but only once that piece is over 3 months old. This threshold is likely to differ for each writer depending on their publishing frequency; those who publish infrequently will probably place a greater emphasis on the publish date, whereas those who publish and update posts frequently may prioritise the versioning system. Both are acceptable and come down to personal preference.

Sorting example

While I’m hoping this kind of proposal makes sense so far, it may help to further clarify the idea with an example. Let’s take a JSON array representing a number of essays, with additional data stripped so that only the relevant metadata is left. This array would look something like the following:

[
 {
   "title": "Blogging with Version Control",
   "date": "2021-08-10",
   "version": "1.0.0"
 },
 {
   "title": "Radioactivity and Neutron Interactions",
   "date": "2021-05-25",
   "version": "1.3.0"
 },
 {
   "title": "Building a Library with CSV in Eleventy",
   "date": "2021-03-12",
   "version": "1.1.3"
 },
 {
   "title": "From Idea Consumer to Producer",
   "date": "2020-08-31",
   "version": "1.0.2"
 },
 {
   "title": "The Artist and the Critic",
   "date": "2020-07-01",
   "version": "1.1.4"
 },
 {
   "title": "Twitter Text-Generation Bot",
   "date": "2020-02-04",
   "version": "1.4.3"
 }
]

We can represent this data in the following table, including the category of version update that will be applied:

| Title | Date | Version | Update |
| --- | --- | --- | --- |
| Blogging with Version Control | 2021-08-10 | 1.0.0 | None |
| Radioactivity and Neutron Interactions | 2021-05-25 | 1.3.0 | Patch |
| Building a Library with CSV in Eleventy | 2021-03-12 | 1.1.3 | Major |
| From Idea Consumer to Producer | 2020-08-31 | 1.0.2 | Patch |
| The Artist and the Critic | 2020-07-01 | 1.1.4 | Minor |
| Twitter Text-Generation Bot | 2020-02-04 | 1.4.3 | Patch |

Table 1. Sorting essays by date

Think of this as a bulk operation written purely for the sake of demonstration; in reality, you would likely update these pieces of writing individually and over several commits. Nevertheless, you can see that the posts are ordered in the typical reverse chronological order. Compare this to a modified sort order, whereby the following rules are applied:

  1. A MAJOR change is equivalent to publishing a new piece;
  2. A MINOR change has priority over writing published more than 3 months ago;
  3. A PATCH makes no change.

After applying these changes, the order would look as follows:

| Title | Date | Version | Change |
| --- | --- | --- | --- |
| Building a Library with CSV in Eleventy | 2021-03-12 | 2.0.0 | ↑2 |
| Blogging with Version Control | 2021-08-10 | 1.0.0 | ↓1 |
| Radioactivity and Neutron Interactions | 2021-05-25 | 1.3.1 | ↓1 |
| The Artist and the Critic | 2020-07-01 | 1.2.0 | ↑1 |
| From Idea Consumer to Producer | 2020-08-31 | 1.0.3 | ↓1 |
| Twitter Text-Generation Bot | 2020-02-04 | 1.4.4 | 0 |

Table 2. Sorting essays by version-control algorithm

Note: This assumes the sort algorithm runs on 10th August 2021, meaning minor changes take priority only over posts published before 10th May 2021.
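The ordering in Table 2 can be reproduced with a short JavaScript sketch of the three rules. Everything named here (`monthsBefore`, `effectiveDate`, `sortByVersionRules`, the `update` field) is hypothetical and exists only for illustration. It rests on two assumptions of mine: each post gets an "effective date" used for sorting, and when two effective dates tie, the post that was promoted by an update is listed first.

```javascript
// Hypothetical sketch of the three sorting rules; all names are illustrative.
const posts = [
  { title: 'Blogging with Version Control', date: '2021-08-10', update: 'none' },
  { title: 'Radioactivity and Neutron Interactions', date: '2021-05-25', update: 'patch' },
  { title: 'Building a Library with CSV in Eleventy', date: '2021-03-12', update: 'major' },
  { title: 'From Idea Consumer to Producer', date: '2020-08-31', update: 'patch' },
  { title: 'The Artist and the Critic', date: '2020-07-01', update: 'minor' },
  { title: 'Twitter Text-Generation Bot', date: '2020-02-04', update: 'patch' },
];

// Return the ISO date (YYYY-MM-DD) that lies `months` months before `isoDate`.
// Note: day-of-month overflow (e.g. the 31st) is left to the Date object to resolve.
function monthsBefore(isoDate, months) {
  const d = new Date(isoDate); // date-only ISO strings are parsed as UTC
  d.setUTCMonth(d.getUTCMonth() - months);
  return d.toISOString().slice(0, 10);
}

// Rule 1: a MAJOR change counts as published today.
// Rule 2: a MINOR change lifts an old post up to the cutoff date.
// Rule 3: a PATCH (or no update) leaves the publish date unchanged.
function effectiveDate(post, today, thresholdMonths) {
  if (post.update === 'major') return today;
  if (post.update === 'minor') {
    const cutoff = monthsBefore(today, thresholdMonths);
    return post.date < cutoff ? cutoff : post.date;
  }
  return post.date;
}

function sortByVersionRules(allPosts, today, thresholdMonths = 3) {
  return allPosts
    .map((post) => {
      const effective = effectiveDate(post, today, thresholdMonths);
      return { post, effective, promoted: effective !== post.date };
    })
    .sort((a, b) => {
      // Newest effective date first; ISO dates compare correctly as strings.
      if (a.effective !== b.effective) return a.effective < b.effective ? 1 : -1;
      // Assumed tie-break: promoted posts come before untouched ones.
      return Number(b.promoted) - Number(a.promoted);
    })
    .map((entry) => entry.post);
}

const ordered = sortByVersionRules(posts, '2021-08-10');
console.log(ordered.map((post) => post.title));
```

The threshold is a parameter, so a writer who publishes rarely could pass a larger value to give the publish date more weight.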

Wrapping up

I hope this post was useful in starting to think about how you can apply the principles of version control to your publishing workflow. Ultimately, this post only scratches the surface of a very interesting space, and I believe there is a huge opportunity for intelligent application of version control to the field of digital publishing.

If you are planning on implementing (or have already implemented) this kind of system on your personal site, please reach out—I’d love to hear about it!