Update: I’ve added some links to related ideas, applications, etc. I also have a follow-up post, with more details about the features of version control software, from a writer’s point of view.
Every so often, when I’m between projects, I start thinking less about what I write and more about <em>how</em> I write. One of my ongoing projects is to digitize a lifetime’s worth of my grandmother’s writing, so I’m also thinking about how she wrote.
For each of her short stories, for example, I have inherited multiple versions: each one typed out (often in duplicate) and many with dates. This makes it very easy for me, in posterity, to follow the development of her work, find the most recent versions and so on. I wonder: how might I organize my work in a similar, computerized way? It seems to me that the old “save as” trick is not much more efficient than my grandmother’s habit of using carbon paper. In 2013, surely there are more sophisticated tools for storing and comparing multiple drafts, or versions, of a written document. Might I use those tools to study her work, or to keep track of my own? Wouldn’t others want tools like these? Authors, editors, literary scholars, archivists: all eventually have to work with multiple versions of a text.
What is version control, and why should you care? Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. Even though the examples show software source code as the files under version control, in reality any type of file on a computer can be placed under version control.
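To make that definition concrete, here is a minimal sketch in Python of the two things a writer would want from such a system: recalling an earlier version and comparing two drafts. It is a toy that keeps everything in memory, not how Git or any real version control tool stores its data. The class name `DraftHistory`, its methods, and the sample sentences are all made up for illustration.

```python
import difflib


class DraftHistory:
    """Toy illustration only: keeps every saved version of one text in memory."""

    def __init__(self):
        self.versions = []  # (label, text) pairs, in the order they were saved

    def save(self, label, text):
        """Record a new version and return its version number."""
        self.versions.append((label, text))
        return len(self.versions) - 1

    def recall(self, number):
        """Return the full text of an earlier version."""
        return self.versions[number][1]

    def compare(self, old, new):
        """Return a line-by-line diff between two saved versions."""
        old_label, old_text = self.versions[old]
        new_label, new_text = self.versions[new]
        return list(difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        ))


history = DraftHistory()
first = history.save("first draft", "It was a dark night.\nThe rain fell.")
second = history.save("second draft", "It was a dark and stormy night.\nThe rain fell.")

# Lines starting with "-" were removed, lines starting with "+" were added.
for line in history.compare(first, second):
    print(line)
```

Real tools add what this sketch leaves out: storage on disk, dates, authorship, and copies on other machines.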
Not every writer does care about version control. English fantasy writer Terry Pratchett said, “I save about twenty drafts — that’s ten meg of disc space — and the last one contains all the final alterations. Once it has been printed out and received by the publishers, there’s a cry here of ‘Tough shit, literary researchers of the future, try getting a proper job!’ and the rest are wiped.” I believe, to the contrary, that some writers, such as lawyers for example, should not be so inclined to say “tough shit” to the readers of the future who might wish to know how some documents may have evolved, but I digress. (I could digress further with: Leaves of Grass or Piers Plowman or William Blake.) My point is that if it became easier to manage and store the drafts, then perhaps more writers would be inclined to do so, and there would be less “tough shit” later on, for anyone interested in the writer’s process.
If you’re completely new to the idea of version control, you may benefit from reading Tom Preston-Werner’s fable/introduction to Git, which is a popular kind of version control.
I am absolutely not the first writer to think about computerized version control. (In coming days, I’ll add to this post with links to similar conversations.) In fact, I owe a lot of my thinking on the subject to Cory Doctorow (whose blog’s comments provided the above Pratchett quote) and to Thomas “cmdln” Gideon (host of the fabulously nerdy Command Line podcast). Together, they are the authors of a piece of software called “Flashbake”. Doctorow’s post on Boing Boing about Flashbake provides an excellent introduction to the software, its strengths and weaknesses and the motivations behind its creation.
I was prompted to do this after discussions with several digital archivists who complained that, prior to the computerized era, writers produced a series complete drafts on the way to publications, complete with erasures, annotations, and so on. These are archival gold, since they illuminate the creative process in a way that often reveals the hidden stories behind the books we care about. By contrast, many writers produce only a single (or a few) digital files that are modified right up to publication time, without any real systematic records of the interim states between the first bit of composition and the final draft.
In another article, Doctorow elaborated on the many other benefits he enjoys while writing with version control.
Now, this may be of use to some notional scholar who wants to study my work in a hundred years, but I'm more interested in the immediate uses I'll be able to put it to — for example, summarizing all the typos I've caught and corrected between printings of my books. Flashbake also means that I'm extremely backed up (Git is designed to replicate its database to other servers, in order to allow multiple programmers to work on the same file). And more importantly, I'm keen to see what insights this brings to light for me about my own process. I know that there are days when the prose really flows, and there are days when I have to squeeze out each word. What I don't know is what external factors may bear on this.
In a year, or two, or three, I’ll be able to use the Flashbake to generate some really interesting charts and stats about how I write: does the weather matter? Do I write more when I’m blogging more? Do “fast” writing days come in a cycle? Do I write faster on the road or at home? I know myself well enough to understand that if I don’t write down these observations and become an empiricist of my own life that all I’ll get are impressionistic memories that are more apt to reflect back my own conclusions to me than to inform me of things I haven’t noticed.
So, why don’t I just install this “flashbake” software and simply move on with my writing and my projects? Well, that’s a fair question. The trouble is that this version control stuff is pretty complex compared to the average word processor. As a writer whose day job is managing websites, I’m up to the challenge, but while I’m at it, I wonder whether I can do anything to make this easier for others. The Lifehacker article about Flashbake gives a “nerd alert” before listing some substantial knowledge prerequisites:
Flashbake is a command-line system for advanced users, and requires a Linux-like shell like Cygwin for Windows or Mac OS X's built-in Terminal. It is most definitely not for folks looking for something like Microsoft Word's versioning. It is, however, for people who make heavy use of plain text files, don't mind firing up the terminal and running a script or two, and know what cron is. Since Flashbake is an interface to Git written in Python, you'll need all three installed to get this party started.
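To give a feel for what “a script or two” plus cron amounts to, here is a rough sketch in Python of an automatic snapshot script. This is not Flashbake’s code and does not reproduce its features. The function names and the message format are my own assumptions, and it presumes Git is installed and the folder is already a Git repository.

```python
import subprocess
from datetime import datetime


def build_commit_message(files, now):
    """Describe the snapshot: when it was taken and which files it covers."""
    stamp = now.strftime("%Y-%m-%d %H:%M")
    return f"Automatic snapshot {stamp}: " + ", ".join(files)


def build_git_commands(files, message):
    """Return the two Git commands that stage and record the given files."""
    return [
        ["git", "add", "--"] + list(files),
        ["git", "commit", "--quiet", "-m", message],
    ]


def snapshot(repo_dir, files):
    """Stage and commit the files inside repo_dir (hypothetical helper)."""
    message = build_commit_message(files, datetime.now())
    for command in build_git_commands(files, message):
        # "git commit" exits with an error code when nothing has changed;
        # that is harmless here, so the exit code is not treated as fatal.
        subprocess.run(command, cwd=repo_dir, check=False)
```

Calling `snapshot` with the path to a writing folder and a list of file names records one version. Scheduling the script with cron at some regular interval is what would turn it into a hands-off habit.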
Why don’t I use one of the many workarounds available out there? Because most of them are for linear workflows, or they don’t store enough versions, or they’re 100% cloud-based and I’m afraid they’ll go offline and take my work with them when they go.
Surely there’s a middle way that balances power, usability and (long-term) access to the documents.
I’ve posted this idea in a few other places as well…
- Version Control for Writers @ MetaFilter
- Along with a friend, I posted a query to the #git-annex IRC channel. Git-Annex is one web application that may prove capable of the functionality I’m trying to describe here.
- I posted to the VCS Home list. Join to see, or wait for archive.
- I have a submission pending at Slashdot
Useful links I found along the way…
- Writing in Version Control — Andy Taylor
- The Limitations of GitHub for Writers - ProfHacker - The Chronicle of Higher Education
- Simplifying Writing Workflow » Linux Magazine
- OpenOffice.org Document Version Control With Mercurial : David R. Heffelfinger
- iPhylo: Setting up a local Wikisource
- Words and what not
- Keeping Your Life in Subversion - O’Reilly Media
- msofficehg - Microsoft Office (Excel, Word, PowerPoint) add-ins that assist document version control with TortoiseHg - Google Project Hosting
- Version Control for Microsoft Word Collaborative Writing | MacResearch
- Sublime Text 2 and Markdown: Tips, Tricks, and Links
- Working with Git on Windows - Beanstalk Guides
- Git Magic - Preface
- Inside the Leviathan - James Fallows - The Atlantic