Backup, backup, backup!

This actually happened during my comps!

The comprehensive examination is an important milestone for a PhD student. While its format varies depending on the school, degree, department, and most importantly, your supervisor, it is usually a pass/fail test of your cumulative knowledge, taken one or two years into the program. In Canada, after the comps, you are “promoted” from being a PhD student to being a PhD Candidate.

The format in my department is a take-home exam that lasts from Monday morning, when you receive the set of questions, until Friday at 5pm, when you need to submit your answers/papers to your committee members. One week later you meet them in an oral examination, where you clarify any questions they have about your answers and about the field in general.

So last June it was my turn. I received the questions on Monday and worked countless hours. On the third day, it happened: the Blue Screen Of Death!

No, I wasn’t doing anything weird, like calculating a huge matrix or running statistics on millions of records. I just had Word open, where I was writing my answers, plus a lot of PDFs of relevant papers, EndNote, and my personal wiki where I had written down my notes… Oh wait, was that too much? Well, apparently so, and the computer gave up!

I usually take a paranoid approach when it comes to backing up my work, particularly important files like this document. For instance, every day before leaving the office I made sure I had an up-to-date copy on at least two cloud sync services, a USB drive and – of course – the master copy on my office computer. But I wasn’t 100% prepared for a BSOD…

It happened around noon on Wednesday, the symbolic halfway mark of my exam. The computer took about 30 minutes to be back in its full form. And I spent all these long minutes wondering how much work I had lost… When was the last time I had hit Ctrl+S? Did Word autosave it for me before crashing? Would I have to start the day from scratch?

In the end, it was not that bad – I had lost only about 30 minutes of work, so together with the 30-minute reboot the crash cost me about 1 hour in total. But I couldn’t afford to risk it happening again, so I switched my live writing to an online app, where my work would be saved to the cloud instantly as I typed. No more Ctrl+S, no more fear of the BSOD. Although these online apps have progressed in the past few years, they still do not offer the same rich set of features as MS Word. But instant saving was enough for me to make the switch for the remainder of the week – once everything was done, I just copied and pasted the content back into my main Word document.

As storage gets cheaper every year, we can afford to keep multiple copies of our work in different locations in order to minimize the risk of losing precious data. A while back, I was getting frustrated with having a huge amount of duplicate data that was never synced – and never knowing which version should be kept. So I decided it was time to organize a backup plan with the following characteristics:

  • Seamless sync: no more manual comparisons to determine which file is the good one.
  • Cloud sync: a backup in a remote location with a third-party provider, accessible from every machine.
  • Physical sync: an encrypted backup on a local disk.

I use a combination of different services, but basically you can achieve the same with two things: Dropbox and Apple’s Time Machine (there are alternatives for Windows as well). I agree Dropbox might lose my data, in which case I should still have a recent copy on the hard drives of all my computers. If my Dropbox account is compromised (less likely after they implemented 2-step verification – I recommend you activate it!), I will still have my data in the local backup. If my backup hard drive fails… Well, what are the odds of all that happening at the same time? 🙂
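For readers who like to see the idea in code, here is a minimal sketch of the “several copies in several places” principle in Python. It is only an illustration, not how Dropbox or Time Machine work internally: the function name backup_file, the timestamp format and the destination folders are all hypothetical, and it assumes your sync folder and your USB drive simply show up as ordinary folders on your machine.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in small chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, destinations, stamp=None):
    """Copy `source` into every folder in `destinations` (hypothetical helper).

    Each copy gets a timestamp in its name, so older versions are never
    overwritten, and each copy is checked against the original's checksum.
    """
    source = Path(source)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / f"{source.stem}-{stamp}{source.suffix}"
        shutil.copy2(source, target)  # copies the content and the timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Backup at {target} does not match the original")
        copies.append(target)
    return copies
```

You would call it with something like backup_file("comps.docx", ["/path/to/sync-folder", "/path/to/usb-drive"]), where both paths are placeholders for wherever your own copies live. The timestamped names mean you always know which version is the latest, and the checksum comparison catches a copy that was silently damaged on the way.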

What is your backup strategy?

6 thoughts on “Backup, backup, backup!”

  1. Hahaha the ol’ blue screen. I’m dealing with that right now! First my laptop screen burned out, and it has now become a desktop thanks to an external monitor. However, the decline has continued with a lot of shutting off by itself, and my personal favourite, the loud beep followed by the blue screen.
    Back up your data everywhere!

  2. level 1: realtime physical backup
    RAID 1 – Two hard drives inside your computer which are written with the exact same data, so a hard drive failure, power surge, or bad sectors won’t result in lost data
    wiki article

    level 2: realtime saving
    Tell your programs to auto-save all the time, this is usually an option.

    level 3: version control
    Rather than just using Dropbox or Ubuntu One etc., use a version control system (e.g., git, subversion, mercurial, bazaar, etc.). These can be used in conjunction with a cloud service, Bitbucket for example. One note about this style of version control: although it works for binary files (doc, docx, png, pdf), it is designed to record changes between versions of text files. If you are writing your thesis in LaTeX it is perfect.

    A version control mechanism also exists in Dropbox and Time Machine: you can look at the differences between each sync.

    level 4: physical backup SOMEWHERE ELSE
    This could include a cloud service or just a USB drive you have somewhere safe. This is mostly for peace of mind but still useful.

    Finally, I would like to comment on your “local disk encrypted backup”. If your data is not sensitive then I recommend saving it unencrypted. The reason is that if the data were to become corrupted, recovery software can save most of your information, but only if it’s unencrypted. If you encrypt data and it gets corrupted, everything will be lost forever. At least that’s my understanding, a real computer science geek may say otherwise…

  3. Thank you for the comments, MZar and Kristina!

    Funny story: My computer has a sense of humour!

    Just because I showed a picture of it in its worst shape, it crashed while I was tweeting about this post. Thinking the tweet hadn’t been published, I tweeted again from my cell phone. Then I realized the first one had been sent, so I deleted the second one a few minutes later. But guess which one @McGillU had retweeted to its 20,000+ followers?? Yeah, the one I deleted, of course!!

  4. Oh no! It is almost like these things happen at the WORST possible time, when you have a tight deadline and you have been working so intensely that you basically have to back your work up every 10 minutes for the backup to be up-to-date!

    The seamless sync seems like the best option because it is indeed frustrating not to know which is the latest version (unless you adopt this naming strategy: http://www.phdcomics.com/comics/archive.php?comicid=1531)

    I usually use my USB key and the good-old-email-myself-my-work-every-day. Every 2-3 weeks, I do a more thorough backup onto my external drive. I don’t know why I don’t use Dropbox or anything more sophisticated. But I agree that having multiple strategies is the best idea.

    Great post!

    (And congrats on passing your Comps. In our department, the Comps are a 6-8 month experience, where you have to write 3 full papers on 3 assigned questions, and then defend one of them orally in a departmental presentation).

  5. Very smart post, indeed!

    For the readers: Beware of using Microsoft Word for Mac with Autosave enabled on a computer running Time Machine, especially when editing longer documents. There is a critical bug in MS Word that not only makes Word unable to save and autosave your file (so you can’t trust your frenetic Ctrl+S habit), but also makes Word, and eventually your computer, crash. This all happens with no warning. Since the file is not saved at all, Dropbox cannot save versions either… You don’t want this to happen to you during an exam, although it is a rare bug. Just be careful with it and everything should be fine.
