Just saying, some days are going to suck. Technology breaks. Often. So what is your plan then? How do you get your systems back online quickly and efficiently?
This situation is where a good backup routine comes into play. Being able to restore your system from almost nothing is going to be critical on these days. So today, we’ll talk about different methods for backing up your data, how often you should back up, and why you should test your backups regularly.
As a rule of thumb, your backups should be happening automatically. You don’t want to be in a situation where you haven’t taken a backup in three weeks because your team has been too busy planning an upgrade that just went wrong. Automating them also means your backups happen on a regular schedule, so you can count on them being there when you need them.
There are several ways to trigger your backup routine automatically, and how you do so will depend on what your backup routine looks like. That being said, I prefer to have my backups executed by script, and I like that script to be run by a dedicated Jenkins instance. By using Jenkins, if something goes wrong with the backup routine, I can have Jenkins configured to send out a notification to my team. In an emergency, the last thing you want to find is that your backups have been broken for a month.
That being said, if Jenkins isn’t your thing, you can use GitLab’s CI/CD, Bitbucket Pipelines, or even Bamboo to automate the process. The idea here is to run your backups somewhere that will tell you when something breaks.
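To make that concrete, here is a minimal sketch of a wrapper a CI job could call. It assumes your backup is a single command and that your CI tool treats a non-zero exit code as a failed job (Jenkins and the others do this by default). The function name and the script path in the usage note are made up for illustration.

```python
import subprocess
import sys


def run_backup(command: list[str]) -> int:
    """Run the backup command and return its exit code.

    A non-zero return value is what lets the CI runner mark the job
    as failed and trigger whatever notification you have configured.
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # The command could not even start (missing script, bad permissions).
        print(f"backup could not start: {exc}", file=sys.stderr)
        return 1
    if result.returncode != 0:
        print(
            f"backup failed ({result.returncode}): {result.stderr.strip()}",
            file=sys.stderr,
        )
    return result.returncode
```

In the job itself you would end with something like `sys.exit(run_backup(["./backup.sh"]))`, where `./backup.sh` stands in for your own script. The notification itself stays in the CI tool’s configuration, which keeps the script simple.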
On-Site & Off-Site Backups
So, this is getting a bit further into Disaster Recovery than I intended to go, but I also feel this is important to talk about. So question: Can you depend on the site where you are hosting your Jira instance to be there still when you need it?
If you haven’t guessed it, the answer is no. There are any number of disasters, both man-made and natural, that can cause a site to go down – permanently. So if you are backing up to the same physical location where your systems are, you are opening yourself up to trouble.
Ideally, it would be best if you had your systems backed up to a different geographic location. For example, if you are in the US, consider something on the opposite coast or another continent. The idea here is to find a location where it’s difficult for any single event to impact multiple sites.
But how far do you go? It’s easy to get into a rabbit hole of thinking where you would require three sites – no, not good enough, four. Or maybe five?
Bad Admin. I get it’s easy when doing DR Planning to start thinking worse than the worst-case scenario. In practice, though, two sites are usually sufficient: your primary site and your backup site. Anything more doesn’t improve your DR standing.
Backing Up Cloud
So, I’m going to start with the more straightforward system to set up. Atlassian lets you take a complete backup of your cloud instances once every 48 hours. To be clear, by complete backup I mean avatars, attachments, logos, and data. If you just want the data, you can take a backup as often as you’d like.
As noted above, you should automate your backups. Atlassian thankfully provides some scripts to help with that. It should be said that these scripts are not officially supported and are provided “as-is,” so you may find yourself fixing things yourself from time to time. But it’s at least a head start.
I would honestly clone this repo to a local source control management system (think Bitbucket), then work on your repo going forward. This setup will also allow you to change the scripts as needed for your case. Then when the time comes, Jenkins/Bamboo/whatever can check out this custom repo and run the latest version of your scripts.
Backing up Data Center
Server and Data Center products are a bit more complex to set up a good backup routine for – but honestly, it’s because you have options. However, not all options are created equal. That being said, I will briefly explain each option along with its strengths and weaknesses.
The XML backup is the default for new Atlassian instances. In short, it is your instance condensed down to a single file. This is arguably one of the most transportable forms of a backup – as you can (in theory) take this to Cloud, another On-Prem Instance, or just use it to restore to an earlier point.
That being said, there are some severe downsides to this. Let’s say you are on a large Jira instance. You can have hundreds of GB of attachment data, millions of issues, and thousands of user avatars. That means this file will be MASSIVE. It’s conceivable that a large enough XML Backup can take so long to generate that it runs into the next scheduled backup. Then you have multiple backups running, further slowing down your system.
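One way to at least keep backups from piling up on each other is a simple lock file: if the previous run is still going, the next one skips itself. This is my own sketch, not something built into Jira, and the lock path is whatever you choose. Note that a hard crash can leave a stale lock behind, so you would still want an alert if backups get skipped repeatedly.

```python
import os
from contextlib import contextmanager


@contextmanager
def backup_lock(lock_path: str):
    """Yield True if we acquired the lock, False if another run holds it."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        # Record our PID so a human can see who holds the lock.
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield True
    finally:
        # Only the run that created the lock removes it.
        os.remove(lock_path)
```

You would wrap the backup in `with backup_lock(path) as acquired:` and only start the work when `acquired` is true.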
Given this, I only recommend an XML backup for relatively young Jira instances – anything below 250K Issues (rough rule of thumb). They are great for this size instance, but anything more and it starts to get impractical fast.
Taking a VM snapshot is the second most straightforward method of backing up a Jira instance. However, it only works if your shared storage, database, and nodes are all on VMs. And while I’m not a “storage expert,” I believe you can rile up someone who is by merely recommending you run a share on a VM.
That being said, if you are on a Jira Server that has a DB on the same box, this is a fast and efficient way to back up your Jira instance. If something happens and you have to rebuild, you have your instance as it existed at the time of the backup! Which, to put it bluntly, is nice.
This. This option right here is my preferred way to back up Jira. The concept is simple. You first run a utility to back up the database to a file. Examples include mysqldump for MySQL, pg_dump for PostgreSQL, or sqlExport for MSSQL. After completing the database backup, you copy the Home folder(s) to an archive. Remember, if you are on Data Center, you should capture a copy of the local Home and Shared Home. Once you have the DB and home directories saved to files, move those files to someplace secure. And the backup is done!
The order of operations does matter with this method. You HAVE TO take the database backup BEFORE the Home directories. If you do it the other way around, someone can add an attachment after you’ve taken the home directory backup but before the DB Backup. This situation will mean you have an attachment listed in your DB that isn’t on your restored system – causing Jira to throw an error.
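Here is a rough sketch of that ordering, assuming your dump utility writes to standard output (pg_dump and mysqldump both can). The function name, file naming scheme, and the paths in the usage note are placeholders of mine, so adjust them to your environment.

```python
import subprocess
import tarfile
from pathlib import Path


def backup_instance(
    dump_command: list[str], home_dirs: list[str], dest_dir: str, stamp: str
) -> list[Path]:
    """Dump the database first, then archive each home directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    created = []

    # Step 1: database. check=True raises if the dump fails, so we never
    # archive the home directories against a missing or partial dump.
    dump_file = dest / f"db-{stamp}.dump"
    with open(dump_file, "wb") as out:
        subprocess.run(dump_command, stdout=out, check=True)
    created.append(dump_file)

    # Step 2: home directories, only after the dump has finished.
    for index, home in enumerate(home_dirs):
        archive = dest / f"home-{index}-{stamp}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(home, arcname=Path(home).name)
        created.append(archive)
    return created
```

A call might look like `backup_instance(["pg_dump", "--format=custom", "jiradb"], ["/path/to/local-home", "/path/to/shared-home"], "/backups", "2024-01-07T0000")`, where the database name and directories are hypothetical. Moving the resulting files to secure storage is left as a separate step.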
So, why is this method my favorite? Portability! You can’t take this to the Atlassian Cloud, true, but so long as you don’t change the DB platform, you can take it almost anywhere else. Are you moving to a Cloud Compute Platform? I’ve done it. Move from a ProxMox to an ESXi Hypervisor? All you are doing is moving files, so that will work. All you need is an empty system, and you can restore your system. This backup method gives me the most options during a disaster, which is the one thing you need most in that situation.
Another thing this allows is cloning your system quickly to a dev or stage system. Just copy your files over, change the settings in the necessary files, and you’re good to go.
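As one example of “change the settings,” the database connection for Jira Server and Data Center lives in dbconfig.xml in the Home directory. The sketch below swaps the JDBC URL so the clone points at a dev database. It assumes the usual layout with a `<url>` element inside `<jdbc-datasource>`, so check your own file first. The function name is mine, and rewriting the file this way drops any XML comments it contained.

```python
import xml.etree.ElementTree as ET


def point_at_database(dbconfig_path: str, new_jdbc_url: str) -> str:
    """Replace the JDBC URL in a dbconfig.xml-style file; return the old URL."""
    tree = ET.parse(dbconfig_path)
    url = tree.getroot().find("jdbc-datasource/url")
    if url is None:
        raise ValueError(f"no <jdbc-datasource>/<url> element in {dbconfig_path}")
    old = url.text or ""
    url.text = new_jdbc_url
    # Write the file back in place with an XML declaration.
    tree.write(dbconfig_path, encoding="utf-8", xml_declaration=True)
    return old
```

Do this before the first start of the cloned instance, so the dev system never connects to the production database.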
How Often to Back Up
This question is tricky, and I’m afraid I cannot give you a hard and fast rule. So instead, it comes down to two questions: How fast does your data change, and how valuable are those changes?
I’ll address the second question first. The fact is, the more frequently you back up, the more space your backups are going to take, and the more it’s going to cost. But even this is nuanced. For example, local storage is usually far cheaper than off-site storage. For this reason, I usually take the approach of keeping more frequent backups on-prem, then doing an off-site backup less frequently.
An example of this might be doing an on-site backup every hour but only copying the midnight backup off-site. Likewise, if you are doing a file backup, you can play further games. Maybe take a DB backup every hour, but take a complete backup with attachments daily and a full off-site backup weekly. Again, this does open you up to having a DB backup that references attachments that aren’t backed up yet, but it would save some considerable space.
Of course, the other factor to consider is how often your data is changing. If you see hundreds of users on your system every hour, you can assume they are doing something in Jira. This would be a reason to consider doing backups more frequently. Likewise, if your company culture is such that you only see spikes at the end of the day, you can get away with less frequent backups.
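The tiered schedule above could be expressed as a small function that an hourly job calls to decide what to run. The tier names and the choice of Sunday for the weekly off-site copy are assumptions on my part; pick whatever fits your team.

```python
from datetime import datetime


def backups_due(now: datetime) -> list[str]:
    """Return which backup tiers should run at the top of this hour."""
    due = ["db-onsite"]               # every hour: database only
    if now.hour == 0:
        due.append("full-onsite")     # midnight: database plus attachments
        if now.weekday() == 6:        # Sunday midnight: weekly off-site copy
            due.append("full-offsite")
    return due
```

Keeping the decision in one place makes it easy to change the cadence later without touching the backup scripts themselves.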
Why do we test backups?
So, the final topic on this, I swear!
Can you trust your backups? You’ve put all this infrastructure in place to create a backup file and store it, but how do you know it will work when you need it to? A Sysadmin I learned from early in my career would say that an untested backup is as good as no backup. This sentiment is because you don’t know the gaps in your methodology until it is tested. And if you don’t know the gaps, you can’t fix them.
I like to combine two activities: Testing my backups and standing up a new Dev instance. To do this, I will take a blank VM (or VMs, as appropriate). If required, I’ll also ask for a blank NFS share and database. The idea here is to start from nothing but empty systems and backups. I will then attempt to restore the system as fast as possible from the items I have. Yes – I do time myself there – the added pressure of beating the clock makes the exercise feel more natural.
If I can bring the system online within a reasonable time, congratulations, we have good backups! If not, I note what the problems were and start fixing those. I’ll also reset everything to an empty state to try it again once I have everything fixed. Then rinse and repeat.
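If you want the stopwatch part written down, here is a small sketch of a drill harness. It assumes each restore step is something you can wrap in a Python callable (restore the DB, unpack the Home directory, start Jira, and so on). The step names and the target time are yours to define.

```python
import time
from typing import Callable


def timed_restore(
    steps: list[tuple[str, Callable[[], None]]], target_seconds: float
) -> dict:
    """Run each restore step in order, time it, and compare the total to a target."""
    timings = {}
    start = time.monotonic()
    for name, step in steps:
        step_start = time.monotonic()
        step()  # an exception here stops the drill: that is a gap to fix
        timings[name] = time.monotonic() - step_start
    total = time.monotonic() - start
    return {
        "timings": timings,
        "total": total,
        "within_target": total <= target_seconds,
    }
```

The per-step timings are the useful part: they show which step to fix first when a drill runs long.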
After I completely restore the system, I’ll make the necessary changes to put it on a different URL and DB, change the appearance slightly, and BAM! You have a Dev system with up-to-date information!
As Jira Admins, we are entrusted with some of the most valuable information repositories our companies have. “If it didn’t happen in Jira, it didn’t happen” is a sentiment I hear often. Given that, it’s entirely on us to make sure our systems are available. It’s more than probable that you will eventually have a bad day at some point in your career, and you will need to restore from your backups. However, doing the due diligence now will go a long way to ensure that an awful day doesn’t become the worst day in your career.
I just have a few announcements to close out this post with. We have a hectic but fun week ahead of us. First, don’t forget that I will be appearing in a Charity stream playing a Table Top RPG to benefit Able Gamers and Special Effect. These charities work to empower people with disabilities to enjoy games as well – and I cannot think of a greater cause. So join us from 7 PM to 10 PM Eastern on Twitch! I hope to see you all there!
Second, we are less than a week away from JiraCon! I’m still profoundly honored to have been selected as a speaker, and I can’t wait to share my presentation with you. I will be presenting live from 10:00-10:45 AM Eastern, but I encourage you to join us for all the other speakers and presentations! See you there!
Don’t forget; you can find my social media links on my Linktree. So please do like and follow. However, the best thing you can do is comment and share so more people can discover this blog! You can also get new posts sent directly to your inbox! The signup form will be below! But until next time, my name is Rodney, asking, “Have you updated your Jira issues today?”