So Long, Server

First, I should note a bit of a disclaimer: Opinions in this blog have been and always will be mine – and in no way represent the feelings of Coyote Creek Consulting or anyone other than me.  

Well, the writing has been on the wall for a while – but it seems that dreaded day is here. Last Friday, Atlassian announced that they would sunset the Server products. Starting next February, Atlassian will no longer sell new Server licenses. Three years after that, they will stop all support for Server products, including bug fixes and security updates. Afterwards, your options are Jira Cloud – which will be the only option for smaller teams – and Jira Data Center. 

I want to hate this decision.

On an emotional level – this is a gut-punch. I often say I try to be one of Atlassian’s biggest cheerleaders, but this week it’s been hard to be that. I started on Server because, for the longest time, that was our only deployment model. I still run a Server instance for Jira and Confluence in my home lab to support this blog. It’s how I’m able to capture a lot of the screenshots that I require for the blog posts. My licenses will come up for renewal before the price increases take effect, so I’ll be good for another year or so. After that – well, let’s hope I can get sponsorship. 

However, after running the numbers, I can’t hate it entirely. In the end, it will save most customers something. A few will pay more, and a few others are left out in the cold, but most people will save money. I will add this – if you plan on moving to Data Center, it will behoove you to go ahead and buy your licenses as soon as possible rather than wait for Feb. 2021. Those price increases are sharp, just saying.

So – let’s talk about those numbers.

I should note a few things here. First, I am only looking at the base Jira Software product, with no add-ons. This is not completely realistic, but it’s good enough for a comparison. I do intend to have an overview of how you can do a detailed analysis of your instance to figure out if Cloud is right for you next week. But for now, this will be good enough.

For Cloud, I am using the Standard Plan. I should also note that Data Center’s license model has a lot more tiers than Server does. So even though you might be on a 10K Server license today, it doesn’t mean you will have to go up to a 10K DC License. You might be able to get away with a 5K license if you have fewer than 5,000 users, which would save you half that cost. Hence the asterisk.

Source: Future DC Pricing. Markup calculated as New Price / Old Price * 100
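
If the formula in that caption feels abstract, here is a minimal Python sketch of it. The prices are made up purely for illustration (they are not Atlassian's numbers), and the function names are my own. One thing it makes visible: under this formula, a markup of 200 means the price doubled, which is a 100% increase.

```python
def markup_percent(old_price: float, new_price: float) -> float:
    """Markup as the caption defines it: New Price / Old Price * 100."""
    if old_price <= 0:
        raise ValueError("old_price must be positive")
    return new_price / old_price * 100


def increase_percent(old_price: float, new_price: float) -> float:
    """Percentage increase over the old price (the markup minus 100)."""
    return markup_percent(old_price, new_price) - 100


# Hypothetical prices, for illustration only.
old, new = 10_000, 15_000
print(f"markup:   {markup_percent(old, new):.0f}%")
print(f"increase: {increase_percent(old, new):.0f}%")
```

Swap in your own current and future list prices to see where your tier lands.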

However, with a few exceptions, most people will save money versus their current Server Prices, so long as they are free to choose Cloud or Data Center. I fully get that not everyone is free to choose, but I’ll get to that in a moment.  

Now let’s get into it. As best I can tell, for the big three – that is Jira Software and Core Server, as well as Confluence Server – the prices are not increasing if you pay list price. If you are on a discount plan, your costs will increase, but that’s not most of us. The other Server products will see an increase, though, so it might make sense to review Atlassian’s docs for yourself.  

But back to the big three, the only costs that will increase are on the $10 Starter Package – which is going away entirely next February. That means you’d have to pay the next tier up – which is a hefty jump indeed. This one saddens me the most, as this tier is invaluable for people learning to become an Atlassian System Administrator. I’m sorry, but the Cloud is no substitute in this regard, just by the nature of being what it is.  

While this loss is sad, we can at least take a look at other scenarios to figure out how it will impact you moving forward.

So, about that Cloud thing you mentioned the other week.

Look, I don’t know how plainly to put it. Right now, if you are under 250 users, and Cloud is a viable option for you, you will save money on the Standard Plan over either Server or Data Center. This “break point” shifts when the new pricing goes into place such that anyone under 1000 users is now the winner.  

Now, I don’t care for Atlassian’s Cloud Product because it has its problems. No use lying about it, and that’s not what you come here for. However, Atlassian is investing heavily in it, and I don’t think the problems are unsolvable. And I’ve been given some sneak previews from Atlassian – exciting features are coming soon. So yeah, between Server’s End of Life and Data Center’s lack of smaller tiers, Jira Cloud is something to look into.  

However, not everyone has that luxury. This group was my first thought as I read through the announcement, and they still are. And I don’t think Atlassian is paying enough attention to them.  

Simply put, specific segments cannot move to Atlassian Cloud. Here in the United States, these include teams in the Financial Sector, Healthcare, Government work, and Education. The requirement here is almost always regulatory, which means this isn’t a preference. They are required by law to host their data.  

And this would be fine if Atlassian would provide lower tiers for Data Center. But as it stands, they won’t. So if a team happens to be smaller than the 500 user mark, they now have to pay for way more licenses than they need to meet regulations.  

And what about Data Center?

Look – the price increases set to take place for Data Center are going to hurt. Most are around a 200% increase in price. If you choose to move to Jira Data Center after February, you will be paying more than you paid for Server. Just that, plain and simple. If you can lock in your license before then, you will be saving a good chunk of change for the year. So my recommendation is to lock in those licenses early if you are thinking about going this route.

The good news here is you don’t have to redo your architecture to take advantage of a new DC license. You can drop it into your current Jira Server instance in place of its server license, and you’ll instantly unlock the new features associated with Data Center. From there, you can take your time to build out your infrastructure to make it a true Data Center instance – or not. All depending on your needs.  

Another thing to consider on Data Center is your license model is different. On Server, your license is perpetual. That means if you let it lapse, you can still run and use Jira just fine. You won’t be able to upgrade it to a new version or get Support from Atlassian, but it will still work. Honestly, I’ve heard from some companies that run their Atlassian tools like this – only getting a new Server license when they plan to do an upgrade and only doing an upgrade every few years. 

However, in Data Center, that’s not the case. If you let your Data Center license lapse, your instance locks up, and you cannot use it until you get a new one. This is because Atlassian considers its Data Center license to be a “subscription,” which you need to pay annually. Just something to think over if you haven’t looked at DC before.

Viral Marketing

So – as I mentioned last week, I can be something of a marketing nerd. And I’ve loved Atlassian’s strategy they’ve employed for years now. They’d start small in a given company. Maybe a guy used Jira at his old job, and he recommends it to his team. They adopt it, build on it, and it works for them. The next team over sees their success, adopts Jira, and it grows. Before you know it, an entire department is using it. But then Legal notices what it can do and wants in. So does HR. And Accounting. Now your full company is using Jira, and you have a large instance.  

This process is honestly how I see Jira adopted most often. It’s not a top-down decision; it’s a bottom-up grassroots movement. However, will this process still work if the teams start on Jira Cloud, where the license costs are per user? I don’t know if one team manager will want to foot the bill for another team using “their” platform. It seems Atlassian might be thinking the top-down approach is the future. Or they have data that says this isn’t a problem. As I said, I don’t know, but I’m going to pay attention to what I hear from the ground.

Remember, you have time.

Look, this announcement feels sudden. But don’t let it spook you. Jira Server will still be here for over three years. This period gives you plenty of time to evaluate all your options and move there before Jira Server has its End-of-Life.   

However, hasty and emotional decisions are rarely good ones. It’s one reason why I wanted to wait to say much publicly until I could calm down, get the emotion out of it, and figure out the real story here. Trust me when I say that my outlook on Friday was a lot more panicked than it is as I write this today.  

So stop, take a breath, and relax. You’ve got this. If you are on a Server instance, it’s not the end of the world. It’s just the start of the next leg of your journey. 

Questions from you

Considering I made no secret of today’s topic to everyone, I figured I’d open it up to Social Media to see if they had any questions. And of course, you guys came through. So let’s see what your concerns are.

“Did Atlassian make a good call to announce this change considering, that cloud is still very much not a mature platform yet?” – Vitalijus Šerpytis

Do I think this timing is ideal? No – not at all. Look, we are still in a massive global pandemic, the economy is in a global downturn as a result, and to have this enormous change on what most companies consider a vital resource during this time is…less than ideal.  

However, I also don’t think they necessarily had a choice. In a previous article, I made note that Atlassian is a relatively small company for their outsized reach and influence. And this small company has been splitting its attention between Server, Data Center, and Cloud for a while now.

And yes, we know Cloud has been getting the lion’s share of the investment for a while now, too. It’s where all the new “cool” features are premiered. However, Atlassian Cloud does have some fundamental problems. You know it, I know it, and you better be sure Atlassian knows it. And with their current setup, they don’t have the resources to fix that.

Atlassian also sees data that says most new people to their products are starting in Cloud. This fact tells them two things: A) Their future depends on Cloud, and B) They don’t have the resources to fix Cloud and support Server and support DC.   

Between Server and Data Center, Server’s been trending downwards, while DC upwards. Of course, this trend exists – people are migrating away from Server to Data Center and Cloud, so it’s going to go down, but the data still is what it is. 

So given all the facts above, if you had to make the hard choice, which would you make? As I’ve stated, I think this decision does ignore some critical use cases, but I can at least understand the decision.

“How can customers trust Atlassian is going to offer DC going forward given all their actions in the last few years?” – @jonjonbling

Hmm, just asking all the hard questions. Look, I am not a part of Atlassian, so I cannot say what they are going to do with any certainty. However, Server and Data Center have a fundamental difference that makes Data Center more attractive to keep onboard: DC is a subscription license.

For Server, you pay once for your license; then, you pay a smaller “Maintenance” fee every year after for access to upgrades and Support. If you don’t plan to upgrade and don’t feel you need support, don’t pay Maintenance. You can still use your Jira instance because you’ve already bought your license.

However, with Data Center, you pay the full price every year to continue using your Jira instance. It’s a subscription, so if you don’t pay it, you don’t get to use Jira. This pricing model brings in a lot more money to the company. It might be cynical of me, but when in doubt, follow the money. Atlassian has a lot more incentive to keep Data Center alive than Server, so it will likely be around as their “On-Prem” offering. 

Here’s to a better week coming up.

This week has been a long one. I don’t know what the future may hold for the Atlassian Ecosystem or this Blog, but I know that we are adaptable, so we’ll do what’s needed to move ahead. So, I’m not going anywhere, and I’ll do my best to guide you through what’s coming up.  

However, you can help me out now. If you are financially able and find this Blog helpful, consider becoming a patron on Patreon and supporting what I do here. Depending on your monthly contribution, you can get access to a members-only Discord, exclusive content that I will not be posting publicly, as well as recognition on the Blog. Higher tiers can even participate in a Monthly AMA Conference Call or even a private one-on-one (virtual) meeting with me. Patreon is very much an experiment, but if you find it useful, please consider supporting the Blog.

What are your thoughts? Are you angry, worried, or looking forward to the changes? Is your organization planning to migrate to another platform? If so, which one? Leave a comment on Social Media with your thoughts and help the algorithms distribute the Blog. You can find me on Facebook, Twitter, and LinkedIn, where you can find new posts, community news, and interesting tidbits. You can also put your email down below to get new posts delivered directly to your inbox. But until next time, my name is Rodney, asking, “Have you updated your Jira issues today?”

What’s the deal with Jira Cloud?

So, a few weeks ago, I was contacted by Jira Gal Heather Rimmey. She had seen a comment I had written in reply to someone’s comment about Jira Cloud and Jira Server. It turns out she was having the same discussion with management and wondered if I had any thoughts on the differences between the two deployment models. And well, we all know I do! This week, let’s dig into the differences and discuss when one option may be preferable to the other. 

As a note, I’ll be using “Jira Server” a lot during this post. Please understand it to mean either Server or Data Center. While the two are very different deployment models and have unique feature sets, it’s an excellent grouping to keep this post from getting too confusing.

One is not better than the other.

Let’s get this over with first. I do not think a Server Deployment is better or worse than a Cloud deployment. However, I believe that they are very different creatures that both happen to bear the name “Jira.” That is to say, just because you have experience with Jira Cloud doesn’t mean you’ll transition smoothly to a team using Jira Server, and vice versa.

That’s my most significant point, and one I cannot stress enough. Most groups I talk to fail to consider a certain “relearning” time that will cause overall productivity to go down after a migration. Your teams will need to learn how to do all the things they knew on the old platform. This can be offset a bit by training, but only to a point. It’s something to consider.

Control vs. Ease of Use

This concept, as I see it, is the main selling point of Jira Cloud. You don’t have to learn how to set up a server or run an upgrade. You request your site, and within moments you have a Jira instance! Congratulations.  

And this is great! If you are a small team, you don’t need that overhead. However, just like everything else Jira, this is a trade-off. Atlassian is continually pushing out updates and upgrades to Jira Cloud. This model means you often get not only bug-fixes first, but the latest and greatest features that the On-Prem deployments may not receive for months or years. 

But what happens when Atlassian pushes an update that breaks your processes? Like when they pulled the Cloud version of the Automation for Jira app before the built-in version was completely ready? Don’t get me wrong, I have staked my career on Atlassian’s products, and will be their biggest cheerleader as a result, but sometimes “oops” happens.

With a Server deployment, I can find these “oops” moments in a testing phase of an upgrade. I can then choose to hold off the upgrade until I find a workaround or Atlassian fixes the bug. But you don’t get that luxury in Cloud. The best you get is the ability to put off an upgrade by a few weeks – but only on their highest-tier plan.

Again, not saying this is a bad thing. It’s a trade-off for getting all the new toys before the rest of us. But it’s definitely something to consider.

Site Names

This may be just me, but I’ve always been fascinated by Product marketing and branding. As my wife will attest, if I find an interesting product packaging at a grocery store, I will derail the whole thing for a moment while I take a closer look. I’ve honestly purchased some products before just so that I can take a closer look at their packaging. It’s kind of sad and funny at the same time.

What does this have to do with Jira Cloud? I’m getting there. Every Atlassian Cloud instance has the following URL Structure: <yoursite>.atlassian.net. What if you prefer jira.thejiraguy.com? Too bad – can’t do that with Cloud. At some companies, this isn’t a big deal. You can still have your company name somewhere in the URL, and that’s good enough.  

But that’s not every company. Some companies feel having their URL there is essential for customer perception. I can’t say I wholly disagree with this idea, either. I intend to eventually do a review on a tool that allows you to put public sites (like Atlassian Cloud) on your domain, but even that is just a workaround.  

However, on Server, that is the point. You set it up on your servers, so you will need to have the URL set up, too. So you can use whatever URL you have control over. 

Scaling

This is a point I think Jira Cloud does a great job on. Look, no matter how robust your system is or how much care you take over your configuration, if your userbase continues to grow, you will eventually outgrow your single server instance. What then?

Well, you are looking at a migration to Jira Data Center. This process presents its own hurdles and challenges. Just like a Server to Cloud migration, not all Apps that are available for Server are also available for Data Center. You will likely have a larger instance at this point, which means you might be trying to move a large amount of data around. Not fun.

And what do you have to do to scale Jira Cloud? Either pay for more users, or on the odd occasion, pay for a higher tier. That sounds a heck of a lot easier to me. 

Apps

While this will be my last topic today, it is not anywhere near the last difference between the two deployment models. I’ll be including some additional reading if you want to explore all the differences yourself. 

However, I’ve already touched on this one in this article, and it’s probably one of the most annoying things about the deployment models.

Jira Server Apps are not compatible with Jira Cloud, and vice versa. What does this mean functionally? It means that if you are considering moving from one to the other, you have to sit down and look at every single App you are running and determine if they have an equivalent on the other model. And most of the time – there will be at least one or two Apps that don’t.

Then your job is to sit back down and determine if another App is available to give you the same functionality, or failing that, determine if you can live without that App. And honestly, having to find out if you can “live without” happens more often than not.

There is another effect to this incompatibility. I cannot count how many times I have been searching for a plugin to solve a specific problem, and I finally find the perfect App. It’s well-reviewed and has a large install base. And the documentation is thorough and doesn’t look too involved. Then I look up and realize it’s only for whatever deployment model I’m not on. It’s honestly enough to drive one mad sometimes!

Further Reading

As I’ve stated earlier, these are far from the only differences. If I were to write on every difference – and the practical effect of each one – I’d be writing a book. However, if you are interested in reading further, I do have two links you would be interested in. The first is a high-level overview of the differences in the Cloud platform for all the tools. The second is more specific to Jira.

Look, if you are considering moving to the Jira Cloud, or just want more information before committing to a platform, I encourage you to read through both carefully. Figure out what trade-offs you can live with and what you can’t. Jira Cloud may suit your needs very well. Or it might have several deal-breakers. The only way to know is to do your research!

And that’s it for this week!

Before I forget, big news from this past week! Atlassian announced that they are renaming Summit 2021 to Team 2021! I don’t think we know if this will be the same once we can meet in person again, but I’m still excited. They have also opened up submissions for Speakers for Team 2021. I’ve put in three presentations from some of my favorite posts over the past year, and hopefully, I can get selected! If you have any interesting ideas for presentations, breakout sessions, etc., why don’t you submit too! Nothing ventured, nothing gained!

However, if you’ve found this post helpful, why don’t you give it a Like? I post new articles every Wednesday, so you can sign up below to get it delivered directly to your email inbox! You can also find me on Facebook, Twitter, and LinkedIn, where I’ll post updates, news, and new posts – so be sure to follow!

Is Jira feeling slow lately?

Well, last week. Where do I begin? Maybe here:

Yeah, pretty good, right? I thought so too, then Thursday happened:

Just Wow. Thank you, guys! I keep finding that just when I see the ceiling for this community, you guys go and surprise me like this.

On top of the page views last week, I also had a number of you reach out to me last week for various reasons. However, one drew my attention in particular.

Well, Harsha, it’s been a while since I released a System focused post, so challenge accepted. Fair warning, though – this topic gets intense. Jira is a rather complicated system. Even if we ignore the configuration and focus on the system, there are still many places we need to check out to know what bottleneck is causing Jira to perform slowly. But too late, you’ve already asked for it, so let’s dive in.

Lag Sources

CPU

Well, this should have been obvious as the first place to look. If your CPU is overloaded, it’s going to impact performance for the end-user. It amazes me how many people skip looking at this and head directly to adding more Heap Memory to the JVM (which we’ll talk about in a moment). If you are on a Windows Server, you will likely know how to check CPU on it via the Task Manager’s Performance Tab.

This utility will also give you information on Disk Throughput, Memory usage, and Network utilization – all good indicators if you are trying to diagnose slowness in your Jira instance. However, on Linux, I prefer to use the top utility.

This tool will tell you many of the same things, but it requires some know-how to interpret. So why use top instead of something like glances or htop? Well, it’s because top is an almost universal tool on Linux systems. It is so rare for me to encounter a system without it that I cannot actually remember the last time it happened. So how do we interpret this data?

The first thing I usually look at is the load average. This measurement is broken down to load over the past 1 minute, 5 minutes, and 15 minutes. Ideally, this number should not get over 80% of the number of CPUs you have. Do you have 4 CPUs? That number shouldn’t be above 3.2, for example. It’s not an exact science, but it’s close enough to work as a rule of thumb.
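
If you would rather script that rule of thumb than eyeball it, here is a minimal Python sketch. It assumes a Unix-like host (os.getloadavg() is not available on Windows), and the function name and the 0.8 default are simply my way of writing down the 80% guideline, not anything official.

```python
import os


def load_is_healthy(load: float, cpu_count: int, threshold: float = 0.8) -> bool:
    """Rule of thumb: load should not exceed 80% of the CPU count."""
    return load <= cpu_count * threshold


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    # os.getloadavg() returns the 1, 5, and 15 minute load averages (Unix only).
    for minutes, load in zip((1, 5, 15), os.getloadavg()):
        status = "ok" if load_is_healthy(load, cpus) else "high"
        print(f"{minutes:>2} min load {load:.2f} on {cpus} CPUs: {status}")
```

A single high 1 minute reading is usually just a spike; it's the 5 and 15 minute numbers staying high that I'd worry about.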

Another place we can look at is the CPU Utilization, underlined above. These numbers are broken down into various categories, as shown below.

us: user cpu time (or) % CPU time spent in user space
sy: system cpu time (or) % CPU time spent in kernel space
ni: user nice cpu time (or) % CPU time spent on low priority processes
id: idle cpu time (or) % CPU time spent idle
wa: io wait cpu time (or) % CPU time spent in wait (on disk)
hi: hardware irq (or) % CPU time spent servicing/handling hardware interrupts
si: software irq (or) % CPU time spent servicing/handling software interrupts
st: steal time (or) % CPU time in involuntary wait by virtual cpu while hypervisor is servicing another processor (or) % CPU time stolen from a virtual machine

The two we are most interested in are “us” – this will be the active Jira Processes, and “wa” – this will alert us to a slow disk problem. These measurements are broken down as a percentage of the total CPU capability of the system – so they all should add up to 100%. So if your us measurement is taking 80% or more of the total system usage, you are using too much CPU.
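
To make that concrete, here is a small Python sketch that pulls the us and wa values out of top's CPU summary line. The sample line is invented, and the exact layout of that line varies between top versions, so treat the regular expression as an assumption. The 80% limit for us comes from the rule above; the 10% limit for wa is purely my own placeholder, since there is no official cut-off.

```python
import re

# Invented sample of top's CPU summary line; real output varies by version.
SAMPLE = "%Cpu(s): 12.5 us,  3.1 sy,  0.0 ni, 70.2 id, 14.0 wa,  0.0 hi,  0.2 si,  0.0 st"


def parse_cpu_line(line: str) -> dict:
    """Map each category (us, sy, ni, ...) to its percentage."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s*%?\s*(us|sy|ni|id|wa|hi|si|st)\b", line)
    return {name: float(value) for value, name in pairs}


def diagnose(stats: dict, us_limit: float = 80.0, wa_limit: float = 10.0) -> list:
    """Flag the two categories discussed above. wa_limit is a hypothetical value."""
    findings = []
    if stats.get("us", 0.0) >= us_limit:
        findings.append("user CPU is high: Jira may be CPU-bound")
    if stats.get("wa", 0.0) >= wa_limit:
        findings.append("I/O wait is high: check disk speed")
    return findings


stats = parse_cpu_line(SAMPLE)
print(stats)
print(diagnose(stats))
```

You could feed it a real line with something like top -bn1 piped through grep, but check what your version of top actually prints first.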

So, the answer is just adding CPUs, correct? The answer here is maybe. If you are on a VM, it might hurt more than help. How so? My understanding here is that the Hypervisor (that is the VM Server) will only schedule a given cycle for the VM if it has enough CPUs free for that cycle. Therefore, the more CPUs are allocated to the VM, the more the Hypervisor has to free up, and the more time between cycles on your VM. How much is the max for your Hypervisor? I cannot say, but your Hypervisor Admin should be able to tell you based on the VM Host’s CPU Utilization.

What if the wa value is high instead? This event is a sign that your disk speed may be limiting your system performance, at which point we’ll need to look at your disk speed – which we’ll discuss below.

Heap memory

So let me say – Atlassian already has an excellent video freely available on this. It’s called “Trash Talk! How to reduce Downtime by tuning Garbage Collection,” and I still reference it to this day. Seriously, give it a watch.

Oh – you’re still here. Well, this is awkward. I thought everyone would have gone off to watch the video. Seriously, I was at that talk at Summit 2016, and I still refer to this video as a refresher. If you are a Jira System Administrator, you owe it to yourself, your users, and your career to learn how to tune the JVM.

So what are the take-aways from the video? First, don’t try to change the GC Method or any of the GC Parameters. Atlassian sets these to settings that will work for 99.9% of cases. I usually only touch them if asked to by Atlassian support. No – what I typically tune is just these two numbers.

As a best practice, I like to set them equal. That way, the JVM will reserve its max amount of memory on startup, and I won’t be surprised in a day or two by an out-of-memory error (or worse yet, the system deciding to just kill the JVM process).
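
As a quick sanity check, here is a Python sketch that confirms the two values really are equal. I am assuming the two numbers in question are the minimum and maximum heap sizes, which reach the JVM as the standard -Xms and -Xmx flags. The function names and the sample argument strings are made up for illustration.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def heap_bytes(jvm_args: str, flag: str):
    """Return the size in bytes given for -Xms or -Xmx, or None if it is absent."""
    match = re.search(rf"-{flag}(\d+)([kKmMgG]?)\b", jvm_args)
    if not match:
        return None
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def heap_is_pinned(jvm_args: str) -> bool:
    """True when minimum and maximum heap are both set and equal."""
    xms = heap_bytes(jvm_args, "Xms")
    xmx = heap_bytes(jvm_args, "Xmx")
    return xms is not None and xms == xmx


# Hypothetical argument strings, for illustration only.
print(heap_is_pinned("-Xms2048m -Xmx2g"))     # same size, different units
print(heap_is_pinned("-Xms384m -Xmx2048m"))   # heap can still grow after startup
```

It compares sizes in bytes, so 2048m and 2g count as the same setting.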

The second takeaway is that it is entirely possible to cause performance problems by setting the Heap Memory too high. Doing so can cause one of two issues. In the first problem, you end up choking out legitimate system processes from Memory, slowing down the whole System by forcing it to use SWAP space. In the second, the JVM waits longer to do a Garbage Collection, which means that GC Cycle is more likely to be a full GC where the System has to pause the JVM. If the JVM is paused, it is not serving Jira Pages…hence the lag.

So, how do we know what number to set it to? We have to run it for a while, then analyze the GC Logs to find that out. If we see that the JVM is doing Garbage Collections too frequently, we need to bump it up. If we see that the GC’s are pausing the JVM, we might look at bumping it down a bit. Unfortunately, there is no hard-and-fast rule that says, “If you have this many users, set it to this.”

Fileshare/Disk Speed

Now you know the CPU isn’t overloaded, and the JVM is tuned correctly, but your System is still slow. Where do you look next? Well, the file system. I alluded to this previously, but you can get a clue if your System is having some Disk Access issues by looking at the wa statistic in top. However, while that will help if it’s high, it can give a false negative.  

The Good news here is that Atlassian gives us a guide on how to run a Disk Access Speed Test.

Basically, this will have you download a jar file from Atlassian Support and run it against the location where Jira’s Index resides. It should look something like this:

wget https://confluence.atlassian.com/jirakb/files/54362304/54591494/3/1444177154112/support-tools.jar
java -Djava.io.tmpdir=<Jira Home Directory>/caches/indexesV1 -jar support-tools.jar

If you run this, you should get a table like the one above. I typically look to the Median and Max columns, and compare against the table Atlassian provides (copied below).

Statistic     Excellent    OK                   Bad
Open          < 40,000     40,000 – 150,000     > 150,000
Read/Write    < 40,000     40,000 – 100,000     > 100,000
Close         < 20,000     20,000 – 100,000     > 100,000
Delete        < 50,000     50,000 – 300,000     > 300,000

As we can see from my results, the medians are either in the OK or Excellent Range. However, my Max times are not (Also, OUCH on the Open Max). This result was likely an outlier caused by a problem on the VM Host, but cannot be ignored outright. If this were an enterprise setup, I’d start asking the VMware Admin questions to figure out if there are any problems with the storage and if we can speed it up at all.

My result is because I am running on seven-year-old hardware disposed of by some company as end-of-life and cobbled together by someone who is still learning.  
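
If you run this test regularly, comparing against the table by hand gets old. Here is a minimal Python sketch that does the lookup for you. The thresholds are copied from Atlassian's table above; I am assuming your measurements are in the same unit as that table (nanoseconds, as far as I know), and the sample readings at the bottom are invented, not my actual results.

```python
# Thresholds copied from the table above: (excellent_below, bad_above).
THRESHOLDS = {
    "open": (40_000, 150_000),
    "read/write": (40_000, 100_000),
    "close": (20_000, 100_000),
    "delete": (50_000, 300_000),
}


def rate(stat: str, value: int) -> str:
    """Classify one measurement as Excellent, OK, or Bad."""
    excellent_below, bad_above = THRESHOLDS[stat.lower()]
    if value < excellent_below:
        return "Excellent"
    if value <= bad_above:
        return "OK"
    return "Bad"


# Hypothetical median readings, for illustration only.
medians = {"Open": 35_000, "Read/Write": 62_000, "Close": 18_000, "Delete": 410_000}
for stat, value in medians.items():
    print(f"{stat:<12}{value:>10,}  {rate(stat, value)}")
```

Run it once against your Median column and once against your Max column, since those can tell very different stories.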

Database

So, this will be a radical idea, but sometimes Jira’s slowness isn’t caused by Jira. Sometimes it’s caused by an external system. No, seriously, how well your Database performs will have a MASSIVE impact on how well Jira performs.  

Atlassian thankfully also has a tool for this situation.

Again, this will have you download a jar and run it on your Jira server.

wget https://confluence.atlassian.com/jirakb/files/54362302/54591493/2/1444177155911/atlassian-log-analysis-0.1.1.jar
java -cp PATH_TO_THE/atlassian-log-analysis-0.1.1.jar:PATH_TO_YOUR_JDBC_DRIVER_JAR \
com.atlassian.util.benchmark.JIRASQLPerformance \
YOUR_DB_USERNAME YOUR_DB_PASSWORD \
JDBC_CONNECTION_STRING JDBC_DRIVER_CLASS \
> db-perf-test.txt

You will get the YOUR_DB_USERNAME, YOUR_DB_PASSWORD, JDBC_CONNECTION_STRING, JDBC_DRIVER_CLASS settings from your Jira instance’s dbconfig.xml file. A note on this tool: you need 1000 issues in your Jira instance to run it, as I found out the hard way.
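
If digging through XML by hand isn't your idea of fun, here is a Python sketch that pulls those four values out for you. I am assuming the usual dbconfig.xml layout, with a jdbc-datasource element holding url, driver-class, username, and password; check it against your own file. The sample XML below is invented, including the host name and credentials.

```python
import xml.etree.ElementTree as ET

# Invented sample that mimics the usual dbconfig.xml layout.
SAMPLE_DBCONFIG = """<jira-database-config>
  <name>defaultDS</name>
  <database-type>postgres72</database-type>
  <jdbc-datasource>
    <url>jdbc:postgresql://db.example.com:5432/jiradb</url>
    <driver-class>org.postgresql.Driver</driver-class>
    <username>jirauser</username>
    <password>changeme</password>
  </jdbc-datasource>
</jira-database-config>
"""


def jdbc_settings(xml_text: str) -> dict:
    """Map the benchmark's placeholder names to the values in dbconfig.xml."""
    root = ET.fromstring(xml_text)
    datasource = root.find("jdbc-datasource")
    if datasource is None:
        raise ValueError("no <jdbc-datasource> element found")
    return {
        "YOUR_DB_USERNAME": datasource.findtext("username"),
        "YOUR_DB_PASSWORD": datasource.findtext("password"),
        "JDBC_CONNECTION_STRING": datasource.findtext("url"),
        "JDBC_DRIVER_CLASS": datasource.findtext("driver-class"),
    }


for name, value in jdbc_settings(SAMPLE_DBCONFIG).items():
    print(f"{name} = {value}")
```

To use it for real, read the text of the dbconfig.xml in your Jira home directory and pass it in. Just remember the output includes your database password, so don't paste it anywhere public.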

However, Atlassian does give you an example of what the output should look like.

TOTALS
----                         ----       ----     ----     ----
stat                         mean       median   min      max
----                         ----       ----     ----     ----
retrieve-issue               5,338,000  979,000  213,000  46,007,000
get-issue                    174,775    93,000   62,000   11,621,000
retrieve-workflow            5,117,153  607,000  341,000  47,738,000
get-workflow                 98,996     64,000   40,000   2,962,000
retrieve-custom-field-value  601,093    495,000  316,000  23,082,000
get-custom-field-value       91,246     52,000   37,000   3,453,000
----                         ----       ----     ----     ----
All times are in nanoseconds.

Atlassian states they don’t have an “Excellent, Good, Bad” chart for DB values clearly defined, but they tend to look for any values below 20 ms as good, and below 10 ms as ideal. (Remember, 1 ms = 1,000,000 nanoseconds.)
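
Since the tool reports nanoseconds and the guidance is in milliseconds, here is a small Python sketch that does the conversion and applies those two cut-offs. The mean values are copied from Atlassian's example output above. The "slow" label for anything at 20 ms or more is my own wording, not Atlassian's.

```python
NS_PER_MS = 1_000_000

# Mean times in nanoseconds, copied from the example output above.
MEANS_NS = {
    "retrieve-issue": 5_338_000,
    "get-issue": 174_775,
    "retrieve-workflow": 5_117_153,
    "get-workflow": 98_996,
    "retrieve-custom-field-value": 601_093,
    "get-custom-field-value": 91_246,
}


def rate_ms(value_ms: float) -> str:
    """Below 10 ms is ideal, below 20 ms is good; 'slow' is my own label."""
    if value_ms < 10:
        return "ideal"
    if value_ms < 20:
        return "good"
    return "slow"


for stat, nanos in MEANS_NS.items():
    millis = nanos / NS_PER_MS
    print(f"{stat:<28}{millis:>8.3f} ms  {rate_ms(millis)}")
```

Every mean in the example comes in under 10 ms, but the max column is a different story: 46,007,000 ns is about 46 ms.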

Network

Well, it’s not the CPU, it’s not the Memory, File System, or Database. What does that leave? Well, the network can be a bottleneck. You can usually check this with a simple ping test. Does the ping take a long time to reach the Server? Then you are more likely to see issues with Jira’s speed.

Look, I get it. You might be in India, and your company’s Server is located in the United States. You are only ever going to get your pings so low. If you are on Data Center, you can use a CDN to help with some of that latency, but it will take actual time for the packets to travel around the world. 

However, if you are located in the same building as your Jira Server and are still getting high pings, it might be time to talk with your Network Admins to figure out what is going on – especially if you’ve run all the tests to show it’s not the Jira Server itself. 

Maybe it’s just busy.

Look, I’ve done my best to ignore the Jira Configuration in all this and focus solely on the System. However, you will get to a point where it will become impractical to add more Memory or CPUs to a server. At that point, you have two real options.  

For your first option, you can do the hard work and clean up your configuration. Get rid of unused workflows and permission schemes, consolidate duplicate custom fields, and get Jira healthy in that regard. It’s not easy, and may not be politically expedient, but it is well worth it.

Your other option is to migrate to a Data Center Architecture. Look – there are only so many users a single server can handle. Data Center Edition solves this by spreading the load out to several servers, each working together to form a single Jira instance. I have several articles already on how to convert your Jira server instance into a Data Center instance. So if this sounds like you, maybe it’s time to consider an upgrade. 

Well, there you go!

Look, this is an involved topic. Pinpointing your performance issue to a single cause is tough – especially when it’s a real possibility that there are several causes. But it’s well worth it to go through each of these and see how your System is doing. 

This week has been a busy one. I haven’t had a chance to read as much or do research like I usually like to do. So, I only have one webinar to share with everyone this week. “Get Jira superpowers: Reporting on Projects and Calculated Fields” is a joint presentation by Deiser, Old Street Solutions, and cPrime. And it’s tomorrow, 11:00 AM Eastern time! You should check it out!

As always, if you enjoyed this post you can subscribe below to receive new blog posts directly to your inbox. You can also like, follow, and comment on social media to help your colleagues discover our content! You can find me on Twitter, Facebook, and LinkedIn with regular updates, new posts, and news about the blog. But until next time, my name is Rodney asking, “Have you updated your Jira issues today?”

The System is Down!

Well, we made it to August! I hope you enjoyed App Month. To be completely honest, I only realized July had five Wednesdays after I had already committed to the idea. Oops!

But I got to meet some really great Atlassian Partners, learn some cool things about some Apps, and in general had a lot of fun. And you guys seemed to like it too!

Another record breaking month, with an additional 800 page views on top of June. To put it mildly, you guys killed it! Let’s see what August holds!

Today we will be looking at what you should do when Jira is down. As usual with us, this only applies to Jira Server and Data Center. Look, no matter what we do, Jira will come down unexpectedly at some point. That’s just one of the joys of running any service. If you are lucky and are monitoring all the right metrics, it may only come down when you plan for it to. However, everyone’s luck runs out at some point. So let’s take a look at what we can do now to be prepared.

Have a Plan

There was a time, early in my career, when I didn’t plan for downtime. When downtime happened, I was all panic, no purpose. I’d extend downtimes longer than they needed to be so that I could find a permanent solution. This, ladies and gentlemen, is an inadequate approach when people depend on your system. 

When people depend on your system to do their work, every hour it’s down is an hour the company is paying them to do nothing. I once had to break it down like this for a software engineer who wanted me to keep Jira down until lunch.

Let us say you have 400 Developers, with an average salary of $150,000/yr. That breaks down to roughly $72 per hour per Developer (assuming 2,080 working hours a year). That means a Jira outage costs you $28,846 in lost productivity every hour, and that is just Developers! It does not include Project Managers, IT, Management, UI/UX, QA, and everyone else that depends on Jira. You can see how it can quickly add up.
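If you want to plug in your own numbers, the arithmetic above fits in a few lines of Python. This is just a sketch: the 2,080-hour work year (40 hours times 52 weeks) is the assumption that produces the $72 figure, and the function name is my own.

```python
HOURS_PER_YEAR = 40 * 52  # 2,080 working hours; adjust for your company


def outage_cost_per_hour(headcount: int, avg_salary: float) -> float:
    """Salary paid per hour to people who cannot work while Jira is down."""
    return headcount * avg_salary / HOURS_PER_YEAR


hourly_rate = 150_000 / HOURS_PER_YEAR
print(f"Per developer: ${hourly_rate:.2f}/hour")                          # $72.12/hour
print(f"400 developers: ${outage_cost_per_hour(400, 150_000):,.0f}/hour")  # $28,846/hour
```

Run it once per group (Developers, QA, Project Managers, and so on) and add the results to get a fuller picture for your Runbook.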

However, it is possible to be too hasty here too. You could be destroying information you need for a permanent fix by performing a quick fix. In that situation, you’ll likely have a ticking time-bomb, ready to bring down your system again.  

That is why it is essential that you have a plan. Ideally, a document that describes whom to contact, what information to gather, and when to escalate. The industry term for this document is a Runbook, and it is recommended you have one for every system you manage.

In the past, I’ve linked to a generic Runbook template – and to be honest, it wasn’t the greatest for Atlassian Products. Atlassian themselves have a template that I like better than the one I’ve previously linked. However, I’ve taken the time to customize it further into a generic Jira Runbook template. This request came out of my last Webinar, and I thought it was a great idea! It will still need a lot of information specific to your instance filled in, but at least it’s a start! 

Communicate Immediately

So, you’ve got a plan, and you’re following it. Good. But remember, many people need Jira and can’t get to it. Keeping them in the dark only means dealing with that many more interruptions while you try to fix things. They need to know what’s going on, even if the only update is “I know about it, I’m working on it, and I’ll give updates as I know more.”

There are several ways you can do this. I had a list of email groups in Outlook that I could copy/paste into a new email. This method meant I didn’t have to remember who all I needed to contact – as again, I usually had other things on my mind. I just wrote up my message, pasted in the BCC, and hit send.

<pet-peeve> By the way, that’s another thing. For large chains, use the BCC rather than CC or TO lines. That way, if anyone needs to reply to you, they can without interrupting everyone else. </pet-peeve>

Another option you can look at is Statuspage. I’ve always liked the product, even though I never had the benefit of working with a group that used it. Training your users to check here for issues will help them find information on outages first without bothering you. Doing this sounds like a win-win to me.

Collect Information

So, you’re following your plan, and you’ve notified your users. Next?

Next, you need to take time now to gather information before attempting a restart. Typically, I like to have the following two things in case I need to go to support.

1: Thread Dump

A thread dump is a detailed list of everything Jira is doing currently and how long each task is taking. Having these details can be invaluable in determining why Jira is behaving weirdly or being slow. Atlassian now provides a script to automate capturing these thread dumps. Check out Atlassian’s documentation on generating thread dumps for the details.

As a note on Thread Dumps, if you install a plugin called Thready, it will help you analyze the thread dumps by attaching the thread’s name to each entry. It’s a free plugin and doesn’t impact performance, so I usually test and install it on my instances to be ready. 

2: Support Packet

The support packet is another thing I try to capture if I can. Getting this will depend on your Jira instance being alive and responsive, so you may not be able to get it. If you can’t, don’t worry. Capture your log files from <Jira Home>/log/*.log, and you should be good to go. The idea is that before you change anything to get Jira back up, you take a moment to gather the things that will help you solve this problem long-term. 
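If the support packet isn’t an option, grabbing the logs is easy to script ahead of time so you are not typing paths under pressure. Here is a minimal Python sketch that bundles everything matching <Jira Home>/log/*.log into a timestamped archive. The function name, the archive name, and the example paths are all my own placeholders.

```python
import tarfile
import time
from pathlib import Path


def archive_jira_logs(jira_home: str, dest_dir: str) -> Path:
    """Bundle <jira_home>/log/*.log into a timestamped .tar.gz in dest_dir."""
    log_dir = Path(jira_home) / "log"
    log_files = sorted(log_dir.glob("*.log"))
    if not log_files:
        raise FileNotFoundError(f"no .log files found in {log_dir}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(dest_dir) / f"jira-logs-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for log_file in log_files:
            # Store just the file name so the archive unpacks cleanly.
            tar.add(log_file, arcname=log_file.name)
    return archive


# Example usage (placeholder paths, substitute your own Jira home):
#   print(archive_jira_logs("/var/atlassian/application-data/jira", "/tmp"))
```

Write the archive somewhere other than the disk Jira lives on. If a full disk is what brought Jira down, you don’t want to make it worse.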

Try to change one thing at a time.

So, you’ve collected the evidence, and you’ve told people you’re on it. What now? You can run in like a firefighter, make eleven changes, and pray one of them fixes Jira, right?

WRONG! Look, you’ll want to tell your management, your users, and your future self what went wrong and how you fixed it. You can’t do that if you aren’t sure what fixed the problem. That is why you need to take a breath, calm yourself, and focus on one change at a time. Change something; see if it works now. Change again, repeat. Do so until something works. You will still need to pay attention to the logs and hunt for clues on Google. But take your time, be methodical, and be sure what your problem was when all is said and done. You will be thankful for it; trust me. 

Did I say Communicate?

Congratulations, you’ve gotten through the worst part of it. Jira is now back up and running, and everyone’s happy, right? 

Well, no. First thing, you need to let your users know Jira is back up and ready for use. They are waiting to do their jobs, after all. Some of them will find it’s working on their own. But it is common courtesy to let everyone know.  

Document, document, document!

For some, this will be the worst part. It’s excellent that Jira’s up and running, but some people (like your management) may have questions about what happened. And these are not people you want to keep in the dark.  

I typically write a document that I call an After Action Report. I’ve also heard them called Root Cause Analysis, but an After Action Report makes me feel more like a hero after a big fight. Yes, it’s an ego thing.

Typically, I’m looking to answer three questions:

  1. What went wrong. Include a timeline of events, and the major players and systems involved.
  2. What you did to fix it short term. Be detailed, and write down procedures and commands. You never know when having these handy will save you time in the future.
  3. How you intend to keep this from happening again. Action items here could either be permanent fixes to be done, a followup with Atlassian support, monitoring on a specific metric or component, or a change to Standard Operating Procedures. The idea is to show you intend not to let a problem become a pattern. 

Keep these in the same place (Confluence!). Again, if you’ve done your job right, you may never have to reuse one. But it’s handy to have it there and helpful if you ever need to refer to a fix you’ve found before. 

Congratulations, you’ve survived!

Having downtime can be one of the most stressful events of your career. I should recount the time I was on duty for twenty-four hours straight with a severely troubled Jira instance. To be fair, I only spotted the problem after taking three hours to get some rest – so had I gotten some rest sooner, I may have resolved it that much faster. Seriously, don’t be me!

What are some of your downtime stories? I’d like to hear some of them in the comments! Speaking of comments, next week I’ll be compiling all the questions I’ve gotten in the comments and in DMs into a post! If you have any questions you’d like me to answer, go ahead and get them in!

If you’ve enjoyed this post, be sure to follow the blog to get new posts directly in your inbox! You can use the form below to sign up! You can also follow us on Facebook, Twitter, and LinkedIn to get the latest updates! Be sure to like and comment on the posts, so the social media networks know the Jira Guy is worth sharing! But until next time, my name is Rodney, asking, “Have you updated your Jira issues today?”

Integrating your Atlassian Cloud with Azure AD

Well, today, it seems we are going to do something I admittedly rarely do on the blog. That’s right; today, we are going to admit that JIRA Cloud exists!  

It’s not that I have anything against JIRA Cloud. My specialties tend to lie around making sure the underlying JIRA system runs as smoothly as possible, which is hard to do when you don’t own the underlying system. However, there is still plenty of overlap between JIRA Server/DC and JIRA Cloud, so it’s not like I’m unqualified to speak on it!

So it’s no secret at work that I maintain a whole collection of personal test systems. I do this to replicate and test just about anything I want without waiting for permission. The environments include (but are not limited to):

  1. vCenter Environment for VMs
  2. More Raspberry Pis than I rightly know what to do with
  3. AWS Account
  4. Azure Account
  5. Cloud Environments of Confluence, Bitbucket, JIRA Software, and JIRA Service Desk
  6. Server Environments of Confluence, Bitbucket, JIRA Software, and JIRA Service Desk
  7. Several VPS online, including one running (wait for it…) Confluence.
This is RACK01. As in, “Yes, there is also a RACK02”. I…I might have a problem.

So, when my manager wanted some help looking into some oddness he saw in JIRA Cloud using Azure AD, he knew who had the tools to recreate and test that setup.

However, I didn’t know how to set up the integration when I started. So I had to learn that. And since I had to learn, I might as well help you learn too!  

Pre-reqs

To pull this off, you will need a few things first.

  • An Azure AD subscription. If you don’t have a subscription, and just want to do some testing, you can get a one-month free trial here.
  • Atlassian Cloud single sign-on (SSO) enabled subscription.
  • To enable Security Assertion Markup Language (SAML) single sign-on for Atlassian Cloud products, you need to set up Atlassian Access. Learn more about Atlassian Access.
  • A Claimed Domain with Atlassian. To do this, you will need to be able to modify the DNS records for your domain.

Also, we cannot forget the documentation. This actually was from Microsoft, and not Atlassian! Shocking, I know. But it was on point and guided me through most of the process.

Setting up Single Sign-On (SSO)

Single Sign-On, or SSO, is a mechanism that does what it says on the tin. If you log in to any application participating in the SSO environment, you will not be required to re-enter your password to sign into any other participating app. So if both your JIRA and Confluence are a part of the same SSO environment, you can start working in JIRA, then move over to Confluence without having to pause to authenticate again.

  1. To get started, go to your Azure AD Directory, then click “Enterprise Applications” in the sidebar (underscored in red). This page is where you will set up the integration with Atlassian Cloud.
  2. Now that you are on the Enterprise Applications screen, click “New Application.”
  3. In the search bar shown, type “Atlassian Cloud”. Doing this will bring the integration up in the search results. Once it appears, click on it.
  4. Clicking the search result will cause the following menu to pop up on the right-hand side. You won’t need to modify anything here, so you can click “Add” at the bottom of this menu.
  5. We can safely skip “1. Assign users and groups” for now. Proceed by clicking “2. Setup Single sign-on.”
  6. On the next screen that appears, you are presented with three choices. Select the second option that says, “SAML.”
  7. Next, you will get a pop-up asking about saving. For now, click “No, I’ll save later.”
  8. You can save Section 1 on the next screen for later, as you will need information from Atlassian to complete this section. Instead, move on to Section 2 by clicking its “Pencil” icon.
  9. Here, we’ll only need to update one attribute. By default, Azure AD wants to send the user’s Principal Name to Atlassian Cloud. However, Atlassian wants the email address in this field. So to change it, click “Unique User Identifier (Name ID).”
  10. Doing so will cause the following form to appear. Change “user.userprincipalname” to “user.mail” under Source attribute, then click “Save.”
  11. On the Navbar, click “SAML-based Sign-on” to return to the previous section.
  12. With the Attributes & Claims ready, we can start collecting information Atlassian will need. To begin with, download the Base64 Certificate in Section 3 to your local system.
  13. The next three pieces of data we will need are in Section 5. Copy the three URLs highlighted below to a notepad you can reference later. To find them, you will need to expand the “Configuration URLs” dropdown menu.
  14. Now we can switch over to Atlassian and start the setup there. On your admin page at https://admin.atlassian.com, select Security → SAML single sign-on.
  15. On the page shown below, click “Add SAML configuration.”
  16. Now we can start entering the information we got from Azure AD. Be sure to pay attention to how I have it mapped below, as Atlassian and Azure have different names for each field.
    • Enter Login URL from Azure into the Identity provider SSO URL field
    • Enter the Azure AD Identifier from Azure into the Identity provider Entity ID field
  17. Now open the Certificate you downloaded in Step 12 in a text editor of your choice. Copy the contents into the Public x509 certificate field, then click “Save.”
  18. Now we will need to give Azure some information on your Atlassian Cloud setup. To do so, copy the “SP Entity ID” and “SP Assertion Consumer Service URL” fields from the next page.
  19. You remember in Step 8, when I had you skip Section 1 on Azure’s SSO Configuration? Now is when we will go back and fill it in by clicking the “Pencil” icon.
  20. Here we’ll paste the two URLs we copied in Step 18 into the two highlighted fields. Be sure to pay attention below, as again, Azure and Atlassian disagree on what to call these fields.
    • The SP Entity ID field from Atlassian goes into the Identifier (Entity ID) field in Azure
    • The SP Assertion Consumer Service URL field from Atlassian goes into the Reply URL (Assertion Consumer Service URL) field in Azure
    • Be sure to click the “Default” checkbox next to both, then click “Save”
  21. You should get a pop-up asking if you want to Test single sign-on. Click “Yes”. This will open the following screen. If your user is already provisioned in Atlassian Cloud, click “Sign in as current user”.
  22. Congratulations, SAML SSO is now set up!
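Since the two vendors name the same values differently, it can help to keep the mapping in one place while you copy and paste. Here is a small Python lookup built only from the field names used in the steps above. The dictionary and function names are my own invention for illustration, and this is a cheat sheet, not something that talks to either service.

```python
# Values you copy FROM Azure AD and paste INTO Atlassian.
AZURE_TO_ATLASSIAN = {
    "Login URL": "Identity provider SSO URL",
    "Azure AD Identifier": "Identity provider Entity ID",
    "Base64 Certificate": "Public x509 certificate",
}

# Values you copy FROM Atlassian and paste INTO Azure AD.
ATLASSIAN_TO_AZURE = {
    "SP Entity ID": "Identifier (Entity ID)",
    "SP Assertion Consumer Service URL": "Reply URL (Assertion Consumer Service URL)",
}


def where_does_it_go(field_name: str) -> str:
    """Return a short hint for where a copied value should be pasted."""
    if field_name in AZURE_TO_ATLASSIAN:
        return f"Atlassian: {AZURE_TO_ATLASSIAN[field_name]}"
    if field_name in ATLASSIAN_TO_AZURE:
        return f"Azure: {ATLASSIAN_TO_AZURE[field_name]}"
    raise KeyError(f"unknown field: {field_name}")


print(where_does_it_go("Login URL"))
print(where_does_it_go("SP Entity ID"))
```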

Setting up User Provisioning

So, we have SSO set up. Great!

As things stand now, you still have to go and manually populate every new user in your Atlassian environment. Not Great.

To resolve this, we’ll next set up User Provisioning, which also does what it says. This process will automatically set up new users in your Atlassian Cloud system as you add them in AD. Which, once again, will be Great.

  1. Go back to the Atlassian Cloud Integration page in Azure. This is the page from Step 5 of the SSO setup above. Once there, click “Part 3. Provision User Accounts.”
  2. On the next screen, we will select “Automatic” under Provisioning Mode.
  3. Next, we’ll need to set up some things under your Atlassian Access screen (https://admin.atlassian.com). To get started here, click “Back to organization” → Directory → User Provisioning.
  4. Now we will click the “Create a Directory” button to get started here.
  5. Enter a name for your Directory. To keep it descriptive, I like to copy the name from the Azure Directory. After you enter the name, click “Create.”
  6. With this created, Atlassian presents us with two pieces of information that we’ll need to give Azure. Copy both the URL and the API key.
  7. Back within Azure, we will enter both of these into the Admin Credentials section. Again, be careful here as Atlassian and Azure disagree on what to call them.
    • The Directory base URL from Atlassian will go into the Tenant URL field in Azure
    • The API key from Atlassian will go into the Secret Token field in Azure
    • Be sure to test the connection after you enter both
    • OPTIONAL: You can also enter a Notification Email to get failure notices.
  8. On the next page, Mappings, you can use the defaults as-is. Just click “Next.”
  9. Under Settings, set “Provisioning Status” to “On,” then set Scope to “Sync Only Assigned users and Groups.”
  10. Click “Save,” and you are done!

Azure AD will now sync your selected users to Atlassian automatically! But which users will Azure sync? That is the focus of our next section!

Adding Users and Groups to sync to Atlassian Cloud

So with our setup right now, we have Azure syncing over only selected users to Atlassian. We set it up like this because if you sync everyone and have a large AD environment, you can quickly find yourself out of licenses on JIRA. So let us explore how we tell Azure which users it needs to set up in Atlassian Cloud.

  1. Back on the Atlassian Cloud Overview Page (again, from Step 5 of the SSO Setup), click “Users and Groups” from the sidebar.
  2. On this screen, click “+ Add User” at the top of the screen.
  3. Click “Users” then select the Users that Azure should sync with Atlassian Cloud. Repeat for Groups that you would like to also sync over to Atlassian Cloud.
    Note: As I did my testing on Azure’s free tier, I didn’t have groups available to get a screenshot of. Sorry!
  4. Select Role then click Assign. Congratulations! These users will now be populated into Atlassian Cloud during the next sync operation!

And that’s it!

You now have your Atlassian Cloud environment set up and ready to use Azure for Authentication! If you are already leveraging Azure AD to manage your users, it is just one less headache to worry over. 

Job Seeker Profile!

So, it does happen where someone searching for a job will contact me to ask if I know of any open positions. Unfortunately, I am not always able to help them in that regard. However, given the uncertain times we live in, I want to do something. So I’ll feature them here.

That is the case today with Siva Kumar Veerla from Hyderabad, India. He has recently been thrown into the job market due to the COVID-19 Pandemic. From his CV, he is a solid Atlassian Administrator who has led several projects, including upgrades and system installs. He is currently looking for opportunities in India or Europe. If you think he might be a good fit for you, please feel free to contact him on LinkedIn or through the information on his CV.

And Other exciting things!

Let me just say…Wow. This month has been amazing! For starters, look at this.

Yes, that is a new record month for the blog! Thank you for continuing to read, comment, like, and share the blog on the various Social Media platforms.

I’d also like to thank Predrag Stojanovic especially, who pointed out an Atlassian Group on Facebook. And well, that group loved last week’s blog post! So, I’ve gone ahead and set up a Facebook page for thejiraguy.com blog! Like Twitter, like this page to get the latest posts from the blog and random Atlassian news I find interesting! You can also subscribe below to get new posts delivered directly to your inbox!

Also, I will be giving a presentation tomorrow on Monitoring your Atlassian Applications using Nagios! If you are in the Atlanta, GA area, tune in Thursday! If you are not, I am trying to refine this presentation to submit to Atlassian for Summit. So, with a bit of luck, you’ll be hearing it from me next April!

But until next time, this is Rodney, asking “Have you updated your JIRA Issues today?”

Alerting on JIRA Problems using Nagios

So I ran into an interesting situation this past Monday. Apparently, my Primary DNS had been down for at least a week. I went to go look at my network monitoring tool (LibreNMS) – and THAT was down too – for what I can guess is at least two weeks! Granted, I haven’t been doing as much on my Homelab since early March when I went into the hospital, but this was still not a good state of affairs.

So I decided to stand up a Nagios instance to monitor and alert when I have critical systems down. After getting it stood up, it didn’t take me long to start thinking about how I could use this with JIRA, which is now the topic we are going to cover today!

A bit of history

As you know, when I started my Atlassian journey, I was in charge of more than just JIRA. Nagios was one of the boxes I inherited as well. So I’m somewhat familiar with the tool already and how to configure it. I’ve had to modify things during that time, but never do a full setup. However, I knew I wanted to do more than monitor if JIRA was listening to web-traffic. So as part of the whole installation, I decided to dive in and see what she can do.

How to select what to Alert on.

Selecting what I want to be alerted for has always been a balancing act for me. You don’t want to have so many emails that they become worthless, but you don’t want to have so few that you won’t be alerted to a real problem.  

The goal of alerting is to clue you into problems so you can be proactive. Fix back end problems before they become a user ticket. So I always try to take the approach “What does a user care about?”

They care that the system is up and accessible, so I always monitor the service ports, including my access port. So that’s three.

A user also cares that their integrations work. If your integrations depend on SSL, and your clock drifts too far out of alignment, those integrations can fail – so I want to check the system is in sync with the NTP Server.

A feature that users love is the ability to attach files to issues. This feature will eventually chew up your disk space, so I’ll also want to monitor the disk JIRA’s home directory lives on. 

Considering I’m using a proxy, I’ll want to be sure the JVM itself is up, so I’ll need to look at that. I’ll also want to be sure that JIRA is performing at its best, and isn’t taking too long to respond, so I’ll want an alert for that as well.

Do you see what I’m doing? I’m looking at what can go wrong with JIRA when I’m not looking and setting up alerts for those. The idea here is I care about what my users care about, so I want the Nagios to tell me what is wrong before my users get a chance to.

So…configurations.  

Now comes the fun part. Nagios’ configuration files are a bit much to take in at first. However, I will be isolating the Atlassian specific configurations to make things a bit easier on all of us. First, let’s start with some new commands I had to add.

###############################################################################
# atlassian_commands.cfg
#
#
# NOTES: This config file provides you with some commands tailored to monitoring
#        JIRA nodes from Nagios
# AUTHOR: Rodney Nissen <rnissen@thejiraguy.com>
#
###############################################################################


# Passes only if the /status response contains {"state":"RUNNING"}
define command {
    command_name    check_jira_status
    command_line    $USER1$/check_http -S -H $HOSTADDRESS$ -u /status -s '{"state":"RUNNING"}'
}

# ARG1/ARG2: warning/critical response time in seconds
# ARG3: known-good issue key, ARG4: Base64-encoded credentials
define command {
    command_name    check_jira_restapi
    command_line    $USER1$/check_http -S -H $HOSTADDRESS$ -u /rest/api/latest/issue/$ARG3$ -s "$ARG3$" -k 'Authorization: Basic $ARG4$' -w $ARG1$ -c $ARG2$
}

# ARG1: path to check, ARG2/ARG3: warning/critical thresholds
define command {
    command_name    check_jira_disk
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -l nagios -C "/usr/lib64/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG1$"
}

# ARG1/ARG2: warning/critical load averages
define command {
    command_name    check_jira_load
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$" -l nagios
}

The first two commands here are VERY tailored to JIRA. The first one checks that the JVM is running, all with a handy HTTP request. If you go to your JIRA instance and request the /status path, JIRA will respond with a simple JSON document telling you the state of the node. JIRA Data Center uses this feature so your load balancer can determine which nodes are up and ready for traffic. Buuut…it’s on JIRA Server too, and we can use it for active monitoring. So I did. If JIRA returns anything other than {"state":"RUNNING"}, the check will fail and you will get an alert.
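If you want to see the logic of that check outside of Nagios, here is a minimal Python sketch of the same idea. One difference worth flagging: check_http with -s does a plain string match on the response, while this sketch parses the JSON, so it is a slightly stricter stand-in and not a copy of what the plugin does. The function name is my own.

```python
import json


def jira_is_running(status_body: str) -> bool:
    """Return True only if the /status body reports the RUNNING state."""
    try:
        payload = json.loads(status_body)
    except json.JSONDecodeError:
        # An error page or an empty body is not a healthy node.
        return False
    return isinstance(payload, dict) and payload.get("state") == "RUNNING"


print(jira_is_running('{"state":"RUNNING"}'))       # True
print(jira_is_running("<html>Bad Gateway</html>"))  # False
```

Parsing the JSON means whitespace changes in the response won’t trip the check, which a raw string match would.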

The second is a check on the REST API. This one will exercise your JIRA instance to make sure it’s working without too much load time for users. The idea is to request a known issue key and see if JIRA returns valid information within a reasonable time. $ARG1$ is how long JIRA has before Nagios will issue a warning that it’s too slow (in seconds), and $ARG2$ is how long JIRA has before Nagios considers it a critical problem. $ARG3$ is your known good Issuekey. $ARG4$ is a set of credentials for JIRA encoded in Base64. If you are not comfortable leaving your actual credentials merely encoded like this (Base64 is trivially reversible), I’d suggest you check out the API Token Authentication App for JIRA. Using the App will allow you to use a token for authentication and not expose your password.
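For reference, the value that goes into $ARG4$ is just "username:password" run through Base64, which is how HTTP Basic authentication works. Here is a short Python sketch for generating it. The account name and password below are made-up placeholders, and the function name is mine.

```python
import base64


def basic_auth_value(username: str, password: str) -> str:
    """Build the Base64 string that follows 'Authorization: Basic '."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


token = basic_auth_value("nagios", "example-password")  # placeholder credentials
print(token)

# Base64 is an encoding, not encryption: anyone can reverse it.
print(base64.b64decode(token).decode("utf-8"))  # nagios:example-password
```

That last line is exactly why I’d treat the Nagios config file as sensitive and give the monitoring account as few permissions as you can get away with.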

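The third command checks the disk that holds JIRA’s home directory ($ARG1$). $ARG2$ and $ARG3$ are the percentages for the warning and critical thresholds, respectively. Keep in mind that check_disk reads these as the percentage of free space remaining, so warning at 75% used and going critical at 85% used means passing 25% and 15%.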

The fourth is for checking the system load. This one is relatively straightforward. $ARG1$ is the system load that will trigger a warning, and $ARG2$ is the system load that shows you have a problem.

Now for the JIRA host configuration:

###############################################################################
# jira.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create configuration entries to monitor
#       the local (Linux) machine.
#
###############################################################################



###############################################################################
#
# HOST DEFINITION
#
###############################################################################

# Define a host for the local machine

define host {

    use                     linux-server            ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
    host_name               jira
    alias                   JIRA
    address                 192.168.XXX.XXX
}


###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################

# Define a service to "ping" the local machine

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}


# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     SSH
    check_command           check_ssh
}



# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     HTTP
    check_command           check_http
}

define service {

    use                     generic-service
    host_name               jira
    service_description     HTTPS
    check_command           check_https
}


define service {
    use                     generic-service
    host_name               jira
    service_description     NTP
    check_command           check_ntp!0.5!1
}

define service {
    use                     generic-service
    host_name               jira
    service_description     JIRA Status
    check_command           check_jira_status
}

define service {
    use                     generic-service
    host_name               jira
    service_description     JIRA API Time
    check_command           check_jira_restapi!2!3!HL-1!<Base64 Credentials>
}

define service {
    use                     generic-service
    host_name               jira
    service_description     JIRA System Load
    check_command           check_jira_load!5.0,4.0,3.0!10.0,6.0,4.0
}
    
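# check_disk thresholds are free space: warn below 25% free (75% used),
# go critical below 15% free (85% used)
define service {
    use                     generic-service
    host_name               jira
    service_description     JIRA Home Directory Free Space
    check_command           check_jira_disk!<JIRA Home>!25%!15%
}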

So, first, we define the host. This section is information specific to JIRA. Then we start setting services for JIRA. Within Nagios, a Service is a particular check you want to run.

The next four options are pretty standard. These are checking Ping, the two service ports (HTTP and HTTPS), and the SSH port. The SSH Port and HTTP/S port checks will also check that those services are responding as expected.

The next check is for NTP. I have this set up to warn me if the clock is a half-second off and give me a critical error if the clock is off by one second. These settings might be too strict, but it has yet to alert, so I think I have dialed it in well enough.

The next is the JIRA Status check. This service will check /status, as we mentioned earlier. It’s either the string we are expecting, or it’s not, so no arguments needed.

After that is my JIRA API check, which I set up to check the HL-1 issue. If the API call takes longer than 2 seconds, Nagios issues a warning, and if it takes longer than 3 seconds, Nagios issues a critical problem. This alert won’t tell me exactly what’s wrong, but it will tell me if there is a problem anywhere in the system, so I think it’s a good check.
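
To make the warning and critical logic concrete, here is a minimal Python sketch. It is not the check_jira_restapi plugin itself. The function names are hypothetical, and the only things taken as given are the 2 and 3 second thresholds from the service definition and the standard Nagios plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL).

```python
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_elapsed(seconds, warn=2.0, crit=3.0):
    """Map a response time onto a Nagios state using the 2s/3s thresholds."""
    if seconds > crit:
        return CRITICAL
    if seconds > warn:
        return WARNING
    return OK


def timed_check(call, warn=2.0, crit=3.0):
    """Time any callable (a stand-in for the REST request) and classify it."""
    start = time.monotonic()
    call()
    elapsed = time.monotonic() - start
    return classify_elapsed(elapsed, warn, crit), elapsed
```

A real plugin would pass the request for HL-1 as the callable and exit with the returned state so Nagios can pick it up.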

The last two services are system checks, covering the System Load and the JIRA home directory disk, respectively. I haven’t had a chance to dial in the Load check yet, so I might have it set too high, but I’m going to leave it for now. As for the Disk check, I like to have plenty of warning that I am approaching a full disk to give me time to resolve it, so these numbers are good.
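
The disk thresholds work the same way, just with percentages. The sketch below is only an illustration under one assumption: that 75% and 85% mean percent of the disk used. The function names are hypothetical and this is not the check_jira_disk command.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_percent(text):
    """Turn a threshold such as '75%' into the float 75.0."""
    return float(text.strip().rstrip("%"))


def classify_disk(used_bytes, total_bytes, warn="75%", crit="85%"):
    """Compare percent-used against the warning and critical thresholds."""
    percent_used = 100.0 * used_bytes / total_bytes
    if percent_used >= parse_percent(crit):
        return CRITICAL
    if percent_used >= parse_percent(warn):
        return WARNING
    return OK


def check_path(path, warn="75%", crit="85%"):
    """Check the filesystem that holds `path`, e.g. the JIRA home directory."""
    usage = shutil.disk_usage(path)
    return classify_disk(usage.used, usage.total, warn, crit)
```

The ten-point gap between warning and critical is what buys the time to clean up before the disk actually fills.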

The last step is to add these to the nagios.cfg file so that they get loaded into memory. However, this is as easy as adding the following lines into the cfg file.

# Definitions for JIRA Monitoring
# Commands:
cfg_file=/usr/local/nagios/etc/objects/atlassian_command.cfg

# JIRA Nodes:
cfg_file=/usr/local/nagios/etc/objects/jira.cfg

And that’s it! Restart Nagios and you will see your new host and service checks come up!

Nagios in action.

So I’ve had this configuration in place for about a day now, and it appears to be working. The API Time check did go off once, but I did restart the JIRA Server to adjust some specs on the VM, so I expected the delay. So I hope this helps you as you are setting up alerts for your JIRA system!

And that’s it for this week!

We did get a bit of bad news about Summit 2021 last week. Out of an abundance of caution, Atlassian decided to go ahead and make all in-person events of 2020 and Summit 2021 virtual events. However, they have almost a year to prepare for a virtual Summit – as opposed to the 28 days they had this year. So I am excited to see what ideas Atlassian has to make this a fantastic event!

Don’t forget our poll! I’m going to let it run another week!

Don’t forget you can check me out on Twitter! I’ll be posting news, events, and thoughts there, and would love to interact with everyone! If you found this article helpful or insightful, please leave a comment and let me know! A comment and like on this post in LinkedIn will also help spread the word and help others discover the blog! Also, if you like this content and would like it delivered directly to your inbox, sign up below!

But until next time, my name is Rodney, asking “Have you updated your JIRA Issues today?”

Monitoring JIRA for Fun and Health

So, dear readers, here’s the deal. Some weeks, when I sit down to write, I know exactly what I’m going to write about, and can get right to it. Other weeks, I’m sitting down, and I don’t have a clue. I can usually figure something out, but it’s very much a struggle. This week is VERY much the latter.

Compound that with the fact that I just lost most of my VMs due to a storage failure I had this very morning. Part of it was a mistake on my part. I have the home lab so that I can learn things I can’t learn on the job. And mistakes are a painful but powerful way to learn. Still….

This brings me back to a conversation I had with a colleague and fellow Atlassian Administrator at a company I used to work for. He had asked me what my thoughts were on implementing monitoring for JIRA. Well, I have touched on the subject before, but if I’m being honest, it isn’t my greatest work. Combine that with the fact that I suddenly need to rebuild EVERYTHING, and, well, why not start with my monitoring stack!

So, we are going to be setting up a number of systems. To gather system stats, that is to say CPU usage, Memory Usage, and Disk usage, we are going to be using Telegraf, which will be storing that data in an InfluxDB database. Then for JIRA stats we are going to use Prometheus. And to query and display this information, we will be using Grafana.

The Setup

So we are going to be setting up a new system that will live alongside our JIRA instance. We will call it Grafana, as that will be the front end we use to interact with the system.

On the back end it will be running both an InfluxDB Server and a Prometheus Server. Grafana will use both InfluxDB and Prometheus as data sources, and will use them to generate stats and graphs of all the relevant information.

Our system will be a CentOS 7 system (my favorite currently), and will have the following stats:

  • 2 vCPU
  • 4 GB RAM
  • 16 GB Root HDD for OS
  • 50 GB Secondary HDD for Services

This will give us the ability to scale up the capacity for services to store files without too much impact on the overall system, as well as monitor its size.

As per normal, I am going to write all commands out assuming you are root. If you are not, I’m also assuming you know what sudo is and how to use it, so I won’t insult you by holding your hand with that.

InfluxDB

Let’s get started with InfluxDB. The first thing we’ll need to do is add the yum repo from Influxdata onto the system. This will allow us to use yum to do the heavy lifting in the install of this service.

So let’s open /etc/yum.repos.d/influxdb.repo

vim /etc/yum.repos.d/influxdb.repo

And add the following to it:

[influxdb]
name = InfluxDB Repository - RHEL $releasever
baseurl = https://repos.influxdata.com/rhel/$releasever/$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key

Now we can install InfluxDB

yum install influxdb -y

And really, that’s it for the install. Kind of wish Atlassian did this kind of thing.

We’ll of course need to allow firewall access so Telegraf can get data into InfluxDB.

firewall-cmd --permanent --zone=public --add-port=8086/tcp
firewall-cmd --reload

And with that we’ll start and enable the service so that we can actually do the service setup.

systemctl start influxdb
systemctl enable influxdb

Now we need to set some credentials. As initially set up, the system isn’t really all that secure. So we are going to secure it by using curl to create an account for ourselves.

curl -XPOST "http://localhost:8086/query" --data-urlencode \
"q=CREATE USER username WITH PASSWORD 'strongpassword' WITH ALL PRIVILEGES"

I shouldn’t have to say this, but you should replace username with one you can remember and strongpassword with, well, a strong password.

Now we can use the command “influx” to get into InfluxDB and do any further set up we need.

influx -username 'username' -password 'strongpassword'

Now that we are in, we need to set up a database and user for our JIRA data to go into. As a rule of thumb, I like to have one DB per application and/or system I intend to monitor with InfluxDB.

CREATE DATABASE Jira
CREATE USER jira WITH PASSWORD 'strongpassword'
GRANT ALL ON Jira TO jira
CREATE RETENTION POLICY one_year ON Jira DURATION 365d REPLICATION 1 DEFAULT
SHOW RETENTION POLICIES ON Jira

And that’s it, InfluxDB is ready to go!

Grafana

Now that we have at least one datasource, we can get to setting up the Front End. Unfortunately, we’ll need information from JIRA in order to set up Prometheus (once we’ve set JIRA up to use the Prometheus Exporter), so that data source will need to wait.

Fortunately, Grafana can also be set up using a Yum repo. So let’s open up /etc/yum.repos.d/grafana.repo

vim /etc/yum.repos.d/grafana.repo

and add the following:

[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

Afterwards, we just run the yum install command:

sudo yum install grafana -y

Grafana defaults to port 3000, though options to change or proxy this are available. For now, we will need to open port 3000 on the firewall.

firewall-cmd --permanent --zone=public --add-port=3000/tcp
firewall-cmd --reload

Then we start and enable it:

sudo systemctl start grafana-server
sudo systemctl enable grafana-server

Go to port 3000 of the system on your web browser and you should see it up and running. We’ll hold off on setting up everything else on Grafana until we finish the system setup, though.

Telegraf

Telegraf is the tool we will use to get our data from JIRA’s underlying Linux system and into InfluxDB. This is actually part of the same YUM repo that InfluxDB is installed from, so we’ll now add that repo to the JIRA server, same as we did on the Grafana system.

vim /etc/yum.repos.d/influxdb.repo

And add the following to it:

[influxdb]
name = InfluxDB Repository - RHEL $releasever
baseurl = https://repos.influxdata.com/rhel/$releasever/$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key

And now that it has the YUM repo, we’ll install telegraf onto the JIRA Server.

yum install telegraf -y

Now that we have it installed, we can take a look at its configuration, which you can find in /etc/telegraf/telegraf.conf. I highly suggest you take a backup of this file first. Here is an example of a config file where I’ve filtered out all the comments and added back in everything essential.

[global_tags]
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  logtarget = "file"
  logfile = "/var/log/telegraf/telegraf.log"
  logfile_rotation_interval = "1d"
  logfile_rotation_max_size = "500MB"
  logfile_rotation_max_archives = 3
  hostname = "<JIRA's Hostname>"
  omit_hostname = false
[[outputs.influxdb]]
  urls = ["http://<grafana's url>:8086"]
  database = "Jira"
  username = "jira"
  password = "<password from InfluxDB JIRA Database setup>"
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]

And that should be it for config. There is of course more we can capture using various plugins, based on whatever we are interested in, but this will get us the bare minimum.

Because telegraf is pushing data to the InfluxDB server, we don’t need to open any firewall ports for this, which means we can start it, then monitor the logs to make sure it is sending the data over without any problems.

systemctl start telegraf
systemctl enable telegraf
tail -f /var/log/telegraf/telegraf.log

And assuming you don’t see any errors here, you are good to go! We will have the stats waiting for us when we finish the setup of Grafana. But first….

Prometheus Exporter

So telegraf is great for getting the Linux system stats, but that only gives us a partial picture. We can train it to capture JMX info, but that means we have to set up JMX, something I’m keen to avoid whenever possible. So what options have we got to capture details like JIRA usage, JAVA Heap performance, etc?

Ladies and gentlemen, the Prometheus Exporter!

That’s right, as of the time of this writing, this is yet another free app! This will set up a special page that Prometheus can go to and “scrape” the data from. This is what will take our monitoring from “okay” to “Woah”.

Because it is a free app, we can install it directly from the “Manage Apps” section of the JIRA Administration console.

Once you click install, click “Accept & Install” on the pop up, and it’s done! After a refresh, you should notice a new sidebar item called “Prometheus Exporter Settings”. Click that, then click “Generate” next to the token field.

Next we’ll need to open the “here” link in the “Exposed metrics are here” text in a new tab. Take special note of the URL used, as we’ll need this to set up Prometheus.

Prometheus

Now we’ll go back to our Grafana system to setup Prometheus. To find the download, we’ll go to the Prometheus Download Page, and find the latest Linux 64 bit version.

Try to avoid “Pre-release”

Copy that to your clipboard, then download it to your Grafana system.

wget https://github.com/prometheus/prometheus/releases/download/v2.15.2/prometheus-2.15.2.linux-amd64.tar.gz

Next we’ll need to unpack it and move it into its proper place.

tar -xzvf prometheus-2.15.2.linux-amd64.tar.gz
mv prometheus-2.15.2.linux-amd64 /archive/prometheus

Now if we go into the prometheus folder, we will see a normal assortment of files, but the one we are interested in is prometheus.yml. This is our config file and where we are interested in working. As always, take a backup of the original file, then open it with:

vim /archive/prometheus/prometheus.yml

Here we will be adding a new “job” to the bottom of the config. You can copy this config and modify it for your purposes. Note we are using the URL we got from the Prometheus Exporter. The first part of the URL (the FQDN, that is, everything after https:// and up to the first slash) goes under targets where indicated. The rest of the URL (the folder path) goes under metrics_path. And then your token goes where indicated so that you can secure these metrics.

global:
  scrape_interval:     15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
  - static_configs:
    - targets:
rule_files:
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'Jira'
    scheme: https
    metrics_path: '<everything after the slash>'
    params:
      token: ['<token from Prometheus exporter>']
    static_configs:
    - targets:
      - <first part of JIRA URL, everything before the first '/'>
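
If splitting the URL by hand feels error-prone, the same split can be sketched in a few lines of Python. This is only an illustration: the example URL in the usage note is hypothetical, and the sketch assumes the exporter passes the token as a `token` query parameter, which is what the `params` section of the config implies.

```python
from urllib.parse import urlsplit, parse_qs


def split_exporter_url(url):
    """Break an exporter URL into the pieces prometheus.yml asks for."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,                 # goes under scheme
        "target": parts.netloc,                 # goes under targets
        "metrics_path": parts.path,             # goes under metrics_path
        "token": query.get("token", [""])[0],   # goes under params -> token
    }
```

For a made-up URL such as https://jira.example.com/some/path?token=abc123, this returns jira.example.com as the target, /some/path as the metrics_path, and abc123 as the token.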

We’ll need to now open up the firewall port for Prometheus

firewall-cmd --permanent --zone=public --add-port=9090/tcp
firewall-cmd --reload

Now we can test Prometheus. From the prometheus folder, run the following command.

./prometheus --config.file=prometheus.yml

From here we can open a web browser, and point it to our Grafana server on port 9090. On the Menu, we can go to Status -> Targets and see that both the local monitoring and JIRA are online.
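
If you would rather check this from a script than from the browser, Prometheus also exposes the target list over its HTTP API at /api/v1/targets. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 and that each job has a single target. The function names are my own.

```python
import json
from urllib.request import urlopen


def summarize_targets(payload):
    """Return {job name: health} from a /api/v1/targets response body."""
    active = payload.get("data", {}).get("activeTargets", [])
    return {target["labels"]["job"]: target["health"] for target in active}


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from a running Prometheus (base_url is an assumption)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


# Usage, with Prometheus running:
#   for job, health in summarize_targets(fetch_targets()).items():
#       print(job, health)
```

Both the 'prometheus' and 'Jira' jobs should report a health of "up" once the scrape succeeds.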

Go ahead and stop prometheus for now by hitting “Ctrl + C”. We’ll need to set this up as a service so that we can rely on it coming up on its own should we ever have to restart the Grafana server.

Start by creating a unique user for this service. We’ll be using the options “--no-create-home” and “--shell /bin/false” to tell Linux this is an account that shouldn’t be allowed to log in to the server.

useradd --no-create-home --shell /bin/false prometheus

Now we’ll change the files to be owned by this new prometheus account. Note that the -R makes chown run recursively, meaning it will change it for every file underneath where we run it. Stop and make sure you are running it from the correct directory. If you run this command from the root directory, you will have a bad day (Trust me)!

chown -R prometheus:prometheus ./

And now we can create its service file.

vim /etc/systemd/system/prometheus.service

Inside the file we’ll place the following:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/archive/prometheus/prometheus \
    --config.file /archive/prometheus/prometheus.yml \
    --storage.tsdb.path /archive/prometheus/ \
    --web.console.templates=/archive/prometheus/consoles \
    --web.console.libraries=/archive/prometheus/console_libraries 

[Install]
WantedBy=multi-user.target

After you save this file, type the following commands to reload systemd, start the service, make sure it’s running, then enable it for launch on boot:

systemctl daemon-reload
systemctl start prometheus
systemctl status prometheus
systemctl enable prometheus

Now just double check that the service is in fact running, and you’re good to go!

Grafana, the Reckoning

Now that we have both our datasources up and gathering information, we need to start by creating a way to display it. On your web browser, go back to Grafana, port 3000. You should be greeted with the same login screen as before. To log in the first time, use ‘admin’ as both username and password.

You will be prompted immediately to change this password. Do so. No – really.

After you change your password, you should see the screen below. Click “Add data source”.

We’ll select InfluxDB from the list as our first Data Source.

For settings, we’ll enter only the following:

  • Name: JIRA
  • URL: http://localhost:8086
  • Database: Jira
  • User: jira
  • Password: Whatever you set the InfluxDB Jira password to be

Click “Save & Test” at the bottom and you should have the first one down. Now click “Back” so we can set up Prometheus.

On Prometheus, all we’ll need to do is set the URL to be “http://localhost:9090”. Enter that, then click “Save & Test”. And that’s both Data Sources done! Now we can move onto the Dashboard. On the right sidebar, click through to “Home”, then click “New Dashboard”.

And now you are ready to start visualizing Data. I’ve already covered some Dashboard tricks in my previous attempt at this topic. However, if it helps, here’s how I used Prometheus to set up a graph of the JVM Heap.

Some Notes

Now, there is some cleanup you can do here. You can map out the storage for Grafana and InfluxDB to go to your /archive drive, for example. However, I can’t be giving away *ALL* the secrets ;). I want to challenge you there to see if you can learn to do it yourself.

We do have a few scaling options here too. For one, we can split Influx, Prometheus, and Grafana onto their own systems. However, my experience has been that this isn’t usually necessary, and they can all live comfortably on one system.

And one final note. The Prometheus exporter, strictly speaking, isn’t JIRA Data Center compatible. It will run however. As best I can tell, it will give you the stats for each node where applicable, and the overall stats where that makes sense. It might be worth installing and setting up Prometheus to bypass the load balancer and do each node individually.

But seriously, that’s it?

Indeed it is! This one is probably one of my longer posts, so thank you for making it to the end. It’s been a great week hearing how the blog is helping people out in their work, so keep it up! I’ll do my part here to keep providing you content.

On that note, this post was a reader-requested topic. I’m always happy to take on a challenge from readers, so if you have something you’d like to hear about, let me know!

One thing that I’m working on is to try and make it easier for you to be notified about new blog posts. As such, I’ve included an email subscription form at the bottom of the blog. If you want to be notified automatically about new blog posts, enter your email and hit subscribe!

And don’t forget about the Atlassian Discord chat, which is thoroughly unofficial. Click here to join: https://discord.gg/mXuRsVu

But until next time, my name is Rodney, asking “Have you updated your JIRA issues today?”

How to test changes in JIRA

So, a bit of a backstory here. I was doing some experiments at work on running JIRA Data Center in Kubernetes using the official Atlassian containers when I noticed something odd. After loading the MySQL Connector and starting it all up, JIRA Setup kept telling me that the database wasn’t empty. I could see that it was, and per advice from a colleague, even double checked that the collations and char-sets were all correctly set.

Finally I isolated it down to the MySQL Connector. I had grabbed version 8.something, and Atlassian only supports version 5.1.48. And while this connector worked for JIRA 8.5.0, it apparently had some issues with JIRA 8.5.2 and 8.5.3.

This did get me thinking though. I went through the process of isolating the problem relatively quickly as I have had to do this fairly often in my career. But it isn’t the most intuitive thing to learn. So why not cover that this week!

Dev and Test

So, first thing: Friends don’t let Friends Test in Production. People are depending on that system being stable and there, and if you are mucking about in it constantly to “test” things, it will be anything but stable.

For all license tiers save the smallest, Atlassian also gives you an unlimited use Development License. And this is for both Apps and the main Applications. USE IT! If I.T. won’t give you another system, set up a VM on your desktop. If they won’t let you use that, bring in an old PC from home. There is no excuse for testing in production.

The most common setup I see is for a team to have two non-production instances of each platform: Test and Dev. Dev is your personal instance. This is where you can make changes to your heart’s content, bring it up and down, upgrade it, reset it, whatever, as much as you want. Break it? It won’t impact anything, and you can just refresh from Production. This is usually where I test the “I wonder what will happen if I do this?” questions.

Test, on the other hand, is your public non-production instance. You want to let a user test the functionality of a new App before purchasing it? Goes in Test. A user wants to add a new field? Put it in test and let them see what it looks like first. I usually like to refresh this from production on every JIRA Upgrade, but will do it sooner if we’ve made any big changes in production.

As a best practice, I also like to change the color scheme of JIRA for each instance, so you can identify which is which on sight. My usual color scheme is to have the top bar be orange for Test, and Red for Dev. A few other things I do:

  • Separate out each instance to a separate DB Server
  • Make sure that if a given non-production server tries to talk to Production, it’s rerouted to the appropriate non-production instance instead. I often do this using the /etc/hosts file.
  • DISABLE THE OUTGOING EMAIL SERVER

I definitely recommend you have both available. If you are only limited to one due to policy or budget, at least have a test instance. Your production instance will thank you.

But what about a non-production site for JIRA Cloud?

Okay, so I haven’t had to deal with this too often. BUT, you are also not the first person to ask, dear reader. Atlassian actually has a document outlining a few options you have for setting up non-production Atlassian Cloud instances.

Take a snapshot and/or backup before changing anything

Before trying to figure out a problem or making a change, give yourself a way to get back to a pre-test state. If your instance (DB and all) is on a single VM, take a snapshot of the VM before starting. If not, take a tarball of your install and home directories, and while those are running take a database dump from your DB. Heck, if you can take both a file backup and a VM snapshot, do both!

Before I have your ESXi admins after me with torches and pitchforks, I should note something here. The way I understand it, a snapshot sets up a way for ESXi to journal all the changes made to a system within a file, and to revert those changes. That means the longer a snapshot sits on a system, the larger it becomes. So always go back and remove a snapshot after you finish your testing. At the very least, it keeps things from getting messy.

This doesn’t only extend to a whole system. If you are changing a single file, make a copy of it first. That way you can go back to the file before you made any changes should the change prove catastrophic. The goal here is no matter what you are doing, always give yourself a path back to before you did it.
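
For the single-file case, the habit is simple enough to script. Here is a minimal Python sketch of “copy it first, restore it if things go wrong”. The function names and the timestamped .bak naming are my own choices for illustration, not anything JIRA requires.

```python
import shutil
import time
from pathlib import Path


def backup_file(path):
    """Copy `path` to a timestamped sibling, e.g. setenv.sh.20200101-120000.bak."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also tries to keep timestamps and permissions
    return target


def restore_file(backup, original):
    """Put the saved copy back over the changed file."""
    shutil.copy2(backup, original)
```

Call backup_file on something like setenv.sh or server.xml before editing it, hold on to the returned path, and restore_file gives you the path back to where you started.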

Isolate and make only one change at a time

This is probably the most challenging part of testing. For each run you do, you need to make only one change at a time. But what do I mean by change? Do I mean you should upgrade by changing one file at a time? Of course not!

The purpose of this is to isolate something enough to know what fixes or breaks it. So if you are doing a full upgrade, start by upgrading JIRA. Then check to see that it still runs as expected. Then make your changes to setenv.sh. Check again. Then server.xml. Then check again. Then upgrade the apps. Check again.

In the example I gave in the intro, here are the changes I made on each run when I found there was a problem with the DB:

  1. Drop and Re-Setup the Database using a GUI Tool
  2. Drop and Re-Setup the Database from command line.
  3. Try a MySQL 5.7 DB instead of a MySQL 5.6 DB
  4. Try JIRA 8.5.2 instead of JIRA 8.5.3
  5. Try JIRA 8.5.2 with MySQL 5.6 instead of MySQL 5.7
  6. Try JIRA 8.5.2, MySQL 5.6, with a different MySQL Connector – FIXED!

So you can see how at each step I only changed one item. Yeah, it took me six runs to find a solution, but I now know it was for sure the MySQL Connector.

Yes, this adds significant overhead of bringing down and restarting JIRA each run. BUT – if and when something does break, you will know it was only the last thing you did that broke it. Likewise if something fixes it, you also know it was the last thing you did that actually fixed it.
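
The discipline itself fits in a few lines. The Python sketch below is purely illustrative: the function name is hypothetical, each “change” is whatever callable applies it, and `is_healthy` stands in for however you verify that JIRA still runs as expected.

```python
def apply_one_at_a_time(changes, is_healthy):
    """Apply changes in order, checking after each one.

    `changes` is a list of (description, apply_function) pairs and
    `is_healthy` is a callable returning True when JIRA behaves as expected.
    Returns the description of the first change that broke things, or None.
    """
    for description, apply_change in changes:
        apply_change()
        if not is_healthy():
            # Only this change happened since the last good check.
            return description
    return None
```

Because a check runs after every single change, the first failure always points at exactly one culprit, which is the whole reason to accept the extra restarts.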

Keep track of the changes you’ve made to each instance since the last Refresh

This is a bit of practical advice. Somewhere (Confluence), you need to have a document that shows in what ways each non-production instance has been changed since the last time you refreshed it from production.

Add a field? Add that to the doc. User tested an App? Document it. The idea is to have a journal to show what you’ve done, so that if you need to refresh it while a user is still testing something, you know where to find those changes to restore them.

And I get it: documentation is evil. Why spend time writing about what you are doing when you could be doing more? This is something I struggle with too! But this is a case where an ounce of prevention is worth a pound of cure.

Practice good Change Management on Production!

So, you’ve tested something in dev, put it before users in test, and now you are ready to put it on Production. Enough delays, right?

Slow down there, friend! Production is sacred, you shouldn’t just run in there with every change.

Change control/change management is a complex subject – and honestly – hasn’t always been my strong suit. But it’s meant to keep you as an admin from your worst impulses. Annoying at times, I’ll grant you, but still a good thing overall.

The best way I’ve found is to set up a board made up of your Power Users, other Admins, and various other stakeholders as needed. Have them meet every so often (every other week seems to be the sweet spot here). If you have the budget for it, make it a lunch meeting and provide food. You are much more likely to get people to show up if they get to eat.

Then go over every change you want to make and gather feedback. They might spot a problem with a use case you hadn’t considered. But be sure to get a vote on each change before the meeting is over. Trust me, if you don’t structure and control the meeting, they will talk each point to death.

As a note here, there should be an exception to putting changes through the board during an emergency. If production is down, your first priority should be getting it back online as soon as possible. Then you can have time to retroactively put it through the board. For all non-emergency changes though, the change board is the valve to what you want to put into production.

Strictly this is not part of testing, but considering all, I didn’t want you to run off thinking testing was the last step. As with everything JIRA, it all works best when it’s a process.

And that is it!

You are ready to do some testing in JIRA. With the advice above, you are ready to maintain your JIRA Instances responsibly – or at the very least give yourself a way out of any sticky situations you find yourself in.

Don’t forget to join us on Discord! https://discord.gg/mXuRsVu

Until next time, this is Rodney, asking “Have you updated your JIRA issues today?”

Leaving the Breadcrumbs: how to adjust Logging.

So, we’ve discussed how to read your logs, and what impact changing them will have on your disk drives. So how do we go about changing logging levels and tuning logging? That is what we are going to discuss today. So without much preamble, let’s get into this.

Should I adjust my logging levels?

If I’m being honest, No. Well, that was a quick blog post, I’ll see you next week!

Actually, let me explain. The default levels are sufficient for 99% of admins out there. It’s a good balance of what you need to know to diagnose issues without filling up your disks. Typically, I recommend people only adjust these levels if asked to do so by Atlassian or App Vendor support.

However, it’s still important that you know how to do so when asked. And with a bit of homework you might even be able to adjust them and find your answers before you have to get support involved. My advice though is to do so carefully if you choose to.

I just need to note something here, and it’s something I forgot to mention last week. Debug level logs can sometimes capture passwords in the log file. I should not have to tell you why that could be bad. However, this is just one more reason why you should really think twice about capturing any logs on the Debug level.

Temporary vs Permanent

So, the first question you need to answer is do you need this change to be permanent or temporary. Atlassian does give us two ways to change logging – one via the Admin console in the web UI, which will only last until your next JIRA restart. As such I label this the “temporary” option. The second option is by changing some files within the JIRA install directory, which will persist across application restarts, and as such I label a “permanent” change.

As I tend to recommend you stick to defaults, without any deeper context, temporary is my go-to answer. However, as with all things Atlassian, context is everything. Let’s say you’ve had some problems with a new app you’ve installed onto your instance, and it’s causing the Application to restart regularly. This is a time you’d want to look at a more permanent solution, as you’re not going to capture those detailed logs while JIRA loads up otherwise.

If you are working with Support, they will almost always tell you to go the temporary route. However, no matter what I say, my advice is to follow their advice. Seriously, they are scary good at what they do, and they are not going to steer you wrong.

Making a Temporary change to logging.

To change something temporarily within the logs, we’ll need to go to the Logging and Profiling section of the JIRA Administration Console. Once there, you’ll find it under System -> Logging and Profiling.

Note: You will need System Administrator global permissions to be able to see this section.

As a pro-tip, you can also find any admin page from anywhere in JIRA by hitting the period key on your keyboard, then typing the page you are interested in. For this to work, your cursor must be outside a text input box of any kind.

Once here, you’ll see different sections:

  • Mark Logs
  • HTTP Access Logging
  • SQL Logging
  • Profiling
  • Mail
  • Default Loggers

Today we’re going to be primarily interested in the Mark Logs section and the Default Loggers section. Everything else is available to turn on, but remember that these logs will only run until the next time you restart JIRA.

Mark Logs

The first section here is where you can add a comment into your logs. You can also roll over your logs. A roll over is where JIRA will increment the end number on all existing rolled over logs, copy the current log file to atlassian-jira.log.1, then start a new atlassian-jira.log file.
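To make the rollover a bit more concrete, here is a minimal Python sketch of the renaming scheme described above. This is only an illustration of the idea, not how JIRA or log4j actually implements it. The function name `roll_over` and the `max_backups` limit are my own assumptions, and I move the current file rather than copy it.

```python
import os

def roll_over(log_dir, base_name="atlassian-jira.log", max_backups=5):
    """Mimic a log rollover: shift .1 -> .2 and so on, then start a fresh log."""
    # Work from the oldest backup down so nothing is overwritten too early.
    # The backup numbered max_backups (if any) is replaced and lost.
    for n in range(max_backups - 1, 0, -1):
        src = os.path.join(log_dir, f"{base_name}.{n}")
        dst = os.path.join(log_dir, f"{base_name}.{n + 1}")
        if os.path.exists(src):
            os.replace(src, dst)

    current = os.path.join(log_dir, base_name)
    if os.path.exists(current):
        os.replace(current, os.path.join(log_dir, f"{base_name}.1"))

    # Start a new, empty current log file.
    open(current, "w").close()
```

Please only try this against a scratch directory. On a real instance, let JIRA do the rolling for you from the Mark Logs section.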

Both of these techniques are great for marking a section of the logs before doing some operation or test. In fact, I made great use of this functionality to do my testing last week on log sizes. It can turn a ten-minute manual search for where you started doing something into an “It’s right here” instant search.
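If you would rather script the hunt for your marker, something like the following works. It is a rough sketch: `find_marker` is a name I made up, and the sample lines are hypothetical stand-ins, since real JIRA log lines look different. All it assumes is that the comment text you typed into Mark Logs appears somewhere in a log line.

```python
def find_marker(log_lines, marker_text):
    """Return the 1-based number of the first line containing marker_text, or None."""
    for number, line in enumerate(log_lines, start=1):
        if marker_text in line:
            return number
    return None

# Hypothetical log excerpt, for illustration only.
sample = [
    "startup noise",
    "more noise",
    "BEGIN-DIRECTORY-SYNC-TEST",
    "the lines you actually care about",
]
print(find_marker(sample, "BEGIN-DIRECTORY-SYNC-TEST"))  # prints 3
```

Pick a marker comment that is unlikely to show up anywhere else in the logs, and the search becomes trivial.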

Default Loggers

The next section of interest is the Default Loggers section. This has the familiar logging levels shown, on a per-package basis. A package is a named group of related classes within the Java code, so each of these will adjust the logging for a specific aspect of JIRA.

This unfortunately is a very long list, and I don’t have time to write up what each of them does (assuming I even know that!). However, most of these come down to logic, and with a bit of searching you can find something.

That is not to say I won’t offer any guidance. If I’m being honest, it’s amazing how many times JIRA problems turn out to be App problems. So I’m going to quickly discuss how to add and adjust logs for an App here.

Our first step is to go to our App Manager, then expand the information for the App we are interested in.

Really should update that…

Look for the App Key, highlighted above, and copy it. Then head over to Logging and Profiling, and under Default Loggers, click “Configure logging level for another package”. Paste the App Key into the Package name section, then select your logging level. Click Add and boom, you are now logging for that particular App.

Permanent Logging Changes

To change logging levels permanently, you will need to find the file <jira-install-dir>/atlassian-jira/WEB-INF/classes/log4j.properties.

Here we can change a number of things, including log levels of various packages, where the logs are, and so on.

Super Important Note: Before making any changes to this file, make a copy and save it in a safe place. Your future self will thank you.
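If you like to script that safety copy, here is one way to do it in Python. Treat it as a sketch: the `backup_file` name and the timestamped `.bak` naming are my own conventions, not anything Atlassian prescribes.

```python
import shutil
import time
from pathlib import Path

def backup_file(path):
    """Copy path to a timestamped sibling file and return the backup's Path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    # copy2 copies the contents and tries to keep the file's metadata too.
    shutil.copy2(source, backup)
    return backup
```

You would point it at your own copy of log4j.properties. Keeping the timestamp in the name means repeated backups made at different times don’t overwrite each other.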

To change the logging level, you first need to know the package you are adjusting for. You will either have gotten this from Support or the App Manager section of JIRA. Then we’ll look for log4j.logger.<package-name>. You should see something like this:

log4j.logger.com.atlassian = WARN, console, filelog
log4j.additivity.com.atlassian = false

To adjust the log level, change “WARN” to any of the other logging levels, then save. After you restart JIRA you should see the logging level change reflected. This goes for any change to this properties file: you will need to restart JIRA for it to take effect.
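For the script-minded, the same edit can be sketched in Python. This is a hedged example, not an official tool: `set_log_level` and `VALID_LEVELS` are names I invented, and it assumes the logger entry follows the `log4j.logger.<package> = LEVEL, appenders` shape shown above. It works on the text of the file, so you decide when (and whether) to write the result back.

```python
import re

# Level names as used by log4j 1.x.
VALID_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "OFF"}

def set_log_level(properties_text, package, level):
    """Return properties_text with the level for log4j.logger.<package> replaced."""
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown log level: {level}")
    # Match "log4j.logger.<package> = LEVEL" and leave any appenders untouched.
    pattern = re.compile(
        r"^(log4j\.logger\." + re.escape(package) + r"\s*=\s*)\w+",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + level, properties_text)
    if count == 0:
        raise KeyError(f"no logger entry found for {package}")
    return new_text

sample = (
    "log4j.logger.com.atlassian = WARN, console, filelog\n"
    "log4j.additivity.com.atlassian = false\n"
)
print(set_log_level(sample, "com.atlassian", "debug"))
```

The first line of the output becomes `log4j.logger.com.atlassian = DEBUG, console, filelog`, and the additivity line is left alone. You still need to restart JIRA afterwards.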

And that’s logging, done.

So I’ve actually had a lot of fun going back over this subject matter. It was also the first reader-requested topic, so that is amazing in and of itself. So what do you guys think I should cover next? Leave a comment here or on LinkedIn and if it’s good, I’ll cover it. So until next time, this is Rodney, asking “Have you updated your JIRA Issues today?”

Pile of Breadcrumbs, how logging levels impact JIRA’s logs

Well, seems we are back on track. Last time we looked at logging, we went over how to decode the information that was in the logs. During that piece, I made the following claim:

If you set everything to Debug before you leave on Friday, by the time you are sitting down for dinner on Saturday, you’re going to be paged for a full disk.

Following the Breadcrumbs: Decoding JIRA’s Logs, 30 Oct, 2019

Well, this got me thinking… exactly how much does JIRA’s logging level impact log size? To figure this out, we need a bit of SCIENCE!

Ze Experiment!

So here’s what I’m thinking. We take a normal JIRA instance, and we do a set number of tasks in there, roll over the logs, then see what size they are. Rinse and Repeat per log level.

So the list of tasks I have for each iteration is:

  • Roll Over the Logs
  • Logout
  • Log in
  • Create Issue
  • Comment on Issue
  • Close Issue
  • Search for closed issue
  • Log into the Admin Console
  • Run Directory Sync
  • Roll Logs

Between the last Roll over and the first one of the next iteration is when I’ll capture log size and adjust the logging levels.
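Capturing the sizes by hand gets old quickly, so here is a small Python sketch for that step. The `log_sizes` name is mine, and `log_dir` is whatever directory your instance writes atlassian-jira.log to. It simply reports the size in bytes of the current log and every rolled-over copy; working out which files belong to which run is still up to you.

```python
from pathlib import Path

def log_sizes(log_dir, base_name="atlassian-jira.log"):
    """Map each log file name (current and rolled-over copies) to its size in bytes."""
    sizes = {}
    for path in sorted(Path(log_dir).glob(base_name + "*")):
        if path.is_file():
            sizes[path.name] = path.stat().st_size
    return sizes

def total_log_bytes(log_dir, base_name="atlassian-jira.log"):
    """Total size in bytes across the current log and its rolled-over copies."""
    return sum(log_sizes(log_dir, base_name).values())
```

Running `log_sizes` before and after each iteration and comparing the two results gives you the numbers for that log level.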

Now, to get as close to an apples-to-apples comparison as possible, I’ll need to limit the background tasks as much as I can. The biggest one I can think of is the automated directory sync, which will need to be disabled for the test. However, as this is a regular activity within JIRA, I’ll be including a manual directory sync to capture that into the data set.

I’ll also need a control to measure what JIRA does by default, but that will be my first run. To make sure I can return to defaults later, I’ll be adjusting the logging levels in Administration -> System -> Logging and Profiling. Changes to the logging level here do not persist over a restart, so this should be ideal. So, without further ado, see you on the other side.

The Results

Well, that was an adventure. I ended up taking a backup of the log4j.properties file and changing it. Turns out changing more than 110 settings by hand, one at a time, is not very time-efficient. Changing the log4j.properties file dropped the number that I had to change manually to ~30.

I tried to be as consistent as possible with each run. That means I had the same entries for each field on the issue I created, same comment, same click path, etc. I even goofed up my search on the control run (typed issueky instead of issuekey), and I repeated that mistake for each run afterwards.

Another note I want to make is that there was one setting I could not change. Turns out that if the com.atlassian.jira.util.log.LogMarker object is anything but Info, JIRA will crash when you go to roll over the logs. Oops!

Conclusions

I think even considering all that, my assertion still stands. This was one person doing somewhat normal JIRA tasks. Even so, the Debug log was still almost 1000x the size of the Info log. Fun fact: the logs rolled over automatically four times. Now multiply that by how many people are using your instance, and how many times they will be doing these kinds of operations, over and over again. Even if they aren’t, there are automated processes like the Directory syncs that will take up log space. It will definitely add up fast.
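To put that “multiply it out” step into something you can play with, here is a back-of-the-envelope sketch in Python. Every number in the example is hypothetical and made up purely to show the arithmetic; none of them are measurements from my test. The function name `days_until_full` is also my own. Plug in figures from your own instance.

```python
def days_until_full(free_bytes, bytes_per_session, users, sessions_per_user_per_day):
    """Estimate how many days of logging it takes to fill the free disk space."""
    daily_bytes = bytes_per_session * users * sessions_per_user_per_day
    if daily_bytes <= 0:
        raise ValueError("daily log volume must be positive")
    return free_bytes / daily_bytes

# Hypothetical inputs: 50 GiB free, 40 MiB of log per user session,
# 200 users, 5 sessions per user per day.
estimate = days_until_full(
    free_bytes=50 * 1024**3,
    bytes_per_session=40 * 1024**2,
    users=200,
    sessions_per_user_per_day=5,
)
print(round(estimate, 2))  # prints 1.28
```

This ignores background jobs like directory syncs, which only make the real number worse.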

However, I think the bigger consideration here is to never take anything at face value. Yeah, I’m an expert, but even I was just parroting something I had been told about logging. Now I have the experiment and data to back up my assertion. Don’t be afraid to put things to a test. You might discover where some advice isn’t right for your environment. Or you might find out why things were said, and become that much more knowledgeable. The point is to always be learning.

So until next week, this is Rodney, asking “Have you updated your JIRA issues today?”