Integrating your Atlassian Cloud with Azure AD

Well, today, it seems we are going to do something I admittedly rarely do on the blog. That’s right; today, we are going to admit that JIRA Cloud exists!  

It’s not that I have anything against JIRA Cloud. My specialties tend to lie around making sure the underlying JIRA system runs as smoothly as possible, which is hard to do when you don’t own the underlying system. However, there is still plenty of overlap between JIRA Server/DC and JIRA Cloud, so it’s not like I’m unqualified to speak on it!

So it’s no secret at work that I maintain a whole collection of personal test systems. I do this to replicate and test just about anything I want without waiting for permission. The environments include (but are not limited to):

  1. vCenter Environment for VMs
  2. More Raspberry Pis than I rightly know what to do with
  3. AWS Account
  4. Azure Account
  5. Cloud Environments of Confluence, Bitbucket, JIRA Software, and JIRA Service Desk
  6. Server Environments of Confluence, Bitbucket, JIRA Software, and JIRA Service Desk
  7. Several VPS online, including one running (wait for it…) Confluence.
This is RACK01. As in, “Yes, there is also a RACK02”. I…I might have a problem.

So, when my manager wanted some help looking into some oddness he saw in JIRA Cloud using Azure AD, he knew who had the tools to recreate and test that setup.

However, I didn’t know how to set up the integration when I started. So I had to learn that. And since I had to learn, I might as well help you learn too!  

Pre-reqs

To pull this off, you will need a few things first.

  • An Azure AD subscription. If you don’t have a subscription, and just want to do some testing, you can get a one-month free trial here.
  • Atlassian Cloud single sign-on (SSO) enabled subscription.
  • To enable Security Assertion Markup Language (SAML) single sign-on for Atlassian Cloud products, you need to set up Atlassian Access. Learn more about Atlassian Access.
  • A Claimed Domain with Atlassian. To do this, you will need to be able to modify the DNS records for your domain.
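
For reference, the DNS side of a domain claim usually comes down to a single TXT record whose value Atlassian gives you (something along the lines of atlassian-domain-verification=<token>). A minimal sketch of confirming the record has propagated, with a placeholder domain:

# Confirm the verification TXT record is visible (example.com is a placeholder)
dig +short TXT example.com
# The output should include the atlassian-domain-verification value Atlassian gave you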

Also, we cannot forget the documentation. This actually was from Microsoft, and not Atlassian! Shocking, I know. But it was on point and guided me through most of the process.

Setting up Single Sign-On (SSO)

Single Sign-On, or SSO, is a mechanism that does what it says on the tin. If you log in to any application participating in the SSO environment, you will not be required to re-enter your password to sign into any other participating app. So if both your JIRA and Confluence are a part of the same SSO environment, you can start working in JIRA, then move over to Confluence without having to pause to authenticate again.

  1. To get started, go to your Azure AD Directory, then click “Enterprise Applications” in the sidebar (underscored in red). This page is where you will set up the integration with Atlassian Cloud.
  2. Now that you are on the Enterprise Applications screen, click “New Application.”
  3. In the search bar shown, type “Atlassian Cloud”. Doing this will bring the integration up in the search results. Once it appears, click on it.
  4. Clicking the search result will cause the following menu to pop up on the right-hand side. You won’t need to modify anything here, so you can click “Add” at the bottom of this menu.
  5. We can safely skip “1. Assign users and groups” for now. Proceed by clicking “2. Setup Single sign-on.”
  6. On the next screen that appears, you are presented with three choices. Select the second option, “SAML.”
  7. Next, you will get a pop-up asking about saving. For now, click “No, I’ll save later.”
  8. You can save Section 1 on the next screen for later – as you will need information from Atlassian to complete this section. Instead, move on to Section 2 by clicking its “Pencil” icon.
  9. Here, we’ll only need to update one attribute. By default, Azure AD wants to send the user’s Principal Name to Atlassian Cloud. However, Atlassian wants the email address in this field. To change it, click “Unique User Identifier (Name ID).”
  10. Doing so will cause the following form to appear. Change “user.userprincipalname” to “user.mail” under Source attribute, then click “Save.”
  11. On the Navbar, click “SAML-based Sign-on” to return to the previous section.
  12. With the Attributes & Claims ready, we can start collecting information Atlassian will need. To begin with, download the Base64 Certificate in Section 3 to your local system.
  13. The next three pieces of data we will need are in Section 5. Copy the three URLs highlighted below to a notepad you can reference later. To find them, you will need to expand the “Configuration URLs” dropdown menu.
  14. Now we can switch over to Atlassian and start the setup there. Under your https://admin.atlassian.com admin page, select Security → SAML single sign-on.
  15. On the page shown below, click “Add SAML configuration.”
  16. Now we can start entering the information we got from Azure AD. Be sure to pay attention to how I have it mapped below, as Atlassian and Azure have different names for each field.
    • Enter the Login URL from Azure into the Identity provider SSO URL field
    • Enter the Azure AD Identifier from Azure into the Identity provider Entity ID field
  17. Now open the Certificate you downloaded in Step 12 in a text editor of your choice. Copy the contents into the Public x509 certificate field, then click “Save.”
  18. Now we will need to give Azure some information on your Atlassian Cloud setup. To do so, copy the “SP Entity ID” and “SP Assertion Consumer Service URL” fields from the next page.
  19. Remember in Step 8, when I had you skip Section 1 of Azure’s SSO Configuration? Now is when we will go back and fill it in by clicking the “Pencil” icon.
  20. Here we’ll copy the two URLs we noted in Step 18 into the two highlighted fields. Be sure to pay attention below, as again, Azure and Atlassian disagree on what to call these fields.
    • The SP Entity ID field from Atlassian goes into the Identifier (Entity ID) field in Azure
    • The SP Assertion Consumer Service URL field from Atlassian goes into the Reply URL (Assertion Consumer Service URL) field in Azure
    • Be sure to click the “Default” checkbox next to both, then click “Save”
  21. You should get a pop-up asking if you want to test single sign-on. Click “Yes”. This will open the following screen. If your user is already provisioned in Atlassian Cloud, click “Sign in as current user”.
  22. Congratulations, SAML SSO is now set up!

Setting up User Provisioning

So, we have SSO setup. Great!

As things stand now, you still have to go and manually populate every new user in your Atlassian environment. Not Great.

To resolve this, we’ll next set up User Provisioning, which also does what it says. This process will automatically set up new users in your Atlassian Cloud system as you add them in AD. Which, once again, will be Great.

  1. Go back to the Atlassian Cloud Integration page in Azure. This is the page from Step 5 of the SSO setup above. Once there, click “Part 3. Provision User Accounts.”
  2. On the next screen, we will select “Automatic” under Provisioning Mode:
  3. Next, we’ll need to set up some things under your Atlassian Access screen (https://admin.atlassian.com). To get started here, click “Back to organization” → Directory → User Provisioning.
  4. Now we will click “Create a directory” to get started here.
  5. Enter a name for your directory. To keep it descriptive, I like to copy the name from the Azure Directory. After we enter the name, click “Create”:
  6. With this created, Atlassian presents us with two pieces of information that we’ll need to give Azure. Copy both the URL and the API key.
  7. Back within Azure, we will enter both of these into the Admin Credentials section. Again, be careful here, as Atlassian and Azure disagree on what to call them.
    • The Directory base URL from Atlassian will go into the Tenant URL field in Azure
    • The API key from Atlassian will go into the Secret Token field in Azure
    • Be sure to test the connection after you enter both (if the test misbehaves, see the manual check after this list)
    • OPTIONAL: You can also enter a Notification Email to get failure notices.
  8. On the next page, Mappings, you can use the defaults as-is. Just click “Next.”
  9. Under Settings, set “Provisioning Status” to “On,” then set Scope to “Sync Only Assigned users and Groups.”
  10. Click “Save,” and you are done!
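
If the built-in connection test misbehaves, you can sanity-check the directory base URL and API key by hand. Atlassian’s user provisioning API is SCIM-based, so this sketch assumes the standard SCIM ServiceProviderConfig endpoint is available; the URL and key below are placeholders for the values Atlassian showed you:

# Ask the provisioning directory to describe itself (placeholder URL and key)
curl -s -H "Authorization: Bearer <API key from Atlassian>" \
  "<Directory base URL>/ServiceProviderConfig"
# A JSON document back means the URL and key are good; a 401 means the key is wrong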

Azure AD will now sync your selected users to Atlassian automatically! But which users will Azure sync? That is the focus of our next section!

Adding Users and Groups to sync to Atlassian Cloud

So with our setup right now, we have Azure syncing over only selected users to Atlassian. We set it up like this because if you sync everyone and have a large AD environment, you can quickly find yourself out of licenses on JIRA. So let us explore how we tell Azure which users it needs to set up in Atlassian Cloud.

  1. Back on the Atlassian Cloud Overview Page (again, from Step 5 of the SSO Setup), click “Users and Groups” from the sidebar.
  2. On this screen, click “+ Add User” at the top of the screen.
  3. Click “Users” then select the Users that Azure should sync with Atlassian Cloud. Repeat for Groups that you would like to also sync over to Atlassian Cloud.
    Note: As I did my testing on Azure’s free tier, I didn’t have groups available to get a screenshot of. Sorry!
  4. Select Role then click Assign. Congratulations! These users will now be populated into Atlassian Cloud during the next sync operation!

And that’s it!

You now have your Atlassian Cloud environment set up and ready to use Azure for Authentication! If you are already leveraging Azure AD to manage your users, it is just one less headache to worry over.

Job Seeker Profile!

So, it does happen where someone searching for a job will contact me to ask if I know of any open positions. Unfortunately, I am not always able to help them in that regard. However, given the uncertain times we live in, I want to do something. So I’ll feature them here.

That is the case today with Siva Kumar Veerla from Hyderabad, India. He has recently been thrown into the job market due to the COVID-19 Pandemic. From his CV, he is a solid Atlassian Administrator who has led several projects, including upgrades and system installs. He is currently looking for opportunities in India or Europe. If you think he might be a good fit for you, please feel free to contact him on LinkedIn or through the information on his CV.

And Other exciting things!

Let me just say…Wow. This month has been amazing! For starters, look at this.

Yes, that is a new record month for the blog! Thank you for continuing to read, comment, like, and share the blog on the various Social Media platforms.

I’d also like to thank Predrag Stojanovic especially, who pointed out an Atlassian Group on Facebook. And well, that group loved last week’s blog post! So, I’ve gone ahead and set up a Facebook page for thejiraguy.com blog! Like Twitter, like this page to get the latest posts from the blog and random Atlassian news I find interesting! You can also subscribe below to get new posts delivered directly to your inbox!

Also, I will be giving a presentation tomorrow on Monitoring your Atlassian Applications using Nagios! If you are in the Atlanta, GA area, tune in Thursday! If you are not, I am trying to refine this presentation to submit to Atlassian for Summit. So, with a bit of luck, you’ll be hearing it from me next April!

But until next time, this is Rodney, asking “Have you updated your JIRA Issues today?”

Alerting on JIRA Problems using Nagios

So I ran into an interesting situation this past Monday. Apparently my Primary DNS had been down for at least a week. I went to go look at my network monitoring tool (LibreNMS) – and THAT was down too – for what I can guess is at least two weeks! Granted, I haven’t been doing as much on my Homelab since early March when I went into the hospital, but this was still not a good state of affairs.

So I decided to stand up a Nagios instance to monitor and alert when I have critical systems down. After getting it stood up, it didn’t take me long to start thinking about how I could use this with JIRA, which is now the topic we are going to cover today!

A bit of history

As you know, when I started my Atlassian journey, I was in charge of more than just JIRA. Nagios was one of the boxes I inherited as well, so I’m somewhat familiar with the tool already and how to configure it. I’ve had to modify things during that time, but never do a full setup. However, I knew I wanted to do more than monitor whether JIRA was listening to web traffic. So as part of the whole installation, I decided to dive in and see what she can do.

How to select what to Alert on.

Selecting what I want to be alerted for has always been a balancing act for me. You don’t want to have so many emails that they become worthless, but you don’t want to have so few that you won’t be alerted to a real problem.  

The goal of alerting is to clue you into problems so you can be proactive. Fix back end problems before they become a user ticket. So I always try to take the approach “What does a user care about?”

They care that the system is up and accessible, so I always monitor the service ports, including my access port. So that’s three.

A user also cares that their integrations work. If your integrations depend on SSL, and your clock drifts too far out of alignment, those integrations can fail – so I want to check the system is in sync with the NTP Server.

A feature that users love is the ability to attach files to issues. This feature will eventually chew up your disk space, so I’ll also want to monitor the disk JIRA’s home directory lives on. 

Considering I’m using a proxy, I’ll want to be sure the JVM itself is up, so I’ll need to look at that. I’ll also want to be sure that JIRA is performing at its best, and isn’t taking too long to respond, so I’ll want an alert for that as well.

Do you see what I’m doing? I’m looking at what can go wrong with JIRA when I’m not looking and setting up alerts for those. The idea here is that I care about what my users care about, so I want Nagios to tell me what is wrong before my users get a chance to.

So…configurations.  

Now comes the fun part. Nagios’ configuration files are a bit much to take in at first. However, I will be isolating the Atlassian-specific configurations to make things a bit easier on all of us. First, let’s start with some new commands I had to add.

###############################################################################
# atlassian_commands.cfg
#
#
# NOTES: This config file provides you with some commands tailored to monitoring
#        JIRA nodes from Nagios
# AUTHOR: Rodney Nissen <rnissen@thejiraguy.com>
#
###############################################################################


define command {
    command_name    check_jira_status
    command_line	$USER1$/check_http -S -H $HOSTADDRESS$ -u /status -s '{"state":"RUNNING"}'
	}
	
define command {
    command_name	check_jira_restapi
	command_line	$USER1$/check_http -S -H $HOSTADDRESS$ -u /rest/api/latest/issue/$ARG3$ -s "$ARG3$" -k 'Authorization: Basic $ARG4$' -w $ARG1$ -c $ARG2$
	}
    
define command {
    command_name    check_jira_disk
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -l nagios -C "/usr/lib64/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG1$"
    }
    
define command {
    command_name    check_jira_load
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$" -l nagios
    }

The first two commands here are VERY tailored to JIRA. The first one checks that the JVM is running, all with a handy HTTP request. If you go to your JIRA instance and go to the /status directory, the JVM will respond with a simple JSON telling you the state of the node. You use this feature in JIRA Data Center, so your load balancer can determine which nodes are up and ready for traffic. Buuut…it’s on JIRA Server too, and we can use it for active monitoring. So I did. If JIRA returns anything other than {“state”:”RUNNING”}, the check will fail and you will get an alert.
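
If you want to see exactly what Nagios is looking at, you can hit the endpoint yourself (the hostname below is a placeholder for your own base URL):

# Manual version of the check_jira_status command
curl -s https://jira.example.com/status
# A healthy node answers with:
# {"state":"RUNNING"}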

The second is a check on the REST API. This one will exercise your JIRA instance to make sure it’s working without too much load time for users. The idea here is to search for a known issue key and see if it returns valid information within a reasonable time. $ARG1$ is how long JIRA has before Nagios will issue a warning that it’s too slow (in seconds), and $ARG2$ is how long JIRA has before Nagios considers it a critical problem. $ARG3$ is your known good Issuekey. $ARG4$ is a set of credentials for JIRA encoded in Base64. If you are not comfortable just leaving your actual credentials encoded as such, I’d suggest you check out the API Token Authentication App for JIRA. Using the App will allow you to use a token for authentication and not expose your password.
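
As a rough sketch of what that check boils down to, here is how you could generate the Base64 string for $ARG4$ and run the same request by hand. The user, password, and hostname are placeholders; HL-1 is the known-good issue key from my setup:

# Build the value Nagios will send in the Authorization header
echo -n 'nagios-user:password-or-token' | base64

# Manually exercise the same REST call and see how long it takes
curl -s -o /dev/null -w 'HTTP %{http_code} in %{time_total}s\n' \
  -H 'Authorization: Basic <Base64 string from above>' \
  https://jira.example.com/rest/api/latest/issue/HL-1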

The third command here is for checking the disk JIRA’s home directory lives on ($ARG1$). $ARG2$ and $ARG3$ are percentages for the warning and critical thresholds, respectively.

The fourth is for checking the system load. This one is relatively straightforward. $ARG1$ is the system load that will trigger a warning, and $ARG2$ is the system load that shows you have a problem.

Now for the JIRA host configuration:

###############################################################################
# jira.CFG - SAMPLE OBJECT CONFIG FILE FOR MONITORING THIS MACHINE
#
#
# NOTE: This config file is intended to serve as an *extremely* simple
#       example of how you can create configuration entries to monitor
#       the local (Linux) machine.
#
###############################################################################



###############################################################################
#
# HOST DEFINITION
#
###############################################################################

# Define a host for the local machine

define host {

    use                     linux-server            ; Name of host template to use
                                                    ; This host definition will inherit all variables that are defined
                                                    ; in (or inherited by) the linux-server host template definition.
    host_name               jira
    alias                   JIRA
    address                 192.168.XXX.XXX
}


###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################

# Define a service to "ping" the local machine

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}


# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     SSH
    check_command           check_ssh
}



# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service {

    use                     generic-service           ; Name of service template to use
    host_name               jira
    service_description     HTTP
    check_command           check_http
}

define service {

    use                     generic-service
    host_name               jira
    service_description     HTTPS
    check_command           check_https
}


define service {
    use                     generic-service
    host_name               jira
    service_description     NTP
    check_command           check_ntp!0.5!1
}

define service {
	use				generic-service
	host_name			jira
	service_description		JIRA Status
	check_command			check_jira_status
	}
	
define service {
	use				generic-service
	host_name			jira
	service_description		JIRA API Time
	check_command			check_jira_restapi!2!3!HL-1!<Base64 Credentials>
	}
    
define service {
	use				generic-service
	host_name			jira
	service_description		JIRA System Load
	check_command			check_jira_load!5.0,4.0,3.0!10.0,6.0,4.0
	}
    
define service {
	use				generic-service
	host_name			jira
	service_description		JIRA Home Directory Free Space
	check_command			check_jira_disk!<JIRA Home>!75%!85%
	}

So, first, we define the host. This section is information specific to JIRA. Then we start setting services for JIRA. Within Nagios, a Service is a particular check you want to run.

The next four options are pretty standard. These are checking Ping, the two service ports (HTTP and HTTPS), and the SSH port. The SSH Port and HTTP/S port checks will also check that those services are responding as expected.

The next check is for NTP. I have this setup to warn me if the clock is a half-second off and give me a critical error if the clock is off by one second. These settings might be too strict, but it has yet to alert, so I think I have dialed it in well enough.

The next is the JIRA Status check. This service will check /status, as we mentioned earlier. It’s either the string we are expecting, or it’s not, so no arguments needed.

After that is my JIRA API Check, which I set up to check the HL-1 issue. If the API Call takes longer than 2 seconds, Nagios issues a warning, and if it takes longer than 3 seconds, Nagios issues a critical problem. This alert won’t tell me exactly what’s wrong, but it will tell me if there is a problem anywhere in the system, so I think it’s a good check.

The last two services are system checks – checking the System Load and the JIRA home directory disk, respectively. The Load I haven’t had a chance to dial in yet, so I might have it set too high, but I’m going to leave it for now. As for the Disk check, I like to have plenty of warning that I am approaching a full disk to give me time to resolve it, so these numbers are good.

The last step is to add these to the nagios.cfg file so that they get loaded into memory. However, this is as easy as adding the following lines into the cfg file.

# Definitions for JIRA Monitoring
# Commands:
cfg_file=/usr/local/nagios/etc/objects/atlassian_commands.cfg

# JIRA Nodes:
cfg_file=/usr/local/nagios/etc/objects/jira.cfg
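
Before restarting, it’s worth letting Nagios validate the whole configuration, new objects included. A quick sketch, assuming a standard source install under /usr/local/nagios:

# Pre-flight check of the configuration, including the new JIRA commands and services
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg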

And that’s it! Restart Nagios and you will see your new host and service checks come up!

Nagios in action.

So I’ve had this configuration in place for about a day now, and it appears to be working. The API Time check did go off once, but I did restart the JIRA Server to adjust some specs on the VM, so I expected the delay. So I hope this helps you as you are setting up alerts for your JIRA system!

And that’s it for this week!

We did get a bit of bad news about Summit 2021 last week. Out of an abundance of caution, Atlassian decided to go ahead and make all in-person events of 2020 and Summit 2021 virtual events. However, they have almost a year to prepare for a virtual Summit – as opposed to the 28 days they had this year. So I am excited to see what ideas Atlassian has to make this a fantastic event!

Don’t forget our poll! I’m going to let it run another week!

Don’t forget you can check me out on Twitter! I’ll be posting news, events, and thoughts there, and would love to interact with everyone! If you found this article helpful or insightful, please leave a comment and let me know! A comment and like on this post in LinkedIn will also help spread the word and help others discover the blog! Also, If you like this content and would like it delivered directly to your inbox, sign up below!

But until next time, my name is Rodney, asking “Have you updated your JIRA Issues today?”

Monitoring JIRA for Fun and Health

So, dear readers, here’s the deal. Some weeks, when I sit down to write, I know exactly what I’m going to write about, and can get right to it. Other weeks, I’m sitting down, and I don’t have a clue. I can usually figure something out, but it’s very much a struggle. This week is VERY much the latter.

Compound that with the fact that I just lost most of my VMs due to a storage failure I had this very morning. Part of it was a mistake on my part. I have the home lab so that I can learn things I can’t learn on the job. And mistakes are a painful but powerful way to learn. Still….

This brings me back to a conversation I had with a colleague and fellow Atlassian Administrator for a company I used to work for. He had asked me what my thoughts were around implementing monitoring for JIRA. Well, I have touched on the subject before, but if I’m being honest, that isn’t my greatest work. Combine that with the fact that I suddenly need to rebuild EVERYTHING, and, well, why not start with my monitoring stack!

So, we are going to be setting up a number of systems. To gather system stats, that is to say CPU usage, Memory Usage, and Disk usage, we are going to be using Telegraf, which will be storing that data in an InfluxDB database. Then for JIRA stats we are going to use Prometheus. And to query and display this information, we will be using Grafana.

The Setup

So we are going to be setting up a new system that will live alongside our JIRA instance. We will call it Grafana, as that is the front end we will use to interact with the system.

On the back end it will be running both an InfluxDB server and a Prometheus server. Grafana will use both InfluxDB and Prometheus as data sources, and will use them to generate stats and graphs of all the relevant information.

Our system will be a CentOS 7 system (my favorite currently), and will have the following specs:

  • 2 vCPU
  • 4 GB RAM
  • 16 GB Root HDD for OS
  • 50 GB Secondary HDD for Services

This will give us the ability to scale up the capacity for services to store files without too much impact on the overall system, as well as monitor its size.

As per normal, I am going to write all commands out assuming you are root. If you are not, I’m also assuming you know what sudo is and how to use it, so I won’t insult you by holding your hand with that.

InfluxDB

Let’s get started with InfluxDB. The first thing we’ll need to do is add the yum repo from Influxdata onto the system. This will allow us to use yum to do the heavy lifting in the install of this service.

So let’s open /etc/yum.repos.d/influxdb.repo

vim /etc/yum.repos.d/influxdb.repo

And add the following to it:

[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key

Now we can install InfluxDB

yum install influxdb -y

And really, that’s it for the install. Kind of wish Atlassian did this kind of thing.

We’ll, of course, need to allow firewall access so Telegraf can get data into InfluxDB.

firewall-cmd --permanent --zone=public --add-port=8086/tcp
firewall-cmd --reload

And with that we’ll start and enable the service so that we can actually do the service setup.

systemctl start influxdb
systemctl enable influxdb

Now we need to set some credentials. As initially set up, the system isn’t really all that secure. So we are going to secure it by using curl to set ourselves up an account.

curl -XPOST "http://localhost:8086/query" --data-urlencode \
"q=CREATE USER username WITH PASSWORD 'strongpassword' WITH ALL PRIVILEGES"

I shouldn’t have to say this, but you should replace username with one you can remember and strongpassword with, well, a strong password.

Now we can use the command “influx” to get into InfluxDB and do any further set up we need.

influx -username 'username' -password 'password'

Now that we are in, we need to setup a database and user for our JIRA data to go into. As a rule of thumb, I like to have one DB per application and/or system I intend to monitor with InfluxDB.

CREATE DATABASE Jira
CREATE USER jira WITH PASSWORD 'strongpassword'
GRANT ALL ON Jira TO jira
CREATE RETENTION POLICY one_year ON Jira DURATION 365d REPLICATION 1 DEFAULT
SHOW RETENTION POLICIES ON Jira

And that’s it, InfluxDB is ready to go!

Grafana

Now that we have at least one datasource, we can get to setting up the Front End. Unfortunately, we’ll need information from JIRA in order to setup Prometheus (once we’ve set JIRA up to use the Prometheus Exporter), so that data source will need to wait.

Fortunately, Grafana can also be set up using a yum repo. So let’s open up /etc/yum.repos.d/grafana.repo

vim /etc/yum.repos.d/grafana.repo

and add the following:

[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

Afterwards, we just run the yum install command:

sudo yum install grafana -y

Grafana defaults to port 3000, though options to change or proxy this are available. We will need to open port 3000 on the firewall.

firewall-cmd --permanent --zone=public --add-port=3000/tcp
firewall-cmd --reload

Then we start and enable it:

sudo systemctl start grafana-server
sudo systemctl enable grafana-server

Go to port 3000 of the system on your web browser and you should see it up and running. We’ll hold off on setting up everything else on Grafana until we finish the system setup, though.

Telegraf

Telegraf is the tool we will use to get our data from JIRA’s underlying Linux system and into InfluxDB. It is actually part of the same YUM repo that InfluxDB is installed from, so we’ll now add that repo to the JIRA server – the same way we did on the Grafana system.

vim /etc/yum.repos.d/influxdb.repo

And add the following to it:

[influxdb]
name = InfluxDB Repository - RHEL \$releasever
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key

And now that it has the YUM repo, we’ll install telegraf onto the JIRA Server.

yum install telegraf -y

Now that we have it installed, we can take a look at its configuration, which you can find in /etc/telegraf/telegraf.conf. I highly suggest you take a backup of this file first. Here is an example of a config file where I’ve filtered out all the comments and added back in everything essential.

[global_tags]
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  logtarget = "file"
  logfile = "/var/log/telegraf/telegraf.log"
  logfile_rotation_interval = "1d"
  logfile_rotation_max_size = "500MB"
  logfile_rotation_max_archives = 3
  hostname = "<JIRA's Hostname>"
  omit_hostname = false
[[outputs.influxdb]]
  urls = ["http://<grafana's url>:8086"]
  database = "Jira"
  username = "jira"
  password = "<password from InfluxDB JIRA Database setup>"
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]

And that should be it for the config. There is, of course, more we can capture using various plugins based on whatever we are interested in, but this will get us the bare minimum.

Because telegraf is pushing data to the InfluxDB server, we don’t need to open any firewall ports for this, which means we can start it, then monitor the logs to make sure it is sending the data over without any problems.

systemctl start telegraf
systemctl enable telegraf
tail -f /var/log/telegraf/telegraf.log
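
You can also confirm the data is landing from the InfluxDB side. A quick check back on the Grafana box, using the jira account we created earlier (the password is whatever you set above):

# List the measurements Telegraf has written into the Jira database
influx -username 'jira' -password 'strongpassword' -database 'Jira' -execute 'SHOW MEASUREMENTS'
# You should see entries like cpu, disk, mem, and system once the first batch has flushed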

And assuming you don’t see any errors here, you are good to go! The stats will be waiting for us when we finish the setup of Grafana. But first….

Prometheus Exporter

So telegraf is great for getting the Linux system stats, but that only gives us a partial picture. We can train it to capture JMX info, but that means we have to set up JMX – something I’m keen to avoid whenever possible. So what options have we got to capture details like JIRA usage, JAVA Heap performance, etc?

Ladies and gentlemen, the Prometheus Exporter!

That’s right, as of the time of this writing, this is yet another free app! This will set up a special page that Prometheus can go to and “scrape” the data from. This is what will take our monitoring from “okay” to “Woah”.

Because it is a free app, we can install it directly from the “Manage Apps” section of the JIRA Administration console.

Once you click install, click “Accept & Install” on the pop up, and it’s done! After a refresh, you should notice a new sidebar item called “Prometheus Exporter Settings”. Click that, then click “Generate” next to the token field.

Next, we’ll need to open the “here” link in the “Exposed metrics are here” text into a new tab. Take special note of the URL used, as we’ll need this to set up Prometheus.

Prometheus

Now we’ll go back to our Grafana system to set up Prometheus. To find the download, we’ll go to the Prometheus Download Page and find the latest Linux 64-bit version.

Try to avoid “Pre-release”

Copy that to your clipboard, then download it to your Grafana system.

 wget https://github.com/prometheus/prometheus/releases/download/v2.15.2/prometheus-2.15.2.linux-amd64.tar.gz

Next we’ll need to unpack it and move it into its proper place.

tar -xzvf prometheus-2.15.2.linux-amd64.tar.gz
mv prometheus-2.15.2.linux-amd64 /archive/prometheus

Now if we go into the prometheus folder, we will see a normal assortment of files, but the one we are interested in is prometheus.yml. This is the config file we will be working in. As always, take a backup of the original file, then open it with:

vim /archive/prometheus/prometheus.yml

Here we will be adding a new “job” to the bottom of the config. You can copy this config and modify it for your purposes. Note we are using the URL we got from the Prometheus Exporter. The first part of the URL (everything up to the first slash, or the FQDN) goes under target where indicated. The rest of the URL (folder path) goes under metrics_path. And then your token goes where indicated so that you can secure these metrics.

global:
  scrape_interval:     15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
  - static_configs:
    - targets:
rule_files:
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'Jira'
    scheme: https
    metrics_path: '<everything after the slash>'
    params:
      token: ['<token from Prometheus exporter>']
    static_configs:
    - targets:
      - <first part of JIRA URL, everything before the first '/'>

We’ll need to now open up the firewall port for Prometheus

firewall-cmd --permanent --zone=public --add-port=9090/tcp
firewall-cmd --reload

Now we can test Prometheus. From the prometheus folder, run the following command.

./prometheus --config.file=prometheus.yml

From here we can open a web browser, and point it to our Grafana server on port 9090. On the Menu, we can go to Status -> Targets and see that both the local monitoring and JIRA are online.
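
If you prefer the command line, the same information is available from Prometheus’ HTTP API (run this on the Grafana box, or swap localhost for its address):

# List the scrape targets and their health
curl -s http://localhost:9090/api/v1/targets
# Look for "health":"up" on both the prometheus and Jira jobs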

Go ahead and stop prometheus for now by hitting “Ctrl + C”. We’ll need to set this up as a service so that we can rely on it coming up on its own should we ever have to restart the Grafana server.

Start by creating a unique user for this service. We’ll be using the options "--no-create-home" and "--shell /bin/false" to tell Linux this is an account that shouldn’t be allowed to log in to the server.

useradd --no-create-home --shell /bin/false prometheus

Now we’ll change the files to be owned by this new prometheus account. Note that the -R makes chown run recursively, meaning it will change it for every file underneath where we run it. Stop and make sure you are running it from the correct directory. If you run this command from the root directory, you will have a bad day (Trust me)!

chown -R prometheus:prometheus ./

And now we can create its service file.

vim /etc/systemd/system/prometheus.service

Inside the file we’ll place the following:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/archive/prometheus/prometheus \
    --config.file /archive/prometheus/prometheus.yml \
    --storage.tsdb.path /archive/prometheus/ \
    --web.console.templates=/archive/prometheus/consoles \
    --web.console.libraries=/archive/prometheus/console_libraries 

[Install]
WantedBy=multi-user.target

After you save this file, type the following commands to reload systemctl, start the service, make sure it’s running, then enable it for launch on boot:

systemctl daemon-reload
systemctl start prometheus
systemctl status prometheus
systemctl enable prometheus

Now just double check that the service is in fact running, and you’re good to go!

Grafana, the Reckoning

Now that we have both our datasources up and gathering information, we need to start by creating a way to display it. On your web browser, go back to Grafana, port 3000. You should be greeted with the same login screen as before. To log in the first time, use ‘admin’ as both the username and password.

You will be prompted immediately to change this password. Do so. No – really.

After you change your password, you should see the screen below. Click “Add data source”.

We’ll select InfluxDB from the list as our first Data Source.

For settings, we’ll enter only the following:

  • Name: JIRA
  • URL: http://localhost:8086
  • Database: Jira
  • User: jira
  • Password: Whatever you set the InfluxDB Jira password to be

Click “Save & Test” at the bottom and you should have the first one down. Now click “Back” so we can set up Prometheus.

On Prometheus, all we’ll need to do is set the URL to “http://localhost:9090”. Enter that, then click “Save & Test”. And that’s both Data Sources done! Now we can move on to the Dashboard. On the right sidebar, click through to “Home”, then click “New Dashboard”.

And now you are ready to start visualizing Data. I’ve already covered some Dashboard tricks in my previous attempt at this topic. However, if it helps, here’s how I used Prometheus to set up a graph of the JVM Heap.
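
As a rough sketch, the heap graph comes down to a single query in the panel’s metrics field. The metric name below is an assumption on my part – check the exporter’s metrics page from earlier for the exact names your version exposes – but you can test a candidate query against Prometheus directly:

# Hypothetical heap query - verify the metric name on your exporter's metrics page first
curl -s -G 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=jvm_memory_bytes_used{area="heap"}'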

Some Notes

Now, there is some cleanup you can do here. You can map out the storage for Grafana and InfluxDB to go to your /archive drive, for example. However, I can’t be giving away *ALL* the secrets ;). I want to challenge you there to see if you can learn to do it yourself.

We do have a few scaling options here too. For one, we can split Influx, Prometheus, and Grafana onto their own systems. However, my experience has been that this isn’t usually necessary, and they can all live comfortably on one system.

And one final note. The Prometheus exporter, strictly speaking, isn’t JIRA Data Center compatible. It will run, however. As best I can tell, it will give you the stats for each node where applicable, and the overall stats where that makes sense. It might be worth setting up Prometheus to bypass the load balancer and scrape each node individually.

But seriously, that’s it?

Indeed it is! This one is probably one of my longer posts, so thank you for making it to the end. It’s been a great week hearing how the blog is helping people out in their work, so keep it up! I’ll do my part here to keep providing you content.

On that note, this post was a reader-requested topic. I’m always happy to take on a challenge from readers, so if you have something you’d like to hear about, let me know!

One thing that I’m working on is to try and make it easier for you to be notified about new blog posts. As such, I’ve included an email subscription form at the bottom of the blog. If you want to be notified automatically about new blog posts, enter your email and hit subscribe!

And don’t forget about the Atlassian Discord chat – thoroughly unofficial. Click here to join: https://discord.gg/mXuRsVu

But until next time, my name is Rodney, asking “Have you updated your JIRA issues today?”

How to test changes in JIRA

So, a bit of a backstory here. I was doing some experiments at work on running JIRA Data Center in Kubernetes using the official Atlassian containers when I noticed something odd. After loading the MySQL Connector and starting it all up, JIRA Setup kept telling me that the database wasn’t empty. I could see that it was, and per advice from a colleague, even double checked that the collations and char-sets were all correctly set.

Finally I isolated it down to the MySQL Connector. I had grabbed version 8.something, and Atlassian only supports version 5.1.48. And while this connector worked for JIRA 8.5.0, it apparently had some issues with JIRA 8.5.2 and 8.5.3.

This did get me thinking though. I went through the process of isolating the problem relatively quickly as I have had to do this fairly often in my career. But it isn’t the most intuitive thing to learn. So why not cover that this week!

Dev and Test

So, first thing: Friends don’t let Friends Test in Production. People are depending on that system being stable and there, and if you are mucking about in it constantly to “test” things, it will be anything but stable.

For all license tiers save the smallest, Atlassian also gives you an unlimited-use Development License. And this is for both Apps and the main Applications. USE IT! If I.T. won’t give you another system, set up a VM on your desktop. If they won’t let you use that, bring in an old PC from home. There is no excuse for testing in production.

The most common setup I see is for a team to have two non-production instances of each platform: Test and Dev. Dev is your personal instance. This is where you can make changes to your heart’s content, bring it up and down, upgrade it, reset it, whatever, as much as you want. Break it? It won’t impact anything – just refresh from Production. This is usually where I test “I wonder what will happen if I do this?”

Test, on the other hand, is your public non-production instance. You want to let a user test the functionality of a new App before purchasing it? Goes in Test. A user wants to add a new field? Put it in test and let them see what it looks like first. I usually like to refresh this from production on every JIRA Upgrade, but will do it sooner if we’ve made any big changes in production.

As a best practice, I also like to change the color scheme of JIRA for each instance, so you can identify which is which on sight. My usual color scheme is to have the top bar be orange for Test, and red for Dev. A few other things I do:

  • Separate out each instance to a separate DB Server
  • Make sure that if a given non-production server tries to talk to Production, it’s rerouted to the appropriate non-production instance instead – often using the /etc/hosts file (see the sketch after this list).
  • DISABLE THE OUTGOING EMAIL SERVER
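
A minimal sketch of that /etc/hosts trick on the Dev box – hostnames and IPs are placeholders, and the idea is that anything aimed at the production names quietly lands on the matching non-production systems instead:

# /etc/hosts on the Dev JIRA server (placeholder names and addresses)
# Production hostnames resolve to the Dev copies of those systems
192.168.1.50    confluence.example.com
192.168.1.51    crowd.example.com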

I definitely recommend you have both available. If you are only limited to one due to policy or budget, at least have a test instance. Your production instance will thank you.

But what about a non-production site for JIRA Cloud?

Okay – so I haven’t had to deal with this too often. BUT, you are also not the first person to ask, dear reader. Atlassian has a document outlining a few options you have for setting up non-production Atlassian Cloud instances.

Take a snapshot and/or backup before changing anything

Before trying to figure out a problem or making a change, give yourself a way to get back to a pre-test state. If your instance (DB and all) is on a single VM, take a snapshot of the VM before starting. If not, take a tarball of your install and home directories, and take a database dump from your DB while the instance is running. Heck, if you can take both a file backup and a VM snapshot, do both!
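
As a sketch of the file-level route, assuming a MySQL-backed instance and the default installer directories (adjust the paths, database name, and credentials to your own setup):

# Archive the install and home directories (default paths shown; yours may differ)
tar -czf /backup/jira-install-$(date +%F).tar.gz /opt/atlassian/jira
tar -czf /backup/jira-home-$(date +%F).tar.gz /var/atlassian/application-data/jira

# Dump the database while JIRA is still running (jiradb is a placeholder name)
mysqldump -u jira -p --single-transaction jiradb > /backup/jiradb-$(date +%F).sql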

Before I have your ESXi admins after me with torches and pitchforks, I should note something here. The way I understand it, a snapshot sets up a way for ESXi to journal all the changes made to a system within a file, and revert those changes. That means the longer a snapshot sits on a system, the larger it becomes. So always go back and remove a snapshot after you finish your testing. At the very least, it keeps things from getting messy.

This doesn’t only extend to a whole system. If you are changing a single file, make a copy of it first. That way you can go back to the file before you made any changes should the change prove catastrophic. The goal here is no matter what you are doing, always give yourself a path back to before you did it.

Isolate and make only one change at a time

This is probably the most challenging part of testing. For each run you do, you need to make only one change at a time. But what do I mean by change? Do I mean you should upgrade by changing one file at a time? Of course not!

The purpose of this is to isolate something enough to know what fixes or breaks it. So if you are doing a full upgrade, start by upgrading JIRA. Then check to see that it still runs as expected. Then make your changes to setenv.sh. Check again. Then server.xml. Then check again. Then upgrade the apps. Check again.

In the example I gave in the intro, here’s the changes I made each run when I found there was a problem with the DB:

  1. Drop and Re-Setup the Database using a GUI Tool
  2. Drop and Re-Setup the Database from command line.
  3. Try a MySQL 5.7 DB instead of a MySQL 5.6 DB
  4. Try JIRA 8.5.2 instead of JIRA 8.5.3
  5. Try JIRA 8.5.2 with MySQL 5.6 instead of MySQL 5.7
  6. Try JIRA 8.5.2, MySQL 5.6, with a different MySQL Connector – FIXED!

So you can see how at each step I only changed one item. Yeah, it took me six runs to find a solution, but I now know it was for sure the MySQL Connector.

Yes, this adds significant overhead of bringing down and restarting JIRA each run. BUT – if and when something does break, you will know it was only the last thing you did that broke it. Likewise if something fixes it, you also know it was the last thing you did that actually fixed it.

Keep track of the changes you’ve made to each instance since the last Refresh

This is a bit of practical advice. Somewhere (Confluence), you need to have a document that shows in what ways each non-production instance has been changed since the last time you refreshed it from production.

Add a field? Add that to the doc. User tested an App? Document it. The idea is to have a journal to show what you’ve done, so that if you need to refresh it while a user is still testing something, you know where to find those changes to restore them.

And I get it – documentation is evil. Why spend time writing about what you are doing when you could be doing more? This is something I struggle with too! But this is a case where an ounce of prevention is worth a pound of cure.

Practice good Change Management on Production!

So, you’ve tested something in dev, put it before users in test, and now you are ready to put it on Production now. Enough delays, right?

Slow down there, friend! Production is sacred, you shouldn’t just run in there with every change.

Change control/change management is a complex subject – and honestly – hasn’t always been my strong suit. But it’s meant to keep you as an admin from your worst impulses. Annoying at times, I’ll grant you, but still a good thing overall.

The best way I have found is to set up a board made up of your Power Users, other Admins, and various other stakeholders as needed. Have them meet every so often (every other week seems to be the sweet spot here). If you have the budget for it, make it a lunch meeting and provide food. You are much more likely to get people to show up if they get to eat.

Then go over every change you want to make and gather feedback. They might spot a problem with a use case you hadn’t considered. But be sure to get a vote on each change before the meeting is over. Trust me, if you don’t structure and control the meeting, they will talk each point to death.

As a note here, there should be an exception to putting changes through the board during an emergency. If production is down, your first priority should be getting it back online as soon as possible. Then you can have time to retroactively put it through the board. For all non-emergency changes though, the change board is the valve to what you want to put into production.

Strictly speaking, this is not part of testing, but all things considered, I didn’t want you to run off thinking testing was the last step. As with everything JIRA, it all works best when it’s a process.

And that is it!

You are ready to do some testing in JIRA. With the advice above, you are ready to maintain your JIRA Instances responsibly – or at the very least give yourself a way out of any sticky situations you find yourself in.

Don’t forget to join us on Discord! https://discord.gg/mXuRsVu

Until next time, this is Rodney, asking “Have you updated your JIRA issues today?”

Leaving the Breadcrumbs: how to adjust Logging.

So, we’ve discussed how to read your logs, and what impact changing them will have on your disk drives. So how do we go about changing logging levels and tuning logging? That is what we are going to discuss today. So without much preamble, let’s get into this.

Should I adjust my logging levels?

If I’m being honest, No. Well, that was a quick blog post, I’ll see you next week!

Actually, let me explain. The default levels are sufficient for 99% of admins out there. It’s a good balance of what you need to know to diagnose issues without filling up your disks. Typically, I only recommend people adjust these levels if asked to do so by Atlassian or App Vendor support.

However, it’s still important that you know how to do so when asked. And with a bit of homework you might even be able to adjust them and find your answers before you have to get support involved. My advice though is to do so carefully if you choose to.

I just need to note something here, and it’s something I forgot to mention last week. Debug level logs can sometimes capture passwords in the log file. I should not have to tell you why that could be bad. However, this is just one more reason why you should really think twice about capturing any logs on the Debug level.

Temporary vs Permanent

So, the first question you need to answer is do you need this change to be permanent or temporary. Atlassian does give us two ways to change logging – one via the Admin console in the web UI, which will only last until your next JIRA restart. As such I label this the “temporary” option. The second option is by changing some files within the JIRA install directory, which will persist across application restarts, and as such I label a “permanent” change.

As I tend to recommend you stick to defaults, without any deeper context, temporary is my go-to answer. However, as with all things Atlassian, context is everything. Let’s say you’ve had some problems with a new app you’ve installed onto your instance, and it’s causing the Application to restart regularly. This is a time you’d want to look at a more permanent solution, as you’re not going to capture those detailed logs while JIRA loads up otherwise.

If you are working with Support, they will almost always tell you to go the temporary route. However, no matter what I say, my advice is to follow their advice. Seriously, they are scary good at what they do, and they are not going to steer you wrong.

Making a Temporary change to logging.

To change something temporarily within the logs, we’ll need to go to the Logging and Profiling section of the JIRA Administration Console. Once there, you’ll find it under System -> Logging and Profiling.

Note: You will need System Administrator global permissions to be able to see this section.

As a pro-tip, you can also find any admin page from anywhere in JIRA by hitting the period key on your keyboard, then typing the page you are interested in. For this to work, your cursor must be outside a text input box of any kind.

Once here, you’ll see different sections:

  • Mark Logs
  • HTTP Access Logging
  • SQL Logging
  • Profiling
  • Mail
  • Default Loggers

Today we’re going to be primarily interested in the Mark Logs section and the Default Loggers section. Everything else is available to turn on, but remember that these logs will only run until the next time you restart JIRA.

Mark Logs

The first section here is where you can add a comment into your logs. You can also roll over your logs. A roll over is where JIRA will increment the end number on all existing rolled over logs, copy the current log file to atlassian-jira.log.1, then start a new atlassian-jira.log file.

Both of these techniques are great for marking a section of logs before doing some operation or test. In fact, I made great use of this functionality to do my testing last week on log sizes. It can turn a ten-minute manual search for where you started doing something into an “It’s right here” instant search.

Default Loggers

The next section of interest is the Default Loggers section. This has the familiar logging levels shown, on a per-package basis. A package is a grouping of classes or objects within the Java code, so each of these will adjust the logging on a specific aspect of JIRA.

This unfortunately is an exhaustive list, and I don’t have time to write up what each of them do (assuming I even know that!). However, most of these come down to logic, and with a bit of searching you can find something.

That is not to say I won’t offer any guidance. If I’m being honest, it’s amazing how many times JIRA problems turn out to be App problems. So I’m going to quickly discuss how to add and adjust logs for an App here.

Our first step is to go to our App Manager, then expand the information for the App we are interested in.

Really should update that…

Look for the App Key, highlighted above, and copy it. Then head over to Logging and Profiling, and under Default Loggers, click “Configure logging level for another package”. Paste the App key into the Package name section, then select your logging level. Click Add and boom, you are now logging for that particular App.

Permanent Logging Changes

To change logging levels, you will need to find the <jira-install-dir>/atlassian-jira/WEB-INF/classes/log4j.properties file.

Here we can change a number of things, including log levels of various packages, where the logs are, and so on.

Super Important Note: Before making any changes to this file, make a copy and save it in a safe place. Your future self will thank you.

To change the logging level, you first need to know the package you are adjusting for. You will either have gotten this from Support or the App Manager section of JIRA. Then we’ll look for log4j.logger.<package-name>. You should see something like this:

log4j.logger.com.atlassian = WARN, console, filelog
log4j.additivity.com.atlassian = false

To adjust the log level, change “WARN” to any of the other logging levels, then save. After you restart JIRA you should see the logging level change reflected. This goes with any change to this log file – you will need to restart JIRA to see any changes.
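
For an App, the pattern is exactly the same – the App key becomes the package name. A hypothetical example (the key below is made up; use the one you copied from the App Manager):

log4j.logger.com.example.myplugin = DEBUG, console, filelog
log4j.additivity.com.example.myplugin = false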

And that’s logging, done.

So I’ve actually had a lot of fun going back over this subject matter. It was also the first reader-requested topic, so that is amazing in and of itself. So what do you guys think I should cover next? Leave a comment here or on LinkedIn, and if it’s good, I’ll cover it. So until next time, this is Rodney, asking “Have you updated your JIRA Issues today?”

Pile of Breadcrumbs, how logging levels impact JIRA’s logs

Well, seems we are back on track. Last time we looked at logging, we went over how to decode the information that was in the logs. During that piece, I made the following claim:

If you set everything to Debug before you leave on Friday, by the time you are sitting down for dinner on Saturday, you’re going to be paged for a full disk.

Following the Breadcrumbs: Decoding JIRA’s Logs, 30 Oct, 2019

Well, this got me to thinking… exactly how much does JIRA’s logging level impact log size? To figure this out, we need a bit of SCIENCE!

Ze Experiment!

So here’s what I’m thinking. We take a normal JIRA instance, and we do a set number of tasks in there, roll over the logs, then see what size they are. Rinse and Repeat per log level.

So the list of tasks I have for each iteration is:

  • Roll Over the Logs
  • Logout
  • Log in
  • Create Issue
  • Comment on Issue
  • Close Issue
  • Search for closed issue
  • Log into the Admin Console
  • Run Directory Sync
  • Roll Logs

Between the last Roll over and the first one of the next iteration is when I’ll capture log size and adjust the logging levels.
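
The capture itself doesn't need to be fancy. Something along these lines, assuming the default home directory path, records the size of the rolled-over logs for a run:

# Hypothetical home directory - adjust to match your instance
LOGDIR="/var/atlassian/application-data/jira/log"

# Per-file sizes for this run's logs...
ls -lh "$LOGDIR" | grep "atlassian-jira.log"

# ...and the total in bytes for the data set
du -cb "$LOGDIR"/atlassian-jira.log* | tail -n 1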

Now, to keep this as close to an apples-to-apples comparison as possible, I’ll need to limit the background tasks as much as I can. The biggest one I can think of is the automated directory sync, which will need to be disabled for the test. However, as this is a regular activity within JIRA, I’ll be including a manual directory sync to capture that in the data set.

I’ll also need a control to measure what JIRA does by default, which will be my first run. To make sure I can return to the defaults later, I’ll be adjusting the logging levels in Administration -> System -> Logging and Profiling. Changes to the logging level here do not persist over a restart, so this should be ideal. So, without further ado, see you on the other side.

The Results

Well, that was an adventure. I ended up taking a backup of the log4j.properties file and changing it directly. Turns out changing more than 110 settings by hand, one at a time, is not very time-efficient. Editing log4j.properties dropped the number I had to change manually to around 30.
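
If you ever find yourself in the same spot, a bulk find-and-replace on the file gets you most of the way there. A rough sketch (not the exact commands I ran, and the install path is an assumption):

PROPS="/opt/atlassian/jira/atlassian-jira/WEB-INF/classes/log4j.properties"

# Keep an untouched copy to restore from later
sudo cp "$PROPS" "$PROPS.orig"

# Bulk-swap the common levels to DEBUG; the remaining oddball entries still need a hand-edit
sudo sed -i 's/ = WARN,/ = DEBUG,/g; s/ = INFO,/ = DEBUG,/g' "$PROPS"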

I tried to be as consistent as possible with each run. That means I had the same entries for each field on the issue I created, the same comment, the same click path, etc. I even goofed up my search on the control run (typed issueky instead of issuekey), so I repeated that mistake for each run afterwards.

Another note I want to make is that there was one setting I could not change. Turns out that if the logger for com.atlassian.jira.util.log.LogMarker is set to anything but INFO, JIRA will crash when you go to roll over the logs. Oops!

Conclusions

I think even considering all that, my assertion still stands. This was one person doing somewhat normal JIRA tasks, and even then the Debug run was still almost 1000x the size of the Info run. Fun fact: it rolled the logs over automatically four times. Now multiply that by how many people are using your instance, and how many times they will be doing these kinds of operations, over and over again. Even when they aren’t, there are automated processes like the directory syncs that will take up log space. It will definitely add up fast.

However, I think the bigger consideration here is to never take anything at face value. Yeah, I’m an expert, but even I was just parroting something I had been told about logging. Now I have the experiment and data to back up my assertion. Don’t be afraid to put things to the test. You might discover where some advice isn’t right for your environment. Or you might find out why it was said, and become that much more knowledgeable. The point is to always be learning.

So until next week, this is Rodney, asking “Have you updated your JIRA issues today?”

JIRA Service Desk Vulnerability – 06 Nov 2019

Yes, I know what I promised you guys last week. And more information about logging is coming, but sometimes there are things that are more important. This is one of them.

I’ve spoken briefly about security advisories before. Well…Atlassian has just announced a Security Advisory for JIRA Service Desk that affects both the Server and Data Center versions. As such, I figured it was worth breaking into our regularly scheduled programming to discuss it in more detail.

So, what’s going on already?

Atlassian has released several patches that solve a significant vulnerability in JIRA Service Desk that can let people bypass authorization. For any service, this is not a great bug to have. I mean, they aren’t leaking credentials in clear text, but this is almost as bad.

HOWEVER, there are workarounds to mitigate this bug, as well as fixes released to resolve it entirely. I’ll be going through it in more detail, but to hear it straight from the horse’s mouth, here’s the link:

How do I know if my version of JIRA is affected?

First off, this bug only affects JIRA Service Desk, which means if you are only running JIRA Software or JIRA Core, you’re good. If you are using Atlassian’s cloud offering, you are also good. This bug only impacts systems running JIRA Service Desk under the Server or Data Center offerings.

Furthermore, it doesn’t impact every version. Below I’ve listed out the versions that are affected:

Versions with Bug Present

  • Any version before 3.9.17
  • All of 3.10.x through 3.15.x
  • 3.16.x before 3.16.10
  • All of 4.0.x through 4.1.x
  • 4.2.x before 4.2.6
  • 4.3.x before 4.3.5
  • 4.4.x before 4.4.3
  • 4.5.0

If you are running any of the above versions, below are the versions of Service Desk where this is fixed. As always, I recommend you go with an Enterprise Release; the latest one for Service Desk is version 4.5.1.

Versions where the Bug is fixed

  • 3.16.10
  • 4.2.6
  • 4.3.5
  • 4.4.3
  • 4.5.1

So…exactly how worried should I be?

Well, that depends. Is your system exposed to the internet? Honestly, you should probably be putting mitigation steps into place right now rather than reading this. No, I mean it, close this window and go fix it now! I’ll be here when you return.

Even if you are not, it’s always shockingly easy to gain access to places people think are “secure”. You can’t assume that just because you have a good firewall between your system and the internet, someone can’t bypass it by walking into your office and finding an unlocked computer. Looking at you, Darren!

A good thing about this bug is that it has a workaround you can put in place if you can’t upgrade right away. All you have to do is add the following rule to the file <jira-install-dir>/atlassian-jira/WEB-INF/urlrewrite.xml:

<rule>
    <from>/servicedesk/.*\.jsp.*</from>
    <to type="temporary-redirect">/</to>
</rule>

After you add the above rule to the file, be sure to restart JIRA so that the changes are picked up, and you are safe until your next upgrade…or vulnerability disclosure, whichever happens first.
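
If you want a quick sanity check that the rule took effect after the restart, hit one of the blocked paths and make sure you get a redirect instead of a page. A sketch, with a placeholder base URL:

# Expect a redirect status code (302) back to / rather than a 200
# (jira.example.com is a placeholder - use your own base URL)
curl -s -o /dev/null -w "%{http_code}\n" "https://jira.example.com/servicedesk/test.jsp"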

Assuming you are not willing to mess with the Atlassian Install directory (I cannot blame you), you can also make some changes to your proxy or load balancer to mitigate the bug. However, this is only as secure as your users’ inability to bypass the proxy, so your mileage may vary.

Seriously though, you should upgrade sooner than later. Just saying.

Why should I care?

Unfortunately for us, black hat hackers and other ne’er-do-wells have figured out that enterprises love running Atlassian Applications. Which means they are now actively looking for vulnerabilities to use against these applications. Why develop a niche tool that will only work against a handful of targets when you can develop one that will work on just about every target?

However, Atlassian’s Bug Bounty program means that the good guys are just as motivated to find these bugs and report them responsibly to Atlassian before the bad guys can find them.

But that doesn’t mean you can slack off. People on both sides of the ethical line are finding bugs, and even when a bug is found by the white hats, black hat hackers will still weaponize it. If you’re lucky, they are only installing a cryptocurrency miner on your system. You don’t want to be called in because the entire company was ransomware’d, with your system as patient zero.

Also, can ransomware be a verb? Oh well, it is now!

Well, that’s all for this week

I am actually delaying this post a bit so that I don’t disclose this before Atlassian does (Responsibility!), but I also want to bring your attention to something.

Atlassian opened up registration for Summit 2020. If you haven’t been to one before, I have one bit of advice. Do. It.

No, I’m serious! Beg your manager and their manager to have your company pay for it. If they won’t, take a vacation and pay for it yourself. It is ABSOLUTELY worth it. Between the Keynotes and talks, you will learn not only what the best practices are, but what’s coming down the pipeline. Last time I went, they announced native iOS and Android apps for the cloud versions. That was some exciting news to return to the office with.

You will also network with so many other people in the trenches just like you. Trust me, when you’re doing the day-to-day as the lone Atlassian Admin, it’s things like this that let you know there are people who get it.

And don’t even get me started on Summit’s Bash. You’ll want to go 😉

For more details, here’s the link to the website:

Who knows, maybe you will even bump into me.

However, if you can’t go, don’t fret. Atlassian has always been good about posting not only their keynotes, but all the talks and presentations, onto YouTube. I pretty much block off that week to catch up on everything when I can’t go, so you won’t miss anything critical.

So, until next week when we return to our look at Logging in JIRA, this is Rodney, asking “Have you updated your JIRA issues today?”

Following the Breadcrumbs: Decoding JIRA’s Logs

Well, that’s a thing. For those that don’t know, Rachel pretty much wrote the book on being a JIRA Admin. So, I guess it’s time to put up or shut up. JIRA logging, let’s do this.

For those of us in the trenches, logs are life savers. No, really. When something goes wrong within an application, logs are often our first, and sometimes only, view into why it went wrong. You can tell a lot about what JIRA is doing from its logs – if you know what to look for.

As always, we start any look into a new aspect of JIRA by looking at the documentation. Here, we find the relevant document under Logging and Profiling:

Where to find your logs.

Finding your logs is actually pretty simple, assuming you know where your home directory is. Your log files for JIRA can be found at <jira_home_directory>/log/.

Navigating here, you will see several log files, usually along the lines of “atlassian-jira-<name>.log”. If a log has rolled over, you will see several copies of it, with the older ones having a number appended to the name.

The one we will be interested in, nine times out of ten, is the atlassian-jira.log file.
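
A quick way to see what you are working with (the home directory path here is the default one; yours may differ):

# List the JIRA logs, newest first; rolled-over copies end in .1, .2, and so on
ls -lht /var/atlassian/application-data/jira/log/atlassian-jira*.log*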

But what about the Catalina.out file?

You may have noticed another log file in the JIRA install directory called catalina.out. This is Tomcat’s log file, and while it replicates most of what is in atlassian-jira.log, it won’t have everything JIRA is doing. Furthermore, it will muddy the water with additional information from Tomcat itself. If you haven’t changed any settings in the JIRA install directory, you won’t have a lot of reasons to go in there.

However, I have found catalina.out helpful when I’ve had an issue starting JIRA because the JVM environment was misconfigured. In that case, JIRA’s logs will have nothing, but Tomcat will report the Java error. If your JIRA is up and running though, don’t worry about this log.

Typical Logs

So, here is a snapshot of the logs after doing some typical things within JIRA:

Yeah, you may have to zoom in a bit there. But on the whole, it looks like organized chaos. And well, it is. In a live production environment, these may be zooming by so fast that you don’t have a chance to absorb what’s happening. With a bit of practice you can get to a point where you can watch the logs stream by and only notice the things that look out of the ordinary…but even that’s not perfect. For now, let’s look at one entry, from when I accessed the login page for JIRA.

2019-10-29 08:08:44,423 http-nio-8080-exec-23 INFO anonymous 488x35978x1 xmd6sf 192.168.<redacted>,192.168.<redacted> /rest/gadget/1.0/login [c.a.jira.index.MonitoringIndexWriter] [lucene-stats] flush stats: snapshotCount=6, totalCount=63, periodSec=486768, flushIntervalMillis=81128149, indexDirectory=null, indexWriterId=com.atlassian.jira.index.MonitoringIndexWriter@19cc4430, indexDirectoryId=RAMDirectory@5d7db2f9 lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@14d78abf

So, what do we have going on here? At first glance, we notice several parts.

Time Stamp and Misc Information

First, we have the time stamp. This is often the best thing to filter on. If you know when an event you are interested in happened, you can start around that time and work your way backwards. It can narrow hundreds of thousands of lines down to maybe a few hundred.
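
Because every entry starts with the same date format, a plain grep on the timestamp prefix gets you into the right neighborhood quickly. A small sketch (the timestamp and path are just examples):

# Pull every entry logged during the 08:08 minute on the 29th
grep "2019-10-29 08:08" /var/atlassian/application-data/jira/log/atlassian-jira.log

# Or widen it to the whole hour and page through the results
grep "2019-10-29 08:" /var/atlassian/application-data/jira/log/atlassian-jira.log | less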

The next bit is an identifier that tells you what part of JIRA generated the log entry. For the above entry, it’s the web server. It may take some digging to figure out what is a part of what, and honestly, unless you are doing log aggregation, it isn’t going to help you a whole lot.

Logging Level

The third bit – that’s interesting. It’s the logging level. Logging levels tell the application how much of what’s going on you want recorded, and they come in five levels: Debug, Info, Warn, Error, and Fatal.

You may notice I put them onto a scale. That is because as you change the logging level, you are increasing or decreasing how much of what is going on gets recorded to the logs. I could do a whole post on logging levels alone, but the gist of it is: if you set Debug, you can almost watch all the variables change in real time in the logs. If you set it to Fatal, JIRA will only log entries when something catastrophic happens, and even then it will likely leave out the details leading up to the crash.

So this leads to the question: “If not enough information is a bad thing, why don’t we set everything to Debug and log all the things?”

Bad Admin! No Treat!

If you haven’t guessed, there is a trade-off. Here, it’s disk space. If you set everything to Debug before you leave on Friday, by the time you are sitting down for dinner on Saturday, you’re going to be paged for a full disk. No really, you’ll be surprised how fast text can fill up a hard drive.

Generally speaking, you can also use logging levels to tell if something bad is happening. The developers assign different events to different logging levels, based on their best guess at how important they are to log. That being said, I’ve seen items at Warn or Error that were no big deal, and I’ve seen Info level log entries that were actual concerns. The easiest example I can think of is in my logs right now:

2019-10-29 08:30:11,656 Caesium-1-1 INFO ServiceRunner     [c.a.crowd.directory.DbCachingRemoteDirectory] Incremental synchronisation for directory [ 10000 ] was not completed, falling back to a full synchronisation
2019-10-29 08:30:11,656 Caesium-1-1 INFO ServiceRunner     [c.a.crowd.directory.DbCachingRemoteDirectory] INCREMENTAL synchronisation for directory [ 10000 ] was not successful, attempting FULL

This is at log level INFO, but it’s saying something is causing it to fail an incremental synchronization from my LDAP directory and fall back to a full one. In my test environment, the LDAP directory is small enough that this won’t be a problem, but against a full corporate LDAP directory, this can cause significant overhead. I’ve even DDoS’ed my own JIRA instance trying to do a full directory sync. And again, this is only an “INFO” level log…

Identifier

After the log level, we have the identifier information. For web traffic and actions, this will tell you whose account initiated the action that resulted in the log entry. If you have a problem that only one user is reporting, this can help, as you can search the log for that username. Or if you want to know whether that person did something that caused problems, there you go too.

For web traffic, this is also accompanied by a couple of IP addresses, usually the user’s client-side IP address followed by the IP address of your proxy. This can be helpful for identifying traffic originating from outside your network, or for investigating access that didn’t look right.
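
This is another spot where grep earns its keep. A quick sketch, with a made-up username and a documentation IP address standing in for real values:

# Everything a particular user did
grep "jdoe" /var/atlassian/application-data/jira/log/atlassian-jira.log

# Everything that came from a particular client IP
grep "203.0.113.42" /var/atlassian/application-data/jira/log/atlassian-jira.log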

Package

After the identifier, you will see a section in brackets. This is the package that generated the log entry. In layman’s terms, it’s a more specific part of JIRA: a specific plugin (or part of a plugin), or a particular routine within JIRA. You can see the list of packages and their associated logging levels within the System -> Logging and Profiling section of the Administrative Control Panel in JIRA.

Speaking of Logging and Profiling, you can also change the logging level per package in this section. These changes will only persist until the next time you restart JIRA; in a future post we’ll cover how to change them permanently. You also have access to an even deeper logging level here called “Trace” – but it is recommended you only use that if Atlassian Support asks you to in order to help with a specific issue. Your disks will thank you.

Message

After all these details, we get to the meat of the log: the message. This is what happened, or didn’t happen, that warrants logging. It’s the “What” to the rest of the log’s “Who, When, and Where”. This is where you dig in to find whatever problem you are looking for.

And that’s what’s in your JIRA Log

And that’s pretty much it. I do have one more thing for you, though. For me, reading white logs on a black background makes it easy to get things mixed up. To help with this, I wrote up a quick little script that will look up where your JIRA log is and pass it through an awk filter that colorizes each entry based on its log level. It will also follow the log, which means it will display entries in real time as they are written, and keep doing so until you hit “Ctrl+C”.

For reference, the colors each log level will be:

  • DEBUG: Green
  • INFO: Default
  • WARN: Yellow
  • ERROR: Red
  • FATAL: Magenta
#!/bin/bash
# Rodney Nissen
# Version 3
# Updated 10/29/2019
# Update JIRAINST to point to your JIRA Install Directory

JIRAINST="/opt/atlassian/jira"
# Read jira.home out of jira-application.properties to find the log directory
JIRAHOME="$(sudo grep "jira.home = " "$JIRAINST/atlassian-jira/WEB-INF/classes/jira-application.properties" | cut -d ' ' -f 3 | tr -d '\r')"
JIRALOG="$JIRAHOME/log/atlassian-jira.log"

# Follow the log and colorize each entry by its logging level
# (INFO falls through to the default terminal color)
sudo tail -f -n 50 "$JIRALOG" | awk '
  /DEBUG/ {print "\033[32m" $0 "\033[39m"; next}
  /WARN/  {print "\033[33m" $0 "\033[39m"; next}
  /ERROR/ {print "\033[31m" $0 "\033[39m"; next}
  /FATAL/ {print "\033[35m" $0 "\033[39m"; next}
  1
'

I hope you will get as much use out of this tool as I have. As I stated in several parts of this post, I think this is the kind of subject that deserves a series of posts, not just one. As such, I’m thinking I’ll do at least two more: an in-depth dive into logging levels and what they actually mean, and another on how to play around with your logs. But until then, I’m Rodney, asking “Have you updated your JIRA Issues today?”

Advanced Monitoring of JIRA

Note from the future: I’ve actually written a much better article on this you can find here: Monitoring JIRA for Health and Fun. You will find much better information there. I’m leaving this up as a testament to the past, but figured I’d save you the work.

So, confession time. I am a sucker for data. I love seeing real time graphs of system stats, performance, events, etc.

And this is just Minecraft….

So when I saw the following blog post being shared on reddit, it was a no brainer.

Jira active monitoring 24/7 with Prometheus + Grafana

It was a great read and a great guide. I chose to set up Prometheus in a docker container as all my other monitoring tools are there, but aside from that his guide was spot on, and I encourage everyone to read it.

However….

It didn’t quite go far enough, in my mind. I could get a ton of stats that I couldn’t get before: licensed user count, logged-in user count, attachment storage size. But without being able to tie it to what the underlying system was doing, it just felt…incomplete. I mean, it’s great that you can tell you have 250 users logged in, but is that what’s really causing the lag in the system?

Luckily, I had another tool in my tool belt. I had previously set up Grafana and InfluxDB/Telegraf for use on other systems. It was merely a matter of setting up Telegraf on the JIRA server and starting to pump in the required data that way.

To see how to install InfluxDB and Grafana via docker, I’d first suggest following this guide from a fellow homelabber:

My JIRA system runs on CentOS, so my first step in getting JIRA’s server info was to add the InfluxDB repository to yum. To do so, I executed the following command:

cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL 
baseurl = https://repos.influxdata.com/rhel/7/x86_64/stable/
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF

After this, it was a simple thing to do a yum install, and boom, I had Telegraf on the system. As a matter of disclosure, I did look that up here; it’s not like I just knew it.
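
For completeness, the install itself was just the package manager plus enabling the service. Assuming systemd (as on CentOS 7), it looks like this:

# Install Telegraf from the InfluxData repo added above, then enable it at boot
sudo yum install -y telegraf
sudo systemctl enable telegraf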

After installing it, I did need to configure it so that it would gather and send the stats I was interested in to InfluxDB. To do so, I took a backup of the default config file (found at /etc/telegraf/telegraf.conf), then replaced the original with my own generic config file for Linux systems:

[global_tags]
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  logfile = "/var/log/telegraf/telegraf.log"
  hostname = "<<system host name, optional>>"
  omit_hostname = false
[[outputs.influxdb]]
  urls = ["<<url to your influxdb server>>"]
  username = "<<influxdb user>>"
  password = "<<influxdb password>>"
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]
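
With the config in place, restarting the agent and tailing its own log (the path matches the logfile setting above) is a quick way to confirm data is flowing:

# Pick up the new config, then watch Telegraf's log for connection or write errors
sudo systemctl restart telegraf
sudo tail -f /var/log/telegraf/telegraf.log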

Then it was a matter of starting Telegraf, confirming in its logs that it was sending, and starting work in Grafana. After working the data a bit, I got the following dashboard:

::wipes up drool::

Now, this is still very much a work in progress. For one thing, the orange line is supposed to be the max Java heap, and unless I stop JIRA and change it, I don’t think it’s supposed to move. And those gaps aren’t me restarting JIRA, so it should have been constant. (Proxy problems, amirite?)

The particularly tricky one to build was the CPU graph and gauge. Telegraf will break down CPU usage into io_wait, user, system, and so on. To be fair, that is how Linux records it internally, but to get a single unified number I had to take the idle percentage, multiply it by -1, and add 100 (in other words, busy equals 100 minus idle).

The second gotcha is under the Options tab of the Single Stat panel. It will always default to giving you an average, which at times is what you want, but in most cases it gives you weird readings, like 0.4 users currently logged in instead of one. To fix this, simply change it to read “Current”, as shown below.

So, you may be asking yourselves, why?

My first answer would be “Have you seen it! It’s just cool a.f.!”

But my second, more practical answer is incident detection and response. Just like in JIRA, dashboards here give you an idea of what’s going on now, as well as what’s been going on previously. You can tell at a glance if you are dipping into swap, which can severely slow down a system, or if you’ve run out of memory for the JVM heap setting you have, or if you don’t have enough CPU power to process all the requests. You can even have this posted on a monitor on the wall so everyone can see it in real time.

My Brain at the moment….

So this post is a bit short…

I just got back home yesterday from a last-minute trip I had to take. I knew I’d want to update the blog, but honestly couldn’t think of what to do it on. Luckily, while I was gone, I had worked on this as a distraction, so here we are. This is a tool I wish I had on some of my previous engagements, and I figured some of you might be able to use it as well. So, until next week, when I can hopefully complete part one of Installing JIRA from Scratch, this is Rodney, asking “Have you updated your JIRA issues today?”