Sets and LODs for Custom Colour Legend Filters

I came across an interesting challenge the other day while I was working on a client project. Although Tableau already allows users to highlight dimensions from a colour legend, using a colour legend as a filter is a bit trickier. In order to work around this, you can create a custom colour legend as a separate sheet and use that on your dashboard with a filter action (see community post here). However, in order for this hack to work, the field you are using to colour your data needs to be a dimension. What if you are colouring your data by a calculated field?

Let me illustrate the problem with the go-to sample superstore data. Let’s say we want to see a scatter plot that breaks our customers into quadrants based on sales and profit. Something like this:
2016-07-25_21-27-58

Next, we colour the individual quadrants using the following calculation:

2016-07-25_21-38-57

Wait, whoa, what’s going on here? Why the nested LOD calculations? Why make it so complicated? Well, essentially what we’re doing here is writing a calculation that is the equivalent of WINDOW_AVG. So if we were using WINDOW_AVG, the calculation would look like this:
2016-07-25_21-42-00.png

So why not just use WINDOW_AVG? Hold your britches, explanation on the horizon, about 2 paragraphs down.

So we drag our complicated nested LOD calculation onto the colour shelf of the marks card, and we get to this point where our quadrants are differentiated by colour:

2016-07-25_21-46-59

Sweet. Now this is all fine and dandy and we’d like to see a list of our customers based on the quadrants they fall into.

2016-07-26_22-02-41.png

As you can see in the screenshot above, we’ve got a list of our customers with the quadrants calc we used to colour the scatterplot included as a detail. If we create a new dashboard and put everything together, we can see that clicking the colour legend will not filter our customer list, it will only highlight relevant customers.

2016-07-26_22-06-09

Using the method outlined in the community post here, we can use a colour legend as a filter by creating a new sheet that mimics the design of a colour legend. Remember how we could have done our quadrant colouring calculation with a simple WINDOW_AVG? Well, this is what happens when we try to create a colour legend sheet with such a calculation:

2016-07-26_22-10-45.png

WINDOW_AVG is a table calculation that needs Customer Name as a level of detail in order to calculate properly. The problem is, we end up with too many squares. The solution? Sets. And sets don’t like table calculations. Hence, the long, drawn-out, nested LOD calc.

To remedy the situation, create Customer Name sets for each of the quadrants, based on the calculation we used to colour our quadrants. For example, a High Sales, High Profit set would look like this:

2016-07-26_22-15-18.png

Once you’ve got all four sets created, drag all of them onto the detail shelf of the sheet where you have your customer names listed. Then, create a new calculation with the following:

2016-07-26_22-19-00.png

In both your scatterplot sheet and customer name list sheet, drag this new calculation onto the detail shelf of your marks card. Create a new sheet and drag the same calculation onto the rows shelf and the colour shelf. You end up with something that looks like a colour legend. Put the whole shebang together on a dashboard, add a filter action between your legend sheet and name list and a highlight action between your legend sheet and your scatterplot, and you get this:

2016-07-26_22-33-56

Now you’ve got an interactive dashboard that breaks your customers down into quadrants and gives you a list of those customers depending on your selection. A useful tool for segmentation analysis, and a good way to practice the utility of sets in Tableau.

Advertisement

Time Formats in Tableau Server Postgresql Data

It’s been a while since I’ve posted an article, but I tend to believe that no news is good news! I’ve been working hard at my placement and learning a mountain of stuff. I’ve been itching to write a new post to really ground the experience in my brain, and hopefully help other folks out who might be struggling with the same analytics issues.

Recently, I was working with the Tableau Server postgresql  data and hit a bit of a bump when I realised the times in the dataset weren’t in the timezone format I needed. Why is this important? Well, let’s say you’re trying to determine the times when your dashboards are viewed the most, or times when you have concurrent users on your server to determine how much of a strain you’re putting on your system. By default, the times included in the postgresql data are in UTC format. So in order to put the time in a format that is meaningful in your timezone, you will need to do some manipulation.

One thing that complicates the situation is daylight savings time. Although you could do a simple DATEADD formula and convert UTC time to your timezone, how can you account for times in the year when the time changes? What if you are looking at multiple regions with different daylight savings time rules?  In a comment posted in response to Mark Jackson’s article , Brandi Beals provides a conversion calculation that addresses this issue.

First, you will need to determine the daylight savings time rules for your own country. In Brandi’s comment, she outlines a calculation for time conversion based on the daylight savings time rules in the United States for CST. She ends up with a calculation that looks like this:

DATEADD(‘hour’,
IF
DATEPART(‘month’,TODAY())>=3      //Starts in March
AND DATEPART(‘week’,TODAY())>=10  //Starts 2nd Sunday ~10th week of the year
AND DATEPART(‘month’,TODAY())<11  //Ends in November
AND DATEPART(‘week’,TODAY())<45   //Ends 1st Sunday ~45th week of the year
THEN -5 //Daylight Savings Time
ELSE -6 //Standard Time
END
,[Started At])

Let’s break this down. In the first DATEPART line, we are looking for dates that start after March, when daylight savings time occurs in the States. In the second DATEPART line, we are looking for dates that start after and including the 10th week of the year. Why? Because not only does daylight savings time start in March in the US, but it also starts on the 2nd Sunday of the month, which is approximately the 10th week of the year.

Okay. Pause. How do we know the second Sunday of March is the 10th week of the year? Well, we could take out our calendars and count through the weeks until we get to the date we need. Or we could use something called the ISO week date. The ISO week date is a calendar system that will give us the week a date is located in, while still accounting for leap years. The calculation is a bit tricky, but if you’re curious to see how it’s done, Wikipedia has a great summary of instructions here. Otherwise, you can go to www.epochconverter.com/weeks for a handy table of dates and their corresponding week numbers.

Note that even with using this method, we can’t be 100% accurate because leap years will sometimes change the week number where a weekday lies. However, this will only vary by 1, so it should not significantly affect your data, but it’s something to be aware of.

If we go back to Brandi’s calculation, we can see that the last two DATEPART lines include dates before the first Sunday of November (the 45th week in the 11th month). We can then complete our IF THEN calculation by using a numerical conversion from UTC to CST (-5), ELSE a numerical conversion for Standard Time (-6).

In the UK, the rules for daylight savings time are slightly different, but the overall calculation remains the same. Based on a daylight savings time (i.e. British Summer Time) that starts on the last Sunday of March, ends on the last Sunday of October, and a UTC to GMT offset of +1, we end up with a calculation for the UK that looks like this:

DATEADD(‘hour’,
IF DATEPART(‘month’,TODAY())>=3
AND DATEPART(‘week’,TODAY())>=12
AND DATEPART(‘month’,TODAY())<10
AND DATEPART(‘week’,TODAY())<43
THEN 1
ELSE 0
END
,[Started At])

Although it’s a bit of a pain to get the conversion calculation figured out to account for daylight savings time, it will give you the most accurate time information when monitoring information output from Tableau Server’s postgresql data. In addition, you can use these calculations when trying to look at time information from multiple time zones, while accounting for different daylight savings rules in each region.

 

Last and Prior Calculations: PART II

In my last post, I outlined one method for creating last and prior calculations using Tableau’s LAST() function. Although this is a pretty straightforward and useful method, it limits you in terms of creating table calculations such as percent change. In this post, I will show you another way to create last and prior calculations that will also allow you to perform table calculations.

Using level of detail calculations and IIF statements

The first thing you want to do with this method is create a calculation that will give you the last, or most recent, date in your data set. Using the sample superstore data, this would be something simple like:

{MAX([Order Date])}

By wrapping the expression in curly brackets, we’re essentially creating a fixed level of detail calculation. But in this case, because we haven’t set the scope, we’ve told Tableau to calculate the figure based on our entire dataset, thus giving us our most recent date. This will help when we do the DATEDIFF calculation in the next step.

Now if we go back to the table we created previously using the LAST() calculation, we can see that the most recent month in our dataset is given a value of 0. Again, as before, our last 6 months would therefore be months 0-5 and our prior 6 months would be months 6-12.

Screen Shot 2016-05-15 at 10.14.52 pm

Using this information and the most recent date (i.e. last date)  calculation, we can use the DATEDIFF function to generate our last 6 months that looks like this:

Screen Shot 2016-05-30 at 1.07.48 am

And for prior 6 months:

Screen Shot 2016-05-30 at 12.51.27 am.png

When we drag them into our table, we create a Boolean True/False calculation for the months in our dataset

Screen Shot 2016-05-30 at 12.51.45 am.png

In order to get the actual figures, we need to create another two calculations using an IIF statement. Because we cannot wrap our Sales figure in a SUM aggregation INSIDE the IIF statement, we can work around this by wrapping the SUM aggregation around the outside of the IIF statement like so:

Screen Shot 2016-06-04 at 12.34.27 pm

Calculation for last 6 month sales

Screen Shot 2016-06-04 at 12.36.27 pm.png

Calculation for prior 6 month sales

And now we can create a dual axis chart as I did in my previous post that compares the two figures:

Screen Shot 2016-05-30 at 1.00.49 am

Now the last thing we want to do with this method that we weren’t able to do with the previous method is create a percent change calculation. The calculation is as follows:

Screen Shot 2016-05-30 at 1.21.13 am

And now if we put everything together in a table, we can see our final result with our percent change calculation. This can be used to display percent change for any of our dimensions and the figures will update accordingly:

Screen Shot 2016-05-30 at 1.22.22 am

 

 

 

Last and Prior Calculations: PART I

When working with performance metrics, it’s always a good idea to have some benchmark to compare against. It might feel good to say you made 100M in sales, but depending on the company, 100M might be a dramatic dip compared to the 200M you made last month. Other benchmarks might be competitor performance, figures from the same time last year, or figures from the same company but a different department.

Another benchmark that can be informative is comparing performance in the last x number of months versus performance in the x number of months prior to that – hence, last versus prior. In order to calculate these metrics in Tableau, you can do one of two tricks. In this post I will walk you through the steps to create these calculations using Tableau’s LAST() function.

USING LAST()

The first trick involves using the LAST() table calculation in Tableau, which assigns a value of 0 to the last row visible in your workbook. So if we are looking at our sales by quarter, our LAST() calculation would assign values like this:
Screen Shot 2016-05-15 at 10.12.56 pm

If we were looking at monthly sales, LAST() would assign values like this for our bottom (i.e. last) rows:

Screen Shot 2016-05-15 at 10.14.52 pm

Let’s say we wanted to see sales values for the last 6 months. All we would have to do is filter to the bottom 6 rows, which we can see are assigned values 0-5.  The calculation would look like this:

Screen Shot 2016-05-15 at 10.20.04 pm

And we would just drag that onto the filters shelf and select True. Now our table shows us sales for the last 6 months:

Screen Shot 2016-05-15 at 10.20.55 pm

Great. We’ve got sales for the last 6 months, and now we want to compare these to sales for the 6 months before that (i.e. prior 6 months).  If we look at our original table, we can see that the prior 6 months are assigned values 6-11 by the LAST() function:

Highlight

So using AND, we can obtain the prior 6 months with  a calculation like this:

Screen Shot 2016-05-15 at 10.25.56 pm

And now our table looks like this:

2016-05-19_09-10-26

In order to create a view that compares last and prior figures side by side, we can create a dual axis bar chart. First, using the previous Boolean calculation we made, create a new calculation that will display only Last sales and that looks like this:

IF [Last 6 Months] THEN SUM([Sales])
ELSE NULL
END

Do the same thing for a calculation that will display only Prior sales:

IF [Prior 6 Months] THEN SUM([Sales])
ELSE NULL
END

Now you can construct your dual axis bar chart. Drag your date field onto the detail shelf and set it to continuous MONTH. Set your mark type to bar. Then drag both calcs onto the columns shelf, and set them to compute using the date field. Click the arrow on one of your calculation pills and select dual axis, then right click on the axis at the top and synchronize your axis. Lastly, drag Measure Names from the Dimensions pane onto the rows shelf and you will have a dual axis bar chart showing you the values for sales in the last 6 months vs prior 6 months.

2016-05-19_09-13-54

Now, the problem with this method is that if you wanted to calculate % change between last and prior 6 months, you might have some difficulty. This is because your last and prior calculations are table calculations that cannot be aggregated.  In other words, the calculations are computing values for each month in the view, and they cannot be summed together in a calculation like SUM(Last 6 Month Sales) or SUM(Prior 6 Month Sales).

One option is a level of detail calculation to force the calculation to ignore the months in the view. Ah, but here’s another problem, you cannot combine table calculations and level of detail calculations. So what do? In my next post I will show you an alternative way to create last and prior calculations that will still allow you to use them in aggregate calculations such as % change.

Spring is in the air (and your viz)

Welcome to my new blog and my first post in this new digital space!

Things are warming up in London and it’s happening fast. So fast that I’m still finding myself walking out in the sunshine wearing my full-on winter coat, wool scarf, and fleece boots. I’ll admit, my core temperature tends to err on the side of Arctic (I like to tell people I’m a failed Canadian due to my constant chilliness), but even I’m feeling goofy walking out in the heat looking like I’m stuck in January. So my solution this weekend was to do some hardcore spring cleaning, which gave me an opportunity to tidy up for a more weather-appropriate closet. It also gave me some good motivation to get this blog up and running, and the inspiration for this first post!

One thing that people sometimes neglect when finalising their dashboards is cleaning up their space. I get it, it can be tempting to fire off a viz into publication after spending hours creating your masterpiece and building up all the anticipation to show people what you’ve done. However, doing just a couple of quick clean ups can optimise how you present your creation so people read your data’s message loud and clear.

So here are 5 tips to spring clean your workbooks and present a polished viz

1.) Delete unused calculations

I’ve often come across workbooks with a ton of calculations crowing up the measures pane, but only one or two actually being used in the view. If your intention is to just showcase your finalised dashboard, then this isn’t too big of an issue. However, if you want to give folks the opportunity to explore your workbook on a deeper level and understand the skill that went on in the background, it’s a lot easier when your calculations are cleaned up.

A quick and easy way to do this is to click on the carrot next to the dimensions pane, and click “Hide all unused fields”

Another way is to just try and delete the calculation. If it’s being used in one of your views, Tableau will spit out a warning. If it’s not, the calculation will just be deleted and your viz will remain the same.

2.) Delete unused sheets and hide everything else 

I find that when I’m constructing my vizzes, I often create multiple sheets to test out different ways to represent the data and see what works best for what I want to communicate. This is great for the creative brain process, but not so great if you want to keep things tidy. If you want to keep your sheets to document your creative process, put them all in a new dashboard and then you can hide them with the other sheets used in your finalised dashboard. Hiding sheets is as easy as right clicking on your dashboard tab at the bottom, and clicking Hide all sheets.

Sometimes people like to colour code their sheets at the bottom, which is a great way to keep yourself organised while you’re working away, but it ends up being distracting when it’s the first thing people see when they open up your workbook. Don’t keep these open in your finalised view, hide them as well.

3.) Delete unused data sources

This is not just a visual tidying up, it can also help optimise how your workbook loads up. If you added a data source while you were building things but then decided not to use it, delete it! Just right click on the name of the data source you want to remove, and select Close.

4.) Give everything names that make sense

This is particularly important for your calculated fields. Make sure they are descriptive. This is for your benefit as much as it is for your audience as well. It’s a pain trying to drop a calculation in your view when you’re not sure if it was calculation23 or calculation54. Name them!

Also name your worksheets so you know how to reference them when building your dashboard, and so your audience knows how to reference them. I don’t typically rename my dashboards, but if they’re going on Tableau Public the dashboard name will be used in the URL that is built, so it’s sometimes nice to have control in how that URL will be written.

5.) Leave your tooltips for last

Always format your tooltips as a last step. This is because if you are dragging pills on and off while building your views, it will affect your tooltips. You want everything to be consistent, and you don’t want any weird or wonky things being written in your tooltips.