Dynamic Previous 12 Months of Data w/ Column Names

I have a fairly challenging calculation: Based on start date/end date I need to know if an employee was employed at all in a given month and then be able to count those for the last 12 months. I was able to do this in SQL in my source (snowflake). However I need the appropriate month/year names in the column headers and the SQL won’t let me set dynamic column names that Sigma can read in.

My base dataset without the SQL-fu is (simplified):

Employee | Start Date | End Date
Bob | 9/30/2022 | 11/15/2022

Goal would be a dataset something like this:
Sept 2022 | Oct 2022 | Nov 2022…
15 | 28 | 29

Is there a way to either:

  • Dynamically set a column name from within a workbook via formula or…
  • Do this calculation within a workbook and have column names with Month/Year?

Would a Pivot Table work for you here or does it need to be a table?

Pivot table would be fine. I’ve been playing around with how to do that and am still stuck.

Something like this could work, you probably need to add additional logic

So there’s a few problems with that which I see. Firstly it doesn’t take into account years and I’m looking at a rolling 12-month period. So if I was employed from November 2021 - February 2022 it wouldn’t compute right. Secondly I don’t think that approach conforms to the dataset I have, since I only have startdate and enddate.

One way I have solved this in the past is manually adding a column per month I am looking at, and then using a calculation like [Start Date] < MakeDate(2022,12,01) AND [End Date] > MakeDate(2022,12,01) And just updating the MakeDate calculation for each month I want represented.
This works but involves a lot of manual creation of columns and formulas. The benefit is that it is conceptually simple, and just gives you a TRUE or FALSE value for each month for each employee.

I feel like you could also do something with joining in a date dimension but that’s not something I’ve tried or fully reasoned through.

So every month I’d need to go in, edit the workbook and add in the new column appropriately. Yeah, that’s always been something I figured I could do, I just wasn’t super jazzed about the idea.

I hadn’t thought about using a separate date dimension but I’m not sure it would be any different from just using month(date). Unless it’s been too long a week and my brain is going out.

Agree, I’m also never jazzed about the “manually update each month” options.

Let me play around with the date dimension option and see what I can come up with.

1 Like

I created a date dimension table that was just a list of dates for all the months I wanted to do analysis for. I then did a cross join so that every employee had a row for all of the months in the dimension table.
Next, I created a new column with the formula DateTrunc("month", [Start date]) <= [Date] and If(IsNotNull([End Date]), DateTrunc("month", [End Date]) >= [Date], True)
This checks if the start month is equal to or before the date I am testing for. It then check if there is an end date, and if so if the end date is after or equal to the date I am testing. If there is no end date, then the expression evaluates as true.
Last, I grouped by my date dimension. This then gives me a list of all employees and a true false value for whether they were employed in the given month. I then added a CountIf function to count the number of employees for each month.

Benefits of this approach:

  • Easy to upload a date dimension table that covers a good chunk of time. While there is still some maintenance required, swapping in a new date dimension table is straight forward and quick.

Drawbacks:

  • You are adding a lot of rows to the table and that may get unwieldy. You will have number of months in the dimension table + 1 rows for every employee.

So looking at this it’s possible so long as we also included a comparison on Year([Start Date]) and Year([End Date]). I’ll play around with it. But yeah, ~365 days * X employees is a lot of data. I suppose I could materialize it. I also realized I’d need to update the date dim table but I think I can script that out automagically in snowflake pretty easily. Or really just have that be a view.

Okay, here’s the solution I got and I’ll give the details in case this helps others down the line. For starters I created a dimension view as follows. It’s limited seeing as I knew I only wanted a year back and wanted it to change dynamically when as the months changed (I know I didn’t take into account leap years but it’ll be right 99.999% of the time and I’m not going to stress about it).

create or replace view v_dim_dates_last_year as
with cte_date as (
  select 
    dateadd(day, seq4(), dateadd(
		dd, (-1 * day(current_date()-1)),		--first of the month
			dateadd(yy, -1, current_date())		--1 year ago
		)
		) as my_date 
  from 
    table(generator(rowcount => 400))       --365 days + max of 30 give or take to cover back the previous year-ish
) 
select 
  to_date(my_date) as date, 
  to_timestamp(my_date) as datetime, 
  year(my_date) as year, 
  month(my_date) as month, 
  monthname(my_date) as monthname, 
  day(my_date) as day, 
  dayofweek(my_date) as dayofweek, 
  weekofyear(my_date) as weekofyear, 
  dayofyear(my_date) as dayofyear 
from 
  cte_date;

I created another view against my core dataset and that view cross joined as mentioned earlier. That I loaded in as a new dataset to Sigma.

To make it simple(-er) I extracted the month and year as separate columns from both the term date and hire dates. Then I made this monstrosity to count within the year/month grouping.

CountDistinctIf([Employee Number], ([Hire Year] < [Year] or ([Hire Year] = [Year] and [Hire Month] <= [Month])) and ((([Term Year] = [Year] and [Term Month] >= [Month]) or [Term Year] > [Year]) or IsNull([Termination Date])))

I also filtered out any rows where [Date] > today() since I don’t need to put in future dates.

Once I had an all active staff by month grouped table, I could duplicate it and filter it to certain kinds of staff, re-join the two together and do things like this:

I’m glad you figured it out!

I also want to make sure to mention that you can do a cross join directly in the Sigma join interface. If you choose “custom formula” for both join keys and just put in “1” for both, then it will be a cross join.