How to fill in missing dates in table elements

Frequently we get asked about how to fill in missing dates. To start, you can see here in my base data that I am missing the 1st, 8th, 15th, 22nd, and 29th. Some will have data like this where it is 1 row per day as I do below, others will have more line item type data. Regardless, we know there are missing dates and the solution is almost identical.

  1. Create a date dimension table if you dont have one in your warehouse already. My preferred method is via custom SQL. Here I use a combination of snowflakes Generator function and dateAdd from an end date. You could make this even more dynamic by using current_date() and backfill the last 3 years if you wanted to. Ill leave the SQL for other warehouses in the comments below. I should also note that you could upload a CSV of the full date list or even use an input table. In this use case it is less necessary for the dimension table to be dynamic but as pointed out above, there are other use cases that would backfill the last 3 years from today in which case the SQL is best.

  2. Once we have our dimension table, we can Add via lookup from our base data. To do this, right click the date column and select “Add Column Via Lookup”. This is where things vary slightly depending on the format of your data. Since my base data has 1 row per data I don’t need an aggregate, but if your data is more granular, feel free to set aggregate to Sum in the lookup helper modal.

Now we see that we have a table with rows for all the dates, including the ones missing from the original! To complete this we can simply wrap the lookup in the formula bar with the ZN() function aka Zero if Null.

If this was helpful but you still need assistance or want clarification around any of these steps, let us know!

3 Likes

As promised, some SQL to create Date Dimensions for the last 3 years in other Warehouses (1095 days). If you know a way for one not mentioned below drop it in the comments!

Snowflake

select
  seq4() DAYS_FROM_TODAY, dateadd(day, '-' || seq4(), current_date()) as FULL_DATE
from
  table
    (generator(rowcount => 1095)) -- 1095 = 3 years

BigQuery

SELECT
  d AS full_date
  FROM
    UNNEST(GENERATE_DATE_ARRAY(DATE_ADD(current_date(), INTERVAL -1095 DAY), current_date(), INTERVAL 1 DAY)) AS d

REDSHIFT

SELECT current_date - (n || ' days')::interval
from generate_series (1, 1095) n
2 Likes

Added Custom-SQL

If you’re on Snowflake here is a handy SQL thats easily customizable to create all the date ranges and date parts you may need!

WITH cte_my_date
     AS (SELECT Dateadd(hour, Seq4(), '2015-01-01 00:00:00') AS MY_DATE
         FROM   TABLE( Generator(rowcount => 20000) ))
SELECT To_date(my_date)              AS date
       ,To_timestamp(my_date)        AS datetime
       ,Year(my_date)                AS year
       ,Month(my_date)               AS month
       ,To_char(( my_date ), 'MMMM') AS monthname
       ,Monthname(my_date)           AS "short_monthname"
       ,Day(my_date)                 AS day
       ,Dayofweek(my_date)           AS dayofweek
       ,Dayname(my_date)             AS dayname
       ,Weekofyear(my_date)          AS weekofyear
       ,Dayofyear(my_date)           AS dayofyear
FROM   cte_my_date; 
1 Like

Added Data-Modeling, Dates, Table