So many people write in to Sigma support asking, “what is wrong with my IF statement?” And… so many people write in asking how to do things that are easily solved with an IF statement.
So, if you want to learn how to properly utilize this mighty function, please read on!
Syntax
Per the Sigma documentation, the correct syntax for the IF statement is as follows:
If(condition 1, value 1, [condition 2], [value 2],..., [else value])
.
This means, when condition 1
is satisfied, return value 1
. If/when condition 1
is not satisfied, the IF function will move on to check the next condition. If no “else” return value is passed through the last argument of the IF function, rows that don’t satisfy any of the previous conditional checks will return Null
. On the other hand, if an “else” return value is specified, rows that don’t satisfy any of the previous conditional checks will return that specified value.
In short, this function will check if the specified conditions are satisfied (and return corresponding values) before resorting to returning the implicit or explicit [else]
value.
5 points to keep in mind
-
Make sure that you know the data type(s) of any column(s) referenced in IF statement conditions… e.g., if you’re trying to check that a string is greater than a date in a date type column, the function won’t run successfully. You’ll need to convert that string to date using the
Date
function so it can be compared against a column of date values, e.g.:If(Date("2022-12-01") > [Date], True, False)
works, assuming[Date]
is a date type.
-
Do NOT mix data types in the return arguments, e.g.:
- Incorrect:
If([Cost] < [Sales], "Profit", [Cost] = [Sales], 0, "Negative")
The above would hypothetically return a mix of strings and numbers… which is not allowed.
Instead, only return one data type per column (aside from Nulls), e.g.:
If([Cost] < [Sales], "Profit", [Cost] = [Sales], "Break Even", [Sales] < [Cost], "Loss")
The above will successfully return a string column no matter what (except when
[Sales]
or [Cost] is null, in which case the implicit else condition will returnNull
… which is acceptable). - Incorrect:
-
Remember that you can combine column conditions within any condition argument, e.g.:
If([Cost] < [Sales] and [Product Type] = "Computers", 1, 0)
This can be particularly helpful if you want to filter out rows where multiple column conditions are satisfied. That is, if you use this formula to create a column showing which computer products made a profit, you can then filter this indicator column to only include values of 1 (this allows you to only see profitable computer products). Alternatively, if you prefer to take a closer look at the computer products which are not profitable, you can filter the indicator column to only include values of 0.
That said, if you have too many things to check within one condition, e.g.:
If(Contains([SKU], "1000") or Contains([SKU], "1001") or Contains([SKU], "1002") or Contains([SKU], "1003") or Contains([SKU], "1004") or Contains([SKU], "1005") or Contains([SKU], "1006") or Contains([SKU], "1007") or Contains([SKU], "1008") or Contains([SKU], "1009") or Contains([SKU], "1010") or Contains([SKU], "1011") or Contains([SKU], "1012") or Contains([SKU], "1013") or Contains([SKU], "1015") or Contains([SKU], "1016") or Contains([SKU], "1017") or Contains([SKU], "1018") or Contains([SKU], "1019") or Contains([SKU], "1020"), 1, 0)
you will likely hit an error… but, fear not, you can update this formula so that it doesn’t, e.g.:
If(Contains([SKU], "1000", "1001", "1002", "1003", ..., "1020"), 1, 0)
-
The conditions are checked in sequential order, so be careful to put them in the right order (i.e., if a row satisfies both the first and second conditions, that row will be assigned the value associated with the first condition).
-
Use a Switch statement if you have just one column / parameter that you’re testing equivalence for in each condition… it’s much cleaner and accomplishes the same thing as an If statement.