How to format a telephone number using regular expressions with RegexpReplace

Telephone numbers can have many different formatting conventions. Typically it’s best practice to store these as a Text/String data type. The side-benefit of doing so is it will also allow the use of regular expressions for parsing, matching and replacing.

In our example, there are 10-digit numbers stored as Text type: ##########. We want to format them to the convention (###) ###-####

We can use Sigma’s RegexpReplace function for this. The syntax is:

RegexpReplace(string, pattern, replacement)

Note: in our example below, we will be running this function in a Snowflake connection and so by extension it uses Snowflake’s REGEXP_REPLACE. Snowflake’s supports backreferencing capture groups in the replacement regular expression. Each CDW will have its own nuances for its regular expression functions and may or may not support these features.

In the below example, the pattern argument will use parentheses () to denote capture groups that we’ll backreference when replacing them with the formatting we desire. Group 1: (###), Group 2: (###), Group 3: (####).

Note: Snowflake supports a maximum of 9 capture groups. They are referenced by using double back slashes followed by the number of the particular group, e.g. \\1

In the replacement argument, we will backreference the capture groups and put in the additional formatting we desire, such as the literal parentheses around the first three digits and the literal dash between the second and third capture groups.

RegexpReplace([Phone Number], "(\\d{3})(\\d{3})(\\d{4})", "(\\1) \\2-\\3")

Added Formulas