Analytics How To: Using Regex in Funnel.io to Split a Column by a Delimiter
Since this article was published, Funnel has added text splitting functionality.
The REQ Analytics Team knows how complex and complicated managing marketing data can be. That’s why we invest in many solutions that assist our data management efforts and bring value to our clients. One of the solutions we use in our reporting automation process is Funnel.io, a tool that helps transform marketing data into automated marketing reports. Let’s dive into how to split a column by delimiter in Funnel.io.
A delimiter is a special character that indicates the barrier between separate groups of text or numbers. In this example, the delimiter is a hyphen.
Many users understand the usefulness of Excel’s “Split Text by a Delimiter” function, but when trying to recreate the effect in Funnel.io, users can run into issues. In Excel, users can select the column they wish to split, click on the “data” tab, then “Text to Columns”, “Delimited”, then specify the delimiter.
This action is very helpful because one dimension has now become three separate dimensions that can better tell the data’s story. Currently, Funnel.io does not have a “Split Text by a Delimiter” function like Excel, but regex can be used to solve this issue.
Regex stands for regular expression, which is a language that uses characters to describe a search pattern. In the first example a simple regex will be used with Funnel.io to separate one dimension into three different capture groups. A user can then use those to make their new dimensions. In the second example, the same regex will be used alongside a few more rules in Funnel.io to account for inconsistencies in data.
Example 1:
Below is the data that is being imported into Funnel.io using a google sheet connection.
The goal in this example is to split the campaign column into three columns: platform, campaign name, and campaign type. The regex that will be used in this example is the following: ([^-]+)-([^-]+)-([^-]+). If your delimiter is a different character, let say the percentage sign, the regex becomes ([^%]+)%([^%]+)%([^%]+).
In order to make the platform dimension in Funnel.io select “Dimensions” under “Data Transformation”, then click “+ Custom Dimension” in the top right corner. The rule should look like the screenshot below so that “Campaign matches regex ([^-]+)-([^-]+)-([^-]+) regex match capture group 1”. The regex included in this rule includes three capture groups that look like “([^-]+)” separated by the delimiter. Capture group 1 is chosen because the rule needs to capture the leftmost segment which is the platform.
If the steps above were done correctly, the data explorer view should look like the screenshot below.
In order to create the campaign name and campaign type dimensions, repeat the same process as shown when the platform dimension was made, but instead of “capture group 1”, use “capture group 2” for the campaign name and “capture group 3” for the campaign type. The final product should look like the data explorer below.
Example 2:
Below is the data that is being imported into Funnel.io using a google sheet connection.
The goal in this example is to split the campaign column into platform, campaign name, campaign type, and ad type. The challenge when using real data is that dimensions are not always uniform. In this example, there are some cells with four capture groups and some with three. Additionally, Cell B4 and B5 don’t have an ad type, and cell B6 does not have a campaign type. The rules below show one way to approach these problems.
The rules for the platform dimension are similar to the first example, but these rules account for the three and four capture group formats.
Campaign name uses the same rule as platform, but uses capture group 2.
Campaign type uses a similar rule, but in order to ensure the rule does not capture the ad type that is in cell B6 of the underlying data, there must be an additional rule that says that “Campaign does not contain Variation” in the three capture group rule.
Ad type uses a similar rule to campaign type, but in order to ensure the rule does not capture the campaign types that are in cells B4 and B5 of the underlying data, there must be an additional rule that says that “Campaign contains Variation” in the three capture group rule.
The final product should look like the data explorer below.
Splitting text by a delimiter is a powerful tool that can help clean messy strings into usable dimensions. Using regex and other rules, Funnel.io users can also clean their dimensions and provide greater clarity to their data. Even in example two, when there is an uneven number of capture groups and missing dimensions, users can get creative in order to organize and clean their data.