Derived column syntax reference

This applies to v2.23

Use derived columns to define new calculated columns after data has been imported. You define a derived column as a D language function, and it must return an integer (long), string, or decimal (double) value.

This document provides a detailed overview of the language features available when coding a derived column function. 

See Create a Derived Column and Derived column examples for more information about creating and using derived columns. 

Derived Columns must return an Integer, String, or Decimal

Derived columns must return an integer (technically a long), string, or decimal (double) value. 

Derived Columns use the D programming language

The D programming language is a compiled language with a feel similar to C/C++. When coding a D function for use in an Interana derived column, keep in mind the following limitations:

  • Your function must comply with the @safe annotation (statically checked to exhibit no possibility of undefined behavior)
  • Your function must comply with the pure keyword (cannot access global or static, mutable state except through its arguments)

For performance and security reasons, many standard D library functions are not available.
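As a rough illustration of these two constraints, the sketch below shows a small helper that is both @safe and pure. The name clamp_value and its parameters are hypothetical and not part of Interana; the point is only that the function reads nothing except its arguments and uses no standard library calls.

```d
// Hypothetical helper, not an Interana built-in: clamps value into [lo, hi].
// @safe: no pointer arithmetic or unchecked casts.
// pure:  reads only its arguments, no mutable global or static state.
long clamp_value(long value, long lo, long hi) @safe pure
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}
```

For instance, clamp_value(150, 0, 100) evaluates to 100, and clamp_value(42, 0, 100) evaluates to 42.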

Available D Libraries

We do not currently support any D libraries.

Tips for working with derived columns

  • A column name that contains "." must be escaped, or the derived column will not compile. Either rename the column or reference it through the c("<column_name>") function.
  • Derived columns reference the friendly column name. If you change that name after creating the derived column, the derived column will no longer work.
  • Avoid using D language reserved characters in column names. For example, Derived Columns cannot reference columns named "c", a reserved character in D. See the D language Lexical topic for more information. 
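The first tip can be sketched as follows. The column names device.width and device.height and their values are made up for illustration, and the c() defined here is only a stand-in so the snippet compiles outside Interana; inside a derived column you would call the built-in c() directly.

```d
// Hypothetical stand-in for Interana's built-in c(); the real function
// reads the named column from the current row.
long c(string field_name) @safe pure
{
    if (field_name == "device.width")  return 1280; // made-up sample value
    if (field_name == "device.height") return 720;  // made-up sample value
    return 0;
}

// Derived column body: dotted names cannot be written as bare identifiers,
// so they are passed to c() as quoted strings.
long device_pixels() @safe pure
{
    return c("device.width") * c("device.height");
}
```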

Interana built-in functions that operate on columns

Within your derived column, you can reference the following custom functions:

Column type | Function / variable | Notes
int, string | long c(string field_name) | Given a column name (as a quoted string), returns its integer value for the current row. Typically you can make a direct, unquoted reference to any column name to get its value, but in some cases the column name has a special character (like a "."). Use this function in those cases.
string | long match_string(long field_value, string pattern) | Tests the value of a string column against a regular expression. Returns 1 if match, else 0.
int_set, string_set | long set_size(long field_value) | Returns the number of elements in the int_set or string_set column (for this particular row).
int_set | long set_contains(long field_value, long elem) | Tests all values of an int_set column for the specified integer value. Returns 1 if match, else 0.
string_set | long set_match_string(long field_value, string pattern) | Tests all values of a string_set column against a regular expression. Returns 1 if match, else 0.
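A minimal sketch of how match_string might be called from a derived column follows. The match_string defined here is a simplified stand-in (it recognizes a single sample handle and a single pattern) so the snippet compiles on its own. The function name is_checkout and the url parameter are hypothetical; inside Interana, url would be a direct column reference rather than a parameter.

```d
// Hypothetical stand-in: the real match_string() resolves field_value to the
// row's string and applies the regular expression. This stub only knows one
// sample handle (1 stands for a string that starts with "/checkout").
long match_string(long field_value, string pattern) @safe pure
{
    return (field_value == 1 && pattern == "^/checkout") ? 1 : 0;
}

// Derived column body: 1 for rows whose url matches the pattern, else 0.
long is_checkout(long url) @safe pure
{
    return match_string(url, "^/checkout");
}
```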

Arithmetic operations with derived columns (get and getd functions)

In version 2.23, we added the get and getd functions for derived columns. Use them when the data you're referencing contains null values and you want a safe way to compute arithmetic operations on multiple fields.

Use get for long values and getd for double values. The signatures are long get(string field_name, long on_null); and double getd(string field_name, double on_null);.

When the fetched value is null (does not exist), the value you specify for on_null will be returned instead. This allows you to perform arithmetic operations, where performing those operations on null values would result in errors. 

For example, you can set on_null to 0 to compute the sum of two fields x and y:

return get("x", 0) + get("y", 0);

If you are performing a multiplication operation on two fields (x and y), set on_null to 1:

return get("x", 1) * get("y", 1);
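To make the effect of on_null concrete, here is a small sketch in which column x holds 3 and column y is null. The get() defined here is a stand-in for the built-in so the snippet compiles outside Interana, and the sample value and the function names sum_xy and product_xy are assumptions.

```d
// Hypothetical stand-in for Interana's get(): in this sketch column "x"
// holds 3 and every other column is null, so on_null is returned instead.
long get(string field_name, long on_null) @safe pure
{
    if (field_name == "x") return 3;
    return on_null;
}

// Sum: a null field contributes 0.
long sum_xy() @safe pure
{
    return get("x", 0) + get("y", 0);   // 3 + 0
}

// Product: a null field contributes 1, leaving the other factor unchanged.
long product_xy() @safe pure
{
    return get("x", 1) * get("y", 1);   // 3 * 1
}
```

In both cases the null column y falls back to the identity value for the operation, so each function returns 3.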

For example, you can create a derived column based on screen width and height values that accounts for possible null values: 

double return_screen_area() {
    // getd (not get) is used because on_null and the result are doubles.
    return getd("screen_width", 0.0) * getd("screen_height", 0.0);
}

So if screen_width is NULL, the computed value is:

0.0 * screen_height_value = 0.0 

Referencing columns

Within your D function, you can reference Interana columns of type:

  • int
  • string
  • int_set
  • string_set

You can also reference columns containing "." characters. You must surround these columns with the c("<column_name>") function.

Referencing lookup columns

As of version 2.18, you can reference a lookup column when defining a derived column. You can use the columns in the lookup table as normal columns, or use the lc() function to get the value from that column. 

Use the syntax lc("column_name") to reference a lookup column. 

Unsupported references

Interana does not support the following references:

  • time columns (for example, of type milli_time)
  • named expressions, including cohorts, sessions, metrics, funnels
  • other derived columns

set_size and get_item functions

Note that when you import set columns to Interana, the import process preserves the order of items in the column and does not de-duplicate items in the set. Interana also does not distinguish between null and empty data.   

Function | Description
set_size(set_column) | Returns the number of elements in a set column (int_set, string_set).
get_item(set_column, index) | Returns the element from the set at position index. The index starts at 0. When looping over the set elements, the first element is at index 0, second at index 1, and so on.
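Looping over a set with these two functions could look like the sketch below. The set_size and get_item defined here are stand-ins that model a fixed sample set [4, 8, 15]. The long parameter and return types for get_item are an assumption, since this document does not give its full signature, and sum_items is a hypothetical name.

```d
// Hypothetical stand-ins: the set handle is ignored and the "column" is a
// fixed sample set [4, 8, 15]. The get_item signature is assumed.
long set_size(long set_column) @safe pure
{
    return 3;
}

long get_item(long set_column, long index) @safe pure
{
    if (index == 0) return 4;
    if (index == 1) return 8;
    return 15;
}

// Derived column body: sum of all elements in an int_set column.
// Indexes run from 0 to set_size - 1.
long sum_items(long set_column) @safe pure
{
    long total = 0;
    foreach (i; 0 .. set_size(set_column))
    {
        total += get_item(set_column, i);
    }
    return total;
}
```

Because import does not de-duplicate set items, a loop like this counts repeated elements once per occurrence.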