
Data types reference

This applies to v2.23

This reference guide enumerates each of the data types supported by Interana, and explains how Interana determines which data type to assign at ingest time.

Data Types

These are the core data types you will see in the Interana query UI.

| Data Type | Aggregations | Grouping | Filter Operators | Value Typeahead |
|---|---|---|---|---|
| Identifier | Count Unique, First, Last | Yes | is one of, is not one of | No |
| Integer / Decimal | Count Unique, First, Last, Min, Max, Sum, Average, Median, Percentile | Defaults to No, can be configured to Yes by admin | is one of, is not one of, is less than (<), is greater than (>), is less than or equal to (<=), is greater than or equal to (>=) | No |
| Integer Set / Decimal Set | | Defaults to No, can be configured to Yes by admin | set contains, set does not contain | No |
| String | Count Unique, First, Last | Yes | is one of, is not one of, text contains, starts with, ends with | Yes |
| String Set | | Yes | set contains, set does not contain | Yes |
| Time | Count Unique, First, Last, Min, Max, Sum, Average, Median, Percentile | No | is one of, is not one of, is less than (<), is greater than (>), is less than or equal to (<=), is greater than or equal to (>=) | No |

Expansion Types

These do not appear as first-class data types in the Interana query UI, but at ingest time we apply expansion rules based on them.

| Expansion Type | Data Type of Main Column | Generated Subcolumns |
|---|---|---|
| IP Address | String | city, region, country, continent |
| URL | N/A (not loaded into Interana by default) | scheme, hostname, path, filename, query, params, fragment |
| User Agent | N/A (not loaded into Interana) | browser, device, platform, browser_majorver, browser_minorver, browser_patchver |
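
To make the URL expansion concrete, here is a minimal Python sketch that splits a URL into the seven subcolumns listed above. It uses the standard library's `urllib.parse`, which is not necessarily how Interana parses URLs. The function name `expand_url` is hypothetical, and treating `filename` as the last path segment is an assumption.

```python
import posixpath
from urllib.parse import urlparse


def expand_url(url):
    """Split a URL into the subcolumn names listed for the URL expansion type.

    Illustrative only: Interana's own parser may differ in edge cases.
    """
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname or "",
        "path": parts.path,
        # Assumption: "filename" is the final segment of the path.
        "filename": posixpath.basename(parts.path),
        "query": parts.query,
        "params": parts.params,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    for name, value in expand_url("http://www.site.com/landing/").items():
        print(name, "=", repr(value))
```

With the sample URL used later in this guide, the path ends in a slash, so the sketch yields an empty `filename`.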

Ingest Time Auto-Transformations

At ingest time, we automatically perform the following transformations on the raw JSON data before applying any data type recognition rules or expansions.

| Transformation Rule | Original JSON | Resulting JSON |
|---|---|---|
| Flatten Nested JSON Objects | {"column": {"a": 1, "b": "xxx"}} | {"column.a": 1, "column.b": "xxx"} |
| Shred Arrays of JSON Objects | {"column": [{"a": 1,"b": "zzz"}, {"a": 2,"b": "yyy"}]} | {"column.a": [1,2], "column.b": ["zzz","yyy"]} |
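
The two transformations above can be sketched in a few lines of Python. This is an illustration of the documented input/output pairs only, not Interana's implementation. The function name `auto_transform` is hypothetical, and how objects nested inside array elements, or arrays that mix objects with other values, are handled is an assumption.

```python
def auto_transform(event, prefix=""):
    """Flatten nested objects and shred arrays of objects into dotted columns."""
    out = {}
    for key, value in event.items():
        name = prefix + key
        if isinstance(value, dict):
            # Flatten: {"column": {"a": 1}} -> {"column.a": 1}
            out.update(auto_transform(value, prefix=name + "."))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # Shred: collect each sub-key across the array into its own list.
            # Assumption: values inside the array elements are kept as-is.
            for item in value:
                for sub_key, sub_value in item.items():
                    out.setdefault(name + "." + sub_key, []).append(sub_value)
        else:
            out[name] = value
    return out


if __name__ == "__main__":
    print(auto_transform({"column": {"a": 1, "b": "xxx"}}))
    print(auto_transform({"column": [{"a": 1, "b": "zzz"}, {"a": 2, "b": "yyy"}]}))
```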

Ingest Time Data Type Recognition Rules For JSON Number Columns

At ingest time, when we see a column for the first time and its value is a JSON Number, we detect the data type of the new column by applying the following matching rules in the order listed (highest precedence first):

| Parsing Rule | Raw Data | Data Type | Rule Details |
|---|---|---|---|
| Detect Time | {"abc" : 1448933490} | Time | Interana attempts to interpret JSON integers as epoch timestamps in seconds, milliseconds, or microseconds. |
| Detect Integer | {"abc" : 12345} | Integer | Simple JSON integers are interpreted as Integers by Interana. |
| Detect Decimal | {"abc" : 12345.98} | Decimal | |
| Detect Integer Set | {"abc" : [12345, 245, 99834]} | Integer Set | |
| Detect Decimal Set | {"abc" : [12345.98, 245.2, 99834]} | Decimal Set | |

Based on the order of precedence, if there is ambiguity about whether a column value can be interpreted as an epoch timestamp or an int, Interana will interpret it as an epoch time value. 
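
The precedence order can be illustrated with a short Python sketch. The rule order follows the table above, but the test for "looks like an epoch timestamp" is an assumption: this guide does not state which ranges Interana accepts, so the sketch uses a hypothetical window of 2000-01-01 to 2100-01-01 (UTC). The function names are also hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical plausibility window; the real bounds are not documented here.
_EPOCH_MIN = datetime(2000, 1, 1, tzinfo=timezone.utc).timestamp()
_EPOCH_MAX = datetime(2100, 1, 1, tzinfo=timezone.utc).timestamp()


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def looks_like_epoch(value):
    """True if value falls in the window as seconds, milliseconds, or microseconds."""
    return any(_EPOCH_MIN <= value / scale < _EPOCH_MAX for scale in (1, 1_000, 1_000_000))


def detect_number_type(value):
    """Apply the rules in precedence order; the first match wins."""
    if _is_int(value) and looks_like_epoch(value):
        return "Time"
    if _is_int(value):
        return "Integer"
    if isinstance(value, float):
        return "Decimal"
    if isinstance(value, list) and value:
        if all(_is_int(v) for v in value):
            return "Integer Set"
        if all(_is_int(v) or isinstance(v, float) for v in value):
            return "Decimal Set"
    return None  # Not covered by the JSON Number rules.


if __name__ == "__main__":
    for sample in (1448933490, 12345, 12345.98, [12345, 245, 99834], [12345.98, 245.2, 99834]):
        print(sample, "->", detect_number_type(sample))
```

Because the Time rule is checked before the Integer rule, any integer that falls inside the window is classified as Time, which mirrors the ambiguity described above.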

Ingest Time Data Type Recognition Rules For JSON String Columns

At ingest time, when we see a column for the first time and its value is a JSON String, we detect the data type of the new column by applying the following matching rules in the order listed (highest precedence first):

| Parsing Rule | Raw Data | Data Type | Rule Details |
|---|---|---|---|
| Detect Time From JSON String | {"abc" : "2015-11-30 08:09:12"} | Time | Interana attempts to interpret JSON strings as timestamps using approximately 40 different format strings (including ISO-8601). |
| Detect Identifier From JSON String | {"abc" : "e41249ed-2398-4c29-a6fa-ee81116dd302"} | Identifier | Interana attempts to interpret JSON strings as hexadecimal identifiers, including some common UUID formats. Note that non-hex characters (like hyphens or dots) are stripped out of the resulting data. |
| Detect Integer From JSON String | {"abc" : "12345"} | Integer | |
| Detect Decimal From JSON String | {"abc" : "12345.98"} | Decimal | |
| Detect Decimal From JSON String With Dollar Sign | {"abc" : "$12,345.98"} | Decimal | Note that dollar signs and commas are stripped out of the resulting data. |
| Detect String Set | {"abc" : ["hello", "goodbye", "nice", "to", "see", "you"]} | String Set | |
| Detect URL | {"abc" : "http://www.site.com/landing/"} | URL | |
| Detect IP Address | {"abc" : "127.0.0.1"} | IP Address | |
| Detect User Agent | {"abc" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0)"} | User Agent | Interana attempts to interpret JSON strings as User Agents using a regex matching scheme. |
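
Two of the rules above change the stored value, not just its type. The Python sketch below illustrates only that stripping behavior. It does not reproduce Interana's detection logic, and the function names are hypothetical. The exact stored representation (for example, letter case or numeric precision) is an assumption.

```python
import re


def normalize_identifier(raw):
    """Drop every non-hex character, e.g. the hyphens in a UUID."""
    return re.sub(r"[^0-9a-fA-F]", "", raw)


def normalize_dollar_decimal(raw):
    """Drop dollar signs and commas, then parse the remainder as a decimal."""
    return float(raw.replace("$", "").replace(",", ""))


if __name__ == "__main__":
    print(normalize_identifier("e41249ed-2398-4c29-a6fa-ee81116dd302"))
    print(normalize_dollar_decimal("$12,345.98"))
```

If the original formatting matters to you (for example, a hyphenated UUID), the `force_hexn_to_string` and `strict_number_detection` settings described in the next section keep such values as Strings.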

Data Type Detection Settings For New Columns

Interana allows administrators to change the default data type detection settings using the Interana CLI.  Note that these settings only affect how Interana interprets new columns; once a column has been scanned by Interana and its data type has been assigned, these settings will not modify the column. The main command is:

ia settings update purifier <Setting> <Value>

and the exact settings available are:

| Setting | Value | Effect |
|---|---|---|
| strict_number_detection | 1 | Do not interpret JSON strings as Integer or Decimal. |
| force_url_to_string | 1 | Do not interpret JSON strings as URLs. |
| add_full_and_parsed_url | 1 | In addition to expanding the pieces of the URL, also store the original full URL as a String column. (This takes extra space.) |
| force_geo_to_string | 1 | Do not interpret JSON strings as IP addresses. |
| force_useragent_to_string | 1 | Do not interpret JSON strings as User Agents. |
| force_hexn_to_string | 1 | Do not interpret JSON strings as hex identifiers. |

Note that the settings "add_full_and_parsed_url" and "force_url_to_string" are mutually exclusive.
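
If you script these changes, it can help to check the mutual-exclusion rule before running anything. The Python sketch below only builds the argument lists for the `ia settings update purifier <Setting> <Value>` command shown above; it does not invoke the CLI. The helper name `build_purifier_commands` is hypothetical and is not part of the Interana CLI.

```python
KNOWN_SETTINGS = {
    "strict_number_detection",
    "force_url_to_string",
    "add_full_and_parsed_url",
    "force_geo_to_string",
    "force_useragent_to_string",
    "force_hexn_to_string",
}
MUTUALLY_EXCLUSIVE = {"add_full_and_parsed_url", "force_url_to_string"}


def build_purifier_commands(enabled):
    """Return one argv list per setting to enable, each with value 1."""
    enabled = list(enabled)
    unknown = [name for name in enabled if name not in KNOWN_SETTINGS]
    if unknown:
        raise ValueError("unknown settings: " + ", ".join(unknown))
    if MUTUALLY_EXCLUSIVE.issubset(enabled):
        raise ValueError("add_full_and_parsed_url and force_url_to_string are mutually exclusive")
    return [["ia", "settings", "update", "purifier", name, "1"] for name in enabled]


if __name__ == "__main__":
    for argv in build_purifier_commands(["strict_number_detection", "force_geo_to_string"]):
        print(" ".join(argv))
```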

What's Next?

To get more familiar with how Interana interprets your raw data, see "How To Learn How Interana Handles Data Types At Ingest Time".
