Set up Interana using an AMI

Welcome to Interana!

This article helps you set up an Interana instance from an Amazon Machine Image (AMI) and onboard data into it.

Provisioning Interana

Provision Interana using AWS as follows:

  1. In your AWS Management Console, click on the Services tab, then click EC2.
  2. Under Create Instance, click Launch Instance.
  3. Under Quick Start, click My AMIs > Shared with me.
  4. Find the Interana AMI and click Select.
  5. Under Choose an Instance Type, choose r5d.xlarge. This machine type was chosen carefully, so please use it. Click Next: Configure Instance Details.
  6. Accept the default values for Steps 3 and 4 (Configure Instance Details and Add Storage). In Step 5: Add Tags, you can optionally add tags. Otherwise, click Next: Configure Security Group.
  7. In Step 6, select an existing security group or create a new security group, then click Review and Launch.
  8. Double check that you have an r5d.xlarge, that you’ve associated the instance with the proper security group, and that you have the proper tags associated with the instance. Then click Launch.

Getting Started

Now that you’ve provisioned Interana, let’s get started with using it. You’ll need to SSH into your AWS instance as the “ubuntu” user, using the key from your AWS account and the public DNS for your instance. That is:

> ssh -i <ssh-key-filename> ubuntu@ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com

You can administer Interana with the program ia. Try this to start:

> ia --help

The --help flag also works for specific commands, for example:

> ia user --help
> ia user create --help

In fact, ia user create is the first command you'll want to run, because it creates a user for the Interana UI:

> ia user create <email> <password>

Next, you'll probably want to make this user an admin:

> ia user update <email> --add-role admin

Accessing the UI

To access the UI:

  1. Using a web browser, visit the public IP/DNS of your instance.
  2. Proceed through the security warning. This message appears because there is no certificate installed. For more information on installing a certificate, see the Advanced section.
  3. Log in using the username and password you created in Getting Started.

You’ll find that an "events" table has already been created, but it does not yet have any data. To import data into your Interana instance, see the next section, Importing Data.

Importing Data

To import data:

  1. Get a file onto your Interana server.
  2. Move it to the directory /data/events.

This puts it in a table in Interana called "events", which has already been created.

The import takes a minute or two if the file is smaller than a few million records, and longer if it's larger than that.

There's also a sample file, ~/sample_data.json, for your use and reference. To load the sample data into Interana, run:

> mv ~/sample_data.json /data/events/

Data format

The data must be in a specific format.

It must be newline-delimited JSON (jsonl): a text file in which each line is a JSON dictionary representing a single event.

Each JSON event must have a field called "time" in lowercase. The times must be in epoch milliseconds (that is, milliseconds since midnight January 1, 1970 UTC).

Each JSON event should also have a field called "actor" in lowercase that describes who or what is performing the action. This is typically a user id, username, or device id. The actor field is the one on which you'll be able to ask behavioral questions, so it should be the thing whose behavior you care about.
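As an illustration of the format described above, here is a minimal Python sketch that turns an event into one jsonl line. The helper names (to_epoch_ms, to_jsonl_line) and the extra "action" field are hypothetical and are not part of Interana.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(when):
    """Convert a timezone-aware datetime to integer epoch milliseconds."""
    # Integer division of timedeltas avoids floating-point rounding.
    return (when - EPOCH) // timedelta(milliseconds=1)


def to_jsonl_line(actor, when, **fields):
    """Build one JSON dictionary with the "time" and "actor" fields."""
    event = {"time": to_epoch_ms(when), "actor": str(actor), **fields}
    return json.dumps(event)


line = to_jsonl_line(
    "user_42",
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    action="login",  # hypothetical extra field
)
print(line)
```

Writing one such line per event, each terminated by a newline, produces a file in the jsonl format.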

Data format summary

To recap:

  •  Format your data as jsonl: each line is a JSON dictionary that represents an event.
  •  Make sure each line has an "actor" that's a string, and a "time" in epoch milliseconds.
  •  Put the file in /data/events/ on your Interana instance.
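Before moving a file into /data/events/, it can help to check each line against the recap above. The following is a sketch of such a check, not an Interana tool; check_line is a hypothetical helper, and it only tests the rules stated in this article.

```python
import json


def check_line(line):
    """Return a list of problems found in one jsonl line (empty if none)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(event, dict):
        return ["not a JSON dictionary"]
    problems = []
    time_value = event.get("time")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(time_value, int) or isinstance(time_value, bool):
        problems.append('"time" missing or not an integer')
    if not isinstance(event.get("actor"), str):
        problems.append('"actor" missing or not a string')
    return problems


print(check_line('{"time": 1609459200000, "actor": "user_42"}'))
print(check_line('{"time": "2021-01-01", "actor": 42}'))
```

Running check_line over every line of a file and reporting the line numbers with non-empty results gives a quick pre-import sanity check.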

Look at the results

Once you import some data, look in the UI, and you should see your data! For help running queries and creating boards, check out the tutorial.

If you don't see what you expect, debug with the following command:

> ia table status events

This gives you information about what data has been ingested to the table.

Possible import problems and solutions include:

Nothing shows up at all. This means Interana did not find a file. Double check that you dropped the files in the correct directory, /data/events/.

Files with a status of ERROR. This means Interana failed with an entire file. Double check that your source files are text files (not compressed!) with one JSON dictionary per line.

Files with 0 parsed lines. This means that Interana wasn't able to parse any lines in the file. Double check that your source files are text files (not compressed!) with one JSON dictionary per line.

Files with a "lines without timestamp" count greater than zero. Double check that you have a column called “time” (lowercase) that's an integer representing the number of milliseconds (not seconds!) since the epoch.

Warning messages in the “Had Warnings” field. Interana can have problems with any other value in the row. This doesn't stop it from importing the row, but the problematic value might not show up. For more information, inspect the end of the status output for some examples of the problems.

You see data but it's not in a format you like. You can change the format of your data and try again. See Iterate below.
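Two of the problems listed above, compressed files and timestamps in seconds, can be spotted before importing. The sketch below is a rough heuristic, not Interana's own check: both function names are hypothetical, and the seconds test assumes your events are more recent than early 1973 (epoch milliseconds below 10**11 fall before that date).

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def looks_gzipped(first_bytes):
    """Return True if the data starts with the gzip magic bytes."""
    return first_bytes[:2] == GZIP_MAGIC


def looks_like_seconds(timestamp):
    """Heuristic: small positive values are probably seconds, not milliseconds."""
    return 0 < timestamp < 10**11


compressed = gzip.compress(b'{"time": 1609459200000, "actor": "u1"}\n')
print(looks_gzipped(compressed))
print(looks_like_seconds(1609459200))
```

To apply looks_gzipped to a real file, read its first two bytes in binary mode and pass them in.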

If you really want the gory details, this is the raw log:

> journalctl -u import-pipeline -f

Iterate

The first time you load your data, it's usually not exactly how you want it. The easy way to iterate with the evaluation version of Interana is to delete and recreate the table and the source data as follows:

 

  1. Drop the Interana table:

> ia table drop <table_name>

  2. Recreate the table:

> ia table create

This recreates the default table that Interana shipped with. If you’ve created a fancier table definition (see Advanced below), then you’ll want to repeat that table create command here, not the default command.

In addition, any data that was in /data/events/ will be moved into /data/archive/.

  3. Drop your new files into /data/events/, and Interana will see them and ingest them. This takes a minute or two if the data is small (a few million rows) or longer if it is larger.

Feel free to do this as many times as you like. If you’d like to keep around one version of a table while you try a new one, simply omit the drop command, and read the Advanced section below to learn more about creating tables.

Advanced

The command ia table create has a number of additional options that might be useful.

To explain them, first some background. A table in Interana has two columns that are special: the time column and the actor column(s).

The time column is used to place the event on the timeline for simple things like count per day, and more complicated ones like time between two steps in a flow. The same column must be used for time in all events, and it must have the same format.

The default time column is called “time” and is formatted as milliseconds since Jan 1, 1970, but both the column name and format can be overridden. You can change the column name with the -t flag to ia table create, and the format with the -f option. Valid options for format are ‘seconds’, ‘milliseconds’, ‘microseconds’, or a Python strptime format such as %Y-%m-%dT%H:%M:%S.%fZ.
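To see which timestamps a strptime format matches, you can try it in Python directly. This sketch uses the example format above and converts the result to epoch milliseconds for comparison. It assumes the trailing Z means UTC; how Interana itself treats time zones is not covered here.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# strptime matches the literal "Z" but returns a naive datetime,
# so the UTC time zone is attached explicitly (an assumption).
parsed = datetime.strptime("2021-01-01T00:00:00.250000Z", TIME_FORMAT)
parsed = parsed.replace(tzinfo=timezone.utc)

epoch_ms = (parsed - EPOCH) // timedelta(milliseconds=1)
print(epoch_ms)
```

If strptime raises ValueError for a sample of your timestamps, the format string does not match your data.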

The actor column is the column whose behavior you care about. This is typically an ID or name for a user or device, or maybe something more abstract like a piece of content. Some of the fancier behavioral features in Interana (such as flows) are limited to working only with an actor column. You can have more than one actor, but typically it’s fewer than five.

The default actor column is called “actor” and is of type “string”. You can override this with the -a flag to ia table create. This needs both the column name and the type, which is either string or int. If you want more than one actor column, pass a separate -a for each. For example, if you have a string column user_id and an integer column anonymous_id, you would use this command:

> ia table create my_data -a user_id string -a anonymous_id int

It's also worth noting that you can make a table pointing to a directory that already has data in it, and Interana will import that data.

So if you want to change your table definition but not your source data, just drop and recreate the Interana table without touching the source data.

Regarding data types, Interana cares whether fields are numbers or strings, and it uses the JSON type to determine that.

So if you have a field "my_number": 123, it will show up as a number and you can use it to compute things, but if you have "my_number": "123", Interana will interpret it as a string and you'll only be able to do string operations (like substring match) with it.
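The difference between the two forms is visible when the JSON is parsed. The sketch below shows it in Python and adds a hypothetical helper, coerce_numeric, that converts named string fields to numbers before you write the file. It is one possible cleanup step, not something Interana does for you.

```python
import json

as_number = json.loads('{"my_number": 123}')
as_string = json.loads('{"my_number": "123"}')
print(type(as_number["my_number"]).__name__)  # int
print(type(as_string["my_number"]).__name__)  # str


def coerce_numeric(event, fields):
    """Return a copy of event with the named string fields made numeric."""
    result = dict(event)
    for name in fields:
        value = result.get(name)
        if isinstance(value, str):
            try:
                result[name] = int(value)
            except ValueError:
                try:
                    result[name] = float(value)
                except ValueError:
                    pass  # leave non-numeric strings untouched
    return result


print(json.dumps(coerce_numeric(as_string, ["my_number"])))
```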

Simple nesting of dictionaries gets flattened in Interana. That is,

{"foo": {"a":1, "b":2}}

will give you two columns, foo.a and foo.b.
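To preview the column names that nesting produces, the flattening can be imitated in a few lines of Python. This is an illustration of the naming described above, not Interana's actual implementation, and flatten is a hypothetical helper.

```python
def flatten(event, prefix=""):
    """Flatten nested dictionaries into dotted column names."""
    flat = {}
    for key, value in event.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse, extending the prefix with the parent key and a dot.
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


print(flatten({"foo": {"a": 1, "b": 2}}))
```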

Interana can have columns that are lists of things, so if you have a list of simple types in your JSON, Interana will probably do what you expect. For example:

{"list_of_ints": [ 1, 2, 3], "list_of_strings": ["foo", "bar", "baz"]}

This JSON works well. It gives you two columns in Interana, of type integer list and string list.

Interana does not import lists of dictionaries well, however. For example:

{"list_of_objects": [{"a":1, "b":2}, {"a":3, "c":4}]}

Interana will still import this, but you won't love the result. You'll get columns list_of_objects.a, list_of_objects.b, and so on, and they are difficult to query.

For help installing a certificate, see https://www.thesslstore.com/knowledgebase/ssl-install/nginx-ssl-installation/

If you'd like to install the AWS CLI to download files from S3 and put them in /data/events, see https://linuxhint.com/install_aws_cli_ubuntu/

Configuring BQL

To query your data from outside of the Interana server or the UI, try using BQL. BQL is the Interana behavioral query language that wraps around the Interana query model. It’s similar to SQL but has some key differences. For more information about the syntax, see BQL syntax and usage.

To get started, you’ll need a token. In your browser, log into Interana. Then visit

https://<public_ip>/api/create_token

Copy your token and put it in a safe and secure place. Then from a terminal or a script, you can use the token to run BQL queries. Here’s an example curl that counts all events from the beginning of time until now. 

> curl -k -H "Authorization: Token vVnACM3xGPnmxEXse5A8dYVVxyl/YOpXlzYNZwRXgUDqveKpme+rfuFrFVMcZ8euJccQMm7kMstijL2kG+YNqxsDvb2e0000" https://<public_ip>/v1/query -d '{"bql": "select count(*) from <table_name> between beginning_of_time and now"}'
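From a script, the same request can be sketched with Python's standard library. The endpoint path, header, and body mirror the curl example; the function names are hypothetical, and the response format is not described in this article, so the sketch returns the raw response text. Like curl -k, run_query skips certificate verification, which is only appropriate while no certificate is installed.

```python
import json
import ssl
import urllib.request


def build_query_request(host, token, bql):
    """Build a POST request equivalent to the curl example."""
    body = json.dumps({"bql": bql}).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}/v1/query",
        data=body,
        headers={"Authorization": f"Token {token}"},
        method="POST",
    )


def run_query(host, token, bql):
    """Send the query and return the raw response text."""
    # Equivalent of curl -k: do not verify the server certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = build_query_request(host, token, bql)
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


# Placeholder host and token; nothing is sent until run_query is called.
request = build_query_request(
    "203.0.113.10",
    "YOUR_TOKEN",
    "select count(*) from events between beginning_of_time and now",
)
print(request.full_url)
```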
