Query API reference
Interana’s API gives you a way to extract summarized and aggregated data for use in downstream processes, data warehouses, dashboards, or reports. For some customers, Interana is one of several analytics processes they run and consolidation of these various analyses in a single report or dashboard is important. Other customers need aggregated data from Interana to feed into separate processes. With the advent of the API, users can now run queries outside of the Interana front end.
The Interana external API is a REST API that allows integration with Interana outside of the standard interface. The API is deployed automatically as part of an Interana cluster installation. The first version of the API provides basic functionality for single measurement and time series queries.
Single Measurement Queries
Single measurement queries are queries that return a single result set. An example of a single measurement query is a Table view query that extracts a site’s top users based on the count of user events. Only one table of results is returned for the given time range.
Time series queries are queries that return a result set for every data point in the query’s time range. In Interana Explorer, time series queries are rendered using the Time view.
Endpoints
The API currently supports one endpoint: query. The query endpoint allows the user to make queries against Interana. The URL path to the query endpoint is:
https://<cluster hostname>/api/v1/query
The query endpoint must be accessed with the HTTP GET method.
Authentication and authorization
All requests to the API must be made over SSL (the HTTPS protocol).
The API uses a token-based authentication model. Tokens can be created or revoked by Interana support, or you can generate your own API tokens. Tokens must be passed in the Authorization header of each request to the API in the following format:
Authorization: Token <token>
Every user account is authorized to make requests to the API, as long as they use a valid request token.
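As a minimal sketch of the header format above, the following uses Python's standard library to construct (but not send) a request. The hostname and token are placeholders, and `build_request` is a hypothetical helper, not part of the API.

```python
import urllib.request


def build_request(url: str, token: str) -> urllib.request.Request:
    """Return a GET request carrying the API token; nothing is sent here."""
    if not url.lower().startswith("https://"):
        raise ValueError("the API only accepts requests over HTTPS")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Token {token}"},
        method="GET",
    )


# Placeholder hostname and token, for illustration only.
request = build_request("https://cluster.example.com/api/v1/query", "my-api-token")
```

Passing the resulting object to `urllib.request.urlopen` would issue the call.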
Requests
Requests to the query endpoint must be sent with the GET HTTP method. The required query parameter defines the query to be executed on Interana. It is a JSON object that is URL-encoded and passed as a parameter to the request.
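The encoding step can be sketched as follows, assuming the parameter is literally named `query` as described above. The hostname is a placeholder and `build_query_url` is a hypothetical helper.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(hostname: str, query: dict) -> str:
    """Serialize the query object to JSON and URL-encode it as the `query` parameter."""
    encoded = urlencode({"query": json.dumps(query, separators=(",", ":"))})
    return f"https://{hostname}/api/v1/query?{encoded}"


# Placeholder hostname and a minimal query object, for illustration only.
url = build_query_url(
    "cluster.example.com",
    {
        "dataset": "Music",
        "start": 1461913200000,
        "end": 1461956400000,
        "queries": [{"type": "time_series", "measure": {"aggregator": "count_star"}}],
    },
)

# Round-trip check: decoding the parameter yields the original object.
decoded = json.loads(parse_qs(urlsplit(url).query)["query"][0])
```

Using `urlencode` rather than manual string concatenation keeps quotes, braces, and backticks in filters safely escaped.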
Object format
The query object has the following format:
query
Name | Type | Required | Description |
---|---|---|---|
dataset | string | yes | The name of the dataset that you want to use. |
start | int | yes | The start time for the query. Represented as milliseconds since UNIX Epoch time. |
end | int | yes | The end time for the query. Represented as milliseconds since UNIX Epoch time. |
timezone_offset | int | no | Offset from UTC in milliseconds, used for day alignment. Defaults to the configuration of your Interana instance, or to Pacific Daylight Time if the instance is not configured (PDT = -7 h × 60 min/h × 60 s/min × 1000 ms/s = -25200000 ms). |
queries | array | yes | A list of objects containing details about the query. For most basic queries, this list will only contain one element (see queries). |
group_by | array | no | An array of strings listing the columns to group by. Applies to all query objects. The response returns the columns in the same order that you specify them. |
max_groups | int | no | The number of groups to return if group_by is specified. Defaults to 10. |
order_by | string | no | Specify the order in which results are returned. See order_by for more information. |
sampled | boolean | no | Whether to run a sampled query. The default is true. |
compute_all_others | boolean | no | When group_by is specified, whether to compute the "All others" group. Defaults to false. |
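For the start, end, and timezone_offset fields, a sketch of the conversion to milliseconds might look like this. The helper names are hypothetical, and the fixed -7 hour zone is an assumption that ignores daylight-saving transitions.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MS = timedelta(milliseconds=1)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the UNIX epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return (moment - EPOCH) // ONE_MS


def offset_ms(tz: timezone) -> int:
    """Return a fixed UTC offset in milliseconds."""
    return tz.utcoffset(None) // ONE_MS


PDT = timezone(timedelta(hours=-7))  # fixed offset; ignores daylight-saving changes

start = to_epoch_ms(datetime(2016, 4, 25, tzinfo=PDT))  # 1461567600000
end = to_epoch_ms(datetime(2016, 4, 30, tzinfo=PDT))    # 1461999600000
timezone_offset = offset_ms(PDT)                        # -25200000
```

Integer division by a one-millisecond timedelta avoids floating-point rounding in the result.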
"Type" refers to the JSON type of the property. See http://www.json.org for more information.
queries
Name | Type | Required | Description |
---|---|---|---|
type | string | no | The type of query to run. Select single_measurement (the default value) or time_series. |
measure | object | yes | An object defining an aggregation to measure (see measure). |
filter | string | no | Filters to apply to the query. This uses the Advanced filter syntax. |
You can use the advanced filter syntax with the API to reference per-actor metrics in queries. The API does not support ratio metrics.
measure
Name | Type | Required | Description |
---|---|---|---|
aggregator | string | yes | The aggregation to measure. One of: "count_star", "unique_count", "sum", "avg", "min", "max", "P1", "P5", "P10", "P25", "P50", "P75", "P90", "P95", "P99" |
column | string | yes | The column that you want to measure. |
order_by
Name | Type | Required | Description |
---|---|---|---|
type | string | no | Specify what you want to order by. Select measure to order by the query measure, or select count_star to order by the count of events. |
direction | string | no | Specify the sort order, either asc (ascending) or desc (descending). |
Responses
The API returns the HTTP status code 200 for successful requests to the query endpoint, along with a JSON object containing the results of the query.
Object format
The query result object format is:
results
Name | Type | Description |
---|---|---|
columns | array | An array of column objects |
rows | array | An array of row objects |
columns
Name | Type | Description |
---|---|---|
label | array or string | A description of the column |
type | string | The type of the data in the column: "array", "number", or "time_series". See Data types for more information. |
rows
Name | Type | Description |
---|---|---|
values | array | The data corresponding to the defined columns. The order of elements in this array corresponds to the order of elements in the columns array. The length of this array will always equal the length of the columns array. The type of data will match the type defined in the corresponding column object (see Data types). |
properties | object | A map of properties for the result. For time_series queries, this includes information about the time bucketing used to calculate the result. If no properties are applicable, this field will be omitted. See row properties, below. |
row properties
Name | Type | Description |
---|---|---|
rate | string | The rate setting, such as "day", "week", or "month" |
resolution | int | The time between data points, in milliseconds |
window | int | The length of the time window, in milliseconds. The window must be greater than or equal to the resolution setting. |
Data types
The type property of column objects describes the type of data that will appear in each row. The following table describes the JSON format of the possible type values.
Name | Type | Description |
---|---|---|
number | number | |
array | array | |
time_series | array | An array of time_series objects |
time_series
Name | Type | Description |
---|---|---|
timestamp | int | The timestamp of the data point in milliseconds since UNIX Epoch time |
value | number | The value of the data point |
properties | object | See time_series properties |
time_series properties
Name | Type | Description |
---|---|---|
event_count | int | The number of events used to compute the value. This is the number of events scanned in the time window. For unsampled queries, this should equal the number of events that exist in that particular time window. |
object_count | int | The number of unique objects used to compute the value. |
Examples: single measurement and time series
Single measurement queries
In this single measurement example, the query is looking for the number of unique userids, grouped by artist, from April 25, 2016 to April 30, 2016.
The same query can be executed in the API with the following request and response calls. Note that start and end times are specified as milliseconds since epoch and timezone_offset is relative to GMT.
Request: single measurement
{ "dataset": "Music", "start": 1461567600000, "end": 1461999600000, "timezone_offset": -25200000, "queries": [ { "type": "single_measurement", "measure": { "aggregator": "unique_count", "column": "userId" }, "filter": "(`artist` != \"*null*\")" } ], "sampled": true, "group_by": ["artist"], "max_groups": 5, "compute_all_others": false }
Response: single measurement
{ "rows": [ {"values": [ [ "3 Doors Down"], 31456] }, {"values": [ [ "Justin Bieber"], 31336] }, {"values": [ [ "OneRepublic"], 31772] }, {"values": [ [ "Taylor Swift"], 30136] }, {"values": [ [ "The White Stripes"], 27036] } ], "columns": [ {"type": "array", "label": ["artist"] }, {"type": "number", "label": "measure_value"} ] }
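To work with a result like this one, the values in each row can be paired with the column labels. The sketch below assumes the two-column layout shown in the example (an array of group values followed by a number); `rows_as_dicts` is a hypothetical helper and the response is abbreviated to two rows.

```python
import json

# The single measurement response from the example, abbreviated to two rows.
response_text = """
{ "rows": [ {"values": [ ["3 Doors Down"], 31456] },
            {"values": [ ["Justin Bieber"], 31336] } ],
  "columns": [ {"type": "array", "label": ["artist"]},
               {"type": "number", "label": "measure_value"} ] }
"""


def rows_as_dicts(result: dict) -> list:
    """Pair each row's values with the group-by labels and the measure label."""
    group_col, measure_col = result["columns"]
    records = []
    for row in result["rows"]:
        group_values, measure = row["values"]
        record = dict(zip(group_col["label"], group_values))
        record[measure_col["label"]] = measure
        records.append(record)
    return records


records = rows_as_dicts(json.loads(response_text))
```

Each record is then a flat dictionary keyed by column label, which is convenient for loading into a report or data frame.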
Time series queries
If the single measurement query example used above is issued as a time series query, each x-axis point in the query time range returns a count of each user’s events for that given time window. In other words, the query returns a separate result set for each point in the x-axis.
The time series query in the example below looks for the number of events between April 29, 2016 12:00 am and April 29, 2016 12:00 pm.
The same query can be executed using the API with the request call below. You can also view the corresponding response. Note that the response has been abbreviated given the large number of results.
In the request call, the query start and end times are specified in milliseconds since epoch, and timezone_offset, also specified in milliseconds, is relative to GMT. Finally, the sampled flag indicates whether to use sampling when running the query.
Request: time series
{ "dataset": "Music", "start": 1461913200000, "end": 1461956400000, "timezone_offset": -25200000, "queries": [ { "type": "time_series", "measure": {"aggregator": "count_star"} } ], "sampled": true }
Response: time series
{ "rows": [ {"values": [ [ "All"], [ {"timestamp": 1461913200000, "properties": {"object_count": 25243, "event_count": 25243}, "value": 8414.333333333332}, ... {"timestamp": 1461934800000, "properties": {"object_count": 20977, "event_count": 20977}, "value": 6992.333333333333}, ... {"timestamp": 1461956400000, "properties": {"object_count": 27581, "event_count": 27581}, "value": 9193.666666666666} ... ... ] ], "properties": {"window": 43200000, "resolution": 21600000, "rate": "minute"} } ], "columns": [ {"type": "array", "label": ["result"] }, {"type": "time_series", "label": "measure_value"} ] }
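A time series row can be flattened into individual data points along these lines. This is a sketch that assumes the row shape shown in the example; `series_points` is a hypothetical helper, and only two of the data points are reproduced.

```python
from datetime import datetime, timezone

# One row in the shape of the time series response, with two of its data points.
row = {
    "values": [
        ["All"],
        [
            {"timestamp": 1461913200000,
             "properties": {"object_count": 25243, "event_count": 25243},
             "value": 8414.333333333332},
            {"timestamp": 1461956400000,
             "properties": {"object_count": 27581, "event_count": 27581},
             "value": 9193.666666666666},
        ],
    ],
    "properties": {"window": 43200000, "resolution": 21600000, "rate": "minute"},
}


def series_points(row: dict) -> list:
    """Return (UTC datetime, value, event_count) tuples for one time series row."""
    _group, series = row["values"]
    return [
        (
            datetime.fromtimestamp(point["timestamp"] / 1000, tz=timezone.utc),
            point["value"],
            point["properties"]["event_count"],
        )
        for point in series
    ]


points = series_points(row)
```

The timestamps are converted to UTC here; apply the query's timezone_offset separately if local times are needed.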
Errors and retry
The API returns appropriate HTTP status codes for error cases and JSON objects containing error information.
Status Codes
The possible error status codes for the query endpoint are:
Code | Error type |
---|---|
400 | Malformed query parameter |
401 | Invalid authentication token |
429 | Request limit exceeded |
500 | Unexpected server error |
504 | The request timed out before the query could complete. The server-side timeout is 180 seconds. |
Object format
The JSON error object format is:
Name | Type | Description |
---|---|---|
error | string | The class of the error: “Invalid parameter”, “Invalid token”, “Request limit exceeded”, “Server error”, or “Request timed out” |
message | string | A description of the error |
Examples
400 status
{ "error": "Invalid parameter", "message": "End time must be after start time" }
429 status
{ "error": "Request limit exceeded", "message": "The request limit of 1000 queries has been reached. This token can be used for requests on 2016-03-18 00:00:00" }
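Client code might turn these error objects into exceptions, as in the sketch below. `QueryApiError` and `check_response` are hypothetical names, and the fallback for non-JSON error bodies is an assumption rather than documented behavior.

```python
import json


class QueryApiError(Exception):
    """Carries the HTTP status plus the `error` and `message` fields of the error object."""

    def __init__(self, status: int, error: str, message: str):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def check_response(status: int, body: str) -> dict:
    """Return the parsed result for a 200 response; raise QueryApiError otherwise."""
    if status == 200:
        return json.loads(body)
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}  # assumption: some intermediaries may return a non-JSON body
    raise QueryApiError(
        status,
        payload.get("error", "Unknown error"),
        payload.get("message", body),
    )


try:
    check_response(
        400,
        '{"error": "Invalid parameter", "message": "End time must be after start time"}',
    )
except QueryApiError as exc:
    caught = exc
```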
Retry
Some queries that time out in the API may be cached on the server. Retrying the API request can sometimes result in retrieving results successfully. We recommend limiting retry policies to a small number of retries to avoid excessive load on the server.
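A bounded retry policy of this kind could be sketched as follows. The limit of three attempts and the fixed five-second delay are illustrative assumptions, and `send` stands for any caller-supplied function that performs the request and returns a status code and body.

```python
import time


def run_with_retries(send, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call send() until it returns a non-504 status or the attempts run out.

    `send` is any zero-argument callable returning (status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Only timeouts are retried; other statuses are returned immediately.
        if status != 504 or attempt == max_attempts:
            return status, body
        sleep(delay_seconds)


# Simulated responses: two timeouts, then a cached result.
responses = iter([(504, "{}"), (504, "{}"), (200, '{"rows": []}')])
waits = []
status, body = run_with_retries(lambda: next(responses), sleep=waits.append)
```

Keeping `max_attempts` small follows the recommendation above to avoid excessive load on the server.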
Request limits and throttling
By default, tokens are authorized to make 1 query per second and 1000 requests per day. Once that limit has been exceeded, requests using that token will be rejected with HTTP error 429 until the next day. Contact Interana customer support to request a limit increase.
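To stay under the default limits, a client can pace its own requests. The class below is a sketch with hypothetical names. It assumes the caller resets the daily counter, since when the server's day rolls over is not specified here.

```python
import time


class QueryThrottle:
    """Client-side guard for the default limits: 1 query per second, 1000 per day."""

    def __init__(self, min_interval=1.0, daily_limit=1000,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.daily_limit = daily_limit
        self.clock = clock
        self.sleep = sleep
        self.sent_today = 0    # the caller resets this when a new day starts
        self.last_sent = None

    def wait_for_slot(self):
        """Block until a request may be sent; raise if the daily budget is spent."""
        if self.sent_today >= self.daily_limit:
            raise RuntimeError("daily request limit reached; the server would return 429")
        if self.last_sent is not None:
            remaining = self.min_interval - (self.clock() - self.last_sent)
            if remaining > 0:
                self.sleep(remaining)
        self.last_sent = self.clock()
        self.sent_today += 1
```

Calling `wait_for_slot()` immediately before each request spaces calls at least one second apart and stops before the token would start receiving 429 responses.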
Versioning
This is version 1 of the External API, indicated by the string “v1” in the URL path. This API may be expanded in future releases.
For more information
- See the Query API implementation guide to learn how to create a language-specific implementation that can be used for your custom scenario.
- See the Query API troubleshooting topic if you need help troubleshooting issues with API calls.