
Interana structured logs: Ingest monitoring

This applies to v2.24

You can use Interana structured logs to analyze Interana performance (ingest monitoring) and query usage. This document is a reference for ingest monitoring structured logs, and is organized as follows:

Common properties of ingest structured log events

  • Events from the import-pipeline can be queried with:
    • process = /opt/interana/backend/import_server/import_pipeline.py
  • Events from the purifier can be queried with:
    • process = purifier

The following tables list the common properties of structured log events for ingest pipeline and purifier.

Ingest pipeline

Each structured log event emitted by the ingest pipeline has the following properties:

  • pipeline_id
  • job_id
  • inst_id
  • table_id
  • table_name
  • continuous - 1 = continuous (runs forever), 0 = one-time import

For file-related events, each event also includes:

  • batch_id
  • original_filename
  • remote_filename_md5
  • file_size - transformed size
  • original_size - raw size
  • line_count - lines after transformation
  • lines_dropped - lines lost in the transformation phase; Note: this is only valid for pipelines using the transformer library (generators)
  • lines_total - total number of lines in the file: lines successfully transformed and lines that were dropped
  • iteration_date
  • concat_filename
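As a quick sanity check, lines_total should equal line_count plus lines_dropped. A minimal Python sketch (the event dict below is hypothetical; the field names come from the list above, the values are invented):

```python
# Hypothetical file event as emitted by the ingest pipeline; field names
# match the structured log properties above, values are made up.
file_event = {
    "batch_id": "b-001",
    "original_filename": "events_2024.json.gz",
    "original_size": 10_485_760,   # raw size in bytes
    "file_size": 8_388_608,        # transformed size in bytes
    "line_count": 49_950,          # lines after transformation
    "lines_dropped": 50,           # lines lost during transformation
    "lines_total": 50_000,         # all lines in the file
}

def check_line_accounting(event):
    """lines_total should equal transformed lines plus dropped lines."""
    return event["line_count"] + event["lines_dropped"] == event["lines_total"]

print(check_line_accounting(file_event))  # True for a consistent event
```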

Purifier

Purifier common properties include the following:

  • batch_id
  • pipeline_id
  • job_id
  • inst_id
  • table_id
  • purifier_filename - same as the concat_filename in the ingest pipeline

Key events

Unless otherwise noted, event types are found in the "event_name" field.

Ingest pipeline

File-Based Events

These events track each file's lifecycle through the ingest pipeline. For a healthy ingest, you should see one of each of the following events per file:

  • uploaded_by_customer—the time the file was made available to Interana. Interana uses the modtime of the file.
  • detected_by_interana—the time when Interana first scans the iteration date. Every file found in a given iteration date has the same detected_by_interana time.
  • get_request—the file was downloaded.
  • list_request—emitted whenever Interana makes a list request to an S3 bucket (S3 only).
  • found_files—emitted at the end of each iteration date scan; "file_count" is the number of files found overall, not just new files to import.
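The per-file events above can be used to spot files that stalled mid-pipeline. A minimal Python sketch, assuming exported log events are available as dicts with "event_name" and "original_filename" fields (the helper function and sample data are hypothetical):

```python
from collections import Counter

# Per-file lifecycle events expected exactly once for a healthy ingest
# (list_request and found_files are per-scan, not per-file).
LIFECYCLE_EVENTS = {"uploaded_by_customer", "detected_by_interana", "get_request"}

def missing_lifecycle_events(events, filename):
    """Return the lifecycle events not seen exactly once for the given file."""
    seen = Counter(e["event_name"] for e in events
                   if e.get("original_filename") == filename
                   and e["event_name"] in LIFECYCLE_EVENTS)
    return sorted(ev for ev in LIFECYCLE_EVENTS if seen[ev] != 1)

logs = [
    {"event_name": "uploaded_by_customer", "original_filename": "a.json"},
    {"event_name": "detected_by_interana", "original_filename": "a.json"},
    # no get_request for a.json -> the file was never downloaded
]
print(missing_lifecycle_events(logs, "a.json"))  # ['get_request']
```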

Stream-Based Events 

These events refer to a set of records that is flushed into the next stage of the ingest pipeline:

  • batch_closed—emitted when a batch of records from our internal bus is flushed to the transformer stage of the import pipeline

Generic Events—Apply to File-Based and Stream-Based Ingest

These events are named for files, but they also apply to streaming ingest. Each time Interana flushes a batch of records, that batch is treated as a temporary file and fires the following events:

  • file_transformation_start—about to start transforming the file
  • file_transformation_complete—transformation of the file has finished
  • purification_start—about to run the purifier on the file
  • purification_end—the purifier has completed, so the import of the file has finished. When this event appears, the file is generally considered successfully imported.

Errors

Interana has the following events for errors within the ingest pipeline:

  • transformer_failure - error in the transformation phase, for both generators and classic transformers; additional detail may appear in the "result" column
  • purifier_failure - error in the purification phase; the "error_code" column contains the purifier's return code
  • error_processing_file - general file import error; emitted in both transformer_failure and purifier_failure cases, but also catches any other errors during the file's import pipeline lifecycle; slightly more detail in the "result" column
  • insufficient_disk_space - import job cannot proceed because there is not enough disk space
  • insufficient_disk_percent - import job cannot proceed because there is not enough free disk percentage

Misc.

  • terminate_called - emitted when the exit flag is set, either pausing the job or shutting down the import-pipeline service
  • uncaught_exception - an error the pipeline could not recover from, so the job crashed; the "exception" column contains the exception

 

Purifier

Purifier ingest

Purifier ingest events:

  • purifier_start—start of the Purifier ingest.
  • purifier_finish—end of the Purifier ingest.
  • parseJsonChunk (activity_name field)—emitted when a chunk of the file has finished parsing and includes the number of lines read (lines_read) and successfully parsed (lines_parsed).
  • detectNewColumns (activity_name field)—new columns detected.

Errors

The following events include an "error_count" field that gives the number of occurrences of each error.

Time column errors to look for:

  • count_invalid_timestamp
  • count_timestamp_far_in_future
  • count_parse_error_or_exception

Conversion Function Errors

  • import_conversion_failed

Each import_conversion_failed event includes the column information and the number of failures:

  • table_id
  • column_name
  • column_type
  • conversion_function
  • conversion_function_params
  • conversion_failure_count
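To see which columns fail most often, the conversion_failure_count values can be summed per table and column. A minimal Python sketch over hypothetical import_conversion_failed events (field names from the list above; values invented):

```python
from collections import defaultdict

# Hypothetical import_conversion_failed events with the fields listed above.
events = [
    {"table_id": 7, "column_name": "price", "column_type": "decimal",
     "conversion_function": "to_decimal", "conversion_failure_count": 12},
    {"table_id": 7, "column_name": "price", "column_type": "decimal",
     "conversion_function": "to_decimal", "conversion_failure_count": 3},
    {"table_id": 7, "column_name": "ts", "column_type": "int",
     "conversion_function": "seconds_to_ms", "conversion_failure_count": 1},
]

# Sum failure counts per (table_id, column_name).
totals = defaultdict(int)
for e in events:
    totals[(e["table_id"], e["column_name"])] += e["conversion_failure_count"]

print(dict(totals))  # {(7, 'price'): 15, (7, 'ts'): 1}
```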

Useful named expressions

Metrics

The following table lists useful named expressions for metrics.

Metrics—named expressions

Files Successfully Imported

  • Aggregator
    • Count Unique: original_filename
  • Filter
    • event_name: purification_end
  • Divided By
    • Aggregator
      • Count Unique: original_filename
    • Filter
      • event_name: detected_by_interana
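Outside Interana, the same ratio can be sketched in Python, assuming the events are available as dicts (the helper name and sample data are hypothetical):

```python
def files_successfully_imported(events):
    """Ratio of unique files that reached purification_end to unique files
    detected, mirroring the named expression above."""
    finished = {e["original_filename"] for e in events
                if e["event_name"] == "purification_end"}
    detected = {e["original_filename"] for e in events
                if e["event_name"] == "detected_by_interana"}
    return len(finished) / len(detected) if detected else 0.0

events = [
    {"event_name": "detected_by_interana", "original_filename": "a.json"},
    {"event_name": "detected_by_interana", "original_filename": "b.json"},
    {"event_name": "purification_end", "original_filename": "a.json"},
]
print(files_successfully_imported(events))  # 0.5
```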

Lines Transformed

  • Aggregator
    • Sum: line_count
  • Filter
    • event_name: purification_start

New Columns Added

  • Aggregator
    • Count Events
  • Filters
    • process: purifier
    • activity_name: detectNewColumns

Purifier - Lines Parsed

  • Aggregator
    • Sum: lines_parsed
  • Filters
    • process: purifier
    • activity_name: parseJsonChunk

Purifier - Lines Read

  • Aggregator
    • Sum: lines_read
  • Filters
    • process: purifier
    • activity_name: parseJsonChunk

S3 List Request Cost

  • Aggregator
    • Count Events
  • Filter
    • event_name: list_request
  • Divided By
    • Aggregator
      • Maximum: __s3_list_request_cost_denominator__
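This expression divides the list_request event count by a fixed denominator of 200000, defined by the __s3_list_request_cost_denominator__ derived column. A minimal Python sketch of the same arithmetic (function name and sample data are hypothetical):

```python
# Fixed denominator from the __s3_list_request_cost_denominator__ derived column.
S3_LIST_REQUEST_COST_DENOMINATOR = 200_000

def s3_list_request_cost(events):
    """Count list_request events and divide by the cost denominator,
    mirroring the named expression above."""
    n = sum(1 for e in events if e["event_name"] == "list_request")
    return n / S3_LIST_REQUEST_COST_DENOMINATOR

events = [{"event_name": "list_request"}] * 400_000
print(s3_list_request_cost(events))  # 2.0
```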


Derived Columns

The following table lists useful named expressions for derived columns.

Derived columns—named expressions

__s3_list_request_cost_denominator__

long get_s3_list_request_cost_denominator() {
    return 200000;
}

 

Dashboards

The following table lists useful named expressions for dashboards.

Dashboards—named expressions

Customer import health

Import Heartbeat

  • View: Time
  • Measure
    • Count Events
  • Compare
    • table_id
    • table_name
    • pipeline_id
  • Filters
    • process: /opt/interana/backend/import_server/import_pipeline.py
    • event_name: purification_end

Files Successfully Imported

  • View: Bar
  • Measure
    • Count Events
  • Compare
    • event_name
  • Filter
    • process: import_pipeline
    • event_name: purification_end, detected_by_interana

Percentage of Files Successfully Imported

  • View: Time
  • Measure
    • Files Successfully Imported
  • Filter
    • process: import_pipeline

Lines Processed

  • View: Time
  • Measure
    • Lines Transformed
    • Purifier - Lines Parsed
    • Purifier - Lines Read
  • Filter
    • process: import_pipeline, purifier

New Columns Added By Table

  • View: Time
  • Measure
    • New Columns Added
  • Compare
    • table_id
  • Filter
    • process: purifier

Conversion Failures

  • View: Stacked Area Time
  • Measure
    • Sum: conversion_failure_count
  • Compare
    • table_id
    • column_name
  • Filter
    • process: purifier
    • event_name: import_conversion_failed

Conversion Failures by Column, Type, Conversion Function

  • View: Table
  • Measure
    • Sum: conversion_failure_count
  • Compare
    • table_id
    • column_name
    • column_type
    • conversion_function
    • conversion_function_params
  • Filter
    • process: purifier
    • event_name: import_conversion_failed

Time Column Errors

  • View: Time
  • Measure
    • Sum: error_count
  • Compare
    • table_id
    • event_name
  • Filter
    • process: purifier
    • event_name: count_invalid_timestamp, count_timestamp_far_in_future, count_parse_error_or_exception

S3 List Requests - Last 2 Days

  • View: Number
  • Measure
    • Count Events
  • Filter
    • process: import_pipeline
    • event_name: list_request

S3 List Request Cost - Last 2 Days

  • View: Number
  • Measure
    • S3 List Request Cost
  • Filter
    • process: import_pipeline
    • event_name: list_request

Global import stats

Lines Transformed

  • View: Stacked Area Time
  • Measure
    • Lines Transformed
  • Compare
    • customer
  • Filters
    • process: import_pipeline

Files Processed

  • View: Stacked Area Time
  • Measure
    • Count Unique: original_filename
  • Compare
    • customer
  • Filters
    • process: import_pipeline
    • event_name: purification_end

Purifier - Lines Parsed

  • View: Stacked Area Time
  • Measure
    • Purifier - Lines Parsed
  • Compare
    • customer
  • Filters
    • process: purifier

Purifier - Lines Read

  • View: Stacked Area Time
  • Measure
    • Purifier - Lines Read
  • Compare
    • customer
  • Filters
    • process: purifier

Files Imported By Customer - Raw Size

  • View: Stacked Area Time
  • Measure
    • Sum: original_size
  • Compare
    • customer
  • Filters
    • process: import_pipeline

Files Processed By Customer - Raw Size

  • View: Stacked Area Time
  • Measure
    • Purifier - Lines Read
  • Compare
    • customer
  • Filters
    • process: import_pipeline

Columns to set to groupable

  • table_id
  • pipeline_id
  • job_id
  • customer_id
  • iteration_date