Configure AWS Kinesis Firehose for Interana ingest

This document demonstrates how to configure Kinesis Firehose in Amazon Web Services (AWS) to capture and load streaming data into Amazon S3 for ingest into Interana. You will learn how to create a Firehose delivery stream from the AWS Management Console and how to configure the Interana SDK for ingest.

Create an AWS Kinesis Firehose delivery stream for Interana ingest

After completing this procedure, you will have configured Kinesis Firehose in AWS to archive logs in Amazon S3, configured the Interana SDK, and created a pipeline and job for ingesting the data into Interana.

Complete the following steps in the order listed:

  1. Specify a name and source.
  2. Choose a destination.
  3. Specify configuration settings.
  4. Configure an API using the API Gateway.
  5. Validate the API.
  6. Publish your API in Amazon API Gateway.
  7. Initialize the Interana SDK in your application.
  8. Create a table, pipeline, and job for ingest.

You must have an Amazon Web Services (AWS) account to complete this procedure. You can create a free account on the AWS web page. For more information, see Setting Up for Amazon Kinesis Firehose.

1. Specify a name and source

The first step in creating a Kinesis Firehose delivery stream is to specify a name and source in the AWS Management Console.  

To specify a name and source for the Kinesis Firehose delivery stream, do the following:
  1. Log in to your AWS account.
  2. Follow the Amazon documentation to specify a name and source.
  • For the Name, enter KinesisFirehose.
  • For the Source, choose Direct PUT or other sources.
  3. Skip Transformation records and go directly to choosing a destination.

Record transformation does not apply to this procedure.

2. Choose a destination

This section demonstrates how to specify Amazon S3 as the destination for a Kinesis Firehose delivery stream. 

To choose a destination for the delivery stream, do the following:
  1. In the Create Delivery Stream wizard, go to the Choose destination page.
  2. Choose Amazon S3 for the destination.
  3. Specify the values described in the Amazon Choose Amazon S3 for your destination documentation.
  4. Continue with specifying configuration settings.

3. Specify configuration settings

This section demonstrates how to configure the delivery stream buffer, compression, logging, and IAM role settings. You must create a new IAM role or select an existing one, and the role must have appropriate IAM policies attached to allow the API to invoke Kinesis Firehose actions.

The following task shows you how to create a new IAM role customized for Interana ingest. For more information on IAM roles, see the Amazon IAM Roles documentation.

To specify configuration settings for the delivery stream, do the following:
  1. In the Create Delivery Stream wizard, specify configuration settings as described in the Amazon documentation.
  2. For IAM Role, click Create New and specify the following:

  • IAM Role trust policy—Allows the Firehose, API Gateway, and S3 services to assume the role.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Principal":{
            "Service":"firehose.amazonaws.com"
         },
         "Action":"sts:AssumeRole"
      },
      {
         "Effect":"Allow",
         "Principal":{
            "Service":"apigateway.amazonaws.com"
         },
         "Action":"sts:AssumeRole"
      },
      {
         "Effect":"Allow",
         "Principal":{
            "Service":"s3.amazonaws.com"
         },
         "Action":"sts:AssumeRole"
      }
   ]
}
  • IAM Policy—Substitute the S3-BUCKET-NAME variable with your destination bucket name, the AWS-ACCOUNT-ID variable with your AWS account ID, and the FIREHOSE-STREAM-NAME variable with your Firehose stream name.
{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":[
            "s3:ListAllMyBuckets",
            "s3:GetBucketLocation"
         ],
         "Resource":"arn:aws:s3:::*"
      },
      {
         "Effect":"Allow",
         "Action":[
            "s3:ListBucket"
         ],
         "Resource":"arn:aws:s3:::S3-BUCKET-NAME"
      },
      {
         "Effect":"Allow",
         "Action":[
            "s3:AbortMultipartUpload",
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:GetObjectVersionAcl",
            "s3:ListMultipartUploadParts",
            "s3:PutObject"
         ],
         "Resource":"arn:aws:s3:::S3-BUCKET-NAME/*"
      },
      {
         "Effect":"Allow",
         "Action":[
            "firehose:ListStreams",
            "firehose:PutRecord",
            "firehose:PutRecords"
         ],
         "Resource":"arn:aws:firehose:*:AWS-ACCOUNT-ID:deliverystream/FIREHOSE-STREAM-NAME"
      }
   ]
}
  3. Preview and save your settings.

  4. Continue with configuring an API using the API Gateway.
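
Optionally, before you configure the API, you can verify that the delivery stream accepts records and delivers them to your S3 bucket after the buffer interval. The following is a minimal sketch, assuming Node.js with the AWS SDK for JavaScript v3 installed and credentials that allow firehose:PutRecord; the stream name and region are placeholders from this procedure.

const { FirehoseClient, PutRecordCommand } = require("@aws-sdk/client-firehose");

// Placeholder region; use the region where you created the delivery stream.
const client = new FirehoseClient({ region: "us-west-2" });

async function sendTestRecord() {
  const event = { event: "smoke_test", timestamp: Date.now() };
  const response = await client.send(
    new PutRecordCommand({
      DeliveryStreamName: "KinesisFirehose", // the name from step 1
      // Append a newline so objects in S3 stay newline-delimited.
      Record: { Data: Buffer.from(JSON.stringify(event) + "\n") },
    })
  );
  console.log("Record accepted, id:", response.RecordId);
}

sendTestRecord().catch(console.error);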

4. Configure an API using the API Gateway

This section demonstrates how to create an API with an integration of the AWS type to access Kinesis Firehose, using the API Gateway.

To configure an API using the API Gateway, do the following:
  1. On the API Gateway console, start creating an API as a Kinesis proxy. After the API is created, the API Gateway console displays the Resources page, which contains only the API's root (/) resource.

  2. On the Resources page, select Actions > Create Resource.

  3. Create a /track resource, and select the Enable API Gateway CORS check box to allow access from other domains.

  4. Add a POST method, specifying the following:

  • Integration Type: AWS Service

  • AWS Region: Select your region from the drop-down list 

  • AWS Service: Firehose 

  • AWS Subdomain: Leave blank unless required

  • HTTP Method: POST

  • Action Type: Use action name

  • Action: PutRecord

  • Execution role: Example — arn:aws:iam::61example818:role/Example_APIGatewayRole

  • Content Handling: Passthrough

  5. For the Integration Request, under HTTP Headers, add a Content-Type header with the following value:
'application/x-amz-json-1.1'
  6. Under Body Mapping Template, click Add Mapping Template and enter the following template (what it produces is illustrated after this procedure):
{
   "DeliveryStreamName": "$input.params('topic')",
   "Record": {
       "Data": "$util.base64Encode($input.json('$.data'))Cg=="
   }
}
  7. Save the configuration.
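
The mapping template uses the topic query parameter as the delivery stream name and base64-encodes the data field of the request body. The trailing Cg== is a base64-encoded newline, so the records Firehose writes to S3 decode as newline-delimited JSON. The following is a minimal sketch, assuming Node.js, of roughly what the template produces for a single event; the event payload and stream name are hypothetical.

// Hypothetical illustration of the request that API Gateway sends to Firehose for
// POST /track?topic=KinesisFirehose with the body:
//   { "data": { "event": "page_view", "timestamp": 1514764800000 } }
const body = { data: { event: "page_view", timestamp: 1514764800000 } };
const topic = "KinesisFirehose"; // from the ?topic= query parameter

const firehoseRequest = {
  DeliveryStreamName: topic,
  Record: {
    // base64-encode the "data" object and append "Cg==" (a base64-encoded "\n").
    Data: Buffer.from(JSON.stringify(body.data)).toString("base64") + "Cg==",
  },
};

console.log(JSON.stringify(firehoseRequest, null, 2));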

5. Validate the API

This section demonstrates how to create a model that validates the request body, then apply that model to the POST method and require the topic query parameter.

To validate the API, do the following:
  1. On the API Gateway console, click Models > Create.
  2. Enter a Model name (for example, SDKRequest), Content type, and Model description.
  3. In the Model Schema field, enter the following:
{
 "$schema": "http://json-schema.org/draft-04/schema#",
 "title" : "Single Event Schema",
 "type" : "object",
 "properties": {
     "data": {
         "type": "object",
         "properties": {
             "event": {
                 "type": "string"
             },
             "timestamp": {
                 "type": "integer"
             }
         },
         "required": ["event", "timestamp"]
     }
 },
 "required": ["data"]
}

  4. Click Create model.
  5. In the API Gateway console, click Resources, select /track, then the POST method.
  6. Select Method Request and specify the following:
  • Request Validator: Validate body, query string parameters, and headers
  • URL Query Parameters: For topic, select the Required check box.
  • Request Body: Click Add model, enter application/json for Content type and SDKRequest for Model name, then click the check mark to save.
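
With this configuration, API Gateway rejects requests that are missing the topic query parameter or whose body does not match the SDKRequest model, returning a 400 Bad Request instead of forwarding the record to Firehose. The following sketch shows hypothetical request bodies that pass and fail validation against the schema above.

// Passes validation: "data" is present and contains both required fields.
const validBody = {
  data: { event: "page_view", timestamp: 1514764800000 },
};

// Fails validation (400 Bad Request): "timestamp" is missing from "data".
const invalidBody = {
  data: { event: "page_view" },
};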

6. Publish your API in Amazon API Gateway

You are now ready to deploy your API to make it accessible to users. See Deploying an API in Amazon API Gateway for more information.

To publish your API, do the following:
  1. Go to the Amazon API Gateway page.
  2. Follow the steps to deploy an API to a stage, and at the appropriate time select [New Stage] and enter a Stage name.
  3. Deploy the stage.
  4. Make a note of the stage URL, as you will use it (appended with /track) to initialize the Interana SDK.
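
After the stage is deployed, you can optionally send a test event to confirm the path from API Gateway through Firehose to S3. The following is a minimal sketch, assuming Node.js 18 or later (which includes a built-in fetch); the stage URL and topic are placeholders, and the record should appear in the S3 bucket after the buffer interval.

// Replace the URL with your stage URL and the topic with your Firehose stream name.
const url = "https://<API_stage_url>/track?topic=firehosename";

async function sendTestEvent() {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      data: { event: "smoke_test", timestamp: Date.now() },
    }),
  });
  // A 200 response means API Gateway accepted the event and forwarded it to Firehose.
  console.log(response.status, await response.text());
}

sendTestEvent().catch(console.error);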

7. Initialize the Interana SDK in your application

You can integrate Kinesis Firehose with the Interana SDK to send events to your API that are then ingested by Interana for analysis.

To initialize the Interana SDK, use the following syntax:
init({endpoint, actor})
  • Set the endpoint parameter in the init function to the Kinesis Firehose endpoint. This is the URL to which the SDK sends events; it should include topic as a URL parameter to determine the Firehose stream to which events are sent, for example: https://<API_stage_url>/track?topic=firehosename
  • Set the actor parameter to the shard key column, usually a userID, as shown in the following example.
Example:
interana.init({endpoint: 'https://<API_stage_url>/track?topic=firehosename', actor: 'userID'})

For more information on the Interana SDK, see Configuring the Interana SDK for ingest.

8. Create a table, AWS pipeline, and job for ingest

This section demonstrates how to create a table in which to store the S3 data, how to create an AWS pipeline, and how to create a job that performs the ingest.

A. Create a table 

For each table, the SDK requires a value for "actor" and automatically records the event time in the "timestamp" column in epoch milliseconds format.

The following example creates an event table named "aws_events", using the SDK defaults for the shard key ("actor") and time column ("timestamp", in epoch milliseconds):

ia table create event aws_events actor timestamp milliseconds
B. Create a pipeline 

The next step is to create an AWS pipeline to import the data that Firehose writes to S3. You will need to provide the following parameters:

  • file_pattern—If you set a value for Destination S3 bucket prefix, include it in the file_pattern. For more information, see the Choose Amazon S3 destination documentation. For example, if "aws_events" is the prefix, the pattern is: aws_events/{year}/{month:02d}/{day:02d}/{hour:02d}

Firehose automatically prefixes object keys with YYYY/MM/DD/HH. If no prefix is specified, the pattern is: {year}/{month:02d}/{day:02d}/{hour:02d}

 

  • s3_access_key—For more information, see the Amazon documentation on how to create an AWS access key.
  • s3_secret_access_key—For more information, see the Amazon documentation on how to manage access keys.
  • s3_bucket—For more information, see the Amazon documentation for working with S3 buckets.
  • s3_region (optional)—Set this to the region of the S3 bucket to which Firehose writes the data. The default is us-east-1; if your bucket is in that region, you do not need to specify this parameter. For more information, see the Amazon documentation on AWS regions and endpoints.
Example:

The following example uses a table named firehose_table, a Destination S3 bucket prefix of aws_events, and a bucket named firehose_bucket in the us-west-2 region.

ia pipeline create firehose_pipeline firehose_table aws -p file_pattern aws_events/{year}/{month:02d}/{day:02d}/{hour:02d} -p s3_access_key EXAMPLE_KEY -p s3_secret_access_key SECRET_EXAMPLE_KEY -p s3_bucket firehose_bucket -p s3_region us-west-2

For more information on creating pipelines and jobs, see Streaming ingest and the Interana CLI reference.

C. Create a job 

You can now create a job to schedule the ingest. In the following example, the job is continuous. 

ia job create firehose_pipeline continuous 

What's Next

You are now ready to ingest log data from Amazon S3.
