App Insights Part 2 – Stream your data

This post is the second part of my review of Azure’s App Insights service. The first part introduced some of the features provided by App Insights and covered two ways of exporting and using events outside of Azure.
In this post I’ll cover the third option for exporting events, which I believe is the best choice for storing events longer than 7 days and for enabling custom, complex querying and analysis of your data.

Step 1 – Persisting events using Continuous Export

In order to persist your data forever (or at least for longer than 7 days) you’ll need to set up the Continuous Export feature of App Insights.
Until recently this feature incurred extra cost, but Azure now includes export of 1 GB of data per month for free as part of the Application Insights Basic pricing tier.

You can export the data only to an Azure Storage account, but from there it is much easier to move it to other data stores (as we’ll see in the next step).

As part of the configuration you can control which types of data to export. These include the rich set of metrics and traces that App Insights monitors, such as Availability, Page Views, Performance Counters, and more.
For our purpose of logging custom events, we should choose to export those in addition to any other types of interest.

[Image: App Insights Continuous Export data types]

Step 2 – Migrating events to Azure SQL DB with Stream Analytics Jobs

Now that the data is in a storage account, we would like to export it to an Azure SQL DB. This will enable us to run SQL queries on the data, join it with other tables, and use it for reporting and visualization in applications that support working with relational data.

Exporting the data to Azure SQL DB is done with Azure Stream Analytics, which provides a tool for pulling data from various sources, manipulating it at massive scale, and outputting the results to a variety of data stores. It operates on a continuous stream of data that is processed for as long as the job is running.

From the Azure Portal, create a new Stream Analytics job. We then need to define the input, the manipulation query, and the output for the job.

Defining job input

Let’s add an input by clicking the ‘Inputs’ tile on the job’s blade in the portal and then clicking the ‘Add’ button.

The source can be an Azure Event Hub, an IoT Hub or, as in our case, Blob storage. We can choose a path pattern in order to process only part of the stored data, and we must specify the serialization format of the data (JSON, CSV, or even Avro). As our data is coming from App Insights, we’ll stick with JSON.
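To give a sense of what the job will read, here is a trimmed, illustrative sample of a single exported custom-event record (custom events land in JSON blobs under an ‘Event’ folder in the export container; the field names below match the ones referenced in our query, though the exact set varies per event and the values here are made up):

{
 "event": [ { "name": "SearchPerformed", "count": 1 } ],
 "context": {
  "data": { "eventTime": "2016-05-01T12:34:56.789Z", "isSynthetic": false },
  "user": { "authId": "user@example.com", "accountId": "1234" },
  "session": { "id": "abcd1234", "isFirst": true },
  "device": { "id": "browser", "type": "Browser", "browser": "Chrome" },
  "location": { "clientip": "0.0.0.0", "country": "United States" },
  "custom": { "dimensions": [ { "query": "stream analytics" }, { "entity-id": "42" } ] }
 }
}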

Defining job output

The first step in defining our output as a SQL database is creating the output table.
The table should be created manually in the DB, where each column name must match the names that will later be defined in our query.
Creating a clustered index on the event time is also recommended for better performance.
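Here is a sketch of such a table; the table name is illustrative, the column names mirror the aliases in the query below, and the types and sizes should be tuned to your own data:

-- Illustrative table name; columns match the Stream Analytics query aliases.
CREATE TABLE [dbo].[AppInsightsEvents] (
 [eventName] NVARCHAR(256),
 [userId] NVARCHAR(256),
 [accountId] NVARCHAR(256),
 [entityId] BIGINT,
 [loginEmail] NVARCHAR(256),
 [query] NVARCHAR(MAX),
 [customDimensions] NVARCHAR(MAX),
 [eventTime] DATETIME,
 [isSynthetic] BIT,
 [deviceId] NVARCHAR(256),
 [deviceType] NVARCHAR(64),
 [os] NVARCHAR(64),
 [osVersion] NVARCHAR(64),
 [locale] NVARCHAR(16),
 [userAgent] NVARCHAR(512),
 [browser] NVARCHAR(64),
 [browserVersion] NVARCHAR(64),
 [screenResolution] NVARCHAR(32),
 [sessionId] NVARCHAR(256),
 [sessionIsFirst] BIT,
 [clientIp] NVARCHAR(46),
 [continent] NVARCHAR(64),
 [country] NVARCHAR(128),
 [province] NVARCHAR(128),
 [city] NVARCHAR(128)
);

-- Clustered index on the event time, as recommended above.
CREATE CLUSTERED INDEX [IX_AppInsightsEvents_eventTime]
 ON [dbo].[AppInsightsEvents] ([eventTime]);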

Now we can map that table to our job. Click the ‘Outputs’ tile in the job and then click the ‘Add’ button. Choose ‘SQL Database’ as the output, provide the DB details, and choose the name of the table you just created.
Other output options include Blob and Table storage, Event Hub for publishing processed events to other consumers, and even a direct feed into a Power BI report.

Defining job query

For our use case we would like to export the custom events written by our AngularJS application. For each event it can be helpful to save additional client information that App Insights provides automatically, such as IP address, country, session id, browser, and more.
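As a reminder of where these events come from, a minimal sketch of reporting one from the browser with the App Insights JavaScript SDK might look like this (the event name and dimension keys are illustrative; ‘query’ and ‘entity-id’ are the kind of keys the UDFs shown below look up):

// Report a custom event with custom dimensions (properties) attached.
appInsights.trackEvent('SearchPerformed', {
 'entity-id': '42',
 'entity-email': 'user@example.com',
 'query': 'stream analytics'
});

These properties end up in the context.custom.dimensions array of the exported JSON, which is exactly what our query and UDFs operate on.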

Here’s the query we use for preparing our data before it is added to our SQL DB:

SELECT
 flat.ArrayValue.name as eventName
 ,A.context.[user].authId as userId
 ,A.context.[user].accountId as accountId
 ,TRY_CAST(UDF.[entity-id](A.context.custom.dimensions) as bigint) as entityId
 ,UDF.[entity-email](A.context.custom.dimensions) as loginEmail
 ,UDF.[entity-query](A.context.custom.dimensions) as query
 ,UDF.stringify(A.context.custom.dimensions) as customDimensions
 ,A.context.data.eventTime as eventTime
 ,A.context.data.isSynthetic as isSynthetic
 ,A.context.device.id as deviceId
 ,A.context.device.type as deviceType
 ,A.context.device.os as os
 ,A.context.device.osVersion as osVersion
 ,A.context.device.locale as locale
 ,A.context.device.userAgent as userAgent
 ,A.context.device.browser as browser
 ,A.context.device.browserVersion as browserVersion
 ,A.context.device.screenResolution.value as screenResolution
 ,A.context.session.id as sessionId
 ,A.context.session.isFirst as sessionIsFirst
 ,A.context.location.clientip as clientIp
 ,A.context.location.continent as continent
 ,A.context.location.country as country
 ,A.context.location.province as province
 ,A.context.location.city as city
 INTO
 AIOutput
 FROM AIInput A
 CROSS APPLY GetElements(A.[event]) as flat

AIInput and AIOutput are our input and output sources defined in the previous sections.

Stream Analytics has a limited SQL-like query language that provides only basic functionality. Compensating for the lack of advanced features is the ability to use User Defined Functions (UDFs) written in JavaScript.

The actual custom event data is in JSON format, and if we want to extract specific fields from it and save them to dedicated columns in our SQL table, we need to work a little harder, as Stream Analytics doesn’t yet provide a built-in method for doing so.

For our purpose, we extract the email, the search query (if relevant) and the entity id of the object the user operated on. We also format the JSON as a string in order to save the raw event data (otherwise the actual context won’t be exported properly).
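That last part, the stringify UDF, can be a one-liner (UDFs are added under the job’s ‘Functions’ section and invoked in the query via their alias, e.g. UDF.stringify):

// Serialize the custom dimensions array back into a raw JSON string.
function main(dimensions) {
 return JSON.stringify(dimensions);
}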

Here’s an example of our ‘entity-query’ UDF, which searches for a field named ‘query’ in our JSON and returns its value.

function main(array) {
 var viewItem;
 // The custom dimensions arrive as an array of single-key objects,
 // so scan each one for a 'query' field.
 array.forEach(function (obj) {
  if (obj.query) {
   viewItem = obj["query"];
  }
 });
 return viewItem || null;
}

As part of testing the query you can load sample data directly or from your defined input and see how the manipulated results look.

Start processing

Now you can finally start the job and see your hard work paying off. The portal provides some basic monitoring of the number of processed input and output events, as well as any errors or issues with your job.

You can also stop your job for re-configuration and resume it from exactly the same place it was stopped, ensuring no events are skipped.

Summary

This post wraps up my coverage of how to use the custom events provided by App Insights in your applications. I covered several ways to use the data and pointed out some areas that require custom handling and special attention.

For additional information on the App Insights Basic and Enterprise plans, check out the pricing plan page.

A detailed explanation of performance monitoring of web pages provided by App Insights can be found here.

An interesting functionality of App Insights that I didn’t cover is the ability to define a continuous web test to periodically check the availability of your application by pinging a specific endpoint.
