Flight Plan: Time-Series AI Anomaly Detection

This BoosterPack was created and authored by: Chillwall AI

DAIR BoosterPacks are free, curated packages of cloud-based tools and resources about a specific emerging technology, built by experienced Canadian businesses who have built products or services using that technology and are willing to share their expertise.

Ready for takeoff?

Here’s what you’ll find in this Flight Plan

Overview

What is Time-Series AI Anomaly Detection?

Anomaly detection applications are software tools designed to identify irregular patterns or deviations in data. These automated systems detect spikes, drops, and other abnormal occurrences over time, which can lead to issues such as defects, injuries, theft, system failures, and financial losses. By identifying these anomalies early, organizations can take timely actions to prevent negative impacts on downstream models, system functionality, and reporting.

Anomaly detection systems use advanced machine learning algorithms to automatically monitor data quality without relying on manual rules. They learn normal patterns from historical data and detect anomalies that a human might not anticipate.

This Anomaly Detection BoosterPack uses a purpose-built, AI-powered deep learning solution that takes time-series data as input. The main tools used in the BoosterPack are pandas, NumPy, scikit-learn, and TensorFlow. The optimized turn-key solution is deployed on a single server: an AWS EC2 instance.

What value will it add to my business?

AI-powered anomaly-detection applications identify operational anomalies at the lowest base level.

This helps organizations:

  • take proactive action,
  • fine-tune their system solutions to their unique requirements, and
  • save time and costs by automating complex data monitoring at the code, app, or system level.

Detecting control failures and security threats, and understanding data patterns, in innovative Canadian healthcare, fintech, retail, manufacturing, IT, and environmental software solutions can help developers prevent catastrophic failures or unlock new business solutions.

This Anomaly Detection solution:

  • is turn-key – fully automated, with all micro-services and packages deployed for users in one setup launch.
  • is fully open-source – there are no licensing costs to use this BoosterPack.
  • is stable – runs on an AWS EC2 instance within the DAIR environment.
  • is flexible – supports various business use cases.
  • is cost effective – runs within your DAIR budget.

Why choose Time-Series AI Anomaly Detection over the alternatives?

Custom anomaly detection software is complex to build, requiring significant expertise, high costs, and long development times. It calls for specialized data scientists, network engineers, and software developers, and can take six months or more to build. Purchased solutions are expensive and still require manual workflow steps, such as preparing training data for the detection algorithms, especially when your data is not labeled.

This BoosterPack offers significant value to an SME:

  • You can upload raw, unlabeled data sets.
  • An automated AI model is trained on your data.
  • An automated virtual cloud environment is set up for you.
  • It is a low-cost, fast solution that runs within your DAIR budget.

Best Practices

Domain Knowledge

Before we dive into implementing Deep Learning Artificial Neural Networks (autoencoders), it is helpful to thoroughly understand the problem domain and the specific challenges you aim to address. Consider the types of data you’ll be working with, the nature of anomalies you need to detect, and any domain-specific constraints or requirements.

See The Importance of Domain Knowledge in the Tutorials section.

Input Data

High-quality data is essential for training effective autoencoder models. Invest time and effort in data pre-processing to ensure your dataset is clean and ready for model building. For anomaly detection input data format and data quality requirements, see our Sample Solution BoosterPack. We use an ECG sample dataset as sample guidance. For additional support with data security or privacy concerns, see the FormKiq BoosterPack.
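As a rough illustration of the kind of pre-processing described above, the sketch below fills missing readings by linear interpolation, scales values to the range 0 to 1, and slices the series into fixed-length windows. The function names, the window length, and the sample values are assumptions made for illustration; they are not taken from the Sample Solution or the ECG dataset.

```python
import numpy as np

def fill_missing(series):
    """Replace NaNs by linear interpolation between neighbouring valid points.
    Leading or trailing NaNs take the nearest valid value."""
    series = np.asarray(series, dtype=float)
    idx = np.arange(series.size)
    valid = ~np.isnan(series)
    return np.interp(idx, idx[valid], series[valid])

def min_max_scale(series, lo, hi):
    """Scale to [0, 1]. lo and hi should come from the training data only.
    Assumes hi > lo (a constant series would divide by zero)."""
    return (series - lo) / (hi - lo)

def make_windows(series, window):
    """Slice a 1-D series into overlapping windows shaped (n, window, 1)."""
    n = series.size - window + 1
    return np.stack([series[i:i + window] for i in range(n)])[..., np.newaxis]

# Hypothetical raw readings with one missing value.
raw = np.array([2.0, 4.0, np.nan, 8.0, 10.0, 6.0])
clean = fill_missing(raw)
scaled = min_max_scale(clean, clean.min(), clean.max())
windows = make_windows(scaled, window=3)
```

Taking the scaling bounds from the training portion only is a common way to avoid leaking information from test data into the model.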

Model Performance

Continuously monitor the autoencoder’s performance during training. Keep track of key metrics such as reconstruction error or anomaly detection accuracy, and adjust model hyper-parameters as needed to improve performance. Regularly retrain or fine-tune the model with new data to adapt to evolving patterns and anomalies. See Autoencoders -Machine Learning and TensorFlow Tutorials in the Tutorials section to understand autoencoder structure and hyper-parameter tuning.
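One way to track reconstruction error is to compute the mean absolute error (MAE) per window and compare it with a threshold. The sketch below assumes windows shaped (samples, timesteps, features); the `reconstructed` array is a hand-written stand-in for a trained autoencoder's output, and the function names are hypothetical. The 0.3 default mirrors the Sample Solution threshold mentioned under Tips and Traps.

```python
import numpy as np

def window_mae(original, reconstructed):
    """Mean absolute error per window for arrays shaped (n, timesteps, features)."""
    return np.mean(np.abs(original - reconstructed), axis=(1, 2))

def flag_anomalies(errors, threshold=0.3):
    """True where the reconstruction error exceeds the threshold."""
    return errors > threshold

# Two illustrative windows: the first is reconstructed well, the second poorly.
original = np.array([[[0.1], [0.2], [0.3]],
                     [[0.5], [0.9], [0.1]]])
reconstructed = np.array([[[0.1], [0.25], [0.3]],
                          [[0.0], [0.2], [0.6]]])

errors = window_mae(original, reconstructed)
flags = flag_anomalies(errors)  # only the second window is flagged
```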

Tips and Traps

  • Tip: Start by thoroughly understanding your dataset and your normal performance environment. Understanding your data is crucial for selecting appropriate features, defining anomaly detection thresholds, and interpreting model outputs accurately.
  • Tip: Pre-process your data carefully to ensure it’s clean, normalized, and suitable for training the autoencoder model. Handle missing values appropriately. Pre-processing techniques such as normalization and scaling can enhance model convergence and stability. See the ECG sample dataset as a reference and the Sample Solution for full input data requirements.
  • Tip: Tune hyper-parameters to increase accuracy. Our Sample Solution model detects anomalies in most cases. However, depending on your use case, your dataset may have unique behaviors or patterns, requiring you to experiment with different hyper-parameters (activation functions, learning rates, number of epochs, number of nodes, and regularization techniques) to find the optimal model for your anomaly detection task. Hyper-parameters related to the model architecture and its optimization include:
    • Activation Functions – the function that calculates a node’s output from its inputs and their weights. We use the hyperbolic tangent (tanh) for the LSTM layer.
    • Learning Rates – how much the weights are updated during training, often in the range between 0.0 and 1.0 (smallest to largest updates).
    • Number of Epochs – the number of passes over the training data. If your model’s performance on the training and validation datasets is still improving or hasn’t converged, increasing the number of epochs might be beneficial.
    • Number of Nodes – the number of neurons or units in a neural network layer. If you have a more complex dataset, you may want to increase the number of nodes to capture intricate patterns in the data.
    • Dropout – the fraction of units to drop for the linear transformation of the inputs. Dropout helps prevent overfitting by randomly ignoring units during training, rather than by adding a penalty term to the loss function. The range of dropout is between 0 and 1 (least to most).
    • Threshold – look at the MAE (mean absolute error) graph in the Sample Solution and set a threshold that distinguishes between normal and abnormal behavior. The default value is 0.3 in the Sample Solution.

See Autoencoders -Machine Learning and TensorFlow Tutorials in the Tutorials section to understand autoencoder structure and hyper-parameter tuning.
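To show where the hyper-parameters listed above plug into a model, here is a minimal sketch of an LSTM autoencoder in Keras. The layer arrangement, the default values, and the function name `build_lstm_autoencoder` are assumptions for illustration only; the Sample Solution's actual architecture may differ. Only the tanh activation on the LSTM layers comes from the text above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm_autoencoder(timesteps, n_features, units=64,
                           dropout=0.2, learning_rate=1e-3):
    """Hypothetical LSTM autoencoder: encode a window, then reconstruct it."""
    model = models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        # Encoder: `units` is the number of nodes, tanh the activation function.
        layers.LSTM(units, activation="tanh", dropout=dropout),
        # Repeat the encoded vector once per timestep for the decoder.
        layers.RepeatVector(timesteps),
        # Decoder: mirrors the encoder and returns one output per timestep.
        layers.LSTM(units, activation="tanh", dropout=dropout,
                    return_sequences=True),
        layers.TimeDistributed(layers.Dense(n_features)),
    ])
    # MAE loss, so the training loss matches the error used for thresholding.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mae",
    )
    return model

model = build_lstm_autoencoder(timesteps=30, n_features=1)
# The number of epochs is set at training time, for example:
# model.fit(x_train, x_train, epochs=50, validation_split=0.1)
```

An autoencoder is trained to reproduce its own input, which is why `x_train` would appear as both input and target in the commented `fit` call.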

  • Trap: Beware of overfitting

Overfitting occurs when the autoencoder model memorizes the training data instead of capturing its underlying patterns. Monitor the model’s performance on the validation set and apply regularization techniques such as dropout or early stopping to mitigate overfitting. See Overfitting & Underfitting Data in the Tutorials section for more information.
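The early-stopping idea can be sketched without any framework: stop once the validation loss has failed to improve for a set number of epochs, and keep the best epoch. The function name and the loss values below are hypothetical. In Keras, the `EarlyStopping` callback offers comparable behaviour through its `monitor` and `patience` arguments.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has not improved by more than min_delta
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch

# Illustrative losses: improvement stalls after epoch 3.
val_losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```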

  • Trap: Pre-processing

Watch for imbalanced datasets, where anomalies are rare compared to normal instances. Imbalanced data can bias the model towards normal behavior, leading to poor detection performance. Use techniques such as over-sampling, under-sampling, or synthetic data generation to address class imbalance and improve model robustness. See Imbalanced Data in the Tutorials section for more information.
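Where labels are available, random over-sampling is one of the simpler ways to rebalance classes. The sketch below duplicates randomly chosen minority-class rows until both classes are the same size. The function name, the label convention (1 for anomaly), and the toy data are assumptions for illustration; in practice this would usually be applied to the training split only.

```python
import numpy as np

def random_oversample(x, y, minority_label=1, seed=0):
    """Duplicate random minority-class rows until both classes are the same size."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == minority_label)
    majority = np.flatnonzero(y != minority_label)
    # Number of extra minority rows needed (zero if already balanced).
    n_extra = max(majority.size - minority.size, 0)
    extra = rng.choice(minority, size=n_extra, replace=True)
    keep = np.concatenate([majority, minority, extra])
    return x[keep], y[keep]

# Toy data: 8 normal rows (label 0) and 2 anomalous rows (label 1).
x = np.arange(10, dtype=float).reshape(10, 1)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
x_bal, y_bal = random_oversample(x, y)
```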

  • Trap: Anomaly detection is an iterative process

It requires continuous refinement and improvement. Don’t expect to achieve optimal results with the first iteration of your model if your dataset’s patterns are complex. Iterate on your hyper-parameters and feature engineering strategies based on feedback and evaluation results to enhance detection accuracy and efficiency. See the ECG sample dataset as a reference for a normal model dataset structure, then test and iterate to learn.

Resources

Please see the sections below for resources on AI and machine learning.

The Sample Solution source code is a helpful learning resource. For the ECG Anomaly Detection example, review the contents of the /data, /src, /reports, and /models folders.

Tutorials

The table below provides a starting list of links to tutorials the author has found to be most useful.

Tutorial Content | Summary
What is AI in 5 minutes | Intro to Artificial Intelligence for the beginner.
AI vs ML vs DL Difference Explained | AI, Machine Learning & Deep Learning explained.
Deep Learning Specialization by Andrew Ng on Coursera | Covers foundational concepts of deep learning, including neural networks, convolutional networks, recurrent networks, and sequence models.
Sequence Models Course by Deeplearning.ai | Describes sequence models such as recurrent neural networks (RNNs) and LSTMs.
Autoencoders -Machine Learning | Provides a comprehensive introduction to autoencoders, covering the theory behind autoencoder models, their various architectures, and practical examples of implementing autoencoders using TensorFlow and Keras.
The Importance of Domain Knowledge | Explains what domain knowledge is and why it is important in machine learning.
Imbalanced Data | Tutorials to help you better understand imbalanced data and how to work with it.
Overfitting & Underfitting Data | Tutorials to help you better understand your datasets.
Dataset Instance Transfers | Step-by-step guide to transferring files between your laptop and an Amazon instance.
Label Encoding | Tutorials to help you better understand data label encoding in Python.
Multi-Threading in Python | Tutorials to help you better understand thread-based data parallelism in Python.
Multi-Processing in Python | Tutorials to help you better understand process-based data parallelism in Python.

Documentation

Please see the table below for documentation resources for the Time-Series AI Anomaly Detection BoosterPack.

Document | Summary
https://github.com/Chillwall/anomaly_detection.git | GitHub Code Repository and Documentation

Support

As a DAIR participant, you can access support related to this BoosterPack. If you have questions, you can:

  • post them in the DAIR Slack #help channel
  • send an email to [email protected]
  • create an issue in the GitHub repository

Got it? Now let us show you how we deployed it on the DAIR Cloud…

Time-Series AI Anomaly Detection Sample Solution

For a wide range of industries that need to identify operational anomalies, such as fintech, healthcare, and cleantech, the Sample Solution demonstrates how Time-Series AI Anomaly Detection is used to pinpoint threats, risks, and opportunities. Unlike complex custom solutions, this Sample Solution provides a turn-key, AI-powered solution that requires minimal organizational resources to support key operational functionality.

Please see the Sample Solution page for more information on how the Sample Solution works.

The Sample Solution showcases the following technologies: pandas, NumPy, scikit-learn, and TensorFlow, described in subsequent sections.