Using Apache PLC4X and Elasticsearch for IIoT monitoring and anomaly detection


Industrial IoT (IIoT) has gained traction as a buzzword in recent years. However, implementing common use cases such as real-time monitoring of PLCs can require a lot of money and effort. For example, current approaches to such a monitoring solution require complex architectures. Examples and showcases for real-time monitoring often use OPC-UA as the interface to the PLCs. This leaves out the large number of older PLCs in factories all over the world that do not support OPC-UA.

We present an approach that needs considerably less effort and money to implement near-real-time IIoT monitoring. It is built on Apache PLC4X and the ELK Stack. Apache PLC4X is integrated into Logstash as a plugin, which connects to the PLCs and transfers the incoming data into Elasticsearch. Kibana serves as the user interface for data analysis and monitoring.

Showcase overview

The showcase implements a rather simple scenario: a factory with two conveyor belts, each equipped with temperature sensors. A conveyor belt consists of three workstations (or stages) and moves items through them. The sensors monitor the temperature of the processed items at every stage. At each workstation, errors may occur due to high temperatures (it’s a rather theoretical example). These errors may produce faulty material, so we want to monitor the temperature at each workstation and be notified when anomalies happen.

Sensors on Conveyor Belt

The showcase consists of three main components. First, there is a simulated (virtual) factory with two servers as data sources for our scenario. Second, Logstash (with integrated Apache PLC4X) polls these servers. Third, Logstash pushes the data to Elasticsearch, with Kibana as the frontend for data analysis and display.

 

Components overview

What is Apache PLC4X?

Apache PLC4X serves as a universal protocol adapter for different types of PLCs. The project’s goal is to make many types of PLCs easily accessible through one standardized interface on top of the PLCs’ own protocols, e.g. Siemens S7 or Beckhoff ADS.

Logstash integration

The simplicity of the presented approach is achieved by the integration of Apache PLC4X into Logstash as a plugin. We implemented an easy and convenient way to leverage the power of PLC4X with a well-known ETL tool like Logstash. This blog post (German only) presents our challenges in building the Logstash integration.

The showcase

The showcase’s example implementation is available in a GitHub repository. You can run and try out the example with all components in a few seconds by executing docker-compose.

This will spin up several Docker containers for Elasticsearch, Kibana, Logstash, and two simulated OPC-UA servers. Everything is preconfigured to show the demo.

Implementation Overview

The simulated OPC-UA server

For the first iteration of our showcase, we quickly needed a simulated PLC to demonstrate the capabilities of Apache PLC4X with Logstash. PLC4X’s greatest advantage over OPC-UA is that it seamlessly supports different kinds of legacy PLCs (i.e. PLCs without support for OPC-UA), so ideally we would have used a simulated legacy PLC. However, we didn’t have one at hand. Therefore, we used a free implementation of an OPC-UA server. Because Apache PLC4X also supports the OPC-UA protocol, we can still demonstrate our use case.

Each server exposes three temperature sensors that continuously emit temperature data with small, Gaussian-distributed variations in the temperature value. In addition, the server introduces randomized temperature variations into the data.
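To make the behaviour of such a simulated sensor more concrete, here is a minimal Python sketch. It is not the code of the OPC-UA server used in the showcase. The function name and all parameters (base temperature, noise level, spike probability and spike size) are assumptions chosen for illustration.

```python
import random


def simulate_sensor(n_samples, base_temp=60.0, noise_std=0.5,
                    spike_prob=0.01, spike_size=15.0, seed=None):
    """Return n_samples temperature readings.

    Each reading is Gaussian noise around base_temp. With probability
    spike_prob, a fixed spike_size is added to imitate an anomaly.
    All default values are hypothetical.
    """
    rng = random.Random(seed)
    readings = []
    for _ in range(n_samples):
        value = rng.gauss(base_temp, noise_std)
        if rng.random() < spike_prob:
            value += spike_size
        readings.append(value)
    return readings


if __name__ == "__main__":
    samples = simulate_sensor(1000, seed=42)
    print(f"min={min(samples):.1f} max={max(samples):.1f}")
```

Passing a seed makes a run reproducible, which helps when comparing what the dashboard shows against the generated data.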

Demo Implementation

The implementation of our showcase is done by configuring the Logstash PLC4X plugin within a pipeline:

input {
	plc4x {
		jobs => {
			job1 => {
				rate => 200
				sources => ["sensors1", "sensors2"]
				queries =>  {
					PreStage => "ns=2;i=3"
					MidStage => "ns=2;i=4"
					PostStage => "ns=2;i=5"
					ConveyorBeltTimestamp => "ns=2;i=7"
				}
			}
		}
		sources => {
			sensors1 => "opcua:tcp://opcua-server-1:4840/freeopcua/server/"
			sensors2 => "opcua:tcp://opcua-server-2:4841/freeopcua/server/"
		}
	}
}
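The two kinds of strings in this configuration can be read as follows: a source such as opcua:tcp://opcua-server-1:4840/freeopcua/server/ names the protocol, the transport, and the endpoint, while a query such as ns=2;i=3 is an OPC-UA node ID with namespace index 2 and numeric identifier 3. The Python sketch below only illustrates this structure. It is not how PLC4X parses these strings, and both helper functions are hypothetical.

```python
from urllib.parse import urlsplit


def parse_connection_string(conn):
    """Split 'protocol:transport://host:port/path' into its parts."""
    protocol, _, rest = conn.partition(":")
    parts = urlsplit(rest)
    return {
        "protocol": protocol,
        "transport": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


def parse_node_id(address):
    """Parse an OPC-UA node ID such as 'ns=2;i=3'.

    Returns (namespace index, identifier type, identifier).
    The identifier type 'i' marks a numeric identifier.
    """
    fields = dict(item.split("=", 1) for item in address.split(";"))
    namespace = int(fields.pop("ns", 0))
    id_type, identifier = next(iter(fields.items()))
    if id_type == "i":
        identifier = int(identifier)
    return namespace, id_type, identifier


if __name__ == "__main__":
    print(parse_connection_string(
        "opcua:tcp://opcua-server-1:4840/freeopcua/server/"))
    print(parse_node_id("ns=2;i=3"))
```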

Afterward, we configured three Timelion visualizations to display the three different temperature sensors for each conveyor belt. You can configure many more useful visualizations if needed. The picture below shows the configuration of a time series diagram for one stage.

Timelion Visualization

The three Timelion visualizations are combined in a dashboard and already give a good overview of the temperature data.

Dashboard

You may already spot some of the anomalies at first sight. However, smaller anomalies with narrower outliers are harder to discover manually in such a view. This is where Kibana’s machine learning features (part of the platinum license) come into play. We can automatically detect anomalies within our factory simply by configuring a new machine learning job.

You can configure the machine learning job by navigating to the machine learning tab and clicking on the “create new job” button. Then select the plant index and choose the wizard for multi-metric anomaly detection. Next, choose the time range for your job data. In our case, we had the example running for about four hours, which produced around 140,000 data points.
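The number of data points can be cross-checked against the pipeline configuration with a short calculation. The sketch below assumes that the rate of 200 in the job definition is a polling interval in milliseconds and that every poll of a source produces one document. Neither assumption is stated explicitly in this post.

```python
def expected_data_points(duration_hours, rate_ms, n_sources):
    """Estimate the document count for a polling job.

    Assumes one document per poll and per source, with rate_ms
    being the polling interval in milliseconds.
    """
    polls_per_source = duration_hours * 3600 * 1000 // rate_ms
    return polls_per_source * n_sources


if __name__ == "__main__":
    # Four hours, one poll every 200 ms, two sources
    print(expected_data_points(4, 200, 2))  # 144000
```

Under these assumptions the estimate of 144,000 documents is in line with the roughly 140,000 data points observed after about four hours.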

ML Job Creation

In the job settings area, select the fields on which the job runs: values.PreStage, values.MidStage and values.PostStage with the max aggregation operation. As the split data field, select the sensor name by choosing the sourceName field.

The bucket span describes the size of the buckets generated for the max aggregation and should be set to one second for a finer resolution. Last but not least, configure a job name and description. Clicking on the “create job” button starts the machine learning job.
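To illustrate what the max aggregation with a bucket span and a split field does to the raw data, here is a minimal Python sketch. It only mimics the bucketing step and is not how Kibana’s machine learning works internally. The field names sourceName and values.PreStage come from the job settings above, while the timestamp field (in seconds) and the sample values are assumptions.

```python
from collections import defaultdict


def max_per_bucket(events, field, bucket_span_s=1):
    """Group events by (sourceName, time bucket) and keep the max of field."""
    buckets = defaultdict(lambda: float("-inf"))
    for event in events:
        # Start of the bucket this event falls into
        bucket_start = int(event["timestamp"] // bucket_span_s) * bucket_span_s
        key = (event["sourceName"], bucket_start)
        buckets[key] = max(buckets[key], event["values"][field])
    return dict(buckets)


# Hypothetical sample events, polled every 200 ms
events = [
    {"timestamp": 0.0, "sourceName": "sensors1", "values": {"PreStage": 60.1}},
    {"timestamp": 0.2, "sourceName": "sensors1", "values": {"PreStage": 75.3}},
    {"timestamp": 0.4, "sourceName": "sensors1", "values": {"PreStage": 59.8}},
    {"timestamp": 1.0, "sourceName": "sensors1", "values": {"PreStage": 60.0}},
    {"timestamp": 0.2, "sourceName": "sensors2", "values": {"PreStage": 61.2}},
]

if __name__ == "__main__":
    for key, value in sorted(max_per_bucket(events, "PreStage").items()):
        print(key, value)
```

With a one-second bucket span, the single spike of 75.3 becomes the value of its bucket for sensors1. A much larger bucket span would still keep the maximum, but would place it less precisely in time.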

ML job results

After the job has finished, you can view the results. As you can see in the screenshot, the model detected several anomalies and even scored them with a severity measure.

It’s also possible to configure a watcher (alerting) for a continuously running job. This allows for e-mail notifications when anomalies occur.

Key takeaways

In this blog post, we built near-real-time monitoring and anomaly detection for a simulated factory showcase.

The benefits of this solution are its low cost and effort compared to similar solutions on the market. Although we used a simulated PLC, the demo can easily be applied to a real PLC. Monitoring, alerting, and machine learning are usually just a matter of configuring the available components and can therefore be implemented quickly.

For the demo setting with only three sensors on each of two servers, machine learning seems a bit like overkill. However, with thousands of PLCs in multiple factories, manually configuring alerting and thresholds can be a tedious task. Scaling up our example is easily manageable: all it takes is an Elasticsearch cluster and Logstash with the PLC4X plugin on small machines close to the PLCs.

So what’s next? We want to improve our showcase by using real or simulated PLCs without OPC-UA to demonstrate the real power of Apache PLC4X. We will further improve the PLC4X Logstash plugin and get it ready for production. It is also on our roadmap to extend the showcase scenario to a more production-like setting with an Elasticsearch cluster and a scalable, variable number of PLCs.

Stefan Herrmann

Stefan works as an IT Consultant at codecentric AG in Frankfurt. He is at home in the Java world. His interests are in agile practice and he has a passion for information retrieval and machine learning.
