
Convolutional neural networks for damage detection

11.3.2019 | 10 minutes of reading time

Damage detection from sensor data is the basis of predictive maintenance. Essentially, one needs to discriminate the normal from the anomalous (damaged) status and estimate the severity of the damage in order to forecast the right course of action (maintenance, repair, replacement). In this blog post, we use sensor data from a rolling bearing to show, step by step, how to implement a convolutional neural network architecture with Keras for anomaly detection.

Methodology

In the last couple of years, many Artificial Intelligence architectures for anomaly and damage detection have flourished. We can distinguish two main approaches: the first is based on supervised learning, the second on unsupervised learning.

Supervised learning

The supervised approach makes sense when the available data has a balanced number of normal and anomalous events. This happens, for example, with data coming from experimental tests. Usually, engineers create a test setup in a laboratory to assess how damage influences the sensors’ readings. A frequently used alternative is to simulate the test setup through computational models. Either way, engineers can generate as much data as needed to characterise the normal operational status and the failure status. The labelled data (normal vs. failure) can then be used to implement supervised ML/AI architectures, whose training benefits greatly from the balanced data. This chain of simulation/tests/training is a somewhat old-school damage detection technique, but it is still necessary in many applications where a system failure would affect human safety: automotive, aircraft, infrastructure, and so on, whose reliability levels are regulated and have to be certified.

Unsupervised learning

Sometimes engineering systems are so complex that neither lab tests nor simulations are economically feasible. In this case, one alternative is to use the real data coming from monitoring the structure and to wait (and hope) for hints on how damage affects the data. In this data-driven detection approach, anomalies happen rarely, as opposed to the balanced case above. With very few anomaly labels, AI training can be very tricky. Loosely speaking, this is the classical problem of finding the needle in the haystack: a huge amount of normal labels and scarcely any failure labels. In this case, one can formulate an unsupervised problem, feeding the whole haystack to the AI algorithm. The logic here is that, using automatic feature extraction and dimensionality reduction techniques, one can find hidden correlations that are signatures of an incipient anomaly. This is the preferred approach for IoT applications, where failures are more likely to cause economic rather than social losses.

In this blog post, we present an example based on the supervised approach; we will describe the unsupervised approach in a future post. Here we show, step by step, how to use deep learning to classify the healthy versus the damaged status of a rolling bearing.

The dataset

For this use case, we examine a very popular rolling bearing dataset, made available by the Case Western Reserve University. We compare different architectures from different papers and show how to implement them with Keras. But first of all, what are bearings and why do we care about them?

Rolling bearings

Rolling bearings are critical mechanical devices that find wide application in machinery: in power transmissions, compressors, turbines, and yes, on a smaller scale, also in your skateboard. Their mechanical task is simple: they allow the relative translation and rotation of two moving parts. Dust and debris find their way into bearings, causing performance degradation, oxidation, and a deterioration of mechanical properties. Identifying their health status is therefore crucial for many mechanical devices in the manufacturing industry.

The dataset collects accelerations recorded during normal (N) operation and during failures of the steel balls (B), of the inner ring (IR), or of the outer ring (OR); we will not consider the latter damage.

Damage detection architecture

We use a convolutional neural network to classify whether the rolling bearing is in the normal (N) state or in a damaged (B, IR) state. This is a supervised classification problem.

Let’s have an overview of the whole architecture before digging into the implementation details.

As the picture shows, we can distinguish four main conceptual phases.

  1. Data chunking

    First, we read the Matlab data in chunks, using tumbling time windowing, to mimic the deployment of the AI in production. In other words, the AI analyses the data stream at fixed short time intervals, monitoring in near real-time that the bearing operates normally. The fixed length of the time window is a parameter of the algorithm.

  2. Feature engineering

    We apply standard signal processing tools (wavelet, short-time Fourier, and q-transform) to extract features. For each short time series we extract three images, one for each transform, labelled in accordance with the original data, either (N) or (IR, B). The idea of this preprocessing step is to obtain images that represent the coupled time-frequency domain. In the following, we show the results obtained by training one specific CNN architecture on the short-time Fourier transform. Other kinds of signals (like audio signals) might take advantage of the q and wavelet transforms; both are included in the repo.

  3. Data shuffling

    After creating the image database, we shuffle the images and group them into a training and a hold-out dataset. As usual, the CNN will be trained on the former and evaluated on the latter.

  4. Convolutional neural network

    Finally, we train the CNN to classify whether an image represents normal operation or some type of bearing fault. We tried different CNN architectures from different papers and obtained excellent results with the simplest one, from Guo et al. (fewest trainable parameters).

Implementation

In our repo, we provide two implementations: first, a Jupyter notebook that shows how to implement, step by step, the CNN architecture described so far, based only on the short-time Fourier transform; second, a more sophisticated and detailed implementation for Google Colaboratory in the folder deep-predict/colab, which includes wavelet and q-transform images and cyclical learning rate finding.

The original dataset is hosted on a server that is not very friendly to automated downloading. We mirror the dataset on AWS S3 and download it with the Python requests package.
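
A minimal download sketch might look like the following; the bucket URL and file name below are placeholders, the actual locations are listed in the repo:

import requests

# Hypothetical mirror location – the real bucket URL and file names are in the repo.
BASE_URL = "https://our-s3-bucket.s3.amazonaws.com/cwru"
FILE_NAME = "97.mat"  # e.g. a normal baseline recording

response = requests.get(BASE_URL + "/" + FILE_NAME, timeout=60)
response.raise_for_status()

with open(FILE_NAME, "wb") as f:
    f.write(response.content)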

What the data looks like

The data in the dataset consists of many Matlab files. Most of the files contain two time series (vibration data), sampled at 12 kHz (12,000 data points per second). The two time series are accelerations recorded by sensors located at the drive end (labelled ‘DE’) and the fan end (labelled ‘FE’) of the motor housing. As we found some inconsistencies in the data (‘FE’ labels missing in some files), we opted to use only the recordings at the drive end.

The Matlab files are imported into Python using the scipy.io module, which returns a dictionary:

{'__header__': b'MATLAB 5.0 MAT-file, Platform: PCWIN, Created on: Mon Jan 31 15:28:20 2000',
 '__version__': '1.0',
 '__globals__': [],
 'X097_DE_time': array([[...]]),
 'X097_FE_time': array([[...]]),
 'X097RPM': array([[...]], dtype=uint16)}

The data corresponding to the ‘_DE_’ keys is what we will use, and we need to extract it from each file. This is easily done with a regex over the dictionary keys.
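
A sketch of this extraction, reusing the file 97.mat from the download example above, could look like this:

import re
from scipy import io

# Load the Matlab file into a Python dictionary.
mat_dict = io.loadmat("97.mat")

# Pick the drive-end time series: its key contains the pattern '_DE_'.
de_key = next(k for k in mat_dict if re.search(r"_DE_", k))
de_signal = mat_dict[de_key].squeeze()  # flatten the (N, 1) array to shape (N,)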

Furthermore, the key ‘X097RPM’ is the motor speed in rpm. As it varies for each experiment, it could strongly bias the classification. For this reason, we normalise the signals to reduce this potential bias. Finally, the figure below shows what 1/10 of a second of recorded data looks like in the states N, IR, and B.

Next, we perform some operations on the data. First of all, we will consider only:

  • the normal data labelled as `N`
  • the inner raceway data labelled as `IR`
  • the ball damage data labelled as `B`

that is, we skip the outer raceway `OR` data in this implementation.

Preprocessing

First, we join the data for every load condition, from `0…3` HP (horsepower), normalising it with respect to the standard deviation. This is because we want our CNN to classify the status independently of the load.
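
A minimal sketch of this step, assuming the drive-end signals of one class for the four load conditions have already been collected in a list `de_signals` (a hypothetical name), might be:

import numpy as np

# de_signals: list of four 1-D arrays, one per load condition (0...3 HP), same class label.
joined = np.concatenate(de_signals)

# Normalise with respect to the standard deviation so that the CNN cannot use
# the overall signal level (which depends on the load) as a shortcut.
joined = joined / np.std(joined)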

Then, we chunk and shuffle the data before feature extraction. We need a function to split any time series into a given number of chunks, dropping the remaining data that would not fit a chunk:

import numpy as np

def split_exact(x, n_chunks=2, axis=1):
    """Split x into n_chunks equal chunks along axis, dropping the trailing remainder."""
    length = np.shape(x)[axis]
    if not length > n_chunks > 1:
        return x
    usable = length - (length % n_chunks)  # largest multiple of n_chunks that fits
    if axis == 0:
        return np.split(x[:usable], n_chunks, axis=axis)
    return np.split(x[:, :usable], n_chunks, axis=axis)

In Python there are many mature tools for signal processing. For example, to calculate the short-time Fourier transform and return an image with the assigned shape, we use the modules scipy.signal and skimage:

import numpy as np
from scipy import signal
from skimage.transform import resize

def generate_spectrogram_image(data_y_vector, image_shape):
    """
    Calculate the spectrogram of the array data_y_vector and resize it to
    the image_shape resolution.
    """
    fs = 12000.  # sampling frequency of the dataset in Hz

    # frequency bins, time bins, and the spectrogram matrix
    f, t, sxx = signal.spectrogram(data_y_vector, fs)

    # normalise to [0, 1] and resize to the requested image resolution
    sxx = min_max_norm(sxx)
    sxx = resize(sxx, image_shape, mode='constant', anti_aliasing=True)

    return sxx

where the function:

def min_max_norm(ary):
    ary = (ary - ary.min()) / np.abs(ary.max() - ary.min())
    return ary

normalises the spectrogram in the range [0,1].

To get an idea of what the extracted features look like, let’s plot them for just one chunk, as in the sketch below.
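
For the short-time Fourier transform, a plotting sketch could look like this; it reuses `joined`, `split_exact`, and `generate_spectrogram_image` from above, and the number of chunks and the 64×64 resolution are illustrative choices, not the values behind the reported results:

import matplotlib.pyplot as plt

# Take the first of 100 tumbling-window chunks of the normalised drive-end signal.
chunk = split_exact(joined, n_chunks=100, axis=0)[0]

# Spectrogram image as it would be fed to the CNN.
sxx = generate_spectrogram_image(chunk, image_shape=(64, 64))

plt.imshow(sxx, origin='lower', aspect='auto')
plt.xlabel('time bins')
plt.ylabel('frequency bins')
plt.show()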

Can the CNN classify these images?

Short answer: Yes. And with very high precision, too.

The problem we want to solve is a supervised classification problem with three classes. We use a convolutional neural network implemented as a sequential Keras model. Our installation of Keras uses TensorFlow as the backend for the tensor arithmetic. This figure shows the different CNN layers we use, following the paper of Guo et al.

The previously generated images are stacked into an array of shape `(number of images, x pixels, y pixels, number of channels)`, which is the CNN input. The details of how to create an image dataset that can easily be split into train and test data with scikit-learn are given in the repo; they would be too lengthy to post here and, moreover, are quite straightforward.
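
As a minimal sketch, assuming `images` is the stacked image array described above and `labels` holds the corresponding one-hot encoded classes (both hypothetical names), the split could be done like this; the 20% test fraction is an illustrative choice:

from sklearn.model_selection import train_test_split

# images: array of shape (number of images, x pixels, y pixels, 1)
# labels: one-hot encoded classes N, IR, B of shape (number of images, 3)
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, shuffle=True, random_state=42)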

We use the categorical cross-entropy loss function, the Adam optimiser, and Leaky ReLU activation functions. Most notably, we apply the cyclical learning rate algorithm of Smith to find the learning rate. An implementation of this algorithm as a Keras callback is in the `./colab` folder of our repo. The Keras model is summarised in the following code snippet:

from keras import metrics
from keras.models import Sequential
from keras.layers import (Activation, Conv2D, Dense, Dropout, Flatten,
                          LeakyReLU, MaxPooling2D)

# KERNEL_SIZE, PADDING, and the other upper-case constants are the
# hyperparameters defined elsewhere in the notebook.
model = Sequential()
model.add(Conv2D(5, KERNEL_SIZE,
                 input_shape=IMAGES_SHAPE,
                 data_format='channels_last',
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
model.add(LeakyReLU(LEAK_ALPHA))
model.add(MaxPooling2D(pool_size=MAX_POOLING_POOL_SIZE))
model.add(Conv2D(10, KERNEL_SIZE,
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
model.add(LeakyReLU(LEAK_ALPHA))
model.add(MaxPooling2D(pool_size=MAX_POOLING_POOL_SIZE))
model.add(Conv2D(10, KERNEL_SIZE,
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
model.add(LeakyReLU(LEAK_ALPHA))
model.add(MaxPooling2D(pool_size=MAX_POOLING_POOL_SIZE))
model.add(Flatten())
model.add(Dense(100))
model.add(LeakyReLU(LEAK_ALPHA))
model.add(Dense(50))
model.add(LeakyReLU(LEAK_ALPHA))
model.add(Dropout(DROPOUT))
model.add(Dense(NUMBER_OF_CLASSES))
model.add(Activation(ACTIVATION_LAYER_FUNCTION))

model.compile(loss=LOSS_FUNCTION,
              optimizer=OPTIMIZER,
              metrics=[metrics.categorical_accuracy])
model.summary()
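
Training and evaluation on the hold-out images then follow the usual Keras pattern. A rough sketch, reusing the arrays from the split above; the batch size and number of epochs are illustrative assumptions, not the values behind the reported results:

# Train on the shuffled training images and monitor the hold-out set.
history = model.fit(x_train, y_train,
                    batch_size=32,
                    epochs=30,
                    validation_data=(x_test, y_test))

# Categorical accuracy on the hold-out set.
loss, accuracy = model.evaluate(x_test, y_test)
print('hold-out accuracy: {:.2f}'.format(accuracy))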

The performance of the model on the test data is very high, reaching 1.00 in both precision and recall. In a future blog post, we will show how to make this model more robust by adding noise to the signals and evaluating the cost/benefit trade-off of the results. So, stay tuned!

How to continue from here…

In the repo, we also provide methods to extract the wavelet and q-transforms of the signals. Some questions you might want to answer are:

  1. Which method works best? Try the wavelet and q-transform. How do the results change?
  2. How can you extend the training dataset? Or generate new test data that is compatible with the 40 seconds we have (which is not that much)?

If you have similar problems with other data and want to apply AI, or got lost trying to figure out the answers to these questions, don’t hesitate: get in touch via email!

References:
