Convolutional neural networks for damage detection

11.3.2019 | 10 minutes of reading time

Damage detection from sensor data is the basis of predictive maintenance. Essentially, one needs to discriminate the normal from the anomalous (damaged) status and to estimate the severity of the damage in order to choose the right course of action (maintenance, repair, replacement). In this blog post, we use sensor data from a rolling bearing to show, step by step, how to implement a convolutional neural network architecture with Keras for anomaly detection.


In the last couple of years, many Artificial Intelligence architectures for anomaly and damage detection have flourished. We can distinguish mainly two approaches: the first one is based on supervised learning and the second one on unsupervised learning.

Supervised learning

The supervised approach makes sense when the available data has a balanced number of normal and anomalous events. This happens, for example, with data coming from experimental tests. Usually, engineers create a test setup in a laboratory to assess how damage influences the sensor readings. A frequently used alternative is to simulate the test setup with computational models. Either way, engineers can generate as much data as needed to characterise the normal operational status and the failure status. The labelled data (normal vs failure) can then be used to train supervised ML/AI architectures, and the training benefits greatly from the balanced data. This chain of simulation, tests and training is a rather old-school damage detection technique, but it is still necessary in many applications where a system failure would affect human safety: automotive, aircraft, infrastructure, and so on, whose reliability levels are regulated and have to be certified.

Unsupervised learning

Sometimes engineering systems are so complex that neither lab tests nor simulations are economically feasible. In this case, one alternative is to use the real data coming from monitoring the structure, and to wait (and hope) to find hints on how damage affects the data. In this data-driven detection approach, anomalies happen rarely, as opposed to the previous balanced case, and with very few anomaly labels AI training can be very tricky. Loosely speaking, this is the classical problem of finding the needle in the haystack: a huge amount of normal labels and scarcely any failure labels. In this case, one can formulate an unsupervised problem, feeding the whole haystack to the AI algorithm. The logic here is that, using automatic feature extraction and dimensionality reduction techniques, one can find hidden correlations that are signatures of an incipient anomaly. This is the preferred approach for IoT applications, where failures are more likely to cause economic rather than social losses.

In this blog post, we present an example based on the supervised approach; we will describe the unsupervised approach in a future post. We show here, step by step, how to use deep learning to classify the healthy versus damaged status of a rolling bearing.

The dataset

For this use case, we examine a very popular rolling bearing dataset, made available by the Case Western Reserve University. We compare different architectures from different papers and show how to implement them with Keras. But first of all, what are bearings and why do we care about them?

Rolling bearings

Rolling bearings are critical mechanical components that find wide application in machinery: for example, in power transmissions, compressors, turbines, and yes, on a smaller scale, also in your skateboard. Their mechanical task is simple: they allow the relative translation and rotation of two moving parts. Dust and debris find their way into bearings, which causes worsening performance, oxidation and degradation of the mechanical properties. Identifying their health status is therefore crucial for many mechanical devices in the manufacturing industry.

The dataset collects acceleration signals recorded during normal (N) operation and with damage to the steel balls (B), the inner ring (IR) or the outer ring (OR); we will not consider the latter.

Damage detection architecture

We use a convolutional neural network to classify whether the rolling bearing is in the normal (N) state or in one of the damaged states (B, IR). This is a supervised classification problem.

Let’s have an overview of the whole architecture before digging into the implementation details.

As the picture shows, we can distinguish four main conceptual phases.

  1. Data chunking

    First, we read the Matlab data in chunks, using tumbling time windows, to mimic the deployment of the AI in production. In other words, the AI analyses the data stream at fixed short time intervals, checking in near real time that the bearing operates normally. The fixed length of the time window is a parameter of the algorithm.

  2. Feature engineering

    We apply standard signal processing tools (wavelet, short-time Fourier and q-transform) to extract features. For each small time series we extract three images, one for each transform, labelled in accordance with the original data, either (N) or (IR, B). The idea of this preprocessing operation is to have images that represent the coupled time-frequency domain. In the following, we show the results obtained by training one specific CNN architecture on the short-time Fourier transform. Other kinds of signals (audio, for example) might take advantage of the q-transform and the wavelet transform, which are included in the repo.

  3. Data shuffling

    After creating the image database, we shuffle the images and group them in the training and the hold-out dataset. So, as usual, the CNN will be trained on the former, and evaluated on the latter. 

  4. Convolutional neural network

    Finally, we train the CNN to classify whether an image represents a normal operation or some type of fault of the bearing. We tried different CNN architectures from different papers and found an excellent result with the simplest one of Guo et al. (fewer trainable parameters). 


In our repo, we provide two implementations: first, a Jupyter notebook that shows how to implement, step by step, the CNN architecture described so far, based only on the short-time Fourier transform; second, a more sophisticated and detailed implementation for Google Colaboratory in the folder deep-predict/colab, which includes wavelet and q-transform images and cyclical learning rate finding.

The original dataset is hosted on a server that is not very friendly to automated downloading. We therefore mirror the dataset on AWS S3 and download it with the Python requests package.

What the data looks like

The data in the dataset consists of many Matlab files. Most of the files contain two time series (vibration data), sampled at 12 kHz (12000 data points per second). The two time series are accelerations recorded by sensors located at both the drive end (labelled ‘DE’) and the fan end (labelled ‘FE’) of the motor housing. As we found some inconsistency in the data (‘FE’ labels missing in some files), we opted to use just the recording at the drive end. 

The Matlab files are imported into Python with the scipy.io module, which returns a dictionary:

{'__header__': b'MATLAB 5.0 MAT-file, Platform: PCWIN, Created on: Mon Jan 31 15:28:20 2000',
 '__version__': '1.0',
 '__globals__': [],
 'X097_DE_time': array([[...]]),
 'X097_FE_time': array([[...]]),
 'X097RPM': array([[...]], dtype=uint16)}

The data under the key containing ‘_DE_’ is what we will use, and we need to extract it from each file. This is easily done by matching the dictionary keys with a regular expression.
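A minimal sketch of this key lookup could look as follows. The helper name `find_drive_end_key` and the toy dictionary are illustrative and not taken from the repo; in practice the dictionary would come from `scipy.io.loadmat`.

```python
import re


def find_drive_end_key(mat_dict):
    """Return the first dictionary key that contains '_DE_'."""
    pattern = re.compile(r"_DE_")
    for key in mat_dict:
        if pattern.search(key):
            return key
    raise KeyError("no drive-end ('_DE_') time series found")


# Toy stand-in for the dictionary shown above (values shortened).
example = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "X097_DE_time": [[0.05], [0.08]],
    "X097_FE_time": [[0.03], [0.01]],
    "X097RPM": [[1796]],
}

de_key = find_drive_end_key(example)
drive_end_signal = example[de_key]
```

Searching for the `_DE_` fragment rather than hard-coding `X097_DE_time` keeps the lookup independent of the file number that prefixes each key.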

Furthermore, the key ‘X097RPM’ holds the motor speed in rpm. As it varies between experiments, it could strongly bias the classification. For this reason, we normalise the signals to reduce this potential bias. Finally, the figure below shows what 1/10 of a second of recorded data looks like in the states N, IR, and B.

Next, we perform some operations on the data. First of all, we consider only:

  • the normal data labelled as `N`
  • the inner raceway data labelled as `IR`
  • the ball damage data labelled as `B`

that is, we skip the outer raceway `OR` data in this implementation.


First, we join the data for every load condition, from `0…3` HP (horsepower), normalising it with respect to the standard deviation. This is because we want our CNN to classify the status independently of the load.
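As a rough sketch of this step, the snippet below scales each recording to unit standard deviation and then concatenates the recordings. The function name `normalise_by_std` and the random toy signals are assumptions made for illustration; whether the repo normalises before or after joining is not stated here, so normalising each recording separately is also an assumption.

```python
import numpy as np


def normalise_by_std(x):
    """Scale a 1-D signal so that its standard deviation is 1."""
    x = np.asarray(x, dtype=float)
    return x / x.std()


# Toy stand-ins for recordings taken at two different loads (not real data).
rng = np.random.default_rng(0)
recordings = [
    rng.normal(scale=0.1, size=1000),
    rng.normal(scale=2.0, size=1000),
]

# Assumption: each recording is normalised on its own before joining.
joined = np.concatenate([normalise_by_std(r) for r in recordings])
```

After this step, both toy recordings contribute samples with the same spread, although their original amplitudes differed by a factor of about 20.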

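Then, we chunk and shuffle the data before feature extraction. We need a function to split any time series into the given number of chunks, dropping the remaining data that would not fill a chunk:

import numpy as np


def split_exact(x, n_chunks=2, axis=1):
    """Split x into n_chunks equal chunks along axis, dropping the remainder."""
    length = np.shape(x)[axis]
    x_split = x
    if length > n_chunks > 1:
        # Keep only the samples that fill n_chunks equal chunks. An explicit
        # end index also works when length % n_chunks == 0, whereas
        # x[:-0] would return an empty array.
        usable = length - (length % n_chunks)
        if axis == 0:
            x_split = np.split(x[:usable], n_chunks, axis=axis)
        elif axis == 1:
            x_split = np.split(x[:, :usable], n_chunks, axis=axis)
    return x_split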
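The tumbling time windows mentioned earlier can also be expressed in terms of a fixed window length instead of a number of chunks. The following is a small sketch of that variant; the name `tumbling_windows` and the window length of 1000 samples are illustrative choices, not values from the repo.

```python
import numpy as np


def tumbling_windows(x, window_length):
    """Cut a 1-D signal into consecutive, non-overlapping windows.

    Samples at the end that do not fill a whole window are dropped.
    """
    x = np.asarray(x).ravel()
    n_windows = len(x) // window_length
    return x[: n_windows * window_length].reshape(n_windows, window_length)


# One second of toy data at 12 kHz, cut into windows of 1000 samples.
toy_signal = np.arange(12000, dtype=float)
windows = tumbling_windows(toy_signal, window_length=1000)
```

Each row of `windows` is one chunk that would later be turned into an image.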

In Python there are many mature tools for signal processing. For example, to calculate the short-time Fourier transform and return an image with an assigned shape (dpi), we use the modules scipy.signal and skimage:

import numpy as np
from scipy import signal
from skimage.transform import resize


def generate_spectrogram_image(data_y_vector, image_shape):
    """
    Calculate the spectrogram of an array data_y_vector and resize it in
    the image_shape resolution
    """
    fs = 12000.  # sampling frequency in Hz

    f, t, sxx = signal.spectrogram(
        data_y_vector,
        fs)

    sxx = min_max_norm(sxx)  # scale to [0, 1]; defined below
    sxx = resize(sxx, image_shape, mode='constant', anti_aliasing=True)

    return sxx

where the function:

def min_max_norm(ary):
    ary = (ary - ary.min()) / np.abs(ary.max() - ary.min())
    return ary

normalises the spectrogram in the range [0,1].
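To get a feeling for what `signal.spectrogram` returns before any resizing, here is a small self-contained sketch on a synthetic signal. The 1.5 kHz sine wave is a toy input chosen for illustration, not bearing data, and the min-max scaling is written inline so that the snippet runs on its own.

```python
import numpy as np
from scipy import signal

fs = 12000.0  # sampling frequency in Hz, as in the dataset
time = np.arange(1200) / fs  # 0.1 s worth of samples
x = np.sin(2 * np.pi * 1500.0 * time)  # synthetic 1.5 kHz tone (toy input)

# f: frequency bins, seg_times: segment centres, sxx: power per bin and segment
f, seg_times, sxx = signal.spectrogram(x, fs)

# Min-max scaling to [0, 1], equivalent to min_max_norm above
sxx_norm = (sxx - sxx.min()) / (sxx.max() - sxx.min())

# Frequency bin with the highest average power
peak_frequency = f[np.argmax(sxx_norm.mean(axis=1))]
```

The rows of `sxx` correspond to frequencies and the columns to time segments, which is why the resized array can be treated as a single-channel image of the time-frequency domain.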

To get an idea of what the features we extract look like, let’s plot them just for one chunk.

Can the CNN classify these images?

Short answer: Yes. And with very high precision, too.

The problem we want to solve is a supervised classification problem with three classes. We use a convolutional neural network implemented as a sequential Keras model. Our Keras installation uses TensorFlow as the backend for the tensor arithmetic. This figure shows the different CNN layers we use, following the paper of Guo et al.

The previously generated images form an array of shape `(number of images, x dpi, y dpi, number of channels)`, which is the CNN input. The details on how to create an image dataset that can be easily split into train and test data with scikit-learn are given in the repo; they would be too lengthy to post here and are, moreover, quite straightforward.
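A compressed sketch of the idea is shown below. It is a NumPy-only stand-in for the scikit-learn based splitting used in the repo, so the helper names `shuffle_and_split` and `one_hot`, the hold-out fraction of 0.25 and the integer class encoding (0 = N, 1 = IR, 2 = B) are all assumptions made for illustration.

```python
import numpy as np


def shuffle_and_split(images, labels, test_fraction=0.25, seed=0):
    """Shuffle images and labels together, then split off a hold-out set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_test = int(round(test_fraction * len(images)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return images[train_idx], images[test_idx], labels[train_idx], labels[test_idx]


def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    return np.eye(n_classes)[labels]


# Toy data: 8 blank 32x32 single-channel images with integer class labels.
images = np.zeros((8, 32, 32, 1))
labels = np.array([0, 1, 2, 0, 1, 2, 0, 1])

x_train, x_test, y_train, y_test = shuffle_and_split(images, labels)
y_train_one_hot = one_hot(y_train, n_classes=3)
```

The one-hot encoding is included because a categorical cross-entropy loss expects one row per image with a single 1 in the column of the true class.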

We use the categorical cross-entropy loss function, the Adam optimiser and the Leaky ReLU activation function. Most noteworthy, we apply the cyclical learning rate algorithm of Smith to find the learning rate. An implementation of this algorithm as a Keras callback is in the `./colab` folder of our repo. The Keras model is summarised in the following code snippet:

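from keras import metrics
from keras.layers import Conv2D
from keras.models import Sequential

# KERNEL_SIZE, IMAGES_SHAPE, KERNEL_INITIALIZER, PADDING and OPTIMIZER
# are constants defined in the repo.
model = Sequential()
model.add(Conv2D(5, KERNEL_SIZE,
                 input_shape=IMAGES_SHAPE,
                 data_format='channels_last',
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
# [...] (lines omitted in this excerpt)
model.add(Conv2D(10, KERNEL_SIZE,
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
# [...] (lines omitted in this excerpt)
model.add(Conv2D(10, KERNEL_SIZE,
                 kernel_initializer=KERNEL_INITIALIZER,
                 padding=PADDING))
# [...] (remaining layers omitted in this excerpt)
model.compile(loss='categorical_crossentropy',  # loss as stated in the text
              optimizer=OPTIMIZER,
              metrics=[metrics.categorical_accuracy])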
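For readers unfamiliar with cyclical learning rates, the following sketch shows the triangular schedule from Smith's paper in plain Python. It only illustrates how the learning rate moves between a lower and an upper bound; it is not the Keras callback from the repo, and the function name and the numeric bounds are illustrative assumptions.

```python
import math


def triangular_lr(iteration, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate.

    The rate rises linearly from base_lr to max_lr over step_size
    iterations, then falls back to base_lr over the next step_size.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# One full cycle with a half-cycle length of 4 iterations.
schedule = [
    triangular_lr(i, step_size=4, base_lr=1e-4, max_lr=1e-2)
    for i in range(9)
]
```

In this toy schedule the rate starts at `base_lr`, peaks at iteration 4 and returns to `base_lr` at iteration 8.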

The performance of the model is very high for the test data, reaching 1.00 both in precision and recall. In a future blog post, we will show you how to add robustness to this model, adding noise to the signals and evaluating the cost/benefit analysis of the results. So, stay tuned!
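As a reminder of what these two numbers measure, here is a small NumPy sketch that computes precision and recall per class from integer labels. The function name and the toy predictions are made up for illustration and contain mistakes on purpose; they are not results of the model.

```python
import numpy as np


def precision_recall(y_true, y_pred, n_classes):
    """Return per-class precision and recall as two lists."""
    precisions, recalls = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # correctly predicted c
        fp = np.sum((y_pred == c) & (y_true != c))  # predicted c, was not c
        fn = np.sum((y_pred != c) & (y_true == c))  # was c, predicted other
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return precisions, recalls


# Toy labels with one mistake: a sample of class 1 predicted as class 2.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
precisions, recalls = precision_recall(y_true, y_pred, n_classes=3)
```

A value of 1.00 for both metrics on every class means that no test image was assigned to the wrong class.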

How to continue from here…

In the repo, we also provide methods to extract the wavelet and q-transform of the signals. Some questions you might want to answer are:

  1. Which method works better? Try the wavelet and q-transform. How do the results change?
  2. How can you extend the training dataset, or generate new test data that is compatible with the 40 seconds of recordings we have (which is not much)?

If you have similar problems with other data and want to apply AI or got lost in trying to figure out the answers to these questions, don’t hesitate: Get in touch via email!

