Window Functions in Stream Analytics

11.10.2018 | 11 minutes of reading time

Introduction to Stream Analytics

Why should we talk about stream analytics? In the past decades data analytics was dominated by batch processing. Records from transactional databases were copied into analytical databases by regular extract-transform-load (ETL) jobs when business was not running. Reports were generated nightly by aggregating huge batches of data.

These analytical databases compute batch queries quickly, but the speed comes at a price: reduced flexibility and higher latency. The database schema is designed to give maximum performance for the queries needed by the report. If the business requests a new report, it can take several weeks or even months to modify the whole process. As data comes in only at night, reports only get updated the next day.

Nowadays many businesses do not work from 9 to 5 anymore. Customers expect services to be available 24/7 from anywhere in the world. Analysts need up-to-date information, and with the rise of machine learning, automated near real-time actions have become state of the art. This is why many companies are switching their architecture from ETL-based batch processing towards stream processing.

In a data-stream-driven architecture, services emit individual records, so-called events. Whoever is interested in the events of a particular service can simply subscribe to them and will receive them as soon as they are available. Because every service can have access to the raw events, it does not have to wait for ETL processes to finish. The following figure illustrates individual events happening over time.

But how do we perform analytics on streams? How can we generate reports based on moving data rather than fixed batches? Streams are by definition unbounded, so if we are looking for an aggregated view, e.g. the number of clicks on our website coming from a particular country, we need to introduce some boundary. In the batch world this boundary was heavily influenced by the ETL schedule. In stream analytics we are free to discretize the stream any way we want in order to perform aggregations on top.

Discretizing the stream into groups of events is called windowing. Windowing can technically be done based on any attribute of your events as long as it has an order. Nevertheless it is most commonly done based on time. There are some subtleties to take into account, however.

One question is which timestamp to use when assigning an event to a window: the time when the event was generated, or the time when it arrived at the processor? If you use the creation time, be aware that event producers might not have properly synchronized clocks. There are techniques to deal with these kinds of issues, but they are beyond the scope of this post. Instead, we want to take a closer look at the different window functions that can be used to perform aggregations on data streams.

The remainder of the post is structured as follows. First we introduce the four common types of window functions: tumbling window, hopping window, sliding window, and session window. Afterwards we look at the different tools and products available on the market and what functionality they provide in terms of window functions. We close the post by summarizing and discussing the main findings.

Window Functions

Definition

A window function assigns events in your stream to windows. To be precise, it is a window relation rather than a function, because it theoretically does not have to assign all events to windows, i.e. it is not total, and it can assign an event to multiple windows. In addition, it is neither surjective (not all windows have to contain events) nor injective (a window can contain multiple events). Nevertheless we are going to stick to the mathematically incorrect term window function.

Given a window function and a stream of data, we can compute aggregates on the events inside each window. As mentioned earlier, an event might be assigned to multiple windows, or a window might have no events assigned at all, depending on the selected window function. This is important to keep in mind when working with the derived stream of aggregates. For example, a graph of event counts based on overlapping windows will look very different from a graph based on counts computed from distinct windows.

Martin Kleppmann mentions four commonly used window functions [1]: Tumbling window, hopping window, sliding window and session window. The next sections are going to explain each of them in detail.

Tumbling Window

A tumbling window has a fixed length. The next window is placed right after the end of the previous one on the time axis. Tumbling windows do not overlap and span the whole time domain, i.e. each event is assigned to exactly one window. You can implement tumbling windows by rounding down the event time to the nearest window start. The following animation illustrates a tumbling window of length 1.

Because tumbling windows are only configured through a single property, the window length s, and they include every event exactly once, they are often used for simple reporting. You can use tumbling windows to sum all incoming requests towards your server within a 1 minute window and then display a graph where each minute corresponds to one data point.
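
The rounding-down rule mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine implements it. The function names, the integer timestamps in seconds, and the half-open interval convention [start, start + s) are assumptions made for this example.

```python
from collections import Counter


def tumbling_window_start(event_time, length):
    """Round the event time down to the start of its tumbling window."""
    return (event_time // length) * length


def count_per_window(event_times, length):
    """Count events per tumbling window, keyed by window start.

    Assumes windows are half-open intervals [start, start + length)
    aligned to time 0.
    """
    counts = Counter(tumbling_window_start(t, length) for t in event_times)
    return dict(counts)


# Hypothetical click timestamps in seconds.
events = [3, 9, 10, 27]
print(count_per_window(events, length=10))
```

With a window length of 10, the events at 3 and 9 fall into the window starting at 0, while the events at 10 and 27 fall into the windows starting at 10 and 20. Windows without events simply do not appear in the result.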

The Azure Stream Analytics query below represents an example of counting the number of clicks on your website based on the country of the visitor, grouped in a 10 second tumbling window. We are using the creation timestamp to calculate the window assignment.

SELECT Country, Count(*) AS Count
FROM ClickStream TIMESTAMP BY CreatedAt
GROUP BY Country, TumblingWindow(second, 10)

Hopping Window

Like tumbling windows, hopping windows have a fixed length. However, they introduce a second configuration parameter: the hop size h. Instead of moving the window of length s forward in time by s, we move it by h.

This means that tumbling windows are a special case of hopping windows where s = h. If s > h windows are overlapping and if s < h some events might not be assigned to any window. The following animation illustrates a hopping window of length 1 with hop size 0.25. It is common to choose h to be a fraction of s.

Hopping windows where h is a fraction of s can be implemented by computing tumbling windows of size h and aggregating them into a bigger hopping window. A common use case for hopping windows is the computation of moving averages.
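
The idea of building hopping windows from smaller tumbling windows can be sketched as follows. This is only an illustration under stated assumptions: the function name is made up, timestamps are integer seconds, s must be a multiple of h, and windows are half-open intervals [start, start + s) aligned to time 0.

```python
from collections import Counter


def hopping_counts(event_times, length, hop):
    """Count events per hopping window by summing tumbling panes of size hop.

    Returns a dict mapping each window start to its event count.
    """
    if length % hop != 0:
        raise ValueError("length must be a multiple of hop")

    # Step 1: tumbling windows ("panes") of size hop, keyed by pane start.
    panes = Counter((t // hop) * hop for t in event_times)

    # Step 2: a hopping window starting at w covers the panes
    # w, w + hop, ..., w + length - hop. Equivalently, the pane starting
    # at p contributes to the windows p, p - hop, ..., p - length + hop.
    counts = Counter()
    for pane_start, n in panes.items():
        for k in range(length // hop):
            counts[pane_start - k * hop] += n
    return dict(counts)


# Hypothetical click timestamps in seconds, window length 4, hop size 2.
print(hopping_counts([1, 3, 4], length=4, hop=2))
```

Because s > h here, every event is counted in s / h = 2 windows. Setting hop equal to length reduces the sketch to a tumbling window, where each event is counted exactly once.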

The Azure Stream Analytics query below represents an example of a moving average of the number of clicks per second on your website, based on the country of the visitor and grouped in a 10 second window hopping 2 seconds. Again we are using the creation timestamp to calculate the window assignment.

-- Clicks per second, averaged over each 10 second window
SELECT Country, Count(*) / 10.0 AS AverageClicksPerSecond
FROM ClickStream TIMESTAMP BY CreatedAt
GROUP BY Country, HoppingWindow(second, 10, 2)

Sliding Window

Sliding windows can be viewed as hopping windows with h → 0. While they discretize the input stream, the derived aggregated stream is not discrete. A sliding window moves continuously along the time axis, grouping together events that happen within the window length s.

However, as our data points are discrete, we can implement a sliding window by moving forward based on actual events rather than continuously in time. A new window is created whenever an event enters or exits the sliding window as it moves forward. This mathematically corresponds to a deduplication of all possible windows based on the set of events that have been assigned to them. The following figure illustrates a sliding window of length 1.

Sliding windows are, for example, used to compute moving averages. What makes them unique is that their resolution is based on the event time pattern in your stream rather than being fixed. If events are denser, you get a higher resolution of your moving aggregate. If no events are coming in, the aggregate stream stays the same without emitting new values.

Note that sliding windows are not always implemented the same way. In some tools the aggregation computation is only triggered when a new event enters the window but not if an old event exits. Make sure to check the documentation or source code of the tool you are using.
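
The event-driven view of a sliding window can be sketched like this. It is a simplified illustration, not the behaviour of a specific product. The function name, the integer timestamps, and the convention that the window ending at time T covers the interval (T - s, T] are assumptions. This variant emits a value both when an event enters and when it exits the window.

```python
def sliding_window_counts(event_times, length):
    """Return (time, count) pairs for every time the window content changes.

    The window ending at time T is assumed to cover (T - length, T], so an
    event at time t is inside the window for all T with t <= T < t + length.
    """
    # +1 when an event enters the window, -1 when it exits `length` later.
    changes = {}
    for t in event_times:
        changes[t] = changes.get(t, 0) + 1
        changes[t + length] = changes.get(t + length, 0) - 1

    result = []
    count = 0
    for time in sorted(changes):
        count += changes[time]
        result.append((time, count))
    return result


# Hypothetical click timestamps in seconds, window length 10.
for time, count in sliding_window_counts([0, 4, 12], length=10):
    print(time, count)
```

An implementation that only triggers on entering events would emit values at times 0, 4, and 12 only, and would skip the updates at 10, 14, and 22 where events leave the window.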

The Azure Stream Analytics query below represents an example of a moving average of the number of clicks per second on your website, based on the country of the visitor and grouped in a 10 second sliding window. Again we are using the creation timestamp to calculate the window assignment.

-- Clicks per second, averaged over each 10 second window
SELECT Country, Count(*) / 10.0 AS AverageClicksPerSecond
FROM ClickStream TIMESTAMP BY CreatedAt
GROUP BY Country, SlidingWindow(second, 10)

Session Window

In contrast to the previous window functions session windows have a variable length. When using a session window function you need to specify a time threshold between consecutive events that must not be exceeded. The window will keep expanding as long as new events are coming in that are close enough in time. The animation below illustrates a session window with a threshold of 0.5.

You can implement a session window by keeping the current events in a buffer and adding new events as long as they are within the specified session interval. As streams are unbounded, sessions can theoretically grow indefinitely. Thus some implementations take a second parameter which represents the maximum session time or the maximum number of events per session.

Session windows are useful to group together events that are expected to be related when they happen in close succession. The name suggests the prominent use case for this window function: Grouping clicks inside user sessions on your website. As long as the user keeps clicking within a short period of time your window function will aggregate all clicks in one session.
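
The buffer-based approach described above can be sketched as follows. This is a minimal illustration with assumed names and integer timestamps. The optional max_duration parameter here simply closes a session once its first and latest event are too far apart, which is a simplification and may differ from how individual products define the maximum session time.

```python
def session_windows(event_times, gap, max_duration=None):
    """Group event times into sessions.

    An event joins the current session if it arrives at most `gap` after
    the previous event and, if `max_duration` is set, at most
    `max_duration` after the first event of the session.
    """
    sessions = []
    for t in sorted(event_times):
        if sessions:
            current = sessions[-1]
            within_gap = t - current[-1] <= gap
            within_max = max_duration is None or t - current[0] <= max_duration
            if within_gap and within_max:
                current.append(t)
                continue
        # Either there is no session yet or the event does not fit: start a new one.
        sessions.append([t])
    return sessions


# Hypothetical click timestamps in seconds.
clicks = [0, 3, 7, 15, 17]
print(session_windows(clicks, gap=5))
print(session_windows(clicks, gap=5, max_duration=6))
```

Without a maximum duration, the first three clicks form one session because each follows the previous one within 5 seconds. With a maximum duration of 6, the click at 7 starts a new session even though it arrived only 4 seconds after the click at 3.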

The Azure Stream Analytics query below represents an example of a click count on your website based on the country of the visitor, grouped in a session window with a 5 second timeout that lasts at most 10 seconds. Again we are using the creation timestamp to calculate the window assignment.

SELECT Country, Count(*) AS Count
FROM ClickStream TIMESTAMP BY CreatedAt
GROUP BY Country, SessionWindow(second, 5, 10)

Window Functions in Practice

In the previous section we looked at the theory behind tumbling, hopping, sliding, and session windows. Now we want to get some insight into which window functions are available in the different tools and products on the market. The table below compares the availability of different window functions in the following tools and products: Flink, Kafka Streams, Azure Stream Analytics, Google Cloud Dataflow, and Amazon Kinesis Data Analytics.

Flink and Kafka Streams are open source frameworks. Azure Stream Analytics, Google Cloud Dataflow, and Amazon Kinesis Data Analytics are proprietary, managed solutions by public cloud providers. Only preconfigured window functions are taken into consideration. Some tools, e.g. Flink, allow the definition of custom window functions, which gives a great deal of flexibility.

Tumbling windows are supported by all tools, although in Google Cloud Dataflow they are called fixed time windows. Hopping windows are supported by all tools except Amazon Kinesis Data Analytics. However, both Flink and Dataflow call them sliding windows, which is inconsistent with the terminology introduced in the previous section.

Sliding windows are supported only by Azure Stream Analytics and Amazon Kinesis Data Analytics. Kafka Streams uses sliding windows for stream joins, but you cannot aggregate on them directly. Session windows are available in all tools except Amazon Kinesis Data Analytics.

Amazon provides an alternative to tumbling windows called stagger windows which are non-overlapping fixed-length windows aligned with event timestamps. According to the documentation stagger windows are the recommended way to aggregate data using time-based windows, because they reduce late or out-of-order data compared to tumbling windows.

Both Flink and Google Cloud Dataflow offer global windows. Global windows are a trick to aggregate over all data within the stream that is available up to the point the window is triggered. Because the computation of global aggregates is expensive, they can only be triggered manually. Google Cloud Dataflow also provides other custom window functions such as interval windows and calendar windows.

When writing the comparison between the different tools and products I spent a lot of time reading documentation and I might have missed something. If you find a mistake or have a remark regarding the table above please leave a comment!

Conclusion

In this post we have seen how window functions play an important role in stream analytics. Using concepts like tumbling windows, hopping windows, sliding windows, session windows, or other window functions we are able to compute aggregates on an unbounded data stream.

Today every good stream processing engine provides different windowing functions. Which one you should pick depends on your use case, as they produce different results and differ in how complex they are to compute. By migrating your batch jobs to streaming jobs you are able to report results in near real-time and react to important events in your business quickly.

Which stream processing engine is your favourite? Which window functions do you typically use and why? I’m looking forward to discussing with you in the comments 🙂

References

[1] Kleppmann, Martin. Designing data-intensive applications: The big ideas behind reliable, scalable, and maintainable systems. O’Reilly Media, Inc., 2017.

Blog author

Frank Rosner
