IoT Analytics Platform

13.7.2016 | 15 minutes of reading time

The Internet of Things, a.k.a. the next industrial revolution, is the current hype, but what challenges do we face when consuming large amounts of data? One option is to collect all the data and post-process it in batches. The preferred way, however, is to analyze the latest data in real time or near real time.

To cope with the sheer amount of data, you need a platform that can scale with it. Both the platform and the software components running on it have to adjust to changing requirements as the influx of data changes.
The SMACK Stack (Spark, Mesos, Akka, Cassandra and Kafka) has proven to be a solid base for such a platform. Akka, Spark and Kafka take care of the vast amount of data, while Mesos, Marathon and DC/OS scale the platform.
My colleague Florian Troßbach already wrote a nice article about the SMACK Stack.
This blog post and the corresponding sources focus on the SACK (Spark, Akka, Cassandra and Kafka) components of the SMACK Stack. The basis for this showcase is data from the 171 bus routes of Los Angeles, which is freely available via a REST API (Metro-API: http://developer.metro.net/).

[Video: live demo of the showcase – current vehicle positions drawn on an OpenStreetMap]

As can be seen in the video, the current vehicle positions are drawn on top of an OpenStreetMap in real time; previously collected data can be queried from Cassandra and drawn on the map as well. The collected vehicle data can also be used to calculate hotspots where many vehicles meet.

Architecture

  • Ingest – Akka
  • Digest – Spark
  • UI – JavaScript, OpenStreetMap
  • Backend – Akka
The ingesting service initially collects the metadata from the Metro-API and stores it directly in Cassandra. Based on this metadata, the route details and vehicle information are queried from the Metro-API. Every 30 seconds the vehicle information is collected and published to Kafka.

The Spark digesting application reads the incoming vehicle information from Kafka, enhances it for more efficient retrieval by the frontend, stores the enhanced data in Cassandra and publishes it to Kafka as well.

As it would be rather boring to just collect the data and store it in Cassandra, a frontend is needed. The frontend uses the OpenLayers API to visualize the vehicle positions and bus route details on top of OpenStreetMap. The current positions are streamed directly from Kafka onto the map via WebSocket communication. Additionally, the hotspot clusters of vehicle meeting points can be drawn onto the map.
The following sections go into more detail about the main parts of the showcase.

    Data Ingest

The data ingest service is an example of how to read data into the IoT analytics platform.
Since it only requests the current data every 30 seconds from a publicly available API, it can't really be compared to the humongous amount of data produced by a real IoT scenario. On the other hand, the real-time ingestion of vehicle positions every 30 seconds is far more dynamic than other examples such as statically bound weather data.

The Scala actor framework Akka is used as the technological basis for the ingestion. When started, the service automatically retrieves the metadata and route information and stores those details directly in Cassandra.

The route metadata contains the IDs of the routes together with the geo coordinates of the corresponding bus stops. For every route there is also a display name available – the name shown on the buses themselves.
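A model for this metadata could look roughly like the following. This is a minimal sketch with illustrative field names; apart from routeInfo.id, which appears in the snippets below, the showcase's actual case classes may differ.

// Illustrative sketch of the route metadata model; the showcase's case classes may differ.
case class RouteStop(latitude: Double, longitude: Double, name: String)
case class RouteInfo(id: String, displayName: String, stops: Seq[RouteStop])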

Using this metadata, it is possible to periodically extract the vehicle information for the given bus routes. A dedicated actor takes care of the periodic extraction, triggering itself every 30 seconds.

val tick = context.system.scheduler.schedule(0 seconds, 30 seconds, self, Tick())

override def receive: Receive = {
  case Tick() => {
    log.info(s"extracting vehicles Infor for routeID: ${routeInfo.id}")
    extractVehicles(routeInfo.id)
  }
}

Akka uses the concept of "reactive streams" to cope with a ginormous amount of data. More about Akka can be found on the blog of Heiko Seeberger. The following snippet shows how a flow of data is created for every bus route:

Flow[Vehicle].map(elem => {
  log.info(s"publishing element: ${elem}")
  new ProducerRecord[Array[Byte], Vehicle]("METRO-Vehicles", elem)
}).to(producer).runWith(Source.actorPublisher(VehiclesActor.props(routeInfo, httpClient)))

The VehiclesActor acts as the publishing source and is triggered every 30 seconds. In this simple flow the data is streamed directly to the Kafka producer, which stores the incoming vehicle information in Kafka.
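The producer sink used in the flow above is not shown in the snippet. A minimal sketch of how such a sink could be wired up with akka-stream-kafka follows; the serializer name VehicleFstSerializer and the exact settings are assumptions, not necessarily the showcase's actual configuration.

import akka.kafka.ProducerSettings
import akka.kafka.scaladsl.Producer
import org.apache.kafka.common.serialization.ByteArraySerializer

// Hedged sketch: one way to build the Kafka sink for the flow above.
// Assumes an ActorSystem named "system" is in scope; VehicleFstSerializer
// stands in for whatever Kafka serializer the showcase registers for Vehicle.
val producerSettings = ProducerSettings(system, new ByteArraySerializer, new VehicleFstSerializer)
  .withBootstrapServers("localhost:9092")

val producer = Producer.plainSink(producerSettings)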

On the publishing side, the VehiclesActor retrieves the data and sends Vehicle objects back into the flow via the onNext method. In case the flow is not able to deliver this information, an internal buffer is used to hold the data until the flow can process it again. This follows the best practice for handling back pressure with Akka.

vehicles.items.foreach {
  vehicle =>
    {
      log.debug(vehicle.toString)
      log.debug("sending vehicle to stream sink")
      val vehicleToPersist = Vehicle(vehicle.id, Some(currTime.minusSeconds(vehicle.seconds_since_report).withMillisOfSecond(0).toDate), vehicle.latitude, vehicle.longitude, vehicle.heading, Some(routeInfo.id), vehicle.run_id, vehicle.seconds_since_report)
      log.debug(s"sending Vehicle ${vehicleToPersist}")
      if (buffer.isEmpty && totalDemand > 0) {
        log.info(s"Buffer Empty sending vehicle: ${vehicleToPersist}")
        onNext(vehicleToPersist)
      } else {
        log.info(s"Buffering vehicle: ${vehicleToPersist}")
        buffer :+= vehicleToPersist
        if (totalDemand > 0) {
          val (use, keep) = buffer.splitAt(totalDemand.toInt)
          buffer = keep
          log.info(s"Demand is greater 0 sending ${use}")
          use foreach onNext
        }
      }
    }
}

    Data Digestion

The data digestion service delivers the actual business value of an IoT analytics platform. In this part of the application, information can be gathered, transformed and optimized. In our example it is necessary to enhance the geo coordinates of the vehicles to enable efficient retrieval from Cassandra. This is a key requirement when using geo coordinates with Cassandra: partition keys of Cassandra tables can only be queried for equality, as those keys are stored as hash values. The hash values are used to determine on which node of the cluster the data resides. A hash value can't be used to isolate values within a range, so using the < and > operators on a partition key is not possible. More details on how exactly this works and how quadkeys help are covered later in this article.
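To make this concrete, the table for the enhanced vehicle data could be defined roughly as follows. This is a hypothetical sketch; the showcase's actual schema most likely contains more columns and may differ in details. The tile id acts as the partition key and can only be queried for equality, whereas the time is a clustering column and therefore supports range queries within one tile.

import com.datastax.driver.core.Cluster

// Hypothetical schema sketch for the vehicles_by_tileid table used later on;
// the showcase's actual table definition may differ.
val cluster = Cluster.builder().addContactPoint("localhost").build()
val session = cluster.connect()

session.execute(
  """CREATE TABLE IF NOT EXISTS streaming.vehicles_by_tileid (
    |  tileid text,       // partition key: equality lookups only
    |  time timestamp,    // clustering column: range queries possible per tile
    |  id text,
    |  latitude double,
    |  longitude double,
    |  PRIMARY KEY (tileid, time, id)
    |)""".stripMargin)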

Additionally, the enhanced data is written back to Kafka, so it can be streamed directly to the frontend.

In this simple example, the main purpose of the Spark job is to transform the geo coordinates of vehicles (Vehicle) into TiledVehicle objects.
Those TiledVehicles are needed to be able to search for vehicles geographically. Every vehicle gets a TileID which defines on which tile the coordinates of this vehicle are located. With this approach it's possible to do a geographical search for coordinates within a given bounding box.

val tiledVehicle = vehicle.map(vehicle => TiledVehicle(
  TileCalc.convertLatLongToQuadKey(vehicle.latitude, vehicle.longitude),
  TileCalc.transformTime(vehicle.time.get),
  vehicle.id,
  vehicle.time,
  vehicle.latitude,
  vehicle.longitude,
  vehicle.heading,
  vehicle.route_id,
  vehicle.run_id,
  vehicle.seconds_since_report
))

tiledVehicle.cache()

tiledVehicle.saveToCassandra("streaming", "vehicles_by_tileid")

tiledVehicle.foreachRDD(rdd => rdd.foreachPartition(f = tiledVehicles => {

  val producer: Producer[String, Array[Byte]] = new KafkaProducer[String, Array[Byte]](producerConf)

  tiledVehicles.foreach { tiledVehicle =>
    val message = new ProducerRecord[String, Array[Byte]]("tiledVehicles", new TiledVehicleEncoder().toBytes(tiledVehicle))
    producer.send(message)
  }

  producer.close()
}))

    Grouping of coordinates by a quadkey

To find geographic points, lines or areas within Cassandra, those geographic artifacts need to be identified by a unique identifier. This is a result of the way data structures are stored in Cassandra: partition keys can only be queried for equality. So if you look for a coordinate within an enclosing bounding box, the coordinate itself cannot be the partition key.

Therefore you need to group coordinates under one unique key. To do so, you define a quadkey which represents a certain area containing a set of coordinates. The map is separated into tiles using the same mechanism as Microsoft Bing maps: the earth is split into four quadrants, each consisting of one tile. If you split those tiles again into four tiles each, you are able to address any point on earth with a single key, the precision depending on the length of the key.
An image showing this can be found on the Microsoft blog here.

The same can be achieved by using geohashes, but those aren't as human-readable as quadkeys and therefore can't be verified as easily.

In the context of this example application we define the quadkey to be 15 characters long. With a tile size of 256×256 pixels, this spans a geographic area of roughly 1.5 × 1.5 km which is used for grouping coordinates. More details on the Bing maps tile system can be found at the corresponding link.
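The TileCalc.convertLatLongToQuadKey helper seen in the Spark job follows this tile system: project the coordinate to pixel coordinates at the chosen zoom level, derive the tile X/Y indices and interleave their bits into a string of the digits 0–3. A minimal sketch of such a conversion could look like this; the actual implementation in the showcase may differ in details.

import scala.math._

// Minimal sketch of a lat/long -> quadkey conversion following the Bing maps
// tile system; the showcase's TileCalc implementation may differ in details.
def latLongToQuadKey(latitude: Double, longitude: Double, levelOfDetail: Int = 15): String = {
  // clip to the valid Web Mercator range
  val lat = min(max(latitude, -85.05112878), 85.05112878)
  val lon = min(max(longitude, -180.0), 180.0)

  // project the coordinate to pixel coordinates at the given level of detail
  val sinLat = sin(lat * Pi / 180)
  val x = (lon + 180) / 360
  val y = 0.5 - log((1 + sinLat) / (1 - sinLat)) / (4 * Pi)
  val mapSize: Long = 256L << levelOfDetail
  val pixelX = min(max(x * mapSize + 0.5, 0), mapSize - 1).toLong
  val pixelY = min(max(y * mapSize + 0.5, 0), mapSize - 1).toLong

  // derive the tile indices and interleave their bits into a quadkey of length levelOfDetail
  val tileX = pixelX / 256
  val tileY = pixelY / 256
  (levelOfDetail to 1 by -1).map { i =>
    val mask = 1L << (i - 1)
    var digit = 0
    if ((tileX & mask) != 0L) digit += 1
    if ((tileY & mask) != 0L) digit += 2
    digit.toString
  }.mkString
}

All coordinates that fall into the same roughly 1.5 × 1.5 km tile produce the same 15-character key and therefore end up in the same Cassandra partition.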

    Viewing the live data

In comparison to static data, it is much easier to create a dynamic frontend showing the ever-changing data of buses on a map. With this in mind, let us look at the last part of the chain: the visualization. It has been kept as simple as possible, using OpenLayers v3 in combination with OpenStreetMap. The following data can be visualized on the map:

• Current positions of vehicles (buses)
• Positions of vehicles in the past (the last 15 min)
• Information about vehicles and routes
• Visualization of bus routes and their waypoints
• Visualization of clusters of vehicles

To visualize previous positions of vehicles, a REST service has been created with Akka HTTP. This service extracts the data for a given bounding box from Cassandra. A bounding box is a rectangle which encloses the viewable portion of the map. To draw current vehicle positions, a WebSocket service is used to stream the data directly from Kafka into the frontend.
To show the vehicles of a given bounding box a simple Akka HTTP route is used:

def vehiclesOnBBox = path("vehicles" / "boundingBox") {
  corsHandler {
    parameter('bbox.as[String], 'time.as[String] ? "5") { (bbox, time) =>
      get {
        marshal {
          val boundingBox: BoundingBox = toBoundingBox(bbox)

          val askedVehicles: Future[Future[List[Vehicle]]] = (vehiclesPerBBox ? (boundingBox, time)).mapTo[Future[List[Vehicle]]]
          askedVehicles.flatMap(future => future)
        }
      }
    }
  }
}

This route asks the vehiclesPerBBox actor to retrieve the vehicles for the given bounding box.

override def receive(): Receive = {
  case (boundingBox: BoundingBox, time: String) => {
    log.info("received a BBox query")
    val eventualVehicles = getVehiclesByBBox(boundingBox, time)
    log.info(s"X: ${eventualVehicles}")
    sender() ! eventualVehicles
  }
  case _ => log.error("Wrong request")
}

Analogous to saving the vehicle positions, the quadkeys (TileIDs) are calculated for the given bounding box. The combination of the calculated TileIDs with a timestamp is used to retrieve the corresponding vehicle data within the bounding box. That data is then visualized on the map.
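A minimal sketch of how such a lookup could be implemented with the DataStax Java driver (3.x) is shown below; the column names, the result type and the showcase's actual getVehiclesByBBox implementation are assumptions and may differ.

import com.datastax.driver.core.Session
import scala.collection.JavaConverters._

// Illustrative result type; the showcase maps rows to its own Vehicle case class instead.
case class VehiclePosition(id: String, time: java.util.Date, latitude: Double, longitude: Double)

// Hedged sketch: query each tile id covered by the bounding box for equality
// and restrict the time range via the clustering column.
def vehiclesByBBox(session: Session, tileIds: Set[String], since: java.util.Date): List[VehiclePosition] = {
  val stmt = session.prepare(
    "SELECT id, time, latitude, longitude FROM streaming.vehicles_by_tileid WHERE tileid = ? AND time > ?")

  tileIds.toList.flatMap { tileId =>
    session.execute(stmt.bind(tileId, since)).all().asScala.map { row =>
      VehiclePosition(row.getString("id"), row.getTimestamp("time"),
        row.getDouble("latitude"), row.getDouble("longitude"))
    }
  }
}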

The same method is used to retrieve bus route metadata and clusters of buses.
Live data, on the other hand, is pushed onto the map via a WebSocket connection. It behaves analogously to the way data is retrieved from Cassandra: a bounding box is sent to the service, and data matching the tile IDs of that bounding box is retrieved, as the enhanced vehicle data is also available from Kafka.

The request handler checks whether the given connection is a WebSocket connection. In that case a new Akka flow for retrieving the data is started. This flow connects with another actor to pipe the data from Kafka onto the web page.

val requestHandler: HttpRequest => HttpResponse = {
  case req@HttpRequest(GET, Uri.Path("/ws/vehicles"), _, _, _) =>
    req.header[UpgradeToWebSocket] match {
      case Some(upgrade) => upgrade.handleMessages(Flows.graphFlowWithStats(router))
      case None => HttpResponse(400, entity = "Not a valid websocket request!")
    }
  case _: HttpRequest => HttpResponse(404, entity = "Unknown resource!")
}

The router actor takes care of the communication between a newly created publishing actor, which is bound to the incoming bounding box, and the WebSocket response channel to the web browser.

class RouterActor extends Actor with ActorLogging {
  var routees = Set[Routee]()

  def receive: Receive = {
    case ar: AddRoutee => {
      log.info(s"add routee ${ar.routee}")
      routees = routees + ar.routee
    }
    case rr: RemoveRoutee => {
      log.info(s"remove routee ${rr.routee}")
      routees = routees - rr.routee
    }
    case msg: Any => {
      routees.foreach(_.send(msg, sender))
    }
  }
}

This router actor is available from the start of the application, while the consuming actors are created on the fly for each newly created flow from WebSocket to publisher. The router actor keeps track of all consuming actors and dispatches incoming data from Kafka to them.

class TiledVehiclesFromKafkaActor(router: ActorRef) extends Actor with ActorLogging {

  import scala.concurrent.ExecutionContext.Implicits.global
  implicit val materializer = ActorMaterializer()

  //Kafka
  val consumerSettings = ConsumerSettings(context.system, new ByteArrayDeserializer, new TiledVehicleFstDeserializer,
    Set("tiledVehicles"))
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")

  val source = Consumer.atMostOnceSource(consumerSettings.withClientId("Akka-Client"))
  source.map(message => message.value).runForeach(vehicle => router ! vehicle)

  override def receive: Actor.Receive = {
    case _ => // just ignore any messages
  }
}

If a new request with a bounding box is sent via the WebSocket connection, a new publishing actor instance is created and registered with the router; this publisher then receives the bounding box.

def receive: Receive = {
  case bbox: BoundingBox => {
    log.info("received BBox changing behavior")
    tileIds = TileCalc.convertBBoxToTileIDs(bbox)
    log.info(s"${tileIds.size} tiles are requested")
    unstashAll()
    become(streamAndQueueVehicles, discardOld = false)
  }
  case msg => stash()
}

When the bounding box request reaches the new actor, it changes its behaviour and acts only on vehicle data which lies within the enclosing bounding box. That data is sent to the frontend via the WebSocket.

def streamAndQueueVehicles: Receive = {

  // receive new stats, add them to the queue, and quickly
  // exit.
  case tiledVehicles: TiledVehicle =>
    // remove the oldest one from the queue and add a new one
    if (queue.size == MaxBufferSize) queue.dequeue()
    if (tileIds.contains(tiledVehicles.tileId)) {
      queue += Vehicle(tiledVehicles.id, tiledVehicles.time, tiledVehicles.latitude, tiledVehicles.longitude, tiledVehicles.heading, tiledVehicles.route_id, tiledVehicles.run_id, tiledVehicles.seconds_since_report)

      if (!queueUpdated) {
        queueUpdated = true
        self ! QueueUpdated
      }
    }
  // we receive this message if there are new items in the
  // queue. If we have a demand for messages send the requested
  // demand.
  case QueueUpdated => deliver()

  // the connected subscriber requests n messages, we don't need
  // to explicitly check the amount, we use the totalDemand property for this
  case Request(amount) =>
    deliver()

  // subscriber stops, so we stop ourselves.
  case Cancel =>
    context.stop(self)

  case stringMsg: String => {
    if ("close" == stringMsg) {
      log.info("closing websocket connection")
      become(receive, discardOld = true)
      router ! Cancel
    }
  }
}

If no further data can be sent via the WebSocket connection, the actor stops itself.

    Clustering of vehicle positions

An analytics platform should not only show moving vehicles; it reveals its true potential by enabling analysis of the collected data. To demonstrate this, a sample Spark application was created to find clusters of vehicle positions, showing at which coordinates vehicles meet. The density-based clustering algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is used to find those clusters. The blog of Natalino Busa has been a great inspiration for choosing this algorithm.

    First we select all vehicle coordinates for a given time range from Cassandra:

val vehiclesPos: Array[Double] = vehiclesRdd
  .flatMap(vehicle => Seq[(String, (Double, Double))]((s"${vehicle.id}_${vehicle.latitude}_${vehicle.longitude}", (vehicle.latitude, vehicle.longitude))))
  .reduceByKey((x, y) => x)
  .map(x => List(x._2._1, x._2._2)).flatMap(identity)
  .collect()

This array is transformed back into an RDD and mapped to a dense matrix, where the matrix consists of all coordinates as latitude and longitude.

val vehiclePosRdd: RDD[Array[Double]] = sc.parallelize(seqOfVehiclePos)

val denseMatrixRdd: RDD[DenseMatrix[Double]] = vehiclePosRdd.map(vehiclePosArray => DenseMatrix.create[Double](vehiclePosArray.length / 2, 2, vehiclePosArray))

The actual calculation can then be performed on this dense matrix:

val clusterRdd: RDD[GDBSCAN.Cluster[Double]] = denseMatrixRdd.map(dm => dbscan(dm)).flatMap(identity)

The DBSCAN algorithm is applied to this dense matrix of latitudes and longitudes. A cluster exists where at least three points lie within a range of about 50 meters of each other (the epsilon of 0.0005 degrees corresponds to roughly 50 meters of latitude).

def dbscan(v: breeze.linalg.DenseMatrix[Double]): Seq[GDBSCAN.Cluster[Double]] = {
  log.info(s"calculating cluster for denseMatrix: ${v.data.head}, ${v.data.tail.head}")
  val gdbscan = new GDBSCAN(
    DBSCAN.getNeighbours(epsilon = 0.0005, distance = Kmeans.euclideanDistance),
    DBSCAN.isCorePoint(minPoints = 3)
  )
  val clusters = gdbscan.cluster(v)
  clusters
}

The result needs some cleansing and filtering and is then stored back in Cassandra, from where it can easily be selected by the frontend to be shown on the map.
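A sketch of what such post-processing could look like, assuming each cluster has already been reduced to its member coordinates; the GDBSCAN result type, the hotspot model and the target table are assumptions, and the showcase's actual cleansing logic may differ.

import org.apache.spark.rdd.RDD
import com.datastax.spark.connector._

// Illustrative hotspot model; not the showcase's actual schema.
case class Hotspot(latitude: Double, longitude: Double, numPoints: Int)

// Hedged sketch: given clusterPointsRdd: RDD[Seq[(Double, Double)]] (assumed),
// drop tiny clusters, reduce each remaining cluster to its centroid and
// persist it for the frontend. The table "streaming.hotspots" is an assumption.
val hotspots: RDD[Hotspot] = clusterPointsRdd
  .filter(_.size >= 3)
  .map { points =>
    val (latSum, lonSum) = points.foldLeft((0.0, 0.0)) {
      case ((latAcc, lonAcc), (lat, lon)) => (latAcc + lat, lonAcc + lon)
    }
    Hotspot(latSum / points.size, lonSum / points.size, points.size)
  }

hotspots.saveToCassandra("streaming", "hotspots")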

    Noteworthy

The ingest and the backend of the UI are completely based on Akka and Scala 2.11. But since it is common to use Spark in the context of SMACK, you are usually bound to or better off with Scala 2.10, as that is the preferred Scala version for Spark at the time of writing. So the sample application ended up supporting both Scala versions. The showcase uses the sbt-doge SBT plugin, which enables cross compilation during the build process. This turned out to be especially useful for the commonly used case classes and other helper classes of the commons module, such as the utility classes for calculating the tile IDs or for serializing and deserializing case classes in the context of Kafka.
Fast-Serialization is used as the serialization framework. It turned out to be very effective at serializing and deserializing objects across the two Scala versions.
While creating the multi-module project, the goal was to keep the build as simple as possible. Therefore all build and project configuration is located in the root build.sbt file.
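A build definition along the following lines could realize this setup; module names, Scala versions and the plugin version are illustrative, and the showcase's actual build.sbt may differ.

// In project/plugins.sbt (version is illustrative):
// addSbtPlugin("com.eed3si9n" % "sbt-doge" % "0.1.5")

// Root build.sbt: the commons module is cross-built for both Scala versions,
// the Spark module sticks to 2.10, the Akka-based modules use 2.11.
lazy val commons = (project in file("commons"))
  .settings(crossScalaVersions := Seq("2.10.6", "2.11.8"))

lazy val ingest = (project in file("akka-ingest"))
  .settings(scalaVersion := "2.11.8")
  .dependsOn(commons)

lazy val digest = (project in file("spark-digest"))
  .settings(scalaVersion := "2.10.6")
  .dependsOn(commons)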

    Conclusion

The analytics platform shown here can not only be used for IoT data, but for a variety of scenarios with lots of incoming data. This is especially true for scenarios with many parallel and continuously incoming data flows. A platform like this is certainly better suited for those scenarios than a classical REST application storing its data in an RDBMS.
Even though it uses Cassandra with all its benefits as storage, this solution revealed a minor issue: with an RDBMS it is easier to query ranges on the primary key, which is not supported by Cassandra. This can easily be worked around by choosing a quadkey (TileID) and grouping data points within dedicated tiles. With that in place, Cassandra is able to shine.
The current solution of using quadkeys to group geo coordinates within tiles could be optimized further. It only uses one static zoom level; additional zoom levels with a higher aggregation of points could help boost performance. To achieve this boost, an optimized query structure would also be needed. For example, the use of Hilbert space-filling curves to retrieve the optimal set of tile IDs for a given geographical shape could be a significant improvement.
