True KVM Live Migration with OpenStack Icehouse and Ceph based VM storage

16.3.2015 | 12 minutes of reading time

Intro

As mentioned before (for example in Fabian’s The CenterDevice Cloud Architecture Revisited post from December 2014), our document management product CenterDevice runs on top of infrastructure virtualized by OpenStack.
Where that older post was more application focused, this one covers a particularly nasty problem that plagued us for some time: being unable to migrate virtual machines from one bare metal hypervisor host to another without interruption. By the end of this article you will see how we have overcome a series of obstacles on the way to successful live migrations in OpenStack Icehouse for KVM virtual machines using Ceph/Rados Block Device based volumes for data storage.

System setup

At the time of writing our cluster sports 12 bare metal servers. 6 of these are dedicated OpenStack compute nodes, with 4 more serving as Ceph storage cluster nodes. The remaining 2 are OpenStack controllers.
All storage is provided to virtual machines as OpenStack Cinder volumes backed by Ceph virtual block devices. One of the main reasons for this setup is that it allows for easy migration of virtual machines from one physical host to the next without also having to bring along large amounts of storage across the network.

Migrating Virtual Machines

OpenStack by default enables “regular” migrations, i. e. migrations where a virtual machine needs to be shut down to then be rebooted on another host. This incurs a service interruption inside the virtual machine. Ideally you would want to be able to seamlessly move the VM across physical servers without the OS and software inside it even noticing. Depending on the hypervisor type and the surrounding setup this is generally feasible.

With the instance to be migrated (the source VM) still running, its memory content is sent to the destination host. The source hypervisor keeps track of which memory pages are modified on the source while the transfer is in progress. Once the initial bulk transfer is complete, pages changed in the meantime are transferred again. This is done repeatedly with (ideally) ever smaller increments.

As long as the differences can be transferred faster than the source VM dirties memory pages, the remaining delta eventually becomes small enough and the source VM gets suspended. Final differences are sent to the target host and an identical machine is started there. At the same time the virtual network infrastructure takes care of all traffic being directed to the new virtual machine. Once the replacement machine is running, the suspended source instance is deleted. Usually the actual handover takes place so quickly and seamlessly that only very time sensitive applications ever notice anything.
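To make the convergence argument concrete, here is a small simulation of the iterative transfer described above. It is only a sketch: the function name, the parameters and the 50 MB "small enough to suspend" threshold are made up for illustration and do not reflect how KVM actually decides when to pause the guest.

```python
def simulate_precopy(ram_mb, dirty_mb_per_s, link_mb_per_s,
                     downtime_limit_mb=50, max_rounds=30):
    """Simulate iterative pre-copy of guest memory.

    Returns (rounds, total_sent_mb, final_mb, converged). All parameters
    are illustrative assumptions, not values taken from KVM or libvirt.
    """
    to_send = float(ram_mb)  # the first round transfers all guest memory
    total_sent = 0.0
    for round_no in range(1, max_rounds + 1):
        seconds = to_send / link_mb_per_s
        total_sent += to_send
        # Pages dirtied while this round was on the wire must be sent again.
        to_send = min(float(ram_mb), dirty_mb_per_s * seconds)
        if to_send <= downtime_limit_mb:
            # Small enough: suspend the VM and send the remainder in one go.
            return round_no, total_sent + to_send, to_send, True
    # The guest dirties memory as fast as (or faster than) we can copy it.
    return max_rounds, total_sent, to_send, False


if __name__ == "__main__":
    print(simulate_precopy(ram_mb=4096, dirty_mb_per_s=100, link_mb_per_s=1000))
    print(simulate_precopy(ram_mb=4096, dirty_mb_per_s=1200, link_mb_per_s=1000))
```

With a dirty rate well below the link speed the increments shrink quickly; once the dirty rate reaches the link speed the loop never converges, which is the situation in which a live migration cannot complete.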

Since only memory is transferred, a prerequisite for this kind of live migration is shared storage, for which we use Ceph. OpenStack supports this, but you need to enable “true live migration” as described in the OpenStack Admin Guide. It boils down to adding the following to the /etc/nova/nova.conf file:

live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_TUNNELLED
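The flag names in that setting correspond to constants of libvirt's virDomainMigrateFlags enum, which are combined into a single bitmask. The following sketch shows that combination. The numeric values are the ones from the libvirt enum as far as I know them; the helper names are hypothetical, and this is not Nova's own code.

```python
# Bit values as defined in libvirt's virDomainMigrateFlags enum.
LIBVIRT_MIGRATE_FLAGS = {
    "VIR_MIGRATE_LIVE": 1,
    "VIR_MIGRATE_PEER2PEER": 2,
    "VIR_MIGRATE_TUNNELLED": 4,
    "VIR_MIGRATE_PERSIST_DEST": 8,
    "VIR_MIGRATE_UNDEFINE_SOURCE": 16,
}


def migration_bitmask(flag_string):
    """OR together the comma-separated flag names (hypothetical helper)."""
    mask = 0
    for name in flag_string.split(","):
        name = name.strip()
        if name:
            # Raises KeyError for names this sketch does not know about.
            mask |= LIBVIRT_MIGRATE_FLAGS[name]
    return mask


def is_live(flag_string):
    """True if the configured flags request a live migration."""
    return bool(migration_bitmask(flag_string)
                & LIBVIRT_MIGRATE_FLAGS["VIR_MIGRATE_LIVE"])


if __name__ == "__main__":
    configured = ("VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,"
                  "VIR_MIGRATE_LIVE,VIR_MIGRATE_TUNNELLED")
    print(migration_bitmask(configured), is_live(configured))
```

Leaving VIR_MIGRATE_LIVE out of the list is what turns the operation into a "regular" migration with the guest paused during the transfer.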

Sounds easy enough, so where’s the catch?

Problem #1

With our cluster set up as described above, this is what happened when I tried to live-migrate a VM from one host to the next:

[daniel.schneller@control01]➜ nova list
+--------------+--------+--------+------------+-------------+---------------------+
| ID           | Name   | Status | Task State | Power State | Networks            |
+--------------+--------+--------+------------+-------------+---------------------+
| a1564ec8-... | dstest | ACTIVE | -          | Running     | testnet=192.168.1.2 |
+--------------+--------+--------+------------+-------------+---------------------+

[daniel.schneller@control01]➜ nova live-migration dstest node10

[daniel.schneller@control01]➜ tail -n20 /var/log/nova/nova-compute.log
Live migration of instance a1564ec8-... to host node10 failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nova/api/openstack/compute/contrib/admin_actions.py", line 282, in _migrate_live
    disk_over_commit, host)
  File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 94, in inner
    return f(self, context, instance, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/nova/compute/api.py", line 1960, in live_migrate
    disk_over_commit, instance, host)
  File "/usr/lib/python2.7/dist-packages/nova/scheduler/rpcapi.py", line 96, in live_migration
    dest=dest))
  File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
    return rpc.call(context, self._get_topic(topic), msg, timeout)
  File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 102, in call
    return _get_impl().call(cfg.CONF, context, topic, msg, timeout)
  File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 712, in call
    rpc_amqp.get_connection_pool(conf, Connection))
  File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 368, in call
    rv = list(rv)
  File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 336, in __iter__
    raise result
RemoteError: Remote error: InvalidCPUInfo_Remote Unacceptable CPU info: CPU doesn't have compatibility.

Notice the last line. Apparently there is some difference between CPUs. So let us see what kinds of CPUs the hypervisors have (some lines removed for brevity). First the source host the virtual machine lives on at the moment:

[daniel.schneller@node05]➜ cat /proc/cpuinfo
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
stepping        : 7
cpuid level     : 13
flags           : fpu … (many more)

Then its designated new home:

[daniel.schneller@node10]➜ cat /proc/cpuinfo
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
stepping        : 2
cpuid level     : 11
flags           : fpu … (not as many as above)

The source host CPU is of more recent vintage. Unless configured otherwise, KVM will map the underlying host CPU’s features into any virtual machine that gets started on it. This is good for performance, because the guest OS can better leverage the hardware’s power. But as a downside, a live migration can only target hosts whose CPUs are identical to or more capable than the source’s; otherwise the guest operating system, not knowing the hardware was “hot swapped” underneath, might try to use features not present on the new host, leading to crashes. For regular migrations this is not a problem, because they involve rebooting the guest.
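The compatibility rule boils down to a subset test: every CPU feature the guest has seen on the source must also exist on the destination. A minimal sketch of that test on /proc/cpuinfo text follows. The function names are hypothetical and the flag lists in the example are abridged, made-up samples rather than the full output of our hosts.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_destination(source_flags, dest_flags):
    """Features a guest started on the source might use but the destination lacks."""
    return sorted(source_flags - dest_flags)


if __name__ == "__main__":
    # Abridged sample data for illustration only.
    source = parse_cpu_flags("model : 45\nflags : fpu vme sse4_2 aes avx\n")
    dest = parse_cpu_flags("model : 44\nflags : fpu vme sse4_2 aes\n")
    print(missing_on_destination(source, dest))
```

An empty result means the destination offers everything the source does; any entry in the list is a feature the guest could trip over after the move.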

Fix #1

In our case we gladly accept a slightly smaller CPU feature set over the potentially slightly better performance, because with it comes full migration flexibility. To ensure CPU compatibility across all VMs and hypervisors we can instruct Nova/KVM to report a specific CPU model and set of features to guests. We can figure out which model that would ideally be with the following set of commands.

[daniel.schneller@control01] ~/tmp ➜ pdsh 'node0[1-9],node10' 'sudo virsh capabilities | xmllint --xpath "/capabilities/host/cpu" - > ~daniel.schneller/tmp/$(hostname).xml'
[daniel.schneller@control01] ~/tmp ➜ cat node*.xml >> all-cpus.xml
[daniel.schneller@control01] ~/tmp ➜ sudo virsh cpu-baseline all-cpus.xml
<cpu mode='custom' match='exact'>
  <model fallback='allow'>Westmere</model>
  <vendor>Intel</vendor>
  ...
</cpu>

The first command assumes a few things:

I can connect to all relevant hypervisor hosts via SSH
I can do password-less sudo there
xmllint is installed on them
My home directory resides on shared storage
~/tmp exists.

On each node it queries the hypervisor with virsh capabilities, then extracts only the relevant CPU element from the XML. The result is written into a file per host. The second command then combines all the separate XML files into a single one. The third command then uses virsh’s built-in mechanism to resolve multiple sets of CPU capabilities into a baseline that they all support.
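Before handing the combined file to virsh it can be useful to see what was actually collected. The sketch below lists the CPU model each host reported and the explicitly listed extra features that all hosts share. It assumes the combined file is simply several cpu elements in a row, as produced by the commands above, and wraps them in a hypothetical root element so they parse as one document. It is not a replacement for virsh cpu-baseline, which also knows which features each model name implies.

```python
import xml.etree.ElementTree as ET


def summarize_cpus(concatenated_xml):
    """Return (models, common_features) for concatenated <cpu> elements.

    common_features only covers <feature name='...'/> entries that are listed
    explicitly; features implied by the model name are not resolved here.
    """
    # Several root elements in a row are not well-formed XML, so wrap them.
    root = ET.fromstring("<hosts>" + concatenated_xml + "</hosts>")
    models = []
    common = None
    for cpu in root.findall("cpu"):
        models.append(cpu.findtext("model"))
        features = {f.get("name") for f in cpu.findall("feature")}
        common = features if common is None else common & features
    return models, sorted(common or [])


if __name__ == "__main__":
    # Shortened, made-up sample in the shape of `virsh capabilities` output.
    sample = """
    <cpu><arch>x86_64</arch><model>SandyBridge</model>
      <feature name='pdpe1gb'/><feature name='vmx'/></cpu>
    <cpu><arch>x86_64</arch><model>Westmere</model>
      <feature name='vmx'/><feature name='pdcm'/></cpu>
    """
    print(summarize_cpus(sample))
```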

In our case we learned that Westmere describes the intersection of all host CPU features. So using Ansible I made sure that all hypervisors had the following entries in /etc/nova/nova-compute.conf:

[DEFAULT]
compute_driver=libvirt.LibvirtDriver
[libvirt]
virt_type=kvm
# Define custom CPU restriction to the lowest
# common subset of features across all hypervisors.
# Otherwise live migrations will fail when trying
# to move from a more modern CPU to an older one.
cpu_mode=custom
cpu_model=Westmere
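Since a single hypervisor with a diverging setting is enough to break migrations later, it may be worth verifying the file on every host. The following sketch uses Python's configparser for that. The function name and the idea of returning a list of problems are my own illustration; only the section and option names come from the configuration shown above.

```python
import configparser


def check_cpu_settings(conf_text, expected_model="Westmere"):
    """Return a list of problems found in nova-compute.conf style text."""
    # interpolation=None so that '%' characters in other options do not matter.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    problems = []
    mode = parser.get("libvirt", "cpu_mode", fallback=None)
    model = parser.get("libvirt", "cpu_model", fallback=None)
    if mode != "custom":
        problems.append("cpu_mode is %r, expected 'custom'" % mode)
    if model != expected_model:
        problems.append("cpu_model is %r, expected %r" % (model, expected_model))
    return problems


if __name__ == "__main__":
    with open("/etc/nova/nova-compute.conf") as handle:
        for problem in check_cpu_settings(handle.read()):
            print(problem)
```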

After that the nova-compute service needs to be restarted on all hypervisors. This can be done without affecting running virtual machines, because it only restarts the service that is (among other things) responsible for spawning new virtual machines, but is not required for active VMs to function.

Problem #2

Unfortunately even with this obstacle out of the way the migration would still fail with the same error message as before. Turns out, there is a problem with the CPU comparison code in Nova. See Nova Bug Ticket #1082414 for details. It boils down to the wrong set of CPUs being compared – instead of checking whether the source’s virtual CPU can be supported by the target’s real CPU, the code compares the two physical CPUs for compatibility, bringing us back to square 1.

Fix #2

While the bug is going to be fixed in a newer (at the time of writing yet to be released) OpenStack version, the patch is too big to be back-ported to OpenStack Icehouse. So as an interim solution [1] I simply disabled the broken check as discussed in comments #26 and following of the above-mentioned bug.

Important: Once this patch is in place, nothing prevents you from migrating instances to incompatible hosts! Even though we specified a custom CPU model earlier (fix #1), virtual machines that were launched prior to that change cannot know about the new limitations! Before live-migrating any virtual machines you must make sure to reboot them once to make them pick up the new CPU type!
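One way to find instances that still need that reboot is to look at the CPU definition in their libvirt domain XML, for example as printed by virsh dumpxml on the hypervisor. The sketch below assumes that the XML of a running domain reflects the CPU configuration it was started with; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def guest_cpu_model(domain_xml):
    """Return (mode, model) from the <cpu> element of a libvirt domain XML."""
    cpu = ET.fromstring(domain_xml).find("cpu")
    if cpu is None:
        return None, None
    return cpu.get("mode"), cpu.findtext("model")


def needs_reboot(domain_xml, expected_model="Westmere"):
    """True if the guest does not (yet) run with the custom baseline CPU model."""
    return guest_cpu_model(domain_xml) != ("custom", expected_model)


if __name__ == "__main__":
    # Shortened, made-up domain definition for illustration.
    sample = """
    <domain type='kvm'>
      <name>instance-sample</name>
      <cpu mode='custom' match='exact'>
        <model fallback='allow'>Westmere</model>
      </cpu>
    </domain>
    """
    print(guest_cpu_model(sample), needs_reboot(sample))
```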

Problem #3

So. Now we should be able to live-migrate, right? Well… Wrong…!

The next problem came down to an unfortunate oversight on our part. Even though it is listed as a requirement in the Configure migrations chapter, we did not ensure that the Nova instances directory (typically /var/lib/nova/instances) was mounted on shared storage. This led to the following error in the source host’s /var/log/nova/nova-compute.log:

RemoteError: Remote error: RemoteError Remote error:
InvalidSharedStorage_Remote node10 is not on shared storage: Live
migration can not be used without shared storage.

To determine the presence of shared storage Nova performs a (too) simple check: It tries to create a temporary file in the instance directory of the virtual machine to be migrated and checks if that file can be seen at the same path on the destination host. In our case that naturally failed, because that path resides on a local drive on each hypervisor, even though the VM volumes reside on shared storage. Same as before, apparently this whole part of the code is going through major refactoring for future OpenStack releases, but that did not exactly help me.
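The idea behind that check can be reproduced in a few lines. In the sketch below two local directories stand in for the instances directory on the two hypervisors; in reality each function would run on a different host. The function names are hypothetical and this is an illustration of the principle, not Nova's code.

```python
import os
import tempfile


def create_test_file(instances_path):
    """On one host: create an empty marker file and return its name."""
    fd, path = tempfile.mkstemp(dir=instances_path)
    os.close(fd)
    return os.path.basename(path)


def sees_test_file(instances_path, filename):
    """On the other host: storage counts as shared only if the marker shows up."""
    return os.path.exists(os.path.join(instances_path, filename))


if __name__ == "__main__":
    host_a = tempfile.mkdtemp()  # stands in for the directory on host A
    host_b = tempfile.mkdtemp()  # a separate, local directory on host B
    marker = create_test_file(host_a)
    print(sees_test_file(host_a, marker))  # same directory, i.e. shared
    print(sees_test_file(host_b, marker))  # local directories, i.e. not shared
```

The test says nothing about where the VM's disks live, which is why it fails for us even though all volumes are on Ceph.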

Fix #3

I was already looking for the right spot to remove that check, too, when I came across this old mailing list thread “Live migration of VM using librbd and OpenStack”, discussing this exact issue. The final message in that thread conveniently has the right place identified already and a valuable hint thrown in for free:

Just for posterity, my ultimate solution was to patch nova on each compute host to always return True in _check_shared_storage_test_file (nova/virt/libvirt/driver.py)
This did make migration work with “nova live-migration”, with one caveat. Since Nova is assuming that /var/lib/nova/instances is on shared storage (and since I hard coded the check to say “yes, it really is”), it thinks the /var/lib/nova/instances/ folder will exist at both source and destination, and makes no attempt to create it on the destination.

This is the complete patch we apply on new compute nodes (including both the CPU check mentioned above and the shared storage workaround):

--- libvirt/driver.py.orig  2014-08-21 19:20:10.000000000 +0200
+++ libvirt/driver.py   2015-02-27 10:09:17.830455657 +0100
@@ -4234,9 +4234,10 @@
             disk_available_mb = \
                     (disk_available_gb * units.Ki) - CONF.reserved_host_disk_mb
 
-        # Compare CPU
-        source_cpu_info = src_compute_info['cpu_info']
-        self._compare_cpu(source_cpu_info)
+        # Compare CPU -- Daniel Schneller: Disabled due to
+        # https://bugs.launchpad.net/nova/+bug/1082414
+        # source_cpu_info = src_compute_info['cpu_info']
+        # self._compare_cpu(source_cpu_info)
 
         # Create file on storage, to be checked on source host
         filename = self._create_shared_storage_test_file()
@@ -4399,11 +4400,22 @@
 
         Cannot confirm tmpfile return False.
         """
-        tmp_file = os.path.join(CONF.instances_path, filename)
-        if not os.path.exists(tmp_file):
-            return False
-        else:
-            return True
+        # Daniel Schneller: Nova assumes live migration also
+        # implies shared storage for instance metadata (libvirt.xml)
+        # and checks this by creating a tempfile in that directory,
+        # verifying it can be seen from source and destination of
+        # the migration. This would prevent live migration for us
+        # unnecessarily. We return True here, no matter what, faking
+        # shared storage. Cleverly Nova itself even seems to copy
+        # the instance metadata over again in a later step.
+        # This will have to be reviewed in later OpenStack versions,
+        # where improved handling has already been announced.
+        return True
+        #tmp_file = os.path.join(CONF.instances_path, filename)
+        #if not os.path.exists(tmp_file):
+        #    return False
+        #else:
+        #    return True
 
     def _cleanup_shared_storage_test_file(self, filename):
         """Removes existence of the tmpfile under CONF.instances_path."""

As noted in the patch, in Icehouse Nova creates the console.log and libvirt.xml files on the destination hypervisor, provided the instance directory already exists. Also, since it assumes shared storage, it does not clean up the source directory once the migration is complete.

Finally!

With the above patches and modifications in place, live migration now works as follows:

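Determine the VM’s UUID, e. g. with nova show or nova list.
Pick the new destination host and create the VM’s instance directory there, i. e. a directory named after the UUID below /var/lib/nova/instances/.
Ensure that directory has the correct ownership, e. g. with chown nova:nova.
Perform the actual migration with nova live-migration, passing the VM and the destination host.
Once the migration has succeeded, remove the VM’s old instance directory (and only that directory) below /var/lib/nova/instances/ from the old host.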
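The steps above can be sketched as a small dry-run helper that only builds the commands instead of executing them. It assumes that the per-VM directory is named after the VM's UUID below /var/lib/nova/instances, and that the hypervisors are reachable via SSH with password-less sudo as in the earlier examples. The function name and the UUID in the example are made up.

```python
import posixpath
import shlex
import uuid


def migration_plan(vm_uuid, source_host, dest_host,
                   instances_path="/var/lib/nova/instances"):
    """Return the commands for one live migration as a list of argument lists."""
    # Refuse anything that is not a UUID, so an empty value can never turn
    # the cleanup step into a removal of the whole instances directory.
    vm_uuid = str(uuid.UUID(vm_uuid))
    instance_dir = posixpath.join(instances_path, vm_uuid)
    return [
        ["ssh", dest_host, "sudo", "mkdir", "-p", instance_dir],
        ["ssh", dest_host, "sudo", "chown", "nova:nova", instance_dir],
        ["nova", "live-migration", vm_uuid, dest_host],
        # Only run this last step after the migration has finished successfully.
        ["ssh", source_host, "sudo", "rm", "-rf", instance_dir],
    ]


if __name__ == "__main__":
    plan = migration_plan("a1564ec8-0000-4000-8000-000000000000",
                          "node05", "node10")
    for command in plan:
        print(" ".join(shlex.quote(part) for part in command))
```

Printing the plan first and running the commands by hand keeps a human in the loop, which seems appropriate as long as the last step deletes data.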

The time needed for the migration is usually in the range of several seconds, sometimes up to a few minutes. This primarily depends on the RAM size, its rate of change inside the virtual machine, and the speed of the network connecting source and destination hypervisors.

Limitations / Caveats

While the above procedure generally works flawlessly, the necessity for the manual creation and deletion of directories is unfortunate and a potential source of errors.

The CPU compatibility issue is less likely to cause trouble in the future. As we have full control over the VMs running in our cluster, we can make sure each VM gets rebooted at least once before it is migrated. And because we will most certainly not add new compute nodes with CPUs inferior to the Westmere models we presently have in our servers, the baseline feature set now configured will work fine for the foreseeable future, too.

To get rid of the manual directory handling, we will probably move /var/lib/nova/instances to CephFS in the coming months. At the moment we only use CephFS for roaming home directories. Once the instances directory lives there, the second part of the above patch can be reverted.

Conclusion

In this post I compiled a comprehensive summary on how to enable true Live Migration with OpenStack Icehouse for KVM based virtual machines built on Ceph/RBD volumes. While the information presented is mostly available from other places on the Internet, having it all combined in one place will hopefully save someone else the tedious work of compiling it again.

Footnotes

[1] interim, adj. originally “provisional”, “limited”; in IT contexts often referring to the most permanent of all solutions. See also: Prototype 😉.


Blog author

Daniel Schneller
