In machine learning, we train a model for a particular task, e.g. distinguishing between dogs and cats in pictures. Inference refers to applying the trained model to new data. Most inference workloads are served via a client-server API or run in batch mode. In contrast, some applications, such as Apple's FaceID, run the model directly on the mobile device. On-device inference has the benefit of low latency, which makes for an excellent user experience. Accordingly, the topic of inference on mobile devices is gaining more and more attention. In addition to Apple, Google is exploring various hardware options for deploying resource-intensive models on mobile devices. The goal of this article is to show what is possible with inference on an iOS device. Besides the advantages and disadvantages of mobile inference, we take a look at the Core ML framework, the Neural Engine, and hardware innovation in the mobile space.
Let’s look at the pros and cons of putting a model on a mobile device.
- Latency: On-device inference generates no network traffic. The prediction is computed directly on the hardware, which has the side benefit that the application can also be used offline.
- Data security: No data movement is involved in computing the prediction. The data never has to leave the device, which provides a certain level of data privacy.
- Updating the models: When we want to publish a newly trained model, we have to release a new version of the application itself. In my experience in mobile development, it usually takes some time until all users have updated their app. Furthermore, models can consume a lot of storage space, which is another downside for some users.
- Speed of the hardware: Depending on the device used, the computing time of the models may vary significantly. While newer devices such as the iPhone XS include specialized machine learning hardware, performance can degrade noticeably on older devices.
Both the low latency and the data security are compelling arguments for looking at mobile machine learning more closely. One of the most critical factors for applying models successfully is the speed of the hardware. Apple has equipped the A11 and A12 Bionic chips with specialized hardware to run neural networks efficiently on the iPhone. For these reasons, we want to dive deeper into machine learning on the iPhone.
Core ML is a machine learning framework developed by Apple. Compared to PyTorch and TensorFlow, which are primarily used to train models, Core ML focuses on the deployment and runtime of models. Since Core ML 3, on-device training is possible as well, but typically the developer has already trained a model and uses Core ML to execute it or integrate it into an iOS app. Before the model can be integrated into the application, it must be converted to the Core ML format. Note that Core ML can only be used within the Apple ecosystem, not for Android applications.
Core ML builds on Accelerate with its Basic Neural Network Subroutines (BNNS) and on Metal Performance Shaders (MPS), libraries that cover low-level CPU and GPU operations for neural network inference. These libraries greatly facilitate access to machine learning on iOS. Furthermore, Apple has developed the Vision and Natural Language frameworks to perform feature extraction on image and text data. For example, the built-in models of the Vision framework can recognize faces, text, and barcodes in images. This information can then serve as features for your own models.
Alternatives to Core ML
In addition to Core ML, there are of course other ways to run a model on an iOS device, for example TensorFlow Lite. Its significant advantage is that the same model can be used across different platforms such as Android. However, this comes with some disadvantages. Xcode provides access to Core ML out of the box, so we don't need to set up complex environments to start developing. Furthermore, Core ML is optimized for iOS, and as a result its performance is significantly better than that of TensorFlow Lite. It is worth taking a look at the article by Andrey Logvinenko, who has studied the performance differences in detail.
After looking at the software side of the iOS ecosystem, let's look at the hardware of the iPhone XS introduced last year. The A12 Bionic chip consists of a six-core CPU, a four-core GPU, and a neural processing unit, also known as the Neural Engine. The Neural Engine is the centrepiece for running models; FaceID and Siri, for example, use it to make predictions. Although these workloads could also run on the CPU, that would significantly increase computing time and energy consumption. The Neural Engine consists of eight cores and can theoretically perform up to 5 trillion operations per second. As developers, we have access to the Neural Engine and can run our models on it.
Looking at the development of the A-series Bionic chips, it becomes clear that Apple puts a lot of effort into their further development. On the A12 Bionic chip, Core ML runs up to 9 times faster than on its predecessor, the A11 Bionic. Various experiments from the community, such as Yolo and Core ML, show that a similar improvement is achieved in practice.
Image recognition with Core ML
We are using a Keras model that can distinguish between dogs and cats. In the app, either a picture taken with the camera or one selected from the photo library can be classified.
Before the conversion, the Keras model must be saved with model.save("model.h5"). The model can then be converted using the method coremltools.converters.keras.convert. During conversion, metadata such as the class labels must be specified; additional preprocessing steps, such as a normalization of the data, can be specified as well. In our case, we have the two classes cat (0) and dog (1). The image_scale, red_bias, green_bias, and blue_bias parameters specify the preprocessing values; in this example, we use the MobileNet preprocessing. After conversion, the model is saved with the .mlmodel extension, which Core ML can then load in an app.
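To make the preprocessing parameters concrete, here is a minimal sketch in plain Python. The values are the usual MobileNet defaults (a scale of 2/255 with a bias of -1 per channel, mapping 8-bit pixel values into the range [-1, 1]); treat them as assumptions for this example rather than values taken from a specific project:

```python
# MobileNet-style preprocessing as configured via coremltools:
# pixel -> pixel * image_scale + channel_bias, mapping [0, 255] to [-1, 1].
IMAGE_SCALE = 2.0 / 255.0                  # passed as image_scale to the converter
RED_BIAS = GREEN_BIAS = BLUE_BIAS = -1.0   # passed as red_bias, green_bias, blue_bias

def preprocess_pixel(r, g, b):
    """Apply the scale and bias to one RGB pixel, as Core ML does at inference time."""
    return (r * IMAGE_SCALE + RED_BIAS,
            g * IMAGE_SCALE + GREEN_BIAS,
            b * IMAGE_SCALE + BLUE_BIAS)
```

A black pixel (0, 0, 0) maps to (-1, -1, -1) and a white pixel (255, 255, 255) to roughly (1, 1, 1), which is the input range a MobileNet-based model expects.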
For integration into an app, the file must be added to the Xcode project. In Xcode, you can see which parameters the model expects for its input and output. In our case, the input is an RGB image with 224×224 pixels. The output consists of the label with the highest probability and a dictionary that contains the probability of each label.
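Conceptually, that second output is just a label-to-probability map from which the top label is derived. A minimal Python sketch of this selection step (the class names match our example; the function name top_label is our own):

```python
def top_label(probabilities):
    """Return the class label with the highest probability
    from a label -> probability map, as Core ML reports it."""
    return max(probabilities, key=probabilities.get)

# Hypothetical model output for a dog picture:
probs = {"cat": 0.13, "dog": 0.87}
```

Here top_label(probs) yields "dog", which is the label the app would display.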
The prediction works with the model.prediction(image:) method. For this, the model must first be loaded. The image data can be processed with the UIImage class, to which we have added the helper methods resize and pixelBuffer. The resize method scales images to 224×224 pixels to prepare them for the prediction; the pixel buffer serves as the input for the model.
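The resize step itself is plain image scaling. As a language-neutral illustration of what the Swift helper does, here is a nearest-neighbour sketch in Python; it is not the actual UIImage code, just the underlying idea:

```python
def resize_nearest(pixels, new_w, new_h):
    """Nearest-neighbour resize of an image given as a list of pixel rows.
    In the app, this corresponds to scaling the UIImage to 224x224 before
    converting it into the pixel buffer that feeds the model."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        # Map each target coordinate back to the nearest source pixel.
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]
```

For a real app, the production code would of course scale via Core Graphics rather than in a Python-style loop; the sketch only shows the mapping.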
In this article, Core ML and Apple’s hardware innovations for inference on iOS were introduced. While frameworks such as TensorFlow can both train models and run inference, Core ML focuses on inference. A model trained with TensorFlow or another third-party library needs to be converted to the Core ML format using the Python library coremltools. The converted model can then be integrated into an app and executed through Core ML. In addition to Core ML, there are other frameworks, such as TensorFlow Lite, for performing inference on iOS, but one of Core ML’s core strengths compared to them is its performance: thanks to hardware optimizations, Core ML is much faster. Alongside its software, Apple is also investing in hardware innovation. With the Neural Engine, Apple has created a processing unit that provides iOS devices with sufficient resources for inference on the end device. This protects the privacy of the data without compromising the performance of the models. In conclusion, Apple has created an ecosystem through Core ML and hardware innovation that makes it easy to use machine learning in apps.