Overview

Concurrency with CoreData

“I want a highly responsive app which would allow browsing data even when I’m offline.” Boy, if we had a coin for every time we heard that one.

The content before you is the joint work of my colleague Igor Stojanovic and me.

Fittingly, CoreData can be the very core of your app when it comes to dealing with data. Apart from managing the way your app holds data in memory, it also handles advanced data querying (NSPredicate), lazy loading (faults), undo/redo support, schema migration, UI integration (NSFetchedResultsController), merge policies and other related things.

Depending on how you use it, CoreData can be either your enemy or your friend: if run on the main thread, CoreData processing can significantly degrade application performance. If, however, you decide to use it asynchronously, you had better know some patterns.

CoreData design

Even though some claim CoreData has a steep learning curve, it can be quite simple to comprehend once you understand the terminology. In principle, it’s all about managing the object graph of your model. The diagram below shows CoreData’s architecture.

coredata_design

Usually a SQLite database is used as the backing data store, but you can also specify the binary (“Atomic”) or “In-Memory” store types. Note that although an XML data store is supported on OS X, it’s not available on iOS.

An NSManagedObject instance can be roughly thought of as a record in a database table. Table metadata is kept in an NSEntityDescription object. An NSManagedObjectContext is a pool of NSManagedObjects. It is responsible for the NSManagedObject’s life cycle, as well as for fetching objects from the underlying data store, persisting them, object faulting, undo/redo support and more.

A single NSManagedObjectContext must work on a single queue. This can be achieved in two ways. The first (older) approach is to use the NSManagedObjectContext instance only on the queue it was created on. This is known as thread confinement:

dispatch_async(queue, ^{
    NSManagedObjectContext* moc = [[NSManagedObjectContext alloc] init];
    // moc usage
});

The second approach is to initialize the NSManagedObjectContext with a concurrency type, and later use it only from within a performBlock: or performBlockAndWait: call.
Example:

NSManagedObjectContext* moc = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[moc performBlock:^{
    // use moc in this block
}];

There are three concurrency types available:

  • NSConfinementConcurrencyType – obsolete according to Apple, though many people still use it. It means that you have to make sure yourself that the MOC is only used on the queue it was created on. Instead of using initWithConcurrencyType:, you could just use the plain init method in this case.
  • NSPrivateQueueConcurrencyType – means that blocks passed to [moc performBlock:] will be executed on a private background queue owned by the context.
  • NSMainQueueConcurrencyType – means that blocks passed to [moc performBlock:] will be executed on the main queue.
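
To illustrate the difference between the two block methods, here is a minimal sketch of a synchronous read using performBlockAndWait:, which blocks the caller until the block has finished on the context's own queue. The helper name CountObjects and its entityName parameter are our own invention for this example, and the context is assumed to have been created with one of the two queue-based concurrency types.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: counts the objects of one entity, synchronously.
// Assumes moc uses NSPrivateQueueConcurrencyType or NSMainQueueConcurrencyType.
static NSUInteger CountObjects(NSManagedObjectContext *moc, NSString *entityName) {
    __block NSUInteger count = 0;
    [moc performBlockAndWait:^{
        NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
        NSError *error = nil;
        NSUInteger result = [moc countForFetchRequest:request error:&error];
        if (result == NSNotFound) {
            // countForFetchRequest:error: returns NSNotFound on failure.
            NSLog(@"Count failed: %@", error);
        } else {
            count = result;
        }
    }];
    return count;
}
```

Use performBlock: instead when the caller does not need the result right away; it returns immediately and the block runs later on the context's queue.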

Changes made to NSManagedObjects are not written to the persistent store until save: is called on the NSManagedObjectContext.
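
A save could look roughly like the sketch below. The function name SaveContext is ours, and the context is assumed to be queue-based (private or main), since performBlock: is not available for confinement contexts.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: saves pending changes on the context's own queue.
static void SaveContext(NSManagedObjectContext *moc) {
    [moc performBlock:^{
        // Skip the save entirely when there is nothing to write.
        if (![moc hasChanges]) {
            return;
        }
        NSError *error = nil;
        if (![moc save:&error]) {
            NSLog(@"Save failed: %@", error);
        }
    }];
}
```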

NSPersistentStoreCoordinator binds persistent stores with NSManagedObjectContexts. Usually, NSPersistentStoreCoordinator has a single Persistent Store (such as SQLite) and multiple NSManagedObjectContexts attached to it, but it can simultaneously handle multiple persistent stores. You can think of NSPersistentStoreCoordinator as a layer between the persistent store and NSManagedObjectContext.
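
To make the layering concrete, here is a minimal sketch that wires up a model, a coordinator, a SQLite store and a main queue context by hand. The resource name "Model", the file name "Model.sqlite" and the function name MakeMainContext are placeholders of ours, not something your project necessarily uses.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: model -> coordinator -> SQLite store -> context.
// Returns nil if the model cannot be found or the store cannot be added.
static NSManagedObjectContext *MakeMainContext(NSError **error) {
    NSURL *modelURL = [[NSBundle mainBundle] URLForResource:@"Model"
                                              withExtension:@"momd"];
    if (modelURL == nil) {
        return nil; // placeholder model name not present in the bundle
    }
    NSManagedObjectModel *model =
        [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL];
    if (model == nil) {
        return nil;
    }

    NSPersistentStoreCoordinator *psc =
        [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];

    NSURL *documents =
        [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                inDomains:NSUserDomainMask] lastObject];
    NSURL *storeURL = [documents URLByAppendingPathComponent:@"Model.sqlite"];

    // Attach a single SQLite store; more stores could be added the same way.
    if (![psc addPersistentStoreWithType:NSSQLiteStoreType
                           configuration:nil
                                     URL:storeURL
                                 options:nil
                                   error:error]) {
        return nil;
    }

    NSManagedObjectContext *moc = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSMainQueueConcurrencyType];
    moc.persistentStoreCoordinator = psc;
    return moc;
}
```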

Be aware that all UI updates happen on the main queue, so any CoreData action (read or write) on the main queue will have an impact and might cause visual glitches or performance issues. The reason is that all UI handling, drawing and user events (taps, gestures) are taken care of on the main thread. Performing anything else on it will have some impact on app performance. If you want to know more about why it is a “failed dream” to create multithreaded UI on any platform, read “Multithreaded toolkits: A failed dream?”.

There are different approaches to dealing with CoreData. Here are some common solutions (stacks) found in practice.

Stack #1

Just put everything on the main queue.

The first approach is a rather simple one: just have it all on the main queue. You need the default CoreData stack which is generated by Xcode (once you enable CoreData when creating a new project). Here’s a picture of how that looks:

stack1

The single NSManagedObjectContext is used both by the UI and for writing data to the database. A typical flow is:

  1. Fetch data from server
  2. As soon as data returns, present some kind of a waiting indicator
  3. Store retrieved data in the database (insert/update/delete)
  4. Hide the waiting indicator and present new data.

This is a very simple solution, but it has a downside: while data is being written to the database, the app is not usable, because the main queue is busy with that work. (Many apps show a heads-up display (HUD) style waiting indicator already while data is being fetched from the server.) Luckily, the popular MBProgressHUD is implemented in a smart way so that it can keep animating while the main queue is occupied by CoreData tasks.

Stack #2

Separate background writing tasks from the main queue.

iOS

Here, a worker context is introduced which is then used to store data received from the server. Once data storing is complete, the main context can be merged with the worker context’s changes in two ways:

  1. Once the worker context finishes its operation, it can notify the main context, which can then be reset by calling the reset method.
  2. Have the main context merge the changes in response to a notification. The worker context posts an NSManagedObjectContextDidSaveNotification after executing save, and the main context handles it by calling its mergeChangesFromContextDidSaveNotification: method.

The downside of this approach is that the developer has to merge contexts manually.
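
The manual merge from the second option could be sketched as follows. The function name ObserveWorkerSaves is ours; we assume both contexts are attached to the same NSPersistentStoreCoordinator and that mainContext was created with NSMainQueueConcurrencyType.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: merges every save of workerContext into mainContext.
// Returns the observer token; keep it and pass it to
// -[NSNotificationCenter removeObserver:] when merging should stop.
static id ObserveWorkerSaves(NSManagedObjectContext *workerContext,
                             NSManagedObjectContext *mainContext) {
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:workerContext
                     queue:nil
                usingBlock:^(NSNotification *note) {
        // The notification is delivered on the worker's queue, so hop to
        // the main context's queue before touching it.
        [mainContext performBlock:^{
            [mainContext mergeChangesFromContextDidSaveNotification:note];
        }];
    }];
}
```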

Stack #3

Make use of parent-child relationships between managed object contexts.

In iOS 5, Apple introduced parent-child relationships between managed object contexts. Using these should help sync data between the contexts. Here’s one possible stack:

stack4

This one has the benefit of creating managed objects on a background queue. But after calling save on the worker, the objects are only pushed to the parent (main queue) context, and the actual writing to the database still happens on the main queue. So although the solution is slightly better than Stack #1, it still has an impact on the main thread.
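
A minimal sketch of this arrangement, assuming mainContext is an NSMainQueueConcurrencyType context that is already attached to the persistent store coordinator. The function names MakeWorkerContext and SaveWorker are hypothetical.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: a private-queue worker whose parent is the main context.
static NSManagedObjectContext *MakeWorkerContext(NSManagedObjectContext *mainContext) {
    NSManagedObjectContext *worker = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    worker.parentContext = mainContext;
    return worker;
}

// Hypothetical helper: saving the worker only pushes changes into the parent;
// the parent's own save is what writes to disk, and it runs on the main queue.
static void SaveWorker(NSManagedObjectContext *worker) {
    [worker performBlock:^{
        NSError *error = nil;
        if (![worker save:&error]) {
            NSLog(@"Worker save failed: %@", error);
            return;
        }
        NSManagedObjectContext *mainContext = worker.parentContext;
        [mainContext performBlock:^{
            NSError *mainError = nil;
            if (![mainContext save:&mainError]) {
                NSLog(@"Main save failed: %@", mainError);
            }
        }];
    }];
}
```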

Stack #4

Stack #3 with reversed managed object contexts.

stack3

This kind of stack is used by the popular RestKit framework. In this case, a private managed object context is supposed to be used for storing data retrieved from the server. Once that is done, the main managed object context is notified via NSManagedObjectContextDidSaveNotification, and thus the main context is updated with the changes made on its parent (worker) context.

In case the main context is used for saving data (which is ok to do for persisting a very small amount of data), changes are automatically merged with the private context because of the parent-child relationship between them.
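
Setting up the two contexts for this stack might look like the sketch below. This is our own simplified illustration and not RestKit's actual code; the function name is hypothetical.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: the private-queue root talks to the store, and the
// main-queue context is its child. The root stays reachable (and retained)
// through mainContext.parentContext.
static NSManagedObjectContext *MakeMainContextWithPrivateParent(
        NSPersistentStoreCoordinator *psc) {
    NSManagedObjectContext *root = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    root.persistentStoreCoordinator = psc;

    NSManagedObjectContext *mainContext = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSMainQueueConcurrencyType];
    mainContext.parentContext = root;
    return mainContext;
}
```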

Stack #5

Make extensive use of parent-child relationships.

stack5

This one is interesting. In addition to stack #4, it introduces a new NSManagedObjectContext with the private concurrency type as a child of the main context. So let’s look at the data flow:

  1. Data is fetched asynchronously. The user can use the app while this is in progress.
  2. Data arrives from the server. The worker context is used to store data in the database. Since this happens on a private queue, the user can use the app as before.
  3. After calling save on the worker context, data is merged with the main context. Now this happens on the main queue, but hopefully, you will not get severe performance issues here, since the merging is done in-memory.
  4. Finally, data is passed to the additional worker context which does the actual writing to the data store. This is done on a private queue again, so the app remains responsive.

The downside of this approach is that there are now three managed object contexts to keep in sync, and the extra merging work can have a noticeable negative impact on overall app performance.
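
With a chain of nested contexts, the save has to travel from the worker up to the context that owns the store. One way to sketch that, assuming every context in the chain uses a queue-based concurrency type, is a small recursive helper. The name SaveToStore is ours.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: saves the given context on its own queue, then
// repeats for its parent, until the root context has written to the store.
static void SaveToStore(NSManagedObjectContext *context) {
    [context performBlock:^{
        NSError *error = nil;
        if ([context hasChanges] && ![context save:&error]) {
            NSLog(@"Save failed: %@", error);
            return; // stop here; do not propagate a failed save upwards
        }
        NSManagedObjectContext *parent = context.parentContext;
        if (parent != nil) {
            SaveToStore(parent);
        }
    }];
}
```

Calling SaveToStore(workerContext) then covers steps 2 to 4 of the flow above: each hop runs on the queue of the context being saved, so only the in-memory step touches the main queue.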

Stack #6

One managed object context per view controller.

If you have a large number of worker contexts and/or they store large amounts of data, then, since merging with the main context happens on the main queue, it might cause the app to become unresponsive.

Besides, the main context can become pretty heavy in terms of how much data it holds in memory. As time goes by, objects are constantly being added to it. If you want to clear it out, you must call the reset method on it. But if you have multiple view controllers visible on screen (quite normal, especially for iPad apps), you need to be careful about this so that none of the view controllers gets the rug pulled out from under its feet.

stack6

The solution here could be that you simply don’t merge anything with the main context. After all, who says you must have only one main context?! Instead of having a single main context for all the view controllers in your app, try the following approach: each view controller gets its own NSManagedObjectContext. Usage can look like this:

  1. When data is fetched from server, use a worker context to store data.
  2. Once saving is done, fire a notification, e.g. something like a MyAppUsersSavedNotification (assuming you were saving a list of users into the database).
  3. The MyAppUsersViewController which shows the list of users receives the notification and executes:
[self.managedObjectContext reset];
self.fetchedResultsController = nil;
[self.tableView reloadData];

This way, the NSFetchedResultsController for this view controller will be recreated (assuming you have a lazy getter for it, as in most examples), and it will pull fresh data from CoreData. Although this is performed on the main queue, it should be pretty fast; just don’t forget to set fetchBatchSize on the NSFetchRequest object. Also, consider which data you actually need from the managed object. If you don’t need all the fields, then fetch only those you need by using the propertiesToFetch property of the NSFetchRequest object.
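
A fetch request tuned along those lines might be sketched as follows. The entity name "User", the attribute "name", the batch size of 20 and both function names are assumptions made up for this example.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: a request that loads rows in batches and only
// pre-populates the attribute the list actually displays.
static NSFetchRequest *MakeUsersFetchRequest(void) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"User"];
    // NSFetchedResultsController requires at least one sort descriptor.
    request.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES]];
    request.fetchBatchSize = 20;            // roughly a screenful or two of rows
    request.propertiesToFetch = @[@"name"]; // other attributes stay faulted
    return request;
}

// Hypothetical helper: builds the controller on the view controller's own
// context and runs the initial fetch. Call this from the lazy getter.
static NSFetchedResultsController *MakeUsersController(NSManagedObjectContext *moc) {
    NSFetchedResultsController *controller =
        [[NSFetchedResultsController alloc] initWithFetchRequest:MakeUsersFetchRequest()
                                            managedObjectContext:moc
                                              sectionNameKeyPath:nil
                                                       cacheName:nil];
    NSError *error = nil;
    if (![controller performFetch:&error]) {
        NSLog(@"Fetch failed: %@", error);
    }
    return controller;
}
```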

The winner?

You had probably hoped for a one-size-fits-all, silver-bullet, magic-wand solution to be called out here. Unfortunately, as usual, it really depends on your needs. For really simple use cases you can get away even with Stack #1, but for heavier use cases we recommend using RestKit and the approach it follows from Stack #4. I also find Stack #6 quite interesting and well-performing, but there you would need to build your own solution and be rather careful about how you use managed object context instances.

To round this off, we recommend the following links for further reading:

  • https://developer.apple.com/library/ios/documentation/cocoa/conceptual/coredata/Articles/cdConcurrency.html
  • http://floriankugler.com/blog/2013/4/29/concurrent-core-data-stack-performance-shootout
  • http://floriankugler.com/blog/2013/4/2/the-concurrent-core-data-stack
  • https://developer.apple.com/library/ios/documentation/Cocoa/Conceptual/CoreData/cdProgrammingGuide.html
