
Spring Data – Part 4: Geospatial Queries with MongoDB


Introduction

Every location-based service [1] has to solve the following problem: find all venues within a given distance from the current location of the user. Long before the advent of mobile devices, geographic information systems (GIS) [2] had to deal with this (and other) problem(s).

The NoSQL [3] datastore MongoDB [4] supports geospatial queries [5] (i.e. queries based on coordinates) out of the box. For a better understanding of the things to come, I recommend reading this article on Spring Data Mongo DB for an introduction to both MongoDB and the corresponding Spring Data API.

Planar Maps

Let’s start with a simple example consisting of four points in a plane. The units of the coordinate system can be whatever you choose: miles, kilometers, etc.

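[Figure: the four points A, B, C and D in the plane, with a blue circle and a red box marking the two queries shown further down]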

Let’s insert these points into a collection named location:

C:\dev\bin\mongodb-2.0.2\bin>mongo
MongoDB shell version: 2.0.2
connecting to: test
> db.createCollection("location")
{ "ok" : 1 }
> db.location.save( {_id: "A", position: [0.001, -0.002]} )
> db.location.save( {_id: "B", position: [1.0, 1.0]} )
> db.location.save( {_id: "C", position: [0.5, 0.5]} )
> db.location.save( {_id: "D", position: [-0.5, -0.5]} )

To enable geospatial indexing, we set an appropriate index on the position array:

> db.location.ensureIndex( {position: "2d"} )

That’s it. Now we can perform queries like this (blue circle, red box from the above image) using special MongoDB operators:

> db.location.find( {position: { $near: [0,0], $maxDistance: 0.75  } } )
{ "_id" : "A", "position" : [ 0.001, -0.002 ] }
{ "_id" : "D", "position" : [ -0.5, -0.5 ] }
{ "_id" : "C", "position" : [ 0.5, 0.5 ] }
> db.location.find( {position: { $within: { $box: [ [0.25, 0.25], [1.0,1.0] ] }  } } )
{ "_id" : "C", "position" : [ 0.5, 0.5 ] }
{ "_id" : "B", "position" : [ 1, 1 ] }

Try this with your relational database without defining custom types and functions!
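To make the semantics of the two queries explicit, here is a minimal brute-force sketch in plain Java that does not touch MongoDB at all. The class and method names (PlanarQuerySketch, near, withinBox) are made up for this illustration; the sketch only mimics what $near with $maxDistance and $within with $box compute for our four points, and it says nothing about how MongoDB evaluates them with its index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PlanarQuerySketch {

    // The four sample points from the shell session, keyed by _id.
    static final Map<String, double[]> POINTS = new LinkedHashMap<>();
    static {
        POINTS.put("A", new double[] {0.001, -0.002});
        POINTS.put("B", new double[] {1.0, 1.0});
        POINTS.put("C", new double[] {0.5, 0.5});
        POINTS.put("D", new double[] {-0.5, -0.5});
    }

    // Planar (Euclidean) distance between point p and (x, y).
    static double distance(double[] p, double x, double y) {
        return Math.hypot(p[0] - x, p[1] - y);
    }

    // Ids of all points within maxDistance of (x, y), nearest first.
    // C and D are equally far from the origin, so their relative order is arbitrary.
    static List<String> near(double x, double y, double maxDistance) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, double[]> e : POINTS.entrySet()) {
            if (distance(e.getValue(), x, y) <= maxDistance) {
                ids.add(e.getKey());
            }
        }
        ids.sort(Comparator.comparingDouble(id -> distance(POINTS.get(id), x, y)));
        return ids;
    }

    // Ids of all points inside the axis-aligned box, borders included.
    static List<String> withinBox(double minX, double minY, double maxX, double maxY) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, double[]> e : POINTS.entrySet()) {
            double[] p = e.getValue();
            if (p[0] >= minX && p[0] <= maxX && p[1] >= minY && p[1] <= maxY) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("near:   " + near(0, 0, 0.75));
        System.out.println("within: " + withinBox(0.25, 0.25, 1.0, 1.0));
    }
}
```

Point B lies outside the circle because its distance to the origin is about 1.41, while it sits exactly on the corner of the box and is therefore part of the box result.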

Spring Data MongoDB API

With Spring Data MongoDB the same queries can be implemented with very few lines of code. First of all, we define a POJO representing a location on the map:

public class Location {
 
   @Id private String id;
 
   private double[] position;
   ...
}

A repository defining the queries may look like this:

public interface LocationRepository extends MongoRepository<Location, String> {
 
   List<Location> findByPositionWithin(Circle c);
 
   List<Location> findByPositionWithin(Box b);
}

Spring Data derives the appropriate implementation at runtime from these interface methods. The classes Circle, Point and Box are abstractions belonging to the Spring Data MongoDB API.

public class MongoDBGeoSpatialTest {
 
  @Autowired LocationRepository repo;
 
  @Autowired MongoTemplate template;
 
  @Before public void setUp() {
    // ensure geospatial index
    template.indexOps(Location.class).ensureIndex( new GeospatialIndex("position") );
    // prepare data
    repo.save( new Location("A", 0.001, -0.002) );
    repo.save( new Location("B", 1, 1) );
    repo.save( new Location("C", 0.5, 0.5) );
    repo.save( new Location("D", -0.5, -0.5) );
  }
 
  @Test public void shouldFindAroundOrigin() {
    // when
    List<Location> locations = repo.findByPositionWithin( new Circle(0,0, 0.75) );
 
    // then
    assertLocations( locations, "A", "C", "D" );
  }
 
  @Test public void shouldFindWithinBox() {
    // when
    List<Location> locations = repo.findByPositionWithin( new Box( new Point(0.25, 0.25), new Point(1,1)) );
 
    // then
    assertLocations( locations, "B", "C" );
  }
  ...

Our query results with the Spring Data MongoDB API are the same as with the mongo console:

Circle:
A(0.001, -0.002)
D(-0.500, -0.500)
C(0.500, 0.500)
 
Box:
C(0.500, 0.500)
B(1.000, 1.000)

The full source code of this example can be found on GitHub. A good starting point is mongodb.MongoDBGeoSpatialTest.
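The test above relies on two pieces that are elided in the listings: a Location constructor taking an id and two coordinates, and the assertLocations helper. The following is only a sketch of how they might look, not the code from the repository. It is written in plain Java without Spring on the classpath, so the @Id annotation is left out, and the helper simply compares ids while ignoring their order (the wrapper class LocationSketch is an assumption for this example).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class LocationSketch {

    // Plain stand-in for the Location POJO; the real class
    // additionally carries Spring Data's @Id annotation on the id field.
    static class Location {
        private String id;
        private double[] position;

        Location(String id, double x, double y) {
            this.id = id;
            this.position = new double[] {x, y};
        }

        String getId() { return id; }
        double[] getPosition() { return position; }
    }

    // Hypothetical helper: passes if the result holds exactly the expected ids, in any order.
    static void assertLocations(List<Location> actual, String... expectedIds) {
        Set<String> actualIds = new TreeSet<>();
        for (Location location : actual) {
            actualIds.add(location.getId());
        }
        Set<String> expected = new TreeSet<>(Arrays.asList(expectedIds));
        if (actual.size() != expectedIds.length || !actualIds.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actualIds);
        }
    }

    public static void main(String[] args) {
        List<Location> found = Arrays.asList(
                new Location("C", 0.5, 0.5),
                new Location("B", 1, 1));
        assertLocations(found, "B", "C");
        System.out.println("box result matches");
    }
}
```

Ignoring the order keeps the assertion stable for points that are equally far from the query center, such as C and D in the circle query.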

Performance considerations

MongoDB does a really good job when indexing geospatial data. I did a small test comparing queries with circle and box shapes. I expected the box query to be faster than the circle query (because checking a box requires only a comparison of coordinates, whereas checking a circle requires calculating distances) – but it wasn’t! My test scenario was the following:

  1. Create 100,000 random locations with coordinates in (-1,1) x (-1,1)
  2. Perform queries around 10,000 different random center points (x,y) with coordinates also in (-1,1) x (-1,1) using
    • a circle with center (x,y) and radius r = 0.1
    • a box with center (x,y) and width = sqrt(pi) * r (thus having the same area as the circle)

These are the test results:

                             Circle    Box
Average time per query [ms]  47.6592   47.2629
Average hits per query       750       749

The results show no significant difference. Of course, this is no proof, only a hint. The box is also a good approximation of the circle: it covers roughly the same number of locations (though probably not exactly the same ones). But with MongoDB the box trick is not needed at all!

If you want to check this yourself, have a look at this unit test for details: mongodb.MongoDBMassTest.
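The equal-area box from the scenario can be worked out in a few lines. The sketch below is not the mass test itself: it uses plain Java with brute-force counting instead of MongoDB, and all names in it (EqualAreaBoxSketch, countHits and so on) are assumptions made for this illustration. It derives the box side from pi * r^2 = side^2 and counts how many random points fall into each shape.

```java
import java.util.Random;

public class EqualAreaBoxSketch {

    // Side length of the square with the same area as a circle of radius r:
    // side^2 = pi * r^2  =>  side = sqrt(pi) * r
    static double equalAreaSide(double r) {
        return Math.sqrt(Math.PI) * r;
    }

    // Lower-left and upper-right corner of the box centered at (x, y).
    static double[][] boxAround(double x, double y, double r) {
        double half = equalAreaSide(r) / 2.0;
        return new double[][] { {x - half, y - half}, {x + half, y + half} };
    }

    static boolean inCircle(double px, double py, double x, double y, double r) {
        double dx = px - x;
        double dy = py - y;
        return dx * dx + dy * dy <= r * r;
    }

    static boolean inBox(double px, double py, double[][] box) {
        return px >= box[0][0] && px <= box[1][0] && py >= box[0][1] && py <= box[1][1];
    }

    // Generates n random points in (-1,1) x (-1,1) and returns {circle hits, box hits}.
    static int[] countHits(int n, long seed, double x, double y, double r) {
        Random random = new Random(seed);
        double[][] box = boxAround(x, y, r);
        int circleHits = 0;
        int boxHits = 0;
        for (int i = 0; i < n; i++) {
            double px = random.nextDouble() * 2 - 1;
            double py = random.nextDouble() * 2 - 1;
            if (inCircle(px, py, x, y, r)) circleHits++;
            if (inBox(px, py, box)) boxHits++;
        }
        return new int[] {circleHits, boxHits};
    }

    public static void main(String[] args) {
        int[] hits = countHits(100_000, 42L, 0.0, 0.0, 0.1);
        System.out.println("circle hits: " + hits[0] + ", box hits: " + hits[1]);
    }
}
```

With 100,000 points spread over an area of 4, a shape of area pi * 0.01 should catch roughly 785 points when it lies fully inside the square; centers close to the border catch fewer, which would be consistent with the slightly lower averages in the table.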

Spherical Maps

Since the earth is roughly a sphere [6] (and not a flat plane), working with planar maps is a good approximation only when you are dealing with small distances. In addition, you usually use latitude and longitude coordinates to describe a point on the globe, while distances are measured in miles or kilometers. The length of one degree of longitude shrinks towards the poles, and since the earth is not a perfect sphere, distances per degree vary slightly as well [7].

MongoDB honors these facts since version 1.8 and provides special operators to support the spherical model. By default the range for a geospatial index covers the interval [-180, 180) since latitude and longitude are expressed with these values. A coordinate tuple in MongoDB consists of [longitude, latitude]. Order is important.

I will use the Spring Data API alone, since it automatically converts distances given in miles or kilometers. In a raw MongoDB example you have to do the scaling yourself. Our example is based on three German cities:

City        Longitude   Latitude
Berlin      13.405838   52.531261
Cologne     6.921272    50.960157
Düsseldorf  6.810036    51.224088

I extracted the coordinates with the help of Google Maps [8]. We only have to add a single(!) line of code to our repository:

   List<Location> findByPositionNear(Point p, Distance d);

Since Düsseldorf and Cologne are not that far away from each other, the following query …

   List<Location> locations = repo.findByPositionNear(DUS , new Distance(70, Metrics.KILOMETERS) );

… finds the two cities of Cologne and Düsseldorf. The use of the Metrics enum is important here. Using KILOMETERS or MILES does two things under the hood:

  • it switches to spherical query mode
  • it applies appropriate scaling to the distance value
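The scaling step can be sketched in plain Java. With a 2d index, the spherical operators expect distances in radians, so a distance in kilometers or miles has to be divided by the earth radius in the same unit. The class below is an illustration only: the name DistanceScalingSketch is made up, and the two radius constants are the values I believe the Metrics enum uses as multipliers, so treat them as assumptions.

```java
public class DistanceScalingSketch {

    // Assumed earth radii (equatorial), believed to match the Metrics multipliers.
    static final double EARTH_RADIUS_KM = 6378.137;
    static final double EARTH_RADIUS_MILES = 3963.191;

    // Distance on the surface divided by the radius gives the angle in radians.
    static double kilometersToRadians(double kilometers) {
        return kilometers / EARTH_RADIUS_KM;
    }

    static double milesToRadians(double miles) {
        return miles / EARTH_RADIUS_MILES;
    }

    // Reverse direction, e.g. for turning a computed distance back into kilometers.
    static double radiansToKilometers(double radians) {
        return radians * EARTH_RADIUS_KM;
    }

    public static void main(String[] args) {
        System.out.println("70 km     = " + kilometersToRadians(70) + " rad");
        System.out.println("350 miles = " + milesToRadians(350) + " rad");
    }
}
```

In a raw shell query these radian values would be what you pass as the maximum distance, which is exactly the bookkeeping the Metrics enum takes off your hands.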

If we stretch our search a little bit more …

   List<Location> locations = repo.findByPositionNear(DUS , new Distance(350, Metrics.MILES) );

… all three cities are found. These examples can be found on GitHub too.
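You can cross-check both results without a database by computing the great-circle distances yourself. The following sketch uses the haversine formula with the coordinates from the table above; it is a plausibility check under the assumption of a perfectly spherical earth with a mean radius of 6371 km, and it does not claim to reproduce the exact calculation MongoDB performs. The class name HaversineSketch and its helpers are made up for this example.

```java
public class HaversineSketch {

    // Assumption: spherical earth with a mean radius of 6371 km.
    static final double EARTH_RADIUS_KM = 6371.0;
    static final double KM_PER_MILE = 1.609344;

    // Coordinates in MongoDB order: [longitude, latitude].
    static final double[] BERLIN = {13.405838, 52.531261};
    static final double[] COLOGNE = {6.921272, 50.960157};
    static final double[] DUS = {6.810036, 51.224088};

    // Great-circle distance in kilometers using the haversine formula.
    static double distanceKm(double[] from, double[] to) {
        double lat1 = Math.toRadians(from[1]);
        double lat2 = Math.toRadians(to[1]);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(to[0] - from[0]);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    static double distanceMiles(double[] from, double[] to) {
        return distanceKm(from, to) / KM_PER_MILE;
    }

    public static void main(String[] args) {
        System.out.printf("Düsseldorf to Cologne: %.1f km%n", distanceKm(DUS, COLOGNE));
        System.out.printf("Düsseldorf to Berlin:  %.1f km (%.1f miles)%n",
                distanceKm(DUS, BERLIN), distanceMiles(DUS, BERLIN));
    }
}
```

Cologne is well inside the 70 km radius around Düsseldorf, while Berlin is far outside it but still within 350 miles, which matches the two query results.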

Summary

I showed you how easily geospatial data and queries are handled by MongoDB. With the help of Spring Data MongoDB this ease is carried over to the Java world. We worked with simple planar maps, did a rough performance analysis and also looked at the more realistic spherical model.

Spring Data Project

These are my other posts covering the Spring Data project:

Part 1: Spring Data Commons
Part 2: Spring Data JPA
Part 3: Spring Data Mongo DB

Expect upcoming blog posts on Spring Data Neo4j and Spring GemFire.

References

[1] Location-based service
[2] GIS – Geographic information system
[3] NoSQL databases
[4] MongoDB
[5] MongoDB – Geospatial Indexing
[6] Projections and Coordinate Systems
[7] Geographical Distance
[8] Finding longitude and latitude on Google Maps

Comments

  • 1 March 2012 by badami

    I am not able to find the class which implements Geospatial Queries with MongoDB : specifically the example code on github does not include the class which implements “LocationRepository interface”.

  • Tobias Trelle

    Hi badami,

    you are right, there is no implementation of LocationRepository at compile time. Because it is not needed at all!

    That is the basic idea behind all the Spring Data projects. You write only an interface and Spring autowires an appropriate implementation at runtime. The information provided by the signature of the interface methods is sufficient to derive all search criteria.

    These basic concepts are explained in detail in these previous posts of my blog series on Spring Data:

    Spring Data – Part 1: Commons
    Spring Data – Part 3: MongoDB

    HTH
    Tobias

  • 8 March 2012 by Simon

    Good article. How about spatial queries support with DB2 Spatial Extender? Do you have some examples? Thanks.

  • Does the C# driver support it also?

    Cause in the mongo db geospatial index page
    http://www.mongodb.org/display/DOCS/Geospatial+Indexing
    it says:
    “Related driver docs: Python, PHP, Perl”

    While no Java seems to be related according to this post it is…

    • Tobias Trelle

      Hi Oz,

      I didn’t try the C# driver myself, but I had no problems w/ using the Java driver.

      I think they just didn’t link to the Java and C# example.

  • 31 May 2012 by Chris

    Thanks very clear and helpful. For me, if you added your implementation of assertLocations() and the fact that you have overridden the equals() method in the Location class then I think it would have helped to get the example up and running without needing to visit GitHub to examine those specifics.

  • 13 January 2013 by Simon Massey

    This is an excellent article which is cited on a java.dzone.com article here:

    http://architects.dzone.com/articles/serverside-pagination-zk

    Thanks!

  • 27 July 2013 by Khanh

    Thanks Tobias Trelle for great article. I’m trying List locations = repo.findByPositionNear(DUS , new Distance(70, Metrics.KILOMETERS) );
    and want the results are sorted by distance. Any solution?
    Thanks again!
