Thursday, May 9, 2013

Simple Pagination for MongoDB Queries Using QueryDSL

So, I have looked at several Java APIs for MongoDB integration; my favorite by far is Spring Data.  Spring Data has its own API classes for paging and sorting, but even with those classes I find the QueryDSL API from Mysema to be very functional.  To use it in my Maven projects, I add the plugin and dependencies below:

...
<plugin>
    <groupId>com.mysema.maven</groupId>
    <artifactId>maven-apt-plugin</artifactId>
    <version>1.0</version>
    <executions>
     <execution>
      <goals>
       <goal>process</goal>
      </goals>
      <configuration>
       <outputDirectory>target/generated-sources/java</outputDirectory>
       <processor>com.mysema.query.apt.QuerydslAnnotationProcessor</processor>
      </configuration>
     </execution>
    </executions>
   </plugin>
...
<dependency>
   <groupId>com.mysema.querydsl</groupId>
   <artifactId>querydsl-apt</artifactId>
   <version>2.2.3</version>
  </dependency>
  <dependency>
   <groupId>com.mysema.querydsl</groupId>
   <artifactId>querydsl-mongodb</artifactId>
   <version>2.2.3</version>
  </dependency>
...
Once my Maven configuration is set, I need to update the domain model objects that I will be using with QueryDSL.  Below is a snippet from my Employee model object with the @QueryEntity annotation added:


...
@QueryEntity
@Document(collection = "employees")
public class Employee extends Person {
...

Next I need to update the Spring Data Employee Repository interface to extend the QueryDslPredicateExecutor interface, typed by my Employee model class.  It is important to note that for this integration, I had to switch from the annotated Mongo Repository definition to extending the MongoRepository interface.  The @RepositoryDefinition annotation did not want to play nice with the QueryDslPredicateExecutor extension.


...
// @RepositoryDefinition(domainClass = Employee.class, idClass = String.class)
public interface EmployeeRepository extends MongoRepository<Employee, String>,
  QueryDslPredicateExecutor<Employee> {
...
Next, I need to update my EmployeeService interface and EmployeeServiceImpl implementation class to add a new method that calls the find...() methods provided for me on the repository.


...
 public Page<Employee> findAllWithPages(int pageStart, int pageSize,
   Sort.Direction sortDirection, String sortField) {
  // Use the caller's sort direction and field rather than fixed values.
  PageRequest pageRequest = new PageRequest(pageStart, pageSize,
    new Sort(sortDirection, sortField));
  return this.employeeRepository.findAll(pageRequest);
 }
...
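To make the page arguments concrete, here is a minimal plain-Java sketch of how a zero-based page number and a page size pick a slice out of an already sorted result.  It is only an illustration: the class and method names are hypothetical, and Spring Data and MongoDB do the real work of skipping and limiting on the server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: mimics what a page request means, without Spring Data or MongoDB.
final class PageSliceSketch {

    // Returns the items on the given zero-based page, or an empty list past the end.
    static <T> List<T> slice(List<T> sorted, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize >= 1");
        }
        int from = pageNumber * pageSize;                    // how many items to skip
        if (from >= sorted.size()) {
            return Collections.emptyList();
        }
        int to = Math.min(from + pageSize, sorted.size());   // skip + limit, capped at the end
        return new ArrayList<T>(sorted.subList(from, to));
    }

    // True when at least one more item exists after the given page.
    static boolean hasNextPage(int totalItems, int pageNumber, int pageSize) {
        return (pageNumber + 1) * pageSize < totalItems;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<Integer>();
        for (int i = 1; i <= 25; i++) {
            ids.add(i);
        }
        System.out.println(slice(ids, 0, 10));               // first full page
        System.out.println(slice(ids, 2, 10));               // last, partial page
        System.out.println(hasNextPage(ids.size(), 2, 10));  // no page after the last one
    }
}
```

With 25 items and a page size of 10, pages 0 and 1 are full and page 2 holds the remaining 5, which is why a paging loop has to treat the last page separately.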
In this new method, I build an org.springframework.data.domain.PageRequest object and an org.springframework.data.domain.Sort object to pass into the findAll(..) method on the EmployeeRepository.  I did not write this findAll(...) method; Spring Data supplies it at runtime through the MongoRepository interface, and the QueryDslPredicateExecutor extension adds variants that also accept a QueryDSL Predicate.  Additionally, QueryDSL created the QEmployee class for me and placed it into target/generated-sources/java in my Maven project.  QEmployee is a query type and is seen in its entirety below.


import static com.mysema.query.types.PathMetadataFactory.*;

import com.mysema.query.types.*;
import com.mysema.query.types.path.*;


/**
 * QEmployee is a Querydsl query type for Employee
 */
public class QEmployee extends EntityPathBase<Employee> {

    private static final long serialVersionUID = -236647047;

    public static final QEmployee employee = new QEmployee("employee");

    public final QPerson _super = new QPerson(this);

    public final SimplePath<Address> address = createSimple("address", Address.class);

    //inherited
    public final DateTimePath<java.util.Date> birthDate = _super.birthDate;

    public final SimplePath<Department> department = createSimple("department", Department.class);

    public final StringPath employeeId = createString("employeeId");

    //inherited
    public final StringPath firstName = _super.firstName;

    public final DateTimePath<java.util.Date> hireDate = createDateTime("hireDate", java.util.Date.class);

    //inherited
    public final StringPath id = _super.id;

    //inherited
    public final StringPath lastName = _super.lastName;

    //inherited
    public final StringPath middleName = _super.middleName;

    public final NumberPath<Integer> salary = createNumber("salary", Integer.class);

    public final StringPath title = createString("title");

    public QEmployee(String variable) {
        super(Employee.class, forVariable(variable));
    }

    public QEmployee(BeanPath<? extends Employee> entity) {
        super(entity.getType(), entity.getMetadata());
    }

    public QEmployee(PathMetadata<?> metadata) {
        super(Employee.class, metadata);
    }

}

Finally, below is the JUnit test that calls the pagination code generated for me.
package com.icfi.mongo;

import static org.junit.Assert.assertEquals;

import java.util.List;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.GenericXmlApplicationContext;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoOperations;

import com.icfi.mongo.data.loaders.EmployeeShortLoader;
import com.icfi.mongo.data.model.Employee;
import com.icfi.mongo.services.EmployeeService;

public class PagingQueryTest {
 private static Logger log = LoggerFactory.getLogger(PagingQueryTest.class);

 private ApplicationContext ctx;
 MongoOperations mongoOps;
 List<Employee> employees;
 EmployeeService employeeService;

 @Before
 public void setup() {
  ctx = new GenericXmlApplicationContext("context/main.xml");
  mongoOps = (MongoOperations) ctx.getBean("mongoTemplate");
  employeeService = (EmployeeService) ctx.getBean("employeeService");
  EmployeeShortLoader.main(null);
 }

 @Test
 public void testPaging() {
  String[] lastNames = new String[] { "Stanfel", "Gustavson", "Lortz",
    "Marquardt", "Unno", "Savasere", "Spelt", "Wynblatt",
    "Danecki", "Weedman", "Hartvigsen", "Menhoudj", "Heyers",
    "Willoner", "Shumilov", "Zuberek", "Boguraev" };
  int pageCount = 10;
  int pageNumber = 0;
  String sortField = "employeeId";
  Sort.Direction sortOrder = Sort.Direction.ASC;

  Page<Employee> employeesPage = employeeService.findAllWithPages(
    pageNumber, pageCount, sortOrder, sortField);

  while (employeesPage.hasNextPage()) {

   assertEquals("List size is incorrect.", pageCount,
     employeesPage.getSize());

   log.info("Page Number = " + pageNumber);

   if (employeesPage.hasContent()) {
    log.info(employeesPage.getContent()
      .get(employeesPage.getSize() - 1).getLastName());

    assertEquals(
      "Last name was incorrect.",
      lastNames[pageNumber],
      employeesPage.getContent()
        .get(employeesPage.getSize() - 1).getLastName());
   }

   pageNumber++;

   employeesPage = employeeService.findAllWithPages(pageNumber,
     pageCount, sortOrder, sortField);
  }

  log.info("Page Number = " + pageNumber);

  employeesPage = employeeService.findAllWithPages(pageNumber, pageCount,
    sortOrder, sortField);

  log.info(employeesPage.getContent()
    .get(employeesPage.getContent().size() - 1).getLastName());

  assertEquals(
    "Last name was incorrect.",
    lastNames[pageNumber],
    employeesPage.getContent()
      .get(employeesPage.getContent().size() - 1)
      .getLastName());
 }

 @After
 public void tearDown() {
  this.mongoOps.getCollection("employees").drop();
 }
}

With this approach I have quickly added pagination to my MongoDB queries, while writing minimal code.

Tuesday, May 7, 2013

Speaking at Twin Cities JUG - MongoDB and Spring Data

I am scheduled to speak at the Twin Cities JUG on June 10th.  My topic, near and dear to my heart, is Spring Data and MongoDB.  I have incorporated some new content, including MongoDB 2.4.

2dsphere Indexes in Mongo 2.4

MongoDB 2.4 has added a new spherical index, 2dsphere.  This index allows for more precise geo-spatial querying within Mongo by taking into account the spherical shape of the earth and the fact that the distance between lines of longitude shrinks and grows, depending on the latitude.  Here is a good link to understand the Earth's spherical nature:  http://www.learner.org/jnorth/tm/LongitudeIntro.html
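The longitude effect is easy to see with a little arithmetic.  The sketch below is my own illustration, not anything MongoDB exposes: it treats the earth as a perfect sphere with an assumed mean radius of 6,371 km and computes how many meters one degree of longitude covers at a given latitude.

```java
// Illustration only: spherical approximation, hypothetical class and method names.
final class LongitudeSpacingSketch {

    // Assumed mean earth radius in meters; the real earth is slightly flattened.
    static final double EARTH_RADIUS_METERS = 6371000.0;

    // Length in meters of one degree of longitude at the given latitude (in degrees).
    static double metersPerDegreeOfLongitude(double latitudeDegrees) {
        double metersPerDegreeAtEquator = Math.toRadians(1.0) * EARTH_RADIUS_METERS;
        // Circles of latitude shrink by the cosine of the latitude.
        return metersPerDegreeAtEquator * Math.cos(Math.toRadians(latitudeDegrees));
    }

    public static void main(String[] args) {
        double[] latitudes = { 0.0, 40.0, 60.0, 80.0 };
        for (double latitude : latitudes) {
            System.out.printf("latitude %.0f: %.0f meters per degree of longitude%n",
                    latitude, metersPerDegreeOfLongitude(latitude));
        }
    }
}
```

Under this approximation a degree of longitude is roughly 111 km at the equator and half of that at 60 degrees of latitude, which is why a flat 2d index treats east-west distances unevenly.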

A simple application of this 2dsphere index is seen below.  Given the "locations" collection with the given "typical" document shape:

/* 0 */
{
  "_id" : ObjectId("51887218c0aa488a0394002f"),
  "_class" : "com.icfi.mongo.data.model.Location",
  "city" : "Wheeling",
  "state" : "WV",
  "coords" : [40.071472, -80.6868],
  "timeZone" : -5,
  "zipCode" : "26003",
  "dstObserved" : true
}
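I will apply a 2dsphere index on the "coords" (short for coordinates) element.  This element is an array of double precision numbers representing the longitude and latitude of a given city.  MongoDB expects such coordinate pairs in longitude, latitude order; note that the sample values above read as latitude first (Wheeling sits at roughly 40.07° N, 80.7° W), so it is worth double-checking the order in your own data.  The command is below: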
db.locations.ensureIndex( { "coords" : "2dsphere" } )
Next we can run a query using a geo-spatial operator, like $near.  The $near operator syntax is seen below:


db.<collection>.find( { <location field> :
                         { $near :
                            { $geometry :
                                { type : "Point" ,
                                  coordinates : [ <longitude> , <latitude> ] } },
                              $maxDistance : <distance in meters>
                      } } )

My actual command to search for cities near Wheeling, WV 26003 is:

db.locations.find({ 'coords' : { $near : { $geometry : { type : 'Point' ,coordinates : [ 40.071472 , -80.6868 ] } }, $maxDistance : 10000} })
This search returns documents that are within 10,000 meters of the given point, nearest first.
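For intuition about what a $near query with $maxDistance returns, here is a plain-Java sketch that does the same kind of filtering in memory.  It is not how MongoDB implements the operator: it uses the haversine formula on an assumed spherical earth, the class and method names are hypothetical, and the nearby points in the example are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: in-memory stand-in for a "near" query, not MongoDB's implementation.
final class NearSketch {

    // Assumed mean earth radius in meters (spherical approximation).
    static final double EARTH_RADIUS_METERS = 6371000.0;

    // Great-circle distance in meters between two points given as longitude, latitude in degrees.
    static double distanceMeters(double lon1, double lat1, double lon2, double lat2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    // Keeps the [longitude, latitude] points within maxDistanceMeters of the center, nearest first.
    static List<double[]> near(double lon, double lat, List<double[]> points,
            double maxDistanceMeters) {
        List<double[]> hits = new ArrayList<double[]>();
        for (double[] p : points) {
            if (distanceMeters(lon, lat, p[0], p[1]) <= maxDistanceMeters) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble(p -> distanceMeters(lon, lat, p[0], p[1])));
        return hits;
    }

    public static void main(String[] args) {
        // Made-up points north of a center at longitude -80.6868, latitude 40.071472.
        List<double[]> points = Arrays.asList(
                new double[] { -80.6868, 40.271472 },   // roughly 22 km away
                new double[] { -80.6868, 40.121472 },   // roughly 5.6 km away
                new double[] { -80.6868, 40.081472 });  // roughly 1.1 km away
        for (double[] p : near(-80.6868, 40.071472, points, 10000.0)) {
            System.out.println(Arrays.toString(p));
        }
    }
}
```

Only the two points inside the 10,000 meter radius come back, with the closer one first.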


To make this more friendly in Java, I added a utility to convert miles to meters; I don't use meters much.

public class GeoUtils {

 public static final double MILES_METERS_DIVISOR = 0.00062137;

 public static double milesToMeters(double miles) {
  return miles / GeoUtils.MILES_METERS_DIVISOR;
 }
}
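As a quick sanity check on the conversion, here is a stand-alone copy of the arithmetic that can be run without the rest of the project.  The class name and the inverse metersToMiles method are hypothetical additions for this check; only the constant and milesToMeters come from GeoUtils above.

```java
// Stand-alone check of the miles-to-meters arithmetic; not part of the project code.
final class MilesToMetersCheck {

    // Same constant as GeoUtils: one meter expressed in miles.
    static final double MILES_METERS_DIVISOR = 0.00062137;

    static double milesToMeters(double miles) {
        return miles / MILES_METERS_DIVISOR;
    }

    // Hypothetical inverse, not in the original GeoUtils.
    static double metersToMiles(double meters) {
        return meters * MILES_METERS_DIVISOR;
    }

    public static void main(String[] args) {
        System.out.printf("5 miles is about %.1f meters%n", milesToMeters(5.0));
        System.out.printf("10000 meters is about %.2f miles%n", metersToMiles(10000.0));
    }
}
```

So a 5 mile search radius works out to a little over 8,000 meters, and the 10,000 meter radius used in the shell query is a bit more than 6 miles.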

A JUnit test is seen below.  First I get the location object from which I want to harvest the coordinates (point).  Then I call the LocationService method to find the nearest cities.  I have also included the relevant parts of the LocationService and the Spring Data LocationRepository classes.


@Test
 public void testNearMiles() {
  log.info("<<<<<<<<<<<<<<<<<  testNearMiles  >>>>>>>>>>>>>>>>>>>>");

  List<Location> locations = locationService.findByCityAndState(
    "Wheeling", "WV");

  assertNotNull("locations[0] was null.", locations.get(0));
  
  assertEquals("City was not correct.", "Wheeling", locations.get(0)
    .getCity());
  assertEquals("State was not correct.", "WV", locations.get(0)
    .getState());
  assertEquals("ZipCode was not correct.", "26003", locations.get(0)
    .getZipCode());

  List<Location> locales = this.locationService.findNear(
    locations.get(0), 5);

  for (Location locale : locales) {
   log.info(locale.toString());
  }
  
  assertEquals("City was not correct.","Yorkville",locales.get(2).getCity());
  assertEquals("City was not correct.","Glen Dale",locales.get(14).getCity());
 }

Location service class:

...
@Override
 public List<Location> findNear(double lon, double lat, double distance) {
  return this.locationRepository.findByGeoNear(lon, lat, distance);
 }

 @Override
 public List<Location> findNear(Location location, double distanceInMiles) {
  return this.findNear(location.getLongitude(), location.getLatitude(),
    GeoUtils.milesToMeters(distanceInMiles));
 }
...

In the LocationRepository class I have the following annotated method:

...
@Query("{ 'coords' : { $near : { $geometry : { type : 'Point' ,coordinates : [ ?0 , ?1 ] } }, $maxDistance : ?2} }")
List<Location> findByGeoNear(double lon, double lat, double distance);
Since I am using Spring Data, I can add a new method to my LocationService like so:


@Override
 public GeoResults<Location> findNearPoint(Location location,
   double distanceInMiles) {

  Point point = new Point(location.getLongitude(), location.getLatitude());

  NearQuery query = NearQuery.near(point).maxDistance(
    new Distance(distanceInMiles, Metrics.MILES));

  GeoResults<Location> results = this.mongoOps.geoNear(query,
    Location.class);

  return results;
 }
This new method uses a new Point class and the Distance class that handles the miles appropriately.  This approach is considered a "GEO Near" query.  It finds the locations near the given point and calculates the actual distance from the original point to each resultant location.  Results are returned in a parameterized GeoResults object.

MongoDB 2.4 - Text Indexes

Full text indexes were added with MongoDB 2.4.  I am working with MongoDB 2.4.3 and I have tested the functionality on my local Windows box.  I have not tested the performance.

For the "employees" collection with the given "typical" document shape:


/* 0 */
{
  "_id" : ObjectId("5189056ab38f56933e7224bb"),
  "_class" : "com.icfi.mongo.data.model.Employee",
  "address" : {
    "_id" : null,
    "addressLine1" : "227 Clifton Ave #-4",
    "city" : "Darby",
    "county" : "Delaware",
    "state" : "PA",
    "zipCode" : "19023"
  },
  "employeeId" : "28241",
  "hireDate" : ISODate("1988-01-28T05:00:00Z"),
  "department" : {
    "_id" : "d004",
    "name" : "Production",
    "managerId" : "110420"
  },
  "title" : "Senior Engineer",
  "salary" : 82927,
  "lastName" : "Baik",
  "firstName" : "Yuguang",
  "middleName" : "M",
  "birthDate" : ISODate("1959-01-04T05:00:00Z")
}
I want to perform full-text searches on the "title" field. First, I need to ensure that the proper "text" index is applied to that field. The following "ensureIndex" command takes care of that.
db["employees"].ensureIndex({"title":"text"})

Next I can perform a "text" search in the Mongo Shell with the format seen below:


db.collection.runCommand( "text", { search: <string>,
                                    filter: <document>,
                                    project: <document>,
                                    limit: <number>,
                                    language: <string> } )

My search command is:

db.employees.runCommand( "text", { search: "senior" } )

This runs a case-insensitive full-text search and returns documents containing the word "senior", based on the "title" field.  The default limit is 100 docs returned.  This can be overridden by the limit argument in the text command.  In the Java API it would look something like this:

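// Build the "text" command document: collection name first, then the search terms.
final DBObject command = new BasicDBObject();
command.put("text", "employees");
command.put("search", "SeNiOr"); // matching is case-insensitive
// command.put("limit", 2); // optionally override the default limit of 100
final CommandResult result = db.command(command);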

10gen warns about text indexes: they can grow very large and can adversely affect performance.  At this time, the text index and text command are in beta and not recommended for production use.