September 9, 2010

Working with Aggregate SOQL queries/results in Batch Apex !

What if you want create Batch Apex Job, that uses SOQL having aggregate functions like SUM, MAX, COUNT on top of results grouped by “GROUP BY” clause. You can’t easily create a Batch job via QueryLocator, as shown below.

global class QueryLocatorAggregator implements Database.Batchable<AggregateResult> {
}

It fails for this error

Error: Compile Error: QueryLocatorAggregator: Class must implement the global interface method: Iterable<SOBJECT:AggregateResult> start(Database.BatchableContext) from Database.Batchable<SOBJECT:AggregateResult> at line 1 column 14

So the other option that seems doable is using SObject as generic type argument in Batchable, as shown below with complete source code.

global class QueryLocatorAggregator implements Database.Batchable<Sobject> {
    global Database.QueryLocator start(Database.BatchableContext info){
        String query = 'Select Department, COUNT(Name) From Contact GROUP BY Department';
        return Database.getQueryLocator(query);
    }
    global void execute(Database.BatchableContext BC,
             List<Sobject> scope){
        for (Sobject so : scope)  {
          AggregateResult ar = (AggregateResult) so;
          Integer counter = Integer.valueOf(ar.get('expr0'));
          // process the results
        }
    }
    global void finish(Database.BatchableContext BC){}   
}

The above code/class complies(saves), but on first processing of batch it fails with this error

System.UnexpectedException: Aggregate query does not support queryMore(), use LIMIT to restrict the results to a single batch

So, how to work on queries with aggregation ? well the hint was right their in the first error message in red i.e. use Iterable<AggregateResult>

Making AggregateResult queries work in Batch Apex

The only way I found is using Iterable<??>. Iterable is an easy QueryLocator alternate to feed data to batch apex execution. Database.Batchable<?> interface gives two abstract methods to implement

  1. global (Database.QueryLocator ) start(Database.BatchableContext bc) {}
  2. global (Iterable<sObject>) start(Database.BatchableContext bc) {}

The magical part about Database.Batchable interface is you need to implement either of the above two abstract methods. Though those magic confuse me a lot, because of my Java background :-D

Here are the three steps to get rolling with Iterable<> in Batch Apex

  1. Create a class that implements contract of Iterator<AggregateResult>
  2. Create a class that implements contract of Iterable<AggregateResult>
  3. Use the instance of Iterable<AggregateResult> implementation in Database.Batchable’s “global (Iterable<sObject>) start(Database.BatchableContext bc) {}”
Create a class that implements contract of Iterator<AggregateResult>

Here is the code sample for this.

global class AggregateResultIterator implements Iterator<AggregateResult> {
   AggregateResult [] results {get;set;}
   // tracks which result item is returned
   Integer index {get; set;} 
         
   global AggregateResultIterator() {
      index = 0;
	  // Fire query here to load the results
      String query = 'Select Department, COUNT(Name) From Contact GROUP BY Department';
      results = Database.query(query);            
   } 
   
   global boolean hasNext(){ 
      return results != null && !results.isEmpty() && index < results.size(); 
   }    
   
   global AggregateResult next(){        
      return results[index++];            
   }       
}    
Create a class that implements contract of Iterable<AggregateResult>

Here is the code sample for that

global class AggregateResultIterable implements Iterable<AggregateResult> {
   global Iterator<AggregateResult> Iterator(){
      return new AggregateResultIterator();
   }
}
Use the instance of Iterable<AggregateResult> implementation in Database.Batchable’s “global (Iterable<sObject>) start(Database.BatchableContext bc) {}”

Here is the complete Database.Batchable class implementation

global class IterableAggregator implements Database.Batchable<AggregateResult> {    
    global Iterable<AggregateResult> start(Database.batchableContext info){
        // just instantiate the new iterable here and return
        return new AggregateResultIterable();
    }

    global void execute(Database.BatchableContext BC, List<Sobject> scope){
        for (Sobject so : scope)  {
          AggregateResult ar = (AggregateResult) so;
          Integer counter = Integer.valueOf(ar.get('expr0'));
          // process the results
        }        
    }

    global void finish(Database.BatchableContext BC){}   
}

When not to use Iterable<?> with AggregateResult

Database.Batchable implementations depending on Iterable<?> for data source are bound to normal governor limits, so Iterable’s can never process 50 million records like QueryLocator. Before using Iterable, you need to watch the amount of Data/Rows resulting from that Aggregated SOQL.

References