Info

In this example the random descent metaheuristic is applied to sample a good core subset. Don't know what random descent is? Read this.

This example demonstrates how to implement an objective function that includes an efficient delta evaluation used to evaluate a neighbour of the current solution, based on this current solution, its evaluation and the applied move. The same core subset selection problem from example 1A is addressed using the same basic random descent algorithm. Please take a look at that example before proceeding.

In example 1A we provided a basic implementation of the objective to maximize the average pairwise distance between all selected items. This implementation simply sums all pairwise distances and divides by the amount of distances.

public class CoreSubsetObjective implements Objective<SubsetSolution, CoreSubsetData>{

    public Evaluation evaluate(SubsetSolution solution, CoreSubsetData data) {
        double value = 0.0;
        if(solution.getNumSelectedIDs() >= 2){
            // at least two items selected: compute average distance
            int numDist = 0;
            double sumDist = 0.0;
            Integer[] selected = new Integer[solution.getNumSelectedIDs()];
            solution.getSelectedIDs().toArray(selected);
            for(int i=0; i<selected.length; i++){
                for(int j=i+1; j<selected.length; j++){
                    sumDist += data.getDistance(selected[i], selected[j]);
                    numDist++;
                }
            }
            value = sumDist/numDist;
        }
        return SimpleEvaluation.WITH_VALUE(value);
    }

    public boolean isMinimizing() {
        return false;
    }

}

When a neighbourhood search evaluates a neighbour of the current solution, both solutions are similar so that it is often not necessary to repeat all computations for this neighbour. Based on the applied changes (move) we can update the evaluation of the current solution to that of its neighbour.

Consider a selection S (current solution) with evaluation e and a move that will remove a set of IDs D from the selection and add a set A of currently unselected IDs. We then identify three important sets:

  1. D = removed IDs
  2. A = added IDs
  3. R = S\D = retained IDs
All pairwise distances within R are already accounted for in the evaluation e of S. Pairwise distances within D and from elements in D to elements in R need to be removed from the average, while distances within A and from A to R need to be added to it.

Now let's take a look at the details of the Objective interface. Previously we had implemented the two required methods evaluate(solution, data) and isMinimizing() but the API shows that there is actually also a third method to evaluate a move that will be applied to the current solution. This method has a default implementation that performs a full evaluation of the obtained neighbour:

  1. The move is applied to the current solution.
  2. The modified solution (neighbour) is fully evaluated by calling evaluate(solution, data).
  3. The move is undone.
Of course, this is often not very efficent, as is the case for our example. Overriding this method is the key to efficient delta evaluations, although it is not required. The method description in the API may look a bit weird:
default <ActualSolutionType extends SolutionType> Evaluation evaluate(Move<? super ActualSolutionType> move, ActualSolutionType curSolution, Evaluation curEvaluation, DataType data)
This is because it uses some generics magics with bounds and wildcards to provide a default implementation (as described above) that always works, regardless of your specific solution type, applied neighbourhood, etc. These generics can be ignored when overriding this method:
public Evaluation evaluate(Move move, SolutionType curSolution, Evaluation curEvaluation, DataType data) {
    // provide delta evaluation here
}
Here, the solution and data type are set to SubsetSolution and CoreSubsetData, respectively, and we can extend our basic CoreSubsetObjective that already contains the full evaluation to also include an efficient delta evaluation. Because this delta evaluation needs to know how the selection will change when the received move is applied to the current solution, it casts the move to a SubsetMove so that the set of added and removed IDs can be accessed. If a different move type is received, an IncompatibleDeltaEvaluationException is thrown. This means that the extended objective can only be used in combination with neighbourhoods that generate moves of type SubsetMove. This is the case for all predefined subset neighbourhoods. It is inevitable to require a specific move type when providing a smart delta evaluation, but it can be a high-level class or interface as in this example.

The double value of the current solution's evaluation can be obtained by calling curEvaluation.getValue(). For this objective function, it is sufficient to know this value, the current solution and the applied move to be able to compute the new evaluation. This is not always the case: it might be necessary to track some metadata as well by using a custom evaluation object (see example 1C). Here we can still use the basic SimpleEvaluation that just wraps a double value.

public class CoreSubsetObjectiveWithDelta extends CoreSubsetObjective {

    public Evaluation evaluate(Move move, SubsetSolution curSolution, Evaluation curEvaluation, CoreSubsetData data) {
        // check move type
        if(!(move instanceof SubsetMove)){
            throw new IncompatibleDeltaEvaluationException("Core subset objective should be used in combination "
                                                + "with neighbourhoods that generate moves of type SubsetMove.");
        }
        // cast move
        SubsetMove subsetMove = (SubsetMove) move;

        // get current evaluation
        double curEval = curEvaluation.getValue();
        // undo average to get sum of distances
        int numSelected = curSolution.getNumSelectedIDs();
        int numDistances = numSelected * (numSelected-1) / 2;
        double sumDist = curEval * numDistances;

        // get set of added and removed IDs
        Set<Integer> added = subsetMove.getAddedIDs();
        Set<Integer> removed = subsetMove.getDeletedIDs();
        // infer list of retained IDs
        List<Integer> retained = new ArrayList<>(curSolution.getSelectedIDs());
        retained.removeAll(removed);

        // subtract distances from removed items to retained items
        for(int rem : removed){
            for(int ret : retained){
                sumDist -= data.getDistance(rem, ret);
                numDistances--;
            }
        }

        // subtract distances from removed to other removed items
        for(int rem1 : removed){
            for(int rem2 : removed){
                // account for each distinct pair only once
                if(rem1 < rem2){
                    sumDist -= data.getDistance(rem1, rem2);
                    numDistances--;
                }
            }
        }

        // add distances from new items to retained items
        for(int add : added){
            for(int ret : retained){
                sumDist += data.getDistance(add, ret);
                numDistances++;
            }
        }

        // add distances from new items to other new items
        for(int add1 : added){
            for(int add2 : added){
                // account for each distinct pair only once
                if(add1 < add2){
                    sumDist += data.getDistance(add1, add2);
                    numDistances++;
                }
            }
        }

        double newEval;
        if(numDistances > 0){
            // take average based on updated number of distances
            newEval = sumDist / numDistances;
        } else {
            // no distances (less than two items remain selected)
            newEval = 0.0;
        }

        // return new evaluation
        return SimpleEvaluation.WITH_VALUE(newEval);

    }

}

The complete source code of this example is available on GitHub, including some additional code to read the input from a CSV file. You can also download a ZIP file that contains the Java sources of all examples, a compiled JAR (including all dependencies) as well as some input files for in case you want to run any of the examples. To run this example, execute

$ java -cp james-examples.jar org.jamesframework.examples.coresubset2.CoreSubset2 <inputfile> <subsetsize> <runtime>

from the command line. The input file should be a CSV file in which the first row contains N item names and the subsequent N rows specify an N × N distance matrix. The runtime is given in seconds.

Example input file (275 items)