Info

The random descent and parallel tempering metaheuristics are applied to optimize the knapsack contents. If you are not yet familiar with random descent or parallel tempering, read the respective algorithm descriptions first.

In this example, the 0/1 knapsack problem — a constrained, variable size subset selection problem — is implemented in JAMES. Given a collection of items, each with a weight and a profit, the goal is to select a subset of these items so that the total profit is maximized without exceeding the capacity of the knapsack, i.e. the total weight of all selected items should be smaller than or equal to a constant W.
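Stated formally (the symbols below are introduced only for illustration and are not used elsewhere in the example): with $x_i \in \{0,1\}$ indicating whether the item with ID $i$ is selected, and $p_i$ and $w_i$ denoting its profit and weight, the problem reads

$$\max \sum_{i=0}^{N-1} p_i x_i \quad \text{subject to} \quad \sum_{i=0}^{N-1} w_i x_i \le W.$$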

First provide the data by implementing the IntegerIdentifiedData interface in KnapsackData. Every item is assigned a unique ID in [0, N-1]. The data stores two arrays of length N where weights[i] and profits[i] contain the weight and profit, respectively, of the item with ID i. These two arrays are specified when creating the data and the IDs are automatically inferred to match the array indices. Methods are provided to get all IDs (as required by the IntegerIdentifiedData interface) as well as to get the weight or profit of an item with a given ID.

public class KnapsackData implements IntegerIdentifiedData {

    // weights
    private double[] weights;
    // profits
    private double[] profits;
    // IDs (indices in weight and profit arrays)
    private Set<Integer> ids;

    public KnapsackData(double[] weights, double[] profits){
        // store data
        this.weights = weights;
        this.profits = profits;
        // infer IDs: 0..N-1 in case of N items
        // (indices in weight and profit arrays)
        ids = new HashSet<>();
        for(int id=0; id<weights.length; id++){
            ids.add(id);
        }
    }

    public Set<Integer> getIDs() {
        return ids;
    }

    public double getWeight(int id){
        return weights[id];
    }

    public double getProfit(int id){
        return profits[id];
    }

}

The objective of the knapsack problem is easily defined. We create a class KnapsackObjective that implements the Objective interface where the solution and data type are set to SubsetSolution and KnapsackData respectively. A given subset solution is evaluated by computing the sum of profits of all selected items. This value is to be maximized.

public class KnapsackObjective implements Objective<SubsetSolution, KnapsackData>{

    public Evaluation evaluate(SubsetSolution solution, KnapsackData data) {
        // compute sum of profits of selected items
        double totalProfit = solution.getSelectedIDs().stream().mapToDouble(data::getProfit).sum();
        // wrap value in simple evaluation object
        return SimpleEvaluation.WITH_VALUE(totalProfit);
    }

    public boolean isMinimizing() {
        return false;
    }

}

It is advised to also add an efficient delta evaluation to speed up applied neighbourhood searches (see example 1B). First, the value (total profit) of the current solution is retrieved. Then, the profit of any newly selected items is added and the profit of any removed items is subtracted. This yields the value of the neighbour that is obtained when applying the evaluated move to the current solution.

public Evaluation evaluate(Move move, SubsetSolution curSolution, Evaluation curEvaluation, KnapsackData data) {
    // check move type
    if(!(move instanceof SubsetMove)){
        throw new IncompatibleDeltaEvaluationException("Knapsack objective should be used in combination "
                                            + "with neighbourhoods that generate moves of type SubsetMove.");
    }
    // cast move
    SubsetMove subsetMove = (SubsetMove) move;
    // get current profit
    double value = curEvaluation.getValue();
    // account for added items
    value += subsetMove.getAddedIDs().stream().mapToDouble(data::getProfit).sum();
    // account for removed items
    value -= subsetMove.getDeletedIDs().stream().mapToDouble(data::getProfit).sum();
    // return updated evaluation
    return SimpleEvaluation.WITH_VALUE(value);
}

To ensure that the knapsack capacity is never exceeded, we add a constraint to our problem. We provide a KnapsackConstraint that implements the Constraint interface, with solution type SubsetSolution and data type KnapsackData. The Constraint interface requires implementing a single method Validation validate(SubsetSolution, KnapsackData) that validates a subset solution given our knapsack data. In this example, the sum of weights of all selected items is computed and compared to the knapsack capacity. If the capacity is not exceeded, the constraint is satisfied.

The validate(...) method returns an object of type Validation. The Validation interface has a single method passed() that indicates whether the solution passed validation, i.e. whether the constraint is satisfied. A predefined SimpleValidation implementation is provided that merely wraps a boolean value. The constraint below returns such a simple validation object.

public class KnapsackConstraint implements Constraint<SubsetSolution, KnapsackData> {

    // maximum total weight
    private double maxWeight;

    public KnapsackConstraint(double maxWeight){
        this.maxWeight = maxWeight;
    }

    public Validation validate(SubsetSolution solution, KnapsackData data) {
        // compute sum of weights of selected items
        double weight = solution.getSelectedIDs().stream().mapToDouble(data::getWeight).sum();
        // check that maximum weight is not exceeded
        if(weight <= maxWeight){
            return SimpleValidation.PASSED;
        } else {
            return SimpleValidation.FAILED;
        }
    }

}

Similar to the delta evaluation in an objective, we can provide an efficient delta validation in a constraint. This is not strictly required, because the Constraint interface includes a default apply-undo behaviour: apply the move, perform a full validation by calling validate(solution, data) and undo the move. Of course, this is not very efficient, so it is advised to provide a custom delta validation whenever possible.

public Validation validate(Move move, SubsetSolution curSolution, Validation curValidation, KnapsackData data) {
    // provide delta validation here
}

The SimpleValidation type used above only indicates whether the solution passed validation or not. Based on this information only, it is not possible to perform an efficient delta validation. Therefore, we create our own KnapsackValidation that tracks the total weight of a solution, as this is what determines validity. It implements the Validation interface with a single required method passed() that checks whether the total weight does not exceed the knapsack capacity.

public class KnapsackValidation implements Validation {

    private double curWeight;
    private double maxWeight;

    public KnapsackValidation(double curWeight, double maxWeight) {
        this.curWeight = curWeight;
        this.maxWeight = maxWeight;
    }

    public double getCurWeight() {
        return curWeight;
    }

    public boolean passed() {
        return curWeight <= maxWeight;
    }

}

Now we can update the constraint to use this custom validation type and include an efficient delta validation.

public class KnapsackConstraint implements Constraint<SubsetSolution, KnapsackData> {

    // maximum total weight
    private double maxWeight;

    public KnapsackConstraint(double maxWeight){
        this.maxWeight = maxWeight;
    }

    public Validation validate(SubsetSolution solution, KnapsackData data) {
        // compute sum of weights of selected items
        double weight = solution.getSelectedIDs().stream().mapToDouble(data::getWeight).sum();
        // return custom validation object
        return new KnapsackValidation(weight, maxWeight);
    }

    public Validation validate(Move move, SubsetSolution curSolution, Validation curValidation, KnapsackData data) {
        // check move type
        if(!(move instanceof SubsetMove)){
            throw new IncompatibleDeltaValidationException("Knapsack constraint should be used in combination "
                                                + "with neighbourhoods that generate moves of type SubsetMove.");
        }
        // cast move
        SubsetMove subsetMove = (SubsetMove) move;
        // cast current validation object (known to be of the required type as both validate methods return such object)
        KnapsackValidation kVal = (KnapsackValidation) curValidation;
        // extract current sum of weights
        double weight = kVal.getCurWeight();
        // account for added items
        weight += subsetMove.getAddedIDs().stream().mapToDouble(data::getWeight).sum();
        // account for removed items
        weight -= subsetMove.getDeletedIDs().stream().mapToDouble(data::getWeight).sum();
        // return updated validation
        return new KnapsackValidation(weight, maxWeight);
    }

}

We are now ready to combine the data, objective and constraint in a SubsetProblem. As all subset sizes are allowed, no size limits are specified. The data and objective are passed to the problem upon construction and the constraint is added afterwards with addMandatoryConstraint(...). When a mandatory constraint is added, applied neighbourhood searches discard all neighbours that violate it. The returned best found solution (if any) is then always guaranteed to satisfy the constraint.

Mandatory versus penalizing constraints

Instead of a mandatory constraint you may also use a penalizing constraint. Such a constraint does not cause solutions that violate it to be discarded. Instead, a penalty is assigned to the solution's evaluation. This approach is more flexible than a mandatory constraint, but it requires the penalties to be chosen carefully, usually reflecting the severity of the constraint violation. Because there is a tradeoff between the evaluation and the penalties, the best found solution of a search is not guaranteed to satisfy all penalizing constraints, which may or may not be desired. See example 2B. A minimal sketch of how such a constraint would be registered is shown after the problem setup code below.

// set weights, profits and capacity
double[] weights = ...
double[] profits = ...
double capacity = ...

// create data object
KnapsackData data = new KnapsackData(weights, profits);
// create objective
KnapsackObjective obj = new KnapsackObjective();
// create constraint
KnapsackConstraint constraint = new KnapsackConstraint(capacity);

// create subset problem (all sizes allowed)
SubsetProblem<KnapsackData> problem = new SubsetProblem<>(data, obj);

// add mandatory constraint
problem.addMandatoryConstraint(constraint);
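As announced in the box above, if a penalizing constraint were used instead, the only change on the problem side would be to register it with addPenalizingConstraint(...) rather than addMandatoryConstraint(...). A minimal sketch, assuming a penalizing knapsack constraint implementing the PenalizingConstraint interface as worked out in example 2B:

// hypothetical penalizing variant of the knapsack constraint (see example 2B)
PenalizingConstraint<SubsetSolution, KnapsackData> penalizingConstraint = ...
// register as a penalizing constraint: violating solutions are penalized instead of discarded
problem.addPenalizingConstraint(penalizingConstraint);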

By default, local searches start from a randomly generated initial solution. In case of a subset problem, any random selection within the size limits (if any) may be generated when requesting a random solution. However, such a random selection may of course exceed the knapsack capacity and have no valid neighbours, in which case the search immediately stalls. To ensure that only random subsets that respect the knapsack constraint are generated, we can set a custom random solution generator. First, the default generator is retrieved from the problem. Then, a custom generator is set which calls the default generator to obtain a random selection, from which items are subsequently removed until the capacity is respected. The RandomSolutionGenerator interface has a single method create(rnd, data) to generate a random solution using the specified source of randomness and data. It can therefore compactly be defined as a Java 8 lambda expression, as in the example code below.

Mandatory versus penalizing constraints (revisited)

When using a penalizing instead of a mandatory constraint, the initial solution is not required to satisfy the constraint. If the penalties are carefully chosen, the search can find its way towards solutions within the constraint even if it started in a region of the search space where all solutions violate the constraint, because such solutions are not discarded but penalized. Yet, a penalizing constraint is more difficult to design. See example 2B.

// retrieve default random solution generator
RandomSolutionGenerator<? extends SubsetSolution, ? super KnapsackData> defaultRndSolGen = problem.getRandomSolutionGenerator();

// set custom generator
problem.setRandomSolutionGenerator((r,d) -> {
    // 1: create default random initial solution
    SubsetSolution sol = defaultRndSolGen.create(r, d);
    // 2: compute current total weight
    double weight = computeSelectionWeight(sol, d);
    // 3: remove random items as long as total weight is larger than the capacity
    while(weight > capacity){
        int id = SetUtilities.getRandomElement(sol.getSelectedIDs(), r);
        sol.deselect(id);
        weight -= d.getWeight(id);
    }
    // 4: retain random subset to increase initial solution variability
    int finalSize = r.nextInt(sol.getNumSelectedIDs()+1);
    sol.deselectAll(SetUtilities.getRandomSubset(sol.getSelectedIDs(),
                                                 sol.getNumSelectedIDs()-finalSize,
                                                 r));
    return sol;
});

Good to know

The utility class SetUtilities can be used to sample random items and subsets of any given set.
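For example, the two utilities used in the generator above can be called as follows (someSet, size and rnd are placeholder names used only for illustration):

// pick a single random element from the given set
int id = SetUtilities.getRandomElement(someSet, rnd);
// sample a random subset containing the requested number of elements
Set<Integer> subset = SetUtilities.getRandomSubset(someSet, size, rnd);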

The total weight of a selection is easily computed.

// computes the total weight of items selected in the given subset solution
double computeSelectionWeight(SubsetSolution solution, KnapsackData data){
    return solution.getSelectedIDs().stream().mapToDouble(data::getWeight).sum();
}

We will apply two different local search metaheuristics to optimize the selected knapsack: random descent (basic) and parallel tempering (advanced). Random descent only accepts modifications of the current solution that yield an improvement, so it can easily get stuck in a local optimum. Parallel tempering also accepts inferior moves in a controlled way, allowing it to escape from such local optima.

First create the RandomDescent search with a predefined SinglePerturbationNeighbourhood, which generates moves that add, delete or swap a single (pair of) ID(s) in the current selection. This neighbourhood allows selection of variable size subsets. The solution type of the search is fixed to SubsetSolution. A maximum runtime is specified as a stop criterion and a ProgressSearchListener is attached to track the progress of the search (see example 1A: core subset selection).

// create random descent search with single perturbation neighbourhood
RandomDescent<SubsetSolution> randomDescent = new RandomDescent<>(problem, new SinglePerturbationNeighbourhood());
// set maximum runtime
long timeLimit = ...
randomDescent.addStopCriterion(new MaxRuntime(timeLimit, TimeUnit.SECONDS));
// attach listener (see example 1A)
randomDescent.addSearchListener(new ProgressSearchListener());

Then start the search, retrieve the best found solution when it completes, and finally dispose the search to release all resources.

// start search (blocks until search terminates)
randomDescent.start();
// print results
if(randomDescent.getBestSolution() != null){
    System.out.println("Items in knapsack: " + randomDescent.getBestSolution().getNumSelectedIDs() + "/" + data.getIDs().size());
    System.out.println("Total profit: " + randomDescent.getBestSolutionEvaluation());
    System.out.println("Total weight: " + computeSelectionWeight(randomDescent.getBestSolution(), data) + "/" + capacity);
} else {
    System.out.println("No valid solution found...");
}
// dispose search
randomDescent.dispose();

The basic random descent search may easily get stuck in a local optimum. Advanced searches, such as the parallel tempering algorithm, provide ways to escape from such local optima so that they can find better solutions. Parallel tempering has three parameters: the number of concurrently executed Metropolis searches (replicas) and the minimum and maximum temperature. The temperature bounds should be carefully chosen, taking into account the scale of the objective function, as the probability to accept an inferior move depends on the ratio of the difference in evaluation (delta) and temperature. If large deltas are expected, the temperatures should be higher; otherwise, almost no inferior moves will ever be accepted. Conversely, if small deltas are expected, the temperatures should be lower to accept inferior moves in a controlled way.
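To make this concrete, the sketch below shows the usual Metropolis acceptance rule that this reasoning assumes (an illustration of the principle, not part of the JAMES API): for a maximization problem, an improving move is always accepted, while an inferior move with evaluation difference delta < 0 is accepted with probability exp(delta / temperature).

// illustrative Metropolis acceptance rule (maximization); not part of the JAMES API
double acceptanceProbability(double delta, double temperature){
    // improving (or equal) moves are always accepted
    if(delta >= 0){
        return 1.0;
    }
    // inferior moves: acceptance probability shrinks as |delta| grows relative to the temperature
    return Math.exp(delta / temperature);
}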

In this example, the expected difference in evaluations between neighbouring solutions highly depends on the scale of the profits assigned to the items. Therefore the temperatures are scaled accordingly by multiplying them with the average profit of all knapsack items. Alternatively, we could have normalized the profits.

// set temperature range, scaled according to average profit of knapsack items
double scale = computeAverageProfit(data);
double minTemp = scale * 0.001;
double maxTemp = scale * 0.1;

The average profit of all knapsack items is easily computed.

double computeAverageProfit(KnapsackData data){
    return data.getIDs().stream().mapToDouble(data::getProfit).average().getAsDouble();
}

Now we can create the parallel tempering search.

// create parallel tempering
int numReplicas = 10;
ParallelTempering<SubsetSolution> parallelTempering = new ParallelTempering<>(problem,
                                                            new SinglePerturbationNeighbourhood(),
                                                            numReplicas, minTemp, maxTemp);

Running the search is done in exactly the same way as for the random descent algorithm. Experimenting with some example input will show that the advanced parallel tempering algorithm finds much better solutions for the 0/1 knapsack problem than the basic random descent algorithm.
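For completeness, a sketch of the corresponding run, mirroring the random descent code above and reusing the same time limit:

// set maximum runtime
parallelTempering.addStopCriterion(new MaxRuntime(timeLimit, TimeUnit.SECONDS));
// attach listener (see example 1A)
parallelTempering.addSearchListener(new ProgressSearchListener());
// start search (blocks until search terminates)
parallelTempering.start();
// print results
if(parallelTempering.getBestSolution() != null){
    System.out.println("Items in knapsack: " + parallelTempering.getBestSolution().getNumSelectedIDs() + "/" + data.getIDs().size());
    System.out.println("Total profit: " + parallelTempering.getBestSolutionEvaluation());
    System.out.println("Total weight: " + computeSelectionWeight(parallelTempering.getBestSolution(), data) + "/" + capacity);
} else {
    System.out.println("No valid solution found...");
}
// dispose search
parallelTempering.dispose();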

The complete source code of this example is available on GitHub, including some additional code to read the input from a text file. You can also download a ZIP file that contains the Java sources of all examples, a compiled JAR (including all dependencies) and some input files in case you want to run any of the examples. To run this example, execute

$ java -cp james-examples.jar org.jamesframework.examples.knapsack.KnapSack <inputfile> <capacity> <runtime>

from the command line. The input file should be a text file in which the first row contains a single number N that indicates the number of available knapsack items. The subsequent N rows each contain the profit and weight (in this order) of a single item, separated by one or more spaces. The runtime is given in seconds.
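For example, a small hypothetical input file describing three items (profit followed by weight on each line) could look like this:

3
8.2 3.5
5.0 1.2
9.7 4.8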