# Examples

Step-by-step implementation guides are provided here for a series of examples. The full source code of all examples is bundled in the examples module. You can browse the code on GitHub or download a ZIP file that contains all Java sources, a compiled JAR (including all dependencies) and example input files, in case you would like to run any of the examples.

These examples show how to model a subset selection problem and how to apply a variety of searches to find good solutions.

## Example 1A: Basic fixed size subset selection

Algorithm: random descent with a predefined subset neighbourhood (single swap).

The core subset selection problem originates from the field of crop science. Given a large collection of crop varieties, the goal is to select a maintainable subset that represents the diversity of the entire collection as much as possible. There are several approaches to this problem, with distinct objective functions.

This example discusses the implementation of a simplified core subset selection problem in JAMES. It is assumed that a distance matrix is available in which the dissimilarity of any pair of crop varieties is expressed. The goal is to select a subset of fixed size with maximum average distance between all pairs of selected items.

The basic random descent algorithm is applied to sample a core subset, using a predefined subset neighbourhood that swaps a single selected and unselected ID to modify the current solution.
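The objective described above can be sketched as follows. This is an illustrative, self-contained computation of the average pairwise distance of a selected subset, not the actual code from the bundled example; the distance matrix and item IDs are hypothetical inputs.

```java
import java.util.Set;

// Illustrative sketch of the example's objective: the score of a selected
// subset is the average distance over all pairs of selected items.
public class AveragePairwiseDistance {

    public static double evaluate(Set<Integer> selected, double[][] dist) {
        double sum = 0.0;
        int pairs = 0;
        Integer[] ids = selected.toArray(new Integer[0]);
        for (int i = 0; i < ids.length; i++) {
            for (int j = i + 1; j < ids.length; j++) {
                sum += dist[ids[i]][ids[j]];
                pairs++;
            }
        }
        return pairs == 0 ? 0.0 : sum / pairs;
    }

    public static void main(String[] args) {
        double[][] dist = {
            {0, 1, 2},
            {1, 0, 4},
            {2, 4, 0}
        };
        // selecting items 1 and 2 -> single pair with distance 4
        System.out.println(evaluate(Set.of(1, 2), dist)); // prints 4.0
    }
}
```

A single swap move replaces one selected ID by one unselected ID, so the subset size stays fixed while this objective value changes.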

## Example 1B: Efficient delta evaluation in neighbourhood searches

Algorithm: random descent with a predefined subset neighbourhood (single swap).

When a neighbourhood search evaluates a neighbour of the current solution, the two solutions are usually very similar. It is therefore often possible to evaluate such a neighbour based on the current solution, its evaluation and the applied move (changes), using fewer computations than a full evaluation of the neighbour would require.

This example demonstrates how to implement such efficient delta evaluation for the same problem and objective from example 1A. The same basic random descent algorithm is applied to sample a core subset, using the predefined single swap neighbourhood.
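For the average pairwise distance objective, a delta evaluation of a single swap can be sketched as follows. This is an illustration of the idea, not the bundled example code: only distances involving the swapped-out and swapped-in items are adjusted, which takes linear instead of quadratic time in the subset size.

```java
import java.util.Set;

// Illustrative delta evaluation for the single swap move: instead of
// re-summing all pairs, only the pair sums involving the removed and
// added items are updated.
public class SwapDeltaEvaluation {

    // full evaluation: sum of distances over all selected pairs
    public static double pairSum(Set<Integer> selected, double[][] dist) {
        double sum = 0.0;
        for (int i : selected) {
            for (int j : selected) {
                if (i < j) sum += dist[i][j];
            }
        }
        return sum;
    }

    // delta evaluation: new pair sum after removing 'out' and adding 'in',
    // computed in O(n) instead of O(n^2)
    public static double swappedPairSum(double curSum, Set<Integer> selected,
                                        int out, int in, double[][] dist) {
        double sum = curSum;
        for (int k : selected) {
            if (k != out) {
                sum -= dist[out][k]; // pairs lost with the removed item
                sum += dist[in][k];  // pairs gained with the added item
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] dist = {
            {0, 1, 2, 3},
            {1, 0, 4, 5},
            {2, 4, 0, 6},
            {3, 5, 6, 0}
        };
        Set<Integer> selected = Set.of(0, 1); // current subset {0, 1}
        double cur = pairSum(selected, dist); // pair sum 1.0
        // swap item 1 out, item 3 in -> subset {0, 3}, pair sum 3.0
        System.out.println(swappedPairSum(cur, selected, 1, 3, dist)); // prints 3.0
    }
}
```

Since the subset size is fixed, the number of pairs does not change, so the updated pair sum immediately yields the updated average.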

## Example 1C: Custom evaluation objects with metadata for delta evaluation

Algorithms: random descent and parallel tempering with a predefined subset neighbourhood (single swap).

Various objective functions have been proposed for the core subset selection problem. In this example, the average distance from every selected item to the closest other selected item is maximized, instead of the average distance between all pairs. To be able to provide an efficient delta evaluation for this objective, a custom evaluation object is constructed that stores additional metadata.

The random descent and parallel tempering algorithms are applied to sample a core subset, using a predefined subset neighbourhood that swaps a single selected and unselected ID to modify the current solution.
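The entry-to-nearest-entry objective, and the kind of metadata a custom evaluation object could carry, can be sketched as follows. This is an illustration under assumed names, not the bundled implementation: the map of nearest distances is the metadata that makes an efficient delta evaluation possible, since after a swap only items whose nearest entry was affected need to be recomputed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the entry-to-nearest-entry objective: the value is
// the average, over all selected items, of the distance to the closest other
// selected item.
public class EntryToNearestEntry {

    // metadata that a custom evaluation object could store alongside the value
    public static Map<Integer, Double> nearestDistances(Set<Integer> selected,
                                                        double[][] dist) {
        Map<Integer, Double> nearest = new HashMap<>();
        for (int i : selected) {
            double min = Double.POSITIVE_INFINITY;
            for (int j : selected) {
                if (i != j && dist[i][j] < min) min = dist[i][j];
            }
            nearest.put(i, min);
        }
        return nearest;
    }

    public static double evaluate(Set<Integer> selected, double[][] dist) {
        return nearestDistances(selected, dist).values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        double[][] dist = {
            {0, 1, 5},
            {1, 0, 2},
            {5, 2, 0}
        };
        // selected {0, 1, 2}: nearest distances are 1 (item 0), 1 (item 1)
        // and 2 (item 2), so the average is 4/3
        System.out.println(evaluate(Set.of(0, 1, 2), dist));
    }
}
```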

## Example 2A: Variable size subset selection with a mandatory constraint

Algorithms: random descent and parallel tempering with a predefined subset neighbourhood (single perturbation).

The 0/1 knapsack problem is a constrained variable size subset selection problem. The input consists of a list of items, each with a fixed weight and profit. The goal is to select a subset of these items with maximal total profit, such that the total weight does not exceed a given knapsack capacity. The latter requirement is imposed using a mandatory constraint, so that invalid solutions are discarded during search.

The random descent and parallel tempering algorithms are applied to optimize the selection, using a predefined subset neighbourhood that adds, deletes or swaps a (pair of) ID(s) to modify the current selection.
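The knapsack model described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the bundled source: the objective sums the profits of the selected items, while the mandatory constraint checks that the total weight stays within the capacity.

```java
import java.util.Set;

// Illustrative sketch of the 0/1 knapsack model: maximize total profit,
// subject to the total weight not exceeding the capacity.
public class Knapsack {

    public static double totalProfit(Set<Integer> selected, double[] profit) {
        return selected.stream().mapToDouble(i -> profit[i]).sum();
    }

    // the mandatory constraint: selections violating this are discarded
    public static boolean withinCapacity(Set<Integer> selected, double[] weight,
                                         double capacity) {
        double total = selected.stream().mapToDouble(i -> weight[i]).sum();
        return total <= capacity;
    }

    public static void main(String[] args) {
        double[] profit = {10, 7, 3};
        double[] weight = {4, 3, 1};
        double capacity = 5;
        Set<Integer> sel = Set.of(0, 2); // total weight 5, total profit 13
        System.out.println(withinCapacity(sel, weight, capacity)); // prints true
        System.out.println(totalProfit(sel, profit));              // prints 13.0
    }
}
```

With a mandatory constraint, a neighbour produced by an add, delete or swap move is simply rejected whenever this check fails.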

## Example 2B: Variable size subset selection with a penalizing constraint

Algorithm: parallel tempering with a predefined subset neighbourhood (single perturbation).

In this example, the 0/1 knapsack problem is solved using a penalizing instead of a mandatory constraint. Such a penalizing constraint does not cause solutions that violate it to be discarded, but instead assigns a penalty to their evaluation. This allows the search to cross regions of the search space that contain infeasible solutions.

The parallel tempering algorithm is applied to optimize the selection, using a predefined subset neighbourhood that adds, deletes or swaps a (pair of) ID(s) to modify the current selection.
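A penalized evaluation can be sketched as follows. The penalty scheme here (a fixed penalty per unit of excess weight) is an assumption for illustration, not necessarily the scheme used in the bundled example: the point is that an overweight selection keeps a finite, reduced evaluation instead of being rejected outright.

```java
import java.util.Set;

// Illustrative penalized knapsack evaluation: profit is reduced in proportion
// to the weight exceeding the capacity, so infeasible selections remain
// comparable during search.
public class PenalizedKnapsack {

    public static double penalizedProfit(Set<Integer> selected, double[] profit,
                                         double[] weight, double capacity,
                                         double penaltyPerUnit) {
        double p = 0.0, w = 0.0;
        for (int i : selected) {
            p += profit[i];
            w += weight[i];
        }
        double excess = Math.max(0.0, w - capacity);
        return p - penaltyPerUnit * excess; // penalty only if over capacity
    }

    public static void main(String[] args) {
        double[] profit = {10, 7, 3};
        double[] weight = {4, 3, 1};
        // {0, 1}: weight 7 exceeds capacity 5 by 2
        // -> profit 17 minus penalty 2 * 6 = 12
        System.out.println(penalizedProfit(Set.of(0, 1), profit, weight, 5, 6)); // prints 5.0
    }
}
```

The penalty weight controls how easily the search wanders through infeasible regions: too small and the best found solution may remain infeasible, too large and the constraint effectively becomes mandatory again.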

## Example 3: Variable size subset selection with a custom greedy neighbourhood

Algorithms: random descent and variable neighbourhood search.

The maximum clique problem originates from graph theory and has many applications in e.g. social networks, bioinformatics and computational chemistry. The goal is to find a clique of maximum size in a given graph. Such a clique consists of a subset of vertices which are all connected to each other (a complete subgraph).

The random descent and variable neighbourhood search algorithms are applied to search for a maximum clique. The algorithms start with an empty clique and use a custom greedy neighbourhood that always adds a single vertex which is connected to all vertices in the current clique. The variable neighbourhood search uses additional shaking neighbourhoods to escape from local optima.
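The core step of such a greedy neighbourhood can be sketched as follows. This is an illustration with assumed names, not the bundled neighbourhood class: a vertex can only be added if it is adjacent to every vertex already in the clique, so only such vertices generate moves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the greedy clique neighbourhood's candidate
// generation: list the vertices that are connected to all current clique
// members and can therefore be added.
public class CliqueCandidates {

    public static List<Integer> addableVertices(Set<Integer> clique,
                                                boolean[][] adjacent) {
        List<Integer> candidates = new ArrayList<>();
        for (int v = 0; v < adjacent.length; v++) {
            if (clique.contains(v)) continue;
            boolean connectedToAll = true;
            for (int u : clique) {
                if (!adjacent[v][u]) { connectedToAll = false; break; }
            }
            if (connectedToAll) candidates.add(v);
        }
        return candidates;
    }

    public static void main(String[] args) {
        // triangle 0-1-2 plus vertex 3 connected only to vertex 0
        boolean[][] adj = new boolean[4][4];
        adj[0][1] = adj[1][0] = true;
        adj[0][2] = adj[2][0] = true;
        adj[1][2] = adj[2][1] = true;
        adj[0][3] = adj[3][0] = true;
        // from clique {0, 1}, only vertex 2 can be added
        System.out.println(addableVertices(Set.of(0, 1), adj)); // prints [2]
    }
}
```

Because this neighbourhood only ever grows the clique, it gets stuck in a local optimum as soon as no addable vertex remains, which is why the variable neighbourhood search combines it with shaking neighbourhoods that remove vertices.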

These examples show how to implement and solve other types of problems besides subset selection.

## Example 4A: Basic custom problem specification

Algorithms: random descent and parallel tempering.

The travelling salesman problem consists of finding the shortest round trip through a list of given cities. Each city is to be visited exactly once and the trip has to end in the same city where the tour started. This example considers the symmetric TSP where the travel distance from a to b is always the same as that from b to a. The goal is to create a permutation (ordered list) of all cities so that visiting the cities in this specific order yields the lowest possible total travel distance.

The random descent and parallel tempering algorithms are applied to optimize the round trip using a custom neighbourhood that reverses sub-series of cities in given TSP solutions. This type of move is often referred to as a 2-opt move in the context of TSP.
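The 2-opt move described above can be sketched as follows. This is an illustrative, self-contained version, not the custom neighbourhood from the bundled example: reversing the sub-series between two positions replaces the two tour edges crossing those positions with two new ones.

```java
import java.util.Arrays;

// Illustrative sketch of a 2-opt move on a TSP tour, represented as a
// permutation of city indices.
public class TwoOptMove {

    // reverse the cities between positions i and j (inclusive)
    public static int[] apply(int[] tour, int i, int j) {
        int[] result = tour.clone();
        while (i < j) {
            int tmp = result[i];
            result[i] = result[j];
            result[j] = tmp;
            i++;
            j--;
        }
        return result;
    }

    // total round trip length, including the edge back to the start
    public static double tourLength(int[] tour, double[][] dist) {
        double len = 0.0;
        for (int k = 0; k < tour.length; k++) {
            len += dist[tour[k]][tour[(k + 1) % tour.length]];
        }
        return len;
    }

    public static void main(String[] args) {
        int[] tour = {0, 1, 2, 3};
        System.out.println(Arrays.toString(apply(tour, 1, 2))); // prints [0, 2, 1, 3]
    }
}
```

For the symmetric TSP, only the two replaced edges change the tour length, so this move also lends itself to an efficient delta evaluation.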

## Example 4B: Design of reusable components for permutation problems

Algorithm: parallel tempering.

This example reconsiders the travelling salesman problem. As this is actually a permutation problem, generic components are designed that can be reused for any permutation problem, similar to the subset selection components already available in the core module. The necessary TSP data and objective are then plugged in.

All generic permutation problem components designed in this example have been made available in the extensions module.

The parallel tempering algorithm is applied to optimize the round trip with a generic neighbourhood that reverses subsequences in given permutations, one of the permutation neighbourhoods provided in the extensions module.
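The design idea can be sketched as follows. The interface and method names here are hypothetical and do not reflect the actual extensions module API: the point is that a move and neighbourhood defined on plain permutations know nothing about TSP, so any permutation problem can reuse them.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of generic permutation components: the move reverses a
// subsequence of any permutation, independent of the problem it encodes.
public class PermutationComponents {

    // a generic move on a permutation
    public interface PermutationMove {
        void apply(int[] permutation);
    }

    public static PermutationMove reverseMove(int from, int to) {
        return permutation -> {
            for (int i = from, j = to; i < j; i++, j--) {
                int tmp = permutation[i];
                permutation[i] = permutation[j];
                permutation[j] = tmp;
            }
        };
    }

    // a generic neighbourhood step: sample a random reverse move
    public static PermutationMove randomReverseMove(int size, Random rng) {
        int a = rng.nextInt(size), b = rng.nextInt(size);
        return reverseMove(Math.min(a, b), Math.max(a, b));
    }

    public static void main(String[] args) {
        int[] perm = {0, 1, 2, 3, 4};
        reverseMove(1, 3).apply(perm);
        System.out.println(Arrays.toString(perm)); // prints [0, 3, 2, 1, 4]
    }
}
```

The problem-specific parts (the TSP distance data and the travel distance objective) are then plugged into this generic machinery.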

The following examples show how to use the automated analysis workflow from the extensions module and corresponding R package.

## Example 5A: Find appropriate temperature range for parallel tempering

Algorithms: Metropolis searches with different temperatures.

This example demonstrates how to perform a parameter sweep using the provided tools from the extensions module. The core subset selection problem from example 1C is used as a case study, where the entry-to-nearest-entry objective is maximized. Different Metropolis searches with varying fixed temperatures are applied to find an appropriate temperature range to be used for a parallel tempering search (see example 5B). The results are inspected using the provided R package.
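The role of the temperature parameter being swept can be illustrated with the Metropolis acceptance criterion. This is a generic sketch of the criterion for a maximized objective, not the framework's implementation: improvements are always accepted, while deteriorations are accepted with a probability that shrinks as the temperature decreases.

```java
import java.util.Random;

// Illustrative Metropolis acceptance criterion for a maximized objective:
// delta is the neighbour's evaluation minus the current evaluation.
public class MetropolisCriterion {

    public static boolean accept(double delta, double temperature, Random rng) {
        if (delta >= 0) return true; // improvements are always accepted
        // deteriorations are accepted with probability exp(delta / temperature)
        return rng.nextDouble() < Math.exp(delta / temperature);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        System.out.println(accept(0.5, 1.0, rng)); // prints true
    }
}
```

Sweeping over a series of fixed temperatures shows which range balances exploration (high temperatures, many deteriorations accepted) against exploitation (low temperatures, almost none accepted), and that range is then handed to the parallel tempering search.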

## Example 5B: Compare algorithm performance

Algorithms: random descent and parallel tempering.

This example compares algorithm performance using the provided tools from the extensions module. The core subset selection problem from example 1C is used as a case study, where the entry-to-nearest-entry objective is maximized. Both random descent and parallel tempering are applied to solve the problem for a series of given datasets. The temperature range for parallel tempering is set based on the results of example 5A. The performance of both algorithms is compared using the provided R package.