PACE 2021 - Cluster Editing

The Cluster Editing problem (also referred to as the Correlation Clustering problem) is defined as follows:

Input: An undirected graph $G = (V, E)$.
Output: A set of edge modifications (addition or deletion) of minimum size that transforms $G$ into a cluster graph.

Recall that a cluster graph is a graph in which every connected component forms a clique; it is also characterized by the forbidden induced subgraph $P_3$ (a path on three vertices).

Example

Example

In this example, one edge addition and two edge deletions transform the input graph into a cluster graph.

Background

Clustering plays an important role in modern society. Clustering is the task of partitioning instances into some number of groups (called clusters) such that instances in the same group are similar to one another. The Cluster Editing problem is one of the most natural ways to model clustering on graphs.

Cluster Editing is NP-hard but fixed parameter tractable (FPT) for the minimum solution size $k$. In fact, there is a simple $O(3^k \cdot (|V| + |E|))$-time branching algorithm. If there is an induced path $(u, v, w)$ on three vertices (which can be found in linear time), we branch into three cases: add an edge $uw$, delete the edge $uv$, or delete the edge $vw$.

Illustration of the branching rule

Literature