Exact Algorithms for Cluster Editing: Evaluation and

Exact Algorithms for Cluster Editing: Evaluation and Experiments S. B ocker1, S. Briesemeister2, G. W. Klau3 1Friedrich-Schiller-Universit at Jena 2Eb...

0 downloads 128 Views 818KB Size
Exact Algorithms for Cluster Editing: Evaluation and Experiments S. B¨ocker1 , S. Briesemeister2 , G. W. Klau3 1 Friedrich-Schiller-Universit¨ at 2 Eberhard-Karls-Universit¨ at 3 Freie

Jena

T¨ ubingen

Universit¨ at Berlin

7th International Workshop on Experimental Algorithms June 1st, 2008

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

1

Outline

1 2 3 4 5

Introduction Data Reduction Fixed-Parameter (FPT) Algorithm Integer Linear Programming (ILP) Approach Experiments

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

2

Introduction

The Cluster Editing Problem

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

3

Introduction

The Cluster Editing Problem

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

3

Introduction

The Cluster Editing Problem

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

3

Introduction

The Cluster Editing Problem Definition Input: An unweighted undirected graph G = (V , E ). Task: Find a set of edge modifications of minimum cardinality which transform G into a transitive graph.

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

4

Introduction

The Cluster Editing Problem Definition Input: An unweighted undirected graph G = (V , E ). Task: Find a set of edge modifications of minimum cardinality which transform G into a transitive graph.

conflict triple ! B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

4

Introduction

The Cluster Editing Problem Background NP-hard Kriv´anek & Mor´avek, 1986 many FPT approaches parameter k is the number of edge modifications first non-trivial FPT algorithm: O(2.27k ) problem kernel of size 4kopt Guo, 2007

Gramm et al., 2005

This Work New FPT algorithm: O(1.82k )

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

5

Introduction

The Cluster Editing Problem Background NP-hard Kriv´anek & Mor´avek, 1986 many FPT approaches parameter k is the number of edge modifications first non-trivial FPT algorithm: O(2.27k ) problem kernel of size 4kopt Guo, 2007

Gramm et al., 2005

This Work New FPT algorithm: O(1.82k )

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

5

Introduction

The Cluster Editing Problem Motivation evaluation of Gramm et al. FPT algorithm Dehne et al., 2006 → sufficient to solve instances with |V | = 150 and 50 changes this work: graphs with |V | ≥ 1000 and 20000+ changes

www.math.cornell.edu/∼ durrett/RGD/protein map.gif B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

6

Introduction

The Cluster Editing Problem Basic Concept FPT algorithm for Weighted Cluster Editing B¨ocker et al., 2007 edge uv with deletion costs d labeled with s(uv ) = d edge uv with insertion costs i labeled with s(uv ) = −i unweighted graphs: edges labeled with +1/ − 1 4

1 1

3

2

3

-1

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

7

Introduction

The Cluster Editing Problem Basic Concept FPT algorithm for Weighted Cluster Editing B¨ocker et al., 2007 edge uv with deletion costs d labeled with s(uv ) = d edge uv with insertion costs i labeled with s(uv ) = −i unweighted graphs: edges labeled with +1/ − 1 4

1 1

3

2

3

-1

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

7

Introduction

The Cluster Editing Problem Basic Concept FPT algorithm for Weighted Cluster Editing B¨ocker et al., 2007 edge uv with deletion costs d labeled with s(uv ) = d edge uv with insertion costs i labeled with s(uv ) = −i unweighted graphs: edges labeled with +1/ − 1 4 3 3

-1

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

7

Data Reduction

Merging Vertices final decision for edges → permanent or forbidden if edge uv becomes permanent then merge u and v to u 0 for all w ∈ V \ {u, v } set s(u 0 w ) ← s(uw ) + s(vw ) reduce k by min{|s(uw )|, |s(vw )|}

∞ a

B¨ ocker, Briesemeister, Klau

b

Exact Algorithms for Cluster Editing

a+b

June 1st, 2008

8

Data Reduction

Merging Vertices final decision for edges → permanent or forbidden if edge uv becomes permanent then merge u and v to u 0 for all w ∈ V \ {u, v } set s(u 0 w ) ← s(uw ) + s(vw ) reduce k by min{|s(uw )|, |s(vw )|}

∞ a

B¨ ocker, Briesemeister, Klau

pay b -b

Exact Algorithms for Cluster Editing

a-b

June 1st, 2008

8

Data Reduction

4k Kernel Critical Clique Clique K where the vertices of K have the same set of neighbors in V \ K and K is maximal under this property. Guo, 2007

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

9

Data Reduction

4k Kernel Critical Clique Clique K where the vertices of K have the same set of neighbors in V \ K and K is maximal under this property. Guo, 2007

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

9

Data Reduction

4k Kernel Critical Clique Clique K where the vertices of K have the same set of neighbors in V \ K and K is maximal under this property. Guo, 2007

2

2 4

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

9

Data Reduction

More Reduction Rules Simple Rules search for “heavy” (non)-edges

Almost Clique Rule searches for internal highly connected and external loosely connected subgraphs uses heuristic for subgraph search

Similar Neighborhood searches for nodes with similar neighborhood can be computed via dynamic programming B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

10

Data Reduction

More Reduction Rules Simple Rules search for “heavy” (non)-edges

Almost Clique Rule searches for internal highly connected and external loosely connected subgraphs uses heuristic for subgraph search

Similar Neighborhood searches for nodes with similar neighborhood can be computed via dynamic programming B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

10

FPT Algorithm

Branching and Reduction Simple Edge Branching search for edge that minimizes branching number simply branch in case “permanent” or “forbidden”

Parameter-Dependent Reduction icp(uv ) =

P

w ∈N(u)∆N(v )

min{|s(uw )|, |s(vw )|} (permanent)

icf (uv ) =

P

w ∈N(u)∩N(v )

min{s(uw ), s(vw )} (forbidden)

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

11

FPT Algorithm

Branching and Reduction Simple Edge Branching search for edge that minimizes branching number simply branch in case “permanent” or “forbidden”

Parameter-Dependent Reduction icp(uv ) =

P

w ∈N(u)∆N(v )

min{|s(uw )|, |s(vw )|} (permanent)

icf (uv ) =

P

w ∈N(u)∩N(v )

min{s(uw ), s(vw )} (forbidden)

∞ a

B¨ ocker, Briesemeister, Klau



-b

a

Exact Algorithms for Cluster Editing

b

June 1st, 2008

11

FPT Algorithm

Upper and Lower Bounds Lower Bound Greedy search for edge-disjoint conflict triples. 2 3

-5

Upper Bound greedy search for edges where reduction rules almost apply use upper bound to utilize parameter-dependent reduction in preprocessing B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

12

ILP Approach

Integer Linear Programming Approach Variables V 2

For each edge e ∈



binary variable xe with interpretation ( 1 e in solution xe = 0 otherwise

Exclude Conflict Triples 1

k

j

2

k

j

3

k 1 2

j

3

i

B¨ ocker, Briesemeister, Klau

i

+xij + xjk − xik ≤ 1 +xij − xjk + xik ≤ 1 −xij + xjk + xik ≤ 1

i

Exact Algorithms for Cluster Editing

June 1st, 2008

13

ILP Approach

Integer Linear Programming Approach Integer Linear Program minimize

X

s(ij) −

s(ij)xij

1≤i
ij∈E

subject to

X

+ xij + xjk − xik ≤ 1

for all 1 ≤ i < j < k ≤ n

+ xij − xjk + xik ≤ 1

for all 1 ≤ i < j < k ≤ n

− xij + xjk + xik ≤ 1

for all 1 ≤ i < j < k ≤ n

xij ∈ {0, 1}

for all 1 ≤ i < j ≤ n

Contribution of Edge ij to Objective Function ij ∈ E ij ∈ /E B¨ ocker, Briesemeister, Klau

xij = 1 s(ij) − s(ij) = 0 −s(ij) > 0

xij = 0 s(ij) − 0 > 0 0

Exact Algorithms for Cluster Editing

June 1st, 2008

14

ILP Approach

Integer Linear Programming Approach Gr¨otschel and Wakabayashi studied the corresponding polyhedron and identified additional facet-defining inequalities. Gr¨otschel and Wakabayashi,1989

Cutting Plane Algorithm 1 2

solve (initially empty) LP relaxation check current solution x¯: violated conflict triple inequality → add violated other inequality → add

3

if no violated inequalities exist and x¯ is integral → finished, else → goto 1

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

15

Experiments

Datasets

Random Unweighted Graphs randomly create a transitive graph of size n choose k vertex tuples and invert the corresponding edges

Protein Similarity Data real-value weighted graphs (not shown here)

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

16

Experiments

Weighted Data Reduction

0.6 0.4 0.0

0.2

reduction ratio

0.8

1.0

large graphs get reduced more efficiently than small graphs nearly no twilight zone (everything or nothing reduced) ratio k/n important for reduction

200

400

600

800

1000

n B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

17

Experiments

Weighted Data Reduction

0.8 0.6 0.4 0.2

n = 100 n = 500 n = 1500 n = 2000 unweighted kernel

0.0

average reduction ratio

1.0

unweighted kernel reduction effective for k/n ≤ 0.5 weighted reduction much more efficient

0

5

10

15

20

25

k/n B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

18

Experiments

ILP vs. FPT

comparison on reduced graphs size n ≈ 100, 200, . . . , 900 costs k ≈ 1n, 2n, . . . , 10n about 28 graphs per bin on average

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

19

Experiments

ILP vs. FPT

100

150

FPT n= 500 FPT n= 900 ILP n= 500 ILP n= 900

0

50

runtime in minutes

200

250

FPT depends on actual edit cost k ILP more dependent on graph size n

2

4

6

8

10

k/n B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

20

Experiments

Cluster Editing Efficiently Solved

efficient reduction rules and bounds graphs with thousands of vertices shrink significantly FPT algorithm much faster than Gramm et al. algorithm ILP usually outperforms FPT promising combination of reduction and ILP Cluster Editing no longer restricted to small or almost transitive graphs

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

21

Experiments

Thank you for your attention!

[email protected]

B¨ ocker, Briesemeister, Klau

Exact Algorithms for Cluster Editing

June 1st, 2008

22