
Acquisitions Editor: Matt Goldstein

Project Editor: Maite Suarez-Rivas

Production Supervisor: Marilyn Lloyd

Marketing Manager: Michelle Brown

Marketing Coordinator: Jake Zavracky

Project Management: Windfall Software

Composition: Windfall Software, using ZzTEX

Copyeditor: Carol Leyba

Technical Illustration: Dartmouth Publishing

Proofreader: Jennifer McClain

Indexer: Ted Laux

Cover Design: Joyce Cosentino Wells

Cover Photo: © 2005 Tim Laman / National Geographic. A pair of weaverbirds work together on their nest in Africa.

Prepress and Manufacturing: Caroline Fell

Printer: Courier Westford

Access the latest information about Addison-Wesley titles from our World Wide Web site: http://www.aw-bc.com/computing

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

The programs and applications presented in this book have been included for their instructional value. They have been tested with care, but are not guaranteed for any particular purpose. The publisher does not offer any warranties or representations, nor does it accept any liabilities with respect to the programs or applications.

Library of Congress Cataloging-in-Publication Data

Kleinberg, Jon.

Algorithm design / Jon Kleinberg, Éva Tardos. -- 1st ed.

p. cm.

Includes bibliographical references and index.

ISBN 0-321-29535-8 (alk. paper)

1. Computer algorithms. 2. Data structures (Computer science) I. Tardos, Éva.
II. Title.

QA76.9.A43K54 2005

005.1--dc22

2005000401

Copyright © 2006 by Pearson Education, Inc.

For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contract Department, 75 Arlington Street, Suite 300, Boston, MA 02116, or fax your request to (617) 848-7047.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or any other media embodiments now known or hereafter to become known, without the prior written permission of the publisher. Printed in the United States of America.

ISBN 0-321-29535-8

1 2 3 4 5 6 7 8 9 10-CRW-08 07 06 05

About the Authors

Jon Kleinberg is a professor of Computer Science at Cornell University. He received his Ph.D. from M.I.T. in 1996. He is the recipient of an NSF Career Award, an ONR Young Investigator Award, an IBM Outstanding Innovation Award, the National Academy of Sciences Award for Initiatives in Research, research fellowships from the Packard and Sloan Foundations, and teaching awards from the Cornell Engineering College and Computer Science Department.

Kleinberg's research is centered around algorithms, particularly those concerned with the structure of networks and information, and with applications to information science, optimization, data mining, and computational biology. His work on network analysis using hubs and authorities helped form the foundation for the current generation of Internet search engines.

Éva Tardos is a professor of Computer Science at Cornell University. She received her Ph.D. from Eötvös University in Budapest, Hungary in 1984. She is a member of the American Academy of Arts and Sciences, and an ACM Fellow; she is the recipient of an NSF Presidential Young Investigator Award, the Fulkerson Prize, research fellowships from the Guggenheim, Packard, and Sloan Foundations, and teaching awards from the Cornell Engineering College and Computer Science Department.

Tardos's research interests are focused on the design and analysis of algorithms for problems on graphs or networks. She is most known for her work on network-flow algorithms and approximation algorithms for network problems. Her recent work focuses on algorithmic game theory, an emerging area concerned with designing systems and algorithms for selfish users.


Contents

About the Authors   v

Preface   xiii

1  Introduction: Some Representative Problems   1
   1.1  A First Problem: Stable Matching   1
   1.2  Five Representative Problems   12
   Solved Exercises   19
   Exercises   22
   Notes and Further Reading   28

2  Basics of Algorithm Analysis   29
   2.1  Computational Tractability   29
   2.2  Asymptotic Order of Growth   35
   2.3  Implementing the Stable Matching Algorithm Using Lists and Arrays   42
   2.4  A Survey of Common Running Times   47
   2.5  A More Complex Data Structure: Priority Queues   57
   Solved Exercises   65
   Exercises   67
   Notes and Further Reading   70

3  Graphs   73
   3.1  Basic Definitions and Applications   73
   3.2  Graph Connectivity and Graph Traversal   78
   3.3  Implementing Graph Traversal Using Queues and Stacks   87
   3.4  Testing Bipartiteness: An Application of Breadth-First Search   94
   3.5  Connectivity in Directed Graphs   97
   3.6  Directed Acyclic Graphs and Topological Ordering   99
   Solved Exercises   104
   Exercises   107
   Notes and Further Reading   112

4  Greedy Algorithms   115
   4.1  Interval Scheduling: The Greedy Algorithm Stays Ahead   116
   4.2  Scheduling to Minimize Lateness: An Exchange Argument   125
   4.3  Optimal Caching: A More Complex Exchange Argument   131
   4.4  Shortest Paths in a Graph   137
   4.5  The Minimum Spanning Tree Problem   142
   4.6  Implementing Kruskal's Algorithm: The Union-Find Data Structure   151
   4.7  Clustering   157
   4.8  Huffman Codes and Data Compression   161
  *4.9  Minimum-Cost Arborescences: A Multi-Phase Greedy Algorithm   177
   Solved Exercises   183
   Exercises   188
   Notes and Further Reading   205

5  Divide and Conquer   209
   5.1  A First Recurrence: The Mergesort Algorithm   210
   5.2  Further Recurrence Relations   214
   5.3  Counting Inversions   221
   5.4  Finding the Closest Pair of Points   225
   5.5  Integer Multiplication   231
   5.6  Convolutions and the Fast Fourier Transform   234
   Solved Exercises   242
   Exercises   246
   Notes and Further Reading   249

6  Dynamic Programming   251
   6.1  Weighted Interval Scheduling: A Recursive Procedure   252
   6.2  Principles of Dynamic Programming: Memoization or Iteration over Subproblems   258
   6.3  Segmented Least Squares: Multi-way Choices   261
   6.4  Subset Sums and Knapsacks: Adding a Variable   266
   6.5  RNA Secondary Structure: Dynamic Programming over Intervals   272
   6.6  Sequence Alignment   278
   6.7  Sequence Alignment in Linear Space via Divide and Conquer   284
   6.8  Shortest Paths in a Graph   290
   6.9  Shortest Paths and Distance Vector Protocols   297
  *6.10 Negative Cycles in a Graph   301
   Solved Exercises   307
   Exercises   312
   Notes and Further Reading   335

7  Network Flow   337
   7.1  The Maximum-Flow Problem and the Ford-Fulkerson Algorithm   338
   7.2  Maximum Flows and Minimum Cuts in a Network   346
   7.3  Choosing Good Augmenting Paths   352
  *7.4  The Preflow-Push Maximum-Flow Algorithm   357
   7.5  A First Application: The Bipartite Matching Problem   367
   7.6  Disjoint Paths in Directed and Undirected Graphs   373
   7.7  Extensions to the Maximum-Flow Problem   378
   7.8  Survey Design   384
   7.9  Airline Scheduling   387
   7.10 Image Segmentation   391
   7.11 Project Selection   396
   7.12 Baseball Elimination   400
  *7.13 A Further Direction: Adding Costs to the Matching Problem   404
   Solved Exercises   411
   Exercises   415
   Notes and Further Reading   448

8  NP and Computational Intractability   451
   8.1  Polynomial-Time Reductions   452
   8.2  Reductions via "Gadgets": The Satisfiability Problem   459
   8.3  Efficient Certification and the Definition of NP   463
   8.4  NP-Complete Problems   466
   8.5  Sequencing Problems   473
   8.6  Partitioning Problems   481
   8.7  Graph Coloring   485
   8.8  Numerical Problems   490
   8.9  Co-NP and the Asymmetry of NP   495
   8.10 A Partial Taxonomy of Hard Problems   497
   Solved Exercises   500
   Exercises   505
   Notes and Further Reading   529

9  PSPACE: A Class of Problems beyond NP   531
   9.1  PSPACE   531
   9.2  Some Hard Problems in PSPACE   533
   9.3  Solving Quantified Problems and Games in Polynomial Space   536
   9.4  Solving the Planning Problem in Polynomial Space   538
   9.5  Proving Problems PSPACE-Complete   543
   Solved Exercises   547
   Exercises   550
   Notes and Further Reading   551

10 Extending the Limits of Tractability   553
   10.1 Finding Small Vertex Covers   554
   10.2 Solving NP-Hard Problems on Trees   558
   10.3 Coloring a Set of Circular Arcs   563
  *10.4 Tree Decompositions of Graphs   572
  *10.5 Constructing a Tree Decomposition   584
   Solved Exercises   591
   Exercises   594
   Notes and Further Reading   598

11 Approximation Algorithms   599
   11.1 Greedy Algorithms and Bounds on the Optimum: A Load Balancing Problem   600
   11.2 The Center Selection Problem   606
   11.3 Set Cover: A General Greedy Heuristic   612
   11.4 The Pricing Method: Vertex Cover   618
   11.5 Maximization via the Pricing Method: The Disjoint Paths Problem   624
   11.6 Linear Programming and Rounding: An Application to Vertex Cover   630
  *11.7 Load Balancing Revisited: A More Advanced LP Application   637
   11.8 Arbitrarily Good Approximations: The Knapsack Problem   644
   Solved Exercises   649
   Exercises   651
   Notes and Further Reading   659

12 Local Search   661
   12.1 The Landscape of an Optimization Problem   662
   12.2 The Metropolis Algorithm and Simulated Annealing   666
   12.3 An Application of Local Search to Hopfield Neural Networks   671
   12.4 Maximum-Cut Approximation via Local Search   676
   12.5 Choosing a Neighbor Relation   679
  *12.6 Classification via Local Search   681
   12.7 Best-Response Dynamics and Nash Equilibria   690
   Solved Exercises   700
   Exercises   702
   Notes and Further Reading   705

13 Randomized Algorithms   707
   13.1 A First Application: Contention Resolution   708
   13.2 Finding the Global Minimum Cut   714
   13.3 Random Variables and Their Expectations   719
   13.4 A Randomized Approximation Algorithm for MAX 3-SAT   724
   13.5 Randomized Divide and Conquer: Median-Finding and Quicksort   727
   13.6 Hashing: A Randomized Implementation of Dictionaries   734
   13.7 Finding the Closest Pair of Points: A Randomized Approach   741
   13.8 Randomized Caching   750
   13.9 Chernoff Bounds   758
   13.10 Load Balancing   760
   13.11 Packet Routing   762
   13.12 Background: Some Basic Probability Definitions   769
   Solved Exercises   776
   Exercises   782
   Notes and Further Reading   793

Epilogue: Algorithms That Run Forever   795

References   805

Index   815

* The star indicates an optional section. (See the Preface for more information about the relationships among the chapters and sections.)


Preface

Algorithmic ideas are pervasive, and their reach is apparent in examples both within computer science and beyond. Some of the major shifts in Internet routing standards can be viewed as debates over the deficiencies of one shortest-path algorithm and the relative advantages of another. The basic notions used by biologists to express similarities among genes and genomes have algorithmic definitions. The concerns voiced by economists over the feasibility of combinatorial auctions in practice are rooted partly in the fact that these auctions contain computationally intractable search problems as special cases. And algorithmic notions aren't just restricted to well-known and long-standing problems; one sees the reflections of these ideas on a regular basis, in novel issues arising across a wide range of areas. The scientist from Yahoo! who told us over lunch one day about their system for serving ads to users was describing a set of issues that, deep down, could be modeled as a network flow problem. So was the former student, now a management consultant working on staffing protocols for large hospitals, whom we happened to meet on a trip to New York City.

The point is not simply that algorithms have many applications. The deeper issue is that the subject of algorithms is a powerful lens through which to view the field of computer science in general. Algorithmic problems form the heart of computer science, but they rarely arrive as cleanly packaged, mathematically precise questions. Rather, they tend to come bundled together with lots of messy, application-specific detail, some of it essential, some of it extraneous. As a result, the algorithmic enterprise consists of two fundamental components: the task of getting to the mathematically clean core of a problem, and then the task of identifying the appropriate algorithm design techniques, based on the structure of the problem. These two components interact: the more comfortable one is with the full array of possible design techniques, the more one starts to recognize the clean formulations that lie within messy problems out in the world. At their most effective, then, algorithmic ideas do not just provide solutions to well-posed problems; they form the language that lets you cleanly express the underlying questions.

The goal of our book is to convey this approach to algorithms, as a design process that begins with problems arising across the full range of computing applications, builds on an understanding of algorithm design techniques, and results in the development of efficient solutions to these problems. We seek to explore the role of algorithmic ideas in computer science generally, and relate these ideas to the range of precisely formulated problems for which we can design and analyze algorithms. In other words, what are the underlying issues that motivate these problems, and how did we choose these particular ways of formulating them? How did we recognize which design principles were appropriate in different situations?

In keeping with this, our goal is to offer advice on how to identify clean algorithmic problem formulations in complex issues from different areas of computing and, from this, how to design efficient algorithms for the resulting problems. Sophisticated algorithms are often best understood by reconstructing the sequence of ideas, including false starts and dead ends, that led from simpler initial approaches to the eventual solution. The result is a style of exposition that does not take the most direct route from problem statement to algorithm, but we feel it better reflects the way that we and our colleagues genuinely think about these questions.

Overview

The book is intended for students who have completed a programming-based two-semester introductory computer science sequence (the standard "CS1/CS2" courses) in which they have written programs that implement basic algorithms, manipulate discrete structures such as trees and graphs, and apply basic data structures such as arrays, lists, queues, and stacks. Since the interface between CS1/CS2 and a first algorithms course is not entirely standard, we begin the book with self-contained coverage of topics that at some institutions are familiar to students from CS1/CS2, but which at other institutions are included in the syllabi of the first algorithms course. This material can thus be treated either as a review or as new material; by including it, we hope the book can be used in a broader array of courses, and with more flexibility in the prerequisite knowledge that is assumed.

In keeping with the approach outlined above, we develop the basic algorithm design techniques by drawing on problems from across many areas of computer science and related fields. To mention a few representative examples here, we include fairly detailed discussions of applications from systems and networks (caching, switching, interdomain routing on the Internet), artificial intelligence (planning, game playing, Hopfield networks), computer vision (image segmentation), data mining (change-point detection, clustering), operations research (airline scheduling), and computational biology (sequence alignment, RNA secondary structure).

The notion of computational intractability, and NP-completeness in particular, plays a large role in the book. This is consistent with how we think about the overall process of algorithm design. Some of the time, an interesting problem arising in an application area will be amenable to an efficient solution, and some of the time it will be provably NP-complete; in order to fully address a new algorithmic problem, one should be able to explore both of these options with equal familiarity. Since so many natural problems in computer science are NP-complete, the development of methods to deal with intractable problems has become a crucial issue in the study of algorithms, and our book heavily reflects this theme. The discovery that a problem is NP-complete should not be taken as the end of the story, but as an invitation to begin looking for approximation algorithms, heuristic local search techniques, or tractable special cases. We include extensive coverage of each of these three approaches.

Problems and Solved Exercises

An important feature of the book is the collection of problems. Across all chapters, the book includes over 200 problems, almost all of them developed and class-tested in homework or exams as part of our teaching of the course at Cornell. We view the problems as a crucial component of the book, and they are structured in keeping with our overall approach to the material. Most of them consist of extended verbal descriptions of a problem arising in an application area in computer science or elsewhere out in the world, and part of the problem is to practice what we discuss in the text: setting up the necessary notation and formalization, designing an algorithm, and then analyzing it and proving it correct. (We view a complete answer to one of these problems as consisting of all these components: a fully explained algorithm, an analysis of the running time, and a proof of correctness.) The ideas for these problems come in large part from discussions we have had over the years with people working in different areas, and in some cases they serve the dual purpose of recording an interesting (though manageable) application of algorithms that we haven't seen written down anywhere else.

To help with the process of working on these problems, we include in each chapter a section entitled "Solved Exercises," where we take one or more problems and describe how to go about formulating a solution. The discussion devoted to each solved exercise is therefore significantly longer than what would be needed simply to write a complete, correct solution (in other words, significantly longer than what it would take to receive full credit if these were being assigned as homework problems). Rather, as with the rest of the text, the discussions in these sections should be viewed as trying to give a sense of the larger process by which one might think about problems of this type, culminating in the specification of a precise solution.

It is worth mentioning two points concerning the use of these problems as homework in a course. First, the problems are sequenced roughly in order of increasing difficulty, but this is only an approximate guide and we advise against placing too much weight on it: since the bulk of the problems were designed as homework for our undergraduate class, large subsets of the problems in each chapter are really closely comparable in terms of difficulty. Second, aside from the lowest-numbered ones, the problems are designed to involve some investment of time, both to relate the problem description to the algorithmic techniques in the chapter, and then to actually design the necessary algorithm. In our undergraduate class, we have tended to assign roughly three of these problems per week.

Pedagogical Features and Supplements

In addition to the problems and solved exercises, the book has a number of further pedagogical features, as well as additional supplements to facilitate its use for teaching.

As noted earlier, a large number of the sections in the book are devoted to the formulation of an algorithmic problem (including its background and underlying motivation) and the design and analysis of an algorithm for this problem. To reflect this style, these sections are consistently structured around a sequence of subsections: "The Problem," where the problem is described and a precise formulation is worked out; "Designing the Algorithm," where the appropriate design technique is employed to develop an algorithm; and "Analyzing the Algorithm," which proves properties of the algorithm and analyzes its efficiency. These subsections are highlighted in the text with an icon depicting a feather. In cases where extensions to the problem or further analysis of the algorithm is pursued, there are additional subsections devoted to these issues. The goal of this structure is to offer a relatively uniform style of presentation that moves from the initial discussion of a problem arising in a computing application through to the detailed analysis of a method to solve it.

A number of supplements are available in support of the book itself. An instructor's manual works through all the problems, providing full solutions to each. A set of lecture slides, developed by Kevin Wayne of Princeton University, is also available; these slides follow the order of the book's sections and can thus be used as the foundation for lectures in a course based on the book. These files are available at www.aw.com. For instructions on obtaining a professor login and password, search the site for either "Kleinberg" or "Tardos" or contact your local Addison-Wesley representative.

Finally, we would appreciate receiving feedback on the book. In particular, as in any book of this length, there are undoubtedly errors that have remained in the final version. Comments and reports of errors can be sent to us by e-mail, at the address algbook@cs.cornell.edu; please include the word "feedback" in the subject line of the message.

Chapter-by-Chapter Synopsis

Chapter 1 starts by introducing some representative algorithmic problems. We begin immediately with the Stable Matching Problem, since we feel it sets up the basic issues in algorithm design more concretely and more elegantly than any abstract discussion could: stable matching is motivated by a natural though complex real-world issue, from which one can abstract an interesting problem statement and a surprisingly effective algorithm to solve this problem. The remainder of Chapter 1 discusses a list of five "representative problems" that foreshadow topics from the remainder of the course. These five problems are interrelated in the sense that they are all variations and/or special cases of the Independent Set Problem; but one is solvable by a greedy algorithm, one by dynamic programming, one by network flow, one (the Independent Set Problem itself) is NP-complete, and one is PSPACE-complete. The fact that closely related problems can vary greatly in complexity is an important theme of the book, and these five problems serve as milestones that reappear as the book progresses.

Chapters 2 and 3 cover the interface to the CS1/CS2 course sequence mentioned earlier. Chapter 2 introduces the key mathematical definitions and notations used for analyzing algorithms, as well as the motivating principles behind them. It begins with an informal overview of what it means for a problem to be computationally tractable, together with the concept of polynomial time as a formal notion of efficiency. It then discusses growth rates of functions and asymptotic analysis more formally, and offers a guide to commonly occurring functions in algorithm analysis, together with standard applications in which they arise. Chapter 3 covers the basic definitions and algorithmic primitives needed for working with graphs, which are central to so many of the problems in the book. A number of basic graph algorithms are often implemented by students late in the CS1/CS2 course sequence, but it is valuable to present the material here in a broader algorithm design context. In particular, we discuss basic graph definitions, graph traversal techniques such as breadth-first search and depth-first search, and directed graph concepts including strong connectivity and topological ordering.


Chapters 2 and 3 also present many of the basic data structures that will be used for implementing algorithms throughout the book; more advanced data structures are presented in subsequent chapters. Our approach to data structures is to introduce them as they are needed for the implementation of the algorithms being developed in the book. Thus, although many of the data structures covered here will be familiar to students from the CS1/CS2 sequence, our focus is on these data structures in the broader context of algorithm design and analysis.

Chapters 4 through 7 cover four major algorithm design techniques: greedy algorithms, divide and conquer, dynamic programming, and network flow. With greedy algorithms, the challenge is to recognize when they work and when they don't; our coverage of this topic is centered around a way of classifying the kinds of arguments used to prove greedy algorithms correct. This chapter concludes with some of the main applications of greedy algorithms, for shortest paths, undirected and directed spanning trees, clustering, and compression. For divide and conquer, we begin with a discussion of strategies for solving recurrence relations as bounds on running times; we then show how familiarity with these recurrences can guide the design of algorithms that improve over straightforward approaches to a number of basic problems, including the comparison of rankings, the computation of closest pairs of points in the plane, and the Fast Fourier Transform. Next we develop dynamic programming by starting with the recursive intuition behind it, and subsequently building up more and more expressive recurrence formulations through applications in which they naturally arise. This chapter concludes with extended discussions of the dynamic programming approach to two fundamental problems: sequence alignment, with applications in computational biology; and shortest paths in graphs, with connections to Internet routing protocols. Finally, we cover algorithms for network flow problems, devoting much of our focus in this chapter to discussing a large array of different flow applications. To the extent that network flow is covered in algorithms courses, students are often left without an appreciation for the wide range of problems to which it can be applied; we try to do justice to its versatility by presenting applications to load balancing, scheduling, image segmentation, and a number of other problems.

Chapters 8 and 9 cover computational intractability. We devote most of our attention to NP-completeness, organizing the basic NP-complete problems thematically to help students recognize candidates for reductions when they encounter new problems. We build up to some fairly complex proofs of NP-completeness, with guidance on how one goes about constructing a difficult reduction. We also consider types of computational hardness beyond NP-completeness, particularly through the topic of PSPACE-completeness. We find this is a valuable way to emphasize that intractability doesn't end at NP-completeness, and PSPACE-completeness also forms the underpinning for some central notions from artificial intelligence (planning and game playing) that would otherwise not find a place in the algorithmic landscape we are surveying.

Chapters 10 through 12 cover three major techniques for dealing with computationally intractable problems: identification of structured special cases, approximation algorithms, and local search heuristics. Our chapter on tractable special cases emphasizes that instances of NP-complete problems arising in practice may not be nearly as hard as worst-case instances, because they often contain some structure that can be exploited in the design of an efficient algorithm. We illustrate how NP-complete problems are often efficiently solvable when restricted to tree-structured inputs, and we conclude with an extended discussion of tree decompositions of graphs. While this topic is more suitable for a graduate course than for an undergraduate one, it is a technique with considerable practical utility for which it is hard to find an existing accessible reference for students. Our chapter on approximation algorithms discusses both the process of designing effective algorithms and the task of understanding the optimal solution well enough to obtain good bounds on it. As design techniques for approximation algorithms, we focus on greedy algorithms, linear programming, and a third method we refer to as "pricing," which incorporates ideas from each of the first two. Finally, we discuss local search heuristics, including the Metropolis algorithm and simulated annealing. This topic is often missing from undergraduate algorithms courses, because very little is known in the way of provable guarantees for these algorithms; however, given their widespread use in practice, we feel it is valuable for students to know something about them, and we also include some cases in which guarantees can be proved.

Chapter 13 covers the use of randomization in the design of algorithms. This is a topic on which several nice graduate-level books have been written. Our goal here is to provide a more compact introduction to some of the ways in which students can apply randomized techniques using the kind of background in probability one typically gains from an undergraduate discrete math course.

Use of the Book

The book is primarily designed for use in a first undergraduate course on algorithms, but it can also be used as the basis for an introductory graduate course.

When we use the book at the undergraduate level, we spend roughly one lecture per numbered section; in cases where there is more than one lecture's worth of material in a section (for example, when a section provides further applications as additional examples), we treat this extra material as a supplement that students can read about outside of lecture. We skip the starred sections; while these sections contain important topics, they are less central to the development of the subject, and in some cases they are harder as well. We also tend to skip one or two other sections per chapter in the first half of the book (for example, we tend to skip Sections 4.3, 4.7-4.8, 5.5-5.6, 6.5, 7.6, and 7.11). We cover roughly half of each of Chapters 11-13.

This last point is worth emphasizing: rather than viewing the later chapters as "advanced," and hence off-limits to undergraduate algorithms courses, we have designed them with the goal that the first few sections of each should be accessible to an undergraduate audience. Our own undergraduate course involves material from all these chapters, as we feel that all of these topics have an important place at the undergraduate level.

Finally, we treat Chapters 2 and 3 primarily as a review of material from earlier courses; but, as discussed above, the use of these two chapters depends heavily on the relationship of each specific course to its prerequisites.

The resulting syllabus looks roughly as follows: Chapter 1; Chapters 4Ã¢â‚¬â€œ8

(excluding 4.3, 4.7Ã¢â‚¬â€œ4.9, 5.5Ã¢â‚¬â€œ5.6, 6.5, 6.10, 7.4, 7.6, 7.11, and 7.13); Chapter 9

(brieÃ¯Â¬â€šy); Chapter 10, Sections.10.1 and 10.2; Chapter 11, Sections 11.1, 11.2,

11.6, and 11.8; Chapter 12, Sections 12.1Ã¢â‚¬â€œ12.3; and Chapter 13, Sections 13.1Ã¢â‚¬â€œ

13.5.

The book also naturally supports an introductory graduate course on algorithms. Our view of such a course is that it should introduce students destined for research in all different areas to the important current themes in algorithm design. Here we find the emphasis on formulating problems to be useful as well, since students will soon be trying to define their own research problems in many different subfields. For this type of course, we cover the later topics in Chapters 4 and 6 (Sections 4.5–4.9 and 6.5–6.10), cover all of Chapter 7 (moving more rapidly through the early sections), quickly cover NP-completeness in Chapter 8 (since many beginning graduate students will have seen this topic as undergraduates), and then spend the remainder of the time on Chapters 10–13. Although our focus in an introductory graduate course is on the more advanced sections, we find it useful for the students to have the full book to consult for reviewing or filling in background knowledge, given the range of different undergraduate backgrounds among the students in such a course.

Finally, the book can be used to support self-study by graduate students, researchers, or computer professionals who want to get a sense for how they might be able to use particular algorithm design techniques in the context of their own work. A number of graduate students and colleagues have used portions of the book in this way.

Acknowledgments

This book grew out of the sequence of algorithms courses that we have taught at Cornell. These courses have grown, as the field has grown, over a number of years, and they reflect the influence of the Cornell faculty who helped to shape them during this time, including Juris Hartmanis, Monika Henzinger, John Hopcroft, Dexter Kozen, Ronitt Rubinfeld, and Sam Toueg. More generally, we would like to thank all our colleagues at Cornell for countless discussions both on the material here and on broader issues about the nature of the field.

The course staffs we've had in teaching the subject have been tremendously helpful in the formulation of this material. We thank our undergraduate and graduate teaching assistants, Siddharth Alexander, Rie Ando, Elliot Anshelevich, Lars Backstrom, Steve Baker, Ralph Benzinger, John Bicket, Doug Burdick, Mike Connor, Vladimir Dizhoor, Shaddin Doghmi, Alexander Druyan, Bowei Du, Sasha Evfimievski, Ariful Gani, Vadim Grinshpun, Ara Hayrapetyan, Chris Jeuell, Igor Kats, Omar Khan, Mikhail Kobyakov, Alexei Kopylov, Brian Kulis, Amit Kumar, Yeongwee Lee, Henry Lin, Ashwin Machanavajjhala, Ayan Mandal, Bill McCloskey, Leonid Meyerguz, Evan Moran, Niranjan Nagarajan, Tina Nolte, Travis Ortogero, Martin Pál, Jon Peress, Matt Piotrowski, Joe Polastre, Mike Priscott, Xin Qi, Venu Ramasubramanian, Aditya Rao, David Richardson, Brian Sabino, Rachit Siamwalla, Sebastian Silgardo, Alex Slivkins, Chaitanya Swamy, Perry Tam, Nadya Travinin, Sergei Vassilvitskii, Matthew Wachs, Tom Wexler, Shan-Leung Maverick Woo, Justin Yang, and Misha Zatsman. Many of them have provided valuable insights, suggestions, and comments on the text. We also thank all the students in these classes who have provided comments and feedback on early drafts of the book over the years.

For the past several years, the development of the book has benefited greatly from the feedback and advice of colleagues who have used prepublication drafts for teaching. Anna Karlin fearlessly adopted a draft as her course textbook at the University of Washington when it was still in an early stage of development; she was followed by a number of people who have used it either as a course textbook or as a resource for teaching: Paul Beame, Allan Borodin, Devdatt Dubhashi, David Kempe, Gene Kleinberg, Dexter Kozen, Amit Kumar, Mike Molloy, Yuval Rabani, Tim Roughgarden, Alexa Sharp, Shanghua Teng, Aravind Srinivasan, Dieter van Melkebeek, Kevin Wayne, Tom Wexler, and Sue Whitesides. We deeply appreciate their input and advice, which has informed many of our revisions to the content. We would like to additionally thank Kevin Wayne for producing supplementary material associated with the book, which promises to greatly extend its utility to future instructors.

In a number of other cases, our approach to particular topics in the book reflects the influence of specific colleagues. Many of these contributions have undoubtedly escaped our notice, but we especially thank Yuri Boykov, Ron Elber, Dan Huttenlocher, Bobby Kleinberg, Evie Kleinberg, Lillian Lee, David McAllester, Mark Newman, Prabhakar Raghavan, Bart Selman, David Shmoys, Steve Strogatz, Olga Veksler, Duncan Watts, and Ramin Zabih.

It has been a pleasure working with Addison Wesley over the past year. First and foremost, we thank Matt Goldstein for all his advice and guidance in this process, and for helping us to synthesize a vast amount of review material into a concrete plan that improved the book. Our early conversations about the book with Susan Hartman were extremely valuable as well. We thank Matt and Susan, together with Michelle Brown, Marilyn Lloyd, Patty Mahtani, and Maite Suarez-Rivas at Addison Wesley, and Paul Anagnostopoulos and Jacqui Scarlott at Windfall Software, for all their work on the editing, production, and management of the project. We further thank Paul and Jacqui for their expert composition of the book. We thank Joyce Wells for the cover design, Nancy Murphy of Dartmouth Publishing for her work on the figures, Ted Laux for the indexing, and Carol Leyba and Jennifer McClain for the copyediting and proofreading.

We thank Anselm Blumer (Tufts University), Richard Chang (University of Maryland, Baltimore County), Kevin Compton (University of Michigan), Diane Cook (University of Texas, Arlington), Sariel Har-Peled (University of Illinois, Urbana-Champaign), Sanjeev Khanna (University of Pennsylvania), Philip Klein (Brown University), David Matthias (Ohio State University), Adam Meyerson (UCLA), Michael Mitzenmacher (Harvard University), Stephan Olariu (Old Dominion University), Mohan Paturi (UC San Diego), Edgar Ramos (University of Illinois, Urbana-Champaign), Sanjay Ranka (University of Florida, Gainesville), Leon Reznik (Rochester Institute of Technology), Subhash Suri (UC Santa Barbara), Dieter van Melkebeek (University of Wisconsin, Madison), and Bulent Yener (Rensselaer Polytechnic Institute), who generously contributed their time to provide detailed and thoughtful reviews of the manuscript; their comments led to numerous improvements, both large and small, in the final version of the text.

Finally, we thank our families: Lillian and Alice, and David, Rebecca, and Amy. We appreciate their support, patience, and many other contributions more than we can express in any acknowledgments here.

This book was begun amid the irrational exuberance of the late nineties, when the arc of computing technology seemed, to many of us, briefly to pass through a place traditionally occupied by celebrities and other inhabitants of the pop-cultural firmament. (It was probably just in our imaginations.) Now, several years after the hype and stock prices have come back to earth, one can appreciate that in some ways computer science was forever changed by this period, and in other ways it has remained the same: the driving excitement that has characterized the field since its early days is as strong and enticing as ever, the public's fascination with information technology is still vibrant, and the reach of computing continues to extend into new disciplines. And so to all students of the subject, drawn to it for so many different reasons, we hope you find this book an enjoyable and useful guide wherever your computational pursuits may take you.

Jon Kleinberg
Éva Tardos

Ithaca, 2005

Chapter 1
Introduction: Some Representative Problems

1.1 A First Problem: Stable Matching

As an opening topic, we look at an algorithmic problem that nicely illustrates many of the themes we will be emphasizing. It is motivated by some very natural and practical concerns, and from these we formulate a clean and simple statement of a problem. The algorithm to solve the problem is very clean as well, and most of our work will be spent in proving that it is correct and giving an acceptable bound on the amount of time it takes to terminate with an answer. The problem itself, the Stable Matching Problem, has several origins.

The Problem

The Stable Matching Problem originated, in part, in 1962, when David Gale and Lloyd Shapley, two mathematical economists, asked the question: Could one design a college admissions process, or a job recruiting process, that was self-enforcing? What did they mean by this?

To set up the question, let's first think informally about the kind of situation that might arise as a group of friends, all juniors in college majoring in computer science, begin applying to companies for summer internships. The crux of the application process is the interplay between two different types of parties: companies (the employers) and students (the applicants). Each applicant has a preference ordering on companies, and each company, once the applications come in, forms a preference ordering on its applicants. Based on these preferences, companies extend offers to some of their applicants, applicants choose which of their offers to accept, and people begin heading off to their summer internships.

Gale and Shapley considered the sorts of things that could start going wrong with this process, in the absence of any mechanism to enforce the status quo. Suppose, for example, that your friend Raj has just accepted a summer job at the large telecommunications company CluNet. A few days later, the small start-up company WebExodus, which had been dragging its feet on making a few final decisions, calls up Raj and offers him a summer job as well. Now, Raj actually prefers WebExodus to CluNet, won over perhaps by the laid-back, anything-can-happen atmosphere, and so this new development may well cause him to retract his acceptance of the CluNet offer and go to WebExodus instead. Suddenly down one summer intern, CluNet offers a job to one of its wait-listed applicants, who promptly retracts his previous acceptance of an offer from the software giant Babelsoft, and the situation begins to spiral out of control.

Things look just as bad, if not worse, from the other direction. Suppose that Raj's friend Chelsea, destined to go to Babelsoft but having just heard Raj's story, calls up the people at WebExodus and says, "You know, I'd really rather spend the summer with you guys than at Babelsoft." They find this very easy to believe; and furthermore, on looking at Chelsea's application, they realize that they would have rather hired her than some other student who actually is scheduled to spend the summer at WebExodus. In this case, if WebExodus were a slightly less scrupulous company, it might well find some way to retract its offer to this other student and hire Chelsea instead.

Situations like this can rapidly generate a lot of chaos, and many people, both applicants and employers, can end up unhappy with the process as well as the outcome. What has gone wrong? One basic problem is that the process is not self-enforcing: if people are allowed to act in their self-interest, then it risks breaking down.

We might well prefer the following, more stable situation, in which self-interest itself prevents offers from being retracted and redirected. Consider another student, who has arranged to spend the summer at CluNet but calls up WebExodus and reveals that he, too, would rather work for them. But in this case, based on the offers already accepted, they are able to reply, "No, it turns out that we prefer each of the students we've accepted to you, so we're afraid there's nothing we can do." Or consider an employer, earnestly following up with its top applicants who went elsewhere, being told by each of them, "No, I'm happy where I am." In such a case, all the outcomes are stable: there are no further outside deals that can be made.

So this is the question Gale and Shapley asked: Given a set of preferences among employers and applicants, can we assign applicants to employers so that for every employer E, and every applicant A who is not scheduled to work for E, at least one of the following two things is the case?

(i) E prefers every one of its accepted applicants to A; or
(ii) A prefers her current situation over working for employer E.

If this holds, the outcome is stable: individual self-interest will prevent any applicant/employer deal from being made behind the scenes.

Gale and Shapley proceeded to develop a striking algorithmic solution to this problem, which we will discuss presently. Before doing this, let's note that this is not the only origin of the Stable Matching Problem. It turns out that for a decade before the work of Gale and Shapley, unbeknownst to them, the National Resident Matching Program had been using a very similar procedure, with the same underlying motivation, to match residents to hospitals. Indeed, this system, with relatively little change, is still in use today.

This is one testament to the problem's fundamental appeal. And from the point of view of this book, it provides us with a nice first domain in which to reason about some basic combinatorial definitions and the algorithms that build on them.

Formulating the Problem To get at the essence of this concept, it helps to make the problem as clean as possible. The world of companies and applicants contains some distracting asymmetries. Each applicant is looking for a single company, but each company is looking for many applicants; moreover, there may be more (or, as is sometimes the case, fewer) applicants than there are available slots for summer jobs. Finally, each applicant does not typically apply to every company.

It is useful, at least initially, to eliminate these complications and arrive at a more "bare-bones" version of the problem: each of n applicants applies to each of n companies, and each company wants to accept a single applicant. We will see that doing this preserves the fundamental issues inherent in the problem; in particular, our solution to this simplified version will extend directly to the more general case as well.

Following Gale and Shapley, we observe that this special case can be viewed as the problem of devising a system by which each of n men and n women can end up getting married: our problem naturally has the analogue of two "genders" (the applicants and the companies), and in the case we are considering, everyone is seeking to be paired with exactly one individual of the opposite gender.¹

¹ Gale and Shapley considered the same-sex Stable Matching Problem as well, where there is only a single gender. This is motivated by related applications, but it turns out to be fairly different at a technical level. Given the applicant-employer application we're considering here, we'll be focusing on the version with two genders.

So consider a set M = {m1, . . . , mn} of n men, and a set W = {w1, . . . , wn} of n women. Let M × W denote the set of all possible ordered pairs of the form (m, w), where m ∈ M and w ∈ W. A matching S is a set of ordered pairs, each from M × W, with the property that each member of M and each member of W appears in at most one pair in S. A perfect matching S′ is a matching with the property that each member of M and each member of W appears in exactly one pair in S′.

[Figure 1.1 Perfect matching S with instability (m, w′): m and w′ each prefer the other to their current partners.]

Matchings and perfect matchings are objects that will recur frequently throughout the book; they arise naturally in modeling a wide range of algorithmic problems. In the present situation, a perfect matching corresponds simply to a way of pairing off the men with the women, in such a way that everyone ends up married to somebody, and nobody is married to more than one person: there is neither singlehood nor polygamy.

Now we can add the notion of preferences to this setting. Each man m ∈ M ranks all the women; we will say that m prefers w to w′ if m ranks w higher than w′. We will refer to the ordered ranking of m as his preference list. We will not allow ties in the ranking. Each woman, analogously, ranks all the men.

Given a perfect matching S, what can go wrong? Guided by our initial motivation in terms of employers and applicants, we should be worried about the following situation: There are two pairs (m, w) and (m′, w′) in S (as depicted in Figure 1.1) with the property that m prefers w′ to w, and w′ prefers m to m′. In this case, there's nothing to stop m and w′ from abandoning their current partners and heading off together; the set of marriages is not self-enforcing. We'll say that such a pair (m, w′) is an instability with respect to S: (m, w′) does not belong to S, but each of m and w′ prefers the other to their partner in S.

Our goal, then, is a set of marriages with no instabilities. We'll say that a matching S is stable if (i) it is perfect, and (ii) there is no instability with respect to S. Two questions spring immediately to mind:

- Does there exist a stable matching for every set of preference lists?
- Given a set of preference lists, can we efficiently construct a stable matching if there is one?
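To make these definitions concrete, here is a short sketch of a stability check. The function name `find_instability` and the dictionary encoding of preference lists are our own illustrative choices, not notation from the text.

```python
def find_instability(matching, men_prefs, women_prefs):
    """Return a pair (m, w) forming an instability with respect to the
    perfect matching, or None if the matching is stable.

    matching:    dict mapping each man to his matched woman.
    men_prefs,
    women_prefs: dicts mapping each person to a preference list,
                 most preferred first (no ties allowed).
    """
    # rank[w][m] = position of m on w's list (lower = more preferred)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    husband = {w: m for m, w in matching.items()}
    for m, w in matching.items():
        # Every woman m ranks strictly above his current partner w ...
        for better_w in men_prefs[m][:men_prefs[m].index(w)]:
            # ... forms an instability if she also prefers m to her partner.
            if rank[better_w][m] < rank[better_w][husband[better_w]]:
                return (m, better_w)
    return None
```

For instance, if both men rank w1 first and both women rank m1 first, the matching {m1: w1, m2: w2} passes the check, while {m1: w2, m2: w1} yields the instability (m1, w1).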

Some Examples To illustrate these definitions, consider the following two very simple instances of the Stable Matching Problem.

First, suppose we have a set of two men, {m, m′}, and a set of two women, {w, w′}. The preference lists are as follows:

m prefers w to w′.
m′ prefers w to w′.
w prefers m to m′.
w′ prefers m to m′.

If we think about this set of preference lists intuitively, it represents complete agreement: the men agree on the order of the women, and the women agree on the order of the men. There is a unique stable matching here, consisting of the pairs (m, w) and (m′, w′). The other perfect matching, consisting of the pairs (m′, w) and (m, w′), would not be a stable matching, because the pair (m, w) would form an instability with respect to this matching. (Both m and w would want to leave their respective partners and pair up.)

Next, here's an example where things are a bit more intricate. Suppose the preferences are

m prefers w to w′.
m′ prefers w′ to w.
w prefers m′ to m.
w′ prefers m to m′.

What's going on in this case? The two men's preferences mesh perfectly with each other (they rank different women first), and the two women's preferences likewise mesh perfectly with each other. But the men's preferences clash completely with the women's preferences.

In this second example, there are two different stable matchings. The matching consisting of the pairs (m, w) and (m′, w′) is stable, because both men are as happy as possible, so neither would leave their matched partner. But the matching consisting of the pairs (m′, w) and (m, w′) is also stable, for the complementary reason that both women are as happy as possible. This is an important point to remember as we go forward: it's possible for an instance to have more than one stable matching.

Designing the Algorithm

We now show that there exists a stable matching for every set of preference lists among the men and women. Moreover, our means of showing this will also answer the second question that we asked above: we will give an efficient algorithm that takes the preference lists and constructs a stable matching.

Let us consider some of the basic ideas that motivate the algorithm.

- Initially, everyone is unmarried. Suppose an unmarried man m chooses the woman w who ranks highest on his preference list and proposes to her. Can we declare immediately that (m, w) will be one of the pairs in our final stable matching? Not necessarily: at some point in the future, a man m′ whom w prefers may propose to her. On the other hand, it would be dangerous for w to reject m right away; she may never receive a proposal from someone she ranks as highly as m. So a natural idea would be to have the pair (m, w) enter an intermediate state: engagement.

[Figure 1.2 An intermediate state of the G-S algorithm when a free man m is proposing to a woman w; w will become engaged to m if she prefers him to m′.]

- Suppose we are now at a state in which some men and women are free (not engaged) and some are engaged. The next step could look like this. An arbitrary free man m chooses the highest-ranked woman w to whom he has not yet proposed, and he proposes to her. If w is also free, then m and w become engaged. Otherwise, w is already engaged to some other man m′. In this case, she determines which of m or m′ ranks higher on her preference list; this man becomes engaged to w and the other becomes free.

- Finally, the algorithm will terminate when no one is free; at this moment, all engagements are declared final, and the resulting perfect matching is returned.

Here is a concrete description of the Gale-Shapley algorithm, with Figure 1.2 depicting a state of the algorithm.

    Initially all m ∈ M and w ∈ W are free
    While there is a man m who is free and hasn't proposed to every woman
        Choose such a man m
        Let w be the highest-ranked woman in m's preference list
            to whom m has not yet proposed
        If w is free then
            (m, w) become engaged
        Else w is currently engaged to m′
            If w prefers m′ to m then
                m remains free
            Else w prefers m to m′
                (m, w) become engaged
                m′ becomes free
            Endif
        Endif
    Endwhile
    Return the set S of engaged pairs
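The pseudocode translates almost line for line into a runnable sketch, given one concrete choice for "choose such a man m" (here, a queue of free men). The function name `gale_shapley` and the encoding of people as indices 0..n-1 are our own illustrative choices.

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Men-proposing Gale-Shapley.

    men_prefs[m] and women_prefs[w] list the other side's indices,
    most preferred first.  Returns a list wife with wife[m] = w.
    """
    n = len(men_prefs)
    # rank[w][m] lets a woman compare two suitors in O(1).
    rank = [[0] * n for _ in range(n)]
    for w in range(n):
        for i, m in enumerate(women_prefs[w]):
            rank[w][m] = i
    next_proposal = [0] * n      # index into m's preference list
    wife = [None] * n
    husband = [None] * n
    free_men = deque(range(n))
    while free_men:
        m = free_men.popleft()   # any free man may propose next
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1    # m never proposes to w again
        if husband[w] is None:
            wife[m], husband[w] = w, m
        elif rank[w][m] < rank[w][husband[w]]:
            # w trades up; her old partner becomes free again.
            old = husband[w]
            wife[old] = None
            free_men.append(old)
            wife[m], husband[w] = w, m
        else:
            free_men.append(m)   # w rejects m; he stays free
    return wife
```

On the "complete agreement" example above (both men rank woman 0 first, both women rank man 0 first), `gale_shapley([[0, 1], [0, 1]], [[0, 1], [0, 1]])` matches man 0 with woman 0 and man 1 with woman 1, the unique stable matching.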

An intriguing thing is that, although the G-S algorithm is quite simple to state, it is not immediately obvious that it returns a stable matching, or even a perfect matching. We proceed to prove this now, through a sequence of intermediate facts.

Analyzing the Algorithm

First consider the view of a woman w during the execution of the algorithm. For a while, no one has proposed to her, and she is free. Then a man m may propose to her, and she becomes engaged. As time goes on, she may receive additional proposals, accepting those that increase the rank of her partner. So we discover the following.

(1.1) w remains engaged from the point at which she receives her first proposal; and the sequence of partners to which she is engaged gets better and better (in terms of her preference list).

The view of a man m during the execution of the algorithm is rather different. He is free until he proposes to the highest-ranked woman on his list; at this point he may or may not become engaged. As time goes on, he may alternate between being free and being engaged; however, the following property does hold.

(1.2) The sequence of women to whom m proposes gets worse and worse (in terms of his preference list).

Now we show that the algorithm terminates, and give a bound on the

maximum number of iterations needed for termination.

(1.3)

loop.

The G-S algorithm terminates after at most n2 iterations of the While

Proof. A useful strategy for upper-bounding the running time of an algorithm, as we are trying to do here, is to find a measure of progress. Namely, we seek some precise way of saying that each step taken by the algorithm brings it closer to termination.

In the case of the present algorithm, each iteration consists of some man proposing (for the only time) to a woman he has never proposed to before. So if we let P(t) denote the set of pairs (m, w) such that m has proposed to w by the end of iteration t, we see that for all t, the size of P(t + 1) is strictly greater than the size of P(t). But there are only n² possible pairs of men and women in total, so the value of P(·) can increase at most n² times over the course of the algorithm. It follows that there can be at most n² iterations.

Two points are worth noting about the previous fact and its proof. First, there are executions of the algorithm (with certain preference lists) that can involve close to n² iterations, so this analysis is not far from the best possible. Second, there are many quantities that would not have worked well as a progress measure for the algorithm, since they need not strictly increase in each iteration. For example, the number of free individuals could remain constant from one iteration to the next, as could the number of engaged pairs. Thus, these quantities could not be used directly in giving an upper bound on the maximum possible number of iterations, in the style of the previous paragraph.

Let us now establish that the set S returned at the termination of the algorithm is in fact a perfect matching. Why is this not immediately obvious? Essentially, we have to show that no man can "fall off" the end of his preference list; the only way for the While loop to exit is for there to be no free man. In this case, the set of engaged couples would indeed be a perfect matching. So the main thing we need to show is the following.

(1.4) If m is free at some point in the execution of the algorithm, then there is a woman to whom he has not yet proposed.

Proof. Suppose there comes a point when m is free but has already proposed to every woman. Then by (1.1), each of the n women is engaged at this point in time. Since the set of engaged pairs forms a matching, there must also be n engaged men at this point in time. But there are only n men total, and m is not engaged, so this is a contradiction.

(1.5) The set S returned at termination is a perfect matching.

Proof. The set of engaged pairs always forms a matching. Let us suppose that the algorithm terminates with a free man m. At termination, it must be the case that m had already proposed to every woman, for otherwise the While loop would not have exited. But this contradicts (1.4), which says that there cannot be a free man who has proposed to every woman.

Finally, we prove the main property of the algorithm, namely, that it results in a stable matching.

(1.6) Consider an execution of the G-S algorithm that returns a set of pairs S. The set S is a stable matching.

Proof. We have already seen, in (1.5), that S is a perfect matching. Thus, to prove S is a stable matching, we will assume that there is an instability with respect to S and obtain a contradiction. As defined earlier, such an instability would involve two pairs, (m, w) and (m′, w′), in S with the properties that

- m prefers w′ to w, and
- w′ prefers m to m′.

In the execution of the algorithm that produced S, m's last proposal was, by definition, to w. Now we ask: Did m propose to w′ at some earlier point in this execution? If he didn't, then w must occur higher on m's preference list than w′, contradicting our assumption that m prefers w′ to w. If he did, then he was rejected by w′ in favor of some other man m″, whom w′ prefers to m. m′ is the final partner of w′, so either m″ = m′ or, by (1.1), w′ prefers her final partner m′ to m″; either way this contradicts our assumption that w′ prefers m to m′.

It follows that S is a stable matching.

Extensions

We began by defining the notion of a stable matching; we have just proven that the G-S algorithm actually constructs one. We now consider some further questions about the behavior of the G-S algorithm and its relation to the properties of different stable matchings.

To begin with, recall that we saw an example earlier in which there could be multiple stable matchings. To recap, the preference lists in this example were as follows:

m prefers w to w′.
m′ prefers w′ to w.
w prefers m′ to m.
w′ prefers m to m′.

Now, in any execution of the Gale-Shapley algorithm, m will become engaged to w, m′ will become engaged to w′ (perhaps in the other order), and things will stop there. Thus, the other stable matching, consisting of the pairs (m′, w) and (m, w′), is not attainable from an execution of the G-S algorithm in which the men propose. On the other hand, it would be reached if we ran a version of the algorithm in which the women propose. And in larger examples, with more than two people on each side, we can have an even larger collection of possible stable matchings, many of them not achievable by any natural algorithm.

This example shows a certain "unfairness" in the G-S algorithm, favoring men. If the men's preferences mesh perfectly (they all list different women as their first choice), then in all runs of the G-S algorithm all men end up matched with their first choice, independent of the preferences of the women. If the women's preferences clash completely with the men's preferences (as was the case in this example), then the resulting stable matching is as bad as possible for the women. So this simple set of preference lists compactly summarizes a world in which someone is destined to end up unhappy: women are unhappy if men propose, and men are unhappy if women propose.
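On an instance this small, the unfairness can be seen by brute force. The following sketch (our own, not from the book) encodes the "clash" example with people as indices, enumerates both perfect matchings, and checks each for an instability.

```python
from itertools import permutations

# The clash instance, preference lists given best-first: each man
# ranks a different woman first, and each woman ranks the *other*
# man first.
men_prefs = [[0, 1], [1, 0]]
women_prefs = [[1, 0], [0, 1]]

def is_stable(wife):
    """wife[m] = woman matched to man m; check for an instability."""
    husband = {w: m for m, w in enumerate(wife)}
    for m, w in enumerate(wife):
        # Any woman m prefers to his current partner ...
        for bw in men_prefs[m][:men_prefs[m].index(w)]:
            # ... creates an instability if she prefers m back.
            if women_prefs[bw].index(m) < women_prefs[bw].index(husband[bw]):
                return False
    return True

stable = [wife for wife in permutations(range(2)) if is_stable(wife)]
# Both perfect matchings survive: (0, 1) gives each man his first
# choice, while (1, 0) gives each woman hers.
```

A men-proposing run of G-S produces the first of these matchings (each man proposes to, and keeps, his first choice); swapping the roles of proposers and receivers produces the second.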

Let's now analyze the G-S algorithm in more detail and try to understand how general this "unfairness" phenomenon is.


To begin with, our example reinforces the point that the G-S algorithm is actually underspecified: as long as there is a free man, we are allowed to choose any free man to make the next proposal. Different choices specify different executions of the algorithm; this is why, to be careful, we stated (1.6) as "Consider an execution of the G-S algorithm that returns a set of pairs S," instead of "Consider the set S returned by the G-S algorithm."

Thus, we encounter another very natural question: Do all executions of the G-S algorithm yield the same matching? This is a genre of question that arises in many settings in computer science: we have an algorithm that runs asynchronously, with different independent components performing actions that can be interleaved in complex ways, and we want to know how much variability this asynchrony causes in the final outcome. To consider a very different kind of example, the independent components may not be men and women but electronic components activating parts of an airplane wing; the effect of asynchrony in their behavior can be a big deal.

In the present context, we will see that the answer to our question is surprisingly clean: all executions of the G-S algorithm yield the same matching. We proceed to prove this now.

All Executions Yield the Same Matching  There are a number of possible ways to prove a statement such as this, many of which would result in quite complicated arguments. It turns out that the easiest and most informative approach for us will be to uniquely characterize the matching that is obtained and then show that all executions result in the matching with this characterization.

What is the characterization? We'll show that each man ends up with the "best possible partner" in a concrete sense. (Recall that this is true if all men prefer different women.) First, we will say that a woman w is a valid partner of a man m if there is a stable matching that contains the pair (m, w). We will say that w is the best valid partner of m if w is a valid partner of m, and no woman whom m ranks higher than w is a valid partner of his. We will use best(m) to denote the best valid partner of m.

Now, let S* denote the set of pairs {(m, best(m)) : m ∈ M}. We will prove the following fact.

(1.7) Every execution of the G-S algorithm results in the set S*.

This statement is surprising at a number of levels. First of all, as defined, there is no reason to believe that S* is a matching at all, let alone a stable matching. After all, why couldn't it happen that two men have the same best valid partner? Second, the result shows that the G-S algorithm gives the best possible outcome for every man simultaneously; there is no stable matching in which any of the men could have hoped to do better. And finally, it answers our question above by showing that the order of proposals in the G-S algorithm has absolutely no effect on the final outcome. Despite all this, the proof is not so difficult.

Proof. Let us suppose, by way of contradiction, that some execution E of the G-S algorithm results in a matching S in which some man is paired with a woman who is not his best valid partner. Since men propose in decreasing order of preference, this means that some man is rejected by a valid partner during the execution E of the algorithm. So consider the first moment during the execution E in which some man, say m, is rejected by a valid partner w. Again, since men propose in decreasing order of preference, and since this is the first time such a rejection has occurred, it must be that w is m's best valid partner best(m).

The rejection of m by w may have happened either because m proposed and was turned down in favor of w's existing engagement, or because w broke her engagement to m in favor of a better proposal. But either way, at this moment w forms or continues an engagement with a man m′ whom she prefers to m.

Since w is a valid partner of m, there exists a stable matching S′ containing the pair (m, w). Now we ask: Who is m′ paired with in this matching? Suppose it is a woman w′ ≠ w.

Since the rejection of m by w was the first rejection of a man by a valid partner in the execution E, it must be that m′ had not been rejected by any valid partner at the point in E when he became engaged to w. Since he proposed in decreasing order of preference, and since w′ is clearly a valid partner of m′, it must be that m′ prefers w to w′. But we have already seen that w prefers m′ to m, for in execution E she rejected m in favor of m′. Since (m, w) ∈ S′, it follows that (m′, w) is an instability in S′.

This contradicts our claim that S′ is stable and hence contradicts our initial assumption.
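The order-independence claim in (1.7) is easy to experiment with. Below is a compact Python sketch of the G-S algorithm (the names and data layout are illustrative, not the book's pseudocode), parameterized by the rule used to choose which free man proposes next; every choice rule yields the same matching.

```python
# A sketch of the Gale-Shapley algorithm with a pluggable rule for
# selecting the next free man. Preference lists are assumed complete,
# so every man ends up matched. Names here are illustrative.
def gale_shapley(men_prefs, women_prefs, pick=lambda free: free[0]):
    """men_prefs[m] and women_prefs[w] are preference lists, best first.
    `pick` chooses which free man proposes next; by (1.7) the returned
    matching is the same for every choice rule."""
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}   # index into m's list
    engaged_to = {}                              # woman -> man
    free = list(men_prefs)
    while free:
        m = pick(free)
        free.remove(m)
        w = men_prefs[m][next_proposal[m]]       # highest-ranked not yet tried
        next_proposal[m] += 1
        if w not in engaged_to:
            engaged_to[w] = m                    # w was free: engage
        elif rank[w][m] < rank[w][engaged_to[w]]:
            free.append(engaged_to[w])           # w trades up; old fiance freed
            engaged_to[w] = m
        else:
            free.append(m)                       # w rejects m
    return {m: w for w, m in engaged_to.items()}

men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
first = gale_shapley(men, women, pick=lambda f: f[0])
last = gale_shapley(men, women, pick=lambda f: f[-1])
assert first == last == {"m1": "w1", "m2": "w2"}
```

Swapping in any other `pick` rule (random choice, alphabetical, and so on) leaves the assertion true, which is exactly what the proof above guarantees.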

So for the men, the G-S algorithm is ideal. Unfortunately, the same cannot be said for the women. For a woman w, we say that m is a valid partner if there is a stable matching that contains the pair (m, w). We say that m is the worst valid partner of w if m is a valid partner of w, and no man whom w ranks lower than m is a valid partner of hers.

(1.8) In the stable matching S*, each woman is paired with her worst valid partner.

Proof. Suppose there were a pair (m, w) in S* such that m is not the worst valid partner of w. Then there is a stable matching S′ in which w is paired with a man m′ whom she likes less than m. In S′, m is paired with a woman w′ ≠ w; since w is the best valid partner of m, and w′ is a valid partner of m, we see that m prefers w to w′.

But from this it follows that (m, w) is an instability in S′, contradicting the claim that S′ is stable and hence contradicting our initial assumption.

Thus, we find that our simple example above, in which the men's preferences clashed with the women's, hinted at a very general phenomenon: for any input, the side that does the proposing in the G-S algorithm ends up with the best possible stable matching (from their perspective), while the side that does not do the proposing correspondingly ends up with the worst possible stable matching.

1.2 Five Representative Problems

The Stable Matching Problem provides us with a rich example of the process of algorithm design. For many problems, this process involves a few significant steps: formulating the problem with enough mathematical precision that we can ask a concrete question and start thinking about algorithms to solve it; designing an algorithm for the problem; and analyzing the algorithm by proving it is correct and giving a bound on the running time so as to establish the algorithm's efficiency.

This high-level strategy is carried out in practice with the help of a few fundamental design techniques, which are very useful in assessing the inherent complexity of a problem and in formulating an algorithm to solve it. As in any area, becoming familiar with these design techniques is a gradual process; but with experience one can start recognizing problems as belonging to identifiable genres and appreciating how subtle changes in the statement of a problem can have an enormous effect on its computational difficulty.

To get this discussion started, then, it helps to pick out a few representative milestones that we'll be encountering in our study of algorithms: cleanly formulated problems, all resembling one another at a general level, but differing greatly in their difficulty and in the kinds of approaches that one brings to bear on them. The first three will be solvable efficiently by a sequence of increasingly subtle algorithmic techniques; the fourth marks a major turning point in our discussion, serving as an example of a problem believed to be unsolvable by any efficient algorithm; and the fifth hints at a class of problems believed to be harder still.

The problems are self-contained and are all motivated by computing applications. To talk about some of them, though, it will help to use the terminology of graphs. While graphs are a common topic in earlier computer science courses, we'll be introducing them in a fair amount of depth in Chapter 3; due to their enormous expressive power, we'll also be using them extensively throughout the book. For the discussion here, it's enough to think of a graph G as simply a way of encoding pairwise relationships among a set of objects. Thus, G consists of a pair of sets (V, E): a collection V of nodes and a collection E of edges, each of which "joins" two of the nodes. We thus represent an edge e ∈ E as a two-element subset of V: e = {u, v} for some u, v ∈ V, where we call u and v the ends of e. We typically draw graphs as in Figure 1.3, with each node as a small circle and each edge as a line segment joining its two ends.

Let's now turn to a discussion of the five representative problems.

Interval Scheduling

Consider the following very simple scheduling problem. You have a resource (it may be a lecture room, a supercomputer, or an electron microscope) and many people request to use the resource for periods of time. A request takes the form: Can I reserve the resource starting at time s, until time f? We will assume that the resource can be used by at most one person at a time. A scheduler wants to accept a subset of these requests, rejecting all others, so that the accepted requests do not overlap in time. The goal is to maximize the number of requests accepted.

More formally, there will be n requests labeled 1, ..., n, with each request i specifying a start time si and a finish time fi. Naturally, we have si < fi for all i. Two requests i and j are compatible if the requested intervals do not overlap: that is, either request i is for an earlier time interval than request j (fi ≤ sj), or request i is for a later time than request j (fj ≤ si). We'll say more generally that a subset A of requests is compatible if all pairs of requests i, j ∈ A, i ≠ j are compatible. The goal is to select a compatible subset of requests of maximum possible size.
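The compatibility definition is easy to state in code. The sketch below uses an illustrative (start, finish) tuple representation, not anything prescribed by the text.

```python
# Two requests are compatible when one finishes no later than the
# other starts: f_i <= s_j or f_j <= s_i.
def compatible(a, b):
    (s1, f1), (s2, f2) = a, b
    return f1 <= s2 or f2 <= s1

# A set of requests is compatible when every pair is.
def is_compatible_set(requests):
    return all(compatible(requests[i], requests[j])
               for i in range(len(requests))
               for j in range(i + 1, len(requests)))

assert compatible((1, 3), (3, 5))        # touching endpoints are allowed
assert not compatible((1, 4), (3, 5))    # overlapping intervals conflict
assert is_compatible_set([(0, 2), (2, 4), (5, 7)])
```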
We illustrate an instance of this Interval Scheduling Problem in Figure 1.4. Note that there is a single compatible set of size 4, and this is the largest compatible set.

Figure 1.4 An instance of the Interval Scheduling Problem.

Figure 1.3 Each of (a) and (b) depicts a graph on four nodes.
We will see shortly that this problem can be solved by a very natural algorithm that orders the set of requests according to a certain heuristic and then "greedily" processes them in one pass, selecting as large a compatible subset as it can. This will be typical of a class of greedy algorithms that we will consider for various problems: myopic rules that process the input one piece at a time with no apparent look-ahead. When a greedy algorithm can be shown to find an optimal solution for all instances of a problem, it's often fairly surprising. We typically learn something about the structure of the underlying problem from the fact that such a simple approach can be optimal.
Weighted Interval Scheduling
In the Interval Scheduling Problem, we sought to maximize the number of requests that could be accommodated simultaneously. Now, suppose more generally that each request interval i has an associated value, or weight, vi > 0; we could picture this as the amount of money we will make from the ith individual if we schedule his or her request. Our goal will be to find a compatible subset of intervals of maximum total value.

The case in which vi = 1 for each i is simply the basic Interval Scheduling Problem; but the appearance of arbitrary values changes the nature of the maximization problem quite a bit. Consider, for example, that if v1 exceeds the sum of all other vi, then the optimal solution must include interval 1 regardless of the configuration of the full set of intervals. So any algorithm for this problem must be very sensitive to the values, and yet degenerate to a method for solving (unweighted) interval scheduling when all the values are equal to 1.

There appears to be no simple greedy rule that walks through the intervals one at a time, making the correct decision in the presence of arbitrary values. Instead, we employ a technique, dynamic programming, that builds up the optimal value over all possible solutions in a compact, tabular way that leads to a very efficient algorithm.
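To preview the flavor of that tabular approach, here is one standard dynamic-programming formulation (a sketch under illustrative names; the book develops the technique properly in a later chapter): sort by finish time, and for each interval either skip it or take it together with the best solution among earlier compatible intervals.

```python
from bisect import bisect_right

# One standard DP for Weighted Interval Scheduling: intervals are
# (start, finish, value) triples; returns the maximum total value of a
# compatible subset. Runs in O(n log n) after sorting.
def max_weight(intervals):
    intervals = sorted(intervals, key=lambda t: t[1])   # by finish time
    finishes = [f for _, f, _ in intervals]
    best = [0] * (len(intervals) + 1)    # best[k]: optimum over first k
    for k, (s, f, v) in enumerate(intervals, start=1):
        # p = how many earlier intervals finish no later than s
        p = bisect_right(finishes, s, 0, k - 1)
        best[k] = max(best[k - 1],       # skip interval k
                      best[p] + v)       # take it, plus compatible prefix
    return best[-1]

# v1 dominates the rest, so the optimum must take interval 1:
assert max_weight([(0, 10, 100), (0, 3, 1), (4, 6, 1), (7, 9, 1)]) == 100
# with unit values this degenerates to plain interval scheduling:
assert max_weight([(0, 2, 1), (1, 3, 1), (2, 4, 1)]) == 2
```

The two assertions mirror the two observations in the paragraph above: the algorithm is sensitive to values, yet behaves like unweighted interval scheduling when all values are 1.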

Bipartite Matching

When we considered the Stable Matching Problem, we defined a matching to be a set of ordered pairs of men and women with the property that each man and each woman belong to at most one of the ordered pairs. We then defined a perfect matching to be a matching in which every man and every woman belong to some pair.

We can express these concepts more generally in terms of graphs, and in order to do this it is useful to define the notion of a bipartite graph. We say that a graph G = (V, E) is bipartite if its node set V can be partitioned into sets X and Y in such a way that every edge has one end in X and the other end in Y. A bipartite graph is pictured in Figure 1.5; often, when we want to emphasize a graph's "bipartiteness," we will draw it this way, with the nodes in X and Y in two parallel columns. But notice, for example, that the two graphs in Figure 1.3 are also bipartite.

Now, in the problem of finding a stable matching, matchings were built from pairs of men and women. In the case of bipartite graphs, the edges are pairs of nodes, so we say that a matching in a graph G = (V, E) is a set of edges M ⊆ E with the property that each node appears in at most one edge of M. M is a perfect matching if every node appears in exactly one edge of M.

To see that this does capture the same notion we encountered in the Stable Matching Problem, consider a bipartite graph G with a set X of n men, a set Y of n women, and an edge from every node in X to every node in Y. Then the matchings and perfect matchings in G are precisely the matchings and perfect matchings among the set of men and women.

In the Stable Matching Problem, we added preferences to this picture. Here, we do not consider preferences; but the nature of the problem in arbitrary bipartite graphs adds a different source of complexity: there is not necessarily an edge from every x ∈ X to every y ∈ Y, so the set of possible matchings has quite a complicated structure. In other words, it is as though only certain pairs of men and women are willing to be paired off, and we want to figure out how to pair off many people in a way that is consistent with this. Consider, for example, the bipartite graph G in Figure 1.5: there are many matchings in G, but there is only one perfect matching. (Do you see it?)

Matchings in bipartite graphs can model situations in which objects are being assigned to other objects. Thus, the nodes in X can represent jobs, the nodes in Y can represent machines, and an edge (xi, yj) can indicate that machine yj is capable of processing job xi. A perfect matching is then a way of assigning each job to a machine that can process it, with the property that each machine is assigned exactly one job. In the spring, computer science departments across the country are often seen pondering a bipartite graph in which X is the set of professors in the department, Y is the set of offered courses, and an edge (xi, yj) indicates that professor xi is capable of teaching course yj. A perfect matching in this graph consists of an assignment of each professor to a course that he or she can teach, in such a way that every course is covered.

Thus the Bipartite Matching Problem is the following: Given an arbitrary bipartite graph G, find a matching of maximum size. If |X| = |Y| = n, then there is a perfect matching if and only if the maximum matching has size n. We will find that the algorithmic techniques discussed earlier do not seem adequate for providing an efficient algorithm for this problem. There is, however, a very elegant and efficient algorithm to find a maximum matching; it inductively builds up larger and larger matchings, selectively backtracking along the way. This process is called augmentation, and it forms the central component in a large class of efficiently solvable problems called network flow problems.

Figure 1.5 A bipartite graph.
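The augmentation idea can be sketched concretely. The code below is one standard realization (an augmenting-path search, sometimes called Kuhn's algorithm), offered here as an illustration rather than the book's own presentation, which arrives later via network flows; all names are illustrative.

```python
# Augmenting-path sketch of maximum bipartite matching: repeatedly try
# to match each x, re-matching previously matched partners ("selectively
# backtracking") whenever that frees up a spot.
def max_matching(adj, X):
    """adj[x] lists the nodes in Y joined to x. Returns a dict mapping
    each matched y to its partner x."""
    match = {}                       # y -> x

    def augment(x, seen):
        for y in adj.get(x, []):
            if y in seen:
                continue
            seen.add(y)
            # take y if it is free, or if y's partner can move elsewhere
            if y not in match or augment(match[y], seen):
                match[y] = x
                return True
        return False

    for x in X:
        augment(x, set())
    return match

# the complete bipartite graph on 3+3 nodes has a perfect matching
adj = {x: ["y1", "y2", "y3"] for x in ["x1", "x2", "x3"]}
assert len(max_matching(adj, ["x1", "x2", "x3"])) == 3
```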

Independent Set

Figure 1.6 A graph whose largest independent set has size 4.

Now let's talk about an extremely general problem, which includes most of these earlier problems as special cases. Given a graph G = (V, E), we say a set of nodes S ⊆ V is independent if no two nodes in S are joined by an edge. The Independent Set Problem is, then, the following: Given G, find an independent set that is as large as possible. For example, the maximum size of an independent set in the graph in Figure 1.6 is four, achieved by the four-node independent set {1, 4, 5, 6}.

The Independent Set Problem encodes any situation in which you are trying to choose from among a collection of objects and there are pairwise conflicts among some of the objects. Say you have n friends, and some pairs of them don't get along. How large a group of your friends can you invite to dinner if you don't want any interpersonal tensions? This is simply the largest independent set in the graph whose nodes are your friends, with an edge between each conflicting pair.

Interval Scheduling and Bipartite Matching can both be encoded as special cases of the Independent Set Problem. For Interval Scheduling, define a graph G = (V, E) in which the nodes are the intervals and there is an edge between each pair of them that overlap; the independent sets in G are then just the compatible subsets of intervals. Encoding Bipartite Matching as a special case of Independent Set is a little trickier to see. Given a bipartite graph G′ = (V′, E′), the objects being chosen are edges, and the conflicts arise between two edges that share an end. (These, indeed, are the pairs of edges that cannot belong to a common matching.) So we define a graph G = (V, E) in which the node set V is equal to the edge set E′ of G′. We define an edge between each pair of elements in V that correspond to edges of G′ with a common end. We can now check that the independent sets of G are precisely the matchings of G′. While it is not complicated to check this, it takes a little concentration to deal with this type of "edges-to-nodes, nodes-to-edges" transformation.²
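The transformation is short enough to write out directly; the sketch below (illustrative names) builds the conflict graph whose independent sets correspond to matchings.

```python
from itertools import combinations

# "Edges-to-nodes" transformation: each edge of the bipartite graph G'
# becomes a node of the conflict graph G, and two such nodes are joined
# exactly when the underlying edges share an end.
def conflict_graph(edges):
    nodes = list(edges)
    conflicts = [(e, f) for e, f in combinations(nodes, 2)
                 if set(e) & set(f)]    # shared endpoint => conflict
    return nodes, conflicts

edges = [("x1", "y1"), ("x1", "y2"), ("x2", "y2")]
nodes, conflicts = conflict_graph(edges)
# ("x1","y1") and ("x2","y2") share no end, so they form a matching
# of G' and hence an independent set of G:
assert (("x1", "y1"), ("x2", "y2")) not in conflicts
assert len(conflicts) == 2
```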

² For those who are curious, we note that not every instance of the Independent Set Problem can arise in this way from Interval Scheduling or from Bipartite Matching; the full Independent Set Problem really is more general. The graph in Figure 1.3(a) cannot arise as the "conflict graph" in an instance of Interval Scheduling, and the graph in Figure 1.3(b) cannot arise as the "conflict graph" in an instance of Bipartite Matching.


Given the generality of the Independent Set Problem, an efficient algorithm to solve it would be quite impressive. It would have to implicitly contain algorithms for Interval Scheduling, Bipartite Matching, and a host of other natural optimization problems.

The current status of Independent Set is this: no efficient algorithm is known for the problem, and it is conjectured that no such algorithm exists. The obvious brute-force algorithm would try all subsets of the nodes, checking each to see if it is independent, and then recording the largest one encountered. It is possible that this is close to the best we can do on this problem. We will see later in the book that Independent Set is one of a large class of problems that are termed NP-complete. No efficient algorithm is known for any of them; but they are all equivalent in the sense that a solution to any one of them would imply, in a precise sense, a solution to all of them.
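The brute-force algorithm, together with the independence check it relies on, can be sketched as follows. (The example graph is a hand-made stand-in whose largest independent set also has size 4; Figure 1.6's exact edge list is not reproduced here.)

```python
from itertools import combinations

# Brute force: try all subsets of nodes, largest first, and return the
# first independent one. Exponential time; the cheap is_independent
# check is the easy "verification" step discussed in the text.
def is_independent(nodes, edges):
    s = set(nodes)
    return not any(u in s and v in s for u, v in edges)

def largest_independent_set(V, edges):
    for k in range(len(V), 0, -1):
        for subset in combinations(V, k):
            if is_independent(subset, edges):
                return set(subset)
    return set()

# an illustrative 7-node graph (not Figure 1.6 itself)
V = [1, 2, 3, 4, 5, 6, 7]
E = [(1, 2), (2, 3), (3, 4), (3, 5), (2, 6), (2, 7), (6, 7)]
best = largest_independent_set(V, E)
assert is_independent(best, E) and len(best) == 4
assert is_independent([1, 4, 5, 6], E)   # checking a certificate is easy
```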

Here's a natural question: Is there anything good we can say about the complexity of the Independent Set Problem? One positive thing is the following: If we have a graph G on 1,000 nodes, and we want to convince you that it contains an independent set S of size 100, then it's quite easy. We simply show you the graph G, circle the nodes of S in red, and let you check that no two of them are joined by an edge. So there really seems to be a great difference in difficulty between checking that something is a large independent set and actually finding a large independent set. This may look like a very basic observation, and it is, but it turns out to be crucial in understanding this class of problems. Furthermore, as we'll see next, it's possible for a problem to be so hard that there isn't even an easy way to "check" solutions in this sense.

Competitive Facility Location

Finally, we come to our fifth problem, which is based on the following two-player game. Consider two large companies that operate café franchises across the country (let's call them JavaPlanet and Queequeg's Coffee), and suppose they are currently competing for market share in a geographic area. First JavaPlanet opens a franchise; then Queequeg's Coffee opens a franchise; then JavaPlanet; then Queequeg's; and so on. Suppose they must deal with zoning regulations that require no two franchises be located too close together, and each is trying to make its locations as convenient as possible. Who will win?

Let's make the rules of this "game" more concrete. The geographic region in question is divided into n zones, labeled 1, 2, ..., n. Each zone i has a



Figure 1.7 An instance of the Competitive Facility Location Problem. (The zone values shown in the figure are 10, 1, 5, 15, 5, 1, 5, 1, 15, 10.)

value bi, which is the revenue obtained by either of the companies if it opens a franchise there. Finally, certain pairs of zones (i, j) are adjacent, and local zoning laws prevent two adjacent zones from each containing a franchise, regardless of which company owns them. (They also prevent two franchises from being opened in the same zone.) We model these conflicts via a graph G = (V, E), where V is the set of zones, and (i, j) is an edge in E if the zones i and j are adjacent. The zoning requirement then says that the full set of franchises opened must form an independent set in G.

Thus our game consists of two players, P1 and P2, alternately selecting nodes in G, with P1 moving first. At all times, the set of all selected nodes must form an independent set in G. Suppose that player P2 has a target bound B, and we want to know: is there a strategy for P2 so that no matter how P1 plays, P2 will be able to select a set of nodes with a total value of at least B? We will call this an instance of the Competitive Facility Location Problem.

Consider, for example, the instance pictured in Figure 1.7, and suppose that P2's target bound is B = 20. Then P2 does have a winning strategy. On the other hand, if B = 25, then P2 does not.
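For tiny instances, the case-checking can be mechanized as a game-tree search. The sketch below is illustrative (names and instance are my own) and, fittingly for this problem, takes exponential time.

```python
from functools import lru_cache

# Brute-force game-tree search for Competitive Facility Location:
# players alternate claiming zones, the claimed zones must remain an
# independent set, and we ask whether P2 (the second mover) can
# guarantee total value at least B against every play by P1.
def p2_wins(values, edges, B):
    nbr = {i: set() for i in range(len(values))}
    for i, j in edges:
        nbr[i].add(j)
        nbr[j].add(i)

    @lru_cache(maxsize=None)
    def play(avail, p2_total, p2_to_move):
        if not avail:                        # no legal move: game over
            return p2_total >= B
        outcomes = []
        for i in avail:
            # claiming i removes i and all its neighbors from play
            rest = frozenset(a for a in avail if a != i and a not in nbr[i])
            gain = values[i] if p2_to_move else 0
            outcomes.append(play(rest, p2_total + gain, not p2_to_move))
        # P2 needs some good move; P1 picks the move worst for P2
        return any(outcomes) if p2_to_move else all(outcomes)

    return play(frozenset(range(len(values))), 0, False)

# a path of four zones, each worth 5: P2 can always secure one zone,
# but P1 can prevent P2 from ever getting two
values, edges = [5, 5, 5, 5], [(0, 1), (1, 2), (2, 3)]
assert p2_wins(values, edges, B=5)
assert not p2_wins(values, edges, B=10)
```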

One can work this out by looking at the figure for a while; but it requires some amount of case-checking of the form, "If P1 goes here, then P2 will go there; but if P1 goes over there, then P2 will go here..." And this appears to be intrinsic to the problem: not only is it computationally difficult to determine whether P2 has a winning strategy; on a reasonably sized graph, it would even be hard for us to convince you that P2 has a winning strategy. There does not seem to be a short proof we could present; rather, we'd have to lead you on a lengthy case-by-case analysis of the set of possible moves.

This is in contrast to the Independent Set Problem, where we believe that finding a large solution is hard but checking a proposed large solution is easy. This contrast can be formalized in the class of PSPACE-complete problems, of which Competitive Facility Location is an example. PSPACE-complete problems are believed to be strictly harder than NP-complete problems, and this conjectured lack of short "proofs" for their solutions is one indication of this greater hardness. The notion of PSPACE-completeness turns out to capture a large collection of problems involving game-playing and planning; many of these are fundamental issues in the area of artificial intelligence.

Solved Exercises

Solved Exercise 1

Consider a town with n men and n women seeking to get married to one another. Each man has a preference list that ranks all the women, and each woman has a preference list that ranks all the men.

The set of all 2n people is divided into two categories: good people and bad people. Suppose that for some number k, 1 ≤ k ≤ n − 1, there are k good men and k good women; thus there are n − k bad men and n − k bad women.

Everyone would rather marry any good person than any bad person. Formally, each preference list has the property that it ranks each good person of the opposite gender higher than each bad person of the opposite gender: its first k entries are the good people (of the opposite gender) in some order, and its next n − k are the bad people (of the opposite gender) in some order.

Show that in every stable matching, every good man is married to a good woman.

Solution A natural way to get started thinking about this problem is to assume the claim is false and try to work toward obtaining a contradiction. What would it mean for the claim to be false? There would exist some stable matching M in which a good man m was married to a bad woman w.

Now, let's consider what the other pairs in M look like. There are k good men and k good women. Could it be the case that every good woman is married to a good man in this matching M? No: one of the good men (namely, m) is already married to a bad woman, and that leaves only k − 1 other good men. So even if all of them were married to good women, that would still leave some good woman who is married to a bad man.

Let w′ be such a good woman, who is married to a bad man. It is now easy to identify an instability in M: consider the pair (m, w′). Each is good, but is married to a bad partner. Thus, each of m and w′ prefers the other to their current partner, and hence (m, w′) is an instability. This contradicts our assumption that M is stable, and hence concludes the proof.

Solved Exercise 2

We can think about a generalization of the Stable Matching Problem in which certain man-woman pairs are explicitly forbidden. In the case of employers and applicants, we could imagine that certain applicants simply lack the necessary qualifications or certifications, and so they cannot be employed at certain companies, however desirable they may seem. Using the analogy to marriage between men and women, we have a set M of n men, a set W of n women, and a set F ⊆ M × W of pairs who are simply not allowed to get married. Each man m ranks all the women w for which (m, w) ∉ F, and each woman w ranks all the men m for which (m, w) ∉ F.

In this more general setting, we say that a matching S is stable if it does not exhibit any of the following types of instability.

(i) There are two pairs (m, w) and (m′, w′) in S with the property that (m, w′) ∉ F, m prefers w′ to w, and w′ prefers m to m′. (The usual kind of instability.)

(ii) There is a pair (m, w) ∈ S, and a man m′, so that m′ is not part of any pair in the matching, (m′, w) ∉ F, and w prefers m′ to m. (A single man is more desirable and not forbidden.)

(iii) There is a pair (m, w) ∈ S, and a woman w′, so that w′ is not part of any pair in the matching, (m, w′) ∉ F, and m prefers w′ to w. (A single woman is more desirable and not forbidden.)

(iv) There is a man m and a woman w, neither of whom is part of any pair in the matching, so that (m, w) ∉ F. (There are two single people with nothing preventing them from getting married to each other.)

Note that under these more general definitions, a stable matching need not be a perfect matching.

Now we can ask: For every set of preference lists and every set of forbidden pairs, is there always a stable matching? Resolve this question by doing one of the following two things: (a) give an algorithm that, for any set of preference lists and forbidden pairs, produces a stable matching; or (b) give an example of a set of preference lists and forbidden pairs for which there is no stable matching.

Solution The Gale-Shapley algorithm is remarkably robust to variations on the Stable Matching Problem. So, if you're faced with a new variation of the problem and can't find a counterexample to stability, it's often a good idea to check whether a direct adaptation of the G-S algorithm will in fact produce stable matchings.

That turns out to be the case here. We will show that there is always a stable matching, even in this more general model with forbidden pairs, and we will do this by adapting the G-S algorithm. To do this, let's consider why the original G-S algorithm can't be used directly. The difficulty, of course, is that the G-S algorithm doesn't know anything about forbidden pairs, and so the condition in the While loop,

While there is a man m who is free and hasn't proposed to every woman,

won't work: we don't want m to propose to a woman w for which the pair (m, w) is forbidden.

Thus, let's consider a variation of the G-S algorithm in which we make only one change: we modify the While loop to say,

While there is a man m who is free and hasn't proposed to every woman w for which (m, w) ∉ F.

Here is the algorithm in full.

Initially all m ∈ M and w ∈ W are free
While there is a man m who is free and hasn't proposed to
      every woman w for which (m, w) ∉ F
   Choose such a man m
   Let w be the highest-ranked woman in m's preference list
      to which m has not yet proposed
   If w is free then
      (m, w) become engaged
   Else w is currently engaged to m′
      If w prefers m′ to m then
         m remains free
      Else w prefers m to m′
         (m, w) become engaged
         m′ becomes free
      Endif
   Endif
Endwhile
Return the set S of engaged pairs
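A direct Python transcription of the algorithm above might look like the following (illustrative names; here each man's preference list already contains only his non-forbidden women, which is equivalent to the condition (m, w) ∉ F).

```python
# G-S with forbidden pairs: men's lists omit forbidden women, and a man
# who exhausts his list simply stays single, so the matching returned
# need not be perfect.
def stable_matching_with_forbidden(men_prefs, women_prefs):
    rank = {w: {m: i for i, m in enumerate(p)}
            for w, p in women_prefs.items()}
    nxt = {m: 0 for m in men_prefs}      # next woman on m's list
    fiance = {}                          # woman -> man
    free = list(men_prefs)
    while free:
        m = free.pop()
        if nxt[m] == len(men_prefs[m]):
            continue                     # m proposed to every nonforbidden woman
        w = men_prefs[m][nxt[m]]
        nxt[m] += 1
        if w not in fiance:
            fiance[w] = m
        elif rank[w][m] < rank[w][fiance[w]]:
            free.append(fiance[w])       # w trades up; her fiance becomes free
            fiance[w] = m
        else:
            free.append(m)               # w rejects m; he tries his next choice
    return [(m, w) for w, m in fiance.items()]

# (m1, w1) forbidden, so it is simply absent from both lists:
men = {"m1": ["w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2"], "w2": ["m1", "m2"]}
pairs = stable_matching_with_forbidden(men, women)
assert set(pairs) == {("m2", "w1"), ("m1", "w2")}
```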

We now prove that this yields a stable matching, under our new definition of stability.

To begin with, facts (1.1), (1.2), and (1.3) from the text remain true (in particular, the algorithm will terminate in at most n² iterations). Also, we don't have to worry about establishing that the resulting matching S is perfect (indeed, it may not be). We also notice an additional pair of facts. If m is a man who is not part of a pair in S, then m must have proposed to every nonforbidden woman; and if w is a woman who is not part of a pair in S, then it must be that no man ever proposed to w.

Finally, we need only show

(1.9) There is no instability with respect to the returned matching S.


Proof. Our general deÃ¯Â¬Ânition of instability has four parts: This means that we

have to make sure that none of the four bad things happens.

First, suppose there is an instability of type (i), consisting of pairs (m, w)

and (m , w ) in S with the property that (m, w ) Ã¢Ë†Ë† F, m prefers w to w, and w

prefers m to m . It follows that m must have proposed to w ; so w rejected m,

and thus she prefers her Ã¯Â¬Ânal partner to mÃ¢â‚¬â€a contradiction.

Next, suppose there is an instability of type (ii), consisting of a pair

(m, w) Ã¢Ë†Ë† S, and a man m , so that m is not part of any pair in the matching,

(m , w) Ã¢Ë†Ë† F, and w prefers m to m. Then m must have proposed to w and

been rejected; again, it follows that w prefers her Ã¯Â¬Ânal partner to m Ã¢â‚¬â€a

contradiction.

Third, suppose there is an instability of type (iii), consisting of a pair

(m, w) Ã¢Ë†Ë† S, and a woman w , so that w is not part of any pair in the matching,

(m, w ) Ã¢Ë†Ë† F, and m prefers w to w. Then no man proposed to w at all;

in particular, m never proposed to w , and so he must prefer w to w Ã¢â‚¬â€a

contradiction.

Finally, suppose there is an instability of type (iv), consisting of a man m and a woman w, neither of whom is part of any pair in the matching, so that (m, w) ∉ F. But for m to be single, he must have proposed to every nonforbidden woman; in particular, he must have proposed to w, which means she would no longer be single, a contradiction.
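The modified algorithm can also be rendered concretely in code. The following is a minimal Python sketch (function and variable names are our own); preference lists are most-preferred-first, and the forbidden set F contains (man, woman) pairs that may never be matched:

```python
from collections import deque

def stable_matching_with_forbidden(men_prefs, women_prefs, forbidden):
    """Modified Gale-Shapley: men propose in preference order, but never
    to a woman w with (m, w) in the forbidden set.  Returns a dict mapping
    each engaged man to his partner; the matching need not be perfect."""
    # rank[w][m] = position of m on w's list (smaller means preferred)
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next proposal index for each man
    partner = {}                          # woman -> currently engaged man
    free = deque(men_prefs)               # free men who may still propose
    while free:
        m = free[0]
        prefs = men_prefs[m]
        # skip forbidden pairs when choosing the next woman to propose to
        while next_idx[m] < len(prefs) and (m, prefs[next_idx[m]]) in forbidden:
            next_idx[m] += 1
        if next_idx[m] == len(prefs):
            free.popleft()                # m has exhausted the nonforbidden women
            continue
        w = prefs[next_idx[m]]
        next_idx[m] += 1
        if w not in partner:              # w is free: (m, w) become engaged
            partner[w] = m
            free.popleft()
        elif rank[w][m] < rank[w][partner[w]]:
            # w prefers m to her current partner m', who becomes free again
            free.append(partner[w])
            partner[w] = m
            free.popleft()
        # else w rejects m; m stays free and tries his next choice
    return {m: w for w, m in partner.items()}
```

Since forbidden choices are simply skipped, the returned matching may leave some people single, exactly as the analysis above allows.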

Exercises

1. Decide whether you think the following statement is true or false. If it is true, give a short explanation. If it is false, give a counterexample.

True or false? In every instance of the Stable Matching Problem, there is a stable matching containing a pair (m, w) such that m is ranked first on the preference list of w and w is ranked first on the preference list of m.

2. Decide whether you think the following statement is true or false. If it is true, give a short explanation. If it is false, give a counterexample.

True or false? Consider an instance of the Stable Matching Problem in which there exists a man m and a woman w such that m is ranked first on the preference list of w and w is ranked first on the preference list of m. Then in every stable matching S for this instance, the pair (m, w) belongs to S.

3. There are many other settings in which we can ask questions related to some type of "stability" principle. Here's one, involving competition between two enterprises.


Suppose we have two television networks, whom we'll call A and B. There are n prime-time programming slots, and each network has n TV shows. Each network wants to devise a schedule (an assignment of each show to a distinct slot) so as to attract as much market share as possible.

Here is the way we determine how well the two networks perform relative to each other, given their schedules. Each show has a fixed rating, which is based on the number of people who watched it last year; we'll assume that no two shows have exactly the same rating. A network wins a given time slot if the show that it schedules for the time slot has a larger rating than the show the other network schedules for that time slot. The goal of each network is to win as many time slots as possible.

Suppose in the opening week of the fall season, Network A reveals a schedule S and Network B reveals a schedule T. On the basis of this pair of schedules, each network wins certain time slots, according to the rule above. We'll say that the pair of schedules (S, T) is stable if neither network can unilaterally change its own schedule and win more time slots. That is, there is no schedule S′ such that Network A wins more slots with the pair (S′, T) than it did with the pair (S, T); and symmetrically, there is no schedule T′ such that Network B wins more slots with the pair (S, T′) than it did with the pair (S, T).

The analogue of Gale and Shapley's question for this kind of stability is the following: For every set of TV shows and ratings, is there always a stable pair of schedules? Resolve this question by doing one of the following two things:

(a) give an algorithm that, for any set of TV shows and associated ratings, produces a stable pair of schedules; or

(b) give an example of a set of TV shows and associated ratings for which there is no stable pair of schedules.
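Without giving the resolution away, the stability condition itself can be checked by brute force for small n. The following Python sketch (names are our own, and the factorial-time search over permutations is meant only for experimentation) tests a pair of schedules against the definition:

```python
from itertools import permutations

def wins(sched_a, sched_b, rating):
    """Number of slots Network A wins: slot i goes to the network whose
    show in slot i has the strictly larger rating (ratings are distinct)."""
    return sum(rating[a] > rating[b] for a, b in zip(sched_a, sched_b))

def is_stable(S, T, rating):
    """(S, T) is stable iff neither network can win strictly more slots
    by unilaterally reordering its own shows."""
    a_now = wins(S, T, rating)
    b_now = len(S) - a_now  # no ties, so B wins all remaining slots
    if any(wins(S2, T, rating) > a_now for S2 in permutations(S)):
        return False        # some S' improves A's slot count
    if any(len(S) - wins(S, T2, rating) > b_now for T2 in permutations(T)):
        return False        # some T' improves B's slot count
    return True
```

With a checker like this, one can enumerate small instances and see whether a stable pair of schedules always exists.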

4. Gale and Shapley published their paper on the Stable Matching Problem in 1962; but a version of their algorithm had already been in use for ten years by the National Resident Matching Program, for the problem of assigning medical residents to hospitals.

Basically, the situation was the following. There were m hospitals, each with a certain number of available positions for hiring residents. There were n medical students graduating in a given year, each interested in joining one of the hospitals. Each hospital had a ranking of the students in order of preference, and each student had a ranking of the hospitals in order of preference. We will assume that there were more students graduating than there were slots available in the m hospitals.


The interest, naturally, was in finding a way of assigning each student to at most one hospital, in such a way that all available positions in all hospitals were filled. (Since we are assuming a surplus of students, there would be some students who do not get assigned to any hospital.)

We say that an assignment of students to hospitals is stable if neither of the following situations arises.

• First type of instability: There are students s and s′, and a hospital h, so that
  – s is assigned to h, and
  – s′ is assigned to no hospital, and
  – h prefers s′ to s.

• Second type of instability: There are students s and s′, and hospitals h and h′, so that
  – s is assigned to h, and
  – s′ is assigned to h′, and
  – h prefers s′ to s, and
  – s′ prefers h to h′.

So we basically have the Stable Matching Problem, except that (i) hospitals generally want more than one resident, and (ii) there is a surplus of medical students.

Show that there is always a stable assignment of students to hospitals, and give an algorithm to find one.
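One way to experiment with candidate algorithms is to test an assignment directly against the two instability conditions above. A minimal Python sketch (names are our own; preference lists are most-preferred-first):

```python
def find_instability(assignment, hosp_prefs, stud_prefs):
    """Search an assignment of students to hospitals for a witness to
    either type of instability; return a triple (s, s2, h) if one is
    found, else None.  `assignment` maps each placed student to a
    hospital; unplaced students are simply absent from it."""
    hrank = {h: {s: i for i, s in enumerate(p)} for h, p in hosp_prefs.items()}
    srank = {s: {h: i for i, h in enumerate(p)} for s, p in stud_prefs.items()}
    for s, h in assignment.items():
        for s2 in hosp_prefs[h]:
            if hrank[h][s2] < hrank[h][s]:        # h prefers s2 to s ...
                h2 = assignment.get(s2)
                if h2 is None:                    # ... and s2 is unplaced (first type)
                    return (s, s2, h)
                if srank[s2][h] < srank[s2][h2]:  # ... and s2 prefers h (second type)
                    return (s, s2, h)
    return None
```

An assignment is stable exactly when this search comes up empty, so the checker can validate the output of whatever algorithm the exercise leads you to.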

5. The Stable Matching Problem, as discussed in the text, assumes that all men and women have a fully ordered list of preferences. In this problem we will consider a version of the problem in which men and women can be indifferent between certain options. As before we have a set M of n men and a set W of n women. Assume each man and each woman ranks the members of the opposite gender, but now we allow ties in the ranking.

For example (with n = 4), a woman could say that m1 is ranked in first place; second place is a tie between m2 and m3 (she has no preference between them); and m4 is in last place. We will say that w prefers m to m′ if m is ranked higher than m′ on her preference list (they are not tied).

With indifferences in the rankings, there could be two natural notions for stability. And for each, we can ask about the existence of stable matchings, as follows.

(a) A strong instability in a perfect matching S consists of a man m and a woman w, such that each of m and w prefers the other to their partner in S. Does there always exist a perfect matching with no strong instability? Either give an example of a set of men and women with preference lists for which every perfect matching has a strong instability; or give an algorithm that is guaranteed to find a perfect matching with no strong instability.

(b) A weak instability in a perfect matching S consists of a man m and a woman w, such that their partners in S are w′ and m′,…
