CSE 373/548 - Analysis of Algorithms

Spring 1996


Steven Skiena
Department of Computer Science
SUNY Stony Brook

In Spring 1996, I taught my Analysis of Algorithms course via EngiNet, the SUNY Stony Brook distance learning program. Each of my lectures that semester was videotaped, and the tapes made available to off-site students. I found it an enjoyable experience.

As an experiment in using the Internet for distance learning, we have digitized the complete audio of all 23 lectures, and have made this available on the WWW. We partitioned the full audio track into sound clips, each corresponding to one page of lecture notes, and linked them to the associated text and images.

In a real sense, listening to all the audio is analogous to sitting through a one-semester college course on algorithms! Properly compressed, the full semester's audio requires less than 300 megabytes of storage, which is much less than I would have imagined. The entire semester's lectures, over thirty hours of audio files, fit comfortably on The Algorithm Design Manual CD-ROM, which also includes a hypertext version of the book and a substantial amount of software.




Lecture 1 - analyzing algorithms

Listening To Part 1-7

Lecture Schedule

Subject            Topics                            Reading
Preliminaries      Analyzing algorithms              1-32
"                  Asymptotic notation               32-37
"                  Recurrence relations              53-64
Sorting            Heapsort                          140-150
"                  Quicksort                         153-167
"                  Linear Sorting                    172-182
Searching          Data structures                   200-215
"                  Binary search trees               244-245
"                  Red-Black trees: insertion        262-272
"                  Red-Black trees: deletion         272-277
MIDTERM 1
Comb. Search       Backtracking
"                  Elements of dynamic programming   301-314
"                  Examples of dynamic programming   314-323
Graph Algorithms   Data structures for graphs        465-477
"                  Breadth/depth-first search        477-483
"                  Topological Sort/Connectivity     485-493
"                  Minimum Spanning Trees            498-510
"                  Single-source shortest paths      514-532
"                  All-pairs shortest paths          550-563
MIDTERM 2
Intractability     P and NP                          916-928
"                  NP-completeness                   929-939
"                  NP-completeness proofs            939-951
"                  Further reductions                951-960
"                  Approximation algorithms          964-974
"                  Set cover / knapsack heuristics   974-983
FINAL EXAM

Listening To Part 1-8

What Is An Algorithm?

Algorithms are the ideas behind computer programs.  

An algorithm is the thing which stays the same whether the program is in Pascal running on a Cray in New York or is in BASIC running on a Macintosh in Kathmandu!

To be interesting, an algorithm has to solve a general, specified problem. An algorithmic problem is specified by describing the set of instances it must work on and what desired properties the output must have.  

Example: Sorting

Input: A sequence of N numbers $a_1, \ldots, a_N$

Output: the permutation (reordering) of the input sequence such that $a_1 \le a_2 \le \ldots \le a_N$.

We seek algorithms which are correct and efficient.

Correctness

For any algorithm, we must prove that it always returns the desired output for all legal instances of the problem.  

For sorting, this means even if (1) the input is already sorted, or (2) it contains repeated elements.

Listening To Part 1-9

Correctness is Not Obvious!

The following problem arises often in manufacturing and transportation testing applications.

Suppose you have a robot arm equipped with a tool, say a soldering iron. To enable the robot arm to do a soldering job, we must construct an ordering of the contact points, so the robot visits (and solders) the first contact point, then visits the second point, third, and so forth until the job is done.   

Since robots are expensive, we need to find the order which minimizes the time (ie. travel distance) it takes to assemble the circuit board.

You are given the job to program the robot arm. Give me an algorithm to find the best tour!

Listening To Part 1-10

Nearest Neighbor Tour

A very popular solution starts at some point $p_0$ and then walks to its nearest neighbor $p_1$ first, then repeats from $p_1$, etc. until done.


Pick and visit an initial point $p_0$
$p = p_0$
i = 0
While there are still unvisited points
    i = i+1
    Let $p_i$ be the closest unvisited point to $p_{i-1}$
    Visit $p_i$
Return to $p_0$ from $p_i$

This algorithm is simple to understand and implement and very efficient. However, it is not correct!
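To make the failure concrete, here is a minimal Python sketch of the nearest-neighbor heuristic. The function names `tour_length` and `nearest_neighbor_tour` and the tie-breaking rule (smaller index wins) are assumptions made for illustration; they are not part of the lecture.

```python
import math

def tour_length(points, order):
    """Total length of the closed tour that visits points in the given order."""
    n = len(order)
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % n]])
        for k in range(n)
    )

def nearest_neighbor_tour(points, start=0):
    """Return the visiting order produced by the nearest-neighbor heuristic."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        last = points[order[-1]]
        # Closest unvisited point to the most recently visited point;
        # ties are broken by the smaller index (an arbitrary choice).
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

# Points on a line at x = 0, 1, -1, 3, -5, 11, -21.
line = [(0, 0), (1, 0), (-1, 0), (3, 0), (-5, 0), (11, 0), (-21, 0)]
order = nearest_neighbor_tour(line)
print(order, tour_length(line, order))
```

On this instance the heuristic zigzags left and right for a total length of 84, while simply sweeping from one end of the line to the other and back gives a tour of length 64.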

Always starting from the leftmost point or any other point will not fix the problem.

Listening To Part 1-11

Closest Pair Tour

Always walking to the closest point is too restrictive, since that point might trap us into making moves we don't want.  

Another idea would be to repeatedly connect the closest pair of points whose connection will not cause a cycle or a three-way branch to be formed, until we have a single chain with all the points in it.


Let n be the number of points in the set
For i=1 to n-1 do
    $d = \infty$
    For each pair of endpoints (x,y) of distinct partial paths
        If $dist(x,y) \le d$ then
            $x_m = x$, $y_m = y$, d = dist(x,y)
    Connect $(x_m, y_m)$ by an edge
Connect the two endpoints by an edge.

Although it works correctly on the previous example, other data causes trouble:

[Figure: a point set on which the closest-pair tour goes wrong]
This algorithm is not correct!

Listening To Part 1-12

A Correct Algorithm

We could try all possible orderings of the points, then select the ordering which minimizes the total length:  


$d = \infty$
For each of the n! permutations $\Pi_i$ of the n points
    If $cost(\Pi_i) \le d$ then
        $d = cost(\Pi_i)$ and $P_{min} = \Pi_i$
Return $P_{min}$

Since all possible orderings are considered, we are guaranteed to end up with the shortest possible tour.

Because it tries all n! permutations, it is extremely slow, much too slow to use when there are more than 10-20 points.
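The exhaustive search can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's code: the name `optimal_tour` is hypothetical, and fixing the first point to skip rotations of the same cycle is a small shortcut added here.

```python
import math
from itertools import permutations

def optimal_tour(points):
    """Return (length, order) of a shortest closed tour by trying every ordering."""
    n = len(points)
    best_length, best_order = math.inf, None
    # Fixing point 0 as the start skips rotations of the same cycle,
    # so (n-1)! orderings are examined instead of n!.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        length = sum(
            math.dist(points[order[k]], points[order[(k + 1) % n]])
            for k in range(n)
        )
        if length < best_length:
            best_length, best_order = length, order
    return best_length, best_order

# Four corners of the unit square: the best tour walks the perimeter.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(optimal_tour(square))
```

Even with the shortcut the loop body runs (n-1)! times, so the running time still explodes long before n reaches 20.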

No efficient, correct algorithm is known for the traveling salesman problem, as we will see later.

Listening To Part 1-13

Efficiency

"Why not just use a supercomputer?"

Supercomputers are for people too rich and too stupid to design efficient algorithms!  

A faster algorithm running on a slower computer will always win for sufficiently large instances, as we shall see.

Usually, problems don't have to get that large before the faster algorithm wins.

Expressing Algorithms

We need some way to express the sequence of steps comprising an algorithm.

In order of increasing precision, we have English, pseudocode, and real programming languages. Unfortunately, ease of expression moves in the reverse order.

I prefer to describe the ideas of an algorithm in English, moving to pseudocode to clarify sufficiently tricky details of the algorithm.  

Listening To Part 1-14

The RAM Model

Algorithms are the only important, durable, and original part of computer science because they can be studied in a machine and language independent way.

The reason is that we will do all our design and analysis for the RAM model of computation:   

We measure the run time of an algorithm by counting the number of steps.

This model is useful and accurate in the same sense as the flat-earth model (which is useful)!  

Listening To Part 1-15

Best, Worst, and Average-Case

The worst case complexity of the algorithm is the function defined by the maximum number of steps taken on any instance of size n.  

The best case complexity of the algorithm is the function defined by the minimum number of steps taken on any instance of size n.  

The average-case complexity of the algorithm is the function defined by the average number of steps taken over all instances of size n.

Each of these complexities defines a numerical function - time vs. size!

Insertion Sort

One way to sort an array of n elements is to start with an empty list, then successively insert new elements in the proper position:

$$a_1 \le a_2 \le \ldots \le a_k \;|\; a_{k+1}, \ldots, a_n$$

At each stage, the inserted element leaves a sorted list, and after n insertions the list contains exactly the right elements. Thus the algorithm must be correct.

But how efficient is it?

Note that the run time changes with the permutation instance! (even for a fixed size problem)

How does insertion sort do on sorted permutations?

How about unsorted permutations?

Exact Analysis of Insertion Sort

Count the number of times each line of pseudocode will be executed.

Line  InsertionSort(A)                        Cost    Times
1     for j := 2 to length of A do            c1      n
2         key := A[j]                         c2      n-1
3         /* put A[j] into A[1..j-1] */       c3=0    /
4         i := j-1                            c4      n-1
5         while i > 0 and A[i] > key do       c5      t_j
6             A[i+1] := A[i]                  c6
7             i := i-1                        c7
8         A[i+1] := key                       c8      n-1

The for statement is executed (n-1)+1 times (why?)

Within the for statement, "key:=A[j]" is executed n-1 times.

Steps 5, 6, 7 are harder to count.

Let $t_j$ be the number of times the while test in step 5 is executed for the jth item, which is one more than the number of elements that have to be slid right to insert it.

Step 5 is executed $\sum_{j=2}^{n} t_j$ times.

Step 6 is executed $\sum_{j=2}^{n} (t_j - 1)$ times.

Add up the executed instructions for all pseudocode lines to get the run-time of the algorithm:

$$T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)$$

What are the $t_j$? They depend on the particular input.

Best Case

If it's already sorted, all $t_j$'s are 1.

Hence, the best case time is

$$c_1 n + (c_2 + c_4 + c_5 + c_8)(n-1) = Cn + D$$

where C and D are constants.

Worst Case

If the input is sorted in descending order, we will have to slide all of the already-sorted elements, so $t_j = j$, and step 5 is executed

$$\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1 \text{ times.}$$
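The best-case and worst-case counts can be checked empirically. The sketch below is a 0-indexed Python version of the pseudocode that also counts evaluations of the inner-loop test (step 5); the counter and the return shape are additions for illustration only.

```python
def insertion_sort(a):
    """Sort a copy of `a`; return (sorted_list, while_tests), where while_tests
    counts how often the inner-loop test (step 5) is evaluated."""
    a = list(a)
    while_tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            while_tests += 1          # one evaluation of the step-5 test
            if i >= 0 and a[i] > key:
                a[i + 1] = a[i]       # slide the larger element right
                i -= 1
            else:
                break
        a[i + 1] = key
    return a, while_tests

n = 10
print(insertion_sort(range(n))[1])          # already sorted input
print(insertion_sort(range(n, 0, -1))[1])   # input in descending order
```

For n = 10 the sorted input needs n-1 = 9 tests, while the descending input needs n(n+1)/2 - 1 = 54, matching the linear best case and quadratic worst case.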



Lecture 2 - asymptotic notation

Listening To Part 2-1

Problem 1.2-6:   How can we modify almost any algorithm to have a good best-case running time?


To improve the best case, all we have to do is be able to solve one instance of each size efficiently. We could modify our algorithm to first test whether the input is the special instance we know how to solve, and then output the canned answer.

For sorting, we can check if the values are already ordered, and if so output them. For the traveling salesman, we can check if the points lie on a line, and if so output the points in that order.

The supercomputer people pull this trick on the linpack benchmarks!


Because it is so easy to cheat with the best case running time, we usually don't rely too much on it.

Because it is usually very hard to compute the average running time, since we must somehow average over all the instances, we usually strive to analyze the worst case running time.

The worst case is usually fairly easy to analyze and often close to the average or real running time.

Listening To Part 2-2

Exact Analysis is Hard!

We have agreed that the best, worst, and average case complexity of an algorithm is a numerical function of the size of the instances.

However, it is difficult to work with exactly because it is typically very complicated!

Thus it is usually cleaner and easier to talk about upper and lower bounds of the function.   

This is where the dreaded big O notation comes in!  

Since running our algorithm on a machine which is twice as fast will affect the running times by a multiplicative constant of 2, we are going to have to ignore constant factors anyway.

Listening To Part 2-3

Names of Bounding Functions

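Now that we have clearly defined the complexity functions we are talking about, we can talk about upper and lower bounds on them:

f(n) = O(g(n)) means $C \times g(n)$ is an upper bound on f(n).
$f(n) = \Omega(g(n))$ means $C \times g(n)$ is a lower bound on f(n).
$f(n) = \Theta(g(n))$ means $C_1 \times g(n)$ is an upper bound on f(n) and $C_2 \times g(n)$ is a lower bound on f(n).

Got it? C, $C_1$, and $C_2$ are all constants independent of n.

All of these definitions imply a constant $n_0$ beyond which they are satisfied. We do not care about small values of n.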

Listening To Part 2-4

O, $\Omega$, and $\Theta$

[Figure: graphical examples of the $\Theta$, O, and $\Omega$ notations, panels (a)-(c)]
The value of $n_0$ shown is the minimum possible value; any greater value would also work.

(a) $f(n) = \Theta(g(n))$ if there exist positive constants $n_0$, $c_1$, and $c_2$ such that to the right of $n_0$, the value of f(n) always lies between $c_1 \cdot g(n)$ and $c_2 \cdot g(n)$ inclusive.

(b) f(n) = O(g(n)) if there are positive constants $n_0$ and c such that to the right of $n_0$, the value of f(n) always lies on or below $c \cdot g(n)$.

(c) $f(n) = \Omega(g(n))$ if there are positive constants $n_0$ and c such that to the right of $n_0$, the value of f(n) always lies on or above $c \cdot g(n)$.

Asymptotic notation ($O$, $\Omega$, $\Theta$) is about as well as we can practically deal with complexity functions.

Listening To Part 2-5

What does all this mean?

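$3n^2 - 100n + 6 = O(n^2)$ because $3n^2 > 3n^2 - 100n + 6$
$3n^2 - 100n + 6 = O(n^3)$ because $0.01 n^3 > 3n^2 - 100n + 6$
$3n^2 - 100n + 6 \ne O(n)$ because $c \cdot n < 3n^2$ when $n > c$

$3n^2 - 100n + 6 = \Omega(n^2)$ because $2.99 n^2 < 3n^2 - 100n + 6$
$3n^2 - 100n + 6 \ne \Omega(n^3)$ because $3n^2 - 100n + 6 < n^3$
$3n^2 - 100n + 6 = \Omega(n)$ because $10^{10^{10}} n < 3n^2 - 100n + 6$

$3n^2 - 100n + 6 = \Theta(n^2)$ because both O and $\Omega$ hold
$3n^2 - 100n + 6 \ne \Theta(n^3)$ because only O holds
$3n^2 - 100n + 6 \ne \Theta(n)$ because only $\Omega$ holds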

Think of the equality as meaning "in the set of functions".

Note that time complexity is every bit as well defined a function as $\sin(x)$ or your bank account as a function of time.

Listening To Part 2-6

Testing Dominance

f(n) dominates g(n) if $\lim_{n \to \infty} g(n)/f(n) = 0$, which is the same as saying g(n)=o(f(n)).

Note the little-oh: it means "grows strictly slower than".

Knowing the dominance relation between common functions is important because we want algorithms whose time complexity is as low as possible in the hierarchy. If f(n) dominates g(n), f is much larger (ie. slower) than g.

Complexity   n = 10        n = 20        n = 30        n = 40        n = 50               n = 60
$n$          0.00001 sec   0.00002 sec   0.00003 sec   0.00004 sec   0.00005 sec          0.00006 sec
$n^2$        0.0001 sec    0.0004 sec    0.0009 sec    0.0016 sec    0.0025 sec           0.0036 sec
$n^3$        0.001 sec     0.008 sec     0.027 sec     0.064 sec     0.125 sec            0.216 sec
$n^5$        0.1 sec       3.2 sec       24.3 sec      1.7 min       5.2 min              13.0 min
$2^n$        0.001 sec     1.0 sec       17.9 min      12.7 days     35.7 years           366 cent
$3^n$        0.059 sec     58 min        6.5 years     3855 cent     $2 \times 10^8$ cent   $1.3 \times 10^{13}$ cent
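The entries of such a table are easy to regenerate. The sketch below assumes the six rows are n, n^2, n^3, n^5, 2^n and 3^n and that each step takes one microsecond, which is the rate implied by the first row; both the helper name `running_time_seconds` and that rate are assumptions made here.

```python
def running_time_seconds(steps, seconds_per_step=1e-6):
    """Time to execute `steps` operations at an assumed 1 microsecond per step."""
    return steps * seconds_per_step

# Assumed row functions of the table.
growth_functions = {
    "n":   lambda n: n,
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "n^5": lambda n: n ** 5,
    "2^n": lambda n: 2 ** n,
    "3^n": lambda n: 3 ** n,
}

for name, f in growth_functions.items():
    row = [running_time_seconds(f(n)) for n in (10, 20, 30, 40, 50, 60)]
    print(name, ["%.3g sec" % t for t in row])
```

Dividing the larger values by 60, 86400, and so on converts them to the minutes, days, years and centuries used in the table.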

Listening To Part 2-7

Logarithms

It is important to understand deep in your bones what logarithms are and where they come from.   

A logarithm is simply an inverse exponential function. Saying $b^x = y$ is equivalent to saying that $x = \log_b y$.

Exponential functions, like the amount owed on an n year mortgage at an interest rate of $c\%$ per year, are functions which grow distressingly fast, as anyone who has tried to pay off a mortgage knows.

Thus inverse exponential functions, ie. logarithms, grow refreshingly slowly.

Binary search is an example of an $O(\lg n)$ algorithm. After each comparison, we can throw away half the possible number of keys. Thus twenty comparisons suffice to find any name in the million-name Manhattan phone book!

If you have an algorithm which runs in $O(\lg n)$ time, take it, because this is blindingly fast even on very large instances.
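The halving argument can be watched in action. This is a minimal sketch of binary search over a sorted Python list that also counts probes; the function name, the probe counter and the stand-in "phone book" of one million even numbers are all assumptions for illustration.

```python
def binary_search(keys, target):
    """Search sorted `keys` for `target`.

    Returns (index, probes): index is -1 if the target is absent, and probes
    counts the loop iterations, each of which halves the candidate range.
    """
    lo, hi = 0, len(keys)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] < target:
            lo = mid + 1      # target can only be to the right of mid
        else:
            hi = mid          # target is at mid or to its left
    found = lo < len(keys) and keys[lo] == target
    return (lo if found else -1), probes

phone_book = list(range(0, 2_000_000, 2))    # one million sorted keys
print(binary_search(phone_book, 1_234_566))
print(binary_search(phone_book, 7))          # an absent key
```

Because one million is less than 2^20, the candidate range is empty after at most twenty halvings, whether or not the key is present.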

Listening To Part 2-8

Properties of Logarithms

Recall the definition, $c^{\log_c x} = x$.

Asymptotically, the base of the log does not matter:

$$\log_b a = \frac{\log_c a}{\log_c b}$$

Thus, $\log_2 n = (1/\log_{100} 2) \times \log_{100} n$, and note that $1/\log_{100} 2 = 6.643$ is just a constant.

Asymptotically, any polynomial function of n inside the log does not matter:

Note that

$$\log(n^{473} + n^2 + n + 96) = O(\log n)$$

since $n^{473} + n^2 + n + 96 = O(n^{473})$, and $\log(n^{473}) = 473 \log n$.

Any exponential dominates every polynomial. This is why we will seek to avoid exponential time algorithms.

Listening To Part 2-9

Federal Sentencing Guidelines

2F1.1. Fraud and Deceit; Forgery; Offenses Involving Altered or Counterfeit Instruments other than Counterfeit Bearer Obligations of the United States.  

(a) Base offense Level: 6

(b) Specific offense Characteristics

(1) If the loss exceeded $2,000, increase the offense level as follows:

Loss(Apply the Greatest) Increase in Level
(A) $2,000 or less no increase
(B) More than $2,000 add 1
(C) More than $5,000 add 2
(D) More than $10,000 add 3
(E) More than $20,000 add 4
(F) More than $40,000 add 5
(G) More than $70,000 add 6
(H) More than $120,000 add 7
(I) More than $200,000 add 8
(J) More than $350,000 add 9
(K) More than $500,000 add 10
(L) More than $800,000 add 11
(M) More than $1,500,000 add 12
(N) More than $2,500,000 add 13
(O) More than $5,000,000 add 14
(P) More than $10,000,000 add 15
(Q) More than $20,000,000 add 16
(R) More than $40,000,000 add 17
(S) More than $80,000,000 add 18

Listening To Part 2-10

The federal sentencing guidelines are designed to help judges be consistent in assigning punishment. The time-to-serve is a roughly linear function of the total level.

However, notice that the increase in level grows only logarithmically in the amount of money stolen.

This very slow growth means it pays to commit one crime stealing a lot of money, rather than many small crimes adding up to the same amount of money, because the time to serve if you get caught is much less.

The Moral: ``if you are gonna do the crime, make it worth the time!''

Listening To Part 2-11

Working with the Asymptotic Notation

Suppose $f(n) = O(n^2)$ and $g(n) = O(n^2)$.

What do we know about g'(n) = f(n)+g(n)? Adding the bounding constants shows $g'(n) = O(n^2)$.

What do we know about g''(n) = f(n)-g(n)? Since the bounding constants don't necessarily cancel, $g''(n) = O(n^2)$.

We know nothing about the lower bounds on g' and g'' because we know nothing about lower bounds on f, g.


Suppose $f(n) = \Omega(n^2)$ and $g(n) = \Omega(n^2)$.

What do we know about g'(n) = f(n)+g(n)? Adding the lower bounding constants shows $g'(n) = \Omega(n^2)$.

What do we know about g''(n) = f(n)-g(n)? We know nothing about the lower bound of this!

Listening To Part 2-12

The Complexity of Songs

Suppose we want to sing a song which lasts for n units of time. Since n can be large, we want to memorize songs which require only a small amount of brain space, i.e. memory.    

Let S(n) be the space complexity of a song which lasts for n units of time.

The amount of space we need to store a song can be measured in either the words or characters needed to memorize it. Note that the number of characters is $\Theta(\text{words})$ since every word in a song is at most 34 letters long - Supercalifragilisticexpialidocious!

What bounds can we establish on S(n)?

Listening To Part 2-13

The Refrain

Most popular songs have a refrain, which is a block of text which gets repeated after each stanza in the song:  

Bye, bye Miss American pie
Drove my chevy to the levy but the levy was dry
Them good old boys were drinking whiskey and rye
Singing this will be the day that I die.

Refrains make a song easier to remember, since you memorize it once yet sing it O(n) times. But do they reduce the space complexity?

Not according to the big oh. If

$$n = \text{repetitions} \times (\text{verse-size} + \text{refrain-size})$$

then the space complexity is still O(n) since it is only halved (if the verse-size = refrain-size):

$$S(n) = \text{repetitions} \times \text{verse-size} + \text{refrain-size}$$

Listening To Part 2-14

The k Days of Christmas

To reduce S(n), we must structure the song differently.

Consider "The k Days of Christmas". All one must memorize is:

On the kth Day of Christmas, my true love gave to me, $gift_k$
$\vdots$
On the First Day of Christmas, my true love gave to me, a partridge in a pear tree

But the time it takes to sing it is

$$\sum_{i=1}^{k} i = \frac{k(k+1)}{2} = \Theta(k^2)$$

If $n = \Theta(k^2)$, then $k = \Theta(\sqrt{n})$, so $S(n) = O(\sqrt{n})$.

Listening To Part 2-15

100 Bottles of Beer

What do kids sing on really long car trips?

n bottles of beer on the wall,
n bottles of beer.
You take one down and pass it around
n-1 bottles of beer on the wall.

All you must remember in this song is this template of size $\Theta(1)$, and the current value of n. The storage size for n depends on its value, but $\lg n$ bits suffice.

Thus for this song, $S(n) = O(\lg n)$.


Is there a song which eliminates even the need to count?

That's the way, uh-huh, uh-huh
I like it, uh-huh, huh

Reference: D. Knuth, `The Complexity of Songs', Comm. ACM, April 1984, pp.18-24



Lecture 3 - recurrence relations

Listening To Part 3-1

Problem 2.1-2: Show that for any real constants a and b, b > 0,

$$(n+a)^b = \Theta(n^b)$$


To show $f(n) = \Theta(g(n))$, we must show O and $\Omega$. Go back to the definition!

Note the need for absolute values.

Listening To Part 3-2

Problem 2.1-4:

(a) Is $2^{n+1} = O(2^n)$?

(b) Is $2^{2n} = O(2^n)$?


(a) Is $2^{n+1} = O(2^n)$?

Is $2^{n+1} \le c \cdot 2^n$?

Yes, if $c \ge 2$ for all n

(b) Is $2^{2n} = O(2^n)$?

Is $2^{2n} \le c \cdot 2^n$?

note $2^{2n} = 2^n \cdot 2^n$

Is $2^n \cdot 2^n \le c \cdot 2^n$?

Is $2^n \le c$?

No! Certainly for any constant c we can find an n such that this is not true.

Listening To Part 3-3

Recurrence Relations

Many algorithms, particularly divide and conquer algorithms, have time complexities which are naturally modeled by recurrence relations.  

A recurrence relation is an equation which is defined in terms of itself.

Why are recurrences good things?

  1. Many natural functions are easily expressed as recurrences:

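    $$a_n = a_{n-1} + 1,\ a_1 = 1 \longrightarrow a_n = n$$

    $$a_n = 2 a_{n-1},\ a_1 = 1 \longrightarrow a_n = 2^{n-1}$$

    $$a_n = n a_{n-1},\ a_1 = 1 \longrightarrow a_n = n!$$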

  2. It is often easy to find a recurrence as the solution of a counting problem. Solving the recurrence can be done for many special cases as we will see, although it is somewhat of an art.

Listening To Part 3-4

Recursion is Mathematical Induction!

In both, we have general and boundary conditions, with the general condition breaking the problem into smaller and smaller pieces.   

The initial or boundary condition terminates the recursion.

As we will see, induction provides a useful tool to solve recurrences - guess a solution and prove it by induction.

$$T(n) = 2T(n-1) + 1, \quad T(0) = 0$$

n      0  1  2  3  4   5   6   7
T(n)   0  1  3  7  15  31  63  127

Guess what the solution is?

Prove $T(n) = 2^n - 1$ by induction:

  1. Show that the basis is true: $T(0) = 2^0 - 1 = 0$.
  2. Now assume true for $T(n-1)$.
  3. Using this assumption show:

    $$T(n) = 2T(n-1) + 1 = 2(2^{n-1} - 1) + 1 = 2^n - 1$$

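A guess like this can be sanity-checked numerically before proving it. The sketch below assumes the recurrence behind the table is T(n) = 2T(n-1) + 1 with T(0) = 0, which reproduces the tabulated values.

```python
def T(n):
    """T(n) = 2*T(n-1) + 1 with T(0) = 0, evaluated iteratively."""
    value = 0
    for _ in range(n):
        value = 2 * value + 1
    return value

print([T(n) for n in range(8)])                    # the row of the table
print(all(T(n) == 2 ** n - 1 for n in range(64)))  # the guessed closed form
```

A check like this proves nothing for all n, but a mismatch would show immediately that the guess is wrong and save a doomed induction attempt.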

Listening To Part 3-5

Solving Recurrences

No general procedure for solving recurrence relations is known, which is why it is an art. My approach is:  

Realize that linear, finite history, constant coefficient recurrences always can be solved

Check out any combinatorics or differential equations book for a procedure.

Consider $a_n = 2a_{n-1} + a_{n-2} + 1$, $a_1 = 1$, $a_2 = 1$

It has history = 2, degree = 1, and coefficients of 2 and 1. Thus it can be solved mechanically! Proceed:

displaymath13559

Systems like Mathematica and Maple have packages for doing this.   
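As an illustration of the mechanical procedure, here is a worked sketch for a recurrence with history 2 and coefficients 2 and 1. It assumes the recurrence is $a_n = 2a_{n-1} + a_{n-2} + 1$ with $a_1 = a_2 = 1$; if the intended recurrence or initial values differ, only the constants change.

```latex
% Assumed recurrence: a_n = 2 a_{n-1} + a_{n-2} + 1, with a_1 = a_2 = 1.
\begin{align*}
\text{Particular solution: } & a = 2a + a + 1 \;\Rightarrow\; a = -\tfrac{1}{2} \\
\text{Characteristic equation: } & r^2 - 2r - 1 = 0 \;\Rightarrow\; r = 1 \pm \sqrt{2} \\
\text{General solution: } & a_n = -\tfrac{1}{2} + A(1+\sqrt{2})^n + B(1-\sqrt{2})^n \\
\text{Fitting } a_1 = a_2 = 1: \; & a_n = -\tfrac{1}{2}
  + \tfrac{3}{4}\left[(1+\sqrt{2})^{n-1} + (1-\sqrt{2})^{n-1}\right]
\end{align*}
```

As a check, the closed form gives $a_3 = 4$ and $a_4 = 10$, which agrees with unrolling the recurrence directly.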

Listening To Part 3-6

Guess a solution and prove by induction

To guess the solution, play around with small values for insight.

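Note that you can do inductive proofs with big-O notation; just be sure you use it right.

Example: $T(n) \le 2T(\lfloor n/2 \rfloor) + n$.

Show that $T(n) \le c \cdot n \lg n$ for large enough c and n. Assume that it is true for n/2, then

$$\begin{aligned}
T(n) &\le 2c \lfloor n/2 \rfloor \lg(\lfloor n/2 \rfloor) + n \\
&\le c n \lg(n/2) + n && \text{dropping floors makes it bigger} \\
&= c n (\lg n - \lg 2) + n && \text{log of division} \\
&= c n \lg n - c n + n \\
&\le c n \lg n && \text{whenever } c \ge 1
\end{aligned}$$

Starting with basis cases T(2)=4, T(3)=5 lets us complete the proof for $c \ge 2$.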

Listening To Part 3-7

Try backsubstituting until you know what is going on

Also known as the iteration method. Plug the recurrence back into itself until you see a pattern.  

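Example: $T(n) = 3T(\lfloor n/4 \rfloor) + n$.

Try backsubstituting:

$$\begin{aligned}
T(n) &= n + 3T(\lfloor n/4 \rfloor) \\
&= n + 3(\lfloor n/4 \rfloor + 3T(\lfloor n/16 \rfloor)) \\
&= n + 3\lfloor n/4 \rfloor + 9(\lfloor n/16 \rfloor + 3T(\lfloor n/64 \rfloor)) \\
&= n + 3\lfloor n/4 \rfloor + 9\lfloor n/16 \rfloor + 27T(\lfloor n/64 \rfloor)
\end{aligned}$$

The $(3/4)^i n$ term should now be obvious.

Although there are only $\log_4 n$ terms before we get to T(1), it doesn't hurt to sum them all since this is a rapidly converging geometric series:

$$T(n) \le n \sum_{i=0}^{\infty} \left(\frac{3}{4}\right)^i + \Theta(n^{\log_4 3})$$

$$T(n) = 4n + o(n) = O(n)$$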
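The linear bound can be checked by evaluating the recurrence directly. The sketch assumes the example recurrence is T(n) = 3T(floor(n/4)) + n and picks T(0) = 0 as the base case; that base case is an assumption made here, not something stated in the notes.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """T(n) = 3*T(floor(n/4)) + n, with T(0) = 0 as an assumed base case."""
    if n == 0:
        return 0
    return 3 * T(n // 4) + n

for n in (10, 100, 1000, 10_000, 100_000):
    print(n, T(n), T(n) / n)    # the ratio stays below the predicted constant 4
```

The ratio T(n)/n never exceeds 4, the sum of the geometric series 1 + 3/4 + 9/16 + ..., which is the constant hidden in the O(n) bound.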

Listening To Part 3-8

Recursion Trees

Drawing a picture of the backsubstitution process gives you an idea of what is going on.

We must keep track of two things - (1) the size of the remaining argument to the recurrence, and (2) the additive stuff to be accumulated during this call.

Example: $T(n) = 2T(n/2) + n^2$

[Figure: recursion tree for $T(n) = 2T(n/2) + n^2$]
The remaining arguments are on the left, the additive terms on the right.

Although this tree has height $\lg n$, the total sum at each level decreases geometrically, so:

$$T(n) = \sum_{i=0}^{\infty} \frac{n^2}{2^i} = n^2 \sum_{i=0}^{\infty} \frac{1}{2^i} = \Theta(n^2)$$

The recursion tree framework made this much easier to see than with algebraic backsubstitution.

Listening To Part 3-9

See if you can use the Master theorem to provide an instant asymptotic solution

The Master Theorem:   Let $a \ge 1$ and b>1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence

$$T(n) = aT(n/b) + f(n)$$

where we interpret n/b as $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$. Then T(n) can be bounded asymptotically as follows:

  1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
  2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
  3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \le c f(n)$ for some constant c<1, and all sufficiently large n, then $T(n) = \Theta(f(n))$.
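For the common special case where the driving function is a plain power, f(n) = n^k, picking the case is a comparison of k against log_b a. The helper below is a hypothetical sketch restricted to that special case; it does not handle driving functions with logarithmic factors or other shapes.

```python
import math

def master_theorem(a, b, k):
    """Classify T(n) = a*T(n/b) + n**k; return (case, asymptotic bound)."""
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("need a >= 1, b > 1 and k >= 0")
    critical = math.log(a, b)            # the exponent log_b a
    if math.isclose(k, critical, abs_tol=1e-12):
        return 2, "Theta(n^%g lg n)" % k
    if k < critical:
        return 1, "Theta(n^%.3g)" % critical
    # k > log_b a: the regularity condition holds for a pure power because
    # a*(n/b)**k = (a / b**k) * n**k and a / b**k < 1.
    return 3, "Theta(n^%g)" % k

print(master_theorem(2, 2, 1))   # T(n) = 2T(n/2) + n
print(master_theorem(7, 2, 2))   # T(n) = 7T(n/2) + n^2
print(master_theorem(2, 2, 2))   # T(n) = 2T(n/2) + n^2
```

The three calls land in cases 2, 1 and 3 respectively; the tolerance in the equality test is there because log_b a is computed in floating point.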

Listening To Part 3-10

Examples of the Master Theorem

Which case of the Master Theorem applies?

Listening To Part 3-11

Why should the Master Theorem be true?

Consider T(n) = a T(n/b) + f(n).

Suppose f(n) is small enough

Say f(n)=0, ie. T(n) = a T(n/b).

Then we have a recursion tree where the only contribution is at the leaves.  

There will be $\log_b n$ levels, with $a^l$ leaves at level l.

$$T(n) = a^{\log_b n} = n^{\log_b a}$$

So long as f(n) is small enough that it is dwarfed by this, we have case 1 of the Master Theorem!

Listening To Part 3-12

Suppose f(n) is large enough

Draw the recursion tree for T(n) = a T(n/b) + f(n).

If f(n) is a big enough function, the one top call can be bigger than the sum of all the little calls.

Example: tex2html_wrap_inline13766 . In fact this holds unless tex2html_wrap_inline13768 !

In case 3 of the Master Theorem, the additive term dominates.

In case 2, both parts contribute equally, which is why the log pops up. It is (usually) what we want to have happen in a divide and conquer algorithm.

Listening To Part 3-13

Famous Algorithms and their Recurrence

Matrix Multiplication

The standard matrix multiplication algorithm for two $n \times n$ matrices is $O(n^3)$.

Strassen discovered a divide-and-conquer algorithm which takes $T(n) = 7T(n/2) + O(n^2)$ time.

Since $O(n^{\lg 7})$ dwarfs $O(n^2)$, case 1 of the master theorem applies and $T(n) = O(n^{2.81})$.

This has been "improved" by more and more complicated recurrences until the current best is $O(n^{2.38})$.

Listening To Part 3-14

Polygon Triangulation

Given a polygon in the plane, add diagonals so that each face is a triangle. None of the diagonals are allowed to cross.

Triangulation is an important first step in many geometric algorithms.

The simplest algorithm might be to try each pair of points and check if they see each other. If so, add the diagonal and recur on both halves, for a total of $O(n^3)$.

However, Chazelle gave an algorithm which runs in $T(n) = 2T(n/2) + O(\sqrt{n})$ time. Since $n^{1/2} = O(n^{1-\epsilon})$, by case 1 of the Master Theorem, Chazelle's algorithm is linear, ie. T(n) = O(n).

Sorting

The classic divide and conquer recurrence is Mergesort's T(n) = 2 T(n/2) + O(n), which divides the data into equal-sized halves and spends linear time merging the halves after they are sorted.  

Since $n = O(n^{\log_2 2}) = O(n)$ but not $n = O(n^{1-\epsilon})$, Case 2 of the Master Theorem applies and $T(n) = O(n \log n)$.

In case 2, the divide and merge steps balance out perfectly, as we usually hope for from a divide-and-conquer algorithm.
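The structure behind the recurrence is visible in a direct implementation. This is a minimal Python sketch of mergesort, written for clarity rather than speed; the comments mark which lines account for the two T(n/2) terms and the O(n) term.

```python
def merge_sort(items):
    """Return a new sorted list; the running time obeys T(n) = 2T(n/2) + O(n)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])     # T(n/2)
    right = merge_sort(items[mid:])    # T(n/2)

    # Linear-time merge of the two sorted halves: the O(n) term.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])            # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Using `<=` in the merge keeps equal keys in their original order, so this version is stable.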


Approaches to Algorithm Design

Incremental

Job is partly done - do a little more, repeat until done.  

A good example of this approach is insertion sort

Divide-and-Conquer

A recursive technique  

A good example of this approach is Mergesort.



Lecture 4 - heapsort

Listening To Part 4-1

4.2-2 Argue the solution to

$$T(n) = T(n/3) + T(2n/3) + n$$

is $\Omega(n \lg n)$ by appealing to the recursion tree.


Draw the recursion tree.

How many levels does the tree have? This is equal to the longest path from the root to a leaf.

The shortest path to a leaf occurs when we take the n/3 branch each time. The height k is given by $n (1/3)^k = 1$, meaning $n = 3^k$ or $k = \log_3 n$.

The longest path to a leaf occurs when we take the 2n/3 branch each time. The height k is given by $n (2/3)^k = 1$, meaning $n = (3/2)^k$ or $k = \log_{3/2} n$.

The problem asks to show that $T(n) = \Omega(n \lg n)$, meaning we are looking for a lower bound.

On any full level, the additive terms sum to n. There are $\log_3 n$ full levels. Thus $T(n) \ge n \log_3 n = \Omega(n \lg n)$.

Listening To Part 4-2

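4.2-4 Use iteration to solve T(n) = T(n-a) + T(a) + n, where $a \ge 1$ is a constant.


Note iteration is backsubstitution.

$$\begin{aligned}
T(n) &= T(n-a) + T(a) + n \\
&= (T(n-2a) + T(a) + (n-a)) + T(a) + n \\
&= (T(n-3a) + T(a) + (n-2a)) + 2T(a) + (n-a) + n \\
&\approx \sum_{i=0}^{n/a} T(a) + \sum_{i=0}^{n/a} (n - ia) \\
&\approx \frac{n}{a} T(a) + \frac{n^2}{2a} = \Theta(n^2)
\end{aligned}$$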

Listening To Part 4-3

Why don't CS profs ever stop talking about sorting?!

  1. Computers spend more time sorting than anything else, historically 25% on mainframes.    
  2. Sorting is the best studied problem in computer science, with a variety of different algorithms known.
  3. Most of the interesting ideas we will encounter in the course can be taught in the context of sorting, such as divide-and-conquer, randomized algorithms, and lower bounds.

You should have seen most of the algorithms - we will concentrate on the analysis.

Listening To Part 4-4

Applications of Sorting

One reason why sorting is so important is that once a set of items is sorted, many other problems become easy.  

Searching

Binary search lets you test whether an item is in a dictionary in $O(\lg n)$ time.

Speeding up searching is perhaps the most important application of sorting.

Closest pair

Given n numbers, find the pair which are closest to each other.  

Once the numbers are sorted, the closest pair will be next to each other in sorted order, so an O(n) linear scan completes the job.

Listening To Part 4-5

Element uniqueness

Given a set of n items, are they all unique or are there any duplicates?    

Sort them and do a linear scan to check all adjacent pairs.

This is a special case of closest pair above.

Frequency distribution - Mode

Given a set of n items, which element occurs the largest number of times?   

Sort them and do a linear scan to measure the length of all adjacent runs.

Median and Selection

What is the kth largest item in the set?   

Once the keys are placed in sorted order in an array, the kth largest can be found in constant time by simply looking in the kth position of the array.
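Each of these applications becomes a short scan once the data is sorted. The sketch below shows them in Python; the function names are chosen here for illustration, and each function sorts its own input so that it can be called on unsorted data.

```python
from itertools import groupby

def closest_pair(numbers):
    """Pair of values with the smallest difference (needs at least two numbers).
    After sorting, the closest pair must be adjacent."""
    s = sorted(numbers)
    return min(zip(s, s[1:]), key=lambda pair: pair[1] - pair[0])

def all_unique(items):
    """True if no value repeats: duplicates are adjacent after sorting."""
    s = sorted(items)
    return all(x != y for x, y in zip(s, s[1:]))

def mode(items):
    """Most frequent value: the longest run of equal neighbors after sorting."""
    runs = ((len(list(group)), value) for value, group in groupby(sorted(items)))
    return max(runs, key=lambda run: run[0])[1]

def kth_largest(items, k):
    """k-th largest value (k = 1 is the maximum), read off the sorted order."""
    return sorted(items, reverse=True)[k - 1]

print(closest_pair([10, 3, 7, 8, 20]))
print(all_unique([3, 1, 3]))
print(mode([1, 2, 2, 3, 2, 1]))
print(kth_largest([5, 1, 9, 3], 2))
```

In every case the sort dominates the cost: the work after sorting is a single linear pass or a constant-time lookup.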

Listening To Part 4-6

Convex hulls

Given n points in two dimensions, find the smallest area polygon which contains them all.  

The convex hull is like a rubber band stretched over the points.

Convex hulls are the most important building block for more sophisticated geometric algorithms.  

Once you have the points sorted by x-coordinate, they can be inserted from left to right into the hull, since the rightmost point is always on the boundary.

Without sorting the points, we would have to check whether the point is inside or outside the current hull.

Adding a new rightmost point might cause others to be deleted.

Huffman codes

If you are trying to minimize the amount of space a text file is taking up, it is silly to assign each letter a code of the same length (i.e., one byte).

Example: e is more common than q, a is more common than z.

If we were storing English text, we would want a and e to have shorter codes than q and z.

To design the best possible code, the first and most important step is to sort the characters in order of frequency of use.

Listening to Part 4-7
Character   Frequency   Code
f           5           1100
e           9           1101
c           12          100
b           13          101
d           16          111
a           45          0
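A hedged sketch of how such a code can be built. It uses the standard Huffman construction with Python's `heapq` as the priority queue instead of an explicit sort; the function name and the tie-breaking counter are assumptions, and symbols are assumed to be strings.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Map each symbol to a bit string, given a non-empty dict symbol -> frequency."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(weight, next(tiebreak), sym) for sym, weight in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes
```

Run on the frequencies in the table, the rarest characters f and e receive 4-bit codes while a receives a single bit.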

Listening to Part 4-8

Selection Sort

A simple $O(n^2)$ sorting algorithm is selection sort.

Sweep through all the elements to find the smallest item, then the smallest remaining item, etc. until the array is sorted.


Selection-sort(A)

    for i = 1 to n

        for j = i+1 to n

            if (A[j] < A[i]) then swap(A[i],A[j])

It is clear this algorithm must be correct from an inductive argument, since after the ith pass of the outer loop the ith element is in its correct position.

It is clear that this algorithm takes $O(n^2)$ time.

It is clear that the analysis of this algorithm cannot be improved because there will be n/2 iterations which will require at least n/2 comparisons each, so at least $n^2/4$ comparisons will be made. More careful analysis doubles this.

Thus selection sort runs in $\Theta(n^2)$ time.
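A direct Python transcription of the pseudocode, assuming a 0-indexed list. The comparison counter is an addition for illustration, so the quadratic count can be observed.

```python
def selection_sort(a):
    """Sort list a in place and return the number of comparisons made."""
    n = len(a)
    comparisons = 0
    for i in range(n):
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[i]:
                a[i], a[j] = a[j], a[i]
    return comparisons
```

The count is always n(n-1)/2, whatever the input order, which is about twice the $n^2/4$ estimate.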

Listening to Part 4-9

Binary Heaps

A binary heap is defined to be a binary tree with a key in each node such that:  

  1. All leaves are on, at most, two adjacent levels.
  2. All leaves on the lowest level occur to the left, and all levels except the lowest one are completely filled.
  3. The key in the root is $\geq$ the keys of all its children, and the left and right subtrees are again binary heaps.

Conditions 1 and 2 specify the shape of the tree, and condition 3 the labeling of the tree.

Listening to Part 4-10

The ancestor relation in a heap defines a partial order on its elements, which means it is reflexive, anti-symmetric, and transitive.  

  1. Reflexive: x is an ancestor of itself.
  2. Anti-symmetric: if x is an ancestor of y and y is an ancestor of x, then x=y.
  3. Transitive: if x is an ancestor of y and y is an ancestor of z, x is an ancestor of z.

Partial orders can be used to model hierarchies with incomplete information or equal-valued elements. One of my favorite games with my parents is fleshing out the partial order of ``big'' old-time movie stars.

The partial order defined by the heap structure is weaker than that of the total order, which explains

  1. Why it is easier to build.
  2. Why it is less useful than sorting (but still very important).

Listening to Part 4-11

Constructing Heaps

Heaps can be constructed incrementally, by inserting new elements into the left-most open spot in the array.  

If the new element is greater than its parent, swap their positions and recur.

Since at each step, we replace the root of a subtree by a larger one, we preserve the heap order.

Since all but the last level are always filled, the height h of an n-element heap is bounded because:

$$2^h \leq n \leq \sum_{i=0}^{h} 2^i = 2^{h+1} - 1$$

so $h = \lfloor \lg n \rfloor$.

Doing n such insertions takes $\Theta(n \lg n)$, since the last n/2 insertions require $O(\lg n)$ time each.

Listening to Part 4-12

Heapify

The bottom up insertion algorithm gives a good way to build a heap, but Robert Floyd found a better way, using a merge procedure called heapify.  

Given two heaps and a fresh element, they can be merged into one by making the new one the root and trickling down.


Build-heap(A)

    n = |A|

    for i = floor(n/2) downto 1 do

        Heapify(A,i)


Heapify(A,i)

    left = 2i

    right = 2i+1

    if (left <= n) and (A[left] > A[i]) then

        max = left

    else max = i

    if (right <= n) and (A[right] > A[max]) then

        max = right

    if (max != i) then

        swap(A[i],A[max])

        Heapify(A,max)
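A minimal Python sketch of the two routines, assuming 0-indexed lists (so the children of position i are 2i+1 and 2i+2 rather than 2i and 2i+1) and an explicit heap-size parameter `n`, which the pseudocode leaves implicit:

```python
def heapify(a, i, n):
    """Trickle a[i] down until the subtree rooted at i is a max-heap.

    Assumes both child subtrees are already heaps; only a[0:n] is the heap.
    """
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < n and a[left] > a[largest]:
        largest = left
    if right < n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)

def build_heap(a):
    """Rearrange list a in place into a max-heap, working bottom up."""
    n = len(a)
    # Positions n//2 .. n-1 are leaves, which are already heaps of size 1.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
```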

Rough Analysis of Heapify

Heapify on a subtree containing n nodes takes

$$T(n) = T(2n/3) + O(1)$$

The 2/3 comes from merging heaps whose levels differ by one. The last row could be exactly half filled. Besides, the asymptotic answer won't change so long as the fraction is less than one.

Solve the recurrence using the Master Theorem.

Let a = 1, b = 3/2 and f(n) = 1.

Note that $\Theta(n^{\log_{3/2} 1}) = \Theta(1)$, since $\log_{3/2} 1 = 0$.

Thus Case 2 of the Master Theorem applies, giving $T(n) = \Theta(n^0 \lg n) = \Theta(\lg n)$.


The Master Theorem: Let $a \geq 1$ and b>1 be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence

$$T(n) = a T(n/b) + f(n)$$

where we interpret n/b to mean either $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$. Then T(n) can be bounded asymptotically as follows:

  1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
  2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
  3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \leq c f(n)$ for some constant c<1, and all sufficiently large n, then $T(n) = \Theta(f(n))$.

Listening to Part 4-14

Exact Analysis of Heapify

In fact, building a heap with Heapify performs better than $O(n \lg n)$, because most of the heaps we merge are extremely small.

In a full binary tree on n nodes, there are n/2 nodes which are leaves (i.e. height 0), n/4 nodes which are height 1, n/8 nodes which are height 2, ...

In general, there are at most $\lceil n/2^{h+1} \rceil$ nodes of height h, so the cost of building a heap is:

$$\sum_{h=0}^{\lfloor \lg n \rfloor} \lceil n/2^{h+1} \rceil \, h \leq n \sum_{h=0}^{\lfloor \lg n \rfloor} h/2^h$$

Since this sum is not quite a geometric series, we can't apply the usual identity to get the sum. But it should be clear that the series converges.

Listening to Part 4-15

Proof of Convergence

Series convergence is the ``free lunch'' of algorithm analysis.    

The identity for the sum of a geometric series is

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$$

If we take the derivative of both sides, ...

$$\sum_{k=0}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}$$

Multiplying both sides of the equation by x gives the identity we need:

$$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1-x)^2}$$

Substituting x = 1/2 gives a sum of 2, so Build-heap uses at most 2n comparisons and thus linear time.
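As a quick numerical sanity check (an illustration only, not part of the proof), the partial sums of h/2^h can be computed directly in Python; the helper name is hypothetical:

```python
def heap_build_series(terms):
    """Partial sum of h / 2**h for h = 0 .. terms-1."""
    return sum(h / 2 ** h for h in range(terms))

for terms in (5, 10, 20, 40):
    print(terms, heap_build_series(terms))
```

The partial sums increase toward 2 and never exceed it, in line with x/(1-x)^2 evaluated at x = 1/2.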

Listening to Part 4-16

The Lessons of Heapsort, I

"Are we doing a careful analysis? Might our algorithm be faster than it seems?"

Typically in our analysis, we will say that since we are doing at most x operations of at most y time each, the total time is O(x y).

However, if we overestimate too much, our bound may not be as tight as it should be!

Listening to Part 4-17

Heapsort

Heapify can be used to construct a heap, using the observation that an isolated element forms a heap of size 1.  


Heapsort(A)

    Build-heap(A)

    for i = n downto 1 do

        swap(A[1],A[i])

        n = n - 1

        Heapify(A,1)

If we construct our heap from bottom to top using Heapify, we do not have to do anything with the last n/2 elements.

With the implicit tree defined by array positions, (i.e. the ith position is the parent of the 2ith and (2i+1)st positions) the leaves start out as heaps.

Exchanging the maximum element with the last element and calling heapify repeatedly gives an $O(n \lg n)$ sorting algorithm, named Heapsort.
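A self-contained Python sketch of Heapsort under the same idea, assuming a 0-indexed list. The inner helper `sift_down` plays the role of Heapify, written iteratively; the names are illustrative.

```python
def heapsort(a):
    """Sort list a in place in O(n log n) time using a max-heap."""

    def sift_down(i, n):
        # Trickle a[i] down within the heap a[0:n].
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and a[child] > a[largest]:
                    largest = child
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # build the heap bottom up
        sift_down(i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]  # move the current maximum to its final slot
        sift_down(0, end)  # restore the heap on the remaining prefix
```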


Listening to Part 4-18

The Lessons of Heapsort, II

Always ask yourself, ``Can we use a different data structure?''

Selection sort scans through the entire array, repeatedly finding the smallest remaining element.


For i = 1 to n

    A: Find the smallest of the remaining n-i+1 items.

    B: Pull it out of the array and put it first.

Using arrays or unsorted linked lists as the data structure, operation A takes O(n) time and operation B takes O(1).

Using heaps, both of these operations can be done within $O(\lg n)$ time, balancing the work and achieving a better tradeoff.

Listening to Part 4-19

Priority Queues

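A priority queue is a data structure on sets of keys supporting the following operations:

  1. Insert(S, x) - insert x into set S.
  2. Maximum(S) - return the largest key in S.
  3. Extract-Max(S) - return and remove the largest key in S.

These operations can be easily supported using a heap.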

Listening to Part 4-20

Applications of Priority Queues

Heaps as stacks or queues

  

Both stacks and queues can be simulated by using a heap, when we add a new time field to each item and order the heap according to this time field.

This simulation is not as efficient as a normal stack/queue implementation, but it is a cute demonstration of the flexibility of a priority queue.
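A small Python sketch of the simulation, using the standard `heapq` module. Note that `heapq` is a min-heap, so the queue orders by insertion time and the stack by negated insertion time; the class names are hypothetical.

```python
import heapq
from itertools import count

class HeapQueue:
    """FIFO queue: the item with the smallest timestamp leaves first."""
    def __init__(self):
        self._heap, self._clock = [], count()

    def put(self, item):
        heapq.heappush(self._heap, (next(self._clock), item))

    def get(self):
        return heapq.heappop(self._heap)[1]

class HeapStack:
    """LIFO stack: the item with the largest timestamp leaves first."""
    def __init__(self):
        self._heap, self._clock = [], count()

    def push(self, item):
        heapq.heappush(self._heap, (-next(self._clock), item))

    def pop(self):
        return heapq.heappop(self._heap)[1]
```

Each operation costs O(log n) here, versus O(1) for an ordinary stack or queue.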

Listening to Part 4-21

Discrete Event Simulations

In simulations of airports, parking lots, and jai-alai, priority queues can be used to maintain who goes next.

The stack and queue orders are just special cases of orderings. In real life, certain people cut in line.

Sweepline Algorithms in Computational Geometry

   

In the priority queue, we will store the points we have not yet encountered, ordered by x-coordinate, and push the line forward one step at a time.

Listening to Part 4-22

Greedy Algorithms

In greedy algorithms, we always pick the next thing which locally maximizes our score. By placing all the things in a priority queue and pulling them off in order, we can improve performance over linear search or sorting, particularly if the weights change.  

Example: Sequential strips in triangulations.


 



Lecture 5 - quicksort

Listening to Part 5-1

4-2 Find the missing integer from 0 to n using O(n) ``is bit[j] in A[i]'' queries.


Note - there are a total of $n \lg n$ bits, so we are not allowed to read the entire input!

Also note, the problem is asking us to minimize the number of bits we read. We can spend as much time as we want doing other things provided we don't look at extra bits.

How can we find the last bit of the missing integer?

Ask all the n integers what their last bit is and see whether 0 or 1 is the bit which occurs less often than it is supposed to. That is the last bit of the missing integer!

How can we determine the second-to-last bit?

Ask the $n/2$ numbers which ended with the correct last bit! By analyzing the bit patterns of the numbers from 0 to n which end with this bit, we know how many of them should have a 0 or a 1 in the second-to-last position.

By recurring on the remaining candidate numbers, we get the answer in T(n) = T(n/2) + n = O(n), by the Master Theorem.
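A Python sketch of this strategy, under the assumption that `a` holds every integer in 0..n except one. The inner function `bit` stands in for the ``is bit[j] in A[i]'' query and counts how often it is called; all names are illustrative.

```python
def find_missing(a, n):
    """Return (missing value, number of bit queries used)."""
    queries = 0

    def bit(i, j):
        nonlocal queries
        queries += 1
        return (a[i] >> j) & 1

    candidates = list(range(len(a)))  # indices consistent with the bits found so far
    missing, j = 0, 0
    while (1 << j) <= n:
        # Count how many values in 0..n with the known low bits have bit j = 0 or 1.
        # This costs time but no queries.
        expected = [0, 0]
        for v in range(missing, n + 1, 1 << j):
            expected[(v >> j) & 1] += 1
        groups = ([], [])
        for i in candidates:
            groups[bit(i, j)].append(i)
        # The bit value that shows up one time too few belongs to the missing number.
        b = 0 if len(groups[0]) < expected[0] else 1
        missing |= b << j
        candidates = groups[b]
        j += 1
    return missing, queries
```

The candidate list roughly halves each round, so the query count follows the n + n/2 + n/4 + ... pattern of the recurrence.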

Listening to Part 5-2

Quicksort

Although mergesort is $O(n \lg n)$, it is quite inconvenient for implementation with arrays, since we need space to merge.

In practice, the fastest sorting algorithm is Quicksort, which uses partitioning as its main idea.  

Example: Pivot about 10.

17 12 6 19 23 8 5 10 - before

6 8 5 10 23 19 12 17 - after

Partitioning places all the elements less than the pivot in the left part of the array, and all elements greater than the pivot in the right part of the array. The pivot fits in the slot between them.  

Note that the pivot element ends up in the correct place in the total order!

Listening to Part 5-3

Partitioning the elements

Once we have selected a pivot element, we can partition the array in one linear scan, by maintaining three sections of the array: < pivot, > pivot, and unexplored.

Example: pivot about 10

| 17 12 6 19 23 8 5 | 10

| 5 12 6 19 23 8 | 17

5 | 12 6 19 23 8 | 17

5 | 8 6 19 23 | 12 17

5 8 | 6 19 23 | 12 17

5 8 6 | 19 23 | 12 17

5 8 6 | 23 | 19 12 17

5 8 6 ||23 19 12 17

5 8 6 10 19 12 17 23

As we scan from left to right, we move the left bound to the right when the element is less than the pivot, otherwise we swap it with the rightmost unexplored element and move the right bound one step closer to the left.

Listening to Part 5-4

Since the partitioning step consists of at most n swaps, it takes time linear in the number of keys. But what does it buy us?

  1. The pivot element ends up in the position it retains in the final sorted order.
  2. After a partitioning, no element flops to the other side of the pivot in the final sorted order.

Thus we can sort the elements to the left of the pivot and the right of the pivot independently!

This gives us a recursive sorting algorithm, since we can use the partitioning approach to sort each subproblem.

Listening to Part 5-6

Pseudocode


Sort(A)

    Quicksort(A,1,n)


Quicksort(A, low, high)

    if (low < high)

        pivot-location = Partition(A,low,high)

        Quicksort(A,low, pivot-location - 1)

        Quicksort(A, pivot-location+1, high)


Partition(A,low,high)

    pivot = A[low]

    leftwall = low

    for i = low+1 to high

        if (A[i] < pivot) then

            leftwall = leftwall+1

            swap(A[i],A[leftwall])

    swap(A[low],A[leftwall])

    return leftwall
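The pseudocode translates almost line for line into Python. This sketch assumes 0-indexed lists and makes explicit that Partition returns the final pivot position; the default arguments are a convenience added here.

```python
def partition(a, low, high):
    """Partition a[low..high] around a[low]; return the pivot's final index."""
    pivot = a[low]
    leftwall = low
    for i in range(low + 1, high + 1):
        if a[i] < pivot:
            leftwall += 1
            a[i], a[leftwall] = a[leftwall], a[i]
    a[low], a[leftwall] = a[leftwall], a[low]
    return leftwall

def quicksort(a, low=0, high=None):
    """Sort list a in place."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quicksort(a, low, p - 1)
        quicksort(a, p + 1, high)
```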

Listening to Part 5-7

Best Case for Quicksort

Since each element ultimately ends up in the correct position, the algorithm correctly sorts. But how long does it take?  

The best case for divide-and-conquer algorithms comes when we split the input as evenly as possible. Thus in the best case, each subproblem is of size n/2.

The partition step on each subproblem is linear in its size. Thus the total effort in partitioning the $2^k$ problems of size $n/2^k$ is O(n).

The recursion tree for the best case looks like this:

[Figure: recursion tree for the best case, with each problem split into two halves]

The total partitioning on each level is O(n), and it takes $\lg n$ levels of perfect partitions to get to single element subproblems. When we are down to single elements, the problems are sorted. Thus the total time in the best case is $O(n \lg n)$.

Listening to Part 5-8

Worst Case for Quicksort

Suppose instead our pivot element splits the array as unequally as possible. Thus instead of n/2 elements in the smaller half, we get zero, meaning that the pivot element is the biggest or smallest element in the array.

Now we have n-1 levels, instead of $\lg n$, for a worst case time of $\Theta(n^2)$, since the first n/2 levels each have $\geq n/2$ elements to partition.

Thus the worst case time for Quicksort is worse than Heapsort or Mergesort.

To justify its name, Quicksort had better be good in the average case. Showing this requires some fairly intricate analysis.

The divide and conquer principle applies to real life. If you break a job into pieces, it is best to make the pieces of equal size!

Listening to Part 5-9

Intuition: The Average Case for Quicksort

Suppose we pick the pivot element at random in an array of n keys.

Half the time, the pivot element will be from the center half of the sorted array.

Whenever the pivot element is from positions n/4 to 3n/4, the larger remaining subarray contains at most 3n/4 elements.

If we assume that the pivot element is always in this range, what is the maximum number of partitions we need to get from n elements down to 1 element?

$$(3/4)^l \cdot n = 1 \Rightarrow n = (4/3)^l$$

$$\lg n = l \cdot \lg(4/3)$$

$$l = \frac{\lg n}{\lg(4/3)} = \log_{4/3} n \approx 2.4 \lg n$$

Listening to Part 5-10

What have we shown?

At most $\log_{4/3} n$ levels of decent partitions suffice to sort an array of n elements.

But how often when we pick an arbitrary element as pivot will it generate a decent partition?

Since any number ranked between n/4 and 3n/4 would make a decent pivot, we get one half the time on average.

If we need $\log_{4/3} n$ levels of decent partitions to finish the job, and half of random partitions are decent, then on average the recursion tree to quicksort the array has about $2 \log_{4/3} n$ levels.

Since O(n) work is done partitioning on each level, the average time is $O(n \lg n)$.

More careful analysis shows that the expected number of comparisons is $\approx 1.38 n \lg n$.

Listening to Part 5-11

Average-Case Analysis of Quicksort

To do a precise average-case analysis of quicksort, we formulate a recurrence given the exact expected time T(n):

$$T(n) = \sum_{p=1}^{n} \frac{1}{n} \left( T(p-1) + T(n-p) \right) + n - 1$$

Each possible pivot p is selected with equal probability. The number of comparisons needed to do the partition is n-1.  

We will need one useful fact about the Harmonic numbers $H_n$, namely

$$H_n = \sum_{i=1}^{n} \frac{1}{i} \approx \ln n$$

It is important to understand (1) where the recurrence relation comes from and (2) how the log comes out from the summation. The rest is just messy algebra.

Listening to Part 5-12

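$$T(n) = \sum_{p=1}^{n} \frac{1}{n} \left( T(p-1) + T(n-p) \right) + n - 1$$

$$T(n) = \frac{2}{n} \sum_{p=1}^{n} T(p-1) + n - 1$$

$$n T(n) = 2 \sum_{p=1}^{n} T(p-1) + n(n-1) \quad \text{(multiply by } n \text{)}$$

$$(n-1) T(n-1) = 2 \sum_{p=1}^{n-1} T(p-1) + (n-1)(n-2) \quad \text{(apply to } n-1 \text{)}$$

$$n T(n) - (n-1) T(n-1) = 2 T(n-1) + 2(n-1)$$

Rearranging the terms gives us:

$$\frac{T(n)}{n+1} = \frac{T(n-1)}{n} + \frac{2(n-1)}{n(n+1)}$$

Substituting $a_n = T(n)/(n+1)$ gives

$$a_n = a_{n-1} + \frac{2(n-1)}{n(n+1)} = \sum_{i=1}^{n} \frac{2(i-1)}{i(i+1)}$$

$$a_n \approx 2 \sum_{i=1}^{n} \frac{1}{i+1} \approx 2 \ln n$$

We are really interested in T(n), so

$$T(n) = (n+1) a_n \approx 2(n+1) \ln n \approx 1.38 n \lg n$$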
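The recurrence can also be evaluated numerically as a check on the algebra. This Python sketch assumes T(0) = 0 and the recurrence T(n) = (2/n) (T(0) + ... + T(n-1)) + n - 1; the function name is hypothetical.

```python
import math

def expected_comparisons(n_max):
    """Return a list t with t[n] = T(n) for n = 0 .. n_max."""
    t = [0.0] * (n_max + 1)
    prefix = 0.0  # running sum T(0) + ... + T(n-1)
    for n in range(1, n_max + 1):
        prefix += t[n - 1]
        t[n] = 2.0 * prefix / n + n - 1
    return t

t = expected_comparisons(1000)
print(t[1000], 2 * 1001 * math.log(1000))
```

The exact values stay below the $2(n+1) \ln n$ estimate, since the approximation drops a negative lower-order term that is linear in n, so the ratio approaches 1 only slowly.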

Listening to Part 5-13

What is the Worst Case?

The worst case for Quicksort depends upon how we select our partition or pivot element. If we always select either the first or last element of the subarray, the worst-case occurs when the input is already sorted!

A B D F H J K

B D F H J K

D F H J K

F H J K

H J K

J K

K

Having the worst case occur when the keys are sorted or almost sorted is very bad, since that is likely to be the case in certain applications.

To eliminate this problem, pick a better pivot:

  1. Use the middle element of the subarray as pivot.
  2. Use a random element of the array as the pivot.
  3. Perhaps best of all, take the median of three elements (first, last, middle) as the pivot. Why should we use median instead of the mean?

Whichever of these three rules we use, the worst case remains $O(n^2)$. However, because the worst case is no longer a natural order, it is much less likely to occur.

Listening to Part 5-14

Is Quicksort really faster than Heapsort?

Since Heapsort is $\Theta(n \lg n)$ and selection sort is $\Theta(n^2)$, there is no debate about which will be better for decent-sized files.

But how can we compare two $\Theta(n \lg n)$ algorithms to see which is faster? Using the RAM model and the big Oh notation, we can't!

When Quicksort is implemented well, it is typically 2-3 times faster than mergesort or heapsort. The primary reason is that the operations in the innermost loop are simpler. The best way to see this is to implement both and experiment with different inputs.

Since the difference between the two programs will be limited to a multiplicative constant factor, the details of how you program each algorithm will make a big difference.

If you don't want to believe me when I say Quicksort is faster, I won't argue with you. It is a question whose solution lies outside the tools we are using.

Listening to Part 5-15

Randomization

Suppose you are writing a sorting program, to run on data given to you by your worst enemy. Quicksort is good on average, but bad on certain worst-case instances.  

If you used Quicksort, what kind of data would your enemy give you to run it on? Exactly the worst-case instance, to make you look bad.

But instead of picking the median of three or the first element as pivot, suppose you picked the pivot element at random.

Now your enemy cannot design a worst-case instance to give to you, because no matter which data they give you, you would have the same probability of picking a good pivot!

Randomization is a very important and useful idea. By either picking a random pivot or scrambling the permutation before sorting it, we can say:

``With high probability, randomized quicksort runs in $\Theta(n \lg n)$ time.''

Where before, all we could say is:

``If you give me random input data, quicksort runs in expected $\Theta(n \lg n)$ time.''

Since the time bound now does not depend upon your input distribution, this means that unless we are extremely unlucky (as opposed to ill prepared or unpopular) we will certainly get good performance.

Randomization is a general tool to improve algorithms with bad worst-case but good average-case complexity.

The worst-case is still there, but we almost certainly won't see it.
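A sketch of the random-pivot variant in Python, assuming the standard `random` module and the leftwall-style partition used earlier in these notes. The only change is swapping a randomly chosen element into the pivot slot first.

```python
import random

def randomized_quicksort(a, low=0, high=None):
    """Sort list a in place, choosing each pivot uniformly at random."""
    if high is None:
        high = len(a) - 1
    if low >= high:
        return
    # Move a random element to the front, then partition as usual.
    r = random.randint(low, high)
    a[low], a[r] = a[r], a[low]
    pivot, leftwall = a[low], low
    for i in range(low + 1, high + 1):
        if a[i] < pivot:
            leftwall += 1
            a[i], a[leftwall] = a[leftwall], a[i]
    a[low], a[leftwall] = a[leftwall], a[low]
    randomized_quicksort(a, low, leftwall - 1)
    randomized_quicksort(a, leftwall + 1, high)
```

Already-sorted input, the bad case for a first-element pivot, is now no worse in expectation than any other input.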



Lecture 6 - linear sorting

Listening to Part 6-1

7.1-2: Show that an n-element heap has height $\lfloor \lg n \rfloor$.


Since it is a balanced binary tree, the height of a heap is clearly $O(\lg n)$, but the problem asks for an exact answer.

The height is defined as the number of edges in the longest simple path from the root.

The number of nodes in a complete balanced binary tree of height h is $2^{h+1} - 1$.

Thus the height increases only when $n = 2^k$ for some integer k, or in other words when $\lg n$ is an integer.

Listening to Part 6-2

7.1-5 Is a reverse sorted array a heap?


In a heap, each element is greater than or equal to each of its descendants.

In the array representation of a heap, the children of the ith element are the 2ith and (2i+1)th elements.

If A is sorted in reverse order, then $i \leq j$ implies that $A[i] \geq A[j]$.

Since 2i > i and 2i+1 > i then $A[2i] \leq A[i]$ and $A[2i+1] \leq A[i]$.

Thus by definition A is a heap!
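The argument is easy to check mechanically. A short Python sketch, assuming a 0-indexed list so that the parent of position i is (i-1)//2; the function name is illustrative:

```python
def is_max_heap(a):
    """True if every element is <= its parent in the implicit binary tree."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

The converse fails: `[10, 5, 9, 1]` satisfies the heap property but is not in reverse sorted order.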

Listening to Part 6-3

Can we sort in better than $n \lg n$?

Any comparison-based sorting program can be thought of as defining a decision tree of possible executions.  

Running the same program twice on the same permutation causes it to do exactly the same thing, but running it on different permutations of the same data causes a different sequence of comparisons to be made on each.

Claim: the height of this decision tree is the worst-case complexity of sorting.  

Listening to Part 6-4

Once you believe this, a lower bound on the time complexity of sorting follows easily.  

Since any two different permutations of n elements require a different sequence of steps to sort, there must be at least n! different paths from the root to leaves in the decision tree, i.e., at least n! different leaves in the tree.

Since only binary comparisons (less than or greater than) are used, the decision tree is a binary tree.

Since a binary tree of height h has at most $2^h$ leaves, we know $n! \leq 2^h$, or $h \geq \lg(n!)$.

By inspection $n! > (n/2)^{n/2}$, since the last n/2 terms of the product are each greater than n/2. By Stirling's approximation, a better bound is $n! > (n/e)^n$ where e=2.718.

$$h \geq \lg (n/e)^n = n \lg n - n \lg e = \Omega(n \lg n)$$

Listening to Part 6-5

Non-Comparison-Based Sorting

All the sorting algorithms we have seen assume binary comparisons as the basic primitive, questions of the form ``is x before y?''.

Suppose you were given a deck of playing cards to sort. Most likely you would set up 13 piles and put all cards with the same number in one pile.

A 2 3 4 5 6 7 8 9 10 J Q K

A 2 3 4 5 6 7 8 9 10 J Q K

A 2 3 4 5 6 7 8 9 10 J Q K

A 2 3 4 5 6 7 8 9 10 J Q K

With only a constant number of cards left in each pile, you can use insertion sort to order by suit and concatenate everything together.

If we could find the correct pile for each card in constant time, and each pile gets O(1) cards, this algorithm takes O(n) time.

Listening to Part 6-6

Bucketsort

Suppose we are sorting n numbers from 1 to m, where we know the numbers are approximately uniformly distributed.  

We can set up n buckets, each responsible for an interval of m/n numbers from 1 to m

Given an input number x, it belongs in bucket number $\lceil xn/m \rceil$.

If we use an array of buckets, each item gets mapped to the right bucket in O(1) time.

With uniformly distributed keys, the expected number of items per bucket is 1. Thus sorting each bucket takes O(1) time!

The total effort of bucketing, sorting buckets, and concatenating the sorted buckets together is O(n).
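A hedged Python sketch of bucketsort for integer keys in 1..m. The bucket index used here, ceil(xn/m) shifted to 0-based, is one reasonable mapping and an assumption of this example; each bucket is sorted with Python's built-in sort for brevity.

```python
def bucketsort(keys, m):
    """Return the keys (integers in 1..m) in sorted order using n buckets."""
    n = len(keys)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for x in keys:
        index = (x * n + m - 1) // m - 1  # ceil(x*n/m) - 1, in the range 0..n-1
        buckets[index].append(x)
    result = []
    for bucket in buckets:
        bucket.sort()  # expected O(1) work per bucket if keys are uniform
        result.extend(bucket)
    return result
```

The output is correct for any input, but the linear running time depends on the keys being spread roughly evenly. If every key is 1, all of them land in a single bucket and the bucketing step accomplishes nothing.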

What happened to our $\Omega(n \lg n)$ lower bound?

Listening to Part 6-7

We can use bucketsort effectively whenever we understand the distribution of the data.

However, bad things happen when we assume the wrong distribution.

Suppose in the previous example all the keys happened to be 1. After the bucketing phase, we have:

[Figure: all n keys in the first bucket, with the remaining buckets empty]

We spent linear time distributing our items into buckets and learned nothing. Perhaps we could split the big bucket recursively, but it is not certain that we will ever win unless we understand the distribution.

Problems like this are why we worry about the worst-case performance of algorithms!

Such distribution techniques can be used on strings instead of just numbers. The buckets will correspond to letter ranges instead of just number ranges.

The worst case ``shouldn't'' happen if we understand the distribution of our data.

Listening to Part 6-8

Real World Distributions

Consider the distribution of names in a telephone book.  

Either make sure you understand your data, or use a good worst-case or randomized algorithm!

The Shifflett's of Charlottesville

For comparison, note that there are seven Shifflett's (of various spellings) in the 1000 page Manhattan telephone directory.  

Listening to Part 6-10

Rules for Algorithm Design

The secret to successful algorithm design, and problem solving in general, is to make sure you ask the right questions. Below, I give a possible series of questions for you to ask yourself as you try to solve difficult algorithm design problems:    

  1. Do I really understand the problem?

    1. What exactly does the input consist of?
    2. What exactly are the desired results or output?
    3. Can I construct some examples small enough to solve by hand? What happens when I solve them?
    4. Are you trying to solve a numerical problem? A graph algorithm problem? A geometric problem? A string problem? A set problem? Might your problem be formulated in more than one way? Which formulation seems easiest?

  2. Can I find a simple algorithm for the problem?

    1. Can I solve my problem exactly by searching all subsets or arrangements and picking the best one?

      1. If so, why am I sure that this algorithm always gives the correct answer?
      2. How do I measure the quality of a solution once I construct it?

        Listening to Part 6-11

      3. Does this simple, slow solution run in polynomial or exponential time?
      4. If I can't find a slow, guaranteed correct algorithm, am I sure that my problem is well defined enough to permit a solution?
    2. Can I solve my problem by repeatedly trying some heuristic rule, like picking the biggest item first? The smallest item first? A random item first?
      1. If so, on what types of inputs does this heuristic rule work well? Do these correspond to the types of inputs that might arise in the application?
      2. On what types of inputs does this heuristic rule work badly? If no such examples can be found, can I show that in fact it always works well?
      3. How fast does my heuristic rule com