Discrete Mathematics with Proof
Eric Gossett
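Fundamental Boolean Algebra Properties

Idempotence: $x + x = x$; $x \cdot x = x$
Domination: $x + 1 = 1$; $x \cdot 0 = 0$
Associativity: $(x + y) + z = x + (y + z)$; $(x \cdot y) \cdot z = x \cdot (y \cdot z)$
De Morgan's Laws: $\overline{x + y} = \overline{x} \cdot \overline{y}$; $\overline{x \cdot y} = \overline{x} + \overline{y}$
Involution: $\overline{\overline{x}} = x$
Absorption: $x \cdot (x + y) = x$; $x + x \cdot y = x$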
General Proof Strategies

If the assertion... | Then try...
claims something is true for all integers $n > n_0$ | mathematical induction
is stated explicitly or implicitly as an implication | direct; indirect; contradiction
contains an existential quantifier | a constructive proof; a non-constructive proof
contains a universal quantifier | finding a counter-example; the choose method
contains the phrase "if and only if" | to prove the two implications separately; to produce a sequence of equivalent statements linking the two sides of the biconditional
is stated as an equivalence | to look for a complete set of implications that are relatively easy to prove
can be easily split into a collection of independent assertions | proof by cases
is an implication with a true conclusion | trivial proof
is an implication with a false hypothesis | vacuous proof
is about membership in a set | direct proof: verify that the element satisfies the set membership requirements
asserts one set is a subset of another | to show that a generic element of the first set is also a member of the second set
asserts the equality of two sets | to show that each set is a subset of the other; to use a sequence of reversible statements with the fundamental set properties and other theorems
Discrete Mathematics with Proof
Pearson Education, Inc., Upper Saddle River, New Jersey 07458
R is true and $R \to P$ is true (I can't teach if there are no students). Using the result in Quick Check 2.6 again, we see that $P \leftrightarrow R$ is true. That is, the statement "I am a teacher if and only if I have students" is true.
2.3.6 Operator Precedence

You are familiar with the need for an agreement about precedence rules [26] for the standard arithmetic operators $+$, $-$, $\cdot$, $\div$. Thus, $3 + 4 \cdot 5$ has traditionally been given the value 23 rather than the value 35. It is similarly convenient to establish a mathematical convention for precedence of logic operators. The common agreement is to evaluate expressions inside parentheses first. Negation is done next (changing the logic value of the logic variable or expression in parentheses to its immediate right). The operators AND, OR, and XOR are applied next, and finally, implication and the biconditional are applied. Table 2.15 summarizes this information. Operators near the top of the table are applied before operators near the bottom [27].

TABLE 2.15 Logic operator precedence
Higher Precedence
$\neg$
$\land$, $\lor$, $\oplus$
$\to$, $\leftrightarrow$
Lower Precedence

For example, the expression
$(A \lor \neg B) \land C \to ((D \lor E) \land F) \lor (\neg G \land H)$
is understood as
$((A \lor (\neg B)) \land C) \to (((D \lor E) \land F) \lor ((\neg G) \land H))$.

[26] You may know this as "order of operations."
[27] Many computer languages carry these precedence rules one step further. The logic operators within a row of the table may be applied in a left-to-right order in the absence of parentheses.
You may use parentheses to change the normal order. For example, $A \land (B \to C)$ causes the implication to be applied before the AND. It is often desirable to add extra parentheses to make an expression easier to read. For example, $A \to \neg B$ may be easier to read if it is written $A \to (\neg B)$.
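The effect of the parentheses can be checked by brute force. The following is a minimal sketch, not part of the text: the helper names `implies`, `and_of_implication`, and `implication_of_and` are illustrative assumptions. It lists every truth assignment on which $A \land (B \to C)$ and the default reading $(A \land B) \to C$ disagree.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

def and_of_implication(a, b, c):
    """A AND (B -> C): the parentheses force the implication first."""
    return a and implies(b, c)

def implication_of_and(a, b, c):
    """(A AND B) -> C: the default precedence applies AND first."""
    return implies(a and b, c)

# Collect every truth assignment on which the two readings disagree.
differing = [
    (a, b, c)
    for a, b, c in product([True, False], repeat=3)
    if and_of_implication(a, b, c) != implication_of_and(a, b, c)
]

for row in differing:
    print(row)
```

Under these definitions the two readings disagree exactly on the assignments where A is false, so the parentheses are not cosmetic here.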
2.3.7 Logical Equivalence

The binary logic operators $\land$, $\lor$, $\to$, and $\leftrightarrow$ all take two statements and produce another statement. The unary logic operator $\neg$ changes one statement into another statement. There is another operator that plays a different role: that of a meta-operator. The operator $\Leftrightarrow$, which is read "logically equivalent," takes two statements and makes a declaration about whether they are really, at their core, essentially the same statement, or whether they are fundamentally different statements. For example, in Quick Check 2.6, you showed that $(P \to Q) \land (Q \to P)$ and $P \leftrightarrow Q$ are only cosmetically different; for every pair of T/F values for $P$ and $Q$, the two compound statements are assigned the same T/F value. This notion of being essentially the same deserves a name. That definition will be delayed until another useful definition has been made.

DEFINITION 2.16 Tautology, Contradiction, Conditional
A statement is called a tautology if every entry in its truth table is T. A statement is called a contradiction if every entry in its truth table is F. A statement that is neither a tautology nor a contradiction is called a conditional statement [a].
[a] An alternative name is contingency.

The statement $P \lor (\neg P)$ is a tautology (look at its truth table). The statement $P \lor Q$ is a conditional statement because if $P$ and $Q$ are both false, then $P \lor Q$ is also false, but if $P$ and $Q$ are both true, then $P \lor Q$ is true.

DEFINITION 2.17 Logical Equivalence
Two statements, $A$ and $B$, are called logically equivalent if and only if $A \leftrightarrow B$ is a tautology. Logical equivalence is denoted by the meta-operator $\Leftrightarrow$.

The definition of logical equivalence can also be stated as $A \Leftrightarrow B$ if and only if $A$ and $B$ have the same truth table. For example,
$[(P \to Q) \land (Q \to P)] \Leftrightarrow [P \leftrightarrow Q]$
because $[(P \to Q) \land (Q \to P)] \leftrightarrow [P \leftrightarrow Q]$ is a tautology (think carefully about this sentence). Appendix C.2 contains logic puzzles related to implication, biconditional, and equivalence.
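Definitions 2.16 and 2.17 translate directly into a truth-table check. The sketch below is only an illustration: the function names (`truth_column`, `classify`, `logically_equivalent`) are assumptions, and statements are modeled as Python functions of Boolean arguments.

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

def iff(p, q):
    """Biconditional: true when both sides have the same truth value."""
    return p == q

def truth_column(statement, num_vars):
    """Final truth-table column, with rows ordered TT, TF, FT, FF."""
    return [statement(*row) for row in product([True, False], repeat=num_vars)]

def classify(statement, num_vars):
    """Apply Definition 2.16 to the final column of the truth table."""
    column = truth_column(statement, num_vars)
    if all(column):
        return "tautology"
    if not any(column):
        return "contradiction"
    return "conditional"

def logically_equivalent(a, b, num_vars):
    """Definition 2.17: A <=> B exactly when A <-> B is a tautology."""
    return classify(lambda *v: iff(a(*v), b(*v)), num_vars) == "tautology"

both_directions = lambda p, q: implies(p, q) and implies(q, p)
biconditional = lambda p, q: iff(p, q)

print(classify(lambda p: p or (not p), 1))                         # tautology
print(classify(lambda p, q: p or q, 2))                            # conditional
print(logically_equivalent(both_directions, biconditional, 2))     # True
```

Modeling statements as functions keeps the check close to the definition: the meta-operator becomes a question about a whole column, not a value in a single row.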
2.3.8 Derived Implications

For every implication, there are three other implications that can be easily derived. As will be seen, the ease of derivation does not necessarily mean that these three derived implications have the same truth value as the original implication. Table 2.16 lists these derived implications.

TABLE 2.16 Derived implications
Symbolic Form | Name
$P \to Q$ | The original implication
$\neg Q \to \neg P$ | The contrapositive
$Q \to P$ | The converse
$\neg P \to \neg Q$ | The inverse

Consider the following three examples.
Easy Derived Implications
Let $P$ be the statement "today is Monday." Let $Q$ be the statement "we have math class today." The derived implications are as follows:

Symbolic Form | Name | English Translation
$P \to Q$ | The original implication | If today is Monday, then we have math class today.
$\neg Q \to \neg P$ | The contrapositive | If we don't have math class today, then today is not Monday.
$Q \to P$ | The converse | If we have math class today, then today is Monday.
$\neg P \to \neg Q$ | The inverse | If today is not Monday, then we do not have math class today.

Suppose that your math class meets on Monday, Wednesday, and Friday. Then under normal conditions, the original implication can be considered a true statement. We would also agree that the contrapositive is a true statement. However, the converse need not be true; we might have math class because it is Wednesday. Similarly, the inverse need not be true (why?). For this example, the biconditional $P \leftrightarrow Q$ [...]

[...]

Symbolic Form | Name | English Translation
$P \to Q$ | The original implication | If a positive integer $n$ is divisible by 2, then $n$ is even.
$\neg Q \to \neg P$ | The contrapositive | If $n$ is not even, then $n$ is not divisible by 2.
$Q \to P$ | The converse | If $n$ is even, then $n$ is divisible by 2.
$\neg P \to \neg Q$ | The inverse | If $n$ is not divisible by 2, then $n$ is not even.

In this example, all four implications are true (because even is defined in terms of divisibility by 2). For this example, the biconditional $P \leftrightarrow Q$ is true.
More Complex Derived Implications
Let $P$ be the statement "I am a man." Let $Q$ be the statement "I am not a mother." The derived implications should be read carefully.

Symbolic Form | Name | English Translation
$P \to Q$ | implication | If I am a man, then I am not a mother.
$\neg Q \to \neg P$ | contrapositive | If I am a mother, then I am not a man.
$Q \to P$ | converse | If I am not a mother, then I am a man.
$\neg P \to \neg Q$ | inverse | If I am not a man, then I am a mother.

In this example, the original implication and the contrapositive are both true implications, while the converse and inverse need not be (I can be a woman who is not a mother).

It can be shown (Exercise 1 in Exercises 2.3.9) that an implication and its contrapositive are logically equivalent statements. Similarly (Exercise 2), the converse and inverse are logically equivalent. Thus the original implication and the contrapositive are either both true or both false. The converse and the inverse are also either both true or both false. However, as these examples demonstrate, knowing that the original implication is true does not automatically mean that the converse (or inverse) is true. The converse always needs to be investigated separately.
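The relationships among the four implications can be confirmed by comparing truth-table columns. This is a small sketch under the assumption that each implication is modeled as a Python function of two Booleans; the dictionary `DERIVED` and the helpers `column` and `equivalent` are illustrative names, not notation from the text.

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

# The original implication and its three derived implications (Table 2.16).
DERIVED = {
    "original":       lambda p, q: implies(p, q),
    "contrapositive": lambda p, q: implies(not q, not p),
    "converse":       lambda p, q: implies(q, p),
    "inverse":        lambda p, q: implies(not p, not q),
}

def column(name):
    """Truth-table column for one implication, rows ordered TT, TF, FT, FF."""
    return [DERIVED[name](p, q) for p, q in product([True, False], repeat=2)]

def equivalent(first, second):
    """Two statements are logically equivalent when their columns match."""
    return column(first) == column(second)

print(equivalent("original", "contrapositive"))  # True
print(equivalent("converse", "inverse"))         # True
print(equivalent("original", "converse"))        # False
```

The last line is the formal counterpart of the warning above: the converse has a different truth table, so it must be investigated separately.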
Quick Check 2.7
1. Write the original implication, the contrapositive, the converse, and the inverse for each of the following statements. Indicate which of the four implications in each set are true.
(a) If I am an alien, then I am from Mars.
(b) If I am not an extrovert, then I have no friends.
2. Use truth tables to show that $[\neg(P \land Q)] \Leftrightarrow [(\neg P) \lor (\neg Q)]$.
2.3.9 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.

1. [D] Use truth tables to show that an implication and its contrapositive are logically equivalent statements.
2. Use truth tables to show that the converse and inverse are logically equivalent.
3. Use truth tables to show that the converse and the original implication are not logically equivalent.
4. Write and identify the derived implications for the following original implications. You may need to modify the original statement so that it reads more like an implication. If so, write the new form at the beginning of your answer.
(a) [D] If $n > 1$, then $\sqrt{n} < n$.
(b) When it rains it pours.
(c) If $V$ is a vector space, then $V$ has a basis.
(d) $(P \land Q) \to (R \lor S)$.
(e) If $M$ is a planar map, then $M$ can be colored with at most four colors.
5. Write and identify the derived implications for the following original implications. You may need to modify the original statement so that it reads more like an implication. If so, write the new form at the beginning of your answer.
(a) [D] On every day that is sunny, we go to the beach.
(b) It is necessary for us to walk seven miles in order to arrive at the cave entrance.
(c) I will attend the banquet only if I am not sick.
(d) Jill goes to class whenever there will be a quiz.
(e) Working 40 hours each week is sufficient for me to pay my bills.
6. Complete the following table by filling in the truth values for $P \land Q$ and $P \to Q$. The first two columns indicate the statements that $P$ and $Q$ represent.

$P$ | $Q$ | $P \land Q$ | $P \to Q$
2 is an integer | 2 is an even integer | |
2 is an integer | 2 is less than 1 | |
2.5 is an integer | 2.5 is less than 3 | |
2.5 is an integer | 2.5 is an even integer | |

7. Which of the following statements are tautologies?
(a) [D] $P \to [(\neg P) \to Q]$
(b) [D] $(P \land Q) \lor Q$
(c) $(P \land Q)$ [...]
(d) $[P \to Q]$ [...] $[\neg(P \land (\neg Q))]$
(e) $[P \to Q]$ [...]
(f) $[P \to Q]$ [...]
(g) $(P \lor Q)$ [...]
8. Which of the following statements are tautologies?
(a) $[(\neg P) \leftrightarrow (\neg Q)] \to [Q \to R]$
(b) $P \to [(\neg Q) \lor R]$
(c) $[(P) \land (P \lor Q)] \to Q$
(d) $[Q \land (P \lor Q)] \to (\neg P)$
9. Determine whether the following pairs represent equivalent statements.
(a) $P$ [...] $(P \land Q)$; $\neg P \lor \neg Q$
(b) $(P \to Q) \lor P$; $(P \lor \neg Q) \land Q$
(c) $(P \land Q) \to P$; $(P \land Q) \to Q$
(d) $\neg(P \land Q)$; $(\neg P) \lor (\neg Q)$
10. Is the following statement a tautology? "If the manna in the wilderness was popcorn or my cat is lazy, and also if the manna in the wilderness was popcorn or my cat isn't lazy, then the manna in the wilderness was popcorn."
11. $P \lor (\neg P)$ and $\neg[P \land (\neg P)]$ are both tautologies. State them in normal English. Do they express the same concept?
12. Which (if any) of the following statements are equivalent to $\neg(P \land Q)$?
(a) $(\neg P) \land (\neg Q)$
(b) $P \lor Q$
(c) $(\neg P) \land Q$
(d) $(\neg P) \lor (Q)$
13. [D] (a) Produce the truth table for the following proposition.
[...]
(b) Is this a tautology? Explain.
14. (a) Produce the truth table for the following proposition.
$(P$ [...] $Q) \to (P \lor Q)$
(b) Is this a tautology? Explain.
15. [D] Let $P$ be the proposition "a man has discovered something he will die for" and let $Q$ be the proposition "he is fit to live." Consider the implication $(\neg P) \to (\neg Q)$: "If a man hasn't discovered something he will die for, then he isn't fit to live" (Martin Luther King, Jr.).
(a) Write the three derived implications (both symbolically and in English).
(b) Assume that the original implication is true. Briefly discuss what we know about the truth of the derived implications.
16. Let $P$ be the proposition "you will forgive another" and let $Q$ be the proposition "you break the bridge over which you must pass." Consider the implication $(\neg P) \to Q$: "If you will not forgive another, then you break the bridge over which you must pass" (George Herbert, adapted).
(a) Write the three derived implications (both symbolically and in English).
(b) Assume that the original implication is true. Briefly discuss what we know about the truth of the derived implications.
17. Let $P$ be the proposition "I won a prize in the raffle" and let $Q$ be the proposition "I had a winning ticket." Consider the implication $P \to Q$: "If I won a prize in the raffle, then I had a winning ticket."
(a) Write the three derived implications (both symbolically and in English).
(b) Assume that the original implication is true. Briefly discuss what we know about the truth of the derived implications.
18. Let $P$ be the proposition "The groundhog sees his shadow" and let $Q$ be the proposition "There will be six more weeks of winter." Consider the implication $P \to Q$: "If the groundhog sees his shadow, then there will be six more weeks of winter."
(a) Write the three derived implications (both symbolically and in English).
(b) Assume that the original implication is true. Briefly discuss what we know about the truth of the derived implications.
19. Let $P$ be the proposition "I live in the United States of America" and let $Q$ be the proposition "I live in the state of Minnesota." Consider the implication $P \to Q$: "If I live in the United States of America, then I live in the state of Minnesota."
(a) Write the three derived implications (both symbolically and in English).
(b) Assume that the original implication is false. Briefly discuss what we know about the truth of the derived implications.
20. Are the propositions $\neg(P \land \neg Q)$ and $\neg P \lor Q$ logically equivalent? (Be sure to give reasons and show your work!)
21. Each statement is either true or false. Identify which case is correct, and then give some justification for your answer.
(a) [D] Suppose the final column in the truth table for a statement contains an F. Then the statement is not a tautology but is a contradiction.
(b) When asserting that an implication is true, we cannot automatically assume that the hypothesis is true, nor can we automatically assume that the conclusion is true.
(c) The logic operator AND has higher precedence than the [...]
(d) If both the original implication, $P \to Q$, and its inverse, $\neg P \to \neg Q$, are true, then all four derived implications are true.
(e) Two statements are logically equivalent if $P \leftrightarrow Q$ is not a conditional statement.
(f) If the hypothesis of an implication is false, then the implication is true, independent of the truth value of the conclusion.
22. (a) Produce the truth table for the following proposition. Use our standard row ordering. Include all intermediate steps in the table.
$[P \land (P \to Q)] \to Q$
(b) Is this a tautology? Explain.
(c) (extra credit) Why might I consider this proposition to be significant?
23. Consider the following requirements to vote: In order to become a qualified voter, you
Version A: must be at least 18 years old and not have been convicted of a felony.
Version B: must not
- be under 18 years old or
- have a felony conviction.
(a) Create propositions for the major assertions of the voter qualifications. Then express each version symbolically. (Notice that versions A and B share common pieces.)
(b) Show that version A and version B are logically equivalent. Produce a tight, logically valid proof, not hand waving.
2.4 Logical Equivalence and Rules of Inference

Recall that a statement is called a tautology if and only if its T/F value is T for all T/F assignments to its component statements. In a truth table, the statement would be represented by the final column and the component statements by the other columns. The statement is a tautology if the final column only has T's. It should be clear that if $P$ and $Q$ are logically equivalent statements, then the statement $(P \leftrightarrow Q)$ is a tautology. For example, consider $\neg(P \land Q)$ and $(\neg P) \lor (\neg Q)$. Because these are equivalent statements, $[\neg(P \land Q)] \leftrightarrow [(\neg P) \lor (\neg Q)]$ is a tautology. This is demonstrated in Table 2.17.

TABLE 2.17 Proving $[\neg(P \land Q)] \leftrightarrow [(\neg P) \lor (\neg Q)]$ is a tautology
$P$ | $Q$ | $\neg(P \land Q)$ | $(\neg P) \lor (\neg Q)$ | $[\neg(P \land Q)] \leftrightarrow [(\neg P) \lor (\neg Q)]$
T | T | F | F | T
T | F | T | T | T
F | T | T | T | T
F | F | T | T | T

Logical equivalences are not the only statements that lead to tautologies. Another example is the statement $P \lor (\neg P)$. With a tautology we can concentrate on the form of the statement; we already know it is true for any combination of T/F values of its component statements. Tautologies provide the "rules" of logic that are used in proofs. If the tautology can be converted to a logical equivalence, it can be used as a substitution rule. If the tautology includes an implication, it is often useful to convert it into a meta-statement called a rule of inference.
DEFINITION 2.18 Inference; Rule of Inference
Let $A$ and $B$ be two statements. Then $B$ may be inferred from $A$, denoted by $A \Rightarrow B$, if $A \to B$ is a tautology. The symbol $\Rightarrow$ is the inference meta-operator. The meta-statement $A \Rightarrow B$ is called a rule of inference.

The key idea is that whenever $A$ has a T in its truth table, so does $B$. Therefore, if $A$ can be verified as true, then $B$ must be true also. In order to utilize these rules, it is reasonable to make the observations contained in the following principles.

The Substitution Principles
Substituting an Equivalent Statement: If $A \Leftrightarrow B$, and $A$ is a component of a statement, $C$, then $B$ may be substituted for $A$ without changing the T/F value of $C$.
Replacing a Logic Variable in a Tautology: If $B$ is a logic variable in a tautology, $C$, and $A$ is any statement, then $A$ may be substituted for every occurrence of $B$ in $C$ and $C$ will still be a tautology.
Using a Rule of Inference: If $A \Rightarrow B$, $A$ evaluates to T, and $A$ is a component of a statement, $C$, then $B$ may be substituted for $A$ without changing the T/F value of $C$.

Using the Substitution Principles
Substituting an Equivalent Statement: Let $A$ be the proposition $[\neg(P \land Q)]$ and let $B$ be $[(\neg P) \lor (\neg Q)]$ (so $A \Leftrightarrow B$). Let $C$ be the statement $[(\neg P) \lor (\neg Q)] \land R$. The first substitution principle asserts that $C$ is logically equivalent to $[\neg(P \land Q)] \land R$. Table 2.18 demonstrates that this is true.
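TABLE 2.18 Substituting an equivalent statement
$P$ | $Q$ | $R$ | $[(\neg P) \lor (\neg Q)] \land R$ | $[\neg(P \land Q)] \land R$
T | T | T | F | F
T | T | F | F | F
T | F | T | T | T
T | F | F | F | F
F | T | T | T | T
F | T | F | F | F
F | F | T | T | T
F | F | F | F | F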
Replacing a Logic Variable in a Tautology: Let $C$ be the tautology $B \lor (\neg B)$. Let $A$ be the conditional statement $P \lor Q$. Then replacing $B$ with $A$ leads to another tautology:
$[P \lor Q] \lor [\neg(P \lor Q)]$.
Table 2.19 provides a confirmation.
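TABLE 2.19 Replacing a logic variable
$B$ | $P$ | $Q$ | $B \lor (\neg B)$ | $[P \lor Q] \lor [\neg(P \lor Q)]$
T | T | T | T | T
T | T | F | T | T
T | F | T | T | T
T | F | F | T | T
F | T | T | T | T
F | T | F | T | T
F | F | T | T | T
F | F | F | T | T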
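The first two substitution principles can be spot-checked mechanically on the statements used in Tables 2.18 and 2.19. This is a rough sketch only; the names `stmt_a`, `stmt_b`, `c_with_a`, `c_with_b`, and `replaced` are hypothetical labels for the statements in the example, and a check on one example is not a proof of the principles.

```python
from itertools import product

def column(statement, num_vars):
    """Final truth-table column, rows ordered T before F for each variable."""
    return [statement(*row) for row in product([True, False], repeat=num_vars)]

# Principle 1 (Table 2.18): A = not(P and Q), B = (not P) or (not Q).
stmt_a = lambda p, q: not (p and q)
stmt_b = lambda p, q: (not p) or (not q)
c_with_b = lambda p, q, r: stmt_b(p, q) and r   # C as originally written
c_with_a = lambda p, q, r: stmt_a(p, q) and r   # C after substituting A for B
same_columns = column(c_with_b, 3) == column(c_with_a, 3)

# Principle 2 (Table 2.19): replace the variable B in B or (not B) by P or Q.
tautology = lambda b: b or (not b)
replaced = lambda p, q: tautology(p or q)
still_tautology = all(column(replaced, 2))

print(same_columns)      # True
print(still_tautology)   # True
```

Composing functions mirrors the substitution itself: `replaced` is literally the tautology with $P \lor Q$ fed in wherever the variable $B$ appeared.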
Using a Rule of Inference: Let $A$ be the statement $[\neg P \land (P \lor Q)]$ and let $B$ be $Q$. Example 2.25 on page 48 will show that $A \to B$ is a tautology. Thus, $A \Rightarrow B$. Let $C$ be the statement $Q \land [\neg P \land (P \lor Q)]$. The third substitution principle asserts that $C$ can be replaced by $Q \land Q$ when $[\neg P \land (P \lor Q)]$ is known to be true. The third row of the last two columns of Table 2.20 demonstrates that this is the case. The first two rows demonstrate that when $[\neg P \land (P \lor Q)]$ is false, the substitution is not necessarily valid.

TABLE 2.20 Using a rule of inference
$P$ | $Q$ | $P \lor Q$ | $\neg P$ | $\neg P \land (P \lor Q)$ | $Q \land [\neg P \land (P \lor Q)]$ | $Q \land Q$
T | T | T | F | F | F | T
T | F | T | F | F | F | F
F | T | T | T | T | T | T
F | F | F | T | F | F | F
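Table 2.20 can be regenerated with a few lines of code, which also makes the restriction in the third principle visible. The following is an illustrative sketch; `stmt_a`, `table`, and the two summary variables are names assumed here, not taken from the text.

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

def stmt_a(p, q):
    """A = (not P) and (P or Q); the text takes B to be Q."""
    return (not p) and (p or q)

rows = list(product([True, False], repeat=2))

# A => Q holds because A -> Q is true in every row.
a_infers_q = all(implies(stmt_a(p, q), q) for p, q in rows)

# Compare C = Q and A with the substituted statement Q and Q, row by row.
table = []
for p, q in rows:
    c_original = q and stmt_a(p, q)
    c_replaced = q and q
    table.append((p, q, stmt_a(p, q), c_original, c_replaced))

# The substitution is safe in every row where A is true ...
safe_when_a_true = all(c == r for _, _, a, c, r in table if a)
# ... but can change the value of C in a row where A is false.
mismatch_when_a_false = [(p, q) for p, q, a, c, r in table if not a and c != r]

print(a_infers_q)             # True
print(safe_when_a_true)       # True
print(mismatch_when_a_false)  # [(True, True)]
```

The single mismatch is the first row of the table, where $A$ is false and $Q$ is true.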
2.4.1 Important Logical Equivalences and Rules of Inference

Some of the more useful logical equivalences and tautologies are presented in Table 2.21. Most are based on intuitively clear ideas (you should discover the idea in each case). They can be formally justified by using truth tables.

TABLE 2.21 Fundamental logical equivalences
Idempotence: $(P \lor P) \Leftrightarrow P$; $(P \land P) \Leftrightarrow P$
Domination: $(P \lor T) \Leftrightarrow T$; $(P \land F) \Leftrightarrow F$
Associativity: $[(P \lor Q) \lor R]$ [...]
Identity: [...]

[...] $P$ is a tautology. It does not mean that $P \land Q$ may always be replaced by $P$. That is, the rule of inference, $(P \land Q) \Rightarrow P$, must be used carefully. Verify the warning in the previous sentence by showing that $P \land Q$ cannot be replaced by $P$ when $Q$ is false.

2. Using the substitution principles and only the fundamental logical equivalences and the logical equivalences and rules of inference for implication and the biconditional, prove that the following are tautologies.
(a) $[P \land (P \to Q)] \to Q$ (the tautology underlying modus ponens)
(b) $[P \to Q]$ [...] $[(P \land (\neg Q)) \to Q]$ (proof by contradiction)
2.4.3 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.

1. Prove that the statement $[(P \lor Q) \land R] \leftrightarrow [P \lor (Q \land R)]$ is not a tautology, even though it looks like an associative law. You may use a truth table if you wish.
2. [D] Write the contrapositive, converse, and inverse of $(P \land Q) \to (R \lor S)$. Use De Morgan's laws to simplify.
3. One version of proof by contradiction (also called reductio ad absurdum) is based on $[P \to Q] \Leftrightarrow [(P \land (\neg Q)) \to (\neg P)]$. Write a few sentences describing the intuitive idea behind this tautology. For extra credit, translate the Latin phrase reductio ad absurdum into English.
4. Use truth tables to prove the underlying tautologies for each of the fundamental logical equivalences. Show the intermediate expressions.
(a) Idempotence
(b) Domination
(c) Associativity
(d) Identity
(e) Commutativity
(f) De Morgan's laws
(g) Distributivity ($\land$ over $\lor$)
(h) Distributivity ($\lor$ over $\land$)
(i) Law of the excluded middle
[...]
(l) Law of addition
[...] Law of simplification
5. Use truth tables to prove the following logical equivalences and rules of inference for implication and the biconditional. Work with the underlying tautologies and show the intermediate expressions.
(a) [D] Implication
(b) Negation of an implication
(c) The biconditional
6. Use truth tables to prove the logical equivalences and rules of inference related to theorems. Work with the underlying tautologies and show the intermediate expressions.
(a) Modus ponens
(b) Law of hypothetical syllogism
(c) Contrapositive
(d) Proof by contradiction
(e) Laws of disjunction
(f) Proof by cases
7. Use the fundamental logical equivalences and the logical equivalences and rules of inference for implication and the biconditional to prove these logical equivalences and rules of inference related to theorems.
(a) Contrapositive
(b) Laws of disjunction (work with the underlying tautologies)
8. Without using truth tables, show that $[(A \lor B) \land (\neg A)] \land [(A \lor B) \land (\neg B)]$ is equivalent to the statement F. Conclude that the original statement is false for all values of $A$ and $B$. (Thus, it is a contradiction, as defined in Definition 2.16.)
9. Show (perhaps using a truth table) that
(a) [D] $[(P \to Q) \land Q] \to P$ is not a tautology (the fallacy of affirming the consequent),
(b) $[(P \to Q) \land (\neg P)] \to (\neg Q)$ is not a tautology (the fallacy of denying the antecedent),
(c) but $[(P \to Q) \land (\neg Q)] \to (\neg P)$ is a tautology.
10. Show that $R$ [...] $(P$ [...] $Q)$ and $(R \land P) \to (R \land Q)$ [...]
11. Prove that each of the following statements is a tautology. Do not use truth tables.
(a) [...]
(b) [D] [...]
(c) [...]
(d) $[(P \lor Q) \land (P \lor (\neg Q))] \to P$
(e) $[P \to (R \to Q)] \to [(P \land R) \to Q]$
(f) $[P \to Q]$ [...] $[(\neg P) \to Q]$ [...]
(g) $[(P \lor Q) \land (P \to R) \land (Q \to R)] \to R$
Hints: Start by associating $(P \to R) \land (Q \to R)$. Derive the subexpression $[(\neg P \land \neg Q) \lor R]$. Look for the law of simplification at the end.
12. Prove that each of the following statements is a tautology. Do not use truth tables.
[...]
13. Consider the following demonstration that
$(P \land Q) \lor (\neg P)$
$\Leftrightarrow P \lor (\neg P)$ (law of simplification)
$\Leftrightarrow T$ (law of the excluded middle).
However, the following truth table indicates that $(P \land Q) \lor (\neg P)$ is not a tautology (the second row contains an F in the final column). Resolve the contradiction.

$P$ | $Q$ | $\neg P$ | $P \land Q$ | $(P \land Q) \lor (\neg P)$
T | T | F | T | T
T | F | F | F | F
F | T | T | F | T
F | F | T | F | T

14. Consider the following demonstration that [...]
[...]

Complement: For every $P \in B$, there exists a unique (by definition of negation) proposition $(\neg P) \in B$ such that
$P \lor (\neg P) \Leftrightarrow T$ (law of the excluded middle)
$P \land (\neg P) \Leftrightarrow F$ (law of contradiction).

Commutativity: For every pair of (not necessarily distinct) propositions $P, Q \in B$,
$P \lor Q \Leftrightarrow Q \lor P$
$P \land Q \Leftrightarrow Q \land P$ (commutativity laws).
[...] 100) $\lor$ ($x < 100$)]
[...]
(c) $\forall x \in A, \exists y \in B, [(\neg M(x, y))$ [...] $(\neg N(x, y))]$
(d) $\forall x \in \mathbb{R}, \forall y \in \mathbb{R}, \exists z \in \mathbb{R}, \left[z = \frac{x + y}{2}\right]$

[39] Was Epimenides telling a lie when he made that statement?
[40] A set of statements that are repeated as a group.
15. Use the predicates from Example 2.38 to translate the English statement "there is at least one student in every math class who owns a cat" into symbolic notation. Then negate the symbolic expression. Finally, translate the negation into English.
16. Let $G$ represent the set of all game shows on television and let $P$ represent the set of people in your neighborhood. Let the predicate $C(p, g)$ mean that person $p$ has been a contestant on game show $g$. Let the predicate $D(p)$ indicate that person $p$ is a doctor. Use these predicates to translate the English statement "there is a person in your neighborhood who has been a contestant on a game show but is not a doctor" into symbolic notation. Then negate the symbolic expression. Finally, translate the negation into English.
17. [D] (a) Negate the proposition $\forall s, [(\exists d, [P(s, d)]) \lor Q(s)]$.
(b) Let
$s$ = Bethel College student
$d$ = Bethel College dorm
$P(s, d)$ = student $s$ lives in dorm $d$
$Q(s)$ = student $s$ rents a book locker.
Write the original proposition and its negation in English.
18. For each claim, determine whether it is always true or else false in some cases. Then give some justification for your answer.
(a) The expression $\forall x \in \mathbb{Z}, (x + y = x \cdot y)$ is a proposition (rather than merely a predicate).
(b) [D] The negation of "every good boy does fine" is "no good boy does fine."
(c) [D] The truth value of $\forall x, \exists y, [x^2$ [...] $y^2]$ depends on the choice for the universe of discourse.
(d) One way to bind a free variable is to assign it a value.
(e) Let $S$ be the set of all odd integers that are divisible by 2. Then $\forall x \in \mathbb{Z}, [(x \in S) \to (x + x \neq 2x)]$ is a valid assertion.
19. For each claim, determine whether it is always true or else false in some cases. Then give some justification for your answer.
(a) Since $\mathbb{Z} \subseteq \mathbb{R}$, if $\forall x, P(x)$ is true when the universe of discourse is $\mathbb{Z}$, then $\forall x, P(x)$ is true when the universe of discourse is $\mathbb{R}$.
(b) Although the written forms are different, $\exists z, \exists y, \exists x, S(x, y, z)$ and $\exists x, \exists y, \exists z, S(x, y, z)$ will have the same truth values.
(c) If at least one variable is not bound, then a predicate is not a statement.
(d) Let $U = \{x \in \mathbb{R} \mid x = x + 1\}$. Then for all $y, z$ in $U$, $y + z = z + y$ is a valid assertion.
(e) If for any choice of $x \in U$, a $y \in U$ can be found such that $R(x, y)$ is true, then $\exists y \in U, \forall x \in U, R(x, y)$ is true.
20. [D] Find a counterexample to the claim
$[\forall x, \exists y, P(x, y)] \to [\forall y, \exists x, P(x, y)]$
is always true.
21. Find a counterexample to the claim
$[(\exists x, P(x)) \land (\exists x, Q(x))] \to [\exists x, P(x) \land Q(x)]$
is always true.
22. Find a counterexample to the claim
$[\forall x \in U, P(x)] \to [\exists x \in U, P(x)]$
is always true.
2.7 Analyzing Claims (Optional)

2.7.1 Analyzing Claims that Contain Implications

Many arguments in daily life explicitly or implicitly contain implications. By extending the analysis to the forms of the implications, it is possible to improve upon the guidelines for informal logic that were presented in Section 2.2. Recall the brief discussion of syllogistic logic that was presented in the introduction to this chapter. Webster's dictionary defines syllogism as a deductive scheme of a formal argument consisting of a major and a minor premise and a conclusion (as in "every virtue is laudable; kindness is a virtue; therefore kindness is laudable"). The example from the dictionary can be modified [41] and rewritten as

major premise: If kindness is a virtue, then kindness is laudable.
minor premise: Kindness is a virtue.
----
Kindness is laudable. VALID

[41] I have simplified the more general "every virtue" to the more specific "kindness" and also expressed the phrase "every virtue is laudable" as an implication. Syllogistic reasoning does not always need to have an implication for the major premise.

In essence, we state an implication (the major premise) with the implicit assumption that it is a true implication, and we also affirm the truth of the hypothesis (via the minor premise). Then we assert the truth of the conclusion. The astute student should notice that this is merely a cosmetic repackaging of modus ponens. The term valid is used to describe the form of reasoning used. The previous example is valid since it is based on modus ponens.

Our study of derived implications shows that an argument form based on the contrapositive of a true implication also leads to a valid argument. Consider the implication "If selfishness is a virtue, then selfishness is laudable." The contrapositive can be expressed as

If selfishness is not laudable, then selfishness is not a virtue.
Selfishness is not laudable.
----
Selfishness is not a virtue. VALID

This is really the same form of reasoning. We state an implication (which incidentally happens to be the contrapositive of another implication); then we affirm its hypothesis. The reasoning based on the contrapositive can be presented in a visually different manner:

If selfishness is a virtue, then selfishness is laudable.
Selfishness is not laudable.
----
Selfishness is not a virtue. VALID

In this form, we state the original implication and affirm the negation of the conclusion, thus affirming the negation of the hypothesis. Study this table to see that it contains essentially the same information as the previous table. Notice that the minor premise in this example is related to the negation of the conclusion of the implication.

Our study of derived implications should warn us that the following are not valid forms of reason:

If academic excellence is a virtue, then academic excellence is laudable.
Academic excellence is laudable.
----
Academic excellence is a virtue. INVALID

If athletic prowess is a virtue, then athletic prowess is laudable.
Athletic prowess is not a virtue.
----
Athletic prowess is not laudable. INVALID

The first of these invalid forms is based on the converse of the original implication. It is often called the fallacy of affirming the consequent. The second invalid form is based on the inverse of the original implication. It has been called the fallacy of denying the antecedent. We thus have two valid and two invalid forms of reason based on an implication. They are summarized next:

Major premise | Minor premise | Conclusion | Classification
If $A$, then $B$. | $A$ | $B$ | VALID: direct syllogistic reason
If $A$, then $B$. | $\neg B$ | $\neg A$ | VALID: syllogistic reason using contrapositive
If $A$, then $B$. | $B$ | $A$ | INVALID: fallacy of affirming the consequent
If $A$, then $B$. | $\neg A$ | $\neg B$ | INVALID: fallacy of denying the antecedent
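Each of the four forms can be tested by asking whether "(major premise and minor premise) implies conclusion" is a tautology. The sketch below assumes that reading of validity; the dictionary `FORMS` and the function `is_valid` are illustrative names rather than terminology from the text.

```python
from itertools import product

def implies(p, q):
    """Material implication."""
    return (not p) or q

# Each form pairs a minor premise with a conclusion; the major premise is A -> B.
FORMS = {
    "direct":                   (lambda a, b: a,     lambda a, b: b),
    "contrapositive":           (lambda a, b: not b, lambda a, b: not a),
    "affirming the consequent": (lambda a, b: b,     lambda a, b: a),
    "denying the antecedent":   (lambda a, b: not a, lambda a, b: not b),
}

def is_valid(minor, conclusion):
    """Valid when [(A -> B) and minor] -> conclusion is true in every row."""
    return all(
        implies(implies(a, b) and minor(a, b), conclusion(a, b))
        for a, b in product([True, False], repeat=2)
    )

for name, (minor, conclusion) in FORMS.items():
    print(name, "VALID" if is_valid(minor, conclusion) else "INVALID")
```

Both fallacies fail on the same row, $A$ false and $B$ true: the implication and the minor premise hold there, yet the conclusion does not.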
2.7 Analyzing Claims (Optional)
In order to emphasize an important idea introduced at the beginning of this section, I will introduce some additional definitions. We will label a valid argument as sound if both the major and minor premises are true. A valid argument form will be labeled unsound if either premise is not true. A valid argument form will be called incomplete if either of its premises has not yet been shown to be true or untrue. Notice that invalid argument forms are neither sound nor unsound nor incomplete. All the previous examples of valid argument forms have been sound. Table 2.26 summarizes these definitions.

TABLE 2.26 Classifying argument forms

                     Premises
    Form       true     unknown       false
    valid      sound    incomplete    unsound
    invalid    -        -             -
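The classification in Table 2.26 can be written as a small function. This is only a sketch: the function name `classify` and the use of `None` for a premise whose truth is unknown are assumptions made for illustration. The text does not say how to label a valid form with one false premise and one unknown premise; the sketch assumes that a known false premise takes priority.

```python
def classify(form_is_valid, major, minor):
    """Classify an argument as in Table 2.26.

    major and minor are True, False, or None (truth not yet established).
    Assumption: a known false premise outranks an unknown one.
    """
    if not form_is_valid:
        return "-"            # invalid forms receive no label
    if major is False or minor is False:
        return "unsound"
    if major is None or minor is None:
        return "incomplete"
    return "sound"

print(classify(True, True, True))    # valid form, both premises true
print(classify(True, None, True))    # major premise not yet established
print(classify(True, True, False))   # a premise is false
print(classify(False, True, True))   # invalid form
```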
If an invalid form is used, the argument provides no evidence for the truth (or lack of truth) of the conclusion. If a valid form is used and the premises are true, then the conclusion is also true. The next set of examples all contain errors in reasoning.

Ending Drug Abuse
An argument some people are making as a solution to the problem of drug abuse is "If all the drug dealers were given mandatory life sentences, there would be no more drug abuse." We can translate this into the following argument form:

    If all the drug dealers were given mandatory life sentences, then there would be no more drug abuse.
    We should give life sentences to convicted drug dealers.
    --------
    The drug abuse problem will be solved.    INCOMPLETE

No one has established the implication that mandatory life sentences will stop drug abuse. One possible reason that the implication may be false is that new dealers may replace the old dealers as fast as the old ones are locked up. The proposed solution also ignores the influence of the drug consumer. It may be possible to prove that harsh sentences prevent some people from becoming drug dealers. The matter needs more study. The politician should make a more tentative statement: "Giving convicted drug dealers long sentences may help slow the abuse of drugs." This is a more logically palatable argument but is poor political rhetoric.

Everybody Is Doing It
"Mom, can I put peanut butter up my nose? Everybody is doing it." This argument is a common one. Ignoring the standard response of "If everybody jumped off a cliff, would you?", consider the implied argument form:

    If everybody else is doing A, then I should do A.
    Everybody else is doing A.
    --------
    I should do A.    Usually UNSOUND

It is not true that everybody else is putting peanut butter up their nose. Even if we modify this to "everybody that I think is important" or "Susie Jones and Tom Smith," the major premise is still of dubious general value. If the claim were "everybody is breathing," the argument might be considered sound; but stuffing peanut butter up your nose needs additional justification.
Pets and Longevity
There is some statistical evidence that people who have a pet live longer after the loss of a spouse than those who do not have pets. Consider the claim "If you have a pet, then you will probably live longer. My friend died soon after his wife died. He must not have had a pet." We can diagram the argument as follows:

    If you have a pet after your spouse dies, then you are more likely to live longer.
    You don't live long after your spouse dies.
    --------
    You did not have a pet.    INVALID

The error here is a bit subtle. The argument form appears to be the valid form based on the contrapositive. The error is that the minor premise is not the negation of the conclusion. The implication (based on statistical data) only claims you are more likely to live longer. It does not claim a guaranteed longer life. The argument just presented really has the following form:

    If A, then B.
    $\neg C$
    --------
    $\neg A$    INVALID: using unrelated information

Conditional Love
Have you ever done poorly in a class? Perhaps you thought to yourself, "I flunked this class. My parents are going to kill me!" Your conversation with yourself might be toned down and rephrased as

    If I get good grades, then my parents will love me.
    I didn't get good grades.
    --------
    My parents don't love me.    INVALID: fallacy of denying the antecedent

Almost all parents are less conditional in their love than this invalid syllogism implies. Your parents may be disappointed, but will not cease to love you because of poor grades. Perhaps the tendency to commit the preceding logical fallacy is encouraged by the phrasing of the major premise. Love can hardly be genuine if it is so narrowly conditional. We should be very cautious about oversimplifying complex realities.

Music Lovers
Suppose your friend says to you "I know you love music, so you will love this new CD." Your friend's statement is based on the following argument form.

    If you love music, then you will love this CD.
    You love music.
    --------
    You will love this CD.    UNSOUND

This is a valid form of reason. However, even granting the assumption that you love music, the major premise is not a universally true statement. There is no compelling reason to believe that a love of music will entail a love of all genres of music. Even if your tastes are very eclectic and you do like the musical selections on the CD, you still may be disappointed in the quality of the performance.

Logic and Problem Solving
Consider the statements "If I study logic I will be able to solve problems. I can solve problems. Therefore I have studied logic." The argument form is

    If I study logic, then I will be able to solve problems.
    I can solve problems.
    --------
    I have studied logic.    INVALID: fallacy of affirming the consequent
Quick Check 2.14
1. Analyze the following arguments.
(a) If I practice piano for six hours a day for at least one year, then I will become famous. I have practiced six hours a day for the past two years. Someday I will become famous.
(b) Productive employees at this company get raises. Ichabod received a raise. Therefore, Ichabod is a productive employee. (For this problem, assume that the major premise is true.)
(c) Using a seat belt has been shown to decrease the chance of injury in an accident. I wear a seat belt, so I am less likely to be injured in an accident.
2.7.2 Analyzing Claims that Contain Quantifiers
Many syllogisms contain major or minor premises that include quantifiers. For example, the major premise of this syllogism contains a universal quantifier:

    All people are mortal.
    I am a person.
    --------
    I am mortal.    VALID

The next syllogism contains an existential quantifier in the major premise:

    Some people can ride a bicycle.
    George can ride a bicycle.
    --------
    George is a person.    INVALID

Figure 2.16. Bicycle riders. (Venn diagram; set labels: "People" and "Beings who can ride a bicycle".)

There is a simple technique that is sometimes useful when analyzing syllogisms that contain quantifiers. It involves the use of the familiar Venn diagram. [fn 42] To see that the previous syllogism is invalid, consider the Venn diagram in Figure 2.16. Since an existential quantifier was used, we may not assert that all people can ride bicycles. We also do not know whether all bicycle riders are people. George's full name may be Curious George. A chimpanzee may know how to ride a bicycle, but a chimpanzee is not a person. The discussion on page 21 should serve as a warning that this intuitive approach using Venn diagrams is risky.
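The Venn diagram argument amounts to exhibiting one arrangement of sets in which both premises hold and the conclusion fails. A minimal sketch using Python sets follows; the particular members ("Alice", "Bob") are hypothetical and chosen only for illustration.

```python
# Hypothetical universe: two people and one bicycle-riding chimpanzee.
people = {"Alice", "Bob"}
riders = {"Alice", "Curious George"}   # beings who can ride a bicycle

# Major premise: some people can ride a bicycle (the sets overlap).
major_premise = len(people & riders) > 0
# Minor premise: George can ride a bicycle.
minor_premise = "Curious George" in riders
# Conclusion: George is a person.
conclusion = "Curious George" in people

# A single case with true premises and a false conclusion shows the form is invalid.
counterexample = major_premise and minor_premise and not conclusion
print(counterexample)
```

One counterexample is enough to establish invalidity. Showing that a form is valid would require considering every possible arrangement of the sets, which is part of why the informal Venn diagram approach is risky.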
Quick Check 2.15
1. Use a collection of Venn diagrams to analyze informally the possibilities for the following syllogism. Then comment on the validity of the syllogism.

    All A are B.
    Some B are C, but not all B are C.
    Some C are not B.
    Some A are C.
2.7.3 Exercises The exercises marked with OP have detailed solutions in Appendix G.
1. What must we do in order to show that an implication is true?
2. What must we do in order to show that a biconditional is true?
3. The two previous questions make the assumption that an implication or a biconditional might be false. Give a reason for this assumption.
4. Write the original implication, the contrapositive, the converse, and the inverse for each of the following statements. Indicate which of the four implications in each set are true.
(a) OP If I am pregnant, then I am a female.
(b) If I am a Republican, then I am not a Democrat.
(c) If I am a student, then I live in a dorm and I go to class.
(d) If I have an X chromosome, then I am male.

[fn 42] This application of Venn diagrams goes back at least as far as the mid-1700s, when Leonhard Euler used them for this purpose [27].
5. (a) Write a true implication whose converse is false. Then write a true implication whose converse is also true. You may not use an example found in this text.
(b) Rewrite the second pair as an equivalence.
6. Analyze the following implication-based claims.
(a) OP If Socrates was a man, then he was mortal. Socrates was mortal. Therefore, Socrates was a man.
(b) If Euclid proved the theorem, then Euclid was a great mathematician. Euclid proved the theorem. Therefore, Euclid was a great mathematician.
(c) If Hippocrates was an ancient Greek scholar, then he was very wise. Hippocrates was an ancient Greek. Therefore, he was very wise.
(d) If I don't get enough sleep, then I am crabby. I am not crabby. Therefore, I got enough sleep. (Assume that I am telling the truth in my major and minor premises.)
(e) When I am sad, I like to take a walk in the forest. I am walking in the forest right now. Therefore, I am sad. (Assume that I am telling the truth in my major and minor premises.)
(f) If I am angry, then I shout. I am not angry. Therefore, I am not shouting. (Assume that I am telling the truth in my major and minor premises.)
(g) Major premise: Sixty men can do a piece of work sixty times as quickly as one man.
Minor premise: One man can dig a posthole in sixty seconds.
Conclusion: Sixty men can dig a posthole in one second. [fn 43]
7. Use the concepts and terminology of derived implications to explain your answers to the following questions.
(a) Assume that the implication "If it is raining, then I use an umbrella" is true. Suppose I am not using an umbrella. Can I conclude that it is not raining?
(b) Assume that the implication "If my car is out of gas, then it won't start" is also true. My car started. Can I conclude that I am not out of gas?
(c) Suppose in part (b) that my car won't start. Can I conclude that it is out of gas?
8. Analyze the following claims. Draw Venn diagrams where appropriate in each case (even if it seems trivial to solve without a diagram). If multiple diagrams might apply, show them.
(a) All teachers are boring. Dr. Beeblebrox is a teacher. Therefore, Dr. Beeblebrox is boring.
(b) OP Some students live at home. Zelda is a student. Therefore, Zelda lives at home.
(c) Some students are athletes. Some athletes are students. Every person is either a student or an athlete.
(d) Some students are math majors. Every math major is a student. Every math major does homework. [fn 44] Therefore, some students do homework.
(e) Some students are musicians. Every musician practices. Therefore, some musician is not a student.
(f) Every rose is a flower. Some flowers are perennials. Therefore, every rose is a perennial.

The next set of questions relates to a small college in Frostbite Falls, Minnesota. There are only two kinds of students on the campus, scholars and dunces. Scholars never make a mistake. Every statement they make is true. Dunces never get anything completely correct. In any conversation, some of what they say may be true, but no matter how hard they try, they always say at least one thing that is false.

9. OP One of the students made the statement "If I am a dunce, then I am a scholar." Can we determine which kind of student he is?
10. Another student said "If I am a scholar, then I am not a dunce." What is she?
11. Her roommate stated "If I am a dunce, then I am not a
12. One day when I was very thirsty, a dunce that I knew handed me a cup of juice. He told me two things.
(a) I am a scholar.
(b) I have put poison in the juice.
Even though he was incorrect in his first statement, I declined the juice. Why?
13. Two very popular math majors, A and B, were roommates. One day they made the following statements:
A: If I am a dunce, then B is a dunce.
B: I am a dunce if and only if A is a dunce.
What kind of student is A? (Hint: Check the consistency of both students' statements.)
14. I asked a student what kind he was. He replied, "If I am wrong, then I am a dunce." Is he a scholar or a dunce?
15. One student had an oral exam. She made the following three statements:
* All implications are either true or false.
* Every statement is either an implication or a biconditional.
* The statement $P \leftrightarrow Q$ is equivalent to the compound statement $(P \to Q) \wedge (Q \to P)$.
Is she a scholar or a dunce?

[fn 43] Ambrose Bierce, The Devil's Dictionary.
[fn 44] Assume true for this problem. Math majors who don't do homework usually switch majors eventually.
2.8 QUICK CHECK SOLUTIONS
You will learn more if you make an honest attempt to solve the Quick Check problems before you look at my solutions. Don't give up too soon! If you do need to look at my answer, make sure you can reproduce the solution without opening your book.

Quick Check 2.1
1. $5 \notin D$ but $5 \in F$
2. $D \not\subseteq F$ because $6 \in D$ but $6 \notin F$
3. $D \cap F = \{2, 4\}$
4. $D \cup F = \{1, 2, 3, 4, 5, 6, 8\}$
5. $\overline{D} = \{1, 3, 5, 7, 9\}$
6. $|D| = 4$, $|F| = 5$
7. $D - F = \{6, 8\}$, $F - D = \{1, 3, 5\}$
8. D and F are not disjoint because $D \cap F = \{2, 4\}$.

Quick Check 2.2
1. $A \cap B \cap C = \{3\}$
2. $A \cup B \cup C = \{1, 2, 3, 4, 5\}$
3. $\{A, B, C\}$ is not a partition of $\{1, 2, 3, 4, 5\}$; for example, $A \cap B \neq \emptyset$.
4. $A \times B = \{(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 2), (3, 3), (3, 4)\}$
5. $|A \times B \times C| = 3 \cdot 3 \cdot 3 = 27$
$A \times B \times C = \{(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 3), (1, 3, 4), (1, 3, 5), (1, 4, 3), (1, 4, 4), (1, 4, 5), (2, 2, 3), \ldots, (3, 4, 5)\}$
6. $\mathcal{P}(A \cap B) = \mathcal{P}(\{2, 3\}) = \{\emptyset, \{2\}, \{3\}, \{2, 3\}\}$

Quick Check 2.3
1. Suppose that $C \subseteq B$ and also that $B \subseteq A$. Let $x \in C$. Then since $C \subseteq B$, it is also true that $x \in B$ (using Definition 2.1). But then, since $B \subseteq A$ is also true, $x \in B$ implies $x \in A$. Since for every $x \in C$ it is always true that $x \in A$, the definition of subset implies that $C \subseteq A$.
2. Suppose $B \subseteq A$.

    $x \in \overline{A - B}$
    iff  $x \notin (A - B)$                              definition of complement
    iff  $x \notin A$ or $x \in (A \cap B)$              definition of set difference
    iff  $x \in \overline{A}$ or $x \in (A \cap B)$      definition of complement
    iff  $x \in \overline{A}$ or $x \in B$               $B \subseteq A$
    iff  $x \in (\overline{A} \cup B)$                   definition of union

Since every element of $\overline{A - B}$ is an element of $(\overline{A} \cup B)$, and vice versa, the two sets are equal.
Quick Check 2.4 1. They actually make sense under all three guidelines, but number 2 (look for flaws in logic or logical inconsistency) is the most appropriate. 2.
(a) You could consider this an improper generalization. How do they know no manager was fired? (Perhaps a manager at a competing company bought an IBM and was fired.)
There is an invited inference here. Since no manager has been fired for buying an IBM, an IBM must be the best product. However, it could be that IBM made acceptable computers but not the best. (It is also possible that IBM made the best.)
(b) No one else's carrots have cholesterol either. This is a variation of a non sequitur. The appeal is to our (reasonable) desire to eat healthy food. Cholesterol received bad press in the 1980s. It makes a great advertising pitch due to the emotional linkage with health. The implied implication is that the lack of something bad means the presence of something good. Arsenic has no cholesterol either. If the claim were "our eggs have an acceptably low amount of cholesterol," there might be some validity to the claim.
(c) The general guideline "analyze assumptions" might be worth using. The assumption is "some movie should be watched." Under this assumption, the best of the lot is the proper choice. However, it may be the case that all the movies are exploitative trash with no artistic merit. You may be better off playing a game, talking, or reading a book together.
Quick Check 2.5
1. (a) This is neither true nor false. If the word as were deleted, it would be a (true) statement. (There is no verb, so it is not even an English sentence.)
(b) This is a statement. We can imagine examples where someone eats an apple every day of their life but still gets ill or injured. The statement is false.
(c) Despite the Shakespearean English, this is a statement. A modern translation is "appearances may be deceiving." (I prefer the original.) This is a tricky sentence to translate into a mathematical statement (see Example 2.37 on page 64).
(d) This is not a statement. It basically means "good night." There is no possible assignment of true or false. (There is a verb, so it is an English sentence, but it is a command, not a claim. That is, the verb is used in an imperative, rather than an indicative, mode.)
2. John or Jane Doe. With this version, only one would need to sign a check. It is inconvenient to always require both signatures.
3. There are several acceptable ways to express these statements. Human languages tend to be ambiguous. Think about how much more precise the mathematical version is.
(a) It is not true that I like both popcorn and jalapeños.
(b) It is not true that I like either popcorn or jalapeños. (We will see later that an equivalent statement is "I dislike both popcorn and jalapeños.")
(c) Suppose I do like popcorn but I don't like jalapeños. Then it is not the case that I like both, so the statement $\neg(P \wedge Q)$ is true. This agrees with the second row of the truth table. There are seven other combinations to try (four involving $\neg(P \vee Q)$).
4. Let B represent the statement "I love Betty" and let S represent the statement "I love Sue." Similarly, let C represent the statement "I love Colin" and let T represent the statement "I love Tom."
(a) $B \vee S$
(b) $C \wedge T$
(c) $B \vee S$ [Notice that this is the same answer as for part (a).]
(d) $C \wedge (\neg T)$
Quick Check 2.6
1. The two statements are equivalent because their truth tables agree for every combination of T/F values for P and Q:

    P   Q   $P \to Q$   $Q \to P$   $(P \to Q) \wedge (Q \to P)$
    T   T   T           T           T
    T   F   F           T           F
    F   T   T           F           F
    F   F   T           T           T

    P   Q   $P \leftrightarrow Q$
    T   T   T
    T   F   F
    F   T   F
    F   F   T

This observation can also be written as $[(P \to Q) \wedge (Q \to P)] \Leftrightarrow [P \leftrightarrow Q]$.
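The comparison of the two truth tables above can be reproduced by enumerating all four assignments. This is a small illustrative sketch; the helper `implies` is an assumed name, and Python's `==` on booleans plays the role of the biconditional.

```python
from itertools import product

def implies(p, q):
    """Truth value of the implication p -> q."""
    return (not p) or q

# Each row: (P, Q, (P -> Q) and (Q -> P), P <-> Q)
rows = [
    (p, q, implies(p, q) and implies(q, p), p == q)
    for p, q in product([True, False], repeat=2)
]

for row in rows:
    print(row)

# The statements are logically equivalent if the last two columns agree in every row.
equivalent = all(lhs == rhs for _, _, lhs, rhs in rows)
print(equivalent)
```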
2. I am an earthling if and only if I am not a space alien. (This is not a true statement unless we restrict the collection of beings we are discussing to consist only of earthlings and space aliens.)
Quick Check 2.7
1. (a) Let A represent "I am an alien" and M represent "I am from Mars."

    Symbolic Form                 Name             English Translation
    $A \to M$                     Implication      If I am an alien, then I am from Mars.
    $(\neg M) \to (\neg A)$       Contrapositive   If I am not from Mars, then I am not an alien.
    $M \to A$                     Converse         If I am from Mars, then I am an alien.
    $(\neg A) \to (\neg M)$       Inverse          If I am not an alien, then I am not from Mars.

In this problem, the original implication and the contrapositive are false. The converse and the inverse are true statements.
(b) Let E represent "I am an extrovert" and A "I have friends." (I have used the Spanish word amigo, A, instead of the English word friend so that F maintains its meaning "false.") Notice the effect of the negations.

    Symbolic Form                 Name             English Translation
    $(\neg E) \to (\neg A)$       Implication      If I am not an extrovert, then I have no friends.
    $A \to E$                     Contrapositive   If I have friends, then I am an extrovert.
    $(\neg A) \to (\neg E)$       Converse         If I have no friends, then I am not an extrovert.
    $E \to A$                     Inverse          If I am an extrovert, then I have friends.

All four statements are false. Being an extrovert and having friends are independent characteristics; neither determines the other.
2. Since the two final columns are identical, the two statements are equivalent.

    P   Q   $\neg P$   $\neg Q$   $P \wedge Q$   $\neg(P \wedge Q)$   $(\neg P) \vee (\neg Q)$
    T   T   F          F          T              F                    F
    T   F   F          T          F              T                    T
    F   T   T          F          F              T                    T
    F   F   T          T          F              T                    T
Quick Check 2.8
1. When P is true and Q is false, $(P \wedge Q)$ is false, so the truth values differ.
2. (a) One approach is to prove directly that the statement is a tautology.

    $[P \wedge (P \to Q)] \to Q$
    $\Leftrightarrow [P \wedge (\neg P \vee Q)] \to Q$                   implication
    $\Leftrightarrow [(P \wedge (\neg P)) \vee (P \wedge Q)] \to Q$      distributivity
    $\Leftrightarrow [F \vee (P \wedge Q)] \to Q$                        law of contradiction
    $\Leftrightarrow [(P \wedge Q) \vee F] \to Q$                        commutativity
    $\Leftrightarrow (P \wedge Q) \to Q$                                 identity
    $\Leftrightarrow T$                                                  law of simplification

(b) One strategy is to show that both sides of the biconditional are logically equivalent.

    $[(P \wedge (\neg Q)) \to Q]$
    $\Leftrightarrow [\neg(P \wedge \neg Q)] \vee Q$                     implication
    $\Leftrightarrow ((\neg P) \vee \neg(\neg Q)) \vee Q$                De Morgan
    $\Leftrightarrow ((\neg P) \vee Q) \vee Q$                           double negation

... The statement is false (try x = 1 or x = )
(b) The universe of discourse is the set of all students in this class. Let s represent students, and let the predicate C(s) represent "student s owns a car." The quantified predicate is $\exists s, \neg C(s)$.
2. (a) Every integer is less than twice itself.
(b) There is a tree on planet Earth that is more than 1000 years old.
Quick Check 2.13
1. $\neg(\forall x \in U, \neg T(x)) \Leftrightarrow \exists x \in U, \neg(\neg T(x))$

... $S_2 \to S_3 \to S_4 \to S_5 \to Q$. (Actually, the proof is a little more complex than was just indicated.) A more complete description will now be given.

Since the axioms all are assumed true (P), the laws of simplification allow us to assert the truth of axiom 4 ($A_4$). From $A_4$ we can produce the points A, B, and C (denote the existence of these three points by $S_1$). Using $S_1$ and $A_1$, we conclude that there is a line AB between A and B ($S_2$). $S_1$, $S_2$, and $A_3$ imply the existence of a unique line through C and parallel to AB ($S_3$). [fn 20] $S_3$ and $A_2$ assert the existence of a point D ($S_4$). The final step is the most complex in the proof. In essence, it asserts that D is distinct from A, B, and C. To do this, the results of the previous steps are used, together with the implication "if two lines are parallel, then they have no points in common" (the "concept $\to$ properties" form of a definition). Since D is distinct from the other points (which are also distinct from each other), there must be at least four points (Q). [fn 21]

Divisibility of a Sum
PROPOSITION 3.1 Let a, b, and c be integers, with $a \neq 0$. If $a \mid b$ and $a \mid c$, then $a \mid (b + c)$.

Proof: This proof will not only illustrate a direct proof, but will also illustrate a proof that is heavily definition oriented. You may want to review Definition 3.7 on page 96. Since $a \mid b$, we know that there is an integer k such that $b = ak$ ("concept $\to$ properties"). [fn 22] Similarly, there is an integer m such that $c = am$. Therefore, $b + c = ak + am = a(k + m)$. We know that $k + m$ is an integer, so the equation shows that $b + c$ is an integer multiple of a. Therefore, $b + c$ is divisible by a ("properties $\to$ concept"). The proof is complete.

[fn 19] The book by Solow ([73]) contains several examples of this forward-backward process.
[fn 20] Notice the use of modus ponens.
[fn 21] This expanded description of the proof illustrates a difficult decision: "How much detail is needed?", or "what constitutes a complete proof?". The presentation in the expanded proof does not reach the limit of detail possible, yet it is probably too much already. It is not easy to decide how much detail is needed to ensure mathematical rigor. A related question is, "When is a complete proof necessary?". For example, does a first-year calculus student who plans to major in engineering need to understand the proof of the first fundamental theorem of calculus, or is the ability to apply the theorem properly sufficient?
You may find it interesting to study the various philosophies developed by mathematicians and philosophers that attempt to provide a firm foundation for doing mathematics. One of these schools of thought is called logicism. Logicism attempts to reduce all of mathematics to logic. The most energetic attempt toward this goal was the book Principia Mathematica by Russell and Whitehead, 1910-1913. The book is in three volumes, consisting mostly of symbols. One of the goals of this work was to put arithmetic on a firm foundation.
In 1931 Gödel found a major flaw in the axiomatic method: "Within a rigidly logical system such as Russell and Whitehead had developed for arithmetic, propositions can be formulated that are undecidable or undemonstrable within the axioms of the system. That is, within the system there exist certain clear-cut statements that can be neither proved nor disproved. Hence one cannot, using the usual methods, be certain that the axioms of arithmetic will not lead to contradictions" [9, p. 655].
[fn 22] It is essential that we use different letters for the two integers (k and m); otherwise, we are indirectly asserting that b = c.
Chapter 3 Proof

You should pay careful attention to what did not appear in the proof of Proposition 3.1. Nowhere in the proof do any fractions appear. The context of the proposition is the set of integers. It is considered poor taste to drag rational numbers into the proof. You should avoid introducing $\frac{b}{a}$ into a proof when you are told that $a \mid b$. Instead, use the definition of divisibility and assert the existence of an integer k such that $b = ak$.
Quick Check 3.3
1. Use a direct proof to prove Proposition 3.2. Clearly denote your use of definitions.
PROPOSITION 3.2 Let a, b, and c be any integers, with $a \neq 0$. If $a \mid b$, then $a \mid (bc)$.
2. Use a direct proof to prove Proposition 3.3.
PROPOSITION 3.3 Let a, b, and c be integers with $a \neq 0$ and $b \neq 0$. If $a \mid b$ and $b \mid c$, then $a \mid c$.
Searching for Prime Factors
One method for determining if a positive integer $n > 1$ is prime is to see if it is divisible by any of the numbers $2, 3, 4, \ldots, (n - 1)$. If not, it must be prime. This strategy can be made more efficient by using the next proposition.

PROPOSITION 3.4 If n is a positive composite number, then n has at least one prime factor p with $1 < p \le \sqrt{n}$.

Proof: Since n is composite and positive, there are integers a and b with $1 < a < n$, $1 < b < n$, and $n = ab$. We can assume, without loss of generality, [fn 23] that $a \le b$. Thus $n = a \cdot b \ge a \cdot a = a^2$, so $a \le \sqrt{n}$. If a is a prime, we are done. Otherwise, a must itself have a prime divisor p, and $p < a \le \sqrt{n}$ must be true. Proposition 3.3 implies that p is a prime divisor of n.

The more efficient algorithm for testing for primes is to see if any of the integers $2, 3, 4, \ldots, \lfloor \sqrt{n} \rfloor$ are divisors. For example, if $n = 13$, then $\sqrt{n} \approx 3.606$, so we only need to see if 2 or 3 are divisors. Under the original algorithm, we would have to check for divisibility by $2, 3, 4, \ldots, 11, 12$.
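The more efficient primality test described above can be sketched in Python as follows. The function names `smallest_prime_factor` and `is_prime` are illustrative choices, not names from the text; `math.isqrt` computes $\lfloor \sqrt{n} \rfloor$ exactly for integers.

```python
import math

def smallest_prime_factor(n):
    """Return the smallest prime factor of an integer n >= 2.

    Only candidates 2, 3, ..., floor(sqrt(n)) are tried; by Proposition 3.4,
    a composite n must have a prime factor in that range. The first divisor
    found is the smallest divisor greater than 1, so it is prime.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return d
    return n  # no divisor up to sqrt(n), so n itself is prime

def is_prime(n):
    """True exactly when n is prime."""
    return n >= 2 and smallest_prime_factor(n) == n

# For n = 13 the only candidates tried are 2 and 3.
print(list(range(2, math.isqrt(13) + 1)))
print(is_prime(13), is_prime(91), smallest_prime_factor(91))
```

Trying every integer up to $\lfloor \sqrt{n} \rfloor$, rather than only the primes, keeps the sketch close to the algorithm in the text; testing composite candidates is wasted work but does not affect correctness.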
3.2.3 Indirect Proof: Proving the Contrapositive
The logical equivalence $[P \to Q] \Leftrightarrow [(\neg Q) \to (\neg P)]$ implies that the contrapositive of a valid theorem is automatically true.

COROLLARY 3.1 Corollary to Proposition 3.4 If a positive integer $p > 1$ has no divisor d with 1 < d
b, we would interchange the respective roles of the letters in the remainder of the proof.
3.2 Proof Strategies
As an example, recall the definition of even given in Section 3.1.2, and consider the proposition:

PROPOSITION 3.5 If the integer n is not even, then $n^2$ is not even.

Proof: The concept "not even" isn't as easy to work with as "even" (we have a property to associate with "even"). The contrapositive reads "If $n^2$ is even, then n is even." We will prove the contrapositive. Thus we assume that $n^2$ is even. Using the "concept $\to$ properties" part of the definition of even, we may assert the existence of an integer k such that $n^2 = 2k$. Since we can factor a 2 out of the right-hand side of the previous equation, we must also be able to factor a 2 out of the left-hand side. Thus 2 is a factor of $n^2$. But then 2 must be a factor of n, so $n = 2j$ for some j. [fn 24] Using the "properties $\to$ concept" form of the definition of even, we conclude that since $n = 2j$, n is even. This proves the contrapositive of Proposition 3.5, and simultaneously proves Proposition 3.5.
3.2.4 Proof by Contradiction
Sometimes, we don't seem to be able to make progress moving from P to Q using a direct proof or using the contrapositive. Proof by contradiction is based on the logical equivalences:

    $[P \to Q] \Leftrightarrow [(P \wedge (\neg Q)) \to (R \wedge (\neg R))]$
    $[P \to Q] \Leftrightarrow [(P \wedge (\neg Q)) \to (\neg P)]$
    $[P \to Q] \Leftrightarrow [(P \wedge (\neg Q)) \to Q]$.

We begin by assuming that the hypothesis, P, is true and also assuming that the conclusion, Q, is false. This gives us an additional piece of information (the negation of the conclusion) that we hope will enable us to make progress toward a completed proof. Working forward from $(P \wedge (\neg Q))$, we hope to eventually arrive at a contradiction (such as Q, $(\neg P)$, or $(R \wedge (\neg R))$, where R can be anything). The logical equivalences above then ensure the truth of $P \to Q$.
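The three logical equivalences behind proof by contradiction can be confirmed by brute force over all truth assignments. The sketch below is illustrative only; the helper names `implies` and `contradiction_forms_agree` are assumptions and do not come from the text.

```python
from itertools import product

def implies(p, q):
    """Truth value of the implication p -> q."""
    return (not p) or q

def contradiction_forms_agree():
    """Check that P -> Q has the same truth value as each of
    (P and not Q) -> (R and not R), (P and not Q) -> not P, and
    (P and not Q) -> Q, for every assignment of P, Q, and R."""
    for p, q, r in product([True, False], repeat=3):
        direct = implies(p, q)
        hypothesis = p and not q
        forms = (
            implies(hypothesis, r and not r),
            implies(hypothesis, not p),
            implies(hypothesis, q),
        )
        if any(form != direct for form in forms):
            return False
    return True

print(contradiction_forms_agree())
```

In each form the conclusion is false whenever $P \wedge (\neg Q)$ is true, so the right-hand implication is true exactly when $P \wedge (\neg Q)$ is false, which is exactly when $P \to Q$ is true.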
On an intuitive level, proof by contradiction is based upon the following ideas. We assume that P is true (if P is false there is nothing to prove; the implication is vacuously true). We then assume that Q is false. By a series of valid steps we arrive at a contradiction. Something must have gone wrong. The only place possible was when we assumed that Q was false. It follows therefore that Q is actually true. But then since whenever P is true, it necessarily follows that Q is true, the implication $P \to Q$ is true. [fn 25]

The most difficult part of a proof by contradiction is that we don't know ahead of time what contradiction we will eventually arrive at. In fact, if we are trying to prove a "theorem" that is actually false, we will never arrive at a contradiction! [fn 26]

[fn 24] I have used Proposition 3.9 on page 114.
[fn 25] We have ruled out the one row in the truth table for an implication in which the implication is false.
[fn 26] Direct proof has a safeguard that proof by contradiction does not. If you make a mistake in a direct proof, you will have a very hard time arriving at a proof. If you make a mistake in a proof by contradiction, it may actually be easier to reach a contradiction (due to your mistake), perhaps with disastrous consequences; there may be no legitimate contradiction. Disaster occurs if the theorem you try to prove is actually false. When you make a mistake in a proof by contradiction, you probably will arrive at a contradiction (due entirely to your mistake and not to the content of the theorem). You will assume that you have proved the theorem when, in fact, the theorem is actually false.
A dramatic example of this occurred in connection with Euclid's fifth postulate. In 1733, a mathematician named Saccheri tried to prove that the fifth postulate could be proved from the other four (that is, he tried to show there was no such thing as a geometry that did not follow all five of Euclid's axioms; such a
Chapter 3 Proof Recall that Theorem Y in Example 3.4 was proved using contradiction and that contradiction played a key role in the existence proof for Theorem 3.1. As another example, consider the following proposition. PROPOSITION 3.6
The number
x/2
is irrational.
Proof: The concept "irrational" is much harder to work with than "rational." We might consider proving the contrapositive, but what is the contrapositive? In fact, what is the implication? We can restate the implication as "if a number is $\sqrt{2}$, then the number is irrational." The contrapositive is "if a number is rational (not irrational), then it is not the number $\sqrt{2}$." Proving the contrapositive would thus involve showing that something is not $\sqrt{2}$. If you try showing this, you will probably end up using a proof by contradiction, so we will begin by using a proof by contradiction on the original theorem.
Thus we assume that the number we have is $\sqrt{2}$, and that it is a rational number. Therefore, there are integers p and q such that $\sqrt{2} = \frac{p}{q}$. We will assume that p and q have no common factors. (If there were any, we could factor them out and cancel, leaving us with a new p and q.) Since q is not 0, we can multiply the equation by q, leaving $\sqrt{2}\,q = p$. Squaring both sides produces $2q^2 = p^2$. Since the left-hand side has a factor of 2, the right-hand side does also. This means that 2 is a factor of p. Hence $p = 2r$, for some integer r. But then $p^2 = 4r^2$. The equation $2q^2 = p^2$ can be rewritten as $2q^2 = 4r^2$. Dividing by 2 produces $q^2 = 2r^2$, from which we conclude (in a now familiar way) that 2 is a factor of q. This is the contradiction we need: 2 is a factor of both p and of q, but p and q have no common factors. The trouble began with the assumption that $\sqrt{2}$ is rational. Hence, $\sqrt{2}$ is irrational. □
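A brute-force search is no substitute for the proof, but it can make the key equation $2q^2 = p^2$ concrete. The following is a minimal sketch; the function name `rational_sqrt2_candidates` and the search bound are illustrative choices, not part of the text. It looks for integer pairs (p, q) satisfying the equation and finds none in the range searched, which is what the proof predicts for every range.

```python
from math import isqrt


def rational_sqrt2_candidates(max_q):
    """Return every pair (p, q) with 1 <= q <= max_q and p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        target = 2 * q * q
        p = isqrt(target)  # largest integer p with p*p <= target
        if p * p == target:
            hits.append((p, q))
    return hits


# The proof shows this list is empty no matter how large max_q is.
print(rational_sqrt2_candidates(10_000))
```

The search only confirms the claim for finitely many denominators; the contradiction argument is what covers all of them.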
Quick Check 3.4
1. Use a proof by contradiction to provide another proof of Proposition 3.4.
3.2.5 Proof by Cases
Sometimes it is helpful to partition the proof into several disjoint parts whose union is the complete theorem and then prove each part individually.
A Simple Proof by Cases
If $n \in \mathbb{Z}$, then $n^3 - n$ is even.
Proof: Consider the two cases n even and n odd. Every integer fits one of these cases, and no integer is both even and odd, so by showing the claim is true for each case, the claim will be shown true for all integers.
n even If n is even, then there is an integer k such that $n = 2k$. Therefore, $n^3 - n = (2k)^3 - (2k) = 8k^3 - 2k = 2(4k^3 - k) = 2m$, where $m = 4k^3 - k$ is an integer. Thus, $n^3 - n$ is even.

geometry is called a non-Euclidean geometry). He assumed that the other postulates were true and that the fifth postulate was false. He then sought to produce a contradiction. If a contradiction occurred, he could then conclude that it was impossible to have all the postulates except the fifth. Saccheri did arrive at a contradiction (due to a mistake). In the course of his "proof" he had produced many of the major theorems of non-Euclidean geometry. He never realized that he had actually created geometries that satisfied all but the fifth postulate!
3.2 Proof Strategies
n odd If n is odd, then there is an integer k such that $n = 2k + 1$. Thus, $n^3 - n = (2k + 1)^3 - (2k + 1) = 8k^3 + 12k^2 + 4k = 2(4k^3 + 6k^2 + 2k) = 2m$. So $n^3 - n$ is again an even integer. □ ■
Many mathematicians feel that proofs with more than a few cases are less elegant than a proof using some other strategy. An extreme case is the proof of the four color theorem (see Theorem 10.14 in Chapter 10). The first proof, by Kenneth Appel and Wolfgang Haken in 1976, involved the use of over 1000 hours of computer time to examine around 2000 cases (each of which resulted in up to 100,000 subcases). More recent proofs have reduced the number of cases to around 600. Appel and Haken's proof initiated a controversy in the mathematics community: Should we accept as valid a proof that no human has read unaided by a machine? How do we know that the computer didn't make a mistake? The current majority opinion is that the proof is valid. However, a more elegant proof (without the use of computers) would be preferred.
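The two cases in the proof that $n^3 - n$ is even can be mirrored in a short check. This is only a sketch: the function names are hypothetical, and the code tests finitely many integers, whereas the proof by cases covers all of them. It confirms the factorizations $2(4k^3 - k)$ and $2(4k^3 + 6k^2 + 2k)$ used in the even and odd cases.

```python
def even_case_holds(k):
    """n = 2k: check that n^3 - n equals 2 * (4k^3 - k)."""
    n = 2 * k
    return n**3 - n == 2 * (4 * k**3 - k)


def odd_case_holds(k):
    """n = 2k + 1: check that n^3 - n equals 2 * (4k^3 + 6k^2 + 2k)."""
    n = 2 * k + 1
    return n**3 - n == 2 * (4 * k**3 + 6 * k**2 + 2 * k)


def check_by_cases(limit):
    """Test both cases for every k with -limit <= k <= limit."""
    ks = range(-limit, limit + 1)
    return all(even_case_holds(k) for k in ks), all(odd_case_holds(k) for k in ks)


print(check_by_cases(100))
```

Splitting the check into one function per case follows the structure of the proof: every integer is handled by exactly one of the two functions.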
3.2.6 Implications with Existential Quantifiers
Many implications contain conclusions involving an existential quantifier. For example, "If x and y are two real numbers, then there is a real number r that is between them." The preferred method for proving theorems of this type (called proof by construction or a constructive proof) is to construct the object that is supposed to exist. This is often done by creating a candidate and showing that it satisfies all the properties of the required object.
PROPOSITION 3.7 If x and y are real numbers with $x < y$, then there exists a real number z with $x < z < y$. [...]
[...] $x > 2$, there is a $y < 0$ such that $x = \frac{2y}{y+1}$.
When the universal quantifier appears in the hypothesis of an implication, it is often appropriate to use what is sometimes called the choose method as part of the proof strategy. The implication usually reads "for all objects A having properties B, C happens." In the choose method we pick an arbitrary member, x, of the universally quantified set A and show that C happens. Since x was a representative of any object in A, we have established that C happens for all elements in A. We must be careful that in using x, only the properties B are used; we must not use any properties that are true for some elements of A but not for others. Mistakes of this sort are commonly seen in "proofs" such as the first attempt in the next example.
Using the Choose Method
The Proposition For all real numbers $x > 2$, there is a $y < 0$ such that $x = \frac{2y}{y+1}$.
Incorrect Proof Let $x = 4$ (an incorrect use of choose). Using the construction method, let $y = -2$. Then $x = 4 = \frac{2(-2)}{-2+1} = \frac{2y}{y+1}$. Incorrect!
The problem with the incorrect proof is that it shows nothing about what to do with any number except $x = 4$.
Preliminary Analysis We will eventually use a constructive proof, but first a preliminary investigation is in order. Suppose (temporarily) that we already have a $y < 0$ for which the proposition is true. We could then solve $x = \frac{2y}{y+1}$ for y as a function of x. Doing so yields $y = -\frac{x}{x-2}$. This seems like the most likely choice to use for y in the constructive proof. The actual proof is presented next.
Correct Proof Let x be a real number greater than 2. We will show that $y = -\frac{x}{x-2}$ satisfies all the necessary conditions. Routine algebra shows that $y = -\frac{x}{x-2}$ satisfies the equation $x = \frac{2y}{y+1}$. It remains to show that $-\frac{x}{x-2} < 0$. But since $x > 2$, the numerator and the denominator are both positive. The negation of the fraction must be negative. □
In the correct proof, no property of x other than $x > 2$ was used. In this sense we "chose" a generic x. ■
A Subset of the Even Integers
Let A be the set of all positive integers that are divisible by 4, and let B be the set of all positive integers that are divisible by 2. Then $A \subseteq B$. The universal quantifier is implicitly contained in the claim "$A \subseteq B$." This is because Definition 2.1 on page 16 contains an assertion of the form $\forall x \in A,\ x \in B$.
Proof: Let $n \in A$. Then (using the definition of divisible) we can write $n = 4k$ for some positive integer k. But then $n = 2 \cdot (2k)$ and so n is divisible by 2. This means that $n \in B$. No special properties of n were used (other than its divisibility by 4), so n really does represent any/every element of A. Thus, using the definition of subset, $A \subseteq B$ must be true. □ ■
Up to this point we have always assumed that the implication you are trying to prove actually is true. Research mathematicians do not always have the luxury of knowing for sure that an implication is true (until they prove it is, which is one of the main reasons proofs exist). Often students do not have this luxury either.
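The construction in the choose-method example can be spot-checked with exact rational arithmetic. The sketch below assumes the proposition reads $x = \frac{2y}{y+1}$ and that the constructed value is $y = -\frac{x}{x-2}$; the function names and the sample values are illustrative only. Passing the check for a handful of values is not a proof; the point of the choose method is that the algebra works for a generic $x > 2$.

```python
from fractions import Fraction


def candidate_y(x):
    """The y suggested by the preliminary analysis; requires x > 2."""
    return -x / (x - 2)


def satisfies_proposition(x):
    """Check that y < 0 and x == 2y / (y + 1) for the constructed y."""
    y = candidate_y(x)
    return y < 0 and x == 2 * y / (y + 1)


# Exact rationals avoid floating-point noise in the equality test.
samples = [Fraction(201, 100), Fraction(5, 2), Fraction(3), Fraction(4), Fraction(1000)]
print(all(satisfies_proposition(x) for x in samples))
```

Note that `candidate_y(Fraction(4))` gives -2, the single value used in the incorrect proof; the difference is that the formula produces a suitable y for every sample, not just for x = 4.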
After spending time unsuccessfully trying to prove a theorem, you might start doubting its truth. What can you do then? You try to prove its negation! If the theorem contains a universal quantifier, the negation is very nice. For example,
Proposition For all objects A having property B, C happens.
Negation of the Proposition There is an object A having property B such that C does not happen.²⁸
A proof of the negation would seek to find an object x having property B for which C does not hold. The object x so constructed is usually called a counterexample to the original (false) proposition. To disprove an implication containing a universal quantifier in the hypothesis, it suffices to find just one counterexample. For example,
Proposition All functions that are defined at a real number c are differentiable at c.
Counterexample Let $f(x) = |x - c|$. This function is not differentiable at c. □
Consider the following general observations regarding implications involving universal quantifiers:
• It is a big task to prove such an implication is true: You must verify its truth for every possible value in the domain of the quantifier. This often requires you to verify the implication for an infinite number of values. The choose method makes this manageable.
• It is generally easier to²⁹ show such an implication is false; one single counterexample is all that is necessary.
• Universally quantified implications in a form similar to "$\forall n \in \mathbb{N},\ p(n)$" are often proved using mathematical induction, which will be introduced in Section 3.3.
Quick Check 3.5
1. Prove that every nonzero rational number has a multiplicative inverse. (You might want to review Definition 3.5 and the field properties in Appendix A.3.)
3.2.8 Proofs Involving the Biconditional and Logical Equivalence
Many theorems are stated using a biconditional: $A \leftrightarrow B$. For example, "A real number is irrational if and only if its decimal expansion is infinite and nonrepeating." Recalling the biconditional logical equivalence, $(P \leftrightarrow Q) \Leftrightarrow [(P \to Q) \land (Q \to P)]$, it should be clear that we can prove the biconditional $A \leftrightarrow B$ by proving both $A \to B$ and $B \to A$. In the preceding example, we would prove "if $r \in \mathbb{R}$ is irrational, then its decimal expansion is infinite and nonrepeating," and also prove "if $r \in \mathbb{R}$ has an infinite and nonrepeating decimal expansion, then r is irrational." The desired biconditional would then have been established.
Some theorems are presented as a collection of mutual equivalences:
Theorem The following statements are equivalent:
A
B
C
and so on.
²⁸ Recall the logical equivalence $\neg[\forall x,\ p(x)] \Leftrightarrow [\exists x,\ \neg p(x)]$.
²⁹ The situation is similar to advocating ideas in a public forum. It is easier to stand on the sidelines and look for flaws (counterexamples) in the speaker's presentation than it is to be the speaker and carefully construct the ideas.
Figure 3.2. $\{A \to B,\ B \to C,\ C \to A\}$.
Figure 3.3. $\{A \to B,\ B \to A,\ A \to C,\ C \to A\}$.
Any proof will consist of proving a series of implications. The critical requirement is that we prove a sufficient number of the implications so that we can "travel" from any statement (A, B, C, etc.) to any other by following the implications. One way to represent this is to produce a diagram containing the statements A, B, C, and so on. An arrow is added for each implication proved; the tail of the arrow leaving the hypothesis, the head of the arrow pointing to the conclusion. If a path exists from any statement to any other statement, then the collection of implications is sufficient to establish the theorem. Two of the possible collections of implications for a theorem with three statements are $\{A \to B,\ B \to C,\ C \to A\}$ and $\{A \to B,\ B \to A,\ A \to C,\ C \to A\}$. The respective diagrams are shown in Figures 3.2 and 3.3. The actual set of implications used (within the constraints mentioned previously) is a matter of convenience. Choose implications that can be proved easily. The process that has just been outlined can be used to prove the following proposition.
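The "travel from any statement to any other" requirement is a reachability condition on a directed graph, so it can be checked mechanically. The sketch below is one way to do that; the function names and the representation of an implication as a (hypothesis, conclusion) pair are assumptions made for illustration.

```python
def reachable(start, implications):
    """Return the set of statements reachable from start by following arrows."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for hypothesis, conclusion in implications:
            if hypothesis == node and conclusion not in seen:
                seen.add(conclusion)
                stack.append(conclusion)
    return seen


def establishes_equivalence(statements, implications):
    """True when every statement can reach every other statement."""
    return all(reachable(s, implications) == set(statements) for s in statements)


cycle = [("A", "B"), ("B", "C"), ("C", "A")]            # the first collection
hub = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A")]  # the second collection
chain = [("A", "B"), ("B", "C")]                        # not enough: no way back to A

print(establishes_equivalence("ABC", cycle))
print(establishes_equivalence("ABC", hub))
print(establishes_equivalence("ABC", chain))
```

The first two collections pass and the chain fails, because nothing proved in the chain leads from C back to A or B.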
PROPOSITION 3.8
Let a and b be any two distinct real numbers. Then the following are equivalent: [...]
2 → 3 If $a <$ [...]
21. Let a, b, and c form a primitive Pythagorean triple with a being odd and $a^2 + b^2 = c^2$. Then $a^2 = c^2 - b^2 = (c + b)(c - b)$.
(a) Use a proof by contradiction to show that $c + b$ and $c - b$ have no common prime factors.
(b) Now prove (using any strategy) that $c + b$ and $c - b$ are both squares.
Proof by Cases
22. Use a proof by cases to show that if $n \in \mathbb{Z}$, then $n^2 + n$ is even.
23. Prove: If n is an integer and $3 \mid n^2$, then $3 \mid n$. (Hint: Look at remainders mod 3.)
24. Prove that for all real numbers x and y, $|x + y| \le |x| + |y|$. (Hint: Use four cases.)
[...] a if $a \ge b$; b if $a < b$ [...]
34. Prove: If A and B are finite sets with $A \subseteq B$ and $|A| < |B|$, then there exists an element $x \in B$ such that $x \notin A$.
Implication with Universal Quantifiers
35. Let f and g be differentiable, real-valued functions. Find a counterexample to the assertion $(fg)'(x) = f'(x) \cdot g'(x)$. For extra credit, find an example where the assertion is true.
36. Find a counterexample to the "Freshman Theorem": $(a + b)^n = a^n + b^n$, where $n \ge 2$ and a and b are any real numbers.
37. Use the choose method to prove: "If p is a prime with $p > 2$, then $p + 1$ is not prime." (Hint: Use the result of Exercise 14.)
38. Prove or find a counterexample: Let a be a positive integer and let b, c be integers. If $a \mid (bc)$, then either $a \mid b$ or $a \mid c$ (or both).
39. Exercise 22 of Exercises 2.5.3 shows that $x + a = b$ cannot always be solved uniquely for x if x, a, and b are elements of a Boolean algebra. Use the choose method to prove the following. "Let a and b be elements of a Boolean algebra, B. Then $\forall x \in B,\ [(x + a = b) \to (x \cdot \bar{a} = b \cdot \bar{a})]$."
Proofs Involving Equivalence
a if a [...]
[...] "$\forall i \ge i_0,\ P(i)$" [i.e., $P(i)$ is true for all $i \in \mathbb{N}$ with $i \ge i_0$] is usually proved using mathematical induction. A familiar example is the formula
$P(n): 1 + 2 + 3 + 4 + \cdots + n = \sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ for $n \ge 1$.
Mathematical induction is one of the most commonly used proof techniques in discrete mathematics. As will be seen, an amazing variety of claims can be validated using this technique.
3.3.1 The Principle of Mathematical Induction
An inductive proof is done in two steps. First we show that the statement $P(1)$ is true. This is often called the base step. Then we prove the implication $[P(i) \to P(i + 1)]$ [read this as "if $P(i)$ is true, then $P(i + 1)$ is also true"]. This is called the inductive step. These two conditions are enough to conclude that $P(k)$ is true for all positive integers k. The reason is related to making domino chains. The inductive step verifies that each domino will knock the next one down if it falls. The base step knocks the first domino over. The inevitable result is that all the dominos fall. This should remind you of modus ponens (Chapter 2): $[P \land (P \to Q)] \to Q$.
Theorem of Mathematical Induction
If $\{P(i)\}$ is a set of statements such that
1. $P(1)$ is true and
2. $P(i) \to P(i + 1)$ for $i \ge 1$,
then $P(k)$ is true for all positive integers k. This can be stated more succinctly as
$[P(1) \land (\forall i,\ P(i) \to P(i + 1))] \to [\forall k,\ P(k)]$.
Proof: This theorem follows directly from the well-ordering principle. Assume that we have verified that $P(1)$ is true and that for all $i \ge 1$, the truth of $P(i)$ implies that $P(i + 1)$ is also true. Let S be the set of all integers $n \ge 1$ such that $P(n)$ is true, and let T be the set of all integers $k \ge 1$ such that $P(k)$ is false. Suppose that T is not empty. Then by the well-ordering principle, T must have a smallest element, $k_0$. Since $k_0 > 1$, we know that $k_0 - 1 \in S$. This means that $P(k_0 - 1)$ is true and since $k_0 - 1 \ge 1$, we also know that $P(k_0 - 1) \to P(k_0)$. Using modus ponens,³³ we conclude that $P(k_0)$ is true, contradicting the assumption that $k_0 \in T$. The trouble started with the assumption that T is not empty. Therefore, $T = \emptyset$ and $S = \{n \in \mathbb{N} \mid n \ge 1\}$; $P(n)$ is true for all positive integers n. □
³³ $[P(k_0 - 1) \land (P(k_0 - 1) \to P(k_0))] \to P(k_0)$
It should be clear that in step one of Theorem 3.5 there is nothing special about the integer 1. We could have started by showing that $P(0)$ is true and ended by concluding that $P(k)$ is true for all nonnegative integers k. Sometimes we may have a set of statements that is true for $k > 2$. We would then start our inductive proof by using $P(3)$ in step 1. The principle remains the same. You may recall from a previous course that step 2 is the messy step. We must keep the form of $P(i + 1)$ in mind as we transform $P(i)$. In essence, step 2 is a proof. While doing the proof in step 2, you must not lose sight of the major strategy embodied in the theorem of mathematical induction. The version of mathematical induction described in Theorem 3.5 is also called finite induction or weak induction. The primary alternative version (complete induction) will be introduced in Section 3.3.2. Mathematical induction is illustrated in the next example.
An Informal Example
We can use mathematical induction to show that $2 + 4 + 6 + \cdots + 2(n - 1) + 2n = n(n + 1)$ for all positive integers n.
Proof: As an aid to discussion, let $P(i)$ be the statement $2 + \cdots + 2i = i(i + 1)$. When $i = 1$, the statement $P(1)$ becomes $2 = 1(1 + 1)$, which is obviously true. Assume that $P(i)$ is true. We must prove the implication $[P(i) \to P(i + 1)]$, that is, $[2 + \cdots + 2i = i(i + 1)] \to [2 + \cdots + 2(i + 1) = (i + 1)(i + 2)]$. We may add $2(i + 1)$ to both sides of $P(i)$ to obtain
$2 + \cdots + 2i + 2(i + 1) = i(i + 1) + 2(i + 1)$.
The left-hand side of this equation is the same as the left-hand side of $P(i + 1)$. If we can show that $i(i + 1) + 2(i + 1) = (i + 1)(i + 2)$, then the truth of $P(i + 1)$ will follow. Simple algebra suffices to establish the previous equality. We have shown that $P(1)$ is true and that $P(i) \to P(i + 1)$. From the theorem of mathematical induction, we conclude that $P(n)$ is true for all positive integers n. □ ■
While working on the inductive step, you may often encounter an algebraic expression that you hope to show equals another expression. For example, you may wish to show that $(n + 1)^2 + \frac{(n+1)(n+2)}{2} = \frac{3n^2 + 7n + 4}{2}$. A common (but unacceptable) approach is to work in a two-column format, transforming each side independently until you reach a line where the two sides are equal. The example just presented might be manipulated in this manner:
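Don't do this!
$(n+1)^2 + \frac{(n+1)(n+2)}{2} \overset{?}{=} \frac{3n^2 + 7n + 4}{2}$
$\frac{2(n+1)^2}{2} + \frac{(n+1)(n+2)}{2} \overset{?}{=} \frac{(n+1)(3n+4)}{2}$
$\frac{(n+1)[2(n+1) + (n+2)]}{2} \overset{?}{=} \frac{(n+1)(3n+4)}{2}$
$\frac{(n+1)(3n+4)}{2} \overset{\checkmark}{=} \frac{(n+1)(3n+4)}{2}$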
In essence, we are manipulating both sides of an equation that we are not yet sure is valid. At the bottom we find some statement that we know is true and then wave our hands and say that we could reverse the sequence and have a valid argument. Many students forget to place the question marks above the equal signs, making a claim of equality that cannot be supported at that point in the presentation. It would be better to present the valid sequence. An easy way to achieve this goal is to write the sequence of steps with the question marks over the "=" signs on scratch paper. For the formal
3.3 Mathematical Induction
presentation you wish to hand in, start at the top left, move down to the bottom, then start back up the right-hand side. The manipulations for the example can be written as
Do this:
$(n+1)^2 + \frac{(n+1)(n+2)}{2} = \frac{2(n+1)^2}{2} + \frac{(n+1)(n+2)}{2}$
$= \frac{(n+1)[2(n+1) + (n+2)]}{2}$
$= \frac{(n+1)(3n+4)}{2}$
$= \frac{3n^2 + 7n + 4}{2}$
This approach is illustrated in the proof of the next theorem. Notice the linear style: Start with one side of the claim $P(i + 1)$ and work through a sequence of known equalities that terminates with the other side of $P(i + 1)$. Notice also the use of the colon. It is incorrect to write $P(i) = 1 + 2 + \cdots + i = \frac{i(i+1)}{2}$. In this example, $P(i)$ is the entire equality, not just the left or the right side. The colon indicates that $P(i)$ is the equation that follows.
Sum of the First n Positive Integers
$\sum_{j=1}^{n} j = \frac{n(n+1)}{2}$ for all positive integers, n, with $n \ge 1$.
Proof: Let $P(i)$ be the claim³⁴ $\sum_{j=1}^{i} j = \frac{i(i+1)}{2}$.
Base Step We need to show $P(1)$: $\sum_{j=1}^{1} j = \frac{1(1+1)}{2}$. But clearly, $\sum_{j=1}^{1} j = 1 = \frac{1(1+1)}{2}$.
Inductive Step We will assume that $P(i)$ is true and show that $P(i + 1)$ must also be true under this assumption. [The assumption that $P(i)$ is true is called the inductive hypothesis.] Because we are assuming that $P(i)$ is true, we can use the equation $\sum_{j=1}^{i} j = \frac{i(i+1)}{2}$ later in the inductive step. [Remember, we only need to show "if $P(i)$ is true, then $P(i + 1)$ is true."] It will be helpful to write down what we want to show (so that we will recognize it when it appears later). However, we must be careful not to try and use this equation, because we don't yet know that it is true.
$P(i+1)$: $\sum_{j=1}^{i+1} j = \frac{(i+1)(i+2)}{2}$
The sequence of equalities that follows starts with the left-hand side of $P(i + 1)$ and ends with the right-hand side. At each step, the equalities are valid. Since the assumed truth of $P(i)$ is used in the sequence, $P(i + 1)$ must be true if $P(i)$ is true. That is all we need to do to establish the inductive step.
$\sum_{j=1}^{i+1} j = \left(\sum_{j=1}^{i} j\right) + (i + 1)$   isolate the final number in the sum
$= \frac{i(i+1)}{2} + (i + 1)$   by the inductive hypothesis
$= \frac{i(i+1)}{2} + \frac{2(i+1)}{2}$
$= \frac{i(i+1) + 2(i+1)}{2}$
$= \frac{(i+1)(i+2)}{2}$
³⁴ See Appendix B for a review of summation notation.
Conclusion We now know that $P(1)$ is true and that $P(i + 1)$ is true whenever $P(i)$ is true. The theorem of mathematical induction implies that $P(n)$ is true for all $n \ge 1$. □
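The base step and the inductive step of this proof can each be phrased as a small computation. The sketch below uses hypothetical function names and checks the inductive step only for finitely many i, so it illustrates the structure of the argument rather than replacing it. The inductive-step check deliberately starts from the closed form for i (the inductive hypothesis) and adds the final term i + 1, as in the sequence of equalities in the proof.

```python
def closed_form(n):
    """Right-hand side of P(n): n(n + 1) / 2 (always an integer)."""
    return n * (n + 1) // 2


def base_step():
    """P(1): the sum of the first positive integer equals 1(1 + 1)/2."""
    return sum(range(1, 2)) == closed_form(1)


def inductive_step(i):
    """Assuming P(i), adding the term i + 1 must give the right side of P(i + 1)."""
    return closed_form(i) + (i + 1) == closed_form(i + 1)


print(base_step(), all(inductive_step(i) for i in range(1, 1000)))
```

A direct comparison such as `sum(range(1, n + 1)) == closed_form(n)` tests individual values of n; the inductive-step function instead tests the link between consecutive values, which is what the theorem of mathematical induction needs.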
A Formal Example
This example will follow the preferred method of presentation shown in the proof of Theorem 3.6. However, most of the pedagogical comments will be omitted. This example should show the general pattern to follow when doing an induction proof. As a student handing a proof in for grading, you should be fairly complete and detailed (as opposed to a researcher or writer of an upper-division or graduate level textbook).
Show that $2^n < n!$ for all integers, n, with $n \ge 4$.³⁵
Proof: Let $P(i): 2^i < i!$.
Base Step Since $2^4 = 16 < 24 = 4!$, $P(4)$ is true.
Inductive Step Assume that $P(i)$ is true. We want to show that $P(i + 1): 2^{i+1} < (i + 1)!$ must then also be true.
$2^{i+1} = 2 \cdot 2^i$
$< 2 \cdot i!$   by the inductive hypothesis
$< (i + 1) \cdot i!$   because $i \ge 4$
$= (i + 1)!$
Conclusion Since $P(4)$ is true and $P(i) \to P(i + 1)$, $P(n)$ is true for all integers, n, with $n \ge 4$ by the theorem of mathematical induction.
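A quick computation shows why the base step of this example starts at 4 rather than at 1. This is a sketch with a hypothetical helper `p`; it evaluates the statement $2^n < n!$ for small n and then for a finite range beyond the base case.

```python
from math import factorial


def p(n):
    """The statement P(n): 2^n < n!."""
    return 2**n < factorial(n)


# P(n) is false for n = 0, 1, 2, 3 and true from n = 4 onward.
print([p(n) for n in range(0, 8)])
print(all(p(n) for n in range(4, 200)))
```

The failures at n = 1, 2, 3 show that the choice of starting value is part of the claim: the inductive step uses $2 < i + 1$, which holds for every $i \ge 2$, but without a true base case nothing follows from it.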
1. Let $P(i): 1 + 3 + \cdots + (2i - 1) = i^2$. Then $P(i + 1)$ is the equality
(a) $1 + 3 + \cdots + (2i - 1) = (i + 1)^2$
(b) $1 + 3 + \cdots + (2i + 1) = i^2$
(c) $1 + 3 + \cdots + (2i + 1) = (i + 1)^2$
(d) $1 + 3 + \cdots + (2i - 1) = i^2$
2. Use mathematical induction to show that the sum of the first n odd integers is $n^2$: $1 + 3 + \cdots + (2n - 1) = \sum_{i=1}^{n} (2i - 1) = n^2$ for all integers, n, with $n \ge 1$.
3. You are familiar with the distributive property of the real numbers: $a(b + c) = ab + ac$.³⁶ Use mathematical induction to prove the generalized distributive property: $a \sum_{i=1}^{n} b_i = \sum_{i=1}^{n} a b_i$ for $a, b_i \in \mathbb{R}$ and $n \in \mathbb{N}$, $n \ge 2$.
³⁵ Recall that $n! = n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1$, for positive integers n and is pronounced "n factorial." Also, $0! = 1$ by definition.
³⁶ See Appendix A.3 for more details.
Note: The key idea behind mathematical induction is the proof that the implication $P(i) \to P(i + 1)$ is true. You should think of this implication in fairly generic terms:
"If P is true at some integer, then it is also true at the next integer." Thus, we can write the implication as $P(k) \to P(k + 1)$ or $P(n) \to P(n + 1)$ or even as $P(i - 1) \to P(i)$. All represent the same implication. To complete the hypotheses of the theorem of mathematical induction, we need a starting value, $i_0$, where we know $P(i_0)$ is true (most commonly, $i_0 = 1$). Recall that the implication $P(i) \to P(i + 1)$ may be true, but $P(i)$ may be false. If we cannot find an integer $i_0$ for which $P(i_0)$ is true, then the conclusion that $P(n)$ is true for all $n \ge i_0$ would be false. Thus, the base step is essential (even though it is often trivial to verify).
Failed Inductions
Suppose we want to show $\forall a \in \mathbb{R},\ \forall n \in \mathbb{N},\ [((a > 1) \land (n \ge 0)) \to (a^n = 1)]$. This claim can be informally stated as "show $a^n = 1$ for $a > 1$ and $n \ge 0$." We can choose a to be any real number with $a > 1$ and attempt a proof by mathematical induction on n for the assertion: $\forall n \in \mathbb{N},\ [(n \ge 0) \to (a^n = 1)]$. If no special properties (other than $a > 1$) are used, this will establish the original assertion. We can easily verify the base step ($n = 0$) for this claim since $a^0 = 1$ is true for any $a \in \mathbb{R}$ with $a > 1$ (in fact, any $a \in \mathbb{R}$ with $a \ne 0$). What we cannot verify in this case is the implication $P(0) \to P(1)$. For example, there is no valid way to use the assertion $2^0 = 1$ to conclude that $2^1 = 1$.
It is also possible to examine a situation where the inductive step can be completed for all n, but for which the base step fails. To that end, consider the claim that for all positive integers, n, $\sum_{i=1}^{n} 2i = (n - 1)(n + 2)$. Suppose the sum of the first n even positive integers is $(n - 1)(n + 2)$. Then
$\sum_{i=1}^{n+1} 2i = 2(n + 1) + \sum_{i=1}^{n} 2i = 2(n + 1) + (n - 1)(n + 2) = n(n + 3)$.
Therefore, if the claim properly calculates the sum of the first n even positive integers, then it also properly calculates the sum of the first $n + 1$ positive even integers. Notice, however, that the claim is wrong for every positive integer:
$\sum_{i=1}^{n} 2i = 2 \sum_{i=1}^{n} i = 2 \cdot \frac{n(n+1)}{2} = n(n + 1)$
and $(n - 1)(n + 2) = n(n + 1)$ implies $-2 = 0$. ■
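The second failed induction can be reproduced numerically: the inductive step goes through for every n tested, yet the claimed formula never matches the actual sum. The function names below are illustrative, and the ranges are finite samples.

```python
def claimed(n):
    """The (false) claimed value of 2 + 4 + ... + 2n."""
    return (n - 1) * (n + 2)


def actual(n):
    """The sum 2 + 4 + ... + 2n computed directly."""
    return sum(2 * i for i in range(1, n + 1))


def inductive_step(n):
    """If the claim held at n, adding 2(n + 1) would give the claim at n + 1."""
    return 2 * (n + 1) + claimed(n) == claimed(n + 1)


print(all(inductive_step(n) for n in range(1, 1000)))     # the step always works
print(any(actual(n) == claimed(n) for n in range(1, 1000)))  # but no base case exists
print({actual(n) - claimed(n) for n in range(1, 1000)})   # the constant gap
```

The constant gap of 2 between the true sum and the claimed value is the "$-2 = 0$" contradiction in the text: an implication chain with no true starting point proves nothing.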
You may be wondering why we should go to all this trouble to prove formally that a mathematical pattern is true for all n greater than some initial value. Wouldn't it be sufficient just to verify that the pattern holds for 5 or 6 values (perhaps even 9 or 10 if you want to be really cautious)? The following example vividly illustrates why this procedure is inadequate.
Euler's Pretty Good Prime Function
It has been a goal of mathematicians for a very long time to produce an algorithm or a function that produces all prime numbers. A more modest version of this problem is to produce an algorithm or a function that produces an infinite number of primes. It may skip some primes, but it should never produce a value that isn't prime. Here is a very simple function discovered by Leonhard Euler.
$ep(n) = n^2 - n + 41$
The following table shows the first 42 values of the function.
n        0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
ep(n)   41   41   43   47   53   61   71   83   97  113  131  151  173  197  223  251

n       16   17   18   19   20   21   22   23   24   25   26   27   28   29
ep(n)  281  313  347  383  421  461  503  547  593  641  691  743  797  853

n       30   31    32    33    34    35    36    37    38    39    40    41
ep(n)  911  971  1033  1097  1163  1231  1301  1373  1447  1523  1601  1681
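The table can be regenerated, and each entry tested for primality, with a few lines of code. This is a minimal sketch: `is_prime` is a simple trial-division helper written for this illustration, not an efficient primality test.

```python
def ep(n):
    """Euler's function ep(n) = n^2 - n + 41."""
    return n * n - n + 41


def is_prime(m):
    """Trial division; adequate for the small values in the table."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


# Smallest n >= 0 for which ep(n) is not prime.
first_failure = next(n for n in range(0, 100) if not is_prime(ep(n)))
print(first_failure, ep(first_failure))
```

Checking 41 consecutive values by hand would convince most people that the pattern is real, which is the trap the example is designed to expose.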
This marvelous function actually does produce a prime for the first 41 values ($n = 0, \ldots, 40$). However, $ep(41) = 1681 = 41^2$, so the pattern fails. Even a conservative "pattern tester" would probably give up checking the pattern before arriving at the problematic 42nd value. ■
Even if you verify some mathematical pattern for a million different cases, there is still an infinite number of cases left to check. A proof using mathematical induction can provide certainty that the pattern always holds, and it can be done in a finite amount of time. Where do the formulas that are typically proved by induction come from? They are usually first found by someone looking for a pattern. Once a sufficient number of examples is available, it is possible to make an educated guess at the correct pattern. Once such a guess has been made, it can be compared to a few new examples. If it continues to hold, it is worth attempting a proof by mathematical induction. If the proof succeeds, the guess has been validated. If not, there are two explanations for the failure. One possibility is that the guess was incorrect. Additional examples may produce a case for which the pattern fails (see Example 3.22). The other possibility is that there is a mistake in the attempted proof.
Guessing a Pattern
I wanted to find a nice formula for the partial sums
$S_n = 1 - \frac{1}{2} + \frac{1}{2^2} - \frac{1}{2^3} + \cdots + \frac{(-1)^n}{2^n} = \sum_{k=0}^{n} \left(-\frac{1}{2}\right)^k$.
I started by calculating $S_n$ for a few small values of n.
n      1     2     3     4      5
S_n   1/2   3/4   5/8   11/16  21/32
What patterns did I observe? A bit of thought made it clear (and, after the fact, pretty obvious³⁷) that the denominator of $S_n$ is $2^n$. What pattern emerged in the numerators? The sequence of numerators (so far) was 1 3 5 11 21. One thing I noticed (after a while) was that starting with the 5, each new entry equals the previous entry plus twice the entry before that.³⁸ However, it was not clear³⁹ how to make use of that information, so I looked for another pattern. I noticed next that the nth numerator (starting with $n = 2$) seemed to follow the pattern: numerator $= n(n - 1) \pm 1$. However, extending the table one more entry showed that this pattern fails.
n      1     2     3     4      5      6
S_n   1/2   3/4   5/8   11/16  21/32  43/64
³⁷ Just put all the fractions in the sum over a common denominator.
³⁸ I arrived at this by looking at the differences between successive numerators. For example, $21 - 11 = 10 = 2 \cdot 5$.
³⁹ It will be clear after looking at Chapter 7.
The pattern seemed elusive. Instinct caused me to look for patterns that might involve powers of 2. However, the numbers don't appear to be related to powers of 2. After a while, I decided to multiply all the numerators by 3: 3 9 15 33 63 129. This was interesting. Compare these values with the powers of 2.
n               1   2   3    4    5    6
3 · numerator   3   9   15   33   63   129
2^(n+1)         4   8   16   32   64   128
A good guess seemed to be that the nth modified numerator is equal to $2^{n+1} + (-1)^n$ (subtract 1 for odd n and add 1 for even n). The real nth numerator would then be $\frac{1}{3}\left(2^{n+1} + (-1)^n\right)$. Since the denominator appeared to be $2^n$, the value of $S_n$ (if this were the correct pattern) would simplify to
$\frac{1}{2^n} \cdot \frac{1}{3}\left(2^{n+1} + (-1)^n\right) = \frac{2}{3} + \frac{1}{3}\left(-\frac{1}{2}\right)^n$.
It was time to do one last experiment before proceeding to a proof: See what happens for n = 0, 7, 8.
n                        0   1     2     3     4      5      6      7       8
S_n                      1   1/2   3/4   5/8   11/16  21/32  43/64  85/128  171/256
2/3 + (1/3)(-1/2)^n      1   1/2   3/4   5/8   11/16  21/32  43/64  85/128  171/256
The result indicated that it was time for an attempt at a formal proof using mathematical induction. That is, I predicted
$\sum_{k=0}^{n} \left(-\frac{1}{2}\right)^k = \frac{2}{3} + \frac{1}{3}\left(-\frac{1}{2}\right)^n$ for $n \ge 0$.
The formal proof provides an opportunity for you to practice mathematical induction. ■
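The experiments in this example are easy to repeat with exact fractions. The sketch below assumes the partial sums are $S_n = \sum_{k=0}^{n} (-1/2)^k$, which matches the tabulated values; the function names are illustrative. It compares the sums against the guessed closed form and against the earlier guess for the numerators that broke at n = 6.

```python
from fractions import Fraction


def partial_sum(n):
    """S_n = sum of (-1/2)^k for k = 0..n, computed exactly."""
    return sum(Fraction(-1, 2) ** k for k in range(n + 1))


def guess(n):
    """The conjectured closed form 2/3 + (1/3)(-1/2)^n."""
    return Fraction(2, 3) + Fraction(1, 3) * Fraction(-1, 2) ** n


print([str(partial_sum(n)) for n in range(0, 9)])
print(all(partial_sum(n) == guess(n) for n in range(0, 50)))
# The abandoned guess "numerator = n(n - 1) +/- 1" fails at n = 6:
print(partial_sum(6).numerator, 6 * 5 - 1, 6 * 5 + 1)
```

Agreement on fifty values makes the conjecture worth proving, but as the prime-function example shows, it does not settle the matter; the induction proof does.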
The next example demonstrates mathematical induction in conjunction with products. Some (possibly) new notation needs to be introduced. Recall that
$\sum_{i=0}^{n} a_i$
is shorthand for the sum $a_0 + a_1 + \cdots + a_n$. In a similar fashion, the expression
$\prod_{i=0}^{n} a_i$
is used to represent the product $a_0 \cdot a_1 \cdots a_n$.
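Induction with a Product
Let $n \in \mathbb{N}$. Mathematical induction can be used to show that
$\prod_{k=1}^{n} \left(1 - \frac{1}{2^k}\right) \ge \frac{1}{4} + \frac{1}{2^{n+1}} \quad \forall n \ge 1$.
Proof: Let $P(n)$ be the claim
$\prod_{k=1}^{n} \left(1 - \frac{1}{2^k}\right) \ge \frac{1}{4} + \frac{1}{2^{n+1}}$.
Base Step $n = 1$. $P(1)$ is true because
$\prod_{k=1}^{1} \left(1 - \frac{1}{2^k}\right) = \frac{1}{2} \ge \frac{1}{4} + \frac{1}{4}$.
Inductive Step Suppose
$\prod_{k=1}^{n} \left(1 - \frac{1}{2^k}\right) \ge \frac{1}{4} + \frac{1}{2^{n+1}}$
is true for some $n \ge 1$. Then
$\prod_{k=1}^{n+1} \left(1 - \frac{1}{2^k}\right) = \left(1 - \frac{1}{2^{n+1}}\right) \cdot \prod_{k=1}^{n} \left(1 - \frac{1}{2^k}\right)$
$\ge \left(1 - \frac{1}{2^{n+1}}\right) \left(\frac{1}{4} + \frac{1}{2^{n+1}}\right)$   by the inductive hypothesis
$= \frac{1}{4} + \frac{1}{2^{n+1}} - \frac{1}{2^{n+3}} - \frac{1}{2^{2n+2}}$
$= \frac{1}{4} + \frac{1}{2^{n+2}} + \left(\frac{1}{2^{n+3}} - \frac{1}{2^{2n+2}}\right)$
$= \frac{1}{4} + \frac{1}{2^{n+2}} + \frac{1}{2^{n+3}} \left(1 - \frac{1}{2^{n-1}}\right)$
$\ge \frac{1}{4} + \frac{1}{2^{n+2}}$.
Hence, $P(n + 1)$ is also true.
Conclusion Since $P(1)$ is true, and whenever $P(n)$ is true, $P(n + 1)$ is also true, the theorem of mathematical induction implies that $P(n)$ is true for all $n \ge 1$. □ ■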
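A numerical check of a product inequality of this kind is straightforward with exact fractions. The sketch below assumes the claim has the form $\prod_{k=1}^{n} (1 - 2^{-k}) \ge \frac{1}{4} + 2^{-(n+1)}$; that reading of the statement, like the function names, is an assumption of this illustration. It tests a finite range of n only.

```python
from fractions import Fraction


def product(n):
    """Exact value of the product of (1 - 1/2^k) for k = 1..n."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= 1 - Fraction(1, 2**k)
    return result


def bound(n):
    """The assumed lower bound 1/4 + 1/2^(n+1)."""
    return Fraction(1, 4) + Fraction(1, 2 ** (n + 1))


print(product(1), bound(1))   # equality in the base step
print(all(product(n) >= bound(n) for n in range(1, 60)))
```

Under this reading the bound is attained exactly at n = 1 and n = 2, so the inequality cannot be made strict in the base step.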
3.3.2 Complete Induction
Sometimes the inductive step is very hard to prove. Knowing that $P(i)$ is true may not be sufficient to show easily that $P(i + 1)$ is true. There is a second (but equivalent) form of mathematical induction that may work.
Theorem of Complete Induction
If $\{P(i)\}$ is a set of statements such that
1. $P(1)$ is true and
2. $[P(1) \land P(2) \land \cdots \land P(i)] \to P(i + 1)$ for $i \ge 1$
then $P(k)$ is true for all positive integers k. This can be stated more succinctly as
$[P(1) \land (\forall i,\ [P(1) \land P(2) \land \cdots \land P(i)] \to P(i + 1))] \to [\forall k,\ P(k)]$.
Proof: This theorem also follows directly from the well-ordering principle. The previous proof requires a few minor modifications. Assume that we have verified that $P(1)$ is true and that for all $i \ge 1$, the truth of all of $\{P(1), P(2), \ldots, P(i)\}$ implies that $P(i + 1)$ is also true. Let S be the set of all integers $n \ge 1$ such that $P(n)$ is true, and let T be the set of all integers $k \ge 1$ such that $P(k)$ is false. Suppose that T is not empty. Then by the well-ordering principle, T must have a smallest element, $k_0$. Since $k_0 > 1$, we know that $k_0 - 1 \in S$. This means that $P(1), P(2), \ldots, P(k_0 - 1)$ are all true ($k_0$ is the smallest integer that isn't in S). Since $k_0 - 1 \ge 1$, we also know that $P(1) \land P(2) \land \cdots \land P(k_0 - 1) \to P(k_0)$. Using modus ponens,⁴⁰ we conclude that $P(k_0)$ is true, contradicting the assumption that $k_0 \in T$. The trouble started with the assumption that T is not empty. Therefore, $T = \emptyset$ and $S = \{n \in \mathbb{N} \mid n \ge 1\}$; $P(n)$ is true for all positive integers n. □
This should be even easier to use than the theorem of mathematical induction, because we are assuming more before attempting to reach the same conclusion. Complete induction is also called strong induction, because it uses a stronger hypothesis.
The Fundamental Theorem of Arithmetic: Part 1 of Proof
The fundamental theorem of arithmetic states that "every integer n, with $n \ge 2$, can be written uniquely as a product of primes in ascending order." Using complete induction, it is not hard to prove the first half: Every integer n, with $n \ge 2$, can be written as a product of primes. (The word uniquely has been dropped. That part of the theorem was proved in Section 3.2.9.)
Proof: Let $P(k)$ be the claim that k can be written as a product of primes.
Base Step The claim is clearly true for $k = 2$ since 2 is itself a prime (the smallest prime). Thus $P(2)$ is true.
Inductive Step Assume that all integers k for $k = 2, 3, \ldots, n$ can be written as a product of primes. That is, $P(2) \land P(3) \land \cdots \land P(n)$ is true.
Consider the integer $n + 1$. Either $n + 1$ is a prime [in which case, $P(n + 1)$ is true and the inductive step has been completed], or else $n + 1$ is composite. If $n + 1$ is composite, then $n + 1 = a \cdot b$, where $2 \le a \le n$ and $2 \le b \le n$. By the inductive hypothesis, we can write both a and b as products of primes. Suppose the products are $a = p_1 p_2 \cdots p_s$ and $b = q_1 q_2 \cdots q_t$. Then $n + 1 = p_1 p_2 \cdots p_s \cdot q_1 q_2 \cdots q_t$ is also a product of primes.
Conclusion It has been shown that $P(2)$ is true and also that whenever $P(2) \land P(3) \land \cdots \land P(n)$ is true, then $P(n + 1)$ is also true. The theorem of complete induction implies that $P(k)$ is true for all $k \ge 2$. □
You should spend a few minutes thinking about how you would prove this using only the theorem of mathematical induction. That is, assuming only that n is a product of primes, how would you show that $n + 1$ is also a product of primes? It is not easy. ■
Here is a simple example that illustrates the need for the base case in an induction proof.
⁴⁰ $[A \land (A \to B)] \to B$, where $A = P(1) \land P(2) \land \cdots \land P(k_0 - 1)$ and $B = P(k_0)$.
126
A Failed Induction, Revisited
Suppose we wish to prove that $a^n = 1$ for some $a > 1$ and for all $n \ge 1$ (rather than for $n \ge 0$, as in Example 3.21). This can be formally written as $\exists a \in \mathbb{R}, \forall n \in \mathbb{N}, [((a > 1) \land (n \ge 1)) \to (a^n = 1)]$. Let $P(i)$: $a^i = 1$ for $a > 1$. Ignoring the base case for the moment, notice that $P(1) \to P(2)$ is true.
Proof: Assuming that $P(1)$ is true, $a^2 = a \cdot a = 1 \cdot 1 = 1$, so $P(2)$ is also true.
$(P(1) \land P(2) \land \cdots \land P(i)) \to P(i+1)$ for $i > 1$ is true.
Proof: Assuming that $P(k)$ is true for $k = 1, 2, 3, \ldots, i$ and using complete induction, $a^{i+1} = a^1 \cdot a^i = 1 \cdot 1 = 1$, so $P(i+1)$ is true.
The inductive step has been completed without any problems. (In fact, it was not necessary to break it into two cases.) However, we know that the claim is false! We have failed to show that the base step can be completed. It cannot. The claim is false.
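The failure can also be seen numerically. The following is a minimal sketch, assuming the sample value a = 2 and the hypothetical helper names `P` and `inductive_step_holds`: every inductive implication holds (vacuously, because its hypothesis is false), yet the base case fails.

```python
def P(a, n):
    """The claim P(n): a**n == 1."""
    return a ** n == 1


def inductive_step_holds(a, i):
    """Evaluate (P(1) and ... and P(i)) -> P(i + 1) as a material implication."""
    hypothesis = all(P(a, k) for k in range(1, i + 1))
    return (not hypothesis) or P(a, i + 1)


a = 2  # sample value with a > 1 (an assumption made for this illustration)

# The implications hold only because the hypothesis P(1) is already false.
steps_ok = all(inductive_step_holds(a, i) for i in range(1, 20))
base_ok = P(a, 1)

print("inductive steps hold:", steps_ok)
print("base case holds:", base_ok)
```

A true implication with a false hypothesis proves nothing about its conclusion, which is why the base step cannot be skipped.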
Quick Check 3.7
1. This is not an Earth-shaking problem, but it is a good first exercise in complete induction. Jedediah and Ebenezum have just cooked a large bowl of popcorn. They are sitting in front of their television watching the test pattern. They alternate taking some popcorn out of the bowl. They always take at least one piece of popcorn but might take as much as a large handful. Prove that eventually the bowl will be empty.
2. Use complete induction to prove that every positive integer can be expressed as a sum of distinct powers of 2. For example, $7 = 2^2 + 2^1 + 2^0$ and $11 = 2^3 + 2^1 + 2^0$. Notice that $4 = 2^1 + 2^1$ is not a sum of distinct powers of 2.
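Complete induction often corresponds to a recursive computation that may call itself on any smaller value, not just the immediate predecessor. The sketch below illustrates this for the second problem; the function name `distinct_powers_of_two` is hypothetical, and the code is an illustration of the idea rather than a substitute for the proof.

```python
def distinct_powers_of_two(n):
    """Return exponents e1 > e2 > ... such that n == 2**e1 + 2**e2 + ..."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    e = n.bit_length() - 1       # largest exponent with 2**e <= n
    remainder = n - 2 ** e       # satisfies 0 <= remainder < 2**e
    if remainder == 0:
        return [e]
    # The remainder is a smaller positive integer, so the "inductive
    # hypothesis" applies; its exponents are all below e, hence distinct.
    return [e] + distinct_powers_of_two(remainder)


for n in (7, 11, 4):
    exponents = distinct_powers_of_two(n)
    print(n, "=", " + ".join(f"2^{e}" for e in exponents))
```

The recursive call may land on any integer smaller than n, which is exactly the situation in which the stronger hypothesis of complete induction is convenient.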
3.3.3 Interesting Mathematical Induction Problems Mathematical induction is very useful. The following examples amply demonstrate the versatility of this technique. They also demonstrate that the inductive step does not follow any set pattern of algebraic manipulations. (In fact, the chess board and stable marriage examples do not involve any algebra!)
Geometric and Arithmetic Progressions
The two theorems listed here are both important enough that you should memorize them.
DEFINITION 3.16 Geometric Progression
A sequence is called a geometric progression if each term in the sequence (after the first) is a constant multiple of the previous term. Thus, if the terms are $\{a_i\}$ for $i = 0, 1, 2, 3, \ldots$, then $a_{i+1} = r a_i$ for some constant, $r$. Notice that we choose the values of $a_0$ and $r$. All other elements of the sequence are then unambiguously determined: $a_i = a_0 r^i$. An alternative name for a geometric progression is geometric sequence.
DEFINITION 3.17 Arithmetic Progression
A sequence is called an arithmetic progression if each term in the sequence (after the first) is obtained by adding a constant to the previous term. Thus, if the terms are $\{a_i\}$ for $i = 0, 1, 2, 3, \ldots$, then $a_{i+1} = a_i + d$ for some constant, $d$. Once we choose the values for $a_0$ and $d$, all other elements of the sequence are then unambiguously determined: $a_i = a_0 + i \cdot d$. An alternative name for an arithmetic progression is arithmetic sequence.
The sum of a geometric progression and the sum of an arithmetic progression are worth knowing. Alternative names for these sums are geometric series and arithmetic series, respectively.
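The two definitions can be checked against their closed forms with a short computation. This is a minimal sketch; the function names and the sample values of $a_0$, $r$, and $d$ are assumptions chosen for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def geometric_progression(a0, r, count):
    """Build the first `count` terms from the recurrence a_{i+1} = r * a_i."""
    terms = [a0]
    while len(terms) < count:
        terms.append(r * terms[-1])
    return terms


def arithmetic_progression(a0, d, count):
    """Build the first `count` terms from the recurrence a_{i+1} = a_i + d."""
    terms = [a0]
    while len(terms) < count:
        terms.append(terms[-1] + d)
    return terms


a0, r, d = Fraction(3), Fraction(1, 2), Fraction(5)  # arbitrary sample values
geo = geometric_progression(a0, r, 6)
ari = arithmetic_progression(a0, d, 6)

# Compare each term with the closed forms a_i = a0 * r**i and a_i = a0 + i*d.
geo_matches = all(term == a0 * r ** i for i, term in enumerate(geo))
ari_matches = all(term == a0 + i * d for i, term in enumerate(ari))
print(geo_matches, ari_matches)
```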
Partial Sum of a Geometric Progression
The sum of the first $n + 1$ elements of a geometric progression depends upon the value of $r \in \mathbb{R}$.
$$\sum_{i=0}^{n} r^i = \frac{1 - r^{n+1}}{1 - r} = \frac{r^{n+1} - 1}{r - 1} \quad \text{if } r \ne 1$$
$$\sum_{i=0}^{n} 1 = (n + 1) \quad \text{if } r = 1$$
Note: The first formula implicitly assumes that $0^0 = 1$. That is, when $r = 0$, we want $\sum_{i=0}^{n} 0^i = 0^0 + 0^1 + 0^2 + \cdots + 0^n = 1$. The problem arises because $0^0$ is generally regarded as indeterminate. That is, in different contexts, different values for the expression can be derived. Since $\lim_{x \to 0} x^x = 1$, the assumption $0^0 = 1$ is not without merit. The entire issue could be circumvented by writing the $r \ne 1$ equation as
$$1 + \sum_{i=1}^{n} r^i = \frac{1 - r^{n+1}}{1 - r} = \frac{r^{n+1} - 1}{r - 1} \quad \text{if } r \ne 1.$$
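The version in the theorem has the weight of tradition behind it.
Proof: The second case ($r = 1$) needs no further explanation at this point of your mathematics education. The case $r \ne 1$ can be proved by mathematical induction. Let $P(n)$ be the equation for the sum of the first $n + 1$ terms.
$$P(n): \quad \sum_{i=0}^{n} r^i = \frac{1 - r^{n+1}}{1 - r}$$
Base Step When $n = 0$, $\sum_{i=0}^{0} r^i = r^0 = 1$ and $\frac{1 - r^{1}}{1 - r} = 1$ since $r \ne 1$.
Inductive Step Assume that $P(n)$ is true for some $n \ge 0$. Then
$$\sum_{i=0}^{n+1} r^i = \frac{1 - r^{n+1}}{1 - r} + r^{n+1} = \frac{1 - r^{n+1} + r^{n+1} - r^{n+2}}{1 - r} = \frac{1 - r^{n+2}}{1 - r},$$
so $P(n+1)$ is true.
Conclusion The theorem of mathematical induction implies that $P(n)$ is true for all $n \ge 0$. □
Notice that the theorem could also be stated as
$$\sum_{i=0}^{n-1} r^i = \frac{1 - r^n}{1 - r}.$$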
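A quick numerical check of the partial-sum formula can build confidence before (not instead of) the induction proof. The sketch below uses hypothetical helper names and exact rational arithmetic; note that Python itself evaluates `0 ** 0` as 1, which matches the convention the theorem assumes.

```python
from fractions import Fraction


def partial_sum_direct(r, n):
    """Add r**0 + r**1 + ... + r**n term by term."""
    return sum(r ** i for i in range(n + 1))


def partial_sum_formula(r, n):
    """Closed form for the sum of the first n + 1 terms."""
    if r == 1:
        return n + 1
    return (1 - r ** (n + 1)) / (1 - r)


# Sample ratios, including the special cases r = 0 and r = 1.
samples = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(-3), Fraction(5, 3)]
agree = all(
    partial_sum_direct(r, n) == partial_sum_formula(r, n)
    for r in samples
    for n in range(10)
)
print("formula agrees with direct summation:", agree)
```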
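COROLLARY 3.2 Sum of a Geometric Progression
If $r \in \mathbb{R}$ and $|r| < 1$, then
$$\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}.$$
Proof: Since $|r| < 1$, the partial sum $S_n = \sum_{i=0}^{n} r^i$ can be found using Theorem 3.8. Then take the limit as $n \to \infty$:
$$\lim_{n \to \infty} S_n = \lim_{n \to \infty} \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r}$$
because $\lim_{n \to \infty} r^{n+1} = 0$ when $|r| < 1$.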
The theorem for the partial sum of an arithmetic progression will be left as an exercise.
Chess Boards
A standard chess board has 64 squares, arranged in an 8-by-8 grid. Since $8 = 2^3$, one possible generalization of a chess board is to have a $2^n$-by-$2^n$ grid. If we look only at such generalized chess boards, we can prove the following result.
Let $n \ge 1$. Suppose we have a $2^n$-by-$2^n$ chess board, with one square missing, and a box full of L-shaped tiles. Each tile can cover 3 squares on the chess board. No matter which square on the chess board is missing, we can entirely cover the remaining squares with the tiles.
Figure 3.4. An L-shaped tile.
Proof: The tiles look like the diagram in Figure 3.4. Base Step When n = 1, the chess board looks like Figure 3.5. No matter which square is removed, we can clearly orient the tile so that the remaining squares are covered.
Figure 3.5. The $2^1$-by-$2^1$ chess board.
Inductive Step Suppose that we can cover the remaining squares on any $2^n$-by-$2^n$ chess board with one square missing. Now consider a $2^{n+1}$-by-$2^{n+1}$ chess board with one square missing. How do we reduce this to a $2^n$-by-$2^n$ chess board (so that we can use the inductive hypothesis)? The chess boards in this problem are easy to divide into four equal-sized parts. Since each side will be divided in half, the two halves will each contain $2^n$ squares on a side (Figure 3.6). Thus the four parts will each be a $2^n$-by-$2^n$ chess board.
The missing square must be in one of these four quadrants. Assume it is the top left quadrant (we can rotate the chess board to make this true). Figure 3.7 shows how we proceed. The gray area represents the missing square. (It could be on an edge of the top left quadrant. The proof does not depend on its exact location within that quadrant.) We start by placing a tile over the center squares in the other three quadrants. If we consider the squares that are covered by the tile as if they were missing, then each of the four $2^n$-by-$2^n$ quadrants of the chess board has a missing square. By the inductive hypothesis, each of these four quadrants can be covered by tiles. Since the tile at the center covers the squares we were pretending were missing, we have a covering of the $2^{n+1}$-by-$2^{n+1}$ chess board.
Conclusion The theorem of mathematical induction now guarantees that we can tile any $2^n$-by-$2^n$ chess board with a single missing square, for $n \ge 1$. □
Notice that the proof would fail for a 3-by-3 chess board, since we can't subdivide it into 2-by-2 pieces.
Figure 3.6. The $2^{n+1}$-by-$2^{n+1}$ chess board, with $2^n$ squares along each half of a side.
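The inductive step of the chess board proof translates almost directly into a recursive procedure. The following is a minimal sketch; the function name `tile_board`, the board representation, and the use of integer tile labels are all assumptions made for illustration. One small deviation from the proof: the recursion bottoms out at a single square rather than at the 2-by-2 board, which handles the base step automatically.

```python
def tile_board(n, missing):
    """Tile a 2**n-by-2**n board that has one missing square.

    Returns a grid in which the missing square holds -1 and every other
    square holds the label of the L-shaped tile that covers it.
    """
    side = 2 ** n
    board = [[None] * side for _ in range(side)]
    board[missing[0]][missing[1]] = -1
    next_id = [0]  # mutable counter shared with the nested function

    def cover(top, left, side, hole):
        if side == 1:
            return  # a single square that counts as missing: nothing to do
        next_id[0] += 1
        tile_id = next_id[0]
        half = side // 2
        mid_r, mid_c = top + half, left + half
        # Each quadrant, paired with its square nearest the board's center.
        quadrants = [
            (top, left, (mid_r - 1, mid_c - 1)),
            (top, mid_c, (mid_r - 1, mid_c)),
            (mid_r, left, (mid_r, mid_c - 1)),
            (mid_r, mid_c, (mid_r, mid_c)),
        ]
        for q_top, q_left, center in quadrants:
            has_hole = (q_top <= hole[0] < q_top + half
                        and q_left <= hole[1] < q_left + half)
            if has_hole:
                cover(q_top, q_left, half, hole)
            else:
                # The central tile covers this square, so the quadrant
                # can be treated as if that square were missing.
                board[center[0]][center[1]] = tile_id
                cover(q_top, q_left, half, center)

    cover(0, 0, side, missing)
    return board


for row in tile_board(2, (0, 0)):
    print(row)
```

Squares that share a label form one L-shaped tile, because they are three of the four squares surrounding the center of some sub-board.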
Optimality of the Deferred Acceptance Algorithm
You will need to review Section 1.2 of Chapter 1 before reading this example. Recall that for a given set of male and female preferences, there may be multiple stable assignments that can be produced (footnote 41). The deferred acceptance algorithm is one mechanism for choosing one of the (potentially) many stable assignments. We already saw that the algorithm produces a stable assignment. Even more is true.
DEFINITION 3.18 Optimal
A stable assignment is called optimal for suitors if every suitor is at least as well off in this assignment as in any other stable assignment.
Figure 3.7. Applying the inductive hypothesis.
In Example 1.1 of Chapter 1, the assignment labeled "male 1st choice" is optimal for suitors if the males are the suitors. The definition says that among all stable assignments, an optimal one places each suitor with a mate that is as high on that suitor's list as the mates (for that suitor) in other stable assignments. It is not clear at this point that optimality is possible. Perhaps Al has his best mate in stable assignment A, but Bill might have his best mate in stable assignment B. It might be the case that no one stable assignment is best for all suitors. However, the next theorem will show that the deferred acceptance algorithm always produces an optimal assignment for suitors.
DEFINITION 3.19 Possible
A potential mate is called possible for a suitor if there is a stable assignment that pairs them.
Footnote 41: See Example 1.1 in Chapter 1.
In Example 1.1 of Chapter 1, all mates are possible. This need not be true for other sets of preferences.
Optimality of the Deferred Acceptance Algorithm The deferred acceptance algorithm produces an assignment that is optimal for every suitor.
Proof Using Complete Induction: The inductive hypothesis will be:
Prior to round $k$ of the deferred acceptance algorithm, no suitor has been rejected by a possible mate.
That is, any potential mate who rejects the suitor must be one that can never exist (footnote 42) in a stable assignment with that suitor. I will use letters near the end of the alphabet to represent suitors, and letters near the front of the alphabet to represent suitees. It might be helpful to draw diagrams as you read the details of the proof. Also, the word possible has a technical meaning here, so don't treat it as a normal word.
Base Step After the first round, no suitor has been rejected by a possible mate. For suppose that suitor S has just been rejected by potential mate C, in favor of suitor T (who has also proposed to C). Suitor S could never be in any stable assignment with C, because C has just shown a preference for T over S, and T has indicated C as first choice. Thus, if C and S were assigned to be married, C and T would elope. Hence, C is not possible for S. Since S represents any suitor who was rejected in round 1, it is clear that no suitor has been rejected by a possible mate at this point in the algorithm.
Inductive Step Using complete induction, assume that in rounds $1, \ldots, k - 1$, no suitor has been rejected by a possible mate. We now want to consider a suitor, X, who has just been rejected by A in round $k$. If A has rejected X, then A must have a proposal from some other suitor, Y, whom A prefers to X. Think about the suitor Y for a moment. Y has (perhaps) proposed to other potential mates before arriving at A. So at this stage, Y has been rejected by all potential mates that Y prefers above A. By the inductive hypothesis, we know that no person who has rejected Y prior to this round is a possible mate for Y. Consequently, any person Y prefers to A is not a possible mate for Y. Now consider A and X.
Could there be an assignment among the collection of stable assignments in which A and X are paired for marriage? Suppose that such a hypothetical assignment exists. In that hypothetical assignment Y must inevitably be paired with a possible mate. But all mates Y prefers over A are not possible for Y. Thus Y is paired with a mate, B, that Y would gladly leave for A. In addition, we know from the actual activities of round $k$ of the deferred acceptance algorithm that A prefers Y over X. Hence, A and Y will elope. This means that the hypothetical pairing of A and X is not stable. The conclusion is that any suitor who is rejected in round $k$ has been rejected by someone who is not a possible mate. Thus, the inductive hypothesis continues to be true after round $k$.
Summary It has been shown that in round 1, no suitor is rejected by a possible mate. It has also been shown that if prior to round $k$, no suitor has been rejected by a possible mate, then in round $k$ itself, no suitor is rejected by a possible mate. Using the theorem of complete induction, we conclude that no suitor is ever rejected by a possible mate when the deferred acceptance algorithm is used.
Finally, consider the way the algorithm works. Suitors start with their first choice and work down by order of preference. They are never rejected by a possible mate. Consequently, each suitor must end up paired with the highest ranking possible mate (relative to the suitor's rankings). This proves the theorem. □
Footnote 42: Even if some other algorithm or mechanism is used to produce the assignment.
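For small examples, the optimality theorem can be checked by brute force. The sketch below is an illustration under stated assumptions: it assumes equal numbers of suitors and suitees with complete preference lists, the function names are hypothetical, and the preference lists are invented sample data (they are not Example 1.1 from Chapter 1). It runs the algorithm in rounds, enumerates every stable assignment, and confirms that no suitor does better in any other stable assignment.

```python
from itertools import permutations


def deferred_acceptance(suitor_prefs, suitee_prefs):
    """Run the algorithm in rounds; return a dict mapping suitor -> suitee."""
    rank = {c: {s: i for i, s in enumerate(p)} for c, p in suitee_prefs.items()}
    next_choice = {s: 0 for s in suitor_prefs}
    held = {}                      # suitee -> suitor whose proposal is held
    free = set(suitor_prefs)
    while free:
        proposals = {}
        for s in free:             # each free suitor proposes to the next choice
            c = suitor_prefs[s][next_choice[s]]
            next_choice[s] += 1
            proposals.setdefault(c, []).append(s)
        free = set()
        for c, new in proposals.items():
            candidates = new + ([held[c]] if c in held else [])
            best = min(candidates, key=lambda s: rank[c][s])
            held[c] = best         # keep the best proposal, reject the rest
            free.update(s for s in candidates if s != best)
    return {s: c for c, s in held.items()}


def is_stable(assignment, suitor_prefs, suitee_prefs):
    """True when no suitor and suitee prefer each other to their mates."""
    partner = {c: s for s, c in assignment.items()}
    for s, prefs in suitor_prefs.items():
        for c in prefs:
            if c == assignment[s]:
                break              # the remaining suitees rank below s's mate
            if suitee_prefs[c].index(s) < suitee_prefs[c].index(partner[c]):
                return False       # s and c would elope
    return True


def stable_assignments(suitor_prefs, suitee_prefs):
    """Enumerate all stable assignments by trying every pairing."""
    suitors = list(suitor_prefs)
    for perm in permutations(suitee_prefs):
        assignment = dict(zip(suitors, perm))
        if is_stable(assignment, suitor_prefs, suitee_prefs):
            yield assignment


# Invented sample preferences (best choice first).
suitor_prefs = {"X": ["A", "B", "C"], "Y": ["A", "C", "B"], "Z": ["B", "A", "C"]}
suitee_prefs = {"A": ["Z", "X", "Y"], "B": ["X", "Z", "Y"], "C": ["X", "Y", "Z"]}

result = deferred_acceptance(suitor_prefs, suitee_prefs)
optimal = all(
    suitor_prefs[s].index(result[s]) <= suitor_prefs[s].index(other[s])
    for other in stable_assignments(suitor_prefs, suitee_prefs)
    for s in suitor_prefs
)
print(result, "optimal for suitors:", optimal)
```

Exhaustive enumeration grows factorially, so this check is only practical for very small sets of preferences; the proof is what covers the general case.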
3.3.4 The Well-Ordering Principle, Mathematical Induction, and Complete Induction Although in this text, the well-ordering principle was assumed as an axiom and the two theorems on mathematical induction were shown to be direct consequences of that axiom, it is possible to change the roles. It is possible to assume one of the two mathematical induction theorems as an axiom, and then prove the well-ordering principle as a theorem (and also prove the other induction theorem).
THEOREM 3.10 The Well-Ordering Principle, Mathematical Induction, and Complete Induction Are Equivalent
The following are equivalent:
• the well-ordering principle (WOP)
• the theorem of mathematical induction (MI)
• the theorem of complete induction (CI)
Figure 3.8. Proving Theorem 3.10: WOP $\to$ CI $\to$ MI $\to$ WOP.
Proof: The implications will be organized as shown in Figure 3.8. The three implications allow us to move from any one of the principles/theorems to any other, so all three are equivalent. □
The implications will be proved as separate theorems.
WOP $\to$ CI
The well-ordering principle implies the theorem of complete induction.
Proof: The implication WOP $\to$ CI was already proved as Theorem 3.7 on page 124.
CI $\to$ MI
The theorem of complete induction implies the theorem of mathematical induction.
Proof: Assume that Theorem 3.7 on page 124 is true. Suppose we have verified that the two hypotheses of Theorem 3.5 hold for some claim. That is, we know that $P(1)$ is true and $P(n) \to P(n+1)$ is true for all $n \ge 1$. We want to conclude that $P(n)$ is true for $n \ge 1$. If we knew that $(P(1) \land P(2) \land \cdots \land P(n)) \to P(n+1)$ is also true for all $n \ge 1$, the theorem of complete induction (which we are assuming is true) would imply $P(n)$ is true for $n \ge 1$ and we would be done. So we need to investigate the implication $(P(1) \land P(2) \land \cdots \land P(n)) \to P(n+1)$.
Suppose one or more of $P(1), P(2), \ldots, P(n)$ is false. Then $P(1) \land P(2) \land \cdots \land P(n)$ will also be false, so the implication $(P(1) \land P(2) \land \cdots \land P(n)) \to P(n+1)$ would be true.
Finally, consider the case where all of $P(1), P(2), \ldots, P(n)$ are true. Since $P(n) \to P(n+1)$ is true and $P(n)$ is true, we know $P(n+1)$ is true (footnote 43). Thus, since $T \to T$ has the truth value $T$, $(P(1) \land P(2) \land \cdots \land P(n)) \to P(n+1)$ is true.
In all cases, $(P(1) \land P(2) \land \cdots \land P(n)) \to P(n+1)$ is true and $P(1)$ is true. Complete induction implies that $P(n)$ is true for $n \ge 1$.
We started by assuming that the two hypotheses of mathematical induction were true for $P$ and concluded that $P(n)$ is true for $n \ge 1$. The theorem of mathematical induction is valid. □
The final implication to be proved will complete the circuit in Figure 3.8.
MI $\to$ WOP
The theorem of mathematical induction implies the well-ordering principle.
Proof: A proof by contradiction will be used. Let $S$ be a nonempty set of nonnegative integers. Assume that the theorem of mathematical induction (Theorem 3.5) is true, but that the well-ordering principle is not true for $S$. That is, $S$ has no smallest element.
Since $S$ has no smallest element, it should be possible to show that $S$ does not contain any numbers in the initial collection of nonnegative integers. Thus, let
$$P(n): \quad \{0, 1, 2, \ldots, n\} \cap S = \emptyset.$$
The theorem of mathematical induction can be used to show that $P(n)$ is true for all $n \ge 0$.
Base Step Consider the set $\{0\} \cap S$. Since $S$ has no smallest element, the number 0 cannot be in $S$. Thus $\{0\} \cap S = \emptyset$.
Inductive Step Assume that $P(n)$: $\{0, 1, 2, \ldots, n\} \cap S = \emptyset$ is true for some $n \ge 0$. Observe that
$$\{0, 1, 2, \ldots, n, n+1\} \cap S = (\{0, 1, 2, \ldots, n\} \cup \{n+1\}) \cap S$$
$$= (\{0, 1, 2, \ldots, n\} \cap S) \cup (\{n+1\} \cap S)$$
$$= \emptyset \cup (\{n+1\} \cap S) \quad \text{by the inductive hypothesis}$$
$$= \{n+1\} \cap S.$$
Therefore, $\{0, 1, 2, \ldots, n, n+1\} \cap S \ne \emptyset$ would imply that $n + 1 \in S$. But the inductive hypothesis then leads to the contradiction that $n + 1$ is the smallest element of $S$. Thus, $\{0, 1, 2, \ldots, n, n+1\} \cap S = \emptyset$, completing the inductive step.
Conclusion Assuming that $S$ has no smallest element but that the theorem of mathematical induction is valid leads to the conclusion that $\{0, 1, 2, \ldots, n\} \cap S = \emptyset$ is true for all $n \ge 0$.
End of the Induction The induction has established that $S$ does not contain any nonnegative integers as elements. This means that $S$ must be empty, a contradiction. This final contradiction leads back to the assumption that $S$ has no smallest element. We must therefore conclude that $S$ does have a smallest element, proving the theorem. □
Footnote 43: Modus ponens: $[P(n) \land (P(n) \to P(n+1))] \to P(n+1)$.
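The idea behind the MI $\to$ WOP proof can be read as a search: test $\{0, 1, \ldots, n\} \cap S = \emptyset$ for $n = 0, 1, 2, \ldots$, and the first $n$ at which the test fails is the smallest element of $S$. The sketch below illustrates this under assumptions of its own: the name `least_element`, the membership-test interface, and the search limit are all hypothetical, and a program can only examine finitely many integers, so this illustrates the argument rather than proving anything.

```python
def least_element(contains, search_limit=10**6):
    """Return the smallest nonnegative integer n with contains(n) true.

    Before n is tested, every smaller integer has already been found to lie
    outside the set, which mirrors the statement P(n - 1) in the proof.
    """
    for n in range(search_limit + 1):
        if contains(n):
            return n
    raise ValueError("no element found up to the search limit")


# Sample set: integers greater than 10 that leave remainder 3 on division by 7.
smallest = least_element(lambda n: n > 10 and n % 7 == 3)
print(smallest)
```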
3.3.5 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.
1. The following formulas can all be proved by mathematical induction.
(a) $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ for all natural numbers, $n$, with $n \ge 1$.
(b) $2^m > m$ for all natural numbers, $m$, with $m \ge 1$.
(c) $a^n < 1$ for all real numbers, $a$, with $0 < a < 1$ and all natural numbers, $n$, with $n \ge 1$. Clearly identify where you have used the assumption $0 < a$ and then explain why the proof would fail if $a < 0$.
(d) $\sum_{k=1}^{n} \left( \sum_{i=1}^{k} i \right) = \frac{n(n+1)(n+2)}{6}$ for all natural numbers, $n$, with $n \ge 2$.
2. Prove that $\sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n < \frac{(2n+1)^2}{8}$ for all natural numbers, $n$, with $n > 0$.
3. Formulate a theorem for the sum $\sum_{i=0}^{n} (a + id)$ of the first $n + 1$ elements in an arithmetic progression. Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
4. [D] Find a formula for $\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n}$, $n \ge 1$. Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
5. Find a formula for $\frac{1}{a} + \frac{1}{a^2} + \cdots + \frac{1}{a^n}$ for $n \in \mathbb{N}$, $n \ge 1$ and $a \in \mathbb{R}$, $a > 0$, $a \ne 1$. Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
6. Find a formula for … , $n \in \mathbb{N}$, $n \ge 1$. Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
7. Find a formula for … Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
8. Find a formula for $1 \cdot 1! + 2 \cdot 2! + \cdots + n \cdot n!$. Indicate how you arrived at the formula, and then use mathematical induction to prove the formula.
9. [D] Let $n \in \mathbb{N}$, with $n \ge 1$. Show that $\sum_{k=1}^{n} \frac{1}{k^2} \le 2 - \frac{1}{n}$. Hint: $\frac{1}{n} - \frac{1}{(n+1)^2} = \frac{n^2 + n + 1}{n(n+1)^2} = \frac{n^2 + n}{n(n+1)^2} + \frac{1}{n(n+1)^2}$, $n \in \mathbb{N}$, $n \ge 1$.
10. Let $A, B_1, B_2, \ldots, B_n$ be sets, with $n \ge 2$. Prove that $A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n)$.
11. Let $n \in \mathbb{N}$, with $n \ge 1$. Use mathematical induction to show that …
12. [D] Use mathematical induction to show that $x^2 - 1$ is divisible by 8 when $x$ is any positive odd integer.
13. Prove that every integer, $n$, can be written in the form $5 \cdot a + 7 \cdot b$, where $a, b \in \mathbb{Z}$. Use mathematical induction to show this for all integers $n \ge 0$. Then think of another way to validate the claim for all integers $n < 0$.
14. Let $n \in \mathbb{N}$, with $n > 0$ and let $x, y \in \mathbb{R}$ with $x \ne -y$. Use mathematical induction to show that $x^{2n} - y^{2n}$ is divisible by $x + y$.
15. Let $n \in \mathbb{N}$, with $n \ge 2$. Show that $n! < n^n$.
16. Prove that $\sum_{i=1}^{n} \frac{1}{\sqrt{i}} > 2(\sqrt{n+1} - 1)$ for $n \in \mathbb{N}$, with $n \ge 1$.
17. Let $x, y_1, y_2, \ldots, y_n$ be elements in a Boolean algebra, with $n \ge 2$. Prove $x + (y_1 \cdot y_2 \cdots y_n) = (x + y_1) \cdot (x + y_2) \cdots (x + y_n)$.
18. Let $n$ be a positive integer. Prove that $\frac{1}{2n} \le \frac{1 \cdot 3 \cdot 5 \cdots (2n-3) \cdot (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n-2) \cdot (2n)}$.
19. [D] Let $n \in \mathbb{N}$. Prove that $\frac{2n+1}{2n+2} \le \frac{\sqrt{n+1}}{\sqrt{n+2}}$. (Hint: This is a nasty induction problem, but is fairly simple to show algebraically.)
20. Let $n$ be a positive integer. Prove that $\frac{1 \cdot 3 \cdot 5 \cdots (2n-3) \cdot (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n-2) \cdot (2n)} \le \frac{1}{\sqrt{n+1}}$.
21. Prove $\prod_{k=2}^{n} \left(1 - \frac{1}{k^2}\right) = \frac{n+1}{2n}$ $\forall n \in \mathbb{N}$ with $n \ge 2$.
The results of the next two problems will be used in Chapter 11.
22. [D] Let $h \ge 0$ be an integer. Prove that $\sum_{i=0}^{h} (h - i) 2^i = 2^{h+1} - h - 2$.
23. [D] Let $h \ge 0$ be an integer. Prove that $\sum_{i=0}^{h} i 2^i = (h - 1) 2^{h+1} + 2$.
24. What is wrong with the following proof that all horses have the same color?
Proof Let $n$ be the number of horses. When $n = 1$, the statement is clearly true, that is, one horse has the same color, whatever color it is. Assume that any group of $n$ horses has the same color. Now consider a group of $(n + 1)$ horses. Taking any $n$ of them, the inductive hypothesis states that they all have the same color, say brown. The only issue is the color of the remaining "uncolored" horse. Consider, therefore, any other group of $n$ of the $(n + 1)$ horses that contains the uncolored horse. Again, by the inductive hypothesis, all of the horses in the new group must have the same color. Then, since all of the colored horses in this group are brown, the uncolored horse must also be brown (footnote 44).
25. The sequence of numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ... is called the Fibonacci sequence. It can be generated by setting $f_0 = 1$, $f_1 = 1$ and setting $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$. Notice that all numbers in the sequence are integers. The following formulas all relate to the definition of the Fibonacci sequence. You will need to remember to use your inductive hypothesis in each case. You will also need to use the definition of the sequence: $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$.
(a) Prove that $f_0^2 + f_1^2 + f_2^2 + \cdots + f_n^2 = f_n f_{n+1}$ for $n \ge 0$.
(b) [D] Show that $f_0 + f_2 + \cdots + f_{2n} = f_{2n+1}$ for $n \ge 0$.
(c) Prove that $f_{n-1} f_{n+1} - f_n^2 = (-1)^{n+1}$ for $n \ge 1$.
(d) It is an amazing fact that $f_n$, the $n$th element of the sequence, can be given by a formula that involves $\sqrt{5}$. The formula is given by $f_n = c_1 a^n + c_2 b^n$, where $n \ge 0$ and
$$c_1 = \frac{1 + \sqrt{5}}{2\sqrt{5}}, \quad c_2 = \frac{-(1 - \sqrt{5})}{2\sqrt{5}}, \quad a = \frac{1 + \sqrt{5}}{2}, \quad b = \frac{1 - \sqrt{5}}{2}.$$
It is worth mentioning that $a$ and $b$ are the two solutions to the equation $x^2 = x + 1$. Use complete induction to prove that the formula is correct.
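Before attempting the proofs in Exercise 25, it can be reassuring to test the statements numerically. The sketch below is an illustration only: the function names are hypothetical, the indexing $f_0 = f_1 = 1$ follows the exercise, and floating-point arithmetic with rounding stands in for exact arithmetic with $\sqrt{5}$, so agreement on a finite range is evidence, not a proof.

```python
from math import sqrt


def fib(n):
    """Fibonacci numbers with the exercise's indexing: f_0 = f_1 = 1."""
    current, following = 1, 1
    for _ in range(n):
        current, following = following, current + following
    return current


SQRT5 = sqrt(5)
A = (1 + SQRT5) / 2
B = (1 - SQRT5) / 2
C1 = (1 + SQRT5) / (2 * SQRT5)
C2 = -(1 - SQRT5) / (2 * SQRT5)


def fib_closed_form(n):
    """Evaluate c1 * a**n + c2 * b**n and round to the nearest integer."""
    return round(C1 * A ** n + C2 * B ** n)


closed_form_ok = all(fib(n) == fib_closed_form(n) for n in range(30))

# Part (a): the sum of squares f_0^2 + ... + f_n^2 equals f_n * f_{n+1}.
squares_ok = all(
    sum(fib(i) ** 2 for i in range(n + 1)) == fib(n) * fib(n + 1)
    for n in range(20)
)

# Part (c): f_{n-1} * f_{n+1} - f_n^2 equals (-1)**(n + 1) for n >= 1.
cassini_ok = all(
    fib(n - 1) * fib(n + 1) - fib(n) ** 2 == (-1) ** (n + 1)
    for n in range(1, 20)
)

print(closed_form_ok, squares_ok, cassini_ok)
```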
3.4 Creating Proofs: Hints and Suggestions
The earlier parts of this chapter have presented numerous strategies for proofs (for example, direct proof and mathematical induction). Learning those strategies will help you to read intelligently proofs created by other people as well as help you to create proofs on your own. However, knowledge of those strategies is often not all that is needed. Creating proofs cannot be done by following a set of rules. It requires insight, instinct based on experience, and sometimes creativity and ingenuity. At this point in your mathematical career, the element of "instinct based on experience" is still in a formative stage. Fortunately, this is not a double-bind situation (footnote 45). There are some general suggestions that will help you gain that experience while still succeeding at creating proofs. The goal of this section is to acquaint you with some of these suggestions. Your instructor may have additional suggestions.
3.4.1 A Few Very General Suggestions
There are a number of fairly simple habits and ideas that can enhance your ability to create proofs. They are presented in no particular order.
Footnote 44: I do not know the original source of this "proof." The version presented here is from [73, p. 58]. There is also a pun associated with this proof (which I mercifully cannot remember at the moment).
Footnote 45: A common double bind is for a new college graduate to find that the job listings in a field all require two to three years of experience. The graduate doesn't qualify for a job due to lack of experience, but can't gain experience due to lack of a job in the field.
Know the Definitions and Theorems
Most proofs will require knowledge of one or more definitions just to understand the statement that is to be proved. The proof itself will often require the use of other definitions (either in concept-to-property or property-to-concept form). In addition, the proof will often be simplified if you use the results of one or more previously proved theorems. If you have memorized the definitions (both at an intuitive and at a precise level), you will recognize more easily when they should be inserted into the proof. If you also have memorized the major theorems, you will spend less time flipping through pages of the text looking for a random theorem that might be of use.
The Euclidean Division Algorithm
Recall the proof of the Euclidean division algorithm on page 99. The proof requires the use of the well-ordering principle (an axiom), some basic notions from set theory (empty, nonempty, set-builder notation), an understanding of the integers (the nature of integers, properties of integer addition and multiplication, the definition of absolute value), properties of absolute value (a theorem), and familiarity with proof by contradiction (existence) and direct proof (uniqueness). If all that material was familiar to you, you probably found that proof fairly easy to read and understand. However, if you had not memorized and/or understood some of the background material, you may have found a few places where the flow of ideas was a mystery.
The main idea in this suggestion is to memorize and understand (deeply and precisely) definitions and theorems. The more you know in this way, the easier it will be to read and create proofs.
Use Lots of Scratch Paper Colleges produce and recycle lots of paper that has only been used on one side. Grab a stack and use the blank side for scratch paper. You can send it back to the recycling bin after you have used it. Scratch paper is very helpful for trying out ideas without the need to keep it neat or to erase dead ends. Knowing that mistakes and dead ends can be effortlessly discarded frees you to experiment. Writing your proof on scratch paper first also enables you to do some editing and revision as you copy the correct proof onto the final paper. The clean copy might then be an improvement over your first correct proof.
Analyze Other People's Proofs
As you read proofs in textbooks or published articles, you will certainly be thinking about the content of the proof. It will also be beneficial to think carefully about the manner in which the proof is presented. You can learn a great amount from a well-presented proof. In particular, you might see ideas that you would not have considered. For example, in the proof of the Euclidean division algorithm on page 99, the set $S = \{a - bq \mid q \in \mathbb{Z} \text{ and } a - bq \ge 0\}$ is not one that many students would think to examine. If you imprint the clever ideas you encounter in your memory, the ideas will be available to you in the future.
Unfortunately, published proofs seldom provide you with information about how the proof's creator arrived at the final version. Even if that information is missing, you should realize that the printed version was usually not immediately created in the form you see. The author will typically try a few approaches and settle on the final version (footnote 46) after a few detours and dead ends.
Footnote 46: Of course, if the proof is fairly simple and the author has been creating proofs for many years, the proof might be written in final form immediately. Experience makes a difference.
Don't Be Afraid to Try Multiple Strategies
If you start using one proof strategy and reach a dead end, try a different strategy. If that also leads nowhere, try another. You might even cycle back to one of the earlier strategies. Perhaps some of the additional thoughts you have during the detour will bear fruit on the return visit. Many assertions can be proved in more than one way. One of the alternatives might make more sense to you than the others. You just need to find the proper alternative.
Several Approaches
Suppose you need to prove that $(P \lor Q) \land R \to R$ is a tautology. There are several options. You might first try a few truth values for $P$, $Q$, and $R$ to see if the statement appears to really be a tautology. You could also think about whether it makes intuitive sense. Once you decide to proceed with a proof, you might list the alternative approaches available. You could use a truth table or you could use the fundamental logical equivalences. Suppose your instructor has decided that the fundamental logical equivalences and the logical equivalences and rules of inference for implication and the biconditional are the preferred approach. If you don't recognize that the law of simplification immediately completes the proof, you still have two initial options. You might start with right distributivity or you might start by using a substitution using an implication logical equivalence. For such a simple assertion, either approach will quickly lead to a completed proof.
The proof of the next assertion has been expanded to show the process of considering multiple strategies from Section 3.2.
A Cautionary Tale
Suppose I want to prove the following assertion:
Let $x \in \mathbb{R}^+$. If $\sqrt{x}$ is irrational, then $x$ is irrational.
I might try to do a direct proof. However, I get stuck pretty quickly. How can I convert "$\sqrt{x}$ is irrational" into something I can manipulate? Recalling that rational numbers are much easier to write in useful ways, I next consider an indirect proof (since it involves negations of the hypothesis and conclusion of the original implication). Thus, I might try to prove the following:
Let $x \in \mathbb{R}^+$. If $x$ is rational, then $\sqrt{x}$ is rational.
This looks more promising. I can start by writing $x = \frac{p}{q}$ for integers $p$ and $q$ with $q \ne 0$. Now, what does $\sqrt{x}$ look like? It must be $\sqrt{x} = \sqrt{p/q}$. Hmm ... I cannot assume anything else about $p$ and $q$, so I don't know much about $\sqrt{p/q}$. I am stuck again.
Next, I will try a proof by contradiction. I start by assuming that $\sqrt{x}$ is irrational, but $x$ is rational. I hope to arrive at a contradiction. I again proceed to write $x = \frac{p}{q}$ with $p$ and $q$ integers and $q \ne 0$. Now I need to contradict the irrationality of $\sqrt{x}$. This leads to the same dead end as the previous strategy.
It is time to back off and do something I should have done at the beginning: Look at a few examples to get a feel for the validity of the proposition. I can create a small table of values to try with the proposition. The creative part is to choose values for $\sqrt{x}$ that I know are really irrational. It would also help if it were easy to tell if $x$ itself is irrational. Thinking back to Proposition 3.6, my first table entry will be $\sqrt{2}$.
$\sqrt{x}$    $x$
$\sqrt{2}$    2
Oops! It seems that the assertion is false. No wonder I was getting nowhere with the proof. The final version of my "proof" is quite simple. The assertion is false. The numbers $\sqrt{2}$ and 2 provide a counterexample.
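Both examples above recommend trying truth values or sample cases before committing to a proof strategy. For a propositional statement such as $(P \lor Q) \land R \to R$ from the Several Approaches example, that preliminary check can be done exhaustively. The following is a minimal sketch with hypothetical helper names; it amounts to a mechanical truth table and says nothing about which written proof an instructor would prefer.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q


def is_tautology(formula, num_vars):
    """Evaluate the formula on every row of its truth table."""
    return all(
        formula(*values)
        for values in product([False, True], repeat=num_vars)
    )


def statement(p, q, r):
    """The statement ((P or Q) and R) -> R."""
    return implies((p or q) and r, r)


def converse(p, q, r):
    """The converse R -> ((P or Q) and R), included for contrast."""
    return implies(r, (p or q) and r)


print("statement is a tautology:", is_tautology(statement, 3))
print("converse is a tautology:", is_tautology(converse, 3))
```

A check like this plays the same role as the small table of values in the cautionary tale: it can expose a false assertion quickly, before any time is spent on proof strategies.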
Incubation Sometimes you spend a significant amount of time on a problem and don't seem to be making progress. It is often a good idea to move on to something else for a while. You may need some time to free yourself from revisiting the same stale ideas over and over again. Many people find that after a time of intense work with a problem, a period of incubation is necessary. During that time, their minds can focus on other things. After a full night of sleep, or even a few days of tending to other issues, they will be able to find a new approach that solves the problem. Incubation is of no practical value if you delay working on your homework until the hour before class, or at 1 A.M. the night before class. There will be no time during which the incubation can occur (remember that incubation needs a previous intense period of concentration on the problem). Start your math homework on the day it is assigned. That will allow time for incubation, time to meet with classmates to discuss the text section, and time to attend your math tutoring lab or the instructor's office hours.
3.4.2 Some Specific Tactics The following suggestions are more narrowly focused on ways to keep making progress as you search for a proof. They may not all apply in any one case, but they do provide a rich set of helpful approaches.
Look for Common Characteristics Many assertions are stated in ways that give valuable hints about profitable solution strategies. For example, if the assertion contains the term rational number, it is likely that the rational number should be expressed in the form $\frac{p}{q}$, $q \ne 0$, with $p, q \in \mathbb{Z}$. Another example would be an assertion that claims that some set, A, is a subset of another set, B. The presence of the symbol $\subseteq$ suggests that you try choosing a generic element in A and showing that it is also in B. In essence, the symbol $\subseteq$ should lead you back to the definition of subset⁴⁷ as an initial approach to the proof. Table 3.1 (page 138 and repeated inside the front cover) lists some more general common characteristics, together with recommended initial strategies.
Forward-Backward You do not always need to proceed from point A to point B as you construct a proof. It is often useful to start at the beginning and move forward until you get stuck. At that point, you could start at the end and work backward toward the starting point. You might even bounce back and forth a few times. If the two attempts meet in the middle, you can then rework the proof to start at the beginning and proceed all the way to the end. More specifically, let the assertion be in the form of an implication: $A \to B$. In the forward direction you assume that A is true and use that information to show that B is also true. The motivation is the familiar modus ponens tautology: $[P \land (P \to Q)] \to Q$. In the backward phase you are asking questions about what is necessary for B to be true.
⁴⁷ See page 16.
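The claim that modus ponens is a tautology can be checked mechanically by enumerating every truth assignment. The following is a minimal sketch; the helper names `implies` and `is_tautology` are illustrative choices, not notation from the text. For contrast it also tests the invalid pattern $[Q \land (P \to Q)] \to P$, which is the mistake of treating a backward step as if it were a forward step.

```python
from itertools import product

def implies(p, q):
    # Material implication: p -> q is false only when p is true and q is false.
    return (not p) or q

def is_tautology(formula, arity):
    # A formula is a tautology if it is true under every truth assignment.
    return all(formula(*values) for values in product([False, True], repeat=arity))

def modus_ponens(p, q):
    # [P and (P -> Q)] -> Q
    return implies(p and implies(p, q), q)

def affirming_the_consequent(p, q):
    # [Q and (P -> Q)] -> P, an invalid pattern shown for contrast
    return implies(q and implies(p, q), p)

print(is_tautology(modus_ponens, 2))
print(is_tautology(affirming_the_consequent, 2))
```

The second formula fails for P false and Q true, which is why working backward from B only tells you what would be sufficient, not what has already been established.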
TABLE 3.1 Some general proof strategies
If the assertion ... | Then try ...
claims something is true for all integers $n \ge n_0$ | mathematical induction
is stated explicitly or implicitly as an implication | direct; indirect; contradiction
contains an existential quantifier | a constructive proof; a nonconstructive proof
contains a universal quantifier | finding a counterexample; the choose method
contains the phrase "if and only if" | to prove the two implications separately; to produce a sequence of equivalent statements linking the two sides of the biconditional
is stated as an equivalence | to look for a complete set of implications that are relatively easy to prove
can be easily split into a collection of independent assertions | proof by cases
is an implication with a true conclusion | trivial proof
is an implication with a false hypothesis | trivial (vacuous) proof
is about membership in a set | direct proof: verify that the element satisfies the set membership requirements
asserts one set is a subset of another | to show that a generic element of the first set is also a member of the second set
asserts the equality of two sets | to show that each set is a subset of the other; to use a sequence of reversible statements with the fundamental set properties and other theorems
Forward-Backward Consider the assertion Ifx E Rand2 <x b-a 2 Ifb +,thnC2 a 22
0
3b.
b> b - a, then a 1. Thus, $q^2 > 1$. Hence, $\left(\frac{p}{q}\right)^2$ is an honest fraction. □
COROLLARY 3.3 If $\left(\frac{p}{q}\right)^2$ is not an honest fraction, then neither is $\frac{p}{q}$.
Proof: This is the contrapositive of the lemma. □
Proof of Theorem 3.15: The real numbers can be partitioned into the disjoint sets of integers, honest fractions, and irrationals, so n (an integer) is not an honest fraction. By the corollary to the lemma, if n is not an honest fraction, then neither is $\sqrt{n}$. Since $\sqrt{n}$ is not an honest fraction, it must be an integer or an irrational. □
It is now easy to prove Proposition 3.6.
Proof of Proposition 3.6: By Theorem 3.15, since 2 is an integer, $\sqrt{2}$ is either an integer or it is irrational. We know that it is not an integer, so it must be irrational. □
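The lemma behind this argument (the square of an honest fraction is again honest) is easy to spot-check numerically. The sketch below is only an illustration, not a proof: it relies on Python's `fractions.Fraction`, which always stores a value in lowest terms, and the function names are hypothetical. The helper `classify_sqrt` simply applies the conclusion of Theorem 3.15 to a positive integer.

```python
from fractions import Fraction
from math import gcd, isqrt

def is_honest(value):
    # Fraction keeps p/q in lowest terms with q > 0, so "honest" reduces to q > 1.
    return value.denominator > 1

def lemma_holds_up_to(max_q):
    # Check every reduced p/q with 2 <= q <= max_q and 1 <= p < 3q.
    for q in range(2, max_q + 1):
        for p in range(1, 3 * q):
            if gcd(p, q) == 1 and not is_honest(Fraction(p, q) ** 2):
                return False
    return True

def classify_sqrt(n):
    # By Theorem 3.15, sqrt(n) is an integer exactly when n is a perfect square;
    # otherwise it is irrational.
    root = isqrt(n)
    return "integer" if root * root == n else "irrational"

print(lemma_holds_up_to(40))
print(classify_sqrt(2), classify_sqrt(49))
```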
3.4.3 Exercises The exercises marked with ▶ have detailed solutions in Appendix G.
1. Prove, without using mathematical induction, that $\sum_{k=1}^{2m} k = m(2m + 1)$ for all positive integers, m. (Hint: Add 1 and 2m, 2 and 2m - 1, 3 and 2m - 2, etc.)
2. ▶ Use the result of Exercise 1 to
(a) Find the value of $\sum_{k=1}^{2m+1} k$ for $m \ge 0$, where $m \in \mathbb{N}$.
(b) Provide a noninductive proof that $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ for $n \ge 1$, where $n \in \mathbb{N}$.
3. Find the error in the proof of the given claim. Additionally, determine whether the claim is true or false. If it is true, provide a correct proof. If it is false, find a counterexample.
Claim: Let x be any even natural number. Let y be any odd natural number greater than x. Then (xy) mod (y - x) = xy.
Proof: Given that x is any even natural number, x = 2k for some natural number, k. Similarly, since y is any odd natural number greater than x, there is a natural number, k, such that y = 2k + 1. Thus,
(xy) mod (y - x) = (2k(2k + 1)) mod ((2k + 1) - 2k) = $(4k^2 + 2k)$ mod 1 = $(4k^2 + 2k)$ = xy.
4. Prove that if n is an even integer and m is an odd integer, then either 4 divides mn or 4 does not divide n.
5. Use an indirect proof to prove the following: "If $2^n - 1$ is prime, then n is prime." (Hint: Try some examples. Factoring $2^n - 1$ is easier when n is an even composite integer.)
6. Let a, b, and c be integers with $a \ne 0$. Prove that if $a \mid b$ and $a \mid (b + c)$, then $a \mid c$.
7. Prove that the square of any integer can be written in one of the following forms: 4k or 4k + 1.
8. Use a proof by contradiction to show that if n is any integer, then $n^2 - 3$ is not divisible by 4. You may use the result of Exercise 7.
9. Prove that the sum of two odd integers is an even integer.
10. Prove that the sum of an odd number of odd integers is odd. You may use the result of Exercise 9.
11. ▶ Prove that the sum of two rational numbers is a rational number.
12. Prove that the product of two rational numbers is a rational number.
13. Prove that the sum of a rational and an irrational is an irrational.
14. If x is irrational, then x + (-x) = 0 is a rational. Suppose that y is also an irrational with $y \ne -x$. Prove or find a counterexample: x + y is irrational.
15. Prove or find a counterexample: The product of a nonzero rational and an irrational is irrational.
16. Prove or find a counterexample: The product of two irrational numbers is irrational.
17. Let $x \in \mathbb{Z}$. Prove that x is divisible by 3 if and only if $x^2 - 1$ is not divisible by 3.
18. Prove Proposition 3.8 by using the strategy $1 \leftrightarrow 2$ and $2 \leftrightarrow 3$.
19. Suppose 19. mpode n7aa 2 is not divisible by 3. Prove that
20. Let p > 5 be a prime. Prove that p + 2 is not a prime. What can you say about p mod 3? 2 1 mod 8
21. ▶ Let $x, y \in \mathbb{R}$. Prove that $|xy| = |x| \cdot |y|$.
22. Use a proof by contradiction to show that the equation $x^3 + 3x + 3 = 0$ has no solutions in $\mathbb{Q}$. (Hint: Use a proof by cases inside the proof by contradiction.)
23. Let a and b be real numbers. Prove that max(a, b) + min(a, b) = a + b.
24. Let a and b be real numbers. Prove
(a) $\max(a, b) = \frac{a + b + |a - b|}{2}$
(b) $\min(a, b) = \frac{a + b - |a - b|}{2}$
25. Let a and b be positive integers, where $a = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ and $b = p_1^{b_1} p_2^{b_2} \cdots p_k^{b_k}$ with $a_i, b_i \ge 0$. Prove that
(a) $\gcd(a, b) = p_1^{\min(a_1, b_1)} p_2^{\min(a_2, b_2)} \cdots p_k^{\min(a_k, b_k)}$,
(b) $\operatorname{lcm}(a, b) = p_1^{\max(a_1, b_1)} p_2^{\max(a_2, b_2)} \cdots p_k^{\max(a_k, b_k)}$.
26. ▶ Let 0 < a < b, where a and b are positive integers. Prove that gcd(a, b) = gcd(b mod a, a).
27. Show that if a and b are positive integers, then $ab = \gcd(a, b) \cdot \operatorname{lcm}(a, b)$.
28. Let $x \in \mathbb{N}$. Prove: If the sum of the digits of x is divisible by 3, then x is also divisible by 3.
29. Suppose that p and p + 2 are both primes (called twin primes).
31. Let $q \in \mathbb{Z}$. Prove that the following are equivalent.
* 7 divides q
* $q^2 \ne 7c + 1$, $q^2 \ne 7c + 2$, and $q^2 \ne 7c + 4$ for any integer, c
* 7 divides $q^2$
32. Let $m, n \in \mathbb{R}$. Prove: $m^2 = n^2 \leftrightarrow ((m = n) \text{ or } (m = -n))$. Don't trivialize this problem. You need to use the zero product principle (Appendix A.1) to complete this proof.
33. ▶ Let S be a set having $n \ge 0$ elements. Prove that S has $2^n$ subsets (including $\emptyset$ and S itself).
34. Prove that every odd integer can be written as the difference of two squares. (All numbers here are integers.)
35. Let n be a positive integer. Prove the following.
(a) If $n \equiv 1 \bmod 3$, then $n(n + 1) \equiv 2 \bmod 3$. Otherwise, $n(n + 1) \equiv 0 \bmod 3$.
(b) $n(n + 1) \not\equiv 1 \bmod 3$.
36. Let a, b, and c form a primitive Pythagorean triple (i.e., $a^2 + b^2 = c^2$ and no prime divides all three). Prove that c is always odd, one of a and b is odd and the other is even.
37. ▶ Let a, b, and c form a primitive Pythagorean triple. Show that one of a or b is a multiple of 3. (Hint: Use the conclusions of Exercises 35 and 36.)
38. Use a proof by cases to show that if a is an integer, then $a^5 - a$ is divisible by 5.
39. Prove that the product of any four consecutive integers is divisible by 8.
40. Provide a proof for the following claim: Every integer of the form $6^{3k} + 1$ is composite, where k is a positive integer.
3.5 QUICK CHECK SOLUTIONS
Quick Check 3.1
1. Let n be odd. Then Definition 3.8 indicates that there does not exist an integer, k, such that n = 2k. That is, n is not divisible by 2. The Euclidean division algorithm asserts that n can be uniquely expressed in the form n = 2q + r, where r is an integer with $0 \le r < 2$. Thus, $r \in \{0, 1\}$. Since n is not divisible by 2, the only admissible choice is r = 1. Thus, n = 2q + 1, with q an integer. The letter used to denote the quotient is not important, so this can be stated as n = 2k + 1, for some integer, k.
Quick Check 3.2
1. $100 = 2^2 \cdot 5^2$, 101 is prime, $102 = 2 \cdot 3 \cdot 17$, 103 is prime, $104 = 2^3 \cdot 13$, $105 = 3 \cdot 5 \cdot 7$
2. (a) q = 3, r = 250, $1600 = 3 \cdot 450 + 250$
(b) $a = 1600 = 2^6 \cdot 5^2$, $b = 450 = 2 \cdot 3^2 \cdot 5^2$
(c) $\gcd(1600, 450) = 2 \cdot 5^2 = 50$
(d) To find the least common multiple, take the maximum power of each prime factor in the two factorizations (it is not necessary that a prime appear in both factorizations). $\operatorname{lcm}(1600, 450) = 2^6 \cdot 3^2 \cdot 5^2 = 14400$
(e) Use the remainder from part (a): a mod b = 250.
(f) $1600 = 145 \cdot 11 + 5$, $450 = 40 \cdot 11 + 10$. They are not congruent mod 11 because the remainders are not the same. Another way to see this is to notice that 1600 - 450 = 1150 is not divisible by 11.
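The arithmetic in parts (a) through (f) can be reproduced with a short script. This is a sketch rather than anything prescribed by the text: `factorize` and `from_exponents` are hypothetical helper names, and the factorization uses plain trial division. The gcd and lcm are built from the minimum and maximum prime exponents, mirroring parts (c) and (d).

```python
from math import gcd

def factorize(n):
    # Trial division; returns a dict mapping each prime factor to its exponent.
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def from_exponents(exponents):
    # Rebuild an integer from a prime -> exponent dict.
    result = 1
    for prime, power in exponents.items():
        result *= prime ** power
    return result

a, b = 1600, 450
fa, fb = factorize(a), factorize(b)
primes = set(fa) | set(fb)

g = from_exponents({p: min(fa.get(p, 0), fb.get(p, 0)) for p in primes})
l = from_exponents({p: max(fa.get(p, 0), fb.get(p, 0)) for p in primes})

print(divmod(a, b))        # quotient and remainder from part (a)
print(fa, fb)              # factorizations from part (b)
print(g, l)                # gcd and lcm from parts (c) and (d)
print(a % 11, b % 11)      # remainders mod 11 from part (f)
```

Comparing `g` with `math.gcd(a, b)` gives an independent check on the exponent-based computation.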
Quick Check 3.3
1. Proof: Since $a \mid b$ we know that there is an integer k such that b = ak ("concept → properties"). Therefore, bc = (ak)c = a(kc). Since kc is an integer this shows that bc is an integer multiple of a, so bc is divisible by a ("properties → concept").
2. Proof: Since $a \mid b$, we know that there is an integer k such that b = ak ("concept → properties"). Also, since $b \mid c$, we know that there is an integer m such that c = bm ("concept → properties" again). Therefore, c = bm = (ak)m = a(km). Since km is an integer, this shows that c is divisible by a ("properties → concept").
Quick Check 3.4
1. Proof: Since n is positive and composite, there are integers a and b with 1 < a < n, 1 < b < n and n = ab. Suppose that $a > \sqrt{n}$ and $b > \sqrt{n}$. Then $n = ab > \sqrt{n} \cdot \sqrt{n} = n$, a contradiction. Therefore, either $a \le \sqrt{n}$ or $b \le \sqrt{n}$ (or both). Without loss of generality, we may assume $a \le \sqrt{n}$. If a is a prime, we are done. Otherwise, a must itself have a prime divisor p and $p < a \le \sqrt{n}$ must be true. Proposition 3.3 implies that p is a prime divisor of n.
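This result is what justifies stopping a trial-division search at $\sqrt{n}$: if no divisor has appeared by then, none exists and n is prime (Corollary 3.1). A minimal sketch follows, with hypothetical function names; it uses `math.isqrt` so that the bound is computed exactly in integers.

```python
from math import isqrt

def smallest_prime_factor(n):
    # The first divisor found is necessarily prime; by Proposition 3.4
    # a composite n must have one no larger than sqrt(n).
    if n < 2:
        raise ValueError("n must be at least 2")
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return d
    return n  # no divisor up to sqrt(n), so n itself is prime

def is_prime(n):
    return n >= 2 and smallest_prime_factor(n) == n

print(smallest_prime_factor(91))
print([n for n in range(2, 40) if is_prime(n)])
```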
Quick Check 3.5
1. Proof: Let $\frac{p}{q} \in \mathbb{Q}$ be any rational number with $\frac{p}{q} \ne 0$. Since $\frac{p}{q} \ne 0$, we know that $p \ne 0$. Therefore, $\frac{q}{p}$ is also a rational number. We also know that $q \ne 0$ (since $\frac{p}{q}$ is a rational number). Thus $\frac{p}{q} \cdot \frac{q}{p} = 1$. The number $\frac{q}{p}$ is the multiplicative inverse we sought.
Notice that no special properties of $\frac{p}{q}$ were used except the fact that $p \ne 0$ (which is part of the proposition's hypothesis and must be used).
Quick Check 3.6
1. (c) $1 + 3 + \cdots + (2i + 1) = (i + 1)^2$
2. Following the standard pattern, define P(n) as
$$P(n):\ \sum_{i=1}^{n} (2i - 1) = n^2.$$
Base Step Since $\sum_{i=1}^{1} (2i - 1) = 1$ and $1^2 = 1$, P(1) is true.
Inductive Step Assume that P(k) is true for some $k \ge 1$. We want to show that P(k + 1) must also be true under that assumption. Observe that P(k + 1) can be written
$$P(k+1):\ \sum_{i=1}^{k+1} (2i - 1) = (k + 1)^2.$$
The following equalities show that P(k + 1) is true if P(k) is true.
$$\begin{aligned}
\sum_{i=1}^{k+1} (2i - 1) &= \sum_{i=1}^{k} (2i - 1) + (2(k + 1) - 1) \\
&= \sum_{i=1}^{k} (2i - 1) + (2k + 1) \\
&= k^2 + (2k + 1) && \text{by the inductive hypothesis} \\
&= (k + 1)^2 && \text{by a simple factorization}
\end{aligned}$$
Conclusion Since P(1) is true, and $P(k) \to P(k + 1)$ is a valid implication, we conclude that P(n) is true for $n \ge 1$.
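A numerical check is no substitute for the induction argument, but it is a quick way to catch an algebra slip before writing the proof up. The sketch below (function names are illustrative) evaluates both sides of P(n) and also the algebraic identity used in the last two lines of the inductive step.

```python
def sum_of_first_odds(n):
    # Left-hand side of P(n): 1 + 3 + ... + (2n - 1)
    return sum(2 * i - 1 for i in range(1, n + 1))

def inductive_step_identity(k):
    # The algebra used in the inductive step: k^2 + (2k + 1) = (k + 1)^2
    return k * k + (2 * k + 1) == (k + 1) ** 2

print([sum_of_first_odds(n) for n in range(1, 8)])
print(all(sum_of_first_odds(n) == n * n for n in range(1, 200)))
print(all(inductive_step_identity(k) for k in range(1, 200)))
```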
3. Let P(n) be the equation $a\left(\sum_{i=1}^{n} b_i\right) = \sum_{i=1}^{n} a b_i$.
Base Step When n = 2, the statement is the normal distributive property of the real numbers, so it is certainly true.
$$a\left(\sum_{i=1}^{2} b_i\right) = a(b_1 + b_2) = ab_1 + ab_2 = \sum_{i=1}^{2} a b_i$$
Inductive Step Assume that P(n) is true for some n. Consider P(n + 1).
$$\begin{aligned}
a\left(\sum_{i=1}^{n+1} b_i\right) &= a\left(\sum_{i=1}^{n} b_i + b_{n+1}\right) \\
&= a\left(\sum_{i=1}^{n} b_i\right) + a b_{n+1} && \text{using the normal distributive law} \\
&= \sum_{i=1}^{n} a b_i + a b_{n+1} && \text{by the inductive hypothesis} \\
&= \sum_{i=1}^{n+1} a b_i
\end{aligned}$$
Conclusion The base step (P(2) is true) and the inductive step ($P(n) \to P(n + 1)$ is true) are both valid. The theorem of mathematical induction implies that P(n) is true for all $n \ge 2$.
Quick Check 3.7
1. You might have chosen either one of the base steps given (both are acceptable). Let n represent the initial number of pieces of popcorn in the bowl. Let P(n) be the claim "If the bowl starts with n pieces of popcorn, then eventually the bowl will become empty."
Base Step n = 0 If the bowl starts empty, then there is nothing to show.
Base Step n = 1 If the bowl starts with one piece of popcorn, then when Jedediah takes his first "handful" he will grab that piece (since he must take at least one piece). The bowl will now be empty.
Inductive Step Assume that when the bowl starts with k pieces of popcorn, for $0 \le k \le n$, the bowl will eventually become empty. Let the next bowl start with n + 1 pieces of popcorn. Jedediah takes his first handful. He will take at least one piece of popcorn. The bowl will now contain k pieces of popcorn, with $0 \le k \le n$. By the inductive hypothesis, this revised bowl will eventually become empty.
Conclusion The base step [P(0) is true] and the inductive step [$(P(0) \land P(1) \land \cdots \land P(n)) \to P(n + 1)$ is true] are both valid. The theorem of complete induction implies that P(n) is true for all $n \ge 0$.
2. This result should not be a surprise; it just states that every positive integer has a base 2 representation.
Base Step k = 1 This is easy: $1 = 2^0$.
Inductive Step Assume that every integer k, where $1 \le k \le n$, is a sum of distinct powers of 2. Notice that $n + 1 \ge 2^i$ for at least one nonnegative value of i. Let j be the largest exponent for which $n + 1 \ge 2^j$. If $n + 1 = 2^j$, the inductive step is complete. Otherwise, let $m = (n + 1) - 2^j$. Since $j \ge 0$, m < n + 1. Suppose that $m \ge 2^j$. Then $n + 1 = m + 2^j \ge 2^j + 2^j = 2^{j+1}$, which contradicts the maximality of j. Therefore, $m < 2^j$. By the (complete) inductive hypothesis, $m = 2^{i_1} + 2^{i_2} + \cdots + 2^{i_r}$, where all the powers of 2 are distinct and strictly less than $2^j$. Therefore, $n + 1 = 2^j + 2^{i_1} + 2^{i_2} + \cdots + 2^{i_r}$ is also a sum of distinct powers of 2.
Conclusion The claim is true for n = 1 and whenever it is true for $1 \le k \le n$, it is also true for n + 1. The principle of complete induction asserts that every positive integer is a sum of distinct powers of 2.
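The inductive step is constructive: peel off the largest power of 2 that fits, then repeat on the remainder, which the proof shows is strictly smaller than that power. A minimal sketch of that procedure is below; the function name is hypothetical, and `int.bit_length` is used to find the largest exponent j with $2^j \le n$.

```python
def distinct_powers_of_two(n):
    # Returns exponents j1 > j2 > ... with n = 2**j1 + 2**j2 + ...
    if n < 1:
        raise ValueError("n must be a positive integer")
    exponents = []
    while n > 0:
        j = n.bit_length() - 1   # largest j with 2**j <= n
        exponents.append(j)
        n -= 2 ** j              # the remainder m satisfies m < 2**j
    return exponents

print(distinct_powers_of_two(37))
print(distinct_powers_of_two(64))
```

Because each remainder is smaller than the power just removed, the exponents come out strictly decreasing, which is exactly the distinctness the proof requires.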
3.6 Chapter Review
3.6.1 Summary There are two primary goals for this chapter. The first, and easier, of the goals is to help you become proficient at reading proofs. The more difficult goal is to help you start the process of becoming proficient at producing proofs. Proofs are an essential part of this course and will be a central feature in most of your future mathematics courses. A proof not only establishes the truth of some assertion, but at its best, the proof will also help you gain understanding as to why the assertion is true. The task of reading a proof designed by another person is not trivial. In most cases, the proof's author will assume that you are familiar with the strategies and techniques that have been presented in this chapter. The proof may contain few overt signposts that inform you about which strategy is being used. You may need to "read between the lines" and find that information on your own. Producing proofs is harder. You need to discover the key relationships for yourself. Then you need to find an appropriate way to express the proof in a clear, logical manner. You also need to follow accepted conventions in mathematics (use of notation, presentation style, etc.) and design the presentation for the intended audience. When you first begin creating your own proofs, it is not easy to evaluate the quality of the final product. Have you really proved what you intended, or does your "proof" contain some errors? You really need a second opinion. Having your instructor or a more advanced student evaluate your work will provide essential feedback. Don't get discouraged if your first attempts seem pretty dismal. Learning to create proofs is a process. If you stick with it, you will improve over time. The chapter begins with two kinds of material that provide a foundation for the rest of the chapter. The first section presents an overview of the environment in which proofs naturally belong: axiomatic mathematics. 
A brief overview of some basic definitions and theorems in elementary number theory is then provided. In addition, it provides a simple context in which to practice the proof techniques presented in the rest of the chapter. The main core of the chapter is the description of a number of proof strategies (direct proof, proof by contradiction, mathematical induction, etc.), followed by some suggestions on how to produce proofs. One useful way to begin your review process for this chapter is to create a detailed outline of the proof strategies presented in Sections 3.2 and 3.3. Then you may wish to review the list of general proof strategies in Table 3.1 on page 138. Finally, there is no substitute for spending time creating your own proofs.
3.6.2 Notation
Notation | Page | Brief Description
$\mathbb{N}$ | 95 | the set of natural numbers
$\mathbb{Z}$ | 95 | the set of integers
$\mathbb{Q}$ | 96 | the set of rational numbers
$\mathbb{R}$ | 96 | the set of real numbers
$b \mid a$ | 96 | the integer, b, divides the integer, a
gcd | 97 | greatest common divisor
lcm | 97 | least common multiple
a mod b | 100 | the remainder when the integer, a, is divided by the integer, b
$a \equiv b \pmod{m}$ | 100 | a is congruent to b, mod m
$\sum_{i=0}^{n} a_i$ | 117 | shorthand for the sum $a_0 + a_1 + \cdots + a_{n-1} + a_n$ (see Appendix B)
$n!$ | 120 | (footnote 35) n factorial
$\prod_{i=0}^{n} a_i$ | 123 | shorthand for the product $a_0 \cdot a_1 \cdots a_{n-1} \cdot a_n$
$f_n$ | 134 | the nth Fibonacci number
max(a, b) | 116 | the maximum (larger) of real numbers, a and b
min(a, b) | 116 | the minimum (smaller) of real numbers, a and b
3.6.3 Definitions
The Axiomatic Method The axiomatic method is a formalization for expressing mathematical systems. The formalization assumes some undefined terms and a set of axioms that define the behavior of the undefined terms. A system of logic and rules of inference are also assumed. Additional components are definitions and collections of assertions (theorems, propositions, corollaries, and lemmas) that need to be proved.
Undefined Terms See Axiomatic Method.
Axiom An axiom is one of a set of properties that govern a mathematical system. These properties are assumed to be true; they are not subject to proof.
Postulate Postulate is another name (in this book) for axiom.
Definition A definition is the means for binding a concept, a name for the concept, and a set of associated properties that describe the concept.
Theorem; Proposition; Corollary; Lemma Statements in a mathematical system that have been proved are called theorems or propositions. A lemma is usually considered to be a mini-theorem whose main purpose for existing is to help prove part of a more important theorem or proposition. A corollary is a statement whose truth is an immediate consequence of some other theorem or proposition.
Mathematical Proof A mathematical proof of the statement S is a sequence of logically valid statements that connect axioms, definitions, and other already validated statements into a demonstration of the correctness of S. The rules of logic and the axioms are agreed upon ahead of time. At a minimum, the axioms should be independent and consistent. The amount of detail presented should be appropriate for the intended audience.
Deductive Reasoning Mathematics typically utilizes deductive reasoning to infer theorems logically by considering the consequences of prior axioms and theorems. The truth of a theorem necessarily follows from the prior information and the rules of logic.
Inductive Reasoning Inductive reasoning starts with a collection of experimental evidence or evidence derived by observation and tries to infer general principles that can explain the evidence.
Scientific Theory A scientific theory is a general principle that is accepted as true by a significant majority of the people who are considered competent in the discipline. Scientific theories are usually derived using inductive reasoning.
The Natural Numbers The set of natural numbers is denoted by $\mathbb{N}$ and is defined by $\mathbb{N} = \{0, 1, 2, 3, 4, \ldots\}$.
The Integers The set of integers is denoted by $\mathbb{Z}$ and is defined by $\mathbb{Z} = \{\ldots, -4, -3, -2, -1, 0, 1, 2, 3, 4, \ldots\}$.
The Rational Numbers The set of rational numbers is denoted by $\mathbb{Q}$ and is defined by $\mathbb{Q} = \left\{\frac{p}{q} \mid p \in \mathbb{Z},\ q \in \mathbb{Z}, \text{ and } q \ne 0\right\}$.
Irrational Numbers A real number that is not rational is called irrational.
Divisible The integer a is divisible by the nonzero integer b if a = bc for some integer c. We denote this by $b \mid a$, and also say that b divides a.
Even and Odd An integer, n, is even if there exists an integer, k, such that n = 2k. An integer is odd if it is not even.
GCD Let a and b be integers that are not both 0. The greatest common divisor of a and b is a positive integer d such that $d \mid a$ and $d \mid b$ and if c divides both a and b, then $c \mid d$. The greatest common divisor of a and b is denoted by gcd(a, b). An alternative notation is (a, b).
Least Common Multiple The least common multiple of two integers a and b is a nonnegative integer, m, such that $a \mid m$ and $b \mid m$ and if both a and b divide c, then $m \mid c$. The least common multiple of a and b is denoted by lcm(a, b). An alternative notation is [a, b].
Prime, Composite A positive integer p, with p > 1, is said to be prime if its only positive integer divisors are 1 and p. A positive integer n, n > 1, that is not prime is called composite. The integer 1 is neither prime nor composite. (In more advanced contexts it is called a unit.)
Pythagorean Triple The set of integers {a, b, c} is called a Pythagorean triple if $a^2 + b^2 = c^2$. It is called a primitive Pythagorean triple if there is no prime that appears in the prime factorization of each of the three numbers (i.e., they have no common prime factor).
a mod m Let m be a positive integer. Then a mod m is the remainder when a is divided by m.
$a \equiv b \pmod{m}$ We say that a is congruent to b mod m if m divides a - b. This is often written as $a \equiv b \pmod{m}$ if and only if there is an integer k for which a - b = km.
max; min Let a and b be real numbers. Then
$$\max(a, b) = \begin{cases} a & \text{if } a \ge b \\ b & \text{if } a < b \end{cases} \qquad \text{and} \qquad \min(a, b) = \begin{cases} a & \text{if } a \le b \\ b & \text{if } a > b \end{cases}$$
Geometric Progression A sequence is called a geometric progression if each term in the sequence (after the first) is a constant multiple of the previous term. Thus, if the terms are $\{a_i\}$ for i = 0, 1, 2, 3, ..., then $a_{i+1} = r a_i$ for some constant, r.
Arithmetic Progression A sequence is called an arithmetic progression if each term in the sequence (after the first) is obtained by adding a constant to the previous term. Thus, if the terms are $\{a_i\}$ for i = 0, 1, 2, 3, ..., then $a_{i+1} = a_i + d$ for some constant, d.
Optimal A stable assignment is called optimal for suitors if every suitor is at least as well off in this assignment as in any other stable assignment.
Possible A potential mate is called possible for a suitor if there is a stable assignment that pairs them.
Honest A rational number $\frac{p}{q}$ is honest if and only if p and q have no common factors and q > 1.
3.6.4 Theorems
Theorem 3.1 The Euclidean Division Algorithm Let a and b be integers with $b \ne 0$. Then there exist unique integers q and r such that a = bq + r and $0 \le r < |b|$.
Theorem 3.2 The Fundamental Theorem of Arithmetic Every integer n, with $n \ge 2$, can be uniquely written as a product of primes in ascending order.
Axiom 1 The Well-Ordering Principle Every nonempty set of natural numbers has a smallest element.
Proposition 3.1 Let a, b, and c be integers, with $a \ne 0$. If $a \mid b$ and $a \mid c$, then $a \mid (b + c)$.
Proposition 3.2 Let a, b, and c be any integers, with $a \ne 0$. If $a \mid b$, then $a \mid (bc)$.
Proposition 3.3 Let a, b, and c be integers with $a \ne 0$ and $b \ne 0$. If $a \mid b$ and $b \mid c$, then $a \mid c$.
Proposition 3.4 If n is a positive composite number, then n has at least one prime factor p with $1 < p \le \sqrt{n}$.
Corollary 3.1 If a positive integer p > 1 has no divisor d with $1 < d \le \sqrt{p}$, then p is prime.
Proposition 3.5 If the integer n is not even, then $n^2$ is not even.
Proposition 3.6 The number $\sqrt{2}$ is irrational.
Proposition 3.7 If x and y are real numbers with x < y, then there exists a real number z with x < z < y.
Proposition 3.8 Let a and b be any two distinct real numbers. Then the following are equivalent.
(a) $a < b$
(b) $a < \frac{a+b}{2}$
(c) $\frac{a+b}{2} < b$
Theorem 3.3 The Infinitude of the Primes There are an infinite number of distinct primes.
Theorem 3.4 Let a and b be integers such that at least one is not 0. Then there are integers, s and t, such that gcd(a, b) = as + bt.
Proposition 3.9 Prime Divisibility Property Let p be a prime. If p divides the product $a_1 a_2 \cdots a_n$, then p divides at least one of the factors $a_i$.
Theorem 3.5 Theorem of Mathematical Induction If {P(i)} is a set of statements such that
1. P(1) is true and
2. $P(i) \to P(i + 1)$ for $i \ge 1$,
then P(k) is true for all positive integers k. This can be stated more succinctly as
$$[P(1) \land (\forall i,\ P(i) \to P(i + 1))] \to [\forall k,\ P(k)].$$
Theorem 3.6 Sum of the First n Positive Integers
$$\sum_{j=1}^{n} j = \frac{n(n+1)}{2}$$
for all positive integers, n, with $n \ge 1$.
Theorem 3.7 Theorem of Complete Induction If {P(i)} is a set of statements such that
1. P(1) is true and
2. $[P(1) \land P(2) \land \cdots \land P(i)] \to P(i + 1)$ for $i \ge 1$,
then P(k) is true for all positive integers k. This can be stated more succinctly as
$$[P(1) \land (\forall i,\ [P(1) \land P(2) \land \cdots \land P(i)] \to P(i + 1))] \to [\forall k,\ P(k)].$$
Theorem 3.8 Partial Sum of a Geometric Progression The sum of the first n + 1 elements of a geometric progression depends on the value of $r \in \mathbb{R}$.
$$\sum_{i=0}^{n} r^i = \begin{cases} \dfrac{1 - r^{n+1}}{1 - r} = \dfrac{r^{n+1} - 1}{r - 1} & \text{if } r \ne 1 \\ n + 1 & \text{if } r = 1 \end{cases}$$
Corollary 3.2 Sum of a Geometric Progression If $r \in \mathbb{R}$ and $|r| < 1$, then $\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}$.
Theorem 3.9 Optimality of the Deferred Acceptance Algorithm The deferred acceptance algorithm produces an assignment which is optimal for every suitor.
Theorem 3.10 The Well-Ordering Principle, Mathematical Induction, and Complete Induction are Equivalent The following are equivalent:
* the well-ordering principle
* the theorem of mathematical induction
* the theorem of complete induction
Theorem 3.7 (restated) WOP → CI The well-ordering principle implies the theorem of complete induction.
Theorem 3.11 CI → MI The theorem of complete induction implies the theorem of mathematical induction.
Theorem 3.12 MI → WOP The theorem of mathematical induction implies the well-ordering principle.
Theorem 3.13 Distance in $\mathbb{R}^3$ Let $a = (x_1, y_1, z_1)$ and $b = (x_2, y_2, z_2)$ be points in $\mathbb{R}^3$. Then the distance from a to b is $d(a, b) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$.
Theorem 3.14 Distance in $\mathbb{R}^2$ Let $a = (x_1, y_1)$ and $b = (x_2, y_2)$ be points in $\mathbb{R}^2$. Then the distance from a to b is $d(a, b) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
Theorem 3.15 $\sqrt{n} \notin (\mathbb{Q} - \mathbb{Z})$ For any positive integer n, $\sqrt{n}$ is either a positive integer, or it is irrational.
Lemma 3.1 If $\frac{p}{q}$ is an honest fraction, then so is $\left(\frac{p}{q}\right)^2$.
Corollary 3.3 If $\left(\frac{p}{q}\right)^2$ is not an honest fraction, then neither is $\frac{p}{q}$.
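Closed forms such as the geometric partial sum are easy to misremember, so a quick exact check can be useful while reviewing. The sketch below assumes the standard statement $\sum_{i=0}^{n} r^i = \frac{r^{n+1} - 1}{r - 1}$ for $r \ne 1$ (and $n + 1$ for $r = 1$); the function names are hypothetical, and `fractions.Fraction` is used so that the comparison is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction

def geometric_partial_sum(r, n):
    # Direct evaluation of r^0 + r^1 + ... + r^n
    r = Fraction(r)
    return sum(r ** i for i in range(n + 1))

def closed_form(r, n):
    # (r^(n+1) - 1) / (r - 1) when r != 1, and n + 1 when r = 1
    r = Fraction(r)
    if r == 1:
        return Fraction(n + 1)
    return (r ** (n + 1) - 1) / (r - 1)

for r in (Fraction(1, 2), Fraction(-3), Fraction(1), Fraction(5, 3)):
    print(r, all(geometric_partial_sum(r, n) == closed_form(r, n) for n in range(10)))

# With |r| < 1 the partial sums approach 1 / (1 - r); for r = 1/2 that limit is 2.
print(float(geometric_partial_sum(Fraction(1, 2), 50)))
```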
3.6.5 Sample Exam Questions 1. Describe the basic components of an axiomatic mathematical system.
7. Use mathematical induction to prove that the sum of the first n odd positive integers is n 2 :
2. Let x be an even integer and y be an odd integer. What is wrong with the following proof that x + 2y = 3 y - 1 ? Incorrect Proof Since x is even, there is an integer, n, such that x = 2n. Since y is odd, there is an integer, n, such that y = 2n + 1. Therefore, x+2y=2n+2(2n+l)=6n+2=3(2n+l)-I = 3y-
Dl
1.
3. Describe how (and why) an indirect proof works. 4. Consider gcd(140, 336). (a) What is the value of gcd(l40, 336)? Show your work. (b) Use the Euclidean division algorithm to find integers, s and t, such that 140s + 336t = gcd(140, 336). 5. State the well-ordering principle. 6. What is the numeric limit of
i=0 00)
"
n Vn E Z with n > 1, 1:(2k - 1) = n 2 . k=1 8. Let a I, a2.
a, be n real numbers, with n > 1. Prove: a +a 2 +.+an 1. Then n+l n
2. By setting x = 2n and y = 2n + 1, the proof assumes that x and y are consecutive integers. The claim is true in that case, but it will fail for nonconsecutive integers (such as 2 and 5).
3. The logical equivalence $[P \to Q] \Leftrightarrow [(\neg Q) \to (\neg P)]$ implies that the contrapositive of a valid theorem is automatically true. Thus, if the assertion $A \to B$ needs to be proved, it is possible to show that $\neg B \to \neg A$ is true and then conclude that $A \to B$ is true.
Z(2k - 1) = (2(n + 1) - 1) + Z(2k - 1)
k=1
and 336 = 24 • 3 • 7, gcd(140, 336) = 22 - 7 = 28.
(b) The two phases are quite straightforward for this problem. Phase 1: 336 = 140.2 + 56 140=56.2+28
56 = 140. (-2) + 336 28 = 56. (-2) + 140
56 = 28. 2 + 0
0 = 28 • (-2) + 56
ý (n + 1)2. Consequently, P(n + 1) is also true. Since P(1) is true, and for all n > 1, P(n) --) P(n + 1) is true, the theorem of mathematical induction implies that P(n) is true for all n > 1.
8. Suppose, by way of contradiction, that at +a2- .."""+an +
>aj for all j E{l1,2,3.
n aj
1.
9. When n = 3, $2^n + 1 = 9$, which is not a prime. Thus, n = 3 is a counterexample to the claim.
10. (a) At least two simple direct proofs are possible.
i. Notice that $n^3 + 2n^2 = n^2(n + 2)$. Since n is odd, n + 2 must also be odd [there is an integer, k, such that n = 2k + 1, so n + 2 = 2k + 3 = 2(k + 1) + 1 is also odd]. The product $n \cdot n \cdot (n + 2)$ is a product of three odd integers. Exercise 10 on page 143 implies that the product is odd.
ii. Since n is odd, there is an integer, k, such that n = 2k + 1. Thus, $n^3 + 2n^2 = (2k + 1)^3 + 2(2k + 1)^2 = (2k + 3) \cdot (2k + 1)^2$. This is a product of three odd integers, so it is also odd (see the previous direct proof).
(b) An indirect proof seeks a proof of the assertion: let $n \in \mathbb{Z}$ with $n^3 + 2n^2$ even. Then n is even. To establish this claim, assume that $n^3 + 2n^2$ is even. Then there exists an integer, k, with $n^3 + 2n^2 = 2k$. This is equivalent to $n^3 = 2(k - n^2)$. The right-hand side is divisible by the prime, 2, so the left-hand side must also be divisible by 2. The prime divisibility property (Proposition 3.9) implies that 2 divides n and so n is even.
(c) Suppose that n is odd, but $n^3 + 2n^2$ is even. Then there exist integers, k and m, such that n = 2k + 1 and $n^3 + 2n^2 = 2m$. Thus, $(2k + 1)^3 + 2(2k + 1)^2 = 2m$. This simplifies to $(2k + 3) \cdot (2k + 1)^2 = 2m$. The right-hand side is divisible by the prime, 2, so the left-hand side must also be divisible by 2. The prime divisibility property (Proposition 3.9) implies that 2 divides one of the three factors on the left-hand side. But each of those factors is odd, a contradiction. The only way to resolve the contradiction is to assume that when n is odd, $n^3 + 2n^2$ is also odd.
CHAPTER 4 Algorithms
An algorithm is a process for solving a problem. You have been using formal algorithms since you were quite young. For example, in elementary school you learned how to do multidigit addition with paper and pencil. The standard algorithm has you start with the rightmost column and work toward the left, carrying into the next column when the result is greater than 9. Some algorithms are so complex that it is necessary to devise a notation to describe them clearly and unambiguously. We will look at one such notation in the first section of this chapter. It is often critically important that we have a way to compare the efficiency of two algorithms that solve the same problem. If one algorithm can be completed in 10 minutes, whereas the other will take 10 days, we will prefer the former. Mechanisms for measuring algorithm efficiency will be presented in Section 4.2. Creating good algorithms is a challenging (and fun) activity. There will be many opportunities to work with algorithms in the remainder of this book. The final section of this chapter will present several algorithmic solutions to a common software task: determining whether (and where) a pattern of characters appears in a document. The efficiency of each algorithm will also be determined. Before proceeding, it will be helpful to establish a formal definition for the term algorithm.
DEFINITION 4.1 Algorithm An algorithm is a finite sequence of unambiguous steps for solving a problem or completing a task in a finite amount of time.
The following is not an algorithm, even though it consists of a finite sequence of unambiguous steps:
Not an Algorithm Suppose we would like to calculate the sum $\sum_{n=1}^{\infty} \frac{1}{n^2}$. An obvious procedure uses two variables, s (the sum), and n (the index variable).
Step 1 Set s to 0 and n to 1.
Step 2 Add $\frac{1}{n^2}$ to s (and store the result in s).
Step 3 Add 1 to n (and store the result in n).
Step 4 Go back to step 2 and continue.
This procedure is certainly correct but of no practical value because the process never ends (and is therefore not an algorithm).¹
4.1 Expressing Algorithms There are many ways to express algorithms. For example, in Section 1.2.2, the deferred acceptance algorithm was expressed using a paragraph of normal English prose and then it was expressed (without explanation) using pseudocode. Pseudocode more fully achieves the goal of avoiding ambiguity. However, it does take some explanation before all the conventions become clear. This section will introduce those conventions. DEFINITION 4.2 Pseudocode Pseudocode is a semiformal language used to describe algorithms. It is more precise than a prose description but contains less syntactic structure than a compilable computer language. There are no required syntax rules for pseudocode, but there are many useful conventions in notation that have developed over the past 40 years. The intermediate level of structure in pseudocode makes it ideal for communicating algorithms. The structure helps to eliminate ambiguity and aids in clearly communicating the steps. On the other hand, since pseudocode does not require strict adherence to a formal syntax, we do not need to spend effort making sure that every required semicolon and closing parenthesis is in the proper place. Even though there is not a rigid syntactic structure to pseudocode, there are some conventions of notation that are helpful. These conventions come in two major areas: flow of control and flow of information. Flow of control is concerned with the order in which steps are completed. Flow of information is concerned with what data are handed to the algorithm, and what data the algorithm ultimately produces.
4.1.1 Flow of Control In 1966, Böhm and Jacopini published a paper entitled Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules [7]. It wasn't too long before the computer science community realized that their work implied that any algorithm can be expressed using what are now called structured control constructs: sequence, selection, and repetition. The essential idea of structured control is that subcollections of steps need to have a single entry and a single exit. This was in sharp contrast with the then current practice of using many goto statements (step 4 in Example 4.1). Pseudocode allows gotos, but they are discouraged. The three major structured control categories are described next.
Figure 4.1 A sequential block.
Sequence The simplest control structure is sequence. The convention is to read algorithms from top to bottom, completing each step in order unless directed otherwise. Sequential control is the default. You would most likely perform a purely sequential algorithm without even realizing that you were following this simple convention. A sequential block of instructions (Figure 4.1) clearly has a single entry (just above the first instruction in the sequence) and a single exit (just after the final instruction in the block).
¹ The first person to calculate this sum was Leonhard Euler. The somewhat surprising answer is $\pi^2/6$. See [21] for details.
Selection Selection control constructs allow the algorithm to take different paths for different initial data. For our purposes, it is useful to consider one-way, two-way, and multiway selection.
One-Way Selection One-way selection allows some steps to be completed conditionally. That is, sometimes the steps will be completed and other times they will be skipped. The pseudocode that achieves this is the if-then construct:

    if condition then
        Step 1
        Step 2
        ...
        Step N

The indented steps (1-N here) are called the body of the if-then construct. The steps in the body are conditionally executed² (depending on the value of condition). Often, the word then is omitted in the pseudocode notation. (That convention will be followed in this text.)

    if condition
        Step 1
        Step 2
        ...
        Step N

Figure 4.2 An if-then block.
Steps 1-N are completed only if the proposition condition evaluates to true. The indentation is used to indicate that all N of the steps are dependent on condition being true. The if-then construct (Figure 4.2) is a single-entry, single-exit structure; the exit is
the step immediately following the final step in the if-then (independent of whether the statements in the if-then body are actually executed).
An Algorithm with a Simple If-Then One simple way to calculate the absolute value of a number is to check its sign. Suppose the number x is a number that can be entered via keyboard and we want to display its absolute value on a computer monitor. The following algorithm will accomplish this. The line numbers are included as a convenience to the reader and are not part of the algorithm.

    1: read x from the keyboard
    2: if x < 0
           set x to -x
    3: display x on the monitor

Notice the single entry (line 2) and single exit (line 3 even if the "change the sign of x" step is skipped).
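The absolute value algorithm can be sketched in Python as follows. This is only one possible rendering: the function name `absolute_value` is an assumption, and the keyboard and monitor steps are replaced by a parameter and a return value so the selection can be tried in isolation.

```python
def absolute_value(x):
    """Return |x| using a single one-way selection (if-then)."""
    if x < 0:      # the body runs only when the condition is true
        x = -x     # "change the sign of x"
    return x       # single exit: reached whether or not the body ran


# Stand-in for "display x on the monitor".
print(absolute_value(-7.5))
print(absolute_value(3))
```

Whether the body is executed or skipped, control always arrives at the same next statement, which is the single-exit property described above.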
The absolute value algorithm has combined sequence and selection. Think of the if-then selection as a single step. The algorithm is a sequence with three steps (one of which happens to be a selection).
² The term executed is another way to say "do what the step indicates."
Assignment Operators A useful pseudocode notational convention is to use the compound symbol := to indicate "set the variable on the left to the value on the right." This symbol is therefore called an assignment operator. The absolute value algorithm could then be represented as

    1: read x from the keyboard
    2: if x < 0
           x := -x
    3: display x on the monitor

The computer language C has introduced the symbol = as an alternate notation for the assignment operator. This is a bit unfortunate because we are used to thinking of the symbol = as representing the question "is the left-hand side equal to the right-hand side?". To avoid confusing the assignment operator with the equality operator, C uses the symbol == to indicate the claim that the left-hand side equals the right-hand side. Because C and its derivatives have essentially won the notational wars, we will use = to represent the assignment operator and the compound symbol == to represent the equality operator. You may still see the := notation in older literature. The final version of the absolute value algorithm is thus

    1: read x from the keyboard
    2: if x < 0
           x = -x
    3: display x on the monitor

Two-Way Selection In a two-way selection, if the proposition condition is true, the first set of steps is completed; otherwise, the second set of steps is completed. The indentation is again an important part of the notation.

    if condition then
        Step 1
        Step 2
        ...
        Step N
    else
        Step N+1
        Step N+2
        ...
        Step N+M

or

    if condition
        Step 1
        Step 2
        ...
        Step N
    else
        Step N+1
        Step N+2
        ...
        Step N+M
Calculating Paychecks Suppose that all workers at a company are either salaried or hourly workers. A simplified paycheck calculation might look like the following. The symbol * is used to represent multiplication and / represents division.

    if salaried
        determine yearly_salary
        determine number_of_pay_periods
        paycheck = yearly_salary / number_of_pay_periods
    else
        determine hourly_wage
        determine hours_worked
        determine overtime_hours
        paycheck = hourly_wage * hours_worked + 1.5 * hourly_wage * overtime_hours
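A minimal Python sketch of the paycheck calculation is shown below. It assumes that the "determine" steps have already been carried out and that their results are passed in as parameters; the function name `paycheck` and the default values are illustrative choices, not part of the text.

```python
def paycheck(salaried, yearly_salary=0.0, number_of_pay_periods=1,
             hourly_wage=0.0, hours_worked=0.0, overtime_hours=0.0):
    """Two-way selection: exactly one of the two branches is executed."""
    if salaried:
        pay = yearly_salary / number_of_pay_periods
    else:
        pay = hourly_wage * hours_worked + 1.5 * hourly_wage * overtime_hours
    return pay  # the single exit, reached after either branch


print(paycheck(True, yearly_salary=52000, number_of_pay_periods=26))
print(paycheck(False, hourly_wage=20, hours_worked=40, overtime_hours=5))
```

Both branches assign to the same variable, so the step after the if-else does not need to know which path was taken.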
The two-way selection (also called an if-else construct) is a single-entry, single-exit control structure (Figure 4.3). It doesn't matter whether the true path (then) or the false path (else) is chosen; the step after the if-else will always be the one immediately after the structure.
Figure 4.3 An if-else block.
Multiway Selection A multiway selection construct provides more than two mutually exclusive options (Figure 4.4). In a multiway selection, we can pick from among many mutually exclusive options. We may also optionally provide a default choice if none of the other options are appropriate.
Figure 4.4 A multiway selection block.
One way to express this in pseudocode is as follows:

    if condition_1
        ⋮
    else if condition_2
        ⋮
    else if condition_3
        ⋮
    ⋮
    else if condition_m
        ⋮
    else
        ⋮
where the final else section may be omitted if there is no default collection of steps. As usual, there is a single entry and a single exit. Once one of the conditions is true, the steps in that section are executed and then the first step after the multiway structure is the next executed.
Tuition This simple example illustrates a three-way selection. It will be extended when nesting is discussed. (Notice that scholarships cancel out-of-state tuition.)

    if student is on a scholarship
        fee = standard tuition - amount of scholarship
    else if student is a state resident
        fee = standard tuition
    else
        fee = out-of-state tuition
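The three-way tuition selection maps naturally onto Python's if/elif/else. The sketch below is an illustration only; the function name `tuition_fee` and its parameters are assumptions introduced to make the example self-contained.

```python
def tuition_fee(on_scholarship, state_resident,
                standard_tuition, out_of_state_tuition, scholarship_amount):
    """Multiway selection: the first true condition wins, else is the default."""
    if on_scholarship:
        fee = standard_tuition - scholarship_amount
    elif state_resident:
        fee = standard_tuition
    else:
        fee = out_of_state_tuition
    return fee


# A scholarship student is charged from the standard rate, resident or not.
print(tuition_fee(True, False, 10000, 18000, 2500))
print(tuition_fee(False, True, 10000, 18000, 2500))
print(tuition_fee(False, False, 10000, 18000, 2500))
```

The conditions are tested in order, so a student who is both on a scholarship and a state resident is handled by the first branch only.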
Quick Check 4.1
1. Write an algorithm fragment that returns the sum of two numbers, a and b. Before performing the addition, it should check to see if a equals 0. If it does, a should be converted to 1 before the addition is done.
2. Write an algorithm fragment that determines which of two numbers, a and b, is the smaller. In case of a tie, either one can be chosen.
Repetition The final control structure category consists of those that repeat a collection of statements. These structures can repeat either for a predetermined number of times or for a variable number of times (determined by a condition).
Fixed Iteration When the number of times a collection of steps needs to be executed is known (or is stored in a variable), the proper repetition structure is a fixed iteration. One of the many possible pseudocode expressions of this structure is a simple for loop. The for loop uses an index variable to count how many times the loop has been executed. Each time through the loop, the index variable (i in the following pattern) is incremented. The loop is complete when the index variable gets larger than its terminal value (n in the pattern). Every time step m is executed, control goes back to the top of the for loop. If i is still less than or equal to n, the loop body is executed again. Otherwise, the next statement will be the one immediately after step m.

    for i = 1 to n
        Step 1
        Step 2
        ...
        Step m

The indentation highlights the body of the loop (the statements that are repeated). Note that the body of the loop is using the sequence structure.
Adding To add the numbers 1-100, the following for loop works well.

    1: sum = 0
    2: for i = 1 to 100
       2a: sum = sum + i
       2b: i = i + 1
    3: display the value of sum

Notice the convention used in the assignments; place the value of the right-hand side into the variable on the left. Thus, the statement sum = sum + i means, add i to the current value of sum, then replace the current value of sum by the result. I have chosen to number the steps rather than numbering lines (as is typical) in order to highlight the single entry, single exit nature of the for loop. As soon as i becomes greater than 100, steps 2a and 2b are skipped and step 3 is executed.
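For comparison, here is a hedged Python sketch of the Adding example. Python's `for` loop over `range` increments the index implicitly, so a second version with a `while` loop is included to make step 2b visible; the function names `sum_to` and `sum_to_explicit` are assumptions made for illustration.

```python
def sum_to(n):
    """Add the integers 1 through n with a fixed-iteration loop."""
    total = 0
    for i in range(1, n + 1):   # i takes the values 1, 2, ..., n
        total = total + i       # step 2a; the increment of i is implicit
    return total


def sum_to_explicit(n):
    """Same computation with the index increment written out."""
    total = 0
    i = 1
    while i <= n:               # the loop ends once i is greater than n
        total = total + i       # step 2a
        i = i + 1               # step 2b, made explicit
    return total


print(sum_to(100))
print(sum_to_explicit(100))
```

Both functions leave the loop at the same point (the `return`), which corresponds to step 3 in the pseudocode.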
In the previous example, I explicitly incremented the index variable inside the loop body (line 2b). Some people use an implicit increment. If you see a for loop and the index variable is not explicitly incremented, the convention is that it was incremented by 1 after the final step in the body (but before returning to the top of the loop to compare with the terminal value).
Indefinite Iteration There are two common control structures that allow a loop to be executed until some condition is met. The more common is the while loop, which uses a pretest to determine when to quit. The less common repeat-until loop uses a post-test to determine when to exit the loop.

    while condition
        Step 1
        Step 2
        ...
        Step N

    repeat
        Step 1
        Step 2
        ...
        Step N
    until condition

The while loop tests the condition before entering the body of the loop. If the condition is false, the loop body is not executed and the first statement after step N is the next executed.³ Otherwise, the loop body is executed and then control goes back to the top, where the condition is once again checked. The repeat-until loop always executes the loop body at least once. After the loop body has been executed, the condition is checked. If the condition is false, the loop body is repeated; otherwise the loop is terminated and the first step after the until line is the next to be executed. Note the difference: A while loop keeps executing as long as the condition is true; a repeat-until loop keeps executing as long as the condition is false.
Count Down The next two loops accomplish the same objective: Count down to blastoff.

    i = 10
    while i > 0
        display i
        i = i - 1
    display "Blast Off!!"

    i = 10
    repeat
        display i
        i = i - 1
    until i == 0
    display "Blast Off!!"
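The two Count Down loops can be sketched in Python as shown below. Python has no repeat-until statement, so the post-test loop is emulated here with `while True` and a `break` at the bottom of the body; that emulation, the function names, and the choice to collect the displayed lines in a list are all assumptions made for this illustration.

```python
def countdown_while(start):
    """Pretest loop: the condition is checked before each pass."""
    lines = []
    i = start
    while i > 0:
        lines.append(str(i))    # "display i"
        i = i - 1
    lines.append("Blast Off!!")
    return lines


def countdown_repeat_until(start):
    """Post-test loop: the body runs once before the condition is checked."""
    lines = []
    i = start
    while True:                 # emulates "repeat"
        lines.append(str(i))    # "display i"
        i = i - 1
        if i == 0:              # emulates "until i == 0"
            break
    lines.append("Blast Off!!")
    return lines


print(countdown_while(10))
print(countdown_repeat_until(10))
```

The two versions agree when the starting value is a positive integer. Starting from 0 they differ: the pretest version skips the body entirely, while the post-test version executes the body first and then never sees i equal to 0, so it should not be called with a starting value below 1.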
All three of the loop structures are single-entry, single-exit (Figures 4.5, 4.6, and 4.7 on page 160). There is another, more complex, mechanism that causes a collection of steps to be repeated. This more complex construct is called recursion and will be introduced in Chapter 7.
Nesting Some of the power of the control mechanisms just presented is the ability to combine them. I have already shown examples of combining them by placing one structure after another (sequencing structures as well as steps in an algorithm). It is also possible to combine them using nesting (composition). Any step in one of the patterns can be replaced by a control structure. A second look at the tuition example illustrates this.
³ It is even possible that the condition will be false when the while loop is first encountered. In that case, the loop body will never be executed.
Figure 4.5 A for block.
Figure 4.6 A while block.
Figure 4.7 A repeat block.
Tuition Revisited This algorithm cycles through all students in the school. It also allows out-of-state students to have scholarships. The algorithm also calculates the total amount of tuition to be collected.⁴

    total = 0
    for student = first_student to last_student
        if student is a state resident
            if student is on a scholarship
                fee = standard tuition - amount of scholarship
            else
                fee = standard tuition
        else
            if student is on a scholarship
                fee = out-of-state tuition - amount of scholarship
            else
                fee = out-of-state tuition
        total = total + fee
    display total

Figure 4.8 shows the nesting. The boxes are all single-entry, single-exit constructs.
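The nested structure of Tuition Revisited carries over directly to Python. The sketch below is illustrative only: representing each student as a dictionary with the keys `state_resident`, `on_scholarship`, and `scholarship_amount` is an assumption, as are the function name and the sample figures.

```python
def total_tuition(students, standard_tuition, out_of_state_tuition):
    """Repetition containing a two-way selection whose branches are selections."""
    total = 0
    for student in students:
        if student["state_resident"]:
            if student["on_scholarship"]:
                fee = standard_tuition - student["scholarship_amount"]
            else:
                fee = standard_tuition
        else:
            if student["on_scholarship"]:
                fee = out_of_state_tuition - student["scholarship_amount"]
            else:
                fee = out_of_state_tuition
        total = total + fee      # executed once per student, after the selection
    return total


students = [
    {"state_resident": True, "on_scholarship": True, "scholarship_amount": 1000},
    {"state_resident": True, "on_scholarship": False, "scholarship_amount": 0},
    {"state_resident": False, "on_scholarship": True, "scholarship_amount": 2000},
    {"state_resident": False, "on_scholarship": False, "scholarship_amount": 0},
]
print(total_tuition(students, 10000, 18000))
```

Each indentation level corresponds to one box in the nesting: the outer loop, the residency test, and the scholarship test inside each branch.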
A style of computer programming arose after the control structures we have been examining were found to be sufficient for expressing any algorithm. The style was named structured programming. Structured programming stipulates that programs (algorithms) be expressed using only single-entry, single-exit control structures⁵ from the categories sequence, selection, and repetition. These control structures may be nested.
Quick Check 4.2
1. Write an algorithm fragment that calculates the sum of the first n odd integers.
2. Write an algorithm fragment that reads a stream of characters and prints the total number of vowels at the end.
⁴ For pedagogical reasons, I have used more nesting than is necessary in this example.
⁵ The mechanisms presented here have been extended somewhat by defining two mechanisms for leaving a loop from the middle of its body. The break statement causes the loop to terminate prematurely and control to go to the next step after the loop. The continue statement causes the body to terminate prematurely, but control goes back to the top of the loop (so the loop body may be done again). We will not need these extensions.
Figure 4.8 Nesting of structured constructs.
4.1.2 Flow of Information Algorithms need to specify more than merely the order in which the steps are carried out. They also need to specify what information will be given at the start (the input parameters) and what information will be produced (the return values). At times, it is useful to break an algorithm into two or more communicating pieces. It then becomes necessary to identify the pieces and to define the information they will share. Some notation loosely borrowed from computer programming will provide the mechanisms we need. The pseudocode for an algorithm may optionally give the algorithm a name and define the data values it needs in order to start and also specify the data values that will be produced. We do this by placing an extra line at the beginning of the algorithm.⁶ The general pattern is

    return_value(s) algorithm_name(input_parameters)

It is often important to specify the type of the data elements. This can be done by specifying a data type (such as integer, real number, complex number), followed by the name of the variable that holds the value. The naming of the data types is informal but should be unambiguous. If there are no specific return values, that part can be omitted

    algorithm_name(input_parameters)

or the word void can be used to indicate there is no return value

    void algorithm_name(input_parameters)
Adding a List of Numbers Suppose we have a list of numbers that we wish to add. The algorithm below will accomplish that task and produce the sum as its return value. The integer n specifies the size of the list.

    real sum add_List(integer n, real {a_1, a_2, ..., a_n})
        sum = 0
        for i = 1 to n
            sum = sum + a_i
        return sum

⁶ You do not need to use a bold font for the algorithm name.
The first line specifies that the algorithm add_List will return a real number named sum.⁷ It requires an integer and a collection of real numbers. Another algorithm can use add_List in one of its steps. The next algorithm will display the sums of the first n integers, for values of n from 1 up to 20.

    for n = 1 to 20
        display add_List(n, {1, 2, ..., n})

Since add_List returns a real number, we can use the name add_List as if it is that number.
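A rough Python counterpart of add_List and of the loop that uses it is sketched below. The name `add_list` and the use of a Python list for the collection are assumptions; note also that Python lists are indexed from 0, so `a[i]` plays the role of the element written a_(i+1) in the pseudocode.

```python
def add_list(n, a):
    """Return the sum of the first n entries of the list a."""
    total = 0.0
    for i in range(n):          # i = 0, 1, ..., n - 1
        total = total + a[i]
    return total


# The call can be used wherever a number is expected.
for n in range(1, 21):
    print(n, add_list(n, list(range(1, n + 1))))
```

The parameters `n` and `a` in the definition correspond to the input parameters in the first line of the pseudocode, and the `return` statement supplies the return value.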
The previous example illustrates another notational convention we will need to observe. When a named algorithm is defined, the types of the parameters and return value are specified. The entire algorithm is also written out. When a named algorithm is used, the types are not specified.⁸ In addition, only the first line is written. The details of the steps are not written.
Adding Even Numbers Suppose we wish to add the even numbers from 0 to 2n. The algorithm add_Evens is a simple solution.

    integer add_Evens(integer n)
        sum = 0
        for i = 1 to n
            sum = sum + 2*i
        return sum
    end add_Evens

I have added an optional line at the bottom that indicates the end of the algorithm. In mathematical notation, add_Evens(n) = $\sum_{i=1}^{n} 2i$. Suppose I want to know for which values of n the sum $\sum_{i=1}^{n} 2i$ is evenly divisible by 3. The algorithm threes provides the answer.

    void threes(integer max)
        for i = 1 to max
            sum = add_Evens(i)
            if sum is divisible by 3
                display i and sum
    end threes
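The pair add_Evens and threes might look like this in Python. This is a sketch under two assumptions: the names are adapted to `add_evens` and `threes`, and `threes` returns the qualifying pairs as a list (in addition to displaying them) so the result can be inspected.

```python
def add_evens(n):
    """Return 2 + 4 + ... + 2n, the sum of the even numbers from 0 to 2n."""
    total = 0
    for i in range(1, n + 1):
        total = total + 2 * i
    return total


def threes(maximum):
    """Display each i from 1 to maximum whose even-number sum is divisible by 3."""
    found = []
    for i in range(1, maximum + 1):
        total = add_evens(i)            # one algorithm used as a step of another
        if total % 3 == 0:
            print(i, total)
            found.append((i, total))
    return found


threes(6)
```

Since 2 + 4 + ... + 2i equals i(i + 1), the sum is divisible by 3 exactly when i or i + 1 is a multiple of 3, which is the pattern the displayed values follow.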
Quick Check 4.3
1. Write a complete algorithm that takes a sequence of numbers $a_1, a_2, \ldots, a_n$ and returns the alternating sum $a_1 - a_2 + a_3 - a_4 + \cdots \mp a_{n-1} \pm a_n$. (The final term will be added if n is odd and subtracted if n is even.)
The optional "end threes" line in the previous example visually delimits the named algorithm. Some people prefer to have something more than just indentation to delimit the scope of a selection or repetition statement. One popular notation is to use opening and closing braces for this task. Algorithm threes might be represented as follows:

    void threes(integer max)
        for i = 1 to max {
            sum = add_Evens(i)
            if sum is divisible by 3 {
                display i and sum
            }
        }
    end threes

⁷ If only one return value is specified, the value can be anonymous: In this example, the name sum could have been left out of the first line.
⁸ The parameters in the definition are called formal parameters. When the algorithm is used, the parameters are called actual parameters.
The algorithms in this text are generally simple enough that I have chosen to rely solely on the indentation level. One very useful addition to pseudocode is the ability to add comments. These comments help the human reader understand the code but are not executed when the algorithm is used. Many notations have been used for comments.⁹ One simple option is to start a comment with the symbol # and consider everything between the # and the end of the current line to be a comment.¹⁰

    void threes(integer max)
        # This could go on forever, so quit at the number max.
        for i = 1 to max
            sum = add_Evens(i)       # add the even numbers from 0 to 2i
            if sum is divisible by 3
                display i and sum    # i meets the requirements
    end threes
One final note about pseudocode. The return statement works in a preemptive fashion: It causes the algorithm to terminate immediately.
Return The second return will never be reached because as soon as i reaches 10, the algorithm terminates and sends out the value 10. The for loop never completes. Notice that this algorithm does not require any input data.

    integer preemptive()
        for i = 1 to 20
            if i == 10
                return 10
        return 20
    end preemptive
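The preemptive behavior of return is the same in Python. In the sketch below, the `visited` list is an addition that is not part of the original algorithm; it is passed in only so that the caller can see which values of i were reached before the algorithm terminated.

```python
def preemptive(visited):
    """Return from inside the loop; the statements after the loop are never reached."""
    for i in range(1, 21):
        visited.append(i)   # record that the loop body started for this i
        if i == 10:
            return 10       # terminates the whole function, not just the loop
    return 20               # unreachable for this loop


seen = []
print(preemptive(seen))
print(seen)
```

After the call, `seen` holds only the values up to and including 10, which confirms that the for loop did not complete.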
⁹ For example, // comment, /* comment */, { comment }, and (* comment *) have all been used.
¹⁰ The programming language Perl uses this notation.
4.1.3 Exercises The exercises marked with [D] have detailed solutions in Appendix G.
1. [D] Write an algorithm that makes change for a purchase. Assume that the purchase price is between $0.01 and $1.00 and that the customer has paid exactly $1.00. The algorithm should return a set containing the number of quarters, dimes, nickels, and pennies to return (in that order). The return set should be declared as {integer, integer, integer, integer}.
2. Write an algorithm that makes change for a purchase. The algorithm should return a set containing the number of quarters, dimes, nickels, and pennies to return (in that order). The return set should be declared as {integer, integer, integer, integer}. You may assume that the customer does not pay less than the price. Use as many quarters as necessary.
3. Write a complete algorithm that takes as input a calendar year (from the Gregorian calendar) and returns the number of days in February. The leap year rules for the Gregorian calendar are as follows:
   * A leap year occurs in most years that are evenly divisible by 4.
   * A year that is evenly divisible by 100 is usually not a leap year.
   * If the year is evenly divisible by 400, then it is a leap year.
   For example, 1984 is a leap year; 1983 and 1900 are not. The year 2000 is a leap year. Use one multiway selection structure in your algorithm. You are not required to check divisibility in the same order as the rules just listed.
4. Write an algorithm that determines how many positive integers evenly divide the integer n. The algorithm should return the number of divisors. For example, 4 has three divisors: 1, 2, 4. Call the algorithm numberOfDivisors.
5. Write an algorithm to determine whether an integer is prime or composite. The algorithm should return a 0 if the integer is prime, a 1 if the integer is composite, a 2 if the integer is neither prime nor composite, and a 3 if the integer is less than 1. You may assume that the algorithm numberOfDivisors, defined in Exercise 4, is available.
6. [D] Recall the definition of greatest common divisor (Definition 3.9 on page 97). Write an algorithm that displays the greatest common divisor of two natural numbers, not both zero. If both natural numbers are zero, display an error message saying that this is invalid input.
7. Suppose that Jane does different activities during the various seasons of the year. When it is fall, she goes bike riding, but only if the temperature is at least 50 degrees. If the temperature is not warm enough in the fall, she reads a book instead. During the winter, Jane always plays in the snow. In both the summer and spring, she goes running unless the temperature exceeds 70 degrees. If the temperature is between 71 and 90 degrees, Jane walks around the lake. When the temperature gets above 90 degrees, the only activity she can do is swimming. Write an algorithm that returns the activity that Jane should carry out based on the season and temperature (a real number).
8. A family rolls a standard pair of fair, six-sided dice to help determine which person will do the dishes on Monday night. If the sum of the digits on the pair of dice is odd, then Mom automatically has to do the dishes. A roll of the dice in which the sum of the digits is divisible by 4 means that either Dad or Brother Joe will do the dishes, depending on whether the digits are the same (Dad) or different (Brother Joe). All other rolls of the dice will appoint Sister Sue to do the dishes. Write an algorithm to display the name of the person on dishes duty for Monday night. The algorithm should accept as input two values, which are the numbers of dots on the top faces of the pair of dice.
9. [D] Write an algorithm that takes a sequence of real numbers $x_1, x_2, \ldots, x_n$ and returns the absolute value of the average of these numbers.
10. Suppose that you are making 40 gingerbread men for a Christmas party. For each cookie, you must put the cutter in the dough, place the cookie on the pan, and then add three raisins for buttons. After this is complete, the 40 cookies must be baked. (Assume that you have a big oven and all of the cookies can bake at the same time.) Write an algorithm to describe the process you must go through for bringing gingerbread men to the party. Input and output values are not necessary in this algorithm.
11. Write an algorithm that takes a sequence of letters $a_1, a_2, \ldots, a_n$ and first displays the list in the order given and then in reverse order. To illustrate, if the input sequence is "g, i, r, l," the algorithm will display "girllrig."
12. Recall that points are collinear if they lie on the same line. Write an algorithm that takes a series of ordered pairs, representing points on the real plane, and determines whether or not all of the points are collinear. It returns true if the points are collinear and false if they are not. For instance, if the input is the ordered pairs for four points and only two of the points lie on the same line, the algorithm will return false. Assume that at least two ordered pairs will be used in the algorithm. (Hint: Think about the standard formula for slope. You may also assume that there will be no undefined slopes [representing points that are vertically aligned] when using the slope formula.)
13. Jake works at the school cafeteria and is charged with the duty of preparing bag lunches for some students going on a field trip to the zoo. In each bag lunch, he must enclose a turkey sandwich and a granola bar. The beverage and fruit he encloses varies according to student gender: Boys get apple juice and a banana, while girls get orange juice and an apple. Write an algorithm that accepts as input the number of boys and the number of girls going on the zoo field trip and describes the process Jake goes through in preparing the bag lunches for these students.
14. Write an algorithm that accepts as input a positive integer, n, and performs an alternating sequence of multiplications and additions for all the positive integers up through n. For instance, if the integer 6 is entered, the alternating sequence would be $1 \cdot 2 + 3 \cdot 4 + 5 \cdot 6$. Note that the algorithm will only return the final result after performing the multiplications and additions in this sequence. In your algorithm, assume that n ... and remember the general precedence rules for multiplication and addition. It may be helpful to consider what happens when n is even versus when n is odd.
15. [D] Two integers are relatively prime if 1 is their only common factor. Write the algorithm boolean relativelyPrime(integer a, integer b). You may assume that a > 1 and b > 1.
16. The Euler totient function, $\phi(n)$, determines the number of positive integers that are less than the integer n and are also relatively prime to n (i.e., have no common factors with n other than 1). Write an algorithm to calculate and return $\phi(n)$. For example, $\phi(5) = 4$ since 1, 2, 3, and 4 are relatively prime to 5. Also, $\phi(6) = 2$ (1 and 5 are relatively prime to 6). Assume that the algorithm relativelyPrime, defined in Exercise 15, is available.
4.2 Measuring Algorithm Efficiency The previous section showed a method for expressing the details of an algorithm. The next task is to establish a mechanism for measuring the efficiency of an algorithm. The standard mechanism creates a measure of relative efficiency: An algorithm is given a rating that declares it to be in the same efficiency category as one of a collection of standard reference functions. How do functions enter the discussion? When considering the efficiency of an algorithm, there are two resources that are essential: time and space. Space might consist of how many pieces of paper you need to carry out the steps of an algorithm, but more commonly consists of bytes (or megabytes) of computer memory. An algorithm that uses 1 megabyte of memory is preferred (all other things being equal) to one that requires 8 megabytes of memory.¹¹ The time efficiency of an algorithm has been the more commonly measured resource. What is typically done is to designate some steps in an algorithm to be the most critical steps. We then count how many times those steps are executed. For most algorithms, the number of times the critical steps are executed depends on the size of the initial data set. For example, if the algorithm sorts a list of names, the size of the data set will be the number of names to sort. One critical aspect of most sorting algorithms is the need to compare two names to see which comes lexicographically first. For a simple algorithm, the number of times two names are compared may be about $n^2$ for a list of n names. As the number of names grows, the time the algorithm requires will grow (approximately) as the square of the number of names; double the list and take about four times as long to complete the sort. In the sorting example, the function $g(n) = n^2$ serves as a well-known reference function. The actual sorting algorithm might have a time growth of $f(n) = 120n^2 - 11n + 450$.
The reference function g(n) is simpler to write, and we know what it looks like without needing to draw the graph. In addition, as n gets larger and larger, it becomes easier to see that the functions have essentially the same shape. We can therefore think of g as representing the time behavior of the algorithm; not an exact description, but a good approximation. These ideas will now be formalized.
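The claim that f and g have essentially the same shape for large n can be explored numerically. The short Python sketch below uses the functions from the sorting example, f(n) = 120n^2 - 11n + 450 and g(n) = n^2; the choice of sample sizes is arbitrary, and a table of values is only suggestive, not a proof.

```python
def f(n):
    """Time growth of the hypothetical sorting algorithm from the text."""
    return 120 * n**2 - 11 * n + 450


def g(n):
    """The reference function."""
    return n**2


for n in (10, 100, 1000, 10000):
    ratio_to_reference = f(n) / g(n)     # settles near the leading coefficient
    doubling_factor = f(2 * n) / f(n)    # settles near 4: double the list, four times the work
    print(n, ratio_to_reference, doubling_factor)
```

As n grows, the first ratio approaches 120 and the second approaches 4, which matches the "double the list and take about four times as long" observation.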
4.2.1 Big-Θ and Its Cousins
In this section, several definitions will be given. They formalize notions such as "function f does not grow any faster than function g" or "function f grows just like g for all practical purposes." The definitions will allow for some youthful indiscretion before the comparison is enforced (we will ignore the relative growth rates for "small" values of n). Note that the definitions use a stretched version of g when doing the actual comparisons. Observe the use of n rather than x as the independent variable. The definitions are valid with any real-valued independent variable; however, in the context of algorithm analysis, the independent variable represents the size of a data set and is therefore almost always a positive integer. Recall that $\mathbb{R}^+$ denotes the set of positive real numbers.
11 This is not an exaggerated example; it is possible to take some commercial software and shrink the memory usage by a factor of 8 and still have exactly the same time and functionality performance. This is due to what programmers call bloatware, programs that use more memory than is necessary. See the Bloatbusters link (in the "Textbook-Related Links" section) at http://www.mathcs.bethel.edu/-gossett/DiscreteMathWithProof/ for some examples.
Chapter 4 Algorithms
DEFINITION 4.3 Big-O
The function f is said to be in big-O of g, pronounced "big oh" and denoted $f \in O(g)$, if there are positive constants, $c$ and $n_0$, such that, for all $n > n_0$, $|f(n)| \le c|g(n)|$. Another way to express this is
$$f \in O(g) \quad\text{if and only if}\quad \exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n > n_0) \rightarrow (|f(n)| \le c|g(n)|)].$$
The definition is stating that f is one of the many functions that belong, in a specific sense, to the set of all functions that don't grow faster than g. The constants $n_0$ and $c$ in the definition are not unique (as the next example will demonstrate).
EXAMPLE 4.11 Big-O Basics
It is not hard to show that $f(n) = 120n^2 - 11n + 450$ is in $O(n^2)$. One approach is to graph $f(n)$ and $cg(n)$ for various values of $c$. Once we find a choice of $c$ that keeps $cg(n)$ above the graph of $f(n)$, we can try to prove formally that the definition holds. The graphs in Figures 4.9 and 4.10 show the stretched reference function $cg(n)$ as a dashed line. They provide different choices for the constants $n_0$ and $c$, but both will lead to valid verifications of the definition: There is not just one correct choice of the pair $(n_0, c)$.
[Figure 4.9: Function f(n) is solid, 125g(n) is dashed (c = 125).]
[Figure 4.10: Function f(n) is solid, 150g(n) is dashed (c = 150).]
A possible choice for $n_0$ in the first graph is 40, since clearly the graph of $125g(n)$ stays above that of $f(n)$ for all values of $n$ greater than 40 (actually, for all values of $n$ greater than about 8.45). The second graph appears to work for $n > 20$ (actually $n > 3.69$). A zoomed-in view of Figure 4.9 shows why we ignore the relative behavior for small values of $n$: $f(n)$ is larger for $n < 8$ (Figure 4.11).
[Figure 4.11: A zoomed-in view of Figure 4.9 (c = 125).]
How can the definition be formally verified? Some standard manipulations of inequalities are needed. I initially assume $n > 0$ and drop the absolute value symbols. This is permissible because $f(n) > 0$ for $n > 0$ (but see Quick Check 4.4).
$$\begin{aligned} 120n^2 - 11n + 450 &< 120n^2 + 450 && \text{since } -11n \text{ is not positive} \\ &< 120n^2 + n && \text{if } n > 450 \\ &< 120n^2 + n^2 && \text{since } n > 1 \\ &= 121n^2 \end{aligned}$$
I have made several assumptions about $n$: $n > 0$, $n > 450$, and $n > 1$. The most restrictive assumption (which also guarantees that the others hold) is $n > 450$. We may therefore use $n_0 = 450$ and $c = 121$ to verify the definition. (I could have worked harder and established $n_0 = 17$ for this $c$, but there is no compelling reason since big-O estimates are about what happens for large values of $n$.) ■
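The choices of $(n_0, c)$ discussed in this example can be spot-checked numerically. The sketch below is only an illustration: the helper name `bound_holds` and the cutoff `upto` are assumptions made here, and a finite check of this kind can suggest constants but cannot replace the algebraic verification.

```python
def f(n):
    """Running-time function from the example: 120n^2 - 11n + 450."""
    return 120 * n * n - 11 * n + 450


def g(n):
    """Reference function n^2."""
    return n * n


def bound_holds(c, n0, upto=10_000):
    """True if |f(n)| <= c*|g(n)| for every integer n with n0 < n <= upto.

    This is a finite spot check (hypothetical helper), not a proof: the
    definition quantifies over all n > n0.
    """
    return all(abs(f(n)) <= c * abs(g(n)) for n in range(n0 + 1, upto + 1))


# Pairs (c, n0) mentioned in the text and in Figures 4.9 and 4.10.
for c, n0 in [(121, 450), (121, 17), (125, 40), (150, 20)]:
    print(f"c = {c}, n0 = {n0}: holds up to n = 10,000? {bound_holds(c, n0)}")
```

Trying a smaller $n_0$, such as `bound_holds(125, 7)`, shows the check failing, which matches the zoomed-in view in Figure 4.11.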
Quick Check 4.4
1. Algebraically verify that $2n^2 + 5n + 7$ is in $O(n^2)$.
2. Algebraically verify that $101n + 73$ is in $O(n)$.
3. Algebraically verify that $n^2 - 5n - 4$ is in $O(n^2)$.
EXAMPLE 4.12 Disproving Big-O
How would we show that $f(n) = n^3$ is not in $O(n^2)$? One useful method is a proof by contradiction. We assume that $n^3 \in O(n^2)$ and then arrive at a contradiction. We conclude that the assumption $n^3 \in O(n^2)$ led us astray and so reject the assumption. On to the proof.
Assume that $n^3 \in O(n^2)$. Then, according to the definition of big-O,
$$\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n > n_0) \rightarrow (|n^3| \le c|n^2|)].$$
The negation of this statement,
$$\neg\big(\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n > n_0) \rightarrow (|n^3| \le c|n^2|)]\big),$$
is logically equivalent to
$$\forall c \in \mathbb{R}^+,\ \forall n_0 \in \mathbb{R}^+,\ \exists n \in \mathbb{R}^+,\ [(n > n_0) \wedge (|n^3| > c|n^2|)].$$
Given $c$ and $n_0$, we are looking for an $n$ such that $n > n_0$ and $|n^3| > c|n^2|$. Since $n \in \mathbb{R}^+$, $n > 0$. We can drop the absolute value signs since both functions are positive for $n > 0$. We can then rewrite the inequality as $n^3 - cn^2 > 0$, or $n^2(n - c) > 0$. Clearly, the factor $n^2$ is not negative, so we must have $n - c > 0$, or $n > c$. No matter what value $c$ and $n_0$ have, it is always possible to find an $n$ with $n > \max(c, n_0)$. The desired contradiction has been found, so we reject the assumption that $n^3 \in O(n^2)$. ■
EXAMPLE 4.13 Big-O Is Deficient
There is a deficiency with the big-O definition (at least for the purposes of algorithm analysis). To see the deficiency, it is easy to observe graphically that $f(n) = 101n + 73 \in O(n^2)$ (Figure 4.12 on page 168).
[Figure 4.12: Big-O isn't sufficient.]
However, as $n$ gets large, $f(n) = 101n + 73$ does not grow nearly as fast as $g(n) = n^2$. To say that $101n + 73$ grows no faster than $n^2$ for large $n$ is quite an understatement. What we really want is a reference function that grows at the same rate as $f$. [For $f(n) = 101n + 73$, $g(n) = n$ is the best choice.] ■
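One way to see how loose the bound $101n + 73 \in O(n^2)$ is: for every positive constant $c$, however small, $cn^2$ eventually overtakes $f(n)$ and stays above it. The sketch below finds that crossover point by direct search; the function name `crossover` is an assumption introduced for this illustration.

```python
def f(n):
    """The function from the example: 101n + 73."""
    return 101 * n + 73


def crossover(c):
    """Smallest positive integer n with c*n^2 > f(n), assuming c > 0.

    Because c*n^2 - f(n) is an upward-opening quadratic that is negative
    at n = 0, the inequality keeps holding for every larger n as well.
    """
    n = 1
    while c * n * n <= f(n):
        n += 1
    return n


for c in (1, 0.1, 0.01):
    print(f"c = {c}: c*n^2 exceeds 101n + 73 from n = {crossover(c)} onward")
```

Shrinking $c$ only delays the crossover; it never prevents it. No positive multiple of $n^2$ stays below $f$, which is why $n^2$ is a poor description of how $f$ grows.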
Quick Check 4.5
1. Algebraically verify that $101n + 73$ is in $O(n^2)$.
One way to remedy the deficiency in the big-O definition is to create another definition.12 The definition we really want is big-Θ, but another useful definition needs to be presented first.
DEFINITION 4.4 Big-Ω
The function f is said to be in big-Ω of g, pronounced "big omega" and denoted $f \in \Omega(g)$, if there are positive constants, $c$ and $n_0$, such that, for all $n > n_0$, $|f(n)| \ge c|g(n)|$. Another way to express this is
$$f \in \Omega(g) \quad\text{if and only if}\quad \exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n > n_0) \rightarrow (|f(n)| \ge c|g(n)|)].$$
So $f$ is in big-Ω of $g$ if $f$ eventually stays larger than some positive constant multiple of $g$.
EXAMPLE 4.14 Big-Ω Basics
It is not hard to show (graphically and algebraically) that $f(n) = 120n^2 - 11n + 450$ is in $\Omega(n^2)$. Figure 4.13 shows that $c = 100$ and $n_0 = 20$ verify the definition ($c = 1$ also works, but the graph is rather boring).
[Figure 4.13: Verifying that $f(n) = 120n^2 - 11n + 450$ is in $\Omega(n^2)$.]
12 The big-O definition has been around too long to revise it.
The algebraic verification is also easy. Assume $n > 0$. Then
$$\begin{aligned} 120n^2 - 11n + 450 &> 120n^2 - 11n \\ &> 120n^2 - 11n^2 && \text{if } n > 1 \\ &= 109n^2. \end{aligned}$$
Since $109n^2 > 0$, taking absolute values of both sides will not reverse the inequality. Thus
$$|120n^2 - 11n + 450| > |109n^2| = 109|n^2|.$$
Let $n_0 = 1$ and $c = 109$ in the big-Ω definition. ■
Quick Check 4.6
1. Use a graphing calculator or a software package such as Mathematica or Maple to find choices for $c$ and $n_0$ that suggest
4
log 2 (-) C
02
(log 2 (n)).
Z
The next definition is the most useful for algorithm analysis.
DEFINITION 4.5 Big-Θ
The function f is said to be in big-Θ of g, pronounced "big theta" and denoted $f \in \Theta(g)$, if $f \in O(g) \cap \Omega(g)$.
EXAMPLE 4.15 Big-Θ Basics
Examples 4.11 and 4.14 demonstrated that $f(n) = 120n^2 - 11n + 450$ is in $O(n^2)$ and also in $\Omega(n^2)$. Therefore, $f \in \Theta(n^2)$. The graph in Figure 4.14 shows that $f$ can be contained between two different multiples of $n^2$, once $n$ is large enough.
[Figure 4.14: $f(n) = 120n^2 - 11n + 450$ between $c_1 n^2$ and $c_2 n^2$, with $c_1 = 100$ and $c_2 = 125$.]
By establishing that $f \in \Theta(n^2)$, we know that $f$ grows in essentially the same manner as $n^2$, a reference function we are quite familiar with. In addition, we have demonstrated that $f$ grows substantially faster than a function that is in $\Theta(n)$, but fundamentally slower than one that is in $\Theta(n^3)$. ■
Social Convention Wins over Correctness
The notion that an algorithm is in $\Theta(g)$, for some reference function $g(n)$, is the one that properly captures the notion "the algorithm's efficiency changes like $g(n)$ as the data size $n$ increases." The big-O definition leaves a loophole that was described in Example 4.13. Unfortunately, for many years and in many scholarly books and papers, the concept of big-O has been used in a somewhat loose manner. Many authors make the statement $f \in O(g)$ but really intend you to recognize that even more is true (namely, $f \in \Theta(g)$). I will try to use the more complete notation whenever it has been verified.
Nevertheless, you need to recognize that the phrase "big-O" is used at times as if it were big-Θ. It is also common practice to say that $f$ is $O(g)$ instead of $f$ is in $O(g)$. It is even common for less precise notation to be used. For example, you may often see claims such as $f = O(g)$, which might be read out loud as "f is big-O of g." Another variation would be to write something like "$f(n) = 5n^2 + O(n)$," meaning that $f$ is a sum of the function $5n^2$ and other terms that are in $O(n)$.
4.2.2 Practical Big-Θ Tools
For normal evaluation of an algorithm's performance, it is helpful to have some shortcuts that enable us to skip the algebraic details involved in proving the details of the definitions. You have encountered a similar situation in calculus: You first learned (for very good reasons) the definition of the derivative:
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
You then studied some theorems that provided shorter ways to determine $f'$, given $f$. For example, if $f(x) = x^n$, then $f'(x) = nx^{n-1}$. You also learned more complex shortcuts such as the product rule and the chain rule. Fortunately, shortcut theorems exist for big-Θ comparisons. They will be examined after a collection of important reference functions are presented.
A Brief Introduction to the World's Favorite Reference Functions
Most algorithms can be classified as being similar (in a big-Θ sense) to one of the functions listed in Table 4.1. The functions are listed with the slower-growing functions at the top and faster-growing functions at the bottom. In the context of algorithm analysis, slow growth is good (as $n$ increases, the algorithm takes very little additional time to complete). The top entries (constant growth and logarithmic growth) are almost too good to be true. In fact, an algorithm that takes the same amount of time to complete no matter how large the input data set [and is therefore in $\Theta(1)$], is almost certainly an algorithm that ignores the data and is consequently useless.
TABLE 4.1 Standard reference functions
Category | Reference Function
Constant | $1$
Logarithmic | $\log_2(n)$
Linear | $n$
n log n | $n \log_2(n)$
Quadratic | $n^2$
Cubic | $n^3$
Exponential | $a^n$ for $a > 1$
As will be shown shortly, an algorithm with exponential runtime is essentially an impractical algorithm. You may be able to complete a few simple cases (which could be done by hand quicker than you can write a program to implement the algorithm), but it will take the algorithm longer to finish than you can wait if given any mildly interesting set of data. The graphs in Figure 4.15 show the standard reference functions. Notice that the vertical axes are not the same scale. The function $n \log_2(n)$ appears on both graphs.
[Figure 4.15: Comparing the standard reference functions.]
A Brief Review of Logarithmic Functions
You should have previously memorized the definition and properties of logarithmic functions, but just for completeness, they are repeated here. If you have not memorized them, you will profit from doing so now.
DEFINITION 4.6 Logarithmic Functions
The logarithmic function with base $b$, denoted $\log_b(x)$, is defined by the relationship
$$\log_b(x) = y \quad\text{if and only if}\quad b^y = x, \qquad \text{for } x > 0,$$
where $0 < b$ and $b \ne 1$.
The most commonly used logarithmic functions are
common logs: $\log_{10}(x)$, usually denoted $\log(x)$ on calculators
natural logs: $\log_e(x) = \int_1^x \frac{1}{t}\,dt$, usually denoted $\ln(x)$
base 2 logs: $\log_2(x)$, the prime candidate in discrete mathematics
All logarithmic functions share some common properties:
THEOREM 4.1 Properties of Logarithmic Functions
• $\log_b(1) = 0$
• $\log_b(b) = 1$
• $\log_b(x^n) = n \log_b(x)$ for any real number $n$ and $x > 0$
• $\log_b(xy) = \log_b(x) + \log_b(y)$ for $x > 0$ and $y > 0$
• $\log_b\left(\frac{x}{y}\right) = \log_b(x) - \log_b(y)$ for $x > 0$ and $y > 0$
Note well: There are no theorems for simplifying $\log_b(x + y)$ or $\log_b(x - y)$.
The final theorem in this brief review is the change of base formula. It shows how you can use the common log or natural log functions built into a calculator to find a base 2 log.13
13 If you are using the log functions that come with most computer languages, they will be natural logs,
THEOREM 4.2 Change of Base Formula
For all legal bases $a$ and $b$ and all $x > 0$,
$$\log_b(x) = \frac{\log_a(x)}{\log_a(b)}.$$
Thus,
$$\log_2(10) = \frac{\ln(10)}{\ln(2)} \approx \frac{2.3026}{.69315} \approx 3.3219.$$
Theorem 4.2 makes it possible to replace the standard reference function, $\log_2(n)$, by any other base logarithm (see Exercise 12 in Exercises 4.2.3). It is common to see natural logs, as well as logs with unspecified base, used as reference functions. The next two examples will provide two views of the practical implications of the vast differences in growth represented by the reference functions.
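The change of base formula is easy to try out in a programming language whose basic `log` function is the natural log. The sketch below uses Python's standard `math` module; the helper name `log_base` is an assumption made for this illustration.

```python
import math


def log_base(b, x):
    """Base-b logarithm computed with natural logs: ln(x) / ln(b).

    Assumes b > 0, b != 1, and x > 0, matching Definition 4.6.
    """
    return math.log(x) / math.log(b)


print(f"ln(10)      = {math.log(10):.4f}")
print(f"ln(2)       = {math.log(2):.5f}")
print(f"log_2(10)   = {log_base(2, 10):.4f}")
print(f"math.log2   = {math.log2(10):.4f}  (built-in base 2 log, for comparison)")
```

Python happens to provide `math.log2` directly, so the helper is needed only for bases without a built-in function.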
EXAMPLE 4.16 Comparing Reference Functions, Part 1
Suppose we have a computer that can execute one instruction per nanosecond. That is, it takes $10^{-9}$ seconds to execute one machine instruction. Thus it can execute 1 billion instructions per second. Table 4.2 compares six hypothetical algorithms. The row labels represent the time complexity of the respective algorithms. For example, the fourth row represents a quadratic algorithm. The columns represent several different sizes of the data set. Making the simplifying assumption that algorithm $i$ uses $g_i(n)$ machine instructions for a data set of size $n$, entry $(i, j)$ in the table represents the amount of time it will take algorithm $i$ to run on a data set of size $j$. For example, $g_4(n) = n^2$. If $n = 10{,}000$ then $10{,}000^2 = 100{,}000{,}000$ instructions are needed, each requiring $10^{-9}$ seconds to execute. The total time is therefore 0.1 second.
TABLE 4.2 Time needed to process n items for six algorithms
$g_i(n)$ | n = 10,000 | n = 100,000 | n = 1,000,000 | n = 250,000,000
$\log_2(n)$ | 13 nanoseconds | 17 nanoseconds | 20 nanoseconds | 28 nanoseconds
$n$ | 0.00001 seconds | 0.0001 seconds | 0.001 seconds | 0.25 seconds
$n \log_2(n)$ | 0.00013 seconds | 0.00166 seconds | 0.01993 seconds | 6.97434 seconds
$n^2$ | 0.1 second | 10.0 seconds | 16 minutes, 40 seconds | ≈ 1.98 years
$n^3$ | 16 minutes, 40 seconds | 11 days, 14 hours | ≈ 32 years | 500 million years
$2^n$ | $6 \times 10^{2993}$ years | Too long to contemplate | - | -
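The entries in Table 4.2 can be recomputed from the stated assumption of one instruction per nanosecond. The sketch below is an illustration only: the dictionary and function names are assumptions, and a 365.25-day year is assumed for the exponential row, which is handled through logarithms because $2^{10{,}000}$ is far too large for a floating-point number.

```python
import math

SECONDS_PER_INSTRUCTION = 1e-9  # one instruction per nanosecond, as in the example

# Reference functions g_i(n); the keys are labels chosen for this sketch.
REFERENCE_FUNCTIONS = {
    "log2(n)": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log2(n)": lambda n: n * math.log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
}


def run_time_seconds(name, n):
    """Seconds needed if the algorithm uses exactly g(n) instructions."""
    return REFERENCE_FUNCTIONS[name](n) * SECONDS_PER_INSTRUCTION


def exponential_years_log10(n, seconds_per_year=365.25 * 24 * 3600):
    """log10 of the number of years needed to execute 2^n instructions."""
    return (n * math.log10(2)
            + math.log10(SECONDS_PER_INSTRUCTION)
            - math.log10(seconds_per_year))


for n in (10_000, 100_000, 1_000_000, 250_000_000):
    row = ", ".join(f"{name}: {run_time_seconds(name, n):.5g} s"
                    for name in REFERENCE_FUNCTIONS)
    print(f"n = {n:,}: {row}")

print(f"2^n with n = 10,000: about 10^{exponential_years_log10(10_000):.1f} years")
```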
To put Table 4.2 in perspective, consider sorting the census data for the United States (population close to 300 million). A standard result in a course on data structures states that the simplest sorting algorithms are in $\Theta(n^2)$. The better algorithms for this task are in $\Theta(n \log_2(n))$. Would you prefer to wait around 3 years, or a bit under 10 seconds?14 ■
We can look at the previous example from another viewpoint.
EXAMPLE 4.17 Comparing Reference Functions, Part 2
Suppose that we want to run each of the algorithms in Example 4.16 for 1 minute. What size data set would each be able to process?
13 (continued) independent of the notation. Computer systems such as Mathematica have more flexible log notation. For example, Mathematica uses Log[b, x] to represent base b logs.
14 The actual times will be longer in both cases because the assumption "algorithm $i$ uses $g_i(n)$ machine instructions for a data set of size $n$" is not really true.
Consider the quadratic algorithm. We want to solve the equation below for $n$.
$$(10^{-9} \text{ seconds/instruction}) \cdot (n^2 \text{ instructions}) = 60 \text{ seconds}$$
It is simple to determine that $n = \sqrt{60{,}000{,}000{,}000} \approx 244{,}949$. The equation
$$(10^{-9} \text{ seconds/instruction}) \cdot (n \log_2(n) \text{ instructions}) = 60 \text{ seconds}$$
is harder to solve since it reduces to $n \log_2(n) = 60{,}000{,}000{,}000$. Using a numerical approximation technique such as Newton's method, we find $n \approx 2$ billion data items. Table 4.3 shows the size data set each of the standard reference functions can complete in one minute.
TABLE 4.3 Number of items per minute
$g_i(n)$ | data items in 1 minute
$\log_2(n)$ | $2^{6 \times 10^{10}}$
$n$ | 60 billion
$n \log_2(n)$ | 2 billion
$n^2$ | 244,949
$n^3$ | 3,915
$2^n$ | 36
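The text mentions Newton's method for solving $n \log_2(n) = 60{,}000{,}000{,}000$. A minimal sketch of that calculation follows; the function name, the starting guess, and the stopping tolerance are all assumptions made for this illustration rather than details taken from the text.

```python
import math


def solve_n_log2_n(target, tol=1e-3, max_iter=100):
    """Approximate the n > 1 with n*log2(n) = target using Newton's method.

    h(n)  = n*log2(n) - target
    h'(n) = log2(n) + 1/ln(2)
    """
    n = target / math.log2(target)  # rough starting guess (an assumption)
    for _ in range(max_iter):
        h = n * math.log2(n) - target
        step = h / (math.log2(n) + 1 / math.log(2))
        n -= step
        if abs(step) < tol:
            break
    return n


INSTRUCTIONS_PER_MINUTE = 60 / 1e-9  # 60 seconds at 10^-9 seconds per instruction

print(f"n log2(n): n is about {solve_n_log2_n(INSTRUCTIONS_PER_MINUTE):,.0f}")
print(f"n^2:       n is about {math.sqrt(INSTRUCTIONS_PER_MINUTE):,.0f}")
print(f"n^3:       n is about {INSTRUCTIONS_PER_MINUTE ** (1 / 3):,.0f}")
```

The computed root is a little under 2 billion, consistent with the rounded entry in Table 4.3.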
Quick Check 4.7
1. Suppose that you are able to use pencil and paper to process the steps of an algorithm. Compare how long you will take to process 10 items and 100 items using two different algorithms. Algorithm 1 is in $\Theta(n \log_2(n))$ and algorithm 2 is in $\Theta(n^2)$. You may assume that the number of steps is $n \log_2(n)$ or $n^2$, respectively, and that you take 30 seconds per step. Convert your answers to appropriate time units (minutes or hours).
2. Using the information from the previous question, how many data items can you process in one hour using algorithm 1? How many items in one hour using algorithm 2?
Big-Θ Shortcuts
It is now time to introduce the shortcut theorems that allow us to make practical big-Θ evaluations. The proof of the first theorem requires a few ideas you may have encountered in a calculus class.
DEFINITION 4.7 Limit of a Sequence
Consider the sequence of real numbers $b_0, b_1, b_2, \ldots$. The limit, $b$, as $n$ goes to infinity is denoted by $\lim_{n \to \infty} b_n = b$. It is defined if and only if for every real number $\epsilon > 0$ there is an integer, $N$, such that
$$|b_n - b| < \epsilon \quad \text{for all } n \text{ with } n > N.$$
The sequence converges to $\infty$, denoted by $\lim_{n \to \infty} b_n = \infty$, if for every positive integer, $M$, there is an integer, $N$, such that
$$b_n > M \quad \text{for all } n \text{ with } n > N.$$
So the sequence has a limit if it stays arbitrarily close to $b$ as $n$ gets larger and larger.
Simple Sequence Limits The sequence 0 , .... , defined by
bn = 1 - -T, for n = 0, 1,2 has the limit 1. e i edby yb f= f r n = 0 ,2 . . converges o v r e defined n2 • , forn=0,,2, The sequence 0, '2! '3 4' 49 . .. d to 0C.
The proof of Theorem 4.4 uses the triangle inequality. See Exercise 24 in Exercises 3.2.10 on page 116 for a hint about how to prove this important result.
THEOREM 4.3 The Triangle Inequality
If $a$ and $b$ are real numbers, then $|a + b| \le |a| + |b|$.
THEOREM 4.4 Big-Θ and Polynomials
Let $f$ be the polynomial function $f(n) = a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0$, where $a_k \ne 0$. Then $f \in \Theta(n^k)$.
Proof: To show $f \in \Theta(n^k)$, it must first be established that $f \in O(n^k)$. It is easy to generalize the procedure used in Quick Check 4.4 to establish that $2n^2 + 5n + 7$ is in $O(n^2)$.
Assume that $n > 0$. Then each of the terms $n^i$ is positive so $|a_i n^i| = |a_i| n^i$. Also, if $n > 1$, then $n^i < n^j$ whenever $i < j$. Therefore (using the triangle inequality),
$$\begin{aligned} |a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0| &\le |a_k| n^k + |a_{k-1}| n^{k-1} + \cdots + |a_1| n + |a_0| \\ &\le |a_k| n^k + |a_{k-1}| n^k + \cdots + |a_1| n^k + |a_0| n^k \\ &= \Big(\sum_{i=0}^{k} |a_i|\Big) n^k. \end{aligned}$$
We can take $c = \sum_{i=0}^{k} |a_i|$ and $n_0 = 1$ in Definition 4.3.
To complete the proof that $f \in \Theta(n^k)$, it must also be shown that $f \in \Omega(n^k)$. The ideas from Example 4.14 cannot be generalized unless $a_k$ plus the sum of the negative coefficients is a positive number (true in Example 4.14 but not true in general). [Try the techniques used in Example 4.14 for $f_2(n) = -120n^2 - 11n + 450$ or for $f_3(n) = 120n^2 - 120n$. They will fail.] It will be necessary to use limits to complete this part of the proof.
Since $a_k \ne 0$, the following limit makes sense. Assume that $n > 1$. Then
$$\lim_{n \to \infty} \frac{|f(n)|}{\frac{1}{2}|a_k| n^k} = \lim_{n \to \infty} \frac{|a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0|}{\frac{1}{2}|a_k| n^k} = \lim_{n \to \infty} 2\left|1 + \sum_{j=0}^{k-1} \frac{a_j}{a_k} \cdot \frac{1}{n^{k-j}}\right| = 2,$$
since $\frac{1}{n^{k-j}} \to 0$ as $n \to \infty$.
Therefore, using Definition 4.7, for every $\epsilon > 0$, there is some integer $N$ (that depends on the choice of $\epsilon$) such that
$$\left|\frac{|f(n)|}{\frac{1}{2}|a_k| n^k} - 2\right| < \epsilon \quad \text{for all } n > N.$$
It will be useful to assume that $\epsilon < 1$.15 In that case, $2 - \epsilon > 1$. Recalling that $|x| < y$ if and only if $-y < x < y$, we now know that
$$\frac{|f(n)|}{\frac{1}{2}|a_k| n^k} > 2 - \epsilon > 1, \quad\text{so}\quad |f(n)| > \frac{|a_k|}{2} n^k \quad \text{for all } n > N.$$
Therefore, Definition 4.4 is satisfied by taking $c = \frac{|a_k|}{2}$ and $n_0 = N$. Both requirements in Definition 4.5 have now been verified, so $f \in \Theta(n^k)$. □
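The big-O half of the proof is constructive: it names the constants $c = \sum_{i=0}^{k} |a_i|$ and $n_0 = 1$. The sketch below computes that $c$ for a given coefficient list and spot-checks the inequality over a finite range. The function names and the convention of listing coefficients from $a_k$ down to $a_0$ are assumptions made for this illustration, and the check is not a substitute for the proof.

```python
def evaluate(coeffs, n):
    """Evaluate a polynomial by Horner's rule; coeffs = [a_k, ..., a_1, a_0]."""
    value = 0
    for a in coeffs:
        value = value * n + a
    return value


def big_o_witness(coeffs):
    """Constants (c, n0) from the proof: c is the sum of |a_i|, n0 is 1."""
    return sum(abs(a) for a in coeffs), 1


def witness_holds(coeffs, upto=2_000):
    """Spot-check |f(n)| <= c * n^k for integers n0 < n <= upto."""
    k = len(coeffs) - 1
    c, n0 = big_o_witness(coeffs)
    return all(abs(evaluate(coeffs, n)) <= c * n ** k
               for n in range(n0 + 1, upto + 1))


for coeffs in ([120, -11, 450], [-120, -11, 450], [120, -120, 0]):
    c, n0 = big_o_witness(coeffs)
    print(f"coefficients {coeffs}: c = {c}, n0 = {n0}, holds? {witness_holds(coeffs)}")
```

The constant produced this way is usually far from the smallest possible (581 here, where Example 4.11 used 121), which is harmless because the definition only asks that some $c$ exist.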
The use of limits in the proof of Theorem 4.4 suggests an alternative approach to showing that $f \in \Theta(g)$. This approach is explored in Exercises 23, 24, and 25 in Exercises 4.2.3. The main ideas are simple. If $f$ and $g$ grow at essentially the same rate, then $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ should have a limit that is neither 0 nor infinity (probably related to a reasonable candidate for the constant, $c$, in Definitions 4.3 and 4.4). On the other hand, if $g$ grows at a faster rate than $f$ [and hence $f$ is trivially in $O(g)$], then $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ should be 0. Finally, if $f$ grows at a faster rate than $g$ [and is therefore in $\Omega(g)$], then $\frac{f(n)}{g(n)}$ should converge to $\infty$ as $n \to \infty$.
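The three cases described above can be explored numerically by sampling the quotient $f(n)/g(n)$ at increasingly large $n$. The sketch below does that for functions already used in this section; the helper name `ratio_samples` and the choice of sample points $10^1, \ldots, 10^6$ are assumptions made for illustration. Samples like these only suggest what the limit might be; they do not prove that a limit exists.

```python
def ratio_samples(f, g, exponents=range(1, 7)):
    """Values of f(n)/g(n) at n = 10^k for each k in exponents."""
    return [f(10 ** k) / g(10 ** k) for k in exponents]


same_rate = ratio_samples(lambda n: 120 * n * n - 11 * n + 450, lambda n: n * n)
g_faster = ratio_samples(lambda n: 101 * n + 73, lambda n: n * n)
f_faster = ratio_samples(lambda n: n ** 3, lambda n: n * n)

print("f and g at the same rate:", [round(r, 4) for r in same_rate])
print("g grows faster than f:   ", [round(r, 6) for r in g_faster])
print("f grows faster than g:   ", f_faster)
```

The first list settles near 120, the second shrinks toward 0, and the third grows without bound, matching the three situations in the paragraph.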
Polynomials Are Easy
Theorem 4.4 makes finding the proper reference function for a polynomial into a trivial task. For example,
$$.001n^6 + 10{,}000n^5 + 45{,}876n^3 + 85n + 1{,}000{,}000{,}000 \in \Theta(n^6).$$
Notice that the sizes of the coefficients are irrelevant when looking for the reference function. Even though the leading coefficient is small, the function $.001n^6 + 10{,}000n^5 + 45{,}876n^3 + 85n + 1{,}000{,}000{,}000$ will eventually shoot above any polynomial of degree 5 or less. ■
Recall that whenever $f_1(x)$ and $f_2(x)$ are defined, the sum $f_1 + f_2$ is defined by $(f_1 + f_2)(x) = f_1(x) + f_2(x)$ and the product $f_1 \cdot f_2$ is defined by $(f_1 \cdot f_2)(x) = f_1(x) \cdot f_2(x)$.16 The next two theorems are similar in spirit to some derivative theorems in calculus: $(f + g)'(x) = f'(x) + g'(x)$ and $(f \cdot g)'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x)$. However, in the big-O and big-Θ context, the product theorem is the simpler rule.
THEOREM 4.5 Big-Θ and Products
Suppose that $f_1 \in \Theta(g_1)$ and $f_2 \in \Theta(g_2)$. Then $f_1 \cdot f_2 \in \Theta(g_1 \cdot g_2)$.
Proof: Once again, it is necessary first to show that $f_1 \cdot f_2 \in O(g_1 \cdot g_2)$ and then show that $f_1 \cdot f_2 \in \Omega(g_1 \cdot g_2)$.
Since $f_1 \in \Theta(g_1)$, it is also true that $f_1 \in O(g_1)$. Therefore, there must exist constants $c_1$ and $n_1$ so that $|f_1(n)| \le c_1|g_1(n)|$ for all $n > n_1$. Similarly, there are constants $c_2$ and $n_2$ such that $|f_2(n)| \le c_2|g_2(n)|$ for all $n > n_2$.
Set $n_0 = \max\{n_1, n_2\}$ and $c = c_1 \cdot c_2$. Then for all $n > n_0$
$$\begin{aligned} |(f_1 \cdot f_2)(n)| &= |f_1(n) \cdot f_2(n)| \\ &= |f_1(n)| \cdot |f_2(n)| \\ &\le (c_1|g_1(n)|) \cdot (c_2|g_2(n)|) \\ &= (c_1 \cdot c_2)|g_1(n) \cdot g_2(n)| \\ &= c|(g_1 \cdot g_2)(n)|, \end{aligned}$$
establishing that $f_1 \cdot f_2 \in O(g_1 \cdot g_2)$. A very similar calculation establishes that $f_1 \cdot f_2 \in \Omega(g_1 \cdot g_2)$. □
15 This assumption doesn't invalidate the proof since there is an $N$ for every $\epsilon$. It would be better to use the notation $N_\epsilon$.
16 There is content to these statements. Read the first as "the new function $f_1 + f_2$ sends the number $x$ to $f_1(x) + f_2(x)$."
DEFINITION 4.8 The Pointwise Maximum Function
Suppose $g_1(x)$ and $g_2(x)$ are defined on some domain. Then the function $\max\{g_1, g_2\}$ is defined as the pointwise maximum of $g_1$ and $g_2$:
$$\max\{g_1, g_2\}(x) = \max\{g_1(x), g_2(x)\}.$$
The definition says that at every $x$, $\max\{g_1, g_2\}(x)$ is assigned the value of whichever original function is larger at that $x$.
THEOREM 4.6 Big-Θ and Sums
Suppose that $f_1 \in \Theta(g_1)$ and $f_2 \in \Theta(g_2)$. Assume also that there is a positive integer, $n_0$, such that for all $n > n_0$, $f_1(n) > 0$, $f_2(n) > 0$, $g_1(n) > 0$ and $g_2(n) > 0$. Then $(f_1 + f_2) \in \Theta(\max\{g_1, g_2\})$.
Proof of Theorem 4.6: The proof begins just like the proof of Theorem 4.5. Since $f_1 \in \Theta(g_1)$, it is also true that $f_1 \in O(g_1)$. Therefore, there must exist constants $c_1$ and $n_1$ so that $|f_1(n)| \le c_1|g_1(n)|$ for all $n > n_1$. Similarly, there are constants $c_2$ and $n_2$ such that $|f_2(n)| \le c_2|g_2(n)|$ for all $n > n_2$.
Set $n_3 = \max\{n_1, n_2\}$ and $c_3 = c_1 + c_2$. Set $g(n) = \max\{g_1(n), g_2(n)\}$. Then for all $n > n_3$
$$\begin{aligned} |(f_1 + f_2)(n)| &= |f_1(n) + f_2(n)| \\ &\le |f_1(n)| + |f_2(n)| \\ &\le (c_1|g_1(n)|) + (c_2|g_2(n)|) \\ &\le (c_1|g(n)|) + (c_2|g(n)|) \\ &= (c_1 + c_2)|g(n)| \\ &= c_3|g(n)| \\ &= c_3 \cdot |\max\{g_1, g_2\}(n)|. \end{aligned}$$
This shows that $(f_1 + f_2) \in O(\max\{g_1, g_2\})$.
To show that $(f_1 + f_2) \in \Omega(\max\{g_1, g_2\})$, it will be necessary to assume that $f_1(n) > 0$ and $f_2(n) > 0$ for all $n > n_0$. Since $f_1 \in \Theta(g_1)$, it is also true that $f_1 \in \Omega(g_1)$. Therefore, there must exist constants $c_4$ and $n_4$ so that $|f_1(n)| \ge c_4|g_1(n)|$ for all $n > n_4$. Similarly, there are constants $c_5$ and $n_5$ such that $|f_2(n)| \ge c_5|g_2(n)|$ for all $n > n_5$.
Set $n_6 = \max\{n_0, n_4, n_5\}$. Also, let $c_6 = \min\{c_4, c_5\}$ and $g(n) = \max\{g_1(n), g_2(n)\}$. Then for all $n > n_6$
$$\begin{aligned} |(f_1 + f_2)(n)| &= |f_1(n) + f_2(n)| \\ &= f_1(n) + f_2(n) && \text{since the functions are positive for } n > n_0 \\ &= |f_1(n)| + |f_2(n)| \\ &\ge (c_4|g_1(n)|) + (c_5|g_2(n)|) \\ &\ge c_6|g_1(n)| + c_6|g_2(n)| \\ &\ge c_6|g(n)| && \text{since the sum of two positives is larger than either} \\ &= c_6 \cdot |\max\{g_1, g_2\}(n)|. \end{aligned}$$
□
To see why it was necessary to make the assumption that $f_1$ and $f_2$ were positive for all $n > n_0$, consider the functions $f_1(n) = 3n^2$ and $f_2(n) = -3n^2$. Both are easily seen to be in $\Theta(n^2)$, so $g_1(n) = n^2$ and $g_2(n) = n^2$. Thus $\max\{g_1, g_2\}(n) = n^2$. However, $(f_1 + f_2)(n) = 0$ for all $n$. This will clearly not be greater than or equal to $cn^2$ for a positive constant $c$! Hence, $(f_1 + f_2) \notin \Omega(n^2)$.
You may also be wondering why the function $\max\{g_1, g_2\}(n)$ is used as the reference function instead of the more natural $(g_1 + g_2)(n)$. It is true that $(f_1 + f_2) \in \Theta(g_1 + g_2)$ under the assumptions of the theorem. Notice, however, that $\max\{g_1, g_2\}(n) \le (g_1 + g_2)(n)$. Thus, when finding a big-O estimate for $(f_1 + f_2)(n)$, we have $|(f_1 + f_2)(n)| \le c|\max\{g_1, g_2\}(n)| \le c|(g_1 + g_2)(n)|$ for all $n > n_0$. The reference function that stays closer to $f_1 + f_2$ is preferred.
A New Reference Function
Suppose we measure an algorithm's time performance and find that it is $3n(7\log_2(n) + 11n^2) + 12n\log_2(n)$. What is a good reference function (in a big-Θ sense)?
Using Theorem 4.6, a good reference function for the sum $7\log_2(n) + 11n^2$ is $n^2$ (notice that both terms are positive for $n > 1$). The term $3n(n^2)$ has $n^3$ as its reference function, using Theorem 4.5. [It is easy to see that $3n \in \Theta(n)$ and $11n^2 \in \Theta(n^2)$; take the product of the two reference functions $n$ and $n^2$.] We now have the function $n^3 + 12n\log_2(n)$. Using the ranking of the common reference functions (Section 4.2.2), it is clear that $n^3$ is the larger. Thus, $3n(7\log_2(n) + 11n^2) + 12n\log_2(n) \in \Theta(n^3)$. ■
A Social Convention
When choosing a reference function, you should never use any coefficients (other than 1). Thus, $n^3$ is an acceptable reference function, but $5n^3$ is not. [We can easily show that $5n^3 \in \Theta(n^3)$, so the 5 is really unnecessary.] Also, using Theorems 4.4 and 4.6, it is not acceptable to leave lower ranking terms in the reference function. For example, the reference function $n\log_2(n) + 5n + 9$ can be replaced by $n\log_2(n)$ since for large enough $n$, $n\log_2(n) > 5n + 9$.
Quick Check 4.8
1. Find good reference functions for the following:
(a) $6n + 7n(\log_2(n) + 9)$
(b) $(2n + \log_2(n))(3\log_2(n) + 8)$
(c) $\log_2(n^2 + n)$ (Hint: Use properties of logs.)
The following definitions will be used in the remainder of this book.
DEFINITION 4.9 Floor Function; Ceiling Function
The floor and ceiling functions are defined for all real numbers $x$ by
$$\text{floor}(x) = \lfloor x \rfloor = \text{the largest integer in the interval } (x - 1, x]$$
and
$$\text{ceiling}(x) = \lceil x \rceil = \text{the largest integer in the interval } [x, x + 1).$$
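Most programming languages provide these two functions directly. The sketch below uses Python's `math.floor` and `math.ceil` and checks the interval descriptions from Definition 4.9 on a few sample values; the helper names and the sample values are assumptions chosen for illustration. Each interval has length 1 with exactly one endpoint included, so it contains exactly one integer.

```python
import math


def floor_in_interval(x):
    """True if math.floor(x) lies in the interval (x - 1, x]."""
    return x - 1 < math.floor(x) <= x


def ceiling_in_interval(x):
    """True if math.ceil(x) lies in the interval [x, x + 1)."""
    return x <= math.ceil(x) < x + 1


for x in (-2.5, -1.0, 0.0, 0.3, 2.0, 7.999):
    print(f"x = {x:6}: floor = {math.floor(x):3}, ceiling = {math.ceil(x):3}, "
          f"intervals ok? {floor_in_interval(x) and ceiling_in_interval(x)}")
```

Note that for negative inputs the floor moves away from zero: the floor of -2.5 is -3, not -2.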
4.2.3 Exercises
The exercises marked with ► have detailed solutions in Appendix G.
12. Show that $\log_b(n) \in \Theta(\log_2(n))$ for any $b > 0$ with $b \ne 1$. Generalize this result.
1. Use a graphing calculator or a software package to graphically experiment with the big-O definition (Definition 4.3). In each case, either find values for $c$ and $n_0$ that show $f_i \in O(n^2)$, or else give some justification that $|f_i(n)| > cn^2$ will always occur for every $c > 0$, for large enough values of $n$ [and so $f_i \notin O(n^2)$].
(a) $f_1(n) = 6n + 9$
(b) $f_2(n) = n^2 + 1000$
(c) ► $f_3(n) = 3n\log_2(n)$
(d) $f_4(n) = \ldots$
(e) $f_5(n) = 2^n$
2. Use a graphing calculator or a software package to graphically experiment with the big-Ω definition (Definition 4.4). In each case, either find values for $c$ and $n_0$ that show $f_i \in \Omega(n^2)$, or else give some justification that $|f_i(n)| < cn^2$ will always occur for every $c > 0$, for large enough values of $n$ [and so $f_i \notin \Omega(n^2)$].
(a) $f_1(n) = 6n + 9$
(b) $f_2(n) = n^2 + 1000$
(c) $f_3(n) = 3n\log_2(n)$
13. Suppose someone proves that an algorithm is in $\Theta(1)$.
(a) Write what this means in terms of Definitions 4.3 and 4.4.
(b) Explain intuitively what this means in terms of the sizes of data sets.
14. Prove Theorem 4.2.
15. Extend the table in Example 4.16 to include a column for $n = 300{,}000{,}000$. You may omit the $2^n$ row.
16. Suppose the algorithms in Example 4.17 are moved to a computer that requires a microsecond per instruction ($10^{-6}$ seconds). How many data items can each algorithm complete in 30 seconds?
17. Prove the big-Ω part of Theorem 4.5.
18. Use the big-Θ theorems to find good reference functions for the following:
(a) $f_1(n) = 3n^2 + 5n(2n + 7)$
(d) ► $f_4(n) = \ldots$
(e) $f_5(n) = 2^n$
3. Recall that in Definition 4.5, $n$ is any real number (although we usually consider it as an integer). Define $f(x) = \lfloor x \rfloor \cdot \lceil x \rceil$. Show graphically that $f(x) \in \Theta(x^2)$. Write the values of $c_1$, $c_2$, and $n_0$ on the graph.
4. ► For each function in Exercise 1, show algebraically that it is in $O(n^2)$, or else show algebraically that it is not in $O(n^2)$. [For part (e) you may want to use limits and L'Hôpital's rule.]
5. ► For each function in Exercise 2, show algebraically that it is in $\Omega(n^2)$, or else show algebraically that it is not in $\Omega(n^2)$.
(b) f 2 (n)
-
n(n+2) 2
(c) D f3(n) = 1210092(n) + n)(n + 3n 10gz(n)) + 6n2 (d) f 4 (n) = 3n + 3n 6 + 5 log 2 (n) (e) f5(n) = n(n1) (6n2 + log 2 (n))
19. Use the big-0 theorems to find good reference functions for the following g 2 (a) O fl (n) = (2n + 7) 10g 2 (n) + 2" (n3 + 4) 7 2 5 .2n) log2 (n •2n) + 2(n+2) + (n = (b) f2(n) 4 (c) f3(n) = (n + 2( n))(n3n + log2 (n) + n2)
6. Which of the functions in Exercise I are in 0(n2)? Justify
(d) f 4 (n) = (n 2 log 2 (n))
your answer.
8. For each pair of functions, show algebraically that f c 0 or f 0 0(g). Note that in order to prove f c 0(g), it must be shown that f E 0(g) and f E Q (g). Show your work in investigating both of these requirements. 5 (a) f(n) = 12n 12+ + 3n 4
-
n - 7; g(n) = n 5
(b) ýP f(n) = n+4; g(n) = n I-n
2
2
20. Show that $\log_2(n!) \in \Theta(n\log_2(n))$.
21. ► Which function grows faster, $2^n$ or $n!$? Justify your answer algebraically.
22. Let fl (x) =
Jxi and
-
f2 (x) = [x].
(a) Show that max(fl, f 2 )(x) = f 2 (x) forx > 0. (b) Show that (fl + f2) ý Q([x]), i.e, that (fl + f2) 0 Q(max{fI, f2}).
(c) O f(n) = n!;g(n) = nn
(Recall that limno
+ n(log 2 (n) + n 2 )
(e) f 5 (n) = 5n 3 + (n + n3)(4n + 3n2)
7. Algebraically verify that $f(n) = 120n^2 - 11n + 450$ from Example 4.14 is in $\Omega(n^2)$ with $c = 1$.
2
= 0.)
2 (d) f(n) = 5n +2n +3" g(n) 6 n2 (e) f(n) = 3 g(n) = n
23. Prove Theorem 4.7 and Propositions 4.1 and 4.2.
4
(g)
l
Let f and g be real-valued functions for which g(n) A 0 for n > no, for some integer, no > 0. If there is a real
9. P Prove that LxJ E 0(x). 10. Let x, y, z • R with x, y, z > 0 and y A 1. Prove that
number, r A 0, such that
xlog.(Z) = zlogy(X)"
lim
g(n)
logyn-*on
t then 11. Prove that [log 2 (x)j E 0(log 2 (x)).
f(n)
f E 9(g).
r,
4.2 Measuring Algorithm Efficiency PROPOSITION 4.1 f E 0(g) Let f and g be real-valued functions for which g(n) 7= 0 for n > no, for some integer, no > 0. If fl-O (n)no (n)
(b) Let f and g be real-valued functions for which g(n) 0 0 for n > n 0 , for some integer, no > 0. If f E 0(g), then f(n) lim
f
n-c -g(n) for some real number, r 0 0.
f e 2•(g)
Let f and g be real-valued functions for which g(n) # 0 for n > n 0 , for some integer, no > 0. If lim f(n) rn7 1g-then
r
(c) Let f and g be real-valued functions for which g(n) • 0 for n > no, for some integer, no > 0. If f E 2 (g), then
then f E (g-). PROPOSITION 4.2
179
lim
f(n)
=o.
n--*oc g(n)
(d) Let f and g be real-valued functions for which g(n) # 0 for n > no, for some integer, no > 0. If f E 8(g), then
=c,
there is a real number, r : 0, such that
f n • (g).
24. The theorem and propositions in Exercise 23 are not "if and only if" assertions. Simple counterexamples show that their converses are false. Find counterexamples for each of the following claims. (Hint: Think about the paragraph that follows the proof of Theorem 4.4, on page 175.) (a) P Let f and g be real-valued functions for which g(n) 0 0 for n > no, for some integer, no > 0. If f E 0(g), then f(n) n-lim g n 0.
lim
n-c
f (n) =r. g(n)
(Hint: f and g can stay close together in a big-Θ sense without the quotient $\frac{f(n)}{g(n)}$ ever converging. One of the functions might dance around the other forever.)
25. Use Theorem 4.7 in Exercise 23 to provide an alternative proof for Theorem 4.4.
26. Use Theorem 4.7 in Exercise 23 to verify your solutions to Exercise 18 (you may need to use L'Hôpital's rule).
4.2.4 Big-Θ in Action: Searching a List
As stated earlier in this chapter, the main point of big-Θ is to provide a mechanism for comparing the efficiency of competing algorithms. In this section, two algorithms for searching a list will be compared. Some additional ideas will also be presented: best case, worst case, and average case behavior. The problem will first be described, then the two algorithms explained and analyzed. You should review Theorem 3.6 on page 119. This result is used often enough in the analysis of algorithms that you should memorize it. The second algorithm requires the following additional theorem.
THEOREM 4.8 Logarithms Are Order Preserving
If $0 < x < y$, then $\log_b(x) < \log_b(y)$.
Searching a List
Suppose that a list, $\{a_0, a_1, a_2, \ldots, a_{n-1}\}$, of n items has been produced.^17 An item with the value x exists, and we need to determine whether x is in the list. (Perhaps x represents a student and the list is a school's database of student information.)
Sequential Search
The simplest solution is to start at the beginning of the list and compare each item, in turn, to x. As soon as x is found, the algorithm could return the position in the list where x resides. If x is not found, the algorithm could return a "not found" message.^18 This algorithm is called a sequential search^19 and can be expressed as

1: integer sequentialSearch(x, {a_0, a_1, a_2, ..., a_{n-1}})
2:    for i = 0 to n-1
3:       if x == a_i
4:          return i            # x == a_i so exit and return i
5:    # x did not match any of the a's
6:    return "not found"
7: end sequentialSearch

17. Most popular programming languages start lists at subscript 0.
18. In a traditional programming language like C, we could return -1 to represent "not found." In a language such as Java, we could throw an exception.
The critical step in the algorithm is at line 3. The amount of time the algorithm needs is strongly related to the number of times this step is executed. Since the comparison is inside a loop, the number of times it executes is related to the size of the list.
Best Case If x happens to be in the first position of the list ($x = a_0$), then line 3 is executed only one time. The best case behavior is therefore in $\Theta(1)$.
Worst Case The worst case occurs in two ways: Either x is found in the final position ($x = a_{n-1}$), or else x is not in the list. In either case, line 3 will be executed n times. Thus, the worst case behavior is in $\Theta(n)$.
Average Case Average case behavior is usually harder to analyze than best case or worst case. For the search problem, it is made more troublesome by the need to worry about items that are not found. What proportion of all searches result in "not found"? That is something that we don't know without knowing more about the list and its intended use. The analysis that will be done here will just consider average behavior for successful searches. In addition, it will be assumed that every item in the list is equally likely to be the target of the search.
Intuitively, we would expect that on average, about half the list should be searched before x is found. More formally, since items are equally likely to be the target of the search, we can just average the cost of finding each item. To find that $x = a_i$ requires line 3 to be executed $i + 1$ times. We want the sum of these costs, divided by the number of costs that are being averaged. This is
$$\frac{1 + 2 + 3 + \cdots + n}{n} = \frac{1}{n}\sum_{i=1}^{n} i = \frac{1}{n}\cdot\frac{n(n+1)}{2} = \frac{n+1}{2} \in \Theta(n).$$
The average case behavior for a successful search is in $\Theta(n)$.
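The best, worst, and average case counts above can be checked with a small experiment. The following Python sketch is only an illustration, not the text's pseudocode: it returns -1 for "not found" (as the footnote suggests for C), and the names `sequential_search` and `average_successful_cost` are invented for this example. It counts executions of the critical comparison and averages them over all successful searches.

```python
from fractions import Fraction


def sequential_search(x, items):
    """Return (index, comparisons); index is -1 when x is not in items."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1              # the critical comparison x == a_i
        if x == item:
            return i, comparisons
    return -1, comparisons


def average_successful_cost(items):
    """Average number of comparisons when every item is an equally likely target."""
    total = sum(sequential_search(x, items)[1] for x in items)
    return Fraction(total, len(items))


items = list(range(100, 110))                     # n = 10 distinct items
best = sequential_search(items[0], items)[1]      # target in the first position
worst = sequential_search(items[-1], items)[1]    # target in the final position
missing = sequential_search(-1, items)[1]         # target not in the list
average = average_successful_cost(items)          # should equal (n + 1) / 2
print(best, worst, missing, average)
```

For a list of n = 10 items the average comes out to (10 + 1)/2, matching the closed form derived above.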
Binary Search
The more sophisticated approach is to use a binary search. To understand the motivation for this algorithm, think about our list as a phone book. The sequential search would start at the first page and look at names in alphabetical order until the name you wanted was found. This is a poor strategy for looking up a name in a large phone directory. A better approach is to open the directory to the middle page and determine whether the name you are looking for is in the first half or the second half of the phone book. Suppose the name is in the first half. You could then open to the page that is one-quarter of the way through the directory and determine whether the name comes before or after (or on) that page. Each step of the way, you are rejecting lots of names without ever looking at them.^20
A binary search assumes that the list is in some kind of lexicographical order (perhaps alphabetic or numerical order). The concept of successively dividing the list in half is a simple one. However, properly expressing the algorithm is not trivial: There are many subtle errors that can occur. There are several correct variants (depending on when we determine whether a match has occurred). The one given here uses a multiway selection. Notice the use of the floor function, ⌊x⌋, in line 5 to round the quotient down to an integer.

1: integer binarySearch(x, {a_0, a_1, a_2, ..., a_{n-1}})
2:    low = 0                      # index of left edge of the list's active portion
3:    high = n-1                   # index of right edge of the list's active portion
4:    while low <= high
5:       mid = ⌊(low + high)/2⌋    # index of middle of active portion (rounded down)
6:       if x > a_mid              # ignore the left half next iteration
7:          low = mid + 1
8:       else if x

19. Also called linear search.
20. You might think about an even better strategy: Look at the first letter in the name, then estimate the distance from the front of the directory that the letter will be listed. This is called an interpolation search.
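The pseudocode above is cut off in this copy, so the following Python sketch should be read as one standard three-way variant of binary search rather than as the text's exact algorithm. The function name `binary_search`, the -1 return value for "not found", and the probe counter are assumptions made for illustration.

```python
def binary_search(x, items):
    """Return (index, probes); index is -1 when x is not in the sorted list."""
    low, high = 0, len(items) - 1     # edges of the active portion
    probes = 0
    while low <= high:
        mid = (low + high) // 2       # floor of the midpoint
        probes += 1
        if x > items[mid]:            # ignore the left half next iteration
            low = mid + 1
        elif x < items[mid]:          # ignore the right half next iteration
            high = mid - 1
        else:
            return mid, probes        # x == items[mid]
    return -1, probes


items = list(range(0, 2000, 2))       # n = 1000 sorted items
# Largest probe count over both successful and unsuccessful searches.
worst_probes = max(binary_search(x, items)[1] for x in range(-1, 2001))
# For n >= 1, n.bit_length() equals floor(log2(n)) + 1.
print(worst_probes, len(items).bit_length())
```

Each probe halves the active portion, so the probe count cannot exceed floor(log2(n)) + 1; for n = 1000 that limit is 10, far below the up to 1000 comparisons a sequential search may need.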
1, since n < n 2 if n > I if n > I
4.5 QUICK CHECK SOLUTIONS
Since $2n^2 + 5n + 7 \le 14n^2$, it would be nice to conclude that $|2n^2 + 5n + 7| \le 14|n^2|$. Note, however, that $-5 < 2$ but $|-5| \not< |2|$. We need to make sure that $0 \le 2n^2 + 5n + 7$ before taking absolute values. Fortunately, the function $2n^2 + 5n + 7$ is concave up (since it is quadratic with positive leading coefficient), so for large enough n, the function will be greater than zero. In fact, it is easy to see graphically that this function is positive for all n. Thus $|2n^2 + 5n + 7| \le 14|n^2|$ for $n \ge 1$ (i.e., $n_0 = 1$ and $c = 14$).
2.
$$101n + 73 \le 101n + 73n \quad \text{if } n \ge 1$$
$$= 174n$$
Let $n_0 = 1$ and $c = 174$. Then $0 \le 101n + 73$ and $|101n + 73| \le 174|n|$. Another approach is to notice that $101n + 73 = 102n$ when $n = 73$. Thus, the definition can be verified with $n_0 = 73$ and $c = 102$: $|101n + 73| \le 102|n|$ for $n \ge 73$.
3. The term $-5n$ requires some extra caution. It is important that the absolute values in the definition not be dropped. (In Example 4.11, the absolute values were dropped because the function $120n^2 - 11n + 450$ is positive for all $n \ge 0$, the smallest value of n being considered.) The function $n^2 - 5n - 4$ does not have this property. For example, when $n = 1$, $n^2 - 5n - 4 = -8 < 1 = n^2$, but $|n^2 - 5n - 4| = 8 \not\le 1 = |n^2|$. Assume that $n > 0$. Then
$$|n^2 - 5n - 4| \le |n^2 - 5 \cdot 5 - 4| \quad \text{if } n \ge 5$$
$$= |n^2 - 29| \le |n^2|.$$
Using $n_0 = 6$ and $c = 1$ in Definition 4.2 completes the proof.
An alternative approach is to notice that
$$n^2 - 5n - 4 = \left(n - \frac{5 - \sqrt{41}}{2}\right)\left(n - \frac{5 + \sqrt{41}}{2}\right),$$
so $n^2 - 5n - 4 > 0$ if $n \ge 6$. Thus, assuming that $n \ge 6$, the absolute values can be temporarily dropped. Then $0 < n^2 - 5n - 4 < n^2 - 5n < n^2$. Consequently, $|n^2 - 5n - 4| \le 1 \cdot |n^2|$, and so $n^2 - 5n - 4 \in O(n^2)$ (using $n_0 = 6$ and $c = 1$).
Quick Check 4.5
1.
$$101n + 73 \le 101n + 73n \quad \text{if } n \ge 1$$
$$= 174n$$
$$\le 174n^2 \quad \text{if } n \ge 1$$
Let $n_0 = 1$ and $c = 174$. Since $0 \le 101n + 73$, $|101n + 73| \le 174|n^2|$ for $n \ge 1$.
Quick Check 4.6
1. The following graph suggests that $c = \frac{1}{8}$ and $n_0 = 5$ should work.
[Graph omitted.]
Quick Check 4.7
1. The table represents the approximate times.

Θ              n = 10          n = 100
n log_2(n)     16.6 minutes    5.5 hours
n^2            50.0 minutes    83.3 hours
2. There are 3600 seconds in an hour, so we need to solve the equations
$$30n \log_2(n) = 3600 \qquad 30n^2 = 3600$$
Using your graphing calculator, or Newton's method, or a software function such as Mathematica's NSolve function, the results are
algorithm 1: 25-26 items
algorithm 2: almost 11 items

Quick Check 4.8
1. There are alternate paths you might take [for example, in part (a), you might first expand $7n(\log_2(n) + 9)$ as $7n \log_2(n) + 63n$].
(a) Using Theorem 4.6 (after checking that both functions are positive), the factor $\log_2(n) + 9$ can be replaced by $\log_2(n)$ because $9 \in \Theta(1)$. The new term $(7n)(\log_2(n))$ can be replaced by $n \log_2(n)$ using Theorem 4.5 [and observing that $7n \in \Theta(n)$]. This leaves the function $6n + n \log_2(n)$. Observing that $n \log_2(n)$ grows faster than n, Theorem 4.6 implies that $6n + n \log_2(n) \in \Theta(n \log_2(n))$. The final result is that $6n + 7n(\log_2(n) + 9) \in \Theta(n \log_2(n))$.
(b) The simpler approach is to notice that $2n + \log_2(n) \in \Theta(n)$ (look at the famous reference functions) and that $3 \log_2(n) + 8 \in \Theta(\log_2(n))$. Thus, $(2n + \log_2(n))(3 \log_2(n) + 8) \in \Theta(n \log_2(n))$. You could also start by expanding the original product. It is not difficult to show that
$$6n \log_2(n) \le (2n + \log_2(n))(3 \log_2(n) + 8) \le 33n \log_2(n) \quad \text{for } n \ge 2$$
so you could graphically verify this result if it seems counterintuitive.
(c) This requires the properties of log functions. First, observe that $\log_2(n^2 + n) = \log_2(n(n + 1)) = \log_2(n) + \log_2(n + 1)$. Next, notice that
$$\log_2(n) \le \log_2(n + 1) \le \log_2(n + n) = \log_2(2) + \log_2(n) = 1 + \log_2(n) \quad \text{for } n \ge 1$$
and
$$1 + \log_2(n) \le \log_2(n) + \log_2(n) = 2\log_2(n) \quad \text{for } n \ge 2$$
so
$$\log_2(n) \le \log_2(n + 1) \le 2\log_2(n) \quad \text{for } n \ge 2.$$
Thus $\log_2(n + 1) \in \Theta(\log_2(n))$. Using Theorem 4.6, it should now be clear that $\log_2(n^2 + n) \in \Theta(\log_2(n))$.
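The arithmetic in Quick Checks 4.7 and 4.8 can be spot-checked numerically. The Python sketch below is only a sanity check, not a proof: the 30 seconds per operation is inferred from the equations $30n \log_2(n) = 3600$ and $30n^2 = 3600$, and the function names are invented for this example. The inequality test covers only a finite range of n.

```python
import math

SECONDS_PER_OPERATION = 30   # inferred from the equations in Quick Check 4.7


def n_log_n(n):
    return n * math.log2(n)


def n_squared(n):
    return n * n


def seconds(cost, n):
    """Running time in seconds for a cost function evaluated at n."""
    return SECONDS_PER_OPERATION * cost(n)


# Quick Check 4.7, part 1: times for n = 10 (minutes) and n = 100 (hours).
minutes_10 = (seconds(n_log_n, 10) / 60, seconds(n_squared, 10) / 60)
hours_100 = (seconds(n_log_n, 100) / 3600, seconds(n_squared, 100) / 3600)


def product(n):
    """The function from Quick Check 4.8(b)."""
    return (2 * n + math.log2(n)) * (3 * math.log2(n) + 8)


# Quick Check 4.8(b): check the two-sided bound for 2 <= n <= 10000.
bounds_hold = all(
    6 * n_log_n(n) <= product(n) <= 33 * n_log_n(n)
    for n in range(2, 10_001)
)
print(minutes_10, hours_100, bounds_hold)
```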
Quick Check 4.9
1. There are two assignments before the loop. The loop is executed $n - 1$ times. It doesn't matter which branch of the selection statement we choose; there will always be an assignment. Thus there are $n + 1$ assignments, which is in $\Theta(n)$.
2. The comparisons appear easy to count: one comparison for each time through the loop. There is one technical issue: The for loop actually does a hidden comparison on each iteration to make sure that $i \le n$. It makes one more comparison when it determines that $i > n$ and the loop should be terminated. Thus, there are really $n + (n + 1) = 2n + 1$ comparisons. Thus, the number of comparisons is in $\Theta(n)$.
The number of assignments is a bit more complicated (but not by much). We want to count each of the numbers $2^0, 2^1, 2^2, \ldots, 2^k$, where $2^k \le n$ and $2^{k+1} > n$. You should be comfortable with this by now: We want $k \le \log_2(n)$, where k must be an integer. Thus $k = \lfloor \log_2(n) \rfloor$. The number of assignments is in $\Theta(\log_2(n))$.
Quick Check 4.10
[Text-and-pattern alignment diagram omitted; only the hit and miss counts survive. The pattern is "mama".]

Alignment    1   2   3   4   5   6   7   8   9   Total
Hits         3   2   0   0   0   2   0   0   4   11
Misses       1   1   1   1   1   1   1   1   0   8
Quick Check 4.11
1. Set all the table values to 0, and then read the pattern from left to right. At each character, copy its position over the previous value in the table.
(a)
a   b   c   d   e   f   g   h
2   4   0   6   5   0   1   0
(b)
a   b   c   d   e   f   g   h
6   8   2   4   0   7   5   0
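The rule stated in Quick Check 4.11 translates directly into a few lines of code. The Python sketch below is an illustration under assumptions: the function name `last_table` is invented, positions are counted from 1 as in the tables above, and the pattern "gabbed" is only a guess that happens to reproduce table (a); the exercise's actual patterns are not shown in this section.

```python
def last_table(pattern, alphabet):
    """Map each character to the 1-based position of its last occurrence
    in pattern, or 0 if the character does not occur."""
    table = {ch: 0 for ch in alphabet}          # set all the table values to 0
    for position, ch in enumerate(pattern, start=1):
        table[ch] = position                    # later positions overwrite earlier ones
    return table


# Hypothetical pattern chosen to match table (a).
table_a = last_table("gabbed", "abcdefgh")
# The pattern "mama" used in the later Quick Checks.
table_mama = last_table("mama", "aelmsz ")
print(table_a)
print(table_mama)
```

Because the pattern is read left to right, a repeated character keeps only its rightmost position, which is why "mama" yields 4 for a and 3 for m.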
Quick Check 4.12
1. The last table is

a   e   l   m   s   z   ␣
4   0   0   3   0   0   0

[Text-and-pattern alignment diagram omitted; only the hit and miss counts survive.]

Alignment    1   2   3   4   5   6   7   8   Total
Hits         0   2   0   0   3   0   2   4   11
Misses       1   1   1   1   1   1   1   0   7
Quick Check 4.13
1.
m   a   m   a
2   2   4   1
Quick Check 4.14
1. We know from Quick Check 4.13 that the shift table is

m   a   m   a
2   2   4   1

The search produces
[Text-and-pattern alignment diagram omitted; only the hit and miss counts survive.]

Alignment    1   2   3   4   5   6   7   8   9   10   Total
Hits         0   2   0   0   1   0   0   0   2   4    9
Misses       1   1   1   1   1   1   1   1   1   0    9
Quick Check 4.15
1. Consider the problem with either answer:
* If the barber shaves himself, then he is in the group of people that the barber does not shave, so he does not shave himself!
* If the barber does not shave himself, then he is in the group of people who are shaved by the barber, so he does shave himself!
Both options lead to a contradiction. Something is terribly wrong! This paradox (and others) caused mathematicians to reconsider their theories about sets. Here is a related paradox: Let A be the set containing all sets that do not contain themselves. Is $A \in A$?
For instance, let T be the set of all sets that contain more than two elements. The sets D = the set of all dogs, C = the set of all cats, and B = the set of all birds are all members of T, so $T \in T$. One way to avoid the paradox is to exclude the notion of a set containing itself from set theory.

4.6 Chapter Review
4.6.1 Summary
With the advent of computers, algorithms have received significant attention. In that context, two issues are of interest in this chapter: (1) How can we express an algorithm clearly and unambiguously? and (2) How should we compare the relative efficiency of two algorithms that accomplish the same task?
This chapter starts with a presentation of the basic structuring mechanisms used by modern programming languages. The mechanisms are expressed in pseudocode but can easily be translated into your favorite computer language. Pseudocode has been designed to capture the mechanisms that specify the actions of the algorithm without being overly concerned with syntactic issues (such as the placement of semicolons or the use of parentheses).
A major theorem in computer science is that any algorithm can be expressed using only structured control constructs. These constructs can be partitioned into three single-entry, single-exit categories: sequence, selection, and repetition. Constructs in these categories can be nested within other constructs. By using only structured constructs to specify an algorithm, the algorithm becomes easier to comprehend, easier to prove correct, and easier to modify at a later date.
One simple approach to comparing the efficiency of two algorithms is to run both algorithms on many data sets and compare the times needed to finish. This requires that both algorithms be translated to a computer program and also requires the sample data sets to be created. An alternative approach uses a mathematical analysis of the algorithms to categorize each as belonging to one of several groups of algorithms whose efficiencies are similar. The main idea is that two algorithms are in the same group if, for very large data sets, both algorithms take about the same time to complete. This notion is made precise by designating some key operations within the two algorithms as the critical operations.
A function representing the number of critical operations used to process a data set of size n is produced for each algorithm. The functions are then compared to a collection of standard reference functions. If f is the function that represents the number of critical operations for an algorithm and g is another function (typically a simple reference function), we say that $f \in \Theta(g)$ if there exist positive constants, $n_0$, $c_l$, $c_u$, such that $c_l|g(n)| \le |f(n)| \le c_u|g(n)|$ for all $n \ge n_0$.
Section 4.2 examines this comparison approach in some detail. The basic definitions are introduced and several theorems are introduced that provide alternative ways to determine an appropriate reference function. The section also discusses the process of starting with an algorithm, expressed in pseudocode, and producing a function that counts the number of critical operations. The table below contains many of the common reference functions, moving from desirable to undesirable as the list moves from top to bottom.

Category       Reference Function
Constant       1
Logarithmic    log_2(n)
Linear         n
n log n        n log_2(n)
Quadratic      n^2
Cubic          n^3
Exponential    a^n for a > 1

Section 4.3 considers an important problem: determining whether a string appears as a substring of some larger text. Several algorithms are presented. An important conclusion is that a better algorithm can make a dramatic difference in efficiency.
The chapter ends with a significant result in theoretical computer science: There will never be an algorithm that solves the halting problem. This theorem shows that there are limits to what can be achieved with computers. The proof provides an interesting example of a proof by contradiction.
4.6.2 Notation

Notation       Page    Brief Description
=              156     the pseudocode assignment operator
==             156     the pseudocode equality operator (assertion)
*              156     the pseudocode multiplication operator
/              156     the pseudocode division operator
#              163     the pseudocode start-of-comment symbol
f ∈ O(g)       166     f is in big-O of g
f ∈ Ω(g)       168     f is in big-Ω of g
f ∈ Θ(g)       169     f is in big-Θ of g
⌊x⌋            177     the floor function
⌈x⌉            177     the ceiling function
␣              187     a visible symbol for the space character
[x]            189     represents a character which is not x
4.6.3 Definitions
Algorithm An algorithm is a finite sequence of unambiguous steps for solving a problem or completing a task in a finite amount of time.
Pseudocode Pseudocode is a semiformal language used to describe algorithms. It is more precise than a prose description but contains less syntactic structure than a compilable computer language. There are no required syntax rules for pseudocode, but there are many useful conventions in notation.
Structured Control Structured control is a way to organize the steps of an algorithm by using a collection of "safe" constructs. The essential idea of structured control is that subcollections of steps need to have a single entry and a single exit. This can be achieved by using nested constructs from three categories: sequence, selection, and repetition.
goto A goto statement is a control construct that is antithetical to the goals of the structured control movement. A goto construct permits unconstrained "jumps" to new locations in an algorithm.
Input Parameter An input parameter is a variable in pseudocode that represents part of the input data.
Return Values The data that are produced by an algorithm are called the return values.
Big-O The function f is said to be in big-O of g, pronounced "big oh" and denoted $f \in O(g)$, if there are positive constants, c and $n_0$, such that, for all $n \ge n_0$, $|f(n)| \le c|g(n)|$. Another way to express this is $f \in O(g)$ if and only if
$$\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n \ge n_0) \rightarrow (|f(n)| \le c|g(n)|)].$$
Big-Ω The function f is said to be in big-Ω of g, pronounced "big omega" and denoted $f \in \Omega(g)$, if there are positive constants, c and $n_0$, such that, for all $n \ge n_0$, $|f(n)| \ge c|g(n)|$. Another way to express this is $f \in \Omega(g)$ if and only if
$$\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{R}^+,\ \forall n \in \mathbb{R}^+,\ [(n \ge n_0) \rightarrow (|f(n)| \ge c|g(n)|)].$$
Big-Θ The function f is said to be in big-Θ of g, pronounced "big theta" and denoted $f \in \Theta(g)$, if $f \in O(g) \cap \Omega(g)$.
Logarithmic Functions The logarithmic function with base b, denoted $\log_b(x)$, is defined by the relationship $\log_b(x) = y$ for $x > 0$ if and only if $b^y = x$, where $0 < b$ and $b \ne 1$.
Nanosecond A nanosecond is $10^{-9}$ seconds.
Limit of a Sequence Consider the sequence of real numbers $b_0, b_1, b_2, \ldots$. The limit, b, as n goes to infinity is denoted by $\lim_{n \to \infty} b_n = b$. It is defined if and only if for every real number $\epsilon > 0$ there is an integer, N, such that $|b_n - b| < \epsilon$ for all n with $n > N$. The sequence converges to $\infty$, denoted by $\lim_{n \to \infty} b_n = \infty$, if for every positive integer, M, there is an integer, N, such that $|b_n| > M$ for all n with $n > N$.
The Pointwise Maximum Function Suppose $g_1(x)$ and $g_2(x)$ are defined on some domain. Then the function $\max\{g_1, g_2\}$ is the pointwise maximum of $g_1$ and $g_2$: $\max\{g_1, g_2\}(x) = \max\{g_1(x), g_2(x)\}$.
Floor Function; Ceiling Function The floor and ceiling functions are defined for all real numbers x by floor(x) = $\lfloor x \rfloor$ = the largest integer in the interval $(x - 1, x]$ and ceiling(x) = $\lceil x \rceil$ = the largest integer in the interval $[x, x + 1)$.
Best Case; Worst Case; Average Case It is useful to analyze the behavior of algorithms when given data that enables the algorithm to perform at its best and also data that causes the algorithm to perform as poorly as possible. The best case behavior provides a lower bound on how long it will take on any data set. The worst case behavior provides an upper bound on how long it will take on any data set. The harder problem is to determine the average case behavior of the algorithm when given all possible sets of data.
Sequential Search; Binary Search Sequential search and binary search are two algorithms for finding an item in a list (or determining that the item is not in the list).
Hit; Miss When comparing characters in two strings, a hit occurs if the two characters are identical; otherwise a miss occurs.
The Halting Problem The halting problem seeks a computer program, H, that accepts as input another computer program, P, together with some input data, d, and then outputs true if P eventually halts when run on d but outputs false if P runs forever when d is used as input.
4.6.4 Theorems
Theorem 4.1 Properties of Logarithmic Functions
* $\log_b(1) = 0$
* $\log_b(b) = 1$
* $\log_b(x^n) = n \log_b(x)$ for any real number n and $x > 0$
* $\log_b(xy) = \log_b(x) + \log_b(y)$ for $x > 0$ and $y > 0$
* $\log_b\left(\frac{x}{y}\right) = \log_b(x) - \log_b(y)$ for $x > 0$ and $y > 0$
Theorem 4.2 For all bases a and b and all $x > 0$, $\log_b(x) = \frac{\log_a(x)}{\log_a(b)}$.
Theorem 4.3 The Triangle Inequality If a and b are real numbers, then $|a + b| \le |a| + |b|$.
Theorem 4.4 Big-Θ and Polynomials Let f be the polynomial function $f(n) = a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0$, where $a_k \ne 0$. Then $f \in \Theta(n^k)$.
Theorem 4.5 Big-Θ and Products Suppose that $f_1 \in \Theta(g_1)$ and $f_2 \in \Theta(g_2)$. Then $f_1 \cdot f_2 \in \Theta(g_1 \cdot g_2)$.
Theorem 4.6 Big-Θ and Sums Suppose that $f_1 \in \Theta(g_1)$ and $f_2 \in \Theta(g_2)$. Assume also that there is a positive integer, $n_0$, such that for all $n > n_0$, $f_1(n) > 0$, $f_2(n) > 0$, $g_1(n) > 0$ and $g_2(n) > 0$. Then $(f_1 + f_2) \in \Theta(\max\{g_1, g_2\})$.
Theorem 4.7 $f \in \Theta(g)$ Let f and g be real-valued functions for which $g(n) \ne 0$ for $n > n_0$, for some integer, $n_0 > 0$. If there is a real number, $r \ne 0$, such that $\lim_{n \to \infty} \frac{f(n)}{g(n)} = r$, then $f \in \Theta(g)$.
Proposition 4.1 $f \in O(g)$ Let f and g be real-valued functions for which $g(n) \ne 0$ for $n > n_0$, for some integer, $n_0 > 0$. If $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$, then $f \in O(g)$.
Proposition 4.2 $f \in \Omega(g)$ Let f and g be real-valued functions for which $g(n) \ne 0$ for $n > n_0$, for some integer, $n_0 > 0$. If $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$, then $f \in \Omega(g)$.
Theorem 4.8 Logarithms Are Order Preserving If $0 < x < y$, then $\log_b(x) < \log_b(y)$.
2. Theorem 4.5 implies that
1. There are many ways to create the desired algorithm. Here is a fairly efficient one. It can be made more specific by replacing the checks for even and odd with "(m mod 2) == 0" and "(m mod 2) == 1," respectively. By combining the even and odd cases, only one while loop is needed. integer powersOfTwo (integer n) # combine the even and odd cases by using m m
= n
pow
=
n(4n 2 + 89n)(91og 2 (n) + 100) E O(n3 1og 2 (n)). 4. Assume that x > 0. Then 12x 2 + 7x1 = 2x 2 + 7x
0
n < 0 return else if n return else if n m = n
< 2x 2 + 7x2 9x2
if
-
1
==
0 is -
= 91x21. odd Let no = l and c = 9 in Definition 4.3. 2 5. (a) The 0(n ) algorithm will take 16,3842 . .001 = 268,435.456 seconds; the 0(n log 2 (n)) algorithm will take 16,3841og 2 (214) . .001 = 229.376 seconds. (b) Algorithm 1: 268,435.456 seconds is the same as 3 days, 2 hours, 33 minutes and 55.456 seconds. Algorithm 2: 229.376 seconds is the same as 3 minutes, 49.376 seconds.
while in is even pow = pow + 1 =
in
# another factor of 2 # m is even, so the division
mn/ 2
is
exact
6. (a) The for loops can be converted to summations. Each time the comparison a [i ] < a [j ] is executed, it adds one to the total count of critical operations. The nested loops become nested sums.
return pow end powerOfTwo Least 2. Most Efficient log 2 (n) n n log 2 (n) n 2 2 Efficient 3. The expression n(4n 2 + 89n)(91og 2 (n) + 100) is a product of three functions, so Theorem 4.5 applies. Two of the factors involve sums. Since n > 0, both are positive, so Theorem 4.6 can be used. Notice that n2 > n for n > 1, so 7. (a)
(b)
Last Table n
o
p u
p
o
p
0
0
2
3
2
2
1
0
o
p
p
o
p p
E
1=
i=1 j=l
(b)
)
=,i
i=1
E 0(n 2)
Shift Table
h
h
for x > I
u
o
o
p
p
o
n
p
o
p
L
S
W
max(l,1-0)= 1
2
S
1I
T
1
L
max(l, 3 - 2) = max(l,3-0)=3
p p
o
p
p
o
max(l, 3 - 2) =1 I p
T
2
CHAPTER 5
Counting

A familiar nursery rhyme^1 reads

As I was going to St. Ives
I met a man with seven wives;
Every wife had seven sacks;
Every sack had seven cats;
Every cat had seven kits.
Kits, cats, sacks, and wives,
How many were going to St. Ives?

Of course, you learned at a young age that you could weasel out of the hard question by noticing that "I" was the only one going to St. Ives. Your carefree younger days are in the past. It is time to deal with the implied problem.

Quick Check 5.1
How many kittens, cats, sacks, and wives were met on the road to St. Ives? Give a sub-total for each of the four categories as well as a grand total.

Counting problems are ubiquitous. Some are easy; some are extremely difficult. Most of the material in this chapter explores the techniques for counting and applications of those techniques. One more introductory example illustrates the diversity of counting problems.
The Maximum Number of Rounds in the Deferred Acceptance Algorithm
Recall the stable marriage problem, introduced in Section 1.2. The deferred acceptance algorithm was developed by Gale and Shapley to solve this problem. Gale and Shapley claim that the deferred acceptance algorithm will end after at most $n^2 - 2n + 2$ stages. Why is this true?
The algorithm ends on the round during which the last unclaimed suitee gets a proposal. This is because once a suitee receives a first proposal, she or he will never have an empty string of suitors. Therefore, once the final suitee receives a proposal, all other suitees must have exactly one suitor (there are equal numbers of males and females).
How long can this be avoided? There is an initial round when each suitor proposes to his or her first choice. There are then n suitors, each of whom can propose to at most $n - 2$ additional suitees and still leave the final suitee unclaimed. Finally, the last unclaimed suitee must be proposed to. This gives a total of at most
$$1 + n(n - 2) + 1 = n^2 - 2n + 2$$
rounds. Problem 1 in Exercises 5.1.6 asks you to create a set of preferences that results in this upper bound actually occurring.

1. Variations on this problem have been around for a long, long time. An ancient (circa 1650 B.C.) Egyptian collection of math problems, called the Rhind Papyrus, contains one version. See [28, p. 55] for more details.
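The bound can be explored experimentally. The Python sketch below is a simplified, round-based version of deferred acceptance written for this illustration; it is not the presentation from Section 1.2, and the names `deferred_acceptance_rounds`, `random_preferences`, and `bound_holds` are invented here. Suitors and suitees are numbered 0 to n-1, and in each round every unattached suitor proposes to the next suitee on his or her list. Random testing cannot prove the bound; it only fails to find a counterexample.

```python
import random


def deferred_acceptance_rounds(suitor_prefs, suitee_prefs):
    """Return (held, rounds), where held[t] is the suitor kept by suitee t.

    suitor_prefs[s] lists suitees, best first; suitee_prefs[t] lists suitors.
    """
    n = len(suitor_prefs)
    # rank[t][s] = position of suitor s on suitee t's list (lower is better)
    rank = [{s: i for i, s in enumerate(prefs)} for prefs in suitee_prefs]
    next_choice = [0] * n      # next list index each suitor will propose to
    held = [None] * n          # suitor currently on each suitee's string
    free = set(range(n))       # suitors not currently held by anyone
    rounds = 0
    while free:
        rounds += 1
        proposals = {}
        for s in free:
            t = suitor_prefs[s][next_choice[s]]
            next_choice[s] += 1
            proposals.setdefault(t, []).append(s)
        for t, candidates in proposals.items():
            if held[t] is not None:
                candidates.append(held[t])
            held[t] = min(candidates, key=lambda s: rank[t][s])
        free = set(range(n)) - {s for s in held if s is not None}
    return held, rounds


def random_preferences(n, rng):
    return [rng.sample(range(n), n) for _ in range(n)]


def bound_holds(max_n=6, trials=200, seed=0):
    """Check rounds <= n^2 - 2n + 2 on randomly generated preference lists."""
    rng = random.Random(seed)
    for n in range(1, max_n + 1):
        for _ in range(trials):
            held, rounds = deferred_acceptance_rounds(
                random_preferences(n, rng), random_preferences(n, rng))
            if sorted(held) != list(range(n)) or rounds > n * n - 2 * n + 2:
                return False
    return True


print(bound_holds())
```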
5.1 Permutations and Combinations
Much of the material in this section relates to counting in how many ways a subset of elements from a discrete set can be arranged. There are two important pairs of concepts you should look for: order and repetition.^2 The counting formulas that will eventually be examined fit neatly into the grid shown in Table 5.1.

TABLE 5.1 Order and repetition
                      With Order                      Without Order
Without Repetition    Permutations                    Combinations
With Repetition       Permutations with repetition    Combinations with repetition
DEFINITION 5.1 Permutation; Combination
Permutations and combinations arise when a subset is to be chosen from a set. A permutation is a collection of elements for which an ordering of the chosen elements has been imposed. A combination is another name for a subset; order is unimportant.
Permutations and combinations will be examined in detail after some prerequisite ideas have been presented.

5.1.1 Two General Counting Principles
There are two important principles for counting when special conditions prevail. The first principle applies when there is a sequence of independent tasks or choices. For example, if I wish to choose one of five books to read and one of three snacks to eat while I read, the two choices are independent. The snack I choose is not influenced by my choice of book. The second principle applies when there is a sequence of mutually exclusive tasks or choices. For example, if I wish to go to one of three restaurants or one of four delicatessens, the choices are mutually exclusive. I either go to a restaurant or a delicatessen. These concepts are formalized in the next definition.

DEFINITION 5.2 Independent Tasks; Mutually Exclusive Tasks
The tasks in a collection or sequence of tasks are said to be independent if the outcome of any task is not influenced by the outcomes of the other tasks in the collection or sequence. The tasks in a collection are said to be mutually exclusive if completing any one of the tasks excludes the completion of the other tasks.
These special conditions will now be examined in more detail.

2. An alternative term for "repetition" is replacement.
General Counting Principle 1: Independent Tasks or Choices
If a project can be decomposed into two independent tasks with $n_1$ ways to accomplish the first task and $n_2$ ways to accomplish the second task, then the project can be completed in $n_1 \cdot n_2$ ways.
This principle can be visualized by thinking of a large matrix or table with $n_1$ rows and $n_2$ columns. The rows are labeled by the possible ways to accomplish the first task. The columns are labeled by the possible ways to accomplish the second task. At the intersection of row i and column j, we place the project option: accomplish task one by method i and task two by method j. Clearly, there are $n_1 \cdot n_2$ entries in the table, they are all distinct, and no other options are available. Another visualization is to think of a classroom having $n_1$ rows and $n_2$ columns of desks. The total number of desks is $n_1 \cdot n_2$, which is also the number of ways a student may choose a desk at which to sit.
Books and Snacks If I wish to choose one of five books to read and one of three snacks to eat while I read, the two choices are independent. The total number of book/snack combinations is thus 5 • 3 = 15. This can also be shown by exhaustively listing all the possible combinations. U The principle can be easily extended to any finite number of tasks. For instance, if I am faced with a series of three choices (for example, choose a book, choose a snack, and choose a room to read and eat in), the total number of distinct possibilities is the product of the number of ways to make each choice.
Books, Snacks, and Rooms Suppose I wish to choose one of five books to read, one of three snacks to eat while I read, and one of four rooms to read and eat in. The total number of book/snack/room combinations is 5 · 3 · 4 = 60. This can be visualized as a three-dimensional grid with four horizontal levels (corresponding to the rooms). At each level is a table with five rows (books) and three columns (snacks). The arrangement is similar to a three-dimensional tic-tac-toe board. If I place a marker on the second row, first column of the table at level 3, I have effectively chosen book 2, snack 1, and room 3.

A more interesting use of the extended version of general counting principle 1 is the proof of the following proposition (from [51]), which uses ideas derived from the fundamental theorem of arithmetic (see page 115).

Suppose the positive integer a has prime factorization $a = p_1^{e_1} \cdot p_2^{e_2} \cdots p_k^{e_k}$. It is not difficult to show that any divisor b of a must have a prime factorization of the form $b = p_1^{d_1} \cdot p_2^{d_2} \cdots p_k^{d_k}$, where $d_i \le e_i$, for i = 1, 2, ..., k.

PROPOSITION 5.1 Counting Divisors
Let a be a positive integer with prime factorization $a = p_1^{e_1} \cdot p_2^{e_2} \cdots p_k^{e_k}$. Then the number, $v(a)$, of positive divisors of a (including 1 and a) is
$$v(a) = (e_1 + 1) \cdot (e_2 + 1) \cdots (e_k + 1).$$
Proof: Any divisor b of a can be written $b = p_1^{d_1} \cdot p_2^{d_2} \cdots p_k^{d_k}$, where $d_i \le e_i$, for i = 1, 2, ..., k. The choice of value for $d_i$ is independent of the choice for the value of $d_j$ if $i \ne j$. There are $e_i + 1$ choices for $d_i$, since $d_i \in \{0, 1, 2, \ldots, e_i\}$. Thus, there are $(e_1 + 1) \cdot (e_2 + 1) \cdots (e_k + 1)$ distinct divisors of a. □
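The divisor-counting proposition can be checked numerically. The sketch below is an illustration, not part of the text: the function names are hypothetical, the factorization uses simple trial division, and the brute-force count is included only as an independent check of the product of the (exponent + 1) factors.

```python
def prime_factorization(a):
    """Return {prime: exponent} for a positive integer a, by trial division."""
    factors = {}
    p = 2
    while p * p <= a:
        while a % p == 0:
            factors[p] = factors.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:
        # Whatever remains is a single prime factor.
        factors[a] = factors.get(a, 0) + 1
    return factors


def count_divisors_formula(a):
    """Multiply (exponent + 1) over all primes in the factorization of a."""
    count = 1
    for exponent in prime_factorization(a).values():
        count *= exponent + 1
    return count


def count_divisors_brute(a):
    """Count the divisors of a directly by testing every candidate."""
    return sum(1 for b in range(1, a + 1) if a % b == 0)


# 360 = 2^3 * 3^2 * 5, so the formula gives (3 + 1)(2 + 1)(1 + 1) = 24.
print(count_divisors_formula(360), count_divisors_brute(360))  # 24 24
```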
5.1 Permutations and Combinations
The Principle "Fails" Suppose I wish to choose one of two restaurants. At the restaurant, I will choose one item from the menu. If restaurant A has 15 items on the menu and restaurant B has 25 items, I cannot use general counting principle 1 to count the number of possible meals. If I use $n_1 = 2$, what value should I use for $n_2$? Clearly, the two choices (restaurant, menu item) are not independent. The choice of restaurant determines how many menu items I will have to choose from.
General Counting Principle 2
Mutually Exclusive Tasks or Choices
If a project can be decomposed into two mutually exclusive tasks with $n_1$ ways to accomplish the first task and $n_2$ ways to accomplish the second task, then the project can be completed in $n_1 + n_2$ ways.
This can be visualized by observing that I can do either task 1 (in one of $n_1$ ways) or I can do task 2 (in one of $n_2$ ways), but I cannot do both. I can therefore write all the possibilities for task 1 on one piece of paper and all the possibilities for task 2 on another piece of paper. If I spread both pieces of paper before me, I can choose exactly one of the possibilities I see. I thus add the number of options.
The Second Principle to the Rescue Suppose I wish to choose one of two restaurants. At the restaurant I will choose one item from the menu. I can use general counting principle 2 if I think of the two mutually exclusive tasks as "choose an item from restaurant A's menu" and "choose an item from restaurant B's menu." If restaurant A has 15 items on the menu and restaurant B has 25 items, then there are 15 + 25 = 40 possible restaurant/meal options.
Puzzles, Books, and Homework I have three puzzles, two science fiction novels, and four textbooks in my room. I need to decide whether to assemble a puzzle, read a (recreational) book, or do the homework for one of my four classes. In all, I have 3 + 2 + 4 = 9 ways to spend the evening.

These two general counting principles do not cover all possible situations, as the next example illustrates.
Both Principles "Fail" I have three microwave dinners in my refrigerator: a fish dinner, a frozen pizza, and some fried rice. I also have a carton of milk, a container of juice, and a can of root beer. The root beer is only acceptable with the pizza, and milk doesn't go well with pizza. I never drink juice with fried rice. How many acceptable food/drink pairs are there? Principle 1 would predict 3 · 3 = 9, whereas principle 2 would predict 3 + 3 = 6. The five viable alternatives are (fish, milk), (fish, juice), (pizza, root beer), (pizza, juice), and (fried rice, milk). Neither general principle was correct. This is because the choice of food and the choice of drink are not independent (picking pizza excludes milk). Neither are they mutually exclusive (picking fish does not exclude picking juice).
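When neither principle applies, exhaustive enumeration still works. The following sketch encodes the three restrictions from the food/drink example as a filter over all nine pairs; the function name `acceptable` is a hypothetical helper introduced only for this illustration.

```python
from itertools import product

foods = ["fish", "pizza", "fried rice"]
drinks = ["milk", "juice", "root beer"]


def acceptable(food, drink):
    """Encode the three restrictions stated in the example."""
    if drink == "root beer" and food != "pizza":
        return False  # root beer is only acceptable with pizza
    if drink == "milk" and food == "pizza":
        return False  # milk doesn't go well with pizza
    if drink == "juice" and food == "fried rice":
        return False  # no juice with fried rice
    return True


# Start from all 3 * 3 = 9 pairs and keep only the acceptable ones.
viable = [(f, d) for f, d in product(foods, drinks) if acceptable(f, d)]
print(len(viable))  # 5
print(viable)
```

The count falls between the sum 3 + 3 and the product 3 · 3 because the restrictions remove some, but not all, of the pairs in the full table.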
Chapter 5 Counting
1. Exhaustively list all pairs of choices from a list of four types of bread, $\{B_1, B_2, B_3, B_4\}$, and a list of four types of meat, $\{M_1, M_2, M_3, M_4\}$.
2. I am in the market for a new refrigerator. I will choose between a side-by-side design or a top-freezer design. There are four acceptable side-by-side models and five acceptable top-freezer models. In how many ways can I select a new refrigerator?
3. A new minivan is available in nine different exterior colors and three different interior colors. How many distinct color combinations are available?
4. The minivan in the previous problem also is available with either cloth or leather upholstery. Cloth upholstery is available in all three interior colors, but leather upholstery is only available in two colors. How many exterior/interior choices are there?
5.1.2 Permutations
Suppose we start with a set containing n elements. We wish to arrange r of them (with r ≤ n) in order. That is, order is important, but there is only one copy of each element (no repetition). In how many distinct ways can this be done? The answer will be denoted³ P(n, r).

P(3, 2) Let the three elements be the letters "a," "b," and "c." We wish to arrange two of these letters in order. The possibilities are ab, ac, ba, bc, ca, and cb. Thus P(3, 2) = 6. Notice that "ab" is considered distinct from "ba" (order is important) and "aa," "bb," and "cc" were not viable choices (no repetition).
A Simple Reading Schedule Suppose you need to decide in which order to read three unrelated books. If there is no obvious reason to read any one book before another, we can read the books (conveniently labeled "a," "b," and "c") in the following orders: abc, acb, bac, bca, cab, and cba. Thus P(3, 3) = 6.
French Horn Duets A band director has four students who play the French horn. She wants to have a French horn duet at the spring band concert. If all four of the students are at about the same level of competence, she might as well assign the two parts (primo and secondo) to two of them randomly. In how many ways can she do this? Label the students "a," "b," "c," and "d." Let the pair "cb" mean c plays primo and b plays secondo. The parts can be assigned in the following twelve ways: ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, and dc. Therefore, P(4, 2) = 12.

The previous examples were easy to solve by exhaustively listing all possibilities. This is not always realistically possible, and the examples don't lead to an obvious pattern, so a more theoretical approach is needed. Fortunately, general counting principle 1 is all that is needed. Assume we wish to count the number of ways to place in order r items from a set of n distinct items. There are n choices for the first item. After that item has been selected, there are (n - 1) items remaining from which the second item can be chosen. Clearly, the choice of first item has no influence on which of the remaining (n - 1) items is chosen (independence). Once the second item has been selected, there are still (n - 2) items remaining from which to choose (independently) the third item. The pattern continues until r items have been chosen. General counting principle 1 (in its expanded version) implies that the total number of ways to make these choices is the product of the number of ways to make the individual choices.

³ Other common notations for P(n, r) are $_nP_r$ and $P^n_r$.
Counting Formula 1
Permutations
The number of ways to arrange r objects from a set of n objects, in order, but without repetition is
$$P(n, r) = n \cdot (n - 1) \cdot (n - 2) \cdots (n - r + 1) = \frac{n!}{(n - r)!}$$
and P(n, 0) = 1.
Note that P(n, n) = n! (since 0! = 1). That is, the number of ways to arrange n objects in order is n!.
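Counting formula 1 can be compared against an exhaustive listing. The sketch below assumes Python 3.8 or later (for `math.perm`); the helper `P` is a hypothetical name that applies the factorial form of the formula directly.

```python
from itertools import permutations
from math import factorial, perm


def P(n, r):
    """Counting formula 1: n! / (n - r)!, for 0 <= r <= n."""
    return factorial(n) // factorial(n - r)


# Ordered arrangements of two letters chosen from {a, b, c}, no repetition.
arrangements = ["".join(p) for p in permutations("abc", 2)]
print(arrangements)          # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']

print(P(3, 2), perm(3, 2))   # 6 6
print(P(4, 2), P(3, 3))      # 12 6
print(P(5, 5), P(7, 0))      # 120 1
```

The listing and the formula agree because each arrangement corresponds to one sequence of choices: n options for the first position, n - 1 for the second, and so on.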
Organizing a Ballot You are in charge of placing the names of five candidates on a ballot. You have been instructed to place the names in random order. In how many ways can this be done? Using the formula for P(n, n), there are 5! = 120 ways.
Class Presentations An instructor has divided the class into seven groups. She wishes to have three of the groups make their presentations today. In how many ways can she arrange the three presentations? The order in which the presentations are made is important (ask the group members!) so P(7, 3) is appropriate.
$$P(7, 3) = \frac{7!}{(7 - 3)!} = \frac{7!}{4!} = 7 \cdot 6 \cdot 5 = 210$$
Notice the cancellation that occurred in the previous example:
$$\frac{7!}{4!} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{4 \cdot 3 \cdot 2 \cdot 1} = 7 \cdot 6 \cdot 5.$$
Distinct Birthdays A group of n people can have distinct birthdays in
$$P(365, n) = 365 \cdot 364 \cdots (365 - n + 1)$$
ways (ignoring leap years).
5.1.3 Permutations with Repetition
If the objects that we are arranging come in unlimited (or sufficiently large) quantities, then it is possible to increase the number of arrangements. For example, if the letters "a," "b," and "c" are to be arranged in two-letter sequences, the possibilities are either {ab, ac, ba, bc, ca, cb} or {aa, ab, ac, ba, bb, bc, ca, cb, cc}, depending on whether repetitions are excluded or permitted.
Counting Formula 2
Permutation with Repetition
The number of ways to arrange r objects in order, chosen from a set of n distinct objects, if objects may be repeated is $n^r$.

General counting principle 1 can be used to prove the previous formula. There are r positions to be filled, each having n possible values. The choices are independent, so there are
$$\underbrace{n \cdot n \cdot n \cdots n}_{r \text{ times}} = n^r$$
ways to arrange the objects.
Three-Letter Codes There are $26^3 = 17{,}576$ ways to designate an inventory item by a code consisting of three lower case letters. There are $10^3 = 1000$ ways to designate an item using three digits.

Birthdays A group of n people can have birthdays in $365^n$ ways (ignoring leap years).
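Arrangements with repetition can also be listed mechanically. In this sketch, offered only as an illustration, `itertools.product` with `repeat=r` fills each of the r positions independently from the same n values, which mirrors the proof of counting formula 2.

```python
from itertools import product

letters = "abc"

# Two-letter sequences with repetition permitted: r = 2 positions, n = 3 values each.
with_repetition = ["".join(s) for s in product(letters, repeat=2)]
print(with_repetition)
# ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']

print(len(with_repetition) == len(letters) ** 2)  # True

# The code-counting examples use the same formula with larger n.
print(26 ** 3, 10 ** 3)  # 17576 1000
```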
Quick Check 5.3
1. Calculate P(9, 4).
2. The owner of a small business has decided to allow the middle managers (Jane, Tom, and Pin) to each have one week of vacation during the month of July. Only one manager is allowed on vacation at a time. Assuming that July has four weeks, in how many distinct ways can the managers sign up for vacations?
3. In how many ways can the genders of the children be arranged in a family with eight children? (Assume that order is important.)
5.1.4 Combinations
Permutations are arrangements where order is important. Often the order of the objects is not important. What is of interest is the subset of objects that are chosen. For example, if a mother of five children chooses two of them to pull weeds, the children are concerned with which two are chosen, not what order they are chosen in. Arrangements where order is not important are called combinations, and the number of ways to choose r objects from n is denoted⁴ C(n, r). No matter which notation is used, it is usually pronounced "n choose r." (Some students find it helpful to equate the words combination and committee.)
Pulling Weeds Suppose the five children who are potential weed pullers are named Annabelle, Bartholomew, Candice, Dorothea, and Ernie, better known to their close associates as A, B, C, D, and E. The two lucky workers can be chosen in 10 ways: AB, AC, AD, AE, BC, BD, BE, CD, CE, or DE. Observe that AB and BA are the same pair of "happy" children.

⁴ Other common notations for C(n, r) include the binomial coefficient notation $\binom{n}{r}$ and the notations $C^n_r$ and $_nC_r$.
Scrubbing Floors The three children who were not chosen in the previous example have no opportunity to feel smug, because mom has decided that she needs three willing floor scrubbers. It should be clear after a moment of thought that there must be 10 ways to choose three floor scrubbers from a set of five children. (Once the weed pullers are chosen, there are exactly three children left. The weed pullers can be chosen in 10 ways [see the previous example].) It is possible to list all 10 floor scrubbing combinations: ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, and CDE. Observe that ABC, ACB, BAC, BCA, CAB, and CBA are all the same set of floor scrubbers.
In the previous two examples, there were 2! = 2 ways to list each set of two weed pullers, and 3! = 6 ways to list each set of three floor scrubbers. This is true in general: There are r! ways to list a set of r objects if order is important. This insight leads to the next counting formula. To count the number of ways to pick a subset of r elements (without repetition) from a set of n objects, count the number of permutations, then divide by the number of times each distinct subset of size r has been counted. For example, there are P(5, 2) = 20 ways to list two children in order (permutations): AB, BA, AC, CA, AD, DA, AE, EA, BC, CB, BD, DB, BE, EB, CD, DC, CE, EC, DE, and ED. Since each subset of two children appears twice, the number of combinations is 20/2 = 10.

Counting Formula 3
Combinations
The number of ways to choose a subset of r objects from a set of n objects without repetition, where 0 ≤ r ≤ n, is
$$C(n, r) = \frac{n!}{r! \cdot (n - r)!}.$$
If 0 ≤ n < r, then C(n, r) = 0.
Notice that $C(n, r) = \frac{P(n, r)}{r!}$. (See Exercise 42 in Exercises 5.1.6.)
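The "count the permutations, then divide by r!" argument can be traced in code. This is a sketch under the assumption of Python 3.8 or later (for `math.comb`); the children are represented by the letters A through E, as in the weed-pulling example.

```python
from itertools import combinations, permutations
from math import comb, factorial

children = "ABCDE"

# Unordered pairs: each subset of two children is listed once.
pairs = ["".join(c) for c in combinations(children, 2)]
print(pairs)
# ['AB', 'AC', 'AD', 'AE', 'BC', 'BD', 'BE', 'CD', 'CE', 'DE']

# Ordered pairs: each subset of size 2 appears 2! = 2 times.
ordered = list(permutations(children, 2))
print(len(ordered), len(ordered) // factorial(2))  # 20 10

# Choosing 2 weed pullers leaves 3 floor scrubbers, so the counts agree.
print(comb(5, 2), comb(5, 3))  # 10 10
```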
Choosing a Nominating Committee Your club has 35 members. Elections are approaching, and it is necessary to form a nominating committee consisting of 4 members. There are
$$C(35, 4) = \frac{35!}{4! \cdot 31!} = 52{,}360$$
ways to form this committee.
A standard deck of 52 playing cards consists of four suits (clubs ♣, diamonds ♦, hearts ♥, and spades ♠). Diamonds and hearts are red cards, clubs and spades are black cards. Each suit has 13 face values (ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, jack, queen, king). The jack, queen, and king in each suit have "pictures" with faces on them. These 12 cards (3 per suit) are commonly called face cards. Note carefully the distinction between "face value" and "face card."

Crazy Eights Crazy eights is a popular children's card game. Each player is dealt 7 cards from a standard 52-card deck. How many different "hands" of 7 cards are there? Since the order in which the cards are dealt does not matter and no card appears more than once, this is a combination problem. The solution is thus
$$C(52, 7) = \frac{52!}{7! \cdot 45!} = 133{,}784{,}560.$$
Most calculators cannot compute n! if n > 69 because $69! \approx 1.71122 \times 10^{98}$ but $70! \approx 1.19786 \times 10^{100}$. Most calculators only have room for a two-digit exponent on 10. Solving the previous example by first calculating 52! and then dividing the result by 45! and then 7! may not give a completely accurate result. One of my calculators gives $1.3378 \times 10^{8}$ as the solution. The calculator has displayed the answer with the last four digits rounded (in this case because the calculator only displays eight digits). There is a method that will work even if your calculator has no factorial key: First cancel the 45! (symbolically), then cancel common factors, and then use the calculator. The first step produces
$$C(52, 7) = \frac{52!}{7! \cdot 45!} = \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48 \cdot 47 \cdot 46}{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}.$$
It is then possible to use the 7 in the denominator to reduce the 49 in the numerator to 7; the 6, 4, and 2 in the denominator to cancel the 48 in the numerator; and the 5 in the denominator to reduce the 50 in the numerator to a 10. The 3 in the denominator will reduce the 51 in the numerator to a 17. This leaves the product 52 · 17 · 10 · 7 · 1 · 47 · 46.
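The same idea, avoiding large factorials, carries over to a program. The sketch below is not the by-hand cancellation itself but a related technique: it multiplies in one numerator factor and divides out one denominator factor per step, so intermediate values never exceed the final answer by much. The function name `choose` is hypothetical, and `math.comb` (Python 3.8 or later) is used only as a cross-check.

```python
from math import comb


def choose(n, r):
    """Compute C(n, r) without ever forming n!."""
    if r < 0 or r > n:
        return 0  # matches the convention C(n, r) = 0 when n < r
    r = min(r, n - r)  # symmetry: C(n, r) = C(n, n - r)
    result = 1
    for i in range(1, r + 1):
        # After this step, result equals C(n - r + i, i), so the
        # integer division below is always exact.
        result = result * (n - r + i) // i
    return result


print(choose(52, 7))                  # 133784560
print(52 * 17 * 10 * 7 * 1 * 47 * 46) # 133784560
print(choose(52, 7) == comb(52, 7))   # True
```

Python integers have arbitrary precision, so `comb(52, 7)` is exact here as well; the step-by-step version matters more in settings with fixed-width or floating-point arithmetic, such as the eight-digit calculator described above.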
More Crazy Eights How many different seven-card crazy eights hands are there that contain exactly one 8? There are four ways to choose an 8, with 52 - 4 = 48 cards remaining from which to choose the other six cards for the hand. Consequently, there are
$$4 \cdot C(48, 6) = 4 \cdot \frac{48!}{6! \cdot 42!} = 4 \cdot \frac{48 \cdot 47 \cdot 46 \cdot 45 \cdot 44 \cdot 43}{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 49{,}086{,}048$$
distinct crazy eights hands containing exactly one 8.

How many different seven-card crazy eights hands are there that contain exactly one 8 and exactly two face cards? There are four choices for the eight. There are 12 cards from which to choose the two face cards. This can be done in C(12, 2) ways. Finally, there are 52 - 12 - 4 = 36 cards left that are neither face cards nor 8s from which to choose the remaining four cards for the hand. There are consequently (using general counting principle 1)
$$4 \cdot C(12, 2) \cdot C(36, 4) = 4 \cdot \frac{12!}{2! \cdot 10!} \cdot \frac{36!}{4! \cdot 32!} = 4 \cdot (6 \cdot 11) \cdot (3 \cdot 35 \cdot 17 \cdot 33) = 15{,}550{,}920$$
crazy eights hands containing exactly one 8 and exactly two face cards.
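The two counts above can be reproduced by building the deck and classifying its cards. This is a sketch for checking the arithmetic: the card representation (value, suit) is an assumption made for the illustration, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

values = ["ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
          "jack", "queen", "king"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(v, s) for v in values for s in suits]

# Split the deck into the three groups used in the example.
eights = [c for c in deck if c[0] == "8"]
faces = [c for c in deck if c[0] in ("jack", "queen", "king")]
others = [c for c in deck if c[0] != "8" and c not in faces]
print(len(deck), len(eights), len(faces), len(others))  # 52 4 12 36

# Exactly one 8: choose the 8, then six cards from the 48 non-8s.
exactly_one_8 = comb(len(eights), 1) * comb(len(deck) - len(eights), 6)
print(exactly_one_8)  # 49086048

# Exactly one 8 and exactly two face cards: three independent choices.
one_8_two_faces = comb(len(eights), 1) * comb(len(faces), 2) * comb(len(others), 4)
print(one_8_two_faces)  # 15550920
```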
Crop Rotation A farmer has divided his land into nine plots. He typically leaves two of the plots fallow. In how many ways can he choose the two fallow fields? Since the order in which the two fields are chosen is unimportant, and since a plot of land cannot count twice, the proper number is
$$C(9, 2) = \frac{9!}{2! \cdot 7!} = \frac{9 \cdot 8}{2} = 36.$$
Choosing a Pair Given n people, a single pair can be chosen in
$$C(n, 2) = \frac{n \cdot (n - 1)}{2} \in O(n^2)$$
ways. In Figure 5.1, the gray points represent the value of n, and the black points represent the value of C(n, 2). Notice that as the number of people (n) increases, the number of pairs of people increases much faster.
Figure 5.1 Rate of growth for C(n, 2). [Plot of n and C(n, 2) against n.]
Quick Check 5.4
1. Compute C(8, 6).
2. A catering service offers 10 dinner choices. The campus Gourmet Club plans to have four catered dinner meetings during the school year. Members do not wish to repeat a menu selection during the year. (a) How many distinct collections of four dinners can members pick? (The order in which the meals are served is unimportant to the club members.) (b) Suppose members also care about the order in which the meals are scheduled. How many distinct schedules are there?
3. In the discussion after Example 5.19 I mentioned that my eight-digit calculator rounded the nine-digit number 133,784,560 to $1.3378 \times 10^{8}$. I can recapture the full solution by entering the key sequence [] E 9 [W ] and manually appending a "0" to the displayed result. Why does this work? Will it always work?
5.1.5 Combinations with Repetition
The final counting formula relates to combinations with repetition. A situation that would require this principle is the following.

Planning to Study Meg has three classes (algebra, biology, chemistry) that need some attention in the next two days. Tonight she has scheduled a study session during which she can work on two classes, or spend the entire session on one class. Each class has sufficient work to fill the entire session. In how many ways can she schedule the evening if all she cares about is which subjects are to be worked on, not the order in which they are tackled? Observe that repetition is allowed; it is possible to work on the same class for the entire session. Also notice that the order in which the work is scheduled is not important. This problem can be solved by exhaustively listing the possible arrangements: aa, ab, ac, bb, bc, cc. There are six possible ways.
Counting Formula 4
Combinations with Repetition
The number of subsets of r elements from a set containing n distinct elements, with repetition permitted, is
$$C(n + r - 1, r) = \frac{(n + r - 1)!}{r! \cdot (n - 1)!}.$$
You need not memorize the expression $\frac{(n + r - 1)!}{r! \cdot (n - 1)!}$. It is sufficient to memorize C(n + r - 1, r) and then use the expansion for C(n, r). An alternative is to memorize the process presented in Example 5.24.

Another way to express the notion of repetition is to consider n to represent the number of categories of distinct objects. Each category will contain many identical copies of its object. Whether you consider a single object that can be chosen and then returned so that it can be chosen again ("with replacement") or you consider multiple identical copies ("with repetition"), the net effect will be the same.

The derivation of counting formula 4 is more sophisticated than the derivations of the other formulas. It is easiest to understand if the task is cleverly visualized.⁵ An example of the visualization will be presented before the derivation is formalized.

Candy Markers You are to choose three pieces of candy from a jar containing 12 (or more) pieces of candy. There are four flavors of candy. There are at least three pieces of each flavor in the jar. In how many ways can you choose the candy? The set of flavors is the important feature. You do not care about the order in which the three pieces were chosen. If a flavor has been chosen, there are sufficient pieces left to choose another of the same flavor, so repetition is allowed. Assume for the moment that each piece of candy costs a dime. If the four flavors are represented by four empty boxes, the three pieces can be chosen by placing three dimes in the boxes. Thus, if two dimes are placed in the third box and one dime is placed in the fourth box, two pieces of flavor 3 (licorice) and one piece of flavor 4 (cinnamon) have been chosen (Figure 5.2).

Figure 5.2 Choosing candy. [Four boxes labeled Watermelon, Butterscotch, Licorice, and Cinnamon.]
This can be diagrammed⁶ using "|" to represent an interior wall of a box, and "O" to represent a dime. The arrangement just mentioned would be shown as ||OO|O, whereas the selection of two butterscotch and one licorice would be represented by |OO|O|. The number of distinct patterns of |'s and O's represents the number of distinct ways to choose the candy. The patterns are uniquely determined by which positions contain the O's. There are 4 - 1 interior walls (and hence, three |'s) and three dimes (O's). Thus, there are 4 - 1 + 3 symbols to arrange. There are C(4 + 3 - 1, 3) = C(6, 3) = 20 ways to pick the three positions for the O's (and hence 20 ways to choose the candy). You should list the 20 distinct ways to choose the candy.

⁵ The visualization presented next follows the presentation in [62].
⁶ Convince yourself that the two exterior walls can be omitted from the diagram.
Derivation of Counting Formula 4: The n distinct elements can be represented by a line of empty, adjacent boxes. Each of the n elements corresponds to a unique box. The r elements can be chosen by placing r markers in the boxes. The number of markers in a box represents the number of the corresponding element that has been chosen.
The process of placing markers in the boxes can be carried out in a systematic manner. Start at the first box. At each stage a decision is made: either place a marker in the current box or else move to the next box. Continue until all the markers have been placed in a box. The result can be summarized in a linear diagram by using the symbol "|" to represent a vertical wall of a box and the symbol "O" to represent a marker. The decision to place a marker results in an O being written to the diagram; the decision to move to the next box results in a | being written to the diagram. The two exterior walls are not important because they are never involved in a decision: the process starts in the first box (so the leftmost wall has already been crossed) and the process ends before the rightmost wall is reached. The n - 1 interior walls and the r markers completely determine which set of r elements will be chosen. The number of distinct combinations is the same as the number of visually distinct linear arrangements of |'s and O's. There are n - 1 interior walls, and r markers, so there are (n - 1) + r = n + r - 1 positions to fill. It is necessary to choose a subset of r positions in which to place the markers. The actual walls and markers are distinct items, so there is no repetition. The order in which the markers are placed is unimportant (since they are visually indistinguishable). What is important is the subset of positions they occupy. Thus, the original problem has been converted into finding a subset of size r from a set of size n + r - 1. Counting formula 3 applies; there are C(n + r - 1, r) ways to place the markers.⁷ □
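The wall-and-marker correspondence in the derivation can be made concrete. The sketch below is an illustration only: it writes "|" for an interior wall and "O" for a marker, the helper name `diagram` is hypothetical, and the four candy flavors are reused as the n = 4 boxes with r = 3 markers.

```python
from itertools import combinations_with_replacement
from math import comb


def diagram(selection, n):
    """Build the wall-and-marker string for a selection of box indices 0..n-1."""
    counts = [selection.count(i) for i in range(n)]
    # Joining n groups of markers with "|" inserts exactly n - 1 interior walls.
    return "|".join("O" * c for c in counts)


flavors = ["watermelon", "butterscotch", "licorice", "cinnamon"]
n, r = len(flavors), 3

# Every way to choose r boxes when a box may be chosen more than once.
selections = list(combinations_with_replacement(range(n), r))
diagrams = [diagram(s, n) for s in selections]

print(len(selections), comb(n + r - 1, r))  # 20 20
print(diagram((2, 2, 3), n))                # ||OO|O  (two licorice, one cinnamon)
print(diagram((1, 1, 2), n))                # |OO|O|  (two butterscotch, one licorice)
```

Each selection produces a different string of length n + r - 1, which is the one-to-one correspondence the derivation relies on.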
Selecting Fruit Four family members have just completed lunch and are ready to choose their after-lunch fruit. There are bananas, apples, pears, kiwi, apricots, and oranges in the house. In how many ways can a selection of four pieces of fruit be chosen? Since it has been implied that there is sufficient fruit of each variety for every family member to have his or her first choice, and only the selection of varieties (not which person eats what fruit) is of interest, this is a combination with repetition. The solution is thus
$$C(6 + 4 - 1, 4) = C(9, 4) = \frac{9!}{4! \cdot 5!} = 126.$$
Quick Check 5.5
1. Calculate the number of ways to choose 2 items, with repetition, from a set of 4 items. (a) Exhaustively list all possibilities. (b) Use counting formula 4.
2. An English teacher has designed extra-credit projects for his students. They may
• write a poem
• write a short story
• read a book and complete a book report
• write a one-act play
• write a letter to the editor
• write an article for the school newspaper
Multiple projects from a category are acceptable. For example, someone may turn in two poems. How many different sets of three projects are possible?

⁷ Notice the problem-solving strategy: A problem with an unknown solution has been transformed into a problem that has already been solved.
Table 5.2 summarizes the formulas for permutations and combinations. They are organized according to whether order is important and repetition is permitted.

TABLE 5.2 Arranging r elements from a set containing n distinct elements

                      With Order                   Without Order
Without Repetition    $P(n, r) = n!/(n - r)!$      $C(n, r) = n!/(r! \cdot (n - r)!)$
With Repetition       $n^r$                        $C(n + r - 1, r) = (n + r - 1)!/(r! \cdot (n - 1)!)$
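The four cells of the table can be collected into a single function and compared with exhaustive enumeration. This is a sketch, assuming Python 3.8 or later for `math.perm` and `math.comb`; the function name `count_arrangements` and its boolean parameters are hypothetical, and the function assumes n ≥ 1 and r ≥ 0.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm


def count_arrangements(n, r, ordered, repetition):
    """Select the counting formula by the two yes/no questions in the table."""
    if ordered and repetition:
        return n ** r                  # counting formula 2
    if ordered:
        return perm(n, r)              # counting formula 1
    if repetition:
        return comb(n + r - 1, r)      # counting formula 4
    return comb(n, r)                  # counting formula 3


# Cross-check each formula against a direct listing for n = 3, r = 2.
items, r = "abc", 2
n = len(items)
print(count_arrangements(n, r, True, True), len(list(product(items, repeat=r))))
print(count_arrangements(n, r, True, False), len(list(permutations(items, r))))
print(count_arrangements(n, r, False, True),
      len(list(combinations_with_replacement(items, r))))
print(count_arrangements(n, r, False, False), len(list(combinations(items, r))))
```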
5.1.6 Exercises
The exercises marked with * have detailed solutions in Appendix G.
1. (a) Design a set of marriage preferences so that a group of four men and four women actually require 10 rounds for the deferred acceptance algorithm. (b) Can you determine a general pattern of preferences so that a group of n men and n women require the maximum $n^2 - 2n + 2$ rounds for the deferred acceptance algorithm? (c) Produce a big-O estimate for the worst-case behavior of the deferred acceptance algorithm, where proposals are the critical operation.
2. * This evening, I can either read one of four books, watch one of three videos, or talk on the phone with one of three friends. In how many ways can I spend the evening?
3. * Tomorrow night I plan to listen to one of 12 cassette tapes, and then eat one of four frozen dinners. After dinner I will put together one of three picture puzzles. How long can I maintain such an active schedule? (Actually, in how many ways can I spend the evening?)
4. A boy needs to eat breakfast and lunch, practice piano, mow the lawn, and read a book today. In how many ways can he arrange these activities, assuming he doesn't care in which order the meals occur?
5. How many tours consisting of a visit (in the order specified) to a park, a museum, a mall, and a restaurant are possible if we can select from five parks, two museums, six malls, and eight restaurants?
6. * Suppose that either a doctor or a dentist is chosen to speak about his or her profession to a seventh-grade classroom. How many choices are there for this speaker if there are 70 doctors and 23 dentists available?
7. Mom requires Bobby to eat 10 servings of grain products each day. Suppose that the grains available to Bobby are cereal, bread, rice, pasta, and crackers. For how many days can Bobby make a different selection of his grain servings, assuming that he obeys Mom's rule?
8. I have five textbooks on the shelf above my desk. In how many ways can I place these books in a line?
9. How many distinct five-digit zip codes are possible?
10. Assuming that there will never be more than 500 million people in the United States before the year 2050 A.D., how many digits are necessary to provide every person with a personal zip code? (Notice that businesses and institutions are being neglected.)
11. * Suppose that for a conference, individuals are given a badge with five features: a color, a shape, an uppercase alphabet letter, an animal, and a one-digit number. What is the largest number of distinct badges, assuming that there are 7 colors, 8 shapes, and 4 animals available?
12. What is the major distinction between permutations and combinations?
13. Suppose that in an experiment you are asked to arrange a penny, a cracker, a washer, and a pencil in a row. In how
14. An individual won a raffle and can choose a prize from one of four lists. The first list contains 19 possible prizes, while the second, third, and fourth lists include 14, 21, and 18 items, respectively. How many prizes does the raffle winner have to choose from?
15. * Next week, I intend to visit my old home town. I wish to visit five friends but have time to see only three of them. In how many ways can I schedule the visits?
16. A special school club called the Lions is formed by including either the teacher or one of the three top students from each of 43 classrooms. How many ways are there to form the Lions? Assume that both a teacher and a student from a single classroom are not chosen to be members of the Lions.
17. A local pizza store offers a choice of seven toppings. How many distinct three-topping pizzas do they offer if no topping can be repeated?
18. * Recall that a bit is either a "0" or a "1." If 7 bits are used to encode characters, how many different characters can be encoded? [Hint: Exhaustive enumeration (0000000, 0000001, 0000010, 0000011, ...) is too tedious for this problem.]
19. * A small country (9 million people) is installing a new, state-of-the-art telephone system. How many digits are necessary to allow sufficient phone numbers? Discuss the assumptions you have made, as well as the justification for your answer.
20. Suppose that you are taking a bus trip from your home in Florida to California. There are seven different bus services offering trips from Florida to Arizona, four offering trips from Florida to Texas, eight offering trips from Arizona to California, and seven offering trips from Texas to California. How many possibilities are there for a bus ride from Florida to California, via either Arizona or Texas?
21. I will be sharing an apartment with two friends. The apartment has one large and one small bedroom. Two of us will share the large bedroom and the other will sleep in the smaller room. In how many distinguishable ways can we assign rooms?
22. * My young niece will soon turn 11. Suppose I want to send her a dollar for every year old she will be. I have a large stack of crisp new dollar bills and also a large pile of shiny new gold Sacagawea⁸ dollar coins. In how many distinct ways can I use these two kinds of dollars to give $11? Solve this using combinations with repetition and then by a more direct (and simpler) method.
23. The day after nine students complete an exam, the teacher passes out their grades. The possible grades are A, B, C, D, and F. In how many ways can the grades be assigned to the students?
24. In how many ways can you purchase three movies from a store that sells 57 distinct movies in unlimited quantities if
(a) There are no restrictions on your purchase of the three movies?
(b) You must purchase a copy of the owner's home-video footage of her new grandchild?
25. A recording artist plans to place six songs on her new CD.
(a) In how many ways can she order these songs?
(b) In how many ways can she order these songs if she has already chosen the first and last selections?
26. A farmer has divided his land into nine plots. He typically plants one plot of soybeans, one of corn, one of alfalfa, and one of potatoes. The rest of the plots are left fallow. How many distinct patterns of crop/plot pairings are there?
27. In how many distinguishable ways can I form a committee of ten people if the committee must contain at least one faculty member, at least one administrator, and at least two students? Assume that there are more than six faculty, more than six administrators, and more than seven students. Also, assume that a committee with three administrators, four students, and three faculty is different from a committee with four administrators, three students, and three faculty, but we don't care which four administrators, etc. are chosen.
28. A piano has 88 keys; 52 are white and 36 are black.
(a) You have 10 fingers. Assuming that your fingers were made of rubber and it was possible to reach any combination of ten keys simultaneously, how many 10-key combinations are there?
(b) A standard octave contains 12 keys, representing distinct musical notes. Triad chords require three keys to be played simultaneously. Ignoring how agreeable the sound will be, in how many ways can a triad be played within a single octave?
29. * The following excerpt is from The Mythical Man-Month by Frederick Brooks [30, p. 78]:
If there are n workers on a project, there are $(n^2 - n)/2$ interfaces across which there may be communication, and there are potentially almost $2^n$ teams within which coordination must occur. The purpose of organization is to reduce the amount of communication and coordination necessary; hence organization is a radical attack on the communication problems treated above.
(a) Provide a mathematical justification for the claim: "If there are n workers on a project, there are $(n^2 - n)/2$ interfaces across which there may be communication."
(b) Provide a mathematical justification for the claim: "If there are n workers on a project, there are potentially almost $2^n$ teams within which coordination must occur."
30. In 1992 the Mattel toy company introduced a talking Teen Barbie doll. The company had compiled a collection of 270 possible expressions for Barbie to "speak." In order to make the dolls appear more individually distinct, Mattel randomly chose 4 of the 270 possibilities for any particular doll.
(a) How many distinct dolls could they manufacture?
(b) Suppose one of the 270 statements is considered to be undesirable. How many distinct dolls contain the undesirable statement among their 4 exclamations?
31. A local pizza store offers a choice of seven toppings and three sizes (small, medium, and large). How many distinct three-topping pizzas do they offer?
32. A fletcher is making some arrows. Each arrow has three feathers on the tail end, spaced at 120° intervals around the shaft. The feathers come in four colors: red, blue, yellow, and green. How many visually distinct patterns are there if a color can be used in more than one position on an arrow? Assume that rotating the arrow does not result in a different pattern, but changing colors does. Indicate which counting principle or formula most directly applies to this problem.

⁸ For more information, see the Sacagawea link at http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/ (in the "Textbook-Related Links" section).
33. (a) How many ways are there to situate four people in identical chairs at a circular table? (b) Generalize part (a) to the case when there are n people.
34. A standard "Trivial Pursuit" game contains two boxes of 500 question cards each. Consider one of these boxes. Each question is in one of six categories (such as "Literature" and "Movies"). Each card contains one question from each category. When a card is used, only one question is asked. The card is then placed at the end of the box. Suppose a game is played that requires exactly 500 questions. In how many ways can the sequence of 500 questions be chosen? (Your calculator will hate this question!)
35. How many different seven-card crazy eights hands are there that contain no 5s but have four cards of the same kind?
36. How many different seven-card crazy eights hands are there that contain exactly two 2s, exactly three 3s, and exactly two cards with a third common face value?
37. How many different eight-card hands are there that contain exactly two suits, with four cards from each suit?
38. How many different eight-card hands are there that contain only face cards?
39. Mrs. Candy has a large box of lollipops, chocolate bars, and caramels. She wants to give each of the nine children in her neighborhood three pieces of candy. Taking into account that each type of candy is available in a quantity greater than 30, in how many ways can Mrs. Candy distribute the candy?
40. How many odd four-digit numbers can be created from the following digits: 1, 5, 6, 7, 8? Assume that each digit can only be used once.
41. What value should C(n, 0) have (intuitively)? What value should C(n, n) have (again, intuitively)? How do your answers match with counting formula 3?
42. Explain (intuitively) why P(n, r) = P(r, r) · C(n, r).
43. The game of dominoes uses a set of small rectangular tiles. Each tile is called a domino. A domino contains two equal-sized regions that contain zero or more dots, in fixed patterns. (A region with no dots is called a "blank.") The domino set has a maximum number of dots (usually 6, 9, or 12). Every pattern (number of dots) appears exactly one time with every other pattern on a domino. Dominoes with both regions containing the same pattern are called "doubles." The following diagram shows two dominoes: the 1-5 domino and a double 4.
[Figure: the 1-5 domino and the double-4 domino]
(a) Create a formula D(n) that computes the number of dominoes in a set whose highest tile is a double n. For example, D(0) = 1, since there is only one domino (double blanks). Also, D(1) = 3 since there will be three dominoes: double blanks, blank 1, and double 1s. (b) A set of dominoes whose highest tile is a double 12 will contain D(12) tiles. Derive this number in as many distinct ways as possible. That is, use different counting techniques to derive this number.
5.1.7 More Complex Counting Problems

It might be helpful to reflect for a moment about the rates at which the numbers in the counting formulas grow. It should be clear that there will be more arrangements when order is important than when order is irrelevant. Thus, for a common n and r, there should be more permutations than combinations. It should also be clear that if repetition is permitted the number of arrangements should increase. These observations can be seen in Table 5.3, where the numbers of permutations and combinations of 6 objects are compared. Notice the explosive growth in row 2. Permutations with repetition allow many more arrangements than any of the other options.

TABLE 5.3 Permutations and combinations with n = 6

r                                               0   1    2     3      4      5       6
Permutations                  P(6, r)           1   6   30   120    360    720     720
Permutations with repetition  6^r               1   6   36   216   1296   7776   46656
Combinations                  C(6, r)           1   6   15    20     15      6       1
Combinations with repetition  C(6 + r - 1, r)   1   6   21    56    126    252     462
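The four rows of Table 5.3 can be regenerated mechanically. The following is a minimal Python sketch, assuming Python 3.9 or later (for `math.perm`, `math.comb`, and the built-in generic type hints); the function name `counting_table` and the row labels are illustrative choices, not notation from the text.

```python
from math import comb, perm


def counting_table(n: int) -> dict[str, list[int]]:
    """Return the four counting formulas evaluated for r = 0, 1, ..., n."""
    rs = range(n + 1)
    return {
        "P(n, r)": [perm(n, r) for r in rs],                  # order matters, no repetition
        "n^r": [n ** r for r in rs],                          # order matters, repetition allowed
        "C(n, r)": [comb(n, r) for r in rs],                  # order irrelevant, no repetition
        "C(n + r - 1, r)": [comb(n + r - 1, r) for r in rs],  # order irrelevant, repetition allowed
    }


if __name__ == "__main__":
    for label, row in counting_table(6).items():
        print(f"{label:>16}: {row}")
```

Calling `counting_table` with other values of n makes it easy to see how quickly the row for permutations with repetition outgrows the others.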
The examples in this section will typically (but not always) require more than one of the counting principles or formulas to be used. It will be helpful to consider the four criteria used to develop the principles: independence, mutual exclusion, order, and repetition. If the presence or absence of these are noted for each problem, the proper
counting principle can be easily identified. It is generally easiest to apply one of the permutation or combination principles whenever possible. Only if these don't apply should the more general principles 1 and 2 be tried. Section 8.1 contains some additional, more sophisticated material related to counting.

Several examples and problems will be easier to describe using the next few definitions.

DEFINITION 5.3 Digit
A digit is any one of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.

DEFINITION 5.4 Alphanumeric
A character is said to be alphanumeric if it is either an uppercase or lowercase letter or it is a digit. A character is said to be alphanumeric-upper if it is either an uppercase letter or it is a digit.
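Definitions 5.3 and 5.4 translate directly into membership tests. The sketch below is one possible Python rendering, assuming that "letter" means one of the 26 unaccented English letters; the names `is_alphanumeric` and `is_alphanumeric_upper` are hypothetical helpers introduced only for illustration.

```python
import string

DIGITS = set(string.digits)                                   # Definition 5.3
ALPHANUMERIC = set(string.ascii_letters) | DIGITS             # Definition 5.4
ALPHANUMERIC_UPPER = set(string.ascii_uppercase) | DIGITS     # Definition 5.4


def is_alphanumeric(sequence: str) -> bool:
    """True when every character is an English letter (either case) or a digit."""
    return all(ch in ALPHANUMERIC for ch in sequence)


def is_alphanumeric_upper(sequence: str) -> bool:
    """True when every character is an uppercase English letter or a digit."""
    return all(ch in ALPHANUMERIC_UPPER for ch in sequence)


if __name__ == "__main__":
    for candidate in ("aB3d5", "3DFG67", "a34,205b"):
        print(candidate, is_alphanumeric(candidate), is_alphanumeric_upper(candidate))
```

Explicit character sets are used instead of `str.isalnum()` because that method also accepts non-English letters, which would not match the assumption stated above.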
License Plates
A state wishes to produce license plates that consist of three uppercase letters, a space, and three digits. If no three-letter sequences are excluded, how many license plates are possible? Order is clearly important. Repetition is permitted. However, there is a letter sequence and a digit sequence. How should these be combined? The choice of letter sequence has no influence on which digit sequence is chosen. The general counting principle 1 thus applies. There are 26^3 = 17,576 letter sequences and 10^3 = 1000 digit sequences. There are thus 17,576 · 1000 = 17,576,000 possible license plates.

Alphanumerics
The sequence of characters "aB3d5" is an alphanumeric sequence. The sequence "3DFG67" is an alphanumeric-upper sequence. The sequence "a34,205b" is not alphanumeric because it contains a comma.

More Inventory Codes
The owner of a small store has decided to use an inventory code that distinguishes between taxable and nontaxable items. Nontaxable items will have a code that starts with the letter "N," followed by any combination of four alphanumeric-upper characters. Taxable items will start with any uppercase letter except "N," again followed by any combination of four alphanumeric-upper characters. How many distinct inventory codes are there? The key feature is that the first letter of the code produces two mutually exclusive sets of items (taxable and nontaxable). General counting principle 2 thus applies; add the number of nontaxable codes to the number of taxable codes. There are 36 alphanumeric-upper characters, so there are 36^4 = 1,679,616 codes for nontaxable items. There are 25 choices for the first letter of a code for a taxable item (independent of the rest of the code), so there are 25 · 36^4 = 41,990,400 codes for taxable items. The sum is 43,670,016. (Can you produce a somewhat simpler solution to this problem?)

Attending the Cinema
Four friends are planning a trip to a cinema complex. The complex is showing six
different movies (all starting at the same time). In how many ways can the friends view the movies if each views only one? I will assume that the people each care which movie they view, so order is important. More than one friend can view a movie, so repetition is permitted. This is a permutation with repetition problem. There are 6^4 = 1296 possible viewing arrangements.

Scheduling Presentations
An instructor has divided her math class into eight groups. Each group must make a presentation to the class. She can schedule three presentations per day but will have only two groups on the final day, so that she can serve cookies. In how many ways can this be done? The simplest way to solve this is to think of scheduling a sequence of eight events. There are P(8, 8) = 8! = 40,320 ways to do this. A more complex approach selects a set of three groups for the first day. This can be done in P(8, 3) = 336 ways. She can then choose a set of three groups for the second day in P(5, 3) = 60 ways. There are now only two groups left for the third day. These can be arranged in two ways. The three days must now be properly combined. The days are not mutually exclusive; all three days must be scheduled. Are they independent? The choice of groups for day one does influence the available choices for day two. This influence has been eliminated by reducing the pool from which to choose on day two down to five groups. With this modification, the choices by day are independent. The total number of three-day arrangements is thus 336 · 60 · 2 = 40,320.
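The cinema and scheduling counts are small enough to confirm by brute-force enumeration. The sketch below is illustrative only: it assumes Python 3.8 or later, and the function names are hypothetical. It lists every assignment explicitly and compares the total with the formulas used in the two examples.

```python
from itertools import permutations, product
from math import factorial, perm


def cinema_arrangements(friends: int = 4, movies: int = 6) -> int:
    """Count assignments of one movie to each friend (repetition allowed)."""
    return sum(1 for _ in product(range(movies), repeat=friends))


def schedules_by_enumeration(groups: int = 8) -> int:
    """Count every ordering of the presentation groups by listing them all."""
    return sum(1 for _ in permutations(range(groups)))


def schedules_by_day() -> int:
    """Day-by-day count: order 3 of 8 groups, then 3 of the remaining 5, then the last 2."""
    return perm(8, 3) * perm(5, 3) * perm(2, 2)


if __name__ == "__main__":
    print(cinema_arrangements(), 6 ** 4)
    print(schedules_by_enumeration(), factorial(8), schedules_by_day())
```

Enumeration is only practical for small cases like these, but it is a useful sanity check that the two scheduling approaches really count the same set of arrangements.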
At Most One Eight
In the card game crazy eights, it is advantageous to have eights in your hand. If seven cards are dealt from a standard deck of 52 cards, how many hands have at most one 8? It will be helpful to consider two mutually exclusive possibilities: no 8s and exactly one 8. If exactly one 8 is in the hand, there are four choices for the 8, and 48 cards from which to choose the other 6 cards. There are C(48, 6) = 12,271,512 ways to choose the six non-eights, so there are 4 · 12,271,512 = 49,086,048 hands with exactly one 8. There are C(48, 7) = 73,629,072 ways to deal a hand with no 8s. There are thus 49,086,048 + 73,629,072 = 122,715,120 ways to deal a hand with at most one 8.

Quick Check 5.6
1. Plot the values of C(4, r) for 0 ≤ r ≤ 4.
2. Plot the values of C(5, r) for 0 ≤ r ≤ 5.
3. Produce an alternate solution to Example 5.28.
4. A high school student needs to listen to the election results on the day of the election, and then read the next-day coverage in a newspaper. There are five television channels and four radio stations that are carrying election coverage. The student has two newspapers to choose from. Assuming the student is loyal to one election-day source for listening, in how many ways can a listening/reading pair be chosen?
5. Suppose the student in the previous problem is instead required either to compare the coverage of two television stations or compare the coverage of two radio stations. The student must then report on the coverage in one newspaper. In how many ways can this be done?
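The count of seven-card hands with at most one 8 can be cross-checked in Python. This is a sketch under the usual assumption of a 52-card deck containing four 8s; the helper name `hands_with_exactly` is hypothetical. It adds the two mutually exclusive cases and then confirms the total by subtracting the hands with two or more 8s from all hands.

```python
from math import comb


def hands_with_exactly(eights: int, hand_size: int = 7) -> int:
    """Number of hands holding exactly `eights` of the four 8s."""
    return comb(4, eights) * comb(48, hand_size - eights)


# Mutually exclusive cases: no 8s, or exactly one 8 (general counting principle 2).
at_most_one = hands_with_exactly(0) + hands_with_exactly(1)

# Cross-check: all hands minus those with two, three, or four 8s.
by_complement = comb(52, 7) - sum(hands_with_exactly(k) for k in range(2, 5))

if __name__ == "__main__":
    print(at_most_one, by_complement)
```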
Some apparently difficult counting problems become fairly easy once you adopt the proper viewpoint. The next theorem illustrates this situation.
Ordered Triplets
Let i, j, k, and n be positive integers. The number of ordered triplets (i, j, k) with 1 ≤ i < j [...]

[...] a_2 ≥ a_3 ≥ ... ≥ a_k. The sequence is strictly monotone if all the inequalities are strict (for example, < rather than ≤) [...]

[...] 0 elements. Then |P(S)| = 2^n.

Theorem 5.6 Vandermonde's Theorem
Let n, m, and r be nonnegative integers, with r ≤ n and r ≤ m. Then [...]

[...] $(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^{n-r} y^r$.

5.5 Chapter Review

Theorem 5.8 The Multinomial Theorem
Let n be a nonnegative integer. Then $(x_1 + x_2 + \cdots$ [...]

[...] be finite sets. Then $|A_1 \cup A_2 \cup \cdots \cup A_n| = \sum_{i} |A_i| - \sum_{i<j} |A_i \cap A_j| + \cdots$ [...]
1 10, a natural number)
14. Compute the probability of rolling a pair of fair dice and obtaining distinct numbers on the two dice.
15. Consider the random experiment of choosing eight distinct letters of the English alphabet, where exactly two of the letters chosen are vowels and six of the letters are consonants. What is the probability that the letter "a" is among the eight letters and that these same eight letters can be arranged in a row to form the word children?
16. Compute the probability of picking a spade or a three (using a standard 52-card deck).
17. Consider the candy jar in Example 6.13. Compute the probability that I first choose a watermelon and then choose a cinnamon. Assume that I do not replace any candy once it has been chosen. (Hint: What happens to the sample space after the watermelon candy has been chosen?)
18. Compute the probability of picking a diamond with an even number printed on it (i.e., a 2, 4, 6, 8, or 10) from a standard 52-card deck. Use one of the formulas. Describe which formula you used and why it was appropriate. Check the reasonableness of your answer by directly computing the probability. Describe this second solution.
19. Consider the random experiment of rolling two fair dice. What is the probability that the sum is 10 and both dice show an even number? Solve this three ways (two in part (b)): (a) Using a six by six table, representing the 36 possible rolls of the dice, count the possibilities directly. (b) Use formula 4 (both versions).
20. Compute the probability that a family having three children, randomly chosen from all families having three children, has at least one child of each gender and contains at least two girls. Use one of the formulas. Describe which formula you used and why it was appropriate. What assumptions have you made?
21. The random experiment is flipping a fair coin three times. Compute the probability of obtaining at least one head.
22. The random experiment is flipping a fair coin three times. Compute the probability of a head on the first flip or a tail on at least one of the last two flips.
23. The random experiment is choosing an uppercase letter from the English alphabet. (a) What is the probability of choosing either a vowel or a consonant in the first half of the alphabet? (b) What is the probability of choosing a vowel that is within two letters of the letter L? (The letters R, S, U, and V are within two letters of T.)
24. Consider the random experiment of flipping an unfair coin three times. Assume that P(H) = .8 and P(T) = .2.
(a) List the sample space.
(b) Decide whether the successive flips are independent. That is, should the probability of a head on the second flip be revised if we know that the first flip produced a tail? Should knowledge of the first two flips cause us to revise the probabilities of a head on the third flip?
(c) Use your answer to part (b), an appropriate formula from Section 6.2.2, and a proper choice of method from Section 6.1.2 to compute the probability of each outcome.
(d) Compute the probabilities of the following events:
i. No heads
ii. At least one head
iii. All tails
iv. A tail on the first flip
v. Identical results on the first and third flips
25. According to [78], there were approximately 1050 boys born in the United States during 1987 for every 1000 girls born. Thus the probability that a newly born child is a boy is approximately .512 (so the probability the child is a girl is .488). Assume that the gender of previous children in the family has no influence on the gender of the next child conceived (i.e., assume independence).
(a) Compute the probabilities for all eight outcomes in Example 6.3.
(b) Compute the probabilities of the following events:
i. Two girls, one boy
ii. All boys
iii. At least one girl (Is there an "easy way" to do this part?)
iv. [...]
v. All the same gender
26. According to [29], it is possible to categorize the (nonelderly) people without health insurance during 1989 in the United States by family income level. The following table shows the data.

Family Income Level    Number of People (in Millions)
Less than $5,000       4.7
$5,000-$9,999          5.0
$10,000-$14,999        5.6
$15,000-$19,999        4.6
$20,000-$29,999        5.9
$30,000-$39,999        3.2
$40,000-$49,999        1.9
$50,000 or more        3.3

Consider the random experiment of choosing to give health insurance to one of these people, not showing favoritism in any direction with regard to family income level. The events of interest are membership in one of the eight income brackets in the table.
(a) Compute the probabilities for all eight events in the relevant sample space for this random experiment. In other words, determine the probability that the randomly chosen person without health insurance fits into any one of the given income levels.
(b) Suppose that the previous random experiment just involved giving health insurance to a person for one year. Consider the random experiment of choosing two of these people to receive health insurance during two different years. It is permissible for the same person to be chosen twice. Additionally, assume that the person chosen for one year has no influence on the person chosen for the next year (i.e., assume independence). The focus in this experiment is on the income level of the person chosen for each of the two years. Compute the probabilities of the following events:
i. Two people chosen from the same income group.
ii. Neither person has an income below $5,000.
iii. The person chosen in the first year has an income of at least $50,000.
iv. Exactly one person from the $40,000-$49,999 income bracket is chosen or exactly one person from the $15,000-$19,999 income bracket is chosen.
v. Both of the people chosen have an income that is at least $30,000.
27. According to the Statistics New Zealand "snapshot" of work, education, and income data collected in the 2001 Census, 6 in 10 people received income from wages and salaries. Consider the random experiment of choosing three people in New Zealand. Let W represent that a person received income from wages and salaries, and N represent that a person received no income from wages and salaries. Assume that whether or not one person received income from wages and salaries has no correlation with the income situation of the other two people (i.e., assume independence). The three people will be chosen in succession. There are eight events of interest: {WWW, WWN, WNW, WNN, NWW, NWN, NNW, NNN}.
(a) Compute the probabilities for all eight events of interest for this random experiment.
(b) Compute the probabilities of the following events:
i. No one received income from wages and salaries or everyone did.
ii. At most one person received income from wages and salaries.
iii. Exactly one person did not receive income from wages and salaries.
iv. Someone did not receive income from wages and salaries.
v. Exactly one person received income from wages and salaries or at least two people did not receive income from wages and salaries.
28. Suppose there is a room with 28 people. What is the probability that at least two of them share a common birthday (ignoring leap years)?
29. Assume that P(A | B) = P(A). Prove P(B | A) = P(B).
30. Show that
(1 - (1 lim (
-
-
n(.
(i
0
e -2
-
-(1
-
n
" e- I
31. Suppose a reporter read the California Supreme Court opinion in Example 6.31. The reporter summarized the verdict by writing The California Supreme Court overturned the verdict because there was not enough evidence to support the prosecution's probability estimate. Write a letter to the editor showing the inadequacies of the reporter's simplistic summary.
6.3 Counting and Probability

This section will eventually examine probabilities related to large sample spaces. The technique to be used will first be illustrated with a small sample space.
Probabilities and Candy
Example 5.24 on page 222 introduced a candy jar with four flavors of candy, with three pieces of each flavor contained in the jar. What is the probability of picking at least one cinnamon candy? This task will be simplified if the complementary event is examined. Denote the complementary event by NC. There are C(3 + 3 - 1, 3) = C(5, 3) = 10 ways to choose a set of three pieces of candy without any being cinnamon. Since there are 20 ways to choose a collection of three pieces of candy with no restrictions, the probability is 1 - P(NC) = 1 - 10/20 = .5 that at least one cinnamon was picked.

What is the probability of picking three pieces of candy in which there is either exactly one cinnamon or there is exactly one licorice (cinnamon and licorice may both be present)? Let OC be the event "exactly one cinnamon" and OL be the event "exactly one licorice". We want the probability of the event OC ∪ OL. That probability is P(OC ∪ OL) = P(OC) + P(OL) - P(OC ∩ OL). The jar contains identical pieces of candy, so |OC| can be calculated by picking one cinnamon, removing the other pieces of cinnamon from the jar, and then counting the number of ways to choose two more pieces of candy. There are C(3 + 2 - 1, 2) = C(4, 2) = 6 ways to do this. Similarly, there are 6 ways to pick exactly one licorice. There are 2 ways to pick one cinnamon, one licorice, and a third piece that is neither cinnamon nor licorice. Thus, P(OC ∪ OL) = P(OC) + P(OL) - P(OC ∩ OL) = 6/20 + 6/20 - 2/20 = 10/20 = .5 (you might find it helpful to list the 10 favorable events).
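Because the candy sample space has only 20 elements, both probabilities can be confirmed by listing every selection. The sketch below follows the example in treating each unordered selection of three pieces as one equally likely outcome. Only cinnamon and licorice are named in the example, so the other two flavor names in the code are placeholders, and `probability` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# "watermelon" and "other" are placeholder names for the two remaining flavors.
FLAVORS = ("cinnamon", "licorice", "watermelon", "other")

# Every unordered selection of three pieces; with three pieces of each flavor
# in the jar, all of these selections are actually possible.
SELECTIONS = list(combinations_with_replacement(FLAVORS, 3))


def probability(event) -> Fraction:
    """Fraction of selections satisfying `event`, each selection equally likely."""
    favorable = sum(1 for selection in SELECTIONS if event(selection))
    return Fraction(favorable, len(SELECTIONS))


p_at_least_one_cinnamon = probability(lambda s: "cinnamon" in s)
p_one_cinnamon_or_one_licorice = probability(
    lambda s: s.count("cinnamon") == 1 or s.count("licorice") == 1
)

if __name__ == "__main__":
    print(len(SELECTIONS), p_at_least_one_cinnamon, p_one_cinnamon_or_one_licorice)
```

Printing the selections that satisfy the second event is one way to produce the list of 10 favorable outcomes.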
The second example considers the probability of a card hand.

Crazy Eights Hands
Young crazy eights players^16 might wonder how likely it is that a hand they have just been dealt will contain at most one 8. We have the theoretical basis to compute this probability. It is just the number of seven-card hands that contain at most one 8 divided by the total number of seven-card hands. The numerator has already been computed in Example 5.31 on page 228 in Chapter 5. There are C(52, 7) = 133,784,560 seven-card hands. Thus P(at most one eight) = 122,715,120 / 133,784,560 ≈ .917, or about 91.7%. Hands with two or more 8s occur only about 8.3% of the time. Example 5.20 on page 220 calculated the number of crazy eights hands with exactly one 8 and exactly 2 face cards to be 15,550,920. The probability of such a hand occurring at random (assuming every hand is equally likely to occur) is 15,550,920 / 133,784,560 ≈ .116 (approximately 12%).

In the board game Risk®, the opponents each try to conquer the entire world. The mechanism used is to roll fair dice. The attacker can use from one to three dice (depending on the number of armies the attacker has). The defender can use one or two dice (again depending on the number of defending armies). The opponents roll all the dice. They each line their dice up in descending numeric order. They then compare their (respective) highest values until the defender runs out of dice for that roll. For example, suppose they roll A6, A3, A4, D4, and D5. (That is, the attacker rolls a 6, 3, and 4; the defender rolls a 4 and 5.) They would compare the two pairs (A6, D5) and (A4, D4). The pairs are scored independently. The player with the higher number in a pair wins that match. If there is a tie, the defender wins. In the example just mentioned, each opponent would lose one army. A smart attacker will compute how likely it is to win. I will compute this probability in two special cases.
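The two crazy eights probabilities are ratios of counts, so they are easy to reproduce. The following Python sketch assumes every seven-card hand is equally likely. The product used for the second numerator (one of four 8s, two of the twelve face cards, and four of the 36 remaining cards) is my reading of how the count 15,550,920 arises, not a quotation of Example 5.20.

```python
from fractions import Fraction
from math import comb

TOTAL_HANDS = comb(52, 7)

# Numerator for "at most one 8": no 8s plus exactly one 8.
at_most_one_eight = comb(48, 7) + 4 * comb(48, 6)

# Assumed decomposition for "exactly one 8 and exactly two face cards":
# 4 eights, 12 face cards (J, Q, K), and 52 - 4 - 12 = 36 other cards.
one_eight_two_face = comb(4, 1) * comb(12, 2) * comb(36, 4)

p_at_most_one_eight = Fraction(at_most_one_eight, TOTAL_HANDS)
p_one_eight_two_face = Fraction(one_eight_two_face, TOTAL_HANDS)

if __name__ == "__main__":
    print(f"{float(p_at_most_one_eight):.3f}")
    print(f"{float(p_one_eight_two_face):.3f}")
```

Exact fractions are kept until the final formatting step so that rounding happens only once.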
Risk Dice: One against One
If each player has one die, the sample space is the set of 36 pairs: (attacker's number, defender's number). It is possible to list all the pairs, then determine how many have the first component larger than the second. This seems too tedious. An alternate approach is to write a short computer program to do the search for you. I will take a more mathematical approach. We need to count how many ways the attacker can win. We will then divide by 36 (the number of possible outcomes) to compute the probability. There are 6 values the attacker can roll. For each of these numbers, I need to count how many numbers the defender can roll that are less than the attacker's number (Table 6.3).

TABLE 6.3 Number of rolls smaller than attacker's

Attacker's roll    Number of smaller rolls
6                  5
5                  4
4                  3
3                  2
2                  1
1                  0

There are 15 ways the attacker can win. (What counting principle or counting formula have I used?) Thus P(attacker wins) = 15/36 ≈ .417. This is not a favorable situation for the attacker.

16 Example 5.19 on page 219.
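The one-against-one example mentions that a short computer program could do the search. One possible version is sketched here in Python; the function name is hypothetical. It lists all 36 (attacker, defender) pairs and counts those in which the attacker's roll is strictly larger, since ties go to the defender.

```python
from fractions import Fraction
from itertools import product

DIE = range(1, 7)  # faces of one fair six-sided die


def attacker_win_probability_one_vs_one() -> Fraction:
    """Probability that a single attacking die beats a single defending die."""
    rolls = list(product(DIE, repeat=2))            # (attacker, defender) pairs
    attacker_wins = sum(1 for attacker, defender in rolls if attacker > defender)
    return Fraction(attacker_wins, len(rolls))


if __name__ == "__main__":
    p = attacker_win_probability_one_vs_one()
    print(p, f"{float(p):.3f}")
```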
Risk Dice: Two against One
If the attacker has two dice and the defender has only one, the sample space can be visualized as a table with 36 rows, labeled by the possible pairs of numbers the attacker can roll, and 6 columns, labeled by the 6 rolls the defender can roll. In each of the 216 cells, an A or D could be written, depending on who is the winner of that match. The number of A's could then be counted. This is much too tedious. A computer program would work nicely in this case. I will again use a mathematical approach. If the defender rolls a 6, the attacker loses. There are 36 possible ways this can happen (the attacker rolls any number from 1 to 6 on each of two dice, for a total of 6^2 = 36). If the defender rolls a 5, the defender will win if any number from 1 to 5 is rolled by the attacker. There are 5^2 = 25 such rolls possible. If the defender rolls a 4, the attacker will lose if only numbers less than or equal to 4 are rolled. There are 4^2 = 16 ways this can happen. The pattern continues and is recorded in Table 6.4.

TABLE 6.4 Number of rolls defender wins

Defender's roll    Number of rolls defender wins
6                  36
5                  25
4                  16
3                  9
2                  4
1                  1

There are thus 36 + 25 + 16 + 9 + 4 + 1 = 91 rolls on which the defender will win. Thus P(defender wins) = 91/216 ≈ .421. Consequently P(attacker wins) = 1 - P(defender wins) ≈ .579. This is favorable for the attacker.
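The 216-cell table for two attacking dice against one defending die is tedious by hand but trivial for a program. The Python sketch below is illustrative, with hypothetical function names. It enumerates every (attacker die 1, attacker die 2, defender die) triple and awards the match to the defender whenever the attacker's higher die does not exceed the defender's roll.

```python
from fractions import Fraction
from itertools import product

DIE = range(1, 7)  # faces of one fair six-sided die


def defender_wins_two_vs_one() -> int:
    """Count the rolls in which the defender's die ties or beats the attacker's best die."""
    return sum(1 for a1, a2, d in product(DIE, repeat=3) if max(a1, a2) <= d)


def attacker_win_probability_two_vs_one() -> Fraction:
    """Complement of the defender's winning probability over all 216 rolls."""
    total = 6 ** 3
    return Fraction(total - defender_wins_two_vs_one(), total)


if __name__ == "__main__":
    print(defender_wins_two_vs_one())
    print(f"{float(attacker_win_probability_two_vs_one()):.3f}")
```

Grouping the enumeration by the defender's roll reproduces the squares 36, 25, 16, 9, 4, 1 recorded in Table 6.4.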
Quick Check 6.9
1. If a five-card hand is dealt using a standard 52-card deck, what is the probability of being dealt a hand with all four aces?
2. What is the probability of being dealt a five-card hand that contains three clubs and two diamonds?
3. What is the probability that rolling a pair of dice will result in the sum being even?
6.3.1 Exercises

The exercises marked with a solution icon have detailed solutions in Appendix G.
1. There are four common coins in the United States: pennies (1 cent), nickels (5 cents), dimes (10 cents), and quarters (25 cents). (a) In how many ways can I form 29 cents? (b) How many distinguishable ways can I have a set of three of these four kinds of coins? Assume I have many of each kind of coin. Solve this in two ways: by using the appropriate counting principles and formulas and then by listing all the distinguishable collections of three coins. (c) If I randomly select three coins from a jar containing 12 coins (three coins of each of the four types), what is the probability that the coins I choose add to at least 29 cents?
2. Consider the random experiment of tossing one fair coin three times in succession and then rolling a single fair, six-sided die. Jake and his parents are carrying out this experiment to help them decide where to go on vacation this summer. (a) How many distinct outcomes are there in the sample space for the random experiment in this exercise? (b) Suppose that Jake gets to choose the vacation spot if the number of heads multiplied by the digit on the die is greater than 10. What is the probability that Jake's parents get to choose the vacation spot? (c) Now suppose that Jake gets to choose the vacation spot if there are at least two tails, but the digit on the die cannot be even. What is the probability that Jake gets to pick the vacation spot? (d) Finally, suppose that Jake's parents get to choose the vacation spot if no tails appear or the digit on the die is divisible by 3. What is the probability that Jake's parents do not get to pick the vacation spot?
3. Consider the random experiment of tossing eight fair coins simultaneously. (a) How many possible outcomes are there in this experiment? (b) What is the probability that at least five of the coins are heads? (c) What is the probability that the number of heads and the number of tails differ by at most 2? (d) What is the probability that not exactly two coins are tails?
4. Suppose that a movie theater places seven distinct positive integers not exceeding 60 on each movie ticket in no particular order. On the day you buy a ticket, the movie theater randomly selects seven distinct positive integers not exceeding 60 and awards a free movie to any person with a ticket containing all seven of these numbers. You receive a ticket containing these seven numbers on it: 7, 15, 16, 27, 44, 45, 49.
(a) In how many ways can the movie theater select the seven distinct positive integers not exceeding 60?
(b) What is the probability that you will win a free movie?
(c) What is the probability that you will have at most one of the numbers correct?
5. In order to win a prize at the school fair, you must choose exactly six out of seven distinct winning integers, where all numbers are between 10 and 99, inclusive. The order in which you select these integers does not matter, but each participant must choose seven numbers. Note that people who choose all seven of the winning integers do not qualify for a prize.
(a) In how many ways can seven numbers be chosen?
(b) What is the probability that a randomly chosen set of seven numbers will match exactly six of the winning numbers?
6. Suppose that at a raffle, you buy a ticket that contains three distinct integers greater than 150 but less than 200. Assume that the order in which these integers appear on the ticket does not matter. At the time of the raffle, the people running the raffle randomly choose eight of these viable integers. A person can win a raffle prize by having the three numbers on his or her raffle ticket be among the eight chosen by the people running the raffle. The raffle numbers on your ticket are the following: 151, 174, 199.
(a) How many distinct ways can the eight numbers be chosen by the people running the raffle?
(b) What is the probability that you will win a prize?
7. Mom has seven jobs that she needs completed. The jobs are doing the laundry, mowing the lawn, vacuuming, washing the windows, folding the towels, cleaning the garage, and running some errands. She will randomly assign you three of these jobs and will also assign a day on which each job must be completed. The jobs will be completed on Monday, Tuesday, and Wednesday. Assume that Mom will not ask you to do any job more than once in these three days.
(a) In how many ways can Mom assign the jobs for you to complete?
(b) What is the probability that you will have to fold the towels?
8. You are at a school carnival and are going to spin a colorful wheel six successive times. The equally distributed colors on this wheel are red, orange, yellow, and green. Having the wheel pointer land in the red section means that you win a cookie, while having the wheel pointer land in the orange, yellow, or green sections indicates winning the following prizes, respectively: a stuffed animal, a book, a piece of paper that says "better luck next time."
(a) How many spin combinations are possible, assuming that you spin the wheel six different times?
(b) What is the probability that you will win at least two stuffed animals (i.e., the wheel pointer lands in the orange section at least twice)?
(c) What is the probability that you will win exactly one cookie and exactly one book (i.e., the wheel pointer lands in the red section exactly once and in the yellow section exactly once)?
9. I will be sharing an apartment with two friends. The apartment has one large and one small bedroom. Two of us will share the large bedroom and the other will sleep in the smaller room.
(a) In how many distinguishable ways can we assign rooms?
(b) What is the probability that I get the single room, if we choose randomly?
10. In 1992 the Mattel toy company introduced a talking Teen Barbie doll. The company had compiled a collection of 270 possible expressions for Barbie to "speak." In order to make the dolls appear more individually distinct, they randomly chose 4 of the 270 possibilities to program into any particular doll. (a) How many distinct dolls could they manufacture? (b) Suppose one of the 270 statements is considered to be undesirable. How many distinct dolls contain the undesirable statement among their four exclamations? (c) What is the probability of getting a doll that speaks the undesirable phrase?
11. Compute the probabilities for the following crazy eights hands. (a) At least 3 eights (b) All the same suit (c) All clubs (d) All red cards (e) Four eights and three diamonds other than the eight
12. Compute the probabilities for the following crazy eights hands. (a) Exactly one ace (b) Exactly one ace and exactly two cards with a face value between 5 and 7, inclusive
13. Compute the probabilities for the following crazy eights hands. (a) No 5s but four cards of the same kind (b) Exactly two face cards and exactly two red cards that are not face cards (c) Exactly two face cards and exactly two red cards
14. Calculate the following six probabilities for a 5-card hand using a normal 52-card deck. (a) All five cards are diamonds. (b) All five cards are the same suit. (c) Containing two queens and no other duplicate face values (for example, two of the queens, together with a three, a seven, and a king). (d) Containing three queens and no other duplicate face values.
Chapter 6 Finite Probability Theory
(e) Containing three cards with the same face value and two cards that don't repeat previous face values, (f) Containing three cards with face value A and two cards with face value B (for example, 3 queens and 2 fours, or 3 tens and 2 fives). 15. Calculate the following probabilities for a 5-card hand using a normal 52-card deck. (a) Containing five distinct face values on the cards. (b) Containing a straight. A straight occurs when the five cards have five consecutive face values. One example of a straight is 8-9-10-jack-queen. Note that the ace can be used as either the lowest card or the highest card in a straight. 16. Calculate the following probabilities for a 5-card hand using a normal 52-card deck. (a) A royal flush. A royal flush is the 10, jack, king, queen, and ace within a single suit. (b) A straight flush. A straight flush is simply a straight (see Exercise 15) that occurs within a single suit. 17. Calculate the following probabilities for an 8-card hand using a normal 52-card deck. (a) Only face cards (b) No more than three black cards
18. Calculate the following probabilities for an 8-card hand using a normal 52-card deck. (a) Containing exactly two suits, with four cards from each suit (b) Containing exactly two suits, with no specification about how the cards are distributed between those suits
19. Calculate the following probabilities for an 8-card hand using a normal 52-card deck. (a) O Exactly two 2s, exactly three 3s, and exactly three cards with a third common face value (b) Four pairs. In other words, two of each of four distinct face values
20. Consider the following two-person games involving a pair of fair, six-sided dice. Use appropriate counting principles and probability computation formulas to derive the solutions. (a) O Player A rolls a pair of dice and x and y are assigned the respective values showing on the top faces of the dice. Player A wins if either x - 3 > 0 or y - 3 > 0; otherwise, player B wins. Which player is more likely to win? (b) Player A rolls a pair of dice. Player A wins if the sum of the digits on the dice is 8 or if both dice show odd digits; otherwise, player B wins. Which player is more likely to win? (c) Player A repeatedly rolls a pair of dice until one of the following two events occurs. If a 3 appears on at least one of the dice, player A wins. If the sum of the digits on the two dice is 8, player B wins. Which player is more likely to win? (d) Player A repeatedly rolls a pair of dice until one of the following two events occurs. If doubles occur, player A wins. If the sum of the digits on the two dice is either 3 or 9, player B wins. Which player is more likely to win?
21. Consider the random experiment of rolling three fair, six-sided dice. Compute the probability of the following events. (a) At least one die is a 1 or a 6. (b) At least two digits are even. (c) Doubles (i.e., at least two of the dice show the same value). (d) At most one of the digits on the dice is divisible by 3.
22. It has been determined that 7 out of a group of 11 teachers will be required to work at an after school homework help session. However, two of the teachers, Mrs. Henderson and Mr. Lay, also have an after school commitment with coaching the swim team, and so at most one of these two people can work at the homework help session. (a) In how many ways can the seven teachers be chosen to work at the after school homework help session? (b) What is the probability that Mrs. Henderson will work at the after school homework help session?
23. Coach Benson is going to choose 9 out of 16 people to fill the nine distinct positions on his baseball team. If Joe is assigned a position, it must be catcher. Additionally, if Joe is chosen for the team, then his friend Jeff must also be assigned a position. (a) In how many ways can Coach Benson fill the positions for his baseball team? (b) What is the probability that Jeff is assigned a position on the baseball team?
24. Seventeen men and 12 women are signed up to go on a ski trip. Unfortunately, there is only transportation available for 18 people to go. There is an additional requirement that at least 4 women attend the ski trip. (a) In how many ways can the people be chosen to attend the ski trip? (b) What is the probability that at least 12 men are chosen to attend the ski trip?
25. Suppose that you have decided that you will buy at least one, but no more than four items in a department store that sells 24 distinct product types, including T-shirts and jeans. You know that if you buy any T-shirt(s), you will also buy at least one pair of jeans. Assume that items within a product type are indistinguishable. (a) O How many acceptable distinct purchases of the items can you make at the department store, assuming that you may buy more than one item of any particular product type? (b) What is the probability that you will buy exactly one T-shirt?
26. Calculate the remaining probabilities that the attacker wins in a game of Risk. (This is not an easy task.)
6.4 Expected Value
It is time to extend the probability model that has been presented. The new concepts are the value of an outcome and random variables. One of the benefits of introducing these concepts will be a mathematical tool for analyzing sweepstakes and lotteries.
DEFINITION 6.10 Value of an Outcome To each outcome in a sample space, a real number may be associated. This number is called the value of the outcome. The value is a measure of the usefulness or desirability of the outcome.
The values of outcomes can be assigned by whatever criteria you wish. Often they are monetary values. They can also be assigned subjectively, as in the second example.
Free Tickets A radio station is giving away free tickets to a concert. There are three levels of seating quality: $20 seats, $15 seats, and $8 seats. If the station is giving away tickets at every quality level, the numbers 20, 15, 8, and 0 represent the value of being chosen (or not chosen) as the recipient of a ticket.
Candy Values Recall the candy jar in Example 6.13. I have given each flavor a rating from 1 to 10, with 10 being most favorable. The ratings are shown in Table 6.5.
TABLE 6.5 Candy ratings
Flavor        Value
Licorice      10
Cinnamon      8
Butterscotch  7
Watermelon    3
Notice that the probability of an outcome and the outcome's value are distinct concepts. If subjective values are used, the value of an outcome might change from person to person. For example, you might rate the four candy flavors differently.
Flipping Coins For the random experiment of flipping a coin, a variable X can be assigned one of the values 1 or 0, depending on whether the result of the flip was a head or a tail. There is no reason (at the moment) to prefer 1 and 0 to any other pair of distinct numbers. The choice of value is a feature of the model that we tailor to the problem at hand. For example, suppose the random experiment of flipping a coin is linked to a game in which I win 1 point if a head appears, but my opponent wins a point if a tail appears. I can think of my opponent winning a point as being equivalent to my losing a point (and the opponent not keeping a score). It thus makes sense to assign the value 1 to X if a head appears, and a value of -1 if a tail appears. If the game is played many times, I can keep a running sum of the values X attains to see who is winning.
DEFINITION 6.11 Random Variable A variable whose numeric value is assigned as the result of a random experiment is called a random variable. Random variables are traditionally denoted by uppercase letters, such as X and Y. The values of the random variable X are denoted by x or x_i.
Rolling a Die If the random experiment is rolling a die, the random variable X can be assigned one of the values {1, 2, 3, 4, 5, 6}, again depending on the result of the roll.
Picking a Piece of Candy If the random experiment is to reach into the candy jar and pull one piece out, the random variable X can be assigned the value of the flavor that is picked. Thus, if a cinnamon is chosen, X will have the value 8.
It is often useful to examine the average behavior of a random experiment. For example, if I were to roll a die many times, what number (on average) would appear? In other words, what is the average value of the random variable X, associated with this random experiment? One way to decide is to carry out an actual sequence of rolls, record the values of X after each roll, and then compute the average of these numbers. It is better to consider this theoretically. Suppose for the moment that the die is rolled 600 times. If the die is fair, then about 1/6 of the time X will be 1, about 1/6 of the time X will be 2, and so on. The theoretical average (footnote 17) is thus
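\[
\begin{aligned}
&\frac{\overbrace{1+\cdots+1}^{100\text{ times}}+\overbrace{2+\cdots+2}^{100\text{ times}}+\cdots+\overbrace{5+\cdots+5}^{100\text{ times}}+\overbrace{6+\cdots+6}^{100\text{ times}}}{600} \\
&\quad= \frac{1\cdot 100+2\cdot 100+3\cdot 100+4\cdot 100+5\cdot 100+6\cdot 100}{600} \\
&\quad= 1\cdot\frac{100}{600}+2\cdot\frac{100}{600}+3\cdot\frac{100}{600}+4\cdot\frac{100}{600}+5\cdot\frac{100}{600}+6\cdot\frac{100}{600} \\
&\quad= 1\cdot\frac{1}{6}+2\cdot\frac{1}{6}+3\cdot\frac{1}{6}+4\cdot\frac{1}{6}+5\cdot\frac{1}{6}+6\cdot\frac{1}{6} \\
&\quad= 1\cdot P(1)+2\cdot P(2)+3\cdot P(3)+4\cdot P(4)+5\cdot P(5)+6\cdot P(6) = 3.5.
\end{aligned}
\]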
Notice that the theoretical average value (3.5) is an impossible value. You can never roll a 3.5! What this number signifies is that after many rolls, the average value of X should be near 3.5. If you were to actually roll a die 600 times, you are not likely to roll exactly 100 of each number. You are likely to roll close to 100 of each number.
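The contrast between the theoretical average and an actual run of rolls can be sketched in Python. This is only an illustration, not part of the original development: the names theoretical_average and simulated_average, the fixed seed, and the use of a pseudorandom generator in place of a physical die are all assumptions made for the sketch.

```python
from fractions import Fraction
import random


def theoretical_average(values_and_probs):
    """Return the sum of value * probability over (value, probability) pairs."""
    return sum(value * prob for value, prob in values_and_probs)


# A fair die: each face 1..6 has probability 1/6.
fair_die = [(face, Fraction(1, 6)) for face in range(1, 7)]
die_average = theoretical_average(fair_die)


def simulated_average(num_rolls, seed=0):
    """Average of num_rolls simulated rolls of a fair die (the seed is arbitrary)."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(num_rolls)) / num_rolls


print(die_average)             # 7/2, that is, 3.5
print(simulated_average(600))  # depends on the seed, but should be near 3.5
```

The exact arithmetic with Fraction mirrors the hand computation, while the simulated value will generally differ slightly from 3.5, just as 600 real rolls would.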
Footnote 17: I have taken the liberty of sorting the values X assumes. This will not affect the result of the computation but makes the principles involved clearer.
Average Candy Value If every day I randomly pick a piece of candy from the jar in Examples 6.13 and 6.37, what should my average value be? It will be helpful to have the probabilities and values of each flavor (Table 6.6).
TABLE 6.6 Candy values and probabilities
Flavor        Probability  Value
Licorice      3/16         10
Cinnamon      1/4          8
Butterscotch  1/4          7
Watermelon    5/16         3
If I were to perform this random experiment every day for 160 days (each time replacing the flavor I had just picked), I would expect an average value of
\[
\begin{aligned}
&\frac{\overbrace{10+\cdots+10}^{30\text{ times}}+\overbrace{8+\cdots+8}^{40\text{ times}}+\overbrace{7+\cdots+7}^{40\text{ times}}+\overbrace{3+\cdots+3}^{50\text{ times}}}{160} \\
&\quad= \frac{10\cdot 30+8\cdot 40+7\cdot 40+3\cdot 50}{160} \\
&\quad= 10\cdot\frac{3}{16}+8\cdot\frac{1}{4}+7\cdot\frac{1}{4}+3\cdot\frac{5}{16} \\
&\quad= 10\cdot P(10)+8\cdot P(8)+7\cdot P(7)+3\cdot P(3) = 6.5625.
\end{aligned}
\]
Notice again that this is a theoretical average value. If I perform the experiment many times, I expect the average value to be near $6\frac{9}{16}$.
In both examples, the average value of the (respective) random variable X could be expressed as a sum of the values of X times the probabilities X attains those values. This computation is of sufficient importance to warrant a definition.
DEFINITION 6.12 Expected Value If X is a random variable, the expected value of X, denoted E(X), is defined to be
\[
E(X) = \sum x \cdot P(x),
\]
where the sum is understood to be over all possible values, x, of X (footnote a).
Footnote a: Another way to understand this sum is to add the product x · P(O) for each outcome O in the sample space, since X is assigned one of its values for each outcome.
Expected Candy Value As was demonstrated previously, the expected value for the candy jar is
\[
\sum x \cdot P(x) = 10\cdot P(10)+8\cdot P(8)+7\cdot P(7)+3\cdot P(3) = 10\cdot\frac{3}{16}+8\cdot\frac{1}{4}+7\cdot\frac{1}{4}+3\cdot\frac{5}{16} = 6.5625.
\]
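The definition translates almost directly into code. The following Python sketch is an illustration only: the function name expected_value and the choice of a dictionary keyed by the values of X are assumptions, and keying by value works here only because the four candy ratings happen to be distinct.

```python
from fractions import Fraction

# Candy ratings (values of X) and their probabilities, taken from Table 6.6.
candy = {
    10: Fraction(3, 16),  # licorice
    8: Fraction(1, 4),    # cinnamon
    7: Fraction(1, 4),    # butterscotch
    3: Fraction(5, 16),   # watermelon
}


def expected_value(distribution):
    """E(X) = sum of x * P(x); distribution maps each value x to P(x)."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in distribution.items())


candy_expected = expected_value(candy)
print(candy_expected, float(candy_expected))  # 105/16 6.5625
```

Checking that the probabilities sum to 1 before computing is a cheap guard against a mistyped table.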
1. Compute the expected value for the random variable X.
2. Compute the expected value for the random variable X.
x
P(x)
2
.30
4
.40
5
.10
6
.10
0
.10
10
.05
75
.15
20
.05
x
P(x)
100
.25 .20
-50
-25
.30
Expected values can be effectively used as one piece of information to analyze lotteries and sweepstakes. The idea is to define a random variable whose value corresponds to your winnings (or losses). The expected value of this random variable represents what you would win or lose if you were to play the game many times. Alternately, the expected value represents the combined average loss (or winnings) of all people who participate in the lottery. It will be easier to discuss lotteries and sweepstakes using the following definitions.
DEFINITION 6.13 Odds The odds of an event can be expressed either as a ratio of success to failure of occurrence, or as a ratio of success to total occurrences. That is, the odds of an event are expressed as either S : F or as S : T. The corresponding probabilities of success are $P(S) = \frac{S}{S+F}$ or $P(S) = \frac{S}{T}$, respectively. Note that T = S + F.
If a phrase such as "odds are 3 to 2 in favor" is used, the odds are being expressed as S : F = 3 : 2. If a phrase such as "odds are 5 to 2 against" is used, the odds are in the form F : S = 5 : 2. This is the traditional manner for expressing odds. However, most lotteries and sweepstakes use the S : T form. It is your responsibility to determine which version is being used in any particular context. The example that follows demonstrates one possible way to check.
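The three ways of reading a pair of odds can be made concrete with a short Python sketch. The function names below are hypothetical, chosen only for this illustration; each one simply applies the corresponding formula from Definition 6.13.

```python
from fractions import Fraction


def probability_from_s_to_f(successes, failures):
    """Odds given as S : F ("S to F in favor")."""
    return Fraction(successes, successes + failures)


def probability_from_s_to_t(successes, total):
    """Odds given as S : T (successes out of total occurrences)."""
    return Fraction(successes, total)


def probability_from_odds_against(failures, successes):
    """Odds given as F : S ("F to S against")."""
    return Fraction(successes, successes + failures)


in_favor = probability_from_s_to_f(3, 2)       # "odds are 3 to 2 in favor"
against = probability_from_odds_against(5, 2)  # "odds are 5 to 2 against"
print(in_favor, against)  # 3/5 2/7
```

The same pair of numbers, say 1 and 50, gives 1/51 under the S : F reading and 1/50 under the S : T reading, which is why the form must be settled before any expected value is computed.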
A Simple Lottery The Minnesota State Lottery sponsored the lottery shown in Figure 6.6.
Figure 6.6 Lakes and Loons lottery.
Prizes And Odds For LAKES AND LOONS (Based on 25,200,000 tickets sold)
If you get   You win       Approx. odds   Approx. number of winners
3 Bobbers    Free Ticket   1:8.33         3,024,000
3 Nets       $2            1:11.11        2,268,000
3 Cabins     $5            1:50           504,000
3 Stars      $10           1:125          201,600
3 Fish       $20           1:500          50,400
3 Boats      $50           1:1,200        21,000
3 Trees      $100          1:4,800        5,250
3 Hats       $500          1:12,000       2,100
3 Loons      $5,000        1:240,000      105
The average overall odds of winning a prize are approximately 1:4.15. The average odds of winning a cash prize are 1:8.26. The number of winners may vary based on sales, distribution and number of prizes claimed.
The first task, before the expected value can be computed by means of the definition, is to decide which version of odds is being used. Consider the third row. The ratio 1 : 50 is given, with 504,000 prizes available. Does this mean 1 chance in 50 of winning or 1 chance to win and 50 chances to lose? We can compare both versions to the total expected ticket sales:
S : T Assume the ratio is in the form S : T. Then the probability of winning is $\frac{S}{T} = \frac{1}{50}$. The number of winners for this prize is thus $\frac{1}{50} \cdot 25{,}200{,}000 = 504{,}000$, which agrees with the chart.
S : F Assume the ratio is in the form S : F. Then the probability of winning is $\frac{S}{S+F} = \frac{1}{1+50} = \frac{1}{51}$. The number of winners for this prize is thus $\frac{1}{51} \cdot 25{,}200{,}000 \approx 494{,}118$, which does not agree with the chart.
Therefore, the S : T form for expressing odds is being used, not the S : F form. It is also necessary to interpret the first prize and add the missing prize. Each ticket costs $1, so I will assume a free ticket is worth $1 (footnote 18). The missing prize is the one for which you win nothing. If we add the final column of the previous table and subtract the result from 25,200,000, we can compute the approximate number of losers (people who win nothing). In this case, there are approximately 19,123,545 losers. With this settled, we can produce Table 6.7, which summarizes the values and probabilities for the random variable X associated with this lottery. The final column contains the product of the values and their respective (approximate) probabilities. You should compute the values of a few rows yourself, to make sure you understand how I did it.
TABLE 6.7 Expected value for Lakes and Loons
x      P(x)         x · P(x)
0      .758870833   .00000
1      .120048019   .12000
2      .090009000   .18000
5      .020000000   .10000
10     .008000000   .08000
20     .002000000   .04000
50     .000833333   .04167
100    .000208333   .02083
500    .000083333   .04167
5000   .000004167   .02083
Σ x · P(x) = .6450
The (approximate) expected value is the sum of the final column. This is .645, or 64.5 cents. After subtracting the dollar to buy the ticket, we observe that, on average, you lose 35.5 cents every time you buy a lottery ticket. In other words, if all the winnings and losses for all participants were totaled, the average gain (including the cost of the ticket) is -35.5 cents. In this case, it is more accurate to speak of an "average loss" rather than an "average gain." At 25,200,000 tickets, this represents a consumer loss of about $8,946,000 (a number that will now be computed).
Footnote 18: You won't be allowed to cash the free, unscratched ticket in for a dollar. I am making a simplifying assumption here: You win a dollar, which is immediately spent on another ticket.
There are 105 lucky first prize tickets that pay (after the ticket cost is subtracted) a total of $524,895, contributing to a total cash payout of $13,230,000. The total (prepayout) income is computed as follows:
Total number of tickets     25,200,000
Number of free tickets      -3,024,000
Total income               $22,176,000.
Since (total consumer loss) = (total income) - (cash payout), the total consumer loss is $22,176,000 - $13,230,000 = $8,946,000. Of course, no one will ever lose 35.5 cents on a ticket. They will either lose a dollar, win a free ticket, or win some multiple of a dollar. However, there will be 19,123,545 times someone immediately loses a dollar.
Using the following definition, it should be clear that the Lakes and Loons lottery was not a fair game since the game cost $1 to play but E(X) = $.645.
DEFINITION 6.14 Fair Game A game of chance having an associated random variable X is called a fair game if E(X) = the cost of playing the game.
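The Lakes and Loons figures can be rechecked with a few lines of Python. This is a sketch under stated assumptions: the variable names are invented for the illustration, a free ticket is valued at $1 as in the example, and probabilities are taken from the published winner counts rather than from the rounded odds, so P(1) comes out as exactly .12 instead of .120048019.

```python
from fractions import Fraction

TICKETS_SOLD = 25_200_000
TICKET_COST = 1  # dollars

# Prize value in dollars -> approximate number of winners (Figure 6.6).
winners_by_prize = {
    1: 3_024_000,  # free ticket, valued at $1
    2: 2_268_000,
    5: 504_000,
    10: 201_600,
    20: 50_400,
    50: 21_000,
    100: 5_250,
    500: 2_100,
    5000: 105,
}

losers = TICKETS_SOLD - sum(winners_by_prize.values())

# E(X) = sum of x * P(x), with P(x) = (number of winners) / (tickets sold).
expected = sum(
    Fraction(prize * count, TICKETS_SOLD)
    for prize, count in winners_by_prize.items()
)

is_fair = expected == TICKET_COST       # Definition 6.14
average_gain = expected - TICKET_COST   # negative means an average loss
total_consumer_loss = -average_gain * TICKETS_SOLD

print(losers)                  # 19123545
print(float(expected))         # 0.645
print(is_fair)                 # False
print(total_consumer_loss)     # 8946000
```

The total consumer loss obtained this way agrees with the income-minus-payout computation in the example.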
The Reader's Digest Sweepstakes The Reader's Digest sweepstakes has been around for many years. The prizes and estimated S : T odds (footnote 19) for one sweepstakes are listed in Table 6.8 (with bonus prizes omitted).
TABLE 6.8 A Reader's Digest sweepstakes
x (Prize)    Number of Prizes   Odds               P(x)              x · P(x)
$5,000,000   1                  1 : 201,000,000    4.97512 × 10^-9   0.024876
$150,000     1                  1 : 201,000,000    4.97512 × 10^-9   0.000746
$100,000     1                  1 : 201,000,000    4.97512 × 10^-9   0.000498
$25,000      2                  1 : 100,500,000    9.95025 × 10^-9   0.000249
$10,000      4                  1 : 50,250,000     1.99005 × 10^-8   0.000199
$5,000       8                  1 : 25,125,000     3.98010 × 10^-8   0.000199
$200         25                 1 : 8,040,000      1.24378 × 10^-7   0.000025
$125         200                1 : 1,005,000      9.95025 × 10^-7   0.000124
$89          53,259             1 : 3,774          2.64971 × 10^-4   0.023582
$0           200,946,499        1 : 1.00027        0.99973           0.000000
Σ x · P(x) = 0.050498
The odds for losing are 200,946,499 : 201,000,000, which is approximately 1 : 1.00027. Notice that the probability of losing is over 99.9%! This explains why most people reading this book may not know anyone who has won a prize in this sweepstakes. At the time this sweepstakes was run, it cost 29 cents to enter (a first class stamp). The expected value is a bit over 5 cents, so the average loss was approximately 24 cents. The actual losses were 29 cents; 53,501 lucky people actually gained at least $89 (footnote 20). The expected number of entries (201,000,000) indicates that many people don't mind paying for a stamp in return for a (small) chance to win a much larger prize.
Footnote 19: The actual odds depend on how many entries the publisher receives. The estimates are based on the number of contestants in previous sweepstakes.
Quick Check 6.11
1. A state lottery has posted the following S : F odds. There are seven prizes.
Prize   Odds
$500    1 : 99
$200    2 : 98
$100    4 : 96
(a) Compute the expected value. (b) It costs $20 to buy a ticket. Is this a fair game? (c) What is the probability of losing? (d) How much profit does the state expect to make?
2. The 1992 Reader's Digest sweepstakes lists the following prizes and odds. Assuming there are 199,500,000 entries, compute the expected value and compare it to the sweepstakes in Example 6.44. Extra credit: There is an inconsistency in the published figures. What is it?
x (Prize)    Number of Prizes   Odds
$5,000,000   1                  1 : 199,500,000
$100,000     1                  1 : 199,500,000
$25,000      3                  1 : 66,500,000
$10,000      5                  1 : 39,900,000
$5,000       10                 1 : 19,950,000
$2,500       50                 1 : 3,990,000
$120         250                1 : 665,000
$109         53,395             1 : 3,736
$0           ?                  1 : ?
6.4.1 Exercises
The exercises marked with O have detailed solutions in Appendix G.
1. O What do the ratios 1 : 4.15 and 1 : 8.26 signify in Example 6.43? How can these probabilities be computed directly from the prize table?
2. Assign values to the following prizes. (a) Two tickets to the Super Bowl (b) Two tickets to Hawaii (c) A college degree (d) A close friend (e) Good health How did you arrive at these numbers?
3. The proponents of state lotteries emphasize the following reasons for having a lottery. (a) A lottery provides entertainment. (b) The lottery generates revenue for the state, thus lowering taxes. (Most states predesignate where their share of the profits will be spent: for example, education, environmental protection, or road repair.)
(c) Legal gambling should make illegal gambling less desirable. The opponents of state lotteries emphasize these points: (a) The state will need to spend more money for social programs (such as treatment for chronic gamblers). (b) The lottery won't generate as much revenue as promised. (A substantial portion of the profit is used to administer the lottery itself.) (c) The money won't be used as promised. For example, if the money is designated for education, a corresponding amount of revenue from other sources will be diverted away from education. In other cases, much of the money may be used for other purposes. Which arguments do you find most convincing? (Write several paragraphs to support your answer.)
Footnote 20: The $89 prize was a watch, so the winners of that prize didn't actually win money.
4. Design a spreadsheet model for analyzing lotteries and sweepstakes. The model should contain the following features.
(a) The data to be entered will be i. The cost to participate ii. The values of the various prizes (excluding losing) (b) The values produced will include i. The probability of winning each prize ii. The probability of losing (displayed as a separate summary value) iii. The expected value of the game iv. The average gain or loss (c) You may assume that there are at most 25 prizes. (d) You may also assume that odds are computed in the form S : T but are expressed in the reduced form 1 : T (that is, with a numerator of 1).
5. Compute the expected values for the random variables listed below.
(a) O
X    P(X)
2    .35
5    .40
8    .25
(b)
Y     P(Y)
-10   .125
-5    .25
0     .25
10    .125
(c)
X       P(X)
.524    .138
.913    .322
1.127   .045
2.418   .285
3.212   .160
-8.2    .050
6. You are at the fair and are about to spin a colorful wheel in an attempt to win a prize. If the spinner lands in the green sector of the wheel, you win a cookie, worth $1. Similarly, having the spinner land in the yellow, red, and blue sectors gives you the following prizes, respectively: a toy car, worth $3; a stuffed animal, worth $5; a $10 bill. The central angles in the colored sectors on the wheel have been measured and are recorded here: green: 180 degrees; yellow: 130 degrees; red: 35 degrees; blue: 15 degrees. It costs you $2.75 to play this game once (i.e., spin the wheel once). What is the expected value of the game? Is it a fair game?
7. O A local club is holding a raffle for a used car valued at $500. They are selling 2000 tickets for $2 per ticket. What is the expected value of this raffle? What is the probability of losing? What is the average gain or loss (for the ticket purchasers)?
8. It costs $2.50 to buy one of four sealed envelopes. Two of the envelopes each contain one dollar. The third envelope contains three dollars, and the fourth contains five dollars. Assuming that each envelope is equally likely to be picked, what is the expected value of the game? Is it a fair game?
9. Suppose that you are playing a card game at one of the school carnival booths. The cost of this game is $0.90. To play, you draw two cards from a standard 52-card deck. If you obtain two face cards, you win $8. However, if you get two aces, you owe $0.50 (in addition to the cost of the game). Drawing an ace and a card with a number from 2 to 10, inclusive, on it will allow you to win $2.50, while choosing two cards with a number from 2 to 10, inclusive, on each of them will give you $0.50. In all other cases, nothing happens. What is the expected value of the game? Is it a fair game?
10. There are three types of problems on your final exam. Each of the 55 T/F questions is worth 2 points, while each of the 65 multiple choice questions is worth 3 points and each of the 27 fill-in-the-blank problems is worth 4 points. The probability that class member Jana will answer a T/F question correctly is .94, while the probability that she will answer a multiple choice question correctly is .89. Jana is twice as likely to get a T/F question correct as a fill-in-the-blank problem. Assuming that there is no partial credit for answers given, what is Jana's expected score on her final?
11. Another way to deal with the "free ticket" prize in the Lakes & Loons lottery is to assign it the value of E(X) instead of the value $1. (a) Write an equation that expresses this relationship. You may use the table in Example 6.43 to save time. (b) Solve the equation for E(X). (c) Is the new value for E(X) larger or smaller than the old value? Does this make sense?
12. A loaded die has the following probabilities:
Face   Probability
1      .1
2      .125
3      .125
4      .125
5      .125
6      .4
What is the expected value of a roll? What is the expected value of a roll with a fair die?
13. Consider the random experiment of rolling a pair of six-sided dice. Let X be the random variable whose value is the sum of the digits on the dice. (a) Find the expected value of X, assuming that the two dice are fair. (b) Suppose that any sum greater than 9 is three times more likely to occur than any sum less than or equal to 9. Compute the expected value of X. (c) Calculate the expected value of X, assuming that each die is biased so that a 1 is twice as likely to come up as any other digit.
14. O Suppose that a seamstress estimates that next year she will make 5,000 shirts of a particular style to sell. Because of the variation in production costs and in the price at which she can sell her shirts, her profit per shirt may vary (see the probabilities given in the following table).
Profit Per Shirt   Probability
-$2                .30
$0                 .23
$1                 .19
$2                 .10
$5                 .11
$7                 .07
Estimate the profit on the 5,000 shirts.
15. There are 23 students in a classroom. Let S be the random variable whose value is a randomly chosen test score selected from the student scores for the past math test. The following table shows the distribution.
Test Score (%)   # of Students
70               6
75               1
82               7
86               2
94               3
99               4
(a) If all students are equally likely to be chosen, what is the expected value of S? (b) If students scoring 99% are 3 times as likely to be chosen as students scoring 94%, students scoring 94% are 4 times as likely to be chosen as are students scoring in the 80s, students scoring in the 80s are twice as likely to be selected as students scoring 75%, and students scoring 75% and 70% are equally likely to be chosen, what is the expected value of S?
16. O Let D be the random variable whose values are 28, 30, or 31, depending on whether a randomly chosen (nonleap year) month contains 28, 30, or 31 days. (a) If months are equally likely to be chosen, what is the expected value of D? (b) If months that start with the letter "J" are twice as likely to be chosen as other months, what is the expected value of D?
17. Consider the following definition of gambling [36]. A gamble is a reallocation of wealth, on the basis of deliberate risk, involving gain to one party and loss to another, usually without the introduction of productive work on either side. The determining process always involves an element of chance and may be only chance. (a) In a gambling situation (as defined previously), what does expected value quantify? (b) What is the significance of the logic operator and in the phrase "gain to one party and loss to another"?
18. Describe any differences between sweepstakes and lotteries. (Does the previous exercise offer any insight?)
19. A tire manufacturer has introduced a new 50,000-mile tire. Their testing indicates that about 5% of the tires will wear out before 45,000 miles. About 15% will wear out between 45,000 and 50,000 miles, while 50% will wear out between 50,000 and 55,000 miles. About 25% will wear out between 55,000 and 60,000 miles. The remaining 5% will wear out between 60,000 and 65,000 miles. What is the approximate expected tire life (in miles)? What assumptions have you made?
20. A lottery advertises three levels of prize money. Your lottery card has five numbers on it. You win if three or more of the numbers match. The prizes and odds are listed.
Match    Prize      Odds (S : T)
5 of 5   $100,000   1 : 575,757
4 of 5   $250       1 : 3,386.8
3 of 5   $10        1 : 102.6
(a) Compute the expected value of this game. (b) Compute the probability of losing. (c) If tickets cost $1, compute the average loss.
21. O A lottery charges $5 per entry. In the lottery, the participants choose six digits (in order, with repetition). Only one person is allowed to choose any particular six-digit number. The lottery officials will award $5,000,000 to the person who selects the winning combination of numbers. Find the expected value and probability of losing. Is this a fair game? State the assumptions you made in order to solve this problem.
22. The "Daily 3" lottery consists of four games. The participants pay $1, choose a game, and then choose (in most cases) a three-digit number. The winning three-digit number is picked in the evening. The winning number may contain repeated digits (for example, 533 is a possible winning number). The games, odds, and payouts are listed.
6-way box: You must pick three distinct digits. You win if a permutation of your three digits matches the winning number. Notice that if 533 is the winning number, no one can win this game.
Straight: You win if your number matches the winning number exactly. You may repeat digits.
3-way box: Two of your digits must be the same (and the other different from them). You win if a permutation of your number matches the winning number.
Front pair: You choose the front two numbers (leaving the third number unspecified). If your two front numbers match those positions in the winning number exactly, you win. You may repeat digits.
Game         Odds (S : T)   Payout
Straight     1 : 1000       $500
3-way box    1 : 333.33     $160
Front pair   1 : 100        $50
For each game, (a) Show how the odds were computed. (This requires material from Chapter 5.) What assumptions have you made? (b) Compute the expected value. (c) Compute the average loss. (d) Compute the net profit (for the state and administrators) if one million people play the game.
6.5 Bayes's Theorem
Conditional probabilities enable us to revise probability estimates. The knowledge that one event has occurred often causes the probability of another event's occurrence to change (footnote 21). Sometimes we know the numeric value of a conditional probability in one direction (perhaps P(A | B)) but actually need the numeric probability in the other order (P(B | A)).
Diagnosing Tuberculosis A convenient test for tuberculosis is the intermediate-strength purified protein derivative (PPD) Mantoux skin test. This test is less expensive than a chest X-ray but is less reliable. The test sometimes predicts a patient has tuberculosis when in fact he or she doesn't. It also occasionally predicts someone is healthy when in fact that person does have tuberculosis. Let T be the event "the patient has tuberculosis." Then $\overline{T}$ is the event "the patient does not have tuberculosis." Let W be the event "Warning: the PPD test predicts the patient has tuberculosis," and $\overline{W}$ be the complementary event "the PPD test predicts the patient does not have tuberculosis." Clinical studies have been done with two groups of people. The first group contains people known (by other means) to have the disease. The second group contains people who are known (again by other means) to be free of tuberculosis. The results of these clinical studies produced the following approximate, empirical conditional probabilities [64]:
\[
P(W \mid T) = .775 \qquad P(W \mid \overline{T}) = .15
\]
and consequently (using computation formula 6)
\[
P(\overline{W} \mid T) = .225 \qquad P(\overline{W} \mid \overline{T}) = .85.
\]
This adds credence to the test but does not answer the questions patients care about the most: What are P(T I W) and P(T I W)? The solution will be presented after some additional theoretical development. U Computation formulas 3 and 4 from Section 6.2.2 can be used to reorder conditional probabilities in a useful manner. Rearranging formula 4 gives P(A n B) P(B IA) =-- P(A) nB Using the other part of the same formula produces the next refinement: P(B) .P(A IB) P(B IkI) A) P(A P(A) Formula 3 implies P(A) = P(A n B) + P(A n B). If formula 4 is once again applied to each of the previous summands, P(A) = P(B) • P(A I B) + P(B) • P(A 1B). Putting these equations together leads to Bayes's theorem. 21 See Example 6.15 on page 266 for a quick review.
Bayes's Theorem
Let $A$ and $B$ be events. Then

\[ P(B \mid A) = \frac{P(B) \cdot P(A \mid B)}{P(A)} = \frac{P(B) \cdot P(A \mid B)}{P(B) \cdot P(A \mid B) + P(\overline{B}) \cdot P(A \mid \overline{B})}. \]
Since $B$ and $\overline{B}$ are mutually exclusive and their union is the entire sample space, the events in Bayes's theorem can be visualized similar to one of the diagrams in Figure 6.7. (See footnote 22.)

Figure 6.7 Events in Bayes's theorem.
Diagnosing Tuberculosis (continued)
In order to apply Bayes's theorem, we need $P(T)$. According to [64], $P(T) \approx .002$. Bayes's theorem thus implies

\[ P(T \mid W) = \frac{P(T) \cdot P(W \mid T)}{P(T) \cdot P(W \mid T) + P(\overline{T}) \cdot P(W \mid \overline{T})} = \frac{.002 \cdot .775}{.002 \cdot .775 + .998 \cdot .15} \approx .0102. \]

Similarly,

\[ P(T \mid \overline{W}) = \frac{P(T) \cdot P(\overline{W} \mid T)}{P(T) \cdot P(\overline{W} \mid T) + P(\overline{T}) \cdot P(\overline{W} \mid \overline{T})} = \frac{.002 \cdot .225}{.002 \cdot .225 + .998 \cdot .85} \approx .0005. \]

A test result without a warning is very reassuring: only about .05% of the people who receive no warning actually have the disease ($P(T \mid \overline{W})$). However, about 99% of those given a warning do not have the disease ($1 - P(T \mid W)$). These people require additional tests, generally an X-ray, to verify that they are indeed free of tuberculosis. About 15% of the people without tuberculosis will have a false warning ($P(W \mid \overline{T})$).
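The arithmetic in the tuberculosis example can be checked with a few lines of Python. This is only a sketch: the function name `bayes_posterior` and its parameter names are illustrative choices, not notation from the text, and the probabilities are the approximate values quoted from [64].

```python
def bayes_posterior(prior, p_evidence_given_event, p_evidence_given_complement):
    """Return P(B | A) from P(B), P(A | B), and P(A | not B)."""
    numerator = prior * p_evidence_given_event
    # P(A) = P(B) * P(A | B) + P(not B) * P(A | not B)
    p_evidence = numerator + (1 - prior) * p_evidence_given_complement
    return numerator / p_evidence


# Values from the tuberculosis example.
P_T = 0.002
p_t_given_warning = bayes_posterior(P_T, 0.775, 0.15)      # P(T | W)
p_t_given_no_warning = bayes_posterior(P_T, 0.225, 0.85)   # P(T | not W)

print(f"{p_t_given_warning:.4f}")      # 0.0102
print(f"{p_t_given_no_warning:.4f}")   # 0.0005
```

Changing `P_T` shows how strongly the answer depends on how rare the disease is in the population being tested.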
Quick Check 6.12
1. Calculate $P(B \mid A)$ if $P(A) = .6$, $P(B) = .4$, and $P(A \mid B) = .8$.
2. Calculate $P(D \mid J)$ if $P(J \mid \overline{D}) = .40$, $P(D) = .75$, and $P(J \mid D) = .20$.
There is a more general version of Bayes's theorem. It assumes that the sample space, $S$, can be expressed as a union of mutually exclusive events, $B_1, B_2, \ldots, B_n$. In that case, computation formula 3 implies

\[ P(A \cap S) = P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_n). \]

22 The diagrams assume that neither A nor B is the entire sample space.
Generalized Bayes's Theorem
Suppose the events $B_1, B_2, \ldots, B_n$ are mutually exclusive and their union is the entire sample space. Then for $1 \le i \le n$,

\[ P(B_i \mid A) = \frac{P(B_i) \cdot P(A \mid B_i)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + \cdots + P(B_n) \cdot P(A \mid B_n)}. \]

Figure 6.8 Bayes's theorem with n = 4.
Figure 6.8 shows one of many possible ways that the events might be related, assuming n = 4.
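The generalized theorem translates directly into a short computation. The sketch below is one possible rendering, not a prescribed method: `generalized_bayes` is a hypothetical helper name, and the four priors and likelihoods are made-up numbers chosen only to illustrate a partition with n = 4. Exact fractions are used so that the requirement that the priors sum to 1 can be tested without rounding trouble.

```python
from fractions import Fraction


def generalized_bayes(priors, likelihoods):
    """Return [P(B_i | A)] given [P(B_i)] and [P(A | B_i)] for a partition B_1..B_n."""
    if sum(priors) != 1:
        raise ValueError("the priors P(B_i) must sum to 1 for a partition")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i) * P(A | B_i)
    p_a = sum(joint)                                       # the denominator, P(A)
    return [j / p_a for j in joint]


# Hypothetical numbers with n = 4 (not taken from the text).
priors = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(2, 5), Fraction(4, 5)]
posteriors = generalized_bayes(priors, likelihoods)

print(posteriors)
print(sum(posteriors))  # 1
```

The posteriors always add to 1 because each one is a share of the same denominator.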
Example 6.47 Monty Hall
A problem whose solution has generated many heated discussions is the "Monty Hall" problem. The problem involves a television game show in which a contestant is presented with three doors. Behind one door is an expensive prize (perhaps a four-year scholarship to a famous college). Behind the other two doors are inexpensive consolation prizes (perhaps coffee cups bearing the show's logo). The host knows which door conceals the expensive prize; the contestant does not. The contestant chooses a door, which remains closed. The host opens one of the other doors, displaying one of the consolation prizes. The host decides which door to open in one of two ways:

* If the contestant has chosen the door concealing the expensive prize, the host uses a coin flip to randomly choose one of the remaining doors. (See footnote 23.)
* If the contestant has chosen a door concealing a consolation prize, the host chooses the door that conceals the other consolation prize.

At this time, the contestant is offered the opportunity to switch doors. Let $S_i$ be the event "the scholarship is behind door $i$." Let $C_j$ be the event "the host reveals a cup behind door $j$." We assume that $P(S_i) = 1/3$ for $i = 1, 2,$ and $3$. Suppose the contestant has chosen door 1 and the host has opened door 3, revealing a coffee cup. Is the contestant better off (on average) changing doors or sticking with the initial choice?

Clearly, $P(C_3 \mid S_2) = 1$ and $P(C_3 \mid S_3) = 0$, since the host must open a door containing a cup. Also, $P(C_3 \mid S_1) = 1/2$, since the host flips a coin in this case. We are interested in $P(S_2 \mid C_3)$. The generalized Bayes's theorem provides the value:

\[ P(S_2 \mid C_3) = \frac{P(S_2) \cdot P(C_3 \mid S_2)}{P(S_1) \cdot P(C_3 \mid S_1) + P(S_2) \cdot P(C_3 \mid S_2) + P(S_3) \cdot P(C_3 \mid S_3)} = \frac{\frac{1}{3} \cdot 1}{\frac{1}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot 1 + \frac{1}{3} \cdot 0} = \frac{2}{3}. \]

The contestant should change doors. Notice that from the host's perspective, $P(S_2)$ is either 1 or 0, depending on whether the scholarship is or is not behind door 2. The contestant, however, has only partial knowledge. From her perspective, the scholarship has a probability of 2/3 of being behind door 2. (See footnote 24.)

23 This has been added here to eliminate some of the controversy. If there is any ambiguity about how the host selects a door, the problem will not have a clear solution.
24 See the extended footnote in Example 6.15 for a similar discussion.
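Readers who remain skeptical of the 2/3 answer may find a simulation persuasive. The sketch below plays the game many times under the host rule described in the Monty Hall example (a coin flip only when the contestant's door hides the scholarship), keeps the rounds in which the host opens door 3, and reports the fraction of those rounds with the scholarship behind door 2. The function name, trial count, and seed are arbitrary illustrative choices.

```python
import random


def estimate_switch_win(trials=100_000, seed=0):
    """Estimate P(S2 | C3) when the contestant picks door 1 and the host opens door 3."""
    rng = random.Random(seed)
    host_opened_3 = 0
    prize_behind_2 = 0
    for _ in range(trials):
        prize = rng.choice([1, 2, 3])          # P(S_i) = 1/3 for each door
        if prize == 1:
            opened = rng.choice([2, 3])        # the host's coin flip
        else:
            opened = 3 if prize == 2 else 2    # the only other door hiding a cup
        if opened == 3:                        # keep only rounds where C3 occurs
            host_opened_3 += 1
            if prize == 2:
                prize_behind_2 += 1
    return prize_behind_2 / host_opened_3


print(estimate_switch_win())  # close to 2/3
```

Replacing the host rule inside the loop is a quick way to explore variations of the problem in which the host behaves differently.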
Quick Check 6.13
1. Compute $P(S_1 \mid C_3)$ in Example 6.47.
2. Suppose $P(A \mid B_1) = 1/4$, $P(A \mid B_2) = 1/8$, and $P(A \mid B_3) = 5/8$. If $P(B_1) = 1/3$, $P(B_2) = 1/2$, and $P(B_3) = 1/6$, what are
(a) $P(B_1 \mid A)$ (b) $P(B_2 \mid A)$ (c) $P(B_3 \mid A)$
6.5.1 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.

1. [D] Compute the following conditional probabilities from Example 6.45.
(a) $P(\overline{T} \mid W)$ (b) $P(\overline{T} \mid \overline{W})$

2. You have invented a lie detector machine and have presented it to the court system for use. Suspects are asked whether or not they have committed a particular crime, and the machine can detect whether or not they are lying. Your invention has the property that 89% of guilty suspects in the court of law are properly judged. However, innocent suspects are incorrectly judged 1.75% of the time.
(a) Suppose that a suspect is randomly selected from a group of suspects in which it is known that 10% of the people have committed a crime. The lie detector machine indicates that this person is guilty. What is the probability that this person is actually innocent?
(b) Suppose now that a suspect is randomly selected from a group of suspects in which it is known that 17% of the people have committed a crime. The lie detector machine indicates that this person is innocent. What is the probability that this person is actually guilty?

3. The 1988 Information Please Almanac [46, page 799] offers information concerning the number of female and male arrests for serious crimes in the United States in 1986, categorized by sex and age. According to this source, the total number of arrests for serious crimes in the United States in 1986 was 2,167,071. Of the people arrested, 1,709,919 were male and 457,152 were female. There were 516,494 arrested males under the age of 18, while 124,911 of the arrested females were under 18. Let $M$ represent the event "male," $F$ represent the event "female," and $U18$ represent the event "under 18." Calculate the following conditional probabilities.
(a) [D] $P(M \mid U18)$ (b) $P(F \mid U18)$ (c) $P(M \mid \overline{U18})$ (d) $P(F \mid \overline{U18})$

4. Suppose that the weather can either be Hot, Mild, or Cold. It is either Sunny or Rainy. Approximately 50% of the days in a year are Mild, with 25% of the days Hot. It rains on about 20% of the days. The following probabilities are also known: $P(S \mid M) = .9$, $P(S \mid H) = .8$, $P(S \mid C) = .6$, and $P(M \mid R) = .25$. Compute $P(M \mid S)$, $P(H \mid S)$, $P(C \mid S)$, and $P(R \mid M)$.

5. The 1988 Information Please Almanac [46, page 615] indicates the following empirical probabilities related to the years of education (8 or less, 9-11, 12, 13-15, or 16 or more) of the people of voting age at the time of the 1984 presidential election in the United States and whether or not these people reported that they voted.
P(8 years or less) = .121
P(9-11 years) = .130
P(12 years) = .399
P(13-15 years) = .182
P(16 years or more) = .168
P(Did not vote | 8 years or less) = .571
P(Did not vote | 9-11 years) = .556
P(Did not vote | 12 years) = .413
P(Did not vote | 13-15 years) = .325
P(Did not vote | 16 years or more) = .209
Compute the conditional probabilities listed.
(a) P(8 years or less | Voted) (b) P(9-11 years | Voted) (c) P(12 years | Voted) (d) P(13-15 years | Voted) (e) P(16 years or more | Voted)

6. In a small kitchen appliance assembly plant, there are three types of products: Electric Can Openers make up 27% of the production, while Microwaves and Toasters make up 55% and 18% of the production, respectively. Not all of the products assembled at this plant work correctly. In fact, it is unfortunate that 1 in 10 Electric Can Openers, 1 in 5 Microwaves, and 25% of Toasters are Defective. Compute the following probabilities and then answer the questions.
(a) $P(M \mid D)$ (b) $P(T \mid D)$ (c) $P(E \mid D)$ (d) $P(D)$
(e) Should the plant management be concerned?
(f) Suppose that a product assembled at this plant is randomly selected. What is the meaning of the probabilities in parts (a), (b), and (c)?

7. Design a spreadsheet model for Bayes's theorem.

8. Design a spreadsheet model for the generalized Bayes's theorem.

9. A symphony orchestra schedules its musical offerings in the following proportions: 30% Baroque, 40% Classical, 20% Romantic, and 10% Modern. The resident Director almost always directs the more modern music, with Guest conductors more frequently directing older music. More specifically, for the past few years the approximate probabilities have been $P(G \mid B) = .4$, $P(G \mid C) = .25$, $P(G \mid R) = .2$, $P(G \mid M) = .1$. Compute
(a) $P(B \mid G)$ (b) [D] $P(C \mid G)$ (c) [D] $P(R \mid D)$ (d) $P(M \mid D)$ (e) $P(G)$ (f) $P(D)$

10. The boss of a large company has decided to award his employees with an all-expense-paid trip to Florida. He has 225 employees, each of which will be assigned accommodations at one of the following hotels: Gator Inn, Everglades Inn, Beachside Suites, Sea Breeze Resort. More specifically, 47 employees will be assigned rooms at the Gator Inn, while 59, 79, and 40 employees will be assigned rooms at the Everglades Inn, Beachside Suites, and Sea Breeze Resort, respectively. It is known that the showers do not have hot water in 3% of the rooms at the Gator Inn, in 1.5% of the rooms at the Beachside Suites, in 5% of the rooms at Sea Breeze Resort, and in 2.5% of the rooms at the Everglades Inn. What is the probability that
(a) an employee will be assigned a room with a shower that has hot water?
(b) an individual who has been assigned a room with a shower that has no hot water is staying at Sea Breeze Resort?
(c) an individual who has been assigned a room with a shower that has hot water is staying at the Everglades Inn?
(d) an individual who has been assigned a room with a shower that has no hot water is staying at the Beachside Suites?
(e) an individual who has been assigned a room with a shower that has hot water is staying at the Gator Inn?

11. [D] Both children (that is, people under the age of 18) and adults attended a music concert, with 33% of the people being children. As an added bonus for attending the concert, each guest got to choose exactly one of the following gifts: a ticket to the next concert, a poster autographed by the band members, a video, a T-shirt. Among the adults, 65% chose the ticket, while 21% chose the T-shirt, 10% the video, and 4% the poster. 32% of the children chose the ticket, while 31% chose the video, 9% the poster, and 28% the T-shirt. What is the probability that a randomly selected person from the concert who
(a) chose the video is an adult?
(b) chose the ticket is a child?
(c) chose the poster is an adult?
(d) chose the T-shirt is a child?

12. Suppose that five workers at a bakery are charged with the duty of stamping the expiration date on the wrapper for each loaf of bread to be sold. Janelle, who is given 22% of the loaves to stamp, fails to stamp the expiration date once in every 100 loaves; Amy, who is given 14% of the loaves to stamp, fails to stamp the expiration date twice in every 99 loaves; Mary, who is given 18% of the loaves to stamp, fails to stamp the expiration date three times in every 88 loaves; Sam, who is given 27% of the loaves to stamp, stamps the expiration date on 45 loaves out of every 50; and Pamela stamps the expiration date on 8 loaves out of every 10.
(a) Suppose that a customer received his or her loaf of bread and complained that it was missing an expiration date stamp. What is the probability that this was a loaf given to Pamela to stamp (i.e., Pamela forgot to place the date on the loaf)?
(b) Suppose that it is uncertain whether or not Amy really fails to stamp the expiration date twice in every 99 loaves. However, it has been determined that if a loaf of bread has no expiration date, the probability that it was Amy who failed to do this is .23. Was the original conclusion that Amy fails to stamp the expiration date twice in every 99 loaves correct? If not, give a new estimate for Amy's rate of stamping the expiration date.

13. The 1988 Information Please Almanac [46, page 66] offers data on the full-time status of the United States civilian labor force in 1986. The civilian labor force can be divided into three distinct groups of people: Males, 20 years and older, Females, 20 years and older, Persons 16-19 years old. Males, 20 years and older, made up approximately 58.44% of the labor force in 1986, while Females, 20 years and older, made up approximately 38.34% of the labor force. The following conditional probabilities are implied in the text:
P(Unemployed | Male, 20+) = .061
P(Employed | Female, 20+) = .934
P(Unemployed | Person, 16-19) = .234.
Compute the following probabilities.
(a) P(Male, 20+ | Employed) (b) P(Female, 20+ | Unemployed) (c) P(Person, 16-19 | Employed) (d) P(Person, 16-19 | Unemployed) (e) P(Employed) (f) P(Unemployed)

14. With the advent of AIDS, it has become essential that donated blood be screened. A screening test developed in the mid-1980s was given the acronym ELISA [33] (which is simpler than "enzyme linked immuno sorbent assay"). This test correctly produces a warning with a probability of about .977 when the donated blood contains AIDS antibodies. The test incorrectly produces a warning with probability near .074 when the donated blood does not contain AIDS antibodies. Suppose one blood sample in ten thousand actually contains AIDS antibodies. (Near the end of the 1980s, about 1 person in ten thousand in the United States was known to have AIDS. You may wish to use a more up-to-date estimate.) Restate the previous sentences using the notation of conditional probabilities. Then answer the questions.
(a) If ELISA produces a warning, what is the probability that a donor has AIDS antibodies in his or her blood?
(b) If ELISA produces a warning, what is the probability that a donor does not have AIDS antibodies in his or her blood?
(c) If ELISA does not produce a warning, what is the probability that a donor has AIDS antibodies in his or her blood?
(d) If ELISA does not produce a warning, what is the probability that a donor does not have AIDS antibodies in his or her blood?
(e) If a donor is notified that a blood screen produced a warning, should the donor panic? Explain your answer.

15. Consider the following variations of the "Monty Hall" problem (Example 6.47). In each case, assume you have chosen door number 1 and the host has opened door number 3. What is $P(S_2 \mid C_3)$?
(a) There are two consolation prizes: an autographed photo of the host, and a coffee cup. Door 3 contained a cup.
(b) Both consolation prizes are cups (as in Example 6.47). There is no coin flip; if the contestant chooses door 1, the host opens door 3 unless the scholarship is behind door 3, in which case door 2 is opened.

16. At a 6th, 7th, and 8th grade middle school of 600 students, each student writes a story and is rated on a scale of Poor, Average, and Excellent. There are twice as many 6th graders as there are 7th graders. There are 24 more 7th grade students than 8th grade students. 23% of the 6th graders rate Poor, while 44% rate Average. It is also known that 59% of the 7th grade students rate Average, and 29% rate Excellent. In the 8th grade, 11% of the people rate Poor, and 11% of the people rate Excellent. Calculate the following conditional probabilities.
(a) $P(6 \mid E)$ (b) $P(6 \mid P)$ (c) $P(7 \mid A)$ (d) $P(7 \mid E)$ (e) $P(8 \mid A)$ (f) $P(8 \mid P)$

17. A company owns three identical fast food restaurants (i.e., they have the same name) at different locations. The fraction of employees who quit, reported by each restaurant, and the causes are shown in the following table. (For example, the fraction 1/7 means that approximately 1 in 7 people from Restaurant A quit due to Low Pay.) Assume that the entries in the table list the #1 reason why people quit, as people can obviously terminate employment due to multiple factors.

(Table: fraction of employees who quit, with rows Low Pay, Poor Management, Too Busy, and Irritable Customers and columns Restaurant A, Restaurant B, and Restaurant C. Apart from the 1/7 entry for Low Pay at Restaurant A, the fractions are illegible in this copy.)

Suppose that out of all of the employees who quit last month, half came from Restaurant A, while 3/10 and 1/5 of the employees who quit came from Restaurant B and Restaurant C, respectively. Consider an employee who quit last month.
(a) If it was discovered that this employee quit because of Irritable Customers, what is the probability that he or she came from Restaurant C?
(b) If it was discovered that this employee quit because of Low Pay, what is the probability that he or she came from Restaurant A?
(c) If it was discovered that this employee quit because of being Too Busy, what is the probability that he or she came from Restaurant B?
(d) If it was discovered that this employee quit because of Poor Management, what is the probability that he or she came from Restaurant B?
(e) If it was discovered that this employee quit because of Low Pay, what is the probability that he or she did not come from Restaurant C?
(f) If it was discovered that this employee quit because of Irritable Customers, what is the probability that he or she did not come from Restaurant A?

18. [D] The Homeless in America volume in the Information Series on Current Topics [52] indicates the following empirical probabilities related to a family in 1989 America being considered "low income."
P(White) = .781
P(Black) = .103
P(Other) = .116
P(Low income | White) = .078
P(Low income | Black) = .278
P(Low income | Other) = .192
Compute the following conditional probabilities, which indicate the ethnic mix of the low-income families.
(a) P(White | Low income)
(b) P(Black | Low income)
(c) P(Other | Low income)
(d) Is this the ratio you were expecting?

19. There are nine players on a particular baseball team, each of which has a batting average (see the following table).

Player      Batting Average
Johnson     .402
Sawtell     .382
Nelson      .200
Teller      .310
Carlson     .457
Anderson    .210
Patters     .315
Reeds       .278
Brookson    .341

(a) Suppose that a player on this team was selected to go up to bat, and he got a hit. Assume that the first four players in the preceding table were twice as likely to be selected as the other players. What is the probability that
i. it was Johnson at bat?
ii. it was Carlson at bat?
(b) Suppose that a player on this team was selected to go up to bat, and he did not get a hit. Assume now that the last four players in the preceding table were twice as likely to be selected as the other players. What is the probability that it was Nelson at bat?

20. The 1992 World Almanac [42, page 943] provides some empirical probabilities that a 25- to 34-year-old adult American was living with his or her parents in 1990. Let MLP represent "male, living with parents" and FLP represent "female, living with parents." Let SM stand for "single male" and SF represent "single female." The empirical probabilities are
P(MLP) = .15
P(MLP | SM) = .32
P(FLP) = .095
P(FLP | SF) = .20.
The almanac does not list the probabilities that an adult in that age bracket is single or married. This omission provides you with the opportunity to complete the following activities.
(a) Solve for x:
i. P(SM | MLP) = x · P(SM)
Hint: Use the simple form of Bayes's theorem.
(b) State in words what these two equations mean.
(c) State whether more or less than half of the men and women in this age group were single. [Hint: Use your answer to part (a).]
6.6 QUICK CHECK SOLUTIONS

Quick Check 6.1
1. One possibility is T = {L, M, S}, where T is the name of the sample space, L represents "long," M represents "medium," and S represents "short." The outcomes are the three possible choices of straw.
2. (a) A simple choice is G = {P, S, R}. There is one outcome for each possible choice. There is no significant difference; both have three distinct outcomes.
(b) A natural attempt is to designate the sample space as {P-P, P-S, P-R, S-S, S-R, R-R}. This is not the easiest sample space to use because it does not adequately represent which opponent wins. A better sample space is {P-P, P-S, P-R, S-P, S-S, S-R, R-P, R-S, R-R}. We can agree that the first letter represents opponent 1 and the second letter represents opponent 2. Thus, the outcome P-S represents a win for opponent 2.
Quick Check 6.2
1. NW = {P-P, S-S, R-R}
2. $O_1$ = {P-R, S-P, R-S}

Quick Check 6.3
1. $\overline{O_1}$ = {P-P, P-S, S-S, S-R, R-P, R-R}
2. PC = {P-P, P-S, P-R, S-P, R-P}
3. Yes, the intersection {P-R, S-P, R-S} $\cap$ {P-P, S-S, R-R} is empty. It is impossible simultaneously to have opponent 1 win and also have no winner.
4. No, the intersection is nonempty: {P-P, P-S, P-R, S-P, R-P} $\cap$ {P-R, S-P, R-S} = {P-R, S-P}. Therefore, it is possible simultaneously to have opponent 1 win and to have at least one of the opponents choose paper.

Quick Check 6.4
1. Theoretical-equally likely outcomes. Assuming that each outcome is equally likely, P(M) = 1/3.
2. Theoretical-equally likely outcomes.
(a) P(NW) = 3/9 = 1/3. There are three ways to have no winner and 9 possible outcomes.
(b) P(PC) = 5/9
(c) $P(O_1)$ = 3/9 = 1/3
3. Empirical. His current ratio is oversleeping 23/100 of the time. The best estimate (assuming he does not have a sudden transformation of character and habits) is that there is a probability of about .23 that he will oversleep tomorrow. 4. Subjective. Your answer to this will be influenced by the ages of your classmates and how much you know about their families. (There is one situation where the subjective probability becomes precise: If you are the only student and all four of your grandparents are currently living, the probability is 1.) The next time the class gathers, you can conduct a survey. You will then have a theoretical determination of the probability. It will not be empirical because once the survey is taken, you have complete knowledge of the ratio: (number with all 4 living)/(number in class). An empirical probability is an estimate of a current probability that is based on past performance. 5. Theoretical-unequally likely outcomes. (a) P(Fr) = 18/36 = 1/2. P(So) = 12/36 = 1/3. P(Ju) = 4/36 = 1/9. P(Se) = 2/36 = 1/18.
(b) P(UD) = P(Ju) + P(Se) = 1/9 + 1/18 = 1/6. I have used the definition

\[ P(E) = \sum_{o \in E} P(o). \]
Quick Check 6.5
1. (a) The probability is 3/9 = 1/3. This problem does not use conditional probability.
(b) P(P-R | ?-R) = 1/3. The revised sample space is {P-R, S-R, R-R}.
(c) If we didn't know that opponent 2 had displayed rock, the probability estimate would be 3/9 = 1/3. Therefore, the events "opponent 2 displays rock" and "opponent 1 wins" are independent.
(d) These events are not mutually exclusive since both contain the outcome P-R.
2. (a) $P(O_1 \mid PC) = 2/5$. Opponent 1 wins on P-R or S-P (among the five outcomes in PC).
(b) PC = {P-P, P-S, P-R, S-P, R-P}
3. P(E) = 3/6 = 1/2. $P(E \mid L) = 2/5$. The events are not independent. (Knowing that a 6 was not rolled decreases the possibility of an even number.)
Quick Check 6.6
1. Computation formula 3 is appropriate since these are mutually exclusive events. $P(O_1 \cup O_2) = P(O_1) + P(O_2) = 1/3 + 1/3 = 2/3$.
2. Computation formula 2.
3. $P(J \cup S) = P(J) + P(S) - P(J \cap S) = 4/52 + 13/52 - 1/52 = 4/13$

Quick Check 6.7
1. (a) $P(B \cap L) = P(B) \cdot P(L \mid B) = .103 \cdot .278 \approx .029$
(b) There were about $.061 / .029 \approx 2.1$ times as many low-income whites.
2. (a) P-S is the only outcome in the sample space that is in the event. The probability is thus 1/9.
(b) Using computation formula 5, P(P-? $\cap$ ?-S) = P(P-?) · P(?-S) = 1/3 · 1/3 = 1/9. Formula 5 is more appropriate because the choices of opponent 1 are independent from the choices of opponent 2. This can be seen by comparing P(P-?) and P(P-? | ?-S). (Formula 4 is not incorrect since it is always valid. Formula 5 is simpler and is thus preferred when it is valid.)
3. $P(\overline{O_1}) = 1 - P(O_1) = 1 - 1/3 = 2/3$. (There are three ways for opponent 1 to lose and three ways for a tie. Thus there are six ways out of nine for opponent 1 to not win.)
4. Using the second form of formula 4,

\[ P(A \cap B) = P(B) \cdot P(A \mid B) = P(B) \cdot P(A) \quad (\text{since } P(A \mid B) = P(A)) \quad = P(A) \cdot P(B). \]
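Quick Check 6.8
1.
\[ \begin{aligned} P(C) &= 1 - \frac{365 \cdot 364 \cdot 363 \cdots 355 \cdot 354}{365^{12}} \\ &= 1 - \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{355}{365} \cdot \frac{354}{365} \approx .1670 \end{aligned} \]
The change at the second line was unnecessary for n = 12 but was done to show how to keep your calculator from producing a number too large to store (which will happen with large values of n).
2.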
\[ P(\text{more than one couple has } C \mid \text{at least one couple has } C) = \frac{1 - \left(\frac{9}{10}\right)^{10} - 10 \cdot \frac{1}{10} \cdot \left(\frac{9}{10}\right)^{9}}{1 - \left(\frac{9}{10}\right)^{10}} \approx .405 \]
If the population is 10 people and there is a 1 in 10 chance that a couple with particular characteristics can be found, then if we actually find such a couple, there is about a 40% chance we will find at least one other couple who have those characteristics.
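The remark in Quick Check 6.8 about keeping a calculator from overflowing applies equally to a program. The sketch below multiplies the ratios 365/365, 364/365, ... one at a time, so no intermediate value ever exceeds 1. The function name and the default of 365 equally likely birthdays are assumptions made for the illustration.

```python
def shared_birthday_probability(n, days=365):
    """P(at least two of n people share a birthday), assuming equally likely days."""
    p_all_different = 1.0
    for k in range(n):
        # Multiply one ratio at a time so no intermediate value grows large.
        p_all_different *= (days - k) / days
    return 1 - p_all_different


print(f"{shared_birthday_probability(12):.4f}")  # 0.1670
```

The same function works for much larger n, where computing the numerator and denominator separately would produce enormous numbers.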
Quick Check 6.9
1. There are 48 ways to choose the card that is not an Ace. There are C(52, 5) = 2,598,960 ways to deal a five-card hand. Thus

\[ P(\text{4 Aces}) = \frac{48}{2{,}598{,}960} \approx .000018. \]

A .0018% chance indicates a very rare event.
2. There are C(13, 3) ways to choose the clubs and C(13, 2) ways to choose the diamonds. The choices are not mutually exclusive. Since the problem has predetermined that these two suits will occur, the choices of clubs and of diamonds are independent. Thus there are C(13, 3) · C(13, 2) = 22,308 ways to be dealt the required kind of hand. The probability is thus

\[ P(\text{3 clubs and 2 diamonds}) = \frac{22{,}308}{2{,}598{,}960} \approx .0086, \]

which is still unlikely.
3. Recall that the sum of two numbers is even if both are even or both are odd. Since the two dice are independent, there must be 3 · 3 = 9 ways to roll two even numbers
and 3 · 3 = 9 ways to roll two odd numbers. (I have used general counting principle 1.) Rolling two even and rolling two odd are mutually exclusive, so general counting principle 2 implies that there are 9 + 9 = 18 pairs that have an even sum. The probability is thus P(even sum) = 18/36 = .5, which probably agrees with your intuition.
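The counts in Quick Check 6.9 can be confirmed mechanically. The sketch below uses `math.comb` from the Python standard library for C(n, r) and a brute-force enumeration of the 36 ordered dice rolls; the variable names are illustrative only, and the second probability is for a hand of three clubs and two diamonds, matching the count C(13, 3) · C(13, 2).

```python
from math import comb

hands = comb(52, 5)                       # number of five-card hands
p_four_aces = 48 / hands                  # 48 choices for the card that is not an Ace
p_3_clubs_2_diamonds = comb(13, 3) * comb(13, 2) / hands

# Enumerate all 36 ordered rolls of two dice and count the even sums.
even_sums = sum(1 for a in range(1, 7) for b in range(1, 7) if (a + b) % 2 == 0)

print(hands)                              # 2598960
print(f"{p_four_aces:.6f}")               # 0.000018
print(f"{p_3_clubs_2_diamonds:.4f}")      # 0.0086
print(even_sums, even_sums / 36)          # 18 0.5
```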
Quick Check 6.10
1. E(X) = 2 · .3 + 4 · .4 + 5 · .1 + 6 · .1 + 10 · .05 + 20 · .05 = 4.8
2. E(X) = 100 · .25 + (-50) · .20 + 0 · .10 + 75 · .15 + (-25) · .30 = 25 - 10 + 0 + 11.25 - 7.5 = 18.75
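Both sums in Quick Check 6.10 follow the same pattern, so a small helper can compute them. In this sketch the function name `expected_value` and the dictionary representation (value mapped to probability) are illustrative choices, and the check that the probabilities sum to 1 reflects the standard model for probability.

```python
import math


def expected_value(distribution):
    """E(X) = sum of x * P(x); `distribution` maps each value x to P(x)."""
    if not math.isclose(sum(distribution.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in distribution.items())


first = {2: .3, 4: .4, 5: .1, 6: .1, 10: .05, 20: .05}
second = {100: .25, -50: .20, 0: .10, 75: .15, -25: .30}

print(round(expected_value(first), 2))   # 4.8
print(round(expected_value(second), 2))  # 18.75
```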
Quick Check 6.11
1. (a) The first step is to convert odds to probabilities. The table in part (a) reflects this conversion. The random variable X will be assigned the value of the prize won. Notice the inclusion of the most common prize: losing.

x      P(x)    x · P(x)
500    .01     5
200    .02     4
100    .04     4
0      .93     0
               Σ x · P(x) = 13

Thus E(X) = $13.
(b) This is not a fair game. The state expects to make (on average) $7 per ticket.
(c) P(losing) = .93 = 1 - (P(500) + P(200) + P(100))
(d) There are only seven prizes, so the odds imply that 100 tickets are to be sold. The state will make $7 · 100 = $700. A quick way to check this answer is to notice that the state takes in $20 · 100 = $2000 and pays out $1300, for a profit of $2000 - $1300 = $700.
2. There are 53,715 winners, so 199,500,000 - 53,715 = 199,446,285 losers. The odds are expressed in the S : T form. This can be seen from the $25,000 prize. There should be three such prizes. The product (1/66,500,000) · 199,500,000 = 3 matches the assumption S : T. The inconsistency is the $120 entry. With the stated odds, there should be three hundred $120 prizes. From the symmetry of the table of odds, I assumed
that the ratio 1 : 665,000 is correct and the number 250 is incorrect. (The sweepstakes actually awarded 300 prizes; the number 250 was incorrect.) Another minor discrepancy: The odds for the $109 prize are really 1 : 3,736.3, but this does not make any significant difference in the expected value. The rounded number was legitimate to publish. The odds for losing are 199,446,285 : 199,500,000. This can be converted to 1 : 1.00027 by dividing both sides by 199,446,285. The probabilities can be calculated from the S : T odds as S/T.

x (Prize)     Number of Prizes    Odds               P(x)               x · P(x)
$5,000,000    1                   1 : 199,500,000    5.01253 x 10^-9    0.02506
$100,000      1                   1 : 199,500,000    5.01253 x 10^-9    0.00050
$25,000       3                   1 : 66,500,000     1.50376 x 10^-8    0.00038
$10,000       5                   1 : 39,900,000     2.50627 x 10^-8    0.00025
$5,000        10                  1 : 19,950,000     5.01253 x 10^-8    0.00025
$2,500        50                  1 : 3,990,000      2.50627 x 10^-7    0.00063
$120          300                 1 : 665,000        1.50376 x 10^-6    0.00018
$109          53,395              1 : 3,736          2.67666 x 10^-4    0.02918
$0            199,446,285         1 : 1.00027        0.99973            0.00000
                                                     Σ x · P(x) ≈       0.05643

The expected value is about half a cent higher than in Example 6.44.
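As a cross-check on the sweepstakes table, the expected value can also be computed from the prize counts rather than from the rounded odds. The sketch below assumes the corrected count of 300 prizes of $120 and a total of 199,500,000 entries, as discussed in the solution; the variable names are illustrative. Because it avoids the rounding in the table, the result agrees with the table's total only to about four decimal places.

```python
TOTAL_ENTRIES = 199_500_000

# (prize in dollars, number of prizes), using the corrected count of 300.
prizes = [
    (5_000_000, 1),
    (100_000, 1),
    (25_000, 3),
    (10_000, 5),
    (5_000, 10),
    (2_500, 50),
    (120, 300),
    (109, 53_395),
]

# E(X) = sum of x * P(x), where P(x) = (number of prizes) / (total entries).
# Losing contributes 0 * P(0) = 0, so it can be left out of the sum.
expected_prize = sum(value * count for value, count in prizes) / TOTAL_ENTRIES

print(f"{expected_prize:.4f}")  # 0.0564
```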
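Quick Check 6.12
1.
\[ P(B \mid A) = \frac{P(B) \cdot P(A \mid B)}{P(A)} = \frac{.4 \cdot .8}{.6} \approx .53 \]
2. The expanded form of Bayes's theorem is appropriate here ($P(J)$ was not given). It is necessary to compute $P(\overline{D}) = 1 - P(D) = .25$.

\[ P(D \mid J) = \frac{P(D) \cdot P(J \mid D)}{P(D) \cdot P(J \mid D) + P(\overline{D}) \cdot P(J \mid \overline{D})} = \frac{.75 \cdot .20}{.75 \cdot .20 + .25 \cdot .40} = .60 \]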
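Quick Check 6.13
1. Since $P(S_3 \mid C_3) = 0$ and $P(S_1 \mid C_3) + P(S_2 \mid C_3) + P(S_3 \mid C_3) = 1$, $P(S_1 \mid C_3) = 1/3$.
2. (a)
\[ P(B_1 \mid A) = \frac{P(B_1) \cdot P(A \mid B_1)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + P(B_3) \cdot P(A \mid B_3)} = \frac{\frac{1}{3} \cdot \frac{1}{4}}{\frac{1}{3} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{8} + \frac{1}{6} \cdot \frac{5}{8}} = \frac{1}{3} \]
(b)
\[ P(B_2 \mid A) = \frac{P(B_2) \cdot P(A \mid B_2)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + P(B_3) \cdot P(A \mid B_3)} = \frac{\frac{1}{2} \cdot \frac{1}{8}}{\frac{1}{3} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{8} + \frac{1}{6} \cdot \frac{5}{8}} = \frac{1}{4} \]
(c)
\[ P(B_3 \mid A) = \frac{P(B_3) \cdot P(A \mid B_3)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + P(B_3) \cdot P(A \mid B_3)} = \frac{\frac{1}{6} \cdot \frac{5}{8}}{\frac{1}{3} \cdot \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{8} + \frac{1}{6} \cdot \frac{5}{8}} = \frac{5}{12} \]
Notice that the three conditional probabilities add to 1, which is expected since we are assuming that the sample space is a union of the mutually exclusive events $B_i$.

6.7 Chapter Review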
6.7.1 Summary
This chapter introduces the basic concepts of finite probability theory. An advanced course in probability or probability and statistics will extend these ideas to continuous (hence infinite) sample spaces. Most of the concepts will remain unchanged, but the mathematical details will change. In particular, integration will become a central tool in continuous probability theory.

Section 6.1 begins with a number of foundational concepts (such as sample space, event, independence). The standard model for probability is also introduced. The key notion is that the probabilities of the outcomes in a sample space should add to 1. Several methods (theoretical, empirical, subjective) for determining the probabilities of events are then discussed. The most important is the theoretical method. In that context, the notion of equally likely outcomes is important.

Probability theory becomes more useful with the notion of conditional probabilities (Section 6.2). A conditional probability allows us to revise our estimate of an event's likelihood if we gain additional information. Section 6.2 continues by presenting several formulas that summarize some fundamental relationships between the probabilities of two events.

The chapter concludes with three short, but very interesting, sections. Section 6.3 shows how the material in Chapter 5 can be used to calculate theoretical probabilities. Section 6.4 introduces the important notions of random variables and expected value. The basic idea is to create a variable whose value is determined by the outcome of a random experiment. The expected value of a random variable, X, captures the notion of "the average value of X." Section 6.5 introduces Bayes's theorem. This theorem shows how to turn a collection of conditional probabilities into a different set of conditional probabilities (which are perhaps of greater interest). Practical applications include use with medical tests that attempt to diagnose diseases such as tuberculosis.

The material in this chapter will make much more sense if you thoroughly understand the definitions. Seek to gain an intuitive understanding of the material. If you can accomplish these tasks, the rest of the chapter will be much easier; conceptual understanding is more important than computational details.

It is very useful to be aware that the definitions of outcomes, sample spaces, and events are expressed in the language of sets. This means that notions such as "the complement of an event" are really not new ideas. The formulas and theorems are motivated by simple ideas. If you understand those ideas, the formulas are easy to memorize (since they are just mathematical shorthand for ideas you have already mastered).

Perhaps the most complex part of the chapter is the discussion of Bayes's theorem. For the generalized Bayes's theorem, it is important to note that the events $B_1, B_2, \ldots, B_n$ form a partition of the sample space. In order to use the theorem, it is necessary to know all but one of the probabilities in the formula.
Chapter 6 Finite Probability Theory
6.7.2 Notation

Notation          Page   Brief Description
$S$               257    a sample space
$O$               260    an outcome
$\overline{E}$    259    the complement of event $E$ ($\overline{E}$ is also an event)
$P(O)$            260    the probability of outcome $O$
$P(E)$            260    the probability of event $E$
$P(A \mid B)$     266    the conditional probability of event $A$, given that event $B$ has occurred
$E(X)$            287    the expected value of random variable $X$
6.7.3 Definitions

Sample Space, Outcome  A sample space is the set of all outcomes. The outcomes are an exhaustive collection of the possible results of some random experiment. Sample spaces are often denoted by $S$.

Event  An event is a set of outcomes.

Complement of an Event  The complement of an event is an event consisting of all outcomes in the sample space, $S$, which are not part of the original event. The complement of an event, $E$, is denoted $\overline{E}$.

Mutually Exclusive Events  Two events are said to be mutually exclusive events if they have no outcomes in common. Another way of defining this is to call two events mutually exclusive if they cannot both occur simultaneously.

The Probabilities of Outcomes and Events  We denote the probability of an outcome, $O$, by $P(O)$, and the probability of an event, $A$, by $P(A)$. The probability of an event, $A$, is a number between 0 and 1, inclusive, which reflects the likelihood that event $A$ occurs. That is, $0 \le P(A) \le 1$. In addition, it is required that the sum of the probabilities of all outcomes in the sample space add to 1. We can express this as

\[ \sum_{O \in S} P(O) = 1. \]

... about the mechanisms that govern the random experiment. This knowledge is used to assign probabilities to outcomes. The probability of an event is then the sum of the probabilities of the outcomes in that event.

Favorable Outcome  Suppose that an event, $E$, in some random experiment is one that is of interest (perhaps we win a prize if $E$ occurs). Any outcome in $E$ is called a favorable outcome, since if one of those outcomes is the result of the experiment, then $E$ occurs.

Empirical Probability  Empirical probabilities are used when theoretical probabilities are not possible. Instead, historical data are used to list relative frequencies of the outcomes of the sample space. Those relative frequencies will not be exact probabilities, but they may be close enough for most purposes.

Subjective Probability  In some situations, we have neither a theoretical basis nor past records from which to determine the probabilities of outcomes. In those situations, the best that can be done initially is to make educated guesses, called subjective probabilities.

Mathematical Model for Finite Probability  The model consists of three primary components:
1. The sample space chosen to reflect the possible outcomes
2. The probabilities assigned to the outcomes
3. The theoretical requirements that probabilities of outcomes (and events) satisfy ...

The schematic diagram

    S_n : A_n \ B_n / C_n \ D_n /

indicates that $S_n$ should have as many line segments as there are (combined) in the subcurves $A_n$, $B_n$, $C_n$, and $D_n$, plus four more line segments for the arrow segments. It won't take long to convince yourself that the subcurves each have the same number of line segments. Let the number of line segments in $A_n$ be given by $a_n$. If we let $s_n$ represent the number of line segments in the curve $S_n$, then

• $s_0 = 4$
• $s_n = 4a_n + 4$ for $n > 0$

$A_0$ is empty. The diagram for $A_n$,

    A_n : A_{n-1} \ B_{n-1}  D_{n-1} / A_{n-1},

indicates that

• $a_0 = 0$
• $a_n = 4a_{n-1} + 4$ for $n > 0$

[26] The formula for the sum of a geometric series is something you should already have memorized.
[27] This number is useful when determining how long each line segment should be in order for the curve to be drawn inside the fixed-sized square.
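Chapter 7 Recursion

It is now straightforward to solve the recurrence relation for $\{a_n\}$.

\[
\begin{aligned}
a_n &= 4a_{n-1} + 4 \\
    &= 4(4a_{n-2} + 4) + 4 && \text{substitute} \\
    &= 4^2 a_{n-2} + (4^2 + 4) && \text{simplify} \\
    &= 4^2(4a_{n-3} + 4) + (4^2 + 4) && \text{substitute} \\
    &= 4^3 a_{n-3} + (4^3 + 4^2 + 4) && \text{simplify} \\
    &\;\;\vdots \\
    &= 4^k a_{n-k} + (4^k + 4^{k-1} + \cdots + 4^2 + 4)
\end{aligned}
\]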
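The back substitution above can be checked numerically. The sketch below is only an illustration: it assumes the pattern is carried to $k = n$ with $a_0 = 0$, so that the geometric sum gives the candidate closed form $a_n = (4^{n+1} - 4)/3$. That closed form and the function names are assumptions introduced here, not taken from the text.

```python
def segments_by_recurrence(n):
    """Return a_n from a_0 = 0 and a_n = 4*a_(n-1) + 4."""
    a = 0
    for _ in range(n):
        a = 4 * a + 4
    return a


def segments_closed_form(n):
    """Candidate closed form (assumed): 4 + 4^2 + ... + 4^n = (4^(n+1) - 4) / 3."""
    return (4 ** (n + 1) - 4) // 3


def total_segments(n):
    """Return s_n = 4*a_n + 4, the number of line segments in S_n."""
    return 4 * segments_by_recurrence(n) + 4


if __name__ == "__main__":
    for n in range(6):
        print(n, segments_by_recurrence(n), segments_closed_form(n), total_segments(n))
```

If the two columns for $a_n$ ever disagreed, the guessed pattern (or the geometric sum) would be wrong.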
TABLE 7.5 The Number of Line Segments in $S_n$
A Harder Recurrence Relation

Consider the recurrence relation defined by

• $a_0 = 0$
• $a_n = n\,a_{n-1} + 1$ for $n > 0$

Before starting the back substitution, notice that $n$ is now more than merely a subscript. The relationship between $n$ the number and $n$ the subscript needs to be handled carefully. In particular, the recurrence relation implies that $a_1 = 1 \cdot a_0 + 1 = 1$ and $a_2 = 2a_1 + 1 = 3$. Notice also that $a_{n-1} = (n-1)a_{n-2} + 1$. [28]

\[
\begin{aligned}
a_n &= n\,a_{n-1} + 1 \\
    &= n\bigl((n-1)a_{n-2} + 1\bigr) + 1 && \text{substitute} \\
    &= n(n-1)\,a_{n-2} + (n + 1) && \text{simplify} \\
    &= n(n-1)\bigl((n-2)a_{n-3} + 1\bigr) + (n + 1) && \text{substitute} \\
    &= n(n-1)(n-2)\,a_{n-3} + \bigl(n(n-1) + n + 1\bigr) && \text{simplify} \\
    &\;\;\vdots \\
    &= \frac{n!}{(n-k)!}\,a_{n-k} + \left(\frac{n!}{(n-(k-1))!} + \frac{n!}{(n-(k-2))!} + \cdots + \frac{n!}{(n-1)!} + \frac{n!}{n!}\right) \\
    &\;\;\vdots \\
    &= \frac{n!}{0!}\,a_0 + \left(\frac{n!}{1!} + \frac{n!}{2!} + \cdots + \frac{n!}{(n-1)!} + \frac{n!}{n!}\right) \\
    &= \sum_{k=1}^{n} \frac{n!}{k!}
\end{aligned}
\]

[28] Not $a_{n-1} = n\,a_{n-2} + 1$.

It is possible (but not trivial) to go a step further. Symbolic mathematics software packages such as Mathematica and Maple are able to convert this to an expression involving the Euler gamma functions:

\[ a_n = n!\left(\frac{(n+1)\,\Gamma(n+1, 1)\,e}{\Gamma(n+2)} - 1\right) \]

where

\[ \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt \qquad \text{and} \qquad \Gamma(z, 1) = \int_1^{\infty} t^{z-1} e^{-t}\,dt. \]

It can be shown that $\dfrac{(n+1)\,\Gamma(n+1, 1)}{\Gamma(n+2)} \to 1$ as $n \to \infty$, so $a_n$ asymptotically approaches $n!(e - 1)$.
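A quick numerical experiment can make the summation form and the asymptotic claim more concrete. The following is a minimal sketch (the function names are hypothetical) that compares the recurrence $a_n = n\,a_{n-1} + 1$ with the sum $\sum_{k=1}^{n} n!/k!$ and then looks at the ratio $a_n / (n!(e-1))$.

```python
import math


def a_recurrence(n):
    """Return a_n from a_0 = 0 and a_n = n*a_(n-1) + 1."""
    a = 0
    for i in range(1, n + 1):
        a = i * a + 1
    return a


def a_sum(n):
    """Return the sum of n!/k! for k = 1..n (each quotient is an exact integer)."""
    return sum(math.factorial(n) // math.factorial(k) for k in range(1, n + 1))


def ratio_to_limit(n):
    """Return a_n / (n! (e - 1)); this should drift toward 1 as n grows."""
    return a_recurrence(n) / (math.factorial(n) * (math.e - 1))


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 15):
        print(n, a_recurrence(n), a_sum(n), ratio_to_limit(n))
```

The ratio is computed in floating point, so it only suggests the limiting behavior; it is not a proof.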
The previous example indicates that recurrence relations may not always have simple solutions. The next example indicates that the back substitution method for solving recurrence relations has some limitations.

Solving the Fibonacci Sequence

Suppose we want a closed-form formula for $f_n$, the $n$th Fibonacci number. Starting the back substitution,

\[
\begin{aligned}
f_n &= f_{n-1} + f_{n-2} \\
    &= (f_{n-2} + f_{n-3}) + f_{n-2} && \text{substitute} \\
    &= 2f_{n-2} + f_{n-3} && \text{simplify} \\
    &= 2(f_{n-3} + f_{n-4}) + f_{n-3} && \text{substitute} \\
    &= 3f_{n-3} + 2f_{n-4} && \text{simplify} \\
    &= 3(f_{n-4} + f_{n-5}) + 2f_{n-4} && \text{substitute} \\
    &= 5f_{n-4} + 3f_{n-5} && \text{simplify.}
\end{aligned}
\]

How do we identify a general pattern? Notice that the coefficients after the simplification are: (2, 1), (3, 2), (5, 3). These seem familiar. They look like pairs of numbers from the Fibonacci sequence! If the next pair is (8, 5), this guess is most likely correct.

\[
\begin{aligned}
    &= 5(f_{n-5} + f_{n-6}) + 3f_{n-5} && \text{substitute} \\
    &= 8f_{n-5} + 5f_{n-6} && \text{simplify}
\end{aligned}
\]
The conjecture seems correct. What is the general pattern? Perhaps the following table will help.

$n$     0   1   2   3   4   5
$f_n$   1   1   2   3   5   8

It seems that the coefficients appear in the pairs $(f_k, f_{k-1})$. The generic step is therefore

\[ = f_k f_{n-k} + f_{k-1} f_{n-(k+1)}. \]

This will terminate when $k = n - 1$:

\[
\begin{aligned}
    &= f_{n-1} f_{n-(n-1)} + f_{(n-1)-1} f_{n-(n-1+1)} \\
    &= f_{n-1} f_1 + f_{n-2} f_0 \\
    &= f_{n-1} + f_{n-2}.
\end{aligned}
\]

We have arrived back where we started! The attempt to find a closed-form formula for $f_n$ has failed.

Fortunately, there are other techniques for solving recurrence relations. A closed-form formula for $f_n$ will be derived in Example 7.26.
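The generic step $f_n = f_k f_{n-k} + f_{k-1} f_{n-(k+1)}$ was obtained by pattern spotting, so it is worth testing. The sketch below assumes the indexing used in the table ($f_0 = f_1 = 1$); the helper names are illustrative only.

```python
def fib(n):
    """Return f_n with the indexing f_0 = f_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def generic_step(n, k):
    """Return f_k * f_(n-k) + f_(k-1) * f_(n-(k+1)), valid for 1 <= k <= n-1."""
    return fib(k) * fib(n - k) + fib(k - 1) * fib(n - (k + 1))


if __name__ == "__main__":
    n = 10
    print(fib(n), [generic_step(n, k) for k in range(1, n)])
```

Every value in the printed list should equal $f_{10}$, which is consistent with the observation that the identity is true but circular: it never eliminates the Fibonacci numbers from the right-hand side.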
7.2.2 Linear Homogeneous Recurrence Relations with Constant Coefficients

Recurrence relations with some special properties can be solved by a technique that is more powerful than back substitution. The required special properties are defined next.

DEFINITION 7.4 Homogeneous; Constant Coefficients; Linear
A recurrence relation for the sequence $\{a_n\}$ is called homogeneous if every term on the right-hand side of the recurrence contains a factor of the form $a_j$, for some integer $j$. The recurrence relation has constant coefficients if $n$ does not appear in any term involving some $a_j$ except in subscripts. A recurrence relation is called linear if no term contains more than one factor of the form $a_j$ (even with different values of $j$), and no factor of the form $a_j$ appears in a denominator, as an exponent, or as part of a more complex function.

Some examples should make these ideas clear. Base values (such as $a_0$) for the sequence $\{a_n\}$ will not be given, since the definitions do not depend on these values.
Illustrating Definition 7.4

The following recurrence relations illustrate the terminology introduced in Definition 7.4.

Recurrence Relation                          Homogeneous   Linear                            Constant Coefficients
$a_n = 5a_{n-1} + 7$                         no: $+7$      yes                               yes
$a_n = (n^2 + 4)a_{n-1} + a_{n-2}$           yes           yes                               no: $n^2 + 4$
$a_n = 5a_{n-1} + 3n$                        no: $+3n$     yes                               yes: $3n$ is not a coefficient of some $a_j$
$a_n = a_{n-1} \cdot a_{n-2} + 4a_{n-3}$     yes           no: $a_{n-1} \cdot a_{n-2}$       yes
$a_n = 4a_{n-1}^2 + 5a_{n-2}$                yes           no: $a_{n-1}^2$                   yes
$a_n = \sin(a_{n-1}) + n$                    no: $+n$      no: $\sin(a_{n-1})$               yes
Recurrence relations that have all three of the properties in Definition 7.4 are the main topic of this section. The name for such recurrence relations is long, but their form is simple.

DEFINITION 7.5 Linear Homogeneous Recurrence Relations with Constant Coefficients of Degree k
A linear homogeneous recurrence relation with constant coefficients of degree $k$ is a recurrence relation that can be written in the form

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \]

for some $k$ with $1 \le k$ and $c_k \ne 0$. The constant $k$ is called the degree of the recurrence relation. The constants, $c_j$, are called the coefficients of the recurrence relation.
Some Linear Homogeneous Recurrence Relations with Constant Coefficients

The following recurrence relations are all linear homogeneous recurrence relations with constant coefficients.

Recurrence Relation                              Degree
$a_n = 3a_{n-1} + 4a_{n-2}$                      2
$a_n = 5a_{n-1}$                                 1
$a_n = 3a_{n-1} + 4a_{n-3}$                      3
$a_n = -3a_{n-2} + 6a_{n-3} - 9a_{n-5}$          5
Notice the convention that the factors $a_i$ are written in descending order. Follow this convention! It will keep you from some errors later in this section.

Observe that $a_n = 0$ is always a solution to any linear homogeneous recurrence relation with constant coefficients (usually called the trivial solution). Suppose, for the moment, that the recurrence relation is

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \]

and that we were fortunate enough to find a real (or complex) number $r \ne 0$ such that $a_n = r^n$ for all $n \ge k$. In that case, we could substitute the explicit values into the recurrence relation and produce

\[
\begin{aligned}
r^n &= c_1 r^{n-1} + c_2 r^{n-2} + \cdots + c_k r^{n-k} \\
    &= r^{n-k}\left(c_1 r^{k-1} + c_2 r^{k-2} + \cdots + c_{k-1} r + c_k\right).
\end{aligned}
\]

Dividing both sides by $r^{n-k}$ (which is nonzero, since $r \ne 0$) and moving every term to the left gives

\[ r^k - c_1 r^{k-1} - c_2 r^{k-2} - \cdots - c_{k-1} r - c_k = 0. \]

This leads to the next definition:
DEFINITION 7.6 The Characteristic Equation
The characteristic equation of the recurrence relation

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \]

is

\[ x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k = 0. \]

Note that if the final term [29] in the recurrence is $c_k a_{n-k}$ with $c_k \ne 0$, then the characteristic equation will have degree $k$. The equation is formed by moving every term in the recursion to the left and then replacing $a_{n-j}$ by $x^{k-j}$, for $0 \le j \le k$.

The phrase "characteristic equation" is used in linear algebra when discussing eigenvalues of a square matrix and also in differential equations. Although in each case the context is distinct, nevertheless there is a common feature: The roots of the characteristic equation are the key to the solutions of the various problems.
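The recipe in Definition 7.6 is mechanical enough to express in a few lines of code. The sketch below is an illustration under stated assumptions: the recurrence is given as the list $[c_1, c_2, \ldots, c_k]$ (with zeros for missing terms), and the function names are hypothetical. The sample recurrence $a_n = a_{n-1} + 6a_{n-2}$ is used only as a test case.

```python
def characteristic_coefficients(c):
    """Map [c_1, ..., c_k] to the coefficients [1, -c_1, ..., -c_k] of
    x^k - c_1 x^(k-1) - ... - c_k, listed from highest degree to lowest."""
    return [1] + [-cj for cj in c]


def evaluate(coefficients, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0
    for coefficient in coefficients:
        result = result * x + coefficient
    return result


if __name__ == "__main__":
    # a_n = a_(n-1) + 6 a_(n-2)  gives  x^2 - x - 6 = 0
    poly = characteristic_coefficients([1, 6])
    print(poly, evaluate(poly, 3), evaluate(poly, -2))
```

A value of zero from `evaluate` confirms that the candidate is a root of the characteristic equation.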
1. For each recurrence relation, determine which of the special properties apply (homogeneous, constant coefficients, linear).
   (a) $a_n = \cos(n) \cdot a_{n-1} + 3a_{n-2}$
   (b) $b_n = 3b_{n-2} + 5b_{n-4}$
   (c) $a_n = (a_{n-1})^2 + 4a_{n-2} + 9$
   (d) $b_n = 2^{b_{n-1}} + 8b_{n-2}$

2. For each of these linear homogeneous recurrence relations with constant coefficients, determine the characteristic equation.
   (a) $a_n = 5a_{n-1} - 9a_{n-2}$
   (b) $b_n = b_{n-1} - 7b_{n-3}$
   (c) $a_n = 4a_{n-2} + 3a_{n-1}$  (Be careful! Something is wrong here.)
THEOREM 7.2
Suppose the sequence $\{a_n\}$ is generated by the recurrence relation

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}. \]

If $a_n = \theta r^n$ also generates this sequence, then $r$ is a root of the characteristic equation. Conversely, if $r$ is a root of the characteristic equation, then any expression of the form $\theta r^n$ generates a sequence that is a solution to the recurrence relation.

The proof is an easy extension of the ideas that motivated Definition 7.6 and will be left as an exercise.

Where did the idea for setting $a_n$ equal to something of the form $\theta r^n$ come from in the first place? One reason is an experimentally derived one: Suppose you solve (perhaps using back substitution) a number of linear homogeneous recurrence relations with constant coefficients and look for a pattern. The first place to start would be linear homogeneous recurrence relations with constant coefficients of the form $a_n = c_1 a_{n-1}$. It is easy to see, using back substitution, that the solution is $a_n = a_0 c_1^n$. The solution is essentially a power of the coefficient. The next attempt would be to solve linear homogeneous recurrence relations with constant coefficients of the form $a_n = c_1 a_{n-1} + c_2 a_{n-2}$.

[29] Assuming subscripts are arranged in decreasing order.
It is not as easy to see a pattern in the form of the solutions to this recursion. After spending some time getting nowhere, you might eventually try to use the result from the degree one case. Since the solution was expressed in terms of powers of some number, you might try this for the degree 2 case. This leads naturally to Theorem 7.2.

We now know (by Theorem 7.2) that any solution in the form $\theta r^n$ must have $r$ as a root of the characteristic equation, so there will be only a small number of such solutions. For a linear homogeneous recurrence relation with constant coefficients of degree $k > 1$, the characteristic equation has more than one root. [30] Each of these roots generates a solution to the recurrence relation. What is the most general form of a solution? The next theorem will begin to answer that question.

[30] Counting repeated roots.

THEOREM 7.3
Suppose the characteristic equation of the degree $k$ recurrence relation

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \]

has $k$ distinct roots, $r_1, r_2, \ldots, r_k$. Then for any choice of constants, $\theta_1, \theta_2, \ldots, \theta_k$, the closed-form expression

\[ a_n = \theta_1 r_1^n + \theta_2 r_2^n + \cdots + \theta_k r_k^n \]

generates a solution to the recurrence relation. In addition, if the $k$ initial values, $a_0, a_1, \ldots, a_{k-1}$, are specified, it is always possible to find unique values, $\theta_1, \theta_2, \ldots, \theta_k$, so that the recurrence relation generates the solution that matches those initial values.

Proof: The proof of the first claim is not difficult but does carry some algebraic baggage. I will restrict the demonstration here to the case $k = 2$. The more general proof does not require any additional ideas. What needs to be done then is to show that $a_n = \theta_1 r_1^n + \theta_2 r_2^n$ actually satisfies the recurrence relation. Substituting into the relation produces

\[
\begin{aligned}
\theta_1 r_1^n + \theta_2 r_2^n &= c_1\left(\theta_1 r_1^{n-1} + \theta_2 r_2^{n-1}\right) + c_2\left(\theta_1 r_1^{n-2} + \theta_2 r_2^{n-2}\right) \\
    &= \left(c_1 \theta_1 r_1^{n-1} + c_2 \theta_1 r_1^{n-2}\right) + \left(c_1 \theta_2 r_2^{n-1} + c_2 \theta_2 r_2^{n-2}\right).
\end{aligned}
\]

The equation can be rearranged to produce

\[ \theta_1 r_1^{n-2}\left(r_1^2 - c_1 r_1 - c_2\right) + \theta_2 r_2^{n-2}\left(r_2^2 - c_1 r_2 - c_2\right) = 0. \]

The previous equation is true since $r_1$ and $r_2$ are both roots of the characteristic equation [$x^2 - c_1 x - c_2 = 0$].

Now a proof for the second claim. The constants $\theta_1, \theta_2, \ldots, \theta_k$ need to be chosen so that the first $k$ terms in the sequence match $a_0, a_1, \ldots, a_{k-1}$. That is, so that

\[
\begin{aligned}
\theta_1 r_1^0 + \theta_2 r_2^0 + \cdots + \theta_k r_k^0 &= a_0 \\
\theta_1 r_1^1 + \theta_2 r_2^1 + \cdots + \theta_k r_k^1 &= a_1 \\
\theta_1 r_1^2 + \theta_2 r_2^2 + \cdots + \theta_k r_k^2 &= a_2 \\
    &\;\;\vdots \\
\theta_1 r_1^{k-1} + \theta_2 r_2^{k-1} + \cdots + \theta_k r_k^{k-1} &= a_{k-1}.
\end{aligned}
\]
This forms a system of $k$ linear equations in the unknowns $\theta_1, \theta_2, \ldots, \theta_k$ (the values $a_0, a_1, \ldots, a_{k-1}$ and $r_1, r_2, \ldots, r_k$ are all known). A famous theorem in linear algebra (Vandermonde's matrix theorem) implies that whenever the coefficient matrix [31] of a system of linear equations has the following form (for distinct values $v_1, v_2, \ldots, v_k$), the system has a unique solution.

\[
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
v_1 & v_2 & \cdots & v_k \\
v_1^2 & v_2^2 & \cdots & v_k^2 \\
\vdots & \vdots & & \vdots \\
v_1^{k-1} & v_2^{k-1} & \cdots & v_k^{k-1}
\end{bmatrix}
\]

Thus, $\theta_1, \theta_2, \ldots, \theta_k$ are uniquely determined (and are not hard to calculate in practice).
It is time for a few examples before developing additional theoretical insights.

An Easy Linear Homogeneous Recurrence Relation with Constant Coefficients

Consider the recurrence relation specified by

• $a_0 = -2$ and $a_1 = 3$
• $a_n = a_{n-1} + 6a_{n-2}$ for $n \ge 2$

The characteristic equation is

\[ x^2 - x - 6 = 0 \]

having roots [32] $r_1 = 3$ and $r_2 = -2$. Thus, the general solution is of the form $a_n = \theta_1 3^n + \theta_2 (-2)^n$. The final step is to determine values for $\theta_1$ and $\theta_2$ so that the solution matches $a_0$ and $a_1$. In order for that to be true, the closed-form expression for $a_n$ must produce the predetermined values for $a_0$ and $a_1$:

\[
\begin{aligned}
\theta_1 3^0 + \theta_2 (-2)^0 &= -2 \\
\theta_1 3^1 + \theta_2 (-2)^1 &= 3.
\end{aligned}
\]

This simplifies to

\[
\begin{aligned}
\theta_1 + \theta_2 &= -2 \\
3\theta_1 - 2\theta_2 &= 3.
\end{aligned}
\]

This system is small enough to be solved by substitution (Gaussian elimination [33] is usually better).

\[ \theta_1 = -\theta_2 - 2 \]
[31] The coefficient matrix for the linear system of interest here is created by stripping out the $\theta$'s and $a_k$'s and keeping the coefficients of the system. The extracted coefficients are listed in rows that correspond to the equations in the system. See Definition E.3 (on page 000) in Appendix E.
[32] It does not matter which root is labeled $r_1$ and which is labeled $r_2$, as long as we are consistent throughout the problem.
[33] Gaussian elimination is a row reduction technique taught in linear algebra courses.
So

\[
\begin{aligned}
3(-\theta_2 - 2) - 2\theta_2 &= 3 \\
-5\theta_2 &= 9 \\
\theta_2 &= -\frac{9}{5}.
\end{aligned}
\]

Therefore, $\theta_1 = -\theta_2 - 2 = -\frac{1}{5}$. The solution we seek is

\[ a_n = -\frac{1}{5} \cdot 3^n - \frac{9}{5} \cdot (-2)^n. \]

Warning! It is critical that you use exact values for the roots and for the $\theta$'s. Using decimal approximations produced by a calculator will not lead to a correct formula. More on this soon.

It is very easy to do some simple checking for validity. Form a table that contains the first few values generated by the recurrence relation and the first few values generated by the closed-form formula. They should match. Table 7.6 shows the validity check for this example.

TABLE 7.6 Validity Checking

$n$    Recurrence    Closed-Form
0      -2            -2
1      3             3
2      -9            -9
3      9             9
4      -45           -45
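The validity check in Table 7.6 is easy to automate. The following sketch (function names are hypothetical) generates the first few terms both ways for this example, using exact fractions for $\theta_1 = -\frac{1}{5}$ and $\theta_2 = -\frac{9}{5}$ so that no rounding can creep in.

```python
from fractions import Fraction


def by_recurrence(count):
    """First `count` terms of a_0 = -2, a_1 = 3, a_n = a_(n-1) + 6 a_(n-2)."""
    values = [-2, 3]
    while len(values) < count:
        values.append(values[-1] + 6 * values[-2])
    return values[:count]


def by_closed_form(count):
    """First `count` terms of a_n = -(1/5) 3^n - (9/5) (-2)^n, computed exactly."""
    theta1, theta2 = Fraction(-1, 5), Fraction(-9, 5)
    return [theta1 * 3 ** n + theta2 * (-2) ** n for n in range(count)]


if __name__ == "__main__":
    for n, (r, c) in enumerate(zip(by_recurrence(8), by_closed_form(8))):
        print(n, r, c)
```

Agreement on a finite table is evidence, not proof, but a mismatch immediately signals an arithmetic slip in the $\theta$'s or the roots.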
The previous example illustrates the general approach for solving linear homogeneous recurrence relations with constant coefficients having characteristic equations with distinct roots:

The General Procedure for Solving Linear Homogeneous Recurrence Relations with Constant Coefficients Having Distinct Roots

Step 1  Form the characteristic equation.

Step 2  Find the roots of the characteristic equation.

Step 3  If the roots are distinct, express the general solution in the form

\[ a_n = \theta_1 r_1^n + \theta_2 r_2^n + \cdots + \theta_k r_k^n \qquad \text{for } n \ge k. \]

Step 4  Form the system of linear equations to determine the $\theta$'s:

\[
\begin{aligned}
\theta_1 r_1^0 + \theta_2 r_2^0 + \cdots + \theta_k r_k^0 &= a_0 \\
\theta_1 r_1^1 + \theta_2 r_2^1 + \cdots + \theta_k r_k^1 &= a_1 \\
\theta_1 r_1^2 + \theta_2 r_2^2 + \cdots + \theta_k r_k^2 &= a_2 \\
    &\;\;\vdots \\
\theta_1 r_1^{k-1} + \theta_2 r_2^{k-1} + \cdots + \theta_k r_k^{k-1} &= a_{k-1}.
\end{aligned}
\]

Step 5  Solve the linear system and substitute the solution values for the $\theta$'s into the general solution from step 3.
Erroneous Rounding

Consider the linear homogeneous recurrence relation with constant coefficients defined by

• $a_0 = 1$ and $a_1 = -1$
• $a_n = 2a_{n-2}$ for $n \ge 2$

The characteristic equation, $x^2 - 2 = 0$, has roots $r_1 = \sqrt{2}$ and $r_2 = -\sqrt{2}$. Suppose we use a calculator and write these as $r_1 = 1.414$ and $r_2 = -1.414$. [34] The linear system would then be

\[
\begin{aligned}
\theta_1 + \theta_2 &= 1 \\
1.414\,\theta_1 - 1.414\,\theta_2 &= -1.
\end{aligned}
\]

Using a calculator to solve this system leads to $\theta_1 = .146$ and $\theta_2 = .854$. The solution is therefore

\[ a_n = .146 \cdot 1.414^n + .854 \cdot (-1.414)^n. \]

Consider Table 7.7, which compares the values generated by the recurrence relation and the values generated by the closed-form formula.

TABLE 7.7 Validity Check when Rounding

$n$    Recurrence    Formula
0      1             1.00000
1      -1            -1.00111
2      2             1.99940
3      -2            -2.00162
4      4             3.99758
5      -4            -4.00203
20     1,024         1,020.91
21     -1,024        -1,022.05

Clearly, the formula starts poorly and deteriorates rapidly. The correct solution is

\[ a_n = \left(\frac{2 - \sqrt{2}}{4}\right)\left(\sqrt{2}\right)^n + \left(\frac{2 + \sqrt{2}}{4}\right)\left(-\sqrt{2}\right)^n. \]
Recall that the method of back substitution was not very helpful for finding a closed-form formula for the Fibonacci numbers. We now have a technique that can find such a formula.
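Returning briefly to the rounding example, the drift shown in Table 7.7 can be reproduced directly. The sketch below is illustrative: `rounded_formula` uses the three-digit values $1.414$, $.146$, and $.854$ from the example, while `exact_formula` relies on a simplification made here (an assumption of this sketch, not stated in the text): with exact coefficients the closed form reduces to $2^{n/2}$ for even $n$ and $-2^{(n-1)/2}$ for odd $n$.

```python
def by_recurrence(n):
    """Return a_n from a_0 = 1, a_1 = -1, a_n = 2 * a_(n-2), in exact integers."""
    values = [1, -1]
    for i in range(2, n + 1):
        values.append(2 * values[i - 2])
    return values[n]


def rounded_formula(n):
    """Closed form built from roots and thetas rounded to three digits."""
    return 0.146 * 1.414 ** n + 0.854 * (-1.414) ** n


def exact_formula(n):
    """Exact closed form, simplified: 2^(n/2) for even n, -2^((n-1)/2) for odd n."""
    half, odd = divmod(n, 2)
    return -(2 ** half) if odd else 2 ** half


if __name__ == "__main__":
    for n in (0, 1, 2, 3, 4, 5, 20, 21):
        print(n, by_recurrence(n), exact_formula(n), round(rounded_formula(n), 5))
```

The rounded column falls further behind the exact values as $n$ grows, because the small error in $1.414$ is raised to the $n$th power.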
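The Fibonacci Sequence Revisited

The characteristic equation for the Fibonacci sequence is $x^2 - x - 1 = 0$, since the recurrence relation is $f_n = f_{n-1} + f_{n-2}$ for $n \ge 2$. There is no simple way to factor the left-hand side of the equation, so the quadratic formula is necessary.

[34] Not an uncommon action; many students routinely round all calculator output to two or three digits. This is not a good idea, especially with intermediate values during a long calculation.

The roots are

\[ r_1 = \frac{1 + \sqrt{5}}{2} \qquad \text{and} \qquad r_2 = \frac{1 - \sqrt{5}}{2}. \]

The general solution is of the form

\[ f_n = \theta_1 \left(\frac{1 + \sqrt{5}}{2}\right)^n + \theta_2 \left(\frac{1 - \sqrt{5}}{2}\right)^n. \]

The system of linear equations that determine the $\theta$'s is

\[
\begin{aligned}
\theta_1 + \theta_2 &= 1 \\
\theta_1 \left(\frac{1 + \sqrt{5}}{2}\right) + \theta_2 \left(\frac{1 - \sqrt{5}}{2}\right) &= 1.
\end{aligned}
\]

This looks messy, but the substitution is not really difficult: Since $\theta_2 = 1 - \theta_1$,

\[
\begin{aligned}
\theta_1 \left(\frac{1 + \sqrt{5}}{2}\right) + (1 - \theta_1)\left(\frac{1 - \sqrt{5}}{2}\right) &= 1 \\
\sqrt{5}\,\theta_1 &= 1 - \frac{1 - \sqrt{5}}{2} = \frac{1 + \sqrt{5}}{2} \\
\theta_1 &= \frac{1 + \sqrt{5}}{2\sqrt{5}} = \frac{5 + \sqrt{5}}{10}.
\end{aligned}
\]

Thus $\theta_2 = 1 - \theta_1 = \dfrac{5 - \sqrt{5}}{10}$. The Fibonacci sequence is generated by

\[ f_n = \left(\frac{5 + \sqrt{5}}{10}\right)\left(\frac{1 + \sqrt{5}}{2}\right)^n + \left(\frac{5 - \sqrt{5}}{10}\right)\left(\frac{1 - \sqrt{5}}{2}\right)^n. \]

This result may strike you as a bit odd; how can the Fibonacci sequence (a sequence of integers) be generated by a complicated expression involving $\sqrt{5}$? [35] However, if you program this formula into a graphing calculator or a program like Mathematica, you will find that it does indeed generate the Fibonacci sequence.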
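As a rough stand-in for the graphing calculator experiment, the formula can be tried in a few lines of Python. This is a sketch under stated assumptions: it uses the indexing $f_0 = f_1 = 1$, the function names are hypothetical, and the square root is a double-precision approximation, so the result is rounded to the nearest integer before comparing. That rounding is safe only for moderate $n$.

```python
import math


def fib_recurrence(n):
    """Return f_n with f_0 = f_1 = 1 and f_n = f_(n-1) + f_(n-2)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Evaluate the closed-form formula in floating point (approximate)."""
    s5 = math.sqrt(5)
    theta1, theta2 = (5 + s5) / 10, (5 - s5) / 10
    r1, r2 = (1 + s5) / 2, (1 - s5) / 2
    return theta1 * r1 ** n + theta2 * r2 ** n


if __name__ == "__main__":
    for n in range(10):
        print(n, fib_recurrence(n), fib_closed_form(n))
```

The printed floating-point values sit extremely close to the integers in the middle column, which is the numerical face of the algebraic cancellation of $\sqrt{5}$.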
1. Find a closed-form formula for $a_n$, if
   • $a_0 = 1$, $a_1 = 2$, and $a_2 = 3$
   • $a_n = 4a_{n-1} + a_{n-2} - 4a_{n-3}$ for $n \ge 3$
A Second Look at a Previous Pattern

Recall the attempt in Example 3.23 to find a simple formula for the partial sum

\[ S_n = 1 - \frac{1}{2} + \frac{1}{2^2} - \frac{1}{2^3} + \cdots + \frac{(-1)^n}{2^n} = \sum_{k=0}^{n} \left(-\frac{1}{2}\right)^k. \]

[35] You may also have noticed the unexpected appearance of the golden ratio: $\varphi = \frac{1 + \sqrt{5}}{2}$. See Appendix D for more about the golden ratio.

Some of the initial partial sums are listed in Table 7.8.

TABLE 7.8 The Initial Partial Sums

$n$      0    1      2      3      4        5        6        7         8
$S_n$    1    1/2    3/4    5/8    11/16    21/32    43/64    85/128    171/256

One of the observations (which was not pursued in that example) concerned the differences of successive numerators. Let $a_n$ = the $n$th numerator. Some initial differences are listed in Table 7.9.

TABLE 7.9 Some Initial Differences of Successive Numerators

$n$                0    1      2      3      4        5        6        7         8
$S_n$              1    1/2    3/4    5/8    11/16    21/32    43/64    85/128    171/256
$a_n$              1    1      3      5      11       21       43       85        171
$a_n - a_{n-1}$         0      2      2      6        10       22       42        86

It is easy to notice that $a_n - a_{n-1} = 2a_{n-2}$ for $n \ge 2$. Thus, there is a linear homogeneous recurrence relation with constant coefficients that the numerators seem to follow: $a_n = a_{n-1} + 2a_{n-2}$ for $n \ge 2$. The characteristic equation is $x^2 - x - 2 = 0$, which has roots $x = 2, -1$. The general solution is therefore

\[ a_n = \theta_1 2^n + \theta_2 (-1)^n. \]

The $\theta$'s can be found using the initial values $a_0 = a_1 = 1$.

\[
\begin{aligned}
\theta_1 + \theta_2 &= 1 \\
2\theta_1 - \theta_2 &= 1
\end{aligned}
\]

Thus $\theta_1 = \frac{2}{3}$ and $\theta_2 = \frac{1}{3}$. The numerator of $S_n$ should be $\dfrac{2^{n+1} + (-1)^n}{3}$, which agrees with the pattern guessed in Example 3.23. The recurrence relation was built from a partial table of values for $S_n$, so it is still only a (good) guess. The proof by mathematical induction at the end of Example 3.23 is still needed.

An alternative approach for finding the roots of the characteristic equation in Quick Check 7.6 is to use the following theorem (which will not help if all the roots are irrational or complex).

THEOREM 7.4 Rational Roots Theorem
Suppose the polynomial $c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$ has integer coefficients where $c_n \ne 0$ and $c_0 \ne 0$. Then any rational (or integer) zero of the polynomial must be of the form $\pm\frac{p}{q}$, where $p$ evenly divides $c_0$ and $q$ evenly divides $c_n$.
Finding Rational Roots

The polynomial equation $3x^3 - 4x^2 - 6x + 8 = 0$ has possible rational roots

\[ \pm 1,\ \pm 2,\ \pm 4,\ \pm 8,\ \pm\frac{1}{3},\ \pm\frac{2}{3},\ \pm\frac{4}{3},\ \pm\frac{8}{3}. \]

Brute force substitution into the equation verifies that

\[ 3\left(\frac{4}{3}\right)^3 - 4\left(\frac{4}{3}\right)^2 - 6\left(\frac{4}{3}\right) + 8 = 0. \]

None of the other 15 possibilities work. Since $\frac{4}{3}$ is a root, $\left(x - \frac{4}{3}\right)$ is a factor. It is also valid to instead assert that $(3x - 4)$ is a factor. Dividing the polynomial $3x^3 - 4x^2 - 6x + 8$ by $3x - 4$ produces the quotient $x^2 - 2$. This can be factored using the quadratic formula. The full factorization yields $(3x - 4)(x - \sqrt{2})(x + \sqrt{2}) = 0$.
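The brute force substitution described in this example is tedious by hand but trivial to script. The sketch below (hypothetical function names; coefficients are listed from highest degree to lowest) enumerates the candidates $\pm p/q$ allowed by the Rational Roots Theorem and tests each one exactly.

```python
from fractions import Fraction


def divisors(m):
    """Return the positive divisors of the nonzero integer m."""
    m = abs(m)
    return [d for d in range(1, m + 1) if m % d == 0]


def rational_root_candidates(coefficients):
    """Candidates +-p/q with p | c_0 and q | c_n (integer coefficients, c_n, c_0 != 0)."""
    leading, constant = coefficients[0], coefficients[-1]
    candidates = set()
    for p in divisors(constant):
        for q in divisors(leading):
            candidates.add(Fraction(p, q))
            candidates.add(Fraction(-p, q))
    return sorted(candidates)


def rational_roots(coefficients):
    """Return the candidates that are actual zeros, checked with Horner's rule."""
    def value(x):
        result = Fraction(0)
        for c in coefficients:
            result = result * x + c
        return result
    return [x for x in rational_root_candidates(coefficients) if value(x) == 0]


if __name__ == "__main__":
    poly = [3, -4, -6, 8]          # 3x^3 - 4x^2 - 6x + 8
    print(len(rational_root_candidates(poly)), rational_roots(poly))
```

For this polynomial the search space has 16 candidates and exactly one of them survives; the remaining roots are irrational, so the theorem cannot find them.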
7.2.3 Repeated Roots

Two Linear Homogeneous Recurrence Relations with Constant Coefficients Whose Characteristic Equations Have Repeated Roots

Consider the linear homogeneous recurrence relation with constant coefficients defined by

• $a_0 = 2$ and $a_1 = 1$
• $a_n = 2a_{n-1} - a_{n-2}$ for $n \ge 2$

The characteristic equation is $x^2 - 2x + 1 = 0$. This factors as $(x - 1)^2 = 0$, so the roots are $r_1 = r_2 = 1$. Blindly following established procedure would mean the general solution is of the form $a_n = \theta_1 1^n + \theta_2 1^n$. That is, $a_n$ = a constant, for all $n$. However, the recurrence relation indicates that the sequence starts as $\{2, 1, 0, -1, -2, -3, \ldots\}$! Clearly, blindly following the previous technique was a mistake.

Suppose, instead, that the recurrence relation is defined by

• $a_0 = 2$ and $a_1 = 1$
• $a_n = 4a_{n-1} - 4a_{n-2}$ for $n \ge 2$

In this case, the characteristic equation is $x^2 - 4x + 4 = (x - 2)^2 = 0$, having roots $r_1 = r_2 = 2$. The general solution (assuming past procedures apply) is $a_n = \theta_1 2^n + \theta_2 2^n = (\theta_1 + \theta_2) 2^n$. What does the system of linear equations look like?

\[
\begin{aligned}
\theta_1 + \theta_2 &= 2 \\
2(\theta_1 + \theta_2) &= 1
\end{aligned}
\]

Substituting the first equation into the second,

\[ 2 \cdot 2 = 1. \]

Certainly something is wrong again!

The problems in the previous example arose because the characteristic equations have repeated roots. Some new technique is needed to handle this situation. One source of the problem is that there are not enough unknowns in the system of linear equations. Notice that if the substitution $w = \theta_1 + \theta_2$ is made for the second illustration in Example 7.29, the system of equations becomes two equations in one unknown (an overdetermined system).

\[
\begin{aligned}
w &= 2 \\
2w &= 1
\end{aligned}
\]

What is needed is some way to have $\theta_1$ and $\theta_2$ be multiplied by different expressions. It seems reasonable to keep the $2^n$ in each case. It should also be clear (after a moment's reflection) that an expression of the form $c\,2^n$, for some constant $c$, will not improve matters. Somehow, $n$ needs to be involved. Before proceeding to the solution, some background material is necessary.
1. Let $p(x) = x^4 - 6x^3 + 12x^2 - 10x + 3$.
   (a) Factor $p(x)$ (perhaps using Theorem 7.4). There is a repeated zero, $r$. What is its multiplicity?
   (b) Find $p'(x)$ and then factor it. Is $r$ still a zero? If so, what is its multiplicity as a zero of $p'(x)$?
   (c) Find $p''(x)$ and then factor it. Is $r$ still a zero? If so, what is its multiplicity as a zero of $p''(x)$?
   (d) Find $p^{(3)}(x)$ and then factor it. Is $r$ still a zero? If so, what is its multiplicity as a zero of $p^{(3)}(x)$?
   (e) Make a hypothesis about the relationship between a repeated zero of a polynomial and the derivatives of the polynomial.
The explorations in Quick Check 7.7 lead to the next theorem.
THEOREM 7.5 Derivatives and Repeated Roots
Let $p(x)$ be a polynomial with a zero, $r$, having multiplicity $\nu > 1$. Then $r$ is also a zero of the derivative $p^{(j)}(x)$ for $j = 1, 2, \ldots, \nu - 1$.

Proof: Since $r$ has multiplicity $\nu$, $p(x)$ can be written as $p(x) = (x - r)^{\nu} q_0(x)$, for some polynomial $q_0(x)$ for which $r$ is not a zero. Consider the derivatives

\[
\begin{aligned}
p(x) &= (x - r)^{\nu} q_0(x) \\
p'(x) &= \nu (x - r)^{\nu - 1} q_0(x) + (x - r)^{\nu} q_0'(x) \\
      &= (x - r)^{\nu - 1}\bigl(\nu q_0(x) + (x - r) q_0'(x)\bigr) \\
      &= (x - r)^{\nu - 1} q_1(x) \\
p''(x) &= (\nu - 1)(x - r)^{\nu - 2} q_1(x) + (x - r)^{\nu - 1} q_1'(x) \\
       &= (x - r)^{\nu - 2} q_2(x) \\
       &\;\;\vdots \\
p^{(j)}(x) &= (x - r)^{\nu - j} q_j(x) \\
       &\;\;\vdots \\
p^{(\nu - 1)}(x) &= (x - r)\, q_{\nu - 1}(x).
\end{aligned}
\]

Thus, $r$ is a zero for each of the first $\nu - 1$ derivatives.
The next theorem settles the issue of how to handle repeated roots.
THEOREM 7.6 Linear Homogeneous Recurrence Relations with Constant Coefficients Whose Characteristic Equations Have Repeated Roots
Suppose the characteristic equation of the recurrence relation

\[ a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} \]

has a root, $r$, of multiplicity $\nu$. Then for any choice of constants, $\alpha_0, \alpha_1, \ldots, \alpha_{\nu - 1}$, the closed-form expression

\[ a_n = \left(\alpha_0 + \alpha_1 n + \alpha_2 n^2 + \cdots + \alpha_{\nu - 1} n^{\nu - 1}\right) r^n \]

generates a solution to the recurrence relation.
Notation: Notice the change in notation. When there are $k$ distinct roots, $r_1, r_2, \ldots, r_k$, the coefficients are given corresponding subscripts: $\theta_1, \theta_2, \ldots, \theta_k$. When there is a single root, $r$, of multiplicity $\nu$, the coefficients are given subscripts that match the corresponding exponent on $n$: $\alpha_0, \alpha_1, \ldots, \alpha_{\nu - 1}$. The Greek letter for the coefficients has also changed. This change in variable name will be extended in Theorem 7.7.

A simple example will be given before looking at the proof of this theorem.

Repeated Roots Revisited

Recall the following recurrence relation.

• $a_0 = 2$ and $a_1 = 1$
• $a_n = 4a_{n-1} - 4a_{n-2}$ for $n \ge 2$

The characteristic equation is $x^2 - 4x + 4 = (x - 2)^2 = 0$, having root $r = 2$ of multiplicity $\nu = 2$. A general solution (assuming Theorem 7.6 is correct) is $a_n = (\alpha_0 + \alpha_1 n) 2^n$. It is now possible to form a system of linear equations (having a unique solution) to find values for the $\alpha$'s that will match the base values of the recurrence relation:

\[
\begin{aligned}
\alpha_0 + \alpha_1 \cdot 0 &= 2 \\
2(\alpha_0 + \alpha_1 \cdot 1) &= 1.
\end{aligned}
\]

This can be expressed as

\[
\begin{aligned}
\alpha_0 &= 2 \\
2\alpha_0 + 2\alpha_1 &= 1.
\end{aligned}
\]

The values of the $\alpha$'s are $\alpha_0 = 2$ and $\alpha_1 = -\frac{3}{2}$. A solution [36] is $a_n = \left(2 - \frac{3}{2} n\right) 2^n$.
Proof of Theorem 7.6: The characteristic equation is

\[ x^k - c_1 x^{k-1} - \cdots - c_{k-1} x - c_k = 0. \]

Define

\[
\begin{aligned}
p_0(x) &= x^{n-k}\left(x^k - c_1 x^{k-1} - \cdots - c_{k-1} x - c_k\right) \\
       &= x^n - c_1 x^{n-1} - \cdots - c_{k-1} x^{n-k+1} - c_k x^{n-k}.
\end{aligned}
\]

Using the same ideas [37] as in the proof of Theorem 7.5, it can be shown that $r$ is a zero of $p_0^{(j)}(x)$, for $0 \le j \le \nu - 1$. In particular, $r$ is a zero of multiplicity $\nu - 1$ for the polynomial $p_0'(x)$, and hence a zero of multiplicity $\nu - 1$ for the polynomial $p_1(x) = x \cdot p_0'(x)$. [38] But then $r$ is a zero of multiplicity $\nu - 2$ for the polynomial $p_2(x) = x \cdot p_1'(x)$. This can be continued until arriving at the polynomial $p_{\nu - 1}(x) = x \cdot p_{\nu - 2}'(x)$, where $r$ is a root of multiplicity 1. Thus, $r$ is a zero of each of the $\nu$ polynomials $p_0(x), p_1(x), \ldots, p_{\nu - 1}(x)$.

The form of the polynomials $p_j(x)$ is important.

\[
\begin{aligned}
p_0(x) &= x^n - c_1 x^{n-1} - \cdots - c_k x^{n-k} \\
p_1(x) &= x \cdot \left(n x^{n-1} - c_1 (n-1) x^{n-2} - \cdots - c_k (n-k) x^{n-k-1}\right) \\
       &= n x^n - c_1 (n-1) x^{n-1} - \cdots - c_k (n-k) x^{n-k} \\
p_2(x) &= x \cdot \left(n^2 x^{n-1} - c_1 (n-1)^2 x^{n-2} - \cdots - c_k (n-k)^2 x^{n-k-1}\right) \\
       &= n^2 x^n - c_1 (n-1)^2 x^{n-1} - \cdots - c_k (n-k)^2 x^{n-k} \\
       &\;\;\vdots \\
p_{\nu - 1}(x) &= n^{\nu - 1} x^n - c_1 (n-1)^{\nu - 1} x^{n-1} - \cdots - c_k (n-k)^{\nu - 1} x^{n-k}
\end{aligned}
\]

In order to keep the algebraic details from becoming too messy, but still keep sufficient detail to indicate the general case, I will restrict the rest of the proof to the case $\nu = 3$. Substituting $a_n = (\alpha_0 + \alpha_1 n + \alpha_2 n^2) r^n$ into the recurrence relation,

\[
\begin{aligned}
(\alpha_0 + \alpha_1 n + \alpha_2 n^2) r^n &= c_1\left(\alpha_0 + \alpha_1 (n-1) + \alpha_2 (n-1)^2\right) r^{n-1} \\
    &\quad + c_2\left(\alpha_0 + \alpha_1 (n-2) + \alpha_2 (n-2)^2\right) r^{n-2} \\
    &\quad + \cdots \\
    &\quad + c_k\left(\alpha_0 + \alpha_1 (n-k) + \alpha_2 (n-k)^2\right) r^{n-k}.
\end{aligned}
\]

Moving all nonzero terms to the same side of the equation and grouping by powers of $n$ yields

\[
\begin{aligned}
&\alpha_0\left(r^n - c_1 r^{n-1} - \cdots - c_k r^{n-k}\right) \\
&\quad + \alpha_1\left(n r^n - c_1 (n-1) r^{n-1} - \cdots - c_k (n-k) r^{n-k}\right) \\
&\quad + \alpha_2\left(n^2 r^n - c_1 (n-1)^2 r^{n-1} - \cdots - c_k (n-k)^2 r^{n-k}\right) = 0.
\end{aligned}
\]

This is the same as $\alpha_0 p_0(r) + \alpha_1 p_1(r) + \alpha_2 p_2(r) = 0$, [39] which is true since $r$ is a zero of each of the polynomials $p_j(x)$.

[36] Actually, the solution, but this hasn't been proved yet.
[37] Just consider $x^{n-k}$ part of $q(x)$. (Note that $r \ne 0$ since $c_k \ne 0$.)
[38] Recall that $r \ne 0$, so multiplying by $x$ doesn't change the multiplicity.
A Root of Multiplicity Three

A linear homogeneous recurrence relation with constant coefficients having a root $r = 2$ with multiplicity 3 will have characteristic equation

\[ (x - 2)^3 = x^3 - 6x^2 + 12x - 8 = 0. \]

The recurrence relation will therefore be

\[ a_n = 6a_{n-1} - 12a_{n-2} + 8a_{n-3} \qquad \text{for } n \ge 3. \]

Suppose the base values are $a_0 = 1$, $a_1 = 0$, and $a_2 = 4$. The general solution is of the form $a_n = (\alpha_0 + \alpha_1 n + \alpha_2 n^2) 2^n$. The $\alpha$'s are determined by the system of linear equations

\[
\begin{aligned}
\alpha_0 + 0 + 0 &= 1 \\
2\alpha_0 + 2\alpha_1 + 2\alpha_2 &= 0 \\
4\alpha_0 + 8\alpha_1 + 16\alpha_2 &= 4.
\end{aligned}
\]

[39] Dropping the assumption that $\nu = 3$ leads to the equation $\alpha_0 p_0(r) + \alpha_1 p_1(r) + \cdots + \alpha_{\nu - 1} p_{\nu - 1}(r) = 0$.

The first two equations imply that $\alpha_2 = -1 - \alpha_1$. Substituting into the third equation leads to $\alpha_1 = -2$, so $\alpha_2 = 1$. The solution is

\[ a_n = (1 - 2n + n^2)\, 2^n. \]
Quick Check 7.8
1. Find a closed-form formula for the nth term of the recurrence relation
* $a_0 = -2$, $a_1 = -6$
* $a_n = -6a_{n-1} - 9a_{n-2}$ for $n \ge 2$
The General Case
It is now possible to state a theorem about linear homogeneous recurrence relations with constant coefficients having repeated roots.

Solving Linear Homogeneous Recurrence Relations with Constant Coefficients
Suppose the characteristic equation of the recurrence relation

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$$

has $j$ distinct roots, $r_1, r_2, \ldots, r_j$, having respective multiplicities $v_1, v_2, \ldots, v_j$, with $v_1 + v_2 + \cdots + v_j = k$. Then for any choice of constants $\alpha_0, \alpha_1, \ldots, \alpha_{v_1 - 1}, \beta_0, \beta_1, \ldots, \beta_{v_2 - 1}, \ldots, \omega_0, \omega_1, \ldots, \omega_{v_j - 1}$, the closed-form expression

$$a_n = \left(\alpha_0 + \alpha_1 n + \alpha_2 n^2 + \cdots + \alpha_{v_1 - 1} n^{v_1 - 1}\right) r_1^n + \left(\beta_0 + \beta_1 n + \beta_2 n^2 + \cdots + \beta_{v_2 - 1} n^{v_2 - 1}\right) r_2^n + \cdots + \left(\omega_0 + \omega_1 n + \omega_2 n^2 + \cdots + \omega_{v_j - 1} n^{v_j - 1}\right) r_j^n$$

generates a solution to the recurrence relation. In addition, if base values $a_0, a_1, \ldots, a_{k-1}$ are specified, then unique values can be found for $\alpha_0, \alpha_1, \ldots, \alpha_{v_1 - 1}, \beta_0, \ldots, \omega_{v_j - 1}$ so that the closed-form formula matches the sequence generated by the recurrence relation. Thus, the closed-form formula is the only solution; any other solution must be algebraically equivalent.

The proof of this theorem is not conceptually difficult, but the notational aspects are a bit messy. A more precise statement of this theorem would use doubly subscripted coefficients in the expression for the general form. The first subscript would link the coefficient to the root, $r_i$. The second subscript would link the coefficient to the exponent on $n$. The proof will not be given. Several examples should serve to indicate how the theorem can be used.
A Simple Example of the General Case
Let $a_0 = 12$, $a_1 = 18$, and $a_2 = 24$ and

$$a_n = 12a_{n-2} + 16a_{n-3} \quad \text{for } n \ge 3.$$

The characteristic equation is

$$x^3 - 12x - 16 = (x - 4)(x + 2)^2 = 0.$$

The roots are $r_1 = 4$ with multiplicity $v_1 = 1$, and $r_2 = -2$ with multiplicity $v_2 = 2$. The general solution is of the form $a_n = \alpha_0 4^n + (\beta_0 + \beta_1 n)(-2)^n$. The system of linear equations that determines the unknown coefficients is

$$\begin{aligned} \alpha_0 + \beta_0 + 0 \cdot \beta_1 &= 12 \\ 4\alpha_0 - 2\beta_0 - 2\beta_1 &= 18 \\ 16\alpha_0 + 4\beta_0 + 8\beta_1 &= 24. \end{aligned}$$

One way that this system can be solved is to add 4 times the second equation to the third equation, producing $32\alpha_0 - 4\beta_0 = 96$. Then substitute $\beta_0 = 12 - \alpha_0$ to find that $\alpha_0 = 4$. Thus, $\beta_0 = 8$, and the second of the original equations then implies that $\beta_1 = -9$. The solution is therefore

$$a_n = 4^{n+1} + (8 - 9n)(-2)^n.$$

■
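Two Repeated Roots
Consider the recurrence relation defined by
* $a_0 = 1$, $a_1 = 1$, $a_2 = 0$, $a_3 = 2$
* $a_n = 8a_{n-2} - 16a_{n-4}$

The characteristic equation is $x^4 - 8x^2 + 16 = 0$. This polynomial is easily factored:

$$x^4 - 8x^2 + 16 = (x^2 - 4)^2 = (x - 2)^2 (x + 2)^2.$$

Thus, $r_1 = 2$ with multiplicity 2 and $r_2 = -2$ with multiplicity 2. The general form is thus $a_n = (\alpha_0 + \alpha_1 n)\, 2^n + (\beta_0 + \beta_1 n)(-2)^n$. The system of linear equations that will determine the $\alpha$'s and $\beta$'s is shown next.

$$\begin{aligned} \alpha_0 + \beta_0 &= 1 \\ (\alpha_0 + \alpha_1 \cdot 1)\, 2^1 + (\beta_0 + \beta_1 \cdot 1)(-2)^1 &= 1 \\ (\alpha_0 + \alpha_1 \cdot 2)\, 2^2 + (\beta_0 + \beta_1 \cdot 2)(-2)^2 &= 0 \\ (\alpha_0 + \alpha_1 \cdot 3)\, 2^3 + (\beta_0 + \beta_1 \cdot 3)(-2)^3 &= 2 \end{aligned}$$

This simplifies to

$$\begin{aligned} \alpha_0 + \beta_0 &= 1 \\ 2\alpha_0 + 2\alpha_1 - 2\beta_0 - 2\beta_1 &= 1 \\ 4\alpha_0 + 8\alpha_1 + 4\beta_0 + 8\beta_1 &= 0 \\ 8\alpha_0 + 24\alpha_1 - 8\beta_0 - 24\beta_1 &= 2. \end{aligned}$$

The solution of this system of linear equations is

$$\alpha_0 = \frac{13}{16}, \quad \alpha_1 = -\frac{5}{16}, \quad \beta_0 = \frac{3}{16}, \quad \beta_1 = -\frac{3}{16}.$$

The solution for the recurrence relation is thus

$$a_n = \left(\frac{13}{16} - \frac{5}{16} n\right) 2^n + \left(\frac{3}{16} - \frac{3}{16} n\right) (-2)^n.$$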
A small table of values for both the original recurrence relation and the closed-form expression indicates that no arithmetic errors have occurred.

| n | Recurrence Relation | Closed-Form |
|---|---------------------|-------------|
| 0 | 1 | 1 |
| 1 | 1 | 1 |
| 2 | 0 | 0 |
| 3 | 2 | 2 |
| 4 | -16 | -16 |
| 5 | 0 | 0 |
| 6 | -128 | -128 |
| 7 | -32 | -32 |
| 8 | -768 | -768 |

■
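The table in the Two Repeated Roots example can be regenerated mechanically. The following Python sketch is not part of the original text; the function names are illustrative, and exact rational arithmetic (`fractions.Fraction`) is used so that the fractional coefficients introduce no rounding.

```python
from fractions import Fraction


def recurrence_terms(count):
    """First `count` terms of a_n = 8a_{n-2} - 16a_{n-4} with a_0..a_3 = 1, 1, 0, 2."""
    terms = [1, 1, 0, 2]
    while len(terms) < count:
        terms.append(8 * terms[-2] - 16 * terms[-4])
    return terms[:count]


def closed_form(n):
    """Closed form (13/16 - 5n/16) 2^n + (3/16 - 3n/16) (-2)^n, evaluated exactly."""
    alpha0, alpha1 = Fraction(13, 16), Fraction(-5, 16)
    beta0, beta1 = Fraction(3, 16), Fraction(-3, 16)
    return (alpha0 + alpha1 * n) * 2 ** n + (beta0 + beta1 * n) * (-2) ** n


if __name__ == "__main__":
    for n, value in enumerate(recurrence_terms(9)):
        # Each row shows n, the recurrence value, and the closed-form value.
        print(n, value, closed_form(n))
```

If the two columns ever disagree, either the coefficients or the recurrence has been mistyped, which is the same sanity check the table performs by hand.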
The General Procedure for Solving Linear Homogeneous Recurrence Relations with Constant Coefficients
Step 1 Form the characteristic equation.
Step 2 Find the roots of the characteristic equation, $r_1, r_2, \ldots, r_j$, having respective multiplicities $v_1, v_2, \ldots, v_j$, with $v_1 + v_2 + \cdots + v_j = k$.
Step 3 Express the general solution as a sum of $j$ terms in the form $\left(\delta_0 + \delta_1 n + \delta_2 n^2 + \cdots + \delta_{v_i - 1} n^{v_i - 1}\right) r_i^n$.
Step 4 Use the result of step 3 to form a system of linear equations to determine the unknown coefficients.
Step 5 Solve the linear system and substitute the solution values for the unknown coefficients into the general solution from step 3.
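Steps 3 through 5 are mechanical once the roots are known, so they can be sketched in code. The Python below is an illustration rather than anything from the original text: the helper names are made up, the roots and multiplicities from step 2 are assumed to be supplied by the caller, and the roots are assumed to be rational so that `fractions.Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.

    Assumes the system has a unique solution (as the theorem guarantees).
    """
    size = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(size):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def closed_form_coefficients(roots, base_values):
    """roots: list of (root, multiplicity); base_values: a_0, ..., a_{k-1}."""
    # Step 3: one unknown coefficient for each basis term n^p * r^n.
    basis = [(r, p) for r, multiplicity in roots for p in range(multiplicity)]
    assert len(basis) == len(base_values)
    # Step 4: the equation for a_n uses n^p * r^n as the coefficient of each unknown.
    matrix = [[Fraction(n) ** p * Fraction(r) ** n for r, p in basis]
              for n in range(len(base_values))]
    # Step 5: solve the system exactly.
    return basis, solve_linear_system(matrix, base_values)


def evaluate(basis, coefficients, n):
    """Evaluate the closed form at n."""
    return sum(c * Fraction(n) ** p * Fraction(r) ** n
               for (r, p), c in zip(basis, coefficients))


if __name__ == "__main__":
    # Roots 4 (multiplicity 1) and -2 (multiplicity 2); base values 12, 18, 24.
    basis, coefficients = closed_form_coefficients([(4, 1), (-2, 2)], [12, 18, 24])
    print(coefficients)
```

With the roots and base values of the Simple Example of the General Case, the coefficients come out as 4, 8, and -9, matching $a_n = 4^{n+1} + (8 - 9n)(-2)^n$.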
Linear Homogeneous Recurrence Relations with Constant Coefficients Having Complex Roots
You should recall from your previous mathematics courses that some polynomials have complex numbers⁴⁰ as zeros. Theorem 7.7 is still valid. This example involves a linear homogeneous recurrence relation with constant coefficients whose characteristic equation has complex roots.

Let $a_0 = 3$, $a_1 = 3$, $a_2 = 11$, and $a_3 = 34$. For $n \ge 4$, let $a_n = 3a_{n-1} - a_{n-2} - 4a_{n-4}$. The characteristic equation is

$$x^4 - 3x^3 + x^2 + 4 = 0.$$

Theorem 7.4 can be used to find a root. The choices are $\pm 1, \pm 2, \pm 4$. One root is $r = 2$. Dividing $x^4 - 3x^3 + x^2 + 4$ by $x - 2$ produces the quotient $x^3 - x^2 - x - 2$. So

$$x^4 - 3x^3 + x^2 + 4 = (x - 2)(x^3 - x^2 - x - 2).$$

⁴⁰ Numbers of the form $a + bi$, where $i^2 = -1$. See Appendix A for a brief review of complex numbers.
Using Theorem 7.4 again, it is easy to see that 2 is a root of $x^3 - x^2 - x - 2$. After dividing $x^3 - x^2 - x - 2$ by $x - 2$, we can write

$$x^4 - 3x^3 + x^2 + 4 = (x - 2)^2 (x^2 + x + 1).$$

The rational roots theorem provides no help factoring $x^2 + x + 1$. The quadratic formula produces the zeros

$$x = \frac{-1 \pm \sqrt{-3}}{2} = \frac{-1 \pm \sqrt{3}\, i}{2}.$$

The roots are therefore $r_1 = 2$ with multiplicity $v_1 = 2$, $r_2 = \frac{-1 + \sqrt{3}\, i}{2}$ with multiplicity $v_2 = 1$, and $r_3 = \frac{-1 - \sqrt{3}\, i}{2}$ with multiplicity $v_3 = 1$. The general form for the solution is

$$a_n = (\alpha_0 + \alpha_1 n)\, 2^n + \beta_0 \left(\frac{-1 + \sqrt{3}\, i}{2}\right)^n + \gamma_0 \left(\frac{-1 - \sqrt{3}\, i}{2}\right)^n.$$
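The system of linear equations is

$$\begin{aligned} \alpha_0 + 0 \cdot \alpha_1 + \beta_0 + \gamma_0 &= 3 \\ 2\alpha_0 + 2\alpha_1 + \left(\frac{-1 + \sqrt{3}\, i}{2}\right) \beta_0 + \left(\frac{-1 - \sqrt{3}\, i}{2}\right) \gamma_0 &= 3 \\ 4\alpha_0 + 8\alpha_1 + \left(\frac{-1 + \sqrt{3}\, i}{2}\right)^2 \beta_0 + \left(\frac{-1 - \sqrt{3}\, i}{2}\right)^2 \gamma_0 &= 11 \\ 8\alpha_0 + 24\alpha_1 + \left(\frac{-1 + \sqrt{3}\, i}{2}\right)^3 \beta_0 + \left(\frac{-1 - \sqrt{3}\, i}{2}\right)^3 \gamma_0 &= 34. \end{aligned}$$

This simplifies to

$$\begin{aligned} \alpha_0 + \beta_0 + \gamma_0 &= 3 \\ 2\alpha_0 + 2\alpha_1 + \left(\frac{-1 + \sqrt{3}\, i}{2}\right) \beta_0 + \left(\frac{-1 - \sqrt{3}\, i}{2}\right) \gamma_0 &= 3 \\ 4\alpha_0 + 8\alpha_1 + \left(\frac{-1 - \sqrt{3}\, i}{2}\right) \beta_0 + \left(\frac{-1 + \sqrt{3}\, i}{2}\right) \gamma_0 &= 11 \\ 8\alpha_0 + 24\alpha_1 + \beta_0 + \gamma_0 &= 34. \end{aligned}$$

Solving this system by hand is not fun.⁴¹ I used Mathematica to find $\alpha_0 = \alpha_1 = \beta_0 = \gamma_0 = 1$. The final solution is thus

$$a_n = (1 + n)\, 2^n + \left(\frac{-1 + \sqrt{3}\, i}{2}\right)^n + \left(\frac{-1 - \sqrt{3}\, i}{2}\right)^n.$$

The solution sequence is $\{3, 3, 11, 34, 79, 191, 450, 1023, 2303, \ldots\}$.

■

⁴¹ But it can be done without a totally unreasonable amount of work. You might start by adding the two middle equations.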
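Even though the closed form involves complex numbers, every term of the sequence is an integer because the two complex powers are conjugates. A quick numerical check, sketched here in Python (not from the original text; it uses floating-point complex arithmetic, so the result is rounded to the nearest integer instead of being computed exactly):

```python
import math


def recurrence_terms(count):
    """First `count` terms of a_n = 3a_{n-1} - a_{n-2} - 4a_{n-4}, a_0..a_3 = 3, 3, 11, 34."""
    terms = [3, 3, 11, 34]
    while len(terms) < count:
        terms.append(3 * terms[-1] - terms[-2] - 4 * terms[-4])
    return terms[:count]


def closed_form(n):
    """(1 + n) 2^n + r2^n + r3^n, with r2 and r3 the complex roots of x^2 + x + 1."""
    r2 = complex(-0.5, math.sqrt(3) / 2)
    r3 = r2.conjugate()
    value = (1 + n) * 2 ** n + r2 ** n + r3 ** n
    # The imaginary parts cancel up to floating-point noise.
    assert abs(value.imag) < 1e-6
    return round(value.real)


if __name__ == "__main__":
    print(recurrence_terms(9))
    print([closed_form(n) for n in range(9)])
```

Both lines print the sequence 3, 3, 11, 34, 79, 191, 450, 1023, 2303 listed in the example.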
7.2.4 The Sordid Truth
The techniques presented in this section for solving linear homogeneous recurrence relations with constant coefficients depend on two assumptions:
* We must be able to find all the roots of the characteristic equation in exact form.⁴²
* We must be able to solve exactly the system of linear equations that determines the $\alpha$'s.

The second assumption becomes a bit of an issue if there are more than 3 or 4 $\alpha$'s. Solving linear systems with more than three or four equations is quite tedious, especially if rounding off is not allowed. However, computer software exists that can handle this task for quite a large number of equations (much more than you will ever need).

The real issue is the first assumption. There is more involved than just being able to factor properly. Many polynomials have no rational roots (that is, all their roots are either irrational or complex). Can computer software help us here? Yes, but only to a limited extent. A brief detour is needed to make this more explicit. You are familiar with the famous⁴³ quadratic formula.
Quadratic Formula
The equation

$$ax^2 + bx + c = 0,$$

where $a$, $b$, and $c$ are real or complex numbers, has the solutions

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
Notice that this formula expresses the roots in terms of the coefficients of the polynomial. The content⁴⁴ of the quadratic formula was known in ancient times. During the 1500s A.D., it became a challenge to find such a formula (involving algebraic manipulations of the coefficients) for a general cubic equation. The bizarre story behind the eventual discovery of such a formula can be found in [21]. If the result were to be expressed using modern notation, it would look like the following: Let $ax^3 + bx^2 + cx + d = 0$ be a polynomial equation with real or complex coefficients. Then the simplest of the three roots looks like the multiline expression in Figure 7.23.

$$x = -\frac{b}{3a} - \frac{\sqrt[3]{2}\, (3ac - b^2)}{3a \sqrt[3]{-2b^3 + 9abc - 27a^2 d + \sqrt{4(3ac - b^2)^3 + (-2b^3 + 9abc - 27a^2 d)^2}}} + \frac{\sqrt[3]{-2b^3 + 9abc - 27a^2 d + \sqrt{4(3ac - b^2)^3 + (-2b^3 + 9abc - 27a^2 d)^2}}}{3 \sqrt[3]{2}\, a}$$

Figure 7.23 The simplest root of $ax^3 + bx^2 + cx + d = 0$.
A few years later, a "formula" for the general degree 4 equation was found (actually, it was more like an algorithm prescribing the algebraic manipulations of the coefficients).⁴⁵ Progress then came to a halt: No one could find a formula for the general degree 5 polynomial. During the 1800s two young mathematicians independently proved that no such formula exists for polynomials of degree 5 or higher. The mathematicians were the Norwegian Niels Abel (died in 1829 of tuberculosis at age 26) and the Frenchman Évariste Galois (shot in a duel in 1832 at age 20).

Since no such formula exists⁴⁶ for degree 5 or higher polynomial equations, there is no guarantee that we can exactly solve the characteristic equation for recurrence relations of degree 5 or higher. This means that the technique of Theorem 7.7 cannot be applied for every potential recurrence relation. The quadratic formula is truly amazing: It is simple, and it always produces all the roots.

More could be done with the technique presented in Sections 7.2.2 and 7.2.3. In particular, it can be extended to cover linear nonhomogeneous recurrence relations with constant coefficients. Such recurrence relations look like

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} + g(n)$$

for some function, $g(n)$, which does not involve any of the $a_j$'s. Additional work can also be done to simplify the solution when the characteristic equation has complex roots. Neither of these additions will be pursued here. Instead, a more general approach that uses generating functions will be examined in Section 7.4.

⁴² That is, if the root is $\sqrt{2}$, we may not use an approximation such as 1.41421356. Review Example 7.25.
⁴³ Infamous in the minds of some high school students.
⁴⁴ But not the modern formulaic expression.
7.2.5 Exercises The exercises marked with OD-have detailed solutions in Appendix G.
1. How many rabbits will there be after one year? (See Example 7.16.)
(c) a0 = 3, and an = 3 + (d) ao = 1, and an = 6an
2
an_1 for n > 1 for n > 1
6. Solve the following recurrence relations (find closed-form formulas).
2. ODWrite a recursive algorithm for fn, the nth Fibonacci number. Is this an efficient algorithm?
(a) ao = 0, and an 8 an (b) a0 = 7, andan = (n!
3. Solve the following recurrence relations (find closed-form
(c) a0 = 2, and an = an-I +
formulas). (a) ao = 5, and an = 2anI - 3 for n > 1 for n > 1 (b) *1ao = 1, and an = nan2 (c) *1-a0 = - 1, aI = 3, and an = an-2 for n > 2 (d) a0 =-3, andan =an_1 n forn>_1 _
4. Solve the following recurrence relations (find closed-form formulas). (a) ao = 7, and an = 8 an for n > 1 (b) a 0 = 0, and an = 4an_- + 5 for n > 1 (c) P a 0 = 0, and an = 5an-I + n for n > 1 Full simplification is possible but tricky. (d) a0 = 1, and a,
=
an-1 n
for n > 1
5. Solve the following recurrence relations (find closed-form
for n > I forn > 1 1)an_ forn > I
(d) a0 = 5, a1I = 4, and an = lan-2
for n > 2
7. The Tower of Hanoi
(a) Find a closed-form formula for $H_n$, the minimum number of moves to solve the Tower of Hanoi problem with $n$ disks.
(b) Suppose $n = 64$ and that the Brahman monks can move one disk per second. How long will it take to solve the puzzle?
8. Let $f_n$ be the nth Fibonacci number. Prove that $f_n = f_k f_{n-k} + f_{k-1} f_{n-k-1}$ for $k \in \{1, 2, \ldots, n - 1\}$.
9. Prove Theorem 7.2. 9 h 10. Prove that
formulas). (a) ao = 4, and an = 3 nan-t for n > 1 forn > I (b) ao=3, andan =a2 n--1 45 46
$$\lim_{n \to \infty} \frac{f_n}{f_{n-1}} = \frac{\sqrt{5} + 1}{2} = \phi,$$

the golden ratio.

⁴⁵ It produces a formula that is even nastier than the degree 3 case.
⁴⁶ For degree 5 polynomial equations, there is a technique that does not express the roots as algebraic expressions in the coefficients. It involves much more sophisticated ideas. No such technique is known for degree 6 or higher.
7.2 Recurrence Relations 11. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients. Don't round off or use calculator approximations; use exact arithmetic! 5 (a) ODao= 2, al = -2, and a, = -2a 1 + 1 a,-2 for n > 2 (b) ao 3, and an= 7 a,-I for n > 1 (Use the technique for solving linear homogeneous recurrence relation with constant coefficients, not back substitution.) (c) ao = 1, a, = 1,and a,= 2 a,-2 (d) ao = -1, aI = 0, a2 1, and an = 2 an,- j + 5 an-2 - 6an-
for n > 2 3
for n > 3
357
look the same as each other. Create, and solve, a recurrence relation that counts the number of visually distinguishable case arrangements there are for a rack with n slots. Bit strings of length n 17. ýIA (a) Create a recurrence relation for counting the number of distinct bit strings of length n that do not contain three consecutive Is. Include the base conditions for the recurrency relation. (b) Explain why the process for finding a closed-form formula for the recurrence relation in part (a) may be problematic with current methods. (A closed-form formula will be the goal of a future exercise.)
12. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients. Don't
18. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients whose
round off or use calculator approximations; use exact arithmetic! (a) ao= l, al =4, anda, =- 5 an_ - 6 an_2 forn >2 (b) a0 =2,a =3, andan =8anI -l-16an2 forn >2
characteristic equations have repeated roots. Use exact arithmetic! (a) ao =5, a = -3, anda, =an1 - lan2 forn >2 (b) Oao = 1,aI =2,a 2 = 1,and
(c)
an =8a_1_-21an_2 +18an_3 forn>3 (c) a 0 = 0,at = 0,a 2 = 0,a 3 = 125,and
ao0=3, aI=4,a2=6, andan=6an 6a,-3 for n > 3
(d) ao=O,al=l, a 2 = 2,and an = an-1 + 9 an-2 - 9 an-3
-lan_2+
an= -2anl+llan_2+1 for n >3
13. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients. Don't round off or use calculator approximations; use exact arithmetic! for n > 1 (Use the tech(a) a0 = 1,and an 3- = !an-I nique for solving linear homogeneous recurrence relation with constant coefficients, not back substitution.) (b) ao = 3, al = 7, and a =6anll + 3a,-2 forn>2 (c) ao =2, a =5, andan = -Ian2 (d) a 0 = -4, a1 =-3, a2 =0, and an l--3an-1- 3an-2-an-3
forn >2
for n>3 aan
14. Suppose that it is possible to climb a set of stairs by taking arbitrary combinations of either one stair or two stairs at a time. For example, someone could get to stair 3 by taking 3 single steps, or by moving to stair 1 in a single step and then to stair 3 using a double step. It is also possible to move to stair 2 in a double step and then to stair 3 with a single step.
(a) Create a recurrence relation for counting the number of distinct ways to climb $n$ stairs. Include the base conditions for the recurrence relation. (Hint: Try a few small values for $n$ by exhaustively listing all possibilities.)
(b) Find a closed-form formula for the number of distinct ways to climb $n$ stairs.
15. In how many ways can a $2 \times n$ rectangular checker board be tiled using combinations of $2 \times 1$ and $2 \times 2$ tiles?
16. A CD rack contains $n$ slots in which CD cases can be inserted. The slots can hold a single CD, or a double CD case can be inserted into two adjacent slots. Suppose that the pattern of interest is the visually distinguishable arrangements of single and double CD cases in the rack. That is, all single CD cases look the same as each other and all double CD cases
(d) ao=2,al =2,a2
2
an_3-36an_4
=4,a3 =
forn > 4
8,and
an = -an- 1 + 9 an-2 - I lan_3 + 4an_4
for n > 4
19. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients whose equations have repeated roots. Use exact arithcharacteristic metic ! (a) ao = 1, al = 3, a2 = 8, and 6 anl3 forn>3 an -aI-20a,-2+l f -1,a3 =2, and (b) 00 =0,aI = 1,a2 an = 4a,-1 +26an-2 -60an-3 - 22 5an-4 forn > 4 (c) a0=l,at = 1,a2=2,a3=2,and (c3 0 1a=~ 2a=,n = -6an- 1 212an_2 - 10an_3 - 3an_4 forn > 4 (d) a0 = 2, al = 4, a2 = 6, a3 = 6, a4 = 20, and an = 7 an-1 - 9 an-2 - 2 3 an-3 + 50an-4 - 24an-5 for n > 5 (Use a matrix-capable calculator or computer software to solve the linear system.) 20. Find closed-form formulas for the following linear homogeneous recurrence relations with constant coefficients whose characteristic equations have repeated roots. Use exact arithmetic! (a) ao = 1, al = 5, a2 = 9, a3 = 12, and an = 10an-2 - 25a,-4 for n > 4 aj =0, a2 = 8, a3 = 10, a4 = 40, and (b) ao = 6, an = 6 an-1 - llan-2 + 2 an-3 + 12a,-4 - 8an-5 for n > 5 (Use a matrix-capable calculator or computer software to solve the linear system.) (c) ao = 2, al = -2, and an = -a,-2 (d) ao = 1, al = 2, a2 = 3, and
for n > 2
21. Suppose you are given an unlimited supply of red, blue, and green cards.
(a) Write a recurrence relation that counts the number, $a_n$, of distinguishable ways to form a stack consisting of $n$ colored cards.
(b) Solve the recurrence relation.
(c) Show that

$$a_n = \sum_{i=0}^{n} \sum_{j=0}^{n-i} \text{multinomial}(i, j, n - i - j).$$

(d) Prove that

$$\sum_{i=0}^{n} \sum_{j=0}^{n-i} \frac{n!}{i!\, j!\, (n - i - j)!} = 3^n.$$

22. Suppose you are given an unlimited supply of red, blue, and green cards. Use recurrence relations to determine the number, $a_n$, of ways to form a stack consisting of $n$ colored cards in which a green card is never directly on top of another green card.
One of the following strategies may appeal to you. A computer algebra system such as Mathematica or Maple or Derive will be useful.
Strategy 1: There are three possible colors for the top card in a stack of height $n$ that does not have adjacent green cards. Now determine how each of these three cases can be built from smaller stacks that don't have adjacent green cards.
Strategy 2: Work the problem by listing the possibilities for the first few values of $n$. Then let $y_n$ be the number of stacks of height $n$ which have a nongreen card on top and let $x_n$ be the number of stacks of height $n$ that have a green card on top. Express $a_n$ in terms of $y_n$ and $x_n$. Then write a recursion for $y_n$ in terms of $y_i$'s and $x_j$'s. Next, express $x_n$ in terms of $y_i$'s. Finally, find a recurrence relation that only contains $y_i$'s. Solve that recurrence relation and use its solution to find a closed-form expression for $a_n$.
7.3 Big-Θ and Recursive Algorithms: The Master Theorem
Section 4.2 introduced the big-Θ mechanism for ranking algorithm efficiency and Section 7.1 introduced recursive algorithms. The goal of this section is to apply big-Θ analysis to a class of recursive algorithms called divide-and-conquer algorithms. The name divide and conquer has been used for this solution technique because the original problem is divided into smaller problems, each of which is solved, and then the individual solutions are used to solve (conquer) the original problem. A few examples will indicate the primary direction.
Recursive Binary Search
The binary search algorithm (page 180) can be rewritten as a recursive algorithm. The algorithm assumes that a return value of "not found" will terminate all pending recursive invocations.

 1: integer recBinarySearch(x, {a_0, a_1, a_2, ..., a_{n-1}})
 2:   if n == 1
 3:     if a_0 == x
 4:       return 0
 5:     else
 6:       return "not found"
 7:
 8:   if x < a_{⌊n/2⌋}   # is x in the first half of the list?
 9:     return recBinarySearch(x, {a_0, ..., a_{⌊n/2⌋-1}})   # try the 1st half
10:   else
11:     return ⌊n/2⌋ + recBinarySearch(x, {a_{⌊n/2⌋}, ..., a_{n-1}})   # try the 2nd half
12: end recBinarySearch
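For readers who want to run the algorithm, here is one possible Python rendering. It is a sketch, not the book's code: the pseudocode passes sublists and adds ⌊n/2⌋ to the result, while this version passes index bounds (`low`, `high`, names chosen for illustration) so that no sublists are copied, and it signals "not found" by returning `None`, which propagates through all pending invocations.

```python
def rec_binary_search(x, items, low=0, high=None):
    """Return an index of x in the sorted list `items`, or None if x is absent.

    Searches the half-open range items[low:high].
    """
    if high is None:
        high = len(items)
    n = high - low
    if n <= 0:
        return None  # assumption: an empty list simply reports "not found"
    if n == 1:
        return low if items[low] == x else None
    mid = low + n // 2
    if x < items[mid]:  # is x in the first half of the list?
        return rec_binary_search(x, items, low, mid)
    return rec_binary_search(x, items, mid, high)


if __name__ == "__main__":
    data = [2, 3, 5, 7, 11, 13, 17, 19]
    print(rec_binary_search(11, data))
    print(rec_binary_search(4, data))
```

As in the pseudocode, each invocation does a constant amount of work and then makes at most one recursive call on half of the current range.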
Assume, for the duration of this example, that $n = 2^k$, for some $k \in \mathbb{N}$. Then each of the divisions into two sublists will result in equal-sized sublists. Let $f(n)$ represent the number of critical operations (worst case) to search a list of size $n$ for $x$. There are two comparisons (lines 2 and 3) when the list has only one element, so $f(1) = 2$. If $n > 1$, then there will still be two comparisons (lines 2 and 8), and one recursive invocation of recBinarySearch on a list of size $\frac{n}{2}$ (since $n = 2^k$). Thus
* $f(n) = f\left(\frac{n}{2}\right) + 2$
* $f(1) = 2$

This nonlinear nonhomogeneous recurrence relation seems like a good candidate for back substitution. Notice that when a list of size $\frac{n}{2}$ is divided into two equal-sized sublists, each sublist will consist of $\frac{n}{2^2}$ elements. Only one of those lists will be examined.

$$\begin{aligned} f(n) &= f\left(\frac{n}{2}\right) + 2 \\ &= \left[f\left(\frac{n}{2^2}\right) + 2\right] + 2 && \text{substitute} \\ &= f\left(\frac{n}{2^2}\right) + (2 + 2) && \text{simplify} \\ &= \left[f\left(\frac{n}{2^3}\right) + 2\right] + (2 + 2) && \text{substitute} \\ &= f\left(\frac{n}{2^3}\right) + (2 + 2 + 2) && \text{simplify} \\ &\;\;\vdots \\ &= f\left(\frac{n}{2^k}\right) + \underbrace{(2 + \cdots + 2)}_{k \text{ times}} \\ &= f(1) + 2 + \cdots + 2 = 2(k + 1) = 2(\log_2(n) + 1) = 2\log_2(n) + 2 \end{aligned}$$

We can test this formula for a few small values of $n$. Recalling that $n = 2^k$, the following table can be derived.

| n | 2 log₂(n) + 2 |
|---|---------------|
| 1 | 2 |
| 2 | 4 |
| 4 | 6 |
| 8 | 8 |
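The back substitution can also be checked by evaluating the recurrence directly. A minimal Python sketch (the function name is illustrative, and it assumes, as the example does, that $n$ is a power of 2):

```python
def comparisons(n):
    """Worst-case comparison count from f(n) = f(n/2) + 2 with f(1) = 2.

    Assumes n is a power of 2, so n // 2 is exact at every level.
    """
    if n == 1:
        return 2
    return comparisons(n // 2) + 2


if __name__ == "__main__":
    for k in range(4):
        n = 2 ** k
        # For n = 2^k, log2(n) = k, so the closed form is 2k + 2.
        print(n, comparisons(n), 2 * k + 2)
```

The second and third columns agree for every power of 2, not just the four values in the table.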
When $n = 2$, there will be two comparisons, and then one recursive invocation with two comparisons (for a list of size 1), for a total of four comparisons. This agrees with the table. When $n = 4$, there will be two comparisons, and then a recursive invocation of recBinarySearch on a list of size 2, requiring four comparisons, for a total of six comparisons. This also agrees with the table. You should verify that the final value in the table is also correct.

The net result is that when $n = 2^k$, recBinarySearch $\in \Theta(\log_2(n))$, in agreement with the prior (worst case) result for the nonrecursive binarySearch algorithm (in Section 4.2.4).

■

The next example also involves an algorithm which divides the data list into two equal parts. However, in this example, both sublists need to be processed recursively.

Simple Merge Sort
Consider the following recursive algorithm for sorting a list of items. First, split the list into two equal (or nearly equal) length sublists. Recursively sort each list, and then merge the two sublists into a single, sorted list. The merging can proceed by starting at the front of each of the two sorted sublists and comparing the current front item in each list. Move the smaller of the two items to a new list. Keep moving the smaller of the two front items until one of the lists is empty. Then append all remaining items in the nonempty list onto the end of the new list.

For instance, to sort the data set {q, s, p, w, z, r, y, x}, first split it into two sublists of size 4. The sublists are {q, s, p, w} and {z, r, y, x}. The sublists, after sorting, are {p, q, s, w} and {r, x, y, z}. They can be merged by starting at the left of each list. The elements p and r are compared first and p is moved (or copied) to a new list. Next, the new front elements (or first uncopied elements) are compared: q and r in this case. The element q is copied to the new list. Then s and r are compared and r is copied to the new list. The lists now look like: {s, w}, {x, y, z}, and {p, q, r}. The elements s and x are compared and s is moved; then w and x are compared and w is moved. The lists now look like: {}, {x, y, z}, and {p, q, r, s, w}. Finally, the elements x, y, and z are moved to the new list, producing the sorted list: {p, q, r, s, w, x, y, z}.

The algorithm mergeSort utilizes this strategy.

sorted list mergeSort({a_0, a_1, ..., a_{n-1}})
  if n == 1
    return {a_0}   # a list of length 1 is already sorted

  mergeSort({a_0, ..., a_{⌊n/2⌋-1}})   # sort the left half
  mergeSort({a_{⌊n/2⌋}, ..., a_{n-1}})   # sort the right half

  # merge the two sorted sublists
  i = 0
  j = ⌊n/2⌋
  k = 0

  # compare the front elements
  while i ≤ ⌊n/2⌋ - 1 and j ≤ n - 1
    if a_i < a_j
      b_k = a_i
      i = i + 1
    else
      b_k = a_j
      j = j + 1
    k = k + 1

  # copy everything remaining in the left list
  while i ≤ ⌊n/2⌋ - 1
    b_k = a_i
    i = i + 1
    k = k + 1

  # copy everything remaining in the right list
  while j ≤ n - 1
    b_k = a_j
    j = j + 1
    k = k + 1

  return {b_0, b_1, ..., b_{n-1}}
end mergeSort
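A runnable version of the same strategy is sketched below in Python. It departs from the pseudocode in a few labeled ways: it returns new lists instead of sorting in place, it uses `<=` when comparing front elements (which keeps equal items in their original order), and it also returns a count of the items copied during merging so that the cost of the algorithm can be observed. These choices and all names are illustrative assumptions.

```python
def merge_sort(items):
    """Return (sorted_list, copies), where copies counts items copied while merging."""
    n = len(items)
    if n <= 1:
        return list(items), 0  # a list of length 0 or 1 is already sorted

    mid = n // 2
    left, left_copies = merge_sort(items[:mid])    # sort the left half
    right, right_copies = merge_sort(items[mid:])  # sort the right half

    # Merge the two sorted sublists by repeatedly moving the smaller front element.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    merged.extend(left[i:])   # copy everything remaining in the left list
    merged.extend(right[j:])  # copy everything remaining in the right list

    # Every one of the n items is copied exactly once during this merge.
    return merged, left_copies + right_copies + n


if __name__ == "__main__":
    print(merge_sort(list("qspwzryx")))
```

For the eight-letter data set from the example, the sorted result is p, q, r, s, w, x, y, z and the copy count is 24.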
Assume, for the rest of this example, that $n = 2^k$ for some $k \in \mathbb{N}$. The algorithm works by first sorting two sublists of length $\frac{n}{2}$ and then merging the sublists. The merge requires $n + 1$ comparisons (some while neither list has been exhausted, others while the remaining elements in the uncompleted list are copied, and one more to determine that the other list has been completed). The merge also requires $n$ data items to be copied. There will be an additional comparison at the beginning of each invocation of the algorithm. Suppose, for now, that comparisons are much faster than the data copies, so they can be ignored. Then the function $f$, which counts the number of data copies, is
* $f(n) = 2f\left(\frac{n}{2}\right) + n$
* $f(1) = 0$

This recurrence relation is also easy to solve using back substitution.

$$\begin{aligned} f(n) &= 2f\left(\frac{n}{2}\right) + n \\ &= 2\left[2f\left(\frac{n}{2^2}\right) + \frac{n}{2}\right] + n && \text{substitute} \\ &= 2^2 f\left(\frac{n}{2^2}\right) + 2n && \text{simplify} \\ &= 2^2\left[2f\left(\frac{n}{2^3}\right) + \frac{n}{2^2}\right] + 2n && \text{substitute} \\ &= 2^3 f\left(\frac{n}{2^3}\right) + 3n && \text{simplify} \\ &\;\;\vdots \\ &= 2^k f(1) + kn \\ &= kn \\ &= n \log_2(n) \end{aligned}$$

This sorting algorithm is therefore in $\Theta(n \log_2(n))$. You might wish to verify this for $n = 1, 2, 4$, and 8. Notice that best case, worst case, and average case are all the same: This algorithm uses the same number of data copies for every data set with $n$ elements.⁴⁷

■
Quick Check 7.9
1. Show the complete details for a merge sort of the set {h, d, a, c, g, f, b, e}.

The next lemma will be used many times in this section. The proof was the content of Exercise 10 in Exercises 4.2.3, on page 178.

LEMMA 7.1
Let $x, y, z \in \mathbb{R}$ with $x, y, z > 0$ and $y \neq 1$. Then

$$x^{\log_y(z)} = z^{\log_y(x)}.$$
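The text leaves the proof of Lemma 7.1 to the cited exercise. One possible line of reasoning, offered only as a sketch, rewrites $x$ as a power of $y$ and then swaps the order of the two exponents:

```latex
\begin{aligned}
x^{\log_y(z)}
  &= \left(y^{\log_y(x)}\right)^{\log_y(z)}  && \text{since } x = y^{\log_y(x)} \\
  &= y^{\log_y(x)\,\log_y(z)}                && \text{multiply exponents} \\
  &= \left(y^{\log_y(z)}\right)^{\log_y(x)}  && \text{regroup the product} \\
  &= z^{\log_y(x)}                           && \text{since } z = y^{\log_y(z)}
\end{aligned}
```

The assumptions $x, y, z > 0$ and $y \neq 1$ are exactly what is needed for each logarithm in the chain to be defined.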
One more example will be presented before looking at some theorems. The notation will look nastier but is not conceptually harder than the two previous examples.

⁴⁷ This has only been proved here when $n = 2^k$.
Persian Rugs
Section 7.1.4 introduces a recursive algorithm for drawing Persian rug designs. The key step is the algorithm colorSquare. In that algorithm, a check for the base condition is made, then a new color is calculated, the central "plus sign" is painted, and then the remaining unpainted pixels are divided into four equal-sized groups and recursively painted.

The actual code that implements this algorithm assumes that there are $(2^k + 1)^2$ pixels in the entire rug. The outer border (containing $2(2^k + 1) + 2(2^k - 1) = 2^{k+2}$ pixels) is painted before the recursion starts, so the recursive algorithm starts with a grid containing $n = (2^k - 1)^2$ pixels.

Assume, for this example, that painting pixels is the critical operation (so the comparisons and color calculations can be ignored). An efficient way to paint the pixels in the central "plus sign" is to paint the horizontal line and then paint the vertical line. The central pixel will be painted twice, but this is faster (and simpler) than adding tests to see if the next pixel to paint is the central pixel. The plus sign will therefore require $2(2^k - 1) = 2\sqrt{n}$ pixels to be painted, but only $2^{k+1} - 3$ pixels can be removed from further consideration. This leaves

$$(2^k - 1)^2 - 2^{k+1} + 3 = 2^{2k} - 2^{k+2} + 4 = 4\left(2^{2k-2} - 2^k + 1\right) = 4\left(2^{k-1} - 1\right)^2$$

pixels to color via the four recursive invocations. Each invocation will therefore paint a grid with $(2^{k-1} - 1)^2 = \frac{(\sqrt{n} - 1)^2}{4}$ pixels.

The function $f$, which counts the number of pixels that are painted after the border has been painted, is therefore
* $f(n) = 4f\left(\frac{(\sqrt{n} - 1)^2}{4}\right) + 2\sqrt{n}$
* $f(1) = 1$

The back substitution for this recurrence relation is a bit messy. It might be worth making a simplifying assumption and then seeing how the simpler (but incorrect) recursion behaves. If the new recurrence relation does not deviate too much from the correct recurrence relation, perhaps they will have the same big-Θ reference functions. The new (but incorrect) recurrence relation is
* $f(n) = 4f\left(\frac{n}{4}\right) + 2\sqrt{n}$
* $f(1) = 1$

The justification is that $(\sqrt{n} - 1)^2 \approx n$ when $n$ is large. When $n$ is small, the approximation is not the best. For example, when $k = 3$, $n = (2^k - 1)^2 = 49$, but $(\sqrt{n} - 1)^2 = 36$.

Ignoring this discrepancy for now, and assuming that $n = (2^k)^2 = 4^k$, the back substitution is presented next. Notice that when $n = 4$, there would be no central plus sign to paint. It makes sense for this approximation to make the base condition $f(4) = 4$.

$$\begin{aligned} f(n) &= 4f\left(\frac{n}{4}\right) + 2\sqrt{n} \\ &= 4\left[4f\left(\frac{n}{4^2}\right) + 2\sqrt{\frac{n}{4}}\right] + 2\sqrt{n} && \text{substitute} \\ &= 4^2 f\left(\frac{n}{4^2}\right) + \left(2^2\sqrt{n} + 2\sqrt{n}\right) && \text{simplify} \\ &= 4^2\left[4f\left(\frac{n}{4^3}\right) + 2\sqrt{\frac{n}{4^2}}\right] + \left(2^2\sqrt{n} + 2\sqrt{n}\right) && \text{substitute} \\ &= 4^3 f\left(\frac{n}{4^3}\right) + \left(2^3\sqrt{n} + 2^2\sqrt{n} + 2\sqrt{n}\right) && \text{simplify} \\ &= 4^3\left[4f\left(\frac{n}{4^4}\right) + 2\sqrt{\frac{n}{4^3}}\right] + \left(2^3\sqrt{n} + 2^2\sqrt{n} + 2\sqrt{n}\right) && \text{substitute} \\ &= 4^4 f\left(\frac{n}{4^4}\right) + \left(2^4\sqrt{n} + 2^3\sqrt{n} + 2^2\sqrt{n} + 2\sqrt{n}\right) && \text{simplify} \end{aligned}$$

Continuing the back substitution eventually leads to

$$\begin{aligned} f(n) &= 4^{k-1} f(4) + 2\sqrt{n} \sum_{i=0}^{k-2} 2^i \\ &= 4^k + 2\sqrt{n} \cdot \frac{2^{k-1} - 1}{2 - 1} \\ &= n + \sqrt{n} \cdot \left(2^k - 2\right) \\ &= n + \sqrt{n} \cdot \left(\sqrt{n} - 2\right) \\ &= 2n - 2\sqrt{n}. \end{aligned}$$
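How far does the simplified recurrence stray from the exact one? A small numerical experiment can suggest an answer. The Python sketch below is an illustration, not part of the original text. It assumes the exact recurrence is $f(n) = 4f\left(\frac{(\sqrt{n} - 1)^2}{4}\right) + 2\sqrt{n}$ with $f(1) = 1$, and it indexes the grids by $k$, where $n = (2^k - 1)^2$, so that all arithmetic stays in integers.

```python
def painted_pixels(k):
    """Pixels painted on a (2^k - 1) x (2^k - 1) grid under the assumed exact recurrence.

    With n = (2^k - 1)^2, the plus sign costs 2 * sqrt(n) = 2 * (2^k - 1) paintings,
    and each of the four subgrids has side 2^(k-1) - 1.
    """
    if k == 1:
        return 1  # a 1 x 1 grid: a single pixel
    return 4 * painted_pixels(k - 1) + 2 * (2 ** k - 1)


if __name__ == "__main__":
    for k in range(1, 9):
        n = (2 ** k - 1) ** 2
        # Columns: k, grid size n, paintings, and paintings per pixel.
        print(k, n, painted_pixels(k), painted_pixels(k) / n)
```

In this experiment the ratio of paintings to pixels stays between 1 and 2, which is consistent with linear growth in $n$; it is numerical evidence only, not a proof.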
The net result is that the number of pixel paintings is in $\Theta(n)$ (if the approximation $(\sqrt{n} - 1)^2 \approx n$ is legitimate). This should make intuitive sense: Except for a few double paintings with the plus signs, every pixel is painted only once. Thus, $\Theta(n)$ behavior should be expected. Exercise 7 in Exercises 7.3.1 asks you to perform the back substitution with the correct recurrence relation.

■

The examples indicate that back substitution might be useful for analyzing the worst case behavior of recursive algorithms that split the data into a number of equal-sized subsets. However, those examples made some assumptions about $n$. It is time to assume less. It is also time to start demonstrating that the behavior in these examples fits into some general patterns.

The examples in this section have demonstrated a method for analyzing the computational complexity of recursive divide-and-conquer algorithms. The complexity of many recursive algorithms can be found by noticing that the associated recurrence relation fits a pattern that has been captured in an established theorem. There are a number of similar theorems that have been developed that all tend to be named the master theorem. The first appeared in [45] in 1980. Two simple versions of the theorem will be proved here. A few reasonable extensions will then be briefly mentioned. The next definition will be used in the theorems.

DEFINITION 7.7 Nondecreasing Function
A real-valued function $f$ is said to be nondecreasing on an interval if for all $x, y$ in the interval with $x < y$, $f(x) \le f(y)$.

The first theorem is patterned after the recursion, $f(n) = f\left(\frac{n}{2}\right) + 2$, that arose in Example 7.35. Three of the coefficients have been turned into more general constants. Notice the strong assumption that the problem data set has been subdivided into $a$ subsets of size $\frac{n}{b}$. The recurrence relation also asserts that there are $c$ extra critical operations during each invocation of the algorithm.
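The Master Theorem - Version 1
Let $a, b \in \mathbb{N}$ and $c, d \in \mathbb{R}$, with $a \ge 1$, $b > 1$, $c > 0$, and $d > 0$. Let $f(n)$ be a nondecreasing function on the interval $(0, \infty)$, where
* $f(n) = a f\left(\frac{n}{b}\right) + c$
* $f(1) = d$
* $f(n) = 0$ if $n < 1$

Then

$$f \in \begin{cases} \Theta\left(n^{\log_b(a)}\right) & \text{if } a > 1 \\ \Theta\left(\log_b(n)\right) & \text{if } a = 1. \end{cases}$$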
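To get a feel for the theorem, the recurrence can be evaluated directly for sample constants. The following Python sketch is illustrative only (the function names are made up): `f` evaluates the recurrence with exact rational arguments, and `closed_form` is the expression $d a^k + c \sum_{i=0}^{k-1} a^i$ obtained by back substitution when $n = b^k$.

```python
from fractions import Fraction


def f(n, a, b, c, d):
    """Evaluate f(n) = a f(n/b) + c with f(1) = d and f(n) = 0 for n < 1."""
    n = Fraction(n)  # exact division, so powers of b reach 1 exactly
    if n < 1:
        return 0
    if n == 1:
        return d
    return a * f(n / b, a, b, c, d) + c


def closed_form(k, a, c, d):
    """Value at n = b^k found by back substitution: d a^k + c (1 + a + ... + a^(k-1))."""
    return d * a ** k + c * sum(a ** i for i in range(k))


if __name__ == "__main__":
    # a = 1, b = 2, c = 2, d = 2 is the recursive binary search recurrence.
    print([f(2 ** k, 1, 2, 2, 2) for k in range(5)])
    # a = 2 doubles the work at each level, so the values grow like n = n^(log_2 2).
    print([f(2 ** k, 2, 2, 1, 1) for k in range(5)])
```

The first list grows by a constant 2 per doubling of $n$ (logarithmic growth), while the second roughly doubles each time (linear growth), matching the two cases of the theorem.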
Three comments are in order before proceeding to the proof. First, the assumption that $f$ is nondecreasing is not too limiting. We expect algorithms to take longer (or use more memory, or perform more calculations) as $n$ increases. The nondecreasing assumption is used instead of "strictly increasing" because some algorithms might take the same amount of time for 19 items of data as they do for 18 items. The second comment is that the base of the logarithms in the big-Θ reference functions is not important as long as you are consistent (see Exercise 12 on page 178). The third comment is that the assertion $f(n) = 0$ if $n < 1$ is necessary so that the recursion has a complete set of base cases. This condition is important whenever $n$ is not a power of $b$.

The proof will use results from Problems 2, 3, and 4 in Exercises 7.3.1. The proof uses cases and many algebraic manipulations, but each step is fairly simple.
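Proof of Theorem 7.8: The theorem will be proved using two cases.

Case 1: $n = b^k$. Suppose there is a nonnegative integer, $k$, such that $n = b^k$. Using back substitution, it is possible (Exercise 2) to show that
$$f(n) = d a^k + c \sum_{i=0}^{k-1} a^i.$$
If $a = 1$, then $a^k = 1$ and $\sum_{i=0}^{k-1} a^i = k$; otherwise $\sum_{i=0}^{k-1} a^i = \frac{a^k - 1}{a - 1}$. Thus,
$$f(n) = \begin{cases} d a^k + c\,\dfrac{a^k - 1}{a - 1} & \text{if } a > 1 \\ d + ck & \text{if } a = 1. \end{cases}$$
The expression $d a^k + c\,\frac{a^k - 1}{a - 1}$ can be written as $\frac{c + d(a-1)}{a-1}\, a^k - \frac{c}{a-1}$. Let $\alpha = \frac{c + d(a-1)}{a-1}$ and $\beta = \frac{c}{a-1}$. Note that $\alpha > 0$ and $\beta > 0$.

The expression for $f(n)$ when $a > 1$ can be written as $\alpha a^k - \beta$. Since $n = b^k$ [and so $k = \log_b(n)$], $\alpha a^k - \beta = \alpha a^{\log_b(n)} - \beta$. Lemma 7.1 implies that this can be written as $\alpha n^{\log_b(a)} - \beta$. If $a = 1$, then $d + ck = d + c \log_b(n)$. These observations can be combined as
$$f(n) = \begin{cases} \alpha n^{\log_b(a)} - \beta & \text{if } a > 1 \\ d + c \log_b(n) & \text{if } a = 1. \end{cases}$$
Consequently,
$$f \in \begin{cases} \Theta\left(n^{\log_b(a)}\right) & \text{if } a > 1 \\ \Theta\left(\log_b(n)\right) & \text{if } a = 1. \end{cases}$$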
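The closed form from Case 1 can be checked numerically. The following is a minimal sketch, assuming integer values for $a$, $b$, $c$, $d$; the function names `f_recursive` and `f_closed` are illustrative and do not come from the text. It evaluates the recurrence of Theorem 7.8 directly and compares it with $\alpha a^k - \beta$ (or $d + ck$ when $a = 1$) at powers of $b$.

```python
from fractions import Fraction


def f_recursive(n, a, b, c, d):
    """Evaluate f(n) = a*f(n/b) + c with f(1) = d and f(n) = 0 for n < 1."""
    n = Fraction(n)
    if n < 1:
        return Fraction(0)
    if n == 1:
        return Fraction(d)
    return a * f_recursive(n / b, a, b, c, d) + c


def f_closed(k, a, c, d):
    """Closed form from Case 1 of the proof, valid when n = b**k."""
    if a == 1:
        return Fraction(d) + Fraction(c) * k
    alpha = Fraction(c + d * (a - 1), a - 1)
    beta = Fraction(c, a - 1)
    return alpha * a**k - beta


# Compare the two on a few hypothetical parameter choices (a, b, c, d).
for a, b, c, d in [(2, 2, 3, 1), (1, 2, 2, 5), (3, 4, 1, 0)]:
    for k in range(6):
        assert f_recursive(b**k, a, b, c, d) == f_closed(k, a, c, d)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.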
7.3 Big-Θ and Recursive Algorithms: The Master Theorem
365
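Case 2: $n \ne b^k$. Suppose that $n$ is not a power of $b$. Then there must exist a nonnegative integer, $k$, such that $b^k < n < b^{k+1}$. One immediate observation is that $k < \log_b(n) < k + 1$. The proof will proceed by examining two subcases: $a = 1$ and $a > 1$. The definitions of $\alpha$ and $\beta$ from case 1 will be retained. Other results from the analysis in case 1 will also be used. Both subcases use the assumption that $f$ is nondecreasing.

$a = 1$: Let $m = b^k$. Then $k = \log_b(m)$ and $m < n < bm$. Since $f$ is nondecreasing,
$$f(n) \le f(bm) = d + c \log_b(bm) = (d + c) + c \log_b(m) < (d + c) + c \log_b(n).$$
Exercise 3 implies that $f \in O(\log_b(n))$. Similarly,
$$f(n) \ge f(m) = d + c \log_b(m) = (d - c) + c \log_b(bm) > (d - c) + c \log_b(n).$$
Exercise 4 implies that $f \in \Omega(\log_b(n))$. Combining the two assertions, it is valid to conclude that $f \in \Theta(\log_b(n))$.

$a > 1$: Let $m = b^k$. Then $m < n < bm$. [...]

THEOREM 7.9 The Master Theorem-Version 2
Let $a, b \in \mathbb{N}$ and $c, d \in \mathbb{R}$, with $a \ge 1$, $b > 1$, $c > 0$, $d \ge 0$, and $v \ge 0$. Let $f(n)$ be a nondecreasing function on the interval $(0, \infty)$, where
• $f(n) = a f\left(\frac{n}{b}\right) + c n^v$
• $f(1) = d$
• $f(n) = 0$ if $n < 1$
Then
$$f \in \begin{cases} \Theta\left(n^{\log_b(a)}\right) & \text{if } a > b^v \ (\log_b(a) > v) \\ \Theta\left(n^{\log_b(a)} \log_b(n)\right) & \text{if } a = b^v \ (\log_b(a) = v) \\ \Theta\left(n^v\right) & \text{if } a < b^v \ (\log_b(a) < v). \end{cases}$$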
Notice that when $v = 0$, this theorem gives the same reference functions as Theorem 7.8. The three cases have a simple intuitive explanation. The critical question is, "Which term is more significant, $a f\left(\frac{n}{b}\right)$ or $c n^v$?" As will be seen in the proof, this answer is determined by the relative sizes of $a$ and $b^v$.
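The comparison between $a$ and $b^v$ is mechanical enough to sketch in code. The helper below is illustrative only (the name `master_case` and its return convention are assumptions, not part of the text): it returns the exponent of $n$ in the reference function from Theorem 7.9, together with a flag saying whether a factor of $\log_b(n)$ is present.

```python
import math


def master_case(a, b, v):
    """Classify f(n) = a*f(n/b) + c*n**v using the three cases of Theorem 7.9.

    Returns (exponent, has_log_factor): the reference function is
    n**exponent, multiplied by log_b(n) when has_log_factor is True.
    """
    bv = b ** v
    if math.isclose(a, bv):
        # a = b**v, so log_b(a) = v: reference function n**v * log_b(n)
        return (float(v), True)
    if a > bv:
        # a > b**v: reference function n**log_b(a)
        return (math.log(a, b), False)
    # a < b**v: reference function n**v
    return (float(v), False)


print(master_case(3, 2, 1))   # a > b**v, exponent is log_2(3)
print(master_case(8, 2, 3))   # a = b**v, n**3 * log_2(n)
print(master_case(2, 4, 1))   # a < b**v, n**1
print(master_case(1, 2, 0))   # v = 0 and a = 1, the log_b(n) case of Theorem 7.8
```

The tolerance-based comparison is a design choice for floating-point values of $v$; with integer inputs an exact comparison would also work.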
Proof of Theorem 7.9: The theorem will be proved using the same two cases as were used to prove Theorem 7.8. The outline of the proof is similar to the outline of the proof of Theorem 7.8. However, the algebraic details are messier and the proof is a bit longer. Nevertheless, each step is reasonably simple, so the proof is not difficult to read.

Case 1: $n = b^k$. Suppose there is a nonnegative integer, $k$, such that $n = b^k$. Using back substitution, it is possible (Exercise 6) to show that
$$f(n) = d a^k + c n^v \sum_{i=0}^{k-1} \left(\frac{a}{b^v}\right)^i.$$
Recall that $k = \log_b(n)$ for this case.

If $a = b^v$, then $\frac{a}{b^v} = 1$ and $\sum_{i=0}^{k-1} \left(\frac{a}{b^v}\right)^i = k$; otherwise
$$\sum_{i=0}^{k-1} \left(\frac{a}{b^v}\right)^i = \frac{\left(\frac{a}{b^v}\right)^k - 1}{\frac{a}{b^v} - 1}.$$
But $(b^v)^k = b^{vk} = b^{v \log_b(n)} = b^{\log_b(n^v)} = n^v$. Thus,
$$c n^v\, \frac{\left(\frac{a}{b^v}\right)^k - 1}{\frac{a}{b^v} - 1} = c n^v \left(\frac{a^k}{n^v} - 1\right) \frac{b^v}{a - b^v} = \frac{c\, b^v}{a - b^v} \left(a^k - n^v\right).$$
Set $\alpha = d + \frac{c\, b^v}{a - b^v}$ and $\beta = \frac{c\, b^v}{a - b^v}$. Note that $\alpha > 0$ and $\beta > 0$ when $a > b^v$. Also $\beta < 0$ when $a < b^v$. The previous observations show that $f(n)$ can be expressed as
$$f(n) = \begin{cases} \alpha a^k - \beta n^v & \text{if } a \ne b^v \\ d a^k + c n^v k & \text{if } a = b^v. \end{cases}$$
Since $a^k = n^{\log_b(a)}$ (using Lemma 7.1), this can also be written as
$$f(n) = \begin{cases} \alpha n^{\log_b(a)} - \beta n^v & \text{if } a \ne b^v \\ d n^{\log_b(a)} + c n^v \log_b(n) & \text{if } a = b^v. \end{cases}$$
The three cases in the theorem statement will now be examined.

$a = b^v$: When $a = b^v$, $v = \log_b(a)$. Thus, $f(n) = d n^{\log_b(a)} + c n^{\log_b(a)} \log_b(n)$. Consequently,
$$f(n) = d n^{\log_b(a)} + c n^{\log_b(a)} \log_b(n) \in \Theta\left(n^{\log_b(a)} \log_b(n)\right).$$

$a > b^v$: In this case, $\alpha > 0$ and $\beta > 0$, and so
$$f(n) = \alpha n^{\log_b(a)} - \beta n^v \in \Theta\left(n^{\log_b(a)}\right)$$
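since $\log_b(a) > v$.

$a < b^v$: In this case, $\beta < 0$, and thus $f(n) = \alpha n^{\log_b(a)} - \beta n^v = \alpha n^{\log_b(a)} + |\beta| n^v \in \Theta(n^v)$ since $\log_b(a) < v$.

Case 2: $n \ne b^k$. Suppose that $n$ is not a power of $b$. Then there must exist a nonnegative integer, $k$, such that $b^k < n < b^{k+1}$. One immediate observation is that $k < \log_b(n) < k + 1$. The proof will proceed by examining two subcases: $a = b^v$ and $a \ne b^v$. The definitions of $\alpha$ and $\beta$ from case 1 will be retained. Other results from the analysis in case 1 will also be used. Both subcases use the assumption that $f$ is nondecreasing.

$a = b^v$: Note that $v = \log_b(a)$ in this case. Let $m = b^k$. Then since $m < n < bm$, $\log_b(m) < \log_b(n)$ and $m^{\log_b(a)} < n^{\log_b(a)}$. Consequently,
$$\begin{aligned} f(n) \le f(bm) &= d\,(bm)^{\log_b(a)} + c\,(bm)^{\log_b(a)} \log_b(bm) \\ &= a d\, m^{\log_b(a)} + a c\, m^{\log_b(a)} \left(\log_b(m) + 1\right) \\ &< a d\, n^{\log_b(a)} + a c\, n^{\log_b(a)} \left(\log_b(n) + 1\right) \\ &= a(c + d)\, n^{\log_b(a)} + a c\, n^{\log_b(a)} \log_b(n). \end{aligned}$$
Exercise 3 implies that $f \in O\left(n^{\log_b(a)} \log_b(n)\right)$.

Now let $M = b^{k+1}$ and $\delta = M - n$. Then
$$\begin{aligned} f(n) \ge f\left(\tfrac{M}{b}\right) = f(b^k) &= d a^k + c\, a^k k \\ &= \left(\tfrac{d}{a}\right) M^{\log_b(a)} + \left(\tfrac{c}{a}\right) M^{\log_b(a)} \left(\log_b(M) - 1\right) \\ &= \left(\tfrac{d - c}{a}\right) M^{\log_b(a)} + \left(\tfrac{c}{a}\right) M^{\log_b(a)} \log_b(M) \\ &> \left(\tfrac{d - c}{a}\right) (n + \delta)^{\log_b(a)} + \left(\tfrac{c}{a}\right) n^{\log_b(a)} \log_b(n). \end{aligned}$$
The factor $(n + \delta)^{\log_b(a)}$ is dominated⁴⁸ by $n^{\log_b(a)}$, which does not grow as fast as $n^{\log_b(a)} \log_b(n)$. Exercise 4 implies that $f \in \Omega\left(n^{\log_b(a)} \log_b(n)\right)$. Combining the two assertions, $f \in O\left(n^{\log_b(a)} \log_b(n)\right)$ and $f \in \Omega\left(n^{\log_b(a)} \log_b(n)\right)$, it is valid to conclude that $f \in \Theta\left(n^{\log_b(a)} \log_b(n)\right)$.

⁴⁸ This can be made precise by using Newton's binomial theorem on page 378.

$a \ne b^v$: Let $m = b^k$. Then
$$f(n) \le f(bm) = f(b^{k+1}) = \alpha a^{k+1} - \beta (bm)^v = \alpha a\, a^{\log_b(m)} - \beta b^v m^v = \alpha a\, m^{\log_b(a)} - \beta b^v m^v.$$
If $a > b^v$, then $\alpha > 0$ and $\beta > 0$ and $\log_b(a) > v$, so
$$f(n) \le \alpha a\, m^{\log_b(a)} - \beta b^v m^v < \alpha a\, n^{\log_b(a)}.$$
Exercise 3 implies that $f \in O\left(n^{\log_b(a)}\right)$.

If $a < b^v$, then $\beta < 0$ and $\log_b(a) < v$. Let $n - m = \gamma$. Then
$$f(n) \le \alpha a\, m^{\log_b(a)} + |\beta| b^v m^v = \alpha a\, (n - \gamma)^{\log_b(a)} + |\beta| b^v m^v < \alpha a\, (n - \gamma)^{\log_b(a)} + |\beta| b^v n^v.$$
Since $(n - \gamma)^{\log_b(a)} \in O\left(n^{\log_b(a)}\right)$ and $\log_b(a) < v$, Exercise 3 implies that $f \in O(n^v)$.

Now let $M = b^{k+1}$. Then
$$f(n) \ge f\left(\tfrac{M}{b}\right) = f(b^k) = \alpha a^k - \beta (b^k)^v = \left(\tfrac{\alpha}{a}\right) a^{\log_b(M)} - \left(\tfrac{\beta}{b^v}\right) M^v = \left(\tfrac{\alpha}{a}\right) M^{\log_b(a)} - \left(\tfrac{\beta}{b^v}\right) M^v.$$
If $a > b^v$, then $\alpha > 0$ and $\beta > 0$ and $\log_b(a) > v$. Let $M - n = \delta$. Then
$$f(n) \ge \left(\tfrac{\alpha}{a}\right) M^{\log_b(a)} - \left(\tfrac{\beta}{b^v}\right) M^v > \left(\tfrac{\alpha}{a}\right) n^{\log_b(a)} - \left(\tfrac{\beta}{b^v}\right) (n + \delta)^v.$$
The factor $(n + \delta)^v$ is in $O(n^v)$. Since $\log_b(a) > v$, $\left(\tfrac{\alpha}{a}\right) n^{\log_b(a)} - \left(\tfrac{\beta}{b^v}\right)(n + \delta)^v$ is in $\Omega\left(n^{\log_b(a)}\right)$. Therefore, Exercise 4 implies that $f \in \Omega\left(n^{\log_b(a)}\right)$.

If $a < b^v$, then $\beta < 0$ and $\log_b(a) < v$. Thus,
$$\begin{aligned} f(n) &\ge \left(\tfrac{\alpha}{a}\right) M^{\log_b(a)} - \left(\tfrac{\beta}{b^v}\right) M^v = \left(\tfrac{\alpha}{a}\right) M^{\log_b(a)} + \left(\tfrac{|\beta|}{b^v}\right) M^v \\ &= \left(\tfrac{\alpha}{a}\right) (n + \delta)^{\log_b(a)} + \left(\tfrac{|\beta|}{b^v}\right) M^v > \left(\tfrac{\alpha}{a}\right) (n + \delta)^{\log_b(a)} + \left(\tfrac{|\beta|}{b^v}\right) n^v. \end{aligned}$$
Since $\left(\tfrac{\alpha}{a}\right)(n + \delta)^{\log_b(a)} \in \Omega\left(n^{\log_b(a)}\right)$, Exercise 4 implies that $f \in \Omega(n^v)$.

Thus, when $a > b^v$, $f \in O\left(n^{\log_b(a)}\right)$ and $f \in \Omega\left(n^{\log_b(a)}\right)$. Consequently, $f \in \Theta\left(n^{\log_b(a)}\right)$. Also, when $a < b^v$, $f \in O(n^v)$ and $f \in \Omega(n^v)$. Consequently, $f \in \Theta(n^v)$. □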
Quick Check 7.11
In each case, use Theorem 7.9 to find a good big-$\Theta$ reference function.
1. For the complexity function, $f(n) = 2f\left(\frac{n}{2}\right) + n$, $f(1) = 0$ for mergesort.
2. For the approximate complexity function, $f(n) = 4f\left(\frac{n}{4}\right) + 2n^{1/2}$, $f(1) = 1$, for the Persian rugs algorithm.

There are many recursive algorithms for which the complexity function does not fit the description in Theorem 7.9. A few simple extensions seem desirable. One possible extension is to change the term $cn^v$ to a more general term, $s(n)$. Another extension is to be more realistic about the reduction, $\frac{n}{b}$. The recBinarySearch algorithm can handle lists with lengths that are not powers of 2. When the list is split, the sublists may have lengths $\left\lfloor \frac{n}{2} \right\rfloor$ and $\left\lceil \frac{n}{2} \right\rceil$ (see Exercise 1). The recursion of interest would involve either
$$f(n) = f\left(\left\lfloor \tfrac{n}{2} \right\rfloor\right) + 2 \quad \text{or} \quad f(n) = f\left(\left\lceil \tfrac{n}{2} \right\rceil\right) + 2.$$
An algorithm like mergeSort might involve a recursion that looks like
$$f(n) = f\left(\left\lfloor \tfrac{n}{2} \right\rfloor\right) + f\left(\left\lceil \tfrac{n}{2} \right\rceil\right) + n.$$
Another reasonable extension is to broaden the base condition. Instead of $f(1) = d$, it might become $f(n) = d$ for $1 \le n \le n_0$.
7.3.1 Exercises The exercises marked with O-Dhave detailed solutions in Appendix G. 1. Let n E N. Prove that
4 (d) f(1) = 5 and f (n) = 3f (L) + 14n 11. Prove Proposition 7.1.
PROPOSITION 7.1 1. Then
Let n c N and r G R with r >0 and r
.
n
[n +
n
2. Let f be a function that satisfies the hypotheses of Theorem 7.8. Use back substitution to show that
nrn+l
r(rn - 1)
r- I
(r - 1)2
Y_ ir i=0
k-1
12. Prove the following version of the master theorem. (Hint: You will need Proposition 7.1.)
f (n) = dak + c Z ai i=0 whenever n = bk for some k e N. 3. D Let f, g, and h be real-valued functions. If f(n) < h(n), for all n > no and if h e 0(g), prove f e 0(g). 4. Let f, g, and h be real-valued functions. If f(n) > h(n), for all n > no and if h e 2 (g), prove f E Q (g). 5. P Let x, y, z be positive real numbers with y # 1. Prove (xl0gY(z))
. 0 x Llog.(z)J
.
••
The Master Theorem-Version 3
Let a, b e N and c, d e llR,with a > 1, b > 1, c > by 0, and d > 0. Let f(n) be defined on the interval [1, oc) af(nta 0f d ( + cl1gb(n) *f ( If n = bk, for some k E N, then
f E Ie
6. Let f be a function that satisfies the hypotheses of Theorem 7.9. Use back substitution to show that
(nlogb(a)
.
logb(n))
([lOgb(n)]2)
if a > 1 if a = 1.
k-I1 f (n) = dak + cnv I: (O1.
( i=f+ whenever n = bk for some k 2e N. 2),use back substitution to solve 7. Assuming that n = ( 2 k the recurrencerelation
Sf(n)= 4f (
2(n - 1)anIzn- 2 +
1
z
n=1
Using the identity $a_0 = 1$ and a change of summation index, this can be written as
$$A(z) = z^2 \sum_{k=0}^{\infty} k a_k z^{k-1} + \frac{1}{1 - z}.$$
Consequently, Definition 7.9 implies that $A(z)$ must satisfy the differential equation
$$A(z) = z^2 A'(z) + \frac{1}{1 - z} \quad \text{with } A(0) = 1.$$
Solving this differential equation (and converting the solution to a power series) is not trivial (see [34] for an approach using hypergeometric series). The result, which involves the permutation $P(n - 1, r)$, is equivalent to (via Exercise 4 in Exercises 7.4.1)
$$A(z) = 1 + \sum_{n=1}^{\infty} \left[ \sum_{r=0}^{n-1} P(n - 1, r) \right] z^n,$$
so
$$a_n = \sum_{r=0}^{n-1} P(n - 1, r) \quad \text{for } n \ge 1.$$
This problem and its solution should remind you of Example 7.20 on page 336. ■
One additional theorem is very useful in this context. The theorem is a generalization, first created by Isaac Newton, of the binomial theorem and was used by him to great effect. The modern notation differs from Newton's original notation.⁵² Before the theorem is presented, a definition is required. Recall that the normal binomial coefficient $\binom{n}{r}$ contains a product of $r$ integers in the numerator (assuming $r > 0$):
$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!} = \frac{n(n-1)(n-2)\cdots(n-(r-1))}{r!} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!}$$
and $\binom{n}{0} = \frac{n!}{0!\,(n-0)!} = 1$. We can extend this by keeping the form, but allowing $n$ to be any real number.

DEFINITION 7.10 Generalized Binomial Coefficients
Let $u$ be any real number and let $r$ be a nonnegative integer. Then the generalized binomial coefficient, $u$ choose $r$, is defined by
$$\binom{u}{r} = \begin{cases} \dfrac{u(u-1)(u-2)\cdots(u-r+1)}{r!} & \text{if } r > 0 \\ 1 & \text{if } r = 0. \end{cases}$$
Note that if $u = 0$ and $r > 0$, then $\binom{u}{r} = 0$. In addition, if $u$ is a negative integer, for example $u = -n$ for $n$ a positive integer, and $r > 0$, then
$$\begin{aligned} \binom{-n}{r} &= \frac{(-n)(-n-1)(-n-2)\cdots(-n-r+1)}{r!} \\ &= (-1)^r\, \frac{n(n+1)(n+2)\cdots(n+r-1)}{r!} \\ &= (-1)^r\, \frac{(n+r-1)!}{r!\,(n-1)!} = (-1)^r \binom{n+r-1}{r} \end{aligned}$$
(an ordinary binomial coefficient).
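Definition 7.10 translates directly into a short computation. The sketch below is illustrative (the function name `gen_binom` is an assumption, not from the text); it uses exact rational arithmetic and checks the negative-integer identity derived above against an ordinary binomial coefficient.

```python
from fractions import Fraction
from math import comb


def gen_binom(u, r):
    """Generalized binomial coefficient 'u choose r' for a nonnegative integer r."""
    result = Fraction(1)
    for i in range(r):
        # multiply by (u - i) and divide by (i + 1), building u(u-1)...(u-r+1)/r!
        result = result * (Fraction(u) - i) / (i + 1)
    return result


# For u = -n the value is (-1)**r times an ordinary binomial coefficient.
for n in range(1, 6):
    for r in range(0, 6):
        assert gen_binom(-n, r) == (-1) ** r * comb(n + r - 1, r)

print(gen_binom(Fraction(1, 2), 2))   # (1/2)(-1/2)/2! = -1/8
```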
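Newton's Binomial Theorem
Let $w$ and $z$ be real or complex numbers with $\left|\frac{z}{w}\right| < 1$. Then for any real number, $u$,
$$(w + z)^u = \sum_{k=0}^{\infty} \binom{u}{k} w^{u-k} z^k.$$
When $w = 1$, this reduces to
$$(1 + z)^u = \sum_{k=0}^{\infty} \binom{u}{k} z^k \quad \text{for } |z| < 1.$$

[...]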
Let $A(z) = \sum_{k=0}^{\infty} a_k z^k$ be the generating function for the sequence generated by the recurrence relation. The same initial step will convert the recurrence relation into an equation involving generating functions: Multiply both sides by $z^n$ and sum over all appropriate values of $n$. In this case, the $n - 1$ on the right produces a valid subscript as long as $n \ge 1$. Thus
$$\sum_{n=1}^{\infty} a_n z^n = \sum_{n=1}^{\infty} a_{n-1} z^n + \sum_{n=1}^{\infty} n z^n = z \sum_{n=1}^{\infty} a_{n-1} z^{n-1} + \sum_{n=1}^{\infty} n z^n = z \sum_{n=0}^{\infty} a_n z^n + \sum_{n=0}^{\infty} n z^n.$$
The final step results from a change of variable in the left-hand sum and because $0z^0$ can be added to the right-hand sum without changing its value. Note that $A(z) = a_0 + \sum_{n=1}^{\infty} a_n z^n$. Using Example 7.42 and converting to expressions involving $A(z)$ leads to
$$A(z) - a_0 = zA(z) + \frac{z}{(1 - z)^2}.$$
Solving for $A(z)$ produces
$$A(z) = \frac{a_0}{1 - z} + \frac{z}{(1 - z)^3}.$$
Using the identity in Example 7.45,
$$\begin{aligned} A(z) &= \sum_{n=0}^{\infty} a_0 z^n + z \sum_{n=0}^{\infty} \binom{n+2}{n} z^n \\ &= \sum_{n=0}^{\infty} a_0 z^n + \sum_{n=0}^{\infty} \frac{(n+2)(n+1)}{2}\, z^{n+1} \\ &= \sum_{n=0}^{\infty} a_0 z^n + \sum_{k=1}^{\infty} \frac{(k+1)k}{2}\, z^k \\ &= a_0 + \sum_{n=1}^{\infty} \left( a_0 + \frac{n(n+1)}{2} \right) z^n. \end{aligned}$$
Since $a_0 = -3$, this simplifies to
$$A(z) = \sum_{n=0}^{\infty} \frac{(n+3)(n-2)}{2}\, z^n.$$
This implies that
$$a_n = \frac{(n+3)(n-2)}{2} \quad \text{for } n \ge 0.$$
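The closed form can be compared with direct iteration. This is a minimal sketch that assumes the recurrence is $a_n = a_{n-1} + n$ with $a_0 = -3$, as read off from the summed equation and the value of $a_0$ used above; the function names are illustrative.

```python
def a_by_recurrence(n, a0=-3):
    """Iterate a_n = a_(n-1) + n starting from a_0 (assumed form of the recurrence)."""
    value = a0
    for i in range(1, n + 1):
        value += i
    return value


def a_closed(n):
    """Closed form (n + 3)(n - 2) / 2 obtained from the generating function."""
    # (n + 3) and (n - 2) differ by 5, so exactly one is even and the division is exact.
    return (n + 3) * (n - 2) // 2


for n in range(50):
    assert a_by_recurrence(n) == a_closed(n)
```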
VQuick Check 7.12 1. Use Newton's binomial theorem to as power seand 1 expand 1 ries.
Simplify as far as possible. 3. Use generating functions to find a closed-form formula for an if * ao = 1, al = 2
2. Use Theorem 7.11 to find the power 1--').
series expansion of (1
*
an = a,_i +
6
2 an-2 for n >2
']
Generating Functions and Counting
Generating functions can be used to solve many counting problems. The next example illustrates one such use. One new idea is all that is needed. Observe that $(1 - z)(1 + z + z^2 + \cdots + z^n) = 1 - z^{n+1}$. Thus
$$1 + z + z^2 + \cdots + z^n = \frac{1 - z^{n+1}}{1 - z}.$$
We can think of the polynomial $1 + z + z^2 + \cdots + z^n$ as the generating function for the sequence $1, 1, 1, \ldots, 1, 0, 0, \ldots$ with $n + 1$ initial 1s and 0s for all other elements of the sequence.
7.4 Generating Functions
381
Solving an Integer Equation
A homework exercise in Chapter 5 asked for the number of distinct solutions to the equation $x_1 + x_2 + x_3 + x_4 = 16$, where $x_1, x_2, x_3, x_4$ are nonnegative integers. This can be solved using a generating function approach. Consider a solution to the equation. Let the solution be $k_1 + k_2 + k_3 + k_4 = 16$. Since this equation is true, it is also true that $z^{k_1} \cdot z^{k_2} \cdot z^{k_3} \cdot z^{k_4} = z^{16}$. Now consider the product
$$\left(1 + z + z^2 + z^3 + \cdots + z^{16}\right)^4.$$
The coefficient of $z^{16}$ is obtained as a sum of products of the form $z^{k_1} \cdot z^{k_2} \cdot z^{k_3} \cdot z^{k_4}$, where $k_1 + k_2 + k_3 + k_4 = 16$. In fact, every such product contributes 1 to the sum which determines the coefficient of $z^{16}$. There is therefore a one-to-one correspondence between solutions to the equation $x_1 + x_2 + x_3 + x_4 = 16$ and terms of the form $z^{k_1} \cdot z^{k_2} \cdot z^{k_3} \cdot z^{k_4}$ with $k_1 + k_2 + k_3 + k_4 = 16$ in the expansion of $(1 + z + z^2 + z^3 + \cdots + z^{16})^4$. Consequently, the number of solutions to the equation is the coefficient of $z^{16}$ in the product $(1 + z + z^2 + z^3 + \cdots + z^{16})^4$. With the help of Mathematica, the product expands to
$$1 + 4z + 10z^2 + 20z^3 + 35z^4 + 56z^5 + 84z^6 + 120z^7 + 165z^8 + 220z^9 + 286z^{10} + 364z^{11} + 455z^{12} + 560z^{13} + 680z^{14} + 816z^{15} + 969z^{16} + \text{other terms (up to } z^{64}\text{)}.$$
The number of solutions is therefore 969. As a bonus, we also now know that there are 816 solutions to the equation $x_1 + x_2 + x_3 + x_4 = 15$, and 680 solutions to the equation $x_1 + x_2 + x_3 + x_4 = 14$, etc. This expansion would not tell us the number of solutions to the equation $x_1 + x_2 + x_3 + x_4 = 17$ (why not?). ■

The previous example can be solved more easily if the problem is first generalized.
Example 7.47 Revisited
Let $a_n$ represent the number of nonnegative integer solutions to the equation $x_1 + x_2 + x_3 + x_4 = n$, for $0 \le n$. Let $A(z) = \sum_{k=0}^{\infty} a_k z^k$. Example 7.47 has already established that $a_n$ is the coefficient of $z^n$ in the expansion of $(1 + z + z^2 + \cdots)^4$. Thus (using Table 7.10),
$$A(z) = \left(\frac{1}{1 - z}\right)^4 = \frac{1}{(1 - z)^4} = \sum_{n=0}^{\infty} \binom{4 + n - 1}{n} z^n = \sum_{n=0}^{\infty} \binom{n + 3}{n} z^n.$$
In particular, when $n = 16$, $a_n = \binom{19}{16} = 969$. ■
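Both versions of the example can be reproduced without a computer algebra system. The sketch below (helper names `poly_mul` and `poly_pow` are illustrative, not from the text) represents a polynomial as a list of coefficients, expands $(1 + z + \cdots + z^{16})^4$ by repeated multiplication, and compares the low-order coefficients with $\binom{n+3}{n}$.

```python
from math import comb


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of z)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_pow(p, e):
    """Raise a polynomial to a nonnegative integer power by repeated multiplication."""
    result = [1]
    for _ in range(e):
        result = poly_mul(result, p)
    return result


coeffs = poly_pow([1] * 17, 4)        # (1 + z + ... + z**16)**4
print(coeffs[14], coeffs[15], coeffs[16])

# Up to z**16 the coefficients agree with the closed form C(n + 3, n).
assert all(coeffs[n] == comb(n + 3, n) for n in range(17))
```

The list has 65 entries, matching the remark that the expansion runs up to $z^{64}$.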
The final example in this section introduces the notion of counting with inequality constraints.
Counting with Inequality Constraints
I have 12 Sacagawea dollars that I wish to distribute to my three nieces. Each niece should get at least two coins. Since Erin is much younger than Grace or May and doesn't have a job, she should receive at least four coins. In how many ways can I distribute the coins?

There are some natural upper limits on the numbers of coins each niece can receive. Erin can receive at most eight coins in order to leave at least four coins for the other two nieces. Similarly, Grace and May can each receive at most six coins in order to leave at least four coins for Erin and two coins for the other older niece.

Generating functions can be introduced by creating a polynomial in $z$ for each niece. For Erin, the natural choice is the expression $z^4 + z^5 + z^6 + z^7 + z^8$, since she can receive from four to eight coins. Grace and May will share the same expression, namely, $z^2 + z^3 + z^4 + z^5 + z^6$. The solution to the problem is the coefficient of $z^{12}$ in
$$\left(z^4 + z^5 + z^6 + z^7 + z^8\right)\left(z^2 + z^3 + z^4 + z^5 + z^6\right)\left(z^2 + z^3 + z^4 + z^5 + z^6\right).$$
The product simplifies to
$$z^8 + 3z^9 + 6z^{10} + 10z^{11} + 15z^{12} + 18z^{13} + 19z^{14} + 18z^{15} + 15z^{16} + 10z^{17} + 6z^{18} + 3z^{19} + z^{20}.$$
Thus, there are 15 ways to distribute the coins (list them). ■
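The count can be confirmed by brute force. The sketch below is one possible way to "list them": it enumerates every triple (Erin, Grace, May) that satisfies the constraints stated in the example. The function name `distributions` is an illustrative choice.

```python
def distributions(total=12):
    """List all (erin, grace, may) coin counts meeting the example's constraints."""
    ways = []
    for erin in range(4, 9):            # Erin receives 4 to 8 coins
        for grace in range(2, 7):       # Grace receives 2 to 6 coins
            may = total - erin - grace  # May receives whatever is left
            if 2 <= may <= 6:
                ways.append((erin, grace, may))
    return ways


ways = distributions()
for way in ways:
    print(way)
print(len(ways))
```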
7.4.1 Exercises The exercises marked with
P have detailed solutions in
Appendix G.
9. Use generating functions to solve the following recurrence
relations.
1. Prove Theorem 7.11.
(a) a0 = 1, and a, =
2. Use Theorem 7.11 to find the coefficients of the following generating functions. Simplify as far as possible.
(b) P ao = 1, and a, (c) a0 = 1, and an
(c)
2 k~k)
(c)Y--kZk
(zo
=
Exercise 8.) (d) a0 = 2, al = -2,
zkk) (•- kO kzk) (a) O (y-O _k _k=0
(b) (Zo
3
an-
for n > 1
3an-I + 7 for n > 1 3 an- t n for n > 1 (Hint: Use and an = -2an-
+- 15an-2 for
5kZk)n
Y-Okzk
(int:Usetere10.
k k (Hint: Use the result of Exercise 1(a) in Exercises 3.3.5.)
3. Use Theorem 7.11 to find the coefficients of the following generating functions. Simplify as far as possible. (YO-•~) 3k) (Yec 0 (a)k kk k 0 (b) (Y- 0 (k±+l)zk) (y- 0 2zk)n>2 (c)/ -^ kzk) / y•=05kzk) (Hint: Use the result of kc k) Exercise 4(c) in Exercises 7.2.5.)
Use generating functions to solve the following recurrence relations. (a) a 0 = 1, a 1 = 8, andan =7an_- - l2an_2 forn > 2 (b) (c) (c) (d)
ao = 5, and a, = !an for n > I n a0 = 3, and an = n> I -I+lIfor ao =3, and an=5an O-D a0 = 3, a, = -12, and a, = -5an-
1+
36an-
2
for
11. Use generating functions to solve the following recurrence relations. (a) ao = 8, and an = 24an-I - 144 for n > 1
4. P Use back substitution to solve the recurrence relation an = (n - l)an-1 + 1, where ao= 1. 0
(b) a0 = -1, al = -2, and an = an_1 + 20a,2 for n > 2 (c) a 0 = 1, andan = 6an_ 1 -5 forn > 1
5. Use Newton's binomial theorem to expand the following into powerpowe series. 3 (a) I (b) P (I + 3z)- 2 (c) 1+ z 6. Use Newton's binomial theorem to expand the following into power series, (a) (1 + )(c) 1 ) ( -2z)4 1(b) T)
(d) a0 = 4, aI = 20, and an
12. P If an unlimited supply of indistinguishable pennies, indistinguishable nickels, indistinguishable dimes, and indistinguishable quarters is available, how many distinct arrangements of coins can be formed whose sum is 38 cents? Use generating functions in your solution strategy. You may want to use a computer algebra system for this problem.
7. Use generating functions to provide an alternative derivation of the solution for the recurrence relation an = 4a, -1 + 4 in Example 7.19 on page 335.
13. You are about to buy an item in the vending machine that costs $0.95. You have 7 pennies, 12 nickels, and 8 dimes, where all the coins of the same type are indistinguishable.
3+1 3 2k (Hint: First show =k~ 4 "vending m=0 (Y_E"k-1 (7-, i0 3'.) 3 is
Determine the number of ways to insert the coins into the machine to make the purchase of exactly $0.95, assumning that the order in which the coins are inserted does
8. Show that Y= that
0 (k
- j)3j
_j= 0 (k - j)3j =
=
4an_1 - 4an-2 for n > 2
383
7.5 The Josephus Problem not matter. Use generating functions in your solution strategy. You may want to use a computer algebra system for this problem.
18. Use partial fractions decompositions to find generating functions for the following expressions. (a)
14. Mary made three dozen identical homemade chocolate chip cookies and is going to distribute them to four families in her
neighborhood. Each family must receive at least six cookies. The Landers family cannot receive more than seven cookies because the mother does not want her family eating too many sweets. Mary also knows that there are many children in the Johnson family, so she wants to give them an ample supply of cookies. There are seven people in this family, and Mary
(c)
P
3z 1-z-6z2
(b)
7+7z
(d)
2(1-3 Zz2)
__12
4z-1
6z I 36 Z+l-7 z2
19. Generating functions and Newton's binomial theorem can be used to derive counting formula 4 on page 222. The proof will examine all legal values for n and r simultaneously. The g 1 1- Z
wants to make sure that each family member gets at least two
cookies. In how many ways can Mary distribute the chocolate chip cookies to the four neighborhood families? Use generating functions in your solution strategy. You may want to use a computer algebra system for this problem. 15. Find a generating function for the sequence -1, 1, -1, 1, -1, ... 16. Find a generating function for the sequence, 1, 23, 33, 43, ... of cubes. (Hint: Use derivatives.) 17. Exercise 17 in Exercises 7.2.5 asked you to create a recurrence relation for counting the number of distinct bit strings of length n that do not contain three consecutive Is. A closedform solution was not produced in that exercise. It is now possible to make some progress. Find the generating function for this recurrence relation (expressed as a ratio of two polynomials in z).
5 I-Z-12Z2
represent each of the n objects in the set of distinguishable objects. The term zk will represent choosing k copies of the item. Since there are n items, the expression 1 C(z). (l)
=
-
1
(1 - z)n
=( I
+z
+
)-+
is the key to the result. (a) Complete the discussion started previously by showing that the number of ways to choose r items with repetition from a set of n distinct items is the coefficient of zr in C(z). (b) Use Newton's binomial theorem to calculate the coefficlient of zr in C(z).
7.5 The Josephus Problem Many of the problems that mathematicians and computer scientists dearly love have been around for a long time. One such problem is known as the Josephus problem,
named after the first-century Jewish historian Flavius Josephus. Josephus did not invent the problem. Instead, an event from his life served as the inspiration for the problem statement. Many current books refer to Mathematical Recreations and Essays by W. W. Rouse Ball [4, originally published in 1892] for the problem statement:

Another of these antique problems consists in placing men around a circle so that if every mth man is killed, the remainder shall be certain specified individuals. Such problems can be easily solved empirically. Hegesippusᵃ says that Josephus saved his life by such a device. According to his account, after the Romans had captured Jotapat, Josephus and forty other Jews took refuge in a cave. Josephus, much to his disgust, found that all except himself and one other man were resolved to kill themselves, so as not to fall into the hands of their conquerors. Fearing to show his opposition too openly he consented, but declared that the operation must be carried out in an orderly way, and suggested that they should arrange themselves round a circle and that every third person should be killed until all but one man was left, who must then commit suicide. It is alleged that he placed himself and the other man in the 31st and 16th place respectively.

ᵃ De Bello Judaico, bk III, chaps. 16-18

The problem (which will be addressed eventually) is quite interesting. However, the story, as quoted above, is not completely accurate. In fact, Hegesippus never existed, and there is no evidence that Josephus and his allies ever sat in a circle and killed every third person. The original event can be found in Josephus's book The Jewish War.⁵³ The Hegesippus that Ball cites was a fourth-century translation of Josephus. Some anonymous translator got the author's name wrong.⁵⁴ The story, as related by Josephus,⁵⁵ is as follows:

Josephus was a general for the Jews in a war against the Romans, who were led by Vespasian. Josephus and his troops were surrounded in the city of Jotapata. Eventually the city fell, but Vespasian ordered his troops to capture Josephus (rather than kill him). Before the city fell, Josephus and 40 others managed to hide in a cave. On the third day after the city fell, the Romans found out about the cave. Vespasian sent two men to offer Josephus safe passage if he would surrender. At first he refused, but eventually started to change his mind. His companions were not pleased when they saw he was starting to consider surrender. They told him he should kill himself instead of surrender, or, if he was not brave enough, they would take the matter into their own hands. Josephus then launched into an articulate speech about why suicide is morally wrong. His speech did not convince his allies. In fact, they were on the verge of killing him and then killing themselves. The story concludes⁵⁶ (with Josephus speaking in the third person):

But in this predicament, his resourcefulness did not forsake him. Trusting in God's protection, he hazarded his life on one last throw, saying: "As we are resolved to die, come, let us draw lots and decide the order in which we are to kill each other in turn. Whoever draws the first lot shall die by the hand of him who comes next; luck will thus take its course down the whole line. In this way we shall be spared taking our lives in our own hands. For it would be unfair when the rest were gone if one man should change his mind and escape."
This proposal inspired assurance; his advice was taken, and he drew lots with the rest. Each man in turn offered his throat for the next man to cut, in the belief that his general would immediately share his fate; they thought death together with Josephus sweeter than life. He, however-should we say by fortune or by divine providence-was left with one other man; and, anxious neither to be condemned by the lot, nor, if he were left as the last, to stain his hand with the blood of a fellow countryman, he persuaded this man also, under a pact, to remain alive. I do not know when or where the mathematical version (with the circle and every third person) originated.
Other Versions of the Problem
A version of the problem, existing in published form at least as early as the 1500s or early 1600s, involves a ship with 15 Turks and 15 Christians. A storm has arisen and in order to save some, it is decided that half the passengers need to be thrown into the sea. The passengers are placed into a circle, and every ninth man is tossed overboard. The problem is to find an arrangement so that all members of your favorite religio-ethnic group survive and all members of the other group become fish food.

An Asian variant involves a man with two wives, each of whom is the mother of 15 children. The first wife has died and the man is getting old. The surviving wife convinces him that the estate is too small to divide among 30 children. In fact, it should go to just one child. The wife convinces him to arrange the children in a circle and eliminate (but not kill for a change!) every tenth child. The final child will inherit everything. The second wife arranges the children and the process begins. In an interesting twist, the first 14 to be eliminated are all children of the first wife. The father becomes alarmed, especially after he notices that the only remaining child from the first wife will be eliminated next. He suggests that they should start over, beginning with the sole remaining child of the first wife and travel around the circle in the opposite direction. The second wife cannot object without giving herself away, but she figures that the odds are 15 to 1 in her favor. The end result is that the child of the first wife is the final child, defeating the second wife's evil strategy. Your task, of course, is to place the children around the circle to match the story.

⁵³ A good translation into English is [47].
⁵⁴ Dr. Laurence Creider provided extensive help researching Hegesippus.
⁵⁵ With only one other surviving witness to verify the details.
⁵⁶ From chapter 8 of The Jewish War.
Solving the Josephus Problem
The original problem (with every third person being eliminated) is a bit more involved than is appropriate for the level of this text (but not by much). The interested reader should consult [34] for the solution. Instead, imagine that $n$ people are placed in a circle, and every second person is eliminated.⁵⁷ The value we want is the position (start counting at 1) of the final person. Call this position $j_n$. A good place to begin is with a few small examples. Table 7.11 shows the order in which people are eliminated and the value of $j_n$, for several small $n$. You should draw a few of the circles and verify the numbers.

TABLE 7.11 Order of Elimination with n People

n    Elimination Sequence    j_n
1    --                      1
2    2                       1
3    2 1                     3
4    2 4 3                   1
5    2 4 1 5                 3
6    2 4 6 3 1               5
7    2 4 6 1 5 3             7
8    2 4 6 8 3 7 5           1
9    2 4 6 8 1 5 9 7         3
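The rows of Table 7.11 can be regenerated by simulating the circle directly. The following is a minimal sketch (the function name `josephus_simulation` is illustrative): it removes every second person from a list and records the order of elimination.

```python
def josephus_simulation(n):
    """Return (elimination_order, survivor) when every second person is removed."""
    circle = list(range(1, n + 1))
    order = []
    index = 0                                # counting starts at position 1
    while len(circle) > 1:
        index = (index + 1) % len(circle)    # skip one person, land on the next
        order.append(circle.pop(index))      # eliminate that person
        # after the pop, index refers to the person who followed the one removed
    return order, circle[0]


for n in range(1, 10):
    order, survivor = josephus_simulation(n)
    print(n, order, survivor)
```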
What can we observe from these examples? One trend (at least so far) is that $j_n$ seems to be odd. Another trend seems to be that even numbered positions are eliminated first, in order. These two observations are actually related: Since all even positions will be eliminated first (according to the "every second person" rule), the final position will always be an odd number. It takes just a little bit of creativity (or else a few years' worth of mathematical maturity and experience) to make the following observations: Since approximately half the people (those in even-numbered positions) are eliminated immediately, it may be profitable to write $n$ in a form that involves the number 2. If $n$ is even, we use up exactly half the people in this first phase, while if $n$ is odd, there will still be one extra person left before wrapping back to the beginning.⁵⁸ Because even and odd are apparently significant characteristics, it may be useful to write $n$ as either $n = 2k$ for even $n$, or $n = 2k + 1$ for odd $n$.

⁵⁷ The solution presented here is also from [34].
⁵⁸ For example, when n = 7, positions 2, 4, and 6 are eliminated in phase 1, but 7 still remains before getting back to 1.

Consider the case where $n = 2k$ is even. After phase 1, only the odd numbered positions are left. There will be $k$ such numbers and the next available position will be position 1. The problem has effectively been reduced to a problem of size $k$. There is one pesky detail: a problem of size $k$ has the positions numbered as $1, 2, 3, \ldots, k$, but a problem of size $n = 2k$ has the remaining positions numbered $1, 3, 5, \ldots, 2k - 1$. It is easy to see how the two sequences relate: the old sequence can be grouped in pairs (odd, even). We keep only the first member of each pair. Look at the following table as $i$ goes from 1 to $k$:⁵⁹

Original sequence     1   2   3   4   5   ...   (2i - 1)   (2i)   ...   (2k - 1)   (2k)
Relabeled sequence    1   -   2   -   3   ...   i          -      ...   k          -
Suppose we already knew the final position number, $j_k$, for a circle of size $k$. Then a circle of size $n = 2k$ would end up in the same place, assuming we could suitably relabel the original odd positions after phase 1 eliminates the even positions. It should be clear from the table that relabeled position $i$ corresponds to original position $2i - 1$. This leads to a clever strategy: Start with a circle of size $n = 2k$. After the even positions have been eliminated, relabel the positions as $1, 2, 3, \ldots, k$. The final position in this relabeled circle will be $j_k$. This corresponds to position $2j_k - 1$ in the original circle. We now have the recursive relations $j_1 = 1$, and $j_{2k} = 2j_k - 1$. What we need is a similar recursive reduction when $n$ is odd.

If $n = 2k + 1$ is odd, phase 1 leaves only the odd positions. There are now $k + 1$ positions, so the reduced problem looks like a circle of size $k + 1 = \frac{n+1}{2}$. The relabeling is also a bit more complicated, since the next person is not in the original position 1, but in original position $2k + 1$.⁶⁰

Original     1   2   3   4   5   ...   (2i - 2)   (2i - 1)   (2i)   ...   (2k - 1)   (2k)   (2k + 1)
Relabeled    2   -   3   -   4   ...   -          (i + 1)    -      ...   (k + 1)    -      1

The correspondence is a bit messy. Here is a revised idea: Don't end phase 1 until the original position 1 is eliminated (that person will always be the next to go). If we relabel after this point, the table becomes

Original     1   2   3   4   5   ...   (2i)   (2i + 1)   (2i + 2)   ...   (2k - 1)   (2k)   (2k + 1)
Relabeled    -   -   1   -   2   ...   -      i          -          ...   (k - 1)    -      k

That looks much better! In fact, after the revised phase 1, there will be a circle of size $k$. The final person will be in relabeled position $j_k$, corresponding to original position $2j_k + 1$. This leads to the recursive relation $j_{2k+1} = 2j_k + 1$. The recursive reduction formulas are
$$j_{2n} = 2j_n - 1$$
$$j_{2n+1} = 2j_n + 1$$
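The recursive reduction formulas, together with the base case $j_1 = 1$, already give a fast way to compute $j_n$. The sketch below is illustrative (function names are assumptions); it implements the two formulas and checks them against a direct simulation of the circle using a double-ended queue.

```python
from collections import deque


def josephus_recursive(n):
    """Compute j_n from j_1 = 1, j_(2k) = 2*j_k - 1, j_(2k+1) = 2*j_k + 1."""
    if n == 1:
        return 1
    k, odd = divmod(n, 2)
    return 2 * josephus_recursive(k) + (1 if odd else -1)


def josephus_direct(n):
    """Simulate the circle: skip one person, eliminate the next, repeat."""
    circle = deque(range(1, n + 1))
    while len(circle) > 1:
        circle.rotate(-1)   # the person at the front is skipped (moved to the back)
        circle.popleft()    # the next person is eliminated
    return circle[0]


for n in range(1, 200):
    assert josephus_recursive(n) == josephus_direct(n)

print([josephus_recursive(n) for n in range(1, 17)])
```

The recursion halves $n$ at each step, so it needs only about $\log_2(n)$ calls, whereas the simulation removes people one at a time.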
⁵⁹ Look at the case n = 8 if you want something more concrete.
⁶⁰ Look at the case n = 9 for a concrete example.

Can these recurrence relations be turned into a closed-form formula? If so, by what technique? Notice that they are not homogeneous, so the linear homogeneous recurrence relation with constant coefficients technique is out. If you try to do some back substitution, the need for two distinct relations (even vs odd) will quickly lead to something that is messy and quite awkward. You need to keep track of how many 2s are in the original $n$ to keep this sorted out. Let's try this a bit just to see what happens. Let $n = 2^m q$, where $q$ is odd and $m \ge 1$.
$$\begin{aligned} j_{2^m q} &= 2 j_{2^{m-1} q} - 1 && \text{substitute} \\ &= 2\left(2 j_{2^{m-2} q} - 1\right) - 1 && \text{substitute} \\ &= 2^2 j_{2^{m-2} q} - (2 + 1) && \text{simplify} \\ &\ \ \vdots \\ &= 2^m j_q - \sum_{i=0}^{m-1} 2^i \\ &= 2^m j_q + 1 - 2^m \end{aligned}$$
At this point, we know that $q$ is odd, so there is an $r$ with $q = 2r + 1$. Then $j_q = 2j_r + 1$. But what do we do about $r$? Is it even or odd?
We have reached an apparent dead end, but the experience may still provide some insight later on. So the techniques for linear homogeneous recurrence relations with constant coefficients don't work, back substitution seems to fail, and after some messing around, it seems that generating functions may also be difficult to apply. What can be done? One observation is that $j_{2k+1} - j_{2k} = 2$ in all cases. That is, for each odd $n$, subtracting $j_{n-1}$ from $j_n$ always gives 2. There is no similar constant difference if $n$ is even and the same subtraction is done. Perhaps a larger table of small cases will help, especially now that the recurrence relations help to reduce the work. For example, $j_{10} = 2j_5 - 1 = 2 \cdot 3 - 1 = 5$. (Note the duplication for $n = 16$ in the tables.)

n      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
j_n    1   1   3   1   3   5   7   1   3   5   7   9  11  13  15   1

n     16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32
j_n    1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31   1
Notice the pattern in the second row. In particular, notice that the pattern changes whenever $n = 2^m$. This should not be a big surprise if you consider what was learned in the attempt to use back substitution. The pattern seems to start at 1 when $n = 2^m$, then build by 2 until $n = 2^{m+1}$, where it returns to 1. A bit of thought and experimentation will lead to a simple formula, once the proper characterization of $n$ is found. The useful way to write $n$ is $n = 2^m + i$, where $0 \le i < 2^m$. For example,

n          4       5       6       7       8
2^m + i    2^2+0   2^2+1   2^2+2   2^2+3   2^3+0
The Modified Josephus Problem Suppose $n$ people are seated around a circle, numbered from 1 to $n$. Start counting with the first person and eliminate every second person. Continue until only one person is left. Denote the final position by $j_n$. Let $n = 2^m + i$ with $0 \le i < 2^m$. Then $j_n = j(2^m + i) = 2i + 1$.

Proof: The theorem can be proved using complete induction.
Base Step $n = 1 = 2^0 + 0$. Since $i = 0$, the theorem predicts $j_1 = 2 \cdot 0 + 1 = 1$, which is correct.

Inductive Step Assume that the theorem is true for all positive integers less than $n$. Suppose first that $n$ is even. Then $n = 2^m + i = 2^m + 2k = 2(2^{m-1} + k)$, for $0 \le k < 2^{m-1}$. The recurrence relation implies that

$$j_n = j(2(2^{m-1} + k)) = 2j(2^{m-1} + k) - 1.$$

By the inductive hypothesis, $j(2^{m-1} + k) = 2k + 1$. Thus

$$j_n = 2j(2^{m-1} + k) - 1 = 2(2k + 1) - 1 = 2(2k) + 1 = 2i + 1.$$

Now suppose that $n$ is odd. Then $n = 2^m + i = 2^m + 2k + 1 = 2(2^{m-1} + k) + 1$, where $0 \le k \le 2^{m-1} - 1$. Using the recurrence relation, and then the inductive hypothesis,

$$j_n = j(2(2^{m-1} + k) + 1) = 2j(2^{m-1} + k) + 1 = 2(2k + 1) + 1 = 2i + 1.$$

The induction is finished: The theorem is true for $n = 1$, and whenever the theorem is true for all positive integers less than $n$, it is also true for $n$.
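A brute-force simulation offers an independent check on the formula $j_n = 2i + 1$. The sketch below is illustrative only: the function names are hypothetical, and the use of a double-ended queue to model the circle is an implementation choice, not something taken from the text.

```python
from collections import deque


def josephus_by_simulation(n: int) -> int:
    """Seat people 1..n in a circle and eliminate every second person."""
    circle = deque(range(1, n + 1))
    while len(circle) > 1:
        circle.rotate(-1)  # the person counted "one" moves to the back
        circle.popleft()   # the person counted "two" is eliminated
    return circle[0]


def josephus_closed_form(n: int) -> int:
    """Write n = 2^m + i with 0 <= i < 2^m and return 2i + 1."""
    m = n.bit_length() - 1  # largest m with 2^m <= n
    i = n - (1 << m)
    return 2 * i + 1


agreement = all(
    josephus_by_simulation(n) == josephus_closed_form(n) for n in range(1, 201)
)
```

The simulation does work proportional to $n$, while the closed form needs only the binary length of $n$.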
7.5.1 Exercises
The marked exercises have detailed solutions in Appendix G.
1. Consider the problem variant with 30 people on a boat. Designate the two groups the G's and the B's (the "good guys" and the "bad guys"). Where should the G's and B's be placed around the circle so that all 15 G's are left after 15 people are thrown overboard?
2. Consider the man with 30 children. How were the children placed around the circle (and which direction was used initially)? Designate the children by F and S ("child of First wife" and "child of Second wife").
3. You are probably familiar with the story of Scheherazade from the story 1001 Arabian Nights. In the story, the king, displeased with women in general, decided to spend each night with a new bride and then have her executed the next day (thus ensuring her faithfulness). Many women died. Eventually Scheherazade volunteered to marry the king. On her wedding night she told the king such an interesting story that he delayed her execution by one day. The next night she told another story that was so good, the king delayed her execution again. This continued for 1001 nights, until finally the king realized he should stay married to her and keep her as queen. (He was a slow learner.) Suppose instead that the king had 1001 concubines. Each would draw a number and he would select every other concubine in circular order. The last concubine left alive would become queen. Which number should Scheherazade pick in order to survive and become queen?
4. Let $p_n$ represent the position that Josephus's partner should be in so that he is the second-to-last to be selected for execution (in a circle with $n$ people, and every second person executed).
(a) Find or guess a formula for $p_n$.
(b) Prove that your formula is correct. (Hint: Do the same recurrence relations hold?)
7.6 QUICK CHECK SOLUTIONS
Quick Check 7.1 1. The complete diagram contains two invocations that lead to the sum 4. Notice how the W (the set on the right at each point of the diagram) gets smaller at each step.
[Diagram: tree of recursive invocations; each node is a pair of sets, with the right-hand set W shrinking from {4, 3, 1} down to the empty set.]
Quick Check 7.2
1. $A_2$ is created from $A_1$, $B_1$, and $D_1$ using the schematic [schematic diagram].
2. $B_3$ is created from $A_2$, $B_2$, and $C_2$ using the schematic [schematic diagram].
3. $A_3$, $B_3$, $C_3$, and $D_3$ are [figures]. So $S_3$ is [figure], assembled according to the schematic for $S_n$ from $A_n$, $B_n$, $C_n$, and $D_n$.
Quick Check 7.3 The graph of the function is [graph: horizontal axis from 0 to 4, vertical axis from 0 to 2].

It may help to program Simpson's rule into a graphing calculator or a symbolic system such as Mathematica or Maple, or else to use a traditional programming language such as C or Java.

Page 1: [0, 4] $r = .01$, $S(f, 0, 4) \approx 5.10457$, $S(f, 0, 2) \approx 3.44747$, and $S(f, 2, 4) \approx 1.80474$, so $|\text{whole} - \text{left} - \text{right}| = 0.147641 > 10 \cdot .01$.

Page 2: [0, 2] (Page 3: [2, 4] is pending.) $r = .005$, $S(f, 0, 2) \approx 3.44747$, $S(f, 0, 1) \approx 1.86923$, and $S(f, 1, 2) \approx 1.57847$, so $|\text{whole} - \text{left} - \text{right}| = 0.000225281 < 10 \cdot .005$. The algorithm returns $1.86923 + 1.57847 = 3.4477$ as the value of $\int_0^2 \sqrt{4 - x}\,dx$.

Page 3: [2, 4] $r = .005$, $S(f, 2, 4) \approx 1.80474$, $S(f, 2, 3) \approx 1.21887$, and $S(f, 3, 4) \approx 0.638071$, so $|\text{whole} - \text{left} - \text{right}| = 0.0521988 > 10 \cdot .005$.

Page 4: [2, 3] (Page 5: [3, 4] is pending.) $r = .0025$, $S(f, 2, 3) \approx 1.21887$, $S(f, 2, 2.5) \approx 0.660872$, and $S(f, 2.5, 3) \approx 0.558073$, so $|\text{whole} - \text{left} - \text{right}| = 0.0000796489 < 10 \cdot .0025$. The algorithm returns $0.660872 + 0.558073 = 1.218945$ as the value of $\int_2^3 \sqrt{4 - x}\,dx$.

Page 5: [3, 4] $r = .0025$, $S(f, 3, 4) \approx 0.638071$, $S(f, 3, 3.5) \approx 0.430934$, and $S(f, 3.5, 4) \approx 0.225592$, so $|\text{whole} - \text{left} - \text{right}| = 0.0184551 < 10 \cdot .0025$. The algorithm returns $0.430934 + 0.225592 = .656526$ as the value of $\int_3^4 \sqrt{4 - x}\,dx$.

Back to Page 3 Page 3 returns $\int_2^3 \sqrt{4 - x}\,dx + \int_3^4 \sqrt{4 - x}\,dx = 1.218945 + .656526 = 1.87547$.

Back to Page 1 The final result is $3.4477 + 1.87547 = 5.32317$, using the intervals [0, 2], [2, 3], and [3, 4]. The actual value is $5\frac{1}{3}$, so the error is approximately .0101633. Even using the more conservative safety factor of 10 (rather than 15) the algorithm didn't quite deliver as promised. Changing the safety constant to 5 produces a result that is in error by approximately .00129. It uses the intervals [0, 2], [2, 3], [3, 3.5], [3.5, 3.75], and [3.75, 4].
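The page-by-page trace above can be reproduced with a short program. The sketch below is written in Python rather than C or Java, and it makes several assumptions: the integrand is taken to be $f(x) = \sqrt{4 - x}$ (inferred from the integrals and the exact value $5\frac{1}{3}$), the tolerance is halved at each level as the successive values of $r$ suggest, and the names `simpson` and `adaptive_simpson` are hypothetical.

```python
import math


def simpson(f, a, b):
    """Simpson's rule on [a, b]: (f(a) + 4 f((a + b)/2) + f(b)) (b - a) / 6."""
    return (f(a) + 4 * f((a + b) / 2) + f(b)) * (b - a) / 6


def adaptive_simpson(f, a, b, tol, safety=10.0):
    """Accept left + right when |whole - left - right| < safety * tol.

    Otherwise recurse on each half with half the tolerance (assumed from the trace).
    """
    mid = (a + b) / 2
    whole = simpson(f, a, b)
    left = simpson(f, a, mid)
    right = simpson(f, mid, b)
    if abs(whole - left - right) < safety * tol:
        return left + right
    return (adaptive_simpson(f, a, mid, tol / 2, safety)
            + adaptive_simpson(f, mid, b, tol / 2, safety))


def integrand(x):
    return math.sqrt(4 - x)  # assumed integrand for this Quick Check


estimate = adaptive_simpson(integrand, 0.0, 4.0, 0.01)
error = 16 / 3 - estimate
```

With the safety factor of 10, the recursion stops on [0, 2], [2, 3], and [3, 4], which matches the intervals reported in the trace.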
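Quick Check 7.4 1. The back substitution is simple. In this example, however, it is better to simplify at each step.

\begin{align*}
b_n &= 1 - b_{n-1} \\
&= 1 - (1 - b_{n-2}) && \text{substitute} \\
&= b_{n-2} && \text{simplify} \\
&= 1 - b_{n-3} && \text{substitute (no simplification needed)} \\
&= 1 - (1 - b_{n-4}) && \text{substitute} \\
&= b_{n-4} && \text{simplify} \\
&\;\;\vdots \\
&= \begin{cases} 1 - b_{n-k} & \text{if } k \text{ is odd} \\ b_{n-k} & \text{if } k \text{ is even} \end{cases} \\
&= \begin{cases} 1 - b_0 & \text{if } n \text{ is odd} \\ b_0 & \text{if } n \text{ is even} \end{cases} \\
&= \begin{cases} 1 & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even} \end{cases}
\end{align*}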
Quick Check 7.5
1. (a) Homogeneous and linear; does not have constant coefficients [the coefficient $\cos(n)$ depends on $n$].
(b) Linear homogeneous recurrence relation with constant coefficients.
(c) This does have constant coefficients but is not homogeneous ($+9$) and is not linear ($(a_{n-1})^2$).
(d) This is homogeneous and does have constant coefficients but is not linear (a term of the form $2^{b_j}$).
2. (a) $x^2 - 5x + 9 = 0$
(b) $x^3 - x^2 + 7 = 0$. Notice that the missing term $b_{n-2}$ results in $0x$ in the characteristic equation.
(c) $x^2 - 3x - 4 = 0$. Note that the subscripts were not in the standard order. They needed to be converted to $a_n = 3a_{n-1} + 4a_{n-2}$ before forming the characteristic equation.
Quick Check 7.6 1. The characteristic equation is $x^3 - 4x^2 - x + 4 = 0$. This can be factored. One approach is to group terms: $(x^3 - x) + (-4x^2 + 4) = 0$. This simplifies to $x(x^2 - 1) - 4(x^2 - 1) = 0$. Combining common terms leads to $(x - 4)(x^2 - 1) = 0$, which factors as $(x - 4)(x - 1)(x + 1) = 0$. Thus, the roots are distinct: $r_1 = 4$, $r_2 = 1$, and $r_3 = -1$.

The general solution is $a_n = \theta_1 4^n + \theta_2 1^n + \theta_3 (-1)^n = \theta_1 4^n + \theta_2 + \theta_3 (-1)^n$. The system of linear equations that determines the values of the $\theta$'s is

\begin{align*}
\theta_1 + \theta_2 + \theta_3 &= 1 \\
4\theta_1 + \theta_2 - \theta_3 &= 2 \\
16\theta_1 + \theta_2 + \theta_3 &= 3.
\end{align*}

Rather than begin a messy set of substitutions, and without assuming you have seen Gaussian elimination, it is still possible to solve this system without much effort. As an initial step, I will add the second equation to the first and third equations (thus eliminating $\theta_3$). This produces the reduced system

\begin{align*}
5\theta_1 + 2\theta_2 &= 3 \\
20\theta_1 + 2\theta_2 &= 5.
\end{align*}

Subtracting the first equation from the second (in this reduced system) eliminates $\theta_2$: $15\theta_1 = 2$. Thus $\theta_1 = \frac{2}{15}$. Substituting this value into the equation $5\theta_1 + 2\theta_2 = 3$ and solving for $\theta_2$ produces $\theta_2 = \frac{7}{6}$. Finally, substituting the values for $\theta_1$ and $\theta_2$ into $\theta_1 + \theta_2 + \theta_3 = 1$ leads to $\theta_3 = -\frac{3}{10}$. The solution is generated by

$$a_n = \frac{2}{15} 4^n + \frac{7}{6} - \frac{3}{10} (-1)^n.$$

This formula should certainly be tested to see that it generates the same values as the first five or six iterations of the original recurrence relation.
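The suggested test is easy to automate with exact rational arithmetic. In the sketch below, the recurrence $a_n = 4a_{n-1} + a_{n-2} - 4a_{n-3}$ is read off from the characteristic equation, and the initial values $a_0 = 1$, $a_1 = 2$, $a_2 = 3$ are taken from the right-hand sides of the linear system; both are inferences, since the original recurrence relation is not restated here. The function names are hypothetical.

```python
from fractions import Fraction


def by_recurrence(count):
    """First `count` terms of a_n = 4a_{n-1} + a_{n-2} - 4a_{n-3} (inferred recurrence)."""
    values = [1, 2, 3]  # assumed initial values a_0, a_1, a_2
    while len(values) < count:
        values.append(4 * values[-1] + values[-2] - 4 * values[-3])
    return values[:count]


def closed_form(n):
    """a_n = (2/15) 4^n + 7/6 - (3/10) (-1)^n, computed exactly."""
    return Fraction(2, 15) * 4**n + Fraction(7, 6) - Fraction(3, 10) * (-1)**n


matches = all(closed_form(n) == value for n, value in enumerate(by_recurrence(10)))
```

Using `Fraction` avoids rounding error, so the comparison is an exact equality rather than an approximate one.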
Quick Check 7.7
1. (a) $p(x) = (x - 1)^3 (x - 3)$, so the zero $r = 1$ has multiplicity $v = 3$.
(b) $p'(x) = 4x^3 - 18x^2 + 24x - 10 = 2(x - 1)^2 (2x - 5)$, so $r = 1$ has multiplicity $v = 2$. An alternative approach is to use the product rule: $p'(x) = 3(x - 1)^2 (x - 3) + (x - 1)^3 = (x - 1)^2 ((3x - 9) + (x - 1)) = (x - 1)^2 (4x - 10)$.
(c) $p''(x) = 12x^2 - 36x + 24 = 12(x^2 - 3x + 2) = 12(x - 1)(x - 2)$, so $r = 1$ has multiplicity $v = 1$.
(d) $p^{(3)}(x) = 24x - 36 = 12(2x - 3)$, so $r = 1$ is no longer a zero.
(e) It appears that the multiplicity of the zero decreases by 1 with each new derivative.
Quick Check 7.8 1. The characteristic equation is $x^2 + 6x + 9 = (x + 3)^2 = 0$, having root $r = -3$ of multiplicity $v = 2$. The general solution has the form $a_n = (\alpha_0 + \alpha_1 n)(-3)^n$. The values of the $\alpha$'s are determined by $\alpha_0 = -2$ and $-3\alpha_0 - 3\alpha_1 = -6$. The solution is $a_n = (-2 + 4n)(-3)^n$ for $n \ge 0$.
Quick Check 7.9 1. The first step is to split the list in half and recursively sort each half. Then the sorted sublists are merged.

Sort {h, d, a, c} Split the list into two lists of length 2: {h, d} and {a, c}.
  Sort {h, d} Split {h, d} into two lists of size 1. The recursive invocations just return the single-item lists. Now merge the lists, forming {d, h}.
  Sort {a, c} Split {a, c} into two lists of size 1. The recursive invocations just return the single-item lists. Now merge the lists, forming {a, c}.
  Merge Now merge the two sorted lists, {d, h} and {a, c}, producing {a, c, d, h}.
Sort {g, f, b, e} Split the list into two lists of length 2: {g, f} and {b, e}.
  Sort {g, f} Split {g, f} into two lists of size 1. The recursive invocations just return the single-item lists. Now merge the lists, forming {f, g}.
  Sort {b, e} Split {b, e} into two lists of size 1. The recursive invocations just return the single-item lists. Now merge the lists, forming {b, e}.
  Merge Now merge the two sorted lists, {f, g} and {b, e}, producing {b, e, f, g}.
Merge Now merge the sorted sublists: {a, c, d, h} and {b, e, f, g}. The merged list is {a, b, c, d, e, f, g, h}.
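The trace above follows the usual split, sort, and merge pattern. The following Python sketch is one possible rendering and is not the book's own listing; the names `merge` and `merge_sort` are illustrative.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two slices is nonempty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split the list in half, recursively sort each half, then merge."""
    if len(items) <= 1:
        return list(items)  # a single-item (or empty) list is already sorted
    middle = len(items) // 2
    return merge(merge_sort(items[:middle]), merge_sort(items[middle:]))


result = merge_sort(["h", "d", "a", "c", "g", "f", "b", "e"])
```

For the eight-letter list, the first split produces {h, d, a, c} and {g, f, b, e}, the same halves that appear in the trace.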
Quick Check 7.10 1. Using Theorem 7.8 with $a > 1$, $g(n) = n^{\log_3(9)} = n^2$.
Quick Check 7.11
1. For this algorithm, $a = b = 2$, $c = v = 1$, and $d = 0$. Since $a = b^v$, and $\log_2(2) = 1$, mergesort is in $\Theta(n \log_2(n))$.
2. For this recurrence relation, $a = b = 4$, $c = 2$, $d = 1$, and $v = \frac{1}{2}$. Since $a = 4 > 2 = \sqrt{4} = b^v$, and since $\log_4(4) = 1$, $f \in \Theta(n)$.
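Quick Check 7.12
1. The expansion for $\frac{1}{1 + 2z}$ is

$$\frac{1}{1 + 2z} = (1 + 2z)^{-1} = \sum_{k=0}^{\infty} \binom{-1}{k} (2z)^k = \sum_{k=0}^{\infty} (-1)^k 2^k z^k = \sum_{k=0}^{\infty} (-2)^k z^k.$$

For $\frac{1}{1 - 3z}$, the expansion is

$$\frac{1}{1 - 3z} = (1 - 3z)^{-1} = \sum_{k=0}^{\infty} \binom{-1}{k} (-3z)^k = \sum_{k=0}^{\infty} (-1)^k (-3)^k z^k = \sum_{k=0}^{\infty} 3^k z^k.$$

2. Make the identification $f(z) = \frac{1}{1 + 2z}$ and $g(z) = \frac{1}{1 - 3z}$. Then $f_j = (-2)^j$ and $g_{k-j} = 3^{k-j}$. Therefore, the coefficient of $z^k$ in the product is

$$\sum_{j=0}^{k} (-2)^j 3^{k-j} = 3^k \sum_{j=0}^{k} \left(-\frac{2}{3}\right)^j = 3^k \cdot \frac{1 - \left(-\frac{2}{3}\right)^{k+1}}{1 + \frac{2}{3}} = \frac{1}{5}\left(3^{k+1} - (-2)^{k+1}\right).$$

Therefore,

$$\frac{1}{1 + 2z} \cdot \frac{1}{1 - 3z} = \sum_{k=0}^{\infty} \frac{1}{5}\left(3^{k+1} - (-2)^{k+1}\right) z^k.$$

3. Multiply both sides of the recurrence relation by $z^n$ and sum from $n = 2$ (the smallest value for which all the subscripts are valid).

$$\sum_{n=2}^{\infty} a_n z^n = \sum_{n=2}^{\infty} a_{n-1} z^n + 6 \sum_{n=2}^{\infty} a_{n-2} z^n.$$

If the generating function for the sequence is $A(z) = \sum_{n=0}^{\infty} a_n z^n$, then the previous expression can be rewritten as

$$A(z) - a_0 - a_1 z = (zA(z) - a_0 z) + 6z^2 A(z).$$

This is true because

$$\sum_{n=2}^{\infty} a_{n-1} z^n = z \sum_{n=2}^{\infty} a_{n-1} z^{n-1} = z \sum_{j=1}^{\infty} a_j z^j = z(A(z) - a_0)$$

and

$$\sum_{n=2}^{\infty} a_{n-2} z^n = z^2 \sum_{n=2}^{\infty} a_{n-2} z^{n-2} = z^2 \sum_{j=0}^{\infty} a_j z^j = z^2 A(z).$$

Rearranging the terms produces

$$A(z)(1 - z - 6z^2) = a_0 + (a_1 - a_0) z = (1 + z),$$

so [using part (2)]

$$A(z) = \frac{1 + z}{1 - z - 6z^2} = (1 + z) \cdot \frac{1}{(1 + 2z)(1 - 3z)} = (1 + z) \sum_{k=0}^{\infty} \frac{1}{5}\left(3^{k+1} - (-2)^{k+1}\right) z^k.$$

The coefficient of $z^n$ in the preceding product is

\begin{align*}
a_n &= \frac{1}{5}\left(3^{n+1} - (-2)^{n+1}\right) + \frac{1}{5}\left(3^n - (-2)^n\right) \\
&= \frac{1}{5}\left(3^n (3 + 1) - (-2)^n (-2 + 1)\right) \\
&= \frac{4}{5} 3^n + \frac{1}{5} (-2)^n.
\end{align*}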
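Both the product coefficient from part 2 and the closed form from part 3 can be checked numerically. The sketch below assumes the recurrence $a_n = a_{n-1} + 6a_{n-2}$ with $a_0 = 1$ and $a_1 = 2$, which is what the equation $a_0 + (a_1 - a_0)z = 1 + z$ implies; the function names are hypothetical.

```python
from fractions import Fraction


def product_coefficient(k):
    """Coefficient of z^k in f(z) g(z), computed as the sum of f_j * g_{k-j}."""
    return sum((-2)**j * 3**(k - j) for j in range(k + 1))


def product_closed_form(k):
    """(1/5) (3^{k+1} - (-2)^{k+1})."""
    return Fraction(3**(k + 1) - (-2)**(k + 1), 5)


def a_by_recurrence(count):
    """First `count` terms of a_n = a_{n-1} + 6 a_{n-2} with a_0 = 1, a_1 = 2 (assumed)."""
    values = [1, 2]
    while len(values) < count:
        values.append(values[-1] + 6 * values[-2])
    return values[:count]


def a_closed_form(n):
    """(4/5) 3^n + (1/5) (-2)^n."""
    return Fraction(4 * 3**n + (-2)**n, 5)


convolution_ok = all(product_coefficient(k) == product_closed_form(k) for k in range(15))
solution_ok = all(a_closed_form(n) == v for n, v in enumerate(a_by_recurrence(15)))
```

The first check exercises the coefficient formula for a product of generating functions; the second confirms that the extracted coefficient satisfies the recurrence.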
7.7 Chapter Review
7.7.1 Summary
This chapter introduces recursion. Recursion is an ingenious problem solving strategy that expresses the solution to a problem in terms of smaller versions of the same problem. The chapter presents recursion in an algorithmic context (Section 7.1) and also in a functional context (Sections 7.2, 7.3, and 7.4). In the algorithmic context, the key ideas to review are the suggested five steps for creating recursive algorithms and the material that indicates when recursion may be inappropriate.

Creating a Recursive Algorithm
Step 1 Identify how to reduce the problem into smaller versions of itself.
Step 2 Identify one or more instances of the problem that can be directly solved.
Step 3 Determine how the solution can be obtained by combining the solutions to one or more smaller versions.
Step 4 Verify that the invocations in step 3 are within bounds.
Step 5 Assemble the algorithm.

When Recursion May Be Inappropriate
Tail-end recursion An algorithm uses tail-end recursion if the only recursive invocation it makes occurs at the last line of the algorithm.
Redundant recursion Redundant recursion occurs when an algorithm directly or indirectly invokes multiple instances of the same smaller version.

Several examples are presented that demonstrate the power of recursion. The algorithms for drawing Persian rugs and Sierpinski curves and the adaptive quadrature algorithm all use multiple recursions.

Section 7.2 introduces recurrence relations. Recurrence relations express the notion of recursion using functions and/or sequences. The chapter presents three general methods for solving recurrence relations: back substitution (Section 7.2.1), a two-phased process for linear homogeneous recurrence relations with constant coefficients (Section 7.2.2), and generating functions (Section 7.4). Back substitution is the simplest of the three methods but can become algebraically unmanageable if the recurrence relation is too complex.
The second method requires the recurrence relation to be in a special form: linear, homogeneous, with constant coefficients. The solution for this special form of recurrence relation can be expressed in terms of the roots of the characteristic equation. The solution is initially presented for the case where the characteristic equation has distinct roots and then generalized for the case where the characteristic equation has repeated roots. This method has a practical limitation: It requires the exact roots of a polynomial equation. It is possible to extend this technique so that the homogeneous requirement can be dropped. That extension is not presented in this book.

The use of generating functions is the most powerful of the three methods. That method is also the most complex of the three. The presentation in this chapter is only an introduction to generating functions, but it contains sufficient material to be practically useful.

Back substitution is used in Section 7.3 to prove several versions of the master theorem. This theorem describes appropriate big-$\Theta$ reference functions for a subclass of recursive algorithms. The algorithms for which the master theorem is appropriate are those that divide the original problem into two or more subproblems, each having the same size.

The chapter concludes with an application of recurrence relations to solve a modified version of an ancient math puzzle called the Josephus problem. The solution technique for linear homogeneous recurrence relations with constant coefficients is summarized in the following five steps.
The General Procedure for Solving Linear Homogeneous Recurrence Relations with Constant Coefficients
Step 1 Form the characteristic equation.
Step 2 Find the roots of the characteristic equation, $r_1, r_2, \ldots, r_j$, having respective multiplicities $v_1, v_2, \ldots, v_j$, with $v_1 + v_2 + \cdots + v_j = k$.
Step 3 Express the general solution as a sum of $j$ terms, one for each root $r_i$, in the form $(\delta_0 + \delta_1 n + \delta_2 n^2 + \cdots + \delta_{v_i - 1} n^{v_i - 1})\, r_i^n$.
Step 4 Use the result of step 3 to form a system of linear equations to determine the unknown coefficients.
Step 5 Solve the linear system and substitute the solution values for the unknown coefficients into the general solution from step 3.
7.7.2 Notation

Notation                                        Page                Brief Description
$S_n$ (schematic of $A_n$, $B_n$, $C_n$, $D_n$)  324                 The schematic diagram for the nth Sierpinski curve
$f_n$                                           332                 The nth Fibonacci number
$p^{(j)}(x)$                                    348                 The jth derivative of the function $p(x)$
$\Phi$                                          345 (footnote 35)   The golden ratio
$\sum_{k=0}^{\infty} a_k z^k$                   372                 A generating function
$\binom{u}{r}$                                  378                 A generalized binomial coefficient
$j_n$                                           385                 The position of the surviving person in a Josephus problem where every second person is eliminated
7.7.3 Definitions

Recursion Recursion is a process of expressing the solution to a problem in terms of a simpler version of the same problem. A recursive algorithm is an algorithm that invokes itself during execution.

Tail-end Recursion An algorithm uses tail-end recursion if the only recursive invocation it makes occurs at the last line of the algorithm.

Redundant Recursion Redundant recursion occurs when an algorithm directly or indirectly invokes multiple instances of the same smaller version.

Recursively Defined Functions A recursively defined function is a function, $f$, whose domain is the set of nonnegative integers and for which
• $f(0)$ is known
• $f(n)$ is defined in terms of some subset of $\{f(0), f(1), \ldots, f(n - 1)\}$

Recurrence Relation Let $\{a_n \mid n = 0, 1, 2, \ldots\}$ be a sequence. A recurrence relation for $\{a_n\}$ is a formula that expresses $a_n$ in terms of some subset of $\{a_0, a_1, \ldots, a_{n-1}\}$. The recurrence relation must also specify one or more base conditions. Given a recurrence relation, the sequence it generates is called the solution of the recurrence relation.
The Fibonacci Sequence The Fibonacci sequence is the solution to the recurrence relation:
• $f_0 = 1$
• $f_1 = 1$
• $f_n = f_{n-1} + f_{n-2}$ for $n > 1$
The Fibonacci sequence is generated by

$$f_n = \frac{5 + \sqrt{5}}{10} \left(\frac{1 + \sqrt{5}}{2}\right)^n + \frac{5 - \sqrt{5}}{10} \left(\frac{1 - \sqrt{5}}{2}\right)^n.$$
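The closed-form expression can be compared with the recurrence for small $n$. The sketch below uses floating-point arithmetic and rounds the result, which is only reliable for moderate $n$; the function names are hypothetical, and the base values $f_0 = f_1 = 1$ follow the definition above.

```python
import math


def fib_by_recurrence(n):
    """f_0 = f_1 = 1 and f_n = f_{n-1} + f_{n-2} for n > 1."""
    current, following = 1, 1
    for _ in range(n):
        current, following = following, current + following
    return current


def fib_closed_form(n):
    """Evaluate the closed-form expression with floating-point arithmetic."""
    root5 = math.sqrt(5)
    return ((5 + root5) / 10 * ((1 + root5) / 2) ** n
            + (5 - root5) / 10 * ((1 - root5) / 2) ** n)


fib_agrees = all(round(fib_closed_form(n)) == fib_by_recurrence(n) for n in range(40))
```

Because $\left|\frac{1 - \sqrt{5}}{2}\right| < 1$, the second term shrinks quickly, so the first term alone already rounds to the correct value for all but the smallest $n$.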
Homogeneous; Constant Coefficients; Linear A recurrence relation for the sequence $\{a_n\}$ is called homogeneous if every term on the right-hand side of the recurrence relation contains a factor of the form $a_j$, for some integer $j$. The recurrence relation has constant coefficients if $n$ does not appear in any term involving some $a_j$ except in subscripts. A recurrence relation is called linear if no term contains more than one factor of the form $a_j$ (even with different values of $j$), and no factor of the form $a_j$ appears in a denominator, as an exponent, or as part of a more complex function.

Linear Homogeneous Recurrence Relations with Constant Coefficients of Degree k A linear homogeneous recurrence relation with constant coefficients of degree $k$ is a recurrence relation that can be written in the form

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$$

for some $k$ with $1 \le k$ and $c_k \ne 0$. The constant $k$ is called the degree of the recurrence relation. The constants, $c_j$, are called the coefficients of the recurrence relation.

Characteristic Equation The characteristic equation of the recurrence relation $a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$ is

$$x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k = 0.$$

Coefficient Matrix The coefficient matrix is a matrix that organizes the variable coefficients in a system of linear equations.

Rational Root A rational root is a root of an equation that is a rational number.

Repeated Root Let $p$ be a polynomial with factorization $p(x) = c(x - r_1)^{v_1} (x - r_2)^{v_2} \cdots (x - r_k)^{v_k}$ over $\mathbb{C}$. If $v_i > 1$, then $r_i$ is called a repeated root of $p$.
Nondecreasing Function A real-valued function, $f$, is said to be nondecreasing on an interval if $f(x) \le f(y)$ for all $x, y$ in the interval with $x < y$.

Generating Function Let $\{a_0, a_1, a_2, \ldots\}$ be a sequence of real or complex numbers. The generating function, $G(z)$, for the sequence is the formal power series $G(z) = \sum_{k=0}^{\infty} a_k z^k = a_0 + a_1 z + a_2 z^2 + \cdots$. If the sequence is finite, $\{a_0, a_1, a_2, \ldots, a_n\}$, then $G(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n$ is a polynomial. Define $a_k = 0$ if $k < 0$.

The Derivative of a Generating Function Let $A(z) = \sum_{k=0}^{\infty} a_k z^k$ be a generating function. Then its derivative is denoted by $A'(z)$ and is defined by

$$A'(z) = \sum_{k=0}^{\infty} k a_k z^{k-1}.$$

Generalized Binomial Coefficient Let $u$ be any real number and let $r$ be a nonnegative integer. Then the generalized binomial coefficient, $u$ choose $r$, is defined by

$$\binom{u}{r} = \begin{cases} \dfrac{u(u-1)(u-2) \cdots (u-r+1)}{r!} & \text{if } r > 0 \\ 1 & \text{if } r = 0. \end{cases}$$
The Josephus Problem The Josephus problem is one form of an ancient puzzle. In this version of the puzzle, men sit in a circle and every third man is eliminated until only one man remains. The puzzle asks for the original location of the last remaining man.
7.7.4 Theorems an error that is at most (b -a) a
Simpson's Rule
90.2
b
f (x)dx (f-f(a) + 4f (a f
-b) + f (b)) (b-a) 6
Theorem 7.1 Error in Simpson's Rule Suppose the interval [a, b] is subdivided into ff subintervals, where N = 2m for some positive integer m . Also, assume that f(4) (x) exists. Then a composite Simpson's rule will have
. max If(4)(x) . xE[a,b]
Theorem 7.2 Suppose the sequence $\{a_n\}$ is generated by the recurrence relation $a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$. If $a_n = \theta r^n$ also generates this sequence, then $r$ is a root of the characteristic equation. Conversely, if $r$ is a root of the characteristic equation, then any expression of the form $\theta r^n$ generates a sequence that is a solution to the recurrence relation.
Theorem 7.3 Suppose the characteristic equation of the degree k recurrence relation an = clan-I + C2an-2 +
++ Ckan-k.
The Quadratic Formula The equation
has k distinct roots, rI, r2..... rk. Then for any choice of constants, 01, 02, .. •, Ok, the closed-form expression an = 01r1 + 02r1 +... + Okrk
01, 02,. Ok, so that the recurrence relation generates the solution that matches those initial values. Theorem 7.4 Rational Roots Theorem Suppose the + •.- + cIx + co has integer polynomial cnx" + c, lx-lI 0 and co 0 0. Then any rational coefficients where c, (or integer) zero of the polynomial must be of the form P-, where p evenly divides co and q evenly divides ca. Theorem 7.5 Derivatives and Repeated Roots Let p(x) be a polynomial with a zero, r, having multiplicity v > 1. Then r is also a zero of the derivative pU) (x) for j = 1,2 ... , v-1. Theorem 7.6 Linear Homogeneous Recurrence Relations with Constant Coefficients Whose Characteristic Equations Have Repeated Roots Suppose the characteristic equation of the recurrence relation an = cla,-I + C2an-2 + - • • + Ckan-k has a root, r, of multiplicity v. Then for any choice of constants, aO, U2,.. u_, the closed-form expression
1)rn
an= (ao+catln + U2n2 +-...4-avIngenerates a solution to the recurrence relation.
Theorem 7.7 Solving Linear Homogeneous Recurrence Relations with Constant Coefficients Suppose the characteristic equation of the recurrence relation I
+ C2an2 +-
Afl2-1,
Lemma 7.1
/-
ln2 +-'
tni=O -+ av-Ini1)
+ (Po + Pin + Pln2 +..+
Theorem 7.8 The Master Theorem-Version 1 Let a, b E N and c, d c R, with a > 1, b > 1, c > 0, and d > 0. Let f (n) be a nondecreasing function on the interval (0, oo), where
- f n) afsf + c .
b f(1)= d 0 if n < I f(n)
Then
fE(&g(B)
rn}
the interval (0, oc), where • f (n) = af (W)+ cnV • f(1)=d • f(n) =0 ifn < 1 Then 0 (nlOgb(a))
if a > b' (logb (a) > v)
f E
(nlog,,(a)" - logb(n))
Proposition 7.1 r -A 1. Then
In addition, if base values ao, at .... ak-I are specified, then unique values can be found for cio, o1 . .. cv,- 1 , & AI0,/P,/ P12 -1-t... so that the closed-form formula matches the sequence generated by the recurrence relation.
if a = bv (logb(a) = v) if a < bV (logb(a) < v).
Let n E N and r E R with r > 0 and nrn+l r -1
r(rn - 1)
(r (
-
1)2 " )
Theorem 7.10 The Master Theorem-Version 3 Let a,b E Nand c,d e R, witha > 1, b > 1, c > 0, and d > 0. Let f(n) be defined on the interval [1, o0) by
• f(n) +...÷(woo+-wmn+oln24-...-+wvInv-1)ir
generates a solution to the recurrence relation.
if a > I
1®(logb(n)) if a = I. Theorem 7.9 The Master Theorem-Version 2 Let a, b E N and c, d, v E R, with a > 1, b > 1, c > 0, d > 0 and v > 0. Let f (n) be a nondecreasing function on
r - I
rI
0 and y : 1
xlog)(z) = zlogy(X).
0-(n')
PV2In V2-1
>
Then
the closed-form expression.
an = (aO+ atln +
+
/b 2 -4ac X 2a Let x, y, z c R with x, y, z -b+
-+ C
has j distinct roots, rI, r2 ..... rj, having respective multiplicities v1, v 2 ... . vj, with vt + V2 +- ... + vj = k. Then for any choice of constants, c0o, a. ,a av-, I--irin Po, PI Il,
ax 2 + bx + c = 0, where a, b, and c are real or complex numbers, has the solutions
generates a solution to the recurrence relation. In addition, if the k initial values, ao, al ..... ak-l, are specified, it is always possible to find unique values,
c an
Thus, the closed-form formula is the only solution; any other solution must be algebraically equivalent.
af
(Q)+
Clogb(n)
f(1) = d. If n = bk, for some k E N, then 0 (,lgbI(a) • logb(n))
f E
2
l og )] 2(n ([Lgb(f
if aI>
Some Useful Generating Functions Summation Notation 1 zzk
G (z)
3
~
z
2
+...
3
+- z + z - z + +÷z3m÷
1 +÷Zm-+'z2m
SZmk k=O
0 ckzk
I(--cz)
z2+z
2
(-)kzk1 k=O
1+ z
I + cz + c2z2 + c3z3 +*
k=O
z)+
(z)
k k
E2! z
+
k=O Elkk1z~
lz 1
(1 +
Expanded Notation 1+ z
kzk
2
1 + mz + () 2
z2 + ()
+ 3
+2..
0 + z + 2z2 + 3z3 +•"
( )2 k=o
k=0 0 1 kZ2
ez
•O•lZk E
Z3
I+ z +
Proposition 7.2 Shifting Generating Functions Let (z) = -- = ak be the generating function for the sequence {ao, aI, a2, ... }. Then
. + -•. + •.
real number, u, -0UN
(w + z)" = E
k wUkzk
00k=
E ak-mZk = ZmG(Z).
When w = 1, this reduces to 00
k=m
Theorem 7.11 Multiplying Generating Functions Let G(z) = Y =Ogkzk be two gener= Y-E o fkzk F(z) eating functions. The and generating function that is the product
of F and G
gs
•G(z) P(z) Ter7.1= F(z) Netn
=
E
fjg-jZ. j=0
Theorem 7.12 Newton's Binomial Theorem Let w and z be real or complex numbers with Iz- < 1. Then for any
(1E z)u
=
Z k=O
(U)
zk for Izk 1, version 1 of the master theorem implies that f E 0(n).
for--n > 0.
(-
-1
invocations also use two comparisons (lines 2 and 13), so c = 2. There are two recursive invocations, each on a list half the size of the original, so a = b = 2. recurrence relation for this algorithm (with respect comparisons) is f(1) = 2 and f(n) = 2f (!) + 2.
Substituting
A quick check is in order. The following table compares a few values, calculated first directly from the recurrence relation, and second from the closed-form formula. n
-1
5. The master theorem (version 2) with a = 8, b = 4, c = 5, d = 4, and v = 1 applies. Since a = 8 > 41 = bv, and log 4 (8) = 3 (4./4 = 8), f E ®(n•/n). 6. (a) The base case uses two comparisons, so d = 2. All other
+ 402 = 14
an = (-2)(-3)n + 2(4)n
1
4
an = 01 (-3)n + 02(4)n.
3 )n
+
- 2 n+l + 3
1
0
x -- 12 = (x + 3)(x - 4) = 0.
The roots are r! = -3 therefore
via recurrence relation
n
with con3. This is a linear homogeneous recurrence relation stant coefficients. The characteristic equation is 2
33T2'
n+(3 2n -1I
B1
B, D,
DI
2nao
2 (4
7.
(-2)(-3)(-4)(-5) 120 24 4! 4 8. Let A (z) -= 0 an zn be the generating function for the recurrence relation. Then
)_0 0
1
14
14
2
14
14
3
182
182
4
350
350
E n=1
anzn =22E" an- lzn n=1
5 YZ n. n=1
Thus 00
A(z) - ao = 2z E akzk +5 k=0
1-).
This simplifies to A(z)(1 -
2
1
z) =5
It is appropriate at this point to stop and determine the value of the inner summation for a few values of k. The fol-
-4,
--
lowing table helps to organize this information.
so
k A(z) = 5
1- 2z
I-
1
1
4
z
Yk
I - 2z"
Table 7.10 implies 'cc) A (Z) = 5 • - 2 kzkk
zk)\
Theorem 7.11 can be used to change this to
I
=
c
(-=0
(
2 k+
2-
Ik=
j=0
It is now possible to complete the original task. 1
In summary,
(1+
((3- 2n1+- 5) Zn.
A(z)=
z() 0C
n=O
an = 2an-l + 5
3
=
1
1
7
1 7
2
19
19
3 4
43 91
43 91
1 l+z
..I
-
4z3 + 5z44
=
Z+ Z2
Z3 +
and take the first derivative of both sides. The result is 1 (1 +z)
9. Use Theorem 7.11.
1 +z
k=O 1 - 2 z + 3z2
This can also be done by using derivatives. Start with
2 n+t-5
0
.-
)k(k + 1)zk
0=
Thus, a, = 3 • 2+ -5, for n > 0. This can be checked against the original recurrence relation for a few values.
2
(k
cc
(I + Z)2
o@D
.I
3 -4
)J(--1)-J - (-)k(k + 1).
zZ(
4
=Y (3.2k+1 - 5) zk.
(1 +z)
3
k
1k
k=0
n
2
-2
Thus, the product, (- 1)J (- l)k-J, will be - 1 whenever k is even. The sum will add k + 1 one's. This can be summarized very simply:
zk-4E'2kzk k=O
1
1
1
product, (- 1 )J(-)k-j, will be I whenever k is even. The sum will add k ± 1 one's. On the other hand, if k is odd, k - j will be odd whenever j is even and even when j is odd.
(0=00the
"ok 2j1k-j A(z)=5L k=O j=O
0
There seems to be a very simple pattern. In fact, it is easy to see how the pattern arises. When k is even, k - j will be even whenever j is even and odd whenever j is odd. Thus,
kzk.
-
_( 1 )J (_-1)k-j
_ 0-l1+2z-3z 2 ±..
2
Consequently, .
''
=
1 +z
E
l
" k(1
~
1 -k - 2z + 3z2
-j'
.
(I)+(Z)
2 ( + Z)2
-
CHAPTER
Combinatorics is a branch of mathematics that examines many seemingly unrelated ideas. These ideas are collected for several reasons. One reason is that the ideas are much more tightly connected than a casual look would indicate. It is often the case that one combinatorial object will be used to create another. Examples of this will be seen when Latin squares, finite projective planes, balanced incomplete block designs, and error-correcting codes are discussed in this chapter. A second reason for grouping these seemingly different topics under a common mathematical subdiscipline is the commonality of methods used in their study. Different branches of mathematics tend to have somewhat distinctive collections of mathematical tools and strategies. Combinatorics is characterized by the frequent use of mathematical tools such as counting, induction, constructive proofs, generating functions, and theorems from linear algebra, algebraic systems, and number theory. The third reason these topics are given a collective identity is that they tend to be concerned with a fairly small collection of broadly related questions. In particular, combinatorial topics can be categorized as problems concerned with one or more of the following three broad questions: Existence Many combinatorial topics relate to finding a special configuration of elements from some set. The existence problem seeks to determine if such a configuration is possible. If it is possible, the existence problem seeks ways to construct the configuration.
A very simple example is a search for the existence of magic squares. A magic square is an n by n matrix¹ whose entries are the complete set of positive integers $\{1, 2, \ldots, n^2\}$, arranged in such a way that the sum of every row, the sum of every column, and the sums of the two diagonals are all the same number. The following example of a 3-by-3 magic square illustrates this definition.²
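The definition can be checked mechanically. The sketch below is only an illustration: the function name `is_magic_square` is hypothetical, and the square `lo_shu` is a well-known 3-by-3 magic square chosen for the demonstration, not necessarily the one printed in the text.

```python
def is_magic_square(square):
    """Return True if `square` (a list of n rows) is an n-by-n magic square."""
    n = len(square)
    # The entries must be exactly 1, 2, ..., n^2, arranged in n rows of length n.
    entries = sorted(value for row in square for value in row)
    if any(len(row) != n for row in square) or entries != list(range(1, n * n + 1)):
        return False
    target = sum(square[0])
    rows_ok = all(sum(row) == target for row in square)
    cols_ok = all(sum(square[r][c] for r in range(n)) == target for c in range(n))
    main_diagonal = sum(square[i][i] for i in range(n))
    anti_diagonal = sum(square[i][n - 1 - i] for i in range(n))
    return rows_ok and cols_ok and main_diagonal == target and anti_diagonal == target


# Assumed example square (every row, column, and diagonal sums to 15).
lo_shu = [[8, 1, 6],
          [3, 5, 7],
          [4, 9, 2]]

print(is_magic_square(lo_shu))
print(is_magic_square([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

The first call should report a magic square and the second should not, since the rows of the second matrix have different sums.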
¹ See Appendix E for a brief introduction to matrices.
² In "Textbook-Related Links" at http://www.mathcs.bethel.edu/cgossett/DiscreteMathWithProof/ there is a link to a Web site that has much more information about magic squares.

Enumeration Sometimes it is not too difficult to construct the desired configuration. The interesting question then may be "How many distinct configurations are possible?". This is an enumeration problem.
One simple example might be the determination of the number of distinct 3-by-3 magic squares. Section 8.1 will explore additional enumeration problems.

Optimization Sometimes the configuration of interest has additional properties that enable us to distinguish among acceptable alternatives. It may be that among all configurations of a certain type, some are more useful than others. The problem is then to find an optimal configuration. The stable marriage problem (Section 1.2) is an example of this type of problem, as is the knapsack problem in Section 8.4.

The previous discussion used the word configuration many times. Although combinatorics includes more ideas than merely that of arranging objects into special configurations, exploring configurations is a major thread in the discipline. A more formal terminology to describe such arrangements is to call the configuration a combinatorial design. This concept is so important that Herbert Ryser [59, p. 2] uses it as the basis of his informal definition of combinatorics.³

    Combinatorial mathematics cuts across the many subdivisions of mathematics, and this makes a formal definition difficult. But by and large it is concerned with the study of the arrangement of elements into sets. The elements are usually finite in number, and the arrangement is restricted by certain boundary conditions imposed by the particular problem under investigation.

The boundary conditions for magic squares problems are the requirements that every number in the set $\{1, 2, \ldots, n^2\}$ appears in the magic square and that the row, column, and diagonal sums be equal. The boundary conditions for the stable marriage problem include the requirement that an assignment be stable and also the assumption that there be no ties in the individual preferences. The first sentence in Ryser's informal definition refers to the connections between combinatorics and other branches of mathematics.
As has already been mentioned, those connections are quite strong. This provides a rich assortment of mathematical ideas and tools that may be applied to solving problems in combinatorics and also permits combinatorial ideas to be used for solving problems in other branches of mathematics. However, it creates a problem when discussing combinatorics in a text aimed at lower division undergraduates. The exposition of many of the topics quickly incorporates mathematical ideas that are typically encountered in the junior and senior years of college. This limits how deeply many topics can be explored here. However, there is still a sufficiently rich assortment of accessible combinatorial ideas so that this chapter can provide a nontrivial introduction to combinatorics. Furthermore, the material on graph theory in Chapter 10 is often considered a part of combinatorics.
Quick Check 8.1
1. How many distinct ways can the numbers 1, 2, ..., 9 be arranged in a 3-by-3 matrix?
2. Prove that the common sum in an n by n magic square must be $n(n^2 + 1)/2$. (Hint: Use a combinatorial proof: count the entries in the magic square two different ways.)
3. List all subsets of three integers from {1, 2, 3, ..., 9} whose sum is 15. Conclude that 1, 3, 7, and 9 cannot appear in the center or in one of the corners, that 2, 4, 6, and 8 must be in one of the corners, and that 5 must be in the center of any 3-by-3 magic square.
4. Using the results of the previous problem, how many potential 3-by-3 magic squares are possible?
5. Find all possible 3-by-3 magic squares.

³ His informal definition goes on to describe existence and enumeration problems. Those ideas have already been discussed here and so the quotation has been truncated.
8.1 Partitions, Occupancy Problems, and Stirling Numbers
This section introduces some interesting problems in enumerative combinatorics. They are collectively gathered into a category of counting problems called occupancy problems. Many of the background ideas needed to explore these problems were introduced in Chapter 5. Two additional ideas need to be presented here: partitions of an integer and Stirling numbers of the second kind. For completeness, Stirling numbers of the first kind will also be discussed at the end of the section.
8.1.1 Partitions of a Positive Integer

Leonhard Euler was the first mathematician to make significant progress in answering a seemingly simple counting problem. The problem starts with a positive integer, n, and asks how many ways n can be written as a sum of positive integers, where the order of the summands does not matter.

EXAMPLE 8.1 Expressing 6 as a Sum of Positive Integers
The number 6 is small enough to exhaustively list all possible ways to express it as a sum of positive integers (with the order of the summands unimportant). The list (Table 8.1) shows that there are eleven ways to do this.

TABLE 8.1 All Ways to Write 6 as a Sum of Positive Integers (Order of Summands Unimportant)
6 = 6
6 = 5 + 1
6 = 4 + 2
6 = 4 + 1 + 1
6 = 3 + 3
6 = 3 + 2 + 1
6 = 3 + 1 + 1 + 1
6 = 2 + 2 + 2
6 = 2 + 2 + 1 + 1
6 = 2 + 1 + 1 + 1 + 1
6 = 1 + 1 + 1 + 1 + 1 + 1
Notice that there is only one way (each) to write 6 as a sum of exactly one term, as a sum of exactly five terms, or as a sum of exactly six terms. There are three ways to write 6 as a sum of two terms (5 + 1, 3 + 3, and 4 + 2) and three ways to write 6 as a sum of three terms. Finally, there are two ways to write it as a sum of four terms.

Example 8.1 introduces ideas that will be easier to discuss if some formal definitions and new notation are introduced. The notation introduced in the next definition is one common choice for this context, but there is no standard notation.
DEFINITION 8.1 Partition of an Integer
A partition of an integer, n, is a representation of n as a sum of positive integers, where the order of the summands is not important. The number of partitions of n is denoted by p(n). The number of partitions of n that contain exactly k summands is denoted by p(n, k). Note that p is lowercase: p(n, k) is not the permutation P(n, r).

Notice that this definition of partition is quite different from the set-oriented definition introduced on page 19. The context will almost always make clear which definition is intended.
EXAMPLE 8.2 The Number of Partitions of 6
The terminology in Definition 8.1 can be applied to Example 8.1. Thus, p(6) = 11 and

p(6, 6) = 1    p(6, 5) = 1    p(6, 4) = 2    p(6, 3) = 3    p(6, 2) = 3    p(6, 1) = 1.
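The counts in Example 8.2 can be confirmed by brute force. The following is a minimal sketch, not a method from the text: the generator `partitions` and the dictionary `by_length` are hypothetical names, and each partition is represented as a non-increasing list so that the order of the summands is ignored.

```python
def partitions(n, largest=None):
    """Yield each partition of n as a non-increasing list of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    # Choose the first (largest) summand, then partition what is left
    # using summands no bigger than the one just chosen.
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest


all_parts = list(partitions(6))

# Tally the partitions by their number of summands: by_length[k] plays the role of p(6, k).
by_length = {}
for part in all_parts:
    by_length[len(part)] = by_length.get(len(part), 0) + 1

print(len(all_parts))   # 11
print(by_length)
```

Exhaustive listing is practical only for small n, which is why the text goes on to develop a generating function and a recurrence.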
Although it is quite easy to understand what p(n) represents, it is not so easy to calculate p(n) when n gets larger than one or two digits. Some additional relationships need to be presented before any real progress can be made.

PROPOSITION 8.1 p(n) and p(n, k)
Let n be a positive integer. Then

$$p(n) = \sum_{k=1}^{n} p(n, k).$$

Proof: Let $S_k$ be the set of all partitions of n that contain exactly k summands. Then $|S_k| = p(n, k)$. In addition, the sets $\{S_1, S_2, \ldots, S_n\}$ are disjoint and their union is the set, S, of all partitions of n. Proposition 2.5 on page 29 completes the proof. □
The main result for p(n) that will be presented in this textbook is Theorem 8.1, which was first recognized by Euler. The theorem uses ideas from Section 7.4 but can be understood without reading that section.

THEOREM 8.1 A Generating Function for p(n)
Let n be a positive integer. Then p(n) is the coefficient of $z^n$ in the generating function

$$\prod_{m=1}^{\infty} \left( \sum_{i=0}^{\infty} z^{im} \right).$$

That is,

$$1 + \sum_{n=1}^{\infty} p(n) z^n = \prod_{m=1}^{\infty} \left( 1 + z^{m} + z^{2m} + z^{3m} + z^{4m} + \cdots \right).$$

EXAMPLE 8.3 Determining p(3)
Theorem 8.1 asserts that p(3) is the coefficient of $z^3$ in the power series $\prod_{m=1}^{\infty} \left( \sum_{i=0}^{\infty} z^{im} \right)$. The initial values of the product can be calculated by noticing that any term with a power of z that is larger than 3 cannot contribute to the coefficient of $z^3$. The partial calculation below does not show all the factors that equal 1 (for instance, $1 \cdot 1 \cdot 1 \cdot z^4 \cdot 1 \cdots$ is written as just $z^4$).

$$\begin{aligned}
(1 + z + z^2 + z^3 + \cdots)(1 + z^2 + z^4 + \cdots)(1 + z^3 + z^6 + \cdots) \cdots
&= 1 + z + (z^2 + z^2) + (z \cdot z^2 + z^3 + z^3) + \cdots \\
&= 1 + z + 2z^2 + 3z^3 + \cdots
\end{aligned}$$

Since the only partitions of 3 are 3 = 3, 3 = 2 + 1, and 3 = 1 + 1 + 1, the assertion in the theorem is correct in this case.
It will be helpful to look at the previous algebraic expansion in more detail. For that purpose, the exponents will be left as products.

$$\begin{aligned}
&(1 + z^{1 \cdot 1} + z^{2 \cdot 1} + z^{3 \cdot 1} + \cdots)(1 + z^{1 \cdot 2} + z^{2 \cdot 2} + \cdots)(1 + z^{1 \cdot 3} + \cdots) \cdots \\
&\qquad = 1 + z^{1 \cdot 1} + (z^{2 \cdot 1} + z^{1 \cdot 2}) + (z^{1 \cdot 1} \cdot z^{1 \cdot 2} + z^{3 \cdot 1} + z^{1 \cdot 3}) + \cdots
\end{aligned}$$

Now, let the right-hand factor in an exponent represent the distinct integers in the partition and let the left-hand factor in the exponents represent the number of copies of the right-hand factor that appear in the partition. Thus, the factor $z^{3 \cdot 4}$ would correspond to three 4s and the term $z^{2 \cdot 3} \cdot z^{3 \cdot 5}$ would correspond to the partition 21 = 3 + 3 + 5 + 5 + 5.

Terms                                                            Corresponding Partitions
$z^{1 \cdot 1}$                                                  1 = 1
$z^{2 \cdot 1} + z^{1 \cdot 2}$                                  2 = 1 + 1;  2 = 2
$z^{1 \cdot 1} \cdot z^{1 \cdot 2} + z^{3 \cdot 1} + z^{1 \cdot 3}$      3 = 1 + 2;  3 = 1 + 1 + 1;  3 = 3

The association between partitions of n and terms in the expansion of the product in Theorem 8.1 is the central idea in the proof of the theorem.
Proof of Theorem 8.1 For the duration of this proof, let $S_m = (1 + z^{m} + z^{2m} + z^{3m} + \cdots)$. Also, let $P = \prod_{m=1}^{\infty} S_m$.

Consider the coefficient, $c_n$, of $z^n$ in P, for n ≥ 1. That coefficient is obtained by adding a finite set of terms of the form $z^{i_1 m_1} \cdot z^{i_2 m_2} \cdots z^{i_k m_k}$ for some positive integer, k, with $n = i_1 m_1 + i_2 m_2 + \cdots + i_k m_k$. In addition, $m_r \ne m_s$ if $r \ne s$, since the $m_r$'s must come from different factors, $S_{m_r}$, in the original product, P.

It is possible to convert each of these terms to a partition of n. The value of $m_r$ can be used as the integer in the sum on the right-hand side of the partition and the value of $i_r$ will be the repeat factor. The equation $n = i_1 m_1 + i_2 m_2 + \cdots + i_k m_k$ guarantees that this is a partition of n. A different choice of k and of the i's and m's will result in a different partition of n. Since every such term in the expansion corresponds to a different partition, it is clear that $c_n \le p(n)$.

It is also possible to start with a partition of n and find a corresponding term in the expansion of P. Suppose the partition can be expressed as $n = \sum_{m=1}^{n} a_m m$, where $a_m$ is the number of copies of the integer m that are present in the partition. Some of the $a_m$'s may be zero. Let k be the number of nonzero $a_m$'s in the sum. Label the values of m having a nonzero $a_m$ as $m_1, m_2, \ldots, m_k$ and rename their associated $a_m$'s as $i_1, i_2, \ldots, i_k$. The partition thus corresponds to a term $z^{i_1 m_1} \cdot z^{i_2 m_2} \cdots z^{i_k m_k}$. Each $m_j$ is distinct, so the factor $z^{i_j m_j}$ is actually present in a distinct factor, $S_{m_j}$, in P. Thus, the term $z^{i_1 m_1} \cdot z^{i_2 m_2} \cdots z^{i_k m_k}$ must occur as part of the expansion. This implies that $p(n) \le c_n$. The conclusion is that $c_n = p(n)$ and the theorem has been proved. □

The infinite sums and products in Theorem 8.1 are not necessary if all that is needed is the coefficient for a particular value of n. Corollary 8.1 establishes the appropriate modification (which was implicitly used in Example 8.3).

COROLLARY 8.1 Calculating p(n)
Let n be a positive integer. Then p(n) is the coefficient of $z^n$ in the polynomial

$$\prod_{m=1}^{n} \left( \sum_{i=0}^{\lfloor n/m \rfloor} z^{im} \right).$$

Proof: The coefficient of $z^n$ is a sum involving terms of the form $z^{i_1 m_1} \cdot z^{i_2 m_2} \cdots z^{i_k m_k}$ for some positive integer, k, with $n = i_1 m_1 + i_2 m_2 + \cdots + i_k m_k$. If im > n, then $z^{im}$ cannot be a factor in such a term. Thus, m ≤ n must certainly hold. In addition, in order that im ≤ n, it is necessary that i ≤ n/m. Since i is an integer, $i \le \lfloor n/m \rfloor$ must also be true. □
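Corollary 8.1 translates directly into a short computation: multiply out the n finite factors and read off the coefficient of $z^n$. The sketch below is one possible rendering, with polynomials stored as coefficient lists indexed by exponent; the name `partition_polynomial` is hypothetical.

```python
def partition_polynomial(n):
    """Coefficients (index = exponent) of the finite product in Corollary 8.1."""
    coeffs = [1]  # the constant polynomial 1
    for m in range(1, n + 1):
        # Factor number m is 1 + z^m + z^(2m) + ... + z^(floor(n/m) * m).
        factor = [0] * (m * (n // m) + 1)
        for i in range(n // m + 1):
            factor[i * m] = 1
        # Ordinary polynomial multiplication of coeffs by factor.
        product = [0] * (len(coeffs) + len(factor) - 1)
        for a, coeff_a in enumerate(coeffs):
            for b, coeff_b in enumerate(factor):
                product[a + b] += coeff_a * coeff_b
        coeffs = product
    return coeffs


print(partition_polynomial(4))
print(partition_polynomial(6)[6])   # coefficient of z^6, which is p(6) = 11
```

Only the coefficients of $z^1$ through $z^n$ in the result are partition counts; the higher coefficients are artifacts of truncating each factor.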
EXAMPLE 8.4 Calculating p(4)
The value of p(4) is the coefficient of $z^4$ in the expression

$$\begin{aligned}
\prod_{m=1}^{4} \left( \sum_{i=0}^{\lfloor 4/m \rfloor} z^{im} \right)
&= (1 + z + z^2 + z^3 + z^4)(1 + z^2 + z^4)(1 + z^3)(1 + z^4) \\
&= 1 + z + 2z^2 + 3z^3 + 5z^4 + 5z^5 + 6z^6 + 7z^7 + 7z^8 + 6z^9 + 5z^{10} + 5z^{11} + 3z^{12} + 2z^{13} + z^{14} + z^{15}.
\end{aligned}$$

Note well: The coefficients of $z$, $z^2$, $z^3$, and $z^4$ in the expansion above are the correct values for p(1), p(2), p(3), and p(4), respectively. However, the remaining coefficients are not the values of p(n) for n = 5, 6, ..., 15. It was not really necessary to keep track of terms with exponents greater than 4.

It is now time to consider the numbers, p(n, k). A nice combinatorial proof establishes the useful recurrence relation in the next theorem.
THEOREM 8.2 A Recurrence Relation for p(n, k)
Let n and k be integers and let 0 < k < n. Then

$$p(n, k) = p(n - 1, k - 1) + p(n - k, k),$$

where p(n, k) = 0 for k > n and p(n, 0) = 0. In addition, p(n, n) = 1.
Proof: The boundary conditions will be established first. Since n > 0, there is no way that n can be expressed as a sum with no summands. Thus, p(n, 0) = 0. Also, any sum with k > n positive integer summands will have a sum that is greater than n. Thus, p(n, k) = 0 if k > n and the boundary conditions have been established. Finally, the only way to express n as a sum with n summands is to add n 1s. Thus, p(n, n) = 1.

Now consider the set, S, of all partitions of n that contain exactly k summands. Define subsets, A and B, of S by

A = {s ∈ S | s contains at least one 1 as a summand}
B = {s ∈ S | s contains no 1s as summands}.

Clearly, S = A ∪ B and A ∩ B = ∅, so {A, B} is a (set) partition of S. Thus, |S| = |A| + |B|.

For each partition in A, remove one of the 1s from the sum. The resulting partition will be a partition of n - 1 having exactly k - 1 summands. Also, any partition of n - 1 having exactly k - 1 summands can be transformed into a partition of n with k summands by adding an additional 1 to the sum. Therefore, |A| = p(n - 1, k - 1).

For each partition in B, subtract 1 from each summand. The result will still be a partition since every summand will be at least 1. The new partition will have a sum of n - k and will still have exactly k summands. Conversely, every partition of n - k having exactly k summands (for k < n) can be transformed into a partition of n having k summands by adding 1 to each summand. Therefore, |B| = p(n - k, k).

Combining these counts gives p(n, k) = |S| = |A| + |B| = p(n - 1, k - 1) + p(n - k, k). □
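The recurrence and its boundary conditions can be turned into a memoized function. This is a minimal sketch under the assumption that n is a positive integer; the names `p_exact` and `p_total` are hypothetical stand-ins for p(n, k) and p(n), with `p_total` using the sum from Proposition 8.1.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p_exact(n, k):
    """Number of partitions of n into exactly k summands (Theorem 8.2)."""
    if k <= 0 or k > n:      # boundary conditions: p(n, 0) = 0 and p(n, k) = 0 for k > n
        return 0
    if k == n:               # boundary condition: p(n, n) = 1
        return 1
    # Partitions containing a 1, plus partitions with every summand at least 2.
    return p_exact(n - 1, k - 1) + p_exact(n - k, k)


def p_total(n):
    """Number of partitions of n, summing p(n, k) over k as in Proposition 8.1."""
    return sum(p_exact(n, k) for k in range(1, n + 1))


print([p_exact(6, k) for k in range(1, 7)])
print(p_total(6))   # 11
```

Memoization matters here because the two recursive calls revisit the same (n, k) pairs many times.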
Quick Check 8.2
1. Use Theorem 8.2 and Proposition 8.1 to fill in the blank spots in Table 8.2.

TABLE 8.2 The Values of p(n) and p(n, k) for n, k ≤ 6
(Table layout: one row for each n = 1, ..., 6; one column for each k = 1, ..., 6 giving p(n, k); a final column giving p(n).)
8.1.2 Occupancy Problems

The value of p(n, k) has other interpretations. One alternative is explored in Exercise 4 of Exercises 8.1.4. Here is another interpretation. Suppose we have n identical red balls and k identical buckets. In how many ways can the balls be placed into the buckets if every bucket must receive at least one ball? The answer turns out to be p(n, k). This problem is an example of an occupancy problem. This class of counting problems is the subject of this section.⁴ Familiarity with the material in Chapter 5 is assumed.

DEFINITION 8.2 Occupancy Problems
Occupancy problems are concerned with placing objects into containers. Occupancy problems are categorized by whether the objects and containers are distinguishable or indistinguishable and by whether or not containers can be empty. Objects have typically been balls and the containers have usually been urns or cells. The more general notions of objects and containers will be used here.
EXAMPLE 8.5 A Small Occupancy Problem, Part 1
Suppose there are six objects and three containers. There are eight possible occupancy problems, depending on whether objects and containers are distinguishable and whether or not containers can be empty. The cases will be abbreviated using the notation O: object, C: container, D: distinguishable, I: indistinguishable, ∅: containers may be empty, ¬∅: containers may not be empty.

OD CD ∅ Suppose the objects are six driver's licenses (for six different people) and the containers are three bins labeled "$50 fine," "$100 fine," and "no fine." A court clerk randomly places licenses into the bins and then the judge issues the fines to the hapless drivers. There are 3^6 = 729 ways to do this since each license has three possible locations and the assignments are independent.

OD CD ¬∅ Suppose the objects are slips of paper with the names of six contestants in a piano contest and the containers are clipboards labeled "outstanding," "commendable," and "participant." The judges confer after each contestant has performed and then attach the name slip to the appropriate clipboard. The sponsoring organization has decided that every clipboard must contain at least one name (judging is relative to all participants, not to some absolute standard). One tempting counting strategy is to first make sure that the three clipboards each have one name, and then distribute the remaining names randomly. There are P(6, 3) = 120 ways to choose a distinct name for each clipboard. The remaining three names can be distributed independently in 3^3 = 27 ways for a total of 120 · 27 = 3240 ways.

However, this can't be correct because there would be only 729 ways to attach names to clipboards if the restriction that every clipboard needed to have at least one name was dropped. The error occurs because this strategy has many arrangements that are counted more than once. For example, placing names a, b, c, d on the outstanding clipboard, e on the commendable clipboard, and f on the participant clipboard can be done in four ways (depending on which of a, b, c, and d is chosen in the first round).

This occupancy problem is nontrivial. The strategy that will eventually be used is to first determine in how many ways the six names can be placed onto unlabeled clipboards, and then multiply by 3! (the number of ways to label the three clipboards). There are 90 ways to complete the first phase,⁵ so there are 540 ways to distribute the contestants' names onto the clipboards.

OI CD ∅ Suppose there are six oranges and three Christmas stockings hanging by the fireplace. The stockings are labeled "Tom," "Mary," and "Spot." In how many ways can the oranges be placed into the stockings if stockings can be empty? Instead of placing the oranges directly into the stockings, a marker can first be used to label each orange with the name of the stocking into which it will be placed. There are three distinct labels, and each label can be used up to six times. Therefore the task is to select a set of six labels, from a set of three distinct possibilities. This is thus a combination with repetition problem. Counting formula 4 on page 222 can be used with n = 3 and r = 6. There are C(3 + 6 - 1, 6) = 28 ways to fill the stockings.

OI CD ¬∅ Suppose Bob, Sue, and Bob Junior also have labeled Christmas stockings and six oranges. However, they have decided that each person will get at least one orange in his or her stocking. In how many ways can the oranges be placed into the stockings? The best strategy is to first place an orange into each stocking. There are then n = 3 labels to write on r = 3 oranges. Using the strategy from the previous case, there are C(3 + 3 - 1, 3) = C(5, 3) = C(5, 2) = 10 ways to fill the stockings.

The remaining four cases will be illustrated in Example 8.6 on page 412.

⁴ The organization that will be used was inspired by the presentation and arrangement in [61].
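The four counts in Example 8.5 are small enough to verify by exhaustive enumeration. The sketch below is an illustration rather than a technique from the text: distinguishable objects are modeled as ordered tuples of container labels, indistinguishable objects as multisets of labels, and all variable names are hypothetical.

```python
from itertools import combinations_with_replacement, product

labels = range(3)          # three distinguishable containers
all_labels = set(labels)

# Distinguishable objects: object j goes into container assignment[j].
assignments = list(product(labels, repeat=6))
all_count = len(assignments)
# Keep only the assignments in which every container is used at least once.
onto_count = sum(1 for a in assignments if set(a) == all_labels)

# Indistinguishable objects: only the multiset of labels written on them matters.
multisets = list(combinations_with_replacement(labels, 6))
ident_count = len(multisets)
ident_onto_count = sum(1 for m in multisets if set(m) == all_labels)

print(all_count, onto_count, ident_count, ident_onto_count)   # 729 540 28 10
```

The second number also confirms that the tempting count of 3240 overcounts: only 540 of the 729 unrestricted assignments use all three clipboards.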
⁵ Justification for this claim will be presented later in Example 8.7 on page 414.

Example 8.5 introduced the case of distributing n distinguishable objects into k distinguishable containers, where every container must receive at least one object. The solution to that case was presented without justification. Before that justification can be given, some new notation needs to be introduced.

DEFINITION 8.3 Stirling Numbers of the Second Kind
The number of ways to distribute n distinguishable objects into k indistinguishable containers with every container receiving at least one object is denoted S(n, k). The numbers, S(n, k), are called the Stirling numbers of the second kind.

The Stirling numbers are named after James Stirling, a Scottish mathematician who lived in the 1700s and was a contemporary of Isaac Newton. Section 8.1.3 will more fully develop the Stirling numbers of the second kind, as well as introducing the Stirling numbers of the first kind.

A complete set of solutions to the eight categories of occupancy problems is presented in the next theorem and also inside the back cover.

THEOREM 8.3 Occupancy Problems
Table 8.3 lists solutions to all eight categories of occupancy problems, where n represents the number of objects and k represents the number of containers.
TABLE 8.3 The Number of Ways to Place n Objects into k Containers

                                  Containers
Objects                      Distinguishable            Indistinguishable
Distinguishable      ∅:      $k^n$                      $\sum_{i=1}^{k} S(n, i)$
                     ¬∅:     $k!\,S(n, k)$              $S(n, k)$
Indistinguishable    ∅:      $\binom{k+n-1}{n}$         $\sum_{i=1}^{k} p(n, i)$
                     ¬∅:     $\binom{n-1}{k-1}$         $p(n, k)$
∅: containers may be empty    ¬∅: containers must contain at least one object

Proof: Each of the eight cases will be examined separately. The cases will be abbreviated using the notation O: object, C: container, D: distinguishable, I: indistinguishable, ∅: containers may be empty, ¬∅: containers may not be empty. The cases will be examined starting in the upper-right square of the table and moving in a counterclockwise direction.

OD CI ¬∅ The number of ways to distribute n distinguishable objects into k indistinguishable containers with every container receiving at least one object is S(n, k) (true by definition). The proof need not show how to calculate S(n, k).

OD CI ∅ This case differs from the previous case because some of the containers may be empty. In fact, the objects can all be placed into just one container, or just two of the containers, or any subcollection of the containers (including the full set of containers). There are S(n, i) ways to distribute the objects into exactly i containers. Since the containers are indistinguishable and since the subcases of distributions into exactly i or exactly j containers are mutually exclusive if i ≠ j, a simple sum will complete the count.

OD CD ∅ Each of the objects can be placed into any container, with no restrictions. There are k choices for the first object, k choices for the second object, ..., and k choices for the nth object. Thus, there are $k^n$ ways to distribute the objects.

OD CD ¬∅ Erase the labels (or other distinguishing characteristics) from the containers. There are S(n, k) ways to distribute the objects into these unlabeled containers. There are now k! ways to place the labels back onto the containers. These two phases are independent, so there are k! S(n, k) ways to distribute the objects in this case.

OI CD ∅ The containers are labeled, but the objects are indistinguishable. Place the objects into the containers, and then, for each object, write the label of its container on the object. There will now be a collection of n written labels, from a set of k possible labels. This is a combination with repetition problem. Counting formula 4 implies that there are $C(k + n - 1, n) = \binom{k+n-1}{n}$ ways to do this.⁶

OI CD ¬∅ Since objects are indistinguishable, it is possible to first place one object into each container. There are now n - k indistinguishable objects to place into the k containers, with no additional restraints. The previous case implies that there are $\binom{k + (n-k) - 1}{n-k} = \binom{n-1}{n-k} = \binom{n-1}{(n-1)-(n-k)} = \binom{n-1}{k-1}$ ways to do this. (Proposition 5.2 on page 234 was used in the middle step.)

OI CI ¬∅ There is a one-to-one correspondence between distributions of n indistinguishable objects into k indistinguishable containers and partitions of the integer n with exactly k summands. To see this, first consider a distribution of n indistinguishable objects into k indistinguishable containers. Count the number of objects in each container. This will result in k numbers whose sum is n. That is, a partition of n with exactly k summands (recall that the order of the summands is unimportant). Now consider a partition of n with exactly k summands. The individual summands can be used to determine the number of objects placed into corresponding containers. There is no implied order to the summands, so the corresponding containers are indistinguishable.

OI CI ∅ This case differs from the previous case because some of the containers may be empty. In fact, the objects can all be placed into just one container, or just two of the containers, or any subcollection of the containers (including all of them). There are p(n, i) ways to distribute the indistinguishable objects into exactly i containers. Since the containers are indistinguishable and the subcases with different choices for i are mutually exclusive, a simple sum will complete the count. □

Table 8.3 contains solutions for all eight cases of the occupancy problem. The only missing piece is a practical method for finding values for S(n, k). That deficiency will be fully redressed in Section 8.1.3. The next Quick Check will provide a small collection of values.

⁶ The n in counting formula 4 is the number, k, of labels and r is the number, n, of objects.
Quick Check 8.3
1. Use Definition 8.3 and exhaustive enumeration to calculate S(n, k), for 1 ≤ n ≤ 4 and 1 ≤ k ≤ n.
EXAMPLE 8.6 A Small Occupancy Problem, Part 2
The four cases that were omitted in Example 8.5 will be explored here.

OD CI ¬∅ Six coworkers have all arrived at the airport and need to get to the convention center. Their company has arranged for three taxis to meet them at the airport. It doesn't make sense for one of the pre-paid taxis to drive to the convention center without a passenger. In how many ways can the coworkers ride to the convention center? Notice that they probably care about with whom they share a taxi. (Some people have more in common to discuss during the ride.) The number of arrangements is S(6, 3), which was previously claimed to be 90.

OD CI ∅ A hostess has invited some friends over for tea and crumpets. Altogether, there are six people. The hostess has arranged seating in three areas: the living room, the dining room, and the patio. Each area is large enough to hold all six people. If the hostess and her guests consider the location unimportant, but the grouping of friends to be significant, in how many ways can the people be arranged? In this problem, the people are the objects and the locations are the containers. Since a location can be empty, there are S(6, 1) + S(6, 2) + S(6, 3) = 122 ways for the people to be distributed.⁷

OI CI ¬∅ Josephine has designed an interactive art project. The project consists of three large bowls, placed at the vertices of an equilateral triangle on the floor. The triangle has been inscribed in a circle, and people can walk freely around the circle. Josephine has provided a set of six identical wax oranges. Art patrons can complete the installation by placing the wax oranges into the bowls. Josephine has left instructions to specify that every bowl should have at least one orange (in order for the project to have "balance"). How many distinct arrangements are possible? In this problem, both the bowls and the oranges are indistinguishable. Since every bowl must contain at least one orange, there will be p(6, 3) = 3 different arrangements: {4, 1, 1}, {3, 2, 1}, {2, 2, 2}.

OI CI ∅ Even though Josephine has specified that at least one orange should be placed in each bowl, there is nothing to stop people from violating that request. In how many ways can the oranges be arranged if bowls can be empty? There are p(6, 1) + p(6, 2) + p(6, 3) = 1 + 3 + 3 = 7 possible arrangements: {6, 0, 0}, [{5, 1, 0}, {4, 2, 0}, {3, 3, 0}], [{4, 1, 1}, {3, 2, 1}, {2, 2, 2}].

⁷ The values of S(6, 1) and S(6, 2) can be found using techniques introduced in Section 8.1.3.
Quick Check 8.4
Use Theorem 8.3, Table 8.48 on page 499, and Table 8.47 on page 499 to answer the following questions.
1. I have four CDs that I wish to give away, and three friends who are potential recipients. In how many ways can I give away the CDs if
   (a) The CDs are all different, and each friend should receive at least one CD.
   (b) The CDs are all different, and I feel no obligation to give every friend at least one CD.
2. I have four new Sacagawea $1 coins that I plan to distribute to my three charming nieces. In how many ways can I do this?
3. A farmer has four sons and three identical potato fields. Each field takes eight person-days to harvest. In how many ways can the sons be assigned fields to harvest on the first day of the harvest?
8.1.3 Stirling Numbers

Some of the basic properties of the Stirling numbers of the first and second kinds will be examined in this section. As a convenience, Definition 8.3 is repeated here.

DEFINITION 8.3 Stirling Numbers of the Second Kind
The number of ways to distribute n distinguishable objects into k indistinguishable containers with every container receiving at least one object is denoted S(n, k). The numbers, S(n, k), are called the Stirling numbers of the second kind.

An alternative view of S(n, k) is given in the next proposition.

PROPOSITION 8.2 Set Partitions and S(n, k)
Let A be a set with n elements. The number of ways to partition A into a collection of exactly k nonempty subsets is S(n, k).

Proof: Think of the elements of A as objects, and the subsets in the partition as containers. Since every subset is nonempty, this is an occupancy problem with distinguishable objects, indistinguishable containers (the sets in the partition have no implied order), and in which each container must receive at least one object. Theorem 8.3 completes the proof. □

The recurrence relation in the next theorem provides a useful mechanism for calculating S(n, k).
THEOREM 8.4 A Recurrence Relation for S(n, k)
Let n and k be integers and let 0 < k < n. Then

$$S(n, k) = S(n - 1, k - 1) + k\,S(n - 1, k),$$

where S(n, k) = 0 for k > n and S(n, 1) = 1. In addition, S(n, n) = 1 and S(n, 0) = 0 for n > 0.
Proof: The boundary conditions will be established first. The only way n > 0 objects can be distributed into a single container is to place each object into that container and hence S(n, 1) = 1. Also, it is impossible to distribute n objects into more than n containers and still require each container to receive at least one object. Thus, S(n, k) = 0 if k > n, establishing the boundary conditions. Also, the only way to distribute n objects into n containers if each container must receive at least one object is to place exactly one object in each container. Thus, S(n, n) = 1. If n > 0, then it is impossible to distribute the n objects into no containers, so S(n, 0) = 0 when n > 0.

A combinatorial proof will be given for the recurrence relation. Consider a collection of n distinguishable objects. Since they are distinguishable, choose one of the objects and paint it red (or some other unique color). Let D be the set of all distributions of these objects into exactly k containers. The elements of D can be partitioned into two subsets:⁸

A = {d ∈ D | the red object is the only object in its container}
B = {d ∈ D | the red object is not the only object in its container}.

Then D = A ∪ B and A ∩ B = ∅. Hence, |D| = |A| + |B|.

The distributions in A are in one-to-one correspondence with distributions of n - 1 distinguishable objects into k - 1 indistinguishable containers. The correspondence can be achieved by removing (or adding) the red object and its enclosing container. Thus, |A| = S(n - 1, k - 1).

Let C be the set of all distributions of n - 1 distinguishable (but nonred) objects into k indistinguishable containers. The relationship between distributions in B and distributions in C is more complex. If we start with a distribution in C, then adding the red object to any one of the k containers will create a distribution in B. Because the objects are distinguishable, each of these k possible distributions will be distinct. Conversely, if d ∈ B, then removing the red object will certainly produce a distribution, x ∈ C. But d is not the only distribution in B that will produce x. Since each of the containers in d contains at least one object, moving the red object to any other container will still result in a distribution in B. There are k such distributions, each resulting in x when the red object is removed. Thus, there is a k-to-1 correspondence between B and C. Consequently, |B| = k S(n - 1, k).

Combining these counts gives S(n, k) = |D| = |A| + |B| = S(n - 1, k - 1) + k S(n - 1, k). □
Calculating S(6, 3)
It is now possible to calculate S(6, 3). The calculation will use Theorem 8.4 and Table 8.48 on page 499.

\[
\begin{aligned}
S(6,3) &= S(5,2) + 3 \cdot S(5,3) \\
&= [S(4,1) + 2 \cdot S(4,2)] + 3 \cdot [S(4,2) + 3 \cdot S(4,3)] \\
&= [1 + 2 \cdot 7] + 3 \cdot [7 + 3 \cdot 6] \\
&= 90
\end{aligned}
\]
■
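The calculation of S(6, 3) can also be checked mechanically. The following is a minimal Python sketch of the recurrence S(n, k) = S(n − 1, k − 1) + k · S(n − 1, k) with the boundary conditions established in the proof; the function name `stirling2` and the convention S(0, 0) = 1 are assumptions made for illustration, not part of the text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind, S(n, k), via the recurrence."""
    if n == k:
        return 1  # S(n, n) = 1 (S(0, 0) = 1 is an assumed convention)
    if k == 0 or k > n:
        return 0  # S(n, 0) = 0 for n > 0 and S(n, k) = 0 for k > n
    # The red object is alone in its container, or it joins one of k containers.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


print(stirling2(6, 3))  # 90
```

Memoizing with `lru_cache` keeps the recursion from recomputing the same S(n, k) values, which mirrors filling in a table row by row.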
⁸ Exercise 22 on page 420 asks you to illustrate this for n = 4 and k = 2. It might be helpful to do that problem as you read the proof.

8.1 Partitions, Occupancy Problems, and Stirling Numbers

Stirling Numbers of the First Kind
The remainder of this section will introduce the Stirling numbers of the first kind and discuss their connection to the Stirling numbers of the second kind. The results are not about counting, so the rest of this section is not directly related to enumerative combinatorics.

To set the context, consider a function, f(x), that represents some process that can be measured. The function is unknown, but it is possible to approximate the function using the measured values. The simplest approximation technique is to construct an interpolating polynomial, p(x), which matches the measured values of f. Suppose, for example, that the process has been measured at 0, 1, 2, 3, and 4 seconds. Table 8.4 shows the measured values.

TABLE 8.4 The Measured Values of Some Unknown Function, f
x       0     1     2     3     4
f(x)    a_0   a_1   a_2   a_3   a_4

One fairly simple way to construct the interpolating polynomial is to write it as

\[
\begin{aligned}
p(x) = a_0 &+ (a_1 - a_0)\,x + \frac{a_2 - 2a_1 + a_0}{2}\,x(x-1) + \frac{a_3 - 3a_2 + 3a_1 - a_0}{6}\,x(x-1)(x-2) \\
&+ \frac{a_4 - 4a_3 + 6a_2 - 4a_1 + a_0}{24}\,x(x-1)(x-2)(x-3).
\end{aligned}
\tag{8.1}
\]
It should be easy to convince yourself that p and f have the same values when x ∈ {0, 1, 2, 3, 4}. The coefficients were derived by successively substituting⁹ the values of x into the prototype polynomial

\[
p(x) = b_0 + b_1 x + b_2\,x(x-1) + b_3\,x(x-1)(x-2) + b_4\,x(x-1)(x-2)(x-3).
\]

Thus, $b_0$ can be found by substituting 0 for x. Then $b_1$ can be found by substituting 1 for x, giving the equation $a_1 = a_0 + b_1 \cdot 1$. Continuing in this fashion will eventually produce all the coefficients.

It is possible to convert this polynomial into the standard form as a sum of powers of x. The result will be a polynomial of degree 4:

\[
\begin{aligned}
p(x) = a_0 &+ \frac{-25a_0 + 48a_1 - 36a_2 + 16a_3 - 3a_4}{12}\,x + \frac{35a_0 - 104a_1 + 114a_2 - 56a_3 + 11a_4}{24}\,x^2 \\
&+ \frac{-5a_0 + 18a_1 - 24a_2 + 14a_3 - 3a_4}{12}\,x^3 + \frac{a_0 - 4a_1 + 6a_2 - 4a_3 + a_4}{24}\,x^4.
\end{aligned}
\tag{8.2}
\]

The main idea needed here is that any polynomial of degree 4 or less can be expressed as a sum (with appropriate coefficients) of elements of $B_1 = \{1, x, x^2, x^3, x^4\}$ or as a sum (with different coefficients) of elements in $B_2 = \{1, x, x(x-1), x(x-1)(x-2), x(x-1)(x-2)(x-3)\}$. A sum of this type is called a linear combination.
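The successive-substitution idea can be illustrated with a short Python sketch. It computes the coefficients of the prototype polynomial from repeated differences of the measured values (the i-th coefficient is the i-th forward difference divided by i!) and then confirms that the polynomial reproduces the measurements. The function names and the sample measurements are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def newton_coefficients(values):
    """Return b_0, b_1, ... for p(x) = sum of b_i * x(x-1)...(x-i+1),
    where values[i] is the measurement at x = i."""
    diffs = [Fraction(v) for v in values]
    coeffs = []
    factorial = 1
    for i in range(len(values)):
        if i > 0:
            factorial *= i
        coeffs.append(diffs[0] / factorial)  # i-th forward difference / i!
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial written in terms of falling products."""
    total, falling = Fraction(0), Fraction(1)
    for i, b in enumerate(coeffs):
        total += b * falling
        falling *= x - i
    return total


samples = [3, 1, 4, 1, 5]  # hypothetical measurements a_0, ..., a_4
b = newton_coefficients(samples)
print(b[2] == Fraction(samples[2] - 2 * samples[1] + samples[0], 2))
print([evaluate(b, x) for x in range(5)] == samples)
```

Exact `Fraction` arithmetic is used so that the divisions by 2, 6, and 24 introduce no rounding error.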
DEFINITION 8.4 Linear Combination
Let $e_1, e_2, \ldots, e_k$ be expressions. A linear combination of $\{e_1, e_2, \ldots, e_k\}$ is an expression of the form

\[
c_1 e_1 + c_2 e_2 + \cdots + c_k e_k,
\]

where $\{c_1, c_2, \ldots, c_k\}$ is a set of constants.
The polynomial transformation presented just before Definition 8.4 is an example of transforming a linear combination of elements in $B_2$ into a linear combination of elements in $B_1$. It is also always possible to transform a linear combination of elements in $B_1$ into an equivalent linear combination of elements in $B_2$.¹⁰

There is an alternative to brute force polynomial multiplication that will transform equation (8.1) into equation (8.2). The idea is to first find expressions that represent each of the polynomials 1, x, x(x − 1), x(x − 1)(x − 2), x(x − 1)(x − 2)(x − 3) as a sum of the polynomials 1, x, $x^2$, $x^3$, $x^4$. Table 8.5 shows the result.

⁹ A more general method for calculating the coefficients, using divided differences, can be found in most numerical methods textbooks.
¹⁰ If you have had a course in linear algebra, you will recognize $B_1$ and $B_2$ as alternative bases for the set of all polynomials over R having degree 4 or less. The sums are just linear combinations of the basis elements. The transformations being discussed are the standard change of basis transformations.

TABLE 8.5 Transforming x(x − 1)(x − 2) ⋯ (x − k) into a Linear Combination of x, $x^2$, …, $x^{k+1}$

Product                        Linear Combination
x                              x
x(x − 1)                       −x + x^2
x(x − 1)(x − 2)                2x − 3x^2 + x^3
x(x − 1)(x − 2)(x − 3)         −6x + 11x^2 − 6x^3 + x^4
The linear combination

\[
c_0 + c_1 x + c_2\,x(x-1) + c_3\,x(x-1)(x-2) + c_4\,x(x-1)(x-2)(x-3)
\]

can then be written as

\[
c_0 + c_1 x + c_2(-x + x^2) + c_3(2x - 3x^2 + x^3) + c_4(-6x + 11x^2 - 6x^3 + x^4)
\]
and this simpler expression can then be expanded.

The coefficients for the reverse transformation can also be calculated. That is, each of the polynomials in 1, x, $x^2$, $x^3$, $x^4$ can be written as a linear combination of the polynomials 1, x, x(x − 1), x(x − 1)(x − 2), x(x − 1)(x − 2)(x − 3). Table 8.6 shows the results.

TABLE 8.6 Transforming $x^n$ into a Linear Combination of Expressions of the Form x(x − 1)(x − 2) ⋯ (x − k)

x^n     Linear Combination
x       x
x^2     x + x(x − 1)
x^3     x + 3x(x − 1) + x(x − 1)(x − 2)
x^4     x + 7x(x − 1) + 6x(x − 1)(x − 2) + x(x − 1)(x − 2)(x − 3)
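The entries of Table 8.5 can be reproduced by multiplying out the products one linear factor at a time. The sketch below is a minimal Python illustration; polynomials are stored as coefficient lists with the lowest power first, and the helper names are hypothetical.

```python
def multiply_by_linear(poly, c):
    """Multiply a polynomial (coefficients, lowest power first) by (x - c)."""
    result = [0] * (len(poly) + 1)
    for power, coeff in enumerate(poly):
        result[power + 1] += coeff  # the x * (coeff * x^power) part
        result[power] -= c * coeff  # the -c * (coeff * x^power) part
    return result


def falling_product(n):
    """Coefficients of x(x - 1)...(x - n + 1), lowest power first."""
    poly = [1]
    for c in range(n):
        poly = multiply_by_linear(poly, c)
    return poly


for n in range(1, 5):
    print(n, falling_product(n))
```

For n = 4 the list is `[0, -6, 11, -6, 1]`, which corresponds to −6x + 11x² − 6x³ + x⁴.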
This long detour has finally come back to Stirling numbers. If the coefficients in the right-hand side of Table 8.6 are examined, they bear a striking resemblance to the numbers in Table 8.48 on page 499. In order to demonstrate that this is not a coincidence, some additional notation is needed.

DEFINITION 8.5 The Falling Factorial, $(x)_n$
The falling factorial is denoted $(x)_n$ and is defined by $(x)_0 = 1$ and

\[
(x)_n = \prod_{i=0}^{n-1} (x - i) = x(x-1)(x-2)\cdots(x-n+1) \quad \text{for } n \ge 1.
\]

The notation in Definition 8.5 is the most commonly used in combinatorics. However, the same notation means something else (but very similar) in other contexts. One suggested alternative notation for the falling factorial is $x^{\underline{n}}$. The claim that the Stirling numbers of the second kind are the coefficients in the transformation of $x^n$ to a linear combination of falling factorials can now be formally stated and proved.
Expressing $x^n$ as a Linear Combination of Falling Factorials
Let n be a positive integer. Then

\[
x^n = \sum_{k=1}^{n} S(n,k)\,(x)_k.
\]
Proof: The proof is by mathematical induction. If n = 1, then $x^1 = x = (x)_1$, so the theorem is true for the base step. Suppose that $x^{n-1} = \sum_{k=1}^{n-1} S(n-1,k)\,(x)_k$ for some n > 1. Then

\[
\begin{aligned}
x^n &= x \cdot x^{n-1} \\
&= x \sum_{k=1}^{n-1} S(n-1,k)\,(x)_k && \text{inductive hypothesis} \\
&= \sum_{k=1}^{n-1} S(n-1,k)\,(x-k)\,(x)_k + \sum_{k=1}^{n-1} S(n-1,k)\,k\,(x)_k && \text{add and subtract the same expression} \\
&= \sum_{k=1}^{n-1} S(n-1,k)\,(x)_{k+1} + \sum_{k=1}^{n-1} k\,S(n-1,k)\,(x)_k && \text{Definition 8.5 and commutativity} \\
&= \sum_{j=2}^{n} S(n-1,j-1)\,(x)_j + \sum_{k=1}^{n-1} [S(n,k) - S(n-1,k-1)]\,(x)_k && \text{change of index and Theorem 8.4} \\
&= \sum_{k=2}^{n} S(n-1,k-1)\,(x)_k + \sum_{k=1}^{n-1} S(n,k)\,(x)_k - \sum_{k=1}^{n-1} S(n-1,k-1)\,(x)_k && \text{change of index, distributivity, and summation properties} \\
&= \sum_{k=1}^{n-1} S(n,k)\,(x)_k + [S(n-1,n-1)\,(x)_n - S(n-1,0)\,(x)_1] && \text{commutativity and summation properties} \\
&= \sum_{k=1}^{n} S(n,k)\,(x)_k && S(n-1,n-1) = S(n,n) = 1 \text{ and } S(n-1,0) = 0.
\end{aligned}
\]

This completes the induction. □
Illustrating the Proof of Theorem 8.5
It may be helpful to use a concrete example to illustrate the inductive step in the proof of Theorem 8.5. For this purpose, let n = 3 and recall that S(2, 1) = S(2, 2) = 1, S(3, 1) = S(3, 3) = 1, and S(3, 2) = 3. The inductive hypothesis assumes that $x^2 = \sum_{k=1}^{2} S(2,k)\,(x)_k = 1 \cdot x + 1 \cdot x(x-1)$. Then

\[
\begin{aligned}
x^3 &= x \cdot [1 \cdot x + 1 \cdot x(x-1)] \\
&= [1 \cdot (x-1) \cdot x + 1 \cdot (x-2) \cdot x(x-1)] + [1 \cdot 1 \cdot x + 1 \cdot 2 \cdot x(x-1)] \\
&= [1 \cdot x(x-1) + 1 \cdot x(x-1)(x-2)] + [1 \cdot 1 \cdot x + 2 \cdot 1 \cdot x(x-1)] \\
&= [1 \cdot x(x-1) + 1 \cdot x(x-1)(x-2)] + [(1 - 0)\,x + (3 - 1)\,x(x-1)] \\
&= [1 \cdot x(x-1) + 1 \cdot x(x-1)(x-2)] + [1 \cdot x + 3 \cdot x(x-1)] - [0 \cdot x + 1 \cdot x(x-1)] \\
&= [1 \cdot x + 3 \cdot x(x-1)] + [1 \cdot x(x-1)(x-2) - 0 \cdot x] \\
&= 1 \cdot x + 3 \cdot x(x-1) + 1 \cdot x(x-1)(x-2) \\
&= S(3,1)\,(x)_1 + S(3,2)\,(x)_2 + S(3,3)\,(x)_3.
\end{aligned}
\]
■
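Because Theorem 8.5 is an identity between polynomials, it can be spot-checked numerically by evaluating both sides at several integers. The following Python sketch does this for n up to 6; the helper names are hypothetical, and passing a finite set of test points is a sanity check only, not a substitute for the induction proof.

```python
def stirling2(n, k):
    """S(n, k) from the recurrence S(n, k) = S(n-1, k-1) + k * S(n-1, k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def falling(x, k):
    """The falling factorial (x)_k = x(x-1)...(x-k+1), with (x)_0 = 1."""
    product = 1
    for i in range(k):
        product *= x - i
    return product


def falling_expansion(n, x):
    """Right-hand side of the theorem: the sum of S(n, k) * (x)_k for k = 1..n."""
    return sum(stirling2(n, k) * falling(x, k) for k in range(1, n + 1))


identity_holds = all(
    falling_expansion(n, x) == x ** n
    for n in range(1, 7)
    for x in range(-5, 6)
)
print(identity_holds)
```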
Theorem 8.5 characterizes the coefficients that express $x^n$ as a linear combination of falling factorials. The corresponding result for expressing $(x)_n$ as a linear combination of powers of x is provided by the next definition.

DEFINITION 8.6 Stirling Numbers of the First Kind
The coefficients of the expansion of $(x)_n$ as a linear combination of powers of x are the Stirling numbers of the first kind and are denoted by s(n, k). That is,

\[
(x)_n = \sum_{k=0}^{n} s(n,k)\,x^k.
\]

The notation for the Stirling numbers of the first kind is very similar to the notation for the Stirling numbers of the second kind. They differ only in the case of the letter s. The Stirling numbers of the first kind have no direct significance in counting problems: some of them are negative integers. The current mechanism for finding the numbers, s(n, k), is not very convenient. It involves expanding the algebraic expression $(x)_n$. A more suitable method is presented in the next theorem.
A Recurrence Relation for s(n, k)
Let n and k be positive integers. Then

\[
s(n,k) = s(n-1,k-1) - (n-1)\,s(n-1,k),
\]

where s(0, 0) = 1, s(n, 0) = 0 for n > 0, and s(n, k) = 0 if n < k. In addition, s(n, n) = 1.
Proof: The base conditions are straightforward to verify. First, $(x)_0 = 1 = s(0,0)\,x^0 = s(0,0)$. Also, for n > 0, x is always a factor of $(x)_n$, so s(n, 0) = 0 must hold (or else a term with no x's would appear in the sum in Definition 8.6). If k > n and s(n, k) ≠ 0, then a nonzero term with a power of x greater than n would appear in the expansion of the degree-n polynomial $(x)_n$. This is impossible, so s(n, k) = 0 whenever k > n. The polynomial $(x)_n = x(x-1)(x-2)\cdots(x-n+1)$ has a coefficient of 1 on the $x^n$ term after it is expanded, so s(n, n) = 1.
Now assume that n > 0 and notice that

\[
(x)_n = \sum_{k=0}^{n} s(n,k)\,x^k
\]

and also

\[
(x)_n = (x - (n-1))\,(x)_{n-1} = \sum_{k=0}^{n-1} s(n-1,k)\,x^{k+1} - \sum_{k=0}^{n-1} (n-1)\,s(n-1,k)\,x^k.
\]

Thus

\[
\begin{aligned}
\sum_{k=0}^{n} s(n,k)\,x^k &= \sum_{k=0}^{n-1} s(n-1,k)\,x^{k+1} - \sum_{k=0}^{n-1} (n-1)\,s(n-1,k)\,x^k \\
&= \sum_{j=1}^{n} s(n-1,j-1)\,x^j - \sum_{k=1}^{n-1} (n-1)\,s(n-1,k)\,x^k - (n-1)\,s(n-1,0)\,x^0 \\
&= \sum_{k=1}^{n} s(n-1,k-1)\,x^k - \sum_{k=1}^{n-1} (n-1)\,s(n-1,k)\,x^k && \text{since } (n-1)\,s(n-1,0) = 0 \\
&= \sum_{k=1}^{n} s(n-1,k-1)\,x^k - \sum_{k=1}^{n} (n-1)\,s(n-1,k)\,x^k && \text{since } s(n-1,n) = 0 \\
&= \sum_{k=1}^{n} [s(n-1,k-1) - (n-1)\,s(n-1,k)]\,x^k.
\end{aligned}
\]

The recurrence relation is established by comparing the coefficients of $x^k$ for k > 0 on both sides of the equation. □
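The recurrence makes the numbers s(n, k) easy to tabulate without expanding $(x)_n$ by hand. The sketch below is a minimal Python illustration using the base conditions verified in the proof; the function name `stirling1` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling1(n, k):
    """Stirling number of the first kind, s(n, k), via the recurrence."""
    if n == 0 and k == 0:
        return 1  # s(0, 0) = 1
    if k == 0 or k > n:
        return 0  # s(n, 0) = 0 for n > 0 and s(n, k) = 0 for k > n
    return stirling1(n - 1, k - 1) - (n - 1) * stirling1(n - 1, k)


# Coefficients of x^0, ..., x^4 in x(x - 1)(x - 2)(x - 3).
print([stirling1(4, k) for k in range(5)])  # [0, -6, 11, -6, 1]
```

The printed row agrees with the expansion −6x + 11x² − 6x³ + x⁴ of x(x − 1)(x − 2)(x − 3).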
8.1.4 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.
1. Use Corollary 8.1 to find p(5). Show your work, but do not do more than is necessary.
2. Use Theorem 8.2 and Table 8.47 to calculate the following values of p(n, k).
(a) p(7, 4)
(b) [D] p(8, 5)
(c) p(8, 3)
n             1    2    3     4      5         6
1!·2!⋯n!      1    2    12    288    34560     24883200
L(n)          1    2    12    576    161280    812851200
Latin Squares and the Design of Experiments Latin squares can be used to help researchers design experiments that minimize bias in the results. In particular, if the experiment has two factors that might influence the result, a Latin square can be of use. A New Comic Strip Suppose that a newspaper editor has decided to replace one of the comic strips in the daily paper. He has found three potential replacement strips but would like to learn a bit about subscriber preferences before making a decision. He has decided to run each candidate replacement for a week and then see which is most popular. It has occurred to him that the order in which the strips appear might be a significant factor in how subscribers react. He would therefore like to divide the subscribers into subgroups with each possible ordering of the three candidate strips assigned to a subgroup. This requires 3! = 6 subgroups. The editor has realized that there is a practical problem with this scheme. The problem is that the newspaper would need to print six editions of each paper, but currently there are only three editions: metro, northern suburbs, and southern suburbs. Creating the extra editions is expensive. The editor has found an acceptable alternative that still controls for the "order of appearance" effect. Suppose the candidate comic strips are denoted as A, B, and C, with editions denoted M, N, and S. The following Latin square uses comic strips as row labels and editions as column labels. The entry in the ith row and jth column represents the week in which comic strip i will appear in edition j. In this arrangement, each strip will appear in some edition first, in another edition second, and in the other edition last. Every edition will run each strip for a week.
      M   N   S
A     1   2   3
B     2   3   1
C     3   1   2
This scheme is not as exhaustively thorough as the original scheme (for instance, no subscribers will read the strips in the order ACB). However, it is far better than arbitrarily picking only one ordering to run in all editions. ■

The previous method of experimental design can be used in many situations, as the next example illustrates.
8.2 Latin Squares; Finite Projective Planes
An Agricultural Experiment
An agricultural researcher is trying to breed a new variety of popcorn. She has three university-owned farms in the area that each have a small section of field she may use. She wants to study the effects of three different fertilizer treatment regimes on each of three new varieties of seed. She knows that the three fields have different soil characteristics (among other differences), so the choice of field is significant when interpreting the results. In addition, she is assuming that the variety of seed and the fertilizer regimes are also significant. She does not have the funding to run separate experiments to test seeds and fertilizers, so she needs to study both factors simultaneously.

An exhaustive comparison would have each seed variety matched with each fertilizer regime in each field. This would require each section of field to be subdivided into nine plots. Her allocation of field space is not large enough for this to be practical; she only has space for three plots per field. She has decided to use a Latin square design to minimize bias in the results. In particular, she wants to ensure that every variety of seed is matched with every fertilizer regime, that every variety of seed is planted in a plot of each field, and that every fertilizer regime is used in some plot of every field. The rows in the Latin square represent the seed varieties: A, B, and C. The columns represent the fertilizer regimes: L, M, and N. The numeric entries in the design represent the fields.

      L   M   N
A     1   3   2
B     2   1   3
C     3   2   1
The next example is more complex.
Washing Machines
An industrial chemist is studying the effectiveness of a number of options for washing clothes. He wishes to compare three brands of washing machine, three brands of detergent, as well as three water temperatures. He also wishes to factor in the effects of water hardness. He will use three levels of hardness. He would like to match these characteristics as fairly as possible, but does not have the time or resources to examine every conceivable combination (there are $3^4 = 81$ combinations). He has decided that the washing machine brand (A, B, and C) and detergent brand (X, Y, and Z) should be represented as the rows and columns, respectively, of a design. He has ranked the water temperatures as 1, 2, and 3 and also ranked water hardness as 1, 2, and 3. He is searching for a design with ordered pairs, (t, h), as entries. An entry of (t, h) in row i and column j means that temperature t and hardness h will be used together in a brand i washing machine with detergent j.

His first attempt at a design is shown next. He has realized that the set of temperatures should form a Latin square and the set of water hardness ratings should also form a Latin square. These Latin squares will ensure that each temperature (water hardness) will be used with each brand of washer and each brand of detergent.

      X       Y       Z
A     (1,2)   (2,1)   (3,3)
B     (2,1)   (3,3)   (1,2)
C     (3,3)   (1,2)   (2,1)
The problem with this design is that temperature 1 and hardness 1 are never tested together, nor are temperature 1 and hardness 3. The researcher revised the design and produced the following arrangement. This time, each temperature is matched with each hardness. In fact, the design ensures that the two elements in every pair of categories (washer, detergent, temperature, hardness) share a common test. Thus, brands A and Y appear together in a test, brand B and hardness 3 share a common test, temperature 1 and hardness 2 occur together, and so on.

      X       Y       Z
A     (1,1)   (2,3)   (3,2)
B     (2,2)   (3,1)   (1,3)
C     (3,3)   (1,2)   (2,1)
■
Orthogonal Latin Squares Example 8.12 introduced one of the most important aspects of Latin squares: orthogonality. The next definition provides the formal context. The definition uses notational conventions from Appendix E.
DEFINITION 8.8 Orthogonal Latin Squares
Let $L_1 = (a_{ij})$ and $L_2 = (b_{ij})$ be two Latin squares of order n. $L_1$ and $L_2$ are said to be orthogonal if the set of ordered pairs $\{(a_{ij}, b_{ij}) \mid i = 1, 2, \ldots, n \text{ and } j = 1, 2, \ldots, n\}$ contains $n^2$ distinct ordered pairs. That is, $(a_{ij}, b_{ij}) \ne (a_{rs}, b_{rs})$ unless i = r and j = s.
A collection of k Latin squares of order n is said to be mutually orthogonal if every pair in the collection is orthogonal.
Example 8.12 Revisited
Example 8.12 already has established that the following pair of Latin squares is orthogonal. The collection of ordered pairs is also shown.

First square:
1 2 3
2 3 1
3 1 2

Second square:
1 3 2
2 1 3
3 2 1

Ordered pairs:
(1,1)  (2,3)  (3,2)
(2,2)  (3,1)  (1,3)
(3,3)  (1,2)  (2,1)

Notice that all nine ordered pairs are distinct; there are no repetitions. Compare this with the pair of Latin squares that the chemist initially tried. The set of ordered pairs, {(1, 2), (2, 1), (3, 3)}, contains only three elements.

First square:
1 2 3
2 3 1
3 1 2

Second square:
2 1 3
1 3 2
3 2 1

Ordered pairs:
(1,2)  (2,1)  (3,3)
(2,1)  (3,3)  (1,2)
(3,3)  (1,2)  (2,1)
■
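The orthogonality test in Definition 8.8 is simple to automate: collect the ordered pairs from corresponding positions and count how many distinct pairs appear. The following Python sketch applies that test to the chemist's two designs; the function and variable names are hypothetical, and squares are assumed to be stored as lists of rows over the symbols 1 through n.

```python
def is_latin_square(square):
    """True if every symbol 1..n appears exactly once in each row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if the n*n ordered pairs (a[i][j], b[i][j]) are all distinct."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


temperature = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
first_hardness = [[2, 1, 3], [1, 3, 2], [3, 2, 1]]    # initial attempt
revised_hardness = [[1, 3, 2], [2, 1, 3], [3, 2, 1]]  # revised design

print(are_orthogonal(temperature, first_hardness))    # False
print(are_orthogonal(temperature, revised_hardness))  # True
```

Storing the pairs in a set makes duplicates collapse automatically, so the initial attempt yields only three distinct pairs while the revised design yields all nine.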
Mutually Orthogonal Latin Squares
The following collection is a set of three mutually orthogonal Latin squares of order 4. You should take the time to verify that each pair is orthogonal.

L1:
1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1

L2:
1 2 3 4
4 3 2 1
2 1 4 3
3 4 1 2

L3:
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
■
A brief examination of Example 8.9 on page 421 will convince you that there are no pairs of orthogonal Latin squares of order 2. It is possible to find a pair of orthogonal Latin squares of order 3 (Example 8.13) and also possible to find a set of three mutually orthogonal Latin squares of order 4 (Example 8.14). An exhaustive search through the collection of all 12 Latin squares of order 3 (see the Quick Check 8.5 solution on page 500) will prove that there is no collection of three or more mutually orthogonal Latin squares of order 3. Let m(n) represent the maximum number of mutually orthogonal Latin squares of order n. Table 8.8 shows what has been established so far in this discussion.

TABLE 8.8 Partial Results for m(n)

n       2    3    4
m(n)    1    2    at least 3
Exhaustively verifying (or disproving) that m(4) = 3 seems like too much work. Perhaps a better approach is to attempt a conjecture based on the (admittedly meager) results so far. The obvious guesses are that m(n) = n − 1 or m(n) ≥ n − 1 or m(n) ≤ n − 1. The following example will provide the key insight needed to identify and prove the proper conjecture.

Transforming Sets of Orthogonal Latin Squares
Consider the pair of orthogonal Latin squares from a previous example.

L1:
1 2 3
2 3 1
3 1 2

L2:
1 3 2
2 1 3
3 2 1
It is possible to modify L2 so that its first row is the same as the first row of L1 and so that the new pair is still orthogonal. The transformation is very easy. Notice that the entry in the first row, second column of L2 is a 3 but a 2 is desired. Simply change every 3 to a 2 and vice versa. The result is shown next.

L1:
1 2 3
2 3 1
3 1 2

L2':
1 2 3
3 1 2
2 3 1
Because 2s and 3s have been uniformly exchanged, L2' is still a Latin square (every number appears exactly once in each row and column). Also, the exchange has not changed the property that the set of ordered pairs from corresponding positions still contains all nine possible ordered pairs. This is true since, for example, the old pair (1, 2) becomes (1, 3) after the transformation, but the old (1, 3) simultaneously becomes (1, 2).

In a similar fashion, consider the following three mutually orthogonal Latin squares of order 4.

L1:
3 2 1 4
2 3 4 1
1 4 3 2
4 1 2 3

L2:
4 1 2 3
3 2 1 4
1 4 3 2
2 3 4 1

L3:
2 4 3 1
3 1 2 4
1 3 4 2
4 2 1 3
These can be transformed into a new set of mutually orthogonal Latin squares in which the first rows are all "1 2 3 4". This can be done in three steps (for this example). In step 1, interchange 1 and 3 in L1, interchange 1 and 4 in L2, and interchange 1 and 2 in L3. The resulting Latin squares are shown.

L1':
1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1

L2':
1 4 2 3
3 2 4 1
4 1 3 2
2 3 1 4

L3':
1 4 3 2
3 2 1 4
2 3 4 1
4 1 2 3
In step 2, interchange 2 and 4 in both L2' and L3'.

L1':
1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1

L2'':
1 2 4 3
3 4 2 1
2 1 3 4
4 3 1 2

L3'':
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
In the final step, interchange 3 and 4 in L2''.

L1':
1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1

L2''':
1 2 3 4
4 3 2 1
2 1 4 3
3 4 1 2

L3'':
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
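The step-by-step interchanges can be collapsed into a single relabeling per square: send whatever symbol sits in column c of the first row to c. The Python sketch below uses that one-shot relabeling (a composition of interchanges, not the example's exact sequence of steps) and then confirms that the normalized squares are still mutually orthogonal. The starting squares are an assumption: they are the order-4 squares of this example as reconstructed from the described interchanges, and all function names are hypothetical.

```python
from itertools import combinations


def normalize_first_row(square):
    """Relabel symbols so that the first row becomes 1, 2, ..., n."""
    relabel = {old: new for new, old in enumerate(square[0], start=1)}
    return [[relabel[value] for value in row] for row in square]


def are_orthogonal(a, b):
    """True if the n*n ordered pairs (a[i][j], b[i][j]) are all distinct."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n


# Assumed starting squares (reconstructed from the example's interchange steps).
squares = [
    [[3, 2, 1, 4], [2, 3, 4, 1], [1, 4, 3, 2], [4, 1, 2, 3]],
    [[4, 1, 2, 3], [3, 2, 1, 4], [1, 4, 3, 2], [2, 3, 4, 1]],
    [[2, 4, 3, 1], [3, 1, 2, 4], [1, 3, 4, 2], [4, 2, 1, 3]],
]

normalized = [normalize_first_row(sq) for sq in squares]
for sq in normalized:
    print(sq)
print(all(are_orthogonal(a, b) for a, b in combinations(normalized, 2)))
```

After normalization, the second-row, first-column entries of the three squares are 2, 4, and 3, which is exactly the situation that rules out a fourth square in the discussion that follows.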
It is now possible to make a clever observation about L1', L2''', and L3''. Suppose we want to find a fourth Latin square of order 4, L4, to add to this mutually orthogonal collection. The same kinds of transformations can be used to ensure that the first row of L4 is "1 2 3 4" before it is added to the collection. What number can be in the second row, first column of L4? It can't be a 1, or else L4 would not be a Latin square. If it is a 2, then L4 and L1' would not be orthogonal since there would be at least two copies of the ordered pair (2, 2). There are similar problems with a 3 or a 4 in that position of L4. The conclusion is that no such Latin square can exist. There can be at most 3 mutually orthogonal Latin squares of order 4. ■

Theorem 8.7 on page 427 will present the major result about sets of mutually orthogonal Latin squares: There are at most n − 1 mutually orthogonal Latin squares of order n. Two lemmas will be used in the proof of the theorem.

LEMMA 8.1 Interchanging Numbers Preserves Latin Squares
Let L be a Latin square of order n and let i, j ∈ {1, 2, ..., n}. If every copy of i in L is changed to a j, and simultaneously, every j in L is changed to an i, then the resulting n-by-n matrix is still a Latin square.
Proof: Since L is a Latin square, every number in {1, 2, ..., n} appears exactly once in each row and in each column. After the interchange, there will still be exactly one i and one j in each row and column (but now in different positions). Every other number, k ∉ {i, j}, will still be in the same positions in the matrix, so will also appear exactly once in each row and in each column. Consequently, every number in {1, 2, ..., n} still appears exactly once in each row and in each column. Therefore the new matrix will be a Latin square. □

LEMMA 8.2 Interchanging Numbers Preserves Orthogonality
Let L1 and L2 be orthogonal Latin squares of order n and let i, j ∈ {1, 2, ..., n}. Form an n-by-n matrix L1' by changing every copy of i in L1 to a j, and simultaneously, changing every j in L1 to an i. All other entries in L1 are copied unchanged into L1'. Then L1' and L2 are also orthogonal Latin squares.

Proof: Lemma 8.1 ensures that L1' is a Latin square. Let M be the matrix of ordered pairs from L1 and L2 and let M' be the matrix of ordered pairs from L1' and L2. Since L1 and L2 are orthogonal, the ordered pairs (i, k) and (j, k) (for every choice of k) both appear somewhere in M. After interchanging i and j, the ordered pair (i, k) will appear in M' at the position that (j, k) is at in M. Similarly, the ordered pair (j, k) will appear in M' at the position occupied by (i, k) in M. That is, the ordered pairs (i, k) and (j, k) will swap places, for every choice of k. Ordered pairs that don't have either i or j as first element do not move. The interchange has moved some ordered pairs to other positions, but no ordered pairs have disappeared and no new ordered pairs have been created. Therefore, L1' and L2 are orthogonal. □
Illustrating the Proof of Lemma 8.2
Consider the following Latin squares. The matrix of ordered pairs is also shown.

L1:
1 2 3
2 3 1
3 1 2

L2:
1 3 2
2 1 3
3 2 1

Ordered pairs:
(1,1)  (2,3)  (3,2)
(2,2)  (3,1)  (1,3)
(3,3)  (1,2)  (2,1)

Suppose the 1s and 2s are exchanged in L1, creating L1'. In the matrix of ordered pairs, the elements $(a_{11}, b_{11}) = (1, 1)$ and $(a_{33}, b_{33}) = (2, 1)$ have traded positions. Also, (1, 2) and (2, 2) have swapped places and (1, 3) and (2, 3) have traded places.

L1':
2 1 3
1 3 2
3 2 1

L2:
1 3 2
2 1 3
3 2 1

Ordered pairs:
(2,1)  (1,3)  (3,2)
(1,2)  (3,1)  (2,3)
(3,3)  (2,2)  (1,1)
■
Maximal Sets of Mutually Orthogonal Latin Squares
Let {L1, L2, ..., Lk} be a collection of mutually orthogonal Latin squares of order n > 1. Then k ≤ n − 1.

… 1 from n − 1 mutually orthogonal Latin squares of order n.
Quick Check 8.7
1. Use Construction 8.1 (Figure 8.7) to create a finite projective plane of order 2. Include a visual diagram of the plane.
Finite Projective Planes from Mutually Orthogonal Latin Squares
If a set of n − 1 mutually orthogonal Latin squares of order n exists, then a finite projective plane of order n also exists.
Proof: The proof shows that Construction 8.1 produces a finite projective plane of order n. The proof is not elegant, but has the advantage of using only elementary ideas. More elegant proofs exist, but these proofs use some advanced mathematical tools which have not been introduced in this text.

The Rows of A Are Orthogonal
As a preliminary observation, notice that every pair of rows in A (the matrix in Construction 8.1) is orthogonal, in the sense that every one of the $n^2$ possible vertical pairs in the list $\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}1\\2\end{pmatrix}, \ldots, \begin{pmatrix}1\\n\end{pmatrix}, \begin{pmatrix}2\\1\end{pmatrix}, \begin{pmatrix}2\\2\end{pmatrix}, \ldots, \begin{pmatrix}n\\n\end{pmatrix}$ appears exactly once if one row is placed directly above the other. This claim can be justified by considering a few cases.

Rows 1 and 2: Rows 1 and 2 have the orthogonality property by construction. The vertical pairs are even arranged in the natural ordering.

Rows 1 or 2 and row j, with j > 2: Row 1 has the orthogonality property with row j because every cluster of n columns has the same number in row 1, but each number in {1, 2, ..., n} in the corresponding columns of row j. Each first entry of the vertical pairs appears in a distinct cluster of row 1. The vertical pairs will be $\begin{pmatrix}1\\1\end{pmatrix}, \begin{pmatrix}1\\2\end{pmatrix}, \ldots, \begin{pmatrix}1\\n\end{pmatrix}$ in some order from the first cluster, $\begin{pmatrix}2\\1\end{pmatrix}, \begin{pmatrix}2\\2\end{pmatrix}, \ldots, \begin{pmatrix}2\\n\end{pmatrix}$ in some order from the second cluster, etc. All $n^2$ vertical pairs will be present.

Notice that row 2 contains one copy of each of the numbers 1, 2, ..., n in every cluster of n columns. Since row j in A is defined by the rows of a Latin square, every cluster of n columns in that row will be a row of a Latin square. Thus, every cluster of n columns in row j will also contain exactly one copy of each of the numbers 1, 2, ..., n. Row 2 also has the orthogonality property with row j. This is true because the source for row j is the Latin square, $L_{j-2}$, and any Latin square contains every number in {1, 2, ..., n} exactly once in each of its columns. The element in the kth column of row 2 is always a k and the kth column of $L_{j-2}$ is distributed among the kth positions of each column cluster (relative to the start of the cluster), so the second elements in each pair will be distinct.¹⁷

Rows j and k, with j, k > 2: The orthogonality of these rows is a direct result of $L_{j-2}$ and $L_{k-2}$ being members of a set of mutually orthogonal Latin squares. The numbers are arranged in one row of $n^2$ elements instead of n rows of n elements each, but the set of pairs is the same in either arrangement when checking orthogonality.

Now that the orthogonality of the rows has been verified, the proof may proceed. Notice that Construction 8.1 ensures that the lines each contain n + 1 points. Every point is on exactly n + 1 lines because the points in $\{1, 2, \ldots, n^2\}$ each appear in exactly one of the lines produced from a row of A, and A has n + 1 rows.¹⁸ The points $\{\infty_1, \infty_2, \ldots, \infty_{n+1}\}$ are each in $L_\infty$. In addition, $\infty_j$ appears in each of the n lines $L_{j,c}$, for c = 1, 2, ..., n, but in no other lines. The construction creates $n^2 + n + 1$ points and $n^2 + n + 1$ lines. If the construction has produced a finite projective plane, then it must be of order n. It remains to verify that the points and lines in $\mathcal{F}$ actually constitute a finite projective plane. This will be done by directly verifying the three axioms.

¹⁷ For example, the second positions of the clusters in row j are the elements in the second column of $L_{j-2}$. These positions, when matched with row 2 of A, will produce (probably in a different order) the vertical pairs $\begin{pmatrix}2\\1\end{pmatrix}, \begin{pmatrix}2\\2\end{pmatrix}, \ldots, \begin{pmatrix}2\\n\end{pmatrix}$.
¹⁸ For example, row j of A contains some number, i, in the kth column. The number k will consequently be on line $L_{j,i}$ and no other line produced by row j.

FPP1 The proof that every pair of points is on exactly one common line will be completed by looking at three cases.

$\infty_i$ and $\infty_j$: The only common line for these two points is $L_\infty$.

k and $\infty_j$: The only common line for these two points is $L_{j,i}$, where $a_{jk} = i$.

k and j with k ≠ j: For points k and j to be on at least one common line, the matrix A must contain some row with the same number, i, in both column k and column j.
           ...   j   ...   k   ...
Row r      ...   i   ...   i   ...
If k ≡ j (mod n), then this will occur in row 2 of A. If k and j are in the same cluster of columns, it will occur in row 1. Suppose that neither of these cases applies. Then the positions in the original Latin squares, corresponding to columns k and j in A, cannot be in the same row or the same column of the Latin squares. Lemma 8.5 (Exercises 8.2.4, Exercise 32 on page 442) implies that there is some row of A where columns k and j have the same value. Therefore, points k and j must be on at least one common line.

Now suppose that points k and j are on two common lines. Then there must be a pair, $L'_{r_1,c_1}$ and $L'_{r_2,c_2}$, of partial lines with $\{k, j\} \subseteq (L'_{r_1,c_1} \cap L'_{r_2,c_2})$. Suppose $r_1 \ne r_2$. Then the matrix A looks like this:

            ...   k     ...   j     ...
Row r_1     ...   c_1   ...   c_1   ...
Row r_2     ...   c_2   ...   c_2   ...
which violates the orthogonality property established at the beginning of this proof. Therefore $r_1 = r_2$. Since $L'_{r,c_1}$ and $L'_{r,c_2}$ are disjoint if $c_1 \ne c_2$, it must be that $c_1 = c_2$, so the assumed pair of lines common to k and j is really a single line.

FPP2 Cases will be used (once again) to show that every pair of lines contains exactly one common point.

$L_\infty$ and $L_{r,c}$: These two lines contain only the point $\infty_r$ in common.

$L_{r,c_1}$ and $L_{r,c_2}$ with $c_1 \ne c_2$: These two lines contain only $\infty_r$ in common, since $L'_{r,c_1}$ and $L'_{r,c_2}$ are disjoint.

$L_{r_1,c_1}$ and $L_{r_2,c_2}$ with $r_1 \ne r_2$: Suppose $c_1 \ne c_2$ and $L_{r_1,c_1}$ and $L_{r_2,c_2}$ contain the points k and j in common (they can't contain a common point of the form $\infty_i$). Then the following table must represent part of A, violating the orthogonality property.

            ...   k     ...   j     ...
Row r_1     ...   c_1   ...   c_1   ...
Row r_2     ...   c_2   ...   c_2   ...

If instead $c_1 = c_2$, then A must contain the following configuration, again violating the orthogonality property.

            ...   k     ...   j     ...
Row r_1     ...   c_1   ...   c_1   ...
Row r_2     ...   c_1   ...   c_1   ...
Therefore, $L_{r_1,c_1}$ and $L_{r_2,c_2}$ can have only one point in common. Since $r_1 \ne r_2$, that point cannot be in the form $\infty_i$. It remains to show that $L_{r_1,c_1}$ and $L_{r_2,c_2}$ are not disjoint. By the orthogonality property, the vertical pair $\begin{pmatrix}c_1\\c_2\end{pmatrix}$ must appear in some column of A when rows $r_1$ and $r_2$ are compared. Suppose this occurs in column k. Then A must look like the following.

            ...   k     ...
Row r_1     ...   c_1   ...
Row r_2     ...   c_2   ...

The lines $L_{r_1,c_1}$ and $L_{r_2,c_2}$ both contain the point k, showing that they are not disjoint.
FPP3 Since n > 1, there are at least three rows and at least four columns in A. Consider the lines constructed in steps 3 and 4. Points 1 and 2 are both on $L_{1,1}$ and points n + 1 and n + 2 are both on $L_{1,2}$. Also, points 1 and n + 1 are both on $L_{2,1}$ and points 2 and n + 2 are both on $L_{2,2}$. Since by FPP1, no two points can be on more than one common line, it is not difficult to verify that {1, 2, n + 1, n + 2} is a set of 4 points in $\mathcal{F}$ for which no three are on a common line. □

The reverse construction is also possible.

EXAMPLE 8.20 Orthogonal Latin Squares from a Finite Projective Plane
It is possible to create a pair of orthogonal Latin squares of order 3 from the finite projective plane of order 3 shown in Figure 8.8.

[Figure 8.8: Constructing orthogonal Latin squares from a finite projective plane.]
The first step is to create a table (Table 8.11) with the $n^2 + n + 1 = 13$ lines arranged in five rows. The bottom row will contain only one line, arbitrarily chosen here to be the line $L_{13}$ = {10, 11, 12, 13}. The other four rows are arranged by the presence of one of the points on the special line in the last row.

TABLE 8.11 Step 1

    10    {1, 2, 3, 10}    {4, 5, 6, 10}    {7, 8, 9, 10}
    11    {1, 5, 9, 11}    {2, 6, 7, 11}    {3, 4, 8, 11}
    12    {1, 4, 7, 12}    {2, 5, 8, 12}    {3, 6, 9, 12}
    13    {1, 6, 8, 13}    {2, 4, 9, 13}    {3, 5, 7, 13}
          {10, 11, 12, 13}
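The claim that these 13 lines form a finite projective plane can be checked mechanically. The following is a minimal sketch, assuming the lines are exactly those listed in Table 8.11; the function names `satisfies_fpp1` and `satisfies_fpp2` are illustrative and do not come from the text.

```python
from itertools import combinations

# The 13 lines of the order-3 plane, copied from Table 8.11.
LINES = [
    {1, 2, 3, 10}, {4, 5, 6, 10}, {7, 8, 9, 10},
    {1, 5, 9, 11}, {2, 6, 7, 11}, {3, 4, 8, 11},
    {1, 4, 7, 12}, {2, 5, 8, 12}, {3, 6, 9, 12},
    {1, 6, 8, 13}, {2, 4, 9, 13}, {3, 5, 7, 13},
    {10, 11, 12, 13},
]
POINTS = range(1, 14)


def satisfies_fpp1(lines, points):
    """FPP1: every pair of distinct points lies on exactly one common line."""
    return all(
        sum(1 for line in lines if p in line and q in line) == 1
        for p, q in combinations(points, 2)
    )


def satisfies_fpp2(lines):
    """FPP2: every pair of distinct lines shares exactly one point."""
    return all(len(a & b) == 1 for a, b in combinations(lines, 2))


print(satisfies_fpp1(LINES, POINTS), satisfies_fpp2(LINES))
```

Both checks are brute force over all pairs, which is adequate for a plane this small. FPP3 is not checked here; the set {1, 2, 4, 5} is one choice of four points with no three on a common line.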
In the next step, the line $L_{13}$ and all points on it will be removed. This will leave $n^2 + n = 12$ partial lines and $n^2 = 9$ points, with the partial lines arranged in an n + 1 = 4 by n = 3 table, T, with the columns labeled by the numbers 1 through n = 3. Notice (Table 8.12) that the partial lines in each row have been sorted by smallest elements. Thus, {2, 5, 8} appears before {3, 6, 9} in row three because 2 is less than 3.

TABLE 8.12 Step 2a

        1            2            3
    {1, 2, 3}    {4, 5, 6}    {7, 8, 9}
    {1, 5, 9}    {2, 6, 7}    {3, 4, 8}
    {1, 4, 7}    {2, 5, 8}    {3, 6, 9}
    {1, 6, 8}    {2, 4, 9}    {3, 5, 7}
The reduced table (Table 8.12) corresponds to the reduced geometric construct in Figure 8.9.

[Figure 8.9: The reduced geometric construct.]

Notice that the partial lines in row 1 list the points in their natural order: $1, 2, 3, \ldots, n^2$. If this were not so, it would be easy to go back to the original finite projective plane and rename the points so that the points in row 1 would be in natural order. The construction also requires the second row to be in a particular pattern. In this case, the desired pattern is {1, 4, 7}, {2, 5, 8}, {3, 6, 9}. This can also be achieved by a more selective renaming of points. The key idea is to swap the names of pairs of points, with the condition that a point can only exchange names with a point that is on the same line in row 1. Thus, 4 and 5 can exchange names without modifying the line {4, 5, 6} in row 1. The desired pattern is achieved by the following circular renaming of the original points: 4 → 6 → 5 → 4, 7 → 8 → 9 → 7. The renaming changes the set {4, 5, 6} to {6, 4, 5}, which is the same set as {4, 5, 6}. The convention is to write the set in sorted order. The result is shown in Table 8.13.

TABLE 8.13 Step 2b

        1            2            3
    {1, 2, 3}    {4, 5, 6}    {7, 8, 9}
    {1, 4, 7}    {2, 5, 8}    {3, 6, 9}
    {1, 6, 8}    {2, 4, 9}    {3, 5, 7}
    {1, 5, 9}    {2, 6, 7}    {3, 4, 8}
The third step will transform T into an n + 1 = 4 by $n^2 = 9$ matrix, A (Table 8.14). The columns of A will be labeled by the points (1 through 9) and the rows will be derived from the rows in T. The entry in row i, column j of A will be the label of the column of T for which the point j appears in a line at row i of T. Thus, row 3 of A will have a 2 in columns 2, 4, and 9, since the line {2, 4, 9} appears in the second column of the third row of T.

TABLE 8.14 Step 3

    1 2 3    4 5 6    7 8 9
    -----------------------
    1 1 1    2 2 2    3 3 3
    1 2 3    1 2 3    1 2 3
    1 2 3    2 3 1    3 1 2
    1 2 3    3 1 2    2 3 1
The columns of A have been separated into three clusters of three columns each. The bottom n - 1 rows will form the Latin squares. Each of these rows can be copied cluster-by-cluster into n by n matrices (Table 8.15).

TABLE 8.15 Step 4

    L1 =  1 2 3        L2 =  1 2 3
          2 3 1              3 1 2
          3 1 2              2 3 1
It is easy to verify that $L_1$ and $L_2$ are a pair of orthogonal Latin squares of order 3. ■

The steps in Example 8.20 can be generalized. Construction 8.2 (Figure 8.10) describes how to start with a finite projective plane of order n and construct a set of n - 1 mutually orthogonal Latin squares.
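Steps 3 and 4 of the example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table T is taken from Table 8.13 after both renamings, and the helper names (`build_matrix`, `rows_to_squares`, `is_latin`, `are_orthogonal`) are hypothetical, not notation from the text.

```python
from itertools import product

# Table T after both renamings (Table 8.13).
# T[i][c] is the partial line in row i + 1, column c + 1.
T = [
    [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}],
    [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}],
    [{1, 6, 8}, {2, 4, 9}, {3, 5, 7}],
    [{1, 5, 9}, {2, 6, 7}, {3, 4, 8}],
]
n = 3


def build_matrix(table, n):
    """Step 3: entry (i, j) is the label of the column of T whose
    partial line in row i contains the point j."""
    return [
        [next(c + 1 for c, line in enumerate(row) if point in line)
         for point in range(1, n * n + 1)]
        for row in table
    ]


def rows_to_squares(matrix, n):
    """Step 4: drop rows 1 and 2, then cut each remaining row
    into n clusters of n entries; each cluster is one row of a square."""
    return [[row[j * n:(j + 1) * n] for j in range(n)] for row in matrix[2:]]


def is_latin(square, n):
    symbols = set(range(1, n + 1))
    return (all(set(r) == symbols for r in square)
            and all(set(c) == symbols for c in zip(*square)))


def are_orthogonal(s1, s2, n):
    pairs = {(s1[i][j], s2[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


A = build_matrix(T, n)
L1, L2 = rows_to_squares(A, n)
print(L1)  # [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
print(L2)  # [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(is_latin(L1, n), is_latin(L2, n), are_orthogonal(L1, L2, n))
```

The sketch starts from T rather than from the full plane, so the two renaming steps are assumed to have been done by hand, as in the example.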
Quick Check 8.8
1. Use Construction 8.2 to create a Latin square of order 2, starting with the Fano plane.

[Figure: the Fano plane, with labeled points.]

THEOREM 8.10 Mutually Orthogonal Latin Squares from Finite Projective Planes
If a finite projective plane of order n exists, then a set of n - 1 mutually orthogonal Latin squares of order n also exists.
Construction 8.2 Constructing a Collection of n - 1 Mutually Orthogonal Latin Squares of Order n > 1 from a Finite Projective Plane of Order n

Let $\mathcal{F}$ be a finite projective plane of order n, consisting of the $n^2 + n + 1$ points $1, 2, \ldots, n^2 + n, n^2 + n + 1$, and the $n^2 + n + 1$ lines $L_1, L_2, \ldots, L_{n^2+n}, L_{n^2+n+1}$. By suitably renaming points if necessary, assume that $L_{n^2+n+1} = \{n^2 + 1, n^2 + 2, \ldots, n^2 + n, n^2 + n + 1\}$.

Step 1 Create a table that contains n + 2 rows. The final row will consist only of $L_{n^2+n+1}$. Row i, for i = 1, 2, ..., n + 1, will consist of the n lines other than $L_{n^2+n+1}$ that contain the point $n^2 + i$.

Step 2 Create a table, T, having n + 1 rows and n columns by deleting $L_{n^2+n+1}$ and all the points it contains from the previous table. If necessary, rename points or swap rows so that the first row is in numeric order; that is, it consists of the partial lines

$$\{1, 2, \ldots, n\},\ \{n + 1, n + 2, \ldots, 2n\},\ \ldots,\ \{(n - 1)n + 1, (n - 1)n + 2, \ldots, n^2\}.$$

Now rename points again so that the second row consists of the pattern

$$\{1, n + 1, 2n + 1, \ldots, (n - 1)n + 1\},\ \{2, n + 2, 2n + 2, \ldots, (n - 1)n + 2\},\ \ldots,\ \{n, 2n, \ldots, n^2\}.$$

This must be done by renaming a point using only points it is on a common partial line with in row 1. Label the columns of T with the numbers 1 through n.

Step 3 Create an n + 1 by $n^2$ matrix, $A = (a_{ij})$, from T. The columns of A will be labeled with the numbers 1 through $n^2$. Set $a_{ij} = k$ if the partial line in row i, column k of T contains the point j. Divide the columns of A into n clusters of n columns each.

Step 4 A will contain the pattern "11...1 22...2 ... nn...n" in row 1 and the pattern "12...n 12...n ... 12...n" in row 2. Remove those two rows, forming the n - 1 by $n^2$ matrix, A'. The n - 1 rows of A' will be used to form the Latin squares, with the ith row forming the Latin square $L_i$. More specifically, the jth row of $L_i$ will be the jth cluster in row i of A'.

Figure 8.10 Constructing a collection of n - 1 mutually orthogonal Latin squares of order n > 1 from a finite projective plane of order n.

Proof: The proof will be accomplished by showing that Construction 8.2 always works. Most of the proof will be concerned with the resulting matrices in step 4. However, a few claims need to be verified for the earlier steps.

Step 1 claims it is possible to construct a table with n + 2 rows, where all but the final row contain n lines, each with a single point common to all lines in that row. This is possible because the points in $L_{n^2+n+1}$ cannot be contained in any other common line (FPP1). On the other hand, each of the points $n^2 + 1, \ldots, n^2 + n + 1$ is on exactly n + 1 lines.

Step 2 results in a table, T, with n + 1 rows, each containing n partial lines of n points each. Neither renaming changes any essential features of the original finite projective plane, $\mathcal{F}$. The second renaming is done by exchanging names in such a way that row 1 remains unchanged (recall that the order in which elements are listed inside a set is not important). The second renaming is always possible because in row 2 every point, x, must appear with points that are not on a common line with x in row 1 (or else FPP1 would be violated). This can only happen if the partial lines in row 2 each contain exactly one point from each partial line in row 1. It is therefore possible to rename elements within partial lines in row 1 of T so that the partial lines in row 2 have the desired pattern.

The matrix, A, in step 3 will contain n clusters per row because each row of T contains n partial lines. Each cluster will contain n columns because each partial line in T contains n points. Notice that row i of A must contain each of the numbers 1 through n exactly n times apiece. This is because each column label in T is assigned to the n points in the partial line residing in that column of row i in T. In addition, each point in the set $\{1, 2, \ldots, n^2\}$ appears exactly once per row of T. (Suppose some point, m, were in two partial lines of row i of T. Add $n^2 + i$ back to the partial lines, re-creating two of the lines in $\mathcal{F}$. These two lines in $\mathcal{F}$ would contain the pair of common points m and $n^2 + i$, contradicting FPP1 and FPP2.) Step 2 assures that the pattern "11...1 22...2 ... nn...n" will appear in row 1 of A and the pattern "12...n 12...n ... 12...n" will appear in row 2.

It now must be shown that each of rows 3 through n + 1 forms a Latin square and that those Latin squares are mutually orthogonal.

The rows of A' form Latin squares Consider the transition from T to A and focus on any row, i > 1 of A. Each cluster of row i in A contains each of the numbers 1 through
n. If this were not so, then some cluster of row i would contain a duplicate value, j. This means that there is a partial line in row i, column j of T that contains two points, k and m, that are both in the same cluster of A. The points k and m will also be on a common partial line in row 1 of T since they are in the same cluster. This contradicts FPP1 and FPP2.

Since it is now clear that each cluster in row i of A' contains every number in $\{1, 2, \ldots, n\}$, it is clear that every row in $L_i$ contains each of those numbers exactly once. Suppose some column, c, of $L_i$ contains a duplicate value, j. Then there must be a partial line in row i of T that contains a pair of points, k and m. The common column means that $k \equiv m \pmod{n}$ (since k and m are in the same relative positions from the start of their respective clusters). This, in turn, implies that k and m must be on a common partial line in row 2 of T, contradicting FPP1 and FPP2. Therefore, every column of $L_i$ must also contain every number in $\{1, 2, \ldots, n\}$. Thus, $L_i$ is a Latin square of order n, for i = 1, 2, ..., n - 1.

The Latin squares are mutually orthogonal Suppose that $L_i$ and $L_j$ are not orthogonal. That is, some ordered pair appears in two places when the matrix of ordered pairs from $L_i$ and $L_j$ is created. Then there will be distinct columns, k and m in A', such that the following configuration must occur (it is possible that a = b).

             ...   k   ...   m   ...
    Row i    ...   a   ...   a   ...
    Row j    ...   b   ...   b   ...

This indicates that both k and m are on a common partial line in column a of T and that k and m are also on a common partial line in column b of T. Since $i \ne j$, the two partial lines are distinct. But these partial lines come from distinct lines in $\mathcal{F}$, so they cannot contain two common points. This contradiction means that $L_i$ and $L_j$ must be orthogonal. □

COROLLARY 8.2 Finite Projective Plane If and Only If Mutually Orthogonal Latin Squares
A finite projective plane of order n exists if and only if there is a set of n - 1 mutually orthogonal Latin squares of order n.

Proof: Theorems 8.9 and 8.10. □
8.2.4 Exercises

The exercises marked with D have detailed solutions in Appendix G.
(c)
1. Prove that every number in $\{1, 2, \ldots, n\}$ must appear at most once in every row and in every column of an n by n Latin square.
3 2 1 2 1 3 1 3 2
(d)
1 2
4
1 3
4
2
3
4
2
1
3
1 2 2 1
2 3
4 3 1 2
1 4
1 2 3 4 41 41 23 4 3 2 1
1 23
2 34
3 41
4
1
2
4 1 2 3
2
1 4
3. Determine if the following pairs of Latin squares are orthogonal by constructing the matrices of ordered pairs.
3 4
4 3
(b)4A
2 3 1
1 3 2 1 3 2 122 2 3 3 1
2 1 3 b) 1 2
3 1 2 3 1 2
1233123 2 3 1 1 2 3
(e)
2 1 3
3
2. Produce a standardized Latin square of order 5.
(a)
3 1 2 3 1 2
4. Fill in the missing elements to produce an orthogonal pair of Latin squares of order 5. 1 2
2
3 4
*3
5
*
3 •
4 ,
5 ,
1 *
,
*
*
*
2
*
*
*
*4
•
*
*
*
*
•
*
55
5. A Latin square is called self-orthogonal if it is orthogonal to its transpose. (See page A22 for the definition of transpose.) Determine which of the following Latin squares are self-orthogonal.

    (a)  2 1 3        (b) D  1 2 3        (c)  1 3 4 2
         3 2 1               2 3 1             4 2 1 3
         1 3 2               3 1 2             2 4 3 1
                                               3 1 2 4

6. Prove that there are no self-orthogonal Latin squares of order 3. (The definition of self-orthogonal is presented in Exercise 5.)
7. D Let L be a Latin square of order n and assume that $i, j \in \{1, 2, \ldots, n\}$. Prove that if rows i and j are swapped and then columns i and j are swapped, the new matrix, L', is also a Latin square. (Hint: Try working with a few 3-by-3 and 4-by-4 Latin squares to gain insight.)

8. Suppose two orthogonal Latin squares of order n both have the same last row. Is it possible for them to have identical entries in any other row? Explain your answer.

9. A nursing supervisor in a small nursing home needs to schedule the four duty nurses (Wendy, Xing, Yolanda, and Zoe) for the next four weeks. There are four six-hour shifts (morning, afternoon, evening, and graveyard). She wants to be fair, so she intends to schedule every nurse to work every shift for a one-week period. Design a fair duty roster for the four weeks. The roster should be expressed in terms of the problem, not in terms of any mathematical tools that you use to solve the problem.

10. D A square dance teacher at a small school has decided to assign dance partners so that the children are not forced to choose. The class consists of 12 boys and 12 girls. The square dance lessons are three days per week for four weeks. Describe an equitable design for assigning dance partners (you do not need to produce the design). How do you know such a design is possible?

11. According to Western Square Dance Dancer Terminology (current Web address available in the "Textbook-Related Links" section of http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/), a standard western square dance has four couples arranged in a square. (This information is also available in [68].) The couple whose backs are to the "caller" is designated couple 1. Couples 2, 3, and 4 are numbered going counterclockwise from couple 1. Suppose the teacher in Exercise 10 wants every child to have equal numbers of class periods in each of the four positions, while still matching boys and girls equitably. Describe a design for assigning dance partners that meets the enhanced objectives (you do not need to produce the design). How do you know such a design is possible?

12. A soap manufacturer wishes to develop a new detergent that causes less environmental damage than the old detergent. The new detergent should also work well at various kinds of cleaning. The researchers at the company have four candidate detergents, code named A, B, C, and D. They wish to test them with four kinds of cleaning tasks: normal, grease stains, food stains, and mud stains. They have also decided to use four different brands, W, X, Y, and Z, of washing machine to see if the machine is a significant factor. Management has decided that doing tests with all 64 combinations is too expensive. Design a fair testing arrangement that exposes each candidate detergent to each washer brand and each type of cleaning task. Present your results using the terminology of the detergent company.

13. Suppose the research team in Exercise 12 also wishes to consider the effect of water hardness. They will use four different levels of hardness, denoted H1, H2, H3, and H4. However, management will only allow 16 tests to be run. Design a fair testing arrangement that exposes each candidate detergent to each washer brand and each type of cleaning task as well as each water hardness level. Present your results using the terminology of the detergent company.

14. Solve the "25 officers problem": Arrange 25 officers into a 5-by-5 matrix such that every row and every column contains exactly one officer from each of 5 ranks and from each of 5 regiments. Every rank/regiment pairing should appear exactly one time. (Hint: You might try solving the "9 officers problem" just to get a feel for the organizational aspects of the problem.)

15. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) In a Latin square of order n, the two diagonals will each contain every number in $\{1, 2, \ldots, n\}$.
(b) It is possible to find a pair of orthogonal Latin squares of order n for all n > 3.
(c) Any set of mutually orthogonal Latin squares of order n > 1 will contain fewer than n matrices.
(d) D The number of distinct Latin squares of order n > 1 is always greater than the maximum number of mutually orthogonal Latin squares of order n.

16. Theorem 8.21 asserts that the number of distinct Latin squares of order n is at least $n! \cdot (n-1)! \cdot (n-2)! \cdots 3! \cdot 2! \cdot 1!$.
(a) Produce (with proof) a lower bound for the number of standardized Latin squares of order n, for n > 2.
(b) Calculate the actual number of standardized Latin squares of order n, for 1 < n < 6.

17. Prove that FPP1, FPP2, and FPP3' imply FPP3.

18. D Prove part (B) of Lemma 8.4 on page 430. Do not appeal to duality.
19. A configuration similar to that shown in the diagram from the proof of Theorem 8.8 must be present in any finite projective plane. In particular, it must be present in the Fano plane. Show this is true by a suitable renaming of the points in Example 8.18.

20. Prove Proposition 8.4.

21. D Let $L_1$ and $L_2$ be two distinct lines in a finite projective plane. Prove that there is a point that is on neither $L_1$ nor $L_2$.

22. Let $p_1$ and $p_2$ be two distinct points in a finite projective plane. Prove that there is a line that contains neither $p_1$ nor $p_2$.

23. Why can't a finite projective plane of order 1 exist?

24. Start with a copy of the Fano plane and eliminate lines until the following property is true. For every line, L, there exists another line, L', such that L and L' have no common points. How many lines do you need to eliminate before this property holds? Generalize your conclusion to a statement that is valid for all finite projective planes.

25. A finite projective plane of order 3.
(a) How many points and lines are in a finite projective plane of order 3? How many points are on each line? How many lines contain each point?
(b) Draw a diagram of a finite projective plane of order 3. Don't worry too much about aesthetics. Start with the following partial diagram.

[Figure: partial diagram of a finite projective plane of order 3, with points labeled 1, 2, 3, and 4.]

26. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) A square (with the four vertices considered to be points) is a geometric figure in which neighboring points are on exactly one common line and intersecting lines contain exactly one common point. There are also four distinct points, no three of which are on a common line. A square therefore satisfies all the axioms for a finite projective plane.
(b) In a finite projective plane, every set of four lines has the property that no three of the lines contain a common point.
(c) Let $p_1$, $p_2$, and $p_3$ be three points in a finite projective plane, $\mathcal{F}$, of order n > 2. Then there is a line in $\mathcal{F}$ that contains all three points.
(d) D The duality principle for finite projective planes states that if $\mathcal{F}$ is a finite projective plane, then it is possible to create another projective plane by changing points in $\mathcal{F}$ into lines and lines in $\mathcal{F}$ into points.
(e) D According to Theorem 8.8, there exist finite projective planes with 7, 13, 21, 31, 43, and 57 points and lines, respectively.

27. Let $p_1$, $p_2$, and $p_3$ be three points in a finite projective plane, $\mathcal{F}$, that are not on a common line. Prove that they must be the vertices in a triangle within $\mathcal{F}$.

28. Use Construction 8.1 to create a finite projective plane of order 4. You do not need to draw a visual diagram, just list all the lines. Start with the pair of orthogonal Latin squares listed and fill in the third Latin square to produce a set of three mutually orthogonal Latin squares.

    1 2 3 4        1 3 4 2
    2 1 4 3        4 2 1 3
    3 4 1 2        2 4 3 1
    4 3 2 1        3 1 2 4

29. Use Construction 8.2 to create a set of three mutually orthogonal Latin squares of order 4. The lines of a finite projective plane of order 4 are listed next.

    {5, 6, 7, 8, 17}       {3, 8, 9, 15, 19}      {1, 3, 5, 12, 18}
    {1, 7, 13, 15, 20}     {6, 9, 12, 13, 21}     {2, 6, 10, 15, 18}
    {1, 8, 10, 14, 21}     {4, 8, 13, 16, 18}     {1, 6, 11, 16, 19}
    {2, 5, 13, 14, 19}     {1, 2, 4, 9, 17}       {4, 7, 10, 12, 19}
    {12, 14, 15, 16, 17}   {3, 4, 6, 14, 20}      {4, 5, 11, 15, 21}
    {2, 8, 11, 12, 20}     {7, 9, 11, 14, 18}     {3, 10, 11, 13, 17}
    {5, 9, 10, 16, 20}     {2, 3, 7, 16, 21}      {17, 18, 19, 20, 21}

30. Finish the verification of FPP3 in the proof of Theorem 8.9 on page 436.
31. In the proof of Theorem 8.9, it was shown that the rows of the matrix, A, in Construction 8.1 are orthogonal. Verify this directly for the construction in Quick Check 8.7 by exhaustively listing all the vertical pairs for each pair of rows in A.

32. Follow the proof outline, (a)-(d), to prove the following lemma.

LEMMA 8.5 Let $\{L^1, L^2, \ldots, L^{n-1}\}$ be a set of mutually orthogonal Latin squares of order n. Let $r_1 \ne r_2$ and $c_1 \ne c_2$. Then for some $j \in \{1, 2, \ldots, n - 1\}$, $L^j_{r_1,c_1} = L^j_{r_2,c_2}$.

(a) Explain why it is possible to use sequences of interchanges to transform each of the Latin squares into a Latin square whose first row is "123...n" and still end up with a collection of mutually orthogonal Latin squares. Then note that the interchanges can be reversed, so the transformation process does not change the validity of what follows. Denote the new Latin squares as $L^1, L^2, \ldots, L^{n-1}$.
(b) Suppose $r_1 \ne r_2$, $c_1 \ne c_2$ and $i \ne j$. Prove: If $L^i_{r_1,c_1} = L^i_{r_2,c_2}$, then $L^j_{r_1,c_1} \ne L^j_{r_2,c_2}$.
(c) Suppose $r_1 \ne 1$ and $r_2 \ne 1$. Form the n - 1 ordered pairs $(L^l_{r_1,c_1}, L^l_{r_2,c_2})$, for l = 1, 2, ..., n - 1. Denote the list of ordered pairs as $(x_1, y_1), (x_2, y_2), \ldots, (x_{n-1}, y_{n-1})$ (just to keep the notation simple). The previous part of the proof implies that $(x_i, y_i) \ne (x_j, y_j)$ if $i \ne j$. Show that if $k \ne m$, then $x_k \ne x_m$ and $y_k \ne y_m$.
(d) Complete the proof by considering two cases.
  i. $r_1 = 1$ or $r_2 = 1$: Suppose $r_2 = 1$ (the other option is handled in a similar manner). Show that $y_i = c_2$ for all i. Notice that $x_j \ne c_1$ for all j. Conclude that for some value of j, $x_j = y_j = c_2$.
  ii. $r_1 \ne 1$ and $r_2 \ne 1$: Show that $x_j \ne c_1$ and $y_j \ne c_2$. Conclude that if $y_i \ne x_i$ for all i, then it is impossible to create n - 1 distinct ordered pairs $(x_i, y_i)$. Consequently, $y_i = x_i$ for some i.

33. Does a finite projective plane of order 6 exist? If so construct one, if not explain why not.
8.3 Balanced Incomplete Block Designs

Recall the fertilizer experiment in Example 8.11 on page 423. You might have found the example a bit contrived: there were three fields, three plots per field, three seed varieties, and three fertilizer regimes. What if the various components of the experiment don't all come in convenient groups of size three? The answer is that sometimes there is an alternative to Latin square designs. The alternative is called a balanced incomplete block design. Balanced incomplete block designs were briefly introduced in Section 1.3.3 (page 8). That section introduced a schoolmistress with nine schoolgirls in her boarding school. The solution to that puzzle is reviewed in the next example.

EXAMPLE 8.21 Nine Schoolgirls
A schoolmistress desires to have the nine girls at her boarding school go for a walk on four days each week. The girls will walk in three rows with three students in each row. The schoolmistress also wants each girl to walk in a common row with every other girl exactly once per week. The solution that was presented in Section 1.3.3 is listed in Table 8.16.

TABLE 8.16 Arranging Nine Schoolgirls for Weekly Walks

             Monday    Tuesday    Thursday    Friday
    Row 1    123       147        159         168
    Row 2    456       258        267         249
    Row 3    789       369        348         357

Table 8.16 provides a convenient example to introduce some terminology that will be used throughout this section. Notice that there are 12 rows in the table (three for each of the four days). These rows are examples of blocks (the main topic is "balanced incomplete block designs"). Each block (row) contains three girls. In the general setting, the items contained in the blocks are called varieties (recall the "three seed varieties" in the fertilizer experiment). Each block contains three varieties, and each variety is in four blocks (one block per day). Finally, every pair of varieties appears in exactly one common block. ■
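The counting claims in this example can be confirmed directly from the table. The sketch below assumes the blocks are exactly the twelve rows of Table 8.16; the variable names (`WALKS`, `replication`, `pair_counts`, `resolvable`) are illustrative choices, not terminology from the text.

```python
from collections import Counter
from itertools import combinations

# Blocks of Table 8.16, grouped by day; each block is one row of three girls.
WALKS = {
    "Monday":   [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}],
    "Tuesday":  [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}],
    "Thursday": [{1, 5, 9}, {2, 6, 7}, {3, 4, 8}],
    "Friday":   [{1, 6, 8}, {2, 4, 9}, {3, 5, 7}],
}
girls = set(range(1, 10))
blocks = [block for day in WALKS.values() for block in day]

# Number of blocks that contain each girl.
replication = Counter(g for block in blocks for g in block)

# Number of blocks that contain each unordered pair of girls.
pair_counts = Counter(
    pair for block in blocks for pair in combinations(sorted(block), 2)
)

# Each day's three rows should split the nine girls with no overlap.
resolvable = all(
    sum(len(b) for b in day) == len(girls) and set().union(*day) == girls
    for day in WALKS.values()
)

print(len(blocks))                                  # 12
print(set(replication.values()))                    # {4}
print(set(pair_counts.values()), len(pair_counts))  # {1} 36
print(resolvable)                                   # True
```

Since there are 36 unordered pairs of girls and 12 blocks with 3 pairs each, every pair appearing at least once already forces every pair to appear exactly once; the counter makes that visible without relying on the argument.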
EXAMPLE 8.22 Popcorn Revisited
Suppose the agricultural researcher has four fields (each with three plots). Two of the fields are near the city (and adjacent to each other). The other two are in a rural area (and have similar soil conditions). The researcher wishes to compare four seed varieties and only two fertilizer regimes. Is it possible to create an experimental design that allows each seed variety to be tested in a common field with every other variety and also for each seed variety to have both fertilizer regimes and both city and rural fields?

A Latin square will not work with these parameters. However, it is not hard to produce an experimental design that meets the objectives. The four fields will be the blocks, and the seeds will be four varieties. The fields are denoted $C_1$, $C_2$, $R_1$, and $R_2$ (C for "city" and R for "rural"). Fields $C_1$ and $R_1$ will use the first fertilizer regime; $C_2$ and $R_2$ will use the second regime. The varieties will be denoted 1, 2, 3, and 4. Each block will contain three varieties (since there are three plots per field). Every variety will appear twice with every other variety, but each variety will be planted in only three blocks. The design is listed next.

    C1    C2    R1    R2
    1     1     1     2
    2     3     2     3
    3     4     4     4

You should check the various claims made prior to the listing of the design. For example, varieties 2 and 3 appear together in blocks $C_1$ and $R_2$. Also, variety 2 will appear in both city and rural fields and also with both fertilizer regimes. Notice that a block (field) does not contain every variety of seed. ■

These examples motivate the formal definition.

DEFINITION 8.11 Balanced Incomplete Block Design
A balanced incomplete block design, abbreviated BIBD, is a combinatorial design consisting of a finite collection of finite sets (called blocks), each consisting of a finite number of elements (called varieties). The boundary conditions a BIBD must satisfy are expressed in terms of five parameters, commonly expressed as the 5-tuple of positive integers, $(v, b, r, k, \lambda)$. The parameter v represents the number of distinct varieties; the parameter b represents the number of blocks. Every variety is required to be in exactly r blocks, and every block must contain exactly k varieties. Finally, every pair of distinct varieties must appear in exactly $\lambda$ common blocks. A combinatorial design which meets these conditions is often referred to as a $(v, b, r, k, \lambda)$-design.

A $(v, b, r, k, \lambda)$-design with k = v and r = b is called trivial. A trivial BIBD consists of b identical blocks, each containing every variety.

It is convenient to express balanced incomplete block designs using a matrix. The next definition provides the details.

DEFINITION 8.12 The Incidence Matrix of a BIBD
Let D be a $(v, b, r, k, \lambda)$-design with varieties $\{u_1, u_2, \ldots, u_v\}$ and blocks $\{B_1, B_2, \ldots, B_b\}$. The incidence matrix of D is the v by b matrix, $M = (m_{ij})$, where

$$m_{ij} = \begin{cases} 1 & \text{if } u_i \in B_j \\ 0 & \text{otherwise} \end{cases}$$

The term incomplete refers to the fact that not every possible block is present in the design. Since there are v varieties and each block contains k varieties, there are potentially $\binom{v}{k}$ blocks. A typical BIBD has fewer blocks.^19 The term balanced refers to the uniform size of blocks, the uniform number of blocks each variety appears in, and the uniform way that pairs of varieties appear in blocks.

19. In Example 8.21 there are only 12 of the $\binom{9}{3} = 84$ potential blocks in the design. In Example 8.22 all $4 = \binom{4}{3}$ of the possible blocks are used in the design, so this design is actually complete.
Quick Check 8.9
1. Determine the values of the parameters $(v, b, r, k, \lambda)$ for the BIBD in Example 8.21.
2. Determine the values of the parameters $(v, b, r, k, \lambda)$ for the BIBD in Example 8.22.
3. Produce the incidence matrix for the agricultural BIBD in Example 8.22.
4. Produce the incidence matrix for the schoolgirl BIBD in Example 8.21.
It should seem likely that the parameters, $(v, b, r, k, \lambda)$, cannot be chosen arbitrarily. The following theorem expresses the most fundamental relationships that these parameters must satisfy.^20

THEOREM 8.11 The Parameters of a BIBD
Let D be a balanced incomplete block design with parameters $(v, b, r, k, \lambda)$. Then

$$bk = vr \quad \text{and} \quad r(k - 1) = \lambda(v - 1).$$

Proof: Let M be the incidence matrix for the $(v, b, r, k, \lambda)$-design. Both equations will be proved using combinatorial proofs that count 1s in M.

The first equation is verified by counting all the 1s in M two different ways. Since there are b blocks, each containing k varieties, each of the b columns of M will contain k 1s, for a total of bk 1s. On the other hand, each of the v varieties is in r blocks, so each of the v rows of M contains r 1s, for a total of vr 1s. Therefore, bk = vr.

The second equation is verified by counting the 1s in a submatrix of M. Start by choosing any variety, u. Delete the row of M that corresponds to u and delete every column that corresponds to a block that does not contain u. Now count the 1s in the matrix, $M_u$, that remains. Since u is in r blocks, there will be r columns in $M_u$. Each of those columns will contain k - 1 1s (since the 1 in u's row has been removed). On the other hand, u is in $\lambda$ common blocks with each of the v - 1 other varieties. So each of those varieties contributes $\lambda$ 1s to $M_u$. Consequently, $r(k - 1) = \lambda(v - 1)$. □

Theorem 8.11 presents a pair of necessary conditions for the existence of a $(v, b, r, k, \lambda)$-design. These conditions are not sufficient conditions. That is, in order for a $(v, b, r, k, \lambda)$-design to exist, the two equations in the theorem must be true. However, even if the two equations are true, there may be no BIBD with the given parameters. For example, the parameters (43, 43, 7, 7, 1) satisfy the two equations ($43 \cdot 7 = 43 \cdot 7$ and $7 \cdot 6 = 1 \cdot 42$), but no BIBD with parameters (43, 43, 7, 7, 1) exists. (The nonexistence is a consequence of Theorem 8.13 on page 448.)
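The two counting arguments in the proof can be replayed on a small design. The sketch below assumes the four blocks of the design in Example 8.22 ($C_1$ = {1, 2, 3}, $C_2$ = {1, 3, 4}, $R_1$ = {1, 2, 4}, $R_2$ = {2, 3, 4}); the helper `ones_in_submatrix` is a hypothetical name for the count of 1s in $M_u$.

```python
# Blocks of the popcorn design, keyed by field name.
BLOCKS = {"C1": {1, 2, 3}, "C2": {1, 3, 4}, "R1": {1, 2, 4}, "R2": {2, 3, 4}}
VARIETIES = [1, 2, 3, 4]

# Incidence matrix: one row per variety, one column per block.
M = [[1 if u in BLOCKS[name] else 0 for name in BLOCKS] for u in VARIETIES]

v, b = len(M), len(M[0])
k = sum(row[0] for row in M)   # 1s in the first column (block size)
r = sum(M[0])                  # 1s in the first row (blocks per variety)

# First equation: count all the 1s by columns, then by rows.
ones_by_columns = sum(sum(col) for col in zip(*M))
ones_by_rows = sum(sum(row) for row in M)


def ones_in_submatrix(matrix, u_index):
    """Count the 1s left after deleting row u and every column
    whose block does not contain u (the matrix M_u in the proof)."""
    kept_columns = [j for j, bit in enumerate(matrix[u_index]) if bit]
    return sum(
        matrix[i][j]
        for i in range(len(matrix)) if i != u_index
        for j in kept_columns
    )


lam = 2  # every pair of varieties shares two fields in this design
print(ones_by_columns == b * k, ones_by_rows == v * r)      # True True
print(ones_in_submatrix(M, 0), r * (k - 1), lam * (v - 1))  # 6 6 6
```

The value of `lam` is entered by hand from the example; the code only confirms that the count of 1s in $M_u$ agrees with both $r(k - 1)$ and $\lambda(v - 1)$.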
DEFINITION 8.13 Symmetric; Resolvable
A balanced incomplete block design is symmetric if v = b and r = k. Symmetric balanced incomplete block designs are often referred to as $(v, k, \lambda)$-designs. A balanced incomplete block design is resolvable if the blocks can be grouped into disjoint collections (of equal numbers of blocks) so that every variety appears exactly once in each group of blocks.

20. There are other conditions. Some of them will be presented in this text.

The BIBD in Example 8.22 is a symmetric BIBD because v = b = 4 and r = k = 3. The BIBD in Example 8.21 is resolvable. The blocks are grouped by day of the week. If this design were not resolvable, the design would not have solved the original puzzle of enabling all nine girls to walk in three lines each day.
Quick Check 8.10
1. The original schoolgirl puzzle, proposed by Reverend Thomas Kirkman (see page 8), requires 7 walks with 15 girls arranged in 5 rows of 3 girls each.
(a) A solution to his puzzle requires a resolvable BIBD. What are the parameters, $(v, b, r, k, \lambda)$, of the BIBD?
(b) Show that these parameters satisfy the requirements in Theorem 8.11.
The next two theorems are presented without proof. See [37] for proofs (which utilize mathematical ideas beyond the assumed background for this text).

THEOREM 8.12 Fisher's Inequality
If D is a nontrivial (v, b, r, k, λ)-design, then b ≥ v and r ≥ k.

THEOREM 8.13 Bruck-Ryser-Chowla
If D is a symmetric balanced incomplete block design, with parameters (v, k, λ), then the following statements are true.
• If v is even, then k − λ is a square.
• If v is odd, then the equation
  $z^2 = (k - \lambda)x^2 + (-1)^{(v-1)/2} \lambda y^2$
  has a solution in integers x, y, and z, where x, y, and z are not all zero.

The second part of the Bruck-Ryser-Chowla Theorem can be used to show that a (43, 7, 1)-design cannot exist.
EXAMPLE 8.23 Fragrance Testing
A perfume company wishes to test some new fragrances. There are six candidate aromas. They wish to test them in groups of three, with the human test subjects classifying the individual aromas in each triple of fragrances as "best," "middle," and "worst" (with no ties). They want to have every pair of fragrances matched in at least two tests.

There are $\binom{6}{3} = 20$ subsets of three fragrances. These 20 subsets will match each pair of fragrances more than twice (four times to be exact). Can this be reduced without compromising the requirements?

A balanced incomplete block design is one possible mechanism to reduce the number of tests, assuming a suitable BIBD exists. What parameters are necessary? The requirements have specified that v = 6, since there are six fragrances to test. The number of blocks (tests) is undetermined so far. However, each test matches three fragrances, so k = 3. If the minimum number of pairings is used, then λ = 2. This is enough to determine b and r. From Theorem 8.11, b · 3 = 6 · r and r · 2 = 2 · 5. Thus, r = 5 and b = 10 are required.

This is a promising start; the solution values b and r might not have been integers. Since 10 ≥ 6 and 5 ≥ 3, Fisher's inequality holds, so it still looks promising. Since k ≠ r, the design is not symmetric and the Bruck-Ryser-Chowla theorem doesn't apply. It seems likely (but not guaranteed) that a (6, 10, 5, 3, 2)-design exists. If one does exist, then the number of tests can be reduced from 20 to b = 10. This is a substantial improvement.
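The arithmetic that determines b and r from v, k, and λ can be sketched in a few lines. The helper below is a hypothetical illustration of the two equations in Theorem 8.11; it returns None when either value fails to be an integer, and it says nothing about whether a design actually exists.

```python
from fractions import Fraction


def remaining_parameters(v, k, lam):
    """Solve r(k - 1) = lam(v - 1) and bk = vr for (b, r); None if not integers."""
    r = Fraction(lam * (v - 1), k - 1)
    b = Fraction(v) * r / k
    if r.denominator != 1 or b.denominator != 1:
        return None
    return int(b), int(r)


b, r = remaining_parameters(6, 3, 2)   # the fragrance-testing parameters
print(b, r)                            # b = 10 tests, r = 5 tests per fragrance
print(b >= 6 and r >= 3)               # Fisher's inequality for v = 6, k = 3
```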
Knowing that a BIBD might exist is not the same as being able to construct one. Example 8.23 must remain incomplete for now. 21 The next section discusses some techniques for constructing BIBDs.
8.3.1 Constructing Balanced Incomplete Block Designs
The methods for constructing BIBDs fall into two broad categories. One category consists of constructions that start with other kinds of combinatorial designs or mathematical objects and build a balanced incomplete block design. 22 The other category consists of methods to transform one BIBD into another BIBD.

New BIBDs from Old
There are a number of ways to create a new BIBD from an existing BIBD. Several will be described in this section. The first two are very general; they will work with any initial BIBD. Each of the second pair of constructions assumes that the initial BIBD is symmetric. The first of these constructions is extremely simple: just make one or more copies of each of the blocks.
Construction 8.3 A Replicated Design: A (v, nb, nr, k, nλ)-Design from a (v, b, r, k, λ)-Design
If D is a (v, b, r, k, λ)-design, then a (v, nb, nr, k, nλ)-design, D_n, can be created by making n copies of each block in D. Thus, the varieties of D_n are identical to the varieties in D and each block of D will appear n times in D_n.
Notice that neither v nor k will change in this construction. If a variety, u, appears in r blocks in D, then it will appear in nr blocks in D_n (all its old blocks, each repeated n times). In a similar fashion, every pair of varieties will appear in nλ common blocks.

EXAMPLE 8.24 A Replicated Design
Table 8.17 shows the blocks of a (5, 10, 6, 3, 3)-design.
TABLE 8.17 The Blocks of a (5, 10, 6, 3, 3)-Design
B1  B2  B3  B4  B5  B6  B7  B8  B9  B10
1   1   1   1   1   1   2   2   2   3
2   2   2   3   3   4   3   3   4   4
3   4   5   4   5   5   4   5   5   5
A (5, 20, 12, 3, 6)-design can be created by taking two copies of each block (Table 8.18).
TABLE 8.18 A (5, 20, 12, 3, 6)-Design
B1  B2  B3  B4  B5  B6  B7  B8  B9  B10
1   1   1   1   1   1   2   2   2   3
2   2   2   3   3   4   3   3   4   4
3   4   5   4   5   5   4   5   5   5

B11 B12 B13 B14 B15 B16 B17 B18 B19 B20
1   1   1   1   1   1   2   2   2   3
2   2   2   3   3   4   3   3   4   4
3   4   5   4   5   5   4   5   5   5
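Construction 8.3 can be checked directly by counting. The sketch below uses hypothetical helper names; it rebuilds the blocks of Table 8.17 as all 3-element subsets of {1, ..., 5} (which is what the table lists), replicates them, and recomputes the parameters by brute force.

```python
from itertools import combinations


def design_parameters(blocks):
    """Return (v, b, r, k, lam) when the blocks are balanced, otherwise None."""
    varieties = sorted(set().union(*blocks))
    sizes = {len(block) for block in blocks}
    replications = {sum(u in block for block in blocks) for u in varieties}
    pair_counts = {
        sum(x in block and y in block for block in blocks)
        for x, y in combinations(varieties, 2)
    }
    if len(sizes) != 1 or len(replications) != 1 or len(pair_counts) != 1:
        return None
    return (len(varieties), len(blocks), replications.pop(), sizes.pop(), pair_counts.pop())


def replicate(blocks, n):
    """Construction 8.3: make n copies of every block."""
    return [block for block in blocks for _ in range(n)]


table_8_17 = [frozenset(c) for c in combinations(range(1, 6), 3)]
print(design_parameters(table_8_17))
print(design_parameters(replicate(table_8_17, 2)))
```

The blocks are kept in a list rather than a set so that repeated blocks are counted separately.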
21 A complete solution will be given in Exercise 18 in Exercises 8.3.2 (on p. 456).
22 This is similar to using a set of n − 1 mutually orthogonal Latin squares of order n to construct a finite projective plane of order n.
The next construction is only slightly more sophisticated.
Construction 8.4 The Complement Design: A (v, b, b − r, v − k, b − 2r + λ)-Design from a (v, b, r, k, λ)-Design
If D is a (v, b, r, k, λ)-design, then a (v̄, b̄, r̄, k̄, λ̄) = (v, b, b − r, v − k, b − 2r + λ)-design, D̄, can be created by complementing each of the blocks in D. The varieties in D̄ are identical to those in D. Suppose the set of varieties in D is U. If B is a block in D, then the corresponding block in D̄ is B̄ = U − B.

The numbers of varieties and blocks do not change when moving from D to D̄, so v̄ = v and b̄ = b. A block, B, in D contains k varieties. The corresponding block, B̄, will contain every variety that is not in B. Thus k̄ = v − k. The variety, u, appears in r blocks, B1, B2, ..., Br, in D but will appear in every block except B̄1, B̄2, ..., B̄r in D̄. Thus, r̄ = b − r.

Finally, consider two varieties, u1 and u2. They are in λ common blocks in D. There are an additional r − λ blocks in D that contain u1, and another r − λ different blocks in D that contain u2. The varieties u1 and u2 will appear together in all blocks, B̄ ∈ D̄, for which B ∈ D does not contain either u1 or u2. Thus, λ̄ = b − (λ + (r − λ) + (r − λ)) = b − 2r + λ.

EXAMPLE 8.25 A Complement Design
The (5, 10, 6, 3, 3)-design in Example 8.24 has a complement that is a (5, 10, 4, 2, 1)-design. The blocks of the complement are shown in Table 8.19.

TABLE 8.19 A (5, 10, 4, 2, 1)-Design
B̄1  B̄2  B̄3  B̄4  B̄5  B̄6  B̄7  B̄8  B̄9  B̄10
4   3   3   2   2   2   1   1   1   1
5   5   4   5   4   3   5   4   3   2
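The complement construction can be sketched as follows, using the design of Table 8.17 (rebuilt here as all 3-element subsets of {1, ..., 5}). The function name is an illustrative assumption; the checks simply confirm the parameter formulas of Construction 8.4 for this one design.

```python
from itertools import combinations


def complement_design(blocks):
    """Construction 8.4: replace every block B by U - B."""
    universe = set().union(*blocks)
    return [universe - set(block) for block in blocks]


D = [set(c) for c in combinations(range(1, 6), 3)]   # the (5, 10, 6, 3, 3)-design
D_bar = complement_design(D)
v, b, r, k, lam = 5, 10, 6, 3, 3

block_sizes = {len(block) for block in D_bar}
replications = {sum(u in block for block in D_bar) for u in range(1, 6)}
pair_counts = {
    sum({x, y} <= block for block in D_bar)
    for x, y in combinations(range(1, 6), 2)
}

print(sorted(D_bar[0]))                        # complement of {1, 2, 3}
print(block_sizes == {v - k})                  # block size is v - k
print(replications == {b - r})                 # each variety is in b - r blocks
print(pair_counts == {b - 2 * r + lam})        # each pair is in b - 2r + lam blocks
```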
The next two constructions, which transform symmetric BIBDs into new BIBDs, are more sophisticated. The following theorem guarantees a necessary background condition. The proof involves more matrix algebra than is assumed for this text, so it will be omitted.

THEOREM 8.14 Block Intersections in Symmetric BIBDs
If D is a symmetric (v, k, λ)-design, then every pair of blocks contains exactly λ common varieties.
Construction 8.5 Derived Designs: A (k, v − 1, k − 1, λ, λ − 1)-Design from a (v, k, λ)-Design
Let D be a symmetric (v, k, λ)-design, with blocks {B0, B1, B2, ..., B_{v−1}}. A derived design, D', can be constructed by selecting any block in D (assumed here to be B0). The varieties of D' will be the set of k varieties in B0. The v − 1 blocks of D' will be formed from the blocks, B1, B2, ..., B_{v−1}, by removing any varieties that are not in B0. That is, B'_i = B_i ∩ B0.

Since every block in D contains k varieties, there will be v' = k varieties in D' (since the varieties of D' come from B0). The blocks in D' are constructed from the v − 1 blocks other than B0 in D, so b' = v − 1. Each variety, u, in B0 is in r = k blocks in D. That variety will still be in the corresponding blocks in D', but B0 has no corresponding block in D'. Thus, r' = r − 1 = k − 1. Theorem 8.14 asserts that B0 and B_i have exactly λ varieties in common, for i = 1, 2, ..., v − 1. Thus, k' = |B0 ∩ B_i| = λ. Finally, every pair of varieties in B0 appears in λ common blocks in D. All but one of those common appearances will be preserved in D' (the common block B0 has no counterpart in D'). Thus, λ' = λ − 1 (assuming v' > 2).

EXAMPLE 8.26 A Derived Design
A symmetric (15, 7, 3)-design is listed in Table 8.20. 23 The derived design (using the block B0) is shown in Table 8.21. It has parameters (7, 14, 6, 3, 2).

TABLE 8.20 A Symmetric (15, 7, 3)-Design
B0  B1  B2  B3  B4  B5  B6  B7  B8  B9  B10 B11 B12 B13 B14
0   0   0   0   0   0   0   1   1   1   1   2   2   2   2
1   1   1   3   3   5   5   3   3   4   4   3   3   4   4
2   2   2   4   4   6   6   5   6   5   6   5   6   5   6
3   7   11  7   9   7   9   7   7   8   8   8   8   7   7
4   8   12  8   10  8   10  9   10  10  9   10  9   9   10
5   9   13  11  13  13  11  11  12  11  12  12  11  12  11
6   10  14  12  14  14  12  13  14  14  13  13  14  14  13

TABLE 8.21 A (7, 14, 6, 3, 2)-Design
B'1 B'2 B'3 B'4 B'5 B'6 B'7 B'8 B'9 B'10 B'11 B'12 B'13 B'14
0   0   0   0   0   0   1   1   1   1    2    2    2    2
1   1   3   3   5   5   3   3   4   4    3    3    4    4
2   2   4   4   6   6   5   6   5   6    5    6    5    6
The next construction also requires a symmetric design.

Construction 8.6 Residual Designs: A (v − k, v − 1, k, k − λ, λ)-Design from a (v, k, λ)-Design
Let D be a symmetric (v, k, λ)-design, with blocks {B0, B1, B2, ..., B_{v−1}}. A residual design, D*, can be constructed by selecting any block in D (assumed again to be B0). The varieties of D* will be the set of v − k varieties that are not in B0. The v − 1 blocks of D* will be formed from the blocks, B1, B2, ..., B_{v−1}, by removing any varieties that are in B0. That is, B*_i = B_i − B0.

Since the k varieties in B0 are not present in D*, v* = v − k. Also, b* = v − 1 since B0 does not transform into a block in D*. The varieties that remain in D* are not in B0, so removing B0 does not change the number of blocks they are in. Thus, r* = r = k. Theorem 8.14 asserts that B0 shares λ common varieties with B_i, for i ≠ 0. Thus, k* = k − λ. Finally, if u1 and u2 are in the common block, B, in D, they will still be in the common block, B*, in D*. Thus, λ* = λ (assuming v* > 2).
EXAMPLE 8.27 A Residual Design
The residual design (using block B0) for the (15, 7, 3)-design in Example 8.26 is shown in Table 8.22. It has parameters (8, 14, 7, 4, 3).

TABLE 8.22 A (8, 14, 7, 4, 3)-Design
B*1 B*2 B*3 B*4 B*5 B*6 B*7 B*8 B*9 B*10 B*11 B*12 B*13 B*14
7   11  7   9   7   9   7   7   8   8    8    8    7    7
8   12  8   10  8   10  9   10  10  9    10   9    9    10
9   13  11  13  13  11  11  12  11  12   12   11   12   11
10  14  12  14  14  12  13  14  14  13   13   14   14   13

23 From [37, p. 128].
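Constructions 8.5 and 8.6 amount to an intersection and a set difference with the chosen block. The sketch below is an illustration with hypothetical names: it lists the fifteen blocks of the symmetric (15, 7, 3)-design from Example 8.26, forms both designs from B0, and counts how many blocks each pair of varieties shares.

```python
from itertools import combinations

# Blocks B0, B1, ..., B14 of the symmetric (15, 7, 3)-design in Example 8.26.
D = [
    {0, 1, 2, 3, 4, 5, 6},
    {0, 1, 2, 7, 8, 9, 10}, {0, 1, 2, 11, 12, 13, 14},
    {0, 3, 4, 7, 8, 11, 12}, {0, 3, 4, 9, 10, 13, 14},
    {0, 5, 6, 7, 8, 13, 14}, {0, 5, 6, 9, 10, 11, 12},
    {1, 3, 5, 7, 9, 11, 13}, {1, 3, 6, 7, 10, 12, 14},
    {1, 4, 5, 8, 10, 11, 14}, {1, 4, 6, 8, 9, 12, 13},
    {2, 3, 5, 8, 10, 12, 13}, {2, 3, 6, 8, 9, 11, 14},
    {2, 4, 5, 7, 9, 12, 14}, {2, 4, 6, 7, 10, 11, 13},
]
B0 = D[0]
derived = [B & B0 for B in D[1:]]     # Construction 8.5: keep only varieties in B0
residual = [B - B0 for B in D[1:]]    # Construction 8.6: drop the varieties in B0


def common_block_counts(blocks, varieties):
    """Set of values taken by 'number of blocks containing both x and y'."""
    return {
        sum(x in B and y in B for B in blocks)
        for x, y in combinations(sorted(varieties), 2)
    }


print({len(B) for B in derived}, common_block_counts(derived, B0))
print({len(B) for B in residual}, common_block_counts(residual, set(range(15)) - B0))
```

The derived blocks all have size λ = 3 with every pair in λ − 1 = 2 blocks, and the residual blocks have size k − λ = 4 with every pair in λ = 3 blocks, as the two constructions predict.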
Quick Check 8.11
The symmetric BIBD, D, in Table 8.23 has parameters (7, 4, 2).

TABLE 8.23 A Symmetric (7, 4, 2)-Design
B0  B1  B2  B3  B4  B5  B6
3   2   2   1   1   1   1
4   4   3   4   3   2   2
6   5   5   5   5   6   3
7   7   6   6   7   7   4

1. Form the complement design, D̄. What are its parameters?
2. Form the derived design, D', using block B2. What are its parameters?
3. Form the residual design, D*, using block B2. What are its parameters?
BIBDs from Other Mathematical Objects
The following theorem, together with Construction 8.7, provides a prime example of a construction of a BIBD from some other combinatorial design.

THEOREM 8.15 BIBD Iff Finite Projective Plane
A finite projective plane of order n ≥ 2 exists if and only if an (n² + n + 1, n + 1, 1)-design exists.
Construction 8.7 An (n² + n + 1, n + 1, 1)-Design from a Finite Projective Plane of Order n
Let F be a finite projective plane of order n. A symmetric balanced incomplete block design, D, can be constructed from F by taking the lines of F as the blocks of D and the points of F as the varieties of D. A variety will be in a block if and only if (viewed as a point in F) it is on the line in F from which the block was derived. D will be an (n² + n + 1, n + 1, 1)-design.

Construction 8.8 A Finite Projective Plane of Order n from an (n² + n + 1, n + 1, 1)-Design
Let D be a symmetric balanced incomplete block design with parameters (n² + n + 1, n + 1, 1). The lines of F will be the blocks of D and the points of F will be the varieties of D. A point will be on a line if that point (viewed as a variety) is in the block from which the line was derived.
Proof of Theorem 8.15: The proof is a verification that Constructions 8.7 and 8.8 are correct.
Construction 8.7 is valid. Since F has n² + n + 1 lines and n² + n + 1 points, b = v = n² + n + 1. Since every line in F contains n + 1 points and every point in F is on n + 1 lines, k = r = n + 1. Finally, every pair of points is on exactly one common line, so λ = 1.

Construction 8.8 is valid. Since D is symmetric, b = v = n² + n + 1 and r = k = n + 1, so there will be n² + n + 1 points and n² + n + 1 lines. Theorem 8.8 implies that the construction will be a finite projective plane of order n (if it is actually a finite projective plane). Since every block contains k = n + 1 varieties, it is clear that every line contains n + 1 points. Similarly, since r = n + 1, every point will be on n + 1 lines. Since λ = 1, every pair of points will be on exactly one common line, so FPP1 holds. Theorem 8.14 implies that every pair of lines contains λ = 1 common point, so FPP2 holds.

It only remains to verify that FPP3 holds. Since k = n + 1 < n² + n + 1 = v, no block contains every variety. Choose any two varieties, u1 and u2. There is exactly one block, B12, that contains both u1 and u2 (by Theorem 8.14). Since k < v, there must exist some variety, u3, that is not in B12. There are unique blocks, B13 and B23, such that u1 and u3 are both in B13 and also u2 and u3 are both in B23. The three varieties u1, u2, and u3 are not all in a common block.

How many distinct varieties are in the three blocks B12, B13, and B23? There are k = n + 1 distinct varieties in B12, but only k − 1 = n additional varieties in B13 (since u1 has already been counted). The block B23 contributes only k − 2 = n − 1 new varieties since both u2 and u3 have already been counted. These three blocks therefore contain a total of 3n distinct varieties. 24 Since n ≥ 2, n² + n + 1 > 3n. There must be at least one more variety, u4, that is not in the blocks B12, B13, and B23.

Since pairs of varieties are only in one common block (λ = 1), u4 is not in a common block with any pair among {u1, u2, u3}. Now change the viewpoint back to the projective plane interpretation. The points u1, u2, u3, and u4 are 4 points in F with no three among them on a common line. This means that FPP3 holds. □
Quick Check 8.12
1. Create a (7, 3, 1)-design from the Fano plane. Sort the varieties in increasing order and list the blocks sorted by the varieties they contain.
2. Show the incidence matrix for the (7, 3, 1)-design you just created.

[Figure: the Fano plane, with its seven points labeled 1 through 7]

The next construction uses modular arithmetic. 25

24 You should convince yourself that they do not contain additional common varieties.
25 See Definition 3.13 on p. 117.
Construction 8.9 A (13, 26, 6, 3, 1)-Design
Start with the blocks
$B_1 = \begin{pmatrix} 1 \\ 3 \\ 9 \end{pmatrix}$ and $B_{14} = \begin{pmatrix} 2 \\ 5 \\ 6 \end{pmatrix}$.
Create twelve additional blocks by starting with B1 and adding 1 to each row. Do the arithmetic mod 13. That is, whenever the addition produces a 13, change the value to 0 (and re-sort the block). Keep doing this until the original block appears. Discard the second copy of the original block. Repeat the process with B14.
Each of the two original blocks is the source for 12 additional blocks, for a total of b = 26 blocks. There are 13 varieties (the numbers 0 through 12). Each block contains k = 3 varieties. You can verify that r = 6 and λ = 1 by constructing the incidence matrix. The BIBD is listed (without labels) in Table 8.24. It is split into two groupings so that the progression from the two initial blocks is apparent.

TABLE 8.24 A (13, 26, 6, 3, 1)-Design
1   2   3   4   0   1   2   3   4   5   0   1   0
3   4   5   6   5   6   7   8   9   10  6   7   2
9   10  11  12  7   8   9   10  11  12  11  12  8

2   3   4   5   6   7   8   0   0   1   2   0   1
5   6   7   8   9   10  11  9   1   2   3   3   4
6   7   8   9   10  11  12  12  10  11  12  4   5
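The "add 1 and re-sort" procedure of Construction 8.9 can be sketched as a short loop. The function name `develop` is a hypothetical label for this step; the checks afterward count replications and pair appearances by brute force instead of building the incidence matrix.

```python
from itertools import combinations


def develop(base_block, modulus):
    """Add 1 (mod modulus) to every entry until the starting block reappears."""
    start = tuple(sorted(base_block))
    blocks, block = [], start
    while True:
        blocks.append(block)
        block = tuple(sorted((x + 1) % modulus for x in block))
        if block == start:          # discard the second copy of the original block
            return blocks


design = develop((1, 3, 9), 13) + develop((2, 5, 6), 13)

replications = {sum(u in block for block in design) for u in range(13)}
pair_counts = {
    sum(x in block and y in block for block in design)
    for x, y in combinations(range(13), 2)
}
print(len(design), replications, pair_counts)   # b, the values of r, the values of lambda
```

The same loop stops early for a block such as (0, 5, 10) mod 15, which returns to itself after only five steps; this is why the third starting block in the next construction contributes just five blocks.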
The next construction is similar to Construction 8.9. It will be the basis for a solution to Kirkman's schoolgirls problem.
Construction 8.10 A (15, 35, 7, 3, 1)-Design
Start with the blocks
$B_1 = \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix}$ and $B_{16} = \begin{pmatrix} 0 \\ 2 \\ 8 \end{pmatrix}$ and $B_{31} = \begin{pmatrix} 0 \\ 5 \\ 10 \end{pmatrix}$.
Create additional blocks by starting with B1 and adding 1 to each row. Do the arithmetic mod 15. Keep doing this until the original block appears. Discard the second copy of the original block. Repeat the process with B16 and B31.
The 35 blocks that result from Construction 8.10 are listed as columns in Table 8.25. 26

EXAMPLE 8.28 Kirkman's Schoolgirls
A schoolmistress has 15 young girls at her boarding school. Each day the schoolmistress lines the girls up in five rows of three girls each. She wishes to group the girls so that in the course of seven walks, each girl will have been in a row with every other girl exactly once.

This requires a (15, 35, 7, 3, 1)-design (see Quick Check 8.10). There are 80 essentially distinct BIBDs with these parameters, but only four of them are resolvable. The four resolvable BIBDs lead to seven essentially distinct solutions to the schoolgirl problem 27 [14, p. 88].

Construction 8.10 produces one of the four resolvable (15, 35, 7, 3, 1)-designs. The girls are numbered from 0 to 14. Table 8.26 shows these blocks (reformatted for the original problem) separated into seven disjoint groups.

26 The construction is from [63, p. 761].
27 There can be more than one way to partition the BIBD's blocks into disjoint groups.
TABLE 8.25 A (15, 35, 7, 3, 1)-Design
0   1   2   3   4   5   6   7   8   9   10  0   1   2   0
1   2   3   4   5   6   7   8   9   10  11  11  12  13  3
4   5   6   7   8   9   10  11  12  13  14  12  13  14  14

0   1   2   3   4   5   6   0   1   2   3   4   5   0   1
2   3   4   5   6   7   8   7   8   9   10  11  12  6   7
8   9   10  11  12  13  14  9   10  11  12  13  14  13  14

0   1   2   3   4
5   6   7   8   9
10  11  12  13  14
TABLE 8.26 A Solution to the Original Kirkman's Schoolgirl Problem
Sunday       Monday       Tuesday      Wednesday
0   1   4    0   2   8    0   3   14   0   5   10
2   9   11   1   3   9    1   6   11   1   12  13
3   10  12   4   11  13   2   7   12   2   3   6
5   7   13   5   12  14   4   5   8    4   9   14
6   8   14   6   7   10   9   10  13   7   8   11

Thursday     Friday       Saturday
0   6   13   0   7   9    0   11  12
1   7   14   1   2   5    1   8   10
2   4   10   3   8   13   2   13  14
3   5   11   4   6   12   3   4   7
8   9   12   10  11  14   5   6   9
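The two properties that make Table 8.26 a solution can be verified by counting: every day must use each girl exactly once (resolvability), and every pair of girls must share a row exactly once over the week (λ = 1). The sketch below transcribes the seven walks; the variable names are illustrative choices.

```python
from collections import Counter
from itertools import combinations

walks = {
    "Sunday":    [(0, 1, 4), (2, 9, 11), (3, 10, 12), (5, 7, 13), (6, 8, 14)],
    "Monday":    [(0, 2, 8), (1, 3, 9), (4, 11, 13), (5, 12, 14), (6, 7, 10)],
    "Tuesday":   [(0, 3, 14), (1, 6, 11), (2, 7, 12), (4, 5, 8), (9, 10, 13)],
    "Wednesday": [(0, 5, 10), (1, 12, 13), (2, 3, 6), (4, 9, 14), (7, 8, 11)],
    "Thursday":  [(0, 6, 13), (1, 7, 14), (2, 4, 10), (3, 5, 11), (8, 9, 12)],
    "Friday":    [(0, 7, 9), (1, 2, 5), (3, 8, 13), (4, 6, 12), (10, 11, 14)],
    "Saturday":  [(0, 11, 12), (1, 8, 10), (2, 13, 14), (3, 4, 7), (5, 6, 9)],
}

# Resolvability: each day's five rows use every girl 0..14 exactly once.
every_day_is_a_partition = all(
    sorted(girl for row in rows for girl in row) == list(range(15))
    for rows in walks.values()
)

# Balance: count how often each pair of girls walks in the same row.
pair_counts = Counter(
    frozenset(pair)
    for rows in walks.values()
    for row in rows
    for pair in combinations(row, 2)
)

print(every_day_is_a_partition)
print(len(pair_counts), set(pair_counts.values()))   # 105 pairs, each counted once
```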
8.3.2 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.

1. Let D be a (v, b, r, k, λ)-design. Prove that $r = \frac{\lambda(v-1)}{k-1}$ and $b = \frac{\lambda v(v-1)}{k(k-1)}$. This implies that it is only necessary to specify the parameters v, k, and λ. Thus, the shorthand notation "(v, k, λ)-design" need not be limited to only symmetric BIBDs.
2. Assume the inequality b ≥ v in Fisher's theorem and use Theorem 8.11 to prove the inequality r ≥ k.
3. Prove the following lemma.
   LEMMA 8.6 If D is a nontrivial (v, b, r, k, λ)-design, then v > k and r > λ.
4. Can a BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) [D] (10, 15, 6, 4, 2)   (b) [D] (15, 12, 5, 3, 2)   (c) (8, 12, 4, 4, 1)   (d) (55, 99, 18, 10, 3)
5. Can a BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) (13, 18, 9, 3, 2)   (b) (27, 117, 13, 3, 1)
6. Can a BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) (21, 70, 10, 3, 1)   (b) (46, 22, 5, 28, 3)   (c) (9, 18, 8, 4, 3)   (d) (61, 305, 20, 4, 1)
7. Can a symmetric BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) (8, 7, 4)   (b) [D] (22, 7, 2)   (c) [D] (15, 7, 3)   (d) (27, 13, 6)
8. Can a symmetric BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) (13, 4, 2)   (b) (22, 13, 4)   (c) (12, 6, 3)   (d) (115, 19, 3)
9. Can a symmetric BIBD have the following sets of parameters? If it is not possible, explain why not. Otherwise, list the parameters as "potential."
   (a) (64, 34, 8)   (b) (78, 22, 6)   (c) (111, 11, 1)   (d) (34, 12, 4)
10. [D] The Bruck-Ryser-Chowla theorem can be used to show that a (43, 7, 1)-design cannot exist. The theorem requires a solution to an equation in x, y, and z. What is that equation for the parameters (43, 7, 1)? You need not attempt to show there are no integer solutions that are not all 0.
11. Prove, without using the Bruck-Ryser-Chowla theorem, that a (43, 7, 1)-design does not exist.
12. Construct a (13, 4, 1)-design.
13. Construct the incidence matrix of a (6, 15, 10, 4, 6)-design. You may find it helpful to think about the BIBD whose incidence matrix is shown in the following table.

         B1  B2  B3  B4  B5  B6  B7  B8  B9  B10 B11 B12 B13 B14 B15
    u1   1   1   1   1   1   0   0   0   0   0   0   0   0   0   0
    u2   1   0   0   0   0   1   1   1   1   0   0   0   0   0   0
    u3   0   1   0   0   0   1   0   0   0   1   1   1   0   0   0
    u4   0   0   1   0   0   0   1   0   0   1   0   0   1   1   0
    u5   0   0   0   1   0   0   0   1   0   0   1   0   1   0   1
    u6   0   0   0   0   1   0   0   0   1   0   0   1   0   1   1

14. A (13, 26, 12, 6, 5)-design.
    (a) Construct a (13, 26, 12, 6, 5)-design by starting with the blocks [0 1 3 6 7 11] and [0 1 2 3 7 11] and adding, mod 13.
    (b) What are the parameters for the complement of the design in part (a)? List the first five blocks in the complement design.
15. The design, D, shown is a symmetric (16, 6, 2)-design.

    B0   a b c d e f        B8   b e g i k l
    B1   c f g h i j        B9   a d h j k l
    B2   d f i k m n        B10  a c g l m n
    B3   c d i l o p        B11  a f g k o p
    B4   a e h i m o        B12  a b i j n p
    B5   b d g j m o        B13  d e g h n p
    B6   b c h k n o        B14  c e j k m p
    B7   b f h l m p        B15  e f j l n o

    (a) [D] List the parameters for the residual design, D*, of D, and then list the blocks of D*.
    (b) List the parameters for the derived design, D', of D, and then list the blocks of D'.
16. The design, D, shown is a symmetric (19, 9, 4)-design.

    B0   1 4 5 6 7 9 11 16 17         B10  0 2 7 8 11 14 15 16 17
    B1   2 5 6 7 8 10 12 17 18        B11  1 3 8 9 12 15 16 17 18
    B2   0 3 6 7 8 9 11 13 18         B12  0 2 4 9 10 13 16 17 18
    B3   0 1 4 7 8 9 10 12 14         B13  0 1 3 5 10 11 14 17 18
    B4   1 2 5 8 9 10 11 13 15        B14  0 1 2 4 6 11 12 15 18
    B5   2 3 6 9 10 11 12 14 16       B15  0 1 2 3 5 7 12 13 16
    B6   3 4 7 10 11 12 13 15 17      B16  1 2 3 4 6 8 13 14 17
    B7   4 5 8 11 12 13 14 16 18      B17  2 3 4 5 7 9 14 15 18
    B8   0 5 6 9 12 13 14 15 17       B18  0 3 4 5 6 8 10 15 16
    B9   1 6 7 10 13 14 15 16 18

    (a) List the parameters for the residual design, D*, of D, and then list the blocks of D*.
    (b) List the parameters for the derived design, D', of D, and then list the blocks of D'.
17. Let M be an n-by-n magic square. Create a combinatorial design by taking the individual rows, columns, and two diagonals of M as "blocks." Let the numbers 1 through n² be the varieties. Is this a BIBD? If so, determine the parameters. If not, explain why not.
18. Example 8.23 can be completed by performing the following steps.
    (a) Construct a symmetric (11, 5, 2)-design, D, by starting with the block, B1, whose varieties are 1, 3, 4, 5, and 9 (written as a single column) and adding 1s to the rows using mod 11 arithmetic.
    (b) Form a residual design, D*, from the design D in the previous step.
    (c) Interpret the results in terms of the original example.
19. Suppose two BIBDs with parameter sets (v, b1, r1, k1, λ1) and (v, b2, r2, k2, λ2) exist. What requirements are necessary for the union of their blocks to be a balanced incomplete block design? What will the parameters of the union be? (You can experiment with the (6, 10, 5, 3, 2)-design in Exercise 18 and its complement design.)
20. Let D be a (v, b, r, k, λ)-design with incidence matrix, M. Describe the matrix MM^T.
8.4 The Knapsack Problem
The introduction to this chapter mentioned the knapsack problem as an example of a combinatorial optimization problem. Combinatorial optimization problems seek a best solution from among many contending solutions to some problem. The knapsack problem seeks to pack a knapsack with a set of items. Each item has a benefit value (or utility or cost) and a size (or weight). The goal is to obtain the maximum benefit under the constraint that the knapsack has a finite capacity. These ideas are expressed formally in the next definition.

DEFINITION 8.14 The Knapsack Problem
The knapsack problem is concerned with a knapsack that has positive integer volume (or capacity) v. There are n distinct items that may potentially be placed into the knapsack. Item i has positive integer volume v_i and positive integer benefit b_i. In addition, there are q_i copies of item i available, where quantity q_i is a positive integer satisfying 1 ≤ q_i [...]

[...] b_1/v_1 ≥ b_2/v_2 ≥ ... ≥ b_n/v_n. If
$b_1 \lfloor v/v_1 \rfloor \ge b_2 (v/v_2)$
then it is possible to find an optimal packing that includes min{q_1, ⌊v/v_1⌋} copies of item 1.
Proof: The ratio b_i/v_i represents the benefit per unit of volume derived from packing item i. Since b_2/v_2 ≥ b_3/v_3 ≥ ... ≥ b_n/v_n, a knapsack that does not contain any item 1s can have no greater total benefit than b_2(v/v_2). This is because item 2 offers at least as great a benefit per unit of volume as do any of the remaining items. This estimate may be an overestimate because v/v_2 may not be an integer; however, the estimate will be at least as large as any achievable packing using an assortment of items other than item 1.

The knapsack can hold ⌊v/v_1⌋ copies of item 1, for an achievable benefit of b_1⌊v/v_1⌋. Hence, it will certainly be optimal to pack some copies of item 1 if
$b_1 \lfloor v/v_1 \rfloor \ge b_2 (v/v_2)$.

If the inequality is true, then packing as many copies of item 1 as are available and will fit into the knapsack will produce at least as large a total benefit as any combination of other items. The number of available copies of item 1 is q_1, and the number of copies that will fit into the knapsack is ⌊v/v_1⌋. In order to meet both the availability and the total volume constraints, the number of copies to pack must be limited to min{q_1, ⌊v/v_1⌋}. □
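The test in Theorem 8.16 can be sketched as a small function. This is an illustration under stated assumptions, not the text's own algorithm: the function name and the (benefit, volume, quantity) tuple layout are hypothetical, unlimited quantities are represented by `math.inf`, and the comparison is cross-multiplied so that only exact integer arithmetic is used.

```python
from fractions import Fraction
from math import inf


def copies_of_best_item(capacity, items):
    """items: list of (benefit, volume, quantity) tuples, at least two of them.

    Returns min(q1, floor(v / v1)) when b1 * floor(v / v1) >= b2 * (v / v2),
    where items 1 and 2 have the two largest benefit-to-volume ratios.
    Returns None when the inequality fails (the heuristic is silent).
    """
    ranked = sorted(items, key=lambda item: Fraction(item[0], item[1]), reverse=True)
    (b1, v1, q1), (b2, v2, _) = ranked[0], ranked[1]
    fits = capacity // v1
    # b1 * fits >= b2 * capacity / v2, multiplied through by v2 > 0
    if b1 * fits * v2 >= b2 * capacity:
        return min(q1, fits)
    return None


# Capacity 15; item X: benefit 18, volume 6, two copies; item Y: benefit 6, volume 3, three copies.
print(copies_of_best_item(15, [(18, 6, 2), (6, 3, 3)]))
# Capacity 9; unlimited X (benefit 11, volume 6) and Y (benefit 5, volume 4).
print(copies_of_best_item(9, [(11, 6, inf), (5, 4, inf)]))
```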
A few small examples will illustrate both the strengths and limitations of this theorem.

Applying the Heuristic
Consider a knapsack with capacity 15 and two potential item types. Table 8.34 shows the relevant information.

TABLE 8.34 The Knapsack Heuristic at Its Best
Item        X    Y
Benefit     18   6
Volume      6    3
Quantity    2    3
B/V Ratio   3    2

The inequality from Theorem 8.16 is
$18 \lfloor 15/6 \rfloor \ge 6 (15/3)$, that is, $36 \ge 30$,
which is true. It is therefore valid to pack min{2, ⌊15/6⌋} = 2 item X's. The knapsack will then have 15 − 12 = 3 units of volume left. A single item Y will fit, yielding a total benefit of 42.

31 An heuristic technique is one that suggests a course of action that is probably correct but is not guaranteed. Often the course of action is determined after some preliminary calculations are made.

A Second Application of the Heuristic
The knapsack in this example holds 13 units of volume. The other information is shown in Table 8.35.

TABLE 8.35 The Knapsack Heuristic is Silent
Item        X              Y             Z
Benefit     12             3             5
Volume      18             7             8
Quantities  ∞              ∞             ∞
B/V Ratio   2/3 ≈ .667     3/7 ≈ .429    5/8 = .625

The heuristic will treat item X as the first item, followed by Z and then Y. The inequality will compare X to Z:
$12 \lfloor 13/18 \rfloor \ge 5 (13/8)$, that is, $0 \ge 65/8 = 8.125$.
The inequality is false, so the heuristic provides no recommendation about whether to pack any item X. This is correct, since item X will not fit in the knapsack. The optimal solution requires one item Z to be packed.
The Heuristic Is Not Perfect
Consider a knapsack with volume 9 that can contain unlimited quantities of two types of items. Table 8.36 shows the essential information.

TABLE 8.36 The Knapsack Heuristic Misses
Item        X               Y
Benefit     11              5
Volume      6               4
Quantities  ∞               ∞
B/V Ratio   11/6 ≈ 1.833    5/4 = 1.25

The inequality from Theorem 8.16 is
$11 \lfloor 9/6 \rfloor \ge 5 (9/4)$, that is, $11 \ge 11.25$,
which is false. The heuristic has no advice to offer. However, it is easy to see that packing one item X is better than packing 2 item Y's (and it is not possible to pack one of each). The reason that the heuristic failed to suggest packing an item X is that it compares packing 1 item X (⌊9/6⌋ = 1) to packing 2.25 item Y's (9/4 = 2.25). However, it is not possible to pack one quarter of an item Y.
The previous example might suggest that the heuristic could be improved by comparing b_1⌊v/v_1⌋ to b_2⌊v/v_2⌋. However, this would invalidate the theorem, since it does not account for the possibility 32 that packing [...] item 2s and [...] item 3s might be better than packing ⌊v/v_1⌋ item 1s. Exercise 13 in Exercises 8.4.1 illustrates this phenomenon.
Quick Check 8.15
1. Consider the knapsack problem from Quick Check 8.14. The knapsack holds 10 units of volume. Apply the heuristic from Theorem 8.16 as often as possible.

Item      W   X   Y   Z
Benefit   3   4   5   1
1
3
1
Quantity
1
The final example for this section applies the knapsack reduction heuristic to Example 8.29.

Reducing a 0-1 Knapsack Problem
The knapsack in Example 8.29 has a capacity of 1000 units of volume. The items can be ranked by descending benefit-to-volume ratio (Table 8.37).

TABLE 8.37 Potential Items for a 0-1 Knapsack with Capacity 1000
Item #      H      C      F      I      A      B      D      E      G
Benefit     9      11     1      10     13     8      16     4      1
Volume      120    190    20     220    340    210    450    120    60
Quantity    1      1      1      1      1      1      1      1      1
B/V Ratio   .0750  .0579  .0500  .0455  .0382  .0381  .0356  .0333  .0167

Table 8.38 shows how Theorem 8.16 can be used to successively reduce the knapsack. At each stage, a smaller knapsack, having fewer potential item types, can be considered. The process will terminate as soon as the heuristic inequality becomes false. Notice that since q_i = 1 for all i, the expression min{q_1, ⌊v/v_1⌋} will evaluate to 1 whenever the inequality suggests packing some item 1s.

TABLE 8.38 Applying the Knapsack Heuristic to Example 8.29
Compare Items   Heuristic Inequality                            Conclusion
H and C         9⌊1000/120⌋ ≥ 11(1000/190);  72 ≥ 57.895        pack item H; volume to fill = 880
C and F         11⌊880/190⌋ ≥ 1(880/20);     44 ≥ 44            pack item C; volume to fill = 690
F and I         1⌊690/20⌋ ≥ 10(690/220);     34 ≥ 31.364        pack item F; volume to fill = 670
I and A         10⌊670/220⌋ ≥ 13(670/340);   30 ≥ 25.618        pack item I; volume to fill = 450
A and B         13⌊450/340⌋ ≥ 8(450/210);    13 ≥ 17.143        no recommendation; volume to fill = 450

32 This is just one among a great many alternatives that have not been accounted for by the heuristic.
The sequence of reductions has led to a knapsack that contains items C, F, H, and I, having a total benefit of 31 and a remaining volume of 450. The heuristic has been unable to decide whether to add item A. (The solution in Example 8.30 on page 458 does contain item A.) At this point, Theorem 8.16 is no longer useful. It is now time to use the Knapsack algorithm. However, instead of using tables with 1001 rows and 9 columns, the algorithm will only need 451 rows and 5 columns. The algorithm indicates that item D should be packed, yielding a benefit of 16. The final solution is thus to pack items C, D, F, H, and I, for a total benefit of 47. This is the same optimal benefit as was mentioned in Example 8.30, but the collection of items is different.

The heuristic reduction technique introduced in Theorem 8.16 partially answers the question "is it possible to find an optimal packing that includes some item 1s," where item 1 is the item with the largest benefit-to-volume ratio. The answer is partial because the theorem compares adding only item 1s against a (possibly unattainable) collection of item 2s. The heuristic is not powerful enough to consider a mixed collection that includes both item 1s and other items, nor is it powerful enough to handle every attainable collection of items 2 through n. Exercise 15 in Exercises 8.4.1 asks you to consider an alternative heuristic that does not use the floor function.
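The successive reduction in Table 8.38 can be sketched as a loop that applies the inequality of Theorem 8.16 until it fails. The code below is an illustration with hypothetical names; every quantity is 1, as in the example. The final step uses a standard 0-1 knapsack dynamic program as a stand-in, which is an assumption and not necessarily the text's algorithm Knapsack.

```python
from fractions import Fraction

# (benefit, volume) for each item in Table 8.37; every quantity is 1.
items = {"A": (13, 340), "B": (8, 210), "C": (11, 190), "D": (16, 450), "E": (4, 120),
         "F": (1, 20), "G": (1, 60), "H": (9, 120), "I": (10, 220)}


def reduce_knapsack(capacity, items):
    """Pack the best-ratio item while b1*floor(v/v1) >= b2*(v/v2) holds."""
    remaining, packed = dict(items), []
    while len(remaining) >= 2:
        ranked = sorted(remaining, key=lambda name: Fraction(*remaining[name]), reverse=True)
        (b1, v1), (b2, v2) = remaining[ranked[0]], remaining[ranked[1]]
        fits = capacity // v1
        if fits == 0 or b1 * fits * v2 < b2 * capacity:
            break                       # the heuristic makes no recommendation
        packed.append(ranked[0])        # quantity is 1, so pack a single copy
        capacity -= v1
        del remaining[ranked[0]]
    return packed, capacity, remaining


def best_01_benefit(capacity, items):
    """Standard 0-1 knapsack table: best[c] is the best benefit within volume c."""
    best = [0] * (capacity + 1)
    for benefit, volume in items.values():
        for c in range(capacity, volume - 1, -1):
            best[c] = max(best[c], best[c - volume] + benefit)
    return best[capacity]


packed, volume_left, leftover = reduce_knapsack(1000, items)
reduced_benefit = sum(items[name][0] for name in packed)
print(packed, volume_left, reduced_benefit)
print(reduced_benefit + best_01_benefit(volume_left, leftover))
```

The reduced problem has capacity 450 and five candidate items, which matches the 451-row, 5-column table size mentioned above.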
8.4.1 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.

1. [D] Compare the greedy algorithm, the sophisticated greedy algorithm, and algorithm Knapsack for the following knapsack problem. The knapsack has capacity 7.

   Item       X   Y   Z
   Benefit    4   3   2
   Volume     5   4   3
   Quantity   1   1   1

2. Compare the greedy algorithm, the sophisticated greedy algorithm, and algorithm Knapsack for the following knapsack problem. The knapsack has capacity 8.

   Item       X   Y   Z
   Benefit    2   5   4
   Volume     4   6   4
   Quantity   1   1   1

3. Compare the greedy algorithm, the sophisticated greedy algorithm, and algorithm Knapsack for the following knapsack problem. The knapsack has capacity 12.

   Item       X   Y   Z
   Benefit    7   13  8
   Volume     7   8   5
   Quantity   1   1   1

4. Use algorithm Knapsack to solve the following problems. Do not use Theorem 8.16.
   (a) [D] The knapsack has volume 10.

   Item       X   Y   Z
   Benefit    4   5   1
   Volume     3   4   2
   Quantity   2   2   2

   (b) The knapsack has volume 10.

   Item       X   Y   Z
   Benefit    4   5   1
   Volume     3   4   2
   Quantity   1   2   1

   (c) The knapsack has volume 10.

   Item       X   Y   Z
   Benefit    4   5   3
   Volume     3   5   2
   Quantity   2   2   1

5. Use algorithm Knapsack to solve the following problems. Do not use Theorem 8.16.
   (a) The knapsack has volume 12.

   Item       W   X   Y   Z
   Benefit    8   5   7   3
   Volume     6   3   5   2
   Quantity   2   2   2   4
(b) The knapsack has volume 12. Item
(2), and stationary (1). You will not allow yourself to use any other payment method besides cash (i.e., you will only use the money in your pocket). Set up a complete knapsack problem to help find an optimal selection of items at the mall for your shopping spree. Optimality is determined strictly by the ratings. Assume that you are willing to purchase more than
W
X
Y
Z
Benefit
8
5
7
3
Volume
6
3
5
2
Quantity
2
2
2
2
one of each type of item. You need not solve the problem unless you have access to computer software that implements a knapsack algorithm. 9. Suppose that a farmer wants to maximize the profit obtained from his crops. The profits per acre for his crops have been determined and are as follows: soybeans ($100), wheat ($110), cotton ($78), hay ($69), corn ($120), barley
(c) The knapsack has volume 12. Item
W
X
Y
Z
Benefit
7
4
6
4
Quantity
1
2
1
2
6. Use algorithm Knapsack to solve the following problems. Do not use Theorem 8.16. (a) The knapsack has volume 16. Item
V
W
Benefit
7
6
X 5
Volume
4
5
Quantity
I
1
6
Y 7 6
Z 3 3
1
2
3
(b) The knapsack has volume 16.
($71), peanuts ($88), sugar beets ($58), tomatoes ($90), and radishes ($72). The numbers of hours required to tend each acre of the specified crops are also known. They are soybeans (60), wheat (70), cotton (50), hay (45), corn (79), barley (43), peanuts (60), sugar beets (50), tomatoes (70), and radishes (66). The farmer only has 2500 hours of labor available. Set up a complete knapsack problem that will help the farmer find an optimal selection of crops to plant in order to maximize his profit. You need not solve the problem unless you have access to computer software that implements a knapsack algorithm. 10. As a birthday gift, you are going to make your friend a quilt. The quilt will be formed using colored squares of fabric. You cut the squares of fabric in different sizes (but all squares of
Benefit
3
4
6
2
2
Volume
4
3
6
2
3
Quantity
1
2
2
1
3
7. A dieter has decided that dinner must contain at most 600 calories. The food choices at a small fast-food court are limited. They include taco (180 calories), burrito (380), tostado (300), fish (170), hamburger (270), french fries (250), garden salad (100), garden salad with dressing (270), chicken nuggets (190), and hush puppy (60). The dieter has rated sfrom I to 10 (with 10 being best), The ratings are taco (8), burrito (10), tostado (7), fish (4), hamburger (7), french fries (6), garden salad (4), garden salad with dressing (7), chicken nuggets (6), hush puppy (2). The dieter is very decided tohs ignore the values and fat hungry and has hungy dcide ad toignre he health ealt vauesandfat limit the total calories to content of the various foods but still ill 00.Setupknpsak coplee at most 600. Set atup mst a complete knapsack poblm problem tat that will help the dieter find an optimal selection of foods. Optimality is determined by the ratings, not the health benefit. You need not solve the problem unless you have access to computer software that implements a knapsack algorithm. 8. ODSuppose you are going on a shopping spree with $100 in your pocket. After arriving at the mall, you find many items that you wish to purchase. They include a watch ($30), a skirt ($20), a mug ($10), a book ($14), a swim suit ($21), a ring ($55), a pair of earrings ($12), a backpack ($28), and a box of stationary ($5). You have rated these items from I to 5 (with 5 being best). The ratings are watch (5), skirt (4), mug (3), book (3), swim suit (3), ring (5), earrings (2), backpack
the same color are the same size). The areas in square centimeters for the distinct colors are blue (144), green (121), yellow (100), purple (81), orange (144), red (81), turquoise (64), gold (49), and white (64). You have asked your friend to assign a rating to each color from 1 to 10 (with 10 being best). The results are recorded next: blue (3), green (7), yellow (8), purple (9), orange (6), red (7), turquoise (5), gold (6), and white (7). You want the quilt to be at most 1600 square centimeters in area. Set up a complete knapsack problem that will help you find an optimal selection of colored squares for the quilt. Optimality is determined by the color ratings given by your friend. You need not solve the problem unless you have access to computer software that implements a knapsack algorithm.

11. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) Every knapsack problem has a unique optimal packing.
(b) If everything else is equal, an unbounded knapsack problem might have a higher optimal benefit than a bounded version of the same knapsack problem.
(c) The greedy algorithm will always give a less than optimal solution.
(d) Every unbounded knapsack problem can be solved using the bounded knapsack algorithm.
12. A young high school math teacher has just come home after an evening of parent-teacher conferences. She has two hours before bedtime and more items on her to-do list than can possibly fit into two hours. The to-do list contains the following items: Item
(a) The knapsack has volume 10. Item Benefit
X 4
Y 5
Z 1
3 2
4 2
2 2
Volume Quantity
Estimated Time
T: watch tape of favorite TV show (skipping over ads) Q: grade a set of quizzes
50 minutes
B: phone boyfriend and talk
40 minutes
Item
W
X
Y
Z
L: make up lesson plans for the following week C: clean the bathroom E: go to the gym for a workout
70 minutes
Benefit Volume
8 6
5 3
7 5
3 2
Quantity
2
2
2
4
P: phone parents and talk on the phone
20 minutes
(b) The knapsack has volume 12. 20 minutes
30 minutes 70 minutes
(a) Her initial instinct is to rank the items on the to-do list on a scale of 1 to 10 and quickly solve a knapsack problem to find the optimal items to fill the two hours. The rankings are T6 Q7 B9 L8 C3 E4 P5. Determine the set of optimal items, the total benefit, and
how much free time she will have. Do not use Theo-
15. An alternative to the heuristic in Theorem 8.16 can be derived that does not use the floor function [84, p. 1028]. The alternative is representative of an approach to mathematical problem solving that seeks to eliminate some messy mathematical calculations by making simplifying assumptions early in the process. What is gained in this approach is a problem that may be easier to solve. However, there is a price to pay: The solution may not be as powerful as it could be. The alternate heuristic makes an additional simplification. The simplification starts by noticing that $\lfloor v/v_1 \rfloor$ will
rem 8.16. (Hint: Notice that the estimated times are in 10-minute intervals.) (b) After looking at her list of rankings, she noticed that they are all distinct. Therefore, it is possible to list the tasks in order from 1 to 7, with 7 being the most beneficial. Her first instinct is that this should provide the same solution. Her instinct is wrong. The new rankings are
truncate the fraction by some number that is less than 1 (but possibly a value very close to 1). The floor function can be eliminated by using $\frac{v}{v_1} - 1$ as an estimate for the number of copies of item 1 that can fit into the knapsack. This leads to the inequality
T4 Q5 B7 L6 C1 E2 P3. Determine the set of optimal items, the total benefit, and how much free time she will have with this second set of rankings. Do not use Theorem 8.16. (Hint: Notice that
V2 / (V (a) Prove the following theorem. Note that the first inequality is strict.
An Alternative Knapsack Reduction
the estimated times are in 10-minute intervals.) (c) Extra credit: Explain the discrepancy between the two solutions. 13. P Consider a knapsack with volume 9 that can contain unlimited quantities of three types of items. The following table shows the essential information, Item
X
Y
Z
Benefit
11
5
1.1
Volume
8.5
4
1
Quantities B/V Ratio
Suppose a knapsack has volume v and there are n types of items that can be packed. Denote the item benefits by {bl, b2 ... , bn}, the item volumes by {v1 , V2,..... V0} and the item quantities by {qi, q2 ... ,qn }. Assume that the items have already been ordered so that
1.294
1.1
> b 2 > b3
b 1 f
1.25 4 8.5
_l)>b2V'
b(
.
3
2 >
b Vn
bl 1
V2
y b2 > Show that the inequality Li- _ V2 doesnotprovide a valid heuristic for deciding whether to include an item I.
(b) Use both heuristics to do one reduction for the follow-
14. Use Theorem 8.16 to reduce the following knapsack problems as much as possible.
ing knapsacks. Then compare their suggestions with the optimal packing. Assume all problems are unbounded.
[
then it is possible to find an optimal packing that includes at least one item 1.
i. The knapsack has volume 10.

Item  Benefit  Volume
X     50       10
Y     20       5

ii. The knapsack has volume 10.

Item  Benefit  Volume
X     40       10
Y     16       4

iii. The knapsack has volume 10.

Item  Volume
X     4
Y     10

iv. The knapsack has volume 10.

Item  Benefit  Volume
X     30       6
Y     20       5

v. The knapsack has volume 15.

Item  Benefit  Volume
X     18       6
Y     6        3

(c) Compare the heuristics in Theorems 8.16 and 8.17. What, if anything, has been gained by making the additional assumption? What, if anything, has been lost?
8.5 Error-Correcting Codes
With the advent of computer technology, information has become one of our primary commodities. This information needs to be produced, stored, transferred, and consumed. The processes of storing and transmitting this information occasionally introduce errors into the information. The mathematical field of error-correcting codes (often called coding theory) seeks to provide mechanisms to combat this problem.

Computer-compatible information is stored using a binary representation. This means that we use only two symbols, 0 and 1. Information is stored as strings of bits.³³ For example, the letter "a" is often stored as the binary string "1100001."³⁴ Errors occur when a 0 gets changed to a 1, or a 1 gets changed to a 0, or bits are added or removed from the string. This can be caused by noise on a phone line, electrical interference in the atmosphere, tapes or disks getting old, electronic component failure, or other reasons.

Coding theory has combined some sophisticated theoretical mathematics with some intuitive ideas to ensure reliable transmission and storage of information. The primary intuitive idea is similar to the manner in which human languages ensure reliable information exchange: Add redundancy. In English, we interpret poorly communicated words by considering the context. Teh strktr of th langage und contxtul kluus hlp us recnstrut the originla messg. In the same way, by adding cleverly chosen extra bits to a binary string before it is transmitted or stored, we can reconstruct the original form of a garbled message.

Many different techniques exist to add these redundant bits. One relatively simple technique is a 7-bit Hamming code. A preliminary definition will be helpful.

DEFINITION 8.15 Binary String
A binary string of length n is a sequence of n symbols, where each symbol is either a "0" or a "1."
Some Binary Strings
The following are binary strings of length 7:

0100100  1111000  0000000  1010101

The following list contains all binary strings of length 3.

000  001  010  011  100  101  110  111

³³ A bit is a binary digit, that is, a 0 or a 1. Strings are defined more fully in Definition 9.3 on page 000.
³⁴ Because only the symbols "0" and "1" are allowed, it is necessary to use several 0s and 1s strung together in order to distinguish the letter "a" from the other letters and punctuation symbols.
PROPOSITION 8.5 There are $2^n$ binary strings of length n.

Proof: There are n positions in the string, each with two possible values. Since the choices are independent, general counting principle I applies (page 214).
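The count in Proposition 8.5 can be checked by brute force for small n. The following Python sketch is only an illustration; the helper name `binary_strings` is not from the text.

```python
from itertools import product


def binary_strings(n):
    """List every binary string of length n, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


print(binary_strings(3))
# ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(binary_strings(7)))  # 128, which is 2**7
```

Each position contributes an independent choice between "0" and "1", which is exactly the product that `itertools.product` enumerates.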
8.5.1 The 7-Bit Hamming Code
A 7-bit Hamming code is a set of 16 binary strings of length 7. In each string, the first 4 bits contain an encoded message; the last 3 bits are cleverly chosen redundant bits. With this code, there are 4 bits used to carry a message, so there are 16 different messages (such as 0010, 1101, and 1001). All 16 possible messages are listed in Table 8.39. I have arbitrarily assigned meanings to each of the messages. You could assign any other meanings you wished to communicate.

TABLE 8.39 Encoding Sixteen Messages
0000  hello          1000  I'm anxious
0001  goodbye        1001  yes
0010  send money     1010  no
0011  send pizza     1011  maybe
0100  I'm hungry     1100  will you marry me?
0101  I'm in love    1101  I like snow
0110  I'm happy      1110  I like rain
0111  I'm sad        1111  I like sunny days
Before a message is transmitted, it will be encoded by adding three additional bits, called check bits. The encoding process starts with messages (which are 4 bits long) and adds 3 check bits (the 3 values depend on the current message), creating a 7-bit string called a code word. The 7-bit code word is then transmitted. The person receiving the transmission decodes the 7-bit string she receives, ending up with the original 4-bit message (unless two or more errors have occurred).

This process works because the 16 code words all differ in at least 3 places.³⁵ If any one of the 7 bits is changed, the received string will be closer to the original code word than to any of the other 15 code words.³⁶ The received message can be decoded by choosing the code word that requires the fewest changes to the received string to obtain a match.³⁷ The message will be the first 4 bits of that code word. The encoding process is diagrammed as follows.

"send money" 0010 (message)  →  encode  →  0010 110 (message + check bits)  →  0010110 (code word)

The code word is then transmitted. Suppose that the first bit is accidentally changed during transmission and a different string is received. The decoding process is as follows (and is later explained in detail).

1010110 (received string, not a code word)  →  decode  →  0010110 (code word)  →  extract message  →  0010 "send money"
³⁵ There are sixteen 7-bit code words, but 128 possible strings of seven 0s and 1s. The verification that the code words differ in at least three places will be presented later in this section.
³⁶ Closeness is measured by counting the number of places two strings differ. Strings with small total differences are considered to be close. This notion will be formalized in Definition 8.16 on page 474.
³⁷ For example, the received string 1010110 requires only one change to match the code word 0010110 but two changes to match the code word 1010101.
If no errors occur, the decoding process will verify that the received string is actually a code word (which will then be assumed to be the code word sent). Imagine a table that lists messages and their corresponding code words. A partial version is shown in Table 8.40. When a string is received, decoding essentially consists of finding the code word that is closest to the received string and then assuming that the corresponding message was sent.

TABLE 8.40 A Partial Listing of Messages and Corresponding Code Words
mesg  code word    mesg  code word    mesg  code word    mesg  code word
0000               1000               0100               1100
0001               1001  1001100      0101               1101  1101001
0010  0010110      1010  1010101      0110               1110
0011               1011               0111               1111
As an example, suppose the message 1101 (I like snow) has been sent. The message must be changed into a code word before it is transmitted. In this case, the code word is 1101001 (the details will be presented soon). The transmitted code word will be 1101001. Now suppose that the transmission arrives incorrectly. For instance, assume that the second bit of the code word is changed from a 1 to a 0. The person receiving the transmission sees 1001001, which is not one of the 16 possible code words. (The code word for the message 1001 is 1001100.) There is only one code word of the 16 that does not differ from the received string in more than 1 bit. That code word is 1101001. The person receiving the transmission should assume that 1101001 was sent, decoding the first 4 bits as the message 1101.

This process of manually comparing received strings with a list of the possible code words is not too tedious for this code, but useful codes have many more code words (each of which is a long string of bits), so a more efficient way to encode and decode is needed. The technique presented next can be used with larger Hamming codes. In fact, a computer can be used to do the encoding and decoding.

Encoding and Decoding the 7-Bit Hamming Code
The encoding and decoding process can be accomplished using mod 2 arithmetic. The addition table for addition mod 2 is presented in Table 8.41.³⁸ The critical observation is that $1 + 1 = 0 \pmod 2$. This can be extended to multiple additions. Thus $1 + 0 + 1 + 1 = 1 \pmod 2$. In essence, the result is 1 if there is an odd number of 1s and 0 otherwise.

TABLE 8.41 Addition mod 2
+ | 0  1
0 | 0  1
1 | 1  0

Encoding: Denote the bits in the original message by $x_1$, $x_2$, $x_3$, and $x_4$. The check bits are denoted $x_5$, $x_6$, and $x_7$. (The subscripts indicate the position of the bit. Thus, $x_3$ represents the third bit.) The check bits for the code word $x_1x_2x_3x_4x_5x_6x_7$ are chosen to make the following equations true.

$x_5 = x_2 + x_3 + x_4 \pmod 2$
$x_6 = x_1 + x_3 + x_4 \pmod 2$
$x_7 = x_1 + x_2 + x_4 \pmod 2$

³⁸ This should not be confused with the binary (or base 2) addition table, which looks like

+ | 0  1
0 | 0  1
1 | 1  10

where 10 is the binary representation for the number 2.
Thus the message 0010 (send money) becomes the code word 0010110, since

$x_5 = 0 + 1 + 0 = 1 \pmod 2$
$x_6 = 0 + 1 + 0 = 1 \pmod 2$
$x_7 = 0 + 0 + 0 = 0 \pmod 2$.

Notice that the formula for $x_5$ adds all the message bits other than $x_1$, the formula for $x_6$ adds all the message bits except $x_2$, and the formula for $x_7$ adds all the message bits except $x_3$.
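The three encoding equations translate directly into code. The following Python sketch is an illustration only; the function name `encode` and the choice to represent bit strings as Python `str` values are assumptions, not part of the text.

```python
def encode(message):
    """Encode a 4-bit message such as "0010" as a 7-bit code word."""
    x1, x2, x3, x4 = (int(bit) for bit in message)
    x5 = (x2 + x3 + x4) % 2  # every message bit except x1
    x6 = (x1 + x3 + x4) % 2  # every message bit except x2
    x7 = (x1 + x2 + x4) % 2  # every message bit except x3
    return message + f"{x5}{x6}{x7}"


print(encode("0010"))  # 0010110 (send money)
print(encode("1101"))  # 1101001 (I like snow)
```

The two printed code words agree with the ones worked out by hand in this section.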
Decoding: To see how to decode, consider what happens to the check bits if there is an error in exactly one of the 4 original message bits. For example, if bit $x_1$ is changed from 0 to 1 or from 1 to 0, but $x_2$, $x_3$, and $x_4$ are the same, then $x_5$ won't change, but $x_6$ and $x_7$ will. (In the formulas for computing the check bits, the formula for $x_5$ does not contain $x_1$, but the formulas for $x_6$ and $x_7$ do contain $x_1$. Thus, if $x_1$ changes, so will $x_6$ and $x_7$.) Table 8.42 summarizes the results.

TABLE 8.42 Errors in a 7-Bit Hamming Code
an error in bit    causes changes in
$x_1$              $x_6$ and $x_7$
$x_2$              $x_5$ and $x_7$
$x_3$              $x_5$ and $x_6$
$x_4$              $x_5$, $x_6$, and $x_7$
When a transmitted string is received, the check bits can be recalculated. If the recalculated check bits are the same as the transmitted check bits, it is reasonable to assume the message arrived safely. If the new check bits are different, then an error has occurred. If only one check bit is different, the error occurred in sending that check bit, so the error can be ignored. If two or three check bits differ, use Table 8.42 to see which message bit was erroneously changed.

It will be helpful to introduce some additional notation. A superscript will be used to distinguish between the check bits that are received and those that are calculated from the received message bits. A superscript r will denote the received bit, and a superscript c will denote the calculated bit.

Suppose the bit string 0110110 has been received. The 4 message bits (0110) appear to be the message "I'm happy." The transmitted check bits are $x_5^r = 1$, $x_6^r = 1$, and $x_7^r = 0$. The computed check bits (using the 4 message bits 0110) are $x_5^c = 0$, $x_6^c = 1$, and $x_7^c = 1$. Thus the transmitted and computed check bits for $x_5$ and $x_7$ differ. Table 8.42 shows that $x_2$ is incorrect. The message actually sent was probably 0010, "send money."

Tables 8.43 and 8.44 summarize encoding and decoding for the 7-bit Hamming code. An alternative approach is presented in Problem 15 in Exercises 8.5.4.

TABLE 8.43 Encoding a 7-Bit Hamming Code
$x_5 = x_2 + x_3 + x_4 \pmod 2$
$x_6 = x_1 + x_3 + x_4 \pmod 2$
$x_7 = x_1 + x_2 + x_4 \pmod 2$

TABLE 8.44 Decoding a 7-Bit Hamming Code
Differing Check Bits       Error in Bit
$x_6$ and $x_7$            $x_1$
$x_5$ and $x_7$            $x_2$
$x_5$ and $x_6$            $x_3$
$x_5$, $x_6$, and $x_7$    $x_4$
$x_5$                      $x_5$
$x_6$                      $x_6$
$x_7$                      $x_7$
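The decoding rule (recompute the check bits, compare them with the received ones, and look up which bit to flip) can be sketched in Python as follows. The names `ERROR_POSITION` and `decode` are illustrative assumptions. The sketch assumes at most one bit was changed in transmission, as the decoding table does.

```python
# Which check bits (x5, x6, x7) differ -> position of the erroneous bit.
ERROR_POSITION = {
    (0, 1, 1): 1, (1, 0, 1): 2, (1, 1, 0): 3, (1, 1, 1): 4,
    (1, 0, 0): 5, (0, 1, 0): 6, (0, 0, 1): 7,
}


def decode(received):
    """Return the most likely 4-bit message for a received 7-bit string."""
    bits = [int(bit) for bit in received]
    x1, x2, x3, x4 = bits[:4]
    computed = [(x2 + x3 + x4) % 2, (x1 + x3 + x4) % 2, (x1 + x2 + x4) % 2]
    # 1 marks a check bit whose received and computed values differ.
    differ = tuple(r ^ c for r, c in zip(bits[4:], computed))
    if any(differ):
        bits[ERROR_POSITION[differ] - 1] ^= 1  # flip the erroneous bit
    return "".join(str(bit) for bit in bits[:4])


print(decode("0110110"))  # 0010 (send money)
print(decode("1001001"))  # 1101 (I like snow)
```

Both calls reproduce the hand calculations in this section: the first repairs an error in the second bit of 0010110, the second an error in the second bit of 1101001.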
Quick Check 8.16
1. Encode the following messages into code words in the 7-bit Hamming code.
(a) 1101
(b) 0001
2. Assume that the 7-bit Hamming code is being used. Decode the following received transmissions. (Find the message that was most likely to have been sent.)
(a) 0011001
(b) 0011011
(c) 1011001
8.5.2 A Formal Look at Coding Theory
The previous section introduced many of the major ideas in coding theory. It is now time to formalize some of these ideas. This process will start by introducing several definitions.

DEFINITION 8.16 Hamming Distance
The Hamming distance between two binary strings, u and v, having common length, n, is the number of positions in which u and v differ. The Hamming distance is denoted by Hd(u, v).

DEFINITION 8.17 Hamming Weight
The Hamming weight of a binary string is the number of 1s in the string. The Hamming weight of the string, u, is denoted by Hw(u).

DEFINITION 8.18 Adding and Subtracting Binary Strings
Let u and v be binary strings with common length, n. The sum and difference of the two strings are denoted u + v and u - v, respectively. Both operations are defined as the bitwise (mod 2) sum. That is, the bit in position k of u + v is the same as the bit in position k of u - v and has the value $u_k + v_k \pmod 2$.
A simple example of the previous definition may be helpful. Let u = 1010 and let v = 1100. Then u + v = u - v = w, where $w_1 = 1 + 1 = 0 \pmod 2$, $w_2 = 0 + 1 = 1 \pmod 2$, $w_3 = 1 + 0 = 1 \pmod 2$, and $w_4 = 0 + 0 = 0 \pmod 2$. Thus, u + v = u - v = 0110.
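Definitions 8.16 through 8.18 are short enough to state as code. The Python sketch below is illustrative; the function names are assumptions, and binary strings are represented as Python `str` values.

```python
def hamming_distance(u, v):
    """Number of positions in which the equal-length strings u and v differ."""
    if len(u) != len(v):
        raise ValueError("strings must have a common length")
    return sum(a != b for a, b in zip(u, v))


def hamming_weight(u):
    """Number of 1s in the binary string u."""
    return u.count("1")


def add_strings(u, v):
    """Bitwise (mod 2) sum of u and v; subtraction gives the same result."""
    if len(u) != len(v):
        raise ValueError("strings must have a common length")
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(u, v))


print(add_strings("1010", "1100"))       # 0110
print(hamming_distance("1010", "1100"))  # 2
print(hamming_weight("0110"))            # 2
```

In this small example the distance between u and v equals the weight of u + v, since a position holds a 1 in the sum exactly when the two strings differ there.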
DEFINITION 8.19 Binary Error-Correcting Code
A binary error-correcting code is a nonempty subset, C, of the set of all binary strings having length, n. Let $|C| = M$ and let $d = \min_{u \ne v \in C} Hd(u, v)$. C is characterized by the parameters n, M, and d and is referred to as an (n, M, d) code. Parameter d is called the minimum distance of the code. The binary strings that are the elements of the code are called code words.

An error-correcting code is linear if the sum (or difference) of any two code words is also a code word. If C is a linear code, then the string consisting of n 0s is a member of C. (Let $u \in C$ be any code word. Then u - u, the string of n 0s, is also in C.)

Note that the phrase code word applies to the encoded string with the extra redundancy bits. It does not refer to the embedded message.
The 7-Bit Hamming Code Revisited
The 7-bit Hamming code is a (7, 16, 3) linear code. It is easy to see that

Hd(1001100, 1010101) = 3
Hd(0010110, 1001100) = 4
0010110 - 1001100 = 1011010
Hw(1011010) = 4.

It appears to be fairly tedious to verify that the minimum distance between any two code words is 3: There are $\binom{16}{2} = 120$ pairs of distinct code vectors, u and v, for which Hd(u, v) needs to be calculated. Corollary 8.3 provides a simpler, alternative calculation that is valid for linear codes.

To see that the 7-bit Hamming code is linear, suppose that $x = x_1x_2x_3x_4x_5x_6x_7$ and $y = y_1y_2y_3y_4y_5y_6y_7$ are code words. Then w = x + y is another code word if it satisfies the three encoding equations. The value of position k of w is $w_k = x_k + y_k \pmod 2$. Since x and y are both code words, $x_5 = x_2 + x_3 + x_4 \pmod 2$ and $y_5 = y_2 + y_3 + y_4 \pmod 2$. Thus (using some simple properties of sums mod 2)

$w_5 = x_5 + y_5 \pmod 2$
$\quad = ((x_2 + x_3 + x_4 \pmod 2) + (y_2 + y_3 + y_4 \pmod 2)) \pmod 2$
$\quad = ((x_2 + y_2) + (x_3 + y_3) + (x_4 + y_4)) \pmod 2$
$\quad = w_2 + w_3 + w_4 \pmod 2$.

Similar calculations verify that $w_6$ and $w_7$ also satisfy the encoding equations.
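The 120 pairwise comparisons are tedious by hand but immediate by machine. The following Python sketch (all names are illustrative, not from the text) rebuilds the 16 code words from the encoding equations, then brute-forces both the minimum distance and the closure-under-addition property.

```python
from itertools import combinations, product


def encode(message):
    """Append the three check bits to a 4-tuple of message bits."""
    x1, x2, x3, x4 = message
    checks = ((x2 + x3 + x4) % 2, (x1 + x3 + x4) % 2, (x1 + x2 + x4) % 2)
    return message + checks


def distance(u, v):
    """Hamming distance between two equal-length tuples of bits."""
    return sum(a != b for a, b in zip(u, v))


def add(u, v):
    """Bitwise (mod 2) sum of two equal-length tuples of bits."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


code_words = [encode(m) for m in product((0, 1), repeat=4)]
code_set = set(code_words)
pairs = list(combinations(code_words, 2))

min_distance = min(distance(u, v) for u, v in pairs)
is_linear = all(add(u, v) in code_set for u, v in pairs)
print(len(pairs), min_distance, is_linear)  # 120 3 True
```

This is a check by exhaustion, not a substitute for the argument above; it confirms the parameters (7, 16, 3) for this particular set of encoding equations.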
Hamming Distance and Hamming Weight Let u, v, and w be any binary strings having common length, n. Then
1. Hd(u, v)
= Hw(u - v) 2. Hd(u, v) t must satisfy t M •
(n)
2'(E.) = 3n" i=0
iv. 1101
(b) Use the alternative approach to decode the following received strings. (Find the most likely transmitted code word; then determine the corresponding message.) i. 1011101
correcting
18. Use Corollary 8.4 to determine the maximum number of code words possible in each of the following codes.
(a) Use the alternative approach to encode the following
1001
t-error
(c) Prove the following ternary version of the sphere packing
X6r
i.
binary
(n, M, 2t +1) code must satisfy
(x)=(
(c2
ii.
(d) Show that the ternary (11, $3^6$, 5) Golay code satisfies the ternary sphere packing condition.
20. A major reason for error-correcting codes is that some of
0110011
the bits in a message might get changed during transmission
iv. 1100000
(from a 0 to a 1 or from a 1 to a 0). Suppose the probability that a single bit is changed is p, and the probability of a change in 1 bit is independent of any changes in other bits. (This is not always a valid assumption. Some channels introduce errors in bursts.)
(c) Explain why this alternative approach correctly implements the 7-bit Hamming code. Don't forget to discuss the way the decoding technique determines which bit is in error.
(a) List the code words.
(a) What is the probability (as a function of p) of exactly 1 error occurring in a transmission that is 7 bits long? What is the probability if p = .1? What is the probability if p = .5?
(b) What are the parameters of the code?
(b) What is the probability (as a function of p) of exactly 2 errors occurring in a transmission that is 7 bits long?
16. Use Construction 8.11 to create an error-correcting code from the (5, 10, 4, 2, 1)-design in Example 8.25 on page 450.
(c) How many errors can the code correct? (d) Estimate the efficiency of the code. 17. Prove the following corollary to Proposition 8.7.
What is the probability if p = .1? What is the probability if p = .5?
(c) What is the probability that exactly r errors occur in a transmission that is n bits long?
8.6 Distinct Representatives; Ramsey Numbers
This final section introduces two topics that investigate some properties of sets whose elements are other sets. The first topic, systems of distinct representatives, determines whether it is possible to choose a distinct element from each set in a collection. The second topic, Ramsey numbers, investigates the minimum size of a set necessary to guarantee the existence of a property related to some of its subsets.

8.6.1 Systems of Distinct Representatives
The notion of a system of distinct representatives arises when considering the marriage problem. The marriage problem is not the same as the stable marriage problem (page 3), although there are some similarities.

The Marriage Problem
The marriage problem concerns a group of eligible young women and a group of unmarried young men. The two groups need not be the same size. Each young woman makes a list of acceptable mates from among the group of young men. She then checks with each man on her list to see if he is willing to marry her. She removes the name of any man on the list who is unwilling to marry her. It is assumed that any man left on the list is completely acceptable as a mate. All the lists are then handed to a neutral referee. The referee must determine whether it is possible to marry each young woman to a young man who is on her list. Bigamy, of course, is not permitted.
Illustrating the Marriage Problem
Suppose there are three young women: Xena, Yolanda, and Zelda. Assume also that there are four young men: Abel, Bart, Collin, and Dermot. The final lists are

Xena: Bart, Collin
Yolanda: Collin, Dermot
Zelda: Collin, Dermot

It is clear that Collin cannot be matched with Xena without leaving one of the other young women unattached. One possible successful matching is Xena-Bart, Yolanda-Collin, Zelda-Dermot. Suppose that Collin wants to become a monk and therefore will never marry. The lists are now

Xena: Bart
Yolanda: Dermot
Zelda: Dermot

It is clearly impossible to match all three young women with acceptable mates.

The previous example shows that it is not always possible to find an acceptable matching for the marriage problem. The main theorem in this section will present a necessary and sufficient condition for there to be a successful matching. The theorem is stated in the more general context of sets, so the problem needs to be restated using set-theoretic terminology.
DEFINITION 8.23 A System of Distinct Representatives
Let $A_1, A_2, \ldots, A_n$ be n (not necessarily distinct) subsets of a set U. A list, $r_1, r_2, \ldots, r_n$, of elements in U is called a system of distinct representatives for $\{A_1, A_2, \ldots, A_n\}$ if
* $r_i \in A_i$, for $i = 1, 2, \ldots, n$
* $r_i \ne r_j$, for $i \ne j$
Example 8.46 can be written in the form of this definition. The initial sets are U = {Abel, Bart, Collin, Dermot}, $A_1$ = {Bart, Collin}, $A_2 = A_3$ = {Collin, Dermot}. In the original version (with Collin eligible), one choice of distinct representatives is R = Bart, Collin, Dermot. In the second version, no system of distinct representatives is possible since there is an insufficient supply of eligible males for Yolanda and Zelda. In other words, the union of the lists for the two women contains only one man.
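For collections this small, a system of distinct representatives can be found (or shown not to exist) by trying every possibility. The Python sketch below uses a simple backtracking search; the function name `find_sdr` is an illustrative assumption, and this is not one of the constructive proofs mentioned later in the section.

```python
def find_sdr(sets, chosen=()):
    """Return a list of distinct representatives, one per set, or None."""
    if len(chosen) == len(sets):
        return list(chosen)
    # Try each unused element of the next set, in sorted order.
    for candidate in sorted(sets[len(chosen)]):
        if candidate not in chosen:
            result = find_sdr(sets, chosen + (candidate,))
            if result is not None:
                return result
    return None


original = [{"Bart", "Collin"}, {"Collin", "Dermot"}, {"Collin", "Dermot"}]
without_collin = [{"Bart"}, {"Dermot"}, {"Dermot"}]
print(find_sdr(original))        # ['Bart', 'Collin', 'Dermot']
print(find_sdr(without_collin))  # None
```

Exhaustive search takes time that can grow exponentially with the number of sets, so it is suitable only for small illustrations like this one.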
No System of Distinct Representatives
Let $A_1 = \{a, b\}$, $A_2 = \{a, b\}$, $A_3 = \{c, d, e\}$, and $A_4 = \{a, b\}$. There can be no system of distinct representatives, since the set $\{r_1, r_2, r_3, r_4\} \subseteq \{a, b, c, d, e\}$ must have three distinct elements to assign as the values of $r_1$, $r_2$, and $r_4$, but there are only two elements, a and b, available to choose from.

The pattern outlined in the previous example can be generalized. Let

$\{B_1, B_2, \ldots, B_k\} \subseteq \{A_1, A_2, \ldots, A_n\}$.

If there are fewer than k elements in $B_1 \cup B_2 \cup \cdots \cup B_k$, then it is not possible to find a system of distinct representatives for $\{B_1, B_2, \ldots, B_k\}$, and hence also not possible for $\{A_1, A_2, \ldots, A_n\}$.³⁹ This pattern is central, so it is useful to give it a name.
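DEFINITION 8.24 The Marriage Condition
Let $A_1, A_2, \ldots, A_n$ be n (not necessarily distinct) subsets of a set U. The collection, $\{A_1, A_2, \ldots, A_n\}$, satisfies the marriage condition if, for every k with $1 \le k \le n$ and every choice of subscripts with $1 \le i_1 < i_2 < \cdots < i_k \le n$, $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_k}| \ge k$.

For the collection in Example 8.47, the marriage condition requires the following.

k = 1 Each set must contain at least one element: $|A_1| \ge 1$, $|A_2| \ge 1$, $|A_3| \ge 1$, and $|A_4| \ge 1$. All are true for this example.

k = 2 Each pair of sets must have at least two elements in their union: $|A_1 \cup A_2| \ge 2$, $|A_1 \cup A_3| \ge 2$, $|A_1 \cup A_4| \ge 2$, $|A_2 \cup A_3| \ge 2$, $|A_2 \cup A_4| \ge 2$, and $|A_3 \cup A_4| \ge 2$. All these conditions are also true for this example.

k = 3 Every collection of three of the sets must contain at least three elements in their union: $|A_1 \cup A_2 \cup A_3| \ge 3$, $|A_1 \cup A_2 \cup A_4| \ge 3$, $|A_1 \cup A_3 \cup A_4| \ge 3$, $|A_2 \cup A_3 \cup A_4| \ge 3$. All but one of these conditions are true. For this example, $|A_1 \cup A_2 \cup A_4| = 2$.

k = 4 The union of all four of the sets must contain at least four elements: $|A_1 \cup A_2 \cup A_3 \cup A_4| \ge 4$. This condition is true.

Since one of the required conditions fails, the collection $\{A_1, A_2, A_3, A_4\}$ from Example 8.47 does not satisfy the marriage condition.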
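The case-by-case check above is mechanical, so it can be automated for small collections. The following Python sketch examines every subcollection and reports the ones whose union is too small. The function name `marriage_condition_failures` is an illustrative assumption; the number of subcollections doubles with each added set, so this approach only suits small examples.

```python
from itertools import combinations


def marriage_condition_failures(sets):
    """Return the 1-based index tuples of subcollections whose union is too small."""
    failures = []
    for k in range(1, len(sets) + 1):
        for indices in combinations(range(len(sets)), k):
            union = set().union(*(sets[i] for i in indices))
            if len(union) < k:  # fewer than k elements available for k sets
                failures.append(tuple(i + 1 for i in indices))
    return failures


example_8_47 = [{"a", "b"}, {"a", "b"}, {"c", "d", "e"}, {"a", "b"}]
print(marriage_condition_failures(example_8_47))  # [(1, 2, 4)]
```

An empty list means the marriage condition is satisfied. For Example 8.47 the single failure is the subcollection $A_1, A_2, A_4$, matching the hand calculation.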
Quick Check 8.19
Determine whether the marriage condition is satisfied. If it is not satisfied, show at least one specific point of failure.
1. $A_1 = \{a, b, c\}$, $A_2 = \{a\}$, and $A_3 = \{b\}$.
2. $A_1 = \{a, b\}$, $A_2 = \{a\}$, and $A_3 = \{b\}$.

The marriage condition is necessary for a system of distinct representatives to exist. The next theorem shows that it is also sufficient.
Systems of Distinct Representatives (The Marriage Theorem)
Let $A_1, A_2, \ldots, A_n$ be n (not necessarily distinct) subsets of a set U. A system of distinct representatives exists if and only if $\{A_1, A_2, \ldots, A_n\}$ satisfies the marriage condition.
This theorem, sometimes called Hall's marriage theorem, was first stated and proved by Philip Hall in 1935. Many proofs exist, including some constructive proofs. The constructive proofs tend to be a bit complex but would be preferred if an efficient algorithm for producing the system of distinct representatives is desired. The proof given here uses complete induction and proof by cases [11]. The cases are illustrated by Example 8.49 and Example 8.50, which immediately follow the formal proof. You may wish to refer to them as you read the proof. The downloads section at http://www.mathcs.bethel.edu/-gossett/DiscreteMathWithProof/ has a PDF file which contains these two examples.

Proof:
Only If The "only if" part of the theorem is logically equivalent to the following claim: if $\{A_1, A_2, \ldots, A_n\}$ does not satisfy the marriage condition, then a system of distinct representatives does not exist. Thus, suppose that for some k, with $1 \le k \le n$, and for some set of subscripts, $i_1, i_2, \ldots, i_k$, with $1 \le i_1 < i_2 < \cdots < i_k \le n$, $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_k}| < k$.⁴²

⁴² This is the negation of the marriage condition. See Exercise 1 in Exercises 8.6.3.
… $k + 1$. The new feature is the assertion that (for $k \le n - 1$) the union of any $k$ of the sets must contain at least $k + 1$ elements. Notice that when $k = n$, the normal marriage condition will still apply. The negation of the enhanced marriage condition is: for some $m$ with $1 \le m \le n - 1$ and some choice of a size-$m$ subcollection, $\{A_{i_1}, A_{i_2}, \ldots, A_{i_m}\}$, with $1 \le i_1 < i_2 < \cdots < i_m \le n$, $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_m}| \le m$.

… $\ge 1$, so the set isn't empty. Choose any element in the set to be the representative.

Inductive Step The inductive hypothesis is: every collection of $p - 1$ or fewer sets that satisfies the marriage condition has a set of distinct representatives. Assume now that the collection $\{A_1, A_2, \ldots, A_p\}$ satisfies the marriage condition. The goal is to show that this collection has a set of distinct representatives. There are two complementary conditions to consider.

Case 1: The enhanced marriage condition is satisfied.⁴³ Because the marriage condition is satisfied, $A_p$ has at least one element. Choose any element in $A_p$ to be its representative, $r_p$. Since the enhanced marriage condition is also satisfied, it should be possible to remove $r_p$ from $A_1, A_2, \ldots, A_{p-1}$ and still have the marriage condition be satisfied for those sets. The details follow. Let $B_i = A_i - \{r_p\}$, for $i = 1, 2, \ldots, p - 1$. To see that $\{B_1, B_2, \ldots, B_{p-1}\}$ satisfies the marriage condition, consider any $k$ with $1 \le k \le p - 1$ and any collection of subscripts with $1 \le i_1 < i_2 < \cdots < i_k \le p - 1$. Removing the single element $r_p$ shrinks a union by at most one element, so

$$|B_{i_1} \cup B_{i_2} \cup \cdots \cup B_{i_k}| \ge |A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_k}| - 1 \ge (k + 1) - 1 = k.$$

Since $\{B_1, B_2, \ldots, B_{p-1}\}$ satisfies the marriage condition, the inductive hypothesis asserts that a system, $r_1, r_2, \ldots, r_{p-1}$, of distinct representatives exists, where $r_i \in B_i$. It is also clear that $r_i \ne r_p$ for $i = 1, 2, \ldots, p - 1$. Since $B_i \subseteq A_i$, it is also true that $r_i \in A_i$ for $i = 1, 2, \ldots, p - 1$. Therefore, the list $r_1, r_2, \ldots, r_p$ is a system of distinct representatives for $\{A_1, A_2, \ldots, A_p\}$.

⁴³ See Example 8.49 on page 488 for an illustration of this case.
Case 2: The enhanced marriage condition is not satisfied. Since the enhanced marriage condition is not satisfied, there must be some $m$, with $1 \le m \le p - 1$, and some collection of distinct subscripts, $\{i_1, i_2, \ldots, i_m\}$, such that $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_m}| \le m$. However, the marriage condition is satisfied, so $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_m}| \ge m$. These inequalities indicate that $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_m}| = m$. In order to simplify the notation, rename the A's so that $|A_1 \cup A_2 \cup \cdots \cup A_m| = m$. Since $m \le p - 1$, the inductive hypothesis asserts that there is a list, $T = r_1, r_2, \ldots, r_m$, of distinct representatives for $\{A_1, A_2, \ldots, A_m\}$. Because $|A_1 \cup A_2 \cup \cdots \cup A_m| = m$, $\{r_1, r_2, \ldots, r_m\} = A_1 \cup A_2 \cup \cdots \cup A_m$. In a manner similar to case 1, remove the elements of $T$ from each set in $\{A_{m+1}, A_{m+2}, \ldots, A_p\}$. Thus, let $B_i = A_i - T$, for $i = m + 1, m + 2, \ldots, p$. Since $m \ge 1$, this collection of sets contains at most $p - 1$ sets. If this collection satisfies the marriage condition, then the inductive hypothesis can be used a second time. A brief detour will verify that $\{B_{m+1}, B_{m+2}, \ldots, B_p\}$ does indeed satisfy the marriage condition. The set identity $(X - Z) \cup (Y - Z) = (X \cup Y) - Z$ will be used with $X = Z = T$ and $Y = (A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_k})$. Note also that $|X - Y| = |X| - |Y|$ if $Y \subseteq X$.
Detour Let $k$ be any integer such that $m + 1 \le k \le p$ and let $i_1, i_2, \ldots$ with $m + 1 \le i_1 < i_2 < \cdots$ …

1. (a) … $k$." Let $S = \{1, 2, \ldots, n\}$ represent the set of legal subscripts for the A's.
(b) Write the negation of the expression from part (a).
(c) Write the negation of the marriage condition using a style similar to the style in Definition 8.24.

2. Determine which of the collections of sets have a system of distinct representatives. If a system of distinct representatives does exist, show one; if no system exists, indicate clearly how the marriage condition is violated.
(a) $A_1 = \{b\}$, $A_2 = \{b\}$, $A_3 = \{a, b\}$
(b) $A_1 = \{a, b\}$, $A_2 = \{b, c\}$, $A_3 = \{c\}$
(c) $A_1 = \{a, b\}$, $A_2 = \{a, b, c\}$, $A_3 = \{a, b, c\}$, $A_4 = \{b, c\}$

3. For each collection of sets in Exercise 2, determine whether the collection satisfies the enhanced marriage condition.

4. Determine which of the collections of sets have a system of distinct representatives. If a system of distinct representatives does exist, show one; if no system exists, indicate clearly how the marriage condition is violated.
(a) $A_1 = \{x, y\}$, $A_2 = \{w, y\}$, $A_3 = \{z\}$, $A_4 = \{w, x\}$
(b) $A_1 = \{a, b, d\}$, $A_2 = \{e, f, g\}$, $A_3 = \{a, e, f\}$, $A_4 = \{d, g\}$, $A_5 = \{c, e\}$
(c) $A_1 = \{w, x, y, z\}$, $A_2 = \{x, y\}$, $A_3 = \{z\}$, $A_4 = \{w, x\}$, $A_5 = \{w, z\}$
(d) $A_1 = \{a, b, f\}$, $A_2 = \{a, c, e\}$, $A_3 = \{c, e\}$, $A_4 = \{b, d\}$, $A_5 = \{a, c, e\}$, $A_6 = \{c, e\}$

5. For each collection of sets in Exercise 4, determine whether the collection satisfies the enhanced marriage condition.

6. A small rural high school has five extracurricular clubs. The clubs and the club members are listed in the following table.
Math Don Effie
Honors Angus Carla
Service Bart
Don
Effie
Don
Yearbook Bart Don
Poetry Bart Effie
The principal wishes to form a "club council" that will contain one representative from each club. She does not want two clubs represented by the same student (otherwise the council could consist of just Bart and Don). Is this possible? If it is, show a list of five representatives; otherwise, explain why it is not possible.

7. For each collection of sets, perform the inductive step in the proof of Theorem 8.20. Indicate which case applies.
(a) $A_1 = \{b, c\}$, $A_2 = \{b\}$, $A_3 = \{a, b\}$
(b) $A_1 = \{a\}$, $A_2 = \{b\}$, $A_3 = \{c\}$
(c) $A_1 = \{a, d\}$, $A_2 = \{b\}$, $A_3 = \{a, b\}$, $A_4 = \{b, c\}$
(d) $A_1 = \{b, c\}$, $A_2 = \{a, b\}$, $A_3 = \{a, d\}$, $A_4 = \{a, b, d\}$
(e) $A_1 = \{a, b, d\}$, $A_2 = \{b, e\}$, $A_3 = \{a, b\}$, $A_4 = \{b, c\}$, $A_5 = \{a, d, e\}$

8. Each of the following collections of sets has a system of distinct representatives. In each case, determine the lower bound asserted by Corollary 8.5 and then determine how many different systems of distinct representatives actually exist.
(a) $A_1 = \{a, b\}$, $A_2 = \{a, b\}$
(b) $A_1 = \{a, b, c\}$, $A_2 = \{b, c\}$, $A_3 = \{a, c\}$
(c) $A_1 = \{x, y, z\}$, $A_2 = \{x, y\}$
(d) $A_1 = \{a, b\}$, $A_2 = \{a, c\}$, $A_3 = \{b, c\}$

9. Each of the following collections of sets has a system of distinct representatives. In each case, determine the lower bound asserted by Corollary 8.5 and then determine how many different systems of distinct representatives actually exist.
(a) $A_1 = \{a, b, c\}$, $A_2 = \{a, b, c\}$
(b) $A_1 = \{x\}$, $A_2 = \{y\}$
(c) $A_1 = \{a, b, c, d\}$, $A_2 = \{a, b, c, d\}$
(d) $A_1 = \{a, b, c\}$, $A_2 = \{a, c, d\}$, $A_3 = \{b, c\}$, $A_4 = \{a, c\}$

10. Let $n$ and $k$ be positive integers. Suppose you are given $k \cdot n$ cards; each card is marked with a number from 1 to $n$ such that each number is represented $k$ times. You shuffle and deal $k$ cards to each of $n$ people. Is it possible for each person to lay down one card so that every number from 1 to $n$ is given once?
(a) Show that for $k = 1$ or $k = 2$, the answer is yes.
(b) Is it true in general? Justify your answer.

11. Prove Corollary 8.5. Relabel the A's so that $t = |A_1| \le |A_2| \le \cdots \le |A_n|$. Choose any $r_1 \in A_1$ and set $B_j = A_j - \{r_1\}$, for $2 \le j \le n$.
(a) Show that $\{B_2, B_3, \ldots, B_n\}$ satisfies the marriage condition. (Hint: Suppose $|B_{i_1} \cup \cdots \cup B_{i_k}| < k$. Show that $|A_1| = |A_{i_1}| = |A_{i_2}| = \cdots = |A_{i_k}| = 1$ and derive a contradiction.)
(b) There are $t$ choices for $r_1$. Since $\{B_2, B_3, \ldots, B_n\}$ satisfies the marriage condition, part (a) can be repeated. (Hint: Rename and reorder the B's as a new collection of A's, where $t - 1 \le |A_1| \le |A_2| \le \cdots \le |A_{n-1}|$.)

12. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) The sets $\{A_1, A_2, \ldots, A_n\}$ satisfy the marriage condition if for each $k$ there exists a subcollection, $\{A_{i_1}, A_{i_2}, \ldots, A_{i_k}\} \subseteq \{A_1, A_2, \ldots, A_n\}$, such that $|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_k}| \ge k$.
(b) The marriage condition is a necessary and sufficient condition for a collection of sets to have a system of distinct representatives.
(c) The enhanced marriage condition is easier to satisfy than the marriage condition.
(d) $R(j, k; 1) = R(j, k)$
(e) Table 8.46 indicates that $43 \le R(5, 5) \le 49$. This means that any set of size 49 or larger will satisfy the (5, 5) Ramsey condition, but some sets of sizes 43-48 will also satisfy the (5, 5) Ramsey condition. However, no set of size 42 or less will satisfy the (5, 5) Ramsey condition.
(f) The $m$ in the notation $(j, k; m)$ indicates that the $(j, k; m)$ Ramsey condition is an assertion about partitions of the $m$-element subsets of some set, $S$.

13. The June 22, 1993 Ann Landers advice column contained the following letter:

Dear Ann: I am sure that many members of Congress read your column. I hope they will see this, because it's the best way I can think of to get their attention. I am enclosing an article from the Rochester Democrat & Chronicle so you will know I am not making this up. Two professors, one from Rochester, the other from Australia, have worked for three years, used 110 computers and communicated 10,000 miles by electronic mail, and finally have learned the answer to a question that has baffled scientists for 63 years. The question is this: If you are having a party and want to invite at least four people who know each other and [the word or should have been used here] five who don't, how many people should you invite? The answer is 25. Mathematicians and scientists in countries worldwide have sent messages of congratulations. I don't want to take anything away from this spectacular achievement, but it seems to me that the time and money spent on this project could have been better used had they put it toward finding ways to get food to the millions of starving children in war-torn countries around the world. - B. V. B., Rochester, N.Y.

Ann says: There has to be more to this "discovery" than you recounted. The principle must be one that can be applied to solve important scientific problems. If anyone in my reading audience can provide an explanation in language a lay person can understand, I will print it. Meanwhile, I am "Baffled in Chicago."

If the incorrect conjunction and is replaced by the proper or, the statement proved by the mathematicians (Stanislaw Radziszowski and Brendan McKay) showed that $R(4, 5) = 25$. Was B. V. B. justified in criticizing these mathematicians for spending time determining a Ramsey number?⁴⁹ Write a paragraph defending your opinion.

⁴⁹ There was active discussion in the USENET newsgroups sci.math and sci.math.research about how to respond to the letter. Some mathematicians surely sent responses, but none were published.
14. Two integers are either relatively prime, or they share at least one common prime factor. How large a collection of integers is necessary in order to ensure that there exists a subset of four integers that are either mutually relatively prime, or in which every pair shares a common prime factor?

15. A large union has members from 30 different industries and many companies within those industries. At the national convention, there will be hundreds of delegates from many companies and from all the associated industries. Is the following claim true or false? Give evidence for your answer. As long as the convention contains a predetermined number of delegates, there will always be a set of 20 people such that one of the following is true:
* All 20 work at the same company.
* All 20 work at different companies in the same industry.
* All 20 work in different industries.

16. Complete the proof of Proposition 8.9.

17. Prove case 2 of Theorem 8.22.

18. Prove the following proposition. (Hint: If there are no two-element subsets of $S$, then all of these nonexistent subsets can be considered as members of any set. Investigate the two cases: $|S| = 1$ and $|S| \ge 2$.)
PROPOSITION 8.11 $R(j, 1; 2)$ and $R(1, j; 2)$
Let $S$ be any nonempty set. If $j \ge 1$ and $k \ge 1$, then
1. $R(j, 1; 2) = 1$
2. $R(1, k; 2) = 1$

19. Without using Table 8.46, outline the steps in a proof that $9 \le R(3, 4) \le 10$.

20. Prove that $R(j, m; m) = j$ for $j \ge m \ge 1$.

21. Prove that $R(k_1, k_2, \ldots, k_n; 1) = k_1 + k_2 + \cdots + k_n - n + 1$.
8.7 QUICK CHECK SOLUTIONS

Quick Check 8.1
1. There are 9! = 362880 distinct arrangements, far too many possibilities to try them all.
2. Before presenting a formal proof, the following heuristic argument shows that the proposed value is reasonable. Whatever the common sum, $S$, is, it should be larger than the sum of the $n$ smallest numbers in the magic square:
$$S > 1 + 2 + \cdots + n = \frac{n(n+1)}{2}.$$
It should also be smaller than the sum of the $n$ largest numbers in the magic square:
$$S < n^2 + (n^2 - 1) + \cdots + (n^2 - (n - 1)) = n \cdot n^2 - (0 + 1 + \cdots + (n - 1)) = n^3 - \frac{(n-1)n}{2}.$$
It seems reasonable that $S$ should be close to the average of these two extremes:
$$S \approx \frac{1}{2}\left[\frac{n(n+1)}{2} + \left(n^3 - \frac{(n-1)n}{2}\right)\right] = \frac{n(n^2+1)}{2}.$$
Proof: Let the common sum be $S$. Then each row has sum $S$ and there are $n$ rows. Thus, adding the rows together produces a total of $nS$. On the other hand, adding each individual entry in the magic square will add each number in $\{1, 2, \ldots, n^2\}$ to the total:
$$\sum_{i=1}^{n^2} i = \frac{n^2(n^2+1)}{2}.$$
Equating these two totals leads to $S = \frac{n(n^2+1)}{2}$. □
3. The possible subsets are
{1, 5, 9}   {1, 6, 8}   {2, 4, 9}   {2, 5, 8}   {2, 6, 7}   {3, 4, 8}   {3, 5, 7}   {4, 5, 6}.
First, observe that a 3-by-3 magic square requires eight subsets of size 3 (3 rows, 3 columns, 2 diagonals). There are only eight possible subsets of size 3 that have sum 15, so all must be used in any magic square. Next, notice that 1, 3, 7, and 9 only appear in two of the subsets. This means that they cannot be at a corner (which requires the element to be in three subsets since diagonals are important) nor in the center. Also, 2, 4, 6, and 8 each appear
in three subsets, so they must each be in a corner. The number 5 is the only one that appears in four of the subsets, so it must be the center element (which is in four subsets of size 3).
4. The subset {1, 5, 9} can appear in one of four possible ways (the second row or the second column, each in two possible orders since 5 must be the middle element). Once that subset has been placed, the subset {1, 6, 8} has only two possible placements (perpendicular to {1, 5, 9}, with 1 at the center). There is then only one way to fill in each of the two diagonals. The remaining entries of the magic square are also then uniquely determined, so there are at most eight distinct magic squares. It is easy to see that all eight possibilities are actual magic squares.
5. The eight 3-by-3 magic squares are as follows.

6 7 2    8 3 4    2 9 4    4 9 2
1 5 9    1 5 9    7 5 3    3 5 7
8 3 4    6 7 2    6 1 8    8 1 6

6 1 8    8 1 6    2 7 6    4 3 8
7 5 3    3 5 7    9 5 1    9 5 1
2 9 4    4 9 2    4 3 8    2 7 6
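The counting argument in parts 3 through 5 can be cross-checked by brute force. The following Python sketch is not part of the original solution; the helper name `is_magic` is made up for illustration. It tries all 9! arrangements and keeps those whose rows, columns, and diagonals all sum to 15.

```python
from itertools import permutations

def is_magic(sq):
    """sq is a tuple of 9 entries read row by row."""
    rows = [sq[0:3], sq[3:6], sq[6:9]]
    cols = [sq[0::3], sq[1::3], sq[2::3]]
    diags = [sq[0::4], sq[2:7:2]]  # positions (0, 4, 8) and (2, 4, 6)
    return all(sum(line) == 15 for line in rows + cols + diags)

# Exhaustive search over all 9! = 362880 arrangements of 1..9.
magic_squares = [p for p in permutations(range(1, 10)) if is_magic(p)]

print(len(magic_squares))
for sq in magic_squares:
    print(sq[0:3], sq[3:6], sq[6:9])
```

Every square found this way has 5 in the center and the even numbers in the corners, in agreement with part 3.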
Quick Check 8.2
1. Use the boundary condition $p(n, n) = 1$ to fill in the main diagonal. Then use
$$p(n, 1) = p(n - 1, 0) + p(n - 1, 1) = 0 + p(n - 1, 1) = p(n - 1, 1)$$
to fill in column 1. Now work by rows, starting with $n = 3$. The results are shown in Table 8.47.

TABLE 8.47 The Values of p(n) and p(n, k) for n, k ≤ 6

             p(n, k)
n\k    1   2   3   4   5   6    p(n)
1      1   -   -   -   -   -     1
2      1   1   -   -   -   -     2
3      1   1   1   -   -   -     3
4      1   2   1   1   -   -     5
5      1   2   2   1   1   -     7
6      1   3   3   2   1   1    11
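Table 8.47 can be regenerated mechanically. The sketch below is an illustration rather than the book's procedure: it assumes the general recurrence $p(n, k) = p(n - 1, k - 1) + p(n - k, k)$, of which the column-1 formula above is the $k = 1$ case, together with the boundary condition $p(n, n) = 1$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, k):
    """Number of partitions of n into exactly k positive parts."""
    if n == k:            # boundary condition p(n, n) = 1 (including p(0, 0) = 1)
        return 1
    if k <= 0 or k > n:   # no partitions with zero parts or with too many parts
        return 0
    # Either some part equals 1 (remove it), or every part is at least 2
    # (subtract 1 from each of the k parts).
    return p(n - 1, k - 1) + p(n - k, k)

for n in range(1, 7):
    row = [p(n, k) for k in range(1, n + 1)]
    print(n, row, sum(row))   # last column is p(n)
```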
Quick Check 8.3

TABLE 8.48 The Values of S(n, k) for n, k ≤ 4

        S(n, k)
n\k    1   2   3   4
1      1   -   -   -
2      1   1   -   -
3      1   3   1   -
4      1   7   6   1
1. Notice that S(n, n) = 1 because each of the n containers must receive at least one of the n objects. This can only be accomplished by placing exactly one object in each container. This determines the main diagonal of Table 8.48. Observe also that S(n, 1) = 1 because all n objects must be placed into the single container. The remaining values can be found by listing all possible distributions. The main restrictions are that every container must receive at least one object and containers are indistinguishable (so there is no order within distributions). The values for S(3, 2), S(4, 2), and S(4, 3) can be calculated by listing all distinct possibilities, as
shown in the following lists.

S(3, 2)
{{O1, O2}, {O3}}
{{O1, O3}, {O2}}
{{O2, O3}, {O1}}

S(4, 2)
{{O1, O2, O3}, {O4}}
{{O1, O2, O4}, {O3}}
{{O1, O3, O4}, {O2}}
{{O2, O3, O4}, {O1}}
{{O1, O2}, {O3, O4}}
{{O1, O3}, {O2, O4}}
{{O1, O4}, {O2, O3}}

S(4, 3)
{{O1, O2}, {O3}, {O4}}
{{O1, O3}, {O2}, {O4}}
{{O1, O4}, {O2}, {O3}}
{{O2, O3}, {O1}, {O4}}
{{O2, O4}, {O1}, {O3}}
{{O3, O4}, {O1}, {O2}}
Quick Check 8.4
1. In this problem, the objects are the four CDs, which are distinguishable. The containers are the three friends, which are also distinguishable.
(a) Since every friend will receive at least one CD, there are $3!\,S(4, 3) = 6 \cdot 6 = 36$ ways to distribute the CDs. Here are some incorrect solutions:
i. The first three CDs can be distributed in $4 \cdot 3 \cdot 2$ ways (four choices for friend A's first CD, then three choices for friend B's first CD, then two choices for friend C's first CD). There are then three choices for the friend who gets the last CD. Thus, there are $24 \cdot 3 = 72$ ways to distribute the CDs. The error is that some options are double counted. For example, I can give CD W to A, CD X to B, and CD Y to C. I can then give CD Z to A. But this same distribution could be generated by giving Z to A, X to B, and Y to C in the first round.
ii. There are three choices for the friend who will receive two CDs, so there are three possible distributions. This is incorrect because it assumes that the CDs are indistinguishable. A correct solution could start with this observation and proceed to notice that the favored friend has $\binom{4}{2}$ ways to receive the two CDs, and then the remaining CDs can be distributed in two ways. Thus, there are $3 \cdot \binom{4}{2} \cdot 2 = 36$ ways to distribute the CDs.
(b) This is easy: There are $3^4 = 81$ ways to distribute the CDs.
2. The objects (Sacagawea dollars) are indistinguishable, but the containers (nieces) are not. The answer depends on whether I want to risk the wrath of a niece who does not receive at least one dollar. If I choose the safe course, there will be $\binom{4-1}{3-1} = 3$ ways to distribute the coins. This makes sense: There are three ways to choose which niece will receive two coins (there is still some risk of favoritism involved). If I choose to ignore the possibility of hurt feelings, there will be $\binom{3+4-1}{4} = 15$ ways to distribute the coins: I can give all four to one niece (three ways), give three coins to one niece and 1 coin to another (six ways), give two coins each to two nieces (three ways), or give one niece two coins and the other nieces 1 coin each (three ways).
3. The problem statement implies that harvesting need not occur in every field on the first day. The objects (sons) are distinguishable, but the containers (fields) are indistinguishable. There are $S(4, 1) + S(4, 2) + S(4, 3) = 1 + 7 + 6 = 14$ ways to assign fields on the first day.
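The counts in Quick Checks 8.3 and 8.4 can be verified numerically. This Python sketch is an illustration under one assumption that the solutions above do not state: the standard recurrence $S(n, k) = k\,S(n - 1, k) + S(n - 1, k - 1)$ for Stirling numbers of the second kind. The CD counts are checked independently by enumerating every assignment of four CDs to three friends.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Ways to place n distinguishable objects into k indistinguishable,
    nonempty containers (assumed recurrence, see the lead-in)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

# Quick Check 8.4, problem 1: each tuple says which friend gets each CD.
assignments = list(product(range(3), repeat=4))
everyone_gets_one = [a for a in assignments if set(a) == {0, 1, 2}]

print(len(everyone_gets_one), factorial(3) * stirling2(4, 3))  # part (a)
print(len(assignments))                                        # part (b)
print(comb(4 - 1, 3 - 1), comb(3 + 4 - 1, 4))                  # problem 2
print(sum(stirling2(4, k) for k in (1, 2, 3)))                 # problem 3
```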
Quick Check 8.5
1. There are 12 distinct Latin squares of order 3.

1 2 3    1 2 3    1 3 2    1 3 2
2 3 1    3 1 2    2 1 3    3 2 1
3 1 2    2 3 1    3 2 1    2 1 3

2 1 3    2 1 3    2 3 1    2 3 1
1 3 2    3 2 1    1 2 3    3 1 2
3 2 1    1 3 2    3 1 2    1 2 3

3 1 2    3 1 2    3 2 1    3 2 1
1 2 3    2 3 1    1 3 2    2 1 3
2 3 1    1 2 3    2 1 3    1 3 2

2. There are four standardized Latin squares of order 4.

1 2 3 4    1 2 3 4    1 2 3 4    1 2 3 4
2 1 4 3    2 1 4 3    2 3 4 1    2 4 1 3
3 4 1 2    3 4 2 1    3 4 1 2    3 1 4 2
4 3 2 1    4 3 1 2    4 1 2 3    4 3 2 1
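Both counts are small enough to confirm by exhaustive search. The sketch below is illustrative only (the function name `latin_squares` is hypothetical): it builds squares row by row from permutations and keeps those whose columns are also permutations. "Standardized" is interpreted here as having the first row and first column in natural order.

```python
from itertools import permutations, product

def latin_squares(n):
    """All Latin squares of order n on the symbols 1..n, as tuples of rows."""
    rows = list(permutations(range(1, n + 1)))
    squares = []
    for square in product(rows, repeat=n):
        # Rows are permutations by construction; check every column too.
        if all(len(set(col)) == n for col in zip(*square)):
            squares.append(square)
    return squares

order3 = latin_squares(3)
natural = (1, 2, 3, 4)
standardized4 = [s for s in latin_squares(4)
                 if s[0] == natural and tuple(row[0] for row in s) == natural]

print(len(order3), len(standardized4))
```

The order-4 search examines 24^4 candidate row combinations, which is still fast; this brute-force approach would not scale to much larger orders.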
Quick Check 8.6
Both problems are just duals of claims that have already been proved. The solutions here have interchanged the terms point and line and the terms on and contains. These new proofs are valid independent of the duality used to write them.
1. Let $p$ be any point in the finite projective plane. Axiom FPP3 asserts the existence of four points, $p_1, p_2, p_3, p_4$, no three of which are on a common line. At least three of these points are distinct from $p$. Assume, with suitable renaming if necessary, that $p \notin \{p_1, p_2, p_3\}$. Let the common line containing both $p$ and $p_i$ be denoted $L_i$, for $i = 1, 2, 3$. If the lines $L_i$ are all distinct, the claim is true. Otherwise, there must still be two distinct $L_i$ (since $p_1$, $p_2$, and $p_3$ are on no common line). Assume that $L_1 \ne L_2$. There must be at least one other line, $L_1'$, containing $p_1$ and at least one other line, $L_2'$, containing $p_2$. They must contain a common point, $p'$, with $p' \ne p$. The points $p$ and $p'$ must be on a common line, $L$ (FPP1). If $L = L_1$, then $p_1$ and $p'$ would both be on $L_1$ and on $L_1'$, contradicting FPP2. Similarly, $L \ne L_2$. Therefore, $p$ must be on the three distinct lines $L_1$, $L_2$ and $L$.
2. Suppose that there are $k$ distinct lines in the finite projective plane. Since each pair of lines contains a common point, there must be $\binom{k}{2}$ such pairs. However, this collection of pairs of lines overcounts the points. In fact, each point has been counted once for each pair of lines it is common to. There are $n + 1$ lines containing each point, so each point has been counted $\binom{n+1}{2}$ times. The number of points is thus
$$\frac{\binom{k}{2}}{\binom{n+1}{2}} = \frac{k(k-1)}{(n+1)\,n}.$$
By the duality principle, if there are $k$ lines, then there must also be $k$ points. Hence, $k = \frac{k(k-1)}{n(n+1)}$, and consequently, there are $k = n^2 + n + 1$ points. This proves statement 5.

Quick Check 8.7
1. Step 1 The construction requires a set of $n - 1 = 1$ mutually orthogonal Latin squares of order 2. The one shown will work.

1 2
2 1

The matrix A is shown next.

1 1 2 2
1 2 1 2
1 2 2 1
Step 2 Now add column labels.

1 2 3 4
-------
1 1 2 2
1 2 1 2
1 2 2 1

Step 3 The six partial lines are shown next.

i\r      1         2
1      {1, 2}    {3, 4}
2      {1, 3}    {2, 4}
3      {1, 4}    {2, 3}

Step 4 The finite projective plane has the seven points $\{1, 2, 3, 4, \infty_1, \infty_2, \infty_3\}$ and the seven lines shown in the next table.

i\r      1                 2
1      {1, 2, ∞1}       {3, 4, ∞1}
2      {1, 3, ∞2}       {2, 4, ∞2}
3      {1, 4, ∞3}       {2, 3, ∞3}
       {∞1, ∞2, ∞3}

The lines and points can be shown in a diagram that should look familiar.

[Figure: diagram of the seven points and seven lines]
Quick Check 8.8
1. Step 1

5      {1, 2, 5}    {3, 4, 5}
6      {1, 3, 6}    {2, 4, 6}
7      {1, 4, 7}    {2, 3, 7}
       {5, 6, 7}

Step 2

T =      1         2
       {1, 2}    {3, 4}
       {1, 3}    {2, 4}
       {1, 4}    {2, 3}
Step 3 5
1
2
3
4
1
2
2
1
1 1 2 2 1 2 1 2
Step 4 A'=
12
L=
1
2
2
1
2
1
Quick Check 8.9
1. $v = 9$, $b = 12$, $r = 4$, $k = 3$, and $\lambda = 1$
2. $v = 4$, $b = 4$, $r = 3$, $k = 3$, and $\lambda = 2$
3. The incidence matrix is listed. Notice that every column has three 1s (since $k = 3$) and every row has three 1s (since $r = 3$). Also, every pair of rows has 1s in two common columns (since $\lambda = 2$).

     C1  C2  R1  R2
1     1   1   1   0
2     1   0   1   1
3     1   1   0   1
4     0   1   1   1

4. Denote the columns as M1, M2, M3, and so on. The design matrix does not need to contain row and column labels. They are added here for clarity. Notice that every column has three 1s (since $k = 3$) and every row has four 1s (since $r = 4$). Also, every pair of rows has exactly one common column with a 1 (since $\lambda = 1$).

     M1  M2  M3  Tu1 Tu2 Tu3 Th1 Th2 Th3 F1  F2  F3
1     1   0   0   1   0   0   1   0   0   1   0   0
2     1   0   0   0   1   0   0   1   0   0   1   0
3     1   0   0   0   0   1   0   0   1   0   0   1
4     0   1   0   1   0   0   0   0   1   0   1   0
5     0   1   0   0   1   0   1   0   0   0   0   1
6     0   1   0   0   0   1   0   1   0   1   0   0
7     0   0   1   1   0   0   0   1   0   0   0   1
8     0   0   1   0   1   0   0   0   1   1   0   0
9     0   0   1   0   0   1   1   0   0   0   1   0
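The properties claimed for the matrix in part 4 (three 1s per column, four 1s per row, one common column per pair of rows) can be checked directly. In this illustrative sketch the twelve blocks are read off the columns M1 through F3 of that matrix; the function name `bibd_parameters` is an assumption made for this example, not notation from the text.

```python
from itertools import combinations

def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, else raise ValueError."""
    varieties = sorted(set().union(*blocks))
    sizes = {len(blk) for blk in blocks}
    reps = {sum(x in blk for blk in blocks) for x in varieties}
    lams = {sum(x in blk and y in blk for blk in blocks)
            for x, y in combinations(varieties, 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        raise ValueError("not a balanced incomplete block design")
    return len(varieties), len(blocks), reps.pop(), sizes.pop(), lams.pop()

# Columns M1..M3, Tu1..Tu3, Th1..Th3, F1..F3 of the incidence matrix.
blocks = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9},
          {1, 4, 7}, {2, 5, 8}, {3, 6, 9},
          {1, 5, 9}, {2, 6, 7}, {3, 4, 8},
          {1, 6, 8}, {2, 4, 9}, {3, 5, 7}]

v, b, r, k, lam = bibd_parameters(blocks)
print(v, b, r, k, lam)
# The two standard counting identities for a BIBD.
print(b * k == v * r, r * (k - 1) == lam * (v - 1))
```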
Quick Check 8.10
1. (a) The blocks are the rows of girls; the varieties are the 15 girls. There are 5 rows on each of 7 days for a total of 35 rows. Thus $v = 15$ and $b = 35$. Each row has 3 girls so $k = 3$. Each girl must walk every day, so $r = 7$. Finally, every girl is to walk in a row with every other girl once per week, so $\lambda = 1$. The BIBD needs to be a (15, 35, 7, 3, 1)-design.
(b) $35 \cdot 3 = 15 \cdot 7$ and $7 \cdot 2 = 1 \cdot 14$.

Quick Check 8.11
1. The parameters of D are (7, 3, 1).

W0  W1  W2  W3  W4  W5  W6
 1   1   1   2   2   3   5
 2   3   4   3   4   4   6
 5   6   7   7   6   5   7

2. The parameters of D' are (4, 6, 3, 2, 1).

B'0  B'1  B'3  B'4  B'5  B'6
 3    2    5    3    2    2
 6    5    6    5    6    3
3. The parameters of D* are (3, 6, 4, 2, 2). Notice that D* is a two-fold replication of a (3, 3, 2, 2, 1)-design.

B*0  B*1  B*3  B*4  B*5  B*6
 4    4    1    1    1    1
 7    7    4    7    7    4
Quick Check 8.12
1. The blocks are the columns of the following table. The varieties are the numbers 1 through 7.

B1  B2  B3  B4  B5  B6  B7
 1   1   1   2   2   3   5
 2   3   4   3   4   4   6
 5   6   7   7   6   5   7

2. The incidence matrix (with helpful labels) is unambiguous if the blocks are labeled in the sorted order used in part 1.

     B1  B2  B3  B4  B5  B6  B7
1     1   1   1   0   0   0   0
2     1   0   0   1   1   0   0
3     0   1   0   1   0   1   0
4     0   0   1   0   1   1   0
5     1   0   0   0   0   1   1
6     0   1   0   0   1   0   1
7     0   0   1   1   0   0   1
Quick Check 8.13
1.

w     W            X            Y             Z            B(w)
0     0            0            0             0            0
1     0            0            0             0            0
2     0            0            0             1+B(0)=1     1
3     0            4+B(0)=4     0             1+B(1)=1     4
4     3+B(0)=3     4+B(1)=4     0             1+B(2)=2     4
5     3+B(1)=3     4+B(2)=5     0             1+B(3)=5     5
6     3+B(2)=4     4+B(3)=8     0             1+B(4)=5     8
7     3+B(3)=7     4+B(4)=8     11+B(0)=11    1+B(5)=6     11
8     3+B(4)=7     4+B(5)=9     11+B(1)=11    1+B(6)=9     11
9     3+B(5)=8     4+B(6)=12    11+B(2)=12    1+B(7)=12    12
10    3+B(6)=11    4+B(7)=15    11+B(3)=15    1+B(8)=12    15

The maximum total benefit is 15. There are two choices for the item to add in the final row: X or Y. Suppose X is chosen. Then the next item will be chosen by looking at row 10 - 3 = 7. The only item that matches B(7) is Y, so a Y is added. This leads to row 7 - 7 = 0, so no other items are added.
If Y is chosen in row 10, the next item will be determined by row 10 - 7 = 3. In row 3, the item that matches B(3) is X, so an X is added to the knapsack. The next row to examine is row 0, so no other items are added. Both routes lead to the same solution: one X and one Y. (Other examples may have several distinct solutions.)
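The table and the trace-back can be reproduced with a short dynamic program. This is a minimal sketch, not the book's algorithm statement: the item volumes and benefits (W: volume 4, benefit 3; X: 3, 4; Y: 7, 11; Z: 2, 1) are inferred from the entries of the table above, and the function name is made up for illustration.

```python
def unbounded_knapsack(items, capacity):
    """items maps a name to (volume, benefit); any number of copies may be packed.
    Returns the list B(0..capacity) and one optimal packing."""
    best = [0] * (capacity + 1)
    choice = [None] * (capacity + 1)
    for w in range(1, capacity + 1):
        for name, (volume, benefit) in items.items():
            # Candidate entry "benefit + B(w - volume)", as in the table columns.
            if volume <= w and benefit + best[w - volume] > best[w]:
                best[w] = benefit + best[w - volume]
                choice[w] = name
    # Trace back from the last row, jumping to row w - volume each time.
    packing, w = [], capacity
    while choice[w] is not None:
        packing.append(choice[w])
        w -= items[choice[w]][0]
    return best, sorted(packing)

# Volumes and benefits inferred from the table (an assumption of this sketch).
items = {"W": (4, 3), "X": (3, 4), "Y": (7, 11), "Z": (2, 1)}
best, packing = unbounded_knapsack(items, 10)
print(best)
print(packing)
```

Ties are broken in favor of the first item that reaches the maximum, so the sketch follows the "X first" route described above.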
Quick Check 8.14
1. The tables follow.

                              T                                       K
w     W            X            Y            Z            B(w)    W  X  Y  Z
0     0            0            0            0            0       0  0  0  0
1     0            0            0            0            0       0  0  0  0
2     0            0            0            1+B(0)=1     1       0  0  0  1
3     0            4+B(0)=4     0            1+B(1)=1     4       0  1  0  0
4     3+B(0)=3     4+B(1)=4     0            B(3)=4       4       0  1  0  0
5     3+B(1)=3     4+B(2)=5     0            1+B(3)=5     5       0  1  0  1
6     3+B(2)=4     4+B(3)=8     0            1+B(4)=5     8       0  2  0  0
7     3+B(3)=7     4+B(4)=8     5+B(0)=5     B(6)=8       8       0  2  0  0
8     3+B(4)=7     4+B(5)=9     5+B(1)=5     1+B(6)=9     9       0  2  0  1
9     3+B(5)=8     B(8)=9       5+B(2)=6     1+B(7)=9     9       0  2  0  1
10    3+B(6)=11    B(9)=9       5+B(3)=9     B(9)=9       11      1  2  0  0

An optimal packing is one W and two X's, for a total benefit of 11. Notice that an unbounded problem would pack three X's, for a total benefit of 12.
Quick Check 8.15
1. The knapsack holds 10 units of volume. The benefit-to-volume ratios are shown in the next table.

Item    Benefit    Volume    Quantity    B/V Ratio
W       3          4         1           3/4 = .75
X       4          3.5       3           4/3.5 ≈ 1.143
Y       5          7         1           5/7 ≈ .714
Z       1          2         1           1/2 = .5

The heuristic will treat item X as the first item, followed by W, then Y, and finally Z. The inequality will compare X to W:
$$4 \left\lfloor \frac{10}{3.5} \right\rfloor = 8 > 7.5 = 3 \cdot \frac{10}{4}.$$
The inequality is true, so the heuristic ensures that $\min\{3, \lfloor 10/3.5 \rfloor\} = 2$ units of item X should be packed. The knapsack will still have three units of volume left to fill, but no additional item X's will fit. The heuristic can be applied a second time using only items W, Y, and Z and a knapsack with volume 3. It will compare item W to item Y:
$$3 \left\lfloor \frac{3}{4} \right\rfloor = 0 > \frac{15}{7} \approx 2.143.$$
The inequality is false, so the heuristic provides no additional suggestions. An optimal solution is to pack two item X's and one item Z, for a total benefit of 9. The reason that the heuristic fails to suggest packing an item W is because Theorem 8.16 only checks to see if $\lfloor V/v_1 \rfloor$ item 1s is better than $V/v_2$ item 2s. It does not properly compensate for the fractional part of $V/v_2$ ($3/7$ in this case).
Quick Check 8.16
1. Use the three encoding equations.
(a) The check bits are $x_5 = 1 + 0 + 1 \equiv 0 \pmod 2$, $x_6 = 1 + 0 + 1 \equiv 0 \pmod 2$, and $x_7 = 1 + 1 + 1 \equiv 1 \pmod 2$. The code word is therefore 1101001.
(b) The check bits are $x_5 = 0 + 0 + 1 \equiv 1 \pmod 2$, $x_6 = 0 + 0 + 1 \equiv 1 \pmod 2$, and $x_7 = 0 + 0 + 1 \equiv 1 \pmod 2$. The code word is therefore 0001111.
2. Use the decoding table.
(a) The received check bits are $x_5^r = 0$, $x_6^r = 0$, and $x_7^r = 1$. The calculated check bits are $x_5^c = 0 + 1 + 1 \equiv 0 \pmod 2$, $x_6^c = 0 + 1 + 1 \equiv 0 \pmod 2$, and $x_7^c = 0 + 0 + 1 \equiv 1 \pmod 2$. Since the two sets of check bits are the same, the transmission is assumed to be without error. The message is therefore 0011.
(b) The received check bits are $x_5^r = 0$, $x_6^r = 1$, and $x_7^r = 1$. The calculated check bits are $x_5^c = 0 + 1 + 1 \equiv 0 \pmod 2$, $x_6^c = 0 + 1 + 1 \equiv 0 \pmod 2$, and $x_7^c = 0 + 0 + 1 \equiv 1 \pmod 2$. Since $x_6^r \ne x_6^c$, the transmission is assumed to contain an error. The decoding table indicates that bit $x_6$ is where the error occurred. The message bits were received unaltered, so the message is therefore 0011.
(c) The received check bits are $x_5^r = 0$, $x_6^r = 0$, and $x_7^r = 1$. The calculated check bits are $x_5^c = 0 + 1 + 1 \equiv 0 \pmod 2$, $x_6^c = 1 + 1 + 1 \equiv 1 \pmod 2$, and $x_7^c = 1 + 0 + 1 \equiv 0 \pmod 2$. Since $x_6^r \ne x_6^c$ and $x_7^r \ne x_7^c$, the transmission is assumed to contain an error. The decoding table indicates that bit $x_1$ is where the error occurred. The code word that was sent is therefore assumed to have been 0011001, corresponding to the message 0011.
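The encoding and decoding steps can be imitated in a few lines. This sketch rests on two assumptions. First, the encoding equations are inferred from the arithmetic in the solutions above as $x_5 = x_2 + x_3 + x_4$, $x_6 = x_1 + x_3 + x_4$, and $x_7 = x_1 + x_2 + x_4$ (all mod 2); the original equations are not reproduced in this section. Second, instead of the book's decoding table, the sketch decodes by picking the code word nearest to the received word, which gives the same answer whenever at most one bit is in error.

```python
from itertools import product

def encode(message):
    """Append three check bits to a 4-bit message (equations inferred, see lead-in)."""
    x1, x2, x3, x4 = message
    return (x1, x2, x3, x4,
            (x2 + x3 + x4) % 2,
            (x1 + x3 + x4) % 2,
            (x1 + x2 + x4) % 2)

CODE_WORDS = [encode(m) for m in product((0, 1), repeat=4)]

def distance(u, v):
    """Hamming distance: number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def decode(received):
    """Nearest-code-word decoding; returns the 4 message bits."""
    nearest = min(CODE_WORDS, key=lambda c: distance(c, received))
    return nearest[:4]

print(encode((1, 1, 0, 1)))                 # problem 1(a)
print(encode((0, 0, 0, 1)))                 # problem 1(b)
print(decode((0, 0, 1, 1, 0, 1, 1)))        # problem 2(b): error in x6
print(decode((1, 0, 1, 1, 0, 0, 1)))        # problem 2(c): error in x1
```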
Quick Check 8.17
1. The eight messages and their corresponding code words are shown in the following table.

Message    Code Word    Message    Code Word
000        000000       100        100110
001        001011       101        101101
010        010101       110        110011
011        011110       111        111000

2. There are three message bits out of a total of six, so the efficiency is $\frac{3}{6} = \frac{1}{2}$.
3. Since the code is linear, the minimum distance, $d$, of the code is the smallest nonzero weight among the code words (Corollary 8.3). The table of code words indicates that this minimum weight is 3. Theorem 8.19 indicates that the code is 1 error-correcting ($2t + 1 = 3$, so $t = 1$).
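Quick Check 8.18
1. The parameters $(23, 2^{12}, 7)$ indicate that $M = 2^{12}$, $n = 23$, and $t = 3$ errors can be corrected. Thus
$$2^{12} \cdot \sum_{i=0}^{3} \binom{23}{i} = 2^{12} \cdot \left(\binom{23}{0} + \binom{23}{1} + \binom{23}{2} + \binom{23}{3}\right) = 2^{12} \cdot (1 + 23 + 253 + 1771) = 2^{12} \cdot 2^{11} = 2^{23}$$
as required.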
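The arithmetic above is easy to confirm. The following sketch checks that the spheres of radius $t$ around the $M$ code words exactly fill the space of $2^n$ binary words; the helper names `sphere_size` and `is_perfect` are hypothetical, chosen for this illustration.

```python
from math import comb

def sphere_size(n, t):
    """Number of binary words of length n within Hamming distance t of a fixed word."""
    return sum(comb(n, i) for i in range(t + 1))

def is_perfect(n, M, t):
    """True when M spheres of radius t account for all 2**n words."""
    return M * sphere_size(n, t) == 2 ** n

print(sphere_size(23, 3))
print(is_perfect(23, 2 ** 12, 3))
```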
8.8 Chapter Review
Quick Check 8.19
1. The marriage condition is satisfied.
k = 1: $|A_1| = 3$, $|A_2| = 1$, and $|A_3| = 1$
k = 2: $|A_1 \cup A_2| = 3$, $|A_1 \cup A_3| = 3$, and $|A_2 \cup A_3| = 2$
k = 3: $|A_1 \cup A_2 \cup A_3| = 3$
2. The marriage condition is not satisfied because $|A_1 \cup A_2 \cup A_3| < 3$.
k = 1: $|A_1| = 2$, $|A_2| = 1$, and $|A_3| = 1$
k = 2: $|A_1 \cup A_2| = 2$, $|A_1 \cup A_3| = 2$, and $|A_2 \cup A_3| = 2$
k = 3: $|A_1 \cup A_2 \cup A_3| = 2$
Quick Check 8.20
1. Single out person A. Place all A's friends in a group and all A's enemies in another group. There are five people that are being placed in groups, so one of the groups must have at least three members (by the generalized pigeon-hole principle). Suppose the group of friends has three or more members. If two of them (say B and C) are also friends with each other, then A, B, and C form a set of three mutual friends. Otherwise the three or more people in the group of A's friends are all mutual enemies. On the other hand, suppose the group of A's enemies contains three or more people. If two of them (again named B and C) are enemies with each other, then A, B, and C form a set of three mutual enemies. Otherwise there must be three people in the group of A's enemies that are all mutual friends.

Quick Check 8.21
1. $R(j, 3; 3) = j$. Suppose $|S| \le j - 1$. The three-element subsets of $S$ can be partitioned by placing all of them into $X$, with $Y = \emptyset$. Since $S$ has no $j$-element subsets, there cannot be a subset $T \subseteq S$ whose three-element subsets are all in $X$. On the other hand, there are no three-element subsets $U \subseteq S$ all of whose three-element subsets are in $Y$. Thus, $R(j, 3; 3) > j - 1$. Suppose $|S| = j$. If any one of the three-element subsets of $S$ is placed in $Y$, then that subset can be used as $U$. On the other hand, if every three-element subset is placed in $X$, then let $T = S$. In either case, the $(j, 3; 3)$ Ramsey condition is satisfied. Since this argument works even if $|S| > j$, $R(j, 3; 3) \le j$. Combining the inequalities shows that $R(j, 3; 3) = j$.
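The friends-and-enemies argument of Quick Check 8.20 concerns six people, and the claim is small enough to confirm exhaustively. In this illustrative sketch (function names are made up), each pair of people is labeled 0 for "enemies" or 1 for "friends", and every one of the $2^{15}$ labelings for six people is checked for three mutual friends or three mutual enemies. The same search with five people finds a labeling with neither, which is why six is the smallest group size that works.

```python
from itertools import combinations, product

def has_uniform_triple(n, label):
    """True if some three people are pairwise friends or pairwise enemies."""
    return any(label[(a, b)] == label[(a, c)] == label[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_labeling_has_uniform_triple(n):
    """Check all 2**C(n, 2) friend/enemy labelings of the pairs among n people."""
    pairs = list(combinations(range(n), 2))
    return all(has_uniform_triple(n, dict(zip(pairs, values)))
               for values in product((0, 1), repeat=len(pairs)))

print(every_labeling_has_uniform_triple(6))
print(every_labeling_has_uniform_triple(5))
```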
8.8.1 Summary This chapter provides an introduction to combinatorics by exploring several representative topics. These topics can be broadly organized by using three categories that encompass much of combinatorics: existence, enumeration, and optimization. Many of the topics in this chapter have subtopics in more than one of these broad categories. The chapter starts with some enumeration topics (Section 8.1). More specifically, it starts with partitions of an integer and occupancy problems. In the course of the discussion, the sets of Stirling numbers of the first and second kinds are introduced. Occupancy problems can be subdivided into eight categories. The categories add new kinds of counting problems to the categories of permutations and combinations (with and without repetition) encountered in Chapter 5.
Section 8.2 discusses two apparently unrelated topics: Latin squares and finite projective planes. Existence is important in both topics. In fact, the two topics are related because constructions exist that create sets of Latin squares from finite projective planes and that create finite projective planes from an appropriate collection of mutually orthogonal Latin squares. Enumeration also shows up in this section (estimating the number of mutually orthogonal Latin squares of order n). You may find some of the material in this section to be challenging.
Existence is also a key concern in Section 8.3. Balanced incomplete block designs are examples of combinatorial designs (as are Latin squares and finite projective planes). One of the early theorems in the section provides another opportunity to use a combinatorial proof. The basic definitions and some standard properties of BIBDs are presented, as well as some applications in the design of experiments. The rest of the section presents a number of constructions. This section illustrates two distinct aspects of existence. Some of the theorems can be used to show that BIBDs with certain parameters cannot exist. There are also constructions that actually create BIBDs with other sets of parameters.
Section 8.4 describes a topic from the combinatorial optimization category. Knapsack problems have optimal solutions, but exhaustively checking all possible arrangements is not feasible in most cases. Rather than providing an analytic determination [footnote 50] of an optimal packing, an algorithm is presented that produces an optimal solution.
Section 8.5 is about error-correcting codes. Much of the work in error-correcting codes has been accomplished by using ideas found in courses about linear algebra and algebraic structures. The material presented here is more combinatorial in nature.
The basic definitions and some introduction to the notion of perfect codes are perhaps the most significant portions of the section.
The final section in the chapter (Section 8.6) contains material that is about sets of sets. The section contains two topics. The first topic, systems of distinct representatives, determines whether it is possible to choose a distinct element from each set in a collection. The second topic, Ramsey numbers, investigates the minimum size of a set necessary to guarantee the existence of a property related to some of its subsets. This section contains some fairly advanced material.
This chapter contains some fairly abstract, theoretical material. There is also some fairly concrete material, either in the form of identifying the proper formula or in the form of constructions, often small enough to list in a table or draw in a diagram. You will need to spend extra time reviewing the more abstract material. Gaining an intuitive feel for the definitions and theorems will help you master the material. You may find it very helpful to form a study group to discuss this chapter. You will benefit from the opportunity to express ideas verbally to others and will also benefit from their insights.
Footnote 50: An example of analytically determining an optimal value is the familiar max-min problems in calculus. A function is found that describes the problem, and then the first derivative is used to find the critical points. There are several other derivative-based tests that can be used to determine which of the critical points might provide an optimal solution to the original problem.
8.8.2 Notation
Notation                     Page   Brief Description
p(n)                         405    the number of partitions of n
p(n, k)                      405    the number of partitions of n that contain exactly k summands
O C D I ∅ ¬∅                 409    in occupancy problems: O-object, C-container, D-distinguishable, I-indistinguishable, ∅-containers may be empty, ¬∅-containers may not be empty
S(n, k)                      410    Stirling number of the second kind
s(n, k)                      418    Stirling number of the first kind
(x)_n                        416    the falling factorial
L_j                          424    one of a set of mutually orthogonal Latin squares
p_j                          428    a point in a finite projective plane
L_j, {p_1, p_2, ..., p_k}    428    a line in a finite projective plane
m(n)                         425    the maximum number of mutually orthogonal Latin squares of order n
F                            430    a finite projective plane
∞_i                          434    a point at infinity in a finite projective plane
(v, b, r, k, λ)              446    the parameters of a balanced incomplete block design
(v, k, λ)                    447    the parameters of a symmetric balanced incomplete block design
D̄                            450    the complement design of the BIBD, D
D'                           450    the derived design of the BIBD, D
D*                           451    the residual design of the BIBD, D
v, v_i, b_i, q_i             457    parameters for a knapsack problem
B(v)                         459    optimal total benefit for a knapsack with capacity, v
Hd(u, v)                     474    the Hamming distance between binary strings, u and v
Hw(u)                        474    the Hamming weight of binary string, u
(n, M, d)                    474    the parameters of a binary error-correcting code
B^n                          478    the set of all binary strings of length n
S_t(u)                       478    the binary sphere of radius t about u
R(j, k)                      492    a Ramsey number
R(j, k; m)                   494    a generalized Ramsey number
8.8.3 Definitions
Existence: Many combinatorial topics relate to finding a special configuration of elements from some set. The existence problem seeks to determine if such a configuration is possible. If it is possible, the existence problem seeks ways to construct the configuration.
Enumeration: Sometimes it is not too difficult to construct a desired configuration. The interesting question then may be "how many distinct configurations are possible?". This is an enumeration problem.
Optimization: Sometimes a configuration of interest has additional properties that enable us to distinguish among acceptable alternatives. It may be that among all configurations of a certain type, some are more useful than others. The problem is then to find an optimal configuration.
Combinatorial Design: A combinatorial design is an arrangement of mathematical objects into a configuration that has some additional properties or requirements. (Balanced incomplete block designs, Latin squares, and projective planes are all examples of combinatorial designs.)
Magic Square: A magic square is an n-by-n matrix whose entries are the complete set of positive integers {1, 2, ..., n^2}, arranged in such a way that the sum of every row, the sum of every column, and the sum of each of the two diagonals are all the same number.
Partition of an Integer: A partition of an integer, n, is a representation of n as a sum of positive integers, where the order of the summands is not important. The number of partitions of n is denoted by p(n). The number of partitions of n that contain exactly k summands is denoted by p(n, k).
Occupancy Problems: Occupancy problems are concerned with placing objects into containers. Occupancy problems are categorized by whether the objects and containers are distinguishable or indistinguishable and also by whether or not containers can be empty.
Stirling Numbers of the Second Kind: The number of ways to distribute n distinguishable objects into k indistinguishable containers with every container receiving at least one object is denoted S(n, k). The numbers, S(n, k), are called the Stirling numbers of the second kind.
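The counts p(n, k) and S(n, k) defined above can be computed for small values with short recursive functions. The sketch below is in Python and uses two standard recurrences that are assumptions here, since this review section does not state them: p(n, k) = p(n - 1, k - 1) + p(n - k, k) and S(n, k) = S(n - 1, k - 1) + k S(n - 1, k). The function names are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p_exact(n, k):
    """Number of partitions of n into exactly k positive summands."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    # Either some summand equals 1 (remove it), or every summand is at
    # least 2 (subtract 1 from each of the k summands).
    return p_exact(n - 1, k - 1) + p_exact(n - k, k)

def p(n):
    """Number of partitions of n, summed over every possible number of summands."""
    return sum(p_exact(n, k) for k in range(n + 1))

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Ways to place n distinguishable objects into k indistinguishable,
    nonempty containers."""
    if n == k:
        return 1
    if k <= 0 or k > n:
        return 0
    # Object n is alone in its container, or it joins one of the k
    # containers already holding the other n - 1 objects.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

print(p(5), p_exact(5, 2), stirling2(4, 2))  # 7 2 7
```

For example, the two partitions of 5 with exactly two summands are 4 + 1 and 3 + 2.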
Linear Combination: Let e_1, e_2, ..., e_k be expressions. A linear combination of {e_1, e_2, ..., e_k} is an expression of the form c_1 e_1 + c_2 e_2 + ... + c_k e_k, where {c_1, c_2, ..., c_k} is a set of constants.
The Falling Factorial, (x)_n: The falling factorial is denoted (x)_n and is defined by (x)_0 = 1 and
$$(x)_n = \prod_{i=0}^{n-1} (x - i) = x(x - 1)(x - 2) \cdots (x - n + 1)$$
for n ≥ 1.
Stirling Numbers of the First Kind: The coefficients of the expansion of (x)_n as a linear combination of powers of x are the Stirling numbers of the first kind and are denoted by s(n, k). That is,
$$(x)_n = \sum_{k=0}^{n} s(n, k)\, x^k.$$
Latin Square: A Latin square of order n is an n-by-n matrix for which every entry is a number in {1, 2, ..., n}. Every number in {1, 2, ..., n} must appear at least once in every row and at least once in every column. A Latin square of order n is standardized if the elements in the first row are written in increasing numeric order moving left to right, and the elements in the first column are written in increasing numeric order moving top to bottom.
Permutation of a Set: A permutation of the set {1, 2, ..., n} is an ordered arrangement of the numbers 1, 2, ..., n.
Orthogonal Latin Squares: Let L_1 = (a_ij) and L_2 = (b_ij) be two Latin squares of order n. L_1 and L_2 are said to be orthogonal if the set of ordered pairs {(a_ij, b_ij) | i = 1, 2, ..., n and j = 1, 2, ..., n} contains n^2 distinct ordered pairs. That is, (a_ij, b_ij) ≠ (a_rs, b_rs) unless i = r and j = s. A collection of k Latin squares of order n is said to be mutually orthogonal if every pair in the collection is orthogonal.
Self-Orthogonal Latin Square: A Latin square is called self-orthogonal if it is orthogonal to its transpose.
Finite Projective Plane: A finite projective plane consists of a finite set, P, of points and a finite set, L, of lines. Lines are finite sets of points. If L = {p_1, p_2, ..., p_k}, then point p_i is said to be on line L and L is said to contain the point p_i, for i = 1, 2, ..., k. The following axioms characterize finite projective planes.
FPP1 Any two distinct points are on one and only one common line.
FPP2 Any two distinct lines contain one and only one common point.
FPP3 There exist four distinct points, no three of which are on a common line.
Finite Projective Plane of Order n: A finite projective plane, F, is said to have (or be of) order n if every line in F contains n + 1 points.
The Fano Plane: The smallest finite projective plane is named the Fano plane. It contains seven points and seven lines. Every line contains three points.
[Figure: the standard visual representation of the Fano plane]
Point at Infinity: In order to show perspective, painters use a "vanishing point" or a point at infinity. The majority of lines in the painting will tend toward the vanishing point. This is similar to looking down a long, very straight train track. It appears as if the two tracks converge and meet in the far distance. This terminology has been borrowed as an aid for discussing diagrams of finite projective planes (since the lines in every pair meet at a point).
Balanced Incomplete Block Design: A balanced incomplete block design, abbreviated BIBD, is a combinatorial design consisting of a finite collection of finite sets (called blocks), each consisting of a finite number of elements (called varieties). The boundary conditions a BIBD must satisfy are expressed in terms of five parameters, commonly expressed as the 5-tuple of positive integers, (v, b, r, k, λ). The parameter v represents the number of distinct varieties; the parameter b represents the number of blocks. Every variety is required to be in exactly r blocks, and every block must contain exactly k varieties. Finally, every pair of distinct varieties must appear in exactly λ common blocks. A combinatorial design that meets these conditions is often referred to as a (v, b, r, k, λ)-design. A (v, b, r, k, λ)-design with k = v and r = b is called trivial.
The Incidence Matrix of a BIBD: Let D be a (v, b, r, k, λ)-design with varieties {v_1, v_2, ..., v_v} and blocks {B_1, B_2, ..., B_b}. The incidence matrix of D is the v by b matrix, M, where
$$m_{ij} = \begin{cases} 1 & \text{if } v_i \in B_j \\ 0 & \text{otherwise.} \end{cases}$$
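The definitions of the Fano plane, a BIBD, and the incidence matrix fit together in a small computation. The sketch below, in Python, assumes one common labeling of the seven lines of the Fano plane (the labeling and the helper name `bibd_parameters` are illustrative choices, not taken from the text). It builds the incidence matrix and reads the parameters (v, b, r, k, λ) from it.

```python
from itertools import combinations

points = range(1, 8)
# One common labeling of the seven lines (an assumption for illustration).
lines = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
         {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

# Incidence matrix: M[i][j] = 1 if point i + 1 is on line j, else 0.
M = [[1 if p in line else 0 for line in lines] for p in points]

def bibd_parameters(M):
    """Return (v, b, r, k, lam) if M is the incidence matrix of a BIBD, else None."""
    v, b = len(M), len(M[0])
    row_sums = {sum(row) for row in M}                           # blocks per variety
    col_sums = {sum(M[i][j] for i in range(v)) for j in range(b)}  # varieties per block
    pair_counts = {
        sum(M[i][j] * M[h][j] for j in range(b))                 # common blocks per pair
        for i, h in combinations(range(v), 2)
    }
    if len(row_sums) == len(col_sums) == len(pair_counts) == 1:
        return (v, b, row_sums.pop(), col_sums.pop(), pair_counts.pop())
    return None

# Axiom FPP2: any two distinct lines share exactly one point.
fpp2_holds = all(len(a & b) == 1 for a, b in combinations(lines, 2))

print(bibd_parameters(M), fpp2_holds)  # (7, 7, 3, 3, 1) True
```

Under this labeling the Fano plane is a (7, 7, 3, 3, 1)-design: every point is on three lines, every line holds three points, and every pair of points shares exactly one line (axiom FPP1).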
Symmetric; Resolvable: A balanced incomplete block design is symmetric if v = b and r = k. Symmetric balanced incomplete block designs are often referred to as (v, k, λ)-designs. A balanced incomplete block design is resolvable if the blocks can be grouped into disjoint collections (of equal numbers of blocks) so that every variety appears exactly once in each group of blocks.
Replicated Design: If D is a (v, b, r, k, λ)-design, then a (v, nb, nr, k, nλ)-design, D_n, can be created by making n copies of each block in D. The design D_n is called a replicated design.
Complement Design: If D is a (v, b, r, k, λ)-design, then a (v, b, b - r, v - k, b - 2r + λ)-design, D̄, can be created by complementing each of the blocks in D. The design D̄ is called a complement design.
Derived Design: A derived design, D', is constructed from a symmetric BIBD, D, by selecting one block, B_0, and removing from the design all the varieties that are not in B_0.
Residual Design: A residual design, D*, is constructed from a symmetric BIBD, D, by selecting one block, B_0, and removing from the design all the varieties that are in B_0.
Kirkman's Schoolgirl Problem: Kirkman's Schoolgirl Problem is about a schoolmistress who has 15 young girls at her boarding school. Each day the schoolmistress lines the girls up in 5 rows of 3 girls each. She wishes to group the girls so that in the course of 7 walks, each girl will have been in a row with every other girl exactly once.
The Knapsack Problem: The knapsack problem is concerned with a knapsack that has positive integer volume (or capacity), v. There are n distinct items that may potentially be placed into the knapsack. Item i has positive integer volume, v_i, and positive integer benefit, b_i. In addition, there are q_i copies of item i available, where quantity, q_i, is a positive integer satisfying 1 ≤ q_i ≤ ∞. The integer variables, x_1, x_2, ..., x_n, will determine how many copies of item i are to be placed into the knapsack. The goal is to
Maximize
$$\sum_{i=1}^{n} b_i x_i$$
subject to the constraints
$$\sum_{i=1}^{n} v_i x_i \le v$$
and
$$0 \le x_i \le q_i.$$
If q_i = 1 for i = 1, 2, ..., n, the problem is a 0-1 knapsack problem. If one or more of the q_i is infinite, the problem is unbounded; otherwise, the problem is bounded.
Greedy Algorithm: A greedy algorithm is an algorithm that works by employing a series of local optimizations. Greedy algorithms often do not produce a global optimum.
Heuristic: A heuristic technique is one that suggests a course of action that is probably correct but is not guaranteed. Often the course of action is determined after some preliminary calculations are made.
Binary String: A binary string of length n is a sequence of n symbols, where each symbol is either a "0" or a "1".
7-Bit Hamming Code: A 7-bit Hamming code is a set of 16 binary strings of length 7. In each string, the first 4 bits contain an encoded message; the last 3 bits are cleverly chosen redundant bits.
Encoding a 7-Bit Hamming Code:
x5 = x2 + x3 + x4 (mod 2)
x6 = x1 + x3 + x4 (mod 2)
x7 = x1 + x2 + x4 (mod 2)
Decoding a 7-Bit Hamming Code:
Differing Check Bits    Error in Bit
x6 and x7               x1
x5 and x7               x2
x5 and x6               x3
x5, x6, and x7          x4
x5                      x5
x6                      x6
x7                      x7
Hamming Distance: The Hamming distance between two binary strings, u and v, having common length, n, is the number of positions in which u and v differ. The Hamming distance is denoted by Hd(u, v).
Hamming Weight: The Hamming weight of a binary string is the number of 1s in the string. The Hamming weight of the string, u, is denoted by Hw(u).
Adding and Subtracting Binary Strings: Let u and v be binary strings with common length, n. The sum and difference of the two strings are denoted u + v and u - v, respectively. Both operations are defined as the bitwise (mod 2) sum. That is, the bit in position k of u + v is the same as the bit in position k of u - v and has the value u_k + v_k (mod 2).
Binary Error-Correcting Code: A binary error-correcting code is a nonempty subset, C, of the set of all binary strings having length, n. Let |C| = M and let d be the minimum of Hd(u, v) over all pairs of distinct strings u, v ∈ C. C is characterized by the parameters n, M, and d, and is referred to as an (n, M, d) code. Parameter d is called the minimum distance of the code. The binary strings that are the elements of the code are called code words. An error-correcting code is linear if the sum (or difference) of any two code words is also a code word.
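The encoding equations and the decoding rule for the 7-bit Hamming code can be tried out directly. The following Python sketch is illustrative only (the function names and the dictionary `ERROR_POSITION` are assumptions, not the book's notation). The decoder recomputes the three check bits from the received message bits, notes which ones differ, and flips the single bit that would explain the difference.

```python
from itertools import product

def encode(x1, x2, x3, x4):
    """Append the three check bits x5, x6, x7 to a 4-bit message."""
    x5 = (x2 + x3 + x4) % 2
    x6 = (x1 + x3 + x4) % 2
    x7 = (x1 + x2 + x4) % 2
    return [x1, x2, x3, x4, x5, x6, x7]

# Bit to flip, keyed by the set of check bits that disagree.
ERROR_POSITION = {
    frozenset(): None,
    frozenset({6, 7}): 1, frozenset({5, 7}): 2, frozenset({5, 6}): 3,
    frozenset({5, 6, 7}): 4,
    frozenset({5}): 5, frozenset({6}): 6, frozenset({7}): 7,
}

def decode(received):
    """Correct at most one flipped bit and return the 4 message bits."""
    expected = encode(*received[:4])
    differing = frozenset(i + 1 for i in range(4, 7) if expected[i] != received[i])
    fixed = list(received)
    position = ERROR_POSITION[differing]
    if position is not None:
        fixed[position - 1] ^= 1
    return fixed[:4]

def hamming_distance(u, v):
    """Number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

code_words = [encode(*m) for m in product((0, 1), repeat=4)]
d = min(hamming_distance(u, v) for u in code_words for v in code_words if u != v)
print(len(code_words), d)  # 16 3

word = encode(1, 0, 1, 1)
word[2] ^= 1                # corrupt bit x3 during "transmission"
print(decode(word))         # [1, 0, 1, 1]
```

The 16 code words have minimum distance 3, so in the notation above this is a (7, 16, 3) code.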
Error-Correcting Capability; Efficiency: Let C be a binary error-correcting code. If it is possible to correctly decode any received string whenever t or fewer bits have been changed during transmission, but it is not always possible to decode correctly if t + 1 or more bits are changed, then C is said to be t-error correcting. If the code words in C have length n and the messages that the code words represent have length k, then C is said to have efficiency k/n.
The Binary Sphere Centered at a Code Word: Let C be a binary error-correcting code with code words of length n, and let B^n be the set of all binary strings of length n. The binary sphere of radius t centered at a code word, u, is denoted S_t(u) and is defined as S_t(u) = {x ∈ B^n | Hd(u, x) ≤ t}.
Perfect Code: Let C be a binary error-correcting code whose code words have length n. Let B^n be the set of all binary strings of length n and assume that C is a proper subset of B^n. Then C is called a perfect t-error correcting code if, for some t > 0, B^n is a disjoint union of the spheres S_t(u), for u ∈ C. Such a code may also be referred to as a perfect code, without specifying t.
Marriage Problem: The Marriage Problem concerns a group of eligible young women and a group of unmarried young men. The two groups need not be the same size. Each young woman makes a list of acceptable mates from among the group of young men. She then checks with each man on her list to see if he is willing to marry her. She removes the name of any man on the list who is unwilling to marry her. It is assumed that any man left on the list is completely acceptable as a mate. All the lists are then handed to a neutral referee. The referee must determine whether it is possible to marry each young woman to a young man who is on her list. Bigamy, of course, is not permitted.
A System of Distinct Representatives: Let A_1, A_2, ..., A_n be n (not necessarily distinct) subsets of a set U. A list, r_1, r_2, ..., r_n, of elements in U is called a system of distinct representatives for {A_1, A_2, ..., A_n} if
* r_i ∈ A_i, for i = 1, 2, ..., n
* r_i ≠ r_j, for i ≠ j
The Marriage Condition: Let A_1, A_2, ..., A_n be n (not necessarily distinct) subsets of a set U. The collection {A_1, A_2, ..., A_n} is said to satisfy the marriage condition if for every k with 1 ≤ k ≤ n and every choice of a size-k subcollection, {A_i1, A_i2, ..., A_ik}, with 1 ≤ i_1 < i_2 < ... < i_k ≤ n, |A_i1 ∪ A_i2 ∪ ... ∪ A_ik| ≥ k.
The Enhanced Marriage Condition: Let A_1, A_2, ..., A_n be n (not necessarily distinct) subsets of a set U. The collection, {A_1, A_2, ..., A_n}, is said to satisfy the enhanced marriage condition if for every k with 1 ≤ k ≤ n - 1 and every choice of a size-k subcollection, {A_i1, A_i2, ..., A_ik}, with 1 ≤ i_1 < i_2 < ... < i_k ≤ n, |A_i1 ∪ A_i2 ∪ ... ∪ A_ik| ≥ k + 1.
The (j, k) Ramsey Condition: Let S be a set with n elements. Let j ≥ 2 and k ≥ 2. S satisfies the (j, k) Ramsey condition if for every partition of the two-element subsets of S into the disjoint sets, X and Y, there is either a j-element subset, T, of S such that every two-element subset of T is in X, or else there is a k-element subset, U, of S such that every two-element subset of U is in Y.
R(j, k): The Ramsey number, R(j, k), is the smallest integer such that every set, S, with at least R(j, k) elements satisfies the (j, k) Ramsey condition.
The (j, k; m) Ramsey Condition: Let S be a set with n elements. Let j ≥ m ≥ 1 and k ≥ m ≥ 1. S satisfies the (j, k; m) Ramsey condition if for every partition of the m-element subsets of S into the disjoint sets, X and Y, there is either a j-element subset, T, of S such that every m-element subset of T is in X, or else there is a k-element subset, U, of S such that every m-element subset of U is in Y.
R(j, k; m): The Ramsey number, R(j, k; m), is the smallest integer such that every set, S, with at least R(j, k; m) elements satisfies the (j, k; m) Ramsey condition.
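The definitions of a system of distinct representatives and of the marriage condition can both be tested by brute force on small collections. The Python sketch below uses illustrative function names and example collections that are not from the text. One function checks the marriage condition over every subcollection, and the other searches for a system of distinct representatives by backtracking.

```python
from itertools import combinations

def satisfies_marriage_condition(sets):
    """Check that every choice of k sets has a union with at least k elements."""
    for k in range(1, len(sets) + 1):
        for chosen in combinations(sets, k):
            if len(set().union(*chosen)) < k:
                return False
    return True

def find_sdr(sets, used=()):
    """Backtracking search: return representatives r_1, ..., r_n, or None."""
    if len(used) == len(sets):
        return list(used)
    for r in sorted(sets[len(used)]):
        if r not in used:              # representatives must be distinct
            result = find_sdr(sets, used + (r,))
            if result is not None:
                return result
    return None

good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1, 2}, {1, 2}, {1, 2}, {3}]    # three sets share only two elements

print(satisfies_marriage_condition(good), find_sdr(good))  # True [1, 2, 3]
print(satisfies_marriage_condition(bad), find_sdr(bad))    # False None
```

On these two examples the answers agree: the collection that satisfies the marriage condition has a system of distinct representatives, and the one that violates it does not.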
1 1}.
With this notation, the language in the previous example can be described as L(G) = {(ab)^n | n > 0}. String concatenation is associative: If x, y, and z are strings, then (xy)z = x(yz). One additional set of notation will be useful.
Notation Let A and B be sets of symbols or sets of strings. Then
* AB = {ab | a ∈ A and b ∈ B}.
* A^0 = {λ}.
* A^n = A A^(n-1) for n ≥ 1.
* The Kleene closure of A is A* = {λ} ∪ A ∪ A^2 ∪ A^3 ∪ ... = ⋃_{i=0}^{∞} A^i.
* A+ = A ∪ A^2 ∪ A^3 ∪ ... = ⋃_{i=1}^{∞} A^i.
Notation Practice Let x = "ab" and y = "c". Let A = {r, s, t} and B = {u, v}. (Note: I have used the quotation marks to emphasize that x and y are strings. In this context, the quotation marks are often omitted.)
* x^3 = "ababab"
* x^2 y = "ababc"
* (x^2 y)^3 = "ababcababcababc"
* x* = {λ, "ab", "abab", "ababab", ...}
* x* = {λ} ∪ x+
* AB = {ru, rv, su, sv, tu, tv}
* B^3 = {uuu, uuv, uvu, uvv, vuu, vuv, vvu, vvv}
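The set operations in the notation practice are easy to experiment with. In the Python sketch below (the helper names are illustrative), strings are ordinary Python strings, the null string λ is modeled as the empty string "", and sets of strings are Python sets. Because A* is infinite, only a finite portion of it can be listed.

```python
def concat(A, B):
    """AB = {ab | a in A and b in B}."""
    return {a + b for a in A for b in B}

def power(A, n):
    """A^0 = {λ} and A^n = A A^(n-1) for n >= 1 (λ is the empty string)."""
    result = {""}
    for _ in range(n):
        result = concat(A, result)
    return result

def kleene_up_to(A, max_power):
    """The finite part A^0 ∪ A^1 ∪ ... ∪ A^max_power of the Kleene closure A*."""
    out = set()
    for i in range(max_power + 1):
        out |= power(A, i)
    return out

x, y = "ab", "c"
A, B = {"r", "s", "t"}, {"u", "v"}

print(x * 3)                     # ababab
print((x * 2 + y) * 3)           # ababcababcababc
print(sorted(concat(A, B)))      # ['ru', 'rv', 'su', 'sv', 'tu', 'tv']
print(len(power(B, 3)))          # 8
print(sorted(kleene_up_to({x}, 3), key=len))  # ['', 'ab', 'abab', 'ababab']
```

Python's string repetition, `x * 3`, plays the role of x^3 here.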
Quick Check 9.5 Let x = ab and let y = cde. Let A = {r, s} and B = {t, u}.
1. Is "abba" in {x^n | n > 0}?
2. What is xy^2?
3. Are x^0 and A^0 the same?
4. What is AB?
5. How many elements are in AB^2A?
The following examples show how grammars can be constructed to produce some well-defined languages.
{a^i b^j c^k | i, j, k ≥ 0}
How do we construct a grammar that would have L(G) = {a^i b^j c^k | i, j, k ≥ 0}? The set consists of zero or more a's followed by zero or more b's followed by zero or more c's. We need productions that will let us generate a bunch of a's, then move on to generate a bunch of b's, and finally generate some c's, then quit. The null string also needs to be derivable. The grammar can be formally specified as Σ = {λ, a, b, c}, Δ = {S, A, B, C}, and Π contains
S → A     move to the a section
A → aA    generate another a
A → B     move to the b section
B → bB    generate another b
B → C     move to the c section
C → cC    generate another c
C → λ     quit
The string aaccc can be derived by
S ⇒ A ⇒ aA ⇒ aaA ⇒ aaB ⇒ aaC ⇒ aacC ⇒ aaccC ⇒ aacccC ⇒ aaccc.
Everything seems fine. However, the grammar produced is not a regular grammar since it contains the productions S → A, A → B, and B → C.
Here is an alternative set of productions that does conform to the rules of regular grammars.
S → aA    move to the a section
S → bB    move to the b section
S → cC    move to the c section
S → λ     generate the null string and quit
A → aA    generate another a
A → bB    move to the b section
A → cC    move to the c section
A → λ     quit
B → bB    generate another b
B → cC    move to the c section
B → λ     quit
C → cC    generate another c
C → λ     quit
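Derivations in a regular grammar like the one above can be found mechanically, because every sentential form is a string of terminals followed by at most one nonterminal. The Python sketch below is an illustration under stated assumptions: the dictionary encoding of the productions and the function name `derive` are hypothetical, each right-hand side is written as one terminal followed by one nonterminal, and the null string λ is written as "".

```python
# The regular productions for {a^i b^j c^k | i, j, k >= 0}; "" stands for λ.
PRODUCTIONS = {
    "S": ["aA", "bB", "cC", ""],
    "A": ["aA", "bB", "cC", ""],
    "B": ["bB", "cC", ""],
    "C": ["cC", ""],
}

def derive(target, productions=PRODUCTIONS, start="S"):
    """Return the sentential forms of one derivation of target, or None."""
    def search(prefix, nonterminal):
        for rhs in productions[nonterminal]:
            if rhs == "":
                if prefix == target:        # the nonterminal is erased: done
                    return [prefix]
                continue
            terminal, next_nonterminal = rhs[0], rhs[1]
            new_prefix = prefix + terminal
            if target.startswith(new_prefix):
                rest = search(new_prefix, next_nonterminal)
                if rest is not None:
                    return [new_prefix + next_nonterminal] + rest
        return None

    steps = search("", start)
    return None if steps is None else [start] + steps

print(" => ".join(derive("aaccc")))
# S => aA => aaA => aacC => aaccC => aacccC => aaccc
print(derive("ba"))  # None
```

The search only extends a sentential form while its terminal part is still a prefix of the target, so it always terminates. The string ba has no derivation because no production lets an a follow a b.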
The listing of the productions in the previous example is a bit long (and this is just a simple example). A shorthand notation is available to simplify the specification of productions. The shorthand notation uses the "pipe" symbol, |, to represent the word or. The productions
* A → aA
* A → bB
can both be written on one line as A → aA | bB. The grammar in Example 9.18 can be written succinctly as Σ = {λ, a, b, c}, Δ = {S, A, B, C}, with start symbol S, and Π contains
* S → aA | bB | cC | λ
* A → aA | bB | cC | λ
* B → bB | cC | λ
* C → cC | λ
Quick Check 9.6
1. Produce a regular grammar that generates the language {a^i b^j c^k | i, k ≥ 0, j ≥ 1}. Use at most four nonterminals. Write productions using the shorthand notation with |.
2. Derive the string bbc using the grammar from 1.
No Double Letters Let T = {a, b}. How would we specify a grammar that generates the language of all non-null strings in T* that contain no double letters? (The string ababbab would be rejected since it contains bb.) On first thought, it might be possible to make every production always contain either an ab or a ba, depending on what the previous letter was (perhaps represented by the nonterminal we are replacing). However, this won't allow acceptable strings such as a or aba to be generated. What is needed are productions that produce a terminal symbol, followed by a nonterminal that "remembers" which terminal it came with. Then we ensure that the nonterminal only leads to the other character. Formally, G = {Σ, Δ, S, Π}, where Σ = {a, b, λ}, Δ = {S, A, B}, and Π contains the six productions
S → aA | bB    can start with either letter
A → bB | λ     switch letters or quit
B → aA | λ     switch letters or quit
The string aba would be derived as
S ⇒ aA ⇒ abB ⇒ abaA ⇒ aba.
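One way to gain confidence in a grammar like this is to enumerate every string it generates up to some length and compare the result with a direct description of the language. The following Python sketch does that for the six productions above. The dictionary encoding and function names are illustrative assumptions, and "" stands for the null string λ.

```python
from itertools import product

# S -> aA | bB,  A -> bB | λ,  B -> aA | λ
PRODUCTIONS = {"S": ["aA", "bB"], "A": ["bB", ""], "B": ["aA", ""]}

def generated(max_len, productions=PRODUCTIONS, start="S"):
    """All terminal strings of length <= max_len derivable from start."""
    results = set()
    frontier = [("", start)]            # (terminals so far, pending nonterminal)
    while frontier:
        prefix, nonterminal = frontier.pop()
        for rhs in productions[nonterminal]:
            if rhs == "":
                results.add(prefix)     # the nonterminal is erased: a finished string
            elif len(prefix) < max_len:
                frontier.append((prefix + rhs[0], rhs[1]))
    return results

def no_double_letters(max_len):
    """Non-null strings over {a, b} with no two equal adjacent letters."""
    return {
        "".join(w)
        for n in range(1, max_len + 1)
        for w in product("ab", repeat=n)
        if all(s != t for s, t in zip(w, w[1:]))
    }

print(sorted(generated(3), key=lambda w: (len(w), w)))
# ['a', 'b', 'ab', 'ba', 'aba', 'bab']
print(generated(6) == no_double_letters(6))  # True
```

Agreement up to length 6 is evidence, not a proof, that the grammar generates exactly the strings with no double letters.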
Recall that the definition of a regular grammar specified that a production can have at most one nonterminal on the right-hand side and any such nonterminal must be the rightmost symbol. This restriction makes regular grammars easy to work with and causes regular languages to have a very simple structure. However, regular grammars are not always as expressively powerful as we need.
A Language that Is Not Regular Let Σ = {a, b}. The language (subset of Σ*) specified by L = {a^n b^n | n > 0} cannot be generated by a regular grammar (and so is not a regular language). The problem centers on the need to remember the number of a's that were in the first part of the string while producing the final half of the string. You may have observed that having the only nonterminal in a production's right-hand side as the final symbol causes derivations to always expand from left to right. That means that all the a's will have been generated (and "forgotten") before any of the b's can be produced. The productions will have no way to remember how many a's were generated (or even that multiple a's were produced). A more formal justification for the claim that this language is not regular uses a result called the pumping lemma. Details can be found in [48].
9.3.2 Exercises (b) L = {an baa In > 0} (c) L= {akbbam Ik > landm >0}
The exercises marked with O have detailed solutions in Appendix G. 1. Which of the following sets of productions can be from a regular grammar? (a) {S- aXI, X---X } (b) I{S-->aXIbY, XbY, Y-aXIYIbi (c) {S -- aX I bY, X -- bYa Ib, Y -+ bX I a} (d) IS- aX I YZ, X -- aZ, Y --* b, Z -- aX Ia} 2. Prove that string concatenation is associative, {uvw} U {vw} = (u, X}{vw}.chie that 0- Showthat 3. 3.PrShowe if vwand B arwe sets o g, tn (AB) U B = (A U
10. Let E = 10, 1, X}.Create a regular grammar that generates each of the following languages. (a) L = {(01)n In > 1} (b) L = {(01)n In > 0} (c) L={001n00In>2) 11. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice. (a) If E is a finite set of symbols, then XE*is also a finite set. (b) A production in a regular grammar is not permitted to contain a terminal symbol on its left-hand side.
osn)B.
5. Let E = {0,1 }, A = {S, X, Y}, and FI = {S --* OX I lY, X --+ OX I1, Y --
(c) ODLet G be a regular grammar. Then X E L(9).
ooY I1}
and let S be the start symbol. Which of the following strings are derivable from S? If the string can be derived, show a derivation; otherwise give some reason why it cannot. (a) 011 (b) 101 (c) O 100001
(d) 010001
(e) 0101
--
(b) Prove that
+
=
+.
13. P Let E be a set of symbols. Prove that (E
6. Let E = {a, b, c}, A = {S, W, X, Y}, and
Y
(d) Let x and y be symbols. Then (xy)+ represents one or more copies of the string "xy," concatenated together. 12. Let E be a set of symbols. (a) Prove that (E-I)* = E
aX I cY I c}
16. and let S be the start symbol. Which of the following strings are derivable from S? If the string can be derived, show a 17. derivation; otherwise give some reason why it cannot. (a) abba (b) aaab (c) bbacab 18. (d) bbaacb (e) caabb 7. Describe the language generated by the grammar g = {E, 19. A, S, fI-, where E = (a, b, c), A = IS, X, Y} and H7is (a) I{S --*aX IbY, Y --) bY Ib, X --+ aX I ci (b) {S -aX IbY, Y --> bY I c, X -+ aY I a} (c) IS- aX, X --* aX I bY, Y --- cY I hb 8. Describe the language generated by the grammar • = { ,minimal A = IS,X, Y, Z} and HIis A, S, FI}, where E = {0, 1, X.i, ZI } OXIOYI 1Z, X-+ 1Z, Y- OYIO, Z-(b IY S OXIIY I Z, Y OX i O, Z 01 (b)IS OX I IY, Ithe (c) {S--*OXI1Y, X---*1YI1Z, Y--OXIOZ, Z--+OI)i (a) {S-
9. Let E = {a, b}. Create a regular grammar that generates each of the following languages. (a) O L = {w E E* I w does not contain bbi
=
(E
14. Is (Σ*)+ = (Σ+)* always true? If it is always true, provide a proof; otherwise, provide a counterexample.
15. Find a grammar whose language has more words of length three than it has words of length 4.
16. Find sets A and B such that (A ∪ B)* ≠ (A* ∪ B*).
17. Let A = {x^n y^m | n ≥ 1, m ≥ 0}. Use the + and * operators to express A more succinctly.
18. Let A and B be any two (not necessarily disjoint) sets of finite strings over a common set, Σ, of symbols. (That is, A and B are two languages over Σ.) Prove that |AB| ≤ |A| · |B|.
19. Let A be a nonempty set such that A^2 = A.
(a) Prove that A+ = A.
(b) Prove that λ ∈ A. (Hint [35]: Consider the cases |A| = 1 and |A| > 1. For the second case, think about a non-null string in A with minimal length.)
(c) Prove that A* = A.
20. Prove that it is impossible to have a language, L ⊆ Σ*, with the following property:
Let x, y ∈ L, with x ≠ λ and y ≠ λ. Then the concatenation xy is in L if and only if x ≠ y. (This problem is from [13].)
9.4 Regular Expressions
One of the most immediately useful topics in this chapter is the notion of regular expressions. Many computer programs and operating systems support the use of regular expressions to specify the pattern to be found in a file. For example, if your word processor has opened a large document and you wish to find the phrase "separate the novices from the experienced" but can't remember whether the first word is spelled "separate" or "seperate" and also can't recall whether "novice" or "experienced" comes first, a regular expression would allow you to search and find the phrase in any of the possible combinations. One regular expression that would achieve this goal (and will be explained shortly) is "sep[ae]rate.+novice."
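The search described above can be tried with any tool that supports regular expressions. The short Python sketch below uses the standard `re` module as a stand-in for the word processor (an assumption made for illustration; regular expression dialects differ in their details) and applies the pattern to the four possible combinations.

```python
import re

pattern = re.compile(r"sep[ae]rate.+novice")

candidates = [
    "separate the novices from the experienced",
    "seperate the novices from the experienced",
    "separate the experienced from the novices",
    "seperate the experienced from the novices",
]

for text in candidates:
    print(bool(pattern.search(text)), text)   # True for all four

# A line where "novice" does not come after the first word is not matched.
print(bool(pattern.search("the novices are separate")))  # False
```

The pattern finds the phrase whichever way the first word is spelled and whichever of "novice" or "experienced" comes first, since it only requires that "novice" appear somewhere after the first word.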
9.4.1 Introduction to Regular Expressions
DEFINITION 9.10 An Informal Definition of Regular Expressions Let Σ be an alphabet. A regular expression over Σ is a mechanism for building or recognizing or matching a subset of Σ*. The subset is called the regular set generated by the regular expression. A regular expression serves as an abstract pattern that specifies which strings in Σ* belong to the corresponding regular set.
A Very Simple Regular Expression Let Σ be the ASCII character set. The regular expression mom matches the strings
* Hi mom!
* The moment I saw you, I fell in love.
* mom
but not the strings
* I love math!
* Mom said I could have it!
* Plant the mums over there.
since "mom" occurs as a substring of each string in the first group but not in any string in the second group. The strings in the first group are all part of the regular set specified by the regular expression "mom."
If directly specifying a substring were all there was to regular expressions, they would not be very useful or important. Fortunately, there are more powerful mechanisms for specifying patterns. Many UNIX programs use some form of regular expressions. A simple application would be to locate all lines in a document that contain the letter q followed by any letter except u. A regular expression that would identify all such lines is q[^u] (an explanation of the role played by [ and ] and ^ will be given shortly). We need to use several special characters (called metacharacters) to help specify the more interesting regular expressions. These characters will be presented next.
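Both patterns in this example can be checked with a few lines of Python. The sketch uses the standard `re` module, which is an assumption made for illustration since the text describes UNIX tools, and the sample lines for the q pattern are made up. The helper `contains` reports whether a pattern matches anywhere in a line.

```python
import re

def contains(pattern, line):
    """True if the regular expression matches somewhere in the line."""
    return re.search(pattern, line) is not None

matching = ["Hi mom!", "The moment I saw you, I fell in love."]
not_matching = ["I love math!", "Mom said I could have it!",
                "Plant the mums over there."]

print([contains("mom", s) for s in matching])      # [True, True]
print([contains("mom", s) for s in not_matching])  # [False, False, False]

# q followed by some character other than u (hypothetical sample lines)
lines = ["a quick question", "qintar and qoph", "Iraq is large", "Iraq"]
print([contains(r"q[^u]", s) for s in lines])      # [False, True, True, False]
```

The pattern is case sensitive, so "Mom" does not match. The pattern q[^u] needs some character after the q, so a line that ends in q is not matched, while "Iraq is large" matches because the q is followed by a space.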
Details Unfortunately, there is no standard set of rules for building regular expressions. In this section, a close approximation to a minimal set of generic rules (at least for UNIX systems) will be presented. The simplest rule can be approximated by stating: Most characters match themselves. In the first example, the characters in the string "mom" matched themselves
in the first group of strings. There are a few characters (the metacharacters) that don't match themselves. Instead, they serve other roles.
Metacharacters The characters on the next line have special meanings when they appear in a regular expression: I
$
(
)
\
"A*
?
+
Their meanings are described in the following discussion. A summary description can be found on page 584.
* The $ character matches the end of a line. For example, the regular expression
mom$
will only match strings for which "mom" occurs as the last characters on the line (followed by the newline character).
* The ^ character matches the beginning of a line. (There is another usage of the ^ metacharacter if it is used with a [ ] pair. This will be described in the next bullet.) The regular expression
^mom$
will only match strings for which "mom" occurs as the only characters on the line.
* The [ character initiates a [ ] pair. A regular expression consisting of a pair [ ] with a set of characters inside matches any one character from the set. A regular expression consisting of [^ ] with a set of characters after the ^ matches any one character that doesn't occur inside the [ ] pair.
Using [ ] Pairs

The regular expression
    [02468]
matches any string with at least one even digit. The regular expression
    [^0123456789]
matches any string with at least one character that is not a digit. The regular expression
    q[^u]
will match any string that contains a q followed by any character other than u. The regular expression
    [0-9]
matches any string with at least one digit. The "-" character acts as a range operator. If the "-" character is one of the characters you want as a choice in the [ ] pair, it needs to be the first character so that it isn't interpreted as a range operator. The regular expression
    [-+*/]
matches any one of the four arithmetic operators. A common method for specifying a single alphabetic character is [a-zA-Z].
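The bracket-pair patterns above can be tried in any tool with Perl-style regular expressions. The following is a minimal sketch in Python (an assumption of this example; the text itself uses grep and Perl). The helper name `contains` and the sample strings are hypothetical.

```python
import re

def contains(pattern, text):
    """Return True if the pattern matches some substring of text."""
    return re.search(pattern, text) is not None

# Each triple is (pattern, text, expected result when searching for a substring).
samples = [
    (r"[02468]", "I have 4 cats.", True),
    (r"[02468]", "I am 3 years old.", False),
    (r"[^0123456789]", "12345", False),   # every character is a digit
    (r"q[^u]", "Iraqi", True),            # q followed by i
    (r"q[^u]", "quiet", False),           # the only q is followed by u
    (r"[-+*/]", "7 - 2", True),           # "-" listed first, so it is literal
    (r"^mom$", "mom", True),
    (r"^mom$", "Call mom", False),
]

for pattern, text, expected in samples:
    print(pattern, repr(text), contains(pattern, text))
```

Note that q[^u] needs some character after the q, so a line ending in "Iraq" would not be matched by it.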
* The metacharacter | indicates an alternative. A regular expression containing a | matches any string that contains either the left or the right alternative. For example, a|b matches any string that contains either an a or a b. The regular expression a|b|c|d matches any string that contains one of the first four lowercase letters.
* The ( ) metacharacters are used to group characters into subpatterns of the regular expression. For example, the regular expression s(a|e|i)t matches any string that contains sat, set, or sit as a substring but doesn't match seat or seit.
* The \ metacharacter is used to convert a metacharacter back into a normal symbol (so it can match itself). For example, the regular expression \$5 will match any string containing the monetary amount $5. A DOS path name can be specified using C:\\Classes\\CWC, which matches any string containing "C:\Classes\CWC" as a substring.
* The . metacharacter matches any single character except a newline.
* The " metacharacter is used to surround a regular expression. Everything between a pair of " characters is treated as the regular expression. The main reason for this is for use with software that accepts regular expressions as command-line arguments. The " characters keep the operating system from trying to evaluate the regular expression as a part of a command.
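A short sketch of alternation, grouping, escaping, and the dot, again in Python as an assumed stand-in for grep or Perl. The helper `contains` and the test strings are hypothetical; the patterns are the ones discussed in the bullets.

```python
import re

def contains(pattern, text):
    """Return True if the pattern matches some substring of text."""
    return re.search(pattern, text) is not None

print(contains(r"a|b|c|d", "bed"))              # one of the first four letters
print(contains(r"s(a|e|i)t", "please sit"))     # contains "sit"
print(contains(r"s(a|e|i)t", "seat"))           # no sat, set, or sit inside
print(contains(r"\$5", "It costs $5."))         # escaped $ matches itself
print(contains(r"C:\\Classes\\CWC", r"cd C:\Classes\CWC"))
print(contains(r"a.c", "abc"), contains(r"a.c", "a\nc"))  # dot skips newline
```

The raw-string prefix `r"..."` plays a role similar to the surrounding quotes on a command line: it keeps Python from interpreting the backslashes before the regular expression engine sees them.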
The role played by the remaining metacharacters will be defined in the next subsection.

Before attempting the quick check problems, it might be useful to determine the availability of computer software that you can use to check your own answers. If you have access to a UNIX operating system, the standard utilities grep and egrep can be used.13 If you do not have a UNIX system, you can install Perl on your computer14 and use the following short Perl script, dmgrep. To use Perl to check your regular expressions, copy Table 9.7 to a file named dmgrep (Discrete Math grep). To use dmgrep to validate a regular expression for finding a line with at least one even digit, use the following command string (from a DOS window or a UNIX shell):15
    perl dmgrep "[02468]"

13 For details on use, see the man pages: man grep. You may find it helpful to use the metacharacters ^ and $ at the beginning and end of your pattern.
14 You can get Perl for free. The "Textbook-Related Links" section of http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/ has a link to CPAN (the Comprehensive Perl Archive Network), where you can find Perl distributions for most operating systems. The file, dmgrep, is available in the "Downloads" section of the DiscreteMathWithProof Web page.
15 In UNIX you may use either single quotes '...' or double quotes "..." to surround the regular expression. The alternative symbol can then be treated inside the pattern as a normal character.
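TABLE 9.7 dmgrep

    $pattern = shift @ARGV;        # the regular expression is the first argument
    while (<STDIN>) {              # read one line at a time
        if (m/($pattern)/) {       # line 3: does the line contain a match?
            print "$1\n";          # echo the first matched substring
        }
    }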
Then start typing lines of text. Whenever a line contains the pattern you are looking for, the matched pattern will be echoed; otherwise the program will do nothing. To end, type "control-Z" (Windows) or "control-D" (UNIX). The following is a sample session.

    perl dmgrep "[02468]"
    I am 3 years old.
    I have 4 cats.
    4
    My lunch cost $8.46 today.
    8

Notice that only the first possible match is echoed. A UNIX session using egrep or grep would look similar, except that the entire line in which a match occurred would be echoed.
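For readers without Perl, the behavior of the script can be approximated as follows. This is a sketch in Python, not the book's dmgrep; the function name `dmgrep` is reused only for illustration, and it takes a list of lines instead of reading the keyboard.

```python
import re

def dmgrep(pattern, lines):
    """Return the first matched substring of every line that contains a match."""
    regex = re.compile("(" + pattern + ")")   # same wrapping as m/($pattern)/
    echoed = []
    for line in lines:
        match = regex.search(line)            # leftmost match, like Perl's m//
        if match:
            echoed.append(match.group(1))
    return echoed

session = ["I am 3 years old.", "I have 4 cats.", "My lunch cost $8.46 today."]
print(dmgrep("[02468]", session))             # the sample session's echoes
```

Applied to the three lines of the sample session, the function returns the same two echoes, 4 and 8.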
Quick Check 9.7
1. Write a regular expression that matches the words ear and eat.
2. Write a regular expression that matches any word with an i that is not followed by an e or a z.
Repetition Specification

The real power of regular expressions comes from the ability to use concatenation, alternation, and Kleene closure. Kleene closure provides the ability to repeat a subpattern 0 or more times. As a convenience, most regular expression specifications add the ability to specify 0 or 1 occurrences and 1 or more occurrences.

* The * metacharacter indicates that 0 or more copies of the immediately preceding character or subpattern will be matched.
Using *

The regular expression
    (a|b)*cd*
will match any string that contains any finite (possibly empty) sequence of a's and b's, followed by exactly one c, followed by zero or more d's. The following strings are in the regular set generated by "(a|b)*cd*".

* aabac
* xyzbbacdddnm
* bbbbcd
* abdcd (Think carefully about this one.)

The string abdd will not be matched.
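To see which part of each string is actually matched, the example can be run through a regular expression engine. The sketch below assumes Python's `re` module; `first_match` is a hypothetical helper that returns the leftmost matched substring.

```python
import re

PATTERN = re.compile(r"(a|b)*cd*")

def first_match(text):
    """Return the leftmost substring matched by (a|b)*cd*, or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None

for text in ["aabac", "xyzbbacdddnm", "bbbbcd", "abdcd", "abdd"]:
    print(text, "->", first_match(text))
```

For abdcd the match is the trailing cd: the sequence of a's and b's before the c is allowed to be empty, so the leading abd is simply skipped.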
* The ? metacharacter matches either 0 or 1 copy of the immediately preceding character or subpattern. The regular expression
    wan?d
will match strings containing either wad or wand, but not the string wannd.
* The + metacharacter matches 1 or more copies of the immediately preceding character or subpattern. Thus, the regular expression
    a.+z
will match any string containing an a and a z, in sequence, with at least one other character between them. The strings a8z and a z would fit this criterion.
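The two patterns can be checked quickly. This is a sketch in Python under the same assumptions as before; `contains` is a hypothetical helper and the extra test strings (az, amazing) are not from the text.

```python
import re

def contains(pattern, text):
    """Return True if the pattern matches some substring of text."""
    return re.search(pattern, text) is not None

# ? allows zero or one n between "wa" and "d".
print([w for w in ["wad", "wand", "wannd"] if contains(r"wan?d", w)])

# + requires at least one character between the a and the z.
print([w for w in ["a8z", "a z", "az", "amazing"] if contains(r"a.+z", w)])
```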
1. Write a regular expression that matches any positive integer.
2. Write a regular expression that matches any integer.
Precedence

As usual, the major regular expression operators have an agreed-on precedence. Subexpressions that are inside parentheses are evaluated first. Kleene closure, X*, has the next highest priority, followed by concatenation, XY, and then alternative, X|Y.

Precedence

The regular expression ab*|c matches substrings that either begin with a single a, followed by zero or more b's, or else contain exactly one c. That is, it generates the regular set \(\{ab^n \mid n \ge 0\} \cup \{c\}\).

The regular expression a(b*|c) matches all substrings that begin with a single a, followed by either zero or more b's or else followed by one c. The associated regular set is thus \(\{ab^n \mid n \ge 0\} \cup \{ac\}\).

The regular expression (ab)*|c matches all substrings that begin with any collection of zero or more ab's or else exactly one c. The associated regular set is thus \(\{(ab)^n \mid n \ge 0\} \cup \{c\}\).
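Membership in the three regular sets can be tested by requiring the whole string to match, which is a different question from finding a matching substring. The sketch below assumes Python's `re.fullmatch`; the helper name `in_regular_set` is hypothetical.

```python
import re

def in_regular_set(pattern, word):
    """True if the entire word (not just a substring) is generated by pattern."""
    return re.fullmatch(pattern, word) is not None

words = ["", "a", "abbb", "ac", "c", "abab", "abb"]
for pattern in [r"ab*|c", r"a(b*|c)", r"(ab)*|c"]:
    print(pattern, [w for w in words if in_regular_set(pattern, w)])
```

The three patterns differ only in where the parentheses fall, yet ac belongs only to the second set and abab only to the third.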
The Formal Definition

DEFINITION 9.11 A Formal Definition of Regular Expressions
Let \(\Sigma\) be an alphabet. A regular expression over \(\Sigma\) is defined recursively by the following:
* The empty set, \(\emptyset\), is a regular expression.
* The empty string, \(\lambda\), is a regular expression.
* The symbol, a, is a regular expression for every symbol \(a \in \Sigma\).
* If the symbols or strings, A and B, are regular expressions, then their concatenation, AB, is also a regular expression.
* If the symbols or strings, A and B, are regular expressions, then A|B is also a regular expression.
* If the symbol or string, A, is a regular expression, then A* is also a regular expression.

The metacharacters introduced in the previous subsections provide convenient mechanisms for implementing this definition. For example, the regular expression
    ^$
implements the regular expression \(\lambda\). The [ ] pair is an alternative (more compact) mechanism for implementing | alternation when the subpatterns in the alternative are single symbols of \(\Sigma\). The ? and + metacharacters could be eliminated by using * and | in combination, but it is convenient to keep them.
Novices and Experts

Recall the example used to introduce the topic of regular expressions: Your word processor has opened a large document and you wish to find the phrase "separate the novices from the experienced" but can't remember whether the first word is spelled "separate" or "seperate" and also can't recall whether "novice" or "experienced" comes first. A regular expression that would achieve this goal is "sep[ae]rate.+novice". This will match the initial letters "sep", then match either an a or an e (depending on which actually appears in the document). It will then match an arbitrary string followed by the letters novice. The arbitrary string will be either "the" or "the experienced from the" (again depending on the order in which the words novice and experienced actually appear in the document).

Of course, the regular expression "sep[ae]rate.+novice" will also match strings such as "separate the young novices before they start a fight". However, the context assumes that any other such match is unlikely, so it is not worth the effort to create a more precise regular expression (but see Exercise 6 in Exercises 9.4.3 for an improved version).
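The claims in this example are easy to confirm. The following sketch assumes Python's `re` module; the helper `finds_phrase` and the last two sample sentences are hypothetical.

```python
import re

PHRASE = re.compile(r"sep[ae]rate.+novice")

def finds_phrase(text):
    """True if the text contains sep[ae]rate, then something, then novice."""
    return PHRASE.search(text) is not None

print(finds_phrase("separate the novices from the experienced"))
print(finds_phrase("seperate the experienced from the novices"))
print(finds_phrase("separate the young novices before they start a fight"))
print(finds_phrase("novices are kept separate"))   # wrong order: no match
```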
9.4.2 Perl Extensions

The programming language Perl has a number of useful extensions to the regular expression constructors presented so far. A few of these extensions are presented in Table 9.8. Perl uses regular expressions within a very powerful pattern matching operator that allows you to specify whether the match should be case sensitive or not and also allows the use of variables within the pattern. See a Perl text for the complete details.

TABLE 9.8 Perl Extensions for Regular Expressions

    Symbol   Represents
    \n       newline
    \r       carriage return
    \t       tab
    \f       formfeed
    \d       a digit, same as [0-9]
    \D       a nondigit
    \w       a (single) word character (alphanumeric), same as [_0-9a-zA-Z]
    \W       a nonword character
    \b       a word boundary (between \w and \W in some order)
    \B       a nonword boundary
    \s       a whitespace character, same as [ \t\n\r\f]
    \S       a nonwhitespace character
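Most of these shorthand symbols are also understood by other Perl-style engines. The sketch below uses Python's `re` module as an assumed stand-in for Perl; the `re.ASCII` flag is passed so that \w behaves like the character set listed in the table rather than Python's broader Unicode default. The sample strings are hypothetical.

```python
import re

# \d+ : runs of digits
digits = re.findall(r"\d+", "My lunch cost $8.46 today.")

# \b\w+\b : whole words, delimited by word boundaries
words = re.findall(r"\b\w+\b", "Plant the mums over there.", flags=re.ASCII)

# \s+ : split on any run of whitespace (tab, newline, space)
fields = re.split(r"\s+", "tab\tnewline\nspace end")

# \D : delete every nondigit
only_digits = re.sub(r"\D", "", "PO 181, ID 12345")

print(digits, words, fields, only_digits)
```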
The following examples illustrate more complex uses of regular expressions (with Perl extensions).
Finding an HTML Tag

Hyper-Text Markup Language (HTML) is used to define the logical structure of a Web page. One of the features that is often overlooked is the tag that defines a title to display on the top border of the Web browser, in a list of bookmarks, or in the list some search engines return. The HTML tag that defines this title looks like
    <TITLE>Your Title Here</TITLE>
The word title in the pairs of brackets need not be in upper case. The regular expression listed next can be used to scan an HTML file to make sure it has a valid TITLE tag. The dmgrep script can be set to ignore case (add the "i", for case insensitive, in the Perl expression m/($pattern)/i at line 3).
    perl dmgrep "(<TITLE>.*</TITLE>)" < MyHomePage.html
The end of the line (< MyHomePage.html) tells the operating system to grab lines from the file MyHomePage.html instead of from the keyboard.
E-mail Addresses

The following Perl regular expression will allow a file to be scanned for an e-mail address in the form username@bethel.edu. It also allows additional characters between @ and bethel.edu; for example, username@homer.acs.bethel.edu. Period, hyphen, and underscore characters are allowed in the username.16
    perl dmgrep "((\w|\.|-|_)+)[@](\S*)bethel[.]edu" < myfile
The regular expression requires one or more characters from the alphanumeric set (with periods, hyphens, and underscores also allowed). Then it must match an @ character, followed by 0 or more nonwhitespace characters, followed by bethel.edu.

The final example uses a rather complicated regular expression.
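Before moving on, the e-mail pattern can be exercised outside Perl. The sketch below assumes Python's `re` module and a reading of the pattern with balanced parentheses, ((\w|\.|-|_)+)[@](\S*)bethel[.]edu, which is a reconstruction rather than a verbatim copy. The helper `find_address` and the sample addresses are hypothetical.

```python
import re

ADDRESS = re.compile(r"((\w|\.|-|_)+)[@](\S*)bethel[.]edu")

def find_address(text):
    """Return (whole match, username, extra host part) or None."""
    match = ADDRESS.search(text)
    if match is None:
        return None
    return match.group(0), match.group(1), match.group(3)

print(find_address("write to username@bethel.edu today"))
print(find_address("jane_doe@homer.acs.bethel.edu"))
print(find_address("user@example.com"))
```

Group 3 holds whatever sits between the @ and bethel.edu, which is empty for the short form of the address.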
Class Lists

At one time, it was possible for an instructor at my college to receive an e-mail from the Administrative Computing Center that contained a list of all students enrolled in a class. However, the list contained lots of information (student ID number, post office box number, class rank, major code, advisor). A typical list might look like the following fragment:

    1 Anderson, Erik Vincent    12345 181 JR COMP LT 3.0 CR Turnquis:B
    2 Berget, Neil Jonathan     67890 375 SO COMS LT 3.0 CR Gossett,EJ
    3 Crownhart, Brian Scott    24680 392 SO COMP LT 3.0 CR Gossett,EJ

Suppose I want to extract the name and PO number but ignore the rest of the information. I want a list that looks like

    1 Anderson, Erik Vincent    181
    2 Berget, Neil Jonathan     375
    3 Crownhart, Brian Scott    392

that can be used for recording homework scores by hand.

16 To place this regular expression directly into a Perl script, it would be necessary to backslash escape the @ symbol: \@.
The critical portion of a Perl script that accomplishes this task is as follows. The only part you need to understand is the regular expression that appears inside m/ / in line 1.17

    if (m/(\s*\d+\s\w+,\s\w+\s\D*)(\d+)(\s)(\s*\d+)/) {
        print "$1$4 |\n";
    }

The regular expression looks for one or more leading digits (perhaps preceded by some whitespace), then a comma-separated name. It calls that part of the match $1. Then it tries to also match another run of digits (the student ID, which will be discarded), and then some more whitespace, followed by the PO number (named $4). If a successful match is made, the name (with leading number) and the PO are printed (with a trailing "|" to make things look nice).
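The same extraction can be sketched in Python, whose numbered groups play the role of Perl's $1 and $4. The regular expression is the one from the script; the function name `name_and_po` is hypothetical, and the sample row is written with single spaces between fields.

```python
import re

ROW = re.compile(r"(\s*\d+\s\w+,\s\w+\s\D*)(\d+)(\s)(\s*\d+)")

def name_and_po(line):
    """Return 'number name PO |' for a class-list row, or None if no match."""
    match = ROW.search(line)
    if match is None:
        return None
    # group(1): row number and name; group(2): student ID (discarded);
    # group(4): post office box number
    return f"{match.group(1)}{match.group(4)} |"

row = "1 Anderson, Erik Vincent 12345 181 JR COMP LT 3.0 CR Turnquis:B"
print(name_and_po(row))
```

The \D* at the end of the first group is what absorbs the middle name: it consumes every nondigit up to the start of the student ID.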
17 The m/ / characters tell Perl to match the regular expression inside the /'s. The variables $1, $2, and so on match the subregular expressions inside the parentheses.

9.4.3 Exercises

The exercises marked with † have detailed solutions in Appendix G.

1. Show that the + operator (metacharacter) is not necessary. That is, show that the regular expression x+ can be replaced by an equivalent regular expression that only uses concatenation and the | and * operators.
2. Write a regular expression that matches telephone numbers in the form #-###-###-####.
3. † Write a regular expression that matches words that start and end in vowels (the standard English vowels aeiou). Assume that only lowercase letters are being used. Assume that the word must have at least two letters.
4. Write a regular expression that matches any line of characters that does not contain any periods, commas, colons, or semicolons.
5. Create regular expressions that match single words with the following characteristics:
   (a) Containing at least one q.
   (b) † Containing a double vowel. A double vowel would be the same vowel twice, as in book.
   (c) Containing at least two double vowels. They can be the same double vowel, as in beekeeper.
6. Write a regular expression that will match only the eight phrases: "X the Y from the Z," where X is either "separate" or "seperate," and Y and Z are both either "novices" or "experienced," with Y ≠ Z.
7. Write a regular expression that matches any non-null binary string.
8. Write a regular expression that matches a valid Pascal identifier. A Pascal identifier consists of a letter, followed by 0 or more letters and/or digits.
9. Write a regular expression that matches a standard C identifier. A standard C identifier consists of a letter, followed by 0 or more letters, digits, and/or underscores.
10. A hexadecimal constant is usually written as a 0, followed by either an x or an X and then a mixture of digits and letters from the set {a, b, c, d, e, f, A, B, C, D, E, F} (for example, 0X3B6). Write a regular expression that matches any hexadecimal constant.
11. Write a regular expression that matches the string "\begin{definition}".
12. Write a regular expression that matches a complete sentence that is a question. The sentence should start with a capital letter and end with a question mark. Assume that the only permissible internal punctuation characters are commas.
13. Write a regular expression that matches a formal name. The name may have an optional title or honorific (such as Doctor, Dr., Miss, Mr.), a first name, an optional middle name or initial (initials will always have a period), a last name, followed by an optional period-free designation (such as Junior, III). The designation should be preceded by a comma. All parts of the name should begin with uppercase letters. You may assume that all words, except a middle initial, contain at least two letters.
14. † Write a regular expression that matches a credit card number that contains an expiration date. The number may either be in the form "dddd dddd dddd dddd mm/yy" or the form "dddd-dddd-dddd-dddd mm/yy" or the form "dddddddddddddddd mm/yy," where "d," "m," and "y" represent digits.
15. Write a regular expression that matches a letter grade. Assume that letter grades must be in the following set: {A, A-, B+, B, B-, C+, C, C-, D+, D, F, I, W}. You may not use the regular expression "(A|A-|B+|B|B-|C+|C|C-|D+|D|F|I|W)", even if you fix the error.
16. Write a regular expression that matches any single lowercase word. For this problem, a word must contain one or more consonants and at least one vowel. (Assume that a, e, i, o, and u are the only vowels, so the word why will not be matched.)
17. Write a regular expression that matches a single line of text (one delimited by a beginning and end of line, with arbitrary strings in the middle).
18. † Is it possible to write a regular expression that matches any line of text that does not contain the string "Percival"? Explain your answer (possibly by producing such a regular expression).
19. Write a regular expression that matches a real number in standard decimal notation (with optional decimal point). A leading minus sign is optional; leading plus signs need not be matched. Leading digits before the decimal point are optional, as are trailing digits after the decimal point. Multiple leading 0s are acceptable for this problem.
20. An infix expression is written in the form exp op exp, where exp is any expression and op is a binary operator. For this problem, assume that expressions are either integers or one-letter variables. Also, assume that operators are one of the four standard arithmetic operators: {+, -, *, /}. Write a regular expression that matches infix expressions with these restrictions.
21. Extend the previous exercise to allow expressions of the form (exp) op (exp). The parentheses are optional, but the expression inside can now consist of either a real number or a one-letter variable. [Hint: First write a regular expression that matches just (exp) (parentheses optional) and uses integers or one-letter variables for the expression. Next, add real numbers. Finally, reintroduce the operators and two expressions. Note: The final expression will be too long to test in a DOS window but will work on a UNIX system.]
9.5 The Three Faces of Regular

The main result of this section is that the input strings recognized by finite automata, the languages generated by regular grammars, and the sets generated by regular expressions are all the same.

A bit of review may be helpful. Suppose that we have an alphabet \(\Sigma\). The symbols in \(\Sigma\) might be used as the input symbols for a finite automaton.18 In that case, we are interested in which strings in \(\Sigma^*\) lead to a final state. On the other hand, if we are given a grammar, \(\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}\), with \(\Sigma\) as its set of terminal symbols, we are interested in which words in \(\Sigma^*\) belong to \(L(\mathcal{G})\).19 Finally, if we construct a regular expression over \(\Sigma\), we are interested in the regular set (a subset of \(\Sigma^*\)) generated by the regular expression.20

We will show that for corresponding choices of the finite automaton, regular grammar, and regular expression, the subset of \(\Sigma^*\) will be the same in each case. This means that the three mechanisms are equivalent in expressive power; any subset of \(\Sigma^*\) that can be specified using one of the mechanisms can be expressed using the other two as well. Figure 9.10 shows the relationships that will be proved.

[Figure 9.10 The three faces of regular. The diagram links finite-state automata, regular grammars, and regular sets and languages through "recognize" and "generate" relationships.]

The proofs of two of the theorems in the diagram are greatly simplified by introducing nondeterministic finite automata.

18 Definition 9.2 on page 531.
19 Definitions 9.6 and 9.9 on pages 541 and 543.
20 Definition 9.11 on page 552.
DEFINITION 9.12 Nondeterministic Finite Automaton
A nondeterministic finite automaton is a finite-state machine that relaxes three requirements in the definition of a finite automaton (page 531).
* It is permissible to move between states without any input symbol to trigger the transition. Such transitions are called \(\lambda\)-transitions and denoted by labeling the transition with the empty string.
* States do not need transitions associated with every input symbol.
* States may have more than one transition associated with the same input symbol.
The adjective deterministic is often used to emphasize that a finite automaton is not nondeterministic.

A Nondeterministic Finite Automaton

Figure 9.11 illustrates the previous definition. Notice that
* There is a \(\lambda\)-transition from the start state to state x.
* There are many omitted transitions. For instance, there is no b transition from state x and there are no transitions at all from state z.
* There are two distinct a transitions leaving state x.

[Figure 9.11 A nondeterministic finite automaton.]
There seems to be significant inherent ambiguity associated with nondeterministic finite automata (hence the adjective nondeterministic). However, the next theorem indicates that the ambiguity can be removed without changing the functionality of the automaton.
THEOREM 9.2
Any nondeterministic finite automaton can be transformed into a deterministic finite automaton that recognizes the same set of strings.

A constructive proof can be found in either [41] or [48]. The key idea is to have the states in the new (deterministic) finite automaton represent subsets of states in the original (nondeterministic) finite automaton.

It is now time to start proving that regular grammars, regular expressions, and finite automata are three equally expressive mechanisms for specifying sets of strings (languages).
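The key idea mentioned above (each state of the deterministic automaton is a subset of states of the nondeterministic one) can be sketched in code. The following is a minimal Python sketch, not the proof from [41] or [48]. The function names, the use of the empty string "" as the label of a λ-transition, and the small sample automaton (strings over {a, b} ending in ab) are all assumptions made for illustration.

```python
from collections import deque

def closure(states, delta):
    """All states reachable from `states` using only lambda-transitions ("")."""
    stack, seen = list(states), set(states)
    while stack:
        for nxt in delta.get((stack.pop(), ""), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def nfa_to_dfa(alphabet, delta, start, finals):
    """Subset construction: DFA states are frozensets of NFA states."""
    start_set = closure({start}, delta)
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            moved = {d for s in current for d in delta.get((s, symbol), ())}
            target = closure(moved, delta)    # empty set acts as a dead state
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_finals = {subset for subset in seen if subset & set(finals)}
    return start_set, dfa_delta, dfa_finals

def dfa_accepts(start, delta, finals, word):
    """Run the DFA; every symbol of word must belong to the alphabet."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in finals

# Hypothetical NFA: a lambda-transition from s to q0, two a transitions from q0.
nfa = {("s", ""): {"q0"}, ("q0", "a"): {"q0", "q1"},
       ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
dfa = nfa_to_dfa("ab", nfa, "s", {"q2"})
print([w for w in ["ab", "aab", "bab", "", "ba", "abb"] if dfa_accepts(*dfa, w)])
```

Only subsets that are actually reachable from the start subset are created, so the result is usually much smaller than the full power set of states.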
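THEOREM 9.3 Finite Automaton → Regular Grammar
Let \(\Sigma\) be the set of input symbols for a finite automaton \(\mathcal{A} = \{\mathcal{S}, \Sigma, t, s_0, \mathcal{F}\}\). Let R be the subset of \(\Sigma^*\) that is recognized by \(\mathcal{A}\). Then there is a regular grammar \(\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}\) such that \(L(\mathcal{G}) = R\).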
The theorem will be illustrated with an example that will motivate the formal proof.

Converting Example 9.8 to a Grammar

Example 9.8 constructed a finite automaton that recognizes any binary string that contains two adjacent 0s. The automaton's state diagram is repeated in Figure 9.12.

[Figure 9.12 Two 0s state diagram.]

The automaton can be formally specified by the following:21
* \(\mathcal{S}\) = {Zero, One, Two} is the set of states.
* \(\Sigma\) = {0, 1} is the set of input values.
* The transition function t is described by the state diagram.
* The start state is Zero.
* \(\mathcal{F}\) = {Two} is the set of final states.
We need to convert this to a grammar, \(\mathcal{G}\), that has a corresponding nature.22 The grammar can be specified as follows:
* \(\Sigma = \Sigma\) (the set of terminal symbols in \(\mathcal{G}\) is the set of input values for \(\mathcal{A}\)).
* \(\Delta = \{Z, E, T\}\) (one nonterminal in \(\mathcal{G}\) for each state in \(\mathcal{A}\); notice that E has replaced the notationally less suitable O).
* The start symbol is Z.
* \(\Pi\) is constructed from the transition function t. Transitions to nonfinal states will generate one production; transitions to final states will generate two productions.

    Transition           First Production    Second Production
    t(Zero, 0) = One     \(Z \to 0E\)
    t(Zero, 1) = Zero    \(Z \to 1Z\)
    t(One, 0) = Two      \(E \to 0T\)        \(E \to 0\)
    t(One, 1) = Zero     \(E \to 1Z\)
    t(Two, 0) = Two      \(T \to 0T\)        \(T \to 0\)
    t(Two, 1) = Two      \(T \to 1T\)        \(T \to 1\)

The pattern used is: \(t(s_k, i) = s_m\) becomes \(S_k \to iS_m\). If \(s_m\) is a final state, also add \(S_k \to i\).
Does \(\mathcal{G}\) generate the strings that \(\mathcal{A}\) recognizes? A few samples will raise confidence that it does. \(\mathcal{A}\) will recognize 0101001. This string can be derived as
\[Z \Rightarrow 0E \Rightarrow 01Z \Rightarrow 010E \Rightarrow 0101Z \Rightarrow 01010E \Rightarrow 010100T \Rightarrow 0101001.\]
\(\mathcal{A}\) does not recognize the string 11. The only production that can begin at the start symbol, Z, and produce 1s is \(Z \to 1Z\). There is no way to eliminate the nonterminal after generating just two 1s, so \(\mathcal{G}\) cannot generate the string 11.

The formal proof follows the construction hinted at in the previous example.

21 Look at Definition 9.2 on page 531.
22 See Definition 9.7 on page 542.
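Before the proof, the conversion pattern from the example can be sketched in code: each transition yields one production, plus a second one when the target state is final. This is a Python sketch under stated assumptions; the function names `productions` and `generates` are hypothetical, and `generates` relies on every right-hand side being a terminal optionally followed by a single one-letter nonterminal.

```python
# The "two adjacent 0s" automaton from the example.
TRANSITIONS = {("Zero", "0"): "One", ("Zero", "1"): "Zero",
               ("One", "0"): "Two",  ("One", "1"): "Zero",
               ("Two", "0"): "Two",  ("Two", "1"): "Two"}
FINALS = {"Two"}
NAME = {"Zero": "Z", "One": "E", "Two": "T"}   # nonterminal for each state

def productions(transitions, finals, name):
    """t(s_k, i) = s_m gives S_k -> i S_m, and also S_k -> i if s_m is final."""
    result = []
    for (state, symbol), target in transitions.items():
        result.append((name[state], symbol + name[target]))
        if target in finals:
            result.append((name[state], symbol))
    return result

def generates(prods, start, word):
    """Check whether the right-linear grammar derives a nonempty word."""
    current = {start}                      # nonterminals that may remain
    for pos, ch in enumerate(word):
        if pos == len(word) - 1 and any(l in current and r == ch for l, r in prods):
            return True                    # finish with a production S_k -> i
        current = {r[1:] for l, r in prods
                   if l in current and len(r) == 2 and r[0] == ch}
    return False

PRODS = productions(TRANSITIONS, FINALS, NAME)
print(PRODS)
print(generates(PRODS, "Z", "0101001"), generates(PRODS, "Z", "11"))
```

The six transitions produce nine productions, matching the table in the example, and the two sample strings behave as described there.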
Proof of Theorem 9.3: We need to construct a grammar with the required properties. We can begin by using the set, \(\Sigma\), of input values for \(\mathcal{A}\) as the set, \(\Sigma\), of terminal symbols for the grammar (but this set may eventually need to be expanded to include \(\lambda\)). The set, \(\Delta\), of nonterminal symbols can be created from the set, \(\mathcal{S}\), of states in \(\mathcal{A}\). For each state, \(s_i \in \mathcal{S}\), create a nonterminal symbol, \(S_i \in \Delta\). The nonterminal, \(S_0\), that is created to correspond to the start state, \(s_0\), will be the start symbol for the grammar.

Finally, we need to create a set of productions from which we can derive any string that \(\mathcal{A}\) can recognize. There will be either one or two productions created for each transition specified by t. Suppose that \(t(s_k, i) = s_m\). Then the production \(S_k \to iS_m\) will be added to \(\Pi\). If \(s_m \in \mathcal{F}\), then the production \(S_k \to i\) will also be added to \(\Pi\). If the start state is a final state, we also add \(S_0 \to \lambda\) to \(\Pi\) and add \(\lambda\) to \(\Sigma\) if it is not already there.

Can every string recognized by \(\mathcal{A}\) be derived from the start symbol in \(\mathcal{G}\)? Let \(i_1 i_2 i_3 \cdots i_j\) be an input string that is recognized by \(\mathcal{A}\), corresponding to the state changes \(r_1 r_2 r_3 \cdots r_j\), where \(r_k\) is one of the states in \(\mathcal{S}\), for each \(k \in \{1, 2, \ldots, j\}\). Thus, the input symbol \(i_1\) moves \(\mathcal{A}\) from state \(s_0\) to \(r_1\). Then the input symbol \(i_2\) moves \(\mathcal{A}\) from \(r_1\) to \(r_2\). This continues until the symbol \(i_j\) moves \(\mathcal{A}\) from \(r_{j-1}\) to \(r_j \in \mathcal{F}\).

We can derive the string \(i_1 i_2 i_3 \cdots i_j\) using the grammar. Begin with the start symbol, \(S_0\), and apply the production \(S_0 \to i_1 R_1\) (where \(R_1\) is really one of the nonterminals, \(S_k \in \Delta\), previously defined). Then replace \(R_1\) using the production \(R_1 \to i_2 R_2\). Such a production must exist because the transition \(t(r_1, i_2) = r_2\) has been used by \(\mathcal{A}\). Continue in this fashion until the string \(i_1 i_2 i_3 \cdots i_{j-1} R_{j-1}\) has been generated. The final substitution uses the production \(R_{j-1} \to i_j\), which is in \(\Pi\) since \(r_j\) is a final state.

In summary, we can derive \(i_1 i_2 i_3 \cdots i_j\) as
\[S_0 \Rightarrow i_1 R_1 \Rightarrow i_1 i_2 R_2 \Rightarrow i_1 i_2 i_3 R_3 \Rightarrow \cdots \Rightarrow i_1 i_2 i_3 \cdots i_{j-1} R_{j-1} \Rightarrow i_1 i_2 i_3 \cdots i_j.\]
Thus, any string recognized by \(\mathcal{A}\) can be generated by \(\mathcal{G}\) (so \(R \subseteq L(\mathcal{G})\)).

Will \(\mathcal{G}\) generate any strings that \(\mathcal{A}\) will not recognize? Notice that each production of \(\mathcal{G}\) follows a transition that exists in \(\mathcal{A}\). If a production permits the replacement of some nonterminal \(S_k\) by \(iS_m\), then the input symbol i will move \(\mathcal{A}\) from state \(s_k\) to state \(s_m\). The only productions that enable the derivation to end are those that correspond to a final state in \(\mathcal{A}\). So a replacement of the form \(S_k \to i\) corresponds to a transition from \(s_k\) to some final state. Therefore, \(\mathcal{G}\) will never generate a string that \(\mathcal{A}\) does not recognize [so \(L(\mathcal{G}) \subseteq R\)]. Since \(R \subseteq L(\mathcal{G})\) and \(L(\mathcal{G}) \subseteq R\), it must be the case that \(L(\mathcal{G}) = R\). □
THEOREM 9.4 Regular Grammar → Finite Automaton
Let \(L(\mathcal{G})\) be the language generated by a regular grammar, \(\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}\). Then there exists a finite automaton, \(\mathcal{A} = \{\mathcal{S}, \Sigma_A, t, s_0, \mathcal{F}\}\), which recognizes \(L(\mathcal{G})\).

The following example demonstrates the major ideas in the formal proof.

Grammar to Finite Automaton

Let \(\Sigma = \{a, b, \lambda\}\) and \(\Delta = \{S, X\}\). Let \(\Pi = \{S \to \lambda \mid abX,\ X \to bX \mid a\}\). Let \(\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}\). Then
\[L(\mathcal{G}) = \{\lambda, aba, abba, abbba, \ldots\}.\]
The finite automaton, \(\mathcal{A}\), will be defined to have input values \(\Sigma_A = \{a, b\}\). The set, \(\mathcal{S}\), of states will be built incrementally. The start symbol, S, for \(\mathcal{G}\) will correspond to the start state, \(s_0\), of \(\mathcal{A}\). Every other nonterminal symbol in \(\Delta\) will correspond to a lowercase version of itself as a state in \(\mathcal{S}\). The initial version of \(\mathcal{S}\) for this example is therefore \(\mathcal{S} = \{s_0, x\}\).
Because there is a production, \(X \to a\), that has no nonterminals on the right-hand side (and also does not have \(\lambda\) as the right-hand side), we will add a new state f to \(\mathcal{S}\). So we now have \(\mathcal{S} = \{s_0, x, f\}\).

At this point it is possible to identify the final states. Any nonterminal symbol that has a production of the form \(Y \to \lambda\) will correspond to a final state. If the state f has been added, it will also be a final state. For this example, \(\mathcal{F} = \{s_0, f\}\).

It remains to specify the transition function, t. In the process of specifying t it will be necessary to add additional states. The transition function will be specified by converting productions to transitions. The production \(S \to \lambda\) has already been accounted for by making \(s_0\) a final state. The production \(X \to a\) will become the transition \(t(x, a) = f\). The production \(X \to bX\) will become the transition \(t(x, b) = x\).

The remaining production, \(S \to abX\), will take more effort. The problem is the presence of multiple terminal symbols on the right-hand side. A finite automaton must accept single input values and specify a transition for each value. This can be accommodated by introducing a new intermediate state, \(s_1\), that fills the gap between the input symbols a and b in the right-hand side, abX. Thus, \(\mathcal{S} = \{s_0, x, f, s_1\}\). We now specify a transition on either side of \(s_1\): \(t(s_0, a) = s_1\) and \(t(s_1, b) = x\).

We are almost done. It might be helpful to examine the partial finite automaton constructed so far (Figure 9.13).

[Figure 9.13 Grammar to partial finite automaton.]

How should the transition function treat the missing input values at each state? Any other input will lead to a string that the grammar cannot generate. Thus, a state bh (which acts like a black hole) will be added. All other input symbols will drive \(\mathcal{A}\) to bh and the string will not be recognized (Figure 9.14).

[Figure 9.14 Grammar to finite automaton.]
If you try a few strings, both in and out of \(L(\mathcal{G})\), you should become convinced that this finite automaton recognizes exactly the set of strings in \(L(\mathcal{G})\).
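Trying a few strings is easy to automate. The sketch below encodes the automaton built in this example as a Python dictionary (Python being an assumption of this illustration). The transitions, the final states s0 and f, and the black hole state bh are the ones named in the text; the function name `accepts` is hypothetical.

```python
# Transitions constructed in the example; every missing entry leads to bh.
TRANSITIONS = {("s0", "a"): "s1", ("s1", "b"): "x",
               ("x", "b"): "x",   ("x", "a"): "f"}
FINAL_STATES = {"s0", "f"}

def accepts(word):
    """Run the automaton on word, starting in s0."""
    state = "s0"
    for ch in word:
        # Unspecified inputs, including anything read in bh, go to bh.
        state = TRANSITIONS.get((state, ch), "bh")
    return state in FINAL_STATES

for word in ["", "aba", "abba", "abbba", "ab", "abab", "b"]:
    print(repr(word), accepts(word))
```

The empty string is accepted because the start state is final, which mirrors the production that derives λ directly from S.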
The previous example is a bit misleading. It has been carefully constructed to avoid a pair of messy possibilities (\(\lambda\)-transitions and multiple transitions with the same input symbol). The next example introduces some of the messiness.

Grammar That Adds Nondeterminacy

Let \(\Sigma = \{a, b\}\) and \(\Delta = \{S, X, Y\}\). Let \(\Pi = \{S \to a,\ S \to aX,\ X \to bY,\ Y \to a,\ Y \to bY\}\). Then \(L(\mathcal{G}) = \{a\} \cup \{ab^n a \mid n \ge 1\}\). The process outlined in the previous example produces (Figure 9.15) a nondeterministic finite automaton (the start state has two a transitions).

[Figure 9.15 Grammar to a nondeterministic finite automaton.]

It is easy (for this example) to eliminate the nondeterminacy by adding a new state and then adding \(\lambda\) to \(\Sigma\) (Figure 9.16). Even if there were more than two a transitions from the start state, a sequence of \(\lambda\)-transitions to new states would remove the nondeterminacy.

[Figure 9.16 Adding a new state with a \(\lambda\)-transition.]

However, if there are multiple \(\lambda\)-transitions from some state, the previous trick will fail. An additional step (not shown here) would be required to convert this to a deterministic finite automaton. The full algorithm from Theorem 9.2 would be needed.

The formal proof follows the pattern of the previous two examples.
Proof of Theorem 9.4: Suppose we are given the grammar, G = {Σ, Δ, S, Π}. We need to construct a deterministic finite automaton, A = {𝒮, Σ_A, t, s0, ℱ}, which recognizes L(G). The process begins by constructing a nondeterministic finite automaton, N, which recognizes L(G). The nondeterministic finite automaton N will have input values Σ_N = Σ ∪ {λ}. Note that Σ = Σ_N if λ ∈ Σ. The set, 𝒮, of states will be built incrementally. The start symbol, S, for G will correspond to the start state, s0, of N. Every nonterminal symbol in Δ will correspond to a lowercase version of itself as a state in 𝒮. If there is a production whose right-hand side contains neither λ nor a nonterminal, we will add a new state f to 𝒮.
At this point it is possible to identify the final states. Any nonterminal symbol that has a production of the form Y → λ will cause state y to be a final state. If the state f has been added, it will also be a final state. It remains to specify the transition function, t. In the process of specifying t it may be necessary to add additional states. The transition function will be specified by converting productions to transitions. Any production in the form Y → λ has already been accounted for by making y a final state. Any production Y → i, where i ≠ λ is a single terminal symbol, will become the transition t(y, i) = f. Productions of the form X → iY will become the transition t(x, i) = y (this includes the cases where i = λ). The remaining productions are all in the form X → i₁i₂...iₖY, with k ≥ 2. For each pair of adjacent terminal symbols, iⱼiⱼ₊₁, in the production, introduce a new state, sⱼ (thus adding states s₁, s₂, ..., sₖ₋₁). We now specify the transitions t(x, i₁) = s₁, t(sₖ₋₁, iₖ) = y, and t(sⱼ₋₁, iⱼ) = sⱼ for j = 2, ..., k − 1. At this point, N has been fully specified. The fairly routine verification that the set of strings recognized by N is exactly L(G) will be left to the reader. Using the constructive algorithm from the proof of Theorem 9.2, the nondeterministic finite automaton, N, can be converted to a deterministic finite automaton, A, that recognizes the language generated by G, completing the proof. □
The finite automaton specified by the previous constructive proof is usually nondeterministic. If there are any productions of the form X → λY, then the finite automaton will contain λ-transitions. There will usually be many transitions that the productions do not specify. (The formal proof has omitted the creation of the "black hole" state presented in Example 9.31.) Finally, if there are productions of the form X → aY and X → aZ, then there will be two a transitions leaving state x.
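The construction in the proof can be sketched in a few lines. This is an illustrative sketch, not the text's own algorithm statement: productions are assumed to be given as triples (left-hand nonterminal, string of terminals, right-hand nonterminal or None), intermediate states are named q1, q2, ... (a hypothetical naming that assumes no nonterminal is called F or Q1, Q2, ...), and a production with several terminals but no nonterminal is handled by ending its chain at f. The empty string "" stands for λ.

```python
from collections import defaultdict
from itertools import count


def grammar_to_nfa(start, productions):
    """Build (start state, transitions, final states) from a regular grammar."""
    trans = defaultdict(set)      # (state, symbol) -> set of next states
    finals = set()
    fresh = count(1)              # numbers for the intermediate states
    for lhs, terminals, nonterminal in productions:
        src = lhs.lower()
        if terminals == "" and nonterminal is None:
            finals.add(src)       # Y -> lambda makes y a final state
            continue
        if nonterminal is None:
            dst = "f"             # no nonterminal on the right: end at f
            finals.add("f")
        else:
            dst = nonterminal.lower()
        if terminals == "":
            trans[(src, "")].add(dst)   # X -> lambda Y is a lambda-transition
            continue
        cur = src
        for symbol in terminals[:-1]:   # one new state per adjacent pair
            nxt = f"q{next(fresh)}"
            trans[(cur, symbol)].add(nxt)
            cur = nxt
        trans[(cur, terminals[-1])].add(dst)
    return start.lower(), trans, finals


def closure(states, trans):
    """All states reachable from `states` using only lambda-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in trans.get((s, ""), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(nfa, string):
    """Simulate the nondeterministic automaton on `string`."""
    start, trans, finals = nfa
    current = closure({start}, trans)
    for ch in string:
        moved = set()
        for s in current:
            moved |= trans.get((s, ch), set())
        current = closure(moved, trans)
    return bool(current & finals)


# The grammar S -> a | aX, X -> bY, Y -> a | bY from the example above.
nfa = grammar_to_nfa("S", [("S", "a", None), ("S", "a", "X"),
                           ("X", "b", "Y"), ("Y", "a", None), ("Y", "b", "Y")])
print(accepts(nfa, "a"), accepts(nfa, "abba"), accepts(nfa, "aa"))
```

The simulation tracks a set of current states, so it never needs the black hole state: an input with no available transition simply empties the set.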
Quick Check 9.9
1. (a) Convert the finite automaton defined by the following state diagram into an equivalent regular grammar. (b) Determine L(G).
2. Create a finite automaton that recognizes L(G) [and only L(G)], where the regular grammar G is defined by Σ = {a, b}, the start symbol is S, Δ = {S}, and Π = {S → aS | λ | baa}.
Recall that "regular set" is the name for the collection of strings generated by a regular expression. The following theorem shows that regular expressions and finite automata are equally expressive; any set of strings generated by a regular expression can be recognized by a finite automaton (without recognizing any additional strings), and vice versa.
Kleene's Theorem
A set is regular if and only if it is recognized by a finite automaton.
Proof: The theorem will be proved by appealing to a pair of constructive proofs, each presented as a lemma. The first lemma will show that any regular set can be recognized by a finite automaton. This will be accomplished by starting with the regular expression that generates the regular set and then using the recursive definition (page 552) of regular expression to construct recursively a nondeterministic finite automaton that recognizes exactly those strings in the regular set. The nondeterministic finite automaton could then be converted to a deterministic finite automaton by using Theorem 9.2. The second lemma will show that it is possible to construct a regular expression that matches exactly those strings recognized by a finite automaton. □
Before presenting the lemmas that prove Kleene's theorem, there is an easy corollary to Kleene's theorem that fills in the final equivalence on the diagram from page 556.
COROLLARY 9.1 Regular Set If and Only If Regular Language
A subset R ⊆ Σ* is a regular set if and only if it is a regular language.
Proof: Suppose R is a regular set. Then Kleene's theorem implies that there is a finite automaton, A, that recognizes R. Theorem 9.3 asserts that there is a regular grammar that generates R. Since R can be generated by a regular grammar, R is a regular language (Definition 9.9). Conversely, if R = L(G) for some regular grammar, G, then Theorem 9.4 implies that there is a finite automaton, A, that recognizes R. Kleene's theorem asserts that R is a regular set. □
9.5.1 Optional: Completing the Proof of Kleene's Theorem
LEMMA 9.1 Converting Regular Expressions to Nondeterministic Finite Automata
Given any regular expression, R, it is possible to find a nondeterministic finite automaton that recognizes all strings, and only those strings, in the regular set generated by R.
Proof: The main idea is to go back to the recursive definition of regular expressions, found on page 552, and build a finite automaton for each piece of the definition. If the finite automaton and the regular expression generate/recognize the same set of strings, and if the construction shows how to hook the pieces together, then the lemma will be proved. To that end, each part of Definition 9.11 will be discussed in order.
∅ The regular expression ∅ does not match any strings. Its regular set is the empty set. A nondeterministic finite automaton with a single, nonfinal state recognizes the empty set (Figure 9.17).
λ The regular expression λ matches the null string. A nondeterministic finite automaton with a single, final state also recognizes only the null string (Figure 9.18).
a Every symbol, a ∈ Σ, is a regular expression that matches only itself. A nondeterministic finite automaton with a nonfinal start state and one final state will match that one string (Figure 9.19).
Figure 9.17. A finite automaton for ∅.
Figure 9.18. A finite automaton for λ.
Figure 9.19. A finite automaton for symbol a.
The remaining three parts of the definition require recursion. To that end, it is necessary to introduce some diagrammatic notation. Suppose that X is a regular expression for which a corresponding nondeterministic finite automaton, A_X, exists. Figure 9.20 will represent A_X. There may be no final states, exactly one final state, or several final states in A_X. The generic diagram shows two. The diagram does not show the intermediate transitions except to indicate that they exist. The dotted inner circle on the start state is to indicate that the start state may or may not be a final state. The subscripts indicate that this nondeterministic finite automaton corresponds to the regular expression X.
Figure 9.20. A_X.
XY Suppose X and Y are regular expressions having corresponding nondeterministic finite automata A_X and A_Y. The concatenation, XY, is also a regular expression. A nondeterministic finite automaton that recognizes exactly those strings in the regular set generated by XY can be built from A_X and A_Y. The following steps will create the new nondeterministic finite automaton. First, create a λ-transition from each final state in A_X to the start state of A_Y. Second, transform all final states in A_X into nonfinal states. The λ-transitions are needed so that once the string X is recognized, the new machine continues on to look for the string Y. The removal of final states from the A_X machine is needed so that the new machine does not recognize strings that match X but do not have a Y appended (Figure 9.21).
Figure 9.21 A nondeterministic finite automaton for XY.
X | Y Suppose X and Y are regular expressions having corresponding nondeterministic finite automata A_X and A_Y. The alternative, X | Y, is also a regular expression. A nondeterministic finite automaton that recognizes exactly those strings in the regular set generated by X | Y can be built from A_X and A_Y. First, a new start state, s_X|Y, will be created. Then a λ-transition will be created from s_X|Y to the start states in A_X and A_Y. The λ-transitions allow the machine to either move to a submachine that recognizes X, or to a submachine that recognizes Y (Figure 9.22).
Figure 9.22. A nondeterministic finite automaton for X | Y.
X* Suppose X is a regular expression having corresponding nondeterministic finite automaton A_X. The Kleene closure, X*, is also a regular expression. A nondeterministic finite automaton that recognizes exactly those strings in the regular set generated by X* can be built from A_X. First, make the start state of A_X a final state (so that λ can be recognized). Then make a λ-transition from every final state in A_X to the start state. The λ-transitions enable the machine to look for an additional copy of X, once a valid copy is recognized in the input string (Figure 9.23).
Figure 9.23 A nondeterministic finite automaton for X*.
Because it is possible to provide a recursive construction that corresponds exactly to the recursive definition of regular expression, the construction completes the proof of the lemma. □
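The recursive construction lends itself to a small set of combinator functions, one per part of the definition. The sketch below is illustrative only: the dictionary representation, the function names, and the use of "" for λ are assumptions of this sketch. One deliberate difference from the proof is labeled in the code: star adds a fresh final start state rather than marking the existing start state final, a swapped-in precaution for sub-automata whose start state can be re-entered (as happens for an expression such as (a*b)*).

```python
from itertools import count

fresh = count()   # supplies unused state names (plain integers)


def empty_set():
    s = next(fresh)
    return {"start": s, "finals": set(), "edges": []}


def null_string():
    s = next(fresh)
    return {"start": s, "finals": {s}, "edges": []}


def symbol(a):
    s, f = next(fresh), next(fresh)
    return {"start": s, "finals": {f}, "edges": [(s, a, f)]}


def concat(x, y):
    # Lambda-transitions from each final state of x to the start of y;
    # the final states of x stop being final.
    edges = x["edges"] + y["edges"] + [(f, "", y["start"]) for f in x["finals"]]
    return {"start": x["start"], "finals": set(y["finals"]), "edges": edges}


def alt(x, y):
    # New start state with lambda-transitions to both old start states.
    s = next(fresh)
    edges = x["edges"] + y["edges"] + [(s, "", x["start"]), (s, "", y["start"])]
    return {"start": s, "finals": x["finals"] | y["finals"], "edges": edges}


def star(x):
    # Variant (not the text's step): a fresh final start state is added.
    s = next(fresh)
    edges = x["edges"] + [(s, "", x["start"])]
    edges += [(f, "", x["start"]) for f in x["finals"]]
    return {"start": s, "finals": x["finals"] | {s}, "edges": edges}


def accepts(nfa, string):
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, label, dst in nfa["edges"]:
                if src == q and label == "" and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({nfa["start"]})
    for ch in string:
        step = {dst for src, label, dst in nfa["edges"]
                if src in current and label == ch}
        current = closure(step)
    return bool(current & nfa["finals"])


# ab*a | c*  (each call to symbol() creates its own states)
machine = alt(concat(concat(symbol("a"), star(symbol("b"))), symbol("a")),
              star(symbol("c")))
print([w for w in ["", "aa", "abba", "ccc", "ab", "ac"] if accepts(machine, w)])
```

Each sub-automaton should be built by its own call, since reusing one dictionary in two places would make the two copies share states.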
Before proceeding to the second lemma, it will be helpful to look at an example that illustrates the algorithm inherent to the constructive proof of Lemma 9.1.
Transforming a Regular Expression into a Nondeterministic Finite Automaton
The construction used to prove Lemma 9.1 can be used to convert the regular expression ab*a | c* into a nondeterministic finite automaton that recognizes the regular set, {abⁿa | n ≥ 0} ∪ {cᵐ | m ≥ 0}, generated by the regular expression. The process begins by creating finite automata for the three regular expressions a, b, and c (Figure 9.24).
Figure 9.24 Finite automata for a, b, and c.
Next, the nondeterministic finite automata corresponding to b* and c* can be formed from the finite automata for b and c using the rule for Kleene closure (Figure 9.25).
Figure 9.25 Nondeterministic finite automata for b* and c*.
The concatenation rule can be used to construct a nondeterministic finite automaton for the regular expression ab* and then for ab*a (Figure 9.26).
Figure 9.26 Nondeterministic finite automata for ab* and ab*a.
Finally, the rule for alternative can be used to complete the construction (Figure 9.27).
Figure 9.27 A nondeterministic finite automaton that recognizes {abⁿa | n ≥ 0} ∪ {cᵐ | m ≥ 0}.
LEMMA 9.2 Converting Nondeterministic Finite Automata to Regular Expressions
Given any nondeterministic finite automaton, A, it is possible to find a regular expression that generates all strings, and only those strings, recognized by A.
Proof Highlights: The translation from nondeterministic finite automaton to regular expression is fairly straightforward. The key observations are captured in Figures 9.28, 9.29, and 9.30. If the nondeterministic finite automaton contains subsets A_X, A_Y, and so on, that can be translated into regular expressions X, Y, and so on, then a configuration like Figure 9.28 can be converted to the equivalent regular expression aX | bY | cZ.
Figure 9.28 Parallel subdiagrams.
If the nondeterministic finite automaton contains subsets, A_X and A_Y, that can be translated into regular expressions, X and Y, then a configuration like Figure 9.29 can be converted to the equivalent regular expression XaY.
Figure 9.29. Sequential subdiagrams.
Finally, if the nondeterministic finite automaton contains a subset, A_X, that corresponds to the regular expression, X, and if a well-defined loop transition is added to A_X (as in Figure 9.30), then the new configuration corresponds to the regular expression (Xa)*. Note that a = λ is permitted.
Figure 9.30. Loops.
Rather than formalizing these insights into a proof, an algorithm that uses them will be presented. The algorithm follows that found in [41, pp. 592-593]. The main task is to reduce the finite automaton to a finite-state machine that has two states and one transition. The transition will be labeled with the regular expression that corresponds to the original finite automaton. There are four preliminary steps to the algorithm:
• Create a new start state, s0, and connect it to the old start state via a λ-transition.
• Create a new final state, f0, and connect all the old final states to it via λ-transitions.
• Eliminate any nonfinal state that has no transitions that move from the state to some different state. (That is, remove nonfinal black hole states. A string that moves to such a state will never be recognized.)
• Eliminate multiple transitions. For each pair of states, x and y, that have more than one transition from x to y, replace those transitions by a single transition whose label is the regular expression formed by using the alternative operator, |, with the labels on the former transitions.
If the finite automaton looks like the state diagram in Figure 9.31, then the preliminary steps would produce the finite-state machine in Figure 9.32. The rest of the algorithm is an iterative process that reduces the number of states (and ultimately, the number of transitions). More precisely, while there are still more than two states, eliminate some state, y ∉ {s0, f0}.
Figure 9.31 The initial finite automaton.
Figure 9.32 Finite-state machine after preliminary steps.
The algorithm will be complete once the state-elimination procedure is described. Figure 9.33 shows a state, y, and some of the possible transitions connected to it.
Figure 9.33 State y is ready to eliminate.
Note that it is possible to move from state x to state z by either a string that is matched by the regular expression D, or else by a string that is matched by the regular expression A, followed by zero or more strings that are matched by B, followed by a string that is matched by C. Thus, if y is eliminated, a new transition from x to z with label D | AB*C will need to replace the transitions from x to z, and from x to y to z. Similarly, the state u will need a loop with label FB*E. More formally, let <x, z> represent the transition from state x to state z. Let L_o(x, z) denote the current (old) label on that transition and L_n(x, z) denote the new label, after eliminating some state which is different from x and z. Define L_o(x, z) = ∅ if there is no transition <x, z>. To eliminate a state, y, consider all pairs of transitions <x, y> and <y, z> such that x ≠ y and z ≠ y. (Note that x = z is permissible.) For each such pair, create a new label for the transition <x, z>. The new label is the regular expression, L_n(x, z), where L_n(x, z) = L_o(x, z) | L_o(x, y) L_o(y, y)* L_o(y, z). After applying this process, Figure 9.33 would become the finite-state machine shown in Figure 9.34.
Figure 9.34 State y has been eliminated (transition labels: D | AB*C, FB*C, AB*E, FB*E).
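The preliminary steps and the relabeling rule L_n(x, z) = L_o(x, z) | L_o(x, y) L_o(y, y)* L_o(y, z) can be sketched as follows. This is a minimal sketch under stated assumptions: labels are stored as strings in Python regular-expression syntax, a missing dictionary entry plays the role of ∅, the empty string "" plays the role of λ, input symbols are single characters with no special meaning in a regular expression, and the names START, FINAL, group, starred, eliminate, and to_regex are hypothetical. Black hole states are not removed in advance; eliminating them in the main loop has the same effect.

```python
import re

START, FINAL = "start'", "final'"   # hypothetical names for the two new states


def group(r):
    # Parenthesize an alternative before concatenating it.
    return f"({r})" if "|" in r else r


def starred(r):
    if not r:                 # None is the empty set, "" is the null string;
        return ""             # the star of either generates only the null string
    return f"{r}*" if len(r) == 1 else f"({r})*"


def eliminate(labels, y):
    """Remove state y, relabeling every pair <x, y>, <y, z> as <x, z>."""
    loop = starred(labels.get((y, y)))
    incoming = [(x, r) for (x, d), r in labels.items() if d == y and x != y]
    outgoing = [(z, r) for (s, z), r in labels.items() if s == y and z != y]
    kept = {pair: r for pair, r in labels.items() if y not in pair}
    for x, a in incoming:
        for z, c in outgoing:
            path = group(a) + loop + group(c)
            kept[(x, z)] = path if (x, z) not in kept else kept[(x, z)] + "|" + path
    return kept


def to_regex(states, start, finals, transitions):
    labels = {}
    for src, sym, dst in transitions:           # merge multiple transitions
        labels[(src, dst)] = sym if (src, dst) not in labels else labels[(src, dst)] + "|" + sym
    labels[(START, start)] = ""                 # new start state
    for f in finals:
        labels[(f, FINAL)] = ""                 # new final state
    for y in states:                            # eliminate the original states
        labels = eliminate(labels, y)
    return labels.get((START, FINAL))           # None means the empty set


# A four-state automaton for ab*c with an explicit black hole state "no".
edges = [("q0", "a", "p"), ("p", "b", "p"), ("p", "c", "yes"),
         ("q0", "b", "no"), ("q0", "c", "no"), ("p", "a", "no")]
edges += [(s, ch, "no") for s in ("yes", "no") for ch in "abc"]
regex = to_regex(["q0", "p", "yes", "no"], "q0", {"yes"}, edges)
print(regex, bool(re.fullmatch(regex, "abbc")))
```

The order in which states are eliminated affects how the resulting expression looks, but not the set of strings it generates.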
Transforming a Finite Automaton into a Regular Expression
Suppose the finite automaton from Example 9.13 on page 542 needs to be converted to a regular expression. The state diagram is repeated in Figure 9.35.
Figure 9.35 A finite automaton to recognize strings in L = {ab*c}.
and <No, z> with x and z not the same as No. The relabeling rule for any existing transition becomes L_n(x, z) = L_o(x, z) | ∅ = L_o(x, z). This effectively eliminates the nonfinal black hole state, which could have been done during the preliminary phase (Figure 9.37).
Figure 9.37 Eliminate state No.
The states λ and Yes can be eliminated next, producing the finite-state machine in Figure 9.38. Note that the regular expressions ∅ | ∅*a and a generate the same regular set.²³ The simpler expression has been used as a label. The final iteration removes the state p (Figure 9.39). A regular expression that generates the set of strings recognized by the original finite automaton is ab*c, which is consistent with the original example on page 542.
²³ The regular expression ∅* generates the regular set {λ}.
Figure 9.38 Eliminate states λ and Yes.
Figure 9.39 Eliminate state p.
Quick Check 9.10
1. Use the pseudocode notation from Chapter 4 to express the algorithm for converting from a finite automaton to a regular expression. You may use the notation already introduced (<x, y>, L_o(x, y), and L_n(x, y)).
9.5.2 Exercises
The exercises marked with [D] have detailed solutions in Appendix G.
1. Use the construction in Theorem 9.3 to convert the following finite automata to regular grammars.
(a) [D] The finite automaton from Problem 1 of Quick Check 9.3 on page 533.
(b) Let A be defined by 𝒮 = {s0, s1, s2}, Σ = {0, 1}, the start state is s0, ℱ = {s0, s2}, and t is defined by the following state table.
         Input
State    0     1
s0       s1    s2
s1       s1    s2
s2       s1    s2
(c) The following finite automaton has input symbols Σ = {a, b} and recognizes any string with two adjacent a's or two adjacent b's. The start state is s0. 𝒮 = {s0, A, B, DBL} and ℱ = {DBL}. The transition function is implicitly defined by the state diagram.
[State diagram]
(d) The following automaton has input symbols Σ = {a, b, c} and recognizes any nonempty string with no c's. The start state is s0. 𝒮 = {s0, A, B, C} and ℱ = {A, B}. The transition function is implicitly defined by the state diagram.
[State diagram]
2. The following finite automaton helps you pick petals off of daisies. It has input symbols Σ = {l, n}, where l = "loves me" and n = "loves me not." The start state is P ("pick" the poor helpless daisy from the ground). The states are 𝒮 = {P, L, N}. The final state is L. The transition function is just what you expect. It is represented by the following table.
         Input
State    l     n
P        L     N
L        L     N
N        L     N
(a) The finite automaton just described creates a social problem. It enables (in the psychological sense) someone to cheat because it accepts strings of the form lllll. In proper "loves me, loves me not" form, the l's and n's should alternate and the string should start with an l. Design a finite automaton that accepts any nonempty string of l's and n's in proper "loves me, loves me not" form and rejects all other strings. Clearly define Σ, 𝒮, ℱ, and t. Clearly label the start state.
(b) Modify the finite automaton in part (a) so that it only accepts properly formed strings that end with l.
(c) Create a grammar that recognizes all strings recognized by the finite automaton in part (b).
3. Use the construction from Theorem 9.4 to convert the following regular grammars to finite automata.
(a) [D] The grammar in Example 9.14 on page 543. Use uppercase letters for states in this problem.
(b) The grammar in Example 9.19 on page 546. Use uppercase letters for states in this problem.
(c) The grammar in Example 9.18 (summarized on page 546). Use uppercase letters for states in this problem. Also, explain why the original, nonregular grammar would be harder to convert to a finite automaton.
(d) The grammar in Problem 1 of Quick Check 9.6 on page 546. Use uppercase letters for states in this problem.
(e) The grammar in Example 9.16 on page 543. Use lowercase letters for states in this problem.
4. The recursive definition of regular expression does not discuss the "+" operator. Suppose X is a regular expression having corresponding nondeterministic finite automaton, A_X. Design a nondeterministic finite automaton that recognizes X+.
5. Use the construction in Lemma 9.1 to convert the following regular expressions to nondeterministic finite automata.
(a) ab*c
(b) [D] (ab | cd)e
(c) ((a | b)(c | d))*
(d) (a+ c) | b (This part requires the result from Exercise 4.)
6. Use the construction in Lemma 9.2 to convert each of the following finite automata to an equivalent regular expression.
(a) The finite automaton from Example 9.8 on page 533
(b) [D] The finite automaton from part (b) of Exercise 1 of this set of exercises
(c) The finite automaton from part (c) of Exercise 1 of this set of exercises
(d) The finite automaton from part (d) of Exercise 1 of this set of exercises
7. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) Suppose, in the nondeterministic finite automaton in Example 9.29, a loop is added at state x having transition label, a. Then the new state diagram still represents a nondeterministic finite automaton.
(b) Any finite automaton will recognize the language generated by a regular grammar, G.
(c) Suppose that a set of strings is recognized by a finite automaton. Then the set must be regular.
(d) [D] A nondeterministic finite automaton differs from a finite automaton in that there will always be at least one state that does not have transitions associated with every input symbol.
8. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) In the conversion process from a finite automaton to a regular grammar, it is always valid to let the set of terminal symbols for the grammar be the set of input values for the finite automaton.
(b) An important idea in this section is that the input strings recognized by finite automata, the languages generated by regular grammars, and the sets generated by regular expressions exhibit strong connections.
(c) A finite automaton is not nondeterministic, and thus it cannot recognize the same set of strings as a nondeterministic finite automaton.
(d) In the conversion process from a regular grammar to a finite automaton, it is always valid to let the set of input values for the finite automaton be the alphabet of the regular grammar.
DEFINITION 9.13 The Complement of a Language
Let L be a language over Σ. The complement of L is denoted by L̄ and consists of all strings in Σ* that are not in L.
9. Prove Theorem 9.6.
THEOREM 9.6 The Complement of a Regular Language
If L is a regular language, then L̄ is also a regular language.
(Hint: Use Definition 9.13 and Theorems 9.3 and 9.4. Think about final and nonfinal states.)
10. Prove Theorem 9.7.
THEOREM 9.7 The Intersection of Regular Languages
If L₁ and L₂ are both regular languages over the same alphabet, Σ, then L₁ ∩ L₂ is also a regular language.
(Hint: Use De Morgan's laws, Theorem 9.6, and most of the other theorems in this section.)
9.6 A Glimpse at More Advanced Topics
As is true with many chapters in this textbook, this chapter could easily be extended into a full-semester course. In particular, regular grammars and finite automata are not the only useful grammars or models of computation. This section gives a brief glimpse at some other important ideas that extend what has already been presented.
9.6.1 Context-Free Languages and Grammars
Recall Example 9.20, where it was mentioned that the set {aⁿbⁿ | n ≥ 0} cannot be specified by a regular grammar. That does not mean we are unable to find a grammar that can specify this language. This section specifies a hierarchy that provides a collection of increasingly more expressive grammars. The price for greater expressive power is a grammar that is harder to use. For example, a regular grammar can be modeled by a finite automaton. The more expressive grammars require more complex computational models. The hierarchy of grammars is called the Chomsky hierarchy. It was first introduced in 1959 by the linguist Noam Chomsky. The grammars are all in the form G = {Σ, Δ, S, Π}. The only point of difference is the nature of productions in Π. In all cases, productions look like α → β, where α and β are strings in (Σ ∪ Δ)*. Table 9.9²⁴ shows how the productions are defined for each grammar in the Chomsky hierarchy.
TABLE 9.9 The Chomsky Hierarchy of Grammars
Type | Grammar | Productions | Recognized by
0 | Phrase structured (unrestricted) | α → β, where α ∈ (Σ ∪ Δ)⁺, β ∈ (Σ ∪ Δ)* (Note: α ≠ λ, but β = λ is ok) | Turing machine
1 | Context sensitive | α₁Nα₂ → α₁γα₂ for α₁, α₂, γ ∈ (Σ ∪ Δ)*, γ ≠ λ and N ∈ Δ | Turing machine
2 | Context free | N → β where N ∈ Δ, β ∈ (Σ ∪ Δ)* | Pushdown automaton
3 | Regular | N₁ → σN₂ or N → σ where N₁, N₂, N ∈ Δ and σ ∈ Σ* | Finite automaton
A context-sensitive grammar cannot generate the null string. With a suitable adjustment, the grammars form a hierarchy: {regular grammars} − λ ⊆ {context-free grammars} − λ ⊆ {context-sensitive grammars} ⊆ {phrase-structured grammars}. Pushdown automata and Turing machines will be discussed in the next section. The productions α₁Nα₂ → α₁γα₂ in a context-sensitive grammar allow the nonterminal N on the left to be replaced by the non-null string γ whenever N appears in the context of α₁, α₂. Notice that the context, α₁ ... α₂, is preserved.
²⁴ Computer scientists are also interested in other grammars. I have expressed some of the grammars described here by using normal forms. That is, the original definitions are not exactly what is shown in the table, but any grammar following the more general rules can be transformed into one that conforms to the normal form.
A Context-Sensitive Language
Define the grammar G as Σ = {a, b, c}, Δ = {S, A, B}, with S as the start symbol. The productions in Π are listed next.
S → abc | aAbc
Ab → bA
Ac → Bbcc
bB → Bb
aB → aa | aaA
This grammar generates the language L = {aⁿbⁿcⁿ | n ≥ 1}. Notice that the nonterminal A can only be replaced if it is in the context of having either a b or a c as the symbol on its right.
Context-free languages have some of the nice properties of regular grammars (a single nonterminal on the left side of every production) but are more expressive. The next example shows a grammar for the nonregular language {aⁿbⁿ | n ≥ 0}.
A Context-Free Language
Let G be defined by having Σ = {a, b, λ}, Δ = {S}, with S as the start symbol, and Π = {S → λ | aSb}. With two simple productions, this grammar produces a language that is not possible using a regular grammar. The cost is the need for something more complex than a finite automaton if we want a machine to recognize strings in this language. The more complex machine is called a pushdown automaton and combines a finite automaton with a stack. More on this in the next section.
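Both example grammars can be explored mechanically by applying productions as string rewrites. The following is a rough sketch, assuming productions are given as (left side, right side) pairs of strings with "" standing for λ; the function name generate and the slack parameter (how much longer than max_len an intermediate form may grow, needed because S → λ shortens a string) are inventions of this sketch.

```python
from collections import deque


def generate(productions, start, terminals, max_len, slack=0):
    """Breadth-first search for all terminal strings of length <= max_len."""
    seen, queue, words = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if all(ch in terminals for ch in form):
            if len(form) <= max_len:
                words.add(form)          # no nonterminals left
            continue
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:               # rewrite every occurrence of lhs
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= max_len + slack and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return words


context_sensitive = [("S", "abc"), ("S", "aAbc"), ("Ab", "bA"), ("Ac", "Bbcc"),
                     ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA")]
print(sorted(generate(context_sensitive, "S", "abc", 9), key=len))

context_free = [("S", ""), ("S", "aSb")]
print(sorted(generate(context_free, "S", "ab", 4, slack=1), key=len))
```

The length bound keeps the search finite. No production of the context-sensitive grammar shortens a string, so slack can stay at 0 there.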
Quick Check 9.11
1. Define a context-free grammar that specifies the language L₁ = {aⁿb²ⁿ | n ≥ 0}.
2. Define a context-free grammar that specifies the language L₂ = {aⁿbcⁿ | n ≥ 1}. Use Σ = {a, b, c, λ}.
9.6.2 Turing Machines
The goal of this section is to provide informal descriptions of pushdown automata and Turing machines and then to comment briefly on their significance.
Pushdown Automata
The previous section alluded to pushdown automata as computational models that can recognize context-free languages. A pushdown automaton is a finite automaton with some limited-access memory. That is, in addition to the states and input symbols found in a finite automaton, there is also a place to store symbols for later use. This limited-access memory is called a stack. A stack consists of a (potentially very large) collection of storage bins and two operations that move items in and out of bins. The operations are called push and pop. Push and pop enforce a "last-in, first-out" discipline on accessing the memory bins. Push places a new item (input symbol, string, etc.) on the top of the stack. Pop removes the item at the top of the stack. No other access to the stack's bins is permitted. A stack can be informally visualized as a pile of clean dinner plates in a spring-loaded serving cylinder. Only the top plate is available for the next customer to grab. As the pile of plates becomes almost empty, the serving staff can add new plates to the top (one at a time in this visualization).
Transitions in a pushdown automaton are determined by looking at the current state, the current input symbol, and the item on the top of the stack. As a transition is performed, it is permissible to pop the top item off the stack, push a new item onto the stack, or leave the stack alone.
Recognizing {aⁿbⁿ | n ≥ 0}
Here is an outline of how a pushdown automaton can be used to recognize the nonregular language {aⁿbⁿ | n ≥ 0}. The start state, s0, is a final state and the stack is initially empty. If the first input symbol is an a, push some special marker symbol, ε, onto an empty stack and move to nonfinal state s1. Then, as long as input symbol a is encountered, remain in state s1 and push a second marker, δ, onto the stack. As soon as the first b is encountered in the input, and as long as the top of the stack is a δ, move to nonfinal state s2 and pop the top item off the stack. If a b is encountered and the top of the stack is an ε, move to final state s3. Any other combination of state, input, and top of stack will lead to a black hole nonfinal state s4. This pushdown automaton will terminate in a final state for exactly those strings in the form, aⁿbⁿ, for some n ≥ 0.
The major result about pushdown automata is summarized in the next theorem. Pushdown automata are also important tools when thinking about the design of compilers²⁵ for computer programs. In particular, pushdown automata are appropriate models for the process of parsing the source code of the program that needs compiling.
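The outline of the pushdown automaton for {aⁿbⁿ | n ≥ 0} translates almost line for line into code. This is a small sketch of that outline: a Python list serves as the stack, the marker symbols are stored as the strings "ε" and "δ", and the function name accepts_anbn is hypothetical. One detail the outline leaves open is assumed here: the ε marker is popped when the machine moves to s3.

```python
def accepts_anbn(string):
    """Simulate the pushdown automaton outlined above."""
    state, stack = "s0", []
    for ch in string:
        top = stack[-1] if stack else None
        if state == "s0" and ch == "a" and top is None:
            stack.append("ε")            # first a: push the bottom marker
            state = "s1"
        elif state == "s1" and ch == "a":
            stack.append("δ")            # each later a: push a counting marker
        elif state in ("s1", "s2") and ch == "b" and top == "δ":
            stack.pop()                  # match a b against an earlier a
            state = "s2"
        elif state in ("s1", "s2") and ch == "b" and top == "ε":
            stack.pop()                  # assumption: the bottom marker is popped
            state = "s3"                 # the last b has been matched
        else:
            state = "s4"                 # black hole: no branch above applies again
    return state in ("s0", "s3")


for w in ["", "ab", "aabb", "aab", "abb", "ba"]:
    print(repr(w), accepts_anbn(w))
```

The stack is what lets the machine compare the number of b's with an unbounded number of a's, which no fixed set of states can do.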
Context-Free Grammars and Pushdown Automata
Any language generated by a context-free grammar can be recognized by a pushdown automaton. Any language recognized by a pushdown automaton can be generated by a context-free grammar. Context-free grammars and pushdown automata are therefore equally expressive mechanisms.
Turing Machines
Pushdown automata are more expressive than finite automata. However, there are easily defined languages that cannot be recognized by them. A simple example is the language {aⁿbⁿcⁿ | n ≥ 0}. A more powerful class of finite-state machines, called Turing machines, can recognize this language, and in fact, any language generated by a phrase-structured grammar. Turing machines were first formally described in a paper by Alan Turing that was published in 1936, a decade before the first electronic computer. There are many computationally equivalent descriptions of Turing machines available today. The one presented here is similar to the original presentation. A Turing machine consists of a control unit and an infinite data tape. The data tape is divided into cells that are arranged in a line that extends forever in both directions. Each cell may either be blank (contain the symbol λ) or contain one symbol from some finite alphabet.²⁶ The control unit has the ability to read the contents of a cell, write a new symbol in a cell, and move the Turing machine one cell to the left or right.
²⁵ A compiler is a program that translates human-readable computer source code into a form that the computer can run. Parsing is the process of breaking the source code into strings that are meaningful components of the grammar for the programming language.
²⁶ The data tape is similar in spirit to the stack used by a pushdown automaton, but it is a more powerful mechanism for external data storage.
The control unit also contains a finite set of states together with a set of instructions. The instructions are 5-tuples that specify the current state, the symbol in the current cell, and the response the control unit should exhibit. The response consists of three parts: which symbol to write in the current cell; whether to move left, right, or remain over the same cell; and, finally, which state should become the current state. The 5-tuples can be written in the form
< current state, symbol in current cell, symbol to write in current cell, direction of motion on data tape, next state >
or, more succinctly,
< current state, read, write, move, next state >.
Adopting the shorthand notation L = move left, R = move right, S = stay at the same cell, a typical instruction might be < x, a, b, L, y >, where x and y are states, and a and b are symbols in the input/output alphabet.

A few other conventions are needed before a simple example is possible.
• The tape will always start with only a finite number of nonblank cells (and the nonblank cells will be contiguous).
• The control unit always starts at the leftmost nonblank cell.
• There is a single start state, s0, and a single final state, f.
• If the control unit is in a state, x, and the current cell contains a symbol, a, such that there is no instruction that begins < x, a, ... >, the machine will halt in state x.
• If the machine halts in state f, the Turing machine is said to recognize the string that was initially written on the data tape.
• If the Turing machine halts in any state other than f, or if the machine runs forever without halting, the initial string on the data tape is not recognized.
Notice that Turing machines don't use black hole states to reject strings; rejection is accomplished by failing to specify transitions and actions for dead-end state/input pairs.

A Turing Machine That Recognizes $\{a^n b^n c^n \mid n \ge 0\}$
A Turing machine that recognizes exactly the strings $W = \{a^n b^n c^n \mid n \ge 0\}$ will need to have an input/output alphabet that contains the symbols {a, b, c, λ}. We need to use the data tape to "count" a's and then b's and finally c's. The tape must initially contain a string for which the Turing machine must determine membership in W. Label the start state s0 and the final state f.

One easy-to-determine instruction is needed to recognize the empty string (where n = 0). That instruction is < s0, λ, λ, S, f >. The write and move components do not really matter here; I have chosen to write another blank and to stay at the same cell so as to minimize activity.

How can the data tape and states be used to ensure that the symbols come in runs having the same number of elements? That isn't apparent yet. However, it seems reasonable to start consuming a's, moving to the right after each new a. If a c is encountered (where another a or else the first b is expected), there should be no instruction; the machine should halt in some state other than f. Similarly, during a run of b's, encountering a blank cell or an a when more b's or the first c is expected should also cause a halt in a nonfinal state. It is clear that we can't use the states to count the number of a's because there could potentially be a billion a's. Perhaps the control unit should move to state s1 as soon as an a is read and stay in that state until a b is read.
The key insight is that the machine needs some additional output symbols. Perhaps it can use A, B, and C to indicate that the cell previously contained the lowercase version of the symbol, and also that the cell has already been processed. With these symbols, the machine can use a series of left-to-right scans that replace a single (not necessarily adjacent) a-b-c sequence by A-B-C, effectively counting one of each input symbol. Then the machine can return to the left edge of the input string and look for another matched set a-b-c. Keep this up until a return sweep recognizes that all cells have been processed or until an unexpected input symbol is found. With a bit of thought, the following instructions can be developed.

< s0, λ, λ, S, f >      Recognize the empty string
< s0, a, A, R, s1 >     The first a on this sweep
< s1, a, a, R, s1 >     Skip over other a's on this sweep
< s1, b, B, R, s2 >     The first b on this sweep
< s2, b, b, R, s2 >     Skip over other b's this sweep
< s2, c, C, L, s3 >     The first c this sweep; start moving left
< s3, b, b, L, s3 >     Ignore any b's when moving left
< s3, B, B, L, s3 >     Ignore any B's when moving left
< s3, a, a, L, s3 >     Ignore any a's when moving left
< s3, A, A, R, ??? >    Past active region; start moving right
After using these instructions once, in sequence, the machine is ready to start sweeping right again. It seems inefficient to add additional states (as might be tempting for the incomplete final instruction). The only difference from the initial sweep right is that now there are A's, B's, and C's on the tape. Additional instructions are needed to skip over them (in the proper order, of course).

How does the machine know it is done? It is done when a right sweep encounters only B and C (but no lowercase b or c). If the right sweep starts in state s0 and never leaves that state unless an a is encountered, then the instruction < s0, λ, λ, S, f > will properly halt the machine after verifying that all original symbols match in number and are in the proper order. Here is the full list of instructions.

< s0, λ, λ, S, f >      Recognize a valid string
< s0, a, A, R, s1 >     The first a on this sweep
< s0, B, B, R, s0 >     Should be the final sweep
< s0, C, C, R, s0 >     Should be the final sweep
< s1, a, a, R, s1 >     Skip over other a's on this sweep
< s1, B, B, R, s1 >     Skip over any B's on this sweep
< s1, b, B, R, s2 >     The first b on this sweep
< s2, b, b, R, s2 >     Skip over other b's this sweep
< s2, C, C, R, s2 >     Skip over any C's on this sweep
< s2, c, C, L, s3 >     The first c this sweep; start moving left
< s3, C, C, L, s3 >     Ignore any C's when moving left
< s3, b, b, L, s3 >     Ignore any b's when moving left
< s3, B, B, L, s3 >     Ignore any B's when moving left
< s3, a, a, L, s3 >     Ignore any a's when moving left
< s3, A, A, R, s0 >     Past active region; start moving right
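The instruction list above can be replayed mechanically. The following Python sketch is only an illustration: the dictionary layout, the function name `run`, the step limit, and the use of the character "λ" for a blank cell are assumptions made here, not part of the text. It applies the halting convention stated earlier (the machine halts when no instruction matches the current state and symbol).

```python
from collections import defaultdict

BLANK = "λ"  # stand-in for the blank-cell symbol (an assumption of this sketch)

# (current state, symbol read) -> (symbol to write, move, next state)
RULES = {
    ("s0", BLANK): (BLANK, "S", "f"),
    ("s0", "a"): ("A", "R", "s1"),
    ("s0", "B"): ("B", "R", "s0"),
    ("s0", "C"): ("C", "R", "s0"),
    ("s1", "a"): ("a", "R", "s1"),
    ("s1", "B"): ("B", "R", "s1"),
    ("s1", "b"): ("B", "R", "s2"),
    ("s2", "b"): ("b", "R", "s2"),
    ("s2", "C"): ("C", "R", "s2"),
    ("s2", "c"): ("C", "L", "s3"),
    ("s3", "C"): ("C", "L", "s3"),
    ("s3", "b"): ("b", "L", "s3"),
    ("s3", "B"): ("B", "L", "s3"),
    ("s3", "a"): ("a", "L", "s3"),
    ("s3", "A"): ("A", "R", "s0"),
}
MOVES = {"L": -1, "R": 1, "S": 0}


def run(word, max_steps=10_000):
    """Run the machine on `word`; return (halting state, head position, tape)."""
    tape = defaultdict(lambda: BLANK, enumerate(word))  # unwritten cells are blank
    state, head = "s0", 0  # start at the leftmost nonblank cell
    for _ in range(max_steps):
        rule = RULES.get((state, tape[head]))
        if rule is None:  # no matching instruction: the machine halts
            break
        write, move, state = rule
        tape[head] = write
        head += MOVES[move]
    else:
        raise RuntimeError("step limit reached; the machine may not halt")
    cells = "".join(tape[i] for i in range(len(word)))
    return state, head, cells


for word in ["aabbcc", "aabcc", "ab", "abbc"]:
    state, head, cells = run(word)
    verdict = "recognized" if state == "f" else "not recognized"
    print(f"{word}: halts in {state} at cell {head}, tape {cells} ({verdict})")
```

The sketch only replays the fifteen listed instructions on the inputs shown; the head position it reports identifies the cell whose symbol caused the halt.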
Quick Check 9.12
1. For each input string (shown on the data tape), determine whether the Turing machine in Example 9.38 halts. If it does halt,
• indicate the state at which the Turing machine halts
• show the data tape after halting
• put parentheses around the input symbol that caused the machine to halt
• indicate whether the string is recognized or not
(a) ...λaabbccλ...
(b) ...λaabccλ...
(c) ...λabλ...
(d) ...λabbcλ...
The Church-Turing Thesis
Turing machines are important because they are as expressive as any phrase-structured grammar. They are important for an additional reason. A famous assertion, called the Church-Turing thesis,²⁷ claims that a Turing machine is as expressive as any possible model of computation. The assertion is called a thesis (rather than a theorem) because it compares the precise, formal notion of a Turing machine to the imprecise, intuitive notion of computability.

Church-Turing Thesis
A Turing machine is capable of performing any computable algorithm.

This thesis asserts that a Turing machine is capable of performing any computation that a massively parallel supercomputer can do. Of course, the supercomputer may be capable of completing some tasks in a few seconds that a Turing machine would need billions of years to complete.
²⁷ Named for Alonzo Church and Alan Turing.

Turing Machines and Neural Nets
There is an additional significant issue related to Turing machines that should be mentioned. A Turing machine is a model for machine computation. In the early 1940s, McCulloch and Pitts formulated a model of the human brain called a neural net. The basic component in their model is a neuron. Neurons have excitatory inputs and inhibitory inputs. If the excitatory signal minus the inhibitory signal is greater than some threshold value, the neuron fires, sending a signal to one of another neuron's inputs. Threshold values can vary from neuron to neuron. Neurons are connected to other neurons in nets. There are also inputs to the net from outside the net and outputs that leave the net. It is possible to hook neurons together to form simple logic circuits.

Pitts showed that a neural net can express the same computations as a Turing machine. If Pitts's observation is coupled with the Church-Turing thesis, you arrive at the conclusion that there exist expressively equivalent models for computers and for human brains. It is tempting to carry this one step further and assert that there is no essential difference between humans and intelligent machines. Many people have made just this claim. In fact, they go even further. Recall Shannon's definition of information. That definition focused on pattern and probability and excluded any notion of meaning. If information is just pattern and if humans and intelligent machines are essentially the same, it is not hard to generate a description (or metamodel) that describes both humans and machines: information processor. The information may be stored in different ways, and the manipulations may be done using different mechanisms, but both are capable of performing the same transformations on some collection of data.

This equating of human and machine is not without its critics. One simple observation is that the equation is extremely reductionistic. Many details of what humans and machines really are have been omitted in order to produce the simple models. Those omitted details may be quite significant in defining the very real differences. A second line of criticism asserts that humans are more intimately connected to their bodies than the simple models admit. These critics would assert that human intelligence cannot be defined (or developed) apart from the human body.

This is not the place to carry this discussion any further. It has been mentioned in order to indicate how seemingly academic models of computation have entered the discussion of what it means to be human or intelligent. These models are no longer just theoretical curiosities for professors to play with; they have become a frequently mentioned artifact in our attempt to understand ourselves.
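The threshold rule for a neuron described above, and the remark that neurons can be hooked together into simple logic circuits, can be made concrete with a few lines of Python. This is a minimal sketch: the function names, the 0/1 signal encoding, and the particular threshold values are assumptions chosen for illustration and are not taken from McCulloch and Pitts.

```python
def neuron(excitatory, inhibitory, threshold):
    """Fire (return 1) when excitatory minus inhibitory input exceeds the threshold."""
    return 1 if sum(excitatory) - sum(inhibitory) > threshold else 0


def AND(x, y):
    # Both excitatory inputs are needed: 2 > 1, but 1 > 1 fails.
    return neuron([x, y], [], threshold=1)


def OR(x, y):
    # One excitatory input is enough: 1 > 0.
    return neuron([x, y], [], threshold=0)


def NOT(x):
    # A constant excitatory input of 1 is cancelled by the inhibitory input x.
    return neuron([1], [x], threshold=0)


def XOR(x, y):
    # A small net: three gates feeding a fourth.
    return AND(OR(x, y), NOT(AND(x, y)))


for x in (0, 1):
    for y in (0, 1):
        print(x, y, AND(x, y), OR(x, y), XOR(x, y))
```

Varying only the threshold turns the same unit from an OR gate into an AND gate, which is one way to read the statement that threshold values can vary from neuron to neuron.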
9.6.3 Exercises
The exercises marked with ▶ have detailed solutions in Appendix G.

1. ▶ Design a context-free grammar that specifies the language $L = \{a^n b c^n d \mid n \ge 1\}$. Use $\Sigma = \{a, b, c, d\}$ and $\Delta = \{S, X\}$ (you really do need at least two nonterminals).
2. Design a context-free grammar that produces all binary strings that have two zeros at both the beginning and end of the string. (The string 00 of length 2 qualifies.)
3. Define a grammar (of any type) that specifies the language $L = \{a^n b^m c^n \mid n, m \ge 0\}$.
4. A context-free grammar is defined by $\Sigma = \{a, b\}$, $\Delta = \{S, X, Y\}$, and $\Pi = \{S \to aXb,\ X \to bYa \mid ba,\ Y \to aXb \mid ab\}$. The start symbol is S.
(a) Describe the language generated by this grammar.
(b) Is there a regular grammar that generates the same language? If there is, describe one; otherwise, show why there is none.
(c) Speculate about the answer to the following assertion: "a nonregular grammar can generate a regular language."
5. For each set of productions, determine the most restrictive type of grammar that the productions can be part of. (A higher grammar type in the set {0, 1, 2, 3} indicates a more restrictive grammar.) Assume that uppercase letters represent nonterminal symbols and lowercase letters represent terminal symbols.
(a) ▶ $\{abN \to abc,\ M \to cN \mid a\}$
(b) $\{N \to \lambda \mid aN,\ M \to cMa \mid aN\}$
(b) {Nb->. IcaN MeaIM, +I X]
(c) $\{aNb \to aMc \mid accM,\ M \to cN \mid \lambda\}$
(d) $\{N \to aaNM \mid bbb,\ M \to aM\}$
6. For each set of productions, determine the most restrictive type of grammar that the productions can be part of. (A higher grammar type in the set {0, 1, 2, 3} indicates a more restrictive grammar.) Assume that uppercase letters represent nonterminal symbols and lowercase letters represent terminal symbols.
(a) ▶ $\{aaNa \to bN \mid cNc \mid c,\ X \to bM \mid N\}$
(b) $\{N \to bNb \mid cMc,\ dMe \to dbNbbe \mid dace\}$
(c) $\{N \to aN \mid \lambda,\ M \to aabM \mid bcN \mid a\}$
(d) {ccN → aNMa | bM, M → JMa | bc}
7. Design a grammar, of any type, that generates all strings of a's and b's that contain an equal number of each letter. Thus, the grammar should generate abaababbba and λ, but not aba. Show the derivation for abaababbba using your grammar.
8. Design a regular grammar that generates all strings of a's and b's that contain exactly one a. Use $\Sigma = \{a, b\}$ (so λ should not be used).
9. Design a context-free grammar that generates the language $\{a^n b^{n+1} \mid n \ge 0\}$.
10. Describe a pushdown automaton that recognizes all the strings in the language $\{a^n b^{n+1} \mid n \ge 0\}$.
11. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) ▶ Any language recognized by a Turing machine can be recognized by a pushdown automaton.
(b) A stack has two operations that move items in and out of bins. Push and pop have the jobs of removing the item at the top of the stack and placing a new item on the top of the stack, respectively.
(c) Given any computational model, a Turing machine will be at least as expressive.
(d) Turing machines do not use black hole states to reject strings. Instead, Turing machines simply do not specify transitions and actions for dead-end state/input pairs.
12. Each of the following four statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) A language can be generated by a context-free grammar if and only if it can be recognized by a pushdown automaton.
(b) It is a universal requirement for the grammars specified in this text that at least one nonterminal appears on the left side of a production.
(c) It is permissible for a cell in the data tape for a Turing machine to be blank.
(d) Context-free grammars cannot generate the null string.
13. Describe a pushdown automaton that recognizes all the strings in the language $\{a^n b^m \mid n \ge m \ge 0\}$.
14. For each input string (shown on the data tape), determine whether the Turing machine in Example 9.38 halts. If it does halt,
• indicate the state at which the Turing machine halts
• show the data tape after halting
• put parentheses around the input symbol that caused the machine to halt
• indicate whether the string is recognized or not
(a) ▶ ...λaaabbccλ...
(b) ...λaabbccλ...
(d) ...λaaabbbccccλ...
15. Design a Turing machine that recognizes the language $\{0^n 1 0^n \mid n \ge 1\}$.
9.7 QUICK CHECK SOLUTIONS

Quick Check 9.1
1. Alternative sequences of questions are valid. The number of questions should not vary.
(a) Three questions suffice. One possible sequence might be
• Does the compass reading contain an N? Suppose the answer is yes.
• Does the compass reading contain an E? Suppose the answer is no.
• Does the compass reading contain a W? If the answer is yes, the message was NW; otherwise the message was N.
Similar sequences of questions will always determine the message. This experiment should be assigned information value "three questions."
(b) Three questions are guaranteed to work. Sometimes two will suffice. One set of questions would be
• Is the value less than 4? Suppose it is.
• Is the value an even number? If the answer is yes, then a 2 was rolled. Suppose the answer is no (an odd was rolled).
• Is the number a 1? The answer will determine whether the message is informing us that a 1 or a 3 was rolled.
Since sometimes three questions are needed, the information value should be "three questions."
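Quick Check 9.2
1. The change of base formula (page 172) is needed to calculate base 2 logarithms.
(a) $H = -\frac{1}{3}\log_2\left(\frac{1}{3}\right) - \frac{2}{3}\log_2\left(\frac{2}{3}\right) = -\frac{1}{3}(-1.585) - \frac{2}{3}(-.585) = .918$
(b) $H = -\frac{1}{11}\log_2\left(\frac{1}{11}\right) - \frac{10}{11}\log_2\left(\frac{10}{11}\right) = -\frac{1}{11}(-3.459) - \frac{10}{11}(-.1375) = .439$
2. In this case, $p_1 = p_2 = \frac{1}{2}$. Thus,
$H = -\frac{1}{2}\log_2\left(\frac{1}{2}\right) - \frac{1}{2}\log_2\left(\frac{1}{2}\right) = -\frac{1}{2}\log_2\left(2^{-1}\right) - \frac{1}{2}\log_2\left(2^{-1}\right) = \frac{1}{2}\log_2(2) + \frac{1}{2}\log_2(2) = 1.$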
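These values can be checked numerically. The sketch below assumes the probabilities implied by the logarithm values in the solutions (1/3 and 2/3 in part (a), 1/11 and 10/11 in part (b)); the function name `average_information` is chosen here for illustration. Python's `math.log2` computes base 2 logarithms directly, so the change of base formula is not needed.

```python
from math import log2


def average_information(probabilities):
    """Average information, in bits, of a message drawn with the given probabilities."""
    return -sum(p * log2(p) for p in probabilities)


print(round(average_information([1/3, 2/3]), 3))     # part 1(a), approximately .918
print(round(average_information([1/11, 10/11]), 3))  # part 1(b), approximately .439
print(average_information([1/2, 1/2]))               # part 2: two equally likely messages
```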
Quick Check 9.3
1. This can be accomplished very efficiently using only two states: Odd (an odd number of 1s so far) and Even (an even number of 1s so far). The start state is Even, the final state is Odd. The inputs are {0, 1}.
(a) The state table is

             Input
State      0        1
Even       Even     Odd
Odd        Odd      Even

(b) The state diagram is
[State diagram: Even loops to itself on input 0 and moves to Odd on input 1; Odd loops to itself on input 0 and moves to Even on input 1.]
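The state table translates directly into a lookup table. In this sketch the dictionary name `TRANSITIONS` and the function name `recognizes` are illustrative choices; the states, inputs, start state, and final state are the ones given in the solution.

```python
# (state, input) -> next state, copied from the state table
TRANSITIONS = {
    ("Even", "0"): "Even",
    ("Even", "1"): "Odd",
    ("Odd", "0"): "Odd",
    ("Odd", "1"): "Even",
}


def recognizes(bits):
    """Return True when the automaton ends in the final state Odd."""
    state = "Even"  # start state
    for bit in bits:
        state = TRANSITIONS[(state, bit)]
    return state == "Odd"


print(recognizes("1011"))  # three 1s
print(recognizes("1001"))  # two 1s
```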
Quick Check 9.4
1. The modified finite-state machine will require three additional transitional states. These states keep track of the bits that are being flushed at the end of the input string. I will call two of these new states f0 (still have one 0 to flush) and f1 (a final 1 to flush); the third indicates that the machine is all done. To be really complete, I should add one more state (an "error state") that would indicate an error in the input string (a "-" that is in the wrong place). The error state would be a black hole state. I have made the all-done state serve in this additional role. The transition and output functions in tabular form are shown next. (A state diagram follows.) The symbol "-" now has dual roles (input and output symbol).
[Transition and output table: not recoverable from the source.]

The following diagram is an alternative representation of this finite-state machine.

[State diagram: not recoverable from the source.]
Quick Check 9.5
1. No. The concatenations preserve the order of the characters in the strings. The string abab is in $\{x^n \mid n \ge 0\}$, but abba is not.
2. abcdecde
3. No, $x^0 = \lambda$, the null string; $A^0 = \{\lambda\}$, the set containing the null string as its only element.
4. AB = {rt, ru, st, su}
5. 16. Each successive concatenation pairs every element of the left-hand set with every element of the right-hand set. AB will have four elements. Concatenating with another B will cause the set to double (since |B| = 2). Finally, concatenating with another A will double the set again.
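The counting argument in the last two answers can be replayed with Python sets. The sets A = {r, s} and B = {t, u} and the product ABBA are inferred here from the answers above (the questions themselves are not reproduced in this section), so treat them as assumptions; `concat` is an illustrative helper name.

```python
def concat(A, B):
    """Set concatenation: every string of A followed by every string of B."""
    return {a + b for a in A for b in B}


A = {"r", "s"}
B = {"t", "u"}

AB = concat(A, B)
ABB = concat(AB, B)
ABBA = concat(ABB, A)

print(sorted(AB))                   # the four strings rt, ru, st, su
print(len(AB), len(ABB), len(ABBA))  # the size doubles at each step
```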
Quick Check 9.6
1. The major change from Example 9.18 is that the transition to the b section of the string must guarantee that at least one b is generated. The productions $S \to cC$, $S \to \lambda$, $A \to cC$, and $A \to \lambda$ must be dropped.
$\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{\lambda, a, b, c\}$, $\Delta = \{S, A, B, C\}$, and $\Pi$ contains
• $S \to aA \mid bB$
• $A \to aA \mid bB$
• $B \to bB \mid cC \mid \lambda$
• $C \to cC \mid \lambda$
2. $S \Rightarrow bB \Rightarrow bbB \Rightarrow bbcC \Rightarrow bbc$
Quick Check 9.7
1. Two simple solutions are possible:
• ea(r|t)
• ea[rt]
Notice that ear|t is not a valid solution (try it).
2. i[ez]

Quick Check 9.8
1. [0-9]+ is incorrect (it allows 0). The solution is [1-9][0-9]* (which excludes leading 0s).
2. We need to allow for a leading - sign. (-)?[0-9]+ will almost work (but allows -0 and 00). A better solution is ^(0|-?[1-9][0-9]*)$ (either the number is 0, or else start with an optional minus sign followed by a nonzero digit, followed by zero or more digits).
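Both patterns can be tried out with Python's `re` module, whose syntax for these particular metacharacters matches the one used here. The variable names and the sample strings are illustrative assumptions; `fullmatch` is used so that the whole string, not just a prefix, must match.

```python
import re

POSITIVE = re.compile(r"[1-9][0-9]*")         # solution to part 1
ALMOST = re.compile(r"(-)?[0-9]+")            # the near miss from part 2
INTEGER = re.compile(r"^(0|-?[1-9][0-9]*)$")  # the better solution from part 2

for text in ["42", "0", "007", "-17", "-0", "00"]:
    print(
        text,
        bool(POSITIVE.fullmatch(text)),
        bool(ALMOST.fullmatch(text)),
        bool(INTEGER.fullmatch(text)),
    )
```

The near miss accepts -0 and 00, while the final pattern rejects both and still accepts 0 itself.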
Quick Check 9.9
1. (a) Use the construction from the proof of Theorem 9.3. Let $\Sigma = \{0, 1\}$ and $\Delta = \{Z, E, T\}$ (where state One has become nonterminal E). The start symbol will be Z. The productions are $\Pi = \{Z \to 1Z,\ Z \to 0E,\ E \to 0T,\ E \to 0,\ E \to 1Z,\ T \to 1Z,\ T \to 0T,\ T \to 0\}$.
(b) $L(\mathcal{G})$ is the set of all bit strings that end in two zeros.
2. Use the construction from the proof of Theorem 9.4. The initial part of the construction leads to $\Sigma = \{a, b\}$, the start state is s0, and since there is a production $S \to baa$ that contains neither a nonterminal nor λ on the right-hand side, $\mathcal{S}$ initially contains s0 and f. We also know that $\mathcal{F} = \{f\}$. From the production $S \to aS$ we have the transition t(s0, a) = s0. The production $S \to baa$ will introduce two more states (which can be named s1 and s2 for this example) and the transitions t(s0, b) = s1, t(s1, a) = s2, and t(s2, a) = f. Finally, adding the state bh and the transitions t(s1, b) = bh, t(s2, b) = bh, t(f, a) = bh, t(f, b) = bh, t(bh, a) = bh, and t(bh, b) = bh completes the finite automaton. A is therefore defined by $\mathcal{S} = \{s0, f, s1, s2, bh\}$, with start state s0. The input values are $\Sigma = \{a, b\}$ and the final states are $\mathcal{F} = \{f\}$. The transition function is summarized in the following state diagram.
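The transitions listed in part 2 are enough to run the automaton on sample strings. The sketch below copies them into a dictionary; the generic helper `accepts` and its parameter names are assumptions made for illustration.

```python
# t(state, input) = next state, exactly as listed in the solution
T = {
    ("s0", "a"): "s0", ("s0", "b"): "s1",
    ("s1", "a"): "s2", ("s1", "b"): "bh",
    ("s2", "a"): "f",  ("s2", "b"): "bh",
    ("f", "a"): "bh",  ("f", "b"): "bh",
    ("bh", "a"): "bh", ("bh", "b"): "bh",
}


def accepts(transitions, start, finals, word):
    """Follow one transition per input symbol and test the final state."""
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in finals


for word in ["baa", "aabaa", "ba", "baab", ""]:
    print(repr(word), accepts(T, "s0", {"f"}, word))
```

The accepted strings are those the productions S → aS and S → baa can derive: any number of a's followed by baa.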
Quick Check 9.10
1. This is quite straightforward.
1. Create a new start state, s0, and connect it to the old start state via a λ-transition.
2. Create a new final state, f0, and connect all the old final states to it via λ-transitions.
3. Eliminate any nonfinal state that has no transitions that move from the state to some different state (a nonfinal black hole).
4. For each pair of states, x and y, that have more than one transition from x to y, replace those transitions by a single transition whose label is the regular expression formed by using the alternative operator, |, with the labels on the former transitions.
5. While there are still more than two states
   A. Choose some state, $y \notin \{s0, f0\}$, to eliminate.
   B. For each pair of transitions, < x, y > and < y, z >, with $x \ne y$, $z \ne y$:
      α. Save $L_o(x, z)$ and eliminate the transition < x, z > if it exists.
      β. Create a new transition, < x, z >, with $L_n(x, z) = L_o(x, z) \mid L_o(x, y)\, L_o(y, y)^{*}\, L_o(y, z)$.
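The steps above can be sketched in Python. This is an illustration under several assumptions: transitions are given as (source, symbol, target) triples, symbols are plain characters with no special meaning in Python's `re` syntax, the names `START` and `FINAL` stand for the new states s0 and f0, and step 3 is skipped because eliminating a black hole in step 5 creates no new transitions anyway. The function name `to_regex` is hypothetical.

```python
import re


def to_regex(states, start, finals, edges):
    """State elimination: return a regular expression for the automaton's language."""
    label = {}  # (x, y) -> regular expression labeling the transition from x to y

    def add(x, y, expr):
        # Step 4 and step 5B: parallel transitions are merged with the | operator.
        label[(x, y)] = expr if (x, y) not in label else f"{label[(x, y)]}|{expr}"

    for x, symbol, y in edges:
        add(x, y, symbol)
    add("START", start, "")      # step 1: λ-transition from the new start state
    for q in finals:
        add(q, "FINAL", "")      # step 2: λ-transitions to the new final state

    for y in states:             # step 5: eliminate the original states one by one
        loop = f"({label.pop((y, y))})*" if (y, y) in label else ""
        incoming = [(x, e) for (x, t), e in label.items() if t == y]
        outgoing = [(z, e) for (s, z), e in label.items() if s == y]
        for key in [k for k in label if y in k]:
            del label[key]
        for x, e_in in incoming:
            for z, e_out in outgoing:
                add(x, z, f"({e_in}){loop}({e_out})")
    return label.get(("START", "FINAL"))  # None when no string is recognized


# Example: the two-state automaton that recognizes an odd number of 1s.
pattern = to_regex(
    ["Even", "Odd"], "Even", ["Odd"],
    [("Even", "0", "Even"), ("Even", "1", "Odd"),
     ("Odd", "0", "Odd"), ("Odd", "1", "Even")],
)
print(pattern)
print(bool(re.fullmatch(pattern, "1011")), bool(re.fullmatch(pattern, "1001")))
```

The resulting expression is heavily parenthesized because every label is wrapped before it is concatenated; it is equivalent to, but longer than, what a person would write by hand.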
Quick Check 9.11
1. Let $\mathcal{G}$ be defined by having $\Sigma = \{a, b, \lambda\}$, $\Delta = \{S\}$, with S as the start symbol, and $\Pi = \{S \to \lambda \mid aSbb\}$.
2. One solution is $\Sigma = \{a, b, c, \lambda\}$, $\Delta = \{S\}$, with S as the start symbol, and $\Pi = \{S \to aSc \mid b\}$.
Quick Check 9.12
1. The input symbol that caused the halt is in parentheses.
(a) Halts in state f (recognized). Tape = ...λAABBCC(λ)...
(b) Halts in state s1 (not recognized). Tape = ...λAAB(C)cλ...
(c) Halts in state s2 (not recognized). Tape = ...λAB(λ)...
(d) Halts in state s0 (not recognized). Tape = ...λAB(b)Cλ...
9.8 Chapter Review

9.8.1 Summary
This chapter presents some mathematical models that seek to describe several aspects of computer science. The chapter starts with Shannon's models of information and a communication system. The model of a communication system is simple, but it is very useful. The development of error-correcting codes (Section 8.5) is a response to the presence of noise in the communication channel. Shannon's model of information is not as simple, but it is based on some intuitive ideas. It is worth remembering that Shannon's definition of information does not consider "meaning." It is an attempt to form an abstraction that concentrates on the engineering aspects of reliably transmitting data. Notions such as the "average information" in a randomly selected message help to discuss ideas such as "how much information per second can be transmitted over this channel?"

The remainder of the chapter presents several models that are abstractions created to help theorists understand computers. Finite-state machines (Section 9.2) are capable of representing the notion that at any given instant, a computer has a definite state (the contents of the memory, registers, program counter, etc.). In addition, state changes (and perhaps output) can be triggered by new input, but the action will also depend on the current state.

Section 9.3 is motivated by the effort to understand compilers. Human languages are too ambiguous and complex to make it practical to translate a human-oriented program into machine-oriented object code that can run on a central processing unit. Formal languages remove some of the complexity and ambiguity of human languages (as well as some of the expressiveness). What is gained is conformity to a grammar (similar to a set of axioms). This regularity and concrete set of transformation rules (productions) enable the creation of compilers (programs whose task is to translate other programs into machine language).

One very simple formal language that is used extensively is the language of regular expressions (Section 9.4). Regular expressions enable people to do very sophisticated pattern matching for text processing. The patterns can be very general (such as matching everything that looks like a phone number, a zip code, or an e-mail address).

Section 9.5 explores the connections between finite automata (Section 9.2), regular grammars (Section 9.3), and regular expressions (Section 9.4). The main theorem in that section demonstrates that they are distinct models that have identical expressive power.

The material in Section 9.6 extends the simple finite-state machine models into a series of increasingly more powerful models (and associated formal languages). This section is intended as a brief preview of ideas that you may encounter in future courses. It does not contain enough details to serve as a solid introduction.

Except for (perhaps) the model of information and the proof of Kleene's theorem, the material in this chapter is mostly of a concrete nature. There are many algorithmic manipulations, and there are fewer definitions and theorems than some other chapters contain. There is, however, a large amount of new notation. This chapter contains some very useful models, especially for students who plan to do more advanced work in computer science. Instructors in future courses expect students to be comfortable with these models and capable of using them effectively. As you review, don't concentrate on the theory alone; spend some time reviewing how to use the models to solve practical problems.
9.8.2 Notation

Notation                                                   Page   Brief Description
$I(S)$                                                     525    the information value (or average information) of selecting/receiving a message from sample space, S
$\{\mathcal{S}, \Sigma, t, s_0, \mathcal{F}\}$             531    a finite automaton
$\mathcal{S}$                                              531    the set of states for a finite-state machine
$\Sigma$                                                   531    the set of input values for a finite-state machine (alternative usage listed below)
$t(s, i)$                                                  531    a transition function that maps state-input pairs to states
$s_0$                                                      531    the start state in a finite-state machine
$\mathcal{F}$                                              531    the set of final (or accepting) states in a finite automaton
$M = \{\mathcal{S}, \Sigma, t, \Gamma, g, s_0\}$           535    a finite-state machine with output
$\Gamma$                                                   535    the set of output values for a finite-state machine with output
$g(s, i)$                                                  535    an output function that maps state-input pairs to output values
$\Sigma$                                                   541    an alphabet in a formal language (alternative usage listed above)
$\lambda$                                                  541    the null string
$\Sigma^*$                                                 541    the set of all finite-length words over $\Sigma$
$\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$                 542    a regular grammar
$\Delta$                                                   542    the set of nonterminal symbols in a regular grammar
$S$                                                        542    the start symbol in a regular grammar
$\Pi$                                                      542    the set of productions in a regular grammar
$v_1 \Rightarrow v_2$                                      543    $v_2$ is directly derivable from $v_1$
$v_1 \stackrel{*}{\Rightarrow} v_n$                        543    $v_n$ is derivable from $v_1$
$L(\mathcal{G})$                                           543    the language generated by grammar, $\mathcal{G}$
$\overline{L}$                                             570    the complement of the language, L
String Concatenation (See page 544.)
Let x and y be strings. Then
• xy is the string formed by concatenating the two strings.
• $x^0 = \lambda$
• $x^n = x\,x^{n-1}$ for $n \ge 1$
• $x^*$ represents zero or more copies of x. That is, $x^* \in \{x^n \mid n \ge 0\}$.
• $x^+$ represents one or more copies of x. That is, $x^+ \in \{x^n \mid n \ge 1\}$.
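Kleene Operators (See page 544.)
Let A and B be sets of symbols or sets of strings. Then
• $AB = \{ab \mid a \in A \text{ and } b \in B\}$
• $A^0 = \{\lambda\}$
• $A^n = A\,A^{n-1}$ for $n \ge 1$
• The Kleene closure of A is $A^* = \{\lambda\} \cup A \cup A^2 \cup A^3 \cup \cdots = \bigcup_{i=0}^{\infty} A^i$.
• $A^+ = A \cup A^2 \cup A^3 \cup \cdots = \bigcup_{i=1}^{\infty} A^i$.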
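The recursive definition of $A^n$ and the unions defining the Kleene closure can be sketched with Python sets. Since $A^*$ is infinite whenever A contains a nonempty string, the sketch stops the union at a chosen power; the function names, the cutoff parameter, and the sample set are assumptions made for illustration, and the empty Python string plays the role of λ.

```python
def concat(A, B):
    """AB: every string of A followed by every string of B."""
    return {a + b for a in A for b in B}


def power(A, n):
    """A^n, built from A^0 = {λ} and A^n = A A^(n-1)."""
    result = {""}
    for _ in range(n):
        result = concat(A, result)
    return result


def closure_up_to(A, n):
    """Finite part of the Kleene closure: A^0 ∪ A^1 ∪ ... ∪ A^n."""
    return set().union(*(power(A, i) for i in range(n + 1)))


A = {"a", "bb"}
print(sorted(power(A, 2)))
print(sorted(closure_up_to(A, 2), key=lambda s: (len(s), s)))
```

Dropping the $A^0$ term from the union gives the corresponding finite part of $A^+$.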
Regular Expression Metacharacters (See page 549.)
• The $ character matches the end of a line.
• The ^ character matches the beginning of a line.
• The [ character initiates a [ ] pair. A regular expression consisting of a pair [ ] with a set of characters inside matches any one character from the set. A regular expression consisting of [^ ] with a set of characters after the ^ matches any one character that doesn't occur inside the [ ] pair. The "-" character acts as a range operator. If the "-" character is one of the characters you want as a choice in the [ ] pair, it needs to be the first character so that it isn't interpreted as a range operator.
• The metacharacter | indicates an alternative. A regular expression containing a | matches any string that contains either the left or the right alternative.
• The ( ) metacharacters are used to group characters into subpatterns of the regular expression.
• The \ metacharacter is used to convert a metacharacter back into a normal symbol (so it can match itself).
• The . metacharacter matches any single character except a newline.
• The " metacharacter is used to surround a regular expression. Everything between a pair of " characters is treated as the regular expression. The main reason for this is for use with software that accepts regular expressions as command-line arguments. The " characters keep the operating system from trying to evaluate the regular expression as a part of a command.
• The * metacharacter indicates that 0 or more copies of the immediately preceding character or subpattern will be matched.
• The ? metacharacter matches either 0 or 1 copy of the immediately preceding character or subpattern.
• The + metacharacter matches 1 or more copies of the immediately preceding character or subpattern.
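Most of the metacharacters listed above behave the same way in Python's `re` module, which makes it a convenient place to experiment. The patterns and sample strings below are illustrative assumptions, not examples from the text; the surrounding " convention is a command-line matter and has no counterpart in the sketch.

```python
import re

# [ ] picks one character from a set; [^ ] picks one character outside the set.
print(bool(re.search(r"[0-9]", "room 12")), bool(re.search(r"[^0-9]", "2024")))

# A "-" placed first inside [ ] is an ordinary character, not a range operator.
print(bool(re.fullmatch(r"[-+][0-9]+", "-12")))

# \ turns a metacharacter back into a normal symbol; an unescaped . matches any character.
print(bool(re.fullmatch(r"3\.14", "3x14")), bool(re.fullmatch(r"3.14", "3x14")))

# ?, *, and + control how many copies of the preceding character or subpattern match.
print(bool(re.fullmatch(r"colou?r", "color")), bool(re.fullmatch(r"(ab)+", "ababab")))
print(bool(re.fullmatch(r"ab*", "a")), bool(re.fullmatch(r"ab+", "a")))

# ^ and $ anchor the match; ( ) with | groups the alternatives.
print(bool(re.search(r"^ea(r|t)$", "eat")), bool(re.search(r"^ea(r|t)$", "seat")))
```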
9.8.3 Definitions

Communication System
Shannon's model of a communication system connects the following components: an information source, a destination, messages, a transmitter, a channel, signals, a noise source, received signals, a receiver, and received messages. See the diagram and descriptions on page 522. A discrete communication system is one in which the message and the signal both consist of a sequence of discrete symbols. If the message and signal are both represented by continuous functions, the system is said to be a continuous communication system. Otherwise, the system is said to be a mixed communication system.
Information
Let $S = \{m_1, m_2, \ldots, m_n\}$ be a finite collection of messages, having respective nonzero probabilities $\{p_1, p_2, \ldots, p_n\}$. Then the average information in a randomly selected message from S is
$$I(S) = -\sum_{k=1}^{n} p_k \log_2(p_k).$$

State
A state is one of a set of well-defined conditions in which a system can exist. If there are only a finite number of states, the system is a finite-state system.

Finite Automaton
A finite automaton, A, is a model that consists of
• a finite set of states, $\mathcal{S}$
• a set of input values, $\Sigma$
• a transition function, t(s, i), that maps state-input pairs to states
• a special state called the start state (generically named $s_0$)
• a subset, $\mathcal{F}$, of final (or accepting) states
This can be represented as $A = \{\mathcal{S}, \Sigma, t, s_0, \mathcal{F}\}$.

State Table; State Diagram
There are two common ways to represent a finite automaton. The first uses a table called a state table; the second uses a diagram called a state diagram.

Recognized; Accepted
When an input string causes the finite automaton to land in a final state, the string is said to be recognized (or accepted).

Finite-State Machine with Output
A finite-state machine with output, M, is a model that consists of
• a finite set of states, $\mathcal{S}$
• a set of input values, $\Sigma$
• a transition function, t(s, i), that maps state-input pairs to states
• a set of output values, $\Gamma$
• an output function, g(s, i), that maps state-input pairs to output values
• a special state called the start state (generically denoted $s_0$)
This is represented as $M = \{\mathcal{S}, \Sigma, t, \Gamma, g, s_0\}$.

Concatenate
The word concatenate means to "connect or link in a series or chain."

String
A finite sequence of characters that are concatenated together is called a string. A finite sequence of 1s and 0s that are concatenated together is often called a binary string.

Symbols; Alphabet; Word; Null String
An alphabet is a finite, nonempty set, $\Sigma$, of elements (called symbols). In this context, any finite string of symbols from $\Sigma$ will be called a word. The null string will be denoted by λ. It is the zero-length string that contains no characters.

Language
The set of all finite-length words using symbols from $\Sigma$ will be denoted $\Sigma^*$. A language over $\Sigma$ is a subset of $\Sigma^*$.

Complement of a Language
Let L be a language over $\Sigma$. The complement of L is denoted by $\overline{L}$ and consists of all strings in $\Sigma^*$ that are not in L.

Regular Grammar
A regular grammar, $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, consists of
• an alphabet, $\Sigma$, also called the set of terminal symbols
• a set, $\Delta$, of nonterminal symbols
• a nonterminal symbol, S, called the start symbol
• a set, $\Pi$, of replacement rules called productions
The productions are all in the form $N \to v$, where N is a nonterminal and $v \in (\Sigma \cup \Delta)^*$. The string v satisfies the following conditions:
• It must contain at most one nonterminal symbol.
• If v contains a nonterminal, it must be the right-most symbol.
• If the only terminal symbol is λ, then there can be no nonterminal symbol.
• The string v must contain at least one terminal symbol.

Derivation
The process of transforming the start symbol into a word over $\Sigma$ is called a derivation. More generally, if $v_1$ and $v_2$ are strings over $\Sigma \cup \Delta$ such that there is a production in $\Pi$ that enables $v_1$ to be replaced by $v_2$, we say that $v_2$ is directly derivable from $v_1$ and denote the fact by $v_1 \Rightarrow v_2$. If $v_n$ can be obtained from $v_1$ by a sequence of substitutions using productions in $\Pi$, we say that $v_n$ is derivable from $v_1$ and denote this as $v_1 \stackrel{*}{\Rightarrow} v_n$.

Language Generated by a Grammar
Let $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$ be a grammar. The language generated by $\mathcal{G}$ is the set of all words over $\Sigma$ that can be derived from S. This set is denoted $L(\mathcal{G})$. Thus,
$$L(\mathcal{G}) = \{w \in \Sigma^* \mid S \stackrel{*}{\Rightarrow} w\}.$$
If the grammar, $\mathcal{G}$, is a regular grammar, then $L(\mathcal{G})$ is called a regular language.

Kleene Closure
The Kleene closure of A is $A^* = \{\lambda\} \cup A \cup A^2 \cup A^3 \cup \cdots = \bigcup_{i=0}^{\infty} A^i$.

... $\Sigma^*$. The subset is called the regular set generated by the regular expression. A regular expression serves as an abstract pattern that specifies which strings in $\Sigma^*$ belong to the corresponding regular set.

Metacharacter
A metacharacter is a special character that provides structure to pattern specifications for regular expressions.

A Formal Definition of Regular Expressions
Let $\Sigma$ be an alphabet. A regular expression over $\Sigma$ is defined recursively by the following:
• The empty set, $\emptyset$, is a regular expression.
• The empty string, λ, is a regular expression.
• The symbol, a, is a regular expression for every symbol $a \in \Sigma$.
• If the symbols or strings, A and B, are regular expressions, then their concatenation, AB, is also a regular expression.
• If the symbols or strings, A and B, are regular expressions, then A | B is also a regular expression.
• If the symbol or string, A, is a regular expression, then A* is also a regular expression.

Nondeterministic Finite Automaton
A nondeterministic finite automaton is a finite-state machine that relaxes three requirements in the definition of a finite automaton (page 531).
• It is permissible to move between states without any input symbol to trigger the transition. Such transitions are called λ-transitions and denoted by labeling the transition with the empty string.
An Informal Definition of Regular Expressions Let E be an alphabet. A regular expression over E is a mechanism for building or recognizing or matching a subset of
symbol. - States may have more than one transition associated with the same input symbol.
*
States do not need transitions associated with every input
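The definition of a finite-state machine with output, M = {S, Σ, t, Γ, g, s0}, can be made concrete with a short simulation. The following Python sketch is only an illustration: the function name `run_machine` and the three-state example machine (states "No", "a", and "ab", which outputs T whenever the two most recent inputs are a followed by b) are assumptions made here, not a machine taken from the text.

```python
def run_machine(transition, output, start, inputs):
    """Feed the input symbols to the machine; return (output string, final state)."""
    state = start
    produced = []
    for symbol in inputs:
        produced.append(output[(state, symbol)])   # g(s, i)
        state = transition[(state, symbol)]        # t(s, i)
    return "".join(produced), state


# Hypothetical example machine over the input alphabet {a, b, c}.
STATES = ("No", "a", "ab")
ALPHABET = ("a", "b", "c")

t = {}  # transition function, stored as a table of state-input pairs
g = {}  # output function, stored the same way
for s in STATES:
    for i in ALPHABET:
        if i == "a":
            t[(s, i)] = "a"
        elif i == "b" and s == "a":
            t[(s, i)] = "ab"
        else:
            t[(s, i)] = "No"
        g[(s, i)] = "T" if (s == "a" and i == "b") else "F"

outputs, final_state = run_machine(t, g, "No", "abbcbab")
print(outputs, final_state)
```

Storing t and g as dictionaries keyed by state-input pairs mirrors the state table representation mentioned at the start of this review.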
Chomsky Hierarchy of Grammars The Chomsky hierarchy of grammars was first introduced in 1959 by the linguist Noam Chomsky. The grammars are all in the form G = {Σ, Δ, S, Π}. The only point of difference is the nature of the productions in Π. In all cases, productions look like α → β, where α and β are strings in (Σ ∪ Δ)*.
Type | Grammar | Productions α → β | Recognized by
0 | Phrase structured (unrestricted) | α ∈ (Σ ∪ Δ)+, β ∈ (Σ ∪ Δ)* (Note: α ≠ λ, but β = λ is ok) | Turing machine
1 | Context sensitive | α1 N α2 → α1 γ α2 for α1, α2, γ ∈ (Σ ∪ Δ)*, γ ≠ λ and N ∈ Δ | Turing machine
2 | Context free | N → β where N ∈ Δ, β ∈ (Σ ∪ Δ)* | Pushdown automaton
3 | Regular | N1 → β1 N2 or N → β where N1, N2, N ∈ Δ and β1, β ∈ Σ*, β1 ≠ λ | Finite automaton
9.8 Chapter Review
Stack A stack consists of a (potentially very large) collection of storage bins and two operations that move items in and out of bins. The operations are called push and pop. Push and pop enforce a "last-in, first-out" discipline on accessing the memory bins. Push places a new item on the top of the stack. Pop removes the item at the top of the stack. No other access to the stack's bins is permitted.
Pushdown Automaton A pushdown automaton is a finite automaton with some limited-access memory. That is, in addition to the states and input symbols found in a finite automaton, there is also a stack on which to store symbols for later use.
Turing Machine A Turing machine consists of a control unit and an infinite data tape. The data tape is divided into cells that are arranged in a line that extends forever in both directions. Each cell may either be blank (contain the symbol λ) or contain one symbol from some finite alphabet. The control unit has the ability to read the contents of a cell, write a new symbol in a cell, and move the Turing machine one cell to the left or right. The control unit also contains a finite set of states together with a set of instructions.
Neural Net A neural net is a model of the human brain. The basic component in the model is a neuron. Neurons have excitatory inputs and inhibitory inputs. If the excitatory signal minus the inhibitory signal is greater than some threshold value, the neuron fires, sending a signal to one of another neuron's inputs. Threshold values can vary from neuron to neuron. Neurons are connected to other neurons in nets. There are also inputs to the net from outside the net and outputs that leave the net.
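The firing rule for a neuron can be written in a few lines. This Python sketch assumes that each input carries a numeric signal and that the excitatory and inhibitory signals are simple sums of their inputs; the function name `neuron_fires` is chosen here for illustration only.

```python
def neuron_fires(excitatory, inhibitory, threshold):
    """Return True when (excitatory signal - inhibitory signal) exceeds the threshold."""
    return sum(excitatory) - sum(inhibitory) > threshold


# Two active excitatory inputs, no inhibition, threshold 1: the neuron fires.
print(neuron_fires([1, 1], [0], 1))
# One excitatory input cancelled by one inhibitory input: the neuron does not fire.
print(neuron_fires([1, 0], [1], 0))
```

Because the comparison is strict, a net signal that exactly equals the threshold does not cause the neuron to fire.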
9.8.4 Theorems
Theorem 9.1 Maximizing Information Let S = {m1, m2, ..., mn} be a finite collection of messages, having respective probabilities {p1, p2, ..., pn}. Then I(S) is maximized when p1 = p2 = ... = pn = 1/n.
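Theorem 9.1 can be checked numerically for small cases. The sketch below computes the average information I(S) = -Σ p_k log2(p_k) and compares a uniform distribution with a skewed one; the function name `average_information` and the sample probabilities are assumptions made for this illustration.

```python
from math import log2


def average_information(probabilities):
    """Return I(S) = -sum(p_k * log2(p_k)), skipping messages with probability 0."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.125, 0.125]

print(average_information(uniform))  # 2.0 bits, the maximum for four messages
print(average_information(skewed))   # 1.75 bits
```

A numerical comparison like this is not a proof, but it shows the behavior the theorem describes: moving probability away from the uniform distribution lowers the average information.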
[Figure: diagram relating regular expressions, regular sets, regular languages, finite automata, and regular grammars by "generate" and "recognize" arrows.]
Theorem 9.2 Any nondeterministic finite automaton can be transformed into a deterministic finite automaton that recognizes the same set of strings.
Theorem 9.3 Finite Automaton → Regular Grammar Let Σ be the set of input symbols for a finite automaton A = {S, Σ, t, s0, T}. Let R be the subset of Σ* that is recognized by A. Then there is a regular grammar G = {Σ, Δ, S, Π} such that L(G) = R.
Theorem 9.4 Regular Grammar → Finite Automaton Let L(G) be the language generated by a regular grammar G = {Σ, Δ, S, Π}. Then there is a nondeterministic finite automaton A = {S, Σ, t, s0, T} that recognizes L(G).
Theorem 9.5 Kleene's Theorem A set is regular if and only if it is recognized by a finite automaton.
Corollary 9.1 Regular Set If and Only If Regular Language A subset R ⊆ Σ* is a regular set if and only if it is a regular language.
Lemma 9.1 Converting Regular Expressions to Nondeterministic Finite Automata Given any regular expression, R, it is possible to find a nondeterministic finite automaton that recognizes all strings, and only those strings, in the regular set generated by R.
Lemma 9.2 Converting Nondeterministic Finite Automata to Regular Expressions Given any nondeterministic finite automaton, A, it is possible to find a regular expression that generates all strings, and only those strings, recognized by A.
Theorem 9.6 The Complement of a Regular Language If L is a regular language, then L̄ is also a regular language.
Theorem 9.7 The Intersection of Regular Languages If L1 and L2 are both regular languages over the same alphabet, Σ, then L1 ∩ L2 is also a regular language.
Theorem 9.8 Context-Free Grammars and Pushdown Automata Any language generated by a context-free grammar can be recognized by a pushdown automaton. Any language recognized by a pushdown automaton can be generated by a context-free grammar.
9.8.5 Sample Exam Questions
1. Reproduce the diagram of Shannon's model for a communication system. Label the diagram.
2. A collection of four messages, S = {m1, m2, m3, m4}, has respective probabilities, {…}. What is the average information in a randomly received message from this set?
3. Create a finite automaton that recognizes any input string that contains exactly two a's. Assume the input alphabet is {a, b, c}.
4. What is the output string that results when the input string abbcbab is sent to the following finite-state machine? What is a likely purpose for this machine?
[State diagram: transitions labeled (b c, F), (a, F), (b, T), (a c, F), (a, F), (b c, F).]
5. Describe the productions in a regular grammar.
6. Let Σ be a set of symbols. Define the Kleene closure of Σ.
7. Create a single regular expression that matches any name in one of the following forms (case is important). Assume that both first and last names contain two or more characters. Use ␣ to represent the space character. Do not use any Perl extensions.
Mr. First Last
Mrs. First Last
Ms. First Last
First Last
Note that "mr. bob Smith" should not be matched, since the "m" in "mr." and the "b" in "bob" are lowercase.
8. Produce a regular grammar, G, such that L(G) is the set of input strings recognized by the following finite automaton. The set of input symbols is {0, 1}.
[State diagram: transitions labeled 0 and 1.]
9. Match each type of grammar with a state model that is capable of recognizing strings in the language generated by the grammar.
Grammars: Context free; regular; context sensitive; phrase structured
State machines: Pushdown automaton; finite automaton; Turing machine
10. State the Church-Turing thesis.
9.8.6 Projects
Mathematics
1. Write a brief expository paper that introduces Shannon's theory of channel capacity.
2. Master the proof of Theorem 9.2. Then write a clear presentation of the proof together with any necessary background information.
3. Write a brief expository paper on the Church-Turing thesis.
4. Write a brief expository paper about primitive recursive functions and computable functions.
Computer Science
1. Write a GUI-based program that simulates finite automata and/or finite-state machines with output.
2. Write a brief introductory paper about the use of formal languages in the design of compilers.
3. Write a brief expository paper about LR grammars. Include a discussion about the meaning of "LR."
4. Write a detailed report about Turing machines. Include a simple program that will run on a Turing machine.
9.8.7 Solutions to Sample Exam Questions
1. See Figure 9.1 on page 522.
2. I(S) = -\sum_{k=1}^{4} p_k \log_2(p_k) = …
3. The following state diagram shows such a finite automaton.
[State diagram: transitions labeled a, b, and c.]
4. The following table summarizes the output string (in the final row), as well as the sequence of states.
Input:   a   b   b   c   b   a   b
State:   a   ab  No  No  No  a   ab
Output:  F   T   F   F   F   F   T
This finite-state machine outputs a T if the previous two input characters are a and then b. It outputs an F in all other cases. So its purpose is to detect the substring ab.
5. The productions in a regular grammar are all in the form N → v, where N is a nonterminal and v ∈ (Σ ∪ Δ)*. The string v satisfies the following conditions:
• It must contain at most one nonterminal symbol.
• If v contains a nonterminal, it must be the rightmost symbol.
• If the only terminal symbol is λ, then there can be no nonterminal symbol.
• The string v must contain at least one terminal symbol.
6. The Kleene closure of Σ is denoted by Σ* and defined as
\Sigma^* = \bigcup_{i=0}^{\infty} \Sigma^i
An independent set induces a subgraph that contains no edges. For each of the graphs in Exercise 4, find a maximal clique and a maximal independent set.
21. Prove that a maximal clique in a graph, G, is a maximal independent set in G̅.
22. Prove Theorem 10.2 and Corollary 10.1. (Hints: For the theorem, use Proposition 11.3 on page 719. For the corollary, see Exercise 21.)
14. If G has 10 edges and G̅ has 5 edges, how many vertices does G have?
15. If G has j edges and G̅ has k edges, where 2(j + k) = m(m + 1) for some positive integer, m, how many vertices does G have?
16. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) If there is an edge joining two vertices in a graph, we say that these two vertices are incident.
(b) It is permissible to call some graph a multigraph in order to emphasize that loops and/or multiple edges between vertices are not permitted.
(c) If there is no edge joining two of the vertices in a simple graph, G, then there must be an edge joining these two vertices in G̅.
Clique; Independent Set
THEOREM 10.2 Bipartite Graphs and Independent Sets
A graph, G, is bipartite if and only if every subgraph, H, of G contains an independent set consisting of at least half the vertices in H.
COROLLARY 10.1 Bipartite Graphs and Cliques
Let G be a graph. Then G is bipartite if and only if every subgraph, H, of G contains a clique consisting of at least half the vertices in H.
Chapter 10 Graphs
10.2 Connectivity and Adjacency Paths and connectivity are generalized forms of adjacency. These topics are the substance of this section.
10.2.1 Connectivity The goal in this section is to define and then explore what it means for a graph to be connected. You may already have an intuitive idea of what it means for a graph to be "connected." One way to test your intuitive notion is to make up several graphs that you think should be connected and several that you think should not be called connected. Check your definition on each example. Your goal is to put your definition through a torture test to see if it is really pure gold or still contains impurities. Try to be devious; make your definition fail if possible. Then create a better definition and start over.
✓ Quick Check 10.3
1. Carry out the process just outlined.
Here is a final test for your definition. Should the graph in Figure 10.24 be considered to be connected? Notice that the vertices V1 = {a, b, c} and the vertices V2 = {d, e, f} are not connected by any edges.
Figure 10.24. Is this connected? [Graph on the vertices a, b, c, d, e, f.]
A definition that uses terms such as "without lifting your pencil from the paper" fails on two counts. First, it would imply that the previous graph is connected. Second, a definition in terms of the visual representation is inadequate. It seems necessary that "getting from one vertex to any other" be part of the final definition. This concept needs to be expressed in terms of the formal definition G = {V, E, φ}. To that end, some new auxiliary definitions are in order. These definitions are presented as a basic concept (a walk), together with some additional refinements (can edges and vertices be repeated? does the walk start and end at the same place?).
DEFINITION 10.11 Walk; Trail; Path; Closed Walk; Circuit; Cycle
Let G = {V, E, φ} be a graph.
The Basic Concept A walk of length k is a nonempty sequence of alternating vertices and edges, v_0 e_1 v_1 e_2 v_2 ... e_k v_k, such that φ(e_i) = v_{i-1} v_i, for i = 1, 2, ..., k. The vertices, v_0 and v_k, are called the endpoints of the walk.
Excluding Repeated Edges and Vertices
• A trail is a walk with no repeated edges.
• A path is a walk with no repeated vertices.
Starting and Ending at the Same Vertex
• A walk is closed if it has length of at least one and its endpoints are the same vertex.
• A circuit is a closed trail.
• A cycle is a circuit in which the endpoints are the only repeated vertices.
Alternative Notation for Walks in Simple Graphs If G is a simple graph, or a simple graph with loops, then the edge joining two vertices is uniquely determined. Consequently, the walk v_0 e_1 v_1 e_2 v_2 ... e_k v_k can be denoted unambiguously by v_0 v_1 v_2 ... v_k.
You may have wondered why there is no special name for a closed path, especially since there are two special names for different kinds of closed trails. A moment's reflection provides the answer. In order to be a closed walk, the initial vertex must be repeated. The presence of a repeated vertex means that the walk cannot be a path.
Walks and Circuits The graph in Figure 10.25 contains many walks. Since it is a simple graph with loops, it is not necessary to include edges when listing a walk. Several walks of various kinds are listed next. In order to keep the graph from becoming cluttered, the edges have not been labeled. The edges will be designated by an e with subscripts that indicate the incident vertices. For example, the edge with endpoints p and r will be denoted by e_pr.
• The walk p e_pr r e_rz z e_zw w (or przw) is both a trail and a path.
• The loop y e_yy y is a circuit of length one.
• The walk syytx is not a path (the vertex y is repeated), but it is a trail.
• The walk ptxups is a trail, is not a path, and is not a circuit (even though it contains a subwalk that is a circuit).
• The walk s has length zero.
• The walk prqsptxup is a circuit.
• The walk tpuxt is a cycle.
Figure 10.25. Walks and circuits.
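The classifications in Definition 10.11 can be tested mechanically when a walk is written in the alternative notation, as a sequence of vertices. The following Python sketch assumes the sequence really is a walk in a simple graph (possibly with loops), so that each consecutive pair of vertices determines one edge; it does not check that those edges exist in any particular graph. The function name `classify_walk` is an assumption made here.

```python
def classify_walk(vertices):
    """Classify a walk given as a vertex sequence, e.g. "tpuxt" or ["t", "p", "u", "x", "t"]."""
    # Each consecutive pair is one edge; a frozenset ignores direction (a loop is {y}).
    edges = [frozenset(pair) for pair in zip(vertices, vertices[1:])]
    length = len(edges)
    is_trail = len(set(edges)) == length              # no repeated edges
    is_path = len(set(vertices)) == len(vertices)     # no repeated vertices
    is_closed = length >= 1 and vertices[0] == vertices[-1]
    is_circuit = is_closed and is_trail
    interior = vertices[1:-1]
    # A cycle repeats only its endpoints: interior vertices are distinct
    # and none of them equals the endpoint.
    is_cycle = (is_circuit
                and len(set(interior)) == len(interior)
                and vertices[0] not in interior)
    return {"length": length, "trail": is_trail, "path": is_path,
            "closed": is_closed, "circuit": is_circuit, "cycle": is_cycle}


for walk in ["przw", "syytx", "ptxups", "prqsptxup", "tpuxt", "s"]:
    print(walk, classify_walk(walk))
```

Run on the walks listed for Figure 10.25, the function reports przw as a trail and a path, prqsptxup as a circuit that is not a cycle (the vertex p reappears in the middle), and tpuxt as a cycle.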
Sorting Out Notation
Figure 10.26. A graph with vertices labeled as subscripted v's. [Vertices v_1, v_2, v_3, v_4, v_5, v_6.]
The fact that the definition of a walk contains a list of subscripted v's, in numeric order, might be confusing when applied to a graph in which the vertices have been labeled using subscripted v's. Figure 10.26 shows such a graph. The definition should be interpreted as showing a general pattern. The subscripts there help to show the ordering of the elements in the walk and also help to determine the length of the walk. However, they are not to be taken too literally. In Figure 10.26, the walk v_4 v_5 v_3 v_1 v_2 has length 4. The walk v_4 v_5 v_3 v_1 v_2 v_6 v_4 is a cycle with length 6. The walk v_3 v_1 v_3 is a closed walk with length 2.
It is finally time to define connected.
DEFINITION 10.12 Connected
A graph G = {V, E, φ} is connected if |V| = 1 or if there is a walk in G between every pair of distinct vertices. The set V of vertices can always be partitioned into a disjoint union V = V_1 ∪ V_2 ∪ ... ∪ V_k such that v, w ∈ V are joined by a walk if and only if v, w ∈ V_j for a common j. The subgraphs induced by the vertex subsets V_j are the components of G. A connected graph has one component.
Components This graph has three components.
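Definition 10.12 translates directly into a procedure for finding components: start at a vertex, collect every vertex that can be reached by a walk, and repeat from any vertex not yet collected. The Python sketch below uses a breadth-first search for this. The function names and the six-vertex example (two triangles with no edge between them) are assumptions chosen for illustration; they are not the graph drawn in the text.

```python
from collections import deque


def components(vertices, edges):
    """Return the components of a graph as a list of vertex sets."""
    adjacency = {v: set() for v in vertices}
    for v, w in edges:
        adjacency[v].add(w)
        adjacency[w].add(v)
    seen = set()
    result = []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            v = queue.popleft()
            component.add(v)
            for w in adjacency[v] - seen:   # neighbors not yet reached
                seen.add(w)
                queue.append(w)
        result.append(component)
    return result


def is_connected(vertices, edges):
    """A graph is connected exactly when it has one component."""
    return len(components(vertices, edges)) == 1


# Hypothetical graph: triangle a-b-c and triangle d-e-f, with no edge joining them.
V = ["a", "b", "c", "d", "e", "f"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]
print(components(V, E))
print(is_connected(V, E))
```

A graph with a single vertex and no edges is reported as connected, in agreement with the |V| = 1 clause of the definition.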
 1: <?xml version="1.0"?>
 2: <letter>
 3:   <greeting>My dearest darling:</greeting>
 4:   <body>
 5:     <paragraph>
 6:       How I have missed you! The knowledge that I will see
 7:       you again is the only reason ...
 8:     </paragraph>
 9:     <paragraph>
10:       The good news is that my Discrete Mathematics class is
11:       really interesting. It is a lot of work, but I am doing well
12:     </paragraph>
13:   </body>
14:   <closing>Your faithful admirer,</closing>
15:   <signature>Name suppressed to protect the privacy of the author</signature>
16: </letter>
An XML document contains the following kinds of components.⁹
Processing instructions Line 1 of Example 11.18 contains a processing instruction. Its purpose is to inform any program that is reading the document which version of XML is being used. Processing instructions are not part of the tree that represents the document. You can copy this line exactly as the first line of any XML documents you create as homework.
Attributes The only attribute in this example is in line 1. It is version="1.0" and is part of the processing instruction. The string "version" is the attribute name, and the quoted string "1.0" is the attribute value. Attributes will be discussed in more detail on page 698.
Tags Tags are the mechanism used by XML (and HTML) to create the logical structure of the document. Tags come in pairs. The opening tag consists of a pair of angle brackets, < >, enclosing a string that identifies the element the tag is creating, and possibly some optional attributes. The closing tag¹⁰ consists of an opening angle bracket and a slash, </, followed by the string that identifies the element and a closing angle bracket, >. There are no attributes in closing tags. Lines 2 and 16 of Example 11.18 contain a pair of tags for the "letter" element. Line 3 contains a pair of tags for the "greeting" element. The enclosed string, "My dearest darling:", is not part of the tags.
⁹ Other components exist but are not necessary for this overview.
¹⁰ Opening and closing tags are also called start and end tags.
11.3 More Applications of Trees
Elements Elements are what give an XML document its logical structure. They are one of the two building blocks that make up the tree nodes in the tree representation. In Example 11.18, the elements are "letter," "body," "greeting," "paragraph," "closing," and "signature." Element names are case sensitive. Thus, "letter" and "Letter" are considered to be distinct elements. The element "letter" at line 2 is called the root element. It will be the root of the tree representation. Everything between the element's opening and closing tags is considered part of the element. An element can be composed of any mixture of other elements and character data. It can also be an empty element and have nothing between the two tags. In Example 11.18, the "letter" element is composed of the "greeting," "body," "closing," and "signature" elements. The "greeting" element is composed of character data.
Character data The strings that appear between matching tags are called character data. They are the second of the building blocks that become nodes in the associated tree. The string "My dearest darling:" on line 3 is an example, as are the longer strings between the "paragraph" elements. There are a few symbols that cannot be used as part of character data. They are the two angle brackets, < and >, the ampersand, &, and single and double quotes, ' and ". They cannot be used because they have special meanings in XML. For example, using an angle bracket, <, inside character data would be read as the start of a tag. The following entities are used in place of these symbols.
&lt;    <
&gt;    >
&amp;   &
&apos;  '
&quot;  "
The following XML fragment demonstrates the use of these entities.
If x &lt; 4,
then x squared &lt; 16.
On a Web browser that can display XML, this would appear as either
If x < 4, then x squared < 16.
or as
If x < 4,
then x squared < 16.
Well-Formed XML Documents There are some requirements about the manner in which these building blocks may be composed. One requirement is that element and attribute names are case sensitive. Another significant fact is that multiple adjacent whitespace characters¹¹ are usually treated as a single space. The other requirements are summarized in the next definition.
¹¹ See Section 9.4.2 for a brief description of whitespace characters.
Chapter 11 Trees
DEFINITION 11.19 Well-formed
An XML document is well-formed if the following are all true:
• There is a single root element.
• Each element has a matching opening and closing tag.
• All attribute values are in single or double quotes.
• Elements are properly nested.
The final requirement in the previous definition needs some additional explanation. An element can be nested inside another element. For example, in Example 11.18, there are two "paragraph" elements nested inside the "body" element. The indentation used in that example helps to highlight the nesting. Two elements are properly nested if both the opening and closing tags of the second element are entirely contained inside the two tags of the first element.
<first> <second> </second> </first>
Improper nesting occurs when the opening and closing tags are not both inside the other element's tags.
<first> <second> </first> </second>   INCORRECT!
Nesting The following fragment of an XML document shows proper nesting with parallel element and character data.
<paragraph> One of my favorite books is <book-title>Lord of the Rings</book-title> by <author>J. R. R. Tolkien</author> </paragraph>
This could also be formatted in the following manner (since whitespace is not significant).
<paragraph>
  One of my favorite books is
  <book-title>Lord of the Rings</book-title>
  by
  <author>J. R. R. Tolkien</author>
</paragraph>
This clearly shows that this "paragraph" is composed of a sequence of four items (two character data sections and two elements). The "book-title" and "author" elements consist of character data.
There are a number of software programs that can determine if an XML document is well formed. Perhaps the most accessible is Internet Explorer, versions 5 or higher. Saving Example 11.18 in a file (perhaps named "letter.xml") and opening the file with Internet Explorer produces the display in Figure 11.33. If the XML document had not been well formed, an error message would have been displayed instead.
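A web browser is not the only tool that can test whether a document is well formed. As a small illustration, the following Python sketch uses the standard library module xml.etree.ElementTree, whose parser raises ParseError when a document violates the requirements in Definition 11.19. The helper name `is_well_formed` and the sample strings are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the XML text parses without error, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


proper = "<first> <second> </second> </first>"
improper = "<first> <second> </first> </second>"
two_roots = "<first></first><second></second>"
unquoted = "<first size=3></first>"

for text in (proper, improper, two_roots, unquoted):
    print(text, is_well_formed(text))
```

Only the first string is accepted. The others fail, respectively, the proper nesting, single root element, and quoted attribute value requirements.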
Figure 11.33 Opening letter.xml in Internet Explorer.
The Tree Representation of an XML Document The nesting of elements leads directly to a tree representation for an XML document. The root element becomes the root of the tree. Because every element is composed of a sequence of other elements and character data, it is easy to determine the children of each element. Just take each item in the sequence and create a new child node.
A Subtree Example 11.19 contained a properly nested XML fragment. A subtree that represents that fragment is shown in Figure 11.34. The element nodes and character data nodes have been visually distinguished, but that is not necessary.
Figure 11.34 A tree for Example 11.19. [The "paragraph" node has four children: the character data "One of my favorite books is", the "book-title" element with child "Lord of the Rings", the character data "by", and the "author" element with child "J. R. R. Tolkien".]
The Tree Representation for Example 11.18 A tree for the XML document in Example 11.18 is shown in Figure 11.35.
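The rule "take each item in the sequence and create a new child node" can be carried out by a short recursive function. The following Python sketch relies on the standard library module xml.etree.ElementTree, which stores the character data before an element's first child in `text` and the character data that follows a child in that child's `tail`. The function name `to_tree` and the (name, children) tuple used to represent an element node are assumptions chosen for this illustration.

```python
import xml.etree.ElementTree as ET


def to_tree(element):
    """Return (element name, list of children); children are strings or nested tuples."""
    children = []
    if element.text and element.text.strip():
        children.append(element.text.strip())        # leading character data
    for child in element:
        children.append(to_tree(child))              # nested element
        if child.tail and child.tail.strip():
            children.append(child.tail.strip())      # character data after the child
    return (element.tag, children)


fragment = ("<paragraph> One of my favorite books is "
            "<book-title>Lord of the Rings</book-title> by "
            "<author>J. R. R. Tolkien</author> </paragraph>")
tree = to_tree(ET.fromstring(fragment))
print(tree)
```

For the paragraph fragment, the root node has four children: two character data strings and two element nodes, each of which has a single character data child.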
✓ Quick Check 11.8
1. Produce the tree representation for the following well-formed XML document.
<classlist>
  <course>Discrete Mathematics</course>
  <enrolled>2</enrolled>
  <student>
    <name>Jennifer Adams</name>
    <cohort>Sophomore</cohort>
    <major>Mathematics</major>
  </student>
  <student>
    <name>William Zembrot</name>
    <cohort>Junior</cohort>
    <major>Computer Science</major>
  </student>
</classlist>
Figure 11.35 A tree for Example 11.18. [The "letter" root has four children: "greeting" (My dearest darling:), "body" (with two "paragraph" children), "closing" (Your faithful admirer,), and "signature" (Name suppressed ...).]
Attributes The "classlist" element in Quick Check 11.8 contains two elements that appear once each ("course" and "enrolled"). The other elements are all of the same type ("student"). These elements differ in other ways as well. The "course" and "enrolled" elements contain information about the "classlist," whereas the "student" elements are what the "classlist" is composed from. The XML standard provides an alternate mechanism for recording information about an element: attributes. Attributes are listed inside the opening element tag. They appear in the following format:
AttributeName = "Attribute Value"
The attribute name must not contain any whitespace, but the attribute value may. The attribute value must be enclosed by either single or double quotes. The whitespace around the equal sign is optional.
A Revised Class List The classlist document in Quick Check 11.8 can be revised using attributes.
<classlist course = "Discrete Mathematics"
           enrolled = "2">
  <student>
    <name>Jennifer Adams</name>
    <cohort>Sophomore</cohort>
    <major>Mathematics</major>
  </student>
  <student>
    <name>William Zembrot</name>
    <cohort>Junior</cohort>
    <major>Computer Science</major>
  </student>
</classlist>
The attributes can be considered to be an integral part of the element. The tree representation (Figure 11.36) can show this by making the attributes part of the element's node.
Figure 11.36 A tree with attributes. [The root node holds classlist, course = "Discrete Mathematics", and enrolled = "2"; its children are the two "student" subtrees.]
The decision about whether some information related to an element should be represented as an attribute or as a nested element or as nested character data is complex. Part of the decision relates to how the document will be manipulated by the software it is intended to be used by. There are no rules that always apply. This introduction to XML is much too brief to worry about those issues.
Validation So far, XML appears to be a complicated way to create a document. What are the benefits? There are several. The first benefit is that XML can be used instead of the less flexible HTML standard for creating Web pages. A second advantage is that XML documents are ideal for sharing electronically, and more important, are designed so that software can automatically perform complex manipulations. For example, a program could print a report about the class list in Example 11.22 that sorted students by cohort and then by major. The report could also very easily count the number of students in each major or cohort. One of the other significant advantages is the ability to define in advance what a particular class of documents should look like. Documents that don't conform to the user-defined standard can be rejected. There are several mechanisms available to create such a standard. The two most common are Document Type Definitions (DTDs) and Schemas. A Schema is written as a well-formed XML document, so it can also be represented as a tree. DTDs use a format that does not translate into a tree. DTDs will be briefly presented here because they are a bit easier to describe and because they provide another application of simple regular expression notation.¹²
¹² See Section 9.4.
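The kind of automatic manipulation described under Validation takes only a few lines in most programming languages. As one possible sketch, the following Python code reads the attributes of a class list and counts the students in each major, using the standard library modules xml.etree.ElementTree and collections. The variable names are assumptions made for this illustration, and the document text is a compact copy of the revised class list.

```python
import xml.etree.ElementTree as ET
from collections import Counter

CLASSLIST = """
<classlist course="Discrete Mathematics" enrolled="2">
  <student>
    <name>Jennifer Adams</name>
    <cohort>Sophomore</cohort>
    <major>Mathematics</major>
  </student>
  <student>
    <name>William Zembrot</name>
    <cohort>Junior</cohort>
    <major>Computer Science</major>
  </student>
</classlist>
"""

root = ET.fromstring(CLASSLIST)
course = root.get("course")              # attribute values are read with get()
enrolled = int(root.get("enrolled"))
students = root.findall("student")       # the nested "student" elements
majors = Counter(student.findtext("major") for student in students)

print(course, enrolled)
for major, count in sorted(majors.items()):
    print(major, count)
```

Information stored in attributes is read from the element itself, while information stored in nested elements is found by searching the element's children.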
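The next example will provide a context to introduce some of the major kinds of DTD rules.
A Classlist DTD The following DTD defines the structure for class list XML documents. The line numbers are not part of the DTD, but are included to aid the discussion.
1: <!ELEMENT classlist (student)+>
2: <!ATTLIST classlist course CDATA #REQUIRED>
3: <!ATTLIST classlist enrolled CDATA #REQUIRED>
4: <!ELEMENT student (name, cohort, major)>
5: <!ELEMENT name (#PCDATA)>
6: <!ELEMENT cohort (#PCDATA)>
7: <!ELEMENT major (#PCDATA)>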
DTDs use a fairly rich set of rules. For the purposes of this introduction, only two kinds are discussed (and neither in its fullest form).
Element declarations An element's structure is defined by a declaration of the following form.
<!ELEMENT ElementName ConsistsOfRule>
The word "ELEMENT" that appears after the exclamation point must be all uppercase. The "ElementName" will be the name of the type of element being specified. The "ConsistsOfRule" portion uses regular expression notation to indicate exactly what other elements and character data are permitted to be nested inside the element. For instance, in line 1 of Example 11.23, the "classlist" element is composed of one or more "student" elements. Line 4 specifies that a "student" element consists of a "name" followed by a "cohort" followed by a "major." Those three elements must appear in the order listed. Line 5 specifies that a "name" consists of character data.¹³ The specification of what the element consists of will always be inside parentheses. It will typically use combinations of the following regular expression-inspired constructors.
Repeat factors The regular expression symbols "+" and "*" and "?" retain their meanings here (one or more, zero or more, zero or one, respectively). Notice the convention that if there is a single repeated element, the repeat factor is placed outside the parentheses, as was done in line 1 of Example 11.23.
Sequence Element names separated by commas must appear in the order specified. For example,
<!ELEMENT workday (meetings+, lunch, email?, meetings*)>
indicates that a "workday" consists of one or more meetings, followed by a single lunch, possibly followed by an e-mail session, and then perhaps some more meetings.
Alternative The character "|" is used to indicate that only one of a collection of elements will appear. For example,
<!ELEMENT wallcovering (paint | wallpaper | paneling)>
indicates that "wallcovering" will contain exactly one nested element. That element will be either a "paint" element or a "wallpaper" element or a "paneling" element. It is also possible to use "#PCDATA" in place of an element name. The one special requirement for "#PCDATA" is that it must be the first option in an alternative rule that ends with an asterisk, and no other rules may be nested inside. For example, there may be some predefined elements for the language used to write a document. Perhaps only the most common languages are listed with special elements. Less commonly used languages will just be written as character data. These constructors can be freely mixed (with the help of parentheses), as shown in the next rule.
… (paragraph+ | …
¹³ The DTD specification requires you to use the declaration "#PCDATA" to specify character data nested inside an element. It stands for "parsed character data." The distinction between this and the "CDATA" declaration in line 2 is best left for a more complete look at DTDs. You will have no trouble as long as you just mimic their use here. Note that both "#PCDATA" and "CDATA" must be all uppercase.
The rule indicates that a segment has an optional segment header and then either one or more paragraphs, or else one or more poem-with-interpretation pairs.
Attribute declarations Lines 2 and 3 of Example 11.23 show how attributes can be specified. The general form (for the purposes of this overview) looks like the following pattern.
<!ATTLIST OwningElementName AttributeName CDATA #REQUIRED>
The string "OwningElementName" is the name of the element that the attribute is attached to (e.g., "classlist"). The string "AttributeName" is the name of the attribute (e.g., "enrolled"). The remaining two items are two common choices from much more general categories of information. The "CDATA" string indicates that the value of the attribute will be character data. The string "#REQUIRED" indicates that this attribute is not optional. If the string "#IMPLIED" were used instead, the attribute could be left out of any particular tag of type "OwningElementName."
Quick Check 11.9
1. Create a DTD for letters, using the XML document in Example 11.18 as a concrete example of what a letter should look like.
Any program that reads an XML document for the purpose of manipulation or processing must use a component called a parser. A parser makes sense of the structure inherent in the document. If the document is not well formed, the parser will issue an error message, and the controlling program will be unable to complete its task. However, a well-formed document may still cause the task to fail. For example, if the program's task is to count the number of mathematics majors in a class list, it will have problems if some of the "student" elements don't contain a "major" element. The DTD in Example 11.23 contains all the information necessary to enable the parser to issue an error message if the XML document doesn't conform to the requirements in the DTD.¹⁴ A parser with this ability is called a validating parser.
14 Look at http://www.mathcs.bethel.edu/-gossett/DiscreteMathWithProof/ in the "Textbook-Related Links" section for a link to rxp, a free validating parser. It is run from a command line: rxp -V classlist.xml where "classlist.xml" may be replaced by some other XML document name. The DTD file will need to be in the same directory as the XML file.
One additional processing instruction is needed in the XML document in order to inform the parser how to find the associated DTD. The document in Example 11.22 can be modified to include this new kind of processing instruction.
Class List with DTD The class list XML document in Example 11.22 can be associated with the DTD in Example 11.23 by adding the processing instruction shown below at line 2. The strings "DOCTYPE" and "SYSTEM" must be all uppercase. Between them the name of the root element will be listed. At the end, in quotes, will be the name of the file that contains the DTD ("class.dtd" in this case).
[Listing, 14 numbered lines; only fragments survive: the attribute enrolled="2"; Jennifer Adams, Sophomore, <major>Mathematics; <student> William Zembrot, Junior, <major>Computer Science]
This completes the brief introduction to XML. There is much more to learn if you wish to become proficient. For example, one of the more advanced XML technologies, called DOM, is strongly tied to this chapter. DOM, which stands for the document object model, provides a collection of software components that support XML document processing. DOM components are used to parse an XML document and create a tree that represents the document. The DOM components provide mechanisms to traverse the tree and either gather information or modify the tree. You can even specify which kind of tree traversal you want to use.
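The parse-then-traverse role described for DOM can be sketched with Python's standard `xml.dom.minidom` module. This is only an illustration under stated assumptions, not the rxp parser mentioned earlier: the element names `name` and `year` are hypothetical (the text confirms only `classlist`, `student`, `major`, and the `enrolled` attribute), and minidom does not validate against a DTD, so no DOCTYPE line is included.

```python
from xml.dom import minidom

# Hypothetical class list; "name" and "year" are assumed element names.
DOCUMENT = """<classlist enrolled="2">
  <student><name>Jennifer Adams</name><year>Sophomore</year><major>Mathematics</major></student>
  <student><name>William Zembrot</name><year>Junior</year><major>Computer Science</major></student>
</classlist>"""


def preorder_tags(node):
    """Return the element names of the document tree in preorder."""
    tags = []
    if node.nodeType == node.ELEMENT_NODE:
        tags.append(node.tagName)
    for child in node.childNodes:  # text nodes have no children
        tags.extend(preorder_tags(child))
    return tags


def majors(doc):
    """Gather the character data nested inside every "major" element."""
    return [m.firstChild.data for m in doc.getElementsByTagName("major")]


if __name__ == "__main__":
    doc = minidom.parseString(DOCUMENT)  # the parser builds the tree
    print(preorder_tags(doc))
    print(majors(doc))
```

The traversal is written by hand here to make the tree structure visible; a postorder variant would simply append the tag after the loop instead of before it.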
11.3.4 Exercises The exercises marked with D have detailed solutions in Appendix G.
1. Let $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{a, b\}$, $\Delta = \{S, A, B, C\}$, and $\Pi = \{S \to aABb \mid C,\ A \to a,\ B \to b,\ C \to AabB\}$.
(a) D Produce two distinct parse trees for the string $aabb$ in $L(\mathcal{G})$.
(b) Produce all distinct parse trees for this grammar.
2. Consider the grammar in Example 11.13.
(a) Create a parse tree for the string $aaab$.
(b) Produce three different derivations that are consistent with the following parse tree.
[Figure: parse tree]
3. D A language is generated by the grammar $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{a, b, c\}$, $\Delta = \{S, X, Y\}$, and $\Pi = \{S \to aX \mid bY,\ Y \to bY \mid b,\ X \to aX \mid c\}$. Is the following parse tree consistent with this grammar? Give reasons for your answer.
[Figure: parse tree]
4. A language is generated by the grammar $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{a, b, c\}$, $\Delta = \{S, X, Y\}$, and $\Pi = \{S \to aXc \mid bY,\ Y \to bYX \mid c,\ X \to XaY \mid a\}$. Create a parse tree for the string $aaabcac$.
5. A language is generated by the grammar $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{a, b, c\}$, $\Delta = \{S, X, Y, Z\}$, and $\Pi = \{S \to Xb \mid YZ,\ Y \to bZ \mid b,\ X \to aZ \mid a,\ Z \to c\}$. Draw all parse trees for this grammar.
6. Let $\mathcal{G} = \{\Sigma, \Delta, S, \Pi\}$, where $\Sigma = \{a, b, c, (, ), +, *\}$, $\Delta = \{S, E, V\}$, and $\Pi$ contains the following productions.
$S \to E$
$E \to (E) \mid E + E \mid E * E \mid V$
$V \to a \mid b \mid c$
This is a simple grammar for algebraic expressions in the three variables a, b, and c and the operators + and *, with properly paired parentheses permitted.
(a) Create the parse tree corresponding to each of the following derivations. Note that "(" and ")" and "+" and "*" are each separate terminal symbols and will consequently appear as separate nodes in the parse tree.
i. $S \Rightarrow E \Rightarrow (E) \Rightarrow (E * E) \Rightarrow (V * E) \Rightarrow (a * E) \Rightarrow (a * V) \Rightarrow (a * c)$
ii. $S \Rightarrow E \Rightarrow E + E \Rightarrow V + E \Rightarrow a + E \Rightarrow a + (E) \Rightarrow a + (E + E) \Rightarrow a + (E + V) \Rightarrow a + (E + c) \Rightarrow a + (V + c) \Rightarrow a + (b + c)$
iii. $S \Rightarrow E \Rightarrow E * E \Rightarrow (E) * E \Rightarrow (E) * (E) \Rightarrow (E + E) * (E) \Rightarrow (E + E) * (E + E) \Rightarrow (V + E) * (E + E) \Rightarrow (a + E) * (E + E) \Rightarrow (a + V) * (E + E) \Rightarrow (a + b) * (E + E) \Rightarrow (a + b) * (V + E) \Rightarrow (a + b) * (a + E) \Rightarrow (a + b) * (a + V) \Rightarrow (a + b) * (a + c)$
(b) Show that the expression $a + b * c$ has two distinct parse trees.
(c) Do the expressions $(a + b) * c$ and $a + (b * c)$ have unique parse trees? Support your answer.
7. D Produce a Huffman encoding for the phrase "pied piper." Assume that the encoding alphabet consists of the letters d, e, i, p, r, and ␣.
8. Produce a Huffman encoding for the phrase "eager beaver." Assume that the encoding alphabet consists of the letters a, b, e, g, r, v, and ␣.
9. Produce a Huffman encoding for the phrase "bibbity bobbity boo." Assume that the encoding alphabet consists of the letters b, i, o, t, y, and ␣.
10. For each DNA string, answer the following questions.
• Produce the final labeled tree and the encoding table for a Huffman encoding of this string.
• How many bits are needed to encode this string using a constant width encoding with 2 bits per character? How many bits are needed for the Huffman encoding if the encoding table is not included in the file?
(a) D ACGTACGACA
(b) ACTGGTACCCAGTTAACCCG
(c) TATAACACATATTGTTGACTTACTTTATAACACATATTGTTGACTTACTT
11. I am currently using the following constant-width encoding to store certain characters as binary strings on a computer.
f   110
i   001
l   100
m   011
n   000
o   010
s   111
␣   101
I wish to store the string "millions of minions."
(a) Produce the final labeled Huffman tree and the table of Huffman encodings.
(b) How many bits will the constant-width encoding require to store the string?
(c) How many bits will the Huffman encoding require to store the string?
12. I am currently using the following constant-width encoding to store certain characters as binary strings on a computer.
a   000
d   001
e   010
i   011
m   100
n   101
o   110
␣   111
I wish to store the string "mamie minded momma."¹⁵
(a) Produce the final labeled Huffman tree and the table of Huffman encodings.
(b) How many bits will the constant-width encoding require to store the string?
(c) How many bits will the Huffman encoding require to store the string?
15 The original is from a bit of verse by Walt Kelly from the Pogo comic strip [49, p. 90]. The context is
O, Mamie minded momma
'Til one day in Singapore
A sailorman from Turkestan
Came knocking at the Door.
13. Produce the tree representation for the following well-formed XML document.
[XML listing; only character data survives: May 28, Yellow Majesty, Big Boy Cherry Delight]
14. D The "letter" DTD in Quick Check 11.9 could be improved by adding an element that indicates that the enclosed character data requires emphasis. This element might be used, for example, by a program that prints the document on a laser printer.
[XML fragment; only pieces survive: Dear mom and dad, ... I <emphasis>really ... need money! Please send ... soon! Otherwise ...]
Modify the letter DTD on page 721 by including this new element.
15. Consider the following DTD, which describes various modes of transportation. As usual, the line numbers are not part of the DTD.
[DTD listing, 10 numbered lines; surviving fragments: (land, ...; (machine-powered+, human-powered+, air-powered+)>; (description)>]
The XML document in Figure 11.37 fails to conform to the DTD. Determine all the ways in which it fails to follow the DTD, and then modify the document so that it conforms to the DTD while still keeping all the intended information. The line numbers are not part of the XML document.
[Figure 11.37 An invalid XML document. Listing of 31 numbered lines; surviving fragments: <machine-powered capacity="5">, <description>automobile, <power-source>gasoline engine, <description>bicycle, feet, <water>, <machine-powered>, <description>submarine, <power-source>nuclear reactor, <power-source>diesel engine, <description>sail boat, <description>canoe, <description>rowboat]
16. A computer program that allows the display and manipulation of graphs also has the ability to read and write the essential information as a disk file. The disk file uses XML to represent the graph. The displayed graph consists of a nonempty collection of vertices, edges, and labels (in any order). The element names are "vertex," "edge," and "label," respectively. Vertices have three attributes: a diameter (in points), an x-coordinate, and a y-coordinate. Edges have four attributes: the x- and y-coordinates of the two vertices at the endpoints of the edge.¹⁶ Labels have four attributes: a font, a (font) size (in points), and the x- and y-coordinates of the top left corner of the label. A label also must enclose some character data.
... parenthesized "ConsistsOfRule" by the string "EMPTY" (all uppercase). The vertex rule will be written in the following manner.
<!ELEMENT vertex EMPTY>
(a) Construct the tree representation for the following well-formed XML document.
11.6.6 Projects
Mathematics
1. Write a brief expository paper on backtracking algorithms. Provide one or two detailed examples that are different from any backtracking algorithms found in this book.
2. Write a brief expository paper on the connections between Cayley's formula and chemistry.
3. Write a paper illustrating the use of trees in decision theory. Discuss the role of Bayes's theorem.
4. Write a brief expository paper on Fibonacci trees. Define them, and prove a theorem about the number of vertices in the nth Fibonacci tree. What other theorems can you find or prove?
Computer Science
1. Write a program that uses a tree to convert infix expressions to postfix.
2. Write a program which implements the Huffman compression algorithm. The program should analyze the text to calculate the character frequencies before building any trees. Assume that only the 256 standard ASCII characters are permitted. The encoding tree will need to be stored with the compressed data in order to allow the message to be uncompressed.
3. Write a brief expository paper to compare and contrast DTDs and schemas.
4. Write a brief expository paper on the use of parse trees in compilers.
11.6.7 Solutions to Sample Exam Questions
1. This is Theorem 11.2. The proof is quite simple and elegant. Observe that every node except the root has exactly one adjacent edge on the unique path joining the node to the root. Every edge has two incident nodes. Identify the edge with the node that is farther from the root (determined by the respective levels). There is a one-to-one association between edges in the tree and nonroot nodes. Since there are $n - 1$ nodes that are not the root, there must be $n - 1$ edges.
2. (a) There are $m^j$ vertices at level $j$.
(b) $n = \frac{m^{h+1} - 1}{m - 1}$
(c) At level 0 there is $m^0 = 1$ node. At the next level there will be $m^1 = m$ nodes, since every interior node in a full tree has $m$ children. At level 2 there will be $m^2$ nodes since there are $m$ nodes at level 1, each having $m$ children. In general, there will be $m^j$ nodes at level $j$. The total number of nodes will be $\sum_{j=0}^{h} m^j = \frac{m^{h+1} - 1}{m - 1}$.
3. The search tree that results is shown.
[Figure: search tree]
4. Preorder: n h k a e t z y v s w x
In-order: a k e h t z n s v y x w
Postorder: a e k z t h s v x w y n
5. The corresponding binary tree is shown.
[Figure: binary expression tree]
A postorder traversal yields 4 2 ^ 5 7 * +.
6. The character frequencies in the message are listed in the following table.
Character   Frequency
e           3
f           1
o           1
r           4
␣           1
(a) The steps of the Huffman algorithm are shown below, without commentary.
[Figure: Steps 1 through 6 of the Huffman algorithm, each showing the queue from Front to Back]
The character encodings are shown in the next table.
Character   Code
e           10
f           1110
o           1111
r           0
␣           110
(b) The original encoding takes $10 \cdot 3 = 30$ bits. The new encoding takes $3 \cdot 2 + 4 \cdot 1 + 1 \cdot 4 + 1 \cdot 4 + 1 \cdot 3 = 21$ bits (a 30% savings).
7. A minimal spanning tree is shown. It can be found with either Prim's algorithm or Kruskal's algorithm. There are multiple correct solutions, but all share the weight 2, 3, and 4 edges in common. They differ only in which weight 5 edge is chosen.
[Figure: minimal spanning tree]
The total weight is 35.
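The bit counts in solution 6(b) can be checked mechanically. The sketch below is a minimal illustration, not the book's queue-based procedure: it swaps in a binary heap (Python's `heapq`) to pick the two lightest trees at each step, and it tracks only code lengths rather than the codes themselves. The function name `huffman_code_lengths` and the frequency table (e 3, f 1, o 1, r 4, space 1, read off the solution) are assumptions made for this illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(freq):
    """Return {symbol: code length} for a Huffman code built from freq.

    Each heap entry is (weight, tie_breaker, {symbol: depth}). The unique
    tie_breaker keeps tuple comparison from ever reaching the dict.
    A one-symbol alphabet yields length 0.
    """
    tie = count()
    heap = [(w, next(tie), {sym: 0}) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)  # lightest tree
        w2, _, d2 = heapq.heappop(heap)  # second lightest tree
        # Merging puts every symbol of both trees one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


if __name__ == "__main__":
    freq = {"e": 3, "f": 1, "o": 1, "r": 4, " ": 1}
    lengths = huffman_code_lengths(freq)
    huffman_bits = sum(freq[s] * lengths[s] for s in freq)
    fixed_bits = sum(freq.values()) * 3  # 3 bits per character
    print(lengths, huffman_bits, fixed_bits)
```

Ties among equal weights may be broken differently than in the book's figures, so individual code lengths can be assigned to different symbols of equal frequency, but the total number of bits for an optimal code does not depend on those choices.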
CHAPTER 12
Functions, Relations, Databases, and Circuits
This chapter begins with the familiar, but fundamental, concept of function and develops that concept in two directions. One direction is the introduction of a generalization of a function, called a relation. Relations with some additional properties are examined in the section on equivalence relations. An important application of relations is briefly introduced in the section about relational databases. The other direction that is developed is the notion of Boolean functions. This notion is then applied to the construction of logic circuits in digital electronics.
12.1 Functions and Relations This section will begin with a definition of function that can be naturally extended to the more general notion of relation. Basic properties and associated definitions will also be presented in each case.
12.1.1 Functions The process of stating, and then refining, the definition of a function can be traced back¹ to Leonhard Euler [22]. Euler's mathematics textbook Introductio in analysin infinitorum, published in 1748, shaped what we would now call the "precalculus" curriculum. Euler defined a function in this (no longer used) manner: A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities [26]. This obsolete definition emphasizes the algebraic expression rather than the ideas of mapping or association. However, it implicitly contains the notion of an algebraic expression that can transform some variable into an associated value. Euler's textbook introduced the standard collection of functions that you have studied in previous courses: polynomials, trigonometric functions, exponential functions,² and logarithms. The modern definition of a function did not finally appear until it was formulated by Peter Dirichlet in 1837: "y is a function of x when to each value of x in a given interval there corresponds a unique value of y" [50, p. 950].
1 For a brief but more extensive view of this development, see "Where Do Functions Come From?" by Leigh Atkinson [2].
2 Euler provided the transition from log tables, which are not used much at present, to log functions, which are essential in much of modern mathematics. Euler was also the originator of the natural log function, $\ln(x)$, and its associated exponential function, $e^x$.
Two essentially equivalent definitions are most often used at present. Both are restatements of Dirichlet's definition. The first defines a function to be a mapping from a domain to a range for which no element in the domain is mapped to more than one element of the range. The second definition, presented next, deemphasizes the mapping idea and highlights the association between elements in two sets. It might be helpful to review the definition of Cartesian product on page 20.
DEFINITION 12.1 Function; Domain; Range A function from the nonempty set $\mathcal{D}$ into the nonempty set $\mathcal{R}$ is a subset, $\mathcal{F}$, of the Cartesian product $\mathcal{D} \times \mathcal{R}$ such that every element of $\mathcal{D}$ appears in one and only one ordered pair in $\mathcal{F}$. The set, $\mathcal{D}$, is called the domain and the set, $\mathcal{R}$, is called the range. The image of the function is the subset of $\mathcal{R}$ consisting of elements that actually appear in the right-hand side of at least one ordered pair in $\mathcal{F}$.
The assertion "every element of $\mathcal{D}$ appears in one and only one ordered pair in $\mathcal{F}$" means (by the definition of the Cartesian product $\mathcal{D} \times \mathcal{R}$) that no two ordered pairs in $\mathcal{F}$ have the same first element. It also implies that every element of $\mathcal{D}$ appears in some ordered pair in $\mathcal{F}$.³ This may seem to be a completely different idea from the mapping version of a function. However, the similarities in the definitions are intuitively compelling. For example, to move from the mapping version to the set version, think about building a (possibly infinite) table similar to those used to graph functions by hand.
$x$       $f(x)$
$x_1$     $f(x_1)$
$x_2$     $f(x_2)$
It is easy to see the corresponding set of ordered pairs: $\{(x_1, f(x_1)), (x_2, f(x_2)), \ldots\}$.
Definition 12.1 does not require the specification of an algebraic expression or an algorithm for producing the association of elements in $\mathcal{D}$ and $\mathcal{R}$. If such an algebraic expression, $f$, is known, it is easy to see that the subset, $\mathcal{F}$, of ordered pairs can be viewed as the result of mapping elements of $\mathcal{D}$ to elements of $\mathcal{R}$ using the expression $f$. Even if such an expression is not known, there is an implicit "mapping" available: Associate each element of $\mathcal{D}$ with the element of $\mathcal{R}$ with which it appears in an ordered pair of $\mathcal{F}$. There are some additional definitions that will allow more precise discussions about functions.
DEFINITION 12.2 Onto and One-to-One Functions Let $\mathcal{F}$ be a function from $\mathcal{D}$ into $\mathcal{R}$. Then $\mathcal{F}$ is called onto if every element of $\mathcal{R}$ appears as a second coordinate in at least one ordered pair in $\mathcal{F}$. If no element of $\mathcal{R}$ appears as a second coordinate in more than one ordered pair in $\mathcal{F}$, then $\mathcal{F}$ is called one-to-one (also abbreviated as 1-1).ᵃ
a Some more advanced texts refer to onto functions as surjective functions and one-to-one functions as injective functions. Functions that are both one-to-one and onto are referred to as bijective functions. This terminology is awkward and will not be used here.
3 Note that changing the domain also changes the function. Thus, using the mapping view of functions for the moment, $f(x) = x$ with domain $\mathbb{Z}$ is not the same function as $f(x) = x$ with domain $\mathbb{R}$. The second function is defined at $\pi$, whereas the first is not.
Note that it is possible for a function to be one-to-one without being onto ($\{(x, e^x)\} \subseteq \mathbb{R} \times \mathbb{R}$), and to be onto without being one-to-one ($\{(x, x(x - 2)(x + 2))\} \subseteq \mathbb{R} \times \mathbb{R}$). It is also possible to have both properties ($\{(x, x)\} \subseteq \mathbb{R} \times \mathbb{R}$) or neither property ($\{(x, x^2)\} \subseteq \mathbb{R} \times \mathbb{R}$).
EXAMPLE 12.1 The Sine Function Most of the functions you have encountered prior to this course were functions with an infinite domain and an infinite range. For instance, the familiar sine function, $\sin(x)$, has the entire set of real numbers as its domain and as its range, but the image is the closed interval $[-1, 1]$. The function is neither onto nor one-to-one. It is not onto because the image is a proper subset of the range. It is not one-to-one because the set of ordered pairs $\{(n\pi, 0) \mid n \in \mathbb{Z}\}$ is a subset of the sine function.⁴ For the purposes of this textbook, the alternate, mapping-oriented terminology is acceptable as an informal mode for describing functions. Thus, the sine function could be informally described as a mapping that "sends" radian angles to real numbers in $[-1, 1]$. For a particular radian measure, $x$, we would denote the value it is mapped to as $\sin(x)$. More formally, the sine function can be described by defining $\sin(x)$ to be the y-coordinate of the point, $(x, \sin(x))$, determined by the standard wrapping function from trigonometry.⁵ The sine function is then defined by $\sin = \{(x, \sin(x)) \mid x \in \mathbb{R}\}$.
Many functions of interest in a discrete mathematics course have a finite or countably infinite domain and a finite or countably infinite range. The next example describes some functions with a finite domain and range.
EXAMPLE 12.2 Marriage Assignments The six possible marriage assignments in Example 1.1 on page 4 all represent one-to-one, onto functions with domain $\mathcal{D} = \{A, B, C\}$ and range $\mathcal{R} = \{X, Y, Z\}$. The assignment labeled "female 1st choice" is associated with the function $\mathcal{F}_1 = \{(A, Y), (B, Z), (C, X)\}$.
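For finite sets, Definitions 12.1 and 12.2 can be tested directly on a set of ordered pairs. The sketch below is an illustration only; the helper names (`is_function`, `is_onto`, `is_one_to_one`) are assumptions, and the second relation, `CONSTANT`, is a made-up contrast case that does not come from the text.

```python
def is_function(pairs, domain, codomain):
    """Definition 12.1: a subset of domain x codomain in which every
    domain element appears in one and only one ordered pair."""
    if any(x not in domain or y not in codomain for x, y in pairs):
        return False
    firsts = [x for x, _ in pairs]
    return all(firsts.count(x) == 1 for x in domain)


def is_onto(pairs, codomain):
    """Every codomain element is a second coordinate at least once."""
    return {y for _, y in pairs} == set(codomain)


def is_one_to_one(pairs):
    """No second coordinate appears in more than one ordered pair."""
    seconds = [y for _, y in pairs]
    return len(seconds) == len(set(seconds))


D = {"A", "B", "C"}
R = {"X", "Y", "Z"}

# The "female 1st choice" assignment from the marriage example.
F1 = {("A", "Y"), ("B", "Z"), ("C", "X")}

# Hypothetical contrast case: everyone is mapped to X.
CONSTANT = {("A", "X"), ("B", "X"), ("C", "X")}

if __name__ == "__main__":
    for name, rel in (("F1", F1), ("CONSTANT", CONSTANT)):
        print(name, is_function(rel, D, R), is_onto(rel, R), is_one_to_one(rel))
```

The same two property checks apply unchanged to relations that are not functions, since they look only at second coordinates.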
Quick Check 12.1
1. Theorem 3.6 on page 119 implicitly defines a function, $\mathcal{S}$, with a countably infinite domain and range. Use the formal definition of function to define explicitly the function, its domain, and its range. Also, indicate whether the function is one-to-one and if it is onto.
2. Formally describe the parity function, $\mathcal{P}$, corresponding to the informal notion that an integer is either even or odd. Describe the function and its domain and range, and then determine whether the function is one-to-one and if it is onto.
A Word about Notation Up to this point, the exposition about functions has mostly used an uppercase, script font for the name of a function. (The sine function is too steeped in tradition to use anything other than "sin" as its name.) This is the same as the font used to represent the domain and range. The font choice emphasizes their commonality as sets.
4 It may seem strange to say that a set of ordered pairs is a subset of the sine function, but this is the proper terminology when using Definition 12.1.
5 That is, start at the point (1, 0) and move counterclockwise around the unit circle until a distance of x radians has been traveled. The y-coordinate of the final point is then labeled as $\sin(x)$.
Once the nature of functions as "sets of ordered pairs" is understood, it is more in keeping with standard notational practice to use lowercase letters (such as $f$) to represent functions. It is then easy to move from the more formal notation to the standard notation. For example, if $f = \{(x, x^2)\} \subseteq \mathbb{R} \times \mathbb{R}$, then we can write $f(2) = 4$, or more generally, $f(x) = x^2$.
Functions that are both one-to-one and onto have a naturally associated partner function, called the inverse function. If the function is informally denoted as $f$, the inverse function is often denoted as $f^{-1}$. Familiar examples are sin (with a suitably restricted domain, such as $[-1, 1]$) and its inverse, $\sin^{-1}$ (also denoted by "arcsin"), and the pair of inverse functions whose second coordinates are denoted by $e^x$ and $\ln(x)$. Informally, we say that functions $f$ and $g$ are inverses if the following conditions are met:
• The domain, $\mathcal{D}_f$, of $f$ is the same as the range, $\mathcal{R}_g$, of $g$.
• The domain, $\mathcal{D}_g$, of $g$ is the same as the range, $\mathcal{R}_f$, of $f$.
• $\forall x \in \mathcal{D}_f,\ g(f(x)) = x$.
• $\forall y \in \mathcal{D}_g,\ f(g(y)) = y$.
This can be expressed formally by the following definition.
DEFINITION 12.3 Inverse Function Let $\mathcal{F}$ be a one-to-one and onto function with domain $\mathcal{D}_{\mathcal{F}}$ and range $\mathcal{R}_{\mathcal{F}}$. A function, $\mathcal{G}$, whose domain is $\mathcal{R}_{\mathcal{F}}$ and whose range is $\mathcal{D}_{\mathcal{F}}$ is called the inverse of $\mathcal{F}$ if the following conditions hold:
• If $(x, y) \in \mathcal{F}$, then $(y, x) \in \mathcal{G}$.
• If $(y, x) \in \mathcal{G}$, then $(x, y) \in \mathcal{F}$.
It is easy to show that $\mathcal{G}$ exists and is unique as long as $\mathcal{F}$ is one-to-one and onto. $\mathcal{G}$ is also one-to-one and onto. (See Exercise 16 in Exercises 12.1.3.) The function $\mathcal{F}_1 = \{(A, Y), (B, Z), (C, X)\}$, from Example 12.2, has inverse function $\mathcal{G}_1 = \{(Y, A), (Z, B), (X, C)\}$. The ordered pairs in $\mathcal{F}_1$ indicate the spouse for each female, whereas the ordered pairs in $\mathcal{G}_1$ indicate the spouse for each male.
The familiar notion of composition of functions can also be described using the set-oriented definition of function. Recall that if $f$ is informally defined as a function that maps elements, $x$, from $X$ to $Y$, and if $g$ is a function that maps elements, $y$, from $Y$ to $Z$, then $g \circ f$ is the function that maps elements from $X$ to $Z$ using the rule $(g \circ f)(x) = g(f(x))$.
DEFINITION 12.4 Composition of Functions Let $\mathcal{F}$ be a function whose domain is $X$ and whose range is $Y$. Let $\mathcal{G}$ be a function whose domain is $Y$ and whose range is $Z$. The composition of $\mathcal{G}$ and $\mathcal{F}$ is denoted by $\mathcal{G} \circ \mathcal{F}$ and is defined by
$\mathcal{G} \circ \mathcal{F} = \{(x, z) \mid \exists y \in Y \text{ with } (x, y) \in \mathcal{F} \text{ and } (y, z) \in \mathcal{G}\}$.
EXAMPLE 12.3 A Simple Composition Let $\mathcal{F} = \{(1, 2), (2, 4), (3, 6), (4, 8), (5, 4)\}$ and $\mathcal{G} = \{(2, -6), (4, -12), (6, -24), (8, -24), (10, -100)\}$. Then $\mathcal{G} \circ \mathcal{F} = \{(1, -6), (2, -12), (3, -24), (4, -24), (5, -12)\}$.
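The set-based definition of composition translates almost word for word into a set comprehension. The following is a minimal sketch using the pairs from the composition example, under the assumption that the second function contains the pair (6, -24); the name `compose` is chosen here for illustration.

```python
def compose(g, f):
    """Return g o f = {(x, z) : some y has (x, y) in f and (y, z) in g}."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}


F = {(1, 2), (2, 4), (3, 6), (4, 8), (5, 4)}
# Assumption: G includes (6, -24), so that G is defined on every value F takes.
G = {(2, -6), (4, -12), (6, -24), (8, -24), (10, -100)}

GF = compose(G, F)

if __name__ == "__main__":
    print(sorted(GF))
    # Because GF is a function, it can be used as a lookup table.
    print(dict(GF)[3])
```

Note that the argument order of `compose(G, F)` mirrors the notation: the function written on the right is applied first.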
12.1.2 Relations Functions have been defined as subsets of a Cartesian product for which there is no duplication in the first elements of the ordered pairs. A relation is also a subset of a Cartesian product. However, there are no restrictions.
DEFINITION 12.5 Relation A relation between the set $\mathcal{A}$ and the set $\mathcal{B}$ is a subset, $\mathcal{R}$, of the Cartesian product $\mathcal{A} \times \mathcal{B}$. If $(a, b) \in \mathcal{R}$, it is common to write $a\mathcal{R}b$ and to say that $a$ is related to $b$. If $\mathcal{A} = \mathcal{B}$, the relation is said to be a relation on $\mathcal{A}$. The set $\mathcal{A}$ is called the domain of the relation and the set $\mathcal{B}$ is called the range.
The definitions of one-to-one and onto do not change.
DEFINITION 12.6 Onto and One-to-One Relations Let $\mathcal{R}$ be a relation between the sets $\mathcal{A}$ and $\mathcal{B}$. Then $\mathcal{R}$ is called onto if every element of $\mathcal{B}$ appears as a second coordinate in at least one ordered pair in $\mathcal{R}$. If no element of $\mathcal{B}$ appears as second coordinate in more than one ordered pair in $\mathcal{R}$, then $\mathcal{R}$ is called one-to-one.
Why would an unrestricted subset of a Cartesian product be of interest, and how is it related to the notion of a function? Perhaps a simple example will begin to answer these questions.
EXAMPLE 12.4 A Relation between Large Discrete Sets Let $A$ be the set of all people who were alive at some point between 1000 A.D. and 2000 A.D., and let $B$ represent the set of all roles that might characterize a person at some time in life during that millennium. It is easy to imagine a relation, $\mathcal{R}$, between $A$ and $B$ that correctly captures all the roles each person assumed. Some entries might be as follows:
• (Shirley Temple Black, daughter), (Shirley Temple Black, child actor), (Shirley Temple Black, wife), (Shirley Temple Black, mother), (Shirley Temple Black, U.S. Ambassador), ...
• (Hayao Miyazaki, son), (Hayao Miyazaki, animator), (Hayao Miyazaki, husband), (Hayao Miyazaki, father), ...
• (Leonhard Euler, son), (Leonhard Euler, mathematician), (Leonhard Euler, teacher), (Leonhard Euler, husband), (Leonhard Euler, father), ...
• and so on
The relation $\mathcal{R}$ is clearly not a function, since Shirley Temple Black appears as first coordinate in multiple ordered pairs. It is also not one-to-one, since Leonhard Euler and Hayao Miyazaki are both associated with the role of father. The relation is onto if, and only if, every role in $B$ has at least one person associated with it. Although the uniqueness of second coordinates has been lost, this relation clearly retains some of the flavor of a function. The ordered pairs serve to associate the first coordinates with their (multiple) second coordinates.
Many functions do not have an inverse; every relation does.
DEFINITION 12.7 Inverse Relation Let $\mathcal{R}$ be a relation between the sets $\mathcal{A}$ and $\mathcal{B}$. The inverse relation of $\mathcal{R}$ is denoted $\mathcal{R}^{-1}$ and is a subset of the Cartesian product $\mathcal{B} \times \mathcal{A}$. More precisely, $\mathcal{R}^{-1} = \{(b, a) \in \mathcal{B} \times \mathcal{A} \mid (a, b) \in \mathcal{R}\}$.
If $\mathcal{R}$ is the relation defined in Example 12.4, then $\mathcal{R}^{-1}$ contains (among many other ordered pairs): (daughter, Shirley Temple Black), (son, Hayao Miyazaki), (son, Leonhard Euler), (U.S. Ambassador, Shirley Temple Black), (animator, Hayao Miyazaki), and (mathematician, Leonhard Euler).
Composition of relations will be used extensively in Section 12.2.
DEFINITION 12.8 Composition of Relations Let $\mathcal{R}$ be a relation whose domain is $\mathcal{A}$ and whose image is $\mathcal{B}$. Let $\mathcal{S}$ be a relation whose domain contains $\mathcal{B}$ and whose range is $\mathcal{C}$. The composition of $\mathcal{S}$ and $\mathcal{R}$ is a subset of $\mathcal{A} \times \mathcal{C}$. It is denoted by $\mathcal{S} \circ \mathcal{R}$ and is defined as $\mathcal{S} \circ \mathcal{R} = \{(a, c) \mid \exists b \in \mathcal{B} \text{ with } (a, b) \in \mathcal{R} \text{ and } (b, c) \in \mathcal{S}\}$.
The range of $\mathcal{R}$ need not be contained in the domain of $\mathcal{S}$. All that is required is $\text{Image}(\mathcal{R}) \subseteq \text{Domain}(\mathcal{S})$. The definition can be intuitively visualized by placing ordered pairs next to each other, with elements of $\mathcal{R}$ on the left and elements of $\mathcal{S}$ on the right: $(a, b)\ (b, c) \to (a, c)$.⁶ However, the notation, $\mathcal{S} \circ \mathcal{R}$, reverses this ordering.
EXAMPLE 12.5 A Small Relation Let $A = \{1, 2, 3, 4\}$. Define a relation, $\mathcal{R}$, on $A$ by $x\mathcal{R}y$ if and only if $x \mid y$.⁷ Define a second relation, $\mathcal{S}$, on $A$ by $x\mathcal{S}y$ if and only if $x$ and $y$ have the same parity and $x \neq y$. Then
$\mathcal{R} = \{(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 4), (3, 3), (4, 4)\}$
$\mathcal{S} = \{(1, 3), (2, 4), (3, 1), (4, 2)\}$
$\mathcal{S} \circ \mathcal{R} = \{(1, 3), (1, 4), (1, 1), (1, 2), (2, 4), (2, 2), (3, 1), (4, 2)\}$
$\mathcal{R} \circ \mathcal{S} = \{(1, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4), (4, 2), (4, 4)\}$.
The composition $\mathcal{S} \circ \mathcal{R}$ can be determined systematically by starting with the ordered pairs in $\mathcal{R}$ and finding all ordered pairs in $\mathcal{S}$ whose first coordinate matches the second coordinate of the ordered pair in $\mathcal{R}$: $(a, b)\ (b, c) \to (a, c)$. For instance, the ordered pair (1, 2) in $\mathcal{R}$ can be matched with the ordered pair (2, 4) in $\mathcal{S}$. The ordered pair (1, 4) is therefore a member of $\mathcal{S} \circ \mathcal{R}$ (it appears as the second element of $\mathcal{S} \circ \mathcal{R}$ in the preceding list).
6 One way to keep this straight is to read "$\mathcal{S} \circ \mathcal{R}$" as "$\mathcal{S}$ follows $\mathcal{R}$." That is, first apply the association in $\mathcal{R}$ ($a \to b$), and then apply the association in $\mathcal{S}$ ($b \to c$).
7 See Definition 3.7 on page 96 for a refresher on "x divides y."
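The matching procedure just described, together with the inverse relation of Definition 12.7, can be carried out mechanically. The sketch below builds the two relations of the small example from their defining conditions (divisibility, and same parity with distinct elements) rather than listing the pairs by hand; the helper names `compose` and `inverse` are assumptions made for this illustration.

```python
A = {1, 2, 3, 4}

# x R y if and only if x divides y.
R = {(x, y) for x in A for y in A if y % x == 0}

# x S y if and only if x and y have the same parity and x != y.
S = {(x, y) for x in A for y in A if x != y and x % 2 == y % 2}


def compose(s, r):
    """s o r: match each (a, b) in r with every (b, c) in s to get (a, c)."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def inverse(r):
    """Swap the coordinates of every ordered pair."""
    return {(b, a) for (a, b) in r}


if __name__ == "__main__":
    print(sorted(compose(S, R)))     # S follows R
    print(sorted(compose(R, S)))     # R follows S
    print(compose(S, R) == compose(R, S))
    print(sorted(inverse(R)))        # "is a multiple of"
```

Running both orders side by side makes it visible that composition of relations depends on which relation is applied first.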
Quick Check 12.2
Let $\mathcal{A} = \{1, 2, 3\}$, $\mathcal{B} = \{a, b, c, d\}$, and $\mathcal{C} = \{x, y, z\}$.
1. Let $\mathcal{R} = \{(1, a), (2, b), (2, c), (3, c)\}$ be a relation in $\mathcal{A} \times \mathcal{B}$. What is the inverse relation, $\mathcal{R}^{-1}$?
2. If $\mathcal{S} = \{(a, x), (a, z), (b, z), (c, y), (d, y)\}$ is a relation in $\mathcal{B} \times \mathcal{C}$, what is $\mathcal{S} \circ \mathcal{R}$?
The next theorem indicates a simple algebraic property of composition.
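Composition of Relations Is Associative Let $\mathcal{R}$ be a relation whose domain is $\mathcal{A}$ and whose image is $\mathcal{B}$. Let $\mathcal{S}$ be a relation whose domain contains $\mathcal{B}$ and whose image is $\mathcal{C}$. Finally, let $\mathcal{T}$ be a relation whose domain contains $\mathcal{C}$ and whose range is $\mathcal{D}$. Then $(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R} = \mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$. A simple consequence of this theorem is that the notation $\mathcal{T} \circ \mathcal{S} \circ \mathcal{R}$ is unambiguous.
Proof: This theorem is claiming the equality of two sets. Before choosing a proof strategy, it might be helpful to see if the two sets are subsets of the same Cartesian product. If they are not, then the claim in the theorem must be false. The set $\mathcal{T} \circ \mathcal{S}$ is a subset of $\text{Domain}(\mathcal{S}) \times \mathcal{D}$, so $(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R}$ is a subset of $\mathcal{A} \times \mathcal{D}$. In addition, the set $\mathcal{S} \circ \mathcal{R}$ is a subset of $\mathcal{A} \times \mathcal{C}$, so $\mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$ is a subset of $\mathcal{A} \times \mathcal{D}$. Consequently, the claim is viable. Section 2.1.3 provides some proof strategy options. A natural strategy for this theorem is to show that each set is a subset of the other.
Part 1: $(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R} \subseteq \mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$ It is useful to recall the previously determined Cartesian products:
$\mathcal{T} \circ \mathcal{S} \subseteq \text{Domain}(\mathcal{S}) \times \mathcal{D}$          $\mathcal{S} \circ \mathcal{R} \subseteq \mathcal{A} \times \mathcal{C}$
$(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R} \subseteq \mathcal{A} \times \mathcal{D}$          $\mathcal{T} \circ (\mathcal{S} \circ \mathcal{R}) \subseteq \mathcal{A} \times \mathcal{D}$.
Let $(a, d) \in (\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R}$. If $(a, d)$ is also an element in $\mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$, then the claim⁸ will be verified since $(a, d)$ represents any element in the left-hand set. Since $(a, d) \in (\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R}$, there must be⁹ an element $b \in \mathcal{B}$ such that $(a, b) \in \mathcal{R}$ and $(b, d) \in \mathcal{T} \circ \mathcal{S}$. But if $(b, d) \in \mathcal{T} \circ \mathcal{S}$, then there exists an element $c \in \mathcal{C}$ such that $(b, c) \in \mathcal{S}$ and $(c, d) \in \mathcal{T}$. Consequently, since $(a, b) \in \mathcal{R}$ and $(b, c) \in \mathcal{S}$, it is clear that¹⁰ $(a, c) \in \mathcal{S} \circ \mathcal{R}$. Since $(a, c) \in \mathcal{S} \circ \mathcal{R}$ and $(c, d) \in \mathcal{T}$, it follows that $(a, d) \in \mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$ is true. This completes the verification that $(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R} \subseteq \mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$.
Part 2: $\mathcal{T} \circ (\mathcal{S} \circ \mathcal{R}) \subseteq (\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R}$ See Exercise 18 in Exercises 12.1.3.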
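A computer check is no substitute for the proof, but it can build confidence in the statement before working through Part 2. The sketch below compares both groupings on randomly generated relations over small sets; the sets, the inclusion probability `p`, and all function names are hypothetical choices made for this illustration.

```python
import random
from itertools import product


def compose(s, r):
    """s o r = {(a, c) : some b has (a, b) in r and (b, c) in s}."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def random_relation(rng, left, right, p=0.4):
    """Keep each pair of left x right independently with probability p."""
    return {pair for pair in product(left, right) if rng.random() < p}


def associativity_holds(trials=200, seed=0):
    """Compare (T o S) o R with T o (S o R) on random small relations."""
    rng = random.Random(seed)
    A, B, C, D = range(3), "xyz", range(10, 13), "pq"
    for _ in range(trials):
        R = random_relation(rng, A, B)
        S = random_relation(rng, B, C)
        T = random_relation(rng, C, D)
        if compose(compose(T, S), R) != compose(T, compose(S, R)):
            return False
    return True


if __name__ == "__main__":
    print(associativity_holds())
```

A failed comparison would expose a counterexample; agreement on every trial only shows that none was found among the relations sampled.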
8 See Section 2.1.3 if you need a quick review on this proof strategy.
9 See Definition 12.8 (concept → properties) with $\mathcal{T} \circ \mathcal{S}$ in the role of the definition's $\mathcal{S}$.
10 Definition 12.8 (properties → concept).
12.1.3 Exercises The exercises marked with D have detailed solutions in Appendix G.
1. Which of the following sets of ordered pairs represent functions? (Recall that $\mathbb{Z}^+$ is the set of positive integers.)
(a) D $\{(1, 1), (2, 4), (3, 3), (3, 4), (4, 5), (5, 5)\} \subseteq \{1, 2, 3, 4, 5\} \times \{1, 2, 3, 4, 5\}$
(b) $\{(1, 4), (2, 4), (3, 4), (4, 4), (5, 4)\} \subseteq \{1, 2, 3, 4, 5\} \times \{1, 2, 3, 4, 5\}$
(c) $\{(x, y) \in \mathbb{Z}^+ \times \mathbb{Z}^+ \mid \gcd(x, y) = 2\}$
(d) $\{(x, y) \in \mathbb{N} \times \{-1, 0, 1\} \mid y = \sin(x-)\}$
2. D Which of the relations in Exercise 1 are onto?
3. 14Which of the relations in Exercise I are one-to-one? 4. qA Describe the inverse relation for each relation in Exercise 1. 5. Which of the following sets of ordered pairs represent functions? (Recall that Z+ is the set of positive integers.) (a) {(a, a), (b, c), (c, x), (d, y), (x, z), (y, d), (d, b)S C (a, b, c, d, x, y, z} x {a, b, c, d, x, y, zS (b) {(x,y) E 11,2,3,4,5,6,7,8} x {1,2,3,4,5,6,7,8}1 lcm(x, y) > 81 N lS (C) I(XY) EZ XNIY = P9 2 W] I(WI, (d) Let P be the set of all people and let VV be the set of all human women. The set of ordered pairs is {(p, w) E• x N I tw is the biological mother of p. 6. Which of the relations in Exercise 5 are onto? 7. Which of the relations in Exercise 5 are one-to-one? 8. Describe the inverse relation for each relation in Exercise 5. 9. Determine the ordered pairs in S o T?, where 7z. S c 2+ X Z+. (a) 7Z ={(1, 5), (1,6), (2, 6), (2, 8)} and S ((,2), (5, 4), (6, 5), (7, 8), (8,9), (8, 10)1 (b) 7?. {(1, 1), (2, 1), (3, 2), (4, 1)) and S {(1, 1), (1, 2), (2, 3), (2, 4)1 R = ((2a, 2a) Ia E Z+} and (c) OD*
(d)
(a) Determine the ordered pairs in $R^{-1}$.
(b) Determine the ordered pairs in $S^{-1}$.
(c) Determine the ordered pairs in $(S \circ R)^{-1}$.
(d) Is there any connection between parts (a), (b), and (c)? 13. Determine the ordered pairs in T o S o 7Z. Assume that 7R C {GA, MI, PA, SD) x (CA, ND, WII, S C ICA, ND, WI} x INC, SC, WA}, and T C INC, SC, WA} x {IL, KA, MA, MN, RI}. (a) D 7- =((GA, CA), (MN , CA)}, = (NCA, IL), (NC, WA)) (b) 7Z = {(MI, ND), (PA, ND), (PA, WI), (SD, CA)}, = {(CA, NC), (ND, NC), (ND, SC), (ND, WA), WA)}, T = {(NC, MN), (SC, MA), (SC, RI), (WA, IL)1 (c) 7Z =(GA, ND), (MI, WI), (PA, CA), (SD, CA)1, S = {(CA, SC), (ND,NC),(ND,SC), (WI, WA)), T = ((NC, KA), (SC, MA), (SC, RI), (WA, IL)) 14. Determine the ordered pairs in T o S o 71Z.Assume that 7Z C (1,2,3} x {a, b, cl, S C fa, b, c} x {I,fl, y}, and y) x {©, K, 4,41. T C (a, /3, (a) 7Z = (1, a), (1, b)1, S ={(a, ct), (b,/f)1, (u, 0), (p, 0)) T (b) 7? = {(1, b), (1, c), (2, a), (3, a)1, (c, y)1, ((a, a), (a, y), (b, ca), (c, /3), T ((a,1-), (/8,4 ), (03, 0), (Y' Q))} I(1, a), (1, b), (2, a), (3, b)), (c) 7R (c, fi)}, S = {(a, at), (b, /3), ((a,Q(), (a,4), (A, *), (Y', 4) " 15. Each of the following statements is either true (always) or
$S = \{(3b, 3b + 1) \mid b \in \mathbb{Z}^+\}$
$R = \{(2a, 6a) \mid a \in \mathbb{Z}^+\}$ and $S = \{(3b, 3b + 1) \mid b \in \mathbb{Z}^+\}$
(e) $R = \{(n, 2n + 1) \mid n \in \mathbb{Z}^+\}$ and $S = \{(2k - 1, k^2) \mid k \in \mathbb{Z}^+\}$
10. Determine which (if any) of the compositions, $S \circ R$ and $R \circ S$, are defined. If the composition is defined, list the resulting ordered pairs. In each case, $R, S \subseteq \mathbb{N} \times \mathbb{N}$.
(a) $R = \{(0, 5), (1, 8), (9, 6), (9, 7)\}$ and $S = \{(0, 0), (0, 9), (8, 1), (8, 9)\}$
(b) $R = \{(2, 5), (5, 3), (6, 6), (6, 8)\}$ and $S = \{(1, 2), (5, 4), (6, 5), (7, 8)\}$
(c) $R = \{(1, 1), (2, 4), (3, 9), (4, 16)\}$ and $S = \{(1, 1), (4, 2), (9, 3), (16, 4)\}$
(d) $R = \{(1, 2), (1, 3), (1, 4), (1, 5)\}$ and $S = \{(1, 2), (2, 4), (3, 5), (4, 6), (5, 7)\}$
11. Let $R = \{(1, 5), (1, 6), (2, 6), (2, 8), (3, 1), (4, 7)\}$ and $S = \{(1, 2), (5, 4), (6, 5), (7, 8), (8, 9), (8, 10)\}$ (both subsets of $\mathbb{N} \times \mathbb{N}$).
(a) Determine the ordered pairs in $R^{-1}$.
(b) Determine the ordered pairs in $S^{-1}$.
(c) Determine the ordered pairs in $(S \circ R)^{-1}$.
(d) Is there any connection between parts (a), (b), and (c)?
12. Let $R = \{(0, 2), (0, 5), (0, 9), (1, 9), (1, 12), (1, 15), (2, 2)\}$ and $S = \{(2, 0), (2, 6), (5, 6), (9, 8), (12, 1), (12, 7), (15, 4)\}$ (both subsets of $\mathbb{N} \times \mathbb{N}$).
false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) If a relation is one-to-one and onto, then it is a function.
(b) Composition of relations is associative for all relations. That is, $(T \circ S) \circ R = T \circ (S \circ R)$, where $R$, $S$, and $T$ are any relations.
(c) A relation that is not one-to-one can still have an inverse.
(d) [D] The function $F$ is defined by $F = \{(a, 1), (b, 2), (c, 3), (d, 1), (e, 2)\}$. Then the range of $F$ is $\{1, 2, 3\}$.
(e) Let $A$ and $B$ be sets. Then it is possible for a subset of the Cartesian product $A \times B$ to be both a function and a relation.
16. Let $F$ be a one-to-one and onto function with domain $D_F$ and range $R_F$. Prove the following.
(a) An inverse function, $G$, must exist.
(b) The inverse function is unique.
(c) The inverse function is also one-to-one and onto.
17. The informal definition of the identity function, $I$, requires $I(x) = x$ to be true for all $x$ in the domain (so the range is equal to the domain). Produce a formal definition for the identity function.
18. Complete the proof of Theorem 12.1.
Chapter 12 Functions, Relations, Databases, and Circuits
19. Let $R$ and $S$ be two relations for which $R \circ S$ and $S \circ R$ are both defined. Is it always true that $R \circ S = S \circ R$? If it is true, provide a proof; otherwise find a counterexample.

20. Let $R$ be a relation between $A$ and $B$, and let $S$ be a relation between $B$ and $C$ with $\text{Image}(R) = \text{Domain}(S)$. Prove that $(S \circ R)^{-1} = R^{-1} \circ S^{-1}$.
12.2 Equivalence Relations
There are a number of special properties that can be used to describe relations. These properties will be defined in the next section. A special bundle of properties, called an equivalence relation, will be explored in more detail in Section 12.2.2.

12.2.1 Properties that Characterize Relations
Before the properties are formally defined, a few examples will be given.

Some Properties that Describe Relations
Let $A = \mathbb{Z}^+$. Define the following relations on $A$.
$R_1 = \{(x, y) \in \mathbb{Z}^+ \times \mathbb{Z}^+ \mid (x - y) > 0\}$
$R_2 = \{(x, y) \in \mathbb{Z}^+ \times \mathbb{Z}^+ \mid \gcd(x, y) = 1\}$
R3 = {(x, y) ∈ Z+ × Z+ | x

1. The attribute B is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$ if every distinct choice of values for the attributes $A_1, A_2, \ldots, A_j$ uniquely determines the value of B. A more formal definition of functional dependence will be given on page 751.

In Example 12.13, the values of "Father" and "Mother" in the relation "Biological" do not uniquely determine the value of "Biological Child" (since parents can have several biological children). Therefore, "Biological Child" is not functionally dependent on {Father, Mother}. On the other hand, both "Father" and "Mother" are functionally dependent on "Biological Child."
EXAMPLE 12.14 Functional Dependence
Let T be a relation with attribute set {Student, GPA, Honors}, where "Honors" has values in the set {none, cum laude, magna cum laude, summa cum laude}. Since honors rankings are determined by the value of a student's GPA, the attribute "Honors" is functionally dependent upon the attribute "GPA." Note that "GPA" is not functionally dependent on "Honors," since knowing that a student will graduate cum laude does not determine whether the GPA is 3.6 or 3.69 or some other value in the range determined by the college for cum laude.

The next definition introduces a fundamental concept in database theory.
DEFINITION 12.20 Key; Primary Key; Alternate Key; Nonkey Attribute
Let T be a relation and A be the attribute set of T. A nonempty subset, $P = \{A_1, A_2, \ldots, A_j\}$, of A is called a key for T if
1. All attributes in $A - P$ (the set difference) are functionally dependent on P.
2. No nonempty proper subset of P has property 1.
If a key consists of only one attribute, B, it is customary to speak of the key B instead of the key {B}. If there is more than one key, one of them is chosen to be the primary key, and the other choices are demoted to the status of alternate keys. An attribute that is not part of the primary key is called a nonkey attribute.
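The two key properties can be tested directly on a small table. The sketch below makes two assumptions that go beyond the text: a relation is stored as a list of Python dictionaries, and functional dependence is judged only from the rows actually listed (a real design decision depends on what the data means, not on one snapshot). The function names and the three sample rows of a "Biological" relation are chosen for illustration.

```python
from itertools import combinations


def depends_on(rows, lhs, b):
    """True if equal values on the attributes in lhs always force equal b."""
    seen = {}
    for row in rows:
        values = tuple(row[a] for a in lhs)
        if seen.setdefault(values, row[b]) != row[b]:
            return False
    return True


def has_property_1(rows, attrs, p):
    """Property 1: every attribute outside p is functionally dependent on p."""
    return all(depends_on(rows, sorted(p), b) for b in attrs - p)


def is_key(rows, attrs, p):
    """Property 1 holds for p and for no nonempty proper subset of p."""
    if not p or not has_property_1(rows, attrs, p):
        return False
    return not any(
        has_property_1(rows, attrs, set(q))
        for size in range(1, len(p))
        for q in combinations(sorted(p), size)
    )


attrs = {"Father", "Mother", "Biological Child"}
rows = [
    {"Father": "John Smith", "Mother": "Jane Smith",
     "Biological Child": "William Smith"},
    {"Father": "John Smith", "Mother": "Jane Smith",
     "Biological Child": "Susan Smith"},
    {"Father": "Esteban Rodriguez", "Mother": "Anita Rodriguez",
     "Biological Child": "Pablo Rodriguez"},
]

print(is_key(rows, attrs, {"Biological Child"}))            # key
print(is_key(rows, attrs, {"Father", "Mother"}))            # property 1 fails
print(is_key(rows, attrs, {"Father", "Biological Child"}))  # property 2 fails
```

The third call shows why property 2 matters: {Father, Biological Child} determines every other attribute, but it is not minimal because {Biological Child} already does.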
In Example 12.13, Biological Child is a primary key for the Biological relation. In Example 12.14, the only possible primary key would be the Student attribute. If this attribute takes on unique values (such as a student ID) rather than values that might appear in other tuples (such as a name), then it will be a primary key; otherwise, there is no primary key for the relation. The next example hints at the kinds of problems that the absence of a primary key creates.

EXAMPLE 12.15 Driver's Licenses
Let T be a relation that has attribute set {Last, First, Middle, Birthdate, Driver's License Number, Street Address, City, State, Zip}. It is tempting to choose Driver's License Number as the primary key. This would be a mistake in many states. The algorithms used to determine driver's license numbers do not always assign a unique number to each person.²⁶ Imagine the trouble you might encounter if you share a driver's license number with someone who has a tendency to drive recklessly while intoxicated. If your name appears earlier in the state's database, the tickets might be sent to you instead of the real offender.

In states that do not have unique driver's license numbers, at least one other attribute must be added to form the primary key. The Birthdate attribute might be a good choice. It is not mathematically certain that {Driver's License Number, Birthdate} is truly a primary key, but the probability of failure is extremely low. A better solution would be to use an algorithm that does generate unique driver's license numbers.

It is now possible to resolve the potential practical problem encountered in Example 12.13. This practical problem was caused by real-life databases possibly having identical tuples that represent distinct entities in the database. The solution is simple. When a relational database model is chosen, it must be designed so that there is a primary key. If no such key arises naturally, some kind of unique ID must be assigned to each entity represented in the relation.

The following list summarizes the features of relational databases that have been presented so far. The two-dimensional terminology will be used as the primary vocabulary. This choice may help reinforce the mathematical terminology given in Definition 12.18.

• A relational database is a collection of relations, each of which can be visualized as a two-dimensional table.
• The columns in a table represent attributes.
• The rows in a table are ordered tuples of attributes that represent a unique entity.
• There are no duplicate rows.
• The entry in each row-column intersection is single valued.
• The order of the rows is unimportant.
• Each table has a primary key, consisting of one or more attributes.
Quick Check 12.6
Assume that the following table represents a relation.

Variety            Vegetable   Germination   Harvest
Bush Champion      Cucumber    7-14 days     55 days
Coot Kitty         Cucumber    8-12 days     60 days
Nantes Half Long   Carrot      7-21 days     70 days
What's Up Doc?     Carrot      7-14 days     75 days
Cherry Belle       Radish      7-14 days     22 days
Cherry Bomb        Radish      7-14 days     20 days

1. Determine the attribute set.
2. Determine all functional dependencies.
3. What is a good choice for the primary key?

²⁶ See [32] for more complete details. Some related information can be found at the "Driver's License" link at http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/ in the "Textbook-Related Links" section.
There are several very useful operations that may be applied to relations in a relational database. The operations that are most relevant for this discussion are introduced next.

DEFINITION 12.21 Projection; Join
Let $T_1$ be an $n_1$-ary relation with attribute set $A_1$ and $T_2$ be an $n_2$-ary relation with attribute set $A_2$. If $\{B_1, B_2, \ldots, B_j\} \subseteq A_1$, then the projection of $T_1$ onto $\{B_1, B_2, \ldots, B_j\}$ is the relation obtained by
1. removing from each tuple in $T_1$ the components that do not correspond to an attribute in $\{B_1, B_2, \ldots, B_j\}$,
2. and then removing any duplicate tuples (keeping one copy).
The projection of a relation, $T$, onto attributes $\{B_1, B_2, \ldots, B_j\}$ is denoted by $T[B_1, B_2, \ldots, B_j]$. Similar notation is used to denote the projection of a single tuple in $T$ onto the attribute set $\{B_1, B_2, \ldots, B_j\}$.

The join, $T_1 * T_2$, of $T_1$ and $T_2$ is a relation having attribute set $B = A_1 \cup A_2$, with an ordering imposed on $B$. Assume that the ordered attributes in the union, $B$, are $B_1, B_2, \ldots, B_k$. Then
$$T_1 * T_2 = \{\, r \in B_1 \times B_2 \times \cdots \times B_k \mid \exists r_1 \in T_1 \text{ and } \exists r_2 \in T_2 \text{ with } r[A_1] = r_1 \text{ and } r[A_2] = r_2 \,\}.$$
The Cartesian product, $B_1 \times B_2 \times \cdots \times B_k$, in the definition of join is constructed using the values of each attribute that actually appear in the relations, $T_1$ and $T_2$, and not the set of all potential values for the attributes.

The join defined in Definition 12.21 is also called the natural join. A simple algorithmic procedure can create a join:

• Form a Cartesian product from $T_1$ and $T_2$. That is, make a table with one row for each distinct pair of rows in $T_1$ and $T_2$. Each new row will consist of a tuple from $T_1$ concatenated with a tuple from $T_2$. If $T_1$ has $n$ rows and $T_2$ has $m$ rows, then the new table will have $nm$ rows. There may be columns with duplicate attribute names.
• Remove all rows in the new table for which the duplicate columns do not have identical values. (This step corresponds to the requirement $\exists r_1 \in T_1$ and $\exists r_2 \in T_2$ with $r[A_1] = r_1$ and $r[A_2] = r_2$ in Definition 12.21. That definition does not start with duplicate columns.)
• Form the attribute set (as a true set, with no duplicates). Project onto the attribute set (thus eliminating one copy of each duplicate column).

A short example should help clarify the definitions of projection and join.
EXAMPLE 12.16 Projection and Join with Family Relations
The relations in Example 12.12 contain a very large number of tuples. To make this example simpler, suppose that the two relations, "Biological" and "Adoptive," contain only the tuples shown in Tables 12.2 and 12.3.

TABLE 12.2 Biological
Father              Mother              Biological Child
John Smith          Jane Smith          William Smith
John Smith          Jane Smith          Susan Smith
Esteban Rodriguez   Anita Rodriguez     Pablo Rodriguez
Walter Leblanc      Miranda Leblanc     Wanda Leblanc
Robert Westlund     Virginia Westlund   Derwin Westlund
Robert Westlund     Virginia Westlund   Darwin Westlund

TABLE 12.3 Adoptive
Father              Mother              Adoptive Child
John Smith          Jane Smith          Carmen Smith
John Smith          Jane Smith          Polly Smith
Esteban Rodriguez   Anita Rodriguez     Tran-minh Rodriguez
Isaac Levitz        Helen Levitz        Aaron Levitz
Isaac Levitz        Helen Levitz        Hanna Levitz
Bob Jones           Betty Jones         Samantha Jones

The projection of Biological onto {Father, Mother} is the relation containing all couples who are (jointly) biological parents (Table 12.4).

TABLE 12.4 Biological[Father, Mother]
Father              Mother
John Smith          Jane Smith
Esteban Rodriguez   Anita Rodriguez
Walter Leblanc      Miranda Leblanc
Robert Westlund     Virginia Westlund

The join Biological * Adoptive is essentially the relation containing all pairs of biological and adoptive children having the same parents (Table 12.5).

TABLE 12.5 Biological * Adoptive
Father              Mother            Biological Child   Adoptive Child
John Smith          Jane Smith        William Smith      Carmen Smith
John Smith          Jane Smith        William Smith      Polly Smith
John Smith          Jane Smith        Susan Smith        Carmen Smith
John Smith          Jane Smith        Susan Smith        Polly Smith
Esteban Rodriguez   Anita Rodriguez   Pablo Rodriguez    Tran-minh Rodriguez

More precisely, the join is the set of all tuples in Father × Mother × Biological Child × Adoptive Child for which the projection onto {Father, Mother, Biological Child} is a tuple in Biological, and the projection onto {Father, Mother, Adoptive Child} is a tuple in Adoptive. This rules out tuples such as (Isaac Levitz, Helen Levitz, Wanda Leblanc, Aaron Levitz) because the projection, (Isaac Levitz, Helen Levitz, Wanda Leblanc), onto {Father, Mother, Biological Child} is not in Biological.
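The projection and join of the family relations can be reproduced with a few lines of code. This is a minimal sketch under one representation assumption that is not in the text: each tuple is stored as a frozenset of (attribute, value) pairs, so a relation is a Python set and duplicate tuples disappear automatically. The helper names `relation`, `project`, and `join` are chosen for illustration.

```python
def relation(attrs, tuples):
    """Store each tuple as a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, t)) for t in tuples}


def project(rel, attrs):
    """Keep only the listed attributes; the set drops duplicate tuples."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def join(rel1, rel2):
    """Natural join: combine rows that agree on every shared attribute."""
    joined = set()
    for r1 in rel1:
        for r2 in rel2:
            d1, d2 = dict(r1), dict(r2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(r1 | r2)
    return joined


biological = relation(("Father", "Mother", "Biological Child"), [
    ("John Smith", "Jane Smith", "William Smith"),
    ("John Smith", "Jane Smith", "Susan Smith"),
    ("Esteban Rodriguez", "Anita Rodriguez", "Pablo Rodriguez"),
    ("Walter Leblanc", "Miranda Leblanc", "Wanda Leblanc"),
    ("Robert Westlund", "Virginia Westlund", "Derwin Westlund"),
    ("Robert Westlund", "Virginia Westlund", "Darwin Westlund"),
])
adoptive = relation(("Father", "Mother", "Adoptive Child"), [
    ("John Smith", "Jane Smith", "Carmen Smith"),
    ("John Smith", "Jane Smith", "Polly Smith"),
    ("Esteban Rodriguez", "Anita Rodriguez", "Tran-minh Rodriguez"),
    ("Isaac Levitz", "Helen Levitz", "Aaron Levitz"),
    ("Isaac Levitz", "Helen Levitz", "Hanna Levitz"),
    ("Bob Jones", "Betty Jones", "Samantha Jones"),
])

parents = project(biological, {"Father", "Mother"})   # four couples
both = join(biological, adoptive)                     # five tuples
print(len(parents), len(both))
```

The nested loops in `join` mirror the procedural description of the join (form all row pairs, then discard pairs whose shared columns disagree); taking the union `r1 | r2` plays the role of the final projection that removes the duplicate columns.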
12.3 n-ary Relations and Relational Databases
Quick Check 12.7
Use the relations defined in Example 12.16 to complete the following tasks.
1. Form the projection Adoptive[Mother, Adoptive Child].
2. Use the alternative procedure (outlined immediately after Definition 12.21) to produce the join, Biological * Adoptive.
3. The join in part (2) of this Quick Check is not terribly useful. However, it is an intermediate step for producing a more useful relation. Form (Biological * Adoptive)[Father, Mother]; that is, first form the join, and then project onto {Father, Mother}. Also, describe what this new relation represents.

Projections can be used to provide a more precise definition of functional dependence.

DEFINITION 12.22 Functional Dependence (Formal)
Let $T$ be a relation and $A$ be the attribute set of $T$. Let $B \in A$ and $\{A_1, A_2, \ldots, A_j\} \subseteq A$ with $j \geq 1$. The attribute, $B$, is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$ if for every pair of tuples, $r_1$ and $r_2$ in $T$,
$$r_1[A_1, A_2, \ldots, A_j] = r_2[A_1, A_2, \ldots, A_j] \quad \text{implies} \quad r_1[B] = r_2[B].$$
The notation
$$\{A_1, A_2, \ldots, A_j\} \to B$$
indicates that $B$ is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$.
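The formal definition translates almost word for word into a pairwise test. The sketch below assumes that tuples are stored as Python dictionaries and that the dependence is judged only on the rows supplied. The student names and the 3.95 GPA are hypothetical sample values; the GPAs 3.6 and 3.69 for cum laude echo the earlier Honors discussion.

```python
from itertools import combinations


def functionally_dependent(rows, lhs, b):
    """For every pair r1, r2: agreement on lhs must imply agreement on b."""
    for r1, r2 in combinations(rows, 2):
        if all(r1[a] == r2[a] for a in lhs) and r1[b] != r2[b]:
            return False
    return True


# Hypothetical sample rows for a relation with attributes
# {Student, GPA, Honors}.
rows = [
    {"Student": "Ann", "GPA": 3.6, "Honors": "cum laude"},
    {"Student": "Bob", "GPA": 3.69, "Honors": "cum laude"},
    {"Student": "Cho", "GPA": 3.95, "Honors": "summa cum laude"},
    {"Student": "Dee", "GPA": 3.6, "Honors": "cum laude"},
]

print(functionally_dependent(rows, ["GPA"], "Honors"))   # {GPA} -> Honors
print(functionally_dependent(rows, ["Honors"], "GPA"))   # fails: 3.6 vs 3.69
```

A result of True only says that the listed rows do not contradict the dependence; whether the dependence holds for the relation in general depends on the rules behind the data.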
12.3.3 Normal Forms
If care is not taken in the formation of a relational database model, some serious deficiencies can be introduced. In particular, if too much redundancy is designed into the relations, there can be problems if a piece of information needs to be modified or deleted. The following example illustrates such a problem.

EXAMPLE 12.17 Anomalies in a Relational Database Model
The relation in Table 12.6 might arise if the registrar's office is attempting to keep track of course assignments together with useful information to help someone contact the instructor.

TABLE 12.6 Schedule
Course    Section   Semester   Instructor   Office   Phone
MAT241    1         Fall       Gossett      CC 224   x6131
MAT241    2         Fall       Gossett      CC 224   x6131
MAT124M   1         Fall       Pederson     CC 225   x6348
MAT124M   2         Fall       Kinney       CC 229   x6532
MAT124M   3         Fall       Pederson     CC 225   x6348

Suppose Professor Pederson decides to change her last name to Conrath. It is possible that the person who modifies the database might be in a hurry and forget to change every occurrence of the name. This will result in inconsistencies in the database. Note also that some information is repeated multiple times (for example, the fact that Professor Gossett is in CC 224 and has phone extension 6131). This redundancy requires the database to take more space than is necessary.
There is another potential problem with this design. Suppose section 2 of MAT124M does not have a sufficient number of students, so the registrar's office decides to cancel section 2. There is some possibility (depending on the other relations in the database) that deleting this tuple may result in the loss of the information placing Professor Kinney in CC 229 with phone extension 6532.

The solution to the kinds of problems introduced in Example 12.17 is to cleverly decompose the database into a collection of relations that exhibit some desirable properties. These properties are codified in a sequence of normal forms. Before introducing some normal forms, the decomposition process needs to be discussed.

DEFINITION 12.23 Decomposition; Lossless Decomposition
Let $T$ be a relation with attribute set $A = D_1 \cup D_2 \cup \cdots \cup D_n$, where the subsets, $D_i$, of attributes are not necessarily disjoint. The set of relations $\{T[D_1], T[D_2], \ldots, T[D_n]\}$ is a decomposition of $T$. If, in addition, $T = T[D_1] * T[D_2] * \cdots * T[D_n]$, the decomposition is called a lossless decomposition.
EXAMPLE 12.18 A Lossless Decomposition
Suppose the relation, Schedule, defined in Table 12.6 from Example 12.17 is decomposed as {Schedule[Course, Section, Semester, Instructor], Schedule[Instructor, Office, Phone]}. Tables 12.7 and 12.8 show these projections.

TABLE 12.7 Schedule[Course, Section, Semester, Instructor]
Course    Section   Semester   Instructor
MAT241    1         Fall       Gossett
MAT241    2         Fall       Gossett
MAT124M   1         Fall       Pederson
MAT124M   2         Fall       Kinney
MAT124M   3         Fall       Pederson

TABLE 12.8 Schedule[Instructor, Office, Phone]
Instructor   Office   Phone
Gossett      CC 224   x6131
Pederson     CC 225   x6348
Kinney       CC 229   x6532

This decomposition is lossless because Schedule = Schedule[Course, Section, Semester, Instructor] * Schedule[Instructor, Office, Phone].
EXAMPLE 12.19 A Lossy Decomposition
Suppose the relation, Schedule, in Table 12.6 is decomposed as {Schedule[Course, Section, Semester], Schedule[Course, Instructor, Office, Phone]}. The two projections are shown in Tables 12.9 and 12.10.

TABLE 12.9 Schedule[Course, Section, Semester]
Course    Section   Semester
MAT241    1         Fall
MAT241    2         Fall
MAT124M   1         Fall
MAT124M   2         Fall
MAT124M   3         Fall

TABLE 12.10 Schedule[Course, Instructor, Office, Phone]
Course    Instructor   Office   Phone
MAT241    Gossett      CC 224   x6131
MAT241    Gossett      CC 224   x6131
MAT124M   Pederson     CC 225   x6348
MAT124M   Kinney       CC 229   x6532
MAT124M   Pederson     CC 225   x6348

This decomposition is lossy because

Schedule ≠ Schedule[Course, Section, Semester] * Schedule[Course, Instructor, Office, Phone].

In fact, the join (Table 12.11) contains tuples that should be excluded. For instance, Professor Kinney is not scheduled to teach section 1 of MAT124M.
TABLE 12.11 Schedule[Course, Section, Semester] * Schedule[Course, Instructor, Office, Phone]
Course    Section   Semester   Instructor   Office   Phone
MAT241    1         Fall       Gossett      CC 224   x6131
MAT241    2         Fall       Gossett      CC 224   x6131
MAT124M   1         Fall       Pederson     CC 225   x6348
MAT124M   2         Fall       Pederson     CC 225   x6348
MAT124M   3         Fall       Pederson     CC 225   x6348
MAT124M   1         Fall       Kinney       CC 229   x6532
MAT124M   2         Fall       Kinney       CC 229   x6532
MAT124M   3         Fall       Kinney       CC 229   x6532
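Whether a decomposition is lossless can be tested by rebuilding the join and comparing it with the original relation. The sketch below applies that test to the two decompositions of Schedule discussed above. It assumes tuples are stored as frozensets of (attribute, value) pairs, and the helper names `relation`, `project`, and `join` are chosen for illustration.

```python
def relation(attrs, tuples):
    """Store each tuple as a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, t)) for t in tuples}


def project(rel, attrs):
    """Keep only the listed attributes; the set drops duplicate tuples."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def join(rel1, rel2):
    """Natural join: combine rows that agree on every shared attribute."""
    joined = set()
    for r1 in rel1:
        for r2 in rel2:
            d1, d2 = dict(r1), dict(r2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(r1 | r2)
    return joined


schedule = relation(
    ("Course", "Section", "Semester", "Instructor", "Office", "Phone"), [
        ("MAT241", "1", "Fall", "Gossett", "CC 224", "x6131"),
        ("MAT241", "2", "Fall", "Gossett", "CC 224", "x6131"),
        ("MAT124M", "1", "Fall", "Pederson", "CC 225", "x6348"),
        ("MAT124M", "2", "Fall", "Kinney", "CC 229", "x6532"),
        ("MAT124M", "3", "Fall", "Pederson", "CC 225", "x6348"),
    ])

# Decomposition that shares the Instructor attribute.
rebuilt_1 = join(
    project(schedule, {"Course", "Section", "Semester", "Instructor"}),
    project(schedule, {"Instructor", "Office", "Phone"}))

# Decomposition that shares only the Course attribute.
rebuilt_2 = join(
    project(schedule, {"Course", "Section", "Semester"}),
    project(schedule, {"Course", "Instructor", "Office", "Phone"}))

print(rebuilt_1 == schedule)                              # lossless
print(rebuilt_2 == schedule, len(rebuilt_2) - len(schedule))  # lossy
```

Note that "lossy" means the join gains spurious tuples: the original relation is always a subset of the join of its projections, so information is lost because the true tuples can no longer be told apart from the extra ones.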
The primary goal for the rest of this section is to introduce first, second, and third normal forms. Brief mention will be made of Boyce-Codd normal form. The next definition starts the process. It uses notation introduced in Definition 12.22 on page 751.
DEFINITION 12.24 First, Second, and Third Normal Forms
Let $T$ be a relation with attribute set $A$, and let $D \subseteq A$ be a subset of attributes. Let $B \notin D$ be an attribute that is not in any key.
First Normal Form: $T$ is in first normal form if every attribute in $T$ is single valued.
Second Normal Form: $T$ is in second normal form if it is in first normal form and if $D \to B$ implies that $D$ is not properly contained in any key of $T$.
Third Normal Form: $T$ is in third normal form if it is in first normal form and if $D \to B$ implies that $D$ contains some key of $T$.

Notice that Definition 12.18 requires relations in a relational database to be in first normal form. Exercise 10 in Exercises 12.3.4 asserts that any relation that is in third normal form is also in second normal form.

The intuitive motivation for second normal form is the observation that having a functional dependency on a proper subset of a key is undesirable. It is better to have the primary key be the only source of dependency for attributes that are not part of some key. The requirement for third normal form just strengthens this insight.

EXAMPLE 12.20 First Normal Form
Table 12.12 represents a database design that is not in first normal form. Assume that teaching assistants can only work for one instructor.

TABLE 12.12 ScheduleA
Course    Section   Semester   Instructor            Office            Teaching Assistant
MAT241    1         Fall       Gossett & Turnquist   CC 224 & CC 226   Nielsen
MAT241    2         Fall       Gossett               CC 224            Nielsen
MAT124M   1         Fall       Conrath               CC 225            Dowdey
MAT124M   2         Fall       Kinney                CC 229            Ness
MAT124M   1         Spring     Conrath               CC 225            Dowdey

The problem, of course, is that section 1 of MAT241 is team taught, resulting in attributes that are not single valued. The solution is simple: Break tuples with multivalued attributes into a set of tuples. For this example, one additional tuple suffices (Table 12.13).

TABLE 12.13 ScheduleB
Course    Section   Semester   Instructor   Office   Teaching Assistant
MAT241    1         Fall       Gossett      CC 224   Nielsen
MAT241    1         Fall       Turnquist    CC 226   Nielsen
MAT241    2         Fall       Gossett      CC 224   Nielsen
MAT124M   1         Fall       Conrath      CC 225   Dowdey
MAT124M   2         Fall       Kinney       CC 229   Ness
MAT422    2         Fall       Kinney       CC 229   Ness
MAT124M   1         Spring     Conrath      CC 225   Dowdey
EXAMPLE 12.21 Second Normal Form
The relation in Example 12.17 on page 751 was shown to result in update and deletion anomalies. One indicator that such problems will occur for some relation is that the relation is not in second normal form. Consider the relation, ScheduleB, from Example 12.20 (Table 12.13). That relation is in first normal form, but not in second normal form. To see this, consider the functional dependencies. The essential dependencies are shown in Table 12.14.²⁷

TABLE 12.14 Essential functional dependencies for ScheduleB
{Course, Section, Semester, Instructor} → Office
{Course, Section, Semester, Instructor} → Teaching Assistant
{Course, Section, Semester, Office} → Instructor
{Course, Section, Semester, Office} → Teaching Assistant
Instructor → Office
Instructor → Teaching Assistant
Office → Teaching Assistant
Office → Instructor

The dependencies indicate that {Course, Section, Semester, Instructor} can be chosen as the primary key. There is one alternate key: {Course, Section, Semester, Office}. Notice that D = Instructor is properly contained in a key, but D → Teaching Assistant is true. Thus, this relation is not in second normal form.

If a relation is not in second normal form (but is in first normal form), there are algorithms that will transform it into an equivalent relation that is in second normal form. See [57] or most other database textbooks for details. The algorithms typically create a lossless decomposition to achieve their goal. The approach here will be to introduce an algorithm that will generate a decomposition whose relations are in third normal form.

EXAMPLE 12.22 Third Normal Form
Suppose the registrar's office has decreed that team-taught sections will no longer be permitted. In addition, an instructor is allowed to have different teaching assistants for different sections or semesters of a course. The relation in Table 12.15 has eliminated one instructor to satisfy the ban on team-taught courses, and has also added a new teaching assistant.

TABLE 12.15 ScheduleC
Course    Section   Semester   Instructor   Office   Teaching Assistant
MAT241    1         Fall       Gossett      CC 224   Nielsen
MAT241    2         Fall       Gossett      CC 224   Nielsen
MAT124M   1         Fall       Conrath      CC 225   Dowdey
MAT124M   2         Fall       Kinney       CC 229   Ness
MAT422    2         Fall       Kinney       CC 229   Ness
MAT124M   1         Spring     Conrath      CC 225   Berg

The essential functional dependencies are listed in Table 12.16.

TABLE 12.16 Essential functional dependencies for ScheduleC
{Course, Section, Semester} → Instructor
{Course, Section, Semester} → Office
{Course, Section, Semester} → Teaching Assistant
Instructor → Office
Teaching Assistant → Instructor
Teaching Assistant → Office
Office → Instructor

The primary (and only) key is {Course, Section, Semester}. This relation is in second normal form. This follows since the left-hand side of every dependency is either the entire primary key (and hence not a proper subset of the primary key), or is a nonkey attribute. This relation is not, however, in third normal form. To see this, set D = {Teaching Assistant}. Notice that D → Instructor is true, but D does not contain any key in the relation.

²⁷ I am using the term essential dependency informally. There are other dependencies, such as {Instructor, Office} → Teaching Assistant, which are not listed. They have been omitted because at least one element of the left-hand side is functionally dependent on the other elements of the left-hand side. For instance, Instructor → Office, so {Instructor, Office} → Teaching Assistant should be omitted. The inclusion of both {Course, Section, Semester, Instructor} → Office and Instructor → Office is because neither has any internal dependencies on the left-hand side.
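The second and third normal form conditions can be checked mechanically once the keys and dependencies are written down. The sketch below is an illustration under stated assumptions: a dependency is a (left-hand side, attribute) pair with the left-hand side stored as a frozenset, only the listed essential dependencies are examined (not every dependency they imply), and the relation is taken to be in first normal form already. The function names are hypothetical.

```python
def violations_2nf(dependencies, keys):
    """Dependencies D -> B, with B in no key, where D is properly inside a key."""
    key_attrs = set().union(*keys)
    return [(d, b) for d, b in dependencies
            if b not in key_attrs and b not in d and any(d < k for k in keys)]


def violations_3nf(dependencies, keys):
    """Dependencies D -> B, with B in no key, where D contains no key."""
    key_attrs = set().union(*keys)
    return [(d, b) for d, b in dependencies
            if b not in key_attrs and b not in d
            and not any(k <= d for k in keys)]


css = frozenset({"Course", "Section", "Semester"})
instructor = frozenset({"Instructor"})
office = frozenset({"Office"})
assistant = frozenset({"Teaching Assistant"})

# ScheduleC: one key and the dependencies of Table 12.16.
keys_c = [css]
deps_c = [(css, "Instructor"), (css, "Office"), (css, "Teaching Assistant"),
          (instructor, "Office"), (assistant, "Instructor"),
          (assistant, "Office"), (office, "Instructor")]

# ScheduleB: two keys and the dependencies of Table 12.14.
keys_b = [css | instructor, css | office]
deps_b = [(css | instructor, "Office"),
          (css | instructor, "Teaching Assistant"),
          (css | office, "Instructor"),
          (css | office, "Teaching Assistant"),
          (instructor, "Office"), (instructor, "Teaching Assistant"),
          (office, "Teaching Assistant"), (office, "Instructor")]

print(violations_2nf(deps_c, keys_c))        # empty: second normal form holds
print(len(violations_3nf(deps_c, keys_c)))   # nonzero: third normal form fails
print(violations_2nf(deps_b, keys_b))        # includes Instructor -> Teaching Assistant
```

An empty list from either function means only that none of the listed dependencies breaks the condition, which is why the choice of "essential" dependencies matters.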
Quick Check 12.8
For each relation listed, determine whether it is in first, second, or third normal form. Assume that there are never players with the same name in the same position on the same team.

1. League Rosters
Team               Player     Position      Captain
Mud Hens           Casey      first base    yes
Mud Hens           O'Reilly   pitcher       no
Mud Hens           Issacson   pitcher       no
Mud Hens           Johnson    catcher       no
Mud Hens           Johnson    right field   no
Prairie Chickens   Svenson    pitcher       no
Prairie Chickens   Johnson    shortstop     yes
Prairie Chickens   Johnson    catcher       no
Prairie Chickens   Hidalgo    left field    no

2. Salaries
EmployeeID   Name              Job Grade   Salary
1214         John Chen         GS-11       $50,000
1225         Mary Thompson     GS-10       $45,000
1309         Sue Witkowski     GS-11       $50,000
1356         Ahmed Mosse       GS-12       $55,000
1443         John Chen         GS-9        $40,000
1455         Yolanda Roberts   GS-9.5      $40,000
⋮            ⋮                 ⋮           ⋮
An algorithm is needed that will convert a relation (in first or second normal form) into a lossless decomposition of relations that are each in third normal form. The corollary to the next theorem (from [31]) will be useful.

THEOREM (A Lossless Decomposition)
Let $T$ be a relation with attribute set $A$, and let $C \subseteq A$ and $D \subseteq A$ with $C \cap D = \emptyset$. Set $E = A - (C \cup D)$. If $C \to D$, then the two projections, $T[C \cup D]$ and $T[C \cup E]$, form a lossless decomposition of $T$.

Proof: In the expressions that follow, $c$, $d$, $e$ are tuples of values for the sets of attributes, $C$, $D$, $E$, respectively. Assume that the attributes in $A$ are ordered with all attributes in $C$ appearing first, all attributes in $D$ appearing in the middle, and all attributes in $E$ appearing at the end. Then
$$T[C \cup D] * T[C \cup E] = \{(c, d, e) \mid c \in C,\ d \in D,\ e \in E \text{ and } (c, d) \in T[C \cup D] \text{ and } (c, e) \in T[C \cup E]\}.$$
Since $c$, $d$, and $e$ are chosen independently, it is possible that $T[C \cup D] * T[C \cup E]$ may contain tuples that are not in $T$. It is certainly true that $T \subseteq T[C \cup D] * T[C \cup E]$.

Suppose that $(c, d, e) \in T[C \cup D] * T[C \cup E]$. It is not certain that $(c, d, e) \in T$. However, it is true that $(c, e) \in T[C \cup E]$. But $(c, e)$ is a projection of some tuple in $T$, so there is some $d' \in D$ such that $(c, d', e) \in T$. But then $(c, d') \in T[C \cup D]$. Thus, $(c, d) \in T[C \cup D]$ and $(c, d') \in T[C \cup D]$. In addition, $C \to D$. But then Definition 12.22 implies that $d = (c, d)[D] = (c, d')[D] = d'$ (because $(c, d)[C] = (c, d')[C]$). Consequently, $(c, d, e) = (c, d', e) \in T$. Thus, $T[C \cup D] * T[C \cup E] \subseteq T$. The two subset inclusions establish the validity of the theorem.

COROLLARY 12.1
Let $T$ be a relation with attribute set $A$, and let $C \subseteq A$ and $D \subseteq A$. Set $E = A - (C \cup D)$. If $C \to D$, then the two projections, $T[C \cup D]$ and $T[C \cup E]$, form a lossless decomposition of $T$.

Proof: What has changed from the statement of the theorem is that the corollary does not assume $C \cap D = \emptyset$. Thus, suppose $C \cap D \neq \emptyset$. Set $D' = D - C$. Then $C \cap D' = \emptyset$, $C \cup D' = C \cup D$, and $E = A - (C \cup D) = A - (C \cup D')$. Since $D' \subseteq D$, the dependence $C \to D$ gives $C \to D'$. The theorem then asserts that $T[C \cup D] = T[C \cup D']$ and $T[C \cup E]$ form a lossless decomposition of $T$.

Converting Relations to Collections in Third Normal Form
Suppose $T$ is a relation in a relational database that is not in third normal form. Since it is in a relational database, it is already in first normal form. Let $A$ be the attribute set of $T$. Since $T$ is not in third normal form, there must be a subset, $D \subseteq A$, and an attribute, $B \notin D$, which is not part of any key, such that $D \to B$, but $D$ does not contain any key. Since $D \cup (A - (D \cup \{B\})) = A - \{B\}$, Corollary 12.1 implies that the relations $T[D \cup \{B\}]$ and $T[A - \{B\}]$ form a lossless decomposition of $T$.

Notice that $D \cup \{B\} \neq A$ because otherwise $D$ would be a key for $T$. Thus, both $T[D \cup \{B\}]$ and $T[A - \{B\}]$ are relations with fewer attributes than $T$ has. If both these new relations are in third normal form, the goal has been achieved. Otherwise, the process can be repeated. Since the number of attributes in each new partition is strictly fewer than the number of attributes in the parent relation, the process must terminate after a finite number of steps. The process is summarized in the following algorithm.
Convert to Third Normal Form

    convert tables into first normal form (as necessary) to form a relational database
    while there are relations in the database that are not in third normal form
        choose a relation, $T$, with attribute set $A$, which is not in third normal form
        find a subset, $D$, of attributes in $T$ and an attribute, $B \notin D$, where $B$ is not in any key, such that $D \to B$
        replace $T$ with $T[D \cup \{B\}]$ and $T[A - \{B\}]$

The following theorem formally summarizes the main consequence of the process just outlined.

Any Relational Database Can Be Converted to Third Normal Form
Let $\{T_1, T_2, \ldots, T_n\}$ be a relational database. Then the relations, $T_i$, can be losslessly decomposed into a collection of relations that are each in third normal form.
Converting First Normal Form to Third Normal Form

The relation, ScheduleB, is in first normal form. It is repeated in Table 12.17. The essential dependencies were found in Example 12.21 and are repeated in Table 12.18.

TABLE 12.17 ScheduleB

Course    Section  Semester  Instructor  Office  Teaching Assistant
MAT241    1        Fall      Gossett     CC 224  Nielsen
MAT241    1        Fall      Turnquist   CC 226  Nielsen
MAT241    2        Fall      Gossett     CC 224  Nielsen
MAT124M   1        Fall      Conrath     CC 225  Dowdey
MAT124M   2        Fall      Kinney      CC 229  Ness
MAT422    2        Fall      Kinney      CC 229  Ness
MAT124M   1        Spring    Conrath     CC 225  Dowdey
TABLE 12.18 Essential functional dependencies for ScheduleB

{Course, Section, Semester, Instructor} → Office
{Course, Section, Semester, Instructor} → Teaching Assistant
{Course, Section, Semester, Office} → Instructor
{Course, Section, Semester, Office} → Teaching Assistant
Instructor → Office
Instructor → Teaching Assistant
Office → Teaching Assistant
Office → Instructor

The algorithm for converting to third normal form starts by finding a subset, $D$, of attributes that does not contain a key and another attribute, $B \notin D$, which is not in any key and which is functionally dependent on $D$. One such choice is $D$ = {Instructor} and $B$ = Teaching Assistant. The next step is to create the lossless decomposition {ScheduleB[Instructor, Teaching Assistant], ScheduleB[Course, Section, Semester, Instructor, Office]} (Tables 12.19 and 12.20).
TABLE 12.19 ScheduleB[Instructor, Teaching Assistant]

Instructor  Teaching Assistant
Gossett     Nielsen
Turnquist   Nielsen
Conrath     Dowdey
Kinney      Ness

TABLE 12.20 ScheduleB[Course, Section, Semester, Instructor, Office]

Course    Section  Semester  Instructor  Office
MAT241    1        Fall      Gossett     CC 224
MAT241    1        Fall      Turnquist   CC 226
MAT241    2        Fall      Gossett     CC 224
MAT124M   1        Fall      Conrath     CC 225
MAT124M   2        Fall      Kinney      CC 229
MAT422    2        Fall      Kinney      CC 229
MAT124M   1        Spring    Conrath     CC 225
Both of these new relations are in third normal form. The essential functional dependencies in ScheduleB[Course, Section, Semester, Instructor, Office] are as follows:

{Course, Section, Semester, Instructor} → Office
{Course, Section, Semester, Office} → Instructor
Instructor → Office
Office → Instructor

There is no choice for $B$ that is not in some key.
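The decomposition in this example can be checked mechanically. The following is a minimal sketch, not part of the text's algorithm: the helper names `relation`, `project`, and `join` are assumptions chosen for illustration, and tuples are modeled as sets of (attribute, value) pairs so that attribute order does not matter. It projects ScheduleB (Table 12.17) onto the two attribute sets used above and confirms that the natural join of the projections returns exactly the original relation.

```python
# Attribute names and rows of ScheduleB, copied from Table 12.17.
ATTRS = ("Course", "Section", "Semester", "Instructor", "Office", "Teaching Assistant")
ROWS = [
    ("MAT241", 1, "Fall", "Gossett", "CC 224", "Nielsen"),
    ("MAT241", 1, "Fall", "Turnquist", "CC 226", "Nielsen"),
    ("MAT241", 2, "Fall", "Gossett", "CC 224", "Nielsen"),
    ("MAT124M", 1, "Fall", "Conrath", "CC 225", "Dowdey"),
    ("MAT124M", 2, "Fall", "Kinney", "CC 229", "Ness"),
    ("MAT422", 2, "Fall", "Kinney", "CC 229", "Ness"),
    ("MAT124M", 1, "Spring", "Conrath", "CC 225", "Dowdey"),
]


def relation(attrs, rows):
    """Build a relation as a set of tuples; each tuple is a frozenset of (attribute, value)."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(rel, attrs):
    """Projection T[attrs]: keep only the named attributes (duplicates collapse)."""
    keep = set(attrs)
    return {frozenset((a, v) for a, v in t if a in keep) for t in rel}


def join(r, s):
    """Natural join r * s: merge each pair of tuples that agree on all shared attributes."""
    result = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                result.add(frozenset({**dt, **du}.items()))
    return result


schedule_b = relation(ATTRS, ROWS)
left = project(schedule_b, ["Instructor", "Teaching Assistant"])
right = project(schedule_b, ["Course", "Section", "Semester", "Instructor", "Office"])
lossless = join(left, right) == schedule_b
print(len(left), len(right), lossless)  # 4 7 True
```

The join is keyed on the shared attribute Instructor. Because Instructor → Teaching Assistant, each tuple of the larger projection matches exactly one tuple of the smaller one, which is why no extra tuples appear.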
Quick Check 12.9
1. Show that the relation ScheduleB[Instructor, Office, Teaching Assistant] is in third normal form. (Hence, it is possible to losslessly decompose ScheduleB into two third normal form relations.)

In the final decomposition in Example 12.23, Instructor is the primary key for the relation ScheduleB[Instructor, Office]. It also appears in the relation ScheduleB[Course, Section, Semester, Instructor, Office] as a foreign key (see footnote 28). Modifying or deleting an instructor's name will require changes to both tables. However, since Instructor always appears as either a primary key or as a foreign key, software implementations of relational databases can typically update all instances automatically.
Footnote 28: A foreign key is an attribute (or set of attributes) that is a primary key in some other relation in the relational database.

Converting Second Normal Form to Third Normal Form

The relation, ScheduleC, in Table 12.15 on page 755 is already in second normal form. In Example 12.22 it was demonstrated that $D$ = {Teaching Assistant} and $B$ = Instructor show that ScheduleC is not in third normal form. The conversion algorithm suggests that ScheduleC can be decomposed into ScheduleC[Instructor, Office, Teaching Assistant] (Table 12.24 on page 761) and ScheduleC[Course, Section, Semester, Instructor, Office] (Table 12.21).

TABLE 12.21 ScheduleC[Course, Section, Semester, Instructor, Office]

Course    Section  Semester  Instructor  Office
MAT241    1        Fall      Gossett     CC 224
MAT241    2        Fall      Gossett     CC 224
MAT124M   1        Fall      Conrath     CC 225
MAT124M   2        Fall      Kinney      CC 229
MAT422    2        Fall      Kinney      CC 229
MAT124M   1        Spring    Conrath     CC 225
ScheduleC[Course, Section, Semester, Instructor, Office] is not in third normal form. To see this, list the essential functional dependencies.

{Course, Section, Semester} → Office
{Course, Section, Semester} → Instructor
Instructor → Office
Office → Instructor

The functional dependency Instructor → Office shows that this relation is not in third normal form. Create two new relations, ScheduleC[Course, Section, Semester, Instructor] (Table 12.22) and ScheduleC[Instructor, Office] (Table 12.23).

TABLE 12.22 ScheduleC[Course, Section, Semester, Instructor]

Course    Section  Semester  Instructor
MAT241    1        Fall      Gossett
MAT241    2        Fall      Gossett
MAT124M   1        Fall      Conrath
MAT124M   2        Fall      Kinney
MAT422    2        Fall      Kinney
MAT124M   1        Spring    Conrath

TABLE 12.23 ScheduleC[Instructor, Office]

Instructor  Office
Gossett     CC 224
Conrath     CC 225
Kinney      CC 229
Both of these new relations are in third normal form. The primary key for the first is {Course, Section, Semester}. There are no other choices for primary key and no functional dependencies with $D$ not the primary key. There are no nonkey attributes in the second new relation. They must therefore be in third normal form.

The relation, ScheduleC[Instructor, Office, Teaching Assistant], in Table 12.24 has the following functional dependencies.

Teaching Assistant → Instructor
Teaching Assistant → Office
Instructor → Office
Office → Instructor

TABLE 12.24 ScheduleC[Instructor, Office, Teaching Assistant]

Instructor  Office  Teaching Assistant
Gossett     CC 224  Nielsen
Conrath     CC 225  Dowdey
Kinney      CC 229  Ness
Conrath     CC 225  Berg

The primary key is therefore Teaching Assistant, but this relation is not in third normal form because of Instructor → Office. Create the new relations, ScheduleC[Teaching Assistant, Instructor] (Table 12.25) and ScheduleC[Instructor, Office] (already available as Table 12.23). Both are in third normal form since every functional dependency has a key on the left.

TABLE 12.25 ScheduleC[Teaching Assistant, Instructor]

Teaching Assistant  Instructor
Nielsen             Gossett
Dowdey              Conrath
Ness                Kinney
Berg                Conrath
The three relations, ScheduleC[Course, Section, Semester, Instructor] (Table 12.22 on page 760), ScheduleC[Instructor, Office] (Table 12.23 on page 760), and ScheduleC[Teaching Assistant, Instructor] (Table 12.25 on page 761), form a third normal form decomposition for ScheduleC.

Other Normal Forms

Although third normal form prevents many potential anomalies from entering a relational database, it does not prevent all such problems. One undesirable property is having an attribute in the primary key be functionally dependent on a nonkey attribute. Problem 17 in Exercises 12.3.4 presents an example of this. The previous example exhibits a similar problem.

Beyond Third Normal Form

Quick Check 12.8 introduced a relation (Table 12.26 on page 762) with team rosters. Suppose that instead of listing whether a player is the captain, it lists the player's Contract ID. League Contract IDs are tied to both team and status, where status is either rookie or veteran. The essential functional dependencies are as follows.

{Team, Player, Position} → Contract ID
Contract ID → Team

This relation is in third normal form because Team is not a nonkey attribute. However, the functional dependency Contract ID → Team may be considered undesirable since Contract ID is a nonkey attribute.

TABLE 12.26 League Rosters

Team              Player    Position     Contract ID
Mud Hens          Casey     first base   mhr
Mud Hens          O'Reilly  pitcher      mhv
Mud Hens          Issacson  pitcher      mhv
Mud Hens          Johnson   catcher      mhv
Mud Hens          Johnson   right field  mhr
Prairie Chickens  Svenson   pitcher      pcr
Prairie Chickens  Johnson   shortstop    pcv
Prairie Chickens  Johnson   catcher      pcr
Prairie Chickens  Hidalgo   left field   pcv
The undesirable feature in this example can be removed by decomposing the relation into {(League Rosters)[Contract ID, Team], (League Rosters)[Player, Position, Contract ID]}. The primary keys are Contract ID and {Player, Position, Contract ID}, respectively. The relations in this decomposition are in what is called Boyce-Codd normal form.

DEFINITION 12.25 Boyce-Codd Normal Form
Let $T$ be a relation in a relational database with attribute set $A$. Let $D \subseteq A$ and $B \notin D$. Then $T$ is in Boyce-Codd normal form if $D \to B$ implies that $D$ contains some key of $T$.

The change from third normal form is the elimination of the requirement that $B$ not be an attribute in any key. For more information on Boyce-Codd normal form, see [60] or [31].

It is tempting to assume that the sequence of normal forms will ultimately reach a form that guarantees that all undesirable features have been eliminated from the database model. Unfortunately, this is not the case. The higher normal forms remove some additional problems but eventually lead to problems with preserving functional dependencies. There are also trade-offs that must be considered. For instance, some database designers might choose to give up third normal form for some relations in the database in order to reduce the number of relations. It is not uncommon to leave both Zip Code and City as attributes in a larger relation, even though City is functionally dependent on Zip Code, and Zip Code may not be a key in the relation. Both theoretical and practical research continues in this area. See [31] for a mathematically oriented overview.
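The dependencies claimed for League Rosters can be tested against the rows of Table 12.26. The sketch below is illustrative only: the function name `holds` is an assumption, and a table instance can refute a functional dependency but cannot prove that it holds for every future row. With that caveat, it shows that Contract ID → Team is consistent with the data while Contract ID determines neither Player nor Position, so the left side of that dependency contains no key, which is exactly the situation Boyce-Codd normal form rules out.

```python
# Rows of League Rosters, copied from Table 12.26.
ATTRS = ("Team", "Player", "Position", "Contract ID")
ROSTER = [
    ("Mud Hens", "Casey", "first base", "mhr"),
    ("Mud Hens", "O'Reilly", "pitcher", "mhv"),
    ("Mud Hens", "Issacson", "pitcher", "mhv"),
    ("Mud Hens", "Johnson", "catcher", "mhv"),
    ("Mud Hens", "Johnson", "right field", "mhr"),
    ("Prairie Chickens", "Svenson", "pitcher", "pcr"),
    ("Prairie Chickens", "Johnson", "shortstop", "pcv"),
    ("Prairie Chickens", "Johnson", "catcher", "pcr"),
    ("Prairie Chickens", "Hidalgo", "left field", "pcv"),
]


def holds(rows, attrs, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by any pair of rows."""
    index = {name: i for i, name in enumerate(attrs)}
    seen = {}
    for row in rows:
        key = tuple(row[index[a]] for a in lhs)
        value = row[index[rhs]]
        # The first row with this key fixes the value; later rows must agree.
        if seen.setdefault(key, value) != value:
            return False
    return True


key_fd = holds(ROSTER, ATTRS, ["Team", "Player", "Position"], "Contract ID")
contract_team = holds(ROSTER, ATTRS, ["Contract ID"], "Team")
contract_player = holds(ROSTER, ATTRS, ["Contract ID"], "Player")
contract_position = holds(ROSTER, ATTRS, ["Contract ID"], "Position")
print(key_fd, contract_team, contract_player, contract_position)
# True True False False
```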
12.3.4 Exercises The exercises marked with P have detailed solutions in Appendix G.
instructor per section.) (b) {Book Title, Author, Publisher, Edition) (You may as-
1. Which of the following tables could represent relations in a relational database? If the table could represent a relation in a relational database, list the attribute set.
sume there is only a single author in each tuple.) (c) {Composition, Composer, Original Instrumentation, Composition Date, Original Performance Date) (Assume
Cast List
(a) 0
Understudy
Actor
Character
Hamnemu
Jn
Garner,
Claudius
Richard Gunther
Bob Searle
Ophelia
Suzanne Bonner Virginia Smith
Sally Richards no understudy
Gertrude
able in the "Textbook-Related Links" section of http://www.mathcs.bethel.edu/-gossett/DiscreteMathWithProof/). The Lady Bird Johnson Wildflower Center Master Plant List Common Name
Family
Indian Mlo
Malvaceae
Indian
Malvaceae
Course
Section
Semester
Instructor
MAT241
1
Fall
Gossettfrtcsm
Genus/ Species Abutilon
MAT241
2
Fall
Gossett
Abutilon
MAT222
1
Spring
Kinney
incanum
MAT124M
I
Fall
Conrath
Pelotazo
Malvaceae
MAT124M
2
Fall
Kinney
Abutilon incanum
MAT 124M
3
Fall
Conrath
My Movie Log Where Watched
Aesculus pavia
Red Buckeye
Hippocastanaceae
Movie
Aesculus pavia var. pavia
Red Buckeye
Hippocastanaceae
Teaching Assignments
(b)
(c) Ponette
Video at home
Fellowship of the Ring
Ritz Theater
Secret of Roan Inish
Theatre Leo
Fellowship of the Ring
Ritz Theater
Princess Mononoke DVD at home 2. Let T be a relation and let A be its attribute set. Prove Proposition 12.2. PROPOSITION 12.2 If {A1 , A2 ... , A1j} C A and B E (A 1 , A 2. A j, then B is functionally dependent on {A 1 , A 2. , Aj. 3.
that there will never be two composers with the same name and composition.) (d) Partial information from The Lady Bird Johnson Wildflower Center Master Plant List has been used for this table (current Web address avail-
• Determine the functional dependencies in the following relations. Ignore trivial dependencies such as Proposition 12.2. Do not include dependencies with more attributes than necessary. For example, if B is functionally dependent on {A 1, A2 }, then it is also functionally dependent on {AI, A2 , C}, for any attribute C. List only the smaller set of attributes. • Then list a good choice for the primary key. Only the attribute sets are provided in some cases. (a) {Course, Section, Semester, Year, Instructor) (You may assume that the database only stores information for current or previous time periods and that there is only one
fruticosum
Mallow
Mallow
4. OPForm the projection (Teaching Assignments)[Course, Instructor]. Teaching Assignments Course MAT241
Section 1
Semester Fall
Instructor Gossett
MAT241 MAT222 MAT124M MATI24M
2
Fall
Gossett
I I 2
Spring Fall Fall
Kinney Conrath Kinney
MAT124M
3
Fall
Conrath
5. Form the following projections for the truncated version of Chores. (a) Chores[Day, Person] (b) Chores[Task, Person]-Sort the result by Task, and within identical tasks sort by Person.
Chores
7. Form the join for the following pairs of relations.
Day
Task
Person
Sunday
cook
Sue
Course
Section
Sunday
dishes
Franka
MAT241
1
Gossett
Monday
vacuum
Beth
MAT241
2
Gossett
Monday
cook
Franka
MAT223
I
Wetzell
Monday
dishes
Beth
MAT124M
I
Conrath
Monday
shop
Sue
MAT124M
2
Kinney
Tuesday
dust
Beth
MAT124M
3
Conratb
Tuesday
cook
Tuesday
dishes
Sue Sue
(a)
Instructor
Room Assignments
6. Form the join for the following pairs of relations. Yearbook
(a) @
Teaching Assignments
Course
Section
Room
MAT241
I
CC325
MAT241
2
CC325
MAT223
I
CC431
Task
Homeroom
MAT 124M
I
CLC 109
Joe
Advertisements
H4
MAT I24M
2
RC424
Martha
Activities
G3
MAT124M
3
AC203
Kim
Teacher Photos
H4
Rosa
Student Photos
D5
Student
(b)
Creation Composer
Composition
Date
John Newton
Amazing Grace Moonlight Sonata
c. 1770
Brandenburg Concertos
1721
Newspaper Student
Feature
Cohort
Amelia
News
Junior
Wesley Kim
City Page Photos
Sophomore Senior
Rosa
Editorials
Junior
Bob
Sports
Sophomore
(b)
Ludwig van Beethoven
Employees Employee ID
Employee Name
Salary Level
21457
Said Sachdev
S3
21490 21688
Millie Volk Dominique Delacroix
SI S2
22000
June Hebert
S4
Salaries
Johann Sebastian Bach
1801
Performance Artist/ Orchestra
Composer
Composition
Charlotte Church Judy Collins
John Newton
Amazing Grace Amazing Grace
Alan Schiller
Ludwig van
FUr Elise
Amsterdam Baroque Orchestra
Johann Sebastion Bach
Brandenburg Concertos
John Newton
Beethoven
8. Prove the following proposition about the composition of projections.
Salary Level
Yearly Pay
S1
$20,000
S2
$25,000
S3
$30,000
S4
$90.000
S
1
$0,000

PROPOSITION 12.3
Let $T$ be a relation with attribute set $\{A_1, \ldots, A_j, B_1, \ldots, B_k, C_1, \ldots, C_n\}$, with $j, k, n > 1$. Set $T' = T[A_1, \ldots, A_j, B_1, \ldots, B_k]$. Then $T'[A_1, \ldots, A_j] = T[A_1, \ldots, A_j]$.
9. Prove the following proposition about the natural join.

PROPOSITION 12.4
Let $T_1$, $T_2$, and $T_3$ be relations with attribute sets $A_1$, $A_2$, $A_3$, respectively. Impose orderings on $A_1 \cup A_2$, $A_2 \cup A_3$, and $A_1 \cup A_2 \cup A_3$. Then
• The join operator is commutative: $T_1 * T_2 = T_2 * T_1$.
• The join operator is associative: $(T_1 * T_2) * T_3 = T_1 * (T_2 * T_3)$.
• If $A_1 \cap A_2 = \emptyset$, then $T_1 * T_2 = T_1 \times T_2$.

10. Prove that any relation that is in third normal form is also in second normal form.

11. For each relation,
• List the essential functional dependencies.
• Determine whether it is in first, second, or third normal form. Your answer should reflect the highest form that applies. Give adequate reasons to justify your answer.

(a) Assume that "Katia and Marielle Labèque" is a single-valued entry for this relational database.

CD-1
Artist                      UPC           Title
Michala Petri               743215911227  The Ultimate Recorder Collection
Enya                        093624742623  A Day Without Rain
Enya                        075992677424  Watermark
Ruth MacKenzie              038146202125  Kalevala: Dream of the Salmon Maiden
Midori                      074645256825  Encore!
Katia and Marielle Labèque  074644838121  Encore!

(b)

CD-2
Title                          Artists
The Daemon Lover               Custer LaRue, The Baltimore Consort
The Mad Buckgoat               The Baltimore Consort
Trio                           Dolly Parton, Linda Ronstadt, Emmylou Harris
Crouching Tiger Hidden Dragon  Tan Dun, Yo-Yo Ma

(c) Assume that all widgets are stored in Warehouse W and all flanges are stored in Warehouse F.

Widgets and Flanges
Part ID  Part Name         Part Location
W1256    Widget (metric)   Warehouse W
W1257    Widget (metric)   Warehouse W
W2276    Widget (English)  Warehouse W
F4       Flange (4 inch)   Warehouse F
F6       Flange (6 inch)   Warehouse F

(d) Assume that there will never be two composers with the same name.

Compositions
Composer                 Composer's Birthdate  Title
Nikolai Rimsky-Korsakov  1844                  Sheherezade
Maurice Ravel            1875                  Sheherezade
Joaquin Rodrigo          1902                  Concierto de Aranjuez
Joaquin Rodrigo          1902                  Fantasia para un gentilhombre
Johann Sebastian Bach    1685                  Concerto for four harpsichords
George Frederic Handel   1685                  Messiah

12. Using the relations in Exercise 11 as the base relations, which of the following decompositions are lossless? Justify your answers.
(a) {CD-1[UPC, Title], CD-1[Title, Artist]}
(b) {CD-1[UPC, Title], CD-1[UPC, Artist]}
(c) {(Widgets and Flanges)[Part ID], (Widgets and Flanges)[Part Name, Part Location]}
(d) {Compositions[Title, Composer], Compositions[Composer, Composer's Birthdate]}

13. Convert the relation, Salaries, from Quick Check 12.8 into third normal form. Use the algorithm on page 758.

14. Use the algorithm on page 758 to convert each relation in Exercise 11 into a collection of relations in third normal form.

15. Consider the following relation.

Members
    Name          Initials
44  Joe Smith     JS
51  Carl Carlson  CC
52  Betty Boop    BB
64  Carl Carlson  CC
75  Bob Burquist  BB

(a) List the essential functional dependencies.
(b) What are the possible primary keys? (c) Which normal form is this table in? (List the highest form.) (d) If the relation is not in third normal form, convert it to a collection of tables in third normal form whose join is the original table. If it is in third normal form, attempt to convert it into a collection of tables in Boyce-Codd normal form whose join is the original table. 16. Example 12.23 introduced the projection ScheduleB[Course, Section, Semester, Instructor, Office]. This relation is in third normal form.
(a) Show that it is not in Boyce-Codd normal form.
(b) Show that the projections ScheduleB[Course, Section, Semester, Instructor] and ScheduleB[Instructor, Office] are in Boyce-Codd normal form.

17. A more realistic course registration database would need to allow multiple professors to have the same name. Suppose that the registrar's office has chosen to use Name and Office to uniquely identify professors (rather than the more sensible decision to assign unique ID numbers). Assume also that offices and phone numbers can be shared, and that more than one phone can be assigned to an office, but that professors with the same name are never assigned to the same office.
(a) Show that Instructor Info is in third normal form.
(b) Show that there is an attribute in the primary key that is functionally dependent on a nonkey attribute.
(c) Convert this relation to a pair of relations in Boyce-Codd normal form.

18. Each of the following statements is either true (always) or false (at least sometimes). Determine which option applies for each statement and provide adequate explanation for your choice.
(a) It is always better to use a higher normal form than to use a lower normal form.
(b) A relation, $T$, can always be recovered from a lossless decomposition (of $T$).
(c) Redundancy in a relational database is usually a desirable feature.
(d) If nonempty relations, $R$ and $T$, have no common attributes, then the join, $R * T$, is empty.

19. Prove that functional dependence is transitive. That is, show that $X \to Y$ and $Y \to Z$ implies that $X \to Z$.

20. Let $T$ be a relation that is in third normal form. If the primary key is the only key in the relation, prove that $T$ is also in Boyce-Codd normal form.
Instructor Info
Name     Office   Phone  Teaching Assistant
Gossett  CC 224   x6131  Nielsen
Conrath  CC 224   x6131  Ness
Kinney   CC 224   x6532  Nygren
Brown    CC 224   x6532  Ness
Jones    HC 414K  x6335  Anderson
Jones    AC 123   x6312  Nelson
12.4 Binary Functions and Binary Expressions

The main goal of this section is to develop some of the primary theory relating to functions whose domain and range is a Boolean algebra. This development will culminate in a canonical form, called disjunctive normal form, for binary functions. A secondary goal is to develop an algorithm for converting a binary expression into disjunctive normal form.

12.4.1 Boolean Functions

DEFINITION 12.26 Single-Variable Boolean Function
Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. A single-variable Boolean function is a function whose domain is $B$ and whose range is $\{0, 1\}$.

A Single-Variable Boolean Function
Let $\mathcal{B}$ be the Boolean algebra defined in Example 2.29. Recall that the associated set is $B = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$. A single-variable Boolean function on $\mathcal{B}$ can be specified by showing its action for each element in $B$.

x            f(x)
$\emptyset$  0
$\{a\}$      0
$\{b\}$      1
$\{a, b\}$   0
One easy-to-answer question is, "How many single-variable Boolean functions are there with domain $B$?" The answer depends on the size of the associated set.

The Number of Single-Variable Boolean Functions on $\mathcal{B}$
Let $\mathcal{B}$ be a Boolean algebra with associated set $B$. If $|B| = m$, then there are $2^m$ distinct single-variable Boolean functions on $\mathcal{B}$.

Proof: Two functions are distinct if they disagree on at least one element in $B$. A function can independently assign either 0 or 1 to each element in $B$. Since there are $m$ elements, each with two possible values, there are $2^m$ distinct functions. $\square$

All Single-Variable Boolean Functions
Table 12.27 shows all 16 possible Boolean functions for the Boolean algebra in Example 2.29.

TABLE 12.27 All 16 Boolean functions for the Boolean algebra with associated set $B = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$

x            f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16
$\emptyset$  0  0  0  0  0  0  0  0  1  1   1   1   1   1   1   1
$\{a\}$      0  0  0  0  1  1  1  1  0  0   0   0   1   1   1   1
$\{b\}$      0  0  1  1  0  0  1  1  0  0   1   1   0   0   1   1
$\{a, b\}$   0  1  0  1  0  1  0  1  0  1   0   1   0   1   0   1
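The counting argument can be replayed by brute force. This is a small sketch under the assumption that the four elements of $B$ are modeled as Python frozensets; the variable names are illustrative. It assigns 0 or 1 independently to each element and collects the resulting functions, which come out in the same order as the columns f1 through f16 of Table 12.27.

```python
from itertools import product

# The associated set B = {∅, {a}, {b}, {a, b}}, in the row order of Table 12.27.
B = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]

# Each function is one independent choice of 0 or 1 for every element of B.
functions = [dict(zip(B, values)) for values in product((0, 1), repeat=len(B))]

print(len(functions), 2 ** len(B))  # 16 16
# functions[1] plays the role of f2: it is 1 only at {a, b}.
print([functions[1][x] for x in B])  # [0, 0, 0, 1]
```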
Single-variable Boolean functions are not as interesting as multivariable Boolean functions, which are defined next.

DEFINITION 12.27 Multivariable Boolean Function
Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. An n-variable Boolean function is a function whose domain is $B \times B \times \cdots \times B$ ($n$ times) and whose range is $\{0, 1\}$.

A Multivariable Boolean Function
Let $\mathcal{B}$ be the Boolean algebra defined in Example 2.29. A two-variable Boolean function on $\mathcal{B}$ can be specified by showing its action for each pair of elements in $B$. Table 12.28 on page 768 shows one such function.

The proof of the following theorem is left as an exercise.

The Number of Multivariable Boolean Functions on $\mathcal{B}$
Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. If $|B| = m$, then there are $2^{m^n} = 2^{(m^n)}$ distinct n-variable Boolean functions on $\mathcal{B}$.
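For small cases the formula can be compared with a direct count. The sketch below is an illustration rather than a proof: `count_boolean_functions` is a hypothetical helper, and the strings in `B` are stand-ins for the four elements of the associated set. It builds the $n$-fold product domain and counts every way of assigning 0 or 1 to its points.

```python
from itertools import product


def count_boolean_functions(elements, n):
    """Count the functions from the n-fold product of `elements` to {0, 1} by enumeration."""
    domain = list(product(elements, repeat=n))  # m**n points
    # One function per assignment of 0 or 1 to each point of the domain.
    return sum(1 for _ in product((0, 1), repeat=len(domain)))


B = ["empty", "a", "b", "ab"]  # stand-ins for ∅, {a}, {b}, {a, b}; here m = 4

print(count_boolean_functions(B, 1), 2 ** (4 ** 1))  # 16 16
print(count_boolean_functions(B, 2), 2 ** (4 ** 2))  # 65536 65536
```

The exponent is itself a power, so the count grows very quickly: with $m = 4$ and $n = 3$ there are already $2^{64}$ functions, far too many to enumerate this way.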
TABLE 12.28 A twovariable Boolean function on B = 40, (a], {b], (a, bl x y f(x, y) 0 0 0 {a} [b)
0 0
0 1
[a, bl
0
0
0 {a}
{a) {a})
0
{b)
{a}
{a, b}
{a}
0
0
(b}
1
{a}
{b)
0
{b}
{b}
(a, b)
{b}
I
0
{a, b}
0
[a} {b}
la,b} {a,bl
1 1
{a, b}
{a, bl
0
U
The most important case is when the Boolean algebra is the one defined in Example 2.28. For that example, $B = \{0, 1\}$. Since the remainder of this section will mainly be interested in this case, another definition will be helpful.

DEFINITION 12.28 Binary Function
A binary function of order $n$ is an n-variable Boolean function on a Boolean algebra whose associated set is $B = \{0, 1\}$.

The counting theorem for this definition is a corollary of Theorem 12.7.

COROLLARY 12.2 The Number of Binary Functions
There are $2^{2^n}$ distinct binary functions of order $n$.

The collection of all possible binary functions of order 2 is shown in the next example.

All Binary Functions of Order 2
Table 12.29 shows all 16 possible binary functions of order 2. Compare this table with Table 12.27 on page 767.

12.4.2 Binary Functions and Disjunctive Normal Form

Up to this point, there has been no attempt to produce algebraic expressions that define a Boolean function. It is time to consider this for binary functions. Recall Definition 2.20 on page 53. This definition can be restricted to the case of current interest.
TABLE 12.29 All 16 binary functions of order 2

x  y  f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16
0  0  0  0  0  0  0  0  0  0  1  1   1   1   1   1   1   1
0  1  0  0  0  0  1  1  1  1  0  0   0   0   1   1   1   1
1  0  0  0  1  1  0  0  1  1  0  0   1   1   0   0   1   1
1  1  0  1  0  1  0  1  0  1  0  1   0   1   0   1   0   1

DEFINITION 12.29 Binary Variable; Binary Expression
A binary variable is one whose possible values are either 0 or 1. A binary expression is an algebraic expression that is composed using the symbols 0, 1, $+$, $\cdot$, $\overline{\phantom{x}}$ (complement), and binary variables.

A Binary Expression
The expression $0 + 1 \cdot x \cdot \overline{y}$ is a binary expression in two variables. It is instructive to evaluate this expression for each possible value of $x$ and $y$.

x  y  $0 + 1 \cdot x \cdot \overline{y}$
0  0  0
0  1  0
1  0  1
1  1  0
Note that this binary expression corresponds to the function, f3, in Example 12.29.
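The correspondence with $f_3$ can be confirmed by evaluating the expression at all four inputs and looking the resulting column up among the 16 columns of Table 12.29. This is a minimal sketch: the function name `expression` is an assumption, and Boolean sum, product, and complement are modeled with Python's `|`, `&`, and `1 - y` on the integers 0 and 1.

```python
from itertools import product


def expression(x, y):
    """Evaluate 0 + 1 . x . (complement of y) using Boolean sum and product."""
    complement_y = 1 - y
    return 0 | (1 & x & complement_y)


# Inputs in the row order of Table 12.29: (0,0), (0,1), (1,0), (1,1).
inputs = list(product((0, 1), repeat=2))
column = tuple(expression(x, y) for x, y in inputs)

# The 16 binary functions of order 2, generated in the same order as f1..f16.
all_columns = list(product((0, 1), repeat=4))
index = all_columns.index(column) + 1
print(column, "matches f%d" % index)  # (0, 0, 1, 0) matches f3
```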
When you studied functions over the real numbers, you may have encountered functions that have no corresponding algebraic expression. For example, a function that is important in numerical analysis and probability theory is the error function, commonly denoted by $\operatorname{erf}(x)$, and defined by

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt.$$

Even if the most common trigonometric or transcendental functions [such as $e^x$ and $\ln(x)$] are used, there is still no simple expression that defines this function; it has no easy-to-express integration-free antiderivative (see footnote 29).

The situation for binary functions is much simpler. It will soon be shown that every binary function (no exceptions) can be expressed as a binary expression. Thus, the correspondence between $f_3$ and the binary expression in Example 12.30 was not an accident. The formal proof will depend upon the notion of a minterm.

DEFINITION 12.30 Minterm
Let $x_1, x_2, \ldots, x_n$ be $n$ binary variables. A minterm is a binary expression in the form

$$\tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n,$$

where $\tilde{x}_i$ is either $x_i$ or $\overline{x}_i$, for $i = 1, 2, \ldots, n$. There is an assumed ordering of the variables.

Footnote 29: For more information, see the "Textbook-Related Links" section at http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/.

In other words, a minterm is a product, in order, of the $n$ binary variables, each appearing in either complemented or uncomplemented form. There are no summations in the expression.

Some Minterms
The following are all minterms in the 4 binary variables, $x_1$, $x_2$,
" X1 "* X1 "* XI
T2"X3
X3, X4.
X4
X2
X3"X4
X2
X3
X4
The binary expressions in the next list are not minterms in x1,
X2, X3, X4.
"•XJ X3 • X4 Missing x2 Cannot include summations "* X1 X2 + X3 XX4 "* X2 x- X3 • X4 Variables are out of order "* X X2 • X3 • T4Complement should only involve single variables
E
DEFINITION 12.31 Disjunctive Normal Form A binary expression is in disjunctive normal form if it is either a sum of distinct minterms or it is the expression, 0.
IV Q~uic-k _Ch-eck 12_.,10-Assume that all expressions in this Quick Check may be constructed using the binary variables, X1 , X2 and x3. 1. Which of the following binary expressions in xI, X2, X3 are in disjunctive sions are minterms in xi, X2, X3? If an normal form? If an expression is not expression is not a minterm, explain in disjunctive normal form, explain why it isn't, why it isn't.
(a) 1 x I.F2j.3 (b) 0 (c) T1
-XT3 following binary expresthe 2. Which of
(a) xl .X2-x3+x.I (b) 0 +Xi .2 X3 ( Ox2 (c)
T1 - X2
T3
X2x3
EVI
It is now time to start proving the main results in this section. This will be done by proving a sequence of intermediate propositions. The following familiar ideas will be used.

• Two binary expressions are equivalent if one can be transformed into the other using the axioms and fundamental properties of Boolean algebras.
• Two binary functions are equal if they have identical values at each element of their common domain.

The major results that will be proved are summarized as follows.

1. Every binary function can be expressed as a binary expression in disjunctive normal form.
2. Every binary expression is equivalent to a binary expression in disjunctive normal form.

These results indicate that disjunctive normal form is a canonical form for binary functions and binary expressions. In mathematics, a canonical form is a standard form into which all members of some class of mathematical objects can be placed. Classes of mathematical objects that have a canonical form demonstrate an underlying regularity and order that is considered desirable.
PROPOSITION 12.5 Evaluating Minterms
Let $\tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n$ be a minterm in the binary variables, $x_1, x_2, \ldots, x_n$. Define an n-variable binary function, $f$, as

$$f(x_1, x_2, \ldots, x_n) = \tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n.$$

Then $f$ has the value 1 at only one element in its domain; it has the value 0 at all other elements of its domain. The n-tuple at which $f$ has the value 1 is determined by setting

$$x_i = \begin{cases} 1 & \text{if } \tilde{x}_i = x_i \\ 0 & \text{if } \tilde{x}_i = \overline{x}_i \end{cases} \qquad \text{for } i = 1, 2, \ldots, n.$$

An example will demonstrate how simple the proof will be.

A Minterm as a Function
Let $f(x_1, x_2, x_3) = x_1 \cdot \overline{x}_2 \cdot x_3$. It is not hard to list the value of $f$ at each of the 8 ordered triples in its domain. However, a simple observation will achieve the same goal. Notice that $f$ must evaluate to 0 if any of the three factors, $x_1$, $\overline{x}_2$, $x_3$, is 0. The only way to make each of these factors evaluate to 1 is to set $x_1 = 1$, $x_2 = 0$, and $x_3 = 1$.

Proof of Proposition 12.5: The function, $f$, is defined as a product of the $n$ factors, $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n$. If any one of those factors evaluates to 0, then $f$ will also evaluate to 0. The only way to make each of the factors evaluate to 1 is to make the assignments specified in the statement of the proposition. $\square$
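Proposition 12.5 is easy to observe by exhaustive evaluation. In the sketch below, which is only an illustration, a minterm is encoded by a hypothetical `pattern` tuple: entry 1 means the variable appears uncomplemented and entry 0 means it appears complemented. The helper `minterm` returns the corresponding binary function, and the loop collects every input at which that function equals 1.

```python
from itertools import product


def minterm(pattern):
    """Return the binary function for the minterm described by `pattern`.

    pattern[i] == 1 means x_(i+1) appears uncomplemented; 0 means complemented.
    """
    def f(*xs):
        value = 1
        for x, uncomplemented in zip(xs, pattern):
            factor = x if uncomplemented else 1 - x
            value &= factor  # Boolean product of the factors
        return value
    return f


# The minterm x1 . (complement of x2) . x3 from the example above.
f = minterm((1, 0, 1))
ones = [xs for xs in product((0, 1), repeat=3) if f(*xs) == 1]
print(ones)  # [(1, 0, 1)]
```

The single input where the function is 1 is the pattern itself, which is the assignment described in the statement of the proposition.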
The critical observation in the proof of the first main result essentially reverses Proposition 12.5. If the ordered n-tuples at which a binary function, f, equals 1 are known, then a set of minterms that evaluate to I at those n-tuples can be constructed. If these minterms are properly combined, a binary expression that represents f will 30 result. Every Binary Function can be Expressed in Disjunctive Normal Form n. Then there is a Let f be a binary function in the binary variables, Xl, x2.....x binary expression in disjunctive normal form that is equal to f when viewed as a binary function. The proof will be easier to follow if an example is available.
From Binary Function to Binary Expression Let f be a binary function in the binary variables, xl, x 2 , X3, with values defined by Table 12.30 on page 772. There are three ordered triples at which f has the value 1. Using the main idea of Proposition 12.5, it should be clear that the three minterms, xl - x2 • x3, TT • X2 •X3, and 30
The constructive proof that follows is very similar in spirit to the standard construction of a Lagrange
interpolating polynomial. Consult a numerical analysis text for details.
772
Chapter 12 Functions, Relations, Databases, and Circuits
TABLE 12.30 A binary function of x1, x2, and x3
x1  x2  x3  f(x1, x2, x3)
0   0   0   1
0   0   1   0
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   0
1   1   0   1
1   1   1   0
$x_1 \cdot x_2 \cdot \bar{x}_3$, each have the value 1 at an ordered triple where f has the value 1. Adding these minterms will produce a binary expression that evaluates to the same value as f at every triple in the domain. This occurs because the ordered triples, (0, 0, 0), (0, 1, 1), and (1, 1, 0), will each cause one of the terms to evaluate to 1 and the other two terms to evaluate to 0. Any other ordered triple will cause each of the minterms to evaluate to 0. We can thus write (informally blurring the distinction between f and the sum of minterms)
$$f(x_1, x_2, x_3) = \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot x_2 \cdot x_3 + x_1 \cdot x_2 \cdot \bar{x}_3.$$
This expression is in disjunctive normal form. ■
Proof of Theorem 12.8: If f evaluates to 0 at every ordered n-tuple in its domain, then $f(x_1, x_2, \ldots, x_n) = 0$ is a representation of f as a binary expression in disjunctive normal form. Otherwise, for each ordered n-tuple at which f evaluates to 1, define a corresponding minterm, $\tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n$, where, for $i = 1, 2, \ldots, n$,
$$\tilde{x}_i = \begin{cases} x_i & \text{if the ith coordinate of the n-tuple is 1} \\ \bar{x}_i & \text{if the ith coordinate of the n-tuple is 0.} \end{cases}$$
The sum of these minterms is a binary expression in disjunctive normal form. It has been constructed so that exactly one of the minterms in the sum will evaluate to 1 at each ordered n-tuple where f evaluates to 1. At any other ordered n-tuple, each minterm will evaluate to 0, so the sum will also evaluate to 0. □
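The construction in the proof can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the helper names (`dnf_from_function`, `minterm_to_text`, `evaluate_dnf`) are invented here, each minterm is stored as the n-tuple at which it equals 1, and a trailing apostrophe stands in for the complement bar. The data is the function of Table 12.30.

```python
from itertools import product

def dnf_from_function(f, n):
    """One minterm per n-tuple where f is 1; a minterm is stored as that tuple."""
    return [t for t in product((0, 1), repeat=n) if f(*t) == 1]

def minterm_to_text(t):
    """Render a minterm; an apostrophe marks a complemented variable."""
    return "·".join(f"x{i}" + ("" if bit else "'") for i, bit in enumerate(t, start=1))

def evaluate_dnf(minterms, point):
    """Evaluate the sum of minterms: each factor is x_i or its complement."""
    def term_value(t):
        return int(all(p if bit else 1 - p for bit, p in zip(t, point)))
    return int(any(term_value(t) for t in minterms))

# Table 12.30: f is 1 at (0,0,0), (0,1,1), and (1,1,0), and 0 elsewhere.
ones = {(0, 0, 0), (0, 1, 1), (1, 1, 0)}
f = lambda *t: int(t in ones)

minterms = dnf_from_function(f, 3)
expression = " + ".join(minterm_to_text(t) for t in minterms) or "0"
print(expression)  # x1'·x2'·x3' + x1'·x2·x3 + x1·x2·x3'
```

The `or "0"` fallback mirrors the first case of the proof, where f is 0 everywhere.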
Quick Check 12.11
1. Let f be the binary function defined by the following table. Construct a binary expression in disjunctive normal form that represents a binary function that is equal to f.
x1  x2  x3  f(x1, x2, x3)
0   0   0   0
0   0   1   0
0   1   0   1
0   1   1   1
1   0   0   0
1   0   1   0
1   1   0   1
1   1   1   1
12.4 Binary Functions and Binary Expressions
773
12.4.3 Binary Expressions and Disjunctive Normal Form
The sequence of propositions leading up to the second major result is patterned after the sequence that is used in [53]. It may be helpful to review the axioms and fundamental properties for Boolean algebras (see Section 2.5).
PROPOSITION 12.6 Moving Complements onto Single Variables
Every binary expression is equivalent to a binary expression in which the only occurrences of the complement operator, $\overline{\phantom{x}}$, are to complement single variables.
Proof: Every binary expression is equivalent to itself, so if every complement operator in the original expression already involves only a single variable (or if no complement operators appear), there is nothing to prove.
If either $\bar{0}$ or $\bar{1}$ appears in the expression, an equivalent expression can be obtained by using the substitutions $\bar{0} \to 1$ and $\bar{1} \to 0$. The involution property can be used to eliminate multiple complements of the same subexpression. Any other complements must involve subexpressions containing one (or both) of the operators, + and ·. Associativity and De Morgan's laws can be used to reduce these to subexpressions in which the complements involve strictly smaller subexpressions.
These transformations can be applied recursively, generating a sequence of equivalent binary expressions. Eventually, an expression in the form asserted by the proposition must be reached because binary expressions are finite, and each transformation makes progress toward the stated goal. □
Moving Complements onto Single Variables
Consider the binary expression 1 • xl •X2 +
X2
• (x3 +
70).
One possible sequence of
transformations is shown below. 1 •X1 •X2 +X2 - (x3 +T4)
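= $\overline{1 \cdot x_1 \cdot x_2} + x_2 \cdot (x_3 + \bar{x}_4)$    involution
= $\overline{1 \cdot (x_1 \cdot x_2)} + x_2 \cdot (x_3 + \bar{x}_4)$    associativity
= $(\bar{1} + \overline{x_1 \cdot x_2}) + x_2 \cdot (x_3 + \bar{x}_4)$    De Morgan
= $(0 + \overline{x_1 \cdot x_2}) + x_2 \cdot (x_3 + \bar{x}_4)$    complement of 1
= $(0 + \bar{x}_1 + \bar{x}_2) + x_2 \cdot (x_3 + \bar{x}_4)$    De Morgan
■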
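The recursive procedure in the proof of Proposition 12.6 can be sketched as a function on expression trees. Everything about the representation is an assumption made for illustration: expressions are nested tuples tagged "const", "var", "not", "and", or "or", sums and products are binary, and the names `push_complements` and `evaluate` are hypothetical. The input is the expression as it stands after the involution step of the example.

```python
from itertools import product

def push_complements(e, negate=False):
    """Return an equivalent expression whose complements sit only on variables."""
    kind = e[0]
    if kind == "const":
        return ("const", 1 - e[1]) if negate else e      # complement of 0 or 1
    if kind == "var":
        return ("not", e) if negate else e
    if kind == "not":
        return push_complements(e[1], not negate)        # involution
    # De Morgan: a complemented sum becomes a product of complements, and vice versa.
    op = {"and": "or", "or": "and"}[kind] if negate else kind
    return (op, push_complements(e[1], negate), push_complements(e[2], negate))

def evaluate(e, env):
    """Evaluate an expression tree at an assignment of 0/1 values."""
    kind = e[0]
    if kind == "const":
        return e[1]
    if kind == "var":
        return env[e[1]]
    if kind == "not":
        return 1 - evaluate(e[1], env)
    a, b = evaluate(e[1], env), evaluate(e[2], env)
    return a & b if kind == "and" else a | b

x1, x2, x3, x4 = (("var", name) for name in ("x1", "x2", "x3", "x4"))
# complement(1 · (x1 · x2)) + x2 · (x3 + complement(x4))
expr = ("or",
        ("not", ("and", ("const", 1), ("and", x1, x2))),
        ("and", x2, ("or", x3, ("not", x4))))
result = push_complements(expr)
print(result)
```

The printed tree corresponds to $(0 + \bar{x}_1 + \bar{x}_2) + x_2 \cdot (x_3 + \bar{x}_4)$, the last line of the example. The `negate` flag carries a pending complement downward, so no intermediate tree is ever built.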
PROPOSITION 12.7 Transforming to a Sum of Products
Let E be a binary expression in which all occurrences of the complement operator involve only single variables. Then E is equivalent to a binary expression that is a sum of products, where each product contains factors that are either a constant, 0 or 1, or are complemented or uncomplemented single variables. A product can consist of a single such factor.
Proof: The associative and distributive properties are all that is needed. Suppose there is a factor that is not one of the four acceptable entities. If the factor is a product of acceptable entities, then the associative property allows the factor to be ungrouped into a product of factors that are acceptable. If the factor is the only factor in the term, and the addition operator appears one or more times, the associative property can be used to convert the term into a sum of strictly simpler terms. Otherwise, the factor must contain at least one addition operator and there must be at least one other factor. The distributive property can be used to transform the term into two strictly simpler terms.
These transformations can be applied recursively, generating a sequence of equivalent binary expressions. Eventually, an expression in the form asserted by the proposition must be reached because binary expressions are finite, and each transformation makes progress toward the stated goal. □
Transforming to a Sum of Products
The final binary expression in Example 12.34 can be transformed into an equivalent binary expression that is a sum of products.
$(0 + \bar{x}_1 + \bar{x}_2) + x_2 \cdot (x_3 + \bar{x}_4)$
= $0 + \bar{x}_1 + \bar{x}_2 + x_2 \cdot (x_3 + \bar{x}_4)$    associativity
= $0 + \bar{x}_1 + \bar{x}_2 + (x_2 \cdot x_3) + (x_2 \cdot \bar{x}_4)$    distributivity
= $0 + \bar{x}_1 + \bar{x}_2 + x_2 \cdot x_3 + x_2 \cdot \bar{x}_4$    associativity
■
PROPOSITION 12.8 Transforming Sums of Products to Disjunctive Normal Form
Let E be a binary expression that is a sum of products in which each factor is either 0, 1, or a complemented or uncomplemented single variable. Then E is equivalent to a binary expression in disjunctive normal form.
Note that disjunctive normal form implies a pre-established ordering among the variables so that it is possible to discuss minterms.
Proof: If a term includes 0 as a factor, then the term can be replaced by the term containing only 0. Then all terms consisting of only the factor, 0, may be removed (using the identity axiom). The exception will be an expression that only contains a sum of terms that are each 0. In that case, all but one 0 may be removed.
If a term includes 1 as a factor and also contains other factors, the factor, 1, can be removed (using the identity axiom). If a term consists only of the constant, 1, and there are other terms, then all terms except the term consisting of only 1 may be removed (using the domination axiom).
At this point, either the entire expression is the constant, 0, and the expression is already in disjunctive normal form, or the entire expression is the constant, 1, or else the constants 0 and 1 do not appear in the expression.
Let the set of variables be $\{x_1, x_2, \ldots, x_n\}$. For the remainder of this proof, a term will be said to contain $x_i$ if either $x_i$ or $\bar{x}_i$ is a factor of the term. Assume also that 0 does not appear in the expression. The expression will be transformed into disjunctive normal form using the following replacement algorithm.
for i = 1, 2, ..., n
    while E contains a term, t, which does not contain $x_i$
        replace t with $t \cdot x_i + t \cdot \bar{x}_i$
The validity of the replacement step is proved below.
$t = t \cdot 1$    identity axiom
= $t \cdot (x_i + \bar{x}_i)$    complement axiom
= $(t \cdot x_i) + (t \cdot \bar{x}_i)$    distributivity axiom
= $t \cdot x_i + t \cdot \bar{x}_i$    associativity
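At each step of the algorithm, there is a smaller total number of missing variables. Since there is a finite number of terms in the original expression and a finite number of missing variables, the process must eventually terminate with an expression in disjunctive normal form. Note that the expression, 1, is missing all n variables, but the algorithm still applies (starting with the replacement $1 = x_1 + \bar{x}_1$).
When the algorithm terminates, every term will contain all n variables. The commutative law can then be used to sort the factors in each term, using the preestablished ordering of the variables. If a term contains multiple copies of $x_i$ (or of $\bar{x}_i$), the idempotence property can be used to remove all but one copy. If a term contains both $x_i$ and $\bar{x}_i$, the complement axiom implies that the term is equivalent to the term, 0. If there are multiple 0 terms, all but one can be eliminated using the identity axiom. Finally, if 0 is the only term, the expression is in disjunctive normal form. Otherwise, the resulting expression is in disjunctive normal form. □
Transforming Sums of Products to Disjunctive Normal Form
The final expression in Example 12.35 can be transformed into an equivalent binary expression in disjunctive normal form using the following steps.
$0 + \bar{x}_1 + \bar{x}_2 + x_2 \cdot x_3 + x_2 \cdot \bar{x}_4$
= $\bar{x}_1 + \bar{x}_2 + x_2 \cdot x_3 + x_2 \cdot \bar{x}_4$    identity
= $\bar{x}_1 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 + x_2 \cdot \bar{x}_4$    replacement algorithm (i = 1)
= $\bar{x}_1 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4$    replacement algorithm (i = 1)
= $\bar{x}_1 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 1)
= $\bar{x}_1 \cdot x_2 + \bar{x}_1 \cdot \bar{x}_2 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 2)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 3)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_2 \cdot x_1 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 3)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_2 \cdot x_1 \cdot x_3 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 + \bar{x}_2 \cdot \bar{x}_1 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 3)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_2 \cdot x_1 \cdot x_3 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot x_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot \bar{x}_3 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 3)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_2 \cdot x_1 \cdot x_3 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot x_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot \bar{x}_3 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 \cdot x_3 + x_2 \cdot \bar{x}_4 \cdot x_1 \cdot \bar{x}_3 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1$    replacement algorithm (i = 3)
= $\bar{x}_1 \cdot x_2 \cdot x_3 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 + \bar{x}_2 \cdot x_1 \cdot x_3 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot x_3 + \bar{x}_2 \cdot \bar{x}_1 \cdot \bar{x}_3 + x_2 \cdot x_3 \cdot x_1 + x_2 \cdot x_3 \cdot \bar{x}_1 + x_2 \cdot \bar{x}_4 \cdot x_1 \cdot x_3 + x_2 \cdot \bar{x}_4 \cdot x_1 \cdot \bar{x}_3 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1 \cdot x_3 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1 \cdot \bar{x}_3$    replacement algorithm (i = 3)
Another 10 steps (with i = 4) of the replacement algorithm yields the final expression in disjunctive normal form.
$$\bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ \bar{x}_2 \cdot x_1 \cdot x_3 \cdot x_4 + \bar{x}_2 \cdot x_1 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 \cdot \bar{x}_4
+ \bar{x}_2 \cdot \bar{x}_1 \cdot x_3 \cdot x_4 + \bar{x}_2 \cdot \bar{x}_1 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_2 \cdot \bar{x}_1 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_2 \cdot \bar{x}_1 \cdot \bar{x}_3 \cdot \bar{x}_4
+ x_2 \cdot x_3 \cdot x_1 \cdot x_4 + x_2 \cdot x_3 \cdot x_1 \cdot \bar{x}_4 + x_2 \cdot x_3 \cdot \bar{x}_1 \cdot x_4 + x_2 \cdot x_3 \cdot \bar{x}_1 \cdot \bar{x}_4
+ x_2 \cdot \bar{x}_4 \cdot x_1 \cdot x_3 + x_2 \cdot \bar{x}_4 \cdot x_1 \cdot \bar{x}_3 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1 \cdot x_3 + x_2 \cdot \bar{x}_4 \cdot \bar{x}_1 \cdot \bar{x}_3$$
The individual terms can now be sorted.
$$\bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ x_1 \cdot \bar{x}_2 \cdot x_3 \cdot x_4 + x_1 \cdot \bar{x}_2 \cdot x_3 \cdot \bar{x}_4 + x_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4 + x_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot \bar{x}_4
+ x_1 \cdot x_2 \cdot x_3 \cdot x_4 + x_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4 + \bar{x}_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4
+ x_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4 + x_1 \cdot x_2 \cdot \bar{x}_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot x_2 \cdot x_3 \cdot \bar{x}_4 + \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 \cdot \bar{x}_4$$
The remaining steps in the proof are not needed for this example. ■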
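The replacement algorithm is mechanical enough to sketch in code. The representation below is an assumption made for illustration: each product term is a dictionary mapping a variable's subscript to 1 (uncomplemented) or 0 (complemented), and the name `expand_to_minterms` is hypothetical. One pass of the inner loop handles every term missing $x_i$, which has the same effect as the while loop in the proof.

```python
from itertools import product

def expand_to_minterms(terms, n):
    """Apply t -> t·x_i + t·complement(x_i) for every term missing x_i, i = 1..n."""
    terms = [dict(t) for t in terms]
    for i in range(1, n + 1):
        expanded = []
        for t in terms:
            if i in t:
                expanded.append(t)
            else:
                expanded.append({**t, i: 1})   # t · x_i
                expanded.append({**t, i: 0})   # t · complement of x_i
        terms = expanded
    # Sorting the factors of each term by subscript gives a 0/1 tuple.
    return [tuple(t[i] for i in range(1, n + 1)) for t in terms]

# complement(x1) + complement(x2) + x2·x3 + x2·complement(x4), after the identity step
start = [{1: 0}, {2: 0}, {2: 1, 3: 1}, {2: 1, 4: 0}]
minterms = expand_to_minterms(start, 4)
unique = sorted(set(minterms), reverse=True)   # x_i sorts before its complement

print(len(minterms), len(unique))
print(set(product((0, 1), repeat=4)) - set(minterms))
```

For this input the sketch produces 24 minterms, 15 of them distinct, and the only 4-tuple not represented is (1, 1, 0, 1).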
The previous example indicates that Proposition 12.8 does not accomplish everything we would like. In particular, the replacement algorithm adds unnecessary redundancy to the expression that is in disjunctive normal form. For instance, the final expression begins with the term $\bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4$ but later contains another copy of that term. By using the commutativity axiom and the idempotence property, the multiple copies of $\bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4$ can be combined into a single term, $\bar{x}_1 \cdot x_2 \cdot x_3 \cdot x_4$. This is true in general.
THEOREM 12.9 Every Binary Expression is Equivalent to a Unique Expression in Disjunctive Normal Form
Every binary expression is equivalent to a unique expression in disjunctive normal form. The uniqueness requires a preestablished lexicographical ordering of the variables.
Proof: The previous three propositions establish the assertion that a binary expression, E, is equivalent to a binary expression, D, which is in disjunctive normal form.
Proposition 12.8 assumes that some lexicographical ordering of variables exists (if the variables are subscripted, the natural subscript order can be used). The (possibly complemented) variables in each term of D are in lexicographical order. These terms are all minterms. Make the additional assumption that the uncomplemented variable, $x_i$, comes before its own complement, $\bar{x}_i$. With this additional ordering, the collection of minterms can be sorted.
Now use the idempotence property to remove all but one copy of any identical terms. Call the final expression C. Since there are no duplicate terms in C, each minterm has an imposed internal sorting, and there is a well-defined sort order among the minterms, C is unique. □
A
ED
Unique Expression
The overly long expression at the end of Example 12.36 can be placed into the unique
form specified by the previous theorem using the natural ordering imposed by the subscripts. That expression is repeated here. XT-- X2 'X3 -X4 + T1 -X2 T2
+-'l
± X+
.
T•
X3
X4 +
X3-
X4 + X1
5X2" X3 "X4
±X1
X2
X3
+
TX2 X3
X1
X4 + X1
+-Xl"X2 X3 'F4
"7X2 'X- 3 " X4 + J- '
X3 "X4 +
T2
"X2 'X3
X3 "X4 + X1 - )2
02" X3 X2
X4 + -Xl
X4
+
x3
3XF1 2
X3" x4 + -
X2
±
"x2 "F3
"T4
X4 + X1 "T2 "x3 "x4
x3 "X4
X2
+ X1 X2 X3 X4 + TX1
X4
X 2 x23 'x4
+
TX
T2" x3
X4
X3" X4 + -xl
X2
X3
X22 3"T4
x4-
X1
X3 "x4
The minterms can be sorted, using the convention that xi comes before 3F. Xl
X2 - X3 • X4 + X1 • X2 - X3 •4 X
+-Xl'2
X3
"+
X4 + Xl
x2
X3
X2 XF" X33
+ X1 -
T"
X2
X33 X4 +
T-
X2
X3 "x4
+-Xl X2 "X-'3 X4 +
X3" X4
+ XY--
"+ W "+ý-l
"2"
"
'3F *2
X3
X4 +-TX x2 "X3
X2 • X3 • x54 + Xl
T4 + Xl X4 +
x2
W1. X2
X4 + XT
x3
X2 • x3 • x4 X
X4 + Xl "x2
X3 • x4 + xl
X"2X3 X-E X3
"X4
-xl+"X2
X4+-I-
X4 + -IT- "x2 "i.3 "F4
X2
+l
X2
x3
x4
x3 • X
X4
X3 "4 X3
4
"x2"T3 3F4
778
Chapter 12 Functions, Relations, Databases, and Circuits Finally, remove duplicate minterms. X1 "X2 'X3
X3 "
' X4 + X
"- X1 x2 "- Y" X2 "--l TX2
X3
+ -4X
X3 •X4
+
X1
X3
+
xl "X2
x4
' x4 + X12T3
+
X1
TX- X4
+
•F1 X2
X3 •X4
X4
+
x1 -x2
x3
-x2 .x3 .X4 X2
+
x4'
X3
x2
X1 -x2
x3 "x4 +
+
X3 "X4
X2 " XT •2
X3
X4
X3 • X4
T4
A good way to check to see if any mistakes were made in this long process is to compare this result with the original expression, thought of as a binary function. The minterms in the final expression indicate that the final expression will evaluate to 1 at the 15 ordered 4-tuples: {(1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 1, 1), (1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 0, 0), (0, 1, 1, 1), (0, 1, 1, 0), (0, 1, 0, 1), (0, 1, 0, 0), (0, 0, 1, 1), (0, 0, 1, 0), (0, 0, 0, 1), (0, 0, 0, 0)}.
Table 12.31 shows values for the function defined by the original expression. Note that whenever either $x_1$ or $x_2$ is 0, the term $\overline{1 \cdot x_1 \cdot x_2}$ will evaluate to 1, but then the domination property indicates that the function must evaluate to 1. This leaves only four additional ordered 4-tuples to examine. It turns out that the only way the expression can have the value 0 is to have $x_1 = x_2 = 1$ and $x_3 = 0$ and $x_4 = 1$. The ordered 4-tuple (1, 1, 0, 1) is the only possible binary 4-tuple missing from the previous list. Thus, the final disjunctive normal form expression was calculated correctly. ■
TABLE 12.31 The function defined by the original expression of Example 12.34
x1  x2  x3  x4  value of the expression
0   0   0   0   1
0   0   0   1   1
0   0   1   0   1
0   0   1   1   1
0   1   0   0   1
0   1   0   1   1
0   1   1   0   1
0   1   1   1   1
1   0   0   0   1
1   0   0   1   1
1   0   1   0   1
1   0   1   1   1
1   1   0   0   1
1   1   0   1   0
1   1   1   0   1
1   1   1   1   1
Quick Check 12.12
1. Use the process developed in Propositions 12.6, 12.7, 12.8, and Theorem 12.9 to transform the following binary expression in $x_1$, $x_2$, and $x_3$ into disjunctive normal form. Use the natural lexicographical order imposed by the subscripts.
(O+X0X + X (X2+T)
12.4.4 Exercises
The exercises marked with D have detailed solutions in Appendix G.
1. Count the number of Boolean functions with domain equal to the Boolean algebra defined by setting B = P(S), where S = {a, b, c} and R is defined using the standard rules presented on page 54 for creating a Boolean algebra from a set.
2. Prove Theorem 12.7.
3. D How many distinct binary functions of order 3 are there?
4. How many distinct functions of order n are there with domain {0, 1, 2} and range {0, 1, 2}?
5. Assume that all expressions in this problem may be constructed using the binary variables, w, x, y, z. Which of the following binary expressions are minterms in w, x, y, z? If an expression is not a minterm, explain why it isn't.
(a) 0. w •x y (b) O 1 (c) w •Y • y (d) w.-Y-y•z+W-x•Y.z (e) T•.x•y3•z
6. Assume that all expressions in this problem may be constructed using the binary variables, w, x, y, z. Which of the following binary expressions in w, x, y, z are in disjunctive normal form? If an expression is not in disjunctive normal form, explain why it isn't.
(a) 11717. z + x 37ý (b) OP Iand (c) w• y•x •z+w•x•y•z (d) w•x•y-z+w•x- Y.z (e) 0+0 (f) w.w-x.y.z
7. Write a binary expression in disjunctive normal form for each binary function of order 2 in Example 12.29.
8. D Create a binary expression that represents a function of two binary variables that evaluates to 1 when exactly one of the variables is 1 and evaluates to 0 otherwise. (The XOR function.)
9. Create a binary expression that represents a function of two binary variables that evaluates to 1 when the variables have the same value and evaluates to 0 otherwise. (The biconditional function.)
10. A binary function evaluates to 1 at the following ordered 5-tuples: (0, 0, 1, 1, 1), (1, 0, 1, 0, 1), (1, 1, 0, 0, 1). It evaluates to 0 at all other 5-tuples. Write a binary expression in disjunctive normal form that represents this function.
11. Use the process developed in Proposition 12.6 to transform the following binary expressions in $x_1$, $x_2$, and $x_3$ into an expression for which any complement operator is acting on only a single variable.
(a) @Ixt + 2-7.(xl + T-•) (b) x 2 • x I x3 (c) I • x 3 + 0+ x 1
12. Use the process developed in Proposition 12.7 to transform the following binary expressions in $x_1$, $x_2$, and $x_3$ into a sum of products where every factor is either 0, 1, or a complemented or uncomplemented single variable.
(a) t•P (Tl - I + x2 x3) •x (b) (xt + x2 x3) (x2 + -T) (c) ( + Xl) (x2 -T + 0)
13. Use the process developed in Proposition 12.8 to transform the following binary expressions in $x_1$, $x_2$, and $x_3$ into an expression in disjunctive normal form.
(a) ' x1 • x3 • 2]- (b) xiI T22 (c) 0 + x1 • x2 -xt
14. Use the process developed in Propositions 12.6, 12.7, 12.8, and Theorem 12.9 to transform the following binary expressions in $x_1$, $x_2$, and $x_3$ into disjunctive normal form. Use the natural lexicographical order imposed by the subscripts.
(a) X1 • x+ ± x2 • (b) 0 1 (c) x2 • Xý + X'. X2 .xI (d) 1 + x 1 + x2 x3 (XI + j2)
15. Use the process developed in Propositions 12.6, 12.7, 12.8, and Theorem 12.9 to transform the following binary expressions in $x_1$, $x_2$, and $x_3$ into disjunctive normal form. Use the natural lexicographical order imposed by the subscripts.
(a) x 3 + x1 •-l- + X2 + -3. I (b) T2. x3 + x 1I X3 (c) (xI + X2) - (X2 + TD (d) 3 5
16. Use the process developed in Propositions 12.6, 12.7, 12.8, and Theorem 12.9 to transform the following binary expressions in $x_1$, $x_2$, $x_3$, and $x_4$ into disjunctive normal form. Use the natural lexicographical order imposed by the subscripts.
(a) xI +X2+X3+X4 (b) (xl -x 2 + x3 -4) -X2 + X3 + X4 (c) 1 • x2 • 3 + X1 - x2 - Xl x3• (d) X1 -x2 - x 3 + xi • x 2 -x4 + Xt •x3 • x4 + x2 •x 3 • X4
17. Use the process developed in Propositions 12.6, 12.7, 12.8, and Theorem 12.9 to transform the following binary expressions in $x_1$, $x_2$, $x_3$, and $x_4$ into disjunctive normal form. Use the natural lexicographical order imposed by the subscripts.
(a) x 2 .x 3 .x 4 +1 x 3 +x1 x2 (b) 'X1 - x4 +Xt x12x2 (c) x 3 • (i-+ ± l-" x3) + X" (-"3x4 + x1 ) (d) x 2 (x3 + x4) + x2 + x3 + X-4
18. D An alternative canonical form.
(a) Use the definition of a minterm as a model from which to create a definition for a maxterm.
(b) Define conjunctive normal form. Recall that the logic operator ∨ is also called the disjunction operator and that ∧ is also called the conjunction operator. Also recall that Boolean algebras that are created from collections of propositions make the associations: ∨ with + and ∧ with ·.
(c) State and prove a maxterm-based proposition patterned after Proposition 12.5.
(d) State (but do not prove) a maxterm-based theorem patterned after Theorem 12.8.
(e) State (but do not prove) a maxterm-based theorem patterned after Theorem 12.9.
19. Use part (c) of Exercise 18 to write a binary expression in conjunctive normal form for each binary function of order 2 in Example 12.29.
20. Use any method you wish to convert the binary expressions in Exercise 14 into conjunctive normal form. (See Exercise 18 for the definition of conjunctive normal form.)
12.5 Combinatorial Circuits
Section 12.4 introduced a technique for converting binary functions into binary expressions in disjunctive normal form. Disjunctive normal form is very nice as a canonical representation, but it is often not ideal if the goal is to have a simple binary expression with as few products and sums as possible. This section will present a technique for simplifying binary expressions. This will be beneficial because Section 12.5.2 will establish a connection between binary expressions and the design of combinatorial circuits. Since circuits are built from real components, it is desirable to minimize how many components are needed to accomplish the circuit's task.
12.5.1 Minimizing Binary Expressions
The process of simplifying a binary expression in disjunctive normal form relies heavily on the four axioms for a Boolean algebra. It also uses the absorption properties. The major simplification rule can be justified by the following set of Boolean equivalences.
$$e \cdot f + \bar{e} \cdot f = (e + \bar{e}) \cdot f = 1 \cdot f = f \cdot 1 = f$$
Both e and f can be binary expressions, instead of merely single binary variables. The most common usage will have e as a single binary variable and f as a product of binary variables. A typical simplification might be
$$x_1 \cdot x_2 \cdot \bar{x}_3 \cdot x_4 + x_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4 = x_1 \cdot \bar{x}_3 \cdot x_4.$$
The expanded version of this simplification is shown next.
$x_1 \cdot x_2 \cdot \bar{x}_3 \cdot x_4 + x_1 \cdot \bar{x}_2 \cdot \bar{x}_3 \cdot x_4$
= $x_2 \cdot x_1 \cdot \bar{x}_3 \cdot x_4 + \bar{x}_2 \cdot x_1 \cdot \bar{x}_3 \cdot x_4$    commutativity (twice)
= $x_2 \cdot (x_1 \cdot \bar{x}_3 \cdot x_4) + \bar{x}_2 \cdot (x_1 \cdot \bar{x}_3 \cdot x_4)$    associativity (twice)
= $(x_2 + \bar{x}_2) \cdot (x_1 \cdot \bar{x}_3 \cdot x_4)$    distributivity
= $1 \cdot (x_1 \cdot \bar{x}_3 \cdot x_4)$    complement
= $(x_1 \cdot \bar{x}_3 \cdot x_4) \cdot 1$    commutativity
= $(x_1 \cdot \bar{x}_3 \cdot x_4)$    identity
= $x_1 \cdot \bar{x}_3 \cdot x_4$    associativity
Binary Expression Simplification Rule
Let e and f be binary expressions. Then $e \cdot f + \bar{e} \cdot f = f$.
The binary expression simplification rule (with perhaps some commutativity) can be used iteratively to reduce an expression in disjunctive normal form into a much simpler (but equivalent) expression.
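Because e and f only ever take the values 0 and 1, the simplification rule can be confirmed by exhaustive evaluation. The short check below is an illustration, with binary values encoded as Python integers and the helper names `lhs` and `rhs` chosen here. It tests the rule itself and the typical four-variable simplification shown above.

```python
from itertools import product

# The rule e·f + complement(e)·f = f, checked at all four value pairs.
rule_holds = all(((e & f) | ((1 - e) & f)) == f
                 for e, f in product((0, 1), repeat=2))

def lhs(x1, x2, x3, x4):
    """x1·x2·complement(x3)·x4 + x1·complement(x2)·complement(x3)·x4"""
    return (x1 & x2 & (1 - x3) & x4) | (x1 & (1 - x2) & (1 - x3) & x4)

def rhs(x1, x2, x3, x4):
    """x1·complement(x3)·x4"""
    return x1 & (1 - x3) & x4

example_holds = all(lhs(*t) == rhs(*t) for t in product((0, 1), repeat=4))
print(rule_holds, example_holds)  # True True
```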
12.5 Combinatorial Circuits
781
Simplifying the Binary Expression in Quick Check 12.11
The binary function defined in Quick Check 12.11 on page 772 was found to be equivalent to the binary expression $\bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot x_2 \cdot x_3 + x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3$. The binary expression simplification rule can be used several times.
$\bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot x_2 \cdot x_3 + x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3$
= $(\bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot x_2 \cdot x_3) + x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3$
= $(\bar{x}_1 \cdot x_2) + x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3$
= $\bar{x}_1 \cdot x_2 + (x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3)$
= $\bar{x}_1 \cdot x_2 + (x_1 \cdot x_2)$
= $(\bar{x}_1 \cdot x_2 + x_1 \cdot x_2)$
= $x_2$
Consequently, the function can be expressed as either $f(x_1, x_2, x_3) = \bar{x}_1 \cdot x_2 \cdot \bar{x}_3 + \bar{x}_1 \cdot x_2 \cdot x_3 + x_1 \cdot x_2 \cdot \bar{x}_3 + x_1 \cdot x_2 \cdot x_3$, or as $f(x_1, x_2, x_3) = x_2$. A quick look at the table (on page 772) that defines f should convince you that $f(x_1, x_2, x_3) = x_2$ is a valid representation. For most purposes, this shorter expression is the preferred representation. ■
Doing hand simplifications (as in the previous example) works well for small problems. It becomes less suitable when the number of variables increases. There are two fairly elementary methods that can turn the process into an algorithm. The first method is an algorithm that uses a matrix, called a Karnaugh map, to represent the minterms. It proceeds by circling pairs of minterms that are candidates for simplification. This method becomes difficult to use when the number of variables gets larger than five or six.
The most common alternative is the Quine-McCluskey algorithm. This algorithm uses tables to organize the work. It is also easier to translate into a computer algorithm. The Quine-McCluskey algorithm will be introduced by an example before a formal description is given.
Illustrating Quine-McCluskey
Consider the binary function $f(x, y, z) = x \cdot y \cdot z + x \cdot y \cdot \bar{z} + x \cdot \bar{y} \cdot \bar{z} + \bar{x} \cdot y \cdot z + \bar{x} \cdot y \cdot \bar{z} + \bar{x} \cdot \bar{y} \cdot \bar{z}$, which is in disjunctive normal form. The Quine-McCluskey algorithm associates a bit string with each minterm. If a variable appears in uncomplemented form, the bit string contains a 1 in the corresponding position; otherwise, the bit string contains a 0 in the corresponding position. It will be helpful to sort the minterms according to the number of 1s in their associated bit strings. The minterms can also be numbered for future reference.
1  111
2  110
3  011
4  100
5  010
6  000
An application of the binary expression simplification rule to two binary expressions corresponds to a merging of the two associated bit strings. For instance, the simplification, $x \cdot y \cdot z + x \cdot y \cdot \bar{z} = x \cdot y$, corresponds to merging 111 and 110 into the expression 11-, where the hyphen represents the elimination of z. Notice that bit strings cannot be merged unless they differ by exactly one in the number of 1s they contain. However, this condition doesn't guarantee a merge is possible: For example, 1100 and 0001 cannot be merged.
The algorithm extends the table by making a new column for all the possible merged bit-hyphen expressions. In addition, a column is added to place a check mark next to each minterm that became part of a merged expression, and another column to keep track of which minterms were used to create each merged expression. The check mark indicates that a minterm has been "covered" by a merged expression. For instance, the minterms $x \cdot y \cdot z$ and $x \cdot y \cdot \bar{z}$ are covered by 11-.
1  111  ✓    1,2  11-
2  110  ✓    1,3  -11
3  011  ✓    2,4  1-0
4  100  ✓    2,5  -10
5  010  ✓    3,5  01-
6  000  ✓    4,6  -00
             5,6  0-0
The newly merged expressions can now be merged with each other. The only new requirement is that hyphens must match another hyphen in the same position. New columns are introduced to keep track of which bit-hyphen expressions have been covered and also which of the original minterms have been covered by the new bit-hyphen expression.
1  111  ✓    1,2  11-  ✓    1,2,3,5  -1-
2  110  ✓    1,3  -11  ✓    2,4,5,6  --0
3  011  ✓    2,4  1-0  ✓
4  100  ✓    2,5  -10  ✓
5  010  ✓    3,5  01-  ✓
6  000  ✓    4,6  -00  ✓
             5,6  0-0  ✓
Notice that the expression -1- can be obtained by merging 11- and 01- (leading to a covering of 1, 2, 3, 5) and also by merging -11 and -10 (leading to the same covering). The new expression is listed only once.
No additional merging is possible, so the algorithm is ready to start phase 2. A matrix is created by listing the original minterms as column labels and listing all unchecked bit-hyphen expressions as row labels. (In this example, all unchecked bit-hyphen expressions are in the final section of the table, but this need not be true in general.) An X is placed in row i, column j if the bit-hyphen expression in row i covers the minterm in column j.
Column labels: 1 is $x \cdot y \cdot z$, 2 is $x \cdot y \cdot \bar{z}$, 3 is $\bar{x} \cdot y \cdot z$, 4 is $x \cdot \bar{y} \cdot \bar{z}$, 5 is $\bar{x} \cdot y \cdot \bar{z}$, 6 is $\bar{x} \cdot \bar{y} \cdot \bar{z}$.
       1   2   3   4   5   6
-1-    X   X   X       X
--0        X       X   X   X
Every minterm in the original binary expression must be covered by the reduced expression. The final goal is to pick a smallest set of rows so that every column has at least one X among the chosen rows. For this example, both rows are needed. The final reduced binary expression is therefore $y + \bar{z}$. You can evaluate both representations at each of the eight ordered triples in the domain to verify that
$$f(x, y, z) = x \cdot y \cdot z + x \cdot y \cdot \bar{z} + x \cdot \bar{y} \cdot \bar{z} + \bar{x} \cdot y \cdot z + \bar{x} \cdot y \cdot \bar{z} + \bar{x} \cdot \bar{y} \cdot \bar{z} = y + \bar{z}.$$
■
The Quine-McCluskey Algorithm
Phase 1:
Sort the minterms according to the number of 1's in their associated bit-hyphen strings.
Number the sorted bit-hyphen strings.
while the current section contains bit-hyphen strings that can be merged
    List all new merged bit-hyphen strings in a new column.
    Make a check next to each bit-hyphen string in the current section that was used in a merge.
    Create a new column that lists the original minterms covered by the newest bit-hyphen strings.
Phase 2:
Create a matrix that has the original minterms as column labels and all unchecked bit-hyphen expressions as row labels.
Place an X in row i, column j if the bit-hyphen expression in row i covers the minterm in column j.
Pick a smallest set, S, of rows so that every column has at least one X among the chosen rows.
Convert the bit-hyphen expressions in S into their corresponding binary expressions and add them.
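The two phases translate fairly directly into code. The following is a minimal sketch, not a tuned implementation: the function names (`merge`, `prime_implicants`, `smallest_cover`) are invented for this illustration, bit-hyphen strings are ordinary Python strings, and phase 2 simply tries row sets in order of increasing size, which is only practical for small matrices.

```python
from itertools import combinations

def merge(a, b):
    """Merge two bit-hyphen strings that differ in exactly one bit position.

    Hyphens must already match, which holds when the single differing
    position holds a 0 in one string and a 1 in the other.
    """
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms):
    """Phase 1: return {unchecked bit-hyphen string: set of covered minterms}."""
    current = {m: {m} for m in minterms}
    unchecked = {}
    while current:
        merged, used = {}, set()
        for a, b in combinations(current, 2):
            c = merge(a, b)
            if c is not None:
                merged.setdefault(c, set()).update(current[a] | current[b])
                used.update((a, b))          # these strings receive a check mark
        for s, cover in current.items():
            if s not in used:
                unchecked[s] = cover
        current = merged
    return unchecked

def smallest_cover(minterms, implicants):
    """Phase 2 by brute force: a smallest set of rows covering every column."""
    rows = sorted(implicants)
    for size in range(1, len(rows) + 1):
        for chosen in combinations(rows, size):
            covered = set().union(*(implicants[r] for r in chosen))
            if covered >= set(minterms):
                return list(chosen)
    return []

minterms = ["111", "110", "011", "100", "010", "000"]
rows = prime_implicants(minterms)
cover = smallest_cover(minterms, rows)
print(sorted(rows), cover)
```

On the three-variable illustration the unchecked strings are -1- and --0, and both are needed, which corresponds to $y + \bar{z}$. When several smallest covers exist, this sketch returns the first one found.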
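A Four-Variable Quine-McCluskey
The Quine-McCluskey algorithm can be used to minimize the binary expression
$$w \cdot \bar{x} \cdot y \cdot z + \bar{w} \cdot x \cdot y \cdot \bar{z} + \bar{w} \cdot x \cdot \bar{y} \cdot z + \bar{w} \cdot \bar{x} \cdot y \cdot z + w \cdot \bar{x} \cdot \bar{y} \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot z + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot \bar{z}.$$
Phase 1:
1  1011  ✓    1,4  -011
2  0110  ✓    2,6  0-10
3  0101  ✓    3,7  0-01
4  0011  ✓    4,6  001-  ✓    4,6,7,8  00--
5  1000  ✓    4,7  00-1  ✓
6  0010  ✓    5,8  -000
7  0001  ✓    6,8  00-0  ✓
8  0000  ✓    7,8  000-  ✓
Phase 2:
        1     2     3     4     5     6     7     8
        1011  0110  0101  0011  1000  0010  0001  0000
-011    X                 X
0-10          X                       X
0-01                X                       X
-000                            X                 X
00--                      X           X     X     X
Notice that if the term $\bar{w} \cdot \bar{x}$ (00--) is chosen, then each of the other four terms must also be chosen. However, the other four terms are sufficient to cover all columns. The equivalent minimized binary expression is thus³¹
$$\bar{x} \cdot y \cdot z + \bar{w} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{y} \cdot z + \bar{x} \cdot \bar{y} \cdot \bar{z}.$$
No proof of the correctness of the Quine-McCluskey algorithm will be given in this textbook. However, a bit of justification seems appropriate for phase 2 of this example. How can the exclusion of $\bar{w} \cdot \bar{x}$ be justified? One approach is to notice that
$\bar{w} \cdot \bar{x}$
= $\bar{w} \cdot \bar{x} \cdot y + \bar{w} \cdot \bar{x} \cdot \bar{y}$
= $\bar{w} \cdot \bar{x} \cdot y \cdot z + \bar{w} \cdot \bar{x} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot z + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot \bar{z}$.
Thus,
$\bar{x} \cdot y \cdot z + \bar{w} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{y} \cdot z + \bar{x} \cdot \bar{y} \cdot \bar{z} + \bar{w} \cdot \bar{x}$
= $\bar{x} \cdot y \cdot z + \bar{w} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{y} \cdot z + \bar{x} \cdot \bar{y} \cdot \bar{z} + (\bar{w} \cdot \bar{x} \cdot y \cdot z + \bar{w} \cdot \bar{x} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot z + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot \bar{z})$
= $(\bar{x} \cdot y \cdot z + \bar{w} \cdot \bar{x} \cdot y \cdot z) + (\bar{w} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot y \cdot \bar{z}) + (\bar{w} \cdot \bar{y} \cdot z + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot z) + (\bar{x} \cdot \bar{y} \cdot \bar{z} + \bar{w} \cdot \bar{x} \cdot \bar{y} \cdot \bar{z})$
= $\bar{x} \cdot y \cdot z + \bar{w} \cdot y \cdot \bar{z} + \bar{w} \cdot \bar{y} \cdot z + \bar{x} \cdot \bar{y} \cdot \bar{z}$.
The final equivalence follows from four applications of the absorption property. For example,
$\bar{x} \cdot y \cdot z + \bar{w} \cdot \bar{x} \cdot y \cdot z$
= $(\bar{x} \cdot y \cdot z) + \bar{w} \cdot (\bar{x} \cdot y \cdot z)$
= $(\bar{x} \cdot y \cdot z) + (\bar{x} \cdot y \cdot z) \cdot \bar{w}$
= $(\bar{x} \cdot y \cdot z)$.
This confirms the conclusion of the algorithm: $\bar{w} \cdot \bar{x}$ is not needed. ■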
Quick Check 12.13
1. Use the Quine-McCluskey algorithm to minimize the binary expression
x.y.-T+x.5.z+-T.y.z+5E.y.T.
12.5.2 Combinatorial Circuits and Binary Expressions --current
/
-
Figure 12.2. Two serial gates.
_ff y/ current
During the late 1930s, Claude Shannon noticed a useful connection between electronic circuits and binary expressions. That connection is the focus of this section. The two circuit diagrams in Figures 12.2 and 12.3 will help motivate Shannon's observations. The first diagram (Figure 12.2) shows part of an electrical circuit. In order for current to flow from the left end to the right end, both of the connections, x and y, need to be closed. It is common to call the connections gates (perhaps since they look like an open gate in a fence). The gates are in a serial arrangement. If either gate is open, no current will flow. The behavior is the same as the familiar logic operator, AND. That is, current will flow if and only if x A y = T. This can also be expressed using binary expression notation: Current will flow if and only if x .y = I. The second diagram (Figure 12.3) shows two gates in a parallel arrangement. In this diagram, current can flow if either (or both) of the gates is closed. Thus, current will flow if and only if x v y = T. Using binary expression notation: Current will flow if and only ifx y= 1.
Figure 12.3. Two parallel gates.
31 The final expression can be verified using the Quine-McCluskey application/applet available in the Downloads section at http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/.
It is now customary to abstract the logical structure of circuits and use special notation for the gates. The standard symbol for an AND gate is shown in Figure 12.4. If both inputs, x, y, are 1, then the output will be 1. Otherwise, the output will be 0. The standard symbol for an OR gate is shown in Figure 12.5. If either input, x, y, is 1, then the output will be 1. Otherwise, the output will be 0. The final symbol (Figure 12.6) represents a NOT gate (or an inverter). This gate changes an input value of 1 to a 0 and a 0 to a 1.
Figure 12.4. The standard symbol for an AND gate.
Figure 12.5. The standard symbol for an OR gate.
Figure 12.6. The standard symbol for a NOT gate.
These three kinds of logic gates can be used to design many useful circuits. The circuits of interest here are called combinatorial circuits. A combinatorial circuit is a circuit in which there are no delay elements. This is in contrast to sequential circuits, in which delay elements exist. (One example would be a finite-state automaton, with its built-in memory of the current state. Finite-state automata can accept a stream of input symbols; combinatorial circuits do not.)
A general process can be used to create combinatorial circuits.
1. Decide how many inputs the circuit will need.
2. Create a binary function that represents the desired output for each combination of input values.
3. Create a binary expression, in disjunctive normal form, which represents the function.
4. Use the Quine-McCluskey algorithm to create a simpler (but equivalent) binary expression.
5. Use the logic gates to build a diagram that matches the binary expression.
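Steps 2 and 3 of this process can be sketched in a few lines of Python. The sketch below is only an illustration: the helper name `dnf_from_function`, the use of a trailing apostrophe to mark a complemented variable, and the sample function (output 1 exactly when x = 1 and y = z) are assumptions made for this example, not notation from the text.

```python
from itertools import product

def dnf_from_function(f, names):
    """Build a disjunctive normal form string for the binary function f.

    One minterm is produced for every input tuple at which f evaluates to 1.
    A trailing apostrophe marks a complemented variable (illustrative notation).
    """
    minterms = []
    for bits in product((0, 1), repeat=len(names)):
        if f(*bits):
            # A variable appears plain when its input is 1, complemented when 0.
            literals = [n if b else n + "'" for n, b in zip(names, bits)]
            minterms.append(".".join(literals))
    # The constant function 0 has no minterms; it is represented by 0 itself.
    return " + ".join(minterms) if minterms else "0"

# Hypothetical sample function: 1 exactly when x = 1 and y = z.
sample = lambda x, y, z: int(x == 1 and y == z)
print(dnf_from_function(sample, ["x", "y", "z"]))
```

The resulting expression would then be handed to the Quine-McCluskey algorithm (step 4) before any gates are drawn.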
A Very Simple Combinatorial Circuit
Suppose a combinatorial circuit is needed that accepts three inputs, x, y, z, and generates a 1 if x = 1 and y = z, but generates a 0 in all other cases. The function is specified by Table 12.32.
TABLE 12.32 f(x, y, z)
x | y | z | f(x, y, z)
0 | 0 | 0 | 0
0 | 0 | 1 | 0
0 | 1 | 0 | 0
0 | 1 | 1 | 0
1 | 0 | 0 | 1
1 | 0 | 1 | 0
1 | 1 | 0 | 0
1 | 1 | 1 | 1
This can be represented by the binary expression $x \cdot \overline{y} \cdot \overline{z} + x \cdot y \cdot z$. Since the numbers of 1s in the two minterms differ by more than one, the Quine-McCluskey algorithm will not result in any simplification. Figure 12.7 on page 786 shows how the binary expression can be translated into a circuit diagram. Notice the use of the associative property; the binary expression that has been diagramed is actually
$(x \cdot \overline{y}) \cdot \overline{z} + (x \cdot y) \cdot z$.
There are two alternative depictions for this diagram. In the first alternative (Figure 12.8), the three source inputs are shown as two distinct clusters on the left. This is merely a convenience to help keep the diagram simple. The second alternative (Figure 12.9) uses multiple-input AND and OR gates. These also simplify the diagram, and they reflect a valid option: many commercial AND and OR gates accept more than two inputs.
786
Chapter 12 Functions, Relations, Databases, and Circuits
Figure 12.7 A combinatorial circuit for $(x \cdot \overline{y}) \cdot \overline{z} + (x \cdot y) \cdot z$.
Figure 12.8 A combinatorial circuit for $(x \cdot \overline{y}) \cdot \overline{z} + (x \cdot y) \cdot z$ that duplicates the inputs.
Figure 12.9 A combinatorial circuit for $x \cdot \overline{y} \cdot \overline{z} + x \cdot y \cdot z$ that uses multiple-input gates.
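The gate-level structure of such a circuit can be checked against its specification with a short simulation. The following is a minimal sketch, assuming the expression $(x \cdot \overline{y}) \cdot \overline{z} + (x \cdot y) \cdot z$ and two-input gates; the function names `AND`, `OR`, `NOT`, and `circuit` are illustrative and do not come from the text.

```python
from itertools import product

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NOT(a):
    return 1 - a

def circuit(x, y, z):
    """Two-input-gate version of (x . y') . z' + (x . y) . z."""
    upper = AND(AND(x, NOT(y)), NOT(z))  # the minterm x . y' . z'
    lower = AND(AND(x, y), z)            # the minterm x . y . z
    return OR(upper, lower)

# Compare the circuit with the specification on all eight input triples.
for x, y, z in product((0, 1), repeat=3):
    expected = int(x == 1 and y == z)
    assert circuit(x, y, z) == expected
```

An exhaustive comparison like this is practical because a circuit with n inputs has only 2^n input combinations.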
Quick Check 12.14
1. Design a combinatorial circuit that has three inputs. Its output should be the same value as the majority of its inputs.
The basic strategy presented for designing combinatorial circuits can be extended to create more versatile circuits. The next example introduces one possible extension: a circuit with multiple outputs.
Paper, Scissors, Rock
The rules for the game Paper, Scissors, Rock were explained in Quick Check 6.1 on page 258. It is fairly easy to design a combinatorial circuit that will determine the winner (if any) of a game of Paper, Scissors, Rock. There will be six inputs: $p_1, s_1, r_1$ and $p_2, s_2, r_2$, where the subscript indicates the player and the letter indicates which object the player chose. If player 1 chooses scissors, then $s_1 = 1$ and $p_1 = r_1 = 0$. (Illegal input will be considered soon.) Similarly, if player 2 chooses rock, then $r_2 = 1$ and $p_2 = s_2 = 0$ should be the input values.
It is tempting to design the circuit so that a single output variable is assigned the value 1 if player 1 wins and the value 0 if player 2 wins. The defect with this strategy is that many Paper, Scissors, Rock games end in a tie. A binary variable cannot represent a three-valued outcome.
One solution is to design a circuit with three output variables: p, s, r. At most one of these variables will be assigned the value 1 (representing a win by the player who chose that object). A tie will result in all three output variables being assigned the value 0. If player 1 chooses scissors and player 2 chooses rock, then player 2 will win. Consequently, r = 1 and p = s = 0 are the appropriate output values.
With six inputs, there are $2^6 = 64$ values in the domain of the function. However, some of these values represent illegal inputs for this game. For example, $p_1 = s_1 = r_1 = 1$ would mean that player 1 tried to show all three objects simultaneously. The circuit can prevent illegal input from giving player 1 an unfair advantage by making the outcome a tie. Since illegal input and legitimate ties all result in an output of three 0s, only a few input sets result in a nontie output. Only those values are shown in Table 12.33.
TABLE 12.33 The input 6-tuples that result in an output value being set to 1
p1 | s1 | r1 | p2 | s2 | r2 | p | s | r
1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0
0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0
0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0
1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0
0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1
0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1
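There will be three expressions to simplify:
$p(p_1, s_1, r_1, p_2, s_2, r_2) = p_1 \cdot \overline{s_1} \cdot \overline{r_1} \cdot \overline{p_2} \cdot \overline{s_2} \cdot r_2 + \overline{p_1} \cdot \overline{s_1} \cdot r_1 \cdot p_2 \cdot \overline{s_2} \cdot \overline{r_2}$
$s(p_1, s_1, r_1, p_2, s_2, r_2) = \overline{p_1} \cdot s_1 \cdot \overline{r_1} \cdot p_2 \cdot \overline{s_2} \cdot \overline{r_2} + p_1 \cdot \overline{s_1} \cdot \overline{r_1} \cdot \overline{p_2} \cdot s_2 \cdot \overline{r_2}$
$r(p_1, s_1, r_1, p_2, s_2, r_2) = \overline{p_1} \cdot \overline{s_1} \cdot r_1 \cdot \overline{p_2} \cdot s_2 \cdot \overline{r_2} + \overline{p_1} \cdot s_1 \cdot \overline{r_1} \cdot \overline{p_2} \cdot \overline{s_2} \cdot r_2$.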
Since both minterms in each expression have the same number of 1s, the Quine-McCluskey algorithm will not result in simpler expressions. Figure 12.10 shows the desired combinatorial circuit.
Figure 12.10 A combinatorial circuit for Paper, Scissors, Rock.
An alternative approach to a Paper, Scissors, Rock circuit is presented in Problem 12 in Exercises 12.5.4.
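The three-output design can be simulated in software before any gates are wired. The sketch below is one possible rendering, assuming each output is the sum of the two six-variable minterms for the winning combinations of that object; the function name `psr_outputs` and the helper `n` (standing in for a NOT gate) are illustrative choices, not part of the text.

```python
from itertools import product

def psr_outputs(p1, s1, r1, p2, s2, r2):
    """Return (p, s, r): the winning object, or (0, 0, 0) for ties and illegal input."""
    n = lambda b: 1 - b  # NOT gate
    # Paper wins: (paper vs rock) or (rock vs paper).
    p = (p1 & n(s1) & n(r1) & n(p2) & n(s2) & r2) | \
        (n(p1) & n(s1) & r1 & p2 & n(s2) & n(r2))
    # Scissors wins: (scissors vs paper) or (paper vs scissors).
    s = (n(p1) & s1 & n(r1) & p2 & n(s2) & n(r2)) | \
        (p1 & n(s1) & n(r1) & n(p2) & s2 & n(r2))
    # Rock wins: (rock vs scissors) or (scissors vs rock).
    r = (n(p1) & n(s1) & r1 & n(p2) & s2 & n(r2)) | \
        (n(p1) & s1 & n(r1) & n(p2) & n(s2) & r2)
    return p, s, r

# Player 1 chooses scissors, player 2 chooses rock: rock wins.
print(psr_outputs(0, 1, 0, 0, 0, 1))

# Count the input 6-tuples that set some output to 1.
winning = [bits for bits in product((0, 1), repeat=6) if any(psr_outputs(*bits))]
print(len(winning))
```

Because every minterm fixes all six inputs, any illegal input (for example, a player showing two objects) matches no minterm and is reported as a tie.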
The diagram in Figure 12.10 was created with a program called Logisim, which has been designed to simulate combinatorial circuits. The program can be downloaded free. Logisim provides drag-and-drop templates for the logic gates and the connections. It also lets you simulate the circuit. It provides "input switches" (the boxes on the left-hand side of Figure 12.10) and "output LEDs" (the boxes on the right-hand side of Figure 12.10). The input switches can be clicked to toggle between 0 and 1. The output LEDs show the resulting output. The circuit in Figure 12.10 shows player 1 choosing scissors and player 2 choosing rock. The output, r, is 1, signifying that rock (player 2) is the winner. The program uses green lines when the connecting line carries a 1 and red lines when the value is 0.
Figure 12.10 is a bit messy in the middle, but it is possible to follow the lines properly because junctions (where a line splits) are indicated by a dot. All other crossings are artificial and assumed to be insulated so that no accidental bit changes arise. The sets of smaller AND gates were used because the large AND gates in Logisim are limited to 5 inputs.
The general process for constructing combinatorial circuits with multiple output functions is summarized next.
Creating Combinatorial Circuits
Step 1 Decide how many inputs and how many outputs the circuit will need. Create a binary variable for each input and each output.
Step 2 For each output, create a binary function that represents the desired output for each combination of input values.
Step 3 Create binary expressions, in disjunctive normal form, which represent each of the binary functions.
Step 4 Use the Quine-McCluskey algorithm to create simpler (but equivalent) binary expressions.
Step 5 Use logic gates to build a diagram that matches the set of binary expressions.
12.5.3 Functional Completeness
Propositional logic was introduced in Section 2.3. Five primary logic operators were used to construct propositions (statements). These operators were NOT, AND, OR, implication, and biconditional. The implication and biconditional logical equivalences on page 48 can be used to show that implication and biconditional are not really necessary. In particular,
$(P \to Q) \Leftrightarrow [\neg(P \land (\neg Q))] \Leftrightarrow [(\neg P) \lor Q]$
asserts that implication can be replaced with either NOT and AND or with NOT and OR. Similarly, $(P \leftrightarrow Q)$ ...
$(((x \uparrow (y \uparrow y)) \uparrow (x \uparrow (y \uparrow y))) \uparrow ((x \uparrow (y \uparrow y)) \uparrow (x \uparrow (y \uparrow y)))) \uparrow (z \uparrow z)$
The final statement contains only one kind of logic operator. However, the original statement contained only three operators in total; the new statement contains thirteen.
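A NAND-only rewriting of this kind can be checked mechanically with a truth table. The sketch below assumes that the original three-operator statement was $(x \land \neg y) \lor z$; that statement is an inference from the operator counts (three versus thirteen) and is not confirmed by the surviving text. The function names are illustrative. The sketch uses the standard identities $\neg a \Leftrightarrow a \uparrow a$, $a \land b \Leftrightarrow (a \uparrow b) \uparrow (a \uparrow b)$, and $a \lor b \Leftrightarrow (a \uparrow a) \uparrow (b \uparrow b)$.

```python
from itertools import product

def nand(a, b):
    """The NAND operator: false only when both arguments are true."""
    return not (a and b)

def original(x, y, z):
    # Assumed original statement with three operators: AND, NOT, OR.
    return (x and not y) or z

def nand_only(x, y, z):
    inner = nand(x, nand(y, y))  # x NAND (NOT y)
    conj = nand(inner, inner)    # x AND (NOT y)
    # a OR b is (a NAND a) NAND (b NAND b).
    return nand(nand(conj, conj), nand(z, z))

# The two statements agree on every row of the truth table.
equivalent = all(
    bool(original(x, y, z)) == bool(nand_only(x, y, z))
    for x, y, z in product((False, True), repeat=3)
)
print(equivalent)
```

Written out in full, without reusing intermediate results, the NAND-only statement contains thirteen NAND operators, which illustrates the cost of using a single operator.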
Quick Check 12.15
For each statement, find a logically equivalent statement that contains only the NAND operator. Provide adequate justification for your answer.
1. $\neg(x \land y)$
2. $a \to b$
12.5.4 Exercises The exercises marked with 01I have detailed solutions in Appendix G. 1. Use the Quine-McCluskey algorithm to simplify the following binary expressions. Show all the details. (a) w.xTY-z+w-xT.yz (b) oxxy.z+x.y.-t+x.y.z+x.y.-+Y.y.z (c) x.y.-Z+x.y.z+x.y.z+T-.y7ý+x.3y (d) xI.x 2 .x3 .x4 +xl.x2.x .j-x4+xl.x .3 2 x-3.x 4-1.xh2.x3.x 4 2. Use the Quine-McCluskey algorithm to simplify the following binary expressions. Show all the details. (a) x1 'x3 -4± -x2*x3*x4+x 'x2 xT3 xT4 -+x 2Fx 3Fx 4 (b) x.y.z+x.y.7 +x.5 7 z+x.Y.7'+T.y.2+ y.z T.(c) (c) x.y-z+x..Z~x-37+T.y-z+T.y. .•c (d) a.b•c. d +a .b . cd +a•. c •d+ 3. There are sixteen distinct binary functions of order 2. (See Example 12.29 on page 768 and Exercise 7 in Exercises 12.4.4 on page 779.) Use the binary expression simpli-
fication rule (and perhaps other Boolean axioms and properties) to simplify the associated binary expressions. 4. Simplify the following binary expressions. Show all the details. (a) OPw,.(x.y- X) (b) (x+y).(x+Y).(y+z) (c) (x+y).(x.z)+(x+y) 5. Simplify the following binary expressions. Show all the details. (a) w.(T,.y).(x±z) (b) x.w .z.(Y+z)+ w x.T+ (y"Z +Z).
(T+
y)
6. Create a combinatorial circuit diagram that represents each of the binary functions. Do not simplify the binary expressions. (a) f(x, y) = x +xxy (b) D f(x, y) = (T + T) • (x+-y)
(c) f(x, y, z) = (x + y) • (T + z) + x • y
(d) f(x, y, z) = (x y y • z) + x-Y
7. For each binary expression in Exercise 6,
i. Convert the binary expression to an equivalent expression in disjunctive normal form.
ii. Use the Quine-McCluskey algorithm to minimize the expression in part (i).
iii. Create a combinatorial circuit diagram that represents the binary function defined by the expression in part (ii).
8. D Design a combinatorial circuit for the simplified version of each expression in Exercise 1.
9. Design a combinatorial circuit for the simplified version of the expression in Exercise 2, part (d). Use only two-input AND and OR gates.
10. Design a combinatorial circuit that simulates a pair of wall switches that control a single light. The light will be on if both switches are in the "up" position or if both switches are in the "down" position. If one switch is "up" and the other switch is "down," the light will be off.
11. I once bought a computer version of a popular TV quiz show. The players were each assigned a key on the keyboard to press if they knew the answer to the current question. The first contestant to press a key was supposed to be chosen to answer the question (and gain points if a correct answer was given). Unfortunately, the program was poorly written. It seems that the program would show the question, then start a timer. It apparently gathered all key presses during a time window lasting a few seconds. It would then look through the list of keys pressed and choose the first pressed key. The list was always scanned in the same order, so player 1 had an unfair advantage. Suppose, for example, that players 1, 2, 3, and 4 used keys a, b, c, and d, respectively. If keys a, c, and d were pressed at about the same time, the program would always choose player 1. If keys b and d were pressed at nearly the same time, the program would always choose player 2. Thus, player 1 was favored over 2, 3, and 4, player 2 was favored over 3 and 4, and player 3 was favored over player 4. Design a combinatorial circuit to simulate this defective program.
12. Recall the strategy for designing a combinatorial circuit to simulate a game of Paper, Scissors, Rock with output variables p, s, r, each designating a potential winning object. An alternative strategy is to use output variables, $w_1$, $w_2$, t, which represent player 1 wins, player 2 wins, a tie, respectively. All three output variables will be set to 0 for illegal input values.
(a) Create binary expressions to represent the binary functions which specify the appropriate behavior of the output variables, $w_1$, $w_2$, t.
(b) Design a combinatorial circuit to simulate this new strategy.
13. Mom spends a lot of time chauffeuring her three children. The children all like mom, so they fight over who gets to sit in the front of the car with her. Mom has devised a fair way to determine which kid gets the front seat. Each child will flip a coin. If all three are heads or all three are tails, they must all flip again. Otherwise, the child with the odd face gets the front seat. (No, not the kid who makes the funniest face!) Thus, if Sharayah's coin shows a head, Adelyn's coin shows a tail, and Landon's coin shows a head, Adelyn gets the front seat. Follow the process outlined in this section to devise a combinatorial circuit that simulates mom's seating process.
14. Design a combinatorial circuit that will add two 2-bit binary numbers, perhaps resulting in a 3-bit result. Show details for the entire process.
15. Design a combinatorial circuit that will multiply two 2-bit binary numbers, resulting in a 4-bit result. Show details for the entire process.
16. D A single mom has made an agreement with her kids, Annabelle and Boris, about eating out on Saturdays. If Mom wants to eat at home, they will not eat out. However, if Mom wants to eat out, at least one of the kids must also want to eat out or else they will eat at home. Design a combinatorial circuit that will correctly decide whether they eat out on Saturday. Show details for the entire process.
17. Three friends meet each week for dinner. One friend drives everyone to the restaurant, on a rotating basis. They decide (by secret ballot) on the night of the dinner which of four favorite restaurants to visit. If at least two friends agree on a restaurant, then that is where they will go. If they all have different choices, the driver's choice is where they will eat. Designate the friends as x, y, and z, with x being the driver. Design a combinatorial circuit that will correctly specify the restaurant. Show the function definition table(s) and the simplified functions. Building the circuit from the simplified functions is optional.
(Hint: Four restaurants can be specified with 2 bits, so use variables $x_1, x_2, y_1, y_2$, and $z_1, z_2$. You will need two output functions, $f_1$ and $f_2$. You may use the Quine-McCluskey applet at http://www.mathcs.bethel.edu/~gossett/DiscreteMathWithProof/ to create the minimized binary expressions. You can even rename the column headers as $x_1, x_2, y_1, y_2, z_1, z_2$ and just click to specify the function.)
18. Show that $\{\neg, \lor\}$ is a functionally complete set of logic operators.
19. Use truth tables to prove that the following are tautologies.
(a) $\neg P \leftrightarrow (P \uparrow P)$
(b) $(P \land Q) \leftrightarrow [(P \uparrow Q) \uparrow (P \uparrow Q)]$
(c) $(P \lor Q) \leftrightarrow [(P \uparrow P) \uparrow (Q \uparrow Q)]$
20. Find a logically equivalent statement that uses only the NAND operator for each statement. Provide adequate justification that your answer is logically equivalent to the original statement.
(a) $\neg(P \lor Q)$
(b) D $P \lor Q \lor R$
(c) $(\neg P) \land (\neg Q)$
(d) $(P \land Q) \lor R$
21. Prove that $\{\downarrow\}$ is a functionally complete set of logic operators.
12.6 QUICK CHECK SOLUTIONS
Quick Check 12.1
1. The domain and range are most naturally defined to be the set, $\mathbb{Z}^+$, of positive integers. The function can then be represented as
$S = \{(n, \sum_{j=1}^{n} j) \mid n \in \mathbb{Z}^+\}$.
Notice that the second coordinates in S are all distinct, since each one is larger than the previous second coordinate. The function is therefore one-to-one. Observe that the first few ordered pairs are {(1, 1), (2, 3), (3, 6), ...}. It is clear that many natural numbers (for example, 2, 4, and 5) do not appear as second coordinates. The function is therefore not onto.
2. This function has a countably infinite domain (the integers) and a finite range (the set of words {even, odd}). The function is not one-to-one because it contains distinct ordered pairs, (2, even) and (4, even), which have the same second coordinate. It is clearly onto since the ordered pairs (2, even) and (3, odd) are in the function. The function might be described as
{..., (-3, odd), (-2, even), (-1, odd), (0, even), (1, odd), (2, even), (3, odd), ...}.
Quick Check 12.2
1. $\mathcal{R}^{-1} = \{(a, 1), (b, 2), (c, 2), (c, 3)\}$
2. $\mathcal{S} \circ \mathcal{R} = \{(1, x), (1, z), (2, z), (2, y), (3, y)\}$
Quick Check 12.3
1. $\mathcal{R}_1$ is reflexive, is not symmetric, is transitive, and is not an equivalence relation.
2. $\mathcal{R}_2$ is reflexive, is not symmetric, is transitive (see Exercise 3 in Exercises 12.2.3), and is not an equivalence relation.
3. $\mathcal{R}_3$ is not reflexive, is symmetric, is not transitive [since $\gcd(4, 6) = 2$ and $\gcd(6, 20) = 2$, but $\gcd(4, 20) \neq 2$], and is not an equivalence relation.
4. $\mathcal{R}_4$ is reflexive, symmetric, and transitive. It is therefore an equivalence relation.
5. $\mathcal{R}_5 = \{(-1, 1), (1, 1)\}$. It is not reflexive (since $(-1, -1) \notin \mathcal{R}_5$). It is not symmetric (since $(1, -1) \notin \mathcal{R}_5$). It is transitive, since x = -1, y = 1, z = 1 and x = 1, y = 1, z = 1 are the only choices for x, y, and z in the definition of transitivity.
Quick Check 12.4
1. $\mathcal{R}_1$ is not antireflexive ($x \le y$ does not imply $x \neq y$), it is antisymmetric (if $x \le y$ and $x \neq y$, then in fact, $x < y$, so $y \not\le x$), but it is not asymmetric (since $4 \le 4$).
2. $\mathcal{R}_2$ is not antireflexive ($4 \mid 4$), it is antisymmetric (if $x \mid y$ and $x \neq y$, then in fact, x is a proper factor of y, so y cannot be a divisor of x), and it is not asymmetric ($4 \mid 4$ again).
3. $\mathcal{R}_3$ is not antireflexive ($\gcd(2, 2) = 2$) and is neither antisymmetric nor asymmetric [both fail because $\gcd(2, 6) = \gcd(6, 2) = 2$ but $2 \neq 6$].
4. $\mathcal{R}_4$ is neither antireflexive, nor antisymmetric, nor asymmetric.
5. $\mathcal{R}_5 = \{(-1, 1), (1, 1)\}$. This relation is not antireflexive because (1, 1) is in $\mathcal{R}_5$. It is antisymmetric, but is not asymmetric [since (1, 1) is in $\mathcal{R}_5$].
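Property checks like these are easy to automate for finite relations. The sketch below tests two of the relations discussed above; the underlying sets (`{-1, 1}` for the fifth relation and `{2, 4, 6, 20}` for the gcd relation) are assumptions chosen for illustration, since the original definitions of the relations are not reproduced in these solutions. The function names are likewise illustrative.

```python
from math import gcd

def is_reflexive(rel, domain):
    return all((a, a) in rel for a in domain)

def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)

def is_antisymmetric(rel):
    # Whenever (a, b) and (b, a) are both present, a must equal b.
    return all(a == b or (b, a) not in rel for (a, b) in rel)

def is_transitive(rel):
    return all((a, d) in rel
               for (a, b) in rel
               for (c, d) in rel
               if b == c)

# The relation {(-1, 1), (1, 1)} on the assumed set {-1, 1}.
domain5 = {-1, 1}
r5 = {(-1, 1), (1, 1)}

# "gcd(x, y) = 2" restricted to a small assumed set.
domain3 = {2, 4, 6, 20}
r3 = {(a, b) for a in domain3 for b in domain3 if gcd(a, b) == 2}

print(is_reflexive(r5, domain5), is_symmetric(r5), is_transitive(r5), is_antisymmetric(r5))
print(is_reflexive(r3, domain3), is_symmetric(r3), is_transitive(r3), is_antisymmetric(r3))
```

A finite check cannot prove a property of a relation on an infinite set, but a single failing pair (such as (4, 6) and (6, 20) for transitivity) is a valid counterexample.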
Quick Check 12.5
1. This is quite easy to see intuitively. However, it is helpful to attempt a carefully worded proof. Reflexivity holds because every student lives in the same dorm as his or her own self. (This may seem a bit artificial, but it is valid.) If student x lives in the same dorm as student y, then changing the order of their names does not cause them to live elsewhere; y and x live in the same dorm. Thus, symmetry holds. Suppose that students x and y live in the same dorm. For the moment, assume the dorm is named "Marshall Hall." Suppose also that students y and z live in the same dorm. We already know that y lives in Marshall Hall, so z also lives in Marshall Hall. But then x and z both live in Marshall Hall, so they live in the same dorm. This establishes the transitivity property. Since the relation is reflexive, symmetric, and transitive, it is an equivalence relation.
2. There is one equivalence class per dorm. The equivalence classes consist of the sets of students living in a common dorm.
Quick Check 12.6
1. The attribute set is {Variety, Vegetable, Germination, Harvest}.
2. Before this question can be answered, you need to determine whether a theoretical or a table-specific answer is sought. It is clear from the table that every other attribute is functionally dependent on "Variety" (and also on "Harvest"). However, if the table is extended, it is quite possible to have a variety name repeated. For example, Bush Champion is a likely name for a variety of green bean. If such a variety is added to the table, then knowing the variety name does not uniquely determine the rest of the tuple. An extended table is almost certain to have repeated Harvest values, so the other attributes will not be functionally dependent on Harvest. Using the theoretical version of functional dependency, the attributes "Germination" and "Harvest" are functionally dependent on {Variety, Vegetable}.
3. The discussion in part (2) shows that the primary key should be {Variety, Vegetable}.
Quick Check 12.7
1. Adoptive[Mother, Adoptive Child]
Mother | Adoptive Child
Jane Smith | Carmen Smith
Jane Smith | Polly Smith
Anita Rodriguez | Tran-minh Rodriguez
Helen Levitz | Aaron Levitz
Helen Levitz | Hanna Levitz
Betty Jones | Samantha Jones
2. The three steps are listed below.
• Form the Cartesian product (Table 12.36 on page 794). (This would normally all be done by computer.) Some of the first names have been abbreviated in order to fit the tuples on the page.
TABLE 12.36 The Cartesian product Biological × Adoptive
Father | Mother | Biological Child | Father | Mother | Adoptive Child
John Smith | Jane Smith | William Smith | John Smith | Jane Smith | Carmen Smith
John Smith | Jane Smith | William Smith | John Smith | Jane Smith | Polly Smith
John Smith | Jane Smith | William Smith | E. Rodriguez | A. Rodriguez | T. Rodriguez
John Smith | Jane Smith | William Smith | Isaac Levitz | Helen Levitz | Aaron Levitz
John Smith | Jane Smith | William Smith | Isaac Levitz | Helen Levitz | Hanna Levitz
John Smith | Jane Smith | William Smith | Bob Jones | Betty Jones | Samantha Jones
John Smith | Jane Smith | Susan Smith | John Smith | Jane Smith | Carmen Smith
John Smith | Jane Smith | Susan Smith | John Smith | Jane Smith | Polly Smith
John Smith | Jane Smith | Susan Smith | E. Rodriguez | A. Rodriguez | T. Rodriguez
John Smith | Jane Smith | Susan Smith | Isaac Levitz | Helen Levitz | Aaron Levitz
John Smith | Jane Smith | Susan Smith | Isaac Levitz | Helen Levitz | Hanna Levitz
John Smith | Jane Smith | Susan Smith | Bob Jones | Betty Jones | Samantha Jones
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | John Smith | Jane Smith | Carmen Smith
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | John Smith | Jane Smith | Polly Smith
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | E. Rodriguez | A. Rodriguez | T. Rodriguez
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | Isaac Levitz | Helen Levitz | Aaron Levitz
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | Isaac Levitz | Helen Levitz | Hanna Levitz
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | Bob Jones | Betty Jones | Samantha Jones
W. Leblanc | M. Leblanc | Wanda Leblanc | John Smith | Jane Smith | Carmen Smith
W. Leblanc | M. Leblanc | Wanda Leblanc | John Smith | Jane Smith | Polly Smith
W. Leblanc | M. Leblanc | Wanda Leblanc | E. Rodriguez | A. Rodriguez | T. Rodriguez
W. Leblanc | M. Leblanc | Wanda Leblanc | Isaac Levitz | Helen Levitz | Aaron Levitz
W. Leblanc | M. Leblanc | Wanda Leblanc | Isaac Levitz | Helen Levitz | Hanna Levitz
W. Leblanc | M. Leblanc | Wanda Leblanc | Bob Jones | Betty Jones | Samantha Jones
R. Westlund | V. Westlund | Derwin Westlund | John Smith | Jane Smith | Carmen Smith
R. Westlund | V. Westlund | Derwin Westlund | John Smith | Jane Smith | Polly Smith
R. Westlund | V. Westlund | Derwin Westlund | E. Rodriguez | A. Rodriguez | T. Rodriguez
R. Westlund | V. Westlund | Derwin Westlund | Isaac Levitz | Helen Levitz | Aaron Levitz
R. Westlund | V. Westlund | Derwin Westlund | Isaac Levitz | Helen Levitz | Hanna Levitz
R. Westlund | V. Westlund | Derwin Westlund | Bob Jones | Betty Jones | Samantha Jones
R. Westlund | V. Westlund | Darwin Westlund | John Smith | Jane Smith | Carmen Smith
R. Westlund | V. Westlund | Darwin Westlund | John Smith | Jane Smith | Polly Smith
R. Westlund | V. Westlund | Darwin Westlund | E. Rodriguez | A. Rodriguez | T. Rodriguez
R. Westlund | V. Westlund | Darwin Westlund | Isaac Levitz | Helen Levitz | Aaron Levitz
R. Westlund | V. Westlund | Darwin Westlund | Isaac Levitz | Helen Levitz | Hanna Levitz
R. Westlund | V. Westlund | Darwin Westlund | Bob Jones | Betty Jones | Samantha Jones
• Remove any tuples for which the duplicate attributes {Father, Mother} do not have identical values (Table 12.37).
TABLE 12.37 The tuples with identical values for duplicate attributes
Father | Mother | Biological Child | Father | Mother | Adoptive Child
John Smith | Jane Smith | William Smith | John Smith | Jane Smith | Carmen Smith
John Smith | Jane Smith | William Smith | John Smith | Jane Smith | Polly Smith
John Smith | Jane Smith | Susan Smith | John Smith | Jane Smith | Carmen Smith
John Smith | Jane Smith | Susan Smith | John Smith | Jane Smith | Polly Smith
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | E. Rodriguez | A. Rodriguez | T. Rodriguez
• Project onto {Father, Mother, Biological Child, Adoptive Child} (Table 12.38).
TABLE 12.38 The projection onto {Father, Mother, Biological Child, Adoptive Child}
Father | Mother | Biological Child | Adoptive Child
John Smith | Jane Smith | William Smith | Carmen Smith
John Smith | Jane Smith | William Smith | Polly Smith
John Smith | Jane Smith | Susan Smith | Carmen Smith
John Smith | Jane Smith | Susan Smith | Polly Smith
E. Rodriguez | A. Rodriguez | Pablo Rodriguez | T. Rodriguez
4. The projection (Biological * Adoptive)[Father, Mother] contains all couples who have both biological and adoptive children.
Father | Mother
John Smith | Jane Smith
E. Rodriguez | A. Rodriguez
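The three-step join described in this solution (Cartesian product, selection on the duplicated attributes, projection) can be imitated with plain Python tuples. The following is a minimal sketch that uses the abbreviated names from the tables; the function name `natural_join` and the tuple layout (Father, Mother, Child) are illustrative assumptions, and a real database system would not materialize the full Cartesian product.

```python
biological = [
    ("John Smith", "Jane Smith", "William Smith"),
    ("John Smith", "Jane Smith", "Susan Smith"),
    ("E. Rodriguez", "A. Rodriguez", "Pablo Rodriguez"),
    ("W. Leblanc", "M. Leblanc", "Wanda Leblanc"),
    ("R. Westlund", "V. Westlund", "Derwin Westlund"),
    ("R. Westlund", "V. Westlund", "Darwin Westlund"),
]

adoptive = [
    ("John Smith", "Jane Smith", "Carmen Smith"),
    ("John Smith", "Jane Smith", "Polly Smith"),
    ("E. Rodriguez", "A. Rodriguez", "T. Rodriguez"),
    ("Isaac Levitz", "Helen Levitz", "Aaron Levitz"),
    ("Isaac Levitz", "Helen Levitz", "Hanna Levitz"),
    ("Bob Jones", "Betty Jones", "Samantha Jones"),
]

def natural_join(left, right):
    """Join on the shared (Father, Mother) attributes, listing them only once."""
    return [
        (f1, m1, bio_child, adopt_child)
        for (f1, m1, bio_child) in left        # Cartesian product ...
        for (f2, m2, adopt_child) in right
        if (f1, m1) == (f2, m2)                # ... keep matching tuples, then project
    ]

joined = natural_join(biological, adoptive)

# Projection onto (Father, Mother); a set removes duplicate couples.
couples = sorted({(f, m) for (f, m, _, _) in joined})

for row in joined:
    print(row)
print(couples)
```

The intermediate product has 6 × 6 = 36 tuples, which is why Table 12.36 is so long compared with the final result.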
Quick Check 12.8
1. This relation is in first normal form because there are no attributes with sets as values. The essential functional dependencies are listed next.
{Team, Player, Position} → Captain
The only key is {Team, Player, Position}, so it is the primary key. This relation is in third normal form because the only functional dependency has a key for the left-hand side.
2. This relation is in first normal form because there are no attributes with sets as values. The essential functional dependencies are listed next.
EmployeeID → Name
EmployeeID → Job Grade
EmployeeID → Salary
Job Grade → Salary
Thus, the primary (and only) key is EmployeeID. Since there are no proper subsets of the primary key, this relation is in second normal form. However, the functional dependency Job Grade → Salary prevents this from being in third normal form.
Quick Check 12.9
1. The relation ScheduleB[Instructor, Office, Teaching Assistant] is in third normal form because either Instructor or Office could be chosen as primary key. (The most likely choice for primary key is Instructor, with Office as alternate key.) Teaching Assistant is the only choice for B in Definition 12.24. However, there is no choice for D that does not contain a key. The algorithm decomposed ScheduleB into three relations. The decomposition {ScheduleB[Course, Section, Semester, Instructor], ScheduleB[Instructor, Office, Teaching Assistant]} is a viable alternative decomposition (many database designers would prefer it).
Quick Check 12.10
1. (a) This is not a minterm because it contains the constant, 1.
(b) This is not a minterm (but it is in disjunctive normal form). A minterm must contain each of the candidate binary variables in either complemented or uncomplemented form.
(c) This is a minterm in $x_1, x_2, x_3$.
2. (a) This binary expression is in disjunctive normal form.
(b) This is not in disjunctive normal form because it is not a sum of minterms (0 is not a minterm) and it is not just 0.
(c) This binary expression is in disjunctive normal form. (The summation may contain just one term.)
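Quick Check 12.11
1. There are four ordered triples at which f evaluates to 1. The corresponding minterms are listed.
triple | minterm
(0,1,0) | $\overline{x_1} \cdot x_2 \cdot \overline{x_3}$
(0,1,1) | $\overline{x_1} \cdot x_2 \cdot x_3$
(1,1,0) | $x_1 \cdot x_2 \cdot \overline{x_3}$
(1,1,1) | $x_1 \cdot x_2 \cdot x_3$
Thus, it is possible to write
$f(x_1, x_2, x_3) = \overline{x_1} \cdot x_2 \cdot \overline{x_3} + \overline{x_1} \cdot x_2 \cdot x_3 + x_1 \cdot x_2 \cdot \overline{x_3} + x_1 \cdot x_2 \cdot x_3$.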
Move all complements onto single variables
(0 + X0) X'2 + X1 • (X 2 + 56) =
=
(I)
)
+ Xl " (X2 + T3) De Morgan
(1 •I) • 2 + X1
(x2 + ý-)
complement of 0
Transform to a sum of products
(1 • TO) • T2 + x1 - (X2 + x3) = (0 • ý-) T2 + x1 •x2 + X1 • i-3 distributivity = I xT1 x2 + xl - X2 +-
xi• 3
associativity
797
12.6 QUICK CHECK SOLUTIONS Transform into disjunctive normal form I
-X2 +x
.x2- +xl
-xY3
= 1 - (TIF. -i) + 3x2 + x1
= Xl
•jY
+X1 =
± X1
XT "X2
=
lI + x1
-)
= (5F-.
T'
Xl
•-.
X2 + XI
X2
•
3 + X1X2
replacement algorithm (i = 3)
•i2
"i'x12" )3± Xl
'x2"X3 +
X-1
replacement algorithm (i = 2)
• T3 • 2
x3 + TXl- "X2"
l-
identity
X2
T2"
+X
commutativity
T3
x2 + xl
X2 + x1 - -
+
x2
associativity
x1X2 + X1 •-3
j
+ 53 -Xl
X2' X3
• Y3- X2
replacement algorithm (i = 3)
XI x3 Y x2
Now sort within terms to produce minterms. Tl
"2'X3.+
-Xl'X2'
.6+
Xj-"X2 'X3
-Xl + -X1 "X21 - T3+ X1
X1 2' X-ý23 73+ '
F22
Sort and reduce into a unique disjunctive normal form Sort the minterms. Xl " X2 '3X3
+ X12 X2
Ti-X +
X1X2
T3 ± X1 T
Ti + Ti1
- 2
+X3 Ti 2
3
Remove duplicate minterms. Xl "X2 'X3 + Xl "X2
5i +-Xl X1 '323 +
T-xl 323 T
- + Xl3TiE'
Check the answer. (0+xI). T + x1
X1
X2
X3
0
0
0
0
0
1
1
0
1
0
0
0
1
1
0
1
0
0
1
1
0
1
0
1
1
1
1
0 1
1
(x 2 +T)
1
'i.2 .3 +1l '.2 '.X3 +71 '.2.3 The minterms in the expression x. .X2 '.x3 +-Xi'.2.1l correspond to the ordered triples where the original expression evaluates to 1.
Quick Check 12.13
1. The two phases are shown.
Phase 1:
1 | 110 | ✓
2 | 101 |
3 | 011 | ✓
4 | 010 | ✓
1,4 | -10
3,4 | 01-
Phase 2:
    | 1 (110) | 2 (101) | 3 (011) | 4 (010)
101 |         | X       |         |
-10 | X       |         |         | X
01- |         |         | X       | X
In order to cover every column, all three of the rows are required. The minimized binary expression is thus $x \cdot \overline{y} \cdot z + y \cdot \overline{z} + \overline{x} \cdot y$.
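Phase 1 of the algorithm (repeatedly merging terms that differ in exactly one position) is mechanical enough to sketch in code. The version below is a simplified illustration, not the textbook's presentation: it compares every pair of terms instead of grouping them by the number of 1s, it represents terms as strings over `0`, `1`, and `-`, and the function names `combine` and `prime_implicants` are hypothetical. Phase 2 (choosing rows to cover every column) is not implemented.

```python
from itertools import combinations

def combine(a, b):
    """Merge two terms that differ in exactly one position holding 0 and 1.

    Returns the merged term with '-' in that position, or None if the
    terms cannot be merged.
    """
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms):
    """Return the terms left unchecked at the end of phase 1."""
    current = set(minterms)
    primes = set()
    while current:
        used, merged = set(), set()
        for a, b in combinations(sorted(current), 2):
            m = combine(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))   # these terms receive a check mark
        primes |= current - used      # unchecked terms survive to phase 2
        current = merged
    return primes

# The four minterms 110, 101, 011, 010 from this quick check.
print(sorted(prime_implicants({"110", "101", "011", "010"})))
```

For these minterms the surviving terms are `101`, `-10`, and `01-`, which are the three rows of the phase 2 table.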
Quick Check 12.14
1. The first step is to create a binary function that matches the desired circuit.
x | y | z | f(x, y, z)
0 | 0 | 0 | 0
0 | 0 | 1 | 0
0 | 1 | 0 | 0
0 | 1 | 1 | 1
1 | 0 | 0 | 0
1 | 0 | 1 | 1
1 | 1 | 0 | 1
1 | 1 | 1 | 1
A binary expression in disjunctive normal form that represents this function is $\overline{x} \cdot y \cdot z + x \cdot \overline{y} \cdot z + x \cdot y \cdot \overline{z} + x \cdot y \cdot z$. Now use Quine-McCluskey to simplify the expression.
Phase 1:
1 | 011 | ✓
2 | 101 | ✓
3 | 110 | ✓
4 | 111 | ✓
1,4 | -11
2,4 | 1-1
3,4 | 11-
Phase 2:
    | 1 | 2 | 3 | 4
-11 | X |   |   | X
1-1 |   | X |   | X
11- |   |   | X | X
All three of the rows are needed. The simplified function is $f(x, y, z) = y \cdot z + x \cdot z + x \cdot y$. A circuit that implements this function is shown in Figure 12.12.
Figure 12.12. A combinatorial circuit for a three-way majority.
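Quick Check 12.15
1. $\neg(x \land y) \Leftrightarrow \neg((x \uparrow y) \uparrow (x \uparrow y)) \Leftrightarrow ((x \uparrow y) \uparrow (x \uparrow y)) \uparrow ((x \uparrow y) \uparrow (x \uparrow y))$
2. $a \to b \Leftrightarrow (\neg a) \lor b \Leftrightarrow (a \uparrow a) \lor b \Leftrightarrow ((a \uparrow a) \uparrow (a \uparrow a)) \uparrow (b \uparrow b)$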
12.7 Chapter Review
12.7.1 Summary
This chapter is about functions and relations, together with some important applications. You have been working with functions for many years. Relations, which are a generalization of functions, may be less familiar. The material in this chapter demonstrates that this is a profitable generalization.
The chapter begins with a review of the notion of a function (Section 12.1). The review presents functions as special subsets of a Cartesian product. This definition is equivalent to the more familiar notion of a mapping, but it provides a simpler transition to the notion of a relation. The section includes discussions of the important notions of one-to-one, onto, composition, and inverse.
Section 12.2 discusses several properties that may apply to a particular relation. Three of these properties are especially noteworthy: reflexive, symmetric, and transitive. A relation for which all three of these properties apply is called an equivalence relation. Equivalence relations play an important role in algebraic structures courses. Examples 12.10 and 12.11 give a glimpse of that role.
Section 12.3 generalizes the notion of a relation to that of an n-ary relation. Perhaps the most visible application of n-ary relations is their use as the basis for relational databases. Some of the power and flexibility of relational databases derives from the firm theoretical foundation that is available from the mathematical theory of relations. This section does not attempt to provide a well-rounded introduction to relational databases. Instead, it focuses on the concept of normal forms. By converting the relations in a relational database to an appropriate normal form, some potential inconsistencies in the database can be avoided.
Section 12.4 switches back from relations to functions. The section starts with a brief look at functions whose domain is a Boolean algebra but concentrates on functions whose domain is the set {0, 1}. These binary functions are often expressed by using binary expressions. The study of binary expressions is aided by introducing minterms and disjunctive normal form. Binary functions have much more inherent structure than do functions whose domain is the real numbers. The major result of this section is a proof that every binary function can be expressed in disjunctive normal form.
Section 12.5 builds on the foundation laid in Section 12.4. It begins with the Quine-McCluskey algorithm for minimizing a binary expression which is in disjunctive normal form. A direct application is in the design of combinatorial circuits. A binary function can be constructed that specifies the input-output behavior of the desired circuit. That function can be expressed in disjunctive normal form, which can then be minimized using the Quine-McCluskey algorithm. The minimized expression can then be converted into a circuit design with fewer components than would have been the case for the original version of the function.
This chapter effectively illustrates that there is great depth to even apparently simple mathematical ideas. The notion of a function is a familiar part of the modern mathematical landscape. However, even that familiar idea can be extended in interesting and useful ways. Most of the material in this chapter is very straightforward and can be mastered with a reasonable investment of time and effort. The material on normal forms in relational databases is perhaps the hardest part of the chapter. You might want to spend extra time reviewing that section.
Chapter 12 Functions, Relations, Databases, and Circuits
12.7.2 Notation

Notation                                  Page   Brief Description
$\mathcal{G} \circ \mathcal{F}$           731    the composition of the functions (or relations), $\mathcal{F}$ and $\mathcal{G}$
$a \mathcal{R} b$                         732    an alternative way to indicate that $(a, b) \in \mathcal{R}$
$\mathcal{R}^{-1}$                        733    the inverse relation of $\mathcal{R}$
$\mathcal{R}_r$                           738    the reflexive closure of the relation, $\mathcal{R}$
$\mathcal{R}_s$                           738    the symmetric closure of the relation, $\mathcal{R}$
$\mathcal{R}_t$                           738    the transitive closure of the relation, $\mathcal{R}$
$\mathcal{R}^n$                           739    the composition of the relation, $\mathcal{R}$, with itself $n$ times
$[x]$                                     740    the equivalence class of element, $x$
$\mathcal{T}[B_1, B_2, \ldots, B_j]$      749    the projection of a relation, $\mathcal{T}$, onto attributes $\{B_1, B_2, \ldots, B_j\}$
$\mathcal{T}_1 * \mathcal{T}_2$           749    the join of the relations, $\mathcal{T}_1$ and $\mathcal{T}_2$
$\{A_1, A_2, \ldots, A_j\} \to B$         751    $B$ is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$
$\overline{x}$                            769    the complement of the binary variable, $x$ (interchanges 0 and 1)
$\tilde{x}$                               769    generic notation for either $x$ or $\overline{x}$
[gate symbol]                             785    the standard symbol for an AND gate
[gate symbol]                             785    the standard symbol for an OR gate
[gate symbol]                             785    the standard symbol for a NOT gate
$\uparrow$                                789    the NAND logic operator
$\downarrow$                              789    the NOR logic operator
12.7.3 Definitions

Function; Domain; Range. A function from the nonempty set $\mathcal{D}$ into the nonempty set $\mathcal{R}$ is a subset, $\mathcal{F}$, of the Cartesian product $\mathcal{D} \times \mathcal{R}$ such that every element of $\mathcal{D}$ appears in one and only one ordered pair in $\mathcal{F}$. The set, $\mathcal{D}$, is called the domain and the set, $\mathcal{R}$, is called the range. The image of the function is the subset of $\mathcal{R}$ consisting of elements that actually appear in the right-hand side of at least one ordered pair in $\mathcal{F}$.

Onto and One-to-One Functions. Let $\mathcal{F}$ be a function from $\mathcal{D}$ into $\mathcal{R}$. Then $\mathcal{F}$ is called onto if every element of $\mathcal{R}$ appears as a second coordinate in at least one ordered pair in $\mathcal{F}$. If no element of $\mathcal{R}$ appears as second coordinate in more than one ordered pair in $\mathcal{F}$, then $\mathcal{F}$ is called one-to-one (also abbreviated as 1-1).

Inverse Function. Let $\mathcal{F}$ be a one-to-one and onto function with domain, $\mathcal{D}_{\mathcal{F}}$, and range, $\mathcal{R}_{\mathcal{F}}$. A function, $\mathcal{G}$, whose domain is $\mathcal{R}_{\mathcal{F}}$ and whose range is $\mathcal{D}_{\mathcal{F}}$ is called the inverse of $\mathcal{F}$ if the following conditions hold:
• If $(x, y) \in \mathcal{F}$, then $(y, x) \in \mathcal{G}$.
• If $(y, x) \in \mathcal{G}$, then $(x, y) \in \mathcal{F}$.

Composition of Functions. Let $\mathcal{F}$ be a function whose domain is $X$ and whose range is $Y$. Let $\mathcal{G}$ be a function whose domain is $Y$ and whose range is $Z$. The composition of $\mathcal{G}$ and $\mathcal{F}$ is defined as
$\mathcal{G} \circ \mathcal{F} = \{(x, z) \mid \exists y \in Y \text{ with } (x, y) \in \mathcal{F} \text{ and } (y, z) \in \mathcal{G}\}$.

Relation. A relation between the set $A$ and the set $B$ is a subset, $\mathcal{R}$, of the Cartesian product $A \times B$. If $(a, b) \in \mathcal{R}$, it is common to write $a \mathcal{R} b$ and to say that $a$ is related to $b$. If $A = B$, the relation is said to be a relation on $A$. The set $A$ is called the domain of the relation and the set $B$ is called the range.

Onto and One-to-One Relations. Let $\mathcal{R}$ be a relation between the sets $A$ and $B$. Then $\mathcal{R}$ is called onto if every element of $B$ appears as a second coordinate in at least one ordered pair in $\mathcal{R}$. If no element of $B$ appears as second coordinate in more than one ordered pair in $\mathcal{R}$, then $\mathcal{R}$ is called one-to-one.

Inverse Relation. Let $\mathcal{R}$ be a relation between the sets $A$ and $B$. The inverse relation of $\mathcal{R}$ is denoted $\mathcal{R}^{-1}$ and is a subset of the Cartesian product $B \times A$. More precisely, $\mathcal{R}^{-1} = \{(b, a) \in B \times A \mid (a, b) \in \mathcal{R}\}$.

Composition of Relations. Let $\mathcal{R}$ be a relation whose domain is $A$ and whose image is $B$. Let $\mathcal{S}$ be a relation whose domain contains $B$ and whose range is $C$. The composition of $\mathcal{S}$ and $\mathcal{R}$ is a subset of $A \times C$. It is denoted by $\mathcal{S} \circ \mathcal{R}$ and is defined as
$\mathcal{S} \circ \mathcal{R} = \{(a, c) \mid \exists b \in B \text{ with } (a, b) \in \mathcal{R} \text{ and } (b, c) \in \mathcal{S}\}$.

Reflexive; Symmetric; Transitive. Let $\mathcal{R}$ be a relation on a set, $A$.
• If $(x, x) \in \mathcal{R}$ for all $x \in A$, then $\mathcal{R}$ is reflexive.
• If $(y, x) \in \mathcal{R}$ whenever $(x, y) \in \mathcal{R}$, then $\mathcal{R}$ is symmetric.
• If $(x, z) \in \mathcal{R}$ whenever $(x, y) \in \mathcal{R}$ and $(y, z) \in \mathcal{R}$, then $\mathcal{R}$ is transitive.

Equivalence Relation. Let $\mathcal{R}$ be a relation on a set, $A$. If $\mathcal{R}$ is reflexive, symmetric, and transitive, then it is called an equivalence relation.

Antireflexive; Antisymmetric; Asymmetric. Let $\mathcal{R}$ be a relation on a set, $A$.
• $\mathcal{R}$ is antireflexive if $(x, y) \in \mathcal{R}$ implies $x \neq y$.
• $\mathcal{R}$ is antisymmetric if $(x, y) \in \mathcal{R}$ with $x \neq y$ implies $(y, x) \notin \mathcal{R}$.
• $\mathcal{R}$ is asymmetric if $(x, y) \in \mathcal{R}$ implies $(y, x) \notin \mathcal{R}$.

Partial Ordering; Poset. A relation, $\mathcal{R}$, on a set, $A$, is a partial ordering if it is reflexive, antisymmetric, and transitive. The pair, $(A, \mathcal{R})$, is called a partially ordered set, or poset.

Hasse Diagram. A Hasse diagram is used to visually represent posets. A Hasse diagram satisfies three properties.
• There is a vertex for every element of $A$.
• Element $y$ appears higher in the graph than element $x$ if $x \mathcal{R} y$ and $x \neq y$.
• If $x \mathcal{R} y$ with $x \neq y$ and there is no $z \in A$, distinct from $x$ and $y$, with both $x \mathcal{R} z$ and $z \mathcal{R} y$, then there is an edge joining the vertices representing $x$ and $y$.
Reflexivity is assumed, so it is not explicitly shown on the diagram.

Closure of a Set. A common notion in higher mathematics is that of the closure of a set. The key idea is that the original set, $S$, does not satisfy some property. The deficiency might be corrected by adding additional elements to $S$. The goal is to find the smallest set that contains $S$ and also satisfies the desired property.

Closures of Relations. Let $\mathcal{R}$ be a relation on a set, $A$.
• The reflexive closure of $\mathcal{R}$ is the smallest relation, $\mathcal{R}_r$, such that $\mathcal{R} \subseteq \mathcal{R}_r$ and $\mathcal{R}_r$ is reflexive on $A$.
• The symmetric closure of $\mathcal{R}$ is the smallest relation, $\mathcal{R}_s$, such that $\mathcal{R} \subseteq \mathcal{R}_s$ and $\mathcal{R}_s$ is symmetric on $A$.
• The transitive closure of $\mathcal{R}$ is the smallest relation, $\mathcal{R}_t$, such that $\mathcal{R} \subseteq \mathcal{R}_t$ and $\mathcal{R}_t$ is transitive on $A$.

$\mathcal{R}^n$. Let $\mathcal{R}$ be a relation on a set, $A$. Then
• $\mathcal{R}^0 = \{(x, x) \in A \times A \mid x \in A\}$
• $\mathcal{R}^1 = \mathcal{R}$
• $\mathcal{R}^n = \mathcal{R} \circ \mathcal{R}^{n-1}$

Equivalence Class. Let $\mathcal{R}$ be an equivalence relation on a set, $A$, and let $x \in A$. The equivalence class of $x$ is denoted by $[x]$ and is defined as $[x] = \{a \in A \mid (x, a) \in \mathcal{R}\}$. The set $\{[x] \mid x \in A\}$ is referred to as the set of equivalence classes induced by $\mathcal{R}$ on $A$. The element, $x$, that appears in the notation "$[x]$" is called the class representative.

Well Defined. Let $\mathcal{R}$ be an equivalence relation on a set, $A$. Let $\odot$ be a binary operation on $A$, and let $\boxdot$ be a binary operation defined on the equivalence classes that $\mathcal{R}$ induces on $A$. Finally, let $\boxdot$ be defined by $[x] \boxdot [y] = [x \odot y]$, for $x, y \in A$. If $([x] \boxdot [y]) = ([x'] \boxdot [y'])$ whenever $[x] = [x']$ and $[y] = [y']$, then the operation, $\boxdot$, is said to be well defined.

n-ary Relation. An n-ary relation in (or on) the sets $A_1, A_2, \ldots, A_n$ is a subset, $\mathcal{R}$, of the Cartesian product $A_1 \times A_2 \times \cdots \times A_n$.

Binary Relation. A binary relation is a 2-ary relation.

Ternary Relation. A ternary relation is a 3-ary relation.

Relational Database; Attributes; Tuples. A relational database is a collection, $\{\mathcal{T}_1, \mathcal{T}_2, \ldots, \mathcal{T}_k\}$, of relations, where $\mathcal{T}_j$ is an $n_j$-ary relation, for $j = 1, 2, \ldots, k$. The coordinate positions in the $n_j$-tuples of $\mathcal{T}_j$ are called attributes and must be single valued. Each attribute has an attribute name. The set of all attribute names will be called the attribute set. The individual ordered $n_j$-tuples in the relation, $\mathcal{T}_j$, are simply called tuples when the value of $n_j$ is understood. Two alternative systems of terminology are shown in the next table.

Mathematical View    Two-Dimensional View    Computer Storage View
relation             table                   file
tuple                row                     record
attribute            column                  field
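
The relation properties and closures defined in this section can be checked mechanically for finite relations. The following is a minimal Python sketch, assuming a relation is stored as a set of ordered pairs; the function names and the two example relations (`mod3` and `divides`) are illustrative choices, not part of the text.

```python
def is_reflexive(rel, universe):
    """Every x in the underlying set is related to itself."""
    return all((x, x) in rel for x in universe)

def is_symmetric(rel):
    """(y, x) is present whenever (x, y) is present."""
    return all((y, x) in rel for (x, y) in rel)

def is_transitive(rel):
    """(x, w) is present whenever (x, y) and (y, w) are present."""
    return all((x, w) in rel
               for (x, y) in rel
               for (z, w) in rel if y == z)

def is_antisymmetric(rel):
    """(x, y) with x != y excludes (y, x)."""
    return all(x == y or (y, x) not in rel for (x, y) in rel)

def reflexive_closure(rel, universe):
    """Smallest reflexive relation on the universe containing rel."""
    return set(rel) | {(x, x) for x in universe}

def symmetric_closure(rel):
    """Smallest symmetric relation containing rel."""
    return set(rel) | {(y, x) for (x, y) in rel}

# Illustrative example 1: congruence modulo 3 on {0, ..., 5}.
A = range(6)
mod3 = {(x, y) for x in A for y in A if (x - y) % 3 == 0}

# Illustrative example 2: "x divides y" on {1, ..., 6}.
B = range(1, 7)
divides = {(x, y) for x in B for y in B if y % x == 0}

is_equivalence = (is_reflexive(mod3, A) and is_symmetric(mod3)
                  and is_transitive(mod3))
is_partial_order = (is_reflexive(divides, B) and is_antisymmetric(divides)
                    and is_transitive(divides))
print(is_equivalence, is_partial_order)
```

Under these assumptions, `mod3` passes the three tests for an equivalence relation, while `divides` is reflexive, antisymmetric, and transitive, so it is a partial ordering but not symmetric.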
Projection; Join. Let $\mathcal{T}_1$ be an $n_1$-ary relation with attribute set $A_1$ and $\mathcal{T}_2$ be an $n_2$-ary relation with attribute set $A_2$. If $\{B_1, B_2, \ldots, B_j\} \subseteq A_1$, then the projection of $\mathcal{T}_1$ onto $\{B_1, B_2, \ldots, B_j\}$ is the relation obtained by
1. removing from each tuple in $\mathcal{T}_1$ the components that do not correspond to an attribute in $\{B_1, B_2, \ldots, B_j\}$,
2. and then removing any duplicate tuples (keeping one copy).
The projection of a relation, $\mathcal{T}$, onto attributes $\{B_1, B_2, \ldots, B_j\}$ is denoted by $\mathcal{T}[B_1, B_2, \ldots, B_j]$. Similar notation is used to denote the projection of a single tuple in $\mathcal{T}$ onto the attribute set $\{B_1, B_2, \ldots, B_j\}$.
The join, $\mathcal{T}_1 * \mathcal{T}_2$, of $\mathcal{T}_1$ and $\mathcal{T}_2$ is a relation having attribute set $B = A_1 \cup A_2$, with an ordering imposed on $B$. Assume that the ordered attributes in the union, $B$, are $B_1, B_2, \ldots, B_n$. Then
$\mathcal{T}_1 * \mathcal{T}_2 = \{r \in B_1 \times B_2 \times \cdots \times B_n \mid \exists r_1 \in \mathcal{T}_1 \text{ and } \exists r_2 \in \mathcal{T}_2 \text{ with } r[A_1] = r_1 \text{ and } r[A_2] = r_2\}$.

Functional Dependence (Informal). Let $\mathcal{T}$ be a relation and $A$ be the attribute set of $\mathcal{T}$. Let $B \in A$ and $\{A_1, A_2, \ldots, A_j\} \subseteq A$ with $j \geq 1$. The attribute $B$ is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$ if every distinct choice of values for the attributes $A_1, A_2, \ldots, A_j$ uniquely determines the value of $B$.

Functional Dependence (Formal). Let $\mathcal{T}$ be a relation and $A$ be the attribute set of $\mathcal{T}$. Let $B \in A$ and $\{A_1, A_2, \ldots, A_j\} \subseteq A$ with $j \geq 1$. The attribute, $B$, is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$ if for every pair of tuples, $r_1$ and $r_2$ in $\mathcal{T}$,
$r_1[A_1, A_2, \ldots, A_j] = r_2[A_1, A_2, \ldots, A_j]$ implies $r_1[B] = r_2[B]$.
The notation $\{A_1, A_2, \ldots, A_j\} \to B$ indicates that $B$ is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$.

Key; Primary Key; Alternate Key; Nonkey Attribute. Let $\mathcal{T}$ be a relation and $A$ be the attribute set of $\mathcal{T}$. A nonempty subset, $P = \{A_1, A_2, \ldots, A_j\}$, of $A$ is called a key for $\mathcal{T}$ if
1. All attributes in $A - P$ (the set difference) are functionally dependent on $P$.
2. No nonempty proper subset of $P$ has property 1.
If a key consists of only one attribute, $B$, it is customary to speak of the key $B$ instead of the key $\{B\}$. If there is more than one key, one of them is chosen to be the primary key, and the other choices are demoted to the status of alternate keys. An attribute that is not part of the primary key is called a nonkey attribute.

Essential Functional Dependencies. The phrase essential functional dependencies is used informally in this book. The collection of essential functional dependencies in a relation consists of those that have no elements of the left-hand side of a functional dependence that are functionally dependent on the other elements of the left-hand side.

Foreign Key. A foreign key is an attribute (or set of attributes) that is a primary key in some other relation in the same relational database.

Update and Deletion Anomalies. An update or deletion anomaly is an inconsistency that enters a relational database as the result of changing or removing information in the database. One common way this happens is if there is redundancy in the database and only some (but not all) of the multiple copies of the information are changed or deleted. Normal forms seek to eliminate these (and other) kinds of anomalies.

Decomposition; Lossless Decomposition. Let $\mathcal{T}$ be a relation with attribute set $A = \{D_1, D_2, \ldots, D_n\}$, where the subsets, $D_i$, of attributes are not necessarily disjoint. The set of relations $\{\mathcal{T}[D_1], \mathcal{T}[D_2], \ldots, \mathcal{T}[D_n]\}$ is a decomposition of $\mathcal{T}$. If, in addition, $\mathcal{T} = \mathcal{T}[D_1] * \mathcal{T}[D_2] * \cdots * \mathcal{T}[D_n]$, the decomposition is called a lossless decomposition.

First, Second, and Third Normal Forms. Let $\mathcal{T}$ be a relation with attribute set $A$, and let $D \subseteq A$ be a subset of attributes. Let $B \notin D$ be an attribute that is not in any key.
First Normal Form. $\mathcal{T}$ is in first normal form if every attribute in $\mathcal{T}$ is single valued.
Second Normal Form. $\mathcal{T}$ is in second normal form if it is in first normal form and if $D \to B$ implies that $D$ is not properly contained in any key of $\mathcal{T}$.
Third Normal Form. $\mathcal{T}$ is in third normal form if it is in first normal form and if $D \to B$ implies that $D$ contains some key of $\mathcal{T}$.

Boyce-Codd Normal Form. Let $\mathcal{T}$ be a relation in a relational database with attribute set $A$. Let $D \subseteq A$ and $B \notin D$. Then $\mathcal{T}$ is in Boyce-Codd normal form if $D \to B$ implies that $D$ contains some key of $\mathcal{T}$.

Single-Variable Boolean Function. Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. A single-variable Boolean function is a function whose domain is $B$ and whose range is $\{0, 1\}$.

Multivariable Boolean Function. Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. An n-variable Boolean function is a function whose domain is $B \times B \times \cdots \times B$ ($n$ times) and whose range is $\{0, 1\}$.
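
The formal definitions of functional dependence and key translate directly into a check on a stored table. The following is a minimal sketch, assuming each tuple is a Python dict keyed by attribute name; the relation `rows` and its attributes (Course, Section, Room) are hypothetical. Note that such a check only tests the dependencies against the tuples currently stored; it cannot confirm that a dependency is intended to hold for all future data.

```python
from itertools import combinations

def determines(rows, lhs, b):
    """True if attribute b is functionally dependent on the attributes in lhs:
    tuples that agree on lhs must also agree on b."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[b]) != row[b]:
            return False
    return True

def has_property_1(rows, attrs, p):
    """Property 1 of a key: every attribute outside p depends on p."""
    return all(determines(rows, p, b) for b in attrs if b not in p)

def is_key(rows, attrs, candidate):
    """True if candidate has property 1 and no nonempty proper subset does."""
    cand = tuple(candidate)
    if not cand or not has_property_1(rows, attrs, cand):
        return False
    return not any(has_property_1(rows, attrs, sub)
                   for r in range(1, len(cand))
                   for sub in combinations(cand, r))

# Hypothetical relation used only for illustration.
attrs = ("Course", "Section", "Room")
rows = [
    {"Course": "Math", "Section": 1, "Room": "A"},
    {"Course": "Math", "Section": 2, "Room": "B"},
    {"Course": "CS",   "Section": 1, "Room": "A"},
]

print(is_key(rows, attrs, ("Course", "Section")))
```

In this hypothetical table, {Course, Section} is a key, {Course} alone is not, and the full attribute set fails condition 2 because a proper subset already has property 1.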
Binary Function. A binary function of order $n$ is an n-variable Boolean function on a Boolean algebra whose associated set is $B = \{0, 1\}$.

Binary Variable; Binary Expression. A binary variable is one whose possible values are either 0 or 1. A binary expression is an algebraic expression that is composed using the symbols $0$, $1$, $+$, $\cdot$, $\overline{\phantom{x}}$ (complement), and binary variables.

Minterm. Let $x_1, x_2, \ldots, x_n$ be $n$ binary variables. A minterm is a binary expression in the form $\tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n$, where $\tilde{x}_i$ is either $x_i$ or $\overline{x_i}$, for $i = 1, 2, \ldots, n$. There is an assumed ordering of the variables.

Canonical Form. A mathematical entity is said to be in canonical form if it conforms to some predetermined rules. There are many canonical forms, depending on the context. The canonical form of interest here is disjunctive normal form.

Disjunctive Normal Form. A binary expression is in disjunctive normal form if it is either a sum of distinct minterms or it is the expression, 0.

Functionally Complete. A collection of logic operators in propositional logic is functionally complete if every compound statement is logically equivalent to some statement that contains only operators from the collection.

Quine-McCluskey Algorithm; Karnaugh Maps. The Quine-McCluskey algorithm and Karnaugh maps are two alternative methods for minimizing binary expressions.

Gate. A gate is an element in a logic circuit.

AND Gate. If both inputs, $x$, $y$, of an AND gate are 1, then the output will be 1. Otherwise, the output will be 0.

OR Gate. If either input, $x$, $y$, of an OR gate is 1, then the output will be 1. Otherwise, the output will be 0.

NOT Gate. A NOT gate changes an input value of 1 to a 0 and a 0 to a 1.

Combinatorial Circuit. A combinatorial circuit is a circuit in which there are no delay elements.

Sequential Circuit. A sequential circuit is a circuit in which delay elements exist.

NAND. The logic operator, NAND, is denoted by $\uparrow$ and defined by the following truth table.

P    Q    P $\uparrow$ Q
T    T    F
T    F    T
F    T    T
F    F    T

NOR. The logic operator, NOR, is denoted by $\downarrow$ and defined by the following truth table.

P    Q    P $\downarrow$ Q
T    T    F
T    F    F
F    T    F
F    F    T
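
The NAND and NOR truth tables, and the idea of functional completeness, can be explored by exhaustive checking. The sketch below assumes Python booleans stand in for T and F; the helper names `not_`, `and_`, and `or_` are illustrative. It uses the standard constructions of NOT, AND, and OR from NAND alone.

```python
from itertools import product

def nand(p, q):
    """NAND: false only when both inputs are true."""
    return not (p and q)

def nor(p, q):
    """NOR: true only when both inputs are false."""
    return not (p or q)

# NOT, AND, and OR written using NAND only.
def not_(p):
    return nand(p, p)

def and_(p, q):
    return nand(nand(p, q), nand(p, q))

def or_(p, q):
    return nand(nand(p, p), nand(q, q))

# Check all four rows of the truth table.
for p, q in product([True, False], repeat=2):
    assert not_(p) == (not p)
    assert and_(p, q) == (p and q)
    assert or_(p, q) == (p or q)
    print(p, q, nand(p, q), nor(p, q))
```

Because NOT, AND, and OR can all be rebuilt from NAND, any statement written with those three operators can be rewritten with NAND alone, which is the content of the functional completeness claim for NAND.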
12.7.4 Theorems

Theorem 12.1 Composition of Relations is Associative. Let $\mathcal{R}$ be a relation whose domain is $A$ and whose image is $B$. Let $\mathcal{S}$ be a relation whose domain contains $B$ and whose image is $C$. Finally, let $\mathcal{T}$ be a relation whose domain contains $C$ and whose range is $D$. Then $(\mathcal{T} \circ \mathcal{S}) \circ \mathcal{R} = \mathcal{T} \circ (\mathcal{S} \circ \mathcal{R})$.

Theorem 12.2 Transitive Relations. Let $\mathcal{R}$ be a relation on a set, $A$. Then $\mathcal{R}$ is transitive if and only if $\mathcal{R}^n \subseteq \mathcal{R}$, $\forall n \in \mathbb{Z}^+$.

Proposition 12.1 Equivalence Classes are Disjoint. Let $\mathcal{R}$ be an equivalence relation on a set, $A$. If $x$ and $y$ are two elements in $A$, then either $[x] = [y]$ or else $[x] \cap [y] = \emptyset$.

Theorem 12.3 Equivalence Relations and Partitions. Let $A$ be a set.
• If $\mathcal{R}$ is an equivalence relation on $A$, then the equivalence classes of $\mathcal{R}$ form a partition of $A$.
• Every partition of $A$ determines an equivalence relation on $A$.

Theorem 12.4 A Lossless Decomposition. Let $\mathcal{T}$ be a relation with attribute set $A$, and let $C \subseteq A$ and $D \subseteq A$ with $C \cap D = \emptyset$. Set $E = A - (C \cup D)$. If $C \to D$, then the two projections, $\mathcal{T}[C \cup D]$ and $\mathcal{T}[C \cup E]$, form a lossless decomposition of $\mathcal{T}$.

Corollary 12.1. Let $\mathcal{T}$ be a relation with attribute set $A$, and let $C \subseteq A$ and $D \subseteq A$. Set $E = A - (C \cup D)$. If $C \to D$, then the two projections, $\mathcal{T}[C \cup D]$ and $\mathcal{T}[C \cup E]$, form a lossless decomposition of $\mathcal{T}$.

Proposition 12.2. If $\{A_1, A_2, \ldots, A_j\} \subseteq A$ and $B \in \{A_1, A_2, \ldots, A_j\}$, then $B$ is functionally dependent on $\{A_1, A_2, \ldots, A_j\}$.

Theorem 12.5 Any Relational Database Can be Converted to Third Normal Form. Let $\{\mathcal{T}_1, \mathcal{T}_2, \ldots, \mathcal{T}_n\}$ be a relational database. Then the relations, $\mathcal{T}_i$, can be losslessly decomposed into a collection of relations that are each in third normal form.

Proposition 12.3. Let $\mathcal{T}$ be a relation with attribute set $\{A_1, \ldots, A_j, B_1, \ldots, B_k, C_1, \ldots, C_n\}$, with $j, k, n \geq 1$. Set $\mathcal{T}' = \mathcal{T}[A_1, \ldots, A_j, B_1, \ldots, B_k]$. Then $\mathcal{T}'[A_1, \ldots, A_j] = \mathcal{T}[A_1, \ldots, A_j]$.

Proposition 12.4. Let $\mathcal{T}_1$, $\mathcal{T}_2$, and $\mathcal{T}_3$ be relations with attribute sets $A_1$, $A_2$, $A_3$, respectively. Impose orderings on $A_1 \cup A_2$, $A_2 \cup A_3$, and $A_1 \cup A_2 \cup A_3$. Then
• The join operator is commutative: $\mathcal{T}_1 * \mathcal{T}_2 = \mathcal{T}_2 * \mathcal{T}_1$.
• The join operator is associative: $(\mathcal{T}_1 * \mathcal{T}_2) * \mathcal{T}_3 = \mathcal{T}_1 * (\mathcal{T}_2 * \mathcal{T}_3)$.
• If $A_1 \cap A_2 = \emptyset$, then $\mathcal{T}_1 * \mathcal{T}_2 = \mathcal{T}_1 \times \mathcal{T}_2$.

Theorem 12.6 The Number of Single-Variable Boolean Functions on $\mathcal{B}$. Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. If $|B| = m$, then there are $2^m$ distinct single-variable Boolean functions on $\mathcal{B}$.

Theorem 12.7 The Number of Multivariable Boolean Functions on $\mathcal{B}$. Let $\mathcal{B}$ be a Boolean algebra with associated set, $B$. If $|B| = m$, then there are $2^{(m^n)}$ distinct n-variable Boolean functions on $\mathcal{B}$.

Corollary 12.2 The Number of Binary Functions. There are $2^{2^n}$ distinct binary functions of order $n$.

Proposition 12.5 Evaluating Minterms. Let $\tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n$ be a minterm in the binary variables, $x_1, x_2, \ldots, x_n$. Define an n-variable binary function, $f$, as $f(x_1, x_2, \ldots, x_n) = \tilde{x}_1 \cdot \tilde{x}_2 \cdots \tilde{x}_n$. Then $f$ has the value 1 at only one element in its domain; it has the value 0 at all other elements of its domain. The n-tuple at which $f$ has the value 1 is determined by setting $x_i = 1$ if $\tilde{x}_i = x_i$ and $x_i = 0$ if $\tilde{x}_i = \overline{x_i}$, for $i = 1, 2, \ldots, n$.

Theorem 12.8 Every Binary Function Can be Expressed in Disjunctive Normal Form. Let $f$ be a binary function in the binary variables, $x_1, x_2, \ldots, x_n$. Then there is a binary expression in disjunctive normal form that is equal to $f$ when viewed as a binary function.

Proposition 12.6 Moving Complements onto Single Variables. Every binary expression is equivalent to a binary expression in which the only occurrences of the complement operator, $\overline{\phantom{x}}$, are to complement single variables.

Proposition 12.7 Transforming to a Sum of Products. Let $E$ be a binary expression in which all occurrences of the complement operator involve only single variables. Then $E$ is equivalent to a binary expression that is a sum of products, where each product contains factors that are either a constant, 0 or 1, or are complemented or uncomplemented single variables. A product can consist of a single such factor.

Proposition 12.8 Transforming Sums of Products to Disjunctive Normal Form. Let $E$ be a binary expression that is a sum of products in which each factor is either 0, 1, or a complemented or uncomplemented single variable. Then $E$ is equivalent to a binary expression in disjunctive normal form.

Theorem 12.9 Every Binary Expression Is Equivalent to a Unique Expression in Disjunctive Normal Form. Every binary expression is equivalent to a unique expression in disjunctive normal form. The uniqueness requires a preestablished lexicographical ordering of the variables.

Theorem 12.10 NAND Is Functionally Complete. The set, $\{\uparrow\}$, is a functionally complete collection of logic operators.
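
Theorem 12.8 and Proposition 12.5 suggest a direct construction: list one minterm for each input tuple at which the function has the value 1. The following minimal sketch assumes a binary function is given as a Python callable on 0/1 arguments; the function name `dnf` and the use of a trailing apostrophe (x') to write a complement are illustrative conventions, not the book's notation.

```python
from itertools import product

def dnf(f, names):
    """Return a disjunctive normal form string for the binary function f,
    with variables taken in the order given by names."""
    terms = []
    for values in product([0, 1], repeat=len(names)):
        if f(*values):
            # One minterm per input tuple where f is 1:
            # use x when the value is 1 and x' (complement) when it is 0.
            terms.append("·".join(n if v else n + "'"
                                  for n, v in zip(names, values)))
    return " + ".join(terms) if terms else "0"

# Illustrative function: 1 exactly when the two inputs differ.
print(dnf(lambda x, y: x ^ y, ["x", "y"]))

# Corollary 12.2 for n = 2: a binary function of order 2 is determined by
# its 4 output values, so there are 2**(2**2) = 16 distinct truth tables.
tables = set(product([0, 1], repeat=2 ** 2))
print(len(tables))
```

The constant function 0 has no minterms, which is why the definition of disjunctive normal form allows the expression 0 as a special case.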
12.7.5 Sample Exam Questions

1. Let $\mathcal{R} = \{(0, 5), (0, 6), (1, 6), (2, 7), (3, 6), (4, 5)\}$ and $\mathcal{S} = \{(5, 2), (5, 4), (6, 5), (6, 8), (7, 7)\}$ (both subsets of $\mathbb{N} \times \mathbb{N}$). Determine the ordered pairs in $(\mathcal{S} \circ \mathcal{R})^{-1}$.

2. Let $A = \{a, b, c, d, e\}$. Find the transitive closure of the relation $\{(a, b), (a, c), (b, d), (c, e), (d, c), (e, d)\} \subseteq A \times A$.

3. Every positive integer, $i$, can be uniquely written in the form $i = q 2^j$, where $q$ is odd and $j \geq 0$. Let $\mathcal{R}$ be the relation on $\mathbb{Z}^+$ defined by $(n, m) \in \mathcal{R}$ if and only if $q_1 = q_2$, where $n = q_1 2^j$, $m = q_2 2^k$, and $q_1$ and $q_2$ are both odd.
(a) Prove that $\mathcal{R}$ is an equivalence relation.
(b) Describe the equivalence classes.

4. Define key (in the context of relational databases).

5. Consider the following relations in a relational database. The database is kept by an apartment manager. The apartment complex requires one tenant in each apartment to be designated as the "chief tenant." All other tenants in the apartment are designated as "subsidiary tenants."

Chief
Chief Tenant    Apt   Phone          Lease Expires   Deposit
Mary Anthony    212   777-222-3333   5-1-2004        $250
Walter Boyd     103   777-222-4444   5-1-2004        $200
Martin Daud     310   777-222-5555   7-1-2004        $300
Char Kane       411   777-222-1111   2-1-2005        $250

Subsidiary
Name             Apt   SPhone         Chief Tenant
Polly Boyd       103   777-222-4444   Walter Boyd
Joe Bertonelli   310   777-222-2222   Martin Daud
Peter Parks      310   777-222-5555   Martin Daud
Billie Kane      411   777-222-1111   Char Kane
Mary Kane        411   777-222-1111   Char Kane

(a) Form the projection, Subsidiary[SPhone, Chief Tenant].
(b) Form the join, Chief * Subsidiary.
(c) Find a suitable primary key for each of the two original relations. Are there any foreign keys?

6. The following relation has the primary key {Student, Level}. The only other key is {Student, Audition piece}. The relation stores information about students in a piano competition.

Competitors
Student            Level   Audition piece             Teacher
Mary Anderson      1       Winset: The Happy Rabbit   Smith
Mary Anderson      2       Beethoven: Für Elise       Smith
Mary Anderson      3       Liszt: La Campanella       Jones
Bobbie Juarez      1       Winset: The Happy Rabbit   Jones
Wilma Holter       1       Winset: The Happy Rabbit   Jones
Esteban Laureano   2       Beethoven: Für Elise       Colatti
Min Gua Lin        3       Liszt: La Campanella       Gao

The essential functional dependencies are listed next.
{Student, Level} → Audition piece
{Student, Level} → Teacher
Level → Audition piece
Audition piece → Level
Convert the relation to a collection of relations in third normal form.

7. Convert the binary expression, $(x + \overline{y}) \cdot \overline{(x \cdot y)} + 0$, into disjunctive normal form.

8. Use the Quine-McCluskey algorithm to minimize the binary function $f(x, y, z) = x \cdot y \cdot \overline{z} + \overline{x} \cdot y \cdot \overline{z} + \overline{x} \cdot \overline{y} \cdot z + \overline{x} \cdot \overline{y} \cdot \overline{z}$.

9. Suppose you want a combinatorial circuit that outputs a 0 whenever both its binary inputs are the same, and outputs a 1 if its inputs are different.
(a) Produce a binary function that represents the circuit.
(b) Design the circuit using AND, OR, and NOT gates.
12.7.6 Projects

Mathematics
1. Write a brief expository paper describing how to use Karnaugh maps to minimize binary expressions.
2. Write a paper that provides a more careful introduction to Boyce-Codd normal form.
3. Write a brief expository paper about partially ordered sets and lattices.
4. Write a report that explains the connection between Stirling numbers of the second kind and counting onto functions having finite domain and range.

Computer Science
1. Write a brief expository paper that explores the ways in which current combinatorial circuit design differs from the presentation in this book (i.e., using AND, OR, and NOT gates).
2. Write a program that takes a binary function in disjunctive normal form as input and produces a table listing the value of the function for each combination of input values.
3. Write a program that takes a finite relation as input and determines which of the following properties hold for the relation: reflexive, symmetric, transitive, antireflexive, antisymmetric, asymmetric.
4. Write a program that takes as input a pair of relations from a relational database and outputs their join. The input should also specify the attributes for each table.
12.7.7 Solutions to Sample Exam Questions

1. It is first necessary to determine $\mathcal{S} \circ \mathcal{R}$. Since $(0, 5) \in \mathcal{R}$ and $(5, 2)$ and $(5, 4)$ are in $\mathcal{S}$, the ordered pairs $(0, 2)$ and $(0, 4)$ are in $\mathcal{S} \circ \mathcal{R}$. Continuing in this manner, we find that
$\mathcal{S} \circ \mathcal{R} = \{(0, 2), (0, 4), (0, 5), (0, 8), (1, 5), (1, 8), (2, 7), (3, 5), (3, 8), (4, 2), (4, 4)\}$.
Consequently,
$(\mathcal{S} \circ \mathcal{R})^{-1} = \{(2, 0), (4, 0), (5, 0), (8, 0), (5, 1), (8, 1), (7, 2), (5, 3), (8, 3), (2, 4), (4, 4)\}$.

2. The first pass results in
$\{(a, b), (a, c), (a, d), (a, e), (b, c), (b, d), (c, d), (c, e), (d, c), (d, e), (e, c), (e, d)\}$.
A second pass produces
$\{(a, b), (a, c), (a, d), (a, e), (b, c), (b, d), (b, e), (c, c), (c, d), (c, e), (d, c), (d, d), (d, e), (e, c), (e, d), (e, e)\}$.
No additional ordered pairs are added in a third pass, so the transitive closure is the relation listed in the second pass.
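
The answers to questions 1 and 2 can be cross-checked with a few lines of code. This is a minimal sketch, assuming relations are stored as Python sets of ordered pairs; the function names are illustrative, and the fixed-point loop in `transitive_closure` is one simple way to compute a closure, not necessarily the pass-by-pass procedure used in the text.

```python
def compose(s, r):
    """S ∘ R: all (a, c) such that (a, b) is in R and (b, c) is in S."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inverse(r):
    """Swap the coordinates of every ordered pair."""
    return {(b, a) for (a, b) in r}

def transitive_closure(r):
    """Repeatedly add composed pairs until nothing new appears."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger

# Question 1
R = {(0, 5), (0, 6), (1, 6), (2, 7), (3, 6), (4, 5)}
S = {(5, 2), (5, 4), (6, 5), (6, 8), (7, 7)}
print(sorted(inverse(compose(S, R))))

# Question 2
Q = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "e"), ("d", "c"), ("e", "d")}
print(len(transitive_closure(Q)))
```

The loop terminates because the relation is finite and each iteration either adds at least one pair or stops.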
3. (a) Let $n \in \mathbb{Z}^+$, with $n = q_1 2^i$, and $m \in \mathbb{Z}^+$, with $m = q_2 2^j$, where $q_1$ and $q_2$ are both odd, and $i, j \geq 0$. Then $(n, n) \in \mathcal{R}$ because $q_1 = q_1$. Also, if $(n, m) \in \mathcal{R}$, then $q_1 = q_2$. But then $q_2 = q_1$, so $(m, n) \in \mathcal{R}$ is also true. Finally, let $s \in \mathbb{Z}^+$ with $s = q_3 2^k$, where $q_3$ is odd and $k \geq 0$. If $(n, m) \in \mathcal{R}$ and $(m, s) \in \mathcal{R}$, then $q_1 = q_2$ and $q_2 = q_3$. Therefore, $q_1 = q_3$, so $(n, s) \in \mathcal{R}$ is also true. The relation, $\mathcal{R}$, is reflexive, symmetric, and transitive. It is therefore an equivalence relation.

(b) The numbers that are in the same equivalence class are those numbers with the same odd factor $q$ (that is, identical odd prime divisors, counted with multiplicity). Thus, one equivalence class is
$[1] = \{1, 2, 4, 8, 16, \ldots\}$
and another is
$[3] = \{3, 6, 12, 24, 48, \ldots\}$.

4. This is Definition 12.20. Let $\mathcal{T}$ be a relation and $A$ be the attribute set of $\mathcal{T}$. A nonempty subset, $P = \{A_1, A_2, \ldots, A_j\}$, of $A$ is called a key for $\mathcal{T}$ if:
1. All attributes in $A - P$ (the set difference) are functionally dependent on $P$.
2. No nonempty proper subset of $P$ has property 1.
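
For question 3(b), the equivalence classes can be listed experimentally by grouping integers according to their odd factor. The sketch below is illustrative only; the helper name `odd_part` and the cutoff of 49 are arbitrary choices.

```python
from collections import defaultdict

def odd_part(i):
    """Return the odd q in the unique factorization i = q * 2**j (i >= 1)."""
    while i % 2 == 0:
        i //= 2
    return i

# Group 1..49 by odd part; each group is a finite piece of one class.
classes = defaultdict(list)
for i in range(1, 50):
    classes[odd_part(i)].append(i)

print(classes[1])
print(classes[3])
```

Note that 3 and 9 land in different classes (odd parts 3 and 9), even though both have 3 as their only odd prime divisor.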
5. (a) The projection can be formed by keeping the columns for attributes SPhone and Chief Tenant, and then removing any duplicate rows.

Subsidiary[SPhone, Chief Tenant]
SPhone         Chief Tenant
777-222-4444   Walter Boyd
777-222-2222   Martin Daud
777-222-5555   Martin Daud
777-222-1111   Char Kane
(b) The suggested algorithm starts by forming the Cartesian product of the two relations. (Lease Expires and Deposit have been abbreviated here. The phone numbers have also been temporarily truncated.)

Chief × Subsidiary
Chief Tenant  Apt  Phone  Lease-Exp  Dep   Name            Apt  SPhone  Chief Tenant
Mary Anthony  212  3333   5-1-2004   $250  Polly Boyd      103  4444    Walter Boyd
Mary Anthony  212  3333   5-1-2004   $250  Joe Bertonelli  310  2222    Martin Daud
Mary Anthony  212  3333   5-1-2004   $250  Peter Parks     310  5555    Martin Daud
Mary Anthony  212  3333   5-1-2004   $250  Billie Kane     411  1111    Char Kane
Mary Anthony  212  3333   5-1-2004   $250  Mary Kane       411  1111    Char Kane
Walter Boyd   103  4444   5-1-2004   $200  Polly Boyd      103  4444    Walter Boyd
Walter Boyd   103  4444   5-1-2004   $200  Joe Bertonelli  310  2222    Martin Daud
Walter Boyd   103  4444   5-1-2004   $200  Peter Parks     310  5555    Martin Daud
Walter Boyd   103  4444   5-1-2004   $200  Billie Kane     411  1111    Char Kane
Walter Boyd   103  4444   5-1-2004   $200  Mary Kane       411  1111    Char Kane
Martin Daud   310  5555   7-1-2004   $300  Polly Boyd      103  4444    Walter Boyd
Martin Daud   310  5555   7-1-2004   $300  Joe Bertonelli  310  2222    Martin Daud
Martin Daud   310  5555   7-1-2004   $300  Peter Parks     310  5555    Martin Daud
Martin Daud   310  5555   7-1-2004   $300  Billie Kane     411  1111    Char Kane
Martin Daud   310  5555   7-1-2004   $300  Mary Kane       411  1111    Char Kane
Char Kane     411  1111   2-1-2005   $250  Polly Boyd      103  4444    Walter Boyd
Char Kane     411  1111   2-1-2005   $250  Joe Bertonelli  310  2222    Martin Daud
Char Kane     411  1111   2-1-2005   $250  Peter Parks     310  5555    Martin Daud
Char Kane     411  1111   2-1-2005   $250  Billie Kane     411  1111    Char Kane
Char Kane     411  1111   2-1-2005   $250  Mary Kane       411  1111    Char Kane
Now remove any row for which the two versions of Chief Tenant and the two versions of Apartment (Apt) are not the same.

Reduced Chief × Subsidiary
Chief Tenant  Apt  Phone  Lease-Exp  Dep   Name            Apt  SPhone  Chief Tenant
Walter Boyd   103  4444   5-1-2004   $200  Polly Boyd      103  4444    Walter Boyd
Martin Daud   310  5555   7-1-2004   $300  Joe Bertonelli  310  2222    Martin Daud
Martin Daud   310  5555   7-1-2004   $300  Peter Parks     310  5555    Martin Daud
Char Kane     411  1111   2-1-2005   $250  Billie Kane     411  1111    Char Kane
Char Kane     411  1111   2-1-2005   $250  Mary Kane       411  1111    Char Kane
Finally, remove one copy of each duplicate column.

Chief * Subsidiary
Chief Tenant  Apartment  Phone         Lease Expires  Deposit  Name            SPhone
Walter Boyd   103        777-222-4444  5-1-2004       $200     Polly Boyd      777-222-4444
Martin Daud   310        777-222-5555  7-1-2004       $300     Joe Bertonelli  777-222-2222
Martin Daud   310        777-222-5555  7-1-2004       $300     Peter Parks     777-222-5555
Char Kane     411        777-222-1111  2-1-2005       $250     Billie Kane     777-222-1111
Char Kane     411        777-222-1111  2-1-2005       $250     Mary Kane       777-222-1111
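
The projection and join in question 5 can also be reproduced programmatically. The following is a minimal sketch, assuming each tuple is a Python dict keyed by attribute name; the helper names `project` and `join` are illustrative. The join is computed here by matching tuples on their shared attributes directly, rather than by building the full Cartesian product first as in the steps above; the resulting rows are the same.

```python
def project(rows, attrs):
    """Keep only the listed attributes, then drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

def join(t1, t2):
    """Combine every pair of tuples that agree on all shared attributes."""
    out = []
    for r1 in t1:
        for r2 in t2:
            shared = r1.keys() & r2.keys()
            if all(r1[a] == r2[a] for a in shared):
                merged = {**r1, **r2}
                if merged not in out:
                    out.append(merged)
    return out

CHIEF_ATTRS = ("Chief Tenant", "Apt", "Phone", "Lease Expires", "Deposit")
chief = [dict(zip(CHIEF_ATTRS, row)) for row in [
    ("Mary Anthony", 212, "777-222-3333", "5-1-2004", "$250"),
    ("Walter Boyd", 103, "777-222-4444", "5-1-2004", "$200"),
    ("Martin Daud", 310, "777-222-5555", "7-1-2004", "$300"),
    ("Char Kane", 411, "777-222-1111", "2-1-2005", "$250"),
]]

SUB_ATTRS = ("Name", "Apt", "SPhone", "Chief Tenant")
subsidiary = [dict(zip(SUB_ATTRS, row)) for row in [
    ("Polly Boyd", 103, "777-222-4444", "Walter Boyd"),
    ("Joe Bertonelli", 310, "777-222-2222", "Martin Daud"),
    ("Peter Parks", 310, "777-222-5555", "Martin Daud"),
    ("Billie Kane", 411, "777-222-1111", "Char Kane"),
    ("Mary Kane", 411, "777-222-1111", "Char Kane"),
]]

for row in project(subsidiary, ("SPhone", "Chief Tenant")):
    print(row)
for row in join(chief, subsidiary):
    print(row["Chief Tenant"], row["Apt"], row["Name"], row["SPhone"])
```

Mary Anthony does not appear in the join because no Subsidiary tuple shares her Apt and Chief Tenant values.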
(c) Suitable primary keys for Chief are either Chief Tenant or Apartment or Phone. The best choice is Chief Tenant, since it is used as a foreign key in the Subsidiary relation. The only choice for primary key for Subsidiary is Name.

6. The functional dependencies Level → Audition piece and Audition piece → Level indicate that this relation is not in second normal form (for example, D = Level is properly contained in the primary key). The algorithm for converting to third normal form can begin by setting D = {Student, Level} and B = Teacher. Then replace Competitors by Competitors[Student, Level, Teacher] and Competitors[Student, Level, Audition piece].

Competitors[Student, Level, Teacher]
Student            Level   Teacher
Mary Anderson      1       Smith
Mary Anderson      2       Smith
Mary Anderson      3       Jones
Bobbie Juarez      1       Jones
Wilma Holter       1       Jones
Esteban Laureano   2       Colatti
Min Gua Lin        3       Gao

Competitors[Student, Level, Audition piece]
Student            Level   Audition piece
Mary Anderson      1       Winset: The Happy Rabbit
Mary Anderson      2       Beethoven: Für Elise
Mary Anderson      3       Liszt: La Campanella
Bobbie Juarez      1       Winset: The Happy Rabbit
Wilma Holter       1       Winset: The Happy Rabbit
Esteban Laureano   2       Beethoven: Für Elise
Min Gua Lin        3       Liszt: La Campanella

The first table is already in third normal form (the only essential functional dependency is {Student, Level} → Teacher). The second relation still has the functional dependencies Level → Audition piece and Audition piece → Level, but there are no choices for B in the algorithm for which B is not in any key. Therefore, this table is also in third normal form.
It is still tempting to do one more lossless decomposition, replacing Competitors[Student, Level, Audition piece] with Competitors[Student, Level] and Competitors[Level, Audition piece]. The relation Competitors[Student, Level, Audition piece] can be regained as Competitors[Student, Level] * Competitors[Level, Audition piece]. This would, however, leave us with a table with nothing in it but the primary key.

7. Follow the process outlined in the text. Move complements onto single variables.
$(x + \overline{y}) \cdot \overline{(x \cdot y)} + 0$
$= (x + \overline{y}) \cdot \overline{(x \cdot y)}$   identity
$= (x + \overline{y}) \cdot (\overline{x} + \overline{y})$   De Morgan
Now replace this with a sum of products.
$(x + \overline{y}) \cdot (\overline{x} + \overline{y})$
$= (x + \overline{y}) \cdot \overline{x} + (x + \overline{y}) \cdot \overline{y}$   distributivity
$= \overline{x} \cdot (x + \overline{y}) + \overline{y} \cdot (x + \overline{y})$   commutativity (twice)
$= (\overline{x} \cdot x + \overline{x} \cdot \overline{y}) + (\overline{y} \cdot x + \overline{y} \cdot \overline{y})$   distributivity (twice)
$= (x \cdot \overline{x} + \overline{x} \cdot \overline{y}) + (x \cdot \overline{y} + \overline{y} \cdot \overline{y})$   commutativity (twice)
$= x \cdot \overline{x} + \overline{x} \cdot \overline{y} + x \cdot \overline{y} + \overline{y} \cdot \overline{y}$   associativity
Now make each term a minterm.
Phase 2:
x.-x7+ Y-.Y + x.-Y+ T. y
=0+ Y + x = 0+
.Y + x
= x Y+ 0 + X
complement
1
+
idempotence
X .y •
+Y
x + X Y+ = =
y + x •y + x + yY Y Y-+(x Y+x.Y)+7.y • Y + x y + x• 5 + 2 •y + x y
= =
-Y) +x
= (Y-.Y+
2
+ Y
y
= T •Y+ x -Y
commutativity
-1 0
identity
0-0
replacement algorithm (x) associativity idempotence commutativity associativity
00-
y___x _z__
3 y •z _
4 •
•
X
X
X
X X
X
The most efficient way to cover all four columns is to use rows one and three. The minimized binary expression is therefore
idempotence
It is easy to check that the expression in disjunctive normal form is equivalent to the original function.
9.
(a)
x
y
f(x, y)
x
y
(x+-Y).(x.y)+O
x-Y+x.'Y
0
0
0
0
0
1
1
0
1
1
0
1
0
0
1
0
1
1
0
1
1
1
1
0
1
1
0
0
This translates to f (x, y) =
z)minimized.
f 8. 8. f(x,y,z)=x.y.i+2-.y.•+2.y.z+ii.y.z
iimzd
(b) x
Phase 1: 1 x.y.•. 2 -. y-z 3 x.?yz 4 x.-yz
110 010 001 000
,/ ,/ V V
1,2 2,4 3,4
-10 0-0 00-
.y + x. y,which is already
APPENDIX A
This appendix contains brief descriptions of and interesting facts about the commonly used number systems. It is not intended to be a formal or complete introduction. Informally, a number system consists of a set of mathematical abstractions called numbers, together with the operations addition and multiplication. A number system therefore consists of more than just the numbers. The operations (which take pairs of numbers and produce another number) are an essential ingredient. Subtraction and division can be defined by using addition and multiplication, together with the notion of an inverse (defined later).
A.1 The Natural Numbers
The simplest number system is the set of natural numbers:
$\mathbb{N} = \{0, 1, 2, 3, 4, \ldots\}$.
These numbers arise naturally in the context of counting. Very young children can easily grasp the concept of attaching a number (or number name) to a quantity, irrespective of what kind of objects are being considered (4 balls, 4 dolls, 4 houses, etc.). This system of numbers has a limited mathematical structure. There is a notion of addition but only a limited notion of subtraction ($4 - 6$ is not a natural number). Multiplication of natural numbers is defined, but division generally doesn't work ($7 \div 3$ is not a natural number). Every natural number has a uniquely defined successor (5 is the successor of 4). Every natural number except 0 also has a uniquely defined predecessor (8 is the predecessor of 9). This is not the case with the rational, real, and complex numbers. The notions of successor and predecessor can be used to define the terms greater than and less than for elements of $\mathbb{N}$. This leads to the seemingly trivial but actually significant well-ordering principle:
AXIOM 3.1 The Well-Ordering Principle
Every nonempty set of natural numbers has a smallest element.
The inclusion of the number 0 permits discussion of the absence of some object. It also is a significant part of a place value representation of numbers. For example, the number 203 means "two one hundreds, no tens, and three units." Common use of a special symbol (such as "0") to represent the notion of "none" was a fairly late occurrence in the history of mathematics.
Appendix A Number Systems
One simple property of the natural numbers is that whenever a product of two such numbers is zero, at least one of those numbers must also be zero: $ab = 0$ implies that $a = 0$ or $b = 0$ (or both are zero).
This property is sometimes called the zero product principle. You should note that some authors define the natural numbers to be the set {1, 2, 3, .... }. That is, they exclude zero.
A.2 The Integers
The natural numbers do not allow sufficient flexibility for the kinds of mathematical concepts and calculations we commonly need. In particular, subtraction is not always meaningful if we limit numbers to the natural numbers. For example, there is no natural number to represent $4 - 17$. The integers correct this deficiency by introducing negative numbers:
$\mathbb{Z} = \{\ldots, -4, -3, -2, -1, 0, 1, 2, 3, 4, \ldots\}$
The letter $\mathbb{Z}$ is traditionally used to denote the integers. Two important subsets of $\mathbb{Z}$ are the set of positive integers, $\mathbb{Z}^+ = \{1, 2, 3, \ldots\}$, and the set of nonnegative integers (another name for the set of natural numbers). The introduction of negative numbers allows convenient mathematical manipulation of notions such as "a deficit of $1000" or "5 degrees below 0." The Chinese were comfortably using negative numbers in financial calculations somewhere between 500 B.C. and 250 A.D., probably closer to the earlier date [76]. The integers have a well-defined addition, subtraction, and multiplication. There is still no fully developed division, however. A significant feature of the integers is the notion of prime numbers and factorization. Recall that a prime number is a positive integer $n > 1$ that is not evenly divisible by any positive integer except itself and 1.
The Fundamental Theorem of Arithmetic
Every integer $n$, with $n \ge 2$, can be written uniquely as a product of primes in ascending order.
Factorization provides a mechanism to partially overcome the lack of a full division. You are familiar with "short division" from elementary school. The formal name is as follows:
The Euclidean Division Algorithm
Let $a$ and $b$ be integers with $b \ne 0$. Then there exist unique integers $q$ and $r$ such that $a = bq + r$ and $0 \le r < |b|$.
[...]
$x > y$, or $y > x$, or $x = y$. The final axiom specifies that if a sequence of real numbers (such as the sequence $\{.1, .11, .111, .1111, \ldots\}$) gets closer and closer to some value ($.11111\ldots$ gets close to $\frac{1}{9}$ in this example), then the value they approach is also a real number. In more formal language, the set of real numbers contains all limit points of sequences of real numbers. This notion is called completeness. The set of real numbers is thus a complete ordered field.
The order axioms can be applied to the integers and rational numbers. However, there are sequences of rational numbers whose limit point is not a rational number. For example, the sequence of rational numbers $\{3, 3.1, 3.14, 3.141, 3.1415, 3.14159, \ldots\}$ approaches the value $\pi$, which is not rational. Thus, the set of rational numbers is not complete. The set of real numbers is also dense: Between any two distinct real numbers, there is always another (distinct) real number.
A.5 The Complex Numbers
The set of real numbers might seem to be a large enough set of numbers to answer all our mathematical questions adequately. However, there are some natural mathematical questions that have no solution if answers are restricted to be real numbers. In particular, many simple equations have no solution in the real numbers. For example,
$x^2 + 1 = 0$.
A solution would require a number whose square is $-1$. For many centuries, mathematicians were content with the answer, "there is no such number." Eventually, it became acceptable to allow the existence of a number, denoted $i$, such that $i^2 = -1$. Once the proper definitions of addition, subtraction, multiplication, and division were found, a new field,$^7$ called the complex numbers, was available.
$\mathbb{C} = \{a + bi \mid a, b \in \mathbb{R}\}$
That is, the set of complex numbers contains all expressions of the form $a + bi$, where $a$ and $b$ are real numbers, and $i^2 = -1$. The set of real numbers is a subset of $\mathbb{C}$, since $a + 0i = a$ is a real number for every $a$. Addition and multiplication of complex numbers are defined by
$(a + bi) + (c + di) = (a + c) + (b + d)i$
$(a + bi) \cdot (c + di) = ac + adi + bci + bdi^2 = ac + adi + bci - bd = (ac - bd) + (ad + bc)i$
5 See Proposition 3.6.
6 The line over the digits 571428 indicates that the group repeats forever.
The set of complex numbers is often visualized as the set of all points in the plane (Figure A.2). The complex number $a + bi$ is identified with the point $(a, b)$ in the plane. The old x-axis is renamed the "real axis" and the old y-axis is renamed the "imaginary$^8$ axis."
Figure A.2 The complex plane (horizontal axis Re, vertical axis Im).
The set of complex numbers forms the final number system in the chain of number systems presented. The system of complex numbers is a complete field. However, the notion of order needs to be relaxed. For example, in Figure A.2, neither $i > 1$, nor $1 > i$, nor $i = 1$. The distance (in the complex plane) from a complex number to the origin can be used to obtain a weaker form of order. This distance is denoted by the familiar absolute value symbol. Thus $|i| = |1|$ and $|1 + 2i| > |i|$, as can be seen in Figure A.2. One of the nice algebraic properties of complex numbers is the following theorem about polynomials with complex coefficients.
7 Although square roots of negative numbers appeared in computations at least as early as 50 A.D., general acceptance of complex numbers as a meaningful and valid concept only occurred during the last half of the nineteenth century [79].
8 The unfortunate term imaginary is a historical relic we are stuck with.
The Fundamental Theorem of Algebra
The polynomial
$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$
with complex coefficients $a_n, a_{n-1}, \ldots, a_1, a_0$, factors into a product of $n$ linear factors with complex coefficients $a_n, c_n, \ldots, c_1$:
$a_n (x + c_n)(x + c_{n-1}) \cdots (x + c_2)(x + c_1)$.
If the coefficients $a_k$ are real numbers, there may be no real solutions to the equation $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 = 0$. However, the fundamental theorem of algebra ensures that there will always be $n$ complex solutions. Recall that the set of real numbers is a subset of the set of complex numbers, so some or all of the solutions may be real numbers. Also, some solutions may be counted more than once. Thus $x^2 + 4x + 4 = (x + 2)(x + 2) = 0$ has the two solutions $x = -2$ and $x = -2$. That is, $-2$ is a multiple solution. The equation $x^2 + 1 = (x - i)(x + i) = 0$ has as solutions $x = i$ and $x = -i$.
A.6 Other Number Systems
Mathematicians are aware of other "number" systems. The simplest alternative number system is called arithmetic mod 2. This system of numbers is defined by
$\mathbb{Z}_2 = \{0, 1\}$
with the following addition and multiplication tables:
    +  |  0  1          ·  |  0  1
    0  |  0  1          0  |  0  0
    1  |  1  0          1  |  0  1
This number system can be shown to be a field. It is used in error-correcting codes, generally with polynomials whose coefficients are in $\mathbb{Z}_2$. Similar fields can be defined for any number $p$ that is a prime:
$\mathbb{Z}_p = \{0, 1, 2, 3, \ldots, p - 1\}$,
with addition and multiplication defined mod $p$. That is, first do the addition or multiplication in the integers. Call the answer $a$. The mod $p$ answer is the remainder upon dividing $a$ by $p$. For example, in $\mathbb{Z}_5$
$3 + 4 = 7 \pmod 5 = 2$ and $2 \cdot 4 = 8 \pmod 5 = 3$.
Figure A.3 Clock arithmetic.
A more formal name for these algebraic systems is the integers mod $p$. Arithmetic in the number systems $\mathbb{Z}_p$ is often called clock arithmetic due to the way numbers seem to wrap around (Figure A.3). For example, the sum $3 + 4$ in $\mathbb{Z}_5$ can be envisioned by starting at 3 and moving 4 ticks clockwise, ending at 2 (which agrees with the preceding example).
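The "do the operation in the integers, then take the remainder" recipe can be sketched in a few lines of Python. This is only an illustration; the function names `add_mod`, `mul_mod`, and `operation_table` are invented here and do not come from the text.

```python
def add_mod(a: int, b: int, p: int) -> int:
    """Add in the integers, then keep the remainder upon dividing by p."""
    return (a + b) % p


def mul_mod(a: int, b: int, p: int) -> int:
    """Multiply in the integers, then keep the remainder upon dividing by p."""
    return (a * b) % p


def operation_table(op, p: int) -> list[list[int]]:
    """Row a, column b holds op(a, b, p) for all a, b in {0, ..., p-1}."""
    return [[op(a, b, p) for b in range(p)] for a in range(p)]


if __name__ == "__main__":
    print(add_mod(3, 4, 5), mul_mod(2, 4, 5))   # the two Z_5 examples
    print(operation_table(add_mod, 2))           # addition table for Z_2
    print(operation_table(mul_mod, 2))           # multiplication table for Z_2
```

Calling `operation_table` with `p = 2` gives the two tables for arithmetic mod 2 as nested lists, one inner list per row.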
A.7 Representation of Numbers
You are familiar with the decimal representation of numbers. The key features of this system are that it is a base 10 and a place value system. Thus, the digits in the number 123.45 represent
* 1 100 ($10^2$)
* 2 10s ($10^1$)
* 3 1s ($10^0$)
* 4 .1s ($10^{-1}$)
* 5 .01s ($10^{-2}$)
Ten is not the only base that has been used to represent numbers. The ancient Mayans used a base 20 representation, and the ancient Babylonians used a base 60 system. Our conventions that there are 60 minutes in an hour and 360 degrees in a circle are derived from the Babylonian representation. Perhaps one reason that 60 was chosen as a base is that 60 has a large number of prime factors: $60 = 2^2 \cdot 3 \cdot 5$. The net effect is that numbers can be easily divided into halves, thirds, fourths, fifths, sixths, tenths, twelfths, fifteenths, twentieths, thirtieths, and sixtieths. Thus, fractional arithmetic and mercantile operations were simplified.
The most important modern alternatives to base 10 (decimal) are base 2 (binary), base 8 (octal), and base 16 (hexadecimal). These alternatives are important because computers use a binary representation for storing and manipulating numbers.$^9$ Octal and hexadecimal are used as convenient shorthand notations for binary numbers because binary numbers are difficult for humans to keep in short-term memory. Binary numbers use the symbols {0, 1} and a place value notation. The symbols are called bits, an acronym for binary digits. Since the base is 2, each place represents a power of 2 (as opposed to a power of 10 in decimal representation). Thus the binary number 101.11 represents
* 1 4 ($2^2$)
* 0 2s ($2^1$)
* 1 1 ($2^0$)
* 1 $\frac{1}{2}$ ($2^{-1}$)
* 1 $\frac{1}{4}$ ($2^{-2}$)
If the subscripts 2 and 10 are understood to indicate the representation's base, then the number five and three-fourths can be expressed as $5.75_{10}$ and as $101.11_2$. Both are valid representations of the same number. The binary representation is easier to store in a computer because we only need some medium that can exist in two states (such as magnetized-demagnetized, on-off).$^{10}$ The decimal form requires fewer digits to represent a number ($65{,}535_{10} = 1111111111111111_2$) and is generally easier for humans to use.
Table A.1 shows the binary, octal, and hexadecimal representations of the natural numbers from 0 to 31. You might observe common patterns in the two binary columns. One reason for the similarity is that $16_{10}$ is represented by $10000_2$ in binary notation. Since, for example, $23_{10} = 16_{10} + 7_{10}$, the representation in binary is $10000_2 + 111_2 = 10111_2$. What should the binary, octal, and hexadecimal representations be for the number $32_{10}$?
9 So much arithmetic is currently done on computers or calculators that binary representations may be more important than the more familiar decimal representation.
10 The manner in which computers actually store the binary number five and three-fourths is slightly more complex than "101.11." It is more like a binary version of scientific notation.
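The place value expansions above can be checked mechanically. The following is a minimal sketch, assuming numerals are written as strings with an optional point; the helper name `place_value` and the digit alphabet `DIGITS` are illustrative choices, not part of the text. Exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def place_value(numeral: str, base: int) -> Fraction:
    """Evaluate a numeral such as '101.11' in the given base, exactly."""
    whole, _, frac = numeral.upper().partition(".")
    total = Fraction(0)
    for symbol in whole:
        digit = DIGITS.index(symbol)
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a base-{base} digit")
        total = total * base + digit          # shift left one place, add digit
    for k, symbol in enumerate(frac, start=1):
        digit = DIGITS.index(symbol)
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a base-{base} digit")
        total += Fraction(digit, base ** k)   # digit times base^(-k)
    return total


if __name__ == "__main__":
    print(place_value("101.11", 2))    # five and three-fourths, as a fraction
    print(place_value("123.45", 10))
```

Both `place_value("101.11", 2)` and `place_value("5.75", 10)` evaluate to the same fraction, which is the sense in which the two numerals represent the same number.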
TABLE A.1 The Numbers 0 to 31 in Decimal, Binary, Octal, and Hexadecimal
    Decimal  Binary  Octal  Hexadecimal    Decimal  Binary  Octal  Hexadecimal
    0        0       0      0              16       10000   20     10
    1        1       1      1              17       10001   21     11
    2        10      2      2              18       10010   22     12
    3        11      3      3              19       10011   23     13
    4        100     4      4              20       10100   24     14
    5        101     5      5              21       10101   25     15
    6        110     6      6              22       10110   26     16
    7        111     7      7              23       10111   27     17
    8        1000    10     8              24       11000   30     18
    9        1001    11     9              25       11001   31     19
    10       1010    12     A              26       11010   32     1A
    11       1011    13     B              27       11011   33     1B
    12       1100    14     C              28       11100   34     1C
    13       1101    15     D              29       11101   35     1D
    14       1110    16     E              30       11110   36     1E
    15       1111    17     F              31       11111   37     1F
Octal representation uses the symbols {0, 1, 2, 3, 4, 5, 6, 7} and powers of 8. Thus the number five and three-fourths is $5.6_8$; that is, five and six eighths. Hexadecimal representation uses the symbols {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. The symbol A is the hexadecimal symbol for the number 10. The symbol F represents the number 15. Hexadecimal representation uses powers of 16 for the place values. Five and three-fourths is $5.C_{16}$; that is, five and twelve sixteenths. Notice that $65{,}535_{10} = 177777_8 = FFFF_{16}$.
It is easy to convert numbers between binary representation and octal or hexadecimal representation.$^{11}$ To convert from binary to octal, group the bits into groups of three (moving out from the binary point). Replace each group by its octal value:
    $101110.010_2$
    $101\ 110\ .\ 010_2$
    $5\ 6\ .\ 2_8$
    $56.2_8$
To convert from binary to hexadecimal, group the bits into groups of four (again, moving out from the binary point) and replace each group by its hexadecimal value:
    $101110.010_2$
    $0010\ 1110\ .\ 0100_2$
    $2\ E\ .\ 4_{16}$
    $2E.4_{16}$.
11 The reason is related to 8 and 16 both being powers of 2.
To convert from octal or hexadecimal to binary, just reverse the process:
    $47.15_8$
    $4\ 7\ .\ 1\ 5_8$
    $100\ 111\ .\ 001\ 101_2$
    $100111.001101_2$

    $1C.B3_{16}$
    $1\ C\ .\ B\ 3_{16}$
    $0001\ 1100\ .\ 1011\ 0011_2$
    $11100.10110011_2$.
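The grouping procedure in both directions can be sketched as follows. This is an illustrative sketch only: `binary_to_base` and `base_to_binary` are names invented here, and the `group` parameter (3 for octal, 4 for hexadecimal) is an assumption about how one might parameterize the idea.

```python
DIGITS = "0123456789ABCDEF"


def binary_to_base(bits: str, group: int) -> str:
    """Group bits outward from the binary point; group=3 gives octal, 4 gives hex."""
    whole, _, frac = bits.partition(".")
    # Pad with zeros so each side splits evenly into groups of `group` bits.
    whole = whole.zfill(-(-len(whole) // group) * group)
    frac = frac.ljust(-(-len(frac) // group) * group, "0")

    def convert(part: str) -> str:
        chunks = [part[i:i + group] for i in range(0, len(part), group)]
        return "".join(DIGITS[int(chunk, 2)] for chunk in chunks)

    result = convert(whole) or "0"
    return result + ("." + convert(frac) if frac else "")


def base_to_binary(numeral: str, group: int) -> str:
    """Reverse the process: expand every digit into `group` bits."""
    whole, _, frac = numeral.upper().partition(".")

    def expand(part: str) -> str:
        return "".join(format(DIGITS.index(symbol), f"0{group}b") for symbol in part)

    result = expand(whole).lstrip("0") or "0"   # drop leading zeros of the whole part
    return result + ("." + expand(frac) if frac else "")


if __name__ == "__main__":
    print(binary_to_base("101110.010", 3), binary_to_base("101110.010", 4))
    print(base_to_binary("47.15", 3), base_to_binary("1C.B3", 4))
```

The padding step mirrors what is done by hand: zeros are added on the left of the whole part and on the right of the fractional part, so the value is unchanged.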
To convert the binary number $101110.010_2$ to its decimal equivalent, just expand the place value representation using base 10 arithmetic.
$2^5 + 2^3 + 2^2 + 2^1 + 2^{-2} = 32 + 8 + 4 + 2 + \frac{1}{4} = 46.25_{10}$
Converting from decimal to binary is the most tedious of the common conversions. Start subtracting powers of 2 (in descending order) until the remainder is zero. It is helpful to memorize a small set of powers of 2 (Table A.2).
TABLE A.2 Some Useful Powers of 2
    $n$     -3   -2   -1   0  1  2  3  4   5   6   7    8    9    10
    $2^n$   1/8  1/4  1/2  1  2  4  8  16  32  64  128  256  512  1024
For example, the powers of 2 in 117.25 can be determined:
    Current Remainder    $2^n$    New Remainder
    117.25               64       53.25
    53.25                32       21.25
    21.25                16       5.25
    5.25                 4        1.25
    1.25                 1        0.25
    0.25                 1/4      0
The binary representation is built using the powers of 2 that were used (with 0s for the powers that were skipped). Thus, $117.25_{10} = 1110101.01_2$. That is, $117.25 = 1 \cdot 2^6 + 1 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 + 0 \cdot 2^{-1} + 1 \cdot 2^{-2}$.
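The subtraction procedure can be written out as a short Python sketch. The function name `to_binary` and the cutoff parameter `min_exp` (the smallest power of 2 that will be tried, so that values with no finite binary expansion still terminate) are assumptions made for this illustration.

```python
from fractions import Fraction


def to_binary(value: str, min_exp: int = -8) -> str:
    """Subtract powers of 2 in descending order, recording 1 (used) or 0 (skipped)."""
    remainder = Fraction(value)            # exact, e.g. Fraction("117.25")
    exp = 0
    while Fraction(2) ** (exp + 1) <= remainder:
        exp += 1                           # largest power of 2 not exceeding the value
    bits = []
    for e in range(exp, min_exp - 1, -1):
        if e == -1:
            bits.append(".")               # crossing the binary point
        power = Fraction(2) ** e
        if power <= remainder:
            remainder -= power
            bits.append("1")
        else:
            bits.append("0")
        if remainder == 0 and e <= 0:
            break                          # nothing left to subtract
    return "".join(bits)


if __name__ == "__main__":
    print(to_binary("117.25"))
    print(to_binary("46.25"))
```

Each pass through the loop corresponds to one row of the remainder table above, plus the rows for skipped powers that the table leaves out.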
APPENDIX
B
Summation Notation
Suppose there are $n$ numbers $a_1, a_2, \ldots, a_n$ that are to be added together. Writing the summation using the "..." notation is awkward:
$a_1 + a_2 + \cdots + a_{n-1} + a_n$.
If additional calculations need to be done, the notation becomes unwieldy. Mathematicians have developed a shorthand notation for sums that has a compact form and some simple rules for manipulation. This notation is called summation notation. The basic notation involves the Greek uppercase symbol sigma: $\Sigma$. There are four other significant components. The first is the set of numbers to be added. In the preceding example, these would be the numbers $a_1, a_2, \ldots, a_n$. The second component is an index variable (typically one of the letters $i$, $j$, $k$, $l$, $m$, or $n$). This variable allows us to refer generically to one of the numbers to be added. Thus, we might write $a_i$ to refer to the $i$th number. The final two components are the starting and ending values for the index. The pieces fit together as follows:
$\sum_{i=k}^{n} a_i$,
which is read as "the sum of the numbers $a_i$, for $i$ between $k$ and $n$, inclusive." Thus, the notational shorthand is defined as follows:
DEFINITION B.1 Summation Notation
$\sum_{i=k}^{n} a_i = a_k + a_{k+1} + a_{k+2} + \cdots + a_{n-1} + a_n$
Notice that the choice of index variable does not change the sum:
$\sum_{i=k}^{n} a_i = \sum_{j=k}^{n} a_j$.
Changing the starting or ending value for the index does change the sum. The properties listed below are direct consequences of this definition. Notice that the symbol $c$ represents a number that does not change with the index. You will improve your understanding if you make up simple examples to test the properties. For example, the property
$\sum_{i=1}^{4} c \cdot a_i = c \cdot \sum_{i=1}^{4} a_i$
is merely an extended version of the distributive property$^1$ of the real numbers: $ca_1 + ca_2 + ca_3 + ca_4 = c(a_1 + a_2 + a_3 + a_4)$.
There are additional properties that are not listed here. These additional properties are useful with manipulations that are more complex than those needed for this text.
1. $\sum_{i=1}^{n} c = nc$
2. $\sum_{i=k}^{n} (a_i + b_i) = \sum_{i=k}^{n} a_i + \sum_{i=k}^{n} b_i$
4. $\sum_{i=k}^{n} c \cdot a_i = c \cdot \sum_{i=k}^{n} a_i$
These properties can be combined. For example,
$\sum_{i=1}^{n} (a_i + c) = \sum_{i=1}^{n} a_i + nc$
combines properties 1 and 2 (set each number $b_i = c$).
1 See Appendix A.3.
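In the spirit of making up simple examples to test the properties, here is a small Python sketch. The helper `summation` and the sample numbers in `a` and `b` are invented for illustration; a numerical check on one example is of course not a proof.

```python
def summation(term, k: int, n: int):
    """Return term(k) + term(k+1) + ... + term(n), with both ends included."""
    return sum(term(i) for i in range(k, n + 1))


# Made-up sample data: a[i] and b[i] for i = 1..4, and a constant c.
a = {1: 3, 2: 1, 3: 4, 4: 1}
b = {1: 5, 2: 9, 3: 2, 4: 6}
c, n = 7, 4

# Summing a constant n times gives n*c.
constant_sum_ok = summation(lambda i: c, 1, n) == n * c

# A sum of sums splits into two sums.
split_ok = summation(lambda i: a[i] + b[i], 1, n) == (
    summation(lambda i: a[i], 1, n) + summation(lambda i: b[i], 1, n)
)

# A constant factor can be pulled outside the sum (distributive property).
factor_ok = summation(lambda i: c * a[i], 1, n) == c * summation(lambda i: a[i], 1, n)

# The combined property: sum of (a_i + c) equals sum of a_i plus n*c.
combined_ok = summation(lambda i: a[i] + c, 1, n) == summation(lambda i: a[i], 1, n) + n * c

if __name__ == "__main__":
    print(constant_sum_ok, split_ok, factor_ok, combined_ok)
```

Note that `range(k, n + 1)` is used because the summation index runs through `n` inclusive, whereas Python ranges stop one short of their upper bound.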
APPENDIX
C
Logic Puzzles
C.1 Logic Puzzles about AND, OR, NOT The most enjoyable way that I know to become comfortable with the logic operators is to solve logic puzzles. Raymond Smullyan is a master at creating interesting logic puzzles [71, 72]. Consider the following examples.
The Island of Knights and Knaves
On a certain island, every inhabitant is either a knight, who always tells the truth, or a knave, who always lies. Suppose you meet two such inhabitants, A and B. A makes the following statement: "At least one of us is a knave." What are A and B? (It helps to know that the opposite of at least one is none.)
We can use a variation on truth tables to help solve this puzzle. If we let T represent knight, and F represent knave, we see that there are four possible answers to this puzzle: both knights, both knaves, A a knight and B a knave, or A a knave and B a knight. These possibilities are listed in the following table.
    A  B
    T  T
    T  F
    F  T
    F  F
We need to check each of these possibilities for consistency with the statement by A. The results are presented in the following table.
    A  B  A: "At least one of us is a knave"
    T  T  Inconsistent
    T  F  Consistent
    F  T  Inconsistent
    F  F  Inconsistent
Both being knights is inconsistent because A should be telling the truth, but neither is a knave. The second row (A a knight and B a knave) is consistent; A should be telling the truth, and one of them, B, is a knave. The third row is inconsistent since A should be lying, but one of them, A, is a knave. The final row is similar to the third row. A should be lying, but at least one (actually both) is a knave. The only consistent entry is when A is a knight and B is a knave.
Lurking behind the previous table is a collection of compound statements. For example, the first row can be expanded to
(A is a knight) $\land$ (B is a knight) $\land$ (At least one of A, B is a knave).
This simplifies to $T \land T \land F = F$, which I have listed as inconsistent. Similarly, the third row can be expanded to
(A is a knave) $\land$ (B is a knight) $\land$ ($\lnot$(At least one of A, B is a knave)).$^1$
This simplifies to $T \land T \land (\lnot T) = F$, which I have listed as inconsistent.$^2$
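The row-by-row consistency check can also be carried out by brute force. The sketch below assumes the rule implicit in the discussion: a row is consistent exactly when the truth value of A's statement matches A's type (knights say true things, knaves say false things). The function name is invented for illustration.

```python
from itertools import product


def solve_at_least_one_knave() -> list[tuple[bool, bool]]:
    """Return every (A is knight, B is knight) pair consistent with A's statement."""
    solutions = []
    for a_knight, b_knight in product([True, False], repeat=2):
        statement = (not a_knight) or (not b_knight)   # "At least one of us is a knave"
        if statement == a_knight:                      # knight <=> statement is true
            solutions.append((a_knight, b_knight))
    return solutions


if __name__ == "__main__":
    for a_knight, b_knight in solve_at_least_one_knave():
        print("A is a", "knight" if a_knight else "knave",
              "and B is a", "knight" if b_knight else "knave")
```

`product([True, False], repeat=2)` visits the four rows in the same order as the table: TT, TF, FT, FF.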
How to Choose a Bride
Suppose you are a visitor to the island of knights and knaves. Every female there is either a knight or a knave. You fall in love with one of the females there, a girl named Elizabeth, and are thinking of marrying her. However, you want to know just what you are getting into; you do not wish to marry a knave. If you were allowed to question her, there would be no problem, but an ancient taboo of the island forbids a man to hold speech with any female unless he is already married to her. However, Elizabeth has a brother Arthur who is also a knight or a knave (but not necessarily the same as his sister). You are allowed to ask the brother just one question, but the question must be answerable by "yes" or "no." The problem is for you to design a question such that upon hearing the answer, you will know for sure whether Elizabeth is a knight or a knave. What question would you ask?
I shall again let T represent knight and F represent knave. There are four combinations possible for Arthur and Elizabeth. In order to avoid overlooking possibilities, I will again use a modified truth table. What would happen if I ask the obvious question, "Is your sister a knight?" The next table shows Arthur's answer in each possible case.
    A  E  "Is your sister a knight?"
    T  T  Yes
    T  F  No
    F  T  No
    F  F  Yes
Clearly, the obvious question will not work. I would ask the question and receive an answer. The answer would narrow the possibilities from 4 to 2, but not in a useful way. For example, if Arthur answers "yes," Arthur and Elizabeth are represented by either the first or the fourth row. In the first row, Elizabeth is a knight, but in the fourth row, Elizabeth is a knave. I have no way of knowing which of the two rows is correct. If Arthur answers "no," I face the same obstacle. I need a different question. Convince yourself that "Is your sister a knave?" will not work either.
1 In this case, since A is a knave we should negate his statement.
2 The statement "At least one of A, B is a knave" is true in this case. Its negation is thus false, causing the compound statement to be false.
Perhaps I can try the question "Is either of you a knight?" The results are summarized as follows:
    A  E  "Is either of you a knight?"
    T  T  Yes
    T  F  Yes
    F  T  No
    F  F  Yes
This appears to be better. If Arthur answers "no," I know that Elizabeth is a knight. But what if he answers "yes"? I still wouldn't know what type Elizabeth is. Rather than randomly choosing another question, it may be profitable to analyze the preceding tables. Perhaps we can gain some insight into what we want the question to produce. The problem with my first question was that the common answers did not have a common knight/knave value for Elizabeth. The problem with the last question was that it only worked sometimes. What I really need is a question for which the first and third rows have one answer and the second and fourth rows have the opposite answer. My goal is a table that looks like this:
    A  E  ???
    T  T  Yes
    T  F  No
    F  T  Yes
    F  F  No
Perhaps I can use one of the truth tables from Section 2.3.2. Remember that if Arthur is a knave, he will negate the value in a truth table. The truth tables, with Arthur's filtered view, are given as follows:
    A  E  $A \land E$  "Are you both knights?"
    T  T  T            Yes
    T  F  F            No
    F  T  F            Yes
    F  F  F            Yes

    A  E  $A \lor E$   "Is at least one of you a knight?"
    T  T  T            Yes
    T  F  T            Yes
    F  T  T            No
    F  F  F            Yes
Neither does what I want.$^3$ However, the ($A \land E$) table comes close. The fourth row is the only entry that is not what I want. Perhaps a similar question would work. In fact, the question I want is "Are you both the same type?"
    A  E  "Are you both the same type?"
    T  T  Yes
    T  F  No
    F  T  Yes
    F  F  No
3 The second is really the same question as "Is either of you a knight?"
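The search for a suitable question can be mimicked in code. The sketch below models each candidate question as a function of the two types, filters the true answer through Arthur (a knave reverses it), and tests whether every possible answer pins down Elizabeth's type. All names (`arthur_answer`, `determines_elizabeth`, `QUESTIONS`) are invented for this illustration.

```python
from itertools import product


def arthur_answer(arthur_knight: bool, truth: bool) -> bool:
    """A knight reports the true answer; a knave reports its negation."""
    return truth if arthur_knight else not truth


# Each question maps (Arthur is knight, Elizabeth is knight) to its true answer.
QUESTIONS = {
    "Is your sister a knight?": lambda a, e: e,
    "Is either of you a knight?": lambda a, e: a or e,
    "Are you both knights?": lambda a, e: a and e,
    "Are you both the same type?": lambda a, e: a == e,
}


def determines_elizabeth(question) -> bool:
    """True if each spoken answer corresponds to only one type for Elizabeth."""
    type_for_answer = {}
    for a, e in product([True, False], repeat=2):
        spoken = arthur_answer(a, question(a, e))
        # Remember the first type seen for this answer; any clash means ambiguity.
        if type_for_answer.setdefault(spoken, e) != e:
            return False
    return True


if __name__ == "__main__":
    for text, question in QUESTIONS.items():
        print(text, "->", "works" if determines_elizabeth(question) else "fails")
```

Of the four candidates listed, only the last passes the test, in agreement with the tables above.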
Unfortunately, the single question will not help determine what kind of brother-in-law you might gain.$^4$
The final puzzle by Smullyan that I wish to present in this section involves a character from one of Shakespeare's more controversial$^5$ plays. One of the concepts hiding below the surface is that you can't believe everything you read. A statement is not true merely because it claims to be true.
Portia's Caskets
In Shakespeare's Merchant of Venice Portia had three caskets$^6$ (gold, silver, and lead), inside one of which was Portia's portrait. The suitor was to choose one of the caskets, and if he was lucky enough (or wise enough) to choose the one with the portrait, then he could claim Portia as his bride. On the lid of each casket was an inscription to help the suitor choose wisely:
This first of gold, who this inscription bears: "Who chooseth me shall gain what many men desire."
The second silver, which this promise carries: "Who chooseth me shall get as much as he deserves."
This third dull lead, with warning all as blunt: "Who chooseth me must give and hazard all he hath."
Suppose instead that Portia wished to choose her husband not on the basis of virtue, but simply on the basis of intelligence. She had new inscriptions put on the caskets (Figure C.1).
Figure C.1 The inscriptions on Portia's caskets.
    GOLD: The portrait is in this casket
    SILVER: The portrait is not in this casket
    LEAD: The portrait is not in the gold casket
TABLE C.1 Possible Truth Values for the Casket Inscriptions
    Gold  Silver  Lead
    T     T       T
    T     T       F
    T     F       T
    T     F       F
    F     T       T
    F     T       F
    F     F       T
    F     F       F
Portia explained to the suitor that of the three statements, at most one was true. Which casket should the suitor choose?
In this problem there are more than two people or things to combine. We can consider the possible combinations of the truthfulness of the messages on the three caskets. One systematic way to list them all is to have the right-hand column alternate between T and F, the middle column alternate two T's then two F's, and the left-hand column alternate T, F in groups of 4. (As we move from right to left, each column alternates in groups of twice as many as the previous column.) Table C.1 lists the possible combinations.
To solve the problem, we need to find a row that is consistent with the inscriptions. Portia has told us that we may eliminate rows 1, 2, 3, and 5. The consistency checks are presented in Table C.2. In the fourth row, the gold inscription is true, so the portrait is in the gold casket. However, the silver casket should have a false inscription. But the inscription on the silver casket correctly states that the portrait is not in the silver casket. This is an inconsistency. You should take the time to explain the remaining entries in the table.
4 That would require a single question with four possible answers, or else two questions.
5 At least in the late twentieth century.
6 A casket (in this context) is a small chest or box.
TABLE C.2 Checking the Consistency of the Inscriptions
Gold: "The portrait is in this casket." Silver: "The portrait is not in this casket." Lead: "The portrait is not in the gold casket."
    Gold  Silver  Lead  Consistency
    T     T       T     -
    T     T       F     -
    T     F       T     -
    T     F       F     Inconsistent
    F     T       T     -
    F     T       F     Inconsistent
    F     F       T     Consistent
    F     F       F     Inconsistent
According to the table, the seventh row is the proper row. The gold inscription is false, so the portrait is not in the gold casket. The lead inscription is true but provides no new information since we already know the portrait is not in the gold casket. The silver inscription is false, so the portrait must be in the silver casket. The suitor should choose the silver casket.
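The same conclusion can be reached from the other direction: instead of enumerating truth values, enumerate the three possible locations of the portrait and count how many inscriptions come out true. The following Python sketch does that; the function names are invented for illustration.

```python
def inscriptions(location: str) -> dict[str, bool]:
    """Truth value of each inscription if the portrait is in `location`."""
    return {
        "gold": location == "gold",       # "The portrait is in this casket"
        "silver": location != "silver",   # "The portrait is not in this casket"
        "lead": location != "gold",       # "The portrait is not in the gold casket"
    }


def consistent_locations() -> list[str]:
    """Locations for which at most one inscription is true (Portia's condition)."""
    return [
        location
        for location in ("gold", "silver", "lead")
        if sum(inscriptions(location).values()) <= 1
    ]


if __name__ == "__main__":
    print(consistent_locations())
```

With the portrait in the silver casket the inscriptions are false, false, true, which is exactly the seventh row of the table.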
Quick Check C.1
1. Suppose in Example C.2 you were allowed to ask Elizabeth one yes/no question (rather than asking Arthur). What question could you ask her to determine whether she is a knight or a knave?
C.2 Logic Puzzles about Implication, Biconditional, and Equivalence
Here are a few more of Raymond Smullyan's delightful logic puzzles. The ones presented next help explore implication, the biconditional, and equivalence.
More Knights and Knaves
We have two people, A, B, each of whom is either a knight or a knave. Suppose A makes the following statement: "If I am a knight, then so is B." Can it be determined what A and B are?
Once again, I will let T represent knight and F represent knave. Table C.3 records the consistency of the statement by A.
TABLE C.3 Consistency Check for Example C.4
    A  B  "If I am a knight, then so is B"
    T  T  Consistent
    T  F  Inconsistent
    F  T  Inconsistent
    F  F  Inconsistent
In the first row, since A is a knight, she speaks truthfully. B is also a knight, so the statement is consistent with the types of A and B. In the second row, B is a knave, contradicting the truthfulness of A's statement. The final two rows are inconsistent since A, being a knave, can never speak truthfully. But the implication "If I am a knight, then so is B" is true (A is a knave, so the "hypothesis" is false, making the implication true). We conclude that they are both knights.
Romance among the Knights and Knaves
This problem, though simple, is a bit surprising. Suppose it is given that I am either a knight or a knave. I make the following two statements:
1. "I love Linda."
2. "If I love Linda, then I love Kathy."
Am I a knight or a knave?
We need to decide if I can consistently make the two statements. Suppose I am a knave; every statement I make is false. Then (1) is actually false; so I don't love Linda. The hypothesis of the implication (2) is therefore false, so the implication is true. But then I could never say the true implication (2). The assumption that I am a knave makes the two statements inconsistent. No logical inconsistency occurs if we assume I am a knight. Therefore, I must be a knight. Notice that since (1) is true, the hypothesis to (2) is also true. But the entire implication (2) is also true. We are led to conclude$^7$ that I must also love Kathy. Although the logic of the puzzle ensures that I am a knight, some might be tempted to label me a two-timing knave!
Is There Gold on this Island?
On a certain island of knights and knaves, it is rumored that there is gold buried on the island. You arrive on the island and ask one of the natives, A, whether there is gold on this island. He makes the following response: "There is gold on this island if and only if I am a knight." Our problem has two parts:
a. Can it be determined whether A is a knight or a knave?
b. Can it be determined whether there is gold on the island?
I will let T and F serve multiple duty for this example. The symbol T will represent A being a knight, gold being on the island, and the biconditional (A's response) being true. The symbol F will represent A being a knave, there being no gold on the island, and the biconditional being false. For each combination, we will check for consistency (Table C.4).
TABLE C.4 Consistency check for Example C.6
    A  Gold  Gold $\leftrightarrow$ Knight  Consistency
    T  T     T                              Consistent
    T  F     F                              Inconsistent
    F  T     F                              Consistent
    F  F     T                              Inconsistent
In the first row, A tells the truth, so the claim that his knightly nature is identical with the presence of gold is consistent. In the second row, the biconditional is false. A could never make the claim that the biconditional is true. In the third row, the biconditional is false, but A is a knave and would tell us that it was true. In the final row, the knave A would never tell us that the biconditional was true (which it is in that row). We cannot answer (a), but we do know that the island contains gold. U 7
This form of reasoning is developed in Section 2.7.1.
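The consistency check for the gold puzzle can also be carried out mechanically. The following Python sketch is illustrative only: the helper name `consistent` and the variable names are assumptions introduced here, not part of the text. It enumerates the four combinations of "A is a knight" and "there is gold" and keeps the rows in which A could actually utter the biconditional.

```python
from itertools import product


def consistent(is_knight, statement_is_true):
    """A knight utters only true statements; a knave utters only false ones."""
    return is_knight == statement_is_true


rows = []
for knight, gold in product([True, False], repeat=2):
    claim = (gold == knight)  # "There is gold if and only if I am a knight."
    rows.append((knight, gold, claim, consistent(knight, claim)))

for knight, gold, claim, ok in rows:
    print(knight, gold, claim, "Consistent" if ok else "Inconsistent")

# Keep only the combinations in which A could have made the statement.
consistent_rows = [(k, g) for k, g, _, ok in rows if ok]
gold_values = {g for _, g in consistent_rows}      # possible answers to (b)
knight_values = {k for k, _ in consistent_rows}    # possible answers to (a)
```

Under these assumptions, `gold_values` contains a single value while `knight_values` contains both, which mirrors the conclusion that part (b) can be answered but part (a) cannot.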
Quick Check C.2
1. Suppose you are on the island of knights and knaves. Inhabitant A makes the following two statements:
(a) "If I am a knave, then B is a knight."
(b) "I am a knave if and only if B is a knave."
Do you know what A is? Do you know what B is?
C.3 Exercises
The exercises marked with O have detailed solutions in Appendix G.
The first five problems concern two identical twins, Ebenezum and Jedediah. Ebenezum always tells the truth. Jedediah tells the truth on some days and lies on others. Since the twins are identical, I can't tell which brother is speaking merely by looking or listening. I must consider the content of their statements.
1. O One day the twins approached me and made the following statements:
   • "I always tell the truth."
   • "My brother is lying."
   Is Jedediah telling the truth today, or is this one of his lying days? (Remember, the twins are identical, so I can't tell which is which by looking or listening.)
2. I once asked them what their mother's name was. They replied,
   • "Mother's name is either Ann or Betty."
   • "Mother's name is Ann."
   Can I determine their mother's name?
3. The twins have a sister whose name is either Maud or Gerty. I asked them the following question: "Does Ebenezum ever lie or is your sister's name Maud?". They both gave the same yes/no answer. What did I learn?
4. Suppose I ask Ebenezum and Jedediah "Are you lying today?". What possible pairs of answers can I get?
5. If I ask the question "Is your brother lying today?", can I determine if Jedediah is telling the truth or lying?
6. King Nebuchadnezzar employed three personal physicians. The physicians tended to be jealous of one another. Since he was a despot, Nebuchadnezzar only allowed high officials or personal servants one big mistake or two small mistakes. Each physician had already made one small mistake, so they were all eagerly looking for opportunities to cause their rivals to stumble. One morning, Nebuchadnezzar (hereafter called "the king") woke up feeling ill. He called in physician A, who prescribed a diet of chicken soup and black bread for three days, promising that this would cure him. Physician B then came rushing in and exclaimed, "Don't listen to A, he and C are both liars." The king then called in C to get his opinion. C was so frightened that all he could say was, "One of the others is lying." The king thought for a minute, then had one of the physicians executed immediately. Which doctor was executed, and why? What assumptions did you need to make in order to solve this problem? (You may not assume that the king is stupid; nor may you assume that he acted arbitrarily. Within his worldview, he made a rational decision based on the facts and so should you.)
7. O Suppose you arrive on the island of knights and knaves. You want to visit the capital city. As you travel, you come to a fork in the road. There is a local inhabitant standing at the fork. She informs you that she will answer only one question. You don't know whether she is a knight or a knave. Can you think of a single question to ask whose answer is guaranteed to provide the proper route to the capital city? (You need to be a bit devious.)
C.4 QUICK CHECK SOLUTIONS
Quick Check C.1
1. Suppose you ask her the same question that would work with Arthur. Her response would be as follows:

A    E    "Are you both the same type?"
T    T    Yes
T    F    Yes
F    T    No
F    F    No

This tells us what kind of resident Arthur is but nothing about Elizabeth. The goal is to find a question with a response:

A    E    ???
T    T    Yes
T    F    No
F    T    Yes
F    F    No

One simple question is, "Do you have a brother?". The key is to find a question whose truthful answer is always "Yes."
Quick Check C.2
1. You need to keep the truth tables for implication and equivalence in mind for this puzzle. As usual, let T represent knight and F represent knave. Consider the first statement. If both are knights, then the hypothesis is false and the conclusion is true, so the implication is true. Since A tells the truth, the statement is one that could be made by A. The second row follows by an almost identical argument. In the third row, both the hypothesis and conclusion are true, so the implication is true. But A is a knave and could never make a true statement. Thus, row 3 is inconsistent. In the fourth row, the hypothesis is true and the conclusion is false, so the implication is false. A could make this false statement.

A    B    "If I am a knave, then B is a knight"
T    T    Consistent
T    F    Consistent
F    T    Inconsistent
F    F    Consistent

Now consider the second statement. If both are knights, the statement (an equivalence) is true. Since A is a knight, A could make the statement. If both are knaves, the statement is also true. But A, being a knave, could not make a true statement. Thus, row 4 is inconsistent. If A is a knight but B is a knave, then the statement is false. A (a knight) would not make such a statement. Thus, row 2 is inconsistent. If A is a knave and B is a knight, the statement is false. The knave A could utter such a statement. Row 3 is consistent.

A    B    "I am a knave if and only if B is a knave"
T    T    Consistent
T    F    Inconsistent
F    T    Consistent
F    F    Inconsistent

Row 1 is the only row in which both statements are consistent. Both A and B are knights.
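The row-by-row argument for Quick Check C.2 can be double-checked by brute force. This is a minimal Python sketch, not the book's method; `implies` and the other names are hypothetical helpers chosen for illustration. It tries every assignment of knight or knave to A and B and keeps those in which both of A's statements are ones A could make.

```python
from itertools import product


def implies(p, q):
    """Truth value of the implication p -> q."""
    return (not p) or q


solutions = []
for a_knight, b_knight in product([True, False], repeat=2):
    a_knave, b_knave = not a_knight, not b_knight
    stmt_a = implies(a_knave, b_knight)   # "If I am a knave, then B is a knight."
    stmt_b = (a_knave == b_knave)         # "I am a knave iff B is a knave."
    # A knight's statements are all true; a knave's statements are all false.
    if all(a_knight == s for s in (stmt_a, stmt_b)):
        solutions.append((a_knight, b_knight))

print(solutions)  # → [(True, True)]
```

The single surviving assignment has both A and B as knights, in agreement with the two tables.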
APPENDIX D
The Golden Ratio
Is it possible to quantify an aesthetic notion? Some people have believed it is possible. A number, named the golden ratio (also called the golden section or the divine proportion), is the result of one such attempt. This number derives from a geometric ratio that has its roots in antiquity. Euclid called the process of creating the ratio a "division in extreme and mean ratio." The notion reappeared in Fra Luca Pacioli's book, De Divina Proportione (written in 1509 and illustrated by Leonardo da Vinci). The adjective golden seems to have first appeared in the 1800s in Germany. The name golden section first appeared in English in the 1875 Encyclopedia Britannica. Some (but not all) mathematicians use the Greek letter $\Phi$ to represent this number (in honor of the ancient Greek sculptor Phidias).
The number (Euclid's "division in extreme and mean ratio") can be derived by considering the problem: Given a line segment, divide it into two pieces, S and L, having respective lengths, s and l, so that the number of copies of S that will fit in L is the same as the number of copies of L that will fit inside the original line segment. Another way to state the problem is to require the ratio of the longer segment's length to the shorter segment's length to be the same as the ratio of the entire line segment's length to the longer segment's length:
$$\frac{l}{s} = \frac{l + s}{l}.$$
Making the substitution $\Phi = \frac{l}{s}$, we seek a solution to the equation
$$\Phi = 1 + \frac{1}{\Phi},$$
or $\Phi^2 - \Phi - 1 = 0$. The possible solutions are $\frac{1+\sqrt{5}}{2}$ and $\frac{1-\sqrt{5}}{2}$. Since $\frac{1-\sqrt{5}}{2}$ is negative, the only possibility is $\Phi = \frac{1+\sqrt{5}}{2}$. In terms of the original question, the line should be divided so that $l = s\Phi$ (Figure D.1).

Figure D.1. Dividing a line by the golden ratio.

It has been asserted numerous times during the previous century that the golden ratio can also be used to determine the length and width of an aesthetically pleasing rectangle, called the golden rectangle (Figure D.2).
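A short numerical check may make the derivation more concrete. The Python sketch below is illustrative; the function `divide_segment` and the sample length 10.0 are assumptions made here. It computes both roots of $\Phi^2 - \Phi - 1 = 0$ and splits a segment so that the longer piece is $\Phi$ times the shorter one.

```python
import math

# The two roots of x^2 - x - 1 = 0.
phi = (1 + math.sqrt(5)) / 2
other_root = (1 - math.sqrt(5)) / 2


def divide_segment(total_length):
    """Split a segment into (shorter s, longer l) with l = s * phi."""
    s = total_length / (1 + phi)
    l = total_length - s
    return s, l


s, l = divide_segment(10.0)
print(phi, other_root)
print(l / s, (l + s) / l)  # the two ratios agree up to rounding
```

Only the positive root can be a ratio of lengths, and for the split segment the ratio of longer to shorter matches the ratio of whole to longer, up to floating-point rounding.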
This assertion has been extended to suggest that golden rectangles can be found in ancient architecture and in many famous paintings and sculptures.¹ Perhaps the most noteworthy claim of this sort involves the front face of the Parthenon (Figure D.3), a temple in Athens dedicated to Athena. It was built between 447 and 438 B.C. The width and height (including the peak) are said to form a golden rectangle.

Figure D.2. The golden rectangle.

Figure D.3. The Parthenon.

Many of these claims seem to be valid as rough approximations. However, serious doubt has been expressed about (a) the mathematical accuracy of such claims and (b) the validity of claims that the architects, painters, and sculptors actually intended to use the golden ratio as a basis for their work. For more information about these objections (and some other dubious claims regarding the golden ratio), see Misconceptions about the Golden Ratio, by George Markowsky [56].

¹ For example, the face in the Mona Lisa.
APPENDIX E
Matrices
Matrices appear in many contexts in mathematics. They are most fully explored in linear algebra courses. They appear in some of the chapters of this book, so a very brief introduction will be helpful for students who have not previously studied them in a linear algebra course or an upper level high school course.
DEFINITION E.1 Matrix; Square Matrix; Main Diagonal
An n-by-m matrix is a numeric table containing n rows and m columns. One common notational device is to abbreviate the matrix A by writing $A = (a_{ij})$. This is interpreted as "A is the matrix whose entry in row i and column j is the number $a_{ij}$." If n = m, we say the matrix is a square matrix. The main diagonal is the set of entries $\{a_{11}, a_{22}, a_{33}, \ldots, a_{nn}\}$.
For example, a 3-by-4 matrix will have the form
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{pmatrix}$$
DEFINITION E.2 Transpose of a Matrix
Every n-by-m matrix, A, has an associated m-by-n matrix called its transpose that is denoted $A^T$. The entry in the ith row and jth column of $A^T$ is the entry in the jth row and ith column of A. That is, if $A = (a_{ij})$ and $A^T = (b_{ij})$, then $b_{ij} = a_{ji}$. This is often abbreviated as $A^T = (a_{ji})$.
The transpose effectively interchanges rows and columns of A. The transpose of the 3-by-4 matrix shown previously is
$$\begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \\ a_{14} & a_{24} & a_{34} \end{pmatrix}$$
One common use for matrices is to compactly organize the important information in a system of linear equations. One matrix that arises in that context is called a coefficient matrix.
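DEFINITION E.3 Coefficient Matrix
Let
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= b_n \end{aligned}$$
be a system of linear equations. Then the matrix $A = (a_{ij})$ is called the coefficient matrix for the system of equations.

A Coefficient Matrix
The system of linear equations
$$\begin{aligned} -5w + 2x + 3y + 4z &= 7 \\ 4w - 6x + 9y &= 8 \\ w + 3x + 4y - z &= -6 \end{aligned}$$
has coefficient matrix
$$\begin{pmatrix} -5 & 2 & 3 & 4 \\ 4 & -6 & 9 & 0 \\ 1 & 3 & 4 & -1 \end{pmatrix}.$$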
Matrices are frequently used in applied settings. They also provide interesting examples of algebraic systems with properties that differ from some of the standard field properties that characterize the rational numbers. Only some simple algebraic manipulations of matrices will be presented here. Just as we can add, subtract, and multiply numbers, it is also possible to define analogous operations on matrices. There are two special families of matrices that correspond to the numbers 0 and 1.
DEFINITION E.4 Zero Matrix
A zero matrix is a matrix for which every entry is the number 0. The 3-by-4 zero matrix is as follows:
$$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
Once matrix addition is defined, it is easy to show that adding any n-by-m matrix, A, to an n-by-m zero matrix will result in A. Compare this to a + 0 = a for any real number, a.
DEFINITION E.5 Identity Matrix
An identity matrix is an n-by-n (square) matrix with
$$a_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$
The n-by-n identity matrix is denoted $I_n$.
The 3-by-3 identity matrix is shown below.
$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Once matrix multiplication is defined, it will be easy to show that multiplying any n-by-m matrix, A, by an appropriate identity matrix will result in A. In fact, $AI_m = A$ and $I_nA = A$ will both be true. Compare this to $a \cdot 1 = a$ and $1 \cdot a = a$ for any real number, a.
Two n-by-m matrices can be added (and subtracted) by adding (subtracting) the corresponding entries.
DEFINITION E.6 Matrix Addition
Let $A = (a_{ij})$ be an n-by-m matrix and let $B = (b_{ij})$ be another n-by-m matrix. Then the sum, C = A + B, is an n-by-m matrix. Moreover, $C = (c_{ij})$, where $c_{ij} = a_{ij} + b_{ij}$.
Matrix Addition 113-4) 3
-5
1 51
-2
8
2
(2
3
0
(5
4
-9
-2 10
3
3
110
9
8
-6 -
-
)
At first it may seem reasonable to define an entry-by-entry multiplication (similar to what was done for addition). However, most applications of matrices favor a more complex definition. Under this definition, matrix multiplication is not commutative (in general). The product AB may be defined, but BA may not exist. Even if AB and BA both exist, it is usually the case that $AB \neq BA$.
DEFINITION E.7 Matrix Multiplication
Let $A = (a_{ij})$ be an n-by-m matrix and let $B = (b_{ij})$ be an m-by-p matrix. Then the product, C = AB, is an n-by-p matrix. Moreover, $C = (c_{ij})$, where
$$c_{ij} = \sum_{k=1}^{m} a_{ik}b_{kj}.$$
Notice that the number of columns of A must be the same as the number of rows in B. Observe also that the product matrix has the same number of rows as A and the same number of columns as B. One visualization of this process is as follows. The entry in the ith row and jth column of the product is the result of grabbing the jth column of B, rotating 90° counterclockwise, and dropping it on the ith row of A. We then multiply numbers that occupy the same position and add the results:
$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ik}b_{kj} + \cdots + a_{im}b_{mj}.$$
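The row-times-column rule in Definition E.7 translates directly into a few lines of code. The sketch below is illustrative, assuming matrices are stored as lists of rows; the function name `mat_mul` and the sample matrices are hypothetical and were chosen only to show that AB and BA can differ.

```python
def mat_mul(A, B):
    """Product of an n-by-m matrix A and an m-by-p matrix B (lists of rows)."""
    n, m = len(A), len(A[0])
    if len(B) != m:
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # c_ij is the sum over k of a_ik * b_kj.
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # → [[2, 1], [4, 3]]
print(BA)  # → [[3, 4], [1, 2]]
```

For this pair of matrices both products exist but are different, which is the non-commutativity the text describes.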
Matrix Multiplication
Make sure you can reproduce this calculation.
-
8
-1 1
-2
47
= O=(A
)(0
23
55
It should be noted that matrix division is not defined. For square matrices, it is common to define the notion of an inverse matrix and gain some of the benefits of division.
DEFINITION E.8 Matrix Inverse
If A is an n-by-n matrix, then A has an inverse if there is a matrix, $A^{-1}$, such that $AA^{-1} = A^{-1}A = I_n$.
Many square matrices do not have an inverse. The definition of matrix multiplication can be extended to cover nonnegative integer exponents for square matrices.
DEFINITION E.9 $A^k$
Let A be an n-by-n matrix. The nonnegative powers of A are
$$A^0 = I_n, \qquad A^k = AA^{k-1} \text{ for } k \geq 1.$$
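The recursive definition of $A^k$ can be unrolled into a simple loop. The following Python sketch is a hedged illustration, assuming square matrices stored as lists of rows; `identity`, `mat_mul`, `mat_pow`, and the sample matrix are names and data invented here rather than taken from the text.

```python
def identity(n):
    """The n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_pow(A, k):
    """Nonnegative power of a square matrix: A^0 = I_n, A^k = A * A^(k-1)."""
    if k < 0:
        raise ValueError("only nonnegative powers are defined here")
    result = identity(len(A))
    for _ in range(k):
        result = mat_mul(A, result)  # one more factor of A on the left
    return result


A = [[1, 1],
     [0, 1]]
print(mat_pow(A, 0))  # → [[1, 0], [0, 1]]
print(mat_pow(A, 3))  # → [[1, 3], [0, 1]]
```

Starting from the identity and multiplying by A exactly k times follows the definition step by step; faster methods exist but are not needed to illustrate it.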
Matrix Powers Let A =
2 1 (10
1
. Then
=6 4 6) A2 =AA= 6 5 6 (2228
and
3
A =AA'
/20 22
16 20) .22 17 6 8
APPENDIX F
The Greek Alphabet
The following Greek letters are commonly used in mathematics.

Name       Lowercase   Uppercase   Pronunciation
alpha      α           Α
beta       β           Β
gamma      γ           Γ
delta      δ           Δ
epsilon    ε           Ε
zeta       ζ           Ζ           "e" like the "a" in "ate"
eta        η           Η           "e" like the "a" in "ate"
theta      θ           Θ           "e" like the "a" in "ate"
iota       ι           Ι           "i" like the "e" in "he"
kappa      κ           Κ
lambda     λ           Λ
mu         μ           Μ           like "moo"
nu         ν           Ν           like "new"
xi         ξ           Ξ           "x" like in "vex," "i" like the "e" in "he"
omicron    ο           Ο
pi         π           Π
rho        ρ           Ρ
sigma      σ           Σ
tau        τ           Τ           like "row"
upsilon    υ           Υ           "ups" like "oops"
phi        φ           Φ           with a long "i"
chi        χ           Χ           with a hard "ch" (like a "k")
psi        ψ           Ψ
omega      ω           Ω

The Hebrew letter aleph, ℵ, is also used in discussions about the cardinality of infinite sets.
APPENDIX
N
N (R v S)
Contrapositive: (¬R ∧ ¬S) → (¬P ∨ ¬Q)
Converse: (R ∨ S) → (P ∧ Q)
Inverse: (¬P ∨ ¬Q) → (¬R ∧ ¬S)
4. Associativity The two sides of the biconditional have the same truth table values.

P    Q    R    P ∨ Q    Q ∨ R    (P ∨ Q) ∨ R    P ∨ (Q ∨ R)
T    T    T    T        T        T              T
T    T    F    T        T        T              T
T    F    T    T        T        T              T
T    F    F    T        F        T              T
F    T    T    T        T        T              T
F    T    F    T        T        T              T
F    F    T    F        T        T              T
F    F    F    F        F        F              F

P    Q    R    P ∧ Q    Q ∧ R    (P ∧ Q) ∧ R    P ∧ (Q ∧ R)
T    T    T    T        T        T              T
T    T    F    T        F        F              F
T    F    T    F        F        F              F
T    F    F    F        F        F              F
F    T    T    F        T        F              F
F    T    F    F        F        F              F
F    F    T    F        F        F              F
F    F    F    F        F        F              F

Distributivity (∧ over ∨) The two sides of the biconditional have the same truth table values.

P    Q    R    P ∧ Q    P ∧ R    Q ∨ R    P ∧ (Q ∨ R)    (P ∧ Q) ∨ (P ∧ R)
T    T    T    T        T        T        T              T
T    T    F    T        F        T        T              T
T    F    T    F        T        T        T              T
T    F    F    F        F        F        F              F
F    T    T    F        F        T        F              F
F    T    F    F        F        T        F              F
F    F    T    F        F        T        F              F
F    F    F    F        F        F        F              F

P    Q    R    P ∨ Q    P ∧ R    Q ∧ R    (P ∨ Q) ∧ R    (P ∧ R) ∨ (Q ∧ R)
T    T    T    T        T        T        T              T
T    T    F    T        F        F        F              F
T    F    T    T        T        F        T              T
T    F    F    T        F        F        F              F
F    T    T    T        F        T        T              T
F    T    F    T        F        F        F              F
F    F    T    F        F        F        F              F
F    F    F    F        F        F        F              F
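Truth tables for associativity and distributivity can be cross-checked by exhaustive enumeration. The Python sketch below is illustrative only; the helper `equivalent` and the dictionary `checks` are assumptions introduced here. Two propositional forms are logically equivalent exactly when they agree on all eight assignments of truth values to P, Q, and R.

```python
from itertools import product


def equivalent(f, g, variables=3):
    """True when f and g agree on every assignment of truth values."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=variables))


checks = {
    "associativity of or": equivalent(
        lambda p, q, r: (p or q) or r,
        lambda p, q, r: p or (q or r)),
    "associativity of and": equivalent(
        lambda p, q, r: (p and q) and r,
        lambda p, q, r: p and (q and r)),
    "and distributes over or (left)": equivalent(
        lambda p, q, r: p and (q or r),
        lambda p, q, r: (p and q) or (p and r)),
    "and distributes over or (right)": equivalent(
        lambda p, q, r: (p or q) and r,
        lambda p, q, r: (p and r) or (q and r)),
}

for name, ok in checks.items():
    print(name, ok)

# A near miss for comparison: dropping P from the second conjunct breaks it.
near_miss = equivalent(lambda p, q, r: p and (q or r),
                       lambda p, q, r: (p and q) or r)
```

Each named check compares the two columns that a truth table would place side by side, and the near miss shows that the comparison can fail when the forms are not equivalent.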
Appendix G Solutions to Selected Exercises
5. Implication All sides of the biconditional have the same truth table.

P    Q    ¬P    ¬Q    P ∧ (¬Q)    P → Q    ¬(P ∧ (¬Q))    (¬P) ∨ Q
T    T    F     F     F           T        T              T
T    F    F     T     T           F        F              F
F    T    T     F     F           T        T              T
F    F    T     T     F           T        T              T

9. (a)

P    Q    P → Q    (P → Q) ∧ Q    [(P → Q) ∧ Q] → P
T    T    T        T              T
T    F    F        F              T
F    T    T        T              F
F    F    T        F              T
Q)
[(P -
11. (b) (P -) [(-P) --* Q])
4:• P
--
(a-(a+(b-b)))+b
(-(-P)v Q) implication (P V Q) double negation SP law of addition
(c) [P -- (Q A (-Q))] [P --> F] --+ (-P)
.:.
-
ST
(-P) law of contradiction
,• ((-P)v F) -- (-P) implication -P .-P Q)) v P implication
Q) V P
-:• ((-P) v Q) v P 0.
Exercises 3.4.3
2. (a) $\sum_{k=1}^{2m+1} k = \sum_{k=1}^{2m} k + (2m+1) = m(2m+1) + (2m+1) = (m+1)(2m+1)$.
(b) If n is even, then there is an integer m such that n = 2m. Thus, using Exercise 1,
$$\sum_{k=1}^{n} k = \sum_{k=1}^{2m} k = m(2m+1) = \frac{n}{2}(n+1) = \frac{n(n+1)}{2}.$$
If n is odd, then there is an integer m such that n = 2m + 1. Thus, using part (a),
$$\sum_{k=1}^{n} k = \sum_{k=1}^{2m+1} k = (m+1)(2m+1) = \frac{n+1}{2} \cdot n = \frac{n(n+1)}{2}.$$
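The even and odd cases of the sum formula are easy to test numerically. This Python sketch is a sanity check rather than a proof, and the function name `sum_by_parity` is an assumption made for illustration. It evaluates the closed form from each case and compares it with a direct sum.

```python
def sum_by_parity(n):
    """1 + 2 + ... + n, computed with the even and odd cases separately."""
    if n % 2 == 0:
        m = n // 2                  # n = 2m
        return m * (2 * m + 1)
    m = (n - 1) // 2                # n = 2m + 1
    return (m + 1) * (2 * m + 1)


for n in range(1, 11):
    print(n, sum_by_parity(n), sum(range(1, n + 1)), n * (n + 1) // 2)
```

In every printed row the three values coincide, which is what the case analysis predicts.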
5. Some useful examples: $2^4 - 1 = 15 = 3 \cdot 5$, $2^6 - 1 = 63 = 7 \cdot 9$, $2^9 - 1 = 511 = 7 \cdot 73$, $2^{10} - 1 = 1023 = 3 \cdot 11 \cdot 31 = 31 \cdot 33$, $2^{15} - 1 = 32767 = 7 \cdot 31 \cdot 151 = 7 \cdot 4681$.
Observation 1: If n = 2m, then $2^{2m} - 1 = (2^m - 1)(2^m + 1)$.
Observation 2: If n = ab, then $2^{ab} - 1 = (2^a - 1)\left(\sum_{k=1}^{b} 2^{a(b-k)}\right)$. Thus, if n = ab is composite (with a > 1 and b > 1), then $2^n - 1 = (2^a - 1)\left(\sum_{k=1}^{b} 2^{a(b-k)}\right)$ is also composite.
11. Let the rational numbers be $\frac{a}{b}$ and $\frac{c}{d}$, with $b \neq 0$ and $d \neq 0$. Their sum is $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$. Since a, b, c, and d are integers, so are ad + bc and bd. Also, $b \neq 0$ and $d \neq 0$ imply that $bd \neq 0$. Thus $\frac{ad+bc}{bd}$ meets the requirements of the definition of a rational number.
Note: The details that verify the "properties → concept" part of the definition are important. The proof is not complete without this verification.
21. A proof by cases will work. The key idea is the definition of absolute value:
$$|a| = \begin{cases} a & \text{if } a \geq 0 \\ -a & \text{if } a < 0. \end{cases}$$
Note that $|0| = 0 = -0$.
Case 1: $x \geq 0$, $y \geq 0$. Then $xy \geq 0$ so $|xy| = xy = |x| \cdot |y|$.
Case 2: $x \geq 0$, $y < 0$. Then $xy \leq 0$ so $|xy| = -(xy) = x(-y) = |x||y|$.
Case 3: $x < 0$, $y \geq 0$. Then $xy \leq 0$ so $|xy| = -(xy) = (-x)y = |x||y|$.
Case 4: $x < 0$, $y < 0$. Then $xy > 0$ so $|xy| = xy = (-x)(-y) = |x||y|$.
... Let $n \geq 0$ be some integer. Assume that every set with n elements has exactly $2^n$ subsets. Let $|S| = n + 1$. Single out any element and denote it x. Then $S - \{x\}$ has n elements, so by the inductive hypothesis, it has $2^n$ subsets (none of which contain x). All the subsets of S can be partitioned into two disjoint groups: all the subsets that contain x and all the subsets that don't contain x. Every subset T that contains x can be uniquely paired with a subset $T - \{x\}$ from the subsets that don't contain x, that is, from the subsets of $S - \{x\}$. The two groups thus have the same number of elements. Since the two groups are disjoint and their union equals the set of all subsets of S, there must be $2^n + 2^n = 2^{n+1}$ subsets of S.
Conclusion The theorem of mathematical induction implies that a set with n elements has $2^n$ subsets, for $n \geq 0$.
37. Try to find a counter-example to the claim. Then we need both a and b to have remainder 1 or 2 when divided by 3. By Problem 36, a must be odd and b must be even (or vice versa). Also, c must be odd. The easiest way to coordinate remainder mod 3 with even and odd is to classify numbers by their remainder mod 6. A bit of thought and experimentation shows that we must have $a \in \{6k+1, 6k+5\}$, $b \in \{6m+2, 6m+4\}$, and $c = 2n+1$ if we want to make sure that 3 divides neither a nor b. There are four cases to investigate to find a counterexample to the claim.
a = 6k + 1 and b = 6m + 2: $a^2 + b^2 = c^2$ becomes $36k^2 + 12k + 1 + 36m^2 + 24m + 4 = 4n^2 + 4n + 1$, which simplifies to $3(3k^2 + k + 3m^2 + 2m) + 1 = n(n+1)$.
a = 6k + 1 and b = 6m + 4: $a^2 + b^2 = c^2$ becomes $36k^2 + 12k + 1 + 36m^2 + 48m + 16 = 4n^2 + 4n + 1$, which simplifies to $3(3k^2 + k + 3m^2 + 4m + 1) + 1 = n(n+1)$.
a = 6k + 5 and b = 6m + 2: $a^2 + b^2 = c^2$ becomes $36k^2 + 60k + 25 + 36m^2 + 24m + 4 = 4n^2 + 4n + 1$, which simplifies to $3(3k^2 + 5k + 3m^2 + 2m + 2) + 1 = n(n+1)$.
a = 6k + 5 and b = 6m + 4: $a^2 + b^2 = c^2$ becomes $36k^2 + 60k + 25 + 36m^2 + 48m + 16 = 4n^2 + 4n + 1$, which simplifies to $3(3k^2 + 5k + 3m^2 + 4m + 3) + 1 = n(n+1)$.
Exercise 35 shows that each of the final equations is impossible.

G.4 Algorithms
Exercises 4.1.3
1. Other algorithms are possible.

   {integer, integer, integer, integer} change (real price)
     amount = 1.0 - price
     q = 0
     while amount ≥ .25
       q = q + 1
       amount = amount - .25
     d = 0
     while amount ≥ .10
       d = d + 1
       amount = amount - .10
     n = 0
     while amount ≥ .05
       n = n + 1
       amount = amount - .05
     # the remaining amount, converted to a count of pennies
     p = 100 · amount
     return {q,d,n,p}
   end change

6. There are several possible algorithms for this problem. A simple one is shown here. Exercise 7 in Exercises 7.1.7 provides another alternative.

   void greatestCommonDivisor (natural number a, natural number b)
     if a == 0 and b == 0
       display invalid input message
     else
       if (a < b and a ≠ 0) or (b == 0)
         d = a
       else
         d = b
       # cycle through possible greatest common divisors
       # d will eventually become 1 if no other common
       # divisor can be found
       while d > 0
         if a is divisible by d and b is divisible by d
           display d
           d = 0
         else
           d = d - 1
   end greatestCommonDivisor

9.
   real absoluteValueOfAverage (integer n, real {x1, x2, ..., xn})
     sum = 0
     average = 0
     for i = 1 to n
       sum = sum + xi
     average = sum ÷ n
     if average < 0
       average = -average
     return average
   end absoluteValueOfAverage

15. There are more efficient ways to do this. This algorithm is simple and straightforward. The goal at this point is familiarity with the algorithm notation, not efficiency.

   boolean relativelyPrime (integer a, integer b)
     if a == b
       return false
     else
       # do a brute-force check for common factors
       c = min(a,b)
       for i = 2 to c
         if (i | a) and (i | b)
           return false
     # no common factors
     return true
   end relativelyPrime
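For readers who want to run the change-making idea, here is a hedged Python sketch. It is not the book's algorithm verbatim: it assumes the price is given as a whole number of cents (a choice made here to avoid floating-point rounding), and the function name `change` with an integer parameter is an illustrative adaptation. Repeated subtraction of a coin value is replaced by the equivalent `divmod`.

```python
def change(price_cents):
    """Coins returned from one dollar: (quarters, dimes, nickels, pennies)."""
    if not 0 <= price_cents <= 100:
        raise ValueError("price must be between 0 and 100 cents")
    amount = 100 - price_cents
    # divmod(amount, v) gives how many coins of value v fit, and what is left.
    q, amount = divmod(amount, 25)
    d, amount = divmod(amount, 10)
    n, amount = divmod(amount, 5)
    p = amount
    return q, d, n, p


print(change(33))  # → (2, 1, 1, 2)
```

Working in integer cents sidesteps the risk that a real-valued amount such as 0.05 is represented slightly below its nominal value, which could otherwise end a loop one step early.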
Exercises 4.2.3 c = 1 andn 0 = 10 work
1. (c) f 3 (n) = 3nlog 2 (n)
c= 1
0) 200 150
,
100 50
-
2
4
6
8
10
12
14
Solutions to Selected Exercises This function is not in O(n 2 ). Be careful:
(d) f 4 (n) -=
5. (d) True. Assume n > 0 and drop the absolute value signs.
it may seem from a poorly chosen graph that it is (see 3
->
8
(right-hand graph), !- will always get larger (no matter what you choose for c). 15000 -,
20 , ,
7500
forn > 1.
8
Note that f(n) = n2+4 n+Is f E 0(g)?
12500 10000
-
Letc= andno = I. 8. (b) f(n) = -'-4; g(n) = n
,
1
n2
n3
the left-hand graph). However, if n gets large enough
c
A39
20 - 4 + W+4.
=
Assume n > 0 and drop the absolute value signs. Then
,2+ n+ 4
2500 z
20
-4 +
nn2
,z_
5000
n+4 20
-
n + 4
40
30
20
10
1 c= 20
I =
700000
5n.
600000
2
500000
Letn 0 = 1 andc = 5. Then 1 12- 1 < 5ini forn > 1 and so f E 0(g). Is f G Q(g)? Assume n > 0 and drop the absolute value signs. Then
400000 z 30
,
200000
2
100000
n+4
25 2. (d)
f4(n)= -
50
75
forn>4
100
125
150
-
n+n
175
n2 +4
c=2andn0 = 16work
2n 1
=-n
2
c=2 1000I 800,
> -n
" 8Letno=4andc=
+
2
-
n forn >0.
.Then 124I>
Inlforn>4
and so fE•Q (g).
600
Is f E 0(g)? Since it was shown that f c O(g) and f c Q (g), it is
400 -
valid to conclude that f • 0(g).
200
f(n) = n!; g(n) = n
-(c)
S.,Is 5
10
15
20
4. (c) True. Sincen > 0forlog(n) to be defined, wecandrop the absolute value signs. 3nlog2 (n) < 3n .n =
3
n
2n
forn > 0
since < n for n 2(n) n> > 0. Thus 13n 1g 2 (n)I < 312i fon10gI.2 for n > 1. (d) False. Assume n > 0 and drop the absolute value signs. Suppose that there do exist constants c and no such that < cn
4
> n22+ 4>n2+4
MR
2
for all n > no. Then n2(8 - c) < 0 for all no. This is clearly a contradiction, since for all
* > 8c both factors are positive. Consequently, there n3n cannot exist a c and no such that I -I < c1n 2 1 for all
no.
Is $f \in O(g)$? Assume n > 0 and drop the absolute value signs. Then
$$n! = n \cdot (n-1) \cdots 1 \leq n \cdot n \cdots n = n^n.$$
Let $n_0 = 1$ and c = 1. Then $|n!| \leq 1 \cdot |n^n|$ for $n \geq 1$ and so $f \in O(g)$.
Is $f \in \Omega(g)$? Assume n > 0 and drop the absolute value signs. Suppose that there do exist positive constants c and $n_0$ such that $n! \geq cn^n$ for all $n > n_0$. Then $\frac{n!}{n^n} \geq c$ for all $n > n_0$. Since $\lim_{n \to \infty} \frac{n!}{n^n} = 0$, there must be a positive integer, $n_1$ (which can be assumed to be greater than $n_0$), such that $\frac{n!}{n^n} < c$ for all $n > n_1$. This contradicts the previous assertion that $\frac{n!}{n^n} \geq c$ for all $n > n_0$. Therefore, there do not exist constants c and $n_0$ such that $|n!| \geq c|n^n|$ for all $n > n_0$ and so $f \notin \Omega(g)$.
Is $f \in \Theta(g)$? Although $f \in O(g)$, $f \notin \Omega(g)$ and so $f \notin \Theta(g)$.
9. If x > 0, we can drop the absolute value signs. Since $\lfloor x \rfloor \leq x$, it is clear that $\lfloor x \rfloor \in O(x)$ (c = 1 and $n_0 = 0$). Since $\lfloor x \rfloor > x - 1$ it is reasonable to expect
Note the assumption made that the term 363 tog 2(n) approaches 0 as n - oc. This can be verified by one application of L'Hospital's Rule. (Recall that -dg-lg 2 (n) = 1') Since 363 is a real number with 363 5#0, Theorem 4.7 implies that f E 8(g). Exercises 4.2.5 1. (c)
$\lfloor x \rfloor \in \Omega(x)$. More formally, assume x > 0 so that we can drop the absolute value signs. Then $\lfloor x \rfloor > x - 1 \geq \frac{1}{2}x$ for $x \geq 2$. Let $c = \frac{1}{2}$ and $n_0 = 2$.
Since $\lfloor x \rfloor \in O(x)$ and $\lfloor x \rfloor \in \Omega(x)$, we conclude that $\lfloor x \rfloor \in \Theta(x)$.
18. (c) $121(\log_2(n) + n)(n + 3n\log_2(n)) + 6n^2$
The factor $\log_2(n) + n \in \Theta(n)$ by Theorem 4.6 and the factor $n + 3n\log_2(n) \in \Theta(n\log_2(n))$ by the same theorem. The initial term $(121(\log_2(n) + n)(n + 3n\log_2(n)))$ is in $\Theta(n^2\log_2(n))$ by Theorem 4.5. A final application of Theorem 4.6 leads to $121(\log_2(n) + n)(n + 3n\log_2(n)) + 6n^2 \in \Theta(n^2\log_2(n))$.
(I + 2 + 3 +
21. n! grows faster. A graph clearly shows this. The algebraic justification is: 3
This isn't quite what is needed. A small adjustment will work. Notice that 2 • 2 • 2 • 2 < I • 2 • 3 - 4. Thus, it is in fact the case (for n > 4) n times n =22 ... 2 2, nn
2. = =2
2(n) + 2
1
(n
!
+
2). 2 +(n
= n(n - 1) + n = n2.
-
I)¶ !
15. There will be a committee consisting of from 1 to n people. Different sized committees are mutually exclusive, so one counting strategy is to count the number of ways to have a committee of size k, for each k in {1, 2, ..., n} and then add. There are $\binom{n}{k}$ ways to pick the k committee members and there are then k choices for the driver. There are consequently $k\binom{n}{k}$ ways to pick a committee (with driver) of size k. General Counting Principle 2 asserts that there are $\sum_{k=1}^{n} k\binom{n}{k}$ ways to choose a shopping committee. An alternative strategy is to first choose a driver. This can be done in n ways. There are n - 1 people left. From 0 to n - 1 of them can be chosen to add to the committee. That is, any subset of the n - 1 remaining people can be added to the shopping committee. Corollary 5.1 implies that there are $2^{n-1}$ subsets, so there are $n2^{n-1}$ ways to choose a shopping committee with driver. Equating these two counts establishes the desired identity.
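The two counting strategies can be compared numerically for small n. The Python sketch below is an illustration under the assumption that n is at least 1; the function names are hypothetical. One function sums over committee sizes and the other chooses the driver first.

```python
from math import comb


def committees_by_size(n):
    """Sum over sizes k of (ways to pick k members) * (k driver choices)."""
    return sum(k * comb(n, k) for k in range(1, n + 1))


def committees_driver_first(n):
    """Pick the driver (n ways), then any subset of the other n - 1 people."""
    return n * 2 ** (n - 1)


for n in range(1, 8):
    print(n, committees_by_size(n), committees_driver_first(n))
```

Agreement of the two columns for each n is numerical evidence for the identity; the combinatorial argument is what proves it for all n.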
Exercises 5.3.3
2. (b) 37 (generalized pigeon-hole: Find the smallest n such that $\lceil n/12 \rceil = 4$.)
5. 16 (Formally: Make 15 boxes, one per marriage. Now try to fill them without having 2 entries in a box.)
10. The set {1, 2, 3, ..., 2n} contains n even numbers and n odd numbers. We can get the maximum number of elements that differ by more than 1 by choosing all the even numbers (odd would also work). This collection still leaves out one integer. The final value must therefore be odd and hence will differ by 1 from its two neighbors.
15. Let R be the set of students whose resumes meet the company's standards. Let D be the set of students whose clothing was acceptable and let S be the set of students who are willing to accept the salary range offered.
(a) Using the inclusion-exclusion theorem, $65 = (5 + 30 + 44 + 28) - (18 + 15 + 16) + |R \cap D \cap S|$ so $|R \cap D \cap S| = 7$.
(b) A Venn diagram that uses the information found in part (a) makes this easy to answer. There are four students with acceptable resumes but unacceptable clothing and who were unwilling to work at the salary offered.
(b) (a red card, a black card) ME
9
10. Picking a card (b) P(the sum is 8) = W, since the dice can show (2, 6),
20 Salary Job Screening
20. Let $A_1$ be the set of all bit strings of length 7 with 1s in the first 5 positions. Let $A_2$ be the set of bit strings of length 7 with 1s in positions 2-6, and let $A_3$ be the set of bit strings of length 7 with 1s in positions 3-7. We want to count the number, x, of bit strings of length 7 that are not in $A_1 \cup A_2 \cup A_3$. There are $2^7 = 128$ bit strings of length 7. $|A_1| = |A_2| = |A_3| = 2^2 = 4$ since there are only two unspecified bits in each case. $|A_1 \cap A_2 \cap A_3| = 1$ since there is only one element in that set: 1111111. Finally, $|A_1 \cap A_2| = |A_2 \cap A_3| = 2$ (for example, $A_1 \cap A_2 = \{1111110, 1111111\}$), but $|A_1 \cap A_3| = 1$.
$$\begin{aligned} x &= 128 - |A_1 \cup A_2 \cup A_3| \\ &= 128 - [(|A_1| + |A_2| + |A_3|) - (|A_1 \cap A_2| + |A_1 \cap A_3| + |A_2 \cap A_3|) + |A_1 \cap A_2 \cap A_3|] \\ &= 128 - [(4 + 4 + 4) - (2 + 2 + 1) + 1] \\ &= 120 \end{aligned}$$
The bit strings with five consecutive 1s are: 1111100, 1111101, 1111110, 1111111, 0111110, 0111111, 0011111, 1011111.
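The inclusion-exclusion count can be confirmed by listing all 128 bit strings. This Python sketch is a brute-force check, with variable names chosen here for illustration. It collects the strings that contain five consecutive 1s and counts the rest.

```python
from itertools import product

# All bit strings of length 7.
strings = ["".join(bits) for bits in product("01", repeat=7)]

# A string lies in A1, A2, or A3 exactly when it contains the block 11111.
with_run = [s for s in strings if "11111" in s]
without_run = len(strings) - len(with_run)

print(len(strings), len(with_run), without_run)  # → 128 8 120
print(sorted(with_run))
```

The eight strings found this way are the same eight listed in the solution, and the remaining count is 120.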
Finite Probability Theory
Exercises 6.1.3
1. (b) Let the sample space be denoted by R for this problem. $R = \{L_1, L_2, S\}$. This sample space uses equally likely outcomes. It is also possible to use the sample space R = {L, S} with probabilities $\{\frac{2}{3}, \frac{1}{3}\}$, but this sample space is often harder to use properly.
(c) Since there are three distinct physical coins, it is best to use ordered triples for the model:
S = {(T, T, T), (T, T, H), (T, H, T), (T, H, H), (H, T, T), (H, T, H), (H, H, T), (H, H, H)}.
2. (b)
7. (b) The set of rational integers is empty. The complement is the set of all integers. Picking a card
(a) (a ten, a diamond)
1 S7
A45
i. With equally likely outcomes, each has theoretical probability I.
ii. P(L) = 23 P(S) =. 3 (These are theoretical with unequally likely outcomes.) (c) Again, using equally likely outcomes, the theoretical probability is I for each outcome. 6. (c) A number card or an ace that is, in either case, a diamond or a spade or a club. (This is a long description, but is easier than listing all 30 cards.)
(6, 2), (3, 5), (5, 3), or (4, 4).
11. A suitable sample space for this random experiment is {1, 2, 3, 4, 5, 6}. This is a sample space with unequally likely outcomes. The numbers in the set {1, 4} fit the criteria for x in the problem statement. The numbers in this set are twice as likely to occur as any of the numbers {2, 3, 5, 6}. Thus, assign a probability of $\frac{2}{8} = \frac{1}{4}$ to each of the outcomes 1 and 4, and assign a probability of $\frac{1}{8}$ to the remaining outcomes.
(a) P(subtracting 5 from the number produces a negative number) = $\frac{3}{4}$, since four numbers fit this criterion, two of which have been assigned a probability of $\frac{1}{4}$ and two of which have been assigned a probability of $\frac{1}{8}$.
12. There are two possible outcomes for the flip of the first coin, six possible outcomes for the roll of the die, and then two possible outcomes for the second flip of a coin. General counting principle 1 from Chapter 5 implies that there should be $2 \cdot 6 \cdot 2 = 24$ outcomes in the sample space for this random experiment. A typical outcome may look like {H, 3, T}, indicating that the first coin flip is a head, the roll of the die gives a 3, and the second coin flip gives a tail.
(b) P(the number is 2 and at least one head) = $\frac{3}{24} = \frac{1}{8}$, since the experiment can yield {H, 2, H}, {H, 2, T}, or {T, 2, H}.
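Both computations above can be reproduced with exact fractions. The Python sketch below is illustrative; the dictionary `weights` encodes the assumption stated in the solution that 1 and 4 are twice as likely as the other faces, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Exercise 11: faces 1 and 4 carry twice the weight of the other faces.
weights = {1: 2, 2: 1, 3: 1, 4: 2, 5: 1, 6: 1}
total = sum(weights.values())
prob = {face: Fraction(w, total) for face, w in weights.items()}
p_negative = sum(p for face, p in prob.items() if face - 5 < 0)

# Exercise 12(b): coin, die, coin, with all 24 outcomes equally likely.
outcomes = list(product("HT", range(1, 7), "HT"))
favorable = [o for o in outcomes if o[1] == 2 and "H" in (o[0], o[2])]
p_two_and_head = Fraction(len(favorable), len(outcomes))

print(prob[1], prob[2], p_negative)     # → 1/4 1/8 3/4
print(len(outcomes), p_two_and_head)    # → 24 1/8
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the fractions in the solutions.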
Exercises 6.2.3 2. This can be calculated directly from the reduced sample space. An alternative is to use probability computation formula 4 in the form P(B I A) -
P(A n B) (A) P
I have given both solutions. (b) P(4 I face card) = 0) = 0, or
3
= 0.
To
(c) P(4 I aredcard)=
13'°r o 6 26 13'L6 3 3. This can be calculated directly from the reduced sample space. An alternative is to use probability computation formula 4 in the form P(B I A) =
P(A n B) P(A)
I have given both solutions. (a) P(the third is a tail the second is a tail) (+) 1 8
2
=
=
2 or ½,
A46
Appendix G Solutions to Selected Exercises
5. Using formula 1, P(at least one vowel is chosen) = P(first
probabilities with unequally likely outcomes. For
letter is a vowel) + P(second letter is a vowel) - P(both 25 - 235 vowels)= 5 + 5
example, P(HTH) = .8 • .2 • .8 =.128.
6-Outcome 6. Using formulas I and 6, P(no trip in August) =HHH 26 26
676-6-76'
ucm
Probability .512
HHT
.128
HTH
.128
HTT THH
.032 .128
THT TTH TTT
.032 .032 .008
1 - P(trip in August) = I - (P(trip in August this year) +
P(trip in August next year)-P(trip in August both years)) (12
12
=
144
144'
8. (a) P(an even sum | one die is a 3) = $\frac{5}{11}$. In the 6-by-6 table of possible rolls, the column and row indexed by 3 constitute the restricted sample space. There are 11 outcomes in the restricted sample space, since (3, 3) is only counted once. Only five of these outcomes have an even sum.
9. Even numbers are twice as likely as odd numbers. If P(even) = x and P(odd) = y, then x = 2y and 3y + 3x = 1. Thus, $y = \frac{1}{9}$ and $x = \frac{2}{9}$.
(a) P(1,3)=
10. Picking a card
2. (a) The different flips of the coin and the roll of the die are all independent. Thus, general counting principle I
by formula 5
'. =
implies that there are 2 •2 •2 •6 = 48 distinct outcomes
(a) (a ten, a diamond) ind ) 4 P(a ten) 13
sample (b) in In the order for thespace. product of the number of heads and the
P(a ten Ia diamond) = 13 (b) (a red card, a black card) P(a red card) - 26 c= 21 P(a red card I a black card) - O = 0 13. (a) Two events which are mutually exclusive but not independent: Picking a card, with events "a red card" and "a black card." 16. Use computation formula 1. P(spade U 3) = P(spade) + P(3) - P(3 of spades) 1 4 13 4 -52 5+- 52-19. (a) Using the following table, there are two ways, (4, 6) and (6, 4), to gets a 2sum of1 10 with even numbers. The prbblt probability is 36 1
Exercises 6.3.1
18 2
3
4
6 7
digit on the die to be greater than 10, the digit on the die must be at least 4. Consider the three mutually exclusive cases: The digit on the die is 4, the digit on the die is 5, the digit on the die is 6. If the digit on the die is 4, then the number of heads must be 3. Since there is only one way to have three heads, there is one possible coin/die combination in this case (HHH4). Similarly, if the digit is a 5, there is only one possible coin/die combination (HHH5). If the digit on the die is 6, then the number of heads must be at least 2. Thus, there are 3 + 1 = 4 possible coin/die combinations in this case (HHT6, HTH6, THH6, HHH6). Therefore, the probability that the number of heads multiplied by the digit on the die is greater than 10 is 1+1+4 1 To answer the question 48 =8" in the problem statement, the probability that Jake's parents get tooco choose 7 -8"8 aainso spot issI - s the h vacation paet8e 3. (c) The possible combinations for the exact numbers of heads and tails such that they differ by at most two is shown in the table below. Number of Tails Number of Heads
1
2
3
4
5
5 6
2
3
4
5
6
7
8
3
4
5
6
7
8
9
4
4
4
5
6
7
8
9
10
3
5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
(b) i. P(sum is 10 n both even) = P(sum is 10) • P(both even m 36 3 T-8 ii. P(sum is 10 n both even) = P(both even) • P(sum is 9 .2 _1 10 both even) = 36 9 18
3. (a) {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} (b) The flips are determined by the weighting of the coin, not the order of the flip. Successive flips are independent. (c) I used computation formula 5 and theoretical
3 5 Noting that the three cases in the table are mutually exclusive, it is feasible to proceed as in part (b) and conclude that the probability that the number of heads and the number of tails differ by at most two is C(8,4)+C(8,3)+C(8,5)26
70+56+5566 91 .711. T2-8 256 5. (a) There are C(90, 7) = 7,471,375,560 ways that you can
256
-
select seven distinct integers. (b) There are C(7, 6) ways that you can select six of the seven winning integers. There are then C(83, 1) ways that you can select the remaining incorrect integer. Since these choices are independent, the probability that
Solutions to Selected Exercises you will win a prize at the fair is = 581
C(7, 6) •C(83, 1)
7,471,375,560
_•.0000000778.
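The binomial counts in Exercises 3(c) and 5 of Exercises 6.3.1 can be reproduced with the standard library. This is only a sketch: it assumes eight equally likely flips in 3(c) (suggested by the denominator 256), and the variable names are illustrative.

```python
from math import comb

# Exercise 3(c): eight flips, heads and tails differ by at most two.
favorable = comb(8, 4) + comb(8, 3) + comb(8, 5)
p_close = favorable / 2**8

# Exercise 5: choose 7 integers out of 90; exactly 6 of the 7 winners match.
tickets = comb(90, 7)
p_prize = comb(7, 6) * comb(83, 1) / tickets

print(favorable, round(p_close, 3))
print(tickets, p_prize)
```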
8. (a) There are six different spins, and repetition of the colors spun is permitted. Consequently, the number of spin combinations possible is $4^6 = 4{,}096$. (b) The probability that you will win two stuffed animals is 1 - P(less than 2 stuffed animals). How many ways can you win less than two stuffed animals? The number of ways to win no stuffed animals is $3^6$. The number of ways to win exactly one stuffed animal is $P(6, 1) \cdot 3^5$ because the spin number must be chosen and then the outcomes of the other five spins must be independently determined. Thus, the number of ways to win less than two stuffed animals is $3^6 + P(6, 1) \cdot 3^5$. Consequently, the probability that you will win two stuffed animals (i.e., the wheel pointer lands in the orange section twice) is $1 - \frac{3^6 + P(6,1) \cdot 3^5}{4{,}096} = 1 - \frac{729 + 1{,}458}{4{,}096} = 1 - \frac{2{,}187}{4{,}096} = \frac{1{,}909}{4{,}096}$. (c) The number of ways to spin the wheel such that you will win exactly one cookie and exactly one book is $P(6, 2) \cdot 2^4$, since the particular spins on which you win one cookie and one book need to be chosen among the distinct spins and the outcomes of the remaining four spins need to be independently determined. Therefore, the probability that you will win exactly one cookie and exactly one book (i.e., the wheel pointer lands in the red section exactly once and in the yellow section exactly once) is $\frac{P(6,2) \cdot 2^4}{4{,}096} = \frac{30 \cdot 16}{4{,}096} = \frac{15}{128} \approx .117$.
11. (a) We could add the probabilities of exactly 3 eights and exactly 4 eights. We could also subtract the probability of 2 or fewer eights from 1. Since we already know the probabilities for exactly 1 eight and for no eights, the second alternative is simpler. The number of crazy eights hands with exactly two eights is $C(4, 2) \cdot C(48, 5) = 10{,}273{,}824$, and so P(exactly two eights) = $\frac{10{,}273{,}824}{133{,}784{,}560} \approx .077$. The probability of at least three eights is therefore 1 - (P(no eights) + P(exactly one eight) + P(exactly two eights)) ≈ .006. The alternative approach is to add the probabilities of exactly 3 eights and exactly 4 eights: $\frac{C(4,3) \cdot C(48,4)}{133{,}784{,}560} + \frac{C(4,4) \cdot C(48,3)}{133{,}784{,}560} = \frac{778{,}320 + 17{,}296}{133{,}784{,}560} \approx .006$.
14. (b) All 5 cards are the same suit: pick a suit, then the 5 cards. The probability is $\frac{C(4,1) \cdot C(13,5)}{C(52,5)} = \frac{5{,}148}{2{,}598{,}960} \approx .002$.
19. (a) First count the number of 8-card hands containing exactly two 2s, exactly three 3s, and exactly three cards with a third common face value. There are four choices from which to select the two cards with a face value of 2. Similarly, there are four choices from which to pick the three cards with a face value of 3. There are then 13 - 2 = 11 face values left over, one of which we may assign to the remaining three cards. Finally, once this last face value is chosen, there are four choices from which we may select the remaining three cards for the hand. There are consequently (using general counting principle 1) $C(4,2) \cdot C(4,3) \cdot C(11,1) \cdot C(4,3) = \frac{4!}{2! \cdot 2!} \cdot \frac{4!}{3! \cdot 1!} \cdot \frac{11!}{1! \cdot 10!} \cdot \frac{4!}{3! \cdot 1!} = 1{,}056$ 8-card hands containing exactly two 2s, exactly three 3s, and exactly three cards with a third common face value. This implies that the probability that exactly two 2s, exactly three 3s, and exactly three cards with a third common face value are chosen is $\frac{1{,}056}{C(52, 8)} = \frac{1{,}056}{752{,}538{,}150} \approx .0000014$.
20. (a) For the difference between the digit on a die and the integer 3 to be positive, a 4, 5, or 6 must show up on at least one of the dice. There are $3^2 = 9$ ways for both digits on the dice to be among 4, 5, and 6. There are 2 · C(3, 1) · C(3, 1) = 18 ways for there to be exactly one of 4, 5, and 6 on a die, since exactly one of these three numbers can be on either of the two dice, and there are three choices for the last die. Thus, general counting principle 2 implies that there are 9 + 18 = 27 ways for the difference between the digit on a die and the integer 3 to be positive. This implies that the probability that this event will occur is $\frac{27}{36} = \frac{3}{4}$. Consequently, player A is more likely to win. An alternative approach is to let X be the event "x > 3" and Y be the event "y > 3." Then $P(X \cup Y) = P(X) + P(Y) - P(X \cap Y) = \frac{1}{2} + \frac{1}{2} - \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}$.
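The counts in Exercises 11, 19, and 20 above are small enough to confirm by direct computation. The sketch below uses only the standard library; the variable names are illustrative, and the dice check assumes two ordinary six-sided dice.

```python
from itertools import product
from math import comb

# Exercise 20(a): at least one of two dice shows a digit greater than 3.
rolls = list(product(range(1, 7), repeat=2))
favorable = sum(1 for x, y in rolls if x > 3 or y > 3)

# Exercise 19(a): two 2s, three 3s, three cards of one other face value.
hands = comb(4, 2) * comb(4, 3) * comb(11, 1) * comb(4, 3)

# Exercise 11(a): at least three eights in a seven-card hand.
three_plus = comb(4, 3) * comb(48, 4) + comb(4, 4) * comb(48, 3)

print(favorable, len(rolls))
print(hands, comb(52, 8))
print(round(three_plus / comb(52, 7), 3))
```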
25. (a) Taking into account that you must buy a pair of jeans if you buy any T-shirt(s), consider the four mutually exclusive cases: buy exactly one item, buy exactly two items, buy exactly three items, buy exactly four items. Within each of these last three cases, there are two subcases: Buy a T-shirt, do not buy a T-shirt. The number of ways to buy exactly one item is C(23, 1) (a T-shirt may not be purchased alone). The number of ways to buy exactly two items when one is a T-shirt is 1 (a T-shirt and jeans). The number of ways to buy exactly two items when neither is a T-shirt is C(23 + 2 - 1, 2) = C(24, 2). Moving right along, the number of ways to buy exactly three items when at least one is a T-shirt is C(23, 1) + 1, while the number of ways to buy exactly three items when none are T-shirts is C(23 + 3 - 1, 3) = C(25, 3). Finally, the number of ways to buy exactly four items when at least one is a T-shirt is C(23 + 2 - 1, 2) + C(23, 1) + 1 = C(24, 2) + C(23, 1) + 1, while the number of ways to buy exactly four items when none are T-shirts is C(23 + 4 - 1, 4) = C(26, 4). Consequently, general counting principle 2 implies that the number of acceptable distinct purchases is C(23, 1) + (1 + C(24, 2)) + (C(23, 1) + 1 + C(25, 3)) + (C(24, 2) + C(23, 1) + 1 + C(26, 4)) = 23 + 277 + 2,324 + 15,250 = 17,874.
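The four case totals in Exercise 25 can be recomputed term by term. This sketch mirrors the expressions in the solution rather than re-deriving them from the exercise statement; `multichoose` is an illustrative helper name for the with-repetition count C(n + r - 1, r).

```python
from math import comb

def multichoose(kinds: int, count: int) -> int:
    """Ways to choose `count` items from `kinds` kinds when repeats are allowed."""
    return comb(kinds + count - 1, count)

one = comb(23, 1)
two = 1 + multichoose(23, 2)
three = comb(23, 1) + 1 + multichoose(23, 3)
four = multichoose(23, 2) + comb(23, 1) + 1 + multichoose(23, 4)

print(one, two, three, four, one + two + three + four)
```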
Exercises 6.4.1
1. The ratios express the approximate odds of winning (including the possibility of a free ticket) and winning cash (no free tickets included). These approximate probabilities are, respectively, $\frac{1}{4.15} \approx .241$ and $\frac{1}{8.26} \approx .121$. These can also be computed by adding all but the first (respectively, first two) rows in the probability column of the table computed in the text. Thus the probability of winning something is approximately 1 - .7589 = .2411 and the probability of winning a cash prize is approximately .2411 - .12 = .1211.
5. (a) The table lists X, P(X), and X · P(X). Its rows include (2, .35, .70) and (5, .40, 2.00), and the X · P(X) column totals 4.70.
7. There are two prizes: a $500 car and a hearty wish for better luck next time. The probability of winning is $\frac{1}{2{,}000} = .0005$. The probability of losing is $\frac{1{,}999}{2{,}000}$. The expected value is thus $500 \cdot \frac{1}{2{,}000} + 0 \cdot \frac{1{,}999}{2{,}000} = \$0.25$. The average loss is 2.00 - .25 = $1.75 per ticket.
14. The expected profit per shirt is E(X) = -2 · .30 + 0 · .23 + 1 · .19 + 2 · .10 + 5 · .11 + 7 · .07 = $0.83. Thus, the estimated profit on the 5,000 shirts is 5,000 · $0.83 = $4,150.
16. (a) $E(D) = 28 \cdot \frac{1}{12} + 30 \cdot \frac{4}{12} + 31 \cdot \frac{7}{12} \approx 30.417$ (b) There are three months that start with a "J": January, June, and July. There are three months that count double and 9 months that count single. A denominator of 15 is appropriate. January and July have 31 days; June has 30. The expression below counts the "J" months first. $E(D) = 31 \cdot \frac{4}{15} + 30 \cdot \frac{2}{15} + 28 \cdot \frac{1}{15} + 30 \cdot \frac{3}{15} + 31 \cdot \frac{5}{15} \approx 30.467$
21. I will assume that exactly 1 million people participate, so that every possible number is chosen (there are $10^6$ numbers according to Counting Formula 2). I will assume that every number is equally likely to be chosen. The probability of losing is P(Losing) = $\frac{999{,}999}{1{,}000{,}000} = .999999$. The expected value is $999{,}999 \cdot \frac{1}{1{,}000{,}000} - 1 \cdot \frac{999{,}999}{1{,}000{,}000} = 0$. This is a fair game.
Exercises 6.5.1
1. (a) $P(\overline{T} \mid W) = \frac{P(\overline{T}) \cdot P(W \mid \overline{T})}{P(\overline{T}) \cdot P(W \mid \overline{T}) + P(T) \cdot P(W \mid T)} = \frac{(.998)(.15)}{(.998)(.15) + (.002)(.775)} \approx .9898$, or $P(\overline{T} \mid W) = 1 - P(T \mid W) \approx 1 - .0102 = .9898$
(b) $P(\overline{T} \mid \overline{W}) = \frac{P(\overline{T}) \cdot P(\overline{W} \mid \overline{T})}{P(\overline{T}) \cdot P(\overline{W} \mid \overline{T}) + P(T) \cdot P(\overline{W} \mid T)} = \frac{(.998)(.85)}{(.998)(.85) + (.002)(.225)} \approx .9995$, or $P(\overline{T} \mid \overline{W}) = 1 - P(T \mid \overline{W}) \approx 1 - .0005 = .9995$
3. P(M) = $\frac{1{,}709{,}919}{2{,}167{,}071} \approx .7890$, P(F) = $\frac{457{,}152}{2{,}167{,}071} \approx .2110$, P(U18 | M) = $\frac{516{,}494}{1{,}709{,}919} \approx .3021$, P(U18 | F) = $\frac{124{,}911}{457{,}152} \approx .2732$
(a) $P(M \mid U18) = \frac{P(M) \cdot P(U18 \mid M)}{P(M) \cdot P(U18 \mid M) + P(F) \cdot P(U18 \mid F)} = \frac{(.7890)(.3021)}{(.7890)(.3021) + (.2110)(.2732)} \approx .8053$
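The Bayes's formula computations in Exercises 6.5.1 all share one shape: a prior times a likelihood, divided by the sum of such products over every case. A minimal sketch follows, using the rounded values quoted for Exercise 3; the function name `posterior` and the dictionary layout are illustrative choices, not notation from the text.

```python
def posterior(priors, likelihoods, target):
    """Generalized Bayes's formula: P(target | evidence)."""
    total = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[target] * likelihoods[target] / total

# Exercise 3: P(M | U18) from P(M), P(F), P(U18 | M), P(U18 | F).
priors = {"M": 0.7890, "F": 0.2110}
under_18 = {"M": 0.3021, "F": 0.2732}
p_m_given_u18 = posterior(priors, under_18, "M")
print(round(p_m_given_u18, 4))
```

The same function handles the four-case and three-case problems later in this section by passing longer dictionaries.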
9. P(G | B) = .4, P(G | C) = .25, P(G | R) = .2, P(G | M) = .1
P(B) = .3, P(C) = .4, P(R) = .2, P(M) = .1
The generalized Bayes's formula should be used. I have used the spreadsheet in the previous problem to simplify work in part (b). I have shown the details for part (c).
(b) P(C | G) ≈ .3704
The next part uses the values P(D | X) = 1 - P(G | X).
(c) $P(R \mid D) = \frac{P(R) \cdot P(D \mid R)}{P(B) \cdot P(D \mid B) + P(C) \cdot P(D \mid C) + P(R) \cdot P(D \mid R) + P(M) \cdot P(D \mid M)} = \frac{(.2)(.8)}{(.3)(.6) + (.4)(.75) + (.2)(.8) + (.1)(.9)} \approx .2192$
11. Note that in the selection of symbols, S will represent T-shirt.
P(C) = .33, so P(A) = .67
P(P | A) = .04, P(S | A) = .21, P(T | A) = .65, P(V | A) = .10
P(P | C) = .09, P(S | C) = .28, P(T | C) = .32, P(V | C) = .31
(a) $P(A \mid V) = \frac{P(A) \cdot P(V \mid A)}{P(A) \cdot P(V \mid A) + P(C) \cdot P(V \mid C)} = \frac{(.67)(.10)}{(.67)(.10) + (.33)(.31)} \approx .3957$
(b) $P(C \mid T) = \frac{P(C) \cdot P(T \mid C)}{P(A) \cdot P(T \mid A) + P(C) \cdot P(T \mid C)} = \frac{(.33)(.32)}{(.67)(.65) + (.33)(.32)} \approx .1952$
(d) $P(C \mid S) = \frac{P(C) \cdot P(S \mid C)}{P(A) \cdot P(S \mid A) + P(C) \cdot P(S \mid C)} = \frac{(.33)(.28)}{(.67)(.21) + (.33)(.28)} \approx .3964$
(e) $P(A \mid P) = \frac{P(A) \cdot P(P \mid A)}{P(A) \cdot P(P \mid A) + P(C) \cdot P(P \mid C)} = \frac{(.67)(.04)}{(.67)(.04) + (.33)(.09)} \approx .4743$
18. P(White) = .781, P(Black) = .103, P(Other) = .116
P(Low income | White) = .078, P(Low income | Black) = .278, P(Low income | Other) = .192
(a) $P(W \mid L) = \frac{P(W) \cdot P(L \mid W)}{P(W) \cdot P(L \mid W) + P(B) \cdot P(L \mid B) + P(O) \cdot P(L \mid O)} = \frac{(.781)(.078)}{(.781)(.078) + (.103)(.278) + (.116)(.192)} \approx .5448$
(b) $P(B \mid L) = \frac{P(B) \cdot P(L \mid B)}{P(W) \cdot P(L \mid W) + P(B) \cdot P(L \mid B) + P(O) \cdot P(L \mid O)} = \frac{(.103)(.278)}{(.781)(.078) + (.103)(.278) + (.116)(.192)} \approx .2561$
(c) $P(O \mid L) = \frac{P(O) \cdot P(L \mid O)}{P(W) \cdot P(L \mid W) + P(B) \cdot P(L \mid B) + P(O) \cdot P(L \mid O)} = \frac{(.116)(.192)}{(.781)(.078) + (.103)(.278) + (.116)(.192)} \approx .1992$
Many people do not expect over half the poor to be white. The high probability of being white offsets the low probability of being poor, given white.

G.7 Recursion
Exercises 7.1.7
1. (a)
    integer recursiveSum(integer n)
      if n <= 1
        return 1
      else
        return n + recursiveSum(n-1)
    end recursiveSum
(b) It does use tail-end recursion.
(c)
    integer loopSum(integer n)
      sum = 0
      i = 0
      while i < n
        sum = sum + i + 1
        i = i + 1
      return sum
    end loopSum
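For readers who want to run the two versions from Exercise 1, here is a minimal Python rendering. It is a sketch rather than the book's code: the snake_case names are illustrative, and the base case is taken to be n ≤ 1.

```python
def recursive_sum(n: int) -> int:
    """Return 1 + 2 + ... + n by recursion (for n >= 1)."""
    if n <= 1:
        return 1
    return n + recursive_sum(n - 1)

def loop_sum(n: int) -> int:
    """Return the same sum using a loop instead of recursion."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

print(recursive_sum(10), loop_sum(10))
```

Both functions agree with the closed form n(n + 1)/2, which gives a quick way to test them.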
5. (a) The complete diagram still has two invocations that lead to a sum of 4. However, there are more invocations than were done in Quick Check 7.1.
[Diagram: tree of recursive invocations, starting from the set {1, 3, 4}]
(b) Choosing the larger element is more efficient. There are more opportunities to eliminate some recursions at lines 5 or 7, since the recursive invocations at line 15 are producing sets with small sums faster than if small elements were eliminated.
7. (a) The following base conditions are sufficient:
* gcd(0, n) = n
* gcd(1, n) = 1, where n ≥ 1.
(b)
    integer gcd(integer a, integer b)
      if (a > b)
        swap a and b
      if a == 0
        return b
      if a == 1
        return 1
      return gcd(b mod a, a)
    end gcd
14.
    real aToAPowerOf2(real a, natural number n)
      if n == 0
        return a
      else
        return aToAPowerOf2(a, n - 1) * aToAPowerOf2(a, n - 1)
    end aToAPowerOf2
20. Page 1: [1,10]
r = .01, S(f, 1, 10) ≈ 2.74091, S(f, 1, 5.5) ≈ 1.80944, and S(f, 5.5, 10) ≈ 0.59846, so |whole - left - right| = 0.333008 > 10 · .01.
Page 2: [1,5.5] (Page 3: [5.5,10] is pending.)
r = .005, S(f, 1, 5.5) ≈ 1.80944, S(f, 1, 3.25) ≈ 1.19627, and S(f, 3.25, 5.5) ≈ 0.526424, so |whole - left - right| = 0.08675 > 10 · .005.
Page 4: [1,3.25]
r = .0025, S(f, 1, 3.25) ≈ 1.19627, S(f, 1, 2.125) ≈ 0.755735, and S(f, 2.125, 3.25) ≈ 0.424997, so |whole - left - right| = 0.0155343 < 10 · .0025. The algorithm returns 0.755735 + 0.424997 = 1.18073 as the value of $\int_1^{3.25} \frac{1}{x}\,dx$.
Page 5: [3.25,5.5]
r = .0025, S(f, 3.25, 5.5) ≈ 0.526424, S(f, 3.25, 4.375) ≈ 0.297271, and S(f, 4.375, 5.5) ≈ 0.228847, so |whole - left - right| = 0.000306028 < 10 · .0025. The algorithm returns 0.297271 + 0.228847 = 0.526118 as the value of $\int_{3.25}^{5.5} \frac{1}{x}\,dx$.
Back to Page 2
Page 2 returns $\int_1^{3.25} \frac{1}{x}\,dx + \int_{3.25}^{5.5} \frac{1}{x}\,dx = 1.18073 + 0.526118 = 1.706848$.
Page 3: [5.5,10]
r = .005, S(f, 5.5, 10) ≈ 0.59846, S(f, 5.5, 7.75) ≈ 0.342984, and S(f, 7.75, 10) ≈ 0.254901, so |whole - left - right| = 0.00057522 < 10 · .005. The algorithm returns 0.342984 + 0.254901 = 0.597885 as the value of $\int_{5.5}^{10} \frac{1}{x}\,dx$.
Back to Page 1
The final result is 1.706848 + 0.597885 = 2.30473, using the intervals [1,3.25], [3.25,5.5], and [5.5,10]. The actual value is 2.30259, with an error in the adaptive quadrature estimate of approximately -.00214.
Exercises 7.2.5
2.
    integer Fibonacci(integer n)
      if (n == 0) or (n == 1)
        return 1
      else
        return Fibonacci(n-1) + Fibonacci(n-2)
    end Fibonacci
This is not an efficient algorithm; it is an example of redundant recursion.
3. (b)
$a_n = n \cdot a_{n-1}$
$= n((n-1) \cdot a_{n-2})$   substitute
$= (n(n-1)) \cdot a_{n-2}$   simplify
$= (n(n-1)) \cdot ((n-2) a_{n-3})$   substitute
$= (n(n-1)(n-2)) \cdot a_{n-3}$   simplify
...
$= (n(n-1)(n-2) \cdots (n-k+1)) \cdot a_{n-k}$
...
$= (n(n-1)(n-2) \cdots 2 \cdot 1) \cdot a_0$
$= n! \cdot a_0$
(c)
$a_n = 2a_{n-2}$
$= 2(2a_{n-4})$   substitute
$= 2^2 a_{n-4}$   simplify
$= 2^2(2a_{n-6})$   substitute
$= 2^3 a_{n-6}$   simplify
...
$= 2^k a_{n-2k}$
If n is even, this terminates when $k = \frac{n}{2}$. If n is odd, this terminates when $k = \frac{n-1}{2}$. Thus $a_n = 2^{n/2} a_0 = -\left(2^{n/2}\right)$ if n is even, and $a_n = 2^{(n-1)/2} a_1 = 3 \cdot 2^{(n-1)/2}$ if n is odd.
The first few values of $\{a_n\}$ are: {-1, 3, -2, 6, -4, 12, -8, 24, ...}. The recurrence relation essentially interleaves two distinct sequences.
4. (c)
$a_n = 5a_{n-1} + n$
$= 5(5a_{n-2} + (n-1)) + n$   substitute
$= 5^2 a_{n-2} + (5(n-1) + n)$   simplify
$= 5^2(5a_{n-3} + (n-2)) + (5(n-1) + n)$   substitute
$= 5^3 a_{n-3} + (5^2(n-2) + 5(n-1) + n)$   simplify
...
$= 5^k a_{n-k} + (5^{k-1}(n-k+1) + 5^{k-2}(n-k+2) + \cdots + 5(n-1) + n)$
...
$= 5^n a_0 + (5^{n-1} \cdot 1 + 5^{n-2} \cdot 2 + \cdots + 5(n-1) + n)$
$= 5^n a_0 + \sum_{k=1}^{n} k \cdot 5^{n-k}$   the next step is not trivial; more details follow
$= 5^n a_0 + \frac{5^{n+1} - 4n - 5}{16}$
Some hints for the final simplification: $\sum_{k=1}^{n} k \cdot 5^{n-k} = 5^n \sum_{k=1}^{n} \frac{k}{5^k}$ and $\frac{k}{5^k} = \frac{1}{5^k} + \cdots + \frac{1}{5^k}$ (k times). Now rearrange terms to get $5^n \left( \sum_{k=1}^{n} \frac{1}{5^k} + \sum_{k=2}^{n} \frac{1}{5^k} + \cdots + \sum_{k=n}^{n} \frac{1}{5^k} \right)$. This can be expressed in terms of a collection of geometric series: $5^n \sum_{j=1}^{n} \sum_{k=j}^{n} \frac{1}{5^k}$. Using the formula for the sum of a geometric series, together with some algebraic simplification, leads to the final result.
11. (a) The characteristic equation is $x^2 + 2x - 15 = 0$, having roots $r_1 = 3$ and $r_2 = -5$. This has a general solution in the form $a_n = \theta_1 3^n + \theta_2 (-5)^n$. The system of linear equations is
$\theta_1 + \theta_2 = 2$
$3\theta_1 - 5\theta_2 = -2$.
Using the substitution $\theta_2 = 2 - \theta_1$, it is easy to find $\theta_1 = 1$ and hence $\theta_2 = 1$. The solution is therefore $a_n = 3^n + (-5)^n$ for n ≥ 0.
12. (c) The characteristic equation is $x^3 - 6x^2 + 11x - 6 = 0$. Possible rational roots are ±1, ±2, ±3, ±6. It is easy to verify that 1 is a root, so we can factor this as $(x - 1)(x^2 - 5x + 6) = 0$ [using a simple polynomial division of $x^3 - 6x^2 + 11x - 6$ by $x - 1$]. An easy factorization now produces $(x - 1)(x - 2)(x - 3) = 0$. The roots are therefore $r_1 = 1$, $r_2 = 2$, and $r_3 = 3$. The general solution is $a_n = \theta_1 1^n + \theta_2 2^n + \theta_3 3^n = \theta_1 + \theta_2 2^n + \theta_3 3^n$. The system of linear equations is
$\theta_1 + \theta_2 + \theta_3 = 3$
$\theta_1 + 2\theta_2 + 3\theta_3 = 4$
$\theta_1 + 4\theta_2 + 9\theta_3 = 6$.
Subtract the first equation from the second and subtract the first equation from the third equation. The resulting reduced system is
$\theta_2 + 2\theta_3 = 1$
$3\theta_2 + 8\theta_3 = 3$.
Subtracting 3 times the first equation (of the reduced system) from the second allows us to determine that $\theta_3 = 0$. Substituting this result into the second equation (of the reduced system) allows us to determine that $\theta_2 = 1$. Substituting these results into any of the original equations results in $\theta_1 = 2$. The solution is $a_n = 2 + 2^n$.
17. (a) Let $B_n$ stand for the number of distinct bit strings of length n that do not contain three consecutive 1s. $B_0 = 1$ since there is only one way to have a bit string with no bits. Additionally, $B_1 = 2$, since both the bit strings 0 and 1 do not have three consecutive 1s. $B_2 = 4$, since the bit strings 00, 01, 10, and 11 do not have three consecutive 1s. If n ≥ 3, the number of distinct bit strings of length n that do not contain three consecutive 1s is equivalent to the sum of the number of bit strings of length n that do not contain three consecutive 1s that end with a 0 and the number of bit strings of length n that do not contain three consecutive 1s that end with a 1.
Bit strings of length n that do not contain three consecutive 1s that end with a 0 are just the bit strings of length n - 1 with no three consecutive 1s with a 0 added at the end. There are $B_{n-1}$ such bit strings. The number of bit strings of length n that do not contain three consecutive 1s and end in 1 is equivalent to the sum of the number of bit strings of length n that do not contain three consecutive 1s that end with a 01 and the number of bit strings of length n that do not contain three consecutive 1s that end with a 11. Bit strings of length n that do not contain three consecutive 1s that end with a 01 are just the bit strings of length n - 2 with no three consecutive 1s with a 01 added at the end. There are $B_{n-2}$ such bit strings. Bit strings of length n that do not contain three consecutive 1s that end with a 11 must have 0 as their (n - 2)nd bit. Thus, these are just the bit strings of length n - 3 with no three consecutive 1s with a 011 added at the end. There are $B_{n-3}$ such bit strings.
* $B_0 = 1$, $B_1 = 2$, $B_2 = 4$
* $B_n = B_{n-1} + B_{n-2} + B_{n-3}$ for n ≥ 3
(b) The characteristic equation is $x^3 - x^2 - x - 1 = 0$.
3. Since h E 0 (g), there are constants, c and nl, such that, for all n > n 1 , Ih(n)I < clg(n)J. Thus, for all n > max(no, n1), If (n) I < Ih(n) I < c lg (n) I. Consequently, f c 0 (g). 5. Since logv (z) - I < 0, ( I
xlogy(z)-- < X logy(z)J < xlogy(z)+l
0. x-- - - =
2
The rational roots theorem indicates that the possible rational roots are ±1. It is easy to verify that neither is a root. In more advanced courses you might learn how to determine that this polynomial has one real and two complex roots. It is still possible to apply the linear homogeneous solution process, but the algebraic details are pretty nasty. If you try back substitution, you might find it difficult to see any patterns emerge. Techniques to solve this recurrence relation are presented in Section 7.4.
18. (b) The characteristic equation is $x^3 - 8x^2 + 21x - 18 = 0$. The rational roots theorem indicates that the possible rational roots are ±1, ±2, ±3, ±6, ±9, ±18. It is easy to verify that ±1 are not roots. However, $r_1 = 2$ is a root. So $x^3 - 8x^2 + 21x - 18 = (x - 2)(x^2 - 6x + 9)$
Exercise 3 implies that f E 0 (-xlogy(Z)). Now let h 2 (z) = x log(z)-J. Then Problem 4 implies (Z) that f a x Combining these results, it is valid to conclude that e a (xl0gy(z)).
8. Use Theorem 7.9. Part
a
b
c
d
v
a vsbv
0
(a)
3
2
4
5
2
3 21
0
b
0
(n 10g2 (3))
13. (a) a = 1, b = c = 2, and d = 1. Theorem 7.10 implies 2 that f 0- ( 2 (n)] ).-([log 16. shuffle (a) There are no data copies in the base case, so f(l) = 0 (d = 0). There are three recursions, each processing one third of the original list, so a = b = 3. There are n data copies after the excursions, soIc • f(n) = 3f (a) + n f(l)0 (b) Theorem 7.9 (with a = 3
which easily factors completely as $(x - 2)(x - 3)^2$. Thus, $r_2 = 3$ has multiplicity $v_2 = 2$ and $v_1 = 1$. The general solution has the form $a_n = \alpha_0 2^n + (\beta_0 + \beta_1 n) 3^n$.
xxtgy(z)+l• Then
x~togy(z)] and let hI (z)
Let f(z)
f
It is now possible to create the recurrence relation.
3
Exercises 7.3.1
-
31
-
. Thus,
= v
by) implies that
f E t9 (n log 3 (n)).
Exercises 7.4.1 2. (a) The product is
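The system of linear equations is
$\alpha_0 + \beta_0 = 1$
$2\alpha_0 + 3\beta_0 + 3\beta_1 = 2$
$4\alpha_0 + 9\beta_0 + 18\beta_1 = 1$.
The solution is $\alpha_0 = -2$, $\beta_0 = 3$, $\beta_1 = -1$, so the explicit formula for the sequence is $a_n = -2 \cdot 2^n + (3 - n) 3^n = -2^{n+1} + (3 - n) 3^n$.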
Zk k(0
j
k0
k=0
j)Z
c
k(k + 1)k 2
Solutions to Selected Exercises requires a use of Theorem 7.11.
4. Assume n > 0.
7-
an = (n - l)an-I + 1 2 = (n - ])[(n - )an2 + 1]+ I substitute
1)(n
= (n =
(n
=
(,i-
1) +1
2)an 2 + (n-
-
4-(n
1)+
-
77
1 substitute k=0
A(z)
3"
+ (n - 1)(n - 2) +... +(n- 1)(n - 2).. (n-
Thus (n-
7. forn>O. -_o n >0 2 2 10. (d) Multiply by zn and sum from n = 2: an =~3n ..9 .
1))
P(n - 1, r)
r=0O n-iIc
anzn = -5 1
This leads to A(z) - ao - al z = -5z(A(z) - ao) + 36z 2 A(z) so
o
3
k
I
1)k(k
1 ) 3 kzk
A(z)(1 + 5z - 36z
-
108z
3
+ 405z 4
n=1
1458z
-
5
+ •
.
•
0c
Oc
c
3an-lzn + Z
Zanzn=
A(z) =
an-lzn
+9z)(1 -4z)
(3+3z) ((
----(3 + 3z) + 7
cc1
Zn nI
anzn +7
=3ZT
3 ao + (5 a0 + al)z= 3 + z.
7zn
n=1 = 3z Z
2)
Thus
The series starts as 2 1 - 6z + 27z
j=0
j=1
k,)3kzk _ k=0+•/ S_)k k)3kzk
aJzj"
ajzi + 36z2 Y
=-5z E
(-2
=( k=0
n=2
n=2
n=2
r=O
9. (b)
c an- Izn +36 Y an_2Zn
0cc
Y
P(n - 1, r)
=
zk 2
n-i
5. (b) (1 + 3z)
k-
l+ (n- 1)
--1)(n--2)...(n--n)aO+
6.3k6
-
2 (3
--
+ (n-i1) +
2
zk.
-
Combining
=(n - 1)(n -2)... (n - n)a 0 + (n - 1)(n - 2) •-(n -- (n -+) +. + (n -- 1)(n -- 2)k=
= E
( 3 k+1
2
1)(n - 2)(n - 3)an-3 + (n - 1)(n - 2)
Y
k
k=0 \j=0
+ (n-1)+ 1 simplify
=00 +
L 3kZk
Zk 07
simplify
1)(n - 2)[(n - 3)an_3 + 1]
-
= (n
7
z
-
E k=O
1(-9)j4k-j
'k
cc
= (3+
4k
3z) k=
n=0
zk
9j
-
j=0
Zk
so A(z)-ao=3zA(z)+
Solving for A(z):
6
-z
7
+
4
kzk)
+
_
= (3 + 3z)
4
Zk zk+i)
Icc
k=0 1
(L
03(l+z)
A-3z + (0-z)(0-3z)
The first term expands to -6 E00k= 3 kzk (using the table of useful generating functions). The second term
(3
3 13
( k=0
9
-(-9 )k+l
)k+l
-
(
4 k+)
9
zk
)k + 4 k)l + 4 k zk
Appendix G Solutions to Selected Exercises 3 '"~ 0 ( 13 =1
5k4
)
E oiS- )
) k'~
) "4nforn>0.
)n +15
13
13
38 12. The solution is the coefficient of z in the expansion of
30
7( 5j ( 3 0k z1I: z0k
Ezi
i=0
\k=0
I25mThey z ) \m=0
4
+z
5
+2z
6
+2z
+2z
7
11
8
19
+9z
20
+9z
2
1 +9z
22
+92z24 + 13 z25 + 13 z26 + 13 227 + 13 z28
+ 18 z34 + 24 z35 + 24 z36 + 24 z
+ 24 z
3
4
243
1
5
2 4 1 5
3
6
2 4 6 3 1
5
7
246153
7
8
2 4 6 8 3 75
1
9 24681597 3 represent the position of the second-to-last person. It appears that a similar pattern is developing. Notice
38
n
3
4
5
6
7
8
9
10
11
12
pn
1 3
5
1 3
5
7
9
11
1
G.8 Combinatorics
* 13 z29 + 18 z30 + 18 z31 + 18 z32 + 18 z33 37
1
21
n =32 m +i,where0 < i < 3.2 m , thenpn =2i+ 1. This fits the table developed so far.
23
+9z
2
3
The divisions in the pattern appear to be where n = 3 •2m. In analogy with the solution to the modified Josephus problem, we might conjecture that if
+ 2 z ± 2 z9 + 4Z10 + 4 z I + 4 12 + 4z3 18 17 16 14 +65 15 + 6 26 + 4 24 +6z + 62 +6z
I
that n = 2 is a special case (don't eliminate anyone), so general pattern. The base value part of P2P3= =2 is1.not is Extend thethe table: i
Mathematiea was used to determine that the desired coefficient is 24. The valid part of the expansion is l+z+z +z
in
00(2 5.4)Z)2_~ i-4•z
Consequently, an=(24(
Elimination Sequence -
n
.
Exercises 8.1.4 18. (a) Since(1-z-6z 2 )=(1-3z)(I+2z),
2. (b) p(8,5)=p(7,4)+p(3,5)=3+0=3 (a) p(5) = 7
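The identity used in part (b), p(n, k) = p(n - 1, k - 1) + p(n - k, k), translates directly into a short memoized function. The sketch below assumes p(n, k) counts partitions of n into exactly k positive parts, which matches the values quoted above; the helper `partitions` is an illustrative name for the total p(n).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n: int, k: int) -> int:
    """Number of partitions of n into exactly k positive parts."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    # Either some part equals 1 (remove it), or every part is at least 2
    # (subtract 1 from each of the k parts).
    return p(n - 1, k - 1) + p(n - k, k)

def partitions(n: int) -> int:
    """Total number of partitions of n, summed over the number of parts."""
    return sum(p(n, k) for k in range(1, n + 1))

print(p(8, 5), p(7, 4), p(3, 5), partitions(5))
```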
B3. / A 3z Z2 -- z + I 1-z- 6z 2 I I +2z 5
=Z(
+
k=I k=2
65 )
5=5
( I +2z5=3+2 3z
(5) 1-3z
(5
1+• 2z"
Using Table 7.10, this can be written as 3nzn 5 (9) (0 Z
+ (
Z
(-1)n2nzn
+
The summations can be combined, leading to the generating function 3 r 5 E [3n~l + (- 1 )n 2 n+ljZn ,=0
5
3 2n+l zn+l 5
n=O
Exercises 7.5.1 4. (a) Look at the diagonal elements in the table:
5=4+1
k=3 5=3+1+1
k=4 5=2+1+1+1
k=5 5=1+1+1+1+1
5=2+2+1
4. (a) p(n, k) satisfies the recurrence relation $p(n, k) = \sum_{i \le k} p(n - k, i)$
12! · 11! ··· 2! · 1!, there are certainly many choices for a Latin square of order 12.
15. (d) Definitely true. The maximum number of mutually orthogonal Latin squares of order n is at most n - 1. The number of distinct Latin squares of order n is at least n! · (n - 1)! ··· 2! · 1!, which is larger than n - 1 for n > 1 (it quickly becomes much larger).
18. Since L does not contain p, the lines $L_1, L_2, \ldots, L_n, L_{n+1}$ that contain p must each contain a common point with L (by FPP2). Suppose $L_i$ and $L_j$ both share the point q with L. This would violate FPP2, since then $L_i$ and $L_j$ would both contain the distinct points p and q. There must therefore be at least n + 1 distinct points on L. Denote the point that is on both L and $L_i$ as $p_i$, for i = 1, 2, ..., n + 1. Suppose there is another point, x, on L. Since x ≠ p, it must be on a common line, $L_j$, with p (by FPP1). The distinct points x and $p_j$ both are on L and on $L_j$, violating FPP1 (and FPP2). This contradiction means that the only points that are on L are $p_1, p_2, \ldots, p_{n+1}$, completing the proof.
21. Suppose every point is on either $L_1$ or $L_2$. By FPP2, there is a unique point, p, that is on both $L_1$ and $L_2$. By FPP3, there must exist 4 points, no three of which are on a common line. This means that two of those points, $p_1$ and $p_1'$, must be on $L_1$, and two of the points, $p_2$ and $p_2'$, must be on $L_2$. Also, $p \notin \{p_1, p_2, p_1', p_2'\}$ (or there would be three points on a common line).
29. s(6, 3) = -225
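The value s(6, 3) = -225 can be reproduced from the usual recurrence for signed Stirling numbers of the first kind, s(n, k) = s(n - 1, k - 1) - (n - 1) · s(n - 1, k). This is a sketch under the assumption that the text's s(n, k) is that signed number (the negative value suggests so); the function name is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling_first(n: int, k: int) -> int:
    """Signed Stirling number of the first kind, s(n, k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling_first(n - 1, k - 1) - (n - 1) * stirling_first(n - 1, k)

print(stirling_first(6, 3))
```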
Exercises 8.2.4
3. (b) Orthogonal.

(1,3) (2,2) (3,1)
(2,1) (3,3) (1,2)
(3,2) (1,1) (2,3)

5. (b) Not self-orthogonal.

1 2 3     1 2 3     (1,1) (2,2) (3,3)
2 3 1     2 3 1     (2,2) (3,3) (1,1)
3 1 2     3 1 2     (3,3) (1,1) (2,2)

[Figure for Exercise 21: the points p, p1, p2 and the lines L1, L2, L3, L4]
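The two answers above amount to counting the distinct ordered pairs produced when one square is superimposed on another. A minimal sketch follows, assuming "self-orthogonal" means orthogonal to its own transpose; the squares `first` and `second` are read off the pair table in Exercise 3(b), and the function names are illustrative.

```python
def orthogonal(a, b):
    """True if superimposing squares a and b gives n*n distinct ordered pairs."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

def transpose(square):
    """Swap the rows and columns of a square given as a list of rows."""
    return [list(row) for row in zip(*square)]

first = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
second = [[3, 2, 1], [1, 3, 2], [2, 1, 3]]

print(orthogonal(first, second))
print(orthogonal(first, transpose(first)))
```

Because `first` is symmetric, superimposing it on its transpose only ever produces the pairs (1,1), (2,2), and (3,3), which is why the second check fails.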
7. If two rows are swapped, the numbers in those rows will remain unchanged, so every row will still contain every
By FPP1, there must be a line, $L_3$, that contains $p_1$ and $p_2$ and another line, $L_4$, that contains $p_1'$ and $p_2'$. The lines $L_3$ and $L_4$ must contain a common point, q, by FPP2. That point cannot be in $\{p_1, p_1', p_2, p_2'\}$ (for example, if the common point were $p_1$, then both $L_1$ and $L_4$ would contain $p_1$ and $p_1'$, contradicting FPP2). Since q is on a common line, $L_3$, with $p_1$, it cannot be on $L_1$. Since q is on the common line, $L_3$, with $p_2$, it cannot be on $L_2$. Thus, q is the desired point.
26. (d) The duality principle does not make this claim. It makes an assertion about the statements of theorems. However, it is true that interchanging points and lines in an actual finite projective plane results in another finite projective plane. The reason is that the axioms (together with FPP3′) make points and lines entirely symmetrical objects. It is merely a matter of convention that we use dots and lines to represent the plane. (e) Theorem 8.8 is not an existence theorem. It merely states that if a finite projective plane with one condition exists, then the other conditions are also true. In fact, the next section will show that there is no finite projective plane with 43 points and 43 lines, that is, there does not exist a finite projective plane of order 6.
Exercises 8.4.1
1. greedy: Item X has maximum benefit, so it is added first. No other items will fit, so the total benefit is 4.
sophisticated greedy: The expanded data table is shown below.

item        X     Y     Z
benefit     4     3     2
volume      5     4     3
quantity    1     1
b/v ratio   .8    .75   .667

This algorithm will choose item X, for a total benefit of 4.
Knapsack: The algorithm produces the following tables.

          T                      K
w     X   Y   Z    B(w)      X   Y   Z
0     0   0   0     0        0   0   0
1     0   0   0     0        0   0   0
2     0   0   0     0        0   0   0
3     0   0   2     2        0   0   1
4     0   3   2     3        0   1   0
5     4   3   2     4        1   0   0
6     4   3   4     4        1   0   0
7     4   5   5     5        0   1   1

Exercises 8.3.2
4. (a) (10, 15, 6, 4, 2) Potential. (b) (15, 12, 5, 3, 2) Not possible. bk ≠ vr, r(k - 1) ≠ λ(v - 1), Fisher's inequality fails.
7. (b) (22, 7, 2) Not possible. These parameters pass the tests in Theorem 8.11. Since v is even and k - λ = 7 - 2 = 5 is not a square, it fails the test in the Bruck-Ryser-Chowla theorem. (c) (15, 7, 3) Potential. The relationship r(k - 1) = λ(v - 1) is 7 · 6 = 3 · 14, which is true. The Bruck-Ryser-Chowla theorem requires the equation $z^2 = 4x^2 - 3y^2$ to have an integer solution. Among many other solutions, x = y = z = 1 is easy to spot.
10. Since v = 43 is odd, the design won't exist unless the equation $z^2 = 6x^2 - y^2$ has a solution in integers, x, y, and z, not all 0. The equation can be written as $6x^2 = y^2 + z^2$. It is beyond the scope of this text to prove that this Diophantine equation has no non-trivial integer solutions.
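The elementary tests applied in Exercise 4 (bk = vr, r(k - 1) = λ(v - 1), and Fisher's inequality b ≥ v) are easy to automate. The sketch below assumes the parameter tuples are listed in the order (v, b, r, k, λ); the function name `bibd_checks` is illustrative. Passing all three checks only shows that a design is potentially possible, not that one exists.

```python
def bibd_checks(v, b, r, k, lam):
    """Evaluate three necessary conditions for a (v, b, r, k, lambda) design."""
    return {
        "bk = vr": b * k == v * r,
        "r(k-1) = lambda(v-1)": r * (k - 1) == lam * (v - 1),
        "Fisher (b >= v)": b >= v,
    }

print(bibd_checks(10, 15, 6, 4, 2))   # Exercise 4(a)
print(bibd_checks(15, 12, 5, 3, 2))   # Exercise 4(b)
```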
4. (a) The optimal packing is 2 item X's and I item Y, for a total benefit of 13.
15. (a) The residual design has parameters (10,15,6,4,2). It is as follows. B
B
g
i
i
h
k
1
B1
i
m j n
o p
B* B* 1 42 3 5 h g j i m m joon p o0
B* 6 h k n o0
B
7 h I
m p4 p
B
8 g k 1
The optimal packing is item Y and item Z, for a total benefit of 5.
T
B
w
X
Y
Z
B(w)
X
Y
Z
0
0 0
0 0
0 0
0 0
0 0
0 0
0 0
2
0
91 h k I14
K
3 5
1
1
001
4 4
1 0 5 22
4 5
100 0 11
5
5
5
5
1
0
1
2
0
0
0
0
B*
B*
B*
B*
6
8
6
6
g
B* 1 g
8
i
g
j
j
7
9
9
6
9
1
1
0
1
k
j
h
k
1
8
9
10
9
10
0
2
0
10
10
10
10
0
2
0
13
13
I1
13
2
1
0
B*
B10
12
13
14
15
m
o
n
n
m
n
9
n
p
p
p
p
o
10
Solutions to Selected Exercises 8. Notice that a typical mall will have (for all practical purposes) unlimited quantities of each item. The knapsack problem can be represented with a knapsack having capacity 100 and item benefits, volumes, and quantities as shown. mug watch jean skirt item benefit
5
4
3
volume
30
20
10
quantity
0
00
0
book
swim suit
ring
3
3
volume
14
21
5 55
quantity
0
oc
00
earrings
backpack
stationary
benefit
2
2
1
volume
12
28
5
item benefit
item
item #   benefit   volume   quantity
X        6         3        1
Y        2         1        3

The optimal benefit is 6, but it can be achieved by packing one item X or three item Y's. So the optimal packing is not unique.
13. The proposed heuristic produces an inequality that simplifies to 11 > 10. Since this is true, the proposed heuristic would suggest packing an item X. There is no room for any other items, so the total benefit is 11. However, it is easy to see that packing two item Y's and one item Z produces a total benefit of 11.1, so the proposed heuristic fails to generate the optimal packing.
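The counterexample in Exercise 11(a) (capacity 3, item X with benefit 6, volume 3, quantity 1, and item Y with benefit 2, volume 1, quantity 3) can be confirmed with a small dynamic program. This is a generic bounded-knapsack sketch, not the table-based algorithm from the text; the function name and data layout are illustrative.

```python
def bounded_knapsack(capacity, items):
    """Best total benefit; items maps name -> (benefit, volume, quantity)."""
    best = [0] * (capacity + 1)
    for benefit, volume, quantity in items.values():
        for _ in range(quantity):  # treat each available copy as a 0/1 item
            for w in range(capacity, volume - 1, -1):
                best[w] = max(best[w], best[w - volume] + benefit)
    return best[capacity]

items = {"X": (6, 3, 1), "Y": (2, 1, 3)}
print(bounded_knapsack(3, items))
print(bounded_knapsack(3, {"X": items["X"]}), bounded_knapsack(3, {"Y": items["Y"]}))
```

Running it on X alone and on Y alone gives the same optimum as the full problem, which is exactly the non-uniqueness the solution points out.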
Exercises 8.S.4 1. The messages are 4 bits long, the corresponding code words are 7 bits long.
quantity 0 0 0 The optimal solution is to purchase 10 mugs, for a total benefit of 30, and $100 spent. This indicates one weakness in the knapsack model that has been presented: The model does not allow for the benefit level to change after packing a few of a particular item. You might attempt to fix this by setting the quantity of mug to a smaller number, then adding a new item "extra mugs," with unlimited quantity but a lower benefit. Since the "extra mugs" have the same volume, they will not be packed until all the "mugs" have been packed. If the quantity of "mugs" is set at 3 and "extra mugs" are given benefit 2 (but unlimited quantity), the optimal benefit will be 24 and the knapsack will contain 3 mugs and 5 books. 11. (a) This is false. Consider the knapsack with total capacity v = 3, with two potential items.
Message  Code word   Message  Code word   Message  Code word   Message  Code word
0000     0000000     0100     0100101     1000     1000011     1100     1100110
0001     0001111     0101     0101010     1001     1001100     1101     1101001
0010     0010110     0110     0110011     1010     1010101     1110     1110000
0011     0011001     0111     0111100     1011     1011010     1111     1111111

2. (b) 1110111 has parity-check values that are not all 0, so the decoding table indicates an error in bit 4. 6. (a) 3 (differ in positions 2, 3 and 4)
7. (b) 0111001 1. (b) 101 10. (b) 101 (error in position 14) 11. (a) False. M is the number of code words (or messages), not the number of message bits.
14. (a)
0000000 0001111 0010110 0011001 0100101 0101010 0110011 0111100
1000000 1001111 1010110 1011001 1100101 1101010 1110011 1111100
0100000 0101111 0110110 0111001 0000101 0001010 0010011 0011100
0010000 0011111 0000110 0001001 0110101 0111010 0100011 0101100
0001000 0000111 0011110 0010001 0101101 0100010 0111011 0110100
0000100 0001011 0010010 0011101 0100001 0101110 0110111 0111000
0000010 0001101 0010100 0011011 0100111 0101000 0110001 0111110
0000001 0001110 0010111 0011000 0100100 0101011 0110010 0111101

The rows continue with the remaining eight columns:
1000011 1001100 1010101 1011010 1100110 1101001 1110000 1111111
0000011 0001100 0010101 0011010 0100110 0101001 0110000 0111111
1100011 1101100 1110101 1111010 1000110 1001001 1010000 1011111
1010011 1011100 1000101 1001010 1110110 1111001 1100000 1101111
1001011 1000100 1011101 1010010 1101110 1100001 1111000 1110111
1000111 1001000 1010001 1011110 1100010 1101101 1110100 1111011
1000001 1001110 1010111 1011000 1100100 1101011 1110010 1111101
1000010 1001101 1010100 1011011 1100111 1101000 1110001 1111110
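The code word tables in this section can be regenerated mechanically. The sketch below assumes the code is linear and that the code words for the four unit messages are 1000011, 0100101, 0010110, and 0001111 (as read from the message table); the names `GENERATORS`, `encode`, and `decode` are illustrative only. Decoding is done by nearest code word, which is what a standard array lookup amounts to for single-bit errors.

```python
# Code words of the four unit messages (assumed from the table in the text).
GENERATORS = {
    "1000": "1000011",
    "0100": "0100101",
    "0010": "0010110",
    "0001": "0001111",
}


def xor(a, b):
    """Bitwise exclusive-or of two equal-length bit strings."""
    return "".join("1" if p != q else "0" for p, q in zip(a, b))


def encode(message):
    """Encode a 4-bit message as the sum (mod 2) of unit code words."""
    word = "0000000"
    for i, bit in enumerate(message):
        if bit == "1":
            unit = "0" * i + "1" + "0" * (3 - i)
            word = xor(word, GENERATORS[unit])
    return word


def distance(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(p != q for p, q in zip(a, b))


CODE = [encode(format(m, "04b")) for m in range(16)]


def decode(received):
    """Return the nearest code word and the 1-based positions that differ."""
    nearest = min(CODE, key=lambda c: distance(c, received))
    flipped = [i + 1 for i in range(7) if nearest[i] != received[i]]
    return nearest, flipped


print(decode("1110111"))
```

Under these assumptions the received word 1110111 decodes to 1111111 with bit 4 flipped, in agreement with solution 2(b).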
18. (a) $M \le \left\lfloor \dfrac{2^{14}}{\sum_{i=0}^{3} \binom{14}{i}} \right\rfloor = \left\lfloor \dfrac{8192}{235} \right\rfloor = 34$. Thus, the code can have at most 34 code words. 20. (a) There are seven positions from which to choose the single bit to change. There is a probability of p that a bit is changed, and a probability of 1 - p that a bit remains unchanged. Therefore, the probability of exactly 1 bit being changed is $7p(1 - p)^6$. If p = .1, the probability is 0.372009, whereas when p = .5 the probability is 0.0546875. This may seem paradoxical: the probability of exactly 1 bit being changed is lower when p is larger! Look at your solution to part (b) to resolve this puzzle.
Exercises 8.6.3 1. (a) $\forall k \in S$, $\forall \{i_1, i_2, \ldots, i_k\} \subseteq S$ with $1 \le i_1 < i_2 < \cdots < i_k \le n$, $M(k, i_1, i_2, \ldots, i_k)$ 2. (a) No system. $|A_1 \cup A_2| < 2$ (b) The list, a, b, c, is the unique system of distinct representatives. 5. (c) No, since it doesn't satisfy the marriage condition. 7. (a) Case 2 (since $|A_2| = 1$). It is possible to choose m = 1 and then rename $A_2$ as $A_1$, or to choose m = 2 and use the (possibly renamed) sets $A_1, A_2$ or $A_2, A_3$ as $A_1$ and $A_2$. The solution using $A_1$ and $A_2$ (without renaming) will be completed. Removing T = {b, c} from $A_3$ yields $B_3$ = {a}. The inductive step (applied to $A_1$ and $A_2$) assigns $r_1 = c$, $r_2 = b$ and a second application of the inductive step (applied to $B_3$) assigns $r_3 = a$. The system of distinct representatives is c, b, a. 8. (a) Since t = n = 2, the lower bound is $\frac{2!}{(2-2)!} = 2$. There are actually two systems of distinct representatives: a, b and b, a. (b) Since t = 2 < n, the lower bound is 2! = 2. There are actually three systems of distinct representatives: a, b, c and b, c, a and c, b, a.
12. (a) False. The marriage condition requires this to be true for every subcollection of k sets. (e) False. The ranges in the table indicate that the correct value is not known, but it lies somewhere in the indicated range. Ramsey's theorem states that there is some unique number, R(5, 5), such that every set with at least that size satisfies the (5, 5) Ramsey condition. 13. One person noted that the theoretical studies done by mathematicians are no more irresponsible than spending time playing (or watching) football, composing (or listening to) music, making movies, etc. (Also, the mathematics generally costs much less.) If time spent on artistic endeavors is justifiable, then so is time spent proving theorems. Ann hinted at another aspect of the debate: The principle must be one that can be applied to solve important scientific problems. Notice that the scoreboards, video cameras, satellite links, and other equipment needed to broadcast a football game would not be possible without mathematicians having developed the mathematics needed for this technology. The stadium is safe because it was built using mathematically based principles from civil engineering. Ann asked that the principle be explained in language a lay person can understand. This is exactly why the result was presented to the press as a problem about guests at a party, thus causing the objection by B.V.B.
G.9 Formal Models in Computer Science
Exercises 9.1.4 3. If a sample space of messages has many low probability messages and just a few high probability messages, there will be very little surprise when a typical message is sent (it will be one of the few in most cases). The definition assigns a relatively small information value as the average information for such a sample space. However, if every message is equally likely, then there will be the maximum variation in which message is sent. The surprise factor will be higher (because there is less predictability). The information value is highest in this case (Theorem 9.1). 4. The answer depends upon whether or not n is a power of 2. (a) n = 14 is not a power of 2. The number of bits is $\lceil \log_2(14) \rceil = \lceil 3.80735 \rceil = 4$. In fact, $14_{10} = 1110_2$.
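The bit count in solution 4(a) can be reproduced with integer arithmetic. This is a minimal sketch; the function name `bits_needed` is an assumption, and it simply computes the ceiling of the base-2 logarithm without floating point.

```python
def bits_needed(n):
    """Smallest b with 2**b >= n, i.e. the ceiling of log2(n), for n >= 1."""
    return (n - 1).bit_length()


print(bits_needed(14), format(14, "b"))
```

Using `bit_length` avoids rounding issues that a floating-point logarithm could introduce for large n.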
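8. Write $\log_2(x) = \dfrac{\ln(x)}{\ln(2)}$. Then use the standard result:
$$\lim_{x \to 0^+} x \ln(x) = \lim_{x \to 0^+} \frac{\ln(x)}{1/x} = \lim_{x \to 0^+} \frac{1/x}{-1/x^2} = \lim_{x \to 0^+} (-x) = 0.$$
Then
$$\lim_{x \to 0^+} \left[ -x \log_2(x) - (1 - x) \log_2(1 - x) \right] = \lim_{x \to 0^+} \left[ -\frac{x \ln(x)}{\ln(2)} - \frac{(1 - x) \ln(1 - x)}{\ln(2)} \right]$$
$$= \frac{1}{\ln(2)} \lim_{x \to 0^+} \left[ -x \ln(x) \right] - \frac{1}{\ln(2)} \lim_{x \to 0^+} \left[ (1 - x) \ln(1 - x) \right] = \frac{1}{\ln(2)} \cdot 0 - \frac{1}{\ln(2)} (1 \cdot 0) = 0.$$
9. Here is a solution that uses a single final state. There are no transitions from the Lose and Win states because the input consists of only two numbers.
[State diagram]
10. Let S = {L, C, B, W} with $p_L = \frac{3}{16}$, $p_C = p_B = \frac{1}{4}$, and $p_W = \frac{5}{16}$. Then
$$I(S) = -\frac{3}{16} \log_2\left(\frac{3}{16}\right) - \frac{1}{4} \log_2\left(\frac{1}{4}\right) - \frac{1}{4} \log_2\left(\frac{1}{4}\right) - \frac{5}{16} \log_2\left(\frac{5}{16}\right) \approx 1.97722.$$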
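The average information in solution 10 can be recomputed directly from the definition. The sketch below assumes the probabilities are 3/16, 1/4, 1/4, and 5/16 (values read from a damaged source, chosen because they sum to 1 and reproduce 1.97722); the function name `information` is illustrative.

```python
from fractions import Fraction
from math import log2


def information(probabilities):
    """Average information (in bits) of a sample space of messages."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    # Terms with probability 0 contribute nothing (see the limit in solution 8).
    return -sum(float(p) * log2(float(p)) for p in probabilities if p > 0)


p = [Fraction(3, 16), Fraction(1, 4), Fraction(1, 4), Fraction(5, 16)]
print(round(information(p), 5))
print(information([Fraction(1, 4)] * 4))
```

The second call shows the equally likely case, which gives the larger value of 2 bits, as the discussion in solution 3 of Exercises 9.1.4 suggests.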
14. (d) False. Shannon's definition of information is not about semantics. The "surprise factor" in the formal definition is a purely probabilistic notion (which can be informally motivated by intuition about semantic content).
Exercises 9.2.3 1. (a) Start in Zero, then: One, Zero, Zero, One, Zero, One, Two, Two. The string is recognized. 5. (c) This can be done with 4 states: Empty, Zero, One, Both. The start state is Empty, the only final state is Both.

         Input
State    0      1
Empty    Zero   One
Zero     Zero   Both
One      Both   One
Both     Both   Both

11. (a)
State  Input  New State  Output
c0     11     c1         0
c1     01     c1         0
c1     10     c1         0
c1     10     c1         0
c1     00     c0         1

The output string is therefore 10000 (13 + 3 = 16). 16. (b) Three states are sufficient. The start state is No, indicating that the pair ab is not in process. The state a indicates that an a has just been received. The state ab indicates that the pair of characters has already appeared. The transitions, labeled (input, output), are as follows.

From  (input, output)   To
No    (b or c, F)       No
No    (a, F)            a
a     (a, F)            a
a     (c, F)            No
a     (b, T)            ab
ab    (a, b, or c, T)   ab

18. The set of states, S, contains two elements: S = {initial, toggle}. The start state is initial. The input and output sets are both {0, 1}. The input and output functions are implicitly defined by the state diagram.
[State diagram]
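The state table in solution 11(a) is the trace of a serial binary adder whose two states record the carry. The sketch below is one way to simulate such a machine; it assumes the input pairs are fed least significant bit first and that c0 and c1 mean "no carry" and "carry". The function names are illustrative.

```python
def serial_add(x_bits, y_bits):
    """Run the two-state adder on paired bits (least significant first).
    Returns the output bits and a trace of (state, input, new state, output)."""
    state = "c0"
    outputs, trace = [], []
    for x, y in zip(x_bits, y_bits):
        total = x + y + (1 if state == "c1" else 0)
        new_state = "c1" if total >= 2 else "c0"
        outputs.append(total % 2)
        trace.append((state, f"{x}{y}", new_state, total % 2))
        state = new_state
    return outputs, trace


def add(a, b, width):
    """Add two non-negative integers using the machine above."""
    x_bits = [(a >> i) & 1 for i in range(width)]
    y_bits = [(b >> i) & 1 for i in range(width)]
    outputs, trace = serial_add(x_bits, y_bits)
    # Outputs were produced least significant bit first, so reverse them.
    return "".join(str(bit) for bit in reversed(outputs)), trace


result, trace = add(13, 3, 5)
print(result)
for row in trace:
    print(row)
```

With a width of 5 bits the final input pair is 00, which flushes the last carry into the output.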
Exercises 9.3.2 1. (b) This is not a legal set of productions for a regular grammar. The production $Y \to Y$ has been ruled out in this textbook. 3. Notice that {uvw} ∪ {vw} = {uvw, vw}. Also, {u, λ}{vw} = {uvw, λvw} = {uvw, vw}, so {uvw} ∪ {vw} = {uvw, vw} = {u, λ}{vw}. 5. (c) $S \Rightarrow 1Y \Rightarrow 100Y \Rightarrow 10000Y \Rightarrow 100001$ 7. (a) $\{a^n c \mid n > 1\} \cup \{b^k \mid k > 2\}$ 9. More than one solution is possible. There may be different choices for nonterminals, and even with the same non-terminals there may be multiple sets of productions that work. Notice the need to avoid using λ to finish (it is not in Σ).
(a) Δ = {S, T} with S as the start symbol. Π = {$S \to aS \mid bT \mid a \mid b$, $T \to aS \mid a$} 11. (c) This is false in general. The presence or absence of λ depends on the set of productions. 13. Note that $A^+ \subseteq A^*$, for any set, A. Consequently, if $x \in A^+$, then $x \in A^*$. Suppose that $s \in (\Sigma^+)^*$. Then either s = λ or there is a positive integer, n, with $s \in (\Sigma^+)^n$. If s = λ, then $s \in (\Sigma^*)^*$, since $(\Sigma^*)^* = \{\lambda\} \cup \bigcup_{i=1}^{\infty} (\Sigma^*)^i$. If s ≠ λ, then s is the concatenation of n strings in $\Sigma^+$. But every string in $\Sigma^+$ is also a string in $\Sigma^*$ (since $\Sigma^+ \subseteq \Sigma^*$). Thus, $s \in (\Sigma^*)^n \subseteq (\Sigma^*)^*$. This implies that $(\Sigma^+)^* \subseteq (\Sigma^*)^*$. Now suppose that $s \in (\Sigma^*)^*$. Then either s = λ or there is a positive integer, n, with $s \in (\Sigma^*)^n$. If s = λ, then $s \in (\Sigma^+)^*$, since $(\Sigma^+)^* = \{\lambda\} \cup \bigcup_{i=1}^{\infty} (\Sigma^+)^i$. If s ≠ λ, then s is the concatenation of n strings in $\Sigma^*$ (some of which might be λ). Let the substrings be $s_1, s_2, \ldots, s_n$, where $s_i \in \Sigma^{k_i}$. Since s ≠ λ, at least one of the n substrings must be non-null. Thus, $k = k_1 + k_2 + \cdots + k_n > 0$. Since $s \in \Sigma^k$, and k > 0, $s \in \Sigma^+ \subseteq (\Sigma^+)^*$. This, together with the discussion for the case s = λ, implies that $(\Sigma^*)^* \subseteq (\Sigma^+)^*$. The two set inclusions imply that $(\Sigma^*)^* = (\Sigma^+)^*$.
Exercises 9.4.3 3. (a|e|i|o|u)[a-z]*(a|e|i|o|u) or [aeiou][a-z]*[aeiou] 5. (b) [a-zA-Z]*(aa|ee|ii|oo|uu)[a-zA-Z]*
14. It is tempting to use a regular expression that starts with
[0-9][0-9][0-9][0-9]( |-)?[0-9][0-9][0-9][0-9]( |-)?
but that would allow numbers such as 1111-2222 33334444 02/05 to be matched. The regular expression will need to have a form like
(A|B|C) (0[1-9]|1[0-2])/[0-9][0-9]
where A is four copies of "[0-9][0-9][0-9][0-9]" separated by single spaces, B is four copies of "[0-9][0-9][0-9][0-9]" separated by single hyphens, and C is four copies of "[0-9][0-9][0-9][0-9]", all run together. 18. This cannot be done using regular expressions. It is tempting to try something like [^p][^e][^r][^c][^i][^v][^a][^l] but that is looking for a string with eight characters, each of which is any character except one position-specific letter. Regular expressions are not powerful enough to specify a string that cannot be present; they can only specify a group of characters that cannot be present.
Exercises 9.5.2 1. (a) Let Σ = {0, 1} and Δ = {Even, Odd}. The start symbol is Even. The productions will be those listed in the following table.

Transition          First Production   Second Production
t(Even, 0) = Even   Even → 0 Even
t(Even, 1) = Odd    Even → 1 Odd       Even → 1
t(Odd, 0) = Odd     Odd → 0 Odd        Odd → 0
t(Odd, 1) = Even    Odd → 1 Even

3. (a) Σ = {a, b, c} and the start state is S. The set of states is initially $\mathcal{S}$ = {S, B}. (I have modified the notational suggestions of the proof so that lowercase b does not have two meanings.) We need to add the state F to $\mathcal{S}$ because of the production $B \to c$. $\mathcal{F}$ = {F}. No states other than BH are added when defining t, so $\mathcal{S}$ = {S, B, F, BH}. In tabular form, the transition function looks like the next table.

         Input
State    a     b     c
S        B     BH    BH
B        BH    B     F
F        BH    BH    BH
BH       BH    BH    BH

The state diagram is shown next.
[State diagram]
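The transition table in solution 3(a) of Exercises 9.5.2 can be exercised directly. The sketch below assumes the table has rows S, B, F, BH and columns a, b, c as reconstructed from the source, with F the only final state and BH acting as a trap state; the names `TRANSITIONS` and `accepts` are illustrative.

```python
# Transition function t(state, input) as a nested dictionary (assumed table).
TRANSITIONS = {
    "S":  {"a": "B",  "b": "BH", "c": "BH"},
    "B":  {"a": "BH", "b": "B",  "c": "F"},
    "F":  {"a": "BH", "b": "BH", "c": "BH"},
    "BH": {"a": "BH", "b": "BH", "c": "BH"},
}
FINAL_STATES = {"F"}


def accepts(string, start="S"):
    """Return True when the automaton ends in a final state."""
    state = start
    for symbol in string:
        state = TRANSITIONS[state][symbol]
    return state in FINAL_STATES


for candidate in ["ac", "abbc", "abca", "b", ""]:
    print(repr(candidate), accepts(candidate))
```

Under this table the recognized strings are an a, followed by any number of b's, followed by a single c.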
5. (b) Start by creating finite automata for the five regular expressions a, h, c, d, and e.
f
A
AS A
A7
Next, the finite automata corresponding to ab and to cd can be formed using the concatenation rule.
The alternative rule can now be used to construct a nondeterministic finite automaton for the regular expression, (ab J cd).
A
A
fa
faA
b Sbb
Finally, the concatenation rule can be used again to construct the final nondeterministic finite automaton for the regular S(ablcd)AA de expression, (ab I cd)e. SCS
6. (b) A state diagram corresponding to the state table is shown.
If x = s 1 and z = sl, then Lo(x, z) = 0, Lo(x, y) Lo(y, y) = 1, and Lo(y, z) = 0. Thus, Ln(x, z) = 0 1 11*0, which simplifies to 0 1+0.
The modified state diagram with new initial state, s, and new final state, fo, is shown next. Note the two A-transitions into the new final state. 0
1,
01O[+0
I
0
=
~l1/•0 A
All+ The final iteration uses y = s1 . There is only one choice for x and z: x = s and z = fo. In this instance, gLo(x, z) =-- Iol+, Lo(x, y) = 0 1 1+0, Lo(y, y) =011 +0, and Lo(y, z) = I+. Thus, Ln(x, gA z) = I•I1+)I ((011+0)(011+0)*I+). This simplifies to A! + I (011 +0)+ 1+. +
A~~ The reduction loop might be easiest to start with y The choices for x and z are x = s, Z = SI, x= s, z = s2, and x = s, z = fo. For x = s, z = s1 ,Lo(x, z) = 0, Lo(x, y) = Lo(y, y) = 0, and Lo(y, z) = 0. Thus, Ln(x, z) = 0 1 0*0, which simplifies toO. For x = s, z = s2 , Lo(x, z) = 0, Lo(x, y) =, Lo(y, y) = 0, and Lo(y, z) = 1. Thus, L(x, z) = 0, and o1, which smlfe T ,to
=
so.
For x = s, z = fA, Lo(x, z) = 0, Lo(x, y) =, Lo(y, y) = 0, and Lo(y, z) = A. Thus, Ln(x, z) = 0 IA.*A, which simplifies to A. 0
I 7A 0
7. (d) False. Although a nondeterministic finite automaton does not need to have transitions associated with every input symbol, it is still permitted.
Exercises 9.6.3 1. r= (S -- aXcd, X -0. aXc b) 5. (a) These productions can be part of a context-sensitive grammar, but not a context-free grammar (since there are terminal symbols on the left). Notice that ci and/or 1 u 2 , in the description of context-sensitive productions, can be Aý. 6. (a) These productions cannot be part of any of the grammars in the table, since it is not valid for a to be A in any production of the form u -- , 11. (a) False. The text gave the language {anbncn In > 0} as a counterexample. 14. (a) Halts in state sj(not recognized). Tape = •... AAABB(C)CA...
G.10 Graphs Now let y = s2 (another final state). The possible transitions that pass through y are discussed next. If x = s and z = fo, then Lo(x, z) = A., Lo(x, y) = 1, Lo(y, y) = 1, and Lo(y, z) = A.Thus, Ln(x, z) = A I I1'*A, which simplifies to AýlI+. Ifx = s and z = s1, then Lo(x, z) = 0, Lo(x, y) = 1, Lo(y, y) = 1, and Lo(y, z) = 0. Thus, Ln(x, z) = 0 111*0, which simplifies to 0 I 1+0. Ifx = sl and z = fo, then Lo(x, z) = 0, Lo(x, y) = 1, Lo(y, y) = 1, and Lo(y, z) = Aý. Thus, Ln(x, z) = 0 11 l*A, which simplifies to I+.
Exercises 10.1.3 2. (a) When either n > 2 or m > 2 the graphs are all bipartite: Let Vu be the vertices whose row and column sum is even and Vý be the vertices whose row and column sum is odd. 4. (b)a-
c
d
5. (b) $S_n$ is regular only when n equals 1 or 2.
(b) There are Hamilton cycles. One is shown.
6. Since $K_n$ is complete, it will have n - 1 edges incident with the first vertex, n - 2 additional edges incident with the second vertex, n - 3 additional edges incident with the third vertex, and so on. The edges run out when the penultimate vertex is reached. The number of edges is therefore $1 + 2 + \cdots + (n - 2) + (n - 1) = \frac{n(n-1)}{2}$.
An alternate proof: We want all possible unordered pairs of vertices from a set of size n. Each unordered pair corresponds to an edge of $K_n$. There are therefore $\binom{n}{2} = \frac{n(n-1)}{2}$ edges.
12. The union, $G \cup \overline{G}$, is $K_n$, and hence has $\binom{n}{2}$ edges. Since G has e edges, its complement must have $\binom{n}{2} - e$ edges. 16. (b) False. Loops and/or multiple edges are permitted in multigraphs. 20. (a) Maximal clique: Any two adjacent vertices. Maximal independent set: {a, b, c}.
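The edge counts in solutions 6 and 12 are easy to confirm for small n by listing unordered pairs. This is a minimal sketch; the function names are illustrative, and vertices are assumed to be labeled 0 through n - 1.

```python
from itertools import combinations
from math import comb


def complete_graph_edges(n):
    """Edges of K_n as unordered pairs of vertices 0..n-1."""
    return list(combinations(range(n), 2))


def complement_edges(n, edges):
    """Edges of the complement of a simple graph on vertices 0..n-1."""
    present = {frozenset(edge) for edge in edges}
    return [e for e in complete_graph_edges(n) if frozenset(e) not in present]


path = [(0, 1), (1, 2), (2, 3)]  # a simple graph with e = 3 edges on 4 vertices
print(len(complete_graph_edges(4)), len(complement_edges(4, path)))
```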
Exercises 10.2.3 3. (a) It is a trail and a path. (b) It is a trail, a closed walk, a circuit, and a cycle. 5. (a) The Königsberg graph has connectivity 2 (b and d) and edge connectivity 3 (ad, bd, and cd). 7. (a)

     a   b   c   d
a    0   2   0   1
b    2   0   2   1
c    0   2   0   1
d    1   1   1   0

8. (a) $A^3$:

     a   b   c   d
a    4  22   4  11
b   22   8  22  11
c    4  22   4  11
d   11  11  11   8

10. The path needs to connect three regions, but it must do so by passing through vertex 6 each time it changes region. Since two changes are required, vertex 6 must be crossed two times. This rules out a Hamilton path. 12. (d) False. In order to make this statement, you must first verify that G is a connected multigraph. 17. It is possible, as the following graph demonstrates.
x
(b)
b
0
0
01 11 1 00 0 00 1 0 0
0 0
Exercises 10.3.3
e
0
l
e
e
2.
e
d
c
a
b b
(a) There is neither an Euler circuit nor an Euler trail since there are eight vertices with odd degree.
e
( Exercises 10.4.3
a b c d
twice). 6.H 3
d
•
(b) The entry in row 1, column 4 of $A^3$ is the answer. There are 11 walks. They are $a e_1 b e_1 a e_5 d$, $a e_1 b e_2 a e_5 d$, $a e_2 b e_2 a e_5 d$, $a e_2 b e_1 a e_5 d$, $a e_1 b e_3 c e_7 d$, $a e_1 b e_4 c e_7 d$, $a e_2 b e_3 c e_7 d$, $a e_2 b e_4 c e_7 d$, $a e_5 d e_5 a e_5 d$, $a e_5 d e_6 b e_6 d$, and $a e_5 d e_7 c e_7 d$. 14. (a) False. Consider the Königsberg graph in Exercise 3. It is clear that $a e_1 b e_2 a$ is a trail. However, this is not a path because vertex a is repeated.
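The walk count can be checked by cubing the adjacency matrix. The sketch below assumes the Königsberg multigraph has two edges between a and b, two between b and c, and single edges ad, bd, and cd (the edge pattern implied by the listed walks); the helper names are illustrative.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Rows and columns are ordered a, b, c, d (assumed adjacency matrix).
A = [
    [0, 2, 0, 1],
    [2, 0, 2, 1],
    [0, 2, 0, 1],
    [1, 1, 1, 0],
]

A3 = mat_mul(mat_mul(A, A), A)
for row in A3:
    print(row)
print("walks of length 3 from a to d:", A3[0][3])
```

Each entry of the cube counts walks of length 3, with parallel edges counted separately, which is why the four walks through b and back to a all appear in the list.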
3. With two vertices, the complete graph K2 has only one edge. There is a Hamilton path but no edge to use to return to the initial vertex (recall that a cycle is a circuit, which is a trail and consequently has no repeated edges). When n > 3, a cycle will never use an edge twice (or it would use vertices
b
e3
e4
e5
1
0 0
01 0
1
1
1 0 1
0 0
1
0
1
0
0 1 1 0 0 /2 0 10 2 00 0 0 0 0
0 0 1 0
0 0 0 0
11 0
00 0 2 0 0
1 0
1 1 0
1
0
0 0 0 11 0 0 0 0 1 1 1 0 1 0 1 0
1101
0 0 = 2 01 0 2
0
0
1 0
0 0 1 0 0
0 1 1 1 0 l 1 1 0
4. (a) This cannot be an adjacency matrix, since vertex b has two neighbors according to row 2, but it has three neighbors according to column 2 (that is, this matrix is not symmetric). 7. (b) 3, 2, 2, 2, 1 10. (a) The only isomorphism is shown.

v      a   b   c   d   e   f
φ(v)   u   z   x   w   v   y

11. (a) The degree sequences are identical (3, 3, 3, 3, 3, 3, 3, 3) but they are not isomorphic. Notice that the second graph contains a triangle (a circuit of length 3). The smallest circuit in the first graph has length 4. Since isomorphism preserves circuits, these cannot be isomorphic. 12. (b) True. This implication is true since the hypothesis is always false. There cannot exist an isomorphism between a simple graph and a multigraph because the edge sets will either have a different number of elements, or else the adjacencies will not map properly. (f) True. This implication is true since the conclusion is always true. Vertex degree is isomorphism invariant. 19. (a)
Exercises 10.5.4 5. (a) Notice that removing any vertex will also remove four edges. If it can be shown that any subgraph derived by removing just one edge is planar, then any subgraph with fewer vertices must also be planar. The symmetry of the graph means that removing any edge is equivalent to removing any other. However, the planar embedding we are used to seeing shows two apparently different kinds of edges (inside and outside). The following diagram shows the result of removing either of these apparently different kinds of edges. a
[Figure: planar drawings of the two subgraphs, on vertices a, b, c, d, e.]
8. The graph is nonplanar. (Remove the central vertex and its incident edges. The resulting subgraph is homeomorphic to K 5 .)
9. (a) The icosahedron graph has chromatic number 4. It can't be a larger number (by the four color theorem). Notice the triangle around the outside. It will require three colors for the three vertices on the outside. There is only one way to color the next three vertices on the following diagram if we try to limit the coloring to three colors. There are now three vertices that require a fourth color.
2.r(a)
o 0
0
1 00
1
0
0
0
1
0
1
0
1
0
1 0 0 0
b1
001 0 0
5. All three questions require A3 , which is shown. r
0 0 0 000202
bJAg 12. (a)
0
0
0
0
20
0
0
0 ,1
4 0
0 1
0 0
00 0 0
1
Ismrhcto 1.I n ,(a)
-----Isomorphic
2 0\
0
1
0
There are two directed walks of length 3 from b to d. 10. If $\sum_{k=0}^{n-1} A^k$ has a zero in the ijth entry, then there is no directed walk (or path) between $v_i$ and $v_j$ that has length less than or equal to n - 1. By the previous problem (contrapositive), there is no directed walk between $v_i$ and $v_j$, so D is not strongly connected.
17. (c) False. The chromatic number is the minimal number of colors in any proper coloring of the graph.
Exercises 10.6.3 1. (a) This is weakly connected but not strongly connected (vertex b has no outward arcs, so no directed walks exist from b to any other vertex),
If $\sum_{k=0}^{n-1} A^k$ has a nonzero value in the ijth position, then for at least one value of k, there is a directed walk of length k between $v_i$ and $v_j$. If this is true for all choices of i and j, D must be strongly connected. 11. (b) False. To be multiple arcs, both must start and end at the same vertices: $\phi(e_1) = (v_i, v_f)$ and $\phi(e_2) = (v_i, v_f)$.
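The matrix test for strong connectivity in solution 10 can be sketched as follows. The code assumes the sum runs over the powers $A^0$ through $A^{n-1}$ (so that walks of length 0 cover the diagonal); the function names and the two sample digraphs are illustrative, not taken from the text.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_count_sum(adj):
    """Return A^0 + A^1 + ... + A^(n-1) for an n x n adjacency matrix."""
    n = len(adj)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0
    total = [row[:] for row in power]
    for _ in range(n - 1):
        power = mat_mul(power, adj)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def is_strongly_connected(adj):
    """True when every entry of the summed powers is nonzero."""
    return all(entry > 0 for row in walk_count_sum(adj) for entry in row)


directed_cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
with_sink = [[0, 1, 1], [0, 0, 0], [0, 1, 0]]  # second vertex has no outward arcs
print(is_strongly_connected(directed_cycle), is_strongly_connected(with_sink))
```

Repeated matrix multiplication is not the fastest test (a pair of graph searches would do), but it mirrors the argument in the solution.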
12. (a) The shortest path has length 7 and is acdef. The following table shows the details of Dijkstra's algorithm. d n
[Table: trace of Dijkstra's algorithm on the vertices a, b, c, d, e, f.]
19. (a) This is not a tournament graph (and hence not a transitive tournament graph either). One of the many reasons for this is that there is not an arc between vertices a and e. 21. The tournament graph is shown below.
(a) Four of the many Hamilton paths are: FEABCD, EBCDAF, DEAFBC, and EAFBCD.
a 8
a C
c d] e
(b) The following table shows the numbers of wins. A B C D E F 4 2 2 2 2 3 This table favors a path starting at Faramir, but such a path cannot have Assumpta ranked second. This example serves as motivation for the desirability of transitive tournament graphs.
GA1I Trees Exercises 11.1.1 3. (a) Since n = in + 1,solve for i to show i n
=+
11-=
-n-I
=
_. Since
m n(m-I)+l
6. (a) Theorem 11.4 implies that 10,250
The total weights are listed in the next table.
(paragraph)+>
(#PCDATA I emphasis) +>
subject (#PCDATA)>
(b)
a
b
c
d
50
37
27
55
c 6
d 30
54
132
graph
17. (a)
d
c
b
a
graph
total minimal edge weight
8. The total weights are listed in the next table. graph a b totalminimaledge 43 63 weighting weig raph
modified graph total maximal
137
97
edge weight of original graph 12. (b) There are 3 spanning trees. The matrix K is
<email> gosrac@bethe1. edu
K =-
[email protected] 2
-1
-1
-1
-1
3
0
0
--
0 -I 1
gossett@bethel. edu
with cofactor determinant 3.
<subj ect>Test ing a DTD< / subject> <paragraph>This paragraph is for testing purposes. Should this have been a real email, something more interesting would have been written.
The distinct spanning trees are shown next.
c
d
d
d
Solutions to Selected Exercises 13. (a) The adjacency matrix for C5 is shown,
0
1
1 0 0 1 1 0 0 1
0 0 1 0
1 0
0
1 0 0 1 0 0 1 0
transitive closure ((a, b), (a, c), (b, d), (c, c)) (nothing added) 10. (a) Every integer in the set .T (and any integer in general) is equal to itself, so the relation is reflexive. The relation is trivially antisymmetric because two integers will not be in the relation unless they areequal. Suppose (x,y) E TR1 and(y,z) E •1. Thenx=yandy=z.
The matrix, K, is
Thus, x = z and so (x, z) E 7Z.I. This implies that the
1
12 2
0 1
0 0
0
-1 0
2 -1
0 0 1 There are C I trees.
=
(_1)2 detAl
=5
-1 2 distinct spanning
G.12 Functions, Relations, Exercises 12.1.3 1. (a) Not a function: (3, 3) and (3, 4) are both in the relation 2. (a) Not onto: 2 is not a second coordinate 3. (a) Not one-to-one: (2, 4) and (3, 4) are both in the relation
4. (a) {(1, 1), (4, 2), (3, 3), (4, 3), (5, 4), (5, 5)} 9. (c) This composition is not defined, since some elements (such as 2) in the image of $\mathcal{R}$ are not in the domain of $\mathcal{S}$. 13. (a) $\mathcal{T} \circ \mathcal{S} \circ \mathcal{R}$ = {(GA, IL), (GA, KA), (MI, IL), (MI, KA)} 15. (d) False. {1, 2, 3} is the image of f but not necessarily the range. In other words, not every function is onto. Exercises 12.2.3 1.(a)
1
Properties
Transitive Let $(a, b) \in S$ and $(b, c) \in S$. Then a + b = 2m and b + c = 2n for some $m, n \in \mathbb{Z}$. Thus, a + c = (2m - b) + (2n - b) = 2(m + n - b) = 2k for some $k \in \mathbb{Z}$. This implies that $(a, c) \in S$. (b) Integers of the same parity form the equivalence classes of the relation. That is, the even integers are in one equivalence class and the odd integers are in another. 20. Let $[x'] = [x]$ and $[y'] = [y]$. Then $x - \lfloor x \rfloor = x' - \lfloor x' \rfloor$ and $y - \lfloor y \rfloor = y' - \lfloor y' \rfloor$. Thus, $x' = x - \lfloor x \rfloor + \lfloor x' \rfloor$ and $y' = y - \lfloor y \rfloor + \lfloor y' \rfloor$. Consequently,
$x' + y' = (x - \lfloor x \rfloor + \lfloor x' \rfloor) + (y - \lfloor y \rfloor + \lfloor y' \rfloor)$
(b)
(c)
Equivalence
Partial
Therefore, x' + y' and x +- y differ by an integer (more
Relation?
Order?
specifically, the integer Lx'J + LY'J - LxJ - LyJ). Thus, x1 + y' = x + y + n and so x1+y'- [x'+y'j = x+y+n- Lx+y+nj =x+y- Lx+yj. But this means that [x' + y] = [x + y], and so 0 is
no
no
symmetric, transitive, antireflexive
relation is transitive. The Hasse diagram is very simple: It consists of a row containing the numbers -10 to 10, with no edges. 12. Let x mod 5 = y mod 5 = r. Then there are integers $q_x$ and $q_y$ such that $x = 5q_x + r$ and $y = 5q_y + r$. But then $x - 5q_x = r = y - 5q_y$, so $x - y = 5(q_x - q_y) = 5n$. 15. (a) Reflexive Let $(a, a) \in \mathbb{Z} \times \mathbb{Z}$. Since a + a = 2a, which is even, $(a, a) \in S$. Symmetric Let $(a, b) \in S$. Then a + b = 2k for some $k \in \mathbb{Z}$. It is then trivially true that b + a = 2k, so $(b, a) \in S$.
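The three properties proved in solution 15 can be spot-checked on a finite window of integers. The sketch below assumes the relation is "a + b is even" and tests only the integers from -10 to 10, so it illustrates the proof rather than replacing it; the names `related` and `classes` are illustrative.

```python
from itertools import product


def related(a, b):
    """(a, b) is in S exactly when a + b is even."""
    return (a + b) % 2 == 0


sample = range(-10, 11)

reflexive = all(related(a, a) for a in sample)
symmetric = all(related(b, a)
                for a, b in product(sample, repeat=2) if related(a, b))
transitive = all(related(a, c)
                 for a, b, c in product(sample, repeat=3)
                 if related(a, b) and related(b, c))

# Group the sample by parity: these are the equivalence classes on the window.
classes = {}
for a in sample:
    classes.setdefault(a % 2, []).append(a)

print(reflexive, symmetric, transitive)
print(classes)
```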
Databases, and Circuits
Relation
In blended families, the properties are different. Transitivity holds for full siblings, but not for half-siblings. 4. (a) The relation ((1, 2), (2, 2)) on f1, 2} is neither reflexive nor antireflexive. So "not reflexive" does not imply "antireflexive". Also, the relation 0 on 0 is antireflexive and reflexive, so "antireflexive" does not imply "not reflexive" (the definitions are vacuously satisfied). 7. (a) S1 = {(a, b), (a, c), (b, c), (c, c))
= (x + y) + (Lx' + Ly'j - LxJ - Ly]).
well defined.
Exercises 12.3.4 1. (a) This cannot represent a relation in a relational database because the Understudy column is not single-valued. In particular, the Hamlet row has two values in the Understudy column. 4. (Teaching Assignments)[Course, Instructor] Course Instructor
reflexive closure ((a, a), (a, b), (a, c), (b, b), (b, d), (c, c), (d, d)} Note: (d, d) is in the reflexive closure since A hasn't changed.
MAT222
Kinney
symmetric closure {(a, b), (a, c), (b, a), (b, c), (c, a), (c, b), (c, c)}
MAT 124M MATI24M
Conrath Kinney
6. (a) The join is on Student. Student Kim
Task
Yearbook Homeroom Feature
Teacher Photos
H4
Photos
Widgets and Flanges should be replaced by {(Widgets and Flanges)[Part Name, Part Location], Widgets and Flanges)[Part ID, Part Name]}. Both new relations are in third normal form. Part Name is the primary key in
Cohort Senior
the first relation and a foreign key in the second.
Rosa
Student D5 Editorials Junior Photos 11. (c) The functional dependencies are listed below.
(Widgets and Flanges)[Part Name, Part Location] Part Name Part Location Widget (metric) Warehouse W
Part ID -- Part Name
Warehouse W
Widget (English) Flange (4 inch) Flange (6 inch)
Part ID -) Part Location Part Name - Part Location Part ID is the primary key. The functional dependency Part Name -* Part Location prevents this relation from being in third normal form. It is in second normal form, however, since no attribute is functionally dependent on a proper subset of the primary key. The projections are shown. (c) 12. (Widgets and Flanges)[Part ID]
Warehouse F Warehouse F
(Widgets and Flanges)[Part ID, Part Name] Part ID
Part Name
W1256
Widget (metric)
W1257
Widget (metric)
W 1256
W2276
Widget (English)
W1257
F4
Flange (4 inch)
W2276
F6
Flange (6 inch)
Part ID
F4
Part Name
Part Location
18. (b) This is true. The join of the sub-relations in the decomposition produces $\mathcal{R}$ again, hence the name lossless. (d) This is false. If the two relations have no common attributes, then $\mathcal{R} * \mathcal{T} = \mathcal{R} \times \mathcal{T}$, which is as large as the join can get for two relations (see Proposition 12.4).
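The difference between a lossless and a lossy decomposition can be seen by joining projections of the Widgets and Flanges relation. The sketch below is illustrative only: the five rows are assumed from the tables in this section, and `project` and `join` are hypothetical helpers implementing projection and natural join on small in-memory relations.

```python
# Rows of the Widgets and Flanges relation (assumed from the tables above).
ROWS = [
    {"Part ID": "W1256", "Part Name": "Widget (metric)", "Part Location": "Warehouse W"},
    {"Part ID": "W1257", "Part Name": "Widget (metric)", "Part Location": "Warehouse W"},
    {"Part ID": "W2276", "Part Name": "Widget (English)", "Part Location": "Warehouse W"},
    {"Part ID": "F4", "Part Name": "Flange (4 inch)", "Part Location": "Warehouse F"},
    {"Part ID": "F6", "Part Name": "Flange (6 inch)", "Part Location": "Warehouse F"},
]
ALL = ["Part ID", "Part Name", "Part Location"]


def project(rows, attributes):
    """Projection: keep only the named attributes, dropping duplicates."""
    return {frozenset((a, row[a]) for a in attributes) for row in rows}


def join(left, right):
    """Natural join: merge tuples that agree on every shared attribute."""
    result = set()
    for l_tuple in left:
        for r_tuple in right:
            l_row, r_row = dict(l_tuple), dict(r_tuple)
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                result.add(frozenset({**l_row, **r_row}.items()))
    return result


original = project(ROWS, ALL)
lossless = join(project(ROWS, ["Part ID", "Part Name"]),
                project(ROWS, ["Part Name", "Part Location"]))
lossy = join(project(ROWS, ["Part ID"]),
             project(ROWS, ["Part Name", "Part Location"]))

print(lossless == original, len(lossy))
```

With no shared attribute the second join degenerates to a Cartesian product, so it contains tuples such as (W1256, Flange (4 inch), Warehouse F) that were never in the original relation.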
Widget (metric)
Warehouse W
Exercises 12.4.4
Widget (English)
Warehouse W Warehouse F
F6
(Widgets and Flanges)[Part Name, Part Location]
Flange (4 inch)
The third assertion in Proposition 12.4 implies that the join will be the Cartesian product {Part ID} x [Part Name, Part Location). This means the join will contain many tuples that are not in the original relation. One such tuple is (W1256, Flange (4 inch), Warehouse F). This is not a lossless decomposition. 14. (c) The functional dependencies are listed below.
3. There are 2 = 28 = 256 (Theorem 12.7). 5. (b) This a minterm among other reasons, it does is notnot contain eitherbecause, x or Y-. 6. (b) This is not in disjunctive normal form because it is not 0 and it does not contain any minterms. 8. The binary function f(x, y) = x • + Y- y evaluates to 1 at (1, 0) and (0, 1). It evaluates to 0 at (0, 0) and (1, 1). 11. (a) xI + 3ý. (x1 + i3) ( . . (xi+ 3I-) De Morgan -
=-
Part ID --. Part Name -,
Part Location
Part ID is the primary key. The functional dependency Part Name - Part Location prevents this relation from being in third normal form. The algorithm indicates that
x2 ' (xI + 3-3)
associativity
(this step is optional)
Part ID -+ Part Location Part Name
(5T.x2) •(XI + ?-) involution
12. (a)
(Tx_. 1 + x2 • x3) • x1
= ((WiT- 1) • x1 + (x2 • x3 ) • x 1 ) distributivity = Xi-.1 • x1 + x2 • x3 •x associativity (three times)
Solutions to Selected Exercises 13. (a) A short solution. XI • x3 -iT = X1 •j -- x 3 = (XI •5i) • x 3
commutativity associativity complement axiom commutativity domination
= 0 • x3 = X3 • 0 =0 A long solution. xl •x 3 • x=
xI •x 3 -x-1 x2 + Xl X3 X •Tx 2 = X x-i-x 2 x3 + X1 - j•2 X3 = (xl -i-) • x2 • X3 + (xI • xq-) 2. x3
replacement algorithm (i =2) sort within terms associativity (twice)
= 0 •x2 - x3 + 0 • 5 -X3
complement axiom (twice)
-
associativity (twice)
0. (x2 x3) + 0 (2 . x3)
commutativity (twice) domination (twice) identity axiom
= (x2 - x3) • 0 + (C- x3) • 0 =0+ 0 =0 14. (b) Move all complements onto single variables. Already in this form. Transform to a sum of products. Already in this form. Transform into disjunctive normal form.
replacement algorithm (i = 1) commutativity (twice)
I=1• x] + 1 • T = xI +1-+ - 1 =
identity (twice)
Xl + ýi-
replacement algorithm (i = 2) replacement algorithm (i = 2)
=x 1 x2 + x 1 X2 + 55E2+l- X2 +Yj-1.-2
=Xt
X2 + X 1
=Xj X1
±X1 5x2 X2 X3 + XI X2 jý3 + xI - 2+X- x2 + 3C.2 3+ i-.x2+x5.x2 X- 3 +x+x 1 X2 x3 +X1 x2 . +x.
=X
1
.X2 .x
3
+X1
.X2.3 +XI
.X 2 .X3 +xf
.X 2 .X
replacement algorithm (i = 3) replacement algorithm(i =3)
3
replacement algorithm (i = 3)
x Xl - T2 +ý -X2 •x3x-I" x2 T =X1 X2 .X3 +X1 .X2.-3+XI .--T2.x
3
+xj x2.x3
replacement algorithm (i = 3)
+Xp- X2 •X3 + -i--x2 -X3 + XI -X2 •X3 + XI -x2 -X3
The terms are already minterms. Sort and reduce into a unique disjunctive normal form. The minterms are already sorted. There are no duplicate minterms. The final expression is thus: xI .X2.X3
"x1 "+Xl
+
XI
X2
x3
-+X1
X2 3
x2.X3 + xI X2.x32+Xl+-T. x2 X3 .x2
x3
+
Xl .2
X3.
This makes sense: The function should evaluate to I at all 8 possible ordered triples. 17. (b) Move all complements onto single variables. XI 'X4 + X • X2"-x2 = (xI • x4) + (Xl •T2-). X2
associativity (twice)
x 4 ) - (xjI li-)) 'x 2 De Morgan (twice) = ((XiT + 3)" (Y- + =22))" x2 De Morgan (twice) = ((Ti.
+ X2)) •x2 involution = ((Xi- + iT) G(CI = (xi1+ T-•) • (li + x2 ) • x2 associativity
Transform to a sum of products.
I•+ -•4)
I•+ X2). -X2 distributivity distributivity
= (G- + T4. x2) •X2 = x-- x 2 + (G4- X2) •x2 = x-- x 2 +,T4 x2 x 2
associativity
Transform into disjunctive normal form. ' x 2 + X4 " X2 • X2 assoc., idempotence
= X'- X2 + x4 " X2
x. x 2 + X4- x2 =Xl X2 X
X3 + X
X1 + X2
replacement algorithm (i = 1)
.x2 •2-
x3
replacement algorithm (i = 3)
+ X" x 2 •X) + X4 x 2 •X=XTl-X2 x3 +J
X2 .x3+x4-X2
X1
x3
replacement algorithm (i = 3)
+ x 4 . x2 •x] •x3 + x4 •x2 - 7FX-.X2
+
X3 +X-X2
x3 + x4
X2.X1
X3
x'-.2 x• l•3 + X4 - x2 • ýi- X3 + 4- X2 • x •X3
T --- X2
x3. X4 +
+
+x1 -
x
.X2.X3"
.l
x4 • 22+
x3x X4 +
+÷ .x
2 .x.x1 x3 +-X4 .xX X2.x3+x
' X2
X-- X3 X3X4 +
-l X2
x3
X3
. X2 - X3 +-X4 "X2"xi
x3 x3 + x • X2 -•X Tx
X-- X2
4
+ -
X4 +l
replacement algorithm (i = 3) replacement algorithm (i =4)
3
x 4 + xI. X2
" -1X T 4
x3
x3
x4
replacement algorithm (i =4)
.x2.X.X3+-T4.X2.Xl.X3
" X2 - 3X 3
4
+
3X
-x2'
4
sort within terms
+x1-x2. -x3.--4+x)X2.x3-x4+X1-X2.X3-x4 +x5 x2-53-x4 Sort and reduce into a unique disjunctive normal form. Sort the minterms. x1 .x2.x3.4
+
+ 6 -x 2 .X3.
x 5+jx-
.x2
x3 . x4 + -1
X
X2
X3X4
5F X4 + X12
X3.4
Remove duplicate minterms. Xi . X2 .x3.X4*
.
X2
x3. x4 + jl
X2
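A truth-table enumeration gives an independent check on a disjunctive normal form such as the one in Exercise 17(b). The sketch below assumes the expression in that exercise reads $\overline{x_1 \cdot x_4 + x_1 \cdot \overline{x}_2} \cdot x_2$; the function name `f` and the `x1'` notation for a complement are illustrative choices, not the textbook's.

```python
from itertools import product


def f(x1, x2, x3, x4):
    # Assumed reading of Exercise 17(b): complement of (x1.x4 + x1.x2'), times x2.
    # x3 does not occur in the expression but is still a variable of the function.
    return (not ((x1 and x4) or (x1 and not x2))) and x2


# Every 4-tuple at which f is 1 corresponds to exactly one minterm.
minterms = [bits for bits in product([1, 0], repeat=4) if f(*bits)]
for bits in minterms:
    print(".".join(f"x{i}" if b else f"x{i}'" for i, b in enumerate(bits, 1)))
```

Under that assumption the enumeration finds six minterms, which is the number left after duplicates are removed in the algebraic derivation.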
18. Maxterms and Conjunctive Normal Form

(a) DEFINITION G.1 Maxterm
Let $x_1, x_2, \ldots, x_n$ be n binary variables. A maxterm is a binary expression in the form

$$\tilde{x}_1 + \tilde{x}_2 + \cdots + \tilde{x}_n$$

where $\tilde{x}_i$ is either $x_i$ or $\overline{x}_i$ for $i = 1, 2, \ldots, n$.

(b) DEFINITION G.2 Conjunctive Normal Form
A binary expression is in conjunctive normal form if it is either a product of distinct maxterms or it is the expression, 1.

(c) PROPOSITION G.1 Evaluating Maxterms
Let $\tilde{x}_1 + \tilde{x}_2 + \cdots + \tilde{x}_n$ be a maxterm in the binary variables, $x_1, x_2, \ldots, x_n$. Define an n-variable binary function, f, as

$$f(x_1, x_2, \ldots, x_n) = \tilde{x}_1 + \tilde{x}_2 + \cdots + \tilde{x}_n.$$

Then f has the value 0 at only one element in its domain; it has the value 1 at all other elements of its domain. The n-tuple at which f has the value 0 is determined by setting

$$x_i = \begin{cases} 0 & \text{if } \tilde{x}_i = x_i \\ 1 & \text{if } \tilde{x}_i = \overline{x}_i \end{cases} \qquad \text{for } i = 1, 2, \ldots, n.$$

Proof: The function, f, is defined in the n terms, $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n$. If any one of those terms evaluates to 1, then f will also evaluate to 1. The only way to make each of the terms evaluate to 0 is to make the assignments specified in the statement of the proposition.

(d) Every Binary Function Can be Expressed in Conjunctive Normal Form
Let f be a binary function in the binary variables, $x_1, x_2, \ldots, x_n$. Then there is a binary expression in conjunctive normal form that is equal to f when viewed as a binary function.

(e) Every Binary Expression is Equivalent to a Unique Expression in Conjunctive Normal Form
Every binary expression is equivalent to a unique expression in conjunctive normal form. The uniqueness requires a preestablished lexicographical ordering of the variables.

Exercises 12.5.4

1. (b) $x \cdot y \cdot z + x \cdot y \cdot \overline{z} + x \cdot \overline{y} \cdot z + x \cdot \overline{y} \cdot \overline{z} + \overline{x} \cdot y \cdot z$

Phase 1:

    1   111  ✓      1,2   11-  ✓      1,2,3,5   1--
    2   110  ✓      1,3   1-1  ✓
    3   101  ✓      1,4   -11
    4   011  ✓      2,5   1-0  ✓
    5   100  ✓      3,5   10-  ✓

Phase 2:

             1        2        3        4        5
           x·y·z    x·y·z̄    x·ȳ·z    x̄·y·z    x·ȳ·z̄
    -11      X                          X
    1--      X        X        X                 X

The simplified function is $f(x, y, z) = x + y \cdot z$.

4. (a) Short version:

$$\begin{aligned}
w \cdot \overline{(x \cdot y + x)} &= w \cdot \overline{(x + x \cdot y)} && \text{commutativity} \\
&= w \cdot (\overline{x}) && \text{absorption} \\
&= w \cdot \overline{x} && \text{associativity}
\end{aligned}$$

Long version: The same result, $w \cdot \overline{x}$, is reached without absorption by applying, in order: De Morgan, De Morgan, commutativity, associativity, distributivity, associativity, De Morgan, idempotence, associativity, identity, associativity (twice), distributivity, commutativity, domination, identity, associativity.

6. (b) $x$

8. (b) $x + y \cdot z$
[Circuit diagram with inputs x, y, and z.]

16. The general process that was outlined in the textbook will be used.

Step 1. There are three input variables and one output variable. Let the inputs be A, B, and M, representing the different family members. There will be one output variable: E. An input variable will have the value 1 if the person wants to go out to eat on Saturday and the value 0 if the person does not want to go out to eat on Saturday. The output variable will have the value 1 if the family eats out on Saturday and the value 0 if the family eats at home on Saturday.

Step 2. The binary function is shown below.

    A   B   M   E
    0   0   0   0
    0   0   1   0
    0   1   0   0
    0   1   1   1
    1   0   0   0
    1   0   1   1
    1   1   0   0
    1   1   1   1
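Phase 1 of Quine-McCluskey, used in Exercise 1(b) and again in Step 4 of Exercise 16, can be sketched as follows. This is a minimal illustration, not the textbook's procedure verbatim: the names `combine` and `prime_implicants` are assumptions, implicants are encoded as strings over `0`, `1`, and `-`, and the table assumes that E is 1 exactly at the rows 011, 101, and 111 (the rows listed in Phase 1 of Step 4). Phase 2, choosing a cover, is omitted.

```python
from itertools import combinations


def combine(a, b):
    """Merge two implicants that differ in exactly one position (neither a dash)."""
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(minterms):
    """Phase 1: combine implicants repeatedly; those never combined are prime."""
    current, primes = set(minterms), set()
    while current:
        used, merged = set(), set()
        for a, b in combinations(sorted(current), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))      # these get a check mark
        primes |= current - used         # unchecked entries are prime implicants
        current = merged
    return primes


# Rows of the table for E, keyed by the bit string ABM (assumed values, see above).
table = {"000": 0, "001": 0, "010": 0, "011": 1,
         "100": 0, "101": 1, "110": 0, "111": 1}
minterms = [row for row, e in table.items() if e == 1]
print(sorted(prime_implicants(minterms)))
```

Reading `1-1` as A·M and `-11` as B·M gives the two product terms of the simplified function; the same function applied to the five minterms of Exercise 1(b) yields `1--` and `-11`.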
Step 3. The binary expression in disjunctive normal form is

$$E(A, B, M) = A \cdot B \cdot M + A \cdot \overline{B} \cdot M + \overline{A} \cdot B \cdot M.$$

Step 4. Quine-McCluskey

Phase 1:

    1   111  ✓      1,2   1-1
    2   101  ✓      1,3   -11
    3   011  ✓

Phase 2:

             1        2        3
           A·B·M    A·B̄·M    Ā·B·M
    1-1      X        X
    -11      X                 X

The simplified function is: $E(A, B, M) = A \cdot M + B \cdot M$.

Step 5. The following diagram was created using Logisim.
[Circuit diagram with inputs A, B, and M and output E.]

19. (b)

    P   Q   P ∧ Q   P ↑ Q   (P ↑ Q) ↑ (P ↑ Q)
    T   T     T       F             T
    T   F     F       T             F
    F   T     F       T             F
    F   F     F       T             F

20. (b)

$$\begin{aligned}
P \vee Q \vee R &\Leftrightarrow P \vee (Q \vee R) \\
&\Leftrightarrow P \vee ((Q \uparrow Q) \uparrow (R \uparrow R)) \\
&\Leftrightarrow (P \uparrow P) \uparrow (((Q \uparrow Q) \uparrow (R \uparrow R)) \uparrow ((Q \uparrow Q) \uparrow (R \uparrow R)))
\end{aligned}$$

or

$$\begin{aligned}
P \vee Q \vee R &\Leftrightarrow (P \vee Q) \vee R \\
&\Leftrightarrow ((P \uparrow P) \uparrow (Q \uparrow Q)) \vee R \\
&\Leftrightarrow (((P \uparrow P) \uparrow (Q \uparrow Q)) \uparrow ((P \uparrow P) \uparrow (Q \uparrow Q))) \uparrow (R \uparrow R)
\end{aligned}$$

G.13 Appendices

Exercises C.3

1. It is tempting to assume that Ebenezum made the first statement and Jedediah the second. However, the problem states that I don't immediately know which statement was uttered by Ebenezum. The only implied information is that each brother made (a different) one of the statements.
Suppose that Ebenezum did make the first statement. That statement is consistent with what we know about his character. Then Jedediah must have made the second statement. Since Ebenezum never lies, Jedediah must be lying.
Suppose now that Ebenezum made the second statement. Then we must assume that Jedediah is lying, since Ebenezum always tells the truth. Jedediah has thus (falsely) claimed to always tell the truth. But we know that he doesn't always tell the truth, so his lie is consistent with the information from Ebenezum.
In either case, this must be one of Jedediah's lying days.

7. One possibility is to cause her to mentally step back and analyze the answer she would make to a particular question. Don't ask her to answer the question under consideration. Rather, have her tell how she would answer if you ever did ask it. One such question is, "If I were to ask which road leads to the capital city, which road would you tell me is the one?"
Suppose she is a knight. Then the road she points to is the same road she would point to if you were to directly ask "which road leads to the capital." On the other hand, if she is a knave, and you asked directly "which road leads to the capital," she would lie and point to the other road. However, you didn't ask her to answer the question. You asked her to predict how she would answer it. She, of course, will lie about how she would answer. The net result is she will point to the road that leads to the capital. In either case, the road that is indicated will lead to the capital city.
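The case analysis in Exercise 7 of C.3 (the knight-or-knave question about the road to the capital) can be checked by brute force. The sketch below is an illustration only: it assumes exactly two roads, labelled `"left"` and `"right"` here for convenience, and the function names `direct_answer` and `nested_answer` are hypothetical.

```python
def other(road):
    """Return the road that is not `road` (two roads assumed)."""
    return "left" if road == "right" else "right"


def direct_answer(is_knight, capital_road):
    """Road she would indicate if asked directly which road leads to the capital."""
    return capital_road if is_knight else other(capital_road)


def nested_answer(is_knight, capital_road):
    """Road she indicates when asked how she *would* answer the direct question."""
    would_say = direct_answer(is_knight, capital_road)
    # A knight reports her direct answer truthfully; a knave lies about it.
    return would_say if is_knight else other(would_say)


for is_knight in (True, False):
    for capital_road in ("left", "right"):
        assert nested_answer(is_knight, capital_road) == capital_road
print("the indicated road always leads to the capital")
```

The knave's two lies cancel, which is why all four combinations pass the assertion.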
Bibliography

[1] Robert W. Allen and Lorne Greene. The Propaganda Game. WFF 'N Proof, 1970.
[2] Leigh Atkinson. Where Do Functions Come From? The College Mathematics Journal, 33(2):107-112, March 2002.
[3] Norman Balabanian and Bradley S. Carlson. Digital Logic Design Principles. Wiley, 2001.
[4] W. W. Rouse Ball and H. S. M. Coxeter. Mathematical Recreations & Essays. Macmillan, 11th edition, 1960. First issued in 1892.
[5] William Berlinghoff. Mathematics: The Art Of Reason. Heath, 1968.
[6] Norman L. Biggs, E. Keith Lloyd, and Robin J. Wilson. Graph Theory: 1736-1936. Clarendon Press, 1976.
[7] C. Bohm and G. Jacopini. Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules. CACM, May 1966.
[8] J. A. Bondy and U. S. R. Murty. Graph Theory with Applications. Elsevier, 1976.
[9] Carl Boyer. A History of Mathematics. Wiley, 1968.
[10] John D. Bransford and Barry S. Stein. The IDEAL Problem Solver. W. H. Freeman and Company, 1984.
[11] Richard A. Brualdi. Introductory Combinatorics. North-Holland, 1977.
[12] Anne M. Burns. Persian Recursion. Mathematics Magazine, 70(3):196-199, June 1997.
[13] Daniel A. Cohen. Introduction to Computer Theory. Wiley, 1986.
[14] Charles J. Colbourn and Jeffrey H. Dinitz, editors. The CRC Handbook of Combinatorial Designs. CRC Press, Inc., 1996.
[15] David Crowdis and Brandon Wheeler. Introduction to Mathematical Ideas. McGraw-Hill, 1969.
[16] Antonella Cupillari. The Nuts and Bolts of Proof. Wadsworth, 1989.
[17] H. M. Deitel, P. J. Deitel, T. R. Nieto, T. M. Lin, and P. Sadhu. XML: How to Program. Prentice Hall, 2001.
[18] Rene Descartes. Discourse on Method and Meditations, Laurence J. Lafleur (translator). The Library of Liberal Arts. Bobbs-Merrill, 1960.
[19] Reinhard Diestel. Graph Theory. Number 173 in Graduate texts in mathematics. Springer-Verlag, 1997.
[20] Underwood Dudley. A Budget of Trisectors. Springer-Verlag, 1987.
[21] William Dunham. Journey Through Genius: The Great Theorems of Mathematics. Wiley, 1990.
[22] William Dunham. Euler: The Master of Us All, volume 22 of Dolciani Mathematical Expositions. The Mathematical Association of America, 1999.
[23] Margery G. Dunn, editor. Exploring Your World: The Adventure of Geography. The National Geographic Society, 1989.
[24] P. Erdős and A. Szekeres. A Combinatorial Problem in Geometry. Compositio Mathematica, 2:463-470, 1935.
[25] Martin J. Erickson. Introduction to Combinatorics. Wiley Interscience series in discrete mathematics and optimization, Wiley, 1996.
[26] Leonhard Euler. Introduction to Analysis of the Infinite, volume 1, translated by John Blanton. Springer-Verlag, 1988.
[27] Leonhard Euler. Letters to a German Princess. Thoemmes Press, 1998 (originally translated into English circa 1800).
[28] Howard Eves. An Introduction to the History of Mathematics, 6th ed. Saunders, 1990.
[29] Jill D. Foley. Uninsured in the United States: The Nonelderly Population without Health Insurance. Employee Benefit Research Institute, April 1991.
[30] Frederick P. Brooks, Jr. The Mythical Man-Month, 20th anniversary edition. Addison-Wesley, 1995.
[31] Quentin F. Stout and Patricia A. Woodworth. Relational Databases. American Mathematical Monthly, 90:101-118, February 1983.
[32] J. A. Gallian. Assigning Driver's License Numbers. Mathematics Magazine, 64(1):13-22, February 1991.
[33] Joseph Gastwirth. The Statistical Precision of Medical Screening Procedures: Application to Polygraph and AIDS Antibodies Test Data. Statistical Science, 2(3), 1987.
[34] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley, 1989.
[35] Ralph P. Grimaldi. Discrete and Combinatorial Mathematics, 4th ed. Addison-Wesley Longman, Inc., 1999.
[36] Jan Haliday and Peter Fuller. The Psychology of Gambling. Penguin, 1974.
[37] Marshall Hall, Jr. Combinatorial Theory, 2nd ed. Wiley-Interscience, 1986.
[38] Arthur Hallerberg. Mathematical Proof. Hafner Press, 1974 (out of print).
[39] Frank Harary. Graph Theory. Addison-Wesley, 1972.
[40] Health United States, 1990. U.S. Department of Health and Human Services, National Center for Health Statistics, 1991.
[41] James L. Hein. Discrete Structures, Logic, and Computability. Jones and Bartlett, 1995.
[42] Mark S. Hoffman, editor. The World Almanac and Book of Facts, 1992 edition. Pharos Books, 1991.
[43] Ross Honsberger. More Mathematical Morsels. Number 10 in Dolciani Mathematical Expositions. Mathematical Association of America, 1991.
[44] E. V. Huntington. Postulates for the Algebra of Logic. Transactions of the American Mathematical Society, 5:288-309, 1904.
[45] J. B. Saxe, J. L. Bentley, D. Haken. A General Method for Solving Divide-and-Conquer Recurrences. SIGACT News, 12(3):6-44, 1980.
[46] Otto Johnson, editor. 1988 Information Please Almanac, 41st ed. Information Please Almanac Atlas and Yearbook. Houghton Mifflin Company, 1987.
[47] Flavius Josephus. Josephus The Jewish War. Zondervan, 1982. Gaalya Cornfeld General Editor.
[48] Dean Kelley. Automata and Formal Languages: An Introduction. Prentice Hall, 1995.
[49] Walt Kelly. Ten Ever-Lovin' Blue-Eyed Years with Pogo. Fireside Books. Simon & Schuster, 1959.
[50] Morris Kline. Mathematical Thought from Ancient to Modern Times. Oxford University Press, 1972.
[51] Ramanujachary Kumanduri and Cristina Romero. Number Theory with Computer Applications. Prentice Hall, 1998.
[52] Alison Landes, Carol D. Foster, and Betsie B. Caldwell, editors. Homeless in America. The Information Series on Current Topics. Information Plus, 1991.
[53] Henry B. Laufer. Discrete Mathematics and Applied Modern Algebra. Prindle, Weber & Schmidt, 1984.
[54] F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes. North Holland, 1977.
[55] Mark Mandelkern. Constructive Mathematics. Mathematics Magazine, 58(5), November 1985.
[56] George Markowsky. Misconceptions about the Golden Ratio. The College Mathematics Journal, 23(1):2-19, January 1992.
[57] Philip J. Pratt and Joseph J. Adamski. Database Systems Management and Design, 3rd ed. Boyd and Fraser, 1994.
[58] Gordon Raisbeck. Information Theory: An Introduction for Scientists and Engineers. M.I.T. Press, 1963.
[59] Herbert John Ryser. Combinatorial Mathematics, volume 14 of The Carus Mathematical Monographs. The Mathematical Association of America, 1963.
[60] Peter Rob and Carlos Coronel. Database Systems: Design, Implementation, and Management, 3rd ed. Course Technology, 1997.
[61] Fred S. Roberts. Applied Combinatorics. Prentice Hall, 1984.
[62] Kenneth H. Rosen. Discrete Mathematics and Its Applications, 4th ed. McGraw-Hill, 1999.
[63] Kenneth H. Rosen, John G. Michaels, Jonathan L. Gross, Jerrold W. Grossman, and Douglas R. Shier, editors. Handbook of Discrete and Combinatorial Mathematics. CRC Press, Inc., 2000.
[64] James T. Rosenbaum and Richard Wernick. The Utility of Routine Screening of Patients with Uveitis for Systemic Lupus Erythematosus or Tuberculosis. Arch. Ophthalmol., 108(9), September 1990.
[65] Hans Sagan. Space-Filling Curves. Springer-Verlag, 1994.
[66] Dr. Seuss. On Beyond Zebra. Random House, 1955.
[67] Claude E. Shannon and Warren Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1949.
[68] Lloyd Shaw. Cowboy Dances. The Caxton Printers, Ltd., 1952.
[69] Joseph H. Silverman. A Friendly Introduction to Number Theory. Prentice Hall, 1997.
[70] Steven Skiena. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Addison-Wesley, 1990.
[71] Raymond Smullyan. What Is the Name of This Book? The Riddle of Dracula and Other Logical Puzzles. Prentice Hall, 1978.
[72] Raymond Smullyan. Alice in Puzzleland. Penguin Books, 1984.
[73] Daniel Solow. How to Read and Do Proofs. Wiley, 1982.
[74] Daniel F. Stubbs and Neil W. Webre. Data Structures with Abstract Data Types and Pascal. Brooks/Cole, 1984.
[75] People vs. Collins 68 Cal. 2d 319, 335, 438 P.2d 33, 45, 66 Cal. Rptr. 497, 507 (1968).
[76] Frank Swetz. The Nine Chapters of the Mathematical Art: An Amazing Book. Historical Notes: Mathematics through the Ages. COMAP, 1992.
[77] Richard J. Trudeau. Introduction to Graph Theory. Dover Publications, 1993.
[78] Vital Statistics of the United States, 1987, volume 1, Natality. U.S. Dept. of Health and Human Services, Public Health Service, Centers for Disease Control, National Center for Health Statistics, 1989.
[79] Karen Doyle Walton. Imagine That! A History of Imaginary Numbers. Historical Notes: Mathematics through the Ages. COMAP, 1992.
[80] Sherwood Washburn, Thomas Marlowe, and Charles T. Ryan. Discrete Mathematics. Addison-Wesley, 2000.
[81] William Waterhouse. Why Square Roots are Irrational. The American Mathematical Monthly, 93(3), March 1986.
[82] Mark Allen Weiss. Data Structures and Problem Solving Using Java. Addison-Wesley, 1998.
[83] Douglas B. West. Introduction to Graph Theory. Prentice Hall, 1996.
[84] Wayne L. Winston. Operations Research: Applications and Algorithms, 3rd ed. Duxbury Press; International Thomson Publishing, 1994.
[85] Niklaus Wirth. Data Structures + Algorithms = Programs. Prentice Hall, 1976.
[86] John W. Wright, editor. The Universal Almanac. Andrews and McMeel, Universal Press Syndicate, 1992.
Index
120
789
{A 1 , A2 ,...,
Aj} -
B, 751
(v, b, r, k, f)l-design, 446
0, 16
nCr, 218
(v, k, ;+-design, 447 (x),, 416
E, 593, A26 -,.100
nPr, 216 0, 166
-,18 2n, A9
q, A26 3, 62
7Zn, 739 'T[B1 , B 2 .
< >, 694 A', 17 Ac, 17
V, 62 F, A26 y, A26
C, A5 N, Al Q, A2
Ak, A25
e, 15
R, 96, A4
AT, A22 C(n, r), 218, 219 Cr, 218 Cn,596 C1-C3 chain, 636 G*, 633 Gn,m, 596 Hd(u, v), 474 Hw(u), 474 Hn, 597 I,, A23
i, A26 w, A26 A, A26 4:ý,41, 66 )`, 541, A26 A-transition, 557 [ ], 177 *, 40 LJ, 177 P(E), 261 f, A26 -,34
R+, 96 Z, A2 Z+, A2 0,550 *, 551 +, 552 ., 550 ?, 552 [], 549 $, 549 Zp, A6 550
Kn, 596 Kn,m, 596 k-connected, 602 k-edge-connected, 603 L(n), 422 m-ary tree, 670 n!, 120 n-ary relation, 745 n-fold repetition code, 476 n-tuple, 20 n-variable Boolean function on B), 767
v, A26 2, 168, A26 w, A26 @, 34 E,259 G. 594 4), A26 Fl, 542, A26 %P,A26 P, 164, 243, 592, A26 7r, A4, A26
\, 550 -,549 15,545,550 -,
o(S), 17
V1,A26
accepted, 531
P(n, r), 216 Pn, 216 p(n), 405 p(n, k), 405 S(n, k), 410, 413 Sn, 20
==>,45, 66, 543 p, A26 --).,38 E, 531, 541, 542, A1O, A26 E*, 541 a, A26
adaptive quadrature, 326 adjacency list, 619 adjacency matrix, 604, 642 adjacent, 593 AIDS, 298 algorithm, 153
GL, 595
i, A4
0, 15
Sn, 597
-, 34
s(n, k), 418 t-error correcting, 476 W, 596 [x], 740 N, A26 c, A26 /, A26 A, 23 (r), 218 nl, 17, 269 X, A26 X(G), 634 U, 17, 269 A, 542, A26 3, A26
C, 16 C, 16 xik i, A1O 0, 169, A26 r, A26 0, 261, A26 x, 20 T, A26 ",789 u,A26 v, 34 A, 34
E, A26 gA26 •, A26
Bj], 749
#,163
F, 33 T, 33 0-1 knapsack problem, 457 1-1, 729, 732
greedy, 458, 711
alphabet, 238, 541 alphanumeric, 227 alphanumeric-upper, 227 alternate key, 747 alternation, 551 analyzing claims, 29-30 ancestor, 669 AND, see logic, operators, AND 34 AND gate, 785 Ann Landers, 497 ANSI, 688, 689 antecedent, 38 antireflexive, 737 antisymmetric, 737 Appel, Kenneth, 632
arc, 641 initial vertex, 641 terminal vertex. 641 Archimedes, 199 argument, 13 argument form incomplete, 71 invalid, 70, 71 sound, 71 unsound, 71 valid, 70, 71 arithmetic mod p, A6 also, see number systems, integers mod p A6 arithmetic progression, 127, 332 arithmetic sequence, 127 arithmetic series, 127 ASCII, 689 assignment operator, 156 associated set, 53 associative, see field axioms A3 assumptions, 30 model specific, 260 asymmetric, 737 attribute, 694, 698 attribute set, 746 attributes, 746 automaton, see finite-state machine 531 axiom, 91 axiomatic method, 91 axioms, 53 back substitution, 334 backtracking algorithms, 724 balanced incomplete block design, 446 construction, 449 resolvable, 447 symmetric, 447 trivial, 446 balanced tree, 670 ball, 409 barber paradox, see paradox, barber 199 Barbie, 225, 283 Bayes's Theorem, see probability, Bayes's Theorem 294 benefit, 457 BIBD, 446 construction, 449 resolvable, 447 symmetric, 447 trivial, 446 Bierce, Ambrose, 74 Big-S2, 168 Big-0, 165 Big-O, 169 Big-O, 166 bijective, 729 binary, 470 also, see numbers, representation, binary A7 binary expression simplification rule, 780 binary error-correcting code, see error-correcting code 474 binary expression, 769 conjunctive normal form, 779
disjunctive normal form, 770 equivalent, 770 binary function, 768 equal, 770 binary heap, 685 binary relation, 745 binary search, 180, 358 binary search tree, 679 binary sphere, 478 binary string, 470, 531 binary tree, 670 binary variable, 769 binomial coefficient, 218 generalized, 378 Binomial Theorem, 236 Newton's, 378 bipartite graph, 596 birthday, see probability, birthday 272 bit, 470, A7 black hole, 533 blob, the, 645 block, 446 BM, see pattern matching 191 B6hm, 154 Bolyai, 91 Boolean algebra, 53 duality principle, 57 symmetric difference, 59 Boolean expression over 15,53 bound variables, 63 bounded knapsack problem, 457 Boyce Codd normal form, 762 Boyer-Moore, see pattern matching 191 breadth-first, 708 bridge, see Kbnigsberg 590 byte, 688 California Supreme Court, 274 canonical form, 770, 803 Cantor's diagonalization proof, 200 capacity, 457 Capulet, Juliet, 530 cardinality, 17 cards, 219 Cartesian product, 20 Cayley's formula, 710 Cayley, arthur, 632 ceiling function, 177, 209 cell, 409 center, 676 chain, 636 change of base, 172 channel, 522 capacity, 529 character data, 694 characteristic equation, 340 check bit, 471 child, 669 Chomsky hierarchy, 571 choose method, 25, 108 chromatic number, 634 Church, Alonzo, 576 circuit, 600 combinatorial, 785 sequential, 785 circuits logic, 788
circular reasoning, see logic, informal fallacies, circular reasoning 31 class representative, 740 clique, 599 clock arithmetic, A6 closed, 600, 643 closing tag, 694 closure of a set, 738 clubs, 219 Codd, E. F., 746 code word, 471, 474 coding theory, see error-correcting codes 470, 529 coefficient matrix, 342, A23 coloring, 634 combination, 213 combinations, see counting, combinations 218 combinations with repetition, see counting, combinations with repetition 221 combinatorial circuit, 785 creating, 788 combinatorial design, 404 combinatorial proof, 234 Combinatorica, 598,617,621 comments in pseudocode, 163 communication system continuous, 523 discrete, 523 mixed, 523 commutative, see field axioms A3 compass, 199 compiler, 521, 573 complement, see set, complement 15, 450 also, see probability, definitions, complement of an event 259 complement of a language, 570 complement of a simple graph, 594 complete bipartite graph, 596 complete graph, 596 complete induction, 124 complete ordered field, A4 complete ordering, 737 complete tree, 670 completeness, A4 complex numbers, see number systems, complex numbers A4 complex plane, A5 complexity NP, 621 NP-complete, 621 P, 621 component, 601 composite, 98 concatenate, 531,544 concatenation, 551 conclusion, 38, 69, 70 conditional probability, see probability, definitions, conditional probability 266 conditional statement, 41 congruence class, 740 congruent, 100, 740 conjunction, 34 conjunctive normal form, 779 connected, 601 connectivity, 602 consequent, 38 Consistent, 92
consonants, 261 constant coefficients, 338 constraints, 457 containers, 409 context-free grammar, 571 context-sensitive grammar, 571 contingency, 41 continuous, 2 contradiction, 41 contrapositive, 42, 43 Contrapositive, 48 contrapositive, 70, 83 Contrapositive, 86 contrapositive, 104 converges to 00, 173, 208 converse, 42, 43, 83 convex set, 16 corollary, 93 countably infinite, 2 counterexample, 28, 109 counting, 213-232 combinations, 218-221 combinations with repetition, 221-224 formulas, 214, 215, 224 permutations, 216-217 permutations with repetition, 217-218 cover, 782 cut edge, 602 cut vertex, 602 cycle, 600 cycle graph, 596
degree of a vertex, 593 degree sequence, 620 dense, A3, A4 depth of a node, 669 depth-first, 707 dequeue, 686 derivable, 543 derivation, 543 derivative of a generating function, 377 derived design, 450 derived implications, see logic, derived implications 41 descendant, 669 destination, 522 deterministic, 256 deterministic finite automaton, 557 diamonds, 219 difference, see set, difference 15 difficult, 199 digraph, 641, see directed graph 641 Dijkstra's shortest path algorithm, 645 Dijkstra, Edsger, 645 Direct proof, 102 directed circuit, 643 directed graph directed circuit, 643 directed multigraph, 641 directed walk, 643 simple, 641 strongly connected, 643 tournament graph, 653 weakly connected, 643
edge list, 619 efficiency, 476 element, see set, element 15, 694 elementary number theory, 243 elementary subdivision, 626 ELISA, 298 ellipsis, 15 embedding, 617 empirical probability, see probability, calculating 261 empty set, see set, empty set 15 encode, 471,473, 522 encoding, 529 end tag, 694 endpoints, 600, 643 enigma machine, 200 enqueue, 686 enumeration, 403 equally likely outcomes, see probability, definitions, equally likely outcomes 260 equivalence, see logic, operators, biconditional 39 equivalence class, 740 class representative, 740 equivalence relation, 736, 739 equivalent binary expressions, 770 equivocation, see logic, informal fallacies, equivocation 31 Erdds, Paul, 107, 239 erf(x), 769 error function, 769
database hierarchical, 746 network, 746 relational, 746 attribute set, 746 foreign key, 759 functionally dependent, 747, 751 lossless decomposition, 752 natural join, 749 nonkey attribute, 747 normal forms, 754 normal form, 762 primary key, 747 terminology comparison, 747 attributes, 746 decomposition, 752 fields, 746 file, 746 join, 749 key, 747 projection, 749 record, 746 relation, 746 table, 746 tuple, 746 De Bello Judaico, 383 de Morgan, Augustus, 632 decode, 471, 473 decomposition, 752 deductive reasoning, 94 Deferred Acceptance, 6 Deferred Acceptance Algorithm, 5, 129, 212 definition, 30, 92 deg(v), 593 degree of a region, 638
directed multigraph, 641 directed walk, 643 closed, 643 endpoints, 643 length, 643 Dirichlet drawer principle, 238 Dirichlet, Peter Gustav Lejeune, 728 discrete, 1, 257 discrete mathematics, 2 disjoint, see set, disjoint 15 disjunction, 34 disjunctive normal form, 770 distinct representatives, 484 distributive, see field axioms A3 divide-and-conquer, 358 divided differences, 415 divine proportion, A20 divisible, 96 divisor, 214 dodecahedron, 611 DOM, 702 domain, 729, 732 dominoes, 226 Dr. Seuss, 238 draft, 263 DTD, 699 dual graph, 633 duality, 57, 428 duality principle, 57, 428 dynamic programming, 459
error-correcting code, 470, 474 Hamming code, 480, 516 perfect, 479 repetition code, 476, 480, 516 essential dependency, 755 estimation, 266 J Euclid's Elements, 91 Euclidean Division Algorithm, 96, 99, A2 Euler 0, 164, 243 Euler circuit, 608 Euler gamma functions, 337 Euler totient function, 164, 243 Euler trail, 608 Euler's 36 officers problem, 427 Euler's formula, 629 Euler, Leonhard, 8, 73, 121,405, 427,590, 732 even, 96 event, see probability, definitions, event 258 4 excitatory input, 576 exclusive OR, 34 existence, 403 existential quantifier, 107 expected value, see probability, expected value 285 also, see probability, definitions, expected value 287 external node, 673, 675
eccentricity, 676 edge elementary subdivision, 626 edge connectivity, 603
face, 628 face card, 219 face value, 219 factor, A2 factorial, 120 fair game, see probability, definitions, fair game 290
fallacy, see logic, formal fallacies 70 also, see logic, informal fallacies 70 gambler's, 268 Fallacy of affirming the consequent, 51 Fallacy of denying the antecedent, 51 falling factorial, 416 false, 33 Fano plane, 429 favorable outcome, see probability, definitions, favorable outcome 258 Fibonacci, 675 Fibonacci sequence, 134, 332, 344 field, A3 field axioms, A3 fields, 746 file, 746 finite automaton, 531 finite induction, 118 finite projective plane, 428 and mutually orthogonal Latin squares, 433 duality principle, 428 Fano plane, 429 order n, 432 finite-state machine, 9,530 deterministic finite automaton, 557 finite automaton, 531 non-deterministic finite automaton, 557 with output, 535 first normal form, 754 floor function, 177 foreign key, 759 formal logic, see logic, formal 14 formulas counting, see counting, formulas 213 probability, 274, 307 Four Color Theorem, 107 fractions, see number systems, rational numbers A3 free variables, 63 "Freshman Theorem", 116 full m-ary tree, 670 function, 729 1-1,729 bijective, 729 binary, 768 Boolean, 766, 767 domain, 729 image, 729 injective, 729 inverse, 731 one-to-one, 729 onto, 729 range, 729 surjective, 729 functionally complete, 789 functionally dependent, 747, 751 Fundamental Theorem of Algebra, A6 Fundamental Theorem of Arithmetic, 98, 114, 125, A2 gambler's fallacy, see fallacy, gambler's 268 gambling, 293 Gamma functions, 337 gate, 784 AND, 785 NAND, 789
NOR, 789 NOT, 785 OR, 785 gates, 37 Gauss, 91 gcd, 97, 111 calculating, 98, 111, 143, 331 generalization improper, see logic, informal fallacies, inappropriate generalization 31 generating function, 372 derivative, 377 geometric progression, 126-128 geometric sequence, 126 geometric series, 127 golden ratio, A20 Golden Rectangle, A20 golden section, A20 grammar, 571 graph, 591, 592 adjacency list, 619 adjacency matrix, 604, 642 bipartite, 596 chromatic number, 634 circuit, 600 clique, 599 complement, 594 complete bipartite, 596 complete graph, 596 component, 601 connected, 601 cycle, 596, 600 degree sequence, 620 digraph, 641, see directed graph 641 directed graph, see directed graph 641 dual, 633 edge, 591, 592 edge list, 619 embedding, 617 grid, 596 Grotztsch, 639 Heawood, 639 Herschel, 611 homeomorphic, 626 hypercube, 597 icosahedron, 639 incidence matrix, 617 independent set, 599 induced subgraph, 594 isomorphic, 619 line, 595 loop, 592 minimal spanning tree, 711 multigraph, 593 path, 600 Petersen, 626 planar, 624 regular, 593, 629 simple, 591, 592 simple directed, 641 spanning tree, 705 star, 597 subgraph, 594 trail, 600
Tutte, 615 underlying simple graph, 593 vertex, 591, 592 walk, 600 Walther, 639 wheel, 596 graph theory, 591 graphic, 620 greatest common divisor, 97 greedy algorithm, 458, 711 Greek alphabet, A26 grid graph, 596 Grotztsch graph, 639 Guthrie, Francis, 632 Guthrie, Frederick, 632 G6del, 103 Haken, Wolfgang, 632 Hall's marriage theorem, 486 halting problem, 198-200 Hamilton cycle, 611 Hamilton path, 611 Hamilton, William Rowan, 611,632 Hamming Code, 470 Hamming code, 480, 516 Hamming distance, 474 Hamming weight, 474 Handshake Theorem, 594, 652 Hanoi, see Tower of Hanoi 333 Hasse diagram, 738 heap, see binary heap 685 heap sort, 686 hearts, 219 Heawood graph, 639 Heawood, Percy, 632 Hegesippus, 383 height of a tree, 670 Herschel graph, 611 heuristic, 464 hexadecimal, 555, see numbers, representation, hexadecimal A7 Hierholzer, Carl, 609 Hilbert, 91, 323 homeomorphic, 626 homogeneous, 338 HTML, 554, 693 Hyper-Text Markup Language, 554 hypercube, 597 hypotenuse, A4 hypothesis, 38, 70, 94 icosahedron graph, 639 icosian game, 611 identity, see field axioms A3 identity matrix, A23 if ... then .... see logic, operators, implication 38 if and only if, 40 iff, 26 image, 729 imaginary numbers, see number systems, complex numbers A5 implication, see logic, operators, biconditional 38, see logic, operators, implication 38 implies, see logic, operators, implication 38 impossible, 199 in, see set, element 15
Index in-order traversal, 677 incidence function, 592 incidence matrix, 617 of a BIBD, 446 incident, 593 inclusion-exclusion, 241, 242 inclusive OR, 34 indegree, 642 indegree sequence, 653 independent, 92 independent events, see probability, definitions, independent events 267 independent set, 599 independent tasks or choices, see counting, formulas 213 index set, 19 index variable, see summation notation A 10 Indirect proof, 48, 86 induce, 740 induction, see mathematical induction 117 inductive reasoning, 94 inference, 45 infinite set, 17 infix, 556, 678 informal logic, see logic, informal 14 information, 528, 576 information processor, 576 information source, 522 inhibitory input, 576 initial vertex, 641 injective, 729 input string, 531 insanity, 37 integer, see number systems, integers A2 integers, see number systems 95 interior node, 670 interpolation search, 180 intersection, see set, intersection 15, 19,269 Intuitionism, 107 invalid, see argument form, invalid 70 inverse, 42, 43, 83 also, see field axioms A3 function, 731 relation, 733 inverse matrix, A25 inverter, 785 invited inference, see logic, informal fallacies, invited inference 30 irrational number, 96 irrational numbers, A4 island of knights and knaves, A12, A13, A16, A18 isomorphic, 619 isomorphism invariant, 620 iteration, see structured control 158 Jacopini, 154 join, 749 Jordan curve theorem, 637 Josephus, 384 Josephus problem, 383 Kaliningrad, see Kdnigsberg 590 Karnaugh map, 781 Kempe, Alfred, 632 key, 747 alternate, 747
foreign, 759 primary, 747 Kirchoff, 709 Kirkman's schoolgirls, 9 Kleene closure, 544, 551, 584 Kleene's theorem, 562 Kleene, Stephen Cole, 544 KMP, see pattern matching 188 knapsack problem, 457 0-1,457 bounded, 457 unbounded, 457 knave, see island of knights and knaves A12 knight, see island of knights and knaves A12 Knuth-Morris-Pratt, see pattern matching 188 Kuratowski's Theorem, 627 Kuratowski, Kazimierz, 627 Kbnigsberg, 8, 590 L(9), 543 Lagrange interpolation, 327 lake Baikal, 30 Caspian Sea, 30 Superior, 30 Lambert, Johann Heinrich, A4 language generated by a grammar, 543 language over 1, 541 complement, 570 regular, 543 Latin square, 421 and finite projective planes, 433 mutually orthogonal, 424 orthogonal, 424 self-orthogonal, 443 standardized, 421 Law of hypothetical syllogism, 48, 86 lcm, 97 leaf, 670 leaf node, 670 least common multiple, 97 left child, 671 lemma, 93 level of a node, 669 level-order traversal, 686 lexicographical, 679 limit of a sequence, 173 limit point, A4 line, 428, 591, 592 line graph, 595 linear, 338 linear code, 474 linear combination, 415 linear homogeneous recurrence relation with constant coefficients of degree k, 339, 397 linear homogeneous recurrence relations with constant coefficients, 338 linear search, see sequential search 179 Lobachevsky, 91 logarithmic functions, 171 logic, 13 analyzing claims, 29 argument form, see argument form 70 conditional statement, 41 contingency, 41 contradiction, 41
derived implications, 41-43 formal, 14 formal fallacies affirming the consequent, 70 denying the antecedent, 70 informal, 14 informal fallacies, 30-32 appeal to authority, 30 circular reasoning, 31 confusing the whole and the parts, 31 equivocation, 31 inappropriate generalization, 31 incorrectly using averages, 31 invited inference, 30 no one knows, so I must be right, 31 nonsequitur, 31 shifting the focus, 30 Using rules in an inappropriate context, 30 operators, 33, 39, 269 functionally complete, 789 AND, 34-35 biconditional, 39-40 conjunction, 34 disjunction, 34 equivalence, 41 exclusive OR, 34 implication, 38-39, 42, 43, 83 inclusive OR, 34 logical equivalence, 41 NAND, 37 NOR, 37 NOT, 34-35 OR, 34-35 precedence, 40 XOR, 34 predicate, 33 propositional, 33 puzzles, see puzzles, logic A12 syllogistic, 14, 69 symbolic, 14 tautology, 41 logic circuits, 788 logic gate, 784 logic operator, see logic, operators 33 logic puzzles, see puzzles, logic A12 logical fallacies, 30 also, see logic, formal fallacies 30 and, see logic, informal fallacies 30 logical inconsistencies, 30 logically equivalent, 41 Logicism, 103 Longfellow, Henry Wadsworth, 530 loop, 592 lossless decomposition, 752 lotteries, see probability, sweepstakes and lotteries 288 Madeline, 9 magic square, 403 main diagonal, A22 Mantoux skin test, see tuberculosis 294 Marriage Problem, 484 also, see Stable Marriage Problem 484 marriage theorem, 486 material implication, 39 mathematical induction, 117, 131
complete induction, 124, 131 finite, 118 strong, 125 weak, 118 mathematical models probability, 258, 263 matrix, A22 addition, A24 coefficient, A23 identity, A23 inverse, A25 main diagonal, A22 multiplication, A24 power, A25 square, A22 transpose, A22 zero, A23 max, 116 maximal, 613 maximal complete tree, 670 maximal spanning tree, 716, 724 maxterm, 779 McCulloch, 576 McKay, Brendan, 497 meaning, 522 member, see set, member 15 Merchant of Venice, A15 merge sort, 360 message, 471, 522 metacharacter, 548, 549 min, 116 minimal spanning tree, 711 minimum distance of a code, 474 minterm, 769 Miss Clavel, 9 Miyazaki, Hayao, 732 mod, 100 Modus ponens, 47 Monopoly, 540 monotone, 239 Montague, Romeo, 530 Monty Hall, see probability, Bayes's Theorem 296 multigraph, 593 multinomial, 230, 231, 237 Multinomial Theorem, 237 mutually exclusive events, see probability, definitions, mutually exclusive events 259 mutually exclusive tasks or choices, see counting, formulas 213 mutually orthogonal Latin squares, 424 NAND gate, 789 nanosecond, 172 natural join, 749 natural numbers, see number systems 95, see number systems, natural numbers A1 necessary and sufficient, 40 necessary and sufficient condition, 110 negation, 35, 70 nesting, see structured control 159 neural net, 576 neuron, 576 Newton, Isaac, 410 Noam Chomsky, 571 node, 668
noise, 523 non-Euclidean geometry, 92 nondecreasing function, 363 nondeterministic finite automaton, 557 nonkey attribute, 747 nonnegative integers, see number systems, natural numbers A2 nonsequitur, see logic, informal fallacies, nonsequitur 31 nonterminal symbol, 542 NOR gate, 789 normal forms first, 754 second, 754 third, 754 NOT, see logic, operators, NOT 34 NOT gate, 785 null character, 541 null string, 541 number systems, A1-A9 complex numbers, A4-A6 integers, 95, A2 integers mod p, A6 natural numbers, 95, A1-A2 rational numbers, 96, A2-A4 derivation, 741, 742 real numbers, A4 number theory, 243 numbers representation, A7-A9 base, A7 binary, A7-A9 hexadecimal, A7-A9 octal, A7-A9 place value, A7 objects, 409 obvious algorithm, see pattern matching 187 occupancy problems, 409 octal, see numbers, representation, octal A7 odd, 96 odds, see probability, definitions, odds 288 one-to-one, 729, 732 one-to-one function, 592 onto, 729, 732 opening tag, 694 operations research, 459 optimal, 129 optimization, 404, 458 OR, see logic, operators, OR 34 OR gate, 785 order, 213 axioms, A4 ordered tree, 671 orthogonal Latin squares, 424 outcome, see probability, definitions, outcome 257 outdegree, 642 outdegree sequence, 653 output function, 535 package, 598 paper, scissors, rock, 258, 786 paradox, 206 barber, 199 parallel, 784 parent, 669
parse tree, 687 parser, 701 parsing, 573 partial ordering, 737 partially ordered set, 737 partition, 19, 739 of an integer, 405 Pascal's Theorem, 234 Pascal's Triangle, 310 path, 600 pattern matching, 186 BM (Boyer-Moore), 191 KMP (Knuth-Morris-Pratt), 188 obvious algorithm, 187 patterns, 256 Peano, 323 perfect code, 479 Perl, 553 permutation, 213 permutation of a set, 421 permutations, see counting, permutations 216 permutations with repetition, see counting, permutations with repetition 217 Persian rugs, 320, 362 Petersen graph, 626 phrase-structured grammar, 571 pigeon-hole principle, 238, 239 pipe symbol, 545 Pitts, 576 planar graph, 624 Platonic solids, 630 Playfair's postulate, 91 playing cards, 219 point, 428 polyhedron, 628 pop, 572 Portia, A15-A16 poset, 737 Hasse diagram, 738 possible, 129 post-order traversal, 677 postfix, 678 postulate, 91 power series expansion, 374 power set, 20 powers of 2, A9 pre-order traversal, 677 precedence, 40, 552 predecessor, A1 Predicate, 62 prefix property, 689 Pregel river, see Königsberg 590 premise major, 69 minor, 69 primary key, 747 prime, 98, 110, A2 twin, 143 Principia Mathematica, 103 priority queue, 689 probability Bayes's Theorem, 294-297 birthday, 272 calculating, 261-263, 269-274 computation formulas, 274, 307 definitions, 257-261, 266-269 complement of an event, 259
conditional probability, 266 equally likely outcomes, 260 event, 258 expected value, 287 fair game, 290 favorable outcome, 258 independent events, 267 mutually exclusive events, 259 odds, 288 of outcomes and events, 260 outcome, 257 random experiment, 257 random variable, 286 sample space, 257 value of an outcome, 285 expected value, 285-291 sweepstakes and lotteries, 288-291 probability distribution, 256 probability of outcomes and events, see probability, definitions, of outcomes and events 260 problem solving techniques divide and conquer, 269 processing instructions, 694 production, 542 projection, 749 projective plane, see finite projective plane 428 proof, 90, 94, 102 cases, 106 combinatorial, 234 constructive, 107 contradiction, 105, 107 counterexample, 109 direct, 102 indirect, 104 strategy summary, 138 trivial, 102 vacuous, 102 with sets, 24 Proof by contradiction, 48, 86 proper coloring, 634 proposition, 33, 92 pseudocode, 154 push, 572 pushdown automaton, 571, 572 puzzles logic, 74, A12-A19 Pythagorean theorem, A3 Pythagorean triple, 98 QED, 93 Quantifier, 62 queue, 686, 689 dequeue, 686 enqueue, 686 Quine-McCluskey algorithm, 781, 783 quotient, A2 Radziszowski, Stanislaw, 497 Ramsey condition (j, k), 492 (j, k; m), 494 Ramsey number, 490 R(j, k), 492 R(j, k; m), 494 R(k1, k2, ..., kn; m), 496
Ramsey, Frank, 490 random, 256 random experiment, see probability, definitions, random experiment 257 random variable, see probability, definitions, random variable 286 range, 729, 732 rate birth, 279 rational numbers, see number systems, rational numbers A2 Rational Roots Theorem, 346 real number line, A4 real numbers, see number systems, real numbers A4 received message, 523 received signal, 523 receiver, 523 recognize, 531 record, 746 recurrence relation, 332 recurrence relations, 331 recursion, 313 redundant, 318 tail-end, 317 recursive, 313, 331 Reductio ad absurdum, 48, 86 redundancy, 470 reference functions, 170 reflexive, 736 reflexive closure, 738 region, 629 regular expression, 547, 548, 552 precedence, 552 regular grammar, 542, 571 regular graph, 593, 629 regular language, 543 regular polyhedron, 629 regular set, 548 relation, 732, 746 n-ary, 745 1-1, 732 antireflexive, 737 antisymmetric, 737 asymmetric, 737 binary, 745 complete ordering, 737 domain, 732 equivalence relation, 736, 739 inverse, 733 join, 749 one-to-one, 732 onto, 732 partial ordering, 737 poset, 737 projection, 749 range, 732 reflexive, 736 reflexive closure, 738 symmetric, 736 symmetric closure, 738 ternary, 745 transitive, 736 transitive closure, 738 relational database, see database, relational 746 relatively prime, 243
religions, 277 remainder, A2 repeat-until, see structured control 159 repeating decimal, 245, A4 repetition, see structured control 158, 213, 217 repetition code, 476, 480, 516 replacement, 213 replication, 449 residual design, 451 Revere, Paul, 530 reverse Polish, 678 Rhind Papyrus, 212 right child, 671 risk (game probabilities), 281-282 rock, paper, scissors, 258, 786 Romeo and Juliet, 530 root, 669 root node, 669 rooted tree, 669 Russell, 103 Russell's paradox, see paradox, barber 199 Russell, Bertrand, 199 rxp, 701 Sacagawea, 225 Saccheri, 105 safety constant, 329 sample, 529 sample space, see probability, definitions, sample space 257 Sandbox, 598 Scheherazade, 388 schema, 699 schoolgirls, see Kirkman's schoolgirls 9 scientific theory, 94 scissors, paper, rock, 258, 786 second normal form, 754 selection, see structured control 155 self-orthogonal Latin square, 443 semantic, 522 sentence, 541 sequence, see structured control 154 sequential circuit, 785 sequential search, 179 serial, 784 set, 15-21 cardinality, 17 Cartesian product, 20 closure, 738 complement, 17 difference, 18 disjoint, 18 element, 15 empty set, 16 equal, 16 index set, 19 infinite, 17 intersection, 17, 19 member, 15 partially ordered set, 737 power set, 20 proofs, 24 proper subset, 16 subset, 16 symmetric difference, 23 union, 17, 19, 269 universal set, 15
Venn diagram, 16 set-builder notation, 15 Seuss, see Dr. Seuss 238 Shakespeare, William, 530, A15 Shannon, Claude E., 521, 784 Shirley Temple Black, 732 short division, see Euclidean Division Algorithm A2 sibling, 669 Sierpinski curves, 7, 323 signal, 523 simple directed graph, 641 simple graph, 591, 592 simplification rule, 780 Simpson's Rule, 327 simulation, 534 single-variable Boolean function on H, 766 Smullyan, Raymond, A12 sorting merge sort, 360 heap sort, 686 source code, 573 spades, 219 spanning tree, 705 sphere binary, 478 ternary, 483 square matrix, A22 St. Ives, 212 stable assignment, 4 stable marriage problem, 3 also, see marriage problem 3 stack, 572 pop, 572 push, 572 standard deck of cards, 219 standard ordering, 36 standardized, 421 star graph, 597 start symbol, 542 start tag, 694 state, 530 state diagram, 531 state table, 531 statement, 33 Stirling numbers of the first kind, 418 of the second kind, 410, 413 Stirling, James, 410 straightedge, 199 strategy, 102 string, 470, 531 strong induction, 125 strongly connected, 643 structured control, 154 fixed iteration, 158 indefinite iteration, 159 repeat-until, 159 while, 159 nesting, 159 repetition, 158 selection, 155 sequence, 154 structured programming, 160 subgraph, 594 subjective probability, see probability, calculating 262
subset, see set, subset 15 Substitution Principles, 45, 86 subtree, 669 successor, A1 suits, 219 summation notation, A10-A11 surjective, 729 survey, 256 sweepstakes, see probability, sweepstakes and lotteries 288 syllogistic logic, see logic, syllogistic 14 symbol, 541 symbolic logic, see logic, symbolic 14, 33-35, 38-40, 72 symmetric, 736 symmetric closure, 738 symmetric difference, 23, 59 system of distinct representatives, 484 Szekeres, G., 239 table, 746 tag, 694 tautology, 41, 45 TB, see tuberculosis 294 terminal symbol, 542 terminal vertex, 641 ternary sphere, 483 ternary tree, 670 The 36 officers problem, 427 theorem, 38, 92 theoretical probability, see probability, calculating 261 third normal form, 754 Tietäväinen, Aimo, 479 Tom Wilson, 38 tour, see Königsberg 590 tournament graph, 653 Tower of Hanoi, 333, 373, 376 trail, 600 transition function, 531, 535 transitive, 736 transitive closure, 738 transmitter, 522 transpose, A22 traversal, 676 tree, 668 m-ary, 670 ancestor, 669 balanced, 670 binary, 670 binary search, 679 center, 676 child, 669 complete, 670 depth, 669 descendant, 669 eccentricity, 676 external node, 673, 675 full m-ary, 670 height, 670 interior node, 670 leaf, 670 leaf node, 670 left child, 671 level, 669 maximal complete, 670 maximal spanning tree, 716
node, 668 ordered, 671 parent, 669 parse tree, 687 right child, 671 root, 669 root node, 669 rooted, 669 sibling, 669 spanning, 705 subtree, 669 ternary, 670 traversal, 676 in-order, 677 level-order, 686 post-order, 677 pre-order, 677 triangle inequality, 174 trisect, 199 trisectors, 199 trivial solution, 339 true, 33 truth table, 33-34, A12 standard ordering, 36 tuberculosis, 294 tuple, 20, 746 Turing machine, 199, 571, 573 Turing, Alan, 573, 576 Tutte graph, 615 Tutte, William, 615 twin primes, 143 two-column format/proof, 49, 118 twos complement, 540 unattached, 5 unbounded knapsack problem, 457 uncertainty, 256 undefined term, 91 underlying graph, 641 underlying simple graph, 593 union, see set, union 15, 19 Universal quantifier, 62 universal quantifier, 108 universal set, see set, universal set 15 universe of discourse, 62 unrestricted grammar, 571 urn, 409 utility, 457 utility graph, 625 valid, see argument form, valid 70 validating parser, 701 value of an outcome, 285 Van Lint, J. H., 479 Vandermonde's Matrix Theorem, 342 Vandermonde's Theorem, 235 Venn diagram, see set, Venn diagram 15, 241 vertex, 591, 592 indegree, 642 initial, 641 outdegree, 642 terminal, 641 Vespacian, 384 viable, 5 vocabulary, 541 void suit, 246 volume, 457
vowels, 261 walk, 600 closed, 600 endpoints, 600 length, 600 Walther graph, 639 Wantzel, Pierre Laurent, 199 weak induction, 118 weakly connected, 643 Weaver, Warren, 528 weighted graph, 645 well defined, 742 well-formed, 696 Well-Ordering Principle, 98, 131, A1
wheel graph, 596 while, see structured control 159 Whitehead, 103 whitespace, 553, 695 wlog, 104 word, 541 XML, 693 attribute, 694, 698 character data, 694 DOM, 702 element, 694 parser, 701 processing instructions, 694 tag, 694
closing tag, 694 end tag, 694 opening tag, 694 start tag, 694 validating parser, 701 well-formed, 696
zahlen, A2 zebra, 238 zero divisor, 57 zero matrix, A23 zero product principle, A2 Ziggy, 38
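Identity: $A \cup \emptyset = A$; $A \cap U = A$
Associativity: $(A \cup B) \cup C = A \cup (B \cup C)$; $(A \cap B) \cap C = A \cap (B \cap C)$
De Morgan's Laws: $\overline{A \cup B} = \overline{A} \cap \overline{B}$; $\overline{A \cap B} = \overline{A} \cup \overline{B}$
Commutativity: $A \cup B = B \cup A$; $A \cap B = B \cap A$
Distributivity ($\cap$ over $\cup$): $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$; $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
Distributivity ($\cup$ over $\cap$): $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$; $(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$
Complement: $A \cup \overline{A} = U$; $A \cap \overline{A} = \emptyset$
Complement (continued): $\overline{\emptyset} = U$; $\overline{U} = \emptyset$; $\overline{\overline{A}} = A$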
Boolean Algebra Axioms
Identity: There exist distinct elements, 0 and 1, in B such that for every $x \in B$, $x + 0 = x$ and $x \cdot 1 = x$.
Complement: For every $x \in B$, there exists a unique element $\overline{x} \in B$ such that $x + \overline{x} = 1$ and $x \cdot \overline{x} = 0$.
Commutativity: For every pair of (not necessarily distinct) elements $x, y \in B$, $x + y = y + x$ and $x \cdot y = y \cdot x$.
Distributivity: For every three elements $x, y, z \in B$ (not necessarily distinct), $x \cdot (y + z) = x \cdot y + x \cdot z$ and $x + y \cdot z = (x + y) \cdot (x + z)$.
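As a quick sanity check, the four Boolean algebra axioms can be verified exhaustively on the smallest example. The sketch below assumes the two-element algebra B = {0, 1}, with + modeled as bitwise OR, · as bitwise AND, and the complement as 1 - x. The function names (`add`, `mul`, `comp`, `check_axioms`) are illustrative choices and do not come from the text.

```python
from itertools import product

B = (0, 1)  # assumed carrier set: the two-element Boolean algebra


def add(x, y):
    """Boolean sum, modeled here as bitwise OR."""
    return x | y


def mul(x, y):
    """Boolean product, modeled here as bitwise AND."""
    return x & y


def comp(x):
    """Complement of x in {0, 1}."""
    return 1 - x


def check_axioms():
    """Check each axiom over every combination of elements of B."""
    identity = all(add(x, 0) == x and mul(x, 1) == x for x in B)
    complement = all(
        add(x, comp(x)) == 1 and mul(x, comp(x)) == 0 for x in B
    )
    commutativity = all(
        add(x, y) == add(y, x) and mul(x, y) == mul(y, x)
        for x, y in product(B, repeat=2)
    )
    distributivity = all(
        mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
        and add(x, mul(y, z)) == mul(add(x, y), add(x, z))
        for x, y, z in product(B, repeat=3)
    )
    return {
        "identity": identity,
        "complement": complement,
        "commutativity": commutativity,
        "distributivity": distributivity,
    }


if __name__ == "__main__":
    for name, holds in check_axioms().items():
        print(f"{name}: {holds}")
```

Exhaustive checking works here only because B is finite and tiny; for a general Boolean algebra the axioms are assumptions, not something to be computed.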
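Arranging r elements from a set containing n distinct elements
Without order, without repetition: $C(n, r) = \frac{n!}{r!\,(n-r)!}$
With order, without repetition: $P(n, r) = \frac{n!}{(n-r)!}$
Without order, with repetition: $C(n + r - 1, r) = \frac{(n+r-1)!}{r!\,(n-1)!}$
With order, with repetition: $n^r$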
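The four ways of arranging r elements from n distinct elements (with or without order, with or without repetition) can be compared against brute-force enumeration. This is a minimal sketch assuming n ≥ 1 and 0 ≤ r. The function names `arrangements` and `enumerate_arrangements` are hypothetical; `math.comb`, `math.perm`, and the `itertools` generators are standard library calls.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)
from math import comb, perm


def arrangements(n, r, ordered, repetition):
    """Closed-form count of ways to arrange r of n distinct elements."""
    if ordered and repetition:
        return n ** r                 # n^r
    if ordered:
        return perm(n, r)             # P(n, r) = n! / (n - r)!
    if repetition:
        return comb(n + r - 1, r)     # C(n + r - 1, r)
    return comb(n, r)                 # C(n, r) = n! / (r! (n - r)!)


def enumerate_arrangements(n, r, ordered, repetition):
    """Count the same arrangements by listing them explicitly."""
    items = range(n)
    if ordered and repetition:
        gen = product(items, repeat=r)
    elif ordered:
        gen = permutations(items, r)
    elif repetition:
        gen = combinations_with_replacement(items, r)
    else:
        gen = combinations(items, r)
    return sum(1 for _ in gen)


if __name__ == "__main__":
    n, r = 5, 3
    for ordered in (False, True):
        for repetition in (False, True):
            print(ordered, repetition,
                  arrangements(n, r, ordered, repetition),
                  enumerate_arrangements(n, r, ordered, repetition))
```

The enumeration grows quickly with n and r, so it is useful only for checking small cases against the formulas.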
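The number of ways to place n objects into k containers
Here S(n, k) is a Stirling number of the second kind and p(n, k) is the number of partitions of the integer n into exactly k parts.
Distinct objects, distinct containers. Containers may be empty: $k^n$. Containers must contain at least one object: $k!\,S(n, k)$.
Identical objects, distinct containers. Containers may be empty: $C(n + k - 1, n)$. Containers must contain at least one object: $C(n - 1, k - 1)$.
Distinct objects, identical containers. Containers may be empty: $\sum_{i=1}^{k} S(n, i)$. Containers must contain at least one object: $S(n, k)$.
Identical objects, identical containers. Containers may be empty: $\sum_{i=1}^{k} p(n, i)$. Containers must contain at least one object: $p(n, k)$.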
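The occupancy counts for distinct objects can be checked numerically. The sketch below computes Stirling numbers of the second kind from the standard recurrence S(n, k) = k·S(n-1, k) + S(n-1, k-1), then compares k!·S(n, k) with a brute-force count of placements of n distinct objects into k distinct containers that leave no container empty. The function names are illustrative and not taken from the text.

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind, via the standard recurrence."""
    if n == k:
        return 1          # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def onto_placements(n, k):
    """Distinct objects, distinct containers, no container empty."""
    return factorial(k) * stirling2(n, k)


def brute_force_onto(n, k):
    """Count assignments of n objects to k containers that use every container."""
    return sum(
        1
        for assignment in product(range(k), repeat=n)
        if len(set(assignment)) == k
    )


def all_placements_via_stirling(n, k):
    """Recover k^n by choosing which i containers are used, then filling them."""
    return sum(comb(k, i) * onto_placements(n, i) for i in range(k + 1))


if __name__ == "__main__":
    n, k = 4, 3
    print(stirling2(n, k), onto_placements(n, k), brute_force_onto(n, k))
    print(all_placements_via_stirling(n, k), k ** n)
```

The last function illustrates how the "may be empty" and "at least one object" columns are related: every placement uses some nonempty subset of the containers.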
"SomeUseful Generating Functions Summation Notation
Summation notation | G(z) | Expanded notation
$\sum_{k=0}^{\infty} z^k$ | $\frac{1}{1-z}$ | $1 + z + z^2 + z^3 + \cdots$
$\sum_{k=0}^{\infty} c^k z^k$ | $\frac{1}{1-cz}$ | $1 + cz + c^2 z^2 + c^3 z^3 + \cdots$
$\sum_{k=0}^{m} \binom{m}{k} z^k$ | $(1+z)^m$ | $1 + mz + \binom{m}{2} z^2 + \cdots + z^m$
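Three standard generating function identities, 1/(1 - z) = Σ z^k, 1/(1 - cz) = Σ c^k z^k, and (1 + z)^m = Σ C(m, k) z^k, can be checked by working with truncated coefficient lists. The sketch below assumes integer coefficients and a constant term of 1 for the series being inverted. The helper names (`series_product`, `series_reciprocal`, `binomial_series`) are hypothetical.

```python
from math import comb


def series_product(a, b, terms):
    """First `terms` coefficients of A(z) * B(z)."""
    c = [0] * terms
    for i, ai in enumerate(a[:terms]):
        for j, bj in enumerate(b[:terms - i]):
            c[i + j] += ai * bj
    return c


def series_reciprocal(a, terms):
    """First `terms` coefficients of 1 / A(z); assumes a[0] == 1."""
    if a[0] != 1:
        raise ValueError("this sketch requires a constant term of 1")
    b = [1]
    for k in range(1, terms):
        # The coefficient of z^k in A(z) * B(z) must vanish for k >= 1.
        upper = min(k, len(a) - 1)
        b.append(-sum(a[i] * b[k - i] for i in range(1, upper + 1)))
    return b


def binomial_series(m, terms):
    """First `terms` coefficients of (1 + z)^m by repeated multiplication."""
    result = [1] + [0] * (terms - 1)
    for _ in range(m):
        result = series_product(result, [1, 1], terms)
    return result


if __name__ == "__main__":
    terms, c, m = 6, 3, 4
    print(series_reciprocal([1, -1], terms))   # coefficients of 1 / (1 - z)
    print(series_reciprocal([1, -c], terms))   # coefficients of 1 / (1 - cz)
    print(binomial_series(m, terms))           # coefficients of (1 + z)^m
```

Working with coefficient lists treats each generating function as a formal power series, so no question of convergence arises.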
ISBN 0l-13-0l64-
9 780130 698