### Chapter 2 Finite Horizon Games


#### Ruitian Lang

#### 1 Extensive Form Games

##### 1.1 Histories and Information Sets

Introductory textbooks often define extensive form games in terms of game trees. Although technically correct, that definition loses its intuitive appeal when a player has infinitely many options to choose from or the game has an infinite horizon. Because this course considers many infinite extensive form games, we define them in terms of sequences of moves, which is mathematically equivalent to the tree definition. To motivate the definition, consider the two-player game of chess. After completion of a serious game, a record is produced which documents every move made in the game. Anybody should be able to tell who wins the game from the record.^1 Furthermore, given a sequence of moves, a player should be able to tell whether that sequence can be part of a game record, in other words, whether every move is legal. In order to document the play of a chess game, one first needs some vocabulary that describes moves of chess pieces. There are special notations for chess moves such as Bc4, meaning a B(ishop) moving to the square with coordinate (c, 4). Without going into any detail, we assume that there is a vocabulary that is capable of describing all conceivable chess moves. We denote the set of all conceivable moves by A, whose elements are notations like Bc4. For our purpose, each element of A (apart from some special moves such as concession) simply specifies a piece, its initial position and its final position. Now a game record, or part thereof, is a sequence of moves. To accommodate games of infinite horizon, we allow infinite sequences of moves. The set of all finite and infinite sequences of moves is denoted by Ā; its formal definition, Ā = (∪_{n=0}^∞ A^n) ∪ A^∞, is not important for us.

**Definition 1.** *(Predecessor and successor) An element* b ∈ Ā *of length* n ≥ 0 *is called a* predecessor *of another element* c ∈ Ā *if* b ≠ c *and the first* n *entries of* c *are precisely the entries of* b *(in the same order); when this happens,* c *is called a* successor *of* b*.*

(^1) We agree that claiming a draw, proposing a draw and accepting such a proposal are moves, as is concession. If the record ends in a claimed draw, an accepted draw proposal or a concession, it is obvious who wins. If the game ends in a checkmate, a person familiar with the rules of chess should be able to deduce that by examining the record.

Not all sequences of moves are parts of game records. Some moves may be illegal after a given sequence of (legal) moves. For example, at the beginning of the game, it is illegal (for White) to move its King to e4 (because its King cannot travel that far) or to e2 (because that target square is occupied by its own piece). Of course, there are chess rules that specify which moves are legal. Without getting into any detail, we simply denote by H the set of all legal sequences of moves. In other words, H is a subset of Ā such that in an element of H, every move (given the previous moves) is legal. The elements of H are called histories. This set has an important property: if h ∈ H, then every predecessor of h also belongs to H. Translated into the language of chess, this means that if every move in h is legal, the first n moves in h (n being an integer less than the length of h) must all be legal, which is essentially a tautology. Notice that the last requirement implies that the empty sequence belongs to H; it represents the history when no move has been made and thus the beginning of the game. Besides specifying which moves are legal, which is equivalent to specifying which elements of Ā belong to H, the rule should also specify which side wins after a completed game. A completed game is a history that cannot be continued, usually because one side has been checkmated or has no legal move to make. A history h ∈ H is called *terminal*, representing a completed game, if it has no successors in H. In particular, every infinite sequence in H (if any) is defined as terminal. The set of terminal histories is denoted by H_T. For every element of H_T, the rule specifies the winning side. Generalizing this idea, we require that the rule of a general extensive form game specify each player's payoff for every element of H_T. The complement of H_T is denoted by H_C = H \ H_T. Finally, at every non-terminal history h, the rule should specify the player who moves there.
We denote the player who moves at h by k(h) and the set of legal moves that can be made at h by A_h. This A_h can be formally defined as {a ∈ A : (h, a) ∈ H}, where (h, a) is the sequence obtained by appending a to h. That (h, a) belongs to H means that a is a legal move at h. The rule of chess can then be completely represented by the set of players N = {White, Black}, the sets A and H, the payoff function u_i : H_T → R for every player i, and a map k : H_C → N specifying which player is to move at every non-terminal history. Obviously, this is not how the rule of chess is usually stated, but this formal framework allows us to describe a much broader class of strategic interactions. The framework presented so far has difficulty describing most card games, for two reasons. First, a card game usually involves shuffling the deck or randomly dealing cards to players. So far there is no intrinsic randomness in our game.^2 This is fixed by including a special player, who may be called Shuffler, Chance, or Nature, and who makes moves to determine the randomness. (The standard term in game theory is Nature.) For example, in a card game, we may say that this Shuffler moves at the beginning to choose the cards dealt to each player. Let us temporarily denote the set of all deals (cards dealt to each player) by B. The definition of the game should specify how the cards are dealt: are they dealt randomly so that all possible deals are equally

(^2) Even in chess, a player is certainly allowed to randomize between two moves. However, this is different from randomness such as card dealing that cannot be controlled at all by any player.

likely, or are they chosen from a specific deal set?^3 This amounts to specifying the probability distribution of Nature's move.

The second complication is that a player may not see all the moves made in the past. Because the card dealing is represented as a move, and a player does not see the other players' cards, this move cannot be fully observed by any player. Meanwhile, it is incorrect to say that the move b ∈ B is completely hidden from all players either, because a player sees their own cards.

A solution to this problem is to partition B into smaller sets called cells and postulate that a player observes to which cell b belongs but cannot distinguish between two elements of B belonging to the same cell (perhaps until later in the game). More specifically, imagine that 52 cards are dealt randomly to four players; denote the set of all cards by C. Every element of B is of the form (c_1, c_2, c_3, c_4), where c_i is a subset of C consisting of 13 elements such that c_i ∩ c_j = ∅ when i ≠ j and ∪_{i=1}^4 c_i = C. The latter two conditions mean that no card is dealt to more than one player and every card is dealt to some player. Now for every c̄_1 ⊆ C consisting of 13 elements, denote by B_{c̄_1} the set of deals (c_1, c_2, c_3, c_4) ∈ B such that c_1 = c̄_1. In other words, B_{c̄_1} consists of all deals where Player 1 is dealt c̄_1. The way we capture the idea that Player 1 only sees their own cards is to say that for every d = (c_1, c_2, c_3, c_4) ∈ B, Player 1 only knows whether d ∈ B_{c̄_1} (for every possible c̄_1) but not what d is.

In general, randomness and unobserved moves can occur at any point in the game, so instead of restricting to cells of some initial set of deals B, we consider cells of H_C. In other words, H_C is partitioned into cells; the collection of all cells is denoted by I, with the property that any two distinct members of I (which are sets in their own right) are disjoint, and the union of all members of I is H_C. We impose two requirements on every I ∈ I:

(a) For any h, h′ ∈ I, k(h) = k(h′).

(b) For any h, h′ ∈ I, A_h = A_{h′}.

That two histories belong to the same cell I means that the player in question cannot distinguish between the two: they only know whether the true history belongs to I or not. When the true history is in I, it can be any member of I from the player's perspective. The first requirement thus says that a player knows when it is their turn to move. The second requirement says that a player knows which moves are available.

**Definition 2.** *(Extensive form game) An extensive form game consists of the following ingredients.*

*(a) A set* N *of players, which may include a special player Nature.*

*(b) A set A of conceivable actions; let Ā = (∪_{n=0}^∞ A^n) ∪ A^∞ be the set of finite and infinite sequences of actions.*

(^3) One card game is called contract bridge, which involves dealing 52 cards to four players, each receiving thirteen. Although a competitive game usually involves random deals, in a class that teaches some specific techniques of play, the cards are often dealt from a specific deal set that is designed to illustrate the technique.

*(c) A non-empty set H of histories, satisfying the requirement that for every h ∈ H, every predecessor of h is also in H. Let H_T = {h ∈ H : h has no successor in H} and H_C = H \ H_T. For every h ∈ H_C, let A_h = {a ∈ A : (h, a) ∈ H}.*

*(d) A payoff function u_i : H_T → R for every i ∈ N that is not Nature.*

*(e) A function k : H_C → N that specifies the player to move at every non-terminal history. For every i ∈ N, let H_i = {h ∈ H_C : k(h) = i}.*

*(f) For every h ∈ H_C with k(h) = Nature, a probability distribution µ_h on A_h.*

*(g) An information partition I of H_C satisfying the requirement that for any cell I ∈ I and h, h′ ∈ I, k(h) = k(h′) and A_h = A_{h′}.*

Elements of I are called information sets. An extensive form game is called finite if both N and H are finite sets.

For some games, it is more convenient to express the players' information in terms of the pieces of history they can see. For example, it may be possible to say that Player 1 moves after histories of the form (a_1, a_2, a_3) and that Player 1 sees a_1 and a_3 but not a_2. This form of representation can be converted to the standard information partition representation but is more convenient when it is available.

**Definition 3.** *(Perfect information) An extensive form game is said to have perfect information if every information set is a singleton.*

Having perfect information means that every player knows the entire history when they move.

**Definition 4.** *(Finite horizon) An extensive form game is said to have finite horizon if there is an m ∈ N such that the length of every history is at most m.*
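To make Definition 2 concrete, here is a minimal sketch encoding Matching Pennies in extensive form: Player 1 places a coin, then Player 2 places one without observing Player 1's choice, so both length-1 histories sit in one information set. The dictionary/list encoding is a hypothetical illustration, not notation from the text.

```python
# Minimal encoding of Definition 2's ingredients for Matching Pennies.
A = ["H", "T"]                                  # conceivable actions
H = [(), ("H",), ("T",),                        # histories (as tuples)
     ("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]
H_T = [h for h in H if len(h) == 2]             # terminal histories
H_C = [h for h in H if len(h) < 2]              # non-terminal histories
k = {(): 1, ("H",): 2, ("T",): 2}               # player to move
info_partition = [[()], [("H",), ("T",)]]       # Player 2 can't tell H from T

def u1(h):                                      # Player 1 wins when coins match
    return 1 if h[0] == h[1] else -1

# Requirement (a): every history in a cell has the same mover.
for cell in info_partition:
    assert len({k[h] for h in cell}) == 1
# Requirement (b): the same actions are available throughout a cell.
for cell in info_partition:
    assert len({tuple(a for a in A if h + (a,) in H) for h in cell}) == 1
```

The two loops at the end check exactly the conditions imposed on information sets in item (g).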

##### 1.2 Strategies

Informally, a player's strategy specifies their plan for the entire game. This is an idealization of real game play. A good chess player never plans only for the next move. Also, if you only focus on the next move, your evaluation of its effect may be wrong in terms of the big picture: perhaps you can capture an enemy piece with your next move, which sounds good, but if that leads to a checkmate of your King in three moves' time then it is bad. In game theory, we model idealized players who have plans for the entire game and evaluate their plans in terms of whether they win or lose the entire game. This assumption is not realistic for complicated long games like chess, but most economic applications either have a very short horizon or have a repeated structure so that the big long-term plan can be summarized in a few sentences.

Therefore, a strategy s_i of Player i should be a function from H_i to A such that s_i(h) ∈ A_h for every h ∈ H_i. Recall that A_h is the set of legal moves at history h, so the requirement on s_i simply says that

the action s_i(h) specified for history h must be legal. However, there is a complication. If Player i has an information set containing two distinct histories h and h′, Player i cannot distinguish between them at the time of making their decision. The logical inference is that Player i must choose the same action at both histories.

**Definition 5.** *(Strategy) A strategy of Player* i *in an extensive form game is a map* s_i *from* H_i *to* A *satisfying two conditions: (a)* s_i(h) ∈ A_h *for every* h ∈ H_i*; (b) for any* h, h′ ∈ H_i *belonging to the same information set,* s_i(h) = s_i(h′)*.*

An equivalent way of stating this definition is that a strategy maps every *information set* of Player i to a legal action at that information set. When the information partition can be represented by who observes what, the definition of strategies can be simplified further: Player i's action is a function of the things they can observe at the time of making that decision.

**Example 1.** *(Stackelberg competition) There are two firms. Firm 1 chooses output* q_1 *first; Firm 2 observes* q_1 *and then chooses output* q_2*. The market price is* P(q_1 + q_2) *and Firm* i*'s payoff is* q_i P(q_1 + q_2)*.*

The set of conceivable actions is A = [0, ∞). The set of histories is {()} ∪ A ∪ A², where the first set is the singleton consisting of the empty history (the beginning of the game). This description of H, albeit mathematically rigorous, is often awkward to read. A more intuitive representation is to specify typical elements of H: they are () (the empty history), (q_1) (the history after Firm 1 chooses its output q_1), and (q_1, q_2). Firm 1 moves at () and Firm 2 moves at (q_1) for every q_1 ∈ A. Obviously, only the histories (q_1, q_2) are terminal. The payoff functions are as given in the question. Because Firm 2 observes q_1, every information set is a singleton. Since Firm 1 has only one information set, a strategy of Firm 1 is simply an element of A. Since Firm 2 has an information set for every q_1, a strategy of Firm 2 is a function s_2 from A to A, where s_2(q_1) is Firm 2's output after observing q_1.
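To see the strategic logic in action, assume linear demand P(Q) = a − Q (an illustrative assumption, not part of the example). Substituting Firm 2's strategy s_2 into Firm 1's payoff and maximizing recovers the familiar Stackelberg quantities:

```python
# Stackelberg outcome under the illustrative assumption P(Q) = a - Q.
a = 1.0

def s2(q1):
    # Firm 2's strategy: best response maximizing q2 * (a - q1 - q2)
    return max((a - q1) / 2, 0.0)

def firm1_payoff(q1):
    return q1 * (a - q1 - s2(q1))               # Firm 1 anticipates s2(q1)

grid = [i * a / 10000 for i in range(10001)]    # grid search over [0, a]
q1_star = max(grid, key=firm1_payoff)
q2_star = s2(q1_star)
print(q1_star, q2_star)                         # → 0.5 0.25, i.e. a/2 and a/4
```

The grid search is a crude stand-in for the calculus: Firm 1 maximizes q_1(a − q_1)/2, giving q_1 = a/2 and hence q_2 = a/4.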

**Example 2.** *(Baron-Myerson) A monopoly supplier has a privately observed cost of production* c ∈ C*. The distribution of* c *is common knowledge. The government regulator announces a price schedule* P : [0, ∞) → R*. The monopoly observes* P *and decides whether to enter the market. If the monopoly enters the market, he chooses output* q ≥ 0 *and receives payoff* P(q) − cq*, while the regulator receives payoff (consumer surplus)* S(q) − P(q)*.*

It may be the case that the monopoly observes c prior to the regulator announcing P, but games with perfect information are much easier to analyze than those without. Therefore, we should try our best to formulate an economic problem as a game with perfect information. The trick is usually to delay Nature's move as much as possible. There are three players: Regulator, Monopoly and Nature. The typical histories are (), (P), (P, c), (P, c, q). The Regulator moves at (), choosing P : [0, ∞) → R.^4 Nature moves at (P), choosing

(^4) To be fully rigorous, we need to specify the class of functions from which P is chosen, such as R^{[0,∞)} or C([0,∞)), but let us focus on the economic idea and skip this technical detail.

c ∈ C according to its distribution. Monopoly moves at (P, c), choosing q ∈ {∅} ∪ [0, ∞), where ∅ means quitting the market. The payoffs are as given in the question. Even though the Regulator never observes c, the only thing we care about is whether the decision maker at (P, c), namely Monopoly, observes c. Since he does, his information sets are singletons. The game has perfect information. The problem is that P is chosen from a very large set, and a strategy of Monopoly is a map from that large set and C to [0, ∞), which cannot even be represented by any formula because that large set (perhaps R^{[0,∞)}) is too large. Some special technique is required to analyze this game.

**Example 3.** *(Cournot competition) There are two firms simultaneously choosing their outputs* q_1 *and* q_2*. The market price is* P(q_1 + q_2) *and Firm* i*'s payoff is* q_i P(q_1 + q_2)*.*

Despite the fact that the two firms move simultaneously, it is still possible to formulate Cournot competition as an extensive form game. The trick is to artificially assume that one firm moves first but the other firm does not observe that move. The set of actions is A = [0, ∞). The typical histories are (), (q_1) and (q_1, q_2). Firm 1 moves at () and Firm 2 moves at (q_1). The payoff functions are as in the question. Since Firm 2 does not observe q_1, all the histories (q_1) belong to the same information set. The set of strategies of Firm 1 is obviously A. Since Firm 2 only has one information set, their set of strategies is also A. The game does not have perfect information, but it is not difficult to analyze as the sets of strategies are relatively simple.
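Under the same illustrative linear-demand assumption P(Q) = a − Q (not part of the example), the Nash equilibrium of this game can be found by iterating best responses until they stop moving:

```python
# Cournot best-response iteration under the assumption P(Q) = a - Q.
a = 1.0

def br(q_other):
    # Best response maximizing q * (a - q - q_other): FOC gives (a - q_other)/2
    return max((a - q_other) / 2, 0.0)

q1 = q2 = 0.0
for _ in range(200):                 # damped oscillation converges to a fixed point
    q1, q2 = br(q2), br(q1)
print(q1, q2)                        # converges to a/3 ≈ 0.3333 each
```

At the fixed point each firm plays a best response to the other, i.e. q_1 = q_2 = a/3, the standard Cournot outcome.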

**Example 4.** *(Entry game) There is an Incumbent firm and a potential Entrant. First, the Entrant decides whether to enter the market, which is observed by the Incumbent. If the Entrant does not enter, the Incumbent chooses output* q_1 *and receives payoff* q_1 P(q_1)*, while the Entrant receives payoff zero. If the Entrant does enter, they choose outputs simultaneously; the Incumbent receives payoff* q_1 P(q_1 + q_2) *and the Entrant receives payoff* q_2 P(q_1 + q_2) − F*, where* F > 0 *is the cost of entry.*

Let E = {e, n} be the set of entry decisions. Then the set of conceivable actions is E ∪ [0, ∞). The typical histories are (), (e), (n), (n, q_1), (e, q_1), (e, q_1, q_2). The Entrant moves at () and at (e, q_1), and the Incumbent moves at (e) and at (n). The payoffs are as in the question. Since the Incumbent observes the Entrant's entry decision, her information sets are singletons. Since the Entrant does not observe q_1 at (e, q_1), all the histories (e, q_1) belong to the same information set. A strategy of the Incumbent is a map from E to [0, ∞). The Entrant's set of strategies is E × [0, ∞), where the first entry specifies whether he enters and the second entry specifies his choice of q_2 at the information set {(e, q_1) : q_1 ≥ 0}.

#### 2 Solution Concepts

For the rest of the chapter, only games with finite horizons are considered.

##### 2.1 Nash equilibrium

Given a strategy profile s, the probability of every terminal history h = (a_1, a_2, …, a_m) can be computed as follows. Put P_{h,0} = 1. Given P_{h,k} for some k < m, define P_{h,k+1} = P_{h,k} · q_{k+1}, where q_{k+1} is the probability that a_{k+1} is played at (a_1, …, a_k) by the player who moves there. This recursive definition produces P_{h,m}, which is the desired probability that the terminal history is reached. In the current situation, q_{k+1} ∈ (0, 1) only when Nature moves at (a_1, …, a_k). However, there is no difficulty generalizing this algorithm to the case where q_{k+1} may be strictly between 0 and 1 for human players.
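The recursion P_{h,0} = 1, P_{h,k+1} = P_{h,k}·q_{k+1} translates directly into code. The `prob_of_move` interface below is a hypothetical way of packaging the strategy profile and Nature's distributions, chosen for the sketch:

```python
# Probability of reaching a terminal history via the recursion above.

def path_probability(h, prob_of_move):
    """Probability that terminal history h = (a_1, ..., a_m) is reached."""
    p = 1.0                                   # P_{h,0}
    for k, a in enumerate(h):
        p *= prob_of_move(h[:k], a)           # P_{h,k+1} = P_{h,k} * q_{k+1}
    return p

# Example: Nature flips a fair coin, then a player plays L with prob 0.3.
def prob_of_move(prefix, action):
    if len(prefix) == 0:                      # Nature's coin toss
        return 0.5
    return 0.3 if action == "L" else 0.7

print(path_probability(("H", "L"), prob_of_move))   # → 0.15, i.e. 0.5 * 0.3
```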

**Definition 6.** *(Behavior strategy) A* behavior strategy *of Player* i *in an extensive form game assigns a probability distribution* σ_h *on* A_h *to every history* h ∈ H_i *such that* σ_h = σ_{h′} *when* h *and* h′ *belong to the same information set.*

In other words, a behavior strategy allows a player to mix locally at each information set. It does not allow mixes of the form ½(L, L) + ½(R, R) if the player has two information sets and chooses between L and R at each.

**Definition 7.** *Given a behavior strategy profile* σ *of an extensive form game with finite horizon, define* P_h(σ) *for every terminal history* h = (a_1, …, a_m) *as follows. Let* P_{h,0}(σ) = 1*; if* P_{h,k}(σ) *has been defined, define* P_{h,k+1}(σ) = P_{h,k}(σ)·q_{k+1}(σ)*, where* q_{k+1}(σ) *is the probability that* a_{k+1} *is played at* (a_1, …, a_k) *under the given behavior strategy profile (or Nature's move in the definition of the game), for* k = 0, 1, …, m − 1*. Define* P_h(σ) *as* P_{h,m}(σ)*. Player* i*'s payoff from the behavior strategy profile* σ *is defined as* v_i(σ) = Σ_{h ∈ H_T} u_i(h) P_h(σ)*.*

**Definition 8.** *(Strategic form) Applying Definition 7 to pure strategies, we have defined payoff functions* v_i *on the set of strategy profiles* S *for every player* i*. The strategic form game with the same set of players, the set of strategies* S_i *for every* i *and the payoff function* v_i *for every* i *is called the* strategic form *of the original game.*

This is the *canonical* way of converting an extensive form game to a strategic form game.^5 Therefore, the solution concepts of strategic form games may also be used for extensive form games.

**Definition 9.** *(Nash equilibrium) A strategy profile* s *of an extensive form game* G *is called a* Nash equilibrium *of* G *if it is a Nash equilibrium of the strategic form of* G*.*

Even though Nash equilibria in mixed strategies can be defined in a similar fashion, such a solution concept is much less useful, for the reason that mixed strategies of an extensive form game are often too complicated to handle. For example, for a game where a player makes m binary decisions, a mixed strategy is a probability distribution on a set of 2^m elements, whereas a behavior strategy consists of m distributions, each on a set of two elements; in other words, a mixed strategy is a (2^m − 1)-dimensional object while a behavior strategy is an

(^5) Given a strategic form game, we may also try and represent it as an extensive form game, but there is no *canonical* way of doing it, in the sense that two people may convert the same strategic form game to two different extensive forms. Consequently, if one tries to convert a strategic form of an extensive form game back to an extensive form game, the resulting game is usually different from the extensive form one starts with.

m-dimensional object. The key question is whether the loss of generality when we restrict our attention to behavior strategies will cause us to omit important equilibria of the game. For a class of extensive form games, those with *perfect recall*, every mixed strategy is equivalent to a unique behavior strategy, so we may focus exclusively on behavior strategies without any loss of generality. Almost every game in applications has perfect recall, and all games in this course certainly do. The reader may safely assume that behavior strategies are all they are concerned with, without worrying about the definition of perfect recall or the formal equivalence result.

**Definition 10.** *(Perfect recall) In an extensive form game, Player* i *is said to have* perfect recall *if for any two histories* h_1, h_2 ∈ H_i *belonging to the same information set* I *and any predecessor* h_1′ ∈ H_i *of* h_1*, there exists a unique predecessor* h_2′ ∈ H_i *of* h_2 *such that* h_1′ *and* h_2′ *also belong to the same information set and Player* i *takes the same action at* h_1′ *leading to* h_1 *and at* h_2′ *leading to* h_2*. The game is said to have* perfect recall *if every player does.*

The definition formally captures the idea that Player i never forgets things they already knew, nor their own actions. If different actions were taken at h_1′ and h_2′, Player i has forgotten the action they took by the time they reach the information set I. If h_1′ and h_2′ did not belong to the same information set, Player i has forgotten the distinction between h_1′ and h_2′ by the time they reach the information set I. The definition of perfect recall rules out both possibilities.

**Theorem 1.** *(Kuhn) In an extensive form game with perfect recall, every mixed strategy* σ_i *of Player* i *is associated with a unique behavior strategy* β_i *such that for every* s_{−i} ∈ S_{−i}*,* P_h(σ_i, s_{−i}) = P_h(β_i, s_{−i}) *for every terminal history* h*.*

The theorem ensures that no solution (under whatever solution concept) would be omitted by restricting to behavior strategies. Hence, we never talk about mixed strategies of extensive form games (with perfect recall).

##### 2.2 Subgame Perfect Equilibrium

Given a strategy profile s, an information set of Player i may not be reached at all; such an information set is called *off the path* of s. Changing Player i's action at an off-path information set does not change the distribution of outcomes. Consequently, the solution concept of Nash equilibrium does not require that a player act optimally at off-path information sets. This leads to unreasonable equilibria. Consider the Entry game in Example 4 with P(q) = A − q for some A > 0. The following strategy profile is a Nash equilibrium for every F > 0: the Incumbent chooses q_1 = A/2 with no entry and q_1 = A otherwise; the Entrant chooses not to enter and q_2 = 0 if entering. Although choosing q_1 = A cannot be optimal for the Incumbent with entry, since entry never happens, the Incumbent does not have to act optimally in that contingency under the solution concept of Nash equilibrium. This equilibrium is unreasonable: if there is any small uncertainty about what the Entrant does, the Incumbent would improve their payoff by choosing q_1 < A upon entry.
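A quick numeric check of this example, normalizing A = 1 (the normalization is an assumption for illustration):

```python
# The unreasonable Nash equilibrium of the entry game with P(Q) = A - Q.
A = 1.0

def incumbent_payoff_after_entry(q1, q2):
    return q1 * (A - q1 - q2)

# Off the path (after entry, against q2 = 0), the prescribed q1 = A earns 0
# while the deviation q1 = A/2 earns A^2/4 > 0: the action is suboptimal there.
print(incumbent_payoff_after_entry(A, 0.0))      # → 0.0
print(incumbent_payoff_after_entry(A / 2, 0.0))  # → 0.25

# On the equilibrium path the Entrant stays out, so the Incumbent's payoff
# q1*(A - q1) at q1 = A/2 is unaffected by the off-path prescription,
# which is why the profile is nevertheless a Nash equilibrium.
monopoly_payoff = (A / 2) * (A - A / 2)
print(monopoly_payoff)                           # → 0.25
```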

In general, very few things never happen. Economic models are all idealized, and an event that is assumed not to happen may happen; no matter how small the probability of that event is, acting suboptimally (especially with such a drastic action as q_1 = A) in that contingency is not reasonable. For example, in most economic models of contracts, explicit breaching of a contract never happens in equilibrium, but it is very unreasonable to assume that the contracting parties may take some bizarre actions upon a contract breach. Due to mathematical challenges, imposing the requirement of acting optimally at every information set, on- or off-path, is difficult. A standard approach to implementing this requirement will be given in Chapter 4. For now, let us consider a minimal requirement of off-path optimality. If at a history h, every player knows what happened so far, then this history is a fresh start of a new game. It seems reasonable to require that players play a Nash equilibrium of this new game. For example, in Example 4, after the Entrant enters, the two players play a Cournot competition, which is a well-defined game in its own right, and thus it is reasonable to assume that they play a Nash equilibrium of that game whether or not the history (e) happens with positive probability.

**Definition 11.** *(Subgame) In an extensive form game, a non-terminal history* h *is a* fresh start^6 *if it satisfies two conditions: (a)* {h} *is an information set; and (b) if* h_1 *and* h_2 *are two histories belonging to the same information set and* h_1 *is a successor of* h*, then* h_2 *is also a successor of* h*. A* subgame *of an extensive form game is the game with the same set of players and actions whose set of histories consists of a fresh start* h *and all its successor histories.*

The first requirement of a fresh start means that the player who moves at h knows precisely what happened so far: there is no confusion for that player between h and any other history. The second requirement means that any player who moves later (at h_1) also knows that h has happened: the information set to which h_1 belongs cannot contain any history where h did not happen. In view of the formal definition, we still need to specify Nature's moves, payoff functions and the information partition of the subgame after specifying the fresh start. However, these are obvious. Nature's moves are inherited from the original game, and payoff functions are restrictions of those in the original game. Due to the two requirements on fresh starts, no information set contains both histories in the subgame and histories outside it. Therefore, the information sets of the subgame are simply those consisting of successors of the fresh start and the fresh start itself.

**Definition 12.** *(Subgame perfect equilibrium) A behavior strategy profile* σ *is called a* subgame perfect (Nash) equilibrium *if its restriction to every subgame of the original game is a Nash equilibrium of that subgame.*

The game in Example 4 has two proper subgames, corresponding to the fresh starts (e) and (n). The Nash equilibrium described at the beginning of this subsection is not subgame perfect because its restriction to the subgame starting from (e) fails to be a Nash equilibrium. From the definition, it is clear that subgame perfect equilibrium is an interesting refinement of Nash equilibrium only when the game contains many fresh starts.

(^6) This phrase is non-standard and only used for convenience here.

This will be the case for the applications considered in this chapter and the first half of the next chapter. The solution concept loses its power in card games, as such a game usually has no fresh start: Nature moves at the beginning to deal the cards and nobody observes that action before the end of the game. Another problem is that the set of subgame perfect equilibria may depend on the framing of the game.

**Example 5.** *Consider the following variant of the Battle of the Sexes. The girl first decides whether she is available for a date. If the girl decides that there is no date, both players receive payoff 1/10. If the girl is available, the girl and the boy simultaneously choose their moves. The girl chooses between Soccer and Opera and the boy chooses among Soccer, Opera and No Show. If the boy does not choose No Show, the payoffs are as in the Battle of the Sexes; otherwise, both players receive payoff −10.*

There are two ways to frame the problem as an extensive form game. The most natural formulation is as follows. The girl first chooses between Available and Busy. Busy ends the game. If the girl chooses Available, that action is observed and then the two players play the simultaneous move game. In this formulation, there is no subgame perfect equilibrium where the girl chooses Busy, because all three Nash equilibria of the subgame starting from Available give the girl a payoff higher than 1/10. Here is an alternative formulation of the game. The girl first chooses among Busy, Soccer and Opera, and the boy chooses among Soccer, Opera and No Show if the girl does not choose Busy. This formulation should be equivalent to the previous one, but it has no proper subgame, and thus the boy can force a Nash equilibrium where the girl plays Busy by threatening to play No Show himself. The bottom line is that subgame perfection is an attempt to implement rationality at off-path information sets, and this attempt has limited success. We focus on applications where it is successful for now and consider a further refinement (namely *sequential equilibrium*) in Chapter 4. For the applications considered in the rest of this chapter and the first half of the next chapter, sequential equilibrium does not further refine subgame perfect equilibrium. For games with finite horizon, a standard procedure called *backward induction* finds all subgame perfect equilibria.

**Step 1** Identify all fresh starts and subgames.

**Step 2** Fix a subgame that contains no proper subgames. Find all Nash equilibria of this game, including those in mixed (behavior) strategies if appropriate.

**Step 3** Fix a particular Nash equilibrium. Replace the fresh start of the solved subgame with a terminal history where the players payoffs are the payoffs from the chosen Nash equilibrium of the subgame. Go back to Step 2.

**Step 4** Go back to Step 3 with a different Nash equilibrium of the subgame (if any).

As usual, we seek subgame perfect equilibria in both pure and behavior strategies for finite games, and only equilibria in pure strategies for games with continuum action sets.
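The procedure can be sketched in code for a finite game with perfect information. The dict encoding and the toy two-stage tree below are hypothetical; ties and multiple equilibria (Step 4) are ignored, so only the first maximizer found is kept:

```python
# Backward induction (Steps 2-3) on a finite perfect-information game.
# A node is a dict: terminal nodes carry "payoffs", others carry
# "player" (who moves) and "moves" (action -> child node).

def backward_induction(node):
    """Return (payoff vector, equilibrium path of actions) for this subgame."""
    if "payoffs" in node:                       # terminal history
        return node["payoffs"], []
    best_payoffs, best_path = None, None
    for action, child in node["moves"].items():
        payoffs, sub_path = backward_induction(child)   # Step 2: solve subgame
        if best_payoffs is None or payoffs[node["player"]] > best_payoffs[node["player"]]:
            # Step 3: replace the subgame by its equilibrium value
            best_payoffs, best_path = payoffs, [action] + sub_path
    return best_payoffs, best_path

# Toy two-stage game: player 0 moves first, player 1 responds.
tree = {"player": 0, "moves": {
    "L": {"player": 1, "moves": {"l": {"payoffs": (3, 1)},
                                 "r": {"payoffs": (0, 0)}}},
    "R": {"player": 1, "moves": {"l": {"payoffs": (1, 2)},
                                 "r": {"payoffs": (2, 3)}}}}}
payoffs, path = backward_induction(tree)
print(payoffs, path)                            # → (3, 1) ['L', 'l']
```

Every singleton-information-set node here is a fresh start, so the recursion visits exactly the subgames that Steps 1-3 enumerate.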

#### 3 Applications

##### 3.1 Moral hazard

There is an employer (she) and an employee (he). The timing is as follows.

(1) The employer offers an incentive contract (w, b).

(2) The employee observes the contract and decides whether to accept or reject it. Rejecting the contract ends the game.

(3) The employee chooses an effort e ∈ [0, ē], where ē ∈ (0, 1) is some known number.

(4) A binary outcome y ∈ {Success, Failure} is realized, with y = Success with probability e. If y = Success, w + b is paid; if y = Failure, w is paid.

If the contract is rejected, the employer receives payoff 0 and the employee receives payoff ū. If the contract is accepted, the employee's payoff is h(s) − c(e), where s is the compensation (w or w + b) he receives, and the employer's payoff is R − s if y = Success and −s if y = Failure. We assume that R > 0, h and c are both twice continuously differentiable, h′ > 0 and c″ ≥ 0. The key assumption of the model is that the compensation can depend on y but not on e. If the compensation could depend on neither y nor e, there would be no way for the employer to induce any costly effort: subgame perfection means that no discretionary bonus will be paid. That the compensation cannot depend on e means that the action e chosen by the agent is hidden from the employer. The model is called moral hazard because hazard (Failure) may happen due to the employee's lack of effort instead of pure bad luck. In general, the phrase *moral hazard* refers to the incentive problem related to actions taken by one or both contracting parties that are not observable or contractible. There is no need to bring Nature into the picture; the game ends after the employee chooses his effort. Therefore, the typical histories are (), ((w, b)), ((w, b), r) (the contract being rejected), ((w, b), a) and ((w, b), a, e). The employer moves at () and the employee moves at ((w, b)) and ((w, b), a). The game has perfect information. At the terminal history ((w, b), a, e), the employer's payoff is e(R − b) − w, and the employee's payoff is e·h(w + b) + (1 − e)·h(w) − c(e). For some parameter values, there may exist subgame perfect equilibria where the employee rejects the contract on the equilibrium path. For now, we seek subgame perfect equilibria where the contract is accepted on the path and leave the other case as an exercise. Whatever (w, b) the employer offers, the employee chooses the effort to maximize his payoff e·h(w + b) + (1 − e)·h(w) − c(e). The convexity of c means that this optimization amounts to the first order condition. Then, for the employee to accept the contract, his payoff must be at least ū.
Finally, the employer must choose (w, b) to maximize her payoff. The main challenge is to find (w, b). According to the definition, we should treat the employee's effort as a function of (w, b), which renders the analysis impossibly difficult. Instead, we imagine that the employer *suggests* an effort level e when she offers the contract (w, b); the suggestion is such that the employee will find it optimal. This alternative paradigm treats the effort as a number (instead of a function) satisfying the additional constraint that it maximizes the employee's payoff in the subgame starting at ((w, b), a).

```
Π(ū) = max_{w,b,e} e(R − b) − w;                                   (1)
s.t.   h(w + b) − h(w) − c′(e) ≥ 0, with equality if e < ē;        (2)
       e·h(w + b) + (1 − e)·h(w) − c(e) ≥ ū.                       (3)
```

In the literature of contract theory, the first constraint is referred to as the *incentive compatibility* (IC) constraint, and the second constraint is referred to as the *individual rationality* (IR) or *participation* constraint. As a warm-up, consider the first benchmark case where c(e) = 0 for every e ∈ [0, ē] and h is strictly concave. Clearly, it is optimal for the employer to suggest e = ē. Between the two constraints, the IR constraint must be binding, lest the employer send w to −∞. We assume for now that the IC constraint is not binding. Let λ be the Lagrange multiplier of the IR constraint. Then the first order conditions are as follows:

```
−1 + λ·ē·h′(w + b) + λ·(1 − ē)·h′(w) = 0;
−ē + λ·ē·h′(w + b) = 0.
```
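These conditions lead to full insurance (b = 0), as derived next; the same conclusion can also be reached numerically without calculus. Below is a minimal sketch, assuming h(s) = ln s, ē = 0.5 and ū = 1 (illustrative values not from the text): for each bonus b, the wage w is chosen so that the IR constraint binds at e = ē, and the employer's expected compensation cost turns out to be smallest at b = 0.

```python
from math import log, exp

e_bar, u_bar = 0.5, 1.0  # illustrative values, not from the text

def wage_for(b):
    """Find w such that the IR constraint binds at effort e_bar:
    e_bar*ln(w+b) + (1-e_bar)*ln(w) = u_bar (bisection; LHS increases in w)."""
    lo, hi = 1e-9, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if e_bar * log(mid + b) + (1 - e_bar) * log(mid) < u_bar:
            lo = mid
        else:
            hi = mid
    return lo

def expected_pay(b):
    """Employer's expected compensation cost w + e_bar*b."""
    return wage_for(b) + e_bar * b

# Full insurance (b = 0) minimizes the cost of delivering utility u_bar
# to the risk-averse employee; at b = 0, w = exp(u_bar).
print([round(expected_pay(b), 4) for b in (0.0, 0.5, 1.0, 2.0)])
```

The cost is increasing in b here, reflecting Jensen's inequality: spreading the risk-averse employee's compensation across outcomes is wasteful when effort is costless.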

We immediately see that h′(w + b) = h′(w) = 1/λ. Since h′ is strictly decreasing by assumption, this implies that b = 0. This (w, b, e) indeed satisfies the IC constraint. In summary, the optimal compensation does not depend on the outcome if effort is costless. The reason is that the employer is risk neutral while the employee is risk averse. Given that the bad outcome only results from pure bad luck (natural hazard), the employer should bear all the risk and fully insure the employee.

Now consider the second benchmark case where c′(e) > 0 for every e ∈ (0, ē), c′(0) = 0, c′(ē) = ∞ and h(s) = s for every s ∈ R. In this case, the program Eqs. (1)-(3) reduces to the following:

```
Π(ū) = max_{w,b,e} e·R − (w + e·b);
s.t.   b − c′(e) ≥ 0, with equality if e < ē;
       w + e·b − c(e) ≥ ū.
```

The easiest way to solve this program is to eliminate the choice variable w. Solving for the optimal (b, e) yields c′(e) = R and b = R. In other words, when the employee is risk neutral, it is optimal to make the employee fully responsible for the variation in revenue. It can be easily verified that the effort choice on the equilibrium path is the first best (socially efficient) effort. Therefore, the hidden effort choice does not cause any efficiency loss when the employee is risk neutral.
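As a quick check of the risk-neutral benchmark: once w is eliminated via the binding IR constraint, the employer's payoff is eR − c(e) − ū regardless of b, and the IC constraint pins down b = c′(e). A minimal sketch with the illustrative cost c(e) = e²/2 and R = 0.8 (values not from the text; this cost ignores the boundary condition c′(ē) = ∞):

```python
R, u_bar = 0.8, 0.0  # illustrative values, not from the text

def employer_payoff(e):
    # After eliminating w via the binding IR constraint, the employer's
    # payoff is e*R - c(e) - u_bar with c(e) = e**2/2, independent of b.
    return e * R - e**2 / 2 - u_bar

grid = [i / 10000 for i in range(10001)]   # effort grid on [0, 1]
e_star = max(grid, key=employer_payoff)
b_star = e_star                            # IC: b = c'(e) = e

print(e_star, b_star)                      # both equal R = 0.8 here
```

The grid optimum satisfies c′(e) = R, the first-best condition, and the bonus equals the full marginal revenue b = R.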

Finally, consider the general case where c′(e) > 0 for every e ∈ (0, ē), c′(0) = 0, c′(ē) = ∞ and h is strictly concave. The IC constraint immediately implies that b > 0. Let λ and μ be the Lagrange multipliers of the IR and IC constraints, respectively. Then the first order conditions are as follows:

```
(R − b) − μ·c″(e) + λ·(h(w + b) − h(w) − c′(e)) = 0;
−e + μ·h′(w + b) + λ·e·h′(w + b) = 0;
−1 + μ·(h′(w + b) − h′(w)) + λ·(e·h′(w + b) + (1 − e)·h′(w)) = 0.
```

Regarding the last two equations as a system with unknowns λ and μ, we obtain that

```
μ = e(1 − e)·(1/h′(w + b) − 1/h′(w)) > 0.
```

The first equation then reduces to R − b − μ·c″(e) = 0 due to the IC constraint. Therefore, b < R. The fact that 0 < b < R represents a fundamental trade-off in the moral hazard problem: the employer wants to bear some risk due to the employee's risk aversion, but she must make the employee bear some risk to induce positive effort. Because the employee does not have the full incentive (b = R), the effort on the equilibrium path is lower than the efficient level.

Solving a general moral hazard problem is difficult, especially when the outcome is not binary. A commonly used solvable model is as follows. The outcome y is normally distributed with mean e and variance σ², and the employee's payoff from compensation s and effort e is −exp(−η(s − ce²/2)). Define v̄ so that −exp(−ηv̄) = ū. It is further assumed that the compensation is an affine function of the outcome y: s = αy + w for some constants α and w. Some algebra shows that the employee's expected payoff is −exp(−η(αe + w − (1/2)ησ²α² − (1/2)ce²)). Therefore, the following function may be used as the employee's payoff function:

```
v(α, w, e) = αe + w − (1/2)ησ²α² − (1/2)ce².
```

This is actually the *certainty equivalent* of the employee's payoff. The analogue of Eqs. (1)-(3) is the following:

```
Π(v̄) = max_{α,w,e} (1 − α)e − w;
s.t.   α − ce = 0;
       αe + w − (1/2)ησ²α² − (1/2)ce² ≥ v̄.
```

It is convenient to eliminate the choice variables e and w from the constraints, so the employer chooses α to maximize α/c − (1/2)ησ²α² − (1/(2c))α² − v̄.

Therefore, there is a unique subgame perfect equilibrium in which α = (1 + ηcσ²)⁻¹. The fact that α ∈ (0, 1) is the analogue of the fact that b ∈ (0, R) in the previous model: the employee's compensation is strictly increasing in the outcome, but the employee does not bear the full variation of the outcome. The efficient effort level is 1/c, while the effort level on the equilibrium path, α/c, is strictly lower. The explicit formula of α also makes it clear that the loss of efficiency is the combined effect of hidden costly effort (positive c) and the employee's risk aversion (η).
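The closed form α = (1 + ηcσ²)⁻¹ can be verified numerically by grid search on the employer's reduced objective. A minimal sketch with illustrative values η = 2, c = 1, σ² = 0.5 (not from the text):

```python
eta, c, sigma2, v_bar = 2.0, 1.0, 0.5, 0.0  # illustrative values, not from the text

def employer_payoff(alpha):
    # Reduced objective after eliminating e = alpha/c and w from the constraints:
    # alpha/c - (1/2)*eta*sigma2*alpha^2 - alpha^2/(2c) - v_bar
    return alpha / c - 0.5 * eta * sigma2 * alpha**2 - alpha**2 / (2 * c) - v_bar

grid = [i / 100000 for i in range(100001)]  # alpha grid on [0, 1]
alpha_star = max(grid, key=employer_payoff)
alpha_formula = 1 / (1 + eta * c * sigma2)  # the closed form derived above

print(alpha_star, alpha_formula)  # both 0.5 for these parameters
# equilibrium effort alpha/c = 0.5 is below the efficient level 1/c = 1
```

Raising η, c or σ² lowers α, matching the comparative statics discussed above.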

##### 3.2 Multi-tasking moral hazard

As seen in the previous subsection, one source of inefficiency in moral hazard problems is the toxic combination of costly effort and risk aversion. A second source of inefficiency is the imperfection of the performance measure. For the incentive contract to be enforced, the performance measure (outcome) must be verifiable by a third party, so that the two contracting parties will not haggle over the true outcome at the end of the game. However, even when some performance measure is verifiable, it may not be fully aligned with what the employer truly cares about. One classical example is the care of a machine. The employee works on the employer's machine to produce output. The employer naturally bears the cost of maintenance and potential replacement of the machine, so the employer cares both about the output and the longevity of the machine. When only the former is verifiable, it is conceivable that the employee may abuse the machine to generate higher output. Another example is the education of primary and high school students. Their exam scores are verifiable, but scores do not fully measure the outcome of their study. Providing a strong incentive based on exam scores may misdirect the teachers' and students' effort towards exam preparation instead of other objectives of education such as effective communication and logical thinking.

There have been several attempts to formally model the misalignment between the performance measure and the employer's payoff. We present a popular model that describes the employee's effort as a vector. For simplicity, assume that the employee's effort is a two-dimensional vector e ∈ R². The cost of effort e is (1/2)eᵀCe, where eᵀ denotes the transpose of the 2 × 1 matrix e and C is a positive definite 2 × 2 matrix. The performance measure is y = gᵀe + ε, while the employer's payoff is fᵀe − s, where ε is a noise term with zero mean, f and g are known vectors, and s is the compensation. As before, only linear incentive contracts are allowed: s = αy + w.
For simplicity, assume that the employee is risk neutral, so his payoff is αy + w − (1/2)eᵀCe, with expectation αgᵀe + w − (1/2)eᵀCe. As before, we look for a subgame perfect equilibrium where the employer's contract is accepted on the path. The optimal choice of (α, w) solves the following program:

```
Π(v̄) = max_{α,w,e} fᵀe − αgᵀe − w;
s.t.   αg − Ce = 0;
       αgᵀe + w − (1/2)eᵀCe ≥ v̄.
```

As before, we eliminate the choice variables e and w from the constraints, so the employer chooses α to maximize αfᵀC⁻¹g − (1/2)α²gᵀC⁻¹g − v̄.

Therefore, α = (fᵀC⁻¹g)/(gᵀC⁻¹g). The efficient effort is C⁻¹f, while the effort on the path of the unique subgame perfect equilibrium is αC⁻¹g. Both the magnitude and the direction may be inefficient. For mathematical convenience, it is often assumed in the literature that C is a constant times the identity matrix. This can always be achieved by applying an invertible linear transformation on e. (Every positive definite matrix C can be factorized as LLᵀ where L is a lower triangular matrix; this is called the *Cholesky factorization*.) It should be noted, however, that the dimensions of e after the transformation (by L) no longer have any natural economic interpretation. Still, the resulting formula of the optimal α, fᵀg/|g|², reveals a fundamental fact about imperfect performance measures: how useful a performance measure is depends on how well it is aligned with the true benefit of effort.
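The formula α = fᵀC⁻¹g / gᵀC⁻¹g can likewise be checked numerically. A minimal sketch, assuming illustrative values of f, g and a positive definite C (none from the text):

```python
import numpy as np

# Illustrative values, not from the text
C = np.array([[2.0, 0.5], [0.5, 1.0]])  # positive definite effort-cost matrix
f = np.array([1.0, 1.0])                # marginal benefit of effort to the employer
g = np.array([1.0, 0.2])                # marginal effect of effort on the measure y

Cinv = np.linalg.inv(C)
A = f @ Cinv @ g                        # coefficient of alpha in the reduced objective
B = g @ Cinv @ g                        # coefficient of -(1/2)*alpha^2

grid = np.linspace(0.0, 2.0, 200001)
alpha_star = grid[np.argmax(A * grid - 0.5 * B * grid**2)]
alpha_formula = A / B                   # f'C^{-1}g / g'C^{-1}g

print(alpha_star, alpha_formula)
# efficient effort Cinv @ f and equilibrium effort alpha_star * Cinv @ g
# differ in both magnitude and direction here
```

Because f and g point in different directions, the induced effort αC⁻¹g is distorted away from the efficient C⁻¹f no matter how α is chosen.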

##### 3.3 Adverse selection

There is a regulator (she) and a monopoly (he). The timing of the model is as follows.

(1) The monopoly privately observes his cost of production c; this c is drawn from a known distribution on Θ.

(2) The regulator announces the price schedule P : [0, ∞) → R.

(3) The monopoly decides whether to accept P. Rejecting the contract ends the game.

(4) The monopoly chooses output q ≥ 0.

If the contract P is rejected by the monopoly, both parties receive zero payoff. Otherwise, the regulator's payoff is S(q) − P(q), where S(q) is the consumer surplus generated by the output, and the monopoly's payoff is P(q) − cq. For simplicity, we consider the simplest case where Θ = {L, H} with 0 < L < H. We assume that S is twice continuously differentiable, strictly increasing and concave, and S′(0) = ∞. In this model, the monopoly does not have any hidden effort, but he has the private information c. In general, adverse selection refers to models where some parties have private information prior to entering the contract.

A little trick allows us to formulate the problem as an extensive form game with perfect information: we have Nature choose c after the announcement of P. The typical histories are (), (P), (P, c), (P, c, r), (P, c, a), and (P, c, a, q). The regulator moves at (), Nature moves at (P), and the monopoly moves at (P, c) and (P, c, a).

The main challenge in the analysis is that P can be any function. There is no reason to assume that P is differentiable, so the monopoly may not use a first order condition to find the optimal output. Also, it is difficult to formulate the regulator's optimization problem when she is choosing an unknown function. Experience with the moral hazard problem suggests that it might be convenient to assume that the regulator's contract is accompanied by recommended outputs qL and qH for the two different types of producers. Of course, the recommendation must be such that a monopoly with cost c finds qc optimal. Such a monopoly may be tempted to choose the output recommended for the other type (the so-called on-schedule deviation)

or some output outside the recommended set {qL, qH} (the so-called off-schedule deviation). The easiest way to deter off-schedule deviations is to put P(q) = −∞ for all q ∉ {qL, qH}. This amounts to offering only two choices to the monopoly: (qL, P(qL)) and (qH, P(qH)). Now there is no need to invoke the unknown function P, as its values are known to be −∞ except at qL and qH. Put pL = P(qL) and pH = P(qH). The contract offered by the regulator can then be represented as a menu: {(qL, pL), (qH, pH)}. This is a particular example of the celebrated *Revelation Principle* in mechanism design, which can be rigorously proved for a more general class of contract design problems. Because off-schedule deviations lead to payoff −∞, we only need to make sure that the monopoly does not make on-schedule deviations. Therefore, subgame perfection means that the monopoly weakly prefers (qL, pL) to (qH, pH) when c = L and weakly prefers (qH, pH) to (qL, pL) when c = H. We look for subgame perfect equilibria where both types of monopoly accept the contract on the equilibrium path. Let β be the probability that c = L specified in Nature's move. The program for the optimal choice of the contract is as follows.

```
V(L, H) = max_{qL,pL,qH,pH} β(S(qL) − pL) + (1 − β)(S(qH) − pH);  (4)
s.t.  pL − L·qL ≥ pH − L·qH;                                      (5)
      pH − H·qH ≥ pL − H·qL;                                      (6)
      pL − L·qL ≥ 0;                                              (7)
      pH − H·qH ≥ 0.                                              (8)
```

Eq. (5) states that the monopoly with c = L weakly prefers (qL, pL) to (qH, pH); this is referred to as Type-L's incentive compatibility constraint. Similarly, Eq. (6) is Type-H's incentive compatibility constraint. Eq. (7) states that Type-L finds it optimal to accept the contract; this is referred to as Type-L's individual rationality or participation constraint. Similarly, Eq. (8) is Type-H's participation constraint. It is usually a bad idea to explore all possibilities regarding which inequality constraints are binding. In this particular case, it is possible to work out which constraints are binding without writing down the first order conditions. First, Eq. (7) is implied by Eqs. (5) and (8):

```
pL − L·qL ≥ pH − L·qH ≥ pH − H·qH ≥ 0,
```

where the first inequality is Eq. (5), the second inequality represents the assumptions that qH ≥ 0 and H > L, and the final inequality is Eq. (8). Therefore, we may remove the redundant constraint Eq. (7) from the program. Now Eq. (8) must be binding: both Eqs. (5) and (6) only involve pH − pL, so without Eq. (8) the regulator would send pH to −∞ while maintaining a fixed pH − pL. Let us imagine that the regulator has chosen (qL, qH). Then pH is pinned down by Eq. (8). Eqs. (5)-(6) imply that (H − L)·qH + L·qL ≤ pL ≤ H·qL.

This inequality immediately implies that (H − L)·qH + L·qL ≤ H·qL, which amounts to the requirement that qH ≤ qL. Now the regulator's payoff function is strictly decreasing in pL, so the binding constraint is the lower bound of pL, namely Eq. (5). To sum up, the binding constraints are Eqs. (5) and (8), but we have an additional requirement that qH ≤ qL. We eliminate the choice variables pL and pH from the binding constraints and reduce the regulator's problem to the following:

```
V(L, H) = max_{qL,qH} β(S(qL) − L·qL − (H − L)·qH) + (1 − β)(S(qH) − H·qH);
s.t.  qH ≤ qL.
```

Ignoring the constraint, we find that S′(qH) = H + (β/(1 − β))·(H − L) while S′(qL) = L. Since S′ is strictly decreasing and H > L, the optimal (qL, qH) implied by the first order conditions indeed satisfies the condition that qH ≤ qL.

Compared with the efficient outputs solving max_q S(q) − cq for every c, we see that qL is efficient while qH is too low. Also, Type-L receives the strictly positive payoff (H − L)·qH in equilibrium. This positive payoff is called his informational rent. Both phenomena are due to the monopoly's private information prior to contracting. To ensure that Type-L does not choose the option (qH, pH) recommended to Type-H, Type-L (the cost-efficient type) has to receive strictly higher payoff than Type-H (the cost-inefficient type). This informational rent is strictly increasing in qH. Consequently, increasing the output by Type-H generates an additional negative term, Type-L's informational rent, in the regulator's payoff. This term is not part of the social surplus and causes the downward distortion in the output by Type-H.

In a slightly more general model of adverse selection, Θ consists of more than two elements but is still ordered: 0 < θ1 < θ2 < ... < θm. The regulator recommends an output level qθ for each type θ and sets P(q) = −∞ if q is not one of the recommended outputs. Therefore, the contract offered by the regulator is a menu of m items {(qi, pi)}_{i=1}^m. The complication comes from the fact that each type now has (m − 1) on-schedule deviations, as he may choose the option recommended for any other type. Consequently, working out which IC and IR constraints are binding becomes more difficult. Without getting into the details, we merely state the main results.

(a) As long as qm > 0, every type but θm receives positive payoff (informational rent) in equilibrium.

(b) The binding IC constraints are those stating that Type-k does not choose the option recommended for Type-(k + 1), for k = 1, 2, ..., m − 1.

(c) More cost-efficient types produce higher outputs: qm ≤ qm−1 ≤ ... ≤ q1. However, these constraints are no longer automatic in the regulator's reduced program of choosing (q1, ..., qm).

(d) The most cost-efficient type produces the efficient output: S′(q1) = θ1. There are downward distortions for less cost-efficient types.
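The two-type solution derived above can be checked numerically. Because the reduced objective is separable in qL and qH, each output can be maximized on its own. A minimal sketch, assuming S(q) = 2√q (so that S′(0) = ∞ as required), L = 1, H = 2 and β = 1/2 (illustrative values not from the text); the closed form then predicts S′(qL) = L, hence qL = 1, and S′(qH) = H + (β/(1 − β))(H − L) = 3, hence qH = 1/9:

```python
from math import sqrt

beta, L, H = 0.5, 1.0, 2.0        # illustrative values, not from the text
S = lambda q: 2 * sqrt(q)         # consumer surplus; S'(q) = 1/sqrt(q)

grid = [i / 1000 for i in range(1, 2001)]   # outputs on (0, 2]

# The reduced objective beta*(S(qL) - L*qL - (H-L)*qH) + (1-beta)*(S(qH) - H*qH)
# is separable, so each output is maximized on its own.
qL_star = max(grid, key=lambda q: S(q) - L * q)
qH_star = max(grid, key=lambda q: (1 - beta) * (S(q) - H * q) - beta * (H - L) * q)

print(qL_star, qH_star)  # approximately 1 and 1/9; note qH <= qL
# Type-L's informational rent is (H - L) * qH_star > 0
```

The grid optimum reproduces both qualitative results: qL is at the efficient level while qH is distorted below its efficient level (which solves S′(q) = H, i.e. q = 1/4 here).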