LP Duality
Primal problem:
$$\begin{aligned}
\max_{\mathbf{x}}\quad & \mathbf{c}^{\mathrm{T}}\mathbf{x}\\
\text{s.t.}\quad & \mathbf{A}\mathbf{x}\le \mathbf{b}\\
& \mathbf{x}\ge 0
\end{aligned}$$
Interpretation:
$x_i$: amount of product $i$ produced
$b_i$: amount of raw material of type $i$ available
$a_{ij}$: amount of raw material $i$ used to produce $1$ unit of product $j$
$c_j$: profit from $1$ unit of product $j$
Dual problem:
$$\begin{aligned}
\min_{\mathbf{y}}\quad & \mathbf{b}^{\mathrm{T}}\mathbf{y}\\
\text{s.t.}\quad & \mathbf{A}^{\mathrm{T}}\mathbf{y}\ge \mathbf{c}\\
& \mathbf{y}\ge 0
\end{aligned}$$
$y_i$ can be interpreted as the unit selling price of raw material $i$.
$\mathbf{A}^{\mathrm{T}}\mathbf{y}\ge \mathbf{c}$: the producer sells the raw materials only if selling them directly earns at least as much as the product profit.
$\min_{\mathbf{y}} \mathbf{b}^{\mathrm{T}}\mathbf{y}$: the buyer wants to minimize the total expenditure.
This leads to the conclusion:
$$\mathbf{b}^{\mathrm{T}}\mathbf{y}^{*}=\mathbf{c}^{\mathrm{T}}\mathbf{x}^{*}$$
Strong Duality of LP:
If the primal and the dual are both feasible (i.e. their constraint sets are non-empty), then they have the same optimal objective value.
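As a sanity check, strong duality can be verified numerically on a small instance. The sketch below uses made-up data for the production interpretation above and solves both problems with `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Small production LP with made-up data (2 products, 2 raw materials).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # a_ij: raw material i used per unit of product j
b = np.array([8.0, 9.0])     # b_i: raw material available
c = np.array([3.0, 2.0])     # c_j: profit per unit of product j

# Primal: max c^T x s.t. Ax <= b, x >= 0 (linprog minimizes, so negate c).
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
# Dual: min b^T y s.t. A^T y >= c, y >= 0 (flip signs to get <= form).
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

primal_opt = -primal.fun     # undo the sign flip
dual_opt = dual.fun
# Strong duality: b^T y* = c^T x*.
assert abs(primal_opt - dual_opt) < 1e-8
```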
Farkas' lemma
Exactly one of the following statements is true:
$$\exists \mathbf{x}\ge 0 \quad \text{s.t.}~\mathbf{A}\mathbf{x}=\mathbf{b}$$
$$\exists \mathbf{y} \quad \text{s.t.}~\mathbf{y}^{\mathrm{T}}\mathbf{A}\ge 0 \text{ and } \mathbf{y}^{\mathrm{T}}\mathbf{b}<0$$
Geometric Interpretation of Farkas' Lemma
Consider the two-dimensional case, and denote the column vectors of $\mathbf{A}$ by $\mathbf{a}_1,\mathbf{a}_2$.
For the first condition to hold, $\mathbf{b}$ must lie in the cone generated by $\mathbf{a}_1,\mathbf{a}_2$ (the highlighted region in the figure); for the second, we must find a $\mathbf{y}$ such that $\mathbf{a}_1,\mathbf{a}_2$ and $\mathbf{b}$ lie on opposite sides of the line $\mathbf{y}^{\mathrm{T}}\mathbf{x}=0$. Hence exactly one of the two conditions can be satisfied.
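A tiny numerical illustration of the two alternatives, with vectors chosen purely for illustration:

```python
import numpy as np

# Columns of A in the 2-D picture.
a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A = np.column_stack([a1, a2])

# Alternative 1: b lies in the cone of a1, a2, so Ax = b has a solution x >= 0.
b_in = np.array([2.0, 3.0])
x = np.linalg.solve(A, b_in)
assert np.all(x >= 0)

# Alternative 2: b lies outside the cone; y is the normal of a separating
# line y^T x = 0 with a1, a2 on one side and b strictly on the other.
b_out = np.array([-1.0, 1.0])
y = np.array([1.0, 0.0])
assert np.all(y @ A >= 0) and y @ b_out < 0
```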
Proof of Duality
To prove $\mathbf{b}^{\mathrm{T}}\mathbf{y}^{*}=\mathbf{c}^{\mathrm{T}}\mathbf{x}^{*}$, we show that each side is bounded by the other. For any primal-feasible $\mathbf{x}$ and dual-feasible $\mathbf{y}$,
$$\mathbf{c}^{\mathrm{T}}\mathbf{x}=\mathbf{x}^{\mathrm{T}}\mathbf{c}\le \mathbf{x}^{\mathrm{T}}\mathbf{A}^{\mathrm{T}}\mathbf{y} \le \mathbf{b}^{\mathrm{T}}\mathbf{y}$$
It remains to show $\mathbf{c}^{\mathrm{T}}\mathbf{x}^{*}\ge \mathbf{b}^{\mathrm{T}}\mathbf{y}^{*}$.
Write $\mathbf{c}^{\mathrm{T}}\mathbf{x}^{*}=\Delta$. Since $\mathbf{x}^{*}$ is optimal, for every $\varepsilon>0$ there is no $\mathbf{x}$ with
$$\mathbf{c}^{\mathrm{T}}\mathbf{x}\ge \Delta+\varepsilon,\quad \mathbf{A}\mathbf{x}\le \mathbf{b},\quad \mathbf{x}\ge 0$$
Using slack variables $\alpha_0,\boldsymbol{\alpha}$, we rewrite the inequalities as equalities:
$$-\mathbf{c}^{\mathrm{T}}\mathbf{x}+\alpha_0 = -\Delta-\varepsilon,\quad \mathbf{A}\mathbf{x}+\boldsymbol{\alpha} = \mathbf{b},\quad \alpha_0,\boldsymbol{\alpha},\mathbf{x}\ge 0$$
In matrix form:
$$\begin{pmatrix}
-\mathbf{c}^{\mathrm{T}} & 1 & \mathbf{0} \\
\mathbf{A} & \mathbf{0} & \mathbf{I}
\end{pmatrix}
\begin{pmatrix}
\mathbf{x} \\ \alpha_0 \\ \boldsymbol{\alpha}
\end{pmatrix} =
\begin{pmatrix}
-\Delta - \varepsilon \\ \mathbf{b}
\end{pmatrix}$$
By Farkas' lemma, since the system above has no nonnegative solution, there exist
$$\lambda_0\ge 0, \quad \boldsymbol{\lambda}_1 \ge 0$$
(the sign constraints follow from applying $\mathbf{y}^{\mathrm{T}}\mathbf{A}\ge 0$ to the columns corresponding to $\alpha_0$ and $\boldsymbol{\alpha}$) such that
$$-\lambda_0\mathbf{c}^{\mathrm{T}} + \boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{A}\ge 0 \quad\text{and}\quad -(\Delta+\varepsilon)\lambda_0 + \boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{b} < 0$$
If $\lambda_0=0$, then $\boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{A}\ge 0$ and $\boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{b} < 0$, which is impossible: for any primal-feasible $\mathbf{x}$ we would get $0 > \boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{b}\ge \boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{A}\mathbf{x}\ge 0$. Hence $\lambda_0>0$ and we may divide:
$$\frac{1}{\lambda_0} \boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{A} \ge \mathbf{c}^{\mathrm{T}} \quad\text{and}\quad \frac{1}{\lambda_0}\boldsymbol{\lambda}_1^{\mathrm{T}}\mathbf{b}<\Delta + \varepsilon$$
Let $\mathbf{y} = \boldsymbol{\lambda}_1 / \lambda_0$; then $\mathbf{y}$ is dual feasible, so for every $\varepsilon>0$ we have $\mathbf{y}^{*\mathrm{T}}\mathbf{b} \le \mathbf{y}^{\mathrm{T}}\mathbf{b}<\Delta+\varepsilon$, and therefore
$$\mathbf{y}^{*\mathrm{T}}\mathbf{b}\le \Delta=\mathbf{c}^{\mathrm{T}}\mathbf{x}^{*}$$
Another explanation of Duality
Write the primal problem as:
$$\begin{aligned}
\max_{\mathbf{x}}\quad & \mathbf{c}^{\mathrm{T}}\mathbf{x}\\
\text{s.t.}\quad & \mathbf{a}_1^{\mathrm{T}} \mathbf{x}\le b_1\\
& \mathbf{a}_2^{\mathrm{T}} \mathbf{x}\le b_2 \\
& \cdots \\
& \mathbf{a}_n^{\mathrm{T}} \mathbf{x}\le b_n \\
& \mathbf{x}\ge 0
\end{aligned}$$
Introduce coefficients $y_1,y_2,\ldots,y_n\ge 0$; multiplying each constraint by $y_i$ gives
$$y_1 \mathbf{a}_1^{\mathrm{T}}\mathbf{x} \le b_1 y_1,\quad y_2 \mathbf{a}_2^{\mathrm{T}}\mathbf{x} \le b_2 y_2,\quad \ldots,\quad y_n \mathbf{a}_n^{\mathrm{T}}\mathbf{x} \le b_n y_n$$
If in addition $y_1 \mathbf{a}_1^{\mathrm{T}} + y_2 \mathbf{a}_2^{\mathrm{T}} + \cdots + y_n \mathbf{a}_n^{\mathrm{T}}\ge \mathbf{c}^{\mathrm{T}}$, i.e. $\mathbf{A}^{\mathrm{T}}\mathbf{y}\ge \mathbf{c}$, then since $\mathbf{x}\ge 0$,
$$\mathbf{c}^{\mathrm{T}}\mathbf{x} \le (y_1 \mathbf{a}_1^{\mathrm{T}} + y_2 \mathbf{a}_2^{\mathrm{T}} + \cdots + y_n \mathbf{a}_n^{\mathrm{T}})\mathbf{x} \le \mathbf{b}^{\mathrm{T}}\mathbf{y}$$
This recovers the dual problem.
Intuitively, we want $\mathbf{c}$ to be dominated by a nonnegative linear combination of $\mathbf{a}_1,\mathbf{a}_2,\ldots,\mathbf{a}_n$, which yields an upper bound on $\mathbf{c}^{\mathrm{T}}\mathbf{x}$; by the same reasoning, every such upper bound $\mathbf{b}^{\mathrm{T}}\mathbf{y}$ is lower-bounded by the primal optimum.
Two-Player Zero-Sum Game
Two players: p1, p2.
Payoff to p1 = − payoff to p2.
Let:
$A(i,j)$: the payoff to p1 when p1 plays action $i$ and p2 plays action $j$
$x$: probability distribution over p1's actions
$y$: probability distribution over p2's actions
$$u_1(x,y)=\sum_{i,j}A(i,j)x_i y_j = \mathbf{x}^{\mathrm{T}}\mathbf{A}\mathbf{y},\qquad u_2(x,y) = -u_1(x,y)$$
Thus a NE $(x^{*},y^{*})$ satisfies, for all $x$ and $y$,
$$(x^{*})^{\mathrm{T}}A y^{*} \ge x^{\mathrm{T}} A y^{*} \quad\text{and}\quad (x^{*})^{\mathrm{T}} A y^{*}\le (x^{*})^{\mathrm{T}}A y$$
We call $(x^{*},y^{*})$ a saddle point.
Minmax Theorem
$$\min_{y} \max_{x} x^{\mathrm{T}}Ay = \max_{x} \min_{y} x^{\mathrm{T}}Ay \tag{1}$$
The common value in (1) is called the value of the game.
$$\max_{x}\big(\min_{y}x^{\mathrm{T}}Ay\big) \tag{2.1}$$
$$\min_{y}\big(\max_{x}x^{\mathrm{T}}Ay\big) \tag{2.2}$$
Suppose $x^{*}$ solves (2.1) and $y^{*}$ solves (2.2). Then:
$(x^{*},y^{*})$ is a NE
$(x^{*})^{\mathrm{T}}Ay^{*}$ is the value of the game
Conversely, if $(x^{*},y^{*})$ is a NE:
$(x^{*})^{\mathrm{T}}Ay^{*}$ is the value of the game
$x^{*}$ solves (2.1) and $y^{*}$ solves (2.2)
All three characterizations (solving (2.1)/(2.2), being a NE, attaining the value) pick out the same pairs $(x^{*},y^{*})$.
Proof
Suppose p1 wants to maximize its worst-case payoff:
$$\max_{x} \min_{y} x^{\mathrm{T}}Ay = \max_{x} \min_{j} (x^{\mathrm{T}}A)_{j}$$
where the second equality holds because the inner minimum over distributions $y$ is attained at a pure action $j$.
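The step from $\min_y$ to $\min_j$ can be checked empirically: for a fixed $x$, the payoff $x^{\mathrm{T}}Ay$ is linear in $y$, so its minimum over the simplex is attained at a vertex (a pure action). A small sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))          # random payoff matrix (illustration only)
x = np.array([0.2, 0.5, 0.3])        # a fixed mixed strategy for p1

v = x @ A                            # payoff of x against each pure action j
pure_min = v.min()                   # min_j (x^T A)_j

# No mixed y can do worse than the worst pure action: x^T A y is a convex
# combination of the entries of v.
for _ in range(1000):
    y = rng.dirichlet(np.ones(4))    # random distribution over p2's actions
    assert v @ y >= pure_min - 1e-12
```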
This turns the problem into an LP, which we can analyze via LP duality.
$$\begin{aligned}
\max_{x,v_1}\quad & v_1\\
\text{s.t.}\quad & v_1\le (x^{\mathrm{T}}A)_{j},\ \forall j\\
& x\ge 0,\ \textstyle\sum_i x_i = 1
\end{aligned}\tag{LP1}$$
(The constraint $\sum_i x_i = 1$ is needed because $x$ is a probability distribution.)
Similarly, p2’s problem is
$$\begin{aligned}
\min_{y,v_2}\quad & v_2\\
\text{s.t.}\quad & v_2\ge (Ay)_{i},\ \forall i\\
& y\ge 0,\ \textstyle\sum_j y_j = 1
\end{aligned}\tag{LP2}$$
Thus by strong duality, we have $v_1^{*}=v_2^{*}$, or
$$\max_{x}\min_{y}x^{\mathrm{T}}Ay=\min_{y}\max_{x}x^{\mathrm{T}}Ay$$
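The two LPs can be solved directly. The sketch below (payoff matrix chosen for illustration, here rock-paper-scissors) builds LP1 and LP2 for `scipy.optimize.linprog` and checks $v_1^{*}=v_2^{*}$:

```python
import numpy as np
from scipy.optimize import linprog

# Payoff matrix for illustration (rows: p1 actions, cols: p2 actions).
A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])   # rock-paper-scissors
m, n = A.shape

# LP1: max v1 s.t. v1 <= (x^T A)_j for all j, sum(x) = 1, x >= 0.
# Variables z = (x_1..x_m, v1); linprog minimizes, so minimize -v1.
c1 = np.r_[np.zeros(m), -1.0]
G1 = np.hstack([-A.T, np.ones((n, 1))])      # v1 - (x^T A)_j <= 0
eq1 = np.r_[np.ones(m), 0.0].reshape(1, -1)  # sum(x) = 1
res1 = linprog(c1, A_ub=G1, b_ub=np.zeros(n), A_eq=eq1, b_eq=[1.0],
               bounds=[(0, None)] * m + [(None, None)])
v1 = -res1.fun

# LP2: min v2 s.t. v2 >= (A y)_i for all i, sum(y) = 1, y >= 0.
c2 = np.r_[np.zeros(n), 1.0]
G2 = np.hstack([A, -np.ones((m, 1))])        # (A y)_i - v2 <= 0
eq2 = np.r_[np.ones(n), 0.0].reshape(1, -1)  # sum(y) = 1
res2 = linprog(c2, A_ub=G2, b_ub=np.zeros(m), A_eq=eq2, b_eq=[1.0],
               bounds=[(0, None)] * n + [(None, None)])
v2 = res2.fun

assert abs(v1 - v2) < 1e-8   # strong duality: the game has a single value
```

For rock-paper-scissors the optimal strategies are uniform and the value is $0$.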
Let $x^{\star}$ be a solution of LP1. From the constraint
$$v_1^\star\leq(x^{\star\mathrm{T}}A)_j \quad \forall j$$
multiply by $y_j^\star$ and sum over $j$ (using $\sum_j y_j^\star = 1$):
$$\sum_j v_1^\star y_j^\star\leq\sum_j(x^{\star\mathrm{T}}A)_j y_j^\star
\;\Rightarrow\; v_{1}^{\star}\leq x^{\star\mathrm{T}}Ay^{\star}$$
where $y^{\star}$ is the solution of LP2. Similarly, we can show that
$$v_{2}^{\star}\geq x^{\star\mathrm{T}}Ay^{\star}$$
Since $v_{1}^{\star}= v_{2}^{\star}$, we have
$$x^{\star\mathrm{T}}Ay^{\star}=\max_{x}\min_{y}x^{\mathrm{T}}Ay=\min_{y}\max_{x}x^{\mathrm{T}}Ay$$
Then we can show that $(x^\star, y^\star)$ is a NE:
$$\min_{y}x^{\star\mathrm{T}}Ay=\max_{x}\min_{y}x^{\mathrm{T}}Ay=\min_{y}\max_{x}x^{\mathrm{T}}Ay=\max_{x}x^{\mathrm{T}}Ay^{\star} \ge x^{\star \mathrm{T}}Ay^{\star}$$
Since also $\min_{y}x^{\star\mathrm{T}}Ay \le x^{\star\mathrm{T}}Ay^{\star}$, all of these quantities are equal: $y^{\star}$ is a best response to $x^{\star}$, and symmetrically $x^{\star}$ is a best response to $y^{\star}$. So $(x^{\star},y^{\star})$ is a NE.
Conversely, suppose $(x^{*},y^{*})$ is a NE. Then
$$x^{* {\mathrm{T}}} A y^{*} = \min_{y} x^{* {\mathrm{T}}} A y \le \max_{x} \min_{y} x^{\mathrm{T}}Ay$$
Similarly,
$$x^{* {\mathrm{T}}}A y^{*} = \max_{x} x^{\mathrm{T}}Ay^{*}\ge \min_{y} \max_{x} x^{\mathrm{T}}Ay$$
$$\Rightarrow\min_y\max_x x^{\mathrm{T}}Ay\leq x^{*\mathrm{T}}Ay^{*}\leq\max_x\min_y x^{\mathrm{T}}Ay$$
Combining this with the general inequality
$$\min_y\max_x f(x,y)\geq\max_x\min_y f(x,y)$$
we conclude that all these inequalities are equalities.
So
$$\min_{y}x^{*\mathrm{T}}Ay=x^{*\mathrm{T}}Ay^{*}=\max_{x}\min_{y}x^{\mathrm{T}}Ay\geq\min_{y}x^{\mathrm{T}}Ay \quad \forall x$$
$\Rightarrow x^{*}$ maximizes $\min_{y}x^{\mathrm{T}}Ay$. Similarly, $y^{*}$ minimizes $\max_{x}x^{\mathrm{T}}Ay$.
If $(x^{*},y^{*})$ and $(\tilde{x},\tilde{y})$ are NE, then $(x^{*},\tilde{y})$ and $(\tilde{x},y^{*})$ are also NE.
Non-Zero-Sum Game
$u_i(a_i,a_{-i})$: payoff to player $i$, where $a_i\in A_i$ (discrete, finite)
$p_{i}(a_i)=\operatorname{Pr}(\text{player } i \text{ plays } a_i)$
$u_i(p_{i},p_{-i})=\mathbb{E}(\text{payoff to player } i)$
Theorem: A NE exists.
(Brouwer's fixed-point theorem) Let $C \subseteq \mathbb{R}^{n}$ be a convex, closed, bounded set, and let $f: C \rightarrow C$ be a continuous function. Then $f$ has a fixed point, i.e. $\exists x \in C$ with $x=f(x)$. Intuitively, in one dimension the graph of $y=f(x)$ must cross the line $y=x$.
Let $p=(p_1,p_2,\ldots,p_n)$ be a profile of strategies.
Define $r_i(a_i)=(u_i(a_i,p_{-i})-u_i(p_{i},p_{-i}))^{+}$, the gain in payoff from switching from strategy $p_{i}$ to the pure action $a_i$. We want to find $p^{*}$ such that $r_i(a_i)=0$ for all $i$ and $a_i$; such a profile is a Nash equilibrium.
Define
$$f_i(p_{i}(a_i)) = \frac{p_{i}(a_i)+r_i(a_i)}{\sum_{\alpha}(p_{i}(\alpha)+r_i(\alpha))}$$
Here $p_{i}(a_i)$ is the probability that strategy $p_{i}$ selects $a_i$; the map renormalizes $p_{i}(a_i)+r_i(a_i)$ into a probability distribution.
Write
$$f(p) = \begin{pmatrix}
f_1(p_1) \\ f_2(p_2) \\ \vdots \\ f_n(p_n)
\end{pmatrix}$$
Note that $p=(p_1,p_2,\ldots,p_n)$ lies in a closed, bounded, convex subset of $[0,1]^{n}$ (a product of probability simplices), and $f$ is a continuous function. So there exists a fixed point $p=f(p)$, at which
$$p_{i}(a_i) = \frac{p_{i}(a_i)+r_i(a_i)}{\sum_{\alpha}(p_{i}(\alpha)+r_i(\alpha))}$$
As before, we want to show that $r_i(a_i)=0$.
First, we claim that for each $i$, $\exists a_i$ with $p_{i}(a_i)>0$ s.t. $r_i(a_i)=0$.
Suppose instead that for some $i$, $r_i(a_i)>0$ for every $a_i$ in the support of $p_{i}$:
$$\Rightarrow u_i(a_i,p_{-i})>u_i(p_{i},p_{-i}),\quad \forall a_i \text{ with } p_{i}(a_i)>0$$
$$\Rightarrow \sum_{a_i\in A_i}p_{i}(a_i) u_i(a_i,p_{-i}) > u_i(p_{i},p_{-i})$$
$$\Leftrightarrow u_i(p_{i},p_{-i})>u_i(p_{i},p_{-i})$$
which is a contradiction.
Fix $i$, and let $a_i$ be such an action with $r_i(a_i)=0$ and $p_{i}(a_i)>0$. The fixed-point equation gives
$$p_{i}(a_i) = \frac{p_{i}(a_i)}{\sum_{\alpha}(p_{i}(\alpha)+ r_i(\alpha))}$$
Cancelling $p_{i}(a_i)>0$,
$$\Rightarrow \sum_{\alpha\in A_i} p_{i}(\alpha) + \sum_{\alpha \in A_i}r_i(\alpha)=1
\;\Rightarrow\; \sum_{\alpha \in A_i}r_i(\alpha) = 0$$
since $\sum_{\alpha}p_{i}(\alpha)=1$. Thus, $\forall \alpha\in A_i$, $r_i(\alpha)=0$.
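The map $f$ is concrete enough to compute. A minimal sketch for a two-player game (matching pennies, a payoff matrix chosen for illustration) evaluates $r_i$ and $f$ and confirms that the uniform NE is a fixed point:

```python
import numpy as np

# Payoff matrices for matching pennies (illustrative choice).
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])   # payoff to player 1
U2 = -U1                                    # zero-sum special case

def nash_map(p1, p2):
    """One application of f: renormalize p_i(a_i) + r_i(a_i)."""
    u1, u2 = p1 @ U1 @ p2, p1 @ U2 @ p2           # current expected payoffs
    r1 = np.maximum(U1 @ p2 - u1, 0.0)            # r_1(a_1) = (u_1(a_1,p_2) - u_1(p))^+
    r2 = np.maximum(p1 @ U2 - u2, 0.0)            # r_2(a_2)
    f1 = (p1 + r1) / (p1 + r1).sum()
    f2 = (p2 + r2) / (p2 + r2).sum()
    return f1, f2

# At the NE of matching pennies (both players uniform), every r_i is 0,
# so the profile is a fixed point of f.
p1 = p2 = np.array([0.5, 0.5])
f1, f2 = nash_map(p1, p2)
assert np.allclose(f1, p1) and np.allclose(f2, p2)
```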
Proof using the Kakutani fixed-point theorem
Let $C$ be a closed, bounded, convex subset of $\mathbb{R}^{n}$. Let $f$ be a correspondence mapping each point in $C$ to a subset of $C$:
$$f: C \rightarrow 2^{C}$$
We need this set-valued version because the best-response condition $p \in BP(p)$ encountered earlier is a map from a point to a subset.
Suppose that:
$f(x)\neq \emptyset$ for all $x$,
$f(x)$ is a convex set for all $x$,
$f$ has a closed graph ($f$ has a closed graph if for all sequences $\left\{ x_n \right\} \subseteq C$ and $\left\{ y_n \right\}$ with $(x_n,y_n)\rightarrow(x,y)$ and $y_n\in f(x_n)$, we have $y\in f(x)$).
Then $f$ has a fixed point in $C$: $\exists x$ with $x\in f(x)$.
To establish the existence of a solution to
$$p\in BP(p)$$
where
$$p=\begin{pmatrix}
p_1\\ p_2\\ \vdots \\ p_{n}
\end{pmatrix} \quad
BP(p) = \begin{pmatrix}
BP(p_{-1}) \\ BP(p_{-2}) \\ \vdots \\ BP(p_{-n})
\end{pmatrix}$$
We will verify that $BP$ has all the properties required in the Kakutani fixed-point theorem.
$BP(p)$ is non-empty for each $p$. This is clear: given the other players' strategies, we can always find a strategy maximizing our own payoff.
$BP(p)$ is convex. By the definition of convexity, we must show: if $p_{i}^{*}, \tilde{p}_{i} \in BP(p_{-i})$, then $\alpha p_{i}^{*}+(1-\alpha)\tilde{p}_{i} \in BP(p_{-i})$. This follows from the definition of $u_i$,
$$u_i(p_{i},p_{-i}) = \sum_{a} u_i(a_i, a_{-i}) p_1(a_1) p_2(a_2) \cdots p_n(a_n)$$
which is linear in $p_{i}$, so
$$u_i(p_{i}^{*},p_{-i}) = u_i(\tilde{p}_{i},p_{-i}) = \alpha u_i(p_{i}^{*},p_{-i}) + (1-\alpha) u_i(\tilde{p}_{i},p_{-i}) = u_i(\alpha p_{i}^{*}+(1-\alpha)\tilde{p}_{i},p_{-i})$$
$BP$ has a closed graph. Let $(p_{i}^{(n)}, p_{-i}^{(n)}) \rightarrow (p_{i}, p_{-i})$ with $p_{i}^{(n)} \in BP_{i}(p_{-i}^{(n)})$.
Suppose that $p_{i} \not\in BP_{i}(p_{-i})$. Then $\exists \hat{p}_i$ and $\varepsilon>0$ such that
$$u_i(\hat{p}_i,p_{-i})\ge u_i(p_{i},p_{-i}) + \varepsilon$$
Then $\hat{p}_i$ is a strictly better response to $p_{-i}^{(n)}$ (for some large $n$) than $p_{i}^{(n)}$, which is a contradiction:
For sufficiently large $n$, by continuity of $u_i$,
$$\begin{aligned}
u_i(\hat{p}_i,p_{-i}^{(n)}) &\ge u_i(\hat{p}_i,p_{-i}) - \frac{\varepsilon}{2} \\
&\ge u_i(p_{i},p_{-i}) + \varepsilon -\frac{\varepsilon}{2} \\
&\ge u_i(p_{i}^{(n)}, p_{-i}^{(n)}) - \frac{\varepsilon}{4} + \frac{\varepsilon}{2} \\
&=u_i(p_{i}^{(n)},p_{-i}^{(n)}) + \frac{\varepsilon}{4}
\end{aligned}$$