Evolution of Zero-Determinant Strategies in the Snowdrift Game

Published: 2018-07-29 14:44

[Abstract]: A zero-determinant (ZD) strategy can not only unilaterally set the opponent's payoff but also enforce a linear relation between the two players' payoffs, which makes it possible to extort the opponent. Because the payoffs of a ZD strategy in the early rounds of a game deviate from its steady-state payoffs, this paper uses Markov chain theory to derive the transient distribution, the transient payoffs, and the time needed to reach the steady state when a ZD strategy plays against an all-cooperation strategy. With a small extortion factor, the extortioner's early payoff is higher than its steady-state payoff; with a larger extortion factor the opposite holds. Moreover, the larger the extortion factor, the less favorable it is to mutual cooperation and the more slowly the steady state is reached. This provides a way to compute real-time payoffs in real-life games where strategies are updated frequently. In addition, for the game between an extortion strategy and an evolutionary player, it is shown that from the state of mutual defection the evolutionary player is certain to evolve to the all-cooperation strategy in the next round. Simulations of the strategy-update process from all states show that the evolutionary player's speed of evolution differs markedly across the four cases and that it eventually evolves to the all-cooperation strategy in each, indicating that the zero-determinant strategy acts as a catalyst for the emergence of cooperation.
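The Markov chain computation described in the abstract (transient distribution, transient payoffs, time to reach the steady state) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the paper's own setup: the snowdrift payoffs are parameterized by a benefit `b` and cost `c`, the extortion strategy follows the standard Press-Dyson form with the mutual-defection payoff as baseline, play is assumed to start from mutual cooperation, and the values `chi=1.5`, `phi=0.5` are arbitrary. The function names are hypothetical, and the sketch is not meant to reproduce the paper's numerical results.

```python
# Sketch: transient behavior of an extortion ZD strategy against all-cooperate.
# States are ordered (CC, CD, DC, DD), with the ZD player X's move written first.

def snowdrift_payoffs(b=1.0, c=0.5):
    """Assumed snowdrift payoffs (R, S, T, P) with b > c > 0, so T > R > S > P."""
    return b - c / 2, b - c, b, 0.0

def extortion_strategy(chi, phi, payoffs):
    """Press-Dyson extortion form (assumed): p~ = phi * [(S_X - P) - chi * (S_Y - P)]."""
    R, S, T, P = payoffs
    sx, sy = (R, S, T, P), (R, T, S, P)
    base = (1.0, 1.0, 0.0, 0.0)  # p~ = (p1 - 1, p2 - 1, p3, p4)
    p = [base[i] + phi * ((sx[i] - P) - chi * (sy[i] - P)) for i in range(4)]
    if any(x < -1e-12 or x > 1 + 1e-12 for x in p):
        raise ValueError("chi/phi give probabilities outside [0, 1]")
    return [min(1.0, max(0.0, x)) for x in p]

def transition_matrix(p, q):
    """4x4 transition matrix for memory-one strategies p (X) and q (Y)."""
    q_seen = (q[0], q[2], q[1], q[3])  # Y sees states CD and DC swapped
    return [[px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
            for px, qy in zip(p, q_seen)]

def step(v, M):
    """One round of play: row vector v times matrix M."""
    return [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]

def expected_payoff(v, s):
    """Expected payoff under state distribution v and per-state payoffs s."""
    return sum(vi * si for vi, si in zip(v, s))

def time_to_steady_state(v0, M, tol=1e-6, max_steps=100_000):
    """First round t whose distribution is within tol (L1 distance) of the limit.

    The limit is approximated by iterating until successive distributions
    stop changing; this assumes the chain converges (no periodic behavior).
    """
    limit = list(v0)
    for _ in range(max_steps):
        nxt = step(limit, M)
        change = sum(abs(a - b) for a, b in zip(nxt, limit))
        limit = nxt
        if change < 1e-13:
            break
    v = list(v0)
    for t in range(max_steps + 1):
        if sum(abs(a - b) for a, b in zip(v, limit)) < tol:
            return t, limit
        v = step(v, M)
    return None, limit

if __name__ == "__main__":
    pay = snowdrift_payoffs()
    R, S, T, P = pay
    p = extortion_strategy(chi=1.5, phi=0.5, payoffs=pay)
    M = transition_matrix(p, (1.0, 1.0, 1.0, 1.0))  # opponent always cooperates
    v = [1.0, 0.0, 0.0, 0.0]                         # assumed start: mutual cooperation
    for t in range(5):
        print(t, expected_payoff(v, (R, S, T, P)), expected_payoff(v, (R, T, S, P)))
        v = step(v, M)
    print(time_to_steady_state([1.0, 0.0, 0.0, 0.0], M))
```

Printing the per-round payoffs next to the limiting values makes the gap between early and steady-state payoffs visible; varying `chi` (within the range that keeps all probabilities valid) shows how the extortion factor changes both the limit and the number of rounds needed to approach it under these assumptions.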
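[Author affiliations]: School of Management, University of Shanghai for Science and Technology; School of Mathematics and Statistics, North China University of Water Resources and Electric Power; Department of Business and Trade Technology, Xijing University
[Funding]: Supported by the National Natural Science Foundation of China (Grant No. 71571119) and the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 11501199)
[Classification number]: O225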
