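Introducing a Third Strategy into the N-Person Snowdrift Game Model and Exploring Its Effects

Published: 2018-05-01 21:32

  Topic keywords: snowdrift + game; Source: Zhejiang University, 2017 doctoral dissertation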


[Abstract]: The study of cooperative behavior in competitive populations is an important and pressing interdisciplinary problem. To date, game theory has provided the most effective framework. In evolutionary game models of cooperation, the prisoner's dilemma has received the most attention, whereas the snowdrift game, usually regarded as an alternative to the prisoner's dilemma for describing competitive settings, has been studied comparatively little. Building on earlier work, this dissertation generalizes and extends the snowdrift game by introducing a third strategy into the N-person snowdrift game (NSG), and studies the resulting models by deriving dynamical equations and by simulation. It finds that the N-person snowdrift game differs from the public goods game (the N-person generalization of the prisoner's dilemma) and exhibits distinctive dynamics: the public goods game model cycles among different states and cannot reach a steady state, whereas the N-person snowdrift game, given sufficient time to evolve, may settle into one of several (two or three) steady states of different character. This offers new clues for studying the evolution of cooperative behavior in groups.

Evolutionary game studies that introduce altruistic punishment had previously been carried out only for 2-person games. Chapter 2 introduces altruistic punishment into the two-strategy N-person snowdrift game for the first time, establishes a three-strategy NSG with punishment, and studies how punishment affects the NSG in a well-mixed population. The author gives a set of dynamical equations describing the three-strategy model. Given sufficient time, the system evolves into one of two kinds of steady state with different properties. Generally speaking, a relatively small cost-to-benefit ratio r, a larger group size N, and a larger multiplier factor β/α tend to suppress defectors, so that the system evolves into a cooperative population consisting only of cooperators (C) and punishers (P). Because all defectors have been converted, C and P earn exactly equal payoffs and the dynamics freeze; this steady state is called the frozen state, and its C and P frequencies depend on the initial state. Conversely, a larger r together with smaller N and β/α tends to put punishers on a self-destructive path: punishers gradually die out and the system becomes a population of only cooperators and defectors (D), which continues to evolve as in the original two-strategy NSG and finally reaches an active state. The C and D frequencies of the active state are therefore independent of the initial state and of the punisher-related parameters. The author further proposes a simulation algorithm that fully describes the evolutionary dynamics; the replicator dynamics equations and the simulation results were verified to agree closely.

Chapter 3 builds a three-strategy N-person snowdrift game by adding an extra L strategy to the original two-strategy NSG. The dissertation derives the dynamical equations for the frequencies of the three strategies in a well-mixed population. For any initial condition, iterating the dynamical equations yields the time evolution of the frequencies and their steady-state distribution. Different values of the model parameters, namely the cost-to-benefit ratio r and the fixed payoff L, produce rich evolutionary behavior. A detailed study of triangular flow diagrams showing how the system evolves indicates that, depending on the parameter values, the steady state is one of All L, All C, or C + D. Introducing strategy L plays two roles: it helps steer the system to the All L state, and it also helps the system reach the All C state. By contrast, introducing altruistic punishment (the P strategy) into the NSG can only lead to steady states in which two strategies are mixed. The author again uses a simulation algorithm to verify the theoretical results; this algorithm can be applied to NSG models in various structured populations.

Chapter 4 adds a lower threshold T on the number of cooperators to the optional snowdrift game model (Optional NSG). The dissertation gives the dynamical equations of this model and again verifies them with the simulation algorithm. As in the Optional NSG, the new model has a critical value r* that separates two final steady states: when r < r*, the final state is an active state in which C and D coexist; when r > r*, the final state is the frozen All L state. Setting the threshold to 2 helps the population reach C and D coexistence, but raising the threshold further instead suppresses cooperation. In the special case N = T, defectors can never gain by exploiting cooperators, so defectors become the disadvantaged group. In this setting the system evolves into one of two states, All C or All L, and the C and D coexistence state no longer occurs; to some extent this drives D to convert to C and ultimately eliminates the D strategy.

Chapter 5 reintroduces punishment on top of the Optional NSG, extending it to an N-person four-strategy game. The dissertation gives the dynamical equations of this model and, by iterating the equations and running the simulation algorithm, obtains some preliminary conclusions about its properties. As in the previous models, the N-person four-strategy snowdrift game has a critical value r* that marks an abrupt change in the final steady state. When r < r*, as r increases the final state passes in turn through active states with C and P coexisting, with C, D and P coexisting, and with C and D coexisting; this change is continuous. When r > r*, the final state jumps to the frozen All L state. This transition is abrupt rather than gradual. Examining how each parameter affects the final steady state, the dissertation finds that increasing L makes the critical point r* of the abrupt transition arrive earlier, that increasing β strengthens punishment, and that increasing N gives defectors more opportunity to exploit the work of cooperators, making cooperation harder.
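The abstract states that, once punishers die out, the system follows the dynamics of the original two-strategy NSG and reaches an active state whose C and D frequencies do not depend on the initial state. The abstract does not give the payoff functions, so the following is only a minimal sketch under assumptions that are not from the source: a well-mixed infinite population, groups of N sampled at random, every member receiving a benefit `b` if the group contains at least one cooperator, the cost `c` shared equally among the group's cooperators, and r = c/b. The function names, `dt`, and `steps` are hypothetical choices for illustration, and the update is a plain Euler iteration of the replicator equation rather than the author's own equations or algorithm.

```python
from math import comb


def expected_payoffs(x, N, b=1.0, c=0.5):
    """Expected payoffs (f_C, f_D) when cooperators have frequency x.

    Assumed payoffs (not from the source): in a group with n >= 1
    cooperators everyone receives b and each cooperator pays c / n.
    """
    f_c = 0.0
    f_d = 0.0
    for k in range(N):  # k cooperators among the N - 1 co-players
        prob = comb(N - 1, k) * x**k * (1 - x) ** (N - 1 - k)
        f_c += prob * (b - c / (k + 1))  # focal cooperator shares the cost
        if k >= 1:
            f_d += prob * b  # focal defector benefits only if someone cooperates
    return f_c, f_d


def cooperator_frequency(x0, N, b=1.0, c=0.5, dt=0.01, steps=20000):
    """Euler iteration of dx/dt = x (1 - x) (f_C - f_D), starting from x0."""
    x = x0
    for _ in range(steps):
        f_c, f_d = expected_payoffs(x, N, b, c)
        x += dt * x * (1 - x) * (f_c - f_d)
        x = min(max(x, 0.0), 1.0)  # keep the frequency inside [0, 1]
    return x


if __name__ == "__main__":
    for x0 in (0.1, 0.5, 0.9):
        print(x0, cooperator_frequency(x0, N=5))
```

Under these assumed payoffs the N = 2 case has the closed-form interior fixed point x* = 2(1 - r)/(2 - r), which gives a quick check on the iteration; starting from different values of `x0` illustrates the independence from initial conditions that the abstract attributes to the active state.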

[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree awarded]: 2017
[Classification number]: O225

[Author]: 徐猛 (Xu Meng)


