
Optimized Design of Resource Control for D2D Communication in Cellular Networks

Published: 2018-11-19 17:06
[Abstract]: Mobile communication networks have come a long way, from the first-generation analog systems built around voice service to today's widely commercialized fourth-generation (4G) systems. Supporting emerging services such as mobile cloud computing and mobile multimedia has become the next evolutionary goal of mobile communications. With the rapid spread of smart terminals and the explosive growth of network traffic, wireless technologies for fifth-generation (5G) mobile communication have attracted intense attention from the industry. Device-to-Device (D2D) communication, a key candidate technology for 5G, has become a research hotspot. Introducing D2D communication into a cellular network drastically changes how network and user resources are allocated, so the optimized design of resource control has become a central research problem. Resource control comprises mode selection, resource allocation, and power control, and a growing body of work studies these three mechanisms jointly to achieve the best network performance. Most existing research on D2D communication is based on the infinite-backlog traffic model or the packet-level traffic model; this thesis instead adopts the flow-level traffic model, which better suits next-generation wireless communication, and optimizes the mode-selection and resource-allocation components of resource control. Applying Orthogonal Frequency Division Multiple Access (OFDMA) in a cellular network with D2D communication, the thesis studies the network's resource-control optimization problem with the goal of minimizing the average energy consumed per flow transfer. Building on queueing theory, this optimization problem is formulated as an infinite-horizon average-reward Markov decision process (MDP). The classical solution method for an MDP is Bellman's equation, i.e., the traditional centralized offline value-iteration algorithm. To combat the curse of dimensionality in solving the MDP, the thesis reduces Bellman's equation to an equivalent Bellman's equation, relates the Q-value function to the value function of the equivalent Bellman's equation, further simplifies the global Q-value function by linear approximation, and updates the Q-value function iteratively with an online stochastic learning algorithm. On this theoretical basis, the thesis proposes a distributed resource-control algorithm combining the equivalent Bellman's equation, linear approximation, and online stochastic learning to optimize the mode selection and resource allocation of D2D communication in cellular networks, thereby minimizing the average energy consumption of flow transfers. A simulation platform is built to compare the proposed algorithm with four other mode-selection and resource-allocation algorithms; the results show that the network running the proposed algorithm achieves the lowest average energy consumption per flow transfer.
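For context, the MDP formulation the abstract refers to follows the standard infinite-horizon average-reward setup. The sketch below uses generic textbook notation (state s, action a, per-stage energy cost g, optimal average cost theta*); the symbols and the per-user additive form of the linear approximation are reconstructions for illustration, not the thesis's own notation.

    % Average-reward Bellman equation: \theta^{*} is the minimal
    % average cost and V the relative value function.
    \theta^{*} + V(s) = \min_{a \in \mathcal{A}(s)} \Big[ g(s,a)
        + \sum_{s'} \Pr\{s' \mid s, a\}\, V(s') \Big]

    % Equivalent Q-value form, with V(s) = \min_{a} Q(s,a):
    Q(s,a) = g(s,a) - \theta^{*}
        + \sum_{s'} \Pr\{s' \mid s, a\} \min_{a'} Q(s',a')

    % Linear approximation: the global Q-value decomposes into
    % per-user terms, which is what enables a distributed algorithm.
    Q(s,a) \approx \sum_{k=1}^{K} Q_{k}(s_{k}, a_{k})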
【Degree-granting institution】: Beijing Jiaotong University
【Degree level】: Master's
【Year conferred】: 2017
【CLC number】: TN929.5
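To make the online stochastic learning step concrete, here is a minimal Python sketch of a linearly approximated average-cost Q-function being updated online, in the spirit of the distributed algorithm the abstract describes. The mode set, feature map, cost signal, and step sizes below are illustrative assumptions for exposition, not the thesis's actual model or parameters.

    import random

    # Candidate transmission modes for a D2D pair (assumed set,
    # for illustration only).
    ACTIONS = ["cellular", "d2d_dedicated", "d2d_reuse"]

    def features(state, action):
        """Toy feature map: (queue length, channel gain), one copy per action."""
        queue_len, chan_gain = state
        onehot = [1.0 if a == action else 0.0 for a in ACTIONS]
        return [queue_len * x for x in onehot] + [chan_gain * x for x in onehot]

    def q_value(w, state, action):
        """Linearly approximated Q-value: inner product of weights and features."""
        return sum(wi * fi for wi, fi in zip(w, features(state, action)))

    def greedy_action(w, state):
        """Pick the mode with the smallest estimated energy cost."""
        return min(ACTIONS, key=lambda a: q_value(w, state, a))

    def online_update(w, rho, state, action, cost, next_state,
                      alpha=0.01, beta=0.001):
        """One online stochastic-learning step (a simplified R-learning-style
        rule for the average-cost Q-function); rho tracks the running
        average cost, alpha and beta are assumed step sizes."""
        td_target = cost - rho + q_value(w, next_state,
                                         greedy_action(w, next_state))
        td_error = td_target - q_value(w, state, action)
        phi = features(state, action)
        w = [wi + alpha * td_error * fi for wi, fi in zip(w, phi)]
        rho = rho + beta * td_error
        return w, rho

    # Toy driver: random arrivals and channels, epsilon-greedy exploration.
    w, rho = [0.0] * 6, 0.0
    state = (3.0, 0.8)  # (queue length, channel gain), made-up values
    for _ in range(1000):
        if random.random() < 0.1:
            action = random.choice(ACTIONS)
        else:
            action = greedy_action(w, state)
        cost = random.random()  # stand-in for energy spent in this slot
        next_state = (max(0.0, state[0] + random.choice([-1.0, 1.0])),
                      random.random())
        w, rho = online_update(w, rho, state, action, cost, next_state)
        state = next_state

In the distributed setting the thesis targets, each user would maintain only its own per-user component of the Q-function and learn from local observations; the single-agent loop above is just the smallest runnable illustration of the update rule.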



