Research and Implementation of an On-Chip Multi-Core Synchronization Unit and Its Inter-Chip Extension
[Abstract]: As application requirements grow and chip manufacturing technology advances, more processor and memory resources can be integrated on a single chip, and systems-on-chip have evolved from single-core to multi-core structures. Multi-core architectures improve performance, but they also place higher demands on the inter-core synchronization mechanism: exploiting the processing power of every core in a multi-core chip requires efficient synchronization support. X-DSP is a high-performance multi-core DSP whose architecture and instruction set were independently designed at our university; it mainly targets signal and image processing. The chip integrates multiple DSP cores and a global cache, and communicates off-chip at high speed through a PCIe interface. The multi-core architecture supports parallel execution of multiple tasks, and data communication among those tasks needs an efficient synchronization mechanism to ensure that execution is both correct and efficient. Based on the X-DSP architecture, this thesis implements multi-core synchronization with distributed hardware synchronization units. So that off-chip processor cores can also take part in synchronization, it extends the interface over PCIe and designs and implements a PCIe-NI bridge. The main work and contributions of this thesis are as follows:
(1) Hardware and software synchronization schemes are analyzed and compared, and a hardware synchronization mechanism based on locks and barriers is selected. Synchronization efficiency improves because synchronization operations interfere less with normal memory accesses.
(2) Taking the architectural characteristics of X-DSP into account, a distributed hardware synchronization unit containing a hardware lock and a hardware barrier is designed. The hardware lock supports two working modes, spin lock and queued spin lock, and effectively reduces the number of lock-acquisition requests. The hardware barrier is released by broadcast, which reduces the network hot spots caused by the serial release of a traditional barrier.
(3) The PCIe-NI bridge is designed with a standard AXI interface. It performs protocol conversion between the PBUS and DBI interfaces and the NI interface defined by X-DSP, so that off-chip processor cores can participate in synchronization effectively and data can be shared between chips.
(4) Module-level verification is completed using a hierarchical verification methodology. System-level verification and joint testing of the hardware synchronization unit and the PCIe-NI bridge are completed in the full-chip system environment. Logic synthesis results show that the design meets the performance requirements.
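The difference between the two lock modes and between the two barrier release styles can be illustrated with a small software model. This is only a sketch under stated assumptions, not the X-DSP hardware: the function names, the one-request-per-cycle retry policy, the simultaneous arrival of all cores, and the one-cycle release cost are all hypothetical choices made for illustration.

```python
from collections import deque


def spin_mode_requests(n_cores: int, hold_cycles: int) -> int:
    """Count requests when every waiting core re-sends its request each cycle."""
    waiting = n_cores        # assumption: all cores want the lock from cycle 0
    busy_until = 0           # first cycle at which the lock is free again
    cycle = 0
    requests = 0
    while waiting > 0:
        requests += waiting  # one request per waiting core in this cycle
        if cycle >= busy_until:
            waiting -= 1     # exactly one request succeeds
            busy_until = cycle + hold_cycles
        cycle += 1
    return requests


def queue_mode_requests(n_cores: int) -> tuple[int, list[int]]:
    """Each core sends one request; the unit queues it and grants in FIFO order."""
    pending = deque(range(n_cores))
    grant_order = []
    while pending:
        grant_order.append(pending.popleft())  # next grant on each release
    return n_cores, grant_order


def barrier_release_cycles(n_cores: int, last_arrival: int, broadcast: bool) -> list[int]:
    """Cycle at which each core sees the release after the last core arrives."""
    if broadcast:
        # assumption: one broadcast message reaches every core in one cycle
        return [last_arrival + 1] * n_cores
    # serial release: cores are notified one per cycle
    return [last_arrival + 1 + i for i in range(n_cores)]
```

In this model, four cores contending for a lock held for three cycles generate 22 requests in spin mode and 4 in queue mode, and a broadcast release lets all four cores leave the barrier in the same cycle instead of over four consecutive cycles.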
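[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: TP332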
Article ID: 2416136
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2416136.html