When Selecting Memory for AI, You Must Choose…Wisely

Sept. 3, 2021
AI places heavy demands on memory, so choosing the right memory architecture is a critical step in the design process.

This article is part of our Library Series: System Design: Memory for AI

What you’ll learn:

  • The benefits of on-chip memory.
  • Dealing with the capacity limits of on-chip memory.
  • HBM vs. GDDR: Determining the best option.

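In Part 3 of this series, we explored how the roofline model can help determine whether a given AI architecture is limited by its compute performance or by its memory bandwidth. With that data in hand, designers can make an informed decision about which type of memory system fits best.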
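
As a reminder of how the roofline check works, here is a minimal sketch in Python. The function names and the accelerator figures (100 TFLOP/s peak compute, 256 GB/s memory bandwidth) are assumptions chosen for illustration and don't describe any particular product.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline ceiling: the lower of peak compute and bandwidth * operational intensity."""
    return min(peak_flops, mem_bandwidth * intensity)


def limiting_factor(peak_flops: float, mem_bandwidth: float, intensity: float) -> str:
    """Classify a workload by comparing its intensity (FLOPs/byte) to the ridge point."""
    ridge_point = peak_flops / mem_bandwidth  # intensity at which both ceilings meet
    return "memory-bound" if intensity < ridge_point else "compute-bound"


# Hypothetical accelerator, for illustration only.
PEAK_FLOPS = 100e12      # 100 TFLOP/s
MEM_BANDWIDTH = 256e9    # 256 GB/s

for name, intensity in [("low-intensity layer", 10.0), ("high-intensity layer", 1000.0)]:
    bound = limiting_factor(PEAK_FLOPS, MEM_BANDWIDTH, intensity)
    ceiling = attainable_flops(PEAK_FLOPS, MEM_BANDWIDTH, intensity)
    print(f"{name}: {bound}, ceiling {ceiling / 1e12:.2f} TFLOP/s")
```

A workload that lands on the memory-bound side of the ridge point is the one that benefits most from a higher-bandwidth memory system.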

A variety of common memory systems are being used in high-performance AI applications, each with its own unique set of benefits and challenges. More than anything, choosing the “right” solution depends on the application and your use case.

On-Chip Memory: All Business

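On-chip memory is the highest-bandwidth, most power-efficient solution available. It can deliver tens of terabytes per second of memory bandwidth, and modern reticle-sized processors can reach capacities of hundreds of megabytes. In addition, the short distance that data travels between on-chip memory and the compute units greatly reduces access latency and further improves power efficiency.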

The low latency and high bandwidth of on-chip memory allow for extremely high utilization of compute engines, making it well-suited to high-performance, low-power applications, especially processing in handheld and battery-operated devices.

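Although the performance and power efficiency of on-chip memory are unmatched, its main drawback is limited capacity. On-chip memory holds far less than external DRAM solutions, which today can reach tens of gigabytes when multiple DRAMs are used.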

A number of interesting innovations have emerged that make better use of the limited capacity of on-chip memory, including reduced precision data types and recalculating intermediate results to avoid occupying on-chip storage. However, the tremendous growth in training sets and model sizes continues to outpace these innovations, resulting in on-chip memory being better equipped for AI inference tasks than for AI training tasks.

Because of these tradeoffs, on-chip memory is a great solution when running inference tasks on smaller neural networks that fit within the capacity of the memory, or when inferencing in environments where multiple chips can work together to provide a solution. If this isn’t the case, it’s best to pursue other external memory options, such as high bandwidth memory (HBM) and graphics double data rate (GDDR).
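
To make the "does it fit?" question concrete, the sketch below estimates the weight footprint of a model at several precisions and compares it to an on-chip capacity. The parameter count (100 million) and the 300-MB capacity are hypothetical, and the estimate deliberately ignores activations and other working data, which also need space in practice.

```python
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "int8": 8}


def weight_footprint_bytes(num_params: int, dtype: str) -> int:
    """Bytes needed to hold the weights alone at the given precision."""
    return num_params * BITS_PER_PARAM[dtype] // 8


def fits_on_chip(num_params: int, dtype: str, capacity_bytes: int) -> bool:
    """True if the weights alone fit in the given on-chip capacity."""
    return weight_footprint_bytes(num_params, dtype) <= capacity_bytes


# Hypothetical figures, for illustration only.
NUM_PARAMS = 100 * 10**6         # 100 million parameters
ON_CHIP_CAPACITY = 300 * 10**6   # 300 MB of on-chip memory

for dtype in BITS_PER_PARAM:
    size_mb = weight_footprint_bytes(NUM_PARAMS, dtype) / 10**6
    verdict = "fits" if fits_on_chip(NUM_PARAMS, dtype, ON_CHIP_CAPACITY) else "does not fit"
    print(f"{dtype}: {size_mb:.0f} MB, {verdict}")
```

With these assumed numbers, the model overflows on-chip memory at 32-bit precision but fits at 16-bit or 8-bit, which is the kind of headroom reduced-precision data types are meant to buy.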

HBM: Complex Power

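HBM is the newest high-volume DRAM solution, and it has been rapidly adopted in AI designs. HBM uses in-device stacking to achieve high capacity, together with an extremely wide interface (1,024 data wires) running at a relatively low data rate (2 Gb/s per wire in HBM2), to deliver very high bandwidth with good signal integrity. This combination of stacking and a wide, slow interface lets HBM reach extremely high performance while maintaining good power efficiency. With more capacity than on-chip memory, HBM offers the best combination of bandwidth and power efficiency among external memory solutions.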
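
The "wide and slow" arithmetic is easy to check. The sketch below multiplies the interface width by the per-wire data rate quoted above (1,024 data wires at 2 Gb/s) to get the peak bandwidth of one HBM2 device; the function name is ours, and the result is a theoretical peak rather than a measured figure.

```python
def peak_bandwidth_gbytes_per_s(data_wires: int, gbits_per_s_per_wire: float) -> float:
    """Peak interface bandwidth in GB/s: wires * per-wire rate (Gb/s) / 8 bits per byte."""
    return data_wires * gbits_per_s_per_wire / 8


# HBM2 figures quoted in the text: 1,024 data wires at 2 Gb/s each.
hbm2_peak = peak_bandwidth_gbytes_per_s(1024, 2.0)
print(f"HBM2 peak bandwidth per device: {hbm2_peak:.0f} GB/s")
```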

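The area and power benefits of the HBM architecture come at added design and manufacturing cost. The large number of I/Os requires a fine pitch, which in turn calls for a silicon interposer, an additional substrate, and complex stacking, both within the DRAM and among the components in the system. All of this adds cost and complexity before the part is ever assembled onto the PCB. Keeping the silicon cool and dealing with stacking-related systems-engineering issues pose further challenges to implementing an HBM2 solution.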

However, for organizations with the engineering skill to implement HBM memory systems, and with the ability to amortize the added costs, HBM2 can be a great choice for systems that need an external memory solution.

GDDR6: The All-Rounder

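Created for the graphics industry 20 years ago, GDDR offers a good middle ground between on-chip memory and HBM DRAM in terms of bandwidth, power efficiency, cost, and reliability. GDDR relies on the more familiar high-volume manufacturing and assembly techniques used for traditional DRAMs such as DDR, making it a good solution for balancing performance and complexity.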

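In contrast to HBM DRAM, which implements a large number of data wires running at a modest data rate, GDDR6 DRAM takes the opposite approach: It uses 32 data wires running at 16 Gb/s each, eight times the per-wire speed of HBM2 DRAM. Having fewer data wires eliminates the need for additional components such as an interposer. However, running at higher data rates introduces signal-integrity and power-efficiency challenges.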
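
The "narrow and fast" approach can be worked through the same way. This sketch computes the peak bandwidth of one GDDR6 device from the figures above (32 data wires at 16 Gb/s) and then the number of devices needed to reach a 256-GB/s target, the bandwidth this article uses for its later comparison. The function names are ours, and the calculation ignores real-world efficiency losses.

```python
import math


def gddr6_device_bandwidth(data_wires: int = 32, gbits_per_s_per_wire: float = 16.0) -> float:
    """Peak bandwidth of one device in GB/s: wires * per-wire rate (Gb/s) / 8."""
    return data_wires * gbits_per_s_per_wire / 8


def devices_needed(target_gbytes_per_s: float, per_device_gbytes_per_s: float) -> int:
    """Smallest whole number of devices whose combined peak meets the target."""
    return math.ceil(target_gbytes_per_s / per_device_gbytes_per_s)


per_device = gddr6_device_bandwidth()
print(f"GDDR6 peak bandwidth per device: {per_device:.0f} GB/s")
print(f"Devices needed for 256 GB/s: {devices_needed(256.0, per_device)}")
print(f"Per-wire speed relative to HBM2 (2 Gb/s): {16.0 / 2.0:.0f}x")
```

Under these assumptions, several GDDR6 devices side by side supply the same peak bandwidth as a single HBM2 device, which is why the board-level and PHY tradeoffs matter.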

Those issues can be managed with carefully designed PHYs, packages, and boards. Furthermore, GDDR DRAM devices don’t utilize stacking, which further simplifies manufacturing and reduces cost. As a result, GDDR offers a good balance of performance, power efficiency, and cost.

SoC Considerations When Choosing Between HBM2 and GDDR6

When designing a processor that uses GDDR or HBM, one must consider some important tradeoffs. In addition to the aforementioned differences between the DRAMs themselves, there are other disparities in how processors connect to these DRAMs.

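The most significant differences involve the PHY circuits on the SoC that connect it to the DRAM. For equivalent GDDR6 and HBM2 memory systems delivering 256 GB/s of memory bandwidth, the GDDR6 PHYs require between 1.5 and 1.75 times the SoC area of the HBM2 PHY.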

In terms of power, the differences are even more pronounced: GDDR6 PHYs consume between three-and-a-half and four-and-a-half times as much power as the HBM2 PHY at the same bandwidth. From the point of view of an SoC designer, this large disparity in power and area favors HBM2 memory systems. However, the added cost and implementation complexity of HBM2 memory systems can make GDDR6 the more attractive choice.
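
To see what those ratios mean for a budget, the sketch below scales a baseline HBM2 PHY figure by the area and power ranges quoted above. The baseline is normalized to 1.0 because the article gives only relative numbers; the names are ours, and any absolute value plugged in would be an assumption.

```python
# Ratios quoted in the text for equal 256-GB/s bandwidth (GDDR6 PHY relative to HBM2 PHY).
GDDR6_AREA_RATIO = (1.5, 1.75)
GDDR6_POWER_RATIO = (3.5, 4.5)


def scale_range(hbm2_baseline: float, ratio_range: tuple[float, float]) -> tuple[float, float]:
    """Return the (low, high) GDDR6 estimate implied by an HBM2 baseline and a ratio range."""
    low, high = ratio_range
    return (hbm2_baseline * low, hbm2_baseline * high)


# Normalized HBM2 PHY baseline (1.0 = whatever the HBM2 PHY costs in a given design).
HBM2_BASELINE = 1.0

area_low, area_high = scale_range(HBM2_BASELINE, GDDR6_AREA_RATIO)
power_low, power_high = scale_range(HBM2_BASELINE, GDDR6_POWER_RATIO)
print(f"GDDR6 PHY area:  {area_low:.2f}x to {area_high:.2f}x the HBM2 PHY")
print(f"GDDR6 PHY power: {power_low:.2f}x to {power_high:.2f}x the HBM2 PHY")
```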

Whether you choose HBM2 or GDDR6 ultimately depends on what matters most in the system at hand. If you’re prepared to handle the cost and engineering complexity of an HBM2 implementation, it’s the best route to take. But for systems that prioritize cost and more mainstream manufacturing methods, GDDR6 is an excellent solution. There’s no wrong answer when it comes to picking a high-bandwidth memory solution for your application.

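Both on-chip and external memory solutions offer high bandwidth and low latency to meet the needs of today’s most demanding applications. Choose wisely, and your efforts will be rewarded.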
