We discuss why Nvidia has a software moat against competitors who might want to manufacture GPUs. Nvidia's CUDA language has had unique advantages over time, including the ability to model the GPU's memory hierarchy precisely. Fundamental machine learning tools like PyTorch ended up supporting only CUDA, cementing Nvidia's position.
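A minimal sketch (my own illustration, not code from the video) of how that lock-in shows up in practice: typical PyTorch code simply asks for the "cuda" device, so for years the fast path implicitly assumed an Nvidia GPU.

import torch

# Pick the GPU if one is visible to PyTorch; "cuda" has historically meant
# an Nvidia GPU, which is exactly the lock-in described above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)
loss = model(x).sum()
loss.backward()  # the backward pass runs CUDA kernels when device is "cuda"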
There are several ways competitors could try to break the CUDA monopoly: hardware compatibility, library compatibility, binary translation, or creating a new compiler. Companies are starting to pursue these approaches, especially new compilers, which are the least legally fraught. In particular, there is already a free clean-room implementation of a CUDA compiler (SCALE, from Spectral Compute) that targets AMD GPUs.
Furthermore, new technology like OpenAI's Triton now skips CUDA entirely and targets Nvidia's PTX code directly. This will likely prove challenging for Nvidia over the long run. PyTorch has also begun supporting AMD GPUs. The hardware and the technology are there; it is just a matter of whether the software stack is mature enough. Perhaps we will see more diversity in model-training hardware in the future.
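A rough sketch of what this looks like (my own example, modeled on Triton's public vector-add tutorial, not code from the video): a Triton kernel is ordinary Python decorated with @triton.jit, and Triton compiles it down to PTX without any CUDA C++ in between. It assumes the triton and torch packages and a GPU visible to PyTorch.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(1024, device="cuda")
y = torch.rand(1024, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 256),)  # one program instance per 256 elements
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=256)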
#nvidia #gpus #hardware
How Nvidia’s CUDA Monopoly In Machine Learning Is Breaking - OpenAI Triton And PyTorch 2.0
[ Link ]
AMD AI Software Solved – MI300X Pricing, Performance, PyTorch 2.0, FlashAttention, OpenAI Triton
[ Link ]
Introducing Triton: Open-source GPU programming for neural networks
[ Link ]
Announcing the SCALE BETA
[ Link ]
AMD ‘Scales’ up its CUDA capabilities
[ Link ]
AMD Makes Second AI Software Acquisition In Less Than Two Months
[ Link ]
Nvidia bans using translation layers for CUDA software
[ Link ]
Software allows CUDA code to run on AMD and Intel GPUs without changes
[ Link ]
Chinese Govt. Funds CUDA-Compatible GPU Startup to Compete Against Nvidia
[ Link ]
Re the strength of Nvidia's AI software moat, what is the history?
[ Link ]
Moore Threads introduces MTT S4000 48GB AI GPU with MTLink and zero-cost NVIDIA CUDA framework translation
[ Link ]
How CUDA Programming Works | Hacker News
[ Link ]
0:00 Intro
0:24 Contents
0:31 Part 1: Nvidia's CUDA moat
1:16 Evolution of CUDA
2:01 GPGPU programming languages
2:29 What made CUDA powerful?
3:26 The competitor to CUDA: OpenCL
4:06 PTX machine code for Nvidia GPUs evolves rapidly
5:15 Other GPU frameworks output CUDA code
6:22 Example: PyTorch outputs CUDA
6:56 Part 2: The competitors' view
7:29 Companies like to have secondary providers
8:14 Approach 1: Maintain hardware compatibility
9:00 In contrast to Nvidia, Intel has a stable architecture
9:39 Approach 2: Library compatibility
10:48 Approach 3: Binary translation
11:53 Approach 4: Create a new compiler
12:22 Clean room compiler reimplementation
13:48 How Intel is using library compatibility
14:36 AMD also didn't want library compatibility
15:03 Moore Threads, Chinese GPU company
15:40 Moore Threads is doing binary translation?
16:29 In response, Nvidia bans disassembly of code
17:23 Nvidia probably doesn't want binary translation tools
18:09 New compiler: SCALE from Spectral Compute
19:12 Spectral Compute could get bought by AMD
19:35 AMD's ROCm compiler doesn't work well
20:06 AMD acquired Nod.AI to get an optimizing compiler
21:00 Part 3: Should Nvidia be worried?
21:11 Implementing CUDA compatibility is a catch-up technique
21:31 Nvidia's profit margins depend on its monopoly
21:56 My perspective: CUDA is too complicated
22:55 Nvidia should be most worried about other stacks
23:49 New stacks are already being created
24:07 PyTorch 2.0 is more AMD friendly
24:44 OpenAI's Triton is an easier-to-use GPGPU language
25:32 How Triton works, targeting PTX directly
26:20 Why LLVM makes Triton so powerful
27:30 Should Nvidia use LLVM too?
28:03 Triton only supports what OpenAI needs
28:58 Yes, Nvidia should be worried
29:37 Are machine learning folks going to use AMD?
30:20 Conclusion
31:15 Bypassing CUDA completely
32:20 Outro