NVIDIA Corp (NVDA) — Q1 2024 Earnings Call Transcript
Operator
Good afternoon. My name is David, and I'll be your conference operator today. At this time, I'd like to welcome everyone to NVIDIA's First Quarter Earnings Call. Today's conference is being recorded. All lines have been placed on mute to prevent any background noise. After the speakers' remarks, there'll be a question-and-answer session. Thank you. Simona Jankowski, you may begin your conference.
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the first quarter of fiscal 2024. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer. I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the second quarter of fiscal 2024. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent. During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q, and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, May 24, 2023, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements. During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. And with that, let me turn the call over to Colette.
Thanks, Simona. Q1 revenue was $7.19 billion, up 19% sequentially and down 13% year-on-year. Strong sequential growth was driven by record data center revenue, with our gaming and professional visualization platforms emerging from channel inventory corrections.

Starting with data center, record revenue of $4.28 billion was up 18% sequentially and up 14% year-on-year, on strong growth of our accelerated computing platform worldwide. Generative AI is driving exponential growth in compute requirements and a fast transition to NVIDIA accelerated computing, which is the most versatile, most energy-efficient, and lowest-TCO approach to train and deploy AI. Generative AI drove significant upside in demand for our products, creating opportunities and broad-based global growth across our markets. Let me give you some color across our three major customer categories: cloud service providers, or CSPs; consumer Internet companies; and enterprises. First, CSPs around the world are racing to deploy our flagship Hopper and Ampere architecture GPUs to meet the surge in interest from both enterprise and consumer AI applications for training and inference. Multiple CSPs announced the availability of H100 on their platforms, including private previews at Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, upcoming offerings at AWS, and general availability at emerging GPU-specialized cloud providers like CoreWeave and Lambda. In addition to enterprise AI adoption, these CSPs are serving strong demand for H100 from generative AI pioneers. Second, consumer Internet companies are also at the forefront of adopting generative AI and deep learning-based recommendation systems, driving strong growth. For example, Meta has now deployed its H100-powered Grand Teton AI supercomputer for its AI production and research teams. Third, enterprise demand for AI and accelerated computing is strong. We are seeing momentum in verticals such as automotive, financial services, healthcare, and telecom, where AI and accelerated computing are quickly becoming integral to customers' innovation roadmaps and competitive positioning. For example, Bloomberg announced it has a 50 billion parameter model, BloombergGPT, to help with financial natural language processing tasks such as sentiment analysis, named entity recognition, news classification, and question-answering. Auto insurance company CCC Intelligent Solutions is using AI for estimating repairs. And AT&T is working with us on AI to improve fleet dispatches so their field technicians can better serve customers. Among other enterprise customers using NVIDIA AI are Deloitte, for logistics and customer service, and Amgen, for drug discovery and protein engineering. This quarter, we started shipping DGX H100, our Hopper-generation AI system, which customers can deploy on-prem. And with the launch of DGX Cloud through our partnership with Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, we deliver the promise of NVIDIA DGX to customers from the cloud. Whether the customers deploy DGX on-prem or via DGX Cloud, they get access to NVIDIA AI software, including NVIDIA Base Command, AI frameworks, and pre-trained models. We provide them with the blueprint for building and operating AI, spanning our expertise across systems, algorithms, data processing, and training methods.
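As a quick sanity check on the growth rates quoted above, the implied prior-period figures can be backed out directly from the reported percentages. A minimal sketch in Python (the `implied_prior` helper is illustrative, not from the call):

```python
def implied_prior(current: float, growth_pct: float) -> float:
    """Back out the prior-period revenue implied by a reported growth rate."""
    return current / (1 + growth_pct / 100)

# Total revenue: $7.19B, up 19% sequentially, down 13% year-on-year.
print(round(implied_prior(7.19, 19), 2))   # ~6.04  (implied prior quarter, $B)
print(round(implied_prior(7.19, -13), 2))  # ~8.26  (implied year-ago quarter, $B)

# Data center: $4.28B, up 18% sequentially, up 14% year-on-year.
print(round(implied_prior(4.28, 18), 2))   # ~3.63  (implied prior quarter, $B)
print(round(implied_prior(4.28, 14), 2))   # ~3.75  (implied year-ago quarter, $B)
```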
We also announced NVIDIA AI Foundations, model foundry services available on DGX Cloud that enable businesses to build, refine, and operate custom large language models and generative AI models, trained with their own proprietary data and created for unique domain-specific tasks. They include NVIDIA NeMo for large language models, NVIDIA Picasso for images, video, and 3D, and NVIDIA BioNeMo for life sciences. Each service has six elements: pre-trained models; frameworks for data processing and curation; proprietary knowledge-base vector databases; systems for fine-tuning, aligning, and guardrailing; optimized inference engines; and support from NVIDIA experts to help enterprises fine-tune models for their custom use cases. ServiceNow, a leading enterprise services platform, is an early adopter of DGX Cloud and NeMo. They are developing custom large language models trained on data specifically for the ServiceNow platform. Our collaboration will let ServiceNow create new enterprise-grade generative AI offerings for the thousands of enterprises worldwide running on the ServiceNow platform, including for IT departments, customer service teams, employees, and developers.

Generative AI is also driving a step-function increase in inference workloads. Because of their size and complexity, these workloads require acceleration. The latest MLPerf industry benchmark released in April showed NVIDIA's inference platform delivering performance that is orders of magnitude ahead of the industry, with unmatched versatility across diverse workloads. To help customers deploy generative AI applications at scale, at GTC we announced four major new inference platforms that leverage the NVIDIA AI software stack: the L4 Tensor Core GPU for AI video; the L40 for Omniverse and graphics rendering; the H100 NVL for large language models; and the Grace Hopper Superchip for LLMs as well as recommendation systems and vector databases. Google Cloud is the first CSP to adopt our L4 inference platform with the launch of its G2 virtual machines for generative AI inference and other workloads such as Google Cloud Dataproc, Google AlphaFold, and Google Cloud's Immersive Stream, which renders 3D and AR experiences. In addition, Google is integrating our Triton Inference Server with Google Kubernetes Engine and its cloud-based Vertex AI platform.

In networking, we saw strong demand from both CSPs and enterprise customers for generative AI and accelerated computing, which require high-performance networking like NVIDIA's Mellanox networking platforms. Demand relating to general-purpose CPU infrastructure remains soft. As generative AI applications grow in size and complexity, high-performance networks become essential for delivering accelerated computing at data center scale to meet the enormous demands of both training and inferencing. Our 400-gig Quantum-2 InfiniBand platform is the gold standard for AI-dedicated infrastructure, with broad adoption across major cloud and consumer Internet platforms such as Microsoft Azure. With the combination of in-network computing technology and the industry's only end-to-end, data-center-scale optimized software stack, customers routinely enjoy a 20% increase in throughput for their sizable infrastructure investment. For multi-tenant clouds transitioning to support generative AI, our high-speed Ethernet platform with BlueField-3 DPUs and Spectrum-4 Ethernet switching offers the highest available Ethernet network performance.
BlueField-3 is in production and has been adopted by multiple hyperscale and CSP customers, including Microsoft Azure, Oracle Cloud, CoreWeave, Baidu, and others. We look forward to sharing more about our 400-gig Spectrum-4 accelerated AI networking platform next week at the COMPUTEX conference in Taiwan. Lastly, our Grace data center CPU is sampling with customers. At this week's International Supercomputing Conference in Germany, the University of Bristol announced a new supercomputer based on the NVIDIA Grace CPU Superchip, which is 6x more energy-efficient than its previous supercomputer. This adds to the growing momentum for Grace, with both CPU-only and CPU/GPU opportunities across AI, cloud, and supercomputing applications. The coming wave of BlueField-3, Grace, and Grace Hopper Superchips will enable a new generation of super energy-efficient accelerated data centers.

Now, let's move to gaming. Gaming revenue of $2.24 billion was up 22% sequentially and down 38% year-on-year. Strong sequential growth was driven by sales of the 40 Series GeForce RTX GPUs for both notebooks and desktops. Overall end demand was solid and consistent with seasonality, demonstrating resilience against a challenging consumer spending backdrop. The GeForce RTX 40 Series GPU laptops are off to a great start, featuring four NVIDIA inventions: RTX path tracing, DLSS 3 AI rendering, Reflex ultra-low-latency rendering, and Max-Q energy-efficient technologies. They deliver tremendous gains in industrial design, performance, and battery life for gamers and creators. And like our desktop offerings, 40 Series laptops support the NVIDIA Studio platform of software technologies, including acceleration for creative, data science, and AI workflows, and Omniverse, giving content creators unmatched tools and capabilities. In desktop, we ramped the RTX 4070, which joined the previously launched RTX 4090, 4080, and 4070 Ti GPUs. The RTX 4070 is nearly 3x faster than the RTX 2070 and offers our large installed base a spectacular upgrade. Last week, we launched the 60 family, the RTX 4060 and 4060 Ti, bringing our newest architecture to the world's core gamers starting at just $299. For the first time, these GPUs provide 2x the performance of the latest gaming console at mainstream price points. The 4060 Ti is available starting today, while the 4060 will be available in July.

Generative AI will be transformative to gaming and content creation, from development to runtime. At the Microsoft Build developer conference earlier this week, we showcased how Windows PCs and workstations with NVIDIA RTX GPUs will be AI-powered at their core. NVIDIA and Microsoft have collaborated on end-to-end software engineering, spanning from the Windows operating system to the NVIDIA graphics drivers and the NeMo LLM framework, to help make Windows on NVIDIA RTX Tensor Core GPUs a supercharged platform for generative AI. Last quarter, we announced a partnership with Microsoft to bring Xbox PC games to GeForce NOW. The first game from this partnership, Gears 5, is now available, with more set to be released in the coming months. There are now over 1,600 games on GeForce NOW, the richest content available on any cloud gaming service.

Moving to pro visualization. Revenue of $295 million was up 31% sequentially and down 53% year-on-year. Sequential growth was driven by stronger workstation demand across both mobile and desktop form factors, with strength in key verticals such as public sector, healthcare, and automotive.
We believe the channel inventory correction is behind us. The ramp of our Ada Lovelace GPU architecture in workstations kicked off a major product cycle. At GTC, we announced six new RTX GPUs for laptops and desktop workstations, with further rollout planned in the coming quarters. Generative AI is a major new workload for NVIDIA-powered workstations. Our collaboration with Microsoft transforms Windows into the ideal platform for creators and designers, harnessing generative AI to elevate their creativity and productivity. At GTC, we announced NVIDIA Omniverse Cloud, a fully managed NVIDIA service running in Microsoft Azure that includes the full suite of Omniverse applications and NVIDIA OVX infrastructure. Using this full-stack cloud environment, customers can design, develop, deploy, and manage industrial metaverse applications. NVIDIA Omniverse Cloud will be available starting in the second half of this year. Microsoft and NVIDIA will also connect Office 365 applications with Omniverse. Omniverse Cloud is being used by companies to digitalize their workflows from design and engineering to smart factories and 3D content generation for marketing. The automotive industry has been a leading early adopter of Omniverse, including companies such as BMW Group, Geely Lotus, General Motors, and Jaguar Land Rover.

Moving to automotive. Revenue was $296 million, up 1% sequentially and up 114% from a year ago. Our strong year-on-year growth was driven by the ramp of NVIDIA DRIVE Orin across a number of new energy vehicles. As we announced in March, our automotive design win pipeline over the next six years now stands at $14 billion, up from $11 billion a year ago, giving us visibility into continued growth over the coming years. Sequentially, growth moderated as some NEV customers in China are adjusting their production schedules to reflect slower-than-expected demand growth. We expect this dynamic to linger for the rest of the calendar year. During the quarter, we expanded our partnership with BYD, the world's leading manufacturer of NEVs. Our new design win will extend BYD's use of DRIVE Orin to its next-generation, high-volume Dynasty and Ocean series of vehicles, set to start production in calendar 2024.

Moving to the rest of the P&L. GAAP gross margin was 64.6%, and non-GAAP gross margin was 66.8%. Gross margins have now largely recovered to prior peak levels, as we have absorbed higher costs and offset them by innovating and delivering higher-value products as well as products incorporating more and more software. Sequentially, GAAP operating expenses were down 3%, and non-GAAP operating expenses were down 1%. We have held OpEx at roughly the same level over the past four quarters while working through the inventory corrections in gaming and professional visualization. We now expect to increase investments in the business while also delivering operating leverage. We returned $99 million to shareholders in the form of cash dividends. At the end of Q1, we had approximately $7 billion remaining under our share repurchase authorization through December 2023.

Let me turn to the outlook for the second quarter of fiscal '24. Total revenue is expected to be $11 billion, plus or minus 2%. We expect this sequential growth to largely be driven by data center, reflecting a steep increase in demand related to generative AI and large language models.
This demand has extended our data center visibility out a few quarters, and we have procured substantially higher supply for the second half of the year. GAAP and non-GAAP gross margins are expected to be 68.6% and 70%, respectively, plus or minus 50 basis points. GAAP and non-GAAP operating expenses are expected to be approximately $2.71 billion and $1.9 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be income of approximately $90 million, excluding gains and losses from non-affiliated investments. GAAP and non-GAAP tax rates are expected to be 14%, plus or minus 1%, excluding any discrete items. Capital expenditures are expected to be approximately $300 million to $350 million. Further financial details are included in the CFO commentary and other information available on our IR website.

In closing, let me highlight some upcoming events. Jensen will give the COMPUTEX keynote address in person in Taipei this coming Monday, May 29 local time, which will be Sunday evening in the U.S. In addition, we will be attending the BofA Global Technology Conference in San Francisco on June 6, the Rosenblatt Virtual Technology Summit on the Age of AI on June 7, and the New Street Future of Transportation Virtual Conference on June 12. Our earnings call to discuss the results of our second quarter of fiscal '24 is scheduled for Wednesday, August 23. Well, that covers our opening remarks. We're now going to open the call for questions. Operator, would you please poll for questions?
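For reference, the guided midpoints and tolerances above translate into explicit ranges. A minimal sketch (the `guide_range` helper is illustrative, not from the call):

```python
def guide_range(midpoint: float, tolerance: float) -> tuple[float, float]:
    """Convert a 'midpoint plus or minus tolerance' guide into (low, high)."""
    return round(midpoint - tolerance, 2), round(midpoint + tolerance, 2)

print(guide_range(11.0, 11.0 * 0.02))  # revenue, $B: (10.78, 11.22)
print(guide_range(68.6, 0.5))          # GAAP gross margin, %: (68.1, 69.1)
print(guide_range(70.0, 0.5))          # non-GAAP gross margin, %: (69.5, 70.5)
print(guide_range(14.0, 1.0))          # tax rate, %: (13.0, 15.0)
```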
Operator
Thank you. We'll take our first question from Toshiya Hari with Goldman Sachs. Your line is open.
Hi. Good afternoon. Thank you so much for taking the question and congrats on the strong results and incredible outlook. Just one question on data center. Colette, you mentioned the vast majority of the sequential increase in revenue this quarter will come from data center. I was curious what the construct is there, if you can speak to what the key drivers are from April to July and perhaps more importantly, you talked about visibility into the second half of the year. I'm guessing it's more of a supply problem at this point, what kind of sequential growth beyond the July quarter can your supply chain support at this point? Thank you.
Okay. So, a lot of different questions there. Let me see if I can start, and I'm sure Jensen will have some follow-up comments. When we talk about the sequential growth that we're expecting between Q1 and Q2, generative AI and large language models are driving the surge in demand, and it's broad-based across our consumer Internet companies, our CSPs, our enterprises, and our AI start-ups. There is also interest in both of our architectures, our latest Hopper architecture as well as our Ampere architecture. This is not surprising, as we generally sell both of our architectures at the same time. Deep recommenders are also a key area driving growth, and we expect to see growth both in our computing and in our networking business. So, those are some of the key things that we have baked into the guidance we provided for Q2. We also surfaced in our opening remarks that we are working on supply today for this quarter, but we have also procured a substantial amount of supply for the second half. We have significant supply chain flow to serve the significant customer demand that we see, and this is demand across a wide range of different customers. They are building platforms for some of the largest enterprises, but also setting things up at the CSPs and the large consumer Internet companies. So, we have visibility right now on our data center demand that has probably extended out a few quarters, and that has led us to work quickly on procuring that substantial supply for the second half. I'm going to pause there and see if Jensen wants to add a little bit more.
I thought that was great color. Thank you.
Operator
Next we'll go to C.J. Muse with Evercore ISI. Your line is open.
Yeah. Good afternoon. Thank you for taking the question. I guess with data center essentially doubling quarter-on-quarter, two natural questions that relate to one another come to mind. Number one, where are we in terms of driving acceleration into servers to support AI? And as part of that, as you deal with longer cycle times with TSMC and your other partners, how are you thinking about managing their commitments and your lead times in the coming years to best match supply and demand? Thanks so much.
Thank you for the question. I'll address the second part first. At the time when the ChatGPT moment occurred, we were already fully engaged in the production of both Ampere and Hopper. This situation underscored the transition from large language models to a chatbot-based product and service. The combination of safety measures, alignment systems, and reinforcement learning through human feedback, along with proprietary knowledge databases and search connections, came together beautifully. This alignment of technology reminded me of the iPhone moment, highlighting the potential and capabilities of such products. NVIDIA has a robust supply chain, as you're aware, and we manufacture supercomputers in large volumes, which include not just the GPUs but thousands of other components necessary for data centers. So, when that pivotal moment arrived, we were already in full production. As Colette mentioned, we had to significantly enhance our procurement efforts for the latter half of the year. Now, looking at the broader context, the world’s data centers are increasingly moving toward accelerated computing. This has been a recognized challenge for a while; if we can effectively implement it across numerous application domains, we could reduce energy consumption and operational costs for data centers significantly. Although it’s a costly endeavor because of the requisite software and system development, we've been working towards this for 15 years. With the emergence of generative AI, we've found a compelling application for our computing platform that has been in development. We are currently witnessing simultaneous shifts. With the data center market valued at around $1 trillion, most of it is still reliant on CPUs and basic NICs, lacking acceleration. However, as generative AI becomes the primary workload, it becomes evident that the budget allocation for data centers will increasingly favor accelerated computing. We are in the midst of this evolution, with a limited CapEx budget for data centers. Nonetheless, we are receiving substantial orders to update them. I believe we are at the onset of a decade-long transition where existing data centers will be repurposed for accelerated computing, leading to a notable shift in spending from traditional to advanced computing technologies, including smart NICs, smart switches, and GPUs, with generative AI at the forefront of workloads.
Operator
And we'll move to our next question, Vivek Arya with BofA Securities. Your line is open.
Thanks for the question. First, I just wanted to clarify: does visibility mean data center sales can continue to grow sequentially in Q3 and Q4, or do they sustain at Q2 levels? And then Jensen, my question is, given this very strong demand environment, what does that do to the competitive landscape? Does it invite more competition in terms of custom ASICs, or in terms of other GPU solutions or other kinds of solutions? How do you see the competitive landscape change over the next two to three years?
Yeah, Vivek. Thanks for the question. Let me see if I can add a little bit more color. We believe that the supply we will have for the second half of the year will be substantially larger than in H1. So, we are expecting not only the demand that we just saw in this last quarter and the demand in our Q2 forecast, but also planning on seeing something in the second half of the year. We just have to be careful here; we are not here to guide on the second half. But yes, we do plan a substantial increase in the second half compared to the first half.
We face competition from various sources, including well-funded and innovative start-ups globally, established semiconductor companies, and CSPs with internal projects. We're constantly aware of this competition. However, NVIDIA's primary value is that we offer the lowest cost solution and the lowest total cost of ownership. This is because accelerated computing represents a full stack challenge; it requires engineering all software, libraries, and algorithms, integrating and optimizing them for the architecture of an entire data center, rather than just one chip. The engineering and fundamental computer science involved is exceptionally demanding. In addition, generative AI presents a large-scale challenge that operates at a data center level, where the entire system—including networking, distributed computing engines, and architecture—functions as one integrated computer. To achieve optimal performance, a deep understanding of this full stack and data center scale is essential. Utilization is another critical factor, as it relates to the variety of applications we can accelerate. Our diverse architecture maintains high utilization; if a data center can only perform one task quickly, it risks underutilization and scalability issues. Our universal GPU accelerates numerous stacks, resulting in high throughput. Furthermore, our experience in building data centers and integrating our architecture into global cloud systems is invaluable. The time from product delivery to operational readiness can take months if not executed proficiently; however, our operations typically span just a few weeks. We've converted data centers and supercomputers into market-ready products, showcasing our team's expertise. Ultimately, our technology provides the highest throughput at the lowest possible cost. The market is large and highly competitive, but the problems involved are genuinely hard.
Operator
Next we go to Aaron Rakers with Wells Fargo. Your line is open.
Thank you for taking the question and congratulations on the quarter. As we consider the different growth drivers for the data center business moving forward, I would like to know, Colette, how we should think about the monetization impact of software, especially with the continued growth of your cloud service agreements. Where do you believe we currently stand regarding that approach, particularly in terms of the AI enterprise software suite and other sources of software-only revenue in the future?
Thanks for the question. Software is really important to our accelerated platforms. Not only do we have a substantial amount of software included in our newest architecture and essentially all of our products, we now have many different models to help customers start their work in generative AI and accelerated computing. So, everything we have here, from DGX Cloud and providing those services to helping customers build models, or, as we've discussed, NVIDIA AI Enterprise, essentially the operating system for AI, should continue to grow as we go forward, both the availability of the architecture, infrastructure, and offerings, and our ability to monetize software as well.
Yeah. We can see in real-time the growth of generative AI in CSPs, both for training the models, refining the models, as well as deploying the models. As Colette said earlier, inference is now a major driver of accelerated computing because generative AI is used so capably in so many applications already. There are two segments that require a new stack of software: enterprise and industrial. Enterprise requires a new stack of software because many enterprises need all the capabilities that we've talked about, whether it's large language models, the ability to adapt them for your proprietary use case and your proprietary data, or aligning them to your own principles and your own operating domains. You want the ability to do that in a high-performance computing sandbox, which is what DGX Cloud provides, and create your model there. Then you want to deploy your chatbot or your AI in any cloud, because you have services and agreements with multiple cloud vendors, and depending on the application, you might deploy it on various clouds. For the enterprise, we have NVIDIA AI Foundations to help you create custom models, and we have NVIDIA AI Enterprise. NVIDIA AI Enterprise is the only GPU-accelerated stack in the world that is enterprise-safe and enterprise-supported. There are constant patches that you have to do. There are some 4,000 different packages that make up NVIDIA AI Enterprise, and together they represent the end-to-end operating engine of the entire AI workflow. It's the only one of its kind, from data ingestion through data processing: obviously, in order to train an AI model you have lots of data, which you have to process, package up, curate, and align, and there is a whole bunch of work you have to do to the data to prepare it for training. That can consume some 40%, 50%, 60% of your computing time, so data processing is a very big deal. The second aspect is training and refining the model, and the third is deploying the model for inference. NVIDIA AI Enterprise continuously supports, patches, and security-patches all of those 4,000 packages of software. For an enterprise that wants to deploy its engines, just as it deploys Red Hat Linux, this is incredibly complicated software; to deploy it in every cloud as well as on-prem, it has to be secure and it has to be supported. So, NVIDIA AI Enterprise is the second stack. The third is Omniverse. Just as people are starting to realize that you need to align an AI to ethics, the same holds for robotics: you need to align the AI to physics. Aligning an AI to ethics includes a technology called reinforcement learning from human feedback; in the case of industrial applications and robotics, it's reinforcement learning from Omniverse feedback. Omniverse is a vital engine for software-defined robotic applications and industries, and so Omniverse also needs to be a cloud service platform. Our three software stacks, AI Foundations, AI Enterprise, and Omniverse, run in all of the world's clouds that we have DGX Cloud partnerships with. With Azure, we have partnerships on both AI as well as Omniverse.
With GCP and Oracle, we have great partnerships in DGX Cloud for AI, and AI Enterprise is integrated into all three of them. So, in order for us to extend the reach of AI beyond the cloud and into the world's enterprises and industries, you need two new types of software stacks to make that happen. And by putting them in the cloud, integrated into the world's CSP clouds, it's a great way for us to partner with the sales, marketing, and leadership teams of all the cloud vendors.
Operator
Next we'll go to Timothy Arcuri with UBS. Your line is open.
Thanks a lot. I have a question and also a clarification. First, Jensen, could you discuss the InfiniBand versus Ethernet debate and how you foresee it developing? I understand that the low latency of InfiniBand is essential for AI, but could you elaborate on the attach rate of your InfiniBand solutions related to your core compute offerings and whether this is impacting Ethernet similarly to what you're seeing on the compute side? For the clarification, Colette, I noticed there wasn't a share buyback despite having about $7 billion remaining on the share repurchase authorization. Was that simply a matter of timing? Thanks.
Colette, how about you go first? You should take the question.
That is correct. We have $7 billion available under our current authorization for repurchases. We did not repurchase anything in this last quarter, but we do repurchase opportunistically, and we'll consider that as we go forward as well. Thank you.
InfiniBand and Ethernet serve different purposes in a data center, and both are important. InfiniBand achieved a record quarter, and we anticipate a stellar year ahead with an outstanding roadmap for NVIDIA's Quantum InfiniBand. The two networking solutions are distinct; InfiniBand is tailored for an AI-focused environment. For instance, if a data center operates a limited number of applications for specific purposes continuously, the difference in overall throughput between InfiniBand and Ethernet could range from 15% to 20%. If the infrastructure costs approximately $500 million, a throughput difference of that size equates to roughly $75 million to $100 million of value, making InfiniBand essentially free. This significant difference in data center throughput is a critical factor for its usage, especially for singular applications. On the other hand, in a multi-tenant cloud data center serving numerous users with many small jobs, Ethernet is the better choice. There is also a new segment emerging where the cloud is evolving into a generative AI cloud. Although it remains a multi-tenant environment, it is geared towards handling generative AI tasks. This segment presents an exciting opportunity, and at COMPUTEX, we plan to unveil a major product line focused on Ethernet for generative AI cloud applications. Meanwhile, InfiniBand continues to perform exceptionally well, achieving record growth quarter over quarter and year over year.
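The arithmetic behind this argument can be made explicit. A back-of-the-envelope sketch using the figures from the call (variable names are illustrative):

```python
cost_musd = 500.0  # data center infrastructure spend, $M (figure from the call)

for gain in (0.15, 0.20):
    value = cost_musd * gain  # dollar value of the extra throughput
    print(f"{gain:.0%} more throughput on ${cost_musd:.0f}M is worth ~${value:.0f}M")

# At ~20%, the ~$100M of recovered throughput roughly covers the cost of the
# InfiniBand network itself, which is the sense in which it is "essentially free".
```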
Operator
Next we'll go to Stacy Rasgon with Bernstein Research. Your line is open.
Hi, guys. Thanks for taking my question. I had a question on inference versus training for generative AI. So, you're talking about inference as being a very large opportunity. I guess there are two sub-parts to that. First, is it right that inference basically scales with usage, whereas training is more of a one-and-done? And second, can you give us some sense, even just qualitatively, of whether you think inference is bigger than training or vice versa, and if it's bigger, how much bigger? Is the opportunity 5x, is it 10x? Anything you can give us on those two workloads within generative AI would be helpful.
I'll start by saying that training is an ongoing process. Every time we deploy, we gather new data, and that data is used to refine our training. Therefore, training is never truly complete. Producing and enhancing a vector database that supplements the large language model is also a continuous effort. We are constantly vectorizing all the structured and unstructured data we collect. Whether it's a recommender system, a large language model, or a vector database, these are three of the primary applications driving the future of computing. While there are certainly other aspects involved, these key applications are always operational. More companies are beginning to recognize the importance of having what could be called an intelligence factory, mainly focused on training, processing, and vectorizing data, as well as developing representations of it. When it comes to inference, there are APIs, whether they are open APIs that integrate with various applications or those embedded within workflows. Our company will offer numerous APIs—some developed in-house and many from collaborations with partners like ServiceNow and Adobe, which will produce an array of generative AI APIs for others to incorporate into their workflows or applications. We are witnessing rapid growth in AI factories and an emerging market for AI inference through APIs, which is expanding practically weekly. In simple terms, we currently have an installed base of data centers worth $1 trillion, predominantly using CPUs. However, there is a growing acknowledgment that scaling these data centers with traditional computing methods is becoming impractical, as pointed out in discussions about the end of Moore's Law. The future lies in accelerated computing, which is now driven by generative AI. This means that the financial commitments towards infrastructure will heavily prioritize generative AI and accelerated computing technologies, including GPUs and advanced networking equipment. Over the next several years, a significant portion of that $1 trillion investment, adjusting for ongoing growth in data centers, will be focused primarily on generative AI, encompassing both training and inference.
Operator
Next we'll go to Joseph Moore with Morgan Stanley. Your line is open.
Great. Thank you. I want to follow up on that, in terms of the focus on inference. It's pretty clear that this is a really big opportunity around large language models, but the cloud customers are also talking about trying to reduce cost per query by very significant amounts. Can you talk about the ramifications of that for you guys? Is that where some of the specialty inference products that you launched at GTC come in, and just how are you going to help your customers get the cost per query down?
That's a great question. You can start by creating a large language model and then distill it into smaller versions. The smallest versions can be integrated into devices like phones and PCs, and surprisingly, they can all perform similar tasks. However, the largest model offers greater versatility and advanced capabilities. The larger model also helps in training the smaller models by generating prompts to improve their performance. This is why we offer various sizes in our inference lineup. We recently introduced inference platforms such as the L4, L40, and H100 NVL, among others, which can be configured to fit different needs. It's important to note that these models are ultimately connected to applications that process different types of inputs and outputs, requiring extensive pre- and post-processing that should not be overlooked. This processing often constitutes a significant portion of the overall inference workload. Therefore, the multifaceted nature of inference will be handled across various environments, including cloud and on-premises solutions. Our AI Enterprise will facilitate this across all clouds. Our partnerships with companies like Dell, ServiceNow, and Adobe will enhance our generative AI capabilities, enabling us to meet the diverse requirements of our customers effectively.
Operator
Next we'll go to Harlan Sur with JP Morgan. Your line is open.
Hi. Good afternoon, and congratulations on the strong results and execution. I really appreciate some of the focus today on your networking products. It's really an integral part of maximizing the full performance of your compute platforms. I think the data center networking business is driving about $1 billion of revenue per quarter, plus or minus, which is 2.5x growth from three years ago when you guys acquired Mellanox. So very strong growth, but given the very high attach rate of your InfiniBand and Ethernet solutions to your accelerated compute platforms, is the networking run-rate stepping up in line with your compute shipments? And then, what is the team doing to further unlock more networking bandwidth going forward, just to keep pace with the significant increase in compute complexity, datasets, and requirements for lower latency, better traffic predictability, and so on?
I appreciate that, Harlan. When people think about AI, they often focus on the accelerator chip, but that overlooks the bigger picture. Accelerated computing is really about the entire stack, including software and networking. We announced our networking stack, DOCA, and our acceleration library, Magnum IO, which are vital to our company. While these might not be widely discussed because they can be complex, they enable us to connect tens of thousands of GPUs. Effective integration requires an exceptional infrastructure operating system, which is why we prioritize networking; with Mellanox, we have the leading technology in high-performance networking, and that is why our two companies joined forces. Our network capabilities start with NVLink, a low-latency computing fabric that communicates through memory references rather than traditional network packets. From NVLink, we connect multiple GPUs and extend beyond them, which I will elaborate on at COMPUTEX in a few days. That extends into InfiniBand, where we are in full production with the BlueField-3 SmartNIC and optimized fiber optics, and all of it runs at full speed. Furthermore, to connect smart AI factories into the computing fabric, we'll introduce a new type of Ethernet at COMPUTEX. The entire process of connecting GPUs and computing units through networking, switches, and software is incredibly intricate. We're pleased you recognize its complexity, but we don't report it separately because we view it as a unified computing platform. We provide these components to data centers worldwide, allowing them to integrate the system into various architectures while still running our software stack. Though our approach is complex, it allows NVIDIA's architecture to function in any data center, from diverse cloud setups to on-premises solutions and from edge to 5G, offering us tremendous reach.
Operator
And our last question will come from Matt Ramsay with TD Cowen. Your line is open.
Thank you very much. Congratulations, Jensen, and to the whole team. One of the things I wanted to dig into a little bit is the DGX Cloud offering. You guys have been working on this for some time behind the scenes, where you sell in the hardware to your hyperscale partners and then lease it back for your own business, and the rest of us kind of found out about it publicly a few months ago. As we look forward over the next number of quarters, with the high visibility in the data center business that Colette discussed, maybe you could talk a little bit about the mix you're seeing: hyperscale customers buying for their own first-party internal workloads, versus buying for their own third-party customers, versus how much of that big upside in data center going forward is systems you're selling in with the potential to support your DGX Cloud offerings. And what have you learned since you launched it about the potential of that business? Thanks.
Thank you, Matt. Without going into specific numbers, the ideal scenario is roughly 10% NVIDIA DGX Cloud and 90% CSP clouds. Our DGX Cloud is a fully NVIDIA stack, built for optimal performance. This design allows us to collaborate closely with CSPs to create superior infrastructure. Additionally, it enables us to work with CSPs to develop new markets, such as our partnership with Azure to introduce Omniverse Cloud to various industries. This kind of computing stack has never existed before, especially with the integration of generative AI, 3D modeling, physics simulations, large databases, high-speed networks, and low-latency capabilities. Through our collaboration with Microsoft, we are implementing Omniverse Cloud within Azure, which helps us jointly create new applications and markets. We operate as a unified team, allowing customers to integrate our computing platform while benefiting from the extensive data, services, and security offerings available through Azure, GCP, and OCI, all accessible via Omniverse Cloud. It's a significant advantage for both sides. For customers, NVIDIA's cloud facilitates flexible application deployment; they can run a single standard stack across all clouds, and if they prefer managing their software on CSPs' clouds, we're supportive of that with NVIDIA AI Enterprise and NVIDIA AI Foundations. In the long run, NVIDIA Omniverse will also function within CSPs' clouds. Ultimately, our aim is to drive architecture, deepen partnerships, create new markets and applications, and offer customers the flexibility to operate across various environments, including on-premises. These efforts have been very successful, and our collaboration on DGX Cloud with the three CSPs, alongside their sales, marketing, and leadership teams, has been impressive. Everything is functioning very well.
Operator
Thank you. I'll now turn it back over to Jensen Huang for closing remarks.
The computer industry is going through two simultaneous transitions: accelerated computing and generative AI. CPU scaling has slowed, yet computing demand is strong, and now, with generative AI, it has been supercharged. Accelerated computing, the full-stack, data-center-scale approach that NVIDIA pioneered, is the best path forward. There is $1 trillion installed in the global data center infrastructure, based on the general-purpose computing method of the last era. Companies are now racing to deploy accelerated computing for the generative AI era. Over the next decade, most of the world's data centers will be accelerated. We are significantly increasing our supply to meet the surging demand. Large language models can learn information encoded in many forms. Guided by large language models, generative AI models can generate amazing content, and with models to fine-tune, guardrail, align to guiding principles, and ground facts, generative AI is emerging from labs and is on its way to industrial applications. As we scale with cloud and Internet service providers, we are also building platforms for the world's largest enterprises. Whether within one of our CSP partners or on-prem with Dell Helix, whether on a leading enterprise platform like ServiceNow or Adobe, or bespoke with NVIDIA AI Foundations, we can help enterprises leverage their domain expertise and data to harness generative AI securely and safely. We are ramping a wave of products in the coming quarters, including the H100, our Grace and Grace Hopper Superchips, and our BlueField-3 and Spectrum-4 networking platforms. They are all in production and will help deliver data center-scale computing that is also energy-efficient and sustainable. Join us next week at COMPUTEX, and we'll show you what's next. Thank you.
Operator
This concludes today's conference call. You may now disconnect.