NVIDIA Corp

Exchange: NASDAQ · Sector: Technology · Industry: Semiconductors

NVIDIA is the world leader in accelerated computing.

Valuation (TTM)
Market Cap: $4.31T      P/E: 35.90
EV: $4.22T              P/B: 27.40
Shares Out: 24.30B      P/Sales: 19.96
Revenue: $215.94B       EV/EBITDA: 29.46

NVIDIA Corp (NVDA) — Q3 FY2024 Earnings Call Transcript

November 21, 2023 · 12 speakers · 7,055 words · 32 segments

Operator

Good afternoon. My name is JL, and I will be your conference operator today. I would like to welcome everyone to NVIDIA's Third Quarter Earnings Call. All lines have been muted to minimize background noise. After the speakers’ remarks, there will be a question-and-answer session. Simona Jankowski, you may now begin your conference.

Simona Jankowski, Moderator

Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the third quarter of fiscal 2024. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer. I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the fourth quarter and fiscal 2024. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent. During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent forms 10-K and 10-Q, and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All statements are made as of today, November 21, 2023, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements. During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. With that, let me turn the call over to Colette.

Colette Kress, CFO

Thanks, Simona. Q3 was another record quarter. Revenue of $18.1 billion was up 34% sequentially, up more than 200% year-on-year, and well above our outlook of $16 billion. Starting with Data Center, the continued ramp of the NVIDIA HGX platform based on our Hopper Tensor Core GPU architecture, along with InfiniBand end-to-end networking, drove record revenue of $14.5 billion, up 41% sequentially and up 279% year-on-year. NVIDIA HGX with InfiniBand together are essentially the reference architecture for AI supercomputers and data center infrastructures. Some of the most exciting generative AI applications are built and run on NVIDIA, including Adobe Firefly, ChatGPT, Microsoft 365 Copilot, Now Assist with ServiceNow, and Zoom AI Companion. Our Data Center compute revenue quadrupled from last year and networking revenue nearly tripled. Investments in infrastructure for training and inferencing large language models, deep learning, recommender systems and generative AI applications are fueling strong broad-based demand for NVIDIA accelerated computing. Inferencing is now a major workload for NVIDIA AI computing. Consumer Internet companies and enterprises drove exceptional sequential growth in Q3, comprising approximately half of our Data Center revenue and outpacing total growth. Companies like Meta are in full production with deep learning recommender systems and are also investing in generative AI to help advertisers optimize images and text. Most major consumer Internet companies are racing to ramp up generative AI deployment. The enterprise wave of AI adoption is now beginning. Enterprise software companies such as Adobe, Databricks, Snowflake and ServiceNow are adding AI copilots and assistants to their platforms. Broader enterprises are developing custom AI for vertical industry applications, such as Tesla in autonomous driving. Cloud service providers drove roughly the other half of our Data Center revenue in the quarter. Demand was strong from all hyperscale CSPs, as well as from a broadening set of GPU-specialized CSPs globally that are rapidly growing to address the new market opportunities in AI. NVIDIA H100 Tensor Core GPU instances are now generally available in virtually every cloud, with instances in high demand. We have significantly increased supply every quarter this year to meet strong demand and expect to continue to do so next year. We will also have a broader and faster product launch cadence to meet the growing and diverse set of AI opportunities. Towards the end of the quarter, the U.S. government announced a new set of export control regulations for China and other markets, including Vietnam and certain countries in the Middle East. These regulations require licenses for the export of a number of our products, including our Hopper and Ampere 100 and 800 series and several others. Our sales to China and other affected destinations, derived from products that are now subject to licensing requirements, have consistently contributed approximately 20% to 25% of Data Center revenue over the past few quarters. We expect that our sales to these destinations will decline significantly in the fourth quarter. We believe that decline will be more than offset by strong growth in other regions. The U.S. government designed the regulation to allow the U.S. industry to provide data center compute products to markets worldwide, including China. Continuing to compete worldwide, as the regulations encourage, promotes U.S. technology leadership, spurs economic growth, and supports U.S. jobs.
For the highest performance levels, the government requires licenses. For lower performance levels, the government requires a streamlined prior notification process. For products at even lower performance levels, the government does not require any notice at all. Following the government's clear guidelines, we are working to expand our Data Center product portfolio to offer compliant solutions for each regulatory category, including products for which the U.S. government does not wish to have advance notice before each shipment. We are working with some customers in China and the Middle East to pursue licenses from the U.S. government. It is too early to know whether these will be granted for any significant amount of revenue. Many countries are awakening to the need to invest in sovereign AI infrastructure to support economic growth and industrial innovation. With investments in domestic compute capacity, nations can use their own data to train LLMs and support their local generative AI ecosystems. For example, we are working with India's government and largest tech companies, including Infosys, Reliance and Tata, to boost their sovereign AI infrastructure. A French private cloud provider, Scaleway, is building a regional AI cloud based on NVIDIA H100, InfiniBand, and NVIDIA AI Enterprise software to fuel advancement across France and Europe. National investment in compute capacity is a new economic imperative, and serving the sovereign AI infrastructure market represents a multi-billion dollar opportunity over the next few years. From a product perspective, the vast majority of revenue in Q3 was driven by the NVIDIA HGX platform based on our Hopper GPU architecture, with lower contribution from the prior-generation Ampere GPU architecture. The new L40S GPU, built for industry-standard servers, began to ship, supporting training and inference workloads across a variety of customers. This was also the first revenue quarter of our GH200 Grace Hopper Superchip, which combines our ARM-based Grace CPU with a Hopper GPU. Grace and Grace Hopper are ramping into a new multi-billion dollar product line. Grace Hopper instances are now available at GPU-specialized cloud providers and coming soon to Oracle Cloud. Grace Hopper is also gaining significant traction with supercomputing customers. Initial shipments to Los Alamos National Lab and the Swiss National Supercomputing Center took place in the third quarter. The UK government announced it will build one of the world's fastest AI supercomputers, called Isambard-AI, with almost 5,500 Grace Hopper Superchips. The German supercomputing center Jülich also announced that it will build its next-generation AI supercomputer with close to 24,000 Grace Hopper Superchips and Quantum-2 InfiniBand, making it the world's most powerful AI supercomputer with over 90 exaflops of AI performance. All-in, we estimate that the combined AI compute capacity of all the supercomputers built on Grace Hopper across the U.S., Europe, and Japan next year will exceed 200 exaflops, with more wins to come. Inference is contributing significantly to our data center demand, as AI is now in full production for deep learning, recommenders, chatbots, copilots, and text-to-image generation, and this is just the beginning. NVIDIA AI offers the best inference performance and versatility, resulting in lower power consumption and cost of ownership. We are also driving a fast cost reduction curve.
With the release of TensorRT-LLM, we have now achieved more than double the inference performance, halving the cost of inferencing LLMs on NVIDIA GPUs. We also announced the latest member of the Hopper family, the H200, which will be the first GPU to offer HBM3e: faster, larger memory to further accelerate generative AI and LLMs. It delivers up to another doubling of inference speed compared to H100 GPUs for running LLMs like Llama 2. Combined, TensorRT-LLM and H200 have increased performance, or reduced cost, by four times in just one year, without our customers changing their stack. This is a benefit of CUDA and our architecture compatibility. Compared to the A100, H200 delivers an 18 times performance increase for inferencing models like GPT-3, allowing customers to move to larger models without an increase in latency. Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud will be among the first CSPs to offer H200-based instances starting next year. At last week's Microsoft Ignite, we deepened and expanded our collaboration with Microsoft across our entire stack. We introduced an AI foundry service for the development and tuning of custom generative AI enterprise applications running on Azure. Customers can bring their domain knowledge and proprietary data, and we help them build their AI models using our expertise and software stack in our DGX Cloud, all with enterprise-grade security and support. SAP and Amdocs are the first customers of the NVIDIA AI foundry service on Microsoft Azure. In addition, Microsoft will launch new confidential computing instances based on the H100. The H100 remains the top-performing and most versatile platform for AI training by a wide margin, as shown in the latest MLPerf industry benchmark results. Our training cluster included more than 10,000 H100 GPUs, three times more than in June, reflecting very efficient scaling. Efficient scaling is a key requirement in generative AI, because LLMs are growing by an order of magnitude every year. Microsoft Azure achieved similar results on a nearly identical cluster, demonstrating the efficiency of NVIDIA AI in public cloud deployments. Networking now exceeds a $10 billion annualized revenue run rate. Strong growth was driven by exceptional demand for InfiniBand, which grew fivefold year-on-year. InfiniBand is critical to gaining the scale and performance needed for training LLMs. Microsoft made this very point last week, highlighting that Azure uses over 29,000 miles of InfiniBand cabling, enough to circle the globe. We are expanding NVIDIA networking into the Ethernet space. Our new Spectrum-X end-to-end Ethernet offering, with technologies purpose-built for AI, will be available in the first quarter of next year, with support from leading OEMs including Dell, HPE, and Lenovo. Spectrum-X can achieve 1.6 times higher networking performance for AI communication compared to traditional Ethernet offerings. Let me also provide an update on our software and services offerings, where we are starting to see excellent adoption. We are on track to exit the year at an annualized revenue run rate of $1 billion for our recurring software, support, and service offerings. We see two primary opportunities for growth over the intermediate term: with our DGX Cloud service and with our NVIDIA AI Enterprise software, reflecting the growth of enterprise AI training and inference, respectively.
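(Illustration: the two gains Kress cites compound multiplicatively. A minimal Python sketch using only the figures quoted on the call; the clean 2x-software-times-2x-hardware split is a simplification of her remarks.)

```python
# Compounding of the inference gains quoted on the call (illustrative only).
software_gain = 2.0  # TensorRT-LLM: "more than double the inference performance"
hardware_gain = 2.0  # H200 vs. H100: "up to another doubling" for LLM inference

combined = software_gain * hardware_gain
print(f"Combined performance (or cost-reduction) factor: {combined:.0f}x")  # 4x

# Equivalently, normalized cost per generated token falls to one quarter.
relative_cost = 1.0 / combined
print(f"Relative inference cost vs. H100 a year ago: {relative_cost:.2f}")  # 0.25
```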
Our latest DGX Cloud customer announcement came this morning, as part of an AI research collaboration with Genentech. The biotechnology pioneer also plans to use our BioNeMo LLM framework to help accelerate and optimize its AI drug discovery platform. We now have enterprise AI partnerships with Adobe, Dropbox, Getty, SAP, ServiceNow, Snowflake, and others to come. Moving to Gaming. Gaming revenue of $2.86 billion was up 15% sequentially and up more than 80% year-on-year, with strong demand in the important back-to-school shopping season, and with NVIDIA RTX ray tracing and AI technology now available at price points as low as $299. We entered the holidays with the best-ever line-up for gamers and creators. Gaming has doubled relative to pre-COVID levels even against the backdrop of lackluster PC market performance. This reflects the significant value we've brought to the gaming ecosystem with innovations like RTX and DLSS. The number of games and applications supporting these technologies has exploded in that period, driving upgrades and attracting new buyers. The RTX ecosystem continues to grow. There are now over 475 RTX-enabled games and applications. Generative AI is quickly emerging as the new killer app for high-performance PCs. NVIDIA RTX GPUs define the most performant AI PCs and workstations. We just released TensorRT-LLM for Windows, which speeds up on-device LLM inference by four times. With an installed base of over 100 million, NVIDIA RTX is the natural platform for AI application developers. Finally, our GeForce NOW cloud gaming service continues to build momentum. Its library of PC games surpassed 1,700 titles, including the launches of Alan Wake 2, Baldur's Gate 3, Cyberpunk 2077: Phantom Liberty and Starfield. Moving to the Pro Visualization segment. Revenue of $416 million was up 10% sequentially and up 108% year-on-year. NVIDIA RTX is the workstation platform of choice for professional design, engineering, and simulation use cases, and AI is emerging as a powerful demand driver. Early applications include inference for AI imaging in healthcare and edge AI in smart spaces and the public sector. We launched a new line of desktop workstations based on NVIDIA RTX Ada Lovelace generation GPUs and ConnectX SmartNICs, offering up to double the AI processing, ray tracing, and graphics performance of the previous generation. These powerful new workstations are optimized for AI workloads such as fine-tuning AI models, training smaller models, and running inference locally. We continue to make progress on Omniverse, our software platform for designing, building, and operating 3D virtual worlds. Mercedes-Benz is using Omniverse-powered digital twins to plan, design, build, and operate its manufacturing and assembly facilities, helping it increase efficiency and reduce defects. Foxconn is also incorporating Omniverse into its manufacturing process, including end-to-end simulation for the entire robotics and automation pipeline, saving time and cost. We announced two new Omniverse Cloud services for automotive digitalization available on Microsoft Azure: a virtual factory simulation engine and an autonomous vehicle simulation engine. Moving to Automotive. Revenue was $261 million, up 3% sequentially and up 4% year-on-year, primarily driven by continued growth in self-driving platforms based on the NVIDIA DRIVE Orin SoC and the ramp of AI cockpit solutions with global OEM customers. We extended our automotive partnership with Foxconn to include NVIDIA DRIVE for our next-generation automotive SoC.
Foxconn has become the ODM for EVs. Our partnership provides Foxconn with a standard AV sensor and computing platform for their customers to easily build a state-of-the-art safe and secure software-defined car. Now we're going to move to the rest of the P&L. GAAP gross margin expanded to 74% and non-GAAP gross margin to 75%, driven by higher Data Center sales and lower net inventory reserve, including a 1 percentage point benefit from the release of previously reserved inventory related to the Ampere GPU architecture products. Sequentially, GAAP operating expenses were up 12% and non-GAAP operating expenses were up 10%, primarily reflecting increased compensation and benefits. Let me turn to the fourth quarter of fiscal 2024. Total revenue is expected to be $20 billion, plus or minus 2%. We expect strong sequential growth to be driven by Data Center, with continued strong demand for both compute and networking. Gaming will likely decline sequentially as it is now more aligned with notebook seasonality. GAAP and non-GAAP gross margins are expected to be 74.5% and 75.5%, respectively, plus or minus 50 basis points. GAAP and non-GAAP operating expenses are expected to be approximately $3.17 billion and $2.2 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $200 million, excluding gains and losses from non-affiliated investments. GAAP and non-GAAP tax rates are expected to be 15%, plus or minus 1% excluding any discrete items. Further financial information is included in the CFO commentary and other information available on our IR website. In closing, let me highlight some upcoming events for the financial community. We will attend the UBS Global Technology Conference in Scottsdale, Arizona, on November 28; the Wells Fargo TMT Summit in Rancho Palos Verdes, California, on November 29; the Arete Virtual Tech Conference on December 7; and the J.P. Morgan Health Care Conference in San Francisco on January 8. Our earnings call to discuss the results of our fourth quarter and fiscal 2024 is scheduled for Wednesday, February 21. We will now open the call for questions.

Operator

Your first question comes from Vivek Arya of Bank of America. Your line is open.

Vivek Arya, Analyst

Thanks for taking my question. Just, Colette, wanted to clarify what China contribution you are expecting in Q4. And then, Jensen, the main question is for you: where do you think we are in the adoption curve in terms of your shipments into the generative AI market? Because when I just look at the trajectory of your Data Center growth, it will be close to 30% of all the spending in data centers next year. So what metrics are you keeping an eye on to inform you that you can continue to grow? Just where are we in the adoption curve of your products into the generative AI market? Thank you.

Colette Kress, CFO

So, first let me start with your question, Vivek, on export controls and the impacts that we are seeing in our Q4 outlook and guidance. We had seen historically, over the last several quarters, that China and some of the other impacted destinations contributed about 20% to 25% of our Data Center revenue. Our guidance expects that to decrease substantially as we move into Q4. The export controls will have a negative effect on our China business, and we do not have good visibility into the magnitude of that impact even over the long term. We are working to expand our Data Center product portfolio to possibly offer new regulation-compliant solutions that do not require a license; these products may become available in the coming months. However, we don't expect their contribution to be material or meaningful as a percentage of revenue in Q4.

Jensen Huang, CEO

Generative AI is the largest total addressable market expansion of software and hardware that we've seen in several decades. At the core of it, what's really exciting is that what was largely a retrieval-based computing approach, where almost everything that you do is retrieved off of storage somewhere, has now been augmented with a generative method. And it has changed almost everything. You can see that in text-to-text, text-to-image, text-to-video, text-to-3D, text-to-protein, text-to-chemicals; these were things that were processed and typed in by humans in the past, and these are now generative approaches. The way that we access data has changed. It used to be based on explicit queries. It is now based on natural language queries, intention queries, semantic queries. We're excited about the work that we're doing with SAP, Dropbox, and many others that you'll hear about. One of the areas that is really impactful is the software industry, which is about $1 trillion or so and has been building tools that are manually used over the last couple of decades. Now there's a whole new segment of software called copilots and assistants. Instead of being manually used, these tools will have copilots to help you use them. We will connect all of these copilots and assistants into teams of AIs, which will be the modern version of enterprise business software. The transformation of software and the way it is being done is driving the hardware underneath. You can see that it's transforming hardware in two ways. One is related to accelerated computing; general-purpose computing is too wasteful of energy and cost. Now that we have much better approaches, called accelerated computing, you can save an order of magnitude of energy, an order of magnitude of time, or an order of magnitude of cost by using acceleration. Accelerated computing is transitioning, if you will, general-purpose computing into this new approach. This has been augmented by a new class of data centers. Unlike traditional data centers, these new data centers consist of very few applications, if not one application, used by a single tenant that processes data, trains models, and generates tokens and AI. We call these new data centers AI factories. We're seeing AI factories being built out everywhere, by every country. In terms of expansion, you saw the first wave with large language model start-ups, generative AI start-ups, and consumer Internet companies ramping. While that first wave is being ramped, we're starting to partner with enterprise software companies that would like to build chatbots and copilots to augment the tools they host on their platforms. You're seeing GPU-specialized CSPs cropping up all over the world, dedicated to processing AI. You're seeing sovereign AI infrastructures, as countries now recognize that they need to keep their own culture and data, process that data, and develop their own AI. This is evident in India, in Sweden, and, just last week, in France. The number of sovereign AI clouds being built is significant. My guess is that almost every major region, and indeed every major country, will have their own AI clouds. I believe that we are witnessing new developments as the generative AI wave spreads through every industry, every company, and every region. So, we are at the beginning of this inflection, this computing transition.

Operator

Your next question comes from the line of Aaron Rakers of Wells Fargo. Your line is open.

Aaron Rakers, Analyst

Yeah. Thanks for taking the question. I wanted to ask about the networking side of the business. Given the growth rates that you've now cited (I think it's 155% year-over-year, with strong sequential growth), it looks like that business is approaching a $2.5 billion to $3 billion quarterly level. I'm curious how you see Ethernet evolving, and how you would characterize the differentiation of Spectrum-X relative to the traditional Ethernet stack, as the networking narrative expands beyond just InfiniBand looking into next year? Thank you.

Jensen Huang, CEO

Yeah. Thanks for the question. Our networking business is already on a $10 billion plus run rate, and it's going to get much larger. As you mentioned, we added a new networking platform to our networking business recently. The vast majority of the dedicated large-scale AI factories standardize on InfiniBand. This is not only because of its data rate and latency but also the way it moves traffic around the network. In a multi-tenant hyperscale Ethernet environment, the traffic patterns of AI processing are radically different. With InfiniBand and software-defined networks, we can do congestion control, adaptive routing, performance isolation, and noise isolation, not to mention, of course, the data rate and the low latency that are natural to InfiniBand. InfiniBand is not just a network; it's also a computing fabric. We've incorporated many software-defined capabilities into the fabric, including computation. We perform floating-point calculations and computation right on the switch, in the fabric itself. The differences between Ethernet and InfiniBand, especially for AI factories, are profound. If you've invested in a $2 billion infrastructure for AI factories, a 20%, 25%, or 30% difference in overall effectiveness translates to hundreds of millions of dollars in value. Renting that infrastructure over four to five years adds up. The value proposition of InfiniBand for AI factories is undeniable. However, as we transition AI into enterprise, we want to empower every company to build its own custom AIs. We create custom AIs in our own company based on our proprietary data and skills. For example, we've created one of these models, called ChipNeMo, and we're building many others. There will be tens to hundreds of custom AI models created inside our company. These enterprise AIs must run in an Ethernet environment, so we invented a new platform that extends Ethernet; it doesn't replace Ethernet, it is 100% compliant with it. It's optimized for East-West traffic, where the computing fabric operates. It adds an end-to-end solution with Bluefield, as well as our Spectrum switch, enabling us to perform some of the capabilities we have in InfiniBand, though not all. We've achieved excellent results. Our go-to-market strategy involves partnering with large enterprises already offering our computing solutions. HP, Dell, and Lenovo are equipped with the NVIDIA AI stack and the NVIDIA AI Enterprise software stack, and now they can integrate Bluefield and bundle our Spectrum switch to offer enterprise customers globally a fully integrated, optimized, end-to-end AI solution.
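(Illustration: a back-of-the-envelope for Huang's efficiency argument. The $2 billion figure and the 20% to 30% effectiveness range are from the call; treating the effectiveness delta as directly recoverable value is an assumption.)

```python
# Why a modest effectiveness difference on an AI factory is worth so much:
# scale the infrastructure cost by the effectiveness delta Huang cites.
infrastructure_cost = 2_000_000_000  # "$2 billion infrastructure"

for delta in (0.20, 0.25, 0.30):
    value = infrastructure_cost * delta  # useful compute gained or lost
    print(f"{delta:.0%} effectiveness difference ~= ${value / 1e6:,.0f}M of value")
# 20-30% of $2B is $400M-$600M, i.e. "hundreds of millions of dollars in value".
```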

Operator

Your next question comes from the line of Joe Moore of Morgan Stanley. Your line is open.

Joseph Moore, Analyst

Great. Thank you. I'm wondering if you could talk a little bit more about Grace Hopper and how you see the ability to leverage the microprocessor as a total addressable market expander. What applications do you see using Grace Hopper versus more traditional H100 applications?

Jensen Huang, CEO

Yeah. Thanks for the question. Grace Hopper is in high-volume production now. With all of the design wins we have in high-performance computing and AI infrastructures, we're expecting next year to be on a very fast ramp with our first data center CPU to a multi-billion dollar product line. This is going to be a very large product line for us. The capability of Grace Hopper is really quite spectacular. It can create computing nodes that simultaneously utilize both very fast memory and very large memory. In areas like vector databases or semantic search, what's called retrieval-augmented generation, you can have a generative AI model refer to proprietary data or factual data, which is often quite large, before generating a response. You can also have applications or generative models where the context length is very long. This way, you can store entire books in system memory before you ask your questions. The context length can indeed be very large. These generative models can interact naturally as well as refer to factual data, proprietary data, or domain-specific data, providing contextual relevance and reducing hallucination. This use case for Grace Hopper is really quite a fantastic application.
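(Illustration: the retrieval-augmented generation pattern Huang describes reduces to: embed a query, retrieve the closest proprietary documents, and prepend them to the prompt before generating. A minimal, framework-free Python sketch; the bag-of-words "embedding" and the stubbed generate() stand in for a real embedding model and LLM, and the sample documents are invented.)

```python
# Toy RAG loop: ground a generative model on retrieved factual/proprietary
# data before it answers, the pattern Huang describes for Grace Hopper.
import math
from collections import Counter

DOCS = [  # stand-in for a large proprietary corpus held in fast, large memory
    "GH200 pairs a Grace CPU with a Hopper GPU in one superchip.",
    "HBM3e gives the H200 faster and larger memory for LLM inference.",
    "Long context windows let a model hold entire books in memory.",
]

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a learned embedding.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    # Return the k documents most similar to the query.
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(prompt: str) -> str:
    return f"[an LLM would answer here, grounded on]\n{prompt}"  # stub

query = "What memory does the H200 use?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```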

Operator

Your next question comes from the line of Tim Arcuri of UBS. Your line is open.

Tim Arcuri, Analyst

Hi. Thanks. I wanted to ask a little bit about the visibility that you have on revenue. I know there are a few moving parts. On one hand, the purchase commitments went up a lot again. On the other hand, the China restrictions would arguably pull in the point at which you can fill the demand beyond China. I know we're not even into 2024 yet, and it doesn't sound like, Jensen, you think that next year would be a peak for your Data Center revenue, but I just wanted to explicitly ask you that. Do you think that Data Center can grow even in 2025? Thanks.

Jensen Huang, CEO

I absolutely believe the Data Center can grow through 2025. There are several reasons for that. We are significantly expanding our supply. We already have one of the broadest, largest, and most capable supply chains in the world. Now, remember, people think that the GPU is just a chip, but the HGX H100, the Hopper HGX, has 35,000 parts and weighs 70 pounds. Eight of the chips are Hopper. The other 35,000 are not. Even its passive components are incredible, consisting of high-voltage, high-frequency, and high-current parts. It is a supercomputer, and the only way to test it is with another supercomputer. Every aspect of our HGX supply chain is complicated, and the remarkable team here has really scaled it out incredibly. I'm super proud of the team for scaling up this incredible supply chain. Meanwhile, we are adding new customers and new products, so we have new supply. GPU-specialist clouds are being stood up in different regions, and sovereign AI clouds are emerging across the globe as nations realize the need to preserve their knowledge and culture by developing their own AI infrastructure. Enterprises, as private companies, must build their own custom AIs, recognizing that they can't afford to export their country's intelligence to others. They realize they must create tools for their own customers to access this intelligence. We have a new service called an AI foundry, leveraging NVIDIA's capabilities to support them in that. We're witnessing waves of generative AI starting with start-ups and CSPs, moving to consumer Internet companies and enterprise software platforms, and ultimately to industrial generative AI. This transition to generative AI and accelerated computing promises to impact every company, every industry, and every nation tremendously.

Operator

Your next question comes from the line of Toshiya Hari of Goldman Sachs. Your line is open.

Toshiya Hari, Analyst

Hi. Thank you. I wanted to clarify something with Colette real quick, and then I had a question for Jensen as well. Colette, you mentioned that you'll be introducing regulation-compliant products over the next couple of months. Yet, the contribution to Q4 revenue should be relatively limited. Is that a timing issue and could it be a source of reacceleration in growth for Data Center in April and beyond or are the price points such that the contribution to revenue should be relatively limited? And then the question for Jensen, the AI foundry service announcement from last week. I just wanted to ask about that, and hopefully, have you expand on it. How is the monetization model going to work? Is it primarily services and software revenue? How should we consider the long-term opportunity set? Is this going to be exclusive to Microsoft or do you have plans to expand to other partners as well? Thank you.

Colette Kress, CFO

Thanks, Toshiya. On the question regarding potential new products that we could provide to our China customers, it's a significant process to both design and develop these new products. As we discussed, we're going to make sure that we are in full discussions with the U.S. government about our intent to ship these products as well. Given where we are in the quarter, we're already several weeks in, so it will take time for us to engage with our customers regarding their needs and interest in these new products. Whether that's medium-term or long-term is difficult to say, given the uncertainty around what we can produce with the U.S. government's approval and what our China customers are interested in. We remain focused on finding the right balance for our China customers, but it's hard to say at this time.

Jensen Huang, CEO

Toshiya, thanks for the question. There's a glaring opportunity in the world for an AI foundry, and it makes so much sense. Every company has its core intelligence: its data and domain expertise. Many software companies in the world are tool platforms, and those tools are currently used manually. In the future, they will be augmented with various AIs. Our foundry now extends across the globe; we've announced the first partnerships, with SAP, ServiceNow, Dropbox, and Getty, and there are many others to come. They need to build their proprietary AI while maintaining control over their data. We provide several essential elements in our foundry. First, as you know, we have incredible depth in AI technology. Second, we have established best practices, the expertise to create AI models that are safe and relevant. Third, you need factories, and that's what DGX Cloud provides. Our AI models are called AI Foundations, and our process for creating AIs runs in a secure, optimized environment for our customers. They then deploy their AI applications on our enterprise-optimized platform, NVIDIA AI Enterprise, designed to run across our entire product line, on-prem, in the cloud, and everywhere in between, with constant security and support. NVIDIA AI Enterprise is $4,500 per GPU per year; that constitutes our business model. Our customers then have the opportunity to create a monetization model on top of that. They could pursue subscription, per-instance, or per-usage pricing models. There are multiple ways for them to build their business model while we operate on an OS-level licensing model. We expect that NVIDIA AI Enterprise will become a very large business for us.
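(Illustration: the licensing arithmetic behind the business model Huang quotes. The $4,500 per GPU per year figure is from the call; the deployment sizes are hypothetical.)

```python
# NVIDIA AI Enterprise licensing as quoted on the call: $4,500 per GPU per year.
PRICE_PER_GPU_PER_YEAR = 4_500  # USD

for gpus in (8, 256, 10_000):  # hypothetical deployment sizes
    print(f"{gpus:>6} GPUs -> ${gpus * PRICE_PER_GPU_PER_YEAR:,}/year")
# The customer's own monetization (subscription, per-instance, per-usage)
# sits on top of this OS-level license.
```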

Operator

Your next question comes from the line of Stacy Rasgon of Bernstein Research. Your line is open.

Stacy Rasgon, Analyst

Hi guys. Thanks for taking my questions. Colette, I wanted to know: if it weren't for the China restrictions, would the Q4 guidance have been higher, or are you supply-constrained and simply reallocating parts that would have gone to China elsewhere? Along those lines, can you share your data center lead times right now, and does the China redirection lower those lead times because you have parts that are readily available to ship?

Colette Kress, CFO

Yeah. Stacy, let me see if I can help you understand. Yes, we are still working on improving our supply each and every quarter. We've done a solid job of ramping every quarter, and that has defined our revenue. But with the absence of China from our outlook for Q4, sure, there could have been some areas where we are not supply-constrained, where we could have sold but now no longer can. So, could our guidance have been a little higher for Q4? Yes. We are still working on improving our supply and plan to continue growing throughout next year.

Operator

Your next question comes from the line of Matt Ramsay of TD Cowen. Your line is open.

Matt Ramsay, Analyst

Thank you very much. Congrats, everybody, on the results. Jensen, I had a two-part question for you, and it comes off of one premise. The premise is that I still get a lot of questions from investors who think of AI training as NVIDIA's dominant domain, and who believe that as inferencing, even large-model inference, takes more and more of the total addressable market, the market will become increasingly competitive and you'll be less differentiated, et cetera. So I guess the two parts of the question are: number one, could you spend a little bit of time discussing the evolution of the inference workload as we transition to LLMs, and how your company is positioned for that rather than smaller-model inference? Second, up until a month or two ago, I never really received any questions about the data processing piece of the AI workloads; the pieces of manipulating the data before training, between training and inference, and after inference. I think that's a large part of the workload now. Could you talk about how CUDA is enabling acceleration of those pieces of the workload? Thanks.

Jensen Huang, CEO

Sure. Inference is complicated. It's incredibly complicated. This quarter, we announced one of the most exciting new engines, an optimizing compiler called TensorRT-LLM. The reception has been incredible. If you go to GitHub, you'll see it's been downloaded many times, gaining attention and integration into tech stacks globally almost instantaneously. There are several reasons for that, clearly. The development of TensorRT-LLM was made possible because CUDA is programmable. Without CUDA and our GPUs being so programmable, rapid software stack enhancements would be nearly impossible. TensorRT-LLM improves inference performance on the same GPU, doubling it through software alone, and H200 increases it by another factor of two. Thus, our inference cost, which is another way to express inference performance, has declined by a factor of four within the last year. It's really challenging for others to keep up with this pace. The reason so many gravitate toward our inference engine is our long-standing installed base. We've dedicated over 20 years to building this, and it shows: our installed base is the largest globally, spanning every single cloud and every enterprise system in every industry. Anytime you come across an NVIDIA GPU, it's tied to our software. The architectural compatibility is paramount. It's our top priority. The sense of certainty in the platform is why everyone chooses to build upon NVIDIA first. The engineering work we have invested, and all the advancement of technologies on top of NVIDIA, benefit all those who use our GPUs. Data processing is incredibly crucial. Before training a model, you must clean, curate, deduplicate, and potentially augment the data with synthetic sources. This data isn't measured merely in bytes or megabytes; it's measured in terabytes and petabytes. The effort devoted to data processing, before subsequent data engineering and training, is considerable and can account for 30%, 40%, or even 50% of the total job. Ultimately, all of this is needed to create robust, effective machine learning. To that end, we've accelerated Spark and Python frameworks. One of our recent features is called cuDF Pandas, where without changing a line of code, the single most successful data science framework in the world now operates at full speed with NVIDIA CUDA. The acceleration we're providing has garnered immense enthusiasm. People are incredibly excited, as Pandas was designed explicitly for data processing and data science.
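(Illustration: the "without changing a line of code" claim refers to cuDF's pandas accelerator mode, which intercepts the pandas API and falls back to CPU pandas for unsupported operations. The script below is ordinary pandas; the invocation comments reflect the documented RAPIDS usage, and the sample data is invented.)

```python
# Ordinary pandas code; nothing here is GPU-specific. With cuDF's pandas
# accelerator mode, the same unmodified script runs on the GPU:
#   python -m cudf.pandas script.py     # command line
#   %load_ext cudf.pandas               # Jupyter, before importing pandas
import pandas as pd

df = pd.DataFrame({
    "gpu": ["H100", "H200", "A100"],
    "memory_gb": [80, 141, 80],
})
print(df.sort_values("memory_gb", ascending=False).head())
```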

Operator

Your final question comes from the line of Harlan Sur of J.P. Morgan. Your line is open.

Harlan Sur, Analyst

Good afternoon. Thanks for taking my question. If you look back at the history of the tech industry, the companies that have succeeded focused heavily on ecosystem: software, silicon, hardware, strong partnerships, and, most importantly, a proactive product roadmap with increased levels of segmentation over time. The team recently announced an aggressive new product cadence in data centers, transitioning from every two years to every year, with higher segmentation across training, optimization, CPU, GPU, and DPU networking. How should we think about your R&D OpEx growth outlook to support this more expansive roadmap? More importantly, what steps is the team taking to coordinate and drive execution through all of this complexity?

Jensen Huang, CEO

Gosh. That's just excellent. You've effectively outlined NVIDIA's business plan. First, there is a fundamental reason to accelerate our execution: it drives down costs. Combining TensorRT-LLM with H200, customers reduce the cost of large model inference by a factor of four. Part of this is speeds and feeds, but the critical aspect lies in our software improvements and architectural stability, and that is why we are accelerating our roadmap. The second reason is to promote broad generative AI adoption. The configurations in the traditional data center are changing. NVIDIA is integrated across various clouds, each with unique platforms, networking control planes, and security postures, yet we adjust our architecture to fit seamlessly into these stacks. We are simultaneously establishing standalone AI factories, integrating our solutions for supercomputers and enterprises. We're now bringing AI to enterprises in ways no one else has done before, thereby laying the groundwork for our market rollout. The intricacies of technology and segmentation, paired with our architecture compatibility across all segments, provide incredible reach into the market. Our investments span domains like healthcare, manufacturing, AI, financial services, supercomputing, and more. The range of markets we enhance with domain-specific libraries is broadening exponentially. We provide a comprehensive solution for data centers, comprising InfiniBand networking, Ethernet networking, x86, ARM, and many combination solutions, assuring a cross-segment ecosystem of developers, software solution providers, and extensive distribution partnerships. This incredible reach necessitates sizeable effort. However, the discipline we established in maintaining architectural compatibility has remained consistent: a domain-specific language that runs on one GPU will run on every GPU we produce. Optimizing TensorRT for the cloud translates to optimal performance for enterprise setups as well, drawing benefits from all our developers enhancing their projects on NVIDIA's architecture.

Operator

Thank you. I will now turn the call back over to Jensen Huang for closing remarks.

Jensen Huang, CEO

Our strong growth reflects the broad industry platform transition from general-purpose to accelerated computing and generative AI. Large language model start-ups, consumer Internet companies, and global cloud service providers are the first movers. The next waves are starting to build. Nations and regional CSPs are building AI clouds to serve local demand. Enterprise software companies like Adobe, Dropbox, SAP, and ServiceNow are adding AI copilots and assistants to their platforms. Enterprises in the world's largest industries are creating custom AIs to automate and boost productivity. The generative AI era is in full motion and has created the need for a new type of data center, an AI factory, optimized for refining data, training, inference, and generating AI. AI factory workloads differ significantly from legacy data center workloads supporting IT tasks. AI factories run copilots and AI assistants, which provide substantial software total addressable market expansion and drive new investment. We're expanding the $1 trillion traditional data center infrastructure installed base, empowering an AI industrial revolution. NVIDIA H100 HGX with InfiniBand and the NVIDIA AI software stack define an AI factory today. As we expand our supply chain to meet global demand, we are also constructing new growth drivers for the upcoming AI wave. We highlighted three elements of our new growth strategy gaining momentum: CPU, networking, and software and services. Grace is NVIDIA's first data center CPU. Grace and Grace Hopper are in full production, ramping into a multi-billion dollar product line next year. Irrespective of the CPU choice, we can help customers construct an AI factory. NVIDIA networking now exceeds a $10 billion annualized revenue run rate. InfiniBand grew fivefold year-on-year and is positioned for significant growth as the networking solution for AI factories. As enterprises race to adopt AI, Ethernet is the standard networking solution. This week, we announced an Ethernet for AI platform for enterprises. NVIDIA Spectrum-X is an end-to-end solution combining Bluefield SuperNIC, Spectrum-4 Ethernet switch, and software that boosts Ethernet performance by 1.6 times for AI workloads. Dell, HPE, and Lenovo have joined us to deliver a complete generative AI solution, integrating NVIDIA AI computing, networking, and software for global enterprises. NVIDIA software and services are projected to exit the year at an annualized revenue run rate of $1 billion. Enterprise software platforms such as ServiceNow and SAP need to build and operate proprietary AI. Enterprises require custom AI copilots. We possess the technology, expertise, and scale necessary to help customers design custom models with their proprietary data on NVIDIA DGX Cloud and deploy applications on enterprise-grade NVIDIA AI Enterprise. Essentially, NVIDIA is an AI foundry. Our GPUs, CPUs, networking, AI foundry services, and NVIDIA AI Enterprise software represent robust growth engines in full throttle. Thank you for joining us today. We are eager to update you on our advancements next quarter.

Operator

This concludes today's conference call. You may now disconnect.
