So really, it's important to understand how we entered the overall inferencing market.
Let's go back a couple of years. We had discussed the size of the opportunity in front of us in the data center. That opportunity incorporated both training and high-performance computing, areas where we had quite a leadership position, but also our view of the importance of inferencing. A lot of people questioned it and said, wait a minute, why? Why would people move to a GPU? People use a CPU for inferencing.
You're going to need a different form factor.
You're going to need to be smaller.
You're going to need to focus on wattage; otherwise, the CPU will be just fine. We took on that challenge as a very important part of understanding how inferencing would change, and how it has changed over the last couple of years.
Our thinking, from working with many customers, is that the type of inferencing done historically was really dictated by the type of compute they had. They said, well, the compute can do this; therefore, I will create inferencing to support that. What we mean by that is rather simplistic types of inferencing over massive amounts of data, with very binary types of responses. What we saw is the output of training: the output of training for AI would create phenomenal data sets, and new information coming in would need to go through that inferencing model. But a standard compute chip would likely be too slow to respond to the needs of that inferencing workload. What we see right now is that conversational AI, for example, is a phenomenal AI workload. It is an AI workload that takes on one of the most challenging parts of understanding data, which is understanding natural language: understanding what is said, deciphering the meaning of the pieces, but also being able to respond in that language in a conversational manner. That takes training. That also, therefore, takes inferencing alongside it. But speed and performance are important.
You have to have the ability to speak multiple languages with your workload, but you're also looking for a couple-hundred-millisecond type of response.
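To make that response-time point concrete, here is a minimal sketch, with an entirely hypothetical run_inference stub standing in for a deployed conversational model, of timing an end-to-end answer against a roughly 200-millisecond interactive budget:

```python
# Minimal latency-budget sketch (all names hypothetical); run_inference is a
# stub standing in for a real GPU-backed conversational inference call.
import time

LATENCY_BUDGET_MS = 200  # the "couple hundred milliseconds" target


def run_inference(utterance: str) -> str:
    """Stand-in for a deployed model; a real system would call an
    inference server here."""
    time.sleep(0.05)  # simulate ~50 ms of model execution
    return f"response to: {utterance}"


def respond(utterance: str) -> str:
    start = time.perf_counter()
    answer = run_inference(utterance)
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > LATENCY_BUDGET_MS:
        # Over budget: the exchange no longer feels conversational.
        print(f"warning: {elapsed_ms:.0f} ms exceeds the {LATENCY_BUDGET_MS} ms budget")
    return answer


print(respond("what's the weather tomorrow?"))
```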
So our entrance into inferencing was an extension of what we saw in training.
We have now built inferencing to be solidly in the double-digit percentages of our overall data center business in just a couple of years.
And it keeps growing, doubling year-over-year in just this last quarter.
The ability to bring something like the A100 to market allows us, again, not to have the customer choose at purchase which workload they want to run, and to allow them that continuation from building their training models into inferencing.
As we go forward, it's going to be hard for us to determine specifically and model how much of the workloads we're selling is inferencing. But with the A100 we have the ability to meet the multiple needs of the end customer, whether that be a hyperscaler, an enterprise, or an enterprise at the edge as well.
Now, will that mean that everything we do moving forward will be only A100 types of platforms? No. We believe that acceleration will meet so many of the different types of applications and servers going forward, and that someday, way in the future, almost everything will be accelerated. But over this period of time, we may still have different form factors outside of the A100. Keep in mind, the A100 is a platform.
We also sell full systems, full systems such as the DGXs. Why do we do that? It gives those whose core competency is not infrastructure an end-to-end configuration that allows them to plug and play, to focus on their application and the work where they have their core competencies. But similarly, we may have the opposite: we may say, we will provide you a specific GPU for the specific high-volume workloads that you have.
So we'll see how the market plays out there. But I think we'll have a full host of different platform solutions, and not everything will look just like an A100.