Valérian de Thézan de Gaussan · Data Engineering for process-heavy organizations

What does "inference" mean in the AI world?

You keep seeing this word but aren't sure what it means: here's a short explanation


When you explore the world of AI, you’ll often come across terms like “inference” and “inference API”.

Let’s break down what it means in the context of AI (or should I say, machine learning).

So, you want to do AI:

  • Before a machine learning model can make any predictions, it must first be trained. Training means feeding the model a large dataset that contains both input features and the corresponding expected outcomes.
  • Once the model has been trained, it can be used to analyze new, unseen data. This step is called inference: given fresh input, the model processes it through the knowledge acquired during training and produces a “prediction” or “result”.

So, inference is simply feeding data into a trained machine learning model and getting a result out.
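
Here’s a minimal sketch of the two stages in Python, using scikit-learn; the tiny dataset is made up purely for illustration:

```python
from sklearn.linear_model import LogisticRegression

# --- Training: the model learns from inputs paired with expected outcomes ---
X_train = [[0.0], [1.0], [2.0], [3.0]]   # input features (toy data)
y_train = [0, 0, 1, 1]                   # corresponding expected outcomes
model = LogisticRegression()
model.fit(X_train, y_train)              # this is the training stage

# --- Inference: the trained model makes a prediction on new, unseen data ---
X_new = [[1.5]]
prediction = model.predict(X_new)        # this is inference
print(prediction)                        # e.g. [1]
```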

It’s the stage where a machine learning model becomes practically useful. One can finally use it!

To make that possible, an API can be built on top of the model, so it can be accessed using various protocols. That’s why you’ll often come across the term “inference API” when browsing through the world of machine learning.
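
In practice, calling an inference API usually means sending your input data in an HTTP request and reading the prediction from the response. The sketch below is purely illustrative: the endpoint URL, payload shape, and token are assumptions, not any real provider’s API.

```python
import requests

# Hypothetical inference API call: endpoint, payload, and token are
# illustrative placeholders, not a specific provider's real interface.
response = requests.post(
    "https://api.example.com/v1/inference",        # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    json={"inputs": "Some fresh, unseen data"},    # the new data to run through the model
)
print(response.json())  # the model's prediction, returned by the server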

Image: page from the OpenAI API documentation mentioning “inference” (openai-inference.jpeg).