Model Deployment

What is a model deployment?

A model deployment enables clients to send inference requests to a model over a network. Typically, the model is loaded from a model registry onto model-serving infrastructure, such as KServe. A model deployment refers only to online models; offline models make predictions as part of batch inference pipelines.
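
As an illustration, the sketch below sends an inference request to a deployed model using KServe's V1 inference protocol. The host, service name, and feature values are hypothetical placeholders; the request shape follows the V1 protocol's "instances" convention.

```python
import requests

# Hypothetical endpoint; KServe exposes models at /v1/models/<name>:predict.
url = "http://fraud-detector.default.example.com/v1/models/fraud-detector:predict"

# The V1 inference protocol wraps feature vectors in an "instances" list.
payload = {"instances": [[42.0, 0.37, 1018.5]]}

response = requests.post(url, json=payload, timeout=5)
response.raise_for_status()
print(response.json())  # e.g. {"predictions": [0]}
```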

How do you create a model deployment?

After an online machine learning model has been trained on a dataset, it needs to be deployed so that it can make predictions on new data in the real world. This involves packaging the model in a format that can be loaded onto model-serving infrastructure, from where it handles requests from clients over the network in real time.
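
For example, the sketch below creates a deployment with the KServe Python SDK, assuming a scikit-learn model artifact already uploaded to object storage. The service name, namespace, and storage URI are hypothetical; in practice the URI would point at your model registry or object store.

```python
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
    constants,
)

# Describe the deployment: which model artifact to serve, with which runtime.
isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_GROUP + "/v1beta1",
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(name="fraud-detector", namespace="default"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            # Hypothetical storage URI for the exported model artifact.
            sklearn=V1beta1SKLearnSpec(storage_uri="gs://my-models/fraud-detector/1")
        )
    ),
)

# Submit the InferenceService to the cluster; KServe provisions the serving
# infrastructure and exposes an HTTP endpoint for the model.
KServeClient().create(isvc)
```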

Do I always need to deploy my model?

Not all machine learning projects require model deployment. For instance, if your project involves offline analysis, batch processing, or one-time predictions, deploying the model may not be necessary; a batch inference pipeline, as sketched below, is often sufficient.
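For contrast, a batch inference pipeline can be as simple as the following sketch: load the trained model artifact, score a file of records in one pass, and write the predictions back to storage. All paths and column names here are hypothetical.

```python
import joblib
import pandas as pd

# Load the trained model artifact (hypothetical path).
model = joblib.load("models/fraud-detector/model.joblib")

# Score an entire batch of records at once instead of per-request.
batch = pd.read_parquet("data/transactions_2023-06-01.parquet")
batch["prediction"] = model.predict(batch[["amount", "velocity", "balance"]])

# Persist predictions for downstream consumers; no serving endpoint needed.
batch.to_parquet("predictions/transactions_2023-06-01.parquet")
```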

However, if your project involves real-time predictions, continuous data streams, or online interaction with users or other systems, then deploying your model becomes essential. Examples of such scenarios include recommendation engines, fraud detection systems, and real-time anomaly detection in IoT devices.
