Serving PyTorch models with prebuilt containers on Vertex AI

🤖roboticcontent.com

General

Feb 18, 2023

3 min

Machine learning (ML) practitioners using PyTorch tell us that it can be challenging to advance their ML projects beyond experimentation. That's why, over the last year, we've prioritized development work that makes it easier for PyTorch users to deploy models in the cloud using Vertex AI. Vertex AI is a fully managed machine learning platform with tools, workflows, and infrastructure designed to help ML practitioners accelerate and scale ML in production with the benefit of open-source tools.

We are excited to announce that Vertex AI now offers support for pre-built PyTorch serving containers, which makes it easier to bring your PyTorch models into production. You don't have to build a custom container to serve your PyTorch model. With pre-built containers, we've streamlined the ML lifecycle for PyTorch users. This post describes how to deploy your own PyTorch models on Vertex AI. For more details, you can also have a look at the documentation.

Deploy a PyTorch model in three steps

Step 1 - Package your PyTorch model

The first step is to package your trained PyTorch model, together with any default or custom handlers, into an archive file using the Torch model archiver. Handlers are responsible for the following:

Pre-processing input data into the expected format
Customizing how the model is invoked
Post-processing output from the model
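
The three handler responsibilities listed above can be sketched as a small class. This is a framework-free illustration, not the TorchServe API: a real custom handler would typically subclass TorchServe's BaseHandler and wrap a loaded PyTorch model. The class name SketchHandler, the toy_model function, and the label names are all hypothetical, and the request layout (a list of dicts with a "data" or "body" key) is an assumption.

```python
import json


class SketchHandler:
    """Framework-free illustration of the three handler stages.

    Assumption: `model` is any callable standing in for a trained PyTorch
    model, so the sketch runs without TorchServe or PyTorch installed.
    """

    def __init__(self, model, labels):
        self.model = model      # hypothetical stand-in for the trained model
        self.labels = labels    # hypothetical class names

    def preprocess(self, requests):
        # Pre-process: extract each payload and convert it into the format
        # the model expects (a list of floats in this sketch).
        batch = []
        for request in requests:
            payload = request.get("data") or request.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            if isinstance(payload, str):
                payload = json.loads(payload)
            batch.append([float(x) for x in payload])
        return batch

    def inference(self, batch):
        # Customize how the model is invoked (one call per input here).
        return [self.model(features) for features in batch]

    def postprocess(self, outputs):
        # Post-process: map raw scores to the best-scoring label per input.
        results = []
        for scores in outputs:
            best = max(range(len(scores)), key=scores.__getitem__)
            results.append({"label": self.labels[best], "score": scores[best]})
        return results

    def handle(self, requests):
        return self.postprocess(self.inference(self.preprocess(requests)))


def toy_model(features):
    # Hypothetical model: two scores derived from the sum of the inputs.
    total = sum(features)
    return [total, -total]


handler = SketchHandler(toy_model, labels=["positive", "negative"])
print(handler.handle([{"data": "[1.0, 2.0]"}, {"body": b"[-3.0, 1.0]"}]))
```

Keeping the three stages in separate methods lets you override only the stage that differs from the default behavior.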

After defining your handlers, you create the model archive file using the Torch model archiver. The pre-built PyTorch image requires the archived model file to be named model.mar, so you need to set the model name to model.
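
As a rough sketch, the archiver invocation could be assembled in Python like this. The file paths and the build_archiver_command helper are hypothetical; the flags are standard torch-model-archiver options, and the model name is fixed to model so that the output file is model.mar. The sketch only builds and prints the command; running it requires torch-model-archiver to be installed and the export directory to exist.

```python
import shlex


def build_archiver_command(serialized_file, handler, export_path,
                           extra_files=(), version="1.0"):
    """Return the torch-model-archiver argument list for a model.mar archive."""
    command = [
        "torch-model-archiver",
        "--model-name", "model",   # yields model.mar, as the pre-built image requires
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--export-path", export_path,
        "--force",                 # overwrite an existing archive
    ]
    if extra_files:
        # Extra files are passed as one comma-separated value.
        command += ["--extra-files", ",".join(extra_files)]
    return command


command = build_archiver_command(
    serialized_file="artifacts/model.pt",          # hypothetical path
    handler="custom_handler.py",                   # hypothetical handler file
    export_path="archive",                         # hypothetical output directory
    extra_files=["artifacts/index_to_name.json"],  # hypothetical label mapping
)
print(shlex.join(command))

# To actually build the archive (not executed here):
# import subprocess
# subprocess.run(command, check=True)
```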

Step 2 - Upload the model to Vertex AI with the pre-built PyTorch serving container image

After you package the PyTorch model, you upload it to the Vertex AI Model Registry, where you can track and manage all of your models and quickly deploy them to Vertex AI endpoints. You can use the Vertex AI SDK and the pre-built PyTorch serving image to upload the PyTorch model. The Vertex AI SDK provides an optimized experience for interacting with the Vertex AI APIs. Your code will look something like this:

```python
from google.cloud import aiplatform as vertexai

# initialize the Vertex AI SDK
vertexai.init(project=PROJECT_ID, staging_bucket=BUCKET_NAME)


# upload the PyTorch model
model = vertexai.Model.upload(
    display_name=model_display_name,
    description=model_description,
    serving_container_image_uri=serving_container_image_uri,
    artifact_uri=ARCHIVED_MODEL_GCS_URI,
)

model.wait()
```

Step 3 - Create a Vertex AI endpoint and deploy the PyTorch model

The third, and last, step is to create a Vertex AI endpoint and deploy the PyTorch model to the endpoint. For this, you can also use the Vertex AI SDK or you can deploy it through the Google Cloud Console. First, you need to create an endpoint.

```python
endpoint_display_name = f"pytorch-endpoint-{TIMESTAMP}"
endpoint = vertexai.Endpoint.create(display_name=endpoint_display_name)
```
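
The endpoint name above relies on a TIMESTAMP value that the post does not define. One possible way to produce it, assuming a compact UTC timestamp is acceptable for your naming scheme (the make_timestamp helper is hypothetical):

```python
from datetime import datetime, timezone


def make_timestamp(now=None):
    """Return a compact timestamp such as 20230218153000 (UTC by default)."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y%m%d%H%M%S")


TIMESTAMP = make_timestamp()
endpoint_display_name = f"pytorch-endpoint-{TIMESTAMP}"
print(endpoint_display_name)
```

A timestamp suffix keeps display names distinguishable when you create endpoints repeatedly during experimentation.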

Next, deploy the model into the endpoint so it can serve online predictions with low latency.

```python
# Deploy your PyTorch model as an endpoint
endpoint = model.deploy(
    endpoint=endpoint,
    deployed_model_display_name=deployed_model_display_name,
    machine_type=machine_type,
    traffic_percentage=traffic_percentage,
    sync=sync,
)
```

Once your model is deployed, you can integrate it with your business applications. You can test the endpoint with the Vertex AI SDK by calling endpoint.predict(instances=test_instance), or from Cloud Shell or the Google Cloud Console.
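
The shape of test_instance depends on what your handler expects. As a sketch, assuming a handler that reads base64-encoded text from a "data" field (an assumption, not something this post specifies), a test request could be built like this. The make_text_instance helper and the sample sentence are hypothetical.

```python
import base64
import json


def make_text_instance(text):
    """Wrap text as one base64-encoded instance (assumed handler input format)."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("utf-8")
    return {"data": {"b64": encoded}}


test_instance = [make_text_instance("Jaw dropping visual effects!")]

# Equivalent JSON body for a raw REST call to the endpoint's predict method.
request_body = json.dumps({"instances": test_instance})
print(request_body)

# With a deployed endpoint (not executed here):
# prediction = endpoint.predict(instances=test_instance)
# print(prediction.predictions)
```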

What’s next?

To learn more about PyTorch on Vertex AI, take a look at the documentation, which explains Vertex AI's PyTorch integrations and provides resources that show you how to use PyTorch on Vertex AI. You’ll see how easy it is to train, deploy, and orchestrate models in production using PyTorch and Vertex AI. You can also have a look at the notebook that shows how to deploy and host a generative vision model on Vertex AI or try this notebook that deploys a text classification model.

Last updated: March 23, 2026