Integrating AI Models into Spring Boot 3 Microservices Using ONNX Runtime: A Practical Guide

Artificial Intelligence (AI) is transforming many industries, and integrating AI models into microservices is becoming increasingly common. Spring Boot 3, a major release of the popular Java application framework, makes it straightforward to build scalable microservices. ONNX Runtime, an open-source engine for accelerating machine learning model inference, offers a Java API that lets you run AI models directly inside those microservices.

This article provides a practical guide to integrating AI models into Spring Boot 3 microservices using ONNX Runtime. We will walk through the process step by step, including setting up your Spring Boot application, embedding an ONNX model, and performing inference in real time.

Why Use ONNX Runtime in Spring Boot Microservices?

ONNX Runtime is a cross-platform inference engine that runs models in the ONNX format. Models trained in frameworks such as TensorFlow, PyTorch, and scikit-learn can be exported or converted to ONNX and then served by the same runtime. Some key benefits of using ONNX Runtime include:

  • Performance: Optimized for high-speed inference.
  • Flexibility: Runs models from many training frameworks once they have been exported to ONNX.
  • Ease of Integration: Compatible with Java via its Java API.
  • Scalability: Runs in-process as a library, so no separate model server is needed, which suits a microservices architecture.

By integrating ONNX Runtime into a Spring Boot 3 microservice, you can deploy AI models seamlessly and scale them as needed.

Prerequisites

Before diving into the implementation, ensure you have the following:

  1. Java Development Kit (JDK): Version 17 or higher (required for Spring Boot 3).
  2. Spring Boot 3: A project based on Spring Boot 3.x (generated in Step 1).
  3. ONNX Runtime Java Dependency: Added to your `pom.xml`.
  4. An ONNX Model: Pre-trained AI model exported in ONNX format.
  5. Maven: Build tool for managing dependencies.

Step 1: Setting Up the Spring Boot Project

Start by creating a new Spring Boot project. You can use Spring Initializr to generate the project structure:

  1. Go to [Spring Initializr](https://start.spring.io/).
  2. Select **Spring Boot Version 3.x**.
  3. Add dependencies: **Spring Web** and **Spring Boot Actuator**.
  4. Download the generated project.

Step 2: Add ONNX Runtime Dependency

To use ONNX Runtime in your Spring Boot project, you need to add the following dependency to your `pom.xml`:

<pre class="wp-block-syntaxhighlighter-code">
<dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime</artifactId>
    <version>1.15.0</version>
</dependency>
</pre>

Run `mvn clean install` (or `mvn dependency:resolve`) to confirm that Maven can download and resolve the dependency.

Step 3: Load the ONNX Model

Place your ONNX model file (e.g., `model.onnx`) in the `src/main/resources` folder of your Spring Boot project, then create a service that loads the model using ONNX Runtime.

Code Example: Load the ONNX Model

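import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;
import jakarta.annotation.PreDestroy;
import org.springframework.core.io.ClassPathResource;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.io.InputStream;

@Service
public class AIModelService {

    private final OrtEnvironment environment;
    private final OrtSession session;

    public AIModelService() throws IOException, OrtException {
        // Obtain the shared ONNX Runtime environment
        this.environment = OrtEnvironment.getEnvironment();

        // Read the model from the classpath so that it also loads from a packaged JAR,
        // where src/main/resources no longer exists as a directory on disk
        byte[] modelBytes;
        try (InputStream in = new ClassPathResource("model.onnx").getInputStream()) {
            modelBytes = in.readAllBytes();
        }
        this.session = environment.createSession(modelBytes);
    }

    public OrtSession getSession() {
        return session;
    }

    // Release the session's native resources when the application shuts down
    @PreDestroy
    public void close() throws OrtException {
        session.close();
    }
}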

Step 4: Perform Inference Using ONNX Runtime

Once the model is loaded, you can use the `OrtSession` object to perform inference. Here’s an example of how to pass input data to the model and retrieve predictions.

Code Example: Perform Inference

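import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;
import org.springframework.stereotype.Service;

import java.util.Collections;

@Service
public class InferenceService {

    private final AIModelService modelService;

    // Constructor injection keeps the dependency explicit and easy to test
    public InferenceService(AIModelService modelService) {
        this.modelService = modelService;
    }

    public float[] predict(float[] inputData) throws OrtException {
        OrtSession session = modelService.getSession();

        // Use the model's first declared input name rather than hard-coding it
        String inputName = session.getInputNames().iterator().next();

        // try-with-resources closes the tensor and the result even if inference fails
        try (OnnxTensor inputTensor = OnnxTensor.createTensor(
                     OrtEnvironment.getEnvironment(),
                     new float[][]{inputData}); // shape [1, n]: a batch of one
             OrtSession.Result result = session.run(
                     Collections.singletonMap(inputName, inputTensor))) {

            // Assumes the first output is a float tensor of shape [1, m];
            // adjust the cast if your model's output has a different shape
            float[][] output = (float[][]) result.get(0).getValue();
            return output[0];
        }
    }
}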
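
The raw `float[]` returned by `predict` often needs interpreting before it is useful to a client. As a minimal sketch, assuming a classification model whose output is a vector of unnormalized scores (logits), the helper below converts the scores to probabilities with a numerically stable softmax and picks the most likely class index. The class name `Predictions` and its method names are hypothetical and are not part of ONNX Runtime; a regression model, or a model that already emits probabilities, would not need this step.

```java
// Hypothetical post-processing helper; not part of ONNX Runtime or Spring.
final class Predictions {

    private Predictions() {
    }

    /** Converts raw scores to probabilities that sum to 1 (softmax). */
    static float[] softmax(float[] logits) {
        if (logits.length == 0) {
            throw new IllegalArgumentException("logits must not be empty");
        }
        // Subtract the maximum before exponentiating to avoid overflow.
        float max = logits[0];
        for (float value : logits) {
            max = Math.max(max, value);
        }
        double[] exps = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            exps[i] = Math.exp(logits[i] - max);
            sum += exps[i];
        }
        float[] probabilities = new float[logits.length];
        for (int i = 0; i < logits.length; i++) {
            probabilities[i] = (float) (exps[i] / sum);
        }
        return probabilities;
    }

    /** Returns the index of the largest value (the first index on ties). */
    static int argmax(float[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

With this in place, `Predictions.argmax(Predictions.softmax(output))` would give the index of the predicted class, which the service could map to a label of your choosing.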

Step 5: Exposing the AI Model via REST API

To make the AI model accessible, create a REST controller that interacts with the `InferenceService`. The controller will allow users to send input data and receive predictions.

Code Example: REST Controller

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/predict")
public class AIController {

    @Autowired
    private InferenceService inferenceService;

    @PostMapping
    public float[] predict(@RequestBody float[] inputData) throws Exception {
        return inferenceService.predict(inputData);
    }
}
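
Because the endpoint accepts any JSON array, it can help to reject malformed input before it reaches the model. The sketch below shows one possible check, assuming the model expects a fixed number of features. The class `InputValidator` and the feature count passed to it are hypothetical and would need to match your model's actual input shape.

```java
// Hypothetical validation helper for the request payload.
final class InputValidator {

    private final int expectedLength;

    InputValidator(int expectedLength) {
        if (expectedLength <= 0) {
            throw new IllegalArgumentException("expectedLength must be positive");
        }
        this.expectedLength = expectedLength;
    }

    /** Returns the input unchanged if it is valid; otherwise throws IllegalArgumentException. */
    float[] validate(float[] input) {
        if (input == null) {
            throw new IllegalArgumentException("input must not be null");
        }
        if (input.length != expectedLength) {
            throw new IllegalArgumentException(
                    "expected " + expectedLength + " values but got " + input.length);
        }
        for (int i = 0; i < input.length; i++) {
            // Reject NaN and infinite values, which rarely make sense as model features.
            if (!Float.isFinite(input[i])) {
                throw new IllegalArgumentException("value at index " + i + " is not finite");
            }
        }
        return input;
    }
}
```

In the controller, a call such as `validator.validate(inputData)` before `inferenceService.predict(inputData)` would let invalid requests fail fast; how the exception is mapped to an HTTP 400 response is left to the application.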

Step 6: Testing the Microservice

To test the microservice, run your Spring Boot application and use a tool like **Postman** or **cURL**. Send a POST request to the `/api/predict` endpoint with the input data as the JSON payload.

Example cURL Command:

curl -X POST -H "Content-Type: application/json" \
-d "[1.0, 2.0, 3.0]" \
http://localhost:8080/api/predict

You should receive a JSON response containing the model predictions.
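
The same request can also be sent from Java, which is convenient in integration tests. The following sketch uses the JDK's built-in `java.net.http.HttpClient` (available since Java 11) and assumes the service is reachable at `http://localhost:8080`, as in the cURL example. The class name `PredictClient` and its methods are illustrative.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for the /api/predict endpoint.
final class PredictClient {

    private PredictClient() {
    }

    /** Builds a POST request equivalent to the cURL command. */
    static HttpRequest buildRequest(String baseUrl, String jsonArray) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/predict"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonArray))
                .build();
    }

    /** Sends the request and returns the raw JSON response body. */
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return response.body();
    }
}
```

For example, `PredictClient.send(PredictClient.buildRequest("http://localhost:8080", "[1.0, 2.0, 3.0]"))` would return the response body as a string while the application is running.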

Step 7: Deploying the Microservice

Once you have tested the microservice locally, you can deploy it using Docker or a cloud platform like AWS, Azure, or Google Cloud. Ensure that your ONNX model is included in the deployment package.
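
A missing model file is an easy deployment mistake to make. As a sketch of one way to catch it early, the helper below checks whether a resource such as `model.onnx` is visible on the classpath, which is where files under `src/main/resources` end up once Maven packages the application. The class name `ModelResourceCheck` is hypothetical; it could be called during startup so that the service fails immediately instead of on the first request.

```java
import java.net.URL;

// Hypothetical startup check; not part of Spring or ONNX Runtime.
final class ModelResourceCheck {

    private ModelResourceCheck() {
    }

    /** Returns true if the named resource can be found on the classpath. */
    static boolean isOnClasspath(String resourceName) {
        URL url = ModelResourceCheck.class.getClassLoader().getResource(resourceName);
        return url != null;
    }

    /** Throws IllegalStateException if the named resource is missing. */
    static void requireOnClasspath(String resourceName) {
        if (!isOnClasspath(resourceName)) {
            throw new IllegalStateException(
                    "Model resource not found on classpath: " + resourceName);
        }
    }
}
```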

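Sample Dockerfile for Deployment

# A glibc-based image is used because ONNX Runtime's prebuilt Linux native libraries may not load on Alpine (musl)
FROM eclipse-temurin:17-jre
VOLUME /tmp
COPY target/springboot-ai-microservice.jar app.jar
COPY src/main/resources/model.onnx model.onnx
ENTRYPOINT ["java", "-jar", "/app.jar"]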

Build and run the Docker container:


docker build -t springboot-ai-microservice .
docker run -p 8080:8080 springboot-ai-microservice

Conclusion

By following this guide, you have integrated an ONNX model into a Spring Boot 3 microservice: the service loads the model at startup, runs inference through ONNX Runtime's Java API, and exposes predictions over a REST endpoint. Because inference runs in-process, the service can be packaged in a container and scaled like any other Spring Boot application, which makes this setup a solid starting point for production-grade AI microservices.