Integrate with Serverless
Runpod Serverless endpoints are REST APIs that accept HTTP requests, execute your code, and return the result via HTTP response. Each endpoint provides a unique URL and abstracts away the complexity of managing individual GPUs/CPUs. To integrate with Serverless:
- Create a handler function with the code for your application.
- Package your worker into a Docker image and push it to a Docker registry.
- Deploy a Serverless endpoint using the Runpod console or REST API.
- Start sending requests to the endpoint.
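As a rough sketch of the first step, a minimal handler for a Serverless worker using the runpod Python SDK might look like the following; the `prompt` input field is just a placeholder for whatever schema your application expects:

```python
# handler.py - runs inside your Serverless worker
import runpod

def handler(job):
    # "prompt" is a hypothetical input field; use whatever schema your app expects.
    prompt = job["input"].get("prompt", "")
    return {"output": prompt.upper()}

# Start the Serverless worker loop with your handler.
runpod.serverless.start({"handler": handler})
```

Once the endpoint is deployed, you can send it requests over HTTP. The sketch below assumes the synchronous `/runsync` route and Bearer-style authorization; the endpoint ID and API key are placeholders:

```python
import requests

ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder
API_KEY = "YOUR_RUNPOD_API_KEY"    # placeholder

resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "hello"}},
    timeout=60,
)
print(resp.json())
```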
Integrate with Pods
Pods are self-contained compute environments that you can deploy on Runpod. They’re ideal for applications that require a consistent, predictable environment, such as web applications or backend services with a constant workload. There are two primary methods for integrating a Pod with your application:
HTTP proxy
For web-based APIs or UIs, Runpod provides an automated HTTP proxy. Any port you expose as an HTTP port in your template or Pod configuration is accessible via a unique URL. The URL follows this format: https://POD_ID-INTERNAL_PORT.proxy.runpod.net. For example, if your Pod’s ID is abc123xyz and you exposed port 8000, your application would send requests to https://abc123xyz-8000.proxy.runpod.net.
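As a quick illustration, using the example Pod ID and port above and assuming an HTTP service is listening on that port, a request through the proxy might look like this (the `/health` route is hypothetical):

```python
import requests

# abc123xyz and 8000 are the example Pod ID and internal port from above;
# "/health" is a hypothetical route exposed by your application.
resp = requests.get("https://abc123xyz-8000.proxy.runpod.net/health", timeout=30)
print(resp.status_code, resp.text)
```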
Direct TCP
For protocols that require persistent connections or fall outside of standard HTTP, use direct TCP ports. When you expose a TCP port, Runpod assigns a public IP address and a mapped external port. You can find these details using the GET /pods/POD_ID endpoint or the Pod connection menu in the Runpod console.
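For example, a raw TCP client in Python might connect as sketched below; the IP address and external port are placeholders for the values Runpod assigns to your Pod, and the `ping` payload is whatever your own service's protocol expects:

```python
import socket

# Placeholder values; use the public IP and mapped external port shown in
# the Pod connection menu or returned by the GET /pods/POD_ID endpoint.
PUBLIC_IP = "203.0.113.10"
EXTERNAL_PORT = 10250

with socket.create_connection((PUBLIC_IP, EXTERNAL_PORT), timeout=30) as sock:
    sock.sendall(b"ping\n")   # replace with your service's protocol
    print(sock.recv(1024))
```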
Integrate with OpenAI-compatible endpoints
Many external tools and agentic frameworks support OpenAI-compatible endpoints with little-to-no configuration required. Integration is usually straightforward: any library or framework that accepts a custom base URL for API calls will work with Runpod without specialized adapters or connectors. This means you can integrate Runpod with tools like n8n, CrewAI, LangChain, and many others by simply pointing them to your Runpod endpoint URL and providing your Runpod API key for authentication. You can integrate OpenAI-compatible tools with Runpod using any of the following methods:
Public Endpoints
Public Endpoints are pre-deployed AI models that you can use without setting up your own Serverless endpoint. They’re vLLM-compatible and return OpenAI-compatible responses, so you can get started quickly without deploying your own workers. The following Public Endpoint URLs are available for OpenAI-compatible models:
- Qwen3 32B AWQ: https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1
- IBM Granite-4.0-H-Small: https://api.runpod.ai/v2/granite-4-0-h-small/openai/v1
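For instance, here is a minimal sketch using the official OpenAI Python client against the Qwen3 Public Endpoint. The model identifier passed to the API is an assumption; if in doubt, list the models the endpoint actually serves with `client.models.list()`:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",  # your Runpod API key, not an OpenAI key
)

# "qwen3-32b-awq" is an assumed model identifier for this endpoint.
response = client.chat.completions.create(
    model="qwen3-32b-awq",
    messages=[{"role": "user", "content": "Give me a one-line summary of Runpod."}],
)
print(response.choices[0].message.content)
```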
vLLM workers
Serverless vLLM workers are optimized for running large language models and return OpenAI-compatible responses, making them ideal for tools that expect OpenAI’s API format. When you deploy a vLLM worker, you can access it using the OpenAI-compatible API at this base URL: https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1, where ENDPOINT_ID is your Serverless endpoint ID.
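As a sketch, assuming an already-deployed vLLM worker and that the model name matches the Hugging Face model you configured (an assumption; adjust to your worker's settings), streaming a chat completion with the OpenAI client might look like this:

```python
from openai import OpenAI

ENDPOINT_ID = "YOUR_ENDPOINT_ID"  # your Serverless endpoint ID

client = OpenAI(
    base_url=f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1",
    api_key="YOUR_RUNPOD_API_KEY",
)

# The model name below is a placeholder for the Hugging Face model your
# vLLM worker was deployed with.
stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```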