OpenLLM

Operating LLMs in production

This is an exact mirror of the OpenLLM project, hosted at https://github.com/bentoml/OpenLLM. SourceForge is not affiliated with OpenLLM.

Add a Review

Downloads: 3 This Week

Last Update: 2025-04-21

Download

Get an email when there's a new version of OpenLLM

An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease. With OpenLLM, you can run inference with any open-source large-language models, deploy to the cloud or on-premises, and build powerful AI apps. Built-in supports a wide range of open-source LLMs and model runtime, including Llama 2， StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via WebUI, CLI, our Python/Javascript client, or any HTTP client.

Features

Fine-tune, serve, deploy, and monitor any LLMs with ease
State-of-the-art LLMs
Flexible APIs
Freedom To Build
Streamline Deployment
Bring your own LLM

Project Samples

Project Activity

See All Activity >

License

Apache License V2.0

Follow OpenLLM

OpenLLM Web Site

Other Useful Business Software

Managed Cloud Hosting Platform | Nexcess

For growing digital businesses and engineering teams that need reliable, fully managed cloud infrastructure to run high-performance applications.

The managed cloud solution engineered for simplicity, with built-in governance and risk-mitigation, plus a bill you can actually forecast.

Learn More

Rate This Project

User Reviews

Be the first to post a review of OpenLLM!

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM), Python LLM Inference Tool

Registered

2023-08-21

Similar Business Software

Gemini Enterprise Agent Platform

Gemini Enterprise Agent Platform is a comprehensive solution from Google Cloud designed to help organizations build, scale, govern, and optimize AI agents. It represents the evolution of Vertex AI, combining advanced model development with new capabilities for agent orchestration and...

See Software
Google AI Studio

Google AI Studio is a unified development platform that helps teams explore, build, and deploy applications using Google’s most advanced AI models, including Gemini 3.5. It brings text, image, audio, and video models together in one interactive playground. With vibe coding, developers can use...

See Software
OpenVINO

The Intel® Distribution of OpenVINO™ toolkit is an open-source AI development toolkit that accelerates inference across Intel hardware platforms. Designed to streamline AI workflows, it allows developers to deploy optimized deep learning models for computer vision, generative AI, and large...

See Software
LM-Kit.NET

LM-Kit.NET is a complete local AI runtime for .NET that lets engineering teams ship AI-powered features without cloud dependencies, per-token costs, or data leaving the network. Most .NET AI integrations stop at inference. LM-Kit.NET covers the full range of capabilities production...

See Software
Falcon-40B

Falcon-40B is a 40B parameters causal decoder-only model built by TII and trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license. Why use Falcon-40B? It is the best open-source model currently available. Falcon-40B...

See Software
NVIDIA TensorRT

NVIDIA TensorRT is an ecosystem of APIs for high-performance deep learning inference, encompassing an inference runtime and model optimizations that deliver low latency and high throughput for production applications. Built on the CUDA parallel programming model, TensorRT optimizes neural...

See Software