Overview

reminix-llamaindex lets you deploy LlamaIndex agents, query engines, and chat engines to Reminix.

Installation

pip install reminix-llamaindex
This will also install reminix-runtime as a dependency.

Quick Start

Wrap your LlamaIndex agent and serve it:
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI
from reminix_llamaindex import serve_agent

llm = OpenAI(model="gpt-4o")
engine = SimpleChatEngine.from_defaults(llm=llm)

if __name__ == "__main__":
    serve_agent(engine, name="chat-agent", port=8080)

Wrapping Different Components

OpenAI Agent

from llama_index.agent.openai import OpenAIAgent
from reminix_llamaindex import wrap_agent

# `tools` is a list of LlamaIndex tools you have already defined
agent = OpenAIAgent.from_tools(tools)
reminix_agent = wrap_agent(agent, name="openai-agent")

ReAct Agent

from llama_index.core.agent import ReActAgent
from reminix_llamaindex import wrap_agent

# Reuses `tools` and the `llm` configured in the earlier examples
agent = ReActAgent.from_tools(tools, llm=llm)
reminix_agent = wrap_agent(agent, name="react-agent")

Query Engine

from llama_index.core import VectorStoreIndex
from reminix_llamaindex import wrap_agent

# Build index
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Wrap for Reminix
reminix_agent = wrap_agent(
    query_engine,
    name="docs-search",
    description="Search documentation"
)

Chat Engine

from llama_index.core import VectorStoreIndex
from reminix_llamaindex import wrap_agent

index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()

reminix_agent = wrap_agent(
    chat_engine,
    name="docs-chat",
    description="Chat about documentation"
)

Configuration

reminix_agent = wrap_agent(
    agent,
    name="my-agent",
    description="What this agent does",

    # Streaming
    streaming=True,

    # Chat mode for conversational agents
    chat_mode=True
)

Multiple Agents

For projects that serve more than one agent, use wrap_agent together with serve from reminix_runtime instead of serve_agent:
from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

rag_agent = wrap_agent(query_engine, name="rag-agent")
chat_agent = wrap_agent(chat_engine, name="chat-agent")

serve(agents=[rag_agent, chat_agent], port=8080)
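Each wrapped agent is then addressable by its registered name. A minimal sketch of calling both agents, assuming the Reminix client shown in the Usage section below:
from reminix import Reminix

client = Reminix()

# Requests are routed by the name each agent was registered with
rag_response = client.agents.invoke("rag-agent", query="What is the return policy?")
chat_response = client.agents.invoke("chat-agent", query="Summarize the return policy")

print(rag_response.output)
print(chat_response.output)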
See Multiple Agents for detailed guidance on multi-agent projects.

With RAG

Deploy a complete RAG pipeline that gives an OpenAI agent a query engine tool over your documents:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import QueryEngineTool

from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

# Build index
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create query engine tool
query_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="docs_search",
    description="Search the documentation"
)

# Create agent with RAG
agent = OpenAIAgent.from_tools([query_tool])
reminix_agent = wrap_agent(agent, name="rag-agent")

if __name__ == "__main__":
    serve(agents=[reminix_agent], port=8080)

Persistent Index

For production, load your index from a persistent vector store instead of rebuilding it from documents on every start:
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.core import VectorStoreIndex
import os

# Connect to Pinecone
vector_store = PineconeVectorStore(
    api_key=os.environ["PINECONE_API_KEY"],
    index_name="my-index"
)

# Load existing index
index = VectorStoreIndex.from_vector_store(vector_store)
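
The loaded index can then be wrapped and served like any other query engine. A minimal sketch, reusing wrap_agent and serve from the examples above:
from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

# Expose the persistent index as a query engine
query_engine = index.as_query_engine()

reminix_agent = wrap_agent(
    query_engine,
    name="docs-search",
    description="Search documentation backed by the persistent index"
)

if __name__ == "__main__":
    serve(agents=[reminix_agent], port=8080)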

Usage

Once deployed, call your agent with the client's invoke method. See Agents for detailed guidance.

With Query

For single-turn, query-based calls:
from reminix import Reminix

client = Reminix()

response = client.agents.invoke(
    "rag-agent",
    query="What is the return policy?"
)

print(response.output)

With Messages

For conversations with message history:
response = client.agents.invoke(
    "rag-agent",
    messages=[
        {"role": "user", "content": "What is the return policy?"},
        {"role": "assistant", "content": "You can return items within 30 days..."},
        {"role": "user", "content": "What about electronics?"}
    ]
)

print(response.output)

Streaming

For real-time streaming responses, pass stream=True:
for chunk in client.agents.invoke(
    "rag-agent",
    query="Tell me about the product",
    stream=True
):
    print(chunk, end="", flush=True)

Deployment

See the Deployment guide for deploying to Reminix, or Self-Hosting for deploying on your own infrastructure.

Next Steps