Overview

reminix-llamaindex lets you deploy LlamaIndex agents, query engines, and chat engines to Reminix.

Installation

pip install reminix-llamaindex
This will also install reminix-runtime as a dependency.

Quick Start

Wrap your LlamaIndex agent and serve it:
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI
from reminix_llamaindex import serve_agent

llm = OpenAI(model="gpt-4o")
engine = SimpleChatEngine.from_defaults(llm=llm)

if __name__ == "__main__":
    serve_agent(engine, name="chat-agent", port=8080)

Wrapping Different Components

OpenAI Agent

from llama_index.agent.openai import OpenAIAgent
from reminix_llamaindex import wrap_agent

# `tools` is a list of LlamaIndex tools you have already defined
agent = OpenAIAgent.from_tools(tools)
reminix_agent = wrap_agent(agent, name="openai-agent")

ReAct Agent

from llama_index.core.agent import ReActAgent
from reminix_llamaindex import wrap_agent

# Reuses `tools` and the `llm` configured in the earlier examples
agent = ReActAgent.from_tools(tools, llm=llm)
reminix_agent = wrap_agent(agent, name="react-agent")

Query Engine

from llama_index.core import VectorStoreIndex
from reminix_llamaindex import wrap_agent

# Build index
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Wrap for Reminix
reminix_agent = wrap_agent(
    query_engine,
    name="docs-search",
    description="Search documentation"
)

Chat Engine

from llama_index.core import VectorStoreIndex
from reminix_llamaindex import wrap_agent

index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()

reminix_agent = wrap_agent(
    chat_engine,
    name="docs-chat",
    description="Chat about documentation"
)

Configuration

reminix_agent = wrap_agent(
    agent,
    name="my-agent",
    description="What this agent does",

    # Streaming
    streaming=True,

    # Chat mode for conversational agents
    chat_mode=True
)

Multiple Agents

For projects that serve more than one agent, use wrap_agent together with serve from reminix_runtime instead of serve_agent:
from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

rag_agent = wrap_agent(query_engine, name="rag-agent")
chat_agent = wrap_agent(chat_engine, name="chat-agent")

serve(agents=[rag_agent, chat_agent], port=8080)
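Each wrapped agent is then addressable by its registered name. A minimal sketch of calling both agents, assuming the Reminix client shown in the Usage section below:
from reminix import Reminix

client = Reminix()

# Requests are routed by the name each agent was registered with
rag_response = client.agents.invoke("rag-agent", query="What is the return policy?")
chat_response = client.agents.invoke("chat-agent", query="Summarize the return policy")

print(rag_response.output)
print(chat_response.output)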
See Multiple Agents for detailed guidance on multi-agent projects.

With RAG

Deploy a complete RAG pipeline that gives an OpenAI agent a query engine tool over your documents:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.agent.openai import OpenAIAgent
from llama_index.core.tools import QueryEngineTool

from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

# Build index
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create query engine tool
query_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="docs_search",
    description="Search the documentation"
)

# Create agent with RAG
agent = OpenAIAgent.from_tools([query_tool])
reminix_agent = wrap_agent(agent, name="rag-agent")

if __name__ == "__main__":
    serve(agents=[reminix_agent], port=8080)

Persistent Index

For production, load your index from a persistent vector store instead of rebuilding it from documents on every start:
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.core import VectorStoreIndex
import os

# Connect to Pinecone
vector_store = PineconeVectorStore(
    api_key=os.environ["PINECONE_API_KEY"],
    index_name="my-index"
)

# Load existing index
index = VectorStoreIndex.from_vector_store(vector_store)
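
The loaded index can then be wrapped and served like any other query engine. A minimal sketch, reusing wrap_agent and serve from the examples above:
from reminix_llamaindex import wrap_agent
from reminix_runtime import serve

# Expose the persistent index as a query engine
query_engine = index.as_query_engine()

reminix_agent = wrap_agent(
    query_engine,
    name="docs-search",
    description="Search documentation backed by the persistent index"
)

if __name__ == "__main__":
    serve(agents=[reminix_agent], port=8080)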

Usage

Once deployed, call your agent with the client's invoke method. See Agents for detailed guidance.

With Query

For single-turn, query-based calls:
from reminix import Reminix

client = Reminix()

response = client.agents.invoke(
    "rag-agent",
    query="What is the return policy?"
)

print(response.output)

With Messages

For conversations with message history:
response = client.agents.invoke(
    "rag-agent",
    messages=[
        {"role": "user", "content": "What is the return policy?"},
        {"role": "assistant", "content": "You can return items within 30 days..."},
        {"role": "user", "content": "What about electronics?"}
    ]
)

print(response.output)

Streaming

For real-time streaming responses, pass stream=True:
for chunk in client.agents.invoke(
    "rag-agent",
    query="Tell me about the product",
    stream=True
):
    print(chunk, end="", flush=True)

Deployment

See the Deployment guide for deploying to Reminix, or Self-Hosting for deploying on your own infrastructure.

Next Steps