Creating an AI Chat Assistant with Streamlit and OpenRouter
Building AI-powered web applications has never been easier. By combining Streamlit (a Python framework for rapid web app development) with OpenRouter (a unified API gateway for top language models), you can create a flexible chat application that switches between different AI models like GPT-4, Claude 3, and Llama 3, without changing a single line of code.
What is OpenRouter?
OpenRouter acts as a unified API endpoint for multiple large language models (LLMs). Instead of managing separate API keys, endpoints, and request formats for OpenAI, Anthropic, Meta, and others, OpenRouter provides a single interface to access them all.
Benefits of Using OpenRouter
- Model Agnostic: Switch between GPT-4, Claude, Llama, and more with one parameter change
- Cost Optimization: Choose models based on price/performance needs
- Simplified Integration: One API endpoint, one authentication method
- Fallback Options: Automatically switch to alternative models if one is down
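The fallback idea can also be approximated on the client side. The following is a minimal sketch, not OpenRouter's own routing: `first_available` is a hypothetical helper, and `send` stands in for whatever function actually calls the API with a given model ID and either returns the reply text or raises.

```python
from typing import Callable, Iterable, Tuple


def first_available(models: Iterable[str], send: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model ID in order; return (model, reply) for the first success."""
    errors = []
    for model in models:
        try:
            return model, send(model)
        except Exception as exc:  # real code should catch narrower exceptions
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


# Stand-in for a real API call: the first model is treated as unavailable.
def fake_send(model: str) -> str:
    if model == "openai/gpt-3.5-turbo":
        raise ConnectionError("unavailable")
    return f"reply from {model}"


used_model, reply = first_available(
    ["openai/gpt-3.5-turbo", "anthropic/claude-3-haiku"], fake_send
)
print(used_model, "->", reply)
```

Keeping the ordered list of model IDs in one place means the fallback policy can change without touching the calling code.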
Prerequisites
Before we begin, ensure you have the following:
- Python installed (3.8+)
- Streamlit: pip install streamlit
- Requests: pip install requests
- OpenRouter API Key: Get it at openrouter.ai
Building the Chat Application
Let's build a complete chat interface that sends messages to any OpenRouter-supported model.
Complete Code
import streamlit as st
import requests
import json

# App title
st.title("OpenRouter Streamlit Chatbot")

# Sidebar for API Key and Model Selection
with st.sidebar:
    st.header("Settings")
    api_key = st.text_input("OpenRouter API Key", type="password")
    model = st.selectbox(
        "Select Model",
        [
            "openai/gpt-3.5-turbo",
            "anthropic/claude-3-haiku",
            "meta-llama/llama-3-8b-instruct"
        ]
    )
    st.markdown("[Get API Key](https://openrouter.ai)")

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat messages from history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Chat input
if prompt := st.chat_input("Ask me anything..."):
    if not api_key:
        st.warning("Please enter your OpenRouter API Key in the sidebar.")
        st.stop()

    # Add user message to history
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # API Request configuration
    response_url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "HTTP-Referer": "http://localhost:8501",  # Optional: For ranking on OpenRouter
        "X-Title": "OpenRouter Streamlit App",
    }
    data = {
        "model": model,
        "messages": st.session_state.messages
    }

    # Call OpenRouter API
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = requests.post(
                response_url,
                headers=headers,
                data=json.dumps(data)
            )
        if response.status_code == 200:
            content = response.json()['choices'][0]['message']['content']
            st.markdown(content)
            st.session_state.messages.append({
                "role": "assistant",
                "content": content
            })
        else:
            st.error(f"Error: {response.status_code} - {response.text}")
How It Works
1. Session State Management
Streamlit's st.session_state preserves chat history across app re-runs:
if "messages" not in st.session_state:
    st.session_state.messages = []
2. Dynamic Model Selection
The sidebar dropdown lets users switch models on-the-fly:
model = st.selectbox("Select Model", [
    "openai/gpt-3.5-turbo",
    "anthropic/claude-3-haiku",
    "meta-llama/llama-3-8b-instruct"
])
3. OpenRouter API Call
The core integration sends the entire message history to maintain context:
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
    "HTTP-Referer": "http://localhost:8501",
    "X-Title": "OpenRouter Streamlit App",
}
data = {
    "model": model,
    "messages": st.session_state.messages
}
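To make "sends the entire message history" concrete, here is a small sketch of what the request body might look like on the second user turn. The helper name `build_payload` and the sample conversation are illustrative assumptions, not part of the app above.

```python
import json
from typing import Dict, List


def build_payload(model: str, history: List[Dict[str, str]]) -> dict:
    """Return the request body: the model ID plus every message so far."""
    # Copy the list so later appends to the history do not alter this payload.
    return {"model": model, "messages": list(history)}


history: List[Dict[str, str]] = []
history.append({"role": "user", "content": "What is Streamlit?"})
history.append({"role": "assistant", "content": "A Python web app framework."})
history.append({"role": "user", "content": "Does it need JavaScript?"})

payload = build_payload("openai/gpt-3.5-turbo", history)
print(json.dumps(payload, indent=2))
```

Because the model itself keeps no memory between requests, the earlier question and answer have to travel with the follow-up for "it" to be understood as Streamlit.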
Running the Application
- Save the code as app.py
- Run the app: streamlit run app.py
- Open your browser at http://localhost:8501
- Enter your OpenRouter API key in the sidebar
- Start chatting!
Key Considerations
Streaming Responses
For a better user experience, enable real-time token streaming so the AI appears to "type" its response:
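response = requests.post(
    response_url,
    headers=headers,
    json={**data, "stream": True},  # the request body must also ask for a streamed reply
    stream=True,  # tells requests not to buffer the whole response
)
for chunk in response.iter_lines():
    if chunk:
        # Parse and stream each token
        pass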
This requires additional parsing of Server-Sent Events (SSE) format returned by OpenRouter.
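As a rough illustration of that parsing step, the sketch below assumes OpenAI-style streaming events: each event is a line starting with `data:`, the text fragment sits under `choices[0].delta.content`, and the stream ends with `data: [DONE]`. The function name `iter_sse_content` and the sample lines are made up for this example.

```python
import json
from typing import Iterable, Iterator, Union


def iter_sse_content(lines: Iterable[Union[bytes, str]]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        # Skip blank separator lines and SSE comments (lines starting with ":").
        if not line or line.startswith(":"):
            continue
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue  # this sketch simply ignores malformed events
        choices = event.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


# Hand-written sample lines standing in for response.iter_lines().
sample = [
    b": keep-alive comment",
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b"data: [DONE]",
]
print("".join(iter_sse_content(sample)))
```

In the app, the generator could be fed `response.iter_lines()` instead of the sample list. Recent Streamlit releases provide `st.write_stream`, which accepts a generator like this and renders the fragments as they arrive.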
Model Selection Strategy
OpenRouter supports hundreds of models. Here's a quick guide:
| Use Case | Recommended Model |
|---|---|
| General purpose | openai/gpt-4 |
| Cost-effective | meta-llama/llama-3-8b-instruct |
| Creative writing | anthropic/claude-3-sonnet |
| Code generation | google/gemini-pro |
| Fast responses | anthropic/claude-3-haiku |
Browse the full list at OpenRouter Models.
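One way to use the table in code is a simple lookup with a default. This is only a sketch: the short use-case keys and the `pick_model` helper are invented here, and the model IDs are copied from the table rather than checked against the current OpenRouter catalog.

```python
# Use-case keys are hypothetical shorthand for the table rows above.
MODEL_BY_USE_CASE = {
    "general": "openai/gpt-4",
    "cost": "meta-llama/llama-3-8b-instruct",
    "creative": "anthropic/claude-3-sonnet",
    "code": "google/gemini-pro",
    "fast": "anthropic/claude-3-haiku",
}


def pick_model(use_case: str, default: str = "openai/gpt-3.5-turbo") -> str:
    """Return the model ID for a use case, or the default if it is unknown."""
    return MODEL_BY_USE_CASE.get(use_case.strip().lower(), default)


print(pick_model("code"))
print(pick_model("translation"))  # unknown use case falls back to the default
```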
Security Best Practices
Never hardcode API keys in your application. Use one of these approaches:
Option 1: Streamlit Secrets (Local Development)
Create .streamlit/secrets.toml:
[openrouter]
api_key = "your-api-key-here"
Access it in code:
api_key = st.secrets["openrouter"]["api_key"]
Option 2: Environment Variables (Production)
import os
api_key = os.environ.get("OPENROUTER_API_KEY")
Set it before running:
export OPENROUTER_API_KEY="your-key"
streamlit run app.py
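The two options can be combined so the same code runs locally and in production. The sketch below is one possible arrangement, with a hypothetical `resolve_api_key` helper: it prefers the environment variable and falls back to a secrets-style mapping shaped like the secrets.toml example above.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secrets: Optional[Mapping] = None,
    env_var: str = "OPENROUTER_API_KEY",
) -> Optional[str]:
    """Return the key from the environment, else from a secrets mapping, else None."""
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    if secrets:
        # Expects the [openrouter] section with an api_key entry.
        section = secrets.get("openrouter") or {}
        key = str(section.get("api_key", "")).strip()
    return key or None


# A plain dict stands in for st.secrets in this example.
found = resolve_api_key({"openrouter": {"api_key": "your-api-key-here"}})
print("key found" if found else "no key configured")
```

In the app, `st.secrets` could be passed as the mapping, though Streamlit may raise an error when no secrets file exists, so that access is worth guarding. Returning `None` instead of raising lets the app fall back to the sidebar text input.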
OAuth Option
OpenRouter supports OAuth, allowing users to connect their own OpenRouter accounts instead of entering API keys manually. This is ideal for public-facing applications where you don't want to manage keys.
Performance Optimization
- Use st.spinner: Always wrap API calls with loading indicators
- Timeout Handling: Add request timeouts to prevent hanging
  response = requests.post(url, headers=headers, json=data, timeout=30)
- Error Retries: Implement exponential backoff for failed requests
- Caching: Use @st.cache_data for repeated similar queries
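To show what caching buys, here is a framework-free sketch that memoizes replies by model and message list. A plain dictionary plays the role that `@st.cache_data` would play in the app; `make_cached_caller` and `fake_call` are hypothetical names, and only exact repeats are served from the cache.

```python
import json
from typing import Callable, Dict, List

Caller = Callable[[str, List[dict]], str]


def make_cached_caller(call: Caller) -> Caller:
    """Wrap an API-calling function so identical requests are answered from memory."""
    cache: Dict[str, str] = {}

    def cached(model: str, messages: List[dict]) -> str:
        # Serialize the request into a stable string to use as the cache key.
        key = json.dumps([model, messages], sort_keys=True)
        if key not in cache:
            cache[key] = call(model, messages)
        return cache[key]

    return cached


# Stand-in for the real request; counts how often it is actually invoked.
calls = {"count": 0}


def fake_call(model: str, messages: List[dict]) -> str:
    calls["count"] += 1
    return f"{model} answered message {len(messages)}"


ask = make_cached_caller(fake_call)
question = [{"role": "user", "content": "Hi"}]
ask("openai/gpt-3.5-turbo", question)
ask("openai/gpt-3.5-turbo", question)  # served from the cache
print(calls["count"])
```

Since model output usually varies between calls, caching also freezes one particular answer, which may or may not be what a chat app wants.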
Enhanced Version with Error Handling
Here's an improved API call with a timeout, retries with exponential backoff, and basic error handling:
import time

import requests
import streamlit as st


def call_openrouter(messages, model, api_key, max_retries=3):
    url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": messages,
        "temperature": 0.7,
        "max_tokens": 1000
    }
    for attempt in range(max_retries):
        try:
            response = requests.post(
                url,
                headers=headers,
                json=payload,
                timeout=30
            )
            if response.status_code == 200:
                return response.json()['choices'][0]['message']['content']
            elif response.status_code == 429:
                st.warning("Rate limited. Please wait...")
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                st.error(f"Error: {response.status_code}")
                return None
        except requests.exceptions.Timeout:
            # Fall through to the next loop iteration to retry
            st.error("Request timed out. Retrying...")
        except requests.exceptions.RequestException as e:
            st.error(f"Request failed: {e}")
            return None
    st.error("Max retries exceeded")
    return None
Deployment Options
Once your chat app is ready, deploy it to share with the world:
- Streamlit Community Cloud: Free hosting at share.streamlit.io
- Hugging Face Spaces: Free tier with custom models
- Heroku: Scalable PaaS option
- AWS/GCP/Azure: Full control with cloud providers
Next Steps
- Add File Uploads: Enable PDF/text analysis with st.file_uploader
- Memory Management: Implement conversation summarization for long chats
- Multi-Modal: Integrate image analysis with vision models
- Custom Prompts: Add system prompts for specialized behaviors
- Analytics: Track usage patterns and model performance
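Two of these steps, custom prompts and memory management, can be prototyped in a few lines. The sketch below prepends a system prompt and keeps only the most recent messages. Note that this is plain truncation, a simpler stand-in for the summarization mentioned above, and `prepare_messages` is a hypothetical helper.

```python
from typing import Dict, List


def prepare_messages(
    history: List[Dict[str, str]],
    system_prompt: str,
    max_messages: int = 10,
) -> List[Dict[str, str]]:
    """Prepend a system prompt and keep only the last max_messages entries."""
    recent = history[-max_messages:] if max_messages > 0 else []
    return [{"role": "system", "content": system_prompt}] + recent


# A made-up six-message conversation, alternating user and assistant.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(6)
]
trimmed = prepare_messages(history, "You are a concise assistant.", max_messages=4)
for message in trimmed:
    print(message["role"], ":", message["content"])
```

The result would be sent as the `messages` field of the request, while the full history stays in `st.session_state` for display.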
Conclusion
Combining Streamlit's rapid prototyping with OpenRouter's model flexibility gives you a powerful toolkit for building AI applications. You can experiment with different models, optimize for cost or performance, and deploy a polished web app, all in under 100 lines of Python.
The beauty of this approach? Switching from GPT-4 to Claude 3 to Llama 3 is as simple as changing a dropdown selection. No code rewrites, no multiple SDK integrations, just pure experimentation.
Have you built AI apps with Streamlit and OpenRouter? Share your projects and tips in the comments below!