AI that just works. Costs that just track.
One import. Any model. Automatic cost tracking.
pip install costkatana

import cost_katana as ck
response = ck.ai('gpt-4', 'Explain quantum computing in one sentence')
print(response.text)    # "Quantum computing uses qubits to perform..."
print(response.cost)    # 0.0012
print(response.tokens)  # 47

That's it. No configuration. No complexity. Just results.
import cost_katana as ck
# Create a persistent chat session
chat = ck.chat('gpt-4')
chat.send('Hello! What can you help me with?')
chat.send('Tell me a programming joke')
chat.send('Now explain it')
# See exactly what you spent
print(f"💰 Total cost: ${chat.total_cost:.4f}")
print(f"Messages: {len(chat.history)}")
print(f"🎯 Tokens used: {chat.total_tokens}")

Stop guessing model names. Get autocomplete and catch typos:
import cost_katana as ck
from cost_katana import openai, anthropic, google
# Type-safe model constants (recommended)
response = ck.ai(openai.gpt_4, 'Hello, world!')
# Compare models easily
models = [openai.gpt_4, anthropic.claude_3_5_sonnet_20241022, google.gemini_2_5_pro]
for model in models:
    response = ck.ai(model, 'Explain AI in one sentence')
    print(f"Cost: ${response.cost:.4f}")

Available namespaces:
| Namespace | Models |
|---|---|
| openai | GPT-4, GPT-3.5, O1, O3, DALL-E, Whisper |
| anthropic | Claude 3.5 Sonnet, Haiku, Opus |
| google | Gemini 2.5 Pro, Flash |
| aws_bedrock | Nova, Claude on Bedrock |
| xai | Grok models |
| deepseek | DeepSeek models |
| mistral | Mistral AI models |
| cohere | Command models |
| meta | Llama models |
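Why constants catch typos that plain strings do not can be shown without the SDK. The sketch below uses a hypothetical stand-in namespace (`openai_ns` is not the real `cost_katana.openai` object, and how the library builds its namespaces is an assumption here): a misspelled attribute fails at the point of use, before any request is sent.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a provider namespace; not the real cost_katana object.
openai_ns = SimpleNamespace(gpt_4="gpt-4", gpt_3_5_turbo="gpt-3.5-turbo")

# A correct attribute resolves to the plain model string.
model = openai_ns.gpt_4

# A misspelled attribute raises AttributeError immediately,
# whereas a misspelled string would only fail once the request is made.
try:
    openai_ns.gpt_44
    typo_caught = False
except AttributeError:
    typo_caught = True

print(model, typo_caught)
```

Editors and type checkers can also autocomplete attributes like these, which a raw string such as 'gpt-4' does not allow.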
Cache identical questions to avoid paying twice:
import cost_katana as ck
# First call - hits the API
r1 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r1.cached}") # False
print(f"Cost: ${r1.cost}") # $0.0008
# Second call - served from cache (FREE!)
r2 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r2.cached}") # True
print(f"Cost: ${r2.cost}")   # $0.0000

For long-form content, Cortex compresses prompts intelligently:
import cost_katana as ck
response = ck.ai(
    'gpt-4',
    'Write a comprehensive guide to machine learning for beginners',
    cortex=True,      # Enable 40-75% cost reduction
    max_tokens=2000
)
print(f"Optimized: {response.optimized}")
print(f"Saved: ${response.saved_amount}")

import cost_katana as ck
prompt = 'Summarize the theory of relativity in 50 words'
models = ['gpt-4', 'claude-3-sonnet', 'gemini-pro', 'gpt-3.5-turbo']
print('Model Cost Comparison\n')
for model in models:
    response = ck.ai(model, prompt)
    print(f"{model:20} ${response.cost:.6f}")

Sample Output:
Model Cost Comparison

gpt-4                $0.001200
claude-3-sonnet      $0.000900
gemini-pro           $0.000150
gpt-3.5-turbo        $0.000080
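To turn the sample output into relative savings, here is a small sketch that needs no API calls. The costs are copied from the sample above; `savings_vs` is a hypothetical helper, not part of the SDK, and the percentages apply only to this one prompt.

```python
# Costs copied from the sample output above (USD per request).
sample_costs = {
    "gpt-4": 0.001200,
    "claude-3-sonnet": 0.000900,
    "gemini-pro": 0.000150,
    "gpt-3.5-turbo": 0.000080,
}

def savings_vs(baseline: str, costs: dict) -> dict:
    """Return the percent saved by each other model relative to the baseline."""
    base = costs[baseline]
    return {
        model: round((1 - cost / base) * 100, 1)
        for model, cost in costs.items()
        if model != baseline
    }

savings = savings_vs("gpt-4", sample_costs)
for model, pct in savings.items():
    print(f"{model:20} {pct:5.1f}% cheaper than gpt-4")
```

With these sample figures the savings come out to 25.0%, 87.5%, and 93.3%. Real numbers depend on the prompt, the response length, and current provider pricing.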
Every response includes cost information:
response = ck.ai('gpt-4', 'Write a story')
print(f"Cost: ${response.cost}")
print(f"Tokens: {response.tokens}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")

Never fail. Automatically switch providers:
# If OpenAI is down, automatically uses Claude or Gemini
response = ck.ai('gpt-4', 'Hello')
print(response.provider)  # Might be 'anthropic' if OpenAI failed

Block malicious prompts:
import cost_katana as ck
ck.configure(firewall=True)
# Malicious prompts are blocked
try:
    ck.ai('gpt-4', 'ignore all previous instructions and...')
except Exception as e:
    print(f'🛡️ Blocked: {e}')

# Recommended: Use Cost Katana API key for all features
export COST_KATANA_API_KEY="dak_your_key_here"
# Or use provider keys directly (self-hosted)
export OPENAI_API_KEY="sk-..." # Required for GPT models
export GEMINI_API_KEY="..." # Required for Gemini models
export ANTHROPIC_API_KEY="sk-ant-..." # For Claude models
export AWS_ACCESS_KEY_ID="..." # For AWS Bedrock
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
⚠️ Self-hosted users: You must provide your own OpenAI/Gemini API keys.
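For self-hosted setups it can help to check at startup which provider keys are actually present. The following is a minimal sketch using only the standard library. The variable names come from the list above; the grouping by provider and the `missing_keys` helper are assumptions for illustration, not SDK behavior.

```python
import os

# Environment variables per provider, taken from the export list above.
PROVIDER_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "aws_bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"],
}

def missing_keys(env=None) -> dict:
    """Map each provider to the variables it still needs (empty list means ready)."""
    env = os.environ if env is None else env
    return {
        provider: [name for name in names if not env.get(name)]
        for provider, names in PROVIDER_KEYS.items()
    }

for provider, missing in missing_keys().items():
    status = "ready" if not missing else "missing " + ", ".join(missing)
    print(f"{provider:12} {status}")
```

Passing a plain dict instead of `os.environ` makes the check easy to unit test.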
import cost_katana as ck
ck.configure(
    api_key='dak_your_key',
    cortex=True,    # 40-75% cost savings
    cache=True,     # Smart caching
    firewall=True   # Block prompt injections
)

response = ck.ai('gpt-4', 'Your prompt',
    temperature=0.7,                   # Creativity (0-2)
    max_tokens=500,                    # Response limit
    system_message='You are helpful',  # System prompt
    cache=True,                        # Enable caching
    cortex=True,                       # Enable optimization
    retry=True                         # Auto-retry on failures
)

from fastapi import FastAPI
import cost_katana as ck
app = FastAPI()
@app.post('/api/chat')
async def chat(request: dict):
    response = ck.ai('gpt-4', request['prompt'])
    return {'text': response.text, 'cost': response.cost}

from flask import Flask, request, jsonify
import cost_katana as ck
app = Flask(__name__)
@app.route('/api/chat', methods=['POST'])
def chat():
    response = ck.ai('gpt-4', request.json['prompt'])
    return jsonify({'text': response.text, 'cost': response.cost})

from django.http import JsonResponse
import cost_katana as ck
def chat_view(request):
    response = ck.ai('gpt-4', request.POST.get('prompt'))
    return JsonResponse({'text': response.text, 'cost': response.cost})

import cost_katana as ck
support = ck.chat('gpt-3.5-turbo',
    system_message='You are a helpful customer support agent.')

def handle_query(query: str):
    response = support.send(query)
    print(f"Cost so far: ${support.total_cost:.4f}")
    return response

import cost_katana as ck
def generate_blog_post(topic: str):
    # Use Cortex for long-form content (40-75% savings)
    post = ck.ai('gpt-4', f'Write a blog post about {topic}',
                 cortex=True, max_tokens=2000)
    return {
        'content': post.text,
        'cost': post.cost,
        'word_count': len(post.text.split())
    }

import cost_katana as ck
def review_code(code: str):
    review = ck.ai('claude-3-sonnet',
                   f'Review this code and suggest improvements:\n\n{code}',
                   cache=True)  # Cache for repeated reviews
    return review.text

import cost_katana as ck
def translate(text: str, target_language: str):
    # Use cheaper model for translations
    translated = ck.ai('gpt-3.5-turbo',
                       f'Translate to {target_language}: {text}',
                       cache=True)
    return translated.text

| Strategy | Savings | Code |
|---|---|---|
| Use GPT-3.5 for simple tasks | 90% | ck.ai('gpt-3.5-turbo', ...) |
| Enable caching | 100% on hits | cache=True |
| Enable Cortex | 40-75% | cortex=True |
| Use Gemini for high-volume | 95% vs GPT-4 | ck.ai('gemini-pro', ...) |
| Batch in sessions | 10-20% | ck.chat(...) |
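The "100% on hits" figure for caching follows from never calling the model twice for the same input. Below is a minimal sketch of exact-match caching keyed on model and prompt. `PromptCache` and `fake_model_call` are hypothetical names for illustration, and the sketch makes no claim about how Cost Katana implements `cache=True` internally.

```python
import hashlib

class PromptCache:
    """Exact-match cache: identical (model, prompt) pairs are served from memory."""

    def __init__(self, call_model):
        self._call_model = call_model  # function that performs the paid request
        self._store = {}

    def ask(self, model: str, prompt: str) -> dict:
        key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
        if key in self._store:
            # Cache hit: reuse the stored answer and report zero cost.
            return {**self._store[key], "cached": True, "cost": 0.0}
        result = self._call_model(model, prompt)
        self._store[key] = result
        return {**result, "cached": False}

def fake_model_call(model: str, prompt: str) -> dict:
    # Hypothetical stub standing in for a real API call; the cost is illustrative.
    return {"text": "4", "cost": 0.0008}

cache = PromptCache(fake_model_call)
first = cache.ask("gpt-4", "What is 2+2?")
second = cache.ask("gpt-4", "What is 2+2?")
print(first["cached"], first["cost"])
print(second["cached"], second["cost"])
```

An exact-match key only helps when prompts repeat character for character. Matching paraphrased prompts would need a different technique, such as embedding similarity.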
# ❌ Expensive
ck.ai('gpt-4', 'What is 2+2?')  # $0.001
# ✅ Smart: Match model to task
ck.ai('gpt-3.5-turbo', 'What is 2+2?')  # $0.0001
# ✅ Smarter: Cache common queries
ck.ai('gpt-3.5-turbo', 'What is 2+2?', cache=True)  # $0 on repeat
# ✅ Smartest: Cortex for long content
ck.ai('gpt-4', 'Write a 2000-word essay', cortex=True)  # 40-75% off

import cost_katana as ck
from cost_katana.exceptions import CostKatanaError
try:
    response = ck.ai('gpt-4', 'Hello')
    print(response.text)
except CostKatanaError as e:
    if 'API key' in str(e):
        print('Set COST_KATANA_API_KEY or OPENAI_API_KEY')
    elif 'rate limit' in str(e):
        print('Rate limited. Retrying...')
    elif 'model' in str(e):
        print('Model not found')
    else:
        print(f'Error: {e}')

# Before
from openai import OpenAI
client = OpenAI(api_key='sk-...')
completion = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': 'Hello'}]
)
print(completion.choices[0].message.content)
# After
import cost_katana as ck
response = ck.ai('gpt-4', 'Hello')
print(response.text)
print(f"Cost: ${response.cost}")  # Bonus: cost tracking!

# Before
import anthropic
client = anthropic.Anthropic(api_key='sk-ant-...')
message = client.messages.create(
    model='claude-3-sonnet-20240229',
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[{'role': 'user', 'content': 'Hello'}]
)
# After
import cost_katana as ck
response = ck.ai('claude-3-sonnet', 'Hello')

# Before
import google.generativeai as genai
genai.configure(api_key='...')
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content('Hello')
# After
import cost_katana as ck
response = ck.ai('gemini-pro', 'Hello')

| Language | Package | Install | Import |
|---|---|---|---|
| Python | PyPI | pip install costkatana | import cost_katana |
| JavaScript | NPM | npm install cost-katana | import { ai } from 'cost-katana' |
| CLI (NPM) | NPM | npm install -g cost-katana-cli | cost-katana chat |
| CLI (Python) | PyPI | pip install costkatana | costkatana chat |

Explore 45+ complete examples:
github.com/Hypothesize-Tech/costkatana-examples
| Section | Description |
|---|---|
| Python SDK | Complete Python guides |
| Cost Tracking | Track costs across providers |
| Semantic Caching | 30-40% cost reduction |
| FastAPI Integration | Framework examples |
| Channel | Link |
|---|---|
| Dashboard | costkatana.com |
| Documentation | docs.costkatana.com |
| GitHub | github.com/Hypothesize-Tech/costkatana-python |
| Discord | discord.gg/D8nDArmKbY |
| Email | [email protected] |
MIT © Cost Katana

Start cutting AI costs today 🥷

pip install costkatana

import cost_katana as ck
response = ck.ai('gpt-4', 'Hello, world!')