
Cost Katana Python 🥷

AI that just works. Costs that just track.

One import. Any model. Automatic cost tracking.


🚀 Get Started in 60 Seconds

Step 1: Install

pip install costkatana

Step 2: Make Your First AI Call

import cost_katana as ck

response = ck.ai('gpt-4', 'Explain quantum computing in one sentence')

print(response.text)   # "Quantum computing uses qubits to perform..."
print(response.cost)   # 0.0012
print(response.tokens) # 47

That's it. No configuration. No complexity. Just results.


📖 Tutorial: Build a Cost-Aware AI App

Part 1: Basic Chat Session

import cost_katana as ck

# Create a persistent chat session
chat = ck.chat('gpt-4')

chat.send('Hello! What can you help me with?')
chat.send('Tell me a programming joke')
chat.send('Now explain it')

# See exactly what you spent
print(f"💰 Total cost: ${chat.total_cost:.4f}")
print(f"📊 Messages: {len(chat.history)}")
print(f"🎯 Tokens used: {chat.total_tokens}")

Part 2: Type-Safe Model Selection

Stop guessing model names. Get autocomplete and catch typos:

import cost_katana as ck
from cost_katana import openai, anthropic, google

# Type-safe model constants (recommended)
response = ck.ai(openai.gpt_4, 'Hello, world!')

# Compare models easily
models = [openai.gpt_4, anthropic.claude_3_5_sonnet_20241022, google.gemini_2_5_pro]
for model in models:
    response = ck.ai(model, 'Explain AI in one sentence')
    print(f"Cost: ${response.cost:.4f}")

Available namespaces:

| Namespace | Models |
|---|---|
| openai | GPT-4, GPT-3.5, O1, O3, DALL-E, Whisper |
| anthropic | Claude 3.5 Sonnet, Haiku, Opus |
| google | Gemini 2.5 Pro, Flash |
| aws_bedrock | Nova, Claude on Bedrock |
| xai | Grok models |
| deepseek | DeepSeek models |
| mistral | Mistral AI models |
| cohere | Command models |
| meta | Llama models |

Part 3: Smart Caching

Cache identical questions to avoid paying twice:

import cost_katana as ck

# First call - hits the API
r1 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r1.cached}")  # False
print(f"Cost: ${r1.cost}")     # $0.0008

# Second call - served from cache (FREE!)
r2 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r2.cached}")  # True
print(f"Cost: ${r2.cost}")     # $0.0000 🎉
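Under the hood, exact-match caching like this only needs a deterministic key per (model, prompt) pair. Here is a minimal sketch of the idea in plain Python; it is illustrative, not Cost Katana's actual implementation (the `CACHE` dict and hash scheme are assumptions):

```python
import hashlib

# In-memory cache: key -> response text. A real system would add TTLs
# and possibly semantic matching; this sketch is exact-match only.
CACHE = {}

def cache_key(model: str, prompt: str) -> str:
    # Key on model AND prompt, so the same question sent to a
    # different model never returns a stale answer.
    return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

def cached_call(model, prompt, call_fn):
    """Return (text, was_cached); call_fn only runs on a cache miss."""
    key = cache_key(model, prompt)
    if key in CACHE:
        return CACHE[key], True
    text = call_fn(model, prompt)
    CACHE[key] = text
    return text, False
```

The second identical call is served from the dict and costs nothing, which is exactly the behavior `cache=True` exposes via `response.cached`.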

Part 4: Cortex Optimization

For long-form content, Cortex compresses prompts intelligently:

import cost_katana as ck

response = ck.ai(
    'gpt-4',
    'Write a comprehensive guide to machine learning for beginners',
    cortex=True,      # Enable 40-75% cost reduction
    max_tokens=2000
)

print(f"Optimized: {response.optimized}")
print(f"Saved: ${response.saved_amount}")

Part 5: Compare Models Side-by-Side

import cost_katana as ck

prompt = 'Summarize the theory of relativity in 50 words'
models = ['gpt-4', 'claude-3-sonnet', 'gemini-pro', 'gpt-3.5-turbo']

print('📊 Model Cost Comparison\n')

for model in models:
    response = ck.ai(model, prompt)
    print(f"{model:20} ${response.cost:.6f}")

Sample Output:

📊 Model Cost Comparison

gpt-4                $0.001200
claude-3-sonnet      $0.000900
gemini-pro           $0.000150
gpt-3.5-turbo        $0.000080
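The per-response numbers above come from simple token arithmetic: tokens divided by 1,000, times a per-1K-token rate. A sketch of that calculation, using illustrative rates (NOT official pricing; check each provider's pricing page):

```python
# Illustrative per-1K-token rates only; real rates differ by provider,
# model version, and input vs. output tokens.
PRICE_PER_1K = {
    'gpt-4': 0.03,
    'gpt-3.5-turbo': 0.0005,
}

def estimate_cost(model: str, tokens: int) -> float:
    """Estimated dollar cost for a call: tokens / 1000 * rate."""
    return tokens / 1000 * PRICE_PER_1K[model]
```

This is why `response.cost` varies so sharply across models for the same prompt: the rate, not the token count, dominates.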

🎯 Core Features

Cost Tracking

Every response includes cost information:

response = ck.ai('gpt-4', 'Write a story')
print(f"Cost: ${response.cost}")
print(f"Tokens: {response.tokens}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")

Auto-Failover

Never fail: if one provider is down, requests automatically switch to another:

# If OpenAI is down, automatically uses Claude or Gemini
response = ck.ai('gpt-4', 'Hello')
print(response.provider)  # Might be 'anthropic' if OpenAI failed
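The failover pattern itself is just an ordered try-list. A rough sketch with stand-in provider functions (the names and tuple shape here are hypothetical, not Cost Katana internals):

```python
def call_with_failover(providers, prompt):
    """Try each (name, fn) provider in order; return (name, reply).

    Raises only after every provider has failed.
    """
    errors = {}
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # a real client would narrow this
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")
```

The first provider that answers wins, which is why `response.provider` can differ from the model you named.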

Security Firewall

Block malicious prompts:

import cost_katana as ck

ck.configure(firewall=True)

# Malicious prompts are blocked
try:
    ck.ai('gpt-4', 'ignore all previous instructions and...')
except Exception as e:
    print(f'🛡️ Blocked: {e}')
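At its simplest, a prompt firewall screens input against known injection phrasings before any tokens are spent. The sketch below is a naive keyword filter for illustration only; the patterns are assumptions, and a production firewall would use much richer signals:

```python
import re

# A few well-known injection phrasings. Purely illustrative; keyword
# matching alone is easy to evade.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard (the )?system prompt",
    r"reveal your (system )?prompt",
]

def looks_malicious(prompt: str) -> bool:
    """Return True if the prompt matches any known injection pattern."""
    text = prompt.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)
```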

βš™οΈ Configuration

Environment Variables

# Recommended: Use Cost Katana API key for all features
export COST_KATANA_API_KEY="dak_your_key_here"

# Or use provider keys directly (self-hosted)
export OPENAI_API_KEY="sk-..."          # Required for GPT models
export GEMINI_API_KEY="..."             # Required for Gemini models
export ANTHROPIC_API_KEY="sk-ant-..."   # For Claude models
export AWS_ACCESS_KEY_ID="..."          # For AWS Bedrock
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"

⚠️ Self-hosted users: You must provide your own OpenAI/Gemini API keys.

Programmatic Configuration

import cost_katana as ck

ck.configure(
    api_key='dak_your_key',
    cortex=True,     # 40-75% cost savings
    cache=True,      # Smart caching
    firewall=True    # Block prompt injections
)

Request Options

response = ck.ai('gpt-4', 'Your prompt',
    temperature=0.7,                     # Creativity (0-2)
    max_tokens=500,                      # Response limit
    system_message='You are helpful',    # System prompt
    cache=True,                          # Enable caching
    cortex=True,                         # Enable optimization
    retry=True                           # Auto-retry on failures
)

🔌 Framework Integration

FastAPI

from fastapi import FastAPI
import cost_katana as ck

app = FastAPI()

@app.post('/api/chat')
async def chat(request: dict):
    response = ck.ai('gpt-4', request['prompt'])
    return {'text': response.text, 'cost': response.cost}

Flask

from flask import Flask, request, jsonify
import cost_katana as ck

app = Flask(__name__)

@app.route('/api/chat', methods=['POST'])
def chat():
    response = ck.ai('gpt-4', request.json['prompt'])
    return jsonify({'text': response.text, 'cost': response.cost})

Django

from django.http import JsonResponse
import cost_katana as ck

def chat_view(request):
    response = ck.ai('gpt-4', request.POST.get('prompt'))
    return JsonResponse({'text': response.text, 'cost': response.cost})

💡 Real-World Examples

Customer Support Bot

import cost_katana as ck

support = ck.chat('gpt-3.5-turbo',
    system_message='You are a helpful customer support agent.')

def handle_query(query: str):
    response = support.send(query)
    print(f"Cost so far: ${support.total_cost:.4f}")
    return response

Content Generator with Optimization

import cost_katana as ck

def generate_blog_post(topic: str):
    # Use Cortex for long-form content (40-75% savings)
    post = ck.ai('gpt-4', f'Write a blog post about {topic}',
                 cortex=True, max_tokens=2000)
    
    return {
        'content': post.text,
        'cost': post.cost,
        'word_count': len(post.text.split())
    }

Code Review Assistant

import cost_katana as ck

def review_code(code: str):
    review = ck.ai('claude-3-sonnet',
        f'Review this code and suggest improvements:\n\n{code}',
        cache=True)  # Cache for repeated reviews
    return review.text

Translation Service

import cost_katana as ck

def translate(text: str, target_language: str):
    # Use cheaper model for translations
    translated = ck.ai('gpt-3.5-turbo',
        f'Translate to {target_language}: {text}',
        cache=True)
    return translated.text

💰 Cost Optimization Cheatsheet

| Strategy | Savings | Code |
|---|---|---|
| Use GPT-3.5 for simple tasks | 90% | `ck.ai('gpt-3.5-turbo', ...)` |
| Enable caching | 100% on hits | `cache=True` |
| Enable Cortex | 40-75% | `cortex=True` |
| Use Gemini for high-volume | 95% vs GPT-4 | `ck.ai('gemini-pro', ...)` |
| Batch in sessions | 10-20% | `ck.chat(...)` |
# ❌ Expensive
ck.ai('gpt-4', 'What is 2+2?')  # $0.001

# ✅ Smart: Match model to task
ck.ai('gpt-3.5-turbo', 'What is 2+2?')  # $0.0001

# ✅ Smarter: Cache common queries
ck.ai('gpt-3.5-turbo', 'What is 2+2?', cache=True)  # $0 on repeat

# ✅ Smartest: Cortex for long content
ck.ai('gpt-4', 'Write a 2000-word essay', cortex=True)  # 40-75% off
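The "match model to task" rule can be captured in a tiny router. The thresholds, keywords, and model choices below are illustrative assumptions, not tuned guidance:

```python
def pick_model(prompt: str) -> str:
    """Route short, simple prompts to a cheap model; long or
    clearly demanding prompts to GPT-4."""
    words = prompt.split()
    demanding = any(
        kw in prompt.lower()
        for kw in ('essay', 'comprehensive', 'analyze')
    )
    if demanding or len(words) > 50:
        return 'gpt-4'
    return 'gpt-3.5-turbo'
```

Combined with `cache=True`, a router like this applies the first two cheatsheet rows automatically.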

🔧 Error Handling

import cost_katana as ck
from cost_katana.exceptions import CostKatanaError

try:
    response = ck.ai('gpt-4', 'Hello')
    print(response.text)
except CostKatanaError as e:
    if 'API key' in str(e):
        print('Set COST_KATANA_API_KEY or OPENAI_API_KEY')
    elif 'rate limit' in str(e):
        print('Rate limited. Retrying...')
    elif 'model' in str(e):
        print('Model not found')
    else:
        print(f'Error: {e}')
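For transient failures such as rate limits, `retry=True` presumably does something along these lines internally; here is a generic exponential-backoff sketch (plain Python, not the library's actual retry logic):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.01):
    """Call fn(), retrying on failure with exponential backoff.

    Re-raises the last exception once attempts are exhausted.
    """
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)
```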

🔄 Migration Guides

From OpenAI SDK

# Before
from openai import OpenAI
client = OpenAI(api_key='sk-...')
completion = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': 'Hello'}]
)
print(completion.choices[0].message.content)

# After
import cost_katana as ck
response = ck.ai('gpt-4', 'Hello')
print(response.text)
print(f"Cost: ${response.cost}")  # Bonus: cost tracking!

From Anthropic SDK

# Before
import anthropic
client = anthropic.Anthropic(api_key='sk-ant-...')
message = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello'}]
)

# After
import cost_katana as ck
response = ck.ai('claude-3-sonnet', 'Hello')

From Google AI SDK

# Before
import google.generativeai as genai
genai.configure(api_key='...')
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content('Hello')

# After
import cost_katana as ck
response = ck.ai('gemini-pro', 'Hello')

📦 Package Names

| Language | Package | Install | Import |
|---|---|---|---|
| Python | PyPI | `pip install costkatana` | `import cost_katana` |
| JavaScript | NPM | `npm install cost-katana` | `import { ai } from 'cost-katana'` |
| CLI (NPM) | NPM | `npm install -g cost-katana-cli` | `cost-katana chat` |
| CLI (Python) | PyPI | `pip install costkatana` | `costkatana chat` |

📚 More Examples

Explore 45+ complete examples:

🔗 github.com/Hypothesize-Tech/costkatana-examples

| Section | Description |
|---|---|
| Python SDK | Complete Python guides |
| Cost Tracking | Track costs across providers |
| Semantic Caching | 30-40% cost reduction |
| FastAPI Integration | Framework examples |

📞 Support

| Channel | Link |
|---|---|
| Dashboard | costkatana.com |
| Documentation | docs.costkatana.com |
| GitHub | github.com/Hypothesize-Tech/costkatana-python |
| Discord | discord.gg/D8nDArmKbY |
| Email | [email protected] |

📄 License

MIT © Cost Katana


Start cutting AI costs today 🥷

pip install costkatana
import cost_katana as ck
response = ck.ai('gpt-4', 'Hello, world!')