Alternative LLM Backends

Besides Kiro, there are many other backends. This page compares them and walks through setup for each provider.

Quick comparison (May 2026)

| Provider | Cost | Quality | Setup difficulty | Best for |
|---|---|---|---|---|
| Kiro AI | FREE | ⭐⭐⭐⭐⭐ | Easy | Default best |
| OpenCode Free | FREE | ⭐⭐⭐⭐ | Easiest (no auth) | Quick start |
| Vertex AI | FREE ($300 credit) | ⭐⭐⭐⭐⭐ | Medium | High volume + quality |
| OpenRouter | Pay-per-use | ⭐⭐⭐⭐ | Easy | Flexibility |
| z.ai (GLM) | Cheap ($0.6/1M) | ⭐⭐⭐⭐ | Easy | Cost-effective |
| Kimi | Cheap | ⭐⭐⭐⭐ | Easy | Asian language tasks |
| DeepSeek | Cheap ($0.14/1M) | ⭐⭐⭐⭐ | Easy | Reasoning + coding |
| Groq | FREE tier + paid | ⭐⭐⭐⭐ | Easy | Speed (fastest) |
| OpenAI API | $$$ | ⭐⭐⭐⭐⭐ | Easiest | Mainstream |
| Anthropic API | $$$ | ⭐⭐⭐⭐⭐ | Easy | Claude direct |
| Self-host Llama | FREE (compute) | ⭐⭐⭐⭐ | Hard | Privacy |

OpenCode Free

The easiest setup: no login, no auth, no API key.

Setup in 9Router

Dashboard → Providers → OpenCode Free → Connect → done.

Models are auto-fetched from https://opencode.ai/zen/v1/models. The list typically includes:

  • oc/claude-3.5-sonnet
  • oc/gpt-4o
  • oc/llama-3.1-70b
  • oc/gemini-2.0-flash
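
To see which models are exposed at a given moment, the endpoint can be queried directly. The sketch below assumes the response follows the OpenAI-style list shape (`{"data": [{"id": ...}]}`), which this page does not confirm; `extract_model_ids` and `fetch_model_ids` are hypothetical helper names.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Return sorted model ids from an OpenAI-style /models payload (assumed shape)."""
    items = payload.get("data", [])
    return sorted(item["id"] for item in items if isinstance(item, dict) and "id" in item)


def fetch_model_ids(url: str, timeout: float = 10.0) -> list[str]:
    """Download the model list from `url` and return the ids it contains."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))


# Usage (performs a network call):
# for model_id in fetch_model_ids("https://opencode.ai/zen/v1/models"):
#     print(model_id)
```

Keeping the parsing separate from the download makes it easy to adjust if the real payload turns out to have a different shape.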

Use in the bot

OPENAI_MODEL=oc/claude-3.5-sonnet

Trade-off

  • ✅ Fastest setup
  • ✅ Models auto-update
  • ❌ Rate limits are not clearly documented
  • ❌ Reliability depends on OpenCode's infrastructure

Good for: dev / staging, or as a backup to Kiro.

Vertex AI (Google Cloud)

If you have a new GCP account, you get $300 in free credit. That is enough for a personal agent for roughly 6-12 months.

Setup

  1. Sign up for GCP: https://cloud.google.com/free
  2. Enable billing (the free credit is used first; you are not charged until it runs out)
  3. Enable the Vertex AI API
  4. Create a service account:
       • IAM & Admin → Service Accounts → Create
       • Role: Vertex AI User
       • Download the JSON key
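
Before uploading the key, it can help to confirm that the downloaded file really is a service-account key and to read the project ID it belongs to. This is a minimal sketch; `check_service_account_key` and `load_project_id` are hypothetical helper names, and the check covers only a few of the fields found in a Google Cloud service-account key file.

```python
import json
from pathlib import Path

# A subset of the fields present in a Google Cloud service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(data: dict) -> str:
    """Validate a parsed key and return its project_id; raise ValueError if malformed."""
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected key type: {data['type']!r}")
    return data["project_id"]


def load_project_id(path: str) -> str:
    """Read a key file from disk and return the project it belongs to."""
    return check_service_account_key(json.loads(Path(path).read_text()))
```

The returned project ID is the value to pick when the provider form asks for a project.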

Setup in 9Router

Dashboard → Providers → Vertex AI → Upload JSON key → Select project ID.

Models available:

  • vx/gemini-3-pro
  • vx/gemini-3-flash
  • vx/claude-sonnet-4 (via Vertex partnership)
  • vx/glm-5
  • vx/deepseek-v3

Use in the bot

OPENAI_MODEL=vx/gemini-3-pro

Trade-off

  • ✅ Premium models (Gemini 3, Claude via Vertex)
  • ✅ $300 credit lasts long
  • ✅ Stable & reliable
  • ❌ Setup is involved (GCP project, IAM, service account)
  • ❌ Expensive once the credit runs out

OpenRouter

A hub for 100+ models across providers. One API key gives access to OpenAI, Anthropic, Google, and Mistral models, plus a free tier (Llama 3, Mistral 7B, etc.).

Setup

  1. Sign up: https://openrouter.ai
  2. Dashboard → Keys → Create key (sk-or-v1-...)
  3. (Optional) Top up credit, or just use the free tier

Setup in 9Router

Dashboard → Providers → OpenRouter → Paste API key.

Direct use without 9Router

If you don't want to host 9Router, OpenRouter can serve directly as Kai's backend:

OPENAI_API_KEY=sk-or-v1-xxxxxxxx
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-3.5-sonnet

Add headers (optional):

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],  # same variable as the .env above
    base_url="https://openrouter.ai/api/v1",
    default_headers={
        "HTTP-Referer": "https://yourdomain.com",  # for credit attribution
        "X-Title": "Kai Personal Agent",
    },
)

Popular models on OpenRouter

| Model | Cost (per 1M tokens, input / output) |
|---|---|
| anthropic/claude-3.5-sonnet | $3 / $15 |
| anthropic/claude-3-haiku | $0.25 / $1.25 |
| openai/gpt-4o-mini | $0.15 / $0.6 |
| openai/gpt-4o | $2.50 / $10 |
| google/gemini-2.0-flash-exp:free | FREE (rate-limited) |
| meta-llama/llama-3.1-70b-instruct:free | FREE (rate-limited) |
| mistralai/mistral-7b-instruct:free | FREE (rate-limited) |
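
To turn these per-token prices into a bill, multiply input and output token counts by their respective rates. The sketch below uses the paid prices from the table above; `PRICES` and `estimate_cost` are hypothetical names, and real invoices may differ (for example through caching discounts or price changes).

```python
# (input, output) price in USD per 1M tokens, copied from the table above.
PRICES = {
    "anthropic/claude-3.5-sonnet": (3.00, 15.00),
    "anthropic/claude-3-haiku": (0.25, 1.25),
    "openai/gpt-4o-mini": (0.15, 0.60),
    "openai/gpt-4o": (2.50, 10.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example: a month with 2M input tokens and 0.5M output tokens.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

For that example month, gpt-4o-mini comes to about $0.60 and Claude 3.5 Sonnet to about $13.50, which shows how much the output rate dominates on the more expensive models.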

Cost-saving strategy: use free models for dev and paid models for critical production workloads.

Trade-off

  • ✅ Mainstream, with the most documentation
  • ✅ 100+ models behind one API
  • ✅ Free tier for Llama, Mistral, Gemini Flash
  • ❌ Free tier is rate-limited (inconsistent)
  • ❌ Paid tier is not the cheapest

z.ai (GLM by Zhipu)

China-based provider of the GLM model family. Cheap, with decent quality.

Setup

  1. Sign up: https://z.ai
  2. Top up credit (minimum $5)
  3. Generate API key

Direct integration

OPENAI_API_KEY=<glm-api-key>
OPENAI_BASE_URL=https://open.bigmodel.cn/api/paas/v4
OPENAI_MODEL=glm-4-plus

Setup in 9Router

Dashboard → Providers → GLM (Zhipu) → API key.

Models

  • glm-5 — flagship, comparable to GPT-4
  • glm-4-plus — fast & quality balance
  • glm-4-flash — cheapest, simple tasks

Cost

  • glm-5: $0.6 / 1M tokens
  • glm-4-plus: $0.3 / 1M tokens
  • glm-4-flash: $0.1 / 1M tokens

About 10x cheaper than Claude/GPT-4.

Trade-off

  • ✅ Very cheap
  • ✅ Decent quality for most tasks
  • ❌ China-hosted (a concern if data sovereignty matters)
  • ❌ Performs better in Mandarin than in English

Kimi (Moonshot AI)

China-based, Kimi K2 model.

Setup

  1. Sign up: https://platform.moonshot.cn
  2. API key

Direct integration

OPENAI_API_KEY=<kimi-api-key>
OPENAI_BASE_URL=https://api.moonshot.cn/v1
OPENAI_MODEL=moonshot-v1-32k

Models

  • moonshot-v1-8k: short context
  • moonshot-v1-32k: medium context
  • moonshot-v1-128k: long context (most popular)

Trade-off

  • ✅ Long context (128k tokens)
  • ✅ Cheap
  • ❌ Performance in Indonesian/English can be inconsistent

DeepSeek

China-based, strong at reasoning and coding.

Setup

  1. Sign up: https://platform.deepseek.com
  2. Top up $5 minimum
  3. API key

Direct integration

OPENAI_API_KEY=<deepseek-api-key>
OPENAI_BASE_URL=https://api.deepseek.com/v1
OPENAI_MODEL=deepseek-chat

Models

  • deepseek-chat — general purpose
  • deepseek-coder — specialized coding
  • deepseek-reasoner — R1-style reasoning

Cost

  • $0.14 / 1M input tokens
  • $0.28 / 1M output tokens

The cheapest of the paid options.

Trade-off

  • ✅ Very cheap
  • ✅ Competitive quality
  • ✅ Strong at code and reasoning
  • ❌ Variable latency

Groq (Speed king)

Cloud LPU inference, roughly 10x faster than inference on standard GPUs.

Setup

  1. Sign up: https://console.groq.com
  2. Create an API key on the free tier

Direct integration

OPENAI_API_KEY=<groq-api-key>
OPENAI_BASE_URL=https://api.groq.com/openai/v1
OPENAI_MODEL=llama-3.3-70b-versatile

Models

  • llama-3.3-70b-versatile — fastest decent quality
  • llama-3.1-8b-instant — sub-second response
  • mixtral-8x7b-32768 — long context
  • gemma2-9b-it — Google open model

Cost

  • Free tier: 30 req/min, 6000 req/day
  • Paid: $0.59 / 1M tokens (Llama 70B)
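
To stay under the free tier's per-minute limit, the bot can throttle itself before sending requests. The following is a minimal sliding-window sketch, not Groq's own mechanism; `SlidingWindowLimiter` is a hypothetical class, the 30-per-60-seconds default comes from the figure above, and the daily cap is not handled.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any window of `period` seconds."""

    def __init__(self, max_calls: int = 30, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._calls = deque()  # timestamps of calls still inside the window

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def try_acquire(self) -> bool:
        """Record a call and return True if one is allowed right now."""
        if self.wait_time() > 0:
            return False
        self._calls.append(self.clock())
        return True
```

A caller would check `try_acquire()` before each request and, when it returns False, sleep for `wait_time()` seconds or route the request to another backend.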

Trade-off

  • ✅ Fastest LLM inference available
  • ✅ Free tier generous
  • ✅ Llama 70B is competitive with Claude Haiku
  • ❌ Open models only (Llama, Mixtral, Gemma)
  • ❌ Not flagship quality

Good for: bots that need fast responses (< 1 second), real-time chat.

OpenAI API (direct)

The most mainstream option. Pay-as-you-go.

Setup

  1. Sign up: https://platform.openai.com
  2. Top up minimum $5
  3. API key

Integration

OPENAI_API_KEY=sk-xxxxxxxxx
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4o-mini

Models

  • gpt-4o-mini — daily driver ($0.15 / $0.60 per 1M)
  • gpt-4o — flagship ($2.50 / $10 per 1M)
  • gpt-4-turbo — older flagship
  • o1-mini / o1-preview — reasoning

Trade-off

  • ✅ Mainstream, tons of documentation
  • ✅ Reliable, scalable
  • ✅ Better privacy (paid-tier policy)
  • ❌ More expensive than the China-based providers
  • ❌ Quality of OpenAI's model updates has been inconsistent lately

Anthropic API (direct)

Direct access to Claude without a proxy.

Setup

  1. Sign up: https://console.anthropic.com
  2. Top up $5 minimum
  3. API key

Integration

⚠️ The Anthropic API format differs from OpenAI's. Use the anthropic SDK:

import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are Kai...",  # system prompt is a top-level argument, not a message
    messages=[{"role": "user", "content": "halo"}],
)
print(response.content[0].text)  # text of the first content block

Or go through OpenRouter/9Router, which already handle the OpenAI ↔ Anthropic translation.
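
The core of that translation is moving any system message out of the message list and into the top-level `system` argument. The sketch below illustrates the idea for plain text messages only; `to_anthropic_request` is a hypothetical helper, it is not how OpenRouter or 9Router implement it, and tool calls and image content are not handled.

```python
def to_anthropic_request(messages: list[dict], model: str, max_tokens: int = 1024) -> dict:
    """Build keyword arguments for Anthropic's messages.create from OpenAI-style messages."""
    # OpenAI puts the system prompt in the message list; Anthropic takes it separately.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] in ("user", "assistant")
    ]
    request = {"model": model, "max_tokens": max_tokens, "messages": chat}
    if system_parts:
        request["system"] = "\n\n".join(system_parts)
    return request


# Usage with the Anthropic SDK (network call, shown as a comment):
# response = client.messages.create(**to_anthropic_request(history, "claude-3-5-sonnet-20241022"))
```

Note that `max_tokens` is required on the Anthropic side, so the adapter has to supply a default when the OpenAI-style caller omits it.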

Models

  • claude-3-5-sonnet — flagship Claude 3.5
  • claude-3-5-haiku — fast
  • claude-3-opus — older flagship (deprecated soon)
  • claude-4-sonnet-thinking (once released)

Cost

  • Claude 3.5 Sonnet: $3 / $15 per 1M
  • Claude 3.5 Haiku: $1 / $5 per 1M

Trade-off

  • ✅ Best quality (Claude 4.5 via API tier)
  • ✅ Privacy decent
  • ❌ The most expensive
  • ❌ Format is not OpenAI-compatible (needs an adapter / OpenRouter)

Self-host (advanced)

For when you are serious about privacy and want to own everything.

Popular stacks

Ollama (easiest):

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b
ollama serve  # default port 11434

OpenAI-compatible endpoint:

OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
OPENAI_MODEL=llama3.1:8b

vLLM (high-throughput):

pip install vllm
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --port 8000

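llama.cpp (CPU-friendly, quantized):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Recent versions build with CMake; the server binary is named llama-server
cmake -B build
cmake --build build --config Release
./build/bin/llama-server -m models/llama-3.1-8b.Q4_K_M.gguf -c 4096 --port 8080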

Hardware requirements

| Model | Min RAM (Q4 quantized) | Min RAM (full FP16) |
|---|---|---|
| Llama 3.1 8B | 6 GB | 16 GB |
| Llama 3.1 70B | 40 GB | 140 GB |
| Mistral 7B | 5 GB | 14 GB |
| Qwen 2.5 7B | 5 GB | 14 GB |
| DeepSeek-Coder 6.7B | 5 GB | 13 GB |

An Oracle VPS (24 GB RAM, ARM) can comfortably run quantized Llama 3.1 8B.
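
The FP16 column follows directly from parameter count times two bytes per weight. The sketch below reproduces that arithmetic; `weight_memory_gb` and `fits_in_ram` are hypothetical helpers, and the 2 GB headroom for context cache and runtime overhead is an assumption, not a measured figure.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_ram(params_billion: float, bits_per_weight: float,
                ram_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if weights plus an assumed headroom fit in the given RAM."""
    return weight_memory_gb(params_billion, bits_per_weight) + headroom_gb <= ram_gb


print(weight_memory_gb(8, 16))   # Llama 3.1 8B at FP16
print(weight_memory_gb(70, 16))  # Llama 3.1 70B at FP16
print(weight_memory_gb(8, 4))    # Llama 3.1 8B at a nominal 4 bits per weight
```

The nominal 4-bit results (4 GB for 8B, 35 GB for 70B) come out lower than the Q4 column above, since Q4_K_M averages somewhat more than 4 bits per weight and the table leaves room for the context cache.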

Trade-off

  • ✅ Full privacy, no outbound network calls
  • ✅ Zero ongoing cost (beyond the VPS)
  • ✅ Full control over model behavior
  • ❌ Setup is involved
  • ❌ Quality is below cloud frontier models
  • ❌ CPU inference is slow (low tokens per second)

Recommendation per use case

"Personal bot, free, high quality"

Kiro AI via 9Router

"Personal bot, free, super simple"

OpenCode Free via 9Router

"Personal bot, needs privacy"

Self-host Llama 3.1 8B + Ollama, or Anthropic API (paid)

"Personal bot, needs extreme speed"

Groq (Llama 70B via LPU)

"Personal bot, cost-conscious but wants quality"

DeepSeek ($0.14/1M) or GLM ($0.6/1M)

"Production bot, mainstream support"

OpenAI API or Anthropic API

"Bot with many provider fallbacks"

9Router with 3-5 providers connected in tiers
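
How 9Router implements tiering is not described on this page. Purely to illustrate the idea, a client-side fallback chain can be sketched as follows; `complete_with_fallback` and `AllBackendsFailed` are hypothetical names, and each `call` stands in for whatever function sends a prompt to one backend.

```python
from typing import Callable, Sequence


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the chain raised an error."""


def complete_with_fallback(
    prompt: str,
    backends: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next tier
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

Ordering the list from preferred to last-resort gives the tier behavior; returning the backend name alongside the reply makes it easy to log which tier actually served a request.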

Decision tree

Willing to pay?
├── No
│   ├── Want minimal setup? → OpenCode Free
│   ├── Want the best quality? → Kiro AI
│   ├── Want GCP credit? → Vertex AI ($300)
│   └── Want privacy? → Self-host Llama
└── Yes, budget?
    ├── < $5/mo → DeepSeek / GLM via OpenRouter
    ├── $5-20/mo → OpenAI gpt-4o-mini / Anthropic Haiku
    └── $20+/mo → Claude Sonnet via Anthropic API
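
The same tree can be written as a small function, which is handy if the choice needs to be made in a setup script. This is a sketch: `recommend_backend` and its `priority` keys are hypothetical, and a budget of exactly $20 is placed in the top tier because the tree leaves that boundary ambiguous.

```python
from typing import Optional


def recommend_backend(monthly_budget_usd: float, priority: Optional[str] = None) -> str:
    """Mirror the decision tree; `priority` only matters on the free branch."""
    if monthly_budget_usd <= 0:
        free_choices = {
            "minimal_setup": "OpenCode Free",
            "quality": "Kiro AI",
            "gcp_credit": "Vertex AI ($300)",
            "privacy": "Self-host Llama",
        }
        if priority not in free_choices:
            raise ValueError(f"priority must be one of {sorted(free_choices)}")
        return free_choices[priority]
    if monthly_budget_usd < 5:
        return "DeepSeek / GLM via OpenRouter"
    if monthly_budget_usd < 20:
        return "OpenAI gpt-4o-mini / Anthropic Haiku"
    return "Claude Sonnet via Anthropic API"
```

For example, `recommend_backend(0, "privacy")` follows the free branch, while `recommend_backend(10)` lands in the middle paid tier.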

Switching backends in code

Your bot should support switching backends without code changes. Use environment variables:

import os

from openai import OpenAI

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
OPENAI_BASE_URL = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
OPENAI_MODEL = os.environ.get("OPENAI_MODEL", "gpt-4o-mini")

client = OpenAI(api_key=OPENAI_API_KEY, base_url=OPENAI_BASE_URL)

Switching providers means updating .env and restarting the bot:

# Use Kiro via 9Router
sed -i 's|OPENAI_BASE_URL=.*|OPENAI_BASE_URL=http://127.0.0.1:20128/v1|' ~/agent/.env
sed -i 's|OPENAI_MODEL=.*|OPENAI_MODEL=kr/claude-sonnet-4.5|' ~/agent/.env
sudo systemctl restart kai-bot
# Use OpenAI directly
sed -i 's|OPENAI_BASE_URL=.*|OPENAI_BASE_URL=https://api.openai.com/v1|' ~/agent/.env
sed -i 's|OPENAI_MODEL=.*|OPENAI_MODEL=gpt-4o-mini|' ~/agent/.env
sudo systemctl restart kai-bot
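
The same edit can be done from Python, for instance inside an admin script. The sketch below is an assumption-laden equivalent of the sed commands: `set_env_values` and `switch_backend` are hypothetical helpers, the file is treated as plain `KEY=value` lines, and unlike sed a missing key is appended rather than silently skipped.

```python
from pathlib import Path


def set_env_values(text: str, updates: dict[str, str]) -> str:
    """Return .env text with each KEY=... line replaced; missing keys are appended."""
    remaining = dict(updates)
    lines = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)  # comments and unrelated settings stay untouched
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


def switch_backend(env_path: str, base_url: str, model: str) -> None:
    """Rewrite the two backend settings in the .env file at env_path."""
    path = Path(env_path)
    updates = {"OPENAI_BASE_URL": base_url, "OPENAI_MODEL": model}
    path.write_text(set_env_values(path.read_text(), updates))
```

The bot still has to be restarted afterwards, since the environment is only read at startup.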

Or add a /setmodel command to the bot for runtime switching.
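
A runtime switch boils down to keeping the active model name in memory and validating what the user asks for. The sketch below is framework-agnostic and hypothetical throughout (`ModelSwitcher`, the allowlist, and the reply strings are all assumptions); wiring it to an actual chat framework is left out.

```python
class ModelSwitcher:
    """Holds the active model name and handles a '/setmodel <name>' chat command."""

    def __init__(self, default_model: str, allowed: set[str]) -> None:
        self.allowed = set(allowed) | {default_model}
        self.current = default_model

    def handle(self, text: str) -> str:
        """Process a command string and return the reply to send back to the user."""
        parts = text.split()
        if not parts or parts[0] != "/setmodel":
            return "Not a /setmodel command."
        if len(parts) != 2:
            return f"Usage: /setmodel <model>. Current: {self.current}"
        if parts[1] not in self.allowed:
            return f"Unknown model: {parts[1]}"
        self.current = parts[1]
        return f"Model set to {self.current}"
```

Each request would then pass `switcher.current` as the `model` argument. The choice lives only in memory here, so it resets to the default when the bot restarts.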

Final advice

For beginners:

  1. Start with Kiro via 9Router (free, high quality)
  2. Set up OpenCode Free as a fallback
  3. Add paid Anthropic / OpenAI if you need privacy / reliability

For advanced users:

  1. Multi-account Kiro to effectively double capacity
  2. Tier-based routing via 9Router (Subscription → Cheap → Free)
  3. Self-host Llama for privacy-sensitive tasks

For commercial production:

  1. Anthropic API or OpenAI API directly
  2. Monitoring + cost alerts
  3. An SLA with the provider

Cost projection for a personal agent (May 2026):

  • Pure free (Kiro + OpenCode): $0/month
  • Hybrid (Kiro + DeepSeek backup): $1-3/month
  • Production paid (Anthropic Claude Sonnet): $10-30/month