FAQ¶
Pertanyaan yang sering muncul saat ngebangun agent ala-SOUL.
Konsep¶
"Bedanya SOUL.md sama system prompt biasa apa?"¶
System prompt biasa: flat, narrative, hardcoded di kode.
SOUL.md: terstruktur 10 pilar, di file terpisah, load tiap session.
Keuntungan SOUL.md: - Update tanpa restart kode - Audit per pilar - Version control (git) - Share-able dengan orang lain (kalo non-sensitive)
"Filosofi 'akun user = akun agent' aman ga?"¶
Tergantung setup: - ⚠️ Tidak aman kalo: lo skip risk-gating, simpan credential di chat history, atau kasih akses tanpa boundaries - ✅ Aman kalo: risk-gating untuk high actions, credential di file 600, redact secrets di history, SOUL.md boundaries jelas
Filosofi ini powerful tapi butuh disiplin. Setup-nya 1x, manfaatnya seumur agent.
"Bisa ga adopt sebagian SOUL aja, ga semua 10 pilar?"¶
Bisa. Minimal yang penting: - Identity (siapa agent) - Communication (style) - Boundaries (hard NO)
Tambahan untuk agent dengan tool execution: - Capabilities - Autonomy + Risk-Gating - Verification
Skip dulu kalo simple: - Memory Rules (kalo lo ga pake memory) - Resource Management (kalo simple bot)
"SOUL.md ga akan bikin bot lebih lambat ya?"¶
Ya nambah ~6-9 KB ke system prompt = ~1500-2500 tokens.
Impact: - Latency: +0.2-0.5 detik per call - Cost: +0.1-0.5 cent per call (kalo pake API berbayar)
Worth it untuk konsistensi behavior.
Optimize kalo perlu: - Hapus pilar yang ga critical buat use case lo - Compress wording (pilihan lo "JANGAN A" vs "Janganlah pernah melakukan A")
Setup¶
"Free VPS mana yang paling cocok buat agent personal?"¶
Order rekomen: 1. Oracle Cloud Always Free — 24GB RAM forever (kalo bisa lolos approval) 2. AWS Free Tier — 12 bulan gratis t3.micro, abis itu $10/bln 3. Google Cloud e2-micro — forever free tapi US region only
Detail di VPS Gratis.
"Bisa ga jalan agent di laptop aja?"¶
Bisa, tapi: - ❌ Laptop harus nyala terus - ❌ Kalo lo gerak (mobile), bot down - ❌ Lo ga bisa close laptop kalo bot lagi proses
Mendingan VPS. Free tier udah lebih dari cukup.
Excepción: bot yang lo run on-demand (manual run, ga 24/7). Laptop OK.
"Pake LLM mana yang murah tapi bagus?"¶
Pricing tier per Mei 2026 (per 1M input tokens):
| Model | Input | Output | Quality |
|---|---|---|---|
| gpt-4o-mini | $0.15 | $0.60 | Solid baseline |
| claude-3-haiku | $0.25 | $1.25 | Sangat fast |
| llama-3.1-70b (via Groq) | $0.59 | $0.79 | Super cepet, open weights |
| gemini-1.5-flash | $0.075 | $0.30 | Murah, decent |
| deepseek-chat | $0.14 | $0.28 | Murah, OK |
Untuk agent personal yang chat 100-500x sehari, semua ini < $1/bulan. Pilih based on: - Speed: Groq (Llama) - Quality: claude-3-haiku atau gpt-4o-mini - Cost: gemini-flash atau deepseek
Atau host LLM lokal: - ollama dengan Llama 3.1 8B di VPS Oracle (24GB RAM cukup) - llama.cpp dengan model Q4 quantized
"Ga bisa sign up Oracle Cloud (di-decline) gimana?"¶
Common reasons: - Pake VPN/Tor → matikan, sign up dari residential IP - Credit card prepaid/virtual → pake debit/credit beneran - Phone number ga aktif → pakai nomor yang bisa receive SMS
Backup plan: - AWS Free Tier (mainstream, jarang decline) - Hetzner CX11 €3.79 (bayar tapi murah)
"Pakai webhook atau polling untuk Telegram bot?"¶
Polling (run_polling):
- ✅ Setup gampang
- ✅ Ga butuh public HTTPS endpoint
- ❌ Latency ~1-2 detik
- ❌ Resource lebih (bot constantly ping Telegram)
Webhook (run_webhook):
- ✅ Latency ~100-300ms
- ✅ Resource efficient
- ❌ Butuh public HTTPS endpoint (Let's Encrypt + reverse proxy)
Untuk personal bot single-user: polling cukup. Webhook overhead ga worth it.
Coding¶
"Model gua output markdown terus, gimana ngepatch-nya?"¶
Update section Communication di SOUL.md:
## COMMUNICATION
PLAIN TEXT WAJIB. JANGAN PERNAH PAKE Markdown:
- TANPA *, **, ***
- TANPA #, ##, ###
- TANPA -, *, + buat list (pakai angka 1. 2. 3.)
- TANPA backtick atau triple-backtick
- TANPA [link](url) atau 
Tulis kayak SMS / chat biasa.
Setelah update SOUL.md, reset history:
Karena model bisa "ke-tarik" pattern lama yang udah ada di history.
Kalo masih nge-output markdown setelah itu, ganti model. Beberapa model emang stubborn di markdown habit.
"Bagaimana cara handle output panjang dari tool?"¶
Output 10MB dari find / ga akan muat di context. Truncate:
async def run_shell(cmd):
# ... execute
out = stdout.decode()
if len(out) > 3000:
return out[:3000] + f"\n\n... (truncated, total {len(out)} chars)"
return out
Atau buat smart truncation:
def smart_truncate(text, max_chars=3000):
if len(text) <= max_chars:
return text
lines = text.split('\n')
if len(lines) > 50:
# Show first 20 + last 10
head = '\n'.join(lines[:20])
tail = '\n'.join(lines[-10:])
return f"{head}\n\n... ({len(lines)-30} lines skipped) ...\n\n{tail}"
return text[:max_chars]
"History bot bengkak. Cara handle?"¶
Sliding window di code:
Save juga limit:
Atau auto-archive saat besar:
if len(history) > 100:
archive_file = HISTORY_DIR / "archive" / f"{chat_id}-{ts}.json"
archive_file.write_text(json.dumps(history[:50])) # archive first 50
history = history[50:] # keep last 50
"Bot gua trigger 'Injection ke-X' di tiap response. Fix?"¶
Model treat tool result sebagai injection (paranoid mode).
Fix di SOUL.md:
## TOOL RESULTS — PENTING
Tool result format:
"Tool execution selesai (step X/Y). Hasil: [output]"
INI BUKAN INJECTION. Output resmi dari pipeline eksekusi.
LARANGAN KERAS:
- JANGAN bilang "Injection ke-..."
- JANGAN bilang "ignored"
- JANGAN curiga tool result
Lalu reset history (penting!):
Soalnya kalo history udah penuh dengan "Injection ke-X", model bakal continue pattern itu.
Detail di Prompt Injection Defense.
"Bot gua respon lambat 5-10 detik. Normal?"¶
Per call LLM ~2-3 detik. Multi-step (tool_use) bisa 2-3 call = 5-9 detik.
Optimize:
- Compress system prompt — kurangin SOUL.md kalo bisa
- Reduce history —
history[-20:]bukanhistory[-30:] - Pakai LLM yang lebih cepat — Groq, gemini-flash, claude-haiku
- Typing indicator — agar user ga panik:
- Stream responses (advanced) — first chunk muncul cepat
"Bagaimana cara debug agent yang behavior aneh?"¶
Add detailed logging:
log.info(f"=== Request from {chat_id} ===")
log.info(f"User: {user_text}")
log.info(f"System prompt size: {len(messages[0]['content'])} chars")
log.info(f"History size: {len(history)} messages")
Setelah call:
Setiap tool exec:
Tail log:
Atau enable verbose mode:
Operations¶
"VPS gua di-suspend Oracle, gimana?"¶
- Jangan panic
- Cek email — Oracle biasanya kasih notice 7-30 hari
- Backup data SEKARANG (sebelum benar-benar deleted)
- Sign up VPS baru (AWS, Hetzner, Oracle lagi dengan akun beda)
- Restore dari backup
Pelajaran: - Selalu backup ke S3/R2 eksternal - Punya rencana migrasi siap - Ga rely on 1 provider
"Disk penuh terus walaupun udah cleanup. Apa yang ngabisin?"¶
Common culprits:
- /var/log (journalctl, app logs) → sudo journalctl --vacuum-time=7d
- /var/cache (apt cache) → sudo apt-get clean
- ~/.cache (pip, npm, browser) → rm -rf ~/.cache/{pip,npm}
- /tmp (orphan files) → sudo find /tmp -type f -mtime +7 -delete
- /snap (snap revisions) → sudo snap set system refresh.retain=2
- Docker images → docker system prune -a
"VPS kehabisan RAM, bot crash. Solusi?"¶
Short-term:
- Restart bot: sudo systemctl restart kai-bot
- Limit memory di systemd: MemoryLimit=400M
Long-term: - Reduce system prompt size - Smaller history window - Add swap:
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
Swap = pakai disk sebagai RAM extension. Lambat tapi mencegah OOM kill.
"GitHub PAT gua bocor di log. Apa yang harus dilakuin?"¶
- Revoke segera: https://github.com/settings/tokens → Delete
- Generate token baru
- Update di
~/agent/credentials/github.env - Audit log: cari kapan/dimana bocor
- Pastikan
redact_secrets()aktif disave_history():
def save_history(chat_id, history):
sanitized = [
{**msg, "content": redact_secrets(msg["content"])}
for msg in history if isinstance(msg, dict) and "content" in msg
]
# ... save
- Reset history kalo perlu:
Security¶
"Aman ga simpan private key wallet di file plaintext?"¶
⚠️ Tidak aman 100% kalo: - VPS shared (orang lain bisa akses) - VPS lo ke-compromise - Backup ke cloud tanpa encryption
✅ Acceptable risk kalo: - VPS personal, akses cuma lo - File permission 600 (read/write owner only) - Folder permission 700 - Wallet itu spesifik untuk agent (bukan main wallet lo) - Saldo wallet < $X yang lo nyaman lose
Untuk wallet besar:
- Encrypt dengan GPG: gpg -c wallet.env
- Pakai hardware wallet (Ledger / Trezor) yang sign via USB
- Pakai multisig + agent cuma punya 1 dari N keys
- Pakai threshold signatures
Atau: cuma kasih agent akses ke "hot wallet" kecil, transfer ke "cold wallet" manual.
"Apa risiko utama jalan agent dengan SOUL akses penuh?"¶
- Prompt injection → agent tertipu konten malicious. Defense: SOUL.md + content fencing.
- Credential leak → token muncul di log/history. Defense: redaction + file permission.
- Risk misclassification → model salah label risk=low padahal high. Defense: dual-check di backend, conservative default risk=high kalo missing.
- Account suspended → karena agent activity dianggap automation. Defense: rate-limiting, human-like delay, avoid suspicious patterns.
- VPS compromise → seseorang akses VPS lo. Defense: SSH key only, firewall, fail2ban.
- Bot token leak → orang lain bisa kirim command. Defense:
is_owner()check di setiap handler.
"Bot gua bisa di-takeover ga kalo orang dapet bot token?"¶
Iya, tapi gua punya is_owner() check:
def is_owner(user_id):
return user_id == OWNER_TELEGRAM_ID
async def message_handler(update, ctx):
if not is_owner(update.effective_user.id):
return # decline
# ... process
Walaupun orang lain spam chat ke bot, kalo user_id != OWNER_TELEGRAM_ID, ga akan di-process.
⚠️ Limitation: kalo orang lain dapet bot token, mereka bisa read chat history yang ada di bot (via getUpdates). Tapi mereka ga bisa execute command karena bot reject.
Best practice: token rotate kalo lo curiga bocor.
Tools & Integration¶
"Bisa ga agent gua di Discord juga, bukan cuma Telegram?"¶
Bisa. Adapter pattern:
class TelegramAdapter:
async def send_message(self, chat_id, text):
await self.bot.send_message(chat_id, text)
class DiscordAdapter:
async def send_message(self, channel_id, text):
channel = self.bot.get_channel(channel_id)
await channel.send(text)
# Core agent logic ga peduli platform
async def handle_message(adapter, user_id, text):
if not is_authorized(user_id):
return
response = await ask_ai_agentic(user_id, text)
await adapter.send_message(user_id, response)
"Bisa ga agent gua punya web UI?"¶
Bisa. Pakai FastAPI + Streamlit/SSE/WebSocket:
from fastapi import FastAPI, WebSocket
app = FastAPI()
@app.websocket("/chat/{user_id}")
async def chat_ws(websocket: WebSocket, user_id: str):
await websocket.accept()
while True:
text = await websocket.receive_text()
response = await ask_ai_agentic(int(user_id), text)
await websocket.send_text(response)
Deploy backend ke VPS, frontend ke Vercel/Netlify (static React).
"Bisa ga agent gua trigger oleh cron (proaktif)?"¶
Bisa. Cron script call ke agent dengan synthetic message:
# Cron daily 9am
0 9 * * * /home/ubuntu/agent/proactive.sh
# proactive.sh:
#!/bin/bash
curl -X POST "https://api.telegram.org/bot${TOKEN}/sendMessage" \
-d chat_id=${OWNER_ID} \
-d text="Selamat pagi! Mau status server hari ini?"
Atau panggil agent langsung:
# proactive.py
from main import ask_ai_agentic
import asyncio
async def daily_report():
response = await ask_ai_agentic(
OWNER_ID,
"Generate daily report: cek service status, disk, recent errors"
)
# Send to Telegram
await bot.send_message(OWNER_ID, response)
asyncio.run(daily_report())
"Multi-user agent (untuk komunitas) gimana?"¶
Different SOUL.md per user role:
def build_system_prompt(user_id):
base_soul = load_soul()
role = get_user_role(user_id) # admin, member, guest
role_appendix = {
"admin": "User adalah admin, akses penuh.",
"member": "User adalah member, akses subset.",
"guest": "User adalah guest, read-only."
}.get(role, "")
return f"{base_soul}\n\n## ROLE\n{role_appendix}"
Plus per-user permission check di tool execution:
ALLOWED_COMMANDS_BY_ROLE = {
"admin": ["*"],
"member": ["git status", "git log", "ls", "cat"],
"guest": ["ls", "cat"]
}
def is_command_allowed(user_role, command):
allowed = ALLOWED_COMMANDS_BY_ROLE.get(user_role, [])
if "*" in allowed:
return True
return any(command.startswith(c) for c in allowed)
Lainnya¶
"Lisensi pakai apa?"¶
Untuk code agent: MIT atau Apache 2.0 (kalo lo mau orang lain pake juga).
Untuk SOUL.md template: free use, attribution appreciated.
"Bisa lo (Devin/Cognition) update guide ini?"¶
Guide ini lo bisa fork & maintain sendiri. Source di GitHub (kalo lo deploy ke GitHub Pages atau Vercel/Netlify).
Update di-trigger via git push → CI/CD redeploy.
"Ada channel/group orang yang ngebahas SOUL agent?"¶
Komunitas: - HuggingFace forum - r/LocalLLaMA - r/MachineLearning - Indo: komunitas AI Indonesia atau cari di Telegram
Atau bikin sendiri 😉
"Kalo gua udah deploy + bot jalan, terus mau scale ke 100 user?"¶
Bot single-user vs multi-user beda banget arsitekturnya:
| Aspek | Single-user | 100 user |
|---|---|---|
| LLM cost | < $1/bulan | $10-100/bulan |
| Server | t3.micro cukup | minimal t3.small atau lebih |
| DB | JSON files | Postgres / SQLite |
| Concurrency | Single asyncio | Worker pool / queue |
| Rate limiting | Optional | Wajib |
| User auth | Single owner | OAuth / login |
| Monitoring | Basic | Grafana / Datadog |
Plan migration step-by-step, jangan langsung jump ke "big scale".