Update & Maintain¶
Daily and long-term maintenance.
Daily checklist¶
Check every day (or set up monitoring):
- Service status: sudo systemctl is-active kai-bot
- Disk space: df -h (alert if >80%)
- Memory: free -h
- Recent errors: sudo journalctl -u kai-bot -p err --since today
- Bot responsive (send it a message on Telegram)
Cron script for automated checks:
cat > ~/agent/daily-check.sh << 'EOF'
#!/bin/bash
# Run via cron daily, alert via Telegram if issues
source ~/agent/.env
ISSUES=()
# Service check
if ! systemctl is-active --quiet kai-bot; then
    ISSUES+=("Service DOWN")
fi
# Disk check (percent used on the root filesystem)
DISK_PCT=$(df / | tail -1 | awk '{print $5}' | tr -d '%')
if [ "$DISK_PCT" -gt 80 ]; then
    ISSUES+=("Disk ${DISK_PCT}% full")
fi
# Memory check
MEM_PCT=$(free | grep Mem | awk '{print int($3/$2 * 100)}')
if [ "$MEM_PCT" -gt 80 ]; then
    ISSUES+=("RAM ${MEM_PCT}% used")
fi
# Recent errors (-q suppresses the "-- No entries --" line so it is not counted)
ERROR_COUNT=$(journalctl -u kai-bot -p err --since "24 hours ago" -q --no-pager | wc -l)
if [ "$ERROR_COUNT" -gt 10 ]; then
    ISSUES+=("$ERROR_COUNT errors last 24h")
fi
# Alert if any
if [ ${#ISSUES[@]} -gt 0 ]; then
    MSG="Daily check issues:"
    for issue in "${ISSUES[@]}"; do
        MSG+=$'\n'"- ${issue}"
    done
    # --data-urlencode escapes the newlines and the literal % signs in the message
    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
        -d "chat_id=${OWNER_TELEGRAM_ID}" \
        --data-urlencode "text=${MSG}"
fi
EOF
chmod +x ~/agent/daily-check.sh
# $HOME expands now, so the crontab entry gets the absolute path for the current user
(crontab -l 2>/dev/null; echo "0 9 * * * $HOME/agent/daily-check.sh") | crontab -
Weekly maintenance¶
Every week:
1. System update¶
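The commands for this step did not survive on the page. A minimal sketch, assuming an apt-based system such as the Ubuntu releases mentioned under Yearly; the function name `weekly_update` is hypothetical:

```shell
# Hypothetical helper: refresh package lists, apply upgrades, drop unused packages.
weekly_update() {
    sudo apt update &&
    sudo apt upgrade -y &&
    sudo apt autoremove -y
    # Ubuntu creates this file when an installed update needs a reboot.
    if [ -f /var/run/reboot-required ]; then
        echo "Reboot required"
    fi
}
```

Call `weekly_update` from an interactive shell so you can read the output before deciding whether to reboot.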
2. Cleanup cache¶
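The page does not say which caches are meant. A rough sketch, assuming the apt package cache, the systemd journal, and the pip cache are the ones worth clearing; the function name and the 14-day retention are example choices, not values from the source:

```shell
# Hypothetical helper; adjust or drop lines that do not apply to your setup.
cleanup_caches() {
    sudo apt clean                       # remove downloaded .deb files
    sudo journalctl --vacuum-time=14d    # drop journal entries older than 14 days
    pip cache purge                      # clear pip's download/wheel cache (pip 20.1+)
}
```

Run `df -h /` before and after to see how much space came back.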
3. Disk audit¶
# Largest folders under your home directory
du -h ~ --max-depth=2 | sort -rh | head -10
# Largest folders under /
sudo du -h / --max-depth=2 2>/dev/null | sort -rh | head -10
4. Manual log rotation¶
If your log files are bloated (custom logs, not the journal):
ls -lh ~/agent/*.log
# if large:
mkdir -p ~/agent/logs-archive   # make sure the archive folder exists
mv ~/agent/agent.log ~/agent/logs-archive/agent-$(date +%Y%m%d).log
gzip ~/agent/logs-archive/agent-$(date +%Y%m%d).log
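To take the "is it large?" judgment out of the loop, the same steps can be wrapped in a size check. A sketch only: `rotate_if_large` and the 10 MiB default are hypothetical, not part of the original setup.

```shell
# Hypothetical helper: archive a log only when it exceeds a size limit.
# Usage: rotate_if_large <log-file> <archive-dir> [max-bytes]
rotate_if_large() {
    local log="$1" archive_dir="$2" max_bytes="${3:-10485760}"  # default 10 MiB
    [ -f "$log" ] || return 0
    local size dest
    size=$(wc -c < "$log")
    if [ "$size" -gt "$max_bytes" ]; then
        mkdir -p "$archive_dir"
        dest="$archive_dir/$(basename "$log" .log)-$(date +%Y%m%d).log"
        # Move first, then compress the archived copy.
        mv "$log" "$dest" && gzip -f "$dest"
    fi
}
```

Example: `rotate_if_large ~/agent/agent.log ~/agent/logs-archive`. If the bot keeps the log file open, restart it after rotating, since a process keeps writing to a moved file through its old handle.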
5. Backup verify¶
# Test backup readable
aws s3 ls s3://my-backups/ | tail -5
# Test restore (dry run)
aws s3 cp s3://my-backups/agent-latest.tar.gz /tmp/test-restore.tar.gz
tar -tzf /tmp/test-restore.tar.gz | head -5 # list files without extracting
rm /tmp/test-restore.tar.gz
Monthly maintenance¶
1. Review memory.json¶
Remove entries that are stale, redundant, or look like prompt injection. Edit the file directly.
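Hand-editing JSON makes it easy to leave a stray comma behind. A small check to run after editing, assuming `jq` is installed and that memory.json has a top-level `notes` array (the shape the restore test further down reads); `validate_memory` is a hypothetical name:

```shell
# Hypothetical helper: confirm the file still parses and .notes is an array.
validate_memory() {
    local file="$1"
    if jq -e '.notes | type == "array"' "$file" > /dev/null 2>&1; then
        echo "OK: $(jq '.notes | length' "$file") notes"
    else
        echo "INVALID: $file" >&2
        return 1
    fi
}
```

Copy the file somewhere safe before editing, then run `validate_memory` on the edited file (for example `~/agent/data/memory.json`, if that is where yours lives) before restarting the bot.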
2. Archive history¶
Per-chat history can pile up. Archive the old files:
cd ~/agent/data/history
mkdir -p archive   # make sure the target folder exists
# Move files older than 30 days into archive/
find . -maxdepth 1 -name "*.json" -mtime +30 -exec mv {} archive/ \;
3. Audit credentials¶
- Which tokens are still active?
- Which ones are no longer used?
- Which ones need rotation (older than 90 days)?
Rotate the tokens that are in use:
1. Generate a new token at the provider
2. Update ~/agent/credentials/<platform>.env
3. Test that the agent runs
4. Revoke the old token
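For the 90-day question, file modification time can serve as a rough stand-in for token age. A sketch, with `stale_credentials` as a hypothetical name; it assumes each credential file is only touched when its token changes:

```shell
# Hypothetical helper: list credential files not modified in more than N days.
# mtime is only a proxy for token age; any edit to the file resets it.
stale_credentials() {
    local dir="$1" days="${2:-90}"
    find "$dir" -maxdepth 1 -type f -name '*.env' -mtime +"$days"
}
```

Example: `stale_credentials ~/agent/credentials 90`. The provider dashboard remains the authoritative source for when a token was actually issued.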
4. SOUL.md review¶
Open ~/agent/SOUL.md and read it slowly:
- Is the identity still accurate?
- Are the capabilities up to date with the tools available?
- Did the bot show any odd patterns over the past month that should be fixed in SOUL.md?
- Remove sections that are no longer relevant and add new ones.
After editing, reset the chat history from Telegram so old patterns no longer influence the bot.
5. Cost audit¶
If you use paid services:
- LLM API: how many tokens were used this month? Check the provider dashboard.
- VPS: if it is not on a free tier, what was this month's charge?
- Database / storage: how many GB are used?
If you are over budget, optimize:
- Switch to a cheaper LLM model (gpt-4o-mini, claude-haiku, llama via Groq)
- Reduce the history window in the system prompt
- Clean up old data
Quarterly (every 3 months)¶
1. Provider review¶
Are you still using the most cost-effective provider?
- AWS Free Tier runs out in month 12 → migrate to Oracle / Hetzner
- Provider X raises prices → consider an alternative
2. Security audit¶
# Count failed SSH login attempts
sudo journalctl _COMM=sshd --since "30 days ago" | grep -c "Failed password"
# Check logged-in users and recent logins
who
last -n 20
# Check listening ports (ss ships with Ubuntu; netstat needs the net-tools package)
sudo ss -tlnp
# Only the ports you expect (22 for SSH plus whatever the bot needs)?
# Check credential file permissions
ls -la ~/agent/credentials/
# All 600?
# Check SOUL.md
ls -la ~/agent/SOUL.md
# 644 or 600?
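The "all 600?" check can be automated instead of read off the `ls` output. A minimal sketch; `check_credential_perms` is a hypothetical name:

```shell
# Hypothetical helper: print credential files whose mode is not exactly 600.
# Returns 1 when at least one file needs fixing, 0 otherwise.
check_credential_perms() {
    local dir="$1" bad
    bad=$(find "$dir" -type f ! -perm 600)
    if [ -n "$bad" ]; then
        echo "$bad"
        return 1
    fi
}
```

Example: `check_credential_perms ~/agent/credentials || echo "fix with chmod 600"`.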
3. Backup restore test¶
A backup is useless if the restore does not work. Test it:
# Download the latest backup
aws s3 cp s3://my-backups/agent-latest.tar.gz /tmp/
# Extract into a temp folder
mkdir -p /tmp/restore-test
tar -xzf /tmp/agent-latest.tar.gz -C /tmp/restore-test
# Verify the contents
ls -la /tmp/restore-test/data/
jq '.notes | length' /tmp/restore-test/data/memory.json
# Cleanup
rm -rf /tmp/restore-test /tmp/agent-latest.tar.gz
If anything errors, fix the backup pipeline.
4. Disaster recovery drill¶
Scenario: your VPS gets suspended and you have to migrate to a new VPS within 1 hour. Can you do it?
Steps:
1. Sign up for a new VPS
2. SSH into the new VPS
3. Clone repo: git clone https://github.com/user/agent.git
4. Setup env: python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt
5. Restore data: aws s3 cp s3://my-backups/agent-latest.tar.gz . && tar -xzf agent-latest.tar.gz
6. Set up systemd
7. Start the service
8. Update DNS / Telegram webhook if needed
If you cannot do it within 1 hour, improve your documentation / scripts.
Yearly¶
1. Major upgrades¶
- Ubuntu LTS upgrade (22.04 → 24.04 → 26.04 when available)
- Python major version (3.12 → 3.13)
- Library major versions
Test first on staging or a cloned VPS.
2. SOUL.md philosophical review¶
SOUL.md has been active for a year. Questions to ask:
- Is the core philosophy still right? (you may have matured and need to adjust it)
- Is the identity still relevant? (you may want to rename it or change the tone)
- Are the boundaries strict enough, or loose enough?
- Are the autonomy levels still appropriate?
Rewrite if needed. Your bot may have evolved beyond its original use case.
3. Stack evaluation¶
- Is your LLM model still state of the art?
- Are your Python libraries still maintained?
- Is Telegram still the platform you use? (Discord might be a better fit now)
Weigh switching cost against benefit. If the benefit outweighs the cost, plan a migration.
Update best practices¶
Always backup before update¶
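A backup only helps if the archive is readable, so it is worth listing it right after creating it. A sketch under that assumption; `backup_agent` is a hypothetical helper, and it stores paths relative to a base directory so the archive can be extracted from that same directory:

```shell
# Hypothetical helper: archive the given paths and confirm the archive is readable.
# Usage: backup_agent <base-dir> <output.tar.gz> <path-relative-to-base>...
backup_agent() {
    local base_dir="$1" out="$2"
    shift 2
    tar -czf "$out" -C "$base_dir" "$@" || return 1
    # Listing the archive catches truncated or corrupt output.
    tar -tzf "$out" > /dev/null
}
```

Example: `backup_agent ~ ~/agent-backup-$(date +%Y%m%d).tar.gz agent/data agent/credentials`, restored later with `tar -xzf <archive> -C ~`.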
Update sequence¶
# 1. Save state
sudo systemctl stop kai-bot
# 2. Backup (paths stored relative to ~, so extracting from ~ puts files back in place)
tar -czf ~/agent-backup-$(date +%Y%m%d).tar.gz -C ~ agent/data agent/credentials
# 3. Update code
cd ~/agent
git pull
# 4. Update deps
source venv/bin/activate
pip install -r requirements.txt --upgrade
# 5. Run tests (if any)
# pytest
# 6. Start service
sudo systemctl start kai-bot
# 7. Verify
sleep 5
sudo systemctl is-active kai-bot
sudo journalctl -u kai-bot -n 20 --no-pager
Rollback if needed¶
# Stop current
sudo systemctl stop kai-bot
# Restore from backup
cd ~
tar -xzf agent-backup-20260514.tar.gz
# Revert code
cd ~/agent
git log --oneline -5
git checkout <previous-commit>
# Start
sudo systemctl start kai-bot
Anti-patterns¶
❌ Updating without a backup¶
Always back up first.
❌ Updating without testing¶
Test manually first (run python main.py in the foreground), then start it under systemd.
❌ Forgetting to monitor after an update¶
The bot can fail silently (running but not responding correctly). Check several times after an update.
❌ Editing SOUL.md without version control¶
Use git for SOUL.md:
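The original commands are missing here. A minimal sketch, assuming the agent folder is already a git repository (the update sequence runs `git pull` in it); `soul_commit` is a hypothetical name:

```shell
# Hypothetical helper: commit the current SOUL.md with a short message.
# Usage: soul_commit <repo-dir> <message>
soul_commit() {
    local repo="$1" msg="$2"
    git -C "$repo" add SOUL.md &&
    git -C "$repo" commit -m "SOUL.md: $msg" -- SOUL.md
}
```

Example: `soul_commit ~/agent "tighten boundaries section"`. One commit per edit keeps the history easy to scan.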
Rollback is easy:
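The rollback commands are also missing from the page. One way to do it, assuming SOUL.md is tracked in the agent repository; `soul_rollback` is a hypothetical name:

```shell
# Hypothetical helper: restore SOUL.md as it was at a given commit.
# Usage: soul_rollback <repo-dir> <commit>
soul_rollback() {
    local repo="$1" commit="$2"
    git -C "$repo" checkout "$commit" -- SOUL.md
}
```

Find the commit with `git -C ~/agent log --oneline -- SOUL.md`, then run `soul_rollback ~/agent <commit>`. The restored file is left staged, so commit it if you want the rollback recorded.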
Simple monitoring dashboard¶
Create a command that summarizes status:
cat > ~/agent/status.sh << 'EOF'
#!/bin/bash
echo "=== Kai Agent Status ==="
echo ""
echo "Service:"
systemctl is-active kai-bot
systemctl is-enabled kai-bot
echo ""
echo "Memory:"
free -h | head -2
echo ""
echo "Disk:"
df -h / | tail -1
echo ""
echo "Recent errors (24h):"
journalctl -u kai-bot -p err --since "24 hours ago" -q --no-pager | wc -l
echo ""
echo "Uptime:"
uptime
EOF
chmod +x ~/agent/status.sh
~/agent/status.sh
Or make it available via a Telegram command:
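The bot-side command handler is not shown on the page and depends on how the bot is written. As a stand-in that needs no changes to the bot, the status output can be pushed to the owner chat with the same `sendMessage` call the daily check uses. This is a sketch: `send_status` is a hypothetical name, and it assumes `TELEGRAM_BOT_TOKEN` and `OWNER_TELEGRAM_ID` are set as in ~/agent/.env.

```shell
# Hypothetical helper: run a status script and send its output to the owner chat.
# Usage: send_status [path-to-status-script]
send_status() {
    local script="${1:-$HOME/agent/status.sh}"
    local text
    text=$(bash "$script" 2>&1)
    # --data-urlencode escapes newlines and special characters in the report
    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
        -d "chat_id=${OWNER_TELEGRAM_ID}" \
        --data-urlencode "text=${text}"
}
```

Example: `source ~/agent/.env && send_status`. Wiring this to an actual chat command would mean having the bot's own handler run the script and reply with the output.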