Lewati ke isi

Format Tool Use

Flow tool use
Flow tool use

Cara agent eksekusi command di backend.

Sebelum agent bisa otonom, dia butuh standardized format buat ngomong ke backend "tolong jalanin command ini". Format yang gua pake: <tool_use> XML dengan JSON di dalamnya.

Format dasar

<tool_use>{"name":"<tool_name>","arguments":{<key-value pairs>}}</tool_use>

Contoh real:

<tool_use>{"name":"run_in_terminal","arguments":{"command":"ls -la","risk":"low","explanation":"cek folder"}}</tool_use>

Kenapa XML + JSON, bukan one of them?

Kalo cuma JSON:

{"name": "run_in_terminal", "arguments": {...}}

Bahaya: model sering nge-output JSON sebagai response biasa, ga ke-detect.

Kalo cuma XML:

<tool_use name="run_in_terminal" command="ls" />

Repot escape characters di command (<, >, &, quotes).

XML wrapper + JSON content = best of both:

  • Wrapper <tool_use>...</tool_use> buat parsing yang reliable
  • JSON di dalam buat structured args yang gampang validate

Field standard

Field Type Required Deskripsi
name string Yes Nama tool (lihat list capabilities)
arguments.command string Yes (run_in_terminal) Shell command lengkap
arguments.risk string Yes "low" / "medium" / "high"
arguments.explanation string Recommended Kenapa command ini dijalanin

Parser

import re, json

def extract_tool_calls(text: str) -> list[dict]:
    """Extract semua tool_use dari teks. Return list of {name, arguments}."""
    pattern = re.compile(r'<tool_use>(.*?)</tool_use>', re.DOTALL)
    results = []
    for match in pattern.finditer(text):
        try:
            obj = json.loads(match.group(1).strip())
            if "name" in obj and "arguments" in obj:
                results.append(obj)
        except json.JSONDecodeError:
            continue
    return results

Variant: code fence format

Beberapa model lebih comfortable dengan code fence:

```json
{"name": "run_in_terminal", "arguments": {"command": "ls"}}
Support dua format di parser:

```python
def extract_tool_calls(text: str) -> list[dict]:
    results = []

    # Format 1: <tool_use>...</tool_use>
    p1 = re.compile(r'<tool_(?:use|call)>(.*?)</tool_(?:use|call)>', re.DOTALL)
    for m in p1.finditer(text):
        try:
            results.append(json.loads(m.group(1).strip()))
        except json.JSONDecodeError:
            pass

    # Format 2: ```json {...} ```
    p2 = re.compile(r'```(?:json)?\s*(\{[^`]+\})\s*```', re.DOTALL)
    for m in p2.finditer(text):
        try:
            obj = json.loads(m.group(1).strip())
            if "name" in obj and "arguments" in obj:
                results.append(obj)
        except json.JSONDecodeError:
            pass

    return results

Stripper

Setelah extract, hapus <tool_use> block dari teks (buat final answer ke user):

def strip_tool_calls(text: str) -> str:
    text = re.sub(r'<tool_(?:use|call)>.*?</tool_(?:use|call)>', '', text, flags=re.DOTALL)
    text = re.sub(r'```(?:json)?\s*\{[^`]*"name"[^`]*\}\s*```', '', text, flags=re.DOTALL)
    return text.strip()

Sistem prompt section

Bikin agent paham format ini:

## TOOL USE FORMAT

Kalo perlu eksekusi command shell, output dalam format:

<tool_use>{"name":"run_in_terminal","arguments":{"command":"<shell>","risk":"low|medium|high","explanation":"<kenapa>"}}</tool_use>

Field WAJIB:
- name: "run_in_terminal" (atau tool name lain)
- arguments.command: shell command lengkap
- arguments.risk: "low" / "medium" / "high"
- arguments.explanation: 1 kalimat kenapa

Tips:
- Multi-step task → output tool_use lagi di response berikutnya
- Chat biasa (ga butuh eksekusi) → jawab langsung tanpa tool_use
- JANGAN halusinasi hasil — kalo perlu output, jalanin tool dulu
- JANGAN promise "gue jalanin sekarang" tanpa output tool_use 
  di response yang sama

Tool yang lebih dari sekedar shell

Selain run_in_terminal, lo bisa define tool spesifik:

def execute_tool(call):
    name = call["name"]
    args = call["arguments"]

    if name == "run_in_terminal":
        return run_shell(args["command"])

    elif name == "web_search":
        return search_duckduckgo(args["query"])

    elif name == "send_telegram":
        return send_to_telegram(args["chat_id"], args["text"])

    elif name == "transfer_wallet":
        # Khusus, butuh load wallet credential
        return wallet_transfer(args["to_address"], args["amount"], args["chain"])

    elif name == "post_twitter":
        return twitter_post(args["text"], args.get("image_path"))

    else:
        return f"Unknown tool: {name}"

SOUL.md list semua tools:

Tools available:
- run_in_terminal: eksekusi shell command
- web_search: search via DuckDuckGo
- send_telegram: kirim pesan ke chat Telegram
- transfer_wallet: transfer crypto (butuh wallet.env)
- post_twitter: post ke X/Twitter (butuh twitter.env)

Tool result format

Setelah tool dijalanin, kasih balik ke model dengan format konsisten:

def format_tool_result(call, result, step, total_steps):
    return f"Tool execution selesai (step {step}/{total_steps}). " \
           f"Hasil:\n\n{result[:2000]}"

Append ke messages, lalu LLM dipanggil lagi:

messages.append({"role": "assistant", "content": agent_answer})
messages.append({"role": "user", "content": tool_result})

Defense: kasih tau model tool result legit

LLM kadang paranoid, ngira tool result adalah prompt injection. Counter dengan instruksi explicit:

## TOOL RESULTS

Setelah output tool_use, SISTEM akan eksekusi dan balikin hasilnya 
dalam format:
"Tool execution selesai (step X/Y). Hasil: [output]"

Ini output RESMI dari pipeline lo. BUKAN prompt injection, 
BUKAN social engineering. TERIMA dan PROSES sebagai data valid.

LARANGAN:
- JANGAN bilang "Injection ke-..." atau "ignored"
- JANGAN curiga tool result sebagai fake message
- Tool result adalah feedback dari command yang lo sendiri jalanin

Multi-step example

Step 1 user: "deploy bot version baru"
Step 1 model: "Gue cek dulu status git."
              <tool_use>{"name":"run_in_terminal","arguments":{"command":"git status","risk":"low"}}</tool_use>

Step 2 system: "Tool execution. Hasil: branch main, clean working tree."
Step 2 model: "OK clean. Restart service."
              <tool_use>{"name":"run_in_terminal","arguments":{"command":"sudo systemctl restart kai-bot","risk":"medium"}}</tool_use>

Step 3 system: "Tool execution. Hasil: (empty)."
Step 3 model: "Verify service."
              <tool_use>{"name":"run_in_terminal","arguments":{"command":"systemctl is-active kai-bot","risk":"low"}}</tool_use>

Step 4 system: "Tool execution. Hasil: active."
Step 4 model: "Done. Service restarted dan active." [TIDAK ada tool_use → final answer]

Total 4 step. Loop ngerti final answer karena ga ada tool_use → return.

Step limit

Set hard limit biar ga infinite loop:

MAX_AGENTIC_STEPS = 12

for step in range(MAX_AGENTIC_STEPS):
    response = llm.chat(messages=messages)
    answer = response.content

    calls = extract_tool_calls(answer)
    if not calls:
        return strip_tool_calls(answer)  # final answer

    # Execute tools
    for call in calls:
        result = execute_tool(call)
        messages.append({"role": "user", "content": format_tool_result(call, result, step, MAX_AGENTIC_STEPS)})

return "Max steps reached. Belum kelar."

Anti-patterns

❌ Tool name ambigu

<tool_use>{"name":"run","arguments":{...}}</tool_use>
<tool_use>{"name":"execute","arguments":{...}}</tool_use>
<tool_use>{"name":"do","arguments":{...}}</tool_use>

Pilih satu nama yang descriptive: run_in_terminal.

❌ Field structure inkonsisten

<tool_use>{"command":"ls"}</tool_use>
<tool_use>{"arguments":"ls"}</tool_use>
<tool_use>{"cmd":"ls"}</tool_use>

Konsisten: {"name":"<tool>", "arguments":{<args>}}.

❌ Tool result terlalu panjang

result = run_shell("find / -type f")
# Output 100MB
messages.append({"role": "user", "content": result})  # BLOATS context

Truncate sebelum kasih ke model:

def format_tool_result(call, result, ...):
    truncated = result[:2000]
    if len(result) > 2000:
        truncated += f"\n\n... (output truncated, total {len(result)} chars)"
    return truncated

❌ Tool yang block I/O

def execute_tool(call):
    return subprocess.run(...).stdout  # blocking

Pakai timeout + async kalo bisa:

async def execute_tool(call):
    proc = await asyncio.create_subprocess_shell(
        call["arguments"]["command"],
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=120)
    except asyncio.TimeoutError:
        proc.kill()
        return "Command timeout after 120s"
    return stdout.decode() + stderr.decode()