# API Orchestration and Tool Selection

## Introduction
When an agent has access to a large number of tools, selecting the right tools, invoking them in the correct order, and handling errors along the way become critical engineering challenges. This section explores tool selection algorithms, orchestration patterns, and error handling strategies.
## The Tool Selection Problem

### Challenges
- Too many tools: With hundreds of tools, LLMs struggle to process all tool definitions within context
- Semantic overlap: Multiple tools with similar functionality require precise differentiation
- Context consumption: Each tool definition consumes hundreds of tokens
- Selection accuracy: Selecting the wrong tool leads to task failure
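To make the context-consumption point concrete, here is a rough back-of-envelope sketch. The 4-characters-per-token ratio is a common rule of thumb rather than a tokenizer measurement, and `estimate_tool_context_cost` is a hypothetical helper, not part of any framework:

```python
import json

# Rough sketch: estimate how many tokens a set of tool definitions consumes.
# The 4-chars-per-token ratio is an approximation, not a tokenizer measurement.
def estimate_tool_context_cost(tool_definitions, chars_per_token=4):
    total_chars = sum(len(json.dumps(t)) for t in tool_definitions)
    return total_chars // chars_per_token

# 200 modest tool definitions already consume tens of thousands of tokens
tools = [
    {
        "name": f"tool_{i}",
        "description": "Does one specific thing, described in a few sentences. " * 5,
        "parameters": {"type": "object", "properties": {"query": {"type": "string"}}},
    }
    for i in range(200)
]
print(estimate_tool_context_cost(tools))
```

With an 8k-token budget, even this crude estimate shows why loading every definition quickly crowds out the conversation itself.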
## Tool Selection Strategies

### 1. Full Loading (Suitable for Few Tools)

```python
# When tools < 20, pass all directly to the LLM
response = llm.chat(
    messages=messages,
    tools=all_tools,  # All tool definitions
)
```
### 2. Semantic Routing

Pre-filter relevant tools based on the semantic content of the user query:

```python
class ToolRouter:
    def __init__(self, tools, embedding_model):
        self.tools = tools
        self.embed = embedding_model
        # Pre-compute embeddings for all tool descriptions
        self.tool_embeddings = {
            tool["name"]: self.embed(tool["description"])
            for tool in tools
        }

    def select_tools(self, query, top_k=5):
        """Select the most relevant tools based on semantic similarity"""
        query_embedding = self.embed(query)
        scores = {}
        for name, emb in self.tool_embeddings.items():
            scores[name] = cosine_similarity(query_embedding, emb)
        # Return the top_k most relevant tools
        sorted_tools = sorted(scores.items(), key=lambda x: x[1], reverse=True)
        selected_names = [name for name, score in sorted_tools[:top_k]]
        return [t for t in self.tools if t["name"] in selected_names]

# Usage
router = ToolRouter(all_tools, embedding_model)
relevant_tools = router.select_tools("Check the weather in Beijing for me")
response = llm.chat(messages=messages, tools=relevant_tools)
```
### 3. Category Routing

Use an LLM to classify first, then load tools from the corresponding category:

```python
TOOL_CATEGORIES = {
    "information_retrieval": ["web_search", "knowledge_base_query", "database_query"],
    "data_analysis": ["execute_python", "create_chart", "statistical_test"],
    "communication": ["send_email", "send_slack", "create_document"],
    "file_operations": ["read_file", "write_file", "list_directory"],
}

def category_routing(query, llm):
    """Classify first, then select tools"""
    category = llm.classify(
        f"Classify the following query into a tool category: {list(TOOL_CATEGORIES.keys())}\n"
        f"Query: {query}\nCategory:"
    )
    return TOOL_CATEGORIES.get(category, [])
```
### 4. Two-Stage Selection

Use a lightweight model for initial filtering, then a stronger model for precise selection:

```python
def two_stage_selection(query, all_tools):
    # Stage 1: lightweight model for quick filtering (low cost)
    candidates = light_model.select(
        query=query,
        tools=[
            {"name": t["name"], "description": t["description"][:100]}
            for t in all_tools
        ],
        top_k=10,
    )
    # Stage 2: strong model for precise selection (with full definitions)
    selected = strong_model.select(
        query=query,
        tools=[t for t in all_tools if t["name"] in candidates],
    )
    return selected
```
## Tool Chaining Patterns

### Sequential Chain

```python
# Tool A's output becomes Tool B's input
async def sequential_chain(query):
    # Step 1: Search
    search_results = await search_tool(query)
    # Step 2: Query the database with entities extracted from the search results
    db_results = await database_query(extract_entities(search_results))
    # Step 3: Analyze
    analysis = await code_executor(generate_analysis_code(db_results))
    return analysis
```
### Parallel Fan-out

```python
import asyncio

async def fan_out(query):
    """Call multiple tools in parallel and aggregate the results"""
    tasks = [
        web_search(query),
        knowledge_base_search(query),
        database_query(query),
    ]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    # Filter out errors
    valid_results = [r for r in results if not isinstance(r, Exception)]
    return merge_results(valid_results)
```
### Conditional Branching

```python
async def conditional_routing(query, context):
    """Choose a tool chain based on the classified intent"""
    intent = classify_intent(query)
    if intent == "factual_question":
        # Knowledge retrieval chain
        docs = await knowledge_base_search(query)
        return generate_answer(query, docs)
    elif intent == "data_analysis":
        # Data analysis chain
        data = await database_query(extract_sql(query))
        code = generate_analysis_code(data)
        return await code_executor(code)
    elif intent == "action_request":
        # Action execution chain
        plan = plan_actions(query)
        results = []
        for step in plan:
            result = await execute_action(step)
            results.append(result)
            if result.get("error"):
                break  # Stop the chain at the first failed step
        return summarize_results(results)
    else:
        # Unrecognized intent: answer directly without tools
        return generate_answer(query, [])
```
### Recursive Tool Use

The agent discovers during execution that additional tool calls are needed:

```python
async def recursive_tool_use(query, tools, llm, depth=0, max_depth=5):
    """Recursive tool invocation"""
    if depth >= max_depth:
        return "Maximum recursion depth reached"
    response = await llm.chat(
        messages=[{"role": "user", "content": query}],
        tools=tools,
    )
    if not response.tool_calls:
        return response.content
    # Execute tool calls
    results = []
    for tc in response.tool_calls:
        result = await execute_tool(tc.name, tc.arguments)
        results.append(result)
    # Feed results back to the LLM, potentially triggering more tool calls
    follow_up = format_results(results)
    return await recursive_tool_use(follow_up, tools, llm, depth + 1, max_depth)
```
## Error Handling and Retries

### Error Classification
| Error Type | Example | Handling Strategy |
|---|---|---|
| Parameter error | Missing required parameter | Let LLM correct the parameters |
| Authentication failure | Expired API key | Refresh credentials and retry |
| Rate limiting | 429 Too Many Requests | Exponential backoff retry |
| Service unavailable | 500 Server Error | Wait and retry, or degrade |
| Logic error | Query returns no results | Rewrite query or switch tools |
| Timeout | Execution takes too long | Retry with timeout or simplify request |
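The taxonomy above can be sketched as a dispatcher that maps a failed call to a handling strategy. `ToolError` and the strategy names here are illustrative assumptions, not a specific framework's API:

```python
# Illustrative sketch of the error taxonomy above; ToolError and the strategy
# names are assumptions for this example, not a real framework's API.
class ToolError(Exception):
    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code

def classify_error(error):
    """Map a tool-call error to one of the handling strategies in the table."""
    code = getattr(error, "status_code", None)
    if code == 400:
        return "fix_parameters"       # Parameter error: let the LLM correct them
    if code in (401, 403):
        return "refresh_credentials"  # Authentication failure
    if code == 429:
        return "backoff_retry"        # Rate limiting
    if code in (500, 502, 503, 504):
        return "wait_and_retry"       # Service unavailable: retry or degrade
    if isinstance(error, TimeoutError):
        return "retry_with_timeout"   # Timeout: retry or simplify the request
    return "rewrite_or_switch_tool"   # Logic error / anything else
```

Routing errors through a single classifier like this keeps the retry logic below from having to special-case every tool.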
### Retry Strategies

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
)
async def resilient_tool_call(tool_name, arguments):
    """Tool call with retry logic"""
    return await execute_tool(tool_name, arguments)


class SmartRetry:
    def __init__(self, llm, max_attempts=3):
        self.llm = llm
        self.max_attempts = max_attempts

    async def execute_with_recovery(self, tool_name, arguments, error=None, attempt=0):
        """Intelligent error recovery"""
        if error:
            # Let the LLM analyze the error and correct the arguments
            fix = await self.llm.generate(
                f"Tool {tool_name} call failed.\n"
                f"Arguments: {arguments}\n"
                f"Error: {error}\n"
                f"Please correct the arguments or suggest an alternative."
            )
            arguments = fix.get("corrected_arguments", arguments)
        try:
            return await execute_tool(tool_name, arguments)
        except Exception as e:
            # Bound the recursion so a persistently failing tool cannot loop forever
            if self.is_retryable(e) and attempt < self.max_attempts:
                return await self.execute_with_recovery(
                    tool_name, arguments, str(e), attempt + 1
                )
            raise

    def is_retryable(self, error):
        retryable_codes = [429, 500, 502, 503, 504]
        return getattr(error, "status_code", None) in retryable_codes
```
### Fallback Strategies

```python
class ToolWithFallback:
    def __init__(self, primary_tool, fallback_tools):
        self.primary = primary_tool
        self.fallbacks = fallback_tools

    async def execute(self, arguments):
        """Try fallback tools when the primary tool fails"""
        try:
            return await self.primary(arguments)
        except Exception as primary_error:
            for fallback in self.fallbacks:
                try:
                    return await fallback(arguments)
                except Exception:
                    continue  # Try the next fallback
            raise primary_error

# Example: fallback chain for search tools
search = ToolWithFallback(
    primary_tool=google_search,
    fallback_tools=[bing_search, brave_search, duckduckgo_search],
)
```
## Tool Usage Decision Framework

### When to Use Which Tool

```python
TOOL_DECISION_TREE = """
User query →
├─ Need latest information? → web_search
├─ Need internal knowledge? → knowledge_base_search
├─ Need precise computation? → code_interpreter
├─ Need data analysis? → code_interpreter + data_tools
├─ Need to perform an action? →
│   ├─ Send a message? → email/slack_tool
│   ├─ File operation? → file_tools
│   └─ System operation? → bash_tool
├─ Need multi-step reasoning? → Combine multiple tools
└─ Pure knowledge Q&A? → No tools needed, answer directly
"""
```
## Practical Recommendations

### Orchestration Checklist
- [ ] Implement tool routing to reduce interference from irrelevant tools
- [ ] Define clear usage scenarios and exclusion conditions for each tool
- [ ] Implement error handling and retry mechanisms
- [ ] Set timeouts and resource limits for tool calls
- [ ] Add logging and monitoring for tool usage
- [ ] Add human confirmation steps for high-risk tools
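For the timeout item in the checklist, a minimal sketch using `asyncio.wait_for`; the slow tool here is a placeholder standing in for any real tool coroutine:

```python
import asyncio

# Sketch of per-call timeout enforcement; the slow tool is a placeholder.
async def call_with_timeout(tool_coro, timeout_s=30.0):
    """Run a tool call, converting a hang into a handled timeout error."""
    try:
        return await asyncio.wait_for(tool_coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"error": f"tool call exceeded {timeout_s}s timeout"}

async def main():
    async def slow_tool():
        await asyncio.sleep(10)  # Simulates a hung tool
        return {"result": "done"}
    result = await call_with_timeout(slow_tool(), timeout_s=0.1)
    print(result["error"])  # prints "tool call exceeded 0.1s timeout"

asyncio.run(main())
```

Returning a structured error instead of raising lets the agent loop feed the timeout back to the LLM as a normal tool result.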
### Performance Optimization
- Call independent tools in parallel to reduce latency
- Cache results from frequently used tools
- Use lightweight models for pre-filtering
- Truncate tool results to avoid wasting context space
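The caching recommendation can be sketched with a small TTL cache keyed on tool name plus arguments; the 60-second TTL is an arbitrary assumption, and `ToolResultCache` is a hypothetical helper, not a library class:

```python
import json
import time

# Minimal TTL cache for tool results; the 60-second default is an assumption.
class ToolResultCache:
    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}

    def _key(self, tool_name, arguments):
        # JSON with sorted keys makes equivalent argument dicts key identically
        return (tool_name, json.dumps(arguments, sort_keys=True))

    def get(self, tool_name, arguments):
        entry = self._store.get(self._key(tool_name, arguments))
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            return None  # Expired: caller should re-run the tool
        return value

    def put(self, tool_name, arguments, result):
        self._store[self._key(tool_name, arguments)] = (result, time.monotonic())
```

A TTL keeps cached weather or search results from going stale, while idempotent lookups within a single agent run hit the cache instead of the network.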
## Further Reading
- Framework Selection Overview - How agent frameworks handle tool orchestration
- Qin, Y., et al. (2024). "Tool Learning with Large Language Models: A Survey"
- Hao, S., et al. (2023). "ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings"