Advanced Prompt Techniques

1. Self-Consistency

1.1 Core Idea

Self-Consistency improves reasoning accuracy by sampling several independent chain-of-thought reasoning paths (at a nonzero temperature, so the paths differ) and taking a majority vote over their final answers.

1.2 Workflow

Question → [CoT Reasoning Path 1 → Answer A]
         → [CoT Reasoning Path 2 → Answer B]
         → [CoT Reasoning Path 3 → Answer A]
         → [CoT Reasoning Path 4 → Answer A]
         → [CoT Reasoning Path 5 → Answer C]

Majority Vote → Answer A (3/5) ✓

1.3 Implementation

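import re
from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_final_answer(text):
    """Assumed convention: the prompt asks the model to end with 'Answer: <value>'."""
    matches = re.findall(r"Answer:\s*(.+)", text)
    return matches[-1].strip() if matches else text.strip()

def self_consistency(prompt, n_samples=5, temperature=0.7):
    """Sample several reasoning paths and return the majority answer."""
    answers = []
    for _ in range(n_samples):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,  # keep this above 0 so the samples differ
        )
        # Extract the final answer from the reasoning text
        answer = extract_final_answer(response.choices[0].message.content)
        answers.append(answer)

    # Majority vote; ties go to the answer that was seen first
    counter = Counter(answers)
    return counter.most_common(1)[0][0]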

1.4 Applicability and Limitations

Suitable for: Math problems, logical reasoning, multiple choice — tasks with clear-cut answers

Limitations:

Higher cost (multiple API calls)
Limited effectiveness for open-ended generation tasks
Requires a reliable answer extraction mechanism
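
The last limitation is often the practical bottleneck: "42", "42.0" and " 42. " must count as the same vote. The following is a minimal sketch of answer normalization plus a vote that also reports the agreement ratio. The helper names (normalize_answer, vote) and the normalization rules are illustrative assumptions, not part of the original method.

```python
import re
from collections import Counter

def normalize_answer(raw: str) -> str:
    """Reduce a raw answer string to a canonical form for voting."""
    text = raw.strip().lower().rstrip(".")
    numeric = text.replace(",", "")
    # Treat "42", "42.0" and "1,000" style strings as plain numbers.
    if re.fullmatch(r"-?\d+(\.\d+)?", numeric):
        value = float(numeric)
        return str(int(value)) if value.is_integer() else repr(value)
    return text

def vote(raw_answers: list[str]) -> tuple[str, float]:
    """Return the majority answer and the fraction of samples that agree with it."""
    if not raw_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize_answer(a) for a in raw_answers)
    answer, hits = counts.most_common(1)[0]
    return answer, hits / len(raw_answers)

answer, agreement = vote(["42", "42.0", " 42. ", "41", "forty-two"])
print(answer, agreement)
```

A low agreement ratio can serve as a signal to sample more paths or to flag the question for review; where to set that threshold is a judgment call for the task at hand.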

2. Tree of Thoughts (ToT)

2.1 Core Idea

Tree of Thoughts organizes the reasoning process as a tree: each node generates several candidate "thought" branches, an evaluator scores them, and the search keeps the most promising branches while pruning or backtracking from weak ones.

2.2 Comparison with CoT

Aspect	CoT	ToT
Reasoning structure	Linear chain	Tree branches
Exploration	Single path	Multiple paths in parallel
Backtracking	Not supported	Supported
Evaluation	Final result only	Intermediate steps evaluable

2.3 Implementation Framework

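class ThoughtNode:
    """Minimal tree node assumed by the solver below."""
    def __init__(self, content, depth, score=0.0):
        self.content = content
        self.depth = depth
        self.score = score
        self.children = []

    def add_child(self, child):
        self.children.append(child)

class TreeOfThoughts:
    def __init__(self, model, evaluator):
        self.model = model
        self.evaluator = evaluator

    def generate_thoughts(self, node, n):
        # Assumption: self.model is a callable (text, n) -> list of n candidate thoughts
        return self.model(node.content, n)

    def solve(self, problem, max_depth=3, branch_factor=3):
        """BFS-style ToT solver (beam search with beam width = branch_factor)"""
        root = ThoughtNode(problem, depth=0)
        current_level = [root]

        for depth in range(max_depth):
            next_level = []
            for node in current_level:
                # Generate multiple thought branches
                thoughts = self.generate_thoughts(node, branch_factor)
                for thought in thoughts:
                    # Evaluate the promise of each thought
                    score = self.evaluator.evaluate(thought)
                    child = ThoughtNode(thought, depth=depth+1, score=score)
                    node.add_child(child)
                    next_level.append(child)

            if not next_level:
                break  # no further thoughts were generated

            # Keep only the most promising nodes for further expansion
            next_level.sort(key=lambda x: x.score, reverse=True)
            current_level = next_level[:branch_factor]

        # Return the best node on the final level
        return max(current_level, key=lambda x: x.score)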
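
To watch the search run end to end without an LLM, the following self-contained sketch replaces the model and evaluator with toy stand-ins: propose appends a letter to the partial solution, and evaluate counts how many positions match a hidden target. The function beam_search_thoughts, the separate beam_width parameter, and both stand-ins are hypothetical illustrations; in practice both callables would be LLM calls.

```python
from dataclasses import dataclass, field

@dataclass
class Thought:
    state: str
    depth: int
    score: float = 0.0
    children: list = field(default_factory=list)

def beam_search_thoughts(problem, propose, evaluate,
                         max_depth=3, branch_factor=3, beam_width=2):
    """Expand the frontier level by level, keeping the beam_width best thoughts."""
    frontier = [Thought(problem, depth=0)]
    for depth in range(max_depth):
        candidates = []
        for node in frontier:
            for state in propose(node.state, branch_factor):
                child = Thought(state, depth + 1, evaluate(state))
                node.children.append(child)
                candidates.append(child)
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda n: n.score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=lambda n: n.score)

TARGET = "abc"  # toy goal standing in for a solved problem

def propose(state, k):
    # Toy generator: extend the partial solution by one letter.
    return [state + ch for ch in "abc"[:k]]

def evaluate(state):
    # Toy evaluator: number of positions that already match the target.
    return sum(a == b for a, b in zip(state, TARGET))

best = beam_search_thoughts("", propose, evaluate)
print(best.state, best.score)
```

Separating beam_width from branch_factor makes the cost visible: each level issues at most beam_width * branch_factor evaluations.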

2.4 Prompt Example

Problem: {problem}

Please generate 3 different solution approaches (first step only):

Approach 1:
Approach 2:
Approach 3:

Evaluate the feasibility of each approach (1-10):

2.5 Applicable Scenarios

Creative writing (exploring different narrative directions)
Mathematical proofs (trying different proof strategies)
Planning problems (exploring different action plans)
Code design (comparing different architectural approaches)

3. RAG-Augmented Prompts

3.1 Basic Pattern

Inject retrieved context into the prompt to augment LLM knowledge:

Answer the question based on the following references. If the references do not contain relevant information, state this honestly.

References:
---
{retrieved_context_1}
---
{retrieved_context_2}
---
{retrieved_context_3}

Question: {user_question}

Please cite specific content from the references to support your answer.
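
A minimal sketch of filling this template from a list of retrieved passages follows. The function name build_rag_prompt and the [1], [2] passage numbering are assumptions added for illustration; numbering simply gives the model something concrete to cite.

```python
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    """Assemble the basic RAG prompt from retrieved passages."""
    header = (
        "Answer the question based on the following references. "
        "If the references do not contain relevant information, state this honestly."
    )
    # Number each passage so the model can cite it as [1], [2], ...
    references = "\n---\n".join(
        f"[{i}] {text.strip()}" for i, text in enumerate(contexts, start=1)
    )
    return (
        f"{header}\n\nReferences:\n---\n{references}\n\n"
        f"Question: {question}\n\n"
        "Please cite specific content from the references to support your answer."
    )

print(build_rag_prompt(
    "What is the default timeout?",
    ["The client times out after 30 seconds by default.", "Retries are disabled by default."],
))
```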

3.2 Advanced RAG Prompt Patterns

Multi-step reasoning RAG:

Based on the following materials, please:
1. First summarize the key points of each material
2. Analyze the connections between materials
3. Synthesize an answer to the user's question
4. Identify any information gaps that may need supplementation

Materials: {contexts}
Question: {question}

RAG with confidence levels:

Answer the question based on the provided materials. Label each claim with a confidence level:
- [High]: Directly supported by the materials
- [Medium]: Can be inferred from the materials
- [Low]: Limited material support, partially based on general knowledge

Materials: {contexts}
Question: {question}
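
Labeled output is most useful when downstream code can act on it, for example by routing [Low] claims to a human reviewer. The sketch below groups claims by label; it assumes the model writes one claim per line with the label on that line, which is a formatting assumption the prompt would need to enforce.

```python
import re

LABEL = re.compile(r"\[(High|Medium|Low)\]")

def split_by_confidence(answer: str) -> dict[str, list[str]]:
    """Group labeled claims by confidence level; unlabeled lines are ignored."""
    groups = {"High": [], "Medium": [], "Low": []}
    for line in answer.splitlines():
        match = LABEL.search(line)
        if match:
            # Drop the label and any leftover list punctuation around the claim.
            claim = LABEL.sub("", line).strip(" -:\t")
            groups[match.group(1)].append(claim)
    return groups

sample = """- [High]: The API is rate limited.
The cache TTL is likely 60s [Medium]
- [Low]: Most deployments probably use Redis."""
print(split_by_confidence(sample))
```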

3.3 Integration with Memory Systems

RAG can serve as an external memory system for AI Agents. See RAG-Augmented Memory for details.

4. DSPy: Programmatic Prompt Optimization

4.1 DSPy Overview

DSPy (Declarative Self-improving Python) is a framework that transforms prompt engineering into a programming problem, defining LLM programs declaratively and optimizing prompts automatically.

4.2 Core Concepts

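Signature: A declarative specification of a module's input and output fields
Module: A composable unit that calls the LLM according to a signature
Teleprompter/Optimizer: An algorithm that tunes prompts and demonstrations automatically ("teleprompter" is the older name)
Metric: A function that scores a prediction against an example and guides optimization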

4.3 Basic Usage

import dspy

# Configure the LLM. Recent DSPy releases use dspy.LM and dspy.configure;
# older releases used dspy.OpenAI(model="gpt-4", ...) with dspy.settings.configure.
lm = dspy.LM("openai/gpt-4", max_tokens=300)
dspy.configure(lm=lm)

# Define Signature
class SentimentClassification(dspy.Signature):
    """Classify sentiment of a text."""
    text = dspy.InputField(desc="Text to classify")
    sentiment = dspy.OutputField(desc="positive, negative, or neutral")

# Use Module
classify = dspy.Predict(SentimentClassification)
result = classify(text="This product exceeded my expectations!")
print(result.sentiment)  # e.g. "positive"

4.4 Automatic Optimization

from dspy.teleprompt import BootstrapFewShot

# Prepare training data; with_inputs marks which fields are inputs,
# so the remaining fields (here: sentiment) are treated as labels
trainset = [
    dspy.Example(text="Great product!", sentiment="positive").with_inputs("text"),
    dspy.Example(text="Terrible experience.", sentiment="negative").with_inputs("text"),
    # ...more samples
]

# Define evaluation metric
def accuracy_metric(example, pred, trace=None):
    return example.sentiment == pred.sentiment

# Automatic optimization
teleprompter = BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=4)
optimized_classify = teleprompter.compile(classify, trainset=trainset)

4.5 Advantages of DSPy

Programmable: Prompt logic expressed in code, testable and versionable
Auto-optimized: Reduces manual prompt tuning, given a metric and training examples
Composable: Modular design supports complex pipelines
LLM-agnostic: The same program can be recompiled for a different model instead of rewriting prompts by hand

5. Automatic Prompt Optimization

5.1 APE (Automatic Prompt Engineer)

APE searches for a high-performing prompt automatically:

1. Given a task description and evaluation dataset
2. Use an LLM to generate candidate prompts
3. Test each prompt on the evaluation set
4. Select the best-performing prompt
5. Iterate and improve
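
Steps 3 and 4 of this loop can be sketched as follows. The candidate list is fixed here, whereas in APE an LLM would propose the candidates (step 2); fake_model is a hypothetical stand-in for a real model call so that the example runs offline.

```python
def ape_select(candidates, run_prompt, eval_set):
    """Score each candidate prompt on eval_set; return (best_prompt, scores)."""
    scores = {}
    for prompt in candidates:
        correct = sum(run_prompt(prompt, x) == y for x, y in eval_set)
        scores[prompt] = correct / len(eval_set)
    best = max(scores, key=scores.get)
    return best, scores

def fake_model(prompt, text):
    # Hypothetical stand-in for an LLM: only prompts that mention
    # "negative" get negative reviews right.
    if "bad" in text and "negative" in prompt:
        return "negative"
    return "positive"

eval_set = [
    ("good stuff", "positive"),
    ("bad stuff", "negative"),
    ("really bad", "negative"),
    ("nice", "positive"),
]
candidates = ["Classify the sentiment", "Answer positive or negative"]

best, scores = ape_select(candidates, fake_model, eval_set)
print(best, scores)
```

Step 5 would feed the best candidates back to the proposing LLM and repeat the selection on a fresh batch.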

5.2 OPRO (Optimization by PROmpting)

Leveraging the LLM itself as an optimizer:

Here are some prompts and their performance scores:

Prompt: "Classify the sentiment" → Score: 0.72
Prompt: "Determine if positive or negative" → Score: 0.78
Prompt: "Analyze the emotional tone" → Score: 0.75

Based on the above information, generate a new prompt that might perform better.
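
A meta-prompt like this is typically rebuilt on every iteration from the running history of scored prompts. The sketch below renders that history; build_opro_prompt and the top_k cutoff are illustrative assumptions, and listing prompts from worst to best is one common ordering rather than a requirement.

```python
def build_opro_prompt(history: list[tuple[str, float]], top_k: int = 5) -> str:
    """Render scored prompts into an OPRO-style meta-prompt."""
    # Keep the top_k best prompts, listed worst-to-best so the strongest come last.
    kept = sorted(history, key=lambda item: item[1])[-top_k:]
    lines = [f'Prompt: "{p}" → Score: {s:.2f}' for p, s in kept]
    return (
        "Here are some prompts and their performance scores:\n\n"
        + "\n".join(lines)
        + "\n\nBased on the above information, generate a new prompt "
        "that might perform better."
    )

history = [
    ("Classify the sentiment", 0.72),
    ("Determine if positive or negative", 0.78),
    ("Analyze the emotional tone", 0.75),
]
print(build_opro_prompt(history))
```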

5.3 Practical Prompt Optimization Workflow

1. Define evaluation metrics and test set
2. Write an initial prompt
3. Evaluate on the test set
4. Analyze failure cases
5. Modify the prompt (manual + automatic)
6. Repeat steps 3-5 until requirements are met
7. A/B test before production rollout
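
Steps 3 and 4 benefit from a harness that returns the failing cases alongside the score, since the failures drive the next prompt revision. A minimal sketch, assuming exact-match scoring and a predict callable that wraps the prompt under test (keyword_classifier is a hypothetical stand-in for that call):

```python
def evaluate(predict, test_set):
    """Return (accuracy, failures) for a predict function on labeled examples."""
    failures = []
    for text, expected in test_set:
        got = predict(text)
        if got != expected:
            failures.append({"input": text, "expected": expected, "got": got})
    accuracy = 1 - len(failures) / len(test_set)
    return accuracy, failures

def keyword_classifier(text):
    # Hypothetical stand-in for "LLM + current prompt".
    return "negative" if "bad" in text else "positive"

test_set = [
    ("good value", "positive"),
    ("bad battery", "negative"),
    ("not good at all", "negative"),  # negation trips the stand-in
    ("works fine", "positive"),
]
accuracy, failures = evaluate(keyword_classifier, test_set)
print(accuracy)
for failure in failures:
    print(failure)
```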

6. Meta-Prompting

6.1 Concept

Meta-prompting uses LLMs to generate, evaluate, and improve prompts: in essence, "prompts for writing prompts."

6.2 Prompt Generator

You are a prompt engineering expert. The user will describe a task, and you need to:

1. Analyze the key elements of the task
2. Generate 3 prompts in different styles
3. Evaluate the strengths and weaknesses of each prompt
4. Recommend the best prompt

Task description: {task_description}

Please generate prompts suitable for use with GPT-4.

6.3 Prompt Evaluator

Please evaluate the quality of the following prompt:

Prompt: {prompt_to_evaluate}
Target task: {task_description}

Evaluation dimensions:
1. Clarity (1-10): Are instructions unambiguous?
2. Completeness (1-10): Does it cover all necessary information?
3. Formatting (1-10): Is the output format clear?
4. Robustness (1-10): Can it handle abnormal inputs?
5. Efficiency (1-10): Token usage efficiency

Overall score and improvement suggestions:
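
To use these ratings programmatically, the evaluator's reply has to be parsed. The sketch below assumes the reply repeats each dimension name followed by a colon and a number (for example "Clarity: 8" or "Clarity (1-10): 8"); that reply format is an assumption, and the unweighted mean is just one way to form an overall score.

```python
import re

DIMENSIONS = ("Clarity", "Completeness", "Formatting", "Robustness", "Efficiency")

def parse_scores(reply: str) -> dict[str, int]:
    """Extract per-dimension scores from an evaluator reply."""
    scores = {}
    for dim in DIMENSIONS:
        # Accept "Clarity: 8", "Clarity (1-10): 8" and "Clarity: 8/10".
        match = re.search(rf"{dim}(?:\s*\(1-10\))?\s*:\s*(\d+)", reply)
        if match:
            scores[dim] = int(match.group(1))
    return scores

def overall(scores: dict[str, int]):
    """Unweighted mean of the parsed scores, or None if nothing was parsed."""
    return sum(scores.values()) / len(scores) if scores else None

reply = """Clarity: 8
Completeness (1-10): 6
Formatting: 9/10
Robustness: 5
Efficiency: 7"""
scores = parse_scores(reply)
print(scores, overall(scores))
```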

6.4 Prompt Iterator

Current prompt: {current_prompt}
Evaluation results: {evaluation_results}
Failure cases: {failure_cases}

Please improve the prompt based on the above information:
1. Fix issues causing the failure cases
2. Preserve the strengths of the original prompt
3. Improve overall robustness

Improved prompt:

7. Advanced Techniques Summary

7.1 Prompt Chaining

Decompose complex tasks into chained prompt calls:

Step 1: Analyze → Extract key information
Step 2: Plan → Develop action plan
Step 3: Execute → Generate final output
Step 4: Review → Check and correct
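
One way to wire these steps together is to pass each step's output into the next step's template. In the sketch below, run_chain, the {input} placeholder, and the template wording are illustrative assumptions; fake_llm is a stand-in that records the prompts it receives so the data flow is visible.

```python
def run_chain(llm, steps, user_input):
    """Feed each step's output into the next step's prompt template."""
    context = user_input
    transcript = []
    for name, template in steps:
        context = llm(template.format(input=context))
        transcript.append((name, context))
    return context, transcript

STEPS = [
    ("Analyze", "Extract the key information from: {input}"),
    ("Plan", "Develop an action plan from: {input}"),
    ("Execute", "Generate the final output following: {input}"),
    ("Review", "Check and correct: {input}"),
]

calls = []

def fake_llm(prompt):
    # Stand-in for a real model call: records the prompt and returns a tag.
    calls.append(prompt)
    return f"result-{len(calls)}"

final, transcript = run_chain(fake_llm, STEPS, "raw customer email")
print(final)
print(calls[1])
```

Keeping the transcript makes it possible to inspect or retry a single step instead of rerunning the whole chain.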

7.2 Role-Play Enhancement

Have 3 experts analyze this problem separately:
- Expert A (Data Scientist): Analyze from a data perspective
- Expert B (Product Manager): Analyze from a user needs perspective
- Expert C (Security Engineer): Analyze from a security perspective

Then synthesize the opinions of all 3 experts to provide a final recommendation.
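
The prompt above runs the whole panel in a single call. A variant, sketched here under the assumption that separate calls are affordable, queries each expert independently and then synthesizes, which keeps one persona from influencing another. The expert_panel helper and the prompt wording are hypothetical.

```python
EXPERTS = {
    "Data Scientist": "a data perspective",
    "Product Manager": "a user needs perspective",
    "Security Engineer": "a security perspective",
}

def expert_panel(llm, problem):
    """Query each expert persona separately, then ask for a synthesis."""
    opinions = {}
    for role, angle in EXPERTS.items():
        opinions[role] = llm(
            f"You are a {role}. Analyze this problem from {angle}:\n{problem}"
        )
    summary = "\n".join(f"- {role}: {text}" for role, text in opinions.items())
    return llm(
        "Synthesize the following expert opinions into a final recommendation:\n"
        + summary
    )

def echo_llm(prompt):
    # Stand-in for a real model call: returns the first line of the prompt.
    return prompt.splitlines()[0]

print(expert_panel(echo_llm, "Should we log user IP addresses?"))
```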

7.3 Constraint Escalation

When LLM output does not meet requirements, progressively add constraints:

# First attempt
Summarize this article.

# Second attempt (add constraints)
Summarize this article in 3 bullet points, each no more than 20 words.

# Third attempt (further constraints)
Summarize this article in 3 bullet points, each no more than 20 words.
Format:
- Point 1: [content]
- Point 2: [content]
- Point 3: [content]
Output only the bullet points, do not add any other content.
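
Escalation can be automated by validating each output and moving to the next, stricter prompt only when validation fails. The following is a sketch under stated assumptions: is_valid encodes the "3 bullets, at most 20 words each" requirement, and fake_llm is a hypothetical stand-in that only complies once the format is spelled out.

```python
PROMPTS = [
    "Summarize this article.",
    "Summarize this article in 3 bullet points, each no more than 20 words.",
    "Summarize this article in 3 bullet points, each no more than 20 words.\n"
    "Format:\n- Point 1: [content]\n- Point 2: [content]\n- Point 3: [content]\n"
    "Output only the bullet points, do not add any other content.",
]

def is_valid(output):
    """Exactly 3 non-empty lines, each a '- ' bullet of at most 20 words."""
    lines = [line for line in output.strip().splitlines() if line.strip()]
    return len(lines) == 3 and all(
        line.startswith("- ") and len(line[2:].split()) <= 20 for line in lines
    )

def escalate(llm, article, prompts=PROMPTS, validate=is_valid):
    """Try prompts in order; return (level, output) for the first valid output."""
    output = None
    for level, instruction in enumerate(prompts, start=1):
        output = llm(f"{instruction}\n\n{article}")
        if validate(output):
            return level, output
    return None, output  # every level failed; the caller decides what to do

def fake_llm(prompt):
    # Stand-in for a real model: complies only when the format is explicit.
    if "Format:" in prompt:
        return "- Point 1: a\n- Point 2: b\n- Point 3: c"
    if "3 bullet points" in prompt:
        return "Here is the summary:\n- a\n- b\n- c"
    return "A long paragraph."

level, output = escalate(fake_llm, "Article text goes here.")
print(level)
print(output)
```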

8. Summary

Technique	Core Idea	Use Cases	Complexity
Self-Consistency	Multiple sampling + voting	Math/logical reasoning	Medium
Tree of Thoughts	Tree exploration + evaluation	Planning/creative/complex reasoning	High
RAG-Augmented Prompt	Retrieval + context injection	Knowledge-intensive tasks	Medium
DSPy	Programmatic prompt definition	Pipelines needing auto-optimization	Medium-High
APE/OPRO	Auto-search for optimal prompts	Large-scale prompt optimization	High
Meta-Prompting	LLM generates/evaluates prompts	Prompt development process	Low-Medium

References

Wang et al., "Self-Consistency Improves Chain of Thought Reasoning in Language Models", 2023
Yao et al., "Tree of Thoughts: Deliberate Problem Solving with Large Language Models", 2023
Khattab et al., "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines", 2023
Chain-of-Thought and Reasoning Patterns — Reasoning techniques in Agents
Prompt Design Fundamentals — Foundational prompt techniques