Advanced Prompt Techniques
1. Self-Consistency
1.1 Core Idea
Self-Consistency improves reasoning accuracy by sampling multiple reasoning paths and performing majority voting on the final answers.
1.2 Workflow
Question → [CoT Reasoning Path 1 → Answer A]
→ [CoT Reasoning Path 2 → Answer B]
→ [CoT Reasoning Path 3 → Answer A]
→ [CoT Reasoning Path 4 → Answer A]
→ [CoT Reasoning Path 5 → Answer C]
Majority Vote → Answer A (3/5) ✓
1.3 Implementation
import re
import openai
from collections import Counter

def extract_final_answer(text):
    """Simple heuristic extractor: take the last line that starts with
    'Answer:'; fall back to the full text if none is found."""
    matches = re.findall(r"Answer:\s*(.+)", text)
    return matches[-1].strip() if matches else text.strip()

def self_consistency(prompt, n_samples=5, temperature=0.7):
    """Use Self-Consistency with multiple sampling and voting."""
    answers = []
    for _ in range(n_samples):
        response = openai.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,  # non-zero so the reasoning paths diverge
        )
        # Extract the final answer from the reasoning text
        answer = extract_final_answer(response.choices[0].message.content)
        answers.append(answer)
    # Majority vote over the sampled answers
    counter = Counter(answers)
    return counter.most_common(1)[0][0]
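A hypothetical call, using a prompt that asks for step-by-step reasoning and ends with the "Answer: ..." line that the extractor above expects:

prompt = (
    "A train travels 120 km in 2 hours and then 180 km in 3 hours. "
    "What is its average speed over the whole trip? Think step by step, "
    "then give the result on a final line formatted as 'Answer: <value>'."
)
print(self_consistency(prompt, n_samples=5))  # e.g. "60 km/h"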
1.4 Applicability and Limitations
Suitable for: Math problems, logical reasoning, multiple choice — tasks with clear-cut answers
Limitations:
- Higher cost (multiple API calls)
- Limited effectiveness for open-ended generation tasks
- Requires a reliable answer extraction mechanism
2. Tree of Thoughts (ToT)
2.1 Core Idea
Tree of Thoughts organizes the reasoning process as a tree structure, generating multiple "thought" branches at each node and selecting the optimal path through evaluation.
2.2 Comparison with CoT
| Aspect | CoT | ToT |
|---|---|---|
| Reasoning structure | Linear chain | Tree branches |
| Exploration | Single path | Multiple paths in parallel |
| Backtracking | Not supported | Supported |
| Evaluation | Final result only | Intermediate steps evaluable |
2.3 Implementation Framework
class TreeOfThoughts:
    def __init__(self, model, evaluator):
        self.model = model
        self.evaluator = evaluator

    def solve(self, problem, max_depth=3, branch_factor=3):
        """BFS-style ToT solver (branch_factor doubles as the beam width)."""
        root = ThoughtNode(problem, depth=0)
        current_level = [root]
        for depth in range(max_depth):
            next_level = []
            for node in current_level:
                # Generate multiple thought branches from this node
                thoughts = self.generate_thoughts(node, branch_factor)
                for thought in thoughts:
                    # Evaluate how promising each thought is
                    score = self.evaluator.evaluate(thought)
                    child = ThoughtNode(thought, depth=depth + 1, score=score)
                    node.add_child(child)
                    next_level.append(child)
            # Keep only the most promising nodes for further expansion
            next_level.sort(key=lambda x: x.score, reverse=True)
            current_level = next_level[:branch_factor]
        # Return the best node on the final level
        return max(current_level, key=lambda x: x.score)
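The framework above references a ThoughtNode class and a generate_thoughts method without defining them. A minimal sketch of those supporting pieces, under the assumption that self.model exposes a plain text-in/text-out complete() wrapper (an assumed interface, not a specific library API):

from dataclasses import dataclass, field

@dataclass
class ThoughtNode:
    thought: str           # the partial solution or reasoning step
    depth: int = 0
    score: float = 0.0     # evaluator's estimate of how promising this node is
    children: list = field(default_factory=list)

    def add_child(self, child):
        self.children.append(child)

def generate_thoughts(self, node, k):
    """Candidate body for TreeOfThoughts.generate_thoughts: ask the LLM for
    k alternative next steps. self.model.complete is an assumed wrapper."""
    prompt = (
        f"Partial solution so far:\n{node.thought}\n\n"
        f"Propose {k} different next steps, one per line."
    )
    lines = self.model.complete(prompt).splitlines()
    return [line.strip() for line in lines if line.strip()][:k]

TreeOfThoughts.generate_thoughts = generate_thoughts  # attach to the class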
2.4 Prompt Example
Problem: {problem}
Please generate 3 different solution approaches (first step only):
Approach 1:
Approach 2:
Approach 3:
Evaluate the feasibility of each approach (1-10):
2.5 Applicable Scenarios
- Creative writing (exploring different narrative directions)
- Mathematical proofs (trying different proof strategies)
- Planning problems (exploring different action plans)
- Code design (comparing different architectural approaches)
3. RAG-Augmented Prompts
3.1 Basic Pattern
Inject retrieved context into the prompt to augment the LLM's knowledge:
Answer the question based on the following references. If the references do not contain relevant information, state this honestly.
References:
---
{retrieved_context_1}
---
{retrieved_context_2}
---
{retrieved_context_3}
Question: {user_question}
Please cite specific content from the references to support your answer.
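Filling this template in code is mostly string assembly. A minimal sketch, assuming the retriever has already returned a list of text chunks (retrieval itself is out of scope here):

def build_rag_prompt(question, contexts):
    """Inject retrieved chunks into the grounding template above."""
    references = "\n---\n".join(contexts)
    return (
        "Answer the question based on the following references. If the "
        "references do not contain relevant information, state this honestly.\n\n"
        f"References:\n---\n{references}\n---\n\n"
        f"Question: {question}\n\n"
        "Please cite specific content from the references to support your answer."
    )

contexts = ["...chunk 1...", "...chunk 2...", "...chunk 3..."]  # from your retriever
prompt = build_rag_prompt("What does the warranty cover?", contexts)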
3.2 Advanced RAG Prompt Patterns
Multi-step reasoning RAG:
Based on the following materials, please:
1. First summarize the key points of each material
2. Analyze the connections between materials
3. Synthesize an answer to the user's question
4. Identify any information gaps that may need supplementation
Materials: {contexts}
Question: {question}
RAG with confidence levels:
Answer the question based on the provided materials. Label each claim with a confidence level:
- [High]: Directly supported by the materials
- [Medium]: Can be inferred from the materials
- [Low]: Limited material support, partially based on general knowledge
Materials: {contexts}
Question: {question}
3.3 Integration with Memory Systems
RAG can serve as an external memory system for AI Agents. See RAG-Augmented Memory for details.
4. DSPy: Programmatic Prompt Optimization
4.1 DSPy Overview
DSPy (Declarative Self-improving Python) is a framework that turns prompt engineering into a programming problem: LLM programs are defined declaratively, and their prompts are optimized automatically.
4.2 Core Concepts
- Signature: Type signatures defining inputs and outputs
- Module: Composable LLM operation modules
- Teleprompter/Optimizer: Algorithms for automatic prompt optimization
- Metric: Metrics for evaluating prompt quality
4.3 Basic Usage
import dspy

# Configure the LLM
# (newer DSPy versions use: lm = dspy.LM("openai/gpt-4"); dspy.configure(lm=lm))
lm = dspy.OpenAI(model="gpt-4", max_tokens=300)
dspy.settings.configure(lm=lm)

# Define a Signature
class SentimentClassification(dspy.Signature):
    """Classify sentiment of a text."""
    text = dspy.InputField(desc="Text to classify")
    sentiment = dspy.OutputField(desc="positive, negative, or neutral")

# Use a Module
classify = dspy.Predict(SentimentClassification)
result = classify(text="This product exceeded my expectations!")
print(result.sentiment)  # "positive"
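Swapping the Module changes the prompting strategy without touching the Signature. For example, dspy.ChainOfThought elicits intermediate reasoning before producing the output field:

# Same signature, but the model now reasons step by step internally
classify_cot = dspy.ChainOfThought(SentimentClassification)
result = classify_cot(text="The packaging was damaged, but support resolved it quickly.")
print(result.sentiment)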
4.4 Automatic Optimization
from dspy.teleprompt import BootstrapFewShot

# Prepare training data (with_inputs marks which fields are inputs)
trainset = [
    dspy.Example(text="Great product!", sentiment="positive").with_inputs("text"),
    dspy.Example(text="Terrible experience.", sentiment="negative").with_inputs("text"),
    # ...more samples
]

# Define an evaluation metric
def accuracy_metric(example, pred, trace=None):
    return example.sentiment == pred.sentiment

# Automatic optimization
teleprompter = BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=4)
optimized_classify = teleprompter.compile(classify, trainset=trainset)
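The compiled module is called exactly like the original, now carrying the bootstrapped few-shot demonstrations:

result = optimized_classify(text="Shipping was slow, but the quality is superb.")
print(result.sentiment)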
4.5 Advantages of DSPy
- Programmable: Prompt logic expressed in code, testable and versionable
- Auto-optimized: No manual prompt debugging needed
- Composable: Modular design supports complex pipelines
- LLM-agnostic: Prompts auto-adapt when switching models
5. Automatic Prompt Optimization
5.1 APE (Automatic Prompt Engineer)
A procedure for automatically searching for optimal prompts (a minimal code sketch follows the steps):
1. Given a task description and evaluation dataset
2. Use an LLM to generate candidate prompts
3. Test each prompt on the evaluation set
4. Select the best-performing prompt
5. Iterate and improve
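A single round of this loop, sketched with a hypothetical llm(prompt) -> str helper and a labeled evaluation set; full APE also resamples and iterates on the surviving candidates:

def ape_search(task_description, eval_set, llm, n_candidates=8):
    """One round of APE-style search: generate candidate prompts with an
    LLM, score each on the eval set, and return the best one. Assumes
    llm samples with nonzero temperature so the candidates differ."""
    candidates = [
        llm(
            "Write an instruction that would make a model perform this task well:\n"
            f"{task_description}\nInstruction:"
        )
        for _ in range(n_candidates)
    ]

    def score(prompt):
        hits = sum(
            llm(f"{prompt}\n\nInput: {x}\nOutput:").strip() == y
            for x, y in eval_set
        )
        return hits / len(eval_set)

    return max(candidates, key=score)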
5.2 OPRO (Optimization by PROmpting)
Leveraging the LLM itself as an optimizer:
Here are some prompts and their performance scores:
Prompt: "Classify the sentiment" → Score: 0.72
Prompt: "Determine if positive or negative" → Score: 0.78
Prompt: "Analyze the emotional tone" → Score: 0.75
Based on the above information, generate a new prompt that might perform better.
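In code, the OPRO loop simply keeps feeding the scored history back to the model. A sketch, reusing the hypothetical llm helper (and the score function) from the APE example:

def opro_step(history, llm):
    """history: list of (prompt, score) pairs. Asks the LLM to propose a
    new candidate prompt that should outperform those seen so far."""
    scored = "\n".join(f'Prompt: "{p}" → Score: {s:.2f}' for p, s in history)
    meta_prompt = (
        "Here are some prompts and their performance scores:\n"
        f"{scored}\n\n"
        "Based on the above information, generate a new prompt that might "
        "perform better. Output only the prompt."
    )
    return llm(meta_prompt).strip()

# One optimization round: propose, evaluate, append to the history
# new_prompt = opro_step(history, llm)
# history.append((new_prompt, score(new_prompt)))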
5.3 Practical Prompt Optimization Workflow
1. Define evaluation metrics and test set
2. Write an initial prompt
3. Evaluate on the test set
4. Analyze failure cases
5. Modify the prompt (manual + automatic)
6. Repeat 3-5 until requirements are met
7. A/B test before production rollout
6. Meta-Prompting
6.1 Concept
Using LLMs to generate, evaluate, and improve prompts. Essentially, "prompts for writing prompts."
6.2 Prompt Generator
You are a prompt engineering expert. The user will describe a task, and you need to:
1. Analyze the key elements of the task
2. Generate 3 prompts in different styles
3. Evaluate the strengths and weaknesses of each prompt
4. Recommend the best prompt
Task description: {task_description}
Please generate prompts suitable for use with GPT-4.
6.3 Prompt Evaluator
Please evaluate the quality of the following prompt:
Prompt: {prompt_to_evaluate}
Target task: {task_description}
Evaluation dimensions:
1. Clarity (1-10): Are instructions unambiguous?
2. Completeness (1-10): Does it cover all necessary information?
3. Formatting (1-10): Is the output format clear?
4. Robustness (1-10): Can it handle abnormal inputs?
5. Efficiency (1-10): Token usage efficiency
Overall score and improvement suggestions:
6.4 Prompt Iterator
Current prompt: {current_prompt}
Evaluation results: {evaluation_results}
Failure cases: {failure_cases}
Please improve the prompt based on the above information:
1. Fix issues causing the failure cases
2. Preserve the strengths of the original prompt
3. Improve overall robustness
Improved prompt:
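The three templates compose into a simple improvement loop. A sketch of the iterator step, assuming the same hypothetical llm helper and an evaluate function you supply (presumed here to return evaluation results plus collected failure cases):

ITERATE_TEMPLATE = """Current prompt: {current_prompt}
Evaluation results: {evaluation_results}
Failure cases: {failure_cases}
Please improve the prompt based on the above information:
1. Fix issues causing the failure cases
2. Preserve the strengths of the original prompt
3. Improve overall robustness
Improved prompt:"""

def refine(prompt, llm, evaluate, rounds=3):
    """Run the evaluate-then-rewrite loop a fixed number of times."""
    for _ in range(rounds):
        results, failures = evaluate(prompt)  # assumed: (metrics, failure cases)
        prompt = llm(ITERATE_TEMPLATE.format(
            current_prompt=prompt,
            evaluation_results=results,
            failure_cases=failures,
        )).strip()
    return prompt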
7. Advanced Techniques Summary
7.1 Prompt Chaining
Decompose complex tasks into chained prompt calls (a code sketch follows the steps):
Step 1: Analyze → Extract key information
Step 2: Plan → Develop action plan
Step 3: Execute → Generate final output
Step 4: Review → Check and correct
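In code this is just sequential calls in which each step's output feeds the next. A sketch, again with the hypothetical llm helper:

def prompt_chain(document, llm):
    """Four-step chain: analyze → plan → execute → review."""
    key_info = llm(f"Extract the key information from the following text:\n{document}")
    plan = llm(f"Given this key information, develop an action plan:\n{key_info}")
    draft = llm(f"Execute this plan and produce the final output:\n{plan}")
    return llm(f"Review the following output and correct any errors:\n{draft}")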
7.2 Role-Play Enhancement
Have 3 experts analyze this problem separately:
- Expert A (Data Scientist): Analyze from a data perspective
- Expert B (Product Manager): Analyze from a user needs perspective
- Expert C (Security Engineer): Analyze from a security perspective
Then synthesize the opinions of all 3 experts to provide a final recommendation.
7.3 Constraint Escalation
When the LLM's output does not meet requirements, progressively add constraints:
# First attempt
Summarize this article.
# Second attempt (add constraints)
Summarize this article in 3 bullet points, each no more than 20 words.
# Third attempt (further constraints)
Summarize this article in 3 bullet points, each no more than 20 words.
Format:
- Point 1: [content]
- Point 2: [content]
- Point 3: [content]
Output only the bullet points, do not add any other content.
8. Summary
| Technique | Core Idea | Use Cases | Complexity |
|---|---|---|---|
| Self-Consistency | Multiple sampling + voting | Math/logical reasoning | Medium |
| Tree of Thoughts | Tree exploration + evaluation | Planning/creative/complex reasoning | High |
| RAG-Augmented Prompt | Retrieval + context injection | Knowledge-intensive tasks | Medium |
| DSPy | Programmatic prompt definition | Pipelines needing auto-optimization | Medium-High |
| APE/OPRO | Auto-search for optimal prompts | Large-scale prompt optimization | High |
| Meta-Prompting | LLM generates/evaluates prompts | Prompt development process | Low-Medium |
References
- Wang et al., "Self-Consistency Improves Chain of Thought Reasoning in Language Models", 2023
- Yao et al., "Tree of Thoughts: Deliberate Problem Solving with Large Language Models", 2023
- Khattab et al., "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines", 2023
- Chain-of-Thought and Reasoning Patterns — Reasoning techniques in Agents
- Prompt Design Fundamentals — Foundational prompt techniques