AI_FOR_CYNICAL_DEVS
Module 21 // 10 minutes // Reference

Troubleshooting Common Problems

It was working yesterday.

— Every developer, about every system, including AI

AI tools break in predictable ways. Here’s how to fix the most common issues.


Output Quality Issues

Problem: AI gives generic, unhelpful responses

Symptoms:

  • Responses are vague
  • Answers could apply to any project
  • Missing specific details

Likely causes:

  1. Prompt is too vague
  2. Not enough context provided
  3. Wrong model for the task

Fixes:

❌ Bad: "Help me with this code"

✅ Better: "Review this Python function for security issues. 
It handles user authentication in a Flask app. 
Check for: SQL injection, timing attacks, password handling."

Checklist:

  • Is the prompt specific about what you want?
  • Did you include relevant context?
  • Did you specify the format you want?
  • Did you include constraints (what NOT to do)?
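
If you build prompts in code, the checklist can be enforced by construction. A minimal sketch (the `PromptSpec` shape and field names are illustrative, not any library's API):

```typescript
// Hypothetical helper: a prompt that can't be built without covering the checklist.
interface PromptSpec {
  task: string;          // what you want, specifically
  context: string;       // relevant code, docs, or background
  format: string;        // the output shape you expect
  constraints: string[]; // what NOT to do
}

function buildPrompt(spec: PromptSpec): string {
  return [
    `Task: ${spec.task}`,
    `Context:\n${spec.context}`,
    `Output format: ${spec.format}`,
    `Constraints:\n${spec.constraints.map(c => `- ${c}`).join("\n")}`,
  ].join("\n\n");
}
```

Forgetting a field is now a compile error instead of a vague response.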

Problem: AI keeps hallucinating facts/APIs/functions

Symptoms:

  • References non-existent libraries
  • Suggests deprecated methods
  • Makes up API endpoints

Likely causes:

  1. Training data cutoff (doesn’t know recent changes)
  2. Model confidently guessing
  3. Ambiguous library names

Fixes:

  1. Specify versions explicitly:
Use React 18 with the new hooks API.
Use Python 3.11+ syntax.
Use Node.js v20 LTS APIs.
  2. Provide documentation:
Here's the actual API documentation:
{paste relevant docs}

Based on this documentation, write...
  3. Ask for verification:
Before providing code, verify that all imports and 
function calls exist. If you're unsure about an API, 
say so rather than guessing.
  4. Use RAG: Include actual documentation in context.

Problem: AI ignores my instructions

Symptoms:

  • Asked for X, got Y
  • Specifically said “don’t do Z”, it did Z
  • Format instructions ignored

Likely causes:

  1. Instructions buried in long prompt
  2. Conflicting instructions
  3. Model “helpfully” overriding your choices

Fixes:

  1. Put critical instructions at the start AND end:
IMPORTANT: Return only JSON, no explanation.

{rest of prompt}

Remember: JSON only, no other text.
  2. Use explicit formatting:
Format your response EXACTLY like this:
[ANALYSIS]
{your analysis here}
[/ANALYSIS]

[CODE]
{your code here}
[/CODE]
  3. Remove ambiguity:
❌ "Keep it short" (what's short?)
✅ "Maximum 3 sentences"
✅ "Under 100 words"

Problem: Code has bugs/doesn’t compile

Symptoms:

  • Syntax errors
  • Type mismatches
  • Missing imports

Likely causes:

  1. Incomplete context
  2. Model mixing language versions
  3. Framework conventions mismatch

Fixes:

  1. Specify exact environment:
TypeScript 5.3 with strict mode
React 18 with functional components only
Node.js ESM (import/export, not require)
  2. Include your existing types/interfaces:
Use these existing types:
{paste your interfaces}
  3. Ask for complete code:
Provide complete, runnable code including all imports.
Do not use placeholder comments like "// rest of code here".
  4. Request self-review:
After writing the code, check for:
- Missing imports
- Type errors
- Syntax errors
Fix any issues before responding.

Problem: Responses are too long/short

Symptoms:

  • Asked for summary, got essay
  • Asked for detailed explanation, got one sentence

Fixes:

For too long:

Respond in 3 sentences maximum.
Be concise. No preamble or summary.
Just the code, no explanation needed.

For too short:

Provide a detailed explanation with examples.
Include: {list specific things to include}
Aim for approximately 500 words.

Technical Errors

Problem: Rate limit errors (429)

Symptoms:

Error: 429 Too Many Requests
Rate limit exceeded

Fixes:

  1. Implement exponential backoff:
const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));

async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (e: any) {
      if (e.status === 429 && i < maxRetries - 1) {
        await sleep(Math.pow(2, i) * 1000); // 1s, 2s, 4s, ...
        continue;
      }
      throw e;
    }
  }
  throw new Error("Retries exhausted"); // unreachable, but satisfies the type checker
}
  2. Check your tier limits: You might need to upgrade.

  3. Add request queuing: Don’t fire parallel requests.

  4. Cache responses: Don’t ask the same question twice.
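
The cache-and-don't-repeat-yourself advice can be a few lines of code. A sketch (`callModel` here stands in for your actual API call; it is not a real client method):

```typescript
// Assumption: callModel wraps your real API call; this layer just deduplicates.
const cache = new Map<string, Promise<string>>();

async function cachedCall(
  prompt: string,
  callModel: (p: string) => Promise<string>
): Promise<string> {
  const hit = cache.get(prompt);
  if (hit) return hit; // identical prompt: reuse the finished (or in-flight) call
  const result = callModel(prompt);
  cache.set(prompt, result);
  return result;
}
```

Caching the Promise rather than the resolved value also deduplicates parallel requests for the same prompt, which helps with rate limits too.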


Problem: Context length exceeded

Symptoms:

Error: Maximum context length exceeded
Error: Input too long

Fixes:

  1. Truncate input:
function truncateToTokenLimit(text: string, maxTokens: number) {
  // Rough estimate: 1 token ≈ 4 characters
  const maxChars = maxTokens * 4;
  if (text.length > maxChars) {
    return text.slice(0, maxChars) + "\n[truncated]";
  }
  return text;
}
  2. Summarize context: Ask the AI to summarize previous conversation.

  3. Use sliding window: Keep only recent messages.

  4. Chunk large documents: Process in pieces.

  5. Use a model with larger context: Claude has 200K, some models have less.
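
The sliding-window fix can be sketched in a few lines, using the same rough 4-characters-per-token estimate as `truncateToTokenLimit` above (the `Message` shape is illustrative):

```typescript
interface Message { role: "user" | "assistant"; content: string; }

// Keep the newest messages that fit the budget, dropping the oldest first.
function slidingWindow(messages: Message[], maxTokens: number): Message[] {
  const maxChars = maxTokens * 4; // rough estimate: 1 token ≈ 4 characters
  const kept: Message[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    used += messages[i].content.length;
    if (used > maxChars) break; // this message would blow the budget
    kept.unshift(messages[i]);
  }
  return kept;
}
```

Walking from the end of the array means the most recent turns always survive.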


Problem: Timeout errors

Symptoms:

Error: Request timeout
Error: Connection timed out

Likely causes:

  1. Complex prompt taking too long
  2. Network issues
  3. Service overloaded

Fixes:

  1. Increase timeout:
const response = await client.messages.create({
  // ... options
}, {
  timeout: 120000 // 2 minutes
});
  2. Use streaming for long responses:
const stream = await client.messages.create({
  // ... options
  stream: true
});
  3. Break into smaller requests: Don’t ask for everything at once.

Problem: API key errors

Symptoms:

Error: Invalid API key
Error: Authentication failed
Error: 401 Unauthorized

Checklist:

  • Key is actually set in environment
  • No trailing whitespace in key
  • Key hasn’t been rotated/revoked
  • Using correct key for correct provider
  • Not hitting free tier limits

Debug:

# Check if env var is set
echo $ANTHROPIC_API_KEY | head -c 10

# Verify it's the right format
# Anthropic keys start with "sk-ant-"
# OpenAI keys start with "sk-"

Performance Problems

Problem: Responses are too slow

Symptoms:

  • API calls taking 10+ seconds
  • Users complaining about latency

Fixes:

  1. Use streaming:
// User sees content appearing progressively
const stream = await client.messages.create({
  stream: true,
  // ...
});

for await (const event of stream) {
  if (event.type === 'content_block_delta') {
    process.stdout.write(event.delta.text);
  }
}
  2. Use faster models:
  • Claude: Haiku < Sonnet < Opus
  • OpenAI: GPT-3.5 < GPT-4
  3. Reduce input size: Less context = faster processing.

  4. Request shorter outputs:

Respond in under 100 words.
  5. Use caching: Same question = cached answer.

Problem: High memory usage (local models)

Symptoms:

  • System slowing down
  • Out of memory errors
  • Swap thrashing

Fixes:

  1. Use smaller model: 7B instead of 70B

  2. Use quantized model:

# Q4_K_M is good quality/size balance
ollama run llama3.1:8b-q4_K_M
  3. Reduce context length (inside an interactive ollama session):
ollama run llama3.1
>>> /set parameter num_ctx 4096
  4. Check actual requirements: Some models lie about their sizes.
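
A back-of-envelope check helps here: weights alone need roughly params × bits-per-weight / 8 bytes, and the KV cache comes on top of that. A sketch (the ~4.5-bit figure for Q4_K_M is an approximation, not a spec):

```typescript
// Rough weight-memory estimate in GB. KV cache and runtime overhead come on top,
// so treat this as a floor, not a ceiling.
function estimateWeightsGB(paramsBillions: number, bitsPerWeight: number): number {
  return (paramsBillions * bitsPerWeight) / 8; // 1B params at 8 bits ≈ 1 GB
}

// An 8B model: ~16 GB at fp16, but only ~4.5 GB at ~4.5-bit quantization.
```

If the floor already exceeds your free RAM, no amount of prompt tuning will save you.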

Cost Issues

Problem: API costs unexpectedly high

Symptoms:

  • Surprise bill
  • Usage much higher than expected

Debug:

  1. Check what’s actually being sent:
// countTokens: stand-in for your tokenizer of choice (or a rough length / 4 estimate)
console.log('Input tokens:', countTokens(prompt));
console.log('Prompt preview:', prompt.slice(0, 500));
  2. Audit your calls:
let totalTokens = 0;

async function trackedCall(prompt: string) {
  const response = await client.messages.create({...});
  totalTokens += response.usage.input_tokens;
  totalTokens += response.usage.output_tokens;
  console.log(`Total tokens so far: ${totalTokens}`);
  return response;
}

Common causes:

  • Conversation history growing unbounded
  • System prompt included in every call
  • Retry loops without limits
  • Logging prompts that include full context

Fixes:

  • Set up billing alerts
  • Implement per-request budgets
  • Truncate conversation history
  • Cache identical requests
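
A per-request budget can be enforced before the call ever leaves your process. A sketch using the same rough token estimate as elsewhere in this module (the price constant is illustrative, not any provider's actual rate):

```typescript
// Assumption: illustrative pricing. Check your provider's current rate card.
const USD_PER_1M_INPUT_TOKENS = 3.0;

function assertWithinBudget(prompt: string, maxUsd: number): void {
  const estTokens = prompt.length / 4; // rough: 1 token ≈ 4 characters
  const estCost = (estTokens / 1_000_000) * USD_PER_1M_INPUT_TOKENS;
  if (estCost > maxUsd) {
    // Fail loudly before spending money, not after the bill arrives
    throw new Error(
      `Estimated cost $${estCost.toFixed(4)} exceeds budget $${maxUsd}`
    );
  }
}
```

A guard like this catches the "conversation history growing unbounded" failure mode the moment it starts costing real money.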

Problem: Using expensive model for simple tasks

Diagnosis: Are you using Opus/GPT-4 for everything?

Fix: Route to appropriate models:

function selectModel(task: string): string {
  const simpleTasks = ['classification', 'extraction', 'yes/no'];
  const complexTasks = ['reasoning', 'analysis', 'creative'];
  
  if (simpleTasks.some(t => task.includes(t))) {
    return 'claude-3-5-haiku-20241022';
  }
  return 'claude-sonnet-4-20250514';
}

Integration Problems

Problem: Inconsistent response format

Symptoms:

  • Sometimes JSON, sometimes text
  • Structure varies between calls
  • Parsing fails randomly

Fixes:

  1. Use structured output features:
// Anthropic
const response = await client.messages.create({
  // ...
  tools: [{
    name: "format_response",
    description: "Format the response",
    input_schema: {
      type: "object",
      properties: {
        answer: { type: "string" },
        confidence: { type: "number" }
      },
      required: ["answer", "confidence"]
    }
  }],
  tool_choice: { type: "tool", name: "format_response" }
});
  2. Validate and retry:
async function getStructuredResponse(prompt: string, schema: any) {
  for (let attempt = 0; attempt < 3; attempt++) {
    const response = await client.messages.create({...});
    try {
      // Anthropic returns content as a list of blocks, not a plain string
      const parsed = JSON.parse(response.content[0].text);
      validateSchema(parsed, schema);
      return parsed;
    } catch {
      // Add reminder to prompt and retry
      prompt += "\n\nIMPORTANT: Respond ONLY with valid JSON.";
    }
  }
  throw new Error("Failed to get structured response");
}
  3. Be extremely explicit:
Respond with ONLY a JSON object, no other text.
Do not include markdown code blocks.
Do not include explanation.
Just the JSON.
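
Even with instructions that explicit, models sometimes wrap JSON in markdown fences anyway. A defensive parser that strips them before parsing is a cheap safety net; a sketch:

```typescript
// Strip an optional ```json fence, then parse. Throws if the body still isn't JSON.
function parseModelJson(raw: string): unknown {
  const trimmed = raw.trim();
  const fenced = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  const body = fenced ? fenced[1] : trimmed;
  return JSON.parse(body);
}
```

Pair this with the validate-and-retry loop: parse defensively first, retry with a sterner prompt only when that fails.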

Problem: Streaming not working

Symptoms:

  • No chunks received
  • All content at once
  • Connection hanging

Checklist:

  • Using stream: true option
  • Using async iteration properly
  • Not buffering the entire response
  • Handling stream errors

Example fix:

try {
  const stream = await client.messages.create({
    stream: true,
    // ...
  });

  for await (const event of stream) {
    if (event.type === 'content_block_delta') {
      process.stdout.write(event.delta.text);
    }
  }
} catch (e) {
  console.error('Stream error:', e);
}

Problem: Tool/function calls failing

Symptoms:

  • Model doesn’t call tools
  • Wrong tool called
  • Invalid arguments

Fixes:

  1. Better tool descriptions:
{
  name: "search_docs",
  description: `Search internal documentation. 
    Use this when user asks about: company policies, 
    technical specs, or internal processes.
    Do NOT use for general knowledge questions.`,
  input_schema: {
    // Be explicit about expected values
  }
}
  2. Validate tool calls:
function validateToolCall(call: ToolCall): boolean {
  const tool = tools.find(t => t.name === call.name);
  if (!tool) return false;
  
  // Validate arguments against schema
  return validateSchema(call.arguments, tool.input_schema);
}
  3. Handle gracefully:
if (!validateToolCall(call)) {
  // Ask for clarification instead of crashing
  return "I wasn't able to understand that request. Could you rephrase?";
}

Quick Fixes Cheat Sheet

Problem                 First thing to try
-------                 ------------------
Generic responses       Add more context to prompt
Hallucinations          Include actual docs in prompt
Ignores instructions    Move instructions to start AND end
Code bugs               Specify exact versions
Rate limits             Add exponential backoff
Context exceeded        Truncate or summarize input
Slow responses          Use streaming
High costs              Route to cheaper models
Inconsistent format     Use structured output tools
Tool calls failing      Improve tool descriptions

When to Give Up

Sometimes the AI just can’t do what you want. Signs it’s time to try a different approach:

  • Same error after 5+ different prompts
  • Error rate above 30% even with good prompts
  • Task requires real-time information
  • Task requires guaranteed correctness
  • Cost per task exceeds value of task

Alternatives to consider:

  • Traditional programming
  • Different AI model
  • Human in the loop
  • Hybrid approach (AI assists, human decides)

Remember: If you’ve been debugging the same AI issue for an hour, take a break. Fresh eyes often spot what tired eyes miss.