The Popularity of "New Chinese Interpretations" Prompts Sparks Reflection: Professional AI Engineers Discuss How to Craft Effective Prompts

A "New Chinese Interpretations" prompt recently went viral. When used with Claude 3.5, entering a Chinese word generates a sarcastic explanation graphic for it. The prompt's pseudo-code writing style has made many realize prompts can be structured in such innovative ways. Crafting effective prompts remains a challenge, and engineers from Anthropic specializing in prompt design discussed key insights in a podcast.

01 Good Prompts: Sufficient Clarity and Continuous Iteration

  • Zack Witten: Prompt engineering aims to make models perform tasks, extract maximum value, and collaborate on otherwise unachievable goals. At its core, it's about clear communication—similar to conversing with humans, but requiring an understanding of the model's "psychology." The term "engineering" stems from the trial-and-error process: restarting conversations, iterating independently, and treating prompts as part of a larger system integration (as noted by David Hershey, who works on client integrations).
  • David Hershey: Think of prompt engineering as a form of programming for models, but prioritizing clarity. It involves system-level thinking about data sources, latency tradeoffs, and how to structure inputs—making it a distinct discipline from software engineering.
  • Amanda Askell: Key skills for prompt engineers include clear communication (translating tasks into understandable instructions) and iterative refinement. Always test edge cases—for example, make sure a prompt like "extract rows starting with J" still behaves sensibly when no rows start with J or the input is empty (see the sketch after this list).
  • David Hershey: Users rarely input "perfect" text—expect typos, missing punctuation, and unstructured queries. Evaluate prompts against real-world inputs, not idealized scenarios.
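
A minimal sketch of that kind of edge-case check, assuming the Anthropic Python SDK; the extraction prompt, model name, and test inputs are placeholders invented for illustration:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PROMPT = (
    "Extract every row that starts with the letter J from the dataset below.\n"
    "If no rows start with J, or the dataset is empty, reply exactly: NO MATCHES.\n\n"
    "Dataset:\n{dataset}"
)

# Test the happy path alongside the awkward inputs users will actually send.
test_inputs = {
    "happy_path": "James,42\nAlice,31\nJordan,25",
    "no_j_rows": "Alice,31\nBob,29",
    "empty_input": "",
}

for name, dataset in test_inputs.items():
    reply = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # placeholder model name
        max_tokens=200,
        messages=[{"role": "user", "content": PROMPT.format(dataset=dataset)}],
    )
    print(f"--- {name} ---\n{reply.content[0].text}\n")
```

The point is less the specific harness than the habit: every prompt ships alongside the inputs most likely to break it.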

02 Spell Out What You Know but the Model Doesn’t

  • Zack Witten: Always review model outputs carefully. Even if a prompt asks for step-by-step reasoning (CoT), the model may only gesture at reasoning in the abstract rather than actually working through the steps—a failure many miss by skipping detailed output analysis.
  • David Hershey: Writing clear task instructions requires identifying and articulating knowledge gaps the model lacks. Poor prompts often rely on the author’s assumptions rather than systematic task analysis.
  • Amanda Askell: When a model errs, ask it to diagnose the issue or refine the prompt itself. For example: "Point out any ambiguities in these instructions before executing them." (A sketch follows this list.)
  • Alex Albert: Models can’t ask clarifying questions like humans, so anticipate their potential confusion and address it proactively in prompts.
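
A small sketch of that "diagnose before executing" pattern, again assuming the Anthropic Python SDK; the draft prompt and model name are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

draft_prompt = "Summarize the attached customer feedback and flag anything urgent."

# Ask the model to critique the prompt before it is ever run on real data.
critique = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": (
            "Here is a prompt I plan to give you later:\n\n"
            f"<prompt>{draft_prompt}</prompt>\n\n"
            "Before executing it, point out any ambiguities, missing context, "
            "or unstated assumptions that could lead you to the wrong output."
        ),
    }],
)
print(critique.content[0].text)
```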

03 When Iterations Backfire: Know When to Quit

  • Amanda Askell: If a task clearly sits outside a model's core capabilities, don't overinvest in prompting. Quickly test whether the model understands the task at all—if it doesn't, pivot.
  • David Hershey: Some tasks resist improvement through prompts alone. His experiment connecting Claude to a Game Boy emulator to play Pokémon FireRed showed marginal progress despite elaborate prompts (e.g., grid-based image descriptions). Eventually, he prioritized waiting for better models over endless tweaking.
  • Zack Witten: Visual tasks may lag text capabilities due to limited training data, making prompt engineering less effective for multi-modal challenges.

04 No Need for Role-Playing: Be Honest with Models

  • Amanda Askell: As models grow more capable, avoid misleading role-playing prompts (e.g., "You are a teacher"). Direct communication—such as "I need you to evaluate language model performance"—is more effective.
  • Zack Witten: Metaphors (e.g., "Evaluate this chart like a high school teacher grading homework") can clarify criteria without full role-playing, but specificity about the model’s actual context (e.g., "You’re a support chatbot for this product") is what matters most (see the sketch after this list).
  • David Hershey: Many users shortcut prompt design with generic role-playing, neglecting product-specific details. For example, translating a verbal task description directly into a prompt often outperforms overly abstract approaches.
  • Alex Albert: Misconceptions treat prompts like Google searches (keyword-only inputs), leading to edge-case failures. Detail and precision are critical.
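
As a rough illustration of honest, context-specific framing versus generic role-play, here is a sketch using the system parameter of the Anthropic Python SDK; the product details and model name are invented for the example:

```python
import anthropic

client = anthropic.Anthropic()

# Generic role-play: says who to pretend to be, but almost nothing about the task.
roleplay_system = "You are a teacher."

# Honest, context-specific framing: what the model actually is and what it must do.
contextual_system = (
    "You are the support chatbot embedded in the checkout page of an online "
    "bookstore (hypothetical product). Users ask about orders, refunds, and "
    "shipping. Answer concisely, and escalate anything involving payment errors."
)

reply = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=300,
    system=contextual_system,  # swap in roleplay_system to compare behavior
    messages=[{"role": "user", "content": "My order never arrived, what now?"}],
)
print(reply.content[0].text)
```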

05 Models Reason, but Not Like Humans

  • David Hershey: Debating whether models "truly reason" is philosophical; focus on performance. Iterating on reasoning structures (e.g., step-by-step instructions) improves outcomes, regardless of terminology.
  • Zack Witten: Test reasoning by replacing the valid logic in a worked example with flawed but plausible steps. If the model still produces correct answers, its "reasoning" may be superficial (a sketch follows this list).
  • Amanda Askell: Grammatical perfection is secondary to conceptual clarity. She tolerates typos in draft prompts but polishes final versions for professionalism.
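
One way to run that test, sketched with the Anthropic Python SDK: the few-shot exemplar below shows deliberately flawed arithmetic while still ending on the correct answer, so if the model answers the new question correctly anyway, the demonstrated steps probably weren't doing the work. The model name is a placeholder.

```python
import anthropic

client = anthropic.Anthropic()

# Flawed but plausible-looking steps (the arithmetic is wrong) that still end in
# the correct final answer: 12 pens at 3-for-$2 really does cost $8.
flawed_exemplar = (
    "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
    "A: 12 pens make 12 / 2 = 6 groups, and 6 groups * $3 = $18. The answer is $8.\n"
)

question = "Q: A shop sells pens at 3 for $2. How much do 15 pens cost?\nA:"

reply = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=300,
    messages=[{"role": "user", "content": flawed_exemplar + "\n" + question}],
)
# The correct answer is $10; getting it despite the flawed exemplar suggests the
# shown "reasoning" is not what the model actually relies on.
print(reply.content[0].text)
```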

06 Master Prompt Skills by Tackling Hard Tasks

  • Zack Witten: Improve by analyzing prompt-output pairs, studying exemplars, and experimenting widely. Success signals a good prompt, but edge-case testing is equally vital.
  • Amanda Askell: Validate prompts with non-experts—if they misunderstand, the model likely will too. Iteration and observation are key.
  • David Hershey: Push model boundaries with challenging tasks (e.g., complex email drafting). Even failures reveal how models process information.
  • Zack Witten: Research prompts prioritize diversity (few examples to avoid bias), while consumer prompts rely on abundant examples for consistency.
  • David Hershey: Enterprise prompts must handle millions of edge cases, requiring rigorous testing beyond individual use cases.

07 Trust Models’ Capabilities—Don’t Infantilize Them

  • Zack Witten: Effective prompt tricks (e.g., forcing CoT) get baked into models over time. For example, modern models no longer need explicit "think step-by-step" prompts for math problems.
  • David Hershey: Recent models handle complex contexts better. Instead of oversimplifying, provide full task details—e.g., sharing research papers directly rather than paraphrasing them.
  • Amanda Askell: Treat models as capable collaborators. For example, prompt: "Generate 17 examples from this research paper on prompt techniques" (see the sketch after this list).
  • Zack Witten: Imagining the model’s "perspective" differs by model type—a pretrained base model feels like a raw text-prediction engine, while an RLHF-tuned model responds in a more nuanced, assistant-like way.
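
A sketch of "share the source material, don't paraphrase it," assuming the Anthropic Python SDK; paper.txt stands in for the plain-text body of whatever paper you want the model to work from, and the model name is a placeholder:

```python
import anthropic
from pathlib import Path

client = anthropic.Anthropic()

# Hand the model the full source text rather than a summary of it.
paper_text = Path("paper.txt").read_text(encoding="utf-8")  # placeholder file

reply = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Here is a research paper describing a prompting technique:\n\n"
            f"<paper>{paper_text}</paper>\n\n"
            "Generate 17 worked examples that apply the technique described above."
        ),
    }],
)
print(reply.content[0].text)
```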

08 The Future of Prompt Engineering: Models That Read Our Minds

  • David Hershey: Models will better infer intent, reducing effort, but prompt engineering persists—you’ll always need to specify goals clearly.
  • Zack Witten: Future tools may automate prompt generation for novices, with "prompt generators" serving as entry points.
  • Amanda Askell: Meta-prompts (prompting models to refine other prompts) are already common (a sketch follows this list). Long-term, models might extract requirements directly from users (e.g., via interactive dialogue), minimizing explicit prompting.
  • Alex Albert: Enterprise tools may evolve into interactive systems that draw out user needs to craft optimal prompts, moving beyond simple text boxes.
  • Zack Witten: Today’s prompt engineering resembles teaching; tomorrow, it may resemble introspection, with models proactively decoding user needs.
  • Amanda Askell: Philosophical writing skills—translating complex concepts into simple terms—will remain vital for defining abstract tasks (e.g., "what makes a good chart") and collaborating with future AI systems.
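
Meta-prompting of the kind Amanda describes can be as simple as the following sketch, again assuming the Anthropic Python SDK with placeholder prompt text and model name:

```python
import anthropic

client = anthropic.Anthropic()

rough_prompt = "Summarize this meeting transcript and list action items."

# Ask the model to rewrite a rough prompt into a clearer, more robust one.
improved = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name
    max_tokens=800,
    messages=[{
        "role": "user",
        "content": (
            "You help engineers write better prompts. Rewrite the draft prompt below "
            "so it is clearer and handles edge cases (for example, an empty transcript "
            "or a transcript with no action items). Return only the rewritten prompt.\n\n"
            f"<draft>{rough_prompt}</draft>"
        ),
    }],
)
print(improved.content[0].text)
```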

Key takeaways: Effective prompts prioritize clarity, iterate relentlessly, respect model capabilities, and prepare for a future where AI increasingly bridges the gap between human intent and execution.