
OpenAI Official Prompt Engineering Guide


Original article: https://platform.openai.com/docs/guides/prompt-engineering

This guide shares strategies for getting better results from large language models like GPT-4 (sometimes called GPT models). The methods described here can be combined with each other for greater effect. We encourage you to experiment and find the techniques that work best for you.

Some of the examples shown here currently work only with the most capable model, gpt-4. In general, if a model fails at a task and a more capable model is available, it is often worth retrying with the more capable model.

In addition, you can view some example prompts to understand what our models can do:

Prompt examples

Browse these examples to discover the potential of GPT models.

Six Strategies to Help You Achieve Better Results

Write Clear Instructions

These models cannot read your mind. If the model's output is too long, ask for a brief reply. If the output is too simple, request writing at a more professional level. If you are dissatisfied with the output format, show the format you expect. The less the model has to guess about what you want, the more likely you are to get the desired result.

  • Tips:
    • Add detailed information to your query to get more accurate answers. For example, instead of saying "Summarize the meeting minutes," say "Summarize the meeting minutes in one paragraph. Then write a Markdown list of speakers and their key points. Finally, list the follow-up steps or action items suggested by the speakers (if any)."
    • Ask the model to adopt a specific persona. For example, act as a comedian who likes to tell jokes. Whenever asked to help write something, it will reply with a document where each paragraph contains at least one joke or interesting comment.
    • Use delimiters to clearly distinguish different parts of the input. Delimiters such as triple quotes, XML tags, and section headings can help divide text sections that need to be treated differently. For example, summarize the text separated by triple quotes in 50 characters: """Insert text here"""
    • Clearly state the steps required to complete the task. For example, use the following step-by-step instructions to respond to user input. Step 1 - The user will provide you with text in triple quotes. Summarize this text in one sentence and prefix it with "Summary:". Step 2 - Translate the summary in Step 1 into Spanish and add the prefix "Translation:".
    • Provide examples. This is the classic few-shot prompt: first give the model examples, then let it produce output that follows them. For example, write an article in the style of "The sunset glows as a lone wild duck flies; the autumn water and the vast sky share one color. Fishermen's boats sing at dusk, their voices reaching the shores of Lake Pengli."
    • Specify the output length. You can ask the model to produce output of a target length, stated in words, sentences, paragraphs, or bullet points. Targets are only approximate: character and word counts are imprecise (especially for Chinese text), while paragraph or bullet counts are followed more reliably. For example, summarize the text delimited by triple quotes in two paragraphs of about 100 characters: """Insert text here"""
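Several of the tips above (explicit steps, triple-quote delimiters, a length target) can be combined in a single prompt template. Here is a minimal sketch in Python; the helper name and prompt wording are illustrative, not taken from the guide:

```python
def build_summary_prompt(text: str, max_paragraphs: int = 1) -> str:
    """Build a meeting-minutes prompt that applies several tips at once:
    step-by-step instructions, a length target, and triple-quote
    delimiters around the input text. (Hypothetical helper.)"""
    return (
        f"Summarize the text delimited by triple quotes in {max_paragraphs} paragraph(s). "
        "Then write a Markdown list of speakers and their key points. "
        "Finally, list any suggested follow-up actions.\n\n"
        f'"""{text}"""'
    )

prompt = build_summary_prompt("Alice: ship v2 on Friday. Bob: finish the docs first.")
```

Keeping the instructions outside the delimiters makes it unambiguous which part is data and which part is the task.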

Provide Reference Texts

If you have specific materials or examples on the topic you want to write about, show them to the model so it can produce more accurate, relevant content. Language models can fabricate answers, especially when asked about niche topics or for citations and URLs. Supplying reference text helps the model answer more accurately.

  • Tips:
    • Instruct the model to answer questions using reference texts. For example, use the article enclosed in triple quotes to answer the question. If the answer cannot be found in the article, write "I can't find the answer":
"""<Insert document here>"""
Question: <Insert question here>
    • Ask the model to cite content from the reference texts when answering. For example: you will be given a document delimited by triple quotes and a question. Your task is to answer the question using only the provided document and to cite the passages used. If the document does not contain the information needed to answer the question, simply write: "Insufficient information". Any answer must be accompanied by a citation, using the following format ({"citation": …}):
"""<Insert document here>"""
Question: <Insert question here>
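The reference-text pattern above is easy to wrap in a small template function. This sketch mirrors the citation variant; the exact wording is illustrative and should be adapted to your task:

```python
def build_cited_answer_prompt(document: str, question: str) -> str:
    """Compose a reference-text prompt: answer only from the document,
    cite passages, or refuse with a fixed phrase. (Hypothetical helper;
    the prompt wording is an adaptation of the example above.)"""
    return (
        "You will be given a document delimited by triple quotes and a question. "
        "Answer using only the provided document, and cite the passages used "
        'in the format ({"citation": ...}). If the document does not contain '
        'the needed information, write "Insufficient information".\n\n'
        f'"""{document}"""\n'
        f"Question: {question}"
    )

prompt = build_cited_answer_prompt(
    "The Nile is about 6,650 km long.", "How long is the Nile?"
)
```

A fixed refusal phrase like "Insufficient information" is useful because downstream code can detect it with a simple string check.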

Break Complex Tasks into Simple Subtasks

If you have a complex topic to write about, divide it into smaller parts: for example, first write a section on the background of the topic, then a section on the main points. Just as complex systems in software engineering are decomposed into modular components, the same approach should be taken when submitting tasks to language models. Complex tasks generally have higher error rates than simple tasks, and can often be redefined as a workflow of simpler tasks.

  • Tips:
    • Use intent classification to identify the most relevant instructions in user queries. For example, in a customer service scenario, when a user asks "What should I do if my internet is disconnected?", first determine the problem category and confirm that it belongs to troubleshooting in technical support, then provide a targeted answer.
    • For applications requiring long conversations, summarize or filter previous conversations. Because models have a fixed context length, conversations between users and assistants cannot continue indefinitely. To work around this, you can: summarize part of the conversation once the input reaches a threshold length and carry that summary forward in the system message; summarize earlier turns asynchronously in the background; or store all past messages in a vector database and retrieve the relevant ones by embedding similarity during later turns.
    • Summarize long documents in segments and recursively build a complete summary. For example, asking a large model to summarize a book may exceed the token limit, so a series of queries can be used to summarize each part of the document. Chapter summaries can be connected and summarized to generate a summary of the summary, and this process can be performed recursively until the entire document is summarized.
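The "summary of summaries" workflow above can be sketched as a small recursive function. Here `summarize` is a placeholder that truncates text; a real version would call an LLM for each summary and re-split the combined text into chunks before recursing:

```python
def summarize(text: str) -> str:
    """Stand-in for a model call; a real version would ask an LLM
    for a summary instead of truncating (placeholder logic)."""
    return text[:60]

def recursive_summary(chunks: list[str], max_len: int = 200) -> str:
    """Summarize each chunk, join the summaries, and recurse on the
    result until it fits the length budget -- a sketch of the
    'summary of summaries' approach described above."""
    combined = " ".join(summarize(c) for c in chunks)
    if len(combined) > max_len:
        # Treat the joined summaries as a new document and summarize again.
        return recursive_summary([combined], max_len)
    return combined

summary = recursive_summary(
    ["chapter one " * 20, "chapter two " * 20, "chapter three " * 20]
)
```

The recursion terminates because each pass strictly shrinks the text until it fits the budget.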

Give the Model Time to "Think"

Models may make more reasoning errors when answering questions immediately. Asking the model to engage in a "chain of thought" before giving an answer can help the model reason out the correct answer more reliably.

  • Tips:
    • Instruct the model to find its own solution before rushing to a conclusion. For example, if you want a model to evaluate a student's solution to a math problem, instead of directly asking the model whether the student's solution is correct, prompt the model to first generate its own solution and then evaluate it.
    • Use internal monologue or a series of queries to hide the model's reasoning process.
    • Ask the model if it missed anything in previous answers.
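The first tactic above, having the model work the problem before judging the student, amounts to a prompt template like the following sketch. The wording is an adaptation of the guide's example, and the sample problem is illustrative:

```python
# Prompt template for the "work it out yourself first" tactic:
# the model must solve the problem before grading the student.
EVALUATION_PROMPT = (
    "First work out your own solution to the problem. Then compare your "
    "solution to the student's solution and evaluate whether the student's "
    "solution is correct. Do not decide whether the student's solution is "
    "correct until you have done the problem yourself.\n\n"
    "Problem: {problem}\n"
    "Student's solution: {student_solution}"
)

filled = EVALUATION_PROMPT.format(
    problem="What is 17 * 24?",
    student_solution="17 * 24 = 398",
)
```

Ordering matters here: placing the "solve it yourself first" instruction before the student's answer reduces the chance the model simply anchors on the student's (wrong) result.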

Use External Tools

Sometimes, combining AI with other tools (such as data search tools) can achieve better results. Use the output of other tools to compensate for the model's shortcomings. For example, a text retrieval system can provide the model with relevant document information, and a code execution engine can help the model perform mathematical calculations and run code.

  • Tips:
    • Use embedding-based search to achieve efficient knowledge retrieval.
    • Use code execution to perform more accurate calculations or call external APIs.
    • Give the model access to specific functions.
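Embedding-based retrieval reduces to a nearest-neighbor search over vectors. The sketch below uses tiny hand-written 3-d vectors in place of real embeddings, which an embedding model would normally produce; the corpus and query are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings"; a real system would get these from an embedding model.
corpus = {
    "reset your router": [0.9, 0.1, 0.0],
    "update billing info": [0.0, 0.8, 0.2],
}
query_vec = [0.85, 0.15, 0.0]  # pretend embedding of "my internet is down"

# Retrieve the document whose embedding is closest to the query.
best = max(corpus, key=lambda doc: cosine(corpus[doc], query_vec))
```

The retrieved document is then inserted into the prompt as reference text, combining this strategy with "Provide Reference Texts" above.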

Test and Adjust

Try different instructions and approaches to see which works best, then adjust based on the results. Evaluating model output against gold-standard answers is an effective way to ensure the quality of model responses.

  • Specific Operations:
    • Define gold-standard answers: first, determine which known facts a correct answer to a question should contain. These facts form the criteria for evaluating responses.
    • Compare model answers with the facts: generate answers with the model, then check how many of the required facts each answer contains.
    • Evaluate the completeness of answers: Evaluate the completeness and accuracy of answers based on the number of facts they contain. If an answer contains all or most of the required facts, it can be considered high-quality.
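The three steps above can be sketched as a simple fact-recall scorer. This uses a naive substring check; real evaluations often use a model or semantic matching instead, and the sample answer and facts are illustrative:

```python
def fact_recall(answer: str, gold_facts: list[str]) -> float:
    """Fraction of gold-standard facts mentioned in the answer
    (case-insensitive substring match -- a deliberately naive
    stand-in for model-based grading)."""
    answer_lower = answer.lower()
    hits = sum(1 for fact in gold_facts if fact.lower() in answer_lower)
    return hits / len(gold_facts)

score = fact_recall(
    "Neil Armstrong was the first person to walk on the Moon, in July 1969.",
    ["Neil Armstrong", "first person to walk on the Moon", "July 1969"],
)
```

A score near 1.0 indicates a complete answer; tracking this score across prompt variants shows which variant performs best.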

This strategy is particularly suitable for scenarios requiring precise, detailed information, such as science, technology, or academic research. By comparing against gold-standard answers, the output quality of AI models can be effectively monitored and improved.