diff --git a/_freeze/embeddings/applications/execute-results/html.json b/_freeze/embeddings/applications/execute-results/html.json index d2cb0b0..081fbf7 100644 --- a/_freeze/embeddings/applications/execute-results/html.json +++ b/_freeze/embeddings/applications/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "c7c6aaa88acff3e085f8bcd0144e88ab", + "hash": "4dcd8d01a2de3b1e2e0f0407e336d87c", "result": { "engine": "jupyter", - "markdown": "---\ntitle: Applications\nformat:\n html:\n code-fold: true\n---\n\n", + "markdown": "---\ntitle: Applications\nformat:\n html:\n code-fold: true\n---\n\nBuild a bot that can answer questions based on documents!\nResource: https://platform.openai.com/docs/tutorials/web-qa-embeddings\n\n", "supporting": [ "applications_files" ], diff --git a/_freeze/llm/gpt_api/execute-results/html.json b/_freeze/llm/gpt_api/execute-results/html.json index 556f51d..313a3ee 100644 --- a/_freeze/llm/gpt_api/execute-results/html.json +++ b/_freeze/llm/gpt_api/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "dc7e1b2db90c90886a913f41ce9fd215", + "hash": "b873702b4a61fcaca14b721868545856", "result": { "engine": "jupyter", - "markdown": "---\ntitle: The OpenAI API\nformat:\n html:\n code-fold: false\n---\n\nResource: [OpenAI API docs](https://platform.openai.com/docs/introduction){.external}\n\n\nLet's get started with the OpenAI API for GPT. \n\n\n### Authentication\n\nGetting started with the OpenAI Chat Completions API requires signing up for an account on the OpenAI platform. \nOnce you've registered, you'll gain access to an API key, which serves as a unique identifier for your application to authenticate requests to the API. \nThis key is essential for ensuring secure communication between your application and OpenAI's servers. \nWithout proper authentication, your requests will be rejected.\nYou can create your own account, but for the seminar we will provide the client with the credential within the Jupyterlab (TODO: Link).\n\n::: {#a9cb3d89 .cell execution_count=1}\n``` {.python .cell-code}\n# setting up the client in Python\n\nimport os\nfrom openai import OpenAI\n\nclient = OpenAI(\n api_key=os.environ.get(\"OPENAI_API_KEY\")\n)\n```\n:::\n\n\n### Requesting Completions\n\nMost interaction with GPT and other models consist in generating completions for certain tasks (TODO: Link to completions)\n\nTo request completions from the OpenAI API, we use Python to send HTTP requests to the designated API endpoint. \nThese requests are structured to include various parameters that guide the generation of text completions. \nThe most fundamental parameter is the prompt text, which sets the context for the completion. \nAdditionally, you can specify the desired model configuration, such as the engine to use (e.g., \"gpt-4\"), as well as any constraints or preferences for the generated completions, such as the maximum number of tokens or the temperature for controlling creativity (TODO: Link parameterization)\n\n::: {#662daa55 .cell execution_count=2}\n``` {.python .cell-code}\n# creating a completion\nchat_completion = client.chat.completions.create(\n messages=[\n {\n \"role\": \"user\",\n \"content\": \"How old is the earth?\",\n }\n ],\n model=\"gpt-3.5-turbo\"\n)\n```\n:::\n\n\n### Processing\n\nOnce the OpenAI API receives your request, it proceeds to process the provided prompt using the specified model. \nThis process involves analyzing the context provided by the prompt and leveraging the model's pre-trained knowledge to generate text completions. 
\nThe model employs advanced natural language processing techniques to ensure that the generated completions are coherent and contextually relevant. \nBy drawing from its extensive training data and understanding of human language, the model aims to produce responses that closely align with human-like communication.\n\n### Response\n\nAfter processing your request, the OpenAI API returns a JSON-formatted response containing the generated text completions. \nDepending on the specifics of your request, you may receive multiple completions, each accompanied by additional information such as a confidence score indicating the model's level of certainty in the generated text. \nThis response provides valuable insights into the quality and relevance of the completions, allowing you to tailor your application's behavior accordingly.\n\n### Error Handling\n\nWhile interacting with the OpenAI API, it's crucial to implement robust error handling mechanisms to gracefully manage any potential issues that may arise. \nCommon errors include providing invalid parameters, experiencing authentication failures due to an incorrect API key, or encountering rate limiting restrictions. B\ny handling errors effectively, you can ensure the reliability and resilience of your application, minimizing disruptions to the user experience and maintaining smooth operation under varying conditions. \nImplementing proper error handling practices is essential for building robust and dependable applications that leverage the capabilities of the OpenAI Chat Completions API effectively.\n\n", + "markdown": "---\ntitle: The OpenAI API\nformat:\n html:\n code-fold: false\n---\n\n::: {.callout-note}\nResource: [OpenAI API docs](https://platform.openai.com/docs/introduction){.external}\n:::\n\n\n\nLet's get started with the OpenAI API for GPT. \n\n\n### Authentication\n\nGetting started with the OpenAI Chat Completions API requires signing up for an account on the OpenAI platform. \nOnce you've registered, you'll gain access to an API key, which serves as a unique identifier for your application to authenticate requests to the API. \nThis key is essential for ensuring secure communication between your application and OpenAI's servers. \nWithout proper authentication, your requests will be rejected.\nYou can create your own account, but for the seminar we will provide the client with the credential within the Jupyterlab (TODO: Link).\n\n::: {#1c41ead3 .cell execution_count=1}\n``` {.python .cell-code}\n# setting up the client in Python\n\nimport os\nfrom openai import OpenAI\n\nclient = OpenAI(\n api_key=os.environ.get(\"OPENAI_API_KEY\")\n)\n```\n:::\n\n\n### Requesting Completions\n\nMost interaction with GPT and other models consist in generating completions for certain tasks (TODO: Link to completions)\n\nTo request completions from the OpenAI API, we use Python to send HTTP requests to the designated API endpoint. \nThese requests are structured to include various parameters that guide the generation of text completions. \nThe most fundamental parameter is the prompt text, which sets the context for the completion. 
\nAdditionally, you can specify the desired model configuration, such as the engine to use (e.g., \"gpt-4\"), as well as any constraints or preferences for the generated completions, such as the maximum number of tokens or the temperature for controlling creativity (TODO: Link parameterization)\n\n::: {#a7e7ff6f .cell execution_count=2}\n``` {.python .cell-code}\n# creating a completion\nchat_completion = client.chat.completions.create(\n messages=[\n {\n \"role\": \"user\",\n \"content\": \"How old is the earth?\",\n }\n ],\n model=\"gpt-3.5-turbo\"\n)\n```\n:::\n\n\n### Processing\n\nOnce the OpenAI API receives your request, it proceeds to process the provided prompt using the specified model. \nThis process involves analyzing the context provided by the prompt and leveraging the model's pre-trained knowledge to generate text completions. \nThe model employs advanced natural language processing techniques to ensure that the generated completions are coherent and contextually relevant. \nBy drawing from its extensive training data and understanding of human language, the model aims to produce responses that closely align with human-like communication.\n\n### Response\n\nAfter processing your request, the OpenAI API returns a JSON-formatted response containing the generated text completions. \nDepending on the specifics of your request, you may receive multiple completions, each accompanied by additional information such as a confidence score indicating the model's level of certainty in the generated text. \nThis response provides valuable insights into the quality and relevance of the completions, allowing you to tailor your application's behavior accordingly.\n\n### Error Handling\n\nWhile interacting with the OpenAI API, it's crucial to implement robust error handling mechanisms to gracefully manage any potential issues that may arise. \nCommon errors include providing invalid parameters, experiencing authentication failures due to an incorrect API key, or encountering rate limiting restrictions. B\ny handling errors effectively, you can ensure the reliability and resilience of your application, minimizing disruptions to the user experience and maintaining smooth operation under varying conditions. \nImplementing proper error handling practices is essential for building robust and dependable applications that leverage the capabilities of the OpenAI Chat Completions API effectively.\n\n", "supporting": [ "gpt_api_files" ], diff --git a/_freeze/llm/parameterization/execute-results/html.json b/_freeze/llm/parameterization/execute-results/html.json index ffb6c73..28f7be4 100644 --- a/_freeze/llm/parameterization/execute-results/html.json +++ b/_freeze/llm/parameterization/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "bc2adb1e48802eac607894bda896dc2e", + "hash": "a20b274126bd9f16f0ae718708e16f43", "result": { "engine": "jupyter", - "markdown": "---\ntitle: Parameterization of GPT\nformat:\n html:\n code-fold: true\n---\n\n- **Temperature**: Temperature is a parameter that controls the randomness of the generated text. Lower temperatures result in more deterministic outputs, where the model tends to choose the most likely tokens at each step. Higher temperatures introduce more randomness, allowing the model to explore less likely tokens and produce more creative outputs. 
It's often used to balance between generating safe, conservative responses and more novel, imaginative ones.\n\n- **Max Tokens**: Max Tokens limits the maximum length of the generated text by specifying the maximum number of tokens (words or subwords) allowed in the output. This parameter helps to control the length of the response and prevent the model from generating overly long or verbose outputs, which may not be suitable for certain applications or contexts.\n\n- **Top P (Nucleus Sampling)**: Top P, also known as nucleus sampling, dynamically selects a subset of the most likely tokens based on their cumulative probability until the cumulative probability exceeds a certain threshold (specified by the parameter). This approach ensures diversity in the generated text while still prioritizing tokens with higher probabilities. It's particularly useful for generating diverse and contextually relevant responses.\n\n- **Frequency Penalty**: Frequency Penalty penalizes tokens based on their frequency in the generated text. Tokens that appear more frequently are assigned higher penalties, discouraging the model from repeatedly generating common or redundant tokens. This helps to promote diversity in the generated text and prevent the model from producing overly repetitive outputs.\n\n- **Presence Penalty**: Presence Penalty penalizes tokens that are already present in the input prompt. By discouraging the model from simply echoing or replicating the input text, this parameter encourages the generation of responses that go beyond the provided context. It's useful for generating more creative and novel outputs that are not directly predictable from the input.\n\n- **Stop Sequence**: Stop Sequence specifies a sequence of tokens that, if generated by the model, signals it to stop generating further text. This parameter is commonly used to indicate the desired ending or conclusion of the generated text. It helps to control the length of the response and ensure that the model generates text that aligns with specific requirements or constraints.\n\n- **Repetition Penalty**: Repetition Penalty penalizes repeated tokens in the generated text by assigning higher penalties to tokens that appear multiple times within a short context window. This encourages the model to produce more varied outputs by avoiding unnecessary repetition of tokens. It's particularly useful for generating coherent and diverse text without excessive redundancy.\n\n- **Length Penalty**: Length Penalty penalizes the length of the generated text by applying a penalty factor to longer sequences. This helps to balance between generating concise and informative responses while avoiding excessively long or verbose outputs. 
Length Penalty is often used to control the length of the generated text and ensure that it remains coherent and contextually relevant.\n\n\n\n## Roles: \n\n::: {#3a526819 .cell execution_count=1}\n``` {.python .cell-code}\nfrom openai import OpenAI\nclient = OpenAI()\n\ncompletion = client.chat.completions.create(\n model=\"gpt-3.5-turbo\",\n messages=[\n {\"role\": \"system\", \"content\": \"You are a poetic assistant, skilled in explaining complex programming concepts with creative flair.\"},\n {\"role\": \"user\", \"content\": \"Compose a poem that explains the concept of recursion in programming.\"}\n ]\n)\n\nprint(completion.choices[0].message)\n```\n:::\n\n\n## Function calling: \nhttps://platform.openai.com/docs/guides/function-calling\n\n", + "markdown": "---\ntitle: Parameterization of GPT\nformat:\n html:\n code-fold: false\n code-wrap: true\n---\n\nThe GPT models provided by OpenAI offer a variety of parameters that can change the way the language model responds. \nBelow you can find a list of the most important ones.\n\n- **Temperature**: Temperature (`temperature`) is a parameter that controls the randomness of the generated text. Lower temperatures result in more deterministic outputs, where the model tends to choose the most likely tokens at each step. Higher temperatures introduce more randomness, allowing the model to explore less likely tokens and produce more creative outputs. It's often used to balance between generating safe, conservative responses and more novel, imaginative ones.\n\n- **Max Tokens**: Max Tokens (`max_tokens`) limits the maximum length of the generated text by specifying the maximum number of tokens (words or subwords) allowed in the output. This parameter helps to control the length of the response and prevent the model from generating overly long or verbose outputs, which may not be suitable for certain applications or contexts.\n\n- **Top P (Nucleus Sampling)**: Top P (`top_p`), also known as nucleus sampling, dynamically selects a subset of the most likely tokens based on their cumulative probability until the cumulative probability exceeds a certain threshold (specified by the parameter). This approach ensures diversity in the generated text while still prioritizing tokens with higher probabilities. It's particularly useful for generating diverse and contextually relevant responses.\n\n- **Frequency Penalty**: Frequency Penalty (`frequency_penalty`) penalizes tokens based on their frequency in the generated text. Tokens that appear more frequently are assigned higher penalties, discouraging the model from repeatedly generating common or redundant tokens. This helps to promote diversity in the generated text and prevent the model from producing overly repetitive outputs.\n\n- **Presence Penalty**: Presence Penalty (`presence_penalty`) penalizes tokens that have already appeared in the text so far. By discouraging the model from simply echoing or replicating the input text, this parameter encourages the generation of responses that go beyond the provided context. It's useful for generating more creative and novel outputs that are not directly predictable from the input.\n\n- **Stop Sequence**: Stop Sequence (`stop`) specifies a sequence of tokens that, if generated by the model, signals it to stop generating further text. This parameter is commonly used to indicate the desired ending or conclusion of the generated text. 
It helps to control the length of the response and ensure that the model generates text that aligns with specific requirements or constraints.\n\n\n## Roles: \n\nIn order to cover most tasks you want to perform using a chat format, the OpenAI API lets you define different `roles` in the chat. \nThe available roles are `system`, `assistant`, `user` and `tool`. \nYou should already be familiar with two of them by now: \nThe `user` role corresponds to the actual user prompting the language model, while all answers are given with the `assistant` role.\n\nThe `system` role can now be given to provide some additional general instructions to the language model that are typically not a user input, for example, the style in which the model responds. \nIn this case, an example is better than any explanation.\n\n::: {#875ab431 .cell execution_count=1}\n``` {.python .cell-code}\nimport os\nfrom llm_utils.client import get_openai_client\n\nMODEL = \"gpt4\"\n\nclient = get_openai_client(\n model=MODEL,\n config_path=os.environ.get(\"CONFIG_PATH\")\n)\n\ncompletion = client.chat.completions.create(\n model=MODEL,\n messages=[\n {\"role\": \"system\", \"content\": \"You are an annoyed technician working in a help center for dish washers, who answers in short, unfriendly bursts.\"},\n {\"role\": \"user\", \"content\": \"My dish washer does not clean the dishes, what could be the reason.\"}\n ]\n)\n\nprint(completion.choices[0].message.content)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nCould be anything. Blocked spray arm. Clogged filter. Faulty pump. Detergent issue. Check all that.\n```\n:::\n:::\n\n\n## Function calling: {#sec-test} \n\nAs we have seen, most interactions with a language model happen in the form of a chat with almost \"free\" questions or instructions and answers.\nWhile this seems the most natural in most cases, it is not always a practical format if we want to use a language model for very specific purposes.\nThis happens particularly often when we want to employ a language model in business situations, where we require a consistent output of the model.\n\nAs an example, let us try to use GPT for sentiment analysis (see also [here](../nlp/overview.qmd#sec-sentiment-analysis)).\nLet's say we want GPT to classify a text into one of the following four categories: \n\n::: {#ce80e6f9 .cell execution_count=2}\n``` {.python .cell-code}\nsentiment_categories = [\n \"positive\", \n \"negative\",\n \"neutral\",\n \"mixed\"\n]\n```\n:::\n\n\nWe could do the following:\n\n\n\n::: {#40c825ed .cell execution_count=4}\n``` {.python .cell-code}\nmessages = []\nmessages.append(\n {\"role\": \"system\", \"content\": f\"Classify the given text into one of the following sentiment categories: {sentiment_categories}.\"}\n)\nmessages.append(\n {\"role\": \"user\", \"content\": \"I really did not like the movie.\"}\n)\n\nresponse = client.chat.completions.create(\n messages=messages,\n model=MODEL\n)\n\nprint(f\"Response: '{response.choices[0].message.content}'\")\n```\n:::\n\n\n::: {#c6ffeb88 .cell execution_count=5}\n\n::: {.cell-output .cell-output-stdout}\n```\nResponse: 'Category: Negative'\n```\n:::\n:::\n\n\nIt is easy to spot the problem: GPT does not necessarily answer in the way we expect or want it to. 
\nIn this case, instead of simply returning the correct category, it also returns the string `Category: ` alongside it (and capitalized `Negative`).\nSo if we were to use the answer in a program or data base, we'd now again have to use some NLP techniques to parse it in order eventually retrieve **exactly** the category we were looking for: `negative`. \nWhat we need instead is a way to constrain GPT to a specific way of answering, and this is where `functions` or `tools` come into play (see also [Function calling](https://platform.openai.com/docs/guides/function-calling){.external} and [Function calling (cookbook)](https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models){.external}).\n\nThis concept allows us to specify the exact output format we expect to receive from GPT (it is called functions since ideally we want to call a function directly on the output of GPT so it has to be in a specific format). \n\n::: {#b78f10d6 .cell execution_count=6}\n``` {.python .cell-code}\n# this looks intimidating but isn't that complicated\ntools = [\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": \"analyze_sentiment\",\n \"description\": \"Analyze the sentiment in a given text.\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"sentiment\": {\n \"type\": \"string\",\n \"enum\": sentiment_categories,\n \"description\": f\"The sentiment of the text.\"\n }\n },\n \"required\": [\"sentiment\"],\n }\n }\n }\n]\n```\n:::\n\n\n::: {#e1e40f2a .cell execution_count=7}\n``` {.python .cell-code}\nmessages = []\nmessages.append(\n {\"role\": \"system\", \"content\": f\"Classify the given text into one of the following sentiment categories: {sentiment_categories}.\"}\n)\nmessages.append(\n {\"role\": \"user\", \"content\": \"I really did not like the movie.\"}\n)\n\nresponse = client.chat.completions.create(\n messages=messages,\n model=MODEL,\n tools=tools,\n tool_choice={\n \"type\": \"function\", \n \"function\": {\"name\": \"analyze_sentiment\"}}\n)\n\nprint(f\"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nResponse: '{\n\"sentiment\": \"negative\"\n}'\n```\n:::\n:::\n\n\nWe can now easily extract what we need: \n\n::: {#5e3c869b .cell execution_count=8}\n``` {.python .cell-code}\nimport json \nresult = json.loads(response.choices[0].message.tool_calls[0].function.arguments) # remember that the answer is a string\nprint(result[\"sentiment\"])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nnegative\n```\n:::\n:::\n\n\nWe can also include multiple function parameters if our desired output has multiple components.\nLet's try to include another parameter which includes the `reason` for the sentiment.\n\n::: {#9c902d9f .cell execution_count=9}\n``` {.python .cell-code}\ntools = [\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": \"analyze_sentiment\",\n \"description\": \"Analyze the sentiment in a given text.\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"sentiment\": {\n \"type\": \"string\",\n \"enum\": sentiment_categories,\n \"description\": f\"The sentiment of the text.\"\n },\n \"reason\": {\n \"type\": \"string\",\n \"description\": \"The reason for the sentiment in few words. 
If there is no information, do not make assumptions and leave blank.\"\n }\n },\n \"required\": [\"sentiment\", \"reason\"],\n }\n }\n }\n]\n```\n:::\n\n\n::: {#d77c3b61 .cell execution_count=10}\n``` {.python .cell-code}\nmessages = []\nmessages.append(\n {\"role\": \"system\", \"content\": f\"Classify the given text into one of the following sentiment categories: {sentiment_categories}. If you can, also extract the reason.\"}\n)\nmessages.append(\n {\"role\": \"user\", \"content\": \"I loved the movie, Johnny Depp is a great actor.\"}\n)\n\nresponse = client.chat.completions.create(\n messages=messages,\n model=MODEL,\n tools=tools,\n tool_choice={\n \"type\": \"function\", \n \"function\": {\"name\": \"analyze_sentiment\"}}\n)\n\nprint(f\"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nResponse: '{\n\"sentiment\": \"positive\",\n\"reason\": \"Appreciation for the movie and actor\"\n}'\n```\n:::\n:::\n\n\nHere, again, we could also constrain the possibilities for the `reason` to a certain set. \nHence, functions are a great way to get more consistent answers from the language model so that we can use it in applications.\n\n", "supporting": [ "parameterization_files" ], diff --git a/_freeze/llm/prompting/execute-results/html.json b/_freeze/llm/prompting/execute-results/html.json index bb0b0f3..0a75cb9 100644 --- a/_freeze/llm/prompting/execute-results/html.json +++ b/_freeze/llm/prompting/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "d550bb699b724b045b6d42c1602ebf1c", + "hash": "c0652d9a13ad67e81575d6e1d131ce24", "result": { "engine": "jupyter", - "markdown": "---\ntitle: Prompting\nformat:\n html:\n code-fold: true\n---\n\n**Resources:** \n- https://platform.openai.com/docs/guides/prompt-engineering\n- \n\n", + "markdown": "---\ntitle: Prompting\nformat:\n html:\n code-fold: true\n---\n\nLearning prompting is a science in itself. \nThe difficulty lies in the probabilistic nature of the language models. \nThat means that small changes to your prompt (that you might even find insignificant) can have a large impact on the result or answer.\nIn particular, the effects do not have to be \"logical\", i.e., they do not have to follow from your changes in a comprehensible or reproducible way. \nThis can sometimes be frustrating, but it can also be avoided in many cases by following the right instructions for prompting. \nFor these, it is best to follow the advice of the model's creators.\n\n\n::: {.callout-note}\n_The following is taken from the [OpenAI Guide](https://platform.openai.com/docs/guides/prompt-engineering){.external}_\n:::\n\n#### Write clear instructions\nThese models can’t read your mind. If outputs are too long, ask for brief replies. If outputs are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you’d like to see. The less the model has to guess at what you want, the more likely you’ll get it.\n\nTactics:\n\n- Include details in your query to get more relevant answers\n- Ask the model to adopt a persona\n- Use delimiters to clearly indicate distinct parts of the input\n- Specify the steps required to complete a task\n- Provide examples\n- Specify the desired length of the output\n
\n\n#### Provide reference text\nLanguage models can confidently invent fake answers, especially when asked about esoteric topics or for citations and URLs. In the same way that a sheet of notes can help a student do better on a test, providing reference text to these models can help in answering with fewer fabrications.\n\nTactics:\n\n- Instruct the model to answer using a reference text\n- Instruct the model to answer with citations from a reference text\n
\n\n#### Split complex tasks into simpler subtasks\nJust as it is good practice in software engineering to decompose a complex system into a set of modular components, the same is true of tasks submitted to a language model. Complex tasks tend to have higher error rates than simpler tasks. Furthermore, complex tasks can often be re-defined as a workflow of simpler tasks in which the outputs of earlier tasks are used to construct the inputs to later tasks.\n\nTactics:\n\n- Use intent classification to identify the most relevant instructions for a user query\n- For dialogue applications that require very long conversations, summarize or filter previous dialogue\n- Summarize long documents piecewise and construct a full summary recursively\n
\n\n#### Give the model time to \"think\"\nIf asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a \"chain of thought\" before an answer can help the model reason its way toward correct answers more reliably.\n\nTactics:\n\n- Instruct the model to work out its own solution before rushing to a conclusion\n- Use inner monologue or a sequence of queries to hide the model's reasoning process\n- Ask the model if it missed anything on previous passes\n
\n\n#### Use external tools\nCompensate for the weaknesses of the model by feeding it the outputs of other tools. For example, a text retrieval system (sometimes called RAG or retrieval augmented generation) can tell the model about relevant documents. A code execution engine like OpenAI's Code Interpreter can help the model do math and run code. If a task can be done more reliably or efficiently by a tool rather than by a language model, offload it to get the best of both.\n\nTactics:\n\n- Use embeddings-based search to implement efficient knowledge retrieval\n- Use code execution to perform more accurate calculations or call external APIs\n- Give the model access to specific functions\n
\n\n#### Test changes systematically\nImproving performance is easier if you can measure it. In some cases a modification to a prompt will achieve better performance on a few isolated examples but lead to worse overall performance on a more representative set of examples. Therefore to be sure that a change is net positive to performance it may be necessary to define a comprehensive test suite (also known an as an \"eval\").\n\nTactic:\n\n- Evaluate model outputs with reference to gold-standard answers\n\n", "supporting": [ "prompting_files" ], diff --git a/_freeze/nlp/overview/execute-results/html.json b/_freeze/nlp/overview/execute-results/html.json index f49bb38..867a963 100644 --- a/_freeze/nlp/overview/execute-results/html.json +++ b/_freeze/nlp/overview/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "511ca1f1badee45c381c211cb06bcc5e", + "hash": "f5b2521cd9ea65125fb9f9725dfe0627", "result": { "engine": "jupyter", - "markdown": "---\ntitle: Overview of NLP\nformat:\n html:\n code-fold: false\n---\n\n## A short history of Natural Language Processing\n\nThe field of Natural Language Processing (NLP) has undergone a remarkable evolution, spanning decades and driven by the convergence of computer science, artificial intelligence, and linguistics. \nFrom its nascent stages to its current state, NLP has witnessed transformative shifts, propelled by groundbreaking research and technological advancements. \nToday, it stands as a testament to humanity's quest to bridge the gap between human language and machine comprehension. \nThe journey through NLP's history offers profound insights into its trajectory and the challenges encountered along the way.\n\n#### Early Days: Rule-Based Approaches (1960s-1980s)\nIn its infancy, NLP relied heavily on rule-based approaches, where researchers painstakingly crafted sets of linguistic rules to analyze and manipulate text. \nThis period, spanning from the 1960s to the 1980s, saw significant efforts in tasks such as part-of-speech tagging, named entity recognition, and machine translation. \nHowever, rule-based systems struggled to cope with the inherent ambiguity and complexity of natural language. \nDifferent languages presented unique challenges, necessitating the development of language-specific rulesets. Despite their limitations, rule-based approaches laid the groundwork for future advancements in NLP.\n\n#### Rise of Statistical Methods (1990s-2000s)\nThe 1990s marked a pivotal shift in NLP with the emergence of statistical methods as a viable alternative to rule-based approaches. \nResearchers began harnessing the power of statistics and probabilistic models to analyze large corpora of text. \nTechniques like Hidden Markov Models and Conditional Random Fields gained prominence, offering improved performance in tasks such as text classification, sentiment analysis, and information extraction. \nStatistical methods represented a departure from rigid rule-based systems, allowing for greater flexibility and adaptability. \nHowever, they still grappled with the nuances and intricacies of human language, particularly in handling ambiguity and context.\n\n#### Machine Learning Revolution (2010s)\nThe advent of the 2010s witnessed a revolution in NLP fueled by the rise of machine learning, particularly deep learning. \nWith the availability of vast amounts of annotated data and unprecedented computational power, researchers explored neural network architectures tailored for NLP tasks. 
\nRecurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) gained traction, demonstrating impressive capabilities in tasks such as sentiment analysis, text classification, and sequence generation. \nThese models represented a significant leap forward in NLP, enabling more nuanced and context-aware language processing.\n\n#### Large Language Models: Transformers (2010s-Present)\nThe latter half of the 2010s heralded the rise of large language models, epitomized by the revolutionary Transformer architecture.\nPowered by self-attention mechanisms, Transformers excel at capturing long-range dependencies in text and generating coherent and contextually relevant responses. \nPre-trained on massive text corpora, models like GPT (Generative Pretrained Transformer) have achieved unprecedented performance across a wide range of NLP tasks, including machine translation, question-answering, and language understanding. \nTheir ability to leverage vast amounts of data and learn intricate patterns has propelled NLP to new heights of sophistication.\n\n#### Challenges in NLP\nDespite the remarkable progress, NLP grapples with a myriad of challenges that continue to shape its trajectory:\n\n- **Ambiguity of Language**: The inherent ambiguity of natural language poses significant challenges in accurately interpreting meaning, especially in tasks like sentiment analysis and named entity recognition.\n \n- **Different Languages**: NLP systems often struggle with languages other than English, facing variations in syntax, semantics, and cultural nuances, requiring tailored approaches for each language.\n\n- **Bias**: NLP models can perpetuate biases present in the training data, leading to unfair or discriminatory outcomes, particularly in tasks like text classification and machine translation.\n\n- **Importance of Context**: Understanding context is paramount for NLP tasks, as the meaning of words and phrases can vary drastically depending on the surrounding context.\n\n- **World Knowledge**: NLP systems lack comprehensive world knowledge, hindering their ability to understand references, idioms, and cultural nuances embedded in text.\n\n- **Common Sense Reasoning**: Despite advancements, NLP models still struggle with common sense reasoning, often producing nonsensical or irrelevant responses in complex scenarios.\n\n#### Conclusion\nThe journey of NLP from rule-based systems to large language models has been marked by remarkable progress and continuous innovation. \nWhile challenges persist, ongoing research and development efforts hold the promise of overcoming these obstacles and unlocking new frontiers in language understanding. \nAs NLP continues to evolve, driven by advancements in machine learning and computational resources, it brings us closer to the realization of truly intelligent systems capable of understanding and interacting with human language in profound ways.\n\n\n## Classic NLP tasks/applications\n\n#### Part-of-Speech Tagging\nPart-of-speech tagging involves labeling each word in a sentence with its corresponding grammatical category, such as noun, verb, adjective, or adverb. \nFor example, in the sentence \"The cat is sleeping,\" part-of-speech tagging would identify \"cat\" as a noun and \"sleeping\" as a verb. \nThis task is crucial for many NLP applications, including language understanding, information retrieval, and machine translation. 
\nAccurate part-of-speech tagging lays the foundation for deeper linguistic analysis and improves the performance of downstream tasks.\n\n
\nCode example\n\n::: {#d87df8e1 .cell execution_count=1}\n``` {.python .cell-code}\nimport spacy\n\n# Load the English language model\nnlp = spacy.load(\"en_core_web_sm\")\n\n# Example text\ntext = \"The sun sets behind the mountains, casting a golden glow across the sky.\"\n\n# Process the text with spaCy\ndoc = nlp(text)\n\n# Find the maximum length of token text and POS tag\nmax_token_length = max(len(token.text) for token in doc)\nmax_pos_length = max(len(token.pos_) for token in doc)\n\n# Print each token along with its part-of-speech tag\nfor token in doc:\n print(f\"Token: {token.text.ljust(max_token_length)} | POS Tag: {token.pos_.ljust(max_pos_length)}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nToken: The | POS Tag: DET \nToken: sun | POS Tag: NOUN \nToken: sets | POS Tag: VERB \nToken: behind | POS Tag: ADP \nToken: the | POS Tag: DET \nToken: mountains | POS Tag: NOUN \nToken: , | POS Tag: PUNCT\nToken: casting | POS Tag: VERB \nToken: a | POS Tag: DET \nToken: golden | POS Tag: ADJ \nToken: glow | POS Tag: NOUN \nToken: across | POS Tag: ADP \nToken: the | POS Tag: DET \nToken: sky | POS Tag: NOUN \nToken: . | POS Tag: PUNCT\n```\n:::\n:::\n\n\n
\n\n\n\n#### Named Entity Recognition\nNamed Entity Recognition (NER) involves identifying and classifying named entities in text, such as people, organizations, locations, dates, and more. For instance, in the sentence \"Apple is headquartered in Cupertino,\" NER would identify \"Apple\" as an organization and \"Cupertino\" as a location. \nNER is essential for various applications, including information retrieval, document summarization, and question-answering systems. Accurate NER enables machines to extract meaningful information from unstructured text data.\n\n
\nCode example\n\n::: {#acc47b23 .cell execution_count=2}\n``` {.python .cell-code}\nimport spacy\n\n# Load the English language model\nnlp = spacy.load(\"en_core_web_sm\")\n\n# Example text\ntext = \"Apple is considering buying a startup called U.K. based company in London for $1 billion.\"\n\n# Process the text with spaCy\ndoc = nlp(text)\n\n# Print each token along with its Named Entity label\nfor ent in doc.ents:\n print(f\"Entity: {ent.text.ljust(20)} | Label: {ent.label_}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nEntity: Apple | Label: ORG\nEntity: U.K. | Label: GPE\nEntity: London | Label: GPE\nEntity: $1 billion | Label: MONEY\n```\n:::\n:::\n\n\n
\n\n\n\n#### Machine Translation\nMachine Translation (MT) aims to automatically translate text from one language to another, facilitating communication across language barriers. \nFor example, translating a sentence from English to Spanish or vice versa. \nMT systems utilize sophisticated algorithms and linguistic models to generate accurate translations while preserving the original meaning and nuances of the text. \nMT has numerous practical applications, including cross-border communication, localization of software and content, and global commerce.\n\n#### Sentiment Analysis\nSentiment Analysis involves analyzing text data to determine the sentiment or opinion expressed within it, such as positive, negative, or neutral. \nFor instance, analyzing product reviews to gauge customer satisfaction or monitoring social media sentiment towards a brand. \nSentiment Analysis employs machine learning algorithms to classify text based on sentiment, enabling businesses to understand customer feedback, track public opinion, and make data-driven decisions.\n\n
\nCode example\n\n::: {#02b6acb2 .cell execution_count=3}\n``` {.python .cell-code}\n# python -m textblob.download_corpora\n\nfrom textblob import TextBlob\n\n# Example text\ntext = \"I love TextBlob! It's an amazing library for natural language processing.\"\n\n# Perform sentiment analysis with TextBlob\nblob = TextBlob(text)\nsentiment_score = blob.sentiment.polarity\n\n# Determine sentiment label based on sentiment score\nif sentiment_score > 0:\n sentiment_label = \"Positive\"\nelif sentiment_score < 0:\n sentiment_label = \"Negative\"\nelse:\n sentiment_label = \"Neutral\"\n\n# Print sentiment analysis results\nprint(f\"Text: {text}\")\nprint(f\"Sentiment Score: {sentiment_score:.2f}\")\nprint(f\"Sentiment Label: {sentiment_label}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nText: I love TextBlob! It's an amazing library for natural language processing.\nSentiment Score: 0.44\nSentiment Label: Positive\n```\n:::\n:::\n\n\n
\n\n\n#### Text Classification\nText Classification is the task of automatically categorizing text documents into predefined categories or classes. \nFor example, classifying news articles into topics like politics, sports, or entertainment. \nText Classification is widely used in various domains, including email spam detection, sentiment analysis, and content categorization. \nIt enables organizations to organize and process large volumes of textual data efficiently, leading to improved decision-making and information retrieval.\n\n
\nCode example\n\n::: {#356c23f3 .cell execution_count=4}\n``` {.python .cell-code}\nfrom sklearn.feature_extraction.text import TfidfVectorizer\nfrom sklearn.svm import SVC\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.preprocessing import LabelEncoder\n\n# Example labeled dataset\ntexts = [\n \"I love this product!\",\n \"This product is terrible.\",\n \"Great service, highly recommended.\",\n \"I had a bad experience with this company.\",\n]\nlabels = [\n \"Positive\",\n \"Negative\",\n \"Positive\",\n \"Negative\",\n]\n\n# Create a TF-IDF vectorizer\nvectorizer = TfidfVectorizer()\n\n# Encode labels as integers\nlabel_encoder = LabelEncoder()\nencoded_labels = label_encoder.fit_transform(labels)\n\n# Create a pipeline with TF-IDF vectorizer and SVM classifier\nclassifier = make_pipeline(vectorizer, SVC(kernel='linear'))\n\n# Train the classifier\nclassifier.fit(texts, encoded_labels)\n\n# Example test text\ntest_text = \"This product exceeded my expectations.\"\n\n# Predict the label for the test text\npredicted_label = classifier.predict([test_text])[0]\n\n# Decode the predicted label back to original label\npredicted_label_text = label_encoder.inverse_transform([predicted_label])[0]\n\n# Print the predicted label\nprint(f\"Text: {test_text}\")\nprint(f\"Predicted Label: {predicted_label_text}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nText: This product exceeded my expectations.\nPredicted Label: Negative\n```\n:::\n:::\n\n\n
\n\n\n#### Information Extraction\nInformation Extraction involves automatically extracting structured information from unstructured text data, such as documents, articles, or web pages. \nThis includes identifying entities, relationships, and events mentioned in the text. \nFor example, extracting names of people mentioned in news articles or detecting company acquisitions from financial reports. \nInformation Extraction plays a crucial role in tasks like knowledge base construction, data integration, and business intelligence.\n\n#### Question-Answering\nQuestion-Answering (QA) systems aim to automatically generate accurate answers to user queries posed in natural language. \nThese systems comprehend the meaning of questions and retrieve relevant information from a knowledge base or text corpus to provide precise responses. \nFor example, answering factual questions like \"Who is the president of the United States?\" or \"What is the capital of France?\". \nQA systems are essential for information retrieval, virtual assistants, and educational applications, enabling users to access information quickly and efficiently.\n\n", + "markdown": "---\ntitle: Overview of NLP\nformat:\n html:\n code-fold: false\n---\n\n## A short history of Natural Language Processing\n\nThe field of Natural Language Processing (NLP) has undergone a remarkable evolution, spanning decades and driven by the convergence of computer science, artificial intelligence, and linguistics. \nFrom its nascent stages to its current state, NLP has witnessed transformative shifts, propelled by groundbreaking research and technological advancements. \nToday, it stands as a testament to humanity's quest to bridge the gap between human language and machine comprehension. \nThe journey through NLP's history offers profound insights into its trajectory and the challenges encountered along the way.\n\n#### Early Days: Rule-Based Approaches (1960s-1980s)\nIn its infancy, NLP relied heavily on rule-based approaches, where researchers painstakingly crafted sets of linguistic rules to analyze and manipulate text. \nThis period, spanning from the 1960s to the 1980s, saw significant efforts in tasks such as part-of-speech tagging, named entity recognition, and machine translation. \nHowever, rule-based systems struggled to cope with the inherent ambiguity and complexity of natural language. \nDifferent languages presented unique challenges, necessitating the development of language-specific rulesets. Despite their limitations, rule-based approaches laid the groundwork for future advancements in NLP.\n\n#### Rise of Statistical Methods (1990s-2000s)\nThe 1990s marked a pivotal shift in NLP with the emergence of statistical methods as a viable alternative to rule-based approaches. \nResearchers began harnessing the power of statistics and probabilistic models to analyze large corpora of text. \nTechniques like Hidden Markov Models and Conditional Random Fields gained prominence, offering improved performance in tasks such as text classification, sentiment analysis, and information extraction. \nStatistical methods represented a departure from rigid rule-based systems, allowing for greater flexibility and adaptability. \nHowever, they still grappled with the nuances and intricacies of human language, particularly in handling ambiguity and context.\n\n#### Machine Learning Revolution (2010s)\nThe advent of the 2010s witnessed a revolution in NLP fueled by the rise of machine learning, particularly deep learning. 
\nWith the availability of vast amounts of annotated data and unprecedented computational power, researchers explored neural network architectures tailored for NLP tasks. \nRecurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) gained traction, demonstrating impressive capabilities in tasks such as sentiment analysis, text classification, and sequence generation. \nThese models represented a significant leap forward in NLP, enabling more nuanced and context-aware language processing.\n\n#### Large Language Models: Transformers (2010s-Present)\nThe latter half of the 2010s heralded the rise of large language models, epitomized by the revolutionary Transformer architecture.\nPowered by self-attention mechanisms, Transformers excel at capturing long-range dependencies in text and generating coherent and contextually relevant responses. \nPre-trained on massive text corpora, models like GPT (Generative Pre-trained Transformer) have achieved unprecedented performance across a wide range of NLP tasks, including machine translation, question-answering, and language understanding. \nTheir ability to leverage vast amounts of data and learn intricate patterns has propelled NLP to new heights of sophistication.\n\n#### Challenges in NLP\nDespite the remarkable progress, NLP grapples with a myriad of challenges that continue to shape its trajectory:\n\n- **Ambiguity of Language**: The inherent ambiguity of natural language poses significant challenges in accurately interpreting meaning, especially in tasks like sentiment analysis and named entity recognition.\n \n- **Different Languages**: NLP systems often struggle with languages other than English, facing variations in syntax, semantics, and cultural nuances, requiring tailored approaches for each language.\n\n- **Bias**: NLP models can perpetuate biases present in the training data, leading to unfair or discriminatory outcomes, particularly in tasks like text classification and machine translation.\n\n- **Importance of Context**: Understanding context is paramount for NLP tasks, as the meaning of words and phrases can vary drastically depending on the surrounding context.\n\n- **World Knowledge**: NLP systems lack comprehensive world knowledge, hindering their ability to understand references, idioms, and cultural nuances embedded in text.\n\n- **Common Sense Reasoning**: Despite advancements, NLP models still struggle with common sense reasoning, often producing nonsensical or irrelevant responses in complex scenarios.\n\n#### Conclusion\nThe journey of NLP from rule-based systems to large language models has been marked by remarkable progress and continuous innovation. \nWhile challenges persist, ongoing research and development efforts hold the promise of overcoming these obstacles and unlocking new frontiers in language understanding. \nAs NLP continues to evolve, driven by advancements in machine learning and computational resources, it brings us closer to the realization of truly intelligent systems capable of understanding and interacting with human language in profound ways.\n\n\n## Classic NLP tasks/applications\n\n#### Part-of-Speech Tagging\nPart-of-speech tagging involves labeling each word in a sentence with its corresponding grammatical category, such as noun, verb, adjective, or adverb. \nFor example, in the sentence \"The cat is sleeping,\" part-of-speech tagging would identify \"cat\" as a noun and \"sleeping\" as a verb. 
\nThis task is crucial for many NLP applications, including language understanding, information retrieval, and machine translation. \nAccurate part-of-speech tagging lays the foundation for deeper linguistic analysis and improves the performance of downstream tasks.\n\n
\nCode example\n\n::: {#1de29527 .cell execution_count=1}\n``` {.python .cell-code}\nimport spacy\n\n# Load the English language model\nnlp = spacy.load(\"en_core_web_sm\")\n\n# Example text\ntext = \"The sun sets behind the mountains, casting a golden glow across the sky.\"\n\n# Process the text with spaCy\ndoc = nlp(text)\n\n# Find the maximum length of token text and POS tag\nmax_token_length = max(len(token.text) for token in doc)\nmax_pos_length = max(len(token.pos_) for token in doc)\n\n# Print each token along with its part-of-speech tag\nfor token in doc:\n print(f\"Token: {token.text.ljust(max_token_length)} | POS Tag: {token.pos_.ljust(max_pos_length)}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nToken: The | POS Tag: DET \nToken: sun | POS Tag: NOUN \nToken: sets | POS Tag: VERB \nToken: behind | POS Tag: ADP \nToken: the | POS Tag: DET \nToken: mountains | POS Tag: NOUN \nToken: , | POS Tag: PUNCT\nToken: casting | POS Tag: VERB \nToken: a | POS Tag: DET \nToken: golden | POS Tag: ADJ \nToken: glow | POS Tag: NOUN \nToken: across | POS Tag: ADP \nToken: the | POS Tag: DET \nToken: sky | POS Tag: NOUN \nToken: . | POS Tag: PUNCT\n```\n:::\n:::\n\n\n
\n\n\n\n#### Named Entity Recognition\nNamed Entity Recognition (NER) involves identifying and classifying named entities in text, such as people, organizations, locations, dates, and more. For instance, in the sentence \"Apple is headquartered in Cupertino,\" NER would identify \"Apple\" as an organization and \"Cupertino\" as a location. \nNER is essential for various applications, including information retrieval, document summarization, and question-answering systems. Accurate NER enables machines to extract meaningful information from unstructured text data.\n\n
\nCode example\n\n::: {#e9d253d7 .cell execution_count=2}\n``` {.python .cell-code}\nimport spacy\n\n# Load the English language model\nnlp = spacy.load(\"en_core_web_sm\")\n\n# Example text\ntext = \"Apple is considering buying a startup called U.K. based company in London for $1 billion.\"\n\n# Process the text with spaCy\ndoc = nlp(text)\n\n# Print each token along with its Named Entity label\nfor ent in doc.ents:\n print(f\"Entity: {ent.text.ljust(20)} | Label: {ent.label_}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nEntity: Apple | Label: ORG\nEntity: U.K. | Label: GPE\nEntity: London | Label: GPE\nEntity: $1 billion | Label: MONEY\n```\n:::\n:::\n\n\n
\n\n\n#### Machine Translation\nMachine Translation (MT) aims to automatically translate text from one language to another, facilitating communication across language barriers. \nFor example, translating a sentence from English to Spanish or vice versa. \nMT systems utilize sophisticated algorithms and linguistic models to generate accurate translations while preserving the original meaning and nuances of the text. \nMT has numerous practical applications, including cross-border communication, localization of software and content, and global commerce.\n\n#### Sentiment Analysis {#sec-sentiment-analysis}\n\nSentiment Analysis involves analyzing text data to determine the sentiment or opinion expressed within it, such as positive, negative, or neutral. \nFor instance, analyzing product reviews to gauge customer satisfaction or monitoring social media sentiment towards a brand. \nSentiment Analysis employs machine learning algorithms to classify text based on sentiment, enabling businesses to understand customer feedback, track public opinion, and make data-driven decisions.\n\n
\nCode example\n\n::: {#8ee69b14 .cell execution_count=3}\n``` {.python .cell-code}\n# python -m textblob.download_corpora\n\nfrom textblob import TextBlob\n\n# Example text\ntext = \"I love TextBlob! It's an amazing library for natural language processing.\"\n\n# Perform sentiment analysis with TextBlob\nblob = TextBlob(text)\nsentiment_score = blob.sentiment.polarity\n\n# Determine sentiment label based on sentiment score\nif sentiment_score > 0:\n sentiment_label = \"Positive\"\nelif sentiment_score < 0:\n sentiment_label = \"Negative\"\nelse:\n sentiment_label = \"Neutral\"\n\n# Print sentiment analysis results\nprint(f\"Text: {text}\")\nprint(f\"Sentiment Score: {sentiment_score:.2f}\")\nprint(f\"Sentiment Label: {sentiment_label}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nText: I love TextBlob! It's an amazing library for natural language processing.\nSentiment Score: 0.44\nSentiment Label: Positive\n```\n:::\n:::\n\n\n
\n\n\n#### Text Classification\nText Classification is the task of automatically categorizing text documents into predefined categories or classes. \nFor example, classifying news articles into topics like politics, sports, or entertainment. \nText Classification is widely used in various domains, including email spam detection, sentiment analysis, and content categorization. \nIt enables organizations to organize and process large volumes of textual data efficiently, leading to improved decision-making and information retrieval.\n\n
\nCode example\n\n::: {#b55c09cb .cell execution_count=4}\n``` {.python .cell-code}\nfrom sklearn.feature_extraction.text import TfidfVectorizer\nfrom sklearn.svm import SVC\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.preprocessing import LabelEncoder\n\n# Example labeled dataset\ntexts = [\n \"I love this product!\",\n \"This product is terrible.\",\n \"Great service, highly recommended.\",\n \"I had a bad experience with this company.\",\n]\nlabels = [\n \"Positive\",\n \"Negative\",\n \"Positive\",\n \"Negative\",\n]\n\n# Create a TF-IDF vectorizer\nvectorizer = TfidfVectorizer()\n\n# Encode labels as integers\nlabel_encoder = LabelEncoder()\nencoded_labels = label_encoder.fit_transform(labels)\n\n# Create a pipeline with TF-IDF vectorizer and SVM classifier\nclassifier = make_pipeline(vectorizer, SVC(kernel='linear'))\n\n# Train the classifier\nclassifier.fit(texts, encoded_labels)\n\n# Example test text\ntest_text = \"This product exceeded my expectations.\"\n\n# Predict the label for the test text\npredicted_label = classifier.predict([test_text])[0]\n\n# Decode the predicted label back to original label\npredicted_label_text = label_encoder.inverse_transform([predicted_label])[0]\n\n# Print the predicted label\nprint(f\"Text: {test_text}\")\nprint(f\"Predicted Label: {predicted_label_text}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nText: This product exceeded my expectations.\nPredicted Label: Negative\n```\n:::\n:::\n\n\n
\n\n\n#### Information Extraction\nInformation Extraction involves automatically extracting structured information from unstructured text data, such as documents, articles, or web pages. \nThis includes identifying entities, relationships, and events mentioned in the text. \nFor example, extracting names of people mentioned in news articles or detecting company acquisitions from financial reports. \nInformation Extraction plays a crucial role in tasks like knowledge base construction, data integration, and business intelligence.\n\n#### Question-Answering\nQuestion-Answering (QA) systems aim to automatically generate accurate answers to user queries posed in natural language. \nThese systems comprehend the meaning of questions and retrieve relevant information from a knowledge base or text corpus to provide precise responses. \nFor example, answering factual questions like \"Who is the president of the United States?\" or \"What is the capital of France?\". \nQA systems are essential for information retrieval, virtual assistants, and educational applications, enabling users to access information quickly and efficiently.\n\n", "supporting": [ "overview_files" ], diff --git a/_freeze/nlp/tokenization/execute-results/html.json b/_freeze/nlp/tokenization/execute-results/html.json index 9be1827..686e0e3 100644 --- a/_freeze/nlp/tokenization/execute-results/html.json +++ b/_freeze/nlp/tokenization/execute-results/html.json @@ -1,8 +1,8 @@ { - "hash": "015b9b99e573753e3289e6d5f046eca5", + "hash": "b9faa2cc27aa2ac93360727c228d3d38", "result": { "engine": "jupyter", - "markdown": "---\ntitle: Tokenization\nformat:\n html:\n code-fold: false\n---\n\nTODO: Some introductory sentence.\n\n## Simple word tokenization\nA key element for a computer to understand the words we speak or type is the concept of word tokenization. \nFor a human, the sentence \n\n::: {#4561108d .cell execution_count=1}\n``` {.python .cell-code}\nsentence = \"I love reading science fiction books or books about science.\"\n```\n:::\n\n\nis easy to understand since we are able to split the sentence into its individual parts in order to figure out the meaning of the full sentence.\nFor a computer, the sentence is just a simple string of characters, like any other word or longer text.\nIn order to make a computer understand the meaning of a sentence, we need to help break it down into its relevant parts.\n\nSimply put, word tokenization is the process of breaking down a piece of text into individual words or so-called tokens. \nIt is like taking a sentence and splitting it into smaller pieces, where each piece represents a word.\nWord tokenization involves analyzing the text character by character and identifying boundaries between words. \nIt uses various rules and techniques to decide where one word ends and the next one begins. \nFor example, spaces, punctuation marks, and special characters often serve as natural boundaries between words.\n\nSo let's start breaking down the sentence into its individual parts.\n\n::: {#36ccc15c .cell execution_count=2}\n``` {.python .cell-code}\ntokenized_sentence = sentence.split(\" \")\nprint(tokenized_sentence)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n['I', 'love', 'reading', 'science', 'fiction', 'books', 'or', 'books', 'about', 'science.']\n```\n:::\n:::\n\n\nOnce we have tokenized the sentence, we can start anaylzing it with some simple statistical methods. \nFor example, in order to figure out what the sentence might be about, we could count the most frequent words. 
\n\n::: {#4b5bb732 .cell execution_count=3}\n``` {.python .cell-code}\nfrom collections import Counter\n\ntoken_counter = Counter(tokenized_sentence)\nprint(token_counter.most_common(2))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[('books', 2), ('I', 1)]\n```\n:::\n:::\n\n\nUnfortunately, we already realize that we have not done the best job with our \"tokenizer\": The second occurence of the word `science` is missing do to the punctuation. \nWhile this is great as it holds information about the ending of a sentence, it disturbs our analysis here, so let's get rid of it. \n\n::: {#54b1654a .cell execution_count=4}\n``` {.python .cell-code}\ntokenized_sentence = sentence.replace(\".\", \" \").split(\" \")\n\ntoken_counter = Counter(tokenized_sentence)\nprint(token_counter.most_common(2))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[('science', 2), ('books', 2)]\n```\n:::\n:::\n\n\nSo that worked.\nAs you can imagine, tokenization can get increasingly difficult when we have to deal with all sorts of situations in larger corpora of texts (see also the exercise). \nSo it is great that there are already all sorts of libraries available that can help us with this process. \n\n::: {#7a06e431 .cell execution_count=5}\n``` {.python .cell-code}\nfrom nltk.tokenize import wordpunct_tokenize\nfrom string import punctuation\n\ntokenized_sentence = wordpunct_tokenize(sentence)\ntokenized_sentence = [t for t in tokenized_sentence if t not in punctuation]\nprint(tokenized_sentence)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n['I', 'love', 'reading', 'science', 'fiction', 'books', 'or', 'books', 'about', 'science']\n```\n:::\n:::\n\n\n## Advanced word tokenization\n\nTODO: Write\n\n\nFrom the docs: \n\nhttps://platform.openai.com/tokenizer\n\nA helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).\n\n", + "markdown": "---\ntitle: Tokenization\nformat:\n html:\n code-fold: false\n---\n\nTODO: Some introductory sentence.\n\n## Simple word tokenization\nA key element for a computer to understand the words we speak or type is the concept of word tokenization. \nFor a human, the sentence \n\n::: {#706b1324 .cell execution_count=1}\n``` {.python .cell-code}\nsentence = \"I love reading science fiction books or books about science.\"\n```\n:::\n\n\nis easy to understand since we are able to split the sentence into its individual parts in order to figure out the meaning of the full sentence.\nFor a computer, the sentence is just a simple string of characters, like any other word or longer text.\nIn order to make a computer understand the meaning of a sentence, we need to help break it down into its relevant parts.\n\nSimply put, word tokenization is the process of breaking down a piece of text into individual words or so-called tokens. \nIt is like taking a sentence and splitting it into smaller pieces, where each piece represents a word.\nWord tokenization involves analyzing the text character by character and identifying boundaries between words. \nIt uses various rules and techniques to decide where one word ends and the next one begins. 
\nFor example, spaces, punctuation marks, and special characters often serve as natural boundaries between words.\n\nSo let's start breaking down the sentence into its individual parts.\n\n::: {#650aa91e .cell execution_count=2}\n``` {.python .cell-code}\ntokenized_sentence = sentence.split(\" \")\nprint(tokenized_sentence)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n['I', 'love', 'reading', 'science', 'fiction', 'books', 'or', 'books', 'about', 'science.']\n```\n:::\n:::\n\n\nOnce we have tokenized the sentence, we can start analyzing it with some simple statistical methods. \nFor example, in order to figure out what the sentence might be about, we could count the most frequent words. \n\n::: {#f5e22bc0 .cell execution_count=3}\n``` {.python .cell-code}\nfrom collections import Counter\n\ntoken_counter = Counter(tokenized_sentence)\nprint(token_counter.most_common(2))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[('books', 2), ('I', 1)]\n```\n:::\n:::\n\n\nUnfortunately, we already realize that we have not done the best job with our \"tokenizer\": The second occurrence of the word `science` is missing due to the punctuation. \nWhile the punctuation is useful, as it holds information about the ending of a sentence, it disturbs our analysis here, so let's get rid of it. \n\n::: {#4547c10a .cell execution_count=4}\n``` {.python .cell-code}\ntokenized_sentence = sentence.replace(\".\", \" \").split(\" \")\n\ntoken_counter = Counter(tokenized_sentence)\nprint(token_counter.most_common(2))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[('science', 2), ('books', 2)]\n```\n:::\n:::\n\n\nSo that worked.\nAs you can imagine, tokenization can get increasingly difficult when we have to deal with all sorts of situations in larger corpora of texts (see also the exercise). \nSo it is great that there are already all sorts of libraries available that can help us with this process. \n\n::: {#f635468e .cell execution_count=5}\n``` {.python .cell-code}\nfrom nltk.tokenize import wordpunct_tokenize\nfrom string import punctuation\n\ntokenized_sentence = wordpunct_tokenize(sentence)\ntokenized_sentence = [t for t in tokenized_sentence if t not in punctuation]\nprint(tokenized_sentence)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n['I', 'love', 'reading', 'science', 'fiction', 'books', 'or', 'books', 'about', 'science']\n```\n:::\n:::\n\n\n## Advanced word tokenization\n\nThe above ideas illustrate well the core idea of tokenization: splitting text into smaller chunks that we can feed to a language model.\nIn practice, especially in models like GPT, a critical component is the vocabulary or the set of unique words or tokens the model understands.\nTraditional approaches use fixed-size vocabularies, which means every unique word in the corpus has its own representation (index or embedding) in the model's vocabulary. \nHowever, as the vocabulary size increases (for example, by including more languages), so does the memory requirement, which can be impractical for large-scale language models. \nOne solution is the so-called byte-pair encoding.\nByte pair encoding is a data compression technique specifically designed to tackle the issue of large vocabularies in language models. \nInstead of assigning a unique index or embedding to each token, byte pair encoding identifies frequent pairs of characters (bytes) within the corpus and represents them as a single token. 
\nThis effectively reduces the size of the vocabulary while preserving the essential information needed for language modeling tasks.\n\n\n### How Byte Pair Encoding Works:\n\n1. **Tokenization**: The first step in byte pair encoding is tokenization, where the text corpus is broken down into individual tokens. These tokens could be characters, subwords, or words, depending on the tokenization strategy used.\n\n2. **Pair Identification**: Next, the algorithm identifies pairs of characters (bytes) that occur frequently within the corpus. These pairs are typically consecutive characters in the text.\n\n3. **Replacement with Single Token**: Once frequent pairs are identified, they are replaced with a single token. This effectively reduces the number of unique tokens in the vocabulary.\n\n4. **Iterative Process**: The process of identifying frequent pairs and replacing them with single tokens is iterative. It continues until a predefined stopping criterion is met, such as reaching a target vocabulary size or when no more frequent pairs can be found.\n\n5. **Vocabulary Construction**: After the iterative process, a vocabulary is constructed, consisting of the single tokens generated through pair replacement, along with any remaining tokens from the original tokenization process.\n\n6. **Encoding and Decoding**: During training and inference, text data is encoded using the constructed vocabulary, where each token is represented by its corresponding index in the vocabulary. During decoding, the indices are mapped back to their respective tokens.\n\n\n::: {.callout-tip}\nIt is very illustrative to use the OpenAI [tokenizer](https://platform.openai.com/tokenizer){.external} to see how a sentence is split up into different tokens.\nTry mixing languages and standard as well as rarer words and observe how they are split up.\n\nAnother detailed example can be found [here](https://www.geeksforgeeks.org/byte-pair-encoding-bpe-in-nlp/){.external}.\n:::\n\n\n\n### Advantages of Byte Pair Encoding:\n\n1. **Efficient Memory Usage**: Byte pair encoding significantly reduces the size of the vocabulary, leading to more efficient memory usage, especially in large-scale language models.\n\n2. **Retains Information**: Despite reducing the vocabulary size, byte pair encoding retains important linguistic information by capturing frequent character pairs.\n\n3. **Flexible**: Byte pair encoding is flexible and can be adapted to different tokenization strategies and corpus characteristics.\n\n\n### Limitations and Considerations:\n\n1. **Computational Overhead**: The iterative nature of byte pair encoding can be computationally intensive, especially for large corpora.\n\n2. **Loss of Granularity**: While byte pair encoding reduces vocabulary size, it may lead to a loss of granularity, especially for rare or out-of-vocabulary words.\n\n3. **Tokenization Strategy**: The effectiveness of byte pair encoding depends on the tokenization strategy used and the characteristics of the corpus.\n\n\n\n::: {.callout-tip}\n__From the [OpenAI Guide](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them){.external}__:\n\nA helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. 
This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).\n:::\n\n", "supporting": [ "tokenization_files" ], diff --git a/_quarto.yml b/_quarto.yml index 5a734cd..60085ee 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -62,6 +62,7 @@ website: - llm/prompting.qmd - llm/parameterization.qmd - llm/exercises/ex_gpt_parameterization.ipynb + - llm/exercises/ex_gpt_ner_with_function_calls.ipynb - section: "Embeddings" contents: diff --git a/docs/about/assignment.html b/docs/about/assignment.html index b9c2e54..5f31739 100644 --- a/docs/about/assignment.html +++ b/docs/about/assignment.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/about/projects.html b/docs/about/projects.html index 046bd99..6588e86 100644 --- a/docs/about/projects.html +++ b/docs/about/projects.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/about/schedule.html b/docs/about/schedule.html index c5776cd..4cc751f 100644 --- a/docs/about/schedule.html +++ b/docs/about/schedule.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/embeddings/applications.html b/docs/embeddings/applications.html index 9c270ce..33545f0 100644 --- a/docs/embeddings/applications.html +++ b/docs/embeddings/applications.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + @@ -416,6 +422,7 @@

Applications

+

Build a bot that can answer questions based on documents! Resource: https://platform.openai.com/docs/tutorials/web-qa-embeddings

diff --git a/docs/embeddings/clustering.html b/docs/embeddings/clustering.html index 5970247..b0aad0a 100644 --- a/docs/embeddings/clustering.html +++ b/docs/embeddings/clustering.html @@ -319,6 +319,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/embeddings/embeddings.html b/docs/embeddings/embeddings.html index 6bab2fc..2fc83ed 100644 --- a/docs/embeddings/embeddings.html +++ b/docs/embeddings/embeddings.html @@ -31,7 +31,7 @@ - + @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + @@ -841,8 +847,8 @@

Embeddings

diff --git a/docs/llm/exercises/ex_gpt_start.html b/docs/llm/exercises/ex_gpt_start.html index 20c09d8..1f355dd 100644 --- a/docs/llm/exercises/ex_gpt_start.html +++ b/docs/llm/exercises/ex_gpt_start.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/llm/gpt.html b/docs/llm/gpt.html index 0039f80..4dabfb7 100644 --- a/docs/llm/gpt.html +++ b/docs/llm/gpt.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + diff --git a/docs/llm/gpt_api.html b/docs/llm/gpt_api.html index ef3646c..30fde58 100644 --- a/docs/llm/gpt_api.html +++ b/docs/llm/gpt_api.html @@ -319,6 +319,12 @@ Exercise: GPT Parameterization + + @@ -450,12 +456,24 @@

The OpenAI API

+
+
+
+ +
+
+Note +
+
+

Resource: OpenAI API docs

+
+

Let’s get started with the OpenAI API for GPT.

Authentication

Getting started with the OpenAI Chat Completions API requires signing up for an account on the OpenAI platform. Once you’ve registered, you’ll gain access to an API key, which serves as a unique identifier for your application to authenticate requests to the API. This key is essential for ensuring secure communication between your application and OpenAI’s servers. Without proper authentication, your requests will be rejected. You can create your own account, but for the seminar we will provide the client with the credential within the Jupyterlab (TODO: Link).

-
+
# setting up the client in Python
 
 import os
@@ -470,7 +488,7 @@ 

Authentication

Requesting Completions

Most interaction with GPT and other models consist in generating completions for certain tasks (TODO: Link to completions)

To request completions from the OpenAI API, we use Python to send HTTP requests to the designated API endpoint. These requests are structured to include various parameters that guide the generation of text completions. The most fundamental parameter is the prompt text, which sets the context for the completion. Additionally, you can specify the desired model configuration, such as the engine to use (e.g., “gpt-4”), as well as any constraints or preferences for the generated completions, such as the maximum number of tokens or the temperature for controlling creativity (TODO: Link parameterization)

-
+
# creating a completion
 chat_completion = client.chat.completions.create(
     messages=[
diff --git a/docs/llm/intro.html b/docs/llm/intro.html
index b6a3775..e0c0482 100644
--- a/docs/llm/intro.html
+++ b/docs/llm/intro.html
@@ -285,6 +285,12 @@
   
  Exercise: GPT Parameterization
   
+ + diff --git a/docs/llm/parameterization.html b/docs/llm/parameterization.html index fe59998..73e7807 100644 --- a/docs/llm/parameterization.html +++ b/docs/llm/parameterization.html @@ -319,6 +319,12 @@ Exercise: GPT Parameterization
+ + @@ -450,39 +456,191 @@

Parameterization of GPT

+

The GPT models offered by OpenAI provide a variety of parameters that can change the way the language model responds. Below you can find a list of the most important ones; a short usage sketch follows the list.

    -
  • Temperature: Temperature is a parameter that controls the randomness of the generated text. Lower temperatures result in more deterministic outputs, where the model tends to choose the most likely tokens at each step. Higher temperatures introduce more randomness, allowing the model to explore less likely tokens and produce more creative outputs. It’s often used to balance between generating safe, conservative responses and more novel, imaginative ones.

  • -
  • Max Tokens: Max Tokens limits the maximum length of the generated text by specifying the maximum number of tokens (words or subwords) allowed in the output. This parameter helps to control the length of the response and prevent the model from generating overly long or verbose outputs, which may not be suitable for certain applications or contexts.

  • -
  • Top P (Nucleus Sampling): Top P, also known as nucleus sampling, dynamically selects a subset of the most likely tokens based on their cumulative probability until the cumulative probability exceeds a certain threshold (specified by the parameter). This approach ensures diversity in the generated text while still prioritizing tokens with higher probabilities. It’s particularly useful for generating diverse and contextually relevant responses.

  • -
  • Frequency Penalty: Frequency Penalty penalizes tokens based on their frequency in the generated text. Tokens that appear more frequently are assigned higher penalties, discouraging the model from repeatedly generating common or redundant tokens. This helps to promote diversity in the generated text and prevent the model from producing overly repetitive outputs.

  • -
  • Presence Penalty: Presence Penalty penalizes tokens that are already present in the input prompt. By discouraging the model from simply echoing or replicating the input text, this parameter encourages the generation of responses that go beyond the provided context. It’s useful for generating more creative and novel outputs that are not directly predictable from the input.

  • -
  • Stop Sequence: Stop Sequence specifies a sequence of tokens that, if generated by the model, signals it to stop generating further text. This parameter is commonly used to indicate the desired ending or conclusion of the generated text. It helps to control the length of the response and ensure that the model generates text that aligns with specific requirements or constraints.

  • -
  • Repetition Penalty: Repetition Penalty penalizes repeated tokens in the generated text by assigning higher penalties to tokens that appear multiple times within a short context window. This encourages the model to produce more varied outputs by avoiding unnecessary repetition of tokens. It’s particularly useful for generating coherent and diverse text without excessive redundancy.

  • -
  • Length Penalty: Length Penalty penalizes the length of the generated text by applying a penalty factor to longer sequences. This helps to balance between generating concise and informative responses while avoiding excessively long or verbose outputs. Length Penalty is often used to control the length of the generated text and ensure that it remains coherent and contextually relevant.

  • +
  • Temperature: Temperature (temperature) is a parameter that controls the randomness of the generated text. Lower temperatures result in more deterministic outputs, where the model tends to choose the most likely tokens at each step. Higher temperatures introduce more randomness, allowing the model to explore less likely tokens and produce more creative outputs. It’s often used to balance between generating safe, conservative responses and more novel, imaginative ones.

  • +
  • Max Tokens: Max Tokens (max_tokens) limits the maximum length of the generated text by specifying the maximum number of tokens (words or subwords) allowed in the output. This parameter helps to control the length of the response and prevent the model from generating overly long or verbose outputs, which may not be suitable for certain applications or contexts.

  • +
  • Top P (Nucleus Sampling): Top P (top_p), also known as nucleus sampling, dynamically selects a subset of the most likely tokens based on their cumulative probability until the cumulative probability exceeds a certain threshold (specified by the parameter). This approach ensures diversity in the generated text while still prioritizing tokens with higher probabilities. It’s particularly useful for generating diverse and contextually relevant responses.

  • +
  • Frequency Penalty: Frequency Penalty (frequency_penalty) penalizes tokens based on their frequency in the generated text. Tokens that appear more frequently are assigned higher penalties, discouraging the model from repeatedly generating common or redundant tokens. This helps to promote diversity in the generated text and prevent the model from producing overly repetitive outputs.

  • +
  • Presence Penalty: Presence Penalty (presence_penalty) penalizes tokens that are already present in the input prompt. By discouraging the model from simply echoing or replicating the input text, this parameter encourages the generation of responses that go beyond the provided context. It’s useful for generating more creative and novel outputs that are not directly predictable from the input.

  • +
  • Stop Sequence: Stop Sequence (stop) specifies a sequence of tokens that, if generated by the model, signals it to stop generating further text. This parameter is commonly used to indicate the desired ending or conclusion of the generated text. It helps to control the length of the response and ensure that the model generates text that aligns with specific requirements or constraints.
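As a rough sketch of how these parameters are used, they are simply passed as keyword arguments to the Chat Completions call. The `client` is assumed to be configured as in the API chapter; the model name, prompt and values below are illustrative only.

```python
# Sketch: passing the parameters described above to the Chat Completions endpoint.
# Assumes `client` has been set up as shown in the API chapter.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a two-sentence story about a robot."}],
    temperature=0.7,        # lower = more deterministic, higher = more random
    max_tokens=100,         # cap on the length of the generated answer
    top_p=0.9,              # nucleus sampling threshold
    frequency_penalty=0.5,  # penalize tokens the model repeats too often
    presence_penalty=0.0,   # penalize tokens that have already appeared
    stop=["\n\n"],          # stop once a blank line is generated
)

print(response.choices[0].message.content)
```

Any parameter you leave out simply falls back to the API's default value.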

Roles:

-
-
-Code -
from openai import OpenAI
-client = OpenAI()
+

In order to cover most tasks you want to perform using a chat format, the OpenAI API lets you define different roles in the chat. The available roles are system, assistant, user and tools. You should already be familiar with two of them by now: the user role corresponds to the actual user prompting the language model, and all answers are given with the assistant role.

+

The system role can now be given to provide some additional general instructions to the language model that are typically not a user input, for example, the style in which the model responds. In this case, an example is better than any explanation.

+
+
import os
+from llm_utils.client import get_openai_client
 
-completion = client.chat.completions.create(
-  model="gpt-3.5-turbo",
-  messages=[
-    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
-    {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
-  ]
-)
-
-print(completion.choices[0].message)
-
+MODEL = "gpt4" + +client = get_openai_client( + model=MODEL, + config_path=os.environ.get("CONFIG_PATH") +) + +completion = client.chat.completions.create( + model="MODEL", + messages=[ + {"role": "system", "content": "You are an annoyed technician working in a help center for dish washers, who answers in short, unfriendly bursts."}, + {"role": "user", "content": "My dish washer does not clean the dishes, what could be the reason."} + ] +) + +print(completion.choices[0].message.content)
+
+
Could be anything. Blocked spray arm. Clogged filter. Faulty pump. Detergent issue. Check all that.
+
-
-

Function calling:

-

https://platform.openai.com/docs/guides/function-calling

+
+

Function calling:

+

As we have seen, most interactions with a language model happen in the form of a chat with almost “free” questions or instructions and answers. While this seems the most natural format in most cases, it is not always practical if we want to use a language model for very specific purposes. This happens particularly often when we want to employ a language model in business situations, where we require consistent output from the model.

+

As an example, let us try to use GPT for sentiment analysis (see also here). Let’s say we want GPT to classify a text into one of the following four categories:

+
+
sentiment_categories = [
+    "positive", 
+    "negative",
+    "neutral",
+    "mixed"
+]
+
+

We could do the following:

+
+
messages = []
+messages.append(
+    {"role": "system", "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}."}
+)
+messages.append(
+    {"role": "user", "content": "I really did not like the movie."}
+)
+
+response = client.chat.completions.create(
+    messages=messages,
+    model=MODEL
+)
+
+print(f"Response: '{response.choices[0].message.content}'")
+
+
+
+
Response: 'Category: Negative'
+
+
+

It is easy to spot the problem: GPT does not necessarily answer in the way we expect or want it to. In this case, instead of simply returning the correct category, it also returns the string Category: alongside it (and a capitalized Negative). So if we were to use the answer in a program or database, we’d again have to use some NLP techniques to parse it in order to eventually retrieve exactly the category we were looking for: negative. What we need instead is a way to constrain GPT to a specific way of answering, and this is where functions or tools come into play (see also Function calling and Function calling (cookbook)).

+

This concept allows us to specify the exact output format we expect to receive from GPT (it is called functions since ideally we want to call a function directly on the output of GPT so it has to be in a specific format).

+
+
# this looks intimidating but isn't that complicated
+tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "analyze_sentiment",
+            "description": "Analyze the sentiment in a given text.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "sentiment": {
+                        "type": "string",
+                        "enum": sentiment_categories,
+                        "description": f"The sentiment of the text."
+                    }
+                },
+                "required": ["sentiment"],
+            }
+        }
+    }
+]
+
+
+
messages = []
+messages.append(
+    {"role": "system", "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}."}
+)
+messages.append(
+    {"role": "user", "content": "I really did not like the movie."}
+)
+
+response = client.chat.completions.create(
+    messages=messages,
+    model=MODEL,
+    tools=tools,
+    tool_choice={
+        "type": "function", 
+        "function": {"name": "analyze_sentiment"}}
+)
+
+print(f"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'")
+
+
Response: '{
+"sentiment": "negative"
+}'
+
+
+

We can now easily extract what we need:

+
+
import json 
+result = json.loads(response.choices[0].message.tool_calls[0].function.arguments) # remember that the answer is a string
+print(result["sentiment"])
+
+
negative
+
+
+

We can also include multiple function parameters if our desired output has multiple components. Let's try to add another parameter that captures the reason for the sentiment.

+
+
tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "analyze_sentiment",
+            "description": "Analyze the sentiment in a given text.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "sentiment": {
+                        "type": "string",
+                        "enum": sentiment_categories,
+                        "description": f"The sentiment of the text."
+                    },
+                    "reason": {
+                        "type": "string",
+                        "description": "The reason for the sentiment in a few words. If there is no information, do not make assumptions and leave it blank."
+                    }
+                },
+                "required": ["sentiment", "reason"],
+            }
+        }
+    }
+]
+
+
+
messages = []
+messages.append(
+    {"role": "system", "content": f"Classify the given text into one of the following sentiment categories: {sentiment_categories}. If you can, also extract the reason."}
+)
+messages.append(
+    {"role": "user", "content": "I loved the movie, Johnny Depp is a great actor."}
+)
+
+response = client.chat.completions.create(
+    messages=messages,
+    model=MODEL,
+    tools=tools,
+    tool_choice={
+        "type": "function", 
+        "function": {"name": "analyze_sentiment"}}
+)
+
+print(f"Response: '{response.choices[0].message.tool_calls[0].function.arguments}'")
+
+
Response: '{
+"sentiment": "positive",
+"reason": "Appreciation for the movie and actor"
+}'
+
+
+

Here, again, we could also constrain the possible values for the reason to a fixed set, as in the sketch below. Hence, functions are a great way to get more consistent answers from the language model so that we can use it in applications.
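As a sketch of that idea, we would add an enum to the reason parameter as well. The `possible_reasons` list below is invented purely for illustration and is not part of the original example; `sentiment_categories` is the list defined earlier on this page.

```python
# Sketch: also constrain "reason" to a fixed, made-up set of categories.
possible_reasons = ["acting", "story", "visual effects", "price", "other"]

tools = [
    {
        "type": "function",
        "function": {
            "name": "analyze_sentiment",
            "description": "Analyze the sentiment in a given text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": sentiment_categories,
                        "description": "The sentiment of the text."
                    },
                    "reason": {
                        "type": "string",
                        "enum": possible_reasons,  # the model must pick one of these
                        "description": "The main reason for the sentiment."
                    }
                },
                "required": ["sentiment", "reason"]
            }
        }
    }
]
```

The rest of the call (passing `tools` and `tool_choice` to `client.chat.completions.create`) stays exactly as shown above.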

diff --git a/docs/llm/prompting.html b/docs/llm/prompting.html index a9d2d5f..7859158 100644 --- a/docs/llm/prompting.html +++ b/docs/llm/prompting.html @@ -285,6 +285,12 @@ Exercise: GPT Parameterization + + @@ -416,9 +422,82 @@

Prompting

-

Resources: - https://platform.openai.com/docs/guides/prompt-engineering -

+

Learning prompting is a science in itself. The difficulty lies in the probabilistic nature of language models: small changes to your prompt (that you might even find insignificant) can have a large impact on the result. In particular, the effects do not have to be “logical”, i.e., they do not follow from your changes in a comprehensible or reproducible way. This can sometimes be frustrating, but it can be avoided in many cases by following the right prompting guidelines. To do so, let's follow the advice of the model's creators.

+
+
+
+ +
+
+Note +
+
+
+

The following is taken from the OpenAI Guide

+
+
+
+

Write clear instructions

+

These models can’t read your mind. If outputs are too long, ask for brief replies. If outputs are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you’d like to see. The less the model has to guess at what you want, the more likely you’ll get it.

+

Tactics:

+
    +
  • Include details in your query to get more relevant answers
  • +
  • Ask the model to adopt a persona
  • +
  • Use delimiters to clearly indicate distinct parts of the input
  • +
  • Specify the steps required to complete a task
  • +
  • Provide examples
  • +
  • Specify the desired length of the output

  • +
+
+
+
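As a small illustration of these tactics (not taken from the guide itself; the persona, delimiter convention, model name and text are placeholders, and `client` is assumed to be configured as in the API chapter), a persona can be set via the system message and delimiters can mark the text to be processed:

```python
# Sketch: "adopt a persona" + "use delimiters" in a single request.
article = "GPT models expose parameters such as temperature and top_p that control sampling."

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a patient tutor. Answer in at most two sentences."},
        {"role": "user", "content": f'Summarize the text delimited by triple quotes.\n\n"""{article}"""'},
    ],
)

print(response.choices[0].message.content)
```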

Provide reference text

+

Language models can confidently invent fake answers, especially when asked about esoteric topics or for citations and URLs. In the same way that a sheet of notes can help a student do better on a test, providing reference text to these models can help in answering with fewer fabrications.

+

Tactics:

+
    +
  • Instruct the model to answer using a reference text
  • +
  • Instruct the model to answer with citations from a reference text

  • +
+
+
+
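A possible sketch of this tactic (again not from the guide; the reference text and question are made up, and `client` is assumed to be configured as in the API chapter) is to pass the reference inside delimiters and instruct the model to answer only from it:

```python
# Sketch: instruct the model to answer only from a provided reference text.
reference = "The seminar takes place on Fridays at 10:00 in room B-101."

messages = [
    {"role": "system", "content": "Answer only using the reference text delimited by triple quotes. If the answer is not contained in it, reply 'I could not find this in the reference.'"},
    {"role": "user", "content": f'"""{reference}"""\n\nQuestion: Where does the seminar take place?'},
]

response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)
```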

Split complex tasks into simpler subtasks

+

Just as it is good practice in software engineering to decompose a complex system into a set of modular components, the same is true of tasks submitted to a language model. Complex tasks tend to have higher error rates than simpler tasks. Furthermore, complex tasks can often be re-defined as a workflow of simpler tasks in which the outputs of earlier tasks are used to construct the inputs to later tasks.

+

Tactics:

+
    +
  • Use intent classification to identify the most relevant instructions for a user query
  • +
  • For dialogue applications that require very long conversations, summarize or filter previous dialogue
  • +
  • Summarize long documents piecewise and construct a full summary recursively

  • +
+
+
+

Give the model time to “think”

+

If asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a “chain of thought” before an answer can help the model reason its way toward correct answers more reliably.

+

Tactics:

+
    +
  • Instruct the model to work out its own solution before rushing to a conclusion
  • +
  • Use inner monologue or a sequence of queries to hide the model’s reasoning process
  • +
  • Ask the model if it missed anything on previous passes

  • +
+
+
+
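As a minimal sketch of the first tactic (not from the guide; the instruction wording and question are illustrative, and `client` is assumed to be configured as in the API chapter), the model is asked to work through the problem before committing to an answer:

```python
# Sketch: ask the model to reason step by step before giving a final answer.
messages = [
    {"role": "system", "content": "First work out your own solution step by step. Only then state the final answer on a separate line starting with 'Answer:'."},
    {"role": "user", "content": "A train travels 240 km in 3 hours. How far does it travel in 5 hours at the same speed?"},
]

response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)
```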

Use external tools

+

Compensate for the weaknesses of the model by feeding it the outputs of other tools. For example, a text retrieval system (sometimes called RAG or retrieval augmented generation) can tell the model about relevant documents. A code execution engine like OpenAI’s Code Interpreter can help the model do math and run code. If a task can be done more reliably or efficiently by a tool rather than by a language model, offload it to get the best of both.

+

Tactics:

+
    +
  • Use embeddings-based search to implement efficient knowledge retrieval
  • +
  • Use code execution to perform more accurate calculations or call external APIs
  • +
  • Give the model access to specific functions

  • +
+
+
+

Test changes systematically

+

Improving performance is easier if you can measure it. In some cases a modification to a prompt will achieve better performance on a few isolated examples but lead to worse overall performance on a more representative set of examples. Therefore, to be sure that a change is net positive to performance, it may be necessary to define a comprehensive test suite (also known as an “eval”).

+

Tactic:

+
    +
  • Evaluate model outputs with reference to gold-standard answers
  • +
+
Back to top