Solution Overview

Discover the art of creating and implementing LLM agents with AWS SageMaker JumpStart Foundation Models

Large language model (LLM) agents are powerful programs that enhance standalone LLMs by providing access to external tools and the ability to plan and execute tasks on their own. In this post, we introduce LLM agents and demonstrate how to build an e-commerce LLM agent using Amazon SageMaker JumpStart and AWS Lambda. The agent can perform tasks like answering return-related questions and providing updates about orders. We also discuss the architecture of LLM agents and the process of tool selection and task planning. Finally, we provide an overview of how to implement a simple LLM agent using AWS services.


The Power of Large Language Models: Building an E-commerce Agent with Amazon SageMaker JumpStart and AWS Lambda

Introduction

Large language model (LLM) agents are programs that extend the capabilities of standalone LLMs by granting them access to external tools and enabling them to plan and execute tasks independently. LLMs often need to interact with other software, databases, or APIs to complete complex tasks. For instance, an administrative chatbot that handles scheduling requires access to employees’ calendars and email. Access to tools makes LLM agents more versatile, at the cost of additional complexity.

In this post, we introduce LLM agents and walk through building and deploying an e-commerce LLM agent using Amazon SageMaker JumpStart and AWS Lambda. The agent uses tools to offer new functionality, such as answering questions about returns and providing updates about orders. These capabilities require the LLM to retrieve data from multiple sources and perform retrieval augmented generation (RAG).

To power the agent, we use a Flan-UL2 model deployed as a SageMaker endpoint and AWS Lambda functions that serve as data retrieval tools. The finished agent can be integrated with Amazon Lex and used as a chatbot on websites or in Amazon Connect. Finally, we discuss key points to consider before deploying LLM agents to production. For a fully managed alternative, AWS also offers Agents for Amazon Bedrock, currently available in preview.

Understanding LLM Agent Architectures

LLM agents are programs that use LLMs to decide when and how to use tools in order to complete complex tasks. With tools and task-planning abilities, LLM agents can interact with external systems and overcome traditional limitations of LLMs, such as knowledge cutoffs, hallucinations, and imprecise calculations. Tools can take various forms, such as API calls, Python functions, or webhook-based plugins. For example, an LLM can use a retrieval plugin to fetch relevant context and perform RAG. There are multiple approaches to combining LLMs with tools, including ReAct, MRKL, Toolformer, HuggingGPT, and Transformer Agents, and the field is advancing rapidly. One simple and common approach is to prompt the LLM with a list of tools and ask it to determine whether a tool is needed to fulfill the user query and, if so, which one. Such a prompt typically looks like the following:


```
Your task is to select a tool to answer a user question. You have access to the following tools.

Search: search for an answer in FAQs
Order: order items
Noop: no tool is needed

{few shot examples}

Question: {input}
Tool:
```
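To make this concrete, the following sketch shows one way to send such a tool-selection prompt to a deployed SageMaker endpoint from Python. The endpoint name is hypothetical, and the request and response keys (`text_inputs`, `generated_texts`) assume the payload format commonly used by SageMaker JumpStart text2text models; adjust them to match your endpoint.

```python
import json
import boto3

# Hypothetical endpoint name; replace with your deployed Flan-UL2 endpoint.
ENDPOINT_NAME = "flan-ul2-agent-endpoint"

smr = boto3.client("sagemaker-runtime")

TOOL_SELECTION_TEMPLATE = """Your task is to select a tool to answer a user question. You have access to the following tools.
Search: search for an answer in FAQs
Order: order items
Noop: no tool is needed

Question: {question}
Tool:"""

def select_tool(question: str) -> str:
    """Ask the LLM which tool (if any) should handle the question."""
    payload = {
        # JumpStart text2text models typically accept "text_inputs";
        # adjust to your endpoint's expected schema if it differs.
        "text_inputs": TOOL_SELECTION_TEMPLATE.format(question=question),
        "max_length": 20,
    }
    response = smr.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())
    return result["generated_texts"][0].strip()

print(select_tool("Where is my order 123456?"))  # expected output: something like "Order"
```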

More complex approaches involve specialized LLMs, such as GorillaLLM, that can directly decode “API calls” or “tool use.” These fine-tuned LLMs are trained on datasets of API specifications, enabling them to recognize and predict API calls from instructions. To output tool invocations, such LLMs typically require metadata about the available tools (descriptions plus a YAML or JSON schema for the input parameters). This approach is used by Agents for Amazon Bedrock and OpenAI function calling. Note that LLMs generally need to be sufficiently large and capable to demonstrate reliable tool selection.
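As an illustration, tool metadata of this kind is usually a name, a description, and a JSON Schema for the input parameters. The following hypothetical Python snippet shows what such a definition for an OrdersAPI tool might look like; the field names follow common function-calling conventions rather than any specific Bedrock or OpenAI format.

```python
# Illustrative (hypothetical) tool metadata: a name, a description, and a
# JSON Schema describing the input parameters the tool expects.
orders_api_tool = {
    "name": "OrdersAPI",
    "description": "Look up the status of an order by its order ID.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The customer's order ID, e.g. '123456'.",
            }
        },
        "required": ["order_id"],
    },
}
```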

Understanding the Workflow of an LLM Agent

Assuming tool planning and selection mechanisms are established, a typical LLM agent program operates according to the following sequence:

1. User request: The program receives a user input, such as “Where is my order 123456?”, from a client application.
2. Plan next action(s) and select tool(s) to use: The program generates the next action by prompting the LLM, for example, “Look up the orders table using OrdersAPI.” The LLM is instructed to suggest a tool name, such as OrdersAPI, from a predefined list of available tools with their descriptions. Alternatively, the LLM might be instructed to generate an API call directly, including its input parameters, such as OrdersAPI(123456). Note that the next action may or may not involve using a tool or API. If not, the LLM responds to the user input without additional context from tools, or simply returns a canned response such as “I cannot answer this question.”
3. Parse tool request: The program parses and validates the tool/action prediction suggested by the LLM. Validation is necessary to ensure that tool names, APIs, and request parameters are not hallucinated and that the tools are invoked correctly according to specifications. This parsing may require a separate LLM call.
4. Invoke tool: Once the validity of the tool name(s) and parameter(s) is ensured, the program proceeds to invoke the tool. Invocation methods can include HTTP requests, function calls, and more.
5. Parse output: The response from the tool may require additional processing. For instance, an API call may yield a lengthy JSON response, with only a subset of fields relevant to the LLM. Extracting information in a clean, standardized format can improve the LLM’s interpretation of the results.
6. Interpret output: Given the output from the tool, the LLM is prompted again to comprehend the results and determine whether it can generate the final answer for the user or if additional actions are necessary.
7. Terminate or continue to step 2: The program either returns a final answer or provides a default response in the case of errors or timeouts. Different agent frameworks may execute the previous program flow differently. For example, ReAct combines tool selection and final answer generation into a single prompt, instead of using separate prompts for tool selection and answer generation. Additionally, this logic can be executed in a single pass or within an “agent loop” governed by a while statement. The agent loop terminates when the final answer is generated, an exception is thrown, or a timeout occurs. Regardless of the specific implementation, agents rely on the LLM to orchestrate planning and tool invocations until the task is completed.
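The following minimal sketch shows one way to organize steps 1 through 7 as an agent loop in Python. The helpers (plan_next_action, invoke_tool, interpret) are hypothetical stubs standing in for LLM prompts and real tool calls; a production orchestrator, such as the Lambda function described later, would replace them.

```python
from dataclasses import dataclass

@dataclass
class Action:
    tool: str       # e.g. "OrdersAPI" or "Noop"
    argument: str   # e.g. an order ID

# --- Hypothetical stubs; a real agent would prompt the LLM and call real tools ---

def plan_next_action(user_input: str, context: str) -> Action:
    # Stub planner: route order questions to OrdersAPI, everything else to Noop.
    return Action("OrdersAPI", "123456") if "order" in user_input.lower() else Action("Noop", "")

def invoke_tool(action: Action) -> str:
    # Stub tool: a real implementation would invoke a Lambda function or API.
    return f"Order {action.argument} was shipped on 2023-08-01."

def interpret(user_input: str, context: str) -> tuple[str, bool]:
    # Stub interpreter: a real implementation would prompt the LLM with the tool
    # output and decide whether the answer is final.
    return (context.strip() or "I cannot answer this question.", True)

KNOWN_TOOLS = {"OrdersAPI", "Noop"}

def run_agent(user_input: str, max_iterations: int = 5) -> str:
    """Agent loop mirroring steps 1-7: plan, validate, invoke, interpret, terminate."""
    context = ""
    for _ in range(max_iterations):
        action = plan_next_action(user_input, context)      # step 2: plan and select tool
        if action.tool not in KNOWN_TOOLS:
            break                                            # step 3: reject hallucinated tools
        if action.tool != "Noop":
            context += "\n" + invoke_tool(action)            # steps 4-5: invoke and parse output
        answer, done = interpret(user_input, context)        # step 6: interpret the result
        if done:
            return answer                                    # step 7: return the final answer
    return "Sorry, I was unable to answer that question."    # default response on errors/timeouts

print(run_agent("Where is my order 123456?"))
```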


Implementing a Simple Agent Loop with AWS Services

In the following sections, we build an e-commerce support LLM agent that can answer questions about return statuses and provide order updates by using tools. The agent acts as a query router: based on the user’s query, it selects the appropriate retrieval tool to query one of several data sources (for example, returns and orders). We accomplish query routing by having the LLM choose among multiple retrieval tools, each responsible for interacting with a data source and fetching relevant context.

In our solution, both retrieval tools are implemented as AWS Lambda functions that accept an ID (orderId or returnId) as input, fetch a JSON object from the data source, and convert the JSON into a human-friendly representation suitable for the LLM. A real-world scenario would use a highly scalable NoSQL database such as Amazon DynamoDB, but for demonstration purposes our solution uses a simple Python dictionary with sample data, as sketched below.

The agent’s functionality can be expanded by adding more retrieval tools and modifying the prompts accordingly. The agent can be tested as a standalone service that integrates with any UI over HTTP, which also makes it easy to integrate with Amazon Lex. Let’s dive into the key components of the solution:
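For reference, the sample data might look like the following hypothetical dictionaries, one per data source; the IDs, field names, and values are illustrative only.

```python
# Hypothetical in-memory stand-in for DynamoDB tables; IDs and fields are sample data.
ORDERS = {
    "123456": {"status": "shipped", "carrier": "UPS", "eta": "2023-08-04"},
    "123457": {"status": "processing", "carrier": None, "eta": None},
}

RETURNS = {
    "987654": {"status": "refund issued", "amount": "34.99", "currency": "USD"},
}
```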

1. LLM Inference Endpoint: The core component of an agent program is the LLM. In this case, we will deploy the Flan-UL2 model using the SageMaker JumpStart foundation model hub. SageMaker JumpStart simplifies the process of deploying LLM inference endpoints to dedicated SageMaker instances.
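A minimal deployment sketch using the SageMaker Python SDK might look like the following. The model ID is an assumption based on JumpStart naming conventions; check the foundation model hub for the exact Flan-UL2 identifier and the recommended instance type.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Model ID is an assumption based on JumpStart naming conventions; verify it
# against the SageMaker JumpStart foundation model hub before deploying.
model = JumpStartModel(model_id="huggingface-text2text-flan-ul2-bf16")

# Deploys the model to a dedicated SageMaker real-time inference endpoint
# (uses the model's default instance type unless one is specified).
predictor = model.deploy()

# Quick smoke test; the payload key follows the common JumpStart text2text schema.
print(predictor.predict({"text_inputs": "Where is my order 123456?"}))
```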

2. Agent Orchestrator: The agent orchestrator manages the interactions between the LLM, tools, and the client application. For our solution, we will utilize an AWS Lambda function to drive this flow and leverage helper functions to streamline the process.

3. Retrieval Tools: The retrieval tools are Lambda functions that interact with data sources and fetch relevant context. In our e-commerce agent, we will implement a return status retrieval tool and an order status retrieval tool. These tools will fetch JSON objects from the data sources and convert them into human-friendly representations suitable for the LLM.
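A hypothetical order-status retrieval tool could look like the following Lambda handler, reusing the ORDERS sample data sketched earlier; the event field name (orderId) and the response shape are assumptions for illustration.

```python
import json

# Hypothetical in-memory data source (see the ORDERS sample data above); a
# production tool would query DynamoDB or another service instead.
ORDERS = {
    "123456": {"status": "shipped", "carrier": "UPS", "eta": "2023-08-04"},
}

def lambda_handler(event, context):
    """Order-status retrieval tool: takes an orderId and returns an LLM-friendly summary."""
    order_id = str(event.get("orderId", ""))
    order = ORDERS.get(order_id)
    if order is None:
        summary = f"No order found with ID {order_id}."
    else:
        # Flatten the JSON record into a short natural-language sentence so the
        # LLM can use it directly as context.
        summary = (
            f"Order {order_id} is {order['status']} via {order['carrier']}, "
            f"with estimated delivery on {order['eta']}."
        )
    return {"statusCode": 200, "body": json.dumps({"context": summary})}
```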


Conclusion

Building LLM agents offers incredible possibilities for extending the capabilities of standalone LLMs. By granting LLMs access to external tools and enabling them to plan and execute tasks independently, we can overcome the limitations traditionally associated with LLMs. In this post, we explored the concept of LLM agents and provided a step-by-step guide to building an e-commerce LLM agent using Amazon SageMaker JumpStart and AWS Lambda. This agent utilized tools to enhance its functionalities, such as answering queries about returns and providing order updates. We also highlighted the importance of considering key factors before deploying LLM agents to production. With the advancements in LLM technology and the availability of managed services like Amazon Bedrock, the possibilities for building powerful and intelligent agents are limitless.

Summary: Discover the art of creating and implementing LLM agents with AWS SageMaker JumpStart Foundation Models

Building and deploying an e-commerce large language model (LLM) agent using Amazon SageMaker JumpStart and AWS Lambda is a powerful way to enhance the capabilities of standalone LLMs. These agents can access external tools and perform self-directed tasks, such as answering questions about returns and providing updates about orders. By using tools and task planning abilities, LLM agents can interact with other software and overcome limitations. This article provides an overview of LLM agent architectures and outlines the steps involved in implementing a simple agent loop using AWS services.




Frequently Asked Questions – Building and Deploying Tool-using LLM Agents using AWS SageMaker JumpStart Foundation Models

Frequently Asked Questions

1. What is AWS SageMaker JumpStart Foundation Models?

Answer: AWS SageMaker JumpStart Foundation Models is a comprehensive set of pre-trained machine learning models provided by Amazon Web Services (AWS) to help developers quickly build and deploy tool-using large language model (LLM) agents.

2. How can I learn to build and deploy tool-using LLM agents using AWS SageMaker JumpStart Foundation Models?

Answer: You can follow the steps below to learn how to build and deploy tool-using LLM agents using AWS SageMaker JumpStart Foundation Models:

  1. Start by setting up an AWS account and accessing AWS SageMaker.
  2. Explore the available JumpStart Foundation Models for LLM agents in SageMaker.
  3. Select the appropriate model based on your project requirements.
  4. Prepare your training data and upload it to SageMaker.
  5. Train your LLM agent using the JumpStart Foundation Model.
  6. Once trained, evaluate the model’s performance and make necessary refinements.
  7. Finally, deploy your LLM agent as a tool, which can be accessed and utilized by end-users.

3. Are there any prerequisites for using AWS SageMaker JumpStart Foundation Models?

Answer: Yes, a basic understanding of machine learning concepts and familiarity with AWS services is recommended. Additionally, some knowledge of programming languages such as Python would be beneficial for building customized LLM agents using the foundation models.

4. Can I customize the pre-trained LLM agents provided by JumpStart Foundation Models?

Answer: Absolutely! AWS SageMaker JumpStart Foundation Models are designed to be customizable. You can fine-tune the pre-trained models according to your specific use case and data requirements. This allows you to build LLM agents that are tailored to your application’s needs.
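For example, fine-tuning a JumpStart foundation model from the SageMaker Python SDK can be sketched roughly as follows. The model ID, hyperparameter names, and S3 path are placeholders, and not every JumpStart model supports fine-tuning, so consult the model card for the exact options.

```python
from sagemaker.jumpstart.estimator import JumpStartEstimator

# Placeholder model ID, hyperparameters, and S3 path; replace with a fine-tunable
# JumpStart model and your own training data location.
estimator = JumpStartEstimator(
    model_id="huggingface-text2text-flan-t5-base",
    hyperparameters={"epochs": "3"},
)

# Launches a SageMaker training job on the specified dataset.
estimator.fit({"training": "s3://your-bucket/path/to/training-data/"})

# Deploys the fine-tuned model as a real-time inference endpoint.
predictor = estimator.deploy()
```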

5. How can I deploy the LLM agent as a tool for end-users?

Answer: After training and fine-tuning your LLM agent using AWS SageMaker JumpStart Foundation Models, you can easily deploy it as an API endpoint in SageMaker. This endpoint can be integrated into your preferred application or accessed directly by end-users to utilize the LLM agent as a tool.

6. Are there any additional resources or support available for using AWS SageMaker JumpStart Foundation Models?

Answer: Yes, AWS provides extensive documentation, tutorials, and sample code to help you get started with SageMaker JumpStart Foundation Models. Additionally, AWS offers support through forums, community groups, and the AWS Support Center to assist users in their journey of building and deploying tool-using LLM agents.