Retrieval Augmented Generation (RAG) for large language models (LLMs) enables you to include real-time information as well as information created after the LLM's training cutoff date. RAG is most commonly implemented by using the user's prompt (query) to retrieve semantically related data from a vector database, then augmenting the prompt with the returned (hopefully relevant) data. This version of RAG is probabilistic in what it retrieves (sometimes the returned data is not relevant) and requires you to first index your existing (typically unstructured) data in a vector database.
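As a rough illustration, the classic vector-database RAG flow looks like the following minimal sketch. The embedding model, the FAISS index, and the toy documents are illustrative assumptions, not the stack used in the session:

```python
# Minimal sketch of vector-database RAG: embed the query, retrieve the
# nearest documents, and prepend them to the prompt.
# The model name and toy corpus below are illustrative assumptions.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "PM2.5 levels in the city exceeded 35 ug/m3 on three days last month.",
    "The air quality monitoring station was recalibrated in January.",
    "Ozone concentrations typically peak in the afternoon during summer.",
]

# Index the (typically unstructured) data in a vector index.
doc_vectors = embedder.encode(documents, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vectors.shape[1])  # inner product = cosine (normalized)
index.add(np.asarray(doc_vectors, dtype=np.float32))

def augment_prompt(query: str, k: int = 2) -> str:
    """Retrieve the k most similar documents and add them to the prompt."""
    query_vector = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query_vector, dtype=np.float32), k)
    context = "\n".join(documents[i] for i in ids[0])
    return f"Context:\n{context}\n\nQuestion: {query}"

print(augment_prompt("How bad was particulate pollution recently?"))
```

The augmented prompt is then sent to the LLM; if the retrieval step returns unrelated documents, the answer degrades, which is the probabilistic weakness noted above.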
This live session will look at extending RAG for LLMs to include the ability to query structured data and make API calls using function calling. We will introduce the function calling paradigm for LLMs and describe how LLMs can be fine-tuned to detect when a function needs to be called and then output JSON containing the arguments for that function call. We will demonstrate an open-source function calling example for air quality prediction that enables users to ask questions such as "What will the air quality be like next week?" or "Will there be any days with bad air quality in the next 10 days?" You will see the source code and a demo, and all materials and software used will be open source (including the LLM).
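To make the paradigm concrete, the sketch below shows the dispatch side of function calling: the fine-tuned LLM is given a tool schema and, when a question needs external data, emits JSON naming the function and its arguments. The schema, the predict_air_quality stub, and the example model output are hypothetical stand-ins, not the session's actual code:

```python
# Sketch of the function calling loop: parse the LLM's JSON output and
# dispatch to the named function. All names below are hypothetical.
import json
from datetime import date, timedelta

TOOL_SCHEMA = {
    "name": "predict_air_quality",
    "description": "Forecast the air quality index for a date range.",
    "parameters": {
        "type": "object",
        "properties": {
            "start_date": {"type": "string", "format": "date"},
            "end_date": {"type": "string", "format": "date"},
        },
        "required": ["start_date", "end_date"],
    },
}

def predict_air_quality(start_date: str, end_date: str) -> list[dict]:
    """Stand-in for a real forecasting model; returns a dummy forecast."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    days = (end - start).days + 1
    return [{"date": str(start + timedelta(d)), "aqi": 42} for d in range(days)]

# Example of what a fine-tuned LLM might emit for the question
# "Will there be any days with bad air quality in the next 10 days?"
llm_output = (
    '{"name": "predict_air_quality", '
    '"arguments": {"start_date": "2024-06-01", "end_date": "2024-06-10"}}'
)

call = json.loads(llm_output)
if call["name"] == "predict_air_quality":
    forecast = predict_air_quality(**call["arguments"])
    # The forecast is then passed back to the LLM to phrase the final answer.
    print(forecast[:2])
```

The key design point is that the LLM never executes anything itself: it only emits structured JSON, and the application validates the call against the schema and runs the function, feeding the result back for the final natural-language answer.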