This is a minimal implementation of the Retrieval-Augmented Generation (RAG) pattern, using Semantic Kernel as a foundation for enterprise data ingestion and long-term memory. RAG is a pattern that combines a pretrained LLM with your own data to generate responses. The demo loads the data into Qdrant (a vector database) and uses Semantic Kernel to orchestrate retrieval from the vector database and generate responses.
Before you get started, make sure you have the following requirements in place:
- Visual Studio Code with extensions:
- .NET 7.0 SDK for building and deploying .NET 7 projects.
- Azure Functions Core Tools 4.x for managing Azure Functions.
- Azure OpenAI API key for using the Azure OpenAI API (or sign up for access if you do not have one).
Then, open a terminal and clone this repo with the following command:

    git clone https://github.com/chinu333/rag-chat.git

- Open a new Visual Studio Code window and click on the Azure extension (or press SHIFT+ALT+A).
- Mouse over WORKSPACE (in the lower left pane) and select Create Function (i.e., +⚡) to create a new local Azure Functions project.
- Select Browse and create a folder called myfunc inside the cloned repo's src directory to house your Azure Function code (e.g., rag-chat/src/myfunc). Then use the selections below when creating the project:

  | Selection | Value |
  | --- | --- |
  | Language | C# |
  | Runtime | .NET 7 Isolated |
  | Template | Http trigger |
  | Function name | MyChatFunction |
  | Namespace | My.MyChatFunction |
  | Access rights | Function |

  Now close and reopen Visual Studio Code, this time opening the rag-chat folder so you can view and interact with the entire repository.
- Open a terminal window, change to the directory with your Azure Function project file (e.g., rag-chat/src/myfunc), and run the dotnet command below to add the Semantic Kernel NuGet package to your project.

      dotnet add package Microsoft.SemanticKernel --prerelease -v 0.14.547.1-preview

  In addition, use the commands below (optional) to configure .NET User Secrets and then securely store your Azure OpenAI API key.

      dotnet add package Microsoft.Extensions.Configuration.UserSecrets
      dotnet user-secrets init --id rag-chat
      dotnet user-secrets set "AZURE_OPENAI_APIKEY" "<your Azure OpenAI API key>"

  Make sure to specify rag-chat as the --id parameter. This will enable you to access your secrets from any of the projects in this repository.
- Back in your Azure Function project in Visual Studio Code, open the Program.cs and MyChatFunction.cs files and replace everything in each file with the corresponding content below.
- The complete code files (with additional comments) follow.
Program.cs

    using Microsoft.Extensions.Configuration;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;
    using Microsoft.Extensions.Logging;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.AI.ChatCompletion;
    using Microsoft.SemanticKernel.Connectors.Memory.Qdrant;
    using Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch;

    var hostBuilder = new HostBuilder()
        .ConfigureFunctionsWorkerDefaults();

    hostBuilder.ConfigureAppConfiguration((context, config) =>
    {
        config.AddUserSecrets<Program>();
    });

    hostBuilder.ConfigureServices(services =>
    {
        services.AddSingleton<IKernel>(sp =>
        {
            // Set the Azure OpenAI API key here, or read it from the configuration
            // (user secrets) instead: configuration["AZURE_OPENAI_APIKEY"]
            IConfiguration configuration = sp.GetRequiredService<IConfiguration>();
            string openAiApiKey = "";

            // Qdrant memory store running locally (see the Qdrant setup steps later on).
            QdrantMemoryStore memoryStore = new QdrantMemoryStore(
                host: "http://localhost",
                port: 6333,
                vectorSize: 1536,
                logger: sp.GetRequiredService<ILogger<QdrantMemoryStore>>());

            // Alternative memory backed by Azure Cognitive Search.
            AzureCognitiveSearchMemory memory = new AzureCognitiveSearchMemory(
                "ENDPOINT",
                "KEY"
            );

            // The embedding service and memory options are commented out here;
            // enable the ones you need before using the knowledge base described later.
            IKernel kernel = new KernelBuilder()
                .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
                .Configure(config => config.AddAzureChatCompletionService(
                    deploymentName: "DEPLOYMENT NAME",
                    endpoint: "ENDPOINT",
                    apiKey: openAiApiKey))
                // .Configure(c => c.AddAzureTextEmbeddingGenerationService(
                //     deploymentName: "text-embedding-ada-002",
                //     endpoint: "ENDPOINT",
                //     apiKey: openAiApiKey))
                // .WithMemoryStorage(memoryStore)
                // .WithMemory(memory)
                .Build();

            return kernel;
        });

        // Provide a chat completion service client to our function.
        services.AddSingleton<IChatCompletion>(sp =>
            sp.GetRequiredService<IKernel>().GetService<IChatCompletion>());

        // Provide a persistent in-memory chat history store with the
        // initial ChatGPT system message.
        const string instructions = "You are a helpful friendly assistant.";
        services.AddSingleton<ChatHistory>(sp =>
            sp.GetRequiredService<IChatCompletion>().CreateNewChat(instructions));
    });

    hostBuilder.Build().Run();
MyChatFunction.cs

    using System.Net;
    using System.Text;
    using Microsoft.Azure.Functions.Worker;
    using Microsoft.Azure.Functions.Worker.Http;
    using Microsoft.Extensions.Logging;
    using Microsoft.SemanticKernel.AI.ChatCompletion;
    using Microsoft.SemanticKernel.Memory;
    using Microsoft.SemanticKernel;

    namespace My.MyChatFunction
    {
        public class MyChatFunction
        {
            private readonly ILogger _logger;
            private readonly IKernel _kernel;
            private readonly IChatCompletion _chat;
            private readonly ChatHistory _chatHistory;

            public MyChatFunction(ILoggerFactory loggerFactory, IKernel kernel, ChatHistory chatHistory, IChatCompletion chat)
            {
                _logger = loggerFactory.CreateLogger<MyChatFunction>();
                _kernel = kernel;
                _chat = chat;
                _chatHistory = chatHistory;
            }

            [Function("MyChatFunction")]
            public async Task<HttpResponseData> Run([HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
            {
                // Without memories, the user's message would be added to the history as-is:
                // _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);

                // Instead, prepend any relevant memories to the user's message first.
                string message = await SearchMemoriesAsync(_kernel, await req.ReadAsStringAsync() ?? string.Empty);
                _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, message);

                // Send the chat history to the AI and receive a reply.
                string reply = await _chat.GenerateMessageAsync(_chatHistory, new ChatRequestSettings());

                // Add the AI's reply to the chat history for next time.
                _chatHistory.AddMessage(ChatHistory.AuthorRoles.Assistant, reply);

                // Send the AI's response back to the caller.
                HttpResponseData response = req.CreateResponse(HttpStatusCode.OK);
                response.WriteString(reply);
                return response;
            }

            private async Task<string> SearchMemoriesAsync(IKernel kernel, string query)
            {
                StringBuilder result = new StringBuilder();
                result.Append("The below is relevant information.\n[START INFO]");

                // Search for memories that are similar to the user's input.
                const string memoryCollectionName = "ms10k";
                IAsyncEnumerable<MemoryQueryResult> queryResults =
                    kernel.Memory.SearchAsync(memoryCollectionName, query, limit: 3, minRelevanceScore: 0.77);

                // For each memory found, try to get the two previous and two next memories.
                // This assumes the memory ids are consecutive integers.
                await foreach (MemoryQueryResult r in queryResults)
                {
                    int id = int.Parse(r.Metadata.Id);
                    MemoryQueryResult? rb2 = await kernel.Memory.GetAsync(memoryCollectionName, (id - 2).ToString());
                    MemoryQueryResult? rb = await kernel.Memory.GetAsync(memoryCollectionName, (id - 1).ToString());
                    MemoryQueryResult? ra = await kernel.Memory.GetAsync(memoryCollectionName, (id + 1).ToString());
                    MemoryQueryResult? ra2 = await kernel.Memory.GetAsync(memoryCollectionName, (id + 2).ToString());

                    // Append the memories in document order; missing neighbors are skipped.
                    if (rb2 != null) result.Append("\n " + rb2.Metadata.Id + ": " + rb2.Metadata.Description + "\n");
                    if (rb != null) result.Append("\n " + rb.Metadata.Description + "\n");
                    if (r != null) result.Append("\n " + r.Metadata.Description + "\n");
                    if (ra != null) result.Append("\n " + ra.Metadata.Description + "\n");
                    if (ra2 != null) result.Append("\n " + ra2.Metadata.Id + ": " + ra2.Metadata.Description + "\n");
                }

                // Close the info block and append the user's original question.
                result.Append("\n[END INFO]");
                result.Append($"\n{query}");

                return result.ToString();
            }
        }
    }
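To make the prompt assembly in SearchMemoriesAsync easier to follow, here is a minimal, dependency-free sketch of the same idea: for each search hit, also pull up to two chunks before and after it, then wrap everything between [START INFO] and [END INFO] ahead of the user's question. The in-memory dictionary and the BuildPrompt helper are hypothetical stand-ins for the Qdrant collection and the kernel.Memory calls, the sample chunks are made up, and the formatting is simplified compared to the function above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the vector store: chunk id -> chunk text.
var chunks = new Dictionary<int, string>
{
    [1] = "Chunk one.",
    [2] = "Chunk two.",
    [3] = "Chunk three.",
    [4] = "Chunk four.",
};

// Builds a grounded prompt from the ids returned by a (hypothetical) similarity search.
string BuildPrompt(IReadOnlyDictionary<int, string> store, IEnumerable<int> hitIds, string query)
{
    var result = new StringBuilder();
    result.Append("The below is relevant information.\n[START INFO]");

    foreach (int id in hitIds)
    {
        // Include up to two neighbours on each side; ids missing from the store are skipped.
        for (int neighbour = id - 2; neighbour <= id + 2; neighbour++)
        {
            if (store.TryGetValue(neighbour, out string? text))
            {
                result.Append("\n " + text + "\n");
            }
        }
    }

    result.Append("\n[END INFO]");
    result.Append($"\n{query}");
    return result.ToString();
}

string prompt = BuildPrompt(chunks, new[] { 2 }, "What does chunk two say?");
Console.WriteLine(prompt);
```

With a single hit on id 2, the prompt contains chunks 1 through 4 (id 0 does not exist and is skipped), followed by the question. Sending the surrounding chunks along with the hit gives the model more context than the matching chunk alone.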
Configure the keys and endpoints in the following places:
- src/myfunc/Program.cs
- src/importmemories/Program.cs
- Run your Azure Function locally by opening a terminal, changing directory to your Azure Function project (e.g., rag-chat/src/myfunc), and starting the function by running

      func start

  Make a note of the URL displayed (e.g., http://localhost:7071/api/MyChatFunction).
- Start the test console application. Open a second terminal, change directory to the chatconsole project folder (e.g., rag-chat/src/chatconsole), and run the application using the Azure Function URL.

      dotnet run http://localhost:7071/api/MyChatFunction
- Type a message and press enter to verify that we are able to chat with the AI!

      Input: Hello, how are you?
      AI: Hello! As an AI language model, I don't have feelings, but I'm functioning properly and ready to assist you. How can I help you today?

- Now let's try to ask about something that is not in the current AI model, such as "What was Microsoft's cloud revenue for 2022?"

      Input: What was Microsoft's cloud revenue for 2022?
      AI: I'm sorry, but I cannot provide information about Microsoft's cloud revenue for 2022 as it is not yet available. Microsoft's fiscal year 2022 ends on June 30, 2022, and the company typically releases its financial results a few weeks after the end of the fiscal year. However, Microsoft's cloud revenue for fiscal year 2021 was $59.5 billion, an increase of 34% from the previous year.

  As you can see, the AI is a bit out of date with its answers.
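For reference, each exchange above boils down to one HTTP POST: the function reads the raw request body as the user's message and writes the reply as the response body. The chatconsole source is not reproduced in this walkthrough, so the following is only a minimal stand-in written under that assumption, not the repo's actual client. The default URL is the example address shown by func start.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Assumed local endpoint printed by `func start`; pass a different URL as the first argument.
string url = args.Length > 0 ? args[0] : "http://localhost:7071/api/MyChatFunction";

using var http = new HttpClient();

while (true)
{
    Console.Write("Input: ");
    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input)) break; // an empty line (or end of input) exits

    // The function reads the raw request body as the user's message.
    using var body = new StringContent(input, Encoding.UTF8, "text/plain");
    using HttpResponseMessage response = await http.PostAsync(url, body);
    response.EnsureSuccessStatusCode();

    // The response body is the AI's reply as plain text.
    Console.WriteLine("AI: " + await response.Content.ReadAsStringAsync());
}
```

When the function runs locally through the Core Tools, no function key should be needed; a deployed function using the Function access level would additionally expect a key with each request.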
Next we'll add a 'knowledge base' to the chat to help answer questions such as those above more accurately.
Semantic Kernel's memory stores are used to integrate data from your knowledge base into AI interactions. Any data can be added to a knowledge base and you have full control of that data and who it is shared with. SK uses embeddings to encode data and store it in a vector database. Using a vector database also allows us to use vector search engines to quickly find the most relevant data for a given query that we then share with the AI. In this chapter, we'll add a memory store to our chat function, import the Microsoft revenue data, and use it to answer the question.
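As a rough illustration of what a vector search does, the sketch below ranks a few hand-written vectors by cosine similarity to a query vector and keeps the best matches above a threshold, mirroring the limit and minRelevanceScore arguments passed to kernel.Memory.SearchAsync earlier. The three-dimensional vectors, the texts, and the Search helper are invented for illustration; real embeddings come from the embedding model (1,536 dimensions in the Qdrant configuration above), and the vector database performs the search itself with its own index rather than this brute-force scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity: dot(a, b) / (|a| * |b|).
double CosineSimilarity(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Brute-force search: score every stored vector, drop weak matches, keep the best `limit`.
List<(string Text, double Score)> Search(
    IEnumerable<(string Text, double[] Vector)> items,
    double[] query, int limit, double minRelevanceScore)
{
    return items
        .Select(item => (item.Text, Score: CosineSimilarity(item.Vector, query)))
        .Where(match => match.Score >= minRelevanceScore)
        .OrderByDescending(match => match.Score)
        .Take(limit)
        .ToList();
}

// Made-up 3-dimensional "embeddings"; real ones have far more dimensions.
var store = new List<(string Text, double[] Vector)>
{
    ("Cloud revenue paragraph", new[] { 0.9, 0.1, 0.0 }),
    ("LinkedIn revenue paragraph", new[] { 0.7, 0.7, 0.0 }),
    ("Unrelated paragraph", new[] { 0.0, 0.0, 1.0 }),
};

var matches = Search(store, new[] { 1.0, 0.0, 0.0 }, limit: 3, minRelevanceScore: 0.77);
foreach (var (text, score) in matches)
{
    Console.WriteLine($"{score:F2}  {text}");
}
```

Only the cloud paragraph clears the 0.77 cutoff here (similarity of about 0.99); the LinkedIn paragraph scores about 0.71 and is dropped, and the unrelated paragraph scores 0.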
Before you get started, make sure you have the following additional requirements in place:
- Docker Desktop for hosting the Qdrant vector search engine.
Note that a different vector store, such as Pinecone or Weaviate, could be leveraged.
In this section we deploy the Qdrant vector database locally and populate it with example data (i.e., Microsoft's 2022 10-K financial report). This will take approximately 15 minutes to import and will use OpenAI’s embedding generation service to create embeddings for the 10-K.
- Start Docker Desktop and wait until it is running.
- Open a terminal and use Docker to pull down the container image for Qdrant.

      docker pull qdrant/qdrant
- Change directory to the root of this repo (e.g., rag-chat) and create a ./data/qdrant directory for Qdrant to use as persistent storage. Then start the Qdrant container on port 6333 using the ./data/qdrant folder as the persistent storage location.

      mkdir ./data/qdrant
      docker run --name mychat -p 6333:6333 -v "$(pwd)/data/qdrant:/qdrant/storage" qdrant/qdrant

  To stop the container, in another terminal window run

      docker container stop mychat; docker container rm mychat;

- Open a second terminal and change directory to the importmemories project folder in this repo (e.g., rag-chat/src/importmemories). Run the importmemories tool with the command below to populate the vector database with your data.

  Make sure the --collection argument matches the memoryCollectionName constant in the SearchMemoriesAsync method above.

  Note: This may take several minutes to several hours depending on the size of your data. This repo contains Microsoft's 2022 10-K financial report data as an example, which should normally take about 15 minutes to import.

      dotnet run -- --memory-type qdrant --memory-url http://localhost:6333 --collection ms10k --text-file ../../data/ms10k.txt

  When importing your own data, try to import all files at the same time using multiple --text-file arguments. This example leverages incremental indexes, which are best constructed when all data is present.

  If you want to reset the memory store, delete and recreate the ./data/qdrant directory created in the previous step, or create a new directory to use.
- With Qdrant running and populated, run your Azure Function locally by opening a terminal, changing directory to your Azure Function project (e.g., rag-chat/src/myfunc), and starting the function by running

      func start

  Make a note of the URL displayed (e.g., http://localhost:7071/api/MyChatFunction).
- Start the test console application. Open a second terminal, change directory to the chatconsole project folder (e.g., rag-chat/src/chatconsole), and run the application using the Azure Function URL.

      dotnet run http://localhost:7071/api/MyChatFunction
- Type a message and press enter to verify that we are able to chat with the AI!

      Input: Hello, how are you?
      AI: Hello! As an AI language model, I don't have feelings, but I'm functioning properly and ready to assist you. How can I help you today?

- Now let's ask the same question as before about Microsoft's 2022 cloud revenue.

      Input: What was Microsoft's cloud revenue for 2022?
      AI: Microsoft's cloud revenue for 2022 was $91.2 billion.

  The AI now has the ability to search through the Microsoft 10-K financial report and find the answer to our question. Let's try another...

      Input: Did linkedin's revenue grow in 2022?
      AI: Yes, LinkedIn's revenue grew in 2022. It increased by $3.5 billion or 34% driven by a strong job market in the Talent Solutions business and advertising demand in the Marketing Solutions business.
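The import note earlier recommends loading all files in one run because the example relies on incremental indexes, and SearchMemoriesAsync parses each memory id as an integer to fetch its neighbors. The importmemories source is not shown in this walkthrough, so the sketch below is only an assumption about what such numbering could look like: paragraphs (split on blank lines, which is also an assumption) receive consecutive integer ids across all input texts. NumberChunks and the sample texts are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Assigns consecutive integer ids to paragraphs across all input texts, in order.
List<(int Id, string Text)> NumberChunks(IEnumerable<string> documents)
{
    var numbered = new List<(int Id, string Text)>();
    int nextId = 0; // the starting id is an assumption; the real tool may differ

    foreach (string document in documents)
    {
        // Assumption: paragraphs are separated by blank lines.
        string[] paragraphs = document.Split(
            new[] { "\r\n\r\n", "\n\n" },
            StringSplitOptions.RemoveEmptyEntries);

        foreach (string paragraph in paragraphs)
        {
            numbered.Add((nextId++, paragraph.Trim()));
        }
    }

    return numbered;
}

var imported = NumberChunks(new[]
{
    "Revenue grew.\n\nCloud was strong.",
    "LinkedIn grew too.",
});

foreach (var (id, text) in imported)
{
    Console.WriteLine($"{id}: {text}");
}
```

Under this assumption, numbering everything in a single pass keeps the ids contiguous, so looking up id - 1 or id + 1 for a hit returns the text that actually surrounds it.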