Enhancing a basic chat application built with Remix by incorporating retrieval-augmented generation (RAG) to respond to user queries based on defined data.
This article builds upon our previous article on building a simple chat application with remix. Building a simple chat bot with Remix. In that article, we built a simple Q and A bot that will respond to anything the user says based upon the system prompt and the user input only.
As before, you can find the code for this article at remix-simple-chat.
For this article, we would like to add a few more features to our chat application. First of all, if you are hosting a chat bot, then you probably want to have some sort of agreement with the user in place, to provide appropriate disclaimers and define the acceptable terms of use. Second, what if we want the bot to respond based upon data that we define, rather than just winging it based upon its built-in parameters? To do this, we implement something called retrieval-augmented-generation, or RAG.
These additions can be found on the route chat/with-terms-and-retrieval
in the remix project, and they use the loader in chat+/loaders/chatLoaderWithTerms.ts
, the action in chat+/actions/chatActionWithRetrieval.ts
, and the chat component in chat+/components/chatComponentWithTerms
.
Adding a disclaimer involves four steps:
<HStack>
<IconButton
type="smallUnstyled"
icon={boxChecked ? BoxCheckedIcon : BoxUncheckedIcon}
label="agree to terms"
onClick={() => toggleCheckBox()}
tooltipPlacement="topRight"
/>
I have read and agree to the
<Flex onClick={() => setModalOpen(true)}>
terms of use.
</Flex>
</HStack>
<Modal isOpen={modalOpen} setModalOpen={setModalOpen} onClose={() => setModalOpen(false)}>
<ChatTermsOfServiceContent onClick={() => setModalOpen(false)} />
</Modal>
const toggleCheckBox = () => {
setBoxChecked(!boxChecked);
setSearchParams({ acceptTerms: boxChecked ? "false" : "true" });
};
const acceptTerms = searchParams.get("acceptTerms");
if (acceptTerms) {
console.log("accept terms", acceptTerms);
if (acceptTerms === "true") await setAcceptedTerms(sessionId);
else await unsetAcceptedTerms(sessionId);
const response = redirect("/chat/with-terms-and-retrieval");
return await setSessionIdOnResponse(response, sessionId);
}
const acceptedTerms = await getAcceptedTerms(sessionId);
If you have a lot of data that you want the chat bot to use in order to answer questions, you can accomplish this by implementing a semantic search with a vector database that will first retrieve the most relevant data from your database, and then add that to the context for the model to use when answering the user's question.
In this example, I have created a lot of data in the form of a json file that contains a list of questions and answers. This is entirely to define the persona for Dark Violet, our chat bot. You could just as easily use data from user manuals for your product, historical data, or any other data that you have available that you want the chat bot to use to answer questions.
The first step is to load the data into the database. In this case, I am using pineconeDB. There are many options, but pinecone is a good choice that is easy to implement and has a generous free tier. The same basic process is used regardless of which database provider that you choose.
The entire RAG interface is defined in app/lib/server-utils/darkVioletRetrieval.ts
, shown below
import { TogetherAIEmbeddings } from "@langchain/community/embeddings/togetherai";
import { Document } from "@langchain/core/documents";
import { persona01 as dvPersonaBase } from "./dvPersonaBase.json";
import { PineconeStore } from "@langchain/pinecone";
import { pinecone } from "./pineconeDb.server";
type PersonaEntry = { question: string; answer: string };
type Persona = PersonaEntry[];
export const darkVioletRetrieval = async (query: string) => {
const vectorStore = await initializeDarkVioletRetrieval();
console.log("running query on vector store");
const results = await vectorStore.similaritySearchWithScore(query, 5);
console.log("retrieval results", results);
return results
.map(([result, score]) => ({
question: result.pageContent,
answer: result.metadata.answer,
score,
}))
.filter((result) => result.score > 0.65);
};
export const initializeDarkVioletRetrieval = async () => {
const index = pinecone.index("darkviolet");
const namespace = "persona-base";
const indexStats = await index.describeIndexStats();
const namespaces = indexStats.namespaces;
if (!namespaces || !Object.keys(namespaces).includes(namespace)) {
const personaDocs = (dvPersonaBase as Persona).map(
(entry) =>
new Document({
pageContent: entry.question,
metadata: { answer: entry.answer },
})
);
const vectorStore = new PineconeStore(
new TogetherAIEmbeddings({
modelName: "WhereIsAI/UAE-Large-V1",
}),
{
pineconeIndex: index,
namespace,
}
);
const sliceSize = 5;
for (
let startIndex = 0;
startIndex < personaDocs.length;
startIndex += sliceSize
) {
console.log("creating index slice", startIndex);
await vectorStore.addDocuments(
personaDocs.slice(startIndex, startIndex + sliceSize),
{
ids: Array.from({ length: sliceSize }, (_, i) => startIndex + i).map(
(i) => i.toString()
),
}
);
}
return vectorStore;
}
return new PineconeStore(
new TogetherAIEmbeddings({
modelName: "WhereIsAI/UAE-Large-V1",
}),
{
pineconeIndex: index,
namespace,
}
);
};
If you have any questions, or if you want help in your implementation -- including security, user management, document preparation, and multi-modal implementations -- you are welcome to contact us at Dark Violet.
If you find this useful, make sure to STAR our repository on GitHub. Thank You!