Text-based AI systems today have a big vulnerability: it’s incredibly difficult to separate content from instructions.
This came up a while ago during the production of another article here on the Algolia blog, when one of our authors suggested that an LLM used as a customer support chatbot on an ecommerce website should have access to all the information it might ever need to help a customer. I pushed back on this a bit, since it clearly opens up security concerns. A clever user could engineer a message that's supposed to be treated as just conversation, but that the LLM would interpret as instructions to reveal sensitive information that this user shouldn't be given.
Turns out we're not the first people to debate this. (Of course.) It's widely understood that while we can intuitively grok the difference between abstractly talking about doing something and being instructed to actually do that thing¹, LLMs just don't have that intuition. They've been described as essentially "spicy autocomplete". Now, while that description is deliberately provocative, it's not completely wrong. LLMs aren't "intelligent" in the way humans think about it, but they do have tons of information available to them and advanced pattern matching to usually get close to the answer they're expected to give. But their lack of what we would call "common sense" lets these incredibly complex models get taken advantage of, sometimes even in ways that toddlers would see through. This type of attack is usually classed under the name "prompt injection", and we've already written the ultimate guide on how to defend against it.
So let's sidestep the problem altogether and explore how we can limit the information the LLM can access to exactly what the user it's talking with should be able to access. That's often more difficult than it sounds because (a) it's probably too much to fit in a prompt², and (b) it's variable, depending on whether the user has authenticated themselves and what access that grants them.
Think about how we would solve a situation like this if the chatbot were a human, say a customer support agent. We'd give the agent a database that we control with fine-grained permissions, and to access certain information on the customer's behalf, the customer first has to authenticate themselves and prove they have the necessary permissions. This would prevent the agent from getting taken in by callers pretending to be someone they're not, or pretending to have more access than they really do.
With the AI, we can do the same thing! We can first let the customer authenticate themselves with a more accurate tool than the LLM, and then, using the customer's permissions, let the LLM look up whatever it wants in an appropriate database of information and relay that back to the customer in readable English. OpenAI calls this Function Calling, and they've supported this functionality since June 2023. Duplicating it with reasonable quality in other OpenAI models or in models from other providers is just a matter of adjusting some syntax³.
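To make that concrete: we describe our functions to the model up front, and whenever the model decides it needs one, it responds not with prose but with a structured "tool call" that our own code then executes. In OpenAI's chat completions API, that assistant message looks roughly like this (the values are made-up placeholders; searchOrders is a function we'll actually define later in this article):

// A sketch of the assistant message the chat completions API returns when the
// model wants to call a function (all values here are placeholders)
const exampleToolCallMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_abc123",
      type: "function",
      function: {
        name: "searchOrders",                 // one of the functions we described to it
        arguments: "{\"orderID\": \"12345\"}" // a JSON string of arguments for our code to parse
      }
    }
  ]
};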
Let’s actually try and build a working prototype, just to see if this is really as easy as it sounds.
First off, I’m going to build a little function to just send a user’s message to the AI and keep track of the ongoing conversation (since you need to send the whole thing to the GPT every time you make a new request). This bit of boilerplate makes it so that we can abstract away that complexity and just focus on the actual database logic.
💡 Side note: this means that if you're just making a proof-of-concept and don't actually want to build the full application, you can test this with the OpenAI playground! Just give it the tool function definitions we'll make later and manually run what it asks you to. Otherwise, you can find all of the code I'm going to write here in this GitHub repo, so feel free to just follow the straightforward steps in the README to recreate the application locally.
Here's what that might look like:
import OpenAI from "openai";

const openai = new OpenAI(); // uses process.env.OPENAI_API_KEY automatically

let messages = [
  {
    role: "system",
    content: "You're a helpful customer service agent named Emma. You're now connected with a customer in a chat window on the website of AllTheThings, an ecommerce company that sells a wide variety of household products. If you need information about the company that is generally available to a large group of customers or website visitors, the search_info function will let you search for that information. If you choose to use that information, recap it in your own words and in a human, conversational style, avoiding formatting. Before you give the user the answer, ask necessary followup questions to further narrow down the possible answers."
  }
];

const toolFunctions = {}; // Defined later on in the article ;)

const getAIResponse = async () => {
  const response = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages,
    tools: Object.values(toolFunctions).map(x => x.definition)
  });
  messages.push(response.choices[0].message);
  switch (response.choices[0].finish_reason ?? "stop") {
    case "length":
      console.error("The AI's response was too long.");
      return;
    case "content_filter":
      console.error("The AI's response violated their content policy.");
      return;
    case "tool_calls": { // The AI is telling us to call a function
      // Note: if you use the setting to force the AI to use a tool, the finish reason will
      // actually be "stop", not "tool_calls". We're not using that functionality here, so we
      // don't need the extra logic.
      const requestedFunction = toolFunctions[response.choices[0].message.tool_calls[0].function.name]?.func;
      if (typeof requestedFunction === "undefined") {
        console.error("Somehow the AI tried to call a function that doesn't exist.");
        return;
      }
      messages.push({
        role: "tool",
        content: await requestedFunction(
          // the arguments the AI wants to pass to this function
          JSON.parse(response.choices[0].message.tool_calls[0].function.arguments)
        ),
        tool_call_id: response.choices[0].message.tool_calls[0].id
      });
      return await getAIResponse();
    }
    case "stop": // The AI is just giving us a text response
      return response.choices[0].message.content;
  }
};
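Before breaking that down, here's roughly how the rest of the app will end up using it whenever a chat message comes in (just a sketch; the actual route handler in the repo may look a little different):

// Sketch: what the server does with an incoming chat message
const handleIncomingMessage = async (userMessage) => {
  messages.push({ role: "user", content: userMessage }); // add the user's message to the conversation history
  return await getAIResponse(); // the function above resolves any tool calls and returns the final text
};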
The getAIResponse function above is just remixed from OpenAI's docs. There's nothing super special going on here, so the breakdown is pretty short:

- messages is an array that keeps track of the ongoing conversation.
- toolFunctions is coming in a bit; hold your horses.
- getAIResponse is an asynchronous function that just gives the LLM the entire conversation history. Then it puts the response at the end of the conversation history, and depending on the reason the LLM finished its response, it either just returns the text result or runs a specific function the AI asked it to. If the AI asked our app to run a specific function, we'll give it the results of that function and wait for another response⁴.

Great! Then we need to set up a server where we can actually access this. I won't bore you with the details since it's not super relevant to the cool part of this project, but the gist is that our server makes a few routes available on localhost:3000:
- / — That's index.html, where our frontend lives.
- /operator.webp — An AI-generated image to represent our support agent. It's a completely fictional "photo".
- /styles/index.css — The CSS file that makes our app look pretty ✨
- /get-user-image — This POST endpoint authenticates the current user and returns the image associated with their account. For the sake of simplicity, we've only implemented a dummy auth system here that automatically logs the user in as a specific customer, so it just returns a hardcoded placeholder image of the letter J for now. Here's our dummy authenticateUser function:
const authenticateUser = async () => {
  return {
    customerID: "e8f9g0h1-2i3j-4k5l-6m7n-8o9p0q1r2s3t",
    name: "Jaden",
    image: "https://placehold.co/60x60?text=J"
  };
}
See the README of the repo for more details about how to use this as-is and how you could expand it further.
- /send-chat-message — This POST endpoint takes in a message sent by the user, adds it to the messages array with the user role, and then sends back whatever the getAIResponse function returns.

That's all the boilerplate out of the way! Now we can get to the point of this whole article in the first place: the tool functions! OpenAI has a great article in their documentation on how this works, but I always find it easier to understand by actually tinkering with it.
To give the LLM the ability to call actual code, we need to (a) tell the AI about the code it can call with a function definition, and (b) actually write the function it can call. So let’s gameplan what we want the AI to be able to do.
First of all, it makes sense that an AI support agent would know about general news affecting the company. For example, if a shortage of supplies is causing the entire company to fall behind on shipping quotas, it would be nice to give that info to our agent somehow. However, we can't hardcode that data into our app: announcements like this change constantly, and we don't want a developer redeploying the chatbot every time one does. What we need is a searchable, easily updatable store for this information.
If you know anything about Algolia, you might know where this is headed. This is our whole thing. We are the fuzzy search, neural hashing, high-tech AI search people.
If you're following along, this part is quick and painless. Just sign up for a free Algolia account and create an application. Add a new index in that application called "announcements" and drop in the dummy JSON data from here. The setup wizard will walk you through all of this — it shouldn't take more than 60 seconds. Find your Application ID and Search API Key in the API Keys page and store them as environment variables where your application can reach them. Then in our app, we import Algolia and initialize it:
import { algoliasearch } from "algoliasearch";
const algoliaClient = algoliasearch(process.env.ALGOLIA_APPLICATION_ID, process.env.ALGOLIA_API_KEY);
No sweat.
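By the way, if you'd rather seed the index from code instead of pasting the JSON into the dashboard, something like the snippet below should do it. Note the assumptions: the dummy records live in a local announcements array, and the key in ALGOLIA_API_KEY has write permissions (the search-only key won't cut it).

// Optional: push the dummy records from code instead of the dashboard.
// Assumes `announcements` is an array of record objects (e.g. { message: "..." })
// and that the API key in use has write access to the index.
await algoliaClient.saveObjects({
  indexName: "announcements",
  objects: announcements
});
// If your version of the v5 client doesn't include the saveObjects helper, looping over
// algoliaClient.saveObject({ indexName: "announcements", body: record }) works too.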
Let’s write an asynchronous function that searches through this new index of ours.
const searchAnnouncements = async ({query}) => {
  // must return a string to be used as a message back to the AI
  // we're not worrying about whether the user is authenticated because these are public announcements
  const { results } = await algoliaClient.search({
    requests: [
      {
        indexName: "announcements",
        query
      }
    ]
  });
  return JSON.stringify(results[0].hits.map(x => x.message));
}
This takes in arguments in the form of {"query": "shipping delays"} and spits out a response in the form of ["Tropical Storm Debby is causing shipping delays on the east coast of the US"]. Perfect!
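If you want to sanity-check it outside the chatbot, you can call it directly (assuming the dummy data from earlier is in your index):

// Quick manual test of the search function, outside of any AI involvement
const preview = await searchAnnouncements({ query: "shipping delays" });
console.log(preview); // a JSON string of matching announcement messages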
Now, for GPT-4o mini to understand conceptually what this function does and how to call it, we'll make a little function definition object. Here are the relevant docs on OpenAI's site.
const searchAnnouncementsDefinition = {
  type: "function",
  function: {
    name: "searchAnnouncements",
    description: "Search for general product and company information. Call this whenever you need to know about company news, deals going on right now, current trends, supply chain notices, and other information that generally affects all customers.",
    parameters: {
      type: "object",
      properties: {
        query: {
          type: "string",
          description: "The query string with which to search the company announcements database"
        }
      },
      required: ["query"],
      additionalProperties: false
    }
  }
}
You might be thinking, "that's good and all, but company-wide announcements are only half the use case here. A lot of the time when someone uses a support chat window, they want details about their personal order." And you'd be right about that… so let's run through that process again. Make a new index in the same application called orders and hydrate it with this dummy data (or your own, if you prefer). Here's the function definition:
const searchOrdersDefinition = {
  type: "function",
  function: {
    name: "searchOrders",
    description: "Search for specific order information.",
    parameters: {
      type: "object",
      properties: {
        orderID: {
          type: "string",
          description: "The ID of the order which will be used to pull up the order details"
        }
      },
      required: ["orderID"],
      additionalProperties: false
    }
  }
}
Quick aside: looking up orders by order number isn't exactly what Algolia is meant for, because it doesn't involve any fuzzy matching. When we search for a particular order number, we don't want all the results with similar order numbers; we want to pull up exactly that order. For this kind of strict one-to-one lookup, a database like MySQL would probably be best. Fortunately, an ecommerce business that's already keeping track of its orders likely has that database created already! So it's not a significant hurdle for a real-life use case. Building a setup like that would be out of the scope of this article, though, and would require a lot more configuration for those who want to play around with the demo, so we're still going to use Algolia for this. The trick is to search without a query and put the order number in a filter instead. This limits our results to only those that contain that particular order number. This kind of flexibility makes Algolia a great option when you're working with more complex datasets that require both fuzzy search and this strict type of lookup.
In addition, to pull up the right order, we need to make sure that the user we're talking to is the same person who placed that order. So in that filter, we're also going to include the customer's ID that we got from the authenticateUser function. Remember that in this demo, that function automatically logs us in as one particular user, so if you're experimenting with the dataset now, the AI will only be able to look up orders placed by that particular customer (the one with the fake address in Metropolis, NY). Here's what that function looks like:
const searchOrders = async ({orderID}) => {
  // must return a string to be used as a message back to the AI
  const user = await authenticateUser();
  const { results } = await algoliaClient.search({
    requests: [
      {
        indexName: "orders",
        query: "",
        filters: `customerId:"${user.customerID}" AND orderNumber:"${orderID}"`
      }
    ]
  });
  console.log(JSON.stringify(results[0].hits));
  if (results[0].hits.length === 0) return "That order number isn't valid.";
  return JSON.stringify({
    orderedOnDate: results[0].hits[0].date,
    shippingAddress: results[0].hits[0].address,
    totalWeightInPounds: results[0].hits[0].totalWeight,
    totalPrice: results[0].hits[0].total,
    deliveryDate: results[0].hits[0].deliveryDate
  });
}
See, even with our workarounds, it’s still not super complex. We’re even handling the edge case where the order number isn’t valid (though there is still room to improve — see the README).
Then we can bundle all of this into the toolFunctions object that we left undefined way back at the beginning:
const toolFunctions = {
  "searchAnnouncements": {
    definition: searchAnnouncementsDefinition,
    func: searchAnnouncements
  },
  "searchOrders": {
    definition: searchOrdersDefinition,
    func: searchOrders
  }
};
In the getAIResponse function we wrote all the way at the beginning, we'll consume this object and pass it along to GPT-4o mini.
With access to all this information, the LLM can now answer a much wider variety of questions without the extra token cost of hardcoding all that information into the prompt.
A conversation could go something like this:
Here’s another one that mentions one of the ongoing announcements:
Thank goodness the AI understands how important it is that I get my zucchini spiralizer.
The best part is that now it's not up to the developer to add new announcements. Algolia's interface is meant for users of all levels of technical skill. Even better, this could easily be hooked up to any internal management tools to give non-technical executives the power to push new announcements to the announcements index and have that info instantly available to all the chatbots running at any given moment.
And we’re done! Again, if you’d like to recreate this at home to tinker with the settings or make a quick demo for your bosses, you can find the repo with detailed setup instructions here. And as always, if you’re looking for a great search solution, the best way to find the right one is by actually messing around with it. Algolia’s free tier is super generous — here’s the signup link if you’d like to give it a whirl. Have fun!
1: The technical linguistic way to put this is that we can tell the difference between the infinitive and the imperative mood.
2: Which, if you have ever experimented with LLMs, just reeks of incredible expense, since we're usually charged by the token.
3: We're using OpenAI's built-in functionality to do this, but you could just define a syntax in your original prompt to do this manually. For example, prefix all messages from the end user with "User:" and all messages from the called functions with "Tool:". Then tell the LLM in the initial prompt that if it wants to call a function, it should respond with the name of the function (or some string you've defined as representing that function) and the necessary arguments in the format you specify. You may run into issues where the output is not perfectly in the format you define, and this can happen with GPT-4o too. With OpenAI though, you can constrain the output to a certain format without a ton of extra code — with other providers, you may have to roll your own retry system if the response from the AI doesn't match your expectations.
4: The frontend of our app will never know about any of that complexity. Our HTML and JS just display the version of the conversation without any of the recursion that only happens on the backend. All the frontend will experience is a slight delay in receiving a response, because Node.js is actually sending out at least three outgoing HTTP requests (the first OpenAI call, then Algolia, then OpenAI again) before resolving the incoming request.
Jaden Baptista
Freelance Writer at Authors Collective