Large language models (LLMs) encode a model of language and, by extension, of the world. Prompt engineering can be used to personalize LLMs by prepending context to the user's LLM query. Feature stores are already widely used to provide context and history to online models deployed in stateless applications: an application can use a user-id or session-id to retrieve precomputed features for the user or session and use them to enrich the features available at request time. We can easily imagine how the answer an LLM gives to a question about how to get to my summer house could be improved if the LLM knew the time of year, the weather, that I like to cycle that distance, and that I have a car.
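As a minimal sketch of this enrichment pattern using the Hopsworks Python API (the feature view name, its schema, and the entity key below are illustrative assumptions, not the talk's actual example):

```python
import hopsworks

# Connect to the Hopsworks project and its feature store.
project = hopsworks.login()
fs = project.get_feature_store()

# Hypothetical feature view holding precomputed, per-user features.
user_fv = fs.get_feature_view(name="user_profile", version=1)

def enrich_query(user_id: int, query: str) -> str:
    """Prepend precomputed user features to the raw LLM query."""
    # Look up the user's feature vector by its entity key at request time.
    # Assumes the view serves exactly these three features, in this order.
    season, owns_car, max_cycling_km = user_fv.get_feature_vector(
        entry={"user_id": user_id}
    )
    context = (
        f"Context: it is {season}; the user "
        f"{'owns' if owns_car else 'does not own'} a car and will happily "
        f"cycle up to {max_cycling_km} km."
    )
    return f"{context}\n\nQuestion: {query}"
```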
In this talk, we show how to personalize LLMs using a feature store and prompt engineering. We walk through how to build an example free, serverless, personalized LLM application using Hopsworks, an open-source feature store with a built-in vector database. We will look at how to build templates for prompts, and how they can be easily constructed and included in user queries. We will look at how to fill in prompt templates with real-time context data, produced by streaming feature pipelines, and user-specific data, produced by batch feature pipelines. We will also look at how to incorporate documents from the vector database in prompts, using a combination of user input and historical user data from the feature store. We will make this example application available as open source in Python.
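As a sketch of what such template filling might look like (the template, placeholder names, and example values below are hypothetical stand-ins for feature-store reads and vector-database results, not the talk's actual application code):

```python
from string import Template

# Hypothetical prompt template; the placeholders are illustrative.
PROMPT_TEMPLATE = Template(
    "You are a travel assistant.\n"
    "Current weather: $weather. Season: $season.\n"
    "User profile: $preferences.\n"
    "Relevant documents:\n$documents\n\n"
    "Question: $question"
)

def build_prompt(question: str, realtime: dict, profile: dict, docs: list) -> str:
    """Fill the template with real-time context (streaming features),
    user-specific data (batch features), and vector-database documents."""
    return PROMPT_TEMPLATE.substitute(
        weather=realtime["weather"],
        season=realtime["season"],
        preferences=profile["preferences"],
        documents="\n".join(f"- {d}" for d in docs),
        question=question,
    )

# Example usage, with made-up values standing in for feature-store lookups
# and a vector-database similarity search.
prompt = build_prompt(
    question="How should I get to my summer house?",
    realtime={"weather": "sunny, 22C", "season": "summer"},
    profile={"preferences": "enjoys cycling up to 40 km; owns a car"},
    docs=["The lakeside bike path is open from June to September."],
)
print(prompt)
```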