# Local LLMs
LlamaIndex.TS supports OpenAI and other remote LLM APIs. You can also run a local LLM on your machine!
## Using a local model via Ollama
The easiest way to run a local LLM is via the great work of our friends at Ollama, who provide a simple-to-use client that will download, install, and run a growing range of models for you.
### Install Ollama
They provide a one-click installer for macOS, Linux, and Windows on their home page.
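On Linux you can also install it from the terminal using the install script Ollama publishes on its site (check their download page for the current instructions):

```bash
curl -fsSL https://ollama.com/install.sh | sh
```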
### Pick and run a model
Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think Mixtral 8x7B is a good balance between power and resources, but Llama 3 is another great option. You can run Mixtral with:
```bash
ollama run mixtral:8x7b
```
The first time you run this command, it will also automatically download and install the model for you.
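Once the download finishes, you can confirm which models are installed with `ollama list`, and try Llama 3 the same way:

```bash
ollama list
ollama run llama3
```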
### Switch the LLM in your code
To tell LlamaIndex to use a local LLM, use the `Settings` object:
```typescript
import { Ollama, Settings } from "llamaindex";

Settings.llm = new Ollama({
  model: "mixtral:8x7b",
});
```
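As a quick sanity check, you can send a single prompt straight to the local model. This is a minimal sketch using the `complete()` method of the LLM interface; the prompt is just an example:

```typescript
import { Ollama, Settings } from "llamaindex";

Settings.llm = new Ollama({ model: "mixtral:8x7b" });

// Send one prompt to the local Ollama server and print the reply text.
const completion = await Settings.llm.complete({
  prompt: "In one sentence, what is retrieval-augmented generation?",
});
console.log(completion.text);
```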
### Use local embeddings
If you're doing retrieval-augmented generation, LlamaIndex.TS will by default also call out to OpenAI to index and embed your data. To be entirely local, you can use a local embedding model like this:
```typescript
import { HuggingFaceEmbedding, Settings } from "llamaindex";

Settings.embedModel = new HuggingFaceEmbedding({
  modelType: "BAAI/bge-small-en-v1.5",
  quantized: false,
});
```
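To verify the embedding model works, you can embed a string directly. This is a minimal sketch using the `getTextEmbedding()` method of the embedding interface; the sample text is our own:

```typescript
import { HuggingFaceEmbedding, Settings } from "llamaindex";

Settings.embedModel = new HuggingFaceEmbedding({
  modelType: "BAAI/bge-small-en-v1.5",
  quantized: false,
});

// Embed one string and check the vector's dimensionality.
const vector = await Settings.embedModel.getTextEmbedding("Hello, world!");
console.log(vector.length); // bge-small-en-v1.5 produces 384-dimensional vectors
```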
The first time this runs, it will download the embedding model before using it.
### Try it out
With a local LLM and local embeddings in place, you can perform RAG as usual and everything will happen on your machine without calling a remote API:
```typescript
import fs from "node:fs/promises";
import { Document, VectorStoreIndex } from "llamaindex";

async function main() {
  // Load essay from abramov.txt in Node
  const path = "node_modules/llamaindex/examples/abramov.txt";
  const essay = await fs.readFile(path, "utf-8");

  // Create Document object with essay
  const document = new Document({ text: essay, id_: path });

  // Split text and create embeddings. Store them in a VectorStoreIndex
  const index = await VectorStoreIndex.fromDocuments([document]);

  // Query the index
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({
    query: "What did the author do in college?",
  });

  // Output response
  console.log(response.toString());
}

main().catch(console.error);
```
You can see the full example file.