AI-generated illustration of a businesswoman working on a computer at her desk in an office, with a llama playfully poking its head over the top of her screen.
llamafile is one of the easiest ways to start using Generative AI on your own computer.

Generative AI can be an incredibly powerful tool with a wide range of applications – taking notes, drafting documents, and providing customer support, just to name a few.

But today’s commercial offerings from Google or OpenAI come with a major privacy risk: to use their services, you must send your data to their AI – meaning that the privacy of your data is only as good as their security. Many services also use your data directly for training future AI systems, so you never know where that information may pop up in the future.

However, there are other options. There is a vibrant open-source community developing generative AI, and it offers a number of great off-the-shelf systems that approach ChatGPT-level performance. One of the easiest to use is llamafile from Mozilla (the company behind the Firefox web browser). Llamafile packs commercial-grade generative AI into a single file that runs seamlessly on every major operating system – no fancy hardware needed. Let’s get started!

Downloading the llamafile

First, head over to the llamafile GitHub page and scroll down to the table titled “Other example llamafiles”.

You’ll see there are a number of excellent open-source models to choose from. I’d recommend starting with LLaVA 1.5, which is a great combination of “small” (a few GB is small in the gen AI world), high-performing, and multimodal – meaning it can take both text and images as input. Click on the appropriate link in the “llamafile” column of the table, and save the file to your computer.
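If you’d rather script the download than click through the browser, here’s a minimal Python sketch of the same step. The URL below is a placeholder – copy the real link from the “llamafile” column of the table.

```python
import shutil
import urllib.request

# Placeholder URL - replace it with the actual link from the "llamafile"
# column of the table on the llamafile GitHub page.
url = "https://example.com/llava-v1.5-7b-q4.llamafile"
destination = "llava-v1.5-7b-q4.llamafile"

# Stream the download straight to disk so the multi-gigabyte file
# never has to fit in memory.
with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
    shutil.copyfileobj(response, out_file)

print(f"Saved {destination}")
```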

One-step system configuration for llamafile

Once the file is finished downloading, the next step depends on your operating system.

If you’re using Windows, open the folder with the downloaded file, right-click on the file, and rename it so that it has “.exe” at the end.

(If you’re using Mac or Linux, please see the official quickstart section for next steps.)

Then, double-click on the newly renamed “.exe” file. Your computer will likely throw a warning – that’s normal when running an unregistered .exe file for the first time. Click “More info”, then the “Run anyway” button. You will likely need some level of Administrator rights on your computer to do this.
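If you’re comfortable with a little scripting, those same two Windows steps – rename, then launch – can also be done with a short Python sketch. It assumes the file landed in your Downloads folder under the name shown; adjust the path and filename to match your download.

```python
import os
import subprocess

# Assumes the llamafile was saved to your Downloads folder with this
# name - adjust the path and filename to match your actual download.
downloads = os.path.expanduser("~/Downloads")
original = os.path.join(downloads, "llava-v1.5-7b-q4.llamafile")
renamed = original + ".exe"

# Windows will only run the file once it has an ".exe" extension.
os.rename(original, renamed)

# Launch it - the security warning described above may still appear.
subprocess.Popen([renamed])
```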

Generative AI on your own computer

A command window will pop up, showing the details of the system running in the background. You can ignore this.

A few seconds later, a tab will also pop up in your browser, showing the web page for your new Generative AI assistant. The title says “llama.cpp” – this is the name of the engine running your model. There is also a lot of other technical stuff on the page that lets you tune your assistant if you choose. Ignore that for now, and scroll to the box at the bottom labeled “Say something…” (circled in red in my screenshot). This is where you’ll enter your prompt.

After entering your prompt, click the “Send” button just below the box, and that will start your chat session. If you want to interrupt the system, click “Stop”. If you want to clear the chat memory and start fresh, just click “Reset”.

I mentioned above that LLaVA is a multimodal model. If you have a picture you want to feed it, use the “upload image” button, then select an image. It will likely take a little while to process (my picture from the screenshot below took a few minutes), but it does a nice job of responding.

To end the session, just close the tab and the command window.

What to do next with your own Generative AI

Congratulations! You now have your own Generative AI assistant running on your own computer! You can use it just like you would ChatGPT, Microsoft Copilot, or any other chat-based gen AI system – but, this time, without sharing any private information with a third party. Llamafile can also be connected to other applications (like VS Code) via APIs – information on that is available in the llamafile documentation.
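To give a sense of what that looks like: while your llamafile is running, it also serves an OpenAI-compatible API on your own machine (port 8080 by default, per the llamafile documentation). Here’s a minimal Python sketch of prompting it programmatically – the prompt text and placeholder model name are just illustrations.

```python
import json
import urllib.request

# Assumes your llamafile is already running and serving its
# OpenAI-compatible endpoint on the default port (8080).
url = "http://localhost:8080/v1/chat/completions"
payload = {
    "model": "LLaMA_CPP",  # placeholder; the local server doesn't require a real model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the benefits of running AI locally."},
    ],
}

request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    reply = json.loads(response.read())

print(reply["choices"][0]["message"]["content"])
```

Because everything stays on localhost, your prompts and responses never leave your machine.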

If you’re interested, you can also explore some of the other models on the llamafile site and see which works best for your needs. The process for using them is the same as above.

As of today, llamafile is by far the easiest way to get started with local generative AI models. Many thanks to Mozilla AI and Justine Tunney for creating llamafile and for their continued work to make generative AI accessible to everyone.


Become your company’s AI expert in under 30 minutes a month by signing up for the Executive Summary newsletter by AI For Business.

If you liked this post, have future posts delivered straight to your inbox by subscribing to the AI For Business newsletter. Thank you!