From RAGs to riches: A practical guide to making your local AI chatbot smarter

  • 📰 TheRegister

Nine out of 10 execs recommend adding Retrieval Augmented Generation to your daily regimen

If you've been following enterprise adoption of AI, you've no doubt heard the term “RAG” tossed around.

At a very high level, RAG uses an embedding model to convert a user's prompt into a numeric format. This so-called embedding is then matched against information stored in a vector database, which can contain all manner of information, such as a business's internal processes, procedures, or support docs. If a match is found, the prompt and the matching information are then passed on to a large language model (LLM), which uses them to generate a response.
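You can see the first half of that pipeline, turning a prompt into an embedding, by poking Ollama's REST API directly. A minimal sketch, assuming Ollama is running locally on its default port and you've already pulled the all-minilm embedding model:

```shell
# Ask Ollama to embed a prompt. The response is a JSON object whose
# "embedding" field is a vector of floats -- the numeric representation
# that gets matched against document vectors in the vector database.
curl http://localhost:11434/api/embeddings \
  -d '{"model": "all-minilm", "prompt": "How do I reset my VPN password?"}'
```

In a full RAG setup, Open WebUI performs this step for you behind the scenes, then runs a similarity search over your stored documents before handing the winners to the LLM.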

Assuming Docker Engine or Desktop is installed on your system (we're using Ubuntu Linux 24.04 for our testing, but Windows and macOS should also work), you can spin up a new Open WebUI container with a single docker run command. Mac and Windows users will need to enable host networking in Docker Desktop before spinning up the Open WebUI container. If you're running Open WebUI on a different machine or server, you'll need to replace localhost with its IP address or hostname, and make sure port 8080 is open on its firewall or otherwise reachable by your browser.
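A typical invocation looks like the following, based on the Open WebUI project's documented docker run command; the volume name and OLLAMA_BASE_URL are conventions you can adjust to suit your setup:

```shell
# Start Open WebUI with host networking so the container can reach a
# local Ollama instance on localhost:11434. Data persists in the
# "open-webui" volume; the UI becomes reachable on port 8080.
sudo docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

Once the container is up, browse to http://localhost:8080 and create an admin account.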

Downloading a model is rather straightforward: just enter the name of the LLM you want and press 'pull'. For the purposes of this tutorial, we're going to use a 4-bit quantized version of Meta's recently announced Llama3 8B model. Depending on the speed of your connection and the model you choose, this could take a few minutes.
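If you'd rather fetch the model from the command line than through the Open WebUI interface, Ollama's CLI can pull it directly. Note the llama3:8b tag ships 4-bit quantized by default:

```shell
# Pull the quantized Llama3 8B model via the Ollama CLI.
ollama pull llama3:8b

# Confirm it downloaded; Open WebUI will pick it up automatically.
ollama list
```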

By default, Open WebUI uses the Sentence-Transformers/all-MiniLM-L6-v2 model to convert your documents into embeddings that Llama3, or whatever LLM you're using, can understand. In "Document Settings" you can change this to use one of Ollama's or OpenAI's embedding models instead. However, for this tutorial we're going to stick with the default.

Now that we've uploaded our documents, we can organize them with tags.

We apply these tags by opening the "Documents" panel under the "Workspace" tab. From there, click the edit button next to the document we'd like to tag, then add the tag in the dialog box before clicking save.

We can now query all documents with that tag by typing "#" followed by the tag at the start of our prompt. For example, since we tagged the Podman doc as "Support", we'd start our prompt with "#Support".

 
