Blazing Fast Local LLM Web Apps With Gradio and Llama.cpp

Published: 26 June 2024
on channel: HuggingFace
2,344 views · 55 likes

In this video, we'll run a state-of-the-art LLM on your laptop and create a webpage you can use to interact with it. All in about 5 minutes. Seriously!


We'll be using Llama.cpp's Python bindings to run the LLM on our machine and Gradio to build the webpage.
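A minimal sketch of the llama-cpp-python side, assuming the Qwen2 0.5B Instruct GGUF repo linked below and a q8_0 quantization (adjust `repo_id`/`filename` to the file you actually want). The `history_to_messages` helper is illustrative glue: it converts Gradio-style `(user, assistant)` turn pairs into the OpenAI-style message list that Llama.cpp's chat completion API expects.

```python
def history_to_messages(message, history, system="You are a helpful assistant."):
    """Convert Gradio-style (user, assistant) pairs plus the new user message
    into the OpenAI-style message list used by create_chat_completion."""
    messages = [{"role": "system", "content": system}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": message})
    return messages


if __name__ == "__main__":
    # Requires: pip install llama-cpp-python huggingface_hub
    from llama_cpp import Llama

    # from_pretrained downloads the GGUF file from the Hugging Face Hub.
    # repo_id/filename are assumptions based on the model link below.
    llm = Llama.from_pretrained(
        repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",
        filename="*q8_0.gguf",  # glob matching one quantization level
        verbose=False,
    )
    out = llm.create_chat_completion(
        messages=history_to_messages("Hello! Who are you?", [])
    )
    print(out["choices"][0]["message"]["content"])
```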

Resources mentioned in the video:

Llama.cpp python: https://github.com/abetlen/llama-cpp-...
Gradio: https://github.com/gradio-app/gradio
Qwen-2 0.5B Instruct Model (GGUF): https://huggingface.co/Qwen/Qwen2-0.5...
Llama.cpp's chat completion API: https://github.com/abetlen/llama-cpp-...
Gradio Chatbot Guide: https://www.gradio.app/guides/creatin...
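Putting the pieces together, here is a sketch of the Gradio side with token streaming, under the same model assumptions as above. `gr.ChatInterface` accepts a generator that yields progressively longer partial replies; `stream_chunks` is an illustrative helper that accumulates the streaming deltas from `create_chat_completion(stream=True)` into that shape.

```python
def stream_chunks(chunks):
    """Accumulate llama.cpp streaming deltas into growing partial replies,
    the shape gr.ChatInterface expects from a generator."""
    partial = ""
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            partial += delta["content"]
            yield partial


if __name__ == "__main__":
    # Requires: pip install llama-cpp-python huggingface_hub gradio
    import gradio as gr
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",  # assumption: model linked above
        filename="*q8_0.gguf",
        verbose=False,
    )

    def respond(message, history):
        # history arrives as (user, assistant) pairs from gr.ChatInterface
        messages = []
        for user_turn, assistant_turn in history:
            messages.append({"role": "user", "content": user_turn})
            messages.append({"role": "assistant", "content": assistant_turn})
        messages.append({"role": "user", "content": message})
        yield from stream_chunks(
            llm.create_chat_completion(messages=messages, stream=True)
        )

    gr.ChatInterface(respond).launch()
```

Streaming is optional (you could return one complete string instead), but yielding partial replies makes even a small local model feel responsive.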

