In this episode, we discuss using AWS Lambda for machine learning inference. We cover the tradeoffs between GPUs and CPUs for ML, tools like ggml and llama.cpp for running models on CPUs, and share examples where we've experimented with Lambda for ML like podcast transcription, medical imaging, and natural language processing. While Lambda ML is still quite experimental, it can be a viable option for certain use cases.
💰 SPONSORS 💰
AWS Bites is brought to you by fourTheorem, an Advanced AWS Partner. If you are moving to AWS or need a partner to help you go faster, check us out at fourtheorem.com!
🔖 Chapters:
00:00 Intro
01:18 Why Lambda and ML? Model training vs inference workloads
02:34 Some reference use cases
04:32 The performance benefits of GPUs over CPUs for machine learning
07:24 Discussing the advantages of CPUs: cheaper, more widely available, easier to scale
11:02 Using Python and frameworks like TensorFlow and PyTorch
14:33 Native ML frameworks like ggml and llama.cpp that are optimized for CPU execution
16:40 Examples of running ML models on Lambda: podcast transcription and medical imaging
19:36 Lambda for large language models and retrieval augmented generation (RAG)
23:04 Summary and closing
In this episode, we mentioned the following resources:
Episode "46. How do you do machine learning on AWS?": https://awsbites.com/46-how-do-you-do...
Episode "108. How to Solve Lambda Python Cold Starts": https://awsbites.com/108-how-to-solve...
ggml (the framework): https://github.com/ggerganov/ggml
ggml (the company): https://ggml.ai
llama.cpp: https://github.com/ggerganov/llama.cpp
whisper.cpp: https://github.com/ggerganov/whisper.cpp
whisper.cpp WebAssembly demo: https://whisper.ggerganov.com/
ONNX Runtime: https://onnxruntime.ai/
An example of using whisper.cpp with the Rust bindings: https://github.com/lmammino/whisper-r...
Project running Whisper.cpp in a Lambda function: https://github.com/eoinsha/whisper_la...
AWS Lambda Image Container Chest X-Ray Example: https://github.com/fourTheorem/lambda...
Episode "103. Building GenAI Features with Bedrock": https://awsbites.com/103-building-gen...
You can listen to AWS Bites wherever you get your podcasts:
Apple Podcasts: https://podcasts.apple.com/us/podcast...
Spotify: https://open.spotify.com/show/3Lh7Pzq...
Google: https://podcasts.google.com/feed/aHR0...
Breaker: https://www.breaker.audio/aws-bites
RSS: https://anchor.fm/s/6a3312a0/podcast/rss
Do you have any AWS questions you would like us to address?
Leave a comment here or connect with us on X, formerly Twitter:
/ eoins
/ loige
#aws #ai #ml #lambda #inference #llama #podcast #genai #serverless
Episode "110. Why should you use Lambda for Machine Learning?", added by AWS Bites on 18 January 2024.