Trying out Mixtral 8x22B MoE Fine-tuned Zephyr 141B-A35B: A Powerful Open-source LLM

Published: 15 April 2024
Channel: Rithesh Sreenivasan
364 views · 9 likes

If you would like to support me financially (it is totally optional and voluntary), you can buy me a coffee here: https://www.buymeacoffee.com/rithesh
Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr 141B-A35B is the latest model in the series, and is a fine-tuned version of mistral-community/Mixtral-8x22B-v0.1 that was trained using a novel alignment algorithm called Odds Ratio Preference Optimization (ORPO) with 7k instances for 1.3 hours on 4 nodes of 8 x H100s. ORPO does not require an SFT step to achieve high performance and is thus much more computationally efficient than methods like DPO and PPO. To train Zephyr-141B-A35B, we used the argilla/distilabel-capybara-dpo-7k-binarized preference dataset, which consists of synthetic, high-quality, multi-turn preferences that have been scored via LLMs.
https://huggingface.co/HuggingFaceH4/...
https://arxiv.org/pdf/2403.07691.pdf
https://huggingface.co/chat/
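
If you want to try the model locally instead of through HuggingChat, here is a minimal inference sketch using the transformers pipeline API. It assumes the checkpoint id HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 (the model link above is truncated, so treat the id as an assumption) and enough GPU memory to hold a 141B-parameter MoE in bf16.

```python
# Minimal sketch: chatting with Zephyr 141B-A35B via the transformers pipeline.
# The checkpoint id is assumed from the (truncated) model link above.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",  # assumed model id
    device_map="auto",           # shard the weights across available GPUs
    torch_dtype=torch.bfloat16,  # bf16 to roughly halve memory vs fp32
)

messages = [
    {"role": "system", "content": "You are Zephyr, a helpful assistant."},
    {"role": "user", "content": "Explain mixture-of-experts models in two sentences."},
]

# The pipeline applies the model's chat template before generating.
outputs = pipe(messages, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(outputs[0]["generated_text"][-1]["content"])  # last message is the assistant reply
```

Note that even with 4-bit quantization or bf16, a 141B-parameter MoE (roughly 35B active parameters per token) still needs several high-memory GPUs, which is why the HuggingChat link above is the easiest way to try the model.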

If you like such content, please subscribe to the channel here:
https://www.youtube.com/c/RitheshSree...

