OpenAI's NEW Embedding Models

Published: 25 January 2024
Channel: James Briggs
Views: 29,030
Likes: 498

OpenAI's new embedding models are text-embedding-3-small and text-embedding-3-large. Both outperform Ada 002 (text-embedding-ada-002), and we can choose between the latency- and storage-optimized text-embedding-3-small and the higher-accuracy text-embedding-3-large.

The key takeaway here is the pretty huge performance gain for multilingual embeddings, measured by the leap from 31.4% to 54.9% on the MIRACL benchmark. For English-language performance, we look at MTEB and see a smaller but still significant increase from 61% to 64.6%.

It's worth noting that the max tokens and knowledge cutoff have not changed. That lack of new knowledge represents a minor drawback for use cases performing retrieval in domains requiring up-to-date knowledge.

The new v3 large model also has a different embedding dimensionality (3,072 by default, versus 1,536 for Ada 002 and v3 small), which means higher storage costs on top of higher embedding costs than what we get with Ada 002.
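To put the storage difference in numbers, here is a rough back-of-the-envelope sketch. It assumes vectors are stored as raw float32 values (4 bytes each) with no index or metadata overhead, which a real vector database will add on top. The function name is hypothetical, and the dimensions are the models' published defaults.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as uncompressed float32


def raw_vector_storage_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw bytes needed to hold the vectors alone (no index overhead)."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


# Default output dimensions for each model.
models = [
    ("text-embedding-ada-002", 1536),
    ("text-embedding-3-small", 1536),
    ("text-embedding-3-large", 3072),
]

for name, dims in models:
    gb = raw_vector_storage_bytes(1_000_000, dims) / 1e9
    print(f"{name}: {gb:.2f} GB per million vectors")
```

Under these assumptions, one million vectors take roughly 6.1 GB at 1,536 dimensions and 12.3 GB at 3,072, so v3 large doubles the raw footprint at its default size.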

Now, there is some nuance to the dimensionality of these models. By default, they return full-size vectors (1,536 dimensions for v3 small, 3,072 for v3 large). However, it turns out that they still perform well even if we cut those vectors down. For v3 small, we can keep just the first 512 dimensions. For v3 large, we can trim the vectors down to a tiny 256 dimensions or a more midsized 1,024 dimensions.
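As a minimal sketch of what "keeping the first N dimensions" looks like in practice: the helper below slices a vector and rescales it back to unit length, so cosine and dot-product similarity still behave as expected. `truncate_embedding` and `embed` are hypothetical helper names, not part of any library. `embed` assumes the `openai` Python package (v1 client) is installed and `OPENAI_API_KEY` is set, and uses the `dimensions` parameter of the embeddings endpoint to request shortened vectors directly.

```python
import math


def truncate_embedding(vector, dimensions):
    """Keep the first `dimensions` values and re-normalize to unit length."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


def embed(texts, model="text-embedding-3-small", dimensions=512):
    """Request shortened embeddings from the API (not called in this sketch)."""
    from openai import OpenAI  # imported lazily so the helper above runs without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    res = client.embeddings.create(model=model, input=texts, dimensions=dimensions)
    return [item.embedding for item in res.data]


# Toy 4-dimensional vector standing in for a real embedding.
toy = [3.0, 4.0, 12.0, 84.0]
print(truncate_embedding(toy, 2))
```

Truncating locally is handy when full-size vectors are already stored; for new data, passing `dimensions` in the request avoids handling the larger vectors at all.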

📕 Article:
https://www.pinecone.io/learn/openai-...

📌 Code:
https://github.com/pinecone-io/exampl...

🌲 Subscribe for Latest Articles and Videos:
https://www.pinecone.io/newsletter-si...

👋🏼 AI Consulting:
https://aurelio.ai

👾 Discord:
  / discord  

Twitter:   / jamescalam  
LinkedIn:   / jamescalam  

00:00 OpenAI Ada 002
01:25 New OpenAI Embedding Models
03:50 OpenAI Embedding Dimension Parameter
05:04 Using OpenAI Embedding 3
10:08 Comparing Ada 002 to Embed 3

#openai #ai #artificialintelligence #nlp

