Llama | ChatGPT as OCR Vision document AI

Published: 01 January 1970
on channel: Amit Shukla

Using Llama 2 | ChatGPT as OCR Vision AI

https://github.com/AmitXShukla/RPA/bl...

Author: Amit Shukla
https://github.com/AmitXShukla
/ ashuklax
/ @amit.shukla

Meta recently released Llama 2, a family of large language models with up to 70B parameters. It is expected to compete with, and in some cases outperform, other tools in both speed and accuracy.

After working through this blog, you will be able to build on the Llama 2 and ChatGPT APIs and apply large language models to practical data analytics tasks.

Why develop yet another OCR Vision AI tool when thousands of options are already available on the market?

Most offerings on the market, including the available wrappers, are built on open-source OCR packages.
In contrast, Vision AI solutions from large companies, which do not rely on open-source technologies, often have low upfront costs but impose significant charges later on, frequently billing customers per image processed.

Why pay for inferior results from a service not trained on your data when you can build a cost-effective, on-premise solution in-house, customized to your needs and without risking data exposure?

Here are some use cases where this OCR Vision API could be improved by training it on "in-house" data.

Document classifier
Digital Invoice scanner
Digital Private signature matching
Scanning for confidential information such as PHI (protected health information) or personal data
Train on "in-house" data for classifying "Secured information" in contracts, spends, etc.
Label or Hash documents
Sorting Receipts, Vendor Invoices or Matching Employee Expenses
Duplicate invoice detection
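
As a small illustration of the "Label or Hash documents" and duplicate-invoice use cases, here is a minimal sketch that fingerprints each file with SHA-256 and groups files whose bytes are identical. The function names `file_digest` and `find_duplicates` are hypothetical and not taken from the author's repository. This only catches exact copies of the same file; a re-scanned or re-photographed invoice would need a comparison of the extracted fields instead.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large scans do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(folder: Path) -> dict[str, list[Path]]:
    """Group files in `folder` by digest; keep groups with more than one file."""
    groups: dict[str, list[Path]] = {}
    for p in sorted(folder.iterdir()):
        if p.is_file():
            groups.setdefault(file_digest(p), []).append(p)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

The digest can also serve as a stable label for a document, for example as a key when storing OCR results.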
---

The OCR Vision AI process below is just one of many possible approaches, and numerous smarter optimizations are possible. Fine-tuning these models on an in-house knowledge base should yield more accurate results.
OCR, Vision AI automated build process

Step 1: A receipt or invoice document is uploaded or dropped into a folder, or your system takes a screenshot of an image or document
Step 2: An automated script is triggered as soon as the file is dropped
Step 3: The script reads text from the image
Step 4: The script builds prompts from the extracted text
Step 5: Send prompts to Llama 2 | ChatGPT
Step 6: Store results back to Application
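
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the prompt wording, and the JSON results file are all hypothetical. The OCR step and the model call are passed in as plain callables so that either Llama 2 or ChatGPT (or any other backend) can be plugged in; `ocr_with_tesseract` shows one possible OCR backend and assumes the third-party `pytesseract` and `Pillow` packages plus the Tesseract binary are installed.

```python
import json
from pathlib import Path
from typing import Callable

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def ocr_with_tesseract(image_path: Path) -> str:
    """Step 3 (one option): read text from an image with Tesseract.

    Assumption: pytesseract, Pillow, and the Tesseract binary are installed.
    Imported lazily so the rest of the sketch runs without them.
    """
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def build_prompt(ocr_text: str) -> str:
    """Step 4: wrap the raw OCR text in an extraction instruction."""
    return (
        "Extract the vendor name, invoice date, and total amount from the "
        "receipt text below. Reply with JSON only.\n\n"
        f"Receipt text:\n{ocr_text.strip()}"
    )


def process_folder(
    inbox: Path,
    results_path: Path,
    ocr: Callable[[Path], str],
    ask_model: Callable[[str], str],
) -> dict[str, str]:
    """Steps 2-6: handle every image in `inbox` that has no stored result yet."""
    results = json.loads(results_path.read_text()) if results_path.exists() else {}
    for image in sorted(inbox.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES or image.name in results:
            continue
        prompt = build_prompt(ocr(image))        # Steps 3 and 4
        results[image.name] = ask_model(prompt)  # Step 5
    results_path.write_text(json.dumps(results, indent=2))  # Step 6
    return results
```

A scheduler or file-watcher would call `process_folder` whenever a new file arrives; `ask_model` is whatever function sends the prompt to your Llama 2 or ChatGPT endpoint and returns the reply text. Because results are keyed by file name, re-running the function skips documents that were already processed.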
