MiniGPT-4
Chat with your images, for real
Hi folks! This is The Prompt! We're the AI newsletter that's like a nice jog in the morning - refreshing, energizing, and a great way to start the day.
Lace up your sneakers and let's get it!
FEATURED
MiniGPT-4 - chat with your images
MiniGPT-4 is out of this world.
It is a chatbot that can answer questions about your images (a capability promised for GPT-4 but not yet released).
Some of the things it can do:
explain what is in a photo
find problems and solve them (e.g., a dead plant and what to do with it)
write poems for photos
give you HTML/JS code for a sketch (I actually tried this and it didn't really work...)
write an advertisement for a given photo
write a recipe and shopping list from a photo of a meal
explain art
The model is open-sourced, and you can find the code here.
How does it work?
It is built on top of BLIP-2 (a model that understands images) and Vicuna, an open-source chatbot model with quality similar to ChatGPT's.
It also uses the LAVIS library, one of the most comprehensive open-source libraries for multimodal language and vision intelligence.
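For the curious, here is a toy sketch of the general idea behind gluing an image model to a chatbot: the image model's feature vectors are passed through a projection layer so they land in the same space as the chatbot's token embeddings, and are then placed in front of the text. Everything in it is an assumption made for illustration (the sizes, the random weights, and the hypothetical names `make_projection`, `project`, and `build_llm_input`); it is not MiniGPT-4's actual code and does not use LAVIS.

```python
# Toy illustration only: sizes, names, and weights are made up, not MiniGPT-4's.
import random

random.seed(0)

VISION_DIM = 4  # hypothetical size of each image feature vector
LLM_DIM = 6     # hypothetical size of each chatbot token embedding


def make_projection(in_dim, out_dim):
    """Build a random in_dim x out_dim matrix (stand-in for a trained layer)."""
    return [[random.uniform(-1, 1) for _ in range(out_dim)] for _ in range(in_dim)]


def project(features, weights):
    """Map each image feature vector into the chatbot's embedding space."""
    out_dim = len(weights[0])
    return [
        [sum(vec[i] * weights[i][j] for i in range(len(vec))) for j in range(out_dim)]
        for vec in features
    ]


def build_llm_input(image_features, text_embeddings, weights):
    """Prepend the projected image vectors to the text embeddings."""
    return project(image_features, weights) + text_embeddings


# Pretend the image model produced 3 feature vectors for a photo,
# and the question was turned into 2 token embeddings.
image_features = [[random.random() for _ in range(VISION_DIM)] for _ in range(3)]
text_embeddings = [[random.random() for _ in range(LLM_DIM)] for _ in range(2)]
weights = make_projection(VISION_DIM, LLM_DIM)

llm_input = build_llm_input(image_features, text_embeddings, weights)
print(len(llm_input), len(llm_input[0]))  # prints: 5 6
```

The chatbot then sees five "tokens": three that came from the image and two from the question. In a real system the projection weights would be learned rather than random.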
Extra info
As I was writing this newsletter, I found that there is another model similar to MiniGPT-4, called LLaVA. It does the same thing but is MUCH faster. Give it a go!
WHAT ELSE IS GOING ON
Elon wants to build a TruthGPT. He thinks that Microsoft directly owns OpenAI right now and that we need another open alternative that is not controlled by anyone. How much will you charge for the truth, Elon... I guess $3?
A Stanford student built his own HealthGPT. He uploaded his Apple Health data and added a chatbot interface. The code is open source.
NVIDIA dropped a new text-to-video model. They also made it possible to personalize these videos using DreamBooth + Stable Diffusion. Still no public demo, but here's a stormtrooper vacuuming the beach.
RESOURCES
The best resources we came across lately that will help you become better at writing prompts & building AI apps.
LLM prompting for programming [Short & useful]
ThinkGPT - Python library for long memory [Code & explanations]
Can AI kill the greenscreen? [Short Vox video]
Beginner's guide to autonomous agents [A must-read]
TOOLBOX
The latest AI tools to use or get inspiration from.
MeetGeek - Email/Meetings summarizer
Human or Not - Social Turing game
ChatGPT2D - ChatGPT in a 2-dimensional map
MyAskAI - Personalized ChatGPT for your website
WebScrapeAI - Scrape the web using AI and no-code
Blok - Tasks, notes, meetings on autopilot
Slait - AI tutor for American Sign Language (ASL)
PROMPT OF THE DAY
TOOL
Midjourney
PROMPT
Elon Musk in a poor street situation eating sitting on the sidewalk next to homeless people --v 5
RESULT
LATEST PAPERS
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, and More
NeAI: A Pre-convoluted Representation for Plug-and-Play Neural Ambient Illumination
Solving Math Word Problems by Combining Language Models With Symbolic Solvers
Generative Disco: Text-to-Video Generation for Music Visualization