Face-to-Speech
generate speech from text & face image
Hi folks! 👋🏻 This is The Prompt, and we have some news: we polished our logo and the email format. Plus, we have a new section, AI prompts, where we’ll feature interesting prompt hacks and results for you.
Hope you like it! Let’s see what’s new 👇🏻
HOT PICK
Generate speech from images and text
Can you imagine the voice of someone when you look at their face?
Well, this model can.
Face-TTS generates speaking styles and voices learned from facial characteristics.
Although the speech results still sound “artificial”, this is the first time that face images are used as a condition to train a Text-to-Speech (TTS) model.
You can see some of the results here.
INTERESTING TECH
Microsoft’s new model can understand images
KOSMOS-1 is the newest multi-modal model by Microsoft that can perceive general modalities, follow instructions, and perform in-context learning.
This means that the model is very good at many different tasks, like understanding language, talking to people, describing pictures, and answering questions about what it sees.
And it can do all of this with less training data and/or fine-tuning.
The team also introduced a dataset based on the Raven IQ test, which diagnoses the nonverbal reasoning capability of the model. Basically, it measures the model's IQ.
WHAT ELSE IS GOING ON
💬 Snapchat to launch its own chatbot (powered by ChatGPT). The big idea is that in addition to talking to our friends and family every day, we’re going to talk to AI every day.
👀 Elon Musk is recruiting AI researchers to create a ChatGPT rival. Musk actually co-founded OpenAI but has cut ties because of concerns about a conflict of interest with Tesla AI.
🏋🏻‍♀️ AI-designed COVID-19 drug starts clinical trials. They leveraged AI to find patterns and to develop a molecule that will become the basis of the drug.
RESOURCES
The best resources we came across lately that will help you become better at writing prompts & building AI apps.
📚 Beating OpenAI CLIP with 100x less data and compute [Insightful article]
👋🏻 OpenAI's Foundry leaked pricing says a lot – if you know how to read it [Insightful article about AI in 2023]
🎥 How to create AI thumbnails for YT videos [18-min tutorial]
TOOLBOX
The latest AI tools to use or get inspiration from.
Jailbreak Chat: A collection of ChatGPT jailbreaks
PyqAI: Build AI features with one simple API call
Mendable.AI: Chat-Powered Search Trained on Your Docs
Figma.AI: Integrate GPT Chat into your Figma workflow
editGPT: Easily edit and track changes to your content in ChatGPT
PROMPT OF THE DAY
TOOL
Midjourney
PROMPT
💬 vertical farm 🍄🏰✨💫🥀☁️ --ar 4:5
RESULT
LATEST PAPERS
ChatGPT: A Meta-Analysis after 2.5 Months
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
Internet Explorer: Targeted Representation Learning on the Open Web
Language Is Not All You Need: Aligning Perception with Language Models
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech