AudioPaLM: AI can Speak and Listen 🔥
PLUS: SDXL 0.9 - most advanced text-to-image AI
FEATURED
AudioPaLM: A Model That Can Speak and Listen 🔥
The AudioPaLM model is designed to understand and create both text and speech.
It combines two earlier models by Google: PaLM-2, a text-based language model, and AudioLM, a speech-based language model.
This model can convert speech to text, text to speech, and even speech from one language to another.
Plus, AudioPaLM doesn't just deal with the words people say but also how they say them. This includes aspects like who's speaking (speaker identity) and how the voice sounds (intonation).
Multi-modality improves results
The creators found that the model got even better when it started from the weights of a text-only language model.
There's a lot more text data out there than speech data, so the knowledge picked up from text helped the model handle speech even better.
And the results are really impressive 👇🏻
AudioPaLM significantly outperforms existing systems at speech translation, and it can translate speech to text for language pairs it never saw in training. It can also carry a speaker's voice over into other languages from a short audio sample.
NEW TECH
Stability AI launched the most advanced open-source text-to-image model: SDXL 0.9 🔥
SDXL 0.9 is the most advanced text-to-image model by Stability AI.
It is a very large model: 3.5 billion parameters for the base model and 6.6 billion parameters when the two models are combined.
The model uses two stages - the first stage creates an initial image, and the second one refines it, adding more precise details.
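For a rough sense of what those sizes mean in practice, here is a minimal back-of-the-envelope sketch. It assumes the 3.5 billion and 6.6 billion figures are parameter counts and that each parameter is stored as a 32-bit or 16-bit float. The function name `weight_memory_gb` is made up for this example, and the numbers cover the weights only, not activations or anything else needed at run time.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts quoted in the newsletter (assumed to be parameters).
MODELS = [("base", 3.5e9), ("base + refiner", 6.6e9)]

for name, params in MODELS:
    for dtype in BYTES_PER_PARAM:
        size = weight_memory_gb(params, dtype)
        print(f"{name:>15} @ {dtype}: {size:.1f} GB")
```

Under these assumptions, halving the precision halves the footprint, which is why large image models are often run in 16-bit floats.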
You can try it here. API is coming soon.
WHAT ELSE IS GOING ON
🦙 Midjourney just pushed a new version (v5.2). With this version, you can use the “zoom-out” feature. The community has created some great stuff, see link 1, link 2.
👀 100K+ ChatGPT accounts compromised and sold on the dark web. As reported by Group-IB, the largest share (40,000+) of the compromised credentials traces back to the Asia-Pacific region, though other regions were hit as well. Make sure you protect your credentials with 2FA.
🏋🏻‍♀️ The Last AI Boom Didn't Kill Jobs. It created jobs. Economists looked at the job market across a number of European countries, and neither high-skilled nor less-skilled workers seemed to be significantly hurt by software or AI.
RESOURCES
The best resources we came across lately that will help you become better at writing prompts & building AI apps.
📚 OpenAI developer forum [ forum to ask questions ]
👋🏻 IMG comparison between Midjourney v5 and v5.2 [Twitter thread]
🎥 Google Search’s guide on AI-generated content [ useful resource ]
TOOLBOX
The latest AI tools to use or get inspiration from.
Embedchain.ai: Framework for LLM-powered chatbots
Something: Your emotional AI friend
SomelAI: Personalised wine recommendations
Upword: Faster research with the help of AI
Avaturn: Selfie to 3D avatar
PROMPT OF THE DAY
TOOL
Midjourney Zoom feature
PROMPT
Young man with short beard, photograph, soft focus background --ar 2:3
Custom Zoom + prompt: On a tropical beach --ar 3:2
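The `--ar` parameter in the prompts above sets the aspect ratio as width:height, so `--ar 2:3` asks for a portrait image and `--ar 3:2` for a landscape one. The sketch below shows one way to read that parameter out of a prompt string and turn it into pixel dimensions. The helper names, the 1:1 fallback, and the 1536-pixel long side are assumptions made for illustration, not Midjourney's actual output sizes.

```python
import re
from typing import Tuple


def parse_aspect_ratio(prompt: str) -> Tuple[int, int]:
    """Return (width, height) units from an `--ar W:H` flag in the prompt."""
    match = re.search(r"--ar\s+(\d+):(\d+)", prompt)
    if match is None:
        return (1, 1)  # assumed fallback when no --ar flag is present
    return int(match.group(1)), int(match.group(2))


def dimensions(ratio: Tuple[int, int], long_side: int = 1536) -> Tuple[int, int]:
    """Scale the ratio so the longer edge equals `long_side` pixels (hypothetical size)."""
    w, h = ratio
    if w >= h:
        return long_side, round(long_side * h / w)
    return round(long_side * w / h), long_side


portrait = parse_aspect_ratio(
    "Young man with short beard, photograph, soft focus background --ar 2:3"
)
landscape = parse_aspect_ratio("On a tropical beach --ar 3:2")

print(portrait, dimensions(portrait))
print(landscape, dimensions(landscape))
```

Switching from 2:3 to 3:2 in the Custom Zoom step is what turns the original portrait into a wide beach scene.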
RESULT
RESULT WITH ZOOM