Text-to-video (API available)

Basically Stable Diffusion for videos

Anita Kirkovska
January 30, 2023

Hi folks! 👋🏻 This is The Prompt, the Indiana Jones of AI:

We uncover valuable insights that others have missed. ⚱️

Today's exciting finds:

Fine-tune your videos
Audio is all you need
ChatGPT enters Congress
GPT as a backend
+ more

Create & Fine-tune videos ⏯

The Tune-A-Video model was released last year, and it can generate videos after being tuned on just one text-video pair.

The tech takes a pretrained text-to-image model and adapts it for video generation.

So, basically Stable Diffusion/DreamBooth for videos.🎥

This weekend they released the official implementation, and you can find an open API on this Replicate link.
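If you want to call the model from code instead of the web page, here is a minimal sketch of what a request to Replicate's HTTP predictions endpoint could look like. It only builds the request and does not send it. The token, the version id, and the "prompt" input name are placeholders I made up for illustration, so check the model page for the real version id and input schema before using it.

```python
import json
import urllib.request

# Replicate's endpoint for starting a prediction.
REPLICATE_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(api_token, model_version, prompt):
    """Build (but do not send) a POST request that starts a prediction."""
    payload = {
        # Placeholder: copy the real version id from the model page.
        "version": model_version,
        # Assumed input name; the model's schema may use other fields.
        "input": {"prompt": prompt},
    }
    return urllib.request.Request(
        REPLICATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_prediction_request(
    "YOUR_API_TOKEN", "MODEL_VERSION_ID", "a panda surfing a wave"
)
```

To actually run it, pass the request to urllib.request.urlopen and read the JSON response, which should describe the prediction that was started.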

Text-to-music/audio papers hit the right note 🎵

In just 3 days, we got 5 papers that outperform previous models like Riffusion (which we covered here).

These new papers do so many new things:

📝 Text → 🎵 (different kinds of music/sounds/humming -- all of it)
📸 Image → 🎵
🎥 Video → 🎵
🖌️ Inpainting → 🎵

All links are included in the "Latest Audio/Music papers" section below👇🏻

📚 Learning lounge

[Short article] Overview of GPT-as-a-Backend

[Resource] WhisperX with speaker diarization and character-level timestamps

[3min video] How to build your own text translator with no-code & AI

🪀 Top Headlines

Microsoft, GitHub, and OpenAI ask court to throw out AI copyright lawsuit

Chinese Search Giant Baidu to Launch ChatGPT-Style Bot

OpenAI has hired an army of contractors to make basic coding obsolete

🛼 Toolbox

Steamship: Build and deploy Prompt APIs in seconds 🤯

ScribePod: 1.5 hours of dialogue about ML papers

Text2SQL: Generate SQL with AI

MakeLog: Automate your change-log with GPT-3

🤓 Latest Audio/Music papers

1. Make-An-Audio: You can create audio from text, images, or videos. Plus you can do audio inpainting.

2. AudioLDM: You can generate sound, speech, and music with text descriptions. This model can also generate other everyday sounds from the text description (basically any sound you want). Great news - they will open-source this one!

3. Noise2Music: generate high-quality 30-second music clips from text prompts.

4. MusicLM: a model by Google that can generate high-fidelity music from text descriptions, with pieces that span several minutes. This model can also transform whistled and hummed melodies according to the style described in a text caption. Sadly, Google doesn't plan to release this model, citing ethical concerns.

5. Moûsai: generate multiple minutes of high-quality stereo music at 48kHz from textual descriptions. Code on this GitHub link.

📸 AI Photo of the day

Pope getting some fresh ink

source: https://www.reddit.com/r/midjourney/comments/10orw2t/pope_getting_some_fresh_ink/

❤️ If you like The Prompt, and want to support my work:

Share The Prompt with a friend, and invite them to subscribe here.
Book an ad in The Prompt (reply to this email if you’re interested)

Thank you for reading! ✌🏼