Google Challenges OpenAI with Veo 2

December 17, 2024

Welcome, AI enthusiasts.

While Sora was supposed to be the highlight of the holiday season, Google may have just left some coal in OpenAI's stocking. The tech giant's new Veo 2 video model appears to leave Sora in the dust, with 4K capabilities, realism, and physics that will make you question reality. Let's get into it…

In today's AI news:

  • Google launches next-gen video, image models

  • ChatGPT Search goes free for everyone

  • Create a customer support AI voice agent for your website

  • AI agents make 10+ minute videos from text

  • More AI & tech news

Read time: 4 minutes

LATEST DEVELOPMENTS

🚀 Google launches next-gen video, image models

TLDR: Google just announced the release of Veo 2, a state-of-the-art video generation model that creates high-resolution outputs with stunning realism and detail — along with Imagen 3, an upgraded image model also offering state-of-the-art quality.

Veo 2:

  • Veo 2 can generate 8-second clips at up to 4K resolution (limited to 720p at launch), and it has received significant upgrades to its cinematic controls.

  • The model also shows massive improvements in physics simulation and reduced hallucinations, leading to more realistic movement and detail.

  • Veo 2 outperformed all competitors, including OpenAI's recently released Sora, in head-to-head human evaluations and prompt adherence.

  • The model is rolling out gradually through the VideoFX waitlist, with YouTube Shorts integration planned for 2025.

Imagen 3:

  • The upgraded model delivers enhanced color vibrancy and composition across artistic styles, with better handling of fine details, textures, and text rendering.

  • New capabilities include more accurate prompt interpretation and better rendering of complex scenes that match user intentions.

  • Imagen 3 outperformed all models, including Midjourney, Flux, and Ideogram, in human evaluations for preference, visual quality, and prompt adherence.

  • The model is now available through Google Labs' ImageFX and is rolling out to over 100 countries.

Why it matters: Google is having an absolutely massive end to 2024, first with Gemini 2.0 and now with Veo 2 and Imagen 3. These models appear to raise the bar in both categories, giving Google state-of-the-art performance across nearly every area of AI. OpenAI may have the hype this holiday season, but Google is showing the results.

🔎 ChatGPT Search goes free for everyone

TLDR: OpenAI just announced a major expansion of its ChatGPT Search feature on Day 8 of the company's livestream event, making it freely available to all users alongside added voice search capabilities and improved mobile features.

The details:

  • The previously premium search feature now extends to all logged-in users, delivers faster responses, and is accessible through a globe icon on the platform.

  • Search has also been added to Advanced Voice Mode for premium users, allowing them to conduct searches through natural spoken prompts.

  • The Search mobile experience has been revamped, with enhanced visual layouts for local businesses and native integration with Google and Apple Maps.

  • Users can also set ChatGPT Search as a default search engine, with results displaying relevant links before ChatGPT text responses for faster access.

  • OpenAI also teased a 'mini Dev Day' for tomorrow.

Why it matters: ChatGPT's ability to access the web and up-to-date information is an important step towards an agentic future, particularly within Advanced Voice Mode — turning the tool into a much more intelligent and capable version of Siri (and maybe powering it eventually). Search is about to change in a big way in the AI era.

🎙️ Create a customer support AI voice agent for your website

TLDR: ElevenLabs' new Conversational AI Agents let you incorporate an AI-powered voice agent that can interact naturally with your visitors.

Step-by-step:

  1. Create an ElevenLabs account and navigate to the Agents section.

  2. Configure your AI agent's personality and initial message.

  3. Choose or create a custom voice for natural interactions.

  4. Customize the widget's appearance and embed it on your site.

Pro tip: Use the "Test AI agent" button to test your agent thoroughly before deploying. This helps ensure responses align with your expectations and brand voice.

🎬 AI agents make 10+ minute videos from text

TLDR: AI startup Higgsfield just introduced ReelMagic, a multi-agent platform that transforms story concepts into complete 10-minute videos, claiming to streamline the entire production process into a single workflow.

The details:

  • The tool uses specialized AI agents for production roles like scriptwriting and editing, creating cohesive long-form outputs in under 10 minutes.

  • ReelMagic starts with a short synopsis, and then AI agents handle script refinement, virtual actor casting, filming, sound/music, and editing.

  • ReelMagic's smart reasoning engine automatically selects optimal AI models for each shot, and it has partnerships with Kling, Minimax, ElevenLabs, and more.

  • The platform is already being tested by leading Hollywood studios, and Higgsfield is also planning to launch Hera, an AI video streaming platform.

  • Access is available to Project Odyssey participants via a waitlist, with no info on a broader release.

Why it matters: There has been a disconnect between AI video generators and the ability to craft cohesive, longer-form content, with heavy manual editing needed to bridge the gap. While not available publicly yet, ReelMagic looks to be a workflow that brings AI's creative tools together in one place to unlock broader storytelling capabilities.

NEW TOOLS & JOBS

Trending AI Tools

  • 🗂️ ChatGPT Projects - Group files, chats, and custom instructions in one place for better organization and streamlined interactions

  • 🎥 Pika 2.0 - New video generation model with 'ingredients' that incorporate users' own images into outputs, plus improved motion and animation

  • 💬 Eden - AI-powered social plugin that generates tailored replies on any webpage in one click

  • ✍️ Draft Alpha - AI writing assistant to produce quality content across distribution channels with a consistent brand voice

QUICK HITS

  • Meta rolled out an update for its Ray-Ban smart glasses, introducing live AI assistance, real-time language translation, and Shazam integration for hands-free music recognition.

  • YouTube launched new controls allowing content creators to explicitly authorize specific AI companies to train models on their videos, with an initial list of 18 major tech companies, including OpenAI, Microsoft, and Meta.

  • Google Labs debuted a new experiment called Whisk, a creative AI tool that combines Imagen 3 and Gemini to help users remix and transform visuals through image-to-image capabilities.

  • Former Google CEO Eric Schmidt warned about AI's increasing capabilities in an interview with ABC, saying 'pulling the plug' may be necessary when self-improving systems arrive.

  • SoftBank's Masayoshi Son pledged a $100B investment in U.S. AI in a meeting with incoming president Donald Trump, aiming to create 100,000 jobs over the next four years.

  • Defense giant Lockheed Martin established a new subsidiary called Astris AI, hoping to accelerate AI adoption across the defense industry and commercial applications.

That's it for today! Before you go, we'd love to know what you thought of today's newsletter to help us improve The AI Navio experience for you.