Google I/O 2025 Notes
Update (2025.5.21)
Future updates to this post have moved to the Flarum post "Google I/O 2025 Blog Update Summary".
Woke up this morning to find Google had released a ton of new stuff. I can't help but marvel at Google's power; the future definitely belongs to Gemini (It's Gemini Season1). That said, lately my ChatGPT usage for daily questions far exceeds that of any other AI.
The opening animation of the Google I/O 25 Keynote YouTube video was entirely generated with Veo. via: https://youtu.be/o8NiE3XMPrM

Many AI enthusiasts on X have posted that Google's updates are overwhelming and hard to keep up with. There is so much information that, if I don't organize it now, I'll miss major updates.

via: https://x.com/testingcatalog/status/1924899785355510239
This article aims to relay Google's announcements first-hand while adding as many of my own thoughts as possible.
First, some good Google I/O 2025 summary links:
NotebookLM’s Google I/O 2025 summary: https://notebooklm.google.com/notebook/953b658a-579b-4b3c-b280-43b3781babf3
Tibor Blaho’s Google I/O 2025 summary tweet: https://x.com/btibor91/status/1924938391478468754
Chubby’s Google I/O 2025 summary tweet: https://x.com/kimmonismus/status/1924895087332405554
X’s hot comments about 2025 Google I/O: https://x.com/i/events/1922336973353598976
Gemini App
This section draws mostly from the full official update blog post2 and the official Gemini App Updates page.
Many of the Ultra-exclusive features and models introduced here are not available immediately and will only reach Ultra subscribers later, so the Ultra subscription currently offers very little value.
- Emphasis: English experience > Chinese experience
First, I must emphasize: the Gemini App's Chinese-language experience is not as good as ChatGPT's or the domestic Doubao's.
English questions get well-reasoned replies with reference citations.

Chinese questions get replies without reference citations, and they sometimes feel inadequate. I've recently seen many L Site posts claiming Gemini has been dumbed down, but that is a misunderstanding. Google, with its deep pockets, wants to win over all of ChatGPT's users and won't secretly degrade the model. The so-called "dumbing down" actually comes from insufficient support for Chinese-language users in the Gemini App.

For a better Gemini App experience, when a Chinese reply isn't satisfying, I recommend rephrasing the question in English. If your English isn't strong, pair it with the "Immersive Translate" browser extension: press space three times to convert native-language input into English, and set a shortcut to translate the model's replies back into your native language.
- New pricing plans
Gemini Advanced/Google One AI Premium has been renamed to Google AI Pro.

Besides some rate-limit and context-length differences, Pro users get the following over free users: Flow [an AI video creation tool, covered later] and Gemini in Chrome [similar to Copilot in Edge, covered later].
Ultra users get the following over Pro users: the Gemini 2.5 Pro Deep Think model, the Veo 3 model, Project Mariner [a browser Agent], a YouTube Premium individual plan, and larger storage.
The Ultra plan costs $125/month for the first three months, then $250/month. It was tempting at first, but after calming down I made no impulse purchase. After all, a reasoning model that thinks more often just scores better on benchmarks; for daily use, classic LLMs are enough. Reasoning models are better suited to logic, math, and code.
I first heard about Project Mariner at the end of last year. Hearing the name again today, it still feels unfamiliar; the browser Agent prototype still feels like a toy.

As for the YouTube membership, few people would buy a US-region YouTube membership; most go through low-price regions and join Latin American family plans.
The 30 TB of storage is useless for regular people. All my files and photos on Google, accumulated over two to three years, haven't exceeded 200 GB.
The Ultra subscription suits wealthy folks, hardcore Gemini enthusiasts, or die-hard Google fans. For students like me, it's just for watching; freeloading my way up to the Pro plan is enough.
via:
https://blog.google/products/google-one/google-ai-ultra/
- Agent Mode
This feature looks nice, but it's Ultra-only. See Google CEO Sundar Pichai's tweet: https://x.com/sundarpichai/status/1924909900033122466

- Gemini 2.5 Pro Deep Think, Gemini 2.5 Flash
Gemini 2.5 Pro Deep Think is Ultra-only, and its benchmark numbers look great. The 2.5 Pro Deep Think API is open only to a small group of trusted testers.

Gemini 2.5 Flash's cost-effectiveness has improved further.

- Thought Summaries

Thought summaries aim to further improve model output interpretability.
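On the API side, here is a hedged sketch of how thought summaries can be surfaced with the google-genai Python SDK. The `include_thoughts` flag and the per-part `thought` marker reflect my understanding of the SDK's thinking config and may not match the official docs exactly:

```python
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Assumed usage: ask the model to include a summarized version of its thoughts.
response = client.models.generate_content(
    model="gemini-2.5-flash-preview-05-20",
    contents="A farmer has 17 sheep; all but 9 run away. How many are left?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(include_thoughts=True)
    ),
)

# Thought-summary parts are flagged separately from the final answer.
for part in response.candidates[0].content.parts:
    label = "THOUGHT SUMMARY" if getattr(part, "thought", False) else "ANSWER"
    print(f"[{label}] {part.text}")
```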

- Veo3, Imagen4
Veo 3's headline feature is text-to-video with sound, which is interesting; it should be the world's first text-to-video model with sound. Only US Ultra subscribers can try it.
Imagen 4 is open to all Gemini users. Below is a raccoon holding a sign, generated with Imagen 4. For text-to-image, GPT-4o's native image generation and Doubao's recent Seedream 3.0 both seem impressive.

Grok leads by far in low censorship, and that is unique. A few days ago I had Doubao generate borderline images; Doubao refused outright when the sensitive prompt was in Chinese. I then had Grok translate the prompt into Russian [ChatGPT refused to translate the borderline prompt], which broke Doubao's guardrails. Sometimes I can't help feeling Grok shouldn't be ignored; low censorship is its biggest selling point. Grok 3.5 is probably coming soon, and I hope it surprises everyone!


https://x.com/GeminiApp/status/1924893675529900467
https://x.com/GeminiApp/status/1924973242768642433
- Deep Research creating apps/Canvas updates
After a Gemini Deep Research run, you can now create apps from the results. Besides the preset options (Web page, Infographic, Quiz, and Audio Overview, i.e. podcast), you can also create custom apps. I don't recommend custom app creation; the custom apps have bugs.
Facing walls of text makes people impatient, and this feature directly addresses the poor visualization of Deep Research products.


via: https://x.com/GeminiApp/status/1924892563695456675
- Image and file-based Deep Research
One image-based Deep Research use case I thought of: researching JAV codes [in practice this is asking too much; the Gemini App's censorship is strict, and any slightly explicit image gets removed. But if you ask it to identify the actors in an image (provided the image's NSFW content is minimal), it can still run Deep Research]. The related explicit images have been removed. Another use case: guessing locations from photos.
Below, Gemini guesses the location from the same image I had given ChatGPT o3, and the results fall short; ChatGPT o3 is better at guessing locations from images. I had also asked Doubao about this image before: its deep thinking felt jumpy, but it got the location right on the first try. For guessing locations in domestic photos, domestic AIs are better, since the giants have tons of Chinese data.
ByteDance's Doubao is already the top domestic AI in my mind. Doubao doesn't offer a one-click conversation deletion option, so all that user data is destined to build Doubao's throne. Almost all AI providers are the same: on the surface they don't violate privacy and don't train on user data, while secretly decoupling data from accounts and quietly using it to improve their products. Such practices are all too common.

Below, Gemini Deep Research is used to study DeepSeek's latest paper; file-based Deep Research should have many practical use cases.

via: https://x.com/GeminiApp/status/1924892309583458706
- Other
- Gemini Live is now free on Android and iOS.
Many people are already Google AI Pro users. Friends who haven't claimed the deal can consider getting a US edu email [no longer limited to the US; Brazil, Indonesia, Japan, and UK edu emails also qualify] and claim this freebie before June 30 this year (https://gemini.google/students/).
via:
https://x.com/GeminiApp/status/1924891940598014359
https://x.com/NewsFromGoogle/status/1924887278008307728
Gemini API
This section draws mostly from the official blog post3.
- gemini-2.5-flash-preview-05-20
An improved Gemini 2.5 Flash model with better performance.
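To try the new checkpoint, you only need to point an existing call at the new model ID. A minimal sketch using the google-genai Python SDK (the model ID comes from the announcement; the API key handling is my assumption for illustration):

```python
# pip install google-genai
import os
from google import genai

# Assumes the API key is stored in the GEMINI_API_KEY environment variable.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Swap the model ID to the new 05-20 preview; it should otherwise behave
# like earlier Gemini 2.5 Flash previews.
response = client.models.generate_content(
    model="gemini-2.5-flash-preview-05-20",
    contents="Summarize the Gemini announcements from Google I/O 2025 in three bullets.",
)
print(response.text)
```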
- Gemini 2.5 Flash and Gemini 2.5 Pro will be GA (Generally Available) in early June 2025
Google's usual model release rhythm: Experimental → Preview → GA.
https://x.com/OfficialLoganK/status/1924878626312618323
- Gemma 3n E4B

A mobile-first open-source model; good news for open source, though regular users have better closed-source options.
via:
https://developers.googleblog.com/en/introducing-gemma-3n/
https://huggingface.co/collections/google/gemma-3n-preview-682ca41097a31e5ac804d57b
https://ai.google.dev/gemma/docs/gemma-3n
https://news.ycombinator.com/item?id=44044199
- Native audio output and Live API improvements
Related blog section: https://blog.google/technology/google-deepmind/google-gemini-updates-io-2025/#new-capabilities
The Stream section adds the gemini-2.5-flash-preview-native-audio-dialog and gemini-2.5-flash-exp-native-audio-thinking-dialog models. Both are Gemini 2.5 Flash Native Audio models; early features include proactive audio and more.

Conversational assistant apps may want to integrate these; Chinese works well too.

The Generate Media section → Gemini Speech Generation adds two text-to-speech models, gemini-2.5-flash-preview-tts and gemini-2.5-pro-preview-tts, which can generate speech from a script.
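Below is a minimal sketch of script-to-speech with the Flash TTS preview, again via the google-genai Python SDK. The config field names and the "Kore" voice follow my understanding of the SDK's speech types and may differ slightly from the official docs; the 24 kHz, 16-bit mono output format is likewise an assumption:

```python
import os
import wave
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Ask the TTS preview model to read a short script aloud.
response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",
    contents="Read this warmly: Welcome back to the show, everyone!",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)

# The audio comes back as raw PCM bytes; wrap it in a WAV container to play it.
pcm = response.candidates[0].content.parts[0].inline_data.data
with wave.open("speech.wav", "wb") as f:
    f.setnchannels(1)      # mono (assumed)
    f.setsampwidth(2)      # 16-bit samples (assumed)
    f.setframerate(24000)  # 24 kHz (assumed)
    f.writeframes(pcm)
```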

AI Mode
English only. The Kicking Off / Looking At flow for answering user questions is interesting; the previous AI Mode probably didn't include these two stages.
For searches about political figures, AI Mode only provides links and does not generate an answer.

The new AI Mode features shown at I/O, including Deep Search, will roll out to AI Mode Labs users over the coming weeks and months.
https://blog.google/products/search/google-search-ai-mode-update/
Gemini Diffusion
Google's text diffusion model. Users with access say it is incredibly fast. The first commercially deployed diffusion LLM was Mercury.
Related Waitlist: https://docs.google.com/forms/u/0/d/e/1FAIpQLSdsxa-YU25JIPJGmu-pySJEYeTy6lwbdZAzxlZ11x3GPj6DhA/formResponse
via:
https://blog.google/technology/google-deepmind/gemini-diffusion/
https://x.com/GoogleDeepMind/status/1924888095448825893
https://news.ycombinator.com/item?id=44044080
https://deepmind.google/models/gemini-diffusion/
Flow
An AI video creation tool. See ChatGPT o3’s introduction.

I played around with it and generated a US kitty captain in space.

via:
https://labs.google/fx/tools/flow/
SynthID Detector
A project for quickly and efficiently identifying whether AI-generated content carries a SynthID watermark. See ChatGPT o3's introduction.

Content generated in the Gemini App has carried watermarks since last year (via: https://deepmind.google/discover/blog/watermarking-ai-generated-text-and-video-with-synthid/). So friends who use Gemini for writing assistance should remember to have another LLM rewrite the output. Domestic AIGC detectors haven't added SynthID yet, but future adoption can't be ruled out; if it is added, you can self-check with SynthID Detector first.
Related Waitlist: https://docs.google.com/forms/d/e/1FAIpQLSfAYrauHmY-PpUNxL4Fs6coa185CtKWp7TnEXL0tKbAezo4MQ/viewform
via:
https://blog.google/technology/ai/google-synthid-ai-content-detector/
Stitch

An AI prototyping tool that covers the design-to-development flow in one stop. I'm not a designer; designer friends may want to pay closer attention.
It has a Standard Mode [powered by Gemini 2.5 Flash] and an Experimental Mode [powered by Gemini 2.5 Pro].
Below: a Stitch-generated page praising Trump.

Remember to disable conversation-data training in Stitch's settings, though Google probably won't strictly comply.
Google Chrome Gemini

Similar to Copilot in Edge. Summarizing webpages is a nice option, though it’s a bit sticky.
https://x.com/Google/status/1924892719739973640
Project Astra
I joined the waitlist at the end of last year and still haven't gotten to try it, which is disappointing. Project Astra demo videos on Google's YouTube channel get high view counts, which shows how anticipated it is.

After watching Project Astra's latest demo videos (helping a guy assemble a bicycle, helping a girl learn chemistry), I want such a universal AI assistant too.

via: https://deepmind.google/models/project-astra/
Try it on
A virtual try-on feature from Google Shopping: upload a full-body photo to try clothes on virtually.
I grabbed a random full-body photo of a pretty woman from Xiaohongshu to experiment with; the result is great.
Original:

Virtual try-on:

https://blog.google/products/shopping/how-to-use-google-shopping-try-it-on/
Android XR
I'm not familiar with this area. See ChatGPT o3's introduction.

https://blog.google/products/android/android-xr/
https://blog.google/products/android/android-xr-gemini-glasses-headsets/
Google Meet Real-time Translation
Initially supports English and Spanish. See related demo video.
One web comment: "Google just ended my translation career." This feature is promising.

Other
There are still some announcements I haven't covered, like Lyria 2 [Google's music generation model] and Google Beam [video conferencing].
As for Jules [Google's AI coding Agent], I tried it yesterday and it felt average. I still need to learn programming properly; AI-assisted learning speeds up progress, but if you just vibe-code all day, you end up being nothing without AI.
Other links:
https://workspace.google.com/blog/product-announcements/new-ways-to-do-your-best-work
https://blog.google/technology/developers/google-ai-developer-updates-io-2025/
https://blog.google/technology/ai/generative-media-models-io-2025/
https://blog.google/outreach-initiatives/education/google-gemini-learnlm-update/
Summary
These are roughly the Day 1 announcements from Google I/O 2025; Day 2 usually brings no major announcements. This week is destined to be extraordinary: whether it's o3-Pro, Grok 3.5, or Claude 4 arriving, stay tuned. Anthropic has an event at midnight Beijing time on May 23, 2025.

Google launching an Ultra plan shows it is following OpenAI's path. Such high pricing works against AI democratization, but since Google already laid the groundwork by letting users claim 15 months of Gemini membership for free, I won't criticize the pricing too harshly.
My most-used AI app has switched to ChatGPT. I haven't abandoned Gemini, Claude, and the others, but overall ChatGPT feels best across the board. As you saw earlier, for new knowledge points I basically use the ChatGPT o3 model to do the research. This Agent-style LLM is just too convenient, with excellent UX; unless its replies are unsatisfying, I'm generally too lazy to open a search engine.
You could say ChatGPT-style Agent LLMs deal a big blow to Perplexity-type AI search wrappers. I believe a future Google Agent LLM similar to o3 might sentence Perplexity-type products to death outright. What's Perplexity's moat? Nothing: the models are outsourced and the web data is outsourced. Eventually it will be consumed by these giants.
Some interesting content I saw today:

Document Info
- License: Free to share - Non-commercial - No derivatives - Attribution required (CC BY-NC-ND 4.0)