Howdy, wizards.
Dario’s Picks
The most important news stories in AI this week
1. The new AI features coming to Pixel 9.
Google recently revealed its new lineup of phones, the Google Pixel 9. There'll be on-device AI, and the Pro phones will come bundled with one year of Gemini Advanced. Here's a recap of the coolest AI features:
Gemini Live: Conversations with the Gemini assistant get near real-time. Apparently it's still a bit too chatty, as I mentioned before. You can also talk to Gemini even when your screen is locked.
Pixel Screenshots: take a screenshot of something you want to remember on your phone, and conversationally search for it later.
Call notes: AI summaries of phone calls (stored locally and not sent to the cloud).
Screen understanding: Gemini can understand what's on your screen and give you answers based on it.
Magic Editor: AI-powered photo editing, with inpainting-style features and more.
Why it matters With the Pixel 9, Google is pushing advanced AI into the pockets of Android users worldwide, and beating Apple to launch. The features are in many ways similar to what Apple has in store with Apple Intelligence, so it'll be interesting to see who actually delivers the better user experience.
2. Researchers build an AI agent that acts like a translation company. You can try a demo here (you'll need an OpenAI API key). Researchers from the University of Macau have proposed an agentic system for translating novels from Chinese to English. It's a system of several LLMs, each assigned a role that mimics a human role in a translation company, e.g. senior and junior editors, localization specialists, and proofreaders (a rough sketch of that kind of role-based pipeline follows below). Hat tip to The Batch.
Why it matters While not revolutionary, studies like this help us move forward in designing powerful workflows with LLMs. Literary translation is far more challenging than translating ordinary text and conversations, especially between languages as different as Chinese and English. The results of this study were generally positive: the agentic translation fared better than GPT-4 on most metrics, most of the time.
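If you're curious what a role-based pipeline like this looks like in practice, here's a minimal sketch using the OpenAI Python client. It is not the researchers' code: the role names, prompts, and model choice below are illustrative assumptions, and the real system is considerably more elaborate.

```python
# Minimal sketch of a role-based "translation company" pipeline.
# Not the paper's implementation: roles, prompts, and model name are
# illustrative assumptions for this sketch only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ROLES = [
    ("junior translator", "Translate this Chinese passage into English, staying close to the source."),
    ("senior editor", "Revise the draft translation for accuracy and literary fluency."),
    ("localization specialist", "Adapt idioms and cultural references so they read naturally in English."),
    ("proofreader", "Fix any remaining grammar, punctuation, or consistency issues. Return only the final text."),
]

def ask(role: str, instruction: str, text: str) -> str:
    """One LLM call acting as a single member of the 'company'."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; any chat model works for the sketch
        messages=[
            {"role": "system", "content": f"You are the {role} at a literary translation company."},
            {"role": "user", "content": f"{instruction}\n\n{text}"},
        ],
    )
    return response.choices[0].message.content

def translate(passage_zh: str) -> str:
    """Pass the text through each role in sequence, like an assembly line."""
    text = passage_zh
    for role, instruction in ROLES:
        text = ask(role, instruction, text)
    return text

if __name__ == "__main__":
    print(translate("他抬头看了看天，叹了口气。"))
```

Each "employee" here is just another LLM call with a different system prompt; the interesting design questions are how the roles critique each other's output and when the pipeline decides a passage is done.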
3. OpenAI banned an Iranian group using ChatGPT to spread political propaganda. As part of OpenAI's ongoing work to disrupt malicious uses of its tools, it banned a cluster of ChatGPT accounts identified as Storm-2035. The accounts were used to generate and spread misinformation on topics including the Gaza conflict and the US presidential election (some accounts posed as progressives, others as conservatives). The content was mainly long-form articles and short social media comments.
Why it matters Preventing malicious actors from using AI as the technology gets more advanced and persuasive is anything but an easy task. It's naive to think OpenAI will stop them all, but AI companies taking this seriously and advancing their methods to prevent misuse is certainly a worthwhile pursuit for society. We've developed numerous ways to fight cybercrime since the birth of the internet, and that work will only intensify in the age of AI.