by Sakshi Dhingra
Former OpenAI CTO Mira Murati is pushing a different vision for artificial intelligence, one where AI systems no longer wait silently for users to finish speaking before responding.
Her startup, Thinking Machines Lab, has unveiled a new category of systems called “interaction models,” designed to process and respond to audio, video, and text simultaneously in real time. The goal is to make conversations with AI feel closer to natural human dialogue rather than the rigid turn-by-turn structure most chatbots use today.
Most modern AI assistants still operate in a stop-and-start pattern. A user speaks or types, the model processes the request, and only then generates a response. According to Thinking Machines, that approach creates what it describes as a “bandwidth bottleneck” between humans and AI systems.
The company’s new architecture attempts to remove that delay by allowing AI models to continuously observe and react while conversations are happening. Internally, the system is described as “full duplex,” a term borrowed from telecommunications, meaning the AI can listen and generate responses at the same time, much as humans do during live conversation.
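Thinking Machines has not published implementation details, but the full-duplex idea can be sketched in miniature: instead of one loop that alternates between reading input and writing output, listening and responding run as concurrent tasks, so a reply to an early chunk of speech can begin while later chunks are still arriving. The sketch below is purely illustrative, assuming simulated text chunks in place of real audio; the names `full_duplex`, `listen`, and `respond` are hypothetical and not TML's API.

```python
import asyncio


async def full_duplex(chunks):
    """Simulate a full-duplex loop: listening and responding overlap in time."""
    heard = asyncio.Queue()
    log = []  # interleaved record of perception and generation events

    async def listen():
        # Continuously ingest input, even while responses are being produced.
        for chunk in chunks:
            log.append(("heard", chunk))
            await heard.put(chunk)
            await asyncio.sleep(0.01)  # simulated gap between arriving chunks
        await heard.put(None)          # end-of-stream sentinel

    async def respond():
        # React to each chunk as it arrives, rather than waiting for a full turn.
        while (chunk := await heard.get()) is not None:
            log.append(("replied", f"ack:{chunk}"))

    await asyncio.gather(listen(), respond())
    return log


if __name__ == "__main__":
    events = asyncio.run(full_duplex(["hello", "how are", "you"]))
    # The reply to "hello" appears before "you" has even been heard,
    # which is exactly the overlap a turn-by-turn chatbot cannot produce.
    print(events)
```

In a turn-by-turn system, every `replied` event would come only after the final `heard` event; here they interleave, which is the behavioral difference the "bandwidth bottleneck" framing points at.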
Thinking Machines claims its preview model, TML-Interaction-Small, can respond with a latency of roughly 0.40 seconds, approaching the pace of natural human turn-taking and outperforming several competing real-time systems on latency.
The company is positioning the technology as more than just a faster voice assistant.
In demonstrations shared by Thinking Machines, the system was shown translating speech live, identifying mentions of animals during storytelling, and even monitoring posture to notify users when they were slouching. The broader idea is to create AI systems that remain contextually aware throughout an interaction instead of freezing perception while generating responses.
That shift could have implications far beyond chatbots. Real-time collaborative AI may eventually influence tutoring systems, enterprise copilots, accessibility tools, live customer support, robotics, and wearable AI experiences where constant awareness matters more than isolated prompts.
Since leaving OpenAI in 2024, Murati has rapidly built Thinking Machines into one of the most closely watched AI startups in Silicon Valley. The company officially launched in 2025 and recruited researchers and engineers from OpenAI, Meta, Anthropic, and Mistral.
The startup has attracted significant investor attention amid the growing race to define the next generation of AI interfaces. Rather than focusing purely on larger models or benchmark scores, Thinking Machines appears to be betting heavily on interaction quality and responsiveness as the next competitive frontier.
That direction aligns with a broader shift happening across the AI industry, where companies are increasingly experimenting with multimodal and conversational systems capable of handling voice, video, and live contextual awareness simultaneously.
Despite the early excitement, Thinking Machines’ announcement remains a research preview rather than a publicly available product. The company says a limited preview will arrive in the coming months, followed by a wider release later this year.
Whether users will actually prefer AI systems that can interrupt, react instantly, or maintain continuous awareness remains uncertain. Real-time AI also introduces additional technical and privacy challenges, especially around persistent listening, contextual memory, and live multimodal processing.
Still, the announcement signals a growing belief inside the AI industry that future assistants may need to behave less like search engines and more like active collaborators.