AI Giants and Phone Makers Battle for OS-Level Control

A tectonic shift is underway within the mobile industry, as a fundamental struggle for control of the phone’s core operating system pits the ambitions of artificial intelligence developers against the established dominance of smartphone manufacturers. This is not a simple battle over adding another app or feature; it is a conflict that will ultimately define the next generation of user experience and dictate how billions of people interact with their most personal devices. The era of command-based voice assistants is rapidly closing, making way for a new paradigm centered on the system-level AI Agent. These sophisticated entities are designed to be deeply woven into the fabric of the mobile OS, granting them the power to understand complex user intent and autonomously execute intricate tasks across multiple, unrelated applications. This industry-wide pivot was catalyzed by a landmark collaboration that, despite its commercial failure, offered the world a tangible glimpse into the future of intelligent mobile computing and ignited the race for OS-level supremacy.

The Catalyst: A Failed Phone Ignites an Industry

While the Nubia M153 smartphone, a device born from a deep collaboration between ByteDance and ZTE, was a commercial disappointment with a reported 30,000 unsold units, its strategic impact far outweighed its sales figures. The device served as a crucial “signal flare” for the entire tech sector, acting as the first real-world proof-of-concept for a deeply integrated, system-level AI Agent. Its significance lies not in its market performance but in its role as a successful “engineering verification” that fundamentally altered the industry’s trajectory. The key innovation was the embedding of ByteDance’s Doubao assistant not as a simple application, but as a Graphical User Interface Agent at the foundational OS layer. This unique position allowed the AI to visually perceive the screen and directly manipulate on-screen elements, enabling it to autonomously execute complex, multi-app workflows that were previously impossible for traditional assistants. This demonstration of a true, system-level Agent capable of directly managing the user interface provided a tangible prototype of the future, forcing the industry to reckon with its implications.

The release of the Nubia M153 immediately triggered a defensive response from the technology ecosystem’s most powerful incumbents. Super-apps like WeChat and Taobao, recognizing the disruptive potential of an OS-level agent that could bypass their carefully controlled user journeys, quickly implemented “countermeasures” to block such automated operations. This resistance underscored the profound shift in power dynamics that system-level AI represents. However, despite these roadblocks and the device’s lackluster sales, the imaginative potential unlocked by the M153’s capabilities proved too significant to ignore. It served as the ultimate catalyst, validating the concept of the edge-side Agent and prompting other major hardware manufacturers, including vivo, Lenovo, and Transsion, to enter into serious strategic discussions with ByteDance. The M153’s failure as a product became secondary to its success in accelerating the industry’s inevitable march toward a new, more intelligent mobile paradigm.

A Convergence of Necessity and Opportunity

The strategic convergence of mobile phone and AI manufacturers is not an exploratory venture but a necessary evolution driven by a confluence of technological readiness and pressing industrial demand. For phone manufacturers, the smartphone market has long been characterized by incremental hardware upgrades that fail to generate significant consumer excitement or product differentiation. In an urgent search for a compelling experiential breakthrough, system-level AI presents the most promising path forward. The logic of the “Siri era,” which relied on fixed instructions and basic Q&A, no longer satisfies user expectations for true intelligence. Furthermore, a massive user base for on-device assistants already exists. A Q3 2025 report revealed that mobile phone manufacturers’ proprietary AI assistants command a staggering 535 million monthly active users. This built-in audience represents fertile ground for introducing more advanced capabilities, as system-level assistants hold the natural advantage of being “ready-to-use,” unlike standalone AI apps that users must actively seek out and launch.

This symbiotic relationship is equally crucial for AI manufacturers. Companies like ByteDance possess mature, powerful large models and sophisticated engineering capabilities, but for them, the mobile phone represents the most critical, high-frequency entry point to billions of users. Gaining a permanent, integrated foothold at the OS level provides an unparalleled channel for deploying their AI, gathering valuable usage data, and establishing their ecosystem’s dominance. This convergence is made possible by two parallel technological leaps. First is the dramatic improvement in the capabilities of large models, which have shown marked advances in instruction understanding, multi-round planning, and tool-calling, finally allowing AI to undertake complex task chains. Second is the rapid advancement in mobile hardware. Analysis has shown that by 2025, 88% of high-end SoCs shipped possessed generative AI capabilities. The peak AI computing power of these chips approached 100 TOPS, a fourfold increase from just a few years prior. This surge in Neural Processing Unit performance and energy efficiency has made “edge-side execution”—running powerful AI models locally on the device—a practical and efficient reality.

The Two Fronts: Integration vs. In-House Control

As collaborations between AI developers and phone makers become more common, a fundamental tension has emerged, crystallizing into a competition between “system and ecosystem.” This conflict has given rise to two distinct models of cooperation. The first, exemplified by the Nubia M153, is one of deep integration. In this scenario, the AI manufacturer deeply embeds its product into the operating system, effectively defining the entire AI experience, its interaction modes, and its permission boundaries. This model, however, requires the phone maker to cede significant control over a core user-facing feature to an external partner. The skepticism of established hardware players toward this approach was captured in a remark from an Honor engineer, who noted that “two short people together won’t give birth to a tall one,” reflecting a deep-seated reluctance to relinquish such a critical part of the user experience.

In stark contrast, the dominant and more realistic model preferred by major manufacturers like Honor and vivo is one of “capability invocation.” In this framework, the phone manufacturer maintains full ownership and control of its native AI Agent, such as Honor’s YOYO Agent or vivo’s Blue Heart Intelligence. They lead the product logic, system integration, and user experience design, while simply “accessing” or “invoking” the underlying model capabilities of an AI company. Here, the AI company acts as a “capability provider” rather than a “product definer,” a less threatening and more manageable partnership for the hardware giants. This preference has spurred a powerful counter-movement from these companies, who are now leveraging their long-standing advantages in OS control, hardware integration, and vast device ecosystems to fortify their positions. This strategic maneuvering represents a direct challenge to the ambitions of AI cloud manufacturers, framing the next stage of competition as a battle to see who will ultimately own the user’s AI entry point and dictate the rules of the ecosystem.
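The "capability invocation" arrangement can be pictured as a thin interface between the phone maker's agent and an interchangeable model supplier. The sketch below is purely illustrative: `ModelProvider`, `EchoModel`, and `NativeAgent` are hypothetical names, not any vendor's actual API, and the permission check stands in for whatever product logic a manufacturer would really apply.

```python
from typing import Protocol, Set


class ModelProvider(Protocol):
    """What the phone maker expects from any external model supplier."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Hypothetical stand-in for an AI company's model endpoint."""

    def complete(self, prompt: str) -> str:
        return f"[model reply to: {prompt}]"


class NativeAgent:
    """Hypothetical manufacturer-owned agent.

    The agent owns the permission boundary and the user-facing behavior;
    the provider only supplies model output.
    """

    def __init__(self, provider: ModelProvider, allowed_actions: Set[str]) -> None:
        self.provider = provider
        self.allowed_actions = allowed_actions

    def handle(self, action: str, request: str) -> str:
        # The phone maker, not the model supplier, decides what is permitted.
        if action not in self.allowed_actions:
            return f"blocked: '{action}' is outside the agent's permissions"
        return self.provider.complete(request)


agent = NativeAgent(EchoModel(), {"summarize"})
print(agent.handle("summarize", "today's messages"))
print(agent.handle("pay", "hotel invoice"))
```

Because the agent depends only on the `complete` method, the manufacturer could swap one supplier for another without changing the product around it, which is the leverage the "capability provider" framing implies.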

Redefining the Role of the App

The rise of powerful, autonomous AI Agents has raised an existential question about the future of traditional applications. However, the emerging consensus, articulated by industry leaders, is that this dynamic is one of evolution, not elimination. The primary goal of AI is to meet user needs more conveniently and to expand the set of previously unmet needs. In this new structure, the Agent’s role is not to replace apps but to fundamentally change the user’s “entry position.” The AI Agent is evolving into a sophisticated “demand scheduling layer.” It will be responsible for understanding a user’s high-level intent, disassembling that intent into a series of smaller, actionable tasks, and then distributing the execution of these tasks to the most appropriate applications or system capabilities available. In this paradigm, apps transform from being the starting point of a user’s journey into becoming encapsulated “capabilities and services” that are called upon by the Agent as needed.
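As a rough illustration of the "demand scheduling layer" idea, the sketch below takes a high-level intent, breaks it into tasks, and routes each task to whichever registered app offers the matching capability. Everything here is an assumption made for illustration: the `DemandScheduler` class, the capability names, and the fixed lookup table that stands in for a model's planning step are hypothetical and do not describe any shipping system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    capability: str  # name of the capability required, e.g. "book_flight"
    detail: str      # free-text parameters for the task


class DemandScheduler:
    """Hypothetical Agent layer: plans tasks, then routes them to apps."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def register(self, capability: str, handler: Callable[[str], str]) -> None:
        # An app exposes itself as a callable capability rather than an entry point.
        self._handlers[capability] = handler

    def plan(self, intent: str) -> List[Task]:
        # Stand-in for the model's planning step: a fixed lookup table.
        plans = {
            "weekend trip to Shanghai": [
                Task("book_flight", "to Shanghai, Saturday"),
                Task("book_hotel", "Shanghai, 1 night"),
                Task("pay", "flight + hotel"),
            ]
        }
        return plans.get(intent, [])

    def run(self, intent: str) -> List[str]:
        results = []
        for task in self.plan(intent):
            handler = self._handlers.get(task.capability)
            if handler is None:
                results.append(f"no app offers '{task.capability}'")
            else:
                results.append(handler(task.detail))
        return results


scheduler = DemandScheduler()
scheduler.register("book_flight", lambda d: f"travel app: flight booked ({d})")
scheduler.register("book_hotel", lambda d: f"travel app: hotel booked ({d})")
scheduler.register("pay", lambda d: f"payment app: paid for {d}")

for line in scheduler.run("weekend trip to Shanghai"):
    print(line)
```

Note that the scheduler never performs the booking or the payment itself. It only decides which registered capability to call, which mirrors the point that apps persist as services even when they stop being the user's starting point.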

While this new structure undeniably shifts the power dynamic away from individual app developers and toward the controller of the Agent, the core value of major applications will not be rapidly weakened. Top-tier platforms that control essential functions like payment systems, user accounts, content supply, and security infrastructure are not easily replaceable by an Agent in the short term. An AI Agent may be able to book a flight and a hotel, but it will still rely on the payment and identity services provided by established financial and social apps to complete the transactions securely. The Agent will change how users interact with these critical services, orchestrating them in more seamless and intelligent ways, but the services themselves remain the indispensable infrastructure of the digital economy. The battle, therefore, is not over the existence of apps, but over who controls the primary interface that commands them.

The Dawn of a New Computing Era

The mobile industry has reached a clear inflection point, with the experimental Nubia M153 having served its purpose as an industry-wide catalyst. It solidified the collective belief that the edge-side, system-level Agent is an unavoidable and essential path forward for mobile computing. This realization has ignited a race on two fronts: AI companies are aggressively pushing to “board” mobile phones and secure their place within the OS, while established phone manufacturers like Xiaomi and Huawei are fortifying their own OS-centric AI ecosystems to retain their long-held dominance. The market is maturing rapidly, as user behavior shifts from tentatively “trying out” AI to actively “relying on” it for instant, short-duration needs, a use case well suited to a native mobile Agent. The competition is also expanding beyond the smartphone to other form factors like AI smart glasses, where the Agent becomes even more fundamental to the core user experience. Ultimately, the future of the AI-powered mobile device will be defined by the resolution of this “system vs. ecosystem” struggle, which will answer the pivotal question of who controls the intelligence: the AI cloud manufacturers providing the models, or the hardware manufacturers who own the operating system and the device.
