Understanding the Forward Deployed Engineer (FDE) Model for AI Startups

English Podcast

Chinese Version (English translation)

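Recently, Y Combinator invited Bob McGrew, the former Chief Research Officer at OpenAI and a veteran engineer from PayPal and Palantir. Surprisingly, the founders in the room did not press him on how to build the next GPT. Instead, they all wanted to know how Palantir’s FDE model actually works. Bob admitted that over the past year he has advised countless startups, and almost all of them are obsessed with how to make this model work in practice.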

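What Is FDE?

The core idea of FDE (Forward Deployed Engineer) is to station engineers directly at the customer’s front line, where they bridge the gap between the “ideal product” and the “real need.” The approach originated in Palantir’s years of serving U.S. intelligence agencies. The customers’ challenges were extremely complex and had no ready-made template, so solutions had to be pieced together on site. At first, many people considered the model unscalable, too labor-intensive, and at odds with the standardized SaaS ideal. Today, however, startups exploring AI Agents and enterprise deployment treat it as a guiding principle.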

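How It Works

Palantir splits its FDE teams into two roles:

  • Echo: the industry expert who digs into the customer’s workflow, uncovers core pain points, and dares to question the status quo.
  • Delta: the hands-on technologist who can iterate quickly on site and turn ideas into working prototypes.

Meanwhile, the core product team at headquarters distills these improvised frontline “gravel roads” into real platform features, much like upgrading a gravel track step by step into a reusable highway.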

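Why It Matters

The biggest advantage of the FDE model is that it builds a very deep partnership with customers and uncovers real needs that no survey or questionnaire could reveal. Executed well, it forms a strong moat. But the risks are just as real. Without discipline, FDE easily degrades into traditional consulting or outsourcing. The test of health is whether the core product keeps evolving and whether delivery efficiency keeps improving. If it is merely project delivery by sheer headcount, it is heading in the opposite direction from the goal.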

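The Essential Difference from Consulting

The key differences are:

  • Consulting solves only one-off problems.
  • FDE requires feeding frontline experience and solutions back into the platform, so the product grows stronger with every customer it serves.

This feedback loop, together with product managers’ ability to abstract custom requests into general-purpose features, is the true essence of FDE.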

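Why AI Startups Are All Copying It

For AI Agent companies, the market is too fragmented and uncertain for a winner-take-all product to exist. Embedding deeply at the customer’s site is not an option but the only path of exploration. Only this way can they find the real product shape and product-market fit.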

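A Shift in Business Models

Traditional SaaS scales through subscriptions, while FDE contracts lean toward outcome-based, flexible pricing. The key lever here is product leverage: whether the same frontline investment can yield larger contracts while steadily lowering the marginal cost of the next customization.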

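The Bigger Picture

The popularity of FDE reveals a paradox of modern tech companies: companies that scale often have to keep doing the things that “don’t scale.” AI capabilities are exploding, yet a huge gap remains before they are truly deployed. It is in this gap that the biggest opportunity for today’s startups lies. This is not an easy road; it is more like protracted trench warfare than a quick blitzkrieg. But for founders, it may be the only viable path.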

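[Artificial Intelligence] What Is FDE? Why Is It Exploding in Silicon Valley? | Forward Deployed Engineer | Bob McGrew | Palantir | Historical Origins | PMF | HQ Product Platform | Echo & Delta Teams | A Step Backward in History?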


Recently, Y Combinator hosted Bob McGrew, the former Chief Research Officer at OpenAI and a veteran technologist from PayPal and Palantir. What surprised many was the line of questioning. Instead of asking him how to build the next GPT, founders kept pressing him on a very different topic: Palantir’s FDE model.

Bob admitted that over the past year, nearly every startup he’s advised has been obsessed with learning how this model works in practice.

What Exactly Is FDE?

FDE (Forward Deployed Engineer) is a model where engineers embed directly with customers to bridge the gap between what the product aspires to be and what the customer actually needs.

The idea traces back to Palantir’s early days working with U.S. intelligence agencies. The challenges were messy, complex, and had no off-the-shelf solutions. The only way forward was to “build on the ground” with the client. At the time, many dismissed it as unscalable, labor-intensive, and far from the clean SaaS ideal. Fast forward to today, and the very same approach is being embraced by AI startups building agents and enterprise solutions.

How It Works

Palantir structured its FDE teams around two roles:

  • Echo: the industry-savvy operator who lives inside the customer’s workflow, identifies core pain points, and challenges the status quo.
  • Delta: the technical builder who can spin up prototypes quickly, solving problems in real time.

Meanwhile, the core product team back at HQ takes these frontline hacks and turns them into platform features. Think of it as paving a permanent road where the FDEs first laid down gravel.

Why It Matters

The strength of the FDE model is that it forges unusually deep relationships with customers. It surfaces real market demand—things no survey or user interview could ever uncover. Done right, it creates a defensible moat.

But it’s also risky. Without discipline, FDE can collapse into traditional consulting or body-shop outsourcing. The litmus test of a healthy model is whether the core platform keeps evolving, making each new deployment faster, cheaper, and more scalable.

Different from Consulting

The distinction is critical:

  • Consulting delivers one-off solutions.
  • FDE is about feeding learnings back into the product, so the platform gets stronger with every customer.

This feedback loop—and the ability of product managers to abstract from bespoke requests—is what turns customer-specific fixes into reusable product capabilities.

Why AI Startups Love It

For AI Agent companies, the market is far too fragmented and unpredictable for a “one-size-fits-all” solution. No universal product exists. Embedding deeply with customers isn’t optional—it’s the only way to figure out what works, discover product-market fit, and build enduring platforms.

A Shift in Business Models

Unlike traditional SaaS, which scales on pure subscriptions, FDE contracts are more outcome-driven and flexible. The key lever is product leverage: doing the same amount of frontline work but translating it into larger contracts and less marginal customization over time.
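
One rough way to make “product leverage” concrete is to track contract value per frontline hour alongside the share of bespoke work, deployment by deployment. The sketch below is only an illustration under stated assumptions: the `Deployment` fields, the `leverage` ratio, and the `is_gaining_leverage` rule are hypothetical names and definitions, not a metric attributed to Bob McGrew or Palantir, and the numbers are made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Deployment:
    customer: str
    contract_value: float   # hypothetical contract size in dollars
    fde_hours: float        # frontline engineering hours spent on the deployment
    custom_fraction: float  # share of delivered work that was bespoke, 0.0 to 1.0


def leverage(d: Deployment) -> float:
    """Contract dollars earned per hour of frontline work (assumed definition)."""
    return d.contract_value / d.fde_hours


def is_gaining_leverage(history: list[Deployment]) -> bool:
    """True if each deployment earns more per FDE hour than the previous one
    while requiring no larger a share of bespoke work."""
    pairs = zip(history, history[1:])
    return all(
        leverage(later) > leverage(earlier)
        and later.custom_fraction <= earlier.custom_fraction
        for earlier, later in pairs
    )


# Made-up numbers for illustration only.
platform_like = [
    Deployment("A", 500_000, 2_000, 0.8),
    Deployment("B", 900_000, 2_000, 0.5),
    Deployment("C", 1_500_000, 2_000, 0.3),
]
consulting_like = [
    Deployment("X", 500_000, 2_000, 0.8),
    Deployment("Y", 520_000, 2_600, 0.8),
]

print(is_gaining_leverage(platform_like))    # same hours, bigger contracts, less custom work
print(is_gaining_leverage(consulting_like))  # revenue only grows by adding hours
```

The second series is the consulting failure mode described earlier: revenue rises only because more hours are sold, so the ratio falls.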

The Bigger Picture

The rise of FDE highlights a paradox of modern tech: at scale, the best companies keep doing the things that “don’t scale.” The gulf between breakthrough AI capabilities and messy, real-world adoption is exactly where the biggest opportunities lie today.

It’s not an easy path—more trench warfare than blitzscaling—but for founders, it may be the only one that works.


Watch the full discussion here: The FDE Playbook for AI Startups with Bob McGrew

AI-Powered Search: Google’s Transformation vs. Perplexity

TL;DR, Play the podcast (Audio Overview generated by NotebookLM)

  1. Abstract
  2. Google’s AI Transformation: From PageRank to Gemini-Powered Search
    1. The Search Generative Experience (SGE) Revolution
    2. Google’s LLM Arsenal
    3. Technical Architecture Integration
    4. Key Differentiators of Google’s AI Search
  3. Perplexity AI Architecture: The RAG-Powered Search Revolution
    1. Simplified Architecture View
    2. How Perplexity Works: From Query to Answer
    3. Technical Workflow Diagram
  4. The New Search Paradigm: AI-First vs AI-Enhanced Approaches
    1. Google’s Philosophy: “AI-Enhanced Universal Search”
    2. Perplexity’s Philosophy: “AI-Native Conversational Search”
    3. Comprehensive Technology & Business Comparison
  5. The Future of AI-Powered Search: A New Competitive Landscape
    1. Implementation Strategy Battle: Integration vs. Innovation
    2. The Multi-Modal Future
    3. Business Model Evolution Under AI
    4. Technical Architecture Convergence
    5. The Browser and Distribution Channel Wars
  6. Strategic Implications and Future Outlook
    1. Key Strategic Insights
    2. The New Competitive Dynamics
    3. Looking Ahead: Industry Predictions
  7. Recommendations for Stakeholders
  8. Conclusion

Abstract

This blog examines the rapidly evolving landscape of AI-powered search, comparing Google’s recent transformation with its Search Generative Experience (SGE) and Gemini integration against Perplexity AI’s native AI-first approach. Both companies now leverage large language models, but with fundamentally different architectures and philosophies.

The New Reality: Google has undergone a dramatic transformation from traditional keyword-based search to an AI-driven conversational answer engine. With the integration of Gemini, LaMDA, PaLM, and the rollout of AI Overviews (formerly SGE), Google now synthesizes information from multiple sources into concise, contextual answers—directly competing with Perplexity’s approach.

Key Findings:

  • Convergent Evolution: Both platforms now use LLMs for answer generation, but Google maintains its traditional search infrastructure while Perplexity was built AI-first from the ground up
  • Architecture Philosophy: Google integrates AI capabilities into its existing search ecosystem (hybrid approach), while Perplexity centers everything around RAG and multi-model orchestration (AI-native approach)
  • AI Technology Stack: Google leverages Gemini (multimodal), LaMDA (conversational), and PaLM models, while Perplexity orchestrates external models (GPT, Claude, Gemini, Llama, DeepSeek)
  • User Experience: Google provides AI Overviews alongside traditional search results, while Perplexity delivers answer-first experiences with citations
  • Market Dynamics: The competition has intensified with Google’s AI transformation, making the choice between platforms more about implementation philosophy than fundamental capabilities

This represents a paradigm shift where the question is no longer “traditional vs. AI search” but rather “how to best implement AI-powered search” with different approaches to integration, user experience, and business models.

Keywords: AI Search, RAG, Large Language Models, Search Architecture, Perplexity AI, Google Search, Conversational AI, SGE, Gemini.

Google’s AI Transformation: From PageRank to Gemini-Powered Search

Google has undergone one of the most significant transformations in its history, evolving from a traditional link-based search engine to an AI-powered answer engine. This transformation represents a strategic response to the rise of AI-first search platforms and changing user expectations.

The Search Generative Experience (SGE) Revolution

Google’s Search Generative Experience (SGE), now known as AI Overviews, fundamentally changes how search results are presented:

  • AI-Synthesized Answers: Instead of just providing links, Google’s AI generates comprehensive insights, explanations, and summaries from multiple sources
  • Contextual Understanding: Responses consider user context including location, search history, and preferences for personalized results
  • Multi-Step Query Handling: The system can handle complex, conversational queries that require reasoning and synthesis
  • Real-Time Information Grounding: AI Overviews are grounded in current, real-time information while maintaining accuracy

Google’s LLM Arsenal

Google has strategically integrated multiple advanced AI models into its search infrastructure:

Gemini: The Multimodal Powerhouse

  • Capabilities: Understands and generates text, images, videos, and audio
  • Search Integration: Enables complex query handling including visual search, reasoning tasks, and detailed information synthesis
  • Multimodal Processing: Handles queries that combine text, images, and other media types

LaMDA: Conversational AI Foundation

  • Purpose: Powers natural, dialogue-like interactions in search
  • Features: Enables follow-up questions and conversational context maintenance
  • Integration: Supports Google’s shift toward conversational search experiences

PaLM: Large-Scale Language Understanding

  • Role: Provides advanced language processing capabilities
  • Applications: Powers complex reasoning, translation (100+ languages), and contextual understanding
  • Scale: Handles extended documents and multimodal inputs

Technical Architecture Integration

Google’s approach differs from AI-first platforms by layering AI capabilities onto existing infrastructure:

  • Hybrid Architecture: Maintains traditional search capabilities while adding AI-powered features
  • Scale Integration: Leverages existing massive infrastructure and data
  • DeepMind Synergy: Strategic integration of DeepMind research into commercial search applications
  • Continuous Learning: ML ranking algorithms and AI models learn from user interactions in real time
  • Global Reach: AI features deployed across 100+ languages with localized understanding

Perplexity AI Architecture: The RAG-Powered Search Revolution

Perplexity AI represents a fundamental reimagining of search technology, built on three core innovations:

  1. Retrieval-Augmented Generation (RAG): Combines real-time web crawling with large language model capabilities
  2. Multi-Model Orchestration: Leverages multiple AI models (GPT, Claude, Gemini, Llama, DeepSeek) for optimal responses
  3. Integrated Citation System: Provides transparent source attribution with every answer

The platform offers multiple access points to serve different user needs: Web Interface, Mobile App, Comet Browser, and Enterprise API.

Core Architecture Components

Simplified Architecture View

For executive presentations and high-level discussions, this three-layer view highlights the essential components:

How Perplexity Works: From Query to Answer

Understanding Perplexity’s workflow reveals why it delivers fundamentally different results than traditional search engines. Unlike Google’s approach of matching keywords to indexed pages, Perplexity follows a sophisticated multi-step process:

The Eight-Step Journey

  1. Query Reception: User submits a natural language question through any interface
  2. Real-Time Retrieval: Custom crawlers search the web for current, relevant information
  3. Source Indexing: Retrieved content is processed and indexed in real time
  4. Context Assembly: RAG system compiles relevant information into coherent context
  5. Model Selection: AI orchestrator chooses the optimal model(s) for the specific query type
  6. Answer Generation: Selected model(s) generate comprehensive responses using retrieved context
  7. Citation Integration: System automatically adds proper source attribution
  8. Response Delivery: Final answer with citations is presented to the user
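
The eight steps above can be sketched as a toy pipeline. This is a minimal illustration under loud assumptions, not Perplexity’s implementation: the in-memory `CORPUS` stands in for live crawling, keyword overlap stands in for real retrieval and indexing, the model names are placeholders, and `generate` is a stub rather than an LLM call. Every function name here is hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    text: str


# Hypothetical in-memory "web"; a real system would crawl live pages.
CORPUS = [
    Source("https://example.com/rag",
           "Retrieval-augmented generation grounds model answers in retrieved documents."),
    Source("https://example.com/pagerank",
           "PageRank ranks web pages by their link structure."),
    Source("https://example.com/citations",
           "Citations let readers verify generated answers against sources."),
]


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Source], k: int = 2) -> list[Source]:
    """Steps 2-3: score sources by naive keyword overlap and keep the top k hits."""
    query_terms = tokens(query)
    scored = [(len(query_terms & tokens(s.text)), s) for s in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for score, s in ranked[:k] if score > 0]


def assemble_context(sources: list[Source]) -> str:
    """Step 4: number each passage so the answer can cite it."""
    return "\n".join(f"[{i}] {s.text}" for i, s in enumerate(sources, start=1))


def select_model(query: str) -> str:
    """Step 5: toy routing rule; both model names are placeholders."""
    return "reasoning-model" if len(query.split()) > 12 else "fast-model"


def generate(model: str, query: str, context: str) -> str:
    """Step 6: stub standing in for an LLM call; it only echoes the context."""
    return f"({model}) Based on the retrieved sources:\n{context}"


def answer(query: str) -> str:
    """Steps 1-8 end to end: query in, cited answer out."""
    sources = retrieve(query, CORPUS)
    context = assemble_context(sources)
    draft = generate(select_model(query), query, context)
    # Step 7: attach a numbered source list matching the context numbering.
    citations = "\n".join(f"[{i}] {s.url}" for i, s in enumerate(sources, start=1))
    return f"{draft}\n\nSources:\n{citations}"  # Step 8: response delivery


print(answer("How does retrieval-augmented generation use documents?"))
```

Keeping the passage numbers and the source list in the same order is the one design point worth noting: it lets any `[n]` marker in the answer be traced back to a URL.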

Technical Workflow Diagram

The sequence below shows how a user query flows through Perplexity’s system.

This process typically completes in under 3 seconds, delivering both speed and accuracy.

The New Search Paradigm: AI-First vs AI-Enhanced Approaches

The competition between Google and Perplexity has evolved beyond traditional vs. AI search to represent two distinct philosophies for implementing AI-powered search experiences.

Google’s Philosophy: “AI-Enhanced Universal Search”

  • Hybrid Integration: Layer advanced AI capabilities onto proven search infrastructure
  • Comprehensive Coverage: Maintain traditional search results alongside AI-generated overviews
  • Gradual Transformation: Evolve existing user behaviors rather than replace them entirely
  • Scale Advantage: Leverage massive existing data and infrastructure for AI training and deployment

Perplexity’s Philosophy: “AI-Native Conversational Search”

  • Model Agnostic: Orchestrate best-in-class models rather than developing proprietary AI
  • Clean Slate Design: Built from the ground up with AI-first architecture
  • Answer-Centric: Focus entirely on direct answer generation with source attribution
  • Conversational Flow: Design for multi-turn, contextual conversations rather than single queries

Comprehensive Technology & Business Comparison

| Dimension | Google AI-Enhanced Search | Perplexity AI-Native Search |
|---|---|---|
| Input | Natural language + traditional keywords | Pure natural language, conversational |
| AI Models | Gemini, LaMDA, PaLM (proprietary) | GPT, Claude, Gemini, Llama, DeepSeek (orchestrated) |
| Architecture | Hybrid (AI + traditional infrastructure) | Pure AI-first (RAG-centered) |
| Retrieval | Enhanced index + Knowledge Graph + real-time | Custom crawler + real-time retrieval |
| Core Tech | AI Overviews + traditional ranking | RAG + multi-model orchestration |
| Output | Hybrid (AI Overview + links + ads) | Direct answers with citations |
| Context | Limited conversational memory | Full multi-turn conversation memory |
| Extensions | Maps, News, Shopping, Ads integration | Document search, e-commerce, APIs |
| Business | Ad-driven + AI premium features | Subscription + API + e-commerce |
| UX | “AI answers + traditional options” | “Conversational AI assistant” |
| Products | Google Search with SGE/AI Overviews, Ads | Perplexity Web/App, Comet Browser |
| Deployment | Global rollout with localization | Global expansion, English-focused |
| Data Advantage | Massive proprietary data + real-time | Real-time web data + model diversity |

The Future of AI-Powered Search: A New Competitive Landscape

The integration of AI into search has fundamentally changed the competitive landscape. Rather than a battle between traditional and AI search, we now see different approaches to implementing AI-powered experiences competing for user mindshare and market position.

Implementation Strategy Battle: Integration vs. Innovation

Google’s Integration Strategy:

  • Advantage: Massive user base and infrastructure to deploy AI features at scale
  • Challenge: Balancing AI innovation with existing business model dependencies
  • Approach: Gradual rollout of AI features while maintaining traditional search options

Perplexity’s Innovation Strategy:

  • Advantage: Clean slate design optimized for AI-first experiences
  • Challenge: Building user base and competing with established platforms
  • Approach: Focus on superior AI experience to drive user acquisition

The Multi-Modal Future

Both platforms are moving toward comprehensive multi-modal experiences:

  • Visual Search Integration: Google Lens vs. Perplexity’s image understanding capabilities
  • Voice-First Interactions: Google Assistant integration vs. conversational AI interfaces
  • Video and Audio Processing: Gemini’s multimodal capabilities vs. orchestrated model approaches
  • Document Intelligence: Enterprise document search and analysis capabilities

Business Model Evolution Under AI

Advertising Model Transformation:

  • Google must adapt its ad-centric model to AI Overviews without disrupting user experience
  • Challenge of monetizing direct answers vs. traditional click-through advertising
  • Need for new ad formats that work with conversational AI

Subscription and API Models:

  • Perplexity’s success with subscription tiers validates alternative monetization
  • Growing enterprise demand for AI-powered search APIs and integrations
  • Premium features becoming differentiators (document search, advanced models, higher usage limits)

Technical Architecture Convergence

Despite different starting points, both platforms are converging on similar technical capabilities:

  • Real-Time Information: Both now emphasize current, up-to-date information retrieval
  • Source Attribution: Transparency and citation becoming standard expectations
  • Conversational Context: Multi-turn conversation support across platforms
  • Model Diversity: Google developing multiple specialized models, Perplexity orchestrating external models

The Browser and Distribution Channel Wars

Perplexity’s Chrome Acquisition Strategy:

  • $34.5B all-cash bid for Chrome represents unprecedented ambition in AI search competition
  • Strategic Value: Control over browser defaults, user data, and search distribution
  • Market Impact: Success would fundamentally alter competitive dynamics and user acquisition costs
  • Regulatory Reality: Bid likely serves as strategic positioning and leverage rather than realistic acquisition

Alternative Distribution Strategies:

  • AI-native browsers (Comet) as specialized entry points
  • API integrations into enterprise and developer workflows
  • Mobile-first experiences capturing younger user demographics

Strategic Implications and Future Outlook

The competition between Google’s AI-enhanced approach and Perplexity’s AI-native strategy represents a fascinating case study in how established platforms and startups approach technological transformation differently.

Key Strategic Insights

  • The AI Integration Challenge: Google’s transformation demonstrates that even dominant platforms must fundamentally reimagine their core products to stay competitive in the AI era
  • Architecture Philosophy Matters: The choice between hybrid integration (Google) vs. AI-first design (Perplexity) creates different strengths, limitations, and user experiences
  • Business Model Pressure: AI-powered search challenges traditional advertising models, forcing experimentation with subscriptions, APIs, and premium features
  • User Behavior Evolution: Both platforms are driving the shift from “search and browse” to “ask and receive” interactions, fundamentally changing how users access information

The New Competitive Dynamics

Advantages of Google’s AI-Enhanced Approach:

  • Massive scale and infrastructure for global AI deployment
  • Existing user base to gradually transition to AI features
  • Deep integration with knowledge graphs and proprietary data
  • Ability to maintain traditional search alongside AI innovations

Advantages of Perplexity’s AI-Native Approach:

  • Optimized user experience designed specifically for conversational AI
  • Agility to implement cutting-edge AI techniques without legacy constraints
  • Model-agnostic architecture leveraging best-in-class external AI models
  • Clear value proposition for users seeking direct, cited answers

Looking Ahead: Industry Predictions

Near-Term (1-2 years):

  • Continued convergence of features between platforms
  • Google’s global rollout of AI Overviews across all markets and languages
  • Perplexity’s expansion into enterprise and specialized vertical markets
  • Emergence of more AI-native search platforms following Perplexity’s model

Medium-Term (3-5 years):

  • AI-powered search becomes the standard expectation across all platforms
  • Specialized AI search tools for professional domains (legal, medical, scientific research)
  • Integration of real-time multimodal capabilities (live video analysis, augmented reality search)
  • New regulatory frameworks for AI-powered information systems

Long-Term (5+ years):

  • Fully conversational AI assistants replace traditional search interfaces
  • Personal AI agents that understand individual context and preferences
  • Integration with IoT and ambient computing for seamless information access
  • Potential emergence of decentralized, blockchain-based search alternatives

Recommendations for Stakeholders

For Technology Leaders:

  • Hybrid Strategy: Consider Google’s approach of enhancing existing systems with AI rather than complete rebuilds
  • Model Orchestration: Investigate Perplexity’s approach of orchestrating multiple AI models for optimal results
  • Real-Time Capabilities: Invest in real-time information retrieval and processing systems
  • Citation Systems: Implement transparent source attribution to build user trust

For Business Strategists:

  • Revenue Model Innovation: Experiment with subscription, API, and premium feature models beyond traditional advertising
  • User Experience Focus: Prioritize conversational, answer-first experiences in product development
  • Distribution Strategy: Evaluate the importance of browser control and default search positions
  • Competitive Positioning: Decide between AI-enhancement of existing products vs. AI-native alternatives

For Investors:

  • Platform Risk Assessment: Evaluate how established platforms are adapting to AI disruption
  • Technology Differentiation: Assess the sustainability of competitive advantages in a rapidly evolving AI landscape
  • Business Model Viability: Monitor the success of alternative monetization strategies beyond advertising
  • Regulatory Impact: Consider potential regulatory responses to AI-powered information systems and search market concentration
  • Monitor the browser control battle and distribution channel acquisitions

The future of search will be determined by execution quality, user adoption, and the ability to balance innovation with practical business considerations. Both Google and Perplexity have established viable but different paths forward, setting the stage for continued innovation and competition in the AI-powered search landscape.

Conclusion

The evolution of search from Google’s traditional PageRank-driven approach to today’s AI-powered landscape represents one of the most significant technological shifts in internet history. Google’s recent transformation with its Search Generative Experience and Gemini integration demonstrates that even the most successful platforms must reinvent themselves to remain competitive in the AI era.

The competition between Google’s AI-enhanced strategy and Perplexity’s AI-native approach offers valuable insights into different paths for implementing AI at scale. Google’s hybrid approach leverages massive existing infrastructure while gradually transforming user experiences, while Perplexity’s clean-slate design optimizes entirely for conversational AI interactions.

As both platforms continue to evolve, the ultimate winners will be users who gain access to more intelligent, efficient, and helpful ways to access information. The future of search will likely feature elements of both approaches: the scale and comprehensiveness of Google’s enhanced platform combined with the conversational fluency and transparency of AI-native solutions.

The battle for search supremacy in the AI era has only just begun, and the innovations emerging from this competition will shape how humanity accesses and interacts with information for decades to come.


This analysis reflects the state of AI-powered search as of August 2025. The rapidly evolving nature of AI technology and competitive dynamics may significantly impact future developments. Both Google and Perplexity continue to innovate at an unprecedented pace, making ongoing monitoring essential for stakeholders in this space.

Fiction: Witness (见证者)

English Podcast

English Version

Part One: The Singularity
The story begins on the Moon, beside China’s Tianhe Base, with the sudden appearance of a black obelisk. From the first day of its discovery, it was unmistakably not of human origin. Composed of an unidentifiable, perfectly smooth black material, it reflected no light and emitted no heat, as if it were a three-dimensional void against the cosmic background. The astronauts who discovered it named it “Witness.”

For twenty years, human scientists exhausted every technological means to study it, yet not a single atom could be removed from its surface. As research approached a deadlock, teetering on the edge of becoming a symbolic relic, a “point” was discovered.

On the side of the obelisk facing Earth, at its exact center, there was a point.

It was neither a mark nor a dent nor a protrusion. It seemed intrinsic to the material itself—a geometric perfection made manifest. The point was discovered by quantum metrologist Dr. Yun Tianming during a holographic surface scan aimed at mapping quantum fluctuations on the obelisk. Amid the torrent of data, he identified an absolute “nothingness,” a singularity with zero information entropy.

When the image was translated into visible-light models, the point appeared there: a perfect, dimensionless point.

The following decade became the most maddening ten years in physics.

The team first used an atomic force microscope (AFM) to examine the point at the nanoscale, hoping to resolve its edge structure. By conventional expectation, any solid surface should reveal electron cloud distributions and quantum fluctuations. Yet the force curves remained perfectly flat, devoid of noise or disturbance, as if the probe were suspended in a vacuum, unable to detect any structural signal.

Next, they turned to scanning tunneling microscopy (STM) to measure the local density of electronic states. Regardless of voltage adjustments, the tunneling current remained zero—no energy levels for electrons to occupy, as though the region did not belong to the three-dimensional material world.

To rule out instrument limitations, the team deployed laser interferometry to approach the precision of the Planck length. Still, the data remained perfectly symmetrical: the distances from the point to the four edges of the obelisk were exactly equal—not approximately, but to a precision beyond the limits of quantum measurement. Every terminal value in the dataset entered an infinite loop of zeros, seemingly mocking humanity’s grasp of physical law.

“This makes no sense,” Yun murmured to his colleague Dr. Cheng Xin after countless sleepless nights. “According to the Heisenberg uncertainty principle, we cannot determine a particle’s position with infinite precision. The very existence of this point undermines the foundations of physics itself. It is an ontological miracle, something that should not exist.”

Cheng Xin pointed to the rotating holographic obelisk model, streams of data cascading like a waterfall. “Perhaps we’re approaching this incorrectly, Tianming. We keep trying to measure it, treating it as part of our universe. But… what if it’s not?”


Part Two: The Anchor of Dimensions
Cheng Xin’s words struck Yun like lightning. He began feverishly developing new mathematical models. No longer did he consider the obelisk a three-dimensional object; instead, he hypothesized it was a higher-dimensional entity “sliced” into our three-dimensional space.

“Imagine this,” he explained at an international physics conference, his holographic presence tinged with fervor, “an infinitely thin needle piercing through an infinitely large sheet of paper. For two-dimensional beings on the paper, they would perceive only a perfect point. No matter how precise their measurement tools, the point would always appear at the ‘center’ they can perceive. They cannot comprehend the needle, because the third dimension is beyond them.”

His theory caused a stir. Most dismissed it as philosophical speculation. Yet it perfectly explained the point’s “perfect centering.”

“This point,” Yun continued, “is not a feature on the obelisk’s surface. It is the obelisk itself! Or rather, it is the projection of a higher-dimensional object’s ‘axis’ into our universe. We are not measuring a point on a two-dimensional plane; we are gazing upon a reality-piercing, higher-dimensional spine.”

The theory became known as the “Anchor of Dimensions” Hypothesis. The point anchors a four-dimensional—or even higher-dimensional—object into our three-dimensional space. The civilization that left it had used the simplest, most elegant method to demonstrate a physics beyond our imagination.

They were saying: You exist, but not in the space you can perceive.


Part Three: The Response
How could this hypothesis be tested? It could not be verified by measurement. Yun proposed a bold experiment: do not measure the point’s “position,” but perturb the reality around it.

A massive ring-shaped device was constructed around the Witness. It emitted no particles or energy, but generated an extraordinarily precise, twisted spacetime field—a “whisper of gravitational waves.” If the point truly was a higher-dimensional projection, disturbances in our dimension might elicit a response from the anchor.

The day of the experiment drew the eyes of the world to the Moon. Yun and Cheng Xin stood at the control center, their hearts racing.

“Spacetime field generator, 1% power.”

Nothing happened.

“10%… 30%… 70%…”

The obelisk remained silent. The readouts were unchanging, despairingly stable.

“100%.”

A moment of silence.

Suddenly, the room felt gripped by an invisible hand; the air seemed to collapse. Heartbeats across Earth faltered, as if drawn toward a nonexistent direction.

Walls stretched, floors sank, control panels warped, faces elongated into unseeable dimensions. It was a sensation beyond language—like a drowning person inhaling air for the first time, or a blind man suddenly scorched by sunlight.

Then, they saw.

The point was no longer a point but a luminous spine piercing reality, extending into dimensions that could not be named. It was not dazzling, yet clearer than any star.

A torrent of conceptual information flooded their minds—not words, not sound, but pure ideas:

—Very good.

—You have finally abandoned the ruler and begun measuring the universe with thought.

—This door opens for you. We await on the other side.

The messages faded, and perception collapsed back into three-dimensional space. The room was unchanged; instruments stable. Only their breathing and trembling eyes revealed the magnitude of what had occurred. It was as if they had been swept by a cosmic tsunami and returned to shore.


Part Four: The Beginning
The secret of the Witness was revealed. It was not a monument, a warning, or a work of art.

It was a test: the most concise, ruthless, and elegant test.

The civilization that left it used a perfect geometric singularity to filter cosmic civilizations. Only when a species transcended three-dimensional thinking and began to understand higher dimensions could it “graduate” and earn the invitation to higher-dimensional existence.

Humanity took thirty years to solve the puzzle. Thirty years to earn a single answer.

Yun Tianming and Cheng Xin stood by the viewport, gazing at the serene black obelisk. That point, the enigma that had tormented generations of scientists, was no longer a point—it was a nail, pinning human civilization onto the test paper of the cosmos.

Perhaps countless intelligent species in innumerable galaxies had faced similar points. Some solved them, some failed, some still wandered the labyrinth. Humanity was merely one example, granted the privilege to step through the doorway to higher dimensions. It was both an invitation and a judgment.

And it began with understanding that perfect, infinitesimal point. Now, the gaze of all humanity was drawn to the invisible line extending to higher dimensions, calling them onward.

Suddenly, Yun shivered: Humanity has been chosen. But does being chosen mean being fortunate?

The Moon remained silent; the black obelisk unchanged. A signpost, or perhaps a chain.


Afterword

When I was a child, I read a story that left a profound impression on me, though I can no longer verify which magazine it appeared in or who wrote it. In that story, humanity discovered a black obelisk from an extraterrestrial civilization. It seemed ordinary, yet every proportion, from its height and width to every geometric feature of its surface, conformed to a perfect, infinitely precise golden ratio.

Scientists exhausted themselves trying to decode any physical message: cosmic coordinates, mathematical formulas, or warnings. Ultimately, they realized the obelisk was the information itself. It was not language, but a tool for measuring and filtering. This civilization had elegantly, nonviolently, demonstrated a force beyond physical scale.

This story fascinated me and inspired my creation of The Singularity. I further concretized the idea of perfection, transforming it into a dimensionless point—a miracle challenging the foundations of physics. It is not a display of technology, but a test of human thought itself.

This work pays homage to that childhood story, reminding us that the deepest cosmic mysteries may not lie in distant stars, but in the simplest of concepts.

Tributes:

  • Arthur C. Clarke, 2001: A Space Odyssey
  • Carl Sagan, Contact
  • Liu Cixin, The Three-Body Problem
  • Ted Chiang, Story of Your Life

第一部分:奇点

故事始于月球上中国天河基地旁突然出现的一座黑色方尖碑。它并非人类所造,这一点从发现它的第一天起就毋庸置疑。它由一种无法识别、绝对光滑的黑色材料构成,不反射任何光线,不泄露任何热量,仿佛是宇宙背景上一个三维的空洞。发现它的航天员们将其命名为“见证者”。

二十年来,人类科学家们用尽了一切科技手段研究它,却连其表面的一颗原子都未能刮下。直到对它的研究进入瓶颈,几乎要变成一种象征性的纪念时,那个“点”被发现了。

在方尖碑朝向地球的那一面,正中央,有一个点。

它不是一个标记,也不是一个凹痕或凸起。它看起来就像是材料本身的一个内在属性,一个几何学上的完美概念被赋予了实体。发现它的是量子度量学家云天明博士。他当时正在进行一次全息地形成像扫描,试图绘制方尖碑表面的量子涨落。在数据洪流的中心,他发现了一个绝对的“无”,一个信息熵为零的奇点。

当图像被转化为可见光模型时,那个点就在那里。一个完美的,没有维度的点。

接下来的十年,成了物理学界最令人抓狂的十年。

研究团队首先使用了原子力显微镜(AFM),希望在纳米尺度上分辨点的边缘结构。按照常规预期,任何固体表面都应显示出电子云分布及量子涨落。然而,扫描得到的势阱曲线始终平直,无噪声、无扰动,仿佛探针悬空在真空之上,无法捕获任何结构信号。

随后,他们改用扫描隧道显微镜(STM),测量点附近的电子态密度。无论电压如何微调,隧穿电流始终为零——没有可供电子跃迁的能级,仿佛该区域根本不属于三维物质世界。

为了进一步排除仪器局限,团队部署了激光干涉仪,将测量精度推进至接近普朗克长度的数量级。即便如此,数据链条依旧完美对称,测得点到方尖碑四条边缘的距离完全相等——不是近似,而是精确到超越量子测量极限。每条数据末端的值都呈现出无限循环的零,像在嘲笑人类对物理规律的认知极限。

“这不合道理,”云天明在无数个不眠之夜后,对着同事程心博士喃喃自语,“根据海森堡不确定性原理,我们不可能无限精确地确定一个粒子的位置。这个点本身的存在,就在嘲笑我们整个物理学大厦的根基。它是一个本体论上的奇迹,一个不该存在的东西。”

程心指着屏幕上旋转的方尖碑模型,数据流像瀑布一样刷新。“或许我们的思路错了,天明。我们总想着‘测量’它,把它当成一个我们宇宙里的东西。但如果……它不是呢?”

第二部分:维度之锚

程心的话像一道闪电击中了云天明。他开始疯狂地建立新的数学模型。他不再把方尖碑看作一个三维空间中的物体,而是假设它是一个更高维度物体在我们三维空间中的“切片”。

“想象一下,”他激动地对一个国际物理学研讨会解释道,全息投影中的他显得有些狂热,“一根无限细的针,垂直穿过一张无限大的纸。对于纸上的二维生物来说,它们能看到的只是一个完美的点。无论它们用多么精密的尺子去测量,那个点永远在它们所能感知的‘中心’。它们无法理解这根针,因为它们无法感知第三个维度。”

他的理论引起了轩然大波。大多数人认为这是纯粹的哲学臆想。但它完美地解释了那个点的“完美居中”特性。

“那个点,”云天明继续说道,“不是方尖碑表面的一个特征。它就是方尖碑本身!或者说,它是那个高维物体的‘中轴’在我们宇宙中的投影。我们不是在测量一个二维平面上的点,我们是在凝视一个穿越我们现实的、来自更高维度的‘轴’!”

这个理论被称为“维度之锚”假说。那个点,就是将一个四维甚至更高维度的物体,“锚定”在我们三维空间中的坐标奇点。留下它的文明,在用一种最简单、最优雅的方式,向我们展示一种我们无法想象的物理学。

他们在说:我们存在,但不在你们所能感知的空间里。

第三部分:回应

如何证实这个假说?无法用测量来证实。云天明提出了一个大胆的实验:不要去测量它的“位置”,而是去扰动它周围的“现实”。

一个巨大的环形设备在“见证者”周围被建立起来。它不会发射任何粒子或能量,而是会产生一种极其精密的、被扭曲的时空场——一种“引力波的低语”。他们的想法是,如果这个点真的是高维度的投影,那么扰动我们这个维度的时空结构,或许能从“锚点”得到一丝反馈。

实验进行的那一天,全球的目光都聚焦在月球。云天明和程心站在控制中心,心跳如鼓。

“时空场发生器启动,功率1%。”

什么都没有发生。

“10%… 30%… 70%…”

方尖碑依然静默。控制台上的所有读数都稳定得令人绝望。

“功率100%。”

片刻的寂静。

忽然,整个房间像被一只无形的手捏住了,空气仿佛塌陷。地球上的每个人的心跳都骤然失去节奏,仿佛身体被拉向某个不存在的方向。

墙壁在延伸,地板在坠落,控制台在扭曲,他们彼此的面孔也像被拉伸到看不见的维度。那是一种没有语言的感受,就像溺水的人突然吸入空气,又像盲人第一次被刺眼的阳光灼痛双眼。

然后,他们“看见”了。

那个点不再是点,而是一条贯穿现实的光之脊梁,向上、向下,延展进他们无法命名的空间。它并不耀眼,却比任何星辰都要清晰。

意识中涌入一段无法拒绝的信息,不是声音,不是文字,而是一种纯粹的概念。

——很好。

——你们终于抛下了尺子,开始学会用思想丈量宇宙。

——这扇门,为你们而开。我们,在另一边等候。

话语消失,感知塌回三维。房间依旧,仪器稳定,唯有每个人的呼吸与眼神在颤抖。仿佛他们刚刚从一场浩瀚的海啸里被抛回岸上。

第四部分:开端

“见证者”的秘密被揭开。它不是一个纪念碑,不是一个警告,也不是一个艺术品。

它是一道考题。最简洁、最无情、最优雅的考题。

留下它的文明,用一个完美的几何奇点,筛选着宇宙中的文明。只有当一个文明能够超越三维的测量思维,开始理解维度的本质时,他们才算“毕业”,才有资格获得这张通往更高维度的“邀请函”。

人类花了三十年才解开这道谜题。三十年,才换来一个答案。

云天明和程心站在舷窗前,凝望着远处静谧的黑色方尖碑。那个点,那个曾经让所有科学家痛苦至极的谜题,如今在他们的眼里已不是点,而是一枚钉子,把人类文明钉在浩瀚宇宙的试卷上。

也许在无数星系中,无数智慧种族都曾面对过这样的点。有的解开,有的失败,有的至今仍在迷宫中徘徊。人类只是其中一例,被允许踏上更高维度的门槛。这是一份邀请。也是一份裁决。

而这一切,是从理解那个完美的、无限小的点开始的。现在,整个人类的目光都投向了那条无形的、通往更高维度的线,它正向人类发出召唤。

突然,云天明不寒而栗:人类,已经被选中。但被选中,就是幸运吗?

月球静默,黑色方尖碑一如既往。它像路标,也像锁链。


后记: 我小时候读过一个故事,具体是哪本杂志或者哪个作家已经无法考证了,但它在我脑海里留下了深刻的印记。故事里说,人类发现了一个来自地外文明的黑色方尖碑。这个方尖碑看起来没有任何特别之处,但无论用多么精确的仪器去测量,它的所有比例,从高度、宽度到表面上的每一个几何特征,都符合一种完美的、无限精确的黄金比例。

人类科学家们耗费了无数精力,试图从中解读出任何物理信息,比如宇宙坐标、数学公式或者警告。但最终他们意识到,这个方尖碑本身就是信息。它不是用来交流的语言,而是一种用来衡量和筛选的工具。地外文明用这种最优雅、最非暴力的方式,向人类展示纯粹的、超越物理尺度的力量。

这个故事让我深深着迷,也成为了我创作《奇点》的灵感来源。在这个故事中,我将这种“完美”的概念进一步具象化,让它变成了一个没有维度的点,一个从根本上挑战我们物理学基础的奇迹。它不是一种技术展示,而是对我们思维本身的一次终极考验。本文就是对我小时候读到的那个不知名故事的一次致敬和再创作。它提醒我,最深刻的宇宙之谜,或许不是藏在浩瀚星辰中,而是隐藏在最简单概念里。

致敬作品:

  • 阿瑟·克拉克(Arthur C. Clarke)的《2001:太空漫游》
  • 卡尔·萨根(Carl Sagan)的《超时空接触》(Contact)
  • 刘慈欣的《三体》(The Three-Body Problem)
  • 特德·姜(Ted Chiang)的《你一生的故事》(Story of Your Life)

《大模型精诚》两篇

世有愚者,读方三年,便谓天下无病可治;及治病三年,乃知天下无方可用。

— 【唐】孙思邈《大医精诚》

《大模型精诚(上)》仿照孙思邈的《大医精诚》而作,论述了术之源和道之始。《大模型精诚(下)》继承了上篇的旨趣,进一步阐明了工具的用途;批判了浮夸学术的弊端,强调了明人志向的正直;愿后来的学者能够谨慎守护精诚之道。

  1. 《大模型精诚·上:御术循道》
  2. 《大模型精诚·下:格物穷理》
  3. 《大医精诚》原文
中文摘要 & English Abstract

《大模型精诚》分为上下两篇,仿孙思邈《大医精诚》之文风,提出面对大语言模型(LLMs)之道,须持“精勤”“诚敬”之精神。文章指出,大模型非万能,初学者易为其智能所惑,唯有深入算法原理、洞察数据源头,方能破“表象之惑”,守“正道之用”。技术本无心,人心为舵,若妄施滥用,或将致祸;唯秉“精诚之志”,以智辅仁,方可济世安人。

“On the Sincerity and Mastery in Large Models” is a two-part essay inspired by Sun Simiao’s classical Chinese text On the Absolute Sincerity of Great Physicians. Written in classical Chinese style, it warns against superficial understanding and blind faith in large language models (LLMs). It calls for practitioners to uphold a spirit of diligence (“精”) and sincerity (“诚”)—to understand the inner principles of algorithms and the biases within data. The model is but a tool; its moral compass lies in the human operator. Only by combining technical rigor with ethical restraint can AI serve humanity and avoid causing harm. This is both a philosophical treatise on AI and a critique of today’s hasty tech culture.

《大模型精诚·上:御术循道》

昔者圣贤格物致知,究天人之际,通古今之变,成一家之言。今夫人工智能之兴,盖亦格物之一端也。自其出也,声誉日隆,众人或惊其智,或惧其势,而不知其理者众矣。或操之以利,或役之为器,然利器在手,而不察其锋芒,未必不自伤也。是故观其用者多,究其道者寡。

世有愚者,览模型三月,便谓天下无难可为;及历参数之调三载,方知世无定式可循。故智者必穷其理,探其源,精勤不倦,不得道听途说,或一知半解,便言大道已了,深自误哉!夫模型者,数据瀚海之所凝,千万参数之所成。 初也,对答如流,人以为智;久则偏识横生,虚妄自出。是以不明其本,只见其文,则如沙上筑塔,虽高而易倾;水中捞月,虽美而终空。

夫术者,行之器也;道者,心之正也。有道无术,术尚可求;有术无道,止于术。是故为学者,当怀精诚之心,以格物致知。不为文饰所惑,不为便捷所役。上穷算理之幽,下辨数据之源。善用其利,以济百工;慎防其弊,以安天下。惟敬惟谨,方能行稳致远;术必精诚,方可臻于大成。

噫!大道之行,贵在明理,重在行之。浮华易得,实知难求。若以华辞饰伪,虽一时风光,终非正道;惟有以道御术,以术辅仁,方能不为其所役,而反致其功。愿世之为学者,慎始慎终,毋躁毋惰;内求于诚,外谨于行,不仅以问答取巧,更以格物证理。则大模型之道,不特能济世用事,亦可以砥砺心志,通于大道矣!


《大模型精诚·下:格物穷理》

夫大模型者,非徒技艺之巧,实启万象之门,应世变之机也。机巧肇兴,数十年耕耘,方结斯器,蔚然成势。其志在洞察智能之源,其义在贯通语言之理。通天人,综古今,非术末之流,乃文明之维。治世之助,经纬之翼。然世多趋利者,见术而不求道;窥皮而不识骨。操几行机命,即称智械之师;观几番演示,便誉人类之镜。或视生成若巫术,妄言灵机觉醒,惑众而自昏,笑之可也。

昔圣贤格物致知,寒暑不辍,穷理尽性,尚不敢言道成;今浮学遇器之妙,应一问而百答,便谓智能已极,人力可替,岂不谬哉?夫言虽似人,意非其本;识未通理,情不存诚。模型者,镜也,影也,非圣贤之心也。其构也,采万卷之籍,汇千年之言,亿万试炼,始见一用。然中有偏识之患、虚妄之误、理断之病、语悖之失,不可不察。不明其理,妄施其用,是犹未诊而投毒,害人而不自知也。

故为学者,当守精诚。不为华饰所眩,不为捷径所诱。内怀谨惧之心,外行谨严之道。器不妄用,人不失本。若夫施用之道,尤宜慎审。毋以小试之验,妄推大用之功;毋因偶中之答,遽信全才之能。逢伦理之辩,涉利害之争,当辨是非之界,守正直之心。不可托器以避己责,盖模型无心,惟人有心。算巧犹器,操之在人;人失其正,器失其依,虽有神器,亦可为祸。是故大者不在术之精,而在人之诚与敬也。

噫!世风浮躁,器成于速,工毁于轻。市井谈智者,多竞捷而寡思;创企逐利者,多耀术而失本。若不慎其患,不固其本,大器之用,虽广亦危。惟精勤而博识,惟谨慎而明理;守德以立身,循道以济世。方可保其善用,免其深害。夫技艺者,舟也;德义者,舵也。舟疾无舵,必倾覆于风波。若能以人为本,以智辅仁,引势以用,慎力以行,使模型不妄言,使人心常自省。则虽新器日出,世亦不乱;虽智能骤起,人亦不亡。


《大医精诚》原文

中国【唐】孙思邈(581~682年)所著之《备急千金要方》第一卷,乃是中医学典籍中,论述医德的一篇极重要文献,为习医者所必读。

张湛曰:“夫经方之难精,由来尚矣。今病有内同而外异,亦有内异而外同,故五脏六腑之盈虚,血脉荣卫之通塞,固非耳目之所察,必先诊候以审之。而寸口关尺,有浮沉弦紧之乱;俞穴流注,有高下浅深之差;肌肤筋骨,有厚薄刚柔之异。唯用心精微者,始可与言于兹矣。今以至精至微之事,求之于至粗至浅之思,岂不殆哉!若盈而益之,虚而损之,通而彻之,塞而壅之,寒而冷之,热而温之,是重加其疾,而望其生,吾见其死矣。故医方卜筮,艺能之难精者也。既非神授,何以得其幽微?世有愚者,读方三年,便谓天下无病可治;及治病三年,乃知天下无方可用。故学者必须博极医源,精勤不倦,不得道听途说,而言医道已了,深自误哉!

凡大医治病,必当安神定志,无欲无求,先发大慈恻隐之心,誓愿普救含灵之苦。若有疾厄来求救者,不得问其贵贱贫富,长幼妍蚩,怨亲善友,华夷愚智,普同一等,皆如至亲之想。亦不得瞻前顾后,自虑吉凶,护惜身命,见彼苦恼,若己有之,深心凄怆,勿避险巇,昼夜寒暑,饥渴疲劳,一心赴救,无作工夫行迹之心,如此可做苍生大医,反之则是含灵巨贼。自古明贤治病,多用生命以济危急,虽曰贱畜贵人,至于爱命,人畜一也。损彼益己,物情同患,况于人乎?夫杀生求生,去生更远,吾今此方所以不用生命为药者,良由此也。其虻虫水蛭之属,市有先死者,则市而用之,不在此例。只如鸡卵一物,以其混沌未分,必有大段要急之处,不得已隐忍而用之,能不用者,斯为大哲,亦所不及也。其有患疮痍下痢,臭秽不可瞻视,人所恶见者,但发惭愧、凄怜、忧恤之意,不得起一念蒂芥之心,是吾之志也。

夫大医之体,欲得澄神内视,望之俨然,宽裕汪汪,不皎不昧,省病诊疾,至意深心,详察形候,纤毫勿失,处判针药,无得参差,虽曰病宜速救,要须临事不惑,唯当审谛覃思,不得于性命之上,率而自逞俊快,邀射名誉,甚不仁矣。又到病家,纵绮罗满目,勿左右顾盼;丝竹凑耳,无得似有所娱;珍羞迭荐,食如无味;醽醁兼陈,看有若无。所以尔者,夫一人向隅,满堂不乐,而况病人苦楚,不离斯须,而医者安然欢娱,傲然自得,兹乃人神之所共耻,至人之所不为,斯盖医之本意也。

夫为医之法,不得多语调笑,谈谑喧哗,道说是非,议论人物,炫耀声名,訾毁诸医,自矜己德,偶然治瘥一病,则昂头戴面,而有自许之貌,谓天下无双,此医人之膏肓也。老君曰:人行阳德,人自报之;人行阴德,鬼神报之;人行阳恶,人自报之,人行阴恶,鬼神害之。寻此贰途,阴阳报施,岂诬也哉?

所以医人不得恃己所长,专心经略财物,但作救苦之心,于冥运道中,自感多福者耳。又不得以彼富贵,处以珍贵之药,令彼难求,自眩功能,谅非忠恕之道。志存救济,故亦曲碎论之,学者不可耻言之鄙俚也!”


END

Zuckerberg’s Gamble: Risks and Rewards in AI Talent Acquisition


Mark Zuckerberg’s recent move to bring Alex Wang and his team into Meta represents a bold and strategic maneuver amid the rapid advancement of large models and AGI development. Putting aside the ethical considerations, Zuckerberg’s approach (laying off staff, then offering sky-high compensation packages with a 48-hour ultimatum to top AI scientists and engineers from OpenAI, alongside Meta’s acquisition of a 49% stake in Scale AI) appears to serve multiple objectives:

1. Undermining Competitors

By poaching key talent from rival companies, Meta not only weakens their R&D teams and disrupts their momentum but also puts pressure on Google, OpenAI, and others to reassess their partnerships with Scale AI. Meta’s investment may further marginalize these competitors by injecting uncertainty into their collaboration with Scale AI.

2. Reinvigorating the Internal Team

Bringing in fresh blood like Alex Wang’s team and top OpenAI talent could reenergize Meta’s existing research units. A successful “talent reset” may help the company gain a competitive edge in the race toward AGI.

3. Enhancing Brand Visibility

Even if the move doesn’t yield immediate results, it has already amplified Meta’s media presence, boosting its reputation as a leader in AI innovation.

From both a talent acquisition and PR standpoint, this appears to be a masterstroke for Meta.


However, the strategy is not without significant risks:

1. Internal Integration and Morale Challenges

The massive compensation packages offered to the new hires could trigger resentment among existing employees over perceived pay inequity, especially in the wake of recent layoffs. This may lower morale and even accelerate internal attrition. Cultural differences between the incoming and incumbent teams could further complicate integration and collaboration.

2. Return on Investment and Performance Pressure

Meta’s substantial investment in Alex Wang and Scale AI comes with high expectations for short-term deliverables. In a domain as uncertain as AGI, both the market and shareholders will be eager for breakthroughs. If Wang’s team fails to deliver measurable progress quickly, Meta could face mounting scrutiny and uncertainty over the ROI.

3. Impacts on Scale AI and the Broader Ecosystem

Alex Wang stepping away as CEO is undoubtedly a major loss for Scale AI, even if he retains a board seat. Leadership transitions and potential talent departures may follow. Moreover, Scale AI’s history of legal and compliance issues could reflect poorly on Meta’s brand, especially if public perception ties Meta to those concerns even though it holds only non-voting shares. More broadly, Meta’s aggressive “poaching” approach may escalate the AI talent war, drive up industry-wide costs, and prompt renewed debate over ethics and hiring norms in the AI sector.


Conclusion

Meta’s latest move is undeniably ambitious. While it positions the company aggressively in the AGI race, it also carries notable risks in terms of internal dynamics, ROI pressure, and broader ecosystem disruption. Only time will tell whether this bold gamble pays off.

Our Future with AI: Three Strategies to Ensure It Stays on Our Side

As Artificial Intelligence rapidly evolves, ensuring it remains a beneficial tool rather than a source of unforeseen challenges is paramount. This article explores three critical strategies for keeping AI firmly on our side, drawing lessons from cybersecurity, robotics, and astrobiology.

Source: IEEE Spectrum, April 2025, “3 Ways to Keep AI on Our Side: AI Researchers Can Draw Lessons from Cybersecurity, Robotics, and Astrobiology.”

中文翻译摘要

这篇文章提出了确保人工智能安全和有益发展的三个独特且跨学科的策略。

应对人工智能的独特错误模式:布鲁斯·施奈尔(Bruce Schneier)和内森·E·桑德斯(Nathan E. Sanders)(网络安全视角)指出,人工智能系统,特别是大型语言模型(LLMs),其错误模式与人类错误显著不同——它们更难预测,不集中在知识空白处,且缺乏对自身错误的自我意识。他们提出双重研究方向:一是工程化人工智能以产生更易于人类理解的错误(例如,通过RLHF等精炼的对齐技术);二是开发专门针对人工智能独特“怪异”之处的新型安全与纠错系统(例如,迭代且多样化的提示)。

更新伦理框架以打击人工智能欺骗:达里乌什·杰米尔尼亚克(Dariusz Jemielniak)(机器人与互联网文化视角)认为,鉴于人工智能驱动的欺骗行为(包括深度伪造、复杂的错误信息宣传和操纵性人工智能互动)日益增多,艾萨克·阿西莫夫(Isaac Asimov)传统的机器人三定律已不足以应对现代人工智能。他提出一条“机器人第四定律”:机器人或人工智能不得通过冒充人类来欺骗人类。实施这项法律将需要强制性的人工智能披露、清晰标注人工智能生成内容、技术识别标准、法律执行以及公众人工智能素养倡议,以维护人机协作中的信任。

建立通用人工智能(AGI)检测与互动的严格协议:埃德蒙·贝戈利(Edmon Begoli)和阿米尔·萨多夫尼克(Amir Sadovnik)(天体生物学/SETI视角)建议,通用人工智能(AGI)的研究可以借鉴搜寻地外文明(SETI)的方法论。他们主张对AGI采取结构化的科学方法,包括:制定清晰、多学科的“通用智能”及相关概念(如意识)定义;创建超越图灵测试局限性的鲁棒、新颖的AGI检测指标和评估基准;以及制定国际公认的检测后协议,以便在AGI出现时进行验证、确保透明度、安全性和伦理考量。

总而言之,这些观点强调了迫切需要创新、多方面的方法——涵盖安全工程、伦理准则修订以及严格的科学协议制定——以主动管理先进人工智能系统的社会融入和潜在未来轨迹。


Abstract: This article presents three distinct, cross-disciplinary strategies for ensuring the safe and beneficial development of Artificial Intelligence.

Addressing Idiosyncratic AI Error Patterns (Cybersecurity Perspective): Bruce Schneier and Nathan E. Sanders highlight that AI systems, particularly Large Language Models (LLMs), exhibit error patterns significantly different from human mistakes—being less predictable, not clustered around knowledge gaps, and lacking self-awareness of error. They propose a dual research thrust: engineering AIs to produce more human-intelligible errors (e.g., through refined alignment techniques like RLHF) and developing novel security and mistake-correction systems specifically designed for AI’s unique “weirdness” (e.g., iterative, varied prompting).

Updating Ethical Frameworks to Combat AI Deception (Robotics & Internet Culture Perspective): Dariusz Jemielniak argues that Isaac Asimov’s traditional Three Laws of Robotics are insufficient for modern AI due to the rise of AI-enabled deception, including deepfakes, sophisticated misinformation campaigns, and manipulative AI interactions. He proposes a “Fourth Law of Robotics”: A robot or AI must not deceive a human being by impersonating a human being. Implementing this law would necessitate mandatory AI disclosure, clear labeling of AI-generated content, technical identification standards, legal enforcement, and public AI literacy initiatives to maintain trust in human-AI collaboration.

Establishing Rigorous Protocols for AGI Detection and Interaction (Astrobiology/SETI Perspective): Edmon Begoli and Amir Sadovnik suggest that research into Artificial General Intelligence (AGI) can draw methodological lessons from the Search for Extraterrestrial Intelligence (SETI). They advocate for a structured scientific approach to AGI that includes:

  • Developing clear, multidisciplinary definitions of “general intelligence” and related concepts like consciousness.
  • Creating robust, novel metrics and evaluation benchmarks for detecting AGI, moving beyond limitations of tests like the Turing Test.
  • Formulating internationally recognized post-detection protocols for validation, transparency, safety, and ethical considerations, should AGI emerge.

Collectively, these perspectives emphasize the urgent need for innovative, multi-faceted approaches—spanning security engineering, ethical guideline revision, and rigorous scientific protocol development—to proactively manage the societal integration and potential future trajectory of advanced AI systems.


Here is the full content:

3 Ways to Keep AI on Our Side

AS ARTIFICIAL INTELLIGENCE reshapes society, our traditional safety nets and ethical frameworks are being put to the test. How can we make sure that AI remains a force for good? Here we bring you three fresh visions for safer AI.

  • In the first essay, security expert Bruce Schneier and data scientist Nathan E. Sanders explore how AI’s “weird” error patterns create a need for innovative security measures that go beyond methods honed on human mistakes.
  • Dariusz Jemielniak, an authority on Internet culture and technology, argues that the classic robot ethics embodied in Isaac Asimov’s famous rules of robotics need an update to counterbalance AI deception and a world of deepfakes.
  • And in the final essay, the AI researchers Edmon Begoli and Amir Sadovnik suggest taking a page from the search for intelligent life in the stars; they propose rigorous standards for detecting the possible emergence of human-level AI intelligence.

As AI advances with breakneck speed, these cross-disciplinary strategies may help us keep our hands on the reins.


AI Mistakes Are Very Different from Human Mistakes

WE NEED NEW SECURITY SYSTEMS DESIGNED TO DEAL WITH THEIR WEIRDNESS

Bruce Schneier & Nathan E. Sanders

HUMANS MAKE MISTAKES all the time. All of us do, every day, in tasks both new and routine. Some of our mistakes are minor, and some are catastrophic. Mistakes can break trust with our friends, lose the confidence of our bosses, and sometimes be the difference between life and death.

Over the millennia, we have created security systems to deal with the sorts of mistakes humans commonly make. These days, casinos rotate their dealers regularly, because they make mistakes if they do the same task for too long. Hospital personnel write on patients’ limbs before surgery so that doctors operate on the correct body part, and they count surgical instruments to make sure none are left inside the body. From copyediting to double-entry bookkeeping to appellate courts, we humans have gotten really good at preventing and correcting human mistakes.

Humanity is now rapidly integrating a wholly different kind of mistake-maker into society: AI. Technologies like large language models (LLMs) can perform many cognitive tasks traditionally fulfilled by humans, but they make plenty of mistakes. You may have heard about chatbots telling people to eat rocks or add glue to pizza. What differentiates AI systems’ mistakes from human mistakes is their weirdness. That is, AI systems do not make mistakes in the same ways that humans do.

Much of the risk associated with our use of AI arises from that difference. We need to invent new security systems that adapt to these differences and prevent harm from AI mistakes.

IT’S FAIRLY EASY to guess when and where humans will make mistakes. Human errors tend to come at the edges of someone’s knowledge: Most of us would make mistakes solving calculus problems. We expect human mistakes to be clustered: A single calculus mistake is likely to be accompanied by others. We expect mistakes to wax and wane depending on factors such as fatigue and distraction. And mistakes are typically accompanied by ignorance: Someone who makes calculus mistakes is also likely to respond “I don’t know” to calculus-related questions.

To the extent that AI systems make these humanlike mistakes, we can bring all of our mistake-correcting systems to bear on their output. But the current crop of AI models—particularly LLMs—make mistakes differently.

AI errors come at seemingly random times, without any clustering around particular topics. The mistakes tend to be more evenly distributed through the knowledge space; an LLM might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats. And AI mistakes aren’t accompanied by ignorance. An LLM will be just as confident when saying something completely and obviously wrong as it will be when saying something true.

The inconsistency of LLMs makes it hard to trust their reasoning in complex, multistep problems. If you want to use an AI model to help with a business problem, it’s not enough to check that it understands what factors make a product profitable; you need to be sure it won’t forget what money is.

THIS SITUATION INDICATES two possible areas of research: engineering LLMs to make mistakes that are more humanlike, and building new mistake-correcting systems that deal with the specific sorts of mistakes that LLMs tend to make.

We already have some tools to lead LLMs to act more like humans. Many of these arise from the field of “alignment” research, which aims to make models act in accordance with the goals of their human developers. One example is the technique that was arguably responsible for the breakthrough success of ChatGPT: reinforcement learning with human feedback. In this method, an AI model is rewarded for producing responses that get a thumbs-up from human evaluators. Similar approaches could be used to induce AI systems to make humanlike mistakes, particularly by penalizing them more for mistakes that are less intelligible.

When it comes to catching AI mistakes, some of the systems that we use to prevent human mistakes will help. To an extent, forcing LLMs to double-check their own work can help prevent errors. But LLMs can also confabulate seemingly plausible yet truly ridiculous explanations for their flights from reason.

Other mistake-mitigation systems for AI are unlike anything we use for humans. Because machines can’t get fatigued or frustrated, it can help to ask an LLM the same question repeatedly in slightly different ways and then synthesize its responses. Humans won’t put up with that kind of annoying repetition, but machines will.
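
The repeat-and-synthesize idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ask_model` is a hypothetical callable standing in for whatever LLM interface is in use, `fake_model` is a toy stub rather than a real model, and a simple majority vote is just one possible way to synthesize the responses.

```python
from collections import Counter
from typing import Callable, Iterable


def normalize(answer: str) -> str:
    """Trim whitespace, a trailing period, and case so equivalent answers compare equal."""
    return answer.strip().rstrip(".").strip().lower()


def ask_with_variation(
    ask_model: Callable[[str], str],
    phrasings: Iterable[str],
) -> tuple[str, float]:
    """Ask each phrasing once; return (majority answer, share of phrasings that agreed)."""
    answers = [normalize(ask_model(p)) for p in phrasings]
    if not answers:
        raise ValueError("at least one phrasing is required")
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


# Hypothetical stand-in for a real model call: it answers one phrasing wrongly.
def fake_model(prompt: str) -> str:
    return "Lyon" if "seat of government" in prompt else "Paris."


PHRASINGS = [
    "What is the capital of France?",
    "Name the capital city of France.",
    "Which city is the seat of government of France?",
    "France's capital is which city?",
]

answer, agreement = ask_with_variation(fake_model, PHRASINGS)
print(answer, agreement)  # prints: paris 0.75
```

A low agreement ratio is a cheap signal that an answer deserves a second look. It does not show that the majority answer is right, since a model can be consistently wrong.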

RESEARCHERS ARE still struggling to understand where LLM mistakes diverge from human ones. Some of the weirdness of AI is actually more humanlike than it first appears.

Small changes to a query to an LLM can result in wildly different responses, a problem known as prompt sensitivity. But, as any survey researcher can tell you, humans behave this way, too. The phrasing of a question in an opinion poll can have drastic impacts on the answers.

LLMs also seem to have a bias toward repeating the words that were most common in their training data—for example, guessing familiar place names like “America” even when asked about more exotic locations. Perhaps this is an example of the human “availability heuristic” manifesting in LLMs; like humans, the machines spit out the first thing that comes to mind rather than reasoning through the question. Also like humans, perhaps, some LLMs seem to get distracted in the middle of long documents; they remember more facts from the beginning and end.

In some cases, what’s bizarre about LLMs is that they act more like humans than we think they should. Some researchers have tested the hypothesis that LLMs perform better when offered a cash reward or threatened with death. It also turns out that some of the best ways to “jailbreak” LLMs (getting them to disobey their creators’ explicit instructions) look a lot like the kinds of social-engineering tricks that humans use on each other: for example, pretending to be someone else or saying that the request is just a joke. But other effective jailbreaking techniques are things no human would ever fall for. One group found that if they used ASCII art (constructions of symbols that look like words or pictures) to pose dangerous questions, like how to build a bomb, the LLM would answer them willingly.

Humans may occasionally make seemingly random, incomprehensible, and inconsistent mistakes, but such occurrences are rare and often indicative of more serious problems. We also tend not to put people exhibiting these behaviors in decision-making positions. Likewise, we should confine AI decision-making systems to applications that suit their actual abilities—while keeping the potential ramifications of their mistakes firmly in mind.


Asimov’s Laws of Robotics Need an Update for AI

PROPOSING A FOURTH LAW OF ROBOTICS

Dariusz Jemielniak

IN 1942, the legendary science fiction author Isaac Asimov introduced his Three Laws of Robotics in his short story “Runaround.” The laws were later popularized in his seminal story collection I, Robot.

  1. FIRST LAW: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. SECOND LAW: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. THIRD LAW: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

While drawn from works of fiction, these laws have shaped discussions of robot ethics for decades. And as AI systems—which can be considered virtual robots—have become more sophisticated and pervasive, some technologists have found Asimov’s framework useful for considering the potential safeguards needed for AI that interacts with humans.

But the existing three laws are not enough. Today, we are entering an era of unprecedented human-AI collaboration that Asimov could hardly have envisioned. The rapid advancement of generative AI, particularly in language and image generation, has created challenges beyond Asimov’s original concerns about physical harm and obedience.

THE PROLIFERATION of AI-enabled deception is particularly concerning. According to the FBI’s most recent Internet Crime Report, cybercrime involving digital manipulation and social engineering results in annual losses counted in the billions. The European Union Agency for Cybersecurity’s ENISA Threat Landscape 2023 highlighted deepfakes—synthetic media that appear genuine—as an emerging threat to digital identity and trust.

Social-media misinformation is a huge problem today. I studied it extensively during the pandemic and can say that the proliferation of generative AI tools has made its detection increasingly difficult. AI-generated propaganda is often just as persuasive as or even more persuasive than traditional propaganda, and bad actors can very easily use AI to create convincing content. Deepfakes are on the rise everywhere. Botnets can use AI-generated text, speech, and video to create false perceptions of widespread support for any political issue. Bots are now capable of making phone calls while impersonating people, and AI scam calls imitating familiar voices are increasingly common. Any day now, we can expect a boom in video-call scams based on AI-rendered overlay avatars, allowing scammers to impersonate loved ones and target the most vulnerable populations.

Even more alarmingly, children and teenagers are forming emotional attachments to AI agents, and are sometimes unable to distinguish between interactions with real friends and bots online. Already, there have been suicides attributed to interactions with AI chatbots.

In his 2019 book Human Compatible (Viking), the eminent computer scientist Stuart Russell argues that AI systems’ ability to deceive humans represents a fundamental challenge to social trust. This concern is reflected in recent policy initiatives, most notably the European Union’s AI Act, which includes provisions requiring transparency in AI interactions and transparent disclosure of AI-generated content. In Asimov’s time, people couldn’t have imagined the countless ways in which artificial agents could use online communication tools and avatars to deceive humans.

Therefore, we must make an addition to Asimov’s laws.

FOURTH LAW: A robot or AI must not deceive a human being by impersonating a human being.

WE NEED CLEAR BOUNDARIES. While human-AI collaboration can be constructive, AI deception undermines trust and leads to wasted time, emotional distress, and misuse of resources. Artificial agents must identify themselves to ensure our interactions with them are transparent and productive. AI-generated content should be clearly marked unless it has been significantly edited and adapted by a human.

Implementation of this Fourth Law would require

  • mandatory AI disclosure in direct interactions,
  • clear labeling of AI-generated content,
  • technical standards for AI identification,
  • legal frameworks for enforcement, and
  • educational initiatives to improve AI literacy.

Of course, all this is easier said than done. Enormous research efforts are already underway to find reliable ways to watermark or detect AI-generated text, audio, images, and videos. But creating the transparency I’m calling for is far from a solved problem.

The future of human-AI collaboration depends on maintaining clear distinctions between human and artificial agents. As noted in the IEEE report Ethically Aligned Design, transparency in AI systems is fundamental to building public trust and ensuring the responsible development of artificial intelligence.

Asimov’s complex stories showed that even robots that tried to follow the rules often discovered there were unintended consequences to their actions. Still, having AI systems that are at least trying to follow Asimov’s ethical guidelines would be a very good start.


What Can AI Researchers Learn from Alien Hunters?

THE SETI INSTITUTE’S APPROACH HAS LESSONS FOR RESEARCH ON ARTIFICIAL GENERAL INTELLIGENCE

Edmon Begoli & Amir Sadovnik

THE EMERGENCE OF artificial general intelligence (systems that can perform any intellectual task a human can) could be the most important event in human history. Yet AGI remains an elusive and controversial concept. We lack a clear definition of what it is, we don’t know how to detect it, and we don’t know how to interact with it if it finally emerges.

What we do know is that today’s approaches to studying AGI are not nearly rigorous enough. Companies like OpenAI are actively striving to create AGI, but they include research on AGI’s social dimensions and safety issues only as their corporate leaders see fit. And academic institutions don’t have the resources for significant efforts.

We need a structured scientific approach to prepare for AGI. A useful model comes from an unexpected field: the search for extraterrestrial intelligence, or SETI. We believe that the SETI Institute’s work provides a rigorous framework for detecting and interpreting signs of intelligent life.

The idea behind SETI goes back to the beginning of the space age. In their 1959 Nature paper, the physicists Giuseppe Cocconi and Philip Morrison suggested ways to search for interstellar communication. Given the uncertainty of extraterrestrial civilizations’ existence and sophistication, they theorized about how we should best “listen” for messages from alien societies.

We argue for a similar approach to studying AGI, in all its uncertainties. The last few years have shown a vast leap in AI capabilities. The large language models (LLMs) that power chatbots like ChatGPT and enable them to converse convincingly with humans have renewed the discussion of AGI. One notable 2023 preprint even argued that ChatGPT shows “sparks” of AGI, and today’s most cutting-edge language models are capable of sophisticated reasoning and outperform humans in many evaluations.

While these claims are intriguing, there are reasons to be skeptical. In fact, a large group of scientists have argued that the current set of tools won’t bring us any closer to true AGI. But given the risks associated with AGI, if there is even a small likelihood of it occurring, we must make a serious effort to develop a standard definition of AGI, establish a SETI-like approach to detecting it, and devise ways to safely interact with it if it emerges.

THE CRUCIAL FIRST step is to define what exactly to look for. In SETI’s case, researchers decided to look for certain narrowband signals that would be distinct from other radio signals present in the cosmic background. These signals are considered intentional and only produced by intelligent life. None have been found so far.

In the case of AGI, matters are far more complicated. Today, there is no clear definition of artificial general intelligence. The term is hard to define because it contains other imprecise and controversial terms. Although intelligence has been defined by the Oxford English Dictionary as “the ability to acquire and apply knowledge and skills,” there is still much debate on which skills are involved and how they can be measured. The term general is also ambiguous. Does an AGI need to be able to do absolutely everything a human can do?

One of the first missions of a “SETI for AGI” project must be to clearly define the terms general and intelligence so the research community can speak about them concretely and consistently. These definitions need to be grounded in disciplines such as computer science, measurement science, neuroscience, psychology, mathematics, engineering, and philosophy.

There’s also the crucial question of whether a true AGI must include consciousness and self-awareness. These terms also have multiple definitions, and the relationships between them and intelligence must be clarified. Although it’s generally thought that consciousness isn’t necessary for intelligence, it’s often intertwined with discussions of AGI because creating a self-aware machine would have many philosophical, societal, and legal implications.

NEXT COMES the task of measurement. In the case of SETI, if a candidate narrowband signal is detected, an expert group will verify that it is indeed from an extraterrestrial source. They’ll use established criteria—for example, looking at the signal type and checking for repetition—and conduct assessments at multiple facilities for additional validation.

How to best measure computer intelligence has been a long-standing question in the field. In a famous 1950 paper, Alan Turing proposed the “imitation game,” more widely known as the Turing Test, which assesses whether human interlocutors can distinguish if they are chatting with a human or a machine. Although the Turing Test was useful in the past, the rise of LLMs has made clear that it isn’t a complete enough test to measure intelligence. As Turing himself noted, the relationship between imitating language and thinking is still an open question.

Future appraisals must be directed at different dimensions of intelligence. Although measures of human intelligence are controversial, IQ tests can provide an initial baseline to assess one dimension. In addition, cognitive tests on topics such as creative problem-solving, rapid learning and adaptation, reasoning, and goal-directed behavior would be required to assess general intelligence.

But it’s important to remember that these cognitive tests were designed for humans and might contain assumptions that might not apply to computers, even those with AGI abilities. For example, depending on how it’s trained, a machine may score very high on an IQ test but remain unable to solve much simpler tasks. In addition, an AI may have new abilities that aren’t measurable by our traditional tests. There’s a clear need to design novel evaluations that can alert us when meaningful progress is made toward AGI.

IF WE DEVELOP AGI, we must be prepared to answer questions such as: Is the new form of intelligence a new form of life? What kinds of rights does it have? What are the potential safety concerns, and what is our approach to containing the AGI entity?

Here, too, SETI provides inspiration. SETI’s postdetection protocols emphasize validation, transparency, and international cooperation, with the goal of maximizing the credibility of the process, minimizing sensationalism, and bringing structure to such a profound event. Likewise, we need internationally recognized AGI protocols to bring transparency to the entire process, apply safety-related best practices, and begin the discussion of ethical, social, and philosophical concerns.

We readily acknowledge that the SETI analogy can go only so far. If AGI emerges, it will be a human-made phenomenon. We will likely gradually engineer AGI and see it slowly emerge, so detection might be a process that takes place over a period of years, if not decades. In contrast, the existence of extraterrestrial life is something that we have no control over, and contact could happen very suddenly.

The consequences of a true AGI are entirely unpredictable. To best prepare, we need a methodical approach to defining, detecting, and interacting with AGI, which could be the most important development in human history.


2024 Guest Lecture Notes: AI, Machine Learning and Data Mining in Recommendation System and Entity Matching

  1. Lecture Notes Repository on GitHub
    1. Disclaimer
    2. 2024-10-14: AI/ML in Action for CSE5ML
    3. 2024-10-15: AI/DM in Action for CSE5DMI
  2. Contribution to the Company and Society
  3. Reference

In October 2024, I was invited by Dr Lydia C. and Dr Peng C to give two guest lectures at La Trobe University (Melbourne) to students enrolled in CSE5DMI Data Mining and CSE5ML Machine Learning.

The lectures focused on data mining and machine learning applications and practice in industry and digital retail, and on how students should prepare themselves for their future. The attendees were postgraduate students enrolled in CSE5ML or CSE5DMI in Semester 2, 2024 (approximately 150 students per subject) who are pursuing one of the following degrees:

  • Master of Information Technology (IT)
  • Master of Artificial Intelligence (AI)
  • Master of Data Science
  • Master of Business Analytics

Lecture Notes Repository on GitHub

Readers can find the lecture notes in my GitHub repository: https://github.com/cuicaihao/GuestLecturePublic (licensed under a Creative Commons Attribution 4.0 International License).

Disclaimer

This repository is intended for educational purposes only. The content, including presentations and case studies, is provided “as is” without any warranties or guarantees of any kind. The authors and contributors are not responsible for any errors or omissions, or for any outcomes related to the use of this material. Use the information at your own risk. All trademarks, service marks, and company names are the property of their respective owners. The inclusion of any company or product names does not imply endorsement by the authors or contributors.

This is a public repository intended to share the lectures with the public. The *.excalidraw files can be downloaded and opened at https://excalidraw.com/

2024-10-14: AI/ML in Action for CSE5ML

  • General Slides CSE5ML
  • Case Study: Recommendation System
  • A recommendation system is an artificial intelligence or AI algorithm, usually associated with machine learning, that uses Big Data to suggest or recommend additional products to consumers. These can be based on various criteria, including past purchases, search history, demographic information, and other factors.
  • This presentation was developed for CSE5ML students at La Trobe University, Melbourne, and used in the guest lecture on 14 October 2024.
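
As a rough illustration of the "past purchases" criterion mentioned above, here is a minimal sketch of item co-occurrence scoring. It is not taken from the lecture slides; the function names and the toy baskets are hypothetical, and a production recommender would use far richer signals and models.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate within a basket so repeated items are counted once.
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return pair_counts


def recommend(purchased, pair_counts, top_n=3):
    """Score items the customer has not bought by co-occurrence with what they have."""
    owned = set(purchased)
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a in owned and b not in owned:
            scores[b] += count
    # Sort by score (descending), then by name, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history, for illustration only.
baskets = [
    ["wine", "cheese", "crackers"],
    ["wine", "cheese"],
    ["beer", "chips"],
    ["wine", "crackers", "olives"],
]
pair_counts = build_cooccurrence(baskets)
print(recommend(["wine"], pair_counts))
```

Counting co-purchases is one of the simplest possible baselines; it ignores search history and demographics entirely, which is why it is shown here only as a starting point.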

2024-10-15: AI/DM in Action for CSE5DMI

  • General Slides CSE5DMI
  • Case Study: Entity Matching System
    • Entity matching: the task of clustering duplicated database records to underlying entities. “Given a large collection of records, cluster these records so that the records in each cluster all refer to the same underlying entity.”
  • This presentation was developed for CSE5DMI students at La Trobe University, Melbourne, and used in the guest lecture on 15 October 2024.
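
The clustering task quoted above can be sketched in a few lines. This is only an illustrative assumption of how one might approach it, not the system presented in the lecture: it compares normalized names with the standard library's `difflib.SequenceMatcher` and merges similar records with a small union-find structure. The `0.85` threshold and the sample records are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similar(a, b, threshold=0.85):
    """Treat two records as the same entity if their normalized names are close."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def cluster_records(records, threshold=0.85):
    """Group records so that each cluster refers to one underlying entity."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; fine for a sketch, too slow for large data.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similar(records[i], records[j], threshold):
                parent[find(j)] = find(i)

    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


# Hypothetical records, for illustration only.
records = ["Acme Pty. Ltd.", "Globex Corporation", "ACME Pty Ltd", "Globex Corporaton"]
print(cluster_records(records))
```

Because every pair is compared, this approach only suits small collections; real entity matching systems typically add a blocking step so that only plausible candidate pairs are scored.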

Contribution to the Company and Society

This journey also aligns with the Company’s strategy.

  • Being invited to be a guest lecturer for students with related knowledge backgrounds in 2024 aligns closely with EDG’s core values of “weʼre real, weʼre inclusive, weʼre responsible”.
  • By participating in a guest lecture and discussion on data analytics and AI/ML practice beyond theories, we demonstrate our commitment to sharing knowledge and expertise, embodying our responsibility to contribute positively to the academic community and bridge the gap between theory builders and problem solvers.
  • This event allows us to inspire and educate students in the same domains at La Trobe University, showcasing our passion and enthusiasm for the business. Through this engagement, we aim to positively impact attendees, providing suggestions for their career paths, and fostering a spirit of collaboration and continuous learning.
  • Showing our purpose, values, and ways of working will impress future graduates who may want to come and work for us, and to stay and thrive with us. It also helps us deliver on our purpose to create a more sociable future, together.

Moreover, I am grateful for all the support and encouragement I have received from my university friends and teammates throughout this journey. Additionally, the teaching resources and environment in the West Lecture Theatres at La Trobe University are outstanding!

Reference

-END-

Is the AI PC a Gimmick or a Faster Carriage?

TL;DR: The post discusses the impact of AI on productivity, particularly through the emergence of AI PCs powered by localized edge AI. It highlights how large language models and the Core Ultra processor enable AI PCs to handle diverse tasks efficiently and securely. The article also touches on the practical applications and benefits of AI PCs in various fields. The comprehensive overview emphasizes the transformative potential of AI PCs and their pivotal role in shaping the future of computing.

Translation from the Source: AI PC 是噱头还是更快的马车?

Is AI a Bubble or a Marketing Gimmick?

Since 2023, everyone has known that AI is very hot, very powerful, and almost magical. It can generate articles with elegant language and write comprehensive reports, easily surpassing 80% or even more of human output. As for text-to-image generation, music composition, and even videos, there are often impressive results. There’s no need to elaborate on its hype…

For professions like designers and copywriters, generative AI has indeed sped up the creative process, since they no longer need to start from scratch. It is efficient enough that some people in these roles may even worry about losing their jobs. But for ordinary people, beyond the novelty, trendy tools such as OpenAI’s products and Stable Diffusion don’t seem to offer much practical help at work. After all, most people don’t regularly need to write well-structured articles or compose poems. And after seeing many AI outputs, they often conclude that most of it is correct but empty: useful, but not very impactful.

So, when a phone manufacturer says it will no longer produce “traditional phones,” people scoff. When the concept of an AI PC emerges, it’s hard not to see it as a marketing gimmick. However, after walking around the exhibition area at Intel’s 2024 commercial client AI PC product launch, I found AI to be more useful than I imagined. Yes, useful—not needing to be breathtaking, but very useful.

The fundamental change in experience brought by localized edge AI

A commercial PC is, above all, a productivity tool. If you don’t buy the latest hardware and can’t keep up with the latest software versions, it’s easy to be labeled as having “low application skills.” Take Excel as an example. Early on, being efficient in Excel meant using formulas for automatic calculations. Later, it meant writing macros to filter, sort, and export data automatically, though this was quite difficult. A few years ago, learning Python became the trend, and without it one was not considered competent in data processing. Nowadays, with data visualization as the buzzword, most Excel users still have to search online for a tutorial whenever they meet an unfamiliar formula and learn it on the spot, and complex operations often take repeated attempts.

So, does adding “AI” to a PC or installing an AI assistant simply make it trendy? After experiencing it firsthand, I can confirm that the AI PC is far from superficial. ExtendOffice, a company specializing in Office plugins, effectively solves the pain point of fumbling through Excel: you just state your intention, and the AI assistant performs the operation directly on the Excel sheet, such as converting currencies or encrypting a column of data. There’s no need to figure out which formula or function matches your need or to search for tutorials, and the step-by-step learning process is skipped entirely; the AI assistant handles it on the spot.

This highlights a particularly critical selling point of the AI PC: localization, and based on that, it can be embedded into workflows and directly participate in processing. We Chinese particularly love learning, always saying “teaching someone to fish is better than giving them a fish,” but the learning curve for fishing is too long. In an AI PC, you can get both the fish and the fishing skills because the fisherman (AI assistant) is always in front of you, not to mention it can also act as a chef or secretary.

Moreover, the “embedding” mentioned earlier is not limited to a single operation (like adding a column of data or a formula to Excel). The assistant can generate multi-step, cross-software operations. This demonstrates an advantage of large language models: they can accept longer inputs, understand them, and break them down into steps. For example, we can tell the AI PC: “Mute the computer, then open the last document I read and send it to a certain email address.” Notably, in the current demonstration there is no need to specify the exact document name; vague instructions are understood.

Another operation that pleasantly surprised me was batch renaming files. In Windows, batch renaming takes a few tricks and can only produce regular names (numeric or letter suffixes, etc.). With the help of an AI assistant, file names can be more personalized: adding the relevant customer’s name, a style category, and so on. This seemingly simple task actually involves looking at each file, extracting key information, sometimes describing abstract content based on one’s own understanding, and then writing each new file name individually. It is a tedious process that becomes time-consuming with many files. With the AI assistant, it takes a single sentence.

Understanding longer contexts, multimodal inputs, and so on all rely on the capabilities of large language models, yet here they run locally rather than relying on cloud inference. Honestly, no one would think that organizing file names in the local file system should require a round trip to the cloud, right? The hidden gaps between the edge and the cloud have limited our imagination, so these local operations on the AI PC really opened my mind.

Compared with the cloud-based AI tools that people became familiar with early on, localization brings many obvious benefits. For instance, natural language processing and other operations can be completed even when offline. For early users who relied heavily on large models and lived through service outages, that “the sky is falling” moment was a real pain point. And in scenarios with no internet at all, such as on a plane, continuous availability is a basic need.

Local deployment can also address data security issues. Since the rise of large models, there have been frequent news of companies accidentally leaking data. Using ChatGPT for presentations, code reviews, etc., is great, but it requires uploading documents to the cloud. This has led many companies to outright ban employees from using ChatGPT. Subsequently, many companies chose to train and fine-tune private large models using open-source models and internal data, deploying them on their own servers or cloud hosts. Furthermore, we now see that a large model with 20 billion parameters can be deployed on an AI PC based on the Core Ultra processor.

These large models deployed on AI PCs have already been applied in various vertical fields such as education, law, and medicine, generating knowledge graphs, contracts, legal opinions, and more. For example, after a case is entered into ThunderSoft’s Cube intelligent legal assistant, the assistant can analyze the case, find relevant legal provisions, draft legal documents, and so on. In this scenario, the privacy of the case must be absolutely guaranteed, and lawyers wouldn’t dare transmit such documents to the cloud for processing. Doctors have similar constraints. For research based on medical cases and genetic data, conducting genetic target and pharmacological analyses on a PC eliminates the need to purchase servers or deploy private clouds.

Incidentally, the large model on the AI PC also makes training simpler than imagined. Feeding the local files visible to you into the AI assistant can solve the problem of “correct nonsense” that previous chatbots often produced. For example, generating a quote email template with AI is easy, but it’s normal for a robot to not understand key information like prices, which requires human refinement. If a person handles this, preparing a price list in advance is a reasonable requirement, right? Price lists and FAQs need to be summarized and refined, then used to train newcomers more effectively—that’s the traditional view. Local AI makes this simple: let it read the Outlook mailbox, and it will learn the corresponding quotes from historical emails. The generated emails won’t just be template-level but will be complete with key elements. Our job will be to confirm whether the AI’s output is correct. And these learning outcomes can be inherited.

Three Major AI Engines Support Local Large Models

In the information age, we have experienced several major technological transformations. First was the popularization of personal computers, then the internet, and then mobile internet. Now we are facing the empowerment and even restructuring of productivity by AI. The AI we discuss today is not large-scale clusters for training or inference in data centers but the PCs at our fingertips. AIGC, video production, and other applications for content creators have already continuously amazed the public. Now we further see that AI PCs can truly enhance the work efficiency of ordinary office workers: handling trivial tasks, making presentations, writing emails, finding legal provisions, etc., and seamlessly filling in some of our skill gaps, such as using unfamiliar Excel functions, creating supposedly sophisticated knowledge graphs, and so on. All this relies not only on the “intelligent emergence” of large language models but also on sufficiently powerful performance to support local deployment.

We frequently mention the “local deployment” of large models, which depends on strong AI computing power at the edge. The so-called AI PC relies on the Core Ultra processor’s powerful trio of AI engines (CPU, GPU, and NPU), whose computing power is sufficient to run inference locally for a large language model with 20 billion parameters. By comparison, AIGC applications such as illustration-grade text-to-image generation are relatively easy.

Fast CPU Response: The CPU can be used to run traditional, diverse workloads and achieve low latency. The Core Ultra adopts advanced Intel 4 manufacturing process, allowing laptops to have up to 16 cores and 22 threads, with a turbo frequency of up to 5.1GHz.

High GPU Throughput: The GPU is ideal for large workloads that require parallel throughput. The Core Ultra comes standard with Arc GPU integrated graphics. The Core Ultra 7 165H includes 8 Xe-LPG cores (128 vector engines), and the Core Ultra 5 125H includes 7. Moreover, this generation of integrated graphics supports AV1 hardware encoding, enabling faster output of high-quality, high-compression-rate videos. With its leading encoding and decoding capabilities, the Arc GPU has indeed built a good reputation in the video editing industry. With a substantial increase in vector engine capabilities, many content creation ISVs have demonstrated higher efficiency in smart keying, frame interpolation, and other functions based on AI PCs.

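Efficient NPU: The NPU (Neural Processing Unit) newly introduced in the Core Ultra processor handles sustained, frequently used AI workloads at low power to ensure high energy efficiency. For example, Huorong demonstrated using NPU compute to take over virus scanning and similar work previously handled by the CPU and GPU; although it is slightly slower than using the GPU, the energy savings are clear, which suits background tasks such as security. Familiar video-conferencing features such as beautification, background replacement, and auto-centering can also run on the NPU. The NPU is also fully capable of running a lightweight large language model on its own, such as TinyLlama 1.1, which is enough for continuous workloads like chatbots, intelligent assistants, and intelligent operations, leaving CPU and GPU resources for other tasks.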

Edge AI has unlimited possibilities, and its greatest value is precisely in practicality. With sufficient computing power, whether through large-scale language models or other models, it can indeed increase the efficiency of content production and indirectly enhance the operational efficiency of every office worker.

For commercial AI PCs, Intel has also launched the vPro® platform based on Intel® Core™ Ultra, which organically combines AI with the productivity, security, manageability, and stability of the commercial platform. Broadcom demonstrated that vPro-based AI PC intelligent management transforms traditional asset management from passive to proactive: previously, it was only possible to see whether devices were “still there” and “usable,” and operations like patch upgrades were planned; with AI-enhanced vPro, it can autonomously analyze device operation, identify potential issues, automatically match corresponding patch packages, and push suggestions to maintenance personnel. Beirui’s Sunflower has an AI intelligent remote control report solution, where remote monitoring of PCs is no longer just screen recording and capturing but can automatically and in real-time identify and generate remote work records of the computer, including marking sensitive operations such as file deletion and entering specific commands. This significantly reduces the workload of maintenance personnel in checking and tracing records.

The Future is Here: Hundreds of ISVs Realizing Actual Business Applications

Henry Ford once commented on the invention of the automobile: “If you ask your customers what they need, they will say they need a faster horse.”

“A faster horse” is a consumer trap. People who think AI phones and AI PCs are just gimmicks may simply be assuming, out of habit, that they don’t need a new horse for now. At a deeper level, the public has some misunderstandings about how AI is being put into practice, which show up as two extremes: one sees it as something for avant-garde heavy users and flagship configurations, typically in scenarios like image and video processing; the other sees it as a refreshing chatbot, like an enhanced search engine, nice to have but not necessary. In reality, the rollout of AI PCs far exceeds what many people imagine: for commercial customers, Intel has carried out deep optimization work with more than 100 ISVs worldwide, and over 35 local ISVs have optimized their integration on the device, creating a huge AI ecosystem with over 300 ISV features and bringing an AI PC experience of unprecedented scale!

Moreover, I do not think this scale of AI application rollout is pie in the sky or a bet on some distant future. In my eyes, the display of numerous AI PC solutions looked like an “OpenVINO™ party.” OpenVINO™ is a cross-platform deep learning toolkit developed by Intel; the name stands for “Open Visual Inference and Neural Network Optimization.” The toolkit was released in 2018 and has since accumulated a large number of computer vision and deep learning inference applications. By the Iris Xe integrated graphics era, the software and hardware combination already had a strong reputation. For example, relying on a mature algorithm store, various AI applications can be easily built on the 11th generation Core platform, from behavior detection for smart security to automatic inventory checking in stores, with quite good results. Now that AI PC integrated graphics have evolved to Xe-LPG with doubled computing power, the applications accumulated around OpenVINO™ will perform even better. In other words, the favorable “terrain” (a Xe engine with continuity) and the “people” (OpenVINO™’s ISV resources) were already in place.

What truly ignites the AI PC is “timing,” namely, the practicalization of large language models. The breakthrough of large language models has effectively solved the problems of natural language interaction and data training, greatly lowering the threshold for ordinary users to utilize AI computing power. Earlier, I cited many examples embedded in office applications. Here, I can give another example: the combination of Kodong Intelligent Controller’s multimodal visual language model with a robotic arm. The robotic arm is a common robot application, which has long been able to perform various operations with machine vision, such as moving and sorting objects. However, traditionally, object recognition and operation require pre-training and programming. With the integration of large language models, the whole system can perform multimodal instruction recognition and execution. For instance, we can say: “Put the phone on that piece of paper.” In this scenario, we no longer need to teach the robot what a phone is, what paper is, do not need to give specific coordinates, and do not need to plan the moving path. Natural language instructions and camera images are well integrated, and execution instructions for the robotic arm are generated automatically. For such industrial scenarios, the entire process can be completed on a laptop-level computing platform, and the data does not need to leave the factory.

Therefore, what AI PC brings us is definitely not just “a faster horse,” but it subverts the way PCs are used and expands the boundaries of user capabilities. Summarizing the existing ISVs and solutions, we can categorize AI PC applications into six major scenarios:

  1. AI Chatbot: More professional Q&A for specific industries and fields.
  2. AI PC Assistant: Directly operates the PC, handling personal files, photos, videos, etc.
  3. AI Office Assistant: Office plugins to enhance office software usage efficiency.
  4. AI Local Knowledge Base: RAG (Retrieval Augmented Generation) applications, including various text and video files.
  5. AI Image and Video Processing: Generation and post-processing of multimedia information such as images, videos, and audio.
  6. AI PC Management: More intelligent and efficient device asset and security management.

Summary

It is undeniable that the development of AI always relies on innovation in, and the combination of, hardware and software. AI PCs based on Core Ultra are, first of all, PCs that are faster and more powerful, with lower power consumption and longer battery life. These hardware features support AI applications that bring deeper changes to how we use and experience PCs. PCs empowered with “intelligent emergence” are no longer just productivity tools; in some scenarios, they can directly transform into collaborators or even operators. Behind this are performance improvements brought by microarchitecture and production process advancements, as well as the empowerment of new productivity like large language models.

If we regard CPU, GPU, and NPU as the three major computing powers of AI PCs, correspondingly, the value of AI PCs for localizing AI (on the client side) can be summarized into three major rules: economy, physics, and data confidentiality. The so-called economy means that processing data locally can reduce cloud service costs and optimize economic efficiency; physics corresponds to the “virtual” nature of cloud resources, where local AI services can provide better timeliness, higher accuracy, and avoid transmission bottlenecks between the cloud and the client; data confidentiality means that user data stays completely local, preventing misuse and leakage.

In 2023, the rapid advancement of large language models ushered in the AI era in the cloud. In 2024, bringing large language models to the client side ushered in the AI PC era. We look forward to AI applications taking firm root as the cloud and the client develop together, continuously releasing powerful productivity, and to Intel working with ISVs and OEMs to bring us even stronger “new productivity.”


AI PC 是噱头还是更快的马车?

AI 是虚火还是营销噱头?

2023 年以来,所有人都知道 AI 非常的热、非常的牛、非常的神,生成的文章辞藻华丽、写的报告面面俱到,毫不谦虚地说,打败 80% 甚至更多的人类。至于文生图、作曲,甚至是视频,都常有令人惊艳的作品。吹爆再吹爆,无需赘述……

对于设计师、文案策划等职业,生成式 AI 确实已经帮助他们提高了迸发创意的速度,至少不必万丈高楼平地起了。由于效率太高,这些岗位中的部分人可能反而要面对失业的烦恼。但对于普通人,AI 除了猎奇,OpenAI、SD 等时髦玩意儿好像对工作也没啥实质性的帮助——毕竟平时不需要写什么四平八稳的文章,更不需要吟诗作赋,而且见多了 AI 的输出,也实在觉得多是些正确的废话,有用,但也没啥大用。

所以,当某手机厂商说以后不生产“传统手机”的时候,大家嗤之以鼻。当 AI PC 概念出现的时候,也难免觉得是营销噱头。但是,当我在 2024 英特尔商用客户端 AI PC 产品发布会的展区走了一圈之后,我发现 AI 比我想象中的更有用。是的,有用,不需要技惊四座,但,很有用。

端侧 AI 的本地化落地带来根本性的体验变化

既然是商用 PC,那就离不开生产力工具属性。如果不买最新的硬件,玩不转最新的软件版本,很容易在鄙视链中打上“应用水平低下”的标签。就拿 Excel 为例吧,最早接触 Excel 的时候,对效率的理解是会用公式,自动进行一些计算等。再然后,是宏代码,自动执行数据的筛选、排序、导出等等,但这个难度还是比较大的。前几年呢,又似乎流行起了 Python,不去学一下那都不配谈数据处理了。在言必称数据可视化的当下,多数 Excel 用户的真实情况是尝试陌生的公式都需要临时百度一下教程,现学现用,稍复杂的操作可能要屡败屡试。

那 PC 前面加上 “AI”,或者装上某个 AI 助理,就可以赶时髦了吗?我实际体验之后,确定 AI PC 绝非如此浅薄。在 AI PC 上,有个专门做 Office 插件的公司叫 ExtendOffice,就很好地解决了 Excel 用起来磕磕绊绊的痛点:你只要说出你的意图,AI 助手马上直接在 Excel 表格上进行操作,譬如币值转换,甚至加密某一列数据。不需要去琢磨脑海里的需求到底需要对应哪个公式或者功能才可以实现,不用去查找教程,也跳过了 step by step 的学习,AI 助手当场就处理完了。

这就体现了 AI PC 一个特别关键的卖点:本地化,且在此基础上,可以嵌入工作流程,直接参与处理。我们中国人特别热爱学习,总说“授人以鱼不如授人以渔”,但“渔”的学习曲线太长了。在 AI PC 里,鱼和渔可以同时获得,因为渔夫(AI 助手)随时都在你眼前,更不要说它还可以当厨师、当秘书。

而且,刚才说的“嵌入”并不局限于某一个操作环节(类似于刚才说的给 Excel 增加某一列数据、公式),而是可以生成一个多步骤的、跨软件的操作。这也体现了大语言模型的优势:可以接受较长的输入并理解、分拆。譬如,我们完全可以对 AI PC 说:帮我将电脑静音,然后打开上次阅读的文档,并把它发送给某某邮箱。需要强调的是,以目前的演示,不需要指定准确的文档名,模糊的指示是可以理解的。还有一个让我暗暗叫好的操作是批量修改文件名。在 Windows 下批量修改文件名是需要一些小技巧的,而且,只能改成有规律的文件名(数字、字母后缀)等,但在 AI 助手的帮助下,我们可以让文件名更有个性:分别加上相关客户的名字、不同的风格类型等等。这事说起来简单,但其实需要挨个查看文件、提取关键信息,甚至根据自我理解去描述一些抽象的信息,然后挨个编写新的文件名——过程非常琐碎,文件多了就很费时间,但有了 AI 助手,这就是一句话的事。理解较长的上下文、多模态输入等等,这些都必须依赖大语言模型的能力,但其实是在本地运行的,而非借助云端的推理能力。讲真,应该没有人会认为整理文件名这种本地文件系统的操作还需要去云端绕一圈吧?从端到云之间隐藏的各种断点确实限制了我们的想象力,因此,AI PC 的这些本地操作真的打开了我的思路。

相对于大家早期较为熟悉的基于云端的 AI 工具,本地化还带来了很多显而易见的好处。譬如,断网的情况下,也是可以完成自然语言的处理和其他的操作。这对于那些曾经重度依赖大模型能力,且遭遇过服务故障的早期大模型用户而言,“天塌了”就是痛点。更不要说坐飞机之类的无网络场景了,保持连续的可用性是一个很朴素的需求。

本地部署还可以解决数据安全问题。大模型爆火之初就屡屡传出某某公司不慎泄露数据的新闻。没办法,用 ChatGPT 做简报、检查代码等等确实很香啊,但前提是得把文档上传到云端。这就导致许多企业一刀切禁止员工使用 ChatGPT。后来的事情就是许多企业选择利用开源大模型和内部数据训练、微调私有的大模型,并部署在自有的服务器或云主机上。更进一步的,现在我们看到规模 200 亿参数的大模型可以部署在基于酷睿 Ultra 处理器的 AI PC 上。

这种部署在 AI PC 上的大模型已经涉及教育、法律、医学等多个垂直领域,可以生成包括知识图谱、合同、法律意见等。譬如,将案情输入中科创达的魔方智能法务助手,就可以进行案情分析,查找相关的法律条文,撰写法律文书等。在这个场景中,很显然案情的隐私是应该绝对保证的,律师不敢将这种文档传输到云端处理。医生也有类似的约束,基于病例、基因数据等进行课题研究,如果能够在 PC 上做基因靶点、药理分析等,就不必采购服务器或者部署私有云了。

顺便一提的是,AI PC 上的大模型还让训练变得比想象中要简单,把本地你能看到的文件“喂”给 AI 助理之类的就可以了。这就解决了以往聊天机器人那种活只干了一半的“正确的废话”。譬如,通过 AI 生成一个报价邮件模板是很轻松的,但是,一般来说价格这种关键信息,机器人不懂那是很正常的事情,所以需要人工进行完善。如果找一个人类来处理这种事情,那提前做一份价格表是合理要求吧?报价表、FAQ 等都是属于需要总结提炼的工作,然后才能更有效率地培训新人——这是传统观念。本地的 AI 可以让这个事情变得很简单:让它去读 Outlook 邮箱就好了,片刻之后它自己就从历史邮件中“学”到对应的报价。相应生成的邮件就不仅是模版级了,而是要素完善的,留给我们做的就只剩确认 AI 给的结果是否正确。而且这种学习成果是可以继承下来的。

三大 AI 引擎撑起本地大模型

信息时代,我们已经经历了几次重大的科技变革。首先是个人电脑的普及,然后是互联网的普及,再就是移动互联网。现在我们正在面对的是 AI 对生产力的赋能甚至重构。我们今天讲的 AI 不是在数据中心里做训练或者推理的大规模集群,而是手边的 PC。AIGC、视频制作等面向内容创作者的应用已经不断给予大众诸多震撼了。现在我们进一步看到的是 AI PC 已经可以实实在在的提升普通白领的工作效率:处理琐碎事务,做简报、写邮件、查找法条等等,并且无缝衔接式地补齐我们的一些技能短板,类似于应用我们原本并不熟悉的的 Excel 功能、制作原以为高大上的知识图谱,诸如此类。这一切当然不仅仅依赖于大语言模型的“智能涌现”,也需要足够强大的性能以支撑本地部署。

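The "local deployment" of large models that we keep mentioning depends on strong AI compute on the device. The so-called AI PC relies on the three AI engines of the Core Ultra processor, CPU, GPU, and NPU, whose compute is enough to run inference for a 20-billion-parameter large language model locally; AIGC applications such as illustration-grade text-to-image generation are comparatively a piece of cake.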
 

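  • CPU, fast response: The CPU runs traditional, diverse workloads with low latency. Core Ultra uses the advanced Intel 4 process, giving laptops up to 16 cores and 22 threads, with turbo frequencies up to 5.1 GHz.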
     
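  • GPU, high throughput: The GPU is well suited to large workloads that need parallel throughput. Core Ultra comes standard with integrated Arc graphics; the Core Ultra 7 165H has 8 Xe-LPG cores (128 vector engines) and the Core Ultra 5 125H has 7. This generation of integrated graphics also supports AV1 hardware encoding, for faster output of high-quality, highly compressed video. Thanks to its strong encoding and decoding capabilities, Arc GPU has earned a good reputation in video editing. With the large increase in vector engine capability, many content-creation ISVs have demonstrated more efficient AI PC-based features such as intelligent matting and frame interpolation.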
     
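  • NPU, excellent power efficiency: The NPU (neural processing unit), new in Core Ultra processors, handles sustained, frequently used AI workloads at low power to ensure high energy efficiency. For example, Huorong (火绒) demonstrated using the NPU to take over virus scanning and similar work previously done by the CPU and GPU; it is slightly slower than using the GPU but has a clear energy advantage, which suits background tasks such as security. Familiar video-conferencing features such as beautification, background replacement, and auto-framing can also run on the NPU. The NPU is also fully capable of running lightweight large language models on its own, such as TinyLlama 1.1B, enough for continuous workloads like chatbots, intelligent assistants, and intelligent operations, leaving CPU and GPU resources for other tasks.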
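
To get a feel for why a 20-billion-parameter model can plausibly fit on a laptop, a back-of-the-envelope estimate of weight memory helps. The sketch below is only an illustration under stated assumptions: the precisions (FP16, INT8, INT4) are common quantization choices, not something the demos described here are known to use, and the estimate ignores activations and KV-cache memory.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)


PARAMS_20B = 20e9  # the 20-billion-parameter figure quoted in the article

# Assumed precisions; the article does not say which one the demos used.
for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_memory_gib(PARAMS_20B, bits):.1f} GiB")
```

For 20 billion parameters this comes to roughly 37.3 GiB at FP16, 18.6 GiB at INT8, and 9.3 GiB at INT4, which suggests that some form of low-bit quantization is what makes such a model practical within typical laptop memory.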
     

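For commercial AI PCs, Intel has also launched the vPro® platform based on Intel® Core™ Ultra, combining AI with the productivity, security, manageability, and stability of a business platform. The vPro-based intelligent AI PC management shown by Broadcom turns traditional asset management from passive to active: previously you could only see whether a device was "still there" and "usable", and operations such as patching were scheduled in advance; AI-enhanced vPro can analyze device behavior on its own, spot risks, automatically match the appropriate patch packages, and push recommendations to operations staff. Oray Sunlogin (贝锐向日葵) offers an AI remote-control reporting solution in which remote monitoring of a PC is no longer just screen recording and screenshots; it automatically identifies and generates a record of remote work on the computer in real time, including flagging sensitive operations such as deleting files or entering specific commands. This clearly reduces the workload of operations staff who review and trace records.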

The Future Is Here: Hundreds of ISVs Deploying in Real Business

Henry Ford once said of the invention of the automobile: "If you ask your customers what they need, they will say a faster horse."

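The "faster horse" is a consumption trap. People who see AI phones and AI PCs as mere gimmicks may simply assume, out of habit, that they do not need a new horse for now. At a deeper level, the public misunderstands how AI is being deployed, in two extremes: one holds that AI is for trendy power users and flagship configurations, with image and video processing as the typical scenario; the other sees it as a novel chatbot, a kind of enhanced search engine that is nice to have but easy to live without. In reality, AI PC deployment is far beyond what many imagine: for commercial customers, Intel has worked on deep optimization with more than 100 ISVs worldwide, and more than 35 local ISVs have optimized and integrated on the client side, creating a large AI ecosystem with more than 300 ISV features and bringing an AI PC experience of unprecedented scale!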

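Nor do I think AI deployment at this scale is an empty promise or a bet on the future. In my eyes, the showcase of AI PC solutions looked like an "OpenVINO™ gala". OpenVINO™ is a cross-platform deep learning toolkit developed by Intel; the name stands for "Open Visual Inference and Neural Network Optimization". The toolkit was actually released in 2018 and over the years has accumulated a large body of computer vision and deep learning inference applications. By the Iris Xe integrated graphics era, the hardware and software combination was already well established. For example, relying on a mature algorithm store, it was easy to build all kinds of AI applications on 11th Gen Core platforms, from behavior detection for smart security to automatic store inventory, with quite good results. Now that the AI PC's integrated graphics have evolved to Xe-LPG with a multifold increase in compute, the applications built on OpenVINO™ will perform better as they are. You could say that "favorable terrain" (the continuity of the Xe engine) and "popular support" (OpenVINO™'s ISV resources) were already in place.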

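What truly ignited the AI PC was "favorable timing": large language models becoming practical. Breakthroughs in large language models solved the problems of natural language interaction and data training, greatly lowering the barrier for ordinary users to use AI compute. I have given many examples of embedding in office applications; here is one more: the combination of the Kedong (科东) intelligent controller's multimodal vision-language model with a robotic arm. Robotic arms are a commonplace robotics application and have long been able to work with machine vision to move and sort objects. But recognizing and handling objects traditionally required pre-training and programming. Combined with a large language model, the whole system can recognize and execute multimodal instructions. For example, we can say: put the phone on that sheet of paper. In this scenario we no longer need to teach the robot what a phone is or what paper is, give specific coordinates, or plan the motion path. The natural-language instruction and the camera image, as multimodal inputs, are fused well, and the system generates execution commands for the arm on its own. For such industrial scenarios, the whole pipeline can run on a laptop-class compute platform, and the data never needs to leave the factory.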

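So what the AI PC brings us is definitely not just a "faster horse"; it overturns the way PCs are used and extends what users can do. Taking stock of existing ISVs and solutions, we can summarize AI PC applications into six scenarios: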
 

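  • AI Chatbot: more specialized question answering for specific industries and domains.

  • AI PC assistant: operates the PC directly and handles personal files, photos, videos, and so on.

  • AI Office assistant: Office add-ins that make office software more efficient to use.

  • AI local knowledge base: RAG (Retrieval Augmented Generation) applications covering all kinds of text and video files.

  • AI image and video processing: generation and post-processing of images, video, audio, and other multimedia.

  • AI PC management: smarter, more efficient device asset and security management.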
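
The local knowledge base scenario can be illustrated with a very small retrieval step. The following is a minimal sketch, not how any of the ISV products mentioned here work: it ranks documents by bag-of-words cosine similarity (real systems typically use learned embeddings), and the file names, their contents, and the `build_prompt` helper are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda name: cosine(q, Counter(tokenize(docs[name]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: dict[str, str], k: int = 2) -> str:
    """Assemble retrieved context plus the question for a local model."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in retrieve(query, docs, k))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


# Hypothetical local files standing in for a user's documents.
docs = {
    "pricing.txt": "Standard plan price is 40 dollars per seat per month.",
    "faq.txt": "Support hours are 9am to 6pm on weekdays.",
    "notes.txt": "Quarterly review meeting moved to Friday.",
}
print(build_prompt("What is the price per seat?", docs, k=1))
```

The point of the design is that both the documents and the retrieval step stay on the device; only the assembled prompt would be handed to a locally running model.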

Summary

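Admittedly, AI's progress always depends on hardware and software innovation and on combining the two. An AI PC based on Core Ultra is first of all a faster, stronger, lower-power PC with longer battery life, and the AI applications these hardware traits support change our experience and usage patterns more profoundly. A PC blessed with "emergent intelligence" is no longer just a productivity tool; in some scenarios it can act directly as a collaborator or even an operator. Behind this are performance gains from microarchitecture and process improvements, as well as the empowerment of new quality productive forces such as large language models.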

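If we regard the CPU, GPU, and NPU as the three compute engines of the AI PC, then the value of bringing AI to the device side can likewise be summarized in three laws: economics, physics, and data confidentiality. Economics means that processing data locally reduces cloud service costs. Physics, in contrast to the "virtual" nature of cloud resources, means that local AI services can offer better timeliness and higher accuracy while avoiding the transmission bottleneck between cloud and device. Data confidentiality means that user data stays entirely local, preventing misuse and leakage.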

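In 2023, the surge of large language models made it the first year of cloud AI. In 2024, large language models landing on the device opened the first year of the AI PC. We look forward to AI steadily consolidating its applications as cloud and device development intertwine, continuously releasing strong productivity, and we look forward even more to Intel working with ISVs and OEMs to give us even stronger "new quality productive forces".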

AI Revolutionizes Industry and Retail: From Production Lines to Personalized Shopping Experiences

  1. Industry and Retail Relationship
  2. AI in Industry
  3. AI in Retail
  4. Summary

AI technology is increasingly used in the industry and retail sectors to enhance efficiency, productivity, and customer experiences. In this post, we first revisit the relationship between the industry and retail sectors, then survey some common AI technologies and applications used in each.

Industry and Retail Relationship

The key difference between industry and retail lies in their primary functions and the nature of their operations:

Industry:

  • Industry, often referred to as manufacturing or production, involves the creation, extraction, or processing of raw materials and the transformation of these materials into finished goods or products.
  • Industrial businesses are typically involved in activities like manufacturing, mining, construction, or agriculture.
  • The primary focus of the industry is to produce goods on a large scale, which are then sold to other businesses, wholesalers, or retailers. These goods are often used as inputs for other industries or for further processing.
  • Industries may have complex production processes, rely on machinery and technology, and require substantial capital investment.

Retail:

  • Retail, on the other hand, involves the sale of finished products or goods directly to the end consumers for personal use. Retailers act as intermediaries between manufacturers or wholesalers and the end customers.
  • Retailers can take various forms, including physical stores, e-commerce websites, supermarkets, boutiques, and more.
  • Retailers may carry a wide range of products, including those manufactured by various industries. They focus on providing a convenient and accessible point of purchase for consumers.
  • Retail operations are primarily concerned with merchandising, marketing, customer service, inventory management, and creating a satisfying shopping experience for consumers.

AI in Industry

AI, or artificial intelligence, is revolutionizing industrial sectors by powering applications that raise efficiency and productivity. Here are some common AI technologies and applications used in industry:

1. Robotics and Automation: AI-driven robots and automation systems are used in manufacturing to perform repetitive, high-precision tasks, such as assembly, welding, and quality control. Machine learning algorithms enable these robots to adapt and improve their performance over time.

2. Predictive Maintenance: AI is used to predict when industrial equipment, such as machinery or vehicles, is likely to fail. This allows companies to schedule maintenance proactively, reducing downtime and maintenance costs.

3. Quality Control: Computer vision and machine learning algorithms are employed for quality control processes. They can quickly identify defects or irregularities in products, reducing the number of faulty items reaching the market.

4. Supply Chain Optimization: AI helps in optimizing the supply chain by predicting demand, managing inventory, and optimizing routes for logistics and transportation.

5. Process Optimization: AI can optimize manufacturing processes by adjusting parameters in real time to increase efficiency and reduce energy consumption.

6. Safety and Compliance: AI-driven systems can monitor and enhance workplace safety, ensuring that industrial facilities comply with regulations and safety standards.
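
To make the predictive maintenance idea concrete, here is a minimal sketch that flags sensor readings deviating sharply from their recent history. It is a simple statistical stand-in (a trailing-window z-score), not a learned failure-prediction model; the `vibration` values, window size, and threshold are made-up assumptions for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(readings: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return indices whose value is more than `threshold` standard
    deviations away from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration amplitudes from a machine; index 7 is a spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 0.95, 2.5, 1.0, 1.1]
print(flag_anomalies(vibration))  # [7]
```

In practice a flagged reading would feed a maintenance ticket or a richer model rather than trigger a shutdown on its own; the sketch only shows the shape of the signal-to-alert step.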


AI in Retail

AI technology is revolutionizing the retail sector too, introducing innovative solutions and transforming the way businesses engage with customers. Here are some key AI technologies and applications used in retail:

1. Personalized Marketing: AI is used to analyze customer data and behaviours to provide personalized product recommendations, targeted marketing campaigns, and customized shopping experiences.

2. Chatbots and Virtual Assistants: Retailers employ AI-powered chatbots and virtual assistants to provide customer support, answer queries, and assist with online shopping.

3. Inventory Management: AI can optimize inventory levels and replenishment by analyzing sales data and demand patterns, reducing stockouts and overstock situations.

4. Price Optimization: Retailers use AI to dynamically adjust prices based on various factors, such as demand, competition, and customer behaviour, to maximize revenue and profits.

5. Visual Search and Image Recognition: AI enables visual search in e-commerce, allowing customers to find products by uploading images or using images they find online.

6. Supply Chain and Logistics: AI helps optimize supply chain operations, route planning, and warehouse management, improving efficiency and reducing costs.

7. In-Store Analytics: AI-powered systems can analyze in-store customer behaviour, enabling retailers to improve store layouts, planogram designs, and customer engagement strategies.

8. Fraud Detection: AI is used to detect and prevent fraudulent activities, such as credit card fraud and return fraud, to protect both retailers and customers.
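
As a small illustration of the inventory management item, the sketch below computes a classical reorder point: expected demand over the supplier lead time plus a safety stock. This is a textbook rule rather than an AI method; an AI-driven system would typically replace the simple historical average with a learned demand forecast. The sales figures, lead time, and the `service_z` value of 1.65 (roughly a 95% service level under a normal-demand assumption) are all assumptions.

```python
import math
from statistics import mean, stdev


def reorder_point(daily_sales: list[float], lead_time_days: float, service_z: float = 1.65) -> float:
    """Stock level at which a replenishment order should be placed."""
    avg_daily = mean(daily_sales)
    # Safety stock covers demand variability during the lead time.
    safety_stock = service_z * stdev(daily_sales) * math.sqrt(lead_time_days)
    return avg_daily * lead_time_days + safety_stock


# Hypothetical unit sales for one product over the past week.
sales = [20, 22, 19, 25, 21, 18, 23]
rop = reorder_point(sales, lead_time_days=4)
print(f"Reorder when stock falls to about {rop:.0f} units")
```

Raising `service_z` trades more held stock for fewer stockouts, which is the same balance the inventory item above describes.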

Summary

AI has considerable potential to transform industry and retail. As AI technologies advance, we can expect increased levels of automation, personalization, and optimization in both sectors.

AI technologies in these sectors often rely on machine learning (ML), deep learning (DL), natural language processing (NLP), computer vision (CV), and, more recently, generative large language models (LLMs) to analyze data and draw insights from it. These applications keep evolving and are changing how businesses in these sectors operate, leading to improved processes and customer experiences.

AI will drive high levels of efficiency, innovation, and customer satisfaction in these sectors, ultimately revolutionizing the way businesses operate and interact with consumers.