Physical AI Learning Plan

This is a read-first study plan, not a build plan. It runs in four phases: foundations, the software stack, the hardware ladder, and a first build. The goal is to reach a level of understanding where hardware and project decisions are informed, not impulsive. There is no rush.

Last updated May 17, 2026

How this plan works

Each phase answers one clear question. Work through the phases in order. The exit criteria at the end of each phase are concrete: when you can meet them, you are ready for the next phase. The companion Build Plan handles hardware sequencing and project specifics. The four layer guides (Governance, Hardware, Software, Systems) plus the Landscape are the reference material.

Phase 1: Foundations · "What is this world?"

Question this phase answers: What is Physical AI, how does the NVIDIA stack work end-to-end, and what are the key concepts?

Time estimate: 3-5 hours of reading across a few sessions.

1.1 Read the landscape

1.2 NVIDIA platform overview

  • Watch Jensen Huang's GTC 2026 keynote robotics section (~20 min). Search "Jensen Huang GTC 2026 keynote robotics" on YouTube. The keynote contextualizes why NVIDIA is building all of this and what "Physical AI" means at industry scale.
  • Read the NVIDIA Robotics platform page. Focus on the "three computers" concept (training, simulation, deployment).
  • Read the Jetson embedded systems overview. Understand the hardware ladder and where Orin Nano Super fits.
  • Read the humanoid robot use case page. Shows how GR00T, Isaac Sim, and Jetson Thor come together for one use case.

1.3 Key concepts (glossary deep dives)

  • CUDA: what it is, why it matters for AI. Read CUDA toolkit overview (overview only, not the full docs).
  • ROS2: what the Robot Operating System actually does. Read the 5-minute conceptual intro.
  • Reinforcement Learning vs Imitation Learning: understand the two main approaches to teaching robots. Search "reinforcement learning vs imitation learning robotics" for a good blog post or video.
  • Sim-to-real transfer: why simulation training matters and what makes it hard. The GR00T N1 announcement blog covers this well.
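
To make the reinforcement-versus-imitation distinction concrete, here is a minimal sketch on a toy 1-D task (drive a value x toward 0). Everything in it is an assumption made for illustration: the toy system, the function names, and the use of simple hill climbing as a crude stand-in for a real RL algorithm. Imitation learning fits the policy to expert (state, action) pairs; the RL-style loop never sees the expert and improves only by trying gains and keeping those that earn more reward.

```python
import random


def rollout(gain, start=1.0, steps=10):
    """Run the toy system x <- x + a with policy a = -gain * x; return total reward."""
    x, total = start, 0.0
    for _ in range(steps):
        x = x + (-gain * x)
        total += -(x * x)  # reward is higher (less negative) the closer x is to 0
    return total


def imitation_learning(demos):
    """Fit the gain by least squares to expert (state, action) pairs."""
    numerator = sum(state * action for state, action in demos)
    denominator = sum(state * state for state, _ in demos)
    return -numerator / denominator


def reinforcement_learning(gain=0.0, iterations=200, step=0.1, seed=0):
    """Hill climbing: perturb the gain and keep the change only if reward improves."""
    rng = random.Random(seed)
    best_reward = rollout(gain)
    for _ in range(iterations):
        candidate = gain + rng.uniform(-step, step)
        reward = rollout(candidate)
        if reward > best_reward:
            gain, best_reward = candidate, reward
    return gain


# The "expert" always acts with gain 1.0, which moves x to 0 in one step.
expert_demos = [(s, -1.0 * s) for s in (0.5, 1.0, -2.0, 3.0)]
il_gain = imitation_learning(expert_demos)
rl_gain = reinforcement_learning()
```

Both routes end up near the same gain, but they consume different things: imitation needs demonstrations, while the trial-and-error loop needs a reward signal and many rollouts. That trade-off is why simulation matters so much for the RL route.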

1.4 Industry context

Phase 1 exit criteria
You can explain the NVIDIA Physical AI stack to someone in plain language: what Isaac Sim, Isaac Lab, Isaac ROS, GR00T, and Cosmos each do, how they connect, and why simulation matters.

Phase 2: The software stack · "How does the software work?"

Question this phase answers: What does each piece of the NVIDIA robotics software stack actually do, and what would you use each for?

Time estimate: 5-8 hours across multiple sessions.

2.1 Jetson AI Lab (the hands-on starting point)

Jetson AI Lab is the single best resource for understanding what a Jetson can do. Even before you buy hardware, the tutorials show what is possible.

  • Browse the Jetson AI Lab homepage.
  • Read through the tutorial categories (do not do them yet, just understand what exists): Getting Started, Fundamentals, VLM (Vision-Language Models), VLA (Vision-Language-Action), Applications, Model Optimization.
  • Browse the Models page. Understand what models are optimized for Jetson.
  • Browse the Community Projects page to see what people have built.

2.2 Isaac platform deep dives

Read the overview pages for each Isaac component. The goal is to understand what each tool does and when you would use it, not to learn how to use them yet.

2.3 GR00T deep dive

  • Read the GR00T GitHub README. Understand the model architecture, supported embodiments, and what N1.7 can do.
  • Read the original GR00T N1 announcement. Focus on the GR00T-Dreams synthetic data pipeline.
  • Understand the VLA architecture: how does a model go from seeing a scene + hearing "pick up the cup" to generating motor commands?

2.4 Hugging Face LeRobot

LeRobot is the open-source alternative and complement to the NVIDIA stack, and it is more accessible for individual builders.

2.5 ROS2 basics

  • Read the ROS2 conceptual overview.
  • Understand nodes, topics, services, and actions (the core ROS2 concepts).
  • Read about Isaac ROS specifically: how does NVIDIA extend ROS2 with GPU acceleration?
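
The difference between topics, services, and actions is easier to hold onto with a toy model. The sketch below is a plain-Python, in-process stand-in and deliberately does not use the real ROS2 client library (rclpy); the `ToyBus` class, the topic and service names, and the handlers are all hypothetical. It only illustrates the communication patterns: topics are one-way streams to any number of listeners, services are a quick request with one response, and actions are long-running goals that report feedback before returning a result.

```python
from collections import defaultdict


class ToyBus:
    """In-process stand-in for ROS2 middleware. Hypothetical; this is not rclpy."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic name -> callbacks
        self.services = {}                    # service name -> handler
        self.actions = {}                     # action name -> executor

    # Topics: one-way, many-to-many streams. Publishers do not know who listens.
    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)

    # Services: a quick request that returns exactly one response.
    def advertise_service(self, name, handler):
        self.services[name] = handler

    def call_service(self, name, request):
        return self.services[name](request)

    # Actions: a long-running goal that reports feedback before its result.
    def advertise_action(self, name, executor):
        self.actions[name] = executor

    def send_goal(self, name, goal, on_feedback):
        return self.actions[name](goal, on_feedback)


bus = ToyBus()

# "Nodes" here are just callbacks wired to the bus.
logged, brake_events = [], []
bus.subscribe("/range", logged.append)
bus.subscribe("/range", lambda metres: brake_events.append(metres) if metres < 0.5 else None)

gripper_state = {"open": True}
bus.advertise_service("/gripper/is_open", lambda request: gripper_state["open"])


def move_arm(goal_degrees, on_feedback):
    """Step toward the goal in 10-degree increments, reporting progress."""
    position = 0
    while position < goal_degrees:
        position = min(position + 10, goal_degrees)
        on_feedback(position)
    return "reached"


bus.advertise_action("/arm/move_to", move_arm)

for reading in (2.0, 0.4, 1.1):  # a sensor node publishing three readings
    bus.publish("/range", reading)

gripper_is_open = bus.call_service("/gripper/is_open", None)
feedback = []
result = bus.send_goal("/arm/move_to", 25, feedback.append)
```

Real ROS2 adds what this toy leaves out: separate processes and machines, typed messages, quality-of-service settings, and goal cancellation for actions.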

2.6 The competing thesis: LeCun and JEPA

Understanding the alternative approach is as important as understanding NVIDIA's stack. The debate between NVIDIA's simulation-heavy approach and LeCun's observation-based approach shapes where the industry is heading.

2.7 AgenticROS and the Claude-to-robot pipeline

This is where software-AI skills (Claude, MCP, Claude Code) connect directly to physical robots.

  • Browse the AgenticROS site. Understand what it does: natural-language control of ROS2 robots via AI agents.
  • Browse the AgenticROS GitHub. Look at the architecture: how do Claude Code / Desktop / MCP connect to ROS2?
  • Read the ROSClaw paper for the academic foundation: a model-agnostic executive layer for connecting AI agents to robots.
  • Understand NemoClaw: NVIDIA's governance and security layer for AI agents. How does it monitor AI agent "intent"? See the Governance guide.

Phase 2 exit criteria
You can describe what you would use each Isaac component for in a specific project AND explain how LeCun's JEPA approach differs from NVIDIA's simulation-based approach. Example: "NVIDIA generates synthetic training data in Isaac Sim to teach a robot. LeCun's V-JEPA learns from watching real video and needs far less robot-specific data. Both need governance, neither provides it."

Phase 3: The hardware · "What should I actually buy?"

Question this phase answers: Which hardware makes sense for your learning goals, and what can each piece do?

Time estimate: 2-3 hours of reading, then purchasing decisions.

3.1 Jetson Orin Nano Super

  • Read the product page.
  • Read the getting started guide.
  • Understand what is included in the dev kit vs what you need to add (power supply, storage, display).
  • Compare to a general-purpose laptop: what can Jetson do that consumer hardware cannot? (Dedicated 24/7 operation, CUDA, multiple camera inputs, GPIO for sensors and actuators.)

3.2 Robotic arms

  • Read the SO-ARM100 / 101 GitHub: assembly guide, calibration, LeRobot tutorial.
  • Understand leader-follower imitation learning: how does the teaching process work?
  • Compare vendors: PartaBot (assembled, US-based) vs Seeed Studio (kit) vs Waveshare.
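
The leader-follower teaching process can be sketched in a few lines. This is a simulation under stated assumptions, not LeRobot's API: `record_episode`, the two-joint trajectory, and the per-tick speed limit are hypothetical, and a real recording would also store camera frames. The idea it illustrates is that a person moves the leader arm, the follower is commanded to mirror it, and each tick is saved as an (observation, action) pair that an imitation-learning policy can later be trained on.

```python
def record_episode(leader_trajectory, max_step=5.0):
    """Mirror the leader on a simulated follower and log (observation, action) pairs.

    leader_trajectory: list of joint-angle lists (degrees), one per control tick.
    max_step: largest change per joint per tick, a stand-in for motor speed limits.
    """
    follower = [float(angle) for angle in leader_trajectory[0]]
    episode = []
    for leader_pose in leader_trajectory:
        observation = list(follower)                    # follower state before acting
        action = [float(angle) for angle in leader_pose]  # target = current leader pose
        episode.append({"observation": observation, "action": action})
        # The follower moves toward the target, limited to max_step per joint.
        follower = [
            current + max(-max_step, min(max_step, target - current))
            for current, target in zip(follower, action)
        ]
    return episode, follower


# A hand-made two-joint demonstration: swing joint 0 to 20 degrees, lift joint 1 to 5.
demo = [[0, 0], [10, 0], [20, 5], [20, 5], [20, 5]]
episode, final_pose = record_episode(demo)
```

The follower lags the leader because of the speed limit, so the recorded observation and action differ at each tick. That gap is exactly what a trained policy learns to close: given where the arm is, output where it should go next.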

3.3 Reachy Mini

3.4 Other hardware

  • Review the Hardware Guide for the full sensors / compute / actuators / structure breakdown with verified products and prices.
  • JetBot / JetRacer: assess whether autonomous navigation is a priority.
  • Vision projects: assess whether multi-camera detection is interesting.
  • Voice assistant: assess whether offline conversational AI is a goal.

Phase 3 exit criteria
You have a clear, informed hardware purchase list with reasons for each item. This feeds directly into the Build Plan.

Phase 4: Build and contribute · "How do I get hands dirty?"

Question this phase answers: What do you build, in what order, and how do you document the learning?

This phase overlaps with the Build Plan. Defer to that doc for hardware sequencing and project specifics.

4.1 First build (after hardware arrives)

  • Set up Jetson Orin Nano with JetPack 6.2.
  • Run through Jetson AI Lab "Getting Started" tutorials.
  • Run a local LLM on Jetson and compare to running the same model on a general-purpose laptop.
  • Set up a camera and run real-time object detection (YOLO26 or similar).
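
For the Jetson-versus-laptop comparison, it helps to measure the same thing the same way on both machines. The sketch below is a minimal timing harness, assuming you can wrap whatever local LLM runtime you use in a function that takes a prompt and returns a list of generated tokens. `measure_throughput` and `fake_generate` are hypothetical names, and `fake_generate` is only a placeholder so the harness runs on its own.

```python
import time


def measure_throughput(generate, prompt):
    """Time one generation call and report tokens per second."""
    start = time.perf_counter()
    tokens = generate(prompt)
    # Guard against a zero reading from a very coarse clock.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "tokens": len(tokens),
        "seconds": elapsed,
        "tokens_per_second": len(tokens) / elapsed,
    }


def fake_generate(prompt):
    """Placeholder model that returns 50 dummy tokens. Swap in a real runtime call."""
    return ["tok"] * 50


stats = measure_throughput(fake_generate, "Describe the scene in one sentence.")
```

Run it with the same model, quantization, and prompt on each device, and repeat a few times, since the first call often includes model loading and warm-up.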

4.2 LeRobot + robotic arm

  • Set up SO-ARM101 with LeRobot.
  • Complete the LeRobot tutorial: data collection, training, deployment.
  • Teach the arm a simple task via imitation learning.
  • Document the experience (this is portfolio content).

4.3 Isaac Sim exploration

  • Install Isaac Sim (runs on a workstation or cloud instance).
  • Load a sample robot environment.
  • Understand the simulation-to-deployment pipeline.
  • Attempt a simple sim-to-real transfer with the robotic arm.
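
One common technique for narrowing the sim-to-real gap is domain randomization: vary the simulated physics and visuals on every training episode so the policy cannot overfit to one idealized world. The sketch below shows only the sampling step, in plain Python rather than Isaac Sim's own API. The parameter names and ranges are made up for illustration; real values depend on the robot and the task.

```python
import random

# Hypothetical (low, high) ranges; real values depend on the robot and task.
RANDOMIZATION_RANGES = {
    "friction": (0.4, 1.2),
    "payload_kg": (0.0, 0.3),
    "motor_latency_s": (0.0, 0.05),
    "light_intensity": (0.5, 1.5),
}


def sample_sim_params(rng):
    """Draw one set of simulator parameters uniformly from the ranges above."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in RANDOMIZATION_RANGES.items()
    }


# One parameter set per training episode; the seed makes the run reproducible.
rng = random.Random(42)
episodes = [sample_sim_params(rng) for _ in range(100)]
```

Each sampled dictionary would be applied to the simulator before an episode starts. If the real arm's friction and latency fall inside the sampled ranges, the real world looks to the policy like one more variation it has already seen.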

4.4 Documentation and sharing

  • Document each project as you go.
  • Consider public posts about the learning journey (the learning, not just the build).
  • Contribute back: file issues, fix docs, share datasets.

Phase 4 exit criteria
You have built something physical, trained it with AI, and documented the experience. You can speak credibly about edge AI, robotics, and the NVIDIA stack from first-hand experience.

Cross-references

Doc           Purpose
Landscape     The full map of Physical AI
Governance    The rules: regulators, standards, the governance gaps
Hardware      Sensors, compute, actuators, structure, with verified prices
Software      Runtime, perception, decision, world models, frameworks
Systems       What gets built: arms, AMRs, vehicles, humanoids, drones, surgical
Build Plan    Sequenced hardware purchases and projects