Embodied-AI for Aerial Robots: What do we need for full autonomy?

Organizers and Lecturers

  • Nitin J. Sanket
    Worcester Polytechnic Institute
  • Guanrui Li
    Worcester Polytechnic Institute
  • Sihao Sun
    Delft University of Technology
  • Melissa Greeff
    Queen’s University

Summary
Aerial robots present a peculiar challenge for autonomy: they demand high-rate processing despite severely limited computation and sensing. They push the limits of what is possible with onboard sensing and computation in robotics. Solving complex autonomy tasks on aerial robots therefore requires ingenuity and creativity, breaking away from the approaches used on other types of robots with substantially more computing and sensing capability. Furthermore, with the advent of the Artificial Intelligence (AI) revolution in robotics, the utilization of embodiment (knowledge of self) becomes pivotal to the success of AI-based approaches on resource-constrained aerial robots.

Data for Generalized Embodied AI Models:

Creating generalized embodied AI models for aerial robots requires data that reflects the complexities and challenges unique to this domain. Unlike ground-based robots or manipulators, aerial robots usually navigate in three-dimensional, often unstructured environments such as forests, deserts, or mountainous areas. These conditions demand that data collection for aerial robots must extend beyond traditional sources, which are often constrained to urban or controlled settings designed for cars or manipulators. Furthermore, current open-source datasets are predominantly designed for other platforms, making the transfer of knowledge to aerial systems inefficient and incomplete.

The proposed workshop seeks to address this gap by fostering global, cross-institutional collaboration aimed at creating large-scale, multimodal datasets that capture the unique operational environments of aerial robots. These datasets should span a range of conditions, including varying weather, altitudes, and geographical terrains, to ensure models can generalize effectively to real-world tasks. Collecting data from dynamic, unstructured environments such as disaster zones, agricultural fields, and remote wilderness areas will be critical for improving model robustness.

In addition to real-world data collection, the workshop will emphasize the complementary role of high-fidelity simulations in generating diverse training data. Simulation platforms can provide scalable, safe, and cost-effective means of testing aerial robots in complex environments that may be difficult or dangerous to access in the real world. By focusing on strategies such as data augmentation and transfer learning, the workshop aims to improve model flexibility and adaptability across a range of deployment scenarios.
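To make the augmentation strategy mentioned above concrete, here is a minimal, illustrative sketch (not drawn from any specific dataset pipeline) of how a single camera frame might be expanded into additional training samples via random flips and brightness jitter. The function names and the 0–255 grayscale representation are assumptions for illustration only.

```python
import random

def horizontal_flip(image):
    """Mirror an image (a list of rows of pixel values) left to right."""
    return [row[::-1] for row in image]

def brightness_jitter(image, max_delta=20, rng=None):
    """Shift all pixel values by one random offset, clamped to [0, 255]."""
    rng = rng or random.Random()
    delta = rng.randint(-max_delta, max_delta)
    return [[min(255, max(0, p + delta)) for p in row] for row in image]

def augment(image, rng=None):
    """Compose a random flip with brightness jitter to yield one new sample."""
    rng = rng or random.Random()
    out = image
    if rng.random() < 0.5:  # flip roughly half the time
        out = horizontal_flip(out)
    return brightness_jitter(out, rng=rng)

# One real frame can yield many augmented variants for training.
frame = [[0, 128, 255], [10, 20, 30]]
variants = [augment(frame, rng=random.Random(seed)) for seed in range(4)]
```

Real pipelines would add rotations, crops, and photometric changes tuned to flight imagery (e.g., altitude-dependent scale changes), but the compositional structure is the same.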

Efficient Neural Networks for Aerial Systems:

Designing neural network models for tiny aerial robots (< 1 kg All-Up Weight) under reduced Size, Weight, Area and Power (SWAP) constraints requires a delicate balance between computational efficiency and high-level intelligence. These robots operate under strict computational and power budgets, making it essential to design network architectures that process data in real time with minimal latency and low energy consumption. In the proposed workshop, we will explore and discuss recent developments in architectures such as Spiking Neural Networks (SNNs), MobileSAM, and DINOv2, which can be optimized for deployment on resource-constrained aerial robot platforms.
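One standard trick behind SWAP-aware architectures (used, for example, in the MobileNet family that MobileSAM builds on) is replacing full convolutions with depthwise-separable ones. The short sketch below only counts parameters to show the savings; the layer sizes are illustrative assumptions, not taken from any particular model.

```python
def conv_params(c_in, c_out, k):
    """Parameters in a standard k x k 2-D convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution
    (bias omitted): c_in * k * k depthwise weights + c_in * c_out pointwise."""
    return c_in * k * k + c_in * c_out

# Illustrative layer: 64 -> 128 channels with a 3x3 kernel.
standard = conv_params(64, 128, 3)                 # 64 * 128 * 9 = 73728
separable = depthwise_separable_params(64, 128, 3) # 576 + 8192   = 8768
ratio = standard / separable                       # roughly 8x fewer weights
```

Fewer weights mean fewer multiply-accumulates per frame, which translates directly into lower latency and energy draw on an embedded flight computer.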

The co-design of hardware and AI is equally important to this topic. By leveraging hardware designed specifically for AI acceleration, such as edge computing modules with integrated CPUs, Graphics Processing Units (GPUs), Neural Processing Units (NPUs), and neuromorphic chips, aerial robots can achieve higher performance on demanding tasks like obstacle avoidance and decision-making while maintaining low energy consumption.

In addition to architectural and hardware considerations, the selection of input and output modalities is a key factor in designing efficient neural networks for aerial systems. Multimodal inputs, such as visual data, inertial measurements, and proprioception, are essential for enabling robust navigation and autonomy in dynamic and unpredictable environments. Efficient processing of these inputs supports critical tasks, including obstacle avoidance, human-object interaction, and mission-specific autonomy. The choice of output modalities should be context-driven, emphasizing fast and accurate decision-making to meet the specific needs of the task at hand.

Furthermore, embedding an awareness of the robot’s physical state and limitations into neural network models can significantly improve resilience in challenging conditions, such as sensory failures or environmental disruptions. This embodied intelligence is vital for ensuring real-time, precise decision-making in mission-critical applications such as search and rescue, environmental monitoring, and disaster response. By integrating these advanced neural network designs with hardware optimization and multimodal processing, we can push the boundaries of autonomous aerial systems.

Generalized World Intelligence for Flexible Adaptation:

Embodied AI requires not only advanced perception but also the ability to abstract and generalize world models for real-world adaptability. This capability is especially critical for tiny aerial robots to navigate dynamic and unpredictable environments. These robots need minimal yet highly flexible world models that allow for rapid adaptation to new or changing conditions without requiring significant computational resources. By developing such models, we enable aerial robots to generalize across a wide spectrum of environments, facilitating seamless operations in diverse, real-world scenarios.

In this workshop, we will examine the methods and frameworks required to build these generalized world intelligence systems, with a focus on their role in enhancing the adaptability of tiny aerial robots to novel or zero-shot environments. These models allow robots to switch between tasks and domains without extensive retraining, thereby improving efficiency and scalability. Moreover, they must be capable of handling uncertainty, such as incomplete or noisy sensory data, which is common in complex, real-world environments.

Additionally, the workshop will explore the compositionality of foundational models—how simpler models can be combined and adapted to solve complex, multi-step tasks in real-world settings. By addressing these challenges, we aim to push the boundaries of embodied AI, enabling aerial robots to operate autonomously across a wider array of applications, including search and rescue, environmental monitoring, and infrastructure inspection.

This workshop focuses on the emerging need for onboard embodied AI in tiny aerial robots (< 1 kg All-Up Weight), particularly those with tight constraints on computational resources. The expected impact is to drive the development of more efficient, scalable AI models that can power the next generation of autonomous aerial robots. This work will have broad implications for applications such as search and rescue, plant pollination, environmental monitoring, and smart infrastructure inspection.

Workshop Objectives
The objective of the workshop is to encourage discussion among the participants to identify focus areas for embodied aerial autonomy in the near and far future. Our invited talks will feature subject-matter experts from various sub-fields and from diverse cultural and technical backgrounds, bringing different perspectives on the common problems in embodied AI for aerial robots. This will ensure that we provide and collate the highest-quality content and enable both young and seasoned researchers to think outside the box, revisiting classical problems in a new light with the latest toolkits and frameworks. Our panel discussions will involve questions that probe the curiosity of the audience and take them on a thrilling philosophical adventure through research questions for the future.

Key Topics

  1. High-fidelity simulation for data generation
  2. Sim2real transfer
  3. Reinforcement learning for aerial robots
  4. Multi-modal learning for aerial autonomy
  5. Embodied AI models for aerial robots
  6. Size, Weight, Area and Power (SWAP)-aware design
  7. Foundational models for autonomy under SWAP constraints
  8. Bio-inspired navigation
  9. Efficient neural network designs
  10. Spiking neural networks
  11. Generalized world intelligence for flexible adaptation
  12. Online adaptation learning
  13. Zero-shot generalization for navigation, action and recognition
  14. Evaluation and benchmarking methods for aerial robot autonomy
  15. Light-weight sensor fusion for autonomy, SLAM and odometry

Workshop Format
The workshop will consist of invited talks and panel discussions.

Target Audience
This workshop is suitable for young researchers, graduate students conducting research in related areas, scientists, and engineers, as well as developers and manufacturers of autonomous UAVs.

Tentative Outcome
It is expected that participants will obtain a deep understanding of the embodied AI challenges, tools, and methods that must be addressed and overcome to advance autonomy. The discussions will unravel and set the stage for short- and long-term research goals for the community at large.