-
LEAP: LLM-Generation of Egocentric Action Programs
We introduce LEAP (illustrated in Figure 1), a novel method for generating video-grounded action programs through use of a Large Language Model (LLM). These action programs represent the motoric, perceptual,...
-
Mid-Vision Feedback for Convolutional Neural Networks
Feedback plays a prominent role in biological vision, where perception is modulated based on agents’ continuous interactions with the world, and evolving expectations and world model. We introduce a novel...
-
Therbligs in Action: Video Understanding through Motion Primitives
Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos. University of Maryland, College Park, MD 20742, USA. In this paper we introduce a rule-based,...
-
Egocentric Object Manipulation Graphs
We introduce Egocentric Object Manipulation Graphs (Ego-OMG), a novel representation for activity modeling and anticipation of near future actions integrating three components: 1) semantic temporal structure of activities, 2) short-term...
-
Goal-Driven Autonomy in Dynamic Environments
Dynamic environments are complex and change in often unexpected ways. Given such an environment, many autonomous agents have difficulty when the world does not cooperate with design assumptions. We present an approach...
-
Data-Driven Goal Generation for Integrated Cognitive Systems
We describe our Meta-cognitive, Integrated, Dual-Cycle Architecture (MIDCA), whose purpose is to provide agents with a greater capacity for acting in an open world and dealing with unexpected events. We...
-
The integration of cognitive and metacognitive processes with data-driven and knowledge-rich structures
This paper examines computational relationships between mind and body and distinguishes thinking about the world from thinking about thinking. The discussion is grounded within the framework of a preliminary computational...
-
Forecasting action through contact representations from first person video
Human actions involving hand manipulations are structured according to the making and breaking of hand-object contact, and human visual understanding of action is reliant on anticipation of contact as is...