Object-oriented dynamics learning through multi-level abstraction

  • Guangxiang Zhu
  • , Jianhao Wang
  • , Zhizhou Ren
  • , Zichuan Lin
  • , Chongjie Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

4 Scopus citations

Abstract

Object-based approaches for learning action-conditioned dynamics has demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization over novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations.

Original languageEnglish
Title of host publicationAAAI 2020 - 34th AAAI Conference on Artificial Intelligence
PublisherAAAI press
Pages6989-6998
Number of pages10
ISBN (Electronic)9781577358350
StatePublished - 2020
Event34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: Feb 7 2020Feb 12 2020

Publication series

NameAAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/TerritoryUnited States
CityNew York
Period02/7/2002/12/20

Fingerprint

Dive into the research topics of 'Object-oriented dynamics learning through multi-level abstraction'. Together they form a unique fingerprint.

Cite this