TY - JOUR
T1 - Learning visual object categories for robot affordance prediction
AU - Sun, Jie
AU - Moore, Joshua L.
AU - Bobick, Aaron
AU - Rehg, James M.
PY - 2010
Y1 - 2010
N2 - A fundamental requirement of any autonomous robot system is the ability to predict the affordances of its environment. The set of affordances defines the actions that are available to the agent given the robot's context. A standard approach to affordance learning is direct perception, which learns direct mappings from sensor measurements to affordance labels. For example, a robot designed for cross-country navigation could map stereo depth information and image features directly into predictions about the traversability of terrain regions. While this approach can succeed for a small number of affordances, it does not scale well as the number of affordances increases. In this paper, we show that visual object categories can be used as an intermediate representation that makes the affordance learning problem scalable. We develop a probabilistic graphical model, which we call the Category-Affordance (CA) model, that describes the relationships between object categories, affordances, and appearance. This model casts visual object categorization as an intermediate inference step in affordance prediction. We describe several novel affordance learning and training strategies that are supported by our new model. Experimental results with indoor mobile robots evaluate these different strategies and demonstrate the advantages of the CA model in affordance learning, especially when learning from limited-size data sets.
AB - A fundamental requirement of any autonomous robot system is the ability to predict the affordances of its environment. The set of affordances defines the actions that are available to the agent given the robot's context. A standard approach to affordance learning is direct perception, which learns direct mappings from sensor measurements to affordance labels. For example, a robot designed for cross-country navigation could map stereo depth information and image features directly into predictions about the traversability of terrain regions. While this approach can succeed for a small number of affordances, it does not scale well as the number of affordances increases. In this paper, we show that visual object categories can be used as an intermediate representation that makes the affordance learning problem scalable. We develop a probabilistic graphical model, which we call the Category-Affordance (CA) model, that describes the relationships between object categories, affordances, and appearance. This model casts visual object categorization as an intermediate inference step in affordance prediction. We describe several novel affordance learning and training strategies that are supported by our new model. Experimental results with indoor mobile robots evaluate these different strategies and demonstrate the advantages of the CA model in affordance learning, especially when learning from limited-size data sets.
UR - https://www.scopus.com/pages/publications/77949346006
U2 - 10.1177/0278364909356602
DO - 10.1177/0278364909356602
M3 - Article
AN - SCOPUS:77949346006
SN - 0278-3649
VL - 29
SP - 174
EP - 197
JO - International Journal of Robotics Research
JF - International Journal of Robotics Research
IS - 2-3
ER -