Paper in WACV (2015): “Semantic Instance Labeling Leveraging Hierarchical Segmentation”

January 6th, 2015 Irfan Essa Posted in Computer Vision, Henrik Christensen, PAMI/ICCV/CVPR/ECCV, Papers, Robotics, Steven Hickson


  • S. Hickson, I. Essa, and H. Christensen (2015), “Semantic Instance Labeling Leveraging Hierarchical Segmentation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{2015-Hickson-SILLHS,
      author    = {Steven Hickson and Irfan Essa and Henrik Christensen},
      booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      doi       = {10.1109/WACV.2015.147},
      month     = {January},
      pdf       = {},
      publisher = {IEEE Computer Society},
      title     = {Semantic Instance Labeling Leveraging Hierarchical Segmentation},
      year      = {2015}
    }


Most approaches to indoor RGBD semantic labeling train a classifier on pixels or superpixels. In this paper, we instead implement a higher-level segmentation, using a hierarchy of superpixels to obtain better segments for training our classifier. By focusing on meaningful segments that conform more directly to objects, regardless of size, we train a random forest of decision trees as a classifier using simple features: 3D size, LAB color histogram, width, height, and shape as captured by a histogram of surface normals. We evaluate our method on the NYU V2 depth dataset, a challenging dataset of cluttered indoor environments. Our experiments show that our method achieves state-of-the-art results both on the general semantic labeling introduced with the dataset (floor, structure, furniture, and objects) and on a more object-specific semantic labeling. We also show that training a classifier on a segmentation from a hierarchy of superpixels yields better results than training directly on superpixels, patches, or pixels as in previous work.
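As a rough illustration of the pipeline the abstract describes, the following Python sketch trains a random forest on per-segment features. The segmentation step and the `segment_features` helper are hypothetical stand-ins, and the synthetic `train_segments` data is invented here; only the feature list (3D size, LAB color histogram, width, height, and a histogram of surface normals) and the choice of a random forest come from the paper's description.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def segment_features(segment):
        # Concatenate the simple per-segment features named in the abstract.
        lab_hist = np.histogramdd(segment["lab"], bins=(4, 4, 4))[0].ravel()
        normal_hist = np.histogramdd(segment["normals"], bins=(4, 4, 4))[0].ravel()
        geometry = np.array([segment["size_3d"], segment["width"], segment["height"]])
        return np.concatenate([
            geometry,
            lab_hist / (lab_hist.sum() + 1e-9),
            normal_hist / (normal_hist.sum() + 1e-9),
        ])

    # Stand-in for segments produced by the hierarchical segmentation: each
    # carries per-pixel LAB values, surface normals, geometry, and a label.
    rng = np.random.default_rng(0)
    train_segments = [
        {"lab": rng.random((50, 3)), "normals": rng.random((50, 3)),
         "size_3d": rng.random(), "width": 10, "height": 12,
         "label": rng.choice(["floor", "structure", "furniture", "object"])}
        for _ in range(20)
    ]

    X = np.stack([segment_features(s) for s in train_segments])
    y = np.array([s["label"] for s in train_segments])
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)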


Paper in IROS 2012: “Linguistic Transfer of Human Assembly Tasks to Robots”

October 7th, 2012 Irfan Essa Posted in 0205507, Activity Recognition, IROS/ICRA, Mike Stilman, Robotics


  • N. Dantam, I. Essa, and M. Stilman (2012), “Linguistic Transfer of Human Assembly Tasks to Robots,” in Proceedings of Intelligent Robots and Systems (IROS), 2012. [PDF] [DOI] [BIBTEX]
    @InProceedings{2012-Dantam-LTHATR,
      author    = {N. Dantam and I. Essa and M. Stilman},
      booktitle = {Proceedings of Intelligent Robots and Systems (IROS)},
      doi       = {10.1109/IROS.2012.6385749},
      pdf       = {},
      title     = {Linguistic Transfer of Human Assembly Tasks to Robots},
      year      = {2012}
    }


We demonstrate the automatic transfer of an assembly task from human to robot. This work extends prior efforts showing the utility of linguistic models in verifiable robot control policies by performing real visual analysis of human demonstrations to automatically extract a policy for the task. The method tokenizes each human demonstration into a sequence of object-connection symbols, then transforms the set of sequences from all demonstrations into an automaton that represents the task language for assembling the desired object. Finally, we combine this assembly automaton with a kinematic model of a robot arm to reproduce the demonstrated task.
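As a hedged sketch of the automaton-construction step, the Python snippet below merges tokenized demonstrations into a prefix-tree acceptor. The connection symbols and the `build_acceptor` construction are illustrative assumptions; the paper's actual induction method may differ.

    def build_acceptor(demonstrations):
        # Merge token sequences into a prefix-tree automaton.
        # States are integers; transitions[state][symbol] -> next state.
        transitions = [{}]            # state 0 is the start state
        accepting = set()
        for demo in demonstrations:
            state = 0
            for symbol in demo:
                if symbol not in transitions[state]:
                    transitions.append({})
                    transitions[state][symbol] = len(transitions) - 1
                state = transitions[state][symbol]
            accepting.add(state)      # each demonstration ends in an accepting state
        return transitions, accepting

    def accepts(transitions, accepting, sequence):
        # Run the automaton over a candidate assembly sequence.
        state = 0
        for symbol in sequence:
            if symbol not in transitions[state]:
                return False
            state = transitions[state][symbol]
        return state in accepting

    # Two demonstrations that connect parts A-B and B-C in different orders.
    demos = [["connect(A,B)", "connect(B,C)"],
             ["connect(B,C)", "connect(A,B)"]]
    automaton = build_acceptor(demos)
    assert accepts(*automaton, ["connect(A,B)", "connect(B,C)"])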

Presented at: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2012), October 7-12, 2012, Vilamoura, Algarve, Portugal.



Funding (2011) NSF (1146352) “EAGER: Linguistic Task Transfer for Humans and Cyber Systems”

September 1st, 2011 Irfan Essa Posted in Activity Recognition, Mike Stilman, NSF, Robotics

EAGER: Linguistic Task Transfer for Humans and Cyber Systems (Mike Stilman, Irfan Essa) NSF/RI

This project investigates formal languages as a general methodology for task transfer between distinct cyber-physical systems, such as humans and robots. It aims to expand the science of cyber-physical systems by developing Motion Grammars that enable such transfer.

Formal languages are tools for encoding, describing, and transferring structured knowledge; in natural language, the transfer step is called communication. Similarly, we will develop a formal language through which arbitrary cyber-physical systems communicate tasks via structured actions. This investigation of Motion Grammars will contribute to the science of human cognition and the engineering of cyber-physical algorithms. By observing human activities during manipulation, we will develop a novel class of hybrid control algorithms based on linguistic representations of task execution. These algorithms will broaden the capabilities of man-made systems and provide the infrastructure for motion transfer between humans, robots, and broader systems in a generic context. Furthermore, the representation in a rigorous grammatical context will enable formal verification and validation in future work.
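To make the idea concrete, here is a toy Python sketch of what a grammar over manipulation tasks might look like, with nonterminals expanding into sequences of primitive actions. The production rules below are invented for illustration; the project's actual Motion Grammars are formal hybrid-control constructs, not this toy generator.

    import random

    # Invented toy grammar: uppercase symbols are nonterminals, lowercase
    # strings are primitive motions.
    GRAMMAR = {
        "TASK":   [["FETCH", "ATTACH"]],
        "FETCH":  [["locate", "grasp", "lift"]],
        "ATTACH": [["align", "insert"], ["align", "screw"]],
    }

    def derive(symbol):
        # Expand a symbol left-to-right into a sequence of primitive motions.
        if symbol not in GRAMMAR:          # terminal: a primitive motion
            return [symbol]
        production = random.choice(GRAMMAR[symbol])
        return [action for part in production for action in derive(part)]

    print(derive("TASK"))  # e.g. ['locate', 'grasp', 'lift', 'align', 'insert']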
Broader Impacts: The proposed research has direct applications to new solutions for manufacturing, medical treatments such as surgery, logistics and food processing. In turn, each of these areas has a significant impact on the efficiency and convenience of our daily lives. The PIs serve as coordinators of graduate/undergraduate programs and mentors to community schools. In order to guarantee that women and minorities have a significant role in the research, the PIs will annually invite K-12 students from Atlanta schools with primarily African American populations to the laboratories. One-day robot classes will be conducted that engage students in the excitement of hands-on science by interactively using lab equipment to transfer their manipulation skills to a robot arm.

Via NSF Award #1146352 – EAGER: Linguistic Task Transfer for Humans and Cyber Systems.
