Perception Pipelines

SIG Coordinator: Patrick Mihelich

Mailing list:

Topics: image_pipeline, ROS integration with OpenCV (cv_bridge) and PCL (pcl_ros), image_transport, prototyping with Ecto


  • Patrick Mihelich
  • Chad Rockey
  • Jack O'Quin
  • Ethan Rublee
  • Michael Dixon
  • Vincent Rabaud
  • Dejan Pangercic
  • Troy Straszheim
  • Kurt Konolige

Focus areas for Fuerte


Owner: Troy, Ethan


Ecto is an in-development framework for dataflow programming. With Ecto, you specify a directed graph of computation, where the outputs of one processing node hook into the inputs of other processing nodes. Once the graph is set up and you start feeding it data, Ecto handles all the synchronization and scheduling.
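To make the dataflow idea concrete, here is a minimal sketch in plain Python. It does not use Ecto's API; the Graph class and its add_cell, schedule and execute methods are hypothetical names invented for illustration. It only shows the general pattern: declare cells and their connections, then let a scheduler work out the execution order.

```python
from collections import deque


class Graph:
    """Toy dataflow graph (hypothetical, not Ecto): cells are plain callables."""

    def __init__(self):
        self.cells = {}   # cell name -> callable
        self.inputs = {}  # cell name -> upstream cell names, in argument order

    def add_cell(self, name, func, inputs=()):
        self.cells[name] = func
        self.inputs[name] = list(inputs)

    def schedule(self):
        """Order cells so each one runs after all of its inputs (Kahn's algorithm)."""
        pending = {name: len(ups) for name, ups in self.inputs.items()}
        downstream = {name: [] for name in self.cells}
        for name, ups in self.inputs.items():
            for up in ups:
                downstream[up].append(name)
        ready = deque(name for name, count in pending.items() if count == 0)
        order = []
        while ready:
            name = ready.popleft()
            order.append(name)
            for down in downstream[name]:
                pending[down] -= 1
                if pending[down] == 0:
                    ready.append(down)
        if len(order) != len(self.cells):
            raise ValueError("graph has a cycle")
        return order

    def execute(self):
        """Run every cell once, passing each the results of its upstream cells."""
        results = {}
        for name in self.schedule():
            args = [results[up] for up in self.inputs[name]]
            results[name] = self.cells[name](*args)
        return results


graph = Graph()
graph.add_cell("source", lambda: [3, 1, 2])
graph.add_cell("sort", sorted, inputs=["source"])
graph.add_cell("largest", max, inputs=["source"])
graph.add_cell("report", lambda s, m: (s, m), inputs=["sort", "largest"])
results = graph.execute()
```

Because the scheduler sees the whole graph, the user code never has to say when "report" may run; that is the kind of synchronization work a framework can take over.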

This is a slightly different abstraction from ROS's pub-sub model; in fact, image_pipeline uses nodelets for much the same purpose. Unfortunately, we've found that nodelets are a clunky solution for composing perception processing pipelines:

  • You write quite a bit of boilerplate to do synchronization, convert ROS messages to OpenCV/PCL structures, etc.
  • There's potentially a lot of efficiency to be gained from having a high-level scheduler with a view of the whole computation graph.
  • Writing launch files for nodelet pipelines is doable, but it's more verbose and error-prone than necessary.
  • Introspection is lacking, e.g. rxgraph doesn't work with nodelets.




  • Stable standalone release of Ecto, which only has dependencies on cmake, boost and python.
  • Core set of perception Ecto "Cells" (processing nodes) depending only on Eigen, OpenCV and PCL.
  • Experimental port of the ROS image_pipeline to Ecto. (Kurt has started on this)

  • Establish conventions for ROS packages that use Ecto graphs/cells.

Depth image processing

Owner: Patrick


With the wide availability of the Kinect and similar sensors, working with depth images has become a hot use case over the last year. We have several nodelets for working with depth images in depth_image_proc, currently part of the openni_kinect stack. Depth images themselves are a new concept in ROS, and need to be standardized. Previously we worked with disparity images and point clouds only.


More ideas:

  • Depth image compression
    • PNG seems to work for lossless.
    • Tri trees promising for lossy according to Ethan.
  • Make it easier to work with bags (recreate point clouds).
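As a rough illustration of recreating a point cloud from a stored depth image, the sketch below back-projects pixels through a pinhole camera model. The encoding is an assumption made for the example (row-major unsigned integers in millimetres, with 0 meaning "no reading"), since the conventions are exactly what still needs standardizing; depth_to_points and its parameters are hypothetical names.

```python
def depth_to_points(depth_mm, width, height, fx, fy, cx, cy):
    """Back-project a row-major depth image to a list of (x, y, z) points in metres.

    Assumed encoding: integer millimetres, 0 = no reading.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    """
    points = []
    for v in range(height):
        for u in range(width):
            d = depth_mm[v * width + u]
            if d == 0:
                continue  # skip pixels with no depth measurement
            z = d / 1000.0
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# 2x2 example image with one invalid pixel.
points = depth_to_points([1000, 0, 2000, 2000], width=2, height=2,
                         fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

Only the depth image and the camera intrinsics are needed, which is why a bag that stores depth images plus camera info is enough to regenerate clouds offline.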

PCL integration

Owner: Michael


There is currently no ideal way to publish point cloud data from 3D sensor drivers. For maximum efficiency with nodelets using PCL, it's best to publish as the PCL-native pcl::PointCloud<T> type, but that requires pulling in all of PCL. sensor_msgs/PointCloud2 is the low-dependency option, but it is harder to use and requires a costly conversion for use with PCL.
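To show why the packed message is awkward to consume directly, here is a small sketch that pulls x, y, z out of a byte buffer laid out the way a PointCloud2 data array is: fixed-size records of point_step bytes with each field at a byte offset. The function name and the default offsets (0, 4, 8) are assumptions for the example; real code should read the offsets and datatypes from the message's fields list rather than hard-code them.

```python
import struct


def unpack_xyz(data, point_step, x_offset=0, y_offset=4, z_offset=8,
               big_endian=False):
    """Extract (x, y, z) float32 triples from a packed point buffer.

    Offsets are assumed defaults; a real consumer would look them up
    in the message's field descriptions.
    """
    fmt = (">" if big_endian else "<") + "f"
    points = []
    for base in range(0, len(data) - point_step + 1, point_step):
        x = struct.unpack_from(fmt, data, base + x_offset)[0]
        y = struct.unpack_from(fmt, data, base + y_offset)[0]
        z = struct.unpack_from(fmt, data, base + z_offset)[0]
        points.append((x, y, z))
    return points


# Two points, 16 bytes each: three float32 values followed by 4 padding bytes.
buf = struct.pack("<fff4x", 1.0, 2.0, 3.0) + struct.pack("<fff4x", 0.5, -1.0, 4.0)
cloud = unpack_xyz(buf, point_step=16)
```

Every consumer has to do this kind of per-point decoding (or an equivalent conversion), whereas a typed point cloud can be used as-is.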

Solution: allow drivers to depend on a minimal core of PCL, containing little more than the point cloud type and ROS glue.

PCL 1.1 now has standalone Debian packages. Due to dependency nastiness (PCL 1.1 has its own copies of some ROS message types), we aren't currently able to reuse the system install in ROS, and have to do a separate source install. The ROS Core SIG plans (high priority) to pull out a separate stand-alone ros_msgs library, allowing code to use ROS data structures without being a ROS package. If that lands soon enough, we may be able to use the system install of PCL in ROS.

PCL 2.0 will bring new integration challenges, but appears far enough away to be out of scope for Fuerte.


  • Use system install of PCL 1.x (whatever the latest release is), if possible.
    • Blocked on ROS Core SIG
    • Requires PCL 1.x debians to use common sensor_msgs dependency instead of defining its own
  • New stack point_cloud_common (by analogy with image_common) containing packages:
    • pcl_common
      • Installs only libpcl_common and a few core headers
    • pcl_bridge
      • By analogy with cv_bridge
      • Contains ROS glue headers pcl/ros/conversions.h, pcl_ros/point_cloud.h

  • perception_pcl then depends on point_cloud_common

    • pcl depends on pcl_common, installs all the other libraries

    • pcl_ros depends on pcl_bridge, builds the PCL nodelets etc.
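The intent of this layering can be checked with a small sketch that computes transitive dependencies. The package names come from the list above, but several edges are assumptions made for the example: that pcl_bridge depends on pcl_common, that pcl_ros also depends on pcl, and the example_sensor_driver package, which is hypothetical.

```python
# Direct dependencies per package. Edges marked "assumed" are not stated
# in the plan and are included only to make the example complete.
DEPENDS = {
    "pcl_common": [],
    "pcl_bridge": ["pcl_common"],             # assumed
    "pcl": ["pcl_common"],
    "pcl_ros": ["pcl_bridge", "pcl"],         # dependency on pcl is assumed
    "example_sensor_driver": ["pcl_bridge"],  # hypothetical driver package
}


def transitive_deps(package, depends=DEPENDS):
    """Return every package reachable from `package` through dependency edges."""
    seen = set()
    stack = list(depends[package])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends[dep])
    return seen


driver_deps = transitive_deps("example_sensor_driver")
```

Under these assumptions a driver pulls in only pcl_bridge and pcl_common, never the full pcl package, which is the point of the split.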

Video interop

Owner: ??


We need better interoperability between ROS concepts (bags, image topics) and more standard representations (video file formats, streaming video). Some ad hoc solutions in this space already exist:

  • ogg_saver writes the Theora image_transport stream to an Ogg Theora file.

  • We've got some code lying around to expose a ROS image stream as a Video4Linux device (e.g. for use with Skype), but it hasn't been released.

  • There are instructions for converting bags to video files, but this could be easier.

  • The TurtleBot folks have some way of funneling a ROS image topic to Android tablets as streaming video.

  • mjpeg_server exposes ROS image topics to a browser via HTTP.

    • Ethan has another version of this built on Boost.Asio.
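For reference, streaming JPEG frames to a browser over HTTP is commonly done with a multipart/x-mixed-replace response, where each part is one JPEG image. The sketch below shows only that framing; it is not taken from mjpeg_server, and the function names and the boundary string are arbitrary choices for the example.

```python
BOUNDARY = b"frame"  # arbitrary boundary string chosen for this example


def stream_header(boundary=BOUNDARY):
    """Response header announcing a stream of parts that replace each other."""
    return b"".join([
        b"HTTP/1.0 200 OK\r\n",
        b"Content-Type: multipart/x-mixed-replace; boundary=", boundary, b"\r\n",
        b"\r\n",
    ])


def frame_part(jpeg_bytes, boundary=BOUNDARY):
    """Wrap one already-encoded JPEG image as a part of the multipart stream."""
    return b"".join([
        b"--", boundary, b"\r\n",
        b"Content-Type: image/jpeg\r\n",
        b"Content-Length: ", str(len(jpeg_bytes)).encode("ascii"), b"\r\n",
        b"\r\n",
        jpeg_bytes,
        b"\r\n",
    ])


# Placeholder payload (JPEG start and end markers only), not a decodable image.
part = frame_part(b"\xff\xd8\xff\xd9")
```

A server would write stream_header() once and then one frame_part() per incoming image; the JPEG bytes could come straight from a compressed image topic without re-encoding.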


Concrete goals TBD. There's clearly room for consolidation and improvement.


  • video_interop stack as a place to put these sorts of tools.

    • What should go in it initially?
  • Browser based image viewer, based on something like mjpeg_server.
  • VP8 image_transport

GUI updates

Owner: Patrick


There are a couple of GUI components in image_pipeline: image_view and camera_calibration. Technically this violates our stack guidelines, which dictate that GUI components should live in a separate stack from code that could run on a headless (no monitor) robot. We've gotten by because they use OpenCV for the GUI, and we currently treat OpenCV as one big blob.


  • Make OpenCV system install more modular, so we can install HighGUI (and other potentially problematic modules like GPU) separately.
  • New image_pipeline_gui stack.
  • Move image_view to image_pipeline_gui.
  • Move camera_calibration GUI tool to image_pipeline_gui. Core (GUI-agnostic) code could remain in image_pipeline as camera_calibration_core.

Longer-term, work with the GUI SIG on more ambitious improvements.

image_transport everywhere

Owner: ??


image_transport has proven its usefulness in transparently providing compressed image streams. However, its implementation is tied to roscpp. How best to use image_transport from non-C++ ROS clients is an open question.

Ecto is the low-hanging fruit here. Although an Ecto pipeline runs in a Python process, all data manipulation is done in C++, and the current ROS-Ecto bridge uses roscpp. So the ROS-Ecto bridge should be able to use image_transport as-is.

A native Python image_transport (that is, one compatible with rospy) is trickier. It would be nice to have, as Python is good for prototyping, but there are technical difficulties. Ideally we would reuse the existing image_transport_plugins, written in C++, rather than maintain per-language versions. However, those plugins use roscpp to connect to topics, and I doubt roscpp and rospy play nicely in the same process.

rosjava is worth thinking about, but I (Patrick) don't have any agenda there yet. There is a tutorial for using a compressed image topic from rosjava.
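Until there is native support, a non-C++ client can subscribe directly to the transport-specific topic, since image_transport publishes the raw stream on the base topic and each other transport on a sub-topic named after it. A minimal sketch of that naming rule follows; transport_topic is a hypothetical helper, not part of any ROS client library.

```python
def transport_topic(base_topic, transport="raw"):
    """Return the topic a given transport publishes on for a base image topic.

    The raw transport uses the base topic itself; other transports
    (for example "compressed" or "theora") use <base_topic>/<transport>.
    """
    base = base_topic.rstrip("/")
    return base if transport == "raw" else base + "/" + transport


compressed = transport_topic("camera/image_raw", "compressed")
```

This bypasses the plugin mechanism entirely, so the client has to decode the transport's message type itself and loses the ability to switch transports transparently.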


  • Make Ecto image_transport aware
  • Native Python image_transport is a stretch goal. Unless someone is motivated to take it on, it's likely to get bounced to Galapagos.


First feature freeze (2012-01-15)

  • Standalone release of Ecto (API need not be entirely frozen, but we need a release to build off of)
  • REP defining conventions for depth images in ROS
  • Review depth_image_proc and freeze API

  • Nodelet launch file equivalents of image_proc, stereo_image_proc nodes

Final feature freeze (2012-02-15)

  • Core Ecto "Cells" wrapping OpenCV and PCL algorithms
  • Ecto port of image_pipeline - this is experimental, but we should at least have a working prototype by here

  • Make Ecto image_transport aware

  • Move depth_image_proc to image_pipeline

  • Use system install of PCL 1.x (blocked on ROS Core SIG)
  • Minimal point_cloud_common stack for 3D sensor drivers

  • More modular OpenCV system install
  • Move image_view and camera_calibration GUI components to new image_pipeline_gui stack

  • Browser based viewer for ROS image topics
  • Any other new video interop features...

Release date (2012-03-15)

  • Better have your documentation done by here :)

Wiki: fuerte/Planning/Perception Pipelines (last edited 2011-10-05 19:09:46 by MichaelDixon)