(!) Please ask about problems and questions regarding this tutorial on answers.ros.org. Don't forget to include in your question the link to this page, the versions of your OS & ROS, and also add appropriate tags.

How to profile roslaunch nodes

Description: This tutorial explains how to use profiling tools for roscpp nodes that you are launching with roslaunch.

Keywords: roslaunch, gprof, valgrind, callgrind

Tutorial Level: INTERMEDIATE

Here are some examples to to integrate with profiling a ROS node process.

gprof

gprof is a performance analysis tool for Unix processes. It uses a both instrumentation and sampling. An instrumentation code is automatically inserted into a program code during compilation time with -pg option (for gcc) and sampling data is save in gmon.out file, just before the program exits.

1. Add -pg option after catkin_package() in your CMakeLists.txt file

...
catkin_package()

### FOR PROFILING (need to set after catkin_package)
add_compile_options(-pg)
set(catkin_LIBRARIES ${catkin_LIBRARIES} -pg)
###

include_directories(${catkin_INCLUDE_DIRS})

add_executable(talker src/talker.cpp)
target_link_libraries(talker ${catkin_LIBRARIES})
...

2. Change your launch file(s) so that each ROS node outputs different output. (Otherwise, all ROS node outputs to same gmon.out file)

Add <env> tag just before the each <node> tag. The GMON_OUT_PREFIX environment variable specifies the output name for gprof, so it should be identical. For example, use ROS node name would be good idea.

  <env name="GMON_OUT_PREFIX" value="listener" />
  <node name="listener" pkg="examples" type="listener" output="screen" /> 
                                                                           
  <env name="GMON_OUT_PREFIX" value="talker" />
  <node name="talker" pkg="examples" type="talker" output="screen" />

In this example, the ROS node will output the profiling output data to you ~/.ros/ directory with file named listener.[pid] and talker.[pid]. Note that this is tricky and might be changed in the future release of roslaunch.

3. Start your ROS node with roslaunch command as usual, and exits with C-c

4. You can find your profiling output file under the ~/.ros directory, for example talker.21301 and listener.21300. To analyse, run

$ gprof devel/lib/examples/listener ~/.ros/listener.21300

and

$ gprof devel/lib/examples/talker ~/.ros/talker.21301

You'll get something like

Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
  0.00      0.00     0.00      423     0.00     0.00  boost::detail::shared_count::~shared_count()
  0.00      0.00     0.00      222     0.00     0.00  boost::detail::sp_counted_base::release()
  0.00      0.00     0.00      222     0.00     0.00  boost::detail::atomic_exchange_and_add(int*, int)
  0.00      0.00     0.00      221     0.00     0.00  boost::detail::shared_count::shared_count(boost::detail::shared_count const&)
  0.00      0.00     0.00      221     0.00     0.00  boost::detail::sp_counted_base::add_ref_copy()
  0.00      0.00     0.00      221     0.00     0.00  boost::detail::atomic_increment(int*)
  0.00      0.00     0.00      201     0.00     0.00  boost::detail::shared_count::shared_count()
  0.00      0.00     0.00      178     0.00     0.00  boost::detail::shared_count::swap(boost::detail::shared_count&)
  0.00      0.00     0.00      165     0.00     0.00  boost::function_base::empty() const
  0.00      0.00     0.00      154     0.00     0.00  boost::shared_ptr<std_msgs::String_<std::allocator<void> > >::~shared_ptr()
  0.00      0.00     0.00      132     0.00     0.00  boost::shared_ptr<std_msgs::String_<std::allocator<void> > const>::~shared_ptr()
  0.00      0.00     0.00      132     0.00     0.00  std::remove_reference<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*&>::type&& std::move<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*&>(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*&)

...

and

Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
  0.00      0.00     0.00       99     0.00     0.00  boost::detail::shared_count::shared_count()
  0.00      0.00     0.00       78     0.00     0.00  boost::shared_ptr<ros::Publisher::Impl>::operator->() const
  0.00      0.00     0.00       78     0.00     0.00  bool std::operator==<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*)
  0.00      0.00     0.00       76     0.00     0.00  boost::detail::shared_count::~shared_count()
  0.00      0.00     0.00       69     0.00     0.00  std::remove_reference<unsigned char*&>::type&& std::move<unsigned char*&>(unsigned char*&)
  0.00      0.00     0.00       57     0.00     0.00  void boost::checked_array_delete<unsigned char>(unsigned char*)
  0.00      0.00     0.00       57     0.00     0.00  boost::detail::sp_counted_base::~sp_counted_base()
  0.00      0.00     0.00       57     0.00     0.00  boost::detail::sp_counted_impl_pd<unsigned char*, boost::checked_array_deleter<unsigned char> >::~sp_counted_impl_pd()
  0.00      0.00     0.00       57     0.00     0.00  boost::checked_array_deleter<unsigned char>::operator()(unsigned char*) const
  0.00      0.00     0.00       53     0.00     0.00  ros::message_traits::MD5Sum<std_msgs::String_<std::allocator<void> > >::value()
  0.00      0.00     0.00       52     0.00     0.00  ros::message_traits::MD5Sum<std_msgs::String_<std::allocator<void> > >::value(std_msgs::String_<std::allocator<void> > const&)
  0.00      0.00     0.00       52     0.00     0.00  char const* ros::message_traits::md5sum<std_msgs::String_<std::allocator<void> > >(std_msgs::String_<std::allocator<void> > const&)
  0.00      0.00     0.00       50     0.00     0.00  boost::shared_ptr<void const>::shared_ptr()
  0.00      0.00     0.00       50     0.00     0.00  std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::_M_get_Node_allocator()

...

Callgrind + Kcachegrind

Callgrind is a profiling tool that records the call history among function in UNIX process.The profile data is written out to a file named callgrind.out.[pid] at a program termination. The data files generated by Callgrind can be loaded into Kcachegrind for browsing the performance results.

1. Update your launch file so that each ROS node outputs different profile data files.

  <node name="listener" pkg="examples" type="listener" output="screen"
        launch-prefix="valgrind --tool=callgrind --callgrind-out-file='callgrind.listener.%p' " />
  <node name="talker" pkg="examples" type="talker" output="screen"
        launch-prefix="valgrind --tool=callgrind --callgrind-out-file='callgrind.talker.%p'" />

2. Start your ROS node with roslaunch command, as usual. When you terminate the roslaunch process, you all get profile data under ~/.ros directory.

3. Use Kcachegrind to visualize the profile data. For example kcachegrind ~/.ros/callgrind.listener.8915 and kcachegrind ~/.ros/callgrind.talker.8914 shows following images respectively.

attachment:kcachegrind-listener.png

attachment:kcachegrind-talker.png

Wiki: roslaunch/Tutorials/Profiling roslaunch nodes (last edited 2018-08-01 06:50:26 by Kei Okada)