(!) Please ask about problems and questions regarding this tutorial on answers.ros.org. Don't forget to include in your question the link to this page, the versions of your OS & ROS, and also add appropriate tags.

Changing the packet timeout and number of process data retries for pr2_etherCAT

Description: Changing the packet timeout and retries for pr2_etherCAT

Tutorial Level: INTERMEDIATE

Timeout

An EtherCAT slave device never generates network traffic. The EtherCAT master (computer) generates all packets, and slave device simply fills in or pull out data from packets. As such, every packet sent by EtherCAT master, should come back to the master, unless:

  1. The packet data was corrupted (CRC bad). The computer's network card will drop packet on reception.
  2. There is a broken device or a non-EtherCAT device plugged into chain.

The EtherCAT slaves process the packets in hardware so the delay between sending a packets and receiving packet should be very consistent. After sending a packet, the Master should be able to wait a small timeout before receiving the result. If the packet is not received after the small timeout, then it must have been dropped.

Unfortunately, the Linux OS, and most Ethernet drivers is not hard realtime. The amount of time a packet takes to get delivered back to a user-space program can vary widely. This leads to the following tradeoff:

  • Setting the timeout to be too short causes user-space to believe that a packet is dropped when OS takes too long to deliverer (ie late packet).

  • Setting timeout to be too long causes user-space program to wait too long and and messes up realtime loop.

In past, the timeout was set to 20 milliseconds, which lead to very few late packets but, caused noticeable glitches in realtime controllers.

With new releases, the timeout has been reduced to 1 millisecond. A dropped packet should cause less glitching of realtime controllers. Hopefully, a timeout of 1ms does not cause extra late packets.

Process data retries

When a packet containing process data (realtime data) times-out, the EtherCAT master assumes the packet is lost and will retry sending process data. For safety reasons, the EtherCAT master will halt motor after the retries a certain amount of times. In the past the number of retries was 5 and the timeout was 20ms. This meant that the master would resend packet for 100ms before halting motors (5*20ms = 100ms = 1/10th second). With reduced timeout, 5 retries no longer make much sense. Instead the retries is calculated based on the timeout so that the master spends about 20ms retrying a packet transmission. So for a 1ms timeout, the master will retry 20 times before halting motors.

Changing the timeout

The the new release of pr2_ethercat_drivers, it is possible to change the default timeout by using a setting a rosparam. Before launching pr2_etherCAT.

rosparam set /pr2_etherCAT/realtime_socket_timeout 20000

The above example sets the timeout to 20000 microseconds (20 milliseconds).

NOTE: Changing parameter after launching pr2_etherCAT has no effect. Thus, setting parameter after using sudo robot launch will not work.

Also note that timeout only takes effect after devices are initialized and pr2_etherCAT starts running in realtime mode. If the timeout was set to a value that was way too small (say 1usec). Then pr2_etherCAT should start-up, but it would start dropping all/most packets after switching to realtime mode.

Changing the retries

The number of retries usually is calculated based on the timeout. However, the number of retries can also be changed with a rosparam

rosparam set /pr2_etherCAT/max_pd_retries 5

Where to Set Parameter

For the PR2, set these parameters in the file /etc/ros/robot.launch. This will change it for all users of your PR2.

<launch>
  <param name="pr2_etherCAT/max_pd_retries" value="5" />
  <param name="pr2_etherCAT/realtime_socket_timeout" value="20000" />

  <!-- Launch robot here-->
</launch>

Wiki: ethercat_hardware/Tutorials/ChangingRealtimePacketTimeout (last edited 2010-10-28 22:19:08 by KevinWatts)