Types of EtherCAT packet loss

Description: How to determine what is causing EtherCAT packet loss.

Keywords: EtherCAT Packet Drop Late

Tutorial Level: INTERMEDIATE

Introduction

pr2_etherCAT needs reliable communication to run properly. In most cases, EtherCAT communication is extremely reliable. However, software or hardware issues can adversely affect reliability.

3 Most common sources of communication problems

The three most common sources of communication problems:

  1. Packet CRC errors caused by bad cables/hardware
  2. Late packets caused by software / CPU overload
  3. Missing devices caused by disconnected cables

Determining which problem you have

Dropped Packets (Late or CRC errors?)

Both "CRC errors" and "Late Packets" are counted as "Dropped Packets" by pr2_etherCAT. When pr2_etherCAT sends a packet, it expects a response packet within a short period of time. If pr2_etherCAT doesn't receive a response within a specified "timeout" period, it counts the packet as "dropped".

If a "dropped" packet comes back later, it is also counted as "late". If the packet never comes back, the network interface card (NIC) hardware must have discarded it because a CRC error revealed one or more bad data bits.
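
The counting rules above can be sketched as a small classifier. This is only an illustration, not the actual pr2_etherCAT code: the function name, the use of None for a response that never arrives, and the timeout value in the usage line are all assumptions made for the example.

```python
from typing import Optional


def classify_response(response_time_s: Optional[float], timeout_s: float) -> str:
    """Classify one packet by how long its response took (hypothetical helper).

    response_time_s is None when the response never arrived at all.
    """
    if response_time_s is None:
        # Never came back: the NIC discarded it, which points to a CRC error.
        return "dropped (CRC error)"
    if response_time_s > timeout_s:
        # Came back, but only after the timeout: counted as dropped and as late.
        return "dropped (late)"
    # Arrived within the timeout: not counted as dropped.
    return "ok"


# Usage with an arbitrary, assumed timeout of 1 ms.
for t in (0.0002, 0.005, None):
    print(t, "->", classify_response(t, timeout_s=0.001))
```

In this sketch every packet that is not "ok" adds to the dropped count, and only the ones that eventually return add to the late count.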

Looking at the "EtherCAT Master" diagnostics will help you determine what is causing dropped packets. There are two values you want to look at:

  • Dropped Packets
  • RX Late Packet

With these two values, you can determine whether there is a hardware or software problem using the following relationship:

 (Dropped Packets) = (CRC Errors) + (RX Late Packet)
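
Rearranging the relationship gives (CRC Errors) = (Dropped Packets) - (RX Late Packet). A minimal sketch of that arithmetic follows. The function name, the returned dictionary keys, and the example counts are hypothetical; the two inputs are assumed to be the values read by hand from the "EtherCAT Master" diagnostics.

```python
def split_dropped_packets(dropped_packets: int, rx_late_packets: int) -> dict:
    """Split the dropped-packet count into CRC errors and late packets.

    Hypothetical helper based on:
        (Dropped Packets) = (CRC Errors) + (RX Late Packet)
    """
    if dropped_packets < 0 or rx_late_packets < 0:
        raise ValueError("packet counts cannot be negative")
    if rx_late_packets > dropped_packets:
        # Every late packet is also counted as dropped, so this should not occur.
        raise ValueError("late packets cannot exceed dropped packets")

    crc_errors = dropped_packets - rx_late_packets
    return {
        "crc_errors": crc_errors,
        "late_packets": rx_late_packets,
        # CRC errors suggest bad cables or hardware.
        "hardware_suspected": crc_errors > 0,
        # Late packets suggest software or CPU overload.
        "software_suspected": rx_late_packets > 0,
    }


# Example with made-up counts: 12 dropped packets, 9 of which came back late.
print(split_dropped_packets(dropped_packets=12, rx_late_packets=9))
```

If the two counts are equal, all dropped packets were late and the problem is likely on the software side. If "RX Late Packet" is zero while "Dropped Packets" is not, cabling or hardware is the likely cause.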

Below is a screenshot of the PR2 Dashboard that shows where these values can be found.

More information

Late Packets

The Linux OS on the PR2 is only "soft" realtime. This means that things can sometimes take longer than usual. One thing that can occur is that the Linux kernel gets bogged down, and received network traffic queues up instead of being processed immediately. In extreme cases, packets can be delayed by 100 ms or more.

Why only software causes late packets

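Since slave devices process EtherCAT packets in hardware, the processing delay between receiving a packet and sending a response is extremely low and very consistent. A response that arrives late was therefore almost certainly delayed by software on the computer side rather than by the slave devices.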

Wiki: pr2_etherCAT/Tutorials/PacketLoss (last edited 2010-12-02 07:57:28 by MeloneeWise)