This page contains a high-level overview of the overall organization of roscpp's internals, starting from the bottom up. == General Philosophy == The philosophy for roscpp is that only a very specific set of APIs is externally visible. This means that none of the public headers are allowed to include the private ones, unless those private ones require templating. This means type erasure happens early and often. In general we try to minimize the number of external headers included. For example, no `Boost.Thread` headers are allowed in the common public APIs (`CallbackQueue` is an exception, and should be fixed), and in the future we'd like to remove `Boost.Bind` as well, since it adds significant compile time. == xmlrpc == We use a modified 3rdparty library called [[xmlrpcpp]]. It unfortunately does not allow us to use our own polling code for its sockets, so we spin up a separate thread for it to do its work in. While we do use xmlrpcpp, all use of it goes through the `XMLRPCManager` singleton, and all communication with the master goes through the `master::execute` function. == PollSet/PollManager == At the lowest level, all the non-xmlrpc networking is done using `poll()`, and is managed by the `PollSet` class. There is currently one `PollSet` used, which is managed by the `PollManager` singleton. `PollSet` lets you register file descriptors with it, and add events (read or write) to be listened for. == Connections and Transports == The transport system attempts to make transport types (e.g. TCP, UDP) generic at the lowest level. There is a base class, `Transport`, that is extended by `TransportTCP` and `TransportUDP` which provide methods like `read()` and `write()`. At a higher level, the `Connection` class provides a callback-based way of reading and writing data to a transport. The `Connection` class manages tracking how much data has been sent/received and calling callbacks when the operation has finished. Connections are tracked by the `ConnectionManager` class, which mainly manages incoming TCP/UDP connections, as well as keeping pointers to `Connection` objects when they're dropped until they're allowed to be deleted. When the `ConnectionManager` receives a new incoming TCP or UDP connection, it will create either a `TransportSubscriberLink` or `ServiceClientLink` depending on whether it is a topic or service connection. == Initialization == This is explained in the [[roscpp/Overview|Overview]] as well, but I might as well go over it here. There are two parts to roscpp's initialization, the `init()` method and the `start()` method. `init()` does very little -- it simply parses the environment and command line for things like the master URI, local hostname/ip address, remappings, etc. `init()` is not allowed to actually do any communication -- this is so that the user can manually do things like check if the master is up and have custom behavior if it's not. `init()` also does not start any threads. `start()` performs most of the actual initialization. `start()` can either be called manually, in which case you are required to call `shutdown()`, or it will automatically be called by the first `NodeHandle` created, and then `shutdown()` will automatically be called when the last `NodeHandle` is destroyed. `start()` will... 1. initialize all of the various singletons 2. install the SIGINT signal handler if it's supposed to 3. create the [[rosout]] `log4cxx` appender 4. create the `~get_loggers` and `~set_logger_level` services 5. check the `/use_sim_time` parameter and if it's set, subscribe to `/clock` == Topics == Topics are managed by the `TopicManager`, which keeps the list of `Subscriptions` and `Publications`. By the time anything arrives at the `TopicManager`, any compile-time time information has been erased. This means that, for example, `TopicManager::subscribe()` only has access to whatever is in the `SubscribeOptions` class. === Subscriptions === Subscriptions to a topic are managed by the `Subscription` class. Only one `Subscription` class is allowed for a topic, and multiple subscribers to the same topic will share a single connection, preventing bandwidth wastage. The `Subscription` object manages N `PublisherLink` objects, which each connect to a different publisher on that topic. There are two types of `PublisherLink` objects. `TransportPublisherLink` connects to a publisher using the transport layer. `IntraprocessPublisherLink` connects to a local, intraprocess publisher, and allows those to skip the transport layer entirely (and provide no-copy message passing when possible). ==== publisherUpdate ==== When a new publisher registers with the master, the node will receive a "publisherUpdate" xmlrpc call, which gets forwarded to `Subscription::pubUpdate` on the right topic. For non-intraprocess connections, this then starts an asynchronous xmlrpc request to the new publisher. When that xmlrpc request goes through, the `Subscription::pendingConnectionDone()` method is called. Once this request goes through, a new `TransportPublisherLink` is created for that publisher. ==== Retry ==== A TCP `TransportPublisherLink` that gets disconnected from its publisher will attempt to retry its connection. Its first retry is scheduled for 100ms after the disconnect, and that time doubles every time it fails to connect until it hits 20 seconds, where it is capped. ==== Subscription Callbacks and Deserialization ==== When a message is received, it's pushed into a `SubscriptionQueue`, which handles throwing away old messages if the queue fills up. An important thing to note is that the message has not been deserialized yet. Instead of pushing the message itself onto the queue, a `MessageDeserializer` object is pushed onto it, and this object is shared by multiple callbacks if possible (if their rtti info matches). This means that: * Messages that are thrown away because their queue fills up are never deserialized * Messages are not deserialized until the first callback associated with that subscription is called === Publications === Publications on a topic are managed by the `Publication` class. Only one `Publication` class is allowed for a topic, and multiple publishers on the same topic share connections to their subscribers. The `Publication` object manages N `SubscriberLink` objects, which are each connected to a different subscriber on that topic. There are two types of `SubscriberLink` objects. `TransportSubscriberLink` connects to a subscriber using the transport layer (and is created by the `ConnectionManager`), while `IntraprocessSubscriberLink` connects to an intraprocess subscriber, and skips the transport layer entirely. === No-Copy Intraprocess Support === roscpp uses a combination of `boost::shared_ptr` and `rtti` to provide mostly-safe no-copy intraprocess message passing. It's only mostly safe because it relies on the publisher to never modify the message object it has just published. Note that this only applies if the message is passed in as a `boost::shared_ptr`. Otherwise it currently still requires a `serialize()`/`deserialize()`, though it will skip the transport layer. The path of an intraprocess message is as follows: * `Publisher::publish()` * `TopicManager::publish()` * calls `Publication::getPublishTypes()` to determine what kind of subscribers we have * If we have any "serialization required" subscribers, serialize the message * If we don't have any "no-copy" subscribers, immediately clear out the message pointer and `rtti` information * `Publication::publish()` * Finds the `IntraprocessSubscriberLink` in this `Publication`, if any * Directly enqueues a message to it, only allowing "no-copy" publication (this may not be necessary) * `IntraprocessSubscriberLink::enqueueMessage()` * `IntraprocessPublisherLink::handleMessage()` * `Subscription::handleMessage()` * For each subscriber, checks if the rtti info matches. If so, adds the message to its subscription queue. There are some portions of this that aren't really necessary. For performance reasons it's probably a good idea to get rid of `IntraprocessPublisherLink`/`IntraprocessSubscriberLink` and just directly connect `Publication` and `Subscription`. == Services == Services are managed by the `ServiceManager` singleton. === Service Servers === A service server is managed by the `ServicePublication` class, which tracks a number of `ServiceClientLink` objects. Since only a single server for a given service is allowed at any given time, the same is true in a roscpp node. The `ServiceClientLink` object (created by the `ConnectionManager`) handles a connection to a single service client. === Service Clients === A connection to a service server is managed by a `ServiceServerLink`. A new `ServiceServerLink` is created for each service client. == User API == The user API is almost entirely included through the `ros.h` header, with the exception of `callback_queue.h` that needs some cleaning up to remove extra includes before it can be included there (it exposes `Boost.Thread`). Most of the user API should be pretty self-explanitory, so I won't go over it. The main thing worth going over is the callback mechanism that allows multiple different callback function signatures. === Multiple Callback Signatures === Subscription callbacks allow multiple callback signatures, e.g.: {{{ #!cplusplus void (const MsgConstPtr& msg); void (const MsgPtr& msg); void (const Msg& msg); void (Msg msg); void (const ros::MessageEvent& evt); etc. }}} The mechanism that allows this is called the `ParameterAdapter`, which is a template class specialized for different types of parameters. For example, the standard (const MsgConstPtr&) signature looks like this: {{{ #!cplusplus template struct ParameterAdapter& > { typedef typename boost::remove_reference::type>::type Message; typedef ros::MessageEvent Event; typedef const boost::shared_ptr Parameter; static const bool is_const = true; static Parameter getParameter(const Event& event) { return event.getMessage(); } }; }}} It provides a `Message` typedef, which is guaranteed non-const and non-reference. It provides the `MessageEvent` type, as an `Event` typedef. It provides a `Parameter` typedef, which returns the actual value to be passed to the callback, and an `is_const` static that tells us if the message is constant and not allowed to change. `ParameterAdapter` is templated on the function signature (not just the message type), and extracts the message type and other information from that. `ParameterAdapter` is used in conjunction with `SubscriptionCallbackHelperT` (also templated on the function signature), which calls the callback like so: {{{ #!cplusplus callback_(ParameterAdapter

::getParameter(event)); }}} Adding new callback signatures, as long as they're still single-parameter, is as simple as creating a new `ParameterAdapter` specialization (which can even be done outside the ROS source if necessary). == Callback Queue == The `CallbackQueue` class is one of the trickiest cases in [[roscpp]], because it has to support so many corner cases. The current implementation supports: * Ensuring that any callbacks with a specific owner (specified with the removal id passed to `addCallback()`) will never be called once that `removeByID` has returned * Removing the callback that is currently in progress * Recursively calling (calling `CallbackQueue::callAvailable()` or `callOne()` from within a callback that is getting called from `callAvailable()` or `callOne()` on the same queue * Threadsafe calls to `callAvailable()` and `callOne()` These requirements add a lot of complexity to `CallbackQueue`, and make it not as efficient as it could be. A queue could be written that is much faster and less complex than the current one, if it chooses either a different interface or a different set of requirements. Removing (or changing the semantics of) `callAvailable()` could make `CallbackQueue` much simpler, and potentially remove the need for a lot of the complexity (such as the need for thread-local storage). == Future == === Transports === The transport layer was conceived when we only had a TCP transport, with the idea that any necessary changes would happen when UDP was implemented. This didn't happen, and it turns out the way the transport layer was designed was a mistake, and only supports the TCP case well. It is the cause of our lack of good large-message support over UDP. The main reason the transport layer doesn't work well right now is that it doesn't know about messages. Currently, for example, receiving messages is a "pull" model. `TransportPublisherLink` asks its `Connection` for a 4-byte length, and when that arrives, parses, allocates space, and asks for the message itself. Instead, the transport should be pushing messages to the `PublisherLink`, and the `Connection` class can be done away with entirely. Essentially, I designed a more generic networking transport system, when I should have designed a message-passing transport system. === Sub/Pub === While roscpp does support no-copy intraprocess message passing, it's not the fastest of implementations. This is mainly because roscpp began not supporting intraprocess message passing at all, and has grown into its current form. Ideally this would swap around, with the intraprocess case becoming the main one, with the network side hooking into it. With some enhancement of the lock-free structures in the [[lockfree]] package, it should even be possible to do it in a lock-free manner, though you'd need a lockfree `CallbackQueue` implementation as well. That's possible, but maybe not with the same guarantees the current `CallbackQueue` makes -- this is why the `CallbackQueue` is abstracted into `CallbackQueueInterface`. === NodeHandle === IMHO a mistake was made with `NodeHandle`, in that it does far too much, and everything you can do in roscpp ended up as a method in it. This means that now even if all you need is its namespace/remapping functionality, you drag along a lot of baggage. After a fair amount of thought, I'd prefer a `Context` class that only deals with namespacing and remapping (and has a notion of both its public and private namespaces). I also wish I had made fewer overloads of each function type, and moved those into the `XOptions` structures, using the named parameter idiom (like `TransportHints`). Ideally normal subscribers would also follow the [[message_filters]] pattern. Subscribing to a topic might then look something like this: {{{ #!cplusplus ros::Subscriber sub(SubscribeOptions(context.name("foo"), 5).allowConcurrent().callbackQueue(my_queue).callback(myCallback)); // Alternatively, you could leave out the callback above, and then do: // sub.registerCallback(myCallback); }}} Note also the use of `context.name()` above, which should return a `Name` object rather than a string. This may be going too far, but there are benefits to not using strings directly (like knowing if a name has been remapped already or not). If a string is passed directly, a global context would be used. === Single Master Assumption === There is currently an assumption made in all of roscpp that there is a single master. This is mainly a holdover from the old `ros::Node` days, which had the same assumption. The main thing that would need to change to allow multiple masters (other than on the user API side) is internally storing the master along with any topic/service. A `Name` class would help here, as it could include the master. On the user API side I've been thinking of something like: {{{ #!cplusplus ros::MasterContext master("http://pri:11311"); ros::Context context(master); ... }}} If no `MasterContext` is specified, it would fall back on the global one (created out of the environment/command line), just like it does now. === Global Singletons === There are a number of global singletons in existence right now. I'd like to clean them up into just a single singleton or global accessor, with a class that manages their lifetime, e.g.: {{{ #!cplusplus class Globals { ConnectionManager* connection_manager_; XMLRPCManager* xmlrpc_manager_; ... public: ConnectionManager* getConnectionManager(); ... }; Globals* getGlobals(); }}}