Starting Rapps

Topics

Whose responsibility is it to start rapps?

Current Resolution

The scheduler will start and stop rapps, keeping track of where they are assigned.

Discussion Thread

This is in part related to the topic of requests as it affects the shape of the request message used. Refer to discussion there also.

(Daniel) Services probably shouldn't have this responsibility. See the requests topic: requests should probably be based on the platform info + rapp name. In that case, sending the allocation back to the service and getting the service to manage starting and stopping rapps is just extra work that each and every service needs to do. Since the information is already out there, just centralise this start/stop management and get one specific component to start the rapps, so that the code for starting them is in only one place (not in each and every service). Less work and fewer chances for bugs in services.
- (Jack) In some cases, the scheduler needs to revoke assignments that were made for a requester that is no longer responding, or was too slow in releasing a preempted resource. That could be an argument for handling it in general. The scheduler knows who requested each resource and what they wanted.
(Daniel) Should the scheduler do this? It could, though it's not really in the scope of 'scheduling'. Could certainly pass this job off to the conductor.
- (Jack) I do not know enough to answer this on a system level. Some questions come to mind:
  - How are rapps assigned to specific services?
    - Up to now, they don't really get assigned to a service. A service requests a certain rapp and provides remapping for that rapp at the same time so that it can be found and started with the remaps. This drops down interfaces in the locations that the concert service needs and thereafter the service forgets about the resources (very static style, like roslaunch).
    - To get more dynamic, I can envisage the service doing a bit more resource management, i.e. requesting and releasing only as needed, but usage might/should be similar.
  - Does the conductor have a unique ID for each service? For each rapp?
    - Each client (robot) has a unique name, usually postfixed by a randomly generated uuid. Each rapp name is necessarily unique also (ros package/app name pair). Each concert service will also have a unique id (probably its runtime name is sufficient to guarantee that).
    - The unique name for the service is probably not relevant. There should be no reason that the scheduler needs to know anything about the service. If it just has to care about 'requests', that would in my opinion be ideal.
- (Jack) Starting and stopping various activities is certainly part of what an operating system scheduler does. From the discussion above, it seems reasonable for the concert scheduler to perform similar actions.
(Daniel) Stopping apps is relevant here too. If a concert service doesn't clean up resources properly (maybe it crashed) or doesn't return resources in a timely fashion, then we must have someone responsible for the cleanup of those resources. Cleanup should probably entail 1) de-allocation and 2) stopping apps. The scheduler was the first thought for this responsibility. If not the scheduler, is there another candidate?
- (Jack) If all of this can be done in a single place, the scheduler seems like a good fit to me.
- (Jack) Since we envision multiple scheduler implementations, it makes sense to encapsulate the messy details of starting and stopping rapps in a common concert package, similar to what we are doing for the request message protocol.
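
The cleanup described above (de-allocation plus stopping apps for a requester that has gone quiet) could be sketched roughly as follows. This is only an illustration in plain Python, not the rocon API: the class and method names (AssignmentTracker, grant, renew, release, reap) and the start_rapp/stop_rapp callbacks are hypothetical, and the timeout-based liveness check is one possible policy.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass
class Assignment:
    """One resource granted to one request (hypothetical structure)."""
    request_id: uuid.UUID
    resource: str        # e.g. a concert client name
    rapp: str            # e.g. "package/app_name"
    last_renewed: float  # clock reading at grant or last renewal


class AssignmentTracker:
    """Starts and stops rapps in one place and remembers where they run."""

    def __init__(self, start_rapp, stop_rapp, timeout=10.0, clock=time.monotonic):
        # start_rapp and stop_rapp are assumed callbacks taking (resource, rapp);
        # how they actually reach the robot is outside this sketch.
        self._start = start_rapp
        self._stop = stop_rapp
        self._timeout = timeout
        self._clock = clock
        self._by_resource = {}

    def grant(self, request_id, resource, rapp):
        """Allocate a resource to a request and start the rapp on it."""
        if resource in self._by_resource:
            raise ValueError("resource already assigned: " + resource)
        self._start(resource, rapp)
        self._by_resource[resource] = Assignment(
            request_id, resource, rapp, self._clock())

    def renew(self, request_id):
        """Record that the requester is still alive."""
        now = self._clock()
        for assignment in self._by_resource.values():
            if assignment.request_id == request_id:
                assignment.last_renewed = now

    def release(self, request_id):
        """Stop the rapps and de-allocate everything held by a request."""
        held = [a for a in self._by_resource.values()
                if a.request_id == request_id]
        for assignment in held:
            self._stop(assignment.resource, assignment.rapp)
            del self._by_resource[assignment.resource]
        return [a.resource for a in held]

    def reap(self):
        """Revoke assignments whose requester has not renewed in time."""
        now = self._clock()
        stale = {a.request_id for a in self._by_resource.values()
                 if now - a.last_renewed > self._timeout}
        freed = []
        for request_id in stale:
            freed.extend(self.release(request_id))
        return freed
```

A scheduler implementation would call reap() periodically; because starting and stopping go through the two callbacks, that logic lives in one place rather than in every service.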

Older discussion thread.

(Jihoon) Is the scheduler in charge of launching apps on the client as well? If not, who is in charge of starting and stopping the clients' apps?
- (Jack) No, it just waits for requests on a well-known topic. I don't know what starts the apps and services. The scheduler does need to handle requester disappearance, maybe via a timeout. Requesters could be tasked with renewing their requests periodically to prove they are still alive and using the resources they were assigned. If the same rapp restarts with a different UUID, it will look like a new and different requester. Reusing the same UUID might allow a crashed rapp to restart; I'm not quite sure how that should work.
- (Jihoon) I am just trying to understand the relationship and workflow among service, scheduler, and resources. This drawing shows my view of how these three components interact. Could you explain the full workflow as you see it, from request to use? For example: 1. Service A requests a resource X from the scheduler. 2. The scheduler starts a rapp on robot Y to provide resource X and notifies service A that the request is granted. 3. Service A confirms that all expected topics are available and starts to use them.
- (Jack) My understanding is that the scheduler selects a robot matching the request made by service A, and sends it a message granting access to that device. Service A is responsible for contacting the robot and starting the correct rapp. When it is finished, it sends a "releasing" message to the scheduler so it knows that robot is now available for other tasks.
  - (Jihoon) I think it is a good time to discuss the relationship among these three components, since how a rapp is started, and by whom, would require changes in the message structure. There are a couple of options I can think of to handle starting a rapp on a resource.
    1. Scheduler requests resources to start rapp.
      - Service does not need to directly access the resource.
      - Service does not need to worry whether the resource has the necessary rapp.
      - TODO
        Add the rapp name to Request.msg, so the scheduler starts up rapps for the requester.
    2. Service directly requests resources to start rapp.
      - Service is free to use the given resource for any purpose.
      - TODO
        SchedulerFeedback.msg includes which resources are granted for service X.
        Logic inside the service to check whether the required rapp is installed.
        Rapp start/stop logic inside the service.
    3. There could be more options.
  - (Jack) You undoubtedly understand the rapp interface better than I do. From conversations with Daniel during his recent trip to Texas, I believe he described option 2. The service requests resources by their rocon_std_msgs/PlatformInfo, which may contain NAME_ANY ("*") wild cards to match some set of robots with appropriate capabilities. When the scheduler assigns a specific robot to fulfill that request, it will send a scheduler_msgs/SchedulerFeedback containing the corresponding scheduler_msgs/Request with the same requester-assigned UUID and all NAME_ANY fields resolved to the exact resource that was granted.
  - (Jihoon) I see where the current design comes from. It sounds reasonable enough to take as a first choice. I will try to integrate option 2 using scheduler_msgs in the opp branch for our demo. Meanwhile, you can take a look at the current stub here.
    - (Jack) Thanks for the link, Jihoon, it's helpful. My current very experimental Python package for managing the scheduler topic protocol is at https://github.com/utexas-bwi/rocon_scheduler_requests. I posted a snapshot of the current documentation at http://farnsworth.csres.utexas.edu/docs/rocon_scheduler_requests/html/index.html. I don't know if there is enough to be worth looking at yet, but I'll keep working on it and update those links.
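
The wild-card resolution described for option 2 above can be illustrated with a small stand-alone sketch. It does not use the real rocon_std_msgs/PlatformInfo or scheduler_msgs types: the Platform tuple and its field names (os, system, name) are hypothetical stand-ins, and only the "*" wild card value is taken from the discussion.

```python
from typing import NamedTuple, Optional, Sequence

ANY = "*"  # same value the discussion gives for NAME_ANY


class Platform(NamedTuple):
    """Simplified stand-in for a platform description (hypothetical fields)."""
    os: str
    system: str
    name: str


def matches(pattern: Platform, candidate: Platform) -> bool:
    """True if every non-wild-card field of the pattern equals the candidate's."""
    return all(p == ANY or p == c for p, c in zip(pattern, candidate))


def resolve(pattern: Platform,
            available: Sequence[Platform]) -> Optional[Platform]:
    """Return the first available platform matching the pattern, or None.

    The result has every wild card replaced by a concrete value, which is
    what the scheduler would report back to the requester.
    """
    for candidate in available:
        if matches(pattern, candidate):
            return candidate
    return None


robots = [
    Platform("linux", "turtlebot", "dude"),
    Platform("linux", "pr2", "marvin"),
]
granted = resolve(Platform("linux", "pr2", ANY), robots)
```

Here granted is the fully specified entry for "marvin"; in the protocol described above, that resolved resource would be returned alongside the requester-assigned UUID so the service can tell which request was fulfilled.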
