
Training a new object

Description: This tutorial walks through the steps and parameters needed to train objects for use with the asr_descriptor_surface_based_recognition package.

Tutorial Level: INTERMEDIATE


  1. Setup
  2. Tutorial

Setup

To train a new object you need at least two files:

  1. A textured mesh of the object in the Collada .dae-format. Make sure that the texture has a high resolution so it can be used to render the object to an artificial image in the pose validation step.
  2. A point cloud of the object saved in the Wavefront .obj-format. This file should describe the same geometry as the mesh file above, but contain only the vertices and no other information such as faces, normals or colors.
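
If the mesh is already available as a full Wavefront .obj export (with faces, normals and so on), the vertex-only file can be derived by keeping just the `v` records. The following is a minimal sketch in Python, assuming a plain-text .obj file; the function name `strip_obj_to_vertices` is hypothetical and not part of the package.

```python
def strip_obj_to_vertices(obj_text):
    """Return .obj text that keeps only geometric vertex records ("v x y z")."""
    kept = []
    for line in obj_text.splitlines():
        fields = line.split()
        # "v" is a geometric vertex; "vn", "vt", "f", "usemtl", ... are dropped.
        if fields and fields[0] == "v":
            # Keep only x, y, z; some exporters append extra per-vertex values.
            kept.append("v " + " ".join(fields[1:4]))
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    sample = (
        "# exported mesh\n"
        "v 0.0 0.0 0.0\n"
        "v 1.0 0.0 0.0 0.5 0.5 0.5\n"
        "vn 0.0 0.0 1.0\n"
        "f 1 2 3\n"
    )
    print(strip_obj_to_vertices(sample), end="")
```

The result still has to match the .dae mesh in scale and orientation, so it is worth opening both files in a viewer before training.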

To make training easier, it is recommended to align the object with the principal axes of its reference frame (main orientation in x-direction).

Copy those files to the /trainer_data/input folder located in the asr_descriptor_surface_based_recognition package. In addition to the files above, you can add a set of views of the object (saved as .png or .jpeg images) to use during training. These are optional, since views are usually created from the live input of a camera during training.
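
Before starting the trainer, it can help to check that the input folder holds what the setup requires. The sketch below sorts a list of file names into the roles described above and reports which required file types are missing. The function name and role labels are hypothetical; only the extensions come from this tutorial.

```python
import os.path

# File extensions named in this tutorial, mapped to illustrative role labels.
ROLES = {".dae": "mesh", ".obj": "point_cloud", ".png": "views", ".jpeg": "views"}
REQUIRED = ("mesh", "point_cloud")


def classify_input_files(filenames):
    """Group file names by role and list the required roles that are missing."""
    found = {"mesh": [], "point_cloud": [], "views": []}
    for name in sorted(filenames):
        ext = os.path.splitext(name)[1].lower()
        if ext in ROLES:
            found[ROLES[ext]].append(name)
    missing = [role for role in REQUIRED if not found[role]]
    return found, missing


if __name__ == "__main__":
    found, missing = classify_input_files(["CoffeeBox.dae", "view_0.png"])
    print(found)
    print("missing:", missing)
```

In practice the list of names would come from something like `os.listdir()` on the trainer_data/input folder of your checkout.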

If you want to use live images instead of a set of prerecorded ones, connect the camera setup you will use later in the recognition phase (only the RGB camera; the depth sensor is not needed during training). It is recommended to use the same setup in both the training and recognition phases to get optimal results.


Tutorial

Once you have provided the files described in the setup above and connected the camera, you can start the training application with the following command:

roslaunch asr_descriptor_surface_based_recognition descriptor_surface_based_trainer.launch

This should open a GUI window that looks like this: trainer_1.png

If you have provided the files correctly, the application should set them as default parameters in this first window. Now you can set the parameters of the 3D recognition based on your scenario (see the image above):

  1. Add a unique name that describes the object. It should follow the same format as the names of the objects already stored in the asr_object_database (camel case, no numbers).
  2. Choose the .obj-file you have provided. This will be used to generate the 3D model HALCON uses in its matching phase.
  3. Choose the .dae-file you have provided. This will be used for the pose validation and the visualization of the object later.
  4. The rotation type describes how many axes the object can be rotated around, by a certain angle, without changing its geometric appearance:
    1. No rotation: Rotation-variant objects which do not have a symmetrical appearance

    2. Cylinder: Cylindrical objects with one rotational axis

    3. Sphere: All objects with at least two rotational axes fall into this category (note that this feature has not been implemented in the recognizer and is only here as a placeholder)

  5. Add the coordinates of the object's normalized orientation
  6. The diameter should be set automatically by using the object model. Only change this if the calculated value is incorrect.
  7. This is the score threshold used for the 3D matching. Check this page for more information.
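
Two of the fields above are easy to sanity-check outside the GUI. The sketch below is only an illustration and rests on two loudly stated assumptions: that "camel case, no numbers" means an upper camel case name made of ASCII letters only, and that the diameter corresponds to the diagonal of the axis-aligned bounding box of the vertices. The trainer may define either differently, and both function names are hypothetical.

```python
import math
import re

# Assumption: upper camel case, ASCII letters only (no digits, no underscores).
NAME_PATTERN = re.compile(r"[A-Z][A-Za-z]*")


def is_valid_object_name(name):
    """Check a candidate object name against the assumed naming format."""
    return NAME_PATTERN.fullmatch(name) is not None


def bounding_box_diameter(vertices):
    """Diagonal length of the axis-aligned bounding box of (x, y, z) points.

    Assumption: this is one common definition of a model diameter and may
    differ from the value the trainer computes.
    """
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    return math.sqrt(sum((hi - lo) ** 2 for lo, hi in zip(mins, maxs)))


if __name__ == "__main__":
    print(is_valid_object_name("CoffeeBox"))
    print(bounding_box_diameter([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]))
```

If the value shown in the GUI is far from such an estimate, that is a hint to check the units and scale of the .obj file before overriding the field.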

If you are satisfied with your input, click next and another window should open: trainer_2.png

Here you can see three input regions:

  1. All views you have already created are shown here. This list is empty the first time you open this window. The views are used for the 2D matching and should cover most parts of the object's texture. They should not overlap, and for cylindrical objects they should all be taken around one axis. Note that the number of views strongly affects the overall runtime of the recognition; 4-6 views per object is a good number.
  2. To add a new view, click the + button at the top; to delete an existing one, click the - button; to edit one, click the button on the right.
  3. Once you have created a view, you have to set the bounding box around the texture of the object (it is used only for visualization but is required for each view). To add a bounding box, click on the preview image to add a corner point of the box. You have to add all four corner points to complete the box, which appears once you have done so correctly. Set the points clockwise, starting from the top left. To delete the points already set, click with the right mouse button.
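
The click order for the bounding box (clockwise from the top left) can be made concrete with a small sketch. It takes four pixel coordinates in any order and returns them in that order. It assumes the usual image convention of x growing to the right and y growing downwards, and it treats the corner with the smallest x + y as the top left one. The function is hypothetical and not part of the trainer.

```python
import math


def order_corners_clockwise(points):
    """Order four (x, y) pixel points clockwise on screen, starting top left."""
    if len(points) != 4:
        raise ValueError("a bounding box needs exactly four corner points")
    cx = sum(x for x, _ in points) / 4.0
    cy = sum(y for _, y in points) / 4.0
    # Image y grows downwards, so an increasing atan2 angle is clockwise on screen.
    ring = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Treat the corner closest to the image origin (smallest x + y) as top left.
    start = min(range(4), key=lambda i: ring[i][0] + ring[i][1])
    return ring[start:] + ring[:start]


if __name__ == "__main__":
    clicks = [(110, 200), (20, 30), (25, 190), (120, 40)]
    for corner in order_corners_clockwise(clicks):
        print(corner)
```

The resulting order is top left, top right, bottom right, bottom left, which is the sequence to click in the preview image.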

As you need at least one view, click the + button to open the view creation window: trainer_3.png

Here you can set a new view for the object and then alter the recognition parameters for this view:

  1. First, choose a source for your view image at the top left: if you have a camera setup available, choose camera; otherwise you can use prerecorded images by choosing the files you provided in the input directory during setup. Once you have chosen the topic you want to use (or the specific file), the image should appear in the preview window. On the right side you can do the same for the test image, which is used for testing the parameters you have set.
  2. After you have chosen your image source, click Use current image to freeze the image so that it can be used. Once it is frozen, crop the image using the sliders below until only the relevant texture of the object is visible.

  3. Once you have chosen the parameters (see area #4 in the image above) you can test them by clicking Start test. Since a test can take a long time depending on your image size and the chosen parameters, make sure they are set correctly before testing. Your average score should be as high as possible (usually above 0.4) and your average time as low as possible. In the worst case the times of all views add up during recognition, so keep either the time per view low or the total number of views small.

  4. Here you can set the view parameters.
    • First, set the orientation of this view in the object frame (if this is the first view, it should probably match the orientation you set in the first window).
    • The next parameter is the score threshold used for the 2D matching. It should usually be around 0.15-0.2.
    • The offset values are used to move the central point of the found descriptor points if it does not already match the object's center. During the test, the center point is shown as a large point in the testing preview (see area #3). Move it until it lies in the center of your object (no new model has to be created when restarting the test after changing those two values).
    • The Axis 1 parameter (and Axis 2 in the case of a spherical object) describes the rotation axis around which the object can be turned by the angle parameter without changing its geometric appearance (for a perfect cylinder the angle would be 1; for a non-square rectangular base, for example, it would be 180).
    • On the right side you can choose the parameters used by the HALCON matching algorithms. Consult the HALCON documentation and set them accordingly. Usually you do not need to change the scale parameters and the patch size; the depth should be around 9 and the fern number around 25. These values depend heavily on the texture in the image and on the number of descriptor points found in it, so test them and adjust them to your setup.

    • If your object can be turned around this view's orientation axis by 180 degrees without changing its geometric appearance (e.g. a perfect cylinder or a box-shaped object like the Vitalis object shown in the example image above), check the upside-down box.
    • With the use-color checkbox you can choose whether to train the current view on a color image (if available) or on a greyscale one (usually it is better to avoid color, as greyscale images typically lead to better results).
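
The trade-off described in step 3 can be written down as simple arithmetic: in the worst case the average times of all views add up, and any view scoring below roughly 0.4 deserves another look. The sketch below assumes you note the average score and average time shown by each test yourself. The function name and the tuple layout are hypothetical, and the time unit is whatever the trainer reports.

```python
def summarize_views(view_results, min_score=0.4):
    """Summarize test results given as (name, average_score, average_time).

    Returns the worst-case recognition time (sum over all views) and the
    names of views whose average score is below min_score.
    """
    worst_case_time = sum(avg_time for _, _, avg_time in view_results)
    weak_views = [name for name, score, _ in view_results if score < min_score]
    return worst_case_time, weak_views


if __name__ == "__main__":
    results = [("front", 0.55, 0.5), ("back", 0.35, 0.25)]
    total, weak = summarize_views(results)
    print("worst-case time:", total)
    print("views below threshold:", weak)
```

With the sample values the worst case is 0.75 and only the "back" view falls below the threshold, so that view would be the one to retrain or drop.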

After you have set the parameters correctly and tested them (move the object around during testing and try to create a scenario close to your future recognition scenes), click the Save button to end this step. You are now back at the 2D recognition parameters window and should set the bounding box for the view you have just created. Choose it in the left panel and click the image preview as described above to create the bounding box. Repeat those steps for as many views as you need.

Once you are satisfied, click the Finish button to end the training. This creates all necessary output files in a directory named after the object in the trainer_data folder (this might take a while depending on your machine, the number of views and their parameters). If everything went correctly, you should be able to simply copy the created folder with the data inside to the asr_object_database (to the descriptor_surface category) and use it during recognition like the other objects.

There should be a log file in the created folder so you can check if and where errors occurred during the finishing step.
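
Once the log shows no errors, copying the result into the object database can be scripted. The sketch below copies a trained object folder into a category folder and refuses to overwrite an object that already exists there. Both paths are supplied by the caller, because the exact location of the descriptor_surface category inside asr_object_database is not given here. The function name is hypothetical.

```python
import os
import shutil


def install_trained_object(trained_dir, category_dir):
    """Copy trained_dir into category_dir without overwriting an existing object."""
    object_name = os.path.basename(os.path.normpath(trained_dir))
    target = os.path.join(category_dir, object_name)
    if os.path.exists(target):
        raise FileExistsError("object already present: " + target)
    shutil.copytree(trained_dir, target)
    return target


if __name__ == "__main__":
    import tempfile

    # Demonstration with throwaway folders instead of real package paths.
    root = tempfile.mkdtemp()
    trained = os.path.join(root, "trainer_data", "CoffeeBox")
    category = os.path.join(root, "database_category")
    os.makedirs(trained)
    os.makedirs(category)
    print(install_trained_object(trained, category))
```

Refusing to overwrite is a deliberate choice in this sketch: a retrained object then has to be removed from the database explicitly instead of being replaced silently.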

Wiki: asr_descriptor_surface_based_recognition/trainer_tutorial (last edited 2017-06-06 11:57:48 by TobiasAllgeyer)