Note: This tutorial assumes that you have completed the previous tutorials: Introduction to using the database_interface.
(!) Please ask about problems and questions regarding this tutorial on answers.ros.org. Don't forget to include in your question the link to this page, the versions of your OS & ROS, and also add appropriate tags.

The SQL database interface: advanced concepts

Description: This tutorial shows some of the advanced concepts of using the SQL database interface, such as objects spanning multiple tables, sequences, and binary data fields.

Tutorial Level: INTERMEDIATE

In this tutorial, we will be extending the applied example from the previous tutorial, Introduction to using the database_interface, to illustrate some additional concepts.

Field (de)serialization: TEXT vs. BINARY

We recall from the previous tutorial that any data field of a database class must be wrapped as a DBField<>. Behind the scenes, such fields must be read from, and written to the database itself.

The low-level C++ interfaces that we use generally provide two types of data serialization: TEXT and BINARY. However, the SQL data type does not always match the C++ data type, therefore some type of conversion is necessary regardless of which method is used. We have tried to hide as much as possible this additional overhead from the user.

When constructing a DBField<>, the user can specify what kind of serialization should be performed. The following rules apply:

  • TEXT (de)serialization is performed by default using the streaming (<< and >>) operators

    • any data type for which these operators are defined can be encapsulated in a DBField<>

    • if you want to encapsulate your own data type in a DBField<> you should define these operators.

  • the database interface defines the streaming operators for vectors, (de)serializing them as expected by the database:
    • std::vector<T> gets (de)serialized as {T,T,...,T}

    • the type T must in turn have the streaming operators defined
  • BINARY (de)serialization is only provided for a single data type: std::vector<char>

    • (de)serializing any other DBField<> type using BINARY serialization will fail

    • we are hoping to provide BINARY (de)serialization for more data types in future releases

Binary field example

Let's assume that we also want to store a student's photo in our table, as a binary bitmap. We will add one more column to our table, student_photo, with SQL type bytea.

The C++ class would look like this:

class StudentWithPhoto : public database_interface::DBClass
{
public:  
  database_interface::DBField<int> student_id_;
  database_interface::DBField<std::string> student_first_name_;
  database_interface::DBField<std::string> student_last_name_;
  database_interface::DBField< std::vector<std::string> > student_majors_;
  database_interface::DBField<double> student_gpa_;
  //new field:
  database_interface::DBField< std::vector<char> > student_photo_;
  
  StudentWithPhoto() : 
    student_id_(database_interface::DBFieldBase::TEXT, 
                this, "student_id", "student", true),
    student_first_name_(database_interface::DBFieldBase::TEXT, 
                        this, "student_first_name", "student", true),
    student_last_name_(database_interface::DBFieldBase::TEXT, 
                       this, "student_last_name", "student", true),
    student_majors_(database_interface::DBFieldBase::TEXT, 
                    this, "student_majors", "student", true),
    student_gpa_(database_interface::DBFieldBase::TEXT, 
                 this, "student_gpa", "student", true),
    student_photo_(database_interface::DBFieldBase::BINARY, 
                 this, "student_photo", "student", true)
  {
    primary_key_field_ = &student_id_;

    fields_.push_back(&student_first_name_);
    fields_.push_back(&student_last_name_);
    fields_.push_back(&student_majors_);
    fields_.push_back(&student_gpa_);
    //don't forget to add all fields to the fields_ array:
    fields_.push_back(&student_photo_);

    setAllFieldsReadFromDatabase(true);
    setAllFieldsWriteToDatabase(true);

    student_photo_.setReadFromDatabase(false);
    student_photo_.setWriteToDatabase(false);
  }
};

Note that the new field gets constructed as a field with BINARY (de)serialization:

    student_photo_(database_interface::DBFieldBase::BINARY, 
                 this, "student_photo", "student", true)

Also note that, in this particular case, the new field will not be loaded from, or saved to the database by default.

    student_photo_.setReadFromDatabase(false);
    student_photo_.setWriteToDatabase(false);

This is due to a problem in the underlying SQL library, explained below. However, in general, we assume that binary fields contain very large amounts of data that we do not want to read from, or write to the database unless we specifically ask for it. You can always retrieve a particular field of a given class instance like this:

  //retrieve the students, but photos will not be loaded by default
  std::vector< boost::shared_ptr<StudentWithPhoto> > students_with_photos;
  database.getList(students_with_photos);

  //now get the photo for the first student retrieved
  database.loadFromDatabase( &(students_with_photos[0]->student_photo_) );
  std::cerr << "The photo has " 
            << students_with_photos[0]->student_photo_.data().size() 
            << " bytes \n";

Combining TEXT and BINARY fields

Due to a restriction of the underlying libpq library for PostgreSQL databases, TEXT and BINARY fields can not be retrieved as part of the same query. This forces this model where class instances are retrieved without their BINARY fields, which are later loaded on-demand as in the example above. We might find a workaround for this problem in future releases.

Using Sequences for Automatic Primary Key Generation

In the previous tutorial we have seen how to insert a new instance of a class (a Grade in our example) in the database. However, we had to specify the value of the primary key by hand, which is generally cumbersome.

PostgreSQL provides a solution for this: using sequences as default values for primary keys. A sequence is a database object that can be queried and incremented. If you specify the sequence as the default value of your primary key, then any new insertion in the table will automatically get assigned the next number in the sequence as an id.

Note that creating the sequence and setting is as default value for a column is a SQL operation that is external to this interface. Here we will create a sequence called grade_id_seq with the following SQL command:

CREATE SEQUENCE grade_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1;

And set it as the default value for the grade_id column in the grade table. Using pgadmin3, add "nextval('grade_id_seq'::regclass)" to the default value field for that column.

Once this is done, we only need to tell the relevant C++ class, preferably in its constructor, that the primary key field uses a sequence with that name:

class GradeWithSequence : public database_interface::DBClass
{
public:
  database_interface::DBField<int> grade_id_;
  database_interface::DBField<int> student_id_;
  database_interface::DBField<std::string> grade_subject_;
  database_interface::DBField<double> grade_grade_;

  GradeWithSequence() :
    grade_id_(database_interface::DBFieldBase::TEXT, 
                this, "grade_id", "grade", true),
    student_id_(database_interface::DBFieldBase::TEXT, 
                this, "student_id", "grade", true),
    grade_subject_(database_interface::DBFieldBase::TEXT, 
                   this, "grade_subject", "grade", true),
    grade_grade_(database_interface::DBFieldBase::TEXT, 
                 this, "grade_grade", "grade", true)  
  {
    primary_key_field_ = &grade_id_;
    fields_.push_back(&student_id_);
    fields_.push_back(&grade_subject_);
    fields_.push_back(&grade_grade_);

    setAllFieldsReadFromDatabase(true);
    setAllFieldsWriteToDatabase(true);

    grade_id_.setSequenceName("grade_id_seq");
    grade_id_.setWriteToDatabase(false);
  }
};

Note that, once we use a sequence, we do not want the actual value of the field to be written to the database (the interface will use the default value instead), hence the two calls:

    grade_id_.setSequenceName("grade_id_seq");
    grade_id_.setWriteToDatabase(false);

You can then write new GradeWithSequence instances into the database without worrying about the primary key value. The value will be generated automatically upon insertion, and returned to you as the relevant field of the inserted instance:

  //insert a grade with automatic key generation
  GradeWithSequence new_grade_seq;
  new_grade_seq.student_id_.data() = 2;
  new_grade_seq.grade_subject_.data() = "mythology";
  new_grade_seq.grade_grade_.data() = 4.0;
  database.insertIntoDatabase(&new_grade_seq);
  std::cerr << "The newly inserted grade was assigned grade_id=" 
            << new_grade_seq.grade_id_.data() 
            << "\n";

Objects Spanning Multiple Tables

So far, we have only seen cases where there is perfect overlap between a C++ DBClass and a SQL table. However, this is not a hard rule:

  • a DBClass can reference only a subset of the fields of a table

  • a DBClass can span multiple tables. However, this is a more obscure case wich might not be encountered in practice, so we will provide limited details here

    • you can use this if you want to separate large data entries from more commonly accessed, smaller sized entries (for example, if you want the student photo stored in s different table than the rest of the student information)
    • more details can be found in the Code API for the DBClass, and an applied example in the DatabaseOriginalModel class in the household_objects_database package.

Wiki: database_interface/Tutorials/The SQL database interface: advanced concepts (last edited 2011-03-07 04:13:15 by KevinWatts)