## For instruction on writing tutorials ## http://www.ros.org/wiki/WritingTutorials #################################### ##The SQL Database Interface: Advanced Concepts #################################### ## for a custom note with links: ## note = ## for the canned note of "This tutorial assumes that you have completed the previous tutorials:" just add the links ## note.0=[[database_interface/Tutorials/Introduction to using the database_interface|Introduction to using the database_interface]] ## descriptive title for the tutorial ## title = The SQL database interface: advanced concepts ## multi-line description to be displayed in search ## description = This tutorial shows some of the advanced concepts of using the SQL database interface, such as objects spanning multiple tables, sequences, and binary data fields. ## the next tutorial description (optional) ## next = ## links to next tutorial (optional) ## next.0.link= ## next.1.link= ## what level user is this tutorial for ## level= IntermediateCategory ## keywords = #################################### <> <> In this tutorial, we will be extending the applied example from the previous tutorial, [[database_interface/Tutorials/Introduction to using the database_interface|Introduction to using the database_interface]], to illustrate some additional concepts. == Field (de)serialization: TEXT vs. BINARY == We recall from the previous tutorial that any data field of a database class must be wrapped as a `DBField<>`. Behind the scenes, such fields must be read from, and written to the database itself. The low-level C++ interfaces that we use generally provide two types of data serialization: TEXT and BINARY. However, the SQL data type does not always match the C++ data type, therefore some type of conversion is necessary regardless of which method is used. We have tried to hide as much as possible this additional overhead from the user. When constructing a `DBField<>`, the user can specify what kind of serialization should be performed. The following rules apply: * '''TEXT (de)serialization is performed by default using the streaming (`<<` and `>>`) operators''' * any data type for which these operators are defined can be encapsulated in a `DBField<>` * if you want to encapsulate your own data type in a `DBField<>` you should define these operators. * the database interface defines the streaming operators for vectors, (de)serializing them as expected by the database: * `std::vector` gets (de)serialized as `{T,T,...,T}` * the type T must in turn have the streaming operators defined * '''BINARY (de)serialization is only provided for a single data type''': `std::vector` * (de)serializing any other `DBField<>` type using BINARY serialization will fail * we are hoping to provide BINARY (de)serialization for more data types in future releases === Binary field example === Let's assume that we also want to store a student's photo in our table, as a binary bitmap. We will add one more column to our table, `student_photo`, with SQL type `bytea`. The C++ class would look like this: {{{ class StudentWithPhoto : public database_interface::DBClass { public: database_interface::DBField student_id_; database_interface::DBField student_first_name_; database_interface::DBField student_last_name_; database_interface::DBField< std::vector > student_majors_; database_interface::DBField student_gpa_; //new field: database_interface::DBField< std::vector > student_photo_; StudentWithPhoto() : student_id_(database_interface::DBFieldBase::TEXT, this, "student_id", "student", true), student_first_name_(database_interface::DBFieldBase::TEXT, this, "student_first_name", "student", true), student_last_name_(database_interface::DBFieldBase::TEXT, this, "student_last_name", "student", true), student_majors_(database_interface::DBFieldBase::TEXT, this, "student_majors", "student", true), student_gpa_(database_interface::DBFieldBase::TEXT, this, "student_gpa", "student", true), student_photo_(database_interface::DBFieldBase::BINARY, this, "student_photo", "student", true) { primary_key_field_ = &student_id_; fields_.push_back(&student_first_name_); fields_.push_back(&student_last_name_); fields_.push_back(&student_majors_); fields_.push_back(&student_gpa_); //don't forget to add all fields to the fields_ array: fields_.push_back(&student_photo_); setAllFieldsReadFromDatabase(true); setAllFieldsWriteToDatabase(true); student_photo_.setReadFromDatabase(false); student_photo_.setWriteToDatabase(false); } }; }}} Note that the new field gets constructed as a field with BINARY (de)serialization: {{{ student_photo_(database_interface::DBFieldBase::BINARY, this, "student_photo", "student", true) }}} Also note that, in this particular case, the new field will '''not''' be loaded from, or saved to the database by default. {{{ student_photo_.setReadFromDatabase(false); student_photo_.setWriteToDatabase(false); }}} This is due to a problem in the underlying SQL library, explained below. However, in general, we assume that binary fields contain very large amounts of data that we do not want to read from, or write to the database unless we specifically ask for it. You can always retrieve a particular field of a given class instance like this: {{{ //retrieve the students, but photos will not be loaded by default std::vector< boost::shared_ptr > students_with_photos; database.getList(students_with_photos); //now get the photo for the first student retrieved database.loadFromDatabase( &(students_with_photos[0]->student_photo_) ); std::cerr << "The photo has " << students_with_photos[0]->student_photo_.data().size() << " bytes \n"; }}} === Combining TEXT and BINARY fields === Due to a restriction of the underlying `libpq` library for PostgreSQL databases, TEXT and BINARY fields can not be retrieved '''as part of the same query'''. This forces this model where class instances are retrieved without their BINARY fields, which are later loaded on-demand as in the example above. We might find a workaround for this problem in future releases. == Using Sequences for Automatic Primary Key Generation == In the previous tutorial we have seen how to insert a new instance of a class (a `Grade` in our example) in the database. However, we had to specify the value of the primary key by hand, which is generally cumbersome. PostgreSQL provides a solution for this: using sequences as default values for primary keys. A sequence is a database object that can be queried and incremented. If you specify the sequence as the default value of your primary key, then any new insertion in the table will automatically get assigned the next number in the sequence as an id. Note that creating the sequence and setting is as default value for a column is a SQL operation that is external to this interface. Here we will create a sequence called `grade_id_seq` with the following SQL command: {{{ CREATE SEQUENCE grade_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1; }}} And set it as the default value for the `grade_id` column in the `grade` table. Using pgadmin3, add "nextval('grade_id_seq'::regclass)" to the default value field for that column. Once this is done, we only need to tell the relevant C++ class, preferably in its constructor, that the primary key field uses a sequence with that name: {{{ class GradeWithSequence : public database_interface::DBClass { public: database_interface::DBField grade_id_; database_interface::DBField student_id_; database_interface::DBField grade_subject_; database_interface::DBField grade_grade_; GradeWithSequence() : grade_id_(database_interface::DBFieldBase::TEXT, this, "grade_id", "grade", true), student_id_(database_interface::DBFieldBase::TEXT, this, "student_id", "grade", true), grade_subject_(database_interface::DBFieldBase::TEXT, this, "grade_subject", "grade", true), grade_grade_(database_interface::DBFieldBase::TEXT, this, "grade_grade", "grade", true) { primary_key_field_ = &grade_id_; fields_.push_back(&student_id_); fields_.push_back(&grade_subject_); fields_.push_back(&grade_grade_); setAllFieldsReadFromDatabase(true); setAllFieldsWriteToDatabase(true); grade_id_.setSequenceName("grade_id_seq"); grade_id_.setWriteToDatabase(false); } }; }}} Note that, once we use a sequence, we do not want the actual value of the field to be written to the database (the interface will use the default value instead), hence the two calls: {{{ grade_id_.setSequenceName("grade_id_seq"); grade_id_.setWriteToDatabase(false); }}} You can then write new `GradeWithSequence` instances into the database without worrying about the primary key value. The value will be generated automatically upon insertion, and returned to you as the relevant field of the inserted instance: {{{ //insert a grade with automatic key generation GradeWithSequence new_grade_seq; new_grade_seq.student_id_.data() = 2; new_grade_seq.grade_subject_.data() = "mythology"; new_grade_seq.grade_grade_.data() = 4.0; database.insertIntoDatabase(&new_grade_seq); std::cerr << "The newly inserted grade was assigned grade_id=" << new_grade_seq.grade_id_.data() << "\n"; }}} == Objects Spanning Multiple Tables == So far, we have only seen cases where there is perfect overlap between a C++ `DBClass` and a SQL table. However, this is not a hard rule: * a `DBClass` can reference only a subset of the fields of a table * a `DBClass` can span multiple tables. However, this is a more obscure case wich might not be encountered in practice, so we will provide limited details here * you can use this if you want to separate large data entries from more commonly accessed, smaller sized entries (for example, if you want the student photo stored in s different table than the rest of the student information) * more details can be found in the Code API for the [[http://www.ros.org/doc/api/database_interface/html/classdatabase__interface_1_1DBClass.html|DBClass]], and an applied example in the [[http://www.ros.org/doc/api/household_objects_database/html/classhousehold__objects__database_1_1DatabaseOriginalModel.html|DatabaseOriginalModel]] class in the [[household_objects_database]] package. ## AUTOGENERATED DO NOT DELETE ## TutorialCategory ## FILL IN THE STACK TUTORIAL CATEGORY HERE