C++ Collections Guide and Reference

Chapter 7

Using Indexes to Optimize Performance

The information about queries and indexes is organized in the following manner:

Indexes and Query Optimization

Adding an Index to a Collection with add_index()

You can direct ObjectStore to optimize queries over a particular collection by adding an index into the collection with the member function os_collection::add_index().

Suppose, for example, that you want to optimize look-up of parts in a_set by part number as in the following query:

      a_set->query_pick("part*", "part_number==411", db1)
Example: add_index()
You request an index into the set a_set. You specify the key of the index as the value of the data member part_number. To do this, use the member function os_collection::add_index():

      os_Set<part*> *a_set;
      . . . 
      os_index_path &key_spec = 
            os_index_path::create("part*", "part_number",db1);
      a_set->add_index(key_spec);
The key here is specified with a reference to a path. See Creating Paths.

The last argument to add_index() specifies the database, segment, or object cluster in which the index is stored. The index remains until it is removed with drop_index(). By default, an index is placed in the same segment as the collection for which the index is being added.

After you invoke this function, any query over a_set involving look-up by part_number is optimized.

Indexes can also end in functions. In this case, the query must end in the function in order to employ the index.

Index Maintenance

You might need to perform index maintenance for any data member or member function in the path used to specify the index key. See Performing or Enabling Index Maintenance.

Pointer-Valued Members and char* Keys

If you create an index with a path to a pointer-valued data member - other than a char*-valued member - you can optimize look-up based on address. char*-valued data members are treated specially. An index based on a char* member optimizes look-up by the string pointed to.

However, because the query is on the address and not the value, pointer-valued members are limited in their usefulness.

Indexes and Performance

Without the index, a linear search must be used to perform such queries, and each element of a_set will have to be examined. By adding an index into a_set, you are instructing the system to maintain an access method (consisting of hash tables and/or a B-tree) allowing efficient look-up by part_number.

Adding an index into a collection slows down updates to the collection somewhat. It also slows down updates to the data member specifying the index key because index maintenance is performed whenever such an update occurs. However, indexes make the associated look-ups significantly faster. Therefore, it is a good idea to request an index if the ratio of look-ups to updates is large.

Dropping Indexes from a Collection with drop_index()

Indexes can be added and dropped during the run of an application. If an index makes sense for only part of an application's run, the application can add an index, then remove it later. For example, if the first part of an application performs many look-ups but relatively few updates and the second part performs many updates and relatively few look-ups, the program can add an index at the beginning of the first part, then remove the index at the beginning of the second part.

Because of this, you might unexpectedly find that you want to drop an index temporarily and later add it again. It is a good idea to design your application to keep track of your indexes so that you can easily drop and re-add them at a later time, especially if you have many indexes.

Example:
drop_index()
You remove an index with the member function drop_index(). Following is an example:

      os_Set<part*> *a_set;
      . . . 
      os_index_path &key_spec = 
            os_index_path::create("part*","part_number", db1);
      . . . 
      a_set->drop_index(key_spec);
Note that you specify the key, with an os_index_path, when dropping an index, because the same collection can have several different indexes to optimize different kinds of look-ups.

The os_index_path argument does not need to be the same instance of os_index_path supplied when the index was added, but it must specify the same key. If the path strings used to create two os_index_paths differ only with regard to white space, the os_index_paths specify the same index key.

If an index with the specified key was never added to the collection, err_no_such_index is signaled.

You can add and drop indexes at run time as frequently as you like. The ObjectStore query optimizer adapts dynamically.

Testing for the Presence of an Index with has_index()

You can also test for the presence of an index with a specified key by using the member function has_index().

This function returns a value saying whether an index can support the index type specified with index_options.

You must supply a path string and one of the index options. An index that supports exact-match queries (hash table) can only be used for exact matches. An index that supports range queries (binary tree) can be used for both exact-match and range queries. In effect, os_collection::has_index answers the question "Can this index support this type of query?" and not the option that was used to create the index.

Possible values for index_option are ordered and unordered.

Following is an example:

      os_Set<part*> *a_set;
      . . . 
      os_index_path &key_spec = ...
      . . . 
      if (a_set->has_index(key_spec, options)) ...
options for has_index can have the value ordered or unordered.

The function os_collection::get_indexes() allows you to retrieve information on all the indexes into a specified collection.

Indexes and Complex Paths

Path expressions can specify not just a single data member, but a navigational path involving multiple member accesses. Such path expressions can be used to specify index keys. For example, suppose that you want to optimize look-up of a part based on the emp_id of any of the responsible_engineers for the part (suppose that the member responsible_engineers is collection valued). You can use the following path:

      os_index_path::create(
            "part*","(*responsible_engineers)[]->emp_id", db1)
This path is like ones you have seen before except that here the data member name responsible_engineers is followed by the symbols [], indicating that the next component of the path (emp_id) is to be applied to each element of the collection of responsible engineers rather than to the collection itself. Note that you cannot end an expression with the symbols [].

Index Options

As mentioned previously, the index from part numbers to parts is implemented as hash tables (unordered indexes). This is the default. But if your application performs range queries involving part_number, you can request an ordered index, implemented using a B-tree. With a B-tree, queries involving <, <=, >, or >= comparisons on part numbers can be computed more efficiently than with an index implemented only with hash tables.

Example: B-tree query
Following is an example:

      part_extent->query(
            "part*", "part_number > 411", db1
      )
      part_extent[:part_number > 411:]
Use of a B-tree also makes iteration in order of part_number more efficient (see Controlling Traversal Order).

os_index_path::ordered Enumerator

You request an ordered index by specifying os_index_path::ordered as the second argument to add_index() when requesting the index:

      os_Set<part*> *a_set;
      . . . 
      os_index_path &a_path = 
            os_index_path::create("part*","part_number", db1);
      a_set->add_index(a_path, os_index_path::ordered);
Here os_index_path::ordered is an ObjectStore-supplied enumerator.

Of course, if a B-tree is best for parts of your application and a hash table is best for other parts, you can add and drop indexes of these types dynamically, as appropriate. You cannot have an ordered and unordered index using the same os_index_path on a collection.

Index Option Enumerators

You can also specify a variety of other index options using various other enumerators. The enumerators can be combined into a bit pattern with bit-wise disjunction (using | (bit-wise OR)). Following is the complete list of index option enumerators.

Default index behavior
The following disjunction of enumerators specifies the default index behavior:

      os_index_path::unordered |
      os_index_path::allow_duplicates |
      os_index_path::copy_key
By default, an index is allocated in the same segment as its associated collection. If you want, however, you can supply an os_database* or os_segment* indicating where to allocate a new index. Supply this information as the third argument for calls that include an options argument. Otherwise, supply the clustering information as the second argument.

Performing or Enabling Index Maintenance

Whenever you use a path, for each data member mentioned in the path string, except const and collection-valued members, you must perform or enable index maintenance. Note that failing to perform or enable index maintenance can result in corrupted indexes, incorrect query results, and program failures.

os_indexable data members
Data members declared as os_indexable do automatic index maintenance on update. Pointer-valued data members should not be declared as os_indexable because the index will be on the address, not the value.

Collections and indexes
For all collections, a collection can either participate in an index or own an index over itself. In either case, when items are inserted into and removed from collections, automatic index maintenance occurs. This is true whether or not the item is os_indexable and is also true for indexed member functions. However, you must still do index maintenance on update.

Non-os_indexable data members
For data members that are not declared os_indexable (for instance pointer-valued data members), you must do index maintenance when you modify the object that the pointer points to.

Paths as Indexes

If you use a path as an index key or to specify traversal order for a safe cursor, you have two or three options for each data member in the path.

Option 1: automatic index maintenance
Declare an os_backptr member and enable automatic index maintenance. See Declaring an os_backptr Member. See also Enabling Automatic Index Maintenance.

This option simplifies the coding of updates compared to options 2 and 3.

Like option 2, this option carries extra space overhead compared to option 3, in the form of an os_backptr member for each object containing the member.

Option 2: user-controlled index maintenance (using os_backptr)
Declare an os_backptr and perform user-controlled index maintenance, using make_link() and break_link(). See Declaring an os_backptr Member and User-Controlled Index Maintenance with an os_backptr.

Like option 3, this option can make coding updates to the member slightly more complex compared to option 1, but it avoids the use of "wrapper objects" and the apparent value/actual value distinction. See The Actual Value/Apparent Value Distinction for more information.

Like option 1, this option carries extra space overhead compared to option 3, in the form of an os_backptr member for each object containing the member.

Option 3: user-controlled index maintenance (without os_backptr)
Do not declare an os_backptr and perform user-controlled index maintenance, using the collection operations insert() and remove(). This option only applies if the member is not mentioned in any multistep path used as an index key. See User-Controlled Index Maintenance Without an os_backptr.

Like option 2, this option can make coding updates to the member slightly more complex, compared to option 1, but it avoids the use of wrapper objects and the apparent value/actual value distinction. See The Actual Value/Apparent Value Distinction for more information.

With this option, you need not declare an os_backptr, so you avoid the space overhead, incurred with options 1 and 2, of an os_backptr member for each object containing the member. However, the index maintenance accompanying each data member update can be more expensive.

Such index maintenance is more expensive if there is an index that 1) is not keyed by the data member and 2) indexes a collection on which you perform insert() and remove() for index maintenance associated with updating the member.

With option 3, each such index is updated, whereas with options 1 and 2, such indexes are not updated.

Declaring an os_backptr Member

To make a data member indexable, you add to the class whose data member you want to be indexable a public or private data member of type os_backptr. The declaration of the data member of type os_backptr must precede the declaration of the data member (or member functions) you want to make indexable.

Example: os_backptr declaration
Following is an example:

      class part {
            public:
                  . . . 
                  os_backptr b ;
                  int id ;
                  department *dept ;
                  . . . 
                  part();
                  ~part();
                  . . . 
      };
Note that it is sufficient to define a single data member of type os_backptr for all indexable members of a class.

Inheritance of the os_backptr

ObjectStore supports inheritance of the os_backptr data member provided that the member is inherited from a base class along the leftmost side of the type inheritance lattice and provided that the leftmost base class is not a virtual base class (directly or through inheritance). In all other cases, you must define a data member of type os_backptr directly in the class defining the members you want to be indexable.

Example: os_backptr class definitions
Consider, for example, the following class definitions:

      class base1 {... os_backptr b1 ; ...} ;
      class base2a : public base1 {...} ;
      class base2b {... os_backptr b2b ;...} ;
      class derived : public base2a, public base2b {
            char *name ;
      } ;
Class derived's name member can use class base1's os_backptr member b1. Any data member in class base2a can also use class base1's os_backptr member b1. Indexable members in class base2b should continue to use base2b's os_backptr member b2b.

Now consider

      class base1 {...os_backptr b1 ;...} ;
      class base2a : virtual public base1 {...} ;
      class base2b {...os_backptr b2b ; ... } ;
      class derived : public base2a, public base2b {
            os_backptr d ;
            char *name ;
      } ;
It is not possible for class derived's indexable data members to use base1's os_backptr member, because base2a inherits class base1 virtually. For the same reason, data members in class base2a cannot use class base1's os_backptr member b1. Because base2b is not inherited along the leftmost side of the type inheritance lattice, an additional os_backptr member (d in this example) must be defined to allow name to be indexable.

Enabling Automatic Index Maintenance

Using the functions os_backptr::break_link() and os_backptr::make_link() whenever you update the member allows ObjectStore to perform index maintenance under user control (see User-Controlled Index Maintenance with an os_backptr).

This is also true for pointer-valued data members. Because the pointer is the index value, ObjectStore does not detect updates and you must use os_backptr::break_link() and os_backptr::make_link() for index maintenance whenever you update the member.

However, for all other data members that are not char* or char[] valued, you can avoid using these functions by declaring the data member using the following macros:

If you use these macros, updates to the data member trigger automatic index maintenance.

os_indexable_member() Macro

For a data member normally declared as

      value-type member-name ;
declare it instead as

      os_indexable_member(class-name,member-name,value-type) 
            member-name ;
where class-name is the name of the class defining the indexable member, member-name is the name of the data member being made indexable, and value-type is the member's type.

os_indexable_body() Macro

The last thing you must do to make a data member indexable is call the macro os_indexable_body() to instantiate the bodies of the functions that provide access to the indexable member. These functions ensure that any indexes keyed by that data member are properly updated when the member is updated. The macro call should appear (at top level) in a file associated with the class defining the indexable member:

      os_indexable_body(part,id,int,os_index(part,b));
      os_indexable_body(part,idx2,int,os_index(part,b));
Here the macro calls have the form

      os_indexable_body(class,member,value-type,backptr-spec) 
where

os_index() macro

Calls to os_index() have the form

      os_index(class,member) 
where class is the name of the class defining the indexable member, and member is the name of the os_backptr-valued data member appearing before indexable members of the class.

Avoid White Space in Macro Arguments

Some macro arguments are used (among other things) to concatenate unique names. The details of cpp macro preprocessing differ from compiler to compiler, and in some cases it is necessary to enter these macro arguments without white space to ensure that the argument concatenation will work correctly. All the examples given in this section follow this important convention, and should therefore work with any cpp.

The Actual Value/Apparent Value Distinction

The actual value of an indexable data member is a special container object that encapsulates the apparent value. For example, the apparent value of the data member id of the previous examples is an int, but the actual value is an instance of a class defined implicitly by the member macro.

This implicitly defined class defines operator =() and a conversion operator (operator int() in this example) for converting instances of the implicitly defined type to instances of the apparent value type. Under most circumstances, these operators make the container object transparent.

Examples
For example, to set the id of a part to 411, you simply do

      a_part->id = 411
And to get the value and pass it to a function expecting an int argument, you do

      f(a_part->id)
Because f() expects an int argument (the apparent but not actual value type of id), the conversion operator is invoked, making the above call equivalent to

      f(a_part->id.operator int())
For cases where the actual value cannot be transparent, you should use the functions getvalue() and setvalue() defined by the actual value type of the indexable member. For example:

      printf("The id is %d \n", a_part->id);  /* This won't work correctly */
does not work because the compiler cannot tell that the second argument to printf() is supposed to be an int, so the conversion operator is not invoked. Instead you should use

      printf("The id is %d \n", a_part->id.getvalue()); 
or

      printf("The id is %d \n", (int)(a_part->id));
The getvalue() and setvalue() functions will always work correctly for getting and setting indexable data member values.

char* and char() Members

char* and char[] indexable data members are typically intended to be associated with indexes keyed by string rather than address. For example, iteration based on a char* member proceeds in order of the string pointed to rather than in order of address. To ensure proper index maintenance for such members, you must use user-controlled index maintenance. See User-Controlled Index Maintenance with an os_backptr.

Restriction on Updates

Note that if the values of an indexable data member are instances of a user-declared class (not pointers to such instances), the values of such an instance's data members cannot be directly altered without circumventing the required index maintenance. To make such a change, the value of the indexable data member must be replaced wholesale with a modified copy of the old value. That is, the instance must be copied and altered, then the altered object must be copied back as the new value of the indexable member.

User-Controlled Index Maintenance with an
os_backptr

Using the macros described in Enabling Automatic Index Maintenance allows ObjectStore to perform fully automatic index maintenance for data members. However, for any data member, you can avoid using these macros (and the accompanying actual value/apparent value distinction) by using the functions os_backptr::break_link() and os_backptr::make_link() whenever you update the member. In this case you need only define the os_backptr member; the indexable member itself can be declared in the normal way.

You also use overloadings of make_link() and break_link() to perform index maintenance for member functions called in query or path strings.

Making and Breaking Links on Indexable Data Members

Call break_link() just before making a change to an indexable data member (this removes an entry from each relevant index), and call make_link() just after making the change (this inserts a new entry into each relevant index, indexing the object by the new value of the relevant path). You can ensure that this happens by encapsulating these calls in a member function for setting the value of the data member.

For indexes keyed by paths that go through the elements of a collection, index maintenance is performed automatically when you change the membership of a collection.

Example: message class definition
For example, given the following class definition:

      #include <stddef.h>
      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      . . . 
      class message {
            public:
                  . . . 
                  os_backptr b;
                  int id;         /* an indexable member */
                  char *subject_line;  /* a second indexable member */
                  class date {
                        public:
                              int day;
                              int month;
                              int year; 
                  } date_received;  /* a third indexable member */
                  . . . 
                  message(int id, char*subj, int dd, int mm, int yy);
                  ~message();
                  int set_id(int);
                  char *set_subject_line(char*);
                  void set_date(int dd, int mm, int yy);
      };
Example: function definitions
You should define functions for setting each data member as follows:

int message::set_id(int i) {
            b.break_link(
                  &id, 
                  &id, 
                  os_index(message,b) - os_index(message,id)
            );
            id = i;
            b.make_link(
                  &id, 
                  &id, 
                  os_index(message,b) - os_index(message,id)
            );
            return i;
} /* end set_id() definition */
char *message::set_subject_line(char *subj) {
            b.break_link(
                  &subject_line, 
                  &subject_line, 
                  os_index(message,b) - os_index(message,subject_line)
            );
            if (strlen(subj) < 500)
                  strcpy(subject_line, subj);
            else
                  error("string too long");
            b.make_link(
                  &subject_line, 
                  &subject_line, 
                  os_index(message,b) - os_index(message,subject_line)
            );
            return subj;
} /* end set_subject_line() definition*/
void message::set_date(int dd, int mm, int yy) {
            b.break_link(
                  &date_received, 
                  &date_received, 
                  os_index(message,b) - os_index(message,date_received)
            );
            date_received.day = dd;
            date_received.month = mm;
            date_received.year = yy;
            b.make_link(
                  &date_received, 
                  &date_received, 
                  os_index(message,b) - os_index(message,date_received)
            );
} /* end set_date() definition */
Note that because the values of date_received are instances of a user-defined class, date, this example assumes that you have defined and registered a rank function (and possibly a hash function) for the class date. See the examples in Supplying Rank and Hash Functions.

If these set-value functions provide the only interface for modifying the values of indexable members, indexes will be properly maintained. Circumventing the interface, for example, by passing the address of an indexable member value to a function that alters its value through the pointer, can result in inconsistent indexes.

Making and Breaking Links to Indexed Member Functions

To maintain indexes keyed by paths containing member function calls, use the following new overloadings of os_backptr::make_link() and os_backptr::break_link():

      void make_link(
            void* ptr_to_obj, 
            void* ptr_to_obj, 
            const char* class_name, 
            const char* function_name
      ) const ;
      void break_link(
            void* ptr_to_obj, 
            void* ptr_to_obj, 
            const char* class_name,
            const char* function_name
      ) const;
Automatic index maintenance is not available for such indexes.

Arguments to make_link() and break_link()
ptr_to_obj is the object whose state changed, requiring an update to one or more indexes. When you call these functions, supply the same value for the first and second arguments.

class_name is the name of the class that defines the member function called in the path of the indexes to be updated.

function_name is the name of the member function itself.

When to use these functions
Call these functions whenever you perform an update that affects the return value of any member function appearing in a query or path string. You must make a pair of calls (one to break_link() and one to make_link()) for each such member function affected by each data member change.

Call break_link() just before making the change (this removes an entry from each relevant index), and call make_link() just after making the change (this inserts a new entry into each relevant index, indexing the object by the new value of the relevant path). You can ensure that this happens by encapsulating these calls in a member function for setting the value of the data member.

For indexes keyed by paths that go through the elements of a collection (for example, * ( (*get_children())[]->get_location() ) ) index maintenance is performed automatically when you change the membership of a collection. See Example: Member Function Calls in Query and Path Strings.

User-Controlled Index Maintenance Without an
os_backptr

Collections do automatic index maintenance. Therefore, you avoid os_backptr overhead in the following way: If a member is not mentioned in any multistep path used as an index key, you can perform index maintenance by using collection insert and remove operations. Performing index maintenance in this way allows you to avoid declaring an os_backptr member. See Performing or Enabling Index Maintenance.

When you update the data member, follow this procedure:

  1. For each index keyed by the member, remove the object containing the member from the indexed collection (it might or might not actually be an element of this collection).

  2. Update the member.

  3. Insert the object back into each collection, if any, mentioned in step 1, provided the object was a member of that collection prior to your performing step 1.

Rank and Hash Function Requirements

If your application uses paths ending in instances of a class, you must define and register a rank function and possibly a hash function for the path's terminal type. For ordered indexes keyed by such paths, you must supply a rank function. For unordered indexes keyed by such paths, you must supply both a rank and a hash function (the rank function is used to resolve hashing collisions). See Supplying Rank and Hash Functions.

Example: Member Function Calls in Query and Path Strings

Following is a listing of three files that make up a simple program using member function calls in paths and queries:

Files used in this example
The rectangle class
The class rectangle defines public accessor functions for the following pieces of abstract state:

The values for label, length, width, children, and location are stored in private data members. The value for area is computed from the values for length and width.

Query functions
Each function for reading a piece of abstract state (a get function) is declared as a query function. In addition, the public members coord::x and coord::y are declared as indexable data members. This allows the member function rectangle::get_location(), for example, to be called in a query and allows indexes to be keyed by, for example, the path

      get_location()->x
that is, the x-coordinate of a rectangle's location.

Rectangle Header File - rectangle.hh

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <iostream.h>
      #include <stdlib.h>
      #include <string.h>
      class coord {
            public:
                  os_backptr b ; /*  needed for indexable member */
                  os_indexable_member(coord,x,int) x ;
                  os_indexable_member(coord,y,int) y ;
                  coord(int x1, int y1) { x = x1 ; y = y1 ; }
                  coord() { x = 0 ; y = 0 ; }
      } ;
      class rectangle {
            public: 
                  os_backptr b ;  /*  needed for query functions */
            private: 
                  char *label ;
                  int length ;
                  int width ;
                  os_Set<rectangle*> &children ;
                  coord location ;
            public:
                  rectangle(
                        const char *lbl, 
                        int l, 
                        int w, 
                        const os_Set<rectangle*> *chldrn_ptr,
                        coord lcn
                  ) ;
                  rectangle(const char *lbl) ;
                  ~rectangle() ;
                  char *get_label() ;
                  void set_label(const char *lbl) ;
                  int get_length() ;
                  void set_length(int l) ;
                  int get_width() ;
                  void set_width(int w) ;
                  os_Set<rectangle*> *get_children() ;
                  coord *get_location() ;
                  void set_location(coord lcn) ;
                  int get_area() ;
                  static os_typespec *get_os_typespec() ;
      } ;
      os_query_function(rectangle,get_label,char*) ;
      os_query_function(rectangle,get_length,int) ;
      os_query_function(rectangle,get_width,int) ;
      os_query_function(rectangle,get_location,coord*) ;
      os_query_function(rectangle,get_area,int) ;
      os_query_function(rectangle,get_children,\
      os_Set<rectangle*>*) ;
Notice that there is no function for setting the children of a given rectangle. This is because the same collection is used to record the children of a rectangle throughout the rectangle's lifetime. Changes in a rectangle's children are reflected by insertions and removals performed on this collection. This means that rectangle::get_children() need not call make_link() and break_link(); index maintenance is performed by the collection's insert and remove operations.

Schema Source File - schema.cc

Following is schema.cc, containing the calls to OS_MARK_QUERY_FUNCTION().

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/manschem.hh>
      #include "rectangle.hh"
            OS_MARK_SCHEMA_TYPE(rectangle);
            OS_MARK_SCHEMA_TYPE(coord);
            OS_MARK_SCHEMA_TYPE(os_Set<rectangle*>);
            OS_MARK_QUERY_FUNCTION(rectangle,get_label);
            OS_MARK_QUERY_FUNCTION(rectangle,get_length);
            OS_MARK_QUERY_FUNCTION(rectangle,get_width);
            OS_MARK_QUERY_FUNCTION(rectangle,get_location);
            OS_MARK_QUERY_FUNCTION(rectangle,get_area);
            OS_MARK_QUERY_FUNCTION(rectangle,get_children); 

Main Program File - rectangle.cc

Macro calls and rank functions
Following is the first part of rectangle.cc:

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <iostream.h>
      #include "rectangle.hh"
      os_indexable_body(coord,x,int,os_index(coord,b)) ;
      os_indexable_body(coord,y,int,os_index(coord,b)) ;
      os_query_function_body(rectangle,get_label,char*,b) ;
      os_query_function_body(rectangle,get_length,int,b) ;
      os_query_function_body(rectangle,get_width,int,b) ;
      os_query_function_body(rectangle,get_location,coord*,b) ;
      os_query_function_body(rectangle,get_area,int,b) ;
      os_query_function_body(\
      rectangle,get_children,os_Set<rectangle*>*,b) ;
      int coord_rank(const void *arg1, const void *arg2) {
            const coord *c1 = (const coord *)arg1 ;
            const coord *c2 = (const coord *)arg2 ;
            if ( c1->x < c2->x )
                  return os_collection::LT ;
            else if ( c1->x > c2->x )
                  return os_collection::GT ;
            else if ( c1->y < c2->y )
                  return os_collection::LT ;
            else if (c1->y > c2->y)
                  return os_collection::GT ;
            else
                  return os_collection::EQ ;
      }
Notice the calls to os_query_function_body(). The rank function coord_rank() is defined here so as to be able to perform queries comparing coord objects and so as to create an index keyed by a path ending in coord objects, in this case the path

      * ( (*get_children())[]->get_location() )
Rectangle member function Implementations
Following is the second part of rectangle.cc:

      rectangle::rectangle(
            const char *lbl, 
            int l, 
            int w, 
            const os_Set<rectangle*> *chldrn_ptr, 
            coord lcn
      ):children(*new(os_database::of(this), 
            os_Set<rectangle*>::get_os_typespec()) os_Set<rectangle*> )
      { 
            label = new(
                  os_database::of(this), 
                  os_typespec::get_char(), 
                  strlen(lbl)+1
            ) char[strlen(lbl)+1] ; 
            strcpy(label, lbl) ;
            length = l ; 
            width = w ;
            children = *chldrn_ptr ;
            location.x = lcn.x ;
            location.y = lcn.y ;
      }
      rectangle::rectangle(const char *lbl) : 
            children(*new(os_database::of(this), 
            os_Set<rectangle*>::get_os_typespec()) os_Set<rectangle*>) 
{
            label = new(
            os_database::of(this), 
                  os_typespec::get_char(), 
                  strlen(lbl)+1 
            ) char[strlen(lbl)+1] ; 
            strcpy(label, lbl) ;
            length = 0 ; 
            width = 0 ;
            location.x = 0 ;
            location.y = 0 ;
      }
      rectangle::~rectangle() { 
                  delete [] label ; 
      }
      char *rectangle::get_label() { 
            return label ; 
      }
      void rectangle::set_label(const char *lbl) {
            b.break_link(this, this, "rectangle", "get_label") ;
            label = new(
                  os_database::of(this), 
                  os_typespec::get_char()
                  strlen(lbl)+1 
            ) char[strlen(lbl)+1] ; 
            strcpy(label, lbl) ;
            b.make_link(this, this, "rectangle", "get_label") ;
      }
      int rectangle::get_length() { 
            return length ; 
      }
      void rectangle::set_length(int l) { 
            /*  two query member functions depend on this data member */
            /*  so we call each of make_link() and break_link() twice */
            b.break_link(this, this, "rectangle", "get_length") ;
            b.break_link(this, this, "rectangle", "get_area") ;
            length = l ; 
            b.make_link(this, this, "rectangle", "get_length") ;
            b.make_link(this, this, "rectangle", "get_area") ;      
      }
      int rectangle::get_width() { 
            return width ; 
      }
      void rectangle::set_width(int w) {
            /*  two query member functions depend on this data member */
            /*  so we call each of make_link() and break_link() twice */
            b.break_link(this, this, "rectangle", "get_width") ;
            b.break_link(this, this, "rectangle", "get_area") ;
            width = w ; 
            b.make_link(this, this, "rectangle", "get_width") ;
            b.make_link(this, this, "rectangle", "get_area") ;
      }
      os_Set<rectangle*> *rectangle::get_children() {
            return &children ;
      }
      coord *rectangle::get_location() {
            return &location ;
      }
      void rectangle::set_location(coord lcn) { 
            b.break_link(this, this, "rectangle", "get_location") ;
            location.x = lcn.x ; 
            location.y = lcn.y ; 
            b.make_link(this, this, "rectangle", "get_location") ;
      }
      int rectangle::get_area() { 
            return length * width ; 
      }
      void print_rects(os_Set<rectangle*> &the_rects) {
            os_Cursor<rectangle*> c(the_rects) ;
            for ( rectangle *r = c.first() ; r ; r = c.next() )
                  cout << r->get_label() << "\n" ;
            cout << "\n" ;
      }
Note that the set functions perform index maintenance, calling break_link() and make_link() for each query function whose return value depends on the underlying private data member. For example, set_width() calls break_link() once for the query function get_width() and once for the query function get_area(), because both get_width() and get_area() use the private data member width to derive their return values.

As noted earlier, there are no make_link() and break_link() calls associated with changing a rectangle's children because index maintenance is performed automatically by the insertion and removal operations of the collection get_children() returns.

The driver
Following is the last part of rectangle.cc:

      void mquery(os_database *db) {
            os_index_key(coord,coord_rank,0) ;
            rectangle *r1 = new(db, rectangle::get_os_typespec())
                  rectangle("1") ;
            rectangle *r2 = new(db, rectangle::get_os_typespec())
                  rectangle("2") ;
            rectangle *r3 = new(db, rectangle::get_os_typespec())
                  rectangle("3") ;
            os_Set<rectangle*> &the_rectangles = *new(db, 
                  os_Set<rectangle*>::get_os_typespec()) 
                  os_Set<rectangle*>; 
            the_rectangles |= r1 ;
            the_rectangles |= r2 ;
            the_rectangles |= r3 ;
            os_index_path &label_path = 
                  os_index_path::create("rectangle*", "get_label()", db) ;
            os_index_path &length_path =
                  os_index_path::create("rectangle*", "get_length()", db) ;
            os_index_path &width_path =
                  os_index_path::create("rectangle*", "get_width()", db) ;
            os_index_path &x_location_path = 
                  os_index_path::create("rectangle*",
                  "get_location()->x",db);
            os_index_path &y_location_path = 
                  os_index_path::create("rectangle*", 
                  "get_location()->y", db) ;
            os_index_path &area_path = 
                  os_index_path::create("rectangle*", "get_area()", db) ;
            os_index_path &children_loc_path = 
                  os_index_path::create("rectangle*", 
                  "* ( (*get_children())[]->get_location() )", db) ;
            the_rectangles.add_index( label_path, 
                  os_index_path::unordered) ;
            the_rectangles.add_index(length_path,
                  os_index_path::ordered) ;
            the_rectangles.add_index(width_path, 
                  os_index_path::ordered) ;
            the_rectangles.add_index(x_location_path,
                  os_index_path::ordered) ;
            the_rectangles.add_index(y_location_path,
                  os_index_path::ordered) ;
            the_rectangles.add_index(area_path,
                  os_index_path::ordered) ;
            the_rectangles.add_index(children_loc_path,
                  os_index_path::ordered) ;
            r1->set_label("rect1") ;
            r1->set_length(20) ;
            r1->set_width(60) ;
            r1->set_location(coord(10, 20)) ;
            r1->get_children()->insert(r2) ;
            r2->set_label("rect2") ;
            r2->set_length(40) ;
            r2->set_width(35) ;
            r2->set_location(coord(20, 40)) ;
            r2->get_children()->insert(r3) ;
            r3->set_label("rect3") ;
            r3->set_length(200) ;
            r3->set_width(80) ;
            r3->set_location(coord(50, 60)) ;
            cout << "Rectangles labeled \"rect1\":\n" ;
            os_Set<rectangle*> &answer = the_rectangles.query(
                  "rectangle*","strcmp(\"rect1\", get_label()) == 0",db) ;
            print_rects(answer) ;
            cout << "Rectangles with length < 100:\n" ; 
            answer = the_rectangles.query("rectangle*",
                  "get_length() < 100",db) ;
            print_rects(answer) ;
            cout << "Rectangles with width > 50:\n" ; 
            answer = the_rectangles.query("rectangle*",
                  "get_width() > 50",db) ;
            print_rects(answer) ;
            cout << "Rectangles with location x coord <= 25:\n" ; 
            answer = the_rectangles.query("rectangle*",
                  "get_location()->x <= 25",db) ;
            print_rects(answer) ;
            cout << "Rectangles with area >= 3000:\n" ; 
            answer = the_rectangles.query("rectangle*",
                  "get_area() >= 3000",db) ;
            print_rects(answer) ;
            cout<<"Rectangles with children whose location < (30, 50):\n";
            const os_coll_query &children_loc_query =
                  os_coll_query::create("rectangle*",
                  "(*get_children())[:*get_location()<*(coord*)a_coord_ptr :]",
                  db) ;
            coord a_coord(30, 50) ;
            os_bound_query bound_children_loc_query(
                  children_loc_query,
                  os_keyword_arg("a_coord_ptr", &a_coord)) ;
            answer = the_rectangles.query(bound_children_loc_query) ;
            print_rects(answer) ;
            os_Set<rectangle*> *answer_ptr = &answer ;
            delete answer_ptr ;
            delete children_loc_query; 
      } 
The main() driver function
      main(int, char **argv) {
            /*  argv[1] is the database */
            cout << "\nStarting mquery ...\n\n";
            objectstore::initialize();
            os_collection::initialize();
            OS_BEGIN_TXN(tx1, 0, os_transaction::update)
                  mquery(os_database::create(argv[1])) ;
            OS_END_TXN(tx1)
            cout << "Done.\n\n" ;
      } 
The function main() initializes ObjectStore and ObjectStore collections and calls mquery(), passing in a newly created database. The function mquery() registers the coord rank function, then creates three rectangles, adding them to a collection, the_rectangles.

Next mquery() creates several index paths involving member functions. Then it adds indexes to the_rectangles, specifying the paths as index keys. Then various set functions are used, triggering the index maintenance coded earlier in rectangle.cc. The collection of children for two of the rectangles is modified with rectangle::get_children() and os_Collection::insert(), which triggers index maintenance performed by insert().

Finally, various queries involving member functions are performed on the_rectangles.

Example: driver output
The output looks like this:

      Starting mquery ...
      Rectangles labeled "rect1":
      rect1
      Rectangles with length < 100:
      rect1
      rect2
      Rectangles with width > 50:
      rect1
      rect3
      Rectangles with location x coord <= 25:
      rect1
      rect2
      Rectangles with area >= 3000:
      rect3
      Rectangles with children whose location < (30, 50):
      rect1
      Done.
This driver demonstrates how to express queries with member functions. To verify that index maintenance is being performed properly, you can modify the driver to create more rectangles and insert them into the_rectangles. If the_rectangles has only three elements, the query optimizer chooses not to use indexes in query evaluation. Following is an example:

      char s[500];
      rectangle *rx = i
      rectangle *ry = 0 ;
      for (int i = 0 ; i < 200 ; i++) {
            sprintf(s, "%d", i);
            rx = new(db, rectangle::get_os_typespec()) rectangle(s) ; 
            the_rectangles |= rx ;
            rx->set_length(i) ;
            rx->set_width(i) ;
            rx->set_location(coord(i, i)) ;
            if (ry) 
                  rx->get_children()->insert(ry) ;
            ry = rx ;
      }



[previous] [next]

Copyright © 1999 Object Design, Inc. All rights reserved.

Updated: 03/10/99 07:54:12