Validating CDMI Features – Metadata Search

Here we go again with an announcement of a cloud offering that again validates an existing standardized feature of CDMI. The new Amazon CloudSearch offering lets you store structured metadata in the cloud and perform queries on the metadata. They missed an opportunity, however, to integrate this with their existing cloud object storage offering. After all, if you already have object storage, why not put the metadata with the data object instead of separating it out in a separate cloud?

CDMI lets you put the user metadata directly into the storage object, where it is protected, backed up, archived and retained along with the actual data. CDMI’s rich query functions are then able to find the storage object based on the values of the metadata without talking to a separate cloud offering with a new, proprietary API.

CDMI standardizes a Query Queue that allows the client to create a scope specification (equivalent to a WHERE clause) to find specific objects that match the criteria, and a results specification (equivalent to a SELECT clause) that determines the elements of the object that are returned for each match. Results are placed in a CDMI queue object and can be processed one at a time, or in bulk. This powerful feature allows any storage cloud that has a search feature to expose it in a standard manner for interoperability between clouds.

An example of the metadata associated with a query queue is as follows:

{
     "metadata" : {
          "cdmi_queue_type" : "cdmi_query_queue",
          "cdmi_scope_specification" : [
               {
                    "domainURI" : "== /cdmi_domains/MyDomain/",
                    "parentURI" : "starts /MyMusic",
                    "metadata" : {
                         "artist" : "*Bono*"
                    }
               }
          ],
          "cdmi_results_specification": {
               "objectID" : "",
               "metadata" : {
                    "title" : ""
               }
          }
     }
}

 

When results are stored in a query queue, each enqueued value consists of a JSON object of MIME-type “application/json”. This JSON object contains the specified values requested in the cdmi_results_specification of the query queue metadata.

An example of a query result JSON object is as follows:

{
     "objectID" : "00007E7F0010EB9092B29F6CD6AD6824",
     "metadata" : {
          "title" : "Vertigo"
     }
}

Thus if you are using your storage cloud for storing music files, for example, all of the metadata for each mp3 object can be stored right along with the object, and CDMI’s powerful query mechanisms can be used to find the files you are interested in without invoking a separate search cloud with disassociated metadata,

Leave a Reply

Your email address will not be published. Required fields are marked *