Query Migration from Object Oriented World to Semantic World

Abstract: Over the last decades, the object-oriented approach has taken a large share of the database market. It aims to design and implement structured, reusable software through the composition of independent elements, yielding high-performance programs. Meanwhile, the mass of information stored on the web grows day after day at a dizzying speed, so the current web faces the problem of building a bridge that facilitates data access across different applications and systems and helps users find relevant and exact information. In addition, all existing approaches for rewriting object-oriented query languages into SPARQL rely on a model transformation process to guarantee this mapping. These reasons prompted us to write this paper in order to bridge an important gap between these two heterogeneous worlds (the object-oriented world and the semantic web world) by proposing the first provably semantics-preserving OQL-to-SPARQL translation algorithm for each element of an OQL query (SELECT clause, FROM clause, FILTER constraint, implicit/explicit join, and union/intersection SELECT queries).

Currently, the majority of company information systems adopt the object-oriented approach for their databases. It is regarded as the best data organization paradigm, since it can represent complex entities and supports structured software with very high performance. This makes the development of methods and tools for automatic mapping from the object-oriented world to the semantic world a very relevant need. These reasons motivated us to work on this topic and to elaborate a first query conversion algorithm from OQL to SPARQL that translates each component of an OQL SELECT query into its SPARQL equivalent.

II. Related Works
Recently, several research efforts have focused on mapping data, models, concepts, and queries from existing data sources to the semantic web world. Most of this work concentrates on relational systems, and several approaches have been proposed in this direction. RETRO [6] chooses not to transform the data physically but to derive a domain-specific relational schema from RDF data; its query mapping transforms an SQL query over that schema into an equivalent SPARQL query executable on the RDF store. R2RML [7,8] is a language for expressing customized mappings from relational databases to RDF datasets; it was recently presented with a new version that provides a user interface for creating and editing mappings interactively, even for non-experts. D2RQ/Update [5] is an extension of D2RQ [9] that enables executing SPARQL/Update statements on the mapped data and facilitates the creation of a read-write Semantic Web.
Regarding object-oriented data sources, the SPOON approach (Sparql to Object Oriented eNgine) described in [11] proposes an automatic mapping between the object-oriented model (ODL) and the corresponding model at the ontological level in order to build a SPARQL endpoint. The paper [12] addresses query rewriting by means of model transformations. It allows querying RDF data sources via an object-oriented query that is automatically rewritten in SPARQL in order to access RDF data; it also translates SPARQL queries into object-oriented queries so as to implement SPARQL endpoints for object-oriented applications.
These studies do not propose a query translation solution that rewrites each element of an object-oriented query into a semantically equivalent SPARQL query; instead, they rely on a model transformation process to guarantee this mapping.

III. Query Language Metamodel & Examples
In this section, we describe the languages used by our translation approach from the object-oriented world to the semantic web world, representing each language with its own metamodel developed from its grammar [14] [15]: the Object Query Language (OQL) for object-oriented databases and SPARQL, a query language for RDF data.

A. OQL Metamodel
OQL is the object-oriented query language of the Object Data Management Group (ODMG) standard; it provides easy access to object databases. The OQL SELECT query has the same syntax and semantics as the SQL SELECT query that runs on relational tables, but it operates on collections of ODMG objects, so a query searches for object instances rather than rows of data. Several implementations of this standard exist, for example HQL [16] and JPQL [17].
The metamodel schematized below is limited to the SELECT query in its simple and compound forms (INTERSECT and UNION SELECT queries). Fig. 1 represents an OQL query of this type, which is composed of five clauses: SelectFromClause, WhereClause, GroupByClause, OrderByClause and HavingClause. The SelectFromClause representation is given in Fig. 2. This clause is composed of an optional SelectClause (the SELECT clause can be omitted in some implementations of OQL, such as HQL) and a mandatory FromClause. A SelectClause contains a PropertyList, a list of the values or objects returned by the query; each property is described as a path that navigates the object model. The FromClause designates the part of the object model from which properties are selected. It is composed of a mandatory ClassReference and an optional ClassJoined: the ClassReference gives the class name (ClassNameDeclaration) or collection name (CollectionNameDeclaration) of the selected objects, whereas the ClassJoined gives the set of classes to be joined.
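As a reading aid, the clause structure described in this subsection can be sketched as a handful of Python dataclasses. This is only an illustrative sketch: the class names follow the metamodel elements named in the text, but every field name (for example `alias`, `is_collection`, `join_condition`) and the use of plain strings for the WHERE, GROUP BY, ORDER BY and HAVING clauses are assumptions made for this example, not part of the metamodel figures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClassReference:
    name: str                    # ClassNameDeclaration or CollectionNameDeclaration
    alias: str                   # assumed: alias used in property paths, e.g. "p"
    is_collection: bool = False  # assumed flag distinguishing the two declarations


@dataclass
class ClassJoined:
    class_references: List[ClassReference]  # the classes to be joined
    join_condition: Optional[str] = None     # assumed field, kept as raw text


@dataclass
class SelectClause:
    property_list: List[str]     # paths through the object model, e.g. "p.name"


@dataclass
class FromClause:
    class_reference: ClassReference             # mandatory
    class_joined: Optional[ClassJoined] = None  # optional


@dataclass
class SelectFromClause:
    from_clause: FromClause                       # mandatory
    select_clause: Optional[SelectClause] = None  # optional (omitted in HQL-style queries)


@dataclass
class OQLSelectQuery:
    select_from: SelectFromClause
    where: Optional[str] = None     # remaining clauses kept as raw text in this sketch
    group_by: Optional[str] = None
    order_by: Optional[str] = None
    having: Optional[str] = None
```

A query that omits its SELECT clause is then simply an `OQLSelectQuery` whose `select_from.select_clause` is `None`.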

B. SPARQL Metamodel
SPARQL is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format [13]. Fig. 4 schematizes the SPARQL metamodel and presents the different types of queries. In this paper, we are only interested in SelectQuery.

C. Examples
In the examples illustrated in Table I

IV. Query Mapping Algorithm
In this section, we will detail our main contribution by describing all procedures used in our query mapping algorithm:

B. ConstructTriplePattern Subprocedure
The ConstructTriplePattern subprocedure takes as input the OQL SELECT attributes (OSA), the class references (CR), the joined classes (CJ) and the WHERE clause attribute (WCA), and returns the set of triple patterns of the equivalent SPARQL query. First, the algorithm stores OSA in the set A (initially empty), which is intended to hold all query attributes. It then checks whether the query contains a join and, if so, determines its type. An explicit join is detected when CJ is not null: in this case, the algorithm extracts the join condition operand and adds it to A, and it also extracts the ClassReference elements included in the ClassJoined clause and adds them to the set CR, which holds all class references of the query. Similarly, an implicit join is detected when CR contains strictly more than one element: in this case, the join condition operand is added to A. If the query contains a WHERE clause, its attribute is also added to A. Before adding any attribute to A, the algorithm first checks that it is not already present.
Once all query attributes are gathered in A and all class references in CR, the algorithm scans A once for each class reference CR_i, extracting for each attribute a_j its name and the alias of its class. If the alias of CR_i equals the class alias of a_j, the algorithm formulates the corresponding triple pattern of the equivalent SPARQL query, adds it to the set TP, and removes a_j from A so that it is not reprocessed in later iterations. Attributes that do not satisfy this condition are stored in a temporary list and put back into A before the algorithm moves to the next class reference and repeats the process.
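The steps of this subprocedure can be sketched in Python as follows. This is a minimal sketch under several assumptions that the text does not fix: attributes are written as `alias.name` strings, class references are `(class name, alias)` pairs, CJ is a pair of joined class references and join operands, the operands of an implicit join are passed in a separate hypothetical parameter `implicit_join_operands`, and each triple pattern takes the placeholder form `?alias :name ?alias_name`.

```python
from typing import List, Optional, Tuple

ClassRef = Tuple[str, str]      # (class name, alias), an assumed representation
Triple = Tuple[str, str, str]   # (subject, predicate, object)


def add_unique(target: List[str], items: List[str]) -> None:
    """Append each item only if it is not already in the list (the set A)."""
    for item in items:
        if item not in target:
            target.append(item)


def construct_triple_patterns(
    osa: List[str],
    cr: List[ClassRef],
    cj: Optional[Tuple[List[ClassRef], List[str]]] = None,
    wca: Optional[str] = None,
    implicit_join_operands: Optional[List[str]] = None,  # hypothetical parameter
) -> List[Triple]:
    attributes: List[str] = []          # the set A, initially empty
    add_unique(attributes, osa)
    class_refs = list(cr)               # the set CR

    if cj is not None:                  # explicit join
        joined_refs, operands = cj
        add_unique(attributes, operands)
        for ref in joined_refs:
            if ref not in class_refs:
                class_refs.append(ref)
    elif len(cr) > 1:                   # implicit join
        add_unique(attributes, implicit_join_operands or [])

    if wca is not None:
        add_unique(attributes, [wca])

    triples: List[Triple] = []          # the set TP
    for _class_name, alias in class_refs:
        remaining: List[str] = []       # temporary list of unmatched attributes
        for attr in attributes:
            attr_alias, _, attr_name = attr.partition(".")
            if attr_alias == alias:
                # Placeholder triple shape; the real predicate IRI is an assumption.
                triples.append((f"?{alias}", f":{attr_name}", f"?{alias}_{attr_name}"))
            else:
                remaining.append(attr)
        attributes = remaining          # matched attributes are not reprocessed
    return triples
```

Because matched attributes are dropped after each pass, the resulting triple patterns come out grouped by class reference, which is convenient when patterns sharing a subject are abbreviated later.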

D. ConstructSparqlWhereClause Subprocedure
The ConstructSparqlWhereClause subprocedure takes as input the set of triple patterns TP returned by the ConstructTriplePattern subprocedure and the filter expression FilterExp returned by the ConstructFilterExpression subprocedure. The algorithm scans TP and concatenates the triple patterns to formulate the equivalent SPARQL WHERE clause. When two consecutive triple patterns have the same subject, the second one is abbreviated by removing its subject and joining it to the first with the predicate-object list separator (a semicolon in standard SPARQL syntax; a comma would instead separate objects that share both subject and predicate).
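A minimal Python sketch of this concatenation step is given below. It assumes triple patterns are `(subject, predicate, object)` tuples of already-formatted strings and that the filter expression is a plain string; both representations are choices made for this example. For patterns that share a subject the sketch emits a semicolon, which is the separator the SPARQL grammar defines for predicate-object lists on a common subject.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object), assumed representation


def construct_sparql_where_clause(
    triples: List[Triple], filter_exp: Optional[str] = None
) -> str:
    groups: List[str] = []
    previous_subject: Optional[str] = None
    for subject, predicate, obj in triples:
        if subject == previous_subject:
            # Same subject as the previous pattern: drop the subject and
            # extend the current predicate-object list.
            groups[-1] += f" ; {predicate} {obj}"
        else:
            groups.append(f"{subject} {predicate} {obj}")
            previous_subject = subject

    pieces = [group + " ." for group in groups]
    if filter_exp:
        pieces.append(f"FILTER ({filter_exp})")
    return "WHERE { " + " ".join(pieces) + " }"
```

For example, the patterns `?p :name ?n`, `?p :age ?a` and `?c :label ?l` with the filter `?a > 18` produce `WHERE { ?p :name ?n ; :age ?a . ?c :label ?l . FILTER (?a > 18) }`. Only consecutive patterns are merged, so the input is expected to be grouped by subject.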

E. MappingOQLtoSPARQL Procedure
MappingOQLtoSPARQL is the main procedure of our algorithm; it takes as input the OQL SELECT query q_in and returns the equivalent SPARQL query q_out. A conversion tree of the OQL query is generated using the parse function. If the query type is "SimpleQuery", the conversion tree yields the query's SELECT clause, its FROM clause with the class references, and its WHERE clause if one exists. The set of triple patterns is then constructed by ConstructTriplePattern and the FILTER expression by ConstructFilterExpression; both serve as inputs to ConstructSparqlWhereClause, which generates the SPARQL WHERE clause. The SPARQL SELECT clause is generated by ConstructSparqlSelectClause, and the results of these subprocedures are concatenated to formulate the equivalent SPARQL query. We proceed in the same manner if the OQL query type is "JoinQuery", except that the conversion tree yields the ClassJoined in addition to the ClassReference in the FROM clause. When the type of the OQL query is "UnionQuery" or "IntersectQuery", the conversion tree yields two OQL SELECT queries q1 and q2; MappingOQLtoSPARQL is called recursively on each of them to construct its SPARQL SELECT query, and the two results are concatenated to obtain an equivalent SPARQL SELECT query.
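The dispatch on the query type, including the recursive case, can be illustrated with the following Python sketch. It is heavily simplified and rests on assumptions: the conversion tree is modelled as a nested dictionary with hypothetical keys (`type`, `select`, `triples`, `filter`, `q1`, `q2`), the leaf queries already carry their SPARQL variables and triple patterns, so the subprocedures are not reproduced here, and the two recursive results are combined as SPARQL 1.1 subqueries, which is one possible reading of "concatenate" rather than the stated method.

```python
from typing import Any, Dict, List

Tree = Dict[str, Any]   # assumed stand-in for the conversion tree


def projection(tree: Tree) -> List[str]:
    """Variables projected by a query tree (taken from q1 for compound queries)."""
    if tree["type"] in ("UnionQuery", "IntersectQuery"):
        return projection(tree["q1"])
    return tree["select"]


def mapping_oql_to_sparql(tree: Tree) -> str:
    qtype = tree["type"]
    if qtype in ("SimpleQuery", "JoinQuery"):
        # In this sketch both types are handled alike because the triple
        # patterns are assumed to be precomputed in the tree.
        patterns = [triple + " ." for triple in tree["triples"]]
        if tree.get("filter"):
            patterns.append("FILTER (" + tree["filter"] + ")")
        variables = " ".join(tree["select"])
        return f"SELECT {variables} WHERE {{ {' '.join(patterns)} }}"
    if qtype in ("UnionQuery", "IntersectQuery"):
        left = mapping_oql_to_sparql(tree["q1"])    # recursive call on q1
        right = mapping_oql_to_sparql(tree["q2"])   # recursive call on q2
        keyword = " UNION " if qtype == "UnionQuery" else " "
        variables = " ".join(projection(tree))
        return f"SELECT {variables} WHERE {{ {{ {left} }}{keyword}{{ {right} }} }}"
    raise ValueError(f"unsupported query type: {qtype}")
```

Placing two group patterns side by side without UNION makes SPARQL join them on their shared variables, which is how this sketch approximates an intersection.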

F. Merge Subprocedure
The Merge subprocedure takes as input two OQL subqueries and the merge type, and generates a meaningful and valid SPARQL query. First, it extracts the SELECT clause of each subquery and stores them in S1 and S2. Second, it extracts the triple patterns of each subquery and stores them in TP1 and TP2. Finally, it extracts the FILTER expressions of each subquery and stores them in F1 and F2. If the merge type is "UNION", the SELECT clause of q_out is taken from one of the subqueries, and the WHERE clause of q_out is the concatenation of the WHERE clause of q1 (returned by ConstructSparqlWhereClause with inputs TP1 and F1), the keyword UNION, and the WHERE clause of q2 (returned by ConstructSparqlWhereClause with inputs TP2 and F2). We proceed in the same manner if the SPARQL query type is "JoinQuery", except that the keyword UNION is omitted.
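The merge step can be sketched in Python as below. The sketch assumes each subquery has already been split into a dictionary with hypothetical keys `select`, `triples` and `filter` (playing the roles of S, TP and F), that the triple patterns are preformatted strings, and that the non-UNION case is labelled `"JOIN"`; the helper `group_pattern` is a simplified stand-in for ConstructSparqlWhereClause that returns only the clause body.

```python
from typing import Any, Dict, List, Optional

Subquery = Dict[str, Any]   # assumed shape: {"select": [...], "triples": [...], "filter": ...}


def group_pattern(triples: List[str], filter_exp: Optional[str]) -> str:
    """Simplified stand-in for ConstructSparqlWhereClause (body only)."""
    pieces = [triple + " ." for triple in triples]
    if filter_exp:
        pieces.append(f"FILTER ({filter_exp})")
    return " ".join(pieces)


def merge(q1: Subquery, q2: Subquery, merge_type: str) -> str:
    s1, tp1, f1 = q1["select"], q1["triples"], q1.get("filter")
    tp2, f2 = q2["triples"], q2.get("filter")   # S2 is not needed: S1 is reused

    g1 = group_pattern(tp1, f1)
    g2 = group_pattern(tp2, f2)

    if merge_type == "UNION":
        body = f"{{ {g1} }} UNION {{ {g2} }}"
    elif merge_type == "JOIN":
        # Same construction without the UNION keyword: the two groups are joined.
        body = f"{{ {g1} }} {{ {g2} }}"
    else:
        raise ValueError(f"unsupported merge type: {merge_type}")

    return f"SELECT {' '.join(s1)} WHERE {{ {body} }}"
```

Keeping each subquery in its own braces confines every FILTER to the group it came from, so a filter from q1 does not constrain the patterns of q2.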

V. Conclusion
In summary, the main contribution of this paper to the topic of interoperability between the object-oriented world and the semantic web world is a query conversion algorithm from OQL SELECT queries to equivalent SPARQL queries. It translates each element of an OQL query (SELECT clause, FROM clause, FILTER constraint, implicit/explicit join, and union/intersection SELECT queries) into its SPARQL equivalent, so as to bridge the gap between these two worlds without physical data transformation.
One obvious extension of our research is to reinforce our algorithm by supporting more concepts, such as: subqueries, collections, aggregation and composition.