Difference between revisions of "SMO"

From Schema Evolution

Revision as of 18:40, 12 May 2008

PAGE UNDER CONSTRUCTION

This page describes the set of Schema Modification Operators shared by Prism and Prima.

Schema Modification Operators: Shneiderman and Thomas proposed a comprehensive set of schema changes, including structural schema changes as well as changes to keys and dependencies [Shneiderman and Thomas, 1982]. More recently, Bernstein et al. proposed a set of schema evolution primitives built on algebra-based constraints [Bernstein et al., 2006, Bernstein et al., 2008]. Among several options, we chose the Schema Modification Operators (SMOs) that we proposed in Prism and Prima. These SMOs capture the essence of the existing works, but can also express schema changes not modeled by previous approaches. For example, by using functions in the ADD COLUMN operator, SMOs can support semantic conversion of columns (e.g., currency exchange), column concatenation/split (e.g., different address formats), and other similar changes that have been heavily exploited in modeling MediaWiki schema changes.

The effectiveness of SMOs has been validated in [Moon et al., 2008, Curino et al., 2008c], where the Prism and Prima systems used SMOs to describe schema evolution in transaction-time databases and to support historical query reformulations over multi-schema-version transaction-time databases.

The syntax of SMOs is similar to that of SQL DDL (ISO/IEC 9075-*: 2003) and provides a concise way to describe typical modifications of a database schema and the corresponding data migration. Every SMO takes as input a schema and produces as output a new version of the same schema.

Note that simple SMOs can be arbitrarily combined in a sequence to describe complex structural changes, such as those that occurred in the MediaWiki DB schema evolution.
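The idea that every SMO maps one schema version to the next, and that SMOs compose in sequence, can be sketched in a few lines of Python. This is only an illustration and not the Prism or Prima implementation: the dictionary representation of a schema and the helper names create_table, add_column and apply_sequence are assumptions made for this example.

```python
from functools import reduce

# Assumption: a schema is a mapping from table name to a tuple of column names.
# The helper names below are hypothetical and exist only for this sketch.

def create_table(name, columns):
    """Return an SMO that adds a new table to a schema."""
    def smo(schema):
        if name in schema:
            raise ValueError(f"table {name!r} already exists")
        return {**schema, name: tuple(columns)}
    return smo

def add_column(column, table):
    """Return an SMO that appends a column to an existing table."""
    def smo(schema):
        if column in schema[table]:
            raise ValueError(f"column {column!r} already exists in {table!r}")
        return {**schema, table: schema[table] + (column,)}
    return smo

def apply_sequence(schema, smos):
    """Apply SMOs left to right; each one consumes the previous schema version."""
    return reduce(lambda current, smo: smo(current), smos, schema)

v0 = {}
v2 = apply_sequence(
    v0,
    [create_table("tab", ["a", "b", "c"]), add_column("d", "tab")],
)
```

Each SMO returns a fresh dictionary instead of mutating its input, so earlier schema versions remain available, which loosely mirrors the multi-schema-version setting described above.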



SMO Semantics

Information Preservation

Information preservation is an interesting property of an evolution step. Intuitively, it means that the evolution step is invertible, since no data (or, more generally, information) is lost during the migration. Although highly desirable, this property is not always guaranteed to hold. For each of our Schema Modification Operators, we studied the conditions under which information preservation can be guaranteed. In the following, we discuss for each operator if and when it is information preserving.
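The invertibility intuition can be made concrete with a round-trip check: migrate an instance forward, apply a candidate inverse, and compare with the original. The sketch below is an assumption-laden illustration, not the analysis used for the SMOs: the instance representation (a dictionary of row lists) and the names round_trips and map_rows are hypothetical, and the column-dropping migration is included only as a contrasting lossy case. A check on a single instance can refute information preservation but cannot prove it in general.

```python
# Assumption: a database instance is {table_name: list of row dicts}.
# All helper names here are hypothetical and used only for illustration.

def round_trips(forward, inverse, db):
    """True if inverse(forward(db)) reproduces db exactly."""
    return inverse(forward(db)) == db

def map_rows(table, fn):
    """Build a migration that rewrites every row of one table with fn."""
    def migrate(db):
        return {**db, table: [fn(row) for row in db[table]]}
    return migrate

db = {"tab": [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]}

# Adding a constant column is undone by projecting it away again.
add_d = map_rows("tab", lambda row: {**row, "d": "0"})
drop_d = map_rows("tab", lambda row: {k: v for k, v in row.items() if k != "d"})

# Dropping b loses its values; a candidate inverse can only pad with None.
drop_b = map_rows("tab", lambda row: {k: v for k, v in row.items() if k != "b"})
pad_b = map_rows("tab", lambda row: {**row, "b": None})

preserving = round_trips(add_d, drop_d, db)  # original rows are recovered
lossy = round_trips(drop_b, pad_b, db)       # values of b cannot be recovered
```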

CREATE TABLE

* CREATE TABLE tab(a,b,c);

The CREATE TABLE SMO is always information preserving: it does not operate on existing data, but introduces new information into the database in a monotonic way.


ADD COLUMN

* ADD COLUMN d AS "0" INTO tab;

The ADD COLUMN SMO is always information preserving. In fact, it increases the information capacity of the existing schema without operating on existing data. The values inserted in the newly added column are either constants, as above, or functionally generated from the other columns, and thus require no skolemization, e.g.:

* ADD COLUMN d AS CONCAT(b,c) INTO tab;
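As a rough illustration of why no skolemization is needed, the following Python sketch fills the new column either with a constant or with a function of the other columns, so every new value is fully determined by the existing row. The function add_column_as, the row representation and the sample address data are assumptions made for this example; they are not part of the SMO language or of Prism and Prima.

```python
# Assumption: a database instance is {table_name: list of row dicts}.
# add_column_as is a hypothetical helper, not an actual SMO implementation.

def add_column_as(db, table, column, fn):
    """Return a new instance in which `column` is computed by fn from each row."""
    return {**db, table: [{**row, column: fn(row)} for row in db[table]]}

db = {"tab": [{"a": 1, "b": "Main St", "c": "Apt 4"}]}

# Counterpart of: ADD COLUMN d AS "0" INTO tab;
with_const = add_column_as(db, "tab", "d", lambda row: "0")

# Counterpart of: ADD COLUMN d AS CONCAT(b,c) INTO tab;
with_concat = add_column_as(db, "tab", "d", lambda row: row["b"] + row["c"])

# The original columns are untouched, so projecting d away restores the input.
restored = {
    "tab": [{k: v for k, v in row.items() if k != "d"} for row in with_concat["tab"]]
}
```

Because the new values are computed from data already present, no placeholder or invented values have to be introduced for the added column.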


Redundancy
