Prima

PRIMA is a transaction-time DBMS that supports schema evolution: it manages and queries evolving data under evolving schemas. PRIMA is an acronym for ''Panta Rhei Information Management and Archival''.

The main investigators are:

Hyun J. Moon (contact author): [http://yellowstone.cs.ucla.edu/~hjmoon/]

Carlo A. Curino: [http://carlo.curino.us/]

Alin Deutsch: [http://db.ucsd.edu/people/alin/]

Chien-Yi Hou

Carlo Zaniolo: [http://www.cs.ucla.edu/~zaniolo/]

== Overview ==

The old problem of managing the history of database information is now made more urgent and complex by fast-spreading web information systems. Indeed, systems such as Wikipedia face the challenge of managing the history of their databases under intense schema evolution. Our PRIMA system addresses this problem by introducing two key pieces of new technology.

The first is a method for publishing the history of a relational database in XML, whereby the evolution of the schema and of its underlying database are given a unified representation. This temporally grouped representation makes it easy to formulate sophisticated historical queries on any given schema version using standard XQuery.

The second is that schema evolution is transparent to the user: she writes queries against the current schema while retrieving data from one or more schema versions. The system then performs the labor-intensive and error-prone task of rewriting such queries into equivalent ones for the appropriate versions of the schema. This feature is particularly relevant for historical queries that span potentially hundreds of schema versions. It is realized by (i) introducing Schema Modification Operators ([[SMO]]s) to represent the mappings between successive schema versions and (ii) using an XML integrity constraint language (XIC) to efficiently rewrite the queries under the constraints established by the SMOs.

The scalability of the approach has been tested against both synthetic data and real-world data from the Wikipedia DB schema evolution history.
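
To make the idea of rewriting a query through a chain of schema mappings more concrete, here is a minimal sketch in Python. It is an illustration only and not PRIMA's actual algorithm: PRIMA rewrites XQuery queries using XIC constraints derived from the SMOs, whereas this sketch handles just two rename operations on a plain (table, columns) reference. The class names <code>RenameColumn</code> and <code>RenameTable</code>, the function <code>rewrite_to_version</code>, and the example employee schema history are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass(frozen=True)
class RenameColumn:
    """Hypothetical SMO: rename a column of `table` in the next schema version."""
    table: str
    old: str
    new: str


@dataclass(frozen=True)
class RenameTable:
    """Hypothetical SMO: rename a table in the next schema version."""
    old: str
    new: str


SMO = Union[RenameColumn, RenameTable]


def rewrite_to_version(
    table: str,
    columns: Sequence[str],
    history: Sequence[SMO],
    target: int,
) -> Tuple[str, List[str]]:
    """Map a reference written against the newest schema back to version `target`.

    Assumption: history[i] maps version i + 1 to version i + 2 (versions are
    1-based), so the operators to undo are history[target - 1:], newest first.
    """
    cols = list(columns)
    for smo in reversed(history[target - 1:]):
        if isinstance(smo, RenameTable):
            if smo.new == table:
                table = smo.old
        elif smo.table == table:
            # Undo the column rename only for the table it applied to.
            cols = [smo.old if c == smo.new else c for c in cols]
    return table, cols


# Assumed example history: v1 employee(name, pay) -> v2 employee(name, salary)
# -> v3 staff(name, salary).
HISTORY: List[SMO] = [
    RenameColumn("employee", "pay", "salary"),
    RenameTable("employee", "staff"),
]

if __name__ == "__main__":
    # The user writes the reference once, against the current schema (v3).
    for version in (3, 2, 1):
        print(version, rewrite_to_version("staff", ["name", "salary"], HISTORY, version))
```

The user states the query once against the current schema, and the same reference is mapped to each older version by undoing the recorded operators in reverse order. Operators that restructure data, such as splitting or merging tables, need a richer mapping than this name substitution.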

== Experiment Data Set ==

=== Employee Database Schema Evolution: Synthetic data ===

=== Wikipedia Database Schema Evolution: Real-world data ===

* [http://yellowstone.cs.ucla.edu/schema-evolution/documents/prima/wiki-queries.tar Queries] (tar file): 20 queries taken and translated from [http://noc.wikimedia.org/cgi-bin/report.py?db=enwiki&sort=real&limit=50000 Wikipedia online profiler]
 
== AIMS ==

PRIMA was initially based on an XML database that executes XQuery queries. To improve the efficiency of the system, we are pursuing an RDBMS-based system, which we call [[AIMS]].

== Publications ==

''"Managing and querying transaction-time databases under schema evolution"'' Hyun J. Moon, Carlo A. Curino, Alin Deutsch, Chien-Yi Hou, and Carlo Zaniolo. Accepted for publication at the International Conference on Very Large Data Bases ('''VLDB'''), 2008. (PDF will be available soon)

Latest revision as of 17:33, 6 December 2010
