Benchmark Datasets

From Schema Evolution
Revision as of 21:16, 25 May 2012 by Shigao (Talk | contribs)


This page contains a brief summary of a dataset of schema evolution histories that is being collected with the goal of creating a benchmark for schema evolution. Schema_Evolution_Benchmark reports a detailed analysis of the MediaWiki case study, while the following tables list information for the other systems under analysis.


Open Source Web Information Systems
Name Description No. of Revisions Lifespan Popularity DBMS Further Information


MediaWiki [1] MediaWiki is used by over 30,000 wiki websites around the world, including Wikipedia. The availability of DB data is impressive: in particular, the Wikimedia Foundation releases the entire Wikipedia DB dump bi-weekly. Users can experiment with a small dataset for a less popular language (<10 MB), work on the entire English Wikipedia (enwiki), or even install the entire dataset (>700 GB). [Read More...] 323 9.0 years. Ranked 5782 and 1,469,996 downloads from sourceforge on 09/10/09. Powers Wikipedia, ranked 6th on alexa.org on 05/15/2012. MySQL SVN access
Schemas.zip
Statistics
SMO
Joomla! 1.5 [2] Joomla! is an award-winning Content Management System (CMS) that will help you build websites and other powerful online applications. Best of all, Joomla! is an open source solution that is freely available to everybody. [Read More...] 46 4.0 years. Ranked 296 on alexa.org on 05/25/2012. MySQL SVN access
Schemas.tar.gz
TikiWiki [3] TikiWiki (Tiki) is your Groupware/CMS (Content Management System) solution.[Read More..] 152 7 years. Ranked 239 and 765,420 downloads from sourceforge on 09/09/09. Ranked 41,620 on alexa.org on 08/05/2009. MySQL SVN access
Schemas.tar.gz
Statistics
SMOS
phpwiki[4] It is a PHP WikiWikiWeb. A WikiWikiWeb is a web site where anyone can edit the pages through an HTML form. Linking is done automatically on the server side. All pages are stored in a database.[Read More...] 19 5.2 years. Ranked 7,982 and 351,561 downloads from sourceforge on 09/09/09. MySQL SVN access
Schemas.tar
Statistics
SMOS
Deki Wiki 8.05[5] MindTouch Deki Wiki is the Web's most popular commercially supported wiki platform for creating content and mashups using a wiki interface. The free, open source application is an easy to use program for authoring, aggregating, organizing, and sharing almost any kind of content. Enterprises can build online communities and in-house Intranets, create collaborative applications, or add wiki capabilities to existing applications. Deki Wiki includes a state-of-the-art WYSIWYG editor, integration with the LDAP, and open source providers like WordPress, Joomla, Drupal, and Mambo. Deki Wiki is also a platform for building collaborative Web applications that access functionality or data from anywhere on the Internet. Its flexible architecture even allows wiki capabilities to be added to existing applications regardless of the underlying language or technology. [Read More...] 16 4.0 years. Ranked 28 and 490,905 downloads from sourceforge on 09/09/09. MySQL SVN access
Schemas.zip
Statistics
Nucleus [6] Nucleus offers you the building blocks you need to create a web presence. Whether you want to create a personal blog, a family page, or an online business site, Nucleus CMS can help you achieve your goals. [Read More...] 51 6.7 years. Ranked 54874 on alexa.org on 09/10/2009. MySQL SVN access
Statistics
ATutor [7] ATutor is an Open Source Web-based Learning Content Management System (LCMS/LMS) and social networking environment designed with accessibility and adaptability in mind. Administrators can install or update ATutor in minutes, develop custom themes to give ATutor a new look, and easily extend its functionality with feature modules. Educators can quickly assemble, package, and redistribute Web-based instructional content, easily import prepackaged content, and conduct their courses online. Students learn in an adaptive, social learning environment. Read More.. 216 5.7 years. Ranked 4939 and 294,931 downloads from sourceforge on 09/09/09. Ranked 117,088 on alexa.org on 08/05/2009. MySQL SVN access
Statistics
XOOPS Dynamic Web CMS [8] XOOPS is a dynamic web content management system written in PHP for the MySQL database. Its object orientation makes it an ideal tool for developing small or large community websites, intra company and corporate portals, weblogs and much more.[Read More..] 14 8 years. Ranked 35 and 7,402,174 downloads from sourceforge on 09/10/2009. Ranked 23,555 on alexa.org on 08/05/2009. MySQL SVN access
Schemas.tar.gz
Statistics
Coppermine Photo Gallery [9] Coppermine is an easily set-up, fast, feature-rich photo gallery script with MySQL database, user management, private galleries, automatic thumbnail creation, ecard feature and a template system for easy customization to match the rest of a site. [Read More..] 36 2.5 years. Ranked 261 and 5,653,224 downloads from sourceforge on 09/10/2009. Ranked 19,852 on alexa.org on 08/05/2009. MySQL SVN access
Schemas.tar.gz
Statistics
TYPO3 Content Management Framework [10] TYPO3 is an enterprise class Web CMS written in PHP/MySQL. It's designed to be extended with custom written backend modules and frontend libraries for special functionality. It has very powerful integration of image manipulation. [Read More..] 39 4 years. Ranked 1946 and 4,091,386 downloads from sourceforge on 09/10/2009. Ranked 31,844 on alexa.org on 08/05/2009. MySQL SVN access
Schemas.tar.gz
Statistics


Open Source Software
Name Description No. of Revisions Lifespan Popularity DBMS Further Information


KtDMS[11] KnowledgeTree® is open source document management software that connects people, processes, and ideas. Collaborate, securely store all your critical documents, address compliance challenges, and focus on providing a simple solution that works for your business. Read More... 105 6.3 years. Ranked 156,935 on alexa.org on 05/23/2012. MySQL SVN access

Schemas.zip
Statistics

Slashcode [12] Slashcode is the site for All Things Slash. Slash is the source code and database that was originally used to create Slashdot, and has now been released under the GNU General Public License. It is a bona fide Open Source / Free Software project. [Read More...] 256 8.10 years. Ranked 5799 and 142,849 downloads from sourceforge on 09/09/09. MySQL CVS access
Statistics
Zabbix [13] ZABBIX is an enterprise-class open source distributed monitoring solution. ZABBIX is software that monitors numerous parameters of a network and the health and integrity of servers. ZABBIX uses a flexible notification mechanism that allows users to configure e-mail based alerts for virtually any event. This allows a fast reaction to server problems. ZABBIX offers excellent reporting and data visualisation features based on the stored data. This makes ZABBIX ideal for capacity planning. [Read More...] 196 8.3 years. Ranked 398 and 511,389 downloads from sourceforge on 09/09/09. MySQL CVS access
Statistics
e107[14] e107 is a website system written in PHP and MySQL. It installs a completely dynamic website on your server allowing you complete control of your site from a secure and intuitive, yet powerful and flexible admin area. Coded within the XHTML 1.1 Standard.[Read More...] 16 5.4 months. Ranked 203 and 1,255,976 downloads from sourceforge on 09/09/09. MySQL CVS access
Statistics

On sourceforge


Medicine/Biology Databases
Name Description No. of Revisions Lifespan Popularity DBMS Further Information


Ensembl Genetic DB [15] The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online. [Read More...] 412 9.8 years. Ranked 178,340 on alexa.org on 08/05/2009. MySQL CVS access
SVN access
API Documentation
[Statistics]
GrainGenes [16] GrainGenes 2.0 is a database for Triticeae and Avena. [Read More...] Ranked 4,019 on alexa.org on 08/05/2009. MySQL Schema access
SQL Interface
[Statistics]
UCSC Genome Bioinformatics [17] The UCSC database is a MySQL based project. [Read More...] *on alexa.org on 08/05/2009. MySQL
BioSQL [18] BioSQL is a joint effort between the OBF projects (BioPerl, BioJava, etc.) to support a shared database schema for storing sequence data. In theory, you could load a GenBank file into the database with BioPerl, then use Biopython to extract it from the database as a record object with features, and get more or less the same thing as if you had loaded the GenBank file directly as a SeqRecord using SeqIO. [Read More...] 46 6.6 years. Ranked 11,740,689 on alexa.org on 08/05/2009. SVN access
Schema.zip
[Statistics]
GUS [19] The Genomics Unified Schema (GUS) is an extensive relational database schema and associated application framework designed to store, integrate, analyze and present functional genomics data. The GUS schema supports a wide range of data types including genomics, gene expression, transcript assemblies, proteomics and others. It emphasizes standards-based ontologies and strong-typing. [Read More...] Ranked 15,597,325 on alexa.org on 08/05/2009. SVN access
[Statistics]
NCBO [20] The National Center for Biomedical Ontology is a consortium of leading biologists, clinicians, informaticians, and ontologists who develop innovative technology and methods allowing scientists to create, disseminate, and manage biomedical information and knowledge in machine-processable form. [Read More...] on alexa.org on 08/05/2009. SVN access
[Statistics]
Open EMR [21] OpenEMR is a free medical practice management, electronic medical records, prescription writing, and medical billing application. These programs are also referred to as electronic health records. OpenEMR is licensed under the GNU General Public License (GPL). It is a free open source replacement for medical applications such as Medical Manager, Health Pro, and Misys. [Read More...] Ranked 642,041 on alexa.org on 08/05/2009. On sourceforge
[Statistics]
Genomic DB Survey [22] Survey of almost 80 genomic databases. [Read More...] Ranked on alexa.org on 08/05/2009
[Statistics]


CERN Physics DBs
Name Description No. of Revisions Lifespan Popularity DBMS Further Information


GridCC [23] GRIDCC is a three-year project funded by the European Commission. Its goal is integrating instruments and sensors with the traditional Grid resources. The GRIDCC middleware is being designed bearing in mind use cases from a very diverse set of applications, and as a result, the GRIDCC architecture provides access to the instruments in as generic a way as possible. GRIDCC is also developing an adaptable user interface and a mechanism for executing complex workflows in order to increase both the usability and the usefulness of the system. The new middleware is incorporated into significant applications that will allow the software to be validated in terms of both functionality and quality of service. The pilot application this paper focuses on is applying GRIDCC to support Remote Operations of the ELETTRA synchrotron radiation facility. We describe the results of implementing via GRIDCC the complex workflows involved in both routine operations and troubleshooting scenarios. In particular, the implementation of an orbit correction feedback shows the level of integration of instruments and traditional Grid resources which can be reached using the GRIDCC middleware. [Read More...] 6 Ranked 5,115,481 on alexa.org on 08/05/2009. CVS access
[Statistics]
ATLAS (Trigger) [24] ATLAS is a particle physics experiment at the Large Hadron Collider at CERN. Starting in Spring 2009, the ATLAS detector will search for new discoveries in the head-on collisions of protons of extraordinarily high energy. ATLAS will learn about the basic forces that have shaped our universe since the beginning of time and that will determine its fate. Among the possible unknowns are the origin of mass, extra dimensions of space, microscopic black holes, and evidence for dark matter candidates in the universe. Trigger is one of the software components of the ATLAS project. [Read More...] 77 2 years. on alexa.org on 08/05/2009 Oracle CVS access
[Statistics]
GENV
CVS access
CVS access
EGEE JRA1 Middleware [25] The mandate of the is to provide a reference open source implementation of the foundation services that are application independent and need to be deployed at all sites connected to the infrastructure. On top of this foundation, an open-ended set of application-specific higher-level services that can be deployed on demand at specific sites is provided directly by JRA1 or integrated from other sources and projects. Grid Foundation Middleware comprises all services that need to be deployed on a production Grid infrastructure in order to provide a consistent, dependable service. It can be regarded as the Middleware Infrastructure. [Read More...] 17 4 years. on alexa.org on 08/05/2009 Oracle, MySQL CVS access
CVS access
[Statistics]
CASTOR [26] CASTOR, which stands for CERN Advanced STORage manager, is a hierarchical storage management (HSM) system developed at CERN and used to store physics production files and user files. Files can be stored, listed, retrieved and accessed in CASTOR using command line tools or applications built on top of the different data transfer protocols like RFIO (Remote File IO), ROOT libraries, GridFTP and XROOTD. CASTOR manages disk cache(s) and the data on tertiary storage or tapes. Currently (2007) there are some 109 million files and about 15 petabytes of data in CASTOR. [Read More...] SRM2-45
CASTOR-149
3 years. on alexa.org on 08/05/2009 Oracle CVS access
CVS access
[Statistics]
ATLAS(DQ2) [27] The ATLAS data management system (DQ2) is responsible for managing ATLAS data. DQ2 is designed to work with the full analysis tool-chain. [Read More...] Oracle - 17
MySQL - 51
on alexa.org on 08/05/2009. Oracle, MySQL CVS access (Oracle)
CVS access (MySQL)
Schema-tracker
[Statistics]
DRAC [Read More...] WMS - 41, Production Management System - 14 on alexa.org on 08/05/2009. Oracle CVS access
CVS access
[Statistics]
ELFMS [28] ELFms (Extremely Large Fabric management system) is one of the CERN webs and comprises Quattor, Lemon and LEAF. [Read More...] on alexa.org on 08/05/2009 Oracle CVS access
[Statistics]
Other Hydra-service and triDAS [Read More...] Hydra-service - 5 and triDAS - 7 on alexa.org on 08/05/2009 MySQL, Oracle CVS access
CVS access
[Statistics]