This section gives a brief overview of the architecture used in BASE. It is a good starting point if you need to know how the various parts of BASE are glued together. The figure below displays most of the important parts of BASE. The following sections briefly describe some parts of the figure and give pointers for further reading if you are interested in the details.
BASE stores most of its data in a database. The database is divided into two parts: one fixed and one dynamic.
The fixed part contains tables that correspond
to the various items found in BASE. There is, for example, one table
for users, one table for groups and one table for reporters. Some items
share the same table. Biosources, samples, extracts and labeled extracts are
all biomaterials and share the BioMaterials table. Access
to the fixed part of the database goes through Hibernate in most cases,
or through the Batch API in some cases (for example, access to reporters).
The dynamic part of the database contains tables for storing analyzed data. Each experiment has its own set of tables and it is not possible to mix data from two experiments. The dynamic part of the database can only be accessed by the Batch API and the Query API using SQL and JDBC.
Note | |
---|---|
The actual location of the two parts depends on the database that is used. MySQL uses two separate databases while PostgreSQL uses one database with two schemas. |
Hibernate (www.hibernate.org) is an
object/relational mapping software package. It takes plain Java objects
and stores them in a database. All we have to do is to set the properties
on the objects (for example: user.setName("A name")). Hibernate
takes care of the SQL generation and database communication for us.
This is not a magic or automatic process. We have to provide mapping
information about which objects go into which tables and which properties
go into which columns, along with other settings such as caching and proxying.
This is done by annotating the code with Javadoc comments. The classes
that are mapped to the database are found in the net.sf.basedb.core.data
package, which is shown as the Data classes box in the image above.
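As a rough illustration of what a mapped plain Java object might look like, here is a minimal sketch. The class name UserDataSketch, the table and column names, and the XDoclet-style @hibernate.* Javadoc tags are assumptions made for this example; they are not copied from the BASE source, and the tooling BASE actually uses to read its Javadoc comments is not named in this document.

```java
/**
 * Hypothetical data class (not the real BASE UserData).
 * The tags below are written in the style of XDoclet's Hibernate tags;
 * the table name is an assumption.
 *
 * @hibernate.class table="Users"
 */
class UserDataSketch {
    private int id;
    private String name;

    /**
     * Primary key, generated by the database.
     *
     * @hibernate.id column="id" generator-class="native"
     */
    public int getId() {
        return id;
    }

    void setId(int id) {
        this.id = id;
    }

    /**
     * The name is stored in an assumed "name" column.
     *
     * @hibernate.property column="name" type="string" length="255" not-null="true"
     */
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```

The class itself contains nothing but state and accessors. Calling setName("A name") only changes the object in memory; it is the mapping information that tells Hibernate which table and column the value ends up in.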
The HibernateUtil
Hibernate supports many different database systems. In theory, this means
that BASE should work with all those databases. However, in practice we have
found that this is not the case. For example, Oracle converts empty strings
to null values, which breaks some parts of our code that
expect non-null values. Another difficulty is that our Batch API and some parts of
the Query API generate native SQL as well. We try to use database dialect information
from Hibernate, but it is not always possible. The DbEngine
DefaultDbEngine
MySQLEngine
PostgresDbEngine
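To make the dialect problem concrete, here is a minimal sketch of how a small engine abstraction could hide one database difference when native SQL is generated. The interface EngineSketch, its quote method, and the implementing classes are hypothetical names invented for this example; the methods of the real DbEngine are not described in this document. The only external facts used are that MySQL quotes identifiers with backticks and PostgreSQL with double quotes.

```java
/** Hypothetical interface; not the real DbEngine API. */
interface EngineSketch {
    /** Quotes an identifier so it can be used as a table or column name. */
    String quote(String identifier);
}

/** MySQL quotes identifiers with backticks. */
class MySqlEngineSketch implements EngineSketch {
    public String quote(String identifier) {
        // A backtick inside the name is escaped by doubling it.
        return "`" + identifier.replace("`", "``") + "`";
    }
}

/** PostgreSQL quotes identifiers with double quotes. */
class PostgresEngineSketch implements EngineSketch {
    public String quote(String identifier) {
        // A double quote inside the name is escaped by doubling it.
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }
}

/** Code that generates native SQL asks the engine instead of hard-coding a dialect. */
class NativeSqlSketch {
    static String selectAll(EngineSketch engine, String table) {
        return "SELECT * FROM " + engine.quote(table);
    }
}
```

With this arrangement, the SQL-generating code stays the same for every database, and only the engine implementation changes.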
Hibernate comes at a price. It affects performance and uses a lot of memory. This means that the parts of BASE that often handle lots of items at the same time do not work well with Hibernate. Examples are reporters, array design features and raw data. We have created the Batch API to solve these problems.
The Batch API uses JDBC and SQL directly against the database. However, we
still use metadata and database dialect information available from Hibernate
to generate most of the SQL we need. In theory, this should make the Batch API
just as database-independent as Hibernate is. In practice there is some information
that we can't extract from Hibernate so we have implemented a simple
DbEngine
BatchableData
Note | |
---|---|
The main reason for the Batch API is to avoid the internal caching of Hibernate which eats lots of memory when handling thousands of items. Hibernate 3.1 introduced a new stateless API which among other things doesn't do any caching. This version was released after we had created the Batch API. We made a few tests to check if it would be better for us to switch back to Hibernate but found that it didn't perform as well as our own Batch API (it was about 2 times slower). In any case, we can never get Hibernate to work with the dynamic database, so the Batch API is needed. |
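The general technique of writing many rows through plain JDBC, without keeping an object per row in a cache, can be sketched as follows. This is not the BASE Batch API: the class JdbcBatchSketch, the table ReporterSketch and the batch size are assumptions for illustration. Only standard java.sql calls (prepareStatement, setString, addBatch, executeBatch) are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper showing JDBC batching; not part of BASE. */
class JdbcBatchSketch {
    /**
     * Inserts the given names in chunks of batchSize rows and returns
     * the number of executeBatch() round trips that were made.
     */
    static int insertNames(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        // Assumed table and column; the real schema is not shown in this document.
        String sql = "INSERT INTO ReporterSketch (name) VALUES (?)";
        int roundTrips = 0;
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                statement.setString(1, name);
                statement.addBatch();      // queue the row, nothing is cached per item
                pending++;
                if (pending == batchSize) {
                    statement.executeBatch();  // send the queued rows in one go
                    roundTrips++;
                    pending = 0;
                }
            }
            if (pending > 0) {             // flush the last, partial chunk
                statement.executeBatch();
                roundTrips++;
            }
        }
        return roundTrips;
    }
}
```

Memory use stays bounded by the batch size rather than by the total number of items, which is the property the note above is concerned with.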
The data classes are, with few exceptions, for internal use. These are the classes that are mapped to the database with Hibernate mapping files. They are very simple and contain no logic at all. They do not perform any permission checks or data validation.
Most of the data classes have a corresponding item class, for example UserData and User, or GroupData and Group.
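The split between a data class and an item class can be pictured with the sketch below. The text above says only that the data classes carry no logic, permission checks or validation; that the item class is the place where such checks live is an assumption of this example. The class names GroupDataSketch and GroupSketch, the writable flag and the length limit are all hypothetical.

```java
/** Hypothetical data class: state only, no logic. */
class GroupDataSketch {
    private String name;

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

/** Hypothetical item class: wraps the data object and adds the checks. */
class GroupSketch {
    static final int MAX_NAME_LENGTH = 255; // assumed limit for illustration

    private final GroupDataSketch data;
    private final boolean writable; // stands in for a real permission system

    GroupSketch(GroupDataSketch data, boolean writable) {
        this.data = data;
        this.writable = writable;
    }

    String getName() {
        return data.getName();
    }

    void setName(String name) {
        // Permission check first, then validation, then the plain setter.
        if (!writable) {
            throw new SecurityException("No write permission for this group");
        }
        if (name == null || name.isEmpty() || name.length() > MAX_NAME_LENGTH) {
            throw new IllegalArgumentException("Name must be 1 to " + MAX_NAME_LENGTH + " characters");
        }
        data.setName(name);
    }
}
```

Client code would only ever see the item class, so a value that fails a check never reaches the object that is stored in the database.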
The exceptions to the above scheme are the batchable classes, which are
all subclasses of the BatchableData
ReporterData
ReporterBatcher
The Query API is used to build and execute queries against the data in the
database. It builds a query by using objects that represent certain
operations. For example, there is an EqRestriction
AddExpression
The Query API knows how to work both via Hibernate and via SQL. In the first case it generates HQL (Hibernate Query Language) statements which Hibernate then translates into SQL. In the second case SQL is generated directly. In most cases HQL and SQL are identical, but not always. Some situations are solved by having the Query API generate slightly different query strings (with the help of information from Hibernate and the DbEngine). Some query elements can only be used with one of the query types.
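The idea of building a query from objects that each represent one operation can be sketched as below. The names EqRestriction and AddExpression come from the text above, but the classes here (QueryElementSketch, PropertySketch, ConstantSketch, AddSketch, EqSketch) and the toQl method are simplified stand-ins invented for this example, not the real Query API.

```java
/** Hypothetical query element that can render itself as a query fragment. */
interface QueryElementSketch {
    String toQl();
}

/** A reference to a property or column. */
class PropertySketch implements QueryElementSketch {
    private final String name;

    PropertySketch(String name) {
        this.name = name;
    }

    public String toQl() {
        return name;
    }
}

/** An integer constant, rendered inline. */
class ConstantSketch implements QueryElementSketch {
    private final int value;

    ConstantSketch(int value) {
        this.value = value;
    }

    public String toQl() {
        return String.valueOf(value);
    }
}

/** Addition of two expressions, loosely modelled on the AddExpression idea. */
class AddSketch implements QueryElementSketch {
    private final QueryElementSketch left;
    private final QueryElementSketch right;

    AddSketch(QueryElementSketch left, QueryElementSketch right) {
        this.left = left;
        this.right = right;
    }

    public String toQl() {
        return "(" + left.toQl() + " + " + right.toQl() + ")";
    }
}

/** Equality test, loosely modelled on the EqRestriction idea. */
class EqSketch implements QueryElementSketch {
    private final QueryElementSketch left;
    private final QueryElementSketch right;

    EqSketch(QueryElementSketch left, QueryElementSketch right) {
        this.left = left;
        this.right = right;
    }

    public String toQl() {
        return left.toQl() + " = " + right.toQl();
    }
}
```

For example, new EqSketch(new PropertySketch("total"), new AddSketch(new PropertySketch("ch1"), new ConstantSketch(5))).toQl() yields the fragment total = (ch1 + 5). Because each element renders itself, a second rendering method could produce a different string for HQL than for SQL in the cases where the two differ.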
The Controller API is the very heart of the BASE 2 system. This part of the core handles boring but essential details, such as user authentication, database connection management, transaction management, data validation, and more. We do not write more about this part here, but recommend reading the documents below.
From the core code's point of view a plug-in is just another client application. A plug-in doesn't have more powers and doesn't have access to some special API that allows it to do cool stuff that other clients can't.
However, the core must be able to control when and where a plug-in is
executed. Some plug-ins may take a long time doing their calculations
and may use a lot of memory. It would be bad if several users started
to execute a resource-demanding plug-in at the same time. This problem is
solved by adding a job queue. Each plug-in that should be executed is
registered as Job
Note | |
---|---|
BASE ships with two types of job controllers. One internal that runs
inside the web application, and one external that is designed to run
on separate servers, so-called job agents. The internal job controller
should work fine in most cases. The drawback with this controller is
that a badly written plug-in may crash the entire web server. For example,
a call to System.exit() in the plug-in code shuts down Tomcat
as well.
|
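The job queue idea, where registered jobs wait until the controller has capacity to run them, can be sketched with a fixed-size thread pool from java.util.concurrent. The class JobQueueSketch and its methods are hypothetical and much simpler than a real job controller; in particular, running jobs in threads of the same JVM does nothing to protect against the System.exit() problem described in the note above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical job controller: at most maxConcurrent jobs run at once. */
class JobQueueSketch {
    private final ExecutorService workers;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();
    private final AtomicInteger completed = new AtomicInteger();

    JobQueueSketch(int maxConcurrent) {
        // The pool size is the limit; extra jobs wait in the pool's queue.
        this.workers = Executors.newFixedThreadPool(maxConcurrent);
    }

    /** Registers a job; it stays queued until a worker is free. */
    void register(Runnable job) {
        workers.submit(() -> {
            int now = running.incrementAndGet();
            peak.accumulateAndGet(now, Math::max); // remember the highest load seen
            try {
                job.run();
            } finally {
                running.decrementAndGet();
                completed.incrementAndGet();
            }
        });
    }

    /** Stops accepting jobs and waits for queued ones; true if all finished in time. */
    boolean shutdownAndWait(long timeoutSeconds) throws InterruptedException {
        workers.shutdown();
        return workers.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    }

    int peakConcurrency() {
        return peak.get();
    }

    int completedJobs() {
        return completed.get();
    }
}
```

If six jobs are registered on a queue created with new JobQueueSketch(2), all six eventually run, but never more than two at the same time.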
Client applications are applications that use the BASE Core API. The current web application is built with Java Server Pages (JSP). It is supported by several application servers, but we have only tested it with Tomcat. Other client applications are the external job agents that execute plug-ins on separate servers, and the migration tool that migrates data from a BASE 1.2.x installation to BASE 2.
Although it is possible to develop a completely new client application from scratch, we do not see this as likely to happen. Instead, there are other possibilities for accessing data in BASE and extending its functionality.
The first possibility is to use the Web Service API. This allows you to access some of the data in the BASE database and download it for further use. The Web Service API is currently very limited but it is not hard to extend it to cover more use cases.
A second possibility is to use the Extension API. This allows a developer to add functionality that appears directly in the web interface, for example additional menu items and toolbar buttons. This API is also easy to extend to cover more use cases.
The BASE plug-ins site also has examples of extensions and web services implementations.