Digital Library Administration
Contents
VDL Creation and Management
Resources Management
VO and Users Management
Content & Storage Management
Content Management strictly relies on Storage Management. A running instance of Storage Management must therefore be set up before Content Management can be started successfully. There are two ways to set up Storage Management: a simple one using Apache Derby as the database backend, and an advanced one in which an existing database is accessed via JDBC.
Simple Setup of Storage Management using Apache Derby
Apache Derby is an open source relational database implemented entirely in Java and available under the Apache License, Version 2.0. It has a small footprint of about 2 megabytes and is sufficient as a database backend for getting started with Storage Management. However, when large amounts of data are stored or more elaborate backup and recovery strategies are needed, a traditional full-scale RDBMS might be a better choice.
If Storage Management is deployed dynamically or manually from the GAR, its default installation places a configuration file in $GLOBUS_LOCATION/etc/<Service-Gar-Filename>/StorageManager.properties. This configuration expects Derby to be available and to have write permission for the files under ./StorageManagementService/db/storage_db. Derby is started in embedded mode, in which it does not even need a username or password. Multiple connections from the same Java Virtual Machine are possible and quite fast, but two Java VMs cannot access the database at the same time.
If all dependencies have been installed correctly, the container should start and create a new database if needed.
The lines defining the JDBC connection to the database in the above-mentioned configuration file are:

    DefaultRawFileContentManager=jdbc\:derby
    DefaultRelationshipAndPropertyManager=jdbc\:derby

    # derby settings (Default)
    jdbc\:derby.class=org.diligentproject.contentmanagement.baselayer.rdbmsImpl.GenericJDBCDatabase
    jdbc\:derby.params.count=4
    jdbc\:derby.params.0=local_derby_storage_db
    jdbc\:derby.params.1=org.apache.derby.jdbc.EmbeddedDriver
    jdbc\:derby.params.2=jdbc\:derby\:./StorageManagementService/db/storage_db;create\=true
    jdbc\:derby.params.3=5

By changing the path after derby\: in the line

    jdbc\:derby.params.2=jdbc\:derby\:./StorageManagementService/db/storage_db;create\=true

you can choose another place to store the database.
In this setting, all relationships and properties as well as the raw file content are stored inside the Derby database. This is defined in the first two lines of the configuration snippet shown above.
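To see how the escaped keys and values of this snippet resolve, the following Python sketch reads the Derby settings and prints the backend selected by the two Default... lines together with its connection URL. It is an illustration only: parse_properties is a hypothetical helper that handles just the backslash escapes seen here (not the full Java properties format, so no \uXXXX sequences or line continuations), and Storage Management itself presumably reads the file with Java's own properties loader.

```python
def parse_properties(text):
    """Parse a small subset of the Java .properties format.

    Assumption: only '#'/'!' comments, 'key=value' lines and
    backslash-escaped characters (such as \\: and \\=) occur.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue  # skip blank lines and comments
        key, value = [], []
        target, escaped = key, False
        for ch in line:
            if escaped:
                target.append(ch)      # keep the escaped character literally
                escaped = False
            elif ch == "\\":
                escaped = True         # next character is taken literally
            elif target is key and ch in "=:":
                target = value         # first unescaped separator ends the key
            else:
                target.append(ch)
        props["".join(key).strip()] = "".join(value).strip()
    return props


SNIPPET = r"""
DefaultRawFileContentManager=jdbc\:derby
DefaultRelationshipAndPropertyManager=jdbc\:derby
# derby settings (Default)
jdbc\:derby.class=org.diligentproject.contentmanagement.baselayer.rdbmsImpl.GenericJDBCDatabase
jdbc\:derby.params.count=4
jdbc\:derby.params.0=local_derby_storage_db
jdbc\:derby.params.1=org.apache.derby.jdbc.EmbeddedDriver
jdbc\:derby.params.2=jdbc\:derby\:./StorageManagementService/db/storage_db;create\=true
jdbc\:derby.params.3=5
"""

props = parse_properties(SNIPPET)
backend = props["DefaultRawFileContentManager"]   # name of the selected backend
url = props[backend + ".params.2"]                # its JDBC connection string
print(backend)
print(url)
```

After unescaping, the backend name is jdbc:derby and params.2 is an ordinary Derby JDBC URL, which is why only the part after derby\: needs to change when the database is moved.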
Advanced Setup of Storage Management using an arbitrary relational JDBC-Database
Storage Management depends on the following external components:
- Apache Commons Database Connection Pooling, which itself requires Commons Pool and therefore also Commons Collections
- a JDBC-driver for the database to use.
The first one should get deployed dynamically; the second you will have to install yourself, since it depends only on the RDBMS you want to use. The most common choice is MySQL, since it is also used by many gLite components such as DPM or LFC, so there is no need to set up another RDBMS. The corresponding JDBC driver is named Connector/J and is released under a dual-licensing strategy like the MySQL RDBMS itself: a commercial license and the GNU General Public License. For this reason, neither the RDBMS nor the JDBC driver is distributed directly with the gCube software.
You will have to prepare the DBMS manually to create a new database that will get used for Storage Management. For this, you may also want to install mysql-client, MySQL Administrator, and MySQL Query Browser - or a database-independent tool like ExecuteQuery. On Scientific Linux 3, the following steps need to be performed:
    apt-get install mysql-server mysql-client
    mysqladmin create <dbname>
    mysql --user=root <dbname>
The first command installs the MySQL server (if not already present) and the corresponding command-line client. The second command creates a new, empty database. The last command connects to this database using the command-line client. If the RDBMS has been set up to require a password for the local root account, use the option -p to be prompted for the password. Once you are logged in, you have to create a new user with sufficient rights in this database to connect, create and alter tables, and perform all kinds of selects, inserts, updates and deletes.
The easiest way to achieve this in MySQL is

    GRANT ALL PRIVILEGES ON <dbname>.* TO '<username>'@'%' IDENTIFIED BY '<password>';

(Until version 5.0, MySQL has its very own syntax here instead of CREATE USER - see [1] for more details.)
MySQL versions < 5 limit the size of individual database files by default to 4GB, or even 2GB on some filesystems. This might become a problem if you store either a very large number of files or just a couple of huge files, in which case MySQL might start to complain "Table is full". In this case, execute the SQL command
ALTER TABLE Raw_Object_Content MAX_ROWS=1000000000 AVG_ROW_LENGTH=1000;
to allocate pointers for bigger tables. See [2] for details.
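As a rough plausibility check of these numbers, the sketch below multiplies the two table options and derives how many bytes a row pointer needs to address that much data. It assumes the table uses the MyISAM engine, where (as far as the MySQL documentation describes it) MAX_ROWS and AVG_ROW_LENGTH serve as a sizing hint rather than a hard limit; the helper pointer_bytes and the variable names are illustrative and not part of MySQL.

```python
MAX_ROWS = 1_000_000_000      # value from the ALTER TABLE statement above
AVG_ROW_LENGTH = 1000         # bytes, value from the ALTER TABLE statement above


def pointer_bytes(addressable_bytes):
    """Smallest whole number of bytes that can hold every offset
    in the range 0 .. addressable_bytes - 1."""
    bits = (addressable_bytes - 1).bit_length()
    return (bits + 7) // 8


requested = MAX_ROWS * AVG_ROW_LENGTH   # capacity hinted to MySQL, in bytes
old_limit = 4 * 2**30                   # the 4GB default limit mentioned above

print(requested, pointer_bytes(requested))
print(old_limit, pointer_bytes(old_limit))
```

Under these assumptions the statement asks for room for 10^12 bytes (roughly 931 GiB), which needs at least a 5-byte pointer, whereas a 4-byte pointer covers exactly the 4GB limit.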
Due to an inconvenience in the MySQL protocol when transferring BLOBs of several megabytes, you might have to increase the MAX_ALLOWED_PACKET variable in my.cnf. On Scientific Linux this is located in /var/lib/mysql/ - see [3] for details.
To use MySQL, you can add the following lines to the above-mentioned configuration file:

    # local mysql settings (template for MySQL instances)
    jdbc\:mysql_local.class=org.diligentproject.contentmanagement.baselayer.rdbmsImpl.GenericJDBCDatabase
    jdbc\:mysql_local.params.count=4
    jdbc\:mysql_local.params.0=local_mysql_db
    jdbc\:mysql_local.params.1=com.mysql.jdbc.Driver
    jdbc\:mysql_local.params.2=jdbc\:mysql\://127.0.0.1/storage_db?user\=THE_USER&password\=THE_PASS
    jdbc\:mysql_local.params.3=100

You will have to change the line

    jdbc\:mysql_local.params.2=jdbc\:mysql\://127.0.0.1/storage_db?user\=THE_USER&password\=THE_PASS

in order to use the correct IP address of your server, the database name, the username and the password. This is nothing more than a regular JDBC connection string (with a \ in front of each : and = to escape them in the Java property file), so if you are familiar with that, it should be quite simple to use; otherwise there is plenty of documentation on how to make sense of it, e.g. [4].
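Since the escaping is easy to get wrong by hand, the following Python sketch shows one way to generate this line from its parts. The helpers escape_property and mysql_url_line are hypothetical, and the host, user and password in the second call are placeholders; credentials containing characters such as & or ? are not handled here.

```python
def escape_property(text):
    """Backslash-escape the characters that act as separators
    in a Java property file (':' and '='), plus the backslash itself."""
    return text.replace("\\", "\\\\").replace(":", "\\:").replace("=", "\\=")


def mysql_url_line(host, database, user, password, alias="mysql_local"):
    """Build the params.2 line for a MySQL backend named jdbc:<alias>."""
    key = "jdbc:%s.params.2" % alias
    url = "jdbc:mysql://%s/%s?user=%s&password=%s" % (host, database, user, password)
    return escape_property(key) + "=" + escape_property(url)


# Reproduces the template line shown above.
template_line = mysql_url_line("127.0.0.1", "storage_db", "THE_USER", "THE_PASS")
print(template_line)

# Placeholder values for a server on another host.
custom_line = mysql_url_line("192.0.2.10", "storage_db", "sm_user", "s3cret")
print(custom_line)
```

The first call yields exactly the template line; the second shows that only the host and the credentials differ while the escaping pattern stays the same.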
In addition, you have to set the Storage Manager to use this database by default. To do so, simply edit the lines at the top of the file to:

    DefaultRawFileContentManager=jdbc\:mysql_local
    DefaultRelationshipAndPropertyManager=jdbc\:mysql_local

If you rename consistently (or copy & paste), there is no need to stick to the name mysql_local.
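What consistent renaming means in practice can be sketched in Python as follows. The helpers rename_alias and undefined_defaults are hypothetical and operate on the raw text of the configuration; the name mysql_prod is only an example. The check assumes that a Default... line must point to a backend that has a matching .class line, which is what the snippets above suggest.

```python
import re

TEMPLATE = r"""DefaultRawFileContentManager=jdbc\:mysql_local
DefaultRelationshipAndPropertyManager=jdbc\:mysql_local
# local mysql settings (template for MySQL instances)
jdbc\:mysql_local.class=org.diligentproject.contentmanagement.baselayer.rdbmsImpl.GenericJDBCDatabase
jdbc\:mysql_local.params.count=4
jdbc\:mysql_local.params.0=local_mysql_db
jdbc\:mysql_local.params.1=com.mysql.jdbc.Driver
jdbc\:mysql_local.params.2=jdbc\:mysql\://127.0.0.1/storage_db?user\=THE_USER&password\=THE_PASS
jdbc\:mysql_local.params.3=100
"""


def rename_alias(text, old, new):
    """Rename the backend jdbc\\:<old> to jdbc\\:<new> everywhere,
    without touching the JDBC URL itself."""
    pattern = r"jdbc\\:" + re.escape(old) + r"(?=\.|$)"
    return re.sub(pattern, lambda m: "jdbc\\:" + new, text, flags=re.MULTILINE)


def undefined_defaults(text):
    """Return the Default... lines whose backend has no .class line."""
    lines = [l for l in text.splitlines() if l and not l.startswith("#")]
    keys = {l.split("=", 1)[0] for l in lines}
    return [l for l in lines
            if l.startswith("Default") and l.split("=", 1)[1] + ".class" not in keys]


renamed = rename_alias(TEMPLATE, "mysql_local", "mysql_prod")
print(renamed)
print(undefined_defaults(renamed))
```

If only some of the lines are renamed, undefined_defaults reports the Default... lines that no longer point to a defined backend.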