Database Guidelines

From LiquidPubWiki

Jump to: navigation, search

In order to work with the database these are the recommended guidelines

Contents

Database Schema

The latest snapshot of the main LP DB schema is available at https://dev.liquidpub.org/svn/liquidpub/prototype/ljdemo/server/resources/meta/database/dbmodel/ (current (old) version is here: https://dev.liquidpub.org/svn/liquidpub/db/). The description of the tables is available at LP_DB (still only of the old schema). When changing the schema, please always produce SQL script and .pdf picture of the schema. Please notice: if you created the schema with automatic script generation tools like DDL export function in Oracle Data Modeler, you will need to perform an additional step before uploading the SQL script to the db, that is running the sequence creation updater script (see below for instructions).

The description of the tables for loading DBLP data is at DBLP_tables.

The description of the data we can get from SpringerLink is available at SpringerLink.

Adding sequences creation instructions to DDL SQL script

It's a three steps process:

  1. Create the DDL SQL script for your database
  2. Donwload the bash script from here
  3. Run the bash script specifying the newly created script as target. (e.g.
./createseq.sh LP_DB.sql

Now your SQL script should contain sequences creation instructions also.

Table description

The (new) schema reflects the database representation of the Liquidpub concepts and model. This is how (some of) the concepts map to the schema:

Scientific resources

There is a number of tables that store scientific resources - one table for common resources and few tables for specific types of resources. Each resource is present only in one of these tables, and there is no connection between these tables on database level.

  • sci_resource stores common resources like papers, videos, slide sets, datasets, experiments, etc.,
  • liquidjournal stores liquid journals,
  • issue stores issues

People

  • person contains information about people (authors, editors, users etc.) which may be or may not be the users of the system.

Relations

The concept of relation is implemented via tables relation, relation_type and entity_type.

  • entity_types contains types of entities that can be annotated or linked with relations, namely so far: sci_resource, liquidjournal, issue, person, reference, relation. For each entity type there should be a main table with the integer surrogate primary key. Different entity tables are not connected together.
  • relation_type stores possible types of relations in the system.

Relation "connects" pairs of entities. During system exploitation, the users will introduce different types of relations. Some of the types however will be predefined, namely:

    • between scientific resources:
      • X is_next_version_of Y,
      • X is_alternative_representation_of Y,
      • X is_performed_on (dataset) Y,
      • X is_reporting_on (experiment or dataset) Y,
      • X cites Y,
      • ...
    • between people and scientific resources:
      • X is_author_of
      • X contributed_to Y (generalization of the authorship relation),
      • X edited Y,
      • ...
  • relation stores the actual relations, containing the following information:
    • relation type (references the relation_type table),
    • relation author (who introduced the relation; references the person table),
    • entity types (reference entity_type tables),
    • entity id's (the primary keys in the entities\' main tables),
    • time when relation was last introduced.

Some of the relations that are crucial to the system are modeled as normal join tables, containing the foreign keys to their entities' tables. Among them are:

  • resources_in_issues
  • resources_in_journals
  • journal_editors
  • issue_editors
  • entity_from_infosource
  • relation_certification

Annotations

...under construction...

Certfication

...under construction...

Database Users

In order to be able to work with the database you will need that we create a DB user. Please ask for one by writing to (imran AT disi.unitn.it)

Your DB user should have permissions to create and use all kinds of objects in your schema and permissions to SELECT, INSERT, UPDATE and DELETE in all the tables in the LiquidPub schema.

Database Objects

  • If you want to create or modify an object you should test the modifications in your schema first. When you are comfortable with them then create an oracle script for them and send them to (imran AT disi.unitn.it)
  • When creating a table or field try to use descriptive and meaningful names, the table name should be in plural i.e. students, courses, students_courses, etc. The field name should be in singular i.e. name, description, etc.
  • The field that would be the id of the table should be named like the tables name in singular followed by “_id” i.e. for the “students” table its id would be “student_id”. It is also recommended that the id field should be an auto-increment field.
  • To create an auto-increment field first you have to create the table, then create a sequence object, name it the same as the table i.e. for the “students” table the sequence name would be “students_seq”. After creating the sequence, create a trigger before insert for the table and bound the sequence with the id field.

Example of using sequence

Here create the table,

  CREATE TABLE "PROJECT"."STUDENTS" ( "STUDENT_ID"
  INTEGER NOT NULL , "FIRST_NAME" VARCHAR2(100) NOT NULL , "LAST_NAME"
  VARCHAR2(100) NOT NULL , "TITLE" VARCHAR2(100), "EMAIL" VARCHAR2(100),
  "URI" VARCHAR2(1000), CONSTRAINT "STUDENTS_PK" PRIMARY KEY ("STUDENT_ID") VALIDATE );

After that create the sequence,

  CREATE SEQUENCE "PROJECT "."STUDENTS_SEQ" NOCYCLE NOORDER NOCACHE NOMAXVALUE MINVALUE 1 INCREMENT BY 1 START WITH 1;

And last create the trigger,

  CREATE OR REPLACE TRIGGER “PROJECT "."BI_STUDENT" BEFORE
  INSERT ON "PROJECT "."STUDENTS" FOR EACH ROW begin
  select "STUDENTS_SEQ".nextval into :NEW.STUDENT_ID from dual;
  end;

Now you have a table with an auto-increment id field.

Accessing DB from an application

The preferred way to access DB is through JNDI-configured DataSources. In order to be able to use them, you'll need to

  1. Update your application configuration, and replace the code that retrieves instance(s) of JDBC connection directly from the Driver with a code that exploits JNDI to fetch a DataSource object, and use that one to retrieve a connection.
  2. Update the application server configuration, and define a new JNDI resource in the global naming resources space, with the same name you used in your application configuration and the proper connection parameters.

In a certain sense, you export your DB connection configuration to the applications server level and leave only a reference to the DataSource object through JNDI behind it.

This also has an important consequence on library dependencies for the application: since it's the application server that takes charge of concretely instantiating the connection with the DB, there's no need for the application to drag along the JDBC driver once used for setting up the connection. Hence, JDBC driver library can be safely removed from web apps libraries: conversely, it needs to be added to application server's classpath.

Example: migrate existing Appfuse-light derived app

In this case, data access is made through Hibernate, so you'll have to alter Hibernate's own configuration files to instruct it to use provided DataSource instead of trying to issue a standalone connection with the DB.

Let's suppose you have a connection to an Oracle DB and would like to turn it into a DataSource called "jdbc/yourName", and uses Tomcat 6 as your application server.

Basically, you have to edit the following files in your application source code:

  1. src/main/resources/jdbc.properties

Comment out lines starting with "jdbc", and add the following line

hibernate.connection.datasource=java:/comp/env/jdbc/yourName
  1. src/main/webapp/WEB-INF/applicationContext-hibernate.xml

Drop old "dataSource" bean definition and replace it with the following:

    <!-- New configuration using JNDI -->    
    <bean id="dataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
        <property name="jndiName" value="${hibernate.connection.datasource}" />    
        <!--  Fallback to a non-jndi datasource if we are not in the container -->
        <property name="defaultObject" ref="fallbackDataSource" />
    </bean>

For JUnit integration (mostly for local development purposes), JNDI configuration will not work because JUnit tests are not launched within a container (e.g. servlet). Hence, for local testing purposes, old "dataSource" configuration should remain and be referenced by the "defaultObject" property. This is a temporal solution until we find a better way of integrating JUnit and JNDI.

    <!-- Configuration without using JNDI -->
    <bean id="dataSourceTest" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
        <property name="driverClassName" value="${jdbc.driverClassName}"/>
        <property name="url" value="${jdbc.url}"/>
        <property name="username" value="${jdbc.username}"/>
        <property name="password" value="${jdbc.password}"/>
        <property name="maxActive" value="30"/>
        <property name="maxIdle" value="10"/>
        <property name="maxWait" value="1000"/>
        <property name="defaultAutoCommit" value="true"/>
    </bean>

Edit "hibernateProperties" subsection inside sessionFactory bean definition; in my own experience, these lines do suffice:

        <property name="hibernateProperties">
            <props>
                <prop key="hibernate.dialect">${hibernate.dialect}</prop>
                <prop key="hibernate.connection.datasource">${hibernate.connection.datasource}</prop>
            </props>
        </property>
  1. src/main/webapp/WEB-INF/web.xml

Add the following lines:

	<resource-ref>
		<description>A meaningful description</description>
		<res-ref-name>jdbc/yourName</res-ref-name>
		<res-type>javax.sql.DataSource</res-type>
		<res-auth>Container</res-auth>
	</resource-ref>
  1. src/main/webapp/META-INF/context.xml

Inside the Context root element, type:

  <ResourceLink name="jdbc/yourName"
            global="jdbc/yourName"
            type="javax.sql.DataSource" />

Then, last but not least, don't forget to update your web server configuration file; speaking of Tomcat 6, it would be

  • CATALINA_HOME/conf/server.xml

Locate the "GlobalNamingResource" tag and and the following element inside it:

    <Resource name="jdbc/yourName" auth="Container"
              type="javax.sql.DataSource"
              maxActive="100" maxIdle="30" maxWait="10000"
              username="<username>" password="<password>" driverClassName="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@<server-address>:1521:<instance-sid>"/>

Obviously you'll have to put your actual DB connection parameters values instead of the placeholders: for a deeper understanding of the other parameters, please have a look at Tomcat's official JNDI Documentation.

Finally, remember to move JDBC driver corresponding jar library from src/main/webapp/WEB-INF/lib to CATALINA_HOME/lib, and of course restart Tomcat.

Querying the database

  • When using SELECT statements in the code try to avoid the SELECT * FROM table1 type of statement, so that if someone adds or removes columns from the table your queries will still be working and also you would not be overcharging the network with information you would not use.
  • When using INSERT statements use the long format i.e. INSERT INTO table1 (column1, column2) VALUES (value1, value2) instead of INSERT INTO table1 VALUES (value1, value2) because if someone adds or removes columns from the table you will get an error or have your data inserted in the wrong columns.

DB tools

See Development_tools#DB_design_and_access

Personal tools