Adding a node to the Astro-WISE federation

Adding a complete node to the Astro-WISE federation requires the installation of the Astro-WISE software, a database, one or more dataservers. The database and dataservers then have to be connected to the other nodes in the federation. Here we describe both the local requirements (such as hardware, database layout, diskspace), the requirements for the connections between the nodes (such as firewall settings, bandwidth) and the installation and configuration of the components.

Firewall setup

Nodes that participate in a federation have to be accessed by the other nodes. This means that for the database and dataservers some network ports need to be opened for all participating nodes.

For the database host, port 1521/tcp needs to be open to

db.astro.rug.astro-wise.org
db.astro.uni-bonn.astro-wise.org
db.na.astro.astro-wise.org
db.ocam.mpe.astro-wise.org
...

For the dataserver host, port 8000/tcp needs to be open to

ds.astro.rug.astro-wise.org
ds.astro.uni-bonn.astro-wise.org
ds.na.astro.astro-wise.org
ds.ocam.mpe.astro-wise.org
...

If a cluster of dataservers is available, it is sufficient to open the necessary port for only one of these.

Database creation and configuration

It is assumed that the most recent version of Oracle has been installed and that no database has yet been created. The database has to be created with the following settings.

  • The default blocksize has to be 32k

  • ASM—Automatic Storage Management—has to be used for the device. This gives maximum flexibility, performance and relialibility.

    • Initially, one ASM diskgroup should be sufficient
    • The first diskgroup should be called AWDISKGROUP1
    • New harddisks can be added to an ASM diskgroup in a working database. Existing harddisks can be taken off-line in a working database. In both cases the data will be rebalanced automatically.
  • The following tablespaces should be defined.

    • AWLISTS, add eight datafiles, each having autoextend with 100MB to maxsize. This allows for 1056GB of data, before additional datafiles have to be added.
    • AWINDX, add four datafiles, each having autoextend with 100MB to maxsize. This allows for 528GB of data, before additional datafiles have to be added.
    • UNDOTBS1 UNDO tablespace <= 4GB autoextend=off
    • TEMP temporary tablespace <= 2GB autoextend=on maxsize=2GB
    • USERS, make sure that the datafile has autoextend with 100MB to maxsize.

    USERS has to be the default tablespace and TEMP has to be the default temporary tablespace. For certain operations, the TEMP and UNDOTBS1 may be too small. For TEMP you can increase the maxsize. For the UNDO tablespace you should create an UNDOTBS2 tablespace which can grow indefinitely. After that you should make UNDOTBS2 the default UNDO tablespace, perform your operation, make UNDOTBS1 the default tablespace and drop UNDOTBS2. The reason to do that is that otherwise the UNDO tablespace can grow very big and allow runaway transactions to run for hours before failing. 4GB corresponds to a transaction that takes about an hour, which should be sufficient for operations on tables of sizes up to 300GB.

  • Archive logging has to be set to on. This allows for online backups to be made. A backup of a 300GB database typically takes several hours, during which the database would not be available if it had to be backed up off-line.

  • Set the service name of the database equal to its fqdn.

    ALTER SYSTEM SET SERVICE_NAMES = 'db.?.astro-wise.org' SCOPE=BOTH;
    

    Replace the question mark appropriately for your domain. Using the fqdn of the database host as the database service name will make it possible to connect to the database without any client configuration. No tnsnames.ora needs to be present in that case on the client.

  • Enable the usage of global_names.

    ALTER SYSTEM SET GLOBAL_NAMES = TRUE SCOPE=BOTH; ALTER DATABASE
    RENAME GLOBAL_NAME TO DB.?.ASTROWISE.ORG;
    

    If the database name contains dashes they have to be removed when renaming the GLOBAL_NAME of the database.

  • Make sure that your listener runs in shared mode and execute

    ALTER SYSTEM SET DISPATCHERS = '(PROTOCOL=TCP) (DISPATCHERS=7)';
    

    If each client would make a so-called “dedicated” connection, the database host would be saturated quickly when parallel processing takes place. This starts to become noticeable when you have only a few hundred MB left on your database host and you have one hundred or more parallel connections. With lsnrctl services you can discover how and how often there has been a connection to your database. Ideally, the number of established DEDICATED REMOTE SERVER connections should be zero.

  • You have to explicitly enable checking of resource limits. You do this with

    ALTER SYSTEM SET RESOURCE_LIMIT = TRUE;
    

    The database default is that resource limits that are set in database profiles will not be enforced.

  • There should be ten redo logs of 2GB, where each redo log has only one member. For typical workloads this is more than enough, but can be required for some long running maintenance transactions.

Streams configuration

Oracle Streams is set up at a node by taking the following steps.

  1. Change the STRMADMIN password and datafile for the LOGMNRTS tablespace in

    common/toolbox/dbmakestrmadmin.sql
    

    and run the script as SYS in sqlplus.

  2. Make database links to other nodes as STRMADMIN, given the following name for the database link and the fqdn of the node.

    <linkname> = DB.ASTRO.RUG.ASTROWISE.ORG
    <nodename> = DB.ASTRO.RUG.ASTRO-WISE.ORG
    

    Note that a dash is not allowed in the linkname. Connect as STRMADMIN before making the link with the following command

    CREATE DATABASE LINK <linkname>
    CONNECT TO STRMADMIN
    IDENTIFIED BY <strmadmin password>
    USING '<nodename>';
    

    and repeat this for all remote nodes that one wants to connect to.

  3. Create a datapump directory and grant access to it by the STRMADMIN user.

    CREATE OR REPLACE DIRECTORY AWDATAPUMP AS '/your/awdatapump/directory';
    GRANT READ, WRITE ON DIRECTORY AWDATAPUMP TO STRMADMIN;
    

Maintenance

Cleaning up deleted files and database objects

The common/toolbox/dbvacuum.py script is used to put aside files on the dataservers and delete corresponding objects from the database, which users have deleted. The files will be moved to the ddata directory on the dataservers from which they can be removed for all eternity.

Archivelog backup

The archivelog should be backed up daily using a cron job similar to

TAG=`date +BACKUP_AW01.ARCH_%y%m%d%H%M%S`

$ORACLE_HOME/bin/rman target / log=$HOME/Logs/$TAG.log <<EOF
backup device type disk tag '$TAG' archivelog all not backed up delete all input;
EOF