Mitratech Success Center

Installing and Configuring Data Warehouse

If you are upgrading Data Warehouse, follow the instructions in Upgrading from Earlier Versions below and disregard the rest of this section, which pertains to brand-new installations.

Upgrading from Earlier Versions

The only direct upgrade path to version 6.1.1 is from Data Warehouse version 5.x. If you have a version of Data Warehouse earlier than 4.0, you must remove your existing Data Warehouse (see Removing the Data Warehouse), then recreate it using the instructions in New Installations.

Important: Do not upgrade Data Warehouse until TeamConnect is already at your desired version. Ensure that the TeamConnect version is correct before proceeding.

To upgrade from version 4.0, open the directory TeamConnect_DataWarehouse on the installation media and copy its contents to a target directory on your hard drive. (For the purposes of this documentation, we presume that your target directory is named datawarehouse and that it is a subdirectory of your hard drive's root directory.)

  1. Ensure that the kettle.properties file contains the correct connection information for your Data Warehouse database. Details about connection properties are found in step 3 of New Installations.
  2. Modify shared.xml and kettle.properties, as described in step 3 of New Installations. Then return to this procedure.
  3. For SQL Server and NTLM authentication only, add two new attributes to each affected connection element in the shared.xml file, as shown in the attributes element in the example below. This example shows the source database connection, but NTLM can be used on source and/or target database connections.

    <connection>
        <name>Source_TeamConnect_data</name>
        <server>${jdbc.source_TC_hostname}</server>
        <type>MSSQL</type>
        <access>Native</access>
        <database>${jdbc.source_TC_dbname}</database>
        <port>${jdbc.source_TC_portnumber}</port>
        <username>${jdbc.source_TC_username}</username>
        <password>${jdbc.source_TC_userpassword}</password>
        <servername/>
        <data_tablespace/>
        <index_tablespace/>
        <attributes>
            <attribute>
                <code>EXTRA_OPTION_MSSQL.domain</code>
                <attribute>DOMAIN_NAME</attribute>
            </attribute>
        </attributes>
    </connection>


    For DOMAIN_NAME, substitute the actual Windows domain name that will be doing the authentication.
     
  4. Run the upgrade scripts for your version and operating system (.bat for Windows, .sh for Linux). As of Data Warehouse 4.1, you can simply run the WH_upgrade.bat or WH_upgrade.sh file to be upgraded automatically through all preceding versions to 6.0. However, if you prefer, or if an error is encountered, you can proceed stepwise through each upgrade as earlier Data Warehouse releases required.

Important: After each upgrade script, you must run the installer. Run the correct WH_install.bat or WH_install.sh file depending upon your operating system. These files are in the data warehouse target directory.

The respective paths are as follows:

  a. Data Warehouse 4.0:

Run WH_upgrade_40U1.bat or WH_upgrade_40U1.sh, then run the installer.
Run WH_upgrade_402.bat or WH_upgrade_402.sh, then run the installer.
Run WH_upgrade_41.bat or WH_upgrade_41.sh, then run the installer.
Run WH_upgrade_50.bat or WH_upgrade_50.sh, then run the installer.

  b. Data Warehouse 4.0 Update 1:

Run WH_upgrade_402.bat or WH_upgrade_402.sh, then run the installer.
Run WH_upgrade_41.bat or WH_upgrade_41.sh, then run the installer.
Run WH_upgrade_50.bat or WH_upgrade_50.sh, then run the installer.

  c. Data Warehouse 4.0 Update 2:

Run WH_upgrade_41.bat or WH_upgrade_41.sh, then run the installer.
Run WH_upgrade_50.bat or WH_upgrade_50.sh, then run the installer.

  d. Data Warehouse 5.0:

Run WH_upgrade_501.bat or WH_upgrade_501.sh, then run the installer.

  e. Data Warehouse 5.1:

Run WH_upgrade_51.bat or WH_upgrade_51.sh, then run the installer.
Run WH_upgrade_511.bat or WH_upgrade_511.sh, then run the installer.
Run WH_upgrade_512.bat or WH_upgrade_512.sh, then run the installer.
Run WH_upgrade_513.bat or WH_upgrade_513.sh, then run the installer.

  f. Data Warehouse 6.0:

Run WH_upgrade_600.bat or WH_upgrade_600.sh, then run the 6.0 installer.

  g. Data Warehouse 6.1:

Run WH_upgrade_611.bat or WH_upgrade_611.sh, then run the 6.1.1 installer.

  h. Data Warehouse 6.3:

Run WH_upgrade_62.bat or WH_upgrade_62.sh, then run the 6.2 installer.

Note: If you are using Oracle for Data Warehouse 6.3, download the ojdbc6.jar or ojdbc8.jar file and place it in the <openJDK folder>\jre\lib\ext folder.

If you are using MSSQL for Data Warehouse 6.3, download the mssql-jdbc-7.4.1.jre8 file and place it in the <openJDK folder>\jre\lib\ext folder.

New Installations

Important: Upgrading from version 4.0 is supported.

Installation of Data Warehouse creates the Data Warehouse target database schema and makes a few schema changes to the source database.

Note: Installation is primarily concerned with schema changes and does not populate the target database with data, except for tables WH_CATEGORIES and WH_CURRENCIES. For more information about these two tables, see How the ETL Job Works. Data for all other Data Warehouse tables are populated in the separate task Refreshing the Data Warehouse.

The following tools are used for Data Warehouse:

  • Pentaho - An enterprise reporting, analysis, dashboard, data mining, and workflow application with ETL capabilities for business intelligence needs. For details, see:
    www.pentaho.com/products/data_integration
  • Spoon - A GUI application for designing transformations and jobs, which can be run with other Kettle tools such as Pan (transformations) and Kitchen (jobs).

Prior to installing Pentaho tools and configuring Data Warehouse, it is recommended that you shut down TeamConnect.

Note: TeamConnect Data Warehouse does not support customization of its scripts, functionality, or target database schema.

To install TeamConnect Data Warehouse

  1. From the installation media, open the directory TeamConnect_Datawarehouse and copy the contents to a target directory on your hard drive. (For purposes of this documentation, we will presume that your target directory is named datawarehouse and it is a subdirectory of your hard drive's root directory.)
    datawarehouse/.kettle contains several files, including two that must be edited:
    • kettle.properties
    • shared.xml
  2. Ensure that source\openjdk-8u275-b01_linux\jre\bin\java is executable.
  3. Open the kettle.properties file and verify the information shown in the listing below. Edit that information, replacing the existing values with values specific to your own databases.

    jdbc.source_TC_hostname=10.0.2.23
    jdbc.source_TC_dbname=QDB12O9I
    jdbc.source_TC_portnumber=1521
    jdbc.source_TC_username=TeamConnect
    jdbc.source_TC_userpassword=TeamConnect
    jdbc.source_TC_dbtype=ORACLE
    jdbc.target_TC_hostname=10.0.2.23
    jdbc.target_TC_dbname=QDB12O9I
    jdbc.target_TC_portnumber=1521
    jdbc.target_TC_username=DataWarehouse
    jdbc.target_TC_userpassword=DataWarehouse
    jdbc.target_TC_dbtype=ORACLE

    The example above shows Oracle values. Here is an example of some Microsoft SQL Server values:
    jdbc.target_TC_hostname=WEBISQL2002
    jdbc.target_TC_dbname=TC24
    jdbc.target_TC_portnumber=1433
    jdbc.target_TC_username=sa
    jdbc.target_TC_userpassword=password
    jdbc.target_TC_dbtype=MSSQL


    If your source and target databases run on SQL Server and you wish to use NTLM authentication to connect to them, leave empty the values of jdbc.source_TC_username, jdbc.source_TC_userpassword, jdbc.target_TC_username, and jdbc.target_TC_userpassword.

    You will also need to modify the shared.xml file, as described in the next step.
    The examples above show database passwords listed in plain text. Optionally, you can omit the password lines entirely, or leave the value after the = sign blank, and instead modify the shared.xml file to use an encrypted password as described in step 12.
    If your Data Warehouse application is behind a firewall, you may need to add an SSL specification to the TC_dbname property, as in the example line below.
    jdbc.target_TC_dbname=TC24;ssl=request
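    As a sanity check before running the installer, the connection properties above can be validated programmatically. The following is a minimal, hypothetical Python sketch (not part of the Data Warehouse tooling) that parses a kettle.properties-style file and reports any missing required connection keys:

```python
# Hypothetical sanity check for kettle.properties -- not part of Data Warehouse.
REQUIRED_KEYS = [
    f"jdbc.{side}_TC_{field}"
    for side in ("source", "target")
    for field in ("hostname", "dbname", "portnumber", "username", "userpassword", "dbtype")
]

def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def missing_keys(props):
    """Return the required jdbc.* keys that are absent from the file."""
    return [k for k in REQUIRED_KEYS if k not in props]

sample = """
jdbc.source_TC_hostname=10.0.2.23
jdbc.source_TC_dbname=QDB12O9I
jdbc.source_TC_portnumber=1521
jdbc.source_TC_username=TeamConnect
jdbc.source_TC_userpassword=TeamConnect
jdbc.source_TC_dbtype=ORACLE
jdbc.target_TC_hostname=10.0.2.23
jdbc.target_TC_dbname=QDB12O9I
jdbc.target_TC_portnumber=1521
jdbc.target_TC_username=DataWarehouse
jdbc.target_TC_userpassword=DataWarehouse
jdbc.target_TC_dbtype=ORACLE
"""
props = parse_properties(sample)
print(missing_keys(props))  # [] when every required key is present
```

    Note that a key may be present with an empty value on purpose (for example, the username and password keys under NTLM authentication), so this sketch checks only for presence, not for non-empty values.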
  4. The WH_COMMIT_SIZE=5000 field sets the number of records committed during the ETL process. This value defaults to 5000. For large data warehouse setups suffering slow runtimes, consider increasing this value.

    WH_BATCH_SIZE and WH_DELETE_SIZE batch the process: instead of processing all of the data at once, the job works through smaller sets of data. This reduces memory use and prevents the SNAPSHOT_TOO_OLD issue.
    Note:

    In most cases, the following setting suffices:

    WH_BATCH_SIZE=100000

    However, if the T_ tables (for example, T_PROJECT, T_INVOICE, T_HISTORY) grow to millions of rows, you may increase the batch size to 500000.

    If you are getting SNAPSHOT_TOO_OLD issues, decrease the batch size.
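    Taken together, the sizing settings might appear in kettle.properties as follows. These values are illustrative starting points, and the WH_DELETE_SIZE value is an assumption (no default is documented here):

```properties
# Records committed per ETL transaction (default 5000)
WH_COMMIT_SIZE=5000
# Records processed per batch; raise toward 500000 for very large T_ tables
WH_BATCH_SIZE=100000
# Records deleted per batch (illustrative value; no default documented)
WH_DELETE_SIZE=100000
```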

  5. LOAD_ACCOUNT_TRANSACTIONS is set to NO by default, which prevents account transactions from being loaded into the data warehouse. If you need to store account transaction information in the data warehouse, set LOAD_ACCOUNT_TRANSACTIONS to YES.

  6. For SQL Server and NTLM authentication only, add two new attributes to each affected connection element in the shared.xml file, as shown in the attributes element in the example below. This example shows the source database connection, but NTLM can be used on source and/or target database connections.
    <connection>
        <name>Source_TeamConnect_data</name>
        <server>${jdbc.source_TC_hostname}</server>
        <type>MSSQLNATIVE</type>
        <access>Native</access>
        <database>${jdbc.source_TC_dbname}</database>
        <port>${jdbc.source_TC_portnumber}</port>
        <username>${jdbc.source_TC_username}</username>
        <password>${jdbc.source_TC_userpassword}</password>
        <servername/>
        <data_tablespace/>
        <index_tablespace/>
        <attributes>
            <attribute>
                <code>EXTRA_OPTION_MSSQL.domain</code>
                <attribute>DOMAIN_NAME</attribute>
            </attribute>
        </attributes>
    </connection>



    For DOMAIN_NAME, substitute the actual Windows domain name that will be doing the authentication.

  7. The kettle.properties file also has a parameter, WH_LOOKUP_POPULATION, that must be evaluated now, before installation. There are four possible parameter values. The parameter value determines what is placed in Data Warehouse tables that relate to custom fields containing lookup table values.
  • Parameter value NAME would cause this lookup table value to appear in Data Warehouse as LLC.
  • Parameter value FULL_NAME would cause this lookup table value to appear in Data Warehouse as Vendor:Corporation:LLC.
  • Parameter value FULL_NAME_NO_ROOT would remove the Root category from the name; i.e., "Root|Value1|Value1.1" becomes "Value1|Value1.1".
  • Parameter value TREE_POSITION would cause this lookup table value to appear in Data Warehouse as VEND_CORP_LLC1. In this case, the report designer would then need to create a join to the lookup table to retrieve the text description.

Important: If this parameter is changed after you have already begun using Data Warehouse, your data will be inconsistent. The only option in that case is to remove and reinstall Data Warehouse. Therefore, carefully evaluate which of the possible values to use for this parameter.
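To make the four options concrete, here is a hypothetical Python sketch (not product code) of how each WH_LOOKUP_POPULATION value might render a category path such as Vendor > Corporation > LLC; the separator characters and the tree-position code are assumptions for illustration only:

```python
# Hypothetical illustration of WH_LOOKUP_POPULATION rendering -- not product code.
def lookup_value(path, tree_position, mode):
    """path lists category names from root to leaf, e.g. ["Vendor", "Corporation", "LLC"]."""
    if mode == "NAME":
        return path[-1]                # leaf name only
    if mode == "FULL_NAME":
        return ":".join(path)          # full path including the root
    if mode == "FULL_NAME_NO_ROOT":
        return ":".join(path[1:])      # full path with the root category removed
    if mode == "TREE_POSITION":
        return tree_position           # position code; a join is needed to resolve text
    raise ValueError(f"unknown mode: {mode}")

path = ["Vendor", "Corporation", "LLC"]
for mode in ("NAME", "FULL_NAME", "FULL_NAME_NO_ROOT", "TREE_POSITION"):
    print(mode, "->", lookup_value(path, "VEND_CORP_LLC1", mode))
```

The TREE_POSITION branch shows why the report designer must join back to the lookup table: the stored value is a position code, not a human-readable description.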

8. The kettle.properties file has a new section that allows the data warehouse refresh job to refresh custom materialized views. The parameters are:

  • WH_MVIEW_REFRESH=[YES/NO] - Set to NO to skip the MView refresh entirely; use this if you do not have custom MViews. Set to YES to refresh custom MViews at the end of the DW refresh. Any auto-refresh MViews will be altered to REFRESH ON DEMAND.
  • WH_MVIEW_REFRESH_EVERYTIME='listOfMviews' - The single quotes are required. Set this to a comma-delimited list of MView names to refresh every time the DW refreshes, e.g. WH_MVIEW_REFRESH_EVERYTIME='MVIEW1, MVIEW2, MVIEW3'
  • WH_MVIEW_REFRESH_DAILY='listOfMviews' - The single quotes are required. Set this to a comma-delimited list of MView names to refresh on the first DW refresh of the day.
  • WH_MVIEW_REFRESH_WEEKLY='listOfMviews' - The single quotes are required. Set this to a comma-delimited list of MView names to refresh weekly.
  • WH_MVIEW_REFRESH_DAYOFTHEWEEK='dayOfTheWeek' - The single quotes are required. Set this to the day (SUN, MON, TUE, WED, THU, FRI, SAT) on which you want the weekly MView refresh to run. This defaults to SUN.
  • WH_MVIEW_REFRESH_MONTHLY='listOfMviews' - The single quotes are required. Set this to a comma-delimited list of MView names to refresh on the first DW refresh of the month.

WH_MVIEW_REFRESH_WEEKLY and WH_MVIEW_REFRESH_DAYOFTHEWEEK work together if you want to set up a weekly refresh of the MViews.

If WH_MVIEW_REFRESH=YES but WH_MVIEW_REFRESH_EVERYTIME, WH_MVIEW_REFRESH_DAILY, WH_MVIEW_REFRESH_WEEKLY, and WH_MVIEW_REFRESH_MONTHLY are all set to '', then all MViews owned by the user will be refreshed every time the DW refreshes.
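As an illustration, the following kettle.properties fragment refreshes a hypothetical MVIEW_A on every run and MVIEW_B once a week on Saturday; the MView names are placeholders:

```properties
WH_MVIEW_REFRESH=YES
WH_MVIEW_REFRESH_EVERYTIME='MVIEW_A'
WH_MVIEW_REFRESH_DAILY=''
WH_MVIEW_REFRESH_WEEKLY='MVIEW_B'
WH_MVIEW_REFRESH_DAYOFTHEWEEK='SAT'
WH_MVIEW_REFRESH_MONTHLY=''
```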

9. The kettle.properties file has a new section for statistics gathering. Setting WH_GATHER_STAT=YES allows the DW to gather statistics at the end of the initial load and refresh jobs.

10. The kettle.properties file also has a section that determines how you receive email notifications. To receive email notifications when the job fails or succeeds, set ENABLE_EMAIL_NOTIFICATION to YES. To receive email notifications only when the job runs successfully, set ENABLE_SUCCESS_EMAIL_NOTIFICATION to YES. If you set either of these to YES, supply specific information about your email contact and server in the other properties. If you do not want email notifications, set both enable values to NO and ignore the other properties.

#Email notification settings
ENABLE_EMAIL_NOTIFICATION=YES
ENABLE_SUCCESS_EMAIL_NOTIFICATION=YES
DESTINATION_ADDRESS=
SENDER_NAME=
SENDER_ADDRESS=
REPLY_TO_ADDRESS=
CONTACT_PERSON=
CONTACT_PHONE=
#Email server settings
SMTP_SERVER=
SMTP_PORT=
AUTHENTICATION_USER=
AUTHENTICATION_PASSWORD=

11. Open the shared.xml file and verify the contents shown in the listing below. Note that you cannot use shared.xml exactly as it appears; you must edit the <type> element inside each <connection> element to supply information specific to your own database servers ("MSSQLNATIVE" or "ORACLE").

Note: The default value listed below is for Oracle users.

<connection>
    <name>Target_TeamConnect_Warehouse</name>
    <server>${jdbc.target_TC_hostname}</server>
    <type>ORACLE</type>
    <access>Native</access>
    <database>${jdbc.target_TC_dbname}</database>
    <port>${jdbc.target_TC_portnumber}</port>
    <username>${jdbc.target_TC_username}</username>
    <password>${jdbc.target_TC_userpassword}</password>
    <servername/>
    <data_tablespace/>
    <index_tablespace/>
    <attributes>
        <attribute>
            <code>PORT_NUMBER</code>
            <attribute>${jdbc.target_TC_portnumber}</attribute>
        </attribute>
        <!-- ONLY INCLUDE THIS LINE IF USING SQL SERVER
        <attribute><code>MSSQLUseIntegratedSecurity</code><attribute>false</attribute></attribute>
        -->
    </attributes>
</connection>
<connection>
    <name>Source_TeamConnect_data</name>
    <server>${jdbc.source_TC_hostname}</server>
    <type>ORACLE</type>
    <access>Native</access>
    <database>${jdbc.source_TC_dbname}</database>
    <port>${jdbc.source_TC_portnumber}</port>
    <username>${jdbc.source_TC_username}</username>
    <password>${jdbc.source_TC_userpassword}</password>
    <servername/>
    <data_tablespace/>
    <index_tablespace/>
    <attributes>
        <attribute>
            <code>PORT_NUMBER</code>
            <attribute>${jdbc.source_TC_portnumber}</attribute>
        </attribute>
        <attribute>
            <code>SQL_CONNECT</code>
            <attribute>SET TRANSACTION ISOLATION LEVEL READ COMMITTED;</attribute>
        </attribute>
        <!-- INCLUDE THIS LINE IF USING SQL SERVER
        <attribute><code>MSSQLUseIntegratedSecurity</code><attribute>false</attribute></attribute>
        -->
    </attributes>
</connection> 

12. The example in step 6 shows passwords that use variables. In addition to this method, passwords that are used in shared.xml and kettle.properties can be encrypted. To encrypt a password, run one of the following files from the main Data Warehouse directory:

  • Encr.bat (for Windows)
  • ./encr.sh (for Linux)

When prompted, enter the plain text of your database password. The encrypted password string will be displayed, such as Encrypted FFA50T6REL88P7SQ.

Copy the entire string, including the word "Encrypted" and the space, to the clipboard. In shared.xml, in the password value that should be encrypted, paste the encrypted password string to replace the previous value.

Repeat these encryption instructions for each connection that you wish to encrypt. Save and close the file.

An example of using encrypted passwords is shown below:

<connection>
<name>Source_TeamConnect_data</name>
<server>${jdbc.source_TC_hostname}</server>
<type>ORACLE</type>
<access>Native</access>
<database>${jdbc.source_TC_dbname}</database>
<port>${jdbc.source_TC_portnumber}</port>
<username>${jdbc.source_TC_username}</username>
<password>Encrypted FFA50T6REL88P7SQ</password>
<servername/>
<data_tablespace/>
<index_tablespace/>
<attributes>
<attribute><code>PORT_NUMBER</code><attribute>${jdbc.source_TC_portnumber}</attribute></attribute>
</attributes>
</connection>
<connection>
<name>Target_TeamConnect_Warehouse</name>
<server>${jdbc.target_TC_hostname}</server>
<type>ORACLE</type>
<access>Native</access>
<database>${jdbc.target_TC_dbname}</database>
<port>${jdbc.target_TC_portnumber}</port>
<username>${jdbc.target_TC_username}</username>
<password>Encrypted FFA50T6REL88P7SQ</password>
<servername/>
<data_tablespace/>
<index_tablespace/>
<attributes>
<attribute><code>PORT_NUMBER</code><attribute>${jdbc.target_TC_portnumber}</attribute></attribute>
</attributes>
</connection>

13. Navigate to the data warehouse installation directory and run one of the following files:

  • For Windows: WH_install.bat
  • For Linux: ./WH_install.sh

IMPORTANT: Users on a 4.x version of Data Warehouse must see Upgrading from Earlier Versions before running these scripts.

Logs for the command output will be located in the ./TCWH_logs subdirectory. The log file name has the format WH_install_date_time.log, where date and time are numbers representing the time when the job was started.

Your source and target databases must each be capable of opening at least 110 simultaneous connections while this command is running.

If the TeamConnect source database has a very large number of custom objects, the database script that creates the Data Warehouse schema may be quite large. If so, be sure to maximize the amount of Java heap space allocated to this job.

How to Resume a Failed Refresh Job

If the previous run of the refresh job failed, run the following .bat or .sh file to resume the refresh job without starting it from the beginning.

You can download the files here:

TeamConnect_Warehouse_Refresh_resume.sh

TeamConnect_Warehouse_Refresh_resume.bat
