ODI 12c: Upgrade Benchmarks

Please find below some upgrade benchmarks for upgrading the (database) repository from Oracle ODI 11g to ODI 12c using the ODI Upgrade Assistant, which performs all steps necessary to convert the repository, e.g. upgrading the 11g interface concept to the 12c mapping concept, as seen in the step below:

[Image: ODI Upgrade Assistant upgrade step]

Below is a table with the benchmarks for the respective repositories:

Repository     Size in MB (measured by XML export)    Duration
MASTER         150                                    33 minutes
WORK REPO 1    250                                    15 hours
WORK REPO 2    150                                    15 hours

Oracle Support also provides a technote containing some tips to improve the process:

ODI Upgrade Hangs With Upgrade Progress Stucks at x% (Doc ID 1664300.1)

Advantages of an ETL Tool

Many IT departments still rely on SQL scripts executed within their database to move data from one database instance to another. This works fairly well, even with very basic SQL like:

INSERT INTO table1 (column1, column2, someInt, someVarChar)
SELECT table2.column1, table2.column2, 8, 'some string etc.'
FROM table2
WHERE table2.column2 IS NOT NULL;  -- any filter predicate
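To illustrate how portable this INSERT ... SELECT pattern is, here is a minimal, hypothetical sketch run against SQLite via Python's sqlite3 module; the table and column names simply mirror the generic snippet above:

```python
import sqlite3

# In-memory SQLite database as a stand-in for any SQL product
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table2 (column1 TEXT, column2 TEXT)")
con.execute("CREATE TABLE table1 (column1 TEXT, column2 TEXT,"
            " someInt INTEGER, someVarChar TEXT)")
con.executemany("INSERT INTO table2 VALUES (?, ?)",
                [("a", "x"), ("b", None)])

# The portable INSERT ... SELECT pattern: constants plus a filter predicate
con.execute("""
    INSERT INTO table1 (column1, column2, someInt, someVarChar)
    SELECT column1, column2, 8, 'some string etc.'
    FROM table2
    WHERE column2 IS NOT NULL
""")

print(con.execute("SELECT * FROM table1").fetchall())
# → [('a', 'x', 8, 'some string etc.')]
```

The same statement text would run unchanged on most SQL databases; only the connection setup differs.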

This INSERT ... SELECT statement should work on most database products. For Oracle Database, even more sophisticated capabilities exist to cope with the task of moving data from one place to another, e.g. the SQL*Plus COPY command:

COPY FROM HR/your_password@ORCL -
CREATE NEW_TABLE_SALESMEN (EMPLOYEE_ID, SA_MAN) -
USING SELECT EMPLOYEE_ID, LAST_NAME FROM EMPLOYEES -
WHERE JOB_CODE='SALESMEN';

This would “copy” the full data from one schema to another and create a new table. Even incremental loads (that is, only the changed data) can be realized quickly with the MERGE statement (often called “upsert”), e.g.:

MERGE INTO employees e
    USING hr_records h
    ON (e.id = h.emp_id)
  WHEN MATCHED THEN
    UPDATE SET e.address = h.address
  WHEN NOT MATCHED THEN
    INSERT (id, address)
    VALUES (h.emp_id, h.address);
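MERGE syntax is product-specific; purely for illustration, the same matched-update / not-matched-insert logic can be sketched with SQLite's upsert clause via Python's sqlite3 module (a minimal example, with the same table names as the Oracle MERGE above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, address TEXT)")
con.execute("CREATE TABLE hr_records (emp_id INTEGER, address TEXT)")
con.execute("INSERT INTO employees VALUES (1, 'old address')")
con.executemany("INSERT INTO hr_records VALUES (?, ?)",
                [(1, "new address"), (2, "second address")])

# SQLite has no MERGE; its upsert clause covers the same logic:
# update on key match, insert otherwise. (WHERE true is required by
# SQLite when combining SELECT with ON CONFLICT.)
con.execute("""
    INSERT INTO employees (id, address)
    SELECT emp_id, address FROM hr_records WHERE true
    ON CONFLICT (id) DO UPDATE SET address = excluded.address
""")

print(con.execute("SELECT * FROM employees ORDER BY id").fetchall())
# → [(1, 'new address'), (2, 'second address')]
```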

All of the above sample code is also very efficient, since the complete processing is done within the database and benefits from the SQL optimizer.

So to summarize: the above code is easy, can be placed in the database scheduler, and performs very efficiently within the database. So why do ETL tools exist for this task?

For more complex data movement design and operation, a dedicated ETL tool should definitely be considered for the following advantages:

  1. More flexible configuration in terms of data flow and connection deployment
  2. Monitoring of data loading workflows
  3. Robustness against errors during data loading, due to restartability from the point of failure
  4. Debugging functionality during development
  5. Structured and standardized design processes
  6. Orchestration of different loading procedures and modeling of their dependencies (load sequence)
  7. Easier parallelization (performance) of multiple workflows
  8. Implementation of business rules and data transformations
  9. Visual representation of the data flow (data lineage documentation created by the tool)
  10. Multi-user development (check-in / check-out)
  11. Built-in versioning capabilities (source code control)
  12. Easier definition of aggregation rules for data
  13. Quicker implementation of new requirements in existing mappings (e.g. adding a new column)
  14. Easier implementation of Change Data Capture by using pre-built models to utilize the same mapping for both tasks
  15. Simpler access to 3rd-party sources which need to be incorporated into the loading procedure
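To make points 3 and 6 above concrete, restartability and dependency orchestration can be sketched in a few lines of Python. This is a hypothetical minimal orchestrator, not how any particular ETL tool works internally: it runs steps in dependency order and persists finished steps to a state file, so a restarted run resumes at the point of failure.

```python
import json
import os
import tempfile

def topo_sort(deps):
    """Return step names so every step follows its dependencies."""
    order, seen = [], set()
    def visit(n):
        if n not in seen:
            seen.add(n)
            for d in deps.get(n, []):
                visit(d)
            order.append(n)
    for n in deps:
        visit(n)
    return order

def run_workflow(steps, deps, state_file):
    """Run steps in dependency order; persist progress for restarts."""
    done = set()
    if os.path.exists(state_file):
        with open(state_file) as f:
            done = set(json.load(f))
    for name in topo_sort(deps):
        if name in done:
            continue                      # restart: skip finished steps
        steps[name]()                     # an exception leaves state intact
        done.add(name)
        with open(state_file, "w") as f:
            json.dump(sorted(done), f)

# Demo: "load" fails on the first run; the restarted workflow resumes
# at the failed step without re-running extract/transform.
ran = []
attempts = {"load": 0}
def load():
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("target unavailable")
    ran.append("load")

steps = {"extract": lambda: ran.append("extract"),
         "transform": lambda: ran.append("transform"),
         "load": load}
deps = {"extract": [], "transform": ["extract"], "load": ["transform"]}

state = os.path.join(tempfile.mkdtemp(), "state.json")
try:
    run_workflow(steps, deps, state)      # first run fails at "load"
except RuntimeError:
    pass
run_workflow(steps, deps, state)          # restart from the fault point
print(ran)
# → ['extract', 'transform', 'load']
```

An ETL tool provides this (plus monitoring, parallelization, and lineage) out of the box, which is exactly what a hand-written SQL script in the database scheduler lacks.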

After all, the Oracle BI Applications product also ships with an ETL tool for data movement, such as Informatica or ODI.