Installing Oracle R on Database VM Image

Download the Oracle Pre-Built Database Developer VM from OTN.
Import the extracted VM into VirtualBox and boot the virtual machine. After starting the VM, the following screen is presented within VirtualBox:
Check the Linux version of the downloaded virtual machine with the command below:

# uname -r

For this VM the output is 4.1.12-61.1.27.el7uek.x86_64. The el7uek tag identifies Oracle Enterprise Linux 7 running the Unbreakable Enterprise Kernel.
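The version check above can also be scripted. The following is a minimal sketch, assuming the kernel release string carries an ".elN" tag as in the value shown for this VM; the helper name el_major is made up for illustration and is not part of any Oracle tooling.

```shell
# Hypothetical helper: print the Enterprise Linux major version found in a
# kernel release string, or fail if the string has no ".elN" tag.
el_major() {
  # sed prints only the digits that follow ".el"; grep . turns an empty
  # result into a non-zero exit status.
  printf '%s\n' "$1" | sed -n 's/.*\.el\([0-9][0-9]*\).*/\1/p' | grep .
}

el_major "4.1.12-61.1.27.el7uek.x86_64"   # prints 7
# On the VM itself one would call: el_major "$(uname -r)"
```

A major version of 7 means that a yum repository for Enterprise Linux 7 is the one to configure in the next step.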

Install R (distribution) for Oracle R Enterprise

Afterwards, switch to the root user (the password is “oracle”) and navigate to the yum repository configuration:
Using vi (press i to enter -- INSERT -- mode, Esc to leave it, and :wq to save), update the following sections within the repo file:



Install R with the following command, which automatically downloads the packages and performs the installation:

Installing Oracle R Enterprise Server

Add the lib directory of the Oracle home to the library path environment variable (LD_LIBRARY_PATH):
This setting can also be added to ~/.bashrc so that it survives a new login.
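As a rough sketch of this step, the lines below prepend a directory to LD_LIBRARY_PATH only when it is not listed yet, so the snippet can be sourced from ~/.bashrc repeatedly without growing the variable. The function name prepend_lib_path and the placeholder path are assumptions for illustration; on the VM, ORACLE_HOME is expected to be set already by the oracle user's profile.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH unless it is
# already one of its entries.
prepend_lib_path() {
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$1:"*) ;;   # already present, nothing to do
    *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
  export LD_LIBRARY_PATH
}

# Assumption: ORACLE_HOME is set on the VM. The fallback below is only a
# placeholder so that the sketch runs anywhere; it is not a real path.
oracle_home="${ORACLE_HOME:-/path/to/oracle_home}"
prepend_lib_path "$oracle_home/lib"
echo "$LD_LIBRARY_PATH"
```

Prepending rather than appending lets the Oracle libraries take precedence over other entries, which is usually what is wanted here.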
As the next step, download the Oracle R Enterprise software from OTN,
store it in an installation directory, e.g. /u01/ORE_Inst_Dir,
and unzip the files:
Start the installation by running the installer:

Installing Oracle R Client

Unzip the Client zip files:
Switch to the root user and install all unzipped files:
Start the Oracle R Client with the ORE command and load the Oracle R libraries with library(ORE). When loading the R libraries, the following error is thrown:

Loading required package: OREembed
Error in dyn.load(file, DLLpath = DLLpath, …) :
   unable to load shared object ‘/u01/app/oracle/product/’: cannot open shared object file: No such file or directory
Error: package ‘OREembed’ could not be loaded


Since the file is not available, it needs to be installed:
[Thanks to the following article which features the same issue]

After installing the missing file, the ORE libraries are loaded successfully within R:

Further references:

Official Oracle Documentation:

Oracle Information Management Reference Architecture – Examples from Oracle's Pre-Built Analytical Products

The set of Oracle analytical products has become quite large in recent times. At the beginning of 2015, a non-exhaustive list would probably comprise:

  • Big Data Appliance
  • Exadata
  • Exalytics
  • TimesTen
  • Endeca
  • Oracle BI
  • Data Mining
  • Oracle R
  • Essbase
  • In Database Analytical Functions
  • Hyperion Planning
  • Crystal Ball
  • RTD
  • BI Publisher
  • SmartView Office Integration

Thus, there is a need not only to integrate but also to organize all these products. Their integration on the technical level does not come out of the box and can sometimes take considerable effort, e.g. importing Essbase hierarchies into Oracle BI so that users can use these entities for analysis. To help in understanding the integration and organization of these products, together with the applications that execute the business processes, Oracle provides the Oracle Information Management Reference Architecture.

On the left side are all the data sources, which could be unstructured data in text files, master data systems for customers or products, external data (e.g. market trend or benchmarking data), and standard or self-written applications. The data is extracted from these systems and loaded into the Staging Layer. This is a temporary load whose only purpose is to bring the data into the Foundation Layer. After the data loading job to the Foundation Layer has completed, e.g. for the order data of a certain month, the data is removed from the staging area.

The data within the Foundation Layer should be normalized, since it is taken over from the operational sources, which are mostly in 3NF. Normalized data also uses less disk space than denormalized data.

The Access and Performance Layer is the area where most data should be accessed, given that within this area the data is optimized for analytical processing, e.g. by using a star schema setup or any other data warehouse query optimization technique. The pre-built horizontal BI Applications also belong to this area, since they also use star schemas.

Data from any of the other layers, such as the source systems, the Staging Layer, and the Foundation Layer, can also be accessed by the analytical tools, which removes the need to transform or tune all data within the Access and Performance Layer. The BI Server itself functions as an additional abstraction from the physical storage of the data within the warehouse. Users gain insight either by accessing the BI Server (the abstraction layer) directly to create reports, or indirectly by receiving alerts (e.g. when certain thresholds are met) or a weekly business report by mail. Data scientists and statisticians do their work within the Analysis Sandpit to test their hypotheses.

While typical reporting data, e.g. revenue for a region or a store, is understood by most business users, conveying advanced analysis produced by data scientists or statisticians is more difficult. Typically, data scientists and statisticians use dedicated tools (desktop client software) that can only be operated by trained employees.

Thus, a question that naturally arises is: how can the business insight gained within the Analysis Sandpit by data scientists or statisticians be distributed to the general business user?

Oracle's own pre-built analytical products give some great examples of how simple “system of record” reports and more advanced statistical reports can be used to provide business users with the required insight.

The Oracle Communications Data Model uses the Support Vector Machine algorithm to determine a correlation between certain attributes of a customer and their decision to cancel (churn) their contract or subscription.

The above report lists, in ranked order, the most important attributes that characterize customers who cancelled their subscription. The Contract Left Days attribute has a very high importance for (or correlation with) customers leaving. This information can be used to drive targeted marketing campaigns that make a better retention offer to the customer. The report below (again from the Communications Data Model) combines the predicted churn probability with the contract information for the revenue band:

The Sample App image also contains some advanced (data mining based) analysis mixed with normal attributes to bring the insight to the line of business:

The above report predicts the lifetime value (LTV) of the customers and gives the number of customers within each predicted LTV band. Finally, this can also be combined with MapViewer, allowing geospatial analysis. The report below from the Sample App shows the LTV of customers plotted on the map:

This can be done because the Oracle products store all this model information within the Oracle Database. Thus, one can use Oracle BI metadata to build an RPD model on top, providing convenient access to the information via the web browser. The example below is from the Oracle RTD integration with Oracle BI EE:

For further reading on the Oracle Information Management Architecture itself (rather than the examples), see the following links:

Trying to understand the Oracle Reference Architecture for Information Management

Information Management and Big Data (From Oracle)

Evolution of Information Management Architecture and Development (co-published by Oracle and Rittman Mead)

Network Analysis with Oracle Communications Data Model and Cytoscape

One advanced feature nowadays is to display a network in a more visual form. This is often applied to social networks, where people communicate with and may influence each other, but it works equally well for a more traditional scenario like a telephone network.

Oracle provides a standard data model for the telecommunications industry that holds, among other things, all the CR (Call Record) data. One way to use this CR data is to visualize a network based upon the call records, identify people who might churn, and check what effect this might have on other customers. The OCDM product is further described here. Additionally, a tutorial is provided here.

The churn calculation is performed by another algorithm of the standard data model, but this information can now be included visually within the network analysis.

Thus, we want to display the following information:

  • Whether a customer is at risk of churning
  • Call volume between network participants
  • Revenue generated by each network participant

We supply the above information using two kinds of data sets:

Descriptive attributes of the network participants (nodes), including revenue information (from the billing system) and churn information (calculated by another data mining algorithm):

And the information about the call volume between the network participants (nodes):

To do this, we can export the highly structured data from the Oracle Communications Data Model and load it into open source software such as Cytoscape (version 2.6.2 has been used in the following examples).
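To make the export step more concrete, here is a minimal sketch of how detail-level call records could be aggregated into one edge row per pair of participants before loading them as a delimited text file. Only the TOTAL_CALL_CNT name comes from this article; the sample rows, the SOURCE and TARGET column names, the helper name aggregate_calls, and the choice to treat calls in both directions as one edge are assumptions for illustration and do not reflect the actual OCDM table layout.

```shell
# Hypothetical detail-level call records: caller,callee,call_cnt
calls_csv='A,B,3
B,A,2
A,C,1
A,B,4'

# Sum the calls per undirected pair and emit SOURCE,TARGET,TOTAL_CALL_CNT rows.
aggregate_calls() {
  awk -F, '
    {
      # Order the pair so that A->B and B->A count towards the same edge.
      if ($1 <= $2) key = $1 "," $2; else key = $2 "," $1
      total[key] += $3
    }
    END { for (k in total) print k "," total[k] }
  ' | sort
}

echo "SOURCE,TARGET,TOTAL_CALL_CNT"
printf '%s\n' "$calls_csv" | aggregate_calls
```

For the sample rows this yields the edges A,B,9 and A,C,1. In practice the aggregation would more likely be done in SQL inside the database, with only the result exported.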



We can immediately see that the network is visualized and gives an impression of the information. If we had only the call information in a tabular format like the one above, it would have been very difficult for humans to understand and interpret.

We have many configuration options available for this diagram, which we can use to map and visualize our information. For example, we can use the call volume expressed as TOTAL_CALL_CNT (aggregated from the detail-level records beforehand) as the line width, the churn indicator as the node color, and the revenue (as a revenue band, also aggregated beforehand) as the node size.
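Cytoscape performs these mappings itself once the attributes are chosen, so nothing has to be precomputed. Purely to illustrate the arithmetic behind such mappings, the sketch below scales TOTAL_CALL_CNT linearly onto a line width and turns a churn flag into a color. The width range of 1 to 10, the flag encoding (1 = churn risk, shown in red), the column layout, and both function names are assumptions, not Cytoscape settings.

```shell
# Continuous mapping: scale TOTAL_CALL_CNT linearly onto a width in [lo, hi].
# Input rows: SOURCE,TARGET,TOTAL_CALL_CNT (no header); the width is appended.
add_line_width() {
  awk -F, -v lo=1 -v hi=10 '
    {
      row[NR] = $0; cnt[NR] = $3 + 0
      if (NR == 1 || cnt[NR] < min) min = cnt[NR]
      if (NR == 1 || cnt[NR] > max) max = cnt[NR]
    }
    END {
      for (i = 1; i <= NR; i++) {
        # If all counts are equal there is nothing to scale; use the minimum width.
        w = (max == min) ? lo : lo + (cnt[i] - min) * (hi - lo) / (max - min)
        printf "%s,%.1f\n", row[i], w
      }
    }'
}

# Discrete mapping: churn flag in column 2 (assumed 1 = churn risk) to a color.
add_node_color() {
  awk -F, '{ print $0 "," ($2 == 1 ? "red" : "green") }'
}

printf 'A,B,9\nA,C,1\nB,C,5\n' | add_line_width
printf 'A,1,HIGH\nB,0,LOW\n' | add_node_color
```

The busiest edge receives the maximum width and the quietest the minimum, which is what makes heavily used connections stand out in the diagram.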

When zooming in, one can see that the line width denotes the call volume between the individual nodes and that the colors red and green indicate which nodes carry a churn risk:

We can now use the VizMapper to further enhance the readability of the Network.

One can also select individual nodes for further analysis and see exactly which kinds of nodes belong to a certain part of the network. In the example below, the top right corner has been highlighted, and for a clearer visualization the style has been changed to Universe in the VizMapper property section.

Using open source software like Cytoscape together with the Oracle Communications Data Model, only relatively moderate data preparation and visualization effort is required to perform the network analysis.