Advanced Analytics within Oracle Cloud Services

Oracle Autonomous Data Warehouse (ADW) comes with Oracle Machine Learning (OML), which is based on Zeppelin notebooks and currently supports only SQL. Thus, any advanced analytics has to be expressed in the SQL or PL/SQL of the Oracle Database and is limited to the advanced analytics functions that the database supports. There are also plans to provide Python within OML. This would have the advantage that data is not held in the memory of a local machine running Python, but natively in database tables, which allows much higher data volumes than local machines such as desktops or laptops can handle. The Zeppelin notebooks may therefore allow using Python in the future.
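The difference between holding data in local memory and leaving it in database tables can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the Oracle Database (an assumption for illustration; it does not show OML's own interface), and the table and column names are hypothetical. The aggregation runs inside the database engine, so only the small result set reaches the Python process.

```python
import sqlite3

# sqlite3 is only a stand-in for the Oracle Database in this sketch.
# Table and column names (calls, customer_id, duration_sec) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (customer_id INTEGER, duration_sec INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [(1, 60), (1, 120), (2, 30), (2, 45), (2, 15)],
)


def total_duration_per_customer(connection):
    """Aggregate inside the database; only one row per customer is fetched."""
    cursor = connection.execute(
        "SELECT customer_id, SUM(duration_sec) "
        "FROM calls GROUP BY customer_id ORDER BY customer_id"
    )
    return cursor.fetchall()


totals = total_duration_per_customer(conn)
print(totals)
```

With millions of detail rows the pattern stays the same: the client receives one row per customer instead of the full table.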

The service previously labeled Big Data Cloud Service (BDCS), which provided Hadoop and HDFS with Cloudera management, will be moved to OCI and relabeled Big Data Service (BDS).

The currently available Big Data Cloud (Compute Edition) (BDC-CE) is not considered a strategic service.

Jupyter notebooks may work with BDS in the future.

This is very good news for data scientists wishing to use Python instead of R within the Oracle Cloud.



Installing Oracle R on Database VM Image

Download the Oracle Pre-Built Database Developer VM from OTN.
Import the extracted VM into VirtualBox and boot up the virtual machine. After starting the VM, the following screen is presented within VirtualBox:
Check the Linux version of the downloaded virtual machine with the command below:

# uname -r

For this VM the output is 4.1.12-61.1.27.el7uek.x86_64; the el7uek tag indicates Oracle Linux 7 with the Unbreakable Enterprise Kernel.
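Reading the release string by eye works for a single VM; for scripted checks, the same reasoning can be sketched in Python. The helper names below are hypothetical, and the sketch assumes the release string carries an "elN" tag, as it does in the output shown here.

```python
import re


def enterprise_linux_major(kernel_release):
    """Return the Enterprise Linux major version encoded in a kernel release
    string (the output of `uname -r`), or None if no 'elN' tag is present."""
    match = re.search(r"\.el(\d+)", kernel_release)
    return int(match.group(1)) if match else None


def is_uek(kernel_release):
    """True if the release string carries the 'uek' tag after 'elN'."""
    return re.search(r"\.el\d+uek", kernel_release) is not None


release = "4.1.12-61.1.27.el7uek.x86_64"  # value reported for this VM
print(enterprise_linux_major(release), is_uek(release))
```

On the VM itself, the string could be taken from `platform.release()` in the Python standard library instead of being hard-coded.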

Install R (distribution) for Oracle R Enterprise

Afterwards, switch user to root (password is “oracle”) and navigate to the yum repository configuration:
Using vi (press i to enter insert mode and Esc to leave it; save with :wq), update the following sections within the repo file:



Install R with the following command, which automatically downloads the packages and performs the installation:

Installing Oracle R Enterprise Server

Add the Oracle Home lib directory to the library path environment variable:
This can also be added to ~/.bashrc.
As the next step, download the Oracle R Enterprise software from OTN,
store it in an installation directory, e.g. /u01/ORE_Inst_Dir,
and unzip the files:
Start the installation by running the installer:

Installing Oracle R Client

Unzip the client zip files:
Switch to the root user and install all unzipped files:
Start Oracle R Client with the ORE command and load the Oracle R libraries with library(ORE). When loading the R libraries, the following error is thrown:

Loading required package: OREembed
Error in dyn.load(file, DLLpath = DLLpath, …) :
   unable to load shared object ‘/u01/app/oracle/product/’: cannot open shared object file: No such file or directory
Error: package ‘OREembed’ could not be loaded


Since the file is not available, it needs to be installed:
[Thanks to the following article which features the same issue]

After installing the missing file, the ORE libraries are loaded successfully within R:

Further references:

Official Oracle Documentation:

Network Analysis with Oracle Communication Data Model and Cytoscape

One advanced feature nowadays is to display a network in a more visual form. This is often applied to social networks, where people communicate and may influence each other, but it works just as well for a more traditional scenario such as a telephone network.

Oracle provides a standard data model for the telecommunications industry, the Oracle Communication Data Model (OCDM), to hold e.g. all the call record (CR) data. One way to use this CR data is to visualize a network based on the call records, identify people who might churn, and check which effect this might have on other customers. The OCDM product is further described here. Additionally, a tutorial is provided here.

The churn calculation is performed by another algorithm of the standard data model, but this information can now be included visually in the network analysis.

Thus, we want to display the following information:

  • Whether a customer is likely to churn
  • Call volume between network participants
  • Revenue generated by each network participant

We supply the above information using two kinds of data sets:

Descriptive attributes of the network participants (nodes), including revenue information (from the billing system) and churn information (calculated by another data mining algorithm):

And the information about the call volume between the network participants (nodes):

To do this, we can export the highly structured data from the Oracle Communication Data Model and load it into open source software such as Cytoscape (version 2.6.2 has been used in the following examples).
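As a rough illustration of the two data sets, the following Python sketch writes a node attribute table and an edge table as delimited text, a format that Cytoscape's table import can generally read. The sample rows and all column names except TOTAL_CALL_CNT are hypothetical; in practice the rows would come from queries against the data model rather than from literals.

```python
import csv
import io

# Hypothetical sample rows. Only TOTAL_CALL_CNT is a column name taken from
# the text; CUSTOMER_ID, REVENUE_BAND, CHURN_IND, SOURCE and TARGET are
# placeholders for whatever the export actually delivers.
nodes = [
    {"CUSTOMER_ID": "A", "REVENUE_BAND": 3, "CHURN_IND": 1},
    {"CUSTOMER_ID": "B", "REVENUE_BAND": 1, "CHURN_IND": 0},
    {"CUSTOMER_ID": "C", "REVENUE_BAND": 2, "CHURN_IND": 0},
]
edges = [
    {"SOURCE": "A", "TARGET": "B", "TOTAL_CALL_CNT": 42},
    {"SOURCE": "A", "TARGET": "C", "TOTAL_CALL_CNT": 7},
]


def to_csv(rows):
    """Serialize a list of dicts as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


node_csv = to_csv(nodes)  # one line per network participant
edge_csv = to_csv(edges)  # one line per pair of participants
print(node_csv)
print(edge_csv)
```

Keeping nodes and edges in separate files mirrors the split described above: the edge file defines the network, and the node file adds the attributes used later for coloring and sizing.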



We can immediately see that the network is visualized and gives an impression of the information. If we had only the call information in a tabular format like the one above, it would be very difficult for humans to understand and interpret.

We have many configuration options available for this diagram which we can use to map and visualize our information. For example, we can use the call volume, expressed as TOTAL_CALL_CNT (aggregated from the detail-level records beforehand), as the line width, the churn indicator as the node color, and the revenue (as a revenue band, also aggregated beforehand) as the node size.
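The mapping itself is configured in Cytoscape, not in code, but its logic can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names, the width and size ranges, and the choice of red for nodes with a churn risk.

```python
def line_width(total_call_cnt, max_call_cnt, min_width=1.0, max_width=10.0):
    """Linearly scale a call count into a line width (continuous mapping)."""
    if max_call_cnt <= 0:
        return min_width
    # Clamp to [0, 1] so out-of-range counts cannot produce odd widths.
    fraction = min(max(total_call_cnt / max_call_cnt, 0.0), 1.0)
    return min_width + fraction * (max_width - min_width)


def node_color(churn_ind):
    """Discrete mapping: red for a churn risk, green otherwise (assumed)."""
    return "red" if churn_ind else "green"


def node_size(revenue_band, base=20, step=10):
    """Higher revenue bands produce larger nodes."""
    return base + step * revenue_band


print(line_width(50, 100), node_color(1), node_size(3))
```

Call volume is a continuous value, so it is scaled, while the churn indicator and the revenue band are discrete and map to a fixed color or size per value.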

When zooming in, one can see that the line width denotes the call volume between the individual nodes and that the node color (red/green) indicates the nodes with a churn risk:

We can now use the VizMapper to further enhance the readability of the network.

One can also select nodes for further analysis and see exactly which kinds of nodes belong to a certain part of the network. In the example below, the top right corner has been highlighted, and for a clearer visualization the style has been changed to Universe in the VizMapper property section.

With open source software like Cytoscape and the Oracle Communication Data Model, only relatively moderate data preparation and visualization effort is required to perform the network analysis.

New Big Data Virtual Machine for Download

Oracle has released a new virtual machine focusing on Big Data and Advanced Analytics. The full article can be found on the Oracle blog. This machine contains not only the pre-installed software and sample data sets to work with, but also a set of labs and hands-on training material. To get to the image: click here.

Additional information on the Big Data Appliance is provided here.

There is also a dedicated Learning Library for Big Data (OLL), and additional learning videos have been published on YouTube. A broader discussion of Oracle Big Data can be found in this roundtable discussion.