80/20 rule, more like 40/80 rule..

The pareto principle suggest that with 20% of effort already 80% of the result can be achieved. Translated to business management, it is often reported that 80% of sales come from 20% of clients.

The following case study shows a example flight from Washington to London Heathrow operated by BA:

image

Even though the prices shown are undiscounted trip prices and before and yield optimization to fill the plane, the give a rough indication.

In this case, 45% of the passengers (of the none economy classes) will contribute 84% of the revenue.

image

This shows that the 80/20 rule can sometimes be more like a 40/80 rule.

This can also be visualized with a stacked bar graph:

image

Within Oracle OAC, a dedicated pareto graph is also available to show the relation between effort and results and the importance of the top contributors. This is definitely one of the patterns that each business should know about and beware of:

image

Advertisements

Error with DVD and DVML when installing in directory with spaces

It is recommended to install Oracle DVD into a directory without any spaces. While the installation of DVD will work for a directory with spaces like “C:\Program Files\Oracle\Data Visualization Desktop”, the subsequent installation of Oracle DVML will not work for the same. When installing DVML after installing DVD to a directory without spaces, the following error will be thrown:

"C:\Users\****\AppData\Local\Temp\BiPython_Framework.exe" is not recognized as command file.
Could Not Find C:\Users\****\AppData\Local\Temp\BiPython_Framework.exe
Error: Unable to access jarfile C:\PROGRA~1\Oracle\Data

This behavior is also describes in below Oracle Support Note:

DVML install fails for users with a space in Username Profile (Doc ID 2509892.1)

If the installation into the Program Files directories (like “C:\Program Files” or “C:\Program Files (x86)”) of Windows is intended, the following workaround can be applied to install DVD and DVML into a directory with spaces:

within script:

…\Oracle Data Visualization Desktop\install_dvml.cmd

add double quotes around below pathes (add double quotes shown in red below):

Line 20: … -jar %BI_PRODUCT_HOME%\modules\oracle.bi.dvdesktop.installer\rinstall.jar -download=true       …
Line 27: … -jar %BI_PRODUCT_HOME%\modules\oracle.bi.dvdesktop.installer\rinstall.jar -updateDVMLJSON=true

Afterwards the installation of DVML should run without errors.

ADWC Performance Benchmark on SSB Schema

The SSB Schema is a generated database schema, that can be used to evaluate the Performance of a Datawarehouse system. This is used to evaluate the performance between ADWC in the cloud and a local Oracle Database which is running within a Virtual Box image on a Laptop. This test is not conducted under test lab conditions and should only provide a rough order of magnitude.

System (Hardware) Baseline

The following instances have been used for the Performance test.

ADWC Trial

based on Hardware specification shown:

image

This instances is only used as reference, since the sizing (with 2 CPUs) and query performance for some queries based on the same SSB schema are publically available.

ADWC

instance which only uses 1 CPU

VM Image

with 4GB Ram

image

and 1 CPU

image

Based on Laptop Host system (Lenovo Yoga 370 Laptop) with:

image

The image runs Oracle Database 12c R2 Enterprise Edition, with no specific/tuned configuration (parameters) for Datawarehousing.

System Schema Baseline

image

Benchmark Performance

For Prologue Queries based on these examples. As mentioned before, for the ADW instance “ADWC Trail”, some queries run times are publically available. These queries have also been executed within the other instances. The runtimes of the ADWC Trail have only been used as reference:

image

SSB Queries, as outlined here and here (Oracle Documentation).  The below shows the query run-times for the ADWC and local Oracle Database (running as a Virtual Box image). The connection to the ADWC instance is made with the high connection type. When using the medium or low connection type, all queries were showing longer response times (as expected). Thus, the high connection type has been used for conducting the Benchmark.

image

The Excel file containing the system baseline in terms of number of records, queries and run-time results can be downloaded here.

Below is a galary of the executed SSB Queries on ADW and Virtual Box Oracle Database 12c:

Use of Date data type in Oracle DV

DV allows to make use of individual elements of any Date or Datetime column. Thus, there is no need to calculate the Year portion of a Date as an additional column. For this to work correclty the data type must also be of Date or Datetime on the Database. When importing a text (csv) file to a Oracle Database, a Date may be defined as a text (varchar) field by default. In order to make use of this DV Feature, the data type should be changed to Date data type when importing e.g. with SQL Developer. Otherwise, the field containing the Date may just be displayed with a blue “A” like any other attribute, but not with the purple “clock” to identify a true Date field.

image

DV: Displaying data as % (Percentage) Of other column in graphs…

To display data as percentage of, can be defined within the Visualization properties for Tables and pivot Tables. However, the percentage of does not exist for all Visualitzations such as Bar Graphs and thus requires the calculation of these percentage values. Other Visuals like Pie Chart or Donut do support to display the data as Percentage Of.

Taking the following baseline example with Car Accidents by Body Type and if drinking has been involved:

image The percentage Of can be displayed for each column using the Value Tap of the Visualization properties:

image

This will result in the following view for the same table (as above), showing the percentage of values:

image

The above setting will only show the percentage of with respect to the same column (distribution within the same column, as black arrow below) and not as percentage of Body Type (Green Arrow below)!

image

The property section, does have the possibility to change the relation for displaying the percentage to other columns, e.g. Body Type, but this is not calculating correctly within DV (no further inverstigation on this Issue):

image

As mentioned at the beginning, for some Graphs like Bar Graphs, there is no option to display the data as percentage of values, since this is not the best visualisation to represent distributions and thus the Bar Graphs will only show the absolute values, like:

image

In order to display the correct Percentage Of in a table or in a graph, it is need to calculated as Percentage Of before hand. The formula for calculating the percentage is:

percentage value = SUM (occurences)  / COUNT (overall)

image

In order to calculate the correct sum for the occurences, the formula for decoding the attribute field into 1 and 0 has to be done on the data preparation step (on the prepare tab). Otherwise, the SUM will not calculate correctly when using the “My Calculations”, since it will not apply the calculation on each row, but only on the aggregated result as clearly shown below with the bottom red box:

image

This is because the Case When Formula is not evaluated row-by-row, but only applied after aggregation. Hence, the decoding of values for calculating the percentage must be placed within the data preparation step:

image

After this preparation, the formula can be defined as shown below (Please note the multiplication with 1.00 to convert to a decimal number to represent the percentage):

image

This gives the intended result for correct Percentage Of values within a Bar Chart and Table View:

image