Friday, May 24, 2013

OLTP vs.OLAP

Transactional Data : In a very vague sense, whatever data that acts as an evidence for showing something has happened is transactional data. It is also called Operational data. For example : A Car manufacturing company keeps a record of all the cars manufactured, all cars sold, inventories ordered, transportation etc in a set of tables which is collectively called a Database. 
This kind of data is used by Store managers, dealers to manage inventories, control production, track status of any order etc on a day to day basis.  Note the word used here is 'day to day basis'. In such a scenario, the smallest things are tracked. For example, if a customer has ordered a car, he can request for its status from the dealer, the dealer can then track the status from the plant manager, the plant manager can look at the status whether it the car is in paint shop or getting its engine. 

Now lets ask a simple question to ourselves : Will the CEO of this Automobile company ever look at these databases or Tables ???
Answer is NO. He is not interested in individual customers or cars. His job would be to look at the bigger picture, see how many cars the company is selling over a period of time, say a year or two years. His job is to observe the trends of customer, e.g. How the demand for SUVs is declining while Small Cars is growing, can we launch a new car ??, How much budget is to be allocated for advertising etc. etc.
 
This need for analysing data in a broader sense has grown over the years. Managers usually will look at data that might be as old as ten years (or more) in some cases. Data Mining is a concept that was born out of this need.

OLTP : In the older ages, Transactional Data was processed (processed : Insert/Update/Select) across geographies using customised Java Programs which would then be put across the web so that they could be accessed Online. This realization is often termed as Online Transaction Processing or OLTP. In its simplest form, buying anything on eBay or pulling out cash from ATM is OLTP. 

OLTP Requirements : OLTP  essentially needs inserts and updates be as fast as possible, in transactional systems, data is more inserted, updated and deleted and less 'Selected'.  Imagine your ebay purchase or ATM machine taking hours to process. Nobody would do online transactions then. 
Database design is very crucial keeping the insert and update frequency in mind. E-R diagrams/ Data Modeling methods are used for arriving at an efficient design.
 
OLTP Today : Many off-the-shelf OLTP applications are used by clients instead of designing custom Java applications, e.g. SAP ERP systems, Oracle's JD Edwards etc. This has drastically reduced the development and implementation time.

OLAP :  (Online Analytical Processing) Analytical data is often collected from the transactional systems. But does that mean you pull data from transactional system and put it as is with all the joins and constraints untouched ?? No, Whenever data is analysed, we use 'Select' query more and do insert/updates less. And faster selection of data means faster analysis and faster decision making. This need gives rise to the concept of Data Warehousing. A Warehouse is a place where all the data that is necessary for decision making is kept at one single place, and is arranged in such a way that the retrieval time is as minimal as possible. The best method to organize such a warehouse is to de-normalize all the tables of a transactional database and keep it in one huge single table.
How a Data Warehouse is organized is a different subject matter. But the universal truth is that a Warehouse should have minimum retrieval time else the design is a failure. Analysis means generating reports, trends etc from a set of data which can be used for making decisions. Before the arrival of Datawarehousing Applications custom Java programs were designed and built by companies.

OLAP Requirements : Lots and Lots of Space, Warehouse design must be such that 'Selection' is faster and Update/Insert may or may not be faster.

OLAP Today : Hyperion Essbase is one of the most widely used Data Warehouse application. Oracle 11g also has Warehouse building abilities. Again, using these off-the-shelf applications greatly reduce design and implementation time 

OLTP vs. OLAP
- Faster Insert/Update vs. Slower Insert/Update 
- Slower Selection vs. Faster Selection
- No need to keep track of Slowly changing Dimensions vs. Slowly Changing Dimensions are very important
- Normalized vs. De-normalized (although they are normalized to some extent)

Can Oracle Data Integrator be used for both Transactional and Analytical needs  ?
Yes, ODI can be used for Integration tasks of transactional systems as well as building Warehouses from Transactional systems. 


No comments: