The Goal of SAP-BI
Ø Offer complete end-to-end data warehouse (DWH) solutions.
Types of Data
Basically, there are two types of data in a standard business practice.
1) Master Data
2) Transaction Data
Master Data
· Data that does not change frequently.
· Maintains uniqueness (no duplicates).
· Represents real-life entities such as customers, vendors, materials, and plants, each identified by a primary key.
Transaction Data
· Data that changes very frequently.
· Allows duplicates.
· Maintains foreign keys (references to the master data).
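The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the customer numbers, names, and the orphan_orders helper are hypothetical and do not come from any SAP system.

```python
# Hypothetical sample data, for illustration only.
# Master data: exactly one record per customer, keyed by a unique customer number.
customers = {
    "C001": {"name": "Alice Traders", "city": "Pune"},
    "C002": {"name": "Bharat Metals", "city": "Chennai"},
}

# Transaction data: many records may carry the same customer number,
# which acts as a foreign key into the master data above.
sales_orders = [
    {"order_no": 1, "customer_no": "C001", "amount": 250.0},
    {"order_no": 2, "customer_no": "C001", "amount": 100.0},
    {"order_no": 3, "customer_no": "C002", "amount": 400.0},
]


def orphan_orders(orders, master):
    """Return orders whose customer_no has no matching master record."""
    return [o for o in orders if o["customer_no"] not in master]


# Count how often each master record is referenced by transactions.
orders_per_customer = {}
for order in sales_orders:
    key = order["customer_no"]
    orders_per_customer[key] = orders_per_customer.get(key, 0) + 1

print(orders_per_customer)
print(orphan_orders(sales_orders, customers))
```

The dictionary key plays the role of the primary key (it cannot repeat), while the same customer number appears in several orders.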
OLTP (Online Transaction Processing)
· Current data in detailed view
· Read and write access
· Less volume of data
· Flat reporting
OLAP (Online Analytical Processing)
· Historical data in summarized view
· Read-only access
· Huge volume of data
· Multi-dimensional reporting
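As a rough sketch of the contrast, the detailed line items below stand in for an OLTP view, and the summarize function produces the kind of summarized, multi-dimensional totals an OLAP report would show. The data and the function name are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical detailed records, as an OLTP system would store them.
line_items = [
    {"year": 2022, "region": "North", "product": "Bolt", "revenue": 100.0},
    {"year": 2022, "region": "North", "product": "Nut", "revenue": 50.0},
    {"year": 2022, "region": "South", "product": "Bolt", "revenue": 70.0},
    {"year": 2023, "region": "North", "product": "Bolt", "revenue": 120.0},
]


def summarize(rows, dimensions, measure):
    """Total a measure for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# The same detailed data viewed from two different angles.
by_year_region = summarize(line_items, ["year", "region"], "revenue")
by_product = summarize(line_items, ["product"], "revenue")

print(by_year_region)
print(by_product)
```

Changing the list of dimensions changes the angle of analysis without touching the underlying records.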
Entity Relationship Model
o Entity
· Any object that can perform work by itself.
· All real-life objects such as customers, vendors, plants, materials, etc.
· Every entity maintains its own attributes, such as name, age, address, and phone.
An attribute is a property or behavior of an entity.
o Relationship
It is an association between two or more entities.
There are three types of relationships:
- One to one
- One to many
- Many to many
Schema
The representation of database tables and their relationships is called Schema.
ER model
Based on entities and the relationships between them, we design the database using the ER model.
Ex:
Customer – Entity
Customer no, Customer name, Customer address – Attributes / properties.
Note:
· OLTP applications are designed with the ER model.
· The ER model is normalized.
· It is two-dimensional.
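The Customer example above can be written as a small Python sketch. The Order entity and the one-to-many link between Customer and Order are assumptions added for illustration; only the Customer attributes come from the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    """Hypothetical second entity, used only to show a relationship."""
    order_no: int
    amount: float


@dataclass
class Customer:
    """Entity with its attributes, following the Customer example."""
    customer_no: str
    customer_name: str
    customer_address: str
    # One-to-many relationship: one customer can have many orders.
    orders: List[Order] = field(default_factory=list)


customer = Customer("C001", "Alice Traders", "12 Market Road")
customer.orders.append(Order(1, 250.0))
customer.orders.append(Order(2, 100.0))

total = sum(order.amount for order in customer.orders)
print(customer.customer_name, len(customer.orders), total)
```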
Multi-Dimensional Modeling:
In MDM, all real-life objects such as customers, vendors, materials, plants, etc. are mapped to dimensions.
A dimension is an angle from which the data is viewed or analyzed.
Star Schema
· A fact table at the center surrounded by several dimension tables looks like a star. Hence the schema is called Star Schema.
· The model based on the star schema is called a Cube.
· In a star schema model, the fact table holds millions to billions of records (duplicate records are allowed).
· Dimension tables, on the other hand, are usually small: a dimension table contains a few thousand to a few million records (no duplicate records).
· In a star schema model, the fact table contains transaction data and the dimension tables contain master data.
Fact Table: The collection of facts (also called measures or key figures) is called a Fact Table.
Generally, the fact table holds transaction data and is very large.
Dimension Table: The collection of characteristics is called a Dimension Table.
The dimension table holds master data and is small.
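A classic star schema can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows below are hypothetical and chosen only for this illustration; they are not SAP table names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, one row per master data record.
CREATE TABLE dim_customer (
    customer_no   TEXT PRIMARY KEY,
    customer_name TEXT,
    region        TEXT
);
CREATE TABLE dim_material (
    material_no   TEXT PRIMARY KEY,
    material_name TEXT
);
-- Fact table: large, holds the key figures plus keys to the dimensions.
CREATE TABLE fact_sales (
    customer_no TEXT REFERENCES dim_customer(customer_no),
    material_no TEXT REFERENCES dim_material(material_no),
    quantity    INTEGER,
    revenue     REAL
);
INSERT INTO dim_customer VALUES
    ('C001', 'Alice Traders', 'North'),
    ('C002', 'Bharat Metals', 'South');
INSERT INTO dim_material VALUES ('M10', 'Bolt'), ('M20', 'Nut');
-- The first two fact rows are identical: duplicates are allowed here.
INSERT INTO fact_sales VALUES
    ('C001', 'M10', 5, 100.0),
    ('C001', 'M10', 5, 100.0),
    ('C001', 'M20', 10, 50.0),
    ('C002', 'M10', 7, 140.0);
""")

# Analyze the key figure "revenue" from the angle of the customer's region.
revenue_by_region = conn.execute("""
    SELECT d.region, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_no = f.customer_no
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()

print(revenue_by_region)
```

Note that the fact table joins to the dimensions on alphanumeric keys and that the customer master data lives inside this one model.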
Limitations of Star Schema
Master data is not reused, so it is stored redundantly (master data is inside the cube).
Degraded performance (the tables maintain alphanumeric keys).
Limited analysis (we can analyze data from only 16 angles).
Extended Star Schema
Star Schema + SID technology.
In the Extended Star Schema, keeping attributes, texts, and hierarchies in one table would cause a denormalization problem, so for master data these are maintained in separate tables.
Data Design:
Master data is stored outside the cube, so it can be reused by other cubes.
Performance improves.
More analysis (16*248): each dimension can contain 248 SIDs.
SID (Surrogate ID)
Every characteristic has a SID table, but key figures do not:
Characteristic values are alphanumeric, so a SID is used to convert each alphanumeric value to a numeric one. Key figures are always numeric, so they do not need a SID.
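The idea of a SID table can be illustrated with a toy lookup in Python. The SidTable class and its numbering scheme (1, 2, 3, ...) are assumptions for this sketch; how SAP BI actually generates SIDs is not shown here.

```python
class SidTable:
    """Toy SID table: maps alphanumeric characteristic values to integers."""

    def __init__(self):
        self._sid_by_value = {}
        self._value_by_sid = {}

    def get_or_create(self, value):
        """Return the existing SID for a value, or assign the next free one."""
        sid = self._sid_by_value.get(value)
        if sid is None:
            sid = len(self._sid_by_value) + 1
            self._sid_by_value[value] = sid
            self._value_by_sid[sid] = value
        return sid

    def value_of(self, sid):
        """Translate a SID back to the original characteristic value."""
        return self._value_by_sid[sid]


customer_sids = SidTable()
# The same characteristic value always receives the same numeric SID.
sids = [customer_sids.get_or_create(v) for v in ["C001", "C002", "C001"]]

print(sids)
print(customer_sids.value_of(2))
```

Key figures such as revenue or quantity are already numbers, so they are stored as they are, without such a lookup.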
Advantages of Extended Star Schema:
· Faster loading of data/ faster access to reports
· Sharing of master data
· Easy loading of time-dependent objects
Differences between Classical Star Schema and Extended Star Schema:
· In the classical star schema, the dimension table and the master data table are the same. In the extended star schema, they are different: master data resides outside the InfoCube, while the dimension table is inside the InfoCube.
· In the classical star schema we can analyze data from only 16 angles (perspectives), whereas in the extended star schema we can analyze it from 16*248 angles. Performance is also better.
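The layering of the extended star schema can be sketched with sqlite3: the fact table points to a dimension table, the dimension table holds SIDs, and the SID table leads to master data kept outside the cube. All table and column names below are hypothetical and are not the names SAP BI generates. The last lines work out the 16*248 figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Master data, outside the cube: SID, attribute, and text tables kept separately.
CREATE TABLE sid_customer  (sid INTEGER PRIMARY KEY, customer_no TEXT UNIQUE);
CREATE TABLE attr_customer (customer_no TEXT PRIMARY KEY, region TEXT);
CREATE TABLE text_customer (customer_no TEXT PRIMARY KEY, customer_name TEXT);

-- Inside the cube: the dimension table stores only numeric SIDs.
CREATE TABLE dim_customer (
    dim_id       INTEGER PRIMARY KEY,
    customer_sid INTEGER REFERENCES sid_customer(sid)
);
CREATE TABLE fact_sales (
    customer_dim_id INTEGER REFERENCES dim_customer(dim_id),
    revenue         REAL
);

INSERT INTO sid_customer  VALUES (1, 'C001'), (2, 'C002');
INSERT INTO attr_customer VALUES ('C001', 'North'), ('C002', 'South');
INSERT INTO text_customer VALUES ('C001', 'Alice Traders'), ('C002', 'Bharat Metals');
INSERT INTO dim_customer  VALUES (10, 1), (20, 2);
INSERT INTO fact_sales    VALUES (10, 100.0), (10, 50.0), (20, 140.0);
""")

# Fact -> dimension -> SID -> master data (attributes and texts).
rows = conn.execute("""
    SELECT t.customer_name, a.region, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_customer  AS d ON d.dim_id = f.customer_dim_id
    JOIN sid_customer  AS s ON s.sid = d.customer_sid
    JOIN attr_customer AS a ON a.customer_no = s.customer_no
    JOIN text_customer AS t ON t.customer_no = s.customer_no
    GROUP BY t.customer_name, a.region
    ORDER BY t.customer_name
""").fetchall()
print(rows)

# The 16*248 figure: 16 dimensions, each holding up to 248 SIDs.
MAX_DIMENSIONS = 16
MAX_SIDS_PER_DIMENSION = 248
max_angles = MAX_DIMENSIONS * MAX_SIDS_PER_DIMENSION
print(max_angles)
```

Because the sid_customer, attr_customer, and text_customer tables are not part of the cube, a second cube could define its own dimension table that points to the same SIDs, which is how the master data gets shared.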