As a thumb rule, “Every Foreign key in a fact table should reference a dimension row”, let’s look at the following example to understand the use of default row in DW
Source File to populate the “Employee Dimension”
| Row id | Emp First Name | Emp Last Name | EMP No | EMP Dep |
| 1 | John | Beck | 2001 | A |
| 2 | Mike | Morry | 2002 | A |
| 3 | Kevin | Peter | 2003 | B |
| 4 | Steve | Morry | 2004 | B |
| 5 | Jim | Chen | 2005 | C |
As you know from the (DW Loading Techniques) the dimension table gets loaded first with Naturel and surrogate keys.
Here is how the table populated
| Employee Dimension | ||||
| Emp_Surrogate_Key | EMP No | Emp First Name | Emp Last Name | EMP Dep |
| 0 | 0 | NULL | NULL | NULL |
| 100001 | 2001 | John | Beck | A |
| 100002 | 2002 | Mike | Morry | A |
| 100003 | 2003 | Kevin | Peter | B |
| 100004 | 2004 | Steve | Morry | B |
| 100005 | 2005 | Jim | Chen | C |
Source File to populate the “Employee Fact”
| Row id | EMP No | EMP Sal |
| 1 | 2001 | 1000 |
| 2 | 2002 | 2000 |
| 3 | 2003 | 1000 |
| 4 | 2004 | 2000 |
| 5 | 2005 | 3000 |
| 6 | 2006 | 4000 |
The fact table surrogate keys gets populated by doing the lookup against the natural key in the look-up process, as you can see that the “Employee Fact” source has Emp no 2006 which is not available in the dimension table. So during the surrogate key look-up process the fact table loaded with “0” for that employee
Why does it Matter:
Well, as you know the fact and dimension joined by surrogate key,
Select emp_dim* from
Employee Dimension as emp_dim,
Employee Fact as emp_fact
Where
emp_dim. Emp_surrogate_key=emp_fact. Emp_surrogate_key
Here is the result without the default row
| Emp_Surrogate_Key | Emp First Name | Emp Last Name | EMP No | EMP Dep | EMP Sal |
| 100001 | John | Beck | 2001 | A | 1000 |
| 100002 | Mike | Morry | 2002 | A | 2000 |
| 100003 | Kevin | Peter | 2003 | B | 1000 |
| 100004 | Steve | Morry | 2004 | B | 2000 |
| 100005 | Jim | Chen | 2005 | C | 3000 |
As you can see that the “Emp no 2006” salary information’s is not get displayed in the result
Here is the result with the default row
| Emp_Surrogate_Key | Emp First Name | Emp Last Name | EMP No | EMP Dep | EMP Sal |
| 100001 | John | Beck | 2001 | A | 1000 |
| 100002 | Mike | Morry | 2002 | A | 2000 |
| 100003 | Kevin | Peter | 2003 | B | 1000 |
| 100004 | Steve | Morry | 2004 | B | 2000 |
| 100005 | Jim | Chen | 2005 | C | 3000 |
| 0 | NULL | NULL | 2006 | Null | 4000 |
0 comments: