Assume I have the two tables below, Department and Employee, which store data for different customers (tenants). There is a one-to-many (1:M) relationship between them, i.e. one department can have one or more employees.
Both tables are SCD Type 2, i.e. they store history with effective and termination dates. No constraints, indexes, etc. have been created on these tables at the database level.
Department table:
cust_id  dept_id  dept_name  efctv_dt    trmntn_dt   Dept_key
1001     D1       IT         12-01-2018  12-31-9999  1001D1
1001     D2       HR         01-01-2019  12-31-9999  1001D2
1002     D3       Admin      02-01-2019  02-28-2019  1002D3
1002     D3       HR+Admin   03-01-2019  12-31-9999  1002D3
1002     D4       Finance    02-01-2019  12-31-9999  1002D4
Employee table:
cust_id  emp_id  emp_name  dept_id  efctv_dt    trmntn_dt   Emp_key
1001     E1      XYZ       D1       01-01-2019  01-31-2019  1001D1
1001     E1      XYZ-A     D1       02-01-2019  12-31-9999  1001D1
1001     E2      ABC       D2       02-01-2019  12-31-9999  1001D2
1002     E3      AXBYCZ    D3       03-01-2019  03-31-2019  1002D3
1002     E3      AXBYCZ    D4       04-01-2019  12-31-9999  1002D4
1002     E4      DEFG      D4       04-01-2019  12-31-9999  1002D4
The columns cust_id and dept_id can be concatenated into a separate key column in both tables, and that key can be used to join the two tables:
Department Key=Concatenate(department[cust_id], department[dept_id])
Employee Key=Concatenate(employee[cust_id], employee[dept_id])
Example key output values= 1001D1, 1001D2, 1002D3, 1002D4
Now suppose we have the following reporting requirements, which filter on date ranges in the visualization (assuming there is a separate date dimension table with all dates and a hierarchy):
1) When no specific date range or filter is selected, show all currently active employee and department names (where trmntn_dt = 12-31-9999).
So, the expected output is:
Emp Name  Dept Name
XYZ-A     IT
ABC       HR
AXBYCZ    Finance
DEFG      Finance
2) When reporting for a specific month, for example Jan-2019, show all employee and department names active as of that month. So, the expected output is:
Emp Name  Dept Name
XYZ       IT
3) When reporting for a specific quarter, for example Q1-2019, show all employee and department names active as of that quarter. So, the expected output is:
Emp Name  Dept Name
XYZ-A     IT
ABC       HR
AXBYCZ    HR+Admin
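To make the three requirements concrete, here is a minimal Python sketch of the "as of" logic run against the sample rows. It is not Tabular or DAX code, only an illustration of the rule a solution would need to reproduce. The names `report_as_of` and `OPEN_END` are made up for this example, and reading "active as of that month/quarter" as "active on the last day of the period" is an assumption.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # sentinel used for "still active" rows

# (cust_id, dept_id, dept_name, efctv_dt, trmntn_dt)
departments = [
    (1001, "D1", "IT", date(2018, 12, 1), OPEN_END),
    (1001, "D2", "HR", date(2019, 1, 1), OPEN_END),
    (1002, "D3", "Admin", date(2019, 2, 1), date(2019, 2, 28)),
    (1002, "D3", "HR+Admin", date(2019, 3, 1), OPEN_END),
    (1002, "D4", "Finance", date(2019, 2, 1), OPEN_END),
]
# (cust_id, emp_id, emp_name, dept_id, efctv_dt, trmntn_dt)
employees = [
    (1001, "E1", "XYZ", "D1", date(2019, 1, 1), date(2019, 1, 31)),
    (1001, "E1", "XYZ-A", "D1", date(2019, 2, 1), OPEN_END),
    (1001, "E2", "ABC", "D2", date(2019, 2, 1), OPEN_END),
    (1002, "E3", "AXBYCZ", "D3", date(2019, 3, 1), date(2019, 3, 31)),
    (1002, "E3", "AXBYCZ", "D4", date(2019, 4, 1), OPEN_END),
    (1002, "E4", "DEFG", "D4", date(2019, 4, 1), OPEN_END),
]


def report_as_of(as_of=OPEN_END):
    """Return (emp_name, dept_name) pairs whose versions both cover as_of.

    With the default sentinel date, only open-ended (current) rows qualify.
    """
    rows = []
    for cust, _emp_id, emp_name, dept_id, e_from, e_to in employees:
        if not (e_from <= as_of <= e_to):
            continue  # this employee version is not active on as_of
        for d_cust, d_id, dept_name, d_from, d_to in departments:
            same_dept = (d_cust, d_id) == (cust, dept_id)
            if same_dept and d_from <= as_of <= d_to:
                rows.append((emp_name, dept_name))
    return rows


print(report_as_of(date(2019, 1, 31)))  # [('XYZ', 'IT')]
print(report_as_of(date(2019, 3, 31)))  # [('XYZ-A', 'IT'), ('ABC', 'HR'), ('AXBYCZ', 'HR+Admin')]
print(report_as_of())                   # no date selected: current rows only
```

With the sample rows, the no-date call also returns DEFG with Finance, because the E4 row is open-ended.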
However, a 1:M relationship between these two tables in the AS Tabular model would fail, because the key values in the Department table, which is on the one side of the relationship, are not unique (see the two rows for D3).
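A quick way to see the uniqueness problem is to count the Dept_key values from the sample Department rows. This is only an illustrative Python check outside the model; the variable names are made up for the example.

```python
from collections import Counter

# Dept_key values copied from the Department sample rows (cust_id + dept_id).
dept_keys = ["1001D1", "1001D2", "1002D3", "1002D3", "1002D4"]

# Any key that occurs more than once cannot sit on the "one" side of a 1:M relationship.
duplicates = [key for key, count in Counter(dept_keys).items() if count > 1]
print(duplicates)  # ['1002D3']
```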
Suppose you also include efctv_dt or trmntn_dt in the concatenated key in both tables, i.e.
Department Key=Concatenate(department[cust_id], department[dept_id]) & Concatenate(department[efctv_dt], "")
Employee Key=Concatenate(employee[cust_id], employee[dept_id]) & Concatenate(employee[efctv_dt], "")
Example key output values= 1001D112-01-2018, 1001D201-01-2019…
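The rows would now be unique, since we don’t expect the same row twice on the same day (barring ETL issues such as the process running twice on the same day). However, the join would still not return the right rows, because an employee version's efctv_dt generally differs from the efctv_dt of the department version it belongs to (for example, 1001D101-01-2019 on the Employee side vs. 1001D112-01-2018 on the Department side).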
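To check how keys that include efctv_dt line up across the two tables, the following Python sketch builds them from the sample rows and compares the two sides. It only illustrates the key values; the variable names are hypothetical and nothing here is Tabular or DAX code.

```python
# (cust_id, dept_id, efctv_dt) taken from the Department sample rows.
dept_rows = [
    ("1001", "D1", "12-01-2018"), ("1001", "D2", "01-01-2019"),
    ("1002", "D3", "02-01-2019"), ("1002", "D3", "03-01-2019"),
    ("1002", "D4", "02-01-2019"),
]
# (cust_id, dept_id, efctv_dt) taken from the Employee sample rows.
emp_rows = [
    ("1001", "D1", "01-01-2019"), ("1001", "D1", "02-01-2019"),
    ("1001", "D2", "02-01-2019"), ("1002", "D3", "03-01-2019"),
    ("1002", "D4", "04-01-2019"), ("1002", "D4", "04-01-2019"),
]

# Concatenate the three parts, mirroring the key formulas with efctv_dt appended.
dept_keys = {"".join(row) for row in dept_rows}
emp_keys = ["".join(row) for row in emp_rows]

matched = [key for key in emp_keys if key in dept_keys]
unmatched = [key for key in emp_keys if key not in dept_keys]

print(matched)         # ['1002D303-01-2019']
print(len(unmatched))  # 5
```

Only one employee row finds a department row, and only because both versions happen to start on 03-01-2019.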
The AS Tabular model doesn’t allow complex join conditions on a relationship (as a SAP BO Universe does), so we can’t add a condition like the one below when joining these two SCD Type 2 tables. Something like this in the WHERE clause of the SQL might help with some of the requirements:
dept.cust_id = emp.cust_id
AND dept.dept_id = emp.dept_id
AND ( calendar_date BETWEEN efctv_dt AND trmntn_dt
OR
trmntn_dt = '12-31-9999'
)
I think calculating measure values is still doable, since there are lots of DAX examples online for filtering on dates, but what about reports that use only the dimensional attributes?
Is this the right approach? How should these cases be handled, ideally without using surrogate keys to generate unique values in each SCD Type 2 table and referencing them as an FK from parent to child (1:M), i.e. from the Dept table to the Emp table?
Please suggest.
Thanks!