Informatica Powercenter – Update Strategy Transformation SCD1 SCD2

Something worth keeping in mind: in order to Update Strategy to work properly you need to make sure that your target table has a primary key.

Here’s what the official documentation says about Update Strategy Information:

“The Update Strategy transformation is an active transformation. When you design a data warehouse, you need to decide what type of information to store in targets. As part of the target table design, you need to determine whether to maintain all the historic data or just the most recent changes.”

Note: Update Strategy is pretty straight forward, but it comes with a lot around it.

Setting up Update Strategy Transformation

First understand this: We are using two different tables with the same structure to play around here. CustomerSCD_base and CustomerSCD.

CustomerSCD_Base do not have primary key and allow nulls everywhere, so we can do anything we want here.

CustomerSCD has CustomerID as primary key.

Our mapping in Powercenter is like this:

in our mapping we have CustumerSCD_Base as source and CustomerSCD as target, and, of course, Update Strategy as transformation.

By editing our transformation we can go to properties tab and check the “Update Strategy Expression”.

Here are all options we can have for that field:

To better understand how it works let’s see some use cases.

Simple Insert

Data-wise, here’s our current situation:

1 row for source and same for target

That way, if I run this transformation nothin will happen, as our CustomerSCD primary key will not allow the same key to be inserted in the table again.

We can even see here in the Monitor that a row got rejected:

Let add a new row to our CustomerSCD_Base and see what happens.

1 jnew row added to source

Now we have a new row in our base table. Let’s run our Powercenter workflow once more and see the result.

1 new row added to target

Now Powercenter was able to insert our new row from base to SCD table. Monitor shows that we had 1 rejected and 1 affected.

Simple Update

Now I changed our transformation from Insert to UPDATE, let’s see what happens.

update strategy transformation properties tab

Our result still shows the same data.

But Monitor shows 2 affected rows.

That’s because those 2 rows are eligible for update, however no field were changed.

Let’s try something out of ordinary, and change the CustomerID, the primary key on our target table, and see what happens.

See that I changed from ANATR to XPTO. Let’s run our workflow again and see what happens.

This time we only had 1 row affected, because Powercenter was not able to find in the target table another row with ‘XPTO’ as primary key.

A more practical example would be updating the user Phone. Let’s change ALFKI number to something else.

Let’s run workflow again and see…

phone number updated from source to target

You see, we now have updated our target table with our new Phone number. This kind of update in a dimension table is called Slowly Changing Dimension type 1, in short, SCD1.

Here what Kimball’s says about SCD1:

With slowly changing dimension type 1, the old attribute value in the dimension row is overwritten with the new value; type 1 attributes always reflects the most recent assignment, and therefore this technique destroys history. Although this approach is easy to implement and does not create additional dimension rows, you must be careful that aggregate fact tables and OLAP cubes affected by this change are recomputed.

Kimball group

Simple Delete

You now already understand what happens, so I will not be repetitive. For deletion I will just do this one time.

So, changing transformation:

Running workflow and checking results:

Only ALFKI were deleted, since it’s the only matching from from source to target.

Going a bit more complex

So far we have seen everything there is about Update Strategy Transformation itself, however Update Strategy is commonly used together with some other transformations, like router, filter, expression transformation, etc. We do that to create more complex scenarios.

For instance, in our topic “simple update”, we have updated one of our CustomerID from source table. Remember:

We noticed that after this change from ANATR to XPTO this row remained useless. But, normally, when you have an ETL process like these you usually add new rows and update existing ones. Therefore, on this scenario we should have an INSERT+UPDATE behavior, following this set of rules:

  • if CustomerID do not exists, then INSERT
  • if CustomerID already exists AND some attribute changed, then UPDATE
  • if CustomerID already exists AND nothing changed, then do NOTHING

By doing this we should have XPTO inserted in our target table.

Update or Insert scenario – brute force

Let’s try to implement this logic:

  • if CustomerID do not exists, then INSERT
  • if CustomerID already exists AND some attribute changed, then UPDATE
  • if CustomerID already exists AND nothing changed, then do NOTHING

Here’s our current data set:

Here we have two rows in source (CustomerSCD_base) and two rows in our target table (CustomerSCD).

If we run a simple UPDATE, nothing will happen. So now I am creating an UPDATE/INSERT mapping.

top transformation update, bottom transformation inserts (or vice-versa)

This works. What Powercenter does here is, he run all rows from source in each of my transformations. So all my 2 rows are checked regarding INSERT and all regarding UPDATE, this works, but not if your source and target has many rows. I do not recommend you do this in your real world environment. Here’s the result:

Now XPTO is added to CustomerSCD. If I run the same workflow again, nothing will be inserted, since there is no new CustomerID in our source table. Only update will happen.

So, how can we improve this scenario? How can we tell Powercenter to check if its a new row and add only if its new, and update only those that already exists? We need to, somehow, check if CustomerID from source table exists in target table…

Update Insert Scenario – Slowly Changing Dimension type 1 – SCD1 – Manually

One more time, let’s try to implement this logic:

  • if CustomerID do not exists, then INSERT
  • if CustomerID already exists AND some attribute changed, then UPDATE
  • if CustomerID already exists AND nothing changed, then do NOTHING

Here is our mapping goal.

informatica powercenter slowly changing dimension type 1 mapping
final complete mapping, scd1

How does that mapping works? How we build this?

Lookup transformation: we are using a lookup here so we can check if our source table customerID exists in our target table customerID.

creating lookup transformation

When we create our lookup we have to choose TARGET as the lookup table, this means that we want to check our customerID from source against customerID from target.

Once lookup is created drag your customerID from source to your lookup transformation, and create a condition comparing CustomerID from source against CustomerID from target.

editing lookup

Expression transformation: we create a expression transformation so we can have all fields to compare and create our rule to set flag for insert or update.

  • Drag all fields from source qualifier to your expression transformation
  • Also drag all fields from lookup transformation to your expression transformation
    • note1: no need to drag customerID1 from your lookup transformation, since this information is already coming from your source qualifier.
    • note2: only drag from lookup those fields you want to check for change in your expression transformation. In this example we are checking all fields from target, so had to drag them all.
  • Rename all fields you brought from your lookup transformation and add the LKP_ prefix (this is optional, it’s just to help you find whichever are the lookup fields)
  • Add two new fields, UPDATE_FLAG and INSERT_FLAG
editing expression transformation

Now, inside your flag fields we write expressions to validate our set of rules.

IIF( CustomerID = LKP_CustomerID1 AND
CompanyName != LKP_CompanyName1
OR ContactName != LKP_ContactName1
OR ContactTitle != LKP_ContactTitle1
OR Address != LKP_Address1
OR City != LKP_City1
OR Region != LKP_Region1
OR PostalCode != LKP_PostalCode1
OR Country != LKP_Country1
OR Phone != LKP_Phone1
OR Fax != LKP_Fax1 

The INSERT_FLAG code is checking if LKP_CUSTOMER is null or not. It will return TRUE if it is null. It will be null every time that a customerID from source do not exists on target.

The UPDATE_FLAG code is checking if customerID from source is equal customerID from target, this means that, if the customerID already exists it will return TRUE. This code also check for any changes on our other fields.

The ROUTER Transformation: we create the router transformation to segregate results, to create output ports for each result we want. We will have output group for INSERT and another one for UPDATE. Here’s how we handle the router transformation:

  • With exception of the LKP_ fields from your Expression Transformation drag all fields to ROUTER (see first image below)
  • Make sure you also drag INSERT_FLAG and UPDATE_FLAG
  • Edit your router transformation (double click it)
  • In GROUPS create two new groups, one for INSERT, another for UPDATE, and a DEFAULT one will be automatically created (see second image below)
  • For group INSERT set the Group Filter Condition to INSERT_FLAG (see second image below)
  • For group UPDATE set the Group Filter Condition to UPDATE_FLAG (see second image below)
express transformation to router
router groups

For router’s INSERT group, just drag it results to your target table. You do not need to drag INSERT_FLAG nor UPDATE_FLAG port to your target.

router’s insert output ports to target

For router’s UPDATE group, drag the it to a Update Strategy transformation, and make sure your Update Strategy transformation is set to DD_UPDATE. To finish, drag the output port from your Update Strategy to your target table.

router’s update output ports to target + update strategy properties

Save your mapping > create a workflow and run it. Just something to be aware of:

  1. Make sure your workflow session is set to DATA DRIVEN.

2. On your workflow session mapping tab, make sure your Lookup is set to your target database connection.

SCD1 – Running workflow and results

Now that our mapping and workflow is ready, lets run it, but fist, lets check out dataset. Here it is:

initial dataset pre-run. Target table empty

Here our dataset after the workflow of that mapping is executed for the first time:

4 rows inserted

Here is our Monitor results. We successfully inserted 4 new rows on our target table.

4 rows inserted

Here’s our result if we do nothing and just re-run the workflow. Nothing changed.

nothing changed
no rows affected

Let’s update a row from our source table.

changed source table customer ALFKI country name to Portugal

Now let’s run our workflow again. This time we expect 1 affected row on our monitor, and also that target table gets updated.

update worked as expected!
monitor results like expected

So far, so good. We are now able to keep our rows with most updated/recent data.

Slowly Changing Dimension type 1 – SCD1 – Using Powercenter Wizard

Pretty much everything we have manually done so far can be implemented via Powercenter Wizard. Yes! Powercenter has wizards to create SCD1, SCD2, SCD3. Let’s see how Powercenter Wizard do creating a SCD1, and then we will compare our manual mapping against wizard’s mapping.

Go to Mappings > Wizard > Slowly Changing Dimensions…

mapping wizard
name your SCD
choose your source and name a new target
setting fields

Setting the fields is important. Here we are saying that CustomerID is our logical key field, which means that this is the primary key, and all the other fields are going to be used to check for changes.

Once we hit finish our mapping is created for us. Let’s do a detailed review, step by step, on what is happening here.

the whole picture

Source Qualifier and Lookup transformation

Source qualifier to lookup transformation: here the wizard created a lookup from target to check if what we are inserting exists already in target, or not.

We can validate that by checking the table name lookup (img below) and the condition (img above).

Something curious is that our key used in comparison is not checked for output. Maybe we should have done that in our manual attempt.

the wizard unchecked output port for source key

Expression Transformation

After lookup we have Expression Transformation. Here the wizard bought all ports from lookup ( minus CustomerID ), and also all ports from source qualifier.

ports coming from lookup and source qualifier

Wizard also created two new fields; ChangedFlag and NewFlag.

changedFlag code
DECODE(Address,PM_PREV_Address,1,0) = 0
DECODE(City,PM_PREV_City,1,0) = 0
DECODE(CompanyName,PM_PREV_CompanyName,1,0) = 0
DECODE(ContactName,PM_PREV_ContactName,1,0) = 0
DECODE(ContactTitle,PM_PREV_ContactTitle,1,0) = 0
DECODE(Country,PM_PREV_Country,1,0) = 0
DECODE(Fax,PM_PREV_Fax,1,0) = 0
DECODE(Phone,PM_PREV_Phone,1,0) = 0
DECODE(PostalCode,PM_PREV_PostalCode,1,0) = 0
DECODE(Region,PM_PREV_Region,1,0) = 0
newFlag code

To sum up:

  • NewFlag returns TRUE if PM_PRIMARYKEY is null. PM_PRIMARYKEY is the key that comes from our target table. This PM_PRIMARYKEY key will be null if our lookup transformation condition returns null. In other words, this will be null if our CustomerID from source do not exists on target.
  • ChangedFlag returns TRUE if PM_PRIMARYKEY is not null (meaning that this key already exists in our target database), and DECODE() all fields we marked to check. What DECODE() do is check if source values is equal to target value, if it is equal return 1, otherwise returns 0. By using this DECODE() the wizard is checking if any field changed.

Filter Transformation | UPDATE Changed record

The filters transformations. After Expression transformation the mapping moves on to filters transformations.
We have two filters: FIL_InsertNewRecord and FIL_UpdateChangedRecord.

Let’s first talk about FIL_UpdateChangedRecord.


To compose the UpdateChangeRecord filter transformation we have ports coming from source qualifier, lookup and expression.

  • ChangedFlag is the only port that comes from expression transformation.
  • PM_PRIMARYKEY is the only port that comes from lookup transformation.
  • All other ports comes from source qualifier.
    • note that CustomerID from source is not brought to this filter transformation.
FIL_UpdateChangedRecord – filter transformation edit

The only properties set on FIL_UpdateChangedRecord transformation is the Filter Condition as ChangedFlag. This mean that this filter will pass through all rows when ChangedFlag returns TRUE (remember: changedFlag will return TRUE every time we have a record from source that have an ID that already exists in target and some of those other fields, like address, city, changed).

Update Strategy for Update Records

After the filter transformation we move to the Update Strategy transformation. No secrets here; we just ensure that we are using the DD_UPDATE function. The filter is already making sure that only the rows that matter is being passed through, so, here in the Update Stratetgy transformation we just make sure that those rows will be updated in our target. Notice that ChangedFlag is not being passed along. Besides that all ports, with exception for CustomerID, will be updated in our target table.

Note that CustomerID is not changed. It means that our update will update a row based on PM_PRIMARYKEY and keep CustomerID intact.

update strategy using DD_UPDATE function

Filter Transformation | INSERT New record

The other branch in our pipeline, after leaving expression transformation, is the FIL_InsertNewRecord filter transformation.


To compose the InsertNewRecord filter transformation we have ports coming from source qualifier, and expression only. For this filter no port comes from lookup transformation (it’s an insert).

  • NewFlag is the only port that comes from expression transformation.
  • All other ports comes from source qualifier.
    • note that CustomerID from source is brought to this filter transformation.

To be super clear: Unlike the Update version, this filter is using CustomerID from source, and this is because it’s a insert and it would not be possible to add a CustomerID from target, since it yet do not exists on target, that’s why we are using CustomerID from source.

For this Insert filter the only properties is the NewFlag. This means that every time a row returns TRUE for NewFlag it will be passed along by this filter, and any row that returns FALSE for NewFlag will be dropped.

Remember: our logic behind NewFlag says that a row is TRUE everytime this source ID do not exists in our target table.

Update Strategy for Insert Records

update strategy for insert records

Here on this Update Strategy transformation we just make sure that we are using the DD_INSERT function. All rows used comes from the filter transformation, with exception of NewFlag port, which is only there for filter purposes, so no reason to pass along to our Update Strategy transformation.

Sequence Generator transformation

This is a new one. We have not created this transformation in our first SCD1 manual example. The Sequence Generator transformation is being used here to create the PM_PRIMARYKEY. As we can see here the sequence starts with 0, increase 1 by 1 and the current value is 1. Meaning that, when inserting, the first PM_PRIMARYKEY registered will receive 1, the second 2, and so on.

sequence generator

Creating our target

So far the Wizard created our mapping, but our target isn’t created inside our database. So I will create it inside our MS SQL database.

Select your transformation. Go to target > Generate/execute SQL …

target > generate execute SQL
create your table with Primary Key


Remember that your workflow session has to be Data Driven and that your lookup should be pointing to your target database.

Workflow results Monitor

This is our starter dataset. We have 5 rows on source and 0 on target.


Result after running workflow for the first time. All rows inserted from source to target.

first run result

Result after doing nothing and just re-run the workflow. Nothing happened.

same thing

Result after updating some field on source and the running the workflow again. The update were transmitted from source to target.

updated, but before workflow execution
result after workflow execution

SCD2 – SCD type 2 – Slowly Changing Dimension type 2

Everything we did so far is great, but, we are loosing historical data. I mean, our customer moved from Germany to Portugal, but we simply lost track of where he lived before. We are updating data, but loosing history. This leads us to a new logic paradigm:

  • if CustomerID do not exists, then INSERT
    • create two datetime columns to track the row lifespance, DT_START, DT_END
    • DT_START receives the current system date when the row is inserted into our database
    • DT_END receives a maximum value like 1/1/9999
    • create a ACTIVE_FLAG column that receives 1, indicating its the current value
  • if CustomerID already exists AND some attribute changed, then
    • INSERT new row with new data
      • this new row DT_START receives the current system date
      • this new row ACTIVE_FLAG column that receives 1
    • the OLD row will:
      • update ACTIVE_FLAG to 0 (zero), indicating it’s inactive
      • update DT_END with current system date
  • if CustomerID already exists AND nothing changed, then do NOTHING

( Also, something else… So far we have used a 11 columns table. In real world examples you can have much larger tables, so understand that, sometimes you will only want to track SCD2 for specific columns. We can end up with something like, country being tracked as SCD2 and all other columns as SCD1; this means that a new row will only be created if country changes, otherwise it’s just a matter of updating data on other columns. )

What Kimball says:

Slowly changing dimension type 2 changes add a new row in the dimension with the updated attribute values. This requires generalizing the primary key of the dimension beyond the natural or durable key because there will potentially be multiple rows describing each member. When a new row is created for a dimension member, a new primary surrogate key is assigned and used as a foreign key in all fact tables from the moment of the update until a subsequent change creates a new dimension key and updated dimension row.

A minimum of three additional columns should be added to the dimension row with type 2 changes: 1) row effective date or date/time stamp; 2) row expiration date or date/time stamp; and 3) current row indicator.


For this example we will be using a more simplified version of our customer table. This time we will have CustomerSCD2_base for source and CustomerSCD2 for target. They are empty for now.

Before moving forward with the mapping, lets understand better the behavior we expect. Let’s use Microsoft Excel to help us with envisioning this.

Let’s say we have our dataset like this. One customer there. Source is top, Target is bottom.

starting point

Then we changed FRANS from ITALY to BRAZIL. Here’s what we expect to happen, according to our logic paradigm. Source is top, Target is bottom:

a new update/insert happened

So, once we had an update on FRANS country we added this as a new row in our target table. We also added ACTIVE_FLAG column to track which row is active, by active we mean current, and what is old.

To make it more real-world like we can incorporate couple of more new columns. This is how SCD2 looks more like it in real data warehouses:

the same update/insert but with more control columns

So, besides what we already have on SCD1 we will also need:

  • a way to generate SK_Customer
    • SK_Customer is a surrogate key, which we will be using as our true primary key for this table, since customerID will be duplicating on this table
  • a way to generate and insert/update our START_DT and END_DT
  • a way to generate our ACTIVE_FLAG

You can already tell that we will need to re-work and re-create our target table… But we will get to that soon.

Implementing SCD type 2 – SCD2 – Manually

informatica powercenter slowly changing dimension type 2 mapping
final complete mapping, scd2

To implement SCD type 2 we will create a mapping and do it manually. Let’s do it step-by-step and think at each stage what to do and why we are doing it.

Source and Target

First we need to understand our source, and we already have our source ready here it is:


Then we need to have our target ready. For that I created a target in Powercenter and used Powercenter to create my target inside my database.

creating the target within Powercenter
the table that Powercenter created within our db
source and target queried results

Source Qualifier and Target

Create a new mapping and drag your source to it, also drag your target.

source and target


Create a lookup from your target.

creating the lookup

Drag your comparison key between source and target and create that condition within your lookup transformation. Here we are comparing if CustomerID from target equals CustomerID1 from source.

lookup condition

Expression Transformation

Create a expression transformation and drag there all fields from source, and also some fields from lookup.

From source:

  • we will always bring all fields as they will be needed if we need to insert a new row.

From lookup:

  • we will always bring all fields we want to update, because we need those fields to compare, to check for changes.
all expression transformation fields

Besides fields from source and lookup we will also need:

  • INSERT_FLAG, this field will return true every time a row matches our logic to be inserted
  • UPDATE_FLAG, this field will return true every time a row matches our logic to be updated
  • GET_DATE, this field will get the current date. This is necessary to fulfill START_DT for new records and to update END_DT for updating records
  • END_DT, this field is static and will always return 01/01/9999 as this will be used everytime we need to insert a new record
  • ACTIVE_FLAG_1, this field always returns 1. We will use when inserting new rows to identify those rows as current
  • ACTIVE_FLAG_0, this field always returns 0. We will use when updating rows that are not longer current.




IIF( CustomerID = LKP_CustomerID1
LKP_ContactName1 != ContactName
LKP_Country1 != Country
LKP_Phone1 != Phone

GET_DATE code is simply SYSDATE.

END_DT code is simply TO_DATE( ’01/01/9999′, ‘MM/DD/YYYY’ ).

Router Transformation

router transformation

Let’s understand our inputs from Expression to Router:

CustoimerID, ContactName, Country and Phone comes from Source Qualifier. We pass those to our router because we will need them when we INSERT new rows.

We bring LPK_PM_PRIMARY because we need this info, as this is the SK (surrogate key) that we need when updating rows. We need to know which row will become the old row, that’s why LPK_PM_PRIMARY is important, it will be able to tell us which SK to set END_DT and ACTIVE_FLAG_0.

  • INSERT_FLAG, UPDATE_FLAG will be used as filters on router, just like we did on SCD1.
  • GET_DATE will return current date (SYSDATE), we will use when inserting new rows. This GET_DATE port will be also used to fill END_DT for updating rows.
  • END_DT is a fixed string of a 1/1/9999 date, that we will also use for insert rows.
  • ACTIVE_FLAG_1 is a static 1 that we will pass when inserting new rows.
  • ACTIVE_FLAG_0 is a static 0 that we will pass when updating rows.

Router’s groups, just like in SCD1:

This router’s groups will act as filters. Every time a row matches INSERT_FLAG it will be redirected to router’s INSERT ports, and same logic for UPDATE.

Update Strategy | Insert

If a row matches the INSERT_FLAG it will be redirected to INSERT group in our router. Those ports will be then redirected to our UPDATE STRATEGY transformation for insert.

We bring all fields that we need:

  • customerID, contactName, country, phone: those fields will bring data from source
  • get_date: will return current datetime (sysdate)
  • end_dt: will return 01/01/9999
  • active_flag_1: will return 1

So, by following this logic a new row will be registered in our target. However, since we are not using CustomerID as PrimaryKey, we need a Sequence Generator for our Surrogate Key.

The only change I did was to change Start Value from our Sequence Generator to 1 (default is 0).

sequence generator

Other important aspects of sequence generator is to keep un-checked cycle and reset, since it is generating primary key numbers.

Also, make sure your Update Strategy is set to DD_INSERT.

update strategy

Update Strategy | Update Insert

The same logic applies to Update Insert scenario. Difference here is that we get results from our Router’s UPDATE ports, as we will be inserting new rows for those rows that are changing. This is where the “keep history” magic happens.

Update Strategy | Update Update

In this final step we are just updating the old row that is now becoming inactive. That’s why we are using LKP_PM_PRIMARYKEY here. This means we have a change and we know which key to change.

Be aware that, here, GET_DATE goes to END_DT, as END_DT is now over, thus receiving current date.

ACTIVE_FLAG here receives 0 (zero).

Lookup, again, I know

Before going to testing, edit your lookup transformation again, and make sure that you do an SQL Override, with a code like this:

lookup edit > properties > sql override
,CustomersSCD2.CustomerID AS CustomerID
,CustomersSCD2.ContactName AS ContactName
,CustomersSCD2.Country AS Country
,CustomersSCD2.Phone AS Phone
FROM CustomersSCD2

This will make sure that only the rows that are active are being the scope of your mapping. Otherwise your mapping won’t work.

So, this SQL override, is to ensure that you only get the current rows, those with ACTIVE_FLAG equals 1, as it would make no much sense to treat old registers.

Running Workflow and testing SCD type 2 results

Let’s run our SCD 2 workflow and see if it works. Remember that you workflow session needs to be data driven and that all your mapping should be pointing to the right databases.

First scenario, source 4 rows, target 0 rows.

Running the workflow for the first time. Notice that 4 rows were inserted.

(primary key starting at 6 because I’ve truncated the table before, hehe)

If we run the workflow again, nothing happens. See:

no changes

If we change a customer country and run our workflow, a history needs to be created.

source updated, pre workflow execution

Our workflow now inserted and updated 1 row.

Here’s our history created. Now we are tracking every change in our customer attributes.

Here’s another update on the same customer, changing country one more time, this time to Sweden.


Hope you guys enjoyed this post, took couple of days to put it together. Now we can move on knowing that SCD1 and SCD2 are clear to us.

Reference – How to Implement SCD TYPE 2 using Informatica PowerCenter – Informatica Tutorial | Update Strategy transformation | Slowly Changing Dimension (SCD) Type 1 | –  – Slowly Changing Dimension Informatica interview questions scenario based Informatica tutorial

Published by Pedro Carvalho

Apaixonado por análise de dados e Power BI

One thought on “Informatica Powercenter – Update Strategy Transformation SCD1 SCD2

Deixe uma resposta

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: