Informatica Powercenter – Union Transformation

If you are familiar to SQL you already know what UNION does. In fact, you might be wondering, is Powercenter Union the same as SQL union or SQL union all? It is UNION ALL.

Here’s the official documentation on Powercenter Union transformation:

The Union transformation is a multiple input group transformation that you use to merge data from multiple pipelines or pipeline branches into one pipeline branch. It merges data from multiple sources similar to the UNION ALL SQL statement to combine the results from two or more SQL statements. Similar to the UNION ALL statement, the Union transformation does not remove duplicate rows. The Union transformation is an active transformation.

The Data Integration Service processes all input groups in parallel. It concurrently reads sources connected to the Union transformation and pushes blocks of data into the input groups of the transformation. The Union transformation processes the blocks of data based on the order it receives the blocks from the Integration Service.

The Union transformation is developed using the Custom transformation.

Rules and Guidelines for Union Transformation

Here’s what the official documentation tell us to use as rules and guidelines when working with a Union transformation:

  • You can create multiple input groups, but only one output group.
  • All input groups and the output group must have matching ports. The precision, datatype, and scale must be identical across all groups.
  • The Union transformation does not remove duplicate rows. To remove duplicate rows, you must add another transformation such as a Router or Filter transformation.
  • The Union transformation does not generate transactions.

My take on all those points is that we need to pay special atention to the second point. If you want to union sources you must ensure precision, scale, datatype across all ports (by ports you should already know that we are talking attributes), but the attribute name should matter too, right?! We will see that in our tests.

Creating a Union Transformation

Let’s first understand our data. We have sales from 3 different stores. Here their data:

Three different files, same structure:

Our objective here is to union all data and input it in our SQL database. We are going to use UNION transformation for this.

Create a mapping, add your sources, add your union transformation, add your target.

This is what we got, but as you can see Union Transformation has a NEWGROUP thing. You can’t just simply drag and drop all sources into your transformation, it won’t work.

First drag one of your groups.

By doing that you will create a group output, a pattern.

Now edit your transformation and create two new groups.

After doing that we will three groups of input and the same output in our UNION transformation. Now just drag and drop each source to its group, and after that select your output ports and connect to your target.

After creating a workflow and execute it, here’s the result: 21 rows, all inputted into the database.

Union removing duplicates

Everything we have done so far replicated data according to UNION ALL statement on SQL, but what if I need to remove duplicates before loading union data? You will need to add a SORT transformation after your UNION and before your target.

Make sure you select all ports you with to sort by. Remember that the PORT order is taken in consideration by Powercenter when sorting your data.

Also, select DISTINCT in properties tab.

Extra: Handling decimals

Did you notice that we lost decimal data on SELL_PRICE? That’s because when I created our sources I forgot to set precision to 2 decimal places. So, what I did, I went to Source Qualifiers and edited all my sources, as well edited all transformations and target, and recreated the target table as well.

edited all source qualifier from all sources
also edited the union transformation
edited the sort transformation as well
on target designer I edited our target and used it to recreate our target table on SQL database
re-ran the workflow, now we have decimals

References – Aula 04 – Criando transformações Union e Sorter no PowerCenter (Português BR)

Published by Pedro Carvalho

Apaixonado por análise de dados e Power BI

One thought on “Informatica Powercenter – Union Transformation

Deixe uma resposta

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: