Saturday, March 29, 2008

INFORMATICA ARCHITECTURE

Informatica provides following components
PowerCenter Repository
PowerCenter Repository Server
PowerCenter Client
PowerCenter Server

PowerCenter Repository : It maintains the metadata. Set of tables are created within repository database to maintain the Informatica metadata. PowerCenter client and server access the repository to access metadata.

PowerCenter Repository Server: It manages connections to the repository from client applications.

PowerCenter Client: This is used to maintain the powercenter objects such as Source, Targets, mappings etc. It has following client tools
Designer : To create mappings that contain transformation instructions for the PowerCenter Server.
Workflow manager: To create, schedule, and run workflows. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data.
Repository Manager: To administer the metadata repository. You can create repository users and groups, assign privileges and permissions, and manage folders and locks.
Repository Server Admin Console: To administer the Repository Servers and repositories Workflow Monitor: To monitor scheduled and running workflows for each PowerCenter Server.

PowerCenter Server: With help of repository and repository server, powercenter server executes the business logic for extraction, transformation and load.

What is Repository ?
Powercenter repository is maintained in RDBMS . It contains the instruction required to extract, transform and load data. PowerCenter access repository using repository server. Metadata in the repository is added using powercenter client tools. There can be two types of repository
Global Repository: Object created in Global repository can be accessed across different repository.
Local Repository: Object created in local repository can only be accessed within the repository. Global repository are registered to local repository so that global repository objects are accessed in local repository using shortcuts.

Error Handling Approach

Error Handling Approach
Error Handling is one of the most important component in any Data warehouse or data integration project. If process is not able to handle and manage ERROR effeciently, then it is very rare that the project will succeed.Following steps and questions should be answered


1. Who is the business owner of the process/project
2. Who will be responsible for Error correction
3. What will the interface to present the ERROR to business community
4. Error correction will be at source or IT will be responsibile to re-run error correction
5. Error retention period in ERROR table

Above are few points to consider before deciding on the ERROR management process. Any ETL will provide feature to implement ERROR management. And sometime we go outside the ETL tool for presentation and correction of ERROR.
Informatica version 7.x onward provide feature where you can capture Error information at transformation level and Source record in ERROR table or ERROR File. These tables or files can be re-processed to present the error records to business community.
But in my opinion it is always better to have a common ERROR management process of all data integration project ( independent of tool). Any ETL tool can support that Error management process.

Without using Aggregator transf

Without using Aggregator transformation how to find
sum(sal),
min(sal),
max(sal),
count(sal),
avg(sal) in relational table as well as in flatfile.

Mapping Scenario

Hi Friends,
This is another scenario similar to the previous one.
Suppose the source table has values like this
10 A
10 B
10 C
20 X
20 Y
20 Z
30 D
30 E
30 F

And the target should be
10 A*B*C
20 X*Y*Z
30 D*E*F

How will you do the mapping?

Answer:This can be done using an expression transf and Aggregator transf.
In expression transf declare 2 variables ID_V and NAME_V
NAME_V
IIF(ID_V=ID,NAME_V ' * ' NAME,NAME)
ID_V
IIF(ID_V!=ID,ID,ID_V)
NOW pass the output ports to aggregator GROUP BY ID and connect to target.
This will work..

Wednesday, March 26, 2008

Mapping Scenario

Hi all,
My source table has 2 columns as follows
10 A
10 B
10 C
20 L
20 M
20 N
30 D
30 E
In the target The data should be as follows
10 ABC
20 LMN
30 DE
Any suggestions?

Sunday, March 23, 2008

Oracle sequence generator and informatica sequence generator

I have a task in which I need to take a flat file as my source and use oracle's sequence generator for a column and store into target flat file with sequence numbers including.

EXample:Source flat file
abc 1000
xyz 2000
def 3000

Target flat file
1 abc 1000
2 xyz 2000
3 def 3000
Note : The sequence numbers should come only from ORACLE (not from informatica's sequence generator Transformation).

Saturday, March 22, 2008

Do's and Dont's while using Sorted Input in Aggregator transformation

In general, follow this check list to ensure that you are handling aggregator with sorted inputs correctly:
1. Do not use sorted input if any of the following conditions are true:


a.The aggregate expression uses nested aggregate functions.
b.The session uses incremental aggregation.
c. Input data is data driven.
You select data driven for the Treat Source Rows as Session Property, or the Update Strategy transformation appears before the Aggregator transformation in the mapping.

If you use sorted input under these circumstances, the Informatica Server reverts to default aggregate behavior, reading all values before performing aggregate calculations.

Remember, when you are using sorted inputs, you pass sorted data through the Aggregator.
Data must be sorted as follows:
1. By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
2. Using the same sort order configured for the session.

If data is not in strict ascending or descending order based on the session sort order, the Informatica Server fails the session.