Thursday, December 23, 2010

Informatica PowerCenter


Informatica Repository Manager


A. Informatica PowerCenter includeds following type of repositories :
  • Standalone Repository : A repository that functions individually and this is unrelated to any other repositories.
  • Global Repository : This is a centralized repository in a domain. This repository can contain shared objects across the repositories in a domain. The objects are shared through global shortcuts.
  • Local Repository : Local repository is within a domain and it not a global repository. Local repository can connect to a global repository using global shortcuts and can use objects in it shared folders.
  • Versioned Repository : This can either be local or global repository but it allows version control for the repository. A versioned repository can store multiple copies, or versions of an object. This features allows to efficiently develop, test and deploy metadata in the production environment.

A. A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Japanese text then the code page should be selected to support Japanese text.
When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.


A. PowerCenter Server on Windows can connect to following databases:
  • IBM DB2
  • Informix
  • Microsoft Access
  • Microsoft Excel
  • Microsoft SQL Server
  • Oracle
  • Sybase
  • Teradata

A. PowerCenter Server on UNIX can connect to following databases:
  • IBM DB2
  • Informix
  • Oracle
  • Sybase
  • Teradata
Infomratica Mapping Designer


A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.


A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:
  • Aggregator
  • Application Source Qualifier
  • Custom
  • Expression
  • External Procedure
  • Filter
  • Input
  • Joiner
  • Lookup
  • Normalizer
  • Output
  • Rank
  • Router
  • Sequence Generator
  • Sorter
  • Source Qualifier
  • Stored Procedure
  • Transaction Control
  • Union
  • Update Strategy
  • XML Generator
  • XML Parser
  • XML Source Qualifier

A. Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or a flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.
PowerCenter Server generates a query for each Source Qualifier Transformation whenever it runs the session. The default query is SELET statement containing all the source columns. Source Qualifier has capability to override this default query by changing the default settings of the transformation properties. The list of selected ports or the order they appear in the default query should not be changed in overridden query.


A. The Aggregator transformation allows performing aggregate calculations, such as averages and sums. Unlike Expression Transformation, the Aggregator transformation can only be used to perform calculations on groups. The Expression transformation permits calculations on a row-by-row basis only.
Aggregator Transformation contains group by ports that indicate how to group the data. While grouping the data, the aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties.
Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.


A. Whenever a session is created for a mapping Aggregate Transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.


A. The union transformation is a multiple input group transformation that can be used to merge data from various sources (or pipelines). This transformation works just like UNION ALL statement in SQL, that is used to combine result set of two SELECT statements.


A. Yes, joiner transformation can be used to join data from two flat file sources.


A. This transformation is used to lookup data in a flat file or a relational table, view or synonym. It compares lookup transformation ports (input ports) to the source column values based on the lookup condition. Later returned values can be passed to other transformations.

A. Yes.


A. Connected lookup takes input values directly from other transformations in the pipleline.
Unconnected lookup doesn take inputs directly from any other transformation, but it can be used in any transformation (like expression) and can be invoked as a function using :LKP expression. So, an unconnected lookup can be called multiple times in a mapping.


A. A mapplet is a reusable object that is created using mapplet designer. The mapplet contains set of transformations and it allows us to reuse that transformation logic in multiple mappings.


A. Reusable transformations can be used multiple times in a mapping. The reusable transformation is stored as a metadata separate from any other mapping that uses the transformation. Whenever any changes to a reusable transformation are made, all the mappings where the transformation is used will be invalidated.


A. Informatica processes the source data row-by-row. By default every row is marked to be inserted in the target table. If the row has to be updated/inserted based on some logic Update Strategy transformation is used. The condition can be specified in Update Strategy to mark the processed row for update or insert.
Following options are available for update strategy :
  • DD_INSERT : If this is used the Update Strategy flags the row for insertion. Equivalent numeric value of DD_INSERT is 0.
  • DD_UPDATE : If this is used the Update Strategy flags the row for update. Equivalent numeric value of DD_UPDATE is 1.
  • DD_DELETE : If this is used the Update Strategy flags the row for deletion. Equivalent numeric value of DD_DELETE is 2.
  • DD_REJECT : If this is used the Update Strategy flags the row for rejection. Equivalent numeric value of DD_REJECT is 3.

Some interview Questions


1. How can you recognize whether or not the newly added rows in the source are gets insert in the target?

Version number
Flag value
Effective date Range 

2.What is the use of incremental aggregation? Explain me in brief with an example. 

                        Captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally,

3.What is the difference between stop and abort? 
 
                                         In this case data query from source databases is stopped immediately but whatever data has been loaded into buffer there transformations and loading contunes.Abort: Same as Stop but in this case maximum time allowed for buffered data is 60 Seconds.

4.What is the Confirmed Dimensions?

Conformed Dimensions: The dimension which gives the same meaning across different star schemas is called Conformed dimension.

5.About DTM?
D.T.M. [Data Transformation Manager]: It is responsible for modifying the data according to the transformation rule given in the mapping. it stores the data in temporary memory call staging area.[buffer]. Where data transformation activities takes place.

6.Type of Dimensions? 

Fact constellation: its process of joining two fact tables.

Conformed dimension: a dimension table which can be shared by multiple fact tables is known as conformed dimension [common dimension].

Fact less fact table: a fact table without any fact is known as fact less fact table.

Junk dimensions: a dimension with a type a flag, bullion and description or not used to describe the key performance indicators are known as Junk dimensions.


Dirty dimensions: in a dimension table if a record existing more than once with difference non key attribute is known as dirty dimensions.