Friday, December 17, 2010

Simple SQL Q&As


SQL 

Q1.  What is the basic difference between a join and a union? 

A1.  A join combines columns from 2 or more tables into each result row, matching rows on a join condition. A union combines the rows of two or more compatible SELECT results into a single result set.
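
A minimal sketch of the difference, using SQLite as a stand-in for the database (the table and column names here are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10, 'SALES'), (20, 'IT');
INSERT INTO emp  VALUES (1, 'ADAMS', 10), (2, 'BAKER', 20), (3, 'CLARK', 10);
""")

# A join widens the result: columns from both tables, one row per match.
joined = con.execute(
    "SELECT e.ename, d.dname FROM emp e JOIN dept d ON e.deptno = d.deptno"
).fetchall()

# A union lengthens the result: rows from two compatible SELECTs stacked together.
unioned = con.execute(
    "SELECT dname AS name FROM dept UNION SELECT ename FROM emp"
).fetchall()
```

The join returns three two-column rows (one per employee), while the union returns five one-column rows (two department names plus three employee names).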

Q2.  What is normalization and what are the five normal forms? 

A2.  Normalization is a design procedure for representing data in tabular format.  The five normal forms are progressive rules to represent the data with minimal redundancy. 

Q3.  What are foreign keys? 

A3.  These are attributes of one table that have matching values in a primary key in another table, allowing for relationships between tables. 

Q4.  Describe the elements of the SELECT query syntax. 
A4.  SELECT column(s) FROM table(s) WHERE condition, optionally followed by GROUP BY, HAVING, and ORDER BY clauses.

Q5.  Explain the use of the WHERE clause. 
A5.  The WHERE clause applies a conditional expression to restrict the result to the rows that satisfy the condition.

Q6.  What techniques are used to retrieve data from more than one table in a single SQL statement? 
A6.  Joins, unions and nested selects are used to retrieve data. 

Q7.  What is a view? Why use it? 
A7.  A view is a virtual table made up of data from base tables and other views, but not stored separately. 
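
A short sketch of why views are useful, again using SQLite; the table and view names are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'ACME', 100.0), (2, 'ACME', 50.0), (3, 'BETA', 75.0);
-- The view stores only the defining query, not a separate copy of the rows.
CREATE VIEW big_orders AS
    SELECT id, customer, amount FROM orders WHERE amount >= 75;
""")
before = con.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]

# Rows added to the base table appear in the view automatically,
# because the view is re-evaluated against the base table on each query.
con.execute("INSERT INTO orders VALUES (4, 'GAMMA', 200.0)")
after = con.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]
```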

Q8.  Explain an outer join. 
A8.   An outer join also includes rows from one (or both) tables that have no matching values in the other table; the missing columns are returned as NULL.
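
A sketch of the inner/outer difference in SQLite with invented dept/emp tables; LEGAL has no employees, so only the outer join keeps it:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10, 'SALES'), (20, 'IT'), (30, 'LEGAL');
INSERT INTO emp  VALUES (1, 'ADAMS', 10), (2, 'BAKER', 20);
""")
# Inner join: only departments with a matching employee.
inner = con.execute(
    "SELECT d.dname, e.ename FROM dept d JOIN emp e ON e.deptno = d.deptno"
).fetchall()
# Outer join: the unmatched LEGAL row survives, with NULL (None) for ename.
outer = con.execute(
    "SELECT d.dname, e.ename FROM dept d LEFT OUTER JOIN emp e ON e.deptno = d.deptno"
).fetchall()
```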

Q9.  What is subselect?  Is it different from a nested select? 
A9.  A subselect is a SELECT that works in conjunction with another SELECT. A nested select is a kind of subselect in which the inner select's result feeds the WHERE criteria of the outer select.

Q10.  What is the difference between group by and order by? 
A10.  GROUP BY collapses the rows into groups so that aggregate functions can be computed per group; ORDER BY sorts the rows of the result set of the SELECT statement.
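
A small SQLite sketch (table and data invented) showing GROUP BY producing one row per group and ORDER BY sorting the grouped result:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (region TEXT, amount INTEGER);
INSERT INTO sales VALUES ('EAST', 100), ('WEST', 200), ('EAST', 50), ('WEST', 25);
""")
# GROUP BY collapses the four detail rows into one row per region for the SUM;
# ORDER BY then sorts those grouped rows by the total, highest first.
totals = con.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY SUM(amount) DESC"
).fetchall()
```

The four detail rows become two grouped rows, sorted so WEST (225) precedes EAST (150).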

Q11.  What keyword does an SQL SELECT statement use for a string search? 
A11.  The LIKE keyword allows for string searches.  The % sign is used as a wildcard. 
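
A sketch of the two LIKE wildcards in SQLite (names invented): `%` matches any run of characters, `_` matches exactly one:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE emp (ename TEXT);
INSERT INTO emp VALUES ('SMITH'), ('SMYTHE'), ('JONES');
""")
# % matches any run of characters (including none).
pct = [r[0] for r in con.execute(
    "SELECT ename FROM emp WHERE ename LIKE 'SM%' ORDER BY ename")]
# _ matches exactly one character, so six-letter SMYTHE does not match SM_TH.
one = [r[0] for r in con.execute(
    "SELECT ename FROM emp WHERE ename LIKE 'SM_TH'")]
```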

Q12.  What are some sql aggregates and other built-in functions? 
A12.  The common built-in aggregate functions are AVG, SUM, MIN, MAX, and COUNT. DISTINCT is not an aggregate function itself, but it is often used inside one, e.g. COUNT(DISTINCT col).

Q13.  How is the SUBSTR keyword used in sql? 
A13.  SUBSTR is used for string manipulation, with the column name, first position, and substring length as arguments. E.g. SUBSTR(NAME, 1, 3) refers to the first three characters in the column NAME.
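
A one-line sketch in SQLite (positions in SUBSTR are 1-based):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SUBSTR(string, start, length): take 3 characters starting at position 1.
first_three = con.execute("SELECT SUBSTR('WASHINGTON', 1, 3)").fetchone()[0]
```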

Q14.   Explain the EXPLAIN statement. 
A14.  The EXPLAIN statement provides information about the optimizer's choice of access path for the SQL.

Q15.  What is referential integrity? 
A15.  Referential integrity refers to the consistency that must be maintained between primary and foreign keys, ie every foreign key value must have a corresponding primary key value. 

Q16.  What is a NULL value?  What are the pros and cons of using NULLS? 
A16.  A NULL value takes up one byte of storage and indicates that a value is not present as opposed to a space or zero value.  It's the DB2 equivalent of TBD on an organizational chart and often correctly portrays a business situation.  Unfortunately, it requires extra coding for an application program to handle this situation. 

Q17.  What is a synonym?  How is it used? 
A17.  A synonym is used to reference a table or view by another name.  The other name can then be written in the application code pointing to test tables in the development stage and to production entities when the code is migrated.  The synonym is linked to the AUTHID that created it. 

Q18.  What is an alias and how does it differ from a synonym? 
A18.  An alias is an alternative to a synonym, designed for a distributed environment to avoid having to use the location qualifier of a table or view.  The alias is not dropped when the table is dropped. 

Q19. When can an insert of a new primary key value threaten referential integrity? 
A19.  Never. New primary key values are not a problem.  However, the values of foreign key inserts must have corresponding primary key values in their related tables. And updates of primary key values may require changes in foreign key values to maintain referential integrity. 
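
A sketch of this rule using SQLite, which enforces foreign keys only when the pragma is enabled (table names invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
con.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY);
CREATE TABLE emp  (empno INTEGER PRIMARY KEY,
                   deptno INTEGER REFERENCES dept(deptno));
INSERT INTO dept VALUES (10);
""")

# Inserting a brand-new primary key value is never a problem.
con.execute("INSERT INTO dept VALUES (20)")

# Inserting a foreign key value with no matching primary key is rejected.
try:
    con.execute("INSERT INTO emp VALUES (1, 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# A foreign key value that does match an existing primary key succeeds.
con.execute("INSERT INTO emp VALUES (2, 20)")
emp_count = con.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```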

Q20.  What is the difference between static and dynamic sql? 
A20.  Static sql is hard-coded in a program when the programmer knows the statements to be executed.  For dynamic sql the program must dynamically allocate memory to receive the query results. 

Q21.  Compare a subselect to a join.
 
A21. Most subselects can be rewritten as joins, but not vice versa. Joins are usually more efficient because join rows can be returned immediately, while subselects require a temporary work area for the inner select's results while processing the outer select.

Q22.  What is the difference between IN subselects and EXISTS subselect? 
A22.  If there is an index on the attributes tested, an IN is more efficient, since DB2 uses the index for the IN (IN for INdex is the mnemonic). EXISTS, by contrast, simply tests whether the correlated subquery returns any row at all, and can stop at the first match.
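
A sketch of the two equivalent forms in SQLite (tables invented); both return the departments that have at least one employee:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY);
CREATE TABLE emp  (ename TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10), (20), (30);
INSERT INTO emp  VALUES ('ADAMS', 10), ('BAKER', 10), ('CLARK', 20);
""")
# Non-correlated form: the inner SELECT produces the list probed by IN.
with_in = con.execute(
    "SELECT deptno FROM dept WHERE deptno IN (SELECT deptno FROM emp)"
    " ORDER BY deptno"
).fetchall()
# Correlated form: the inner SELECT is evaluated per outer row,
# and can stop at the first matching employee.
with_exists = con.execute(
    "SELECT deptno FROM dept d"
    " WHERE EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno)"
    " ORDER BY deptno"
).fetchall()
```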

Q23.  What is a Cartesian product? 
A23. A Cartesian product is usually the result of a faulty query: a join with a missing or wrong join condition returns a row in the results for every combination of rows in the joined tables.
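
A sketch in SQLite (tables invented): omitting the join condition pairs every row of one table with every row of the other:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE a (x INTEGER);
CREATE TABLE b (y INTEGER);
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (10), (20);
""")
# No join condition: every combination is returned, 3 * 2 = 6 rows.
cartesian = con.execute("SELECT x, y FROM a, b").fetchall()
```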

Q24.  What is a tuple? 
A24.  A tuple is a single row (record) of a relation, i.e. one instance of data within a relational table.

Q25.  What is the difference between static and dynamic sql? 
A25.  Static sql is compiled and optimized prior to its execution; dynamic is compiled and optimized during execution. 

Q26.  Any SQL implementation covers data types in a couple of main categories. Which of the following are among those data types? (Check all that apply) A. NUMERIC   B. CHARACTER   C. DATE AND TIME   D. BLOBS   E. BIT 

A26. A, B, C. Not all SQL implementations have BLOB or BIT data types. 

Q27.  We have a table with a CHARACTER data type field. We apply a ">" row comparison between this field and another CHARACTER field in another table. What will be the result for records with a field value of NULL? (Check the one that applies best)  A. TRUE   B. FALSE   C. UNKNOWN   D. Error.   E. Those records will be ignored 

A27.   C. NULL in a row when compared will give an UNKNOWN result. 
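
A sketch of this three-valued logic in SQLite, where the UNKNOWN result surfaces as None, and rows whose comparison is UNKNOWN are dropped by WHERE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Comparing NULL with ">" yields neither TRUE nor FALSE but UNKNOWN (None here).
unknown = con.execute("SELECT NULL > 'A'").fetchone()[0]

con.executescript("""
CREATE TABLE t (c TEXT);
INSERT INTO t VALUES ('B'), (NULL);
""")
# WHERE keeps only rows where the condition is TRUE, so the NULL row is dropped.
matched = con.execute("SELECT COUNT(*) FROM t WHERE c > 'A'").fetchone()[0]
```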

Q28.  Any database needs to go through a normalization process to make sure that data is represented only once. This will eliminate problems with creating or destroying data in the database. The normalization process is usually done in three steps, which result in first, second, and third normal forms. Which best describes the process to obtain the third normal form? (Check the one that applies best)  A. Each table should have related columns.   B. Each separate table should have a primary key.   C. We have a table with a multi-valued key. All columns that are dependent on only one or on some of the keys should be moved into a different table.  D. If a table has columns not dependent on the primary keys, they need to be moved into a separate table.   E. Primary key is always UNIQUE and NOT NULL. 

A28.  D. All columns in a table should depend on the primary key. This eliminates transitive dependencies, in which a non-key column A depends on another non-key column B, which in turn depends on the key. 

Q29.  SQL can be embedded in a host program that uses a relational database as a persistent data repository. Some of the most important pre-defined structures for this mechanism are SQLDA ("SQL Descriptor Area") and SQLCA ("SQL Communications Area"). SQLCA contains two structures: SQLCODE and SQLSTATE. SQLSTATE is a standard set of error codes in which the first two characters define the class and the last three define the subclass of the error. Which of the following SQLSTATE codes is interpreted as "No data returned"? (Check the one that applies best) A. 00xxx   B. 01xxx  C. 02xxx   D. 22xxx   E. 2Axxx 

A29. C. 00 is successful completion, 01 is a warning, 02 is no data, 22 is a data exception, and 2A is a syntax error; so "No data returned" is SQLSTATE "02xxx".

Informatica interview questions & FAQs

What is a source qualifier?
What is a surrogate key?
What is difference between Mapplet and reusable transformation?
What is DTM session?
What is a Mapplet?
What is a look up function? What is default transformation for the look up function?
What is difference between a connected look up and unconnected look up?
What is update strategy and what are the options for update strategy?
What is subject area?
What is the difference between truncate and delete statements?
What kind of Update strategies are normally used (Type 1, 2 & 3) & what are the differences?
What is the exact syntax of an update strategy?
What are bitmap indexes and how and why are they used?
What is bulk bind? How does it improve performance?
What are the different ways to filter rows using Informatica transformations?
What is referential Integrity error? How do you rectify it?
What is DTM process?
What is target load order?
What exactly is a shortcut and how do you use it?
What is a shared folder?
What are the different transformations where you can use a SQL override?
What is the difference between a Bulk and Normal mode and where exactly is it defined?
What is the difference between Local & Global repository?
What are data driven sessions?
What are the common errors while running an Informatica session?
What are worklets and what is their use?
What is change data capture?
What exactly is tracing level?
What is the difference between constraints based load ordering and target load plan?
What is a deployment group and what is its use?
When and how a partition is defined using Informatica?
How do you improve performance in an Update strategy?
How do you validate all the mappings in the repository at once?
How can you join two or more tables without using the source qualifier override SQL or a Joiner transformation?
How can you define a transformation? What are different types of transformations in Informatica?
How many repositories can be created in Informatica?
How many minimum groups can be defined in a Router transformation?
How do you define partitions in Informatica?
How can you improve performance in an Aggregator transformation?
How does the Informatica know that the input is sorted?    
How many worklets can be defined within a workflow?
How do you define a parameter file? Give an example of its use.
If you join two or more tables, pull about two columns from each table into the source qualifier, pull just one column from the source qualifier into an Expression transformation, and then do a ‘generate SQL’ in the source qualifier, how many columns will show up in the generated SQL?
In a Type 1 mapping with one source and one target table what is the minimum number of update strategy transformations to be used?
At what levels can you define parameter files and what is the order?
In a session log file where can you find the reader and the writer details?
For joining three heterogeneous tables how many joiner transformations are required?
Can you look up a flat file using Informatica?
While running a session what default files are created?
Describe the use of Materialized views and how are they different from a normal view.

Informatica Questions and Answers and Concepts (2)

What is the difference between PowerCenter and PowerMart?

With PowerCenter, you receive all product functionality, including the ability to register multiple servers, share metadata across repositories, and partition data.
A PowerCenter license lets you create a single repository that you can configure as a global repository, the core component of a data warehouse.
PowerMart includes all features except distributed metadata, multiple registered servers, and data partitioning. Also, the various options available with PowerCenter (such as PowerCenter Integration Server for BW, PowerConnect for IBM DB2, PowerConnect for IBM MQSeries, PowerConnect for SAP R/3, PowerConnect for Siebel, and PowerConnect for PeopleSoft) are not available with PowerMart.

What are the new features and enhancements in PowerCenter 5.1?

The major features and enhancements to PowerCenter 5.1 are:


a) Performance Enhancements
•    High precision decimal arithmetic. The Informatica Server optimizes data throughput to increase performance of sessions using the Enable Decimal Arithmetic option.
•    To_Decimal and aggregate functions. The Informatica Server uses improved algorithms to increase performance of To_Decimal and all aggregate functions such as percentile, median, and average.
•    Cache management. The Informatica Server uses better cache management to increase performance of Aggregator, Joiner, Lookup, and Rank transformations.
•    Partition sessions with sorted aggregation. You can partition sessions with an Aggregator transformation that uses sorted input. This improves memory usage and increases performance of sessions that have sorted data.
b) Relaxed Data Code Page Validation
When enabled, the Informatica Client and Informatica Server lift code page selection and validation restrictions. You can select any supported code page for source, target, lookup, and stored procedure data.
c) Designer Features and Enhancements
•    Debug mapplets. You can debug a mapplet within a mapping in the Mapping Designer. You can set breakpoints in transformations in the mapplet.
•    Support for slash character (/) in table and field names. You can use the Designer to import source and target definitions with table and field names containing the slash character (/). This allows you to import SAP BW source definitions by connecting directly to the underlying database tables.
d) Server Manager Features and Enhancements
•    Continuous sessions. You can schedule a session to run continuously. A continuous session starts automatically when the Load Manager starts. When the session stops, it restarts immediately without rescheduling. Use continuous sessions when reading real-time sources, such as IBM MQSeries.
•    Partition sessions with sorted aggregators. You can partition sessions with sorted aggregators in a mapping.
•    Register multiple servers against a local repository. You can register multiple PowerCenter Servers against a local repository.

What is a repository?

The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
We create and maintain the repository with the Repository Manager client tool. With the Repository Manager, we can also create folders to organize metadata and groups to organize users.

What are different kinds of repository objects? And what it will contain?

Repository objects displayed in the Navigator can include sources, targets, transformations, mappings, mapplets, shortcuts, sessions, batches, and session logs.

What is Sequence Generator Transformation?

The Sequence Generator transformation generates numeric values. We can use the Sequence Generator to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

The Sequence Generator transformation is a connected transformation. It contains two output ports that we can connect to one or more transformations.

What is the difference between connected lookup and unconnected lookup?

Differences between Connected and Unconnected Lookups:
•    Input: a connected Lookup receives input values directly from the pipeline; an unconnected Lookup receives input values from the result of a :LKP expression in another transformation.
•    Cache: a connected Lookup can use a dynamic or static cache; an unconnected Lookup can use only a static cache.
•    Default values: a connected Lookup supports user-defined default values; an unconnected Lookup does not.

What are connected and unconnected Lookup transformations?

We can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or we can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

An unconnected Lookup transformation exists separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.

A common use for unconnected Lookup transformations is to update slowly changing dimension tables.

What is a Lookup transformation and what are its uses?

We use a Lookup transformation in our mapping to look up data in a relational table, view or synonym.

We can use the Lookup transformation for the following purposes:

    Get a related value. For example, our source table includes the employee ID, but we want to include the employee name in our target table to make our summary data easier to read. 
    Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
    Update slowly changing dimension
tables. We can use a Lookup transformation to determine whether records already exist in the target.

What is a lookup table?

The lookup table can be a single table, or we can join multiple tables in the same database using a lookup query override. The Informatica Server queries the lookup table or an in-memory cache of the table for all incoming rows into the Lookup transformation.

If your mapping includes heterogeneous joins, we can use any of the mapping sources or mapping targets as the lookup table.

Where do you define update strategy?

We can set the Update strategy at two different levels:
•    Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.
•    Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

What is Update Strategy?

When we design our data warehouse, we need to decide what type of information to store in targets. As part of our target table design, we need to determine whether to maintain all the historic data or just the most recent changes.
The model we choose constitutes our update strategy, how to handle changes to existing records.

Update strategy flags a record for update, insert, delete, or reject. We use this transformation when we want to exert fine control over updates to a target, based on some condition we apply. For example, we might use the Update Strategy transformation to flag all customer records for update when the mailing address has changed, or flag all employee records for reject for people no longer working for the company.

What are the different types of Transformations?

a)      Aggregator transformation: The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation, in that you can use the Aggregator transformation to perform calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only. (Mascot)

b) Expression transformation: You can use the Expression transformations to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. You can use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

c) Filter transformation: The Filter transformation provides the means for filtering rows in a mapping. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

d) Joiner transformation: While a Source Qualifier transformation can join data originating from a common source database, the Joiner transformation joins two related heterogeneous sources residing in different locations or file systems.
e) Lookup transformation: Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.
The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Use the result of the lookup to pass to other transformations and the target.

What is a transformation?

A transformation is a repository object that generates, modifies, or passes data. You configure logic in a transformation that the Informatica Server uses to transform data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.
Each transformation has rules for configuring and connecting in a mapping. For more information about working with a specific transformation, refer to the chapter in this book that discusses that particular transformation.
You can create transformations to use once in a mapping, or you can create reusable transformations to use in multiple mappings.

What are the tools provided by Designer?

The Designer provides the following tools:
•    Source Analyzer. Use to import or create source definitions for flat file, XML, COBOL, ERP, and relational sources.
•    Warehouse Designer. Use to import or create target definitions.
•    Transformation Developer. Use to create reusable transformations.
•    Mapplet Designer. Use to create mapplets.
•    Mapping Designer. Use to create mappings.

What are the different types of Commit intervals?

The different commit intervals are:
•    Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
•    Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

What is Event-Based Scheduling?

When you use event-based scheduling, the Informatica Server starts a session when it locates the specified indicator file. To use event-based scheduling, you need a shell command, script, or batch file to create an indicator file when all sources are available. The file must be created or sent to a directory local to the Informatica Server. The file can be of any format recognized by the Informatica Server operating system. The Informatica Server deletes the indicator file once the session starts.
Use the following syntax to ping the Informatica Server on a UNIX system:
pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}] [hostname:]portno
Use the following syntax to start a session or batch on a UNIX system:
pmcmd start {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} [:pf=param_file] session_flag wait_flag
Use the following syntax to stop a session or batch on a UNIX system:
pmcmd stop {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} session_flag
Use the following syntax to stop the Informatica Server on a UNIX system:
pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno

Informatica Questions and Answers and Concepts


Which transformation should we use to normalize the COBOL and relational sources?
Normalizer Transformation. When we drag the COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.
Difference between static cache and dynamic cache?
With a dynamic cache, when the server processes a new row it first looks in the lookup cache to see whether the row already exists; if not, it inserts the row into both the target and the cache. With a static cache, the server checks the cache and writes the row to the target but never updates the cache.

If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.
What are the join types in joiner transformation?

The following are the join types: Normal, Master Outer, Detail Outer, and Full Outer.
In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?
A Joiner transformation cannot be used when:
1. Both input pipelines originate from the same Source Qualifier transformation.
2. Both input pipelines originate from the same Normalizer transformation.
3. Both input pipelines originate from the same Joiner transformation.
4. Either input pipeline contains an Update Strategy transformation.
5. A Sequence Generator transformation is connected directly before the Joiner transformation.
What is the Lookup transformation?
Used to look up data in a relational table or view.
Lookup is a passive transformation, used to look up data in a flat file or a relational table.
What is the difference between the Joiner transformation and the Source Qualifier transformation?
1. A Source Qualifier operates only on relational sources within the same schema; a Joiner can join heterogeneous sources, or relational sources in different schemas. 2. A Source Qualifier requires at least one matching column to perform a join; a Joiner joins based on a matching port. 3. Additionally, a Joiner requires two separate input pipelines and should not have an Update Strategy or Sequence Generator immediately upstream (this restriction was relaxed from Informatica 7.2).

1) A Joiner can join relational sources that come from different data sources, whereas in a Source Qualifier the relational sources must come from the same data source. 2) We need matching keys to join two relational sources in a Source Qualifier transformation, whereas the Joiner joins on matching ports rather than database keys.

Why use the lookup transformation?

Lookup is a transformation to look up values from a relational table/view or a flat file. The developer defines the lookup match criteria. There are two types of lookups in the PowerCenter Designer, namely: 1) Connected Lookup 2) Unconnected Lookup. Different caches can also be used with a lookup, such as static, dynamic, persistent, and shared (the dynamic cache cannot be used while creating an unconnected lookup).
The Lookup transformation is passive and can be both connected and unconnected. It is used to look up data in a relational table, view, or synonym.

A lookup is used to perform one of the following tasks: to get a related value, to perform a calculation, to update a slowly changing dimension table, or to check whether a record already exists in the table.

What is source qualifier transformation?

SQ is an active transformation. It can perform any of the following tasks: join data from the same source database, filter rows as the PowerCenter server reads source data, perform an outer join, or select only distinct values from the source.

In a Source Qualifier transformation a user can define join conditions, filter the data, and eliminate duplicates. Overriding the default Source Qualifier query with these options is known as a SQL override.

When we add a relational or a flat file source definition to a mapping, we need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the records that the Informatica server reads when it runs a session.

How the Informatica server increases the session performance through partitioning the source?

Partitioning the session improves the session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

What are the rank caches?

The Informatica server stores group information in an index cache and row data in a data cache.

When the server runs a session with a Rank transformation, it compares each input row with the rows in the data cache. If the input row out-ranks a stored row, the Informatica server replaces the stored row with the input row.

 

What is Code Page Compatibility?

When two code pages are compatible, the characters encoded in the two code pages are virtually identical.

Compatibility between code pages is used for accurate data movement when the Informatica Sever runs in the Unicode data movement mode. If the code pages are identical, then there will not be any data loss. One code page can be a subset or superset of another. For accurate data movement, the target code page must be a superset of the source code page.

How can you create or import flat file definition in to the warehouse designer?

By giving server connection path

Create the file in Warehouse Designer or Import the file from the location it exists or modify the source if the structure is one and the same

First create the definition in the Source Analyzer, then drag it into the Warehouse Designer; you can't create a flat file target definition directly.

There is no way to import a target definition as a file in the Informatica Designer. So while creating the target definition for a file in the Warehouse Designer, it is created as though it were a table, and then in the session properties of that mapping it is specified as a file.

You cannot create or import a flat file definition into the Warehouse Designer directly. Instead you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.
What is aggregate cache in aggregator transformation?
Aggregate value will stored in data cache, grouped column value will stored in index cache

The PowerCenter Server stores data in the aggregate cache until it completes the aggregate calculations.

The Aggregator transformation contains two caches, namely the data cache and the index cache. The data cache holds the aggregate values and detail records; the index cache holds the grouped column values, i.e. the unique group keys of the records.

When the PowerCenter Server runs a session with an Aggregator transformation, it stores data in the aggregate cache until it completes the aggregation calculation.

The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.
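The index-cache/data-cache split described above can be sketched in a few lines: the index cache tracks the unique group-by keys, while the data cache holds the running aggregate for each group. This is a hedged illustration of the idea, not the actual Informatica cache format:

```python
# Illustrative sketch of the Aggregator's two caches: the index cache maps
# each unique group-by value to a slot, and the data cache keeps the running
# aggregate (here, a SUM) for that slot. Output is emitted only after all
# input rows have been processed.

def aggregate(rows, group_key, value_key):
    index_cache = {}   # group-by value -> slot (unique group keys)
    data_cache = {}    # slot -> running SUM for that group
    for row in rows:
        key = row[group_key]
        slot = index_cache.setdefault(key, len(index_cache))
        data_cache[slot] = data_cache.get(slot, 0) + row[value_key]
    # one output row per group, produced once aggregation is complete
    return {key: data_cache[slot] for key, slot in index_cache.items()}

rows = [
    {"region": "EU", "sales": 10},
    {"region": "US", "sales": 5},
    {"region": "EU", "sales": 7},
]
print(aggregate(rows, "region", "sales"))  # {'EU': 17, 'US': 5}
```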

How can you recognize whether or not the newly added rows in the source got inserted into the target?

In a Type-2 mapping we have three options to recognize the newly added rows: i) version number, ii) flag value, iii) effective date range.
In the session, SrcSuccessRows can be compared with TgtSuccessRows.
Check the session log, or query the target table.

How does the Informatica server increase session performance by partitioning the source?

Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

What are the types of lookup?

The first three classifications below are the main ones.
Based on connection: 1. Connected 2. Unconnected
Based on source type: 1. Flat file 2. Relational
Based on cache: 1. Cached 2. Uncached
Based on cache type: 1. Static 2. Dynamic
Based on reuse: 1. Persistent 2. Non-persistent
Based on input: 1. Sorted 2. Unsorted

connected, unconnected

There are mainly two types of lookup cache: 1. Static 2. Dynamic. A static lookup can in turn be used in two ways: connected (placed in the pipeline) or unconnected (called from an expression condition).
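An unconnected lookup behaves like a function called from an expression, returning one value per call, with the lookup table cached once up front (a static cache). A minimal sketch of that idea, with invented table and column names:

```python
# Illustrative sketch of a static, unconnected-style lookup: the cache is
# built once before rows flow through, and each call returns a single value
# (or a default when no match is found).

class StaticLookup:
    def __init__(self, table, key_col, return_col):
        # build the cache once, before any rows are processed
        self.cache = {row[key_col]: row[return_col] for row in table}

    def lookup(self, key, default=None):
        return self.cache.get(key, default)

products = [{"id": 1, "name": "bolt"}, {"id": 2, "name": "nut"}]
lkp = StaticLookup(products, "id", "name")
print(lkp.lookup(2))   # nut
print(lkp.lookup(9))   # None (no match)
```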

What are the types of metadata that stores in repository?

Database connections, global objects, sources, targets, mappings, mapplets, sessions, shortcuts, transformations.

The repository stores metadata that describes how to transform and load source and target data.

Data about data

Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

Following are the types of metadata stored in the repository: database connections, global objects, mappings, mapplets, multidimensional metadata, reusable transformations, sessions and batches, shortcuts, source definitions, target definitions, and transformations.

What happens if Informatica server doesn't find the session parameter in the parameter file?

The workflow will fail.

Can you access a repository created in previous version of informatica?

We have to migrate the repository from the older version to newer version. Then you can use that repository.

Can you prepare and maintain a data warehouse without using an ETL tool?

Yes, we can do that using PL/SQL or stored procedures when all the data is in the same database. If you have flat files as the source, you can't do it through PL/SQL or stored procedures alone.
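The no-ETL-tool approach boils down to plain SQL against the database. A small sketch using sqlite3 as a stand-in database (table and column names are invented; Oracle PL/SQL would typically use MERGE where SQLite uses ON CONFLICT):

```python
# Sketch of maintaining a warehouse table with plain SQL, no ETL tool.
# sqlite3 stands in for the database; stg_sales / dw_sales are invented names.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE dw_sales  (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO stg_sales VALUES (1, 100.0), (2, 50.0);
    INSERT INTO dw_sales  VALUES (1, 90.0);
""")

# Upsert: update changed rows, insert new ones. The WHERE TRUE is required
# by SQLite to disambiguate INSERT...SELECT with an ON CONFLICT clause.
conn.execute("""
    INSERT INTO dw_sales (id, amount)
    SELECT id, amount FROM stg_sales WHERE TRUE
    ON CONFLICT(id) DO UPDATE SET amount = excluded.amount
""")
print(conn.execute("SELECT id, amount FROM dw_sales ORDER BY id").fetchall())
# [(1, 100.0), (2, 50.0)]
```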

How do you identify the changed records in operational data?

In my project, the source system itself sends us the new records and the changed records from the last 24 hours.

Why couldn't you go for a snowflake schema?

A snowflake schema gives lower performance compared to a star schema, because it requires multiple joins while retrieving the data.
A snowflake schema is preferred in two cases:
    If you want to load the data into more hierarchical levels of information, for example yearly, quarterly, monthly, daily, hourly, down to minutes of information, prefer a snowflake.
    Whenever the input data contains many low-cardinality elements, prefer a snowflake schema. Low-cardinality examples: sex, marital status, etc. Low cardinality means the number of distinct values is very small compared to the total number of records.

Name some measures in your fact table?

Sales amount.

How many dimension tables did you have in your project, and can you name some dimensions (columns)?

Product dimension: Product Key, Product Id, Product Type, Product Name, Batch Number.
Distributor dimension: Distributor Key, Distributor Id, Distributor Location.
Customer dimension: Customer Key, Customer Id, CName, Age, Status, Address, Contact.
Account dimension: Account Key, Acct Id, Acct Type, Location, Balance.

How many Fact and Dimension tables are there in your project?

In my module (Sales) we have 4 Dimensions and 1 fact table.

How many Data marts are there in your project?

There are 4 Data marts, Sales, Marketing, Finance and HR. In my module we are handling only sales data mart.

What is the daily data volume (in GB/records)? What is the size of the data extracted in the extraction process?

Approximately 40k records per file per day on average. Daily we get 8 files from 8 source systems.

What is the size of the database in your project?

It depends on the client's database; it might run to several GBs.

What is meant by clustering?

Clustering physically stores rows of two (or more) related tables together in the same buffer/blocks, so joined data can be retrieved more efficiently.

How is it determined whether or not a session has heterogeneous targets?

A session is considered to have heterogeneous targets when it loads targets of different types or in different databases (there need be no primary key / foreign key relationship between them).

Under what circumstances can a target definition be edited from the Mapping Designer, within the mapping where that target definition is being used?

We can't edit the target definition in the Mapping Designer; we can edit the target in the Warehouse Designer only. In our projects, however, we haven't edited any of the targets: if any change is required to the target definition, we inform the DBA to make the change and then import the definition again. We don't have permission to edit the source and target tables.

Can a Source Qualifier be used to perform an outer join when joining two databases?

No, we can't join two different databases in a SQL override.

If your source is a flat file with a delimiter, and next time you want to change that delimiter, where do you make the change?

In the session properties, go to Mappings, click on the target instance, click Set File Properties, and change the delimiter option.

If the index cache capacity is 2 MB and the data cache is 1 MB, and you enter a capacity of 3 MB for the index and 2 MB for the data, what will happen?

Nothing will happen; based on the buffer size that exists on the server, we can change the cache sizes. The maximum cache size is 2 GB.

What is the difference between the NEXTVAL and CURRVAL ports in a Sequence Generator?

Assume that they are both connected to the input of another transformation.
NEXTVAL generates the sequence of values (1, 2, 3, ...), while CURRVAL is NEXTVAL plus the Increment By value, so the two ports emit pairs such as NEXTVAL 1 / CURRVAL 2. Connecting only CURRVAL causes the server to generate a constant value for every row.
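The port behavior (per the PowerCenter documentation, where CURRVAL = NEXTVAL + Increment By) can be sketched as follows; start and increment values are illustrative:

```python
# Hedged sketch of the Sequence Generator ports: NEXTVAL emits the sequence
# itself, and CURRVAL emits NEXTVAL plus the Increment By value for each row.

class SequenceGenerator:
    def __init__(self, start=1, increment=1):
        self.next_value = start
        self.increment = increment

    def row(self):
        nextval = self.next_value
        currval = nextval + self.increment   # CURRVAL = NEXTVAL + Increment By
        self.next_value += self.increment
        return nextval, currval

seq = SequenceGenerator(start=1, increment=1)
print([seq.row() for _ in range(3)])  # [(1, 2), (2, 3), (3, 4)]
```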

How does the dynamic cache handle duplicate rows?
The dynamic cache flags each row through the NewLookupRow port as it compares the row with the cache: a row not found in the cache is inserted into the cache and flagged 1, a row found with changed data is updated in the cache and flagged 2, and a duplicate of a row already in the cache leaves the cache unchanged and is flagged 0.
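A conceptual sketch of how a dynamic lookup cache assigns flags, following Informatica's NewLookupRow semantics (0 = no change, 1 = inserted, 2 = updated); the keys and values below are invented for illustration:

```python
# Sketch of dynamic-cache flagging: each incoming row is compared with the
# cache, and the cache is updated as a side effect of the comparison.
# Returned flag mirrors NewLookupRow: 0 = no change, 1 = insert, 2 = update.

def process_row(cache, key, data):
    if key not in cache:
        cache[key] = data
        return 1          # new row: inserted into the cache
    if cache[key] != data:
        cache[key] = data
        return 2          # existing row with changed data: cache updated
    return 0              # duplicate of a cached row: cache unchanged

cache = {}
print(process_row(cache, 101, "alice"))   # 1 (insert)
print(process_row(cache, 101, "alice"))   # 0 (duplicate, no change)
print(process_row(cache, 101, "alicia"))  # 2 (update)
```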

How will you find out whether your mapping is correct or not without running a session?

Through the debugging option (the Debugger).

When you use an Aggregator transformation in your mapping, should your source be a dimension or a fact?

We use the Aggregator transformation according to the requirements; there is no such limitation for the Aggregator. The source can be either a dimension or a fact.

Informatica Questions and Answers

What are Shortcuts?

We can create shortcuts to objects in shared folders. Shortcuts provide the easiest way to reuse objects. We use a shortcut as if it were the actual object, and when we make a change to the original object, all shortcuts inherit the change.

Shortcuts to folders in the same repository are known as local shortcuts. Shortcuts to the global repository are called global shortcuts.


What is metadata?

Designing a data mart involves writing and storing a complex set of instructions. You need to know where to get data (sources), how to change it, and where to write the information (targets). PowerMart and PowerCenter call this set of instructions metadata. Each piece of metadata (for example, the description of a source table in an operational database) can contain comments about it.

In summary, metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

What are folders?

Folders let you organize your work in the repository, providing a way to separate different types of metadata or different projects into easily identifiable areas.

What are mappings?

A mapping specifies how to move and transform data from sources to targets. Mappings include source and target definitions and transformations. Transformations describe how the Informatica Server transforms data. Mappings can also include shortcuts, reusable transformations, and mapplets. Use the Mapping Designer tool in the Designer to create mappings.

What are mapplets?

You can design a mapplet to contain sets of transformation logic to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same set of transformations each time, you can create a mapplet containing the transformations, then add instances of the mapplet to individual mappings. Use the Mapplet Designer tool in the Designer to create mapplets.

What are Transformations?

A transformation generates, modifies, or passes data through ports that you connect in a mapping or mapplet. When you build a mapping, you add transformations and configure them to handle data according to your business purpose. Use the Transformation Developer tool in the Designer to create transformations

What are Reusable transformations?

You can design a transformation to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same transformation each time, you can make the transformation reusable, then add instances of the transformation to individual mappings. Use the Transformation Developer tool in the Designer to create reusable transformations.

What are Sessions and Batches?

Sessions and batches store information about how and when the Informatica Server moves data through mappings. You create a session for each mapping you want to run. You can group several sessions together in a batch. Use the Server Manager to create sessions and batches.

What is a Session? What are Batches?
Session: a set of instructions that tells the Informatica Server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.
Batches: provide a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:
Sequential: runs the sessions one after the other.
Concurrent: runs the sessions at the same time.
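The two batch types can be sketched with a toy scheduler; session names here are invented, and appending to a list merely stands in for moving data:

```python
# Toy sketch of batch execution: a sequential batch runs sessions one after
# another; a concurrent batch runs them at the same time via threads.
import threading

def run_session(name, log):
    log.append(name)          # stand-in for actually moving data

def run_sequential(sessions):
    log = []
    for s in sessions:
        run_session(s, log)   # one after the other, in order
    return log

def run_concurrent(sessions):
    log = []
    threads = [threading.Thread(target=run_session, args=(s, log)) for s in sessions]
    for t in threads:
        t.start()             # all sessions start together
    for t in threads:
        t.join()              # wait for every session to finish
    return sorted(log)        # completion order is not guaranteed

print(run_sequential(["s_load_dim", "s_load_fact"]))  # ['s_load_dim', 's_load_fact']
print(run_concurrent(["s_load_dim", "s_load_fact"]))  # ['s_load_dim', 's_load_fact']
```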