Data Warehouse and Informatica Realtime Scenarios: 12/22/10

Tracing Levels

**Tracing Levels**
Tracing Level	Description
Normal	Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.
Terse	Integration Service logs initialization information and error messages and notification of rejected data.
Verbose Initialization	In addition to normal tracing, Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.
Verbose Data	In addition to verbose initialization tracing, Integration Service logs each row that passes into the mapping. Also notes where the Integration Service truncates string data to fit the precision of a column and provides detailed transformation statistics. Allows the Integration Service to write errors to both the session log and error log when you enable row error logging. When you configure the tracing level to verbose data, the Integration Service writes row data for all rows in a block when it processes a transformation.

By default, the tracing level for every transformation is Normal.

Update strategy transformation Use

Update strategy transformation can be used in two levels.

1)Mapping level
2)Session level

Mapping level:-Within a mapping you can use update strategy transf to flag rows for

Insert
Update
Delete
Reject

Session level:-In session properties you have an option as
Treat source rows as-Data driven(by default)
Or you can choose insert,update,delete

For target tables you can set the following update strategy options.

Insert
Delete
Update
-Update as update
-update as insert
-update else insert.
Truncate table.

Do's and Dont's while using Sorted Input in Aggregator transformation

In general, follow this check list to ensure that you are handling aggregator with sorted inputs correctly:
1. Do not use sorted input if any of the following conditions are true:

a.The aggregate expression uses nested aggregate functions.
b.The session uses incremental aggregation.
c. Input data is data driven.
You select data driven for the Treat Source Rows as Session Property, or the Update Strategy transformation appears before the Aggregator transformation in the mapping.
If you use sorted input under these circumstances, the Informatica Server reverts to default aggregate behavior, reading all values before performing aggregate calculations.

Remember, when you are using sorted inputs, you pass sorted data through the Aggregator.
Data must be sorted as follows:
1. By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
2. Using the same sort order configured for the session.
If data is not in strict ascending or descending order based on the session sort order, the Informatica Server fails the session.

Aggregate Transformation

The Aggregate transformation applies aggregate functions, such as Average, to column values and copies the results to the transformation output. Besides aggregate functions, the transformation provides the GROUP BY clause, which you can use to specify groups to aggregate across.
You configure the Aggregate transformation at the transformation, output, and column levels.

At the transformation level, you configure the Aggregate transformation for performance by specifying the following values:
- The number of groups that are expected to result from a Group by operation.
- The number of distinct values that are expected to result from a Count distinct operation.
- The percentage by which memory can be extended during the aggregation.
The Aggregate transformation can also be configured to generate a warning instead of failing when the value of a divisor is zero.

At the output level, you configure the Aggregate transformation for performance by specifying the number of groups that are expected to result from a Group by operation. The Aggregate transformation supports multiple outputs, and each can be configured differently.

At the column level, you specify the following values:
- The aggregation that the column performs.
- The comparison options of the aggregation.

You can also configure the Aggregate transformation for performance by specifying these values:

The number of groups that are expected to result from a Group by operation on the column.
The number of distinct values that are expected to result from a Count distinct operation on the column.

                          You can also identify columns as IsBig if a column contains large numeric values or numeric values with high precision.

                          The Aggregate transformation is asynchronous, which means that it does not consume and publish data row by row. Instead it consumes the whole rowset, performs its groupings and aggregations, and then publishes the results.
This transformation does not pass through any columns, but creates new columns in the data flow for the data it publishes. Only the input columns to which aggregate functions apply or the input columns the transformation uses for grouping are copied to the transformation output. For example, an Aggregate transformation input might have three columns: CountryRegion, City, and Population. The transformation groups by the CountryRegion column and applies the Sum function to the Population column. Therefore the output does not include the City column.

                         You can also add multiple outputs to the Aggregate transformation and direct each aggregation to a different output. For example, if the Aggregate transformation applies the Sum and the Average functions, each aggregation can be directed to a different output.

                     You can apply multiple aggregations to a single input column. For example, if you want the sum and average values for an input column named Sales, you can configure the transformation to apply both the Sum and Average functions to the Sales column.

                                  The Aggregate transformation has one input and one or more outputs. It does not support an error output.

Operations

The Aggregate transformation supports the following operations.

Operation	Description
Group by	Divides datasets into groups. Columns of any data type can be used for grouping. For more information, see GROUP BY (Transact-SQL
Sum	Sums the values in a column. Only columns with numeric data types can be summed. For more information, see SUM (Transact-SQL).
Average	Returns the average of the column values in a column. Only columns with numeric data types can be averaged. For more information, see AVG (Transact-SQL).
Count	Returns the number of items in a group. For more information, see COUNT (Transact-SQL).
Count distinct	Returns the number of unique nonnull values in a group. For more information, see Eliminating Duplicates with DISTINCT.
Minimum	Returns the minimum value in a group. For more information, see MIN (Transact-SQL). In contrast to the Transact-SQL MIN function, this operation can be used only with numeric, date, and time data types.
Maximum	Returns the maximum value in a group. For more information, see MAX (Transact-SQL). In contrast to the Transact-SQL MAX function, this operation can be used only with numeric, date, and time data types.

The Aggregate transformation handles null values in the same way as the SQL Server relational database engine. The behavior is defined in the SQL-92 standard. The following rules apply:

In a GROUP BY clause, nulls are treated like other column values. If the grouping column contains more than one null value, the null values are put into a single group.
In the COUNT (column name) and COUNT (DISTINCT column name) functions, nulls are ignored and the result excludes rows that contain null values in the named column.
In the COUNT (*) function, all rows are counted, including rows with null values.

Wednesday, December 22, 2010

Tracing Levels

Update strategy transformation Use

Do's and Dont's while using Sorted Input in Aggregator transformation

Aggregate Transformation