Thursday, September 25, 2014

Performance Guide 1

Performance Tuning

A) Explain the EXPLAIN: Check for EXPLAIN plan to see how exactly Teradata will be executing the query. Try to understand basic keywords in Explain Plan like confidence level, join strategy used, re-distribution happening or not.

B) Collect STATS: The stats of the columns used join conditions should updated. Secondary Indexes without proper STATS can be of little or no help. Check for STATS status of the table.

C) Use Proper PI: If the Primary index is not properly defined in any one or all of the tables in the query. Check if the PI of target table loaded is unique.

D) Use PPI: If there is Partition Primary Index created on a table, try to use it. If you are not using it in filter condition, it will degrade the performance.

E) No FUNCTIONS in Conditions: Try to avoid using function in join conditions. Ex Applying COALESCE or TRIM etc causes high CPU consumption.

F) Use PPI: If Partition Primary Index is defined in tables try to use it. If you are not using it in filter condition, it will degrade the performance.

G) Same column DATA TYPES:
 Define same data type for the joining columns.

H) Avoid IN clause in filter conditions: When there can be huge number of values in where conditions, better option can be to insert such values in a volatile table and use volatile table with INNER JOIN in the main query.

I) Use Same PI in Source & Target: PI columns also can help in saving the data into disk .If the Source and Target have the same PI, data dump can happen very efficiently form source to target.

J) Collect STATS on VOLATILE table: Collect stats on volatile tables where required can save AMPCPU. Remove stats if already present where it is not getting used.
If the volatile table contains UNIQUE PI, then go for sample stats rather than full stats.

K) DROPPING volatile tables explicitly: Once volatile tables is no more required you can drop those. Donít wait for complete procedure to be over. This will free some spool space immediately and could prove to be very helpful in avoiding No More Spool Space error.

L) NO LOG for volatile tables: Create volatile tables with NO LOG option.

M) Check DBQL Stats: Keep your performance stats accessible. Target the most AMPCPU consuming query first.

N) UPDATE clause: Do not write UPDATE clause with just SET condition and no WHERE condition. Even if the Target/Source has just one row, add WHERE clause for PI column.

O) DELETE & INSERT: Sometimes replacing UPDATE with DELETE & INSERT can save good number of AMPCPU. Check if this holds good for your query.

P) Query SPLITS: Split queries into several smaller queries logically and use volatile tables with proper PI.

Q) Try MSR: If same target table is loaded multiple times, try MSR for several sections. This will speed the final MERGE step into target table and you may see good CPU gain.

R) Try OLAP Functions: 
Check if replacing co-related sub query with OLAP function may result in AMPCPU saving.

S) Avoid DUPLICATE data: If the join columns in the tables involved in the query have duplicates. Use Distinct or Group by, load into a volatile table, collect stats and use the volatile table.

T) Use Proper JOINS: If joins used, donít use right outer, left or full joins where inner joins is sufficient.

U) User proper ALIAS: Check the aliases in the joins. Small mistake could lead to a product join.

V) Avoid CAST: Avoid unnecessary casting for DATE columns. Once defined as DATE, you can compare date columns against each other even when they are in different format. Internally, DATE is stored as INTEGER. CAST is required mainly when you have to compare VARCHAR value as DATE.

W) Avoid UDF: 
Most of the functions are available in Teradata for data manipulations. So avoid User Defined Functions

X) Avoid FULL TABLE SCAN: Try to avoid FTS scenarios like SI should be defined on the columns which are used as part of joins or Alternate access path. Collect stats on SI columns else there are chances where optimizer might go for FTS even when SI is defined on that particular column

Y) Avoid using IN/NOT IN: For large list of values, avoid using IN /NOT IN in SQLs. Write large list values to a temporary table and use this table in the query

Z) Use CONSTANTS: 
Use constants to specify index column contents whenever possible, instead of specifying the constant once, and joining the tables. This may provide a small savings on performance.

No comments:

Post a Comment