Tuesday, October 27, 2015

TD14.10 Statistics : “Threshold” option

TD14.10 Statistics : “Threshold” option

When you submit a COLLECT STATISTICS statement in 14.10, it may or may not execute.  A decision is made whether or not there is a value in recollecting these particular statistics at the time they are submitted.  That decision is only considered if threshold options are being used.

There are three levels of Threshold options

System threshold:   This is the default approach for applying thresholds for all 14.10 platforms.  This is default approach determines the appropriate threshold for each statistic and considers how much the underlying table has changed since the last collection.
The degree of change to the table since the last collection is compared against the current state of the table, based on USECOUNT tracking of inserts, deletes and updates performed.   If the change threshold has not been reached, and enough history has been collected for this statistics so that the optimizer can perceive a pattern in the data such that extrapolations can be confidently performed, then this statistics collection will be skipped.
       There is a DBS Control record parameter called SysChangeThresholdOption

DBA-defined global thresholds:  This thresholds override the system default, and rely on DBA-defined fixed percentages as thresholds.   Once set, all statistics collection statements will use these global threshold values, unless overridden by the third level of threshold options at the statement level.
Two parameters :
DefaultUserChangeThreshold:
DefaultTimeThreshold:

Thresholds on individual statements:  Optional USING clauses that are attached to COLLECT STATISTICS statements can override the system default or any global DBA-defined thresholds.

COLLECT STATISTICS USING THRESHOLD 8% AND THRESHOLD 10 DAYS
COLUMN column1 ON Table1;


Whichever threshold level is set, if the optimizer determines that the threshold has not been met, no statistics will be collected, even though they have been submitted.  When a collection has been asked for but has not been executed, a StatsSkipCount column in the DBC.StatsV row that represents this statistics will be incremented.  


DBQL Usecount and Threshold

The recommended percent of change thresholds rely on having DBQL USECOUNT logging turned on. USECOUNT tracks inserts, deletes and updates made to tables within a database, and as a result, can provide highly accurate information to the optimizer about how the table has changed since the last statistics collection.

The default system threshold functionality is able to be applied to a statistic collection only if USECOUNT logging has been enabled for the database that the statistics collection table belongs to.   In the absence of USECOUNT data, the default threshold behaviour will be ignored. However, both DBA-defined global thresholds and statement-based thresholds are able to use percent of change thresholds even without USECOUNT logging, but with the risk of less accuracy.

 In the cases where USECOUNT logging is not enabled, percent of change values are less reliable because the optimizer must rely on random AMP sample comparisons.
Percent of change is the recommended way to express thresholds.


Example :
Pick a small, non-critical database.
Enable DBQL USECOUNT logging on that database:
BEGIN QUERY LOGGING WITH USECOUNT ON TestDB;
Disable the system threshold parameter by setting DBS Control setting:
SysChangeThresholdOption = 3

Leave the global parameters disabled as they are by default:
DefaultUserChangeThreshold = 0

DefaultTimeThreshold = 0

Add USING THRESHOLD 10 PERCENT to the statistics collection statements just for the tables within your selected database:
COLLECT STATISTICS USING THRESHOLD 10% COLUMN TestID ON TestDB;
Insert a few rows into the table (less than 10% of the table size) and run an Explain of the statistics collection statement itself, and it will tell you whether or not skipping is taking place.
EXPLAIN COLLECT STATISTICS COLUMN TestID ON TestDB;

Monday, October 26, 2015

Monitor Flow Control

Monitor flow control state:

Refer table DBC.ResUsageSAWT to monitor flow control state also consider snapshot interval (10 mins default)


Column Definition of ResUsageSAWT

TheDate: Returns the calendar date on which the event was logged.

NodeID: Returns Node ID.

TheTime: Returns the time of the event in theform hh:mm:ss.

VprId: Identifies the AMP vproc.

Active: The Active controls whether or not the rows will be logged to the resource usage tables if Active Row Filter Mode is enabled.

MailBoxDepth: Current depth of the AMP work mailbox at the end of the period.
High AWT usage does not necessarily mean that the system is in flow control. Only if all AMP worker tasks are in use and the message queues are filled, flow control state is initiated.
While often MailBoxDepth is > 0 if the AMP is in flow control state, it’s not always like this because this measure presents only the size of the message queue at the end of each snapshot interval. It could be, that the AMP was in flow control state during the snapshot interval, but the queue is empty at the end of the interval. Keep this in mind.

FlowControlled: Specifies if an AMP is in flow control. If the value is non-zero, then the AMP is in flow control.

FlowCtlCnt: Returns the number of times this log period that the node entered the flow control state from a non-flow controlled state.

FlowCtlTime: Returns total time, in milliseconds, that an AMP is in flow control.

InuseMax: Returns maximum number of AMP WorkerTasks in use at any one time during the log period.By checking the the InUseMax measure, you will see how the total system workload is consuming AMP worker tasks, which means: how loaded your system is

Available: Returns the number of unreserved AMP Worker Tasks from the pool that are not being used at the end of the interval.
Available shows the number of unused AMP worker tasks at the end of the snapshot interval. A consecutive number of snapshots having Available = 0 means that your system ran out of AMP worker tasks over a longer time frame.

AvailableMin: Returns the minimum number of unreserved AMP Worker Tasks available in the pool for each AMP for the logged period. In other words, It is measuring the minimum number of AMP worker tasks having been used at one point in time during the snapshot interval

AwtLimit: Returns the current setting for AMP Worker Task (for example, 80, 100, or so on) in the DBS Control MaxLoadAWT field.

WorkTypeInuse00 - 15: Returns current number of AMP WorkerTasks in use for each work type.

WorkTypeMax00 - 15: Returns maximum number of AMP WorkerTasks in use at one time during the log period for each work type for the VprId vproc.

SELECT
TheDate
,TheTime
,Secs
,VPRID as TheAMP
,MailBoxDepth
,FlowCOntrolled
,FlowCtlCnt
,InUseMax
,Available
,AvailableMin
,AWTLimit
,WorkTypeMax00 AS DispatcherStep
,WorkTypeMax01 AS Spawned_Level1
,WorkTypeMax02 AS Spawned_Level2
,WorkTypeMax03 AS InternalWork
,WorkTypeMax04 AS Recovery
,WorkTypeMax08 AS ExpeditedDispatcherStep
,WorkTypeMax09 AS ExpeditedSpawned_Level1
,WorkTypeMax10 AS ExpeditedSpawned_Level2
,WorkTypeMax12 AS AbortStep
,WorkTypeMax13 AS SpawnedWorkAbort
,WorkTypeMax14 AS UrgentInternalWork
,WorkTypeMax15 AS MostUrgentInterbalWork
FROM DBC.ResUsageSAWT
WHERE thedate = date -2
and FlowCtlTime > 0 ;