Thursday, September 25, 2014

Secondary Index

Secondary Index

Before going to learn about Secondary Index, i would like to suggest you learn Primary Index.

A Secondary Index (SI) is an alternate data access path. It allows you to access the data without having to do a full-table scan.

You can drop and recreate secondary indexes dynamically, as they are needed.Secondary Indexes are stored in separate subtables that requires additional disk space and maintenance which is handled automatically by the system.

The entire purpose for the Secondary Index Subtable will be to point back to the real row in the base table via the Row-ID. 

Secondary Index Rules
Rule 1: Secondary Indexes are optional. 
Rule 2: Secondary Index values can be unique or non-unique.
Rule 3: Secondary Index values can be NULL.
Rule 4: Secondary Index values can be modified.
Rule 5: Secondary Indexes can be changed.
Rule 6: A Secondary Index has a limit of 64 columns.

Like Primary Indexes,Secondary Indexes also two Types (USI or NUSI).

Unique Secondary Index (USI) serves two purposes. 
  • Enforces uniqueness on a column or group of columns. The database will check USIs to see if the values are unique.
  • Speeds up access to a row (data retrieval speed).
When A USI is created Teradata will immediately build a secondary index subtable on each AMP.
Each AMP will then hash the secondary index value for each of their rows in the base table.

In the Below diagram,each AMP hashes the Emp_no column for all employee rows they hold.

The output of the Emp_no hash will utilize the hash map to point to a specific AMP and that AMP will hold the secondary index subtable row for the secondary index value. 

That means the subtable row will hold the base table Row-ID and Teradata will then find the base row immediately.

Syntax to create a Unique Secondary Index is:
CREATE UNIQUE INDEX (Column/Columns) ON .;
TeradataWiki-Teradata USI

Non-Unique Secondary Index (NUSI) is usually specified to prevent full-table scans, in which every row of a table is read. 

When A NUSI is created Teradata will immediately build a secondary index subtable on each AMP.

Each AMP will hold the secondary index values for their rows in the base table only. In our example, each AMP holds the name column for all employee rows in the base table on their AMP (AMP local).

Each AMP Local name will have the Base Table Row-ID (pointer)so the AMP can retrieve it quickly if needed. If an AMP contains duplicate first names, only one subtable row for that name is built with multiple Base Row-IDs.

Syntax to create a Non-Unique Secondary Index is:
CREATE INDEX (Column/Columns) ON .;

There can be up to 32 Secondary Indexes on a table
An USI is always a Two-AMP operation so it is almost as fast as a Primary Index, but a NUSI is an All-AMP operation, but not a Full Table Scan.


Data Access With USI:

TeradataWiki-Teradata USI
  • The SQL is submitted, specifying a USI (in this case, a customer number of 56).
  • The hashing algorithm calculates a row hash value (in this case, 602).
  • The hash map points to the AMP containing the subtable row corresponding to the row hash value (in this case, AMP 2).
  • The subtable indicates where the base row resides (in this case, row 778 on AMP 4).
  • The message goes back over the BYNET to the AMP with the row and the AMP accesses the data row (in this case, AMP 4).
  • The row is sent over the BYNET to the PE, and the PE sends the answer set on to the client application.
  • As shown in the example above, accessing data with a USI is typically a two-AMP operation. However, it is possible that the subtable row and base table row could end up being stored on the same AMP, because both are hashed separately. If both were on the same AMP, the USI request would be a one-AMP operation.

Data Access With NUSI:
TeradataWiki-Teradata USI
  • The SQL is submitted, specifying a NUSI (in this case, a last name of "Smith").
  • The hashing algorithm calculates a row hash value for the NUSI (in this case, 567).
  • All AMPs are activated to find the hash value of the NUSI in their index subtables. The AMPs whose subtables contain that value become the participating AMPs in this request (in this case, AMP1 and AMP2). The other AMPs discard the message.
  • Each participating AMP locates the row IDs (row hash value plus uniqueness value) of the base rows corresponding to the hash value (in this case, the base rows corresponding to hash value 567 are 640, 222, and 115).
  • The participating AMPs access the base table rows, which are located on the same AMP as the NUSI subtable (in this case, one row from AMP 1 and two rows from AMP 2).
  • The qualifying rows are sent over the BYNET to the PE, and the PE sends the answer set on to the client application (in this case, three qualifying rows are returned).

No comments:

Post a Comment