Thursday, 14 February 2013

Generating a sequential Surrogate Key in Teradata

This has to the commonest problem of all in the Data Warehousing World. And people suggest different ways to tackle this based on their comfort level. Here I show an approach to create a sequentially increasing surrogate key without the use of variables or CSUM.

CSUM is a deprecated feature and Teradata strongly discourages its use. This can be found in page 339 of the Teradata manual here


/*Create Source Table*/
create table src_tbl
(pk int,
dta varchar(10)
);

/*Insert rows in source */

insert into src_tbl(101, 'Anne');
insert into src_tbl(102, 'Betty');
insert into src_tbl(103, 'Cherrie');

/*Create Target Table*/
create table tgt_tbl
(sk int,
dta varchar(10)
);

/*Insert a row to simulate some data already present in target*/

insert into tgt_tbl (1, 'Anne');

/*Check Values are inserted */

select * from tgt_tbl;

select * from src_tbl;

/*Generate new values for source rows that are to be inserted */

sel (sel max(sk) from tgt_tbl) + row_number() over(order by src.dta) tgt_sk --<--- Surrogate Key Generated
, src.dta
from src_tbl src
left join tgt_tbl tgt
on src.dta = tgt.dta
where tgt.dta is null;

The Result is shown below

tgt_sk dta
2 Betty
3 Cherrie

No comments: