25 Aug 2010

Database vs ETL processing - a comparison

Source data

The POINT table/file holds 100 records similar to this. Point ranges from 1000 to 5000.

CustNo
Point
1054
2781
1055
3050
1056
3714
1057
3385
1058
3910
1059
3041
1060
3301
1061
4615
1062
4300
1063
5126

The CUSTOMER table/file holds 100 records similar to shown below.

CustNo
Name
Address
Town
Postcode
Sex
1054
Kirby Roman
390-4028 Vitae Avenue
Twin Falls
QO8 9DS
F
1055
Brianna Kaufman
Ap #414-8109 Nunc St.
Aberdeen
MD01 6PY
F
1056
Quon Maldonado
1214 Mi St.
Nevada City
EJ7 3BW
F
1057
Sarah Burke
897-4509 Integer Rd.
DuBois
J2 0ZW
F
1058
Alice Pickett
3730 Sed Street
Grambling
C6G 8YI
F
1059
Odette Richard
P.O. Box 920, 6838 Felis Av.
Oklahoma City
O5 0QM
F
1060
Ursa Nunez
310 Euismod Street
Pierre
P30 8FZ
F
1061
Caryn Walker
Ap #969-5153 Ipsum. Rd.
Idaho Falls
GF1 2VR
F
1062
Dai Lawrence
907-2556 Suspendisse Rd.
Pass Christian
B3R 2JV
F
1063
Claudia Calhoun
P.O. Box 932, 3401 Cras Avenue
Poughkeepsie
XU8 7LC
F

Requirement

1. Select only customers with points > 1000
2. Join with customer details on CustNo to populate customer details with points
3. Based on customer gender, create salutation field in output as Mr/Ms First name Surname
4. Create 3 feeds named Gold, Silver & Bronze based on following logic
• Gold - point > 4001
• Silver - point between 2001 and 4000
• Bronze - point between 1001 and 2000

All output feeds will have following fields:

CustNo
Salutation
Address
City PostCode
Point
Category (Gold/Silver/Bronze)

The output will look like this.

CustNo
Name
Address
Town Postcode
Point
Category
1073
Ms Cassidy Stokes
9679 Malesuada Ave
Sedalia BX4G 0HV
4120
Gold
1074
Ms Germane Olson
P.O. Box 631, 2307 Lobortis Road
Darlington N1H 3IM
5402
Gold
1044
Mr Emery Andrews
6721 Quis Av.
Needham X3Z 9SJ
3158
Silver
1045
Mr Leonard Logan
P.O. Box 554, 591 Massa. Road
Minnetonka LF3F 6ON
2569
Silver
1030
Mr Stewart Sanford
Ap #246-7429 Vitae Av.
Gadsden V26 6GZ
1522
Bronze

NB. There will actually be 3 output feeds - for gold, silver and bronze customers.

ETL Solution

The Clover ETL solution will be like shown in next figure.




Database Solution

The solution shown here uses Oracle database.

There will be a table named PROCESS_FLOW which will control the overall flow as explained further.

The sample contents of this table will be like this for the particular example we are discussing here.


STEP
SOURCE_VIEW
TARGET_TABLE
ACTION
DESCRIPTION
10
CUSTOMER_FILTER
CUSTOMER_FILTER_TABLE
TRUNCATE INSERT
selecting customers with point > 1000
20
CUSTOMER_JOIN
CUSTOMER_JOIN_TABLE
TRUNCATE INSERT
join above step with customer on CustNo
30
CUSTOMER_GOLD
CUSTOMER_GOLD_TABLE
TRUNCATE INSERT
select category = "GOLD" from above output



CREATE OR REPLACE VIEW CUSTOMER_FILTER AS SELECT * FROM POINT WHERE POINT > 1000;
CREATE TABLE CUSTOMER_FILTER_TABLE AS SELECT * FROM CUSTOMER_FILTER WHERE 1=2;
INSERT INTO CUSTOMER_FILTER_TABLE SELECT * FROM CUSTOMER_FILTER;
COMMIT;

CREATE OR REPLACE VIEW CUSTOMER_JOIN AS
SELECT P.CUSTNO, C.CUSTNAME, C.ADDRESS, C.TOWN, C.POSTCODE, P.POINT, C.SEX
,CASE WHEN P.POINT > 4000 THEN 'GOLD' WHEN (P.POINT > 2000 AND P.POINT <= 4000) THEN 'SILVER' WHEN (P.POINT > 1000 AND P.POINT <= 2000) THEN 'BRONZE' END CATEGORY FROM CUSTOMER_FILTER_TABLE P JOIN CUSTOMER C ON P.CUSTNO = C.CUSTNO; INSERT INTO CUSTOMER_JOIN_TABLE SELECT CJ.CUSTNO, DECODE(SEX,'M','Mr','F','Ms','??') || ' ' || CJ.CUSTNAME , CJ.ADDRESS, CJ.TOWN || ' ' || CJ.POSTCODE , NULL,CJ.POINT, CJ.SEX,CJ.CATEGORY FROM CUSTOMER_JOIN CJ; COMMIT; SELECT * FROM CUSTOMER_JOIN_TABLE; -- POSTCODE WILL BE NULL AS WE COMBINED THAT WITH TOWN IN PREVIOUS FIELD CREATE OR REPLACE VIEW CUSTOMER_GOLD AS SELECT * FROM CUSTOMER_JOIN_TABLE WHERE CATEGORY = 'GOLD'; CREATE TABLE CUSTOMER_GOLD_TABLE AS SELECT * FROM CUSTOMER_GOLD WHERE 1=2; INSERT INTO CUSTOMER_GOLD_TABLE SELECT * FROM CUSTOMER_GOLD; COMMIT; SELECT * FROM CUSTOMER_GOLD; -- list of GOLD customers


So, we can see how involved database processing steps are!

Difference between DB & ETL processing

Topic
DB
ETL
Parallel processing
DB process is sequential.
ETL process is parallel.
Phasing
DB process each step in a different phase.
Independent ETL components may be in same phase.
I/O
DB process lands data in disk (= tables).
ETL process can run in pipeline manner, without landing data to disk thus reducing I/O.
Intuitiveness
DB process is not intuitive. It makes maintaining them very difficult.
ETL process is a diagram. Very easy to understand and modify.
Platform specific
DB process is DB specific.
ETL process is generic. Heterogeneous environments can implement ETL processes in similar fashion.
Debugging
Debugging in DB is very cumbersome.
Debugging on ETL is often visual - so easy to spot problem.

Conclusion

An ETL process is much easier to maintain than DB specific process.

15 Aug 2010

Car handling terms - what they mean?

Sharp handling - the car goes exactly where you points steering and intend to go. It should not lose grip either. If you try to drive a family car over 60 km/h in a twisted road, you may find that the car may drift towards wrong side of central line marker! This means the handling is not sharp. Sometimes this can be taken care by traction control.

Sharp handling also means steering has to be precise ie car will go exactly where intend it to go. It will neigher understeer (mostly FWD) and not oversteer (mostly RWD). Vague steering also indicates driver is having difficulty judging how much exactly the car will turn if he moves steering a precise angle.

Body roll - it means when you take a sharp turn, the inside suspension pushes the car down and outside suspension pushes it up - resulting a feeling in body as if it is rolling on one side. A softer suspension will contribute to more body roll and vice versa for firm suspension. That is why most sports car has a firm suspension. On other hand, too much firm suspension means unfortable ride over not so smooth road surfaces.

Punchy/nippy/rev happy engine - means if you plot a graph of time vs RPM, it is quite steep. This also ensures car will have rapid acceleration. If this curve is flatter, then the engine is often termed as agricultural :)

Wallowy ride - more applicable to American cars with soft suspension. This is similar to ride you get in boats.

Engine running out of puff - it means engine has reached its max power output and unable to speed up the car any more.

A sports car will have - sharp handling, precise steering, firm suspension, good acceleration and good grip. Sometimes tyres are designed to sacrifice wet grip in favor of dry grip.

Handling of cars also depends on engine layout like FWD, RWD or mid-engined cars.

6 Aug 2010

Location of UK Motorway Services

You can plot this on online maps, Sat Navs or can just print and keep a copy in your car.

Latitude|Longitude|ServiceName
55.56536|-1.79202|A1 - Adderstone
52.80251|-0.61043|A1 - Colsterworth
52.94826|-0.67924|A1 - Great Gonerby
55.92459|-3.04416|A1 - Inveresk
54.44339|-1.66665|A1 - Scotch Corner
52.01373|-0.20123|A1(M) - J10 - Baldock
52.53187|-0.32212|A1(M) - J17 - Peterborough
53.38666|-1.05934|A1(M) - J34 - Blyth
54.72834|-1.5241|A1(M) - J61 - Durham
54.88905|-1.55962|A1(M) - J64/J65 - Washington (N)
54.89076|-1.55858|A1(M) - J64/J65 - Washington (S)
54.95219|-2.97988|A74 - Rockcliffe (S)
55.2193|-3.4112|A74(M) - J16/J17 - Annandale Water Services
56.77403|-2.60878|A90 - Inchbare
51.94615|-0.50183|M1 - J11/J12 - Toddington (N)
51.94876|-0.50297|M1 - J11/J12 - Toddington (S)
52.0823|-0.74831|M1 - J14/J15 - Newport Pagnell (N)
52.20834|-0.94255|M1 - J15a - Rothersthorpe (N)
52.21074|-0.9456|M1 - J15a - Rothersthorpe (S)
52.30564|-1.12241|M1 - J16/J17 - Watford Gap (N)
52.30815|-1.12252|M1 - J16/J17 - Watford Gap (S)
52.6178|-1.20609|M1 - J21/21a - Leicester Forest East (N)
52.62047|-1.20541|M1 - J21/21a - Leicester Forest East (S)
52.69617|-1.29516|M1 - J22 - Markfield, Leicester
52.82343|-1.3069|M1 - J23a/J24 - Donnington Park
52.96144|-1.26937|M1 - J25/J26 - Trowell (N)
52.96301|-1.26526|M1 - J25/J26 - Trowell (S)
53.13821|-1.33194|M1 - J28/J29 - Tibshelf (N)
51.63436|-0.26685|M1 - J3 - London Gateway
53.31454|-1.28338|M1 - J30/J31 - Woodall (N)
53.31658|-1.28088|M1 - J30/J31 - Woodall (S)
53.62202|-1.54912|M1 - J38/J39 - Wooley Edge (N)
53.62205|-1.54652|M1 - J38/J39 - Wooley Edge (S)
51.87115|0.19678|M11 - J8 - Birchanger Green, Bishop Stortford
51.34138|0.60594|M2 - J4/J5 - Medway (E)
51.34118|0.61019|M2 - J4/J5 - Medway (W)
51.26467|0.61489|M20 - J8 - Maidstone
51.0843|-0.198|M23 - J11 - Peas Pottage
51.68771|-0.2247|M25 - J23 - South Mimms
51.49118|0.26966|M25 - J30(S)/J31(N) - Thurrock
51.27259|0.03684|M25 - J5/J6 - Clacket Lane (E)
51.27259|0.04298|M25 - J5/J6 - Clacket Lane (W)
50.9578|-1.44831|M27 - J3/J4 - Rownhams (E), Toothill
50.95724|-1.44803|M27 - J3/J4 - Rownhams (W), Toothill
51.29609|-0.85786|M3 - J4a/J5 - Fleet (E)
51.29583|-0.85385|M3 - J4a/J5 - Fleet (W)
51.1193|-1.2545|M3 - J8/J9 - Winchester (N)
51.11709|-1.25358|M3 - J8/J9 - Winchester (S)
51.42395|-1.03164|M4 - J11/J12 - Reading (E)
51.42236|-1.03191|M4 - J11/J12 - Reading (W)
51.44961|-1.31033|M4 - J13 - Chieveley
51.48438|-1.558|M4 - J14/J15 - Membury (E)
51.48029|-1.5556|M4 - J14/J15 - Membury (W)
51.51041|-2.16246|M4 - J17/J18 - Leigh Delamere (E)
51.51046|-2.15414|M4 - J17/J18 - Leigh Delamere (W)
51.48828|-0.3901|M4 - J2/3 - Heston (E)
51.48737|-0.39119|M4 - J2/J3 - Heston (W)
51.5881|-2.83879|M4 - J23a - Magor
51.53992|-3.13093|M4 - J30 - Cardiff Gate
51.50813|-3.31019|M4 - J33 - Cardiff West
51.53412|-3.57646|M4 - J36 - Sarn Park
51.67886|-3.9957|M4 - J47 - Swansea
51.74723|-4.06447|M4 - J49 - Pont Abraham
51.94926|-1.20103|M40 - J10 - Cherwell Valley
52.21912|-1.50489|M40 - J15 - Warwick (E)
52.21661|-1.50345|M40 - J15 - Warwick (W)
51.73875|-1.09747|M40 - J8/J8a - Oxford
52.60649|-1.64171|M42 - J10 - Tamworth
52.36315|-1.94847|M42 - J2 - Hopwood Park
51.60307|-2.6199|M48 - J1 - Severn View, Aust
51.65548|-2.43319|M5 - J13/J14 - Michaelwood (N)
51.65846|-2.4245|M5 - J13/J14 - Michaelwood (S)
51.47695|-2.7092|M5 - J19 - Gordano
51.2682|-2.92335|M5 - J21/J22 - Sedgemoor (N)
51.26984|-2.92026|M5 - J21/J22 - Sedgemoor (S)
51.1043|-2.99566|M5 - J24 - Bridgewater
50.97702|-3.14878|M5 - J25/J26 - Taunton Deane (N)
50.97689|-3.14479|M5 - J25/J26 - Taunton Deane (S)
50.86291|-3.38429|M5 - J28/J29 - Cullompton
52.42797|-2.01871|M5 - J3/J4 - Frankley (N)
52.42943|-2.01677|M5 - J3/J4 - Frankley (S)
50.71535|-3.46514|M5 - J30 - Exeter, Sowton
52.06376|-2.15662|M5 - J7/J8 - Stensham (N)
52.05816|-2.14178|M5 - J7/J8 - Strensham (S)
51.92828|-2.57007|M50 (A449) - J4 - Ross Spur (N)
52.642|-2.0564|M6 - J10a/J11 - Hilton Park (N)
52.6447|-2.05646|M6 - J10a/J11 - Hilton Park (S)
52.88319|-2.17023|M6 - J14/J15 - Stafford (N)
52.87454|-2.16433|M6 - J14/J15 - Stafford (S)
52.99285|-2.28771|M6 - J15/J16 - Keele (N)
52.99429|-2.29155|M6 - J15/J16 - Keele (S)
53.13793|-2.33777|M6 - J16/J17 - Sandbach (N)
53.14009|-2.33597|M6 - J16/J17 - Sandbach (S)
53.3|-2.40236|M6 - J18/J19 - Knutsford (N)
53.30168|-2.40079|M6 - J18/J19 - Knutsford (S)
53.35948|-2.50495|M6 - J21 - Lymm
53.63039|-2.69083|M6 - J27/J28 - Charnock Richard (N)
53.63254|-2.69071|M6 - J27/J28 - Charnock Richard (S)
52.4726|-1.54965|M6 - J3/J3a - Corley (E)
52.46971|-1.54583|M6 - J3/J3a - Corley (W)
53.96069|-2.75907|M6 - J32/J33 - Lancaster (N)
53.96335|-2.76077|M6 - J32/J33 - Lancaster (S)
54.17816|-2.73467|M6 - J35/J36 - Burton-in-Kendal (N)
54.316|-2.637|M6 - J36/J37 - Killington Lake (S)
54.44932|-2.60687|M6 - J38/J39 - Tebay (N)
54.7954|-2.8703|M6 - J41/J42 - Southwaite (N)
54.80111|-2.87067|M6 - J41/J42 - Southwaite (S)
52.66494|-1.96975|M6 Toll - JT6/J7 - North Canes
53.59834|-2.57331|M61 - J6/J8 - Bolton West (N)
53.60193|-2.57417|M61 - J6/J8 - Bolton West (S)
53.56734|-2.23402|M62 - J18/J19 - Birch (N)
53.56835|-2.22872|M62 - J18/J19 - Birch (S)
53.71129|-1.74594|M62 - J25/J26 - Hartshead Moor (N)
53.715|-1.74454|M62 - J25/J26 - Hartshead Moor (S)
53.69712|-1.26613|M62 - J33 - Ferrybridge
53.41677|-2.64037|M62 - J8 - Burtonwood (N)
53.4171|-2.63419|M62 - J8 - Burtonwood (S)
53.71469|-2.47813|M65 - J4 - Blackburn with Darwen
55.5832|-3.8223|M74 - J11 - Happendon
55.5062|-3.69535|M74 - J13/J14 - Abington
55.0088|-3.0853|M74 - J21/J22 - Gretna Green
55.81242|-4.06159|M74 - J4/J5 - J4-5 Bothwell (S)
55.7898|-4.03735|M74 - J5/J6 - Hamilton (N)
55.8638|-3.76153|M8 - J4/J5 - Heart Of Scotland, Harthill (E)
55.86289|-3.75707|M8 - J4/J5 - Heart Of Scotland, Harthill (W)
56.07594|-3.92238|M80 - J6 - Stirling
56.07594|-3.92238|M9 - J9 - Stirling
56.21012|-3.43941|M90 - J6 - Kinross