6 Dec 2010

Linux Distro watch - again

I was recently playing again with some more Linux distros.

PinguyOS, which is based on Ubuntu, is very good out of the box distro. You can run it from live CD or from USB.

The flash support is built in and YouTube videos open fine.

But it has a big drawback - you can use persistent storage feature. Even though it exists in theory I could not make it work!

That's why I still can't praise it enough.

One reason why typical Windows users will be willing to use Linux is they want to play with a different OS.

For them, being able to boot from USB and save settings and other data is important. Also flash support etc. should be very easy to set up.

If possible, USB disk should be accessible from Windows system (with SuSE, Windows could not read the drive).

The distro should also come with common packages Open Office, although it is not that imporant as web based office apps are quite handy nowadays.

I tried EasyPeasy Linux and didn't really like the interface.

Lately I also poked around SuSE 11.3 It now has persistence feature created on the fly at first reboot. It's interface is eye catching. But some useful apps are not put into shortcuts by default! It froze once during my brief testing.

So, at the end, my Linux distro favorites still remain the same - Fedora, Ubuntu and SuSE (in that order).

1 Sept 2010

Ab Initio vs Clover ETL

Among all ETL tools used, Clover ETL comes closes to Ab Initio in look and feel.

I felt immediately at home with Clover's Designer (equivalent to Ab Initio's GDE).

Here is a 1-to-1 equivalent of Clover's components compared with Ab Initio's.

Ab Initio Clover ETL
Input files Universal Data Reader
Output files Universal Data Writer
Reformat Reformat
Join (in memory) ExtHashJoin
Join (sorted) ExtMergeJoin
Run component SystemExecute, DBExcute, RunGraph
Join with DB DBJoin
Lookup Lookup Join
DB Table DB Input Table
Trash Trash
Filter ExtFilter
Sort ExtSort
Sort within Group SortWithinGroups
Replicate SimpleCopy
Dedup Dedup
Concatenate Concatenate
Gather SimpleGather
Merge Merge
Partition Partition
Aggregate (deprecated) Aggregate
Normalize Normalizer
Denormalize Denormalizer
Rollup Rollup

25 Aug 2010

Database vs ETL processing - a comparison

Source data

The POINT table/file holds 100 records similar to this. Point ranges from 1000 to 5000.

CustNo
Point
1054
2781
1055
3050
1056
3714
1057
3385
1058
3910
1059
3041
1060
3301
1061
4615
1062
4300
1063
5126

The CUSTOMER table/file holds 100 records similar to shown below.

CustNo
Name
Address
Town
Postcode
Sex
1054
Kirby Roman
390-4028 Vitae Avenue
Twin Falls
QO8 9DS
F
1055
Brianna Kaufman
Ap #414-8109 Nunc St.
Aberdeen
MD01 6PY
F
1056
Quon Maldonado
1214 Mi St.
Nevada City
EJ7 3BW
F
1057
Sarah Burke
897-4509 Integer Rd.
DuBois
J2 0ZW
F
1058
Alice Pickett
3730 Sed Street
Grambling
C6G 8YI
F
1059
Odette Richard
P.O. Box 920, 6838 Felis Av.
Oklahoma City
O5 0QM
F
1060
Ursa Nunez
310 Euismod Street
Pierre
P30 8FZ
F
1061
Caryn Walker
Ap #969-5153 Ipsum. Rd.
Idaho Falls
GF1 2VR
F
1062
Dai Lawrence
907-2556 Suspendisse Rd.
Pass Christian
B3R 2JV
F
1063
Claudia Calhoun
P.O. Box 932, 3401 Cras Avenue
Poughkeepsie
XU8 7LC
F

Requirement

1. Select only customers with points > 1000
2. Join with customer details on CustNo to populate customer details with points
3. Based on customer gender, create salutation field in output as Mr/Ms First name Surname
4. Create 3 feeds named Gold, Silver & Bronze based on following logic
• Gold - point > 4001
• Silver - point between 2001 and 4000
• Bronze - point between 1001 and 2000

All output feeds will have following fields:

CustNo
Salutation
Address
City PostCode
Point
Category (Gold/Silver/Bronze)

The output will look like this.

CustNo
Name
Address
Town Postcode
Point
Category
1073
Ms Cassidy Stokes
9679 Malesuada Ave
Sedalia BX4G 0HV
4120
Gold
1074
Ms Germane Olson
P.O. Box 631, 2307 Lobortis Road
Darlington N1H 3IM
5402
Gold
1044
Mr Emery Andrews
6721 Quis Av.
Needham X3Z 9SJ
3158
Silver
1045
Mr Leonard Logan
P.O. Box 554, 591 Massa. Road
Minnetonka LF3F 6ON
2569
Silver
1030
Mr Stewart Sanford
Ap #246-7429 Vitae Av.
Gadsden V26 6GZ
1522
Bronze

NB. There will actually be 3 output feeds - for gold, silver and bronze customers.

ETL Solution

The Clover ETL solution will be like shown in next figure.




Database Solution

The solution shown here uses Oracle database.

There will be a table named PROCESS_FLOW which will control the overall flow as explained further.

The sample contents of this table will be like this for the particular example we are discussing here.


STEP
SOURCE_VIEW
TARGET_TABLE
ACTION
DESCRIPTION
10
CUSTOMER_FILTER
CUSTOMER_FILTER_TABLE
TRUNCATE INSERT
selecting customers with point > 1000
20
CUSTOMER_JOIN
CUSTOMER_JOIN_TABLE
TRUNCATE INSERT
join above step with customer on CustNo
30
CUSTOMER_GOLD
CUSTOMER_GOLD_TABLE
TRUNCATE INSERT
select category = "GOLD" from above output



CREATE OR REPLACE VIEW CUSTOMER_FILTER AS SELECT * FROM POINT WHERE POINT > 1000;
CREATE TABLE CUSTOMER_FILTER_TABLE AS SELECT * FROM CUSTOMER_FILTER WHERE 1=2;
INSERT INTO CUSTOMER_FILTER_TABLE SELECT * FROM CUSTOMER_FILTER;
COMMIT;

CREATE OR REPLACE VIEW CUSTOMER_JOIN AS
SELECT P.CUSTNO, C.CUSTNAME, C.ADDRESS, C.TOWN, C.POSTCODE, P.POINT, C.SEX
,CASE WHEN P.POINT > 4000 THEN 'GOLD' WHEN (P.POINT > 2000 AND P.POINT <= 4000) THEN 'SILVER' WHEN (P.POINT > 1000 AND P.POINT <= 2000) THEN 'BRONZE' END CATEGORY FROM CUSTOMER_FILTER_TABLE P JOIN CUSTOMER C ON P.CUSTNO = C.CUSTNO; INSERT INTO CUSTOMER_JOIN_TABLE SELECT CJ.CUSTNO, DECODE(SEX,'M','Mr','F','Ms','??') || ' ' || CJ.CUSTNAME , CJ.ADDRESS, CJ.TOWN || ' ' || CJ.POSTCODE , NULL,CJ.POINT, CJ.SEX,CJ.CATEGORY FROM CUSTOMER_JOIN CJ; COMMIT; SELECT * FROM CUSTOMER_JOIN_TABLE; -- POSTCODE WILL BE NULL AS WE COMBINED THAT WITH TOWN IN PREVIOUS FIELD CREATE OR REPLACE VIEW CUSTOMER_GOLD AS SELECT * FROM CUSTOMER_JOIN_TABLE WHERE CATEGORY = 'GOLD'; CREATE TABLE CUSTOMER_GOLD_TABLE AS SELECT * FROM CUSTOMER_GOLD WHERE 1=2; INSERT INTO CUSTOMER_GOLD_TABLE SELECT * FROM CUSTOMER_GOLD; COMMIT; SELECT * FROM CUSTOMER_GOLD; -- list of GOLD customers


So, we can see how involved database processing steps are!

Difference between DB & ETL processing

Topic
DB
ETL
Parallel processing
DB process is sequential.
ETL process is parallel.
Phasing
DB process each step in a different phase.
Independent ETL components may be in same phase.
I/O
DB process lands data in disk (= tables).
ETL process can run in pipeline manner, without landing data to disk thus reducing I/O.
Intuitiveness
DB process is not intuitive. It makes maintaining them very difficult.
ETL process is a diagram. Very easy to understand and modify.
Platform specific
DB process is DB specific.
ETL process is generic. Heterogeneous environments can implement ETL processes in similar fashion.
Debugging
Debugging in DB is very cumbersome.
Debugging on ETL is often visual - so easy to spot problem.

Conclusion

An ETL process is much easier to maintain than DB specific process.

15 Aug 2010

Car handling terms - what they mean?

Sharp handling - the car goes exactly where you points steering and intend to go. It should not lose grip either. If you try to drive a family car over 60 km/h in a twisted road, you may find that the car may drift towards wrong side of central line marker! This means the handling is not sharp. Sometimes this can be taken care by traction control.

Sharp handling also means steering has to be precise ie car will go exactly where intend it to go. It will neigher understeer (mostly FWD) and not oversteer (mostly RWD). Vague steering also indicates driver is having difficulty judging how much exactly the car will turn if he moves steering a precise angle.

Body roll - it means when you take a sharp turn, the inside suspension pushes the car down and outside suspension pushes it up - resulting a feeling in body as if it is rolling on one side. A softer suspension will contribute to more body roll and vice versa for firm suspension. That is why most sports car has a firm suspension. On other hand, too much firm suspension means unfortable ride over not so smooth road surfaces.

Punchy/nippy/rev happy engine - means if you plot a graph of time vs RPM, it is quite steep. This also ensures car will have rapid acceleration. If this curve is flatter, then the engine is often termed as agricultural :)

Wallowy ride - more applicable to American cars with soft suspension. This is similar to ride you get in boats.

Engine running out of puff - it means engine has reached its max power output and unable to speed up the car any more.

A sports car will have - sharp handling, precise steering, firm suspension, good acceleration and good grip. Sometimes tyres are designed to sacrifice wet grip in favor of dry grip.

Handling of cars also depends on engine layout like FWD, RWD or mid-engined cars.

6 Aug 2010

Location of UK Motorway Services

You can plot this on online maps, Sat Navs or can just print and keep a copy in your car.

Latitude|Longitude|ServiceName
55.56536|-1.79202|A1 - Adderstone
52.80251|-0.61043|A1 - Colsterworth
52.94826|-0.67924|A1 - Great Gonerby
55.92459|-3.04416|A1 - Inveresk
54.44339|-1.66665|A1 - Scotch Corner
52.01373|-0.20123|A1(M) - J10 - Baldock
52.53187|-0.32212|A1(M) - J17 - Peterborough
53.38666|-1.05934|A1(M) - J34 - Blyth
54.72834|-1.5241|A1(M) - J61 - Durham
54.88905|-1.55962|A1(M) - J64/J65 - Washington (N)
54.89076|-1.55858|A1(M) - J64/J65 - Washington (S)
54.95219|-2.97988|A74 - Rockcliffe (S)
55.2193|-3.4112|A74(M) - J16/J17 - Annandale Water Services
56.77403|-2.60878|A90 - Inchbare
51.94615|-0.50183|M1 - J11/J12 - Toddington (N)
51.94876|-0.50297|M1 - J11/J12 - Toddington (S)
52.0823|-0.74831|M1 - J14/J15 - Newport Pagnell (N)
52.20834|-0.94255|M1 - J15a - Rothersthorpe (N)
52.21074|-0.9456|M1 - J15a - Rothersthorpe (S)
52.30564|-1.12241|M1 - J16/J17 - Watford Gap (N)
52.30815|-1.12252|M1 - J16/J17 - Watford Gap (S)
52.6178|-1.20609|M1 - J21/21a - Leicester Forest East (N)
52.62047|-1.20541|M1 - J21/21a - Leicester Forest East (S)
52.69617|-1.29516|M1 - J22 - Markfield, Leicester
52.82343|-1.3069|M1 - J23a/J24 - Donnington Park
52.96144|-1.26937|M1 - J25/J26 - Trowell (N)
52.96301|-1.26526|M1 - J25/J26 - Trowell (S)
53.13821|-1.33194|M1 - J28/J29 - Tibshelf (N)
51.63436|-0.26685|M1 - J3 - London Gateway
53.31454|-1.28338|M1 - J30/J31 - Woodall (N)
53.31658|-1.28088|M1 - J30/J31 - Woodall (S)
53.62202|-1.54912|M1 - J38/J39 - Wooley Edge (N)
53.62205|-1.54652|M1 - J38/J39 - Wooley Edge (S)
51.87115|0.19678|M11 - J8 - Birchanger Green, Bishop Stortford
51.34138|0.60594|M2 - J4/J5 - Medway (E)
51.34118|0.61019|M2 - J4/J5 - Medway (W)
51.26467|0.61489|M20 - J8 - Maidstone
51.0843|-0.198|M23 - J11 - Peas Pottage
51.68771|-0.2247|M25 - J23 - South Mimms
51.49118|0.26966|M25 - J30(S)/J31(N) - Thurrock
51.27259|0.03684|M25 - J5/J6 - Clacket Lane (E)
51.27259|0.04298|M25 - J5/J6 - Clacket Lane (W)
50.9578|-1.44831|M27 - J3/J4 - Rownhams (E), Toothill
50.95724|-1.44803|M27 - J3/J4 - Rownhams (W), Toothill
51.29609|-0.85786|M3 - J4a/J5 - Fleet (E)
51.29583|-0.85385|M3 - J4a/J5 - Fleet (W)
51.1193|-1.2545|M3 - J8/J9 - Winchester (N)
51.11709|-1.25358|M3 - J8/J9 - Winchester (S)
51.42395|-1.03164|M4 - J11/J12 - Reading (E)
51.42236|-1.03191|M4 - J11/J12 - Reading (W)
51.44961|-1.31033|M4 - J13 - Chieveley
51.48438|-1.558|M4 - J14/J15 - Membury (E)
51.48029|-1.5556|M4 - J14/J15 - Membury (W)
51.51041|-2.16246|M4 - J17/J18 - Leigh Delamere (E)
51.51046|-2.15414|M4 - J17/J18 - Leigh Delamere (W)
51.48828|-0.3901|M4 - J2/3 - Heston (E)
51.48737|-0.39119|M4 - J2/J3 - Heston (W)
51.5881|-2.83879|M4 - J23a - Magor
51.53992|-3.13093|M4 - J30 - Cardiff Gate
51.50813|-3.31019|M4 - J33 - Cardiff West
51.53412|-3.57646|M4 - J36 - Sarn Park
51.67886|-3.9957|M4 - J47 - Swansea
51.74723|-4.06447|M4 - J49 - Pont Abraham
51.94926|-1.20103|M40 - J10 - Cherwell Valley
52.21912|-1.50489|M40 - J15 - Warwick (E)
52.21661|-1.50345|M40 - J15 - Warwick (W)
51.73875|-1.09747|M40 - J8/J8a - Oxford
52.60649|-1.64171|M42 - J10 - Tamworth
52.36315|-1.94847|M42 - J2 - Hopwood Park
51.60307|-2.6199|M48 - J1 - Severn View, Aust
51.65548|-2.43319|M5 - J13/J14 - Michaelwood (N)
51.65846|-2.4245|M5 - J13/J14 - Michaelwood (S)
51.47695|-2.7092|M5 - J19 - Gordano
51.2682|-2.92335|M5 - J21/J22 - Sedgemoor (N)
51.26984|-2.92026|M5 - J21/J22 - Sedgemoor (S)
51.1043|-2.99566|M5 - J24 - Bridgewater
50.97702|-3.14878|M5 - J25/J26 - Taunton Deane (N)
50.97689|-3.14479|M5 - J25/J26 - Taunton Deane (S)
50.86291|-3.38429|M5 - J28/J29 - Cullompton
52.42797|-2.01871|M5 - J3/J4 - Frankley (N)
52.42943|-2.01677|M5 - J3/J4 - Frankley (S)
50.71535|-3.46514|M5 - J30 - Exeter, Sowton
52.06376|-2.15662|M5 - J7/J8 - Stensham (N)
52.05816|-2.14178|M5 - J7/J8 - Strensham (S)
51.92828|-2.57007|M50 (A449) - J4 - Ross Spur (N)
52.642|-2.0564|M6 - J10a/J11 - Hilton Park (N)
52.6447|-2.05646|M6 - J10a/J11 - Hilton Park (S)
52.88319|-2.17023|M6 - J14/J15 - Stafford (N)
52.87454|-2.16433|M6 - J14/J15 - Stafford (S)
52.99285|-2.28771|M6 - J15/J16 - Keele (N)
52.99429|-2.29155|M6 - J15/J16 - Keele (S)
53.13793|-2.33777|M6 - J16/J17 - Sandbach (N)
53.14009|-2.33597|M6 - J16/J17 - Sandbach (S)
53.3|-2.40236|M6 - J18/J19 - Knutsford (N)
53.30168|-2.40079|M6 - J18/J19 - Knutsford (S)
53.35948|-2.50495|M6 - J21 - Lymm
53.63039|-2.69083|M6 - J27/J28 - Charnock Richard (N)
53.63254|-2.69071|M6 - J27/J28 - Charnock Richard (S)
52.4726|-1.54965|M6 - J3/J3a - Corley (E)
52.46971|-1.54583|M6 - J3/J3a - Corley (W)
53.96069|-2.75907|M6 - J32/J33 - Lancaster (N)
53.96335|-2.76077|M6 - J32/J33 - Lancaster (S)
54.17816|-2.73467|M6 - J35/J36 - Burton-in-Kendal (N)
54.316|-2.637|M6 - J36/J37 - Killington Lake (S)
54.44932|-2.60687|M6 - J38/J39 - Tebay (N)
54.7954|-2.8703|M6 - J41/J42 - Southwaite (N)
54.80111|-2.87067|M6 - J41/J42 - Southwaite (S)
52.66494|-1.96975|M6 Toll - JT6/J7 - North Canes
53.59834|-2.57331|M61 - J6/J8 - Bolton West (N)
53.60193|-2.57417|M61 - J6/J8 - Bolton West (S)
53.56734|-2.23402|M62 - J18/J19 - Birch (N)
53.56835|-2.22872|M62 - J18/J19 - Birch (S)
53.71129|-1.74594|M62 - J25/J26 - Hartshead Moor (N)
53.715|-1.74454|M62 - J25/J26 - Hartshead Moor (S)
53.69712|-1.26613|M62 - J33 - Ferrybridge
53.41677|-2.64037|M62 - J8 - Burtonwood (N)
53.4171|-2.63419|M62 - J8 - Burtonwood (S)
53.71469|-2.47813|M65 - J4 - Blackburn with Darwen
55.5832|-3.8223|M74 - J11 - Happendon
55.5062|-3.69535|M74 - J13/J14 - Abington
55.0088|-3.0853|M74 - J21/J22 - Gretna Green
55.81242|-4.06159|M74 - J4/J5 - J4-5 Bothwell (S)
55.7898|-4.03735|M74 - J5/J6 - Hamilton (N)
55.8638|-3.76153|M8 - J4/J5 - Heart Of Scotland, Harthill (E)
55.86289|-3.75707|M8 - J4/J5 - Heart Of Scotland, Harthill (W)
56.07594|-3.92238|M80 - J6 - Stirling
56.07594|-3.92238|M9 - J9 - Stirling
56.21012|-3.43941|M90 - J6 - Kinross

26 Jul 2010

The true color of IT projects

Can you imagine someone placing you on a cockpit of a plane. Then you are asked to fly and the plane.

You have been selected as pilot based on your physics degree and understanding of aerodynamics.

You have also flown a plane before but not like this plate and the controls on that plane were very much different. For example, the throttle control on previous plane is same as landing gear in current plane.

There is no manual defining your controls. Your manager asked just to operate each lever, switch and discover what it is used for.

Will you be a passenger on that plane? Probably not.

This situation analogous to what happens in thousands of complex IT projects. There are no documentations and code which was supposed to be an agile testing proof of concept, becomes live version. The original developers left the company with the knowledge in their heads.

The passengers (end users/customers) are still being served (flown) by pilots (developers) who don't really understand what they are doing.

The IT projects still seriously lack any standardization. People still expect adding wings to a car will make it a plane.

16 Jul 2010

What is driving pleasure?

My following discussion is purely for driving on normal roads and does not include track racing!

I often hear people say "this car is dull to drive, that car is thrilling, that road is boring" etc.

Then I start to wonder, what exactly makes a driving enjoyable?

So, I prepared a list of attributes that required for driving enjoyment. This is my personal list so others may differ.

Driving pleasure often associtated with psychology of driver. Same road in same car, one might find interesting while others may find it boring. Once mind must be in a state of good mood to enjoy driving. Seldom people enjoy their daily drive to work!

I think driving pleasure is a combination of following parameters

1. The state of mind of driver
2. The road
3. The car

Let's talk about roads. Nobody, as I am aware of, enjoys driving on a congested stop-start traffic in large town. But nearly everyone enjoys a scenic drive along country side. People tend to like twisty country roads more than straight motorways.

So, we can rank roads like this - town (worst), motorway/long straight roads (medium) and twisty country roads (best)

However, I get panicked if I have to drive thru a single track road, with bushes both sides. So, just a sceneic route is not enough. The road itself should encourage relaxed drive.

Next the car, probably the most important bit. Same route, driven in an expensive and luxurious car will feel different (at least for first time). As soon as you start to drive a better (in your mind) car, your mind automatically climbs up a higher level of achievement.

But you can have fun in same car which you drive daily. I noticed that body and mind feel thrilled when the car producing high power, which happens at lower gears with high RPM. The sound of engine roaring and feel of power along the road gives rise to thrill in mind. That is why it is often said a drive in sports car is thrilling.

A sports car offer following features over most standard cars

* A powerful engine
* A lower seating position - where you sit lower to ground which gives a feeling of high speed ride (compare a video shot at car's bumper level compared with that of in normal windscreen level)
* A high RPM engine experience (as stated above)
* The mental feeling of driving something special

However, as I already mentioned, one can have driving pleasure in normal car too. Provided it is driven to spawn thrill.

An off road driving is also very thrilling with a good 4x4. I was quite spellbound in my first ever off road experience with Land Rover.

Surprisingly, I did not find any difference in driving pleasure between manual and automatic cars!

1 Jul 2010

Black Swan Event

How did Black Swan name came into picture?

For a long time, people used to think all swans are white only. It was only when black swans were found in Australia, a long-time notion was changed!

Taleb tries to explain how Black Swan events happen in real life and what are the impacts of it.


Identifying a black swan event

Based on the author's criteria:

1. The event is a surprise (to the observer).
2. The event has a major impact.
3. After the fact, the event is rationalized by hindsight, as if it had been expected.

Taleb suggests that since most Black Swan events are without any precedence, it is almost impossible to predict such events.

So, one can only accept the events after they have happened.
The main idea in Taleb's book is not to attempt to predict Black Swan Events, but to build robustness against negative ones that occur and being able to exploit positive ones. Taleb contends that banks and trading firms are very vulnerable to hazardous Black Swan Events and are exposed to
losses beyond that predicted by their defective models.

See Wikipedia for more.

30 Jun 2010

Types of Oracle joins

Just as a brush up of Oracle join understanding.

Suppose we are joining two tables in Oracle.

SELECT T1.X, T2.Y
FROM T1 JOIN T2 ON T1.X = T2.X

Assume, T1 is driving table (i.e. more rows than T2)

Nested Loop join
------------------

for each record in T1
find matching record in T2 where T1.X = T2.X
fetch that record into result set

Now if there is no index on column X in T2, Oracle will perform full table scan for each record in T1.

Hash join
---------

Oracle will load create an in-memory index (knowns as hash table) for T2 (i.e. smaller table)

It will still perform similar operation

for each record in T1
find matching record in T2 where T1.X = T2.X
fetch that record into result set

But even if there is no index on T2.X, Oracle will use the hash table which works similar to index!

Clearly this is suitable only when T2 is small otherwise [1] hash table may not fit in memory [2] creation of hash table may involve a long time (then we could have created the index on T2 in first place!)

Sorted Merge join
-------------------

If both tables are very big, hash join is not suitable due to memory/time requirement and nested loop will be slow or will result in full table scan.

The alternative way to speed up performance is to use sorted merge join.

In this case, both tables are sorted by Oracle (using temporary tablespace) by joining key (X in this case).

The advantage is, when joining two tables, Oracle just needs to scan only a small part of the table (e.g. as they are sorted, to find a record in T2 when T1.X = 15 will require Oracle search just from X=10 to X=20 in T2 as it cannot be beyond this range and so on).

The trade off is, it will take some resource (space/time/memory) to sort the tables.

Usually Oracle chooses best method of join based on statistics avaialble. However, user may force using a different type of join using hints like use_hash(T1,T2) etc.

25 Jun 2010

Data Protection Act summary


As applicable in EEA/EU
Personal Information is data which relates to a living individual who can be either:
  • Identified directly from that information or
  • Identified indirectly from that information combined with any other data that is in the possession of the organisation holding the information.
The law requires that personal information provided to an organisation is,
  • Managed fairly and lawfully
  • Recorded accurately and kept secure.
  • Kept up-to-date
  • Used solely for the purposes agreed to.
  • Kept no longer than necessary.
An individual has the legal right to:
  • ask an organisation if they hold personal information about them
  • ask an organisation that does have their personal information:
  • what the information recorded is
  • what it's used for
  • a copy of all their personal information on record
  • receive a copy of all their personal information within 40 days
An individual has the legal right to:
  • Take action to rectify, block, erase or destroy inaccurate information.
  • Take action for compensation if damage is suffered by any contravention of the Act.
  • Take any complaint to the Office of the Information Commissioner, an independent body responsible for ensuring that the rules of the Data Protection Act are complied with.
The First Principle
Personal information must be processed fairly and lawfully.
For sensitive information to be processed an organisation must obtain an individual's explicit consent.
Sensitive personal data is information about an individual's,
  • racial or ethnic origin
  • political opinions
  • religious opinions
  • trade union membership
  • physical or mental health
  • sexual life
  • alleged or actual legal offences
  • legal proceedings or judgements.
The Second Principle
Personal information must be obtained and used for specified and lawful purposes.
We must make sure that an individual knows and understands why we want to use information about them.
We should make it clear to the individual what the information will be used for and who else it may be passed on to.
The Third Principle
Personal information must be adequate, relevant and not excessive.
We should only collect and keep personal information that allows us to do our work.
We should not hold any unnecessary personal details.
The Fourth Principle
Personal information must be accurate and kept up-to-date.
It is important for us to make sure that the personal information we hold is correct, up-to-date, and not misleading.
When collecting this information we should take reasonable steps to make certain that personal details are accurate.
The Fifth Principle
Personal information must be kept for no longer than necessary.
We should only keep on record a person's details for as long as it takes for us to do our work and no longer.
The information should be regularly updated and disposed of securely when we no longer need to use it.
The Sixth Principle
Personal information must be processed in accordance with the rights of the individual.
While we are working with a person's details we should make sure we do not use the information in a way which could cause them damage or distress.
We must also remember that an individual has the legal right to be given a copy of all their personal information recorded.
The Seventh Principle
Personal information must be kept secure.
We need to make sure that personal details held by us are safe from damage, accidental loss, destruction or unlawful access.
We should consider the harm or damage that could be caused through a breach in security and provide appropriate levels of,
  • security vetting to ensure the reliability of personnel
  • physical security
  • technological security
The Eighth Principle
Personal information must be properly protected when transferred overseas.
Personal information should not be transferred outside European Economic Areas to countries that do not ensure adequate levels of data protection for data subjects.
Adequate security measures should be taken to make certain information is not violated either in transit or at its destination.
Areas of Exemption
There are a number of areas of exemption provided for within the Data Protection Act. Although it is not the objective of this course to cover these in any detail, an awareness of them is valuable.
Different levels of exemption apply to different areas. Other examples of areas where exemptions apply are those involving,
  • National Security
  • Crime and Taxation
  • Credit Reference Agencies
  • Health, Education and Social Work
  • Journalism
  • Legal, Professional Privilege
  • Public Information, information made available to the public by law.