30 Jun 2010

Types of Oracle joins

A quick refresher on how Oracle joins work.

Suppose we are joining two tables in Oracle.

SELECT T1.X, T2.Y
FROM T1 JOIN T2 ON T1.X = T2.X

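Assume T1 is the driving (outer) table and that it has more rows than T2. Note that "driving table" does not itself mean "more rows": the optimizer usually prefers the smaller row source as the driving table.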

Nested Loop join
------------------

for each record in T1
    find matching record in T2 where T1.X = T2.X
    fetch that record into result set

If there is no index on column X in T2, Oracle will perform a full table scan of T2 for each record in T1.
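To make the cost visible, here is a minimal Python sketch of the nested loop idea. It is only an illustration, not how Oracle implements it: the tables are hypothetical in-memory lists of dicts, and the function name nested_loop_join is made up for this post.

```python
# Hypothetical stand-ins for tables T1 and T2 (lists of rows).
T1 = [{"X": 1}, {"X": 2}, {"X": 3}]
T2 = [{"X": 2, "Y": "b"}, {"X": 3, "Y": "c"}, {"X": 4, "Y": "d"}]


def nested_loop_join(t1, t2):
    """Return (T1.X, T2.Y) pairs where T1.X = T2.X."""
    result = []
    for r1 in t1:          # outer loop: the driving table
        for r2 in t2:      # inner loop: with no index, every row of t2 is visited
            if r1["X"] == r2["X"]:
                result.append((r1["X"], r2["Y"]))
    return result


print(nested_loop_join(T1, T2))
```

The inner loop runs once per row of T1, so the work grows with rows(T1) x rows(T2). An index on T2.X would replace the inner scan with a lookup.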

Hash join
---------

Oracle will create an in-memory index (known as a hash table) on the join key of T2 (i.e. the smaller table).

It will still perform a similar operation:

for each record in T1
    find matching record in the hash table of T2 where T1.X = T2.X
    fetch that record into result set

But even if there is no index on T2.X, Oracle can use the hash table, which works much like an index!

Clearly this is suitable only when T2 is small; otherwise [1] the hash table may not fit in memory and [2] building the hash table may take a long time (in which case we could have created an index on T2 in the first place!).
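The build and probe steps described above can be sketched in Python as follows. Again this is only an illustration under assumptions: the tables are hypothetical in-memory lists, the name hash_join is invented here, and a Python dict plays the role of Oracle's hash table.

```python
from collections import defaultdict

# Hypothetical stand-ins for tables T1 (larger) and T2 (smaller).
T1 = [{"X": 1}, {"X": 2}, {"X": 3}, {"X": 2}]
T2 = [{"X": 2, "Y": "b"}, {"X": 3, "Y": "c"}, {"X": 4, "Y": "d"}]


def hash_join(t1, t2):
    """Return (T1.X, T2.Y) pairs where T1.X = T2.X."""
    # Build phase: one pass over the smaller table, grouping rows by join key.
    buckets = defaultdict(list)
    for r2 in t2:
        buckets[r2["X"]].append(r2)

    # Probe phase: one pass over the larger table, one lookup per row.
    result = []
    for r1 in t1:
        for r2 in buckets.get(r1["X"], []):
            result.append((r1["X"], r2["Y"]))
    return result


print(hash_join(T1, T2))
```

Each table is read once, so the work is roughly rows(T1) + rows(T2), at the price of holding the whole of T2 in the dict.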

Sort Merge join
-------------------

If both tables are very big, a hash join is not suitable because of its memory/time requirements, and a nested loop join will be slow or will result in repeated full table scans.

An alternative way to speed things up is to use a sort merge join.

In this case, Oracle sorts both tables on the join key (X in this case), using the temporary tablespace when the sort does not fit in memory.

The advantage is that, when joining the two sorted tables, Oracle needs to scan only a small part of T2 for each row of T1 (e.g. because both are sorted, finding the T2 records for T1.X = 15 means searching only around X = 10 to X = 20 in T2, since a match cannot lie outside that range, and so on).

The trade-off is that sorting the tables takes resources (space, time and memory).
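As a rough Python sketch of the idea (an illustration only, with hypothetical in-memory tables and an invented function name sort_merge_join): sort both inputs on the join key, then walk through them together with two cursors so that neither input is rescanned from the start.

```python
# Hypothetical stand-ins for tables T1 and T2, deliberately unsorted.
T1 = [{"X": 3}, {"X": 1}, {"X": 2}, {"X": 2}]
T2 = [{"X": 2, "Y": "b"}, {"X": 4, "Y": "d"}, {"X": 2, "Y": "bb"}, {"X": 3, "Y": "c"}]


def sort_merge_join(t1, t2):
    """Return (T1.X, T2.Y) pairs where T1.X = T2.X."""
    # Sort phase: this is where the space/time/memory cost is paid.
    a = sorted(t1, key=lambda r: r["X"])
    b = sorted(t2, key=lambda r: r["X"])

    # Merge phase: advance whichever cursor points at the smaller key.
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i]["X"] < b[j]["X"]:
            i += 1
        elif a[i]["X"] > b[j]["X"]:
            j += 1
        else:
            key = a[i]["X"]
            # Find the end of the run of rows in b that share this key.
            j_end = j
            while j_end < len(b) and b[j_end]["X"] == key:
                j_end += 1
            # Pair every row of a with this key against that run.
            while i < len(a) and a[i]["X"] == key:
                for r2 in b[j:j_end]:
                    result.append((key, r2["Y"]))
                i += 1
            j = j_end
    return result


print(sort_merge_join(T1, T2))
```

After the sort, the merge itself is a single forward pass over both inputs, which is why this method copes with two large tables better than repeated full scans.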

Usually Oracle chooses the best join method based on the statistics available. However, the user can ask for a different join method with hints such as /*+ USE_HASH(T1 T2) */, /*+ USE_NL(T1 T2) */ or /*+ USE_MERGE(T1 T2) */.

25 Jun 2010

Data Protection Act summary


As applicable in EEA/EU
Personal Information is data which relates to a living individual who can be either:
  • Identified directly from that information or
  • Identified indirectly from that information combined with any other data that is in the possession of the organisation holding the information.
The law requires that personal information provided to an organisation is:
  • Managed fairly and lawfully
  • Recorded accurately and kept secure
  • Kept up-to-date
  • Used solely for the purposes agreed to.
  • Kept no longer than necessary.
An individual has the legal right to:
  • ask an organisation if they hold personal information about them
  • ask an organisation that does have their personal information:
      • what the information recorded is
      • what it's used for
      • for a copy of all their personal information on record
  • receive a copy of all their personal information within 40 days
An individual also has the legal right to:
  • Take action to rectify, block, erase or destroy inaccurate information.
  • Take action for compensation if damage is suffered by any contravention of the Act.
  • Take any complaint to the Office of the Information Commissioner, an independent body responsible for ensuring that the rules of the Data Protection Act are complied with.
The First Principle
Personal information must be processed fairly and lawfully.
For sensitive information to be processed an organisation must obtain an individual's explicit consent.
Sensitive personal data is information about an individual's:
  • racial or ethnic origin
  • political opinions
  • religious opinions
  • trade union membership
  • physical or mental health
  • sexual life
  • alleged or actual legal offences
  • legal proceedings or judgements.
The Second Principle
Personal information must be obtained and used for specified and lawful purposes.
We must make sure that an individual knows and understands why we want to use information about them.
We should make it clear to the individual what the information will be used for and who else it may be passed on to.
The Third Principle
Personal information must be adequate, relevant and not excessive.
We should only collect and keep personal information that allows us to do our work.
We should not hold any unnecessary personal details.
The Fourth Principle
Personal information must be accurate and kept up-to-date.
It is important for us to make sure that the personal information we hold is correct, up-to-date, and not misleading.
When collecting this information we should take reasonable steps to make certain that personal details are accurate.
The Fifth Principle
Personal information must be kept for no longer than necessary.
We should only keep on record a person's details for as long as it takes for us to do our work and no longer.
The information should be regularly updated and disposed of securely when we no longer need to use it.
The Sixth Principle
Personal information must be processed in accordance with the rights of the individual.
While we are working with a person's details we should make sure we do not use the information in a way which could cause them damage or distress.
We must also remember that an individual has the legal right to be given a copy of all their personal information recorded.
The Seventh Principle
Personal information must be kept secure.
We need to make sure that personal details held by us are safe from damage, accidental loss, destruction or unlawful access.
We should consider the harm or damage that could be caused through a breach in security and provide appropriate levels of:
  • security vetting to ensure the reliability of personnel
  • physical security
  • technological security
The Eighth Principle
Personal information must be properly protected when transferred overseas.
Personal information should not be transferred outside the European Economic Area to countries that do not ensure adequate levels of data protection for data subjects.
Adequate security measures should be taken to make certain information is not violated either in transit or at its destination.
Areas of Exemption
There are a number of areas of exemption provided for within the Data Protection Act. Although it is not the objective of this summary to cover these in any detail, an awareness of them is valuable.
Different levels of exemption apply to different areas. Examples of areas where exemptions apply are those involving:
  • National Security
  • Crime and Taxation
  • Credit Reference Agencies
  • Health, Education and Social Work
  • Journalism
  • Legal Professional Privilege
  • Public Information, information made available to the public by law.

6 Jun 2010

The McKinsey Way

I read this book recently and am writing down the salient points so that I do not forget them.

Building the solution

  • Fact based
  • Rigidly structured
  • Hypothesis driven

The problem is not always the problem.

Don’t reinvent the wheel, but remember that every client is unique.

Follow the 80/20 rule. Work smarter, not harder.

Find the key drivers in the business.

Know your solution well enough to explain it precisely to your client in 30 seconds.

Break the problem into small parts and solve each part instead of trying to solve the whole thing in one go. But don’t lose sight of the big picture.

Say “I don’t know” if necessary.

A little team bonding goes a long way. Make your boss look good.

When you go for an interview, be prepared. During an interview, listen and guide.

Display data with charts.

A good business message has three qualities: brevity, thoroughness and structure.

Maintain confidentiality whenever appropriate.

Keep the client team on your side. Work around client team members who are a liability.

Engage the client in the solution process.

Be rigorous about the implementation steps of your solution.

A good assistant is a lifeline.

If you want a life, lay down some rules.