Parent-child relationships, one to many relationships and foreign keys are the basics of databases. What are they and how are they useful?
Technically those names all represent the same thing yet why so many names?
This article looks from a business perspective at the power of this concept and how it can be harnessed.
Whilst a very early concept to learn when using databases this concept has massive influence on reporting and when interpreting reporting.
Implementation of the approach is left to the techs who set up the databases so have they set it up right?
- Basic relationships
- Influence on reporting
- Rules to live by
- Wrap up
[read more=”Read more” less=”Read less”]
All by myself
Before we launch into parent-child relationships let’s start with a singular concept.
Your company sells things to a person.
You set up an Excel spreadsheet to record all the names of your customers.
It takes a while but you get it there.
Now you have a lovely list with “clean” data that you’ve checked.
You notice there are groups in the data. People who live in a particular area or people have a preferred product.
Luckily Excel has a “filter” option which allows you to group things up.
You decide you want to store what you’ve sold to these people as well.
You record the dates and amount in a second sheet. However you also have to type the name in again each time as you have to remember who that sale was for.
So you start putting a lot of data into the second sheet.
Real life comes along and sets you a few challenges.
As Facebook says “It’s complicated!”
So let’s say Tyisha get’s married and her surname changes. You fix it in your person sheet.
However what do you do with all your sales entries? It’s a lot of work to go through historical records and fix them.
Jane Bloggs is a common name in your data, but one of the Janes gets married changes her surname… which ones are “this” Jane Bloggs.
You want to quickly see all the sales to Jane Bloggs, yes you can filter by first name and surname but you can’t tell the difference between the Janes.
Also your data is mounting up and there are more and more records to sift through. More data = more work for computers to do, so they slow down.
There needs to be a better system as this is getting messy!
Parent-child relationships to the rescue
We go back to our Person list and we add a column. We give every record in the sheet its own unique number.
Then we go to the sales data and instead of first name and surname, we put the number in place.
This saves us a lot of typing of names, we just need an Id value which will always be unique, no fear of names changing.
And this sets up the concept of a parent-child relationship.
There is one entry in the Person list, this is the single parent record.
You then can have many sales records in the sales sheet all linked back to the one parent.
You have One parent and Many children sales records, hence it can also be called a one to many relationship.
Maps and keys
On a map, a key or legend is included with the map to unlock it.
It gives you the information needed for the map to make sense.
Maps often use symbols or colors to represent things, and the map key explains what they mean.
Understanding a parent-child relationship requires documentation to spell out what links to what.
Some databases can store this information in the database if the programmer adds it but more often than not, external documentation is required. Think of it like saying “which maps are we using?”
When you read the information in the sales sheet, there is now just a number.
This isn’t very informative to read as it doesn’t give you any information about the person just an Id.
If you want to know more about this person, you need to look them up somewhere else.
The fact that the data is somewhere else introduces the word foreign.
So in our sales sheet of information, the column containing all the numbers which refer back to the parent is called a Foreign Key.
You have to go somewhere else to get the data and it is a unique key which unlocks the data.
On the sales sheet where you have the unique Ids, this is called a Primary key. The primary key is generally unique so as to avoid confusion. One person, one key unique Id.
Influence on reporting
This concept of using a foreign key to link many records back to one record becomes a great way to link lots of data.
Start with a company, then you have people who work for the company, so the Company Id is a foreign key in the Person data.
You then have sales to a person. So you might have multiple sales to one person.
As part of that you might record emails, phone calls and notes, all of which are linked to the person.
So foreign keys become pervasive throughout solutions.
Getting a system which manages these links is common.
The designers when designing the solution work out where you need one to many connections.
The challenge is when businesses introduce new parent-child relationships as reporting / operational needs change.
Traditional software is a bit like the statue of David, once built… that’s it.
Whilst designers say “if you built it right” then it wouldn’t need to change. Life and situations change that’s why we have updates.
Finding a system which can adapt in real time to new links without massive expensive change is very rare.
What to count
Let’s say our parent-child relation ship is Person to Contact.
I have a Person and I contact that person 3 times in a week. I record the contacts in Contact.
So when you do a report for the week, do you report 1 or 3?
- How many people did we contact this week?
- In terms of contacts how many contacts were made?
Unfortunately in reporting terms, these questions can be muddied by reporting?
“So how many contacts did we have last week?”
Did you mean people or did you mean actual calls?
This leads to “interpretation” and the greatest challenge to the specifics of what a dashboard and reporting reports.
Rules to live by
Working out the combinations
In databases with foreign keys you need to work out the relationship which comes down to one of three relationships
- One to one: A person has one age
- One to many: One company has many employees
- Many to many: Many contractors can work for many companies at the same time.
When you’re doing your reporting understanding how the database guys and programmers set up the relationships will dictate how your reporting works.
When Edgar Codd working for IBM first introduced the concept of databases in the 1970s he gave us the “relational” database.
Codd did give us rules to follow when designing databases and their relationships.
So what rules should you consider when setting up parent-child structures?
These rules are called “normal forms” and there are eleven of them. Most designers will only really need the first three.
Whilst these sounds very jargonistic they are excellent rules for database designers to follow.
Codd and others established the rules in the 1970s / early 80s. The rules still work today.
2003 and 2012 saw two new forms added.
Sometimes when working with databases for speed, especially in large data sets, you have do the opposite of normalisation and “denormalize” the data. Data warehouses use this concept a lot.
So if you got this far and it all made sense, well done.
This primary concept influences every modern system in the background.
Whether using WhatpsApps, Facebook, LinkedIn or Twitter, at the back is a database recording a parent (your account) with children (your messages).
Your interconnections form many to many relationships all stored in a database.
The parent-child relationship is worth understanding no matter what system you use.
If there’s anything in this article you’d like to chat to me about you can contact me here or on social media.