Data normalization (database normalization) was first introduced by Edgar F. Codd as an integral part of the relational model he first described in 1969. It is the process of structuring a relational database so that it conforms to a series of so-called normal forms, often with the aim of increasing the cohesion of entity types. The primary goal of data normalization is to reduce, and in some cases eliminate, data redundancy.
Data redundancy is a condition in a database or other data storage technology in which the same piece of data is held in two separate places, whether within a single database or across multiple software platforms or environments. Data normalization helps developers avoid the problems that arise when a relational database maintains the same information in several places.
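As a minimal sketch of the problem (the table and column names here are hypothetical), consider a single orders table that repeats each customer's address on every row; normalizing it stores the address in exactly one place:

```python
# Hypothetical denormalized order rows: the customer's address is
# repeated on every order, so updating it means touching many rows.
orders_denorm = [
    {"order_id": 1, "customer": "Ada", "address": "12 Oak St", "item": "lamp"},
    {"order_id": 2, "customer": "Ada", "address": "12 Oak St", "item": "desk"},
    {"order_id": 3, "customer": "Bob", "address": "9 Elm Ave", "item": "chair"},
]

# Normalized: each customer's address is stored exactly once, and
# orders reference the customer by key instead of repeating the address.
customers = {}
orders = []
for row in orders_denorm:
    customers[row["customer"]] = row["address"]
    orders.append({"order_id": row["order_id"],
                   "customer": row["customer"],
                   "item": row["item"]})

# Changing Ada's address is now a single update instead of one per order.
customers["Ada"] = "77 Pine Rd"
```

After normalization, the address update touches one dictionary entry while every order row stays unchanged.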
BENEFITS OF DATA NORMALIZATION
Data normalization utilizes normal forms for data transformation. The benefits of data normalization include, but are not limited to, the following:
- It reduces the anomalies associated with data modification.
- It makes the database more compact, with few or no null values and less redundant data.
- Shorter, narrower indexes allow faster index searching.
- It creates room for better use of segments, which can be utilized in controlling the physical placement of data.
- Commands execute faster and more efficiently because redundant data is reduced.
- Fewer indexes per table leave more flexibility for tuning.
NORMAL FORMS INVOLVED IN DATA NORMALIZATION
A normal form is a standard structure for relations in a relational database: a set of criteria a relation must satisfy, beginning with the rule that a relation may not be nested within another relation. The normal forms utilized in data normalization include the following:
- UNF: Unnormalized form
- 1NF: First normal form
- 2NF: Second normal form
- 3NF: Third normal form
- EKNF: Elementary key normal form
- BCNF: Boyce–Codd normal form
- 4NF: Fourth normal form
- ETNF: Essential tuple normal form
- 5NF: Fifth normal form
- DKNF: Domain-key normal form
- 6NF: Sixth normal form
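A relation violates first normal form when an attribute holds a nested collection rather than an atomic value. A minimal sketch (with invented data) of flattening such a nested attribute into 1NF rows:

```python
# Hypothetical rows whose "phones" attribute holds a nested list,
# violating 1NF (attribute values must be atomic).
unf_rows = [
    {"emp_id": 101, "name": "Ada", "phones": ["555-0101", "555-0102"]},
    {"emp_id": 102, "name": "Bob", "phones": ["555-0200"]},
]

# 1NF: one atomic phone value per row.
nf1_rows = [
    {"emp_id": r["emp_id"], "name": r["name"], "phone": p}
    for r in unf_rows
    for p in r["phones"]
]
```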
STEPS/STAGES OF DATA NORMALIZATION
Data normalization passes through different stages that obey a set of rules: a series of tests is applied to a relation to determine whether it satisfies or violates the requirements of a given normal form. The steps involved in data normalization include:
Step 1: Selection of data source and conversion into an unnormalized table (UNF)
Step 2: Transformation of unnormalized data into the first normal form (1NF)
Step 3: Further transformation of data from the first normal form (1NF) into second normal form (2NF).
Step 4: Data transformation from a second normal form (2NF) to third normal form (3NF).
There are cases which may require further data transformations, such as the:
- Transformation of 3NF to Boyce-Codd normal form (BCNF)
- Transformation of BCNF to fourth normal form (4NF).
- Transformation of 4NF to fifth normal form (5NF).
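The core steps above can be sketched end to end on a small hypothetical dataset (the table and column names are invented for illustration). The 1NF table of order lines has the composite key (order_id, product_id), a partial dependency product_id → product_name that is removed for 2NF, and a transitive dependency order_id → customer_id → customer_city that is removed for 3NF:

```python
# 1NF: flat rows, key (order_id, product_id). product_name depends on
# only part of the key (partial dependency); customer_city depends on
# customer_id, which depends on order_id (transitive dependency).
nf1 = [
    {"order_id": 1, "product_id": "P1", "product_name": "lamp",
     "customer_id": "C1", "customer_city": "Leeds", "qty": 2},
    {"order_id": 1, "product_id": "P2", "product_name": "desk",
     "customer_id": "C1", "customer_city": "Leeds", "qty": 1},
    {"order_id": 2, "product_id": "P1", "product_name": "lamp",
     "customer_id": "C2", "customer_city": "York", "qty": 5},
]

# 2NF: split out attributes that depend on only part of the key.
products = {r["product_id"]: r["product_name"] for r in nf1}
order_lines = [{"order_id": r["order_id"], "product_id": r["product_id"],
                "qty": r["qty"]} for r in nf1]

# 3NF: split out attributes that depend on a non-key attribute.
orders = {r["order_id"]: r["customer_id"] for r in nf1}
customers = {r["customer_id"]: r["customer_city"] for r in nf1}
```

After the split, each fact (a product's name, a customer's city) is stored once, so updating it cannot leave contradictory copies behind.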
Please note that the higher the normal form, the less vulnerable the relations become to update anomalies. The normal forms up to BCNF are defined in terms of the functional dependencies among the attributes of a relation; the higher forms also take multivalued and join dependencies into account.
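A functional dependency X → Y holds in a relation when any two rows that agree on the attributes in X also agree on the attributes in Y. A minimal checker, with hypothetical employee data, can make this concrete:

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent values
    return True

# Hypothetical relation: each department has exactly one head.
rows = [
    {"emp_id": 1, "dept": "HR",  "dept_head": "Ada"},
    {"emp_id": 2, "dept": "HR",  "dept_head": "Ada"},
    {"emp_id": 3, "dept": "Eng", "dept_head": "Bob"},
]
```

Here dept → dept_head holds (every "HR" row names the same head), while dept_head → emp_id does not ("Ada" heads a department with two employees).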