Normalization in MS SQL — Simple terms and examples
Normalization is a process in database design used to organize data efficiently and eliminate redundancy. It involves decomposing tables into smaller, more manageable tables without losing data integrity. The goal is to minimize data duplication and ensure that the database adheres to certain rules to maintain data consistency.
Normalization is typically performed in several stages, each known as a normal form. Here’s a simplified explanation of normalization with examples:
Embark on a journey of continuous learning and exploration with DotNet-FullStack-Dev. Uncover more by visiting our https://dotnet-fullstack-dev.blogspot.com reach out for further information.
1. First Normal Form (1NF)
Definition: A table is in 1NF if it contains only atomic (indivisible) values and each column contains values of a single type.
Example:
Consider a table that stores information about students and their courses:
Issues:
- The
Courses
column contains multiple values (comma-separated).
Conversion to 1NF:
Create a new table for Courses
with a single course per row:
Students Table:
Courses Table:
2. Second Normal Form (2NF)
Definition: A table is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the entire primary key.
Example:
Consider a table that contains student grades along with student details:
Issues:
Instructor
depends only on theCourse
, not the whole primary key (StudentID
,Course
).
Conversion to 2NF:
Split the table into two:
StudentGrades Table:
Courses Table:
3. Third Normal Form (3NF)
Definition: A table is in 3NF if it is in 2NF and all the attributes are only dependent on the primary key.
Example:
Consider a table that includes student information and their department:
Issues:
DeptHead
is dependent onDepartment
, not on the primary key (StudentID
).
Conversion to 3NF:
Split the table into two:
Students Table:
Departments Table:
4.Boyce-Codd Normal Form (BCNF)
Definition
A table is in Boyce-Codd Normal Form (BCNF) if, for every functional dependency X→YX \rightarrow YX→Y, XXX is a superkey. This means that every determinant (left side of a functional dependency) must be a superkey.
Why BCNF is Needed
While 3NF deals with transitive dependencies (where non-key attributes depend on other non-key attributes), BCNF handles situations where a non-key attribute might determine another non-key attribute, which might still violate the normalization principles.
Example of BCNF
Consider the following table with information about projects and their managers:
Functional Dependencies:
ProjectID -> ManagerID
ManagerID -> ManagerName
Candidate Keys:
ProjectID
(for identifying each record uniquely)
Issues:
- In this table,
ManagerID
is not a superkey but it determinesManagerName
.
Decomposition to BCNF:
To convert this table into BCNF, we need to ensure that every determinant is a superkey. The functional dependency ManagerID -> ManagerName
indicates that ManagerID
is determining ManagerName
, and ManagerID
is not a superkey.
Solution:
Split the table into two:
Projects Table:
- Contains
ProjectID
andManagerID
.
Managers Table:
- Contains
ManagerID
andManagerName
.
Verification:
- In the
Projects
table,ProjectID
is a superkey. - In the
Managers
table,ManagerID
is a superkey.
Now, both tables are in BCNF because all functional dependencies have determinants that are superkeys.
Summary
- 1NF: Ensures that each column contains atomic values and each row is unique.
- 2NF: Ensures that non-key attributes are fully dependent on the whole primary key.
- 3NF: Ensures that all attributes are directly dependent on the primary key, removing transitive dependencies.
- BCNF: is a refinement of 3NF and ensures that for every functional dependency, the left side of the dependency (the determinant) is a superkey. This removes any potential anomalies that can still exist in 3NF, providing a more robust design for relational databases.
Normalization helps in designing a database schema that reduces redundancy and improves data integrity, making it easier to maintain and query. Applying BCNF ensures that the database schema is free of redundancies and maintains data integrity by enforcing that all dependencies are based on superkeys.