Database Fundamentals/Normalization
This lesson introduces normalization.
First Normal Form (1NF)
edit- Data is in 1NF when:
1. Each column contains atomic (indivisible) values. 2. Each row is unique, and no column has repeating groups.
Example: Before normalization (not in 1NF):
StudentID | Name | Subjects |
---|---|---|
1 | Alice | Math, Science |
2 | Bob | History, Math |
After normalization (in 1NF):
StudentID | Name | Subject |
---|---|---|
1 | Alice | Math |
1 | Alice | Science |
2 | Bob | History |
2 | Bob | Math |
Second Normal Form (2NF)
edit- A table is in 2NF if:
1. It is in 1NF. 2. All non-key attributes are fully dependent on the primary key.
Example: Before normalization (not in 2NF):
OrderID | ProductName | Price | CustomerName |
---|---|---|---|
1 | Pen | 1.00 | Alice |
2 | Notebook | 5.00 | Bob |
After normalization (in 2NF): Orders Table:
OrderID | CustomerName |
---|---|
1 | Alice |
2 | Bob |
Products Table:
ProductID | ProductName | Price |
---|---|---|
1 | Pen | 1.00 |
2 | Notebook | 5.00 |
OrderDetails Table:
OrderID | ProductID |
---|---|
1 | 1 |
2 | 2 |
Third Normal Form (3NF)
edit- A table is in 3NF if:
1. It is in 2NF. 2. It contains no transitive dependencies (i.e., no non-key attribute depends on another non-key attribute).
Example: Before normalization (not in 3NF):
EmployeeID | Name | Department | Manager |
---|---|---|---|
1 | Alice | Sales | John |
2 | Bob | IT | Sarah |
After normalization (in 3NF): Employees Table:
EmployeeID | Name | DepartmentID |
---|---|---|
1 | Alice | 101 |
2 | Bob | 102 |
Departments Table:
DepartmentID | Department | Manager |
---|---|---|
101 | Sales | John |
102 | IT | Sarah |
Objectives and Skills
editObjectives and skills for the normalization portion of Microsoft Exam 98-364 Database Fundamentals include:[1]
- Understand data storage
- Understand normalization
- Understand the reasons for normalization, the five most common levels of normalization, how to normalize a database to third normal form
- Understand normalization
Readings
editMultimedia
editActivities
edit- While using the types of entities based on the College database (student, course, instructor), convert the entities to an equivalent collection of tables that are normalized and are in fourth normal form.
- Using this information, convert the unnormalized relation to fourth normal form:
- Create an example of a table that is in first normal form, but not in second normal form.
- Create an example of a table that is in second normal form but not in third normal form.
- Create an example of a table that is in third normal form but not in fourth normal form.
- Using this information, convert the unnormalized relation to fourth normal form:
Lesson Summary
edit- Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. Normalization involves decomposing a table into less redundant (and smaller) tables without losing information; defining foreign keys in the old table referencing the primary keys of the new ones. The objective is to isolate data so that additions, deletions, and modifications of an attribute can be made in just one table and then propagated through the rest of the database using the defined foreign keys. [2]
- First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain. First normal form enforces these criteria: Eliminate repeating groups in individual tables. Create a separate table for each set of related data. Identify each set of related data with a primary key[3]
- Second normal form (2NF) is a normal form used in database normalization. A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. A table is in 2NF if it is in 1NF and every non-prime attribute of the table is dependent on the whole of every candidate key.[4]
- Third normal form is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that (1) the entity is in second normal form, and (2) all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes.[5]
- Fourth normal form (4NF) is a normal form used in database normalization. Introduced by Ronald Fagin in 1977, 4NF is the next level of normalization after Boyce–Codd normal form (BCNF). Whereas the second, third, and Boyce–Codd normal forms are concerned with functional dependencies, 4NF is concerned with a more general type of dependency known as a multivalued dependency. A Table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X \twoheadrightarrow Y, X is a superkey—that is, X is either a candidate key or a superset thereof.[6]
- Normal forms 5 and 6 are less commonly used. It is possible to design a database in 5NF from outset and adapt it into 6NF with slight tweaks.
- Data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data.[7]
Key Terms
edit- fifth normal form (5NF)
- first normal form (1NF)
- form
- fourth normal form (4NF)
- normalization
- normal forms (NF)
- redundant data
- second normal form (2NF)
- third normal form (3NF)