databasedev.co.uk - database solutions and downloads for microsoft access

Designer for Microsoft Access
Create complex MS Access databases without being an expert in relational database design! Designer for Microsoft Access asks you plain-language questions about what you want to manage with your database, and creates the tables and relationships automatically. Free trial available

First Normal Form (1NF) - Normalising Your Database:

Solving Data Problems

Database design theory includes design standards called normal forms. The process of making your data and tables match these standards is called normalizing data or data normalization. By normalizing your data, you eliminate redundant information and organize your table to make it easier to manage the data and make future changes to the table and database structure. This process removes the insertion, deletion, and modification anomalies you may see. In normalizing your data, you usually divide large tables into smaller, easier to maintain tables. You can then use the technique of adding foreign keys to enable connections between the tables.

Data normalization is part of the database design process and is not specific nor unique to any particular RDBMS. There are, in order, first, second, third, Boyce-Codd, fourth, and fifth normal forms. Each normal form represents an increasingly stringent set of rules; that is, each normal form assumes that the requirements of the preceding forms have been met. Many relational database designers feel that, if their tables are in third normal form, most common design problems have been addressed. However, the higher level normal forms can be of use and are included here.

This can be a rather intimidating topic, particularly if you try to understand it by reading some of the more theoretical texts. However, normal forms serve a very practical purpose in that they are designed to eliminate certain problems in table design and, as a result, in the data they contain. The exact wording of definitions of the normal forms varies depending upon the source; a set of definitions is provided in this topic. Our approach to understanding the normal forms will be to examine the problems they address.

First Normal Form

A table is in first normal form (1NF) if there are no repeating groups.

A repeating group is a set of logically related fields or values that occur multiple times in one record.

A Practical Approach

The sample tables below do not comply with first normal form. Look for fields that contain too much data and repeating group of fields.

EMPLOYEES_PROJECTS_TIME


A table with fields containing too much data.
EmployeeID Name Project Time
EN1-26 Sean O'Brien 30-452-T3, 30-457-T3, 32-244-T3 0.25, 0.40, 0.30
EN1-33 Amy Guya 30-452-T3, 30-382-TC, 32-244-T3 0.05, 0.35, 0.60
EN1-35 Steven Baranco 30-452-T3, 31-238-TC 0.15, 0.80
EN1-36 Elizabeth Roslyn 35-152-TC 0.90
EN1-38 Carol Schaaf 36-272-TC 0.75
EN1-40 Alexandra Wing 31-238-TC, 31-241-TC 0.20, 0.70

The example above is also related to another design issue, namely, that each field should hold the smallest meaningful value and that there should not be multiple values in a single field.

Why is this table design a problem?

There would be no way to sort by last names nor to know which allocation of time belonged to which project.

EMPLOYEES_PROJECTS_TIME


A table with repeating groups of fields.
EmpID Last Name First Name Project1 Time1 Project2 Time2 Project3 Time3
EN1-26 O'Brien Sean 30-452-T3 0.25 30-457-T3 0.40 32-244-T3 0.30
EN1-33 Guya Amy 30-452-T3 0.05 30-382-TC 0.35 32-244-T3 0.60
EN1-35 Baranco Steven 30-452-T3 0.15 31-238-TC 0.80
EN1-36 Roslyn Elizabeth 35-152-TC 0.90
EN1-38 Schaaf Carol 36-272-TC 0.75
EN1-40 Wing Alexandra 31-238-TC 0.20 31-241-TC 0.70

So why is this one a problem?

If an employee was assigned to a fourth project, you would have to add two new fields to the table. Also, it would be very difficult to total the amount of time devoted to a particular project.

The design problems addressed are very common-particularly among new designers who are accustomed to tracking data in a spreadsheet. Often, when building a spreadsheet, we arrange the data horizontally, laying it out across the spreadsheet. When designing tables, we have to think more vertically. Similar data belongs in the same column or field with a single value in each row.

Designing to meet first normal form

Now we will take the table you saw above and redesign it so it will comply with first normal form.

Look at the repeating groups of data. Identify tables and fields that will hold this data without the repeating groups. Think vertically and remember that similar data belongs in the same field.

Enter the sample data from the table to make sure you don't have repeating groups. If necessary, include foreign key field(s) to connect the tables.

EMPLOYEES


EmployeeID Last Name First Name
EN1-26 O'Brien Sean
EN1-33 Guya Amy
EN1-35 Baranco Steven
EN1-36 Roslyn Elizabeth
EN1-38 Schaaf Carol
EN1-40 Wing Alexandra

PROJECTS_EMPLOYEES_TIME


ProjectNum EmployeeID Time
30-328-TC EN1-33 0.35
30-452-T3 EN1-26 0.25
30-452-T3 EN1-33 0.05
30-452-T3 EN1-35 0.15
31-238-TC EN1-35 0.80
30-457-T3 EN1-26 0.40
31-238-TC EN1-40 0.20
31-241-TC EN1-40 0.70
32-244-T3 EN1-33 0.60
35-152-TC EN1-36 0.90
36-272-TC EN1-38 0.75

Mark the primary key field(s) and foreign keys in each table. Shown below with * indicating the Primary Key.

EMPLOYEES


* EmployeeID Last Name First Name
EN1-26 O'Brien Sean
EN1-33 Guya Amy
EN1-35 Baranco Steven
EN1-36 Roslyn Elizabeth
EN1-38 Schaaf Carol
EN1-40 Wing Alexandra

PROJECTS_EMPLOYEES_TIME


* ProjectNum * EmployeeID Time
30-328-TC EN1-33 0.35
30-452-T3 EN1-26 0.25
30-452-T3 EN1-33 0.05
30-452-T3 EN1-35 0.15
31-238-TC EN1-35 0.80
30-457-T3 EN1-26 0.40
31-238-TC EN1-40 0.20
31-241-TC EN1-40 0.70
32-244-T3 EN1-33 0.60
35-152-TC EN1-36 0.90
36-272-TC EN1-38 0.75

If an employee was assigned to an additional project, it would involve merely adding a new record. Also, it would be much easier to search for a particular project number as they are all held in a single column.

Introducing Functional Dependency

Before we go any further, there's a new concept you need to be aware of and that's functional dependency. A functional dependency is a relationship between fields so that the value in Field A determines the value in Field B, and there can be only one value in Field B. In that case, Field B is functionally dependent on Field A. Consider the following sample table:


Airport City
National Washington, DC
JFK New York
LaGuardia New York
Logan Boston
Dulles Washington, DC

Each airport name is unique and each airport can be in only one city. Therefore, City is functionally dependent on Airport. The value in the Airport field determines what the value will be in the City field (making Airport the determinant field) and there can be only one value in the City field. This does not need to work in the reverse. As shown in the table, a city can have more than one airport, so Airport is not functionally dependent on City; the value in City does not necessarily determine what the value in Airport will be.

You will sometimes see a functional dependency written in this format:

Determinant field(s) >> Functionally dependent field

as in:

Airport >> City

That's all for the first article on normal forms. View the articles detailing 2NF (2nd Normal Form) and 3NF (3rd Normal Form) for the next stages in your database development