Simple Relational Database Design Tips:
If your companies data is its lifeblood, then the database design
can be the most important part of an application.
A great many subjects have been written on this topic, and entire
college degrees have been built around it. However, as has been said
time and time again, there’s no teacher like experience.
I'll get the show started by listing five favourite tips and giving
a brief explanation of the reasoning behind each one. If you have
a general design tip (try not to be specific to a particular system)
that you'd like to share, post it in using the Contact Webmaster e-mail
address. Where possible, explain the reasoning behind your tip by
including an example or anecdote.
- Use meaningful field names
If you have ever inherited a database from another programmer you
may well know what I mean. Some like to name fields using the name
of the on-screen control that displayed the data from that field.
That's all well and good, but unfortunately, they also liked to
name other controls using some strange convention that combined
Hungarian notation with the order in which they have added the controls
to the screen: cbo1, txt2, cmd3, lbl4 and so on.
Unless you are using a system that restricts you to short field
names, make them as descriptive as possible-within reason, of course.
It is however possible to go overboard with this; Customer_Address_Street_Line_1
is very descriptive and meaningful, but no one would want to have
to type it more than once.
- Do your homework
Not only should you research your business needs when designing
a new database, you should check out the existing system, as well.
Few database projects are built from scratch; there is almost always
an existing system (maybe not computerised) that the organization
is using to fulfil its needs.
Obviously, the existing system is not perfect; otherwise, you wouldn't
be building a new one. But by studying it, you may discover nuances
that you would otherwise have missed had you ignored it. If nothing
else, examining the existing system is usually good for a laugh
or two!
- Take the user's keys away
When deciding which field or fields to use as keys in a table, always
consider the fields that users will be editing. It's usually a bad
idea to choose a user-editable field as a key. Doing so forces you
to take one of these two actions:
Restrict the user from editing the field after the record's creation.
If you do so, you may discover that your application isn't flexible
enough when business requirements suddenly change, and users need
to edit that un-editable field.
What happens when a user makes a mistake in data entry and
doesn't notice until the record is saved?
Delete and re-create?
What if the record isn't re-creatable; suppose the customer
left?
Provide some way of detecting and correcting key collisions.
Usually, this can be done with some effort, but it is expensive
in terms of performance. Also, a key correction may wind up being
possible only from outside the data layer, forcing you to break
the isolation between your data and business/user interface layers.
The underlying reasoning here is this: Make your design fit the
user, don't make the user fit the design.
- Don't over-normalise
For those unfamiliar with the term, normalisation helps eliminate
the redundancy of data in a database by ensuring that all fields
in a table are atomic. There are several forms of normalisation,
but the Third Normal Form (3NF) is generally regarded as providing
the best compromise between performance, extensibility, and data
integrity. Briefly, 3NF states that:
- » Each value in a table is to be represented once and
only once.
- » Each row in a table should be uniquely identifiable.
(It should have a unique key.)
- » No non-key information that relies upon another key
should be stored in the table.
Databases in 3NF are characterised by a group of tables storing
related data that is joined together through keys. For example,
a 3NF database for storing customers and their related orders would
likely have two tables: Customer and Order. The Order table would
not contain any information about an order's related customer. Instead,
it would store the key that identifies the row containing the customer's
information in the Customer table.
Advanced levels of normalisation exist, but is "more normal"
necessarily better?
Not always. In fact, for some projects, even 3NF may introduce
too much complexity into the database to be worth the rewards.
- Build in the adaptation factor
Not much is worse than discovering, or being informed, that your
"finished" database is missing a field for a crucial piece
of information.
So, unless table size is a serious concern, always build in a factor
to allow for new fields, even if this means just adding a couple
of extra text fields to the end.
- Submitted By Sandra Greer - 03/12/02
For any DBMS, avoid using a number unless it has to be a number
(and you have to do maths on it). Half the time, the users will
subsequently want to load in another dataset that uses alphanumerics,
or hyphens, or spaces.
That's why the Real Key is the autonumber Access puts on your records,
and the other user "key" is for searching.
- Always design the database on paper first - Courtesy
of Michael J. Hernandez.
This one tip will probably save you an enormous amount of time.
Designing the database on paper first guarantees that you will keep
focused on the task of designing the database as effectively and
efficiently as possible. You want to make certain that your database
will have only a minimum amount of redundant data and that you've
eliminated as much duplicate data as possible. Making changes to
the structure is also a much simpler task if done on paper.
You'd be surprised how fast an eraser works. Many people make the
mistake of designing a database with their particular RDBMS in mind
and will design tables in a manner they hope will fit certain forms
or reports they envision themselves using. In most cases, designing
a table in this manner will result in poorly constructed tables.
Common side effects of this type of construction include difficulty
in sorting the data and searching for specific values. In many cases,
the tables will have to be redesigned using proper database design
techniques.
- Make absolutely certain that each table only represents
one subject - Courtesy of Michael J. Hernandez.
A subject can be one of two things: and object or an event. In this
case, and object represents something tangible - such as a person,
place, or thing - while an event represents something that occurs
at a specific point in time. Both have characteristics that can
be stored as data and processed later as information.
When a table describes more than one subject, you will certainly
have unnecessary duplicate data and possibly redundant data as well.
Ensuring that a table represents a single subject guarantees that
you will avoid these problem.
- Every table should have a Primary Key - Courtesy
of Michael J. Hernandez.
This is an important tip for two very good reasons. First, a Primary
Key uniquely identifies each record in a given table and helps to
ensure against redundant data. This is the mechanism used to refer
to a particular record from other tables in the database. Second,
it is also the instrument used to establish a relationship between
a pair of tables. This will be of particular importance when you
want to retrieve data from multiple tables in a query. Here are
some guidelines for establishing a Primary Key:
- » It's value must be unique
- » It can never be null
- » It cannot be a multi-part field
- » It should comprise the minimum number of fields to
guarantee uniqueness
- » It is not optional in whole or in part
- » It must directly identify each value of the remaining
fields in a given record of a table
In order for a field to become a Primary Key for its parent table,
it must pass each of these guidelines. Failure to pass even one
will disqualify it as a possible Primary Key. If you make certain
you have established a valid Primary Key, you will greatly reduce
the possibility of encountering problems later when you begin to
work with table relationships.
- Invest the time to implement data integrity -
Courtesy of Michael J. Hernandez.
I can't over-emphasize the importance of this tip. Many of the problems
you'll encounter with inaccurate or erroneous information will be
a direct result of poor data integrity. While it sometimes seems
like a waste of time to pay so much attention to the many details
involved in establishing data integrity, it will actually save you
an enormous amount of time in the long run - you won't have to continually
go back to fix things. An interesting fact is that the very people
who "…just don't have the time." to establish proper
data integrity are the ones who usually spend a large amount of
time fixing their improperly designed databases. In many cases they
will spend up to three times the amount of time it would have taken
to design the database properly in the first place! So don't do
it over - do it right!