Introduction
There are different ways in which you can approach data analysis. We are
going to examine two of them - the top-down approach and the bottom-up
approach - and compare their strengths and weaknesses.
The top-down method starts with the big picture, the whole concept, and
breaks it down into smaller, more manageable chunks, while the bottom-up
method starts with the individual data components and fits them together to
make the larger model.
Top-down data analysis
The top-down approach to data
analysis involves a number of steps to build the data model for your
database.
Step 1 - Identify data entities
The first step is to identify
the data entities involved. A data entity can be anything about which data
needs to be stored, physical or otherwise. For example, ORDERS could be one
entity for the data model and CLIENTS could be another. With the top-down
approach, this is the top level in the analysis.
An entity is not a specific item but a category of items that share the same
characteristics. Within a database, an entity corresponds to a table. An
instance of an entity is one specific member of that category and corresponds
to a single row in the table. For example, 'Joe Brown' is an instance of the
CLIENTS entity.
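The difference between an entity and an instance can be sketched in a few
lines of Python. This is only an illustration: the class name Client, its
single name field and the second client 'Ann Smith' are assumptions made for
the example and are not part of the data model described here.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """The CLIENTS entity: describes what every client record looks like."""
    name: str


# An instance is one concrete member of the entity (one row in the table).
joe = Client(name="Joe Brown")

# A table is then simply a collection of instances of the same entity.
# "Ann Smith" is an invented second client, used only for illustration.
clients = [joe, Client(name="Ann Smith")]
```

Here the class plays the role of the entity, and each object created from it
plays the role of an instance.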
Step 2 - Attributes and properties
The next step is to identify the attributes for each of the entities that
have been defined. An attribute is an item of information about the entity
that can be stored against it. For example, attributes of CLIENTS could be
total sales, company name, address, telephone number, e-mail address etc.
Time should also be spent defining the properties of these attributes. An
example of a property for the total sales attribute is that it must be of
type NUMERIC.
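As a rough sketch of how attributes and their properties might be written
down, the following uses Python's built-in sqlite3 module with an in-memory
database. The column names, the client_id key, the sample company 'Brown Ltd'
and the CHECK constraint are all assumptions chosen for illustration; a real
database product may express the NUMERIC property differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each column is an attribute of CLIENTS; the type and constraints are its
# properties. The CHECK clause is one assumed way to insist that total_sales
# really holds a number, because SQLite itself is lenient about types.
conn.execute("""
    CREATE TABLE clients (
        client_id    INTEGER PRIMARY KEY,
        company_name TEXT NOT NULL,
        address      TEXT,
        telephone    TEXT,
        email        TEXT,
        total_sales  NUMERIC NOT NULL DEFAULT 0
                     CHECK (typeof(total_sales) IN ('integer', 'real'))
    )
""")

# A numeric value satisfies the property and is stored.
conn.execute(
    "INSERT INTO clients (company_name, total_sales) VALUES (?, ?)",
    ("Brown Ltd", 1250.50),
)
stored = conn.execute(
    "SELECT total_sales FROM clients WHERE company_name = ?", ("Brown Ltd",)
).fetchone()[0]


def insert_is_rejected(value):
    """Return True if the database refuses `value` as a total_sales figure."""
    try:
        conn.execute(
            "INSERT INTO clients (company_name, total_sales) VALUES (?, ?)",
            ("Bad Co", value),
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling insert_is_rejected("not a number") shows the property being enforced:
the text value violates the constraint, so the row is not stored.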
Step 3 - Relationships between entities
The last step is to identify
the relationships that exist between the different entities. A relationship
is an association between instances of entities. For example, a CLIENT may
have many ORDERS, but an ORDER can only have one CLIENT.
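The one-to-many relationship in this example is commonly recorded as a
foreign key on the 'many' side. The sketch below again uses Python's sqlite3
module; the table layouts, key columns and order numbers are assumptions made
for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE clients (
        client_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- Each order holds exactly one client_id, so an ORDER has one CLIENT,
    -- while the same client_id may appear on many orders.
    CREATE TABLE orders (
        order_id  INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES clients(client_id)
    );
    INSERT INTO clients (client_id, name) VALUES (1, 'Joe Brown');
    INSERT INTO orders (order_id, client_id) VALUES (100, 1), (101, 1);
""")

# One client, many orders.
order_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE client_id = 1"
).fetchone()[0]


def order_for_unknown_client_is_rejected():
    """Return True if an order pointing at a non-existent client is refused."""
    try:
        conn.execute("INSERT INTO orders (order_id, client_id) VALUES (102, 999)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The foreign key captures both halves of the relationship: the count shows one
client owning several orders, and the rejected insert shows that an order
cannot exist without a valid client.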
Once each of these steps has been carried out, the whole data model has been
defined from the top down. This method of analysis results in a
well-organised data model.
Bottom-up data analysis
The bottom-up approach to data
analysis involves a different set of steps to build up the data model for
your database.
Step 1 - Identify data elements
The first step for an organisation using the bottom-up approach is to start
at the bottom and identify the data elements contained in documents, reports,
files etc. The same element may appear under different names in different
sources, so these names must be carefully matched to one another.
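The matching of differently named elements could be sketched as follows. The
source documents, the labels found in them and the synonym table are all
hypothetical; in practice the table is the result of an analyst's judgement
rather than something a program can derive.

```python
# Hypothetical element names as they might appear in three source documents.
raw_elements = {
    "invoice_form": ["Cust Name", "Tel No", "Order Date"],
    "sales_report": ["Customer", "Total Sales"],
    "contact_file": ["Client Name", "Phone", "E-mail"],
}

# Hand-built synonym table: every variant label maps to one agreed name.
canonical_name = {
    "cust name": "client_name",
    "customer": "client_name",
    "client name": "client_name",
    "tel no": "telephone",
    "phone": "telephone",
    "e-mail": "email",
    "order date": "order_date",
    "total sales": "total_sales",
}


def match_elements(sources):
    """Return the agreed element names and any labels not yet matched."""
    matched, unmatched = set(), []
    for document, labels in sources.items():
        for label in labels:
            key = label.strip().lower()
            if key in canonical_name:
                matched.add(canonical_name[key])
            else:
                # Keep the source so an analyst can go back and check it.
                unmatched.append((document, label))
    return matched, unmatched


elements, unmatched = match_elements(raw_elements)
```

Eight labels from three documents reduce to five distinct elements. Any label
missing from the synonym table is reported rather than silently dropped.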
Step 2 - Grouping elements
The next step is to take the
various elements and group them into entity types. Once this has been done
the relationships can be identified between the entities, so giving us the
final data model.
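A minimal sketch of the grouping step is shown below. The element names and
the entity each one is assigned to are assumptions made for illustration;
deciding which entity an element belongs to remains the analyst's job.

```python
from collections import defaultdict

# Assumed analyst decision: the entity type each element belongs to.
element_to_entity = {
    "client_name": "CLIENTS",
    "telephone": "CLIENTS",
    "email": "CLIENTS",
    "total_sales": "CLIENTS",
    "order_date": "ORDERS",
    "order_number": "ORDERS",
}


def group_elements(mapping):
    """Collect the elements under the entity type they were assigned to."""
    grouped = defaultdict(list)
    for element, entity in mapping.items():
        grouped[entity].append(element)
    return {entity: sorted(names) for entity, names in grouped.items()}


entities = group_elements(element_to_entity)

# Only once the entities exist can the relationships between them be noted.
relationships = [("CLIENTS", "ORDERS", "one-to-many")]
```

The entities emerge from the elements here, which is the reverse of the
top-down order of work.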
This approach to data analysis tends to produce a more complete data model
than the top-down approach, because it starts from the data that is actually
recorded.
Strengths and weaknesses
The main advantage of the top-down approach to data analysis is that it
results in a well-structured and well-organised data model. Its drawback is
that information can easily be overlooked, especially when there is a wide
variety of data to be considered.
The bottom-up approach does not suffer from this problem. Because all the
data elements are gathered together at the start, there is less chance of
data being overlooked, and a more complete data model is created. However,
the resultant data model will not be as well organised as it could be, and it
will reflect the existing applications and documents more closely than the
real-world situation.
It is possible to combine the approaches to ensure a data model that is both
well structured and complete. This is done by first developing a general data
model and then, or even at the same time, discovering the data elements using
the bottom-up approach. The data elements collected are then fitted to the
general data model, which can be modified if need be to accommodate them.
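One way to picture the fitting step is sketched below. The general model, the
discovered elements and the DELIVERIES entity are all hypothetical, and the
function is only an illustration of the idea, not a prescribed procedure.

```python
# Top-down: a general model sketched first (entity -> known attributes).
general_model = {
    "CLIENTS": {"company_name", "address", "telephone"},
    "ORDERS": {"order_date"},
}

# Bottom-up: elements discovered in documents, each tagged with the entity
# the analyst believes it belongs to. DELIVERIES is an invented entity that
# the top-down pass is assumed to have missed.
discovered = [
    ("CLIENTS", "telephone"),
    ("CLIENTS", "email"),
    ("ORDERS", "order_date"),
    ("DELIVERIES", "delivery_date"),
]


def fit_to_model(model, found):
    """Return a copy of the model extended with whatever it did not cover."""
    merged = {entity: set(attrs) for entity, attrs in model.items()}
    additions = []
    for entity, element in found:
        # setdefault creates a new entity when the general model lacks one.
        if element not in merged.setdefault(entity, set()):
            merged[entity].add(element)
            additions.append((entity, element))
    return merged, additions


combined_model, additions = fit_to_model(general_model, discovered)
```

The additions list records where the general model had to be modified: a
missing attribute on an existing entity and a whole entity that was
overlooked. The original general model is left unchanged, so the two can be
compared.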
This results in a model that
benefits from the best of both approaches to data analysis.