Lecture 30: Databases – an overview
Objectives of this lecture
• Introduce the concept of databases
• Learn some basic database terminology
What is a Database?
• Early computers were mainly used for scientific and engineering computation (number crunching).
• However, it was soon understood that computers could also be used to manipulate symbols (character or symbolic manipulation). Thus the business community, where information handling is the main task, began to take an interest in computers.
• The first popular language for this kind of application, COBOL (COmmon Business Oriented Language), was developed, and it soon became one of the most popular languages, second only to FORTRAN.
• COBOL, however, had one particular problem: there was no data independence between a COBOL program and the data it handles. Any slight change in the way the data is organized would require a corresponding change in the COBOL programs that use it.
• In an effort to improve things, a whole new area of computer science called database applications (or simply databases) evolved.
• In this lecture and the next, we shall learn some of the basic ideas in this area.
Some Basic Terms
• Entity: An object (abstract or concrete) about which information is being stored, e.g., student, book, car.
• Field: An attribute of an entity, e.g., name, identification number.
• Record: A collection of related fields, e.g., all the attributes of a student (ID number, name, major, GPA, etc.) could constitute a record.
• File: A collection of related records, e.g., all the records about books in a library. A file is usually arranged in tabular form, in which each row represents a record and each column represents a field.
  ◦ A file organized in such a way that the records must be processed one by one, from beginning to end, is called a sequential file.
  ◦ In such a file, two consecutive records are separated by an inter-record gap. The end of the file is indicated by a special character called the end-of-file (EOF) marker.
  ◦ In contrast to sequential files, random-access files (also called direct-access files) allow a record to be processed without reading through the other records. One of the record fields is set as a key field or index.
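The terms above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Student` record, its field names, and the sample rows are made up for the example, and an in-memory list and dictionary stand in for a real sequential file and a real key index.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """A record: a collection of related fields describing one entity."""
    student_id: int  # field chosen as the key field (assumption)
    name: str
    major: str
    gpa: float


# A "file": a collection of related records, one row per record.
# The sample data is invented purely for illustration.
student_file = [
    Student(1001, "Amina", "Physics", 3.4),
    Student(1002, "Bilal", "Chemistry", 3.1),
    Student(1003, "Chen", "Mathematics", 3.8),
]


def sequential_find(records, student_id):
    """Sequential access: examine records in order from the beginning."""
    examined = 0
    for record in records:
        examined += 1
        if record.student_id == student_id:
            return record, examined
    return None, examined


# An index on the key field maps each key directly to its record.
index = {record.student_id: record for record in student_file}


def direct_find(key_index, student_id):
    """Direct access: use the key to reach a record without scanning."""
    return key_index.get(student_id)


record, examined = sequential_find(student_file, 1003)
print(record.name, "found after examining", examined, "records")
print(direct_find(index, 1003).name, "found directly through the key")
```

The sequential search has to pass over every earlier record before it reaches the one requested, while the keyed lookup goes straight to it. That is the practical difference between the two file organizations.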
• Database: A collection of related files, for example, all the files used to store information in a library system.
• There are three main components in a database system:
  ◦ Physical database: This consists of the actual files comprising the database, usually arranged in a structured manner to avoid duplication and redundancy and to ensure the integrity of the data. The physical database is almost always on a disk, magnetic tape, or CD-ROM.
  ◦ Database Management System (DBMS): These are programs that provide facilities for creating, accessing, and maintaining the physical database. Examples of DBMSs are dBASE, Microsoft Access, and SQL-based systems. Specifically, the DBMS must provide an efficient and convenient environment for users to create, add, retrieve, and delete data.
  ◦ Application software: These are programs written using the facilities provided by the DBMS to allow access to the physical database (or a part of it) in a user-friendly and user-specific manner.
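The three components can be illustrated with a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as the DBMS. The `books` table, its columns, and the helper functions `add_book`, `find_book`, and `remove_book` are hypothetical names chosen for this example; they are not part of the lecture material.

```python
import sqlite3

# The DBMS: SQLite, reached through Python's standard sqlite3 module.
# ":memory:" keeps the physical database in RAM for this demonstration;
# a file name here would place it on disk instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT)"
)


# Application software: small user-specific functions built on the DBMS.
def add_book(conn, isbn, title, author):
    """Add one record to the books table."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO books VALUES (?, ?, ?)", (isbn, title, author)
        )


def find_book(conn, isbn):
    """Retrieve (title, author) for a given key, or None if absent."""
    return conn.execute(
        "SELECT title, author FROM books WHERE isbn = ?", (isbn,)
    ).fetchone()


def remove_book(conn, isbn):
    """Delete the record with the given key."""
    with conn:
        conn.execute("DELETE FROM books WHERE isbn = ?", (isbn,))


# Placeholder values, invented for illustration only.
add_book(conn, "isbn-0001", "Example Title", "Example Author")
found = find_book(conn, "isbn-0001")
remove_book(conn, "isbn-0001")
after_delete = find_book(conn, "isbn-0001")
print(found, after_delete)
```

The application functions never touch the stored files themselves. They only ask the DBMS to add, retrieve, or delete data, which matches the layering described above.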
• The following diagram shows a layered conception of a database.
• Database schema: This is the logical design of a database. It consists of a set of tables of the data types used, giving the names of the entities and attributes and the relationships among them.
• Schemas are classified as internal when they correspond to the physical database and external when they correspond to the data as seen by application programs.
• The capability to change the internal schema as needed without affecting the external schema is called data independence.
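One way to picture data independence is the rough sketch below. It assumes SQLite as the DBMS and uses a view to play the role of the external schema. The table names `student_v1` and `student_v2`, the view `student`, and the sample row are all made up for the illustration, and real systems may achieve the same separation by other means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Internal schema, first version: the full name is stored in one column.
conn.execute(
    "CREATE TABLE student_v1 (id INTEGER PRIMARY KEY, full_name TEXT)"
)
conn.execute("INSERT INTO student_v1 VALUES (1, 'Amina Yusuf')")

# External schema (assumption: modelled here as a view).
# Application code only ever refers to the view "student".
conn.execute(
    "CREATE VIEW student AS SELECT id, full_name AS name FROM student_v1"
)


def student_name(conn, student_id):
    """Application code: depends only on the external schema."""
    row = conn.execute(
        "SELECT name FROM student WHERE id = ?", (student_id,)
    ).fetchone()
    return row[0] if row else None


before = student_name(conn, 1)

# Change the internal schema: split the name into two columns.
conn.execute(
    "CREATE TABLE student_v2 "
    "(id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO student_v2 VALUES (1, 'Amina', 'Yusuf')")

# Redefine the view so the external schema stays the same.
conn.execute("DROP VIEW student")
conn.execute(
    "CREATE VIEW student AS "
    "SELECT id, first_name || ' ' || last_name AS name FROM student_v2"
)
conn.execute("DROP TABLE student_v1")

after = student_name(conn, 1)
print(before == after)
```

The function `student_name` is not modified when the storage layout changes. This is exactly what was missing in the COBOL situation described earlier, where such a change forced the programs to be rewritten.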
• Schemas are generally written in a data definition language, which generates the schema tables stored in a file called a data dictionary. The data dictionary is always accessed before any change to the physical data is made. The following figure illustrates some of these concepts.
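As a complement, the relationship between a data definition language and a data dictionary can be sketched in code. The example assumes SQLite, where `CREATE TABLE` statements serve as the data definition language and the built-in catalog table `sqlite_master` plays a role similar to a data dictionary. The `student` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language: this statement describes the schema
# (entity name, attribute names, and data types), not the data itself.
conn.execute(
    """
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        major TEXT,
        gpa   REAL
    )
    """
)

# SQLite records every schema definition in its catalog table,
# sqlite_master, which acts much like a data dictionary.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")
]

print(tables)
print(columns)
```

Here the DBMS consults the stored schema before it accepts any data for the table, which corresponds to the statement that the data dictionary is accessed before the physical data is changed.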