Lecture 30:  Databases – an overview

 

Objectives of this lecture

q       Introduce the concept of databases

q       Learn some basic database terminology

 

What is a Database?

q       Early computers were mainly used for scientific and engineering computation (number crunching).

q       However, it was soon understood that computers could also be used to manipulate symbols (character or symbolic manipulation).  Thus the business community, where information handling is the main task, began to take an interest in computers.

q       COBOL (COmmon Business Oriented Language), the first popular language for this kind of application, was developed, and it soon became one of the most popular languages, second only to FORTRAN.

q       COBOL, however, had one particular problem: there was no data independence between a COBOL program and the data it handled.  Any slight change in the data would require a corresponding change in the COBOL programs that used it.

q       In an effort to improve things, a whole new area of computer science called database applications (or simply databases) evolved.

q       In this lecture and the next, we shall learn some of the basic ideas in this area.

 

Some Basic terms

q       Entity: This is an object (abstract or concrete) about which information is being stored, e.g., student, book, car, etc.

q       Field: This is an attribute of an entity, e.g., name, identification number, etc.

q       Record: This is a collection of related fields, e.g., all the attributes of a student could constitute a record (ID number, name, major, GPA, etc.).

q       File: This is a collection of related records, e.g., all records about books in a library.  A file is usually arranged in a tabular form in which each row represents a record and each column represents a field.

 

Ø      A file organized in such a way that the records must be processed one by one from beginning to end is called a sequential file.

Ø      In such a file, consecutive records are separated by an inter-record gap.  The end of the file is indicated by a special character called the end-of-file (EOF) marker.

Ø      In contrast to sequential files, random access files (also called direct-access files) enable processing of a record without reviewing the other records.  One of the record fields is set as a key field or index.
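The terms above can be sketched in Python. This is only an illustration under stated assumptions: the StudentRecord class, the sample students and the two lookup functions are hypothetical names, and an in-memory list and dictionary stand in for real sequential and direct-access files.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """One record: a collection of related fields about one entity."""
    student_id: int  # the key field (sample values below are made up)
    name: str
    major: str
    gpa: float


# A "file": a collection of related records, one row per record.
student_file = [
    StudentRecord(1001, "Amal", "Physics", 3.4),
    StudentRecord(1002, "Bilal", "Chemistry", 3.1),
    StudentRecord(1003, "Chen", "Mathematics", 3.8),
]


def sequential_lookup(records, student_id):
    """Sequential access: examine records one by one from the beginning."""
    for position, record in enumerate(records):
        if record.student_id == student_id:
            return record, position + 1  # number of records examined
    return None, len(records)


# Random (direct) access: an index maps each key straight to its record.
index = {record.student_id: record for record in student_file}


def direct_lookup(key_index, student_id):
    """Direct access: reach the record through its key, skipping the rest."""
    return key_index.get(student_id)


record, examined = sequential_lookup(student_file, 1003)
print(record.name, "found after examining", examined, "records")
print(direct_lookup(index, 1003).name, "found directly via the key field")
```

The sequential search has to pass over every earlier record, while the indexed lookup goes to the record through its key field, which is the distinction drawn between the two kinds of file.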

q       Database:  This is a collection of related files, for example, all the files that are used to store information in a library system.

q       There are three main components in a database system.  These are:

Ø      Physical database: This consists of the actual files comprising the database, usually arranged in some structured manner to avoid duplication and redundancy and to ensure the integrity of the data.  The physical database is almost always on a disk, magnetic tape or CD-ROM.

Ø      Database Management Systems (DBMS): These are programs that provide facilities for creating, accessing and maintaining the physical database.  Examples of DBMSs are dBASE, Access and SQL-based systems.  Specifically, the DBMS must provide an efficient and convenient environment for users to create, add, retrieve and delete data.

Ø      Application Software:  These are programs written using the facilities provided by the DBMS to allow access to the physical database (or a partition of it) in a user-friendly and user-specific manner.
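The three components can be seen together in a small Python sketch. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS, which is not one of the systems named in this lecture; the books table, its sample rows and the two application functions are hypothetical.

```python
import sqlite3

# Physical database: here an in-memory SQLite database. Passing a file
# name (for example a hypothetical "library.db") would store it on disk.
connection = sqlite3.connect(":memory:")

# DBMS facilities: create, add, retrieve and delete data.
connection.execute(
    "CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, author TEXT)"
)
connection.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("111", "Databases", "A. Author"), ("222", "Algorithms", "B. Writer")],
)


# Application software: user-specific functions built on the DBMS.
def find_titles_by_author(conn, author):
    """Return the titles written by one author, in alphabetical order."""
    rows = conn.execute(
        "SELECT title FROM books WHERE author = ? ORDER BY title", (author,)
    )
    return [title for (title,) in rows]


def remove_book(conn, isbn):
    """Delete the book with the given ISBN, if it exists."""
    conn.execute("DELETE FROM books WHERE isbn = ?", (isbn,))


print(find_titles_by_author(connection, "A. Author"))
remove_book(connection, "111")
print(find_titles_by_author(connection, "A. Author"))
```

The application functions never touch the stored files themselves; they only ask the DBMS, which in turn manages the physical database.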

q       The following diagram shows a layered conception of a database.

[Figure: layered conception of a database]


q       Database Schema: This is the logical design of a database.  It consists of a set of tables of the data types used, giving the names of entities and attributes and the relationships among them.

q       Schemas are classified as internal when they correspond to the physical database and external when they correspond to data in application programs.

q       The capability to change the internal schema as needed without affecting the external schema is called data independence.

q       Schemas are generally written in a data definition language, which generates the schema tables stored in a file called a data dictionary.  The data dictionary is always accessed before any change to the physical data is made.  The following figure illustrates some of these concepts.
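The schema, data dictionary and data independence ideas can also be sketched in code. The sketch below assumes SQLite via Python's sqlite3 module: SQL's CREATE statements play the role of the data definition language, and SQLite's built-in sqlite_master table is used as a rough analogue of a data dictionary. The students table, the index name and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language: this statement describes the schema.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [(1001, "Amal", "Physics", 3.4), (1002, "Bilal", "Physics", 3.1)],
)


def data_dictionary(connection):
    """List the tables and indexes that SQLite has recorded for this database."""
    rows = connection.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return rows.fetchall()


def names_in_major(connection, major):
    """Application-level query: it mentions no storage details at all."""
    rows = connection.execute(
        "SELECT name FROM students WHERE major = ? ORDER BY name", (major,)
    )
    return [name for (name,) in rows]


before = names_in_major(conn, "Physics")

# Internal change: add an index, which alters how the data is organized
# for lookup but not what the application asks for.
conn.execute("CREATE INDEX idx_students_major ON students (major)")

after = names_in_major(conn, "Physics")
print(data_dictionary(conn))
print("same answer before and after the internal change:", before == after)
```

The application query is unchanged and returns the same answer after the index is added, which is a small instance of data independence; the new index nonetheless shows up in the dictionary listing.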