Lecture 32: Programming Languages

Objectives of this lecture

q Give a brief history of programming languages

q Introduce the concept of grammars, syntax & semantics

Historical perspectives

q Each computer has its own machine language based on its architecture.

q Before the development of programming languages, all programming was done in machine language.

q The problem with machine language is that everything (data, instructions and memory locations) must be represented in bits, which is tedious and error-prone.

q Another problem was that programs were machine dependent: a program written for one computer could not run on another.

q Consider for example the following machine language code represented in octal.

2463 load the content of location 63 to register 4

3446 subtract from register 4, the content of location 46

5457 store the content of register 4 to location 57

q Even though the code is simplified by writing it in octal rather than binary, it is clearly still very difficult to read and maintain.
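As a rough illustration of how such code is interpreted, the Python sketch below decodes and runs the three octal instructions above. It assumes an instruction format that is only inferred from the annotations (first digit is the operation, second digit the register, last two digits the memory location, all in octal). The opcode table, the names decode and run, and the initial memory contents are made up for the example.

```python
# Opcode meanings inferred from the three annotated instructions above;
# this is an assumption, not a description of a real machine.
OPCODES = {2: "load", 3: "subtract", 5: "store"}


def decode(word):
    """Split a 4-digit octal string into (operation, register, address)."""
    opcode = int(word[0], 8)
    register = int(word[1], 8)
    address = int(word[2:], 8)
    return OPCODES[opcode], register, address


def run(program, memory):
    """Execute the instructions in order, updating memory in place."""
    registers = {}
    for word in program:
        operation, register, address = decode(word)
        if operation == "load":
            registers[register] = memory[address]
        elif operation == "subtract":
            registers[register] -= memory[address]
        elif operation == "store":
            memory[address] = registers[register]
    return memory


# Hypothetical starting values for locations 63 and 46 (octal).
memory = {0o63: 10, 0o46: 3}
run(["2463", "3446", "5457"], memory)
print(memory[0o57])  # 10 - 3 = 7
```

Even in this tiny case the programmer has to remember numeric opcodes, register numbers and raw addresses, which is the difficulty the slide describes.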

q The first effort to solve some of the problems of machine language was the development of Assembly language. In this language, mnemonics are used to write statements. For example, the above code can be written as:

LOAD 4,x

SUBT 4,y

STORE 4,z
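The translation from mnemonics back to machine code is mechanical, which is what an assembler does. The sketch below is a minimal, hypothetical assembler for the three statements above. It assumes that x, y and z name the octal locations 63, 46 and 57 from the earlier machine code listing, and that LOAD, SUBT and STORE map to the opcodes 2, 3 and 5; both tables are assumptions made for illustration.

```python
# Assumed opcode and symbol tables, chosen to match the octal listing.
MNEMONICS = {"LOAD": 2, "SUBT": 3, "STORE": 5}
SYMBOLS = {"x": 0o63, "y": 0o46, "z": 0o57}


def assemble(line):
    """Translate one statement such as 'SUBT 4,y' into a 4-digit octal word."""
    mnemonic, operands = line.split()
    register, symbol = operands.split(",")
    opcode = MNEMONICS[mnemonic.upper()]
    # One octal digit each for opcode and register, two for the address.
    return f"{opcode}{int(register):o}{SYMBOLS[symbol]:02o}"


program = ["LOAD 4,x", "SUBT 4,y", "STORE 4,z"]
machine_code = [assemble(line) for line in program]
print(machine_code)  # ['2463', '3446', '5457']
```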

q Assembly languages became popular in the early 1950s and are still in use, especially when execution speed is critical.

q However, Assembly language is still difficult to use and is machine dependent, because the programmer is essentially manipulating the hardware. It is just a direct translation of the machine language.

q In 1954, IBM began developing the first widely used high-level programming language, FORTRAN (FORmula TRANslating system); the first compiler was released in 1957.

q FORTRAN allows the operations in the above example to be expressed as a single assignment, z = x - y, which resembles the algebraic formula for the operation.

q FORTRAN is still used by scientists and engineers and has been revised several times; FORTRAN-90 was followed by further revisions of the standard.

q FORTRAN introduced the concept of modularity, where a program is broken into subroutines or subprograms. This was the basis for a newer concept, block structuring, which allows modularity within a program.

q Block structuring coupled with recursion led to the development of a new language called ALGOL (ALGOrithmic Language) in 1958, which later evolved to ALGOL-60 and ALGOL-68.

q Prompted by the success of FORTRAN in the scientific community, Grace Murray Hopper initiated a movement in the mid-1950s to introduce computers to the business world. The result was COBOL (COmmon Business-Oriented Language).

q The combined features of FORTRAN, ALGOL & COBOL were the genesis of another IBM language called PL/I (Programming Language I) that was developed in 1964.

q PL/I was intended to replace these languages with a single language. This did not work, however, because the language was very complicated, and PL/I gradually fell out of use.

q Several other languages were developed as computers evolved. For example, BASIC (Beginner’s All-purpose Symbolic Instruction Code) was introduced in 1964 and became very popular for its simplicity and interactive features.

q BCPL (Basic Combined Programming Language) was developed in the late 1960s as an experimental systems language and, by way of the intermediate language B, gave rise to the C language, developed by Dennis Ritchie in the early 1970s.

q C enjoys widespread use in business, education and industry.

q Pascal was introduced in 1971 as an educational language and is used in many educational environments. C and Pascal are examples of ALGOL-based languages.

q The following figure shows a chronology of programming languages over the past six decades.

Which is the best language?

q It is difficult to say which language is the best, since a language may be good for certain types of applications and bad for others.

q We saw that the attempt to have one language for all applications (PL/I) failed, and no single language has filled that role since.

q COBOL and recent fourth-generation languages (4GLs) are good for data processing applications.

q FORTRAN is good for numerical computation.

q C is a good structured language for numerical computation. It is also suitable for system programming because of its facilities for handling bits.

q For a language to be good, it should be clear, simple and unambiguous and should have a predefined purpose. It should be flexible enough to program applications. It should be cost effective for its users and compatible with a variety of computers.

Grammars, Syntax and Semantics

q In all natural languages, there are rules for forming sentences. In English, for example, letters of the alphabet are used to form words, and words are combined to form sentences.

q These rules are divided into syntax and semantics.

q Syntax rules are concerned with how words are combined to form sentences (or statements). For example, a syntax rule for forming an English sentence could be: a sentence should consist of a subject, a verb and an object.

q If we assume that yllib is a subject, syub a verb and spihc an object, then

Yllib syub spihc

satisfies the syntax rule.

q Semantics, on the other hand, is concerned with the meaning of a sentence. Clearly, the above example does not convey any meaning and is therefore not a valid English sentence. If we reverse the letters of each word, we get

Billy buys chips

and the meaning is now obvious.
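The same distinction can be seen in a programming language. The sketch below uses Python's standard ast module to separate a statement that breaks the syntax rules from one that is well formed but has no sensible meaning. The function name check and the sample statements are made up for the example, and treating a run-time TypeError as a "semantic" error is a simplification.

```python
import ast


def check(source):
    """Classify a one-line program as 'syntax error', 'semantic error' or 'ok'."""
    try:
        tree = ast.parse(source)  # applies only the syntax rules
    except SyntaxError:
        return "syntax error"
    try:
        # Running the code is where its meaning is actually tested.
        exec(compile(tree, "<example>", "exec"), {})
    except TypeError:
        return "semantic error"
    return "ok"


print(check("total = 3 -"))          # malformed: an operand is missing
print(check("total = 'chips' - 1"))  # well formed, but subtracting from text is meaningless
print(check("total = 3 - 1"))        # well formed and meaningful
```

Like "Yllib syub spihc", the second statement has the right shape but does not mean anything.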

q Programming languages also have syntax and semantics. The syntax rules are usually represented as syntax diagrams. For example, the syntax for the IF statement is as follows:

q The syntax of a programming language can also be described using a metalanguage, a language describing other languages.

q A metalanguage must be precise and unambiguous. An example of a metalanguage is Backus-Naur Form (BNF).

q In BNF, the rules of the language are expressed as productions, which take the form

A::=B

where ::= stands for “is defined as”

e.g., sentence ::= subject predicate

q Other metasymbols are

[ ] for optional inclusions

{ } for repetition

| for choice

e.g., predicate ::= verb [object]
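To show how such rules can be applied, the sketch below checks a sentence against the two productions used as examples, sentence ::= subject predicate and predicate ::= verb [object]. The word lists and function names are assumptions made for illustration; a real grammar would define subject, verb and object with further rules.

```python
# Hypothetical terminal words for each category.
SUBJECTS = {"Billy"}
VERBS = {"buys"}
OBJECTS = {"chips"}


def is_predicate(words):
    """predicate ::= verb [object]"""
    if not words or words[0] not in VERBS:
        return False
    rest = words[1:]
    # The object is optional, so either nothing or exactly one object follows.
    return rest == [] or (len(rest) == 1 and rest[0] in OBJECTS)


def is_sentence(text):
    """sentence ::= subject predicate"""
    words = text.split()
    if not words or words[0] not in SUBJECTS:
        return False
    return is_predicate(words[1:])


print(is_sentence("Billy buys chips"))  # True
print(is_sentence("Billy buys"))        # True, the object is optional
print(is_sentence("buys chips"))        # False, the subject is missing
```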

q Symbols are either terminal or non-terminal. Terminal symbols are written explicitly; they are the actual symbols of the language and cannot be expanded further.

q Non-terminal symbols stand for sequences of other symbols of the grammar and are expanded according to the BNF rules.

q The syntax of most programming languages is described by a context-free grammar (CFG). Such a grammar consists of a set of non-terminals, one of which is the start symbol that begins every derivation, a set of terminals, and a set of production rules of the form

A::=B

where A is a single non-terminal symbol.
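One possible way to picture these components is as plain data. In the sketch below, the keys of the dictionary are the non-terminals, every other symbol is a terminal, and each entry lists the alternatives for one production. The grammar itself, the names GRAMMAR, START and expand, and the choice of words are assumptions for illustration, and the expansion only works for grammars without recursion.

```python
from itertools import product

# Non-terminals are the dictionary keys; everything else is a terminal.
GRAMMAR = {
    "sentence": [["subject", "predicate"]],
    "predicate": [["verb"], ["verb", "object"]],
    "subject": [["Billy"]],
    "verb": [["buys"]],
    "object": [["chips"]],
}
START = "sentence"


def expand(symbol):
    """Return every list of terminals that can be derived from symbol."""
    if symbol not in GRAMMAR:  # terminal: nothing left to expand
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combination in product(*parts):
            results.append([word for part in combination for word in part])
    return results


sentences = [" ".join(words) for words in expand(START)]
print(sentences)  # ['Billy buys', 'Billy buys chips']
```

Each left-hand side is a single non-terminal, which is what makes the grammar context-free: a symbol is expanded the same way wherever it appears.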

q Natural languages, by contrast, usually depend on context.

q Grammars also determine the order of operations. The structure of a statement is usually represented as a hierarchy called a parse tree. For example, the expression

w::=x * y + z

will have the following parse tree

This indicates that multiplication must be performed before addition.
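The way a grammar fixes this order can be sketched with a small parser for the right-hand side, x * y + z. It assumes two rules that are not given in the lecture: an expression is a sequence of terms joined by +, and a term is a sequence of names joined by *. The result is a simplified tree of nested tuples rather than a full parse tree, and all function names are hypothetical.

```python
def parse_term(tokens):
    """term ::= name { * name }; returns (tree, remaining tokens)."""
    node, rest = tokens[0], tokens[1:]
    while rest and rest[0] == "*":
        node = ("*", node, rest[1])
        rest = rest[2:]
    return node, rest


def parse_expression(tokens):
    """expression ::= term { + term }; returns (tree, remaining tokens)."""
    node, rest = parse_term(tokens)
    while rest and rest[0] == "+":
        right, rest = parse_term(rest[1:])
        node = ("+", node, right)
    return node, rest


def evaluate(node, values):
    """Evaluate the tree bottom-up, so deeper operations run first."""
    if isinstance(node, str):
        return values[node]
    operator, left, right = node
    a, b = evaluate(left, values), evaluate(right, values)
    return a * b if operator == "*" else a + b


tree, leftover = parse_expression("x * y + z".split())
print(tree)                                     # ('+', ('*', 'x', 'y'), 'z')
print(evaluate(tree, {"x": 2, "y": 3, "z": 4}))  # 2 * 3 + 4 = 10
```

Because the multiplication sits lower in the tree than the addition, it is evaluated first, whatever order the operators appear in.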