Sunday, September 27, 2009

Principles of Programming Languages - Module 5

Module 5
Elementary Data Types

A data type is a class of data objects together with a set of operations for creating and manipulating them.

An elementary data object contains a single data value.  A class of such data objects over which various operations are defined is termed an elementary data type.

The basic elements of a specification of an elementary data type are as follows:
1. Attributes that distinguish data objects of that type.
2. Values that data objects of that type may have. The type of a data object determines the set of possible values that it may contain. The set of values defined by an elementary data type is usually an ordered set with a least value and a greatest value; for any pair of distinct values, one is greater than the other.
3. Operations that define the possible manipulations of data objects of that type. They may be primitive, which means that they are specified as part of the language definition, or they may be programmer-defined, in the form of subprograms or method declarations as part of class definitions. Examples: integer addition, the use of the equal symbol to test equality, and the square-root operation.

Implementation of Elementary Data Types

The implementation of an elementary data type consists of a storage representation for data objects and values of that type, and a set of algorithms or procedures that define the operations of the type in terms of manipulations of the storage representation.

Storage Representation

Storage for elementary data types is strongly influenced by the underlying computer that will execute the program. For example, the storage representation for integer or real values is almost always the integer or floating-point binary representation for numbers used in the underlying hardware.

The representation of a data object is ordinarily independent of its location in memory. The storage representation is usually described in terms of the size of the block of memory required (the number of memory words, bytes, or bits needed) and the layout of the attributes and data values within that block. Usually the address of the first word or byte of such a block of memory is taken to represent the location of the data object.
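As a rough illustration of block size and location, here is a minimal C++ sketch. The helper names block_size and location_of are hypothetical, introduced only for this example, and the sizes reported for most types are implementation-defined.

```cpp
#include <cstddef>

// Size, in bytes, of the storage block used for one object of type T.
// Only sizeof(char) == 1 is guaranteed; other sizes depend on the platform.
template <typename T>
std::size_t block_size() {
    return sizeof(T);
}

// Address of the first byte of the block holding obj, which is what
// is normally taken to be the "location" of the data object.
template <typename T>
const void* location_of(const T& obj) {
    return static_cast<const void*>(&obj);
}
```

For instance, block_size<double>() reports how many bytes a double occupies on the machine at hand, and copying an int to another variable changes its location but not its representation.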

Implementation of operations

Each operation defined for data objects of a given type may be implemented in one of three main ways:
1. Directly as a hardware operation. For example, if integers are stored using the hardware representation for integers, then addition and subtraction are implemented using the arithmetic operations built into the hardware.
2. As a procedure or function subprogram. For example, a square-root operation is usually not provided directly as a hardware operation. It might be implemented as a square-root subprogram that calculates the square root of its argument.
3. As an inline code sequence. An inline code sequence is also a software implementation of the operation. However, instead of using a subprogram, the operations in the subprogram are copied into the program at the point where the subprogram would otherwise have been invoked. For example, the absolute-value function on numbers, defined by abs(x) = if x < 0 then -x else x, is usually implemented as an inline code sequence:
a. Fetch value of x from memory.
b. If x >= 0, skip the next instruction.
c. Set x = -x.
d. Store new value of x in memory.
Each line is implemented by a single hardware operation.
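The absolute-value steps above can be sketched in C++ roughly as follows. The name abs_value is hypothetical, and the inline keyword is only a request: whether the body is actually copied into the call site is up to the compiler.

```cpp
// Absolute value following abs(x) = if x < 0 then -x else x.
// Note: negating the most negative int overflows; this sketch ignores that case.
inline int abs_value(int x) {
    if (x < 0) {   // when x >= 0 the negation is skipped
        x = -x;
    }
    return x;
}
```

A call such as abs_value(-5) yields 5, and abs_value(0) leaves the value unchanged.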

Declaration

A declaration is a program statement that serves to communicate to the language translator information about the name and type of data objects needed during program execution. By its placement in the program (e.g. within a particular subprogram or class definition), a declaration may also serve to indicate the desired lifetimes of the data objects.

Sample declarations of some programming languages:

• The C declaration float A, B; at the start of a subprogram indicates that two data objects of type float are needed during execution of the subprogram. This is an example of an explicit declaration.
• FORTRAN is a programming language that provides implicit or default declarations, which are declarations that hold when no explicit declaration is given. The variable INDEX may be used without explicit declaration; it is assumed by the compiler to be an integer variable because its name begins with one of the letters I-N.
• In Perl, simply assigning a value to a variable determines the type of the variable.
Examples:
$MyVar = 'This is a string';   # $MyVar is now a string variable
$MyVar = 890;                  # $MyVar is now an integer variable

The most important purpose of declarations, from the programmer's viewpoint, is that they allow for static rather than dynamic type checking.

Data storage representations that are built into the computer hardware usually include no type information, and the primitive operations on the data do no type checking. For example, a particular word in computer memory during execution of a program may contain the bit sequence 101010010100…1001, which may represent an integer, a real number, a sequence of characters, or an instruction. The hardware primitive operation for integer addition cannot check whether its two arguments represent integers; they are simply bit sequences. At the hardware level, conventional computers are particularly unreliable in detecting data type errors.
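To make the point concrete, the sketch below reads the same 32 bits once as an unsigned integer and once as a float. It assumes a 32-bit IEEE 754 float, which is common but not guaranteed, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// The bits taken at face value as an unsigned integer.
std::uint32_t as_integer(std::uint32_t bits) {
    return bits;
}

// The same bits reinterpreted as a float.
// Assumption: float is a 32-bit IEEE 754 value on this machine.
float as_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // copy the raw bytes, no conversion
    return f;
}
```

Under that assumption the bit pattern 0x3F800000 is the integer 1065353216 when read one way and the real number 1.0 when read the other; nothing in the bits themselves says which reading is intended.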

Type checking means checking that each operation executed by a program receives the proper number of arguments of the proper data types. For example, given the expression X = A + B * C, the compiler must determine for each operation (addition, multiplication, and assignment) that it receives two arguments of the proper data type.

Type checking may be done at run time (dynamic type checking) or at compile time (static type checking).

Dynamic type checking is run-time type checking usually performed immediately before the execution of a particular operation. It is usually implemented by storing a type tag in each data object that indicates the data type of the object. For example, an integer data object would contain both the integer value and an integer type tag.
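A minimal sketch of the type-tag idea in C++ might look like the following. The names TypeTag, Tagged, make_int, make_real, and add_int are all hypothetical; real dynamically typed languages use their own internal layouts.

```cpp
#include <stdexcept>

// Tag stored alongside every value.
enum class TypeTag { Integer, Real };

// A data object that carries both its value and its type tag.
struct Tagged {
    TypeTag tag;
    union { long i; double r; } value;
};

Tagged make_int(long v)    { Tagged t; t.tag = TypeTag::Integer; t.value.i = v; return t; }
Tagged make_real(double v) { Tagged t; t.tag = TypeTag::Real;    t.value.r = v; return t; }

// Integer addition that checks the tags immediately before operating.
Tagged add_int(const Tagged& a, const Tagged& b) {
    if (a.tag != TypeTag::Integer || b.tag != TypeTag::Integer) {
        throw std::invalid_argument("add_int: both operands must be integers");
    }
    return make_int(a.value.i + b.value.i);
}
```

Here add_int(make_int(2), make_int(3)) succeeds, while passing a value built with make_real is rejected at run time. The cost is the extra storage for the tag and the check on every operation.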

Scalar Data Types

A scalar data object has a single attribute: its value. For example, an integer object has an integral value (e.g. 130, -90, 12), and no other information can be obtained from that object. Scalar types generally follow the hardware architecture of the computer (e.g. integers, floating-point numbers, characters).

Integers

The set of integer values defined for the type forms an ordered subset, within some finite bounds, of the infinite set of integers studied in mathematics. The maximum integer value is sometimes represented as a defined constant.
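In C++, for example, the bounds are available through std::numeric_limits. The sketch below uses them to test whether an addition would leave the representable range; kIntMin, kIntMax, and add_would_overflow are hypothetical names chosen for this example.

```cpp
#include <limits>

// Least and greatest values of int on this implementation.
constexpr int kIntMin = std::numeric_limits<int>::min();
constexpr int kIntMax = std::numeric_limits<int>::max();

// True if a + b would fall outside [kIntMin, kIntMax].
// The test is arranged so that it never overflows itself.
bool add_would_overflow(int a, int b) {
    if (b > 0) {
        return a > kIntMax - b;
    }
    return a < kIntMin - b;
}
```

For instance, add_would_overflow(kIntMax, 1) is true, while add_would_overflow(1, 2) is false.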

Operations on these data objects include:
• Arithmetic
• Relational
• Assignment
• Bit

Floating-Point Real Numbers

A floating-point real number data type is often specified with only the single data type attribute real, as in FORTRAN, or float, as in C. The values form an ordered sequence from some hardware-determined minimum negative value to a maximum value, but the values are not distributed evenly across this range. In some languages, the precision required for floating-point numbers, in terms of the number of digits used in the decimal representation, may be specified by the programmer.

Operations on these data objects include:
• Arithmetic
• Relational
• Assignment

Relational (Boolean-valued) operations on these objects are sometimes restricted. Because of roundoff, two real numbers that are mathematically equal rarely compare as exactly equal, so equality tests on reals are unreliable.
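A common workaround is to compare within a tolerance instead of testing exact equality. The helper below is a minimal sketch; the name nearly_equal and the default tolerance of 1e-9 are assumptions for illustration, and a suitable tolerance depends on the magnitudes involved.

```cpp
#include <cmath>

// Treat a and b as equal when they differ by at most tol.
// An absolute tolerance like this suits values of moderate size only.
bool nearly_equal(double a, double b, double tol = 1e-9) {
    return std::fabs(a - b) <= tol;
}
```

With IEEE 754 doubles, the sum 0.1 + 0.2 is not exactly 0.3, so a test with == fails, whereas nearly_equal(0.1 + 0.2, 0.3) accepts the two as equal.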

Fixed-Point Real Numbers

Data objects such as money amounts in pesos and cents, and other rational values that require a specific number of decimal places (typically two), should not be stored as floating-point values, which are subject to roundoff errors, or as plain integers, which cannot hold the fractional part. A form of fixed-point data should be used to represent these values instead.

A fixed-point number is represented as a digit sequence of fixed length, with the decimal point positioned at a given point between two digits.
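C++ has no built-in fixed-point type, so one way to sketch the idea is a scaled integer that counts cents, with the decimal point fixed two digits from the right. The names Money, make_money, money_add, and money_to_string are hypothetical, and the sketch handles non-negative amounts only.

```cpp
#include <cstdint>
#include <string>

// Fixed-point amount with two decimal places, stored as a count of cents.
struct Money {
    std::int64_t cents;
};

Money make_money(std::int64_t pesos, int cents) {
    return Money{pesos * 100 + cents};
}

// Addition is exact because it is plain integer addition.
Money money_add(Money a, Money b) {
    return Money{a.cents + b.cents};
}

// Render with the decimal point in its fixed position (non-negative amounts).
std::string money_to_string(Money m) {
    std::string frac = std::to_string(m.cents % 100);
    if (frac.size() < 2) {
        frac = "0" + frac;
    }
    return std::to_string(m.cents / 100) + "." + frac;
}
```

Adding 0.10 and 0.20 this way gives exactly 0.30, with no roundoff.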

Complex numbers

This numeric data type consists of a pair of numbers representing the number's real and imaginary parts.
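In C++ this pair is provided by the standard std::complex template. The small wrapper below, with the hypothetical name complex_product, only shows the pair being carried through a multiplication.

```cpp
#include <complex>

// Multiply two complex numbers, each stored as a (real, imaginary) pair.
std::complex<double> complex_product(std::complex<double> a,
                                     std::complex<double> b) {
    return a * b;
}
```

For example, (1 + 2i)(3 + 4i) = 3 + 4i + 6i + 8i^2 = -5 + 10i, so the result has real part -5 and imaginary part 10.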

Rational Numbers

A rational number is the quotient of two integers. It is included in some programming languages to avoid the problems of roundoff and truncation encountered in floating-point and fixed-point representations of reals.
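C++ has no built-in rational type, so the following is only a sketch of how one could be represented as a pair of integers kept in lowest terms. All names (Rational, gcd_of, make_rational, rational_add) are hypothetical, and overflow of the numerator or denominator is not handled.

```cpp
#include <stdexcept>

// A rational number as numerator / denominator, denominator kept positive.
struct Rational {
    long num;
    long den;
};

// Greatest common divisor by Euclid's algorithm, on absolute values.
long gcd_of(long a, long b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Build a rational in lowest terms.
Rational make_rational(long n, long d) {
    if (d == 0) {
        throw std::invalid_argument("make_rational: zero denominator");
    }
    if (d < 0) { n = -n; d = -d; }
    long g = gcd_of(n, d);
    return Rational{n / g, d / g};
}

// Exact addition: a/b + c/d = (a*d + c*b) / (b*d).
Rational rational_add(Rational a, Rational b) {
    return make_rational(a.num * b.den + b.num * a.den, a.den * b.den);
}
```

With this representation 1/3 + 1/6 is exactly 1/2, whereas a floating-point sum could only approximate the operands.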

Enumerations

An ordered list of distinct values is called an enumeration. Languages such as C and C++ include an enumeration data type that allows the programmer to define and manipulate such variables more directly.

Below is a sample program that uses the enum data type:

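#include <iostream>

int main()
{
    // Enumerators are numbered from 0: One = 0, Two = 1, Three = 2.
    enum x {One, Two, Three};
    x y = Three;

    // y converts to its underlying integer value, 2, for the comparison.
    if (y == 2)
        std::cout << "Three" << std::endl;

    return 0;
}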

Booleans

Most languages provide a data type for representing true or false, usually called a Boolean or logical data type.

The most common operations on Boolean types include assignment as well as the following logical operations:
• And - Conjunction
• Or - Inclusive disjunction
• Not - Negation or complement

Characters

A character data type provides data objects that have a single character as their value. The set of possible character values is usually taken to be a language-defined enumeration corresponding to the standard character sets supported by the underlying hardware and operating system, such as the ASCII character set.

The ordering of the characters in this character set is called the collating sequence for the character set. Collating sequence is important because it determines the alphabetical ordering given to character strings by the relational operations.  Spaces, digits, and special characters may be alphabetized as well.

Operations include:

• Relational
• Assignment
• Testing whether a character value is one of the special classes letter, digit, or special character
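The classification test and the relational comparison can be sketched in C++ as below. The function names are hypothetical, and the remarks about ordering assume the ASCII character set mentioned above.

```cpp
#include <cctype>
#include <string>

// Classify a character as "letter", "digit", or "special".
std::string char_class(char c) {
    // The <cctype> functions expect a value representable as unsigned char.
    unsigned char u = static_cast<unsigned char>(c);
    if (std::isalpha(u)) return "letter";
    if (std::isdigit(u)) return "digit";
    return "special";
}

// Relational comparison follows the collating sequence,
// i.e. the numeric codes of the characters.
bool comes_before(char a, char b) {
    return a < b;
}
```

In ASCII the space sorts before the digits, the digits before the uppercase letters, and the uppercase letters before the lowercase ones, so comes_before('A', 'a') is true.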

Composite Data Types

Some data types that are usually considered elementary nevertheless require a complex data structure organization by the compiler. Thus, multiple attributes are often given for each such data type.

Character Strings

A character string is a data object composed of a sequence of characters.

There are at least three different treatments of character-string data types:

• Fixed declared length.  A character-string data object may have a fixed length that is declared in the program.
• Variable length to a declared bound.  A character-string data object may have a maximum length that is declared in the program, but the actual value stored in the data object may be a string of shorter length, possibly even the empty string of no characters.
• Unbounded length.  A character-string data object may have a string value of any length, and the length may vary dynamically during execution  with no bound (beyond available memory).

Operations on Strings:
• Concatenation - an operation of joining two character strings to make one long string
• Relational operations on strings - comparing strings with relational operators such as less than and greater than, with the ordering determined by the collating sequence
• Substring selection using positioning subscripts - extracting substrings through positioning subscripts
• Input-output formatting - formatting of strings when read or displayed
• Substring selection using pattern matching - extracting substrings through pattern matching
• Dynamic strings - The language Perl supports static and dynamic strings.  The string '$ABC' is static, and the statement print '$ABC'; prints the literal text $ABC.  The string "$ABC" is dynamic: the statement print "$ABC"; causes the string to be evaluated, and the value of the Perl variable $ABC is printed instead.
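Several of the string operations listed above map directly onto std::string in C++, which is an unbounded-length string type. The wrapper names below are hypothetical and exist only to label each operation.

```cpp
#include <string>

// Concatenation: join two strings into one longer string.
std::string concat(const std::string& a, const std::string& b) {
    return a + b;
}

// Relational operation: character-by-character comparison by character code.
bool sorts_before(const std::string& a, const std::string& b) {
    return a < b;
}

// Substring selection by position: len characters starting at index pos.
std::string substring_at(const std::string& s, std::size_t pos, std::size_t len) {
    return s.substr(pos, len);
}

// A very simple form of pattern matching: does s contain pattern?
bool contains(const std::string& s, const std::string& pattern) {
    return s.find(pattern) != std::string::npos;
}
```

For example, concat("data", " type") gives "data type", and substring_at("elementary", 0, 7) gives "element".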

Pointers

A pointer is also called a reference or access type.  It contains the location of another data object or may contain the null pointer, nil or null.  Pointers are ordinary data objects that may be simple variables or components of arrays and records.
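A short C++ sketch of a pointer that either designates another data object or holds the null pointer follows. The function name value_or is hypothetical.

```cpp
// Return the value of the object p points to,
// or fallback when p is the null pointer.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}
```

Given int x = 42;, the call value_or(&x, 0) follows the pointer to x, while value_or(nullptr, -1) never dereferences anything and returns the fallback.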

Files and Input-Output

A file is a data structure with two special properties:
1. It ordinarily is represented on a secondary storage device such as a disk or tape and thus may be much larger than most data structures of other types.
2. Its lifetime may encompass a greater span of time than that of the program creating it.

Sequential Files

The sequential file is the most common type of file.  It is a data structure composed of a linear sequence of components of the same type.

Operations

• Open - this operation ordinarily requests information from the operating system about the location and properties of the file, allocates the required internal storage for buffers and other information, and sets the file-position pointer to the first component of the file.
• Read - this operation transfers the contents of the current file component to a designated variable in the program.
• Write - this operation creates a new component at the current position in the file (always at the end) and transfers the contents of a designated program variable to the new component.
• End-of-file test - A read operation fails if the file position pointer designates the end of the file.  Because the file is of variable length, an explicit test for the end-of-file position is needed so that the program may take special action.
• Close - this operation involves notification to the operating system that the file can be detached from the program (and potentially made available to other programs) and possibly also deallocation of internal storage used for the file (such as buffers and buffer variables) (Pratt and Zelkowitz, 2001).
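The operations above can be sketched with the C++ stream library, treating a text file of integers as a sequential file whose components are the integers. The function names write_all and read_all, and the one-integer-per-line layout, are assumptions made for this example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Open for writing, append each value as a new component, then close.
bool write_all(const std::string& path, const std::vector<int>& values) {
    std::ofstream out(path);   // open: creates or truncates the file
    if (!out) {
        return false;
    }
    for (int v : values) {
        out << v << '\n';      // write: new component at the end of the file
    }
    out.close();               // close: detach the file from the program
    return !out.fail();
}

// Open for reading and read components until the end of the file.
std::vector<int> read_all(const std::string& path) {
    std::ifstream in(path);    // open: position at the first component
    std::vector<int> values;
    int v;
    while (in >> v) {          // read fails at end of file, ending the loop
        values.push_back(v);
    }
    return values;             // the stream's destructor closes the file
}
```

Writing the values 1, 2, 3 with write_all and then calling read_all on the same path returns them in the same order, since components can only be read in the sequence in which they were written.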
