Monday, September 24, 2007

Web page designing

What is a Web Page?
A web page is a page with hypertext links that cross-reference text on the Internet. A web page is also known as an HTML page because it is coded in the HTML language.
Today web pages (HTML pages) are the standard interface of the Internet.

Power of HTML
Earlier, HTML pages could only hold text. However, since the boom of the Internet, people have added more and more capabilities to this language.
It can now hold images, animations, multimedia content and even interactive applications.

What is HTML?
Most HTML tags have two parts: the starting tag, which indicates the start of text or formatting, and the closing tag, which indicates the end of that text or formatting.
The closing tag is the same as the starting tag; the only difference is that it has a / just after the <.

Creating and editing web pages
Since a web page is a text file, it can be created or edited in any text editor. However, there are applications specially made for designing web pages. These applications are known as HTML editors.
Use any editor of your choice to create web pages.

A blank HTML page
A blank HTML page has the following code:

<html>
<head>
<title>It appears in the title bar</title>
</head>
<body>
It appears in the page
</body>
</html>

Creating your first web page
The code above is a blank web page. Copy it to Notepad. Write the title of the page between the <title> and </title> tags, and write one or two paragraphs of text between the <body> and </body> tags.
Give the save command and, in the file name box, type the filename inside double quotation marks ("") with an extension of '.htm'. Now you can open the document in any browser and view it.

Tags and their Properties

Most HTML tags have properties. The properties of a tag go inside the opening tag. A tag can have any number of properties, separated by spaces. Most properties have a value. For example, if you are specifying the color property, then its value will be the name of the color. A tag with properties is written in the form <tag property="value">.

Body tags and their meaning

Opening Tag   Closing Tag   Use
<body>        </body>       Visible area of the HTML page
<font>        </font>       Formatting start and end
<p>           </p>          Paragraph start and end
<hr>          [None]        Horizontal line
<br>          [None]        Line break
<b>           </b>          Start and end bold text
<i>           </i>          Start and end italic text
<u>           </u>          Start and end underlined text









Example


First page



Welcome to my site






More tag properties

Property   Value

<p> tag
align      Paragraph alignment (left, center, right or justify)

<h1> to <h6> tags
<h1>       Heading one (the largest font size)
....
<h6>       The smallest heading; each level is smaller than the previous one

<hr> tag
width      Width of the line in pixels or percent
align      Alignment (left, center or right)
color      Color of the line (IE only)
size       Size of the horizontal line

<font> tag
face       Font face
size       Font size
color      Font color

List tags and their use

Opening Tag   Closing Tag   Use
<ul>          </ul>         Start and end of the bullet list
<ol>          </ol>         Start and end of the number list
<dir>         </dir>        Start and end of the directory list
<li>          </li>         Start and end of the list item
<div>         </div>        Start and end of a division or paragraph
<img>         [None]        Insert image
<a>           </a>          Hyperlink or anchor



Tag properties

<ul> tag
Type    Type of bullet (circle, square or disc)

<ol> tag
Type    Type of numbering (1, A, a, I or i)
Start   Beginning count of the numbering (e.g. 1, 2, 3, 4 etc.)

<div> tag
Align   Alignment of the paragraph (left, center, right or justify)

          Example


          List


          Computer Course



          1. Java

          2. Oracle

          3. C++

          4. HTML

          5. Javascript




img tag properties

Property   Value
SRC        Location of the image (e.g. "c:\windows\circle.gif")
WIDTH      Width of the image in pixels
HEIGHT     Height of the image in pixels
ALT        Alternate or tool-tip text
VSPACE     Space above and below the image in pixels
HSPACE     Space to the right and left of the image in pixels
NAME       Name of the image
LOWSRC     Location of a lower-resolution image
ALIGN      Alignment of text with the image (top, middle, bottom etc.)

          Example


          Image


          Welcome to My site


          Click here




Anchor tag properties

Property   Value
<a> tag
NAME       Name of the anchor
HREF       Location of the file that is referenced
TARGET     Name of the window or frame in which to open the target file

NOTE: A hyperlink is the part of a document that references another document. When you click on a hyperlink, the referenced document is opened. An anchor is a reference point inside a document that can be referenced by a hyperlink. When the <a> tag is used as an anchor it may enclose no text, but it should still be closed with </a>.

          Table in an HTML document

In HTML a table begins with a <table> tag and ends with a </table> tag.
Between these tags are the rows, each enclosed between <tr> and </tr> tags. So there is one pair of <tr> and </tr> tags for each row.
Inside these rows are the cells, which are enclosed between <td> and </td> tags.


          Example

          Table


The number of <td> and </td> tag pairs inside every row should be the same. The contents of the cell go between the <td> and </td> tags.










          Row 1, Col 1

          Row 1, Col 2

          Row 2, Col 1

          Row 2, Col 2


Contents of a cell
The contents of a cell are written between the <td> and </td> tags. A table cell can contain any text, HTML, an image or even another table (i.e. you can have a table inside another table).
Tables are used in web pages to arrange the layout and/or to display tabular data.

Table/cell properties

<table> and <td> tags

Property      Value
WIDTH         Width of the table/cell in pixels or percent
HEIGHT        Height of the table/cell in pixels or percent
BGCOLOR       Background color of the table/cell
BACKGROUND    Background image of the table/cell

<table> tag only

Property      Value
BORDER        Thickness of the table border in pixels
BORDERCOLOR   Color of the table border
CELLSPACING   Space between cells in pixels
CELLPADDING   Space between the cell border and its content in pixels

More tag properties

<td> tag only

Property      Value
ROWSPAN       Number of rows the cell spans

          Example



















          Simple Table With Formatting

          Row 1, Col 1

          Row 1, Col 2

          Row 2, Col 1

          Row 2, Col 2

          This is a double-width, double-height cell with centered contents.


          Images in HTML document
As we learnt earlier, a web page can only store text. So any image that appears inside the page is not inside the HTML document. The image is actually a separate file, and there is a tag in the HTML page specifying the location of the image. The browser displays the image in the page as if it were a part of that document.

Image formats in an HTML page
There are two image formats used in web pages: GIF (Graphics Interchange Format, ".gif") and JPEG (Joint Photographic Experts Group, ".jpg"). The PNG (Portable Network Graphics, ".png") format is still under consideration and will most probably be used as a web graphic format in the future.

Inserting an image into the HTML page
If you have a JPEG or GIF image you can insert it into an HTML page by inserting the <img> tag. Refer to lecture 3 for the properties of the tag. The tag looks like:

          Example
The SRC property of the tag is a must. All other properties are optional. If you don't specify the width and height properties, the image dimensions are calculated automatically. If you do specify the width and height properties, the image is resized in the browser when it is displayed. If the width and height properties are not specified, the browser takes a little longer to display the image because it has to calculate the values, so it is better to specify them at design time.
Other properties of the tag are ALT and LOWSRC. The ALT property specifies the text that will appear if the image cannot be shown in the page or when the mouse moves over the image. The LOWSRC property specifies a low-resolution image that will be loaded before the actual high-resolution image is loaded. This property is used to show a low-resolution image instead of a blank space before the large image appears.
The format will be something like this:
Example:
Home Button

          Forms in a HTML document


Forms are used in a web page to collect information from the user. Generally a user cannot write to or edit a web page in the browser, but in a form the user can type and enter data, which can be collected by the web site owner.
For example, a form can be used to accept the username and password of a user to log in to the system, or to take the user's details for any other purpose.

Tags for inserting a form
A form is inserted into a web page using the <form> and </form> tags. All the elements of a form are put between these tags, using the <input> tag. The Type property of the <input> tag determines the type of form element it is. For example, <input type="text"> will be a text box and <input type="password"> will be a password box. The <input> tag does not have a closing tag.
All types of form fields are inserted using the <input> tag except the Drop Down and List Box. These are enclosed between <select> and </select> tags. The list items are placed between these tags. Each list item is enclosed between a pair of <option> and </option> tags.

The types of elements in a form

Form Element    Tag Type    Description
Text Box        Text        Field where the user can enter any text
Password Box    Password    Field where the user can enter a password
Text Area       Textarea    Field where the user can enter multiple lines of text
Check Box       Checkbox    Field where the user can check one or more of the available options
Radio Button    Radio       Field where the user can select any one of the available options
Button          Button      Command button used to enter a command
Submit Button   Submit      Command button used to submit the form
Reset Button    Reset       Command button used to reset the form
Drop Down Box   <select>    Field where the user can select an item from the drop-down menu
List Box        <select>    Field where the user can select one or more items from a list

          Example

Here is a blank web page with only a form



          Form Page



          Full Name:

          Gender:MaleFemale

          Age Group:



Form elements and their properties

<FORM> tag properties

Property    Value
NAME        Name of the form
METHOD      How the form data will be sent (Get or Post)
ACTION      The script or program file that will handle the form data

Text Field/Password Field/File Field

Property    Value
NAME        Name of the field
SIZE        Width of the field in number of characters
MAXLENGTH   The maximum number of characters allowed (including spaces)

Check Box/Radio Button (<INPUT TYPE="CHECKBOX"> / <INPUT TYPE="RADIO">)

Property    Value
NAME        Name of the check box/radio button
VALUE       Value to pass when checked
CHECKED     Does not have a value; the box will appear checked initially

Button/Submit Button/Reset Button

<INPUT TYPE="BUTTON"> / <INPUT TYPE="SUBMIT"> / <INPUT TYPE="RESET">

Property    Value
NAME        Name of the button (optional)
VALUE       Text on the button face

Hidden field

Property    Value
NAME        Name of the field
VALUE       Value to pass

Drop Down/List Box

<SELECT> tag

Property    Value
NAME        Name of the field
SIZE        Number of lines in the list box

<OPTION> tag

Property    Value
VALUE       Value to be passed when selected
SELECTED    Does not have a value; the item appears selected initially

Text Area

Property    Value
NAME        Name of the field
ROWS        Height of the field in number of lines
COLS        Width of the field in number of characters
WRAP        Type of text wrapping (off, virtual or physical)

          Frame

Until now, each web page, when opened, has taken over the entire browser screen. The browser screen could not be split into separate (unique) sections showing different but related information.

The HTML tags that divide a browser screen into two or more unique regions are the <frameset> and </frameset> tags. Each unique region is called a frame. Each frame can be loaded with a different document, which allows multiple HTML documents to be seen concurrently.

The HTML frame is a powerful feature that enables a web page to be broken into different unique sections that, although related, operate independently of each other.

The <frameset> tag

The splitting of a browser screen into frames is accomplished with the <frameset> and </frameset> tags embedded in the HTML document. The <frameset> tag requires one of the following two attributes, depending on whether the screen has to be divided into rows or columns.

ROWS   This attribute is used to divide the screen into multiple rows. It can be set equal to a list of values, depending on the required size of each row. The values can be a number of pixels, a percentage of the screen resolution, or the symbol *, which indicates the remaining space on the screen.
COLS   This attribute is used to divide the screen into multiple columns. It can be set equal to a list of values, depending on the required size of each column. The values can be a number of pixels, a percentage of the screen resolution, or the symbol *, which indicates the remaining space on the screen.

          Example

The <frame> tag

Once the browser screen is divided into rows (horizontal sections) and columns (vertical sections), each unique section defined can be loaded with a different HTML document. This is achieved by using the <frame> tag, which takes the following attributes:

Property       Value
SRC            Indicates the URL of the document to be loaded into the frame.
MARGINHEIGHT   Specifies the amount of white space to be left at the top and bottom of the frame.
MARGINWIDTH    Specifies the amount of white space to be left along the sides of the frame.
NAME           Gives the frame a unique name so it can be targeted by other documents. The name given must begin with an alphanumeric character.
NORESIZE       Disables the frame's resizing capability.
SCROLLING      Controls the appearance of horizontal and vertical scrollbars in a frame. This takes the values YES/NO/AUTO.

Example
<FRAMESET ROWS="30%,*">

          About Java

          Introduction

          Like any human language, Java provides a way to express concepts. If successful, this medium of expression will be significantly easier and more flexible than the alternatives as problems grow larger and more complex.

          You can’t look at Java as just a collection of features—some of the features make no sense in isolation. You can use the sum of the parts only if you are thinking about design, not simply coding. And to understand Java in this way, you must understand the problems with it and with programming in general. This book discusses programming problems, why they are problems, and the approach Java has taken to solve them. Thus, the set of features I explain in each chapter are based on the way I see a particular type of problem being solved with the language. In this way I hope to move you, a little at a time, to the point where the Java mindset becomes your native tongue.

          Throughout, I’ll be taking the attitude that you want to build a model in your head that allows you to develop a deep understanding of the language; if you encounter a puzzle you’ll be able to feed it to your model and deduce the answer.

          Prerequisites

          This book assumes that you have some programming familiarity: you understand that a program is a collection of statements, the idea of a subroutine/function/macro, control statements such as “if” and looping constructs such as “while,” etc. However, you might have learned this in many places, such as programming with a macro language or working with a tool like Perl. As long as you’ve programmed to the point where you feel comfortable with the basic ideas of programming, you’ll be able to work through this book. Of course, the book will be easier for the C programmers and more so for the C++ programmers, but don’t count yourself out if you’re not experienced with those languages (but come willing to work hard; also, the multimedia CD that accompanies this book will bring you up to speed on the basic C syntax necessary to learn Java). I’ll be introducing the concepts of object-oriented programming (OOP) and Java’s basic control mechanisms, so you’ll be exposed to those, and the first exercises will involve the basic control-flow statements.

          Although references will often be made to C and C++ language features, these are not intended to be insider comments, but instead to help all programmers put Java in perspective with those languages, from which, after all, Java is descended. I will attempt to make these references simple and to explain anything that I think a non- C/C++ programmer would not be familiar with.

          Although it is based on C++, Java is more of a “pure” object-oriented language.

          Both C++ and Java are hybrid languages, but in Java the designers felt that the hybridization was not as important as it was in C++. A hybrid language allows multiple programming styles; the reason C++ is hybrid is to support backward compatibility with the C language. Because C++ is a superset of the C language, it includes many of that language’s undesirable features, which can make some aspects of C++ overly complicated.

          The Java language assumes that you want to do only object-oriented programming. This means that before you can begin you must shift your mindset into an object-oriented world (unless it’s already there). The benefit of this initial effort is the ability to program in a language that is simpler to learn and to use than many other OOP languages. In this chapter we’ll see the basic components of a Java program and we’ll learn that everything in Java is an object, even a Java program.

          You manipulate objects with references

          Each programming language has its own means of manipulating data. Sometimes the programmer must be constantly aware of what type of manipulation is going on. Are you manipulating the object directly, or are you dealing with some kind of indirect representation (a pointer in C or C++) that must be treated with a special syntax?

          All this is simplified in Java. You treat everything as an object, so there is a single consistent syntax that you use everywhere. Although you treat everything as an object, the identifier you manipulate is actually a “reference” to an object. You might imagine this scene as a television (the object) with your remote control (the reference). As long as you’re holding this reference, you have a connection to the television, but when someone says “change the channel” or “lower the volume,” what you’re manipulating is the reference, which in turn modifies the object. If you want to move around the room and still control the television, you take the remote/reference with you, not the television.
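
The remote-control picture can be sketched in a few lines. This is only an illustration: the class name ReferenceDemo, the method appendThroughAlias and the variable names are made up for this example, and StringBuilder stands in for the television because, unlike String, it can be modified in place.

```java
// Two references (remote controls) attached to one object (the television).
class ReferenceDemo {
    // Hypothetical helper: changes the object through one reference and
    // reads the result back through the other.
    static String appendThroughAlias() {
        StringBuilder tv = new StringBuilder("channel 1");
        StringBuilder remote = tv;     // copies the reference, not the object
        remote.append(", volume 5");   // modifies the single shared object
        return tv.toString();          // the change is visible through tv
    }

    public static void main(String[] args) {
        System.out.println(appendThroughAlias());
    }
}
```

Assigning tv to remote does not create a second television; both names lead to the same object, so a change made through one is seen through the other.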

          Also, the remote control can stand on its own, with no television. That is, just because you have a reference doesn’t mean there’s necessarily an object connected to it. So if you want to hold a word or sentence, you create a String reference:

          String s;

          But here you’ve created only the reference, not an object. If you decided to send a message to s at this point, you’ll get an error (at run-time) because s isn’t actually attached to anything (there’s no television). A safer practice, then, is always to initialize a reference when you create it:

          String s = "asdf";

          However, this uses a special Java feature: strings can be initialized with quoted text. Normally, you must use a more general type of initialization for objects.

You must create all the objects

          When you create a reference, you want to connect it with a new object. You do so, in general, with the new keyword. new says, “Make me a new one of these objects.” So in the above example, you can say:

          String s = new String("asdf");

          Not only does this mean “Make me a new String,” but it also gives information about how to make the String by supplying an initial character string.
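
As a rough sketch of what new does here, the following compares a String initialized with quoted text against one built with new. The class and method names (NewStringDemo, sameObject, sameContents) are hypothetical and exist only for this illustration.

```java
// new always produces a distinct object, even when the contents match.
class NewStringDemo {
    // Compares the two references themselves.
    static boolean sameObject() {
        String a = "asdf";
        String b = new String("asdf");
        return a == b;        // false: b refers to a freshly created object
    }

    // Compares the character contents of the two objects.
    static boolean sameContents() {
        String a = "asdf";
        String b = new String("asdf");
        return a.equals(b);   // true: both hold the same characters
    }

    public static void main(String[] args) {
        System.out.println("same object:   " + sameObject());
        System.out.println("same contents: " + sameContents());
    }
}
```

The == operator compares references, while equals() compares what the objects hold, which is why the two methods disagree.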

          Of course, String is not the only type that exists. Java comes with a plethora of ready-made types. What’s more important is that you can create your own types. In fact, that’s the fundamental activity in Java programming, and it’s what you’ll be learning about in the rest of this book.

          Where storage lives

          It’s useful to visualize some aspects of how things are laid out while the program is running, in particular how memory is arranged. There are six different places to store data:

          1. Registers. This is the fastest storage because it exists in a place different from that of other storage: inside the processor. However, the number of registers is severely limited, so registers are allocated by the compiler according to its needs. You don’t have direct control, nor do you see any evidence in your programs that registers even exist.
          2. The stack. This lives in the general RAM (random-access memory) area, but has direct support from the processor via its stack pointer. The stack pointer is moved down to create new memory and moved up to release that memory. This is an extremely fast and efficient way to allocate storage, second only to registers. The Java compiler must know, while it is creating the program, the exact size and lifetime of all the data that is stored on the stack, because it must generate the code to move the stack pointer up and down. This constraint places limits on the flexibility of your programs, so while some Java storage exists on the stack—in particular, object references—Java objects themselves are not placed on the stack.
          3. The heap. This is a general-purpose pool of memory (also in the RAM area) where all Java objects live. The nice thing about the heap is that, unlike the stack, the compiler doesn’t need to know how much storage it needs to allocate from the heap or how long that storage must stay on the heap. Thus, there’s a great deal of flexibility in using storage on the heap. Whenever you need to create an object, you simply write the code to create it using new, and the storage is allocated on the heap when that code is executed. Of course there’s a price you pay for this flexibility: it takes more time to allocate heap storage than it does to allocate stack storage (that is, if you even could create objects on the stack in Java, as you can in C++).
          4. Static storage. “Static” is used here in the sense of “in a fixed location” (although it’s also in RAM). Static storage contains data that is available for the entire time a program is running. You can use the static keyword to specify that a particular element of an object is static, but Java objects themselves are never placed in static storage.
          5. Constant storage. Constant values are often placed directly in the program code, which is safe since they can never change. Sometimes constants are cordoned off by themselves so that they can be optionally placed in read-only memory (ROM).
          6. Non-RAM storage. If data lives completely outside a program it can exist while the program is not running, outside the control of the program. The two primary examples of this are streamed objects, in which objects are turned into streams of bytes, generally to be sent to another machine, and persistent objects, in which the objects are placed on disk so they will hold their state even when the program is terminated. The trick with these types of storage is turning the objects into something that can exist on the other medium, and yet can be resurrected into a regular RAM-based object when necessary. Java provides support for lightweight persistence, and future versions of Java might provide more complete solutions for persistence.

          Special case: primitive types

          There is a group of types that gets special treatment; you can think of these as “primitive” types that you use quite often in your programming. The reason for the special treatment is that to create an object with new—especially a small, simple variable—isn’t very efficient because new places objects on the heap. For these types Java falls back on the approach taken by C and C++. That is, instead of creating the variable using new, an “automatic” variable is created that is not a reference. The variable holds the value, and it’s placed on the stack so it’s much more efficient.

          Java determines the size of each primitive type. These sizes don’t change from one machine architecture to another as they do in most languages. This size invariance is one reason Java programs are so portable.

Primitive type   Size     Minimum     Maximum            Wrapper type
boolean          1-bit    -           -                  Boolean
char             16-bit   Unicode 0   Unicode 2^16 - 1   Character
byte             8-bit    -128        +127               Byte
short            16-bit   -2^15       +2^15 - 1          Short
int              32-bit   -2^31       +2^31 - 1          Integer
long             64-bit   -2^63       +2^63 - 1          Long
float            32-bit   IEEE754     IEEE754            Float
double           64-bit   IEEE754     IEEE754            Double
void             -        -           -                  Void

          All numeric types are signed, so don’t go looking for unsigned types.
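
One way to check these sizes and ranges is to read the constants that the standard wrapper classes expose (SIZE, MIN_VALUE, MAX_VALUE). The sketch below assumes a Java version that has the SIZE constants (Java 5 or later); the class PrimitiveSizes and its two helper methods are invented for this example.

```java
// Prints the fixed size and range of several primitive types using
// constants from the standard wrapper classes.
class PrimitiveSizes {
    // Largest value of a signed integer type with the given number of
    // bits (valid here for widths below 64): 2^(bits-1) - 1.
    static long maxForBits(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    // Smallest value for the same width: -2^(bits-1).
    static long minForBits(int bits) {
        return -(1L << (bits - 1));
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.SIZE + " bits, "
                + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.SIZE + " bits, "
                + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.SIZE + " bits, "
                + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.SIZE + " bits, "
                + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println("char:  " + Character.SIZE + " bits, 0 to "
                + (int) Character.MAX_VALUE);
    }
}
```

The helper methods reproduce the -2^(n-1) and 2^(n-1) - 1 pattern of the integer rows in the table, so they can be compared directly against the wrapper constants.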

          The primitive data types also have “wrapper” classes for them. That means that if you want to make a nonprimitive object on the heap to represent that primitive type, you use the associated wrapper. For example:

          char c = 'x';
          Character C = new Character(c);

          Or you could also use:

          Character C = new Character('x');

          The reasons for doing this will be shown in a later chapter.

          High-precision numbers

          Java includes two classes for performing high-precision arithmetic: BigInteger and BigDecimal. Although these approximately fit into the same category as the “wrapper” classes, neither one has a primitive analogue.

          Both classes have methods that provide analogues for the operations that you perform on primitive types. That is, you can do anything with a BigInteger or BigDecimal that you can with an int or float, it’s just that you must use method calls instead of operators. Also, since there’s more involved, the operations will be slower. You’re exchanging speed for accuracy.

          BigInteger supports arbitrary-precision integers. This means that you can accurately represent integral values of any size without losing any information during operations.

          BigDecimal is for arbitrary-precision fixed-point numbers; you can use these for accurate monetary calculations, for example.

          Consult your online documentation for details about the constructors and methods you can call for these two classes.
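
A minimal sketch of both classes in use follows. The class HighPrecisionDemo and its methods factorial and addPrices are hypothetical names chosen for this illustration; the BigInteger and BigDecimal calls themselves (valueOf, multiply, add, the String constructor) are part of java.math.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Method calls take the place of operators for high-precision values.
class HighPrecisionDemo {
    // n! computed exactly; the result may be far larger than a long.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Adds two decimal amounts given as text, with no rounding error.
    static BigDecimal addPrices(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println("25! = " + factorial(25));
        System.out.println("0.10 + 0.20 = " + addPrices("0.10", "0.20"));
        // The same sum with double picks up a small binary rounding error.
        System.out.println("double sum  = " + (0.1 + 0.2));
    }
}
```

Building the BigDecimal values from strings rather than from double literals keeps the decimal digits exact, which is the point of using the class for monetary amounts.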

          Arrays in Java

          Virtually all programming languages support arrays. Using arrays in C and C++ is perilous because those arrays are only blocks of memory. If a program accesses the array outside of its memory block or uses the memory before initialization (common programming errors) there will be unpredictable results.

          One of the primary goals of Java is safety, so many of the problems that plague programmers in C and C++ are not repeated in Java. A Java array is guaranteed to be initialized and cannot be accessed outside of its range. The range checking comes at the price of having a small amount of memory overhead on each array as well as verifying the index at run-time, but the assumption is that the safety and increased productivity is worth the expense.

          When you create an array of objects, you are really creating an array of references, and each of those references is automatically initialized to a special value with its own keyword: null. When Java sees null, it recognizes that the reference in question isn’t pointing to an object. You must assign an object to each reference before you use it, and if you try to use a reference that’s still null, the problem will be reported at run-time. Thus, typical array errors are prevented in Java.

          You can also create an array of primitives. Again, the compiler guarantees initialization because it zeroes the memory for that array.

          Arrays will be covered in detail in later chapters.
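
The guarantees described above can be observed directly. In this sketch the class ArrayDemo and its three methods are names made up for the example; it shows the null default for references, the zero default for primitives, and the run-time range check.

```java
// Demonstrates array initialization and range checking.
class ArrayDemo {
    // An array of objects starts out as an array of null references.
    static boolean referencesStartNull() {
        String[] names = new String[3];
        return names[0] == null;
    }

    // An array of primitives starts out zeroed.
    static int primitivesStartZero() {
        int[] counts = new int[3];
        return counts[0];
    }

    // Indexing past the end is reported at run-time with an exception.
    static boolean outOfRangeIsCaught() {
        int[] counts = new int[3];
        try {
            counts[3] = 1;   // valid indexes are 0, 1 and 2
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("references start null: " + referencesStartNull());
        System.out.println("primitives start at:   " + primitivesStartZero());
        System.out.println("out of range caught:   " + outOfRangeIsCaught());
    }
}
```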

          You never need to destroy an object

          In most programming languages, the concept of the lifetime of a variable occupies a significant portion of the programming effort. How long does the variable last? If you are supposed to destroy it, when should you? Confusion over variable lifetimes can lead to a lot of bugs, and this section shows how Java greatly simplifies the issue by doing all the cleanup work for you.

          Scoping

          Most procedural languages have the concept of scope. This determines both the visibility and lifetime of the names defined within that scope. In C, C++, and Java, scope is determined by the placement of curly braces {}. So for example:

{
  int x = 12;
  /* only x available */
  {
    int q = 96;
    /* both x & q available */
  }
  /* only x available */
  /* q “out of scope” */
}

          A variable defined within a scope is available only to the end of that scope.

          Indentation makes Java code easier to read. Since Java is a free-form language, the extra spaces, tabs, and carriage returns do not affect the resulting program.

          Note that you cannot do the following, even though it is legal in C and C++:

{
  int x = 12;
  {
    int x = 96; /* illegal */
  }
}

          The compiler will announce that the variable x has already been defined. Thus the C and C++ ability to “hide” a variable in a larger scope is not allowed because the Java designers thought that it led to confusing programs.

          Scope of objects

          Java objects do not have the same lifetimes as primitives. When you create a Java object using new, it hangs around past the end of the scope. Thus if you use:

{
  String s = new String("a string");
} /* end of scope */

          the reference s vanishes at the end of the scope. However, the String object that s was pointing to is still occupying memory. In this bit of code, there is no way to access the object because the only reference to it is out of scope. In later chapters you’ll see how the reference to the object can be passed around and duplicated during the course of a program.

          It turns out that because objects created with new stay around for as long as you want them, a whole slew of C++ programming problems simply vanish in Java. The hardest problems seem to occur in C++ because you don’t get any help from the language in making sure that the objects are available when they’re needed. And more important, in C++ you must make sure that you destroy the objects when you’re done with them.

          That brings up an interesting question. If Java leaves the objects lying around, what keeps them from filling up memory and halting your program? This is exactly the kind of problem that would occur in C++. This is where a bit of magic happens. Java has a garbage collector, which looks at all the objects that were created with new and figures out which ones are not being referenced anymore. Then it releases the memory for those objects, so the memory can be used for new objects. This means that you never need to worry about reclaiming memory yourself. You simply create objects, and when you no longer need them they will go away by themselves. This eliminates a certain class of programming problem: the so-called “memory leak,” in which a programmer forgets to release memory.

          Creating new data types: class

          If everything is an object, what determines how a particular class of object looks and behaves? Put another way, what establishes the type of an object? You might expect there to be a keyword called “type,” and that certainly would have made sense. Historically, however, most object-oriented languages have used the keyword class to mean “I’m about to tell you what a new type of object looks like.” The class keyword (which is so common that it will not be emboldened throughout this book) is followed by the name of the new type. For example:

          class ATypeName { /* class body goes here */ }

          This introduces a new type, so you can now create an object of this type using new:

          ATypeName a = new ATypeName();

          In ATypeName, the class body consists only of a comment (the stars and slashes and what is inside, which will be discussed later in this chapter), so there is not too much that you can do with it. In fact, you cannot tell it to do much of anything (that is, you cannot send it any interesting messages) until you define some methods for it.

          Fields and methods

          When you define a class (and all you do in Java is define classes, make objects of those classes, and send messages to those objects), you can put two types of elements in your class: data members (sometimes called fields), and member functions (typically called methods). A data member is an object of any type that you can communicate with via its reference. It can also be one of the primitive types (which isn’t a reference). If it is a reference to an object, you must initialize that reference to connect it to an actual object (using new, as seen earlier) in a special function called a constructor (described fully in Chapter 4). If it is a primitive type you can initialize it directly at the point of definition in the class. (As you’ll see later, references can also be initialized at the point of definition.)

          Each object keeps its own storage for its data members; the data members are not shared among objects. Here is an example of a class with some data members:

          class DataOnly {
          int i;
          float f;
          boolean b;
          }

          This class doesn’t do anything, but you can create an object:

          DataOnly d = new DataOnly();

          You can assign values to the data members, but you must first know how to refer to a member of an object. This is accomplished by stating the name of the object reference, followed by a period (dot), followed by the name of the member inside the object:

          objectReference.member

          For example:

          d.i = 47;
          d.f = 1.1f;
          d.b = false;

          It is also possible that your object might contain other objects that contain data you’d like to modify. For this, you just keep “connecting the dots.” For example:

          myPlane.leftTank.capacity = 100;

          The DataOnly class cannot do much of anything except hold data, because it has no member functions (methods). To understand how those work, you must first understand arguments and return values, which will be described shortly.
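To make "connecting the dots" concrete, here is a small sketch. The Plane and Tank classes are assumptions invented for this example (the text only shows the expression myPlane.leftTank.capacity), and the Tank references are initialized at the point of definition.

```java
class Tank {
    int capacity; // primitive field, one copy per Tank object
}

class Plane {
    // References initialized at the point of definition
    Tank leftTank = new Tank();
    Tank rightTank = new Tank();
}

class DotDemo {
    public static void main(String[] args) {
        Plane myPlane = new Plane();
        myPlane.leftTank.capacity = 100; // follow the dots: plane, then tank, then field
        System.out.println(myPlane.leftTank.capacity);  // 100
        System.out.println(myPlane.rightTank.capacity); // 0, each Tank has its own storage
    }
}
```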

          Default values for primitive members

          When a primitive data type is a member of a class, it is guaranteed to get a default value if you do not initialize it:

          Primitive type    Default
          boolean           false
          char              ‘\u0000’ (null)
          byte              (byte)0
          short             (short)0
          int               0
          long              0L
          float             0.0f
          double            0.0d

          Note carefully that the default values are what Java guarantees when the variable is used as a member of a class. This ensures that member variables of primitive types will always be initialized (something C++ doesn’t do), reducing a source of bugs. However, this initial value may not be correct or even legal for the program you are writing. It’s best to always explicitly initialize your variables.
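You can check these defaults yourself with a short sketch like the one below. The class names Defaults and DefaultsDemo are hypothetical; none of the fields is ever assigned.

```java
class Defaults {
    boolean b;
    char c;
    int i;
    long l;
    float f;
    double d;
}

class DefaultsDemo {
    public static void main(String[] args) {
        Defaults x = new Defaults();
        System.out.println(x.b);       // false
        System.out.println((int) x.c); // 0, the numeric value of '\u0000'
        System.out.println(x.i);       // 0
        System.out.println(x.l);       // 0
        System.out.println(x.f);       // 0.0
        System.out.println(x.d);       // 0.0
    }
}
```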

          This guarantee doesn’t apply to “local” variables—those that are not fields of a class. Thus, if within a function definition you have:

          int x;

          Then x is not automatically initialized to zero, and unlike C and C++ you never get to read whatever arbitrary value happens to be in memory. You are responsible for assigning an appropriate value before you use x. If you forget, Java definitely improves on C++: you get a compile-time error telling you the variable might not have been initialized. (Many C++ compilers will warn you about uninitialized variables, but in Java these are errors.)
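A short sketch of the rule for local variables follows. LocalInit and twice are hypothetical names; the commented-out line is the one the compiler would reject.

```java
class LocalInit {
    static int twice(int n) {
        int x;            // declared but not yet assigned
        // return x * 2;  // would not compile: variable x might not have been initialized
        x = n;            // assign before the first read
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(twice(21)); // 42
    }
}
```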

          Methods, arguments, and return values

          Up until now, the term function has been used to describe a named subroutine. The term that is more commonly used in Java is method, as in “a way to do something.” If you want, you can continue thinking in terms of functions. It’s really only a syntactic difference, but from now on “method” will be used in this book rather than “function.”

          Methods in Java determine the messages an object can receive. In this section you will learn how simple it is to define a method.

          The fundamental parts of a method are the name, the arguments, the return type, and the body. Here is the basic form:

          returnType methodName( /* argument list */ ) {
          /* Method body */
          }

          The return type is the type of the value that pops out of the method after you call it. The argument list gives the types and names for the information you want to pass into the method. The method name and argument list together uniquely identify the method.

          Methods in Java can be created only as part of a class. A method can be called only for an object,[22] and that object must be able to perform that method call. If you try to call the wrong method for an object, you’ll get an error message at compile-time. You call a method for an object by naming the object followed by a period (dot), followed by the name of the method and its argument list, like this: objectName.methodName(arg1, arg2, arg3). For example, suppose you have a method f( ) that takes no arguments and returns a value of type int. Then, if you have an object called a for which f( ) can be called, you can say this:

          int x = a.f();

          The type of the return value must be compatible with the type of x.

          This act of calling a method is commonly referred to as sending a message to an object. In the above example, the message is f( ) and the object is a. Object-oriented programming is often summarized as simply “sending messages to objects.”
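Here is one way the a.f( ) call might look in a complete class. Counter, its count field, and CallDemo are assumptions made up for this sketch; only the shape of the call comes from the text.

```java
class Counter {
    int count;

    // Takes no arguments and returns an int, like f( ) in the text
    int f() {
        return count + 1;
    }
}

class CallDemo {
    public static void main(String[] args) {
        Counter a = new Counter();
        a.count = 46;
        int x = a.f();         // send the message f( ) to the object a
        System.out.println(x); // 47
        // a.g();              // would not compile: Counter has no method g( )
    }
}
```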

          The argument list

          The method argument list specifies what information you pass into the method. As you might guess, this information—like everything else in Java—takes the form of objects. So, what you must specify in the argument list are the types of the objects to pass in and the name to use for each one. As in any situation in Java where you seem to be handing objects around, you are actually passing references[23]. The type of the reference must be correct, however. If the argument is supposed to be a String, what you pass in must be a string.

          Consider a method that takes a String as its argument. Here is the definition, which must be placed within a class definition for it to be compiled:

          int storage(String s) {
          return s.length() * 2;
          }

          This method tells you how many bytes are required to hold the information in a particular String. (Each char in a String is 16 bits, or two bytes, long, to support Unicode characters.) The argument is of type String and is called s. Once s is passed into the method, you can treat it just like any other object. (You can send messages to it.) Here, the length( ) method is called, which is one of the methods for Strings; it returns the number of characters in a string.

          You can also see the use of the return keyword, which does two things. First, it means “leave the method, I’m done.” Second, if the method produces a value, that value is placed right after the return statement. In this case, the return value is produced by evaluating the expression s.length( ) * 2.
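As a quick check, the storage( ) method can be placed in a class and called through an object. The wrapper class StorageDemo is a hypothetical name chosen for this sketch.

```java
class StorageDemo {
    // Same method as in the text: two bytes per char
    int storage(String s) {
        return s.length() * 2;
    }

    public static void main(String[] args) {
        StorageDemo demo = new StorageDemo();
        System.out.println(demo.storage("hello")); // 10
        System.out.println(demo.storage(""));      // 0
    }
}
```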

          You can return any type you want, but if you don’t want to return anything at all, you do so by indicating that the method returns void. Here are some examples:

          boolean flag() { return true; }
          float naturalLogBase() { return 2.718f; }
          void nothing() { return; }
          void nothing2() {}

          When the return type is void, then the return keyword is used only to exit the method, and is therefore unnecessary when you reach the end of the method. You can return from a method at any point, but if you’ve given a non-void return type then the compiler will force you (with error messages) to return the appropriate type of value regardless of where you return.
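The sketch below illustrates both uses of return. The class Returns and its methods are hypothetical, and the if statement is borrowed ahead of its proper introduction purely so the methods have more than one exit point.

```java
class Returns {
    // Non-void: every path through the method must return an int
    int sign(int n) {
        if (n < 0) return -1;
        if (n > 0) return 1;
        return 0; // removing this line would be a compile-time error
    }

    // void: return carries no value and only leaves the method early
    void printIfPositive(int n) {
        if (n <= 0) return;
        System.out.println(n);
    }

    public static void main(String[] args) {
        Returns r = new Returns();
        System.out.println(r.sign(-5)); // -1
        r.printIfPositive(-5);          // prints nothing
        r.printIfPositive(7);           // 7
    }
}
```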

          At this point, it can look like a program is just a bunch of objects with methods that take other objects as arguments and send messages to those other objects. That is indeed much of what goes on, but in the following chapter you’ll learn how to do the detailed low-level work by making decisions within a method. For this chapter, sending messages will suffice.

          Building a Java program

          There are several other issues you must understand before seeing your first Java program.

          Name visibility

          A problem in any programming language is the control of names. If you use a name in one module of the program, and another programmer uses the same name in another module, how do you distinguish one name from another and prevent the two names from “clashing?” In C this is a particular problem because a program is often an unmanageable sea of names. C++ classes (on which Java classes are based) nest functions within classes so they cannot clash with function names nested within other classes. However, C++ still allowed global data and global functions, so clashing was still possible. To solve this problem, C++ introduced namespaces using additional keywords.

          Java was able to avoid all of this by taking a fresh approach. To produce an unambiguous name for a library, the specifier used is not unlike an Internet domain name. In fact, the Java creators want you to use your Internet domain name in reverse since those are guaranteed to be unique. Since my domain name is BruceEckel.com, my utility library of foibles would be named com.bruceeckel.utility.foibles. After your reversed domain name, the dots are intended to represent subdirectories.

          In Java 1.0 and Java 1.1 the domain extensions com, edu, org, net, etc., were capitalized by convention, so the library would appear: COM.bruceeckel.utility.foibles. Partway through the development of Java 2, however, it was discovered that this caused problems, and so now the entire package name is lowercase.

          This mechanism means that all of your files automatically live in their own namespaces, and each class within a file must have a unique identifier. So you do not need to learn special language features to solve this problem—the language takes care of it for you.

          Using other components

          Whenever you want to use a predefined class in your program, the compiler must know how to locate it. Of course, the class might already exist in the same source code file that it’s being called from. In that case, you simply use the class—even if the class doesn’t get defined until later in the file. Java eliminates the “forward referencing” problem so you don’t need to think about it.

          What about a class that exists in some other file? You might think that the compiler should be smart enough to simply go and find it, but there is a problem. Imagine that you want to use a class of a particular name, but more than one definition for that class exists (presumably these are different definitions). Or worse, imagine that you’re writing a program, and as you’re building it you add a new class to your library that conflicts with the name of an existing class.

          To solve this problem, you must eliminate all potential ambiguities. This is accomplished by telling the Java compiler exactly what classes you want using the import keyword. import tells the compiler to bring in a package, which is a library of classes. (In other languages, a library could consist of functions and data as well as classes, but remember that all code in Java must be written inside a class.)

          Most of the time you’ll be using components from the standard Java libraries that come with your compiler. With these, you don’t need to worry about long, reversed domain names; you just say, for example:

          import java.util.ArrayList;

          to tell the compiler that you want to use Java’s ArrayList class. However, util contains a number of classes and you might want to use several of them without declaring them all explicitly. This is easily accomplished by using ‘*’ to indicate a wild card:

          import java.util.*;

          It is more common to import a collection of classes in this manner than to import classes individually.
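The following sketch suggests that import is only a naming convenience: a class can also be used through its fully qualified name with no import at all. ImportDemo and its two methods are hypothetical, and the angle-bracket type arguments assume a newer Java version than the one this text was written for.

```java
import java.util.ArrayList;

class ImportDemo {
    // ArrayList can be named directly because of the import above
    static ArrayList<String> viaImport() {
        ArrayList<String> list = new ArrayList<>();
        list.add("one");
        list.add("two");
        return list;
    }

    // LinkedList was not imported, so its full name is spelled out
    static java.util.LinkedList<String> viaFullName() {
        java.util.LinkedList<String> list = new java.util.LinkedList<>();
        list.add("three");
        return list;
    }

    public static void main(String[] args) {
        System.out.println(viaImport());   // [one, two]
        System.out.println(viaFullName()); // [three]
    }
}
```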

          The static keyword

          Ordinarily, when you create a class you are describing how objects of that class look and how they will behave. You don’t actually get anything until you create an object of that class with new, and at that point data storage is created and methods become available.

          But there are two situations in which this approach is not sufficient. One is if you want to have only one piece of storage for a particular piece of data, regardless of how many objects are created, or even if no objects are created. The other is if you need a method that isn’t associated with any particular object of this class. That is, you need a method that you can call even if no objects are created. You can achieve both of these effects with the static keyword. When you say something is static, it means that data or method is not tied to any particular object instance of that class. So even if you’ve never created an object of that class you can call a static method or access a piece of static data. With ordinary, non-static data and methods you must create an object and use that object to access the data or method, since non-static data and methods must know the particular object they are working with. Of course, since static methods don’t need any objects to be created before they are used, they cannot directly access non-static members or methods by simply calling those other members without referring to a named object (since non-static members and methods must be tied to a particular object).

          Some object-oriented languages use the terms class data and class methods, meaning that the data and methods exist only for the class as a whole, and not for any particular objects of the class. Sometimes the Java literature uses these terms too.

          To make a data member or method static, you simply place the keyword before the definition. For example, the following produces a static data member and initializes it:

          class StaticTest {
            static int i = 47;
          }

          Now even if you make two StaticTest objects, there will still be only one piece of storage for StaticTest.i. Both objects will share the same i. Consider:

          StaticTest st1 = new StaticTest();
          StaticTest st2 = new StaticTest();

          At this point, both st1.i and st2.i have the same value of 47 since they refer to the same piece of memory.

          There are two ways to refer to a static variable. As indicated above, you can name it via an object, by saying, for example, st2.i. You can also refer to it directly through its class name, something you cannot do with a non-static member. (This is the preferred way to refer to a static variable since it emphasizes that variable’s static nature.)

          StaticTest.i++;

          The ++ operator increments the variable. At this point, both st1.i and st2.i will have the value 48.
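Put together as a runnable sketch, the shared storage looks like this. StaticFieldDemo is a hypothetical driver class; StaticTest is repeated so the example stands alone.

```java
class StaticTest {
    static int i = 47; // one piece of storage for the whole class
}

class StaticFieldDemo {
    public static void main(String[] args) {
        StaticTest st1 = new StaticTest();
        StaticTest st2 = new StaticTest();
        StaticTest.i++;             // preferred: refer to it through the class name
        System.out.println(st1.i);  // 48
        System.out.println(st2.i);  // 48, both objects see the same i
    }
}
```

Many compilers and IDEs warn about st1.i because naming a static member through an object hides its static nature.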

          Similar logic applies to static methods. You can refer to a static method either through an object as you can with any method, or with the special additional syntax ClassName.method( ). You define a static method in a similar way:

          class StaticFun {
            static void incr() { StaticTest.i++; }
          }

          You can see that the StaticFun method incr( ) increments the static data i. You can call incr( ) in the typical way, through an object:

          StaticFun sf = new StaticFun();
          sf.incr();

          Or, because incr( ) is a static method, you can call it directly through its class:

          StaticFun.incr();

          While static, when applied to a data member, definitely changes the way the data is created (one for each class vs. the non-static one for each object), when applied to a method it’s not so dramatic. An important use of static for methods is to allow you to call that method without creating an object. This is essential, as we will see, in defining the main( ) method that is the entry point for running an application.

          Like any method, a static method can create or use named objects of its type, so a static method is often used as a “shepherd” for a flock of instances of its own type.
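One possible reading of the "shepherd" idea is a static method that creates and keeps track of instances of its own class. The Sheep class and its members are invented for this sketch and do not come from the text.

```java
class Sheep {
    static int created; // one counter shared by the whole class
    int id;             // one id per object

    // Static method that creates objects of its own type and numbers them
    static Sheep create() {
        Sheep s = new Sheep();
        created++;
        s.id = created;
        return s;
    }
}

class ShepherdDemo {
    public static void main(String[] args) {
        Sheep a = Sheep.create(); // no Sheep object is needed to call create( )
        Sheep b = Sheep.create();
        System.out.println(a.id);          // 1
        System.out.println(b.id);          // 2
        System.out.println(Sheep.created); // 2
    }
}
```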

          Your first Java program

          Finally, here’s the program. It starts by printing a string, and then the date, using the Date class from the Java standard library. Note that an additional style of comment is introduced here: the ‘//’, which is a comment until the end of the line:

          // HelloDate.java
          import java.util.*;

          public class HelloDate {
            public static void main(String[] args) {
              System.out.println("Hello, it's: ");
              System.out.println(new Date());
            }
          }

          At the beginning of each program file, you must place the import statement to bring in any extra classes you’ll need for the code in that file. Note that I say “extra;” that’s because there’s a certain library of classes that are automatically brought into every Java file: java.lang. Start up your Web browser and look at the documentation from Sun. (If you haven’t downloaded it from java.sun.com or otherwise installed the Java documentation, do so now). If you look at the list of the packages, you’ll see all the different class libraries that come with Java. Select java.lang. This will bring up a list of all the classes that are part of that library. Since java.lang is implicitly included in every Java code file, these classes are automatically available. There’s no Date class listed in java.lang, which means you must import another library to use that. If you don’t know the library where a particular class is, or if you want to see all of the classes, you can select “Tree” in the Java documentation. Now you can find every single class that comes with Java. Then you can use the browser’s “find” function to find Date. When you do you’ll see it listed as java.util.Date, which lets you know that it’s in the util library and that you must import java.util.* in order to use Date.

          If you go back to the beginning, select java.lang and then System, you’ll see that the System class has several fields, and if you select out you’ll discover that it’s a static PrintStream object. Since it’s static you don’t need to create anything. The out object is always there and you can just use it. What you can do with this out object is determined by the type it is: a PrintStream. Conveniently, PrintStream is shown in the description as a hyperlink, so if you click on that you’ll see a list of all the methods you can call for PrintStream. There are quite a few and these will be covered later in this book. For now all we’re interested in is println( ), which in effect means “print what I’m giving you out to the console and end with a new line.” Thus, in any Java program you write you can say System.out.println(“things”) whenever you want to print something to the console.

          The name of the class is the same as the name of the file. When you’re creating a stand-alone program such as this one, one of the classes in the file must have the same name as the file. (The compiler complains if you don’t do this.) That class must contain a method called main( ) with the signature shown:

          public static void main(String[] args) {

          The public keyword means that the method is available to the outside world (described in detail in Chapter 5). The argument to main( ) is an array of String objects. The args won’t be used in this program, but they must be declared anyway: the Java launcher looks for a main( ) with exactly this signature, and args holds the arguments passed on the command line.
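If you are curious what args contains, a sketch like the following prints it. ShowArgs and describe are hypothetical names, and String.join assumes a newer Java version than the one this text was written for.

```java
class ShowArgs {
    // Builds a one-line summary of the command-line arguments
    static String describe(String[] args) {
        return args.length + " argument(s): " + String.join(", ", args);
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

After compiling, running java ShowArgs a b should print 2 argument(s): a, b.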

          The line that prints the date is quite interesting:

          System.out.println(new Date());

          Consider the argument: a Date object is being created just to send its value to println( ). As soon as this statement is finished, that Date is unnecessary, and the garbage collector can come along and get it anytime. We don’t need to worry about cleaning it up.

          Compiling and running

          To compile and run this program, and all the other programs in this book, you must first have a Java programming environment. There are a number of third-party development environments, but in this book we will assume that you are using the JDK from Sun, which is free. If you are using another development system, you will need to look in the documentation for that system to determine how to compile and run programs.

          Get on the Internet and go to java.sun.com. There you will find information and links that will lead you through the process of downloading and installing the JDK for your particular platform.

          Once the JDK is installed, and you’ve set up your computer’s path information so that it will find javac and java, download and unpack the source code for this book (you can find it on the CD ROM that’s bound in with this book, or at www.BruceEckel.com). This will create a subdirectory for each chapter in this book. Move to subdirectory c02 and type:

          javac HelloDate.java

          This command should produce no response. If you get any kind of an error message it means you haven’t installed the JDK properly and you need to investigate those problems.

          On the other hand, if you just get your command prompt back, you can type:

          java HelloDate

          and you’ll get the message and the date as output.

          This is the process you can use to compile and run each of the programs in this book. However, you will see that the source code for this book also has a file called makefile in each chapter, and this contains “make” commands for automatically building the files for that chapter. See this book’s Web page at www.BruceEckel.com for details on how to use the makefiles.

          Comments and embedded documentation

          There are two types of comments in Java. The first is the traditional C-style comment that was inherited by C++. These comments begin with a /* and continue, possibly across many lines, until a */. Note that many programmers will begin each line of a continued comment with a *, so you’ll often see:

          /* This is a comment
          * that continues
          * across lines
          */

          Remember, however, that everything inside the /* and */ is ignored, so there’s no difference in saying:

          /* This is a comment that
          continues across lines */

          The second form of comment comes from C++. It is the single-line comment, which starts at a // and continues until the end of the line. This type of comment is convenient and commonly used because it’s easy. You don’t need to hunt on the keyboard to find / and then * (instead, you just press the same key twice), and you don’t need to close the comment. So you will often see:

          // this is a one-line comment


          Comment documentation

          One of the thoughtful parts of the Java language is that the designers didn’t consider writing code to be the only important activity—they also thought about documenting it. Possibly the biggest problem with documenting code has been maintaining that documentation. If the documentation and the code are separate, it becomes a hassle to change the documentation every time you change the code. The solution seems simple: link the code to the documentation. The easiest way to do this is to put everything in the same file. To complete the picture, however, you need a special comment syntax to mark special documentation, and a tool to extract those comments and put them in a useful form. This is what Java has done.

          The tool to extract the comments is called javadoc. It uses some of the technology from the Java compiler to look for special comment tags you put in your programs. It not only extracts the information marked by these tags, but it also pulls out the class name or method name that adjoins the comment. This way you can get away with the minimal amount of work to generate decent program documentation.

          The output of javadoc is an HTML file that you can view with your Web browser. This tool allows you to create and maintain a single source file and automatically generate useful documentation. Because of javadoc we have a standard for creating documentation, and it’s easy enough that we can expect or even demand documentation with all Java libraries.

          Syntax

          All of the javadoc commands occur only within /** comments. The comments end with */ as usual. There are two primary ways to use javadoc: embed HTML, or use “doc tags.” Doc tags are commands that start with a ‘@’ and are placed at the beginning of a comment line. (A leading ‘*’, however, is ignored.)

          There are three “types” of comment documentation, which correspond to the element the comment precedes: class, variable, or method. That is, a class comment appears right before the definition of a class; a variable comment appears right in front of the definition of a variable, and a method comment appears right in front of the definition of a method. As a simple example:

          /** A class comment */
          public class docTest {
            /** A variable comment */
            public int i;
            /** A method comment */
            public void f() {}
          }

          Note that javadoc will process comment documentation for only public and protected members. Comments for private and “friendly” members (see Chapter 5) are ignored and you’ll see no output. (However, you can use the -private flag to include private members as well.) This makes sense, since only public and protected members are available outside the file, which is the client programmer’s perspective. However, all class comments are included in the output.

          The output for the above code is an HTML file that has the same standard format as all the rest of the Java documentation, so users will be comfortable with the format and can easily navigate your classes. It’s worth entering the above code, sending it through javadoc and viewing the resulting HTML file to see the results.

          Embedded HTML

          Javadoc passes HTML commands through to the generated HTML document. This allows you full use of HTML; however, the primary motive is to let you format code, such as:

          /**
           * <pre>
           * System.out.println(new Date());
           * </pre>
           */

          You can also use HTML just as you would in any other Web document to format the regular text in your descriptions:

          /**
           * You can even insert a list:
           * <ol>
           * <li> Item one
           * <li> Item two
           * <li> Item three
           * </ol>
           */

          Note that within the documentation comment, asterisks at the beginning of a line are thrown away by javadoc, along with leading spaces. Javadoc reformats everything so that it conforms to the standard documentation appearance. Don’t use headings such as <h1> or <hr> as embedded HTML because javadoc inserts its own headings and yours will interfere with them.

          All types of comment documentation—class, variable, and method—can support embedded HTML.

          @see: referring to other classes

          All three types of comment documentation (class, variable, and method) can contain @see tags, which allow you to refer to the documentation in other classes. Javadoc will generate HTML with the @see tags hyperlinked to the other documentation. The forms are:

          @see classname
          @see fully-qualified-classname
          @see fully-qualified-classname#method-name

          Each one adds a hyperlinked “See Also” entry to the generated documentation. Javadoc will not check the hyperlinks you give it to make sure they are valid.

          Class documentation tags

          Along with embedded HTML and @see references, class documentation can include tags for version information and the author’s name. Class documentation can also be used for interfaces (see Chapter 8).

          @version

          This is of the form:

          @version version-information

          in which version-information is any significant information you see fit to include. When the -version flag is placed on the javadoc command line, the version information will be called out specially in the generated HTML documentation.

          @author

          This is of the form:

          @author author-information

          in which author-information is, presumably, your name, but it could also include your email address or any other appropriate information. When the -author flag is placed on the javadoc command line, the author information will be called out specially in the generated HTML documentation.

          You can have multiple author tags for a list of authors, but they must be placed consecutively. All the author information will be lumped together into a single paragraph in the generated HTML.

          @since

          This tag allows you to indicate the version of this code that began using a particular feature. You’ll see it appearing in the HTML Java documentation to indicate what version of the JDK is used.

          Variable documentation tags

          Variable documentation can include only embedded HTML and @see references.

          Method documentation tags

          As well as embedded documentation and @see references, methods allow documentation tags for parameters, return values, and exceptions.

          @param

          This is of the form:

          @param parameter-name description

          in which parameter-name is the identifier in the parameter list, and description is text that can continue on subsequent lines. The description is considered finished when a new documentation tag is encountered. You can have any number of these, presumably one for each parameter.

          @return

          This is of the form:

          @return description

          in which description gives you the meaning of the return value. It can continue on subsequent lines.

          @throws

          Exceptions will be demonstrated in Chapter 10, but briefly they are objects that can be “thrown” out of a method if that method fails. Although only one exception object can emerge when you call a method, a particular method might produce any number of different types of exceptions, all of which need descriptions. So the form for the exception tag is:

          @throws fully-qualified-class-name description

          in which fully-qualified-class-name gives an unambiguous name of an exception class that’s defined somewhere, and description (which can continue on subsequent lines) tells you why this particular type of exception can emerge from the method call.

          @deprecated

          This is used to tag features that were superseded by an improved feature. The deprecated tag is a suggestion that you no longer use this particular feature, since sometime in the future it is likely to be removed. A method that is marked @deprecated causes the compiler to issue a warning if it is used.

          Documentation example

          Here is the first Java program again, this time with documentation comments added:

          //: c02:HelloDate.java
          import java.util.*;

          /** The first Thinking in Java example program.
           * Displays a string and today's date.
           * @author Bruce Eckel
           * @author http://www.BruceEckel.com
           * @version 2.0
           */
          public class HelloDate {
            /** Sole entry point to class & application
             * @param args array of string arguments
             * @return No return value
             * @exception exceptions No exceptions thrown
             */
            public static void main(String[] args) {
              System.out.println("Hello, it's: ");
              System.out.println(new Date());
            }
          } ///:~

          The first line of the file uses my own technique of putting a ‘:’ as a special marker for the comment line containing the source file name. That line contains the path information to the file (in this case, c02 indicates Chapter 2) followed by the file name. The last line also finishes with a comment, and this one indicates the end of the source code listing, which allows it to be automatically extracted from the text of this book and checked with a compiler.

          Coding style

          The unofficial standard in Java is to capitalize the first letter of a class name. If the class name consists of several words, they are run together (that is, you don’t use underscores to separate the names), and the first letter of each embedded word is capitalized, such as:

          class AllTheColorsOfTheRainbow { // ...

          For almost everything else: methods, fields (member variables), and object reference names, the accepted style is just as it is for classes except that the first letter of the identifier is lowercase. For example:

          class AllTheColorsOfTheRainbow {
            int anIntegerRepresentingColors;
            void changeTheHueOfTheColor(int newHue) {
              // ...
            }
            // ...
          }

          Of course, you should remember that the user must also type all these long names, and so be merciful.

          The Java code you will see in the Sun libraries also follows the placement of open-and-close curly braces that you see used in this book.

          Summary

          In this chapter you have seen enough of Java programming to understand how to write a simple program, and you have gotten an overview of the language and some of its basic ideas. However, the examples so far have all been of the form “do this, then do that, then do something else.” What if you want the program to make choices, such as “if the result of doing this is red, do that; if not, then do something else”? The support in Java for this fundamental programming activity will be covered in the next chapter.

          Exercises

          1. Following the HelloDate.java example in this chapter, create a “hello, world” program that simply prints out that statement. You need only a single method in your class (the “main” one that gets executed when the program starts). Remember to make it static and to include the argument list, even though you don’t use the argument list. Compile the program with javac and run it using java. If you are using a different development environment than the JDK, learn how to compile and run programs in that environment.
          2. Find the code fragments involving ATypeName and turn them into a program that compiles and runs.
          3. Turn the DataOnly code fragments into a program that compiles and runs.
          4. Modify Exercise 3 so that the values of the data in DataOnly are assigned to and printed in main( ).
          5. Write a program that includes and calls the storage( ) method defined as a code fragment in this chapter.
          6. Turn the StaticFun code fragments into a working program.
          7. Write a program that prints three arguments taken from the command line. To do this, you’ll need to index into the command-line array of Strings.
          8. Turn the AllTheColorsOfTheRainbow example into a program that compiles and runs.
          9. Find the code for the second version of HelloDate.java, which is the simple comment documentation example. Execute javadoc on the file and view the results with your Web browser.
          10. Turn docTest into a file that compiles and then run it through javadoc. Verify the resulting documentation with your Web browser.
          11. Add an HTML list of items to the documentation in Exercise 10.
          12. Take the program in Exercise 1 and add comment documentation to it. Extract this comment documentation into an HTML file using javadoc and view it with your Web browser.

            This can be a flashpoint. There are those who say “clearly, it’s a pointer,” but this presumes an underlying implementation. Also, Java references are much more akin to C++ references than pointers in their syntax. In the first edition of this book, I chose to invent a new term, “handle,” because C++ references and Java references have some important differences. I was coming out of C++ and did not want to confuse the C++ programmers whom I assumed would be the largest audience for Java. In the 2nd edition, I decided that “reference” was the more commonly used term, and that anyone changing from C++ would have a lot more to cope with than the terminology of references, so they might as well jump in with both feet. However, there are people who disagree even with the term “reference.” I read in one book where it was “completely wrong to say that Java supports pass by reference,” because Java object identifiers (according to that author) are actually “object references.” And (he goes on) everything is actually pass by value. So you’re not passing by reference, you’re “passing an object reference by value.” One could argue for the precision of such convoluted explanations, but I think my approach simplifies the understanding of the concept without hurting anything (well, the language lawyers may claim that I’m lying to you, but I’ll say that I’m providing an appropriate abstraction.)

            static methods, which you’ll learn about soon, can be called for the class, without an object.

            With the usual exception of the aforementioned “special” data types boolean, char, byte, short, int, long, float, and double. In general, though, you pass objects, which really means you pass references to objects.

            Some programming environments will flash programs up on the screen and close them before you've had a chance to see the results. You can put in the following bit of code at the end of main( ) to pause the output:

            try {
              System.in.read();
            } catch(Exception e) {}

            This will pause the output until you press “Enter” (or any other key). This code involves concepts that will not be introduced until much later in the book, so you won’t understand it until then, but it will do the trick.

            A tool that I created using Python (see www.Python.org) uses this information to extract the code files, put them in appropriate subdirectories, and create makefiles.


          History of C++

          Computer languages have undergone dramatic evolution since the first electronic computers were built to assist in telemetry calculations during World War II. Early on, programmers worked with the most primitive computer instructions: machine language. These instructions were represented by long strings of ones and zeros. Soon, assemblers were invented to map machine instructions to human-readable and -manageable mnemonics, such as ADD and MOV.
          In time, higher-level languages evolved, such as BASIC and COBOL. These languages let people work with something approximating words and sentences, such as Let I = 100. These instructions were translated back into machine language by interpreters and compilers. An interpreter translates a program as it reads it, turning the program instructions directly into actions. A compiler translates the code into an intermediary form called an object file; a linker is then invoked, which turns the object file into an executable program.

          Because interpreters read the code as it is written and execute the code on the spot, interpreters are easy for the programmer to work with. Compilers, however, introduce the extra steps of compiling and linking the code, which is inconvenient. In return, compilers produce a program that is very fast each time it is run, because the time-consuming task of translating the source code into machine language has already been accomplished.
          Another advantage of many compiled languages like C++ is that you can distribute the executable program to people who do not have the compiler. With an interpreted language, you must have the interpreter to run the program.

          For many years, the principal goal of computer programmers was to write short pieces of code that would execute quickly. The program needed to be small, because memory was expensive, and it needed to be fast, because processing power was also expensive. As computers have become smaller, cheaper, and faster, and as the cost of memory has fallen, these priorities have changed. Today the cost of a programmer's time far outweighs the cost of most of the computers in use by businesses. Well-written, easy-to-maintain code is at a premium. Easy-to-maintain means that as business requirements change, the program can be extended and enhanced without great expense.

          Programs

          The word program is used in two ways: to describe individual instructions, or source code, created by the programmer, and to describe an entire piece of executable software. This distinction can cause enormous confusion, so we will try to distinguish between the source code on one hand, and the executable on the other.

          Procedural, Structured, and Object-Oriented Programming

          Until recently, programs were thought of as a series of procedures that acted upon data. A procedure, or function, is a set of specific instructions executed one after the other. The data was quite separate from the procedures, and the trick in programming was to keep track of which functions called which other functions, and what data was changed. To make sense of this potentially confusing situation, structured programming was created.
          The principal idea behind structured programming is as simple as the idea of divide and conquer. A computer program can be thought of as consisting of a set of tasks. Any task that is too complex to be described simply would be broken down into a set of smaller component tasks, until the tasks were sufficiently small and self-contained that they were easily understood.

          C++ and Object-Oriented Programming

          C++ fully supports object-oriented programming, including the four pillars of object-oriented development: encapsulation, data hiding, inheritance, and polymorphism.

          Encapsulation and Data Hiding

          When an engineer needs to add a resistor to the device she is creating, she doesn't typically build a new one from scratch. She walks over to a bin of resistors, examines the coloured bands that indicate the properties, and picks the one she needs. The resistor is a "black box" as far as the engineer is concerned: she does not much care how it does its work as long as it conforms to her specifications, and she does not need to look inside the box to use it in her design.
          The property of being a self-contained unit is called encapsulation. With encapsulation, we can accomplish data hiding. Data hiding is the highly valued characteristic that an object can be used without the user knowing or caring how it works internally. Just as you can use a refrigerator without knowing how the compressor works, you can use a well-designed object without knowing about its internal data members.
          Similarly, when the engineer uses the resistor, she need not know anything about the internal state of the resistor. All the properties of the resistor are encapsulated in the resistor object; they are not spread out through the circuitry. It is not necessary to understand how the resistor works in order to use it effectively. Its data is hidden inside the resistor's casing.

          C++ supports the properties of encapsulation and data hiding through the creation of user-defined types, called classes. Once created, a well-defined class acts as a fully encapsulated entity; it is used as a whole unit. The actual inner workings of the class should be hidden. Users of a well-defined class do not need to know how the class works; they just need to know how to use it.

          Inheritance and Reuse

          When the engineers at Acme Motors want to build a new car, they have two choices: they can start from scratch, or they can modify an existing model. Perhaps their Star model is nearly perfect, but they'd like to add a turbocharger and a six-speed transmission. The chief engineer would prefer not to start from the ground up, but rather to say, "Let's build another Star, but let's add these additional capabilities. We'll call the new model a Quasar." A Quasar is a kind of Star, but one with new features.

          C++ supports the idea of reuse through inheritance. A new type, which is an extension of an existing type, can be declared. This new subclass is said to derive from the existing type and is sometimes called a derived type. The Quasar is derived from the Star and thus inherits all its qualities, but can add to them as needed. The new Quasar might respond differently than a Star does when you press down on the accelerator. The Quasar might engage fuel injection and a turbocharger, while the Star would simply let gasoline into its carburetor. A user, however, does not have to know about these differences. He can just "floor it," and the right thing will happen, depending on which car he's driving.

          C++ supports the idea that different objects do "the right thing" through what is called function polymorphism and class polymorphism. Poly means many, and morph means form. Polymorphism refers to the same name taking many forms.

          How C++ Evolved

          As object-oriented analysis, design and programming began to catch on, Bjarne Stroustrup took the most popular language for commercial software development, C, and extended it to provide the features needed to facilitate object-oriented programming. He created C++, and in less than a decade it has gone from being used by only a handful of developers at AT&T to being the programming language of choice for an estimated one million developers worldwide. It is expected that by the end of the decade, C++ will be the predominant language for commercial software development.
          While it is true that C++ is a superset of C, and that virtually any legal C program is a legal C++ program, the leap from C to C++ is very significant. C++ benefited from its relationship to C for many years, as C programmers could ease into their use of C++. To really get the full benefit of C++, however, many programmers found they had to unlearn much of what they knew and learn a whole new way of conceptualizing and solving programming problems.

          A first impression of C++
          We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Send your email to Frank Brokken.

          Please state the document version you're referring to, as found in the title (in this document: 4.4.2).

          In this chapter the usage of C++ is further explored. The possibility to declare functions in structs is further illustrated using examples. The concept of a class is introduced.


          Variables and Constants

          Programs need a way to store the data they use. Variables and constants offer various ways to represent and manipulate that data.


          What is a variable?

          In C++ a variable is a place to store information. A variable is a location in your computer's memory in which you can store a value and from which you can later retrieve that value.
          Your computer's memory can be viewed as a series of cubbyholes. Each cubbyhole is one of many, many such holes all lined up. Each cubbyhole, or memory location, is numbered sequentially. These numbers are known as memory addresses. A variable reserves one or more cubbyholes in which you may store a value.
          Your variable name (for example, myVariable) is a label on one of these cubbyholes, so that you can find it easily without knowing its actual memory address.

          3.1: More extensions of C in C++

          Before we continue with the `real' object-oriented approach to programming, we first introduce some extensions to the C programming language, encountered in C++: not mere differences between C and C++, but syntactical constructs and keywords that are not found in C.

          3.1.1: The scope resolution operator ::

          The syntax of C++ introduces a number of new operators, of which the scope resolution operator :: is described first. This operator can be used in situations where a global variable exists with the same name as a local variable:


          #include <stdio.h>

          int
              counter = 50;                   // global variable

          int main()
          {
              for (int counter = 1;           // this refers to the
                   counter < 10;              // local variable
                   counter++)
              {
                  printf("%d\n",
                          ::counter           // global variable
                          /                   // divided by
                          counter);           // local variable
              }
              return (0);
          }

          In this code fragment the scope operator is used to address a global variable instead of the local variable with the same name. The usage of the scope operator is more extensive than just this, but the other purposes will be described later.

          3.1.2: cout, cin and cerr

          In analogy to C, C++ defines standard input- and output streams which are opened when a program is executed. The streams are:

          • cout, analogous to stdout,
          • cin, analogous to stdin,
          • cerr, analogous to stderr.

          Syntactically these streams are not used with functions: instead, data are read from the streams or written to them using the operators <<, called the insertion operator and >>, called the extraction operator. This is illustrated in the example below:



          #include <iostream>
          #include <iomanip>

          using namespace std;

          int main()
          {
              int
                  ival;
              char
                  sval[30];

              cout << "Enter a number:" << endl;
              cin >> ival;
              cout << "And now a string:" << endl;
              cin >> setw(30) >> sval;        // read no more than sval can hold

              cout << "The number is: " << ival << endl
                   << "And the string is: " << sval << endl;
              return (0);
          }

          This program reads a number and a string from the cin stream (usually the keyboard) and prints these data to cout. Concerning the streams and their usage we remark the following:

          • The streams are declared in the header file iostream.
          • The streams cout, cin and cerr are in fact `objects' of a given class (more on classes later), processing the input and output of a program. Note that the term `object', as used here, means the set of data and functions which defines the item in question.
          • The stream cin reads data and copies the information to variables (e.g., ival in the above example) using the extraction operator >>. We will describe later how operators in C++ can perform quite different actions than what they are defined to do by the language grammar, such as is the case here. We've seen function overloading. In C++ operators can also have multiple definitions, which is called operator overloading.
          • The operators which manipulate cin, cout and cerr (i.e., >> and <<) also manipulate variables of different types. In the above example cout << ival results in the printing of an integer value, whereas cout << "Enter a number" results in the printing of a string. The actions of the operators therefore depend on the type of supplied variables.
          • Special symbolic constants are used for special situations. The termination of a line written by cout is realized by inserting the endl symbol, rather than using the string "\n".

          The streams cin, cout and cerr are in fact not part of the C++ grammar, as defined in the compiler which parses source files. The streams are part of the definitions in the header file iostream. This is comparable to the fact that functions such as printf() are not part of the C grammar, but were originally written by people who considered such functions handy and collected them in a run-time library.

          Whether a program uses the old-style functions like printf() and scanf() or whether it employs the new-style streams is a matter of taste. Both styles can even be mixed. A number of advantages and disadvantages is given below:

          • Compared to the standard C functions printf() and scanf(), the usage of the insertion and extraction operators is more type-safe. The format strings which are used with printf() and scanf() can define wrong format specifiers for their arguments, for which the compiler sometimes can't warn. In contrast, argument checking with cin, cout and cerr is performed by the compiler. Consequently it isn't possible to err by providing an int argument in places where, according to the format string, a string argument should appear.
          • The functions printf() and scanf(), and other functions which use format strings, in fact implement a mini-language which is interpreted at run-time. In contrast, the C++ compiler knows exactly which in- or output action to perform given which argument.
          • The usage of the left-shift and right-shift operators in the context of the streams does illustrate the possibilities of C++. Again, it requires a little getting used to, coming from C, but after that these overloaded operators feel rather comfortable.

          The iostream library has a lot more to offer than just cin, cout and cerr. In chapter 12 iostreams will be covered in greater detail.

          3.1.3: The keyword const

          The keyword const very often occurs in C++ programs, even though it is also part of the C grammar, where it's much less used.

          This keyword is a modifier which states that the value of a variable or of an argument may not be modified. In the below example an attempt is made to change the value of a variable ival, which is not legal:



          int main()
          {
              int const               // a constant int..
                  ival = 3;           // initialized to 3

              ival = 4;               // assignment leads
                                      // to an error message

              return (0);
          }

          This example shows how ival may be initialized to a given value in its definition; attempts to change the value later (in an assignment) are not permitted.

          Variables which are declared const can, in contrast to C, be used as the specification of the size of an array, as in the following example:



          int const
              size = 20;
          char
              buf[size];              // 20 chars big

          A further usage of the keyword const is seen in the declaration of pointers, e.g., in pointer-arguments. In the declaration



          char const *buf;

          buf is a pointer variable, which points to chars. Whatever is pointed to by buf may not be changed: the chars are declared as const. The pointer buf itself however may be changed. A statement as *buf = 'a'; is therefore not allowed, while buf++ is.

          In the declaration



          char *const buf;

          buf itself is a const pointer which may not be changed. Whatever chars are pointed to by buf may be changed at will.

          Finally, the declaration



          char const *const buf;

          is also possible; here, neither the pointer nor what it points to may be changed.

          The rule of thumb for the placement of the keyword const is the following: whatever occurs just prior to the keyword may not be changed. The definition or declaration in which const is used should be read from the variable or function identifier back to the type identifier:

          ``Buf is a const pointer to const characters''

          This rule of thumb is especially handy in cases where confusion may occur. In examples of C++ code, one often encounters the reverse: const preceding what should not be altered. That this may result in sloppy code is indicated by our second example above:


          char const *buf;

          What must remain constant here? According to the sloppy interpretation, the pointer cannot be altered (since const precedes the pointer-*). In fact, the char values are the constant entities here, as will be clear when you try to compile the following program:



          int main()
          {
              char const *buf = "hello";

              buf++;                  // accepted by the compiler
              *buf = 'u';             // rejected by the compiler

              return (0);
          }

          Compilation fails on the statement *buf = 'u';, not on the statement buf++.

          3.1.4: References

          Besides the normal declaration of variables, C++ allows `references' to be declared as synonyms for variables. A reference to a variable is like an alias; the variable name and the reference name can both be used in statements which affect the variable:


          int
              int_value;
          int
              &ref = int_value;

          In the above example a variable int_value is defined. Subsequently a reference ref is defined, which due to its initialization addresses the same memory location which int_value occupies. In the definition of ref, the reference operator & indicates that ref is not itself an integer but a reference to one. The two statements



          int_value++; // alternative 1
          ref++; // alternative 2

          have the same effect, as expected. At some memory location an int value is increased by one --- whether that location is called int_value or ref does not matter.

          References serve an important function in C++ as a means to pass arguments which can be modified (`variable arguments' in Pascal-terms). E.g., in standard C, a function which increases the value of its argument by five but which returns nothing (void), needs a pointer argument:



          void increase(int *valp)    // expects a pointer
          {                           // to an int
              *valp += 5;
          }

          int main()
          {
              int
                  x = 0;

              increase(&x);           // the address of x is
              return (0);             // passed as argument
          }

          This construction can also be used in C++ but the same effect can be achieved using a reference:



          void increase(int &valr)    // expects a reference
          {                           // to an int
              valr += 5;
          }

          int main()
          {
              int
                  x = 0;

              increase(x);            // a reference to x is
              return (0);             // passed as argument
          }

          C++ compilers typically implement references by using pointers: in other words, as far as the compiler is concerned, a reference in C++ is usually just an ordinary pointer. However, the programmer does not need to know or to bother about levels of indirection. (Compare this to the Pascal way: an argument which is declared as var is in fact also a pointer, but the programmer needn't know.)

          It can be argued whether code such as the above is clear: the statement increase (x) in the main() function suggests that not x itself but a copy is passed. Yet the value of x changes because of the way increase() is defined.

          Our suggestions for the usage of references as arguments to functions are therefore the following:

          • In those situations where a called function does not alter its arguments, a copy of the variable can be passed:


            #include <stdio.h>

            void some_func(int val)
            {
                printf("%d\n", val);
            }

            int main()
            {
                int
                    x = 0;

                some_func(x);           // a copy is passed, so
                return (0);             // x won't be changed
            }

          • When a function changes the value of its argument, the address or a reference can be passed, whichever you prefer:


            void by_pointer(int *valp)
            {
                *valp += 5;
            }

            void by_reference(int &valr)
            {
                valr += 5;
            }

            int main ()
            {
                int
                    x = 0;

                by_pointer(&x);         // a pointer is passed
                by_reference(x);        // x is altered by reference
                return (0);             // x might be changed
            }

          • References have an important role in those cases where the argument will not be changed by the function, but where it is desirable to pass a reference to the variable instead of a copy of the whole variable. Such a situation occurs when a large variable, e.g., a struct, is passed as argument, or is returned from the function. In these cases the copying operations tend to become significant factors when the entire structure must be copied, and it is preferred to use references. If the argument isn't changed by the function, or if the caller shouldn't change the returned information, the use of the const keyword is appropriate and should be used.

            Consider the following example:



            struct Person               // some large structure
            {
                char
                    name [80],
                    address [90];
                double
                    salary;
            };

            Person
                person[50];             // database of persons

            void printperson (Person const &p)  // printperson expects a
            {                                   // reference to a structure
                printf ("Name: %s\n"            // but won't change it
                        "Address: %s\n",
                        p.name, p.address);
            }

            Person const &getperson(int index)  // get a person by indexvalue
            {
                ...
                return (person[index]);         // a reference is returned,
            }                                   // not a copy of person[index]

            int main ()
            {
                Person
                    boss;

                printperson (boss);             // no pointer is passed,
                                                // so variable won't be
                                                // altered by function
                printperson(getperson(5));      // references, not copies
                                                // are passed here
                return (0);
            }

          • It should furthermore be noted here that there is another reason for using references when passing objects as function arguments: when passing a reference to an object, the activation of a copy constructor is avoided.

          References also can lead to extremely `ugly' code. A function can also return a reference to a variable, as in the following example:



          int &func()
          {
              static int
                  value;

              return (value);
          }

          This allows the following constructions:



          func() = 20;
          func() += func ();

          It is probably superfluous to note that such constructions should not normally be used. Nonetheless, there are situations where it is useful to return a reference. Even though this is discussed later, we have seen an example of this phenomenon at our previous discussion of the iostreams. In a statement like cout << "Hello" << endl, the insertion operator returns a reference to cout. So, in this statement first the "Hello" is inserted into cout, producing a reference to cout. Via this reference the endl is then inserted in the cout object, again producing a reference to cout. This latter reference is not further used.

          A number of differences between pointers and references are pointed out in the list below:

          • A reference cannot exist by itself, i.e., without something to refer to. A declaration of a reference like

            int &ref;

            is not allowed; what would ref refer to?

          • References can, however, be declared as external. These references were initialized elsewhere.
          • References may exist as parameters of functions: they are initialized when the function is called.
          • References may be used in the return types of functions. In those cases the function determines to what the return value will refer.
          • References may be used as data members of classes. We will return to this usage later.
          • In contrast, pointers are variables by themselves. They point at something concrete or just ``at nothing''.
          • References are aliases for other variables and cannot be re-aliased to another variable. Once a reference is defined, it refers to its particular variable.
          • In contrast, pointers can be reassigned to point to different variables.
          • When an address-of operator & is used with a reference, the expression yields the address of the variable to which the reference applies. In contrast, ordinary pointers are variables themselves, so the address of a pointer variable has nothing to do with the address of the variable pointed to.

          3.2: Functions as part of structs

          The first chapter described that functions can be part of structs . Such functions are called member functions or methods. This section discusses the actual definition of such functions.

          The code fragment below illustrates a struct in which data fields for a name and address are present. A function print() is included in the struct definition:



          struct person
          {
              char
                  name [80],
                  address [80];
              void
                  print (void);
          };

          The member function print() is defined using the structure name (person) and the scope resolution operator (::):



          void person::print()
          {
              printf("Name: %s\n"
                     "Address: %s\n", name, address);
          }

          In the definition of this member function, the function name is preceded by the struct name followed by ::. The code of the function shows how the fields of the struct can be addressed without using the type name: in this example the function print() prints a variable name. Since print() is a part of the struct person, the variable name implicitly refers to the same type.

          The usage of this struct could be, e.g.:



          person
          p;

          strcpy(p.name, "Karel");
          strcpy(p.address, "Rietveldlaan 37");
          p.print();

          The advantage of member functions lies in the fact that the called function can automatically address the data fields of the structure for which it was invoked. As such, in the statement p.print() the structure p is the `substrate': the variables name and address which are used in the code of print() refer to the same struct p.

          3.3: Several new data types

          In C the following basic data types are available: void, char, short, int, long, float and double. C++ extends these basic types with several extra types: the types bool, wchar_t and long double. The type long double is a floating-point type offering at least the precision of double. Apart from these basic types a standard type string is available.

          3.3.1: The `bool' data type

          In C the following basic data types are available: void, char, int, float and double. C++ extends these five basic types with several extra types. In this section the type bool is introduced.

          The type bool represents boolean (logical) values, for which the (now reserved) values true and false may be used. Apart from these reserved values, integral values may also be assigned to variables of type bool, which are implicitly converted to true and false according to the following conversion rules (assume intValue is an int-variable, and boolValue is a bool-variable):



          // from int to bool:
          boolValue = intValue ? true : false;

          // from bool to int:

          intValue = boolValue ? 1 : 0;

          Furthermore, when bool values are inserted into, e.g., cout, then 1 is written for true values, and 0 is written for false values. Consider the following example:


          cout << "A true value: " << true << endl
          << "A false value: " << false << endl;

          The bool data type is found in other programming languages as well. Pascal has its type Boolean, and Java has a boolean type. Different from these languages, C++'s type bool acts like a kind of int type: it's primarily a documentation-improving type, having just two values true and false. Actually, these values can be interpreted as enum values for 1 and 0. Doing so would neglect the philosophy behind the bool data type, but nevertheless: assigning true to an int variable neither produces warnings nor errors.

          Using the bool-type is generally more intuitively clear than using int. Consider the following prototypes:



          bool exists(char const *fileName); // (1)
          int exists(char const *fileName); // (2)

          For the first prototype (1), most people will expect the function to return true if the given filename is the name of an existing file. However, using the second prototype some ambiguity arises: intuitively the returnvalue 1 is appealing, as it leads to constructions like


          if (exists("myfile"))
          cout << "myfile exists";

          On the other hand, many functions (like access(), stat(), etc.) return 0 to indicate a successful operation, reserving other values to indicate various types of errors.

          As a rule of thumb we suggest the following: If a function should inform its caller about the success or failure of its task, let the function return a bool value. If the function should return success or various types of errors, let the function return enum values, documenting the situation when the function returns. Only when the function returns a meaningful integral value (like the sum of two int values), let the function return an int value.

          3.3.2: The `wchar_t' data type

          The wchar_t type is an extension of the char basic type, to accommodate wide character values, such as the Unicode character set. The value of sizeof(wchar_t) depends on the implementation: where it is 2, 65,536 different character values can be represented, while other systems use 4 bytes.

          Note that a programming language like Java has a data type char that is comparable to C++'s wchar_t type, while Java's byte data type is comparable to C++'s char type. Very convenient....

          3.4: Data hiding: public, private and class

          As mentioned previously C++ contains special syntactical possibilities to implement data hiding. Data hiding is the ability of one program part to hide its data from other parts; thus avoiding improper addressing or name collisions of data.

          C++ has two special keywords which are concerned with data hiding: private and public. These keywords can be inserted in the definition of a struct. The keyword public defines all subsequent fields of a structure as accessible by all code; the keyword private defines all subsequent fields as only accessible by the code which is part of the struct (i.e., only accessible to the member functions). Besides public and private, C++ defines the keyword protected; this keyword is not often used and it is left for the reader to explore. In a struct all fields are public, unless explicitly stated otherwise.

          With this knowledge we can expand the struct person:



          struct person
          {
              public:
                  void
                      setname (char const *n),
                      setaddress (char const *a),
                      print (void);
                  char const
                      *getname (void),
                      *getaddress (void);
              private:
                  char
                      name [80],
                      address [80];
          };

          The data fields name and address are only accessible for the member functions which are defined in the struct: these are the functions setname(), setaddress() etc.. This property of the data type is given by the fact that the fields name and address are preceded by the keyword private. As an illustration consider the following code fragment:



          person
          x;

          x.setname ("Frank"); // ok, setname() is public
          strcpy (x.name, "Knarf"); // error, name is private

          The concept of data hiding is realized here in the following manner. The actual data of a struct person are named only in the structure definition. The data are accessed by the outside world by special functions, which are also part of the definition. These member functions control all traffic between the data fields and other parts of the program and are therefore also called `interface' functions.

          Also note that the functions setname() and setaddress() are declared as having a char const * argument. This means that the functions will not alter the strings which are supplied as their arguments. In the same vein, the functions getname() and getaddress() return a char const *: the caller may not modify the strings which are pointed to by the return values.

          Two examples of member functions of the struct person are shown below:



          void person::setname(char const *n)
          {
              strncpy(name, n, 79);
              name[79] = '\0';
          }

          char const *person::getname()
          {
              return (name);
          }

          In general, the power of the member functions and of the concept of data hiding lies in the fact that the interface functions can perform special tasks, e.g., checks for the validity of data. In the above example setname() copies only up to 79 characters from its argument to the data member name, thereby avoiding array boundary overflow.

          Another example of the concept of data hiding is the following. As an alternative to member functions which keep their data in memory (as do the above code examples), a runtime library could be developed with interface functions which store their data on file. The conversion of a program which stores person structures in memory to one that stores the data on disk would mean the relinking of the program with a different library.

          Though data hiding can be realized with structs, more often (almost always) classes are used instead. A class is in principle equivalent to a struct except that unless specified otherwise, all members (data or functions) are private. As far as private and public are concerned, a class is therefore the opposite of a struct. The definition of a class person would therefore look exactly as shown above, except for the fact that instead of the keyword struct, class would be used. Our typographic suggestion for class names is a capital as first character, followed by the remainder of the name in lower case (e.g., Person).

          3.5: Structs in C vs. structs in C++

          At the end of this chapter we would like to illustrate the analogy between C and C++ as far as structs are concerned. In C it is common to define several functions to process a struct, which then require a pointer to the struct as one of their arguments. A fragment of an imaginary C header file is given below:


          // definition of a struct PERSON_
          typedef struct
          {
              char
                  name[80],
                  address[80];
          } PERSON_;

          // some functions to manipulate PERSON_ structs

          // initialize fields with a name and address
          extern void initialize(PERSON_ *p, char const *nm,
                                 char const *adr);

          // print information
          extern void print(PERSON_ const *p);

          // etc..

          In C++, the declarations of the involved functions are placed inside the definition of the struct or class. The argument which denotes which struct is involved is no longer needed.



          class Person
          {
              public:
                  void initialize(char const *nm, char const *adr);
                  void print(void);
                  // etc..
              private:
                  char
                      name[80],
                      address[80];
          };

          The struct argument is implicit in C++. A function call in C like



          PERSON_
          x;

          initialize(&x, "some name", "some address");

          becomes in C++:



          Person
          x;

          x.initialize("some name", "some address");

          3.6: Namespaces

          Imagine a math teacher who wants to develop an interactive math program. For this program functions like cos(), sin(), tan() etc. are to be used accepting arguments in degrees rather than arguments in radians. Unfortunately, the function name cos() is already in use, and that function accepts radians as its arguments, rather than degrees.

          Problems like these are normally solved by looking for another name, e.g., the function name cosDegrees() is defined. C++ offers an alternative solution by allowing namespaces to be defined: areas or regions in the code in which identifiers are defined which cannot conflict with existing names defined elsewhere.

          3.6.1: Defining namespaces

          Namespaces are defined according to the following syntax:


          namespace identifier
          {
          // declared or defined entities
          // (declarative region)
          }

          The identifier used in the definition of a namespace is a standard C++ identifier.

          Within the declarative region, introduced in the above code example, functions, variables, structs, classes and even (nested) namespaces can be defined or declared. Namespaces cannot be defined within a block. So it is not possible to define a namespace within, e.g., a function. However, it is possible to define a namespace using multiple namespace declarations. Namespaces are said to be open. This means that a namespace CppAnnotations could be defined in a file file1.cc and also in a file file2.cc. The entities defined in the CppAnnotations namespace of files file1.cc and file2.cc are then united in one CppAnnotations namespace region. For example:



          // in file1.cc
          namespace CppAnnotations
          {
              double cos(double argInDegrees)
              {
                  ...
              }
          }

          // in file2.cc
          namespace CppAnnotations
          {
              double sin(double argInDegrees)
              {
                  ...
              }
          }

          Both sin() and cos() are now defined in the same CppAnnotations namespace.

          Namespace entities can also be defined outside of their namespaces.

          3.6.1.1: Declaring entities in namespaces

          Instead of defining entities in a namespace, entities may also be declared in a namespace. This allows us to put all the declarations of a namespace in a header file which can thereupon be included in sources in which the entities of a namespace are used. Such a header file could contain, e.g.,



          namespace CppAnnotations
          {
              double cos(double degrees);
              double sin(double degrees);
          }

          3.6.1.2: A closed namespace

          Namespaces can be defined without a name. Such a namespace is anonymous and it restricts the usability of the defined entities to the source file in which the anonymous namespace is defined.

          The entities that are defined in the anonymous namespace are accessible the same way as static functions and variables in C. The static keyword can still be used in C++, but its use is more dominant in class definitions (see chapter 5). In situations where static variables or functions are necessary, the use of the anonymous namespace is preferred.

          3.6.2: Referring to entities

          Given a namespace and entities that are defined or declared in it, the scope resolution operator can be used to refer to the entities that are defined in the namespace. For example, to use the function cos() defined in the CppAnnotations namespace the following code could be used:


          // assume the CppAnnotations namespace is declared in the next header
          // file:
          #include

          int main()
          {
              cout << "The cosine of 60 degrees is: " <<
                      CppAnnotations::cos(60) << endl;
              return (0);
          }

          This is a rather cumbersome way to refer to the cos() function in the CppAnnotations namespace, especially so if the function is frequently used.

          Therefore, an abbreviated form (just cos()) can be used by declaring that cos() will refer to CppAnnotations::cos(). For this, the using-declaration can be used. Following



          using CppAnnotations::cos; // note: no function prototype, just the
          // name of the entity is required.

          the function cos() will refer to the cos() function in the CppAnnotations namespace. This implies that the standard cos() function, accepting radians, cannot be used automatically anymore. The plain scope resolution operator can be used to reach the generic cos() function:


          int main()
          {
              using CppAnnotations::cos;
              ...
              cout << cos(60)         // this uses CppAnnotations::cos()
                   << ::cos(1.5)      // this uses the standard cos() function
                   << endl;
              return (0);
          }

          Note that a using-declaration can be used inside a block. The using declaration prevents the definition of entities having the same name as the one used in the using declaration: it is not possible to use a using declaration for a variable value in the CppAnnotations namespace, and to define (or declare) an identically named object in the block in which the using declaration was placed:


          int main()
          {
              using CppAnnotations::value;
              ...
              cout << value << endl;  // this uses CppAnnotations::value

              int
                  value;              // error: value already defined.

              return (0);
          }

          3.6.2.1: The using directive

          A generalized alternative to the using-declaration is the using-directive:



          using namespace CppAnnotations;

          Following this directive, all entities defined in the CppAnnotations namespace are used as if they were declared by using declarations.

          While the using-directive is a quick way to import all the names of the CppAnnotations namespace (assuming the entities are declared or defined separately from the directive), it is at the same time a somewhat dirty way to do so, as it is less clear which entity will be used in a particular block of code.

          If, e.g., cos() is defined in the CppAnnotations namespace, the function CppAnnotations::cos() will be used when cos() is called in the code. However, if cos() is not defined in the CppAnnotations namespace, the standard cos() function will be used. The using directive does not document as clearly which entity will be used as the using declaration does. For this reason, the using directive is somewhat deprecated.

          3.6.3: The standard namespace

          Apart from the anonymous namespace, many entities of the runtime available software (e.g., cout, cin, cerr and the templates defined in the Standard Template Library) are now defined in the std namespace.

          Regarding the discussion in the previous section, one should use a using declaration for these entities. For example, in order to use the cout stream, the code should start with something like



          #include <iostream>

          using std::cout;

          Often, however, the identifiers that are defined in the std namespace can all be accepted without much thought. Because of that, one often encounters a using directive, rather than a using declaration with the std namespace. So, instead of the mentioned using declaration a construction like


          #include <iostream>

          using namespace std;

          is often encountered. Whether this should be encouraged is subject of some dispute. Long using declarations are of course inconvenient too. So as a rule of thumb one might decide to stick to using declarations, up to the point where the list becomes impractically long, at which point a using directive could be considered.

          3.6.4: Nesting namespaces and namespace aliasing

          Namespaces can be nested. The following code shows the definition of a nested namespace:


          namespace CppAnnotations
          {
              namespace Virtual
              {
                  void
                      *pointer;
              }
          }

          Now the variable pointer is defined in the Virtual namespace, nested under the CppAnnotations namespace. In order to refer to this variable, the following options are available:
          • The fully qualified name can be used. A fully qualified name of an entity is a list of all the namespaces that are visited until the definition of the entity is reached, glued together by the scope resolution operator:


            int main()
            {
            CppAnnotations::Virtual::pointer = 0;
            return (0);
            }

          • A using directive for the enclosing CppAnnotations namespace can be used (a using declaration cannot name a namespace). Now Virtual can be used without any prefix, but pointer must be used with the Virtual:: prefix:


            ...
            using namespace CppAnnotations;

            int main()
            {
            Virtual::pointer = 0;
            return (0);
            }

          • A using declaration for CppAnnotations::Virtual::pointer can be used. Now pointer can be used without any prefix:


            ...
            using CppAnnotations::Virtual::pointer;

            int main()
            {
            pointer = 0;
            return (0);
            }

          • A using directive or directives can be used:


            ...
            using namespace CppAnnotations::Virtual;

            int main()
            {
            pointer = 0;
            return (0);
            }

            Alternatively, two separate using directives could have been used:


            ...
            using namespace CppAnnotations;
            using namespace Virtual;

            int main()
            {
            pointer = 0;
            return (0);
            }

          • A combination of using declarations and using directives can be used. E.g., a using directive can be used for the CppAnnotations namespace, and a using declaration can be used for the Virtual::pointer variable:


            ...
            using namespace CppAnnotations;
            using Virtual::pointer;

            int main()
            {
            pointer = 0;
            return (0);
            }

          At every using directive all entities of that namespace can be used without any further prefix. If a namespace is nested, then that namespace can also be used without any further prefix. However, the entities defined in the nested namespace still need the nested namespace's name. Only by means of a using declaration or directive can the qualified name of the nested namespace be omitted.

          When fully qualified names are somehow preferred, while the long form (like CppAnnotations::Virtual::pointer) is at the same time considered too long, a namespace alias can be used:



          namespace CV = CppAnnotations::Virtual;

          This defines CV as an alias for the full name. So, to refer to the pointer variable, the following construction can be used:


          CV::pointer = 0;

          Of course, a namespace alias itself can also be used in a using declaration or directive.

          3.6.4.1: Defining entities outside of their namespaces

          It is not strictly necessary to define members of namespaces within a namespace region. By prefixing the member by its namespace or namespaces a member can be defined outside of a namespace region. This may be done at the global level, or at intermediate levels in the case of nested namespaces. So while it is not possible to define a member of namespace A within the region of namespace C, it is possible to define a member of namespace A::B within the region of namespace A.

          Note, however, that when a member of a namespace is defined outside of a namespace region, it must still be declared within the region.

          Assume the type int INT8[8] is defined in the CppAnnotations::Virtual namespace.

          Now suppose we want to define (at the global level) a member function funny of namespace CppAnnotations::Virtual, returning a pointer to CppAnnotations::Virtual::INT8. The definition of such a function could be as follows (first everything is defined inside the CppAnnotations::Virtual namespace):



          namespace CppAnnotations
          {
              namespace Virtual
              {
                  void
                      *pointer;

                  typedef int INT8[8];

                  INT8 *funny()
                  {
                      INT8
                          *ip = new INT8[1];

                      for (int idx = 0; idx < sizeof(INT8) / sizeof(int); ++idx)
                          (*ip)[idx] = (1 + idx) * (1 + idx);

                      return (ip);
                  }
              }
          }

          The function funny() defines an array of one INT8 vector, and returns its address after initializing the vector by the squares of the first eight natural numbers.

          Now the function funny() can be defined outside of the CppAnnotations::Virtual as follows:



          namespace CppAnnotations
          {
              namespace Virtual
              {
                  void
                      *pointer;

                  typedef int INT8[8];

                  INT8 *funny();
              }
          }

          CppAnnotations::Virtual::INT8 *CppAnnotations::Virtual::funny()
          {
              INT8
                  *ip = new INT8[1];

              for (int idx = 0; idx < sizeof(INT8) / sizeof(int); ++idx)
              {
                  cout << idx << endl;
                  (*ip)[idx] = idx * idx;
              }

              return (ip);
          }

          At the final code fragment note the following:
          • funny() is declared inside of the CppAnnotations::Virtual namespace.
          • The definition outside of the namespace region requires us to use the fully qualified name of the function and of its return type.
          • Inside the block of the function funny we are within the CppAnnotations::Virtual namespace, so inside the function fully qualified names (e.g., for INT8) are not required any more.

          Finally, note that the function could also have been defined in the CppAnnotations region. In that case the Virtual namespace would have been required for the function name and its return type, while the internals of the function would remain the same:



          namespace CppAnnotations
          {
              namespace Virtual
              {
                  void
                      *pointer;

                  typedef int INT8[8];

                  INT8 *funny();
              }

              Virtual::INT8 *Virtual::funny()
              {
                  INT8
                      *ip = new INT8[1];

                  for (int idx = 0; idx < sizeof(INT8) / sizeof(int); ++idx)
                  {
                      cout << idx << endl;
                      (*ip)[idx] = idx * idx;
                  }

                  return (ip);
              }
          }

          Sunday, September 23, 2007

          About c

          Let us begin with a quick introduction to C. Our aim is to show the essential elements of the language in real programs, but without getting bogged down in details, rules, and exceptions. At this point, we are not trying to be complete or even precise (save that the examples are meant to be correct). We want to get you as quickly as possible to the point where you can write useful programs, and to do that we have to concentrate on the basics: variables and constants, arithmetic, control flow, functions, and the rudiments of input and output. We are intentionally leaving out of this chapter features of C that are important for writing bigger programs. These include pointers, structures, most of C's rich set of operators, several control-flow statements, and the standard library.

          This approach has its drawbacks. Most notable is that the complete story on any particular feature is not found here, and the tutorial, by being brief, may also be misleading. And because the examples do not use the full power of C, they are not as concise and elegant as they might be. We have tried to minimize these effects, but be warned. Another drawback is that later chapters will necessarily repeat some of this chapter. We hope that the repetition will help you more than it annoys.

          In any case, experienced programmers should be able to extrapolate from the material in this chapter to their own programming needs. Beginners should supplement it by writing small, similar programs of their own. Both groups can use it as a framework on which to hang the more detailed descriptions that begin in Chapter 2.

          1.1 Getting Started

          The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages:
          Print the words
          hello, world

          This is a big hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where your output went. With these mechanical details mastered, everything else is comparatively easy.

          In C, the program to print ``hello, world'' is

          #include <stdio.h>

          main()
          {
              printf("hello, world\n");
          }
          Just how to run this program depends on the system you are using. As a specific example, on the UNIX operating system you must create the program in a file whose name ends in ``.c'', such as hello.c, then compile it with the command
             cc hello.c
          If you haven't botched anything, such as omitting a character or misspelling something, the compilation will proceed silently, and make an executable file called a.out. If you run a.out by typing the command
             a.out
          it will print
             hello, world
          On other systems, the rules will be different; check with a local expert.

          Now, for some explanations about the program itself. A C program, whatever its size, consists of functions and variables. A function contains statements that specify the computing operations to be done, and variables store values used during the computation. C functions are like the subroutines and functions in Fortran or the procedures and functions of Pascal. Our example is a function named main. Normally you are at liberty to give functions whatever names you like, but ``main'' is special - your program begins executing at the beginning of main. This means that every program must have a main somewhere.

          main will usually call other functions to help perform its job, some that you wrote, and others from libraries that are provided for you. The first line of the program,

             #include <stdio.h>
          tells the compiler to include information about the standard input/output library; the line appears at the beginning of many C source files. The standard library is described in Chapter 7.

          One method of communicating data between functions is for the calling function to provide a list of values, called arguments, to the function it calls. The parentheses after the function name surround the argument list. In this example, main is defined to be a function that expects no arguments, which is indicated by the empty list ( ).

          #include <stdio.h>                include information about standard library
          main()                            define a function called main
                                            that received no argument values
          {                                 statements of main are enclosed in braces
              printf("hello, world\n");     main calls library function printf
                                            to print this sequence of characters
          }                                 \n represents the newline character

          The first C program

          The statements of a function are enclosed in braces { }. The function main contains only one statement,

             printf("hello, world\n");
          A function is called by naming it, followed by a parenthesized list of arguments, so this calls the function printf with the argument "hello, world\n". printf is a library function that prints output, in this case the string of characters between the quotes.

          A sequence of characters in double quotes, like "hello, world\n", is called a character string or string constant. For the moment our only use of character strings will be as arguments for printf and other functions.

          The sequence \n in the string is C notation for the newline character, which when printed advances the output to the left margin on the next line. If you leave out the \n (a worthwhile experiment), you will find that there is no line advance after the output is printed. You must use \n to include a newline character in the printf argument; if you try something like

             printf("hello, world
          ");
          the C compiler will produce an error message.

          printf never supplies a newline character automatically, so several calls may be used to build up an output line in stages. Our first program could just as well have been written

             #include <stdio.h>

             main()
             {
                 printf("hello, ");
                 printf("world");
                 printf("\n");
             }
          to produce identical output.

          Notice that \n represents only a single character. An escape sequence like \n provides a general and extensible mechanism for representing hard-to-type or invisible characters. Among the others that C provides are \t for tab, \b for backspace, \" for the double quote and \\ for the backslash itself. There is a complete list in Section 2.3.

          Exercise 1-1. Run the ``hello, world'' program on your system. Experiment with leaving out parts of the program, to see what error messages you get.

          Exercise 1-2. Experiment to find out what happens when printf's argument string contains \c, where c is some character not listed above.

          1.2 Variables and Arithmetic Expressions

          The next program uses the formula °C = (5/9)(°F-32) to print the following table of Fahrenheit temperatures and their centigrade or Celsius equivalents:
             0       -17
             20      -6
             40      4
             60      15
             80      26
             100     37
             120     48
             140     60
             160     71
             180     82
             200     93
             220     104
             240     115
             260     126
             280     137
             300     148
          The program itself still consists of the definition of a single function named main. It is longer than the one that printed ``hello, world'', but not complicated. It introduces several new ideas, including comments, declarations, variables, arithmetic expressions, loops , and formatted output.
             #include <stdio.h>

             /* print Fahrenheit-Celsius table
                 for fahr = 0, 20, ..., 300 */
             main()
             {
                 int fahr, celsius;
                 int lower, upper, step;

                 lower = 0;      /* lower limit of temperature scale */
                 upper = 300;    /* upper limit */
                 step = 20;      /* step size */

                 fahr = lower;
                 while (fahr <= upper) {
                     celsius = 5 * (fahr-32) / 9;
                     printf("%d\t%d\n", fahr, celsius);
                     fahr = fahr + step;
                 }
             }
          The two lines
             /* print Fahrenheit-Celsius table
                 for fahr = 0, 20, ..., 300 */
          are a comment, which in this case explains briefly what the program does. Any characters between /* and */ are ignored by the compiler; they may be used freely to make a program easier to understand. Comments may appear anywhere where a blank, tab or newline can.

          In C, all variables must be declared before they are used, usually at the beginning of the function before any executable statements. A declaration announces the properties of variables; it consists of a type name and a list of variables, such as

             int fahr, celsius;
             int lower, upper, step;
          The type int means that the variables listed are integers, by contrast with float, which means floating point, i.e., numbers that may have a fractional part. The range of both int and float depends on the machine you are using; 16-bit ints, which lie between -32768 and +32767, are common, as are 32-bit ints. A float number is typically a 32-bit quantity, with at least six significant digits and magnitude generally between about 10^-38 and 10^38.

          C provides several other data types besides int and float, including:

          char      character - a single byte
          short     short integer
          long      long integer
          double    double-precision floating point

          The size of these objects is also machine-dependent. There are also arrays, structures and unions of these basic types, pointers to them, and functions that return them, all of which we will meet in due course.

          Computation in the temperature conversion program begins with the assignment statements

             lower = 0;
             upper = 300;
             step = 20;
          which set the variables to their initial values. Individual statements are terminated by semicolons.

          Each line of the table is computed the same way, so we use a loop that repeats once per output line; this is the purpose of the while loop

             while (fahr <= upper) {
                 ...
             }
          The while loop operates as follows: The condition in parentheses is tested. If it is true (fahr is less than or equal to upper), the body of the loop (the three statements enclosed in braces) is executed. Then the condition is re-tested, and if true, the body is executed again. When the test becomes false (fahr exceeds upper) the loop ends, and execution continues at the statement that follows the loop. There are no further statements in this program, so it terminates.

          The body of a while can be one or more statements enclosed in braces, as in the temperature converter, or a single statement without braces, as in

             while (i < j)
                 i = 2 * i;
          In either case, we will always indent the statements controlled by the while by one tab stop (which we have shown as four spaces) so you can see at a glance which statements are inside the loop. The indentation emphasizes the logical structure of the program. Although C compilers do not care about how a program looks, proper indentation and spacing are critical in making programs easy for people to read. We recommend writing only one statement per line, and using blanks around operators to clarify grouping. The position of braces is less important, although people hold passionate beliefs. We have chosen one of several popular styles. Pick a style that suits you, then use it consistently.

          Most of the work gets done in the body of the loop. The Celsius temperature is computed and assigned to the variable celsius by the statement

                  celsius = 5 * (fahr-32) / 9;
          The reason for multiplying by 5 and dividing by 9 instead of just multiplying by 5/9 is that in C, as in many other languages, integer division truncates: any fractional part is discarded. Since 5 and 9 are integers, 5/9 would be truncated to zero and so all the Celsius temperatures would be reported as zero.

          This example also shows a bit more of how printf works. printf is a general-purpose output formatting function, which we will describe in detail in Chapter 7. Its first argument is a string of characters to be printed, with each % indicating where one of the other (second, third, ...) arguments is to be substituted, and in what form it is to be printed. For instance, %d specifies an integer argument, so the statement

                  printf("%d\t%d\n", fahr, celsius);
          causes the values of the two integers fahr and celsius to be printed, with a tab (\t) between them.

          Each % construction in the first argument of printf is paired with the corresponding second argument, third argument, etc.; they must match up properly by number and type, or you will get wrong answers.

          By the way, printf is not part of the C language; there is no input or output defined in C itself. printf is just a useful function from the standard library of functions that are normally accessible to C programs. The behaviour of printf is defined in the ANSI standard, however, so its properties should be the same with any compiler and library that conforms to the standard.

          In order to concentrate on C itself, we don't talk much about input and output until Chapter 7. In particular, we will defer formatted input until then. If you have to input numbers, read the discussion of the function scanf in Section 7.4. scanf is like printf, except that it reads input instead of writing output.

          There are a couple of problems with the temperature conversion program. The simpler one is that the output isn't very pretty because the numbers are not right-justified. That's easy to fix; if we augment each %d in the printf statement with a width, the numbers printed will be right-justified in their fields. For instance, we might say

             printf("%3d %6d\n", fahr, celsius);
          to print the first number of each line in a field three digits wide, and the second in a field six digits wide, like this:
               0    -17
              20     -6
              40      4
              60     15
              80     26
             100     37
             ...
          The more serious problem is that because we have used integer arithmetic, the Celsius temperatures are not very accurate; for instance, 0°F is actually about -17.8°C, not -17. To get more accurate answers, we should use floating-point arithmetic instead of integer. This requires some changes in the program. Here is the second version:
             #include <stdio.h>

             /* print Fahrenheit-Celsius table
                 for fahr = 0, 20, ..., 300; floating-point version */
             main()
             {
                 float fahr, celsius;
                 int lower, upper, step;

                 lower = 0;      /* lower limit of temperature scale */
                 upper = 300;    /* upper limit */
                 step = 20;      /* step size */

                 fahr = lower;
                 while (fahr <= upper) {
                     celsius = (5.0/9.0) * (fahr-32.0);
                     printf("%3.0f %6.1f\n", fahr, celsius);
                     fahr = fahr + step;
                 }
             }
          This is much the same as before, except that fahr and celsius are declared to be float and the formula for conversion is written in a more natural way. We were unable to use 5/9 in the previous version because integer division would truncate it to zero. A decimal point in a constant indicates that it is floating point, however, so 5.0/9.0 is not truncated because it is the ratio of two floating-point values.

          If an arithmetic operator has integer operands, an integer operation is performed. If an arithmetic operator has one floating-point operand and one integer operand, however, the integer will be converted to floating point before the operation is done. If we had written (fahr-32), the 32 would be automatically converted to floating point. Nevertheless, writing floating-point constants with explicit decimal points even when they have integral values emphasizes their floating-point nature for human readers.

          The detailed rules for when integers are converted to floating point are in Chapter 2. For now, notice that the assignment

             fahr = lower;
          and the test
             while (fahr <= upper) 
          also work in the natural way - the int is converted to float before the operation is done.

          The printf conversion specification %3.0f says that a floating-point number (here fahr) is to be printed at least three characters wide, with no decimal point and no fraction digits. %6.1f describes another number (celsius) that is to be printed at least six characters wide, with 1 digit after the decimal point. The output looks like this:

               0  -17.8
              20   -6.7
              40    4.4
             ...
          Width and precision may be omitted from a specification: %6f says that the number is to be at least six characters wide; %.2f specifies two characters after the decimal point, but the width is not constrained; and %f merely says to print the number as floating point.

          %d        print as decimal integer
          %6d       print as decimal integer, at least 6 characters wide
          %f        print as floating point
          %6f       print as floating point, at least 6 characters wide
          %.2f      print as floating point, 2 characters after decimal point
          %6.2f     print as floating point, at least 6 wide and 2 after decimal point

          Among others, printf also recognizes %o for octal, %x for hexadecimal, %c for character, %s for character string and %% for itself.

          Exercise 1-3. Modify the temperature conversion program to print a heading above the table.

          Exercise 1-4. Write a program to print the corresponding Celsius to Fahrenheit table.

          1.3 The for statement

          There are plenty of different ways to write a program for a particular task. Let's try a variation on the temperature converter.
             #include <stdio.h>

             /* print Fahrenheit-Celsius table */
             main()
             {
                 int fahr;

                 for (fahr = 0; fahr <= 300; fahr = fahr + 20)
                     printf("%3d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
             }
          This produces the same answers, but it certainly looks different. One major change is the elimination of most of the variables; only fahr remains, and we have made it an int. The lower and upper limits and the step size appear only as constants in the for statement, itself a new construction, and the expression that computes the Celsius temperature now appears as the third argument of printf instead of a separate assignment statement.

          This last change is an instance of a general rule - in any context where it is permissible to use the value of some type, you can use a more complicated expression of that type. Since the third argument of printf must be a floating-point value to match the %6.1f, any floating-point expression can occur here.

          The for statement is a loop, a generalization of the while. If you compare it to the earlier while, its operation should be clear. Within the parentheses, there are three parts, separated by semicolons. The first part, the initialization

             fahr = 0
          is done once, before the loop proper is entered. The second part is the test or condition that controls the loop:
             fahr <= 300 
          This condition is evaluated; if it is true, the body of the loop (here a single printf) is executed. Then the increment step
             fahr = fahr + 20
          is executed, and the condition re-evaluated. The loop terminates if the condition has become false. As with the while, the body of the loop can be a single statement or a group of statements enclosed in braces. The initialization, condition and increment can be any expressions.

          The choice between while and for is arbitrary, based on which seems clearer. The for is usually appropriate for loops in which the initialization and increment are single statements and logically related, since it is more compact than while and it keeps the loop control statements together in one place.

          Exercise 1-5. Modify the temperature conversion program to print the table in reverse order, that is, from 300 degrees to 0.

          1.4 Symbolic Constants

          A final observation before we leave temperature conversion forever. It's bad practice to bury ``magic numbers'' like 300 and 20 in a program; they convey little information to someone who might have to read the program later, and they are hard to change in a systematic way. One way to deal with magic numbers is to give them meaningful names. A #define line defines a symbolic name or symbolic constant to be a particular string of characters:

             #define name replacement text

          Thereafter, any occurrence of name (not in quotes and not part of another name) will be replaced by the corresponding replacement text. The name has the same form as a variable name: a sequence of letters and digits that begins with a letter. The replacement text can be any sequence of characters; it is not limited to numbers.

             #include <stdio.h>

             #define LOWER  0       /* lower limit of table */
             #define UPPER  300     /* upper limit */
             #define STEP   20      /* step size */

             /* print Fahrenheit-Celsius table */
             main()
             {
                 int fahr;

                 for (fahr = LOWER; fahr <= UPPER; fahr = fahr + STEP)
                     printf("%3d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
             }
          The quantities LOWER, UPPER and STEP are symbolic constants, not variables, so they do not appear in declarations. Symbolic constant names are conventionally written in upper case so they can be readily distinguished from lower case variable names. Notice that there is no semicolon at the end of a #define line.

          1.5 Character Input and Output

          We are going to consider a family of related programs for processing character data. You will find that many programs are just expanded versions of the prototypes that we discuss here.

          The model of input and output supported by the standard library is very simple. Text input or output, regardless of where it originates or where it goes to, is dealt with as streams of characters. A text stream is a sequence of characters divided into lines; each line consists of zero or more characters followed by a newline character. It is the responsibility of the library to make each input or output stream conform to this model; the C programmer using the library need not worry about how lines are represented outside the program.

          The standard library provides several functions for reading or writing one character at a time, of which getchar and putchar are the simplest. Each time it is called, getchar reads the next input character from a text stream and returns that as its value. That is, after

             c = getchar();
          the variable c contains the next character of input. The characters normally come from the keyboard; input from files is discussed in Chapter 7.

          The function putchar prints a character each time it is called:

             putchar(c);
          prints the contents of the integer variable c as a character, usually on the screen. Calls to putchar and printf may be interleaved; the output will appear in the order in which the calls are made.

          1.5.1 File Copying

          Given getchar and putchar, you can write a surprising amount of useful code without knowing anything more about input and output. The simplest example is a program that copies its input to its output one character at a time:
             read a character
             while (character is not end-of-file indicator)
                 output the character just read
                 read a character
          Converting this into C gives:
             #include <stdio.h>

             /* copy input to output; 1st version */
             main()
             {
                 int c;

                 c = getchar();
                 while (c != EOF) {
                     putchar(c);
                     c = getchar();
                 }
             }
          The relational operator != means ``not equal to''.

          What appears to be a character on the keyboard or screen is of course, like everything else, stored internally just as a bit pattern. The type char is specifically meant for storing such character data, but any integer type can be used. We used int for a subtle but important reason.

          The problem is distinguishing the end of input from valid data. The solution is that getchar returns a distinctive value when there is no more input, a value that cannot be confused with any real character. This value is called EOF, for ``end of file''. We must declare c to be a type big enough to hold any value that getchar returns. We can't use char since c must be big enough to hold EOF in addition to any possible char. Therefore we use int.

          EOF is an integer defined in <stdio.h>, but the specific numeric value doesn't matter as long as it is not the same as any char value. By using the symbolic constant, we are assured that nothing in the program depends on the specific numeric value.

          The program for copying would be written more concisely by experienced C programmers. In C, any assignment, such as

             c = getchar();
          is an expression and has a value, which is the value of the left hand side after the assignment. This means that an assignment can appear as part of a larger expression. If the assignment of a character to c is put inside the test part of a while loop, the copy program can be written this way:
             #include <stdio.h>

             /* copy input to output; 2nd version */
             main()
             {
                 int c;

                 while ((c = getchar()) != EOF)
                     putchar(c);
             }
          The while gets a character, assigns it to c, and then tests whether the character was the end-of-file signal. If it was not, the body of the while is executed, printing the character. The while then repeats. When the end of the input is finally reached, the while terminates and so does main.

          This version centralizes the input - there is now only one reference to getchar - and shrinks the program. The resulting program is more compact, and, once the idiom is mastered, easier to read. You'll see this style often. (It's possible to get carried away and create impenetrable code, however, a tendency that we will try to curb.)

          The parentheses around the assignment within the condition are necessary. The precedence of != is higher than that of =, which means that in the absence of parentheses the relational test != would be done before the assignment =. So the statement

             c = getchar() != EOF
          is equivalent to
             c = (getchar() != EOF)
          This has the undesired effect of setting c to 0 or 1, depending on whether or not the call of getchar returned end of file. (More on this in Chapter 2.)

          Exercise 1-6. Verify that the expression getchar() != EOF is 0 or 1.

          Exercise 1-7. Write a program to print the value of EOF.

          1.5.2 Character Counting

          The next program counts characters; it is similar to the copy program.
             #include <stdio.h>

             /* count characters in input; 1st version */
             main()
             {
                 long nc;

                 nc = 0;
                 while (getchar() != EOF)
                     ++nc;
                 printf("%ld\n", nc);
             }
          The statement
             ++nc;
          presents a new operator, ++, which means increment by one. You could instead write nc = nc + 1 but ++nc is more concise and often more efficient. There is a corresponding operator -- to decrement by 1. The operators ++ and -- can be either prefix operators (++nc) or postfix operators (nc++); these two forms have different values in expressions, as will be shown in Chapter 2, but ++nc and nc++ both increment nc. For the moment we will stick to the prefix form.

          The character counting program accumulates its count in a long variable instead of an int. long integers are at least 32 bits. Although on some machines, int and long are the same size, on others an int is 16 bits, with a maximum value of 32767, and it would take relatively little input to overflow an int counter. The conversion specification %ld tells printf that the corresponding argument is a long integer.

          It may be possible to cope with even bigger numbers by using a double (double precision float). We will also use a for statement instead of a while, to illustrate another way to write the loop.

             #include <stdio.h>

             /* count characters in input; 2nd version */
             main()
             {
                 double nc;

                 for (nc = 0; getchar() != EOF; ++nc)
                     ;
                 printf("%.0f\n", nc);
             }
          printf uses %f for both float and double; %.0f suppresses the printing of the decimal point and the fraction part, which is zero.

          The body of this for loop is empty, because all the work is done in the test and increment parts. But the grammatical rules of C require that a for statement have a body. The isolated semicolon, called a null statement, is there to satisfy that requirement. We put it on a separate line to make it visible.

          Before we leave the character counting program, observe that if the input contains no characters, the while or for test fails on the very first call to getchar, and the program produces zero, the right answer. This is important. One of the nice things about while and for is that they test at the top of the loop, before proceeding with the body. If there is nothing to do, nothing is done, even if that means never going through the loop body. Programs should act intelligently when given zero-length input. The while and for statements help ensure that programs do reasonable things with boundary conditions.

          1.5.3 Line Counting

          The next program counts input lines. As we mentioned above, the standard library ensures that an input text stream appears as a sequence of lines, each terminated by a newline. Hence, counting lines is just counting newlines:
             #include <stdio.h>

             /* count lines in input */
             main()
             {
                 int c, nl;

                 nl = 0;
                 while ((c = getchar()) != EOF)
                     if (c == '\n')
                         ++nl;
                 printf("%d\n", nl);
             }
          The body of the while now consists of an if, which in turn controls the increment ++nl. The if statement tests the parenthesized condition, and if the condition is true, executes the statement (or group of statements in braces) that follows. We have again indented to show what is controlled by what.

          The double equals sign == is the C notation for ``is equal to'' (like Pascal's single = or Fortran's .EQ.). This symbol is used to distinguish the equality test from the single = that C uses for assignment. A word of caution: newcomers to C occasionally write = when they mean ==. As we will see in Chapter 2, the result is usually a legal expression, so you will get no warning.

          A character written between single quotes represents an integer value equal to the numerical value of the character in the machine's character set. This is called a character constant, although it is just another way to write a small integer. So, for example, 'A' is a character constant; in the ASCII character set its value is 65, the internal representation of the character A. Of course, 'A' is to be preferred over 65: its meaning is obvious, and it is independent of a particular character set.

          The escape sequences used in string constants are also legal in character constants, so '\n' stands for the value of the newline character, which is 10 in ASCII. You should note carefully that '\n' is a single character, and in expressions is just an integer; on the other hand, "\n" is a string constant that happens to contain only one character. The topic of strings versus characters is discussed further in Chapter 2.

          Exercise 1-8. Write a program to count blanks, tabs, and newlines.

          Exercise 1-9. Write a program to copy its input to its output, replacing each string of one or more blanks by a single blank.

          Exercise 1-10. Write a program to copy its input to its output, replacing each tab by \t, each backspace by \b, and each backslash by \\. This makes tabs and backspaces visible in an unambiguous way.

          1.5.4 Word Counting

          The fourth in our series of useful programs counts lines, words, and characters, with the loose definition that a word is any sequence of characters that does not contain a blank, tab or newline. This is a bare-bones version of the UNIX program wc.
             #include <stdio.h>

             #define IN   1  /* inside a word */
             #define OUT  0  /* outside a word */

             /* count lines, words, and characters in input */
             main()
             {
                 int c, nl, nw, nc, state;

                 state = OUT;
                 nl = nw = nc = 0;
                 while ((c = getchar()) != EOF) {
                     ++nc;
                     if (c == '\n')
                         ++nl;
                     if (c == ' ' || c == '\n' || c == '\t')
                         state = OUT;
                     else if (state == OUT) {
                         state = IN;
                         ++nw;
                     }
                 }
                 printf("%d %d %d\n", nl, nw, nc);
             }
          Every time the program encounters the first character of a word, it counts one more word. The variable state records whether the program is currently in a word or not; initially it is ``not in a word'', which is assigned the value OUT. We prefer the symbolic constants IN and OUT to the literal values 1 and 0 because they make the program more readable. In a program as tiny as this, it makes little difference, but in larger programs, the increase in clarity is well worth the modest extra effort to write it this way from the beginning. You'll also find that it's easier to make extensive changes in programs where magic numbers appear only as symbolic constants.

          The line

             nl = nw = nc = 0;
          sets all three variables to zero. This is not a special case, but a consequence of the fact that an assignment is an expression with a value and assignments associate from right to left. It's as if we had written
             nl = (nw = (nc = 0));
          The operator || means OR, so the line
             if (c == ' ' || c == '\n' || c == '\t')
          says ``if c is a blank or c is a newline or c is a tab''. (Recall that the escape sequence \t is a visible representation of the tab character.) There is a corresponding operator && for AND; its precedence is just higher than ||. Expressions connected by && or || are evaluated left to right, and it is guaranteed that evaluation will stop as soon as the truth or falsehood is known. If c is a blank, there is no need to test whether it is a newline or tab, so these tests are not made. This isn't particularly important here, but is significant in more complicated situations, as we will soon see.

          The example also shows an else, which specifies an alternative action if the condition part of an if statement is false. The general form is

             if (expression)
                 statement1
             else
                 statement2
          One and only one of the two statements associated with an if-else is performed. If the expression is true, statement1 is executed; if not, statement2 is executed. Each statement can be a single statement or several in braces. In the word count program, the one after the else is an if that controls two statements in braces.

          Exercise 1-11. How would you test the word count program? What kinds of input are most likely to uncover bugs if there are any?

          Exercise 1-12. Write a program that prints its input one word per line.

          1.6 Arrays

          Let us write a program to count the number of occurrences of each digit, of white space characters (blank, tab, newline), and of all other characters. This is artificial, but it permits us to illustrate several aspects of C in one program.

          There are twelve categories of input, so it is convenient to use an array to hold the number of occurrences of each digit, rather than ten individual variables. Here is one version of the program:

             #include <stdio.h>

             /* count digits, white space, others */
             main()
             {
                 int c, i, nwhite, nother;
                 int ndigit[10];

                 nwhite = nother = 0;
                 for (i = 0; i < 10; ++i)
                     ndigit[i] = 0;

                 while ((c = getchar()) != EOF)
                     if (c >= '0' && c <= '9')
                         ++ndigit[c-'0'];
                     else if (c == ' ' || c == '\n' || c == '\t')
                         ++nwhite;
                     else
                         ++nother;

                 printf("digits =");
                 for (i = 0; i < 10; ++i)
                     printf(" %d", ndigit[i]);
                 printf(", white space = %d, other = %d\n",
                     nwhite, nother);
             }
          The output of this program on itself is
             digits = 9 3 0 0 0 0 0 0 0 1, white space = 123, other = 345
          The declaration
             int ndigit[10];
          declares ndigit to be an array of 10 integers. Array subscripts always start at zero in C, so the elements are ndigit[0], ndigit[1], ..., ndigit[9]. This is reflected in the for loops that initialize and print the array.

          A subscript can be any integer expression, which includes integer variables like i, and integer constants.

          This particular program relies on the properties of the character representation of the digits. For example, the test

             if (c >= '0' && c <= '9') 
          determines whether the character in c is a digit. If it is, the numeric value of that digit is
             c - '0'
          This works only if '0', '1', ..., '9' have consecutive increasing values. Fortunately, this is true for all character sets.

          By definition, chars are just small integers, so char variables and constants are identical to ints in arithmetic expressions. This is natural and convenient; for example c-'0' is an integer expression with a value between 0 and 9 corresponding to the character '0' to '9' stored in c, and thus a valid subscript for the array ndigit.
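A minimal sketch of the digit test and the c-'0' conversion, wrapped in a function. The name digit_value and the -1 result for non-digits are assumptions made here, not part of the program above.

```c
/* digit_value: numeric value of a digit character, or -1 if c is not a digit */
int digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';     /* relies on the digits having consecutive codes */
    return -1;
}
```

For example, digit_value('7') is 7, which could be used directly as a subscript into ndigit.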

          The decision as to whether a character is a digit, white space, or something else is made with the sequence

             if (c >= '0' && c <= '9')
                 ++ndigit[c-'0'];
             else if (c == ' ' || c == '\n' || c == '\t')
                 ++nwhite;
             else
                 ++nother;
          The pattern
             if (condition1)
                 statement1
             else if (condition2)
                 statement2
             ...
             else
                 statementn
          occurs frequently in programs as a way to express a multi-way decision. The conditions are evaluated in order from the top until some condition is satisfied; at that point the corresponding statement part is executed, and the entire construction is finished. (Any statement can be several statements enclosed in braces.) If none of the conditions is satisfied, the statement after the final else is executed if it is present. If the final else and statement are omitted, as in the word count program, no action takes place. There can be any number of

             else if (condition)
                 statement

          groups between the initial if and the final else.

          As a matter of style, it is advisable to format this construction as we have shown; if each if were indented past the previous else, a long sequence of decisions would march off the right side of the page.
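The multi-way decision can also be sketched as a function that reports which branch was taken. The symbolic constants DIGIT, WHITE and OTHER and the name classify are hypothetical; they are introduced only for this example.

```c
#define DIGIT 0
#define WHITE 1
#define OTHER 2

/* classify: decide which of the three categories the character c falls into */
int classify(int c)
{
    if (c >= '0' && c <= '9')
        return DIGIT;
    else if (c == ' ' || c == '\n' || c == '\t')
        return WHITE;
    else
        return OTHER;       /* reached only when no earlier condition held */
}
```

The conditions are tried from the top, so a character is tested for being a digit first and falls through to the final else only when both earlier tests fail.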

          The switch statement, to be discussed in Chapter 3, provides another way to write a multi-way branch that is particularly suitable when the condition is whether some integer or character expression matches one of a set of constants. For contrast, we will present a switch version of this program in Section 3.4.

          Exercise 1-13. Write a program to print a histogram of the lengths of words in its input. It is easy to draw the histogram with the bars horizontal; a vertical orientation is more challenging.

          Exercise 1-14. Write a program to print a histogram of the frequencies of different characters in its input.

          1.7 Functions

          In C, a function is equivalent to a subroutine or function in Fortran, or a procedure or function in Pascal. A function provides a convenient way to encapsulate some computation, which can then be used without worrying about its implementation. With properly designed functions, it is possible to ignore how a job is done; knowing what is done is sufficient. C makes the use of functions easy, convenient and efficient; you will often see a short function defined and called only once, just because it clarifies some piece of code.

          So far we have used only functions like printf, getchar and putchar that have been provided for us; now it's time to write a few of our own. Since C has no exponentiation operator like the ** of Fortran, let us illustrate the mechanics of function definition by writing a function power(m,n) to raise an integer m to a positive integer power n. That is, the value of power(2,5) is 32. This function is not a practical exponentiation routine, since it handles only positive powers of small integers, but it's good enough for illustration. (The standard library contains a function pow(x,y) that computes x raised to the power y.)

          Here is the function power and a main program to exercise it, so you can see the whole structure at once.

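             #include <stdio.h>

             int power(int m, int n);

             /* test power function */
             main()
             {
                 int i;

                 for (i = 0; i < 10; ++i)
                     printf("%d %d %d\n", i, power(2,i), power(-3,i));
                 return 0;
             }

             /* power:  raise base to n-th power; n >= 0 */
             int power(int base, int n)
             {
                 int i, p;

                 p = 1;
                 for (i = 1; i <= n; ++i)
                     p = p * base;
                 return p;
             }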
          A function definition has this form:
             return-type function-name(parameter declarations, if any)
             {
                 declarations
                 statements
             }
          Function definitions can appear in any order, and in one source file or several, although no function can be split between files. If the source program appears in several files, you may have to say more to compile and load it than if it all appears in one, but that is an operating system matter, not a language attribute. For the moment, we will assume that both functions are in the same file, so whatever you have learned about running C programs will still work.

          The function power is called twice by main, in the line

             printf("%d %d %d\n", i, power(2,i), power(-3,i));
          Each call passes two arguments to power, which each time returns an integer to be formatted and printed. In an expression, power(2,i) is an integer just as 2 and i are. (Not all functions produce an integer value; we will take this up in Chapter 4.)

          The first line of power itself,

              int power(int base, int n)
          declares the parameter types and names, and the type of the result that the function returns. The names used by power for its parameters are local to power, and are not visible to any other function: other routines can use the same names without conflict. This is also true of the variables i and p: the i in power is unrelated to the i in main.

          We will generally use parameter for a variable named in the parenthesized list in a function definition, and argument for the value used in a call of the function. The terms formal argument and actual argument are sometimes used for the same distinction.

          The value that power computes is returned to main by the return statement. Any expression may follow return:

             return expression;
          A function need not return a value; a return statement with no expression causes control, but no useful value, to be returned to the caller, as does ``falling off the end'' of a function by reaching the terminating right brace. And the calling function can ignore a value returned by a function.

          You may have noticed that there is a return statement at the end of main. Since main is a function like any other, it may return a value to its caller, which is in effect the environment in which the program was executed. Typically, a return value of zero implies normal termination; non-zero values signal unusual or erroneous termination conditions. In the interests of simplicity, we have omitted return statements from our main functions up to this point, but we will include them hereafter, as a reminder that programs should return status to their environment.

          The declaration

              int power(int m, int n);
          just before main says that power is a function that expects two int arguments and returns an int. This declaration, which is called a function prototype, has to agree with the definition and uses of power. It is an error if the definition of a function or any uses of it do not agree with its prototype.

          Parameter names need not agree. Indeed, parameter names are optional in a function prototype, so for the prototype we could have written

              int power(int, int);
          Well-chosen names are good documentation, however, so we will often use them.
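As an illustration of a prototype without parameter names, the sketch below declares power first, calls it from a function that appears before its definition, and only then defines it. The caller cube is a hypothetical addition for this example.

```c
int power(int, int);    /* prototype with the parameter names left out */

/* cube: hypothetical caller, placed before the definition of power */
int cube(int x)
{
    return power(x, 3); /* checked against the prototype above */
}

/* power:  raise base to n-th power; n >= 0 */
int power(int base, int n)
{
    int i, p;

    p = 1;
    for (i = 1; i <= n; ++i)
        p = p * base;
    return p;
}
```

The prototype and the definition agree in the number and types of parameters and in the return type, which is all the compiler needs in order to check the call in cube.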

          A note of history: the biggest change between ANSI C and earlier versions is how functions are declared and defined. In the original definition of C, the power function would have been written like this:

             /* power:  raise base to n-th power; n >= 0 */
             /*         (old-style version) */
             power(base, n)
             int base, n;
             {
                 int i, p;

                 p = 1;
                 for (i = 1; i <= n; ++i)
                     p = p * base;
                 return p;
             }
          The parameters are named between the parentheses, and their types are declared before opening the left brace; undeclared parameters are taken as int. (The body of the function is the same as before.)

          The declaration of power at the beginning of the program would have looked like this:

              int power();
          No parameter list was permitted, so the compiler could not readily check that power was being called correctly. Indeed, since by default power would have been assumed to return an int, the entire declaration might well have been omitted.

          The new syntax of function prototypes makes it much easier for a compiler to detect errors in the number of arguments or their types. The old style of declaration and definition still works in ANSI C, at least for a transition period, but we strongly recommend that you use the new form when you have a compiler that supports it.

          Exercise 1-15. Rewrite the temperature conversion program of Section 1.2 to use a function for conversion.

          1.8 Arguments - Call by Value

          One aspect of C functions may be unfamiliar to programmers who are used to some other languages, particularly Fortran. In C, all function arguments are passed ``by value.'' This means that the called function is given the values of its arguments in temporary variables rather than the originals. This leads to some different properties than are seen with ``call by reference'' languages like Fortran or with var parameters in Pascal, in which the called routine has access to the original argument, not a local copy.

          Call by value is an asset, however, not a liability. It usually leads to more compact programs with fewer extraneous variables, because parameters can be treated as conveniently initialized local variables in the called routine. For example, here is a version of power that makes use of this property.

             /* power:  raise base to n-th power; n >= 0; version 2 */
             int power(int base, int n)
             {
                 int p;

                 for (p = 1; n > 0; --n)
                     p = p * base;
                 return p;
             }
          The parameter n is used as a temporary variable, and is counted down (a for loop that runs backwards) until it becomes zero; there is no longer a need for the variable i. Whatever is done to n inside power has no effect on the argument that power was originally called with.
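A short sketch can confirm that the caller's variable is untouched. The checking function exponent_after_call is hypothetical and exists only for this demonstration; power is version 2 from above.

```c
/* power:  raise base to n-th power; n >= 0; version 2 */
int power(int base, int n)
{
    int p;

    for (p = 1; n > 0; --n)     /* counts down power's own copy of n */
        p = p * base;
    return p;
}

/* exponent_after_call: hypothetical check; returns the caller's n after the call */
int exponent_after_call(void)
{
    int n = 5;

    power(2, n);    /* result ignored here; only the effect on n matters */
    return n;       /* still 5, because power received a copy */
}
```

Although power counts its parameter down to zero, exponent_after_call still returns 5.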

          When necessary, it is possible to arrange for a function to modify a variable in a calling routine. The caller must provide the address of the variable to be set (technically a pointer to the variable), and the called function must declare the parameter to be a pointer and access the variable indirectly through it. We will cover pointers in Chapter 5.
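As a preview of what that arrangement looks like, here is a minimal sketch. The names double_it and double_demo are assumptions for illustration, and the pointer notation used is the subject of Chapter 5.

```c
/* double_it: hypothetical function that doubles the caller's variable in place */
void double_it(int *p)
{
    *p = *p * 2;    /* access the caller's variable indirectly through p */
}

/* double_demo: passes the address of a local variable and returns its new value */
int double_demo(void)
{
    int n = 21;

    double_it(&n);  /* &n is the address of n */
    return n;       /* 42: the original variable was changed */
}
```

The pointer itself is still passed by value; what changes is that the called function now holds the location of the caller's variable.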

          The story is different for arrays. When the name of an array is used as an argument, the value passed to the function is the location or address of the beginning of the array - there is no copying of array elements. By subscripting this value, the function can access and alter any element of the array. This is the topic of the next section.
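The difference can be seen in a small sketch in which a function clears an array that belongs to its caller. The names clear and clear_demo are hypothetical.

```c
/* clear: set the first n elements of the caller's array to zero */
void clear(int a[], int n)
{
    int i;

    for (i = 0; i < n; ++i)
        a[i] = 0;   /* writes to the caller's elements, not to a copy */
}

/* clear_demo: returns the sum of the elements after clear has run */
int clear_demo(void)
{
    int counts[3] = { 4, 5, 6 };

    clear(counts, 3);
    return counts[0] + counts[1] + counts[2];
}
```

Here clear_demo returns 0, showing that the elements seen by the caller were altered.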

          1.9 Character Arrays

          The most common type of array in C is the array of characters. To illustrate the use of character arrays and functions to manipulate them, let's write a program that reads a set of text lines and prints the longest. The outline is simple enough:
             while (there's another line)
                 if (it's longer than the previous longest)
                     (save it)
                     (save its length)
             print longest line
          This outline makes it clear that the program divides naturally into pieces. One piece gets a new line, another saves it, and the rest controls the process.

          Since things divide so nicely, it would be well to write them that way too. Accordingly, let us first write a separate function getline to fetch the next line of input. We will try to make the function useful in other contexts. At the minimum, getline has to return a signal about possible end of file; a more useful design would be to return the length of the line, or zero if end of file is encountered. Zero is an acceptable end-of-file return because it is never a valid line length. Every text line has at least one character; even a line containing only a newline has length 1.

          When we find a line that is longer than the previous longest line, it must be saved somewhere. This suggests a second function, copy, to copy the new line to a safe place.

          Finally, we need a main program to control getline and copy. Here is the result.

             #include <stdio.h>
             #define MAXLINE 1000   /* maximum input line length */

             int getline(char line[], int maxline);
             void copy(char to[], char from[]);

             /* print the longest input line */
             main()
             {
                 int len;            /* current line length */
                 int max;            /* maximum length seen so far */
                 char line[MAXLINE];    /* current input line */
                 char longest[MAXLINE]; /* longest line saved here */

                 max = 0;
                 while ((len = getline(line, MAXLINE)) > 0)
                     if (len > max) {
                         max = len;
                         copy(longest, line);
                     }
                 if (max > 0)  /* there was a line */
                     printf("%s", longest);
                 return 0;
             }

             /* getline:  read a line into s, return length */
             int getline(char s[],int lim)
             {
                 int c, i;

                 for (i=0; i < lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)
                     s[i] = c;
                 if (c == '\n') {
                     s[i] = c;
                     ++i;
                 }
                 s[i] = '\0';
                 return i;
             }

             /* copy:  copy 'from' into 'to'; assume to is big enough */
             void copy(char to[], char from[])
             {
                 int i;

                 i = 0;
                 while ((to[i] = from[i]) != '\0')
                     ++i;
             }
          The functions getline and copy are declared at the beginning of the program, which we assume is contained in one file.

          main and getline communicate through a pair of arguments and a returned value. In getline, the arguments are declared by the line

             int getline(char s[], int lim);
          which specifies that the first argument, s, is an array, and the second, lim, is an integer. The purpose of supplying the size of an array in a declaration is to set aside storage. The length of the array s is not necessary in getline since its size is set in main. getline uses return to send a value back to the caller, just as the function power did. This line also declares that getline returns an int; since int is the default return type, it could be omitted.

          Some functions return a useful value; others, like copy, are used only for their effect and return no value. The return type of copy is void, which states explicitly that no value is returned.

          getline puts the character '\0' (the null character, whose value is zero) at the end of the array it is creating, to mark the end of the string of characters. This convention is also used by the C language: when a string constant like

             "hello\n"

          appears in a C program, it is stored as an array of characters containing the characters in the string and terminated with a '\0' to mark the end. The %s format specification in printf expects the corresponding argument to be a string represented in this form. copy also relies on the fact that its input argument is terminated with a '\0', and copies this character into the output.
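The terminating '\0' is what lets a function find the end of a string without being told its size. A minimal sketch, using the hypothetical name length:

```c
/* length: number of characters in s before the terminating '\0' */
int length(char s[])
{
    int i;

    i = 0;
    while (s[i] != '\0')    /* stop at the end-of-string marker */
        ++i;
    return i;
}
```

For the string constant "hello\n" this returns 6: five letters plus the newline, with the '\0' itself not counted.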

          It is worth mentioning in passing that even a program as small as this one presents some sticky design problems. For example, what should main do if it encounters a line which is bigger than its limit? getline works safely, in that it stops collecting when the array is full, even if no newline has been seen. By testing the length and the last character returned, main can determine whether the line was too long, and then cope as it wishes. In the interests of brevity, we have ignored this issue.
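The check on the length and the last character might look roughly like this. The helper name was_cut_short is an assumption, and so is the way it is called: s and len are taken to be the array filled by getline and the length it returned, and lim the limit that was passed to it.

```c
/* was_cut_short: hypothetical check that a line filled the array without reaching a newline */
int was_cut_short(char s[], int len, int lim)
{
    /* a full array holds lim-1 characters; a complete line ends in '\n' */
    return len > 0 && len == lim - 1 && s[len-1] != '\n';
}
```

One caveat of this sketch: a final input line that happens to contain exactly lim-1 characters and ends at end of file without a newline looks the same as a truncated line.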

          There is no way for a user of getline to know in advance how long an input line might be, so getline checks for overflow. On the other hand, the user of copy already knows (or can find out) how big the strings are, so we have chosen not to add error checking to it.

          Exercise 1-16. Revise the main routine of the longest-line program so it will correctly print the length of arbitrarily long input lines, and as much as possible of the text.

          Exercise 1-17. Write a program to print all input lines that are longer than 80 characters.

          Exercise 1-18. Write a program to remove trailing blanks and tabs from each line of input, and to delete entirely blank lines.

          Exercise 1-19. Write a function reverse(s) that reverses the character string s. Use it to write a program that reverses its input a line at a time.

          1.10 External Variables and Scope

          The variables in main, such as line, longest, etc., are private or local to main. Because they are declared within main, no other function can have direct access to them. The same is true of the variables in other functions; for example, the variable i in getline is unrelated to the i in copy. Each local variable in a function comes into existence only when the function is called, and disappears when the function is exited. This is why such variables are usually known as automatic variables, following terminology in other languages. We will use the term automatic henceforth to refer to these local variables. (Chapter 4 discusses the static storage class, in which local variables do retain their values between calls.)

          Because automatic variables come and go with function invocation, they do not retain their values from one call to the next, and must be explicitly set upon each entry. If they are not set, they will contain garbage.
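The contrast between an automatic variable and a static one can be sketched with two hypothetical counters. The static storage class used in the second function is the one deferred to Chapter 4.

```c
/* next_auto: n is automatic, so it is created and set afresh on every call */
int next_auto(void)
{
    int n = 0;

    ++n;
    return n;       /* always 1 */
}

/* next_static: n is static, so it is initialized once and keeps its value */
int next_static(void)
{
    static int n = 0;

    ++n;
    return n;       /* 1, 2, 3, ... on successive calls */
}
```

Calling next_auto twice yields 1 both times, while two calls to next_static yield 1 and then 2.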

          As an alternative to automatic variables, it is possible to define variables that are external to all functions, that is, variables that can be accessed by name by any function. (This mechanism is rather like Fortran COMMON or Pascal variables declared in the outermost block.) Because external variables are globally accessible, they can be used instead of argument lists to communicate data between functions. Furthermore, because external variables remain in existence permanently, rather than appearing and disappearing as functions are called and exited, they retain their values even after the functions that set them have returned.

          An external variable must be defined, exactly once, outside of any function; this sets aside storage for it. The variable must also be declared in each function that wants to access it; this states the type of the variable. The declaration may be an explicit extern statement or may be implicit from context. To make the discussion concrete, let us rewrite the longest-line program with line, longest, and max as external variables. This requires changing the calls, declarations, and bodies of all three functions.

             #include <stdio.h>

             #define MAXLINE 1000    /* maximum input line size */

             int max;                /* maximum length seen so far */
             char line[MAXLINE];     /* current input line */
             char longest[MAXLINE];  /* longest line saved here */

             int getline(void);
             void copy(void);

             /* print longest input line; specialized version */
             main()
             {
                 int len;
                 extern int max;
                 extern char longest[];

                 max = 0;
                 while ((len = getline()) > 0)
                     if (len > max) {
                         max = len;
                         copy();
                     }
                 if (max > 0)  /* there was a line */
                     printf("%s", longest);
                 return 0;
             }

             /* getline:  specialized version */
             int getline(void)
             {
                 int c, i;
                 extern char line[];

                 for (i = 0; i < MAXLINE - 1
                      && (c=getchar()) != EOF && c != '\n'; ++i)
                     line[i] = c;
                 if (c == '\n') {
                     line[i] = c;
                     ++i;
                 }
                 line[i] = '\0';
                 return i;
             }

             /* copy:  specialized version */
             void copy(void)
             {
                 int i;
                 extern char line[], longest[];

                 i = 0;
                 while ((longest[i] = line[i]) != '\0')
                     ++i;
             }
          The external variables in main, getline and copy are defined by the first lines of the example above, which state their type and cause storage to be allocated for them. Syntactically, external definitions are just like definitions of local variables, but since they occur outside of functions, the variables are external. Before a function can use an external variable, the name of the variable must be made known to the function; the declaration is the same as before except for the added keyword extern.

          In certain circumstances, the extern declaration can be omitted. If the definition of the external variable occurs in the source file before its use in a particular function, then there is no need for an extern declaration in the function. The extern declarations in main, getline and copy are thus redundant. In fact, common practice is to place definitions of all external variables at the beginning of the source file, and then omit all extern declarations.

          If the program is in several source files, and a variable is defined in file1 and used in file2 and file3, then extern declarations are needed in file2 and file3 to connect the occurrences of the variable. The usual practice is to collect extern declarations of variables and functions in a separate file, historically called a header, that is included by #include at the front of each source file. The suffix .h is conventional for header names. The functions of the standard library, for example, are declared in headers like <stdio.h>. This topic is discussed at length in Chapter 4, and the library itself in Chapter 7.

          Since the specialized versions of getline and copy have no arguments, logic would suggest that their prototypes at the beginning of the file should be getline() and copy(). But for compatibility with older C programs the standard takes an empty list as an old-style declaration, and turns off all argument list checking; the word void must be used for an explicitly empty list. We will discuss this further in Chapter 4.

          You should note that we are using the words definition and declaration carefully when we refer to external variables in this section. ``Definition'' refers to the place where the variable is created or assigned storage; ``declaration'' refers to places where the nature of the variable is stated but no storage is allocated.
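A compact sketch of the distinction, reusing the variable name max from the program above; the functions note_length and longest_seen are hypothetical.

```c
int max;    /* definition: storage is set aside here (external variables start at zero) */

/* note_length: remember len if it is the largest seen so far */
void note_length(int len)
{
    extern int max;     /* declaration: states the type, allocates nothing */

    if (len > max)
        max = len;
}

/* longest_seen: the definition of max appears earlier in the file, so no extern is needed */
int longest_seen(void)
{
    return max;
}
```

Both functions refer to the same single object; the extern line in note_length is redundant here for the reason given earlier, and is written out only to show the form of a declaration.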

          By the way, there is a tendency to make everything in sight an extern variable because it appears to simplify communications - argument lists are short and variables are always there when you want them. But external variables are always there even when you don't want them. Relying too heavily on external variables is fraught with peril since it leads to programs whose data connections are not all obvious - variables can be changed in unexpected and even inadvertent ways, and the program is hard to modify. The second version of the longest-line program is inferior to the first, partly for these reasons, and partly because it destroys the generality of two useful functions by writing into them the names of the variables they manipulate.

          At this point we have covered what might be called the conventional core of C. With this handful of building blocks, it's possible to write useful programs of considerable size, and it would probably be a good idea if you paused long enough to do so. These exercises suggest programs of somewhat greater complexity than the ones earlier in this chapter.

          Exercise 1-20. Write a program detab that replaces tabs in the input with the proper number of blanks to space to the next tab stop. Assume a fixed set of tab stops, say every n columns. Should n be a variable or a symbolic parameter?

          Exercise 1-21. Write a program entab that replaces strings of blanks by the minimum number of tabs and blanks to achieve the same spacing. Use the same tab stops as for detab. When either a tab or a single blank would suffice to reach a tab stop, which should be given preference?

          Exercise 1-22. Write a program to ``fold'' long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.

          Exercise 1-23. Write a program to remove all comments from a C program. Don't forget to handle quoted strings and character constants properly. C comments don't nest.

          Exercise 1-24. Write a program to check a C program for rudimentary syntax errors like unmatched parentheses, brackets and braces. Don't forget about quotes, both single and double, escape sequences, and comments. (This program is hard if you do it in full generality.)

          Qbasic

          Introduction


          In the early days of programming, it was usually the scientific elite doing the programming, and they were usually trained above and beyond the average American to do their programming work. It was not until 1964 at Dartmouth College that the Beginner's All-purpose Symbolic Instruction Code, more commonly known as BASIC, would be introduced. Using common English to perform processor tasks, BASIC quickly became popular, although it was disliked by programmers of more "low-level" languages such as assembly and FORTRAN. In 1991 Microsoft released their own version of BASIC called QBasic with their MS-DOS 5.0 operating system. Since then, nearly every PC user has owned a copy of QBasic, making it a widely known language.

          QBasic is a very simple language to pick up, and yet it can accomplish a great deal. Granted, you will probably never write Doom or Word Perfect with QBasic, but it has its strong points. One of them is to introduce people to programming without having to worry about the internal workings of the computer. It's simple to create games, business applications, simple databases, and graphics. The best aspect of the language is its close resemblance to English.

          This small tutorial introduces the simple concepts of programming to get you started, so if you already know another language or are already familiar with programming, you may want to skim through the first couple sections. Good luck!

          VARIABLES

          A variable, simply defined, is a name which can contain a value. Programming involves giving values to these names and presenting them in some form to the user. A variable has a type which is defined by the kind of value it holds. If the variable holds a number, it may be an integer, a long integer, or a floating-point number of single or double precision. If the variable holds symbols or text, it may be a character variable or a string variable. These are terms you will become accustomed to as you continue programming.

          Here are some examples of values a variable might contain:

          STRING "hello, this is a string"
          INTEGER 5
          LONG 92883
          SINGLE 39.2932
          DOUBLE 983288.18

          The first is a string. Strings contain text. The last four are number types. But the computer does not know what kind of value you are trying to give a variable unless you tell it! There are two methods of telling the computer what kind of variable you are using:

          Explicitly declare the variable AS a type. This is done by using the DIM statement. Say you wanted to make a variable called number which would contain an integer (whole number, no digits after the decimal point). You would do it like this:

          DIM number AS INTEGER

          Then you would use that variable as an integer. The word DIM actually originates from the word Dimension, but you won't see why until we discuss arrays.

          Put a symbol after the variable name which is defined as representing that type. QBasic has a set of symbols which represent each variable type:

          $ String
          % Integer
          & Long
          ! Single
          # Double

          Appending one of these symbols to the name of a variable when you use it in the program tells the computer that you are using it as that type.

          This is actually a difficult concept to grasp for newcomers to programming. The most common error in QBasic is the infamous Type Mismatch, which you will see a lot. This means that you are trying to put a value into a variable of the wrong type. You might be trying to put the letters "hi there" into an integer variable. If you don't define the type of the variable, then QBasic assumes it is of the Single type, which can often yield unexpected results. I personally prefer to use the type symbols after variable names, but some programmers declare them explicitly, usually at the head of their programs.


          INTERACTING WITH THE COMPUTER

          Now that you know what variables are and how to control them, it's time you learned some programming. QBasic (like all other languages) is set up using pre-defined statements, each used according to the syntax specified for it. It may be helpful to look in the help index to learn a statement, although I've heard many complaints that the help index is too hard. Indeed it is too hard for new programmers, but as you learn more and more statements and their syntaxes, you'll become accustomed to the index and use it as a casual reference. Let's make a program that prints some text on the screen. Type qbasic at the DOS prompt and enter the following program.

          CLS
          PRINT "This text will appear on the screen"
          END

          The first statement -- CLS -- stands for "clear screen." It erases whatever was on the screen before it was executed. PRINT simply displays its argument to the screen at the current text cursor location. The argument in this case is the text enclosed in quotes. PRINT displays text within quotes directly, or it can display the value of a variable, like this:

          CLS
          a% = 50
          b% = 100
          PRINT "The value of a is "; a%; " and the value of b is "; b%
          END

          This will yield the output: The value of a is 50 and the value of b is 100. The semicolons indicate that the next time something is printed, it will be right after where the last PRINT statement left off. Remember that PRINT prints literally what is inside quotes, and the value of the variable which is not in quotes. a% and b% are integers containing values in this example, and their values are printed using the PRINT statement. Say you want to interact with the user now. You'll need to learn a statement called INPUT. INPUT displays a prompt (the first argument) and assigns what the user types in to a variable (the second argument):

          CLS
          INPUT "What is your name? ", yourName$
          INPUT "How old are you? ", age%
          PRINT "So, "; yourName$; ", you are "; age%; " years old. That's interesting."
          END

          This first asks the user for their name and assigns it to the string variable yourName$. Then the age is requested, and the result is printed in a sentence. Try it out! So what happens if you input I DON'T KNOW for the age prompt? You'll get a weird message that says Redo from start. Why? The program is trying to assign a string (text) to an integer (number) type, and this makes no sense, so the user is asked to do it over again. Another cornerstone of programming is the conditional test. Basically, the program tests if a condition is true, and if it is, it does something. It looks like English, so it's not as hard as it sounds.

          CLS
          PRINT "1. Say hello" ' option 1
          PRINT "2. Say nice tie" ' option 2
          INPUT "Enter your selection ", selection%
          IF selection% = 1 THEN PRINT "hello"
          IF selection% = 2 THEN PRINT "nice tie"
          END

          The user is given a set of options, and then they input a value which is assigned to the variable selection%. The value of selection% is then tested, and code is executed based on the value. If the user entered 1, it prints hello, but if they entered 2, it prints nice tie. Also notice the text after the ' in the code. These are remark statements. Anything written after a ' on a line does not affect the outcome of the program. Back to the actual code -- what if the user doesn't input 1 or 2? What if they input 328? This must be taken into account as part of programming. You usually can't assume that the user has half a brain, so if they do something wrong, it can't be allowed to screw up the program. So the ELSE statement comes into play. The logic goes like this: IF the condition is true, THEN do something, but if the condition is anything ELSE, then do something else. You follow? The ELSE statement is used with IF...THEN to handle the case where the condition is not true.

          CLS
          INPUT "Press 1 if you want some pizza.", number%
          IF number% = 1 THEN PRINT "Here's your pizza" ELSE PRINT "You don't get pizza"
          END

          That's a fairly simple example, and real life things will be much more complex. Let's try a "real life" program. QBasic is capable of fairly sophisticated math, so let's put some of it to use. Say your Algebra teacher tells you to find the areas of the circles with the following radiuses (radii, whatever), and he gives you a sheet with hundreds of radii. You decide to boot up your computer and write the following program:

          CLS
          pi! = 3.1415
          INPUT "What is the radius of the circle? ", radius!
          area! = pi! * radius! ^ 2
          PRINT "The area of the circle is ", area!
          END

          First, we're defining the variable pi. It's a single-precision number, which means that it can be a fairly large number with some decimal places. The exclamation mark tells QBasic that pi is of the single type. Next, the user is prompted for the radius of their circle. Then the area is calculated. The * means "times," and the ^ (caret) means "to the power of." radius! ^ 2 means "radius squared." This could also be written as pi! * radius! * radius!.
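
          To see that the two ways of writing the formula agree, here is a short sketch with a fixed radius of 2 (chosen only for illustration). It also shows that ^ is worked out before *, so parentheses change the result.

          ```basic
          ' Exponentiation is done before multiplication,
          ' so the first two lines both compute pi times radius squared.
          CLS
          pi! = 3.1415
          radius! = 2
          PRINT pi! * radius! ^ 2          ' same as pi! * (radius! ^ 2)
          PRINT pi! * radius! * radius!
          PRINT (pi! * radius!) ^ 2        ' different: squares the whole product
          END
          ```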

          There's one big problem with that program. The teacher gave you a sheet with hundreds of radii (please email me if you know how to spell this!). For every radius, you must run the program over again. This is not practical. It would be much more useful if we had some kind of loop that kept repeating until we wanted to quit. Of course, QBasic has the means of performing this feat: loop structures. They start with the statement DO and end with the statement LOOP. You can put an UNTIL or WHILE condition after either the DO or the LOOP. Another option (which we will use) is to break out of the loop manually as soon as a condition is true. Let's revise the previous code:

          CLS
          pi! = 3.1415
          DO ' Begin the loop here
          INPUT "What is the radius of the circle? (-1 to end) ", radius!
          IF radius! = -1 THEN EXIT DO
          area! = pi! * radius! ^ 2
          PRINT "The area of the circle is ", area!
          PRINT
          LOOP ' End the loop here
          END

          Now we can end the program by entering -1 as the radius. The program checks the radius after the user inputs it to see if it is -1. If it is, it exits the loop. If it isn't, it just keeps going on its merry way. The PRINT with no arguments prints a blank line so we can separate our answers. I highly recommend entering this program into QBasic just so you can see exactly how it works.
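
          The LOOP UNTIL form mentioned earlier is handy for making the user try again after a bad entry. The following is only a sketch that reuses the hello / nice tie menu. It assumes the block form of IF (closed with END IF) and the OR operator, neither of which has been covered yet.

          ```basic
          ' Keep asking until the user enters 1 or 2
          CLS
          DO
            PRINT "1. Say hello"
            PRINT "2. Say nice tie"
            INPUT "Enter your selection ", selection%
          LOOP UNTIL selection% = 1 OR selection% = 2
          IF selection% = 1 THEN
            PRINT "hello"
          ELSE
            PRINT "nice tie"
          END IF
          END
          ```

          Because the condition is tested at the bottom of the loop, the menu is always shown at least once.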

          Say you want to print something in a certain pre-defined format. Say you want to print a series of digits with only 2 places after the decimal point and a dollar sign before the first digit. To do this requires the PRINT USING statement, which is very handy in applications for businesses. The PRINT USING statement accepts two types of arguments. The first is a string which has already been defined. This is a special type of string, in that it contains format specifiers, which specify the format of the variables passed as the other arguments. Confused? You won't be. Here's a quick list of the most common format specifiers

          ###      Digits
          &        Prints an entire string
          \  \     Prints a string fit within the backslashes.
                   Anything longer is truncated
          $$       Puts a dollar sign to the left of a number
          .        Prints a decimal point
          ,        Prints a comma every third digit to the left
                   of the decimal point

          These can be combined in a format string to make a user defined way to print something. So $$#,###.## will print a number with a dollar sign to the left of it and a comma after the thousands digit. If the number has more than two decimal places, it is rounded to two. If it has more digits to the left of the decimal point than the format allows, QBasic still prints the whole number, but puts a % sign in front of it to warn you that it did not fit. To use a PRINT USING statement, you first define the format string containing the format specifiers. Then you write PRINT USING, the name of the format string, and the values to fill the places defined in the format string. Here's a code example:

          CLS ' get user input
          INPUT "Enter item name: ", itemname$
          INPUT "How many items?: ", numitems%
          INPUT "What does one cost?: ", itemcost!
          CLS ' display inputs
          format$ = "\            \    #,### $$#,###.## $$#,###,###.##"
          PRINT "Item Name      Quantity Cost       Total Cost"
          PRINT "-------------- -------- ---------- --------------"
          totalcost! = numitems% * itemcost!
          PRINT USING format$; itemname$; numitems%; itemcost!; totalcost!
          END

          First, we get the item name, number of items, and cost per item from the user. Then we clear the screen and define the format string to be used. It contains a fixed length string (text that will be truncated if it is too long), up to 4 digits for the quantity, 4 digits and two decimals for the item cost, and 7 digits and two decimals for the total cost. Then we print out some column headers so we know what each value will represent, and some nice lines to go under the column headers. Then the total cost is calculated by multiplying the number of items by the item cost. Finally, the four variables' values are displayed under the column headers using the PRINT USING statement.
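
          If you want to see how PRINT USING treats values that don't fit the format, try a quick experiment like this one. The numbers and strings are arbitrary examples.

          ```basic
          ' How PRINT USING handles values that do not fit the format
          CLS
          PRINT USING "$$#,###.##"; 1234.567      ' rounded to two decimals
          PRINT USING "##.##"; 3.14159            ' rounded, not cut off
          PRINT USING "##.##"; 12345.678          ' too wide: shown with a leading %
          PRINT USING "\    \"; "truncated text"  ' only the first 6 characters print
          PRINT USING "&"; "the whole string prints"
          END
          ```

          The \ \ field is as wide as the two backslashes plus the spaces between them, which is why the fourth line keeps 6 characters.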


          MORE ADVANCED DATA MANIPULATION

          There are numerous methods to manipulate data and present it to the user in QBasic. One is called an array. An array is a variable which can contain more than one value. For example, you might have an array called a, and you could assign data to the members of that array. There might be a value for a(1), and a different value for a(6). Before an array can be used, it must be declared. Arrays are declared with the DIM statement used in section 1. Here is an example of an array declaration:

          DIM a(1 TO 100) AS INTEGER

          There are now 100 different values that can be assigned to the array a, and they must all be integers. It could also look like this:

          DIM a%(1 TO 100)

          This uses the type symbol % for integer. We call the individual values in the array the members of the array, so array a has 100 members. Array members can be assigned values by using a subscript number within parentheses after the array name, like this:

          a%(1) = 10
          a%(2) = 29
          a%(3) = 39

          And so on. Now you're probably wondering why the statement for declaring an array is DIM. This comes from a term used in earlier programming languages that means dimension. That still doesn't answer the question... why not use the statement DECLARE? Well, an array can have more than one dimension. An array with two dimensions has y members in the second dimension for every one of the x members in the first dimension, as in the following declaration:

          DIM array( 1 TO x, 1 TO y) AS INTEGER

          So if the actual declaration looked like this:

          DIM a$( 1 TO 3, 1 TO 3)

          You would have the following members to assign values to:

          a$(1,1) a$(1,2) a$(1,3)
          a$(2,1) a$(2,2) a$(2,3)
          a$(3,1) a$(3,2) a$(3,3)

          A two dimensional array is useful for tracking the status of each piece in a checkers game, or something of the like. Recall that the last example program of the previous section asked the user for the item name, the item cost, and the quantity of that item, then spit out the data just given in a nice format with the total in the right hand column. Well, with only one item, this program isn't very practical. But now with our newfound knowledge of arrays and the knowledge we already have of loops, we can create a somewhat useful application. The process will start with the program prompting the user for the number of items that will be calculated. Then the program loops for the number of times that the user entered at the beginning, assigning the data entered into members of the arrays we will declare. A variable called netTotal! will be displayed at the end of the program which will contain the total cost of all the items. netTotal! will accumulate each time through the second loop as the current totalCost! is added to it. Type the following code:

          CLS
          INPUT "How many items to be calculated? ", totalItems%
          DIM itemName$(1 TO totalItems%) ' Declare our arrays
          DIM itemCost!(1 TO totalItems%)
          DIM numItems%(1 TO totalItems%)
          DIM totalCost!(1 TO totalItems%)
          FOR i% = 1 TO totalItems% ' First loop: get inputs
            CLS
            PRINT "Item "; i% ' Display the current item number
            PRINT
            INPUT "Item name -- ", itemName$(i%)
            INPUT "Item cost -- ", itemCost!(i%)
            INPUT "Quantity --- ", numItems%(i%)
            totalCost!(i%) = itemCost!(i%) * numItems%(i%)
          NEXT i%
          CLS
          PRINT "Summary"
          PRINT
          format$ = "\               \ $$#,###.##    #,### $$#,###,###.##"
          PRINT "Item name         Item Cost  Quantity Total Cost"
          PRINT "----------------- ---------- -------- --------------"
          FOR i% = 1 TO totalItems% ' Second loop: print each item
            PRINT USING format$; itemName$(i%); itemCost!(i%); numItems%(i%); totalCost!(i%)
            netTotal! = netTotal! + totalCost!(i%)
          NEXT i%
          PRINT
          PRINT "Net Total = "; netTotal!
          END

          This program is much larger than anything we've done as of yet. It is kind of a review of everything we've done so far plus one additional feature: the FOR...NEXT loop. This kind of loop repeats a specified number of times. A starting value is given to a variable, the variable goes up by 1 each time through, and the loop ends once the variable becomes greater than the number specified after the TO.

          FOR i% = 1 TO 10
          PRINT i%
          NEXT i%

          Will loop 10 times, printing the numbers 1 through 10. The loop ends with a NEXT statement followed by the variable the loop increments for. So in our program, we have loops with index numbers (i%) starting at 1 and increasing for every number between 1 and the totalItems%, which is given by the user in the first part of the program. After the user inputs the number of items that will be calculated, four arrays are DIMensioned. They are one dimensional arrays, so they aren't very complex. The first FOR...NEXT loop prompts the user for each item. Then the format string is defined and the column headers are printed. The second FOR...NEXT loop cycles through the members of the four arrays and prints the data using the format string. The data for each member was assigned in the first FOR...NEXT loop. Each cycle through the second loop, the totalCost! of the current item being printed is added to the variable netTotal!. The netTotal! is the total sum of the total costs. After the second FOR...NEXT loop, the net total is printed and the program ends.
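
          FOR...NEXT loops can also be placed inside one another, which is how you walk through a two dimensional array like the checkers board mentioned earlier. This is an illustration only: the 8 x 8 size, the array name board$, and the "X" and "." markers are all assumptions made for the sketch.

          ```basic
          ' Fill a small checkers-style board and print it
          CLS
          DIM board$(1 TO 8, 1 TO 8)
          FOR row% = 1 TO 8
            FOR col% = 1 TO 8
              IF (row% + col%) MOD 2 = 0 THEN
                board$(row%, col%) = "X"
              ELSE
                board$(row%, col%) = "."
              END IF
            NEXT col%
          NEXT row%
          FOR row% = 1 TO 8
            FOR col% = 1 TO 8
              PRINT board$(row%, col%); " ";
            NEXT col%
            PRINT                      ' end the current row
          NEXT row%
          END
          ```

          The inner loop runs all the way through for every single pass of the outer loop, so each of the 64 members is visited once.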

          Say we have a game and when the user makes a record score, they get to write their name on a list of the 10 best scores. But the next time the user plays the game, we want the name and position they recorded the last time they played to still be there. To do this, we must write to what is called a file, and then read it again later. If you are computer literate, then no doubt you know what a file is, and since you are using the internet to read this, I'm assuming you are. If you don't know what a file is and you really want me to explain it, then email me and I will. So we need to write to file. Before you can do anything to a file, you must open it, and there are different ways you can open a file, believe it or not. A file can be opened so you can read from it or write to it, or it can be opened and split into records like a database. Here is a quick table of the different ways you can open a file:

          Input: Read data from the file
          Output: Write data to the file
          Append: Write data to the end of a file
          Binary: Read from or write to a file in binary mode
          Random: Read from or write to a file which is split up in records like a database

          The syntax for the OPEN statement is quite peculiar. Its arguments require us to specify a file name, an access mode (one of the 5 listed above), and a file number. When the file is open, QBasic recognizes it by a number which we assign to the file when we open it. All references made to the file use this number. It can be any number from 1 to 255. An OPEN statement may look like this:

          OPEN "sample.txt" FOR INPUT AS #1

          We will be reading data from this file because it was opened for input. Back to our problem about the game scores. Let's set up a program which will ask for the player's name and give them a random score. Then it will put their name on the list at the appropriate place on the top 10 (if it makes the list). We'll call the file "top10.dat." But say when the user buys the game, there are already 10 names and scores in there. We write the following program to put default names and scores into top10.dat:

          CLS
          OPEN "top10.dat" FOR OUTPUT AS #1
          FOR i% = 1 TO 10
          playername$ = "Player" + STR$(i%)
          playerscore% = 1000 - (i% * 100)
          WRITE #1, playername$, playerscore%
          NEXT i%
          CLOSE #1
          PRINT "Data written to file"
          END


          There are a couple strange features of this program that we have not seen yet. In the second line of the program the file is opened for output so we can write to it. In the fourth line of the program we get into some light string manipulation. A name is generated from the word Player, concatenated with the string form of the current loop number. You can concatenate two strings by using the + operator. The STR$ function returns the string representation of the number passed to it. The opposite of the STR$ function is the VAL function, which returns the numeric value of the string passed to it. Lastly, the WRITE # statement writes the values of its remaining arguments to the file number given as the first argument. Data is written to the file in a format readable by the INPUT # statement, which we will use in the actual program. We need this short program so that the big one has data to read from, or else OPEN will stop with a nasty File not found error when we try to run it. Note that the file should be closed when we are done with it by using the CLOSE statement followed by the file number.
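
          A quick way to get a feel for STR$ and VAL is to try them on their own. This sketch uses made-up values, and it also uses LTRIM$, a function not covered above, to strip the leading space that STR$ puts in front of positive numbers.

          ```basic
          ' STR$ turns a number into a string, VAL turns a string into a number
          CLS
          n% = 7
          s$ = STR$(n%)                 ' positive numbers get a leading space
          PRINT "Player" + s$           ' Player 7
          PRINT "Player" + LTRIM$(s$)   ' LTRIM$ removes the leading space
          v% = VAL("42") + 1            ' VAL gives back a number we can do math with
          PRINT v%
          END
          ```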

          And now we come to the big program, as I have referred to it. It is quite large and complex, and I have not fully described all the statements used in it, so I have broken it down to five sections which I will describe in detail afterwards. Here, at last, is the code:

          ' section 1
          CLS
          RANDOMIZE TIMER
          yourScore% = INT(RND * 1000)
          PRINT "Game Over"
          PRINT "Your score is "; yourScore%
          DIM playername$(1 TO 10) 'Declare arrays for the 10 entries on the list
          DIM playerscore%(1 TO 10)

          ' section 2
          OPEN "top10.dat" FOR INPUT AS #1
          DO WHILE NOT EOF(1) ' EOF means "end of file"
          i% = i% + 1
          INPUT #1, playername$(i%) 'Read from file
          INPUT #1, playerscore%(i%)
          LOOP
          CLOSE #1
          PRINT

          ' section 3
          FOR i% = 1 TO 10
            IF yourScore% >= playerscore%(i%) THEN
              FOR ii% = 10 TO i% + 1 STEP -1 'Go backwards (i% < 10)
                playername$(ii%) = playername$(ii% - 1)
                playerscore%(ii%) = playerscore%(ii% - 1)
              NEXT ii%
              PRINT "Congratulations! You have made the top 10!"
              INPUT "What is your name? ", yourName$
              playername$(i%) = yourName$
              playerscore%(i%) = yourScore%
              EXIT FOR
            END IF
          NEXT i%

          ' section 4
          OPEN "top10.dat" FOR OUTPUT AS #1
          FOR i% = 1 TO 10
          WRITE #1, playername$(i%), playerscore%(i%)
          NEXT i%
          CLOSE #1

          ' section 5
          PRINT
          PRINT "Here is the top 10"
          format$ = "\                        \  ####"
          PRINT "Player Name                Score"
          PRINT "-------------------------- -----"
          FOR i% = 1 TO 10
          PRINT USING format$; playername$(i%); playerscore%(i%)
          NEXT i%
          END

          Section 1: The screen is cleared. The second line contains the statement RANDOMIZE TIMER. When dealing with random numbers, we must give the computer a number so it has something to base the random numbers it will create on. This number is called the random seed. A random seed can be specified with the RANDOMIZE statement. For the seed, we need a number that will not be the same every time we run a program, so we decide to use the number of seconds which have elapsed since midnight. The TIMER function reads the system timer and returns the current number of seconds which have elapsed since midnight. Since this number will be different each time we run the program, it can be used for the random seed. The variable yourScore% is given a random number from 0 to 999 (RND is always less than 1, so INT(RND * 1000) never reaches 1000). In the last part of section one, we declare two arrays with 10 members each.

          Section 2: In the first line we open the file for input, so we can read from it. The second line appears to be very weird at first. We are starting a loop with the DO statement, and then a condition to do while. The function EOF tests the file number passed to the function -- in this case 1 -- and if the end of the file (EOF) has been encountered it returns true. So EOF(1) is true if we are reading the end of the file. But we are using the Boolean operator NOT, so we want to loop while the end of file condition is false. We want to do the loop while we are not reading from the end of file 1. You will learn more about Boolean operators (NOT, AND, OR, XOR, etc.) as you continue programming. Then we assign the current data in the file to the arrays we declared in section 1. The INPUT # statement is used to read from the specified file into the specified variable(s) until a comma or carriage return is encountered.

          Section 3: The main purpose of this section is to re-write the top 10 list if the player's randomly generated score places on the list. We do this by cycling through the list, and testing if yourScore% is greater than or equal to (>=) the current playerscore% being tested. If it is, then we have to shift each existing score below the current one down one to make room for the new score being added. The user is congratulated and prompted for their name. The loop is then exited using the EXIT FOR statement, which then goes to section 4.

          Section 4: This short section simply opens the file for output so we can write to it. Then we write each of the members of the array to file.

          Section 5: In this final section we define the format string, print the headers, and then print all the members of the top 10 and their scores. And that's the end of the program!
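
          The shifting trick in section 3 is the hardest part to picture, so here it is on its own, without any file handling. This is a cut-down sketch: it uses 5 entries instead of 10, and the sample scores and the new score of 600 are made up.

          ```basic
          ' Insert a new score into a short list sorted from high to low
          CLS
          DIM score%(1 TO 5)
          score%(1) = 900: score%(2) = 700: score%(3) = 500
          score%(4) = 300: score%(5) = 100
          newScore% = 600
          FOR i% = 1 TO 5
            IF newScore% >= score%(i%) THEN
              FOR ii% = 5 TO i% + 1 STEP -1   ' shift lower scores down one place
                score%(ii%) = score%(ii% - 1)
              NEXT ii%
              score%(i%) = newScore%
              EXIT FOR
            END IF
          NEXT i%
          FOR i% = 1 TO 5
            PRINT score%(i%);
          NEXT i%
          END
          ```

          The list comes out as 900 700 600 500 300. The old last place score of 100 falls off the bottom, just as the 10th player does in the big program.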

          Now on to a new topic, which will later become related to the previous one: user defined types. Recall that a type is the type of value a variable can have, such as integer, string, long, double, or single. You can create your own types which contain one or more of the already defined types. Here is an example of a user defined type:

          TYPE employeeType
          firstname AS STRING * 30
          lastname AS STRING * 30
          age AS INTEGER
          wage AS SINGLE
          END TYPE

          We have defined a new type, which consists of four data members, as I call them. We can now declare a variable of this type:

          DIM employee AS employeeType

          A variable of a user defined type is like an array, in that it can have more than one value assigned to it. But you can have an array of a variable of a user defined type as well, so things can get rather complex. Anyway, now that you have a user defined type, you can assign values to the data members of that variable. Use a period to access a data member of a type, like this:

          employee.firstname = "Bob"
          employee.lastname = "Foster"
          employee.age = 24
          employee.wage = 6.78

          This could have been helpful in the last program we made with the top 10 list. We could have declared a user defined type called playerType, like this:

          TYPE playerType
          playername AS STRING * 20
          score AS INTEGER
          END TYPE

          and then declared an array of variables of that type

          DIM player(1 TO 10) AS playerType

          That would have made our code more efficient, but not necessarily more readable. Notice when we declare a string in a user defined type that it seems as if we are multiplying it by a number. Actually, the number after the * defines the fixed length of the string. You must define this because the size of a user defined type must be known by the computer. Any value assigned to this string data member which exceeds the length specified is truncated, and any shorter value is padded with spaces.
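
          Here is a rough sketch of what an array of records looks like in use, and of how a fixed length string behaves. The field is called playername here to stay clear of QBasic's NAME statement. The values, and the use of RTRIM$ to strip the padding, are assumptions made for the example.

          ```basic
          ' The top 10 data held in one array of records
          TYPE playerType
            playername AS STRING * 20
            score AS INTEGER
          END TYPE
          DIM player(1 TO 10) AS playerType
          CLS
          player(1).playername = "Bob"
          player(1).score = 900
          PRINT player(1).playername; "|"          ' padded with spaces to 20 characters
          PRINT LEN(player(1).playername)          ' always 20
          PRINT RTRIM$(player(1).playername); "|"  ' RTRIM$ strips the padding
          PRINT player(1).score
          END
          ```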

          User defined types can serve more than to be efficient. They are the heart of the random access file mode, which is commonly used in database files. A database is a method of organizing large quantities of information in records and fields. Every record has the same set of fields. A field's value changes from record to record, however. Just the name of the field remains constant. So how does this relate to user defined types? Well, think of a variable of a user defined type as a record in the database, and the data members as the fields of the record. Employee may be a record, and firstname, lastname, age, and wage may be fields. Values can be assigned to the fields in each record, thus constructing a database. A file opened for random access is organized in this fashion, with records split into fields. Each record in the random access file is given a record number, which can be convenient in a database environment. In the OPEN statement for opening a random access file there is one extra argument. We must specify the length in bytes of how much space one record will occupy -- the record length. This can be easily obtained by taking the LENgth of a variable defined as the user defined type we are going to use. So back to our employee example, we could use the LEN function to get the size in bytes of the employee variable, which is an employeeType. Here's the code:

          recordLen# = LEN(employee)
          OPEN "database.dat" FOR RANDOM AS #1 LEN = recordLen#

          LEN stands for length. You can also use the LEN function to get the number of characters in a string, but that is kind of irrelevant right now. So let's construct a simple database that will keep track of the employees of a business.

          ' Section 1
          CLS
          TYPE employeeType
          firstname AS STRING * 30
          lastname AS STRING * 30
          age AS INTEGER
          wage AS SINGLE
          END TYPE
          DIM employee AS employeeType

          ' Section 2
          PRINT "1.) Create new recordset"
          PRINT "2.) View existing recordset"
          INPUT "Which option? ", selection%

          ' Section 3
          IF selection% = 1 THEN
            INPUT "How many employees are in the company? ", numRecords%
            recordLen# = LEN(employee)
            OPEN "database.dat" FOR RANDOM AS #1 LEN = recordLen#
            FOR i% = 1 TO numRecords%
              CLS
              INPUT "First name: ", employee.firstname
              INPUT "Last name: ", employee.lastname
              INPUT "Age: ", employee.age
              INPUT "Wage: ", employee.wage
              PUT #1, , employee
            NEXT i%
            CLS
            CLOSE #1
            PRINT "Recordset creation complete"
            END
          END IF

          ' Section 4
          IF selection% = 2 THEN
            recordLen# = LEN(employee)
            OPEN "database.dat" FOR RANDOM AS #1 LEN = recordLen#
            format$ = "\                \,\                \ ### $$##.##"
            PRINT "Last name          First name         Age Wage"
            PRINT "------------------ ------------------ --- -------"
            DO WHILE NOT EOF(1)
              GET #1, , employee
              IF EOF(1) THEN EXIT DO ' GET ran past the last record, so stop
              'Sorry about the length of this line!!!
              PRINT USING format$; employee.lastname; employee.firstname; employee.age; employee.wage
            LOOP
            CLOSE #1
            END
          END IF

          I've split this program into sections again because that seems to work well for the larger ones.

          Section 1: We're defining the user defined type and declaring a variable of that type.

          Section 2: The first thing the user sees is a menu with the option to either create a new database (recordset) or view the existing one. The user is prompted to make a selection which is stored in the variable selection%.

          Section 3: If the user chose option 1 -- create a new recordset -- then this code is executed. First we prompt the user for how many employees are in the company so we know how many times to go through a loop. Then we open the file, prompt the user for the data for each field, and write the whole record to file. The record is written using the PUT statement. The first argument in PUT is the file number, the second is the record number, and the third is the data to be written to file. If no record number is specified for the second argument, the record after the one last read or written is used (record 1 when the file has just been opened), so our records simply go into the file one after another. This works fine, so we don't need to worry about explicit record numbers. Notice that we are writing the whole employee variable to file. This is because we write records to file, and the whole variable contains the data for the data members (fields).

          Section 4: If the user chooses to view the existing recordset, then we first open the file, define a format string for the printout, and print the headers. Next we have a loop until the end of file is encountered. Notice the GET statement, which is used to read from a random access file. The first argument is the file number we want to read from, the second is the record number (which we are leaving blank because we can read from the current position [CP] like we did in the PUT statement), and the third is the variable into which we read the data. This variable must be of the same type that we wrote with or else the types will be incompatible. You'd probably get a TYPE MISMATCH error if a different variable is used because the fields are not equal, so the program does not know what to assign the data to.
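
          The record number argument is what makes random access files "random": you can jump straight to any record. The sketch below assumes database.dat was created by the program above with the same employeeType. It also uses the LOF function (the length of the file in bytes), which has not been covered, to work out how many records there are.

          ```basic
          ' Jump straight to one record in database.dat
          TYPE employeeType
            firstname AS STRING * 30
            lastname AS STRING * 30
            age AS INTEGER
            wage AS SINGLE
          END TYPE
          DIM employee AS employeeType
          CLS
          OPEN "database.dat" FOR RANDOM AS #1 LEN = LEN(employee)
          numRecords% = LOF(1) \ LEN(employee)   ' file size divided by record size
          PRINT "Records in file:"; numRecords%
          INPUT "Which record number? ", recNum%
          IF recNum% >= 1 AND recNum% <= numRecords% THEN
            GET #1, recNum%, employee            ' second argument = record number
            PRINT RTRIM$(employee.firstname); " "; RTRIM$(employee.lastname)
            PRINT "Age:"; employee.age; " Wage:"; employee.wage
          ELSE
            PRINT "No such record"
          END IF
          CLOSE #1
          END
          ```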

          Well that's it for random access. If you have understood half of what I've said, feel good. You have a good knowledge of what QBasic is about. Now on to some more advanced programming!


          GRAPHICS

          Graphics programming in QBasic can get fairly complex. Let's start from the beginning. Your screen is made up of hundreds of thousands of pixels. The number of pixels horizontally and vertically determines the resolution of your monitor. Right now, your monitor is set up in a video graphics mode which determines how many pixels can be displayed on screen. My resolution is set to 800x600 right now, but the most common is 640x480. Your graphics mode is determined by the screen resolution in pixels, the text resolution (how many lines and columns of text can fit on your screen), the number of pages of video memory, and the color palette. There are 13 screen graphics modes in QBasic, and each has its different purpose. You can look in the help index in QBasic for a listing of the screen graphics modes and their specifications. Each of the aspects of a screen graphics type can be changed to create effects.

          There are a number of graphics routines used in QBasic which allow a variety of graphical effects. Lets try a few:

          SCREEN 12
          LINE (0,0)-(640,480), 1
          CIRCLE (320, 240), 20, 2
          PSET (10,10), 14
          DRAW "c15 bm100,400 l5e5f5l5"
          END

          The first line initializes the graphics mode to 12, which is 16 colors, 1 page of video memory, and 640x480 resolution.

          LINE draws a line from one coordinate to another. The first optional argument after the coordinates (which are not optional) is the color. After that, a B ("box") or BF ("box fill") can be used to draw a box or a box filled with the color specified. The first coordinate can be omitted and the - left in to draw a line starting from the current graphics position (CP). LINE -(100,0) will draw a line from the current graphics position to the screen coordinate 100,0. To give the end point relative to the CP instead, put STEP in front of it: LINE -STEP(100,0) will draw a line from the current graphics position to 100 pixels to the right.
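
          Here is a short sketch that tries each of the LINE variations in turn. The coordinates and colors are arbitrary. Note that STEP is what makes an end point relative to the current graphics position.

          ```basic
          ' LINE variations: plain line, box, filled box, relative and absolute ends
          SCREEN 12
          LINE (10, 10)-(200, 10), 15           ' white horizontal line
          LINE (10, 30)-(110, 90), 4, B         ' red box outline
          LINE (130, 30)-(230, 90), 1, BF       ' blue filled box
          PSET (300, 200), 14                   ' set the current graphics position
          LINE -STEP(100, 0), 14                ' relative: 100 pixels to the right
          LINE -(0, 479), 2                     ' absolute: on to the bottom-left corner
          END
          ```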

          CIRCLE draws a circle with the center at the coordinates specified. The first argument (required) after the coordinates is the radius of the circle. Then comes the color. After that, if you want to draw an arc, comes the starting angle of the arc in radians, then the ending angle of the arc. To make an arc, first brush up on your geometry, then recall that to convert from degrees to radians you multiply by PI (3.14159265) divided by 180. The last argument is used if you want to make an ellipse, and is the ratio of the y axis to the x axis. So CIRCLE (320,240), 20, 2, 3.1415, 0, .5 would draw an elliptical green arc with the center at the middle of the screen, starting at 180 degrees (PI) and going to 0 degrees, with a compression ratio of 1 to 2 (x axis twice as big as the y). This looks like a wide smiley face mouth.
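
          To put the arc and the degree conversion to work, here is a sketch that draws a whole smiley face. The sizes, positions, and the variable names mouthStart! and mouthEnd! are invented for the example.

          ```basic
          ' Arcs and ellipses with CIRCLE, using degrees converted to radians
          SCREEN 12
          CONST PI = 3.14159265
          CIRCLE (320, 240), 100, 14                   ' full circle: the face
          CIRCLE (280, 210), 10, 15                    ' left eye
          CIRCLE (360, 210), 10, 15                    ' right eye
          mouthStart! = 180 * PI / 180                 ' 180 degrees in radians
          mouthEnd! = 0 * PI / 180                     ' 0 degrees in radians
          CIRCLE (320, 250), 60, 2, mouthStart!, mouthEnd!, .5
          END
          ```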

          PSET fills a pixel at the screen coordinate you specify with the color you specify. In this case, yellow.

          Finally, the DRAW statement. The DRAW statement has its own commands which I strongly suggest you memorize. When we get into scaling and rotation you will need to know your draw commands pretty well. The draw command in the above code example can be read as "color 15 (white), move without drawing to screen coordinate 100,400, draw left 5 units, draw up and right 5 units, draw down and right 5 units, and draw left 5 units." In other words, a triangle. The size of a unit is set by the current scale factor, which by default is 4. The scale is the factor divided by 4, so at the default one unit represents 1 pixel. So our triangle is 10 pixels wide at the base.
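
          To see how a draw string turns into points, here is a rough sketch in Python (not QBasic) that traces a small subset of the DRAW commands. The function trace_draw and its scale parameter are made up for this illustration; it assumes the documented QBasic rule that distances are multiplied by the scale factor divided by 4, and it does not handle ta, X, relative m moves, or any other command.

```python
import re

# Direction letters used by DRAW, as (dx, dy) steps in screen coordinates
# (y grows downward on the screen).
DIRECTIONS = {
    "u": (0, -1), "d": (0, 1), "l": (-1, 0), "r": (1, 0),
    "e": (1, -1), "f": (1, 1), "g": (-1, 1), "h": (-1, -1),
}

def trace_draw(commands, scale=4):
    """Return the points visited by a small subset of DRAW commands.

    Handles only c<n>, s<n>, bm<x>,<y> and the eight direction letters
    with an optional b prefix. Distances are multiplied by scale / 4.
    """
    x, y = 0.0, 0.0
    points = []
    text = commands.lower().replace(" ", "")
    pattern = r"(bm)(-?\d+),(-?\d+)|(b?)([a-z])(\d*)"
    for m in re.finditer(pattern, text):
        if m.group(1):                       # bm x,y: absolute move
            x, y = float(m.group(2)), float(m.group(3))
        else:
            letter, number = m.group(5), m.group(6)
            n = int(number) if number else 1
            if letter == "c":                # color does not move the pen
                continue
            if letter == "s":                # new scale factor
                scale = n
                continue
            dx, dy = DIRECTIONS[letter]
            x += dx * n * scale / 4
            y += dy * n * scale / 4
        points.append((x, y))
    return points

triangle = trace_draw("c15 bm100,400 l5e5f5l5")
xs = [p[0] for p in triangle]
print(triangle)
print("base width in pixels:", max(xs) - min(xs))
```

          Putting s8 at the front of the same string doubles every distance, which is the effect used for scaling later on.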

          There are 16 defined colors in QBasic. The COLOR statement sets the current color for text output. I highly recommend memorizing the colors as well. Run this program:

          SCREEN 12
          FOR i% = 0 TO 15
          COLOR i%
          PRINT "COLOR"; i%
          NEXT i%

          This will print out the 16 colors used in QBasic. 0 is black, so that obviously won't show up. A quick reference for colors while you're in the QBasic IDE (integrated development environment) is to look under the OPTIONS | DISPLAY menu. The colors listed there are in the QBasic order, starting with black and ending with bright white.

          Now you know the basic graphics routines and their uses... let's make a couple of programs that demonstrate them to a greater extent. First, a program which prompts the user for a radius, calculates the area and circumference, and draws the circle in a random color on the screen.

          SCREEN 12
          RANDOMIZE TIMER
          CONST pi! = 3.1415
          DO
          COLOR 15: INPUT "Radius (-1 to quit) --> ", radius!
          IF radius! = -1 THEN EXIT DO
          area! = pi! * radius! ^ 2
          circum! = pi! * 2 * radius!
          COLOR 14
          PRINT "Area = "; area!
          PRINT "Circumference = "; circum!
          CIRCLE (320,240), radius!, INT(RND * 15 + 1)
          DO: LOOP WHILE INKEY$ = ""
          CLS
          LOOP
          COLOR 9: PRINT "Good Bye!"
          END

          We first set the screen graphics mode and seed the random number generator from the system timer. Then we prompt for the radius in a vivid bright white, and test to see if we should end the program. We then calculate the area and circumference, and print the results in yellow. Then we draw the circle centered on the middle of the screen with the radius given, in a random color. This random color is set by multiplying RND (a number from 0 up to, but not including, 1) by 15, adding 1, and dropping the fraction with the INT function, which gives an integer from 1 to 15. The next line seems weird. The INKEY$ function reads the keyboard and returns the string representation of the key pressed. We are looping while INKEY$ is nothing, or in other words, while the user isn't pressing anything. The loop goes on until the user presses any key, at which point INKEY$ returns that key, which you might decide to use. The screen is then cleared for the next entry. If the user breaks the loop by entering -1 for the radius, we print Good Bye! in bright blue letters.
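
          The range produced by INT(RND * 15 + 1) can be checked with a few lines of Python (not QBasic). This is only a sketch: random_color is a made-up name, and Python's random.random stands in for RND, since both return a number from 0 up to but not including 1.

```python
import random

def random_color(rnd=random.random):
    """Mimic INT(RND * 15 + 1): rnd() is in [0, 1), so the result is 1..15."""
    return int(rnd() * 15 + 1)

# Collect the distinct colors seen over many draws.
seen = {random_color() for _ in range(10000)}
print(sorted(seen))
```

          Color 0 (black) never comes up, so the circle is always visible against the black background.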

          There are a lot more colors than just 16. In fact, you can change the values of each of the 16 colors to represent some other color that you specify. You do this with the PALETTE statement. The following applies to screen modes 12 and 13. This statement has two arguments: the color you want to change and the new color value you specify. Specifying a color is the hard part. Here is my version of the syntax of the PALETTE statement:

          PALETTE color, blueValue * 256 ^ 2 + greenValue * 256 + redValue

          color is the color you are changing. The _Values are numbers from 0 to 63 which specify the intensity of that color. You must use the multipliers after the values and use the addition operator to separate them. So let's make a program that fades the screen in and out, from black to purple (blue and red make purple).

          SCREEN 12
          DO
          FOR i% = 1 TO 63
          PALETTE 0, i% * 256 ^ 2 + i%
          NEXT i%
          FOR i% = 63 TO 1 STEP -1
          PALETTE 0, i% * 256 ^ 2 + i%
          NEXT i%
          LOOP WHILE INKEY$ = ""
          END

          We start by changing the value of black (0), which is the background color, to purple, from one degree of blue + red to the next. Then we bring it back down to black by decreasing the blue + red value. We do this over and over until the user presses a key or begins to have seizures.
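
          The arithmetic behind the second PALETTE argument is easy to try out on its own. Here is a small sketch in Python (not QBasic); the function name palette_value is made up for this illustration, and the 0 to 63 limit is the one given above.

```python
def palette_value(red, green, blue):
    """Pack three 0-63 intensities the way the PALETTE formula above does."""
    for level in (red, green, blue):
        if not 0 <= level <= 63:
            raise ValueError("intensity must be between 0 and 63")
    return blue * 256 ** 2 + green * 256 + red

print(palette_value(63, 0, 63))   # full red + full blue (purple)
print(palette_value(0, 0, 0))     # black
```

          The fade program simply walks the red and blue intensities from 1 to 63 and back while leaving green at 0.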

          Scaling and rotation can be accomplished quite easily with the DRAW statement, although it involves some weird looking code. First, let's define a shape that we can scale and rotate.

          box$ = "bu5 l5 d10 r10 u10 l5 bd5"

          Interpretation: "move up 5 units without drawing, draw 5 units left, draw 10 units down, draw 10 units right, draw 10 units up, draw 5 units left, and move 5 units down without drawing." This forms a box. Notice that I started at the center and not at a corner or side, which would seem to be easier. When you rotate something, it is drawn relative to the starting point of the object, and we want the box to rotate so that if we put a pen at each corner, the pens would trace a perfect circle. Therefore we set the center of the box as the starting point of the object. I call this the "object handle," not to be confused with the handle used in the Windows API. The ta draw command stands for "turn angle," and turns the object by the number of degrees you specify. So if we turned the box from 0 to 360 degrees, drawing the box at each step and erasing the previous image, we would get a rotating box. But we need one more function: VARPTR$. VARPTR$ stands for "variable pointer," a term you can completely ignore unless you get into C or Assembly programming. We need to somehow get the box$ shape into the draw string we use in the loop, so we take the address of the shape string and plug it into the draw string. This is done with the X draw command, which tells DRAW to execute the string whose address VARPTR$ supplies. With box$ defined above, here's the code for a rotating box:

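          SCREEN 12
          box$ = "bu5 l5 d10 r10 u10 l5 bd5" ' the shape defined above
          DO
          angle% = angle% + 1
          IF angle% > 360 THEN angle% = 1 ' 360 looks the same as 0, so wrap after it
          DRAW "c0 bm320,240 ta" + STR$(angle% - 1) + "X" + VARPTR$(box$) ' erase previous image
          DRAW "c1 bm320,240 ta" + STR$(angle%) + "X" + VARPTR$(box$) ' draw current image
          LOOP WHILE INKEY$ = ""
          END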

          Not that hard, is it? We draw the box at the previous angle in black, and then draw the box at the current angle in blue.
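
          The reason for putting the object handle at the center can be checked with a little trigonometry. The sketch below is in Python (not QBasic), and the rotate function is made up for this illustration. It only shows that every corner stays the same distance from the handle at any angle; it does not try to model which way DRAW turns on the screen.

```python
import math

def rotate(point, degrees):
    """Rotate (x, y) about the origin (the object handle) by an angle in degrees."""
    x, y = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

# Corners of the 10x10 box, measured from its center (the handle).
corners = [(-5, -5), (-5, 5), (5, 5), (5, -5)]

for angle in (0, 30, 90):
    distances = [math.hypot(*rotate(c, angle)) for c in corners]
    print(angle, [round(d, 3) for d in distances])
```

          If the handle were at a corner instead, the distances would differ from corner to corner and the box would swing around that corner rather than spin in place.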

          Scaling is done pretty much the same way, but instead of changing the angle and erasing the previous image, we change the scale factor and erase the previous image. The default scale factor for the DRAW statement is 4, which means 1 pixel per unit. If we increase this factor then we will have more pixels per unit, thus giving the image the effect of enlargement. So if we set up a FOR...NEXT loop which will increase the scale factor from 2 to, say, 200, we will get the effect of scaling. But let's start with a smaller image which is 8 pixels wide at the default scale instead of 10 so we get a more dramatic effect.

          SCREEN 12
          box$ = "bu4 l4 d8 r8 u8 l4 bd4"
          FOR s% = 2 TO 200
          DRAW "c0 bm320,240 s" + STR$(s% - 1) + "X" + VARPTR$(box$)
          DRAW "c2 bm320,240 s" + STR$(s%) + "X" + VARPTR$(box$)
          NEXT s%
          END

          Notice what we're doing here. We are starting the scale factor at one half of default (2) because the FOR...NEXT loop starts with s% at 2. The s draw command sets the scale factor. Notice also that we must continuously anchor the object handle at a point to keep it scaling about the handle. We do this by moving the object handle to 320,240 (center of screen) each time through the loop. Whenever we want to put a number into the draw string, we must concatenate the string format (STR$) of the number within the string. Instead of concatenating the box$ with the rest of the string, it is faster to only pass the address of the substring with the VARPTR$ function.

          So what if you want to scale and rotate something at the same time? Simple, just set up a FOR...NEXT loop which increases the scale factor as before, and within the loop increase the angle. But instead of subtracting a factor from the angle to erase the previous image, let's do it this way: erase the previous image with the current angle, increase the angle, then draw the current image with the new current angle. This way if we want to change the factor at which the angle increases, we will only have to change one number instead of two.

          SCREEN 12
          box$ = "bu4 l4 d8 r8 u8 l4 bd4"
          FOR s% = 2 TO 250
          DRAW "c0 bm320,240 ta" + STR$(a%) + "s" + STR$(s% - 1) + "X" + VARPTR$(box$)
          a% = a% + 1
          IF a% >= 360 THEN a% = 1
          DRAW "c1 bm320,240 ta" + STR$(a%) + "s" + STR$(s%) + "X" + VARPTR$(box$)
          NEXT s%
          END

          Try it out! Draw strings can get fairly complex, but you'll get used to them with practice and when you memorize the draw string commands.

          The screen coordinates for different screen modes can be fairly difficult to work with, and they do tend to be weird numbers. To make your code simpler to write, you can define a logical plane over the physical plane. An example of a physical plane is the 640x480 resolution established by the SCREEN 12 screen mode. You can define a logical, or alternate user-defined plane with the WINDOW statement.

          SCREEN 12
          WINDOW (0,0)-(100,100)
          CIRCLE (50,50),10,4
          LINE (0,0)-(50,50),2
          END

          This trivial example defines a logical plane which is 100x100. 50,50 is now the center of the screen, so this draws a red circle from the center with a radius of 10. The line statement draws a green line from the lower left corner to the center of the screen. Notice that defining a logical plane sets the origin (0,0) to the bottom left of the screen, instead of the default upper left. If you want the origin to be in the upper left with a logical plane, add the SCREEN keyword after WINDOW. So to define the graphics mode 12 screen resolution, the code is:

          SCREEN 12
          WINDOW SCREEN (0,0)-(640,480)

          Use whatever is more comfortable, but I would recommend using WINDOW SCREEN because there is less confusion when converting from logical to physical planes.
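
          The conversion from a logical plane to physical pixels is a simple proportion. The sketch below is in Python (not QBasic) and only illustrates the idea: to_physical and its parameters are made up for this example, it assumes the logical plane is stretched over pixels 0 to 639 and 0 to 479, and QBasic's own rounding may differ slightly. Inside QBasic, the PMAP function does this conversion for you.

```python
def to_physical(x, y, window, size=(640, 480), screen_keyword=False):
    """Map a logical point to pixel coordinates.

    window is ((x1, y1), (x2, y2)) as given to WINDOW. With
    screen_keyword=False the y axis points up (plain WINDOW); with True
    it points down (WINDOW SCREEN).
    """
    (x1, y1), (x2, y2) = window
    width, height = size
    px = (x - x1) / (x2 - x1) * (width - 1)
    if screen_keyword:
        py = (y - y1) / (y2 - y1) * (height - 1)
    else:
        py = (y2 - y) / (y2 - y1) * (height - 1)
    return round(px), round(py)

plane = ((0, 0), (100, 100))
print(to_physical(0, 0, plane))                        # plain WINDOW
print(to_physical(0, 0, plane, screen_keyword=True))   # WINDOW SCREEN
print(to_physical(50, 50, plane))                      # center of the plane
```

          The first two lines show the origin landing at the bottom left with plain WINDOW and at the top left with WINDOW SCREEN.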

          Finally, a little information on creating DRAW effects with the other QBasic graphics routines. Hope you know some trigonometry for this part. Recall that on the unit circle, which has a radius of 1, the coordinates of the point at a given angle are ( COS(angle), SIN(angle) ). Furthermore, if we are given a point on the circle, we can find the angle by drawing a vertical line from the point to the x axis. If we take the arctangent of the length of this line divided by its horizontal distance from the origin, we get the angle. So the angle is ATN(Y/X), as long as X is greater than 0. With this knowledge, it is possible to create a spinning line using only the LINE statement. If we create a loop which increments the angle from 0 to 360, we can take the COS,SIN of the angle to get the point we should draw to. There is only one more problem: the QBasic functions COS and SIN think in radians, so we must first convert the angle to radians by multiplying it by PI / 180. That is quite easily done. Here is the code:

          SCREEN 12
          CONST PI = 3.1415
          WINDOW SCREEN (-1,1)-(1,-1)
          DO
          LINE (0,0)-(COS(a% * PI / 180),SIN(a% * PI / 180)), 0
          a% = a% + 1
          IF a% >= 360 THEN a% = 1
          LINE (0,0)-(COS(a% * PI / 180),SIN(a% * PI / 180)), 14
          LOOP WHILE INKEY$ = ""
          END

          We start by initializing the graphics mode, then defining PI as a constant, a named value which never changes during program execution. Then we define the logical plane and start the loop. Each time through, we erase the line at the current angle by drawing it in black, increase the angle, and draw the line again in yellow. The line starts from the center of the screen and goes to the coordinate specified by the COS,SIN of the angle. We loop until the user presses a key.
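
          The end points used by the spinning line, and the way back from a point to its angle, can be tried out in a few lines of Python (not QBasic). This is only a sketch with made-up function names. Note that it uses Python's atan2 rather than a plain arctangent of y / x, because atan2 also works when x is zero or negative.

```python
import math

def endpoint(angle_degrees):
    """Point on the unit circle for an angle given in degrees."""
    radians = angle_degrees * math.pi / 180
    return math.cos(radians), math.sin(radians)

def angle_of(x, y):
    """Recover the angle in degrees (0 to 360) from a point.

    atan2 is used instead of a bare arctangent of y / x so that all
    four quadrants and x = 0 are handled.
    """
    return math.degrees(math.atan2(y, x)) % 360

for a in (0, 45, 90, 200):
    x, y = endpoint(a)
    print(a, round(x, 3), round(y, 3), round(angle_of(x, y), 3))
```

          In QBasic itself only ATN is available, so angles outside the right-hand half of the circle need an extra check on the signs of X and Y.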

          There is one more type of graphics that is a strong point of QBasic: text. Graphical effects can be made quite easily using only text in QBasic. There are a few functions that are quite useful when dealing with text. The first is the CHR$ function. If you pass a number to the CHR$ function, it will return the character which has that ASCII (American Standard Code for Information Interchange) code. To find a listing of the ASCII character codes, look in the help contents. For example, to print a smiley face on the screen, the code is this:

          CLS
          PRINT CHR$(1)
          END

          Since the character code for a smiley face in the PC character set is 1, you can use the CHR$ function to get it. Another useful function is ASC, which returns the ASCII value of the first character of the text you pass to it. So ASC("A") will return 65 because the ASCII value of A is 65. Every printable character (and then some) has an ASCII value, so these two functions make it quite easy to convert between characters and their codes.
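
          The same character and code conversions exist in most languages. As a rough comparison, here is a sketch in Python (not QBasic), where ord and chr play the parts of ASC and CHR$. The use of code page 437 below is an assumption about the character set of a typical DOS machine.

```python
import unicodedata

# ord() and chr() are Python's counterparts of ASC and CHR$.
print(ord("A"))      # code of the letter A
print(chr(65))       # character with code 65

# Codes above 127 depend on the character set. On a DOS machine that is
# usually code page 437, where 219 is the solid block character.
block = bytes([219]).decode("cp437")
print(unicodedata.name(block))
```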

          Finally, the LOCATE statement is extremely useful for any text based program. LOCATE sets the text CP to the coordinates you specify. The first argument is the row, and the second is the column. So

          CLS
          LOCATE 5,10
          PRINT CHR$(219)
          END

          will print a solid block at row 5, column 10. And that's it for graphics! You now know nearly every graphics routine in QBasic, and have the knowledge to make a game or highly graphical program. Graphics depend on how you arrange them, so it requires artistic skill to some degree. If you get creative with these graphics commands, you can create nearly any effect you need.

          DESIGNING APPLICATIONS

          It is not practical in real world terms to set up an application in one long list of code. Many early programming languages were purely linear, meaning that they started from one point on a list of code, and ended at another point. All of the code I have written in this tutorial so far has been purely linear. However, linear programming is not practical in a team environment. If one person could write one aspect of code, and another write another part of the program, things would be much more organized. QBasic contains the capability to meet these needs, called modular programming. You can break a program into different "modules" which are separate from the main program and yet can be accessed by any part of it. I highly recommend the use of separate modules in programming applications, although it is not a simple task to learn.

          These separate modules are also known as procedures in the QBasic environment. There are two types of procedures: subs and functions. Subs merely execute a task and return to the main program, while functions execute a task and return a value to the main program. An example of a sub might be a procedure which displays a title screen, while a function may be a procedure that returns an angle in degrees given a number in radians. Functions are also used in Calculus, so you Calculus people should already be familiar with functions.

          Procedures can accept arguments in what is called an argument list. Each argument in the argument list has a defined type, and an object of that type must be passed to the procedure when it is called. For example, the CHR$ QBasic function accepts a numeric argument. The function itself converts this numeric argument into a string representation of the ASCII value of the number passed, and returns this one character string.

          Procedures in QBasic are given their own screen. When you enter the QBasic IDE, you are in the main procedure which can access all the others. Other procedures are created by typing the type of procedure (SUB or FUNCTION), the procedure name, followed by the complete argument list. You can view your procedures through the VIEW menu. Here is an example of a sub procedure which performs some operations for a program that will be using graphics, random numbers, and a logical plane.

          SUB initProgram()
          RANDOMIZE TIMER
          SCREEN 12
          WINDOW (0,0)-(100,100)
          COLOR 15
          END SUB

          The only thing you need to type is SUB initProgram (), and the screen will be switched to that procedure. The END SUB is placed there for you, so the only thing you need to type then is the code within the sub. Try typing this out on your own to see how this works. This procedure is called by simply typing initProgram in the main procedure. An alternative method is CALL initProgram (). Right here the parentheses are optional, but if you were to pass arguments to the procedure, parentheses would be required with the CALL statement. Now let's try passing arguments to a procedure. We will pass two arguments to a procedure called center: a string containing the text to be centered, and the row of the screen on which you wish to center it.

          SUB center( text$, hLoc% )
          LOCATE hLoc%, 41 - (LEN(text$) / 2)
          PRINT text$
          END SUB

          The first line after the sub declaration positions the text cursor. LOCATE takes the row first, which is the value we passed as the second argument (hLoc%), and then the column. The column is calculated by subtracting half the LENgth of the text we passed as the first argument from 41, which is one more than half the screen's width of 80 characters. We would call center from the main procedure like this:

          center "Programmed by qp7@pobox.com", 12

          Or like this:

          CALL center ("Programmed by qp7@pobox.com", 12)

          It's quite simple actually. Functions are slightly different and involve an additional part which subs do not: a return value. The return value is specified by assigning the value you want to return to the function name from within the function definition. When calling the function from within the main procedure, the name of the function is treated as a value which is evaluated when that line runs. Here is an example of a function definition:

          FUNCTION convert.To.Radians (degree!)
          LET PI = 3.1415
          convert.To.Radians = degree! * PI / 180 ' assigning to the function name sets the return value
          END FUNCTION

          The function is called in this program:

          CLS
          INPUT "Enter a value in degrees: ", degreeValue!
          radianValue! = convert.To.Radians(degreeValue!)
          PRINT "The radian equivalent is"; radianValue!; "radians"
          END

          We treat the value returned from the function as a value we can immediately assign to a variable. The variable radianValue! is given the value returned from convert.To.Radians. These concepts are supported in nearly all programming languages, so this information will be beneficial to you in the future.

          There is one final concept which has proven to be very successful in programming: a message loop. With QBasic, you can construct a loop which runs for the length of the program, receives input from the user, and carries out an action based on what the user does. We will construct a basic application which receives input from the user in the form of an arrow key, and moves a character on the screen in the direction the user pressed. The arrow keys are different from normal keys received with INKEY$. On the enhanced 101-key keyboards which have arrow keys, INKEY$ returns a two-character string for these keys: a null character, CHR$(0), followed by a character holding the keyboard scan code of the key pressed. Since the arrow keys do not have an ASCII text representation, we must use the keyboard scan codes for them. The keyboard scan codes can be viewed in the HELP | CONTENTS section of the QBasic menus. For this program, we will have two procedures in addition to the main procedure. The first will initialize the program settings and position the character in his starting position. The other will move the guy in the direction which we pass to the sub. The main procedure will call the sub procedures and contains the main message loop which retrieves input from the user. First of all, here is the code for the main procedure:

          CONST UP = 1
          CONST DOWN = 2
          CONST LEFT = 3
          CONST RIGHT = 4

          TYPE objectType
          x AS INTEGER
          y AS INTEGER
          END TYPE
          DIM object AS objectType
          initScreen
          object.x = 41
          object.y = 24

          DO
          SELECT CASE INKEY$
          CASE CHR$(0) + CHR$(72): move UP, object
          CASE CHR$(0) + CHR$(80): move DOWN, object
          CASE CHR$(0) + CHR$(75): move LEFT, object
          CASE CHR$(0) + CHR$(77): move RIGHT, object
          CASE CHR$(32): EXIT DO
          END SELECT
          LOOP
          LOCATE 1,1: PRINT "Thank you for playing"
          END

          This code is fairly self explanatory with the exception of the SELECT CASE... END SELECT structure which I have not yet explained. This type of conditional structure evaluates an expression once and then compares the result against several cases. In this case, we are testing IF INKEY$ = CHR$(0) + CHR$(72), IF INKEY$ = CHR$(0) + CHR$(80), and so on. This is just a more legible format than IF...THEN...ELSE. Note that in the QuickBasic compiler, a CASE ELSE statement is required in the structure, for what reason I have no idea. The above code is the driver for the rest of the program. First some CONSTants are declared which remain constant for the duration of the program and in any procedure of the module. A user defined type is declared to store the coordinates of the character. Then an endless loop is executed, calling the appropriate procedure for the arrow key pressed until the user presses the space bar (CHR$(32)). Here is the code for the initScreen procedure:

          SUB initScreen ()
          SCREEN 12
          COLOR 9
          WIDTH 80,60 ' SCREEN 12 supports 30 or 60 text rows
          LOCATE 24,41
          PRINT CHR$(1)
          END SUB

          The WIDTH 80,60 statement sets the screen text resolution to 80 columns and 60 rows, which is one of the two text sizes (30 or 60 rows) that SCREEN 12 supports. We then print a smiley face near the middle of the screen in a nice bright blue color. Next we need to write the move procedure, and then we will be done with the program.

          SUB move (way AS INTEGER, object AS objectType)
          LOCATE object.y, object.x
          PRINT CHR$(0) ' erase previous image
          SELECT CASE way
          CASE UP: IF object.y > 1 THEN object.y = object.y - 1
          CASE DOWN: IF object.y < 49 THEN object.y = object.y + 1
          CASE LEFT: IF object.x > 1 THEN object.x = object.x - 1
          CASE RIGHT: IF object.x < 79 THEN object.x = object.x + 1
          END SELECT
          LOCATE object.y, object.x
          PRINT CHR$(1) ' draw current image
          END SUB

          And that's the whole program... confusing as it may be! Ideas should be going through your head about what you could do with this information. Entire games can be created with this simple construct.
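
          If the moving logic is hard to follow, it may help to see it without any screen output. The sketch below is in Python (not QBasic) and mirrors the bounds checks of the move procedure. The dictionary SCAN_CODES and the parameters max_x and max_y are made up for this illustration; the scan codes and limits themselves are the ones used in the program above.

```python
UP, DOWN, LEFT, RIGHT = 1, 2, 3, 4

def move(way, x, y, max_x=79, max_y=49):
    """Return the new (x, y) after one step, staying inside the limits
    used by the move procedure above (columns 1-79, rows 1-49)."""
    if way == UP and y > 1:
        y -= 1
    elif way == DOWN and y < max_y:
        y += 1
    elif way == LEFT and x > 1:
        x -= 1
    elif way == RIGHT and x < max_x:
        x += 1
    return x, y

# Arrow keys arrive from INKEY$ as CHR$(0) followed by a scan code.
SCAN_CODES = {72: UP, 80: DOWN, 75: LEFT, 77: RIGHT}

x, y = 41, 24
for scan in (72, 72, 75, 80):      # up, up, left, down
    x, y = move(SCAN_CODES[scan], x, y)
print(x, y)
```

          The character ends up one column to the left and one row above where it started, and a key press at an edge simply leaves it where it is.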

          There are more things to consider, but they are beyond the scope of this tutorial. If you were to design an application in QBasic, you would only need the information from this section and one heck of an imagination. Programming takes knowledge of the language and a creative mind... programs are made by programmers with both. If you can develop a creative mind, then you can develop any program conceivable.

          Fundamentals of Computers

          An Introduction to the computer

          A computer is an electronic machine that accepts information, processes it according to specific instructions, and provides the results as new information.
          To begin with, you must understand the impact of computers in the world today. Computers are affecting our lives in one way or another. Airlines, telephone and electricity bills, banking, medical diagnoses, and weather forecasts: the list of services using computers is almost endless.
          You must have noticed that some uses of computers, or applications of computers, have made life much easier for you. Your air ticket is now issued in a matter of minutes and your credit card gets processed very fast. However, there is something about the computer that might make you feel a little uneasy. Perhaps it is a feeling that the computer is far more intelligent and informed than you, and too complex to operate. You may be surprised to know that the computer can't perform any task without you. The computer needs to be instructed on exactly what it has to do.
          The computer can store and manipulate large quantities of data at very high speed, but it can't think. It can make simple decisions and comparisons. For example, a computer can determine which of two numbers is larger or which of two names comes first alphabetically, and then act upon that decision. Although the computer can help to solve a wide variety of problems, it is merely a machine and can't solve a problem on its own. It must be provided with instructions in the form of a computer program.
          A program is a list of instructions written in a special language that the computer understands. It tells the computer which operations to perform and in what sequence to perform them.
          Put simply, a computer is an electronic device that can accept input, store data, process it, and produce output.


          A computer basically has three parts:

          Keyboard
          System Unit (CPU, Central Processing Unit)
          Monitor (VDU, Visual Display Unit)
          A keyboard is used to enter data, which is nothing but characters and numbers, into the computer. Its keys can be compared to those of a typewriter. The system unit receives this data from the keyboard and holds on to it until it is needed.
          A monitor is like a TV screen where whatever is retrieved from the computer can be displayed.

          History of the Computer
          Computers were developed as a result of man's search for a faster way to calculate. A computer is very fast because data and instructions are represented as pulses within electronic circuits, and they travel at close to the speed of light. The inventions and ideas of many mathematicians and scientists led to the development of the computer.
          The first mechanical calculating machines were invented during the 1600's. It was in 1642 that Blaise Pascal developed the first mechanical calculating machine. The calculating machine could be used for addition and subtraction.
          In 1671, Gottfried Leibniz, a German philosopher and mathematician, constructed a calculator that was an improvement on Pascal's invention. This machine could add, subtract, multiply, divide and extract roots.
          In the 1830's, an Englishman, Charles Babbage, devised the Analytical Engine, a kind of calculating machine. The Analytical Engine used axles and gears to compute values and store them, and was designed to be a general purpose device that could be used to perform any mathematical operation automatically. The design included memory, a central processing unit, input/output, and the use of a programming language. All these are key elements in today's computers. That's why Babbage is often referred to as the father of the modern day computer.
          Lady Augusta Ada Lovelace was a mathematician and a long time supporter of Babbage. Lovelace thought of ways to program the machine so that it would repeat the same set of instructions, and carry out instructions only if certain conditions exist. These techniques are still in use today to make computer programs more efficient. Because of her work, many consider Lovelace to be the first programmer.

          Computer Generations

          First Generation (1943-1958)

          The storage medium or memory used in first generation computers was the vacuum tube. ENIAC (Electronic Numerical Integrator and Computer) was the first general-purpose electronic computer. It used about 18,000 vacuum tubes and could do 300 multiplications per second.

          Second Generation (1958-1965)

          Computers using transistors as storage media were classified as second generation computers. One transistor could do the task of 1000 vacuum tubes. Second generation computers were smaller than first generation computers. They were also much faster and more reliable, and had greater computing capacity.

          Third Generation (1965-1973)

          Third generation computers were general-purpose computers. In 1964, International Business Machines (IBM) Corporation announced its System/360 family of mainframe computers. They were much faster because they used small chips containing thousands of parts integrated in them. Floppy disks, hard disks, tapes, and cards were used in this generation of computers. Large scale integration (LSI) put about 20,000 transistors on a chip.

          Fourth Generation (1973-Now)

          While third generation computers saw the use of integrated circuits in building computers, the fourth generation is characterized by the increased number of circuits, which allows more data to be stored on a memory chip. Large Scale Integration (LSI) and Very Large Scale Integration (VLSI) allow memory chips to have thousands of storage locations. Fourth generation computers have microprocessors, which have serial numbers. The serial number indicates the capability and speed of the computer.

          Fifth Generation

          There are three factors that are said to characterize the fifth generation of computers: mega-chip memories, advanced parallel processing, and artificial intelligence. A mega-chip can have more than one million storage locations. With parallel data processing, thousands of instructions can be processed simultaneously. Fifth generation computers are expected to have artificial intelligence.

          The Personal Computer
          The most popular form of the computer in use today is probably the PC, or Personal Computer. The PC can be used for various applications. In fact, there are millions of PCs already used by individuals and organizations. The PC is small in size but capable enough to handle large tasks. It can perform a diverse range of functions, from keeping track of household accounts to keeping records of the stores of a large manufacturing company.

          Other Computer Systems
          Although the PC is the most popular computer system, there are other computer systems too, which are categorized on the basis of size, cost, and performance.
          Before we describe some of these computer systems, it is essential to understand the term system. A system is a group of integrated parts that have a common purpose of achieving an objective. The parts or components of the PC system will be discussed in detail later in the course. A popular system is the mini computer, which is a small, general-purpose computer. It can vary in size from a small desktop model to the size of a small filing cabinet. A typical mini system is more expensive than a PC and surpasses it in storage capacity and speed. While most PCs are oriented towards single users, mini systems are usually designed to simultaneously handle the needs of multiple users, i.e. more than one person can use a mini computer at the same time.
          A mainframe is another form of computer system that is generally more powerful than a typical mini system. Mainframes themselves may vary widely in cost and capability. They are used in large organizations for large-scale jobs.
          However, there is an overlap between the expensive minis and the small mainframe models in terms of cost and capability. Similarly, there is an overlap between the more powerful PC systems and the mini computer.
          At the top end of the size and capability scales are the super computers. These systems are the largest, fastest and most expensive computers in the world. They are used for complex scientific and defense applications.

          Benefits and Limitations of Computers

          The fact that computers have made an impact on almost all aspects of life in today's world can hardly be questioned. The question you may ask here is how you benefit from using a computer.

          A computer provides three basic benefits:
          Speed
          Accuracy (Accurate work)
          Diligence (careful, steady effort)

          Computers work at very high speeds and are much faster than humans. The human equivalent of an average computer would be one million mathematicians working 24 hours a day. A computer rarely makes mistakes. In fact, most computer errors are caused by human failings. Unlike humans, computers simply do not get bored or tired. The monotony of repetitive work does not affect a computer. Computers do have limitations, however. If an unanticipated situation arises, a computer will either produce erroneous results or abandon the task altogether. It does not have the potential to work out an alternative solution.

          Components of a PC System: the Hardware
          Now that you know the benefits and limitations of computers, let us move on to identifying each component of the personal computer with which you will be working during the course. The unit that resembles a TV screen is called the Monitor, or more commonly the VDU, short for Visual Display Unit.
          The component that closely resembles a typewriter is called the Keyboard. The box-like structure that houses the components necessary to run the system is called the System Unit.
          The Printer, as the name suggests, is used to print the results of any operation.
          The tiny device connected to the system unit through a cord resembling a tail is called a Mouse. Moving the Mouse produces a corresponding movement of an arrowhead on the VDU.
          These devices are collectively known as the Hardware. Thus hardware comprises all the physical components of a PC system.

          How does a computer work?
          Most activities follow the basic principle of Input-Process-Output. Consider an automobile assembly line. The raw materials are forwarded to the assembling unit. This activity constitutes the Input part of the cycle. The Process is the actual building of the automobile. The process acts upon what has been input and produces the Output, which in this case is the finished automobile.
          Like all other activities, a computer system follows the Input-Process-Output (I-P-O) cycle. This can best be illustrated by the example of an airline reservation. A person who wishes to travel by air between Singapore and New York first has to fill in a request slip. This contains the relevant data, i.e. details of name, age, place of destination, etc. The booking clerk keys in the data from the request slip into the computer. The process in this case includes examining the flight number, the date of the journey and the class requested, and determining whether seats are available as per the reservation details. The output of this process is information confirming the booking if seats are available; otherwise, the computer may issue a message turning down the request.
          Let us see how each component of the PC system is related to the I-P-O cycle. The data in the request slip is the input, and it is entered into the computer via the keyboard. Hence, the keyboard is an Input Device. The processing is performed by a component of the system unit called the Microprocessor. The information regarding availability of seats is displayed on the VDU. Hence, the VDU is an Output Device. As mentioned earlier, the term hardware comprises the Input and Output Devices along with the system unit.
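
          The I-P-O cycle of the airline reservation example can be sketched as a small Python program. This is only an illustration: the flight number XY100, the seat counts and the function name process_request are assumptions made up for the sketch, not part of any real reservation system.

```python
# Hypothetical seat inventory: (flight, class) -> seats left.
# The flight number and counts are invented for this illustration.
SEATS_AVAILABLE = {("XY100", "economy"): 2, ("XY100", "business"): 0}


def process_request(request):
    """Process step: check seat availability and build the output message."""
    key = (request["flight"], request["travel_class"])
    if SEATS_AVAILABLE.get(key, 0) > 0:
        SEATS_AVAILABLE[key] -= 1  # one seat is now taken
        return f"Booking confirmed for {request['name']} on flight {request['flight']}"
    return f"Sorry {request['name']}, no seats available on flight {request['flight']}"


# Input: the data the booking clerk keys in from the request slip.
request = {"name": "A. Traveller", "flight": "XY100", "travel_class": "economy"}

# Process, then Output: the message that would appear on the VDU.
print(process_request(request))
```

          Here the dictionary plays the part of the request slip (input), the function is the process, and the printed message is the output.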

          A closer Look at the Hardware


          Keyboard:
          The keyboard has already been identified as an input device. This is a component that closely resembles a typewriter console.
          While working on the PC using a keyboard, you will notice a flashing point on the VDU. This is the cursor. When you press a key, a character is displayed at the point where the cursor is flashing and the cursor moves one position forward. The keyboard provides different keys to perform various operations. (Refer to the table below.)

          Key | Function
          F1-F12 (function keys) | Used to perform special functions that depend on the software being used.
          Enter | Used to execute an instruction or enter data keyed in through the keyboard.
          Caps Lock | Used to switch the alphabet keys between capital and small letters.
          Shift | Caps Lock off: if pressed simultaneously with a character key, a capital letter is input. Caps Lock on: it reverses the above effect. Also used to input the upper symbol of keys with two symbols or characters on them.
          Ctrl, Alt | Pressed with other keys, they input special messages to the computer.
          Backspace | Used to erase the character to the left of the cursor position.
          Num Lock | Used to activate the numeric keys on the numeric keypad.
          Cursor movement keys | Used to move the cursor in the direction indicated.
          Home, End, Page Up, Page Down | Used to perform special functions, with which you will become familiar during the course.
          Insert | Used to insert characters at the current cursor position.
          Delete | Used to delete characters at the current cursor position.
          Esc | Depends on the application; usually used to cancel a command.
          Print Screen | Used to print whatever is displayed on the screen.

          Mouse:
          A mouse is a small device that is connected to the system unit by means of a long wire. This is another input device, whose movement causes the corresponding movement of a pointer on the screen. It usually has two or three buttons. Using it, the user can select options from the screen.

          Visual Display Unit (VDU)
          Let us take a closer look at an output device. Data that has been processed needs to be displayed to the user. This is done using the monitor or the VDU. The VDU is similar to a TV screen and can display both text and graphic images. These displays can be either in black & white or color.

          Printer:
          The output on the VDU cannot be stored for later reference. For a permanent output, you would require a printer, which is also a common output device. Using the printer, you can obtain the output on paper. Printers are capable of printing at very high speeds. The printers commonly used with the PC are the dot-matrix printer, the ink-jet printer and the laser printer.

          System Unit:
          When data is input to a computer, it is processed and an output is produced at the output device. Processing takes place in the system unit. The component of the system unit that is involved in the actual processing is the microprocessor; another component of the system unit is the internal storage.

          Internal Storage
          Besides the microprocessor, the system unit also contains a storage area where the data is stored before actually being processed. This storage area is called the internal storage. It is also referred to as primary storage, main memory or Random Access Memory (RAM). Internal storage capacities may differ between PCs. Typically, a PC will have an internal storage capacity of 640,000 characters or more.
          In computer technology, the storage capacity of a PC is measured in bytes, where one byte can store one character. Character here refers to any letter, number or other symbol. Therefore, to store the word Computer, 8 bytes would be required. Just as there is a basic unit, the gram, and another unit, the kilogram, to measure weight, there are the byte and the kilobyte (KB) to measure storage capacity. One KB is approximately equal to 1000 bytes. Therefore, 1 KB can store approximately 1000 characters. Another common unit of storage capacity is the megabyte (MB), which is equal to approximately 1000 KB. Higher storage capacities are measured in gigabytes (GB). One GB is approximately equal to 1000 MB.
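
          The arithmetic above can be checked with a few lines of Python. This is a rough sketch that assumes plain ASCII text (one byte per character) and the approximate factor of 1000 used in this section; the function names are made up for the illustration.

```python
# Approximate decimal units as used in the text (1 KB is roughly 1000 bytes).
UNITS = ["bytes", "KB", "MB", "GB"]


def bytes_for_text(text):
    """Bytes needed to store the text, assuming one byte per ASCII character."""
    return len(text.encode("ascii"))


def to_readable(num_bytes, step=1000):
    """Express a byte count in the largest unit that keeps the value below `step`."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < step or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= step


print(bytes_for_text("Computer"))  # 8
print(to_readable(640_000))        # 640 KB
```

          So the word Computer needs 8 bytes, and an internal storage of 640,000 characters is about 640 KB.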

          External Storage
          Since the internal storage capacity of a PC is limited, it places a restriction on how much data can be stored at a time for processing. However, this is not the only drawback. Once the PC is switched off, or in case of a power failure, all the data stored in the internal storage is lost. This means that every time you want to work on the PC, you would have to input the data required for processing again. External storage, which retains data when the power is off, overcomes these drawbacks. External storage is also referred to as secondary storage.

          There are two kinds of external storage media used with a PC:
          Floppy Disk
          Hard Disk

          Another medium for external storage is the cartridge tape. It is particularly suitable for storing large volumes of data. CD-ROM and Magneto-Optical (MO) disks have now also become important media for storing large volumes of data.

          Floppy Disk | Hard Disk
          Also referred to as diskettes or floppies. | Also referred to as the fixed disk.
          Removable. Suitable for carrying data from one computer to another. | Non-removable. It is attached within the system unit.
          Made of flexible thin material. Less resistant to damage by heat, dust and accidental twists. | Less prone to damage since it is within the system unit and is packed airtight.
          The diskettes currently in use have a diameter of 3.5 inches with a storage capacity of 720 KB or 1.44 MB. Diskettes with a diameter of 5.25 inches and a typical storage capacity of 360 KB are also used, but are slowly being phased out. | Can store data in the range of 20 MB to 40 GB.

          Comparison between a Diskette and a Hard Disk.

          Disk Drive
          The disk drive is contained within the system unit. The drive for diskettes is called the diskette drive, while the drive for a hard disk is called the hard disk drive.

          It is important to differentiate between storage media and storage devices. The diskette and the hard disk, on which data is stored, are the storage media; the disk drives are the devices used for reading and writing. Going back to the analogy of the cassette tape recorder mentioned earlier, the cassette tape would be called the storage medium while the cassette player is the storage device.

          Software

          A PC is incapable of performing any task with the hardware alone. It requires instructions that determine whether it will function as desired or not. Like data, instructions are also entered via the keyboard. In computer terminology, a set of instructions is called a program and one or more programs are called software.

          Software used on computers may be of different types. Some important classes of software are:
          Application Software
          Compiler Software
          Operating System Software

          Application Software

          We spoke of computer applications and identified certain areas where computers are being used today. Software specially suited to specific applications is now available in the market, for example, software for billing systems, accounting software, or software that enables the storage of documents. Such software is called application software since it is designed for a specific application.

          Application software that takes care of a variety of business and corporate needs can now be bought off the shelf. These are called standard software packages. They are reasonably priced and run on any standard PC.

          Two popular standard software packages are Financial Accounting and Inventory Control packages.

          Though these packages are for very specific applications, there is also general application software such as database management systems (DBMS), spreadsheets and word processors.

          A spreadsheet package allows a user to enter numeric data, specify formulas and perform calculations. Graphs can also be generated from the given data. A word processing package converts a PC into a sophisticated typewriting machine. It has facilities to perform spell checks, suggest synonyms, and allow changes or corrections in the document without having to retype the entire document.

          Compiler Software

          Consider a group of people, Rita, Bunu, Buddha Laxmi, Pramod, Amar and Anita, each of whom understands and speaks only one of English, Hindi, Nepali or Spanish. To be able to understand each other, they would require some person or persons who could translate whatever is spoken into a language that each can understand.

          Similarly, since programs are written in many different computer languages, the hardware also needs a translator to convert the computer language into a form that it can understand.

          The computer only understands a language of electrical signals, called machine language. Software called the compiler converts the computer language into machine language. For example, there is a C compiler that converts a program written in the C language into machine language.

          Operating System Software

          Besides application software and compiler software, there is a third kind of software, called the operating system, which is very important for the working of the PC.

          When a user wants to store any data or program, it is stored at a location that is known only to the operating system. Therefore, the operating system performs the task of storage management. The operating system also performs device management. For instance, when a user wants to print information on the printer or display information on the VDU, he or she does not have to bother about the actual transportation of the information from the internal storage to the VDU or to the printer. The operating system takes care of it.

          All application software packages are written in a computer language. There are various computer languages, such as C, C++, Fortran and Pascal. Each language is best suited to a particular kind of application. C and C++ are used to develop highly complex software. Fortran is used for scientific applications. There are some packages, like FoxPro, MS Access and Sybase, that are more English-like. Thus, even non-computer professionals, like executives and managers who have never studied computer science, can learn to use these languages. Each language has its own vocabulary, so each language differs from the others.

          Memory is the name for the parts of a computer that are used to store data. Programs and data are stored in the memory unit of the CPU during processing. Memory consists of storage locations. Memory takes two forms, primary and secondary. Primary storage, or main memory, is of two types:
          Read only memory (ROM)
          Random access memory (RAM)
          There are certain essential functions that the computer must perform when it is switched on.

          ROM: It is permanently built into the computer at the time of its production. ROM is also called firmware. It stores a set of instructions that tell the computer how to work. The user cannot change these instructions. ROM is non-volatile, that is, when the computer is switched off these instructions are not lost.
          RAM: It is short-term or volatile memory. That is, when the computer is switched ON, the memory that is available for use is RAM, but when the computer is switched OFF, all the information in it disappears. RAM is temporary whereas ROM is permanent memory.

          So if we have some important piece of work that we would like to keep, we use secondary storage.

          Secondary storage allows storing large volumes of data. The contents of secondary storage are not lost when the power is turned OFF.

          Secondary storage devices are:

          Hard Disk
          Floppy Disk

          Hard Disk:
          A hard disk is a large-capacity, permanent storage area that offers access to the information stored on it. It resembles a long-playing record and is made of smooth metal or Mylar plastic, coated with magnetic material. Data is stored and retrieved in blocks. The tracks (the concentric circles on which data is recorded) are therefore divided into block-sized regions called sectors. Typically, 512 bytes are stored per sector. Control data is recorded on the disk to indicate the start and end points of each sector. A hard disk is always fixed in the computer and cannot be removed. A huge amount of data can be stored on this type of storage medium.
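
          Because data is stored in whole sectors, the number of sectors a piece of data occupies is its size divided by the sector size, rounded up. The following Python sketch shows this, assuming the typical 512-byte sector mentioned above; the function names are invented for the illustration, and a real disk also stores the control data, which is not modelled here.

```python
SECTOR_SIZE = 512  # bytes per sector, the typical figure given in the text


def sectors_needed(data_size_bytes, sector_size=SECTOR_SIZE):
    """Number of whole sectors needed to hold the data (rounded up)."""
    return -(-data_size_bytes // sector_size)


def split_into_sectors(data, sector_size=SECTOR_SIZE):
    """Cut a bytes object into sector-sized blocks; the last block may be partly filled."""
    return [data[i:i + sector_size] for i in range(0, len(data), sector_size)]


blocks = split_into_sectors(b"x" * 1300)
print(len(blocks), [len(block) for block in blocks])  # 3 [512, 512, 276]
```

          A 1300-byte file therefore takes up three sectors, even though the third one is only partly used.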

          Floppy Disk:
          Floppy disks are magnetically coated disks on which information, both programs and data typed from the keyboard, can be stored and retrieved. Floppy disks are put into and taken out of a disk drive. They are made of Mylar plastic coated with magnetic oxide and are 5.25 or 3.5 inches in diameter.

          CD-ROM (Compact Disk Read Only Memory)
          It is a non-erasable disk used for storing computer data.

          Internal Computer System
          A computer is a group of integrated parts whose common purpose is to perform the operations specified by the program being executed, which qualifies it to be called a system.

          A computer system is basically made of an Input Unit, by which data can be given to the computer, a Central Processing Unit, which does all arithmetic and logic operations, and an Output Unit, which displays the results of this processing.

          The data and instructions are held in the main memory of the computer, where the Central Processing Unit (CPU) processes them.

          The processing results are displayed on the screen (Monitor) or printed by a printer.

          The output devices are the Monitor and the Printer.
          This set of computer parts is called the computer system.
          Thus, in a computer system, perfect communication exists.
          The main machine contains the primary storage unit (also called main memory), which holds data, instructions and the results of processing.
          The Arithmetic Logic Unit (ALU) is used to perform calculations and make comparisons (decisions).
          The Control Unit (CU) co-ordinates the flow of data among the various components.

          The ALU and CU are together called the Central Processing Unit (CPU). The CPU is the brain and heart combined, the central nervous system of the computer.

          Input device: Input devices are of many types, but basically they are meant for presenting information to the computer in a form the machine can understand.
          Output device: The results of any computer processing have to be communicated to the user. Output devices translate the computer output into a form understandable to the human beings.

          Control Unit

          The control unit directs all operations inside the computer. It is known as nerve center of the computer because it controls and co-ordinates all hardware operations.

          Functions: It transfers data from the input device to the memory and on to the ALU, and also transfers the results from the ALU to the memory and on to the output device for printing.

          Arithmetic Logic Unit: The ALU operates on the data available in the main memory and sends the results back to the main memory after processing.

          Functions: It carries out arithmetic operations like addition, subtraction, multiplication and division.
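
          What the ALU does can be imitated in Python with a small table of operations. This is only a toy model under stated assumptions: the operation names ADD, SUB, MUL, DIV and CMP are invented for the sketch, and a real ALU works on binary values in hardware rather than on Python numbers.

```python
import operator

# Hypothetical operation codes mapped to the arithmetic they stand for.
OPERATIONS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
    # Comparison: returns -1, 0 or 1 for less than, equal, greater than.
    "CMP": lambda a, b: (a > b) - (a < b),
}


def alu(op, a, b):
    """Apply the named arithmetic or comparison operation to two operands."""
    return OPERATIONS[op](a, b)


print(alu("ADD", 6, 3))  # 9
print(alu("CMP", 2, 5))  # -1
```

          The comparison result is what allows a program to make a decision, for example whether seats are still available.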

          [Figure: block diagram of a computer system showing the Input Device, Control Unit, Arithmetic Logic Unit, Output Device, Main Memory (RAM & ROM, primary storage), and the Hard Disk and Floppy Disk (secondary storage).]

          Central Processing Unit

          It is the brain of the computer, which does the actual work of processing. Every input given to the computer needs to be processed before the actual output is produced. The CPU is an important part, also called the heart of the computer, and has many sub-components that are essential and perform discrete functions. Some of the main components in the CPU are discussed below.

          1] SWITCH MODE POWER SUPPLY
          All computer components require a power supply, which the system draws from the AC mains. The internal power supply converts the 230-volt AC input into DC outputs of 5 and 12 volts. The internal power supply is normally known as the SMPS. It provides the cable connections through which power is supplied at the required voltage to the various other units, such as the drives, the motherboard, the keyboard, etc.

          Whenever the CPU is switched ON or OFF, it is the SMPS that is switched ON or OFF, as the external switch is a part of the SMPS.

          2] EXHAUST FAN
          Whenever the computer is switched on, a fan that is part of the SMPS also starts functioning to give a cooling effect inside the SMPS unit.

          3] SPEAKER
          A small audio speaker is connected to the motherboard and is used to produce sound effects whenever required by various software programs.

          4] MOTHERBOARD
          One of the major components of the CPU is the motherboard, which is a large board containing a number of small chips and other electronic devices or components. The peripheral attachments are connected to the motherboard. The motherboard has important components like the Microprocessor, Clock, and Memory (RAM and ROM chips).

          Microprocessor: The entire PC design is based on this chip. It performs all the mathematical calculations, comparisons and logical operations, and also attends to requests from attached devices such as printers, requests to restart the computer, etc. The most widely used microprocessors in computers are from Intel. They are usually available as the 80286, 80386, 80486 and the Pentium chip.

          Clock: Time keeping is done by a clock, which provides signals by way of pulses to set the working pace. The speed of the clock is measured by the frequency of these pulses. The unit of measurement is MHz, or alternatively the number of instructions per second. A quartz crystal is used to give accurate timings.

          Memory: The internal or Main Memory is in the form of chips. Here is the actual place where the RAM and ROM are.
          Bus: The microprocessor works on the data stored in the RAM. Additionally, the instructions stored in ROM have to be brought into the RAM whenever required. Communication is therefore essential between the microprocessor, the memory chips and the other components found on the motherboard. This communication or interaction is accomplished through a set of wires running between the various components. These wires are called buses. There are various types of bus, depending upon the information they carry. The two main types are the Data Bus, a set of approximately 8 wires for carrying the data, and the Address Bus, a set of wires for carrying the address or location where the data is to be stored in the memory.

          Ports: Input-output devices like the printer, keyboard, mouse, etc. are connected to the CPU through sockets called ports. The microprocessor communicates with the external world through the ports. In other words, they are the inlets and outlets of the microprocessor. Ports are basically of two types, parallel and serial. A parallel port is capable of transmitting a full byte, i.e. 8 bits are transferred simultaneously, whereas in the case of a serial port the eight bits queue up and are sent through the cable one after another.
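
          The difference between the two kinds of port can be pictured with a short Python sketch. It is a simplified model, not a description of real port hardware: each inner list stands for one transfer step, and the function names are made up for the illustration.

```python
def to_bits(byte_value):
    """Return the 8 bits of a byte, most significant bit first."""
    return [(byte_value >> position) & 1 for position in range(7, -1, -1)]


def send_parallel(byte_value):
    """Parallel port: one transfer step carries all 8 bits side by side."""
    return [to_bits(byte_value)]


def send_serial(byte_value):
    """Serial port: eight transfer steps, one bit per step."""
    return [[bit] for bit in to_bits(byte_value)]


letter_a = ord("A")  # 65, stored as the bits 01000001
print(len(send_parallel(letter_a)))  # 1 step
print(len(send_serial(letter_a)))    # 8 steps
```

          The same eight bits arrive in both cases; the parallel port simply needs one step where the serial port needs eight.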

          DATA REPRESENTATION

          Information is handled in the computer by electrical components such as transistors, integrated circuits, semiconductors and wires, all of which can only indicate two states or conditions. Transistors are either conducting or non-conducting; magnetic materials are magnetized either in one direction or in the opposite direction; a pulse or voltage is either present or not present. All information is represented within the computer by the presence or absence of these various signals.

          The starting point of computer data, the smallest, most fundamental unit, is called the 'bit'. The word 'bit' is a charming contraction of a longer and clumsier expression, 'binary digit'. We are all familiar with the ten decimal digits, 0 through 9, that are used to express the numbers we work with. Binary digits, in short bits, are similar, but while there are ten distinct decimal digits, there are only two different bit values, 'zero' and 'one', which are written, naturally enough, as '0' and '1'.

          The bits '0' and '1' represent 'off' and 'on'. The bit is the smallest possible chunk of information in the computer. Bits serve as building blocks with which we can construct and work with larger and more meaningful amounts of information. The most important and most interesting collection of bits is the 'byte'. A byte is eight bits, taken together as a single unit of computer data. The memory capacity of a computer or the storage capacity of a disk is measured in bytes.
          Bits & Bytes

          8 bits = 1 Byte = 1 Character
          1 Byte = 8 bits, e.g. 00110011
          1024 Bytes = 1 Kilobyte (KB)
          1024 KB = 1 Megabyte (MB)
          1024 MB = 1 Gigabyte (GB)
          1024 GB = 1 Terabyte (TB)
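
          The table can be tried out in Python. The sketch below assumes the ASCII character code, in which the example pattern 00110011 is the code for the character 3; the names used are only for illustration.

```python
def char_to_bits(character):
    """Return the 8-bit pattern of a single ASCII character as a string."""
    return format(ord(character), "08b")


# The units from the table, each 1024 times the one before.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB

print(char_to_bits("3"))  # 00110011
print(MB)                 # 1048576
```

          Note that the exact factor is 1024, which is why a kilobyte was earlier described as approximately 1000 bytes.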

          VIRUS

          What is a virus?
          A virus is an actively infectious, malicious computer program that places copies of itself into other applications and programs, but not into data files or documents. A virus attaches itself to an executable program because, in order to perform its destructive action, it has to get executed. Thus, it remains dormant until a user runs the application or program to which it is attached.

          When a virus-infected program is executed, the virus infects portions of the memory and then infects the files on the disks. This is how it may lead to the infection of a disk on another computer, thereby spreading the virus through the new system and leading to a loss of stored information or data.

          A computer virus is basically a program written for a destructive purpose. It is written in such a way that it can enter the computer without the knowledge of the machine or the user. It enters the machine through an infected floppy or program. It has the capability to make perfect copies of itself and cause abnormal functioning of the machine.

          Viruses are the work of intelligent programmers who are well versed in the intricate chip-level manipulation of code.

          Effects of a virus infection:
          1) Corrupted files, leading to data loss.
          2) Increased file sizes.
          3) Interference with display on the monitor.
          4) Formatting of the Hard Disk thereby destroying data.
          5) Destroying the File Allocation Table.
          6) Slowing down of the system.
          7) Marking good sectors as bad.
          8) Frequent hanging of machines.

          BOOT-SECTOR OR PARTITION TABLE VIRUS:
          A boot-sector virus substitutes itself for the bootstrap loader, and a partition table virus substitutes itself for the master boot program. This makes such viruses get loaded every time the machine is switched on. When the virus gets loaded into RAM, it goes on to infect all the files on the disk. Also, to make it appear that the processing of data is going on normally, the virus transfers control to the original bootstrap loader so that the booting procedure can take place normally. Some examples of boot sector/partition table viruses are (c) Brain, PC Stone and Happy Birthday Joshi. Other examples are Friday the 13th, Alabama, the Ping-Pong ball virus (bouncing ball), Dark Avenger, the BOXA virus, etc.

          MS-DOS 6.22

          The Disk Operating System (DOS) acts as an interface between the user and the computer. A disk operating system consists of the programs that manage disk operations and memory allocation. Managing disk operations involves reading, writing, searching and sorting the contents of the disk, etc. Other operating systems are UNIX, XENIX, OS/2, etc.

          EVOLUTION OF DOS
          Home computers were very successful in the USA, and this inspired IBM (International Business Machines) to introduce small computers (PCs, or Personal Computers) into the market. They also wanted an operating system for their PC range. Mr. Bill Gates of Microsoft purchased the copyright of an operating system called QDOS (Quick and Dirty Operating System), which was developed by Mr. Tim Paterson, and used it as the basis for another operating system known as MS-DOS. There was an understanding between Microsoft and IBM, and a copy of MS-DOS was sold with every IBM computer.

          WHAT IS PC-DOS?

          PC-DOS and MS-DOS both refer to the same operating system. The only difference is that IBM has manufactured PC-DOS and Microsoft has manufactured MS-DOS.

          What is an operating system?

          An operating system is a set of software programs that provides an interface between the user and the machine.

          Functions of an operating system


          A] Memory management: Whenever any program is executed or any job is run, it has to be loaded into memory first. But b