 Computer Science Basics :: Section 1
Author: Mike Ware
Website: [warebiz] :: "The Programmer's Domain" - http://warebiz.tripod.com
Email: warebiz@yahoo.com

You are here --> [Section 1 :: Computer Programming Essentials]

--> Introduction to Computer Science
--> Steps of Program Development
--> Levels of Computer Languages
--> Internal Representation of Data

Introduction to Computer Science

Since you will be learning programming skills, it is a good idea to get familiar with the concepts of computer science (denoted CS). CS is not computer literacy. It is a combination of math and logic. When programming, you will use problem solving strategies to find the best solution to a given problem. CS is a science because you will be extremely experimental. CS is engineering because you will usually look for a solution to a "real world" problem. CS is also interdisciplinary, which means it relates to many different fields.

Now you know the aspects of computer science, but you may be wondering how programmers actually find solutions to the problems they are given. After a programmer is given a problem to solve, he begins a process called program development, which is discussed in the next section. Before writing any code, programmers develop an algorithm, which is a step-by-step plan for solving a particular problem in a finite amount of time. Finite refers to a certain amount of time to solve the problem.

>>>>>>>>>
DEFINITION: Algorithm - a step-by-step plan for solving a particular problem in a finite amount of time.
<<<<<<<<<

Algorithms are developed by simply thinking about how to come up with a solution for the problem. I normally sit at a desk with pen and paper and write down everything that comes to mind until I eventually end up with a semi-organized outline capable of being converted into an algorithm. There are three standard ways to "display" an algorithm on paper: structure chart, pseudocode, or flowchart. A structure chart, also know as a hierarchy chart, shows the functional flow of a program. It is a way of breaking down a program into logical steps. Pseudocode is a verbal description of a solution involving a mixture of English and a programming language. Pseudo is a prefix meaning almost, relating to (pseudo - code : almost, relating to code). A flowchart is essentially the same as pseudocode except it is a visual tool that uses different chart symbols to direct flow of the program. I prefer pseudocode, but you should use whichever method works best and feels most natural to you.

The following is an example of an algorithm written in pseudocode form:

................
PROBLEM: Write an algorithm that will, given the current date, find the date of the next day.
````````````````
///////////// BEGIN ALGORITHM /////////////
1) Get the current date
- Get the current month, day, and year in numerical form
2) If month = 2,
If day = 29 or day = 28 and it's not a leap year
increment month
else
increment day
else if month = 1,3,5,7,8,10,12
If day = 31
increment month
else
increment day
else if month = 4,6,9,11
If day = 30
increment month
else
increment day
3) Report "new" date [month, day, year]

Increment Day

Increment Month
if month = 12
increment year
else
day = 1

Increment Year
month = 1

Leap Year
divisible by 400 or divisible by 4 but not 100
///////////// END ALGORITHM /////////////

It is important to note that programmers develop an algorithm before writing any type of code. They do not immediately start writing code when given a problem to solve. This is a common mistake for many beginning programmers. Several of my classmates approach me about problems they have when trying to complete class projects. The main problem I find is that they don't fully understand what they need to do because they haven't taken the time to actually think about what the problem requires. Move on to the next section for more on developing solutions to problems...

Steps of Program Development

In the world of computing, the job of a computer programmer is to create programs that solve specific problems. The problems that programmers encounter may be as simple as printing a billing summary for customers at a store, or they can be as complex as performing financial calculations for a bank. Whatever the problem may be, programmers follow a multi-step process called program development in order to create successful programs. Program development is a five-step process requiring understanding of the problem at hand, developing a solution, writing a program, testing the program, and maintaining it.

The first step in the program development process is to understand the problem. This step is critical because a programmer cannot solve a problem until he fully understands it. During this step, the programmer carefully analyzes the problem in order to form a precise specification that includes the input required and the type of output needed. Input refers to the specific data that is put into a problem in order for it to be solved. Output refers to the exact answer that must be produced from the problem. Before the programmer can do anything else, it is vital that he fully understands the problem.

After he fully understands the problem, his next step is to develop a solution. It is important that the solution is developed before writing any type of programming code. When developing the solution, the programmer must devise an algorithm, which is a step-by-step plan for solving a particular problem. An algorithm can be displayed on paper in one of three ways: a flowchart, a structure chart, or pseudocode. Most programmers choose to use pseudocode, which is a verbal description of the solution involving a mixture of English and a programming language. During this step, the programmer must also make sure he is solving the problem correctly (verification), and he is solving the correct problem (validation).

After a solution has been developed, the next step of the process is to write program code. Writing code essentially means taking the algorithm and converting it into a computer programming language. The programmer must first pick an appropriate programming language to use, such as BASIC, Pascal, Ada, and C. When writing code, the programmer starts at the beginning of the algorithm and works his way down to the end. He must make sure his program code is well-structured and includes adequate documentation. Documentation is statements written in program code that does not theoretically affect the code, but is used to explain the logic behind specific code segments.

The next step in the process is to test the code. Testing can be done by running the program and manually checking the results. Two types of testing normally take place during this step: white box testing and black box testing. White box testing, commonly called glass box testing, refers to testing done by the person who wrote the program code. In other words, the person doing the testing knows everything about the program code. Black box testing, on the other hand, refers to testing done by someone who has no idea of how the program code is written. Documentation is especially important in this phase so other programmers (including yourself) can analyze the code easily in the future. This makes it extremely easier to find and correct mistakes that are initially unnoticed.

After the code has been thoroughly tested, the fifth and final step is maintaining the completed program. The programmer maintains it by updating the code and editing the code in order to make it more efficient. During this maintenance phase, programmers also may have to correct "bugs", which are errors in code that were not recognized during testing.

Every time a programmer is given a problem to solve, he calls upon the program development process for guidance. Every step in the process must be completed in order for the programmer to create a successful solution. If the problem is not understood, a solution will not be developed, and a program will not be written. If a program is successfully written but not maintained, the program will eventually become obsolete. Every step is critical towards the overall success of the program. Although the problems programmers encounter may change, the process of developing a solution will remain the same.

The following is an outline of each step in the program development process. Note the requirements of each step.

1) Understand the Problem
- You can't solve a particular problem unless you understand it.
- Form a precise specification, including the input required for the program and the output needed.

2) Develop an Algorithm
- You should develop a plan before writing any type of program code.
- Check verification : Are you solving the problem correctly?
- Check validation: Are you solving the correct problem.
- Ways to display algorithms: structure chart, pseudocode, flowchart [see also Introduction to CS]

3) Write the Program Code
- Essentially converting an algorithm into a computer programming language such as C, C++, Java, BASIC, Pascal, or Ada.
- Be sure to use meaningful identifiers, which are names give to variables
- Be sure to include adequate documentation, which are comment statements written in your program code that do not affect the code itself, but explain what
specific code segments are supposed to do.

4) Test the Program
- Run the program and manually check the results.
- Be thorough:
- Test all possibilities
- Test extreme data (invalid data, limit values, empty/null values)
- Perform BlackBox testing
- Perform WhiteBox testing

5) Maintenance
- Normally geared toward programming in the real world (careers)
- Update Code
- Correct "bugs"
- Edit code to make it more efficient.

With the program development process covered, we can now look at the different levels of computer languages. Read on for more...

Levels of Computer Languages

It is obvious that writing a computer program requires use of a computer language. There are three levels of computer languages: machine, symbolic, and high level.

Machine language dates back to the 1940's and is the only language that is directly understood by a computer. All code written in machine language is based on binary code, which makes it extremely difficult and time-consuming when writing large programs. Binary code is composed of streams of 1's and 0's. This makes perfect sense considering that the internal circuit of a computer consists of switches, transistors, and other devices that can only be in one state: on or off. The off state is represented by 0, and the on state is represented by 1. A program that took an hour to write in machine language could possibly be written in a couple of minutes using a high-level language.

Thanks to the work of mathematician Grace Hopper in the early 1950's, a new language was developed and named symbolic, or commonly referred to as assembly. Symbolic language made it possible to use commands such as MOV and ADD instead of having to rely primarily on binary code for every command. The symbols represented various machine language instructions, but the symbols still had to be converted to machine language for a computer to understand. A program called an assembler was developed and used for these purposes.

In the late 1950's, a major breakthrough was made and high-level languages were developed. High-level languages broke free of depending directly on the hardware of a computer. Instead, these languages concentrate on the problem being solved. Program code is written with symbolic names, but the names actually represent memory addresses in the computer. When the program code is run, a compiler translates the code into binary code (internally) so the computer can understand it. High-level languages make it extremely easier to write and understand program code than machine and symbolic language. Similar to symbolic language, high-level languages require a translator, normally called a compiler, to convert code into machine language for the computer to interpret.

Examples using machine and high-level languages:

Write a code segment that will add 2 integers and store the result.

**** NOTE: This machine language code segment will not work. It is meant to illustrate binary code. ****
///////////// BEGIN MACHINE LANGUAGE CODE /////////////

0000000 1101001 0010011 0001010
1111011 1010100 1010010 1011101
0000100 1110111 1110111 1010101
0001001 0011001 1111111 1010000

////////////// END MACHINE LANGUAGE CODE //////////////

///////////// BEGIN HIGH-LEVEL CODE /////////////

sum = firstValue + secondValue;

////////////// END HIGH-LEVEL CODE //////////////

Ok, so programmers use computer languages to give instructions to a CPU (central processing unit), but exactly how do computers understand the instructions? How do computers think, and how do computers interpret data? The next section deals with the internal representation of data from the perspective of a computer. Read on for more...

Internal Representation of Data

When considering the internal representation of data, understand two things: the units of storage in a computer and the different number systems.

Units of Storage
Smallest unit of storage to greatest: - bit (binary digit) [0 or 1]
- nibble (4 bits)
- byte (8 bits)
- word (2 bytes = 16 bits)
- longword (4 bytes = 32 bits)
- kilobyte (1024 bytes bytes = 2^10)
- megabyte (approx. 1 million bytes = 2^20)
- gigabyte (approx. 1 billion bytes = 2^30)

NOTE: In a bit, you can store one of 2 possible values: 1 or 0.
NOTE: In a byte, you can store one of 256 possible values.

Number Systems
- binary (base 2)
- octal (base 8)
- decimal (base 10)

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
A corresponds with 10
B corresponds with 11
C corresponds with 12
D corresponds with 13
E corresponds with 14
F corresponds with 15
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Converting values from one number system to another is not that important, but for us geeks, it's more knowledge to add to our cerebral cortex. It's always good to know how computers interpret data.

Example: 94 (decimal) = ? (binary)

a - 94 / 2 = 47 which has a remainder of 0
b - 47 / 2 = 23 which has a remainder of 1
c - 23 / 2 = 11 which has a remainder of 1
d - 11 / 2 = 5 which has a remainder of 1
e - 5 / 2 = 2 which has a remainder of 1
f - 2 / 2 = 1 which has a remainder of 0
g - 1 / 2 = 0 which has a remainder of 1

NOTE: As you perform the divisions, place the remainders in order from right to left.
Answer: 94 (Decimal) = 1011110 (binary)
-------------------------------------------------------------------------------------------------------------------------

Example: 1011110 (binary) = ? (decimal)

NOTE: only count those values which are turned on (designated by 1)
2^6 + 2^4 + 2^3 + 2^2 + 2^1 = 64 + 16 + 8 + 4 + 2 = 94

Answer: 1011110 (binary) = 94 (decimal)
-------------------------------------------------------------------------------------------------------------------------

Example: 8BC (hex) = ? (decimal)

8 x 16^2 + 11 x 16^1 + 12 x 16^0 ===>
8 x 156 + 11 x 16 + 12 = 2236

Answer: 8BC (hex) = 2236 (decimal)
-------------------------------------------------------------------------------------------------------------------------

Example: 110 (decimal) = ? (hex)

110 / 16 = 6 which has a remainder of 14 which corresponds to E in hex
6 / 16 = 0 which has a remainder of 6

NOTE: As you perform the divisions, place the remainders in order from right to left.
Answer: 110 (decimal) = 6E (hex)
-------------------------------------------------------------------------------------------------------------------------

Example: 1101111100110001100111 (binary) = ? (hex)

TRICK: First divide digits in pairs of four from right to left.

11  0111  1100  1100  0110  0111 ===>
0111 = 2^2 + 2^1 + 2^0 = 7
0110 = 2^2 + 2^1 = 6
1100 = 2^3 + 2^2 = 12 which corresponds to C
1100 = 2^3 + 2^2 = 12 which corresponds to C
0111 = 2^2 + 2^1 + 2^0 = 7
11 = 2^1 + 2^0 = 3

Answer: 1101111100110001100111 (binary) = 37CC67 (hex)
-------------------------------------------------------------------------------------------------------------------------

Example: F2C (hex) = ? (binary)

C = 1100
2 = 0010
F = 1111

Answer: F2C (hex) = 1111 0010 1100 (binary)
-------------------------------------------------------------------------------------------------------------------------

Integer Values
A computer stores integer values in two's complement form. For positive integers, it simply converts the value (should be decimal) into binary and performs a capacity check. Negative integers are slightly more difficult to calculate. Let's look at an example of each.

Example: - the integer 17, in 1-byte two's complement form:
00010001 [which is simply 17 converted into binary form]

Example: - the integer -17, in 1-byte two's complement form:
Step 1 - convert 17 into binary form: 00010001
Step 2 - complement (or "flip") each bit: 11101110
Step 3 - add 1: 11101111

The following example is slightly more difficult.
Let (LSB) denote the least significant byte (or the byte that has the lowest address in memory of a computer).

Example: - Suppose we have a 2-byte integer in memory that looks like the following:      00000110 10110001 (LSB)

In order to figure out the value of this integer, you need to first distinguish which byte is the least significant, which I give you in this example. In this case, 10110001 is the LSB. You then convert that byte from binary into decimal form. 10110001 converted to decimal equals 177. Your next step is to evaluate the byte that you have left over which is 00000110. You must convert this value into decimal form, but since you are dealing with a 2-byte integer, the right most bit in 00000110 <--- is in position 256. Therefore, first convert 00000110 into decimal. You should get 6, but you have to multiply that decimal value by 256 to take care of the two-byte integer. Then simply add the two decimal values together.

6 * 256 = 1536 + 177 = 1713, and 1713 is the value of 00000110 10110001

NOTE: Because of two's complement form, anytime the left most bit in a value given in binary form is 1, the value must be negative.
Examples:
11101111 is -17
10000000 is -128
10000001 is -127

Real Values (floating-point)
A computer stores a real value, which has a decimal point, by storing 2 integer values (binary mantissa and exponent). In order to store a mantissa integer value and an exponent integer value, the real value must be converted into scientific notation.

Example: Consider a real value 272.8914
272.8914 converted into scientific notation equals .2728914 X 10^3
2728914 would be stored as the mantissa and 3 would be stored as the exponent

Character Data (strings)
A computer stores character data by using the standard ASCII (Amercian Standard Code For Information Interchange) coding scheme. In order to store a character string "c++" internally in a computer, the characters 'c' '+' '+' must be stored numerically by using the ASCII table values. The following is an image resource taken from http://www.asciitable.com/:  NOTE: You may notice that there is a difference of 32 between an uppercase A and a lowercase a. Since there are 26 letters in the alphabet, why is the ASCII system built in this manner. The geniuses behind ASCII designed it this way to make internal processing easier. To get from a capital 'A' to a lowercase 'a' in binary form, all you would need to do is flip position 32 of the value.

Example:
01000001 - 'A'
01100001 - 'a'

As a beginning programmer, you may find it difficult to understand why learning the internal representation of data is important. There will definitely be times when finding a solution to a problem will directly or indirectly depend on knowing how computers interpret data. I cannot stress that enough. If you are serious about programming, think twice before not learning how computers interpret data.

You have now reached the end of the General Programming tutorial. If you have any questions concerning any content or example programs used in the articles, simply contact me and I'll make an attempt to respond as soon as possible. More General Programming articles are coming soon so be sure to check back for updates. Until then, happy coding.