QBasic - Step By Step :: Section 7
Author: Mike Ware
Website: [warebiz] :: "The Programmer's Domain" - http://warebiz.tripod.com
Email: warebiz@yahoo.com

You are here --> [Section 7 :: Arrays]

--> What is an Array?
--> One-Dimensional Arrays
--> Two-Dimensional Arrays
--> Sorting Array Data
--> Searching Array Data

What is an Array?

An array is a structured type that is useful for storing many individual values that are all of the same type. For example, consider a situation where you have to find the averages for sixty students in a class. Instead of creating sixty individual variables to hold data for each student like we would have to do using simple variables, we could use one single data array to store the sixty individual values as a group listing. Formally, an array is an ordered group of related data items (of the same type) having a common variable name. An array is primarily used when groups of data items must be stored and manipulated efficiently. An array is given a name and also a specified number of data items that it will contain; each individual item is referred to as an element. Each element is given a unique storage location in the array and a subscript or index value is used to identify the position of a particular element. The subscript, which may be an expression or simple variable, is enclosed in parantheses after the array name. Unless specified otherwise, the first element in an array is always given an index value of 1 and the index is incremented for each element thereafter. Up to this point in this tutorial, we have only dealt with unsubscripted variables, which are simple variables that are only capable of storing one value. An array is called a subscripted variable becuase when we need to access a certain element, we must use a subscript to point to that element so we can differentiate it from the rest.

In order to declare an array, we must use a DIM statement which plays the role of reserving internal memory space for an array of any desired size. The form of declaring a one-dimensional array is as follows:

```
DIM arrName ( startIndex TO endIndex )
```

where arrName is the variable name of the array, startIndex is the starting subscript value, and endIndex is the ending subscript value. Here are few more examples:

```
DIM scores (1 TO 15)
DIM numbers (10 TO 20)
DIM names\$(1 TO 10), grades\$(1 TO 10), gpa(1 TO 10)
```

One-Dimensional Arrays

A one-dimesional array can be thought of a single column of a specified number of rows used to hold related data; it is a single list of elements. For example, consider the following:

```
DIM students\$(1 TO 5)

students\$(1) = "John"
students\$(2) = "Matthew"
students\$(3) = "Luke"
students\$(4) = "Peter"
students\$(5) = "Michael"
```

Data can be stored in the array just as I have done above; by simply assigning a value into a particular array position using a subscript. For a better and more efficient way of storing data into a array, you should use a loop because of its ability to keep cycling until a certain condition is met, which in this case would be no more elements to be filled or end of the array. A FOR...NEXT loop is particulary nice when you know exactly how many items are to be placed in the array. For example, consider the following FOR...NEXT loop which could accomplish the same goal for our previous students\$ array:

```
DIM students\$(1 TO 5)

FOR count = 1 TO 5
PRINT "What is the name for student "; count;
INPUT students\$(count)
NEXT count
```

The above loop would be executed five times, from 1 to 5, so each element in the array will be filled by user input (the array has a size of five elements). Sometimes, you may have several arrays each of differnt types but are all related to eachother based upon corresponding elements in the arrays. When arrays are defined in this way, they are said to be parallel and can be easily assigned values using a loop. For example, consider the following:

```
DIM names\$(1 TO 10), grades\$(1 TO 10), gpa(1 TO 10)

FOR count = 1 TO 10
INPUT "Enter student: ", names\$(count)
INPUT "Enter gpa: ", gpa(count)
NEXT count
```

The above three arrays are said to be parallel because all elements in the arrays correspond and depend on eachother. The letter grade in grades\$(5) and gpa score in gpa(5) are associated with the student name in names\$(5). When the number of items to be entered into an array is not known, a DO WHILE loop can be used with a sentinel value and a counter variable to keep track of how many data items were entered. In this case, we must make sure the size of the array will be capable of storing the number of entered items. For example, consider the following:

```
DIM customers\$(1 TO 50), balances(1 TO 50)

count = 1
INPUT "Enter the name of the customer: "; customer\$
INPUT "Enter the customer's balance (-999 to quit): "; balance
DO WHILE (count < 50) AND (balance <> -999)
customers\$(count) = customer\$
balances(count) = balance
count = count + 1
INPUT "Enter the name of the customer: "; customer\$
INPUT "Enter the customer's balance (-999 to quit): "; balance
LOOP
count = count - 1

```

Two-Dimensional Arrays

While a one-dimensional array is only capable of storing data in a single column, a two-dimensional array allows data to be stored in both columns and rows; it can be thought of as muitlple single lists of data. Two-dimensonal arrays are used for the storing and retreiving of complex data. Suppose you need to calculate the grade for twenty students in a class. Each student has taken ten tests, and the grade for each student is entirely based on the average of the ten test scores. We could use a two-dimensional array to handle this situation by associating each row in the array as data for a particular student. Each row would therefore have ten columns to store each test score. For a first example, consider the following:

```
DIM stuData(1 TO 20, 1 TO 10)
```

The above DIM statement would declare a two-dimensional array holding a possible of 20 x 10 = 200 elements. It has 20 rows and 10 columns. This array declaration would take care of our previous student problem. As we can do with one-dimensional arrays, a FOR...NEXT loop is a great choice for placing data into each location in the array but since we are using two-dimesional arrays, we will have to use nested loops. For example, consider the following:

```
DIM totalPts(1 TO 20)

FOR row = 1 TO 20
FOR col = 1 TO 10
PRINT "Enter test score"; col; "for student"; row; ": "
INPUT stuData(row, col)
totalPts(row) = totalPts(row) + stuData(row, col)
NEXT col
NEXT row
```

The above loop would be executed a total number of 200 times. Every time a row is "cycled" through, the inside col loop would then be executed ten times to take of the ten test scores for each student. Each row in the array is therefore associated with a student. In this case, we should also have a parallel one-dimensional array holding the names of the students. Also, the totalPts array is holding the sum of all tests for each particular student so we can later find the average and calculate the letter grade based on it for each student.

As you can tell from above, to access data in a 2D array, we must a double subscript notation. For example, to access the fifth test score of the first student in our previous example, we would use:

```
stuData(1, 5)
```

Just like 1D arrays, we can also use numeric expressions as subscripts.

So far, we have only covered how to store data into an array. Extracting data from an array so we can print results basically uses the exact same concept instead we will only be printing data elements. Consider the following examples:

```
REM outputting information contained in a 1D array

DIM customers\$(1 TO 50), balances(1 TO 50)

count = 1
INPUT "Enter the name of the customer: "; customer\$
INPUT "Enter the customer's balance (-999 to quit): "; balance
DO WHILE (count < 50) AND (balance <> -999)
customers\$(count) = customer\$
balances(count) = balance
count = count + 1
INPUT "Enter the name of the customer: "; customer\$
INPUT "Enter the customer's balance (-999 to quit): "; balance
LOOP
count = count - 1

sizeOfArray = count
FOR counter = 1 TO sizeOfArray
PRINT "Customer Name: "; customers\$(counter)
PRINT "Customer Balance: "; balances(counter)
NEXT counter

REM outputting information contained in a 2D array

DIM stuData(1 TO 20, 1 TO 10)
DIM totalPts(1 TO 20)

FOR row = 1 TO 20
FOR col = 1 TO 10
PRINT "Enter test score"; col; "for student"; row; ": "
INPUT stuData(row, col)
totalPts(row) = totalPts(row) + stuData(row, col)
NEXT col
NEXT row

FOR row = 1 TO 20
FOR col = 1 TO 10
PRINT "Stu Test Scores: "; stuData(row, col);
NEXT col
PRINT
NEXT row
```

Sorting Array Data

When dealing with an array, you will most likely encounter problems that require the array to be sorted into some type of order. For example, consider the situation where you currently have an array of fifty employee names and another parallel array containing each employee's wages, and you must generate a report that lists the employee's information based on their wages, from who earned the most to who earned the least. In this type of situation, we would have to sort both of the arrays into descending order based on the wages of the employees. This is fairly simple to do and very convenient.

There are two ways in which you can sort an array: ascending or descending order. Ascending order starts low and ends high, or sorts values from least to greatest. Descending order starts high and ends low, or sorts values from greatest to least. In order to actually sort the array, we will take advantage of a "built-in" SWAP function. The SWAP function will simply exchange values between two positions in an array during the sort process.

I present two techniques for sorting an array: the bubble sort and the shell sort. Let's begin with the bubble sort.

Bubble Sort
A bubble sort is a sorting technique that occurs due to a series of comparisons done to adjacent elements in an array. The sorting will take place within nested loops. The outer loop will determine if the loop still needs to be sorted, and the inner loops actually filters through the array and performs "swaps" if needed. A "swap" occurs only if a certain condition is met depending on which order the array is being sorted. For ascending order, if the current element is greater than an adjacent element, then a "swap" will occur. For descending order, if the current element is less than an adjacent element, then a "swap" will occur.

The outer loop will allow for several "passes" through the array provided the array still needs to be sorted. During each "pass", the inner loop will perform comparisons among the elements and "swap" accordingly. Because "swaps" will be made during each "pass", the sort will ultimately "lock" elements at the end of the array during the sort process. For example, if want to sort an array of ten integers into ascending order, then after "pass" 1, the largest value in the array will be "locked" into the last position in the array. This is how the "bubble" sort gets its name; a value in position 1 of an array could be "bubbled" down to the last position in the array after the first "pass" through the sort.

Once you see the form of one bubble sort, you will be able to tweak the code and use it for any type of problem. The following complete program illustrates the use of a bubble sort to sort an array of 10 integers into ascending order:

```
////////////////////////////////////////////////////////////////////////////////////////

REM This program uses a bubble sort to sort an array of 10 integers
REM into ascending order.
REM
REM PROGRAMMER: Mike Ware
REM DATE LAST UPDATE: 4-3-00

CLS

sizeOfArray = 10
DIM nums(1 TO sizeOfArray)

FOR count = 1 TO sizeOfArray
INPUT "Enter a integral value: ", nums(count)
NEXT count

CALL sortNums(nums(), sizeOfArray)

PRINT
PRINT
PRINT "SORTED ARRAY"
PRINT "--------------------------------------------------"

FOR count = 1 TO sizeOfArray
PRINT nums(count);
NEXT count

END

SUB sortNums (nums(), sizeOfArray)

REM Begin bubble sort process to sort array into ascending order

last = sizeOfArray - 1
isChanged = 1

DO WHILE isChanged = 1
isChanged = 0
FOR count = 1 TO last
IF nums(count) > nums(count + 1) THEN
SWAP nums(count), nums(count + 1)
isChanged = 1
END IF
NEXT count
last = last - 1
LOOP

END SUB

////////////////////////////////////////////////////////////////////////////////////////
```

With the bubble sort out of the away, we can now move on to talk about the shell sort.

Shell Sort
Although the bubble sort is easy to understand, it is not very efficient because it is only capable of comparing and "swapping" adjacent elements in an array. A shell sort is done by comparing and sorting "grouped" elements which are divided by a "gap" value during each "pass" through the sort. Before the next "pass" occurs, the "gap" is then cut in half and the process is repeated thereafter until the list is sorted into a particular order.

Because the shell sort "groups" elements of an array together and is able to compare elements that may be distant from eachother, it is a more efficient and faster sort than the bubble sort. The speed and efficiency is extremely noticable when sorting large arrays.

The following is a complete program mimicking the previous program listed under the bubble sort section except I have replaced the bubble sort with a shell sort:

```
////////////////////////////////////////////////////////////////////////////////////////

REM This program uses a shell sort to sort an array of 10 integers
REM into ascending order.
REM
REM PROGRAMMER: Mike Ware
REM DATE LAST UPDATE: 4-7-00

CLS

sizeOfArray = 10
DIM nums(1 TO sizeOfArray)

FOR count = 1 TO sizeOfArray
INPUT "Enter a integral value: ", nums(count)
NEXT count

CALL sortNums(nums(), sizeOfArray)

PRINT
PRINT
PRINT "SORTED ARRAY"
PRINT "--------------------------------------------------"

FOR count = 1 TO sizeOfArray
PRINT nums(count);
NEXT count

END

SUB sortNums (nums(), sizeOfArray)

REM Begin shell sort process to sort array into ascending order

final = sizeOfArray
Gap = INT(final / 2)

DO WHILE Gap <> 0
isChanged = 1
FOR count = 1 TO (final - Gap)
IF nums(count) > nums(count + Gap) THEN
SWAP nums(count), nums(count + Gap)
isChanged = 0
END IF
NEXT count

REM if no changes were made to array, decrease gap size
IF isChanged = 1 THEN
Gap = INT(Gap / 2)
END IF
LOOP

END SUB

////////////////////////////////////////////////////////////////////////////////////////
```

The next section introduces some techniques for searching arrays. Some arrays must be previously sorted before certain search techniques are capable of being used; others don't have to be. Read on for more about searching arrays...

Searching Array Data

There are two methods used to search for certain values or certain conditions in an array: a sequential search and a binary search. By defintion, sequential means movement done in some type of general order; in programming, a sequential search will move from one item in an array to the next in search of something. By definition, binary generally means something that involves two parts; in programming, a binary search will basically "split" an array in half during each "pass" or search, in order to determine which halve contains the target (the item we are searching for) and repeat this "split" process until the target value is found or is not found.

Sequential Search
A sequential search searches array elements one-by-one until a certain value is found (normally called a target value) or perhaps the end of the array has been reached and no value was found. The search generally begins searching at the beginning of the array and "passes" through the array toward the end, but it is capable of being written so it works in a backwards motion. The best choice for controlling a sequential array, since it only moves from one position in the array to the next, would be to use a FOR...NEXT loop to perform the search operations. Sequential searches are efficient for arrays with few elements, but are considerably slow for arrays with large capacities. For example, consider the following complete program which uses a sequential search to find a certain target value and replace it with a value of 0:

```
////////////////////////////////////////////////////////////////////////////////////////
CLS

DIM testScores(1 TO 5)

sizeOfArray = 5

testScores(1) = 65
testScores(2) = 88
testScores(3) = 75
testScores(4) = 92
testScores(5) = 84

PRINT "Initial Array Elements"
FOR k = 1 TO sizeOfArray
PRINT testScores(k)
NEXT k

found\$ = "n"
target = 75
FOR index = 1 TO sizeOfArray
IF testScores(index) = target THEN
testScores(index) = 0
found\$ = "y"
END IF
NEXT index

IF found\$ = "n" THEN
PRINT
ELSE
PRINT
PRINT
PRINT target; "was found and replaced with a value of 0."
PRINT
PRINT "Updated Array Elements"
FOR k = 1 TO sizeOfArray
PRINT testScores(k)
NEXT k
END IF

END
////////////////////////////////////////////////////////////////////////////////////////
```

Binary Search
A binary search is essentially a search that repeatedly divides an array into havles by comparing a target value to the "current" middle element and determining which "half" does not contain the target value and eliminating that portion. This process is then repeatedly again for the new "half" until either the middle element is equal to the target value or the lower limit of the half is greater than the upper limit of the half, which indicates that the array does not contain the target value. As you can tell, a binary search will only work for arrays that have been previously sorted into ascending or descending order [see Sorting Array Data] because the determination of which half contains the target value directly depends on the order of the array (the target could be less than the middle element, greater than the middle element, or equal to the middle element). I will only cover a binary search for an array sorted into ascending order and leave the intuitive binary search for descending order up to you to solve.

Overview : You will begin the search process by determing a lower limit (normally 1 for indication of first element in array) and an upper limit (the last position in the array which contains a value). For arrays sorted into ascending order, if the target value is less than the middle element, the upper limit should be assigned the position directly before the middle element so during the next "pass" or search, it will look in the half which contains values near the target value. If the target value is greater than the middle element, the lower limit should be assigned the position directly after the middle element so during the next "pass" or search, it will look in the half which contains values near the target value. If the middle element is equal to the target, then no manipulation should be done to the limits.

The following complete program demonstrates the use of a binary search with an integer array explicitly sorted into ascending order to determine the position of a certain value contained in the array:

```
////////////////////////////////////////////////////////////////////////////////////////
CLS

DIM intArr(1 TO 10)

intArr(1) = 1
intArr(2) = 5
intArr(3) = 9
intArr(4) = 13
intArr(5) = 17
intArr(6) = 21
intArr(7) = 25
intArr(8) = 29
intArr(9) = 33
intArr(10) = 37

target = 9
lowerLimit = 1
upperLimit = 10
midElement = INT((upperLimit + lowerLimit) / 2)

DO WHILE lowerLimit <= midElement AND target <> intArr(midElement)
IF target < intArr(midElement) THEN
upperLimit = midElement - 1
ELSE
lowerLimit = midElement + 1
END IF
midElement = INT((upperLimit + lowerLimit) / 2)
LOOP

IF lowerLimit > upperLimit THEN
PRINT
PRINT "Search Unsuccessful"
ELSE
PRINT
PRINT
PRINT "Search Successful"
PRINT
FOR K = 1 TO 10
PRINT intArr(K);
NEXT K
PRINT
PRINT target; "is currently in position"; midElement; "of the array."
END IF

END
////////////////////////////////////////////////////////////////////////////////////////
```

Move on to next set of topics: Section 8 - Working With Records and Data Files