Array

From Rosetta Code
Revision as of 21:51, 26 August 2008 by rosettacode>TBH (Grammar)

An array is a composite data type, in the collection category, that stores multiple values all of the same declared type. The stored values are called elements and are accessed by a tuple of indices. By using a variable indices, the array may be accessed for a not predetermined set of indices, during the run of the program. All indices of an array are totally ordered.

An array in which some of the indices have infinite (or else very large) bounded subsets is known as an associative array. For example, arrays indexed by lexicographically ordered strings are associative.

Arrays with more than one index are called multidimensional arrays. For example, a matrix is a two-dimensional array.

Basic operations defined on arrays are:

  • Indexing: accessing an array element by its tuple of indices
  • Slicing: producing a subarray by putting some constraint on the indices. For example, PL/1 provides extracting of a row or a column of an array. In Ada any range of the index can be used in order to extract a subarray from a single-dimensional array
  • Iteration of the array's elements. Some languages have a foreach loop construct for array iteration
  • Querying the bounds of array indices
  • Operations on indices (next, previous, range etc)
  • Array programming languages provide operations applied to entire arrays, so programs in such languages often lack specific index reference

Values of arrays are called array aggregates.

Multidimensional arrays in which the valid range of one index depends on the value of another are called ragged (also jagged). This term comes from a typical example of a ragged array, when a two-dimensional array is used to store strings of different length in its rows. When put on paper the right margin of the output become ragged.

Each array index is bounded at run time. Further, when

  1. array is not associative
  2. each index is within the bounds (including the bounds)

then it is legal to access the array at the tuple of these indices.

The lower bound in many programming languages is fixed to either 0 (C and relatives) or 1 (Old Fortran and relatives), or arbitrary (Pascal and relatives, modern Fortran). In Ada any discrete type can used as an index.

For an empty array the lower bound of an index is greater than the upper bound. This causes the famous problem of declaring an empty array when the index type is a singleton (has only one value).

When bounds of all array indices are fixed, the array is said to be constrained. When some bounds are unknown until run time, the array is unconstrained. Non-associative, unconstrained arrays are usually passed to the subprograms with a "dope" attached to the array body. The dope contains the actual bounds of unconstrained array indices. Associative arrays carry some indexing structure with them, like a hash table etc.

The minimal size of a non-associative array is determined by the current bounds of its indices. In all regular programming languages, the size of such array can be set by the programmer at compile time or after. In many modern programming languages the size of the array may be computed and allocated at run time.

In most programming languages, arrays are accessed by using the array brackets [ and ], e.g. in A[i], but exceptions are Rexx, instead using the dot operator ., such as in A.i, Fortran, Ada and BASIC which use round parentheses A(i), and of course in the lisps which uses constructions like (ELT A n) for accessing and (SETA A n new_val) for setting (Interlisp) or (vector-ref A n) for accessing and (vector-set! A n new_val) for setting (Scheme). No bracket indexing occurs in J, an array language; instead, the normal syntax of function creation and function calling applies.

C

We wish to open a text file and compute letter frequency

FILE *any_text;
/* declare array */
int frequency[26]; 
/* declare a computed index */
int ch; 
 
any_text = fopen ("a_text_file.txt", "rt");
 
/* init the freq table: */
for (ch = 0; ch < 26; ch++) 
    frequency[ch] = 0;
 
ch = fgetc(any_text);
while (!feof(any_text)) {
    if (is_a_letter(ch))
        /* if the char is a letter, then increase character slot in the freq table: */
        frequency[ch-'A'] += 1;
    ch = fgetc(any_text);
}

J

The example task is the same: open a text file and compute letter frequency.
Written in this array programming language, no loops are specified.
Input is a directory-path with filename. Output is a 26-element single-axis integer array.

load 'files'     NB. fread is among these standard file utilities
ltrfreq=: 3 : 0
 letters=. (65+(i.26)) { a.                  NB. We'll work with minimal alphabet.
 reduced=. (#~ e.&letters) toupper fread y   NB. Omit non-letters. (y is the input.)
 sums   =. +/"_1 = reduced                   NB. Count how often each letter occurs.
 sums (letters I. (~. reduced))} 26 # 0      NB. Alphabetize the sums, then return.
)

Computational metrics

Access is O(1), appending is O(1), and insertion is O(n).