Hooray For Arrays.Part(1)

I’ve mentioned arrays while writing about other topics.  It’s time I write about arrays.  Someone could program in SQR for a whole career without ever using them, but that someone would have to work harder to do without these handy data objects.

Arrays Versus Spreadsheets And Tables

SQR arrays resemble spreadsheets or database tables, but also have significant differences.

  • An array has rows and columns (fields) like a spreadsheet or a table.
  • Data can be read, written, or updated as with a spreadsheet or a table.
  • Data access in an array is based on row numbers.  Spreadsheets use row numbers most of the time but do have lookup functions that search for a particular value and return a related value on the same row or the same column.  Database tables, especially relational database tables, usually access a row by its data values.
  • Peoplesoft SQR requires a user defined limit to the number of rows in an array.  Spreadsheet software imposes a fixed limit on all its spreadsheets.  Database management systems have no practical limits to the number of rows in a table.  (OK, it’s not unlimited, just practically unlimited.)
  • Array fields can be arrays themselves, creating two dimensional arrays.

Some problems are easier to solve with arrays, others are easier with tables.

  • Arrays are stored in memory, tables are stored on disk, and memory access is at least 1000 times faster (or 10,000 or 100,000 – I’ll leave the benchmarks to you).  That makes arrays preferable for algorithms that need to use each row of data more than once or algorithms that don’t need to store the data permanently.
  • Array data is addressed by position, table data is addressed by value.  Each method has its advantages.
  • Arrays allow random access, using rows in any order, or scanning forward and backward.  Efficient table access involves scanning a subset of rows in a predefined order without the ability to access a row after another row is read.
  • It’s easy to relate one row to another in an array and hard to do so in a table.  (Example: give me each employee’s current DEPTID and the one before that.)

Create-Array

This is more of a declaration than a command, and we can put it in the begin-setup section.  It instructs the SQR compiler to allocate and structure memory before the program begins to execute.  We can place the create-array declaration within an if statement or a while loop, but that does not make it conditional or cause it to occur multiple times.  This program will compile without error and show the text “goodbye”.  Note that the create-array declaration appears after the first use of the array, and the program flow never reaches it anyway.

begin-program
  if 1 = 1
    let x.y(1) = 'goodbye'
  else
    create-array name=x size=10
      field=y:char='hello'
  end-if
  let $xy = x.y(1)
  show $xy
end-program

The SQR Language Reference for Peoplesoft describes the array() function in the section about the let command, and refers us to the printarray function in the “ufunc.c” file.  I found printarray in the “extufunc.c” file, and it gives us some insight into the nature of SQR arrays.  Apparently, each field in an SQR array is stored in its own C array.

The printarray example shows us only two data types.  Numeric fields are stored as arrays of double length floating point numbers.  Character fields are stored as arrays with two elements in each position; a pointer to a character string and an unsigned short integer that indicates how much memory has been allocated for that character string.

We can speculate that integers and fixed precision numbers are probably stored in arrays.  I don’t know whether dates are stored as numbers or fixed length strings.  The data could be stored efficiently in arrays as floating point numbers (date and time) or they may use arrays of pointers to fixed length strings.

Clear-Array

In C, we have to declare a variable and describe it (byte, int, short, long, float, double, char, pointer, etc.) before we use it.  Also we’d better initialize it or it will be equal to whatever the voltage was on that part of the memory chip.

SQR resembles the BASIC language of its time (and maybe BASIC today too) in that variables need not be declared or initialized.  The first character of the variable (# or $) describes the default data type and SQR starts each variable at 0 or ” (the zero length string).

The most intriguing feature of create-array is that it requires us to declare the data types and allows us to initialize the values of the fields.  If this were true for scalar variables, SQR programs would look quite different, and several types of logic bugs would disappear.  (Scalar variables have single values – a date, a number, a string – as opposed to arrays and lists.)

It may be lack of experience on my part, but I think clear-array is one of the SQR great original ideas.  This command resets any field that has original values to those values.

Working With Array Elements

I was disappointed at first that we cannot use array elements in add, subtract, multiply, divide, or move commands.  When I learned that there were corresponding array commands for each of them, I was partly delighted.  The array-add, array-subtract, array-multiply, array-divide, put, and get commands are simultaneously much better than and almost as good as their scalar cousins.  (Then I was disappointed again to learn how few commands can use array elements.)

Pluses: The array commands allow multiple operations on a single array row.  We can add (or subtract, etc.) a list of numeric values to a list of fields.  We can move a list of values to a list of fields (put) or vice versa (get).

Minuses: The array commands cannot round the results of the arithmetic operations as the scalar commands can.  The put and get commands cannot apply edit masks or convert types (number to string) as the move command can.  The lists of values must be scalars, not array elements, so if you were hoping to copy an entire array row to another, you need to get all the fields into scalar variables and then put the scalar variables into the fields of another row.

When I’m writing a cross-tabulation report, I often want to load my row names and column headings into arrays.  The put command works on only one array row, so it seems no easier than a block of let commands.  The occurs parameter, which creates two dimensional arrays, is the answer.

create-array name:region size=1
  field=name:char:4

put ‘Northeast’ ‘Southeast’ ‘Midwest’ ‘West’
  into region(0) name(0) name(1) name(2) name(3)

Arrays Are Global

Variables can be global or local in scope.  A global variable is initialized when we first execute an SQR program.  A local variable is only available within its own procedure.  A global variable is available in a local procedure only if there is an underscore inserted in the second position of the variable name (#_global, $_global, &_a.global) Here’s how they interact:

begin-program
  let $region = 'USA'
  do one-state('CA')  ! California
  do one-state('NY')  ! New York
  show 'global value: ' $region
end-program

begin-procedure one-state($state)
  show 'local global value: ' $_region
  show 'local value: ' $region
  let $region = $state
  show 'local value: ' $region
end-procedure

The log file will show:

local global value: USA
local value:
local value: CA
local global value: USA
local value: CA
local value: NY
global value: USA

Note that the global version of $region was set to ‘USA’. The local version of $region started with the empty string.  The let command in the one-state procedure set the value of the local version of $region.  We could have written a let command in one-state to set the global version of $region, but that’s even worse than using global variables in a local procedure.

Arrays are always global and they are available in local procedures without modifying their name; we write “region.name(1)” everywhere.

2 Comments

  1. Bob Josephson says:

    Well, this is definitely not the topic I expected to see this week.

    Good article anyway. I assume another big difference between arrays and spreadsheets would be that spreadsheets allocate memory dynamically using a sparse matrix technique, whereas arrays are completely pre-allocated in a contiguous block.

    • administrator says:

      I was comparing the external (user) view of spreadsheets to the external view of arrays. I don’t know how every spreadsheet program manages its data, but I’d be surprised if a modern spreadsheet program didn’t “allocate memory dynamically using a sparse matrix.” For anyone working on a biography of me, you may note that I once wrote a spreadsheet program in interpreted BASIC that used a pre-allocated contiguous array.

      I extrapolated from very little evidence to comment on SQR arrays’ memory model. Each field in an SQR array has a pre-allocated block of memory, but I cannot tell whether all those blocks are contiguous. Also, a string field has a contiguous block of string pointers, but I doubt the string space is allocated until used. That would make it potentially quite different from a COBOL array of strings of predetermined length.