Bad SQL Redesigned

The Assignment

Last week we looked at various techniques for combining tables.  As an example, we took an assignment “to gather the hours reported on timecards for each employee over a two week pay period.”  We saw five ways to combine the Peoplesoft HCM tables PS_EMPLOYEES and PS_TL_RPTD_TIME.

By the end, I realized that there was no actual business problem that could be solved correctly and efficiently with this approach.  We sometimes get requests that deliberately exclude data in the database, for example:

  • We want a list of all employees, but not those who are on leave of absence or who telecommute (for a roll call at the fire drill tomorrow).
  • We want a list of all customers, except those on west of the Mississippi or who bought the Widget 2000 (to offer a discount for new buyers of the Widget 2000 in the Eastern USA).

How likely is this request?

  • We want a timecard summary of all employees.
  • Even though we make this request without specifying the tables to use, without knowing the characteristics of those tables, and without working knowledge of SQL join functionality,
  • We implicitly do not want employees who had timecards if they are no longer employees as of the day we run the report.  We also want no indication of the existence of employees who will be paid but who don’t have timecards.

More realistically, what do our business customers probably want?

  • List everyone who was an employee during a given two week period.
  • List everyone who had timecard entries dated within a given two week period, even if they were not an employee at the time.  (Perhaps they made timecard entries in advance and then left the organization.)
  •  Report the total quantity from each person’s timecards.  If someone on the list didn’t have timecards in this period, report the total quantity as zero.  (Normally we would challenge this; it doesn’t make sense to add hours, dollars, miles, and other units, but let’s assume our organization only has hours.)

Who To Include

We want all employees who were eligible for timecards during the pay period.  That’s indicated by an EMPL_STATUS of A (active), P (paid leave of absence), S (suspended), or U (terminated with pay) in the PS_JOB table.  The selection for that is as follows.

from PS_JOB J
where J.EFFDT >= (select max(EFFDT)
from PS_JOB
where EMPLID = J.EMPLID
and EMPL_RCD = J.EMPL_RCD
and J.EFFDT <= $pay_begin_dt)
and J.EFFDT <= $pay_end_dt
and J.EMPL_STATUS in ('A','P','S','U')

This selection may look unusual even to programmers who have experience with Peoplesoft HCM.  We have the standard subselection to find the maximum effective date on or before the beginning of the pay period.  However, we take more than that date; we take all subsequent dates until the end of the pay period.  This will give us employees who were eligible for timecards at any time within the pay period.

We check the EMPL_STATUS in that range of rows, but not with the usual “ALPS” values that correspond to HR_STATUS = A.  The Time Administration program in the Time & Labor module does not allow timecards for employees on unpaid leave of absence (L) but it does allow them for employees who are terminated with pay (U).

We also want any other employees who have timecards.  They may not be eligible for timecards, but that makes it even more important to have them on the report.  If we select PS_TL_RPTD_TIME without reference to PS_JOB we’ll get all the timecards and their employees.  An inner join of PS_TL_RPTD_TIME and PS_JOB would lose the rows we want.  An outer join or a union would work, but SQR supports another approach:

  1. Create an array of all employees (ever).
  2. Set a flag for anyone selected from the PS_JOB table.
  3. Enter hours from the timecard table and flag those people.
  4. Report everyone with a flag.

Array Management

SQR requires that we specify the size of an array and the Peoplesoft version of SQR requires that we not exceed that size.  We can get the row count of the PS_PERSONAL_DATA table and increase it sufficiently to let the program run for the next few years.  If necessary, we can estimate the number of employees to add over the next few years by getting the row count of the PS_EMPLOYMENT table where HIRE_DT was in the past few years.  We can put that number in a #define command.  If it’s necessary to program for ages, pick an enormous number like one million or use a table rather than an in-memory array.

create-array name=emptime size={num_emps}
field=report:integer={false}
field=emplid:char=''
field=name:char=''
field=hours:number=0

The next issue is mapping EMPLID to the array.  The March 15, 2009 blog entry, “5 Uses: Load-Lookup Love Letter” addressed this in items 3 and 4.  Since we will report from the array, we’ll want the employees’ names, and let’s assume we want to sort by name.  The appropriate command will map EMPLID to an array index and the employee’s name.  The array index should leave the array sorted by name.

load-lookup
name=emp_to_name
table='(select EMPLID, NAME, ROWNUM from (select EMPLID, NAME from PS_PERSONAL_DATA order by NAME))'
key=EMPLID
return_value='ROWNUM || ''@'' || NAME'
rows={num_emps}

The Solution

First, create the array, load the lookup, and establish the begin and end date for the timecards.  Second, read the PS_JOB table and populate the report flag, emplid field, and name field.  Third, read the PS_TL_RPTD_TIME table.  Populate the report flag, emplid field, and name field, even if that’s redundant.  Sum the hours into the hours field.  Fourth, scan the array and print any row where the report flag is {true}.

In order to meet the more probable business need, we have no need for the fancy SQL from last week’s blog entry.  In fact, making the SQL more fancy gave us the wrong results.  Sadly, this creativity-stifling truth is often the case when we are adding value in the real world.  The compensation is the satisfaction of satisfying our customers and, of course, our compensation.

2 Comments

  1. Hi,

    Interesting blog that I stumbled upon today – didnt know people were still writing about PeopleSoft. Have a question: there used to be a tool called Convoy for autogenerating the code for interface SQRs. I am unable to find a download. Any similar tool that you are aware of?

    Invite you to visit my blog http://blog.hardeep.name

    Regards
    Hardeep

  2. administrator says:

    Thanks for your note, Hardeep. Here’s more than you probably want to know about Convoy, from my SQR Timeline articles last December.

    In 1997, Convoy Corporation of Emeryville, California introduced Convoy/DM (data migration). It was a software product that generated SQR ETL (extract, transform, and load data) programs, running on Windows NT, Unix, AS/400, and MVS servers. Convoy was founded in 1993 as an independent project management software reseller.

    June 1999: New Era of Networks Inc. (NEON) of Denver, Colorado agreed to acquire Convoy for about $42 million in stock. Convoy’s CEO John Valencia became a Senior Vice President of NEON. Convoy had had $10.4 million in revenue for fiscal year ended March 31, 1999.

    In third quarter 2000, NEON posted record revenues and earnings by selling software to other companies in exchange for equity in those companies; booking the sales as non-cash revenue. When this was disclosed, NEON stock dropped over 50% on November 21, 2000. In June 2001, Sybase completed acquisition of NEON in an exchange of stock worth about $373 million – a decline of 91% from NEON’s 2000 high.

    So, if anything remains of Convoy/DM, you’ll find it at Sybase. The only other SQR code generator I know is Rwiz from a company called “Enterprise Information Resources,” but I think it generates report programs. There are other ETL products unrelated to SQR; I recommend you search the web for that acronym.

    Update: You should also investigate SQRIntegrator.