7 Little SQL Tricks

Here are some small but significant ways to speed up or streamline your SQR program.

1. Select Before Delete

In every pay period we added 200,000 rows to a table that already had twenty million rows. This process had to be repeatable, so it started by deleting any rows that might exist for the current pay period.

begin-procedure delete_pay_period_rows
begin-sql
  delete from PS_HUGE_TBL
  where PAY_PERIOD = $current_pay_period
end-sql
end-procedure delete_pay_period_rows

We rarely needed to rerun the program, but the database reviewed every row, looking for something to delete. Oracle doesn’t use indexes for delete. Instead of calling delete_pay_period_rows every time, I added this code:

begin-select loops = 1
PAY_PERIOD
  do delete_pay_period_rows
 from PS_HUGE_TBL
where PAY_PERIOD = $current_pay_period
end-select

Now Oracle can use the index for PAY_PERIOD, and usually discover that there is no need for a delete.

2. Get The Current PS_JOB Row

PS_JOB has four keys, EMPLID, EMPL_RCD, EFFDT, and EFFSEQ. The “current” row for each employee is the one with the latest EFFDT (effective date) that is still in the past, and the highest EFFSEQ (effective sequence number) for that date. If our organization uses multiple values of EMPL_RCD, we might also want to get its maximum value before selecting EFFDT and EFFSEQ.

(Side note: EMPL_RCD is “employee record number,” and it can be used to allow one employee to hold different jobs simultaneously in different departments with different pay rates. More recently, it can be used to distinguish one person’s differing roles as a job applicant, employee, and other.)

Normally, to select the current row for each current employee as of a particular date, we write:

begin-select
J.EMPLID
 from PS_JOB J
where J.EFFDT =  (select max(EFFDT)
                  from PS_JOB
                  where EMPLID = J.EMPLID
                  and EMPL_RCD = J.EMPL_RCD
                  and EFFDT <= $as_of_date)
  and J.EFFSEQ = (select max(EFFSEQ)
                  from PS_JOB
                  where EMPLID = J.EMPLID
                  and EMPL_RCD = J.EMPL_RCD
                  and EFFDT = J.EFFDT)
  and J.EMPL_STATUS in (‘A’,’L’,’P’,’S’)
end-select

However, if we want just the current employees (who haven’t terminated their employment) as of the day we run the SQR program, we can use the PS_EMPLOYEES table, which is refreshed every night. PS_EMPLOYEES has many of the same fields as PS_JOB, and it might replace PS_JOB entirely. However, if we still need PS_JOB, we can write:

begin-select
J.EMPLID
 from PS_EMPLOYEES E, PS_JOB J
where J.EMPLID = E.EMPLID
  and J.EMPL_RCD = E.EMPL_RCD
  and J.EFFDT = E.EFFDT
  and J.EFFSEQ = E.EFFSEQ
end-select

Ten years after we implemented Peoplesoft HR, the PS_EMPLOYEES table might be 1/20th the size of PS_JOB, and we’ve gone from two subselects to a single join.

3. Who Dropped The Deduction?

Part-time employees with irregular schedules may get a paycheck too small to pay for all their deductions. We can find those cases by joining PS_PAY_CHECK (for the EMPLID) with PS_PAY_DEDUCTION (for the deduction code and amount). First we select the two tables for the previous pay period to see who paid the deduction last time. Then we add a “not exists” subquery of the same two tables for the current pay period to see who didn’t pay the deduction this time. Slow, slow, slow.

Here’s a faster way:

begin-select
PC.EMPLID
 from PS_PAY_CHECK PC, PS_PAY_DEDUCTION PD
where PC.PAY_END_DT in ($prev_pay_end_dt, $curr_pay_end_dt)
  and PD.COMPANY = PC.COMPANY
  and PD.PAYGROUP = PC.PAYGROUP
  and PD.PAY_END_DT = PC.PAY_END_DT
  and PD.OFF_CYCLE = PC.OFF_CYCLE
  and PD.PAGE_NUM = PC.PAGE_NUM
  and PD.LINE_NUM = PC.LINE_NUM
  and PD.DEDCD = $deduction_code
group by PC.EMPLID
having max(PC.PAY_END_DT) = $prev_pay_end_dt
end-select

The having clause is an additional selection criterion that SQL applies to each group of rows created by the group by clause.

If you haven’t implemented Peoplesoft Payroll for North America yet, there is an option you could consider. There is a setup option that specifies how many paychecks (LINE_NUM) should be on each “page” (PAGE_NUM). The product defaults to seven. If you set it to one, then every paycheck will have LINE_NUM = 1, and you won’t have to specify PD.LINE_NUM = PC.LINE_NUM. I’ve never had a chance to test it. If you do, please let me know what happens.

4. Skip The Third Pay Period

Some benefits require deductions in the first two biweekly pay periods of each month, but not for the third pay period (if it occurs).

FSA (flexible spending accounts for medical or dependent care expenses) deductions are taken in the first two pay periods, so we export them to the carrier only twice a month. The carrier wants to know who started deductions in the pay period just ended. But what if someone “started” (COVERAGE_BEGIN_DT) in a non-deducting pay period? We want to report on him or her too.

begin-select
min(PAY_BEGIN_DT) &PC.COVERAGE_BEGIN_DT
 from PS_PAY_CALENDAR PC
where PC.COMPANY = $company
  and PC.PAYGROUP = $paygroup
  and ((SYSDATE - 28 between PC.PAY_BEGIN_DT and PC.PAY_END_DT and PC.PAY_PERIOD = 3)
       or
       (SYSDATE - 14 between PC.PAY_BEGIN_DT and PC.PAY_END_DT and PC.PAY_PERIOD < 3))
  and PC.PAY_OFF_CYCLE_CAL = 'N'
end-select

This selection will return a start date equal to the PAY_BEGIN_DT of the skipped pay period after the following pay period ends. If there wasn’t a skipped pay period, it will return the PAY_BEGIN_DT of the pay period just ended. Then we can select all rows in PS_FSA_BENEFIT whose COVERAGE_BEGIN_DT is greater than or equal to PC.COVERAGE_BEGIN_DT.

5. Union All

Whenever it seems appropriate to perform an outer join or a union in SQL, we should consider a union all command instead. Jeff Smith proposes better alternatives to a full outer join in this article (http://weblogs.sqlteam.com/jeffs/archive/2007/04/19/Full-Outer-Joins.aspx).

A union command will cause the database to collect all the results from two select statements in a temporary table, merge them, sort them, and eliminate the duplicates. If we don’t mind duplicates, or if we know there are no duplicates, we should use union all to bypass the merge, sort, and de-duplication processes.

6. Don’t Apply Functions To Columns

Sometimes we need to test some transformation of a column rather than the column itself, but if we test a transformation of an indexed column, Oracle (and perhaps other databases) won’t use the index. It will transform that column’s value in each row. Even if the column isn’t indexed, it’s faster to transform the constant value once rather than that column’s value in each row.

  • Slow: where substr(FIELD, 1, 3) = ‘ABC’
  • Fast: where FIELD like ‘ABC%’
  • Slow: where to_char(DATE, ‘mm/dd/yy’) = ’01/01/02’
  • Fast: where DATE = to_date(’01/01/02’, ‘mm/dd/yy’)
  • Slow: order by substr(FIELD, 1, 2), substr(FIELD, 3, 2)
  • Fast: order by FIELD

The last example was a compound key consisting of three values that were two characters each. The business requirements were to sort by the first two values. We need to recognize that sorting the entire field will accomplish the same thing.

7. Value Lists

Peoplesoft Time & Labor introduced a great idea, value lists. The PS_TL_VAL_LIST_DTL table has three fields, LIST_ID, LIST_TYPE, and VALUEGROUP. LIST_TYPE can be anything; union code, jobcode, deptid, position number, location, shift, company, paygroup, grade, step, employee status, employee type, full-time/part-time code, earnings code, deduction code, tax code, etc.

We’ve all written SQL selections that include “in list” clauses, with an arbitrary list of values. The database handles these inefficiently, because they are equivalent to a long list of tests, separated by “or.”

where ERNCD in ('REG','OVT','VAC')

Is the same as

where (ERNCD = 'REG' or ERNCD = 'OVT' or ERNCD = 'VAC')

With value lists, we can create a list and use it as a subquery or a joined table:

where ERNCD in (select VALUEGROUP
from PS_TL_VAL_LIST_DTL
where LIST_ID = 'ABC'
and LIST_TYPE = 'ERNCD')

or

from PS_TL_VAL_LIST_DTL TVLD, PS_PAY_OTH_EARNS POE
where TVLD.LIST_ID = 'ABC'
and TVLD.LIST_TYPE = 'ERNCD'
and POE.ERNCD = TVLD.VALUEGROUP

This also saves us from changing the program every time the list changes. So, create a version of PS_TL_VAL_LIST_DTL for yourself!

3 Comments

  1. Bob Josephson says:

    It’s a shame that non-Peoplesoft SQL users are unlikely to see this.

    • administrator says:

      Thank you. Maybe, someday, a SQL blogger will find this post and mention it on his or her blog.

  2. David Collins says:

    Very helpful SQL tips