Rounding Error

Some people use the term “rounding error” to indicate that an answer is right enough, although not perfectly right.  Those people think “accurate” and “precise” are two different concepts.  They may be experimental scientists or programmers, but they aren’t working in Accounting.  Accountants cannot and should not leave journal entries alone with rounding errors.

Fourth Grade Rounding

We learn the mathematics of rounding shortly after learning about decimal places.  Rounding is the process of finding the closest number with fewer significant digits; 123.456 rounds to 123.46 or 123.5 or 123 or 120 or 100 or 0 (the closest thousand).  The algorithm for rounding was simple, but the second step left some discretion to the child.

  1. Select the rightmost digit of the number, ignoring trailing zeroes.
  2. If the rightmost digit is 6, 7, 8, or 9, add one in the place to the left of that digit.  If the rightmost digit is 5, it is optional to add one in the place to the left of that digit and we should do it half the time.
  3. Change the rightmost digit to zero.

Computer Rounding

In the early years of electronic computing, we discovered that rounding errors could be artifacts of computer hardware.  Numbers were calculated and stored in binary, and not converted to decimal until they were printed.  Different computers had different ways of storing fractions, and very different ways of storing floating point numbers.  A calculation like “1 divided by 3, then multiplied by 3″ would produce a binary value equivalent to 0.9999999, and it was up to the compiler to output that as 1.

At some point, a consensus formed that computers would not round 0.5 up or down arbitrarily, nor would it toggle between those choices.  0.5 or 0.0005 would always round up.

SQR Rounding

There are functions for rounding in SQR and many other languages.  I wrote about some of them in SQR Functions.  Four functions can reduce the number of significant digits in a number; edit, floor, round, and trunc. Edit returns a string, but we can set a number equal to that string.  There are four commands that have an option to round their results; add, subtract, multiply, and divide.

The critical fact about rounding numbers with built-in SQR functions is that they follow the computer industry consensus: always round up when the last digit is five.

The Problem

My assignment was to write an interface from the Peoplesoft Time & Labor and Payroll modules to a third party General Ledger system.  An employee might have a paycheck for 80 hours and $2000.  That employee’s timecards might indicate that there were 20 hours charged to account code X, 25 hours for account code Y, and 35 hours for Z.

My program takes the amount from the paycheck and prorate it over the three account codes ($500.00, $625.00, and 875.00).  There were also employer-paid benefits and taxes, totaling $321.98 for example, to be spread over X, Y, and Z ($80.495, $100.61875, and $140.86625).  My program should collect all the charges to each account code from every employee, and generate a single journal entry for the total amount.

My program outputs the details for each employee and each account to a database table.  It outputs the totals for each account to an export file.

My customers are accountants.  The grand total in the export file, in the database table, and in the payroll report should match to the penny.  Any query of the paycheck tables and the labor distribution detail table should match to the penny, whether we group by earnings codes, department, or employee.  Each account code in the export file should match to the penny with a query of the labor distribution detail table.

The decision to always round up with fives is based on the assumption that numbers are randomly, evenly distributed; if we are rounding six decimal places to integers, only one number in a million ends in .500000.  But suppose we are rounding six decimal places to dollars and cents.  There are one hundred numbers in a million that take the form .xx5000.  That gives a rounded total that is fifty cents above the unrounded total.

If there is a bias toward fives, the numbers are not evenly distributed and the rounding error can greater.  There are several reasons for a bias towards five.

  • There are 80 hours in a pay period.  Any odd integer divided by 80 will have a decimal portion that ends in five.
  • The timecard system only allows quarter hour intervals, which produce decimal portions that end in five.
  • Some employees use task profiles to assign their hours to account codes.  The task profiles often divide the time by fractions whose decimal portions end in five.
  • Multiplying a number that ends in five by any other number produces a number that ends in five about 55% of the time.  (If you’re wondering why it isn’t 50% of the time, ask me.)

Most management decisions would be unaffected by a $3 discrepancy in a $20 million payroll.  Most managers wouldn’t even spend the time to calculate what miniscule percentage that represents.  The exception is the decision whether to “close the books” on that accounting period changes from “yes” to “no.”

The Solution

I’ve hinted at the solution in my leadup, but it didn’t occur to me immediately.  I wrote the proper formulae, summed the results, and formatted the numbers for writing.  Weeks passed while the accountants ran tests and identified issues, and I made changes and released new versions for testing.  After every thing else was fixed, the numbers were good enough to reveal the rounding errors.

At first I was stumped; what kind of bug would calculate five thousand totals correctly … plus or minus ten cents?  What kind of bug would get most of the numbers perfectly, and the grand total perfectly, yet leave some totals slightly off?

I added “debug code” to the program to show rounded and unrounded quantities in the log file, then to count the occurrences of .xx5000.  That’s when I realized that I had to invent a new system for rounding numbers.

Most software can use the built-in functions to round each number in isolation.  Most accounting software might round every number like that, except the last one, and change the last number to make the grand totals balance.  My accountants set a higher standard; not only must the grand totals balance, but the list of numbers could be grouped in a few different ways and all of the groups in all of those ways must balance too.

Rather than write data to a file, I stored two values for each quantity in an array, the exact value and the value rounded to the nearest penny.  I counted the number of quantities in the form .xx5000.  I also summed the two values for the entire output, and calculated the difference.  As expected, the difference is typically hundreds of cents.  Fortunately, there are always more values in the form .xx5000 than the number of cents in the difference.

The next step is to select some of the .xx5000 unrounded values and subtract one cent from the rounded values.  I don’t want to select the first N values or the last N values.  That would guarantee that some groups would be rounded too low and other groups would be rounded too high.  I need to spread the effect as evenly as possible.

If there were 360 values in the form .xx5000 and I had to subtract 120 cents, the spreading algorithm would be simple.  If there were 343 values and I had to subtract 163 cents, what should I do?

  1. Divide the number of .xx5000 values remaining by the number of cents remaining to deduct.  Take the integer portion of that (let #N = floor(343/163)), in this case 2.
  2. Scan the array of unrounded values to find the Nth (second) value in the form .xx5000.  Subtract one cent from the rounded value.
  3. Now there are N fewer values, and one fewer cent to deduct; 341 and 162.  Update those counters and return to step 1 to continue through the array.

When I finish the array, the corrections are spread evenly (roughly) throughout the array.  The array is ready for the export file.

3 Comments

  1. david nicol says:

    presumably you adopted the chemist’s best practice of making all calculations using all precision available and deferred rounding to the appropriate accuracy to the display (and comparison) phase? Or were you doing that all along, and wrote this because you had so many ends-in-fives that you multiplied together that the errors got up to the significant part?

    Premature rounding is a form of premature optimization, when classifying evils.

    • administrator says:

      Absolutely. I didn’t round until right before I wrote the data to the export file, and the export file format imposed the rounding requirements (dollars and cents without a decimal point). The ends-in-fives bias was due to the nature of the input values and type of operations performed, and would have appeared even with infinite-precision arithmetic. Perhaps its similar to the oft-noted bias for numbers to begin with 1.

      Interestingly, when I first inherited the program, I discovered that some numbers were being prematurely rounded to pennies and others weren’t, leading to grand total errors between 10 and 100 dollars.