Skip to the content.

How I used middle-school arithmetic to solve a Redshift migration issue

One of the most annoying issues I encountered while migrating from Oracle to Redshift could be solved with middle-school arithmetic.

No, seriously.

Redshift has a bunch of complex rules governing the precision and scale of results of basic arithmetic operations.

The one for multiplication makes some intuitive sense; xx.xx * yy.yy = zzzz.zzzz.

The one for division is really hard to understand. I even resorted to making a calculator to experiment with different scenarios. I’m not sure why it works the way that it does, but at least it’s documented.

One consequence of the Redshift behavior is that when you calculate A/B, the more decimal places in B, the fewer in the result. (This of course assumes any integer arguments are casted to decimal to avoid integer division.) So if A and B are both decimals, if you want to get more decimal places in A/B, you may have to round B first, sacrificing some accuracy.

Here’s where the math comes in. My specific scenario was the result of a nested division, A/(B/C). In my case, A was decimal(19,2) (dollars and cents), and B and C were bigint. If I cast B and C to decimal(19,0), then B/C would be decimal(38,19) and A/(B/C) would fall back to (38,4) – only four decimal places. If I want more decimal places in the result, I have to round B/C, which will get me closer to matching Oracle but with some error a few more decimal places out.

But what if you change the formula to A * (C/B), which is exactly the same thing?

The intermediate result is the same (38,19) but because you’re multiplying, you now get a result of (38,21) – with multiplication, the two operand scales are added, and the precision maxes out at 38. Further if you run into an overflow issue, you can round the intermediate result, but we’ll have better understanding of the impact on the accuracy of the final result.

Written on April 15, 2019