Monday 16 May 2016

BigDecimal Gotchas






by Sarang Nagmote



Category - Data Analysis




Today I'd like to share some hints on BigDecimals: how and when to use them, what to watch out for, and how to avoid strange but common errors.

When To Use Them

I think this is the most straightforward of them all. BigDecimals should be used when exact results with arbitrary precision are needed for numerical calculations.
Doubles can cause problems when exact answers are needed, because not every decimal fraction can be represented exactly as a binary floating-point double (see the representation of floating-point numbers). Here's the world's simplest source code to show this:
System.out.println(5.8 + 5.6);
The above code snippet will display 11.399999999999999 instead of the expected 11.4. If this had been some application for monetary calculations, we would have just lost (or won, it depends) a tiny amount of money. 
OK, so let's fix this. Since this is a post about BigDecimals, our first attempt is to wrap those double values, like this:
BigDecimal first = new BigDecimal(5.8);
BigDecimal second = new BigDecimal(5.6);
System.out.println(first.add(second));
And the result is 11.39999999999999946709294817992486059665679931640625. Again, not the 11.4 we expected. We are facing the very same issue as before, only with more digits. This should be no surprise; to quote the Javadoc for the constructor BigDecimal(double):
"Translates a double into a BigDecimal which is the exact decimal representation of the double's binary floating-point value."
Yep, this constructor really just wraps the double values; it applies no extra smarts to save us from trouble. Let's try it differently, with our numbers expressed as strings:
BigDecimal first = new BigDecimal("5.8");
BigDecimal second = new BigDecimal("5.6");
System.out.println(first.add(second));
And yes, the result finally is 11.4, the value we were looking for. I guess this is the first moral of this story: when turning a decimal value into a BigDecimal object, one should use the constructor that takes a String rather than the one that takes a double.
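To put the variants side by side, here is a minimal sketch. The class and method names are mine, purely for illustration. It also shows BigDecimal.valueOf(double), which is not covered above: it converts via Double.toString(double), so it produces the short decimal form instead of the exact binary expansion.

```java
import java.math.BigDecimal;

class BigDecimalConstruction {
    // Built with the double constructor: inherits the binary rounding error.
    static BigDecimal sumFromDoubles() {
        return new BigDecimal(5.8).add(new BigDecimal(5.6));
    }

    // Built with the String constructor: exact decimal values.
    static BigDecimal sumFromStrings() {
        return new BigDecimal("5.8").add(new BigDecimal("5.6"));
    }

    // BigDecimal.valueOf(double) goes through Double.toString(double),
    // so 5.8 becomes the decimal 5.8 rather than its full binary expansion.
    static BigDecimal sumFromValueOf() {
        return BigDecimal.valueOf(5.8).add(BigDecimal.valueOf(5.6));
    }

    public static void main(String[] args) {
        System.out.println(sumFromDoubles()); // the long, slightly-off value
        System.out.println(sumFromStrings()); // 11.4
        System.out.println(sumFromValueOf()); // 11.4
    }
}
```

If the value already arrives as a double, valueOf is probably the less surprising choice; if it arrives as text (user input, a file, a database column), the String constructor avoids the detour through double altogether.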

When Not To Use Them

This one is pretty simple too. Since BigDecimals are for exact values, they should not be used when an approximation suffices. That's simply because they are pretty heavyweight.
Let's suppose we want to create a simple application that iteratively approximates the value of Pi. A really old (and I guess outdated by now) formula for approximating Pi (taken from Wikipedia) is:
\pi = \sqrt{12} \sum_{k=0}^{\infty} \frac{(-1/3)^k}{2k+1}
Now let's try to apply this formula for k=2000, using first BigDecimals and then doubles to see what happens.
First, the BigDecimal version:
private String approximate() {
    BigDecimal sqrtTwelve = new BigDecimal("3.4641016151377544"); // pretty good approximation
    BigDecimal pi = BigDecimal.ZERO.setScale(16);
    for (int i = 0; i < LENGTH; i++) {
        BigDecimal numerator = MINUS_ONE_THIRD_BD.pow(i);
        BigDecimal element = numerator.divide(new BigDecimal(2 * i + 1), RoundingMode.HALF_EVEN);
        pi = pi.add(element);
    }
    pi = pi.multiply(sqrtTwelve).setScale(16, RoundingMode.HALF_EVEN);
    return pi.toString();
}
and now with primitive double values:
private String approximate() {
    double sqrtTwelve = Math.sqrt(12.0);
    double pi = 0.0;
    for (int i = 0; i < LENGTH; i++) {
        double element = Math.pow(MINUS_ONE_THIRD_DOUBLE, i) / (2.0 * i + 1.0);
        pi = pi + element;
    }
    pi = pi * sqrtTwelve;
    return pi + "";
}
I did the usual millisecond-based elapsed time measurement to see how these code snippets perform. I also set my heap large enough to avoid garbage collections during those 2000 iterations, which might be cheating, as garbage collection only affects the BigDecimal version and can be a real overhead in large systems.
Here are the results:
Version             Result                Running time (ms)
Pi's real digits    3.1415926535897932    N/A
BigDecimal          3.1415926535897931    8315
double              3.141592653589794     3
Well, both versions approximated Pi really well. The BigDecimal version got 15 decimals right, while the double version got 14. However, the latter was roughly 2,770 times faster (8315 ms versus 3 ms). In this case, the result we're getting with doubles is more than good enough, so doubles should be preferred over heavyweight BigDecimals, given the huge performance gain.
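For anyone who wants to play with the comparison, here is a self-contained sketch of the same series. It is not the code I measured: the class name, TERMS and the 25-digit MathContext are assumptions made for this example, and the BigDecimal version keeps a running power with bounded precision instead of calling pow(i) on every iteration. The timing in main is the naive System.nanoTime() kind, so treat the numbers it prints as rough.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class PiSeries {
    // Hypothetical settings for this sketch, not the ones behind the table above.
    static final int TERMS = 40;
    static final MathContext MC = new MathContext(25);

    // sqrt(12), written out to more digits than MC keeps.
    static final BigDecimal SQRT_TWELVE = new BigDecimal("3.4641016151377545870548926830117");

    static BigDecimal piBigDecimal() {
        BigDecimal minusOneThird = BigDecimal.ONE.negate().divide(BigDecimal.valueOf(3), MC);
        BigDecimal power = BigDecimal.ONE; // (-1/3)^0
        BigDecimal sum = BigDecimal.ZERO;
        for (int k = 0; k < TERMS; k++) {
            // term k of the series: (-1/3)^k / (2k + 1)
            sum = sum.add(power.divide(BigDecimal.valueOf(2L * k + 1), MC), MC);
            power = power.multiply(minusOneThird, MC);
        }
        return sum.multiply(SQRT_TWELVE, MC);
    }

    static double piDouble() {
        double power = 1.0; // (-1/3)^0
        double sum = 0.0;
        for (int k = 0; k < TERMS; k++) {
            sum += power / (2.0 * k + 1.0);
            power *= -1.0 / 3.0;
        }
        return sum * Math.sqrt(12.0);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        BigDecimal exactish = piBigDecimal();
        long middle = System.nanoTime();
        double approx = piDouble();
        long end = System.nanoTime();
        System.out.println("BigDecimal: " + exactish + " in " + (middle - start) + " ns");
        System.out.println("double:     " + approx + " in " + (end - middle) + " ns");
    }
}
```

Bounding the precision with a MathContext keeps the intermediate values from growing without limit, which is likely where a good part of the BigDecimal cost in a naive version comes from. Even so, I would still expect the double version to win comfortably.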

What to Remember

Let's create a simple JUnit test file to observe the behavior of BigDecimals compared to Doubles. It looks like this:
String pi1 = "3.14";
String pi2 = "3.140";
String pi3 = "3.1400";

@Test
public void test_doubles() {
    Set<Double> set = new HashSet<>();
    set.add(Double.parseDouble(pi1));
    set.add(Double.parseDouble(pi2));
    set.add(Double.parseDouble(pi3));
    assertEquals(1, set.size());
}

@Test
public void test_bigDecimals() {
    Set<BigDecimal> set = new HashSet<>();
    BigDecimal bigDecimal1 = new BigDecimal(pi1);
    BigDecimal bigDecimal2 = new BigDecimal(pi2);
    BigDecimal bigDecimal3 = new BigDecimal(pi3);
    set.add(bigDecimal1);
    set.add(bigDecimal2);
    set.add(bigDecimal3);
    assertEquals(1, set.size());
}
If we run the test cases above, we can see that the first one passes with no problem, while the second one fails with a message: "expected 1, but was 3". Even though 3.14 is equal to 3.140 and 3.1400, we are still getting strange results.
Let's see why. When we created our three BigDecimal objects, we specified different numbers of decimals (2, 3 and 4, respectively). When passed to the constructor, that number of decimals became the scale of each instance, that is, the number of digits to the right of the decimal point (note the calls to setScale(...) in the previous example too). The first instance has a scale of two, the second three, and the third four. BigDecimal's equals() method takes into account both the numeric value and the scale. Even though they are equal from a mathematical point of view, our instances differ in scale, so the equality check fails. That's why we end up with three elements in the set instead of one.
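To see this directly rather than through a failing test, here is a small sketch that prints what two of those instances look like on the inside. The class and field names are made up for the example; scale(), precision(), unscaledValue(), equals() and compareTo() are the regular BigDecimal methods.

```java
import java.math.BigDecimal;

class ScaleAndEquality {
    static final BigDecimal A = new BigDecimal("3.14");
    static final BigDecimal B = new BigDecimal("3.140");

    public static void main(String[] args) {
        System.out.println(A.unscaledValue()); // 314
        System.out.println(A.scale());         // 2
        System.out.println(A.precision());     // 3

        System.out.println(B.unscaledValue()); // 3140
        System.out.println(B.scale());         // 3
        System.out.println(B.precision());     // 4

        System.out.println(A.equals(B));       // false: same number, different scale
        System.out.println(A.compareTo(B));    // 0: numerically equal
    }
}
```

A BigDecimal is essentially an unscaled integer plus a scale, and equals() compares both parts.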
If, however, we set equal scales for our three BigDecimal instances like this:
@Test
public void test_bigDecimals() {
    Set<BigDecimal> set = new HashSet<>();
    BigDecimal bigDecimal1 = new BigDecimal(pi1).setScale(4);
    BigDecimal bigDecimal2 = new BigDecimal(pi2).setScale(4);
    BigDecimal bigDecimal3 = new BigDecimal(pi3).setScale(4); // this is redundant, btw
    set.add(bigDecimal1);
    set.add(bigDecimal2);
    set.add(bigDecimal3);
    assertEquals(1, set.size());
}
Then everything works fine again; both the numeric values and scales are equal so the instances are considered equal.
To make this even more interesting, let's remove the setScale(4) statements and change the object type of our set from HashSet to TreeSet, like this:
@Test
public void test_bigDecimals() {
    Set<BigDecimal> set = new TreeSet<>();
    BigDecimal bigDecimal1 = new BigDecimal(pi1);
    BigDecimal bigDecimal2 = new BigDecimal(pi2);
    BigDecimal bigDecimal3 = new BigDecimal(pi3);
    set.add(bigDecimal1);
    set.add(bigDecimal2);
    set.add(bigDecimal3);
    assertEquals(1, set.size());
}
Surprise: now the test passes again. That's because, unlike HashSet, TreeSet is a SortedSet, where element equality is based on the outcome of compareTo() rather than equals(). BigDecimal is one of those classes in the JDK whose equals() and compareTo() are inconsistent with each other. compareTo() does not require the scales to be equal to consider two instances equal, hence we end up with one element in the set (just as we want).
In order to receive mathematically correct results, BigDecimals should always be compared using the compareTo() method.
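If a hash-based collection is still needed, one possible workaround (my suggestion, not something the tests above use) is to bring every value to a canonical form with stripTrailingZeros() before storing it. The class and method names below are hypothetical.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;

class NormalizedBigDecimals {
    // Numeric equality that ignores scale.
    static boolean sameValue(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    // Stores each value without trailing zeros, so numerically equal
    // inputs end up equal according to equals() and hashCode() as well.
    static Set<BigDecimal> distinctValues(String... decimals) {
        Set<BigDecimal> set = new HashSet<>();
        for (String s : decimals) {
            set.add(new BigDecimal(s).stripTrailingZeros());
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(sameValue(new BigDecimal("3.14"), new BigDecimal("3.1400"))); // true
        System.out.println(distinctValues("3.14", "3.140", "3.1400").size());            // 1
    }
}
```

Keep in mind that stripping zeros changes how the value prints (100 becomes 1E+2 with toString(), for example), so it suits keys and comparisons better than values that get displayed.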
Oh, and one last thing, just for the sake of completeness: BigDecimals are immutable; every modification results in a new object being created. That is why the following code is broken, and prints "Whooops" instead of "Math makes sense!":
BigDecimal bigDecimal1 = new BigDecimal(0);
BigDecimal bigDecimal2 = new BigDecimal(10);
for (int i = 0; i < 10; i++) {
    bigDecimal1.add(BigDecimal.ONE); // this should be bigDecimal1 = bigDecimal1.add(BigDecimal.ONE);
}
if (bigDecimal1.compareTo(bigDecimal2) == 0) {
    System.out.println("Math makes sense!");
} else {
    System.out.println("Whooops");
}
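For comparison, here is a sketch of the working variant, wrapped in a hypothetical helper so it can be run on its own. The only real change is that the result of add() is assigned back.

```java
import java.math.BigDecimal;

class ImmutableAdd {
    // Adds BigDecimal.ONE n times, keeping the new object that add() returns.
    static BigDecimal countTo(int n) {
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < n; i++) {
            total = total.add(BigDecimal.ONE);
        }
        return total;
    }

    public static void main(String[] args) {
        if (countTo(10).compareTo(BigDecimal.TEN) == 0) {
            System.out.println("Math makes sense!");
        } else {
            System.out.println("Whooops");
        }
    }
}
```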
