Calculation accuracy with Excel
Numerical calculation vs digital calculation
Who has never played ‘Dune II’ or some other video game to the point of being almost totally absorbed? Getting over the wall and moving on to the next level became real personal goals! And to do this, we would often have to ‘rig’ our PC (the Vespa was already fixed!). As the game got harder, skill alone was no longer enough to win: we also needed a computer that was up to the task, with higher performance, a more powerful processor, or a better video or sound card.
The same is true now (as adults!) when we have to perform calculations on the computer. The more precise the calculation we have to do, the more precision we need from the tool that performs it. Depending on the objective and the standards required, each of us then evaluates and uses the software appropriate to our needs. Leaving aside the elaborate calculation software created for niche sectors such as scientific research, Excel is certainly the most widespread calculation tool in the world and, as such, its use has become standard. In this sense we can say that Excel has become the modern calculator! (No offence to anyone at Microsoft… I know it’s much more than that!).
So if you are like me and, whatever your profession or role in the world, when you have to do any kind of calculation, you always use Excel as your first calculation tool, then you will have noticed some oddities in certain results … this is due to the representation of numbers in the spreadsheet.
The basic problem in all this is that the real world is very different from the one simulated and reproduced by a computer. While we reason by trying to translate the real world into numbers, these numbers are obtained by implementing algorithms that will be processed by a computer. The results of calculations made with computers, such as a spreadsheet, are therefore “compromises”.
In this article I want to talk about the difference between numerical calculation and its representation in the spreadsheet, and about how we can close this gap with programming techniques in Excel that solve the problem neatly without forcing us to change calculation software
… and so move on to the next wall using the same PC as in a video game!
Representation of numerical data
In general, when we make calculations, there are
- initial data
- operations allowed on this data
- final data or results
Together, the data and the operations allowed on it make up what we call a data type.
In the real world, the numerical data types are the number sets of mathematics, N, Z, Q and R, that is, the natural, integer, rational and real numbers (let’s not go into this subject in depth… it was boring enough at school). When we entrust these calculations to a computer, as with an ordinary spreadsheet, we have only a limited number of numerical representations, mainly integers (without a decimal point) and decimal numbers (with a decimal point), each available in several variants.
Floating point numbers are the computer’s representation of the real numbers of mathematics: a kind of scientific notation in which a number is written as a sign, a fractional part and a power with a signed exponent, except that the base of the power is not 10 but 2! Thus:
N = ± (0.a1a2…an) × 2^p
± is the sign of the number (positive or negative)
a1a2…an, called the mantissa, is a sequence of binary digits that holds the significant digits of the number (in absolute value) and determines the precision of the representation
p, the exponent, is a signed integer and sets the order of magnitude of the number.
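As a minimal sketch of this decomposition, Python's standard `math.frexp` splits a float into a mantissa in the interval [0.5, 1) and a power-of-two exponent, which corresponds to the ± 0.a1a2…an × 2^p form above. Python is used here only as a stand-in for experimenting outside Excel, and the helper name `decompose` is an illustrative assumption, not something from Excel or VBA.

```python
import math

def decompose(x: float) -> tuple:
    """Return (sign, mantissa, exponent) such that x == sign * mantissa * 2**exponent."""
    # frexp returns a mantissa in [0.5, 1) for any non-zero input
    mantissa, exponent = math.frexp(abs(x))
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    return sign, mantissa, exponent

sign, mantissa, exponent = decompose(-6.0)
print(sign, mantissa, exponent)                  # -6.0 is -1 * 0.75 * 2**3
print(sign * mantissa * 2**exponent == -6.0)     # the three parts rebuild the number
```

The mantissa 0.75 is 0.11 in binary, so it really does have the shape "zero point, followed by binary digits".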
Since the computer has limited space, the representation of numbers is necessarily limited as well. Specifically, single precision floating point numbers are represented using 32 bits, divided as follows: 1 bit for the sign, 8 bits for the exponent and 23 bits for the mantissa.
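The 1 + 8 + 23 bit layout can be inspected directly. The sketch below assumes the IEEE 754 single precision format, in which the exponent is stored with an offset (bias) of 127 and the leading 1 of the mantissa is implicit; those two details come from that standard rather than from the text above, and the function name `float32_fields` is hypothetical.

```python
import struct

def float32_fields(x: float) -> tuple:
    """Split x, stored as a 32-bit IEEE 754 float, into (sign, exponent, mantissa) bit fields."""
    # Pack as a 4-byte float, then reinterpret the same bytes as an unsigned integer
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF         # 23 bits, the leading 1 is not stored
    return sign, exponent, mantissa

sign, exponent, mantissa = float32_fields(0.15625)
print(sign, exponent - 127, bin(mantissa))
```

For 0.15625, which is 1.25 × 2^-3, the stored exponent field is 124 (that is, 127 - 3) and only the fractional part .25 of the mantissa appears in the 23 bits.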
This is where the problems arise. In digital calculations, we have to keep in mind that the representation of numbers must be contained within a limited space!
The Gordian knot of it all is that while we reason in a decimal numbering system, the computer works with a binary numbering system, i.e. it works with bits, bytes and multiples of them. In order to bring the real world (decimal) into contact with the digital world (binary), a way had to be found whereby each decimal number corresponded uniquely to a binary number. This two-way correspondence between the two worlds inevitably led to some limiting consequences.
Firstly, since every piece of data must be stored in a group of bits, it follows that numerical data must be limited, i.e. they have a minimum and a maximum that can be represented … which is no small matter!
The second important aspect is that the digital representation of real numbers, floating point numbers, offers a finite number of significant digits of precision, which becomes a major limitation if a real number has infinite decimal digits. This leads to an initial rounding error which is then iterated and amplified in the execution of calculations that we will see later.
Finally, a number generally needs far more digits in base 2 than in base 10, and this often leads to inconsistencies. It may happen, for example, that numbers with a finite decimal representation, such as 0.1, have an infinite, repeating expansion in binary and therefore cannot be stored exactly.
As we can see, the limitations of the digital representation of real numbers are the cause of the anomalies that sometimes occur when we do calculations with software such as Excel.
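A quick way to see this for yourself, sketched here in Python because its `float` is the same 64-bit binary format discussed in the next section: the decimal number 0.1 is finite in base 10, yet the value actually stored is only the nearest binary fraction. The variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

stored = Decimal(0.1)     # exact decimal expansion of the double nearest to 0.1
ratio = Fraction(0.1)     # the same stored value written as an exact fraction

print(stored)                        # slightly above 0.1, with a long tail of digits
print(ratio.denominator == 2**55)    # the denominator is a power of two, never 10
print(0.1 + 0.2 == 0.3)              # the tiny representation errors do not cancel
```

The last line prints False: each of the three constants carries its own small rounding error, so the sum of the first two is not the same stored value as the third.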
Data in Excel
Excel uses double precision floating point numbers, called doubles, which range from -1.79769313486231E308 to -4.94065645841247E-324 for negative numbers and from 4.94065645841247E-324 to 1.79769313486232E308 for positive numbers (by convention, E stands for “times 10 to the power of”). In this case 64 bits are used to represent a real number, which is why it is sometimes called a “long” real number. Within that range, however, only about 15 significant decimal digits are kept.
If we need good accuracy for our calculations, then, Excel is a sound tool, since the double type is the most common floating point type across programming languages and is adequate in most cases.
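The limits of the double type can be checked outside Excel as well. This is a small sketch using Python's `sys.float_info`; it describes the 64-bit double format in general, and Excel may apply additional limits of its own on what a cell accepts. The variable names are assumptions chosen for the example.

```python
import sys

largest = sys.float_info.max     # largest finite double
smallest = 5e-324                # smallest positive double (a so-called subnormal)

print(largest, smallest)
print(largest * 2)               # beyond the maximum: overflows to inf
print(smallest / 2)              # below the minimum: underflows to 0.0
print(sys.float_info.mant_dig)   # bits of mantissa, counting the implicit leading 1
print(sys.float_info.dig)        # decimal digits that are always preserved
```

The last two lines are the ones that matter for accuracy: the range is enormous, but the number of significant digits inside that range is fixed.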
However, there are situations in which this accuracy is not sufficient because high-precision data is used. Just think of numbers such as the mass of the Earth (approx. 5,973,600,000,000,000,000,000,000 kg) or the mass of the electron (approx. 0.00000000000000000000000000000091 kg) … how can we handle such numbers?
Example with Excel
There is no need to bring in astronomy or nuclear physics: it is enough to consider that Excel is routinely used for calculations in financial companies. In this context, since we have to process quantities with many decimal places, using the double type can produce anomalies (due to rounding) and, ultimately, dangerous errors in the results of financial calculations.
If, for example, we want to calculate the future value of a capital that we would like to invest today, financial mathematics expresses this capitalization as a recursive sequence. And it is precisely when computing numerical sequences that Excel often produces real ‘oddities’.
An example. Let’s take the numerical sequence
a_0 = 1/3
a_{n+1} = 4 a_n - 1
Mathematically, this is the constant sequence with value 1/3, but according to Excel it diverges to -∞. If you reproduce this simple calculation in an Excel spreadsheet, you will see that as early as the third step the values of a_n begin to decrease from the constant value 0.3333333333333 (= 1/3), reaching zero at the twenty-seventh iteration (a_27). From that point on, the sequence takes negative values that tend towards -∞.
This is a typical example of how using a lower precision data type causes colossal errors in calculations.
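The reason is that the stored value of 1/3 is very slightly smaller than the true 1/3, and each step of the recurrence multiplies that initial error by 4. The following sketch reproduces the behaviour in Python, whose `float` is a 64-bit double, and compares it with exact rational arithmetic; the function name `iterate` is an illustrative assumption.

```python
from fractions import Fraction

def iterate(a0, steps):
    """Return [a_0, ..., a_steps] for the recurrence a_{n+1} = 4*a_n - 1."""
    values = [a0]
    for _ in range(steps):
        values.append(4 * values[-1] - 1)
    return values

doubles = iterate(1 / 3, 30)            # binary floating point, as in a spreadsheet cell
exact = iterate(Fraction(1, 3), 30)     # exact fractions, for comparison

for n in (0, 3, 10, 20, 27, 28, 30):
    print(n, doubles[n], exact[n])
```

With doubles the value reaches zero at step 27 and is already -1 at step 28, while the exact version stays at 1/3 for every step.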
You are probably thinking that this is not your problem. Your calculations don’t need that much precision. Are you sure?
Let me give you another example from Microsoft’s support site.
Try entering a simple formula like this into a cell:
=(43.1-43.2)+1
These are not particularly large or strange numbers. Everyone could do a calculation like this.
The result should be 0.9 and in fact our trusty Excel will diligently return a nice 0.9.
So where is the problem? Try increasing the number of decimals displayed in the cell to 15 and you will notice that the number will become 0.8999999999999990000. Not quite the same. Imagine if your accountant did your taxes without taking this fact into account.
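The same effect can be reproduced in any environment that uses doubles. Here is a hedged sketch in Python: the format strings stand in for the number of decimals displayed in the cell, and the final rounding step is one common workaround rather than the only possible one.

```python
result = (43.1 - 43.2) + 1

print(f"{result:.1f}")             # one decimal shown: looks like a clean 0.9
print(f"{result:.15f}")            # fifteen decimals shown: the error becomes visible
print(result == 0.9)               # a direct comparison with 0.9 fails
print(round(result, 10) == 0.9)    # rounding to the digits actually needed restores equality
```

In a worksheet, the equivalent of the last line is to wrap the formula in Excel's ROUND function, for example =ROUND((43.1-43.2)+1, 10).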
What now? Don’t panic … we can adopt a different representation of real numbers, one that offers more significant digits. In these cases we can use a data type better suited to calculations that require a large number of significant digits and as little rounding error as possible.
The Decimal type, available in VBA as a subtype of Variant, represents numbers between positive 79,228,162,514,264,337,593,543,950,335 and negative 79,228,162,514,264,337,593,543,950,335 and, although it does not eliminate rounding errors, it does minimize them.
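Where do the limits of the Decimal type come from? According to the VBA documentation, a Decimal is stored as a 96-bit unsigned integer scaled by a power of 10, with up to 28 digits after the decimal point. The short Python check below only illustrates that arithmetic; it does not use VBA's Decimal itself, and the variable names are assumptions.

```python
from decimal import Decimal

max_decimal = 2**96 - 1                   # largest value of a 96-bit unsigned integer
print(f"{max_decimal:,}")                 # printed with thousands separators

smallest_step = Decimal(1).scaleb(-28)    # finest step when all 28 decimal places are used
print(smallest_step)
```

Because the scaling uses powers of 10 rather than powers of 2, decimal fractions such as 0.1 or 0.0001 are stored exactly in this type.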
To understand the difference between the two types of data and their use in calculations, let’s take a simple example. Let’s use VBA (Visual Basic for Applications) to create two functions that add up the number 0.0001 10,000 times, so that the sum should be 1. The first function, ConteggioDouble, uses the data type Double, while the second function, ConteggioDecimal, uses the data type Decimal. Here is the code:
Function 1 (with numbers of type Double)
Public Function ConteggioDouble() As Double
    Dim X As Double
    Dim I As Integer
    X = 0#
    ' Add 0.0001 ten thousand times using binary floating point
    For I = 1 To 10000
        X = X + CDbl(0.0001)
    Next I
    ConteggioDouble = X
End Function
Function 2 (with Decimal (Variant) type numbers)
Public Function ConteggioDecimal() As Variant
    Dim X As Variant
    Dim I As Integer
    ' Start from a Decimal zero so that the whole sum is carried out in Decimal
    X = CDec(0)
    For I = 1 To 10000
        X = X + CDec(0.0001)
    Next I
    ConteggioDecimal = X
End Function
The result of the two functions is surprising.
While the first function with doubles returns the value 0.99999999906, the second function with decimals (variant) returns the correct value 1!
At the expense of increased memory and computing power requirements, adopting the decimal data type proves to be more accurate in Excel calculations …
just as we used to power up the PC for video games in order to move on to the next wall, so we did with Excel in order to use the same tool for calculations requiring greater precision.
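For readers who want to try the comparison without opening the VBA editor, here is a rough Python analogue of the two functions. It is a sketch under the assumption that Python's `float` behaves like VBA's Double and that the standard `decimal` module is an acceptable stand-in for VBA's Decimal; the names `count_double` and `count_decimal` are hypothetical translations of the originals.

```python
from decimal import Decimal

def count_double(steps=10_000):
    """Add 0.0001 repeatedly using binary floating point."""
    total = 0.0
    for _ in range(steps):
        total += 0.0001
    return total

def count_decimal(steps=10_000):
    """Add 0.0001 repeatedly using base-10 decimal arithmetic."""
    total = Decimal(0)
    step = Decimal("0.0001")    # built from a string, so it is exactly 0.0001
    for _ in range(steps):
        total += step
    return total

print(count_double())     # close to 1, but not exactly 1
print(count_decimal())    # exactly 1
```

The difference comes from the base of the representation more than from the number of digits: 0.0001 has no finite binary expansion, so every one of the 10,000 additions in the first function works with an approximation.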
What we saw
In the present article we have talked about:
- numerical representation
- types of data
- numerical calculation
- accuracy of calculations
- Functions in VBA
Notes and Insights
This is just an article that emphasises the importance of precision in calculations in general, but it does not claim to cover all the cases that can be found in the various disciplines and which concern different types of calculation. As we know, mathematics is everywhere and the precision of the result is a condition required of any professional.
It is difficult to find texts that deal specifically with this subject, the precision of calculation. If anything, it is easier to find references in texts and manuals dealing with specific topics. In the explanation above I used VBA code from the book “Excel and artificial intelligence for trading”, which has a whole chapter dedicated to VBA programming for Excel.
In this case, it is a sectoral text that also deals with basic programming concepts for non-programmers and addresses the problem of calculation accuracy in the financial field.
Although I should urge you to buy the text, since I am one of the authors, I feel I must ask you if you are interested in this topic. If the answer is yes, then I will be happy to spend the next few holidays writing articles on these topics and dealing with many more mathematical cases… instead of taking the family for pre-Christmas shopping (please save me from pre-Christmas shopping!!).