Or, why floating point operations are slow, and how to avoid them.
A Brief Note on Notation
This article is geared toward programmers, but if you happen to be a non-programmer:
- The “^” means, “to the power of” or exponentiation. So 5^2 means “5 to the power of 2”, or 5 x 5 = 25.
- You might also see “*” – in computerese, this means “multiply”. So 5*5 = 5 x 5 = 25.
- 0x means “The following number is in hexadecimal.” So 0x10 is “Hex 10” or 16. Likewise, 0b means “the following number is in binary”.
Quick Recap of Integers
Computers use binary internally to store all numbers, and all internal calculations are also performed in binary.
For integers, this is fairly straightforward.
Let’s recall that integers are numbers without a decimal component. These are all integers:
- 0; 1; 2; etc…
- -1; -2; -3; etc…
- 999,999,999; -123,456
- 123 x 10^45; -5 x 10^32
To represent any integer number, you simply need enough bits (binary digits) such that 2^b > n.
For example, if I want to represent the integer 123,456, I need 17 bits. 2^17 = 131,072.
However, computers don’t just store an arbitrary number of bits – bits are grouped in multiples of 8, and that’s how we get our basic integer data types:

| Bytes (Bits) | Data Type | Unsigned Numerical Range |
|---|---|---|
| 1 (8) | Byte | 0 to 255 |
| 2 (16) | SmallInt | 0 to 65,535 |
| 4 (32) | Int | 0 to about 4.2 billion |
| 8 (64) | LongInt / QuadInt | 0 to about 18 quintillion |
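The type names vary by language; in C, the fixed-width types from `<stdint.h>` map onto the same table. Here is a quick sketch that prints the unsigned ranges:

```c
#include <stdio.h>
#include <stdint.h>

/* Print the unsigned range of each fixed-width type from the table. */
int main(void)
{
    printf("8 bits : 0 to %llu\n", (unsigned long long)UINT8_MAX);
    printf("16 bits: 0 to %llu\n", (unsigned long long)UINT16_MAX);
    printf("32 bits: 0 to %llu\n", (unsigned long long)UINT32_MAX);
    printf("64 bits: 0 to %llu\n", (unsigned long long)UINT64_MAX);
    return 0;
}
```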
This works great, until you need to represent decimal numbers, or very large numbers.
Very large numbers are represented in exponential form, such as:
- 1.23456 x 10^5 = 123,456
- 4.2 x 10^42
For the second example, we would need about 142 bits (nearly 18 bytes) to represent this number!
This, and decimals, is where floating point comes in.
What is a Floating Point?
A floating point is a way for computers to store and process decimal numbers.
Floating point numbers consist of the following components:
s x 2^e

- s – the “significand”, which holds the significant digits
- 2 – the “base”; it could be anything, but computers use base 2
- e – the “exponent”
“Floating point” means that the decimal point “floats” based on the exponent.
Assuming that the base is always 2, we really store every floating point number as two numbers:
- The significand. In binary, every number starts with “1” except 0. So our significand will always be 1.bbb… unless we are truly storing the number 0.
- The exponent (in binary)
Let’s recap: raising a number to a negative exponent means taking a reciprocal: x^(-n) = 1/(x^n). So:
- 10^(-1) = 1/10 = 0.1
- 10^(-2) = 1/100 = 0.01
- etc…
This is important, because we need negative exponents to represent fractions. Without them, we could only represent integers.
I kind of glossed over signed integers, but our exponent can be negative, and it does this using an offset.
In a 32-bit floating point, the exponent is 8 bits. If you recall, 8 bits (or one byte) can store any integer in the range of 0 to 255. However, the exponent is really an offset from -127, so to get the real exponent value, you subtract 127:
```
00000000 =   0, but   0 offset from -127 = -127
01111110 = 126, but 126 offset from -127 =   -1
01111111 = 127, but 127 offset from -127 =    0
10000000 = 128, but 128 offset from -127 =    1
11111111 = 255, but 255 offset from -127 =  128
```
Because -127 and +128 have special meanings, our 32-bit float can store any positive number in the range of:
- In base 2: 1 x 2^(-126) to 1 x 2^127
- In base 10: 1 x 10^(-38) to 1 x 10^38
That’s about 100 undecillion, or 100 trillion-trillion-trillion.
To represent negative numbers, floating point uses a sign bit. If the bit is set, the entire number is negative. So both our number and our exponent could be negative or positive:
| | Exponent Positive | Exponent Negative |
|---|---|---|
| Sign Positive | Positive, greater than 1 | Positive, fractional |
| Sign Negative | Negative, less than -1 | Negative, fractional |
Putting this all together, a 32-bit floating point looks like this:
```
seeeeeeeebbbbbbbbbbbbbbbbbbbbbbb
```

- 1 sign bit
- 8 exponent bits (offset from -127)
- 23 significand bits
However, recall that every number in binary, except 0, starts with a “1”. This includes fractions, and any other number you can think of. We can exploit this by always assuming that the first digit is 1, and that the significand bits encode digits 2-24, giving us 24 bits of precision, even though we only have 23 significand bits.
To represent the number “0”, we set both the exponent and significand fields to all zeros.
So even though we can store very large or small numbers, we only have 24 bits to store the actual digits, which equates to about 7 digits in base 10. This is called the precision limit.
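You can see this limit directly: 2^24 is the last point at which a 32-bit float can hold every integer exactly, so adding 1 beyond it is simply lost. A minimal C demonstration (assuming IEEE single precision):

```c
#include <stdio.h>

int main(void)
{
    float f = 16777216.0f;        /* 2^24 */
    printf("%.1f\n", f + 1.0f);   /* prints 16777216.0 -- the +1 is lost */
    return 0;
}
```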
Here are some examples:
```
0-00000000-00000000000000000000000 = 0
0-01111111-00000000000000000000000 = 1.0 x 2^(127-127) = 1.0 x 2^0 = 1
0-01111110-10000000000000000000000 = 1.1 x 2^(126-127) = 0.11b = 0.75
1-10000100-00100000000000000000000 = -1.001 x 2^(132-127) = -1.125 x 32 = -36
```
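If you want to see these fields on a real machine, here is a small C sketch that unpacks a 32-bit float, assuming the IEEE-style layout described above (the function name is mine; there is no handling for zero or other special encodings):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Unpack the sign, exponent, and significand of a 32-bit float. */
void dump_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, 4);                        /* grab the raw bit pattern   */

    uint32_t sign = bits >> 31;                  /* 1 sign bit                 */
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFF) - 127; /* unbias 8 exponent bits */
    uint32_t frac = bits & 0x007FFFFF;           /* 23 stored significand bits */

    /* Re-attach the implicit leading 1 to get the true significand. */
    double significand = 1.0 + frac / 8388608.0; /* 8,388,608 = 2^23 */

    printf("%g: sign=%u exponent=%d significand=%.7f\n",
           f, sign, exp, significand);
}

/* dump_float(-36.0f) prints: -36: sign=1 exponent=5 significand=1.1250000 */
```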
Inside the Computer
From a program’s standpoint, there are three main components:
- The program itself
- Data
- CPU Registers (and other parts)
A “register” is a temporary holding area inside the computer’s CPU. Like a tiny, digital, robotic arm, the program tells the register what data to grab, and what to do with it.
It’s been over 30 years since I’ve done any assembly programming, so I’ll discuss a “generic 32-bit CPU” without getting into specific architectures and implementations.
Every assembly program consists of several types of instructions:
| Instruction | Effect |
|---|---|
| LOAD | Go get something from memory, and put it in a register. |
| STORE | Take what’s in a register and put it in memory. |
| ADD | Add some number (or a value in memory) to the value currently in one of the registers. |
| SUB | Subtract some number (or a value in memory) from the value currently in one of the registers. |
| MUL | Multiply (etc…) |
| DIV | Divide (etc…) |
| SHIFTL | Shift Left. Move all the bits in a specified register some positions to the left. This has the effect of multiplying by 2, and since most CPUs implement SHIFT in hardware, it’s a very fast operation. |
| SHIFTR | Shift Right. Like a left shift, a right shift is very fast, and has the effect of dividing by 2. |
| INC, DEC | Increment or decrement – add one to, or subtract one from, the value stored in one of the registers. |
There are also instructions such as JUMP that allow for flow control, and port manipulation via IN / OUT, but we can ignore these in our simple example.
Machine language is an all-numeric way to represent CPU instructions so that they can be stored as data. The computer reads program instruction bytes from memory, loads them into the instruction register, and the instruction register then tells the other registers what to do, based on instruction codes and other operands included with the instruction.
You can think of assembly as human-readable machine language.
When you compile something in a higher-level language like C, the compiler takes your source code and turns it into machine instructions, which we can then view as assembly.
You might write this in C:
```c
int i = 10, j = ++i, k = i + j;
/* i==11, j==11, k==22 */
```
In assembly, it might look like this, after you compile it:
```
LOAD  A $0x0A
STORE A 0xD001
LOAD  A 0xD001
INC   A
STORE A 0xD001
STORE A 0xD002
LOAD  A 0xD001
ADD   A 0xD002
STORE A 0xD003
```
Of course, compilers usually have optimizers that help avoid all of the back and forth silliness, but again, this is an example.
In the above, “A” is a CPU register – a CPU may have as few as 2 “general purpose” registers, while most have at least 4.
We have notation for constants, such as $0x0A, which would be interpreted as the 32-bit value 0x0000000A (0x means “hex”, and “$” marks an immediate constant rather than an address).
We also use direct notation, such as 0xD001, which specifies a memory address that the CPU’s register reads or writes. For simplicity, we can skip indirect notation, which acts like a pointer.
Once we throw the code above through an optimizer, it might look like this:
```
LOAD  A $0x0A
STORE A 0xD001
INC   A
STORE A 0xD001
STORE A 0xD002
ADD   A 0xD002
STORE A 0xD003
```
When we load our program into memory, the computer might look something like the table below. Keep in mind that the program basically shares the same memory as data, and there are pointers that point to different sections of memory for different purposes, but again, this is simplified.
Also, remember that “LOAD A $0x0A” might be 0x0A01DD00 in machine language, but we’re viewing this in human-readable assembly…
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | |
| 0x0102 | INC A | C | | 0xD002 | |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
The CPU reads each instruction, which tells a specific register what to do next.
After the first instruction, our computer looks like this:
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x0000000A | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | |
| 0x0102 | INC A | C | | 0xD002 | |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
As our program progresses, we move data in and out of the A register, and memory locations that represent our C variables.
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x0000000B | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | 0x0000000B |
| 0x0102 | INC A | C | | 0xD002 | 0x0000000B |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
In its final state, we add “i” and “j”, and store the result as “k”. Remembering that program is data and vice-versa, the last instruction, STOP, tells the CPU to stop executing instructions, so that it doesn’t start executing data!
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x00000016 | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | 0x0000000B |
| 0x0102 | INC A | C | | 0xD002 | 0x0000000B |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | 0x00000016 |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
Although this is a very simplified model, we can see what happens when we perform math operations on integers.
Integer Math
Integer math instructions look like this:
- Go get something
- Do something with it
- Store the result in a register
From the CPU’s perspective, even though this is a single programming instruction, there are 3 steps, and thus it might take up to 3 CPU cycles to execute one instruction.
Simple operations like addition are usually built right into the register hardware so that they take only one CPU cycle.
Integer multiplication might take 1 or 2 extra CPU cycles, even when using purpose-specific hardware. Multiplication is a combination of addition and bit shifting.
For example, to multiply 0b0101 (5) by 0b1011 (11), we would LOAD A $0x05, and then MUL A “x”, where x is some memory location.
Without a dedicated multiplier, and assuming we already have 5 in the A register, here is what the CPU might do behind the scenes when we ask it to MUL A “x” (a C sketch of the same loop follows this list):
- Go read memory location x (value 11)
- Set C to 0
- Loop S from 0 to 7:
  - If x AND 1:
    - Copy A (value 5) to B
    - Shift B left by S
    - Add B to C, storing the sum in C
  - Shift x right by 1
- Store C (value 55) in A
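Here is that shift-and-add loop as a minimal C sketch (the function name and fixed-width types are my own, for illustration):

```c
#include <stdint.h>

/* Shift-and-add multiplication, mirroring the microcode steps above. */
uint16_t mul_shift_add(uint8_t a, uint8_t x)
{
    uint16_t c = 0;                    /* C: the running total          */
    for (int s = 0; s < 8; s++) {      /* one pass per bit of x         */
        if (x & 1)                     /* if the current bit is set...  */
            c += (uint16_t)a << s;     /* ...add A, shifted left by S   */
        x >>= 1;                       /* move to the next bit          */
    }
    return c;                          /* mul_shift_add(5, 11) == 55    */
}
```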
By my count, that’s 33 CPU cycles for a single instruction!
Modern CPUs have hardware multipliers that perform this entire process as 1 or 2 instructions, so the microcode might look like this:
- Go read memory location x (value 11)
- Send A to the multiplier
- Send x to the multiplier
- Read a result from the multiplier, and store in A
Here is how an 8-bit hardware “lattice” multiplier works:
- As the first value is loaded, it populates the lattice, where each row is offset by one position (multiplied by 2).
- As the second value is loaded, any “zero” bits disable rows in the lattice using AND
- The remaining rows in each column of the lattice are then added to obtain two 8-bit result fields.
- The low byte and high byte are fed into the result register, which can then be read by the CPU.
In this example, although it takes 3 steps to write to and read from the multiplier, the multiplication itself is handled instantaneously.
Floating Point Math
If you recall, a floating point number is like two numbers stored as one:
n = s x 2^e
So, although you can absolutely load or store a floating point just like any other number, math isn’t so straightforward.
Let’s assume we have two floats, n and m. The first step is to decode the sign, exponent, and significand:
| | m | n |
|---|---|---|
| Raw number | 174,587.904 | 0.30690625 |
| Binary | 101010100111111011.11100111011 | 0.01001110100 |
| Significand and exponent | s = 1.01010100111111011111001, e = 10001 (17) | s = 1.001110100, e = -10 (-2) |
| Encoded | 0-10010000-01010100111111011111001 | 0-01111101-00111010000000000000000 |
| Base 10 | 1.3319998979568 x 2^17 | 1.227625 x 2^(-2) |
Remember the special case of zero:
If the exponent and significand fields are both all zeros, then the value is 0. For our example, this is not the case, but the CPU still has to check for zero and the other special values, which takes time.
Floating point multiplication is actually easier than addition:
| Step | Result |
|---|---|
| 1. Multiply ms x ns | 1.3319998979568 x 1.227625 = 1.6351963747292166 |
| 2. While the result ≥ 2, divide it by 2 and add 1 to me | (n/a) |
| 3. Add me + ne | 17 + (-2) = 15 |
| 4. Re-encode the result | s = 1.6351963747292166, e = 15 |
| (In decimal) | 53,582.11328125 |
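To make these steps concrete, here is a C sketch of 32-bit float multiplication done “by hand”, under the layout described earlier. It truncates instead of rounding and skips the special cases (zero, infinity, NaN, denormals), which a real FPU must also check:

```c
#include <stdint.h>
#include <string.h>

/* 32-bit float multiplication "by hand", following the steps above. */
float float_mul(float m, float n)
{
    uint32_t mb, nb;
    memcpy(&mb, &m, 4);                        /* reinterpret the raw bits */
    memcpy(&nb, &n, 4);

    uint32_t sign = (mb ^ nb) & 0x80000000u;   /* signs combine via XOR    */
    int32_t  me   = (int32_t)((mb >> 23) & 0xFF) - 127;
    int32_t  ne   = (int32_t)((nb >> 23) & 0xFF) - 127;

    /* Restore the implicit leading 1 on each 23-bit significand. */
    uint64_t ms = (mb & 0x007FFFFFu) | 0x00800000u;
    uint64_t ns = (nb & 0x007FFFFFu) | 0x00800000u;

    /* Step 1: multiply the significands (value is ss / 2^46, in [1, 4)). */
    uint64_t ss = ms * ns;
    /* Step 3: add the exponents. */
    int32_t e = me + ne;
    /* Step 2: normalize -- if the product reached 2, shift and bump e. */
    if (ss & (1ull << 47)) { ss >>= 1; e++; }

    /* Step 4: re-encode, truncating back to 23 significand bits. */
    uint32_t out = sign | (uint32_t)(e + 127) << 23
                        | (uint32_t)((ss >> 23) & 0x007FFFFFu);
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```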
This seems complicated, but addition is even more so:
| Step | Result |
|---|---|
| 1. If n > m, swap n and m to ensure n < m (compare exponents first; if they are equal, compare significands) | No swap required |
| 2. Subtract ne - me = ee | -2 - 17 = -19 |
| 3. Left shift ns by ee (a negative left shift is a right shift) | 1.227625 << -19 = 0.00000234150886536 |
| 4. Add ns to ms = ss | 1.3319998979568 + 0.00000234150886536 = 1.33200223946 |
| 5. While the result ≥ 2, divide it by 2 and add 1 to me | (n/a) |
| 6. Re-encode the result | s = 1.33200223946, e = 17 |
| (In decimal) | 174,588.19753 |
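And here is a matching C sketch of the addition steps, for positive normal floats only (again ignoring rounding and special values):

```c
#include <stdint.h>
#include <string.h>

/* 32-bit float addition for positive normal values, following the steps above. */
float float_add(float m, float n)
{
    uint32_t mb, nb;
    memcpy(&mb, &m, 4);
    memcpy(&nb, &n, 4);

    /* Step 1: make m the operand with the larger magnitude (swap if not). */
    if (nb > mb) { uint32_t t = mb; mb = nb; nb = t; }

    int32_t  me = (int32_t)((mb >> 23) & 0xFF) - 127;
    int32_t  ne = (int32_t)((nb >> 23) & 0xFF) - 127;
    uint32_t ms = (mb & 0x007FFFFFu) | 0x00800000u;  /* implicit 1 restored */
    uint32_t ns = (nb & 0x007FFFFFu) | 0x00800000u;

    /* Steps 2-3: shift n's significand right to align it with m's exponent. */
    int32_t shift = me - ne;
    ns = (shift < 32) ? (ns >> shift) : 0;

    /* Step 4: add the significands. */
    uint32_t ss = ms + ns;
    /* Step 5: normalize if the sum carried past 2. */
    if (ss & (1u << 24)) { ss >>= 1; me++; }

    /* Step 6: re-encode. */
    uint32_t out = (uint32_t)(me + 127) << 23 | (ss & 0x007FFFFFu);
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```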
Just as integer multiplication would chew up CPU cycles without special hardware, so would floating point math – which is why most modern CPUs include dedicated floating point hardware (called a Floating Point Unit, or FPU) for floating point math and manipulation.
Comparing Floats to Ints
The point of demonstrating all of this complexity is to show all of the extra work that has to be done to support floating point numbers, compared to integers.
We’ve looked at the internal CPU process, and it’s easy to see why floats are slower than ints:
| Floating Point | Integer |
|---|---|
| Special cases include zero, infinity, and infinitesimally small (denormal) values, all of which require extra handling steps. | Integers have no special cases. |
| Each floating point number consists of two components, each requiring separate manipulation and normalization steps. | An integer is a single component. |
| Floating point arithmetic, even when implemented in hardware, requires a discrete set of steps that can be computationally expensive. | Integer arithmetic, implemented in hardware, requires only 1 or 2 steps. |
What all of this means is that the CPU has to work harder during floating-point operations compared to integer operations:
- On older CPUs that don’t have dedicated FPU support, floating point arithmetic can take up to 4 times longer.
- On newer CPUs with dedicated FPU hardware, parallel pipelines, and conditional execution, floating point arithmetic can be as little as 1.3 times longer.
One more complication: if you perform a math operation on a float and an int, the CPU must first convert the int to a float, which adds an extra step:
| | Integer | Floating Point |
|---|---|---|
| Integer | Integer operation | Convert int to float, then a floating point operation |
| Floating Point | Convert int to float, then a floating point operation | Floating point operation |
Although it may seem trivial, converting an int to floating point entails these steps:
- Preserve the sign bit: b
- If n < 0, negate n so that s holds the magnitude
- Set e = 31
- Loop:
  - If s AND 0x80000000 is set, exit the loop (we found the high bit)
  - Left shift s by 1
  - Decrement e
  - If e < 0, exit the loop (n was all zeros)
- Convert b, e, s to float:
  - Send b to the FPU
  - Send e to the FPU
  - Send s to the FPU
  - Read the result from the FPU into register A
The last step assumes that the actual conversion is done in hardware, but here is what would happen inside that hardware:
- When the b register is loaded, shift it left by 31
- When the s register is loaded, shift it right by 8 (only 24 bits of precision) and drop the implicit leading 1
- When the e register is loaded, add 127 (e will never be negative, because we know n was an integer)
- Left shift e by 23 to line it up with the exponent field
- Combine b OR e OR s
So the simple act of converting an int to a float is fairly computationally expensive.
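Here is the whole dance as a C sketch (software only; it truncates the low bits instead of rounding, as the simplified hardware above would):

```c
#include <stdint.h>
#include <string.h>

/* Convert a 32-bit int to a 32-bit float, following the steps above. */
float int_to_float(int32_t n)
{
    if (n == 0) return 0.0f;             /* special case: all-zeros encoding */

    uint32_t b = (n < 0) ? 1u : 0u;      /* preserve the sign bit            */
    uint32_t s = (n < 0) ? (uint32_t)(-(int64_t)n) : (uint32_t)n;

    int32_t e = 31;                      /* hunt for the highest set bit     */
    while (!(s & 0x80000000u)) {
        s <<= 1;
        e--;
    }

    uint32_t out = (b << 31)
                 | (uint32_t)(e + 127) << 23    /* biased exponent            */
                 | ((s >> 8) & 0x007FFFFFu);    /* 23 bits, implicit 1 dropped */
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```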
This gives us a hint about how to optimize our code…
- Reduce Integer to Floating Point conversions
- Reduce Floating Point Operations (FLOps)
How to Convert Operations to Integer-Only, Pt 1 – Beginner Mode
Here is a simple example:
Let’s say that you have a function that returns an integer value in the range of 0 to 1023, but you want to scale it by 2/3.
If you really suck at coding, your code might look like this:
```c
int i;
float k;
i = somefunction();
k = i * 0.66;
i = cint(k);
```
Here is what the code above does…
- Declare variables
- Capture our integer function output in ‘i’
- Multiply by 0.66, which as we expect, results in a float. We capture the float output in ‘k’
- Finally, we convert our scaled float back to int with cint, storing the result back in ‘i’
Let’s see why this code is horrible… Here is what the CPU has to do:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- Store A as ‘i’ (assign our function result to ‘i’); cost = 1
- Convert A to floating point (preparing for floating point operation); cost = 5
- Multiply by 0.66; cost = 6
- Store the result as ‘k’; cost = 1
- Convert A to integer from float; cost = 6
- Store the result as ‘i’; cost = 1
Total cost: 21
We quickly realize that if you assign a float to an integer variable, the conversion is done automatically. Even better, we don’t need an intermediate assignment, as we can handle the scaling inline:
```c
int i;
i = somefunction() * 0.66;
```
Better, but NOT MUCH BETTER. Here is what your CPU sees:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- Convert A to float; cost = 5
- Multiply A * $0.66; cost = 6
- Convert A to int; cost = 6
- Store the result as ‘i’; cost = 1
Total cost = 19
So although your code looks more efficient, IT’S NOT.
How about this:
```c
int i;
i = somefunction() * 2 / 3;
```
At first, this looks exactly like what we had above, but let’s look at how the CPU sees this:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- MUL A $2; cost = 4
- DIV A $3; cost = 4
- Store ‘i’; cost = 1
Total cost: 10
Wait…. what???
- Since we didn’t use any floating point values, there were no floating point conversions
- Since we didn’t use any floating point variables, the compiler automatically decided that no intermediate conversions were necessary
- We still have 5 instead of 6 steps, BUT… we have two FASTER integer operations instead of one SLOW FLOp, and no conversions.
So, even though we have almost as many steps, this code requires only about 50% of the CPU cycles, and is up to twice as fast!
Here is the interesting part…
If you integer-divide 2/3, the result is 0. If we multiply 1023 by (2/3) the integer answer will always be 0.
But… 1023 * 2 = 2046, and 2046 / 3 = 682.
So, if we change the order of operations – multiply first, divide next, then we get the desired result without having to use floating points.
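As a general-purpose version of this trick, here is a small helper sketch (the name and the 64-bit intermediate are my additions; widening keeps the multiply-first product from overflowing):

```c
#include <stdint.h>

/* Scale n by num/den in pure integer math. Multiplying first preserves
   precision; the 64-bit intermediate keeps n * num from overflowing. */
int32_t scale(int32_t n, int32_t num, int32_t den)
{
    return (int32_t)(((int64_t)n * num) / den);
}

/* scale(1023, 2, 3) == 682, whereas 1023 * (2 / 3) would be 0. */
```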
How to Convert Operations to Integer-Only, Pt 2 – Expert Mode
Let’s say that we want to map the output of our same function (0..1023) to an angle in radians.
Just to recap: in degrees, a circle is 360 degrees. It’s sliced up into 360 neat little slices, and 360 degrees is at the same position as zero. In radians, the circle is divided into 2π slices. So 0 and 2π sit at the same location, 90 degrees is π/2 (about 1.57) radians, 180 degrees is π (about 3.14) radians, and 270 degrees is at 3π/2 (about 4.71) radians.
In computing, we typically use radians over degrees for two reasons:
- There are some math shortcuts where things cancel out, when you consider the unit circle in radians instead of degrees.
- Most compilers and math libraries implement trig functions in radians, not degrees.
So if we want to convert our function output (0..1023) to an angle in radians (0..2π), the calculation looks like this:
```c
int i;
float k;
/* Get our function output */
i = somefunction();
/* Convert i to an interval of 0..1 */
k = i / 1023.0;
/* Multiply by the length around the unit circle */
k = k * 2 * PI();
```
- Int Ops: 1 * 4
- Conversions: 1 * 5
- FLOps: 3 * 6
- Total Cost: 27
Note: In the given context, the compiler stores 1023.0 and 2 as float constants, so no run-time conversion is necessary for them.
Simplified:
```c
float k;
k = somefunction() / 1023.0 * 2 * PI();
```
- Int Ops: 0 * 4
- Conversions: 1 * 5 (output from somefunction)
- FLOps: 3 * 6
- Total Cost: 23
At first, this looks optimum because the result is a float, and there is no way around this.
However, we can rearrange things so that there are fewer FLOps:
```c
int i;
float k;
i = somefunction() * 2 / 1023;
k = i * PI();
```
- Int Ops: 2 * 4
- Conversions: 1 * 5
- FLOps: 1 * 6
- Total Cost: 19
Now, everything is an integer until the last step, and the cost savings is evident.
BUT…
There is a problem.
Remember that our function ranges from 0..1023. We multiply first, then divide. However, by immediately storing an integer result, we lose any decimal granularity. We effectively have a function that returns 0, 1, or 2, which we then multiply by PI in the last step. Let’s look at a range of values, and see how this plays out.
```
f() =    0 * 2 =    0 / 1023 = 0 * PI = 0
f() =  128 * 2 =  256 / 1023 = 0 * PI = 0
f() =  256 * 2 =  512 / 1023 = 0 * PI = 0
f() =  384 * 2 =  768 / 1023 = 0 * PI = 0
f() =  512 * 2 = 1024 / 1023 = 1 * PI = 3.14159
f() =  640 * 2 = 1280 / 1023 = 1 * PI = 3.14159
f() =  768 * 2 = 1536 / 1023 = 1 * PI = 3.14159
f() =  896 * 2 = 1792 / 1023 = 1 * PI = 3.14159
f() = 1023 * 2 = 2046 / 1023 = 2 * PI = 6.28318
```
We effectively lose all precision after the decimal, which we need in order to avoid this “stepping” problem.
We can use some basic algebra to fix the stepping problem by retaining some precision, without losing our efficient integer operations.
- If we pre-calculate PI to a specific precision, we can deal with it as a fraction of integers: 3.14159 = 314159/100000
- We can pre-calculate other constants: 2 * 314159 / 1023 = 614 (and some fraction that we will ignore, assuming we have suitable precision)
```c
/* 2 * 314159 / 1023 */
const int C = 614;
/* Return k to a decimal. Note the use of .0 to force D to be a floating point. */
const float D = 100000.0;
int i;
float k;
i = somefunction() * C;
k = i / D;
```
- Int Ops: 1 * 4
- Conversions: 1 * 5
- FLOps: 1 * 6
- Total Cost: 15
This approach is even cheaper, and a quick check of the output shows that we’ve fixed the “stepping” problem:
```
f() =    0 * 614 =      0 / 100000.0 = 0.00000
f() =  128 * 614 =  78592 / 100000.0 = 0.78592
f() =  256 * 614 = 157184 / 100000.0 = 1.57184
f() =  384 * 614 = 235776 / 100000.0 = 2.35776
f() =  512 * 614 = 314368 / 100000.0 = 3.14368
f() =  640 * 614 = 392960 / 100000.0 = 3.92960
f() =  768 * 614 = 471552 / 100000.0 = 4.71552
f() =  896 * 614 = 550144 / 100000.0 = 5.50144
f() = 1023 * 614 = 628122 / 100000.0 = 6.28122
```
MUCH better!
How to Convert Operations to Integer-Only, Pt 3 – Super Advanced Mode
Lookup tables.
Hang on… you’re in for a ride, but we’ll get there together.
Let’s say you want to draw a circle – we’ll use an imaginary function called “plot(x,y)” to draw one pixel:
```c
/* Set radius */
int r = 100;
/* Calculate circumference */
int c = r * 2 * PI();
/* Circle center coords */
int cx = 200, cy = 200;
/* One of the advantages of using radians is that the incremental angle = 1/r */
float a = 0, ai = 1.0 / r;
int x, y, i;

for (i = 0; i < c; i++) {
    a = i * ai;             /* Angle */
    x = cos(a) * r + cx;    /* X coord */
    y = sin(a) * r + cy;    /* Y coord */
    plot(x, y);             /* Draw 1 pixel */
}
```
This plots a 200-pixel diameter circle whose center is (200,200).
The trick used to calculate the incremental angle – the angle we need for just one pixel at the circumference – works like this:
- Circumference of circle: c=2*r*π, which is also the exact number of pixels we need to draw.
- If r=1, then the circumference of a unit circle is simply 2π, which is also the number of radians around the circumference.
- So if we take any circle, the number of radians per pixel (the angle by which we have to shift to draw the next adjacent pixel) is 2π/c
- If we substitute c: 2π/2rπ
- Everything cancels: ai=1/r
- Example: We have a circle of radius 1, 1/1=1. Since the circumference is about 6, the incremental angle being 1 means we will draw 2π/1 = about 6 pixels.
- Example, radius 100: c=628; ai=1/100 = 0.01
Because we go through the circle only once, there is no real advantage to pre-calculating a lookup table, nor even in using all-integer calculations.
We can even draw a disc with no wasted cycles, even using floats:
```c
/* Radius */
int r = 100;
/* 1/2 Circumference */
int c = r * PI();
/* Circle Center */
int cx = 200, cy = 200;
/* Incremental Angle */
float a, ai = 1.0 / r;
int x, y, y1, y2, i;

/* To draw a disc, we only need quadrants 1 and 2. We will use cy+y and
   cy-y for each vertical line, then draw a line between the two. This
   draws the disc from right to left. It's a bit confusing, but y should
   be NEGATIVE in Q1 and Q2 (screen y grows downward), so we negate
   sin(a), ADD to cy to subtract (y1), and then subtract in order to add,
   to obtain y2. */
for (i = 0; i < c; i++) {
    a  = i * ai;            /* Calculate angle */
    x  = cx + cos(a) * r;   /* X coord */
    y  = -sin(a) * r;       /* Y offset - negative */
    y1 = cy + y;            /* Start y */
    y2 = cy - y;            /* End y */
    for (y = y1; y <= y2; y++) {
        plot(x, y);
    }
}
```
This draws a 200-pixel-wide disc centered at (200,200).
What if we want to draw a sphere?
- We can use the same trick we used with the disc – calculate +y and -y and draw a line between the two
- Due to z-clipping, we only see the nearest pixel at each (x,y). Rather than waste cycles drawing what we can’t see, we simply calculate the closest z value for each (x,y) on the shell of the sphere, and everything farther away at that (x,y) is clipped.
- We need sin, cos for calculating (x,y), but then for a given (x,y), we need cos again to calculate z. Think of each (x,-y..y) as the cross-section of a half-disc. At -y or +y, z is zero, but as y approaches 0, z approaches x.
- Since we will calculate cos(x) iteratively πr^2 times, this is a good place to build a lookup table!
- We get less benefit from sin(y) since we only need it once, but pre-calculating will keep things consistent.
```c
/* Precision */
const int PRECISION = 10000;
/* Radius */
int r = 100;
/* 1/2 Circumference at z=0 */
int c = r * PI();
/* Center */
int cx = 200, cy = 200, cz = 0;
/* Incremental Angle */
float a, ai = 1.0 / r;
/* Utility variables */
int x, y, y1, y2, y3, z, i;

/* LOOKUP TABLES!! */
int lcos[c], lsin[c];
for (i = 0; i < c; i++) {
    a = ai * i;
    lcos[i] = cos(a) * PRECISION;
    lsin[i] = sin(a) * PRECISION;
}
/* That was the last FLOp */

/* Draw sphere surface */
for (i = 0; i < c; i++) {
    x  = lcos[i] * r / PRECISION + cx;
    y1 = -lsin[i] * r / PRECISION;      /* Negative in Q1,Q2 */
    y2 = -y1;                           /* Reflect across 0 to get Q4,Q3 */
    y3 = y2 + y2;                       /* Pre-compute the height of the current slice */
    for (y = y1; y <= y2; y++) {
        z = y3 ? (y + y2) * 100 / y3 : 0;   /* Use 0..2*y2 to generate a lookup value 0..100 (guard the zero-height slice) */
        z = lcos[z] * y2 / PRECISION + cz;  /* Generate z using the lookup and half the slice width, y2 */
        plot(x, y + cy, z);                 /* Plot - note y+cy in order to apply the y offset */
    }
}
```
In addition to accomplishing this with all integers, we also eliminate the function calls.
Function calls are also one of those quasi-expensive CPU operations: they involve pushing parameters onto the stack, setting up a private variable space, and then a far jump to some code “in the memory cloud” somewhere. On return, the return value is pushed onto the stack, followed by another far jump back to the next instruction.
This method can be used for trig functions, 3D transformations, and even logarithms.
For example, if you pre-compute log and inverse-log tables (llog[] and lilog[]), you can do things like compute a square root, all in integer space – halving a log takes a square root:

```
lilog[llog[25] / 2] == 5
```
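As a sketch (the scale factor, table sizes, and function names here are my own choices, not fixed by the above), the tables might be built once with floats and then used forever after in integer space:

```c
#include <math.h>

#define SCALE 1000            /* fixed-point scale for the log values */
#define NMAX  256             /* largest input we support             */

int llog[NMAX];               /* llog[n]  = ln(n) * SCALE             */
int lilog[6 * SCALE];         /* lilog[v] = exp(v / SCALE); ln(255)*SCALE < 6*SCALE */

void build_log_tables(void)
{
    for (int n = 1; n < NMAX; n++)
        llog[n] = (int)(log((double)n) * SCALE + 0.5);
    for (int v = 0; v < 6 * SCALE; v++)
        lilog[v] = (int)(exp((double)v / SCALE) + 0.5);
}

/* sqrt(n) = exp(ln(n) / 2) -- halving a log takes a square root. */
int isqrt_lookup(int n)
{
    return lilog[llog[n] / 2];   /* isqrt_lookup(25) == 5 */
}
```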
Conclusion
Faster is better.
Int Ops are always faster than FLOps.
This is why FLOps Per Second (FLOPS) is a common measure of computing power, rather than plain (integer) Instructions Per Second.
The better your CPU, the more efficiently you can process a FLOp, and therefore, the more FLOps you can compute in one second.
However, your code will always be faster with fewer FLOps, regardless of how powerful your CPU, or how efficiently it handles floating point conversion and arithmetic.
- Multiply before Divide
- Refactor, if you have a stepping problem
- Use lookup tables for repetitive floating point operations (or function calls)
Have fun!