Or, why floating point operations are slow, and how to avoid them.
A Brief Note on Notation
This article is geared toward programmers, but if you happen to be a non-programmer:
- The “^” means, “to the power of” or exponentiation. So 5^2 means “5 to the power of 2”, or 5 x 5 = 25.
- You might also see “*” – in computerese, this means “multiply”. So 5*5 = 5 x 5 = 25.
- 0x means “The following number is in hexadecimal.” So 0x10 is “Hex 10” or 16. Likewise, 0b means “the following number is in binary”.
Quick Recap of Integers
Computers use binary internally to store all numbers, and all internal calculations are also performed in binary.
For integers, this is fairly straightforward.
Let’s recall that integers are numbers without a decimal component. These are all integers:
- 0; 1; 2; etc…
- -1; -2; -3; etc…
- 999,999,999; -123,456
- 123 x 10^45; -5 x 10^32
To represent any integer number, you simply need enough bits (binary digits) such that 2^b > n.
For example, if I want to represent the integer 123,456, I need 17 bits. 2^17 = 131,072.
However, computers don’t just store an arbitrary number of bits – bits are grouped in multiples of 8, and that’s how we get our basic integer data types:

| Bytes (Bits) | Data Type | Unsigned Numerical Range |
|---|---|---|
| 1 (8) | Byte | 0 to 255 |
| 2 (16) | SmallInt | 0 to 65,535 |
| 4 (32) | Int | 0 to about 4.2 billion |
| 8 (64) | LongInt / QuadInt | 0 to about 18 quintillion |
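The type names vary by language; in C, the fixed-width types from `<stdint.h>` map onto the same table. Here is a quick sketch that prints the unsigned ranges:

```c
#include <stdio.h>
#include <stdint.h>

/* Print the unsigned range of each fixed-width type from the table. */
int main(void)
{
    printf("8 bits : 0 to %llu\n", (unsigned long long)UINT8_MAX);
    printf("16 bits: 0 to %llu\n", (unsigned long long)UINT16_MAX);
    printf("32 bits: 0 to %llu\n", (unsigned long long)UINT32_MAX);
    printf("64 bits: 0 to %llu\n", (unsigned long long)UINT64_MAX);
    return 0;
}
```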
This works great, until you need to represent decimal numbers, or very large numbers.
Very large numbers are represented in exponential form, such as:
- 1.23456 x 10^5 = 123,456
- 4.2 x 10^42
For the second example, we would need about 142 bits (nearly 18 bytes) to represent this number!
This, and decimals, is where floating point comes in.
What is a Floating Point?
A floating point is a way for computers to store and process decimal numbers.
Floating point numbers consist of the following components:
s x 2^e

- s – the “significand”, which holds the significant digits
- 2 – the “base”; it could be anything, but computers use base 2
- e – the “exponent”
“Floating point” means that the decimal point “floats” based on the exponent.
Assuming that the base is always 2, we really store every floating point number as two numbers:
- The significand. In binary, every number starts with “1” except 0. So our significand will always be 1.bbb… unless we are truly storing the number 0.
- The exponent (in binary)
Let’s recap: raising a number to a negative exponent means taking a reciprocal: x^(-n) = 1/(x^n). So:
- 10^(-1) = 1/10 = 0.1
- 10^(-2) = 1/100 = 0.01
- etc…
This is important, because we need negative exponents to represent fractions. Without them, we could only represent integers.
I kind of glossed over signed integers, but our exponent can be negative, and it does this using an offset.
In a 32-bit floating point, the exponent is 8 bits. If you recall, 8 bits (or one byte) can store any integer in the range of 0 to 255. However, the exponent is really an offset from -127, so to get the real exponent value, you subtract 127:
```
00000000 =   0, but   0 offset from -127 = -127
01111110 = 126, but 126 offset from -127 =   -1
01111111 = 127, but 127 offset from -127 =    0
10000000 = 128, but 128 offset from -127 =    1
11111111 = 255, but 255 offset from -127 =  128
```
Because -127 and +128 have special meanings, our 32-bit float can store any positive number in the range of:
- In base 2: 1 x 2^(-126) to 1 x 2^127
- In base 10: 1 x 10^(-38) to 1 x 10^38
That’s about 100 undecillion, or 100 trillion-trillion-trillion.
To represent negative numbers, floating point uses a sign bit. If the bit is set, the entire number is negative. So both our number and our exponent could be negative or positive:
| | Exponent Positive | Exponent Negative |
|---|---|---|
| Sign Positive | Positive, greater than 1 | Positive, fractional |
| Sign Negative | Negative, less than -1 | Negative, fractional |
Putting this all together, a 32-bit floating point looks like this:
```
seeeeeeeebbbbbbbbbbbbbbbbbbbbbbb
```

- 1 sign bit
- 8 exponent bits (offset from -127)
- 23 significand bits
However, recall that every number in binary, except 0, starts with a “1”. This includes fractions, and any other number you can think of. We can exploit this by always assuming that the first digit is 1, and that the significand bits encode digits 2-24, giving us 24 bits of precision, even though we only have 23 significand bits.
To represent the number “0”, we set both the exponent and significand fields to all zeros.
So even though we can store very large or small numbers, we only have 24 bits to store the actual digits, which equates to about 7 digits in base 10. This is called the precision limit.
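You can see this limit directly: 2^24 is the last point at which a 32-bit float can hold every integer exactly, so adding 1 beyond it is simply lost. A minimal C demonstration (assuming IEEE single precision):

```c
#include <stdio.h>

int main(void)
{
    float f = 16777216.0f;        /* 2^24 */
    printf("%.1f\n", f + 1.0f);   /* prints 16777216.0 -- the +1 is lost */
    return 0;
}
```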
Here are some examples:
```
0-00000000-00000000000000000000000 = 0
0-01111111-00000000000000000000000 = 1.0 x 2^(127-127) = 1.0 x 2^0 = 1
0-01111110-10000000000000000000000 = 1.1 x 2^(126-127) = 0.11b = 0.75
1-10000100-00100000000000000000000 = -1.001 x 2^(132-127) = -1.125 x 32 = -36
```
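If you want to see these fields on a real machine, here is a small C sketch that unpacks a 32-bit float, assuming the IEEE-style layout described above (the function name is mine; there is no handling for zero or other special encodings):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Unpack the sign, exponent, and significand of a 32-bit float. */
void dump_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, 4);                        /* grab the raw bit pattern   */

    uint32_t sign = bits >> 31;                  /* 1 sign bit                 */
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFF) - 127; /* unbias 8 exponent bits */
    uint32_t frac = bits & 0x007FFFFF;           /* 23 stored significand bits */

    /* Re-attach the implicit leading 1 to get the true significand. */
    double significand = 1.0 + frac / 8388608.0; /* 8,388,608 = 2^23 */

    printf("%g: sign=%u exponent=%d significand=%.7f\n",
           f, sign, exp, significand);
}

/* dump_float(-36.0f) prints: -36: sign=1 exponent=5 significand=1.1250000 */
```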
Inside the Computer
From a program’s standpoint, there are three main components:
- The program itself
- Data
- CPU Registers (and other parts)
A “register” is a temporary holding area inside the computer’s CPU. Like a tiny, digital, robotic arm, the program tells the register what data to grab, and what to do with it.
It’s been over 30 years since I’ve done any assembly programming, so I’ll discuss a “generic 32-bit CPU” without getting into specific architectures and implementations.
Every assembly program consists of several types of instructions:
| Instruction | Effect |
|---|---|
| LOAD | Go get something from memory, and put it in a register. |
| STORE | Take what’s in a register and put it in memory. |
| ADD | Add some number (or a value in memory) to the value currently in one of the registers. |
| SUB | Subtract some number (or a value in memory) from the value currently in one of the registers. |
| MUL | Multiply (etc…) |
| DIV | Divide (etc…) |
| SHIFTL | Shift Left. Move all the bits in a specified register some positions to the left. This has the effect of multiplying by 2, and since most CPUs implement SHIFT in hardware, it’s a very fast operation. |
| SHIFTR | Shift Right. Like a left shift, a right shift is very fast, and has the effect of dividing by 2. |
| INC, DEC | Increment or decrement – add one to, or subtract one from, the value stored in one of the registers. |
There are also instructions such as JUMP that allow for flow control, and port manipulation via IN / OUT, but we can ignore these in our simple example.
Machine language is an all-numeric way to represent CPU instructions so that they can be stored as data. The computer reads program instruction bytes from memory, loads them into the instruction register, and the instruction register then tells the other registers what to do, based on instruction codes and other operands included with the instruction.
You can think of assembly as human-readable machine language.
When you compile something in a higher-level language like C, the compiler takes your source code and turns it into machine instructions, which we can then view as assembly.
You might write this in C:
```c
int i = 10, j = ++i, k = i + j;
/* i==11, j==11, k==22 */
```
In assembly, it might look like this, after you compile it:
```
LOAD  A $0x0A
STORE A 0xD001
LOAD  A 0xD001
INC   A
STORE A 0xD001
STORE A 0xD002
LOAD  A 0xD001
ADD   A 0xD002
STORE A 0xD003
```
Of course, compilers usually have optimizers that help avoid all of the back and forth silliness, but again, this is an example.
In the above, “A” is a CPU register – a CPU may have as few as 2 “general purpose” registers, while most have at least 4.
We have notation for constants, such as $0x0A, which would be interpreted as the 32-bit value 0x0000000A (0x means “hex”, and “$” marks an immediate constant rather than an address).
We also use direct notation, such as 0xD001, which specifies a memory address that the CPU’s register reads or writes. For simplicity, we can skip indirect notation, which acts like a pointer.
Once we throw the code above through an optimizer, it might look like this:
```
LOAD  A $0x0A
STORE A 0xD001
INC   A
STORE A 0xD001
STORE A 0xD002
ADD   A 0xD002
STORE A 0xD003
```
When we load our program into memory, the computer might look something like the table below. Keep in mind that the program basically shares the same memory as data, and there are pointers that point to different sections of memory for different purposes, but again, this is simplified.
Also, remember that “LOAD A $0x0A” might be 0x0A01DD00 in machine language, but we’re viewing this in human-readable assembly…
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | |
| 0x0102 | INC A | C | | 0xD002 | |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
The CPU reads each instruction, which tells a specific register what to do next.
After the first instruction, our computer looks like this:
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x0000000A | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | |
| 0x0102 | INC A | C | | 0xD002 | |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
As our program progresses, we move data in and out of the A register, and memory locations that represent our C variables.
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x0000000B | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | 0x0000000B |
| 0x0102 | INC A | C | | 0xD002 | 0x0000000B |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
In its final state, we add “i” and “j”, and store the result as “k”. Remembering that program is data and vice-versa, the last instruction, STOP, tells the CPU to stop executing instructions, so that it doesn’t start executing data!
| Addr | Program | Register | Value | Addr | Data |
|---|---|---|---|---|---|
| 0x0100 | LOAD A $0x0A | A | 0x00000016 | 0xD000 | |
| 0x0101 | STORE A 0xD001 | B | | 0xD001 | 0x0000000B |
| 0x0102 | INC A | C | | 0xD002 | 0x0000000B |
| 0x0103 | STORE A 0xD001 | D | | 0xD003 | 0x00000016 |
| 0x0104 | STORE A 0xD002 | | | 0xD004 | |
| 0x0105 | ADD A 0xD002 | | | … | |
| 0x0106 | STORE A 0xD003 | | | | |
| 0x0107 | STOP | | | | |
| … | | | | | |
Although this is a very simplified model, we can see what happens when we perform math operations on integers.
Integer Math
Integer math instructions look like this:
- Go get something
- Do something with it
- Store the result in a register
From the CPU’s perspective, even though this is a single programming instruction, there are 3 steps, and thus it might take up to 3 CPU cycles to execute one instruction.
Simple operations like addition are usually built right into the register hardware so that they take only one CPU cycle.
Integer multiplication might take 1 or 2 extra CPU cycles, even when using purpose-specific hardware. Multiplication is a combination of addition and bit shifting.
For example, to multiply 0b0101 (5) by 0b1011 (11), we would LOAD A $0x05, and then MUL A “x”, where x is some memory location.
Without a dedicated multiplier, and assuming we already have 5 in the A register, here is what the CPU might do behind the scenes when we ask it to MUL A “x” (a C sketch of the same loop follows this list):
- Go read memory location x (value 11)
- Set C to 0
- Loop S from 0 to 7:
  - If x AND 1:
    - Copy A (value 5) to B
    - Shift B left by S
    - Add B to C, storing the sum in C
  - Shift x right by 1
- Store C (value 55) in A
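Here is that shift-and-add loop as a minimal C sketch (the function name and fixed-width types are my own, for illustration):

```c
#include <stdint.h>

/* Shift-and-add multiplication, mirroring the microcode steps above. */
uint16_t mul_shift_add(uint8_t a, uint8_t x)
{
    uint16_t c = 0;                    /* C: the running total          */
    for (int s = 0; s < 8; s++) {      /* one pass per bit of x         */
        if (x & 1)                     /* if the current bit is set...  */
            c += (uint16_t)a << s;     /* ...add A, shifted left by S   */
        x >>= 1;                       /* move to the next bit          */
    }
    return c;                          /* mul_shift_add(5, 11) == 55    */
}
```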
By my count, that’s 33 CPU cycles for a single instruction!
Modern CPUs have hardware multipliers that perform this entire process as 1 or 2 instructions, so the microcode might look like this:
- Go read memory location x (value 11)
- Send A to the multiplier
- Send x to the multiplier
- Read a result from the multiplier, and store in A
Here is how an 8-bit hardware “lattice” multiplier works:
- As the first value is loaded, it populates the lattice, where each row is offset by one position (multiplied by 2).
- As the second value is loaded, any “zero” bits disable rows in the lattice using AND
- The remaining rows in each column of the lattice are then added to obtain two 8-bit result fields.
- The low byte and high byte are fed into the result register, which can then be read by the CPU.
In this example, although it takes 3 steps to write to and read from the multiplier, the multiplication itself is handled instantaneously.
Floating Point Math
If you recall, a floating point number is like two numbers stored as one:
n = s x 2^e
So, although you can absolutely load or store a floating point just like any other number, math isn’t so straightforward.
Let’s assume we have two floats, n and m. The first step is to decode the sign, exponent, and significand:
| | m | n |
|---|---|---|
| Raw number | 174,587.904 | 0.30690625 |
| Binary | 101010100111111011.11100111011 | 0.01001110100 |
| Significand and exponent | s = 1.01010100111111011111001, e = 10001 (17) | s = 1.001110100, e = -10 (-2) |
| Encoded | 0-10010000-01010100111111011111001 | 0-01111101-00111010000000000000000 |
| Base 10 | 1.3319998979568 x 2^17 | 1.227625 x 2^(-2) |
Remember the special case of zero:
If the exponent and significand fields are both all zeros, then the value is 0. For our example, this is not the case, but the CPU still has to check for zero and the other special values, which takes time.
Floating point multiplication is actually easier than addition:
| Step | Result |
|---|---|
| 1. Multiply ms x ns | 1.3319998979568 x 1.227625 = 1.6351963747292166 |
| 2. While the result ≥ 2, divide it by 2 and add 1 to me | (n/a) |
| 3. Add me + ne | 17 + (-2) = 15 |
| 4. Re-encode the result | s = 1.6351963747292166, e = 15 |
| (In decimal) | 53,582.11328125 |
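To make these steps concrete, here is a C sketch of 32-bit float multiplication done “by hand”, under the layout described earlier. It truncates instead of rounding and skips the special cases (zero, infinity, NaN, denormals), which a real FPU must also check:

```c
#include <stdint.h>
#include <string.h>

/* 32-bit float multiplication "by hand", following the steps above. */
float float_mul(float m, float n)
{
    uint32_t mb, nb;
    memcpy(&mb, &m, 4);                        /* reinterpret the raw bits */
    memcpy(&nb, &n, 4);

    uint32_t sign = (mb ^ nb) & 0x80000000u;   /* signs combine via XOR    */
    int32_t  me   = (int32_t)((mb >> 23) & 0xFF) - 127;
    int32_t  ne   = (int32_t)((nb >> 23) & 0xFF) - 127;

    /* Restore the implicit leading 1 on each 23-bit significand. */
    uint64_t ms = (mb & 0x007FFFFFu) | 0x00800000u;
    uint64_t ns = (nb & 0x007FFFFFu) | 0x00800000u;

    /* Step 1: multiply the significands (value is ss / 2^46, in [1, 4)). */
    uint64_t ss = ms * ns;
    /* Step 3: add the exponents. */
    int32_t e = me + ne;
    /* Step 2: normalize -- if the product reached 2, shift and bump e. */
    if (ss & (1ull << 47)) { ss >>= 1; e++; }

    /* Step 4: re-encode, truncating back to 23 significand bits. */
    uint32_t out = sign | (uint32_t)(e + 127) << 23
                        | (uint32_t)((ss >> 23) & 0x007FFFFFu);
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```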
This seems complicated, but addition is even more so:
| Step | Result |
|---|---|
| 1. If n > m, swap n and m to ensure n < m (compare exponents first; if they are equal, compare significands) | No swap required |
| 2. Subtract ne - me = ee | -2 - 17 = -19 |
| 3. Left shift ns by ee (a negative left shift is a right shift) | 1.227625 << -19 = 0.00000234150886536 |
| 4. Add ns to ms = ss | 1.3319998979568 + 0.00000234150886536 = 1.33200223946 |
| 5. While the result ≥ 2, divide it by 2 and add 1 to me | (n/a) |
| 6. Re-encode the result | s = 1.33200223946, e = 17 |
| (In decimal) | 174,588.19753 |
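And here is a matching C sketch of the addition steps, for positive normal floats only (again ignoring rounding and special values):

```c
#include <stdint.h>
#include <string.h>

/* 32-bit float addition for positive normal values, following the steps above. */
float float_add(float m, float n)
{
    uint32_t mb, nb;
    memcpy(&mb, &m, 4);
    memcpy(&nb, &n, 4);

    /* Step 1: make m the operand with the larger magnitude (swap if not). */
    if (nb > mb) { uint32_t t = mb; mb = nb; nb = t; }

    int32_t  me = (int32_t)((mb >> 23) & 0xFF) - 127;
    int32_t  ne = (int32_t)((nb >> 23) & 0xFF) - 127;
    uint32_t ms = (mb & 0x007FFFFFu) | 0x00800000u;  /* implicit 1 restored */
    uint32_t ns = (nb & 0x007FFFFFu) | 0x00800000u;

    /* Steps 2-3: shift n's significand right to align it with m's exponent. */
    int32_t shift = me - ne;
    ns = (shift < 32) ? (ns >> shift) : 0;

    /* Step 4: add the significands. */
    uint32_t ss = ms + ns;
    /* Step 5: normalize if the sum carried past 2. */
    if (ss & (1u << 24)) { ss >>= 1; me++; }

    /* Step 6: re-encode. */
    uint32_t out = (uint32_t)(me + 127) << 23 | (ss & 0x007FFFFFu);
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```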
Just as integer multiplication would chew up CPU cycles without special hardware, so would floating point math – which is why most modern CPUs include dedicated floating point hardware (called a Floating Point Unit, or FPU) for floating point math and manipulation.
Comparing Floats to Ints
The point of demonstrating all of this complexity is to show all of the extra work that has to be done to support floating point numbers, compared to integers.
We’ve looked at the internal CPU process, and it’s easy to see why floats are slower than ints:
| Floating Point | Integer |
|---|---|
| Special cases include zero, infinity, and infinitesimally small (denormal) values, all of which require extra handling steps. | Integers have no special cases. |
| Each floating point number consists of two components, each requiring separate manipulation and normalization steps. | An integer is a single component. |
| Floating point arithmetic, even when implemented in hardware, requires a discrete set of steps that can be computationally expensive. | Integer arithmetic, implemented in hardware, requires only 1 or 2 steps. |
What all of this means is that the CPU has to work harder during floating-point operations compared to integer operations:
- On older CPUs that don’t have dedicated FPU support, floating point arithmetic can take up to 4 times longer.
- On newer CPUs with dedicated FPU hardware, parallel pipelines, and conditional execution, floating point arithmetic can be as little as 1.3 times longer.
One more complication: if you perform a math operation on a float and an int, the CPU must first convert the int to a float, which adds an extra step:
| | Integer | Floating Point |
|---|---|---|
| Integer | Integer operation | Convert int to float, then a floating point operation |
| Floating Point | Convert int to float, then a floating point operation | Floating point operation |
Although it may seem trivial, converting an int to floating point entails these steps:
- Preserve the sign bit: b
- If n < 0, negate n so that s holds the magnitude
- Set e = 31
- Loop:
  - If s AND 0x80000000 is set, exit the loop (we found the high bit)
  - Left shift s by 1
  - Decrement e
  - If e < 0, exit the loop (n was all zeros)
- Convert b, e, s to float:
  - Send b to the FPU
  - Send e to the FPU
  - Send s to the FPU
  - Read the result from the FPU into register A
The last step assumes that the actual conversion is done in hardware, but here is what would happen inside that hardware:
- When the b register is loaded, shift it left by 31
- When the s register is loaded, shift it right by 8 (only 24 bits of precision) and drop the implicit leading 1
- When the e register is loaded, add 127 (e will never be negative, because we know n was an integer)
- Left shift e by 23 to line it up with the exponent field
- Combine b OR e OR s
So the simple act of converting an int to a float is fairly computationally expensive.
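Here is the whole dance as a C sketch (software only; it truncates the low bits instead of rounding, as the simplified hardware above would):

```c
#include <stdint.h>
#include <string.h>

/* Convert a 32-bit int to a 32-bit float, following the steps above. */
float int_to_float(int32_t n)
{
    if (n == 0) return 0.0f;             /* special case: all-zeros encoding */

    uint32_t b = (n < 0) ? 1u : 0u;      /* preserve the sign bit            */
    uint32_t s = (n < 0) ? (uint32_t)(-(int64_t)n) : (uint32_t)n;

    int32_t e = 31;                      /* hunt for the highest set bit     */
    while (!(s & 0x80000000u)) {
        s <<= 1;
        e--;
    }

    uint32_t out = (b << 31)
                 | (uint32_t)(e + 127) << 23    /* biased exponent            */
                 | ((s >> 8) & 0x007FFFFFu);    /* 23 bits, implicit 1 dropped */
    float r;
    memcpy(&r, &out, 4);
    return r;
}
```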
This gives us a hint about how to optimize our code…
- Reduce Integer to Floating Point conversions
- Reduce Floating Point Operations (FLOps)
How to Convert Operations to Integer-Only, Pt 1 – Beginner Mode
Here is a simple example:
Let’s say that you have a function that returns an integer value in the range of 0 to 1023, but you want to scale it by 2/3.
If you really suck at coding, your code might look like this:
```c
int i;
float k;
i = somefunction();
k = i * 0.66;
i = cint(k);
```
Here is what the code above does…
- Declare variables
- Capture our integer function output in ‘i’
- Multiply by 0.66, which as we expect, results in a float. We capture the float output in ‘k’
- Finally, we convert our scaled float back to int with cint, storing the result back in ‘i’
Let’s see why this code is horrible… Here is what the CPU has to do:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- Store A as ‘i’ (assign our function result to ‘i’); cost = 1
- Convert A to floating point (preparing for floating point operation); cost = 5
- Multiply by 0.66; cost = 6
- Store the result as ‘k’; cost = 1
- Convert A to integer from float; cost = 6
- Store the result as ‘i’; cost = 1
Total cost: 21
We quickly realize that if you assign a float to an integer variable, the conversion is done automatically. Even better, we don’t need an intermediate assignment, as we can handle the scaling inline:
```c
int i;
i = somefunction() * 0.66;
```
Better, but NOT MUCH BETTER. Here is what your CPU sees:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- Convert A to float; cost = 5
- Multiply A * $0.66; cost = 6
- Convert A to int; cost = 6
- Store the result as ‘i’; cost = 1
Total cost = 19
So although your code looks more efficient, IT’S NOT.
How about this:
```c
int i;
i = somefunction() * 2 / 3;
```
At first, this looks exactly like what we had above, but let’s look at how the CPU sees this:
- JUMP to somefunction
- POP A from the stack (result); cost = 1
- MUL A $2; cost = 4
- DIV A $3; cost = 4
- Store ‘i’; cost = 1
Total cost: 10
Wait…. what???
- Since we didn’t use any floating point values, there were no floating point conversions
- Since we didn’t use any floating point variables, the compiler automatically decided that no intermediate conversions were necessary
- We still have 5 instead of 6 steps, BUT… we have two FASTER integer operations instead of one SLOW FLOp, and no conversions.
So, even though we have almost as many steps, this code requires only about 50% of the CPU cycles, and is up to twice as fast!
Here is the interesting part…
If you integer-divide 2/3, the result is 0. If we multiply 1023 by (2/3) the integer answer will always be 0.
But… 1023 * 2 = 2046, and 2046 / 3 = 682.
So, if we change the order of operations – multiply first, divide next, then we get the desired result without having to use floating points.
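As a general-purpose version of this trick, here is a small helper sketch (the name and the 64-bit intermediate are my additions; widening keeps the multiply-first product from overflowing):

```c
#include <stdint.h>

/* Scale n by num/den in pure integer math. Multiplying first preserves
   precision; the 64-bit intermediate keeps n * num from overflowing. */
int32_t scale(int32_t n, int32_t num, int32_t den)
{
    return (int32_t)(((int64_t)n * num) / den);
}

/* scale(1023, 2, 3) == 682, whereas 1023 * (2 / 3) would be 0. */
```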
How to Convert Operations to Integer-Only, Pt 2 – Expert Mode
Let’s say that we want to map the output of our same function (0..1023) to an angle in radians.
Just to recap: in degrees, a circle is 360 degrees. It’s sliced up into 360 neat little slices, and 360 degrees is at the same position as zero. In radians, the circle is divided into 2π slices. So 0 and 2π sit at the same location, 90 degrees is π/2 (about 1.57) radians, 180 degrees is π (about 3.14) radians, and 270 degrees is at 3π/2 (about 4.71) radians.
In computing, we typically use radians over degrees for two reasons:
- There are some math shortcuts where things cancel out, when you consider the unit circle in radians instead of degrees.
- Most compilers and math libraries implement trig functions in radians, not degrees.
So if we want to convert our function output (0..1023) to an angle in radians (0..2π), the calculation looks like this:
```c
int i;
float k;
/* Get our function output */
i = somefunction();
/* Convert i to an interval of 0..1 */
k = i / 1023.0;
/* Multiply by the length around the unit circle */
k = k * 2 * PI();
```
- Int Ops: 1 * 4
- Conversions: 1 * 5
- FLOps: 3 * 6
- Total Cost: 27
Note: In the given context, the compiler stores 1023.0 and 2 as float constants, so no run-time conversion is necessary for them.
Simplified:
```c
float k;
k = somefunction() / 1023.0 * 2 * PI();
```
- Int Ops: 0 * 4
- Conversions: 1 * 5 (output from somefunction)
- FLOps: 3 * 6
- Total Cost: 23
At first, this looks optimum because the result is a float, and there is no way around this.
However, we can rearrange things so that there are fewer FLOps:
```c
int i;
float k;
i = somefunction() * 2 / 1023;
k = i * PI();
```
- Int Ops: 2 * 4
- Conversions: 1 * 5
- FLOps: 1 * 6
- Total Cost: 19
Now, everything is an integer until the last step, and the cost savings is evident.
BUT…
There is a problem.
Remember that our function ranges from 0..1023. We multiply first, then divide. However, by immediately storing an integer result, we lose any decimal granularity. We effectively have a function that returns 0, 1, or 2, which we then multiply by PI in the last step. Let’s look at a range of values, and see how this plays out.
```
f() =    0 * 2 =    0 / 1023 = 0 * PI = 0
f() =  128 * 2 =  256 / 1023 = 0 * PI = 0
f() =  256 * 2 =  512 / 1023 = 0 * PI = 0
f() =  384 * 2 =  768 / 1023 = 0 * PI = 0
f() =  512 * 2 = 1024 / 1023 = 1 * PI = 3.14159
f() =  640 * 2 = 1280 / 1023 = 1 * PI = 3.14159
f() =  768 * 2 = 1536 / 1023 = 1 * PI = 3.14159
f() =  896 * 2 = 1792 / 1023 = 1 * PI = 3.14159
f() = 1023 * 2 = 2046 / 1023 = 2 * PI = 6.28318
```
We effectively lose all precision after the decimal, which we need in order to avoid this “stepping” problem.
We can use some basic algebra to fix the stepping problem by retaining some precision, without losing our efficient integer operations.
- If we pre-calculate PI to a specific precision, we can deal with it as a fraction of integers: 3.14159 = 314159/100000
- We can pre-calculate other constants: 2 * 314159 / 1023 = 614 (and some fraction that we will ignore, assuming we have suitable precision)
```c
/* 2 * 314159 / 1023 */
const int C = 614;
/* Return k to a decimal. Note the use of .0 to force D to be a floating point. */
const float D = 100000.0;
int i;
float k;
i = somefunction() * C;
k = i / D;
```
- Int Ops: 1 * 4
- Conversions: 1 * 5
- FLOps: 1 * 6
- Total Cost: 15
This approach is even cheaper, and a quick check of the output shows that we’ve fixed the “stepping” problem:
```
f() =    0 * 614 =      0 / 100000.0 = 0.00000
f() =  128 * 614 =  78592 / 100000.0 = 0.78592
f() =  256 * 614 = 157184 / 100000.0 = 1.57184
f() =  384 * 614 = 235776 / 100000.0 = 2.35776
f() =  512 * 614 = 314368 / 100000.0 = 3.14368
f() =  640 * 614 = 392960 / 100000.0 = 3.92960
f() =  768 * 614 = 471552 / 100000.0 = 4.71552
f() =  896 * 614 = 550144 / 100000.0 = 5.50144
f() = 1023 * 614 = 628122 / 100000.0 = 6.28122
```
MUCH better!
How to Convert Operations to Integer-Only, Pt 3 – Super Advanced Mode
Lookup tables.
Hang on… you’re in for a ride, but we’ll get there together.
Let’s say you want to draw a circle – we’ll use an imaginary function called “plot(x,y)” to draw one pixel:
```c
/* Set radius */
int r = 100;
/* Calculate circumference */
int c = r * 2 * PI();
/* Circle center coords */
int cx = 200, cy = 200;
/* One of the advantages of using radians is that the incremental angle = 1/r */
float a = 0, ai = 1.0 / r;
int x, y, i;

for (i = 0; i < c; i++) {
    a = i * ai;             /* Angle */
    x = cos(a) * r + cx;    /* X coord */
    y = sin(a) * r + cy;    /* Y coord */
    plot(x, y);             /* Draw 1 pixel */
}
```
This plots a 200-pixel diameter circle whose center is (200,200).
The trick used to calculate the incremental angle – the angle we need for just one pixel at the circumference – works like this:
- Circumference of circle: c=2*r*π, which is also the exact number of pixels we need to draw.
- If r=1, then the circumference of a unit circle is simply 2π, which is also the number of radians around the circumference.
- So if we take any circle, the number of radians per pixel (the angle by which we have to shift to draw the next adjacent pixel) is 2π/c
- If we substitute c: 2π/2rπ
- Everything cancels: ai=1/r
- Example: We have a circle of radius 1, 1/1=1. Since the circumference is about 6, the incremental angle being 1 means we will draw 2π/1 = about 6 pixels.
- Example, radius 100: c=628; ai=1/100 = 0.01
Because we go through the circle only once, there is no real advantage to pre-calculating a lookup table, nor even in using all-integer calculations.
We can even draw a disc with no wasted cycles, even using floats:
```c
/* Radius */
int r = 100;
/* 1/2 Circumference */
int c = r * PI();
/* Circle Center */
int cx = 200, cy = 200;
/* Incremental Angle */
float a, ai = 1.0 / r;
int x, y, y1, y2, i;

/* To draw a disc, we only need quadrants 1 and 2. We will use cy+y and
   cy-y for each vertical line, then draw a line between the two. This
   draws the disc from right to left. It's a bit confusing, but y should
   be NEGATIVE in Q1 and Q2 (screen y grows downward), so we negate
   sin(a), ADD to cy to subtract (y1), and then subtract in order to add,
   to obtain y2. */
for (i = 0; i < c; i++) {
    a  = i * ai;            /* Calculate angle */
    x  = cx + cos(a) * r;   /* X coord */
    y  = -sin(a) * r;       /* Y offset - negative */
    y1 = cy + y;            /* Start y */
    y2 = cy - y;            /* End y */
    for (y = y1; y <= y2; y++) {
        plot(x, y);
    }
}
```
This draws a 200-pixel-wide disc centered at (200,200).
What if we want to draw a sphere?
- We can use the same trick we used with the disc – calculate +y and -y and draw a line between the two
- Due to z-clipping, we only see the nearest pixel at each (x,y). Rather than waste cycles drawing what we can’t see, we simply calculate the closest z value for each (x,y) on the shell of the sphere, and everything farther away at that (x,y) is clipped.
- We need sin, cos for calculating (x,y), but then for a given (x,y), we need cos again to calculate z. Think of each (x,-y..y) as the cross-section of a half-disc. At -y or +y, z is zero, but as y approaches 0, z approaches x.
- Since we will calculate cos(x) iteratively πr^2 times, this is a good place to build a lookup table!
- We get less benefit from sin(y) since we only need it once, but pre-calculating will keep things consistent.
```c
/* Precision */
const int PRECISION = 10000;
/* Radius */
int r = 100;
/* 1/2 Circumference at z=0 */
int c = r * PI();
/* Center */
int cx = 200, cy = 200, cz = 0;
/* Incremental Angle */
float a, ai = 1.0 / r;
/* Utility variables */
int x, y, y1, y2, y3, z, i;

/* LOOKUP TABLES!! */
int lcos[c], lsin[c];
for (i = 0; i < c; i++) {
    a = ai * i;
    lcos[i] = cos(a) * PRECISION;
    lsin[i] = sin(a) * PRECISION;
}
/* That was the last FLOp */

/* Draw sphere surface */
for (i = 0; i < c; i++) {
    x  = lcos[i] * r / PRECISION + cx;
    y1 = -lsin[i] * r / PRECISION;      /* Negative in Q1,Q2 */
    y2 = -y1;                           /* Reflect across 0 to get Q4,Q3 */
    y3 = y2 + y2;                       /* Pre-compute the height of the current slice */
    for (y = y1; y <= y2; y++) {
        z = y3 ? (y + y2) * 100 / y3 : 0;   /* Use 0..2*y2 to generate a lookup value 0..100 (guard the zero-height slice) */
        z = lcos[z] * y2 / PRECISION + cz;  /* Generate z using the lookup and half the slice width, y2 */
        plot(x, y + cy, z);                 /* Plot - note y+cy in order to apply the y offset */
    }
}
```
In addition to accomplishing this with all integers, we also eliminate the function calls.
Function calls are also one of those quasi-expensive CPU operations: they involve pushing parameters onto the stack, setting up a private variable space, and then a far jump to some code “in the memory cloud” somewhere. On return, the return value is pushed onto the stack, followed by another far jump back to the next instruction.
This method can be used for trig functions, 3D transformations, and even logarithms.
For example, if you pre-compute log and inverse-log tables (llog[] and lilog[]), you can do things like compute a square root, all in integer space – halving a log takes a square root:

```
lilog[llog[25] / 2] == 5
```
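As a sketch (the scale factor, table sizes, and function names here are my own choices, not fixed by the above), the tables might be built once with floats and then used forever after in integer space:

```c
#include <math.h>

#define SCALE 1000            /* fixed-point scale for the log values */
#define NMAX  256             /* largest input we support             */

int llog[NMAX];               /* llog[n]  = ln(n) * SCALE             */
int lilog[6 * SCALE];         /* lilog[v] = exp(v / SCALE); ln(255)*SCALE < 6*SCALE */

void build_log_tables(void)
{
    for (int n = 1; n < NMAX; n++)
        llog[n] = (int)(log((double)n) * SCALE + 0.5);
    for (int v = 0; v < 6 * SCALE; v++)
        lilog[v] = (int)(exp((double)v / SCALE) + 0.5);
}

/* sqrt(n) = exp(ln(n) / 2) -- halving a log takes a square root. */
int isqrt_lookup(int n)
{
    return lilog[llog[n] / 2];   /* isqrt_lookup(25) == 5 */
}
```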
Conclusion
Faster is better.
Int Ops are always faster than FLOps.
This is why FLOps Per Second (FLOPS) is a common measure of computing power, rather than plain (integer) Instructions Per Second.
The better your CPU, the more efficiently you can process a FLOp, and therefore, the more FLOps you can compute in one second.
However, your code will always be faster with fewer FLOps, regardless of how powerful your CPU, or how efficiently it handles floating point conversion and arithmetic.
- Multiply before Divide
- Refactor, if you have a stepping problem
- Use lookup tables for repetitive floating point operations (or function calls)
Have fun!