code computer-science linux mathematics open source original-content
by Layne
3 comments
Integer Square-root Function
While browsing the Linux kernel source this morning at work, I stumbled upon this little gem. As you may already know, kernel code can’t use any floating-point calculations, since the state of the Floating Point Unit (FPU) isn’t saved during context switches to kernel-space. This function computes the integer square root of the provided integer. It returns the largest integer less than or equal to the true square root of the input number. Code and sample output provided after the break.
#include <stdio.h>
unsigned long int_sqrt(unsigned long x)
{
unsigned long op, res, one;
op = x;
res = 0;
one = 1 < < 30;
while (one > op)
{
one >>= 2;
}
while (one != 0)
{
if (op >= res + one)
{
op = op - (res + one);
res = res + 2 * one;
}
res >>= 1;
one >>= 2;
}
return res;
}
int main()
{
unsigned long index, result;
for (index = 0; index < 18; index++)
{
result = int_sqrt(index);
printf("int_sqrt(%2ld) = %2ld\n", index, result);
}
return 0;
}Using the simple main function above, here is some sample output:
int_sqrt( 0 ) = 0
int_sqrt( 1 ) = 1
int_sqrt( 2 ) = 1
int_sqrt( 3 ) = 1
int_sqrt( 4 ) = 2
int_sqrt( 5 ) = 2
int_sqrt( 6 ) = 2
int_sqrt( 7 ) = 2
int_sqrt( 8 ) = 2
int_sqrt( 9 ) = 3
int_sqrt( 10 ) = 3
int_sqrt( 11 ) = 3
int_sqrt( 12 ) = 3
int_sqrt( 13 ) = 3
int_sqrt( 14 ) = 3
int_sqrt( 15 ) = 3
int_sqrt( 16 ) = 4
int_sqrt( 17 ) = 4
Update
Peter Harrison (http://www.micromouseonline.com) has provided another version of the integer square root function. Here are his comments:
This is a slightly different version. I only return an int since the answer can never be bigger than that. There is now an optimisation taken from the linux code. The initial mask is now pre-adjusted before the main loop.
Note that the main reason it is quicker is the fact that an int on my processors is 16 bit while a long is 32 bit. This is not a safe assumption for all architectures and the linux code is more generally correct.
unsigned long isqrt(unsigned long value)
{
unsigned int result = 0;
unsigned int mask = 0x8000;
unsigned int temp;
while (mask > value)
{
mask >>= 2;
}
while (mask)
{
temp = result + mask;
if ((unsigned int) temp*temp< = value)
{
result = temp;
}
mask >>= 1;
}
return result;
}
Ooops, the code was chewed up! It should read:
if ((unsigned long) temp*temp>= 1;
}
return result;
}
I give up. It gets chewed again. Sorry to mess you about. I can email you the code if you like
Found your page when searching for fast integer square root algorithms. The code looked interesting. After all, those linux kernel guys should know a thing or two right? So I copied it into my IDE and compared it with the code I already had. Waddaya know – it took 50% longer! My code is not original to me – it is a translation of some old assembly language stuff I have. The linux kernel code seems to be an adaptation of an optimised routine intended for a compiler/processor without a decent multiply instruction. I am compiling for the dsPIC32 which has a fast multiply. It would be interestnig to test against a current x86 processor.
Here is the code:
unsigned int mysqrt(unsigned long value){
unsigned int result = 0;
unsigned int mask = 0×8000;
unsigned int temp;
while (mask){
temp = result + mask;
if ((unsigned long) temp*temp>= 1;
}
return result;
}
Pete HArrison