Smallest And Largest Floating Point Values In Python

In my production code, I have a function that calculates a/(a+b). That function threw a ZeroDivisionError at me this afternoon, so I decided to change the equation to a/(a+b+sys.minfloat), because that felt more elegant than writing a conditional check for (a + b) == 0.
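
Here's a rough sketch of the two options I was weighing (the function names are made up for illustration, and tiny stands in for the sys.minfloat value I was planning to use):
def ratio_guarded(a, b):
    # the conditional version I wanted to avoid
    if a + b == 0:
        return 0.0              # or whatever fallback the caller needs
    return a / (a + b)

def ratio_padded(a, b, tiny):
    # the version I went for: pad the denominator with a tiny positive float
    return a / (a + b + tiny)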

Turns out there is no 'minfloat' in the sys module, so I decided to write a function to calculate the smallest float myself.
>>> def minfloat(guess):
...     while guess * 0.5 != 0:
...         guess = guess * 0.5
...     return guess

>>> minfloat(+1.0) # minimum positive value of a float
4.9406564584124654e-324

>>> minfloat(-1.0) # negative float closest to zero
-4.9406564584124654e-324


But I couldn't stop there. Now I had to write a function to calculate the largest possible floating point value in Python, just for kicks:
>>> def maxfloat(guess=1.0):
...     while guess * 2 != guess:
...         guess = guess * 2
...     return guess

>>> maxfloat(+1.0) # maximum positive value of a float
inf

>>> maxfloat(-1.0) # most negative value of a float
-inf


This is interesting. Let's find out more about this "inf" value:
>>> inf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    inf
NameError: name 'inf' is not defined
>>> float("inf")
inf

>>> inf = maxfloat()

>>> inf + inf
inf

>>> inf - inf
nan

>>> 1 / inf
0.0

>>> 1/(-inf)
-0.0
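
By the way, nan compares unequal to everything, including itself, so the reliable way to detect these values is math.isnan and math.isinf (available since 2.6):
>>> import math
>>> nan = inf - inf
>>> nan == nan
False
>>> math.isnan(nan)
True
>>> math.isinf(inf)
True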


Finally, check the relationship between minfloat and maxfloat:
>>> 1 / minfloat(1.0)
inf
>>> 1 / minfloat(-1.0)
-inf
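
One footnote: on 2.6 and newer, sys.float_info exposes these limits directly. Its min is the smallest normalized positive float, which is far larger than the denormal that minfloat() digs up:
>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
>>> sys.float_info.min
2.2250738585072014e-308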

Trivial Solution To CPU-Bound Thread Slowdown On Multicore Systems

According to David Beazley, the CPython GIL doesn't just prevent CPU-bound applications from taking advantage of multiple cores. He says it also slows them down on multicore systems.

As an example, he shows that the following CPU-bound function runs slower on a multicore system when run in two threads than when run twice sequentially in the same thread:
def counts(n=10000000):
    while n > 0:
        n -= 1

I tested the claim by running this little program (let's call it gil.py):
from time import time
from threading import Thread

def counts(n=10000000):
    while n > 0:
        n -= 1

def measure(f, comment):
    t = time()
    f()
    print comment + ":", time() - t, "seconds"

def sequ():
    counts()
    counts()

def para():
    t1 = Thread(target=counts, args=())
    t1.start()
    t2 = Thread(target=counts, args=())
    t2.start()
    t1.join(); t2.join()

if __name__ == "__main__":
    measure(sequ, "Sequential Execution")
    measure(para, "Parallel Execution")


Here's the result from running the script on a Core 2 Duo laptop:
C:\Py>python gil.py
Sequential Execution: 4.81399989128 seconds
Parallel Execution: 11.2730000019 seconds


He's sort of right in this case. So why has the GIL survived? Partly because the problem is easily solved.

One way to do that is to use sys.setcheckinterval, which lets CPU-bound threads do a little more work before giving up the GIL. Since the overhead of the GIL handoffs above is more than 100%, we need to increase the check interval by a factor of a hundred or so. (The default is 100.)

Just add the following two lines at the top of the "if __name__ == '__main__'" block in gil.py:
    import sys
    sys.setcheckinterval(100 * 100)


Let's run it again and see what happens:
C:\Py>python gil.py
Sequential Execution: 4.63999986649 seconds
Parallel Execution: 4.63199996948 seconds


Problem solved!

Avoiding Pre-emption Altogether

Personally, I like to include "sys.setcheckinterval(sys.maxint)" in all my scripts, so I don't have to lock shared data structures when manipulating them in Python code. If you do that, your threads will never be pre-empted as long as you don't call any blocking functions that release the GIL, and you can avoid the overhead of fine-grained locking, potential deadlocks, etc. What do you think?
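
For concreteness, the kind of setup I have in mind looks roughly like this; it's a minimal sketch that assumes CPython 2.x (Python 3 drops sys.maxint and replaces the check interval with a switch interval) and that neither thread ever makes a blocking call:
import sys
from threading import Thread

sys.setcheckinterval(sys.maxint)   # effectively disable preemptive switching

shared = {"hits": 0}               # shared structure, deliberately unlocked

def bump(n=100000):
    for _ in xrange(n):
        shared["hits"] += 1        # unlocked read-modify-write

t1 = Thread(target=bump)
t2 = Thread(target=bump)
t1.start(); t2.start()
t1.join(); t2.join()
print shared["hits"]               # expect 200000 if nothing was preempted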