To check out Python's multiprocessing functionality, and quite inspired by Benjamin Scherrey's presentation on Test Driven Development, I wrote a pi calculator that uses the Monte Carlo method together with the Python multiprocessing module.
For those who are not aware, the Monte Carlo method here is, roughly speaking, like blindly throwing darts at a circle framed perfectly in a square, where the side of the square equals the diameter of the circle. The fraction of darts that lands inside the circle, out of all darts thrown, approaches pi/4, so multiplying that fraction by 4 gives an estimate of pi. For further information you can take a look here.
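To make the idea concrete, here is a minimal single-process sketch (mine, not part of the multiprocessing code discussed below; the sample count of 2**20 is arbitrary):

import math
import random

random.seed(1)

TOTAL = 2**20
inside = 0
for _ in xrange(TOTAL):
    # A random point in the unit square
    x, y = random.random(), random.random()
    # Points less than 1 away from the origin lie inside the quarter circle
    if math.hypot(x, y) < 1:
        inside += 1

# The quarter circle covers pi/4 of the unit square, so scale by 4
print float(inside) / TOTAL * 4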
Initially the code used the map() function, which needed a list passed to it containing an item for every coordinate I wanted to process. When the list got really large (2^24 items) I realized that the Python interpreter was taking up a lot of RAM, so I decided to use iterators instead of passing a really large list.
My initial concern with the iterator was that, because an item has to be requested on every iteration, there would be a significant performance penalty: each worker in the pool would have to wait for an item to be taken off the iterator before the next calculation could be performed.
My first version of the code was as follows:
import math
import random
from multiprocessing import Pool

# Keep seed constant
random.seed(1)

'''
Main Work function
deleteme is there because we need to pass it something
for it to work with map()
'''
def do_work(deleteme):
    # Not in circle by default
    returnValue = 0
    coordinate = (random.random(), random.random())
    # Set return to 1 if coordinate within circle
    if math.hypot(coordinate[0], coordinate[1]) < 1:
        returnValue = 1
    return returnValue

if __name__ == "__main__":
    # Total number of items we're going through
    TOTAL = 2**24
    # start a pool with 4 processes
    pool = Pool(processes=4)
    # give it work in chunks of 65536
    result = pool.map(do_work, xrange(TOTAL), 65536)
    # outputs pi
    print ( float(sum(result)) / float(TOTAL) ) * 4
When run, this code takes up 140MB of RAM and 34.87 user seconds to execute. I then substituted the __main__ block with the following:
if __name__ == "__main__":
    # Total number of items we're going through
    TOTAL = 2**24
    # start a pool with 4 processes
    pool = Pool(processes=4)
    # give it work in chunks of 65536
    result = pool.imap_unordered(do_work, xrange(TOTAL), 65536)
    totalResult = sum(x for x in result)
    print ( float(totalResult) / float(TOTAL) ) * 4
The results were 12MB of RAM and 39.01 user seconds: roughly a four-second penalty in exchange for a better than tenfold drop in memory use, which I consider insignificant.
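The post doesn't record how these figures were gathered; one way to collect comparable numbers, besides running each script under the Unix time command, would be to have the script report its own usage via the standard resource module. A Unix-only sketch, not taken from the original code, to be run at the end of the __main__ block after the pool has been closed and joined:

import resource

# User CPU time of this process plus any worker processes
# that have already been waited on (hence close/join the pool first)
me = resource.getrusage(resource.RUSAGE_SELF)
kids = resource.getrusage(resource.RUSAGE_CHILDREN)
print "user seconds:", me.ru_utime + kids.ru_utime
# Peak resident set size of the parent process (kilobytes on Linux)
print "parent peak RSS (kB):", me.ru_maxrss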
Conclusion:
It seems that my concerns were unfounded: Python does a good job of making sure there is always work queued for the pool and results ready to be consumed, and any overhead involved is minuscule.
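Out of curiosity, the chunking that makes this work is easy to observe directly. The following small sketch (not part of the original experiment; the pool size of 2 and chunk size of 4 are arbitrary) tags each item with the worker process that handled it, and items belonging to the same chunk come back tagged with the same worker name:

from multiprocessing import Pool, current_process

def tag(item):
    # Report which worker process handled this item
    return item, current_process().name

if __name__ == "__main__":
    pool = Pool(processes=2)
    # Chunks of 4 items are handed to a worker in one go, so the
    # hand-off cost is paid once per chunk, not once per item
    for item, worker in pool.imap_unordered(tag, xrange(8), 4):
        print item, worker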