Dealing with a precision error when generating a probability distribution in Python


In the context of a model I'm building, at one point I need to generate a probability distribution from an array of real numbers. I'll leave out the irrelevant details, but I have a function (we'll call it "f" for now) that generates an array of n floats:

arr = [value_1, value_2, ..., value_n] 

Now, these values are proportional to the probabilities that I next need to use in a multinomial sampling procedure, so the obvious approach is this:

result = np.random.multinomial(number_of_samples,arr/arr.sum()) 

But this (sometimes) doesn't work! The sum of arr/arr.sum() ends up being greater than 1. In principle this should be mathematically impossible, so I'm assuming it boils down to a floating-point precision issue. Here's a trivial example of how this can happen:

In [58]: arr = np.array([1/20.]*20)

In [59]: arr/arr.sum()
Out[59]:
array([ 0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,
        0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,  0.05,
        0.05,  0.05])

In [60]: (arr/arr.sum()).sum()
Out[60]: 1.0000000000000002

So, long story short, my question is how best to deal with this. I can cheat by adding a small number to the sum, i.e.:

probs = arr / (arr.sum()+0.000001) 

But this is hackish, and I fear it may introduce further unwanted precision issues. Is there a better solution?

Start by reading https://docs.python.org/2/tutorial/floatingpoint.html

In a nutshell, binary floating point can't represent 0.05 exactly. The effect is minute:

>>> repr(1/20.)
'0.05'
>>> repr(sum([1/20.]*20))
'1.0000000000000002'
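To make the representation issue visible, here is a minimal sketch using only the standard library. It prints the exact value stored for 0.05, then sums twenty copies in two ways: with an explicit loop that rounds after every addition, and with math.fsum, which rounds only once at the end. The variable names are illustrative and not part of the original question.

```python
from decimal import Decimal
import math

# Decimal(float) converts without rounding, so it shows the exact
# binary value that the float 0.05 actually stores.
stored = Decimal(0.05)
print(stored)

values = [1 / 20.0] * 20

# Explicit left-to-right accumulation: every += rounds the partial sum.
running_total = 0.0
for v in values:
    running_total += v
print(repr(running_total))

# math.fsum keeps track of the lost low-order bits and rounds once.
fsum_total = math.fsum(values)
print(repr(fsum_total))
```

The loop is written out by hand so that the result does not depend on how the built-in sum() happens to be implemented in a given Python version.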

The correct solution is to define the desired precision for each mathematical operation, calculate the rounding error of each step, and round accordingly when necessary.

In this case, you can round to 5 digits, since you're only adding a few numbers.

>>> repr(round(sum([1/20.]*20),5))
'1.0'
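Rounding the sum confirms that it is close to 1, but it does not change the probabilities themselves. One possible alternative, sketched here with the standard library only, is to normalize with math.fsum and then compare against 1 with a tolerance instead of testing for equality. The helper normalize and the tolerance 1e-9 are assumptions made for illustration, not part of NumPy or of the original question.

```python
import math


def normalize(weights):
    """Scale non-negative weights so that they sum to approximately 1.

    Hypothetical helper written for this example.
    """
    total = math.fsum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


probs = normalize([1 / 20.0] * 20)

# Compare with a tolerance rather than with ==.
# rel_tol=1e-9 is an assumed value; choose one that suits your data.
is_normalized = math.isclose(math.fsum(probs), 1.0, rel_tol=1e-9)
print(is_normalized)

# Sum of every entry except the last one.
head_sum = math.fsum(probs[:-1])
print(head_sum <= 1.0)
```

The second check is there because NumPy's documentation for multinomial states that the last element of pvals is assumed to account for the remaining probability as long as sum(pvals[:-1]) <= 1. If that holds for your NumPy version, an excess of one unit in the last place across the full sum should not matter, but it is worth verifying against the version you run.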

But for more complex calculations, if you need the result to be correct, you have to do an error assessment.
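As a rough illustration of what such an error assessment can look like for a plain sum, the sketch below compares the naive floating-point total with the exact total computed using fractions.Fraction, and checks it against a simple first-order bound of (n - 1) * eps * sum(|x|). The bound is an assumption chosen for this example (a deliberately loose form of the usual estimate for left-to-right summation), not a general-purpose guarantee.

```python
from fractions import Fraction
import sys

values = [1 / 20.0] * 20

# Naive left-to-right sum: each addition rounds the partial result.
naive = 0.0
for v in values:
    naive += v

# Fraction(float) is exact, so this is the true sum of the stored values.
exact = sum(Fraction(v) for v in values)

# Assumed first-order bound on the accumulated rounding error.
eps = Fraction(sys.float_info.epsilon)
bound = (len(values) - 1) * eps * sum(abs(Fraction(v)) for v in values)

actual_error = abs(Fraction(naive) - exact)
print(float(actual_error), float(bound))
print(actual_error <= bound)
```

If the bound is smaller than the precision you need, rounding to that precision is justified. If it is not, a more accurate summation method or a tighter analysis is needed.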

