Men’s Continued Mismeasurements: An Improved Formula for Penis Size Estimation

Well, I already gave one formula for penis size prediction, but I realized the dataset I had included many other significant variables as well, so I might as well use them. Due to the increased complexity of the formulas, I had to start using SAS instead of Excel, but the general idea is the same. In both cases, we’re minimizing the sum of the squared differences between the actual values and the values our formula predicts, by adjusting the coefficients. My statistics class actually came in useful, as we had recently gone over how to turn a categorical variable, something like hair color, into numeric variables that can be processed by the formula.
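A minimal Python sketch of the least-squares idea, not the SAS code used for this post. The data points are invented toy numbers and the function names are hypothetical; with a single predictor the best-fitting coefficients have a closed form, and nudging either one away from that fit makes the sum of squares larger.

```python
def sum_squared_error(xs, ys, intercept, slope):
    """Sum of squared differences between the actual ys and the line's predictions."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Closed-form least-squares fit for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy numbers, invented for illustration only (not from the dataset).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

intercept, slope = fit_line(xs, ys)
best = sum_squared_error(xs, ys, intercept, slope)
nudged = sum_squared_error(xs, ys, intercept, slope + 0.1)
print(intercept, slope, best, nudged)
```

The formulas in this post have several predictors instead of one, but the quantity being minimized is the same.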

The basic idea is to separate it out into a series of binary variables. So, for example, if the hair color options are blonde, brown, black, and red, then you create three (n-1) binary variables that each track whether the individual has one trait, such as blonde hair, brown hair, or black hair. Each is 1 if so, and 0 otherwise. The coefficient attached to each variable is then just a simple addition to the formula, since A*1=A. Interesting and useful.
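The encoding can be sketched in a few lines of Python. The function name `dummy_code` and the choice of red as the left-out category are assumptions for illustration; the left-out category is simply the one whose indicators are all zero.

```python
def dummy_code(value, levels, reference):
    """Return n-1 binary indicators for `value`, leaving out the reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {
        f"is_{level}": int(value == level)
        for level in levels
        if level != reference
    }


HAIR_LEVELS = ["blonde", "brown", "black", "red"]

brown = dummy_code("brown", HAIR_LEVELS, reference="red")
red = dummy_code("red", HAIR_LEVELS, reference="red")
print(brown)
print(red)
```

A redhead gets zeros in all three columns, which is why one category needs no coefficient of its own.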

By bringing in other variables, not only was I able to make the formula more accurate (R², the proportion of variability explained by the model, went from 0.5787 to 0.7118), but I was also able to eliminate the need for an initial estimate altogether, although at the cost of accuracy, which I find interesting. Since the initial estimate is such a great predictor, though (as you’d expect: if you want to predict something, you could do worse than asking the owner), the model changes quite a bit without that variable, so really there are two separate formulas: one where you have an initial self-reported estimate of penis size, and another where you don’t.
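For anyone unsure what "variability explained" means in practice, here is a small Python sketch of the usual R² calculation. The numbers are invented toy values, not observations from the dataset, and `r_squared` is a hypothetical helper name.

```python
def r_squared(actual, predicted):
    """1 minus (unexplained squared error / total squared variation around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_error / ss_total


# Toy values, invented for illustration only.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

r2 = r_squared(actual, predicted)
print(r2)
```

An R² of 0.7118 would mean the model's predictions leave about 29% of the total squared variation unexplained.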

Now, the formula gets a bit more complicated, but I’ll do my best to explain it. It’s a combination of categorical additions (for example, if the guy has black hair, add so-and-so centimeters to the estimate) and products (the guy’s height times so-and-so centimeters per meter). All values are in centimeters.

The variables used are:
Height (in meters)
Age (in years)
Hair Color
Eye Color
Ethnicity
Self-Measured Penis Length (in centimeters)

Now then, the formula is:

Predicted Penis Size = -5.56387 + (Age * -0.00895) + (Height * 6.57043) + (Self-Measured Penis Length * 0.45639) + (Eye Color Value) + (Hair Color Value) + (Ethnicity Value)

The values for eye color, hair color, and ethnicity come from the tables below.

Eye Color        Eye Color Value
Green            0.4041
Brown            0.48353
Blue             -0.2685
Gray             0
Other            0.3803764

Hair Color       Hair Color Value
Brown            0.49832
Blonde           0.37735
Black            0.61258
Red              0
Other            0.511714767

Ethnicity        Ethnicity Value
Caucasian        0.09476
Arab             -0.59318
East Asian       -0.91096
Black            1.30666
Latino           1.16791
Mediterranean    -0.16643
Mixed            0.34564
Central Asian    -0.31054
Indian           -1.12532
Australoid       0
Other/Unknown    0.10610367

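(The classes with a value of 0 don’t indicate any sort of normality. They are simply the categories left without a binary variable of their own, a by-product of the linear regression method I used, and every other value in the same table is measured relative to them.)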
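As a rough sketch, the formula with a self-reported estimate translates into Python like this. The coefficients and table values are copied from above; the function name, dictionary names, and the example man are made up for illustration.

```python
# Category values copied from the tables for the formula WITH a self-measured length.
EYE = {"Green": 0.4041, "Brown": 0.48353, "Blue": -0.2685, "Gray": 0.0,
       "Other": 0.3803764}
HAIR = {"Brown": 0.49832, "Blonde": 0.37735, "Black": 0.61258, "Red": 0.0,
        "Other": 0.511714767}
ETHNICITY = {"Caucasian": 0.09476, "Arab": -0.59318, "East Asian": -0.91096,
             "Black": 1.30666, "Latino": 1.16791, "Mediterranean": -0.16643,
             "Mixed": 0.34564, "Central Asian": -0.31054, "Indian": -1.12532,
             "Australoid": 0.0, "Other/Unknown": 0.10610367}


def predict_with_estimate(age_years, height_m, self_measured_cm, eye, hair, ethnicity):
    """Predicted size in cm, using the formula that includes the self-reported length."""
    return (-5.56387
            + age_years * -0.00895
            + height_m * 6.57043
            + self_measured_cm * 0.45639
            + EYE[eye]
            + HAIR[hair]
            + ETHNICITY[ethnicity])


# Hypothetical example: 30 years old, 1.80 m, claims 16 cm.
example = predict_with_estimate(30, 1.80, 16.0, "Brown", "Brown", "Caucasian")
print(round(example, 2))
```

Note that height goes in as meters (1.80, not 180), matching the variable list above.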

Using this model, the average error is about a centimeter, which implies that, if the errors are normally distributed with that as their standard deviation, you have about a 68% chance of being within a centimeter of the true size. Oftentimes, though, men don’t just blurt out an estimate of their penis size, which requires a different formula. The variables are the same, only now we have no self-measured penis size. This time, the formula and value charts are:

Predicted Penis Size = -8.54628 + (Age * -0.00711) + (Height * 11.59279) + (Eye Color Value) + (Hair Color Value) + (Ethnicity Value)

Eye Color        Eye Color Value
Green            1.16689
Brown            1.26317
Blue             0.29298
Gray             0
Other            1.1251647

Hair Color       Hair Color Value
Brown            0.76874
Blonde           0.54502
Black            1.14517
Red              0
Other            0.8795126

Ethnicity        Ethnicity Value
Caucasian        1.08656
Arab             0.53232
East Asian       -1.47795
Black            2.63534
Latino           2.1038
Mediterranean    0.64598
Mixed            1.25143
Central Asian    0.20177
Indian           -1.33676
Australoid       0
Other/Unknown    0.99191133
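The no-estimate formula can be sketched the same way in Python. Again, the numbers come straight from the formula and tables above, while the names and the example man are hypothetical.

```python
# Category values copied from the tables for the formula WITHOUT a self-measured length.
EYE_NO_EST = {"Green": 1.16689, "Brown": 1.26317, "Blue": 0.29298, "Gray": 0.0,
              "Other": 1.1251647}
HAIR_NO_EST = {"Brown": 0.76874, "Blonde": 0.54502, "Black": 1.14517, "Red": 0.0,
               "Other": 0.8795126}
ETHNICITY_NO_EST = {"Caucasian": 1.08656, "Arab": 0.53232, "East Asian": -1.47795,
                    "Black": 2.63534, "Latino": 2.1038, "Mediterranean": 0.64598,
                    "Mixed": 1.25143, "Central Asian": 0.20177, "Indian": -1.33676,
                    "Australoid": 0.0, "Other/Unknown": 0.99191133}


def predict_without_estimate(age_years, height_m, eye, hair, ethnicity):
    """Predicted size in cm when no self-reported length is available."""
    return (-8.54628
            + age_years * -0.00711
            + height_m * 11.59279
            + EYE_NO_EST[eye]
            + HAIR_NO_EST[hair]
            + ETHNICITY_NO_EST[ethnicity])


# Hypothetical example: 30 years old, 1.80 m, brown eyes, brown hair.
example_no_est = predict_without_estimate(30, 1.80, "Brown", "Brown", "Caucasian")
print(round(example_no_est, 2))
```

The two sets of category values are not interchangeable, so each formula has to be used with its own tables.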

This formula is less exact, with an average error of about 1.7 cm, which means you have only about a 44% chance of being within a centimeter of the true size. Not awful, that’s almost even odds, but it’s certainly much better to get an estimate if you can. The best thing about the formula that uses an estimate is that the expectation of a lie is built in. The average exaggeration is 2.65 cm, or a little over an inch.
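The 68% and 44% figures can be reproduced with a short Python check. This is a sketch that assumes the errors are normally distributed around zero and treats the quoted average error as the standard deviation; `chance_within` is a hypothetical helper name.

```python
import math


def chance_within(tolerance_cm, error_sd_cm):
    """P(|error| < tolerance) for a zero-mean normal error with the given standard deviation."""
    return math.erf(tolerance_cm / (error_sd_cm * math.sqrt(2)))


with_estimate = chance_within(1.0, 1.0)      # first formula, error of about 1 cm
without_estimate = chance_within(1.0, 1.7)   # second formula, error of about 1.7 cm
print(round(with_estimate, 2), round(without_estimate, 2))
```

If the errors are skewed or heavy-tailed, these percentages would shift, so they are best read as ballpark figures.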

So what does this data indicate?  Well, first, age matters very, very little to penis size, but all of our observations are at least 18, so we should say that once you reach the age of 18, age matters little.

Secondly, height does matter.  Without any previous estimate of penis size, even a decimeter increase in height, about four inches, correlates with an average increase of a centimeter in penis size.
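The arithmetic behind that is just the height coefficient times the extra height in meters. A quick Python sketch, with hypothetical names, for both formulas:

```python
HEIGHT_COEF_NO_ESTIMATE = 11.59279   # cm of predicted size per meter of height
HEIGHT_COEF_WITH_ESTIMATE = 6.57043  # same, when a self-measured length is included


def height_effect_cm(extra_height_m, coef):
    """Change in predicted size (cm) for a given change in height (m)."""
    return extra_height_m * coef


no_estimate_effect = height_effect_cm(0.10, HEIGHT_COEF_NO_ESTIMATE)
with_estimate_effect = height_effect_cm(0.10, HEIGHT_COEF_WITH_ESTIMATE)
print(round(no_estimate_effect, 2), round(with_estimate_effect, 2))
```

With a self-reported estimate in the model, the same decimeter counts for less, since the estimate already carries much of that information.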

As for eye color, brown-eyed men tend to be largest, with blue- and gray-eyed men the smallest. With hair color, black-haired men tend to be largest and redheads the smallest, which incidentally supports an anecdote I once heard from a summer camp coworker: that Irish men tend to be smaller than average. With regard to ethnicity, men of African and Latino descent tend to be largest, while East Asians and Indians tend to be smallest, which aligns with the most common stereotypes and prejudices.

Honestly, I kind of wish I could turn this into an app or something.  Might be useful, yeah?  Interesting, at least.  You could probably get a lot more observations, too, if you had a way to verify the true size later.


5 Responses to Men’s Continued Mismeasurements: An Improved Formula for Penis Size Estimation

  1. kaushik55 says:

    Dear John,
    Perhaps, you can sell this framework to a software company who can make a ‘killer-app’ out of it and you can laugh all the way to the bank! Meanwhile, I thought this a rather convoluted way of saying ‘Balls to statistics’? I hated statistics classes too! Cheers 🙂


  2. Pingback: An Oddity in My Search Terms | John Kutensky

  3. −5.56387+13+1.70688
    +0.09476 = 9.4607


