darkoshi: (Default)
I've been thinking about it more, and the conclusion I've come to is that the formula given on that webpage is not an accurate calculation of the probability of someone being male or female, even taking for granted that the input variables r_n are accurate.

It is simply a formula which appears to give a reasonable result for various ratios. But it is not necessarily accurate when n > 1, that is, when more than one website is taken into account.

If the male-to-female ratio of visitors to a website is 2-to-1, then the probability of a website visitor being female will be 33.3%. However, given another website with the same ratio, and given that a person has visited both websites, this does not conclusively indicate that the probability of that person being female is then 20%, as the formula suggests. In order to determine the actual probability, we would need to know the actual M/F ratio of people who have visited both websites; however this piece of data is not given nor can it be determined from the separate ratios for the individual websites.

For example, suppose that all the people who visited the first website are the same group of people who visited the 2nd website. In that situation, the overall probability of one of those people being female would remain 33.3% even though they had been to both websites.
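To make those numbers concrete, here is a small Python sketch. The audience of 20 males and 10 females and the name share_female are made up for illustration, and multiplying the two ratios is only my reading of what the formula on that webpage does.

```python
from fractions import Fraction

def share_female(males, females):
    """Fraction of a group that is female, kept exact."""
    return Fraction(females, males + females)

# Hypothetical audience: 20 males and 10 females (a 2-to-1 M/F ratio).
site1 = {"males": 20, "females": 10}
# Assumption for this example: website #2 has exactly the same visitors.
site2 = dict(site1)

# The people who visited both sites are the same 30 people.
actual = share_female(site2["males"], site2["females"])

# What multiplying the two M/F ratios would predict instead.
r1 = Fraction(site1["males"], site1["females"])
r2 = Fraction(site2["males"], site2["females"])
predicted = 1 / (1 + r1 * r2)

print(actual)     # 1/3
print(predicted)  # 1/5
```

The real share of females among people who visited both sites stays at one third, while the multiplied ratios claim one fifth.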

As another example, suppose that a certain topic is mainly of interest to females, but that a small percentage of males are also interested in that topic. Websites devoted to that topic will tend to have a high F/M ratio. However, the males who are likely to visit one of those websites will also be likely to visit others of them. Just because a person visits several of those websites does not make them less and less likely to be male.

The formula does seem to make sense in a general way, as a guess as to whether a person is male or female, based on whether the result is more or less than 50%. However, the actual percentage numbers given are not logically accurate.


Update on 08/02/2008 (still haven't been able to drop this line of thought):

Ok, so if you base your calculation on the assumption that each website is completely distinct topic-wise, from every other website, and so that the M/F ratio at any website is completely unrelated to the M/F ratio at another website... That there is just some inherent quality of the website that attracts Ms and Fs in different ratios, and that quality is completely unrelated to all the other websites' inherent qualities of M/F attraction...

Ok, so then maybe that formula does make sense.

Given website #1, if X females visit it, then X * R1 males visit it (as R1 = the M/F ratio for that website).

Given website #2, suppose that all the visitors from website #1 "bump" into website #2 (where bumping into the site isn't visiting it, it is just coming across it and being given the decision to visit it or not).

Given that R2 is completely unrelated to R1, we can expect that, among the people from website #1 who have bumped into website #2, the ratio of males to females who decide to visit #2 will be R2.

Suppose that of the X females who've visited #1, Y percent of them decide to visit #2. So X * Y = the number of females who visit both websites.

The percentage of males who've visited #1 and also decide to visit #2 will be Y * R2, since R2 is the generic M/F ratio for website #2.

Therefore the number of males who visit both websites will be the number who've visited #1 (X * R1) multiplied by the percentage of those that visit #2 (Y * R2). That equals X * Y * R1 * R2.

Then the ratio of M/F for people who've visited both websites will be the number of males divided by the number of females:
X * Y * R1 * R2 / (X * Y) = R1 * R2.
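
As a sanity check on that algebra, here is a quick Python sketch with concrete numbers plugged in. All four values are hypothetical and were picked only so that the arithmetic comes out in round figures.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only to check the algebra above.
X = 1000              # females who visit website #1
R1 = Fraction(2)      # M/F ratio of website #1
R2 = Fraction(3, 2)   # M/F ratio of website #2
Y = Fraction(1, 10)   # fraction of site-#1 females who also visit #2

females_both = X * Y                 # X * Y
males_site1 = X * R1                 # X * R1
# Y * R2 is 3/20 here, so it is still a valid fraction (below 1).
males_both = males_site1 * (Y * R2)  # (X * R1) * (Y * R2)

ratio_both = males_both / females_both
print(females_both)  # 100
print(males_both)    # 300
print(ratio_both)    # 3
```

The result, 3, is the same as R1 * R2 = 2 * 3/2, and both X and Y cancel out as expected.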

For each additional website that is added to the calculation, the total M/F ratio will be R1 * R2 * ... * Rn. And that is what the formula on that other page uses.
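
Here is how I would sketch that product in Python for any number of websites. The function names are my own, and converting the combined ratio R into a probability as 1 / (1 + R) is an assumption on my part; it does match the 33.3% and 20% figures from the 2-to-1 example earlier.

```python
from fractions import Fraction
from math import prod

def combined_ratio(ratios):
    """Multiply the per-site M/F ratios: R1 * R2 * ... * Rn."""
    return prod(ratios, start=Fraction(1))

def female_probability(ratios):
    """Convert the combined M/F ratio R into P(female) = 1 / (1 + R)."""
    return 1 / (1 + combined_ratio(ratios))

one_site = [Fraction(2)]
two_sites = [Fraction(2), Fraction(2)]

print(female_probability(one_site))   # 1/3
print(combined_ratio(two_sites))      # 4
print(female_probability(two_sites))  # 1/5
```

Fractions are used instead of floats so that the results print as exact values.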

But it is based on the assumption that the Rn numbers are accurate, and that all websites have unrelated M/F qualities of attraction.
