"In summary - the Net Applications sample is small (it is based on 40,000 websites compared to our 3,000,000). Weighting the initial unrepresentative data from that small sample will NOT produce meaningful information. Instead, applying a weighting factor will simply inflate inaccuracies due to the small sample size." http://gs.statcounter.com/press/open-letter-msVery misleading. I have no idea which of StatCounter and Net Applications is more accurate. But that argument is off.
In statistics, sample size matters far less than intuition suggests once it is past a certain minimal size. That's how a survey of 300 people in the US can predict pretty well for 300 million. The numbers don't matter in two ways: first, the population could be 1 million or 1 billion, and its actual size is irrelevant; second, the sample could be 3,000 or 30,000 and the gain over 300 would be modest, because the error shrinks only with the square root of the sample size. The only exception to those two facts is when the population is very small, say 100. Then a sample of all 100 is guaranteed to be representative, obviously. And for very small sample sizes, say 5, you have poor accuracy in most cases. But just 300 randomly chosen people is enough for a rough estimate in any large population.
The reason is the basic statistical law that the standard deviation of a sample average (its standard error) equals the standard deviation of the population divided by the square root of the sample size. If you are measuring something like the % of people using a browser, the first factor doesn't matter much, since for a proportion it can never exceed 0.5. That leaves the second. Happily for statistics, 1 over the square root drops very fast at first and then flattens out. You get to an accuracy of a few percentage points with just a few hundred people, no matter what the population size.
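To put rough numbers on that, here is a minimal sketch of the usual margin-of-error formula for a proportion. It assumes simple random sampling, the worst case of a 50% share, and a 95% confidence level (z = 1.96); the function name and the list of sample sizes are just illustrative choices, not anything StatCounter or Net Applications publish.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a share p estimated from n independent draws.

    Assumes simple random sampling; p=0.5 is the worst case, z=1.96 is ~95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes mentioned in the discussion, from tiny to huge.
for n in (5, 300, 3_000, 30_000, 40_000, 3_000_000):
    print(f"n = {n:>9,}: +/- {100 * margin_of_error(n):.2f} percentage points")
```

Note that the population size does not appear anywhere in the formula, and that each tenfold increase in sample size only cuts the error by a factor of about 3.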
So the fact that StatCounter has 3,000,000 websites and Net Applications has 40,000 means practically nothing. (Note that 40,000 even understates it, since those are websites; the number of people visiting those sites is likely much larger.) 40,000 is definitely large enough: in fact, just a few hundred datapoints would be enough for a rough estimate! Of course, that is only if the sample is unbiased. That's the crucial factor, not sample size. We don't really know which of StatCounter and Net Applications is less biased. But the difference in sample size between them is basically irrelevant. Past a minimal sample size, more data adds very little, even if it seems intuitively like it must make you more representative.
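The point about bias can be illustrated with a toy simulation. Everything here is hypothetical: a made-up browser with a true share of 30%, and a large panel that over-represents its users so that it effectively sees 40%. It says nothing about which real data source is biased, only that extra sample size does not repair a skewed sample.

```python
import random

def estimated_share(n, p, rng):
    """Fraction of n independent draws that land on the browser, each with probability p."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

rng = random.Random(0)   # fixed seed so reruns give the same numbers
TRUE_SHARE = 0.30        # hypothetical true share of the browser
SKEWED_SHARE = 0.40      # hypothetical share seen by a panel that over-represents its users

small_unbiased = estimated_share(300, TRUE_SHARE, rng)
large_biased = estimated_share(300_000, SKEWED_SHARE, rng)

print(f"true share:            {TRUE_SHARE:.3f}")
print(f"small unbiased sample: {small_unbiased:.3f}")
print(f"large biased sample:   {large_biased:.3f}")
```

The large sample homes in very precisely on the wrong number, while the small unbiased one scatters around the right one.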