Means and Variances of Binomial RV's

 

If X has the B(n,p) distribution then:

mean: μ_x = np

variance: σ_x² = np(1-p)

standard deviation: σ_x = √(np(1-p))

 

Explanation: 

Let Z be a B(1, p) random variable 

Thus: Z has the distribution

Outcome:       1      0

Probability:   p      1-p

so E(Z) = (1)(p) + (0)(1-p) = p and Var(Z) = (1-p)²(p) + (0-p)²(1-p) = p(1-p).

 

Now, suppose you had "n" of the Z variables and you made sure they were all independent.

Then, you could consider the random variable: X = Z1 + Z2 + … + Zn

X is just the count of the number of successes in "n" Z-type variables. The count is increased by 1 if Zi = 1 and it is unchanged if Zi = 0.

The mean of a sum is the sum of the means, so: E(X) = E(Z1) + E(Z2) + … + E(Zn) = p + p + … + p = np

Since the Z variables are all independent we can "add" the variances so: Var(X) = Var(Z1) + Var(Z2) + … + Var(Zn) = np(1-p)
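The two formulas can be checked numerically. The sketch below is only an illustration: it builds the exact B(n,p) probabilities and computes the mean and variance directly from them. The values n = 20 and p = 0.3 and the function names are my own choices, not part of the notes.

```python
import math

def binomial_pmf(n, p):
    """Return the list [P(X = 0), ..., P(X = n)] for X ~ B(n, p)."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_and_variance(pmf):
    """Mean and variance of a distribution on the values 0, 1, ..., len(pmf) - 1."""
    mean = sum(k * prob for k, prob in enumerate(pmf))
    var = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, var

n, p = 20, 0.3  # illustrative values, not taken from the notes
mean, var = mean_and_variance(binomial_pmf(n, p))

print(mean, n * p)            # both close to 6
print(var, n * p * (1 - p))   # both close to 4.2
```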

An Annoying Feature of Binomial RV's 

Example: Suppose you have X as a B(1785, .6) random variable. Calculate P(X ≥ 1036)

= P(X=1036)+P(X=1037)+…+P(X=1785) 

This could take forever. Imagine all of the factorials. 

There is a Simple Solution 

If n is 'large enough' then X is approximately N(np, √(np(1-p)))

That is, the Binomial distribution looks a lot like a normal distribution with the same mean and standard deviation.

Back to our example: 

NOTE: np = (.6)(1785) = 1071 and √(np(1-p)) = √((1785)(.6)(.4)) = 20.7
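A computer does not mind the factorials, so the exact answer for this example can be set beside the normal approximation. The following is a rough sketch, not the method used in class: it sums the binomial probabilities in log space (so nothing overflows) and computes the normal approximation both without and with the half-unit continuity correction. The function names are assumptions of mine.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_upper_tail(n, p, m):
    """Exact P(X >= m) for X ~ B(n, p), summing the pmf in log space
    so the huge factorials never overflow."""
    total = 0.0
    for k in range(m, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_pmf)
    return total

n, p, m = 1785, 0.6, 1036
mu = n * p                           # np
sigma = math.sqrt(n * p * (1 - p))   # sqrt(np(1-p))

exact = binomial_upper_tail(n, p, m)
approx_plain = 1.0 - normal_cdf((m - mu) / sigma)       # no continuity correction
approx_cc = 1.0 - normal_cdf((m - 0.5 - mu) / sigma)    # with continuity correction

print(f"mu = {mu:.1f}, sigma = {sigma:.3f}")
print(f"exact binomial tail:       {exact:.4f}")
print(f"normal, no correction:     {approx_plain:.4f}")
print(f"normal, with correction:   {approx_cc:.4f}")
```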

 

This also works for specific probabilities 

e.g. X is B(15, .5) and we want P(X = 8)

 

The normal curve can be used but we need to be just a bit careful!

What we do is say that 8 is the same as the mass between 7.5 and 8.5 under a normal curve with μ_x = np and σ_x = √(np(1-p)).

The question of P(X = 8) becomes

P(7.5 ≤ X ≤ 8.5) for X a N(7.5, 1.936), where 7.5 = np and 1.936 = √(np(1-p))

P(7.5 ≤ X ≤ 8.5) = P(7.5 - 7.5 ≤ X - 7.5 ≤ 8.5 - 7.5)

= P(0 ≤ X - 7.5 ≤ 1)

= P(0 ≤ (X - 7.5)/1.936 ≤ 1/1.936)

= P(0 ≤ N(0,1) ≤ .516)

From the normal tables we get P(Z ≤ 0) = .5

P(Z ≤ .516) = .6985 (the table entry for z = .52)

so P(0 ≤ Z ≤ .516) = .6985 - .5

= .1985

 

Compare this to the binomial tables where you get an answer of .1964 (NOT TOO BAD) 

This approximation works best when p is close to .5 and 'n' is large. If 'n' is large enough, it can compensate for p ≠ .5, so you can look at a range of p's if you are counting enough potential successes.
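To get a feel for how p and n affect the quality of the approximation, one can compare the exact P(X = k) with the normal area between k - .5 and k + .5 for every k and record the worst gap. This is only a sketch: the function names and the extra (n, p) pairs (15, 0.1) and (150, 0.1) are my own illustrative choices.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def worst_pmf_error(n, p):
    """Largest gap, over k = 0..n, between the exact P(X = k) for X ~ B(n, p)
    and the normal area between k - .5 and k + .5."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    worst = 0.0
    for k in range(n + 1):
        exact = math.comb(n, k) * p**k * (1 - p)**(n - k)
        approx = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)
        worst = max(worst, abs(exact - approx))
    return worst

# p = .5 with small n, p far from .5 with small n, p far from .5 with larger n
for n, p in [(15, 0.5), (15, 0.1), (150, 0.1)]:
    print(f"n = {n:3d}, p = {p}: worst error = {worst_pmf_error(n, p):.4f}")
```

With n = 15 the gap should be much larger for p = .1 than for p = .5, and raising n to 150 should shrink it again.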

Continuity Correction

 

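For X a B(n,p) random variable, use the following probabilities for a N(np, √(np(1-p))) random variable:

Binomial            Normal approximation

P(X = M)      =     P(M - .5 ≤ X ≤ M + .5)

P(X < M)      =     P(X < M - .5)

P(X > M)      =     P(X > M + .5)

P(X ≤ M)      =     P(X ≤ M + .5)

P(X ≥ M)      =     P(X ≥ M - .5)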
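The continuity-correction rules can be collected into one small helper. This is a sketch under my own naming (the function `binomial_prob_normal` and its `relation` argument are not from the notes); it applies the half-unit shift appropriate to each kind of event and then evaluates the normal CDF directly instead of using tables.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def binomial_prob_normal(n, p, m, relation):
    """Approximate P(X relation m) for X ~ B(n, p) using the
    continuity-corrected normal curve N(np, sqrt(np(1-p)))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    if relation == "==":
        return normal_cdf(m + 0.5, mu, sigma) - normal_cdf(m - 0.5, mu, sigma)
    if relation == "<":
        return normal_cdf(m - 0.5, mu, sigma)
    if relation == "<=":
        return normal_cdf(m + 0.5, mu, sigma)
    if relation == ">":
        return 1.0 - normal_cdf(m + 0.5, mu, sigma)
    if relation == ">=":
        return 1.0 - normal_cdf(m - 0.5, mu, sigma)
    raise ValueError("relation must be one of ==, <, <=, >, >=")

# The B(15, .5) example: P(X = 8)
approx = binomial_prob_normal(15, 0.5, 8, "==")
exact = math.comb(15, 8) * 0.5**15

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Because the code does not round z to two decimals the way a table lookup does, the approximation comes out near .197 instead of the table-based .1985.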

 

 

Another Random Variable: Proportions 

Suppose we have our 'old friend' X which is B(n,p) 

Just for fun, consider the random variable:

P̂ = X/n

Thus P̂ is the proportion of successes in n trials of the same experiment. I put a "hat" on capital P to distinguish this new random variable from p = the probability of success in the experiment.

 

If we use our rules for finding the mean and variance of a linear function of a random variable we end up with:

E(P̂) = E(X)/n = np/n = p

Var(P̂) = Var(X)/n² = np(1-p)/n² = p(1-p)/n

So the mean is equal to the probability of success in any of the trials

 

There is something very 'neat' about the random variable P̂. The variance in P̂ disappears as n gets large while the variance in X explodes.

The variance disappears: Var(P̂) = p(1-p)/n → 0 as n → ∞, whereas Var(X) = np(1-p) → ∞.
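A few numbers make the contrast concrete. The sketch below simply evaluates the two variance formulas, np(1-p) for X and p(1-p)/n for the proportion, at increasing n; the choice p = 0.6 and the list of sample sizes are illustrative assumptions.

```python
def variances(n, p):
    """Return (Var(X), Var(P-hat)) for X ~ B(n, p) and P-hat = X / n."""
    var_x = n * p * (1 - p)        # grows in proportion to n
    var_p_hat = p * (1 - p) / n    # shrinks in proportion to 1/n
    return var_x, var_p_hat

p = 0.6  # illustrative value
for n in (10, 100, 1000, 10000):
    var_x, var_p_hat = variances(n, p)
    print(f"n = {n:5d}: Var(X) = {var_x:8.1f}, Var(P-hat) = {var_p_hat:.6f}")
```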

Just as in the case of X (B(n,p)) we can approximate the distribution of P̂ as a normal distribution.

The relationship is as follows: P̂ is approximately N(p, √(p(1-p)/n))

 

 

Example 

n = 1785, p = .6

Find Prob(P̂ ≥ .58)

= Prob((P̂ - .6)/√((.6)(.4)/1785) ≥ (.58 - .6)/√((.6)(.4)/1785))

= Prob(Z ≥ -.02/.0116)

= Prob(Z ≥ -1.724)

= .9573

 

 

 

(close to what we had for X) 
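The arithmetic in this example can be reproduced in a few lines. This is a sketch under one assumption: the threshold for the proportion is taken to be .58, which is the value consistent with z = -1.724 when n = 1785 and p = .6. The variable names are mine.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 1785, 0.6
threshold = 0.58  # assumed threshold, inferred from z = -1.724

sd_p_hat = math.sqrt(p * (1 - p) / n)   # standard deviation of the proportion
z = (threshold - p) / sd_p_hat          # standardized threshold
prob = 1.0 - normal_cdf(z)              # P(Z >= z)

print(f"sd of P-hat = {sd_p_hat:.4f}")
print(f"z           = {z:.3f}")
print(f"probability = {prob:.4f}")
```

Small differences from the table-based answer are expected, since the hand calculation rounds the standard deviation to .0116 and z to two decimals.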

How I Make Sense of RV's and Sampling 

The way I tell this story is a bit different from the text but ultimately equivalent.

 

Population: the population of random variables, i.e. every possible r.v. of a given type. Think of it as many copies of the same r.v., all with the same distribution.

e.g. all possible people, all possible Canadians

They are rv's to us because, before we 'run the experiment', we do not know their characteristics (e.g. M or F).

 

 

Sample: a random sample of rv's from the population. These are still rv's because, even though we have chosen the sample, we still haven't 'run the experiment' (i.e. found out whether the people are M or F).

 

The text refers to SRS's (Simple Random Samples). This is what is meant.

 

What you would like to do is use the sample of RV's to 'learn' about the population. 

For example, how can I learn (in a statistical sense) what is the proportion of F's in the population?

Learning about population parameters (e.g. μ_x) is one of the major goals of statistics. Typically this involves defining a new random variable as a linear function of the random variables in a sample.

 

Example

Note: I use X̄, the text uses x̄ (more on this later)

X̄ = (1/n)(X1 + X2 + … + Xn) = sample mean

= average of the n random variables in the sample.

Note: X̄ is a random variable.

 

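Suppose that each X in the sample is independent and has the same (identical) distribution, with mean μ and standard deviation σ.

We can derive:

E(X̄) = (1/n)[E(X1) + E(X2) + … + E(Xn)] = (1/n)(nμ) = μ

because all Xi have the same mean, and

Var(X̄) = (1/n²)[Var(X1) + Var(X2) + … + Var(Xn)] = (1/n²)(nσ²) = σ²/n

because the Xi are independent and all have the same variance.

This is very special: we have defined a random variable on a sample and we have shown that the Sampling Distribution of X̄ (which is just the distribution of X̄ defined for the sample) has the properties:

μ_X̄ = μ and σ_X̄ = σ/√n

Note: σ_X̄ depends on properties of the sample, e.g. the sample size n.

So, if the sample gets large (n → ∞) this random variable will have almost no variability. Any realization will be approximately equal to μ.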

 

[Figure: probability density of X̄]

 

What is the intuition behind this result (i.e. that σ_X̄ shrinks as n gets larger)?

A way to think about this is to ask why partnerships form, for example, among lawyers. They throw all their earnings into a pot and divide it into equal shares.

The reason is that the average (equal shares) income is less variable. On your own, you may have a good year or you may have a bad year. Your income goes up and down (it has lots of variability). But if you are in a partnership and you have a bad year, chances are someone will offset that with a good year, and so your income isn't prone to the same variability when you make partnership sharing agreements. NOTE: the result σ_X̄ = σ/√n is just a proof that average partnership earnings should be less variable.
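The partnership story can be simulated. The sketch below is purely illustrative: the partnership size, the income distribution (normal with mean 100 and standard deviation 30) and the assumption that partners' incomes are independent are all my own choices. It compares the year-to-year variability of one lawyer's own income with that of an equal share of the pot.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N_PARTNERS = 9        # hypothetical partnership size
N_YEARS = 20000       # number of simulated years
MEAN_INCOME = 100.0   # hypothetical mean income per lawyer
SD_INCOME = 30.0      # hypothetical standard deviation per lawyer

solo = []     # what the first lawyer would earn alone
shared = []   # an equal share of the partnership pot

for _ in range(N_YEARS):
    incomes = [random.gauss(MEAN_INCOME, SD_INCOME) for _ in range(N_PARTNERS)]
    solo.append(incomes[0])
    shared.append(sum(incomes) / N_PARTNERS)

sd_solo = statistics.pstdev(solo)
sd_shared = statistics.pstdev(shared)

print(f"sd of solo income:   {sd_solo:.2f}")
print(f"sd of equal share:   {sd_shared:.2f}")
print(f"ratio:               {sd_shared / sd_solo:.3f}")
print(f"1/sqrt(n):           {1 / N_PARTNERS ** 0.5:.3f}")
```

With independent incomes the ratio should sit close to 1/√n, which is 1/3 for nine partners.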

 

Central Limit Theorem 

In most situations that we will see, the Central Limit Theorem holds. This theorem is incredible. It says (essentially) that it usually doesn't matter what the distribution of the X's is in the population: for large n, X̄ in a sample will be approximately normally distributed with μ_X̄ = μ and σ_X̄ = σ/√n.
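A quick simulation shows the theorem at work on a population that looks nothing like a bell curve. This sketch assumes each X is uniform on [0,1] (so μ = .5 and σ = √(1/12)); the sample size of 30 and the number of repetitions are arbitrary choices. If the sample mean is roughly normal, about 68% of its realizations should fall within one σ/√n of μ and about 95% within two.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 30          # sample size (illustrative)
reps = 20000    # number of samples drawn

mu = 0.5                       # mean of a uniform [0,1] r.v.
sigma = math.sqrt(1.0 / 12.0)  # standard deviation of a uniform [0,1] r.v.
sd_xbar = sigma / math.sqrt(n)

# One realization of the sample mean per repetition
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

within_one = sum(abs(m - mu) <= sd_xbar for m in means) / reps
within_two = sum(abs(m - mu) <= 2 * sd_xbar for m in means) / reps

print(f"average of sample means: {statistics.mean(means):.4f} (mu = {mu})")
print(f"sd of sample means:      {statistics.pstdev(means):.4f} (sigma/sqrt(n) = {sd_xbar:.4f})")
print(f"share within 1 sd:       {within_one:.3f}")
print(f"share within 2 sd:       {within_two:.3f}")
```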

 

Example of Growing Beans 

  1. Consider the case of a biology professor. He has made up "billions" of the following packets: 1 seed bean, 1 blotter, 1 vial of water and 1 small enclosure that allows a fixed amount of light and temperature. This is the population of random variables. Each packet is an r.v., an experiment waiting to happen. The outcome of an experiment is the growth of the bean sprout in cm in 2 weeks. We don't know what that will be before the experiment is run.

  2. For a given class (BIO 101), the professor randomly chooses n packets from his stockpile. Again the experiment has not been run, so the professor has just created an SRS of size n. The SRS is full of n random variables.

    Note: each packet is identical, so the distribution of each r.v. in the population and in the SRS has to be the same, with mean = μ_x and standard deviation = σ_x.

  3. The professor defines a random variable X̄ as a weighted sum of all of the random variables in the simple random sample:

    X̄ = (1/n)(X1 + X2 + … + Xn)

    Each Xi is one of the original packets.

    This is truly a random variable: its value is not known until the experiment is run. We saw before that μ_X̄ = μ_x and σ_X̄ = σ_x/√n.

  4. Suppose, now, that the professor hands out the n packets to the class, the students run the experiments, and they come back in 2 weeks. Each student reports the amount (in cm) that the bean has sprouted. For each random variable Xi there is a corresponding realization or outcome xi. The xi are the data and they form a list of length n.

  5. Using the data, you can compute a realization for X̄ as (1/n)(x1 + x2 + … + xn). This value is what I refer to as x̄ (i.e. x̄ is a realization of X̄).

It is important for you to note that for a large sample (i.e. when n is large), x̄ will be almost exactly equal to μ because σ_X̄ will be almost 0. (Law of Large Numbers)

Thus, you can learn about μ, an unknown parameter of the population distribution.
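The bean experiment can be mimicked on a computer to watch the realized sample mean settle down near the population mean as the class gets bigger. Everything numeric here is hypothetical: I assume growth is normally distributed with mean 10 cm and standard deviation 2 cm, which the notes do not specify, and `run_experiment` is just an illustrative name.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

MU = 10.0     # hypothetical population mean growth (cm)
SIGMA = 2.0   # hypothetical population standard deviation (cm)

def run_experiment(n):
    """Hand out n packets, run them, and return the realized sample mean."""
    data = [random.gauss(MU, SIGMA) for _ in range(n)]  # the realizations xi
    return sum(data) / n

results = {n: run_experiment(n) for n in (10, 100, 10000, 100000)}

for n, x_bar in results.items():
    print(f"n = {n:6d}: realized sample mean = {x_bar:.3f}, distance from mu = {abs(x_bar - MU):.3f}")
```

In a real class μ is unknown, so only the realized sample mean is observed; the simulation fixes μ only so that the distance can be displayed.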

 

In the formal language of Statistics 

    1. X̄ is an estimator of μ. In fact, it is an unbiased estimator of μ because E(X̄) = μ.

    2. x̄ is a point estimate of μ. It is the numerical realized value of X̄.

    3. Regardless of the distribution of the Xi's (for example, they could each be uniform on the interval [0,1]), the sample mean X̄ will have a distribution which gets closer and closer to N(μ, σ/√n) as n gets large. This is the Central Limit Theorem.

    4. The Central Limit Theorem also works for proportions (like P̂).

    5. The Central Limit Theorem even sometimes works for sums of random variables, like the binomial B(n,p). That is why the normal approximation worked in our previous example.