Random Variable(s) & Probability Distribution(s) and one-sample and two-sample tests

Today we have three different scenarios to test the random samples and probability distributions.

A.

Consider a population consisting of the following values, which represents the number of ice cream purchases during the academic year for each of the five housemates.
8, 14, 16, 10, 11

a. Compute the mean of this population.
b. Select a random sample of size 2 out of the five members.
c. Compute the mean and standard deviation of your sample.
d. Compare the mean and standard deviation of your sample to those of the entire population (8, 14, 16, 10, 11).

population <- c(8,14,16,10,11)
population
[1] 8 14 16 10 11
smpl_size <- 2
random_sample <- sample(population, size = smpl_size)
random_sample
[1] 10 11
mean(population)
[1] 11.8
sd(population)
[1] 3.193744
mean(random_sample)
[1] 10.5
sd(random_sample)
[1] 0.7071068
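
One thing worth keeping in mind when comparing these numbers: R's sd() divides by n - 1, so the 3.193744 above treats the five values as a sample. Below is a minimal sketch of the population version, which divides by n instead. The names sample_style_sd, pop_mean and pop_sd are my own and not part of the assignment.

```r
population <- c(8, 14, 16, 10, 11)

# sd() uses the sample formula (divisor n - 1)
sample_style_sd <- sd(population)

# Population standard deviation (divisor n), computed by hand
pop_mean <- mean(population)
pop_sd <- sqrt(mean((population - pop_mean)^2))

sample_style_sd  # about 3.19
pop_sd           # about 2.86
```

Either value can be used for the comparison in part d, as long as it is clear which divisor was applied.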

######### Now we have Question B

B.

Suppose that the sample size n = 100 and the population proportion p = 0.95.

Does the sample proportion p have approximately a normal distribution? Explain.
What is the smallest value of n for which the sampling distribution of p is approximately normal?

The sample mean from a group of observations is an estimate of the population mean μ . Given a sample of size n, consider n independent random variables X1, X2, …, Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean μ and standard deviation σ .
A. Population mean = (8 + 14 + 16 + 10 + 11) / _
B. Sample of size n =
C. Mean of sample distribution: __

sample 1 =
sample 2 =
sample 3, and so on and so forth…
And the standard error Qm = Q / sqrt(n) = 4.4 / sqrt(5) =
D. I am looking for a table with the following variables: X, x = μ, and (x - μ)^2
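
With only five values, the list of possible samples can be written out in full. Here is a small sketch, assuming samples of size 2 drawn without replacement (my reading of part A). The variable names all_samples and sample_means are illustrative only.

```r
population <- c(8, 14, 16, 10, 11)

# All possible samples of size 2; each column is one sample
all_samples <- combn(population, 2)

# Mean of every possible sample
sample_means <- combn(population, 2, FUN = mean)

ncol(all_samples)    # number of possible samples: choose(5, 2) = 10
mean(sample_means)   # matches the population mean
mean(population)
```

Averaging the means of all possible samples gives back the population mean, which is why the sample mean is used as an estimate of μ.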

Here’s a little hint

The sample size n = 100 and the population proportion p = 0.95.

1. Does the sample proportion p have approximately a normal distribution?
The distribution is expected to be normal if both np and nq are greater…… (your turn)
Since p = .95, q = .05.
p * n = .95 * 100 = ……
q * n = .05 * 100 = ……

############# This is my understanding of the above

> population <- c(8, 14, 16, 10, 11)
> population_mean <- mean(population)

> # Q is the standard deviation; note that sd() uses the n - 1 (sample) formula
> Q <- sd(population)

> # Standard error of the mean for n = 100 values
> Qm <- Q / sqrt(100)
> Qm
[1] 0.3193744

> # Create a vector of X values (100 values simulated from a normal distribution)
> X <- rnorm(100, mean = population_mean, sd = Q)

> # Calculate (x - μ)^2
> squared_diff <- (X - population_mean)^2

> # Create a data frame with X, x = μ, and (x - μ)^2
> data_table <- data.frame(X, x = rep(population_mean, 100), squared_diff)
> data_table

# Show the first five rows of the table

> head(data_table, 5)
         X    x squared_diff
1 13.99153 11.8    4.8028101
2 12.74039 11.8    0.8843327
3 13.77450 11.8    3.8986421
4 12.34278 11.8    0.2946098
5 15.91607 11.8   16.9420565

Reasoning

Does the sample proportion p have approximately a normal distribution? Explain.

To decide whether the sample proportion p is approximately normally distributed, I check the usual condition that follows from the Central Limit Theorem (CLT). For proportions, the sampling distribution of the sample proportion is approximately normal when both np and nq are sufficiently large (a common rule of thumb is at least 10 each).

> # Given values
> n <- 100
> p <- 0.95
> q <- 1 - p

> # Calculate np and nq
> np <- n * p
> nq <- n * q

> np
[1] 95
> nq
[1] 5

> if (np >= 10 && nq >= 10) {
+   cat("The conditions for the CLT are met.\n")
+   cat("The sample proportion p has an approximately normal distribution.\n")
+ } else {
+   cat("The conditions for the CLT are not met.\n")
+   cat("The sample proportion p may not have an approximately normal distribution.\n")
+ }
The conditions for the CLT are not met.
The sample proportion p may not have an approximately normal distribution.
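
Another common way to see the problem, offered as a rough sketch and not as part of the assignment's hint: if the normal approximation were good, nearly all sample proportions would fall within three standard errors of p, and that interval should stay inside [0, 1]. The names se, lower and upper are my own.

```r
n <- 100
p <- 0.95

# Standard error of the sample proportion
se <- sqrt(p * (1 - p) / n)

# Interval that would cover about 99.7% of a normal distribution
lower <- p - 3 * se
upper <- p + 3 * se

c(lower = lower, upper = upper)
upper > 1  # TRUE here: the normal curve would extend past a proportion of 1
```

Because a proportion can never exceed 1, an upper limit above 1 is another sign that the distribution is skewed at this sample size.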

########### Next Question

> ## What is the smallest value of n for which the sampling distribution of p is approximately normal?

> # Both np and nq must be at least 10; the smaller of p and q is the binding one
> threshold <- 10

> smallest_n <- ceiling(threshold / min(p, 1 - p))

> smallest_n

[1] 200

C.

Simulated coin tossing is probably better done using the function rbinom than using the function sample. Explain.

# Simulate 10 coin flips (0 for Tails, 1 for Heads)

> results <- rbinom(10, size = 1, prob = 0.5)

# Convert the results to "Head" or "Tails"

> labels <- ifelse(results == 1, "Head", "Tails")

# Print the results

> print(labels)
[1] "Head" "Tails" "Head" "Tails" "Head" "Head" "Head" "Head" "Head" "Tails"

Explanation:
The rbinom() function is designed specifically for binomial outcomes: it generates random numbers that follow a binomial distribution.
Coin tossing fits this model because each toss has two possible outcomes (heads or tails) with a known probability (0.5 for a fair coin), and that probability is passed directly through the prob argument. In contrast, sample() is a general-purpose function for many kinds of random sampling, so it is less specialized for coin tossing.
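
To make the comparison concrete, here is a small side-by-side sketch. The seed value and the variable names flips_sample, heads_count and biased_flips are arbitrary choices of mine, and the 0.7 probability is only an example of a biased coin.

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# sample(): draw labels with replacement, one label per toss
flips_sample <- sample(c("Head", "Tails"), size = 10, replace = TRUE)

# rbinom(): a single call returns the number of heads in 10 tosses
heads_count <- rbinom(1, size = 10, prob = 0.5)

# rbinom(): a biased coin only needs a different prob value
biased_flips <- rbinom(10, size = 1, prob = 0.7)

table(flips_sample)
heads_count
biased_flips
```

sample() can also handle a biased coin through its prob argument, but it needs a probability for each label, while rbinom() takes the single success probability and can return the count of heads directly.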
