
Let $X_1, X_2$ be two discrete random variables. Each random variable takes two values: $1, 2$

The probability distribution $p_1$ over $X_1, X_2$ is given by

$$p_1(X_1=1, X_2 = 1) = \dfrac{1}{4}$$ $$p_1(X_1=1, X_2 = 2) = \dfrac{1}{4}$$ $$p_1(X_1=2, X_2 = 1) = \dfrac{1}{4}$$ $$p_1(X_1=2, X_2 = 2) = \dfrac{1}{4}$$

The probability distribution $p_2$ over $X_1, X_2$ is given by

$$p_2(X_1=1, X_2 = 1) = \dfrac{8}{16}$$ $$p_2(X_1=1, X_2 = 2) = \dfrac{4}{16}$$ $$p_2(X_1=2, X_2 = 1) = \dfrac{3}{16}$$ $$p_2(X_1=2, X_2 = 2) = \dfrac{1}{16}$$

Suppose $D_1, D_2$ are the datasets generated by $p_1, p_2$ respectively.

Then which dataset can I call i.i.d.? I am guessing $D_1$, since we can prove that the random variables are independent and identically distributed, while for $D_2$ the i.i.d. property does not hold.


$\underline{\text{ For }D_1}$

Identically distributed: $p_1(X_1 = x) = p_1(X_2 = x) = \dfrac{1}{2} \text{ for all } x \in \{1, 2\}$
Independent: $p_1(X_1 = x_1, X_2 = x_2) = \dfrac{1}{4} = p_1(X_1 = x_1)\, p_1(X_2 = x_2) \text{ for all } x_1, x_2 \in \{1, 2\}$


We can show that the random variables $X_1, X_2$ are not i.i.d. if we consider $p_2$.
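Both conditions can also be checked numerically. Below is a minimal Python sketch (the dictionaries `p1` and `p2` and the helper names `marginals` and `is_iid` are my own, encoding the joint tables above) that computes the marginals and tests the two conditions for each distribution:

```python
from itertools import product

# Joint distributions over (X1, X2); both variables take values in {1, 2}.
p1 = {(1, 1): 1/4, (1, 2): 1/4, (2, 1): 1/4, (2, 2): 1/4}
p2 = {(1, 1): 8/16, (1, 2): 4/16, (2, 1): 3/16, (2, 2): 1/16}

def marginals(p):
    # Marginal of X1 (resp. X2): sum the joint over the other variable.
    m1 = {v: sum(p[(v, w)] for w in (1, 2)) for v in (1, 2)}
    m2 = {w: sum(p[(v, w)] for v in (1, 2)) for w in (1, 2)}
    return m1, m2

def is_iid(p, tol=1e-12):
    m1, m2 = marginals(p)
    # Identically distributed: the two marginals agree on every value.
    identical = all(abs(m1[x] - m2[x]) < tol for x in (1, 2))
    # Independent: the joint factorizes into the product of the marginals.
    independent = all(abs(p[(v, w)] - m1[v] * m2[w]) < tol
                      for v, w in product((1, 2), repeat=2))
    return identical and independent

print(is_iid(p1))  # True
print(is_iid(p2))  # False: e.g. p2(X1=1) = 12/16 but p2(X2=1) = 11/16
```

For $p_2$ both conditions fail: the marginals differ, and the joint is not the product of the marginals (e.g. $p_2(1,1) = 8/16$ while $p_2(X_1=1)\,p_2(X_2=1) = \frac{12}{16} \cdot \frac{11}{16} = \frac{33}{64}$).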

Is the i.i.d. I am discussing different from making a dataset i.i.d., as answered here? If not, where am I going wrong?

hanugm

1 Answer


A sequence of $n$ random variables $z_{1:n} = z_1, z_2, \dots, z_n$ is i.i.d. if

  1. they are identically distributed, i.e. each random variable $z_i$ has the same distribution
  2. they are independent, i.e. the joint distribution of all of them is just the product of the marginal distributions of each r.v.

So, let's imagine a thought experiment in which we toss a coin $n$ times. You get a sequence of results, but you don't yet know what those results are (i.e. you don't know whether each $z_i$ is heads or tails): this is just a thought experiment, so we're still talking about random variables and probability distributions, not datasets.

You might think: well, we're sampling $n$ times from the same distribution because we have only one coin, so we have only one r.v. In reality, you can think of each toss as being associated with a different r.v. $z_i$, but all these r.v.s, $z_1, z_2, \dots, z_n$, have the same probability distribution, for example, a Bernoulli with the same parameter $p$.

Now, let's say that $p = 0.5$. This means that, for a single coin toss, there's a 50% chance the coin lands on tails and a 50% chance it lands on heads. This does not have to be the case. In fact, we could also have a biased coin that prefers to land on heads, say with $p = 0.7$, the probability that it lands on heads. That's fine, and the sequence of random variables $z_{1:n}$ can still be i.i.d. How is this possible?

What's the Bernoulli pmf?

$$f_\text{marginal}(k_i;p)=p^{k_i}(1-p)^{1-k_i},$$ where $k_i\in \{0,1\}$.

So, each $z_i$ has this Bernoulli pmf, with the same $p$. This is the identically distributed part of iid.
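As a quick sanity check, the pmf above can be written directly in Python (the helper name `bernoulli_pmf` is my own):

```python
def bernoulli_pmf(k, p):
    # f(k; p) = p^k * (1 - p)^(1 - k), for k in {0, 1}
    return p**k * (1 - p)**(1 - k)

# With p = 0.7 (heads = 1), every z_i has the same marginal pmf:
print(bernoulli_pmf(1, 0.7))  # 0.7
print(bernoulli_pmf(0, 0.7))  # ~0.3 (up to floating-point rounding)
```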

For simplicity, let $n = 2$, so $z_{1:n} = z_{1:2} = z_1, z_2$.

So, if $z_1$ and $z_2$ are independent, their joint is just the product of their marginals

\begin{align} f_\text{joint}(k_1, k_2;p) &=(p^{k_1}(1-p)^{1-k_1}) (p^{k_2}(1-p)^{1-k_2}) \\ &=p^{k_1} p^{k_2} (1-p)^{1-k_1} (1-p)^{1-k_2} \\ &=p^{k_1 + k_2} (1-p)^{2-k_1 - k_2} \\ \end{align} where $k_1 \in \{0,1\}$ is the outcome for $z_1$ and $k_2 \in \{0,1\}$ is the outcome for $z_2$.

So, as before, let's say that $p= 0.7$, then that can be written as

$$ f_\text{joint}(k_1, k_2; 0.7) = (0.7^{k_1}(1-0.7)^{1-k_1}) (0.7^{k_2}(1-0.7)^{1-k_2}) $$

If $k_1=0$ and $k_2 = 0$, we have

\begin{align} f_\text{joint}(0, 0;0.7) &= (0.7^0(1-0.7)^1) (0.7^0(1-0.7)^1) \\ &= 0.3 * 0.3 = 0.09 \end{align}

If $k_1=1$ and $k_2 = 1$, we have

\begin{align} f_\text{joint}(1, 1;0.7) &= (0.7^1(1-0.7)^0) (0.7^1(1-0.7)^0) \\ &= 0.7 * 0.7 = 0.49 \end{align}

If $k_1= 1$ and $k_2 = 0$, we have

\begin{align} f_\text{joint}(1, 0;0.7) &= (0.7^1(1-0.7)^0) (0.7^0(1-0.7)^1) \\ &= 0.7 * 0.3 = 0.21 \end{align}

If $k_1= 0$ and $k_2 = 1$, we have

\begin{align} f_\text{joint}(0, 1;0.7) &= (0.7^0(1-0.7)^1) (0.7^1(1-0.7)^0) \\ &= 0.3 * 0.7 = 0.21 \end{align}

So, the probabilities are not uniform, but we still have two independent r.v.s, because we defined their joint as the product of their marginals.

Now, set $p = 0.5$: you will see that we get $1/4$ for all combinations of $k_1$ and $k_2$.
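The four cases above can be verified in a few lines of Python; this self-contained sketch (helper names are my own) just encodes the marginal and joint pmfs from the formulas above and enumerates all outcomes for both values of $p$:

```python
from itertools import product

def bernoulli_pmf(k, p):
    # Marginal pmf: f(k; p) = p^k * (1 - p)^(1 - k)
    return p**k * (1 - p)**(1 - k)

def joint_pmf(k1, k2, p):
    # Independence: the joint is the product of the marginals.
    return bernoulli_pmf(k1, p) * bernoulli_pmf(k2, p)

for p in (0.7, 0.5):
    table = {(k1, k2): joint_pmf(k1, k2, p)
             for k1, k2 in product((0, 1), repeat=2)}
    print(p, table)
    # A valid pmf must sum to 1 over all outcomes.
    assert abs(sum(table.values()) - 1) < 1e-12
```

For $p = 0.7$ the table reproduces $0.09, 0.21, 0.21, 0.49$; for $p = 0.5$ every entry is $0.25$.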

Generally, the joint distribution of $n$ independent Bernoulli can be compactly written as follows

$$ f_\text{joint}(k_1, \dots, k_n; p) = p^{\sum k_i}(1-p)^{n- \sum k_i} $$
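Under the same independence assumption, this compact form agrees with the explicit product of the $n$ marginals for every outcome; a small sketch (function names are my own) checks this for $n = 3$:

```python
from itertools import product
from math import prod

def joint_bernoulli(ks, p):
    # Compact joint of n independent Bernoulli(p) r.v.s:
    # p^(sum k_i) * (1 - p)^(n - sum k_i)
    n, s = len(ks), sum(ks)
    return p**s * (1 - p)**(n - s)

# Compare against the explicit product of marginals for n = 3, p = 0.7.
p = 0.7
for ks in product((0, 1), repeat=3):
    explicit = prod(p**k * (1 - p)**(1 - k) for k in ks)
    assert abs(joint_bernoulli(ks, p) - explicit) < 1e-12
```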

Conclusion: you cannot determine whether a sequence of r.v.s is i.i.d. just by looking at whether the probabilities are uniform.

nbro