Is knowing the underlying probability distribution mandatory for deciding the iid property of random variables?

Question

Consider the following information regarding iid random variables:

The acronym IID stands for "Independent and Identically Distributed".

A sequence of random variables (or random vectors) is IID if and only if the following two conditions are satisfied:

the terms of the sequence are mutually independent;

they all have the same probability distribution.

Definition:

Let $\{\mathcal{X}_n\}$ be a sequence of random vectors. Let $F_{\mathcal{X}_n}{(x_n)}$ be the joint distribution function of a generic term of the sequence $\{\mathcal{X}_n\}$. We say that $\{\mathcal{X}_n\}$ is an IID sequence if and only if

$$F_{\mathcal{X}_n}(x) = F_{\mathcal{X}_k}(x) \quad \forall\, x, n, k$$

and any subset of terms of the sequence is a set of mutually independent random vectors.

Thus,

1. iid is a property of a sequence of random variables.
2. A joint probability distribution function is necessary to validate whether a sequence of random variables is iid or not.

Thus, the iid property of a sequence of random variables, from 2, depends entirely on the underlying joint probability distribution function. Am I wrong anywhere?

If I am wrong, is there any other iid property of random variables that does not depend on the underlying probability distribution function?

score 2 · Answer 1 · edited Jan 15 '22 at 00:01

The point is that even if you know the distribution, you sometimes can't prove whether the sampled data is i.i.d. or not (more details in https://stats.stackexchange.com/q/130381/144441). Without knowing the distribution you have even less information, so of course you can't prove that the sampled data is identically distributed.
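To illustrate why a sample can only offer evidence for or against the i.i.d. assumption, here is a minimal Python sketch. The lag-1 autocorrelation diagnostic, the AR(1) process with coefficient 0.9, the seed, and the function name `lag1_autocorrelation` are all assumptions chosen for illustration; none of them come from the linked thread.

```python
import random


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence (illustrative helper)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Sample 1: independent draws from a standard normal distribution.
iid_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Sample 2: an AR(1) process, where each term depends on the previous one.
dependent_sample = [rng.gauss(0.0, 1.0)]
for _ in range(n - 1):
    dependent_sample.append(0.9 * dependent_sample[-1] + rng.gauss(0.0, 1.0))

r_iid = lag1_autocorrelation(iid_sample)
r_dep = lag1_autocorrelation(dependent_sample)

print(f"lag-1 autocorrelation, independent draws: {r_iid:.3f}")
print(f"lag-1 autocorrelation, AR(1) draws:       {r_dep:.3f}")
```

A value near zero is compatible with independence but does not prove it, since dependence can exist without any linear correlation at lag 1. A value far from zero, on the other hand, is evidence against independence.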

Note that i.i.d. is usually stated as an assumption that holds in the domain at hand; you do not need to prove it as a property.

score -1 · Answer 2 · edited Jan 15 '22 at 00:00

From your conclusions, 1 is correct. More specifically, though, iid characterizes the nature of the underlying data-generating process. A table of dice-throw results is likely iid, but more significantly that is because the dice rolls themselves are iid.

Not really for 2, since in the discrete case you would simply be checking that $P(A=a)P(B=b) = P(A=a, B=b)$ and $P(A=a) = P(B=a)$ for all values $a, b$. Since iid is defined as an iff (if and only if), this characterization is also sufficient.
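As a rough sketch of that discrete-case check, the following Python snippet takes a fully specified joint pmf of two variables A and B and tests both conditions. The function name `check_iid_pair`, the dictionary layout, and the three example tables are hypothetical, picked only for illustration.

```python
from itertools import product
from math import isclose


def check_iid_pair(joint, values, tol=1e-12):
    """Return (independent, identical) for a pair of discrete variables.

    joint  -- dict mapping (a, b) to P(A=a, B=b); missing keys count as 0
    values -- the shared finite set of values A and B can take
    """
    # Marginal pmfs obtained by summing the joint pmf over the other variable.
    p_a = {a: sum(joint.get((a, b), 0.0) for b in values) for a in values}
    p_b = {b: sum(joint.get((a, b), 0.0) for a in values) for b in values}

    # Independence: the joint pmf factorizes into the product of marginals.
    independent = all(
        isclose(joint.get((a, b), 0.0), p_a[a] * p_b[b], abs_tol=tol)
        for a, b in product(values, repeat=2)
    )
    # Identical distribution: both marginals agree at every value.
    identical = all(isclose(p_a[v], p_b[v], abs_tol=tol) for v in values)
    return independent, identical


values = [0, 1]

# Two independent fair coin flips.
fair = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# B is a copy of A: same marginals, but clearly dependent.
copied = {(0, 0): 0.5, (1, 1): 0.5}
# Independent, but A is biased while B is fair.
mixed = {(0, 0): 0.45, (0, 1): 0.45, (1, 0): 0.05, (1, 1): 0.05}

print(check_iid_pair(fair, values))    # (True, True)
print(check_iid_pair(copied, values))  # (False, True)
print(check_iid_pair(mixed, values))   # (True, False)
```

Only the first table satisfies both conditions; the other two each fail exactly one of them.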

Note that the iid assumption allows us to characterize the joint distribution in a certain way, which then allows us to compute it. Otherwise, the model might grow to be very complex.
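One way to write out this characterization, using the notation from the question's definition and writing $F$ for the common marginal distribution function (a symbol introduced here only for illustration), is:

$$F_{\mathcal{X}_1, \dots, \mathcal{X}_n}(x_1, \dots, x_n) = \prod_{i=1}^{n} F(x_i)$$

For instance, for discrete variables that each take $k$ values, a general joint distribution over $n$ of them has $k^n - 1$ free parameters, whereas under the iid assumption only the $k - 1$ parameters of the common marginal are needed.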
