You're right in spirit: a supervised classifier's intermediate and deeper layers often do come to reflect class similarity, so the contrastive objective is implicitly achieved to some extent, much as you reasoned. But this is not always or exactly the case.
Contrastively learned embeddings (including semi-supervised ones) are typically more general-purpose and task-agnostic, because they encode semantic similarity explicitly; this makes them well suited for transfer to downstream tasks. They also apply to both labeled and unlabeled data, since positive and negative pairs can be defined through augmentation (as in the SimCLR framework) or through clustering-induced pseudo-labels, as in the sketch below.
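Here is a minimal, illustrative sketch of augmentation-based pairing with an NT-Xent loss in the spirit of SimCLR; the toy encoder, the noise-based stand-in "augmentations", the batch size, and the temperature are my placeholder assumptions, not part of the original recipe.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """z1, z2: embeddings of two augmented views of the same batch, shape (N, D)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)       # (2N, D), unit norm
    sim = z @ z.t() / temperature                             # scaled cosine similarities
    self_mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))           # exclude self-pairs
    # The positive for sample i is its other augmented view (i+N or i-N);
    # every other sample in the batch acts as a negative.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Toy usage: two random perturbations of the same inputs define the positive
# pairs, so no class labels are needed and unlabeled data can be used directly.
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
x = torch.randn(8, 3, 32, 32)
view1 = x + 0.1 * torch.randn_like(x)   # stand-in for a real augmentation pipeline
view2 = x + 0.1 * torch.randn_like(x)
loss = nt_xent_loss(encoder(view1), encoder(view2))
loss.backward()
```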
Classifier-trained embeddings, on the other hand, are task-specific and may not generalize well beyond the classification task they were trained on. In particular, if samples from the same class vary a lot, a cross-entropy classifier only needs to place them on the correct side of the decision boundary; it has no explicit pressure to reduce intra-class variance, whereas a contrastive objective pulls positive pairs together and so reduces that variance directly, as the sketch below makes explicit.
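For contrast, here is a supervised-contrastive (SupCon-style) sketch under the assumption that labels are available: positives are all same-label samples in the batch, so minimizing the loss directly shrinks intra-class distances, something cross-entropy never explicitly requires. The temperature and the toy batch are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F

def supcon_loss(z, labels, temperature=0.1):
    """z: embeddings (N, D); labels: (N,) integer class labels."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))            # drop self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)  # row-wise log-softmax
    # Positives for each sample are all other samples with the same label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    per_pos = log_prob.masked_fill(~pos_mask, 0.0)
    # Average log-probability over each sample's positives; pulling same-class
    # embeddings together is what raises these terms.
    return -(per_pos.sum(1) / pos_mask.sum(1).clamp(min=1)).mean()

# Toy usage on a labeled batch of embeddings.
z = torch.randn(8, 16, requires_grad=True)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
supcon_loss(z, labels).backward()
```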