Background
Suppose I have a quantum channel $\Phi:B(\mathcal{H}_1)\rightarrow B(\mathcal{H}_1)\otimes B(\mathcal{H}_2)$ and some small $\epsilon>0$ such that, for any two input states $\rho$ and $\sigma$,
$$ \Vert \rho - \sigma\Vert_1 (1-\epsilon) \leq \Vert\text{Tr}_2(\Phi(\rho)) - \text{Tr}_2(\Phi(\sigma))\Vert_1.\tag{1}$$
That is, the channel almost preserves distance even if we trace out the second system. This makes me think that the second system can't have much dependence on the first system: there should be some channel $\Phi_1:B(\mathcal{H}_1)\rightarrow B(\mathcal{H}_1)$ and a fixed state $\rho_0$ such that $\Phi$ is close to $\Phi_1\otimes \rho_0$, that is, a channel that just applies some channel to the first system and tacks on a fixed state to the second system.
Any proof presumably needs to use something like no-cloning, because the map $\Psi:\rho\mapsto \rho\otimes \rho$ satisfies the above inequality (with $\epsilon = 0$) but is not a quantum channel, since it is not linear.
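As a quick numerical sanity check of the cloning remark, here is a minimal sketch using NumPy. The helper names (`random_density_matrix`, `trace_norm`, `partial_trace_second`, `clone`) are hypothetical and chosen only for this illustration; the check uses a single random pair of qubit states, so it illustrates the claim rather than proving it.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative sampling, not Haar-uniform)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def trace_norm(x):
    """Trace norm ||x||_1, computed as the sum of singular values."""
    return np.linalg.svd(x, compute_uv=False).sum()

def partial_trace_second(x, d1, d2):
    """Trace out the second tensor factor of an operator on H_1 (x) H_2."""
    return np.trace(x.reshape(d1, d2, d1, d2), axis1=1, axis2=3)

def clone(rho):
    """The ill-defined 'cloning' map rho -> rho (x) rho."""
    return np.kron(rho, rho)

rng = np.random.default_rng(0)
d = 2
rho = random_density_matrix(d, rng)
sigma = random_density_matrix(d, rng)

# Inequality (1) holds with epsilon = 0: tracing out the copy returns the input.
lhs = trace_norm(rho - sigma)
rhs = trace_norm(partial_trace_second(clone(rho), d, d)
                 - partial_trace_second(clone(sigma), d, d))

# Linearity fails: the map does not respect convex mixtures of inputs.
mix = 0.5 * rho + 0.5 * sigma
linearity_gap = trace_norm(clone(mix) - 0.5 * clone(rho) - 0.5 * clone(sigma))

print("||rho - sigma||_1             :", lhs)
print("||Tr_2 difference||_1         :", rhs)
print("violation of linearity (norm) :", linearity_gap)
```

The linearity gap equals $\tfrac14\Vert\rho-\sigma\Vert_1^2$, because the defect is $-\tfrac14(\rho-\sigma)\otimes(\rho-\sigma)$, so it vanishes only when $\rho=\sigma$.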
Question
Is there any way to prove that $\Phi$ has this form of "close to just adding a constant state to the second system"?
To phrase this formally: for any $\delta>0$, is there an $\epsilon>0$ such that, for every channel $\Phi$ satisfying inequality (1) for all input states, there exist a channel $\Phi_1:B(\mathcal{H}_1)\rightarrow B(\mathcal{H}_1)$ and a state $\rho_0\in B(\mathcal{H}_2)$ such that $\Phi$ is within distance $\delta$ of the channel $\tilde{\Phi}:\rho\mapsto \Phi_1(\rho)\otimes \rho_0$?
Additional context
I am imagining two unitaries $U_1$ and $U_2$ whose actions differ only on 2 basis states. They are far apart in operator norm, $\Vert U_1 - U_2\Vert = 2$, but there is a subspace $V$ making up almost the full space on which they agree, $U_1\vert_V = U_2\vert_V$. Now I only have noisy channels $\mathcal{\tilde{U}}_i$ that implement these, i.e. $\mathcal{\tilde{U}}_i = (1-p)\,\mathcal{U}_i + p \mathcal{D}$, where $\mathcal{U}_i(\rho) = U_i\rho U_i^\dagger$ and $\mathcal{D}$ is some noise channel. Then considering the channel
$$ (I\otimes \mathcal{\tilde{U}}_i)\Phi \tag{2}$$
I want to argue that there is a trade-off between the fidelity of this channel and its ability to distinguish between $U_1$ and $U_2$. That is, suppose I have two input states $\rho$ and $\sigma$ that distinguish $U_1$ and $U_2$. After I apply $\Phi$, if too much information about the input state is in the second system, then there is an irrecoverable loss because $\mathcal{\tilde{U}}_i$ is noisy; but if not enough information about the input state is in the second system, then the channel can't distinguish $U_1$ from $U_2$.
The original question should solve this (if $\Phi$ is close to $\Phi_1\otimes\rho_0$, then it can't distinguish $U_1$ from $U_2$ very well) but maybe there are other approaches.
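To make the setup above concrete, here is a minimal numerical sketch, with several assumptions that are not part of the question: the dimension is 4, $U_1$ is the identity, $U_2$ swaps the first two basis states, and the noise channel $\mathcal{D}$ is taken to be completely depolarizing. The function name `noisy_unitary_channel` is hypothetical. The sketch does not involve $\Phi$; it only shows how far apart the two noisy channels are on inputs inside and outside $V$.

```python
import numpy as np

def trace_norm(x):
    """Trace norm ||x||_1, computed as the sum of singular values."""
    return np.linalg.svd(x, compute_uv=False).sum()

def noisy_unitary_channel(u, p):
    """Return rho -> (1 - p) u rho u^dagger + p D(rho).

    Assumption: D is the completely depolarizing channel, D(rho) = Tr(rho) I / d.
    """
    d = u.shape[0]
    def channel(rho):
        return (1 - p) * u @ rho @ u.conj().T + p * np.trace(rho) * np.eye(d) / d
    return channel

def basis_projector(k, d):
    """Projector |k><k| onto the k-th computational basis state."""
    e = np.zeros(d)
    e[k] = 1.0
    return np.outer(e, e)

d, p = 4, 0.1
u1 = np.eye(d)
u2 = np.eye(d)
u2[[0, 1]] = u2[[1, 0]]  # swap basis states 0 and 1; identity on V = span{|2>, |3>}

# Operator (spectral) norm of the difference.
op_norm_gap = np.linalg.norm(u1 - u2, 2)

ch1 = noisy_unitary_channel(u1, p)
ch2 = noisy_unitary_channel(u2, p)

# Input outside V: the ideal outputs |0> and |1> are orthogonal.
gap_outside_v = trace_norm(ch1(basis_projector(0, d)) - ch2(basis_projector(0, d)))
# Input inside V: both noisy channels act identically.
gap_inside_v = trace_norm(ch1(basis_projector(2, d)) - ch2(basis_projector(2, d)))

print("||U_1 - U_2||           :", op_norm_gap)
print("output gap, input |0>   :", gap_outside_v)
print("output gap, input |2>   :", gap_inside_v)
```

Under these assumptions the noise term cancels in the difference, so the output gap for an input outside $V$ is $2(1-p)$, while inputs supported on $V$ give no distinguishability at all.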
 
     
    