Mass renormalization is indeed introduced to cure the divergent behaviour of the real part of the self-energy, while the imaginary part (giving the lifetime) is seemingly finite. However, in the naive calculation, it actually contains the unobservable (and infinite) bare mass. The lifetime formula must also be calculated from the renormalized formalism, but in the end this will make no difference, we will end up with formally the same expression; the only difference is that the observed (that is, renormalized) mass $m_\text{obs}$ enters the lifetime formula as well.
In the old-fashioned PT (i.e. non-relativistic QM coupled to a quantized electromagnetic field), we find that the renormalized Hamiltonian of the hydrogen atom reads
$$
H=H_\text{atom}+H_\text{rad}+H_\text{int} \ ,
$$
where
$$
H_\text{atom}=\frac{P^2}{2m_\text{obs}}-\frac{Z\alpha}{r}
$$
is the Hamiltonian of the hydrogen without any field (with eigenstates and eigenvalues $\phi_n$, $E_n$),
$$
H_\text{rad}=\frac{1}{2}
\int\mathrm{d}^3r\left[\Pi^2+(\nabla\times\vec{A})^2\right]
$$
is the Hamiltonian of the field ($\vec{A}$ and $\vec{\Pi}=-\vec{E}_\perp$ being the vector potential and its conjugate momentum, respectively), and
$$
H_\text{int}=\frac{e}{m_\text{obs}}\vec{P}\cdot\vec{A}+\frac{e^2}{2m_\text{obs}}A^2+\left(\frac{\delta m}{m_\text{obs}}\right)\frac{P^2}{2m_\text{obs}}
$$
is the usual minimal coupling interaction. The only non-standard term is the last one, the mass counterterm, which is introduced to cancel the divergence of the self-energy. In the lowest order and in the dipole approximation, $\delta m$ reads
$$
\delta m=\frac{4\alpha}{3\pi}\Lambda \ ,
$$
as $\Lambda\rightarrow\infty$. See Chapter 11. of Molecular Quantum Electrodynamics by Craig & Thirunamachandran for the derivation of this Hamiltonian with this counterterm (and also Chapter 3.4 of Relativistic Quantum Mechanics and Field Theory by Gross for some details on the following PT calculation).
When calculating the energy shift and decay rate in the renormalized theory, everything goes on just like in the naive "Rayleigh-Schrödinger PT + dipole approximation" calculation (the $\sim A^2$ term gives an infinite but state-independent "constant" contribution which can be absorbed by the redefinition of zero energy, while the $\sim\vec{P}\cdot\vec{A}$ term yields in second order a state-dependent linear divergence cancelled by the mass counterterm), and we end up with
$$
\Delta {\cal{E}}_{n,\Lambda}=\frac{2\alpha}{3\pi m^2_\text{obs}}
\sum_m(E_m-E_n)|\langle\phi_n|\vec{P}|\phi_m\rangle|^2
\int_0^\Lambda\mathrm{d}k\frac{1}{E_m-E_n+k}
$$
for the $n$-th state. The integrand can become singular for excited states when $k=E_n-E_m$. This must be avoided with an appropriate $i0^+$ prescription, that is, by formally adding a small imaginary part to the energy:
$$
\begin{aligned}
\Delta {\cal{E}}_{n,\Lambda}&=\frac{2\alpha}{3\pi m^2_\text{obs}}
\sum_m(E_m-E_n)|\langle\phi_n|\vec{P}|\phi_m\rangle|^2
\int_0^\Lambda\mathrm{d}k\frac{1}{E_m-E_n+k-i0^+} \\
&=\Delta {{E}}_{n,\Lambda}-\frac{i}{2}\Gamma_n \ ,
\end{aligned}
$$
where
$$
\Delta {{E}}_{n,\Lambda}=\frac{2\alpha}{3\pi m^2_\text{obs}}
\sum_m(E_m-E_n)|\langle\phi_n|\vec{P}|\phi_m\rangle|^2
{\cal{P}}\int_0^\Lambda\mathrm{d}k\frac{1}{E_m-E_n+k} \ ,
$$
and
$$
\begin{aligned}
\Gamma_n&=
-\frac{4\alpha}{3 m^2_\text{obs}}
\sum_m(E_m-E_n)|\langle\phi_n|\vec{P}|\phi_m\rangle|^2
\int_0^\infty\mathrm{d}k\delta(E_m-E_n+k) \\
&=\frac{4\alpha}{3 m^2_\text{obs}}
\sum_{\substack{m \\ E_n>E_m}}(E_n-E_m)|\langle\phi_n|\vec{P}|\phi_m\rangle|^2 \\
&=\frac{4\alpha}{3}
\sum_{\substack{m \\ E_n>E_m}}(E_n-E_m)^3|\langle\phi_n|\vec{r}|\phi_m\rangle|^2
\ .
\end{aligned}
$$
We used
$$
\frac{1}{x\pm i0^+}={\cal{P}}\frac{1}{x}\mp i\pi\delta(x)
$$
to separate the real and imaginary parts, and we were free to let $\Lambda\rightarrow\infty$ in the decay rate formula. The expression for $\Gamma_n=1/\tau_n$ looks just like the one calculated naively, but with $m_\text{obs}$ instead of the bare mass.
Note that the decay rate was extracted from an essentially time-independent PT formalism by avoiding poles along the real line, an approach reminiscent of the treatment of resonances. The correct choice of the sign of $i0^+$ is better motivated in the full QED derivation; in this context, not much more can be said other than choosing $-i0^+$ instead of $+i0^+$ is necessary to prevent an unphysical, exponentially growing probability.
The real part providing the energy shift is still logarithmically divergent. Bethe simply introduced the cut-off $\Lambda=m$, while in the QED treatment, this divergence is cancelled by the infrared divergence coming from the $F_1$ form factor of the one-loop vertex correction. Note, however, that the violently singular nature of the self-energy (containing both linear and logarithmic divergences) is an artifact of the dipole approximation. If one calculated the one-loop self-energy in the old-fashioned PT but without dipole approximation, then only a single logarithmic divergence should be treated in the mass renormalization, and the remaining parts would be finite; see Lett. Math. Phys. 24 115 (1992) for details.