I've been having a problem that I can't explain with my limited understanding of "how things work". I connect to various servers for work (I'll use svn.mycompany.com as the example). We generally use OpenVPN to create a secure connection to those servers. The company has OpenDNS set up to provide ordinary resolution for the servers, and the resulting addresses of course only work via an established VPN tunnel.

After some seemingly innocuous reconfiguration of some of the servers (not done by me, and apparently working for everybody else), I noticed the following pattern. With a VPN tunnel up, I try an svn command:

svn update

I get an error back immediately that the host svn.mycompany.com can't be resolved. If I then do a host lookup:

host -a svn.mycompany.com

That responds with the correct IP address. If I then retry the svn command, it works, and svn keeps working for a while. After some unmeasured period of time, however, it stops working again and the cycle repeats.

The same pattern holds for other servers on the other side of the tunnel. I've seen this happen from different networks (i.e., at my house, out at a coffee shop, etc).

I'm not looking for an overall solution. My real question is: how is it that simply running host -a can at least temporarily "fix" the situation of a domain not resolving? Does host do something special to bypass a local cache? (If so, I'm still confused, because the addresses of the servers don't change, or change rarely.)

Edit: more information. By turning up logging for systemd-resolved, I was able to use journalctl to track what my local machine is doing with DNS lookups. What I saw seems interesting, but I still don't know enough to understand what it means. DNS lookups of query type ANY seem to overflow the UDP packet size, so systemd-resolved falls back to making a TCP query. For a normal non-ANY lookup on these foo.mycompany.com names, I don't get the packet overflow, but systemd-resolved goes on to create a NODATA entry in its local cache.

When the ANY queries force the TCP fallback, systemd-resolved gets a useful result and makes a positive cache entry.

To me this means that something weird is going on with the UDP responses, but I don't know what that implies about the root cause.
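
One way to check the UDP-versus-TCP difference directly, without going through the systemd-resolved cache, might be a small script like the one below. It is only a sketch: the function names (build_query, parse_header, query_udp, query_tcp) are made up for illustration, and the server address you pass in is whatever upstream resolver your VPN actually uses, which I'm not assuming here. It sends the same question over UDP and over TCP and reports the truncation (TC) flag, the response code, and the answer count, so a NODATA-style reply (rcode 0 with zero answers) shows up clearly.

```python
import socket
import struct

QTYPE_A = 1      # ordinary address lookup
QTYPE_ANY = 255  # what `host -a` asks for


def build_query(name, qtype=QTYPE_A, query_id=0x1234):
    """Build a minimal DNS query packet (RFC 1035 wire format)."""
    # Header: id, flags (only RD set), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS=IN


def parse_header(response):
    """Pull out the header fields that matter for this diagnosis."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "truncated": bool(flags & 0x0200),  # TC bit: reply did not fit
        "rcode": flags & 0x000F,            # 0 = NOERROR, 3 = NXDOMAIN
        "answers": ancount,                 # 0 with rcode 0 looks like NODATA
    }


def _recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def query_udp(server, name, qtype=QTYPE_A, timeout=3.0):
    """Send one query over UDP and summarize the reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_query(name, qtype), (server, 53))
        data, _ = s.recvfrom(4096)
    return parse_header(data)


def query_tcp(server, name, qtype=QTYPE_A, timeout=3.0):
    """Send the same query over TCP (2-byte length prefix on each message)."""
    packet = build_query(name, qtype)
    with socket.create_connection((server, 53), timeout=timeout) as s:
        s.sendall(struct.pack("!H", len(packet)) + packet)
        length = struct.unpack("!H", _recv_exact(s, 2))[0]
        data = _recv_exact(s, length)
    return parse_header(data)
```

Calling query_udp(server, "svn.mycompany.com") and query_tcp(server, "svn.mycompany.com") with the same server and comparing the two dictionaries should show whether the UDP path alone returns an empty or truncated reply. Repeating both calls with qtype=QTYPE_ANY would mimic what host -a sends.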

Pointy
