logpad

keeping logs, etc.

IPv6 Stream TCP Endpoints: Part 2

“Twisted python…. it’s featurrific!”

Another endpoint!
Twisted now supports IPv6 TCP client endpoints.
And guess what, it accepts hostnames too!

Other than enabling you to create a TCP client that connects to IPv6 hosts, this endpoint is special because it does name-resolution: When you pass a hostname to the endpoint, it uses socket.getaddrinfo to resolve it into IPv6 addresses and uses the first address on the list to connect.
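To make the resolution step concrete, here is a minimal sketch of the idea. It is not Twisted's actual implementation: the function name first_ipv6_address and its getaddrinfo parameter are made up for illustration, and the real endpoint deals with Deferreds rather than a blocking call.

```python
import socket


def first_ipv6_address(host, port, getaddrinfo=socket.getaddrinfo):
    """Resolve host and return the first IPv6 (address, port) pair.

    The getaddrinfo parameter is a hypothetical hook so that a test
    can swap in its own resolver; it defaults to the real one.
    """
    results = getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    # Each result is (family, socktype, proto, canonname, sockaddr);
    # for AF_INET6, sockaddr is (address, port, flowinfo, scope_id).
    family, socktype, proto, canonname, sockaddr = results[0]
    return sockaddr[0], sockaddr[1]
```

Calling first_ipv6_address("localhost", 80) on a machine with IPv6 configured would typically return the loopback address, though the result depends on the local resolver.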

What I learnt

Using fakes

This was the endpoint where I came to appreciate the importance and necessity of using fake components for testing, instead of testing our code with the real stuff, thanks to a discussion with glyph, tomprince, and radix.
It started with glyph telling me that I shouldn’t be talking to the real system resolver; it is insufficiently reliable for a unit test. So… the real thing is not reliable enough? How is a unit test different from any other code which will be using it?
And then came the answers:
The difference with a unit test is that test coverage must cover all the conditions, even those which do not currently exist. The real code only cares about the current state of the universe, but since we expect real code to run in many different universes at many different times, we must have tests that cover all of it. For example, your computer may exist in one of two states - ethernet cable plugged in, or not plugged in. Name resolution will behave differently in these cases. So we have to have a unit test that behaves as if your ethernet cable were plugged in, at least for the purpose of the code under test, and one that behaves as if it isn’t. And of course, this extends all the way up through DNS configuration and routing tables and whether the server you’re trying to connect to actually exists or not, which produce similar (but distinct) error conditions.

So yes, the real code can’t reliably fail on demand. The whole point of the real code is telling you the real, current state of the ethernet cable, after all.

Also, a lot of this just has to do with the speed of the tests. If you have 1000 tests which do DNS resolutions, it’s a lot faster to call a function that just does return “1.1.1.1” than to call a function that has to do a whole ton of real work on your network hardware in order to just get that same answer.

Thus, fake code will create a virtual universe within the real one to test how the code will behave there. Basically the test is fooling the feature’s code with certain environments, to test how it would behave in them.
This also means that someone who doesn’t have access to one of those environments (for example, someone on a network that just firewalls out IPv6 traffic) can still maintain the code and submit fixes.

Except ‘virtual universe’ is probably thinking way too big. Unit testing is all about individual components talking to each other. If you have a widget that connects to a gizmo, and you need to test that the widget works, you just have a testing gizmo that behaves differently from a real one. It’s not really a “fake” gizmo just because it only does one thing instead of all of the things that some other gizmo might normally do, and you certainly don’t need to simulate the entire universe that produced your testing gizmo.
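In code, a pair of testing gizmos for name resolution might look like the sketch below. Everything here is hypothetical and only for illustration: resolve_first stands in for the code under test, and the two fake resolvers play the "cable plugged in" and "cable unplugged" cases without touching the network.

```python
import socket


def resolve_first(host, port, getaddrinfo=socket.getaddrinfo):
    """Code under test: return the first resolved address, or None on failure."""
    try:
        results = getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return None
    return results[0][4][0]


def fake_getaddrinfo_ok(host, port, family=0, type=0):
    # Testing gizmo for the "cable plugged in" case: always answers.
    return [(socket.AF_INET6, socket.SOCK_STREAM, socket.IPPROTO_TCP, "",
             ("::1", port, 0, 0))]


def fake_getaddrinfo_unplugged(host, port, family=0, type=0):
    # Testing gizmo for the "cable unplugged" case: fails on demand.
    # The error code and message are illustrative, not copied from a real run.
    raise socket.gaierror(-2, "Name or service not known")


# The same code is exercised in both cases, deterministically and fast.
assert resolve_first("example.invalid", 80, fake_getaddrinfo_ok) == "::1"
assert resolve_first("example.invalid", 80, fake_getaddrinfo_unplugged) is None
```

Neither fake tries to be a complete resolver; each one does the single thing its test needs.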

But then, there’s also a tiny glitch: this way, we test that name resolution works with some (fake) function that I wrote, and the test passes, but how can I reliably check that it will work with socket.getaddrinfo too?

To which, I got this:
For our purposes right now, the way you know that it works the same as getaddrinfo is that you read the docs, you called getaddrinfo in a python shell and you compared the output carefully. And since it’s unlikely that the behavior of getaddrinfo will change (in any way that we care about), that kind of manual verification is fine. We don’t need to test its behavior, we need to test our behavior. There is something to be said for testing that we are compatible with it, but in a case such as this, where it’s such a relatively stable API, it’s not that critical.

But the better technique is to have a library of verified fakes where a specially-configured machine is used to make sure that the real one and the fake one have the same behavior in one test, so other tests can confidently use the fake one in N tests.
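A rough sketch of what one such verification could look like follows. The names fake_getaddrinfo, shape, and fake_matches_real are made up for illustration, and the check only compares the structure of the results, which is an assumption about what "same behavior" needs to mean here.

```python
import socket


def fake_getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    # The fake that the other N tests would use.
    return [(socket.AF_INET6, socket.SOCK_STREAM, socket.IPPROTO_TCP, "",
             ("::1", port, 0, 0))]


def shape(results):
    """Reduce a getaddrinfo-style result to its structure.

    Returns (length of the first entry, length of its sockaddr,
    type names of the sockaddr fields).
    """
    first = results[0]
    sockaddr = first[4]
    return (len(first), len(sockaddr),
            tuple(type(field).__name__ for field in sockaddr))


def fake_matches_real(host="::1", port=80):
    # The one test that talks to the real thing. It is meant to run only
    # on a machine known to support IPv6, so it is not called here.
    real = socket.getaddrinfo(host, port, socket.AF_INET6,
                              socket.SOCK_STREAM, 0, socket.AI_NUMERICHOST)
    fake = fake_getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    return shape(real) == shape(fake)
```

If fake_matches_real() holds on the specially-configured machine, the remaining tests can lean on fake_getaddrinfo everywhere else.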

Deferred Callbacks

I thought I had gotten comfortable with the idea of Deferred callbacks and using them effectively, but this sums it up best, I think:

radix: well, running it works well :>
glyph: yeah, but don’t ask what it does because it’ll KILL YOU WITH ITS TEETH!

(I almost got killed, by the way, a few weeks later, while making another endpoint, by them Deferreds.)

Oh, and I also got to know about using lambda forms.

(Note: In case you are interested, Twisted’s got a smarter name-based TCP connection endpoint under development.)
