How to best distribute tests against multiple devices or resources with pytest? This interesting question came up during my training in Lviv (Ukraine) at an embedded systems company. Distributing tests to processes can serve two purposes:
- running the full test suite against each device to verify they all work according to the test specification
- distributing the test load to several devices of the same type in order to minimize overall test execution time.
The solution to both problems is easy if you use two pytest facilities:
- the general fixture mechanism: we write a fixture function which provides a device object which is pre-configured for use in tests.
- the pytest-xdist plugin: we use it to run subprocesses and communicate configuration data for the device fixture from the master process to the subprocesses.
To begin with, let’s configure three devices that are each reachable by a separate IP address. We create a list of ip addresses in a file:
# content of devices.json ["192.168.0.1", "192.168.0.2", "192.168.0.3"]
We now create a local pytest plugin which reads the configuration data, implements a per-process device fixture and the master-to-slave communication to configure each subprocess according to our device list:
# content of conftest.py import pytest def read_device_list(): import json with open("devices.json") as f: return json.load(f) def pytest_configure(config): # read device list if we are on the master if not hasattr(config, "slaveinput"): config.iplist = read_device_list() def pytest_configure_node(node): # the master for each node fills slaveinput dictionary # which pytest-xdist will transfer to the subprocess node.slaveinput["ipadr"] = node.config.iplist.pop() @pytest.fixture(scope="session") def device(request): slaveinput = getattr(request.config, "slaveinput", None) if slaveinput is None: # single-process execution ipadr = read_device_list() else: # running in a subprocess here ipadr = slaveinput["ipadr"] return Device(ipadr) class Device: def __init__(self, ipadr): self.ipadr = ipadr def __repr__(self): return "<Device ip=%s>" % (self.ipadr)
We can now write tests that simply make use of the device fixture by using its name as an argument to a test function:
# content of test_device.py import time def test_device1(device): time.sleep(2) # simulate long test time assert 0, device def test_device2(device): time.sleep(2) # simulate long test time assert 0, device def test_device3(device): time.sleep(2) # simulate long test time assert 0, device
Let’s first run the tests in a single-process, only using a single device (also using some reporting option to shorten output):
$ py.test test_device.py -q --tb=line FFF ================================= FAILURES ================================= /tmp/doc-exec-9/test_device.py:5: AssertionError: <Device ip=192.168.0.1> /tmp/doc-exec-9/test_device.py:9: AssertionError: <Device ip=192.168.0.1> /tmp/doc-exec-9/test_device.py:13: AssertionError: <Device ip=192.168.0.1> 3 failed in 6.02 seconds
As to be expected, we get six seconds execution time (3 tests times 2 seconds each).
Now let’s run the same tests in three subprocesses, each using a different device:
$ py.test --tx 3*popen --dist=each test_device.py -q --tb=line gw0 I / gw1 I / gw2 I gw0  / gw1  / gw2  scheduling tests via EachScheduling FFFFFFFFF ================================= FAILURES ================================= E AssertionError: <Device ip=192.168.0.1> E AssertionError: <Device ip=192.168.0.3> E AssertionError: <Device ip=192.168.0.2> E AssertionError: <Device ip=192.168.0.1> E AssertionError: <Device ip=192.168.0.3> E AssertionError: <Device ip=192.168.0.2> E AssertionError: <Device ip=192.168.0.3> E AssertionError: <Device ip=192.168.0.1> E AssertionError: <Device ip=192.168.0.2> 9 failed in 6.52 seconds
We just created three subprocesses each running three tests. Instead of 18 seconds execution time (9 tests times 2 seconds per test) we roughly got 6 seconds, a 3-times speedup. Each subprocess ran in parallel three tests against “its” device.
Let’s also run with load-balancing, i.e. distributing the tests against three different devices so that each device executes one test:
$ py.test --tx 3*popen --dist=load test_device.py -q --tb=line gw0 I / gw1 I / gw2 I gw0  / gw1  / gw2  scheduling tests via LoadScheduling FFF ================================= FAILURES ================================= E AssertionError: <Device ip=192.168.0.3> E AssertionError: <Device ip=192.168.0.2> E AssertionError: <Device ip=192.168.0.1> 3 failed in 2.50 seconds
Here each test runs in a separate process against its device, overall more than halfing the test time compared to what it would take in a single-process (3*2=6 seconds). If we had many more tests than subproceses than load-scheduling would distribute tests in real-time to the process which has finished executing other tests.
Note that the tests themselves do not need to be aware of the distribution mode. All configuration and setup is contained in the conftest.py file.
To summarize the behaviour of the hooks and fixtures in conftest.py:
- pytest_configure(config) is called both on the master and each subprocess. We can distinguish where we are by checking for presence of config.slaveinput.
- pytest_configure_node(node) is called for each subprocess. We can fill the slaveinput dictionary which the subprocess slave can then read via its config.slaveinput dictionary.
- the device fixture only is called when a test needs it. In distributed mode, tests are only collected and executed in a subprocess. In non-distributed mode, tests are run single-process. The Device class is just a stub — it will need to grow methods for actual device communication. The tests can then simply use those device methods.
I’d like to thank Anton and the participants of my three day testing training in Lviv (Ukraine) for bringing up this and many other interesting questions.
I am giving another such professional testing course 25-27th November at the Python Academy in Leipzig. There are still two seats available. Me and other trainers can also be booked for on-site/in-house trainings worldwide.
Gandalf and Frodo did the right thing when they went for destroying the power of the all-seeing eye. The idea of a central power that knows everything undermines our ability to self-govern and influence important changes in society, it undermines a foundation of democracy.
As against Sauron, it seems like an impossible fight to try to protect our communication against present-day espionage cartels. I see glimmers of hope, though. Certainly not much in the political space. Somehow our politicians are themselves too interested to use the eye on select targets — even if only the ones which Sauron allows them to see.
My bigger hope lies with technologists who are working on designing better communication systems. We still have time during which we can reduce Sauron’s sight. But to begin with, how do we prevent passive spying attacks against our communications?
A good part of the answer lies in the Trust on first use principle. The mobile Threema application is a good example: when two people first connect with each other, they exchange communication keys and afterwards use it to perform end-to-end encrypted communications. The key exchange can happen in full sight of the eye, yet the subsequent communication will be illegible. No question, the eye can notice that the two are communicating with unknown content but if too many of them do that this fact becomes less significant.
Of course, the all-seeying eye can send a Nazgul to stand in the middle of the communication to deceive both ends and listen in. But it needs to do so from the beginning and continously if it wants to avoid the victims from noticing. And those two can at any time meet to verify their encryption keys and would realize there was a Nazgul-in-the-middle attack.
By contrast, both SSL and GPG operate with a trust model where we can hear Sauron’s distant laughter. The one is tied to a thousand or so “root authorities”, which can be easily reined in as need be. The other mandates and propagates such a high level of initial mistrust between us that we find it simply too inconvenient to use.
Societies and our social interactions are fundamentally build on trust. Let’s design systems which build on initial trust and which help to identify after-the-fact when it was compromised. If the eye has bad dreams, then i am sure massively deployed trust-on-first-use communication systems are among them.
- new yield-style fixtures pytest.yield_fixture, allowing to use existing with-style context managers in fixture functions.
- improved pdb support: import pdb ; pdb.set_trace() now works without requiring prior disabling of stdout/stderr capturing. Also the --pdb options works now on collection and internal errors and we introduced a new experimental hook for IDEs/plugins to intercept debugging: pytest_exception_interact(node, call, report).
- shorter monkeypatch variant to allow specifying an import path as a target, for example: monkeypatch.setattr("requests.get", myfunc)
- better unittest/nose compatibility: all teardown methods are now only called if the corresponding setup method succeeded.
- integrate tab-completion on command line options if you have argcomplete configured.
- allow boolean expression directly with skipif/xfail if a “reason” is also specified.
- a new hook pytest_load_initial_conftests allows plugins like pytest-django to influence the environment before conftest files import django.
- reporting: color the last line red or green depending if failures/errors occured or everything passed.
The documentation has been updated to accomodate the changes, see http://pytest.org
To install or upgrade pytest:
pip install -U pytest # or easy_install -U pytest
Many thanks to all who helped, including Floris Bruynooghe, Brianna Laugher, Andreas Pelme, Anthon van der Neut, Anatoly Bubenkoff, Vladimir Keleshev, Mathieu Agopian, Ronny Pfannschmidt, Christian Theunert and many others.
may nice fixtures and passing tests be with you,
Changes between 2.3.5 and 2.4
- if calling –genscript from python2.7 or above, you only get a standalone script which works on python2.7 or above. Use Python2.6 to also get a python2.5 compatible version.
- all xunit-style teardown methods (nose-style, pytest-style, unittest-style) will not be called if the corresponding setup method failed, see issue322 below.
- the pytest_plugin_unregister hook wasn’t ever properly called and there is no known implementation of the hook – so it got removed.
- pytest.fixture-decorated functions cannot be generators (i.e. use yield) anymore. This change might be reversed in 2.4.1 if it causes unforeseen real-life issues. However, you can always write and return an inner function/generator and change the fixture consumer to iterate over the returned generator. This change was done in lieu of the new pytest.yield_fixture decorator, see below.
experimentally introduce a new pytest.yield_fixture decorator which accepts exactly the same parameters as pytest.fixture but mandates a yield statement instead of a return statement from fixture functions. This allows direct integration with “with-style” context managers in fixture functions and generally avoids registering of finalization callbacks in favour of treating the “after-yield” as teardown code. Thanks Andreas Pelme, Vladimir Keleshev, Floris Bruynooghe, Ronny Pfannschmidt and many others for discussions.
allow boolean expression directly with skipif/xfail if a “reason” is also specified. Rework skipping documentation to recommend “condition as booleans” because it prevents surprises when importing markers between modules. Specifying conditions as strings will remain fully supported.
reporting: color the last line red or green depending if failures/errors occured or everything passed. thanks Christian Theunert.
make “import pdb ; pdb.set_trace()” work natively wrt capturing (no “-s” needed anymore), making pytest.set_trace() a mere shortcut.
fix issue181: –pdb now also works on collect errors (and on internal errors) . This was implemented by a slight internal refactoring and the introduction of a new hook pytest_exception_interact hook (see next item).
fix issue341: introduce new experimental hook for IDEs/terminals to intercept debugging: pytest_exception_interact(node, call, report).
new monkeypatch.setattr() variant to provide a shorter invocation for patching out classes/functions from modules:
will replace the “get” function of the “requests” module with myfunc.
fix issue322: tearDownClass is not run if setUpClass failed. Thanks Mathieu Agopian for the initial fix. Also make all of pytest/nose finalizer mimick the same generic behaviour: if a setupX exists and fails, don’t run teardownX. This internally introduces a new method “node.addfinalizer()” helper which can only be called during the setup phase of a node.
simplify pytest.mark.parametrize() signature: allow to pass a CSV-separated string to specify argnames. For example: pytest.mark.parametrize("input,expected", [(1,2), (2,3)]) works as well as the previous: pytest.mark.parametrize(("input", "expected"), ...).
add support for setUpModule/tearDownModule detection, thanks Brian Okken.
integrate tab-completion on options through use of “argcomplete”. Thanks Anthon van der Neut for the PR.
change option names to be hyphen-separated long options but keep the old spelling backward compatible. py.test -h will only show the hyphenated version, for example “–collect-only” but “–collectonly” will remain valid as well (for backward-compat reasons). Many thanks to Anthon van der Neut for the implementation and to Hynek Schlawack for pushing us.
fix issue 308 – allow to mark/xfail/skip individual parameter sets when parametrizing. Thanks Brianna Laugher.
call new experimental pytest_load_initial_conftests hook to allow 3rd party plugins to do something before a conftest is loaded.
- fix issue358 – capturing options are now parsed more properly by using a new parser.parse_known_args method.
- pytest now uses argparse instead of optparse (thanks Anthon) which means that “argparse” is added as a dependency if installing into python2.6 environments or below.
- fix issue333: fix a case of bad unittest/pytest hook interaction.
- PR27: correctly handle nose.SkipTest during collection. Thanks Antonio Cuni, Ronny Pfannschmidt.
- fix issue355: junitxml puts name=”pytest” attribute to testsuite tag.
- fix issue336: autouse fixture in plugins should work again.
- fix issue279: improve object comparisons on assertion failure for standard datatypes and recognise collections.abc. Thanks to Brianna Laugher and Mathieu Agopian.
- fix issue317: assertion rewriter support for the is_package method
- fix issue335: document py.code.ExceptionInfo() object returned from pytest.raises(), thanks Mathieu Agopian.
- remove implicit distribute_setup support from setup.py.
- fix issue305: ignore any problems when writing pyc files.
- SO-17664702: call fixture finalizers even if the fixture function partially failed (finalizers would not always be called before)
- fix issue320 – fix class scope for fixtures when mixed with module-level functions. Thanks Anatloy Bubenkoff.
- you can specify “-q” or “-qq” to get different levels of “quieter” reporting (thanks Katarzyna Jachim)
- fix issue300 – Fix order of conftest loading when starting py.test in a subdirectory.
- fix issue323 – sorting of many module-scoped arg parametrizations
- make sessionfinish hooks execute with the same cwd-context as at session start (helps fix plugin behaviour which write output files with relative path such as pytest-cov)
- fix issue316 – properly reference collection hooks in docs
- fix issue 306 – cleanup of -k/-m options to only match markers/test names/keywords respectively. Thanks Wouter van Ackooy.
- improved doctest counting for doctests in python modules — files without any doctest items will not show up anymore and doctest examples are counted as separate test items. thanks Danilo Bellini.
- fix issue245 by depending on the released py-1.4.14 which fixes py.io.dupfile to work with files with no mode. Thanks Jason R. Coombs.
- fix junitxml generation when test output contains control characters, addressing issue267, thanks Jaap Broekhuizen
- fix issue338: honor –tb style for setup/teardown errors as well. Thanks Maho.
- fix issue307 – use yaml.safe_load in example, thanks Mark Eichin.
- better parametrize error messages, thanks Brianna Laugher
- pytest_terminal_summary(terminalreporter) hooks can now use “.section(title)” and “.line(msg)” methods to print extra information at the end of a test run.
My “speed up pypi installs” PEP438 has been accepted and transition phase 1 is live: as a package maintainer you can speed up the installation for your packages for all your users now, with the click of a button: Login to https://pypi.python.org and then go to urls for each of your packages, and specify that all release files are hosted from pypi.python.org. Or add explicit download urls with an MD5. Tools such as pip or easy_install will thus avoid any slow crawling of third party sites.
Many thanks to Carl Meyer who helped me write the PEP, and Donald Stufft for implementing most of it, and Richard Jones who accepted it today! And thanks also to the distutils-sig discussion participants, in particular Phillip Eby and Marc-Andre Lemburg.
If a man were to tweet a mysogynist joke and his followers were men, would that be an issue? What if one of them re-tweets it and one of his female followers complains on twitter? And then many other people start tweeting and re-tweeting this or that and what if this all got the initial tweeter fired from his company? And then her company would fire her as well?
Quite a mess, obviously. However, I think everyone had their reasons for talking and acting the way they did. And it boils down to the perspective you are able to feel empathy for. Here is a possible set of perspectives:
Perspective M: “The other day i was ridiculed by a bunch of girls at the office. I wanted to pay back with a little joke in an environment where i felt safe to do so.”
Perspective F: “Again a tweet with bad mysogynist jokes. I’ve had enough. This time i won’t sit quietly but call it out.”
Perspective C1: “Damn, look what this guy caused. His twitter profile is directly associated with our company. And now he tells bad mysogynist jokes and it’s now all over the internet. We cannot let go this time.”
Perspective C2: “Damn it, look what she caused. She is working in public relations and doesn’t know better than to cause a shitstorm which directly comes back to us a company? We cannot let go.”
I could understand each of these perspectives though i’d have a suspicision that the companies choose a bit of an easy way out. Had they rather put the issue of misogyny at the center of their positioning and communication, rather than focusing just on keeping some damage from the company, everybody would have learned a lesson and the incident could have contributed to a more enjoyable environment, i am sure.
Besides the good feedback and discussions around my talk, i just had a great few days. It was my first time to Russia and i saw and learned a lot. One unexpected event was going to a russian Sauna with Amir Salihefendic, Russel Keith-Magee and Anton, a main conference organizer. Between going into the Sauna we had glasses of nice irish Whiskey or walked outside to the snowy freezing cold. Afterwards some of us went to the conference party and had good (despite being somewhat drunken) discussions with people from Yandex, the biggest russian search engine and several russian devs. All very friendly, competent and funny. The party lasted until 5:30am – with my fellow english-speaking talkers Armin Ronacher, David Cramer (a weekend in Russia) and me being among the very last.
The next days evening saw Amir, David, Armin and two russian guys visiting an Irish pub past midnight. It turned out there is no such thing as a “russian pub”, the concept of “pub” was imported in the last decade mostly in the form of english or irish ones. And it seems IT/Python guys can meet everywhere on the planet and have a good time :)
Going back to content, i felt particularly inspired by Jeff Lindsay’s talk on Autosustainable services. He described how he tries to provide several small web services, and how to organize cost sharing by its users. As services need resources, it’s a different issue than Open Cource collaboration which does not require such to exist.
I heart several good sentences from my fellow talkers, for example one from Russel Keith-Magee describing a dillema of open source communities: “There are many people who can say ‘No’ but few who can say ‘Yes’ to something”. Amir Salihefendic desribed how the “Redis” database solved many problems for him, and some interesting concrete usages of “bitmaps” in his current endeavours like bitmapist.cohort. And of course Armin Ronacher and David Cramer also gave good talks related to their experience, Advanced Flask patterns and scalable web services respectively. With Armin i also had a good private discussion about the issue of code-signing and verification. We drafted what we think could work for Python packaging (more separately). With David, i discussed workflow commands for python packaging as he offered some good thoughts on the matter.
Around the whole conference we were warmly cared for by Yulia’s company it-people.ru who overtook the physical organisation, and by Anton and his friends who organized the program. Maria Kalinina in particular had cared for the keynote speakers and many other aspects of the conference, and without her, i wouldn’t have made it. Anton drove us to the Asian European geographic border, and Yulia to the skyscraper of Ekaterinburg, overlooking the third largest city in Russia. Russel and me also took the opportunity to walk around Ekaterinburg, looking at Lenin sculptures, buildings made of ice, frozen lakes, and the many shops and noises in the city.
Lastly i went to the university with Russel to talk for two hours to students about “How Open Source can help your career” and we had a lively discussion with them and the lecturer who invited us. I offered my own background and stated that the very best people in the IT world are today collaborating through open-source. It’s a totally dominant model for excellence. (Which doesn’t mean there are not some good proprietary projects, they are just fewer i’d say).
So i can join the many russian participants who thought Pycon Russia was a very good conference. It’s of course mostly interesting for people speaking russian, as only seven talks were in english. For my part, the intense time i had with both the russian hosts and developers and the english talkers was verymuch worth it – i think there might be a few new collaborations coming from it. More on that in later blog posts hopefully :)
Two days ago i left Ekaterinburg and felt a bit sad because of the many contacts i made, which almost felt like the beginning of friendships.
PSF’s code of conduct enforcement is a good step, but what about the many traditional family models in the IT world? I know many fathers which are busy fulltime with non-child stuff, and their partners have the main child responsibility. I heart three main reasonings for this situation and i don’t fully buy them:
- an economic one: the guy working brings more money into the household. This kind of perpetuates the inequality situation, doesn’t it? And is having less money really an issue? Is part-time working impossible? In germany you have a legal right to do part-time work, to begin with.
- a biologistic one: women can “naturally” or genetically care better than men for children. One, I’ve seen fathers doing just fine. Two, are we entirely determined by genetics? I see genetics as some kind of hardware, and software can do lots of different things on it. Culture is shaped as much as software. There is no such thing as “objective” nature.
- go away, it’s a family’s private business and choices. Nevertheless such choices are also culturally determined. Often there is no explicit discussion or choice but rather a fallback to the default, often induced by the facts of birth and breast feeding. How many fathers discuss the issue of child-care openly and regularly, offering changes to give a real choice?
Rest assured, I really like the projects i am hacking on as much as the other guy. Sometimes i feel that caring often for my child makes this harder. On the plus side, it gives me better focus because my time is more limited. And more often than not, i am grateful and have a lot of fun being with my little one.
Now, if more fathers in the Python communities were busier with their children, what would that change in terms of conference attendance of women? Not sure there would be any direct effect except maybe lower conference attendance of men, rising the percentage of women. It would set a good example, however, and help mid- to long-term, i am sure.
Sometimes i like to ask myself this question: when i am dying and wonder what should i have done rather differently? I doubt i am going to say “i should have released one more library, earned more money, become more popular”.