metaprogramming and politics

Decentralize. Take the red pill.

Archive for the ‘metaprogramming’ Category

Running tests against multiple devices/resources (in parallel)


How best to distribute tests across multiple devices or resources with pytest? This interesting question came up during my training in Lviv (Ukraine) at an embedded systems company. Distributing tests to processes can serve two purposes:

  • running the full test suite against each device to verify they all work according to the test specification
  • distributing the test load to several devices of the same type in order to minimize overall test execution time.

The solution to both problems is easy if you use two pytest facilities:

  • the general fixture mechanism: we write a fixture function which provides a device object which is pre-configured for use in tests.
  • the pytest-xdist plugin: we use it to run subprocesses and communicate configuration data for the device fixture from the master process to the subprocesses.

To begin with, let’s configure three devices that are each reachable by a separate IP address. We create a list of IP addresses in a file:

# content of devices.json
["192.168.0.1", "192.168.0.2", "192.168.0.3"]

We now create a local pytest plugin which reads the configuration data, implements a per-process device fixture and the master-to-slave communication to configure each subprocess according to our device list:

# content of conftest.py

import pytest

def read_device_list():
    import json
    with open("devices.json") as f:
        return json.load(f)

def pytest_configure(config):
    # read device list if we are on the master
    if not hasattr(config, "slaveinput"):
        config.iplist = read_device_list()

def pytest_configure_node(node):
    # called on the master for each node: fill its slaveinput dictionary,
    # which pytest-xdist will transfer to the subprocess
    node.slaveinput["ipadr"] = node.config.iplist.pop()

@pytest.fixture(scope="session")
def device(request):
    slaveinput = getattr(request.config, "slaveinput", None)
    if slaveinput is None: # single-process execution
        ipadr = read_device_list()[0]
    else: # running in a subprocess here
        ipadr = slaveinput["ipadr"]
    return Device(ipadr)

class Device:
    def __init__(self, ipadr):
        self.ipadr = ipadr

    def __repr__(self):
        return "<Device ip=%s>" % (self.ipadr)

We can now write tests that simply make use of the device fixture by using its name as an argument to a test function:

# content of test_device.py
import time

def test_device1(device):
    time.sleep(2)  # simulate long test time
    assert 0, device

def test_device2(device):
    time.sleep(2)  # simulate long test time
    assert 0, device

def test_device3(device):
    time.sleep(2)  # simulate long test time
    assert 0, device

Let’s first run the tests in a single process, using only a single device (also using some reporting options to shorten the output):

$ py.test test_device.py -q --tb=line
FFF
================================= FAILURES =================================
/tmp/doc-exec-9/test_device.py:5: AssertionError: <Device ip=192.168.0.1>
/tmp/doc-exec-9/test_device.py:9: AssertionError: <Device ip=192.168.0.1>
/tmp/doc-exec-9/test_device.py:13: AssertionError: <Device ip=192.168.0.1>
3 failed in 6.02 seconds

As expected, we get six seconds execution time (3 tests times 2 seconds each).

Now let’s run the same tests in three subprocesses, each using a different device:

$ py.test --tx 3*popen --dist=each test_device.py -q --tb=line
gw0 I / gw1 I / gw2 I
gw0 [3] / gw1 [3] / gw2 [3]

scheduling tests via EachScheduling
FFFFFFFFF
================================= FAILURES =================================
E   AssertionError: <Device ip=192.168.0.1>
E   AssertionError: <Device ip=192.168.0.3>
E   AssertionError: <Device ip=192.168.0.2>
E   AssertionError: <Device ip=192.168.0.1>
E   AssertionError: <Device ip=192.168.0.3>
E   AssertionError: <Device ip=192.168.0.2>
E   AssertionError: <Device ip=192.168.0.3>
E   AssertionError: <Device ip=192.168.0.1>
E   AssertionError: <Device ip=192.168.0.2>
9 failed in 6.52 seconds

We just created three subprocesses, each running three tests. Instead of 18 seconds execution time (9 tests times 2 seconds per test) we got roughly 6 seconds, a three-fold speedup: the three subprocesses ran in parallel, each executing its tests against “its” device.

Let’s also run with load-balancing, i.e. distributing the tests against three different devices so that each device executes one test:

$ py.test --tx 3*popen --dist=load test_device.py -q --tb=line
gw0 I / gw1 I / gw2 I
gw0 [3] / gw1 [3] / gw2 [3]

scheduling tests via LoadScheduling
FFF
================================= FAILURES =================================
E   AssertionError: <Device ip=192.168.0.3>
E   AssertionError: <Device ip=192.168.0.2>
E   AssertionError: <Device ip=192.168.0.1>
3 failed in 2.50 seconds

Here each test runs in a separate process against its device, overall more than halving the test time compared to what it would take in a single process (3*2=6 seconds). If we had many more tests than subprocesses, load-scheduling would distribute tests in real time to whichever process has finished executing its previous tests.

Note that the tests themselves do not need to be aware of the distribution mode. All configuration and setup is contained in the conftest.py file.

To summarize the behaviour of the hooks and fixtures in conftest.py:

  • pytest_configure(config) is called both on the master and in each subprocess. We can distinguish where we are by checking for the presence of config.slaveinput.
  • pytest_configure_node(node) is called on the master once for each subprocess node. We fill the slaveinput dictionary, which the subprocess slave can then read via its own config.slaveinput.
  • the device fixture is only called when a test needs it. In distributed mode, tests are only collected and executed in a subprocess. In non-distributed mode, tests run single-process. The Device class is just a stub; it will need to grow methods for actual device communication (see the sketch below). The tests can then simply use those device methods.
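
For illustration, here is a hypothetical sketch of how the stub might grow such methods; the port number and the newline-terminated command protocol are invented for this example:

# hypothetical sketch of a grown-up Device class
import socket

class Device:
    def __init__(self, ipadr, port=2345):  # port is a made-up example value
        self.ipadr = ipadr
        self.port = port

    def send_command(self, cmd):
        # open a TCP connection per command and return the raw reply
        s = socket.create_connection((self.ipadr, self.port))
        try:
            s.sendall(cmd + "\n")
            return s.recv(4096)
        finally:
            s.close()

A test could then simply call device.send_command("status") and assert on the reply.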

I’d like to thank Anton and the participants of my three day testing training in Lviv (Ukraine) for bringing up this and many other interesting questions.


I am giving another such professional testing course 25-27 November at the Python Academy in Leipzig. There are still two seats available. Other trainers and I can also be booked for on-site/in-house trainings worldwide.

Written by holger krekel

November 12, 2013 at 7:43 am

Defeating Sauron with the “Trust on first use” principle


(photo by Alexandre Duret-Lutz) Gandalf and Frodo did the right thing when they set out to destroy the power of the all-seeing eye. The idea of a central power that knows everything undermines our ability to self-govern and to influence important changes in society; it undermines a foundation of democracy.

As against Sauron, it seems an impossible fight to protect our communication against present-day espionage cartels. I see glimmers of hope, though. Certainly not much in the political space: somehow our politicians are themselves too interested in using the eye on select targets, even if only the ones which Sauron allows them to see.

My bigger hope lies with technologists who are working on designing better communication systems. We still have time during which we can reduce Sauron’s sight. But to begin with, how do we prevent passive spying attacks against our communications?

A good part of the answer lies in the Trust on first use principle. The mobile Threema application is a good example: when two people first connect with each other, they exchange communication keys and afterwards use them to perform end-to-end encrypted communication. The key exchange can happen in full sight of the eye, yet the subsequent communication will be illegible. No question, the eye can notice that the two are communicating with unknown content, but if enough of us do that, this fact becomes less significant.
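
To make the principle concrete, here is a minimal sketch of trust-on-first-use key pinning; the store location and function names are invented for this example:

# hypothetical sketch of trust-on-first-use key pinning
import hashlib, json, os

STORE = os.path.expanduser("~/.tofu_keys.json")

def check_peer(peer_id, public_key_bytes):
    # remember the first key seen for a peer, raise the alarm if it changes
    fingerprint = hashlib.sha256(public_key_bytes).hexdigest()
    store = json.load(open(STORE)) if os.path.exists(STORE) else {}
    known = store.get(peer_id)
    if known is None:  # first contact: trust and remember
        store[peer_id] = fingerprint
        json.dump(store, open(STORE, "w"))
        return "trusted on first use"
    elif known == fingerprint:
        return "ok"
    else:
        return "ALERT: key changed, possible Nazgul-in-the-middle"

Meeting in person to compare fingerprints corresponds to checking the last branch out-of-band.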

Of course, the all-seeing eye can send a Nazgul to stand in the middle of the communication, deceive both ends and listen in. But it needs to do so from the beginning and continuously if it wants to keep the victims from noticing. And the two can at any time meet to verify their encryption keys, and would then realize there was a Nazgul-in-the-middle attack.

By contrast, both SSL and GPG operate with trust models where we can hear Sauron’s distant laughter. The one is tied to a thousand or so “root authorities”, which can easily be reined in as need be. The other mandates and propagates such a high level of initial mistrust between us that we find it simply too inconvenient to use.

Societies and our social interactions are fundamentally built on trust. Let’s design systems which build on initial trust and which help to identify after the fact when it was compromised. If the eye has bad dreams, then i am sure massively deployed trust-on-first-use communication systems are among them.

Written by holger krekel

October 26, 2013 at 7:04 am

pytest-2.4.0: new fixture features, many bug fixes


The just-released pytest-2.4.0 brings many improvements and numerous bug fixes while remaining plugin- and test-suite compatible (apart from a few supposedly very minor incompatibilities). See below for a full list of details. New feature highlights:
  • new yield-style fixtures pytest.yield_fixture, allowing the use of existing with-style context managers in fixture functions (see the sketch after this list).
  • improved pdb support: import pdb ; pdb.set_trace() now works without requiring prior disabling of stdout/stderr capturing. Also the --pdb option now works on collection and internal errors, and we introduced a new experimental hook for IDEs/plugins to intercept debugging: pytest_exception_interact(node, call, report).
  • a shorter monkeypatch variant allowing an import path as the target, for example: monkeypatch.setattr("requests.get", myfunc)
  • better unittest/nose compatibility: all teardown methods are now only called if the corresponding setup method succeeded.
  • integrated tab-completion on command line options if you have argcomplete configured.
  • allow a boolean expression directly with skipif/xfail if a “reason” is also specified.
  • a new hook pytest_load_initial_conftests allows plugins like pytest-django to influence the environment before conftest files import django.
  • reporting: color the last line red or green depending on whether failures/errors occurred or everything passed.
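
As a minimal sketch of the new yield-style fixtures (the file name is invented for this example; tmpdir is the builtin pytest fixture):

# hypothetical sketch of a yield-style fixture
import pytest

@pytest.yield_fixture
def logfile(tmpdir):
    # the with-block opens the file; everything after the yield is teardown
    with tmpdir.join("log.txt").open("w") as f:
        yield f

def test_write(logfile):
    logfile.write("hello")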

The documentation has been updated to accommodate the changes, see http://pytest.org

To install or upgrade pytest:

pip install -U pytest # or
easy_install -U pytest

Many thanks to all who helped, including Floris Bruynooghe, Brianna Laugher, Andreas Pelme, Anthon van der Neut, Anatoly Bubenkoff, Vladimir Keleshev, Mathieu Agopian, Ronny Pfannschmidt, Christian Theunert and many others.

may nice fixtures and passing tests be with you,

holger krekel

Changes between 2.3.5 and 2.4

known incompatibilities:

  • if calling --genscript from python2.7 or above, you only get a standalone script which works on python2.7 or above. Use Python2.6 to also get a python2.5 compatible version.
  • all xunit-style teardown methods (nose-style, pytest-style, unittest-style) will not be called if the corresponding setup method failed, see issue322 below.
  • the pytest_plugin_unregister hook wasn’t ever properly called and there is no known implementation of the hook – so it got removed.
  • pytest.fixture-decorated functions cannot be generators (i.e. use yield) anymore. This change might be reversed in 2.4.1 if it causes unforeseen real-life issues. However, you can always write and return an inner function/generator and change the fixture consumer to iterate over the returned generator. This change was made in favour of the new pytest.yield_fixture decorator, see below.

new features:

  • experimentally introduce a new pytest.yield_fixture decorator which accepts exactly the same parameters as pytest.fixture but mandates a yield statement instead of a return statement from fixture functions. This allows direct integration with “with-style” context managers in fixture functions and generally avoids registering of finalization callbacks in favour of treating the “after-yield” as teardown code. Thanks Andreas Pelme, Vladimir Keleshev, Floris Bruynooghe, Ronny Pfannschmidt and many others for discussions.

  • allow a boolean expression directly with skipif/xfail if a “reason” is also specified (see the example at the end of this list). Rework skipping documentation to recommend “condition as booleans” because it prevents surprises when importing markers between modules. Specifying conditions as strings remains fully supported.

  • reporting: color the last line red or green depending on whether failures/errors occurred or everything passed. Thanks Christian Theunert.

  • make “import pdb ; pdb.set_trace()” work natively wrt capturing (no “-s” needed anymore), making pytest.set_trace() a mere shortcut.

  • fix issue181: --pdb now also works on collect errors (and on internal errors). This was implemented by a slight internal refactoring and the introduction of a new experimental hook, pytest_exception_interact (see the next item).

  • fix issue341: introduce new experimental hook for IDEs/terminals to intercept debugging: pytest_exception_interact(node, call, report).

  • new monkeypatch.setattr() variant to provide a shorter invocation for patching out classes/functions from modules:

    monkeypatch.setattr("requests.get", myfunc)

    will replace the “get” function of the “requests” module with myfunc.

  • fix issue322: tearDownClass is not run if setUpClass failed. Thanks Mathieu Agopian for the initial fix. Also make all pytest/nose finalizers mimic the same generic behaviour: if a setupX exists and fails, don’t run teardownX. This internally introduces a new “node.addfinalizer()” helper method which can only be called during the setup phase of a node.

  • simplify pytest.mark.parametrize() signature: allow passing a comma-separated string to specify argnames. For example: pytest.mark.parametrize("input,expected", [(1,2), (2,3)]) works as well as the previous: pytest.mark.parametrize(("input", "expected"), ...).

  • add support for setUpModule/tearDownModule detection, thanks Brian Okken.

  • integrate tab-completion on options through use of “argcomplete”. Thanks Anthon van der Neut for the PR.

  • change option names to be hyphen-separated long options but keep the old spelling backward compatible. py.test -h will only show the hyphenated version, for example “--collect-only”, but “--collectonly” will remain valid as well (for backward-compat reasons). Many thanks to Anthon van der Neut for the implementation and to Hynek Schlawack for pushing us.

  • fix issue 308 – allow to mark/xfail/skip individual parameter sets when parametrizing. Thanks Brianna Laugher.

  • call new experimental pytest_load_initial_conftests hook to allow 3rd party plugins to do something before a conftest is loaded.
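
Here is a hedged sketch of the boolean-condition form for skipif mentioned above:

# sketch: condition as a boolean plus an explicit reason
import sys
import pytest

@pytest.mark.skipif(sys.version_info < (2, 6), reason="requires Python 2.6 or higher")
def test_modern_feature():
    pass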

Bug fixes:

  • fix issue358 – capturing options are now parsed more properly by using a new parser.parse_known_args method.
  • pytest now uses argparse instead of optparse (thanks Anthon) which means that “argparse” is added as a dependency if installing into python2.6 environments or below.
  • fix issue333: fix a case of bad unittest/pytest hook interaction.
  • PR27: correctly handle nose.SkipTest during collection. Thanks Antonio Cuni, Ronny Pfannschmidt.
  • fix issue355: junitxml now puts a name="pytest" attribute on the testsuite tag.
  • fix issue336: autouse fixture in plugins should work again.
  • fix issue279: improve object comparisons on assertion failure for standard datatypes and recognise collections.abc. Thanks to Brianna Laugher and Mathieu Agopian.
  • fix issue317: assertion rewriter support for the is_package method
  • fix issue335: document py.code.ExceptionInfo() object returned from pytest.raises(), thanks Mathieu Agopian.
  • remove implicit distribute_setup support from setup.py.
  • fix issue305: ignore any problems when writing pyc files.
  • SO-17664702: call fixture finalizers even if the fixture function partially failed (finalizers would not always be called before)
  • fix issue320 – fix class scope for fixtures when mixed with module-level functions. Thanks Anatoly Bubenkoff.
  • you can specify “-q” or “-qq” to get different levels of “quieter” reporting (thanks Katarzyna Jachim)
  • fix issue300 – Fix order of conftest loading when starting py.test in a subdirectory.
  • fix issue323 – sorting of many module-scoped arg parametrizations
  • make sessionfinish hooks execute with the same cwd-context as at session start (helps fix behaviour of plugins which write output files with relative paths, such as pytest-cov)
  • fix issue316 – properly reference collection hooks in docs
  • fix issue 306 – cleanup of -k/-m options to only match markers/test names/keywords respectively. Thanks Wouter van Ackooy.
  • improved doctest counting for doctests in python modules: files without any doctest items will not show up anymore, and doctest examples are counted as separate test items. Thanks Danilo Bellini.
  • fix issue245 by depending on the released py-1.4.14 which fixes py.io.dupfile to work with files with no mode. Thanks Jason R. Coombs.
  • fix junitxml generation when test output contains control characters, addressing issue267, thanks Jaap Broekhuizen
  • fix issue338: honor --tb style for setup/teardown errors as well. Thanks Maho.
  • fix issue307 – use yaml.safe_load in example, thanks Mark Eichin.
  • better parametrize error messages, thanks Brianna Laugher
  • pytest_terminal_summary(terminalreporter) hooks can now use “.section(title)” and “.line(msg)” methods to print extra information at the end of a test run.

Written by holger krekel

October 1, 2013 at 9:40 am



PEP438 is live: speed up python package installs now!


My “speed up pypi installs” PEP438 has been accepted and transition phase 1 is live: as a package maintainer you can now speed up the installation of your packages for all your users with the click of a button. Log in to https://pypi.python.org, go to the urls page for each of your packages, and specify that all release files are hosted on pypi.python.org. Or add explicit download urls with an MD5. Tools such as pip or easy_install will then avoid any slow crawling of third-party sites.

Many thanks to Carl Meyer, who helped me write the PEP, to Donald Stufft, who implemented most of it, and to Richard Jones, who accepted it today! Thanks also to the distutils-sig discussion participants, in particular Phillip Eby and Marc-Andre Lemburg.


Written by holger krekel

May 19, 2013 at 7:49 am



If i were to tweet a misogynist joke …


If a man were to tweet a misogynist joke and his followers were men, would that be an issue? What if one of them re-tweets it and one of his female followers complains on twitter? And then many other people start tweeting and re-tweeting this or that, and what if this all got the initial tweeter fired from his company? And what if her company then fired her as well?

Quite a mess, obviously. However, I think everyone had their reasons for talking and acting the way they did. And it boils down to which perspective you are able to feel empathy for. Here is a possible set of perspectives:

Perspective M: “The other day i was ridiculed by a bunch of girls at the office. I wanted to pay back with a little joke in an environment where i felt safe to do so.”

Perspective F: “Again a tweet with bad misogynist jokes. I’ve had enough. This time i won’t sit quietly but call it out.”

Perspective C1: “Damn, look what this guy caused. His twitter profile is directly associated with our company, and now he tells bad misogynist jokes and it’s all over the internet. We cannot let it go this time.”

Perspective C2: “Damn it, look what she caused. She works in public relations and doesn’t know better than to cause a shitstorm which comes directly back to us as a company? We cannot let it go.”

I can understand each of these perspectives, though i have a suspicion that the companies chose a bit of an easy way out. Had they put the issue of misogyny at the center of their positioning and communication, rather than focusing just on keeping damage from the company, everybody would have learned a lesson and the incident could have contributed to a more enjoyable environment, i am sure.

Written by holger krekel

March 23, 2013 at 12:16 pm

Packaging, testing, pypi and my Pycon Russia adventures


A few days ago I talked at Pycon Russia about packaging and testing and about a new PyPI server implementation and workflow tool i am working on, codenamed devpi. See the slides and the video. The slides are converted from my hovercraft-based presentation which you can find here (needs javascript). devpi tries to solve the “standardization” problem around Python packaging by offering a good index server and a “meta approach” to configuring and invoking setup.py/easy_install/pip, incorporating existing practices and facilitating new ones. The slides and the talk hopefully clarify a bit of the reasoning behind it.
Besides the good feedback and discussions around my talk, i had a great few days. It was my first time in Russia and i saw and learned a lot. One unexpected event was going to a russian sauna with Amir Salihefendic, Russel Keith-Magee and Anton, a main conference organizer. Between rounds in the sauna we had glasses of nice irish whiskey or walked outside into the snowy, freezing cold. Afterwards some of us went to the conference party and had good (despite being somewhat drunken) discussions with people from Yandex, the biggest russian search engine, and several russian devs. All very friendly, competent and funny. The party lasted until 5:30am, with my fellow english-speaking talkers Armin Ronacher, David Cramer (a weekend in Russia) and me being among the very last.


David, Amir, Russel, and our russian hosts

The next day’s evening saw Amir, David, Armin and two russian guys visiting an Irish pub past midnight. It turned out there is no such thing as a “russian pub”; the concept of a “pub” was imported in the last decade, mostly in the form of english or irish ones. And it seems IT/Python guys can meet anywhere on the planet and have a good time :)


Ice, Ekaterinburg at night, and an anonymous shop

Going back to content, i felt particularly inspired by Jeff Lindsay’s talk on autosustainable services. He described how he tries to provide several small web services and how to organize cost sharing among their users. As services need resources, this is a different issue from Open Source collaboration, which does not require such resources to exist.

I heard several good sentences from my fellow talkers, for example one from Russel Keith-Magee describing a dilemma of open source communities: “There are many people who can say ‘No’ but few who can say ‘Yes’ to something”. Amir Salihefendic described how the “Redis” database solved many problems for him, and some interesting concrete usages of “bitmaps” in his current endeavours like bitmapist.cohort. And of course Armin Ronacher and David Cramer also gave good talks related to their experience, on advanced Flask patterns and scalable web services respectively. With Armin i also had a good private discussion about the issue of code-signing and verification; we drafted what we think could work for Python packaging (more separately). With David, i discussed workflow commands for python packaging, and he offered some good thoughts on the matter.

Around the whole conference we were warmly cared for by Yulia’s company it-people.ru, which took over the physical organisation, and by Anton and his friends, who organized the program. Maria Kalinina in particular cared for the keynote speakers and many other aspects of the conference; without her, i wouldn’t have made it. Anton drove us to the Asian/European geographic border, and Yulia to the skyscraper of Ekaterinburg, overlooking the third largest city in Russia. Russel and i also took the opportunity to walk around Ekaterinburg, looking at Lenin sculptures, buildings made of ice, frozen lakes, and the many shops and noises of the city.


Iced lake, Lenin forever, the Asia/Europe border

Lastly i went to the university with Russel to talk for two hours to students about “How Open Source can help your career”, and we had a lively discussion with them and the lecturer who invited us. I offered my own background and stated that the very best people in the IT world today collaborate through open source. It’s a totally dominant model for excellence. (Which doesn’t mean there are no good proprietary projects; they are just fewer, i’d say.)

So i can join the many russian participants who thought Pycon Russia was a very good conference. It is of course mostly interesting for people speaking russian, as only seven talks were in english. For my part, the intense time i had with both the russian hosts and developers and the english-speaking talkers was very much worth it; i think there might be a few new collaborations coming from it. More on that in later blog posts, hopefully :)

Two days ago i left Ekaterinburg and felt a bit sad because of the many contacts i had made, which almost felt like the beginnings of friendships.

Written by holger krekel

March 1, 2013 at 12:38 pm

Traditional family models in the IT and Python world


(image: “Traditional family not in bible”, via gazette.com)

PSF’s code of conduct enforcement is a good step, but what about the many traditional family models in the IT world? I know many fathers who are busy fulltime with non-child stuff, while their partners carry the main child responsibility. I hear three main reasonings for this situation, and i don’t fully buy them:

  • an economic one: the guy working brings more money into the household. This kind of perpetuates the inequality, doesn’t it? And is having less money really an issue? Is part-time work impossible? In germany, to begin with, you have a legal right to work part-time.
  • a biologistic one: women can “naturally” or genetically care better for children than men. One, I’ve seen fathers doing just fine. Two, are we entirely determined by genetics? I see genetics as a kind of hardware, and software can do lots of different things on it. Culture is shaped much like software. There is no such thing as “objective” nature.
  • go away, it’s a family’s private business and choice. Nevertheless, such choices are also culturally determined. Often there is no explicit discussion or choice but rather a fallback to the default, often induced by the facts of birth and breastfeeding. How many fathers discuss the issue of child-care openly and regularly, offering changes to allow a real choice?

Rest assured, I really like the projects i am hacking on as much as the next guy. Sometimes i feel that caring often for my child makes this harder. On the plus side, it gives me better focus because my time is more limited. And more often than not, i am grateful and have a lot of fun being with my little one.

Now, if more fathers in the Python communities were busier with their children, what would that change in terms of conference attendance by women? I am not sure there would be any direct effect, except maybe lower conference attendance by men, raising the percentage of women. It would set a good example, however, and help mid- to long-term, i am sure.

Sometimes i like to ask myself: when i am dying, what will i wish i had done differently? I doubt i am going to say “i should have released one more library, earned more money, become more popular”.

Written by holger krekel

December 14, 2012 at 10:38 am

metaprogramming in Python: What CPython, PyPy, Pyramid, pytest and politics have in common …


Metaprogramming in Python too often revolves around metaclasses, which are just a narrow application of the “meta” idea, and not a great one at that. Metaprogramming more generally deals with reasoning about program code, with taking a “meta” stance on it. A metaprogram takes a program as input, often just partial programs like functions or classes. Here are a few applications of metaprogramming:

  • CPython is a metaprogram written in C. It takes Python program code as input and interprets it, so that it runs at a higher level than C.
  • PyPy is a metaprogram written in Python. It takes RPython program code as input and generates a C-level metaprogram (the PyPy interpreter) which itself interprets Python programs and takes another meta stance by generating assembler pieces for parts of the interpretation. If you like, PyPy is a metaprogram generating metaprograms, whereas CPython and typical compilers like GCC are “just” metaprograms.
  • Pyramid is a metaprogram that takes view, model definitions and http-handling code as input and executes them, thereby raising code on a higher level to implement the “Pyramid application” language.
  • pytest is a metaprogram written in Python, taking test, fixture and plugin functions as input and executing them in a certain manner, thereby implementing a testing language.
  • metaclasses: in Python they allow intercepting class creation and introspecting methods and attributes, amending their behaviour (see the sketch after this list). Because metaclass code usually executes at import time, it often uses global state for implementing non-trivial meta aspects.
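
As a minimal sketch of such interception (the class registry is an invented example, and it also illustrates the import-time global state caveat):

# hypothetical sketch: a metaclass that registers every class it creates
registry = []

class RegisteringMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = type.__new__(mcls, name, bases, namespace)
        registry.append(cls)  # global state, mutated at import time
        return cls

class Plugin(object):
    __metaclass__ = RegisteringMeta  # Python2 syntax for setting a metaclass

assert Plugin in registry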

Apart from these concrete examples, language compilers, testing tools and web frameworks all have metaprogramming aspects. Creating big or small “higher-level” or domain-specific languages within Python is a typical example of metaprogramming. Python is actually a great language for metaprogramming, although it could be better.

In future blog posts i plan to talk about good metaprogramming practices, particularly:

  • keep the layers/levels separate by good naming and API design
  • define a concise “language” for the programs you take as input
  • avoid creating global state in your metaprograms (and elsewhere), which can easily happen with metaclasses executing at import time

Lastly, i see metaprogramming at work not only when coding in a computer language. Discussing the legal framing for executing programs on the internet is a kind of metaprogramming, especially if you consider licensing and laws as human-interpreted code which affects how programs can be written, constructed and executed. In reverse, web applications increasingly affect how we interact with each other, thereby implementing rules formerly dealt with in the arena of politics. Therefore, metaprogramming and politics are fundamentally connected topics.

have metafun, i.e. take fun stuff as input to generate more of it :) holger

Written by holger krekel

November 22, 2012 at 3:04 pm

execution locals: better than thread locals/globals


While many agree that global state is evil, the so-called “thread locals” are not much better. Even though they help to separate state on a per-thread or per-greenlet basis, they are still global within that context. In particular, (thread-)global state means that:

  • Invoked functions can change bindings of an invoking function as a side effect
  • thread locals may linger around even if their state is unused or has become invalid

Meet “execution locals” which avoid these problems. Find the code released on PyPI:

http://pypi.python.org/pypi/xlocal

It’s some 60 lines of code, tested on python2.5 through python3.3 and pypy, and ready to be played with. I inline its README.txt below in case you can’t or don’t want to switch reading context. One more note: if I were to design a new language, i’d probably remove “globals” altogether and only offer something like the “xlocal” type with a more straightforward syntax.

execution locals: killing global state (including thread locals)

The xlocal module provides execution locals, aka “xlocal” objects, which implement a more restricted variant of “thread locals”. An “xlocal” instance allows managing its attributes on a per-execution basis, in a manner similar to how real locals work:

  • Invoked functions cannot change the binding for the invoking function
  • existence of a binding is local to a code block (and everything it calls)

Attribute bindings for an xlocal object will not leak outside a context-managed code block and they will not leak to other threads or greenlets. By contrast, both process-globals and “thread locals” do not implement these properties.

Let’s look at a basic example:

# content of example.py

from xlocal import xlocal

xcurrent = xlocal()

def output():
    print "hello world", xcurrent.x

if __name__ == "__main__":
    with xcurrent(x=1):
        output()

If we execute this module, the output() function will see a xcurrent.x==1 binding:

$ python example.py
hello world 1

Here is what happens in detail: xcurrent(x=1) returns a context manager which sets/resets the x attribute on the xcurrent object. While remaining in the same thread/greenlet, all code triggered by the with-body (in this case just the output() function) can access xcurrent.x. Outside the with-body, xcurrent.x would raise an AttributeError. It is also not allowed to set xcurrent attributes directly; you always have to explicitly mark their life-cycle with a with-statement. This means that invoked code:

  • cannot rebind xlocal state of its invoking functions (no side effects, yay!)
  • xlocal state does not leak outside the with-context (life-cycle control)

Another module may now reuse the example code:

# content of example_call.py
import example

with example.xcurrent(x=3):
    example.output()

which when running …:

$ python example_call.py
hello world 3

will cause the example.output() function to print the xcurrent.x binding as defined at the invoking with xcurrent(x=3) statement.

Other threads or greenlets will never see this xcurrent.x binding; they may even set and read their own distinct xcurrent.x binding. This means that all threads/greenlets can concurrently call into a function which will always see the execution-specific x attribute.
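
To illustrate the isolation, here is a minimal sketch using plain threads (the worker/results names are invented for this example):

# sketch: each thread sees only its own xcurrent.x binding
import threading
from xlocal import xlocal

xcurrent = xlocal()
results = {}

def worker(n):
    with xcurrent(x=n):
        results[n] = xcurrent.x  # always the thread's own binding

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert results == {0: 0, 1: 1, 2: 2}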

Usage in frameworks and libraries invoking “handlers”

When invoking plugin code or handler code to perform work, you may not want to pass around all state that might ever be needed. Instead of using a global or thread local you can safely pass around such state in execution locals. Here is a pseudo example:

xcurrent = xlocal()

def with_xlocal(func, **kwargs):
    with xcurrent(**kwargs):
        func()

def handle_request(request):
    func = gethandler(request)  # some user code
    # pass a callable; calling with_xlocal directly here would run it eagerly
    spawn(lambda: with_xlocal(func, request=request))

handle_request will run a user-provided handler function in a newly spawned execution unit (for example, spawn might map to threading.Thread(…).start() or to gevent.spawn(…)). The generic with_xlocal helper wraps the execution of the handler function so that it sees an xcurrent.request binding. Multiple spawns may execute concurrently, and xcurrent.request will carry the execution-specific request object in each of them.

Issues worth noting

If a method stores an attribute of an execution local, for example the above xcurrent.request, then it keeps a reference to that exact request object, not to the per-execution one. If you want to keep a per-execution reference, you can do it this way, for example:

class Renderer:
    @property
    def request(self):
        return xcurrent.request

This means that Renderer instances will have an execution-local self.request object even if the life-cycle of the instance crosses execution units.

Another issue is that newly spawned execution units do not implicitly inherit execution locals. Instead you have to wrap your spawning function to explicitly set execution locals, similar to what we did in the “invoking handlers” section above (see the sketch below).
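
A minimal sketch of such a wrapper, reusing the with_xlocal helper from above and assuming plain threads as the spawning mechanism:

# hypothetical sketch: re-establish a binding in a spawned thread
import threading

def spawn_with_request(func):
    request = xcurrent.request  # read the binding in the parent execution unit
    # ... and re-establish it around func in the child thread
    t = threading.Thread(target=with_xlocal, args=(func,),
                         kwargs={"request": request})
    t.start()
    return t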

Written by holger krekel

November 16, 2012 at 2:22 pm

If i were to design a new programming language …


I’d base syntax and semantics on Python3, but strip and rebase it:

  • no C: implement the interpreter in RPython, get a JIT for free and implementation bits from PyPy’s Python interpreter (parsing, IO, etc.)
  • no drags-you-down batteries: lean interpreter core and a standard battery distro which is tested against the last N interpreter versions + current
  • no yield: use greenlets to implement all of what yield provides and more
  • no underlying blocking on IO: base it all on event loop, yet provide synchronous programming model through greenlets
  • no c-level API nor ctypes: use cffi to interface with c-libraries
  • no global state: just support state bound to execution context/stack
  • no GIL: support free threading and Automatic Mutual Exclusion for dealing with shared state
  • no setup.py: have a thought-through story and tools from the start for packaging, installation, depending/interfacing between packages
  • no import, no sys.modules: provide an object with which you can access other packages’s objects and introspect/interact with one’s own package
  • no testing as an afterthought: everything needs to be easily testable, empowered assert statement and branch-coverage supported from the core.
  • no extensibility as an afterthought: support plugins and loose coupling through builtin 1:N calling mechanism (event notification on steroids)
  • no unsafe code: support IO/CPU/RAM sandboxing as a core feature
  • no NIH syndrome: provide a bridge to a virtualenv’ed Python interpreter allowing to leverage existing good crap

Anything else? Probably! Discussion needed? Certainly. Unrealistic? Depends on who would participate — almost all of the above has projects, PEPs and code showcasing viability.

Btw, did you know that when we started PyPy we initially did it under the heading of “Minimal Python”? Some of the ideas above and their underlying motivations were already mentioned when I invited people to the first PyPy sprint almost 10 years ago:

http://mail.python.org/pipermail/python-dev/2003-January/032427.html

I have learned since then that Python has more complex innards than it seems, but i still believe it could be both simpler and more powerful.

holger

Written by holger krekel

November 13, 2012 at 3:29 pm
