Monday, May 1, 2017

lambda functions in Python

The most common (and recommended) way to define a function in Python is with the def keyword followed by a function name. For instance,

  
def add(a, b): return a + b 

is a function that adds two numbers and can be called as

  
>>> add(1, 2)
3

We can (re)define the same function as a lambda function

  
add = lambda a, b: a + b

and it can be invoked with exactly the same syntax as before

  
>>> add(1, 2)
3

However, we can also define and invoke a lambda function anonymously (without giving it a name). Here are some examples

  
>>> (lambda a, b: a + b)(1, 2)      # add
3
>>> (lambda a: a * 2)(2)            # times 2
4
>>> (lambda a: sum(a))([1, 2, 3])   # sum of list
6
>>> (lambda *a: min(a))(1, 2, 3)    # min of arguments
1

The syntactic differences between conventional function definitions via def and lambda functions are that lambda functions have no parentheses around their parameters, don't use the return statement, do not have a name, and must consist of a single expression.
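
These differences can be checked directly in the interpreter. The following is a minimal sketch for Python 3; the names add_def and add_lambda are made up for illustration.

```python
def add_def(a, b):
    return a + b

add_lambda = lambda a, b: a + b

# Both objects are ordinary functions and behave the same when called.
print(add_def(1, 2), add_lambda(1, 2))    # 3 3
print(type(add_def) is type(add_lambda))  # True

# A def records its name; a lambda is always reported as '<lambda>',
# even after it has been assigned to a variable.
print(add_def.__name__)      # add_def
print(add_lambda.__name__)   # <lambda>

# The body of a lambda must be a single expression, so a statement
# such as an assignment is rejected when the code is compiled.
try:
    compile("lambda a: b = a", "<example>", "eval")
    statement_allowed = True
except SyntaxError:
    statement_allowed = False
print(statement_allowed)     # False
```

The missing name is one reason def is usually preferred for anything that may show up in a traceback.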

So, what are they good for? Fundamentally, there is no need for lambda functions and we could happily live without them, but they make code shorter and more readable when a simple function is needed only once. The most common use case is as the key function when sorting:

  
>>> names = ['Fred', 'Anna', 'John']
>>> sorted(names)                        # alphabetically
['Anna', 'Fred', 'John']

>>> sorted(names, key=lambda n: n[1])    # second letter
['Anna', 'John', 'Fred']

Key functions come in especially handy when sorting collections of data structures

  
>>> contacts = [('Fred', 18), ('Anna', 22)]

>>> sorted(contacts)                               # name then age
[('Anna', 22), ('Fred', 18)]

>>> sorted(contacts, key=lambda (name, _): name)   # name
[('Anna', 22), ('Fred', 18)]

>>> sorted(contacts, key=lambda (_, age): age)     # age
[('Fred', 18), ('Anna', 22)]

>>> sorted(contacts, key=lambda (n, a): (a, n))    # age then name
[('Fred', 18), ('Anna', 22)]

Note the parentheses in lambda (name, _) and the similar definitions: they unpack the tuple passed to the key function, which gives a very readable way to sort by name or age. (Tuple parameter unpacking was removed in Python 3, see PEP 3113, so these examples run on Python 2 only.) We could have achieved the same by defining a conventional function

  
>>> def by_age((name, age)): return age
>>> sorted(contacts, key=by_age)
[('Fred', 18), ('Anna', 22)]

which is also very readable but longer, and it pollutes the namespace with a by_age function that is needed only once. Note that lambda functions are not needed if we already have a function! For instance

  
>>> names = ['Fred', 'Anna', 'John']

>>> sorted(names, key=lambda n: len(n))    # don't do this
['Fred', 'Anna', 'John']

>>> sorted(names, key=len)                 # simply do this
['Fred', 'Anna', 'John']
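
The tuple-unpacking lambdas used for the contacts examples are Python 2 syntax. In Python 3 the same sorts could be written by indexing into the tuple, or, in the spirit of reusing an existing function, with operator.itemgetter from the standard library. A minimal sketch; the variable names are chosen for illustration only.

```python
from operator import itemgetter

contacts = [('Fred', 18), ('Anna', 22)]

# Index into the tuple instead of unpacking it in the parameter list.
sorted_by_name = sorted(contacts, key=lambda c: c[0])
sorted_by_age = sorted(contacts, key=lambda c: c[1])
sorted_by_age_name = sorted(contacts, key=lambda c: (c[1], c[0]))

# itemgetter builds equivalent key functions without a lambda.
sorted_by_age_ig = sorted(contacts, key=itemgetter(1))
sorted_by_age_name_ig = sorted(contacts, key=itemgetter(1, 0))

print(sorted_by_name)      # [('Anna', 22), ('Fred', 18)]
print(sorted_by_age)       # [('Fred', 18), ('Anna', 22)]
print(sorted_by_age_name)  # [('Fred', 18), ('Anna', 22)]
```

Whether c[1] or itemgetter(1) reads better is a matter of taste; both lose the descriptive parameter names that the Python 2 version had.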

An elegant example of the use of lambda functions is the implementation of argmax, which returns the index of the largest element in a list (together with the element itself)

  
>>> argmax = lambda ns: max(enumerate(ns), key=lambda (i,n): n)
>>> argmax([2, 3, 1])
(1, 3)

In Python 3 we can use the following implementation instead, which is also short but less readable

  
>>> argmax = lambda ns: max(enumerate(ns), key=lambda t: t[1])
>>> argmax([2, 3, 1])
(1, 3)
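
If the t[1] indexing feels opaque, a named function with operator.itemgetter is one possible alternative in Python 3. The sketch below is only an illustration; argmax_index is a hypothetical helper that is not part of the original post and returns the index alone.

```python
from operator import itemgetter

def argmax(ns):
    """Return (index, value) of the largest element of a non-empty sequence."""
    return max(enumerate(ns), key=itemgetter(1))

def argmax_index(ns):
    """Return only the index of the largest element of a non-empty sequence."""
    return max(range(len(ns)), key=ns.__getitem__)

print(argmax([2, 3, 1]))        # (1, 3)
print(argmax_index([2, 3, 1]))  # 1
```

Both versions raise a ValueError on an empty sequence, just like the lambda versions above.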

While applications for lambda functions in typical Python code are limited, they occur frequently in APIs for graphical user interfaces (e.g. to react to button presses) and in code written in a functional programming style, which uses filter, map, reduce or similar functions that take functions as arguments. For instance, the accumulative difference of the positive numbers in a list of strings could be computed (in functional style) as follows

>>> from operator import sub  
>>> numbers = ['10', '-5', '3', '2']  # we want to compute 10-3-2
>>> reduce(sub, filter(lambda x: x > 0, map(int, numbers)))  
5

However, list comprehensions and generator expressions are often more readable and are the recommended choice. Here is the same computation using a generator expression

>>> reduce(sub, (int(n) for n in numbers if int(n) > 0)) 
5

Note, however, that this version converts each string to an integer twice and that there is no elegant way to compute the accumulative difference without using reduce, which is no longer a built-in function in Python 3 :( So in plain, boring, imperative Python we would have to write

numbers = ['10', '-5', '3', '2']
diff = None
for nstr in numbers:
    n = int(nstr)
    if n <= 0:
        continue
    diff = n if diff is None else diff - n

Pay special attention to the ugly edge case of the first element when accumulating the differences. We need to initialize diff with a marker value, here None, and then have to test for it in the last line. Ugly! The functional implementation is clearly more readable and shorter in this case (and similar problems).

Luckily the functools module comes to the rescue and brings back reduce in Python 3 (from functools import reduce). Also have a look at itertools, which provides many useful functions for functional programming. Similarly, there is nuts-flow, which allows even more elegant code. Here is the accumulative difference using nuts-flow

>>> from operator import sub 
>>> from nutsflow import _, Reduce, Map, Filter

>>> numbers = ['10', '-5', '3', '2']
>>> numbers >> Map(int) >> Filter(_ > 0) >> Reduce(sub)
5
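
For comparison, here is a sketch of the same computation in Python 3 using only the standard library, with reduce imported from functools. The variable names diff_functional and diff_generator are made up for illustration.

```python
from functools import reduce
from operator import sub

numbers = ['10', '-5', '3', '2']

# Functional style: in Python 3, map and filter return lazy iterators,
# which reduce consumes directly.
diff_functional = reduce(sub, filter(lambda n: n > 0, map(int, numbers)))
print(diff_functional)   # 5

# Generator expression over map(int, ...), so each string is converted once.
diff_generator = reduce(sub, (n for n in map(int, numbers) if n > 0))
print(diff_generator)    # 5
```

Feeding map(int, numbers) into the generator expression is one way to avoid the double conversion noted earlier.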

That's it. Have fun ;)
