Python Pathlib complex glob patterns


TL;DR

import itertools
import pathlib

mypath = pathlib.Path()
patterns = ["*.jpg", "*.png"]

matched = list(
    itertools.chain.from_iterable(
        mypath.glob(pattern) for pattern in patterns
    )
)

Today I learned that Python's pathlib.Path.glob does not support brace-expansion glob patterns such as *.{jpg,png}!
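To make the limitation concrete, here is a small sketch that builds a throwaway directory and tries the brace pattern. The file names a.jpg and b.png and the temporary directory are made up for illustration. The braces are treated as literal characters, so nothing matches:

```python
import pathlib
import tempfile

# Hypothetical sandbox: a temp directory holding one .jpg and one .png file.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "a.jpg").touch()
    (root / "b.png").touch()

    # Braces are not expanded: this looks for a literal ".{jpg,png}" suffix.
    brace_matches = sorted(p.name for p in root.glob("*.{jpg,png}"))
    # A plain single-extension pattern works as expected.
    jpg_matches = sorted(p.name for p in root.glob("*.jpg"))

print(brace_matches)  # []
print(jpg_matches)    # ['a.jpg']
```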

Let's see how it can be done. First try:

import pathlib

mypath = pathlib.Path() / "tests"
matched = list(mypath.glob("*.jpg")) + list(mypath.glob("*.png"))

It does the job, but I'm not a big fan of concatenating lists, as it allocates multiple intermediate list objects. pathlib.Path.glob returns a generator, which is an iterable. Let's see if itertools can come to the rescue:

import itertools

matched = list(itertools.chain(mypath.glob("*.jpg"), mypath.glob("*.png")))

Not bad! Only a single list object is created out of multiple generators. Can we go further and apply the "DRY" principle?

matched = list(
    itertools.chain.from_iterable(
        mypath.glob(pattern) for pattern in ["*.jpg", "*.png"]
    )
)

I do not think I can do better right now. The argument passed to itertools.chain.from_iterable is a generator expression, which itself produces one generator per pattern by calling pathlib.Path.glob. I do like this kind of construct in Python: the whole chain is lazily evaluated thanks to generators, and the way the pieces compose feels very much like functional programming.
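If this comes up often, the expression can be wrapped in a small helper. The sketch below is one way to do it. The name multi_glob and the sample files are my own invention for the example, not something pathlib provides. Note that a path matching more than one pattern would be yielded more than once.

```python
import itertools
import pathlib
import tempfile
from typing import Iterable, Iterator


def multi_glob(root: pathlib.Path, patterns: Iterable[str]) -> Iterator[pathlib.Path]:
    """Lazily yield paths under root that match any of the given patterns.

    Hypothetical helper: it only wraps the chain.from_iterable expression.
    """
    return itertools.chain.from_iterable(root.glob(p) for p in patterns)


# Hypothetical sandbox with two images and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name in ("a.jpg", "b.png", "c.txt"):
        (root / name).touch()

    # No glob call has run yet: the chain is only consumed on iteration.
    lazy = multi_glob(root, ["*.jpg", "*.png"])
    names = sorted(p.name for p in lazy)

print(names)  # ['a.jpg', 'b.png']
```

Returning the iterator instead of a list leaves the choice to the caller: wrap it in list() when everything is needed at once, or loop over it and stop early.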