Today I learned how to use Hypothesis to do confident code refactoring.
Hypothesis is a Python library that you can use when testing your code. In a nutshell, Hypothesis will generate test cases automatically for you.
If you already know what that means: Hypothesis enables property-based testing, which is a “testing paradigm”. In case you are interested, I've written about getting started with Hypothesis before.
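To give you a feel for the idea, here is a tiny toy property test (unrelated to the example below): you state a property that should hold for *all* inputs, and Hypothesis generates the inputs for you and checks it.

```python
from hypothesis import given
from hypothesis.strategies import integers, lists

# A toy property: reversing a list twice gives back the original list.
# Hypothesis generates many random lists and checks this for each one.
@given(lists(integers()))
def test_double_reverse(xs):
    assert list(reversed(list(reversed(xs)))) == xs

test_double_reverse()  # calling a @given-decorated test runs it directly
```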
One of the cool things about Hypothesis is that you can use it when you are refactoring code and you want to be as certain as possible that you are not breaking anything. Let me give you a concrete example.
I had written a recursive function `dl` that computes the Damerau–Levenshtein (edit) distance between two strings:

```python
from functools import lru_cache

@lru_cache
def dl(a, b):
    edit_distances = []
    if len(a) == len(b) == 0:
        edit_distances.append(0)
    if len(a) > 0:
        edit_distances.append(dl(a[:-1], b) + 1)
    if len(b) > 0:
        edit_distances.append(dl(a, b[:-1]) + 1)
    if len(a) > 0 and len(b) > 0:
        edit_distances.append(dl(a[:-1], b[:-1]) + (a[-1] != b[-1]))
    if len(a) > 1 and len(b) > 1 and a[-1] == b[-2] and a[-2] == b[-1]:
        edit_distances.append(dl(a[:-2], b[:-2]) + (a[-1] != b[-1]))
    return min(edit_distances)
```
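As a quick sanity check (the function is reproduced here so the snippet runs on its own), `dl` returns the expected distances on a couple of classic examples:

```python
from functools import lru_cache

@lru_cache
def dl(a, b):  # the recursive implementation from above
    edit_distances = []
    if len(a) == len(b) == 0:
        edit_distances.append(0)
    if len(a) > 0:
        edit_distances.append(dl(a[:-1], b) + 1)
    if len(b) > 0:
        edit_distances.append(dl(a, b[:-1]) + 1)
    if len(a) > 0 and len(b) > 0:
        edit_distances.append(dl(a[:-1], b[:-1]) + (a[-1] != b[-1]))
    if len(a) > 1 and len(b) > 1 and a[-1] == b[-2] and a[-2] == b[-1]:
        edit_distances.append(dl(a[:-2], b[:-2]) + (a[-1] != b[-1]))
    return min(edit_distances)

print(dl("kitten", "sitting"))  # 3: two substitutions and one insertion
print(dl("ab", "ba"))           # 1: a single transposition
```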
However, this was just a basic implementation that translated the mathematical formula that the Wikipedia page showed. I wanted to have a go at rewriting this in a better way. I wanted to refactor the code.
So, I did. I came up with this alternative implementation:
```python
from functools import lru_cache

@lru_cache
def dl(a, b):
    if not a or not b:
        return len(a) + len(b)
    levenshtein = min(
        dl(a[:-1], b) + 1,
        dl(a, b[:-1]) + 1,
        dl(a[:-1], b[:-1]) + (a[-1] != b[-1]),
    )
    if a[:-1] and b[:-1] and a[-1] == b[-2] and b[-1] == a[-2]:
        return min(
            levenshtein,
            dl(a[:-2], b[:-2]) + (a[-1] != b[-1]),
        )
    return levenshtein
```
Now, the question is: are these two functions the same? Do the two functions compute the same thing?
Because Hypothesis generates random test cases for you, you can create a test in which Hypothesis generates two random strings, feeds them to the two alternative implementations, and then checks whether they return the same result!
In essence, if `dl` and `dl2` are the two functions, you just need to write this:
```python
# dl.py
from functools import lru_cache

from hypothesis import given
from hypothesis.strategies import text

@lru_cache
def dl(a, b):
    ...

@lru_cache
def dl2(a, b):
    ...

@given(text(max_size=15), text(max_size=15))
def test_dl_match(a, b):
    assert dl(a, b) == dl2(a, b)
```
Then, you can run your tests – for example, with `pytest dl.py` – and wait for Hypothesis's verdict.
If the test passes, the two functions probably compute the same thing. In my case, the test did pass, so the two functions are likely equivalent and my refactor is OK!
Notice that this doesn't guarantee that the two functions are correct! I only tested whether they compute the same results: if `dl` is correct, it's likely that `dl2` is also correct; if `dl` is incorrect, it's likely that `dl2` is also incorrect.
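One way to pin down correctness a bit more is to also test properties that any edit distance must satisfy, independently of the implementation – for example, that the distance from a string to itself is zero, and that the distance is symmetric. Here is a sketch (the refactored implementation is reproduced so the snippet is self-contained):

```python
from functools import lru_cache

from hypothesis import given
from hypothesis.strategies import text

@lru_cache
def dl(a, b):  # the refactored implementation from above
    if not a or not b:
        return len(a) + len(b)
    levenshtein = min(
        dl(a[:-1], b) + 1,
        dl(a, b[:-1]) + 1,
        dl(a[:-1], b[:-1]) + (a[-1] != b[-1]),
    )
    if a[:-1] and b[:-1] and a[-1] == b[-2] and b[-1] == a[-2]:
        return min(levenshtein, dl(a[:-2], b[:-2]) + (a[-1] != b[-1]))
    return levenshtein

# Properties any edit distance must satisfy, regardless of implementation.
# (max_size is kept small so the recursive function stays fast.)
@given(text(max_size=10))
def test_identity(a):
    assert dl(a, a) == 0

@given(text(max_size=10), text(max_size=10))
def test_symmetry(a, b):
    assert dl(a, b) == dl(b, a)

test_identity()
test_symmetry()
```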
So, now you know. If you want to do a large refactor of a function, have the two versions side by side, use Hypothesis to generate the arguments for them, and compare the results of the two.
That's it for now! Stay tuned and I'll see you around!