Matching prefixes and suffixes with regular expressions

The special characters \b and \B go hand in hand in regular expressions:

  • \b matches at word boundaries; and
  • \B matches inside words.

For these two characters, the default β€œword characters” are alphanumeric characters and the underscore.

By combining \b and \B at the beginning or end of a pattern, you get to match standalone words, prefixes, suffixes, and infixes!

The table below shows some examples of sentences that all contain the substring "legal" along the rows. The columns show whether different patterns that use the special characters \b and \B would match against those sentences.

r"legal" r"\blegal\b" r"\blegal\B" r"\Blegal\b" r"\Blegal\B"
"Criticism is legal." βœ… βœ… ❌ ❌ ❌
"He's legally blind." βœ… ❌ βœ… ❌ ❌
"Theft is illegal." βœ… ❌ ❌ βœ… ❌
"He obtained that illegally." βœ… ❌ ❌ ❌ βœ…

This was the tip 97 I sent to the Python drops πŸπŸ’§ newsletter, so if you'd like to get a daily drop of Python knowledge, make sure to sign-up now!

Become a better Python 🐍 developer, drop by drop πŸ’§

Get a daily drop of Python knowledge. A short, effective tip to start writing better Python code: more idiomatic, more effective, more efficient, with fewer bugs. Subscribe here.

Previous Post

Blog Comments powered by Disqus.