Matching prefixes and suffixes with regular expressions

The special characters \b and \B go hand in hand in regular expressions:

  • \b matches at word boundaries; and
  • \B matches everywhere \b does not, most notably inside words.

For these two characters, the default “word characters” are alphanumeric characters and the underscore.
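To make the definition of “word characters” concrete, here is a minimal sketch using Python's built-in re module. The sample strings are made up for illustration; the point is only that letters, digits, and underscores stick together, while spaces and apostrophes create boundaries.

```python
import re

# Hypothetical sample strings, chosen only to illustrate the rule.
sample = "a_1 b"

# \b matches the empty string at the edges of a run of word characters...
boundaries = [m.start() for m in re.finditer(r"\b", sample)]
# ...and \B matches the empty string at every other position inside the string.
non_boundaries = [m.start() for m in re.finditer(r"\B", sample)]

print(boundaries)      # [0, 3, 4, 5]
print(non_boundaries)  # [1, 2]

# Underscores and digits count as word characters; the apostrophe does not.
words = re.findall(r"\b\w+\b", "snake_case v2 it's")
print(words)  # ['snake_case', 'v2', 'it', 's']
```

Note how "a_1" is treated as a single word (no boundary at positions 1 and 2), while "it's" is split in two by the apostrophe.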

By combining \b and \B at the beginning or end of a pattern, you get to match standalone words, prefixes, suffixes, and infixes!

The table below has one row per example sentence, and every sentence contains the substring "legal". The columns show whether each pattern built with \b and \B finds a match in that sentence.

r"legal" r"\blegal\b" r"\blegal\B" r"\Blegal\b" r"\Blegal\B"
"Criticism is legal." βœ… βœ… ❌ ❌ ❌
"He's legally blind." βœ… ❌ βœ… ❌ ❌
"Theft is illegal." βœ… ❌ ❌ βœ… ❌
"He obtained that illegally." βœ… ❌ ❌ ❌ βœ…

This was tip 97 that I sent to the Python drops 🐍💧 newsletter, so if you'd like to get a daily drop of Python knowledge, make sure to sign up now!

