Two simple changes allow the previous regular expression to match phone numbers within longer text. The word boundary token (\b) matches the position between a word character and either a nonword character or the beginning or end of the text. Letters, numbers, and underscore are all considered word characters (see Recipe 2.6).
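As a minimal sketch of the word boundary behavior, the Python snippet below searches a longer string for 10-digit phone numbers. The pattern is an assumed reconstruction of the kind of regex discussed here (optional parentheses, optional separators, word boundaries instead of start and end anchors). The names PHONE_IN_TEXT and sample are hypothetical.

```python
import re

# Assumed pattern: \b keeps a match from starting or ending inside a longer
# run of word characters (letters, digits, underscore).
PHONE_IN_TEXT = re.compile(
    r"\(?\b([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})\b"
)

sample = "Call 555-123-4567 or (555) 987-6543 today; ignore 12345678901234."

# Collect the full text of each match rather than the captured groups.
matches = [m.group(0) for m in PHONE_IN_TEXT.finditer(sample)]
print(matches)
```

The 14-digit run at the end is skipped because no word boundary exists between two adjacent digits, so the trailing \b cannot match after the tenth digit.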
As we’ve repeatedly seen, parentheses are special characters in regular expressions, but in this case we want to allow a user to enter parentheses and have our regex recognize them.
This is a textbook example of where we need a backslash to escape a special character so the regular expression treats it as literal input.
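The difference between escaped and unescaped parentheses can be illustrated with a short Python sketch. The input string and variable names are hypothetical. The first pattern uses parentheses only for grouping, and the second escapes an outer pair so they match literal characters.

```python
import re

text = "(555)"

# Unescaped parentheses only create a capturing group; they match no
# characters themselves, so the literal "(" and ")" in the input are unmatched.
grouped = re.fullmatch(r"([0-9]{3})", text)

# Escaped parentheses match literal "(" and ")"; the inner, unescaped pair
# still captures the three digits.
literal = re.fullmatch(r"\(([0-9]{3})\)", text)

print(grouped)           # None
print(literal.group(1))  # 555
```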
So far, the regular expression matches any 10-digit number.
If you want to limit matches to valid phone numbers according to the North American Numbering Plan, the plan's basic rules must be built into the pattern. Beyond those basic rules, there are a variety of reserved, unassigned, and restricted phone numbers.
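One way to tighten the pattern is sketched below in Python. It assumes only two commonly cited North American Numbering Plan constraints: the area code and the exchange code each begin with a digit from 2 through 9. It does not attempt to cover reserved, unassigned, or restricted numbers, and the names NANP_LIKE and is_nanp_like are hypothetical.

```python
import re

# Assumption: area code and exchange code must each start with 2-9.
# Anything stricter is out of scope for this sketch.
NANP_LIKE = re.compile(
    r"^\(?([2-9][0-9]{2})\)?[-. ]?([2-9][0-9]{2})[-. ]?([0-9]{4})$"
)

def is_nanp_like(number: str) -> bool:
    """Return True if the number passes the two leading-digit checks."""
    return NANP_LIKE.match(number) is not None

print(is_nanp_like("212-555-0199"))  # True
print(is_nanp_like("123-456-7890"))  # False: area code starts with 1
print(is_nanp_like("212-155-0199"))  # False: exchange code starts with 1
```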
The first group can optionally be enclosed in parentheses, and the first two groups can optionally be followed by one of three separators (a hyphen, dot, or space). It’s important that the hyphen appears first in this character class, because if it appeared between other characters, it would create a range instead of matching a literal hyphen.
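The effect of hyphen placement can be checked with a small Python sketch. The class [-. ] is assumed to be the separator class described above, and the misplaced variant [ -.] is a hypothetical counterexample in which the hyphen sits between a space and a dot and therefore defines a range.

```python
import re

# Hyphen first: a literal hyphen, a dot, or a space.
separator = re.compile(r"[-. ]")

# Hyphen in the middle: a range from space (0x20) to dot (0x2E), which also
# covers characters such as "+" and "," that were never intended.
accidental_range = re.compile(r"[ -.]")

print(separator.fullmatch("+"))         # None
print(accidental_range.fullmatch("+"))  # a match object

# A descending range such as [.- ] is rejected outright by Python's re module.
try:
    re.compile(r"[.- ]")
    bad_range_rejected = False
except re.error:
    bad_range_rejected = True
print(bad_range_rejected)
```

Other regex flavors may report or handle a descending range differently, so the final check is specific to Python.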