Regex for SEO

Perhaps you’ve heard about regex but aren’t sure how it can be used in SEO or whether it fits into your own strategy. Regular expressions, or ‘regex’, are a pattern-matching syntax for searching text. They let you write complex search strings, partial matches and wildcards, case-insensitive searches, and other advanced instructions. They can be thought of as looking for a pattern rather than a specific string of text. As a result, they can help you locate entire sets of results that appear to have little in common at first glance.
Regular expressions have their own syntax, and seeing one for the first time can be pretty strange. They are, however, simple to learn and are supported in JavaScript, Python, and other programming languages, making them a versatile and powerful SEO tool.
This article will teach you how to utilise common regex operators, how to use more complicated regex filters for SEO, how to use regex in Google Analytics and Google Search Console, and much more. You’ll see regex used in a variety of ways in SEO as well.
What Does Regex Look Like?
A regular expression is often composed of text that will match exactly in the search results, as well as many operators that function more like wildcards to produce a pattern match rather than an exact text match.
You can use a single-character wildcard, a match for one or more characters, a match for zero or more characters, or optional characters. Nested sub-expressions in parentheses and ‘or’ functions are also available. By combining these, you can create a complex expression that covers a broad set of variations while still matching only what you intend.
Common Regex Operators
Here are a few examples of common regex operators:
.     –   Any single character
.*    –   Zero or more characters
.+    –   One or more characters
\d    –   A single digit from 0 to 9
?     –   Makes the preceding character optional
|     –   A vertical line, also known as a ‘pipe’ character, denotes an ‘or’ function
^     –   Beginning of a line
$     –   End of a line
( )   –   Used to nest a sub-expression
\     –   Escapes a special character
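To see how these operators behave, here is a minimal sketch in Python. The URL paths and the helper name `matching` are made up for illustration, and Python's `re.search` is used because it looks for the pattern anywhere in the text.

```python
import re

# Hypothetical URL paths, used only for illustration.
paths = ["/blog/seo-tips", "/blog/seo-tips/", "/shop/item-7", "/shop/item-42", "/about"]


def matching(pattern):
    """Return the sample paths in which the pattern is found."""
    return [p for p in paths if re.search(pattern, p)]


starts_with_blog = matching(r"^/blog/")      # ^ anchors to the start
optional_slash = matching(r"seo-tips/?$")    # ? makes "/" optional, $ anchors to the end
single_digit = matching(r"item-\d$")         # \d is exactly one digit
any_suffix = matching(r"item-.+$")           # .+ is one or more of any character
shop_or_about = matching(r"^/(shop|about)")  # ( ) groups, | means "or"

# \ escapes the dot so it only matches a literal "."
literal_dot = bool(re.search(r"^index\.html$", "index.html"))
not_a_dot = bool(re.search(r"^index\.html$", "indexXhtml"))
```

Here `single_digit` contains only `/shop/item-7`, while `any_suffix` contains both shop paths, which shows the difference between matching one digit and matching one or more characters.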
Some programming languages, such as JavaScript, allow the addition of ‘flags’ after the regex pattern, which might further influence the result:
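g     –   Global: finds every match rather than stopping after the first.
i     –   Ignore case.
m     –   Activates multiline mode, so ^ and $ match at the start and end of each line.
s     –   Activates ‘dot all’ mode, so . also matches line breaks.
u     –   Activates full Unicode support.
y     –   ‘Sticky’ mode: matches only at the current position in the text.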
As you can see, these operators and flags combine to form a compact logical language that lets you pull highly specific results out of enormous, unordered data sets.
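The flags above are written in JavaScript's notation. As a rough sketch, these are the closest counterparts in Python's `re` module. The sample text is invented, and the mapping is an approximation rather than an exact equivalence.

```python
import re

# Hypothetical three-line text, used only for illustration.
text = "SEO tips\nseo tools\nLocal SEO"

# Like 'g': findall returns every match, not just the first one.
all_matches = re.findall(r"SEO", text)

# Like 'i': IGNORECASE matches upper and lower case alike.
ignore_case = re.findall(r"seo", text, flags=re.IGNORECASE)

# Like 'm': MULTILINE lets ^ match at the start of every line.
first_line_only = re.findall(r"^seo", text, flags=re.IGNORECASE)
every_line = re.findall(r"^seo", text, flags=re.IGNORECASE | re.MULTILINE)

# Like 's': DOTALL lets . match the line break between "tips" and "seo".
without_dotall = re.search(r"tips.seo", text)
with_dotall = re.search(r"tips.seo", text, flags=re.DOTALL)

# Roughly like 'y': Pattern.match(text, pos) only matches starting exactly at pos.
sticky = re.compile(r"seo")
at_nine = sticky.match(text, 9)   # position 9 is the "s" of "seo tools"
at_eight = sticky.match(text, 8)  # position 8 is the line break

# 'u' needs no counterpart here: Python 3 string patterns are Unicode-aware by default.
```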
Using Regex On Google Analytics
Regular expressions may be used to create filters that reveal only the data you want to see in Google Analytics, which is one of the most prominent uses of regex for SEO.
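In this context, the expression is often used to exclude results rather than to build a set of inclusive search results.
If you wish to exclude data from IP addresses on your local area network, for example, you could filter out ^192\.168\.\d+\.\d+$ to exclude the entire range from 192.168.0.0 to 192.168.255.255. A shell-style pattern such as 192.168.*.* will not work as intended, because in regex an unescaped dot matches any character and * repeats whatever precedes it.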
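A quick way to check a filter for this local address range is to run it against a few sample values. The sketch below uses Python's `re` module and invented sample addresses. It also shows why the dots need escaping, since an unescaped dot matches any character.

```python
import re

# Escaped dots match a literal "."; \d+ matches one or more digits.
lan_range = re.compile(r"^192\.168\.\d+\.\d+$")

# The same pattern with unescaped dots, for comparison.
unescaped = re.compile(r"^192.168.\d+.\d+$")

# Hypothetical values, used only for illustration.
samples = ["192.168.0.1", "192.168.255.255", "192.169.0.1", "10.0.0.5", "192x168y0z1"]

excluded = [ip for ip in samples if lan_range.search(ip)]
kept = [ip for ip in samples if not lan_range.search(ip)]

# The unescaped version also accepts a string that is not an IP address at all.
false_positive = bool(unescaped.search("192x168y0z1"))
```

Note that `\d+` does not check that each part is between 0 and 255; it only checks the overall shape, which is usually enough for a reporting filter.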
More Advanced Regex SEO Filters
Consider the following scenario: you have two brands: regex247 and regex365.
You might want to exclude results that contain any combination of these brand names in their URLs, such as regex247.biz or www.regex365.org.
One method is to use a simple ‘or’ expression like this: .*regex247.*|.*regex365.*
All matching URLs, including subfolder paths and individual page URLs on those domain names, would then be excluded from your Analytics reports.
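The ‘or’ expression for the two brands can be tried out on sample URLs before it goes anywhere near your reports. In this Python sketch the URLs are invented, and the shorter `compact` pattern is an alternative offered as an assumption; it relies on the filter doing a partial match, as `re.search` does.

```python
import re

# The expression from the text.
verbose = re.compile(r".*regex247.*|.*regex365.*")

# A shorter, hypothetical equivalent when the filter matches anywhere in the URL.
compact = re.compile(r"regex(247|365)")

# Hypothetical URLs, used only for illustration.
urls = [
    "regex247.biz/pricing",
    "www.regex365.org",
    "www.regex365.org/blog/post-1",
    "example.com/regex-guide",
]

excluded_verbose = [u for u in urls if verbose.search(u)]
excluded_compact = [u for u in urls if compact.search(u)]
remaining = [u for u in urls if not compact.search(u)]
```

Both patterns exclude the same three URLs here, and `example.com/regex-guide` stays in because it contains neither brand name.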
A Word Of Warning
Be aware that a badly constructed regular expression, much like a badly written robots.txt file, can very easily filter out much or all of your data through an unrestricted wildcard match.
The good news is that in many SEO scenarios, the filter is only applied to your data during the reporting step, and you may restore full visibility to your data by modifying or deleting your regex phrase.
Regular expressions can also be tested with a variety of online testing tools to verify that they produce the desired result, allowing you to ‘sandbox’ your expressions before applying them to your full data set.
To make regex filters in Google Analytics, open the report you want to filter (for example, Behaviour > Site Content > All Pages or Acquisition > All Traffic > Source/Medium).
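If you prefer to sandbox an expression locally, a few lines of Python can report how much of a sample a pattern would catch. This is only a sketch: the page paths and the helper name `match_share` are hypothetical, and a share of 1.0 is simply a hint that the pattern may be an unrestricted wildcard.

```python
import re

# Hypothetical page paths, used only for illustration.
sample_pages = ["/", "/blog/regex-for-seo", "/blog/link-building", "/shop/item-7", "/contact"]


def match_share(pattern, rows):
    """Return the fraction of rows in which the pattern is found."""
    compiled = re.compile(pattern)
    hits = [row for row in rows if compiled.search(row)]
    return len(hits) / len(rows)


too_broad = match_share(r".*", sample_pages)      # matches every row
targeted = match_share(r"^/blog/", sample_pages)  # matches only the blog pages
```

A pattern that catches every row in a varied sample, as `.*` does here, would filter out all of your data if used in an exclude filter.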
Look for the search box below the graph, at the top of the data table, and click advanced to see the advanced filter options.
You can use this section to include or exclude data based on a specific dimension or metric. Choose Matching RegExp from the dropdown list after selecting your dimension, and then type your expression into the text field.
‘Or’ And ‘And’ In Google Analytics Regex
Simply put the pipe character (the | vertical stroke sign) between the necessary segments of your expression to build an ‘or’ expression in Google Analytics.
Regular expressions in Google Analytics do not accept ‘and’ statements within a single regex; however, you can do this by adding another filter.
Simply click Add a dimension or metric below your first regex and type your next regex. When filtering your data, you can stack as many expressions as you want, and they’ll be processed as a single logical ‘and’ statement.
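The difference between a single ‘or’ expression and stacked ‘and’ filters can be sketched in Python. The page paths below are invented, and the list `stacked_filters` stands in for several filter rows added one under another.

```python
import re

# Hypothetical page paths, used only for illustration.
pages = ["/blog/seo-tips", "/blog/ppc-tips", "/shop/seo-audit", "/news/seo-update", "/blog/about"]

# 'Or' inside one expression: the pipe accepts either alternative.
blog_or_shop = [p for p in pages if re.search(r"^/blog/|^/shop/", p)]

# 'And' by stacking: a page is kept only if every filter matches it.
stacked_filters = [r"^/blog/", r"seo"]
blog_and_seo = [p for p in pages if all(re.search(f, p) for f in stacked_filters)]
```

Only `/blog/seo-tips` passes both stacked filters, while the ‘or’ expression keeps every blog and shop page.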
Using Regex In Google Search Console
Google Search Console started supporting the RE2 regex syntax in 2021, allowing webmasters to include and exclude data with regex filters directly in the user interface.
There is a character limit of 4096 characters at the time of writing (which is usually enough).
You may use Search Console to filter for queries featuring a given brand and the variations users may type, such as Facebook:
.*facebook.*|face*book.*|fb.*|fbook.*|f*book.*
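It is worth running a brand pattern like this against sample queries before relying on it. The Python sketch below uses invented queries and `re.search` for partial matching. Python's `re` is not RE2, but the operators used here behave the same way in both. The `tighter` pattern is a hypothetical alternative, not part of the original example.

```python
import re

# The pattern from the text.
brand = re.compile(r".*facebook.*|face*book.*|fb.*|fbook.*|f*book.*")

# A hypothetical tighter variant: "e?" allows the common typo "facbook".
tighter = re.compile(r"face?book|fb")

# Hypothetical queries, used only for illustration.
queries = [
    "facebook login",
    "fb marketplace",
    "fbook app",
    "facbook ads",
    "book a flight",
    "instagram login",
]

matched_brand = [q for q in queries if brand.search(q)]
matched_tighter = [q for q in queries if tighter.search(q)]
```

Because `*` means ‘zero or more of the preceding character’, the segment `f*book.*` also matches any query containing ‘book’, which is why `book a flight` appears in `matched_brand` but not in `matched_tighter`.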
You can also filter for queries that contain ‘commercial’ intent terms:
.*(best|top|alternate|alternative|vs|versus|review*).*
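The same kind of check applies to the commercial-terms pattern. In this Python sketch the queries are invented, and the `bounded` variant with `\b` word boundaries is an assumption added for comparison, not something from the original example.

```python
import re

# The pattern from the text.
commercial = re.compile(r".*(best|top|alternate|alternative|vs|versus|review*).*")

# A hypothetical variant that only accepts whole words.
bounded = re.compile(r"\b(best|top|alternate|alternative|vs|versus|reviews?)\b")

# Hypothetical queries, used only for illustration.
queries = [
    "best crm software",
    "hubspot vs salesforce",
    "crm reviews",
    "laptop deals",
    "what is a crm",
]

matched_commercial = [q for q in queries if commercial.search(q)]
matched_bounded = [q for q in queries if bounded.search(q)]
```

Without word boundaries, `top` is found inside ‘laptop’, so `laptop deals` is picked up by the original pattern but not by the bounded one.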
Why Is Regex Important For SEO?
Finally, why is all of this important?
It’s all about taking control of your data and filtering out the parts that aren’t helping you improve your SEO, whether it’s certain pages or parts of your website, traffic from a specific source or medium, or data from your own local network.
You can construct short regex expressions to obtain a basic ‘include’ or ‘exclude’ filter, or lengthier expressions to achieve complex and very precise results.
With the right regex for each campaign, you can also verify that your SEO efforts are meeting their goals, which is a strong way to demonstrate positive ROI on future SEO investment.
Credits – https://www.searchenginejournal.com/regex-seo-beginners-guide/432930/