How can I Block Language Spam in Google Analytics

Nov 9, 2022

Everyone hates spam, and it is especially frustrating for website owners, since dealing with it usually means taking the time to create filters and working out the most effective ways to prevent it.

Many of you started to notice language spam around the time of the 2016 US elections. Read on to learn the most effective ways to block it and keep it from skewing your analytics and traffic statistics. It is vital to address these issues as soon as they begin appearing.

What is Language Spam?

Unlike referrer spam, which mainly targets search engines, language spam is used by spammers to push an agenda or to advertise their own products or sites. They spoof the language field, using real sites such as motherboard.vice.com, thenextweb.com, lifehacker.com, reddit.com, etc. as the apparent source. Language spam also typically records pageviews only on the homepage of your website.

What are they gaining? Peter Velchev from Dowser explains it well:

The idea is that once you look at the site a visitor appears to have come from, you may trace it back to its source. The result would be genuine visits to the hacker's website, thus pushing it up the rating ladder...
Language spam referral source
  • Secret.google.com We invite you! Only enter using this ticket's URL. Copy it. Make sure you vote for Trump!
  • We wish President Trump and everyone else Americans
  • Vitaly rules google *:.;**"(^^)no**;.:* -\_(tsu)_/-(tthYi tth)(th_th)(@_@)l(tth_tthl)( deg ? deg)"(;D;)no?**? ool(=^ ^=)oO
  • o-o-8-o-o.com search shell is much more effective than Google!
  • Google has officially endorsed o-o-8-o-o.com Search Shell!

Google appears to be working on this problem, but new variants keep emerging. As soon as one is fixed, another seems to start.

Google Analytics language spam

This screenshot is from a new site, and as you can see, between November 1st and December 17th, 929 of the 1,377 sessions were attributed to language spam! Talk about skewing your data.
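To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the two session counts quoted above; the variable names are ours.

```python
# Session counts quoted for the new site (November 1st to December 17th).
spam_sessions = 929
total_sessions = 1377

# Share of all recorded sessions that were language spam.
spam_share = spam_sessions / total_sessions

# Sessions left once the spam is taken out.
genuine_sessions = total_sessions - spam_sessions

print(f"Spam share: {spam_share:.1%}")           # Spam share: 67.5%
print(f"Genuine sessions: {genuine_sessions}")   # Genuine sessions: 448
```

In other words, roughly two out of every three sessions in that report were not real visitors.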

Language spam traffic

The problem of language spam was raised on the Search Engine Roundtable on November 9th. If you look at Google Trends, it is clear that interest in "google analytics spam" increased dramatically starting in November 2016.

Google Analytics spam trends

Why Should You Block Language Spam?

Beyond skewed statistics, there is a further reason that many people aren't aware of: Google Analytics filters don't work retroactively. Filters only apply to data collected after the filters were created, so it is impossible to fix historical data with them. This is why it's important to tackle spam immediately. The drawback is that if you set up a filter incorrectly, you could lose valuable data forever. Advanced segments, however, can help with historical data, and we go into them further below.

How to Block Language Spam in Google Analytics

Option 1: Block Language Spam with a Filter

The first, and probably one of the most efficient, ways to block language spam in Google Analytics is to use a filter. Filters let you alter and restrict the data in a view. For example, you could restrict a view to certain subdirectories, or filter out traffic from specific IPs or IP ranges. We recommend setting up a new view whenever you create filters, so that if something goes wrong you still have access to your original, unaltered data. After that, apply all of your custom filters to the new view.

  Step 1 (Optional).  

The first step is to copy your current view so that the filter only touches data in a separate view. This is optional and is there for your safety. You might already have an additional view, in which case you can skip to Step 2. Otherwise, go to the Admin section in Google Analytics and into your view's "View Settings." After that, click "Copy view." The reason to copy your view rather than create a new one is that the copy carries over all the other filters and goals you may already have in place for your website.

Copy view in Google Analytics

Name your new view. In our example, we chose "filtered domain.com." Then click on "Copy view."

Copy view

  Step 2.  

Navigate to the new view (or your original view) and click into "Filters." After that, click "+ Add Filter." You'll need "Edit" access at the "Account" level of Google Analytics in order to set up new filters; without it you won't be able to follow the next steps.

Create a new filter for Google Analytics

  Step 3.  

Give your filter a name (ex: Language Spam Filter). Choose "Custom" as the Filter Type and keep "Exclude" selected. Then pick "Language Settings" as the Filter Field, and enter the following into the Filter Pattern field: .{15,}|\s[^\s]*\s|\.|,|\!|\/
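If you would like to see what this pattern catches before touching your view, here is a minimal Python sketch. It assumes the first alternative of the pattern reads `.{15,}` (15 or more characters), and it assumes that Google Analytics applies filter patterns as a partial match, which `re.search` approximates. The helper name `is_language_spam` is ours, and Python's `re` engine is not the one Google Analytics uses, so treat this as an illustration rather than a guarantee.

```python
import re

# Each alternative flags one trait that real language codes never have:
#   .{15,}       15 or more characters (codes such as "en-us" are short)
#   \s[^\s]*\s   a word with whitespace on both sides, i.e. a phrase
#   \. , \! \/   punctuation that does not occur in a language code
LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|\!|\/")


def is_language_spam(language: str) -> bool:
    """Return True if the language value matches the spam pattern anywhere."""
    return LANGUAGE_SPAM.search(language) is not None


samples = [
    "en-us",
    "pt-br",
    "(not set)",
    "o-o-8-o-o.com search shell is much more effective than Google!",
    "Google has officially endorsed o-o-8-o-o.com Search Shell!",
]

for value in samples:
    label = "SPAM" if is_language_spam(value) else "ok"
    print(f"{label:4}  {value}")
```

The three short values are left alone, while the two spam strings from the list earlier in this article are flagged.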

Language spam filter

You can then click on "Verify" to preview what the filter would have matched over the last 7 days. Then click "Save" to apply the filter.

Verify language spam filter

And that is all there is to it! From now on, only real and valid languages will show up in your Google Analytics.

Option 2: Block Language Spam with an Advanced Segment

The second option for fighting language spam in Google Analytics is to use an advanced segment. Segments work on your historical data and are considered a safer alternative because they don't modify any of your underlying data. You can turn them off at any time to return to the state you were in before. That said, if you're using a separate view with a filter as we showed earlier, that is just as safe.
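The difference between the two options can be sketched in a few lines of Python. This is a toy model, not how Google Analytics works internally: the session list, the dates, and the function names are all made up for illustration, and the pattern assumes the first alternative reads `.{15,}`.

```python
import re
from datetime import date

LANGUAGE_SPAM = re.compile(r".{15,}|\s[^\s]*\s|\.|,|\!|\/")

# Hypothetical sessions as (date, language value) pairs.
sessions = [
    (date(2016, 11, 5), "en-us"),
    (date(2016, 11, 8), "o-o-8-o-o.com search shell is much more effective than Google!"),
    (date(2016, 12, 1), "Google has officially endorsed o-o-8-o-o.com Search Shell!"),
    (date(2016, 12, 2), "de-de"),
]


def view_with_filter(sessions, created_on):
    """A filter only drops spam recorded on or after the day it was created."""
    return [
        (day, lang) for day, lang in sessions
        if day < created_on or not LANGUAGE_SPAM.search(lang)
    ]


def report_with_segment(sessions):
    """A segment hides spam across the whole date range; stored data is untouched."""
    return [(day, lang) for day, lang in sessions if not LANGUAGE_SPAM.search(lang)]


filtered = view_with_filter(sessions, created_on=date(2016, 11, 20))
segmented = report_with_segment(sessions)

print(len(filtered))   # 3: the November 8th spam predates the filter and stays
print(len(segmented))  # 2: spam is hidden for every date
print(len(sessions))   # 4: the underlying data is unchanged
```

The filter leaves the older spam session in place, which is why we suggest acting as soon as spam appears, while the segment cleans up the whole report without deleting anything.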

  Step 1.  

To create a segment, go to the Admin area of Google Analytics and into "Segments." Then click on "+ Create Segment." As with filters, you'll need "Edit" access at the "Account" level of Google Analytics in order to set up new segments; without it you won't be able to follow the next steps.

Create segment in Google Analytics

  Step 2.  

Give your segment a name (ex: Language Spam Segment). Then, under the Language field, switch the drop-down menu to "does not match regex" and enter the following: .{15,}|\s[^\s]*\s|\.|,|\!|\/

After that, click "Save."

Language spam advanced segment

And you're done. You can now select the language spam segment from your Analytics dashboard and remove "All Users." Remember that segments change the reported data instantly. Tip: create a custom dashboard or shortcut with your new segment applied for quick viewing later.

View segment in Google Analytics

Option 3: Block Language Spam with 3rd Party Lists

One of the more annoying aspects of spam is how time-consuming it is for us as website owners. We have to keep updating our segments and filters to make sure our data is as accurate as it can be. If you're short on time, there are third-party tools and resources that can speed up the process. Here are a couple of alternatives you could look into:

  • Analytics-Toolkit: This company provides what they call an Automatic Spam filter, which is continually updated for you.
  • Analytics Edge: Offers free segment templates that you can apply with a single click. The segments are updated regularly.

And if you are looking to learn more about the best ways to eliminate spam from Google Analytics, the following guides are excellent:

Summary

As you can see, it is pretty easy to block this new language spam tactic. We suggest looking over the data for your websites to make sure it isn't being distorted. What is your opinion on the language spam situation? It is truly frustrating, and we hope that in the future Google will do more to stop the spread of this useless data that businesses are currently dealing with.
