Filtering Ghost Referral Spam Via Advanced Segments

First, a very short primer on filter formatting. Filters need to be written as regular expressions (“regex”). Google has a primer on regex, and there’s a wiki page on it in case you want to dig a little deeper.

In regex, full-stop characters (periods) function as wildcards: each one matches any single letter, number, or symbol, much as a question mark does in Windows file searches. For them to be recognized as simple periods (you don’t want to cast too wide a net, after all), you need to precede them with a \ (backslash). The backslash tells the system to interpret the character that follows it as a literal character, and not as its regex function.
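To see why the backslash matters, here is a minimal sketch using Python's built-in `re` module rather than GA itself. The domain `spam.example.com` and the other source strings are made up for illustration, and the sketch assumes an unanchored match (`re.search`), which may not be exactly how GA evaluates "Matches regex".

```python
import re

# Hypothetical spam source, used only for illustration.
unescaped = re.compile(r"spam.example.com")    # periods act as wildcards
escaped = re.compile(r"spam\.example\.com")    # periods are literal periods

sources = ["spam.example.com", "spamXexample.com", "spam-example-com"]

# The unescaped pattern casts too wide a net: every source matches.
unescaped_hits = [s for s in sources if unescaped.search(s)]

# The escaped pattern matches only the real domain.
escaped_hits = [s for s in sources if escaped.search(s)]

# re.escape adds the backslashes for you.
auto_escaped = re.escape("spam.example.com")

print(unescaped_hits)
print(escaped_hits)
print(auto_escaped)
```

If you build patterns from a list of domains, `re.escape` saves you from typing the backslashes by hand.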

Second, the advanced segments. Filters are great, and generally do the job we want them to do– get spam out of our analytics data. However, there are two downsides:

  1. Filters neither preserve nor filter data from prior to the moment the filter is implemented. If you preserve your “All Data” view, and compare it to a filtered view with your anti-spam filters implemented, you will notice that the filtered view starts blank. What’s more, if you add a new domain to the filter after you’ve found it, its initial hits will remain in your data. There is no way to excise it using filters.
  2. Filters only apply to one view on whatever GA property you’re looking at. You can import them to other views for that same property manually; however (to my knowledge), you can’t share them between GA properties short of copy/pasting them. This is tedious, especially when you have to update every.single.filter. to include any new referral spam sources you’ve detected.

Advanced segments don’t have these shortfalls. When you add a filter to an advanced segment, it removes the data from your analytics view when you’re looking at that segment. This includes data from before the filter was implemented; indeed, it cleans up the data from the first date GA was implemented onwards. As an added bonus, segments are saved to your Google account– so if you set one up, you can pull it on any GA property you manage from that Google account.
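The difference between the two approaches can be sketched as a toy model in Python. This is not the GA API: the session rows, the dates, the `spam.example.com` source, and the function names `view_filter` and `exclude_segment` are all hypothetical, and the model only mimics the behavior described above.

```python
import re
from datetime import date

# Hypothetical spam pattern and session data, for illustration only.
SPAM = re.compile(r"spam\.example\.com")

sessions = [
    (date(2015, 1, 5), "spam.example.com"),
    (date(2015, 1, 6), "google"),
    (date(2015, 3, 1), "spam.example.com"),
    (date(2015, 3, 2), "newsletter"),
]

def view_filter(rows, created_on):
    """Mimic a view filter: only sessions on or after created_on are screened."""
    return [r for r in rows if r[0] < created_on or not SPAM.search(r[1])]

def exclude_segment(rows):
    """Mimic an exclude segment: every session is screened, whatever its date."""
    return [r for r in rows if not SPAM.search(r[1])]

# A filter added on 1 Feb leaves the January spam hit in the data.
filtered = view_filter(sessions, date(2015, 2, 1))

# The segment removes spam from the whole date range.
segmented = exclude_segment(sessions)

print(filtered)
print(segmented)
```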

Setting up a segment is pretty easy, once you know how to get the segment to filter what you want it to. Step by step (this will be detailed– because I have a thing about relaying good instructions):

  1. Full album here.
  2. Log into Google Analytics.
  3. Open up any of your managed GA properties.
  4. Near the top of the window, click the “Add a segment” button.
  5. Click “+New Segment”.
  6. Multi parter: open the image.
  7. In the left menu, click “Conditions”. (#1)
  8. (Forgot this in the image): Leave “sessions” alone, but change “Include” to “Exclude”.
  9. Set the first dropdown to “Source”. (#2)
  10. Set the second dropdown to “Matches regex”. (#3)
  11. Type or paste your regex filter into the text box. (#4).
  12. Name the segment.
  13. Save the segment.
  14. When you return to your property, the new segment may already be implemented.
  15. From now on, when you open the property, you can select “Add a segment”.
  16. Check off the appropriate segment you want to view.
  17. Click “Apply”.
  18. View the glorious difference.

You’ll be able to compare the unfiltered data to the filtered data, from day 0 of GA installation onwards.

Keep in mind two things. First, filter patterns can be a maximum of 255 characters long. Hence the need for multiples. Second, you can update your segments whenever you need to, and the revisions will take effect immediately, and apply to historical data.
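If your list of spam domains is long, splitting it into patterns by hand gets fiddly. The following is a rough sketch of one way to do it in Python, assuming the 255-character limit mentioned above and joining domains with `|` (regex "or"). The function name `build_patterns` and the example domains are hypothetical.

```python
import re

MAX_PATTERN_LENGTH = 255  # the limit stated in the text above

def build_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Join escaped domains with '|' into as few patterns as fit the limit."""
    patterns = []
    current = ""
    for domain in domains:
        piece = re.escape(domain)  # turns "." into "\."
        if len(piece) > max_length:
            raise ValueError(f"{domain!r} alone exceeds {max_length} characters")
        candidate = piece if not current else current + "|" + piece
        if len(candidate) <= max_length:
            current = candidate
        else:
            # The current pattern is full; start a new one with this domain.
            patterns.append(current)
            current = piece
    if current:
        patterns.append(current)
    return patterns

# Made-up domains, for illustration only.
domains = [f"spam{i}.example.com" for i in range(40)]
patterns = build_patterns(domains)

for p in patterns:
    print(len(p), p)
```

Each resulting pattern can then be pasted into its own condition in the segment.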

Enjoy your convenient clean data.
