Optimize Google Analytics / Google Tag Manager via Preconnect Headers

Add the following before the Google Analytics / Tag Manager tag in your HTML to give Google Analytics loading a small boost and to satisfy the PageSpeed and Chrome auditing tools:

<link rel="dns-prefetch" href="https://www.google-analytics.com">
<link rel="dns-prefetch" href="https://www.googletagmanager.com">
<link href="https://www.google-analytics.com" rel="preconnect" crossorigin>
<link href="https://www.googletagmanager.com" rel="preconnect" crossorigin>

These optimizations can speed up your page load.
Opportunity: Preconnect to required origins
Consider adding preconnect or dns-prefetch resource hints to establish early connections to important third-party origins. Learn more.
https://www.google-analytics.com
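
If you have more than a couple of third-party origins, the same tags can be generated rather than typed by hand. This is only a minimal Python sketch: the helper name `resource_hints` is made up for this example, and it does nothing more than reproduce the two tag forms shown above.

```python
from html import escape


def resource_hints(origins):
    """Return dns-prefetch and preconnect <link> tags for each origin.

    Hypothetical helper: it only mirrors the tag forms used in this post.
    """
    hrefs = [escape(origin, quote=True) for origin in origins]
    # dns-prefetch tags first, then preconnect, matching the order above
    tags = [f'<link rel="dns-prefetch" href="{href}">' for href in hrefs]
    tags += [f'<link href="{href}" rel="preconnect" crossorigin>' for href in hrefs]
    return tags


if __name__ == "__main__":
    for tag in resource_hints([
        "https://www.google-analytics.com",
        "https://www.googletagmanager.com",
    ]):
        print(tag)
```

Run as a script, it prints the four tags from the top of this post.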

CloudFlare Mirage Causing Google Pagespeed Hang Ups

After encountering some sea-themed Google PageSpeed warnings for porpoiseant, jellyfish.webp, and banger.js, I’ve tracked the offending code down to CloudFlare’s Mirage tool (found under the Speed tab).

From the CloudFlare Website, this is the summary of Mirage:

What does Mirage do?

Mirage tailors image loading based on network connection and device type. Devices with small screens receive smaller images, and slower connections receive lower resolution images. This speeds up page rendering so users can begin interacting with your website without waiting for images to download first.

Mirage improves page load time by:

  • Image Virtualizing: Replaces images with low-resolution placeholder images that have the same dimensions as the original (including third-party images). Once the page renders completely, full resolution images are then lazy-loaded (prioritizing images in the browser viewport). This process allows pages to render quickly and minimizes browser reflow.
  • Request Streamlining: Combines multiple individual network requests for images into a single request.

Note: Mirage does not transcode or otherwise alter the original full-resolution images.

Mirage is considered Beta because it’s an experimental feature that may cause issues displaying images in association with certain Javascript libraries, such as image carousels or photo viewers. Issues with Mirage affect only a small percentage of customers.

 

In a real-world setting, I imagine Mirage can provide some good speed advantages, especially on image-lite sites where users want to read the text before the full-resolution images have loaded. Unfortunately, in Google’s PageSpeed lab tests, Googlebot treats Mirage as a drag on page performance and seems to prefer manually implemented image optimizations and lazy-loading techniques. Because page speed is now a ranking factor and Mirage is still a beta feature, I have stopped using the tool to avoid possible Google penalties for having an apparently slow site.

edmonton.webp jellyfish.webp banger.js 404 errors

I’ve recently seen an increase in 404 errors triggered by Googlebot:

66.249.69.204 - - [01/Jul/2019:20:29:17 +0000] "GET /porpoiseant/banger.js?cb=169-1&bv=2&v=15&PageSpeed=off HTTP/1.1" 404 3277 "https://fccid.io/JWC-BS5-5" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://ww
66.249.69.206 - - [01/Jul/2019:10:37:53 +0000] "GET /detroitchicago/edmonton.webp?a=a&cb=170-1&shcb=27 HTTP/1.1" 404 "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.69.208 - - [01/Jul/2019:10:37:53 +0000] "GET /porpoiseant/jellyfish.webp?a=a&cb=170-1&shcb=27 HTTP/1.1" 404  "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
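
To get a feel for how often each of these paths is hit, the access log can be tallied. The sketch below is a rough Python example rather than my actual tooling: the function name `count_googlebot_404s` is made up, the sample lines are the entries above with the user-agent string abbreviated, and it trusts the user-agent header, which can be spoofed.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the quoted request line and the status code right after it.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def count_googlebot_404s(lines):
    """Count 404 responses per path (query string dropped) for Googlebot hits."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match or match.group("status") != "404":
            continue
        if "Googlebot" not in line:  # assumption: user agent is taken at face value
            continue
        counts[urlsplit(match.group("target")).path] += 1
    return counts


if __name__ == "__main__":
    # Abbreviated copies of the log entries above (user agent shortened).
    sample = [
        '66.249.69.206 - - [01/Jul/2019:10:37:53 +0000] "GET /detroitchicago/edmonton.webp?a=a&cb=170-1&shcb=27 HTTP/1.1" 404 "(compatible; Googlebot/2.1)"',
        '66.249.69.208 - - [01/Jul/2019:10:37:53 +0000] "GET /porpoiseant/jellyfish.webp?a=a&cb=170-1&shcb=27 HTTP/1.1" 404 "(compatible; Googlebot/2.1)"',
    ]
    for path, hits in count_googlebot_404s(sample).most_common():
        print(hits, path)
```

Dropping the query string groups the cache-busting variants (`cb=169-1`, `cb=170-1`, and so on) under a single path.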

I’m assuming this comes either from a CloudFlare third-party app or from one of CloudFlare’s primary functions. CF should really be handling these requests itself rather than letting them hit the back-end server.
The only third-party app I am using on CF is Google Analytics. Given the nature of the request, I’m assuming this could be an artifact from Rocket Loader or the WebP image loading function of CloudFlare Polish.

Have you seen these requests recently? Please share below so I can continue tracking this down.

Update: I’ve narrowed down the source of these requests to CloudFlare Mirage. Mirage is a CF add-on intended to improve page load time through Image Virtualizing and Request Streamlining. CloudFlare serves mobile devices low-resolution placeholder images and then lazy-loads the full-resolution images. On top of this, Mirage sends all the page images over fewer network requests. Despite these supposed speed advantages, I’ve removed the Mirage tool from my websites. Here’s why.

ZoomBot (Linkbot 1.0 http://suite.seozoom.it/bot.html)

I've posted the archive of http://suite.seozoom.it/bot.html below, as the link now seems to be dead. I've blocked ZoomBot from my web properties because it adds little value.

ZoomBot

Version: 1.0
Bot Type: Good (Identifies itself, has an official moniker)
Category: Marketing
Obeys Robots.txt: Yes
User-agent string: ZoomBot (Linkbot 1.0 http://suite.seozoom.it/bot.html)

What is ZoomBot

What is ZoomBot? ZoomBot is a Web Crawler that powers the 2 billions link database for SEOZoom tool. It constantly crawls web to fill our database with new links and check the status of the previously found ones to provide the most comprehensive and up-to-the-minute data to our users. Link data collected by ZoomBot from the web is used by marketers in Italy to plan, execute, and monitor their online marketing campaigns.

What is ZoomBot doing on your website?

ZoomBot is crawling your website making notes of outbound links and adding them to our database. It will periodically re-crawl your website to check the current status of previously found links. Our crawler does not collect or store any other information about your website. It does not trigger ads on your website (if any) and won’t add numbers to your Google Analytics traffic.

It does respect Robots.txt?

Yes, We respect robots.txt, only with disallow rule.
You can block our bot using this simple rule:

User-Agent: ZoomBot
Disallow: *

Please note that ZoomBot may need some time to pick the changes in your robots.txt file. This will be made prior to each next scheduled crawl.

Please also note that if your robots.txt contains errors and ZoomBot won’t be able to recognize your commands it will continue crawling your website the way it did before.

If you think that ZoomBot is someway misbehaving on your website or if you have any questions about it, please don’t hesitate to contact our support team [email protected]
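
Before relying on a robots.txt rule, it can be checked locally. The sketch below uses Python's standard `urllib.robotparser`. Two things are assumptions on my part: it uses the conventional `Disallow: /` form rather than the `Disallow: *` shown in the archived page, and the helper name `is_blocked` and the example.com URL are made up. It only shows how Python's parser reads the rule, not how ZoomBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

ZOOMBOT_UA = "ZoomBot (Linkbot 1.0 http://suite.seozoom.it/bot.html)"

# Assumption: conventional "Disallow: /" form, not the archived "Disallow: *".
ROBOTS_TXT = """\
User-agent: ZoomBot
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True when the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/any/page"  # placeholder URL
    print("ZoomBot blocked:", is_blocked(ROBOTS_TXT, ZOOMBOT_UA, url))
    print("Googlebot blocked:", is_blocked(ROBOTS_TXT, "Googlebot", url))
```

robots.txt is only advisory, so a server-side block on the user-agent string is the stricter option for a crawler that ignores it.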

GoogleInitIc – Google Adsense Experiencing a loading issue

Google Adsense / DoubleClick seems to be experiencing occasional loading issues across domains. Users are seeing the text “GoogleInitIc(document.body,'10,10,10,10')” followed by a blank space in place of an advertisement. These issues appear widespread and could be impacting ad revenue.

GoogleInitIc loading error with blank ad space

Bulk IP-Address / Reverse DNS Lookup Tool

I’ve created this simple Google Sheet for running bulk reverse DNS / IP address lookups.

I primarily use this tool for auditing the top IP addresses connecting to my site. If you use this API for your own products, please include your website/contact in the URL.

The sheet works off of an API hosted by me so if you have any requests or if you just enjoy using the free tool, please leave me a comment below.
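
If you would rather run the lookups locally, something similar can be done with Python's standard library. This is not the sheet or my API, just a minimal sketch that asks your own resolver for PTR records. The function name `bulk_reverse_dns` and the injectable `resolver` parameter are choices made for this example.

```python
import socket


def bulk_reverse_dns(ips, resolver=socket.gethostbyaddr):
    """Map each IP address to its reverse-DNS hostname, or None on failure.

    `resolver` defaults to socket.gethostbyaddr and can be swapped out,
    for example with a stub when testing without network access.
    """
    results = {}
    for ip in ips:
        try:
            hostname, _aliases, _addresses = resolver(ip)
        except OSError:
            # socket.herror and socket.gaierror are subclasses of OSError
            hostname = None
        results[ip] = hostname
    return results


if __name__ == "__main__":
    # Addresses taken from the log excerpts elsewhere on this page.
    for ip, host in bulk_reverse_dns(["66.249.69.204", "64.233.172.141"]).items():
        print(ip, host or "(no PTR record)")
```

Lookups run one at a time here, so a long list of addresses will be slow. That is acceptable for auditing a handful of top talkers.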

Google-Certificates-Bridge User Agent .well-known/acme-challenge Requests

I have recently been seeing many requests from Google IPv4 and IPv6 addresses with the user agent “Google-Certificates-Bridge” accessing unique files within /.well-known/acme-challenge/XXXX.
A snippet from my Apache log:
64.233.172.141 - - [25/Dec/2018:23:30:30 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
64.233.172.145 - - [25/Dec/2018:23:30:40 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
64.233.172.143 - - [25/Dec/2018:23:30:50 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
64.233.172.146 - - [25/Dec/2018:23:31:00 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
2001:4860:4801:400a::35 - - [25/Dec/2018:23:31:10 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
64.233.172.144 - - [25/Dec/2018:23:31:20 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
2001:4860:4801:400a::19 - - [25/Dec/2018:23:31:30 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
66.102.8.40 - - [25/Dec/2018:23:31:39 +0000] "GET /.well-known/acme-challenge/XXXXXXXXXXXXX-XXXXXXXXXXXXX HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"
64.233.172.143 - - [25/Dec/2018:23:31:40 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"

These requests are part of the ACME HTTP challenge, which cPanel, Google and some other services use to verify control of a domain before issuing an SSL certificate for it. No need to worry: as long as the requests are coming from a familiar IP, this is not likely to be attack traffic.
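
One way to check for a “familiar IP” is to group the acme-challenge requests by source address and compare them against networks you already trust. The Python sketch below makes several assumptions: the function names are made up, and the two networks in the demo are placeholders chosen only to cover most of the addresses in the excerpt above. They are not an authoritative list of Google ranges.

```python
import ipaddress
import re
from collections import Counter

# Client IP at the start of the line, then a GET for an acme-challenge file.
ACME_RE = re.compile(
    r'^(?P<ip>\S+) .*"GET /\.well-known/acme-challenge/\S* HTTP/[\d.]+"'
)


def acme_requests_by_ip(lines):
    """Count acme-challenge requests per client IP."""
    counts = Counter()
    for line in lines:
        match = ACME_RE.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


def unfamiliar_ips(ips, trusted_networks):
    """Return the IPs that fall outside every trusted network."""
    networks = [ipaddress.ip_network(net) for net in trusted_networks]
    unknown = []
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr.version == net.version and addr in net for net in networks):
            unknown.append(ip)
    return unknown


if __name__ == "__main__":
    # Placeholder networks, NOT an official list: replace with ranges you trust.
    trusted = ["64.233.172.0/24", "2001:4860:4801::/48"]
    sample = [
        '64.233.172.141 - - [25/Dec/2018:23:30:30 +0000] "GET /.well-known/acme-challenge/LjaR-XXXXXXXXXXXXX-lgf6-QW8 HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"',
        '66.102.8.40 - - [25/Dec/2018:23:31:39 +0000] "GET /.well-known/acme-challenge/XXXXXXXXXXXXX-XXXXXXXXXXXXX HTTP/1.1" 404 - "-" "Google-Certificates-Bridge"',
    ]
    counts = acme_requests_by_ip(sample)
    print("unfamiliar:", unfamiliar_ips(sorted(counts), trusted))
```

Anything reported as unfamiliar is a candidate for a closer look, for example a reverse DNS lookup on the address.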

Adsense: Is this your site? We’ve detected your ad code on the site below…

Is this your site? We’ve detected your ad code on the site below. If it’s your site, click Yes to add it to your Sites.
Web caches, proxies, and translation services often appear as sites where Google Adsense has detected your ad code. Here is a list of services I’ve seen on my account:
  • translatoruser-int.com [Translate]
  • translate.google.com [Translate]
  • translate.google.ru [Translate]
  • translate.google.com.br [Translate]
  • translatoruser.net [Translate]
  • www.microsofttranslator.com [Translate]
  • web.archive.org [Cache/Archive]
  • www.translate.ru [Translate]
  • www.proxyit.cc [Proxy]
  • www.s-translation.jp [Translate]
  • cloudflare.works [Admin Configuration of Apps on Cloudflare]
  • yandex.ru [Translate and Cache]
  • dakwak.com [Translate]
  • Web caches and other [Google “Cache:”, other]

For my properties, I primarily receive this message in the Adsense console because translation services access the site and pull my Adsense code through to their front-end. Generally, it is not a good idea to add translation services, caches, and proxies to your Adsense account. Depending on the number of readers translating your site, you could gain a few extra percent of ad revenue, but this comes with some major risks which may outweigh the small revenue gain from these new domains.

Within the Adsense Sites configuration [Adsense > Sites > Overview] you can control the list of sites your code appears on. 

This feature was added as a way to protect your account from “malicious use of your ad code by others”. The sites in your sites list are the only sites that are permitted to use your ad code. If a site displaying your ad code is not on your list of sites, then no ads will show on that site.

Malicious use of your ad code could include generating false clicks on your site for the purpose of harming your Adsense account, revenue, and reputation. A malicious actor might include a competitor or someone else looking to harm your site for their own financial gain.

By enabling translation sites, caches, and proxies to display your ad code, you open your account up to having your ads displayed alongside content you might not control. Malicious actors could serve up your ad code alongside restricted content, creating negative marks on your Adsense account. Because proxies are known to be couriers of less desirable internet content (and thus are disallowed by the Adsense ToS), I would never risk adding a proxy domain to my Adsense account. Auto-translating sites are a risk as well due to the poor quality of the translations. In most cases, auto-translated content is considered low-quality by Adsense. Because caches are often a direct mirror of your content, they carry a smaller risk of being low-quality or malicious, but for most people the payoff likely doesn’t outweigh the risk.

/h/8913147.html in Google Analytics Spam

The page “/h/8913147.html” is part of a Google Analytics spam campaign published by get-seo-help.com.
The unusual URL is likely used to avoid being filtered by Google.

I’ve also seen the same HTML page being used for referrer spam from free-seo-consultation.com.
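
If you export your analytics rows and want to strip these hits out yourself, a small filter is enough. The Python sketch below rests on assumptions: the `/h/<digits>.html` pattern is generalized from the single URL above and the campaign may well use other shapes, the function name `is_spam_hit` is made up, and this is not Google Analytics filter syntax.

```python
import re

# Assumption: the numeric part varies between spam hits; only one example is known.
SPAM_PATH_RE = re.compile(r"^/h/\d+\.html$")

# Referrer domains named in this post.
SPAM_REFERRER_DOMAINS = {"get-seo-help.com", "free-seo-consultation.com"}


def is_spam_hit(page_path, referrer_host=""):
    """Return True when a hit matches the spam page pattern or a spam referrer."""
    host = referrer_host.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return bool(SPAM_PATH_RE.match(page_path)) or host in SPAM_REFERRER_DOMAINS


if __name__ == "__main__":
    rows = [
        ("/h/8913147.html", ""),
        ("/blog/post.html", "www.free-seo-consultation.com"),
        ("/blog/post.html", "example.com"),  # placeholder for a legitimate hit
    ]
    clean = [row for row in rows if not is_spam_hit(*row)]
    print(clean)
```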

 

Verification methods used: Unknown [Google Search Console]

If you purchased a domain via domains.google, you can add it to your Google Webmaster Tools / Search Console without performing any further verification (no Google Analytics / DNS / HTML file required).

It makes adding domains and subdomains to WMT / GSC super easy, but it also comes with a confusing “Unknown” domain verification information tag.

Verification Method Unknown