Google: Using RSS/Atom feeds to discover new URLs

djbaxter

Guest
Using RSS/Atom feeds to discover new URLs
By Maile Ohye, Google
October 30, 2009

Google uses numerous sources to find new webpages, from links we find on the web to submitted URLs. We aim to discover new pages quickly so that users can find new content in Google search results soon after they go live. We recently launched a feature that uses RSS and Atom feeds for the discovery of new webpages.

...

In order for us to use your RSS/Atom feeds for discovery, it's important that crawling these files is not disallowed by your robots.txt. To find out if Googlebot can crawl your feeds and find your pages as fast as possible, test your feed URLs with the robots.txt tester in Google Webmaster Tools.

...more
 
Thanks for posting this, minstrel. I am having a problem with my robots.txt file. I set up my first niche domain a few weeks ago and was waiting patiently for it to show up in Google (while posting on blogs and forums with my link).

Then I checked Google Webmaster Tools and it said "restricted by robots.txt"! It turns out that my WordPress blog was set to automatically block search engines. I changed my privacy settings so that they are now allowed. However, it has been several days and there is still no Google listing.

When I tested my robots.txt file as your post above suggests, this is what I got:
Allowed by line 2: Disallow:
Detected as a directory; specific files may have different restrictions


So, what do I do now?

I really appreciate any help!
Marie
 
Well, it is on a WordPress blog that I host on a domain. I am not sure where to find the file, as I can't see it in my cPanel.

What it says on the page in webmaster tools is this:

Code:
User-agent: *
Disallow:

Sitemap: http://www.paralyzed-dogs.askavetquestion.com/sitemap.xml.gz

Thanks!
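
For anyone who wants to check rules like these outside Webmaster Tools, here is a minimal sketch using Python's standard `urllib.robotparser`. It compares the file shown above (an empty `Disallow:`) with a blocking variant. The `Disallow: /` file is only my assumption of what a "block search engines" setting might produce, and the `/feed/` URL and the `can_crawl` helper are hypothetical names for illustration. This is the standard library's reading of the rules, not Googlebot's own parser.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The file as reported in the thread: an empty Disallow blocks nothing.
OPEN_RULES = """User-agent: *
Disallow:

Sitemap: http://www.paralyzed-dogs.askavetquestion.com/sitemap.xml.gz
"""

# Assumed example of a file that blocks every crawler from the whole site.
BLOCKING_RULES = """User-agent: *
Disallow: /
"""

# Hypothetical feed URL, used only to have something to test against.
FEED_URL = "http://paralyzed-dogs.askavetquestion.com/feed/"

print(can_crawl(OPEN_RULES, FEED_URL))      # True
print(can_crawl(BLOCKING_RULES, FEED_URL))  # False
```

So under these rules an empty `Disallow:` line allows everything, which matches the "Allowed by line 2" message from the tester.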
 
That looks fine to me, except I think that sitemap line should read:

Code:
Sitemap: http://paralyzed-dogs.askavetquestion.com/sitemap.xml.gz

Make sure the robots.txt file is in the root directory on your server, or if it's a subdomain as in your case, in the root directory of the subdomain.

Also double-check that you used a plain text editor to create or edit that file and that it is saved as plain text (ASCII), not Unicode or anything else.
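
If you are not sure how the file was saved, a rough check along these lines may help. It is a sketch only: it flags a byte order mark or any non-ASCII byte, which is the strict "plain ASCII" reading of the advice above, and the `plain_text_problems` name is made up for this example.

```python
import codecs


def plain_text_problems(data: bytes) -> list:
    """List reasons why `data` is not a plain ASCII text file (empty if none)."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("starts with a UTF-8 byte order mark")
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        problems.append("starts with a UTF-16 byte order mark")
    try:
        data.decode("ascii")
    except UnicodeDecodeError as err:
        problems.append(f"non-ASCII byte at offset {err.start}")
    return problems


# A clean file produces no complaints.
print(plain_text_problems(b"User-agent: *\nDisallow:\n"))

# The same rules saved as "UTF-8 with BOM" by some editors.
print(plain_text_problems(codecs.BOM_UTF8 + b"User-agent: *\nDisallow:\n"))
```

To run it on a real file, read the bytes first, for example `plain_text_problems(open("robots.txt", "rb").read())`, where the path is whatever your local copy is called.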
 