emilyjones

robots.txt vs. robots??

I am trying to submit my sitemap to Google webmaster tools, and am having some issues with the robots.txt functionality.

 

 

I created a VF page called robots with this as the source:

 

 

<apex:page contentType="text/plain">
User-agent: *
Disallow:
Sitemap: www.innovationaero.com/sitemap
</apex:page>

 

Then I linked it to the site using the robots.txt field.

 

My site info is:

 

Default web address: innovationaero.force.com

Custom web address: www.innovationaero.com

 

 

Now, here comes the problem. These 2 files are different:

www.innovationaero.com/robots.txt (disallows everything)

www.innovationaero.com/robots (allows everything)
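One way to confirm which of the two responses actually blocks crawlers is to feed each body to Python's standard `urllib.robotparser`. This is only a sketch: the two sample bodies are assumptions standing in for what each URL returned (an empty `Disallow:` allows everything, `Disallow: /` blocks everything), and `allows_all` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def allows_all(robots_body: str, user_agent: str = "*", path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, path)


# Assumed sample bodies, not copied from the live site.
ALLOW_ALL = "User-agent: *\nDisallow:\n"      # empty Disallow: nothing is blocked
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"  # "/" blocks the whole site

if __name__ == "__main__":
    print("allow-all body permits crawling:", allows_all(ALLOW_ALL))
    print("disallow-all body permits crawling:", allows_all(DISALLOW_ALL))
```

To check a live site, paste the text fetched from each URL into `allows_all` in place of the sample strings.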

 

 

How do I change the default robots.txt file so that Google will see my site?

 

Thanks!
Emily

All Answers

Ryan-Guest

It can take up to 24 hours for the robots.txt file to update from the global cache.

Ryan-Guest

As a hack, you can append a query string to the URL so the request bypasses the cached copy, and then validate that the robots.txt file is correct.

 

Example:

 

http://www.innovationaero.com/robots.txt?1
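The same trick can be scripted. The sketch below builds a URL with a throwaway query parameter; it assumes the cache keys on the full URL including the query string, which the thread suggests but does not confirm. The helper name `cache_busted` and the `_` parameter name are hypothetical choices, not anything Salesforce defines.

```python
import time
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, token: Optional[str] = None) -> str:
    """Return url with an extra "_" query parameter appended.

    If no token is given, the current Unix timestamp is used so that
    each call yields a URL the cache is unlikely to have seen before.
    """
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("_", token if token is not None else str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(cache_busted("http://www.innovationaero.com/robots.txt"))
```

Opening the printed URL in a browser should show the current file rather than the cached one, if the assumption about the cache holds.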

Bulent

I can't seem to locate www.innovationaero.com

emilyjones

That is the correct address:

 

www.innovationaero.com

 

Will you retry it?

Bulent

nslookup returns no results

emilyjones

When I look it up, it says the CNAME is innovationaero.force.com

 

This is correct, right?

Ryan-Guest

The two URLs:

 

http://www.innovationaero.com/

 

and

 

http://www.innovationaero.com/robots.txt

 

work fine on my machine. It might be a problem with Bulent's DNS server.

 

I also see that your DNS information is correct:

 

 

 

 

 

rguest@rguest-ws:~$ nslookup www.innovationaero.com
Server: 10.0.11.2
Address: 10.0.11.2#53
Non-authoritative answer:
www.innovationaero.com canonical name = innovationaero.force.com.
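For anyone repeating this check, the CNAME can be pulled out of saved `nslookup` output with a few lines of Python. This is a rough sketch that assumes the output uses the `canonical name =` wording shown above; `cname_target` is a hypothetical helper, and `SAMPLE` is the output from this post.

```python
from typing import Optional

SAMPLE = """Server: 10.0.11.2
Address: 10.0.11.2#53
Non-authoritative answer:
www.innovationaero.com canonical name = innovationaero.force.com.
"""


def cname_target(nslookup_output: str, hostname: str) -> Optional[str]:
    """Return the CNAME target reported for hostname, or None if absent."""
    for line in nslookup_output.splitlines():
        name, sep, target = line.partition("canonical name =")
        # sep is empty when the line does not contain the marker text.
        if sep and name.strip().rstrip(".").lower() == hostname.lower():
            # Drop the trailing dot of the fully qualified name.
            return target.strip().rstrip(".").lower()
    return None


if __name__ == "__main__":
    print(cname_target(SAMPLE, "www.innovationaero.com"))
```

The custom web address is set up as expected when the returned target matches the site's default `force.com` address.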

 


Bulent

Ryan is correct.

Now it works fine for me as well.

 

 

emilyjones

But the robots.txt file is disallowing bots, whereas I want to allow all of them. The robots VF page I set up allows everything, but I can't figure out how to change robots.txt, which seems to be generated by Salesforce.

Bulent

This is what I see when I browse to your site's robots.txt:

 

User-agent: *
Disallow:
Sitemap: www.innovationaero.com/sitemap

It looks right.

 

emilyjones

Woohoo! It finally worked. Looks like it just took some time to override the Salesforce default. It took about a day, FYI.

 

Thank you all for your help!

This was selected as the best answer
sarvesh001
Hi Emily,

How can I add the GWT HTML verification page to a community so that GWT can confirm verification?
I am not able to find a way to add this HTML page to the community.

Please help me.


Thanks,
Sarvesh.