dev401has
Robots.txt is not available in Domain
Hi
I have prepared a site http://nbcuni.force.com/commops
My Force.com domain name is nbcuni.force.com
I have prepared a robots.txt file (a VF page which allows all bots) and assigned it in the Site Robots.txt field.
That robots.txt is visible at http://nbcuni.force.com/commops/robots.txt, but it should be visible at http://nbcuni.force.com/robots.txt (at the domain root).
In Google Webmaster Tools the bot crawls http://nbcuni.force.com/robots.txt, is unable to find the file, and hence reports a crawl error (503).
How can I get robots.txt served at http://nbcuni.force.com/robots.txt?
Create another site, but leave the Default Web Address field empty, and add the robots file there. The same goes if you want to properly support favicon.ico.
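A minimal Visualforce page for serving a permissive robots.txt might look like the following. This is a sketch: the `contentType` and `showHeader` attributes make the page render as plain text without the standard Salesforce chrome, and the allow-all rules match what the original poster described; the page name and exact rules are up to you.

```
<apex:page contentType="text/plain" showHeader="false">User-agent: *
Allow: /
</apex:page>
```

You would then assign this page in the Site Robots.txt field on the site that has no path in its Default Web Address.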
Hi Paul
Thanks for the reply.
I thought it would be served directly instead of requiring a new site.
I am doing this in a sandbox, so it will work. I will create a new site and add the robots file there, but I'm curious: what about Developer Edition? We don't have access to create more than one site in Developer Edition.
-Has
NBC Commercial Guidelines
As far as dev orgs go, if you didn't create the site the same way as the new one you created in sandbox (with no path), you won't be able to properly support robots.txt in it.
But you can't create a site that has the same default web address, even if you remove the custom web address.
Did anyone solve this problem yet?
I didn't create another site; I kept it as it is. My robots.txt is served under the site path instead of the main domain, but it works fine. The Google crawler started crawling the site.
Just make sure the original Salesforce robots.txt, which blocks all crawlers, is removed, and that you add your own robots.txt to your site which allows bots.
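For reference, the blocking default mentioned above looks roughly like this (approximate contents; the exact file Salesforce serves may differ):

```
User-agent: *
Disallow: /
```

Replacing it with the allow-all rules from your own VF page is what lets crawlers in.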
rgds
dev401has
NBCU Commercial Guidelines
How do you remove their robots.txt file?
I just added my robots VF page in the Site Robots.txt field; the default one was removed automatically.
rgds
dev401has
NBCU Commercial Guidelines
Do you know if it happened right away or if it took a while?
Search engines look for robots.txt at the root level.
So if you are not masking your force.com site URL with your custom URL, then you need to set up a site with no path to serve your robots.txt.
Also, it can take up to 24 hours for the cache to clear and reflect your robots.txt and favicon.ico;
these files are cached for 24 hours.
When you say "masking your force.com site url with your custom url", you mean just adding the custom url to the site settings? Or are there some other DNS settings I need to change?
let's say your site url is mysite.force.com/partners/
and you want to mask this URL with your custom domain or subdomain, like partners.mydomain.com.
First you need to create a CNAME (at your domain name provider's portal) pointing partners.mydomain.com to mysite.force.com (note that all the CNAMEs point to the same force.com subdomain, not the actual site URL).
This might take up to 24 hours to propagate globally. Once the CNAME is in place, navigate to the specific site's detail page and update the Custom Web Address value with your custom domain (partners.mydomain.com).
Now you can access your site via http://partners.mydomain.com, via http://mysite.force.com/partners/, or via the secure URL https://mysite.secure.force.com/partners/.
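At most DNS providers, the CNAME step described above could be expressed as a zone-file entry like this (a sketch using the hypothetical names from the example; the TTL value is an assumption):

```
; point the subdomain at the Force.com host, not at the site path
partners.mydomain.com.  3600  IN  CNAME  mysite.force.com.
```

The record targets the bare force.com subdomain because, as noted above, a CNAME cannot carry a URL path.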
Thank you for your help. I have created a CNAME (about 2 weeks ago, so I know it has propagated) and am still having issues with the following:
http://boards.developerforce.com/t5/Force-com-Sites/robots-txt-vs-robots/td-p/208567