sam samm

Help with robots.txt

Google is not showing my website's description, even though I added a robots.txt file that allows everything.

This is what Google is showing:
A description for this result is not available because of this site's robots.txt – learn more.

Also, this is the robots.txt file I am using:
User-agent: *
Disallow:

Any ideas?
Amit (Salesforce Developers)
You can upload a custom robots.txt via the Metadata API. Here are the steps:

1. Install the Force.com IDE.

2. Create a Visualforce page that serves the robots.txt content you want:

<apex:page contentType="text/plain" showHeader="false">
User-agent: msnbot
Disallow: /
</apex:page>

3. Add the following to your package.xml:

<types>
<members>*</members>
<name>CustomSite</name>
</types>

4. Run "Refresh from Server" so that the sites appear in your project.

5. Open your site file under the "sites" folder, add a "robotsTxtPage" entry, and save:

<?xml version="1.0" encoding="UTF-8"?>
<CustomSite xmlns="http://soap.sforce.com/2006/04/metadata">
<active>true</active>
<authorizationRequiredPage>Unauthorized</authorizationRequiredPage>
<bandwidthExceededPage>BandwidthExceeded</bandwidthExceededPage>
<changePasswordPage>ChangePassword</changePasswordPage>
<fileNotFoundPage>FileNotFound</fileNotFoundPage>
<inMaintenancePage>InMaintenance</inMaintenancePage>
<indexPage>IdeasHome</indexPage>
<masterLabel>First</masterLabel>
<portal>Customer Portal</portal>
<requireInsecurePortalAccess>false</requireInsecurePortalAccess>
<robotsTxtPage>RobotsTxt</robotsTxtPage>
<siteAdmin>mylogin@myco.com</siteAdmin>
<subdomain>mydomain</subdomain>
</CustomSite>

6. Now http://mydomain.force.com/robots.txt will return the contents of your Visualforce page.
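
Once the page is deployed, the rules it serves can be checked locally. The following is a minimal Python sketch, not part of the Salesforce tooling, that parses the rules from step 2 with the standard library's urllib.robotparser and reports which crawlers they block. The helper name is_allowed and the choice of Googlebot as a second crawler are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the Visualforce page in step 2.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    site = "http://mydomain.force.com/"
    # msnbot is named in the rules; Googlebot is an assumed second crawler.
    for agent in ("msnbot", "Googlebot"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent, site) else "blocked")
```

To check the deployed file instead of a local string, RobotFileParser also provides set_url() and read(), which download robots.txt from the site before can_fetch() is called.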


You can also use your own custom favicon. Here are the steps:

1. Create a 16x16 icon.
2. Upload it as a static resource and make sure the cache control attribute is set to Public.
3. Disable the standard header in your Visualforce page and reference the favicon static resource in the page as follows:

<apex:page showHeader="false" >
...
...
<link rel="shortcut icon" href="{!URLFOR($Resource.favicon)}" />
.....
</apex:page>
sam samm
After Google indexed the site, it now shows on Google but not in Bing search.
Amit (Salesforce Developers)
Hello,

Please use the following in your VF page to allow all bots to crawl your website.

<apex:page contentType="text/plain" showHeader="false">
User-agent: *
Disallow:
</apex:page>
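
The difference between an empty Disallow line and "Disallow: /" is easy to get backwards. As a rough illustration (a Python sketch using the standard urllib.robotparser module; the helper name crawler_may_fetch, the agent name, and the sample path are assumptions), the rules above allow every crawler, while adding a slash blocks every crawler:

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty Disallow: nothing is blocked
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # "/" is a prefix of every path


def crawler_may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# "bingbot" and "/any/page" are placeholder values for the comparison.
for label, rules in (("allow-all", ALLOW_ALL), ("block-all", BLOCK_ALL)):
    print(label, crawler_may_fetch(rules, "bingbot", "/any/page"))
```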

Please use the following link to submit your website to Bing:
http://www.bing.com/toolbox/submit-site-url

The following Robots Database lists robot software implementations and operators:

http://www.robotstxt.org/db.html

Changing Bing's Crawl Settings

Microsoft allows you to set the crawl rate for your site and even allows you to specify the crawl rate by hour of the day (a crawl pattern). This allows you to specify that Bing crawl your site more aggressively during non-peak hours and less aggressively while a majority of your visitors are online.

Note: Before specifying a crawl pattern, you may want to view your web statistics in cPanel to see when most of your visitors are online.
To change Bing's crawl settings:
  1. Log into Bing Webmaster Tools.
  2. Add your site to Bing Webmaster Tools if you have not already done so.
  3. Click on the Crawl tab in the top navigation.
  4. Click on Crawl Settings in the left navigation.
  5. Select your website from the drop-down list (if not already selected).
  6. Select the radio button Use the selected crawl rate below to set a custom crawl rate and pattern.
  7. Use the graph to specify the crawl rate and crawl pattern by dragging the graph areas to the desired settings. You can click on individual vertical columns to set the crawl rate individually by hour.
  8. If your website is built in AJAX, select the check box labeled Configure Bing for Ajax site crawling so that Bing handles your website properly.
  9. Click Save to save your settings.

Please refer to the following link for more information on Bing's crawler:

http://www.bing.com/blogs/site_blogs/b/webmaster/archive/2012/05/03/to-crawl-or-not-to-crawl-that-is-bingbot-s-question.aspx

You can also use the following link to generate a robots.txt file:

http://www.mcanerin.com/en/search-engine/robots-txt.asp

A robots.txt file can also be checked at:
http://phpweby.com/services/robots
http://www.frobee.com/robots-txt-check
http://webmaster.yandex.com/robots.xml

I hope this helps.

Thanks,
Amit Bhardwaj


sam samm
Thanks for the response, Amit.

Also, my website contains some sensitive user data.

Can I give permission for just one page in the whole org, rather than allowing everything?
I don't want to give permission to any of the objects, static resources, or custom settings.

Waiting eagerly for the answer.


sam samm
For example, my home page is named myhomepage.page.

I just want the crawler to access this page, and not the pages shown after a user logs in or their data.
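
One possible shape for such a rule set, offered as a sketch rather than a tested Salesforce configuration, is to allow the home page path and disallow everything else. The Python check below uses the standard urllib.robotparser module. The paths /myhomepage and /accountdetails are assumptions about how the pages are exposed on the site, and the helper name home_only_allows is hypothetical. The Allow directive is supported by Google and Bing.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: only the home page is crawlable.
# The Allow line comes first because this parser applies the first matching rule.
HOME_ONLY = """\
User-agent: *
Allow: /myhomepage
Disallow: /
"""


def home_only_allows(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the HOME_ONLY rules."""
    parser = RobotFileParser()
    parser.parse(HOME_ONLY.splitlines())
    return parser.can_fetch(agent, path)


# /accountdetails stands in for any page shown after login (assumed path).
for path in ("/myhomepage", "/accountdetails"):
    print(path, home_only_allows("Googlebot", path))
```

Note that robots.txt only asks well-behaved crawlers to stay away; it does not restrict access. Pages that show user data still need to be protected by login and by the permissions granted to the site's guest user.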