function readOnly(count){ }
Starting November 20, the site will be set to read-only. On December 4, 2023,
forum discussions will move to the Trailblazer Community.
+ Start a Discussion
jhartjhart 

Getting data skew error on highly selective query

At certain clients, we are getting the dreaded "data skew" error for the following query:

[select Id from Task where EmailId__c in :msgIds]


This is called in a trigger, and msgIds will typically have only a single member (& definitely no more than 200).

Similarly, this query will generally return only a single Task - in other words, filtering by the EmailId__c field is highly selective.


However, at certain clients (those who have many Task objects), we are getting this error:

Error on create: CANNOT_INSERT_UPDATE_ACTIVATE_ENTITY: i.tAA_EmailHasAttachments: execution of AfterInsert

caused by: System.QueryException: Non-selective query against large object type (more than 100000 rows). Consider an indexed filter or contact salesforce.com about custom indexing.
Even if a field is indexed a filter might still not be selective when:
1. The filter value includes null (for instance binding with a list that contains null)
2. Data skew exists whereby the number of matching rows is very large (for instance, filtering for a particular foreign key value that occurs many times)

 

Our filter list does not include the value null, and the number of matching rows is quite small, so it appears to be an indexing issue.

A couple things of note:

a.  This field holds IDs, but is defined as a simple text field, because Tasks cannot have custom lookup fields (why?)

We would certainly prefer it be a lookup field.  Perhaps lookup fields are auto-indexed and wouldn't have this problem.


b.  Our packaged application cannot require this field be indexed.

When defining the custom field, the only way to index it is to mark it as an External ID, but that requires a unique value, which this field is not.


 


Questions:

 

Is our only option contacting each of our effected customers and asking them to ask salesforce to add a custom index?

Are there any plans for letting us define custom fields which are indexed but not unique?

Are there any plans for letting Tasks have custom lookup fields?



Salesforce support - I have created case 05569974 to track this issue.

jhartjhart

Answers from salesforce support:

 

1.Is our only option contacting each of our effected customers and asking them to ask salesforce to add a custom index?
[A] For fields that you cannot define as an externalID (see 2), yes, the customer will need to contact salesforce support to request that the field be indexed.

1a. Will we, as certified partners, be able to request this custom index on our customers' behalf? That would be vastly preferable to making them do it.
[A] You will need to mention this to your account rep for the future. This can be setup to show on your account so that future cases will have it noted.
In general, we always have to get permission from customers before making changes to their orgs.

2. Are there any plans for letting us define custom fields which are indexed but not unique?
[A] The only mechanism available to custom index a custom field yourself is to designate it as an externalId field.

Additionally, you are right, lookup fields are automatically indexed, but...

3. Are there any plans for letting Tasks have custom lookup fields?
[A] At this time, this feature is not yet set for consideration.
There is an existing idea for this feature to vote on.
http://success.salesforce.com/ideaView?c=09a30000000D9xtAAC&id=08730000000BrPdAAK

 

 

So, there you go.  No known plans for (2) or (3), and thus we must ask customers to ask salesforce to add the index per (1).

jhartjhart

Interesting note: ExternalIDs fields do _not_ have to be unique, apparently.

 

Our support engineer relates the following information:

 

[A] External ids can be unique or not unique.  When creating an external id field, it is optional to mark as unique.
  If not set as unique, then an upsert could potentially cause an exception if there is more than 1 matching record for the externalid field and the field is indicated as the matching field for upsert operation.

E.G
 Custom text field test__c is externalid and not unique. 
task t1 = new task(test__c ='Hello');
task t2 = new task(test__c ='Hello');

insert t1;
upsert t2 test__c;//  This will throw a DmlException "DUPLICATE_EXTERNAL_ID, test: more than one record found for external id field".    

or
insert t1;
upsert t2 ; // This will not throw an exception as the external field has not been indicated as matching field


now, if the externalid field is marked as unique then:
insert t1;
upsert t2 ; // This will always throw a DmlException ,
regardless if the matching field is used or not, as the field is unique. 

 

This is news to me - the text in the Custom Field definition editor led me to believe that ExternalID fields were required to be unique (regardless of the 'unique' checkbox):

 

 

 

This is really good news - we can update our package to fix the issue, rather than forcing our customers to request custom indexes.