function readOnly(count){ }
Starting November 20, the site will be set to read-only. On December 4, 2023,
forum discussions will move to the Trailblazer Community.
+ Start a Discussion
earlbarbyearlbarby 

Querying object with 100k+ records: either get non-selective query error or hit governor limit

Hi,

We have a package that needs to query numerous contacts by email address. It could be as many as 50 email addresses. Typically, we use a bulk query like:

 

Set<String> EmailSet = new Set<String>();
//EmailSet is populated
[Select Id From Contact Where Email in :EmailSet];

 This works fine in orgs with fewer than 100k contacts, but we get non-selective query errors when there are more than 100k contacts. According to Salesforce, Contact Email is indexed, so the error is likely due to the IN operator. A (poor) alternative is to query a single email address at a time, which does successfully avoid the non-selective error, but then we quickly exhaust the 100 SOQL query governor limit since there is a lot of other processing going on aside from the contact retrieval. So we're stuck here - we can't use an efficient bulk query since it throws an error, and we can't use multiple selective queries since we hit the governor limit. Does anybody have advice on how to approach selecting a set of records from a 100k+ count object?

Thanks,
Earl

rocwilcoxrocwilcox

I assume you are using Batch Apex to process this, but are getting this error when creating the QueryLocator in the start method for the batch apex...

 

What criteria or logic did you use to build the list "Email" that has 100,000 entries in it?

What ever that criteria is cant you move all of that into this query?

 

Lets see how you build list called "Email"

 

 

earlbarbyearlbarby

This runs in trigger context as part of an email service. It's not a batch updater.

You're misunderstanding - I'm not building a list of 100k items. In my example, EmailSet might have 50 email addresses in it. The organization has 100,000 contacts total. I need to retrieve the 50 contacts whose email addresses match those in EmailSet.

rocwilcoxrocwilcox

Ah, Yes. Sorry... I'm with you now.

 

What table is this trigger on? is there any other way to get at the contact Ids?

 

If its just that the query form with "IN" is not supported well here.. how about something like:

 

String clause=' WHERE '

String soql = 'SELECT Id FROM Contact' ;

for (String s : Email)

{

  soql+=clause + 'Email = \'' + s;

  clause=' OR ';

}

 ....database.query(soql);

 

You can hit another limit of query length, but not until you have quite a few more emails in the list.

I cant say this will work, but worth a try to mess around with the query form and avoid IN??

 

 

 

 

 

rocwilcoxrocwilcox

woops.. forgot the closeing ' after the s..    +s + '\'';

but you get the point

 

earlbarbyearlbarby

Yes, I do know what you mean. I'm pretty sure we've tried building a dynamic query with multiple ORs for something else and that did not make it selective, but I will try again for this scenario.

cARL scARL s

I think yous problem is that ur set is all full up of too many strings to run a nonselective query up on it.  Try using a for loop up in the soql so u dont get so much all up in there.

earlbarbyearlbarby

Concatenating multiple OR conditions does not make the query selective.

rocwilcoxrocwilcox

Bummer.

So what object is this trigger on? is there any other way to get at the contact ids, and thus avoid the query on contacts.

We've fun into this querying from more than 100k rows before too, and have hit the same wall.

 

 

rocwilcoxrocwilcox

Reading about selectivity at http://www.salesforce.com/us/developer/docs/apexcode/Content/langCon_apex_SOQL_VLSQ.htm

Makes me wonder futher, as you say they indicate the field is indexed, but even if indexed are the criteria regarding selectivity and how many rows (10% of the 1st million) are being selected "over".   is it possilbe that one of the values in your set EmailSet has a null value, and lots of contacts dont have an email?

 

In your code go over and ensure that you dont query for any "null" value of email?

 

Just an idea...

Starz26Starz26

Where did you read that email is indexed???

 

According to this link email is not one of the indexed fields to use when querying large datasets:

 

http://www.salesforce.com/us/developer/docs/apexcode/Content/langCon_apex_SOQL_VLSQ.htm

 

Eaisest way around this would be to create a custom field named emailIDX__c, set it as external ID, and use a workflow to update it every time email changes. This should help you with your query by using the emailIDX field as the field to search in the query

earlbarbyearlbarby

Thanks for all the replies.


There are never nulls in the set of emails. It's definitely not selecting enough rows to hits 5%/10% threshold.

Contact Email is indexed according to salesforce developer support. It's possible that the tech meant that the specific  organization where this is an issue already has a custom index.

This problem occurs in a managed package. We couldn't do an external index with workflow as a general fix for use in the app.

I was able to rework the basic logic to call the query before the trigger runs and store the results in a static map that can be used later in the execution. This looks promising as a permanent solution as the error no longer occurs in a test org with 500k contacts. Dev support did not have a better solution than this and I suspect querying high count objects in triggers may not be possible in many use cases.

LVSLVS
earlbarby, this is an interesting post can you share how the static map works? not sure what you mean...
earlbarbyearlbarby

Sure, it's not complicated. The general flow of the process is that an email service runs, it adds some data, and then triggers run. Originally, the triggers were querying the data they needed, since that was naturally the most logical placement for those lookups. Now, through some creative re-engineering, I moved those queries into the email service itself and then store the data that the triggers will eventually need in a static class. This avoids running the non-selective error, since the queries are not running in trigger context.

LVSLVS
Pretty neat. Cheers! ~LVS