function readOnly(count){ }
Starting November 20, the site will be set to read-only. On December 4, 2023,
forum discussions will move to the Trailblazer Community.
+ Start a Discussion
Fran KorbFran Korb 

SELECT DISTINCT

I am trying to build a pick list of countries from the Contact.MailingCountry field in a very large community (200,000+ members).
SOQL does not provide the SQL SELECT DISTINCT statement and there are several alternatives found on the web.
The following query fails with 'Too many query rows: 50001’ :
SELECT n.Member.Contact.MailingCountry c,  COUNT_DISTINCT(n.Member.Id) FROM NetworkMember n WHERE n.Member.IsActive = TRUE GROUP BY n.Member.Contact.MailingCountry LIMIT 50
Is there a better way to build a list of unique entries for a specific field in a large database that is optimized and performant?
Best Answer chosen by Fran Korb
James LoghryJames Loghry
Hi Fran,

First, try running the following query (Removing the first selected should reduce your result set just a bit)
SELECT Member.Contact.MailingCountry FROM NetworkMember n WHERE n.Member.IsActive = TRUE GROUP BY n.Member.Contact.MailingCountry LIMIT 50
I'm guessing you'll still hit the same error though, so you might want to try some other approach.

How about the following: Create a batchable class that writes the MailingCountry to a custom object?  Create a custom object with an External Id of say country, then have the batch job write to that object.  Make the external id case SENSITIVE if you'd like to determine if users are using different cases of countries.  Below is an example of what that batch job might look like.
 
public class BatchStateExport implements Database.Batchable<sObject>{

    public BatchStateExport() {
    }

    public List<Account> start(Database.BatchableContext bc) {
        return 
            [Select 
                Member.Contact.MailingCountry 
            From 
                NetworkMember 
            Where Member.IsActive = true];
    }

    public void execute(Database.BatchableContext BC, List<sOjbect> scope) {
        List<My_Country_Export__c> exports = new List<My_Country_Export__c>();
        for(NetworkMember nm : (List<NetworkMember>)scope){
            exports.add(
                new My_Country_Export__c(
                    Country__c=nm.Member.Contact.MailingCountry
                )
            );
        }

        //Upsert via external id for "uniqueness"... The external id should be case sensitive.
        upsert exports Country__c;
    }

    global void finish(Database.BatchableContext BC) {

    }
}

Once you have the custom object, then you can export it to excel or run reports (There should be a managable number of countries / records to work with). 

Also, you'll likely want to do some data cleanup, validation rules, etc to perhaps manage those countries a bit better in the future.

All Answers

Alain CabonAlain Cabon
Hi,

The 'Too many query rows: 50001’ is very hard to solve.
  • The "simple" workaround is a batch which will count the total numbers browsing all the data sorting by Member.Contact.MailingCountry (calculated and stored total values).
The big problem is the scope of this batch which is not suitable for evaluating total numbers (the scope can split the values of the same MailingCountry that makes things more difficult). The readings will not be longer than some minutes for some hundreds of thousands of records but the writings will be longer. 
  • Salesforce offers the big objects ( __b) and Async SOQL (pilot) with a lot of constraints and not sure that overcomes the standard limits of SOQL (where are the new limits of the async SOQL/big objects? I didn't find them)

Create a working dataset with coarse aggregations
https://resources.docs.salesforce.com/sfdc/pdf/big_objects_guide.pdf

Regards
Alain CabonAlain Cabon
  • Declarative Rollups for Lookups!​
Volume Considerations. For each rollup, there is a maximum of 50,000 child relation records that can be summarised each time child record/s insert/update/delete operations are made (which may process several configured rollups). The rollup processes children to rollup by their parent record relationship and an optional further filter if provided. Meaning so long as this relationship does not result in more than 50,000 child records per parent parent record it will be successful. Take a look at this blog post which describes some new configuration settings (see bottom of blog post) to help calibrate the tool when running the Scheduled or Calculate jobs to help work within the 50,000 row limit.

https://github.com/afawcett/declarative-lookup-rollup-summaries
  • Trigger for calculated and stored total values: each time there is an DML operation (very expensive, real world?).
James LoghryJames Loghry
Hi Fran,

First, try running the following query (Removing the first selected should reduce your result set just a bit)
SELECT Member.Contact.MailingCountry FROM NetworkMember n WHERE n.Member.IsActive = TRUE GROUP BY n.Member.Contact.MailingCountry LIMIT 50
I'm guessing you'll still hit the same error though, so you might want to try some other approach.

How about the following: Create a batchable class that writes the MailingCountry to a custom object?  Create a custom object with an External Id of say country, then have the batch job write to that object.  Make the external id case SENSITIVE if you'd like to determine if users are using different cases of countries.  Below is an example of what that batch job might look like.
 
public class BatchStateExport implements Database.Batchable<sObject>{

    public BatchStateExport() {
    }

    public List<Account> start(Database.BatchableContext bc) {
        return 
            [Select 
                Member.Contact.MailingCountry 
            From 
                NetworkMember 
            Where Member.IsActive = true];
    }

    public void execute(Database.BatchableContext BC, List<sOjbect> scope) {
        List<My_Country_Export__c> exports = new List<My_Country_Export__c>();
        for(NetworkMember nm : (List<NetworkMember>)scope){
            exports.add(
                new My_Country_Export__c(
                    Country__c=nm.Member.Contact.MailingCountry
                )
            );
        }

        //Upsert via external id for "uniqueness"... The external id should be case sensitive.
        upsert exports Country__c;
    }

    global void finish(Database.BatchableContext BC) {

    }
}

Once you have the custom object, then you can export it to excel or run reports (There should be a managable number of countries / records to work with). 

Also, you'll likely want to do some data cleanup, validation rules, etc to perhaps manage those countries a bit better in the future.
This was selected as the best answer
James LoghryJames Loghry
I should add, that you could just try doing that same export with data loader as well, but I originally had some concerns on the query size and how Excel might balk at the number of records returned.