Vamsi D

Processing more than one million records

I have an object with more than one million records. For the same object, I receive a file for processing every other day. My requirement is to compare all of the records (over a million) in the file with the existing records in the object (assume there is a unique key, AccountID, in both the object and the file). If any attribute value has changed, we have to update the existing record. What is the best way to achieve this? I know that we can process up to 50 million records using batch Apex, but how can I fetch more than 50,000 records from the object (over a million records) in the execute method for comparison? Also, are there any performance issues when processing this much data?
shashi lad 4
Why can't you use Data Loader to update those records with an Upsert call? Do you need to automate this?

thanks
shashi
shashi lad 4
Try this .....
Let me know if it works..

thanks
shashi

global class Batch_AccountMatching implements Database.Batchable<SObject>, Schedulable {

    /* *********************   Batchable methods below ********************* */

    // Batch start: a QueryLocator lets the batch cover up to 50 million records
    global Database.QueryLocator start(Database.BatchableContext bc) {
        String soql = getQuery();
        return Database.getQueryLocator(soql);
    }

    // Batch execute: called once per chunk (default 200 records)
    global void execute(Database.BatchableContext bc, List<Account> accountRecords) {

        // Comparison/update logic for this chunk goes here

    } // end of batch execute

    // Batch finish
    global void finish(Database.BatchableContext bc) {
        if (bc != null) {
            System.debug('finish, job id --> ' + bc.getJobId());
        }
    }

    // Schedulable entry point: launches the batch when the scheduled job fires
    global void execute(SchedulableContext sc) {
        Database.executeBatch(new Batch_AccountMatching());
    }

    // query (note the trailing space before the concatenated FROM clause)
    private static String getQuery() {
        String query = 'SELECT Id, Name '
                     + 'FROM Account';
        return query;
    }
}
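For completeness, a minimal sketch of how a batch class like this is launched from Apex (the scope size of 200 is an illustrative choice, not a requirement):

```apex
// Queue the batch; the optional second argument is the number of records
// passed to each execute() invocation (1 to 2,000; default 200).
Id jobId = Database.executeBatch(new Batch_AccountMatching(), 200);
System.debug('Queued batch job: ' + jobId);
```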
Vamsi D
Thanks for the reply, Shashi. I can't use Data Loader per my requirement, and I can't use a QueryLocator either. I have to use an Iterable in the batch because I have to process file records, which number more than 50K. So I want to query more than 50K records from the object (let's say Account) inside execute, or in some method other than start (which is used to process the records from the file). Is there any way to achieve this?
shashi lad 4
A batch class allows you to define a single job that can be broken up into manageable chunks, each processed separately. One example is if you need to make a field update to every Account in your organization. If you have 10,001 Account records in your org, this is impossible in a single transaction without some way of breaking it up.

So in the start() method, you define the query you're going to use in this batch context: 'SELECT Id FROM Account'. Then the execute() method runs, but it only receives a relatively short list of records (200 by default). Within execute(), everything runs in its own transactional context, which means almost all of the governor limits apply only to that block. Thus each time execute() runs, you get a fresh set of limits: 150 DML statements, 50,000 rows retrieved by SOQL, and so on. When that execute() is complete, a new one is instantiated with the next group of 200 Accounts and a brand-new set of governor limits. Finally, the finish() method wraps up any loose ends as necessary, like sending a status email.

So your batch that runs against 10,000 Accounts will actually be run in 50 separate execute() transactions, each of which only has to deal with 200 Accounts. Governor limits still apply, but only to each transaction, along with a separate set of limits for the batch as a whole.
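Since the batch class shown earlier also implements Schedulable, it could be run on a recurring schedule as well. This is a sketch under the assumption that the class provides a Schedulable execute(SchedulableContext) method that launches the batch; the job name and cron expression are just illustrations:

```apex
// Schedule the batch to run nightly at 2 AM (cron: seconds minutes hours
// day-of-month month day-of-week). The schedule shown is an example only.
System.schedule('Nightly account matching', '0 0 2 * * ?',
                new Batch_AccountMatching());
```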