Arthur Lockrem 9

How does SOSL determine its results?

In a simple test, searching for portions of a record ID returns different results. The problem is that searching for the shorter portion of the record ID returns fewer records. How is it possible that the shorter string is not found in every record where the longer string is found?

Example:
FIND {a0rK0000004F} returns 416 records
FIND {a0rK0000004} returns 112 records

How is this possible?

Additional Details:
- Both searches are returning the same object type and fields
- This is being executed in Query Editor within Developer Console.
Shun Kosaka
Hi,
The SOSL algorithm is quite complex; it is what is called full-text search. It searches for a keyword as a meaningful unit of text.

The following is a simple example in a clean DE org.
FIND {Company} returns 4 records
FIND {Compa} returns 0 records
This means that the word "Company" is treated as more meaningful than "Compa". In your case, the search engine regards the former string (a0rK0000004F) as the more meaningful one.
Alain Cabon
Hi,

Did you try with the wildcards (* and ?)?

FIND {a0rK0000004*} or FIND {a0rK0000004?}

https://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_sosl_examples.htm

Regards,

Alain
Arthur Lockrem 9
Thank you Shun and Alain for your quick responses.

Alain - Yes. I tried with and without wildcards. The results were different, but similar. The search with fewer characters produced fewer results.

Shun - Your explanation is very interesting. I am hoping to receive confirmation from Salesforce that it is 100% correct. Given our current needs, it will be a problem if Salesforce is determining what information is meaningful, rather than finding matches based on the values I request.
Shun Kosaka
Hi Arthur,
There are two major approaches to full-text search:
1. Serial scanning (like Ctrl-F search)
2. Indexing (create something like dictionary before search)
SOSL uses the latter. It seems to run its queries on an engine based on Apache Solr.
https://www.igvita.com/2010/10/22/open-source-search-with-lucene-solr/
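
To make the difference between the two approaches concrete, here is a toy sketch in Python. It is not Salesforce's actual algorithm: the tokenizer rule, the sample records, and the function names (tokenize, build_index, index_search, serial_search) are all assumptions made up for illustration. It only shows why an index keyed on whole tokens can miss a partial string that a serial scan would find.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (an assumed, simplistic rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Build an inverted index: token -> set of record ids containing that token."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in tokenize(text):
            index[token].add(record_id)
    return index


def index_search(index, term):
    """Look up a whole token; a trailing '*' switches to prefix matching on tokens."""
    term = term.lower()
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for token, ids in index.items():
            if token.startswith(prefix):
                hits |= ids
        return hits
    return set(index.get(term, set()))


def serial_search(records, term):
    """Ctrl-F style scan: match the term anywhere inside the raw text."""
    term = term.lower()
    return {rid for rid, text in records.items() if term in text.lower()}


# Hypothetical sample data, not taken from any real org.
records = {
    1: "Acme Company",
    2: "Company picnic notes",
    3: "Compact disc order",
}
index = build_index(records)

print(serial_search(records, "Compa"))  # {1, 2, 3}: substring found everywhere
print(index_search(index, "Compa"))     # set(): "compa" is not a stored token
print(index_search(index, "Company"))   # {1, 2}
print(index_search(index, "Compa*"))    # {1, 2, 3}: prefix match on tokens
```

In this toy model a shorter string only matches when it happens to be a complete token (or when a wildcard is used), so a shorter search term is not guaranteed to return a superset of the longer term's results.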

But how it splits text into units and scores them is not public, and I doubt Salesforce will tell us...
If you know which objects or fields the data resides in, SOQL is a better choice than SOSL.

See also:
https://developer.salesforce.com/docs/atlas.en-us.salesforce_large_data_volumes_bp.meta/salesforce_large_data_volumes_bp/ldv_deployments_techniques_using_soql_and_sosl.htm