Keith Stephens 18

Find Null Fields

Hello All,
I need a way to search the Salesforce database for fields or objects that were created but never used, so they contain no data.
In other words, if you were to select from these tables and fields, everything would be null.

Thanks,
Keith.
Raj Vakati

You can use Salesforce reports: create a report on each object and filter the data on null values.

Or you can export the data to external/CSV files and analyze the values there.
As far as I know, there is nothing native to Salesforce for this.


 
Keith Stephens 18
I have not tried the report or CSV route yet, but what I am looking for is a tool or script that can do this. I have a tool for SQL Server that, when you select a database, loops through all the tables and all the fields of each table and reports every table name and field name that is NULL and has no data.
Alain Cabon
Hi,

I have written some programs for this problem, and the easiest way is to read all the exported CSV files (Setup > Data Export).

Any CSV reader is sufficient: https://commons.apache.org/proper/commons-csv/

Comparison: https://github.com/uniVocity/csv-parsers-comparison
You can always read big exported CSV files (no time out, no governor limits).
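To illustrate that approach, here is a minimal sketch in Python (the function name and file layout are my own; it assumes a standard comma-separated export with a header row) that streams a CSV once and keeps only the columns that are empty on every row:

```python
import csv

def find_empty_columns(path):
    """Return, sorted, the CSV columns whose value is empty on every row."""
    with open(path, newline='') as f:
        reader = csv.DictReader(f)
        candidates = None  # columns that are still empty so far
        for row in reader:
            if candidates is None:
                candidates = set(reader.fieldnames)
            # Drop every candidate that has a non-empty value on this row.
            candidates = {col for col in candidates if not row[col]}
            if not candidates:
                break  # every column has data somewhere; stop early
    return sorted(candidates if candidates else [])
```

Because the file is read row by row, this works on exports of any size with constant memory, and it stops early as soon as every column has been seen with data.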

What is the maximum number of records of your objects and the number of custom fields?

With few custom fields and some hundreds of thousands of rows (up to about one million), dynamically generated SOQL queries over REST API calls (this can also be done in Apex) are sufficient: select myfield1__c from myObject__c where myfield1__c != null limit 1. You can nevertheless hit a timeout if you have big objects (not declared as big objects). When I launched parallel calls in Java (10 queries at a time, for instance), the responses were incredibly fast (the Salesforce database engine is impressive and very powerful despite the governor limits).
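The query generation above can be sketched in Python; the helper name is mine, and the simple_salesforce client shown in the comments is just one possible way to send the calls (it assumes you have API credentials):

```python
def null_check_query(sobject, field):
    """Build a SOQL probe: one returned row means the field holds data
    somewhere in the org; zero rows means it is empty everywhere."""
    return "SELECT Id FROM {0} WHERE {1} != null LIMIT 1".format(sobject, field)

# Hypothetical usage with the simple_salesforce library (pip install simple-salesforce):
# from simple_salesforce import Salesforce
# sf = Salesforce(username=..., password=..., security_token=...)
# fields = [f['name'] for f in sf.Account.describe()['fields']]
# empty = [f for f in fields
#          if sf.query(null_check_query('Account', f))['totalSize'] == 0]
```

The LIMIT 1 keeps each probe cheap: the database only has to find a single non-null row per field, which is why the parallel calls come back so fast.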

AppExchange tools: they are rarely free when powerful, or the free version is limited in practice (free and limited amount to the same thing), or they are buggy (crashes caused by governor limits or timeouts, which are very difficult to overcome with synchronous requests from a screen and overly simple algorithms). It is still worth testing these "free" tools if you do not have too many records and you process one object at a time (it works, but it is slow and involves many manual operations).

Field Trip (powered by RingLead): Run reports on standard and custom field usage.
Ever wish you could run reports on the fields you have in Salesforce? Take a Field Trip! This utility lets you analyze the fields of any object, including what percentage of the records (or a subset of your records) have that field populated.
https://appexchange.salesforce.com/appxListingDetail?listingId=a0N30000003HSXEEA4
 
Alain Cabon
I have also tried a tool called Pandas (https://pandas.pydata.org/).

In theory, it should be the "best" tool for analyzing CSV files thanks to its dataframes, but with really big files you have to complicate the queries (reading by chunks and accumulating the results), and a simple CSV reader is sufficient when you don't need complicated statistics.

I used Pandas, for example, to get the most frequently used values per field. Pandas is useful for producing this kind of result directly from CSV files (including with filters) with short Python code.
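For instance, a sketch of that "most used values" idea with a made-up Industry column (real data would come from pd.read_csv('Account.csv'); value_counts excludes nulls by default):

```python
import io
import pandas as pd

# Hypothetical extract standing in for a real Account.csv export.
csv_data = io.StringIO(
    "Id,Industry\n"
    "1,Technology\n"
    "2,Technology\n"
    "3,Energy\n"
    "4,\n"          # a null value, dropped by value_counts
    "5,Technology\n"
)
df = pd.read_csv(csv_data)

# Most frequent values of the column, most common first.
top = df['Industry'].value_counts().head(5)
print(top)
```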
 
Alain Cabon
A short example with Pandas and Account.csv

With a few lines of Python using pandas, you can directly get the list of all the empty columns.

account_columns_null.py
import warnings
warnings.filterwarnings("ignore")

import pandas as pd

pd.set_option('display.max_info_columns', 500)

filename = 'Account.csv'

mychunksize = 100000

# System fields that are always populated, so not worth checking.
skipped = ['Id', 'Name', 'IsDeleted', 'CreatedDate', 'CreatedById',
           'RecordTypeId', 'ParentId', 'LastModifiedDate',
           'LastModifiedById', 'SystemModstamp']

null_columns = None
for chunk in pd.read_csv(filename, chunksize=mychunksize,
                         usecols=lambda col: col not in skipped):
    chunk.info()
    empty_here = list(chunk.columns[chunk.isnull().all()])
    if null_columns is None:
        null_columns = empty_here
    else:
        # Keep only the columns that were also empty in every previous chunk.
        null_columns = [c for c in null_columns if c in set(empty_here)]

for col in null_columns:
    print("Column: " + col)

print('fin')

Pandas automatically analyzes all the columns of its dataframe (except those excluded by the usecols filter: col not in ['Id', 'Name', ...]).
 
c:\panda>account_columns_null.py

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 136 entries, 0 to 135
Data columns (total 62 columns):
MasterRecordId             0 non-null float64
Type                       117 non-null object
BillingStreet              129 non-null object
BillingCity                129 non-null object
BillingState               78 non-null object
BillingPostalCode          129 non-null object
BillingCountry             132 non-null object
BillingLatitude            0 non-null float64
BillingLongitude           0 non-null float64
BillingGeocodeAccuracy     0 non-null float64
ShippingStreet             129 non-null object
ShippingCity               129 non-null object
ShippingState              78 non-null object
ShippingPostalCode         128 non-null object
ShippingCountry            132 non-null object
ShippingLatitude           0 non-null float64
ShippingLongitude          0 non-null float64
ShippingGeocodeAccuracy    0 non-null float64
Phone                      132 non-null object
Fax                        118 non-null object
AccountNumber              1 non-null float64
Website                    129 non-null object
Sic                        0 non-null float64
Industry                   120 non-null object
AnnualRevenue              129 non-null float64
NumberOfEmployees          129 non-null float64
Ownership                  0 non-null float64
TickerSymbol               0 non-null float64
Description                0 non-null float64
Rating                     1 non-null object
Site                       0 non-null float64
OwnerId                    136 non-null object
LastActivityDate           3 non-null object
IsExcludedFromRealign      136 non-null int64
Jigsaw                     0 non-null float64
JigsawCompanyId            0 non-null float64
CleanStatus                136 non-null object
AccountSource              0 non-null float64
DunsNumber                 0 non-null float64
Tradestyle                 0 non-null float64
NaicsCode                  0 non-null float64
NaicsDesc                  0 non-null float64
YearStarted                0 non-null float64
SicDesc                    0 non-null float64
DandbCompanyId             136 non-null object
CustomerPriority__c        0 non-null float64
SLA__c                     3 non-null object
Active__c                  3 non-null object
NumberofLocations__c       1 non-null float64
UpsellOpportunity__c       0 non-null float64
SLASerialNumber__c         3 non-null float64
SLAExpirationDate__c       3 non-null object
ginid__c                   3 non-null float64
ZC__c                      2 non-null object
rrpu__Alert_Message__c     0 non-null float64
City__c                    1 non-null object
Country__c                 1 non-null object
CompletionValue__c         0 non-null float64
Status__c                  13 non-null object
Support_Level__c           0 non-null float64
Subcategories__c           1 non-null object
ListeBilingue__c           1 non-null object

dtypes: float64(32), int64(1), object(29)

memory usage: 65.9+ KB

Column: MasterRecordId
Column: BillingLatitude
Column: BillingLongitude
Column: BillingGeocodeAccuracy
Column: ShippingLatitude
Column: ShippingLongitude
Column: ShippingGeocodeAccuracy
Column: Sic
Column: Ownership
Column: TickerSymbol
Column: Description
Column: Site
Column: Jigsaw
Column: JigsawCompanyId
Column: AccountSource
Column: DunsNumber
Column: Tradestyle
Column: NaicsCode
Column: NaicsDesc
Column: YearStarted
Column: SicDesc
Column: CustomerPriority__c
Column: UpsellOpportunity__c
Column: rrpu__Alert_Message__c
Column: CompletionValue__c
Column: Support_Level__c

fin

c:\panda>

It is the shortest way, but you cannot read gigabytes of data directly with Pandas, and reading by chunks complicates the final results.