How to Prevent and Solve Data Discrepancy Issues in Excel?

In this article, we discuss data discrepancy issues: what data discrepancy is, how to prevent it, and how to maintain more accurate data in Excel and in other tools.

As most businesses still store a significant amount of data in Excel, we will mainly focus on solving data discrepancies in Excel.

Table of Contents

1. What’s Data Discrepancy?
2. What causes Data Discrepancy?
3. How to Prevent Data Discrepancies?
4. Excel Features to Detect and Solve Data Discrepancies
5. Final Words

First, let’s understand what data discrepancy is. Then we’ll explain how to prevent and remedy this data analysis issue.

1. What’s Data Discrepancy?

Data discrepancy is the lack of consistency between sets of data, data records, or data sources. It occurs when data does not match or align as expected. A data discrepancy is essentially the mismatch of data.

Let’s explain this with a very simple example. Suppose we work in an HR department and keep a list of employees in an Excel sheet, one employee per row. The employee John Brown might be recorded in many different ways: with an extra space inside the name, abbreviated as ‘John B.’, with a trailing space after ‘John Brown’, and so on.

Imagine our data consists of hundreds of rows and the same person is referred to differently in various places. When we want to consolidate this data, the same person will appear as if they are different individuals.

For instance, when we consolidate them in a pivot table, John Brown and John-Brown will be treated as two different people, which produces incorrect totals. We encounter such examples quite frequently, and companies often struggle with the inaccuracy of their data.
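
To make the effect concrete, here is a minimal Python sketch of the same situation. It is only an illustration under assumptions: the sample rows and the `normalize_name` helper are hypothetical, and the cleaning rule (hyphens become spaces, repeated spaces collapse, title case) is one possible choice that would not suit every name and would not merge abbreviations such as ‘John B.’.

```python
import re
from collections import Counter

def normalize_name(raw: str) -> str:
    """Replace hyphens with spaces, collapse repeated whitespace, trim, title-case."""
    cleaned = raw.replace("-", " ")
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned.title()

# Hypothetical rows: (employee name as typed, hours worked)
rows = [("John Brown", 8), ("John-Brown", 6), ("john  brown ", 4), ("Mary Smith", 7)]

raw_totals = Counter()    # grouped by the text exactly as typed
clean_totals = Counter()  # grouped by the normalized name
for name, hours in rows:
    raw_totals[name] += hours
    clean_totals[normalize_name(name)] += hours

print(len(raw_totals))     # distinct "people" before cleaning
print(dict(clean_totals))  # totals per person after cleaning
```

Grouping on the raw text splits one person into several groups, which is the same thing a pivot table does with inconsistent names.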

In this article, we’ll discuss what causes these issues, how to prevent them, and how to resolve them once they occur. Let’s first examine what causes them.

2. What causes Data Discrepancy?

Many things can cause data inconsistencies in your records. Here we’ll list some of the most common ones.

Human Error

Firstly, data discrepancy often arises from typographical errors and mistakes, essentially human errors. It’s not entirely possible to eliminate these human errors. If we don’t have a system in place to prevent or correct them, people will inevitably make these mistakes.

Getting upset with employees, individuals, or those responsible for data input is not particularly productive. Humans can always perceive things differently. Perhaps the person thought it was correct to write it that way. If we don’t have a system to identify these errors, it’s impossible to prevent human errors.

Therefore, using the systems and correction methods we will discuss in Section 3, we can prevent most of these human errors.

Data Import Issues

If we are importing data into Excel or our database program from another source, character encoding or inconsistent formatting can lead to discrepancies. For example, a period may be interpreted differently, or certain characters in the source might not exist in the target encoding, so strange, unfamiliar characters appear in their place. This corrupts our data.
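
The encoding problem is easy to reproduce. The short Python sketch below is an illustration only, with a hypothetical sample name: it encodes text as UTF-8 and then decodes the same bytes with the wrong codec (Latin-1), which is one common way such strange characters end up in imported data.

```python
# Hypothetical sample value containing non-ASCII characters
text = "Zoë Müller"

encoded = text.encode("utf-8")        # bytes as written by the source system
garbled = encoded.decode("latin-1")   # importer assumes the wrong encoding
restored = encoded.decode("utf-8")    # importer uses the correct encoding

print(garbled)   # accented letters turn into two-character sequences
print(restored)  # the original text
```

Checking which encoding the source file uses before importing avoids this class of discrepancy.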

Inconsistent Data Entry Standards

Discrepancies can arise when there are no standardized formats or protocols for entering data. For instance, one person might enter dates as DD/MM/YYYY, while another might use MM/DD/YYYY.
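
The sketch below shows, in Python, why mixed date conventions are dangerous: the same text parses to two different valid dates. The `is_ambiguous` helper is a hypothetical name introduced here for illustration, and it only considers the two formats mentioned above.

```python
from datetime import datetime

raw = "03/04/2024"
as_dmy = datetime.strptime(raw, "%d/%m/%Y").date()  # read as 3 April 2024
as_mdy = datetime.strptime(raw, "%m/%d/%Y").date()  # read as 4 March 2024

def is_ambiguous(value: str) -> bool:
    """True if the text is a valid but different date under both conventions."""
    parsed = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            parsed.add(datetime.strptime(value, fmt).date())
        except ValueError:
            pass  # not a valid date under this convention
    return len(parsed) > 1

print(as_dmy, as_mdy)
print(is_ambiguous("03/04/2024"), is_ambiguous("25/12/2024"))
```

Dates with a day above 12 can only be read one way, so they are a useful clue to which convention a given source follows.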

Formula Errors

If we are pulling data through formulas, there may be incorrect references. Autofill can also fill cells incorrectly, for example by extending a pattern we did not intend. Therefore, it’s important not to rely on autofill for critical data without checking the result.

Mistakes made during data processing, such as incorrect calculations, improper use of formulas, or misalignment of data sets, can also create discrepancies.

External Data Changes

If our file links to an external source or dynamically pulls data from an external database, any changes there will affect our file. Thus, as a best practice, we recommend not leaving active external data links in Excel files.

Important: For healthy data management, always make sure to remove external data connections from your Excel files.

These are among the most common reasons behind data discrepancies. Other causes in data management include time zone differences, cultural differences, data decay, communication gaps, and sampling errors.

What are the factors affecting data discrepancy?

  • Human error
  • Systematic errors
  • External factors affecting data collection
  • Data transformation errors
  • Sampling errors

3. How to Prevent Data Discrepancies?

It’s very important to implement systems that prevent data discrepancies before they occur, because without error-proofing, inconsistencies in our data are inevitable.

It’s much more difficult to identify a data discrepancy than to prevent it in the first place!

Let’s discuss the most effective ways to prevent data discrepancy issues.

Data Validation Lists

Firstly, one of the best ways to prevent data discrepancy in Excel is to use Data Validation. With Data Validation, we can restrict what users are allowed to enter in a cell instead of letting them type anything freely.

How do we do this? Let’s consider the previous example of person names. Instead of manually entering names each time during data input, imagine having a separate sheet or section in the form of a person table. The person inputting data will no longer manually type these entries but will select them from a drop-down list.

This significantly reduces the possibility of entering different variations each time, ensuring consistent and controlled data input.
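
The idea behind a validation list can be sketched outside Excel as well. The Python snippet below is only an illustration of the principle, not how Excel implements Data Validation; `ALLOWED_EMPLOYEES` and `validate_entry` are hypothetical names standing in for the separate person table and the drop-down check.

```python
# Hypothetical "person table": the only values a user may pick
ALLOWED_EMPLOYEES = ("John Brown", "Mary Smith")

def validate_entry(value: str, allowed=ALLOWED_EMPLOYEES) -> str:
    """Accept a value only if it appears in the predefined list."""
    if value not in allowed:
        raise ValueError(f"{value!r} is not in the list of allowed entries")
    return value

print(validate_entry("John Brown"))  # accepted as-is

try:
    validate_entry("John B.")        # a free-typed variation
except ValueError as error:
    print("Rejected:", error)
```

Because every entry must match the list exactly, variations such as ‘John B.’ never reach the data in the first place.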

This method is also used in large programs like SAP, Oracle, Salesforce, etc. In these systems, if there is a table for defining people and another table for entering information about these people, the data can only be entered by pulling from the previously defined table.

Otherwise, significant data discrepancies would occur in such large programs. This is the idea behind a relational database, where each record is identified by a unique key and other tables refer to that key to ensure consistency. We achieve a similar function in Excel using Data Validation.

In Excel, you can create dynamic drop-down menus with the OFFSET function. We also use this method extensively in Someka templates and products.

Data Format Validation

Data Validation also includes features that prevent the entry of data in the wrong format. We recommend looking into these as well.

Numeric Data Validation

For example, if you want users to input only numeric values, you can block anything else with Data Validation. You can also set minimum or maximum values or assign other conditions.
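
As a rough analogue in Python, a numeric rule with a minimum and maximum might look like the sketch below. The function name `validate_number` and the default bounds of 0 and 100 are assumptions chosen for illustration.

```python
def validate_number(value, minimum: float = 0.0, maximum: float = 100.0) -> bool:
    """Return True only if value is numeric and lies within [minimum, maximum]."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False  # text, blanks, or missing values are rejected
    return minimum <= number <= maximum

for candidate in ("42", "abc", 150, None):
    print(repr(candidate), validate_number(candidate))
```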

Use Excel Formulas To Prevent Data Discrepancies

Data Validation is one of the best methods to prevent data discrepancies. But where Data Validation is not applicable, formulas and functions can also be used to take preventive measures.

For instance, with Excel text functions such as LEFT, RIGHT, and MID, you can analyze the data dynamically for incorrect entries and create a helper column that raises an alert for each error.
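
The logic of such an alert column can be sketched in Python as follows. This is a simplified illustration and not the formulas used in any Someka product; `check_entry` and the three rules it applies are assumptions.

```python
def check_entry(value: str) -> str:
    """Return an alert message for a text entry, or "OK" if no issue is found."""
    if value == "":
        return "Empty"
    if value != value.strip():
        return "Leading or trailing space"
    if "  " in value:
        return "Double space"
    return "OK"

# Hypothetical input column and the alert column derived from it
entries = ["John Brown", "John Brown ", "John  Brown", ""]
alerts = [check_entry(entry) for entry in entries]

for entry, alert in zip(entries, alerts):
    print(repr(entry), "->", alert)
```

In a worksheet, the same idea is one formula per row next to the input column, so the warning updates as soon as the entry changes.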

Below, we show an example of this. In our products like Org Chart Maker or Flowchart Maker, we have included a feature on the right side that alerts you when incorrect data is entered, dynamically analyzing and providing warnings.

Figure: The main input area of our Org Chart Maker, which has a dynamic Error Check area built with formulas.

VBA Codes and Macros

To prevent data discrepancies, we have discussed using Data Validation lists, Data Validation alert styles such as Stop or Information, and formulas and functions. Additionally, it is possible to implement version control or to control data dynamically with macros and VBA code.

However, let’s not delve deeper into these now. Instead, let’s look at how to check whether a data discrepancy exists and what can be done once discrepancies occur.

4. Excel Features to Detect and Solve Data Discrepancies

Excel has very effective ways to find and solve data discrepancy issues. Here we’ll look at four of the most common:

  • Conditional Formatting
  • Pivot Tables
  • Calculations
  • Filtering

Conditional Formatting to Detect Data Discrepancies

Conditional formatting is the first technique we use in Excel to find and fix data discrepancies. For example, we can set up Conditional Formatting to highlight duplicate data in our table. It draws attention to the repeated entries by marking them with a distinct color; it does not delete them. This works as a post-entry tool to detect and fix duplicates, as well as a preventive measure, because a duplicate is flagged as soon as it is typed.
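
The rule behind duplicate highlighting is simple: flag every value that occurs more than once. The Python sketch below illustrates that rule with hypothetical ID values; it is an analogue of the idea, not Excel’s implementation.

```python
from collections import Counter

# Hypothetical column of record IDs
ids = ["A-101", "A-102", "A-101", "A-103", "A-102"]

counts = Counter(ids)
# One flag per row: True means "this row would be highlighted"
flags = [counts[value] > 1 for value in ids]

for value, flagged in zip(ids, flags):
    print(value, "DUPLICATE" if flagged else "")
```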

Pivot Tables to Find Data Discrepancies

Pivot tables are another helpful technique. When we place two related fields side by side in a pivot table, each value should consolidate into a single row. Entries that differ only by a space or a spelling variant show up as separate rows instead.

The pivot table will cleanly show any inconsistencies or duplicate entries for the same data, making it easy to spot mistakes. It makes sense to use pivot tables for this purpose.

You can check out our Pivot Table Guidance to learn more about detecting data discrepancies with pivot tables.

Arithmetical Calculations

There’s another important aspect worth mentioning. We have talked about the entry of text data and its accuracy, but checking numeric values is also crucial.

For example, the use of decimal points and commas varies in different countries. Some countries use a period as a decimal marker, while others use a comma. If we copy data from a site like Wikipedia or another website into Excel, it might not come in the desired format, and we may not even notice it. Hence, there are a few methods to address this issue.

Firstly, you can try summing the column with your data. You can simply apply a SUM or AVERAGE formula. If the result is zero or clearly lower than expected, your data may contain errors or text entries that Excel has left out of the calculation.

Remember, if you have 100 data entries and 99 are numeric but one contains text, SUM will skip the text cell instead of reporting an error, so the total will be off. Comparing COUNT (numeric cells only) with COUNTA (all non-empty cells) for the same range is a quick way to identify non-numeric values.
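
The same check can be sketched in Python: add up only the numeric cells and list the rows that hold anything else. The sample column and the `is_number` helper are hypothetical and serve only to illustrate the approach.

```python
def is_number(value) -> bool:
    """True for int or float values; booleans and text do not count."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

# Hypothetical column: two entries arrived as text instead of numbers
column = [12.5, 30, "25,95", 41, "n/a"]

total = sum(value for value in column if is_number(value))
suspects = [(row, value) for row, value in enumerate(column, start=1)
            if not is_number(value)]

print(total)     # sum of the numeric entries only
print(suspects)  # (row number, value) pairs that need attention
```

A mismatch between the number of entries and the number of numeric entries is the signal to look closer.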

Filtering

Another method is to use filtering. Suppose you expect a value like 25.95, but the source used a comma as the decimal separator (25,95), and in Excel it appears as 2595.

When you copy a value from a source and paste it into Excel, it might display incorrectly. When you apply a filter, if you expect values around 30, 40, or 50 and see a value like 3200, it’s a clear sign of a discrepancy. Hence, inspecting your data through filters is another effective method.
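
A range check like this can also be automated. The Python sketch below is an illustration under assumptions: the expected range of 0 to 100, the sample prices, and the `parse_decimal_comma` helper are all hypothetical.

```python
def parse_decimal_comma(text: str) -> float:
    """Parse a number written with a decimal comma, e.g. "25,95" or "1.234,5"."""
    return float(text.replace(".", "").replace(",", "."))

# Hypothetical price column where two decimal commas were lost on import
prices = [31.5, 42.0, 2595.0, 38.25, 3200.0]
LOW, HIGH = 0.0, 100.0  # assumed plausible range for this column

outliers = [(row, price) for row, price in enumerate(prices, start=1)
            if not LOW <= price <= HIGH]

print(outliers)                      # rows that fall outside the expected range
print(parse_decimal_comma("25,95"))  # the value the source actually meant
```

Parsing the text with the source’s own separator before it reaches the sheet avoids the problem; the range check catches whatever slips through.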

5. Final Words

In conclusion, data discrepancy is a significant issue, especially in today’s world where data plays such a crucial role. It’s essential to implement ways to prevent and address these discrepancies in companies.

While we should have strategies for both preventing and resolving discrepancies, our recommendation, particularly in Excel, is to use Data Validation. By our estimate, this approach alone can solve more than 50% of common problems. Where Data Validation is not applicable, other correction techniques can be employed to clean up your data.

As a final note, most of these features are also available in Google Sheets, so you can use them there to clean discrepancies from your data.

Recommended Readings:

How to Format Input Tables in Excel?

Excel Dashboard Design: How to make impressive Excel dashboards like Someka does?

Can Excel Analyze Data?
