Why Every Organization Needs a Data Integrity Strategy
Modern businesses generate data with every purchase, social media like, and customer service interaction—and when properly managed, that data can fuel valuable insights that drive growth.
However, nutritionists and data professionals share a saying: "Garbage in, garbage out." One 2023 survey found that just 46 percent of data and analytics professionals rated their ability to trust the data used for decision-making as "high" or "very high," and 70 percent of those who struggle to trust their data cited quality as their biggest issue.1
Whether a business works with predictive, prescriptive, or simply diagnostic analytics, its data program is only as good as its data. Data should be reliable, accurate, complete, and consistent. However, building and maintaining data integrity on a scale fit for today's vast data volume and velocity requires a strategic approach.
Keeping data intact across its journey
Every data ecosystem is unique, but every piece of data a business leverages will pass through similar stages across its lifecycle, from generation and collection to interpretation and disposal. Data is fluid. It shifts between assets and platforms, hopping from server to server, often crossing state, province, and even national borders. It changes form as it moves from an audio call to a customer relationship management system to an analytics platform to a PowerPoint presentation. And at every step along the way, the data's integrity is at risk.
The term "integrity" comes from a Latin root meaning "intact" or "untouched."2 Data integrity describes the consistency of data as it transitions across time, systems, and formats, with a focus on keeping the data uncorrupted and readable: intact.
Data quality is a separate but important concept that describes how useful the data is for its intended purpose. Good-quality data is unique, accurate, consistent, and up-to-date. Poor-quality data may be redundant, inaccurate, inconsistent, out-of-date, or some combination of these. When data loses integrity, it necessarily loses quality; however, poor-quality data may still maintain integrity.
Ensuring data integrity is, therefore, a necessary step on the road to data quality. A robust data integrity strategy will ensure that data, regardless of quality, remains intact, enabling data scientists and, increasingly, generative AI tools to produce actionable insights, and helping secure the business against the reputational and regulatory damage data loss can cause. At its simplest, data should be available, reliable, comprehensible, original, transparent, and intact.
Data integrity threats
In the generative AI era, businesses face a myriad of challenges to data integrity, including ransomware attacks, old-fashioned viruses and bugs, and more innovative, AI-enabled bad actors.
However, some of the biggest risks do not usually fall under an entity's cybersecurity strategy, including:3,4
- User error: Whether users input text into a cell in a spreadsheet when they intend to search, accidentally overwrite a file, arrest a data migration too early, or mishandle data collection, there are myriad ways in which user error can impact data integrity.
- Transfer issues: It's easy to assume that transferring data across long distances is as simple and error-free a process as sending an email. But packets do go missing, the data may not be received in its complete form, and occasionally errors can even be introduced.
- System incompatibility: The more data collected, the more businesses have to deal with systems that don't speak the same language. Data sent from a legacy database may be formatted in a way that its recipient cannot read.
- Hardware and infrastructure problems: Old, damaged, or poorly maintained hardware and infrastructure can corrupt data at any stage through its lifecycle from collection to disposal.
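As one illustration of how a transfer issue like the one described above might be caught, the sender can publish a checksum alongside the data and the recipient can recompute it on arrival. The sketch below is a minimal example under stated assumptions, not a description of any particular organization's tooling: the function names and the sample payload are hypothetical, and SHA-256 from Python's standard library stands in for whatever checksum a given platform provides.

```python
import hashlib


def sha256_digest(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a payload."""
    return hashlib.sha256(payload).hexdigest()


def arrived_intact(received: bytes, expected_digest: str) -> bool:
    """Recompute the digest on the receiving side and compare it
    with the digest the sender published alongside the data."""
    return sha256_digest(received) == expected_digest


# Hypothetical payload: the sender computes the digest before sending.
original = b"customer_id,amount\n1001,250.00\n"
published_digest = sha256_digest(original)

# Simulate a transfer that lost the last few bytes in transit.
truncated = original[:-5]

print(arrived_intact(original, published_digest))   # complete copy
print(arrived_intact(truncated, published_digest))  # incomplete copy
```

A check like this only detects that something changed in transit; it does not say what changed or repair it, so a failed check would typically trigger a re-send.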
Four key steps to achieve data integrity
A proactive data integrity strategy is at the heart of any successful data program. Success requires ensuring integrity across every stage of the lifecycle, from training staff on data entry to penetration testing systems.
Every data integrity strategy, no matter how mature an entity's data infrastructure may be, should include four key components, most of which can now be largely automated.3,4
- Validation: A rigorous data validation program will help ensure data integrity, quality, and transparency.
- Auditing: Without a robust and systematic auditing system in place, errors can proliferate throughout the data ecosystem.
- Backups: As basic good housekeeping, backups and recovery should be part of your business continuity strategy, but they should also mesh with your data integrity strategy.
- Security: Align the data integrity strategy with your security planning and conduct penetration testing to ensure systems are protected against bad actors.
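To make the validation step above more concrete, the following is a minimal sketch of an automated record check, assuming records arrive as Python dictionaries. The field names, expected types, and the rule that amounts must not be negative are all hypothetical and chosen only for illustration; a real program would draw its rules from the organization's own data definitions.

```python
from datetime import date

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"customer_id": int, "amount": float, "booked_on": date}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Hypothetical business rule: amounts must not be negative.
    if isinstance(record.get("amount"), float) and record["amount"] < 0:
        problems.append("amount must not be negative")
    return problems


good = {"customer_id": 1001, "amount": 250.0, "booked_on": date(2024, 1, 15)}
bad = {"customer_id": "1001", "amount": -5.0}

print(validate_record(good))
print(validate_record(bad))
```

Returning the list of problems, rather than simply rejecting the record, means the same output could also feed an audit log, so errors are recorded where they are found instead of spreading through downstream systems.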
Whether a business aims to benefit from the data gold rush by turning data into revenue or simply avoid the many regulatory and reputational pitfalls of poor data management, integrity should be at the core of the strategy, not an afterthought.
1 Precisely, 2023 Data Integrity Trends and Insights, accessed September 2024.
2 Leo W.J.C. Huberts, Integrity: What it is and Why it is Important, July 2018.
3 Kihara Kimachia, What Is Data Integrity & Why Is It Important? (Definition & Types), TechRepublic, August 2023.
4 Keith D. Foote, Data Integrity vs. Data Quality, Dataversity, July 2023.
CIO, US Capital Markets & Head Investment & Corporate Banking & Office of the COO Technology
Kim joined BMO in August 2021 from RBC where she was the Global Head of the Client, Banking & Digital Channels Technology group across Capital Markets. Kim dev…