How to Use Data Contracts and Models to Improve Data Quality in Insurance

Using data contracts and data models to enable reliable, transparent, and accurate data for insurance processes.

Author

Darya Zelenevskaya

Senior Data Analyst – People Lead

Darya is a Senior Data Analyst with more than 14 years of experience in IT, specialising in transforming complex data into actionable insights that drive business performance. With a passion for helping businesses make informed decisions, she works to bridge the gap between data and strategic goals.

Insurers rely heavily on data enrichment to deliver accurate pricing and better customer experiences. This process involves collating data from both internal and external sources to build a complete and up-to-date view of customer risk.

When it comes to vehicle and driver information, reliable and accurate data is crucial. This data (vehicle lookups, vehicle risk scores, driving history, etc.) needs to be known, understood, and documented.

Inconsistent, late, or poorly structured incoming data results in slow and inaccurate quotes. With data arriving from multiple sources through complex API and system architecture, it is important to create good documentation that describes the source data format and frequency of arrival and makes the data journey transparent, so that developers and engineers can build robust ETL pipelines without ambiguity.

In this article, we’re going to look into how data contracts and data models play a critical role in describing source data and its journey within the insurance industry.

What is a Data Contract?

A data contract is an agreement between a data producer and a data consumer that specifies the schema, constraints, and quality of the data and sets out expectations for data shared between systems and APIs. Contracts should be versioned alongside the software with which they interact.

Creating a data contract brings clarity and transparency to the documentation of the data journey from multiple external and internal sources. It builds a common understanding between supplier and organisation, which helps prevent data integration issues.

A data contract can describe (but is not limited to!) the data exchange between systems and APIs, providing robust documentation for the project team and beyond.
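As an illustration of the definition above, a contract can be captured in code as well as in a document. The following is a minimal sketch, not a standard format: the class names, field names, producer and consumer labels, and the version string are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field in the contract: its name, type, and whether it may be absent."""
    name: str
    dtype: type
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract between a data producer and a data consumer."""
    name: str
    version: str  # versioned alongside the software that relies on it
    producer: str
    consumer: str
    fields: tuple[FieldSpec, ...]

    def violations(self, record: dict) -> list[str]:
        """Return human-readable contract violations for a single record."""
        problems = []
        for spec in self.fields:
            value = record.get(spec.name)
            if value is None:
                if spec.required:
                    problems.append(f"missing required field: {spec.name}")
                continue
            if not isinstance(value, spec.dtype):
                problems.append(
                    f"{spec.name}: expected {spec.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
        return problems


# Illustrative contract for a vehicle lookup response (all names are assumptions).
vehicle_lookup_contract = DataContract(
    name="vehicle_lookup_response",
    version="1.0.0",
    producer="vehicle-data-supplier",
    consumer="quote-enrichment-service",
    fields=(
        FieldSpec("registration", str),
        FieldSpec("vehicle_make", str),
        FieldSpec("vehicle_model", str),
        FieldSpec("vehicle_risk_score", int, required=False),
    ),
)

good = {"registration": "AB12CDE", "vehicle_make": "FORD", "vehicle_model": "FIESTA"}
bad = {"registration": "AB12CDE", "vehicle_make": 42}

print(vehicle_lookup_contract.violations(good))
print(vehicle_lookup_contract.violations(bad))
```

Keeping the contract as data means the same definition can drive both the documentation and automated checks in a pipeline.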

What is a Data Model?

A data model provides a visual representation of the data and a shared language across the business for defining how data is structured and stored within a system.

Data models play an important role in managing and understanding data by providing structure and clarity. They make complex data easier to understand for both technical and business users and can be used as a shared artefact for engineers, analysts, and stakeholders, bridging communication across teams.

Data enrichments introduce new entities, attributes, and relationships into existing data models. A well-maintained data model supports scalability and helps future data enhancements fit smoothly into the current architecture.

Together with the data contract, the data model gives stakeholders confidence that data is readable, predictable, auditable and well managed. It creates a common understanding between business users, technical teams and suppliers on the shape of the data and its definitions.
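A data model is usually drawn as a diagram, but the same entities, attributes, and relationships can be sketched in code. The example below is illustrative only: the entity names (Quote, Vehicle, Driver) and their attributes are assumptions, chosen to show how an enrichment adds an attribute without changing the shape of the model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vehicle:
    registration: str
    make: str
    model: str
    risk_score: Optional[int] = None  # attribute introduced by an enrichment


@dataclass
class Driver:
    driver_id: str
    licence_years: int


@dataclass
class Quote:
    quote_id: str
    vehicle: Vehicle  # relationship: one quote covers one vehicle
    drivers: list[Driver] = field(default_factory=list)  # one quote, many drivers


quote = Quote(
    quote_id="Q-001",
    vehicle=Vehicle(registration="AB12CDE", make="FORD", model="FIESTA"),
    drivers=[Driver(driver_id="D-1", licence_years=7)],
)

# A later enrichment populates the new attribute; the entities stay the same.
quote.vehicle.risk_score = 12
```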

Why Both Data Contract and Data Model Matter

Robust data contracts and data mapping templates enable documented ETL pipelines, data validation and automated testing.

Contracts without a model result in a well-defined data exchange, but no shared view of how that data is structured once it arrives. Models without a contract provide a clear schema, but no guarantee that data will fit. Combined, they allow us to use enriched data in a predictable way.

Let’s review an example of enriching a quote record with vehicle information from an external API. The contract ensures that values in fields such as Vehicle Make or Vehicle Model meet agreed formats. The model guarantees that these fields are structured consistently across systems.
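The enrichment example above can be sketched as follows. This is a simplified illustration, not a real supplier integration: the regular expressions standing in for the "agreed formats", the attribute names, and the `enrich_quote` function are all assumptions.

```python
import re

# Assumed formats, for illustration only; real ones are agreed with the supplier.
CONTRACT_FORMATS = {
    "Vehicle Make": re.compile(r"[A-Z][A-Z \-]{0,29}"),
    "Vehicle Model": re.compile(r"[A-Z0-9][A-Z0-9 \-]{0,39}"),
}

# Hypothetical mapping from contract field names to data model attribute names.
MODEL_ATTRIBUTES = {
    "Vehicle Make": "vehicle_make",
    "Vehicle Model": "vehicle_model",
}


def enrich_quote(quote: dict, api_response: dict) -> dict:
    """Check the response against the contract, then map it onto the quote."""
    # Contract step: every agreed field must be present and match its format.
    for field_name, pattern in CONTRACT_FORMATS.items():
        value = api_response.get(field_name)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            raise ValueError(f"{field_name!r} breaks the contract: {value!r}")
    # Model step: store the values under the attribute names the model defines.
    enriched = dict(quote)
    for field_name, attribute in MODEL_ATTRIBUTES.items():
        enriched[attribute] = api_response[field_name]
    return enriched


quote = {"quote_id": "Q-001", "registration": "AB12CDE"}
response = {"Vehicle Make": "FORD", "Vehicle Model": "FIESTA"}
enriched = enrich_quote(quote, response)
print(enriched)
```

Rejecting a non-conforming response at the boundary keeps badly formatted values out of the quote record, so downstream systems can rely on the structure the model describes.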

A Practical Example with Vehicle Data

During a previous engagement with a top UK-based insurer, we developed data contract and data model template standards that describe the data lineage across the full quote processing journey:

Data received from different customer quote input journeys
Enrichment API request and response schemas
Schema of the source data shredded on a batch process
Integration layer mapping – transformations and business rules
Pricing model schema input and output
Schema of the pricing model input shredded to the analytics store
How the data is mapped back into key policy systems

Scope of Data Mapping Template

Before building pipelines, thorough data analysis is paramount to understand the datasets. The enrichment schema should be known in advance: the shape of the data, the quality of the data, and the frequency of update.
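A first pass at this analysis can be as simple as profiling a sample of the source data. The sketch below is one possible approach using only the Python standard library; the sample records, field names, and arrival timestamps are made up for illustration, and both functions assume non-empty input.

```python
from datetime import datetime


def profile(records: list[dict]) -> dict:
    """Summarise shape and quality: row count, fields seen, null rate per field."""
    field_names = sorted({name for record in records for name in record})
    null_rate = {
        # A field counts as null when it is absent or explicitly None.
        name: sum(1 for r in records if r.get(name) is None) / len(records)
        for name in field_names
    }
    return {"rows": len(records), "fields": field_names, "null_rate": null_rate}


def longest_gap_hours(arrivals: list[datetime]) -> float:
    """Longest gap between consecutive deliveries, as a check on update frequency."""
    ordered = sorted(arrivals)
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(ordered, ordered[1:])]
    return max(gaps)


sample = [
    {"vehicle_make": "FORD", "risk_score": 12},
    {"vehicle_make": "BMW", "risk_score": None},
    {"vehicle_make": None},
    {"vehicle_make": "KIA", "risk_score": 3},
]
arrivals = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 2, 0, 0),
    datetime(2024, 1, 3, 6, 0),
]

summary = profile(sample)
gap = longest_gap_hours(arrivals)
```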

In this project, we extended the scope of a typical data contract, which covers internal and external API schema exchange, to describe the end-to-end data journey: the schema of customer data arriving through different channels, dataset enrichments, pricing model mapping, and data analytics schemas.

The end-to-end mappings within the contract cover not only this scope but also the source-to-target data journey. They document schema definitions and transformations transparently, giving a comprehensive understanding of what data is consumed and how it is used throughout its lifecycle.
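To make the idea of a source-to-target mapping concrete, here is a hedged sketch in which each mapping row carries its transformation and its business rule together. It does not reproduce the template used on the project; the field names, rules, and helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MappingRule:
    source_field: str
    target_field: str
    transform: Callable[[object], object]
    rule: str  # business rule, documented next to the transformation


# Illustrative mapping rows (assumed names and rules).
MAPPING = [
    MappingRule(
        "VehicleMake", "vehicle_make",
        lambda v: str(v).strip().upper(), "Trim and upper-case",
    ),
    MappingRule("AnnualMileage", "annual_mileage", int, "Cast to integer"),
]


def apply_mapping(source: dict, rules: list[MappingRule]) -> dict:
    """Build the target record by applying each rule to its source field."""
    return {r.target_field: r.transform(source[r.source_field]) for r in rules}


def describe(rules: list[MappingRule]) -> list[str]:
    """Render the mapping as readable lineage documentation."""
    return [f"{r.source_field} -> {r.target_field}: {r.rule}" for r in rules]


target = apply_mapping({"VehicleMake": " ford ", "AnnualMileage": "8000"}, MAPPING)
lineage = describe(MAPPING)
```

Because the documentation and the transformation come from the same rows, the described lineage cannot drift away from what the pipeline actually does.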

Example from Data Mapping template:

Scope of Data Model Template

The data model describes the schema of the shredded customer quote journey data.

Best Practices for Data Contract and Data Models

Insurers use a vast number of external data sources. While suppliers often document the APIs involved, there is often a gap in documentation describing the external source data journey and how it maps end-to-end from enrichment through pricing models to policy systems. Robust data contracts are needed to describe the data lineage, and make it available, across what is a complex quote enrichment journey.

Many organisations start by focusing on the critical data domains in scope for phased enrichment builds. This often starts with contracts to describe external API interchange, which can evolve into a larger data mapping exercise describing customer quote input (through various business journeys), price modelling and analytical modelling. The data models are created to describe the analytical quote store schema.

Standards should be defined for the templates, including naming conventions, versioning rules, and a change management process.
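Versioning rules are easier to follow when they can be checked automatically. The sketch below assumes a semantic-versioning style convention, which is an illustrative choice and not one the standards here prescribe: removing or retyping a field is treated as a breaking (major) change, and adding a field as a non-breaking (minor) change. The schemas and function names are hypothetical.

```python
def classify_change(old_fields: dict[str, str], new_fields: dict[str, str]) -> str:
    """Return 'major', 'minor' or 'none' for a change between two schema versions."""
    removed = old_fields.keys() - new_fields.keys()
    added = new_fields.keys() - old_fields.keys()
    retyped = {
        name
        for name in old_fields.keys() & new_fields.keys()
        if old_fields[name] != new_fields[name]
    }
    if removed or retyped:
        return "major"  # consumers built against the old schema may break
    if added:
        return "minor"  # existing consumers are unaffected
    return "none"


def bump(version: str, change: str) -> str:
    """Apply the classified change to a MAJOR.MINOR.PATCH version string."""
    major, minor, _patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    return version


v1 = {"vehicle_make": "string", "vehicle_model": "string"}
v2 = {**v1, "vehicle_risk_score": "integer"}          # field added
v3 = {"vehicle_make": "string", "vehicle_model": "integer"}  # field retyped

next_after_add = bump("1.0.0", classify_change(v1, v2))
next_after_retype = bump("1.0.0", classify_change(v1, v3))
```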

Insurers should put templates in place for multiple suppliers and define clear guidelines for creating new templates to support future enrichments.

Summary

Contracts and models make enrichment practical, predictable and trustworthy, turning internal or external data into better decisions.

In the insurance industry, trust and reliability are essential. Data contracts and data models give engineers accurate schemas to develop against, and visibility of the data journey builds trust with both internal and external stakeholders.

Describing the data lineage in a well-defined template acts as a risk-management tool, driving good data governance practice in organisations.

To discover how data contracts and models can boost insurance data quality, get in touch with our team.
