

Table 4 Papers where data quality was conceptualized within the fitness for purpose paradigm

From: Ontological specification of quality of chronic disease data in EHRs to support decision analytics: a realist review

Author reference Context Aims of project Methods/tools used in project Results
(Ivanova et al. 2013) Geo-spatial datasets in the national geo-information repositories in the Netherlands To propose GUESS, a system for guided search of spatial data resources -Use of popular search engines like OpenSearch to help in assessing fitness for purpose Defined fitness for purpose of data based on users' (experts and non-experts in geo-informatics) satisfaction with search results
    -Use metadata (information that helps users to assess the usefulness of a dataset relative to their problem) as a tool to evaluate fitness for purpose of datasets  
    -Their approach is based on a 3-part data model (user profile, spatial data profiles and interaction profiles) Allowed users without specific expertise to conduct free-form search requests in their own language
    -Theoretical discussion on accuracy and completeness of data  
(Devillers et al. 2007) Spatial On‒Line Analytical Processing (SOLAP) as a GIS data repository To manage heterogeneous data quality and provide functions to support expert users in the assessment of the fitness for purpose of a given dataset -Use the Quality Information Management Model (QIMM) Defined fitness for purpose as the closeness of the agreement between data characteristics and the explicit and/or implicit needs of a user for a given application in a given area
   -Focus on intrinsic data quality indicators such as completeness, correctness and accuracy underpins a prototype
    -Apply a data quality analysis tool, the Multidimensional User Manual (MUM) prototype Researchers attempt to provide data quality indicators to help users determine a dataset’s fitness for purpose and better assess the fitness of data based on quality indicators/experts in GIS
    -Validate the QIMM through demonstrations of the prototype to different users (GIS scientists, specialists in data quality issues, consultants in GIS, data producers, governmental agencies, typical GIS users, etc.)  
(Kahn et al. 2012) Clinical dataset in the US To assess the efficacy of their data model in three large healthcare organizations -Use a two-by-two conceptual model (PSP/IQ) for describing IQ A well-grounded, logical approach and case study indicating that health organizations need to use "fitness for use" to determine IQ (specifically sound, dependable, useful and usable information) for analytical purposes
    -Focus on 8 dimensions of data quality (completeness, correctness, flexibility, etc.)  
    -Surveyed 45 professionals to determine which IQ dimensions belong in each quadrant of the model This assessment of DQ provides a reasonable baseline for determining what improvements should be made in DQ based on fitness for purpose for analytical purposes
    -Use a case study method in 3 healthcare organizations, in which 75 people in each organization completed a 70-item questionnaire assessing the quality of their patient information on the IQ dimensions  
(Chen 2009) Infectious diseases dataset in the US To investigate how the 'quality' and 'amount' of information affect health behaviour -Use mathematical modelling of infectious disease transmission to analyse how the amount of information about disease prevalence affects individuals’ incentives Demonstrated "fitness for purpose" of data for agents to choose how much information to gather from others (personal communication from an anonymous reviewer)
    -More focus on data timeliness This is a theoretical paper using several mathematical models to show that information quality affects health behaviour, i.e. better information leads to better decision making
    -Use of mathematics software  
(Liaw et al. 2011) An electronic Practice Based Research Network (ePBRN) with a data repository of routinely collected data from multiple EHRs To develop a matrix for assessing and managing the quality of data Their methods include 3 phases: They used a well-designed framework to describe the intrinsic DQ (correctness and consistency) and fitness for purpose (completeness) for research and clinical purposes
    (1) requirements specification based on the conceptual framework,  
    (2) design and establishment of the ePBRN, and  
    (3) evaluation of the data quality and fitness for research.  
    -Use Microsoft Structured Query Language (SQL) to manage the extracted data and SAS for data cleansing and analysis This study raised the theoretical dependence of the SQL/SAS approach on the lack of a transparent and explicit data model, metadata and process within proprietary EHRs
    -Focus on correctness, completeness and consistency of clinical data  
(Hamilton et al. 2003) Eighteen general practices in the Exeter Primary Care Trust in the UK To compare computer-only record keeping to paper-only and hybrid systems -Use a case-control study of cancer patients aged over 40 years Defined completeness as fitness for consultation in primary care
    -Classify records as paper, computer, or hybrid, depending on which medium stored the clinical information from consultations, and summarize them with descriptive statistics Hybrid systems of primary care record keeping document higher numbers of consultations than computer-only or paper-only systems
    -Focus on completeness of data