Validating an algorithm to identify metastatic gastric cancer in the absence of routinely collected TNM staging data
BMC Health Services Research volume 18, Article number: 309 (2018)
Accurate TNM stage information is essential for cancer health services research, but is often impractical and expensive to collect at the population-level. We evaluated algorithms using administrative healthcare data to identify patients with metastatic gastric cancer.
A population-based cohort of gastric cancer patients diagnosed between 2005 and 2007, identified from the Ontario Cancer Registry, was linked to routinely collected healthcare data. Reference standard data identifying metastatic disease were obtained from a province-wide chart review, according to the Collaborative Staging method. Algorithms to identify metastatic gastric cancer were created using administrative healthcare data from hospitalization, emergency department, and physician billing records. The time frames of data collection in the peri-diagnosis period and the diagnosis codes used to identify metastatic disease were varied. Algorithm sensitivity, specificity, and accuracy were evaluated.
Of 2366 gastric cancer patients included in the chart review, 54.3% had metastatic disease. Algorithm sensitivity ranged from 50.0% to 90%, specificity from 27.6% to 92.5%, and accuracy from 61.5% to 73.4%. Sensitivity and specificity were maximized when the most conservative list of diagnosis codes from hospitalization and outpatient records in the six months prior to and the six months following diagnosis was used.
Algorithms identifying metastatic gastric cancer can be used for research purposes using administrative healthcare data, although they are imperfect measures. The properties of these algorithms may be generalizable to other high fatality cancers and other healthcare systems. This study provides further support for the collection of population-based, TNM stage data.
Stage data are needed to define clinically homogenous cohorts, adjust for the extent of disease spread, study real-world treatment effectiveness and costs, and inform regional decision-making. Accurate staging, when linked to treatment and outcome data, informs the effectiveness and quality of cancer treatments, and guides healthcare planning for resource mobilization or implementation. The absence of stage data makes it harder to maintain the representativeness of the cancer cohort, to minimize the bias caused by excluding patients with unknown stage, and to achieve a sample size adequate for robust statistical analyses.
Capturing population-based stage data in ‘big data’ is often limited by practical and financial constraints. For example, the International Cancer Benchmarking Project used multiple national cancer registries to understand cancer stage and survival patterns. The registries contained varying levels of complete stage data across primary cancer sites; upwards of 50% of patients were excluded due to missing stage data in this international comparison of cancer survival [4, 5]. As a result, many countries are aiming to improve their population-based stage data collection using a number of methods and data sources [1, 2, 6, 7, 8].
Validated algorithms to identify metastatic disease using routinely collected healthcare data may provide one solution to missing stage data in studies using population-based, administrative data [9, 10]. Benchimol et al. have published general guidelines for algorithm development and validation using administrative healthcare data to assign disease status. Overall, many studies do not appropriately report on the performance of the algorithm, including revalidation, or present at least four metrics to assess diagnostic accuracy (e.g. sensitivity, specificity, agreement) or confidence intervals. Little published research has evaluated algorithm performance across cancer sites; developing high-quality algorithms requires gold standard staging data to properly validate and ensure accuracy prior to use. Whyte et al. evaluated 28 algorithms to identify metastatic disease status in three administrative data cohorts of treated colorectal, breast, and lung cancer patients in the United States. The algorithms had varying properties depending on cancer site, the underlying prevalence of metastatic disease, the choice of timeframe, and diagnosis codes. This is consistent with the properties of other diagnostic algorithms, where there is also evidence that algorithm performance is dependent on the data sources used.
Gastric cancer (GC) is the third leading cause of cancer-related mortality worldwide [12, 13]. Most patients in North America present with metastatic disease at diagnosis [14, 15], with similar stage distributions reported in the United Kingdom [16, 17, 18]. Although not all countries capture this information routinely, the ability to identify stage IV patients in population-based registries is crucial. Therefore, this study linked detailed TNM staging data from a province-wide chart review with routinely collected healthcare data, to develop an algorithm to identify individuals with metastatic disease in a cohort of GC patients.
GC patients aged 19 and older and diagnosed between April 1, 2005 and March 31, 2008 were identified in the Ontario Cancer Registry. Patients with multiple cancers, no corresponding hospital chart, tumour located primarily in the oesophagus, or non-adenocarcinoma tumours were excluded. The project received Research Ethics Board approval from the Sunnybrook Health Sciences Centre and adhered to all privacy and confidentiality regulations of ICES. Individual patient consent was not required. ICES is a section 45 Prescribed Entity under Ontario’s privacy law (PHIPA), which enables us to study the health and health outcomes of individuals for the purpose of analysis or compiling statistical information with respect to the management of, evaluation or monitoring of, the allocation of resources to or planning for all or part of the health system.
A province-wide chart review was conducted at over 100 institutions between November 2009 and November 2011. Information from multiple endoscopy, radiology, and pathology reports per patient was aggregated. Data abstraction from operative reports was completed by a surgical resident in 2013. Chart review data were linked to routinely collected healthcare and vital status data at ICES in 2013. All hospitalizations, emergency department (ED) visits, and physician visits were captured from the Canadian Institute for Health Information Discharge Abstract Database and Same Day Surgery Database, the National Ambulatory Care Reporting System, and the Ontario Health Insurance Plan database.
The 7th Edition American Joint Committee on Cancer/Union for International Cancer Control TNM staging system was used. TNM stage data from patient hospital charts were used as the reference standard. Stage data were collected in the 180 days prior to the diagnosis date registered in the Ontario Cancer Registry and in the 180 days following diagnosis up until the date of surgical resection (whichever came last) using a modified Collaborative Staging system approach. Clinic, diagnostic imaging, endoscopy, surgery, and pathology records were used to identify metastatic disease. Patients were considered stage IV, otherwise defined as M1 or positive for metastatic disease, if evidence of metastatic disease was identified in any portion of the medical record, and M0 (stage I-III) otherwise.
Three sets of administrative data algorithms to identify stage IV gastric cancer, otherwise defined as the presence of metastatic disease at diagnosis, were created using a combination of information from hospitalization records, ED visits, and outpatient physician visits. A positive diagnosis of metastatic GC was determined using three sets of eligible International Classification of Disease (ICD) versions 9 and 10 diagnosis codes (a complete list is provided in Additional file 1: Table S1). The included diagnoses ranged from conservative (secondary malignancy codes only, e.g. ICD-9 code 196) to inclusive (any non-gastric malignancy diagnosis, e.g. ICD-10 C codes excluding digestive organs). In the first set of algorithms, patients were identified as being metastatic if they had a hospitalization record with an eligible diagnosis code. In the second set of algorithms, patients with metastases were identified using hospitalization records (one or more) and outpatient records (two or more). In the third set of algorithms, patients with metastases were identified if they had one or more hospitalizations or outpatient records. Three different time periods were also considered for each algorithm: three months pre- and post-diagnosis, six months pre- and post-diagnosis, and three months pre-diagnosis with no end to follow-up post-diagnosis. These specific criteria were chosen based on the types of data in our administrative data holdings, on previous studies defining metastatic disease using similar data, and on the properties of diagnostic algorithms using administrative data in other settings. We performed a sensitivity analysis restricting the cohort to those who received a surgical resection. In total, 45 algorithms were evaluated.
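The three record rules and the diagnosis windows described above can be sketched as follows. This is a minimal illustration and not the study's implementation: the `Record` type, the rule names, and the abbreviated code list are hypothetical (the study's full lists are in Additional file 1: Table S1), months are approximated as 90 or 180 days, ED visits are omitted, and the second rule is read as "one or more hospitalizations or two or more outpatient records", which is an assumption about the wording.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

# Illustrative subset of secondary malignancy codes (ICD-9 196-198,
# ICD-10 C77-C79). Not the study's code list.
CONSERVATIVE_PREFIXES: Tuple[str, ...] = ("196", "197", "198", "C77", "C78", "C79")


@dataclass(frozen=True)
class Record:
    source: str         # "hospital" or "outpatient" (hypothetical labels)
    code: str           # ICD-9 or ICD-10 diagnosis code
    service_date: date


def is_metastatic(
    records: Iterable[Record],
    diagnosis_date: date,
    rule: str,
    days_before: int,
    days_after: Optional[int],
    prefixes: Tuple[str, ...] = CONSERVATIVE_PREFIXES,
) -> bool:
    """Classify one patient as metastatic under a given rule and time window.

    days_after=None means no end to follow-up after diagnosis.
    """
    start = diagnosis_date - timedelta(days=days_before)
    end = None if days_after is None else diagnosis_date + timedelta(days=days_after)

    hospital_hits = 0
    outpatient_hits = 0
    for rec in records:
        # Skip records outside the peri-diagnosis window.
        if rec.service_date < start:
            continue
        if end is not None and rec.service_date > end:
            continue
        # Skip records whose diagnosis code is not on the eligible list.
        if not rec.code.startswith(prefixes):
            continue
        if rec.source == "hospital":
            hospital_hits += 1
        elif rec.source == "outpatient":
            outpatient_hits += 1

    if rule == "hospital_only":
        return hospital_hits >= 1
    if rule == "hospital_or_two_outpatient":  # assumed reading of the second set
        return hospital_hits >= 1 or outpatient_hits >= 2
    if rule == "any_record":
        return hospital_hits >= 1 or outpatient_hits >= 1
    raise ValueError(f"unknown rule: {rule}")
```

Varying `rule`, `prefixes`, and the two window arguments gives one classifier per algorithm, which is how a grid of candidate algorithms could be generated and compared against the chart-review reference standard.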
Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated for each algorithm. Accuracy was measured using the following equation: Accuracy = (TP + TN) / (TP + TN + FP + FN). Ninety-five percent confidence limits on the estimates of sensitivity, specificity, PPV, NPV and accuracy were calculated using percentiles of a distribution of 5000 bootstrap replicates with replacement. Demographic characteristics and the tumour stage, lymph node status, and TNM stage of true positives, false positives, true negatives, and false negatives were described for each algorithm. Content validity was evaluated by comparing the percentage of patients who died in the year following diagnosis.
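As an illustration of these calculations, the sketch below computes the five metrics from paired reference and algorithm classifications and derives percentile bootstrap confidence limits. The function names, the input format (a list of `(reference, predicted)` booleans), and the simple percentile indexing are assumptions made for this example; the study's own software and percentile method are not described in the text.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[bool, bool]  # (reference standard is M1, algorithm says M1)


def metrics(pairs: List[Pair]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and accuracy from paired labels."""
    tp = sum(1 for ref, pred in pairs if ref and pred)
    tn = sum(1 for ref, pred in pairs if not ref and not pred)
    fp = sum(1 for ref, pred in pairs if not ref and pred)
    fn = sum(1 for ref, pred in pairs if ref and not pred)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


def bootstrap_ci(
    pairs: List[Pair],
    n_replicates: int = 5000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Dict[str, Tuple[float, float]]:
    """Percentile bootstrap limits: resample patients with replacement."""
    rng = random.Random(seed)
    draws: Dict[str, List[float]] = {name: [] for name in metrics(pairs)}
    for _ in range(n_replicates):
        resample = rng.choices(pairs, k=len(pairs))
        for name, value in metrics(resample).items():
            if value == value:  # drop NaN from degenerate resamples
                draws[name].append(value)

    limits = {}
    for name, values in draws.items():
        values.sort()
        lower = values[int((alpha / 2) * (len(values) - 1))]
        upper = values[int((1 - alpha / 2) * (len(values) - 1))]
        limits[name] = (lower, upper)
    return limits
```

Resampling whole patients, rather than individual cells of the 2×2 table, keeps the reference and algorithm labels paired within each replicate.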
Overall, 2366 patients were included; 54.3% had metastasis at diagnosis according to the chart review (Table 1). Sensitivity, specificity, and accuracy of the algorithms are reported in Table 2. Sensitivity ranged from 50.0% to 90%, specificity from 27.6% to 92.5%, and accuracy from 61.5% to 73.4%. Sensitivity and specificity were maximized when the algorithm used the most conservative list of metastatic disease diagnosis codes, hospitalization and outpatient records as the data source, and administrative data from the six months prior to and following diagnosis. When the cohort was restricted to patients who received surgical resection, the sensitivity of all algorithms decreased and their specificity increased slightly (Additional file 2: Table S2). Excluding patients with unknown metastatic disease status (4.3%) did not change the results (data not shown). Concordant and discordant classifications between the algorithms and the reference standard are reported in Additional file 3: Table S3.
Table 3 describes the algorithm that maximized sensitivity and specificity (algorithm #12). According to this algorithm, the prevalence of metastatic GC was 45%. Of the 1285 true positives using the reference standard, 31% were misclassified using this administrative healthcare data algorithm; 20% of the metastatic group identified by the algorithm were false positives and 32% of the M0 were false negatives. One third of the false positives and false negatives had an unknown stage at diagnosis according to the reference standard. Correctly classified metastatic patients were more likely to have died within a year of diagnosis than those incorrectly classified.
Using the algorithm with the highest positive predictive value (algorithm #1), 11% of those identified as having metastatic disease were misclassified. Ninety percent of patients misclassified using this algorithm were stage III (55.5%) or unknown stage (34.6%), and 66% had a T4a or T4b tumour. Overall, as the positive predictive value of the algorithm decreased, the proportion of node-negative patients with smaller tumours and earlier stage disease who were misclassified as metastatic increased (data not shown).
This study evaluated 45 algorithms using routinely collected healthcare data to identify metastatic disease in a population-based cohort of GC patients. None of the algorithms did an excellent job of classifying patients based on the reference standard. The algorithm that maximized sensitivity and specificity identified metastatic disease through one or more hospitalization or outpatient records with a diagnosis from the conservative list, in the six months before and after diagnosis.
Our algorithm accuracy differed from the few others present in the literature as the result of study design or the underlying prevalence of metastatic disease. We observed lower accuracy than a study of colorectal cancer algorithms by Brooks et al. Whyte et al. reported better accuracy for their algorithms identifying metastatic disease in breast cancer, and similar accuracy for algorithms in lung and colorectal cancer. Whyte et al. reported sensitivity and specificity estimates ranging from 46–77% and 83–99% for breast cancer, 50–67% and 68–83% for lung cancer, and 54–77% and 70–91% for colorectal cancer. Whyte et al. did not define the length of their follow-up period or explain why the total number of patients varied across algorithms, and only included patients treated within a private healthcare system. Both Whyte et al. and Brooks et al. studied only patients who received treatment. Both breast and colorectal cancer have a much lower prevalence of metastatic disease at diagnosis compared to GC, which may impact accuracy. Our findings were similar to those for an algorithm developed by Lash et al. to identify colorectal cancer recurrence, in which patients correctly identified by the algorithm were more likely to be younger and to die in a shorter timeframe.
The best algorithm choice is dependent on the research purpose. For example, maximizing accuracy may be the priority when estimating the prevalence of metastatic disease, when representativeness of the identified cohort is not important. Maximizing specificity may be the priority to ensure that patients included in a study of metastatic disease are truly metastatic. We recommend using a conservative approach with relevant diagnosis codes reported close to the diagnosis date. This approach, and the other algorithms reported in this study, should be tested in an additional, external cohort, including one that better reflects current clinical populations and treatment. The properties of algorithms in this study may be generalizable to similar high fatality cancer cohorts such as pancreas and esophagus. The algorithms may be used by other investigators and policy-makers to estimate the extent of misclassification, and in formal bias analyses to adjust effect estimates. Alternatively, given that none of the algorithms demonstrated exemplary accuracy, integrating multiple algorithms using methods such as majority vote and Boolean operations may be another way to implement these algorithms in practice.
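To make the idea of combining algorithms concrete, the sketch below shows a majority vote and two Boolean combinations over the per-patient outputs of several algorithms. The function names are hypothetical, and this study did not evaluate any combined classifier, so the sketch illustrates the technique only.

```python
from typing import Sequence


def majority_vote(votes: Sequence[bool]) -> bool:
    """Metastatic if more than half of the algorithms say so (ties are negative)."""
    return 2 * sum(votes) > len(votes)


def any_positive(votes: Sequence[bool]) -> bool:
    """Boolean OR: favours sensitivity, since one positive algorithm is enough."""
    return any(votes)


def all_positive(votes: Sequence[bool]) -> bool:
    """Boolean AND: favours specificity, since every algorithm must agree."""
    return all(votes)


# One patient's classification under three hypothetical algorithms.
votes = [True, False, True]
combined = {
    "majority": majority_vote(votes),
    "or": any_positive(votes),
    "and": all_positive(votes),
}
```

In general, an OR rule can only add positives relative to any single component algorithm and an AND rule can only remove them, so the choice between them mirrors the sensitivity versus specificity trade-off discussed above. Any combined rule would need the same validation against a reference standard as the individual algorithms.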
Our study is limited by our choice of a reference standard, which may have resulted in misclassification of metastatic disease across patients. The prevalence of metastatic disease was 54% in our study, with a median survival of six months, which matches the literature distribution [14, 25]. We performed a sensitivity analysis restricting to the cohort of patients with a surgical resection, who would have better quality pathologic staging data available in their charts. The true prevalence of metastatic disease was lower and the positive predictive value of the algorithms decreased. We also attempted to address administrative data quality issues by creating three sets of algorithms based on the data reliability (hospitalization data being most reliable) and using three sets of diagnosis codes.
We suggest that algorithms using administrative healthcare data are imperfect replacements for population-based staging data and support the need for system level data collection. However, they do yield moderately accurate results. In cases where population-based data collection is infeasible, a global understanding of misclassified patients and administrative algorithm properties is important to assessing potential selection bias.
ICD: International Classification of Disease
OCR: Ontario Cancer Registry
PHIPA: Personal Health Information Protection Act
TNM: Tumour, node, metastasis
Brierley JD, Srigley JR, Yurcan M, Li B, Rahal R, Ross J, King MJ, Sherar M, Skinner R, Sawka C. The value of collecting population-based cancer stage data to support decision-making at organizational, regional and population levels. Healthcare quarterly (Toronto, Ont). 2013;16(3):27–33.
Falcaro M, Carpenter JR. Correcting bias due to missing stage data in the non-parametric estimation of stage-specific net survival for colorectal cancer using multiple imputation. Cancer Epidemiol. 2017;48:16–21.
Butler J, Foot C, Bomb M, Hiom S, Coleman M, Bryant H, Vedsted P, Hanson J, Richards M. The international Cancer benchmarking partnership: an international collaboration to inform cancer policy in Australia, Canada, Denmark, Norway, Sweden and the United Kingdom. Health Policy. 2013;112(1–2):148–55.
Maringe C, Walters S, Rachet B, Butler J, Fields T, Finan P, Maxwell R, Nedrebo B, Pahlman L, Sjovall A, et al. Stage at diagnosis and colorectal cancer survival in six high-income countries: a population-based study of patients diagnosed during 2000-2007. Acta oncologica. 2013;52(5):919–32.
Walters S, Maringe C, Coleman MP, Peake MD, Butler J, Young N, Bergstrom S, Hanna L, Jakobsen E, Kolbeck K, et al. Lung cancer survival and stage at diagnosis in Australia, Canada, Denmark, Norway, Sweden and the UK: a population-based study, 2004-2007. Thorax. 2013;68(6):551–64.
Benitez-Majano S, Fowler H, Maringe C, Di Girolamo C, Rachet B. Deriving stage at diagnosis from multiple population-based sources: colorectal and lung cancer in England. Br J Cancer. 2016;115(3):391–400.
Luo Q, Egger S, Yu XQ, Smith DP, O'Connell DL. Validity of using multiple imputation for "unknown" stage at diagnosis in population-based cancer registry data. PLoS One. 2017;12(6):e0180033.
Ostenfeld EB, Froslev T, Friis S, Gandrup P, Madsen MR, Sogaard M. Completeness of colon and rectal cancer staging in the Danish Cancer registry, 2004-2009. Clinical epidemiology. 2012;4(Suppl 2):33–8.
Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I, Sorensen HT, von Elm E, Langan SM. The REporting of studies conducted using observational routinely-collected health data (RECORD) statement. PLoS Med. 2015;12(10):e1001885.
van Walraven C, Bennett C, Forster AJ. Administrative database research infrequently used validated diagnostic or procedural codes. J Clin Epidemiol. 2011;64(10):1054–9.
Whyte JL, Engel-Nitz NM, Teitelbaum A, Gomez Rey G, Kallich JD. An evaluation of algorithms for identifying metastatic breast, lung, or colorectal Cancer in administrative claims data. Med Care. 2015;53(7):e49–57.
Brenkman HJ, Haverkamp L, Ruurda JP, van Hillegersberg R. Worldwide practice in gastric cancer surgery. World J Gastroenterol. 2016;22(15):4041.
Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, Parkin DM, Forman D, Bray F. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer. 2015;136(5):E359–86.
Dixon M, Mahar AL, Helyer LK, Vasilevska-Ristovska J, Law C, Coburn NG. Prognostic factors in metastatic gastric cancer: results of a population-based, retrospective cohort study in Ontario. Gastric Cancer. 2016;19(1):150–9.
Howlader N, Noone AM, Krapcho M, Miller D, Bishop K, Kosary CL, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA (eds). Bethesda: SEER Cancer Statistics Review, 1975-2014, National Cancer Institute. 2016. https://seer.cancer.gov/csr/1975_2014/.
National Cancer Registration and Analysis Service. Stage Breakdown by CCG 2014. London: NCRAS; 2016.
Northern Ireland Cancer Registry. Incidence by stage 2010–2014. Belfast: Queens University Belfast; 2016.
ISD Scotland. Detect Cancer early staging data. Scotland: ISD; 2016.
American Joint Committee on Cancer. AJCC staging manual. 7th ed. Chicago: Springer; 2012.
Brooks GA, Landrum MB, Keating NL. An administrative stage inference algorithm for use in patients receiving chemotherapy for colorectal cancer. J Clin Oncol. 2017;35:e18121.
Lash TL, Riis AH, Ostenfeld EB, Erichsen R, Vyberg M, Thorlacius-Ussing O. A validated algorithm to ascertain colorectal cancer recurrence using registry resources in Denmark. Int J Cancer. 2015;136(9):2210–5.
Chubak J, Pocobelli G, Weiss NS. Tradeoffs between accuracy measures for electronic health care data algorithms. J Clin Epidemiol. 2012;65(3):343–349.e342.
Lash TL, Fox MP, MacLehose RF, Maldonado G, McCandless LC, Greenland S. Good practices for quantitative bias analysis. Int J Epidemiol. 2014;43(6):1969–85.
Murphree D, Ngufor C, Upadhyaya S, Madde N, Clifford L, Kor DJ, Pathak J. Ensemble learning approaches to predicting complications of blood transfusion. Conf Proc IEEE Eng Med Biol Soc Ann Conf. 2015;2015:7222–5.
Howlader N, Noone AM, Krapcho M, Garshell J, Miller D, Altekruse SF, Kosary CL, Yu M, Ruhl J, Tatalovich Z, et al. SEER Cancer statistics review, 1975-2011, based on November 2013 SEER data submission. Bethesda: National Cancer Institute; 2014.
We would like to acknowledge the tireless, persistent, and important chart abstraction efforts of Dr. Jovanka Vasilevska-Ristovska and Dr. Matthew Dixon.
The authors have no financial interests to disclose. This research was funded by the Canadian Cancer Society (Grant #019325). Dr. Coburn is supported by the Hanna Family Research Chair in Surgical Oncology. This study was additionally supported by ICES, which is funded by an annual grant from the MOHLTC. The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred. Parts of this material are based on data and information provided by Cancer Care Ontario (CCO). The opinions, results, view, and conclusions reported in this paper are those of the authors and do not necessarily reflect those of CCO. No endorsement by CCO is intended or should be inferred. Parts of this material are also based on data and/or information compiled and provided by CIHI. However, the analyses, conclusions, opinions and statements expressed in the material are those of the author(s), and not necessarily those of CIHI.
Availability of data and materials
The dataset used in this study is held securely in coded format at ICES. Although data sharing agreements prohibit ICES from making the dataset publicly available, access may be granted to those who meet the conditions for confidential access, available at https://www.ices.on.ca.
AM and YJ conceived the study. AM and BZ participated in the design of the study. AM and NC participated in data acquisition. AM and YJ made substantial contributions to the interpretation of the data and drafted the manuscript. BZ performed the statistical analyses for the study and participated in manuscript revisions. NC made substantial contributions to the interpretation of the data and participated in manuscript revisions. All authors read and approved the final manuscript.
Ethics approval and consent to participate
The project received Research Ethics Board approval from the Sunnybrook Health Sciences Centre and adhered to all privacy and confidentiality regulations of ICES. Individual patient consent was not required. ICES is a section 45 Prescribed Entity under Ontario’s privacy law (PHIPA), which enables us to study the health and health outcomes of individuals for the purpose of analysis or compiling statistical information with respect to the management of, evaluation or monitoring of, the allocation of resources to or planning for all or part of the health system.
The authors declare they have no competing interests.
Table S1. Included diagnoses, ICD-9 and 10 codes used to identify metastatic disease for the different algorithms. (DOCX 15 kb)
Table S2. Algorithm properties when the patient cohort was restricted to those who received a surgical resection. Sensitivity, specificity, negative predictive value, positive predictive value and accuracy for the algorithms when applied to a subset of patients who received a surgical resection. (DOCX 15 kb)
Table S3. Number of patients in each cell, by algorithm. The breakdown of the number of true positives, true negatives, false positives and false negatives for each algorithm. (DOCX 19 kb)