Biomedical Journal of Scientific & Technical Research

August 2021, Volume 37, Issue 4, pp. 29654-29657

Brief Report

Machine Learning Analysis of a Chilean Breast Cancer Registry

Francisco Acevedo1,2, Leonardo Causa3, Sebastián Bravo3, Pablo García3, Ricardo Cuevas3, Maria Loreto Bravo1, Carla Avellaira4, Sabrina Muñiz2, Militza Petric2, Raúl Martinez2, Constanza Guerra2, Marisel Navarro2, Carla Taramasco5 and Cesar Sanchez1*

Author Affiliations

1Department of Hematology-Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, Santiago, Chile

2Complejo Asistencial y Hospital Dr. Sotero del Rio, Santiago, Chile

3Research and Development Department, B-Trials SpA, Facultad de Ingeniería y Negocios, Universidad de las Américas, Santiago, Chile

4Real World Evidence, Medical Affairs Department, Roche Chile, Santiago, Chile

5Escuela de Ingeniería Civil Informática, Universidad de Valparaíso, Centro Nacional de Sistemas de Información en Salud (CENS), Valparaiso, Chile

Received: July 22, 2021 | Published: August 02, 2021

Corresponding author: Cesar Sánchez, Department of Hematology-Oncology, Facultad de Medicina, Pontificia Universidad Católica de Chile, Diagonal Paraguay 362, 6th FL RM608, Postal Code: 8330077, Santiago, Chile

DOI: 10.26717/BJSTR.2021.37.006037

Abstract

In recent years, artificial intelligence (AI) and machine learning (a form of AI) have offered valuable tools for medicine by training algorithms on data in order to make predictions. Herein, we applied a machine learning algorithm to analyze data from a breast cancer (BC) registry spanning more than 20 years, compiled at two Chilean health institutions (a public hospital and a private center), that includes a total of 4838 patients and their basic clinical-pathological characteristics. Preliminary results suggest that this cohort can be subdivided into five clusters according to key variables, and that these clusters also correlate with overall survival and disease-free survival rates. To our knowledge, this is the first Latin American report of its kind. Our laboratory is currently expanding these analyses.

Keywords: Breast Cancer; Machine Learning; Overall Survival; Disease-Free Survival

Purpose

As in several other countries, breast cancer (BC) is one of the leading causes of cancer-related death among Chilean women [1]. Like other malignancies, breast neoplasms are characterized by their heterogeneity. This applies not only to the clinical features of patients but also to molecular, genetic and histologic characteristics [2]. Similarly, incidence rates and associated risk factors display marked geographic variability [1]. To date, several studies have reported BC incidence and prevalence rates in both Europe and North America, along with clinical-genetic characteristics and prognosis. In sharp contrast, South American reports on these topics are scarce [3]. Indeed, only a few Latin American studies have included data on limited populations, mostly from Brazil and Mexico [4,5]. Unpublished data from our group suggest that differences in lifestyle, along with a diverse racial background, could explain particular characteristics observed in the Chilean population.

In recent decades, Artificial Intelligence (AI) has emerged as an innovative and valuable tool in medicine, helping to achieve more accurate patient diagnoses and to support medical decision-making. Interestingly, certain studies demonstrate that AI algorithms can compete with or even outperform clinicians in specific tasks [6]. In lay terms, AI algorithms can be "trained" using sample data. Thus, algorithms "learn" to do their job much like doctors learn by attending medical school for years, making the right decisions and sometimes mistakes. Within this context, Machine Learning (a form of AI) seeks to apply algorithms and build models based on training data in order to make predictions in a variety of applications, including medicine [7]. In 1997, our institution started a longitudinal BC registry that included invasive disease cases. In recent years, our group has generated several publications based on this registry, focused on BC incidence and the clinical characteristics of patients [8-10]. Herein, we report preliminary results from applying machine learning to our local BC patient registry.

Patients and Methods

This study was part of a collaborative effort between Hospital Sotero del Rio and the Cancer Center at Pontificia Universidad Católica de Chile, the former a public hospital and the latter a university cancer center, both in Santiago, Chile. We sought to identify relevant clusters of BC patients associated with clinical characteristics and survival, which would allow us to evaluate and propose patient-adapted therapeutic schemes. The K-medoids clustering algorithm was used to define patient profiles based on demographic information (sex, age, weight and height, family history of cancer, comorbidities and BC risk factors) and clinical-pathological information (stage, BC subtype, surgery, type of systemic treatment). Once the groups were separated, survival rates were calculated using the Kaplan-Meier method. This analysis allowed us to link patient profiles with the behavior of survival rates. Then, data analytics methods were applied to determine the most relevant variables for each cluster and their correlation with survival rates. Finally, we estimated the evolution over time of the treatments carried out (trajectories). In this way, it is possible to describe treatment schemes for each of the defined clusters.
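The two core steps described above, K-medoids clustering followed by Kaplan-Meier estimation within each cluster, can be illustrated with a minimal sketch. It is not the implementation used in this study: the report does not specify the software, the distance measure or the medoid update scheme. The sketch assumes a Gower-style distance for mixed numeric and categorical variables and a simple alternating medoid update. All function names and the six toy records (age, body mass index, family history, follow-up time, event flag) are hypothetical and invented for illustration only.

```python
import random

def feature_ranges(records):
    """Range (max - min) of each numeric column; None marks a categorical column."""
    return [None if isinstance(col[0], str) else max(col) - min(col)
            for col in zip(*records)]

def gower_like_distance(a, b, ranges):
    """Assumed metric: mean per-feature dissimilarity for mixed-type records."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:                      # categorical: simple mismatch
            total += 0.0 if x == y else 1.0
        elif r > 0:                        # numeric: range-scaled absolute difference
            total += abs(x - y) / r
    return total / len(a)

def assign(dist, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, k, max_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix."""
    medoids = random.Random(seed).sample(range(len(dist)), k)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:                # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)

def kaplan_meier(times, events):
    """Survival estimate [(t, S(t))] at each event time; events: 1 = event, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

# Hypothetical toy records: (age at diagnosis, body mass index, family history).
records = [(35, 22.0, "yes"), (38, 23.5, "yes"), (40, 21.0, "yes"),
           (68, 31.0, "no"), (72, 29.5, "no"), (70, 33.0, "no")]
follow_up_months = [60, 72, 55, 20, 34, 15]   # hypothetical follow-up times
event_observed = [0, 0, 1, 1, 0, 1]           # hypothetical: 1 = death, 0 = censored

ranges = feature_ranges(records)
dist = [[gower_like_distance(a, b, ranges) for b in records] for a in records]
medoids, labels = k_medoids(dist, k=2)

# One Kaplan-Meier curve per cluster, linking patient profiles to survival.
curves = {}
for c in sorted(set(labels)):
    idx = [i for i, lab in enumerate(labels) if lab == c]
    curves[c] = kaplan_meier([follow_up_months[i] for i in idx],
                             [event_observed[i] for i in idx])
print(medoids, labels, curves)
```

Because medoids are actual patients rather than averaged centroids, each cluster can be summarized by a representative record, and only pairwise distances are needed, which suits mixed demographic and clinical-pathological variables. On a real registry, a maintained library implementation and a formal comparison of survival curves between clusters would be preferable to this sketch.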

Results

Overall, a total of 4838 registered BC patients were included in our study. Our analyses divided patients into five clusters with marked differences in clinical characteristics and prognoses (Figure 1). The key variables that defined these clusters were age at diagnosis, body mass index, family history of cancer (in a first-degree relative), comorbidities (mainly hypertension), compromised nodes, and BC relapse. Clusters were also associated with significant differences in overall and disease-free survival (Figure 2).

Figure 1: Graphical three-dimensional representation of the five clusters of patients generated by our model. Panels A, B and C show different angles.

Figure 2: Overall survival for the identified clusters of patients.

Conclusion

To our knowledge, this is the first Latin American report applying a machine learning approach to analyze BC registry data, including clinical features and survival outcomes. Our preliminary findings support the capacity of machine learning to differentiate clusters of BC patients with distinct clinical characteristics and prognoses. Currently, we are validating this approach and expanding our database.

References
