Machine Learning

Background

The electrocardiogram (ECG) is one of the most common diagnostic tools for cardiovascular diseases, thanks to its inexpensive and noninvasive nature. As large datasets of detailed ECG recordings become available, machine learning (ML) tools can be applied to ECG analysis and lead to advancements in diagnosis, even detecting diseases and patient characteristics that are not apparent in traditional ECG interpretation, such as low left ventricular ejection fraction (LVEF).

Generally, an ML architecture takes in a series of inputs (e.g., ECG recordings) and tries to match them to targets (e.g., whether the LVEF is low). The model learns high-dimensional features from the input signals and attempts to match the patterns it observes to the labeled ground truth. An untrained ML model, of course, cannot make accurate predictions, so discrepancies arise between its outputs and the targets. These discrepancies can be computed and used to guide the model in updating its own parameters, driving its predictions closer to the ground truth. Once the model reaches satisfactory performance, the training phase is concluded. The algorithm is then tested on previously unseen datasets to evaluate how well it generalizes. If the model was trained properly, it should reach reasonable accuracy on new inputs it has never seen before.
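The loop described above can be sketched with a toy example. The snippet below uses a simple logistic-regression "model" on synthetic data purely for illustration (the actual studies use deep networks on ECG recordings): predictions are compared to labeled targets, the discrepancy drives parameter updates, and generalization is then checked on unseen inputs.

```python
import numpy as np

# Synthetic stand-in data: 200 "recordings" with 8 features each,
# and a binary target (e.g., low LVEF: yes/no). All values are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
true_w = rng.normal(size=8)
y = (X @ true_w > 0).astype(float)       # labeled ground truth

def predict(X, w):
    z = np.clip(X @ w, -30, 30)          # clip for numerical stability
    return 1 / (1 + np.exp(-z))

w = np.zeros(8)                          # untrained model: parameters at zero
for _ in range(500):                     # training phase
    p = predict(X, w)                    # model output
    grad = X.T @ (p - y) / len(y)        # discrepancy guides the update
    w -= 0.5 * grad                      # drive predictions toward the targets

# Generalization: evaluate accuracy on previously unseen inputs
X_test = rng.normal(size=(100, 8))
y_test = (X_test @ true_w > 0).astype(float)
accuracy = ((predict(X_test, w) > 0.5) == y_test).mean()
```

The same structure (predict, measure discrepancy, update, then test on held-out data) carries over to the deep architectures used in practice; only the model and the data change.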

Methods

In one completed study, our lab used ML to predict low LVEF, a clinical marker that is not available via traditional ECG analysis. The custom residual-based architecture incorporates filters in both the temporal and spatial dimensions, which helps capture the detailed patterns in ECGs and achieve robust accuracy. The study also compared training on 12-lead ECGs versus single-lead ECGs and found that certain leads, such as Lead I, achieved comparable performance, indicating that ML can be applied even when resources are limited (e.g., wearable ECG recordings).
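The idea of filtering in both dimensions, plus a residual connection, can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification, not the study's actual architecture: the temporal filter convolves each lead along time, the spatial filter linearly mixes the leads at every time sample, and the block adds its input back to its output.

```python
import numpy as np

def temporal_filter(x, kernel):
    # x: (leads, time); convolve each lead with the same 1-D kernel along time
    return np.stack([np.convolve(lead, kernel, mode="same") for lead in x])

def spatial_filter(x, mix):
    # mix: (leads, leads); linearly combines the leads at every time sample
    return mix @ x

def residual_block(x, kernel, mix):
    # Residual connection: the input is added back to the filtered signal
    return x + spatial_filter(temporal_filter(x, kernel), mix)

# Illustrative input: a 12-lead "ECG" of 500 random samples per lead
ecg = np.random.default_rng(1).normal(size=(12, 500))
out = residual_block(ecg, np.array([0.25, 0.5, 0.25]), 0.1 * np.eye(12))
```

In a trained network the kernel and mixing weights would be learned, and many such blocks would be stacked; the residual connection lets each block refine, rather than replace, the signal passed to it.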

ML has many potential applications in the field of cardiac electrophysiology. In addition to the signal-analysis approach described above, the technology can also be used for image analysis and patient-specific modeling. Using these principles, ML models can predict and classify many clinical markers of interest, including ones that are not readily accessible to human interpretation.

Relevant Papers

Bergquist, Jake A et al. “Performance of off-the-shelf machine learning architectures and biases in low left ventricular ejection fraction detection.” Heart Rhythm O2 vol. 5,9 (2024): 644-654. doi:10.1016/j.hroo.2024.07.009

Bergquist, Jake A et al. “Comparison of Machine Learning Detection of Low Left Ventricular Ejection Fraction Using Individual ECG Leads.” Computing in Cardiology vol. 50 (2023). doi:10.22489/cinc.2023.047

Database Curation – EDGAR

Background

The “Experimental Data and Geometric Analysis Repository” (EDGAR) is an Internet-based archive of curated data that is freely distributed to the international research community for the application and validation of electrocardiographic imaging (ECGI) techniques. One challenge in advancing ECGI technology is limited access to a wide range of data suitable for evaluation and comparison. A curated, accessible database of experimental, clinical, and simulation data from various centers allows researchers to benefit from a large and diverse data pool, in turn enhancing the accuracy of the ECGI approaches they can develop. To achieve that goal, EDGAR is maintained as a collaborative effort by the Consortium for ECG Imaging (CEI), which aims to host an accessible online repository of high-quality data in a standardized information format that facilitates effective exchange of diverse datasets.

Methods

In addition to a web interface that allows efficient access and retrieval of data, the EDGAR system proposes a metadata model that is standardized across datasets. The model describes the components of an ECGI dataset and their formats, facilitating the aggregation and comparison of multidisciplinary datasets. Each EDGAR dataset comprises five modules: Time Signals, Geometric Models, Forward and Inverse Transforms, Registration Information, and Medical Images. Experiment Metadata and Documentation are also included to associate each dataset with its experimental conditions.
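The five-module layout can be sketched as a simple data structure. The field names below are illustrative placeholders, not the repository's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class EDGARDataset:
    # The five standardized modules of an EDGAR dataset
    time_signals: dict = field(default_factory=dict)      # e.g., recorded potentials
    geometric_models: dict = field(default_factory=dict)  # torso/heart geometries
    transforms: dict = field(default_factory=dict)        # forward and inverse transforms
    registration: dict = field(default_factory=dict)      # alignment between geometries
    medical_images: dict = field(default_factory=dict)    # e.g., CT/MRI volumes
    # Experiment metadata and documentation tie the dataset to its conditions
    metadata: dict = field(default_factory=dict)

# Hypothetical usage: a dataset tagged with its originating center
ds = EDGARDataset(metadata={"center": "University of Utah", "type": "experimental"})
```

Standardizing the modules this way is what lets datasets from different centers and modalities be aggregated and compared on equal footing.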

To date, EDGAR includes experimental data from the University of Utah (USA), clinical data from the Charles University Hospital (Czech Republic), and simulation data from the Karlsruhe Institute of Technology (Germany). Data availability has been a key limitation in the development and validation of data-driven technologies. To address this need, EDGAR offers three main benefits: 1) providing researchers with access to hitherto unavailable, fully integrated and documented datasets; 2) providing a diverse, multi-modal data pool encompassing experimental, clinical, and simulation applications; and 3) providing a standardized platform for multi-center researchers to compare the results of different algorithms and numerical methods on the same data.

Relevant Papers

Aras, Kedar et al. “Experimental Data and Geometric Analysis Repository-EDGAR.” Journal of Electrocardiology vol. 48,6 (2015): 975-981. doi:10.1016/j.jelectrocard.2015.08.008