This report delves into the predictive analysis of Mild Cognitive Impairment (MCI) severity, leveraging high-density EEG data combined with nuanced neuropsychological scoring. It extends previous research by incorporating advanced feature engineering techniques, including source analysis from the Human Connectome Project (HCP) combined atlas, and exploring the efficacy of machine learning models like Logistic Regression, Random Forest, SVMs, and XGBoost in distinguishing between MCI severity levels.
Groups are based on the FRSSD total score, a functional scale indicating the severity of symptoms. The Functional Rating Scale for Symptoms of Dementia (FRSSD)[2] evaluate daily living skills affected by MCI across 14 dimensions, including eating, dressing, managing incontinence, communicating, sleeping, recognizing faces, maintaining personal hygiene, remembering names, recalling events, staying alert, understanding complex situations, knowing one's location, managing emotions, and interacting socially. The FRSSD assigns scores from 0 (no impairment) to 3 (severe impairment) for each activity, reflecting the individual's level of functional ability. It's important to note that this assessment is completed by caregivers, not the patients themselves, which means the scores might reflect the caregivers' perceptions and the impact of caregiving on their well-being. The scale uses a threshold score of 5 (authors recommendation) to distinguish between individuals who are likely healthy and those who may have dementia, providing a valuable tool for analysis of cognitive decline.
In this analysis the cutoff score of 5 was not used, instead the participants were divided into two groups based on the FRSSD total score: a lower-score group (0-3) and a higher-score group (4 or higher).
Fig.1 Functional Rating Scale for Symptoms of Dementia (FRSSD) and the distribution of scores in the dataset across the subjects. The scores are divided into two groups: lower-score group () and higher-score group ( or higher). No FRSSD tot score: , FRSSD tot score , FRSSD tot score .
The methodology involved preprocessing EEG data, source space analysis, extracting features, and linking these features with neuropsychological scores to create a comprehensive dataset. The dataset was then used to train and evaluate several machine learning models, including Logistic Regression, Support Vector Machines (SVM), and XGBoost, to classify subjects into two groups. The detailed steps for preprocessing the EEG data can be found in the previous report preproc02 and previous report model01.
The study utilized preprocessed EEG segments, with source analysis conducted for each multivariete timeseries segment to extract bandpower features from 42 ROIs using the HCP_combined atlas. Stratified sampling was ensured based on the groups and subject segments to avoid data leakage.
Utilized a custom function to compute source time courses (STCs), employing sLORETA ico4, snr 0.5 with fsaverage brain temaplate for inverse modeling and focusing on meaningful regions within the HCPMMP1_combined atlas using MNE-Python library[3]. The relative band powers and their ratios were calculated using the yasa library[6]. The features were standardized using the StandardScaler from the sklearn library. The dataset was split into training and testing sets, ensuring stratified sampling based on the groups and subject segments defined by FRSSD total scores. The models were trained and evaluated using accuracy and ROC AUC scores, with a focus on understanding the performance of each model in distinguishing between the unbalanced groups.
Default Mode Network (DMN):
Salience Network (SN):
Central Executive Network (CEN):
Sensorimotor Network:
Visual Processing Network:
Auditory and Language Network:
Attention and Frontoparietal Network:
Memory and Temporal Lobe Function:
Emotion and Affective Processing:
Executive Function and Decision Making:
Parietal Lobe Function:
Limbic System:
Plots of ROIs are presented below:

Fig.2 42 Regions of Interest (ROIs) used in the analysis. The ROIs are based on the HCPMMP1_combined atlas.
The animated GIF showcases a dynamic 3D visualization of brain activity over time, rendered on the "fsaverage" template brain. It's presented from multiple perspectives: lateral, medial, dorsal, and ventral views, laid out horizontally for a comprehensive examination of the entire brain surface. This visualization is based on source-localized EEG data, capturing the intricate patterns of neural dynamics across different brain regions. The color intensity on the brain's surface corresponds to the magnitude of activity to highlight variations in neural engagement. Time from 0.1 to 1.1 seconds at a slowed-down pace (time dilation factor of 100) for detailed analysis. Traveling waves can be observed across the brain's surface, ranging from 30 to 90 milliseconds.
The feature extraction process involved the following steps:
yasa library[6]. Averge power was calculated for the (1-4 Hz), (4-8 Hz), (8-12 Hz), (12-16 Hz), (16-30 Hz) and (30-45 Hz) frequency bands.Correlation matrix of the relative band powers for the ROIs was calculated to identify the relationships between different regions and frequency bands. The correlation matrix was visualized using a heatmap to provide a comprehensive overview of the interplay between ROIs and frequency bands.
Feature vector shape: ( , , )
Fig.3 Correlation matrix of the relative band powers for the ROIs.
Data Structure: The dataset (eeg_data_df) was categorized based on unique identifiers for each participant (EEGCode,Segment_id) and their respective group classifications, indicating different levels of cognitive impairment.
Stratified Grouping:
GroupKFold function from the sklearn library, the dataset was stratified based on the EEGCode and subject Segment_id defined by FRSSD total scores.train_df) and testing (test_df) were constructed.Training and Testing Sets:
Following dataset organization and stratification:
Data Preparation for Modeling:
X_train and X_test) were curated by removing non-predictive columns, focusing on variables crucial for predictive accuracy.y_train and y_test) were determined by the group classifications, representing cognitive impairment levels.Feature Scaling:
StandardScaler from the sklearn library, to ensure all features contributed equally to model decision-making, eliminating bias from varied feature scales. This step is crucial for linear models and benefits other algorithms by enhancing accuracy and convergence speed.Following the structured preparation of the dataset, several machine learning models were employed to discern patterns indicative of cognitive impairments:
Logistic Regression without Class Weights CV5: A logistic regression model was initialized and trained, applying scaled features from the training set. Predictions were then made on the scaled test set, with accuracy serving as the primary evaluation metric. This model served as a baseline for comparison.
Logistic Regression with Class Weights CV5: To address potential class imbalance, another logistic regression model was introduced, this time incorporating balanced class weights. After training, predictions were made on the test set, and the model's performance was evaluated based on accuracy and detailed through a classification report. ROC AUC curve was generated to visualize the model's ability to differentiate between classes.
Support Vector Machine (SVM) CV5: An SVM classifier with an RBF & Linear kernel, balanced class weights were trained, leveraging the scaled feature sets. The model's predictive accuracy was assessed, and its performance was further detailed in a classification report. An ROC AUC curve was plotted.
XGBoost Classifier CV5: The XGBoost model was employed next, using label-encoded class targets for training. This model's accuracy was evaluated, and its performance was summarized in a classification report.
Random Forest Classifier CV5: A Random Forest model was trained and evaluated, leveraging the scaled feature sets.
Sklearn's LogisticRegression, Random Forest and SVC were used to train the models. The XGBoost model was trained using the xgboost library. The GroupKFold and cross_val_score functions from the sklearn library were used to perform cross-validation and calculate ROC AUC scores. The classification_report function from the sklearn library was used to generate detailed classification reports. The roc_curve and roc_auc_score functions from the sklearn library were used to plot ROC curves and calculate ROC AUC scores.
The performance of each model was assessed based on their ability to distinguish between two classes: 'MCI-high' and 'MCI-low'. The results indicate varying degrees of success across different metrics, including accuracy, precision, recall, f1-score, and ROC AUC scores. A summary table is provided below to encapsulate the performance metrics for each model evaluated:
| Model | Accuracy | Precision (MCI-high) | Recall (MCI-high) | F1-Score (MCI-high) | Precision (MCI-low) | Recall (MCI-low) | F1-Score (MCI-low) | ROC AUC |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression (Unweighted) | 0.69 | 0.28 | 0.007 | 0.012 | 0.70 | 0.97 | 0.81 | 0.47 ± 0.13 |
| Logistic Regression (Weighted) | 0.48 | 0.26 | 0.47 | 0.33 | 0.68 | 0.48 | 0.56 | 0.47 ± 0.13 |
| SVM (rbf) | 0.68 | 0.19 | 0.02 | 0.04 | 0.70 | 0.95 | 0.81 | 0.45 ± 0.06 |
| SVM (lin) | 0.71 | 0.0 | 0.0 | 0.0 | 0.71 | 1.0 | 0.82 | 0.51 ± 0.10 |
| XGBoost | 0.63 | 0.21 | 0.09 | 0.13 | 0.69 | 0.86 | 0.77 | 0.43 ± 0.06 |
| Random Forest | 0.64 | 0.21 | 0.08 | 0.12 | 0.69 | 0.87 | 0.77 | 0.44 ± 0.07 |
Column Descriptions:
Fig.4 ROC AUC curve for the Logistic Regression model with class weights. The curve illustrates the model's ability to distinguish between 'MCI-high' and 'MCI-low' classes, with an average ROC AUC score of 0.47 ± 0.13.
Fig.5 ROC AUC curve for the SVM model with an RBF kernel. The curve illustrates the model's ability to distinguish between 'MCI-high' and 'MCI-low' classes, with an average ROC AUC score of 0.45 ± 0.06.
Fig.6 ROC AUC curve for the XGBoost model. The curve illustrates the model's ability to distinguish between 'MCI-high' and 'MCI-low' classes, with an average ROC AUC score of 0.43 ± 0.06.
Analysis has revealed significant insights into the classification of Mild Cognitive Impairment (MCI) severity using EEG data and neuropsychological scores. Key findings include:
The endeavor to predict MCI severity using advanced feature engineering and machine learning models has encountered notable challenges, chief among them being the dataset's imbalance. The predominance of 'MCI-low' cases over 'MCI-high' cases (30912 vs. 12600) underscores the necessity for careful consideration in model training and evaluation to mitigate potential biases. The importance of segments length and the number of segments per subject in the dataset was also observed.
The variability in model performance, particularly the models' propensity to better identify 'MCI-low' cases, suggests a need for a nuanced approach in model selection and optimization. While models like XGBoost and SVM showed potential, their effectiveness was inconsistent across classifications, necessitating a critical evaluation of both the modeling strategies employed and the inherent dataset characteristics.
Moreover, the challenge in accurately identifying 'MCI-high' cases calls for further investigation into feature engineering techniques and the incorporation of more sophisticated modeling approaches. This might include exploring multivariate analysis methods, which could offer more robust and balanced classification frameworks without the need for deep learning models at this stage.
The analysis presented in this report highlights the nuanced potential of utilizing machine learning models to understand cognitive impairment through EEG data and neuropsychological assessments. The findings suggest a reevaluation of model selection and optimization strategies, particularly in addressing dataset imbalances and enhancing model sensitivity across varied impairment levels.
The challenges observed in distinguishing between 'MCI-high' and 'MCI-low' cases emphasize the need for more sophisticated approaches to feature engineering, model tuning, and the exploration of advanced analytical techniques. Moving forward, there is a clear avenue for further research and model refinement, with the ultimate goal of developing a balanced and effective classification framework capable of contributing to early detection and intervention strategies for cognitive impairments.