NeuroTrace: A Novel Machine-Powered System to Detect Neurodegeneration through Handwriting Kinematics Analysis
Neurodegenerative diseases (NDs), like Parkinson’s and Alzheimer’s, affect more than 62 million people worldwide, especially seniors. Most NDs share the common pattern of damaging the motor system, which leads to the loss of hand control. Hand motions can become segmented and jagged, and daily activities, like writing, can be seriously affected.
Experimental Design & Methodology
1. Dataset Preparation
- The training data came from the DARWIN handwriting dataset (n=174), hosted on the UCI Machine Learning Repository.
- DARWIN contains 25 handwriting tasks, with 18 metrics produced from each task (avg. speed, acceleration, jerk, pressure in x,y,z, etc.).
- Less informative tasks and metrics were removed using scikit-learn feature importance scores (for example, the feature_importances_ attribute of a fitted Random Forest).
- 6 out of the 25 original handwriting tasks in DARWIN were chosen for training and testing, based on the most distinctive features (line, circle, cursive, etc):
- Test 1: Tracing horizontal lines
- Test 2: Tracing vertical lines
- Test 3: Tracing a large circle (6cm)
- Test 4: Tracing a small circle (3cm)
- Test 5: Writing l’s in cursive
- Test 6: Writing la’s in cursive

The six different NeuroTrace tasks.
- 12 metrics out of the 18 were used to reduce the number of dimensions.
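The metric selection step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the ranking came from a fitted Random Forest's feature_importances_ attribute, it runs on synthetic data from make_classification rather than DARWIN, and the metric names are hypothetical placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for one task's metrics: 174 subjects, 18 metrics.
# (The real DARWIN columns are not reproduced here.)
X, y = make_classification(
    n_samples=174, n_features=18, n_informative=6, random_state=0
)
feature_names = [f"metric_{i}" for i in range(X.shape[1])]  # hypothetical names

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# feature_importances_ holds one non-negative score per feature, summing to 1.
importances = forest.feature_importances_
ranked = np.argsort(importances)[::-1]  # feature indices, most important first

top_k = 12  # keep 12 of the 18 metrics, as described in the text
selected = [feature_names[i] for i in ranked[:top_k]]
X_reduced = X[:, ranked[:top_k]]
print(selected)
```

Ranking by importance and keeping the top 12 is only one way to reduce dimensions; the cutoff here simply mirrors the number reported above.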
2. Random Forest Model Training & Optimization
- Scikit-learn’s Random Forest Classifier was used to train the NeuroTrace model.
- The Random Forest classification model was chosen because averaging many decision trees makes it less prone to overfitting than a single tree, which helps it generalize to unseen data.
- Inputs: The metrics (features) of each of the 6 tasks of a patient
- Outputs: The classification of the subject (healthy/patient), along with evaluation metrics that are used to tune the model’s hyperparameters.
- 75%/25% training/testing split
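As a rough sketch of the training setup described above, the snippet below fits scikit-learn's RandomForestClassifier on a 75%/25% split. The data is synthetic; the 72-column layout (6 tasks x 12 metrics), the label encoding, and the n_estimators value are assumptions made for illustration, not details taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder: 174 subjects, assumed 6 tasks x 12 metrics = 72 features.
X, y = make_classification(
    n_samples=174, n_features=72, n_informative=10, random_state=0
)

# 75%/25% split; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)  # 0 = healthy, 1 = patient (assumed encoding)
print(f"held-out accuracy: {accuracy_score(y_test, y_pred):.2f}")
```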

Hyperparameter Tuning process.

Performance metrics vs. parameters (# of features & max depth). The 5-fold CV (cross-validation) score is highlighted; it drops beyond certain parameter values.
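One common way to run this kind of tuning is a grid search with 5-fold cross-validation. The sketch below assumes scikit-learn's GridSearchCV was (or could be) used; the grid values are hypothetical and the data is synthetic, so the printed numbers say nothing about NeuroTrace itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data with the same number of subjects as DARWIN.
X, y = make_classification(
    n_samples=174, n_features=72, n_informative=10, random_state=0
)

# Hypothetical search grid over the two parameters named in the caption.
param_grid = {
    "max_features": [4, 8, 16],   # number of features tried at each split
    "max_depth": [3, 5, None],    # None lets trees grow until leaves are pure
}

search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)
print(f"best mean 5-fold CV accuracy: {search.best_score_:.2f}")
```

Plotting search.cv_results_["mean_test_score"] against each parameter would give a curve like the one in the figure, where the CV score falls off once the model becomes too complex.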

Average performance metrics and confusion matrix for trained NeuroTrace RF model.
- The model has high accuracy (90%) and precision (94%), and fairly good recall (84%), with an F1 score (harmonic mean of precision and recall) of 88%.
- The 5-fold cross-validation score (80%) and ROC-AUC (area under the curve) score (90%) suggest that NeuroTrace generalizes well to unseen (real-life) data.
- Results are statistically significant, meaning that NeuroTrace’s predictions are better than random chance.
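As a quick check on how the reported numbers relate, F1 is the harmonic mean of precision and recall. The helper below (a hypothetical name, not part of the project) applies that formula to the rounded figures from the text.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall reported for the trained model.
f1 = f1_from(0.94, 0.84)
print(f"{f1:.3f}")  # prints 0.887
```

The result is close to the reported 88%; the small gap is expected because the inputs here are already rounded.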
3. NeuroTrace Frontend Development
- Developed a portable, interactive HTML/JavaScript program that collects handwriting kinematics data (12 features) over the chosen 6 tasks.
- A WACOM Intuos Tablet collects handwriting data with an electronic pen.
- The program captures pen kinematics (x, y, pressure, tilt, pen-ups/pen-downs) from a calibrated WACOM tablet at 120 Hz and exports a kinematics CSV file.
- A paper pad printed with the 6 tasks is mapped to the corresponding tasks in the program. Participants traced the paper with the electronic pen, testing hand-eye coordination.
The raw testing data can be found here. All participants are anonymous per signed informed consent forms.
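| Time (cycle) | Pendown | X Position | Y Position | Pressure | TiltX | TiltY |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | 530 | 277 | 0.001 | 0 | 0 |
| 2 | 1 | 533 | 276 | 0.215 | 0 | 0 |
| 3 | 0 | 536 | 270 | 0.222 | 0 | 0 |

Example entries from 1 of the 6 task CSV files in the dataset.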
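A file with this layout can be read with Python's standard csv module. The sketch below uses the three example rows shown above; the lowercase header names are assumed for illustration, since the exact column labels in the exported file are not given in the text.

```python
import csv
import io

# The three example rows from the text, with assumed header names.
sample_csv = """time,pendown,x,y,pressure,tilt_x,tilt_y
1,0,530,277,0.001,0,0
2,1,533,276,0.215,0,0
3,0,536,270,0.222,0,0
"""


def load_samples(text):
    """Parse exported pen samples into a list of typed dictionaries."""
    samples = []
    for row in csv.DictReader(io.StringIO(text)):
        samples.append({
            "time": int(row["time"]),
            "pendown": int(row["pendown"]),
            "x": float(row["x"]),
            "y": float(row["y"]),
            "pressure": float(row["pressure"]),
            "tilt_x": float(row["tilt_x"]),
            "tilt_y": float(row["tilt_y"]),
        })
    return samples


samples = load_samples(sample_csv)
xs = [s["x"] for s in samples]
mean_pressure = sum(s["pressure"] for s in samples) / len(samples)
print(xs, round(mean_pressure, 3))
```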

Data collection device - WACOM Intuos Tablet (120 Hz).

NeuroTrace web page program, with on-page annotations made by drawing on the tablet.
- The CSV file exported by the HTML program is processed and normalized to the DARWIN dataset's scaling, then analyzed by the trained NeuroTrace Random Forest model.
- The tested handwriting kinematics were calculated with central difference approximations of the parametric kinematics equations. The time step between samples is 1/120 s because the WACOM tablet operates at 120 Hz, so the n-th derivative is divided by (1/120)^n.
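The standard central difference formulas for the first three derivatives can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, the formulas are the textbook ones rather than the project's confirmed implementation, and positions are left in raw tablet units.

```python
import math

SAMPLE_RATE_HZ = 120          # tablet sampling rate from the text
DT = 1.0 / SAMPLE_RATE_HZ     # time between samples, in seconds


def velocity(p, dt=DT):
    """First derivative: (p[i+1] - p[i-1]) / (2*dt) at interior samples."""
    return [(p[i + 1] - p[i - 1]) / (2 * dt) for i in range(1, len(p) - 1)]


def acceleration(p, dt=DT):
    """Second derivative: (p[i+1] - 2*p[i] + p[i-1]) / dt**2."""
    return [(p[i + 1] - 2 * p[i] + p[i - 1]) / dt**2
            for i in range(1, len(p) - 1)]


def jerk(p, dt=DT):
    """Third derivative: (p[i+2] - 2*p[i+1] + 2*p[i-1] - p[i-2]) / (2*dt**3)."""
    return [(p[i + 2] - 2 * p[i + 1] + 2 * p[i - 1] - p[i - 2]) / (2 * dt**3)
            for i in range(2, len(p) - 2)]


def mean_speed(x, y, dt=DT):
    """Average magnitude of the (vx, vy) velocity vector over a stroke."""
    vx, vy = velocity(x, dt), velocity(y, dt)
    return sum(math.hypot(a, b) for a, b in zip(vx, vy)) / len(vx)


# X and Y positions from the three example CSV rows in the text.
print(mean_speed([530, 533, 536], [277, 276, 270]))  # tablet units per second
```

Each formula needs neighbours on both sides, so the first and last samples of a stroke (two on each side for jerk) have no estimate.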
4. Real-life NeuroTrace Prototype Testing & Data Analysis
- Randomly recruited seniors (n=18, mean age=80) from nearby senior facilities (Chandler & Gilbert, Arizona) participated in NeuroTrace testing.
- All subjects signed informed consent forms explaining the purpose of the study and the data protection policies.
- Subjects completed two tasks in about 10 minutes:
- Completed a quick demographics survey about subjects’ history of neurodegenerative diseases and their educational & vocational (job) history.
- This survey was used to determine if subjects were positive (Patient) or negative (Control) ground truth for data collection.
- Utilized the calibrated WACOM tablet, digital pen, and software to complete the 6 NeuroTrace tracing tasks.

Participant completing vertical dots task.

Different participant completing cursive 'l' task.

Data collection WACOM tablet, set up with the tracing assignment. The subject traces the ring with the pen, which is recorded by the NeuroTrace software.

Each of the 6 NeuroTrace tasks for subjects to trace.
- Data Analysis
- According to the confusion matrix for participant testing, the model was 88% accurate at identifying positives and negatives for neurodegenerative diseases (Alzheimer’s, Parkinson’s, MS, etc), with an 80% F1 score.
- This suggests that there is a distinct difference between the handwriting kinematics of healthy control subjects and those of patients.
- Limitations: the statistics are based on a very limited sample size (n=18), so additional testing is needed.
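To show how accuracy and F1 follow from a confusion matrix, the sketch below computes both from the four cell counts. The counts used are illustrative only: they were chosen so that the arithmetic lands on the reported figures for n=18 and are NOT the study's actual confusion matrix, which appears only in the figure.

```python
def summarize(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for 18 subjects (illustration only, not the real matrix).
metrics = summarize(tp=4, fp=1, fn=1, tn=12)
print({name: round(value, 3) for name, value in metrics.items()})
```

With only 18 subjects, moving a single participant from one cell to another shifts accuracy by more than 5 percentage points, which is why the sample-size limitation matters.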

Confusion matrix for NeuroTrace model evaluation (n=18).

NeuroTrace training and real-life testing accuracy metrics.
Conclusion
NeuroTrace supports early neurodegenerative disease detection with 80%+ accuracy. It aims to help create a future where the burden of neurodegenerative diseases is alleviated, with the following implications and applications:
- Quick, accurate, and non-invasive screening tool
- Cost-effective, efficient application for low-resource groups
- Effective remote diagnostic and medical monitoring tool
- Useful longitudinal study tool due to its quick & simple tasks
What’s next?
As with any good project, there is future research planned:
- Multimodal Expansion: Combine handwriting with speech or facial expression analysis for higher diagnostic sensitivity.
- Personalization: Train adaptive models that consider demographic, vocational, and medical background data.
- Longitudinal Tracking: Use NeuroTrace regularly over the long term as a checkup tool through repeated handwriting sessions.
- Clinical Integration: Partner with neurology labs to deploy NeuroTrace as a pre-diagnostic screening application.
Summary Poster
Presented at the 2024 Arizona Science and Engineering Fair, placing 3rd in the Translational Medical Sciences category.
