Expert and crowd-sourced validation of an individualized sleep spindle detection method employing complex demodulation and individualized normalization.

Ray LB, Sockeel S, Soon M, Bore A, Myhr A, Stojanoski B, Cusack R, Owen AM, Doyon J, Fogel SM - Front Hum Neurosci (2015)

Bottom Line: Spindles were automatically detected in 15 young healthy subjects. These spindles were then compared between raters and to the automated detection to identify the presence of true positives, true negatives, false positives and false negatives. This method of automated spindle detection resolves or avoids many of the limitations that complicate automated spindle detection, and performs well compared to a group of non-experts, and importantly, has good external validity with respect to the extant literature in terms of the characteristics of automatically detected spindles.


Affiliation: Brain and Mind Institute, Western University, London, ON, Canada.

ABSTRACT
A spindle detection method was developed that: (1) extracts the signal of interest (i.e., spindle-related phasic changes in sigma) relative to ongoing "background" sigma activity using complex demodulation, (2) accounts for variations of spindle characteristics across the night, scalp derivations and between individuals, and (3) employs a minimum number of sometimes arbitrary, user-defined parameters. Complex demodulation was used to extract instantaneous power in the spindle band. To account for intra- and inter-individual differences, the signal was z-score transformed using a 60 s sliding window, per channel, over the course of the recording. Spindle events were detected with a z-score threshold corresponding to a low probability (e.g., 99th percentile). Spindle characteristics, such as amplitude, duration and oscillatory frequency, were derived for each individual spindle following detection, which permits spindles to be subsequently and flexibly categorized as slow or fast spindles from a single detection pass. Spindles were automatically detected in 15 young healthy subjects. Two experts manually identified spindles from C3 during Stage 2 sleep, from each recording; one employing conventional guidelines, and the other, identifying spindles with the aid of a sigma (11-16 Hz) filtered channel. These spindles were then compared between raters and to the automated detection to identify the presence of true positives, true negatives, false positives and false negatives. This method of automated spindle detection resolves or avoids many of the limitations that complicate automated spindle detection, and performs well compared to a group of non-experts, and importantly, has good external validity with respect to the extant literature in terms of the characteristics of automatically detected spindles.
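The detection steps described in the abstract (complex demodulation, a per-channel 60 s sliding z-score, and a threshold at a low-probability z value) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 13.5 Hz center frequency (the midpoint of the 11-16 Hz sigma band), the 0.3 s boxcar low-pass, the 0.5 s minimum duration, and the synthetic test signal are all assumptions made for the example. Only the 60 s window and the 99th-percentile threshold come from the text.

```python
import numpy as np
from statistics import NormalDist


def moving_average(x, width):
    """Centered boxcar mean; edges are averaged over the samples available."""
    width = max(1, min(int(width), x.size))
    kernel = np.ones(width)
    counts = np.convolve(np.ones(x.size), kernel, mode="same")
    return np.convolve(x, kernel, mode="same") / counts


def sigma_power(eeg, fs, center_hz=13.5, smooth_s=0.3):
    """Complex demodulation: shift center_hz to 0 Hz, low-pass, take power.

    center_hz and smooth_s are illustrative assumptions, not source values.
    """
    t = np.arange(eeg.size) / fs
    shifted = eeg * np.exp(-2j * np.pi * center_hz * t)
    baseband = moving_average(shifted, round(smooth_s * fs))
    return np.abs(baseband) ** 2


def sliding_zscore(x, fs, window_s=60.0):
    """Z-score each sample against a sliding window (60 s, as in the abstract)."""
    width = round(window_s * fs)
    mean = moving_average(x, width)
    var = moving_average(x ** 2, width) - mean ** 2
    return (x - mean) / np.sqrt(np.maximum(var, 1e-12))


def detect_events(z, fs, threshold, min_duration_s=0.5):
    """Return (start_s, end_s) for runs above threshold.

    min_duration_s is an assumption added to suppress brief noise crossings.
    """
    above = z > threshold
    # Pad with False so every run has both a rising and a falling edge.
    edges = np.diff(np.concatenate(([False], above, [False])).astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)  # exclusive end index
    return [(s / fs, e / fs) for s, e in zip(starts, ends)
            if (e - s) / fs >= min_duration_s]


# Synthetic single-channel demo: white noise plus one 13.5 Hz burst at 60-61 s.
rng = np.random.default_rng(0)
fs = 100.0
t = np.arange(int(120 * fs)) / fs
eeg = rng.standard_normal(t.size)
burst = (t >= 60.0) & (t < 61.0)
eeg[burst] += 5.0 * np.sin(2 * np.pi * 13.5 * t[burst])

z = sliding_zscore(sigma_power(eeg, fs), fs)
threshold = NormalDist().inv_cdf(0.99)  # z value at the 99th percentile
events = detect_events(z, fs, threshold)
print(events)
```

Because each sample is normalized against its own surrounding minute of data, the same threshold can be reused across channels, recordings and individuals, which is the motivation the abstract gives for the z-score step. Per-spindle characteristics such as duration follow directly from the returned start and end times.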



Figure 2: (A) High precision and recall across recordings when comparing Expert 1 to non-expert spindle scoring (black), low recall and variable precision across recordings when comparing Expert 1 to Expert 2 (open), and intermediate precision and recall between Expert 2 and non-experts (gray). (B) Inter-rater agreement was consistently high across subjects for Expert 1 vs. non-expert detections, ranging from 0.60 to 0.90 (mean F1 = 0.81, ±0.07), low and variable agreement between Expert 1 and Expert 2 ranging from 0.10 to 0.80 (mean F1 = 0.54, ±0.17), and intermediate and variable agreement between Expert 2 and non-experts ranging from 0.10 to 0.80 (mean F1 = 0.63, ±0.16). F1 score = harmonic mean of recall and precision.

Mentions: Overall, Expert 1 had a high mean proportion of correctly identified events relative to the total number of events identified by Expert 2 (i.e., precision = 0.85, ±0.21), but Expert 2 had a low mean proportion of spindles that were correctly identified relative to the total number of events scored by Expert 1 (i.e., recall = 0.40, ±0.14). There was a very high proportion of periods without spindles that were correctly identified by Expert 2 as compared to Expert 1 (i.e., specificity = 0.97, ±0.04) and a high proportion of 3 s periods of EEG without spindles identified by Expert 2 (NPV = 0.80, ±0.07), with a false positive rate of only 0.03, ±0.04. When recall and precision are both maximal (i.e., equal to 1), this represents perfect performance, and when recall and precision are plotted against one another (Figure 2A), data points crowd the upper right-hand corner. However, as shown in Figure 2A, data points were dispersed along the left-hand side of the plot, which resulted in low F1 scores (Figure 2B; mean F1 = 0.54, ±0.17), and a low and non-statistically significant phi coefficient (Φ = 0.49, ±0.18, p > 0.05).
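The agreement statistics quoted above all derive from the four cells of a confusion matrix. The sketch below uses the standard textbook definitions and assumes the counts are tallied over fixed scoring periods, such as the 3 s periods mentioned in the text. The `Confusion` class, the `f1` helper and the example counts are hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from math import sqrt


def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


@dataclass
class Confusion:
    tp: int  # periods both raters marked as spindle
    fp: int  # marked by the tested rater only
    fn: int  # marked by the reference rater only
    tn: int  # periods neither rater marked

    @property
    def precision(self):
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def npv(self):
        return self.tn / (self.tn + self.fn)

    @property
    def false_positive_rate(self):
        return self.fp / (self.fp + self.tn)  # equals 1 - specificity

    @property
    def f1(self):
        return f1(self.precision, self.recall)

    @property
    def phi(self):
        """Phi coefficient of the 2x2 table."""
        num = self.tp * self.tn - self.fp * self.fn
        den = sqrt((self.tp + self.fp) * (self.tp + self.fn)
                   * (self.tn + self.fp) * (self.tn + self.fn))
        return num / den


# Hypothetical counts for one recording (illustrative only).
c = Confusion(tp=40, fp=10, fn=60, tn=890)
print(c.precision, c.recall, c.specificity, c.npv, c.f1, c.phi)

# F1 evaluated at the reported mean precision and recall.
f1_of_means = f1(0.85, 0.40)
print(round(f1_of_means, 3))
```

Evaluating the harmonic mean at the reported mean precision (0.85) and mean recall (0.40) gives roughly 0.54, in line with the reported mean F1, although the mean of per-recording F1 scores need not in general equal the F1 of the mean values. The example also shows why a high specificity can coexist with a low F1: spindle-free periods vastly outnumber spindle periods, so true negatives dominate specificity and NPV but do not enter F1 at all.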

