
Modeling the Impact of Inter-Rater Disagreement on Sleep Statistics using Deep Generative Learning.

Abstract

Sleep staging is the process by which an overnight polysomnographic measurement is segmented into epochs of 30 seconds, each of which is annotated as belonging to one of five discrete sleep stages. The resulting scoring is graphically depicted as a hypnogram, and several overnight sleep statistics are derived, such as total sleep time and sleep onset latency. Gold standard sleep staging as performed by human technicians is time-consuming, costly, and comes with imperfect inter-scorer agreement, which also results in inter-scorer disagreement about the overnight statistics. Deep learning algorithms have shown promise in automating sleep scoring, but struggle to model inter-scorer disagreement in sleep statistics. To that end, we introduce a novel technique using conditional generative models based on Normalizing Flows that permits the modeling of the inter-rater disagreement of overnight sleep statistics, termed U-Flow. We compare U-Flow to other automatic scoring methods on a hold-out test set of 70 subjects, each scored by six independent scorers. The proposed method achieves similar sleep staging performance in terms of accuracy and Cohen’s kappa on the majority-voted hypnograms. At the same time, U-Flow outperforms the other methods in terms of modeling the inter-rater disagreement of overnight sleep statistics. The consequences of inter-rater disagreement about overnight sleep statistics may be great, and the disagreement potentially carries diagnostic and scientifically relevant information about sleep structure. U-Flow is able to model this disagreement efficiently and can support further investigations into the impact inter-rater disagreement has on sleep medicine and basic sleep research.
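To make the link between scoring and overnight statistics concrete, the following minimal sketch derives total sleep time and sleep onset latency from a hypnogram and shows how two scorers can disagree on both. It is an illustration only, not the U-Flow method: the stage labels (`W`, `N1`, `N2`, `N3`, `REM`), the function names, and the toy hypnograms are assumptions, and sleep onset latency is measured from the start of the recording, which is assumed to coincide with lights-off.

```python
EPOCH_SECONDS = 30  # epoch length stated in the abstract
WAKE = "W"          # assumed wake label; other assumed labels: N1, N2, N3, REM


def total_sleep_time_min(hypnogram):
    """Minutes spent in any non-wake stage."""
    sleep_epochs = sum(1 for stage in hypnogram if stage != WAKE)
    return sleep_epochs * EPOCH_SECONDS / 60


def sleep_onset_latency_min(hypnogram):
    """Minutes from recording start to the first non-wake epoch.

    Returns None if the hypnogram contains no sleep at all.
    """
    for index, stage in enumerate(hypnogram):
        if stage != WAKE:
            return index * EPOCH_SECONDS / 60
    return None


def spread(values):
    """Range (max minus min) of a statistic across scorers."""
    return max(values) - min(values)


# Two hypothetical scorers annotating the same eight epochs (toy data).
scorer_a = ["W", "W", "N1", "N2", "N2", "N2", "N2", "REM"]
scorer_b = ["W", "W", "W", "N1", "N2", "N2", "N2", "REM"]

tst = [total_sleep_time_min(h) for h in (scorer_a, scorer_b)]
sol = [sleep_onset_latency_min(h) for h in (scorer_a, scorer_b)]

print("Total sleep time (min):", tst, "spread:", spread(tst))
print("Sleep onset latency (min):", sol, "spread:", spread(sol))
```

In this toy case the scorers disagree on a single epoch, yet both statistics shift by half a minute. Over a full night of several hundred epochs, such differences can accumulate, which is the kind of disagreement in overnight statistics the abstract refers to.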
