Investigating the impact of vulnerability datasets on deep learning-based vulnerability detectors.

Researchers

Journal

Modalities

Models

Abstract

Software vulnerabilities have led to system attacks and data leakage incidents, and software vulnerabilities have gradually attracted attention. Vulnerability detection had become an important research direction. In recent years, Deep Learning (DL)-based methods had been applied to vulnerability detection. The DL-based method does not need to define features manually and achieves low false negatives and false positives. DL-based vulnerability detectors rely on vulnerability datasets. Recent studies found that DL-based vulnerability detectors have different effects on different vulnerability datasets. They also found that the authenticity, imbalance, and repetition rate of vulnerability datasets affect the effectiveness of DL-based vulnerability detectors. However, the existing research only did simple statistics, did not characterize vulnerability datasets, and did not systematically study the impact of vulnerability datasets on DL-based vulnerability detectors. In order to solve the above problems, we propose methods to characterize sample similarity and code features. We use sample granularity, sample similarity, and code features to characterize vulnerability datasets. Then, we analyze the correlation between the characteristics of vulnerability datasets and the results of DL-based vulnerability detectors. Finally, we systematically study the impact of vulnerability datasets on DL-based vulnerability detectors from sample granularity, sample similarity, and code features. We have the following insights for the impact of vulnerability datasets on DL-based vulnerability detectors: (1) Fine-grained samples are conducive to detecting vulnerabilities. (2) Vulnerability datasets with lower inter-class similarity, higher intra-class similarity, and simple structure help detect vulnerabilities in the original test set. (3) Vulnerability datasets with higher inter-class similarity, lower intra-class similarity, and complex structure can better detect vulnerabilities in other datasets.© 2022 Liu et al.

Show Full Text

Investigating the impact of vulnerability datasets on deep learning-based vulnerability detectors.

Researchers

Journal

Modalities

Models

Abstract

Predicting prime editing efficiency and product purity by deep learning.

Fully Automated CT Quantification of Epicardial Adipose Tissue by Deep Learning: A Multicenter Study.

Using deep-learning to forecast the magnitude and characteristics of urban heat island in Seoul Korea.

GoldDigger and Checkers, computational developments in cryo-scanning transmission electron tomography to improve the quality of reconstructed volumes.

Predicting Two-Photon Absorption Spectra of Octupolar Molecules: A Deep-Learning Approach Based Exclusively on Molecular Structures.

Classification of Mental Stress from Wearable Physiological Sensors Using Image-Encoding-Based Deep Neural Network.

Leave a Reply Cancel reply

Researchers

Journal

Modalities

Models

Abstract

Similar Posts

Leave a Reply Cancel reply