TAP: A static analysis model for PHP vulnerabilities based on token and deep learning technology.

November 18, 2019 | Computer Science, Software Security

Researchers

Cheng Huang, Runpu Wu, Shengjun Han, Yong Fang

Journal

PLOS ONE

Models

Deep learning, Long Short-Term Memory

Abstract

With the widespread use of Web applications, security issues in source code are increasing, and exposed vulnerabilities seriously endanger the interests of service providers and customers. Several models address this problem, but most of them rely on complex graphs generated from source code or on regex patterns based on expert experience. This paper proposes TAP, a static analysis model based on the token mechanism and deep learning, to discover vulnerabilities in PHP: Hypertext Preprocessor (PHP) Web programs conveniently. Building on the token mechanism of the PHP language, a custom tokenizer was designed that unifies tokens, supports some features of PHP, and optimizes parsing. The tokenizer also implements parameter iteration to achieve data flow analysis. The deep learning model of TAP was trained on the Software Assurance Reference Dataset (SARD) and the SQLI-LABS dataset by combining the word2vec model with a Long Short-Term Memory (LSTM) network. In the experiment on the CWE-89 (SQL injection) dataset, TAP achieves an Area Under the Curve (AUC) of 0.9941, which is better than the other models, and the highest accuracy, 0.9787. Further, compared with RIPS, TAP performs much better in multiclass classification, with a Kappa of 0.8319 and a Hamming distance of 0.0840.
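To make the idea of token unification more concrete, here is a minimal Python sketch. It is not the authors' tokenizer: the regex patterns, the `unify_tokens` function, and the placeholder names `VAR0`, `STR`, and `NUM` are all assumptions made for illustration, and the sketch covers only a tiny subset of PHP. It shows one plausible form of unification, in which user-chosen variable names and literals are replaced by generic placeholders so that a downstream model such as word2vec plus LSTM sees a small, consistent vocabulary. It does not attempt the parameter iteration or data flow analysis mentioned in the abstract.

```python
import re

# Toy token patterns for a tiny subset of PHP. This is illustrative only;
# PHP's real lexer recognises many more token types.
TOKEN_SPEC = [
    ("T_OPEN_TAG", r"<\?php"),
    ("T_CLOSE_TAG", r"\?>"),
    ("T_VARIABLE", r"\$[A-Za-z_]\w*"),
    ("T_STRING_LITERAL", r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\""),
    ("T_NUMBER", r"\d+(?:\.\d+)?"),
    ("T_NAME", r"[A-Za-z_]\w*"),
    ("T_SKIP", r"\s+"),
    ("T_SYMBOL", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

# Assumption: superglobals are kept verbatim because they mark user input.
SUPERGLOBALS = {"$_GET", "$_POST", "$_COOKIE", "$_REQUEST", "$_SERVER", "$_FILES"}


def unify_tokens(php_source):
    """Return a token list with variable names and literals normalised."""
    var_ids = {}  # original variable name -> placeholder such as VAR0
    tokens = []
    for match in TOKEN_RE.finditer(php_source):
        kind, text = match.lastgroup, match.group()
        if kind == "T_SKIP":
            continue  # whitespace carries no signal in this sketch
        if kind == "T_VARIABLE" and text not in SUPERGLOBALS:
            # Number variables in order of first appearance.
            text = var_ids.setdefault(text, f"VAR{len(var_ids)}")
        elif kind == "T_STRING_LITERAL":
            text = "STR"
        elif kind == "T_NUMBER":
            text = "NUM"
        tokens.append(text)
    return tokens


sample = '''<?php $id = $_GET['id']; $q = "SELECT * FROM users WHERE id=" . $id; mysql_query($q); ?>'''
print(unify_tokens(sample))
```

Under these assumptions, two snippets that differ only in their variable names or literal values map to the same token sequence, which is the property that makes unified tokens a convenient input for an embedding model.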
