Video-based Person re-identification with parallel correction and fusion of pedestrian area features.

Researchers

Journal

Modalities

Models

Abstract

Deep learning has provided powerful support for person re-identification (person re-id) over the years, and superior performance has been achieved by state-of-the-art. While under practical application scenarios such as public monitoring, the cameras’ resolutions are usually 720p, the captured pedestrian areas tend to be closer to 128×64 small pixel size. Research on person re-id at 128×64 small pixel size is limited by less effective pixel information. The frame image qualities are degraded and inter-frame information complementation requires a more careful selection of beneficial frames. Meanwhile, there are various large differences in person images, such as misalignment and image noise, which are harder to distinguish from person information at the small size, and eliminating a specific sub-variance is still not robust enough. The Person Feature Correction and Fusion Network (FCFNet) proposed in this paper introduces three sub-modules, which strive to extract discriminate video-level features from the perspectives of “using complementary valid information between frames” and “correcting large variances of person features”. The inter-frame attention mechanism is introduced through frame quality assessment, guiding informative features to dominate the fusion process and generating a preliminary frame quality score to filter low-quality frames. Two other feature correction modules are fitted to optimize the model’s ability to perceive information from small-sized images. The experiments on four benchmark datasets confirm the effectiveness of FCFNet.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *