IMMM 2019, The Ninth International Conference on Advances in Information Mining and Management
Multimodal Deep Neural Networks for Banking Document Classification
Authors:
Deniz Engin
Erdem Emekligil
Mehmet Yasin Akpınar
Berke Oral
Seçil Arslan
Keywords: Multimodal Deep Learning; Document Classification
Abstract:
In this paper, we introduce multimodal deep neural networks for classifying petition-based Turkish banking customer order documents. These petition-based documents are typically free-formatted texts written by customers, although some follow a specific format. Depending on the structure of a banking document, documents containing tables and specific forms lend themselves to visual representations, while documents consisting of free-formatted text are better suited to textual features. Because the text of these documents is obtained via Optical Character Recognition (OCR), which performs poorly on handwritten, noisy, and low-resolution document images, text classification methods can fail on them. Therefore, our proposed deep learning architectures utilize both the vision and text modalities to extract information from different types of documents. We conduct our experiments on our own Turkish banking documents. The results indicate that combining the visual and textual modalities yields better document recognition than text-only or vision-only classification models.
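The core idea described in the abstract, fusing a visual representation and a textual representation of the same document before classification, can be sketched minimally as a late-fusion classifier. The sketch below is illustrative only and assumes hypothetical feature dimensions and class count (the paper does not specify them here); the encoders themselves are stubbed out as precomputed feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, precomputed embeddings for one document (stand-ins for the
# outputs of a visual encoder, e.g., a CNN over the document image, and a
# text encoder over the OCR output). Dimensions are illustrative.
visual_feat = rng.standard_normal(128)
text_feat = rng.standard_normal(64)

# Late fusion: concatenate the two modality embeddings into one vector.
fused = np.concatenate([visual_feat, text_feat])  # shape (192,)

# Linear softmax classifier over the fused representation.
# num_classes is an assumed, illustrative document-type count.
num_classes = 5
W = rng.standard_normal((num_classes, fused.shape[0])) * 0.01
b = np.zeros(num_classes)

logits = W @ fused + b
probs = np.exp(logits - logits.max())  # numerically stable softmax
probs /= probs.sum()

predicted_class = int(np.argmax(probs))
```

In a trained system, `W` and `b` (and the encoders) would be learned jointly, so that documents with poor OCR output can still be recognized from their visual structure, and free-formatted documents from their text.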
Pages: 21 to 25
Copyright: Copyright (c) IARIA, 2019
Publication date: July 28, 2019
Published in: conference
ISSN: 2326-9332
ISBN: 978-1-61208-731-3
Location: Nice, France
Dates: from July 28, 2019 to August 2, 2019