About: This paper presents a study showing the benefits of EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, an essential problem in the digitalization process of institutions. We show on the RVL-CDIP dataset that we can improve on previous results with a much lighter model, and we present its transfer learning capabilities on a smaller in-domain dataset such as Tobacco3482. Moreover, we present an ensemble pipeline that improves on image-only input by combining the image model's predictions with those generated by a BERT model on text extracted by OCR. We also show that the batch size can be increased without hurting accuracy, so that training can be sped up by parallelizing across multiple GPUs, decreasing the computational time needed. Lastly, we report the training performance differences between the PyTorch and TensorFlow deep learning frameworks.
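
The ensemble idea in the abstract, combining per-class predictions from an image model with those from a text model, can be illustrated with a simple late-fusion sketch. The abstract does not state how the two sets of predictions are combined, so the weighted average below is an assumption chosen for illustration, not the authors' method. The function names, the `image_weight` parameter, and the three-class probabilities are all hypothetical.

```python
from typing import List, Sequence


def ensemble_probs(
    image_probs: Sequence[float],
    text_probs: Sequence[float],
    image_weight: float = 0.5,
) -> List[float]:
    """Fuse two per-class probability vectors by weighted averaging.

    Assumption: both vectors cover the same classes in the same order.
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    text_weight = 1.0 - image_weight
    return [
        image_weight * p_img + text_weight * p_txt
        for p_img, p_txt in zip(image_probs, text_probs)
    ]


def predict_class(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical outputs for one document over three classes.
image_probs = [0.5, 0.4, 0.1]  # image model slightly prefers class 0
text_probs = [0.1, 0.8, 0.1]   # text model strongly prefers class 1

fused = ensemble_probs(image_probs, text_probs, image_weight=0.5)
print(predict_class(image_probs), predict_class(fused))  # prints "0 1"
```

In this made-up case the text model's confident prediction overrides the image model's weaker one. In practice the weight would be tuned on a validation set.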


An Entity of Type : fabio:Abstract, within Data Space : wasabi.inria.fr associated with source document(s)

Attributes	Values
type	abstract
Subject	Machine learning; Artificial intelligence; Automatic identification and data capture; Unicode; OpenCL compute devices
part of	Improving Accuracy and Speeding Up Document Image Classification Through Parallel Systems
is abstract of	Improving Accuracy and Speeding Up Document Image Classification Through Parallel Systems
is hasSource of	covid:ann/target/234db98d6764476104678121b779086a588e8723 covid:ann/target/6eb3acf8880f73b066025e0cd157d441059dfeb4 covid:ann/target/b5e7c604144c3a56d6058982e3f28fa2c651f5ab covid:ann/target/12238cffe9fe2c4a979202eb2c7f4fc7ae6e966c covid:ann/target/54297b384737a06153a48d8b58e03b9809ea0bcc covid:ann/target/b5fe2d56235876d7c021ac61122915eeb59d1ba9 covid:ann/target/dc6a5da5249008dc2d24c67fa3fce7844e9f2ad9 covid:ann/target/c996e97f9b0025094c522b99e7d70fda195c0a2a covid:ann/target/9ec5c04834a0c7397c7f7a4959e1d0c7e7e4bdbf covid:ann/target/01bf579523db7b5a2a31d5105da2cfc17d9680d9 covid:ann/target/f830b25e2a37be1a4a610f584053a6ad8848f7f1 covid:ann/target/46868fc5a064e27c7fcca0c717b3ef53b8d5bd3e covid:ann/target/a8d747ee16c06aab944c655e7c72526efb2b3022 covid:ann/target/44b84f4cb4f451f191583f3b66b292c0339b6107 covid:ann/target/043b9629e0943f06e3c23c33fe38e616a8ca0b36 covid:ann/target/77beca0b28be8369ec93afba20ea1557df0f61df covid:ann/target/b2ed00ec97d321df2ac998db70fe23ef1d8a8658 covid:ann/target/d097ba08e187adb5cefa5c20782d50bf9adf1c88 covid:ann/target/b1e92aba3d0d7bc0dc47f352124ac709dd14549f covid:ann/target/f0d4adfdb39cd4e604cc78ca130fc27e58c62854 covid:ann/target/9733f1f4dfb1a6d733d3b1f3b82198291cfec11f covid:ann/target/46c8089bc5424a2f86d4a5443faeba2358cdc221 covid:ann/target/0a177c963271bf8f9a90da694f9dfcd05486d648 covid:ann/target/f75e1d5058e618d6d6c1f20bb6da11fbf7ef589e covid:ann/target/3952ccd759e563ac60632fa672757d518926e152 covid:ann/target/e6d9e51bf2a614edb1c291cf4800f856fc425d30 covid:ann/target/6ca1d649ed7a0e6d1331380a740875c03f85afff covid:ann/target/99887302f68e1de9d6b536a88d872ca58b546d03 covid:ann/target/5fb24cb211b3204a60c2a7876de23aedfd640d94 covid:ann/target/0d8870efb7fe96911fe5dedc4691251412c793b7 covid:ann/target/519b6758c984b1d26c4b9b87d54d387ebab0fb66 covid:ann/target/b86e9deae4689221a077cc98253d66916c77ef8e covid:ann/target/398714a29edacbb41e48611b0317ca5e06e4d3cf covid:ann/target/524decdd34de4e6983ca1d3dafc171d47798937d 
covid:ann/target/99940213d2b45bbb24f34000e6825b9f87733193 covid:ann/target/b3a5959ef2cd651760b34828e397d30ca8994d94 covid:ann/target/04e0402766742f1e0b7178dbd9688a23e60addc0 covid:ann/target/3b7b43919f8c140078b75900cf5de5d5471cefc9 covid:ann/target/4df0c4b8499e40e7722e622795f53f9c2af885e1 covid:ann/target/b04f555686a0ef031c9f5a5be0fe3b30ee45e92a covid:ann/target/284be20981a3b4d348e3ebb6cfab0fcf24c0c69e covid:ann/target/aecdb80c841c7ce2630cbfb88cc509638fd61bd3 covid:ann/target/ccdf5c253a691a79a7c08a30b52a98c80dd3e582 covid:ann/target/0a1eac49be8d6693724925137343903f1b531d53 covid:ann/target/7c7788232fc567f07c30c9ace6afcf9f3fe4cbac covid:ann/target/9d6d1086a4f6afc96f1b794fb56298fac5ab52e4 covid:ann/target/0e06407932aa98707b7268b7b19299fe61111922 covid:ann/target/1926671acc7021da501157b6c42576146571c935 covid:ann/target/40484e80725a0428cba8882b147ba34d4277d980 covid:ann/target/6bfd729091b48165e9b53909053a8a1d540bf1c2 covid:ann/target/40c456f233cafe1e53342b94ae546f3c01f5e01e covid:ann/target/5fc93e86519121fcf56f46d36e1763b2f7afb482 covid:ann/target/1016546b57f299192afed7c39a0887dc0966758a covid:ann/target/58e7278c54e7acc1585da561843f7685b470dc5f covid:ann/target/720dabafb161846cc24a53c78f27d2ae3127fb5d covid:ann/target/55a96e8223bbdf7a57613ecb4e7d2508cfc300c0 covid:ann/target/9f85080019cc6ac123b3a35468c43f6b7845f536 covid:ann/target/1bfccb219f727598107de42c389d1478ff5fb74b covid:ann/target/fe133663d19758f05286a2f79850e34ce5eeed23 covid:ann/target/bfa12851b6df67b32f98dc73b3bfa02feff3ba40 covid:ann/target/2db9989fa0e790e4b9a1f6895d707bb58e111947 covid:ann/target/611fd372e75d143d3d7a31ac2407d8c9365d2346 covid:ann/target/ec8735b2ff37315e80527a4c95d49e33814d582c covid:ann/target/2cf9215451f216e0add4c0a7f881e745f12e19a2 covid:ann/target/5de8f9b3dc8d889a37edd8055594f87f209b5490 covid:ann/target/82c3d6819f5a45dfb3ffaae4892c3c60f7d0857f covid:ann/target/abbdc7757f38ce085a547d75721fc82ee9c2e75b covid:ann/target/c85bdf5bcebd7a367e9bbb7ca3af6ede27ec1f17 
covid:ann/target/f38b3e90a53f7e4111ee1f9d6ecafac39dda6dfc covid:ann/target/fa6b9d7ebd81ec86d14ccf4095d285859585fe77 covid:ann/target/736a36a81353c941698eb19cb609a9ba0a739c1c covid:ann/target/8c6b20f66e74bd05d6542fe51ada36d3f84b04ff covid:ann/target/807f81b266389adf73c6d5ec1498743ca22d498d covid:ann/target/cb2bbdfa577e0b74b985e8089b4d0ad58c2273ab covid:ann/target/8ddd0759a2fc8929c9a14868c861de5d07989ad6 covid:ann/target/60bf67a37a6d00481ea96794aa87742bffb7c386 covid:ann/target/533b01af7dcc19f3b3c5e174e3759538b174960b covid:ann/target/841e835b9f77de4fc9041edc5b5579196c448edf covid:ann/target/36439ec7dc15ac8be2dc0d5a42386e06a41e51de covid:ann/target/3e8891791e8d63234678af58574f81f3bfb018ea covid:ann/target/0343069941bbc133a70ae55cafdb8df9c22e1824 covid:ann/target/212e7e6e6d8eaae0edc7a1da844feef2c0c2b250 covid:ann/target/79415af4dbba3febe93b864ae48d580a5fe1aeb1 covid:ann/target/3ca2b4e176c45ec12cbe4a7ec521bad9ae7c0a92 covid:ann/target/b192d021bae9c50a787c1e22ff915fb8648a8cf1 covid:ann/target/fd5fed67c5adf30e1b6e30c93a854490d7f2ef72 covid:ann/target/7f6cc0da89f94315156f72f0f8ec6c1655fe8cfb covid:ann/target/79b43bf02082c16702ea37e121f6287eea98c4df covid:ann/target/35093067030a0e646eac3c3c5dac4a795805d407 covid:ann/target/6db84c7ab9eb53dc876a1677ac6f61b017297f2b covid:ann/target/9b75bc5a734e7bcee3a06cc9bb6435766703658b
