Attributes	Values
value
  • Although deep neural networks yield state-of-the-art performance in biomedical named entity recognition (bioNER), much research shares one limitation: models are usually trained and evaluated on English texts from a single domain. In this work, we present a fine-grained evaluation intended to understand the efficiency of multilingual BERT-based models for bioNER of drug and disease mentions across two domains in two languages, namely clinical data and user-generated texts on drug therapy in English and Russian. We investigate the role of transfer learning (TL) strategies between four corpora to reduce the number of examples that have to be manually annotated. Evaluation results demonstrate that multi-BERT shows the best transfer capabilities in the zero-shot setting when training and test sets are either in the same language or in the same domain. TL reduces the amount of labeled data needed to achieve high performance on three out of four corpora: pretrained models reach 98–99% of the full dataset performance on both types of entities after training on 10–25% of sentences. We demonstrate that pretraining on data with one or both types of transfer can be effective.
Subject
  • Machine learning
  • Artificial intelligence
  • Patent law
  • Supercomputers
  • Computational linguistics
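Below is a minimal sketch, not the authors' released code, of the transfer-learning recipe summarized in the abstract: fine-tune a multilingual BERT token-classification head on a source corpus, then either evaluate it zero-shot on a target corpus in another language or domain, or continue fine-tuning on 10–25% of the target sentences. The Hugging Face transformers API, the BIO tag set for drug and disease mentions, and the toy sentence are illustrative assumptions.

```python
# Hedged sketch of the transfer-learning (TL) setup described in the abstract,
# assuming Hugging Face `transformers` and `torch`. The label set and example
# sentence are hypothetical placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

LABELS = ["O", "B-DRUG", "I-DRUG", "B-DIS", "I-DIS"]   # assumed BIO tag set
label2id = {lab: i for i, lab in enumerate(LABELS)}

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(LABELS))

def encode(words, word_tags):
    """Tokenize one pre-split sentence and align word-level BIO tags to subwords."""
    enc = tokenizer(words, is_split_into_words=True,
                    truncation=True, return_tensors="pt")
    aligned = []
    for word_id in enc.word_ids(batch_index=0):
        if word_id is None:
            aligned.append(-100)        # ignore special tokens in the loss
        else:
            aligned.append(label2id[word_tags[word_id]])
    enc["labels"] = torch.tensor([aligned])
    return enc

# One training step on a toy source-corpus sentence. In the TL setting the same
# loop first runs over the source corpus; for cross-domain or cross-lingual
# transfer it is then either skipped on the target (zero-shot evaluation) or
# continued on a 10-25% sample of target sentences.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
batch = encode(["Aspirin", "relieved", "my", "headache"],
               ["B-DRUG", "O", "O", "B-DIS"])
loss = model(**batch).loss
loss.backward()
optimizer.step()
```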