About: Knowledge about the software used in scientific investigations is necessary for several reasons, including provenance of the results, measuring software impact to credit developers, and bibliometric software citation analysis in general. Additionally, information about whether and how the software and its source code are available allows an assessment of the state and role of open source software in science. While such analyses can be done manually, large-scale analyses require automated methods of information extraction and linking. In this paper, we present SoftwareKG, a knowledge graph that contains information about software mentions from more than 51,000 scientific articles from the social sciences. A silver standard corpus, created by a distant and weak supervision approach, and a gold standard corpus, created by manual annotation, were used to train an LSTM-based neural network to identify software mentions in scientific articles. The model achieves an F-score of 0.82 for exact matches. As a result, we identified more than 133,000 software mentions. For entity disambiguation, we used the public domain knowledge base DBpedia. Furthermore, we linked the entities of the knowledge graph to other knowledge bases such as the Microsoft Academic Knowledge Graph, the Software Ontology, and Wikidata. Finally, we illustrate how SoftwareKG can be used to assess the role of software in the social sciences.
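
The abstract reports an F-score for exact matches. As a rough illustration of what such a score measures, the sketch below counts a predicted software mention as correct only if its span boundaries and label coincide exactly with a gold annotation. The function name `exact_match_f1`, the `Span` tuple layout, and the sample spans are all hypothetical and are not taken from the paper; the authors' actual evaluation procedure may differ.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start offset, end offset, label).
Span = Tuple[int, int, str]


def exact_match_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) where only identical spans count as hits."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example data: the second prediction misses the gold boundary by
# one character, so it does not count as an exact match.
gold = {(0, 4, "Software"), (10, 15, "Software"), (20, 26, "Software")}
predicted = {
    (0, 4, "Software"),
    (10, 14, "Software"),
    (20, 26, "Software"),
    (30, 33, "Software"),
}
precision, recall, f1 = exact_match_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this strict criterion a partially overlapping prediction is penalized twice, once as a false positive and once as a missed gold mention, which is why exact-match scores are usually lower than relaxed, overlap-based ones.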


An Entity of Type: fabio:Abstract, within Data Space: wasabi.inria.fr, associated with source document(s)

Attributes	Values
type	abstract
Subject	Bibliometrics; Online databases; Software quality; Knowledge bases
part of	Investigating Software Usage in the Social Sciences: A Knowledge Graph Approach
is abstract of	Investigating Software Usage in the Social Sciences: A Knowledge Graph Approach
is hasSource of	covid:ann/target/e4d011309dcb5dcf7aee52d295dac61696b2fbfa covid:ann/target/8f10d0ab9cd9aec51830c2def9359b1c0ae71f7e covid:ann/target/36b9b106f2f08511b4634207c39f6811318a60bc covid:ann/target/2370758db76c289657dc1e92c326b9ec52d1bbfe covid:ann/target/8ec689f1a8f80039281df8145fccb6a8ad912642 covid:ann/target/1b3812e94c33bcca9ae7a29b516f700056fc4975 covid:ann/target/4c98ac6ee03f18d38c3a3158934a85fb79b305b1 covid:ann/target/677ed6362b46a0cfa67b77ca5f68eb4c6a6b112a covid:ann/target/694ce59cda7c7cada23457a751e636d15f51629f covid:ann/target/f06094bbc1516690f1fcb3429824aab35af5c34f covid:ann/target/f09e4f3928b9b2ca09aca88ceda1714e7b5f6f0f covid:ann/target/30a527eb7aefe4097d369eee2521c977ca47b257 covid:ann/target/50f1200a18adda205635edb42f83e921facef7b2 covid:ann/target/742f124bc35e2107e5e9c58d4cebbbee6632e7fb covid:ann/target/7a68ac0cdd988f4b1f197cbfe230d70248c436d3 covid:ann/target/8aff8fe9512b4f8a55f228148de067ab0463c219 covid:ann/target/a2f735ca3db7f83fffb29aea7ceef4ee10f4d316 covid:ann/target/bdd99c360e051be4c47880882416bdfff5e9f797 covid:ann/target/d01293b348ffaf86da5f62333905a91acc7c3108 covid:ann/target/db73a6606af8784b8f418c438d52304effa7ce34 covid:ann/target/9128e4e3b1761b21814eb48a354b513a41dcb178 covid:ann/target/2b75a4716cf42f758cb8891e98e28c425707e85f covid:ann/target/433400ae2d6112368bc8abc56dbc055423e4949e covid:ann/target/9255d9f43f4a0657199c6b858002481ef3f71f73 covid:ann/target/10097fb14c9909dc27600afefc41fce0f0f5d246 covid:ann/target/08bcd697916dc2ad3bc3a5a5869b37d5107f1783 covid:ann/target/fb884e0d8f05c069b35a896cc0e24550cea2c81f covid:ann/target/7899a49608f1434707ae29b7a6b90fd7d5ffaac6 covid:ann/target/7a4bd0880f8ffa899d39b7fece7959342e2639f0 covid:ann/target/b6c88ea6b72347ded4de216be87d79126d02be8e covid:ann/target/d31ef29624f728b6955cb22eb07cff6c3c97f8d3 covid:ann/target/ac36e712e482b074617d908bfee14206e6eb20f8 covid:ann/target/3881bb17f4cac3f7990780cb218c1409c64a7e15 covid:ann/target/a5a8dcb4e2dc94377304883dcba95c4ffbc1f504 
covid:ann/target/b53953fb8468f9a6b4adbcded1dc50d796a28ec7 covid:ann/target/63326af51c09d625945295f1562ae719d5ad5b7e covid:ann/target/75af1c6537b55fa552d9a681b13e33fd03833eef covid:ann/target/13b763c51b5c02a273053bd895b934957901c9fa covid:ann/target/6bcb110dee5faf13172a19701325f3c13f092820 covid:ann/target/83205b958e1cfdd0fead0a9e126d1c6f416c69bf covid:ann/target/f61d0ee7ce8a0b4561c85252fa0d090f9b5dc6a6 covid:ann/target/db84e34d99e95e01c5c163df4e24aa56477713ea covid:ann/target/47024336224d130417f0b208ba7494b7db349dbf covid:ann/target/4fdca24cfc28bd1c11fca9aa2785fb3abd4039e1 covid:ann/target/a40ab0a3dbf149067db277daeebb63ee8e41b035 covid:ann/target/dc6c99a856372e029254f8d67492d9b1fdb3d6b9 covid:ann/target/99b00b7a6dc616d8fb966a7e7603936357d4ebad covid:ann/target/7742700e2706dc57f9d85ef2099371c6045b6235 covid:ann/target/194f885239e9fd7f6511293b48a1db502ebe2cd5 covid:ann/target/c676b5be6bb760511e5c20aa57abfd384d07ef69 covid:ann/target/6eeff041c72c0f066d056d9ffaf01982b7d40d90 covid:ann/target/4dbd6208f8ed6b7edf8d0e45838509aaacb444be covid:ann/target/debcc4db64f841f18246c46fbf38b362932a9488 covid:ann/target/7c157fb65e3b13d21a69b2c0c7a9642c3905568a covid:ann/target/3d7457375fc3c5ba29233e4920e68bd99748a339 covid:ann/target/054ba1f9303f6673d5a9a808934f31f3a08f3b69 covid:ann/target/6223e651ac07415f03dbdc2ea2650044b967b144 covid:ann/target/6eab78fe54340a95e15ec03345c9da42217e46d0 covid:ann/target/984f0d0c23519f46b2e9e3575fedfd63ffcf8e14 covid:ann/target/abb1ee52915ca529b164e83c41ce25a96aa99221 covid:ann/target/bb0768954b6356d45d7f493d904b15dc9bcd825e covid:ann/target/c85d02499a5715072137c69d56d207ec8cacce9a covid:ann/target/19460270e8882b540af829895b805c4a19f96fc2 covid:ann/target/2fb2fd5b5601074ce16e3984733705df8a6a1ae4 covid:ann/target/ac1a39c43cfeea3face425102319b76d2382d5d6 covid:ann/target/6d51c6ef17b7596f1b1d87efdc6bf77524b872c3 covid:ann/target/bb6b8f71fcc9098b1b46fc245d75077dfa5b156b covid:ann/target/f07086313d6e4eae43eb17bbeccf5afea8b21fc7 
covid:ann/target/4de18ec422c69b63e27ec891072a89351e4d9f82 covid:ann/target/a39d53bf5a1cad74a81ca273bad0eb620ccc67d2 covid:ann/target/a3b357bc7f7096f77b586f885347de0b3e7628d9 covid:ann/target/093d46a3beb01b2676932b192f027119a3e93caa covid:ann/target/32c106731f71d99b843ec1ee9e845a25f264796a covid:ann/target/3d3b19edd6fa094b01f2529fd30521e7aba3e577 covid:ann/target/9a113327c6126cde3b5187c7c5a96937e7242e39 covid:ann/target/c95d23debe6c203cd3a96d7b5831376c029f8cc2 covid:ann/target/67a620ea8b6466512dba88bb43c43c5174f718b1 covid:ann/target/7418634e59d0c2d0b753a271b47b8ee85b385e96 covid:ann/target/3903d7ed43edfb1226b38804894b33c29a3b87f4 covid:ann/target/9ac7449f99d1ee65a56c9b23301bbdeed82105c3 covid:ann/target/dce04a6ec9b27c1c5d34d974937181fb48003305 covid:ann/target/0714323c4d9e6556bf2d6f73bf26a66479143c07 covid:ann/target/24596e14b0601052cd0bb7edbf157a405b815af6 covid:ann/target/310301a00aa6d440821967f02833a49b315f222f covid:ann/target/4099e458ff988b4b794810f27d8c2c8268bc6b28 covid:ann/target/dc46c150d942921410e5e113f1cc0f52d090d2aa covid:ann/target/55f1b3ea0d68b7e23ba33a611ef8993e689d9401 covid:ann/target/0a57d7a8448824eda6c65470423489abee6e8f6c covid:ann/target/17fd798509e530a44542a104bd2bb66bc54e4ac5 covid:ann/target/480ccc0f0d040f1a39486ea647b6acbb2ba1d8d3 covid:ann/target/49f94ed418dbc12685f1d0a579ee2514dfc9284f covid:ann/target/821335f9911e194717921a32c31e2f593203ba80
