A domain categorisation of vocabularies based on a deep learning classifier.
Resource identifiers
0165-5515
https://hdl.handle.net/10641/3129
10.1177/01655515211018170
Source
(Repositorio Institucional de la Universidad Francisco de Vitoria)

Record

Title:
A domain categorisation of vocabularies based on a deep learning classifier.
Subject:
Linked Data
Deep Learning
Document Categorisation
Description:
The publication of large amounts of open data has become a major trend. This is a consequence of projects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. These principles are described in a 2011 document that covers tasks such as considering the reuse of vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which carries the inherent risk of introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.
pre-print
304 KB
Language:
English
Relation:
https://journals.sagepub.com/doi/abs/10.1177/01655515211018170
Author/Producer:
Nogales Moyano, Alberto
Sicilia, Miguel Ángel
García Tejedor, Álvaro José
Publisher:
Journal of Information Science
Rights:
Atribución-NoComercial-SinDerivadas 3.0 España
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
openAccess
Date:
2022-10-25T10:37:33Z
2021
Resource type:
article

oai_dc


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <oai_dc:dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">

    1. <dc:title>A domain categorisation of vocabularies based on a deep learning classifier.</dc:title>

    2. <dc:creator>Nogales Moyano, Alberto</dc:creator>

    3. <dc:creator>Sicilia, Miguel Ángel</dc:creator>

    4. <dc:creator>García Tejedor, Álvaro José</dc:creator>

    5. <dc:subject>Linked Data</dc:subject>

    6. <dc:subject>Deep Learning</dc:subject>

    7. <dc:subject>Document Categorisation</dc:subject>

    8. <dc:description>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</dc:description>

    9. <dc:description>pre-print</dc:description>

    10. <dc:description>304 KB</dc:description>

    11. <dc:date>2022-10-25T10:37:33Z</dc:date>

    12. <dc:date>2022-10-25T10:37:33Z</dc:date>

    13. <dc:date>2021</dc:date>

    14. <dc:type>article</dc:type>

    15. <dc:identifier>0165-5515</dc:identifier>

    16. <dc:identifier>https://hdl.handle.net/10641/3129</dc:identifier>

    17. <dc:identifier>10.1177/01655515211018170</dc:identifier>

    18. <dc:language>eng</dc:language>

    19. <dc:relation>https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</dc:relation>

    20. <dc:rights>Atribución-NoComercial-SinDerivadas 3.0 España</dc:rights>

    21. <dc:rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dc:rights>

    22. <dc:rights>openAccess</dc:rights>

    23. <dc:publisher>Journal of Information Science</dc:publisher>

    </oai_dc:dc>
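
A minimal sketch of how the oai_dc record above could be read programmatically, using only the Python standard library. The listing as displayed omits the `xmlns` declarations, so the sample below adds the standard OAI and Dublin Core namespace URIs as an assumption; the `dc_fields` helper and the shortened `SAMPLE` string are illustrative and not part of the record.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard Dublin Core element namespace (assumed; not shown in the listing).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the record, with namespace declarations added so it parses.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>A domain categorisation of vocabularies based on a deep learning classifier.</dc:title>
  <dc:creator>Nogales Moyano, Alberto</dc:creator>
  <dc:creator>Sicilia, Miguel Ángel</dc:creator>
  <dc:creator>García Tejedor, Álvaro José</dc:creator>
  <dc:subject>Linked Data</dc:subject>
  <dc:subject>Deep Learning</dc:subject>
  <dc:subject>Document Categorisation</dc:subject>
  <dc:type>article</dc:type>
  <dc:identifier>0165-5515</dc:identifier>
  <dc:identifier>https://hdl.handle.net/10641/3129</dc:identifier>
  <dc:identifier>10.1177/01655515211018170</dc:identifier>
  <dc:language>eng</dc:language>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Group the text of each dc:* child element by its local name."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    fields = defaultdict(list)
    for child in root:
        # Keep only elements in the Dublin Core namespace.
        if child.tag.startswith(prefix):
            fields[child.tag[len(prefix):]].append((child.text or "").strip())
    return dict(fields)


record = dc_fields(SAMPLE)
print(record["title"][0])
print("; ".join(record["creator"]))
```

Repeated elements such as `dc:creator` and `dc:identifier` are kept as lists in document order, since Dublin Core allows any element to occur more than once.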

didl


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <d:DIDL schemaLocation="urn:mpeg:mpeg21:2002:02-DIDL-NS http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/did/didl.xsd">

    1. <d:DIDLInfo>

      1. <dcterms:created schemaLocation="http://purl.org/dc/terms/ http://dublincore.org/schemas/xmls/qdc/dcterms.xsd">2022-10-25T10:37:33Z</dcterms:created>

      </d:DIDLInfo>

    2. <d:Item id="hdl_10641_3129">

      1. <d:Descriptor>

        1. <d:Statement mimeType="application/xml; charset=utf-8">

          1. <dii:Identifier schemaLocation="urn:mpeg:mpeg21:2002:01-DII-NS http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-21_schema_files/dii/dii.xsd">urn:hdl:10641/3129</dii:Identifier>

          </d:Statement>

        </d:Descriptor>

      2. <d:Descriptor>

        1. <d:Statement mimeType="application/xml; charset=utf-8">

          1. <oai_dc:dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">

            1. <dc:title>A domain categorisation of vocabularies based on a deep learning classifier.</dc:title>

            2. <dc:creator>Nogales Moyano, Alberto</dc:creator>

            3. <dc:creator>Sicilia, Miguel Ángel</dc:creator>

            4. <dc:creator>García Tejedor, Álvaro José</dc:creator>

            5. <dc:subject>Linked Data</dc:subject>

            6. <dc:subject>Deep Learning</dc:subject>

            7. <dc:subject>Document Categorisation</dc:subject>

            8. <dc:description>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</dc:description>

            9. <dc:date>2022-10-25T10:37:33Z</dc:date>

            10. <dc:date>2022-10-25T10:37:33Z</dc:date>

            11. <dc:date>2021</dc:date>

            12. <dc:type>article</dc:type>

            13. <dc:identifier>0165-5515</dc:identifier>

            14. <dc:identifier>https://hdl.handle.net/10641/3129</dc:identifier>

            15. <dc:identifier>10.1177/01655515211018170</dc:identifier>

            16. <dc:language>eng</dc:language>

            17. <dc:relation>https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</dc:relation>

            18. <dc:rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dc:rights>

            19. <dc:rights>openAccess</dc:rights>

            20. <dc:rights>Atribución-NoComercial-SinDerivadas 3.0 España</dc:rights>

            21. <dc:publisher>Journal of Information Science</dc:publisher>

            </oai_dc:dc>

          </d:Statement>

        </d:Descriptor>

      3. <d:Component id="10641_3129_1">

        1. <d:Resource mimeType="application/pdf" ref="http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf" />

        </d:Component>

      </d:Item>

    </d:DIDL>

dim


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <dim:dim schemaLocation="http://www.dspace.org/xmlns/dspace/dim http://www.dspace.org/schema/dim.xsd">

    1. <dim:field authority="209" confidence="600" element="contributor" mdschema="dc" qualifier="author">Nogales Moyano, Alberto</dim:field>

    2. <dim:field authority="585349cb-d468-4b6d-825d-aad5870b6796" confidence="600" element="contributor" mdschema="dc" qualifier="author">Sicilia, Miguel Ángel</dim:field>

    3. <dim:field authority="75" confidence="600" element="contributor" mdschema="dc" qualifier="author">García Tejedor, Álvaro José</dim:field>

    4. <dim:field element="date" mdschema="dc" qualifier="accessioned">2022-10-25T10:37:33Z</dim:field>

    5. <dim:field element="date" mdschema="dc" qualifier="available">2022-10-25T10:37:33Z</dim:field>

    6. <dim:field element="date" mdschema="dc" qualifier="issued">2021</dim:field>

    7. <dim:field element="identifier" lang="spa" mdschema="dc" qualifier="issn">0165-5515</dim:field>

    8. <dim:field element="identifier" mdschema="dc" qualifier="uri">https://hdl.handle.net/10641/3129</dim:field>

    9. <dim:field element="identifier" lang="spa" mdschema="dc" qualifier="doi">10.1177/01655515211018170</dim:field>

    10. <dim:field element="description" lang="spa" mdschema="dc" qualifier="abstract">The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</dim:field>

    11. <dim:field element="description" lang="spa" mdschema="dc" qualifier="version">pre-print</dim:field>

    12. <dim:field element="description" lang="spa" mdschema="dc" qualifier="extent">304 KB</dim:field>

    13. <dim:field element="language" lang="spa" mdschema="dc" qualifier="iso">eng</dim:field>

    14. <dim:field element="publisher" lang="spa" mdschema="dc">Journal of Information Science</dim:field>

    15. <dim:field element="rights" lang="*" mdschema="dc">Atribución-NoComercial-SinDerivadas 3.0 España</dim:field>

    16. <dim:field element="rights" lang="*" mdschema="dc" qualifier="uri">http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dim:field>

    17. <dim:field element="rights" lang="spa" mdschema="dc" qualifier="accessRights">openAccess</dim:field>

    18. <dim:field element="subject" lang="spa" mdschema="dc">Linked Data</dim:field>

    19. <dim:field element="subject" lang="spa" mdschema="dc">Deep Learning</dim:field>

    20. <dim:field element="subject" lang="spa" mdschema="dc">Document Categorisation</dim:field>

    21. <dim:field element="title" lang="spa" mdschema="dc">A domain categorisation of vocabularies based on a deep learning classifier.</dim:field>

    22. <dim:field element="type" lang="spa" mdschema="dc">article</dim:field>

    23. <dim:field element="relation" lang="spa" mdschema="dc" qualifier="publisherversion">https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</dim:field>

    </dim:dim>

etdms


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <thesis schemaLocation="http://www.ndltd.org/standards/metadata/etdms/1.0/ http://www.ndltd.org/standards/metadata/etdms/1.0/etdms.xsd">

    1. <title>A domain categorisation of vocabularies based on a deep learning classifier.</title>

    2. <creator>Nogales Moyano, Alberto</creator>

    3. <creator>Sicilia, Miguel Ángel</creator>

    4. <creator>García Tejedor, Álvaro José</creator>

    5. <subject>Linked Data</subject>

    6. <subject>Deep Learning</subject>

    7. <subject>Document Categorisation</subject>

    8. <description>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</description>

    9. <date>2022-10-25</date>

    10. <date>2022-10-25</date>

    11. <date>2021</date>

    12. <type>article</type>

    13. <identifier>0165-5515</identifier>

    14. <identifier>https://hdl.handle.net/10641/3129</identifier>

    15. <identifier>10.1177/01655515211018170</identifier>

    16. <language>eng</language>

    17. <relation>https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</relation>

    18. <rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</rights>

    19. <rights>openAccess</rights>

    20. <rights>Atribución-NoComercial-SinDerivadas 3.0 España</rights>

    21. <publisher>Journal of Information Science</publisher>

    </thesis>

marc


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <record schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">

    1. <leader>00925njm 22002777a 4500</leader>

    2. <datafield ind1=" " ind2=" " tag="042">

      1. <subfield code="a">dc</subfield>

      </datafield>

    3. <datafield ind1=" " ind2=" " tag="720">

      1. <subfield code="a">Nogales Moyano, Alberto</subfield>

      2. <subfield code="e">author</subfield>

      </datafield>

    4. <datafield ind1=" " ind2=" " tag="720">

      1. <subfield code="a">Sicilia, Miguel Ángel</subfield>

      2. <subfield code="e">author</subfield>

      </datafield>

    5. <datafield ind1=" " ind2=" " tag="720">

      1. <subfield code="a">García Tejedor, Álvaro José</subfield>

      2. <subfield code="e">author</subfield>

      </datafield>

    6. <datafield ind1=" " ind2=" " tag="260">

      1. <subfield code="c">2021</subfield>

      </datafield>

    7. <datafield ind1=" " ind2=" " tag="520">

      1. <subfield code="a">The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</subfield>

      </datafield>

    8. <datafield ind1="8" ind2=" " tag="024">

      1. <subfield code="a">0165-5515</subfield>

      </datafield>

    9. <datafield ind1="8" ind2=" " tag="024">

      1. <subfield code="a">https://hdl.handle.net/10641/3129</subfield>

      </datafield>

    10. <datafield ind1="8" ind2=" " tag="024">

      1. <subfield code="a">10.1177/01655515211018170</subfield>

      </datafield>

    11. <datafield ind1=" " ind2=" " tag="653">

      1. <subfield code="a">Linked Data</subfield>

      </datafield>

    12. <datafield ind1=" " ind2=" " tag="653">

      1. <subfield code="a">Deep Learning</subfield>

      </datafield>

    13. <datafield ind1=" " ind2=" " tag="653">

      1. <subfield code="a">Document Categorisation</subfield>

      </datafield>

    14. <datafield ind1="0" ind2="0" tag="245">

      1. <subfield code="a">A domain categorisation of vocabularies based on a deep learning classifier.</subfield>

      </datafield>

    </record>

mets


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <mets ID=" DSpace_ITEM_10641-3129" OBJID=" hdl:10641/3129" PROFILE="DSpace METS SIP Profile 1.0" TYPE="DSpace ITEM" schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd">

    1. <metsHdr CREATEDATE="2023-01-30T02:36:38Z">

      1. <agent ROLE="CUSTODIAN" TYPE="ORGANIZATION">

        1. <name>DDFV</name>

        </agent>

      </metsHdr>

    2. <dmdSec ID="DMD_10641_3129">

      1. <mdWrap MDTYPE="MODS">

        1. <xmlData schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">

          1. <mods:mods schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">

            1. <mods:name>

              1. <mods:role>

                1. <mods:roleTerm type="text">author</mods:roleTerm>

                </mods:role>

              2. <mods:namePart>Nogales Moyano, Alberto</mods:namePart>

              </mods:name>

            2. <mods:name>

              1. <mods:role>

                1. <mods:roleTerm type="text">author</mods:roleTerm>

                </mods:role>

              2. <mods:namePart>Sicilia, Miguel Ángel</mods:namePart>

              </mods:name>

            3. <mods:name>

              1. <mods:role>

                1. <mods:roleTerm type="text">author</mods:roleTerm>

                </mods:role>

              2. <mods:namePart>García Tejedor, Álvaro José</mods:namePart>

              </mods:name>

            4. <mods:extension>

              1. <mods:dateAccessioned encoding="iso8601">2022-10-25T10:37:33Z</mods:dateAccessioned>

              </mods:extension>

            5. <mods:extension>

              1. <mods:dateAvailable encoding="iso8601">2022-10-25T10:37:33Z</mods:dateAvailable>

              </mods:extension>

            6. <mods:originInfo>

              1. <mods:dateIssued encoding="iso8601">2021</mods:dateIssued>

              </mods:originInfo>

            7. <mods:identifier type="issn">0165-5515</mods:identifier>

            8. <mods:identifier type="uri">https://hdl.handle.net/10641/3129</mods:identifier>

            9. <mods:identifier type="doi">10.1177/01655515211018170</mods:identifier>

            10. <mods:abstract>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of pro-jects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</mods:abstract>

            11. <mods:language>

              1. <mods:languageTerm authority="rfc3066">eng</mods:languageTerm>

              </mods:language>

            12. <mods:accessCondition type="useAndReproduction">Atribución-NoComercial-SinDerivadas 3.0 España</mods:accessCondition>

            13. <mods:subject>

              1. <mods:topic>Linked Data</mods:topic>

              </mods:subject>

            14. <mods:subject>

              1. <mods:topic>Deep Learning</mods:topic>

              </mods:subject>

            15. <mods:subject>

              1. <mods:topic>Document Categorisation</mods:topic>

              </mods:subject>

            16. <mods:titleInfo>

              1. <mods:title>A domain categorisation of vocabularies based on a deep learning classifier.</mods:title>

              </mods:titleInfo>

            17. <mods:genre>article</mods:genre>

            </mods:mods>

          </xmlData>

        </mdWrap>

      </dmdSec>

    3. <amdSec ID="TMD_10641_3129">

      1. <rightsMD ID="RIG_10641_3129">

        1. <mdWrap MDTYPE="OTHER" MIMETYPE="text/plain" OTHERMDTYPE="DSpaceDepositLicense">

          1. <binData>LSBFbCByZXBvc2l0b3JpbyBpbnN0aXR1Y2lvbmFsIGRlIGxhIFVuaXZlcnNpZGFkIEZyYW5jaXNjbyBkZSBWaXRvcmlhIGRlIE1hZHJpZCAoRERGViksIHBvbmUgYSBkaXNwb3NpY2nDs24gZGUgbG9zIHVzdWFyaW9zIGxhIHBsYXRhZm9ybWEgZGlnaXRhbCBhYmllcnRhIHkgZGUgYWNjZXNvIGxpYnJlIGRlIGxhIHByb2R1Y2Npw7NuIGNpZW50w61maWNhIGRlIGxhIGluc3RpdHVjacOzbi4KCi0gQSB0YWxlcyBmaW5lcywgbG9zIGF1dG9yZXMgZGVjbGFyYW4gcXVlIHNvbiB0aXR1bGFyZXMgZGUgbG9zIGRlcmVjaG9zIGRlIHByb3BpZWRhZCBpbnRlbGVjdHVhbCBkZSBsYSBvYnJhIHkgcXVlIMOpc3RhIGVzIG9yaWdpbmFsLgoKLSBNZWRpYW50ZSBsYSBhY2VwdGFjacOzbiBkZSBlc3RhIGxpY2VuY2lhLCBlbCBhdXRvciwgY29tbyB0aXR1bGFyIGRlIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgYXV0b3JpemEgeSBjZWRlIGEgbGEgVW5pdmVyc2lkYWQgRnJhbmNpc2NvIGRlIFZpdG9yaWEsIGRlIGZvcm1hIGdyYXR1aXRhIHkgbm8gZXhjbHVzaXZhLCBwb3IgZWwgbcOheGltbyBwbGF6byBsZWdhbCB5IGNvbiDDoW1iaXRvIHVuaXZlcnNhbCwgbG9zIGRlcmVjaG9zIGRlIHJlcHJvZHVjY2nDs24sIGRpc3RyaWJ1Y2nDs24sIGNvbXVuaWNhY2nDs24gcMO6YmxpY2EsIGluY2x1aWRvIGVsIGRlcmVjaG8gZGUgcHVlc3RhIGEgZGlzcG9zaWNpw7NuIGVsZWN0csOzbmljYSwgeSBsYSB0cmFuc2Zvcm1hY2nDs24gZGUgZm9ybWF0byBzb2JyZSBsYSBvYnJhIGluZGljYWRhLCBzaSBmdWVyYSBlbCBjYXNvLgoKLSBFbiBlbCBjYXNvIGRlIGNlc2nDs24gZGUgZGVyZWNob3MgZGUgZXhwbG90YWNpw7NuIGEgdGVyY2Vyb3MsIGRlY2xhcmEgcXVlIGN1ZW50YSBjb24gbGEgYXV0b3JpemFjacOzbiBkZSBkaWNob3MgdGl0dWxhcmVzIHkgcXVlIGhhIG9idGVuaWRvIGVsIHBlcm1pc28gc2luIHJlc3RyaWNjaW9uZXMgZGVsIHByb3BpZXRhcmlvIGRlbCBjb3B5cmlnaHQgcGFyYSBvdG9yZ2FyIGEgbGEgaW5zdGl0dWNpw7NuIGxvcyBkZXJlY2hvcyByZXF1ZXJpZG9zIHBhcmEgZXN0YSBsaWNlbmNpYSB5IHF1ZSBkaWNobyBwcm9waWV0YXJpbyBjb25vY2UgZWwgdGV4dG8gbyBlbCBjb250ZW5pZG8gZGUgbGEgb2JyYS4KCi0gU2kgZnVlcmEgdW5hIG9icmEgcGF0cm9jaW5hZGEgcG9yIGFsZ3VuYSBpbnN0aXR1Y2nDs24gZGlzdGludGEgYSBsYSBVbml2ZXJzaWRhZCBGcmFuY2lzY28gZGUgVml0b3JpYSwgZGVjbGFyYSBxdWUgZW4gY2FzbyBuZWNlc2FyaW8sIGN1ZW50YSBjb24gbG9zIHBlcm1pc29zIHBlcnRpbmVudGVzLCBkZSBsYSBpbnN0aXR1Y2nDs24gbyBlbnRpZGFkLCBxdWUgbGUgcGVybWl0YW4gbGEgZGlmdXNpw7NuIGRlIGRpY2hhIG9icmEuCgotIExhIFVuaXZlcnNpZGFkIEZyYW5jaXNjbyBkZSBWaXRvcmlhIG5vIHRpZW5lIGxhIHRpdHVsYXJpZGFkIGRlIGxvcyBkZXJlY2hvcyBzb2JyZSBsYSBvYnJhLCBxdWUgY2
9ycmVzcG9uZGVuIGFsIGF1dG9yLCBwZXJvIHNpbiBlbWJhcmdvIMOpc3RhIGxpY2VuY2lhIGRhIGRlcmVjaG8gYSByZXByb2R1Y2lybGEgZW4gdW4gc29wb3J0ZSBkaWdpdGFsLCBkaXN0cmlidWlyIGEgbG9zIHVzdWFyaW9zIGNvcGlhcyBlbGVjdHLDs25pY2FzIGRlIGxhIG9icmEgZW4gZm9ybWF0byBkaWdpdGFsLCBjb211bmljYWNpw7NuIHDDumJsaWNhIHkgc3UgcHVlc3RhIGEgZGlzcG9zaWNpw7NuIGEgdHJhdsOpcyBkZSB1biBhcmNoaXZvIGFiaWVydG8gaW5zdGl0dWNpb25hbC4KCi0gTGEgb2JyYSBzZSBwb25kcsOhIGEgZGlzcG9zaWNpw7NuIGRlIGxvcyB1c3VhcmlvcyBwYXJhIHF1ZSBoYWdhbiBkZSBlbGxhIHVuIHVzbyBqdXN0byB5IHJlc3BldHVvc28gY29uIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgc2VhIGNvbiBmaW5lcyBkZSBlc3R1ZGlvLCBpbnZlc3RpZ2FjacOzbiBvIGN1YWxxdWllciBvdHJvIGZpbiBsw61jaXRvLCB5IGRlIGFjdWVyZG8gYSBsYXMgY29uZGljaW9uZXMgZXN0YWJsZWNpZGFzIGVuIGxhIGxpY2VuY2lhIENyZWF0aXZlIENvbW1vbnMsIGRlIG1vZG8gcXVlIGxhcyBvYnJhcyBwdWVkYW4gc2VyIGRpc3RyaWJ1aWRhcywgY29waWFkYXMgeSBleGhpYmlkYXMgc2llbXByZSBxdWUgc2UgY2l0ZSBsYSBhdXRvcsOtYSB5IG5vIHNlIG9idGVuZ2EgYmVuZWZpY2lvIGNvbWVyY2lhbC4gUG9yIHRhbnRvLCBsYSBVbml2ZXJzaWRhZCBubyBhc3VtaXLDoSByZXNwb25zYWJpbGlkYWQgYWxndW5hIHBvciBsYSBmb3JtYSBlZmVjdGl2YSBlbiBxdWUgbG9zIHVzdWFyaW9zIHV0aWxpY2VuIGVsIG1hdGVyaWFsIHB1ZXN0byBhIHN1IGRpc3Bvc2ljacOzbi4KCi0gRWwgYXV0b3IgcG9kcsOhIHNvbGljaXRhciBsYSByZXRpcmFkYSBkZSBsYSBvYnJhIGRlbCByZXBvc2l0b3JpbyBwb3IgY2F1c2EganVzdGlmaWNhZGEuIAoK</binData>

          </mdWrap>

        </rightsMD>

      </amdSec>

    4. <amdSec ID="FO_10641_3129_1">

      1. <techMD ID="TECH_O_10641_3129_1">

        1. <mdWrap MDTYPE="PREMIS">

          1. <xmlData schemaLocation="http://www.loc.gov/standards/premis http://www.loc.gov/standards/premis/PREMIS-v1-0.xsd">

            1. <premis:premis>

              1. <premis:object>

                1. <premis:objectIdentifier>

                  1. <premis:objectIdentifierType>URL</premis:objectIdentifierType>

                  2. <premis:objectIdentifierValue>http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf</premis:objectIdentifierValue>

                  </premis:objectIdentifier>

                2. <premis:objectCategory>File</premis:objectCategory>

                3. <premis:objectCharacteristics>

                  1. <premis:fixity>

                    1. <premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>

                    2. <premis:messageDigest>cbb24acbf4b87b52d69b8ac7c4c9d4f3</premis:messageDigest>

                    </premis:fixity>

                  2. <premis:size>310836</premis:size>

                  3. <premis:format>

                    1. <premis:formatDesignation>

                      1. <premis:formatName>application/pdf</premis:formatName>

                      </premis:formatDesignation>

                    </premis:format>

                  </premis:objectCharacteristics>

                4. <premis:originalName>A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf</premis:originalName>

                </premis:object>

              </premis:premis>

            </xmlData>

          </mdWrap>

        </techMD>

      </amdSec>

    5. <amdSec ID="FT_10641_3129_4">

      1. <techMD ID="TECH_T_10641_3129_4">

        1. <mdWrap MDTYPE="PREMIS">

          1. <xmlData schemaLocation="http://www.loc.gov/standards/premis http://www.loc.gov/standards/premis/PREMIS-v1-0.xsd">

            1. <premis:premis>

              1. <premis:object>

                1. <premis:objectIdentifier>

                  1. <premis:objectIdentifierType>URL</premis:objectIdentifierType>

                  2. <premis:objectIdentifierValue>http://ddfv.ufv.es/bitstream/10641/3129/4/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.txt</premis:objectIdentifierValue>

                  </premis:objectIdentifier>

                2. <premis:objectCategory>File</premis:objectCategory>

                3. <premis:objectCharacteristics>

                  1. <premis:fixity>

                    1. <premis:messageDigestAlgorithm>MD5</premis:messageDigestAlgorithm>

                    2. <premis:messageDigest>ba5681bed4c85691606cb07eb8835bfb</premis:messageDigest>

                    </premis:fixity>

                  2. <premis:size>40390</premis:size>

                  3. <premis:format>

                    1. <premis:formatDesignation>

                      1. <premis:formatName>text/plain</premis:formatName>

                      </premis:formatDesignation>

                    </premis:format>

                  </premis:objectCharacteristics>

                4. <premis:originalName>A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf.txt</premis:originalName>

                </premis:object>

              </premis:premis>

            </xmlData>

          </mdWrap>

        </techMD>

      </amdSec>

    6. <fileSec>

      1. <fileGrp USE="ORIGINAL">

        1. <file ADMID="FO_10641_3129_1" CHECKSUM="cbb24acbf4b87b52d69b8ac7c4c9d4f3" CHECKSUMTYPE="MD5" GROUPID="GROUP_BITSTREAM_10641_3129_1" ID="BITSTREAM_ORIGINAL_10641_3129_1" MIMETYPE="application/pdf" SEQ="1" SIZE="310836">

          1. <FLocat LOCTYPE="URL" href="http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf" type="simple" />

          </file>

        </fileGrp>

      2. <fileGrp USE="TEXT">

        1. <file ADMID="FT_10641_3129_4" CHECKSUM="ba5681bed4c85691606cb07eb8835bfb" CHECKSUMTYPE="MD5" GROUPID="GROUP_BITSTREAM_10641_3129_4" ID="BITSTREAM_TEXT_10641_3129_4" MIMETYPE="text/plain" SEQ="4" SIZE="40390">

          1. <FLocat LOCTYPE="URL" href="http://ddfv.ufv.es/bitstream/10641/3129/4/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.txt" type="simple" />

          </file>

        </fileGrp>

      </fileSec>

    7. <structMap LABEL="DSpace Object" TYPE="LOGICAL">

      1. <div ADMID="DMD_10641_3129" TYPE="DSpace Object Contents">

        1. <div TYPE="DSpace BITSTREAM">

          1. <fptr FILEID="BITSTREAM_ORIGINAL_10641_3129_1" />

          </div>

        </div>

      </structMap>

    </mets>
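
The `CHECKSUM` and `CHECKSUMTYPE` attributes on each `<file>` element above, like the PREMIS fixity block, record an MD5 digest of the bitstream. As a minimal sketch, assuming the bitstream has already been read into memory, a fixity check could look like the following. The helper name `matches_mets_checksum` is hypothetical and is not part of DSpace or any METS tooling.

```python
import hashlib


def matches_mets_checksum(data: bytes, expected_hex: str, algorithm: str = "MD5") -> bool:
    """Return True if the digest of `data` equals a METS CHECKSUM value.

    `algorithm` mirrors the CHECKSUMTYPE attribute; hashlib expects it in lowercase.
    This helper is illustrative only (hypothetical name, not a DSpace API).
    """
    digest = hashlib.new(algorithm.lower(), data).hexdigest()
    # Hex digests are case-insensitive, so normalise before comparing.
    return digest == expected_hex.strip().lower()
```

For the ORIGINAL bitstream in this record, `expected_hex` would be `cbb24acbf4b87b52d69b8ac7c4c9d4f3` and `data` would be the 310836 bytes of the PDF.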

mods


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <mods:mods schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-1.xsd">

    1. <mods:name>

      1. <mods:namePart>Nogales Moyano, Alberto</mods:namePart>

      </mods:name>

    2. <mods:name>

      1. <mods:namePart>Sicilia, Miguel Ángel</mods:namePart>

      </mods:name>

    3. <mods:name>

      1. <mods:namePart>García Tejedor, Álvaro José</mods:namePart>

      </mods:name>

    4. <mods:extension>

      1. <mods:dateAvailable encoding="iso8601">2022-10-25T10:37:33Z</mods:dateAvailable>

      </mods:extension>

    5. <mods:extension>

      1. <mods:dateAccessioned encoding="iso8601">2022-10-25T10:37:33Z</mods:dateAccessioned>

      </mods:extension>

    6. <mods:originInfo>

      1. <mods:dateIssued encoding="iso8601">2021</mods:dateIssued>

      </mods:originInfo>

    7. <mods:identifier type="issn">0165-5515</mods:identifier>

    8. <mods:identifier type="uri">https://hdl.handle.net/10641/3129</mods:identifier>

    9. <mods:identifier type="doi">10.1177/01655515211018170</mods:identifier>

    10. <mods:abstract>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of projects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks such as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk of introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</mods:abstract>

    11. <mods:language>

      1. <mods:languageTerm>eng</mods:languageTerm>

      </mods:language>

    12. <mods:accessCondition type="useAndReproduction">http://creativecommons.org/licenses/by-nc-nd/3.0/es/</mods:accessCondition>

    13. <mods:accessCondition type="useAndReproduction">openAccess</mods:accessCondition>

    14. <mods:accessCondition type="useAndReproduction">Atribución-NoComercial-SinDerivadas 3.0 España</mods:accessCondition>

    15. <mods:subject>

      1. <mods:topic>Linked Data</mods:topic>

      </mods:subject>

    16. <mods:subject>

      1. <mods:topic>Deep Learning</mods:topic>

      </mods:subject>

    17. <mods:subject>

      1. <mods:topic>Document Categorisation</mods:topic>

      </mods:subject>

    18. <mods:titleInfo>

      1. <mods:title>A domain categorisation of vocabularies based on a deep learning classifier.</mods:title>

      </mods:titleInfo>

    19. <mods:genre>article</mods:genre>

    </mods:mods>

ore


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <atom:entry schemaLocation="http://www.w3.org/2005/Atom http://www.kbcafe.com/rss/atom.xsd.xml">

    1. <atom:id>https://hdl.handle.net/10641/3129/ore.xml</atom:id>

    2. <atom:link href="https://hdl.handle.net/10641/3129" rel="alternate" />
    3. <atom:link href="https://hdl.handle.net/10641/3129/ore.xml" rel="http://www.openarchives.org/ore/terms/describes" />
    4. <atom:link href="https://hdl.handle.net/10641/3129/ore.xml#atom" rel="self" type="application/atom+xml" />
    5. <atom:published>2022-10-25T10:37:33Z</atom:published>

    6. <atom:updated>2022-10-25T10:37:33Z</atom:updated>

    7. <atom:source>

      1. <atom:generator>DDFV</atom:generator>

      </atom:source>

    8. <atom:title>A domain categorisation of vocabularies based on a deep learning classifier.</atom:title>

    9. <atom:author>

      1. <atom:name>Nogales Moyano, Alberto</atom:name>

      </atom:author>

    10. <atom:author>

      1. <atom:name>Sicilia, Miguel Ángel</atom:name>

      </atom:author>

    11. <atom:author>

      1. <atom:name>García Tejedor, Álvaro José</atom:name>

      </atom:author>

    12. <atom:category label="Aggregation" scheme="http://www.openarchives.org/ore/terms/" term="http://www.openarchives.org/ore/terms/Aggregation" />
    13. <atom:category scheme="http://www.openarchives.org/ore/atom/modified" term="2022-10-25T10:37:33Z" />
    14. <atom:category label="DSpace Item" scheme="http://www.dspace.org/objectModel/" term="DSpaceItem" />
    15. <atom:link href="http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf" length="310836" rel="http://www.openarchives.org/ore/terms/aggregates" title="A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf" type="application/pdf" />
    16. <oreatom:triples>

      1. <rdf:Description about="https://hdl.handle.net/10641/3129/ore.xml#atom">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceItem" />
        2. <dcterms:modified>2022-10-25T10:37:33Z</dcterms:modified>

        </rdf:Description>

      2. <rdf:Description about="http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceBitstream" />
        2. <dcterms:description>ORIGINAL</dcterms:description>

        </rdf:Description>

      3. <rdf:Description about="http://ddfv.ufv.es/bitstream/10641/3129/2/license_rdf">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceBitstream" />
        2. <dcterms:description>CC-LICENSE</dcterms:description>

        </rdf:Description>

      4. <rdf:Description about="http://ddfv.ufv.es/bitstream/10641/3129/3/license.txt">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceBitstream" />
        2. <dcterms:description>LICENSE</dcterms:description>

        </rdf:Description>

      5. <rdf:Description about="http://ddfv.ufv.es/bitstream/10641/3129/4/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.txt">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceBitstream" />
        2. <dcterms:description>TEXT</dcterms:description>

        </rdf:Description>

      6. <rdf:Description about="http://ddfv.ufv.es/bitstream/10641/3129/5/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.jpg">

        1. <rdf:type resource="http://www.dspace.org/objectModel/DSpaceBitstream" />
        2. <dcterms:description>THUMBNAIL</dcterms:description>

        </rdf:Description>

      </oreatom:triples>

    </atom:entry>

qdc


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <qdc:qualifieddc schemaLocation="http://purl.org/dc/elements/1.1/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dc.xsd http://purl.org/dc/terms/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dcterms.xsd http://dspace.org/qualifieddc/ http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/qualifieddc.xsd">

    1. <dc:title>A domain categorisation of vocabularies based on a deep learning classifier.</dc:title>

    2. <dc:creator>Nogales Moyano, Alberto</dc:creator>

    3. <dc:creator>Sicilia, Miguel Ángel</dc:creator>

    4. <dc:creator>García Tejedor, Álvaro José</dc:creator>

    5. <dc:subject>Linked Data</dc:subject>

    6. <dc:subject>Deep Learning</dc:subject>

    7. <dc:subject>Document Categorisation</dc:subject>

    8. <dcterms:abstract>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of projects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks such as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk of introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</dcterms:abstract>

    9. <dcterms:dateAccepted>2022-10-25T10:37:33Z</dcterms:dateAccepted>

    10. <dcterms:available>2022-10-25T10:37:33Z</dcterms:available>

    11. <dcterms:created>2022-10-25T10:37:33Z</dcterms:created>

    12. <dcterms:issued>2021</dcterms:issued>

    13. <dc:type>article</dc:type>

    14. <dc:identifier>0165-5515</dc:identifier>

    15. <dc:identifier>https://hdl.handle.net/10641/3129</dc:identifier>

    16. <dc:identifier>10.1177/01655515211018170</dc:identifier>

    17. <dc:language>eng</dc:language>

    18. <dc:relation>https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</dc:relation>

    19. <dc:rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dc:rights>

    20. <dc:rights>openAccess</dc:rights>

    21. <dc:rights>Atribución-NoComercial-SinDerivadas 3.0 España</dc:rights>

    22. <dc:publisher>Journal of Information Science</dc:publisher>

    </qdc:qualifieddc>

rdf


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <rdf:RDF schemaLocation="http://www.openarchives.org/OAI/2.0/rdf/ http://www.openarchives.org/OAI/2.0/rdf.xsd">

    1. <ow:Publication about="oai:ddfv.ufv.es:10641/3129">

      1. <dc:title>A domain categorisation of vocabularies based on a deep learning classifier.</dc:title>

      2. <dc:creator>Nogales Moyano, Alberto</dc:creator>

      3. <dc:creator>Sicilia, Miguel Ángel</dc:creator>

      4. <dc:creator>García Tejedor, Álvaro José</dc:creator>

      5. <dc:subject>Linked Data</dc:subject>

      6. <dc:subject>Deep Learning</dc:subject>

      7. <dc:subject>Document Categorisation</dc:subject>

      8. <dc:description>The publication of large amounts of open data has become a major trend nowadays. This is a consequence of projects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks such as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk of introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</dc:description>

      9. <dc:date>2022-10-25T10:37:33Z</dc:date>

      10. <dc:date>2022-10-25T10:37:33Z</dc:date>

      11. <dc:date>2021</dc:date>

      12. <dc:type>article</dc:type>

      13. <dc:identifier>0165-5515</dc:identifier>

      14. <dc:identifier>https://hdl.handle.net/10641/3129</dc:identifier>

      15. <dc:identifier>10.1177/01655515211018170</dc:identifier>

      16. <dc:language>eng</dc:language>

      17. <dc:relation>https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</dc:relation>

      18. <dc:rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dc:rights>

      19. <dc:rights>openAccess</dc:rights>

      20. <dc:rights>Atribución-NoComercial-SinDerivadas 3.0 España</dc:rights>

      21. <dc:publisher>Journal of Information Science</dc:publisher>

      </ow:Publication>

    </rdf:RDF>

xoai


    <?xml version="1.0" encoding="UTF-8" ?>

  1. <metadata schemaLocation="http://www.lyncode.com/xoai http://www.lyncode.com/xsd/xoai.xsd">

    1. <element name="dc">

      1. <element name="contributor">

        1. <element name="author">

          1. <element name="none">

            1. <field name="value">Nogales Moyano, Alberto</field>

            2. <field name="authority">209</field>

            3. <field name="confidence">600</field>

            4. <field name="value">Sicilia, Miguel Ángel</field>

            5. <field name="authority">585349cb-d468-4b6d-825d-aad5870b6796</field>

            6. <field name="confidence">600</field>

            7. <field name="value">García Tejedor, Álvaro José</field>

            8. <field name="authority">75</field>

            9. <field name="confidence">600</field>

            </element>

          </element>

        </element>

      2. <element name="date">

        1. <element name="accessioned">

          1. <element name="none">

            1. <field name="value">2022-10-25T10:37:33Z</field>

            </element>

          </element>

        2. <element name="available">

          1. <element name="none">

            1. <field name="value">2022-10-25T10:37:33Z</field>

            </element>

          </element>

        3. <element name="issued">

          1. <element name="none">

            1. <field name="value">2021</field>

            </element>

          </element>

        </element>

      3. <element name="identifier">

        1. <element name="issn">

          1. <element name="spa">

            1. <field name="value">0165-5515</field>

            </element>

          </element>

        2. <element name="uri">

          1. <element name="none">

            1. <field name="value">https://hdl.handle.net/10641/3129</field>

            </element>

          </element>

        3. <element name="doi">

          1. <element name="spa">

            1. <field name="value">10.1177/01655515211018170</field>

            </element>

          </element>

        </element>

      4. <element name="description">

        1. <element name="abstract">

          1. <element name="spa">

            1. <field name="value">The publication of large amounts of open data has become a major trend nowadays. This is a consequence of projects like the Linked Open Data (LOD) community, which publishes and integrates datasets using techniques like Linked Data. Linked Data publishers should follow a set of principles for dataset design. This information is described in a 2011 document that describes tasks such as the consideration of reusing vocabularies. With regard to the latter, another project called Linked Open Vocabularies (LOV) attempts to compile the vocabularies used in LOD. These vocabularies have been classified by domain following the subjective criteria of LOV members, which has the inherent risk of introducing personal biases. In this paper, we present an automatic classifier of vocabularies based on the main categories of the well-known knowledge source Wikipedia. For this purpose, word-embedding models were used, in combination with Deep Learning techniques. Results show that with a hybrid model of regular Deep Neural Network (DNN), Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), vocabularies could be classified with an accuracy of 93.57 per cent. Specifically, 36.25 per cent of the vocabularies belong to the Culture category.</field>

            </element>

          </element>

        2. <element name="version">

          1. <element name="spa">

            1. <field name="value">pre-print</field>

            </element>

          </element>

        3. <element name="extent">

          1. <element name="spa">

            1. <field name="value">304 KB</field>

            </element>

          </element>

        </element>

      5. <element name="language">

        1. <element name="iso">

          1. <element name="spa">

            1. <field name="value">eng</field>

            </element>

          </element>

        </element>

      6. <element name="publisher">

        1. <element name="spa">

          1. <field name="value">Journal of Information Science</field>

          </element>

        </element>

      7. <element name="rights">

        1. <element name="*">

          1. <field name="value">Atribución-NoComercial-SinDerivadas 3.0 España</field>

          </element>

        2. <element name="uri">

          1. <element name="*">

            1. <field name="value">http://creativecommons.org/licenses/by-nc-nd/3.0/es/</field>

            </element>

          </element>

        3. <element name="accessRights">

          1. <element name="spa">

            1. <field name="value">openAccess</field>

            </element>

          </element>

        </element>

      8. <element name="subject">

        1. <element name="spa">

          1. <field name="value">Linked Data</field>

          2. <field name="value">Deep Learning</field>

          3. <field name="value">Document Categorisation</field>

          </element>

        </element>

      9. <element name="title">

        1. <element name="spa">

          1. <field name="value">A domain categorisation of vocabularies based on a deep learning classifier.</field>

          </element>

        </element>

      10. <element name="type">

        1. <element name="spa">

          1. <field name="value">article</field>

          </element>

        </element>

      11. <element name="relation">

        1. <element name="publisherversion">

          1. <element name="spa">

            1. <field name="value">https://journals.sagepub.com/doi/abs/10.1177/01655515211018170</field>

            </element>

          </element>

        </element>

      </element>

    2. <element name="bundles">

      1. <element name="bundle">

        1. <field name="name">ORIGINAL</field>

        2. <element name="bitstreams">

          1. <element name="bitstream">

            1. <field name="name">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf</field>

            2. <field name="originalName">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf</field>

            3. <field name="description" />
            4. <field name="format">application/pdf</field>

            5. <field name="size">310836</field>

            6. <field name="url">http://ddfv.ufv.es/bitstream/10641/3129/1/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf</field>

            7. <field name="checksum">cbb24acbf4b87b52d69b8ac7c4c9d4f3</field>

            8. <field name="checksumAlgorithm">MD5</field>

            9. <field name="sid">1</field>

            10. <field name="drm">open access</field>

            </element>

          </element>

        </element>

      2. <element name="bundle">

        1. <field name="name">CC-LICENSE</field>

        2. <element name="bitstreams">

          1. <element name="bitstream">

            1. <field name="name">license_rdf</field>

            2. <field name="originalName">license_rdf</field>

            3. <field name="format">application/rdf+xml; charset=utf-8</field>

            4. <field name="size">811</field>

            5. <field name="url">http://ddfv.ufv.es/bitstream/10641/3129/2/license_rdf</field>

            6. <field name="checksum">4d01a8abc68801ab758ec8c2c04918c3</field>

            7. <field name="checksumAlgorithm">MD5</field>

            8. <field name="sid">2</field>

            9. <field name="drm">open access</field>

            </element>

          </element>

        </element>

      3. <element name="bundle">

        1. <field name="name">LICENSE</field>

        2. <element name="bitstreams">

          1. <element name="bitstream">

            1. <field name="name">license.txt</field>

            2. <field name="originalName">license.txt</field>

            3. <field name="format">text/plain; charset=utf-8</field>

            4. <field name="size">2418</field>

            5. <field name="url">http://ddfv.ufv.es/bitstream/10641/3129/3/license.txt</field>

            6. <field name="checksum">8b6e3a0bc6a1ca51936267b0e6e4740c</field>

            7. <field name="checksumAlgorithm">MD5</field>

            8. <field name="sid">3</field>

            9. <field name="drm">open access</field>

            </element>

          </element>

        </element>

      4. <element name="bundle">

        1. <field name="name">TEXT</field>

        2. <element name="bitstreams">

          1. <element name="bitstream">

            1. <field name="name">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf.txt</field>

            2. <field name="originalName">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf.txt</field>

            3. <field name="description">Extracted text</field>

            4. <field name="format">text/plain</field>

            5. <field name="size">40390</field>

            6. <field name="url">http://ddfv.ufv.es/bitstream/10641/3129/4/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.txt</field>

            7. <field name="checksum">ba5681bed4c85691606cb07eb8835bfb</field>

            8. <field name="checksumAlgorithm">MD5</field>

            9. <field name="sid">4</field>

            10. <field name="drm">open access</field>

            </element>

          </element>

        </element>

      5. <element name="bundle">

        1. <field name="name">THUMBNAIL</field>

        2. <element name="bitstreams">

          1. <element name="bitstream">

            1. <field name="name">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf.jpg</field>

            2. <field name="originalName">A domain categorization of vocabularies based on a Deep Learning classifier - EDITED (Copia en conflicto de ceiecubuntu 2018-12-13).pdf.jpg</field>

            3. <field name="description">Generated Thumbnail</field>

            4. <field name="format">image/jpeg</field>

            5. <field name="size">1649</field>

            6. <field name="url">http://ddfv.ufv.es/bitstream/10641/3129/5/A%20domain%20categorization%20of%20vocabularies%20based%20on%20a%20Deep%20Learning%20classifier%20-%20EDITED%20%28Copia%20en%20conflicto%20de%20ceiecubuntu%202018-12-13%29.pdf.jpg</field>

            7. <field name="checksum">908b259a0428be884473428a0ef5dbd0</field>

            8. <field name="checksumAlgorithm">MD5</field>

            9. <field name="sid">5</field>

            10. <field name="drm">open access</field>

            </element>

          </element>

        </element>

      </element>

    3. <element name="others">

      1. <field name="handle">10641/3129</field>

      2. <field name="identifier">oai:ddfv.ufv.es:10641/3129</field>

      3. <field name="lastModifyDate">2022-10-26 02:00:10.493</field>

      4. <field name="drm">open access</field>

      </element>

    4. <element name="repository">

      1. <field name="name">DDFV</field>

      2. <field name="mail">dspace@ufv.es</field>

      </element>

    5. <element name="license">

      1. <field name="bin">LSBFbCByZXBvc2l0b3JpbyBpbnN0aXR1Y2lvbmFsIGRlIGxhIFVuaXZlcnNpZGFkIEZyYW5jaXNjbyBkZSBWaXRvcmlhIGRlIE1hZHJpZCAoRERGViksIHBvbmUgYSBkaXNwb3NpY2nDs24gZGUgbG9zIHVzdWFyaW9zIGxhIHBsYXRhZm9ybWEgZGlnaXRhbCBhYmllcnRhIHkgZGUgYWNjZXNvIGxpYnJlIGRlIGxhIHByb2R1Y2Npw7NuIGNpZW50w61maWNhIGRlIGxhIGluc3RpdHVjacOzbi4KCi0gQSB0YWxlcyBmaW5lcywgbG9zIGF1dG9yZXMgZGVjbGFyYW4gcXVlIHNvbiB0aXR1bGFyZXMgZGUgbG9zIGRlcmVjaG9zIGRlIHByb3BpZWRhZCBpbnRlbGVjdHVhbCBkZSBsYSBvYnJhIHkgcXVlIMOpc3RhIGVzIG9yaWdpbmFsLgoKLSBNZWRpYW50ZSBsYSBhY2VwdGFjacOzbiBkZSBlc3RhIGxpY2VuY2lhLCBlbCBhdXRvciwgY29tbyB0aXR1bGFyIGRlIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgYXV0b3JpemEgeSBjZWRlIGEgbGEgVW5pdmVyc2lkYWQgRnJhbmNpc2NvIGRlIFZpdG9yaWEsIGRlIGZvcm1hIGdyYXR1aXRhIHkgbm8gZXhjbHVzaXZhLCBwb3IgZWwgbcOheGltbyBwbGF6byBsZWdhbCB5IGNvbiDDoW1iaXRvIHVuaXZlcnNhbCwgbG9zIGRlcmVjaG9zIGRlIHJlcHJvZHVjY2nDs24sIGRpc3RyaWJ1Y2nDs24sIGNvbXVuaWNhY2nDs24gcMO6YmxpY2EsIGluY2x1aWRvIGVsIGRlcmVjaG8gZGUgcHVlc3RhIGEgZGlzcG9zaWNpw7NuIGVsZWN0csOzbmljYSwgeSBsYSB0cmFuc2Zvcm1hY2nDs24gZGUgZm9ybWF0byBzb2JyZSBsYSBvYnJhIGluZGljYWRhLCBzaSBmdWVyYSBlbCBjYXNvLgoKLSBFbiBlbCBjYXNvIGRlIGNlc2nDs24gZGUgZGVyZWNob3MgZGUgZXhwbG90YWNpw7NuIGEgdGVyY2Vyb3MsIGRlY2xhcmEgcXVlIGN1ZW50YSBjb24gbGEgYXV0b3JpemFjacOzbiBkZSBkaWNob3MgdGl0dWxhcmVzIHkgcXVlIGhhIG9idGVuaWRvIGVsIHBlcm1pc28gc2luIHJlc3RyaWNjaW9uZXMgZGVsIHByb3BpZXRhcmlvIGRlbCBjb3B5cmlnaHQgcGFyYSBvdG9yZ2FyIGEgbGEgaW5zdGl0dWNpw7NuIGxvcyBkZXJlY2hvcyByZXF1ZXJpZG9zIHBhcmEgZXN0YSBsaWNlbmNpYSB5IHF1ZSBkaWNobyBwcm9waWV0YXJpbyBjb25vY2UgZWwgdGV4dG8gbyBlbCBjb250ZW5pZG8gZGUgbGEgb2JyYS4KCi0gU2kgZnVlcmEgdW5hIG9icmEgcGF0cm9jaW5hZGEgcG9yIGFsZ3VuYSBpbnN0aXR1Y2nDs24gZGlzdGludGEgYSBsYSBVbml2ZXJzaWRhZCBGcmFuY2lzY28gZGUgVml0b3JpYSwgZGVjbGFyYSBxdWUgZW4gY2FzbyBuZWNlc2FyaW8sIGN1ZW50YSBjb24gbG9zIHBlcm1pc29zIHBlcnRpbmVudGVzLCBkZSBsYSBpbnN0aXR1Y2nDs24gbyBlbnRpZGFkLCBxdWUgbGUgcGVybWl0YW4gbGEgZGlmdXNpw7NuIGRlIGRpY2hhIG9icmEuCgotIExhIFVuaXZlcnNpZGFkIEZyYW5jaXNjbyBkZSBWaXRvcmlhIG5vIHRpZW5lIGxhIHRpdHVsYXJpZGFkIGRlIGxvcyBkZXJlY2hvcyBzb2JyZSBsYSBvYnJhLCBxd
WUgY29ycmVzcG9uZGVuIGFsIGF1dG9yLCBwZXJvIHNpbiBlbWJhcmdvIMOpc3RhIGxpY2VuY2lhIGRhIGRlcmVjaG8gYSByZXByb2R1Y2lybGEgZW4gdW4gc29wb3J0ZSBkaWdpdGFsLCBkaXN0cmlidWlyIGEgbG9zIHVzdWFyaW9zIGNvcGlhcyBlbGVjdHLDs25pY2FzIGRlIGxhIG9icmEgZW4gZm9ybWF0byBkaWdpdGFsLCBjb211bmljYWNpw7NuIHDDumJsaWNhIHkgc3UgcHVlc3RhIGEgZGlzcG9zaWNpw7NuIGEgdHJhdsOpcyBkZSB1biBhcmNoaXZvIGFiaWVydG8gaW5zdGl0dWNpb25hbC4KCi0gTGEgb2JyYSBzZSBwb25kcsOhIGEgZGlzcG9zaWNpw7NuIGRlIGxvcyB1c3VhcmlvcyBwYXJhIHF1ZSBoYWdhbiBkZSBlbGxhIHVuIHVzbyBqdXN0byB5IHJlc3BldHVvc28gY29uIGxvcyBkZXJlY2hvcyBkZSBhdXRvciwgc2VhIGNvbiBmaW5lcyBkZSBlc3R1ZGlvLCBpbnZlc3RpZ2FjacOzbiBvIGN1YWxxdWllciBvdHJvIGZpbiBsw61jaXRvLCB5IGRlIGFjdWVyZG8gYSBsYXMgY29uZGljaW9uZXMgZXN0YWJsZWNpZGFzIGVuIGxhIGxpY2VuY2lhIENyZWF0aXZlIENvbW1vbnMsIGRlIG1vZG8gcXVlIGxhcyBvYnJhcyBwdWVkYW4gc2VyIGRpc3RyaWJ1aWRhcywgY29waWFkYXMgeSBleGhpYmlkYXMgc2llbXByZSBxdWUgc2UgY2l0ZSBsYSBhdXRvcsOtYSB5IG5vIHNlIG9idGVuZ2EgYmVuZWZpY2lvIGNvbWVyY2lhbC4gUG9yIHRhbnRvLCBsYSBVbml2ZXJzaWRhZCBubyBhc3VtaXLDoSByZXNwb25zYWJpbGlkYWQgYWxndW5hIHBvciBsYSBmb3JtYSBlZmVjdGl2YSBlbiBxdWUgbG9zIHVzdWFyaW9zIHV0aWxpY2VuIGVsIG1hdGVyaWFsIHB1ZXN0byBhIHN1IGRpc3Bvc2ljacOzbi4KCi0gRWwgYXV0b3IgcG9kcsOhIHNvbGljaXRhciBsYSByZXRpcmFkYSBkZSBsYSBvYnJhIGRlbCByZXBvc2l0b3JpbyBwb3IgY2F1c2EganVzdGlmaWNhZGEuIAoK</field>

      </element>

    </metadata>
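
The `bin` field of the `license` element above carries the deposit licence text as Base64-encoded UTF-8. A minimal sketch of how it could be decoded is shown below, assuming the field value has been extracted as a string. The function name `decode_license_bin` is hypothetical, and whitespace is stripped first because rendering may have wrapped the value across lines.

```python
import base64


def decode_license_bin(bin_value: str) -> str:
    """Decode the Base64 payload of an xoai `bin` field into UTF-8 text.

    Illustrative helper only (hypothetical name, not part of XOAI or DSpace).
    """
    # Drop whitespace and line breaks that may have been introduced by rendering.
    compact = "".join(bin_value.split())
    return base64.b64decode(compact).decode("utf-8")
```

Applied to the opening characters of this record's value, `decode_license_bin("LSBFbCByZXBvc2l0b3Jp")` yields `"- El repositori"`, the start of the Spanish licence text.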
