Towards a database for genotype-phenotype association research: mining data from encyclopaedia
Abstract
To associate an organism's phenotypic characteristics with the molecules encoded by its genome, well-structured genotype and phenotype data are needed. We use a novel method for extracting data on the phenotypic and genotypic characteristics of microorganisms from text. As the source, we use an encyclopedia of microorganisms that holds phenotypic and genotypic data, and from it we create a structured, flexible data resource containing genotype and phenotype data for 2412 species and 873 genera of microbes; the resource can be exported to a range of database formats. This data source has great potential as a resource for future biological research on genotype-phenotype associations. In this paper, we focus on describing the structure and content of the resulting database and on evaluating the extraction method. We conclude that the resulting database can serve as a reliable complementary resource for research into genotype-phenotype association.
Keywords:
biological database / bioinformatics / genotype-phenotype association / text mining / information extraction / knowledge acquisition / knowledge discovery / unstructured textual resources
Source:
International Journal of Data Mining and Bioinformatics, 2013, 7, 2, 196-213
Publisher:
- Inderscience Enterprises Ltd, Geneva
Funding / projects:
- Automated Reasoning and Data Mining (RS-MESTD-Basic Research (BR or ON)-174021)
DOI: 10.1504/IJDMB.2013.053196
ISSN: 1748-5673
PubMed: 23777176
WoS: 000317486500007
Scopus: 2-s2.0-84876257326
Collections
Institution/Community
Poljoprivredni fakultet
Pajić, V. S., Pavlović-Lažetić, G., Beljanski, M. V., Brandt, B. W., & Pajić, M. (2013). Towards a database for genotype-phenotype association research: mining data from encyclopaedia. International Journal of Data Mining and Bioinformatics, 7(2), 196-213. Inderscience Enterprises Ltd, Geneva. https://doi.org/10.1504/IJDMB.2013.053196
Pajić VS, Pavlović-Lažetić G, Beljanski MV, Brandt BW, Pajić M. Towards a database for genotype-phenotype association research: mining data from encyclopaedia. International Journal of Data Mining and Bioinformatics. 2013;7(2):196-213. doi:10.1504/IJDMB.2013.053196
Pajić, Vesna S., Pavlović-Lažetić, Gordana, Beljanski, Milos V., Brandt, Bernd W., Pajić, Miloš, "Towards a database for genotype-phenotype association research: mining data from encyclopaedia" in International Journal of Data Mining and Bioinformatics, 7, no. 2 (2013):196-213, https://doi.org/10.1504/IJDMB.2013.053196.