TY - JOUR
T1 - Network for Knowledge Organization (NEKO)
T2 - An AI knowledge mining workflow for synthetic biology research
AU - Xiao, Zhengyang
AU - Pakrasi, Himadri B.
AU - Chen, Yixin
AU - Tang, Yinjie J.
N1 - Publisher Copyright: © 2024 The Author(s)
PY - 2025/1
Y1 - 2025/1
N2 - Large language models (LLMs) can handle general scientific question answering, yet they are constrained by their pretraining cut-off dates and lack the ability to provide specific, cited scientific knowledge. Here, we introduce Network for Knowledge Organization (NEKO), a workflow that uses the LLM Qwen to extract knowledge through scientific literature text mining. When a user inputs a keyword of interest, NEKO can generate knowledge graphs to link bioinformation entities and produce comprehensive summaries from PubMed search results. NEKO significantly enhances LLM capabilities and has immediate applications in daily academic tasks such as education of young scientists, literature review, paper writing, experiment planning and troubleshooting, and generation of new ideas and hypotheses. We exemplified this workflow's applicability through several case studies on yeast fermentation and cyanobacterial biorefinery. NEKO's output is more informative, specific, and actionable than GPT-4's zero-shot Q&A. NEKO offers flexible, lightweight local deployment options. NEKO democratizes artificial intelligence (AI) tools, making scientific foundation models more accessible to researchers without excessive computational power.
AB - Large language models (LLMs) can handle general scientific question answering, yet they are constrained by their pretraining cut-off dates and lack the ability to provide specific, cited scientific knowledge. Here, we introduce Network for Knowledge Organization (NEKO), a workflow that uses the LLM Qwen to extract knowledge through scientific literature text mining. When a user inputs a keyword of interest, NEKO can generate knowledge graphs to link bioinformation entities and produce comprehensive summaries from PubMed search results. NEKO significantly enhances LLM capabilities and has immediate applications in daily academic tasks such as education of young scientists, literature review, paper writing, experiment planning and troubleshooting, and generation of new ideas and hypotheses. We exemplified this workflow's applicability through several case studies on yeast fermentation and cyanobacterial biorefinery. NEKO's output is more informative, specific, and actionable than GPT-4's zero-shot Q&A. NEKO offers flexible, lightweight local deployment options. NEKO democratizes artificial intelligence (AI) tools, making scientific foundation models more accessible to researchers without excessive computational power.
KW - Foundation model
KW - Knowledge graph
KW - Large language model
KW - Qwen
KW - Retrieval augmented generation
UR - http://www.scopus.com/inward/record.url?scp=85210356580&partnerID=8YFLogxK
U2 - 10.1016/j.ymben.2024.11.006
DO - 10.1016/j.ymben.2024.11.006
M3 - Article
C2 - 39580108
AN - SCOPUS:85210356580
SN - 1096-7176
VL - 87
SP - 60
EP - 67
JO - Metabolic Engineering
JF - Metabolic Engineering
ER -