The predictive power of NLP models on Perovskite solar cells: BERTforPSC
With the advent of ChatGPT, natural language processing (NLP) models have attracted tremendous interest from the research community and have been applied to a range of scientific domains, such as batteries, pharmaceuticals, and plastics recycling, to extract insights from the existing literature, making the process of reading, analyzing, interpreting, and reporting results faster. However, applications of such models have so far been limited to a few fields, and perovskite solar cells (PSCs) are among them. Recently, the power conversion efficiency of PSCs reached 26.1% in single-junction cells and 33.7% in silicon/perovskite tandem cells, placing them in the leading position among next-generation solar cells. However, optimizing decision variables such as materials selection and process conditions requires analyzing a huge database of experiments to draw the insights needed to make PSCs market-competitive in terms of cost and environmental impact. In this article, the authors use two state-of-the-art NLP models, BERT and SciBERT, to analyze a corpus of stability data drawn from experimental datasets and normalised for storage and testing conditions, to visualize the trends, and to compare the models' performance with that of regression-based models. Insights obtained from applying such models to different kinds of datasets, in which alphanumeric keys serve as model features, are also offered, highlighting the limitations of such models. The efficiency and effectiveness of such models in interpreting causal relationships and predicting trends will help in applying them to material-process design problems (MPDP) using data available in the literature.
openaccess
CC BY-NC