Time-Aware Machine Learning for Biomass Power Output Estimation Using SCADA Data

dc.contributor.authorGuney, Ezgi
dc.contributor.authorDemir, Memnun
dc.date.accessioned2026-04-25T14:19:54Z
dc.date.available2026-04-25T14:19:54Z
dc.date.issued2026
dc.departmentSinop Üniversitesi
dc.description.abstractAccurate short-term estimation of electrical power output in biomass power plants remains challenging due to the nonlinear and dynamically coupled nature of thermochemical conversion processes, fuel heterogeneity, and pronounced thermal inertia. Conventional physics-based models, while effective for steady-state analysis, often fail to capture the high-frequency dynamics required for real-time monitoring and decision-support applications. This study proposes a data-driven framework for short-term power output estimation using high-resolution Supervisory Control and Data Acquisition (SCADA) data collected from an operational industrial biomass power plant. A large-scale SCADA dataset comprising several hundred thousand time-stamped records is used to model the relationship between seven key thermodynamic and operational variables and net electrical power output. Multi-layer perceptron (MLP), random forest (RF), gradient boosting regressor (GBR), and support vector regression (SVR) are evaluated under two distinct validation strategies: (i) a conventional random train-test split and (ii) a temporally blocked cross-validation scheme preserving causal order. Under random sampling, RF attains the highest apparent accuracy (R² = 0.9687), whereas MLP exhibits lower performance (R² = 0.8492), highlighting sensitivity to instantaneous regression assumptions.
When temporal continuity is enforced, predictive performance improves consistently across all models. In the blocked validation stage, GBR and RF achieve R² values of 0.9983 and 0.9973, respectively, while MLP demonstrates a substantial performance increase (R² = 0.9865). Time-domain analysis further reveals that ensemble-based models provide smoother tracking of short-term fluctuations, whereas temporally aligned evaluation significantly improves the physical consistency of neural network predictions. These results demonstrate that temporally consistent validation is essential for reliable SCADA-based modeling of biomass power generation and provide a practical foundation for real-time monitoring and decision-support applications in industrial biomass power plants.
dc.description.sponsorshipSinop University
dc.description.sponsorshipOpen access funding provided by the Scientific and Technological Research Council of Turkiye (TÜBİTAK).
dc.identifier.doi10.1007/s13369-026-11143-y
dc.identifier.issn2193-567X
dc.identifier.issn2191-4281
dc.identifier.orcid0000-0003-4868-0626
dc.identifier.orcid0000-0002-4228-9637
dc.identifier.scopus2-s2.0-105030700174
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.1007/s13369-026-11143-y
dc.identifier.urihttps://hdl.handle.net/11486/8235
dc.identifier.wosWOS:001694945300001
dc.identifier.wosqualityQ2
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherSpringer Heidelberg
dc.relation.ispartofArabian Journal for Science and Engineering
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Academic Staff
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20260420
dc.subjectBiomass
dc.subjectSCADA
dc.subjectMachine learning
dc.subjectTemporal information leakage
dc.subjectShort-term power forecasting
dc.titleTime-Aware Machine Learning for Biomass Power Output Estimation Using SCADA Data
dc.typeArticle

Files