Prediction of Particulate Matter (PM₁₀) concentration in industrialized areas in Malaysia
Abstract
The aim of this study is to improve the Multiple Linear Regression (MLR) prediction model by combining it with Principal Component Analysis (PCA) to predict future (next-day, next-two-day and next-three-day) PM₁₀ concentrations at Pasir Gudang and Paka. Both places are industrialized areas. Hourly observations of PM₁₀ concentration at Pasir Gudang and Paka from 2005 to 2009 were used to predict the PM₁₀ concentration level. First, descriptive statistics of the PM₁₀ concentration and the weather parameters were examined. Then, PCA was used to examine the correlation between the PM₁₀ concentration and the weather parameters. Two models of PM₁₀ concentration were developed: MLR, and MLR combined with PCA (MLR-PCA). The models were validated using performance indicators, namely two accuracy measures, i) Prediction Accuracy (PA) and ii) Coefficient of Determination (R²), and three error measures, i) Normalized Absolute Error (NAE), ii) Mean Absolute Error (MAE) and iii) Root Mean Square Error (RMSE). The results show that the MLR-PCA model performs better than the MLR model. For next-day prediction at Pasir Gudang, the performance indicators are MAE = 9.54, NAE = 0.19146, RMSE = 13.7839, PA = 0.70344 and R² = 0.4931. For Paka, they are MAE = 4.01, NAE = 0.11019, RMSE = 6.69498, PA = 0.74589 and R² = 0.5534.
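
The modelling and validation steps summarized in this abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the data are synthetic stand-ins for the weather parameters and PM₁₀ values, all function names are hypothetical, and the number of retained components is arbitrary. The abstract does not give formulas for the indicators, so the sketch assumes common definitions (PA as the Pearson correlation between observed and predicted values, R² as its square, NAE as the sum of absolute errors divided by the sum of observations, and RMSE with a 1/n normalization); the study's own definitions may differ, for example by using n - 1 in place of n.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def fit_pca(X, n_components):
    """Standardize X and return (mean, std, loadings) of the leading components."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Eigen-decomposition of the correlation matrix of the predictors.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return mean, std, eigvecs[:, order]

def pca_scores(X, mean, std, loadings):
    """Project X onto the retained components using training statistics."""
    return ((X - mean) / std) @ loadings

def indicators(obs, pred):
    """Assumed definitions of the five indicators (see the note above)."""
    err = pred - obs
    pa = np.corrcoef(obs, pred)[0, 1]
    return {
        "MAE": float(np.mean(np.abs(err))),
        "NAE": float(np.sum(np.abs(err)) / np.sum(obs)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "PA": float(pa),
        "R2": float(pa ** 2),
    }

# Synthetic data only: four made-up predictors and a made-up response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 50 + X @ np.array([3.0, -2.0, 1.0, 0.5]) + rng.normal(scale=2.0, size=200)
train, test = slice(0, 150), slice(150, 200)

# MLR-PCA: regress the response on principal component scores.
mean, std, loadings = fit_pca(X[train], n_components=2)
coef = fit_mlr(pca_scores(X[train], mean, std, loadings), y[train])
pred = predict_mlr(coef, pca_scores(X[test], mean, std, loadings))
print(indicators(y[test], pred))
```

The PCA statistics (mean, standard deviation, loadings) are estimated on the training portion only and then reused on the held-out portion, so the validation data do not influence the fitted model. When every component is retained, this form of MLR-PCA gives the same predictions as plain MLR; any difference between the two models comes from discarding the low-variance components.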