The Effectiveness of Feature Selection Method in Solar Power Prediction

DOI: 10.1155/2013/952613


Abstract:

This paper empirically shows that applying selected feature subsets to machine learning techniques significantly improves the accuracy of solar power prediction. Experiments are performed using five well-known wrapper feature selection methods to obtain the solar power prediction accuracy of machine learning techniques with selected feature subsets. All experiments use the machine learning techniques least median square (LMS), multilayer perceptron (MLP), and support vector machine (SVM). These results are then compared with the solar power prediction accuracy of the same machine learning techniques (i.e., LMS, MLP, and SVM) without applying feature selection methods (WAFS). Experiments are carried out using reliable, real-life historical meteorological data. The comparison clearly shows that LMS, MLP, and SVM provide better prediction accuracy (i.e., lower MAE and MASE) with selected feature subsets than without them. The experimental results support a concrete verdict that devoting more attention and effort to feature subset selection (e.g., the effect of selected feature subsets on prediction accuracy, which is investigated in this paper) can significantly improve the accuracy of solar power prediction.

1. Introduction

Feature selection can be considered one of the main preprocessing steps of machine learning [1]. It differs from feature extraction (or feature transformation), which creates new features by combining the original ones [2]. The advantages of feature selection are manifold. First, feature selection significantly reduces the running time of a learning procedure by eliminating irrelevant and redundant features. Second, without the interference of irrelevant, redundant, and noisy features, learning algorithms can focus on the most essential features of the data and build simpler but more precise data models. Third, feature selection helps build a simpler and more general model and provides better insight into the underlying concept of the task [3–5]. The feature selection aspect is quite significant because, even with the same training data, an individual regression algorithm may perform better with different feature subsets. The success of machine learning on a particular task is affected by many factors, first and foremost among them the representation and quality of the instance data [6]. The training stage becomes critical with the
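The study ran its wrapper experiments in WEKA [16]; as a language-agnostic illustration, the sketch below implements one common wrapper strategy, greedy forward selection driven by cross-validated MAE, in Python with scikit-learn and an SVM regressor standing in for the paper's SVM. The search strategy, function names, and synthetic data are illustrative assumptions, not the authors' exact procedure.

import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def forward_wrapper_selection(X, y, estimator=None, cv=5):
    # Greedy forward search: repeatedly add the single feature whose
    # inclusion most reduces cross-validated MAE; stop when no candidate
    # improves on the current subset.
    estimator = estimator or SVR()
    selected, remaining = [], list(range(X.shape[1]))
    best_mae = np.inf
    while remaining:
        trials = []
        for f in remaining:
            cols = selected + [f]
            # scikit-learn maximizes scores, so MAE comes back negated.
            mae = -cross_val_score(estimator, X[:, cols], y,
                                   scoring="neg_mean_absolute_error",
                                   cv=cv).mean()
            trials.append((mae, f))
        mae, f = min(trials)
        if mae >= best_mae:
            break  # no remaining feature improves MAE
        best_mae = mae
        selected.append(f)
        remaining.remove(f)
    return selected, best_mae

# Example on synthetic data (the study itself used historical
# meteorological features):
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)
print(forward_wrapper_selection(X, y))

Because the wrapper treats the learner as a black box scored only by its cross-validated error, any of the three techniques (LMS, MLP, or SVM) could be dropped in as the wrapped estimator.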
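Prediction accuracy is reported as MAE and MASE. Following Hyndman and Koehler [25], MASE scales the out-of-sample MAE by the in-sample MAE of a naive one-step-ahead forecast, so values below 1 beat the naive method. A minimal sketch, assuming NumPy arrays and illustrative variable names:

import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error over the evaluation period.
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mase(y_true, y_pred, y_train):
    # Scale out-of-sample MAE by the in-sample MAE of the naive forecast
    # y_hat[t] = y[t - 1], per Hyndman and Koehler [25].
    scale = np.mean(np.abs(np.diff(np.asarray(y_train))))
    return mae(y_true, y_pred) / scale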

References

[1]  L. Yu and H. Liu, “Feature selection for high-dimensional data: a fast correlation-based filter solution,” in Proceedings of the 20th International Conference on Machine Learning, pp. 856–863, August 2003.
[2]  F. Tan, Improving feature selection techniques for machine learning [Ph.D. thesis], 2007, Computer Science Dissertations: Paper 27, http://digitalarchive.gsu.edu/cs_diss/27.
[3]  R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artificial Intelligence, vol. 97, no. 1-2, pp. 273–324, 1997.
[4]  D. Koller and M. Sahami, “Toward optimal feature selection,” in Proceedings of the 13th International Conference on Machine Learning, pp. 284–292, 1996.
[5]  M. Dash and H. Liu, “Feature selection for classification,” Intelligent Data Analysis, vol. 1, no. 3, pp. 131–156, 1997.
[6]  T. Mitchell, Machine Learning, McGraw-Hill, 1997.
[7]  D. A. Bell and H. Wang, “A formalism for relevance and its application in feature subset selection,” Machine Learning, vol. 41, no. 2, pp. 175–195, 2000.
[8]  M. Karagiannopoulos, D. Anyfantis, S. B. Kotsiantis, and P. E. Pintelas, “Feature selection for regression problems,” in Proceedings of the 8th Hellenic European Research on Computer Mathematics & Its Applications (HERCMA '07), pp. 20–22, 2007.
[9]  H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.
[10]  P. Langley, “Selection of relevant features in machine learning,” in Proceedings of the AAAI Fall Symposium on Relevance, pp. 1–5, 1994.
[11]  A. L. Blum and P. Langley, “Selection of relevant features and examples in machine learning,” Artificial Intelligence, vol. 97, no. 1-2, pp. 245–271, 1997.
[12]  I. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003.
[13]  R. Caruana and D. Freitag, “Greedy attribute selection,” in Proceedings of the 11th International Conference on Machine Learning, San Francisco, Calif, USA, 1994.
[14]  M. Gütlein, E. Frank, M. Hall, and A. Karwath, “Large-scale attribute selection using wrappers,” in Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining (CIDM '09), pp. 332–339, April 2009.
[15]  M. Gütlein, Large scale attribute selection using wrappers [Diploma of Computer Science thesis], Albert-Ludwigs-Universität Freiburg, 2006.
[16]  M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update,” ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.
[17]  D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[18]  M. R. Hossain, A. M. T. Oo, and A. B. M. S. Ali, “The combined effect of applying feature selection and parameter optimization on machine learning techniques for solar power prediction,” American Journal of Energy Research, vol. 1, no. 1, pp. 7–16, 2013.
[19]  S. Coppolino, “A new correlation between clearness index and relative sunshine,” Renewable Energy, vol. 4, no. 4, pp. 417–423, 1994.
[20]  http://www.solarguys.com.au/.
[21]  P. J. Rousseeuw, “Least median of squares regression,” Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.
[22]  S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999.
[23]  S. K. Shevade, S. S. Keerthi, C. Bhattacharyya, and K. R. K. Murthy, “Improvements to the SMO algorithm for SVM regression,” IEEE Transactions on Neural Networks, vol. 11, no. 5, pp. 1188–1193, 2000.
[24]  H. Zheng and A. Kusiak, “Prediction of wind farm power ramp rates: a data-mining approach,” Journal of Solar Energy Engineering, vol. 131, no. 3, Article ID 031011, 8 pages, 2009.
[25]  R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” International Journal of Forecasting, vol. 22, no. 4, pp. 679–688, 2006.
[26]  IBM SPSS Statistics for Windows, Version 20.0, IBM Corporation, Armonk, NY, USA, 2011.
