%0 Journal Article
%T Mass Valuation of Unimproved Land Value Case Study: Nairobi County
%A Edwin Kochulem
%A Dennis Mwaniki
%A Felix Mutua
%J Journal of Geographic Information System
%P 122-139
%@ 2151-1969
%D 2023
%I Scientific Research Publishing
%R 10.4236/jgis.2023.151008
%X This study investigates mass valuation of unimproved land using machine learning techniques. It was conducted in Nairobi County, one of the 47 Kenyan counties under the 2010 constitution. A total of 1440 geocoded data points containing the market selling price of vacant land in Nairobi were web scraped from major property listing websites. These prices, expressed as the unit price of vacant land per square meter, served as the dependent variable. The covariates were categorized into accessibility, environmental, physical and socio-economic factors. Because the covariates were multicollinear, partial least squares (PLS) and principal component analysis (PCA) were used to transform the observed features into a set of uncorrelated components. The dependent variable and these components were then used to train three machine learning regression models: random forest, support vector regression (SVR) and extreme gradient boosting (XGBoost) regression. PLS performed better than PCA because PLS maximizes the covariance between the dependent and independent variables, whereas PCA maximizes the variance among the independent variables only and ignores the relationship between predictors and response. Both PLS and PCA identified the first 9 components as significant. The spatial distribution of vacant land value within Nairobi County was consistent across all three machine learning models.
It was also noted that land values were higher in the central business district (CBD), and that this pattern extended northwards and westwards from the CBD. Relatively low vacant land values were observed on the eastern side of the county and at the extreme periphery of the Nairobi County boundary. Based on the R-squared and MAPE accuracy metrics, the random forest regression model performed better than the XGBoost and SVR models. This confirms the capability of the random forest model to predict valid estimates of vacant land value for purposes of property taxation in Nairobi County.
%K Machine Learning
%K Property Valuation
%K GIS
%K PLS
%K PCA
%U http://www.scirp.org/journal/PaperInformation.aspx?PaperID=123395
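
The abstract's explanation of why PLS outperformed PCA can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are a hypothetical toy set with two predictors, and all names (`x1`, `x2`, `first_pca_direction`, `first_pls_direction`) are assumptions made for illustration. The first PCA direction is taken as the leading eigenvector of the predictor covariance matrix, and the first PLS direction as the normalized vector X^T y, which is the unit-norm direction maximizing covariance with the response.

```python
import math

# Hypothetical toy data, not from the study.
# x1 has large variance but no relation to y; x2 has small variance and drives y.
x1 = [-10.0, 10.0, -10.0, 10.0, -10.0, 10.0, -10.0, 10.0]
x2 = [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
y = [3.0 * v for v in x2]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def center(v):
    mean = sum(v) / len(v)
    return [p - mean for p in v]


def first_pca_direction(a, b):
    """Leading eigenvector of the 2x2 covariance matrix of (a, b).

    Uses the closed-form principal-axis angle for a symmetric 2x2 matrix.
    The response is never consulted.
    """
    a, b = center(a), center(b)
    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    theta = 0.5 * math.atan2(2.0 * sab, saa - sbb)
    return (math.cos(theta), math.sin(theta))


def first_pls_direction(a, b, target):
    """First PLS weight vector: X^T y scaled to unit length."""
    a, b, target = center(a), center(b), center(target)
    wa, wb = dot(a, target), dot(b, target)
    norm = math.hypot(wa, wb)
    return (wa / norm, wb / norm)


def scores(a, b, w):
    """Project each observation onto direction w."""
    return [w[0] * p + w[1] * q for p, q in zip(a, b)]


def correlation(u, v):
    u, v = center(u), center(v)
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


pca_w = first_pca_direction(x1, x2)
pls_w = first_pls_direction(x1, x2, y)
pca_corr = correlation(scores(x1, x2, pca_w), y)
pls_corr = correlation(scores(x1, x2, pls_w), y)

print("PCA direction:", pca_w, "|corr with y| =", abs(pca_corr))
print("PLS direction:", pls_w, "|corr with y| =", abs(pls_corr))
```

In this contrived case the first PCA component follows the high-variance predictor and carries no information about the response, while the first PLS component aligns with the predictor that actually explains it. Real covariates are rarely this cleanly separated, so the gap between the two methods would normally be smaller.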