A1 Journal article (refereed), original research

A generalized fuzzy k-nearest neighbor regression model based on Minkowski distance


Open Access hybrid publication


Publication Details

Authors: Mailagaha Kumbure Mahinda, Luukka Pasi

Publication year: 2021

Language: English

Related journal or series: Granular Computing

ISSN: 2364-4966

eISSN: 2364-4974

JUFO level of this publication: 1

Digital Object Identifier (DOI): http://dx.doi.org/10.1007/s41066-021-00288-w

Permanent website address: https://link.springer.com/article/10.1007/s41066-021-00288-w

Social media address: https://www.researchgate.net/publication/354848035_A_generalized_fuzzy_k-nearest_neighbor_regression_model_based_on_Minkowski_distance

Open Access: Open Access hybrid publication


Abstract

The fuzzy k-nearest neighbor (FKNN) algorithm, one of the most
well-known and effective supervised learning techniques, has often been
used in data classification problems but rarely in regression settings.
This paper introduces a new, more general fuzzy k-nearest neighbor
regression model. Generalization is based on the usage of the Minkowski
distance instead of the usual Euclidean distance. The Euclidean distance
is often not the optimal choice for practical problems, and better
results can be obtained by generalizing it. Using the Minkowski
distance allows the proposed method to obtain more reasonable nearest
neighbors to the target sample. Another key advantage of this method is
that the nearest neighbors are weighted by fuzzy weights based on their
similarity to the target sample, leading to the most accurate prediction
through a weighted average. The performance of the proposed method is
tested with eight real-world datasets from different fields and
benchmarked against the k-nearest neighbor and three other
state-of-the-art regression methods. The Manhattan distance- and
Euclidean distance-based FKNNreg methods are also implemented, and the
results are compared. The empirical results show that the proposed
Minkowski distance-based fuzzy regression (Md-FKNNreg) method
outperforms the benchmarks and can be a good algorithm for regression
problems. In particular, the Md-FKNNreg model gave a significantly
lower overall average root mean square error (0.0769) than all other
regression methods used. As a special case of the Minkowski distance,
the Manhattan distance yielded the optimal conditions for Md-FKNNreg and
achieved the best performance for most of the datasets.
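
The prediction step described in the abstract (find the k nearest neighbors under the Minkowski distance, weight them by similarity, and return a weighted average) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact fuzzy weight formula, so the sketch assumes the inverse-distance membership weighting of the classical FKNN algorithm with a fuzzifier `m`. The function names `minkowski_distance` and `md_fknn_predict`, the default parameter values, and the small constant `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p between each row of `a` and the vector `b`.

    p = 1 gives the Manhattan distance, p = 2 the Euclidean distance.
    """
    return np.sum(np.abs(a - b) ** p, axis=-1) ** (1.0 / p)


def md_fknn_predict(X_train, y_train, x, k=3, p=1.0, m=2.0, eps=1e-12):
    """Predict the target for one query sample `x` (illustrative sketch).

    Assumption: the fuzzy weights follow the classical FKNN form
    w_j = 1 / d_j^(2 / (m - 1)), with fuzzifier m > 1. The paper's
    exact weighting may differ.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x = np.asarray(x, dtype=float)

    # Distances from the query to every training sample.
    d = minkowski_distance(X_train, x, p)

    # Indices of the k nearest neighbors.
    nn = np.argsort(d)[:k]

    # Closer neighbors get larger weights; eps guards against division by zero
    # when the query coincides with a training sample.
    w = 1.0 / (d[nn] ** (2.0 / (m - 1.0)) + eps)

    # Weighted average of the neighbors' target values.
    return float(np.sum(w * y_train[nn]) / np.sum(w))


# Toy example: two equally distant neighbors with targets 1 and 2,
# so the prediction is approximately their plain average.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [0.0, 1.0, 2.0, 10.0]
prediction = md_fknn_predict(X, y, [1.5], k=2, p=1.0)
```

In this sketch the order `p` is an ordinary parameter, so the Manhattan (`p=1`) and Euclidean (`p=2`) variants mentioned in the abstract are obtained by changing a single argument. In practice `k`, `p`, and `m` would be chosen by validation on the dataset at hand.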


Last updated on 2021-09-28 at 12:47