An approach of improving decision tree classifier using condensed informative data

Panhalkar, Archana R.; Doye, Dharmpal D.

Please use this identifier to cite or link to this item: https://ir.iimcal.ac.in:8443/jspui/handle/123456789/3331

Full metadata record

DC Field	Value	Language
dc.contributor.author	Panhalkar, Archana R.
dc.contributor.author	Doye, Dharmpal D.
dc.date.accessioned	2021-08-27T09:10:35Z
dc.date.available	2021-08-27T09:10:35Z
dc.date.issued	2020-12
dc.identifier.issn	0304-0941 (print version) ; 2197-1722 (electronic version)
dc.identifier.uri	https://doi.org/10.1007/s40622-020-00265-3
dc.identifier.uri	https://ir.iimcal.ac.in:8443/jspui/handle/123456789/3331
dc.description	Archana R. Panhalkar & Dharmpal D. Doye, Shri Guru Gobind Singhji Institute of Engineering and Technology, Vishnupuri, Nanded, Maharashtra, India
dc.description	p.431-445
dc.description	Issue Editor – Arnab Adhikari & Adrija Majumdar
dc.description.abstract	New technologies produce vast amounts of data. Storing, analyzing and mining knowledge from such data requires large storage space and fast execution, and training classifiers on it costs additional time and space. To avoid wasting these resources, significant information must be mined from the large collection of data. The decision tree is one of the promising classifiers for mining knowledge from huge data. This paper aims to reduce the data used to construct an efficient decision tree classifier, and presents a method which finds informative data to improve the classifier's performance. Two clustering-based methods are proposed for dimensionality reduction and for utilizing knowledge from outliers. The condensed data are then used to train the decision tree for high prediction accuracy. The first method is unique in that it finds representative instances from clusters using knowledge of their neighboring data. The second method uses supervised clustering to find the number of cluster representatives for the reduction of data. Along with increasing the prediction accuracy of the tree, these methods decrease the size, building time and space required for decision tree classifiers. These novel methods are united into a single supervised and unsupervised Decision Tree based on Cluster Analysis Pre-processing (DTCAP), which hunts for informative instances in small, medium and large datasets. The experiments are conducted on standard UCI datasets of different sizes. They illustrate that the method, despite its simplicity, reduces the data by up to 50% and produces a qualitative dataset which enhances the performance of the decision tree classifier.
dc.publisher	Indian Institute of Management Calcutta, Kolkata
dc.relation.ispartofseries	Vol.47;No.4 (Special Issue on Emerging technologies and operational analytics)
dc.subject	Data mining
dc.subject	Decision tree classifier
dc.subject	K-means clustering
dc.subject	C4.5
dc.subject	Instance reduction
dc.title	An approach of improving decision tree classifier using condensed informative data
dc.type	Article
Appears in Collections:	Issue 4, December 2020
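
The abstract describes condensing the training data with clustering before a decision tree is built. The following is a minimal sketch of that general idea only, not the authors' DTCAP procedure, whose details are not given in this record. It assumes a plain Lloyd's k-means run separately inside each class and keeps the real instance nearest to each centroid. The function names `kmeans` and `condense` and the parameter `reps_per_class` are hypothetical.

```python
import math
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct positions
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def condense(X, y, reps_per_class=2, seed=0):
    """Assumed reduction rule: per class, keep the instance closest to each centroid."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(tuple(features))

    X_small, y_small = [], []
    for label, points in by_class.items():
        k = min(reps_per_class, len(points))
        chosen = []
        for c in kmeans(points, k, seed=seed):
            rep = min(points, key=lambda p: math.dist(p, c))
            if rep not in chosen:  # two centroids may share a nearest instance
                chosen.append(rep)
        X_small.extend(chosen)
        y_small.extend([label] * len(chosen))
    return X_small, y_small
```

The condensed pair returned by `condense` could then be passed to any decision tree learner in place of the full training set. Clustering within each class keeps every label represented in the reduced data, which is one simple way to read "supervised clustering" here.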

Files in This Item:

File	Size	Format
An approach of improving decision tree classifier using.pdf Until 2027-03-31	1.54 MB	Adobe PDF
