Classification by Decision Tree Induction - Advantages of Decision Trees, Overfitting, and Tree Pruning
Decision Tree-
A decision tree is made up of a root node and branches leading from it down to leaf nodes. Each branch out of a node represents one of the possible outcomes of the test made at that node.
Classification By Decision Tree Induction-
Classification by decision tree induction classifies data using a tree structure that resembles a flowchart: each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf (terminal) node holds a class label. Because building a classification tree requires no prior knowledge of the data and no domain knowledge about the industry, decision tree induction is one of the most commonly used classification methods.
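To make the flowchart structure concrete, here is a minimal sketch using scikit-learn's DecisionTreeClassifier; the toy weather data, its numeric encoding, and the feature names are assumptions chosen purely for illustration:

```python
# A minimal sketch of classification by decision tree induction.
# The toy weather dataset below is hypothetical.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [outlook (0=sunny, 1=overcast, 2=rain), humidity (0=normal, 1=high)]
X = [[0, 1], [0, 0], [1, 1], [2, 1], [2, 0], [1, 0]]
y = ["no", "yes", "yes", "no", "yes", "yes"]  # class label for each row

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Each internal node tests an attribute; each leaf holds a class label.
print(export_text(clf, feature_names=["outlook", "humidity"]))
print(clf.predict([[0, 0]]))  # classify a new, unseen example
```

The printed output shows the tree as nested if/else rules, which matches the flowchart description above.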
Advantages of Decision Trees-
1. It does not require any domain knowledge about the industry.
2. It is easy to draw and easy to maintain.
3. The classification steps of a decision tree are simple to follow.
Overfitting and Tree Pruning-
When you build a classification decision tree, the problem of overfitting can arise. If overfitting occurs while the tree is being constructed, the accuracy of the resulting classifier drops. Overfitting means that extra, unneeded information has been captured in the decision tree by adding extra nodes below the root. To avoid overfitting, pruning is used. Two main pruning methods address the problem-
1. Pre-Pruning
2. Post-Pruning
Pre-Pruning-
Pre-pruning checks, before an extra node is created, whether a further split or partition of the tree is actually required; if it is not, construction halts at that node. In this way pre-pruning keeps unnecessary nodes out of the tree from the start.
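As a sketch of pre-pruning in practice, scikit-learn's DecisionTreeClassifier accepts limits such as max_depth and min_samples_split that decide up front when no further split is made; the Iris dataset and the particular limit values below are assumptions for illustration:

```python
# A sketch of pre-pruning: tree growth is stopped early via constructor limits.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth and min_samples_split decide "no further split" in advance,
# so unnecessary nodes are never added to the tree.
pre_pruned = DecisionTreeClassifier(
    max_depth=3,           # never split beyond depth 3
    min_samples_split=10,  # do not split a node holding fewer than 10 samples
    random_state=0,
).fit(X, y)

print("tree depth:", pre_pruned.get_depth())
print("leaf count:", pre_pruned.get_n_leaves())
```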
Post-Pruning-
In post-pruning, the tree is grown in full first; afterwards, unused nodes and the classification rules attached to them are deleted from the completed decision tree. Applying post-pruning typically increases the accuracy of the tree.
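One concrete form of post-pruning is scikit-learn's minimal cost-complexity pruning, controlled by the ccp_alpha parameter; in the sketch below the dataset and the alpha value are assumptions, and a full tree is grown alongside a pruned one for comparison:

```python
# A sketch of post-pruning via cost-complexity pruning: grow the full tree,
# then cut back branches that contribute little to accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A larger ccp_alpha removes more nodes from the fully grown tree.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X_train, y_train)

print("full tree leaves:  ", full.get_n_leaves(), "test accuracy:", full.score(X_test, y_test))
print("pruned tree leaves:", pruned.get_n_leaves(), "test accuracy:", pruned.score(X_test, y_test))
```

With a suitable alpha, the pruned tree is smaller and often generalizes better, which is exactly the accuracy improvement described above.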
Explanation-
Decision Tree Induction is one of the most widely used and effective techniques for classification in data mining and machine learning. It is a method that builds a model in the form of a tree structure to predict the class label of a given dataset based on input attributes. Each internal node in the tree represents a test on an attribute, each branch represents the outcome of that test, and each leaf node represents a class label or decision outcome.
The process of decision tree induction starts with the root node, which contains all the training data. The algorithm then selects the attribute that best separates the data into different classes. This selection is based on a measure of purity such as Information Gain, Gain Ratio, or Gini Index. The dataset is then split into subsets based on this attribute, and the process is repeated recursively for each subset until the data cannot be split further or the desired level of classification accuracy is achieved.
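Because attribute selection drives the whole induction process, the following sketch shows how Information Gain could be computed from entropy; the helper functions and the toy split are hypothetical, not taken from any particular library:

```python
# A sketch of attribute selection: Shannon entropy and Information Gain.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent, subsets):
    """Entropy of the parent minus the weighted entropy of its subsets."""
    total = len(parent)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent) - weighted

# Hypothetical split: an attribute divides 10 labels into two subsets.
parent = ["yes"] * 6 + ["no"] * 4
split = [["yes"] * 5 + ["no"], ["yes"] + ["no"] * 3]
print(round(information_gain(parent, split), 3))  # higher gain = purer split
```

The attribute whose split yields the highest gain becomes the next test node, and the same calculation is repeated recursively on each subset.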
One of the most popular algorithms used for decision tree induction is ID3 (Iterative Dichotomiser 3), which uses Information Gain as a criterion for attribute selection. Its successors, C4.5 and CART (Classification and Regression Tree), improved the method by handling continuous attributes, missing values, and pruning unnecessary branches to prevent overfitting. Pruning is an important step that simplifies the tree by removing branches that provide little or no improvement in classification accuracy, thus making the model more general and reliable.
Decision tree induction offers several advantages: it is easy to interpret, non-parametric, and can handle both categorical and numerical data. It also provides a clear visual representation of decision rules, making it useful for knowledge discovery and decision support. However, it can be sensitive to noisy data and small changes in the dataset, which may lead to different tree structures.
In real-world applications, decision tree classification is widely used in areas such as medical diagnosis, customer segmentation, fraud detection, and credit risk assessment. By converting large datasets into simple and interpretable rules, decision tree induction helps in making efficient and accurate predictions.
