Tree Mining in Frequent Patterns
The frequent pattern tree (often called an FP-tree) stores quantitative information about frequent patterns in a compact form. The structure is built from a root and nodes: there is one root node, with a set of subtrees as its children. Each node in a subtree consists of 3 fields-
1. Item name: the item that the node represents.
2. Count: the number of transactions represented by the path that reaches the node.
3. Node link: a link to the next node in the frequent tree that carries the same item name.
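The three fields above can be sketched as a small Python class. This is only an illustration: the names FPNode and total_count are assumptions made for this example, not part of any library. Following the node links from the first node of an item and adding up the counts gives the total support of that item in the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FPNode:
    # Hypothetical node type for illustration; field names are assumptions.
    item: Optional[str]                    # field 1: item name (None for the root)
    count: int = 0                         # field 2: transactions represented by this node
    node_link: Optional["FPNode"] = None   # field 3: next node with the same item name


def total_count(first: Optional[FPNode]) -> int:
    """Add up the counts of all nodes reachable through the node-link chain."""
    total = 0
    node = first
    while node is not None:
        total += node.count
        node = node.node_link
    return total


# Two nodes for item "b" on different branches, chained by a node link.
b2 = FPNode("b", count=1)
b1 = FPNode("b", count=3, node_link=b2)
print(total_count(b1))  # 4
```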
Algorithm for constructing the Frequent pattern tree-
Input- Transaction database
Output- Frequent tree
Steps-
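1. Scan the whole database once and select the frequent items, together with their support counts.
2. Sort the frequent items in descending order of support to form the frequent item list.
3. Create the root, then repeat the following step for every transaction until the final output is reached: select the frequent items of the transaction, sort them according to the list from step 2, and call Insert tree([P|Q], T).
• where P is the first element of the sorted list
• Q is the remaining list
• T is the node under which the items are inserted. If T already has a child for item P, the count of that child is increased by 1; otherwise a new child with count 1 is created and connected through the node link. The call is then repeated for Q below that child.
Advantages of Frequent Tree-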
1. The database is scanned only 2 times during construction.
2. The frequent pattern tree contains all the information needed for mining frequent patterns, so the database does not have to be scanned again.
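Example-
STEP 1- Scan the database, count the items, and arrange the frequent items in descending order of support.
STEP 2- Scan the database a second time and order the frequent items in each transaction accordingly.
STEP 3- Construct the frequent tree by inserting the ordered transactions one by one.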
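The three steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the Node class, the build_fp_tree function, the header dictionary, the alphabetical tie-break, and the four sample transactions are all made up for illustration.

```python
from collections import Counter


class Node:
    """Hypothetical tree node: item name, count, node link, plus parent/children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}     # item name -> child Node
        self.node_link = None  # next node carrying the same item


def build_fp_tree(transactions, min_support):
    # Scan 1: count in how many transactions each item appears.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    # Descending support; ties broken alphabetically (an assumption).
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}

    root = Node(None)
    header = {}  # item -> first node of its node-link chain
    tail = {}    # item -> last node of its node-link chain

    # Scan 2: insert the ordered frequent items of each transaction.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent), key=rank.__getitem__)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                if item in tail:
                    tail[item].node_link = child
                else:
                    header[item] = child
                tail[item] = child
            child.count += 1
            node = child
    return root, header


# Illustrative data only: "d" occurs once and is dropped at min_support=2.
transactions = [["a", "b"], ["b", "c", "d"], ["a", "b", "c"], ["b", "c"]]
root, header = build_fp_tree(transactions, min_support=2)
print(root.children["b"].count)  # 4
```

With these sample transactions the frequent items are ordered b, c, a, so every transaction shares the prefix b and the tree has a single branch below the root that splits after b.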
Tree mining is an advanced data mining technique used to discover frequent patterns that have a hierarchical or tree-like structure. While traditional frequent pattern mining focuses on sets or sequences of items, tree mining identifies recurring subtrees within structured data. This approach is essential in domains where data naturally forms tree structures, such as XML documents, web page hierarchies, chemical compounds, biological data, and program code analysis.
The main objective of tree mining is to find all subtrees that occur frequently within a large database of trees, based on a given support threshold. A subtree is considered frequent if the number of trees in the database that contain it, called its support, is at least that threshold. For example, in bioinformatics, certain molecular substructures (subtrees) may appear repeatedly across different compounds, providing insights into chemical behavior or biological functions.
Tree mining can be broadly categorized into two types: ordered tree mining and unordered tree mining. In ordered tree mining, the order of child nodes is significant, whereas in unordered tree mining, the order does not matter. Several algorithms have been developed for efficient tree mining, including FREQT (Frequent Tree Mining Algorithm), TreeMiner, and PrefixTree. These algorithms reduce redundancy and improve efficiency by using techniques like depth-first search, candidate pruning, and prefix-based pattern growth.
The process of tree mining generally involves three major steps: data preprocessing, candidate generation, and pattern evaluation. During preprocessing, tree data is structured and encoded for mining. Candidate subtrees are then generated and checked against the database to determine their frequency. Finally, only those subtrees meeting the minimum support threshold are retained as frequent patterns.
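The pattern evaluation step can be illustrated with a small Python sketch that counts the support of one candidate subtree in a database of ordered trees. It makes several assumptions: trees are encoded as nested (label, children) tuples, a pattern matches only when its children appear in the same left-to-right order among the children of a data node (other children may sit in between), and the function names are hypothetical. It is not the matching procedure of FREQT or TreeMiner.

```python
def matches_at(pattern, node):
    """True if pattern matches with its root placed on this data node."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    # Match the pattern children, in order, to a subsequence of the node children.
    i = 0
    for child in n_children:
        if i < len(p_children) and matches_at(p_children[i], child):
            i += 1
    return i == len(p_children)


def occurs_in(pattern, tree):
    """True if pattern matches at the root of tree or at any descendant."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])


def support(pattern, database):
    """Number of trees in the database that contain the pattern."""
    return sum(1 for tree in database if occurs_in(pattern, tree))


def is_frequent(pattern, database, min_support):
    return support(pattern, database) >= min_support


# Illustrative database of three small trees.
database = [
    ("a", (("b", ()), ("c", ()))),
    ("r", (("a", (("b", ()), ("d", ()), ("c", ()))),)),
    ("a", (("c", ()), ("b", ()))),  # same labels, but children in the other order
]
pattern = ("a", (("b", ()), ("c", ())))
print(support(pattern, database))  # 2
```

The third tree is not counted because the sketch treats the trees as ordered. An unordered variant would have to try the children in any order, which makes the matching step more expensive.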
Tree mining has numerous real-world applications. In bioinformatics, it identifies common structural motifs in proteins and molecules. In web mining, it helps analyze website hierarchies and XML data. In software engineering, it can detect recurring code structures or design patterns for optimization.
In conclusion, tree mining in frequent pattern analysis is a powerful tool in data warehousing and data mining. It enables the discovery of meaningful hierarchical relationships in complex data, leading to improved understanding, prediction, and decision-making.