The first five free decision tree tools in this list support the manual construction of decision trees, which are often used in decision support. iBoske, Lucidchart and SilverDecisions are online tools, and the others are installable. All products in this list are free to use forever, and are not free trials (of which there are many).
Lucidchart offers a free, but limited subscription to its Online Decision Maker decision tree software.
Gambit is an open-source collection of tools for doing computation in game theory and, by extension, decision trees. Gambit’s graphical user interface provides an “integrated development environment” to help visually construct games and trees and to investigate their main strategic features.
iBoske is a decision tree creation and sharing platform. It’s 100% web based, free and mobile friendly. It allows anyone, with no previous knowledge, to share their experience in a directly applicable way (nowadays known as knowledge application), either on iBoske or embedded in their own website. End users can use the trees with a player similar to YouTube or SlideShare.
SilverDecisions is a Free Open Source tool for creating and analyzing decision trees. The application provides a browser-based interface for manual tree modeling and a rich set of layout options. For a constructed tree, a set of optimal decisions is found and highlighted. The decision tree with all its characteristics can be exported to JSON, PNG and SVG formats. The application development was financed by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 645860.
Simple Decision Tree is an Excel Add-in created by Thomas Seyller. The Add-in is released under the terms of GPL v3 with additional permissions.
The remaining tools create decision trees as part of an analysis process. KNIME and RapidMiner are data mining platforms, while the other products are more focused on decision trees.
BigML is a cloud-based machine learning platform with good support for decision tree generation. A free account is offered, with data sets limited to a maximum of 60MB.
GATree decision tree tool makes use of genetic algorithms to evolve binary decision trees. It allows users to set the characteristics of the resulting decision tree and can provide a set of different decision trees that match the solution space. The free version restricts the number of generations and the maximum size of the initial population. The commercial version costs 200 Euro.
KNIME is an open source data mining platform with good support for decision trees. Although not as powerful as some of the stand-alone tools, the user interface is friendly and most business problems can be addressed with it.
RapidMiner is a data mining platform that comes in both free and paid versions. Like KNIME, it provides good support for decision trees.
SMILES is a machine learning system that integrates many different features from other machine learning techniques and paradigms and, more importantly, presents several innovations in almost all of these features. In particular, SMILES extends classical decision tree learners in many ways (new splitting criteria, non-greedy search, new partitions, extraction of several different solutions), has an anytime handling of resources, and has a sophisticated and quite effective handling of (misclassification and test) costs.
YaDT is a new from-scratch implementation of the entropy-based tree construction algorithm. It has been designed and implemented in C++ with strong emphasis on efficiency (time and space) and portability (Windows/Linux, 32/64 bit executable).
By Michael Galarnyk, Data Scientist
Decision trees are a popular supervised learning method for a variety of reasons. Benefits of decision trees include that they can be used for both regression and classification, they don’t require feature scaling, and they are relatively easy to interpret as you can visualize decision trees. This is not only a powerful way to understand your model, but also to communicate how your model works. Consequently, it would help to know how to make a visualization based on your model.
This tutorial covers:
- How to Fit a Decision Tree Model using Scikit-Learn
- How to Visualize Decision Trees using Matplotlib
- How to Visualize Decision Trees using Graphviz (what is Graphviz, how to install it on Mac and Windows, and how to use it to visualize decision trees)
- How to Visualize Individual Decision Trees from Bagged Trees or Random Forests®
As always, the code used in this tutorial is available on my GitHub. With that, let’s get started!
How to Fit a Decision Tree Model using Scikit-Learn
In order to visualize decision trees, we first need to fit a decision tree model using scikit-learn. If this section is not clear, I encourage you to read my Understanding Decision Trees for Classification (Python) tutorial, as I go into a lot of detail on how decision trees work and how to use them.
Import Libraries
The following import statements are what we will use for this section of the tutorial.
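A minimal set of imports that would cover this section might look like the sketch below. It assumes scikit-learn and matplotlib are installed; the exact list is an assumption based on the steps that follow, not a definitive requirement.

```python
# Plotting library used later to draw the fitted trees
import matplotlib.pyplot as plt

# scikit-learn pieces used in this section
from sklearn import tree                               # tree.plot_tree for visualization
from sklearn.datasets import load_iris                 # built-in Iris dataset
from sklearn.model_selection import train_test_split   # train/test splitting
from sklearn.tree import DecisionTreeClassifier        # the model we will fit
```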
Load the Dataset
The Iris dataset is one of the datasets that ships with scikit-learn, so it does not require downloading any file from an external website. The code below loads the Iris dataset.
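As a rough sketch, loading the dataset can be done with scikit-learn's load_iris function. The variable names X and y are just a common convention chosen here for illustration.

```python
from sklearn.datasets import load_iris

# load_iris returns a Bunch object holding the data and its metadata
data = load_iris()

X = data.data      # feature matrix: 150 flowers x 4 measurements
y = data.target    # integer class labels (0, 1, 2)

print(data.feature_names)   # names of the four measurement columns
print(data.target_names)    # names of the three iris species
```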
Splitting Data into Training and Test Sets
The code below puts 75% of the data into a training set and 25% of the data into a test set.
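One way to get such a split is scikit-learn's train_test_split. In this sketch, test_size=0.25 matches the 75%/25% proportions described above, while random_state=0 is an assumption added only to make the split reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()

# test_size=0.25 keeps 25% of the rows for testing;
# random_state=0 is an arbitrary seed (an assumption) so the split is repeatable
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)

# 150 samples -> 112 for training, 38 for testing
print(X_train.shape, X_test.shape)
```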
Scikit-learn 4-Step Modeling Pattern
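The pattern is usually described as: import the model, make an instance of it, train it on the data, and predict labels for unseen data. The sketch below follows those four steps. The hyperparameters max_depth=2 and random_state=0 are assumptions picked to keep the tree small and the result repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Step 1: import the model you want to use
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)

# Step 2: make an instance of the model
# (max_depth=2 and random_state=0 are illustrative assumptions)
clf = DecisionTreeClassifier(max_depth=2, random_state=0)

# Step 3: train the model on the training data
clf.fit(X_train, y_train)

# Step 4: predict labels of unseen (test) data
predictions = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```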
How to Visualize Decision Trees using Matplotlib
As of scikit-learn version 0.21.0 (roughly May 2019), decision trees can be plotted with matplotlib using scikit-learn’s tree.plot_tree, without relying on the dot library, a hard-to-install dependency that we will cover later in the blog post.
The code below plots a decision tree using scikit-learn.
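A minimal sketch of such a plot is shown here. It refits a small tree so that it runs on its own; max_depth=2, random_state=0, and the figure size are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# Draw the fitted tree onto a matplotlib axes;
# plot_tree returns the annotation boxes it created, one per drawn node
fig, ax = plt.subplots(figsize=(4, 4))
annotations = tree.plot_tree(clf, ax=ax)
```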
In addition to adding the code that lets you save your image, the code below tries to make the decision tree more interpretable by adding feature and class names (as well as setting filled = True).
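A sketch of what that could look like follows. The output file name iris_tree.png and the dpi value are hypothetical choices, and the names are converted to plain lists to stay on the safe side across scikit-learn versions.

```python
from pathlib import Path

import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

fig, ax = plt.subplots(figsize=(4, 4), dpi=300)
tree.plot_tree(
    clf,
    feature_names=list(data.feature_names),  # label splits with column names
    class_names=list(data.target_names),     # label nodes with species names
    filled=True,                             # color nodes by majority class
    ax=ax,
)

# Save the figure to disk (the file name is a hypothetical choice)
out_path = Path("iris_tree.png")
fig.savefig(out_path)
```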
How to Visualize Decision Trees using Graphviz
Graphviz is open source graph visualization software. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. In data science, one use of Graphviz is to visualize decision trees. I should note that the reason I am going over Graphviz after covering Matplotlib is that getting this to work can be difficult. The first part of this process involves creating a dot file. A dot file is a Graphviz representation of a decision tree. The problem is that using Graphviz to convert the dot file into an image file (png, jpg, etc) can be difficult. There are several ways to do this, including installing python-graphviz through Anaconda, installing Graphviz through Homebrew (Mac), installing Graphviz executables from the official site (Windows), and using an online converter on the contents of your dot file to convert it into an image.
Export your model to a dot file
The code below will work on any operating system, as Python generates the dot file and exports it as a file named tree.dot.
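The export step can be sketched with scikit-learn's export_graphviz. The file name tree.dot comes from the text above; the model settings are the same illustrative assumptions used earlier.

```python
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

# Write a Graphviz (dot language) description of the fitted tree.
# This only needs scikit-learn, not a Graphviz installation.
export_graphviz(
    clf,
    out_file="tree.dot",
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    filled=True,
)

dot_source = Path("tree.dot").read_text()
print(dot_source[:60])  # peek at the start of the generated file
```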
Installing and Using Graphviz
Converting the dot file into an image file (png, jpg, etc) typically requires installing Graphviz, and how you do that depends on your operating system and a host of other things. The goal of this section is to help people solve the common error dot: command not found.
How to Install and Use on Mac through Anaconda
To be able to install Graphviz on your Mac through this method, you first need to have Anaconda installed (If you don’t have Anaconda installed, you can learn how to install it here).
Open a terminal. You can do this by clicking on the Spotlight magnifying glass at the top right of the screen, typing terminal, and then clicking on the Terminal icon.
Type the command conda install python-graphviz to install Graphviz.
After that, you should be able to use the dot command below to convert the dot file into a png file.
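If you would rather stay in Python, the same conversion can be sketched by calling the dot executable through subprocess. The helper name dot_to_png and the example file names are hypothetical; the -Tpng and -o options are standard Graphviz command-line flags. The helper checks whether dot is on your PATH first, which is the situation behind the dot: command not found error.

```python
import shutil
import subprocess
from pathlib import Path


def dot_to_png(dot_path, png_path):
    """Convert a dot file to PNG using the Graphviz `dot` executable.

    Returns True if the conversion succeeded, False otherwise.
    (Hypothetical helper written for this illustration.)
    """
    if shutil.which("dot") is None:
        # Graphviz is not installed or not on PATH
        return False
    # Equivalent to running: dot -Tpng <dot_path> -o <png_path>
    result = subprocess.run(
        ["dot", "-Tpng", str(dot_path), "-o", str(png_path)], check=False
    )
    return result.returncode == 0


# Tiny stand-in graph so this sketch runs on its own;
# in the tutorial you would pass tree.dot instead
Path("example.dot").write_text("digraph Tree { a -> b; }")
converted = dot_to_png("example.dot", "example.png")
print("converted" if converted else "could not convert (is Graphviz installed?)")
```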
How to Install and Use on Mac through Homebrew
If you don’t have Anaconda or just want another way of installing Graphviz on your Mac, you can use Homebrew. I previously wrote an article on how to install Homebrew and use it to convert a dot file into an image file here (see the Homebrew to Help Visualize Decision Trees section of the tutorial).
How to Install and Use on Windows through Anaconda
This is the method I prefer on Windows. To be able to install Graphviz on Windows through this method, you first need to have Anaconda installed (If you don’t have Anaconda installed, you can learn how to install it here).
Open a terminal/command prompt and enter the command conda install python-graphviz to install Graphviz.
After that, you should be able to use the dot command below to convert the dot file into a png file.
How to Install and Use on Windows through Graphviz Executable
If you don’t have Anaconda or just want another way of installing Graphviz on Windows, you can use the following link to download and install it.
How to Use an Online Converter to Visualize your Decision Trees
If all else fails or you simply don’t want to install anything, you can use an online converter.
In the image below, I opened the file with Sublime Text (though there are many different programs that can open/read a dot file) and copied the content of the file.
In the image below, I pasted the content from the dot file onto the left side of the online converter. You can then choose what format you want and then save the image on the right side of the screen.
Keep in mind that there are other online converters that can help accomplish the same task.
How to Visualize Individual Decision Trees from Bagged Trees or Random Forests®
A weakness of decision trees is that they don’t tend to have the best predictive accuracy. This is partially because of high variance, meaning that different splits in the training data can lead to very different trees.
The image above could be a diagram for Bagged Trees or the random forest algorithm, both of which are ensemble methods. This means using multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. In this case, many trees protect each other from their individual errors. How exactly Bagged Trees and the random forest algorithm work is a subject for another blog, but what is important to note is that for both models we grow N trees, where N is the number of decision trees a user specifies. Consequently, after you fit a model, it would be nice to look at the individual decision trees that make up your model.
Fit a Random Forest® Model using Scikit-Learn
In order to visualize individual decision trees, we first need to fit a Bagged Trees or Random Forest® model using scikit-learn (the code below fits the random forest algorithm model).
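A minimal sketch of that fit is shown below. n_estimators=100 matches the 100 estimators mentioned later in this tutorial, while random_state=0 is an assumption added for repeatability.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)

# Grow N = 100 trees; random_state=0 is an illustrative assumption
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# The individual fitted trees are stored in the estimators_ attribute
print(len(rf.estimators_))
```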
Visualizing your Estimators
You can now view all the individual trees from the fitted model. In this section, I will visualize all the decision trees using matplotlib.
You can now visualize individual trees. The code below visualizes the first decision tree.
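Each entry of a fitted forest's estimators_ attribute is an ordinary decision tree, so it can be passed to tree.plot_tree like any other. The sketch below plots the first one. The figure size and model settings are assumptions.

```python
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Index 0 selects the first decision tree in the forest
first_tree = rf.estimators_[0]

fig, ax = plt.subplots(figsize=(6, 6))
annotations = tree.plot_tree(
    first_tree,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    filled=True,
    ax=ax,
)
```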
You can try to use matplotlib subplots to visualize as many of the trees as you like. The code below visualizes the first 5 decision trees. I personally don’t prefer this method as it is even harder to read.
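A sketch of the subplot approach follows: one row of five axes, each holding one tree. The figure dimensions and titles are assumptions, and as noted above the result tends to be cramped.

```python
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# One row with five panels, one per tree
fig, axes = plt.subplots(nrows=1, ncols=5, figsize=(15, 3))

for index, ax in enumerate(axes):
    # Plot the tree at this position in the forest on its own axes
    tree.plot_tree(
        rf.estimators_[index],
        feature_names=list(data.feature_names),
        class_names=list(data.target_names),
        filled=True,
        ax=ax,
    )
    ax.set_title(f"Estimator: {index}", fontsize=11)
```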
Create Images for each of the Decision Trees (estimators)
Keep in mind that if for some reason you want images for all your estimators (decision trees), you can do so using the code on my GitHub. If you just want to see each of the 100 estimators for the random forest algorithm model fit in this tutorial without running the code, you can look at the video below.
Concluding Remarks
This tutorial covered how to visualize decision trees using Graphviz and Matplotlib. Note that the way to visualize decision trees using Matplotlib is a newer method so it might change or be improved upon in the future. Graphviz is currently more flexible as you can always modify your dot files to make them more visually appealing like I did using the dot language or even just alter the orientation of your decision tree. One thing we didn’t cover was how to use dtreeviz which is another library that can visualize decision trees. There is an excellent post on it here.
If you have any questions or thoughts on the tutorial, feel free to reach out in the comments below or through Twitter. If you want to learn more about how to utilize Pandas, Matplotlib, or Seaborn libraries, please consider taking my Python for Data Visualization LinkedIn Learning course.
RANDOM FORESTS and RANDOMFORESTS are registered marks of Minitab, LLC.
Bio: Michael Galarnyk is a Data Scientist and Corporate Trainer. He currently works at Scripps Translational Research Institute. You can find him on Twitter (https://twitter.com/GalarnykMichael), Medium (https://medium.com/@GalarnykMichael), and GitHub (https://github.com/mGalarnyk).
Original. Reposted with permission.