Please find below the slides of my presentation about classification at the Datageeks Meetup in Munich on Feb. 26th.
dotplot designer is a new statistical cloud analytics tool based on R. It not only offers a graphical interface for common R functions, but also provides a huge repository of ready-made functions for different statistical tasks.
Although dotplot is working on improving the usability of its products, it is not obvious to inexperienced users how to import their own data.
To learn how to import your own data, there is a new tutorial video:
Note that the import wizard works in a similar manner for Excel, SPSS and WEKA’s ARFF files.
Remember that there is a basic distinction between your local file system and the cloud file system of dotplot. All files you want to use within dotplot designer need to be uploaded to the cloud first. Files generated by dotplot are stored in the cloud as well, but can be downloaded to your local file system.
Once your data is imported, you can use it with any function provided in the function tree.
Hint: Start with the Summary function in the ‘Exploration’ group to get a quick overview of your data. Double-clicking its ‘Report’ output node shows a nicely rendered version of the result.
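If you are wondering what such a quick overview boils down to, here is a minimal sketch in plain Python. It is not how dotplot implements its Summary function (dotplot is based on R, and I have not seen its code). The function name `summarize`, the choice of statistics, and the sample values are all assumptions made up for illustration.

```python
import statistics


def summarize(columns):
    """Return count, min, median, mean and max for each numeric column."""
    report = {}
    for name, values in columns.items():
        report[name] = {
            "count": len(values),
            "min": min(values),
            "median": statistics.median(values),
            "mean": statistics.mean(values),
            "max": max(values),
        }
    return report


# Hypothetical sample values, invented for illustration only.
data = {
    "height_cm": [150, 160, 170, 180],
    "weight_kg": [50, 60, 70, 80],
}

# Print one line of summary statistics per column.
for name, stats in summarize(data).items():
    print(name, stats)
```

A real summary report would usually also cover missing values and non-numeric columns, which this sketch leaves out.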
There are many reasons why people use statistics software: verifying results for scientific papers, generating business reports, or trying to gain insight into data to gather new knowledge. Although statistics is a discipline of its own, it is needed in nearly all other areas. Thus, there is a very high demand for proper statistics software, and a number of such tools have appeared over the last years, including IBM SPSS and SAS, currently the most popular desktop applications, which are widely used in enterprises. On the other hand, probably the most popular open source software is R, which is not really an application but rather a programming language for statistical analysis. While SPSS and SAS are very expensive and not easily affordable for personal users, small organisations or university departments, R is rather hard to understand and learn for people who are not computer scientists or programmers. Further, all three, like most other big players, are limited to the computational power of the user’s local machine.
There is a need for more modern, easy-to-learn, powerful and cheap software. That’s why dotplot introduced the ‘dotplot designer’ in 2013, a cloud analytics software that is free for personal use.
In this post, I want to introduce the basics of dotplot designer to make it easy to get started with the software. First, let’s have a quick look at the benefits and why it’s worth giving it a try:
- Data Analysis Modelling: There is an easy-to-understand graphical user interface where the statistical process is modelled as a flow diagram
- Cloud Analytics: The power of theoretically unlimited CPU and storage
- Accessible from everywhere and on any OS, not only on your local computer
- It’s free for personal use
- Huge amount of functionality
- Predefined solutions: Examples managed by dotplot experts and the community
Once you’ve registered, the designer can be started from the top right of the website by pressing the ‘Launch Designer’ button. Once the designer has loaded completely, which may take quite a while, a welcome dialogue offers different ways to enter the application: loading an existing project, opening a solution, etc. Whatever you choose, you will end up in the common user interface, which should look something like this (the software is pretty new and the design is updated from time to time):
We can roughly divide the interface into 4 parts:
The toolbar is located at the top and lets you open, save or create projects. It also opens the reporter and gives access to solutions and to the help system.
All functions that can be used for modelling are located in the so-called function tree or function repository. The tree is expandable and organized to model a common data analysis workflow, starting with data IO and exploration of the data, followed by preprocessing steps and analysis. Functions to create plots and graphs can be found in the ‘Visualisation’ group.
The middle of the screen shows open projects in different tabs. Each project runs in its own environment, so several projects can be used in parallel.
The right side of the user interface shows project-specific settings, configuration values for selected function cells and their results.
Having a closer look at the function tree, one can see that dotplot addresses a wide group of users. The first groups of the tree contain ready-made functions that can easily be used by inexperienced users as well. We will see how easily these functions can be used in one of my next posts. The last group, ‘Packages’, is made especially for experienced R users and provides visualisation for R packages. R functions can be used in exactly the same way as in R, but within a graphical modeller.
To get started with dotplot designer, I recommend doing the interactive tutorial, which can be started directly from the welcome dialogue or from the help menu in the top toolbar.
But generally, using dotplot designer is pretty easy and straightforward. Functions can be added to the project canvas either by double-clicking or by drag and drop. For a first test of the application, I suggest using one of the datasets provided by dotplot. You will find them in the ‘Data Management / Data Repository’ group of the function tree:
Functions typically have inputs and outputs: data is plugged into the inputs, processed, and the generated results are provided at the output nodes. To plot a histogram, for example, we add the iris dataset and connect the output of its cell to the input of the histogram function. Some functions require specific parameters to be set in the function configuration on the right side of the interface (part 4). The generated model could look like this:
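In code terms, the same dataset-to-histogram flow can be sketched roughly as two small functions, where the output of one is passed as the input of the other. This is a plain Python analogy, not what dotplot runs internally. The names `dataset_cell` and `histogram_cell`, the `bins` parameter and the sample values are assumptions for illustration (the values are invented, not the real iris data).

```python
def dataset_cell():
    """Source cell: has no input and provides a column of values as output."""
    # Hypothetical measurements, invented for illustration (not the real iris data).
    return [4.9, 5.1, 5.4, 5.8, 6.0, 6.3, 6.7, 7.1]


def histogram_cell(values, bins=3):
    """Processing cell: count how many values fall into each equal-width bin."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    if width == 0:
        # All values are identical, so everything lands in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value would index one past the end, so clamp it to the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# "Connecting" the cells: the output of the dataset is the input of the histogram.
# The bins argument plays the role of a parameter set in the function configuration.
counts = histogram_cell(dataset_cell(), bins=3)
for i, count in enumerate(counts):
    print(f"bin {i}: {'#' * count}")
```

In the designer you get the same effect by dragging a connection between two cells instead of nesting function calls.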
And here you go, that’s all you need to do. Drop a dataset and connect it to some statistical functions. Of course, there are many more functions, and some statistical or data analysis tasks require a more complex model. But I hope this short introduction gave a good overview of how dotplot designer works and why it’s worth trying. Have fun with it!