7  Draw

Visualization is a central part of any data analysis pipeline. Ideally, you visualize data before and after every operation. Depending on the kind and amount of data you are working with, this can range from straightforward to rather challenging, but it’s always worthwhile.
The new rtemis package (v.0.99+) provides the draw_* family of functions, which use plotly to create interactive visualizations.

Interactive graphics offer a flexible and dynamic way to communicate information, and work well for websites, web applications, and live demonstrations. rtemis uses the powerful plotly open source graphing library and other libraries built on top of it.
While viewing these graphs, try hovering over and clicking the graphic elements to interact with them, especially in the 3D plots.

7.1 Overview

You can print the available draw_* functions using the available_draw() function.

7.2 Density and Histograms

draw_dist(iris$Sepal.Length)

To plot multiple traces, you can either pass a list, or define groups by passing a factor to the group argument. By default, mode = "overlap", which draws traces in the same plot.

draw_dist(iris$Sepal.Length, group = iris$Species) 
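
To make the grouping idea concrete, here is a minimal base R sketch of what "one trace per group" amounts to: split the vector by a factor and estimate one density per level. This is an illustration only and is not taken from the rtemis source; the names by_species, dens, and peaks are made up for this example.

```r
# Base R sketch of one density trace per group (not the rtemis implementation)
by_species <- split(iris$Sepal.Length, iris$Species)  # named list, one vector per level
dens <- lapply(by_species, density)                   # kernel density estimate per group

# Each estimate holds the evaluation grid (x) and the density values (y);
# here we look up where each group's density peaks
peaks <- vapply(dens, function(d) d$x[which.max(d$y)], numeric(1))
peaks
```

Passing a named list such as by_species directly is the other route mentioned above for plotting multiple traces.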

When you pass a data.frame, non-numeric columns are automatically omitted. You can set mode = "ridge" to create a multiplot:

draw_dist(iris, mode = "ridge")

By default, “ridge” mode orders the subplots by variable mean. Set ridge_order_on_mean = FALSE when you want to maintain group ordering, for example when groups represent temporal information.

xl <- list(
  mango = rnorm(200, 7, 1),
  banana = rnorm(200, 10, .8),
  tangerine = rnorm(400, 0, 2),
  sugar = rnorm(500, 3, 1.5)
)
draw_dist(xl)
draw_dist(xl, mode = 'ridge', ridge_order_on_mean = FALSE)
draw_dist(xl, mode = 'ridge') # default is TRUE
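
As a rough illustration of what ordering by mean does, the base R sketch below reorders a small named list by the mean of each element. It uses fixed values instead of random draws so the result is reproducible. The direction of the sort (ascending here) is an assumption; rtemis may order the ridges differently.

```r
# Base R sketch (not rtemis internals): reorder a named list by element mean
xl_demo <- list(
  mango = c(6, 7, 8),
  banana = c(9, 10, 11),
  tangerine = c(-1, 0, 1),
  sugar = c(2, 3, 4)
)

# One mean per list element, keeping the element names
means <- vapply(xl_demo, mean, numeric(1))

# Names in ascending order of mean; the input order is otherwise ignored
order_by_mean <- names(sort(means))
xl_sorted <- xl_demo[order_by_mean]
names(xl_sorted)
```

Keeping the original list order, as with ridge_order_on_mean = FALSE, corresponds to skipping the reordering step.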

7.3 Scatter plots

set.seed(2025)
n <- 500
x <- rnorm(n)
y <- x^2 + rnorm(n, 2, 1)
z <- x^3 + rnorm(n, 3, 2)
draw_scatter(x, y)

Add a fit line using the fit argument, which accepts the name of any supervised learner available in rtemis:

draw_scatter(x, y, fit = "gam")

Add a confidence interval using the se_fit argument, which is only available for “GLM” and “GAM” fits:

draw_scatter(x, y, fit = "gam", se_fit = TRUE)
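
For readers curious about what a fit line with a standard error band involves, here is a hedged base R sketch. It swaps in a Gaussian GLM with a quadratic term for the GAM used above, and builds an approximate 95% pointwise band as fit ± 1.96 standard errors. How rtemis computes its band is not shown in this chapter, so treat the multiplier and the grid size as assumptions.

```r
set.seed(2025)
n <- 500
x <- rnorm(n)
y <- x^2 + rnorm(n, 2, 1)

# Gaussian GLM with a quadratic term, standing in for the "gam" fit used above
fit <- glm(y ~ poly(x, 2))

# Predict on an evenly spaced grid and request standard errors
grid <- data.frame(x = seq(min(x), max(x), length.out = 100))
pred <- predict(fit, newdata = grid, se.fit = TRUE)

# Approximate 95% pointwise band: fit +/- 1.96 * standard error (assumed multiplier)
band <- data.frame(
  x = grid$x,
  fit = pred$fit,
  lower = pred$fit - 1.96 * pred$se.fit,
  upper = pred$fit + 1.96 * pred$se.fit
)
head(band)
```

Learners that do not return standard errors for their predictions cannot produce such a band, which is consistent with se_fit being limited to GLM and GAM fits.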

Have fun with other learners:

draw_scatter(x, y, fit = "cart")

Lists (and therefore data.frames) are also supported:

draw_scatter(x, list(Square = y, Cube = z),
             fit = "gam", se_fit = TRUE)

7.3.1 Scatterplot + Cluster

We already saw that any learner can be used to draw a fit line in a scatter plot. Similarly, you can use any clustering algorithm to cluster the data and color points by cluster membership. Learn more about [Clustering].

draw_scatter(
  iris$Sepal.Width,
  iris$Petal.Width,
  cluster = "NeuralGas",
  fit = "gam",
  se_fit = TRUE
)
2025-06-03 10:05:29 👽Hello. [cluster]

  Input: 150 cases x 2 features.
2025-06-03 10:05:29 Clustering with NeuralGas... [cluster]
2025-06-03 10:05:29 Checking unsupervised data... [check_unsupervised_data]
2025-06-03 10:05:29 Completed in 1.6e-03 minutes (Real: 0.09; User: 0.08; System: 0.01). [cluster]
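
The idea of coloring points by cluster membership can be sketched in base R as follows. Note that this uses k-means from the stats package, not NeuralGas, and the choice of 3 clusters and the color names are assumptions made for the example.

```r
# Same two features as in the plot above
feat <- iris[, c("Sepal.Width", "Petal.Width")]

set.seed(2025)  # k-means starts from random centers
km <- kmeans(feat, centers = 3, nstart = 10)

# Map each cluster label (1, 2, 3) to a color, giving one color per point
palette3 <- c("steelblue", "darkorange", "forestgreen")
point_col <- palette3[km$cluster]

# Cluster sizes
table(km$cluster)
```

A vector like point_col is what a plotting function needs in order to color each point by its assigned cluster.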

7.4 3D Scatter plots

The function for 3D scatterplots is draw_3Dscatter.
You can:

  • Specify x, y, and z individually.
  • Pass a list or data.frame with at least 3 elements/columns.

If there are more than 3 columns, the first 3 will be used:

draw_3Dscatter(iris)
draw_3Dscatter(iris, fit = "gam")

group works as expected:

draw_3Dscatter(iris, group = iris$Species)

7.4.1 Glass-cut plots

We can plot fitted surfaces using the fit argument. The dependent variable is z, i.e. we fit a model of the type z ~ x + y.
Use the mouse to rotate the plot:

set.seed(2019)
x1 <- rnorm(500)
x2 <- rnorm(500)
y <- x1^2 + x2^3 + 3 + rnorm(500) * 3
draw_3Dscatter(x1, x2, y, fit = "gam")
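
A fitted surface of this kind can be sketched in base R by fitting the model, predicting over a regular grid of the two predictors, and reshaping the predictions into a matrix. The example below uses a plain linear GLM instead of a GAM, and the 25 x 25 grid resolution is an arbitrary choice; neither is taken from rtemis.

```r
set.seed(2019)
x1 <- rnorm(500)
x2 <- rnorm(500)
y <- x1^2 + x2^3 + 3 + rnorm(500) * 3
dat <- data.frame(x1 = x1, x2 = x2, y = y)

# Model of the type z ~ x + y, with y as the dependent variable
fit <- glm(y ~ x1 + x2, data = dat)

# Regular grid spanning the observed range of each predictor
g1 <- seq(min(x1), max(x1), length.out = 25)
g2 <- seq(min(x2), max(x2), length.out = 25)
grid <- expand.grid(x1 = g1, x2 = g2)  # x1 varies fastest

# Predicted surface: rows index g1, columns index g2
surface <- matrix(predict(fit, newdata = grid), nrow = 25, ncol = 25)
dim(surface)
```

The matrix surface, together with g1 and g2, is the usual input format for 3D surface plots.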

With groups:

draw_3Dscatter(iris, fit = "glm", group = iris$Species)
draw_3Dscatter(iris, fit = "gam", group = iris$Species)

7.5 Heatmaps

x <- rnormmat(20, 20, seed = 2018)
x_cor <- cor(x)
draw_heatmap(x_cor)
Loading required namespace: colorspace
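
To see what kind of matrix is being drawn here, the base R sketch below builds a comparable input. It assumes that rnormmat(20, 20) returns a 20 x 20 matrix of random normal values, and it will not reproduce the same numbers as the seeded rtemis call.

```r
set.seed(2018)

# 20 x 20 matrix of standard normal draws (assumed equivalent of rnormmat)
m <- matrix(rnorm(20 * 20), nrow = 20, ncol = 20)

# Pairwise Pearson correlations between the 20 columns
m_cor <- cor(m)

dim(m_cor)                        # one row and one column per variable
range(m_cor[upper.tri(m_cor)])    # off-diagonal correlations of pure noise
```

The result is symmetric with ones on the diagonal, so the heatmap mirrors itself across the diagonal.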

7.6 Barplots

draw_bar(VADeaths)

7.7 Boxplots

Some synthetic data:

set.seed(1999)
x <- list(mango = rnorm(200, 1, 1),
          banana = rpois(500, sample(c(0, 1, 2), 500, replace = TRUE)),
          tangerine = rbinom(500, 1, .3),
          sugar = rgamma(400, shape = 1))
draw_box(x)

7.8 Violin Plots

Violin plots extend boxplots by drawing a density estimate of the variable's distribution around the standard boxplot.

draw_box(x, type = "violin")
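
The two ingredients of a violin plot can be computed in base R as a rough sketch: the boxplot statistics and a kernel density estimate that is mirrored around a central axis. The scaling of the outline to a maximum half-width of 1 is an illustrative choice, not the rtemis or plotly default.

```r
set.seed(1999)
v <- rgamma(400, shape = 1)  # skewed sample, like "sugar" above

# Boxplot ingredients: lower whisker, Q1, median, Q3, upper whisker
box <- boxplot.stats(v)$stats

# Violin outline: kernel density estimate of the same values
dens <- density(v)

# Mirror the density around a central axis, scaled to a maximum half-width of 1
half_width <- dens$y / max(dens$y)
outline <- data.frame(value = dens$x, left = -half_width, right = half_width)
head(outline)
```

For a skewed variable like this one, the outline shows the long right tail that the box and whiskers only summarize.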

7.9 Pie Charts

Pie charts are best avoided, but if you need them, there’s draw_pie().

Some real population data:

x <- data.frame(
  Continent = factor(c("Asia", "Africa", "Europe",
                       "South America", "North America", "Oceania")),
  Population = c(4601371198, 1308064195, 747182751,
                 427199446, 366600964, 42128035)
)
draw_pie(x)
© 2025 E.D. Gennatas