“ggpubr” package in R for Data Visualization

ggpubr

We are going to use the “ggpubr” package for data visualization. It provides easy-to-use functions for creating and customizing “ggplot2”-based, publication-ready plots.

We install the “ggpubr” package as:

install.packages("ggpubr")

We load the “ggpubr” package as:

library(ggpubr)

We set the seed of the random number generator so that the random data we create can be reproduced.

set.seed(1234)
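To see what the seed does, the short base R sketch below draws the same three normal values twice. The names first_draw and second_draw are only illustrative. The last line resets the seed so that the data created next is unaffected by these extra draws.

```r
# Reset the seed before each draw so both draws start from the same state
set.seed(1234)
first_draw <- rnorm(3)

set.seed(1234)
second_draw <- rnorm(3)

# TRUE: the same seed reproduces the same "random" numbers
identical(first_draw, second_draw)

# Reset the seed again before continuing with the tutorial
set.seed(1234)
```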

We create a data frame containing the variables ‘sex’ and ‘weight’. We use the rnorm() function to generate random numbers from a normal distribution: the first 300 values have mean 45 and the next 300 have mean 49. The standard deviation is left at its default of 1.

wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 300)),
  weight = c(rnorm(300, 45), rnorm(300, 49)))

We check the first four observations of the data frame wdata:

head(wdata, 4)
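As a quick sanity check on the simulated data, one could also compare the group means with base R's tapply(). With 300 draws per group they should land close to the requested means of 45 and 49. This sketch rebuilds wdata so that it runs on its own; group_means is an illustrative name.

```r
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 300)),
  weight = c(rnorm(300, 45), rnorm(300, 49)))

# Mean weight for each level of sex
group_means <- tapply(wdata$weight, wdata$sex, mean)
group_means
```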

"ggpubr" package in R for Data Visualization 31

We create a density plot with the ggdensity() function.

The first argument specifies the dataset, and x names the variable to plot. add = "mean" adds a mean line to the plot. rug = TRUE adds a rug along the x-axis that marks the individual observations. color and fill set the outline and fill colors according to the value of sex, and palette supplies the colors used for the groups.

ggdensity(wdata, x = "weight",
          add = "mean", rug = TRUE,
          color = "sex", fill = "sex",
          palette = c("#00AFBB", "#E7B800"))

We plot a histogram with the same options using the gghistogram() function.

gghistogram(wdata, x = "weight",
            add = "mean", rug = TRUE,
            color = "sex", fill = "sex",
            palette = c("#00AFBB", "#E7B800"))

By default, the histogram is drawn with 30 bins.

gghistogram(wdata, x = "weight",
            add = "mean", rug = TRUE,
            color = "sex", fill = "sex", bins = 50,
            palette = c("#00AFBB", "#E7B800"))

Here we set bins = 50 to see how the histogram changes. The bars are now narrower, so the shape of each distribution is shown in finer detail.
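The effect of bins can be approximated by hand: the bin width is roughly the range of the data divided by the number of bins. The sketch below uses that approximation only; the exact bin placement chosen by ggplot2 may differ slightly. The helper approx_binwidth is a hypothetical name, not part of ggpubr.

```r
set.seed(1234)
weight <- c(rnorm(300, 45), rnorm(300, 49))

# Approximate bin width: data range divided by the number of bins
approx_binwidth <- function(x, bins) diff(range(x)) / bins

width_30 <- approx_binwidth(weight, 30)  # default number of bins
width_50 <- approx_binwidth(weight, 50)  # more bins, so narrower bars
c(bins_30 = width_30, bins_50 = width_50)
```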

"ggpubr" package in R for Data Visualization 34

Next, we work with the ToothGrowth dataset. We load it with:

data("ToothGrowth")

We check the description of the ToothGrowth dataset with:

?ToothGrowth

We store the dataset in a new object df:

df <- ToothGrowth

We look at the first four observations of the ToothGrowth dataset:

head(df, 4)

We create a box plot with the ggboxplot() function. The arguments we use are:

data – a data frame

x – character string containing the name of the x variable

y – character vector containing one or more variables to plot

color – outline color

palette – the color palette used for coloring or filling by groups

add – character vector for adding another plot element; here we add "jitter" to overlay the individual points

shape – the point shape; here it varies by dose

We want a box plot of len (tooth length) for each dose.

We can check the dose values with:

unique(df$dose)

p <- ggboxplot(df, x = "dose", y = "len",
               color = "dose", palette = c("#00AFBB", "#E7B800", "#FC4E07"),
               add = "jitter", shape = "dose")

p

We use the stat_compare_means() function to add mean comparison p-values to a ggplot, such as a box plot, dot plot, or strip chart.

The arguments of stat_compare_means() used here are:

comparisons – a list of length-2 vectors. The entries in each vector are either the names of two values on the x-axis or two integers giving the indices of the groups to be compared.

label.y – the y coordinate of the label. We set it to 50 to place the global p-value label at an absolute position.

my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
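When there are more groups, typing every pair by hand becomes tedious. One possible shortcut, sketched here with base R's combn(), is to generate all pairs of group names automatically. The names dose_levels and all_pairs are illustrative.

```r
data("ToothGrowth")

# Unique dose values as character strings (the x-axis group names)
dose_levels <- as.character(sort(unique(ToothGrowth$dose)))

# Every pair of dose levels, returned as a list of length-2 vectors
all_pairs <- combn(dose_levels, 2, simplify = FALSE)
all_pairs
```

The resulting list has the same structure as my_comparisons, although the pairs may come in a different order.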

"ggpubr" package in R for Data Visualization 39

p + stat_compare_means(comparisons = my_comparisons) + # Add pairwise comparison p-values
  stat_compare_means(label.y = 50)                     # Add the global p-value
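By default, stat_compare_means() uses the Wilcoxon test for pairwise comparisons and the Kruskal-Wallis test when more than two groups are compared globally. As a rough cross-check, the same tests can be run in base R. This is only a sketch: the labels printed on the plot may be rounded or formatted differently, and the object names are illustrative.

```r
data("ToothGrowth")
df <- ToothGrowth
df$dose <- factor(df$dose)

# Global comparison across the three doses (Kruskal-Wallis)
global_p <- kruskal.test(len ~ dose, data = df)$p.value

# One pairwise comparison (Wilcoxon rank-sum): dose 0.5 vs dose 1.
# exact = FALSE avoids the warning caused by tied values in the data.
pair <- droplevels(df[df$dose %in% c("0.5", "1"), ])
pair_p <- wilcox.test(len ~ dose, data = pair, exact = FALSE)$p.value

c(global = global_p, dose_0.5_vs_1 = pair_p)
```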

"ggpubr" package in R for Data Visualization 40

We create violin plots with box plots inside. The add.params argument passes parameters such as color, shape, size, or fill to the added element; here it fills the box plots with white.

ggviolin(df, x = "dose", y = "len", fill = "dose",
         palette = c("#00AFBB", "#E7B800", "#FC4E07"),
         add = "boxplot", add.params = list(fill = "white")) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") + # Add significance levels
  stat_compare_means(label.y = 50)                                       # Add the global p-value

We can also create dot plots and add a mean and standard deviation line to the plot.

ggdotplot(df, x = "dose", y = "len", color = "dose", fill = "dose",
          palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          add = "mean_sd", add.params = list(color = "black"))
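The add = "mean_sd" option summarizes each group by its mean and standard deviation. The same summary can be computed directly with base R's aggregate(), for example to check the values drawn on the plot. The name dose_summary is illustrative.

```r
data("ToothGrowth")

# Mean and standard deviation of tooth length for each dose
dose_summary <- aggregate(len ~ dose, data = ToothGrowth,
                          FUN = function(v) c(mean = mean(v), sd = sd(v)))
dose_summary
```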

"ggpubr" package in R for Data Visualization 42

We create a new data frame:

df3 <- data.frame(supp = rep(c("AB", "SK"), each = 3),
                  dose = rep(c("D0.5", "D1", "D2"), 2),
                  len = c(7.2, 12, 34, 5, 8, 34.2))

We print df3:

print(df3)

We create a bar plot with the bars filled by “supp” group. label = TRUE prints the value on each bar, lab.col sets the label color to white, and lab.pos = "in" places the labels inside the bars.

ggbarplot(df3, x = "dose", y = "len",
          fill = "supp", color = "supp", palette = c("#00AFBB", "#E7B800"),
          label = TRUE, lab.col = "white", lab.pos = "in")

We plot a line plot with multiple groups. Here we plot len against dose, with line type, point shape, and color grouped by supp.

ggline(df3, x = "dose", y = "len",
       linetype = "supp", shape = "supp",
       color = "supp", palette = c("#00AFBB", "#E7B800"))

We can create a pie chart with the ggpie() function.

We create a data frame df4:

df4 <- data.frame(
  group = c("Male", "Female", "Child"),
  value = c(22, 19, 45))

We check the dataset df4:

df4

We create a new variable labs that combines each group name with its value.

labs <- paste0(df4$group, " (", df4$value, "%)")
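The labels above append a percent sign to the raw values. If the values are counts rather than percentages, one option is to convert them to shares of the total first. A minimal sketch, where pct and pct_labs are illustrative names:

```r
df4 <- data.frame(
  group = c("Male", "Female", "Child"),
  value = c(22, 19, 45))

# Share of each group in the total, rounded to whole percent
pct <- round(100 * df4$value / sum(df4$value))

# Labels such as "Male (26%)"
pct_labs <- paste0(df4$group, " (", pct, "%)")
pct_labs
```

Passing pct_labs instead of labs to the label argument would then show shares that add up to about 100.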

"ggpubr" package in R for Data Visualization 47

ggpie(df4, x = "value", fill = "group", color = "white",
      palette = c("#00AFBB", "#E7B800", "#FC4E07"),
      label = labs, lab.pos = "in", lab.font = "white")

Next, we work with the “mtcars” dataset. We load it with:

data("mtcars")

We create a new object to store the mtcars dataset.

dfm <- mtcars

We convert the cyl variable to a factor.

dfm$cyl <- as.factor(dfm$cyl)

We add a new column, name, to store the names of the cars.

dfm$name <- rownames(dfm)

We check the first observations of the dfm dataset:

head(dfm[, c("wt", "mpg", "cyl")])

We create a scatter plot with concentration ellipses and point labels. ellipse = TRUE draws an ellipse around each group, mean.point = TRUE marks each group's mean, and repel = TRUE keeps the text labels from overlapping.

ggscatter(dfm, x = "wt", y = "mpg",
          color = "cyl", shape = "cyl",
          palette = c("#00AFBB", "#E7B800", "#FC4E07"),
          ellipse = TRUE, mean.point = TRUE,
          rug = TRUE, label = "name", font.label = 10, repel = TRUE)

We create a bar plot and sort the data in descending order with sort.val = "desc". We fill the bars by cyl value and set the bar borders to white. We set sort.by.groups = FALSE so that the data is not sorted within each group, and x.text.angle = 90 to rotate the x-axis text by 90°.

ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # Change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette; see ?ggpar
          sort.val = "desc",          # Sort the values in descending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 90           # Rotate x-axis text vertically
)

When we change sort.by.groups to TRUE, the data is sorted within each group. Here we also sort in ascending order.

ggbarplot(dfm, x = "name", y = "mpg",
          fill = "cyl",               # Change fill color by cyl
          color = "white",            # Set bar border colors to white
          palette = "jco",            # jco journal color palette; see ?ggpar
          sort.val = "asc",           # Sort the values in ascending order
          sort.by.groups = TRUE,      # Sort inside each group
          x.text.angle = 90           # Rotate x-axis text vertically
)
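The two sorting modes can be mimicked with base R's order(), which may help when checking which bar should come first. This is only an approximation of what the plots show (ties, for instance, may be arranged differently), and the names global_desc and grouped_asc are illustrative.

```r
data("mtcars")
dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Like sort.val = "desc" with sort.by.groups = FALSE:
# a single descending order over all cars
global_desc <- dfm[order(-dfm$mpg), "name"]

# Like sort.val = "asc" with sort.by.groups = TRUE:
# order by cyl group first, then by ascending mpg inside each group
grouped_asc <- dfm[order(dfm$cyl, dfm$mpg), c("name", "cyl", "mpg")]

head(global_desc, 3)
head(grouped_asc, 3)
```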

"ggpubr" package in R for Data Visualization 52

We create a dot chart with the following code:

ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "ascending",                        # Sort values in ascending order
           add = "segments",                             # Add segments from y = 0 to dots
           ggtheme = theme_pubr()                        # ggplot2 theme
)

The ggtheme = theme_pubr() argument applies the publication-ready theme from ggpubr to the plot.

We now create a more customized dot chart of mpg by car name from the mtcars dataset. You can see the arguments of the ggdotchart() function with:

?ggdotchart

ggdotchart(dfm, x = "name", y = "mpg",
           color = "cyl",                                # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"), # Custom color palette
           sorting = "descending",                       # Sort values in descending order
           add = "segments",                             # Add segments from y = 0 to dots
           rotate = TRUE,                                # Rotate vertically
           group = "cyl",                                # Order by groups
           dot.size = 6,                                 # Large dot size
           label = round(dfm$mpg),                       # Add mpg values as dot labels
           font.label = list(color = "white", size = 9,
                             vjust = 0.5),               # Adjust label parameters
           ggtheme = theme_pubr()                        # ggplot2 theme
)