Below are some basic commands for calculating descriptive statistics and generating the associated graphs. After that, I showcase the table1 package/function, which makes it easy to calculate summary statistics and automatically generate a table of them. Lastly, I link to some helpful data visualization resources and showcase the patchwork package, which lets you combine multiple graphs into a single display.
Packages Needed for Descriptive Statistics and Data Visualization

This code will check that required packages for this chapter are installed, install them if needed, and load them into your session.
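
The original code chunk is not reproduced here, so the following is only a minimal sketch of the check/install/load pattern described above. The helper name `ensure_packages` is hypothetical, not something taken from the chapter.

```r
# Hypothetical helper: install any packages that are missing, then load them all.
ensure_packages <- function(pkgs) {
  # requireNamespace() returns FALSE (instead of erroring) when a package is absent
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!installed]
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # character.only = TRUE lets library() accept package names stored as strings
  invisible(lapply(pkgs, library, character.only = TRUE))
}
```

Calling `ensure_packages(c("ggplot2", "table1", "patchwork"))` at the top of a script would cover the packages used in the sketches in this chapter. Only table1 and patchwork are named in the text; ggplot2 is an assumption.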
Interval or Continuous Variables

There are a variety of packages and commands that will return various descriptive statistics. Here are some options:
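
The specific options from the original page are not shown, so as one possible illustration, here is a base R sketch. It assumes the built-in `mtcars` data set, with `mpg` standing in for any interval or continuous variable.

```r
# mtcars ships with base R; mpg stands in for any interval/continuous variable
x <- mtcars$mpg

# Quick overview: minimum, quartiles, median, mean, and maximum
summary(x)

# Individual statistics, collected into one named vector
mpg_stats <- c(
  n      = length(x),
  mean   = mean(x),
  sd     = sd(x),
  median = median(x),
  iqr    = IQR(x)
)
round(mpg_stats, 2)
```

If a variable contains missing values, most of these functions need `na.rm = TRUE` to return something other than `NA`.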
You can also get descriptive statistics for interval variables broken out by group, that is, by the levels of a categorical variable.
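
A minimal base R sketch of grouped statistics, again assuming `mtcars`, with `cyl` (number of cylinders) playing the role of the grouping variable. `tapply()` is just one of several ways to do this.

```r
# Apply a function to mpg separately within each level of cyl
group_n     <- tapply(mtcars$mpg, mtcars$cyl, length)
group_means <- tapply(mtcars$mpg, mtcars$cyl, mean)
group_sds   <- tapply(mtcars$mpg, mtcars$cyl, sd)

# Collect the per-group results into one data frame, one row per group
by_cyl <- data.frame(
  cyl  = names(group_means),
  n    = as.vector(group_n),
  mean = as.vector(group_means),
  sd   = as.vector(group_sds)
)
by_cyl
```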
Histograms (and related density and area plots) and boxplots are all useful for visualizing continuous variables. All of these can be refined by adding/changing arguments.
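
The sketch below assumes the plots are built with ggplot2 (the original code is not shown) and uses `mtcars` as stand-in data. The arguments `bins`, `fill`, and `alpha` are examples of the kind of refinement mentioned above.

```r
library(ggplot2)

# Histogram: bins controls how many bars the range of mpg is cut into
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10, fill = "steelblue", colour = "white") +
  labs(title = "Histogram of mpg", x = "Miles per gallon", y = "Count")

# Density plot: a smoothed view of the same distribution
p_dens <- ggplot(mtcars, aes(x = mpg)) +
  geom_density(fill = "steelblue", alpha = 0.4) +
  labs(title = "Density of mpg", x = "Miles per gallon", y = "Density")

# Boxplot of mpg by cyl; factor() makes ggplot2 treat cyl as categorical
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(title = "mpg by cylinders", x = "Cylinders", y = "Miles per gallon")

# Printing a plot object displays it
p_hist
```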
Categorical Variables

For simple frequency counts:
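
As one possible illustration (the original command is not shown), base R's `table()` produces frequency counts. Here `mtcars$cyl` is assumed as the categorical variable.

```r
# One-way frequency table: how many cars have 4, 6, or 8 cylinders
cyl_counts <- table(mtcars$cyl)
cyl_counts

# useNA = "ifany" adds a count of missing values when there are any
table(mtcars$cyl, useNA = "ifany")

# Two-way table: cylinders (rows) by transmission type (columns)
table(mtcars$cyl, mtcars$am)
```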
To calculate proportions for a categorical variable, it is a two-step process:
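
The two steps are not spelled out in the text. A common base R approach, assumed here, is to build a frequency table first and then convert it to proportions with `prop.table()`.

```r
# Step 1: build the frequency table
cyl_counts <- table(mtcars$cyl)

# Step 2: divide each count by the total to get proportions
cyl_props <- prop.table(cyl_counts)
cyl_props

# Multiply by 100 for percentages
round(100 * cyl_props, 1)
```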
Bar charts are most often used to visualize categorical variables. You can have the bars reflect frequencies or percentages/proportions.
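
A sketch of both variants, assuming ggplot2 and the `mtcars` data. `geom_bar()` counts rows by default, and `after_stat()` (available in ggplot2 3.3.0 and later) rescales those counts to proportions.

```r
library(ggplot2)

# Bars show frequencies: geom_bar() counts the rows in each category
p_freq <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(x = "Cylinders", y = "Frequency")

# Bars show proportions: divide each computed count by the total count
p_prop <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  labs(x = "Cylinders", y = "Proportion")

p_freq
p_prop
```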
Generating a Summary Statistics Table

There are a variety of packages that have been created to facilitate the production of summary statistics tables. I’ll showcase table1 here. This site offers some helpful insights on how to make the most of the table1 package/function. Before attempting to generate a table, you will want to first reclassify your categorical variables as factor variables.
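
A minimal sketch of that workflow, assuming the `mtcars` data. The variable choices and label text are illustrative only and are not taken from the original chapter.

```r
library(table1)

dat <- mtcars

# Reclassify categorical variables as factors so table1 reports counts and percentages
dat$cyl <- factor(dat$cyl)
dat$am  <- factor(dat$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Optional display labels for the table rows
label(dat$mpg) <- "Miles per gallon"
label(dat$hp)  <- "Horsepower"
label(dat$cyl) <- "Cylinders"

# Variables left of | become rows; the factor right of | defines the columns
summary_tbl <- table1(~ mpg + hp + cyl | am, data = dat)
summary_tbl
```

Continuous variables are summarized with means, standard deviations, and medians by default, while factors are summarized with counts and percentages, which is why the factor conversion matters.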
Here’s an example of a summary statistics table generated by table1.

Patchwork for Combining Graphs

The patchwork package is also quite useful for displaying multiple graphs at once. Each graph is assigned to an object. They are then simply patched together using a few different options.
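
As a sketch of that workflow, the code below builds four ggplot2 graphs from `mtcars` (the data and plot choices are assumptions) and combines them with patchwork's `+` and `/` operators. `plot_annotation()` adds an overall title and panel tags.

```r
library(ggplot2)
library(patchwork)

# Each graph is assigned to its own object
p1 <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)
p2 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
p3 <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p4 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# + places plots next to each other; / stacks them vertically
side_by_side <- p1 + p2
stacked <- p1 / p2

# Three plots on the top row, one wide plot underneath, plus a title and A-D tags
combined <- (p1 + p2 + p3) / p4 +
  plot_annotation(title = "Descriptive graphs for mtcars", tag_levels = "A")
combined
```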
Here are a couple of examples of patchwork at work:

(p1 + p2 + p3) / p4

Check out Little Miss Data’s r-bloggers post on patchwork for more information and examples. This webpage is useful for adding titles, subtitles, captions, and tags. An example from that page:

What descriptive statistics are used for categorical variables?

Descriptive statistics used to analyse data for a single categorical variable include frequencies, percentages, fractions and/or relative frequencies (which are simply frequencies divided by the sample size) obtained from the variable's frequency distribution table.
How do you describe categorical data in statistics?

Categorical data is information that is divided into groups. For example, if an organisation or agency collects biodata on its employees, the fields that sort employees into groups (such as department or job title) are categorical data.
What is the best way to visualize the descriptive statistics of a categorical variable?

Bar charts are most often used to visualize categorical variables. You can have the bars reflect frequencies or percentages/proportions.
How do I show descriptive statistics in R?

The descr() function lets you display:
- only a selection of descriptive statistics of your choice, for example the mean and standard deviation with the stats = c("mean", "sd") argument
- the minimum, first quartile, median, third quartile and maximum with stats = "fivenum"
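
The text does not name the package. The sketch below assumes `descr()` is the function from the summarytools package and applies the two `stats` settings described above to a few numeric columns of `mtcars`.

```r
library(summarytools)

# A few numeric columns to summarize
vars <- mtcars[, c("mpg", "hp", "wt")]

# Only the mean and standard deviation
mean_sd <- descr(vars, stats = c("mean", "sd"))
mean_sd

# Minimum, first quartile, median, third quartile, and maximum
five_num <- descr(vars, stats = "fivenum")
five_num
```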