Material to support teaching in Environmental Science at The University of Western Australia

Units ENVT3361, ENVT4461, and ENVT5503

This guide gets you started with reading data into R (R Core Team, 2022) from a file, including checking that the data have been read in correctly. We will always be using R in the RStudio environment (Posit Software, 2022).

If you need or would like a more basic introduction to R, you could first read our
Guide to R and RStudio for absolute beginners.

Reading the data

We use the read.csv() function – we will mostly supply data to you as .csv files. Sometimes you need to download these into your working directory to use them, and sometimes you can read them directly from a web URL (e.g. https://github.com/.../afs19.csv).

With the type of dataset we usually use, there are columns containing categorical information, which R calls factors. These are typically stored as text or character information, i.e. character strings, or just strings. R identifies factors in a particular way so the categories are recognised, so we need to use the stringsAsFactors = TRUE argument in the read.csv() function.

The result of the read.csv() function is a data frame object stored in the R environment. We need a data frame, since it is a data structure in R which allows having columns of different classes (e.g. integer, numeric, factor, date, logical, etc.) in the same object (each column contains just one class of data, though).

Objects we create in an R session like data frames are only stored while we have our R session active, and disappear when we close R. Fortunately we can save our whole environment by clicking the 💾 icon on the Environment tab, or by running the save.image() function. Either of these saving methods will create a file with extension .RData which we can then load in later R sessions using the load() function, or clicking the 📁 icon in the Environment tab in RStudio.

git <- "https://github.com/Ratey-AtUWA/Learn-R-web/raw/main/"
hubbard <- read.csv(file = paste0(git,"hubbard.csv"), stringsAsFactors = TRUE)
# ... and do a quick check
is.data.frame(hubbard) # check that it worked; if so, the result should be TRUE

## [1] TRUE

These data are from the Hubbard Brook Experimental Forest near Woodstock in New Hampshire, USA (Figure 1).

Figure 1: Location of the Hubbard Brook Experimental Forest in New Hampshire, USA. The data used in this workshop were generated at this site.

First proper check - summarise some of the data

We can use the summary() function to make a quick check of our data, to make sure the file has read correctly (this may not happen if the file is improperly formatted, etc.).

summary(hubbard[,1:9]) # just the first 9 columns to save space!

##       PLOT       Rel.To.Brook    Transect   UTM_EASTING      UTM_NORTHING    
##  Min.   :  1.0   North:135    E278000:44   Min.   :276000   Min.   :4866300  
##  1st Qu.:126.8   South:125    E277000:43   1st Qu.:277000   1st Qu.:4867800  
##  Median :254.5                E280000:37   Median :279000   Median :4869000  
##  Mean   :249.2                E276000:35   Mean   :278996   Mean   :4868769  
##  3rd Qu.:392.2                E279000:34   3rd Qu.:281000   3rd Qu.:4869700  
##  Max.   :460.0                E282000:25   Max.   :283000   Max.   :4871100  
##                               (Other):42                                     
##        PH         MOISTURE.pct        OM.pct             Cd        
##  Min.   :2.510   Min.   : 0.510   Min.   : 2.240   Min.   :0.0980  
##  1st Qu.:4.168   1st Qu.: 3.068   1st Qu.: 9.188   1st Qu.:0.3070  
##  Median :4.350   Median : 4.131   Median :11.162   Median :0.6350  
##  Mean   :4.322   Mean   : 5.367   Mean   :12.009   Mean   :0.7426  
##  3rd Qu.:4.503   3rd Qu.: 6.324   3rd Qu.:13.582   3rd Qu.:1.1310  
##  Max.   :4.870   Max.   :26.262   Max.   :50.177   Max.   :1.8890  
##                                                    NA's   :167

The summary() function creates a little table for each column - note that these little tables do not all look the same. Integer or numeric columns get a numeric summary with minimum, mean etc., and sometime the number of missing (NA) values. Categorical (Factor) columns show the number of samples (rows) in each category (unless there are too many categories). These summaries are useful to check if there are zero or negative values in columns, how many missing observations there might be, and if the data have been read correctly into R.

Note: You would usually check the whole data frame, without restricting rows or columns, by running summary(hubbard). We could also:

summarise all variables, but only the first 10 rows by running summary(hubbard[1:10,]) (we also call rows 'observations' which often represent separate field samples);
summarise a defined range of both rows and columns, e.g. summary(hubbard[1:20,6:10]), which would summarise only the first 20 rows of columns 6 to 10.

Final checks of the data frame

Usually we would not restrict the output as done below with [,1:15]. We only do it here so we're not bored with pages of similar-looking output. You should look at structure for the whole data frame using str(hubbard) (or substitute hubbard for whatever your data object is called). The whole hubbard data frame has 62 variables (i.e. columns), not 15.

str(hubbard[,1:15]) # 'str' gives the structure of an object

## 'data.frame':    260 obs. of  15 variables:
##  $ PLOT        : int  1 2 3 5 6 7 8 9 10 11 ...
##  $ Rel.To.Brook: Factor w/ 2 levels "North","South": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Transect    : Factor w/ 8 levels "E276000","E277000",..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ UTM_EASTING : int  280000 280000 280000 280000 280000 280000 280000 280000 280000 280000 ...
##  $ UTM_NORTHING: int  4868400 4868500 4868700 4869000 4869100 4869300 4869400 4869600 4869700 4869900 ...
##  $ PH          : num  4.29 4.66 4.23 4.15 4.49 4.79 4.11 4.52 4.51 4.43 ...
##  $ MOISTURE.pct: num  4.74 7.47 5.55 3.77 4.82 ...
##  $ OM.pct      : num  12.2 10.46 14.88 9.14 12.01 ...
##  $ Cd          : num  0.498 0.207 1.359 0.913 0.099 ...
##  $ Cu          : num  22.1 27.1 18.6 7 17 ...
##  $ Ni          : num  12.25 20.5 14.43 8.72 10.62 ...
##  $ Cr          : num  21.1 28.2 19.6 21.3 22.8 ...
##  $ Co          : num  14.4 17 13.1 11.1 11.4 ...
##  $ Zn          : num  47.6 63.1 47.5 18.2 31.9 ...
##  $ Mn          : num  490 453 578 261 335 ...

We can see that some columns are integer (int) values (e.g. PLOT, UTM_EASTING), some columns contain Factor values i.e. in fixed categories (e.g. Rel.To.Brook, Transect), and some columns are numeric (num) (e.g. PH, OM.pct, Ni). Applying the str() function to a data object is always a good idea, to check that the data have read correctly into R. [NOTE that other variable types are possible such as character chr, date (Date or POSIXct), logical, etc.]

"Data is like garbage. You'd better know what you are going to do with it before you collect it."

— Mark Twain

The following section describes how to make graphs and plots in R without needing any additional packages. If you would like to try making plots in the widely-used package ggplot2, then go to this page.

Base R plotting: x-y plot using `plot()`

We can use either plot(x, y, ...) OR plot(y ~ x, ...)
In R the ~ symbol means 'as a function of', so ~ indicates a formula.

In R we need to tell the program which 'object' our variables are in. We've just made a Data Frame (a type of data object) called hubbard.

The following 3 styles of code do exactly the same thing:

Specifying the data frame using with() syntax -- (we recommend this one!)

with(hubbard,
     plot(EXCH_Al ~ PH)
)

...which can just be written on a single line:

with(hubbard, plot(EXCH_Al ~ PH))

Figure 2: Plot of exchangeable Al vs. pH using with() to specify the data frame.

Specifying the data frame using the dollar-sign operator

plot(hubbard$EXCH_Al ~ hubbard$PH) # look at axis titles!

Figure 3: Plot of exchangeable Al vs. pH using dollar-sign syntax to specify the data frame. Notice the axis titles!

Specifying the data frame using attach() and detach() (not recommended)

attach(hubbard)
plot(EXCH_Al ~ PH)
detach(hubbard)

Figure 4: Plot of exchangeable Al vs. pH using attach() to specify the data frame.

Without changing any of the (numerous) options or parameters in the plot() function, the plot is not very attractive (e.g. axis titles!).

We can also change the overall plot appearance by using the function par() before plotting; par() sets graphics parameters. Let's try some variations:

Setting some overall graphics parameters using `par()`

mar= sets margins in 'lines' units: c(bottom,left,top,right) e.g. c(4,4,3,1)
mgp= sets distance of text from axes: c(titles, tickLabels, ticks)
font.lab= sets font style for axis titles: 2=bold, 3=italic, etc.
- and within the plot() function itself, xlab= and ylab= set axis titles

par(mar=c(4,4,1,1), mgp=c(2,0.6,0), font.lab=2)
# We'll also include some better axis title text using xlab, ylab
with(hubbard,
     plot(EXCH_Al ~ PH, 
     xlab="Soil pH", 
     ylab="Exchangeable Al (centimoles/kg)")
     )

Figure 5: Plot of exchangeable Al vs. pH improved by changing graphics parameters and including custom axis titles.

This is starting to look a lot better!

We can still add more information to the graph; for example, by making use of the factors (categories) in the dataset. We also need to learn these graphics parameters:

col = plotting colour(s) - it's easiest to use words like "red", "darkblue" and so on
see this R colour chart
or just run the R function colors() for a list of all 657 names!
pch = plot character(s) - numbers from 0 to 24 (run help(points) or see this page from YaRrr).

par(mar=c(4,4,1,1), mgp=c(2,0.6,0), font.lab=2)
with(hubbard,
     plot(EXCH_Al ~ PH, xlab="Soil pH",
          ylab="Exchangeable Al (centimoles/kg)",
          pch=c(1,16)[Rel.To.Brook], 
          col=c("blue","darkred")[Rel.To.Brook])
     )

Figure 6: Plot of exchangeable Al vs. pH with custom graphics parameters and axis titles, improved by separating points by a Factor.

The parameter pch=c(1,16)[Rel.To.Brook] separates the points by the information in the Factor column Rel.To.Brook, shown inside [ ]. This column is a 2-level factor, so can be one of two categories (North or South), and so we need a vector with two numbers in it (pch=c(1,16)). The code for specifying colour is very similar, except our vector has 2 colour names in it.

There is still one thing missing; a graph legend. We can add one using the legend() function. We will use the following options:

"topleft" position of legend -- run help(legend) for options, or we can use x-y coordinates
legend = a vector of names identifying the plot symbols - we have used the categories in the factor 'Rel.To.Brook', levels(hubbard$Rel.To.Brook), but we could have used legend=c("North","South") instead
pch = plot symbols - should be exactly the same vector as in the plot function
col = plot colours - should be exactly the same vector as in the plot function
title = a title for the legend - optional

par(mar=c(4,4,1,1), mgp=c(2,0.6,0), font.lab=2)
with(hubbard,
       plot(EXCH_Al ~ PH, xlab="Soil pH", 
            ylab="Exchangeable Al (centimoles/kg)",
            pch=c(1,16)[Rel.To.Brook], 
            col=c("blue","darkred")[Rel.To.Brook])
       )
legend("topleft", legend=levels(hubbard$Rel.To.Brook), pch=c(1,16), 
       col=c("blue","darkred"), title="Location")

Figure 7: Plot of exchangeable Al vs. pH by sample location, with custom graphics parameters and axis titles, and a legend to explain the plot symbols.

Alternative to base-R plot: `scatterplot()` (from the `car` package)

The R package car (Companion to Applied Regression) has many useful additional functions that extend the capability of R. The next two examples produce nearly the same plot as in the previous examples, using the scatterplot() function in the car package.

# load required package(s)
library(car)
# par() used to set overall plot appearance using options within 
# par(), e.g.
#     mar sets plot margins, mgp sets distance of axis title and 
#     tick labels from axis
par(font.lab=2, mar=c(4,4,1,1), mgp=c(2.2,0.7,0.0))
# draw scatterplot with customised options 
# remember pch sets plot character (symbol); 
# we will also use the parameter cex which sets symbol sizes and
# 
scatterplot(EXCH_Al ~ PH, data=hubbard, smooth=FALSE, 
            legend = c(coords="topleft"), 
            cex=1.5, cex.lab=1.5, cex.axis=1.2)

Figure 8: Plot of exchangeable Al vs. pH made using the car::scatterplot() function.

Note that we get some additional graph features by default:

boxplots for each variable in the plot margins – these are useful for evaluating the distribution of our variables and any extreme values
a linear regression line showing the trend of the relationship (it's possible to add this in base R plots, too)
grid lines in the plot area (also available in base R)

We can turn all of these features off if we want - run help(scatterplot) in the RStudio Console, and look under Arguments and Details.

Also, we separately specify the dataset to be used as a function argument, i.e., data=hubbard.

Scatterplot (`car`) with groups, Hubbard Brook soil data

# 'require()' loads package(s) if they haven't already been loaded
require(car)
# adjust overall plot appearance using options within par()
# mar sets plot margins, mgp sets distance of axis title and tick 
#   labels from axis
par(font.lab=2, mar=c(4,4,1,1), mgp=c(2.2,0.7,0.0))
# create custom palette with nice colours :)
# this lets us specify colours using numbers - try it!
palette(c("black","red3","blue3","darkgreen","sienna"))
# draw scatterplot with points grouped by a factor (Rel.To.Brook) 
scatterplot(EXCH_Al ~ PH | Rel.To.Brook, data=hubbard, smooth=FALSE,
            legend = c(coords="topleft"), col=c(5,3,1), 
            pch=c(16,0,2), cex=1.2, cex.lab=1.3, cex.axis=1.0)

Figure 9: Plot of exchangeable Al vs. pH with points grouped by location, made using the car::scatterplot() function.

The scatterplot() function creates a legend automatically if we plot by factor groupings (note the different way that the legend position is specified within the scatterplot() function). This is pretty similar to the base R plot above (we can also customise the axis titles in scatterplot(), using xlab= and ylab= as before).

Other types of data presentation: Plot types and Tables

We'll give you some starting code chunks, and the output from them. You can then use the help in RStudio to try to customise the plots according to the suggestions below each plot!

Histograms

Histograms are an essential staple of statistical plots, because we always need to know something about the distribution of our variable(s). Histograms are a good visual way to assess the shape of a variable's distribution, whether it's symmetrical and bell-shaped (normal), left- or right-skewed, bimodal (two 'peaks') or even multimodal. As with any check of a distribution, the 'shape' will be clearer if we have more observations.

with(hubbard, hist(MOISTURE.pct))

Figure 10: Histogram of percent soil moisture content at Hubbard Brook.

For histograms, try the following:

add suitable axis labels (titles)
make bars a different colour (all the same)
change the number of cells (bars) on the histogram to give wider or narrower intervals
log₁₀-transform the x-axis (horizontal axis)
remove the title above the graph (this information would usually go in a caption)
...and so on.

Box plots

Box plots also give us some information about a variable's distribution, but one of their great strengths is in comparing values of a variable between different groups in our data.

Tukey box plots implemented by R have 5 key values; from least to greatest these are:

the lower whisker, which is the greatest of the minimum or the lower hinge − 1.5 × IQR^a
the lower hinge, which is the lower quartile (25^th percentile)
the median (i.e. the 50^th percentile)
the upper hinge, which is the upper quartile (75^th percentile)
the upper whisker, , which is the least of the maximum or the upper hinge + 1.5 × IQR

Any points less than the lower whisker, or greater than the upper whisker, are plotted separately and represent potential outliers.

^aIQR is the interquartile range between the upper and lower quartiles (25^th and 75^th percentiles)

boxplot(MOISTURE.pct ~ Rel.To.Brook, data=hubbard)

Figure 11: Box plot of percent soil moisture content at Hubbard Brook. Only default boxplot() arguments have been used (except for the annotations explaining the boxplot features).

For box plots, try the following:

include informative axis labels (titles)
plotting boxes separated by categories in a Factor
make boxes a different colour (all the same, and all different!)
add notches to boxes representing approximate 95% confidence intervals around the median
give the (vertical) y-axis a log₁₀ scale
...and so on.

Strip Charts

Strip charts, or one-dimensional scatter plots, can be a useful companion (or alternative) to box plots, especially when we don't have many observations of a variable.

stripchart(hubbard$OM.pct)

Figure 12: Soil organic matter content (%) at Hubbard Brook: as a one-dimensional scatterplot (stripchart()).

For stripcharts, try the following:

add suitable axis labels (titles), in bold font;
make symbols a different shape ± colour (all the same, and all different!);
make the strip chart vertical instead of horizontal;
apply some 'jitter' (noise) to the symbols so that overlapping points are easier to see;
log₁₀-transform the numerical axis, so that overlapping points are easier to see;
rotate the y-axis titles 90° clockwise (make sure the left margin is wide enough!)
...and so on.

"With data collection, 'the sooner the better' is always the best answer."

— Marissa Mayer

Tables

There are a few ways to make useful tables in R to summarise your data. Here are a couple of examples.

Using the `tapply()` function in base R

We can use the tapply() function to make very simple tables:

# use the cat() [conCATenate] function to make a Table heading 
#     (\n is a line break)
cat("One-way table of means\n")
with(hubbard, tapply(X = EXCH_Ni, INDEX=Transect, 
       FUN=mean, na.rm=TRUE))
cat("\nTwo-way table of means\n")
with(hubbard, tapply(X = EXCH_Ni, 
       INDEX=list(Transect,Rel.To.Brook), 
       FUN=mean, na.rm=TRUE))

## One-way table of means
##     E276000     E277000     E278000     E279000     E280000     E281000     E282000 
## 0.002588815 0.002136418 0.002827809 0.002813563 0.002528296 0.002262386 0.002221885 
##     E283000 
## 0.002957922 
## 
## Two-way table of means
##               North       South
## E276000 0.002652212 0.002559759
## E277000 0.002154745 0.002117219
## E278000 0.003133112 0.002360874
## E279000 0.002733461 0.002927993
## E280000 0.002176569 0.002861511
## E281000 0.002537732 0.001962009
## E282000 0.002219457 0.002224313
## E283000 0.003542597 0.002039146

For tapply() tables, try the following:

we have used the mean function (FUN=mean) – try another function to get minima, maxima, standard deviations, etc.
try copying the output to Word or Excel and using this to make a table in that software
...and so on.

Using the `numSummary()` function in the 'RcmdrMisc' R package

require(RcmdrMisc)
# use the cat() [conCATenate] function to make a Table heading 
#   (\n is a line break)
cat("Summary statistics for EXCH_Ni\n")
numSummary(hubbard$EXCH_Ni)
cat("\nSummary statistics for EXCH_Ni grouped by Rel.To.Brook\n")
numSummary(hubbard$EXCH_Ni, groups=hubbard$Rel.To.Brook)

## Summary statistics for EXCH_Ni
##         mean          sd         IQR         0%         25%         50%         75%
##  0.002536502 0.001382344 0.001126141 0.00070457 0.001829135 0.002246775 0.002955276
##        100%   n NA
##  0.01784169 257  3
## 
## Summary statistics for EXCH_Ni grouped by Rel.To.Brook
##              mean          sd         IQR          0%         25%         50%         75%
## North 0.002635924 0.001605190 0.001114533 0.001120626 0.001933342 0.002277016 0.003047875
## South 0.002431513 0.001096042 0.001170948 0.000704570 0.001719421 0.002174767 0.002890369
##              100% data:n data:NA
## North 0.017841689    132       3
## South 0.008483806    125       0

For numSummary() tables, try the following:

generating summary tables for more than one variable at a time
generating summary tables with fewer statistical parameters (e.g. omit IQR) or more statistical parameters (e.g. include skewness)
(use R Studio Help!)
try copying the output to Word or Excel and using this to make a table in that software
...and so on.

Tables using `print()` on a data frame

Data frames are themselves tables, and if they already contain the type of summary we need, we can just use the print() function to get output. Let's do something [slightly] fancy (see if you can figure out what is going on here¹):

output <- 
  numSummary(hubbard[,c("PH","MOISTURE.pct","OM.pct","Al.pct","Ca.pct","Fe.pct")],
             statistics = c("mean","sd","quantiles"), quantiles=c(0,0.5,1))
mytable <- t(cbind(output$table,output$NAs))
row.names(mytable) <- c("Mean","Std.dev.","Min.","Median","Max.","Missing")
# here's where we get the output
print(mytable, digits=3)
write.table(mytable,"clipboard",sep="\t")
cat("\nThe table has now been copied to the clipboard, so you can paste it into Excel!\n")

##             PH MOISTURE.pct OM.pct Al.pct Ca.pct Fe.pct
## Mean     4.322         5.37  12.01  1.641  0.208  3.163
## Std.dev. 0.293         3.83   5.17  1.478  0.214  1.112
## Min.     2.510         0.51   2.24  0.155  0.002  0.497
## Median   4.350         4.13  11.16  0.970  0.139  3.168
## Max.     4.870        26.26  50.18  7.132  1.107  7.709
## Missing  0.000         0.00   0.00 60.000 62.000 60.000
## 
## The table has now been copied to the clipboard, so you can paste it into Excel!

If you want to take this further, we can start making really nice Tables for reports with various R packages. I use the flextable package and sometimes the kable() function from the knitr package.

¹Hints: we have made two data frame objects, one from the output of numSummary(); t() is the transpose function; there are also some other useful functions which might be new to you like: row.names(), cbind(), print(), write.table() . . .

If you want to take this further, we can start making really nice Tables for reports with various R packages. I use the flextable package (Gohel & Skintzos, 2022) and sometimes the kable() function from the knitr package. Here's an example (Table 1) using flextable and the table object mytable made above:

library(flextable)
flextable(data.frame(Statistic=row.names(mytable),signif(mytable,3))) |>
  bold(bold=TRUE, part="header") |>
  set_caption(caption="Table 1: Table created by the `flextable` R package. Many more table formatting, text formatting, and number formatting options are available in this package.")

Table 1: Table created by the `flextable` R package. Many more table formatting, text formatting, and number formatting options are available in this package.
Statistic	PH	MOISTURE.pct	OM.pct	Al.pct	Ca.pct	Fe.pct
Mean	4.320	5.37	12.00	1.640	0.208	3.160
Std.dev.	0.293	3.83	5.17	1.480	0.214	1.110
Min.	2.510	0.51	2.24	0.155	0.002	0.497
Median	4.350	4.13	11.20	0.970	0.139	3.170
Max.	4.870	26.30	50.20	7.130	1.110	7.710
Missing	0.000	0.00	0.00	60.000	62.000	60.000

For more detail on producing high-quality tables for reports, see the relevant page on the R for Environmental Science website.

The following two excellent websites can extend your basic knowledge of using R and RStudio:

A great free resource for R beginners is An Introduction to R by Alex Douglas, Deon Roos, Francesca Mancini, Ana Couto & David Lusseau.
Getting used to R, RStudio, and R Markdown is an awesome (and free) eBook by Chester Ismay which is super-helpful if you want to start using R Markdown for reproducible coding and reporting.

References

Fox J (2022). RcmdrMisc: R Commander Miscellaneous Functions. R package version 2.7-2, https://CRAN.R-project.org/package=RcmdrMisc.

Fox J, Weisberg S (2019). An {R} Companion to Applied Regression, Third Edition. Thousand Oaks CA: Sage. https://socialsciences.mcmaster.ca/jfox/Books/Companion/ (car package).

Posit Software (2023) RStudio 2023.09.1 "Desert Sunflower" Release. https://posit.co/products/open-source/rstudio/.

R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

CC-BY-SA • All content by Ratey-AtUWA. My employer does not necessarily know about or endorse the content of this website.
Created with rmarkdown in RStudio. Currently using the free yeti theme from Bootswatch.

Material to support teaching in Environmental Science at The University of Western Australia

Units ENVT3361, ENVT4461, and ENVT5503

RStudio introduction

Reading files, checking data, and making plots

Andrew Rate

2025-06-19

Reading the data

First proper check - summarise some of the data

Final checks of the data frame

Base R plotting: x-y plot using `plot()`

Setting some overall graphics parameters using `par()`

Alternative to base-R plot: `scatterplot()` (from the `car` package)

Scatterplot (`car`) with groups, Hubbard Brook soil data

Other types of data presentation: Plot types and Tables

Histograms

Box plots

Strip Charts

Tables

Using the `tapply()` function in base R

Using the `numSummary()` function in the 'RcmdrMisc' R package

Tables using `print()` on a data frame

References

Material to support teaching in Environmental Science at The University of Western Australia

Units ENVT3361, ENVT4461, and ENVT5503

RStudio introduction

Reading files, checking data, and making plots

Andrew Rate

2025-06-19

Reading the data

First proper check - summarise some of the data

Final checks of the data frame

Base R plotting: x-y plot using plot()

Setting some overall graphics parameters using par()

Alternative to base-R plot: scatterplot() (from the car package)

Scatterplot (car) with groups, Hubbard Brook soil data

Other types of data presentation: Plot types and Tables

Histograms

Box plots

Strip Charts

Tables

Using the tapply() function in base R

Using the numSummary() function in the 'RcmdrMisc' R package

Tables using print() on a data frame

References

Base R plotting: x-y plot using `plot()`

Setting some overall graphics parameters using `par()`

Alternative to base-R plot: `scatterplot()` (from the `car` package)

Scatterplot (`car`) with groups, Hubbard Brook soil data

Using the `tapply()` function in base R

Using the `numSummary()` function in the 'RcmdrMisc' R package

Tables using `print()` on a data frame