There are many ways to display data. The fundamental idea is that the graphical depiction of data should communicate the truth the data has to offer about the situation of interest.
1 Quantitative Variable
Great for showing the distribution of data for a single quantitative variable when the sample size is large. Dotplots are a good alternative for smaller sample sizes. Gives a good feel for the mean and standard deviation of the data.
To make a histogram in R use the function:
hist(object)
object must be quantitative data. R refers to this as a “numeric vector.” Usually this will be the column of a dataset accessed with the $ sign by hist(dataSetName$columnName).
Type ?hist in your R Console to open the help file in R.
Example Code
Basic histogram
hist(airquality$Temp)
“hist” is the R function used to create a histogram.
The opening parenthesis begins the function call. It is written directly after the last letter of the function name.
“airquality” is a dataset. Type View(airquality) in R to see it.
The $ allows us to access any variable from the airquality dataset.
“Temp” is a quantitative variable (numeric vector) from the airquality dataset.
The closing parenthesis ends the hist function.
Change Color
hist(airquality$Temp, col="skyblue")
hist(airquality$Temp was explained in the first example code.
col="skyblue" specifies the color of the plot using a named color. The name of the color must be placed in quotation marks. Type colors() in R to see the color options.
Functions always end with a closing parenthesis.
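The named colors mentioned above can be explored directly in the console. The following is a small sketch, using base R only, of how one might check a color name before passing it to col=. The variable names (all_colors, is_valid, blues) are illustrative and not part of the original examples.

```r
# colors() returns every color name that base R recognizes
all_colors <- colors()

# Peek at the first few names and count how many there are
head(all_colors)
length(all_colors)

# Check that a name is valid before using it in col=
is_valid <- "skyblue" %in% all_colors
is_valid

# Find related shades by searching the names
blues <- grep("blue", all_colors, value = TRUE)
head(blues)
```

Any name that appears in all_colors can be used wherever the examples use "skyblue".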
Add Titles
hist(airquality$Temp, col="skyblue", xlab="Temperature", main="La Guardia Airport Daily Mean Temperatures")
hist(airquality$Temp was explained in the first example code.
Each comma separates an additional optional argument given to the function. The space after a comma is not required. It just looks nice.
col="skyblue" specifies the color of the plot using a named color. The name of the color must be placed in quotation marks. Type colors() in R to see the color options.
xlab="Temperature" sets the “x label,” the text printed on the plot under the x-axis. The desired text must always be in quotation marks.
main="La Guardia Airport Daily Mean Temperatures" sets the “main” title placed above the plot. The desired text must always be in quotation marks.
Functions must always end with a closing parenthesis.
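One way to see how a histogram gives a feel for the mean and standard deviation is to mark them on the plot. The sketch below assumes the built-in airquality data used in the examples above; the variable names and the choice of line styles are illustrative only.

```r
# Temperatures from the built-in airquality dataset
temps <- airquality$Temp

# Summary values the histogram is meant to convey
temp_mean <- mean(temps)
temp_sd <- sd(temps)

# Draw the histogram, then mark the mean (solid line) and
# one standard deviation on either side of it (dashed lines)
hist(temps, col = "skyblue", xlab = "Temperature",
     main = "Temperature with mean and SD marked")
abline(v = temp_mean, lwd = 2)
abline(v = temp_mean + c(-1, 1) * temp_sd, lty = 2)
```

The solid line should sit near the center of the bars, and the dashed lines give a rough sense of how far a typical day falls from the mean.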
To make a histogram in R using the ggplot approach, first ensure library(ggplot2) is loaded. Then,
ggplot(data, aes(x=column)) +
  geom_histogram()
data is the name of your dataset.
column is a column of data from your dataset that is quantitative.
aes(x= ) is how you tell the ggplot to make the x-axis become your column of data.
geom_histogram() causes the ggplot to become a histogram.
Example Code
Basic Histogram
ggplot(airquality, aes(x=Temp)) +
  geom_histogram()
“ggplot” is an R function used to create a framework for a graphic that will have elements added to it with the + sign.
“airquality” is a dataset. Type View(airquality) in R to see it.
The comma allows us to specify further options to the function. The space after the comma is not required. It just looks nice.
aes( ), the “aesthetics” function, allows you to tell the ggplot how it should appear. This includes things like what the x-axis or y-axis should become.
x=Temp declares which variable will become the x-axis of the graphic.
The addition symbol + is used to add further elements to the ggplot.
geom_histogram() causes the ggplot to become a histogram. There are many other “geom_” functions that could be used.
Change Bin Width and Color
ggplot(airquality, aes(x=Temp)) +
  geom_histogram(binwidth=5, fill="skyblue", color="black")
ggplot(airquality, aes(x=Temp)) + was explained in the Basic Histogram example.
geom_histogram( ) causes the ggplot to become a histogram. There are many other “geom_” functions that could be used.
binwidth=5 controls the width of the bars in the histogram.
fill="skyblue" controls the color of the insides of each bar.
color="black" controls the color of the edges of each bar.
Add Titles
ggplot(airquality, aes(x=Temp)) +
  geom_histogram(binwidth=5, fill="skyblue", color="black") +
  labs(title="La Guardia Airport Daily Mean Temperature", x="Temperature", y="Number of Days")
ggplot(airquality, aes(x=Temp)) + geom_histogram(binwidth=5, fill="skyblue", color="black") was explained in the previous two examples.
The addition symbol + is used to add further elements to the ggplot.
labs( ) is used to add labels to the plot, like a main title, x-label and y-label.
title= controls the main title at the top of the graphic.
x= controls the x-label of the graphic.
y= controls the y-label of the graphic.
To make a histogram in plotly first load
library(plotly)
Then, use the function:
plot_ly(dataName, x=~columnName, type="histogram")
dataName is the name of a data set.
columnName must be the name of a column of quantitative data. R refers to this as a “numeric vector.”
type="histogram" tells the plot_ly(…) function to create a histogram.
Visit plotly.com/r/histograms for more details.
Example Code
Basic histogram
plot_ly(airquality, x=~Temp, type="histogram")
“plot_ly” is an R function from library(plotly) used to create any plotly plot.
“airquality” is a dataset. Type View(airquality) in R to see it.
x= allows us to declare which column of the data set will become the x-axis of the histogram.
~Temp: “Temp” is a quantitative variable (numeric vector) from the airquality dataset. The ~ is required before column names inside all plot_ly(…) commands.
type="histogram" tells the plot_ly(…) function what “type” of graph to make. In this case, a histogram.
The closing parenthesis ends the plot_ly function.
Change Color
plot_ly(airquality, x=~Temp, type="histogram",
  marker=list(color="skyblue", line=list(color="darkgray", width=2)))
plot_ly(airquality, x=~Temp, type="histogram", was explained in the first example code.
marker=list( ) holds the options that affect the bars of the histogram.
color="skyblue" changes the color of the bars to skyblue.
line=list( ) opens a list of options for the “lines” around the “markers.”
color="darkgray" changes the color of the lines around the bars to darkgray.
width=2 changes the width of the lines around the bars to 2 pixels. To really see what this does, change it to something crazy like 10.
Three closing parentheses end, in order, the line list, the marker list, and the plot_ly function.
Add Titles
plot_ly(airquality, x=~Temp, type="histogram",
  marker=list(color="skyblue", line=list(color="darkgray", width=10))) %>%
  layout(title="La Guardia Airport Daily Mean Temperatures",
    xaxis=list(title="Temperature in Degrees F"))
plot_ly(airquality, x=~Temp, type="histogram", was explained in the first example code.
marker=list( ) holds the options that affect the bars of the histogram.
color="skyblue" changes the color of the bars to skyblue.
line=list( ) opens a list of options for the “lines” around the “markers.”
color="darkgray" changes the color of the lines around the bars to darkgray.
width=10 changes the width of the lines around the bars to 10 pixels, which is rather large really. Using width=2 is probably better.
%>% is the pipe operator. It passes the completed plot_ly(…) code into the layout(…) function.
layout( ) is used for specifying details about the axes and their labels.
title= declares a main title for the top of the graph.
xaxis=list( ) declares a list of options for the x-axis. The same can be done for the y-axis with yaxis=list( ).
title="Temperature in Degrees F" inside xaxis declares a title underneath the x-axis.
Every list( ) and function ends with its own closing parenthesis.
Histograms group data that are close to each other into “bins” (the vertical bars in the plot). The height of a bin is determined by the number of data points that are contained within the bin. For example, if we group together all the sections of the book of scripture known as the Doctrine and Covenants that occurred in a given year (Jan. 1st - Dec. 31st) then we get the following counts.
Year | Number of Sections |
---|---|
1823 | 1 |
1824 | 0 |
1825 | 0 |
1826 | 0 |
1827 | 0 |
1828 | 1 |
1829 | 16 |
1830 | 19 |
1831 | 37 |
1832 | 16 |
1833 | 12 |
1834 | 5 |
1835 | 3 |
1836 | 4 |
1837 | 1 |
1838 | 8 |
1839 | 3 |
1840 | 0 |
1841 | 3 |
1842 | 2 |
1843 | 4 |
1844 | 1 |
1845 | 0 |
1846 | 0 |
1847 | 1 |
*Note that Section 138 occurred in 1918 and is removed from this example.
In this example, each “bin” spans 365 days (Jan. 1 - Dec. 31 of each year). Since “dates” can be used as quantitative data, it makes sense to make a histogram of these data. (Remember, histograms are only for quantitative data.)
In a histogram of these counts, the left edge of each bin sits on the year the data corresponds with, and the right edge lands on the following year. For example, the first bin has its left edge on 1823 and its right edge on 1824. Since there was one revelation in 1823, this bin has a height of 1. The bin that has 1831 on the left and 1832 on the right shows that 37 revelations occurred in 1831. It is powerful to notice the number of revelations occurring around 1830, the year the Church of Jesus Christ of Latter-day Saints was organized.
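The binning described above can be sketched in base R. The counts are copied from the table; the variable names and plot labels are illustrative, and right = FALSE is one way (an assumption about how the original graphic was drawn) to make each bin run from its year up to the following year.

```r
# Counts copied from the table above (Section 138, from 1918, is excluded)
years <- 1823:1847
sections <- c(1, 0, 0, 0, 0, 1, 16, 19, 37, 16, 12, 5, 3,
              4, 1, 8, 3, 0, 3, 2, 4, 1, 0, 0, 1)

# Expand the counts into one year value per section
section_years <- rep(years, times = sections)

# One bin per year: left edge on the year, right edge on the following year.
# right = FALSE makes each bin include its left edge, so 1831 falls in [1831, 1832).
h <- hist(section_years, breaks = 1823:1848, right = FALSE,
          col = "skyblue", xlab = "Year",
          main = "Doctrine and Covenants Sections by Year")

# The bar heights reproduce the table
h$counts
```

Changing the breaks, for example to seq(1823, 1848, by = 5), groups several years into each bin and shows how the choice of bins changes the picture.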
1 Quantitative Variable | 2+ Groups
Graphical depiction of the five-number summary. Great for comparing the distributions of data across several groups or categories. Provides a quick visual understanding of the location of the median as well as the range of the data. Can be useful in showing outliers. The sample size should be at least five, or computing the five-number summary is not very meaningful. Side-by-side dotplots are a good alternative for smaller sample sizes.
To make a boxplot in R use the function:
boxplot(object)
To make side-by-side boxplots:
boxplot(object ~ group, data=NameOfYourData, ...)
object must be quantitative data. R refers to this as a “numeric vector.”
group must be qualitative data. R refers to this as either a “character vector” or a “factor.” However, a “numeric vector” can also act as a qualitative variable.
NameOfYourData is the name of the dataset containing object and group.
... implies there are many other options that can be given to the boxplot() function. Type ?boxplot in your R Console for more details.
Example Code
Basic Single Boxplot
boxplot(airquality$Temp)
“boxplot” is the R function used to create boxplots.
“airquality” is a dataset. Type View(airquality) in R to see it.
The $ allows us to access any variable from the airquality dataset.
“Temp” is a quantitative variable (numeric vector) from the airquality dataset.
The closing parenthesis ends the function.
More Useful… Basic Side-by-Side Boxplot
boxplot(Temp ~ Month, data=airquality)
“boxplot” is the R function used to create boxplots.
“Temp” is a quantitative variable (numeric vector) from the airquality dataset.
The ~ tells R that you want one boxplot of the quantitative variable (“Temp”) for each group found in the qualitative variable (“Month”).
“Month” is a qualitative variable (in this case a “numeric vector” defining months by 5, 6, 7, 8, and 9) from the airquality dataset.
The comma is required to start specifying additional options for the boxplot() function.
data=airquality tells R that the “Temp” and “Month” variables are located in the airquality dataset. Without this, R will not know where to find “Temp” and “Month” and the command will give an error.
Functions always end with a closing parenthesis.
Add Names under each Box
boxplot(Temp ~ Month, data=airquality, names=c("May","June","July","Aug","Sep"))
boxplot(Temp ~ Month, data=airquality was explained in the previous example code.
The comma is required to start specifying additional options for the boxplot() function.
names=c("May","June","July","Aug","Sep") tells R what labels to place on the x-axis below each boxplot.
Functions always end with a closing parenthesis.
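Because names= only relabels the boxes, the labels need to be listed in the same order the groups are drawn. A minimal sketch of how one might check that order first, using the plot=FALSE option of boxplot(); the variable names b and month_labels are illustrative.

```r
# plot = FALSE returns the boxplot summaries without drawing anything
b <- boxplot(Temp ~ Month, data = airquality, plot = FALSE)

# Group labels, in the left-to-right order the boxes are drawn
b$names

# Number of observations in each group
b$n

# Labels passed to names= must follow that same order
month_labels <- c("May", "June", "July", "Aug", "Sep")
boxplot(Temp ~ Month, data = airquality, names = month_labels)
```

Here the groups come out as 5, 6, 7, 8, 9, so "May" through "Sep" line up with the correct boxes.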
Add Color and Labels
boxplot(Temp ~ Month, data=airquality, xlab="Month of the Year", ylab="Temperature", main="La Guardia Airport Daily Temperatures", col="wheat")
boxplot(Temp ~ Month, data=airquality was explained in the previous example code.
A comma is used to separate each additional option given to a function.
xlab="Month of the Year" sets the “x label,” the text printed on the plot under the x-axis. The desired text must always be contained in quotes.
ylab="Temperature" sets the “y label,” the text printed on the plot next to the y-axis. The desired text must always be contained in quotes.
main="La Guardia Airport Daily Temperatures" sets the “main label” of the plot, which is placed at the top center of the plot. The desired text must always be contained in quotes.
col="wheat" sets the “color” of the plot. The color name “wheat” is an available color in R. Type colors() in the R Console to see more options. The color name must always be placed in quotes.
Functions always end with a closing parenthesis.
To make a boxplot in R using the ggplot approach, first ensure library(ggplot2) is loaded. Then,
ggplot(data, aes(x=groupsColumn, y=dataColumn)) +
  geom_boxplot()
data is the name of your dataset.
groupsColumn is a column of data from your dataset that is qualitative and represents the groups that should each have a boxplot.
dataColumn is a column of data from your dataset that is quantitative.
aes(x= , y= ) is how you tell the ggplot to make the x-axis have the values in your groupsColumn of data and the y-axis become your dataColumn. Note that if groupsColumn is not a factor, use factor(groupsColumn) instead.
geom_boxplot() causes the ggplot to become a boxplot.
Example Code
Basic Single Boxplot
ggplot(airquality, aes(y=Temp)) +
  geom_boxplot()
“ggplot” is an R function used to create a framework for a graphic that will have elements added to it with the + sign.
“airquality” is a dataset. Type View(airquality) in R to see it.
The comma allows us to specify further options to the function. The space after the comma is not required. It just looks nice.
aes( ), the “aesthetics” function, allows you to tell the ggplot how it should appear. This includes things like what the y-axis should become.
y=Temp declares which variable will become the y-axis of the graphic.
The addition symbol + is used to add further elements to the ggplot.
geom_boxplot() causes the ggplot to become a boxplot. There are many other “geom_” functions that could be used.
Side-by-side Boxplot and Color Change
ggplot(airquality, aes(x=factor(Month), y=Temp)) +
  geom_boxplot(fill="skyblue", color="black")
ggplot(airquality, aes( was explained in the Basic Single Boxplot example.
x=factor(Month) declares which variable will become the x-axis of the graphic. Since Month is “numeric” we must use “factor(Month)” instead of just “Month”.
y=Temp declares which variable will become the y-axis of the graphic.
The addition symbol + is used to add further elements to the ggplot.
geom_boxplot( ) causes the ggplot to become a boxplot. There are many other “geom_” functions that could be used.
fill="skyblue" controls the color of the insides of each box in the boxplot.
color="black" controls the color of the edges of each box.
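The need for factor(Month) can be seen by inspecting the column itself. The sketch below uses base R only; the variable names and the month labels are illustrative assumptions, not part of the original example.

```r
# Month is stored as a number in airquality
class(airquality$Month)

# factor() converts it to a grouping variable with one level per month
month_groups <- factor(airquality$Month)
levels(month_groups)

# Optional: attach readable labels to the levels at the same time
month_named <- factor(airquality$Month, levels = 5:9,
                      labels = c("May", "June", "July", "Aug", "Sep"))
table(month_named)
```

A factor built this way could be supplied in place of factor(Month) inside aes( ) so that the month names appear on the axis.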
ggplot(airquality, aes(x=factor(Month), y=Temp)) +
  geom_boxplot(fill="skyblue", color="black") +
  labs(title="La Guardia Airport Daily Mean Temperature", x="Month of the Year", y="Daily Mean Temperature")
ggplot(airquality, aes(x=factor(Month), y=Temp)) + geom_boxplot(fill="skyblue", color="black") was explained in the previous example. geom_boxplot( ) causes the ggplot to become a boxplot; fill= controls the color of the insides of each box and color= controls the color of the edges of each box.
The addition symbol + is used to add further elements to the ggplot.
labs( ) is used to add labels to the plot, like a main title, x-label and y-label.
title= controls the main title at the top of the graphic.
x= controls the x-label of the graphic.
y= controls the y-label of the graphic.
To make a boxplot in plotly first load
library(plotly)
Then, use the function:
plot_ly(dataName, y=~columnNameY, x=~columnNameX, type="box")
dataName is the name of a data set.
columnNameY must be the name of a column of quantitative data. R refers to this as a “numeric vector.” This will become the y-axis of the plot.
columnNameX must be the name of a column of qualitative data. This will provide the “groups” forming each individual box in the boxplot.
type="box" tells the plot_ly(…) function to create a boxplot.
Visit plotly.com/r/box-plots for more details.
Example Code
Basic Boxplot
plot_ly(airquality, y=~Temp, x=~as.factor(Month), type="box")
“plot_ly” is an R function from library(plotly) used to create any plotly plot.
“airquality” is a dataset. Type View(airquality) in R to see it.
y= allows us to declare which column of the data set will become the y-axis of the boxplot. In other words, the quantitative data we are interested in studying for each group.
~Temp: “Temp” is a quantitative variable (numeric vector) from the airquality dataset. The ~ is required before column names inside all plot_ly(…) commands.
x= allows us to declare which column of the data set will become the x-axis of the boxplot. In other words, the “groups” forming each separate box in the boxplot.
~as.factor(Month): since “Month” is stored as a quantitative variable (numeric vector) in the airquality dataset, we have to change it to a “factor,” which forces R to treat it as a qualitative (groups) variable. The ~ is required before column names inside all plot_ly(…) commands.
type="box" tells the plot_ly(…) function what “type” of graph to make. In this case, a boxplot.
The closing parenthesis ends the plot_ly function.
Change Color
plot_ly(airquality, y=~Temp, x=~as.factor(Month), type="box", fillcolor="skyblue",
  line=list(color="darkgray", width=3),
  marker=list(color="orange", line=list(color="red", width=1)))
plot_ly(airquality, y=~Temp, x=~as.factor(Month), type="box", was explained in the first example code.
fillcolor="skyblue" changes the fill color of the boxes in the boxplot to the color specified, in this case “skyblue.”
line=list(color="darkgray", width=3) holds the options that affect the edges of the boxes in the boxplot. We are changing their color to “darkgray” and their width to 3 pixels.
marker=list( ) holds the options that affect the outlying dots shown in the boxplots beyond the “fences” of each box.
color="orange" changes the color of the dots to orange.
line=list( ) opens a list of options for the “lines” around the “markers.”
color="red" changes the color of the lines around the outlier dots to red.
width=1 changes the width of the lines around the outlier dots to 1 pixel.
Three closing parentheses end, in order, the inner line list, the marker list, and the plot_ly function.
Add Titles
plot_ly(airquality, y=~Temp, x=~as.factor(Month), type="box", fillcolor="skyblue",
  line=list(color="darkgray", width=3),
  marker=list(color="orange", line=list(color="red", width=1))) %>%
  layout(title="La Guardia Airport Daily Mean Temperatures",
    xaxis=list(title="Month of the Year"),
    yaxis=list(title="Temperature in Degrees F"))
The plot_ly(…) code was explained in the above example code.
%>% is the pipe operator. It sends the completed plot_ly(…) code into the layout function.
layout( ) is used for specifying details about the axes and their labels.
title="La Guardia Airport Daily Mean Temperatures" declares a main title for the top of the graph.
xaxis=list( ) declares a list of options for the x-axis. Its title="Month of the Year" declares a title underneath the x-axis.
yaxis=list( ) declares a list of options for the y-axis. Its title="Temperature in Degrees F" declares a title beside the y-axis.
Every list( ) and function ends with its own closing parenthesis.
Understanding how a boxplot is created is the best way to understand what the boxplot shows.
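As a rough sketch of what goes into a boxplot, the five-number summary and the values that boxplot() draws can be computed directly in base R. The variable names are illustrative. Note that by default boxplot() ends its whiskers at the most extreme points within 1.5 times the box length of the box, so the whisker ends can differ from the minimum and maximum when there are outliers.

```r
temps <- airquality$Temp

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(temps)
five

# The values boxplot() actually draws (whisker ends, box edges, median line)
b <- boxplot(temps, plot = FALSE)
b$stats

# Points drawn individually beyond the whiskers, if any
b$out
```

Comparing five with b$stats shows which parts of the box come straight from the five-number summary.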
1 Quantitative Variable | 2+ Groups
Depicts the actual values of each data point. Best for small sample sizes or for datasets where there are lots of repeated values. Histograms or boxplots are better alternatives for large sample sizes when there are few repeated values. Great for comparing the distribution of data across several groups or categories.
To make a dot plot in Base R use the code:
stripchart(object)
For side-by-side dotplots:
stripchart(object ~ group, data=NameOfYourData)
object must be a quantitative (or ordinal) variable, what R refers to as a “numeric vector.”
group is a qualitative variable, which in R can be either a “character vector” or a “factor.”
NameOfYourData is the name of the dataset containing object and group.
Example Code
stripchart(airquality$Temp, method="stack")
“stripchart” is the R function used to create a dot plot.
“airquality” is a dataset. Type View(airquality) in R to see it.
The $ allows us to access any variable from the airquality dataset.
“Temp” is a quantitative variable (numeric vector) from the airquality dataset.
The comma is required to start specifying additional options for the function.
method="stack": method= allows us to choose from the options “overplot”, “jitter”, and “stack”. The “stack” option stacks multiple points that occur at the same location on top of each other. You can try the code yourself to see what “overplot” and “jitter” do.
The closing parenthesis ends the function.
stripchart(Temp ~ Month, data=airquality, method="stack")
“stripchart” is the R function used to create dot plots.
“Temp” is a quantitative variable (numeric vector) from the airquality dataset.
The ~ tells R that you want a dot plot of the quantitative variable (“Temp”) for each group found in the qualitative variable (“Month”).
“Month” is a qualitative variable (in this case a “numeric vector” defining months by 5, 6, 7, 8, and 9) from the airquality dataset.
data=airquality tells R that the “Temp” and “Month” variables are located in the airquality dataset. Without this, R will not know where to find “Temp” and “Month” and the command will give an error.
method="stack": method= allows us to choose from the options “overplot”, “jitter”, and “stack”. The “stack” option stacks multiple points that occur at the same location on top of each other.
A comma separates each additional option, and functions always end with a closing parenthesis.
stripchart(Temp ~ Month, data=airquality, method="stack", ylab="Month of the Year", xlab="Temperature", main="La Guardia Airport Daily Temperatures", col="sienna", pch=16)
stripchart(Temp ~ Month, data=airquality, method="stack" was explained in the example code directly above this one.
A comma is used to separate each additional option given to a function.
ylab="Month of the Year" sets the “y label,” the text printed on the plot next to the y-axis. The desired text must always be contained in quotes.
xlab="Temperature" sets the “x label,” the text printed on the plot below the x-axis. The desired text must always be contained in quotes.
main="La Guardia Airport Daily Temperatures" sets the “main label” of the plot, which is placed at the top center of the plot. The desired text must always be contained in quotes.
col="sienna" sets the “color” of the plot. The color name “sienna” is an available color in R. Type colors() in the R Console to see more options. The color name must always be placed in quotes.
pch=16 sets the “plotting character” of the plot. This plot uses the filled circle (option 16) as the plotting character. The options are 0, 1, 2, …, 25. Type ?pch in the R Console, and scroll about halfway down the help file to see what each option does.
Functions always end with a closing parenthesis.
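To compare the three method= options mentioned above without running the code three separate times, one possible sketch draws them one above the other. The layout via par(mfrow=...) and the variable names are illustrative choices, not part of the original examples.

```r
# Show the three method= options one above the other
old_par <- par(mfrow = c(3, 1))

methods <- c("overplot", "jitter", "stack")
for (m in methods) {
  stripchart(airquality$Temp, method = m, pch = 16,
             xlab = "Temperature", main = paste0('method = "', m, '"'))
}

# Restore the previous plotting layout
par(old_par)
```

With "overplot" repeated values sit on top of one another, with "jitter" they are nudged by a small random amount, and with "stack" they pile up so the repeats can be counted.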
To make a dot plot in R using the ggplot approach, first ensure:
library(ggplot2)
is loaded. Then,
ggplot(data, aes(x=groupsColumn, y=dataColumn) +
geom_dotplot()
data is the name of your dataset.
groupsColumn is a column of data from your dataset that is qualitative and represents the groups that should each have their own set of dots.
dataColumn is a column of data from your dataset that is quantitative.
aes(x= , y= ) is how you tell the ggplot to put the values of your groupsColumn on the x-axis and the values of your dataColumn on the y-axis. Note: if groupsColumn is not a factor, use factor(groupsColumn) instead.
geom_dotplot() causes the ggplot to become a dot plot.
Example Code
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. airquality “airquality” is a dataset. Type “View(airquality)”
in R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the y-axis should become. x=Temp “x=” declares which
variable will become the x-axis of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_dotplot() The
“geom_dotplot()” function causes the ggplot to become a dot plot. There
are many other “geom_” functions that could be used.
Press Enter to run the code.
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. airquality “airquality” is a dataset. Type “View(airquality)”
in R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x=factor(Month), “x=”
declares which variable will become the x-axis of the graphic. Use
factor(Month) to change “Month”, which is numeric, into
categories. y=Temp “y=” declares which variable will become the y-axis
of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_dotplot( The
“geom_dotplot()” function causes the ggplot to become a dot plot. There
are many other “geom_” functions that could be used. binaxis = “y”, This tells
the function that the y=Temp statement should be used as the
quantitative data. stackdir =
“up”, This causes the dots to be stacked on
top of each other. position =
“dodge”, This causes the dots to not
overlap, i.e., “dodge each other.” dotsize = 0.75, Controls
the size of the dots. You can make them larger with numbers greater than
1 and smaller with numbers less than 1. binwidth = 0.5 Controls how
the dots are grouped, similar to the bins in a histogram.
)
Closing parenthesis for the geom_dotplot
function.
Press Enter to run the code.
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. airquality “airquality” is a dataset. Type “View(airquality)”
in R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x=factor(Month), “x=”
declares which variable will become the x-axis of the graphic. Use
factor(Month) to change “Month”, which is numeric, into
categories. y=Temp “y=” declares which variable will become the y-axis
of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
coord_flip( ) The
“coord_flip()” function causes the ggplot to reverse the axes when
drawing the plot. However, all commands must be given as if the plot
were to be drawn without coord_flip(), then coord_flip() is
applied. + The addition symbol +
is used to add
further elements to the ggplot.
geom_dotplot( The
“geom_dotplot()” function causes the ggplot to become a dot plot. There
are many other “geom_” functions that could be used. binaxis = “y”, This tells
the function that the y=Temp statement should be used as the
quantitative data. stackdir =
“up”, This causes the dots to be stacked on
top of each other. position =
“dodge”, This causes the dots to not
overlap, i.e., “dodge each other.” dotsize = 0.75, Controls
the size of the dots. You can make them larger with numbers greater than
1 and smaller with numbers less than 1. binwidth = 0.5 Controls how
the dots are grouped, similar to the bins in a histogram.
)
Closing parenthesis for the geom_dotplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
labs( The “labs”
function is used to add labels to the plot, like a main title, x-label
and y-label. title=“La Guardia
Airport Daily Mean Temperature”, The
“title=” command allows you to control the main title at the top of the
graphic. x=“Month of the Year”,
The “x=” command allows you to control the
x-label of the graphic. y=“Daily
Mean Temperature” The “y=” command allows you
to control the y-label of the graphic. )
Closing parenthesis for the labs
function.
Press Enter to run the code.
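Assembled into one statement, the annotated pieces above read roughly as follows. This is a sketch that assumes the ggplot2 package is installed; the object name p is introduced here only so that the plot can be stored and then printed.

```r
library(ggplot2)

# Month is stored as a number, so factor(Month) turns it into groups.
p <- ggplot(airquality, aes(x = factor(Month), y = Temp)) +
  coord_flip() +                      # draw the months on the vertical axis
  geom_dotplot(binaxis = "y",         # bin the quantitative variable Temp
               stackdir = "up",
               position = "dodge",
               dotsize = 0.75,
               binwidth = 0.5) +
  labs(title = "La Guardia Airport Daily Mean Temperature",
       x = "Month of the Year",       # labels refer to the axes before flipping
       y = "Daily Mean Temperature")

print(p)
```

Removing the coord_flip() line from this sketch shows the same plot with the months along the horizontal axis.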
2 Quantitative Variables
Depicts the actual values of the data points, which are \((x,y)\) pairs. Works well for small or large sample sizes. Shows the correlation between the two variables clearly. Should be used in linear regression contexts whenever possible.
To make a scatterplot in R use the code:
plot(y ~ x, data=NameOfYourData)
y is the quantitative response variable, i.e., “numeric vector.”
x is the quantitative explanatory variable, i.e., “numeric vector.”
NameOfYourData is the name of the dataset containing y and x.
Note: plot(object) where object is a “numeric vector” will plot the values in the order they appear in the vector, which gives a time series plot when the data were recorded in time order. This is sometimes useful.
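The two ways of calling plot() can be compared side by side. This is a minimal sketch using the built-in airquality dataset; the object name temps is introduced here for illustration.

```r
# One numeric vector: values are plotted against their position (1, 2, 3, ...).
temps <- airquality$Temp
plot(temps)

# A formula with two numeric vectors: a scatterplot of y (Temp) against x (Wind).
plot(Temp ~ Wind, data = airquality)
```

The first call draws one point per day in the order the days were recorded, while the second ignores the order and pairs each temperature with that day's wind speed.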
Example Code
plot An R function
“plot” used to create a scatterplot, or in this case a time series plot
because only one quantitative variable is being supplied to the
function. ( Parenthesis to begin the function. Must touch the
last letter of the function.
airquality “airquality” is a dataset. Type
“View(airquality)” in R to see it.
$ The $ allows us to access any variable from
the airquality dataset. Temp “Temp” is a quantitative variable (numeric vector)
from the “airquality” dataset.
,
The “,” is required to start specifying
additional commands for the function. type=“l”
type= allows us to choose from the options
“p” for points, “l” for lines, and “b” for both. There are also other
options that could be chosen, type ?plot in the R Console to learn
about them. )
Closing parenthesis for the function.
Press Enter to run the code.
plot An R function
“plot” used to create a scatterplot. ( Parenthesis to begin the
function. Must touch the last letter of the function.
Temp “Temp” is
a quantitative variable (numeric vector) from the “airquality” dataset
that is being used as the response variable (y-axis) for this
plot. ~ The ~ is used to tell R that you want a scatterplot
with the quantitative variable “Temp” on the y-axis and the quantitative
variable “Wind” on the x-axis.
Wind “Wind” is a quantitative variable
(numeric vector) from the “airquality” dataset that is being used as the
explanatory variable (x-axis) for this plot. ,
The “,” is required to start specifying
additional commands for the function. data=airquality data= is
used to tell R that the “Temp” and “Wind” variables are located in the
airquality dataset. Without this,
R will not know where to find “Temp” and “Wind” and the command will
give an error. ,
The “,” is required to start specifying
additional commands for the function. pch=8 pch= stands for the
“plotting character” of the plot. This plot uses the star shape (option
8) as the plotting character. The options are 0, 1, 2, …, 25. Type ?pch
in the R Console, and scroll down the help file half way to see what
each option does. ) Functions always end with a closing
parenthesis.
Press Enter to run the code.
plot(Temp ~ Wind This part of the code was explained already in the
example code directly above this one. ,
The “,” is required to start specifying
additional commands for the function. data=airquality data= is
used to tell R that the “Temp” and “Wind” variables are located in the
airquality dataset. Without this,
R will not know where to find “Temp” and “Wind” and the command will
give an error. ,
The “,” is required to start specifying
additional commands for the function. xlab=“Daily Wind Speed (mph)” xlab= stands for “x label.” Use it to specify the
text to print on the plot below the x-axis. The desired text must always
be contained in quotes. , The comma is used to separate each additional
command to a function.
ylab=“Temperature” ylab= stands for “y
label.” Use it to specify the text to print on the plot next to the
y-axis. The desired text must always be contained in quotes.
, The comma
is used to separate each additional command to a function.
main=“La Guardia Airport (May - Sep)”
main= stands for the “main label” of the
plot, which is placed at the top center of the plot. The desired text
must always be contained in quotes. , The comma is used to
separate each additional command to a function. col=“ivory3” col= stands
for the “color” of the plot. The color name “ivory3” is an available
color in R. Type colors() in the R Console to see more options. The
color name must always be placed in quotes. , The comma is used to
separate each additional command to a function. pch=18 pch= stands for the
“plotting character” of the plot. This plot uses the filled diamond
(option 18) as the plotting character. The options are 0, 1, 2, …, 25.
Type ?pch in the R Console, and scroll down the help file half way to
see what each option does. ) Functions always end with a closing
parenthesis.
Press Enter to run the code.
pch Options
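The available plotting characters can also be drawn directly in R. The following is a minimal sketch in base R that lays the options 0 through 25 out on a small grid; the object names pch_values, x_pos, and y_pos and the grid layout are choices made here for illustration.

```r
# Draw one example of each plotting character, 0 through 25, labeled by number.
pch_values <- 0:25
x_pos <- pch_values %% 9          # column position on the grid
y_pos <- 3 - pch_values %/% 9     # row position, top row first

plot(x_pos, y_pos, pch = pch_values, cex = 2,
     col = "black", bg = "gray80", # bg= fills options 21 through 25
     xlim = c(-0.5, 8.5), ylim = c(0.5, 3.5),
     xlab = "", ylab = "", xaxt = "n", yaxt = "n",
     main = "pch Options")

# Print each option number just below its symbol.
text(x_pos, y_pos - 0.3, labels = pch_values)
```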
To make a scatterplot in R using the ggplot approach, first ensure:
library(ggplot2)
is loaded. Then,
ggplot(data, aes(x=dataColumn1, y=dataColumn2)) +
geom_point()
data is the name of your dataset.
dataColumn1 is a column of data from your dataset that is quantitative and will be used as the explanatory variable.
dataColumn2 is a column of data from your dataset that is quantitative and will be used as the response variable.
aes(x= , y= ) is how you tell the ggplot to put the values of your dataColumn1 on the x-axis and the values of your dataColumn2 on the y-axis.
geom_point() causes the ggplot to become a scatterplot.
Example Code
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. airquality “airquality” is a dataset. Type “View(airquality)”
in R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x=Wind, “x=” declares
which variable will become the x-axis of the graphic, the explanatory
variable. y=Temp “y=” declares which variable will become the y-axis
of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_point( The
“geom_point()” function causes the ggplot to become a scatterplot. There
are many other “geom_” functions that could be used. )
Closing parenthesis for the geom_point
function.
Press Enter to run the code.
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. airquality “airquality” is a dataset. Type “View(airquality)”
in R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x=Wind, “x=” declares
which variable will become the x-axis of the graphic, the explanatory
variable. y=Temp “y=” declares which variable will become the y-axis
of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_point( The
“geom_point()” function causes the ggplot to become a scatterplot. There
are many other “geom_” functions that could be used. color = “ivory3”, Controls
the color of the dots. pch = 18
Controls the type of plotting character to be
used in the plot. )
Closing parenthesis for the geom_point
function. + The addition symbol +
is used to add
further elements to the ggplot.
labs( The “labs”
function is used to add labels to the plot, like a main title, x-label
and y-label. title=“La Guardia
Airport (May - Sep)”, The “title=” command
allows you to control the main title at the top of the graphic.
x=“Daily Average Wind Speed (mph)”,
The “x=” command allows you to control the
x-label of the graphic. y=“Daily
Mean Temperature” The “y=” command allows you
to control the y-label of the graphic. )
Closing parenthesis for the labs
function. + The addition symbol +
is used to add
further elements to the ggplot.
theme_bw()
Changes the “theme” or look of the plot to
“black” and “white”.
Press Enter to run the code.
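Written out as one statement, the annotated scatterplot above looks roughly like this. It is a sketch that assumes the ggplot2 package is installed; the object name p is introduced here so the plot can be stored and printed.

```r
library(ggplot2)

p <- ggplot(airquality, aes(x = Wind, y = Temp)) +
  geom_point(color = "ivory3", pch = 18) +   # filled diamonds in ivory3
  labs(title = "La Guardia Airport (May - Sep)",
       x = "Daily Average Wind Speed (mph)",
       y = "Daily Mean Temperature") +
  theme_bw()                                 # black-and-white theme

print(p)
```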
To make a scatterplot in R using the plotly approach, first ensure:
library(plotly)
is loaded. Then,
plot_ly(data, x= ~dataColumn1, y= ~dataColumn2)
data is the name of your dataset.
dataColumn1 is a column of data from your dataset that is quantitative and will be used as the explanatory variable.
dataColumn2 is a column of data from your dataset that is quantitative and will be used as the response variable.
Example Code
plot_ly(airquality, x= ~Wind, y= ~Temp)
# KidsFeet is not a built-in dataset; it comes from the mosaicData package.
library(mosaicData)
plot_ly(KidsFeet,
        x= ~length,
        y= ~width,
        color= ~sex,
        size= ~birthmonth,
        text= ~paste("Name:", name, "\n", "Birth-Month:", birthmonth),
        colors=c("skyblue","hotpink")) %>%
  layout(title="KidsFeet dataset",
         xaxis=list(title="Length of the longer foot in cm"),
         yaxis=list(title="Width of the longer foot in cm"))
1 (or 2) Qualitative Variable(s)
Depicts the number of occurrences for each category, or level, of the qualitative variable. Similar to a histogram, but there is no natural way to order the bars, hence the white space between each bar. It is called a Pareto chart if the bars are ordered from tallest to shortest. Clustered and stacked bar charts are often used to display information for two qualitative variables simultaneously.
To make a bar chart in R use the code:
barplot(heights)
heights must be a “numeric vector” that contains the heights for each bar that will be drawn in the plot.
Note: both the c() and table() functions can be used to specify the heights. The example codes below demonstrate.
Example Code
Using the c() function.
barplot barplot is
an R function used to create a bar chart. ( Parenthesis to begin the
barplot function. Must touch the last letter of the function.
c c is an R
function used to concatenate a list of values together into a “vector.”
It is being used here to specify the heights of the 4 bars in the bar
plot. ( Parenthesis to begin the c function. Must touch the
last letter of the function.
10,5,28,3 This list of numbers will be joined
together into a single “vector.” There is no limit on the number of
entries that can be put into such a list. )
Closing parenthesis for the c()
function. ,
The “,” is required to start specifying
additional commands for the barplot function. col=“gray24” col= stands
for the “color” of the plot. The color name “gray24” is an available
color in R. Type colors() in the R Console to see more options. The
color name must always be placed in quotes. )
Closing parenthesis for the barplot
function.
Press Enter to run the code.
barplot barplot is
an R function used to create a bar chart. ( Parenthesis to begin the
barplot function. Must touch the last letter of the function.
c c is an R
function used to concatenate a list of values together into a “vector.”
It is being used here to specify the heights of the 4 bars in the bar
plot. ( Parenthesis to begin the c function. Must touch the
last letter of the function.
Pigs=10,Cats=5,Dogs=28,Roosters=3 This named
list of numbers will be joined together into a single “vector.” There is
no limit on the number of entries that can be put into such a list.
Notice how the names show up as the labels for each bin in the bar
chart. )
Closing parenthesis for the c()
function. ,
The “,” is required to start specifying
additional commands for the barplot function. col=“gray44” col= stands
for the “color” of the plot. The color name “gray44” is an available
color in R. Type colors() in the R Console to see more options. The
color name must always be placed in quotes. )
Closing parenthesis for the barplot
function.
Press Enter to run the code.
barplot( barplot
is an R function used to create a bar chart. rbind( rbind stands for
“row bind” and is a function that joins together different c() vectors
to make them become rows of a table. `Farm 1`=c(Pigs=10,Cats=5,Dogs=28,Roosters=3) Notice how this c() vector of named values is being
named “Farm 1.” The tick marks ` ` are required to specify a name of a
vector that has a space in it. If the name was just Farm1 (without a
space) then the tick marks would not be needed. Since `Farm 1` is the
first vector in the rbind() function, it will become the first row of
the resulting table that rbind() will create. ,
The “,” is required to specify additional c()
vectors for the rbind() function.
`Farm 2`=c(Pigs=15,Cats=3,Dogs=8,Roosters=1) Notice how this c() vector of named values is being
named “Farm 2.” It will become the second row of the table created by
rbind(). )
Closing parenthesis for the rbind()
function. ,
The “,” is required to specify additional
commands for the barplot function.
col=c(“gray84”,“gray44”) col= stands for the
“color” of the plot. Here two colors: “gray84” and “gray44” are being
passed to the col= option by using the c() function. Notice how these
two colors are used in the resulting bar chart. ,
The “,” is required to specify additional
commands for the barplot function.
beside=TRUE beside= can be set to either TRUE
or FALSE. When it is TRUE, the bars are clustered side-by-side. When it
is set to FALSE, the bars are stacked on top of each other. Typically,
beside=TRUE is preferred. ,
The “,” is required to specify additional
commands for the barplot function.
legend.text=TRUE legend.text=TRUE allows for
the legend to be placed on the barplot. )
Closing parenthesis for the barplot
function.
Press Enter to run the code.
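It can help to look at the table that rbind() builds before handing it to barplot(). The following is a minimal sketch of the example above; the object name farms is introduced here for illustration.

```r
# rbind() turns the two named vectors into a 2 x 4 table (one row per farm).
farms <- rbind(`Farm 1` = c(Pigs = 10, Cats = 5, Dogs = 28, Roosters = 3),
               `Farm 2` = c(Pigs = 15, Cats = 3, Dogs = 8, Roosters = 1))
print(farms)

barplot(farms,
        col = c("gray84", "gray44"),
        beside = TRUE,        # cluster the bars; FALSE would stack them
        legend.text = TRUE)   # use the row names as the legend entries
```

Each column of the table becomes one cluster of bars, and each row becomes one color within the cluster.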
Using the table() function.
barplot barplot is
an R function used to create a bar chart. ( Parenthesis to begin the
function. Must touch the last letter of the function.
table table is
an R function used to tabulate how many times each value occurs in a
given dataset. It is being used here to specify the heights of the bars
in the bar chart. ( Parenthesis to begin the function. Must touch the
last letter of the function.
mtcars “mtcars” is a dataset. Type
“View(mtcars)” in R to see it. $
The $ allows us to access any variable from
the mtcars dataset. cyl “cyl” is a qualitative variable (in this case
actually a numeric vector acting as a qualitative variable) from the
“mtcars” dataset. It represents the number of cylinders the vehicle’s
engine has. )
Closing parenthesis for the table()
function. ,
The “,” is required to start specifying
additional commands for the barplot function. col=“cornsilk” col= stands
for the “color” of the plot. The color name “cornsilk” is an available
color in R. Type colors() in the R Console to see more options. The
color name must always be placed in quotes. )
Closing parenthesis for the barplot
function.
Press Enter to run the code.
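The Pareto chart mentioned in the introduction to this section can be made by sorting the table before plotting it. This is a minimal sketch using the base R sort() function; the object names cyl_counts and cyl_sorted are introduced here for illustration.

```r
# Count how many cars have each number of cylinders.
cyl_counts <- table(mtcars$cyl)

# Order the counts from largest to smallest so the tallest bar comes first.
cyl_sorted <- sort(cyl_counts, decreasing = TRUE)

barplot(cyl_sorted, col = "cornsilk", xlab = "Cylinders")
```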
barplot( barplot
is an R function used to create a bar chart. table( table is an R
function used to tabulate how many times each pair of values occurs in a
given dataset. It is being used here to specify the heights of the bars
in this clustered bar chart.
mtcars$am “mtcars” is a dataset and the $
sign is being used to access the “am” variable from that dataset. Note
that “am” is being used as a qualitative variable, but is actually a
numeric vector acting as a qualitative variable. It denotes whether the
vehicle is an automatic (0) or manual (1) transmission.
,
The “,” is required to specify additional
variables for the table() function. mtcars$cyl “mtcars” is a
dataset and the $ sign is being used to access the “cyl” variable from
that dataset. The “cyl” variable gives the cylinders of the vehicle’s
engine as either 4, 6, or 8. So even though it is numeric, it can be
used as a qualitative variable.
)
Closing parenthesis for the table()
function. ,
The “,” is required to start specifying
additional commands for the barplot function. beside=TRUE
beside= is an optional command to the
barplot() function. When TRUE, the bars are placed next to each other.
When FALSE, the bars are stacked on top of each other.
,
The “,” is required to specify additional
commands for the barplot function.
col=c(“firebrick”,“snow1”) col= stands for
the “color” of the plot. The colors of “firebrick” and “snow1” are being
passed to the col= option using the c() function. ,
The “,” is required to specify additional
commands for the barplot function.
legend.text=TRUE legend.text=TRUE allows for
the legend to be placed on the barplot. ,
The “,” is required to specify additional
commands for the barplot function.
xlab=“Cylinders” xlab= stands for “x label.”
Use it to specify the text to print on the plot below the x-axis. The
desired text must always be contained in quotes. )
Closing parenthesis for the barplot
function.
Press Enter to run the code.
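Printing the two-way table first shows exactly which heights barplot() will draw. The following is a minimal sketch of the clustered example above; the object name counts is introduced here for illustration.

```r
# Rows are transmission types (0 = automatic, 1 = manual); columns are cylinders.
counts <- table(mtcars$am, mtcars$cyl)
print(counts)

barplot(counts,
        beside = TRUE,                    # one bar per transmission type within each cluster
        col = c("firebrick", "snow1"),
        legend.text = TRUE,
        xlab = "Cylinders")
```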
To make a bar chart in R using the ggplot approach, first ensure:
library(ggplot2)
is loaded. Then,
ggplot(data, aes(x=groupsColumn, y=countsColumn)) +
geom_bar(stat="identity")
data is the name of your dataset.
groupsColumn is a column of data from your dataset that is qualitative and represents the groups that should each have a bar in the bar chart.
countsColumn is a column of data from your dataset that contains the counts of how many times each group has been observed.
aes(x= , y= ) is how you tell the ggplot to put the values of your groupsColumn on the x-axis and the values of your countsColumn on the y-axis. Note: if groupsColumn is not a factor, use factor(groupsColumn) instead.
geom_bar(stat="identity") causes the ggplot to become a bar chart whose bar heights are the values in countsColumn. Without stat="identity", geom_bar() counts the rows in each group itself and should be given only the x= column.
Example Code
Manually building the counts data.
FarmAnimals <- data.frame(animal =
c("pigs","cats","dogs","Roosters"), count = c(10,5,28,3))
This code creates a data set manually called FarmAnimals using the data.frame() function. Notice that there are two
columns in this dataset: “animal” and “count”.
ggplot An R function
“ggplot” used to create a framework for a graphic that will have
elements added to it with the +
sign. ( Parenthesis to begin the
function. Must touch the last letter of the function.
FarmAnimals “FarmAnimals” is a dataset we just created. Type
“View(FarmAnimals)” in R after running the above code to see it.
, The comma
allows us to specify optional commands to the function. The space after
the comma is not required. It just looks nice. aes( The aes
or “aesthetics” function allows you to tell the ggplot how it should
appear. This includes things like what the x-axis or y-axis should
become. animal, Declares which variable will become the x-axis of
the graphic, the explanatory variable. count, Declares which
variable will become the y-axis of the graphic. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_col( ) The
“geom_col()” function is being used here instead of “geom_bar()” because
the counts are already stored in the count column, and geom_col() draws each bar at the height given by the y variable.
Press Enter to run the code.
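The relationship between geom_col() and geom_bar() can be seen by building the same chart both ways. This is a sketch that assumes the ggplot2 package is installed; the object names p_col and p_bar are introduced here for illustration.

```r
library(ggplot2)

FarmAnimals <- data.frame(animal = c("pigs", "cats", "dogs", "Roosters"),
                          count = c(10, 5, 28, 3))

# geom_col() uses the y values as the bar heights.
p_col <- ggplot(FarmAnimals, aes(x = animal, y = count)) +
  geom_col()

# geom_bar() counts rows by default, so stat = "identity" is needed
# to make it use the count column instead.
p_bar <- ggplot(FarmAnimals, aes(x = animal, y = count)) +
  geom_bar(stat = "identity")

print(p_col)
print(p_bar)
```

Both plots should show the same four bars, which is why either form can be used when the counts are already in the dataset.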
FarmAnimals <- data.frame(animal =
c("pigs","pigs","cats","cats","dogs","dogs","Roosters","Roosters"),
count = c(6,4,2,3,18,10,2,1), farm =
c("farm1","farm2","farm1","farm2","farm1","farm2","farm1","farm2"))
This code creates a data set manually called
FarmAnimals using the data.frame() function. Notice that there are three
columns in this dataset: “animal”, “count”, and “farm”.
ggplot An R function
“ggplot” used to create a framework for a graphic that will have
elements added to it with the +
sign. ( Parenthesis to begin the
function. Must touch the last letter of the function.
FarmAnimals “FarmAnimals” is a dataset we just created. Type
“View(FarmAnimals)” in R after running the above code to see it.
, The comma
allows us to specify optional commands to the function. The space after
the comma is not required. It just looks nice. aes( The aes
or “aesthetics” function allows you to tell the ggplot how it should
appear. This includes things like what the x-axis or y-axis should
become. x = animal, Declares which variable will become the x-axis of
the graphic, the explanatory variable. y = count, Declares which
variable will become the y-axis of the graphic. fill = farm, Declares
which variable will determine the fill color of the bars, giving one color per farm.
)
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_bar( The “geom_bar”
function tells the ggplot() to become a bar chart. stat = “identity”,
Tells the ggplot to use the counts as listed
in the counts column. position =
“dodge”,
Causes the bars in the barchart to be
side-by-side rather than stacked.
color = “black”,
Controls the colors of the borders of the
bars in the plot. )
Closing parenthesis for the geom_bar()
function.
Press Enter to run the code.
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. mtcars “mtcars” is a dataset in R. Type “View(mtcars)” in
R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x = factor(cyl) Declares
which variable will become the x-axis of the graphic. Use
factor(columnName) when the column consists of numbers to turn it into
groups. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_bar( The “geom_bar”
function tells the ggplot() to become a bar chart. fill = “cornsilk”,
Controls the colors of the insides of the
bars in the plot. color = “black”
Controls the colors of the borders of the
bars in the plot. )
Closing parenthesis for the geom_bar()
function.
Press Enter to run the code.
ggplot An R
function “ggplot” used to create a framework for a graphic that will
have elements added to it with the +
sign.
( Parenthesis
to begin the function. Must touch the last letter of the
function. mtcars “mtcars” is a dataset in R. Type “View(mtcars)” in
R to see it. , The comma allows us to specify optional commands to
the function. The space after the comma is not required. It just looks
nice. aes( The aes
or “aesthetics” function
allows you to tell the ggplot how it should appear. This includes things
like what the x-axis or y-axis should become. x = factor(cyl), Declares
which variable will become the x-axis of the graphic. Use
factor(columnName) when the column consists of numbers to turn it into
groups. fill = factor(am), Declares which variable will determine the fill color of
the bars. Use factor(columnName) when the column consists of numbers
to turn it into groups. )
Closing parenthesis for the aes
function. )
Closing parenthesis for the ggplot
function. + The addition symbol +
is used to add
further elements to the ggplot.
geom_bar( The “geom_bar”
function tells the ggplot() to become a bar chart. position = “dodge”,
Causes the bars to be side-by-side instead of
stacked. color = “black” Controls the colors of the borders of the bars in
the plot. )
Closing parenthesis for the geom_bar()
function. + The addition symbol +
is used to add
further elements to the ggplot.
labs(x=“Cylinders”) The
“labs” function is being used to add a title to the x-axis only.
title=“main title” and y=“y title” could also be used.
Press Enter to run the code.
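Written as one statement, the clustered bar chart above looks roughly like this. It is a sketch that assumes the ggplot2 package is installed; the object name p is introduced here so the plot can be stored and printed.

```r
library(ggplot2)

# No y= is given, so geom_bar() counts the cars in each cyl/am combination itself.
p <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am))) +
  geom_bar(position = "dodge",   # side-by-side bars instead of stacked
           color = "black") +    # black borders around the bars
  labs(x = "Cylinders")

print(p)
```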
Creativity Required
Sometimes no standard plot sufficiently describes the data. In these cases, the only guideline is the one stated originally, “the graphical depiction of data should communicate the truth the data has to offer about the situation of interest.”
You should add links to examples you find of interesting plots made in R.
Here is the R Code for the graphic to the left:
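# Density curve of CO2 uptake for the Quebec plants, drawn without titles or axes.
plot(density(CO2$uptake[CO2$Type=="Quebec"]),
     main="", col='skyblue4',
     xlab="", ylab="", xaxt='n', yaxt='n')
# Add the density curve for the Mississippi plants to the same plot.
lines(density(CO2$uptake[CO2$Type=="Mississippi"]),
      col='firebrick')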