Say that I want to generate a PNG named myPlot.png in my current directory:
png('myPlot.png')
plot(df$var1, df$var2)
dev.off()
The last line closes the PNG device, which finishes writing the file, and tells R that I want to switch back to the default device, meaning X11, which shows the result on screen.
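To try the PNG steps without an existing data frame, here is a minimal sketch. The sample values in df, the temporary output location and the width/height settings are my own assumptions, chosen only so the snippet runs on its own.

```r
# Hypothetical sample data standing in for the real df
df <- data.frame(var1 = c(1, 2, 3, 4, 5),
                 var2 = c(2, 4, 6, 8, 10))

# Assumption: write into the session's temp directory to avoid stray files
out_file <- file.path(tempdir(), "myPlot.png")

png(out_file, width = 800, height = 600)  # open the PNG device (size in pixels)
plot(df$var1, df$var2)                    # drawn into the file, not on screen
dev.off()                                 # close the device so the file is complete

file.exists(out_file)
```

Until dev.off() is called the file may be empty or incomplete, so it is worth keeping the three calls together.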
df$Var
Now, for that variable I can get its mean:
mean(df$Var)
Standard deviation:
sd(df$Var)
Summary (this also works for a complete data frame):
summary(df$Var)
Which observation has the minimum value for the given variable:
which.min(df$Var)
Which observation has the maximum value for the given variable:
which.max(df$Var)
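A small sketch of these calls on made-up data (the values in df are an assumption, picked so the results are easy to check by hand). Two things are worth seeing in action: which.min and which.max return the position of the observation, not its value, and mean returns NA when the variable contains missing values unless na.rm = TRUE is passed.

```r
# Hypothetical data frame with a single numeric variable
df <- data.frame(Var = c(4, 8, 15, 16, 23, 42))

mean(df$Var)                 # 18
sd(df$Var)                   # sample standard deviation
which.min(df$Var)            # 1, the row holding the smallest value
which.max(df$Var)            # 6, the row holding the largest value
df$Var[which.max(df$Var)]    # 42, the largest value itself

# With a missing value, mean() gives NA unless told to drop it
with_na <- c(df$Var, NA)
mean(with_na)                # NA
mean(with_na, na.rm = TRUE)  # 18
```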
plot(df$Var1, df$Var2)
Here Var1 goes on the X axis and Var2 on the Y axis.
sub = subset(df, Var1 > 100 & Var2 < 50)
Notice that the logical AND operator is an ampersand. To see how many observations are in this subset (or in any data frame) we can use the nrow function:
nrow(sub)
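Here is a minimal sketch of the same filter on invented data (the four rows are an assumption), together with the equivalent bracket indexing for comparison.

```r
# Hypothetical data: only rows 2 and 4 satisfy both conditions
df <- data.frame(Var1 = c(50, 120, 150, 200),
                 Var2 = c(10, 40, 60, 20))

# Keep rows where both conditions hold (& is the element-wise AND)
sub <- subset(df, Var1 > 100 & Var2 < 50)
nrow(sub)   # 2

# The same selection written with logical indexing
sub2 <- df[df$Var1 > 100 & df$Var2 < 50, ]
nrow(sub2)  # 2
```

Inside subset the column names can be used directly, while with bracket indexing they need the df$ prefix.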
setwd('pathname')
If you are not sure which is your current working directory, just print it:
getwd()
Reading from a CSV file into a data frame is pretty simple:
df = read.csv('path/to/file.csv')
Now we can get the structure of the data frame:
str(df)
It gives us information on the number of observations (rows) and variables (columns), the names of the variables, a few of their values and, when they are detected as factors, the number of levels that variable has.
summary(df)
It tries to give us a useful summary of each variable: the counts per level for a factor, or a few statistics otherwise (min, max, mean, median, first and third quartile).
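To see str and summary at work without a file on disk, here is a minimal sketch that reads CSV content from a string. The column names and values are made up. Note that since R 4.0.0 read.csv no longer converts text columns to factors by default, so stringsAsFactors = TRUE is passed here to get the factor behaviour described above.

```r
# Hypothetical CSV content, read from a string instead of a file
csv_text <- "name,group,score
a,x,1.5
b,y,2.5
c,x,3.5"

# stringsAsFactors = TRUE turns the text columns into factors
df <- read.csv(text = csv_text, stringsAsFactors = TRUE)

str(df)            # rows, columns, types and the first few values
summary(df)        # level counts for factors, statistics for numbers
nlevels(df$group)  # 2, the levels are "x" and "y"
```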
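sub = subset(df, MyVariable == 'a value')
Note the double equals sign: it is the comparison operator, while a single = would not filter anything. Then we can save this subset to a CSV file: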
write.csv(sub, 'path/to/subFile.csv')
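A small round-trip sketch of filtering and saving, on invented data and writing to a temporary file (both are assumptions). By default write.csv also writes the row names as an extra first column, so row.names = FALSE is used here to keep the file limited to the actual variables.

```r
# Hypothetical data frame with a text variable to filter on
df <- data.frame(MyVariable = c("a value", "other", "a value"),
                 Score = c(1, 2, 3))

# == compares each value; only matching rows are kept
sub <- subset(df, MyVariable == "a value")

# Assumption: save into the session's temp directory
out_file <- file.path(tempdir(), "subFile.csv")
write.csv(sub, out_file, row.names = FALSE)

# Read the file back to check what was written
back <- read.csv(out_file)
nrow(back)   # 2
names(back)  # "MyVariable" "Score"
```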
Sys.setlocale("LC_ALL", "C")