R introduction!

I wrote a really short intro2R document for those who think Verzani is too wordy…

(Verzani is awesome.)

Introduction to R (updated 9/19/2016)


The thing that has been annoying me so much: epsilon-delta proofs

For those who took real analysis a long time ago, you might have been annoyed by the same thing that has been bothering me during math bootcamp: how can we formally define limits and continuity? Here are my thoughts on the formal definitions of limit, continuity, and uniform continuity.

The limit is defined as

\forall \,\varepsilon>0\,\exists\,\delta>0 \text{ s.t. } 0<|x-x_0|<\delta\,\to\,|f(x)-L|<\varepsilon

Continuity (at every point x_0 of the domain) is defined as

\forall x_0, \,\forall \varepsilon>0 \,\exists\,\delta>0 \text{ s.t. } \forall x, \,|x-x_0|<\delta\,\to\,|f(x)-f(x_0)|<\varepsilon

And uniform continuity is defined

\forall \varepsilon>0 \,\exists\,\delta>0 \text{ s.t. } \forall x,\,\forall x_0, \,|x-x_0|<\delta\,\to\,|f(x)-f(x_0)|<\varepsilon

In my opinion, the key difference between limit and continuity is how strict each definition is. In the continuity definition we do not stipulate that x\ne x_0, as we do in the limit definition. The difference between continuity and uniform continuity is also a matter of strictness, and in this case it is registered by the order of the quantifiers: in uniform continuity a single \delta must work for every point x_0 in the domain, whereas in pointwise continuity \delta may depend on x_0. Uniform continuity is therefore the stricter notion.
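To make the quantifier order concrete, here is a standard textbook example (my addition, not from the original post): f(x) = 1/x on the open interval (0,1) is continuous at every point but not uniformly continuous.

```latex
% f(x) = 1/x on the open interval (0,1)
|f(x)-f(x_0)| = \left|\frac{1}{x}-\frac{1}{x_0}\right| = \frac{|x-x_0|}{x\,x_0}
% For a fixed x_0, taking \delta = \min\{x_0/2,\; \varepsilon x_0^2/2\}
% forces x > x_0/2, hence |f(x)-f(x_0)| < \delta/(x_0^2/2) \le \varepsilon,
% so f is continuous at x_0.
% But this \delta shrinks to 0 as x_0 \to 0, and no single \delta > 0
% works for all x_0, so f is not uniformly continuous on (0,1).
```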

Loop through files to loop through variables!

Say you have a bunch of data files formatted in exactly the same way (not rare if you are scraping, or if the data are clean). How do you loop through all the files, extract the useful information, and bind it into one big matrix? Consider the following code.

Suppose all my files are named “1.csv”, …, “5.csv”; we can loop through them like this:

file.names <- c("1", "2", "3", "4", "5")
for (i in seq_along(file.names)) {
  # read.csv() accepts a file path directly, so readLines() +
  # textConnection() is unnecessary
  read <- read.csv(paste(file.names[i], "csv", sep = "."),
                   header = TRUE, stringsAsFactors = FALSE)
  # this creates variables named "1", ..., "5"; access them with
  # backticks (e.g. `1`) or with get("1")
  assign(file.names[i], read)
}
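If you prefer to avoid assign() (which scatters loose variables across the global environment), reading everything into a named list is a common alternative. Here is a minimal sketch; it writes two small throwaway CSV files to a temp directory as stand-ins for the real "1.csv", "2.csv", so it runs anywhere:

```r
# create two example files in a temp directory (stand-ins for 1.csv, 2.csv)
dir <- tempdir()
write.csv(data.frame(year = 1999, place1 = 1.1, place2 = 7.8),
          file.path(dir, "1.csv"), row.names = FALSE)
write.csv(data.frame(year = 2000, place1 = 2.2, place2 = 8.9),
          file.path(dir, "2.csv"), row.names = FALSE)

file.names <- c("1", "2")
paths <- file.path(dir, paste(file.names, "csv", sep = "."))

# read every file into one named list instead of separate variables
data.list <- lapply(paths, read.csv, stringsAsFactors = FALSE)
names(data.list) <- file.names
```

The list can then be processed with lapply() and bound together in one step, which is exactly the pattern used below.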

Oftentimes you will need to reshape your data. Suppose we are looking at data like this:

year    place1    place2
1999    1.1       7.8
...

An efficient way to reshape your data is to wrap melt() from the reshape2 package in a helper function:

library(reshape2) # provides melt()

my.melt <- function(x) {
  melt(x, id.vars = "year", variable.name = "place")
}
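For reference, the same wide-to-long reshape can be done in base R without reshape2. A sketch on a two-row toy table matching the layout above (the numbers are made up):

```r
# toy wide data matching the year/place1/place2 layout above
df <- data.frame(year = c(1999, 2000),
                 place1 = c(1.1, 2.2),
                 place2 = c(7.8, 8.9))

# manual melt: repeat the id column once per measured column,
# stack the measured columns into a single value column
long <- data.frame(
  year  = rep(df$year, times = 2),
  place = rep(c("place1", "place2"), each = nrow(df)),
  value = c(df$place1, df$place2)
)
```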

Since all the files have the same format, we get a long list of variables with the same dimensions, so we can reshape all of them the same way. Consider the example where I want to reshape two of my variables:

var.names <- list(var1, var2)
for (i in 1:length(var.names)){
  var.names[[i]] <- my.melt(var.names[[i]]) 
}

Alternatively, you can use lapply():

reshape <- lapply(var.names, my.melt)

Now, we need to cbind() all our data:

datalist <- list() # create an empty list
for (i in seq_along(reshape)) {
  datalist[[i]] <- reshape[[i]]
}
merged <- do.call(cbind, datalist) # avoid calling this "merge", which masks base::merge
# note: names() expects a character vector of column names;
# var.names is a list of data frames, so names(merged) <- var.names
# would not do what you want
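Since each melted data frame has the same three columns (year, place, value), stacking them row-wise with rbind() may be what you actually want instead of cbind(). A minimal self-contained sketch, with made-up toy data standing in for the melted var1 and var2:

```r
# toy stand-ins for two already-melted variables (hypothetical data)
var1 <- data.frame(year = c(1999, 2000), place = "place1", value = c(1.1, 2.2))
var2 <- data.frame(year = c(1999, 2000), place = "place2", value = c(7.8, 8.9))

# rbind() stacks data frames that share the same column names
stacked <- do.call(rbind, list(var1, var2))
```

This yields one long data frame with a single set of columns, which is usually easier to work with than side-by-side duplicated columns from cbind().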

Definitely not the smartest way — but it works.