How can you learn R?

2020-12-12

Categories: Blog post Tags: R programming

I get this question a lot: How do I learn how to use R? I usually mumble something, and feel quite uncomfortable. In fact, it is not easy to learn R on your own. It is not easy to learn ANYTHING on your own, but that is especially true of R. Someone asked me again today, and I thought I should outline the many different approaches that you can take in learning R. I am listing these in order of priority, with the best and most effective approaches listed first.

  1. Take a live local class. The best class would be one in Statistics where the teacher uses R extensively. At UMKC, most of the faculty other than me seem to use a different software package (SAS or SPSS) in their classes. You might have better luck at KUMC, but I’m not familiar at all with their classes. Many of the data science classes (e.g., at Rockhurst University) use Python. But if you find one with a heavy statistics content that also uses R, you should grab it. For what it’s worth, I offer a one-credit-hour class, Introduction to R, but it covers only reading in data from text files, basic data manipulation, and simple graphs and tables. If you want to use R for statistical analysis, this class would provide a solid foundation, but would not cover anything more complex than a correlation coefficient. Sam Ye offers a one-day workshop in R through UMKC. It is not for academic credit, though, and is offered irregularly. KU offers a one-week workshop in R every summer. Both Sam’s class and the KU workshop are similar, I think, to my class in that they don’t dive deeply into data analysis. This is written from the perspective of someone who lives in Kansas City, but you can easily extrapolate to your local setting.

  2. Take a remote class. I really dislike the MOOCs, as they do not offer sufficient one-on-one instruction. Rather, pay for a course at The Analysis Factor (theanalysisfactor.com) or The Institute for Statistics Education (statistics.com). I just finished teaching a course in Survival Analysis for the former group, and we had examples in R (as well as SAS, SPSS, and Stata). There are lots of other classes, but one of particular note is Introduction to R, which Kim Love will be teaching soon. I can’t talk about the class content, but I do know that Kim is a very good teacher. I don’t know the folks at statistics.com very well, but I did take a couple of classes from them many years ago and they were very well run. Their faculty are prominent researchers with strong international reputations.

  3. Attend a statistics conference. These are all virtual for the near future, but you can still get a lot out of them. Pay special attention to the short courses that run a half day or longer, as these are often oriented towards beginners. But some of the regular talks might be good as well. I get a LOT of mileage out of these conferences, as they attract some of the best speakers, like Hadley Wickham, Garrett Grolemund, and Yihui Xie. Any talk by Frank Harrell is worth your time. The conferences sponsored by the American Statistical Association are quite good. I have not attended any of the conferences sponsored by RStudio, but I hear very good things about them. Here’s just one talk that looks interesting. The UseR conference is also reputed to be quite good, but I’m not sure where to find the recordings.

  4. Watch videos. You don’t get to ask questions, but otherwise the best videos that you can find about R are quite good. Some of the best are recorded talks at the RStudio and UseR conferences. The quality of videos overall, however, is quite uneven. Look for speakers with strong reputations in the research community.

  5. Attend local meetings of the Kansas City R Users Group, and (when the topic is right) Data Science KC. You don’t learn R from a users group, but it does help supplement some of your other efforts.

  6. Read a good book. There are lots of excellent books out there, but it takes a strongly self-motivated person to learn solely from a book. There is something about the give and take of an instructor that keeps your interest and motivation high. A book is too easy to read for a couple of chapters and then put aside. That being said, the book R for Data Science, written by Garrett Grolemund and Hadley Wickham, is quite good. It covers a lot of advanced topics in data analysis, but is still accessible for a beginner. If you find a book you like, but it doesn’t cover R, see if it is listed in the resources at UCLA. I leaned heavily on this site for my Survival Analysis class, as it had great code examples in R (and other packages) for the Hosmer, Lemeshow, and May book.

  7. Try a MOOC. I have no experience with these, but I know several people who have taken the Johns Hopkins Data Science series from Coursera. The teachers have a strong reputation, and some people just loved the classes. Others thought that the quality of instruction was uneven. Still others found it hard to stay motivated enough to work through all of the classes.

  8. Read blog sites. There are lots of blogs for R and many of the good ones are aggregated. The blog posts are an odd mix of beginning, intermediate, and advanced R, and you often won’t be able to tell until you’ve read half of any post. This is, at best, a supplement to your other efforts.

  9. RTFM. This acronym stands for Read The Fine Manual. R itself comes with some documentation, which is good at times and sometimes great. The best documentation often comes in the form of a vignette. Find one on a topic you are interested in, such as the vignette on dplyr, and work through the examples that it gives.
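
If you have never looked for a vignette before, here is a rough sketch of how you might find one from the R console. It uses `vignette()` and `browseVignettes()`, which come with R in the utils package. The dplyr part assumes you have installed dplyr yourself, since it is an add-on package, and the names `all_vignettes` and `vignette_table` are just ones I made up for this example.

```r
# vignette() with no arguments describes every vignette in the
# packages installed on this machine.
all_vignettes <- vignette()

# Keep the three columns that are most useful for browsing.
vignette_table <- all_vignettes$results[, c("Package", "Item", "Title"), drop = FALSE]
head(vignette_table)

# dplyr is an add-on package, so confirm it is installed first.
# The viewer calls only make sense in an interactive session.
if (requireNamespace("dplyr", quietly = TRUE) && interactive()) {
  # Open the introductory vignette, which is itself named "dplyr".
  print(vignette("dplyr", package = "dplyr"))
  # List all of the dplyr vignettes in a web browser.
  print(browseVignettes("dplyr"))
}
```

What you see in the table depends on which packages you have installed, so your list will probably look different from mine.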

I’m going to ask about this at today’s meeting of the Kansas City R Users Group and see what others suggest. I rely a lot on #3 (conferences), #6 (books) and #9 (documentation), but I’m not a beginner. I was a beginner back in 1994, but I can scarcely remember what things were like one score and six years ago.