{"id":147,"date":"2017-05-14T13:45:09","date_gmt":"2017-05-14T12:45:09","guid":{"rendered":"http:\/\/www.cardiomaths.net\/?p=147"},"modified":"2017-07-21T13:38:36","modified_gmt":"2017-07-21T12:38:36","slug":"using-the-tapply-function-in-r","status":"publish","type":"post","link":"http:\/\/www.cardiomaths.net\/index.php\/2017\/05\/14\/using-the-tapply-function-in-r\/","title":{"rendered":"Using the tapply function in R"},"content":{"rendered":"<p>There are lots of apply functions in R (apply, lapply, sapply etc.), and they are often used instead of explicit loops. I find tapply particularly useful: it applies a function such as sum, mean or length to each subset of a vector, with the subsets defined by one or more grouping variables.<\/p>\n<p>There are three components, tapply(1, 2, 3), where:<\/p>\n<ol>\n<li>the vector of data you want to apply the function to<\/li>\n<li>the way to break the data up (this can be a single variable or a list of them)<\/li>\n<li>the function (such as length or mean) that you wish to apply to the data<\/li>\n<\/ol>\n<p>#load up some data<br \/>\ndata(iris)<\/p>\n<p>#example &#8211; obtain mean petal length per species<br \/>\ntapply(iris$Petal.Length, iris$Species, mean)<\/p>\n<p>#Find the mean petal length for each species, using only the rows where the petal width is 1.4<br \/>\n#(a species with no matching rows comes back as NA)<br \/>\ntapply(iris$Petal.Length[iris$Petal.Width == 1.4], iris$Species[iris$Petal.Width == 1.4], mean, na.rm=TRUE)<\/p>\n<p>#If you have problems with this, check that the two vectors are the same length<br \/>\nlength(iris$Petal.Length[iris$Petal.Width == 1.4])<br \/>\nlength(iris$Species[iris$Petal.Width == 1.4])<\/p>\n<p># example \u2013 find the mean for every combination of two grouping variables by passing a list<br \/>\ntapply(mtcars$mpg, list(mtcars$cyl, mtcars$am), mean)<\/p>\n<p>&nbsp;<\/p>\n<p>#Other useful sources<br \/>\n<a href=\"https:\/\/www.r-bloggers.com\/r-function-of-the-day-tapply-2\/\">https:\/\/www.r-bloggers.com\/r-function-of-the-day-tapply-2\/<\/a><br \/>\n<a 
href=\"https:\/\/stat.ethz.ch\/R-manual\/R-devel\/library\/base\/html\/tapply.html\">https:\/\/stat.ethz.ch\/R-manual\/R-devel\/library\/base\/html\/tapply.html<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are lots of apply functions in R (apply, lapply, sapply etc.), and they are often used instead of explicit loops. I find tapply particularly useful: it applies a function such as sum, mean or length to each subset of a vector, with the subsets defined by one or more grouping variables. There are three components, tapply(1, 2, 3), where: the vector of data you want to [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/posts\/147"}],"collection":[{"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/comments?post=147"}],"version-history":[{"count":4,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/posts\/147\/revisions"}],"predecessor-version":[{"id":151,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/posts\/147\/revisions\/151"}],"wp:attachment":[{"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/media?parent=147"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/categories?post=147"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.cardiomaths.net\/index.php\/wp-json\/wp\/v2\/tags?post=147"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}