Fixing some problems in R

September 3, 2008

Radford Neal points out a few problems in R in this post on his blog.

At first I thought that his approach would not be the best, but maybe I was wrong. So I read up on R internals, and here is my first thought on implementing fixes. The good news is that some of this can be done without changing the R internals yet.

There are two problems:

First, in R the : operator has some annoying properties. One usually uses this operator to generate increasing sequences, where 1:3 becomes c(1,2,3). This is often used to index matrices, as in m[1:3,]. But : will happily generate decreasing sequences, so 1:-1 becomes c(1,0,-1). This causes indexing to fail in strange ways: when n is 0, m[1:n,] selects the first row instead of no rows at all.
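To make the first problem concrete, here is a small demonstration in plain R. The matrix m and the variables x, n and count are made up for illustration.

```r
x <- c()                  # an empty vector, so length(x) is 0
1:length(x)               # gives 1 0, not an empty sequence

count <- 0
for (i in 1:length(x)) count <- count + 1
count                     # the loop body ran twice, although x is empty

m <- matrix(1:6, nrow = 2)
n <- 0
m[1:n, ]                  # 1:0 is c(1, 0); the 0 is ignored, so this is row 1

# 1:-1 is c(1, 0, -1), which mixes positive and negative indices,
# so R refuses to index with it at all.
result <- try(m[1:-1, ], silent = TRUE)
inherits(result, "try-error")
```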

Second, matrix indexing behavior is not consistent. If the indexing selects more than one row and more than one column, then the result keeps the dimensions of a matrix. But if it selects a single row or a single column, then the result is returned as a vector with no dimensions. For example, m[1,] returns the first row of the matrix as a vector and not as a one-row matrix.
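The inconsistency is easy to see at the prompt. The matrix m and the names two.rows and one.row are made up for illustration.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

two.rows <- m[1:2, ]
one.row  <- m[1, ]

dim(two.rows)   # 2 3
dim(one.row)    # NULL: the row came back as a plain vector
nrow(one.row)   # NULL as well, so code relying on nrow() breaks for one row
```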

There are two basic ways to solve the first problem. The simple way would be to create two new operators, %>% and %<%, where a %>% b would produce only an increasing sequence, and either an empty vector or an error if a>b. Likewise, %<% would produce only decreasing sequences. This is easy.
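A minimal sketch of the simple version might look like the following. It assumes that a and b are single numbers, that the step is always 1, and that the "wrong direction" case returns an empty vector rather than an error; all three are my choices for the sketch, not settled design.

```r
# Increasing sequences only: empty when a > b (assumed behavior).
`%>%` <- function(a, b) {
  if (a > b) integer(0) else seq(a, b, by = 1)
}

# Decreasing sequences only: empty when a < b (assumed behavior).
`%<%` <- function(a, b) {
  if (a < b) integer(0) else seq(a, b, by = -1)
}

1 %>% 3    # 1 2 3
3 %<% 1    # 3 2 1
1 %>% 0    # empty, where 1:0 would give 1 0

# A loop over an empty vector now does nothing.
x <- c()
count <- 0
for (i in 1 %>% length(x)) count <- count + 1
count      # 0
```

Returning an empty vector keeps loops and indexing quiet in the empty case; raising an error instead would catch the mistake earlier but would force every caller to guard against it.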

Another possibility would be for %>% and %<% to produce a new class of objects that would be treated specially by index operations and loops. These objects would not be vectors, but would cause indexing and loops to act as though they were vectors. So 1%>%1000 would not be a vector with 1000 entries, but loops and indexing would treat it as such. This is harder, so I will skip it for the moment.

The second problem is a little more straightforward. In order to stop matrix indexing from returning a vector, one uses the drop=FALSE argument to the indexing operation. So here is a solution: put an attribute on the indexing argument that says “drop=FALSE”. Then update the [ and [<- indexing operations to look for this attribute on their arguments and automatically use the drop=FALSE option if it is set. This can be done without, for the moment, changing the internal [ and [<- operators.

So here is an example:


m = matrix(1:6, nrow=2) # a small 2x3 matrix to index into
h = 1

m[h,] # this returns one row from the matrix m, but that row is a vector

h = drop.no(h) # this sets the drop=FALSE attribute on h; R passes arguments by value, so the result is assigned back
m[h,] # with the updated [ operator, this returns one row from m and that row is a matrix

h = 1 %>% 1 # %>% returns a vector that already has the drop=FALSE attribute set
m[h,] # so this returns one row from m and that row is a matrix
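As a rough first cut, the attribute idea can be tried out without touching [ at all. In the sketch below, the attribute name "drop.no", the helper wants.no.drop and the function index2 are all hypothetical; index2 is only a stand-in for an updated [ on two-dimensional objects, not the real operator.

```r
# Tag an index vector so that indexing with it should keep dimensions.
# The attribute name "drop.no" is an assumption made for this sketch.
drop.no <- function(i) {
  attr(i, "drop.no") <- TRUE
  i
}

# TRUE if the index carries the tag.
wants.no.drop <- function(i) isTRUE(attr(i, "drop.no"))

# Stand-in for an updated `[`: uses drop=FALSE if either index is tagged.
# A missing i or j is passed through to `[` as "all rows" or "all columns".
index2 <- function(x, i, j) {
  keep <- (!missing(i) && wants.no.drop(i)) ||
          (!missing(j) && wants.no.drop(j))
  x[i, j, drop = !keep]
}

m <- matrix(1:6, nrow = 2)

h <- 1
dim(index2(m, h))    # NULL: an untagged index still drops to a vector

h <- drop.no(h)
dim(index2(m, h))    # 1 3: the tagged index keeps the result a matrix
```

The real change would move the same check into [ and [<- themselves, so that m[h,] behaves this way directly.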

Having some spare time, I will see if I can code this up in the next few days.
