Old habits die hard. For someone coming from languages with a scalar-oriented programming model, such as C++ or Java, it takes quite a bit of effort before *fully-compliant* R functions come naturally. The major pitfall is defining a function that is not *vector-compliant*.

For example, consider the function below, which accepts two numbers and returns a value computed from them.

    getNc2 = function(vc, ec) {
        # compute nC2 and return the ratio
        max_ec = vc * (vc-1) / 2
        return (trunc(ec * 100 / max_ec))
    }

This is a valid function in R, and perhaps similar to any C++ or Java method, except that it works on both single-number inputs *and* vector inputs. For example, note the single-number calls below:

    > getNc2(5, 4)
    [1] 40
    > getNc2(6, 5)
    [1] 33

Now, those two calls can be combined into a single vector call as below and our above definition works perfectly fine without complaints:

    > getNc2(c(5,6), c(4,5))
    [1] 40 33

The method accepted two vectors (instead of just two individual numbers) and returned vectorized results. This is not quite natural in C++ or Java, unless special care is taken, for example by defining the argument types as templated parameters.

Now, what about the limits? Our method computes a ratio, which involves a division. If we are not careful, we can end up dividing by zero. For example, consider the calls below:

    > getNc2(0, 5)
    [1] -Inf
    > getNc2(0, 0)
    [1] NaN
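The negative sign on the infinity may look surprising. A minimal sketch of what is going on, assuming R doubles follow IEEE 754 arithmetic (as they do on common platforms): with *vc* equal to 0, the denominator evaluates to a negative zero, which prints as 0 but still carries its sign into the division.

```r
# Denominator of getNc2 for vc = 0, i.e. 0 * (0 - 1) / 2
max_ec <- 0 * (0 - 1) / 2

neg_ratio <- 5 * 100 / max_ec   # nonzero divided by (negative) zero
nan_ratio <- 0 * 100 / max_ec   # zero divided by zero

print(max_ec)      # the sign of a zero is not shown when printing
print(neg_ratio)
print(nan_ratio)
```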

Suppose we want to handle the cases where there are too few vertices to form an edge (*vc* is 0 or 1) and return 0 instead of *-Inf* or *NaN*. We would then add an *if* condition as below:

    getNc2 = function(vc, ec) {
        # if no vertices return 0
        if(vc <= 1)
            return (0)
        # compute nC2 and return the ratio
        max_ec = vc * (vc-1) / 2
        return (trunc(ec * 100 / max_ec))
    }

Now, the earlier calls will return 0 correctly as shown below:

    > getNc2(0, 0)
    [1] 0
    > getNc2(0, 5)
    [1] 0

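However, the *if* condition introduces a new problem. *if* in R is not a vector-compatible operation. It works only on a scalar condition, so our method now fails on vector inputs. For example (on R versions before 4.2.0 this produced a warning and a silently wrong result; since R 4.2.0 it is an error):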

    > getNc2(c(5,0), c(4,5))
    [1]   40 -Inf
    Warning message:
    In if (vc <= 1) return(0) :
      the condition has length > 1 and only the first element will be used

Now, there are a couple of ways to address this problem. One is to use the *apply* family of constructs, though they are often overkill. Another is to explicitly use vector-capable operations, such as *ifelse* instead of *if*. That works, provided you have the option to change the source. But what about cases where you are using a method from some package and cannot change it?
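For the case where the source can be changed, here is a minimal sketch of the *ifelse* route. The name `getNc2_ifelse` is made up for this illustration. Note that *ifelse* evaluates both branches for the whole vector, so the division by zero still happens internally; its result is simply discarded wherever the condition holds.

```r
# Hypothetical vector-capable variant of getNc2 built on ifelse()
getNc2_ifelse <- function(vc, ec) {
  # compute nC2 for every element; may be zero where vc <= 1
  max_ec <- vc * (vc - 1) / 2
  # pick 0 where there are too few vertices, the ratio elsewhere
  ifelse(vc <= 1, 0, trunc(ec * 100 / max_ec))
}

mixed  <- getNc2_ifelse(c(5, 0), c(4, 5))  # vector input with a zero
single <- getNc2_ifelse(5, 4)              # scalar input still works
print(mixed)
print(single)
```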

Take the method *get.edge* from the *igraph* package, for example. It accepts an edge id and returns the vertices for that edge. However, if you supply a vector of edge ids, it does not work: it returns the vertices for the first edge only, as can be seen from the output below:

    > cg <- erdos.renyi.game(8, 0.6)
    > get.edge(cg, 13)
    [1] 4 6
    > get.edge(cg, c(13,15,16))
    [1] 4 6

How do you make these kinds of methods work for vector inputs as well as scalar inputs? R has a nice base function up its sleeve that addresses exactly this problem. Named *Vectorize*, it converts scalar methods into vector methods.

*Vectorize* creates a function wrapper that vectorizes the action of its argument function. For example, we can vectorize our *getNc2* method as shown below:

    > Vectorize(getNc2)(c(5,0), c(4,5))
    [1] 40  0

Instead of calling *getNc2* directly, we create a wrapper around it by calling *Vectorize* with *getNc2* as its argument, and then supply the original vector inputs to the resulting function. In other words, the above is the same as:

    > vecNc2 <- Vectorize(getNc2)
    > vecNc2(c(5,0), c(4,5))
    [1] 40  0

Remember, the vectorized *vecNc2* method works perfectly fine on scalar inputs too. For example:

    > vecNc2(5, 4)
    [1] 40
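It may help to see what the wrapper amounts to. *Vectorize* is built on base R's *mapply*, so the sketch below compares the wrapped function against a direct *mapply* call on the scalar-only *getNc2*. The variable names `via_vectorize` and `via_mapply` are made up for this illustration.

```r
# Scalar-only version from the text: the if() needs a length-one condition
getNc2 <- function(vc, ec) {
  if (vc <= 1) return(0)
  max_ec <- vc * (vc - 1) / 2
  trunc(ec * 100 / max_ec)
}

vecNc2 <- Vectorize(getNc2)

# Both forms call getNc2 once per pair of elements
via_vectorize <- vecNc2(c(5, 0), c(4, 5))
via_mapply    <- mapply(getNc2, c(5, 0), c(4, 5))

print(via_vectorize)
print(isTRUE(all.equal(via_vectorize, via_mapply)))
```

Since the scalar function is still called once per element, this is a convenience for calling code rather than a speed-up over an explicit loop or *apply* construct.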

By default, *Vectorize* tries to vectorize all the arguments of the supplied method, which sometimes does not work. Take the earlier example of *get.edge* in *igraph*:

    > vecGE <- Vectorize(get.edge)
    > vecGE(cg, c(13,15,16))
    Error in function (graph, id)  : Not a graph object

Here the *get.edge* method accepts two inputs, the graph object itself and the edge id, and we want to vectorize only the second one, the *id* argument, not the graph object. For this, we have to use the *vectorize.args* argument of *Vectorize*, as shown below:

    > vecGE <- Vectorize(get.edge, vectorize.args="id")
    > vecGE(cg, c(13,15,16))
         [,1] [,2] [,3]
    [1,]    4    1    2
    [2,]    6    7    7
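The same behaviour can be reproduced without *igraph*. The sketch below is only a stand-in: `edge_table` and `get_one_edge` are hypothetical names, with a plain matrix playing the role of the graph object and a scalar-only lookup playing the role of *get.edge*. It shows the failure when every argument is vectorized and the fix with *vectorize.args*.

```r
# Hypothetical stand-in for a graph: one row of endpoints per edge id
edge_table <- matrix(c(4, 6,
                       1, 7,
                       2, 7), ncol = 2, byrow = TRUE)

# Hypothetical scalar-only lookup, standing in for get.edge
get_one_edge <- function(edges, id) {
  stopifnot(length(id) == 1)
  edges[id, ]
}

# Vectorizing every argument also iterates over the table's elements,
# so each call receives a single number instead of the whole matrix
vec_all <- Vectorize(get_one_edge)
failed <- inherits(try(vec_all(edge_table, 1:3), silent = TRUE), "try-error")

# Vectorizing only 'id' passes the table through untouched
vec_id <- Vectorize(get_one_edge, vectorize.args = "id")
endpoints <- vec_id(edge_table, 1:3)

print(failed)
print(endpoints)   # one column per requested id
```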

This is the simplified output of *Vectorize*: the per-edge results are combined into a matrix, with one column per edge id. We can set the *SIMPLIFY* argument of *Vectorize* to FALSE to get the original results unaltered, as a list. For example:

    > vecGE <- Vectorize(get.edge, vectorize.args="id", SIMPLIFY=FALSE)
    > vecGE(cg, c(13,15,16))
    [[1]]
    [1] 4 6

    [[2]]
    [1] 1 7

    [[3]]
    [1] 2 7

Note that this is the same as the *lapply* output:

    > lapply(c(13,15,16), function(id) get.edge(cg, id))
    [[1]]
    [1] 4 6

    [[2]]
    [1] 1 7

    [[3]]
    [1] 2 7
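The equivalence with *lapply* can be checked programmatically. As a sketch that runs without *igraph*, the code below again uses a hypothetical matrix `edge_table` and a hypothetical scalar lookup `get_one_edge` in place of the graph and *get.edge*.

```r
# Hypothetical stand-ins for the graph object and get.edge
edge_table <- matrix(c(4, 6,
                       1, 7,
                       2, 7), ncol = 2, byrow = TRUE)
get_one_edge <- function(edges, id) edges[id, ]

# Unsimplified Vectorize output: a list with one element per id
vec_list <- Vectorize(get_one_edge, vectorize.args = "id", SIMPLIFY = FALSE)
via_vectorize <- vec_list(edge_table, 1:3)

# The explicit lapply form of the same computation
via_lapply <- lapply(1:3, function(id) get_one_edge(edge_table, id))

print(identical(via_vectorize, via_lapply))
```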

So go ahead and use *Vectorize* whenever you want to avoid *lapply* constructs for methods that accept only scalar inputs.