Version Issues

One key issue to be mindful of is that MATLAB has two new releases every year: one in the spring (e.g., 2016a) and one in the fall (e.g., 2016b). Changes in any single release are incremental, but over time they add up to transformational change.

For instance, with release 2014b, the default colormap changed from “jet” to “parula.” If the default changes again in the future, your figures might look different from the ones shown here. In that case, you can set the colormap explicitly by typing

colormap(jet) 

or

colormap(parula)
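As a minimal sketch of pinning the colormap so that a figure does not depend on the release default, one could build the colormap as an explicit matrix and apply it to the figure. The variable name cmap and the use of peaks as example data are illustrative assumptions, not part of the text above:

```matlab
%Build the colormap as an explicit 256-by-3 matrix of RGB values
cmap = jet(256);

%Apply it to a single figure, regardless of the release default
figure
imagesc(peaks(100)) %peaks only supplies some example data to display
colormap(cmap)
colorbar
```

Replacing jet(256) with parula(256) pins the newer default in the same way.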

Sometimes, changes to how MATLAB works require updating your code. For instance, the way to initialize (“seed”) the random number generator used to be to call the setDefaultStream method of the RandStream class, like this:

RandStream.setDefaultStream(s);

after defining s as

s = RandStream('mt19937ar', 'seed', sum(100*clock))

This specifies a particular random number generation method (the Mersenne Twister), seeded from the system clock.

But this no longer works. The method setDefaultStream has been replaced by setGlobalStream, so the correct code is now:

RandStream.setGlobalStream(s);

Be on the lookout for things like that.
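In releases that provide it, the rng function offers a shorter way to seed the global generator. The following is a minimal sketch, assuming rng is available in your release; the variable names are illustrative and not taken from the text above:

```matlab
rng('shuffle'); %Seed the global generator from the current time
savedSettings = rng; %Store the generator settings so they can be restored later
firstDraw = rand(1,5);
rng(savedSettings); %Put the generator back into the stored state
secondDraw = rand(1,5);
disp(isequal(firstDraw,secondDraw)) %Both draws start from the same state, so they match
```

Saving and restoring the settings this way is handy when a simulation has to be repeated with exactly the same random numbers.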

 

 

Vectorization

MATLAB is an interpreted language, so each line is interpreted and executed one after the other. This is fine, but if many lines have to be executed, the code can take a long time to run. This is a particular concern with long loops, and even more so with nested loops. Most loops can be replaced by a vector operation, and MATLAB is optimized for those, so vectorizing will speed up your code considerably. Here, we provide three simple examples of how to do this that generalize easily. Note that this has become less of a concern lately, which ties in with the “version issues” point made above: recent versions of MATLAB automatically compile code before running it (under the hood and out of sight), which speeds up loops considerably.

 

1. A single loop:

Say you have data from 1,000,000 trials and need to calculate the total number of photons presented in each trial (from the illumination level and the presentation time in milliseconds). If you do this with a loop, it will take a while:

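numTrials = 1e6;
numPhotons = randi(100,[numTrials,1]); %Illumination level in each trial
exposureDuration = randi(1000,[numTrials,1]); %Presentation time in ms
tic
totalExposure = zeros(numTrials,1); %Always preallocate
for ii = 1:numTrials
   totalExposure(ii,1) = numPhotons(ii)*exposureDuration(ii);
end
toc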

If you replace the timed block (everything from tic to toc) with the following version, which has no loop, the result will be the same, but it will be much faster:

tic
totalExposure = numPhotons.*exposureDuration;
toc
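To check the claim that both versions give the same result, one can run them side by side and compare the outputs. This is a minimal sketch; the names loopResult, vecResult, tLoop, and tVec are illustrative, and the number of trials is reduced so that the check runs quickly:

```matlab
numTrials = 1e5; %Smaller than in the text so that the check runs quickly
numPhotons = randi(100,[numTrials,1]);
exposureDuration = randi(1000,[numTrials,1]);

%Loop version
tic
loopResult = zeros(numTrials,1);
for ii = 1:numTrials
    loopResult(ii,1) = numPhotons(ii)*exposureDuration(ii);
end
tLoop = toc;

%Vectorized version
tic
vecResult = numPhotons.*exposureDuration;
tVec = toc;

%Products of whole numbers this small are exact in double precision,
%so an exact comparison with isequal is appropriate here
sameResult = isequal(loopResult,vecResult);
fprintf('Same result: %d, loop: %.4f s, vectorized: %.4f s\n',sameResult,tLoop,tVec);
```

The timings will vary from machine to machine and from release to release, so treat them as a rough guide only.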

 

2. Nested loops:

Say you need a 10,000 by 10,000 matrix that results from multiplying all numbers from 1 to 10,000 with all numbers from 1 to 10,000 (a full cross). You can do this element by element, first going through all rows, then all columns. Note that we always preallocate; otherwise, this would take even longer:

howBig = 1e4;
tic
M = zeros(howBig,howBig); %Always preallocate
for ii = 1:howBig
   for jj = 1:howBig
      M(ii,jj) = ii*jj; %Each ii, jjth entry of M is ii * jj
   end
end
toc

 

 

This works, but it takes a long time.
Now, we vectorize the last dimension (columns), so instead of a nested loop, we have a single loop:

tic
M = zeros(howBig,howBig); %Always preallocate
for ii = 1:howBig
   M(ii,:) = ii*(1:howBig); %Doing each row at once
end
toc

This should already be much faster.
Finally, let’s vectorize both dimensions and get rid of loops altogether:

tic
M = (1:howBig)'*(1:howBig); %Doing it all at once; the assignment creates M, so no preallocation is needed
toc

Note that we have to transpose the first vector to get the outer product. All three versions of the code yield the same result, but you should be able to realize considerable time savings with the last one. These would be even more dramatic if you compared it to a version of the code without preallocation.
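The same full cross can also be written as an elementwise product of a column vector and a row vector. The sketch below compares three ways of doing so on a smaller matrix. The variable names are illustrative, and the last variant assumes release 2016b or later, where elementwise operators expand singleton dimensions automatically:

```matlab
howBig = 1000; %Smaller than in the text to keep memory use low
rowIdx = (1:howBig)'; %Column vector
colIdx = 1:howBig;    %Row vector

M1 = rowIdx*colIdx;                %Outer product via matrix multiplication
M2 = bsxfun(@times,rowIdx,colIdx); %Elementwise product with singleton expansion
M3 = rowIdx.*colIdx;               %Same idea via implicit expansion (R2016b or later)

%All entries are whole numbers that are exact in double precision,
%so the three matrices can be compared exactly
disp(isequal(M1,M2) && isequal(M1,M3))
```

In older releases, the M3 line raises a dimension mismatch error, which is another example of the version issues described earlier.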

 

 

3. Conditionals:

Say you want to add a number (e.g., performance) to a running total, but only if another number (e.g., percentage of trials completed) is big enough. You could do this in a loop, checking the condition each time:

numParticipants = 1e6;
numTrials = randi(100,[numParticipants,1]);
performance = rand(numParticipants,1);
cumPerf = 0;
tic
for ii = 1:numParticipants
    if numTrials(ii,1) > 50
        cumPerf = cumPerf + performance(ii,1);
    end
end
toc





 

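It is straightforward to replace the timed block (everything from tic to toc) with faster code that gets rid of the loop and produces the same result (up to floating-point rounding, because sum may add the numbers in a different order):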

tic
temp = find(numTrials > 50);
cumPerf = sum(performance(temp));
toc
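The call to find is not strictly needed, because the result of the comparison can be used directly as a logical mask. The following minimal sketch compares the loop with the mask version. The names enoughTrials, cumPerfLoop, and cumPerfMask are illustrative, and the number of participants is reduced so that the check runs quickly:

```matlab
numParticipants = 1e5; %Smaller than in the text so that the check runs quickly
numTrials = randi(100,[numParticipants,1]);
performance = rand(numParticipants,1);

%Loop version
cumPerfLoop = 0;
for ii = 1:numParticipants
    if numTrials(ii) > 50
        cumPerfLoop = cumPerfLoop + performance(ii);
    end
end

%Vectorized version with a logical mask instead of find
enoughTrials = numTrials > 50; %Logical column vector, true where the condition holds
cumPerfMask = sum(performance(enoughTrials));

%Compare with a relative tolerance rather than isequal, because the two
%versions may add the numbers in a different order
relDiff = abs(cumPerfLoop - cumPerfMask)/abs(cumPerfLoop);
disp(relDiff < 1e-10)
```

The tolerance of 1e-10 is an arbitrary choice for this sketch; any small value well above double-precision rounding error would serve.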