I have been experimenting with large datasets recently (on the order of millions). At this scale you start to feel the implementation overhead of some of Matlab’s built-in functions, such as the AUC calculation for a classifier, which takes up to a few minutes on some of my datasets. To remedy this I have re-implemented it in C++; the resulting code achieves a speed-up of about 50-fold on a 20 million x 13 million dataset.
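To give a rough idea of what such a routine can look like, here is a minimal sketch of a rank-based AUC computation (the Mann-Whitney U formulation) in C++. This is not the implementation benchmarked above. The function name `auc`, the `std::vector` inputs and the `-1.0` return value for the undefined case are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative sketch only: rank-based AUC via the Mann-Whitney U statistic.
// scores: classifier outputs (higher means "more positive").
// labels: true for the positive class, false for the negative class.
// Returns -1.0 when one of the classes is empty, since AUC is undefined there
// (this sentinel is an assumption of the sketch).
double auc(const std::vector<double>& scores, const std::vector<bool>& labels) {
    const std::size_t n = scores.size();

    // Sort indices by ascending score; this O(n log n) step dominates the cost.
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::sort(idx.begin(), idx.end(),
              [&](std::size_t a, std::size_t b) { return scores[a] < scores[b]; });

    double posRankSum = 0.0;  // sum of the (tie-averaged) ranks of the positives
    std::size_t nPos = 0;

    for (std::size_t i = 0; i < n;) {
        // Find the block [i, j) of tied scores.
        std::size_t j = i;
        while (j < n && scores[idx[j]] == scores[idx[i]]) ++j;

        // The block covers 1-based ranks i+1 .. j; every member gets the average.
        const double avgRank = 0.5 * static_cast<double>(i + 1 + j);
        for (std::size_t k = i; k < j; ++k) {
            if (labels[idx[k]]) {
                posRankSum += avgRank;
                ++nPos;
            }
        }
        i = j;
    }

    const std::size_t nNeg = n - nPos;
    if (nPos == 0 || nNeg == 0) return -1.0;

    // U = (rank sum of positives) - nPos * (nPos + 1) / 2;  AUC = U / (nPos * nNeg).
    const double u = posRankSum - 0.5 * static_cast<double>(nPos) * (static_cast<double>(nPos) + 1.0);
    return u / (static_cast<double>(nPos) * static_cast<double>(nNeg));
}
```

Calling `auc({0.1, 0.4, 0.35, 0.8}, {false, false, true, true})` ranks the two positives 2nd and 4th, so U = 6 - 3 = 3 and the AUC is 3/4. Tied scores share an average rank, which makes a tied positive/negative pair count as half a correct ordering.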
The full code can be downloaded at:
I just found out Matlab offers a cool function to generate animated gifs, so I figured I’d give it a spin. The visualization below demos an SVM implementation that I am working on (dual space, slack, RBF kernel with varying bandwidth ratios).
The code to generate the gif goes as follows:
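for sigma = sigmas
    % ... train and plot into figure 1 for this sigma (omitted here) ...
    %WRITE TO GIF
    frame = getframe(1);              % capture the contents of figure 1
    im = frame2im(frame);             % convert the frame to an RGB image
    [imind,cm] = rgb2ind(im,256);     % convert to a 256-colour indexed image
    if sigma == sigmas(1) %First iteration: create the file, loop forever
        imwrite(imind,cm,filename,'gif','Loopcount',inf); % filename: assumed output path variable
    else %Later iterations: append a frame (assumed completion of the truncated snippet)
        imwrite(imind,cm,filename,'gif','WriteMode','append');
    end
end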
For more information on SVMs and kernels, see this post.