Path: news.mathworks.com!not-for-mail
From: "James Tursa" <aclassyguywithaknotac@hotmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Speeding up sum of squares
Date: Thu, 8 May 2008 04:48:03 +0000 (UTC)
Organization: Boeing
Lines: 51
Message-ID: <fvu0m3$hjm$1@fred.mathworks.com>
References: <fm012450tjat7jgv5hqhtiokid3ketdsgj@4ax.com> <fvq0qr$7tv$1@fred.mathworks.com> <1v2124dbkuukhhprui7a8ss4vhmtgn3jcg@4ax.com> <fvq421$p13$1@fred.mathworks.com> <fvtkg4$foh$1@fred.mathworks.com> <fvtllj$n9b$1@fred.mathworks.com> <fvtvo2$n92$1@fred.mathworks.com>


"James Tursa" <aclassyguywithaknotac@hotmail.com> wrote in
message <fvtvo2$n92$1@fred.mathworks.com>...
> I made the
> following test runs with a 10,000,000 size double array:
> 
> (1) Matrix Multiply, BLAS calls in background
> 
> >> tic;x'*x;toc
> Elapsed time is 0.028451 seconds.
> 

Since it wasn't too difficult, I went ahead and forced the
intermediate values to be register variables, building with
the C compiler that comes with Visual C++ 8.0. The result:

>> tic;sumsquares3(x);toc
Elapsed time is 0.029241 seconds.

This result is almost identical to the x'*x method
(0.029241 versus 0.028451 seconds). So maybe the only trick
behind the scenes here to get the speed improvement is to
use register variables for the sum and loop indexes.

James Tursa

---------------------------------------------------

(6) Mex routine, straightforward using register variables

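#include "mex.h"

/* Returns the sum of squares of a real, full double array. */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    register double sumsquares = 0.0;
    register mwSize i, n;   /* mwSize matches what mxGetNumberOfElements returns */
    double *dp;

    /* Check the input before dereferencing it. */
    if( nrhs != 1 || !mxIsDouble( prhs[0] ) ||
        mxIsComplex( prhs[0] ) || mxIsSparse( prhs[0] ) )
        mexErrMsgTxt( "One real, full double array input required." );

    dp = mxGetPr( prhs[0] );
    n = mxGetNumberOfElements( prhs[0] );
    for( i=0; i<n; i++ )
    {
        sumsquares += (*dp) * (*dp);
        dp++;
    }
    plhs[0] = mxCreateDoubleMatrix( 1, 1, mxREAL );
    *mxGetPr( plhs[0] ) = sumsquares;
}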
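
---------------------------------------------------

For anyone who wants to try the same loop outside of MATLAB, here is a
minimal standalone sketch. The function name sum_squares is hypothetical
(it is not from this thread), and the register keyword is only a hint to
the compiler, so whether it changes the generated code depends on the
compiler and optimization settings. No timings are claimed for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the thread: the same accumulation loop
   as the mex routine, without the MATLAB API calls. */
double sum_squares(const double *x, size_t n)
{
    register double acc = 0.0;   /* register is a hint, not a guarantee */
    register size_t i;

    assert(x != NULL || n == 0); /* an empty array may pass a NULL pointer */
    for (i = 0; i < n; i++)
        acc += x[i] * x[i];
    return acc;
}
```

Calling sum_squares on the array {1.0, 2.0, 3.0} with n = 3 gives 14, and
an empty array gives 0.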