Thread Subject:
give each lab one copy of my structure array ?

Subject: give each lab one copy of my structure array ?

From: Juliette Salexa

Date: 21 Dec, 2011 04:53:08

Message: 1 of 9

Hello!

I've read through the parallel computing toolbox documentation several times now and am still very confused about something ='(

I have a loop like this, where B is a 1x12 structure array that uses a very large amount of memory.

A = zeros(12,12);
for i = 1:12
    parfor j = 1:12
        A(i,j) = someFunction(B(i).someField, B(j).someField);
    end
end

I am watching my resource monitor while running this program, and it looks like this:
1. The main client is using about 2GB of memory all the time
2. The 12 workers each use about 1.5GB of memory at the beginning (but it takes a very long time for each of them to occupy all that memory).
3. Then every time the index i changes, the 1.5GB of memory being taken up by each worker gets freed ... **only to be occupied again the next second**

Since this deleting and reoccupying of memory is taking up a LONG time ... I'd prefer if all 12 workers were just given the complete structure array (named B in this example) to begin with !!

RAM is not a problem because I have 72GB of RAM ... I'd just like all the 12 workers to have the full structure array (named B in this example) the whole time rather than shuffling the data around so much.

Is there a way to do this ?
Should I be using SPMD instead ??

I've read the PARFOR and SPMD documentations quite a few times but most of it doesn't make sense to me ='(

Thank you

Subject: give each lab one copy of my structure array ?

From: Edric M Ellis

Date: 3 Jan, 2012 08:24:31

Message: 2 of 9

"Juliette Salexa" <juliette.physicist@gmail.com> writes:
> I've read through the parallel computing toolbox documentation several
> times now and am still very confused about something ='(
>
> I have a loop like this, where B is a 1x12 structure array that uses a
> very large amount of memory.
>
> A=zeros(12,12)
> for i=1:12
> parfor j = 1:12
> A(i,j)=someFunction(B(i).someField,B(j).someField);
> end
> end
>
> I am watching my resource monitor while running this program, and it looks like this:
> 1. The main client is using about 2GB of memory all the time
> 2. The 12 workers each use about 1.5GB of memory at the beginning (but
> it takes a very long time for each of them to occupy all that memory).
> 3. Then every time the index i changes, the 1.5GB of memory being
> taken up by each worker gets freed ... **only to be occupied again the
> next second**
>
> Since this deleting and reoccupying of memory is taking up a LONG time
> ... I'd prefer if all 12 workers were just given the complete
> structure array (named B in this example) to begin with !!

The problem is that each new PARFOR loop is sending the whole of B out
to each worker each time - because it doesn't know that you haven't
changed B on the client in between invocations.

The simplest way to address this is probably to flatten the multiple
PARFOR loops into a single one, like this:

parfor idx = 1:(12*12)
  [i, j] = ind2sub([12, 12], idx);
  A(idx)=someFunction(B(i).someField,B(j).someField);
end

Note that I've used IND2SUB to get back the i/j subscripts, but assigned
into A using the linear index 'idx' to ensure that PARFOR still knows
how to 'slice' A.
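
A minimal, self-contained sketch of the flattened loop, for checking it against the nested version. The sizes are made up, and dot() is only a stand-in for someFunction (an assumption for illustration, not the original function):

```matlab
% Small stand-in for the large structure array B (hypothetical data).
n = 4;
B = struct('someField', cell(1, n));
for k = 1:n
    B(k).someField = (1:5) * k;
end

% Flattened loop: one PARFOR over all n*n (i,j) pairs.
A = zeros(n, n);                       % preallocate so A keeps its n-by-n shape
parfor idx = 1:n*n
    [i, j] = ind2sub([n, n], idx);     % recover the subscripts from the linear index
    % dot() stands in for someFunction here
    A(idx) = dot(B(i).someField, B(j).someField);
end

% Reference result from the original nested loops, run serially.
Aref = zeros(n, n);
for i = 1:n
    for j = 1:n
        Aref(i, j) = dot(B(i).someField, B(j).someField);
    end
end
sameResult = isequal(A, Aref);
```

Because A is indexed only by the loop variable idx, it stays a sliced output variable, while B is sent to the workers once for the single PARFOR loop.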

Cheers,

Edric.

Subject: give each lab one copy of my structure array ?

From: Juliette Salexa

Date: 5 Jan, 2012 17:04:07

Message: 3 of 9

Thanks VERY much Edric!

Your suggestion for combining the two loops is brilliant, and I can see it being useful in many of my other applications as well.

I believe your method worked, in that each lab had a copy of the >1 GB structure array, and didn't keep deleting it and recreating it over and over again.

Now that I've been shown your code: WorkerObjWrapper.m
I can see that it would be more efficient if my structure array B is not recreated on 12 different labs, but is rather just stored in one place and accessed from each of the 12 labs.

This way I'm only using 1.5GB of RAM, and not 1.5*13 GB of RAM.

I tried doing:
=================
matlabpool('open','12')
Bpersistent=WorkerObjWrapper(B);
parfor idx = 1:12^2
    [i, j] = ind2sub([12,12], idx);
    A(idx) = 1-someFunction(Bpersistent(i).someField,Bpersistent(j).someField);
end
=================
But then I got the error messages:

Error using parallel_function (line 598)
Index exceeds matrix dimensions.
Error stack:
(No remote error stack)
=================
Which I did not get when I did it without WorkerObjWrapper.

The other thing is that it appears that the 1.5GB array is still being copied on to each lab even with the above code that uses WorkerObjWrapper.

Am I using your FEX code wrong ?

Cheers,
Juliette

Subject: give each lab one copy of my structure array ?

From: Edric M Ellis

Date: 6 Jan, 2012 08:13:06

Message: 4 of 9

"Juliette Salexa" <juliette.physicist@gmail.com> writes:

> Now that I've been shown your code: WorkerObjWrapper.m I can see that it would
> be more efficient if my structure array B is not recreated on 12 different labs,
> but is rather just stored in one place and accessed from each of the 12 labs.
>
> This way I'm only using 1.5GB of RAM, and not 1.5*13 GB of RAM.
>
> I tried doing:
> =================
> matlabpool('open','12')
> Bpersistent=WorkerObjWrapper(B);
> parfor idx= 1:12^2
> [i, j] = ind2sub([12,12], idx);
> A(idx)=1-someFunction(Bpersistent(i).someField,Bpersistent(j).someField);
> end
> =================
> But then I got the error messages:
>
> Error using parallel_function (line 598)
> Index exceeds matrix dimensions.
> Error stack:
> (No remote error stack)
> =================
> Which I did not get when I did it without WorkerObjWrapper.

When using the WorkerObjWrapper, you need to pick out the "Value" field
when you're on the workers. In other words,

Bpersistent=WorkerObjWrapper(B);
parfor ...
  v = Bpersistent.Value;
  % use 'v'
end

Unfortunately, even with WorkerObjWrapper it's not possible to make it
so that there's only a single copy of 'B' across all workers.
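
A self-contained sketch of the same "pick out Value on the worker" pattern. It assumes MATLAB R2015b or later with Parallel Computing Toolbox, where the built-in parallel.pool.Constant provides similar functionality to WorkerObjWrapper. It is a swapped-in class, not the FEX code itself, and dot() and the small B are made up for illustration:

```matlab
% Assumption: R2015b+ with Parallel Computing Toolbox (parallel.pool.Constant).
n = 4;
B = struct('someField', cell(1, n));   % hypothetical small stand-in for B
for k = 1:n
    B(k).someField = (1:5) * k;
end

Bconst = parallel.pool.Constant(B);    % B is copied to each worker once

A = zeros(n, n);
parfor idx = 1:n*n
    v = Bconst.Value;                  % index into v, not into the wrapper itself
    [i, j] = ind2sub([n, n], idx);
    A(idx) = dot(v(i).someField, v(j).someField);   % dot() stands in for someFunction
end
```

Indexing the wrapper object directly, as in Bpersistent(i), addresses the wrapper rather than the wrapped array, which is consistent with the "Index exceeds matrix dimensions" error quoted above.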

Cheers,

Edric.

Subject: give each lab one copy of my structure array ?

From: Juliette Salexa

Date: 14 Jul, 2012 02:49:39

Message: 5 of 9

Hello,
I'm back to this question now unfortunately ='(

I'm using the suggestion of Edric M Ellis from 3 Jan 2012 (a couple posts above this):

parfor idx = 1:(12*12)
  [i, j] = ind2sub([12, 12], idx);
  A(idx)=someFunction(B(i).someField,B(j).someField);
end

When the total data in the structure array B was about 1.5GB, the 12 workers grabbed a copy of this 1.5GB structure array and applied calculations to it in parallel.

Now I have 3.7GB of data and this is a disaster...

12 workers opened, however, my task manager looks like this:

1. Main matlab client is using 11% CPU, and about 3.7GB of memory (which is fluctuating, since calculations are happening)
2. 12 matlab workers are open but using 0% CPU, and 142MB of memory (and the amount of memory being used is NOT changing AT ALL )

Basically the whole calculation seems to be happening only on the main client and the workers don't seem to be doing anything ='(

I certainly have enough memory for each worker to have this 3.7GB array.
Any suggestions ?

Subject: give each lab one copy of my structure array ?

From: Edric M Ellis

Date: 16 Jul, 2012 08:09:59

Message: 6 of 9

"Juliette Salexa" <juliette.physicist@gmail.com> writes:

> I'm back to this question now unfortunately ='(
>
> I'm using the suggestion of Edric M Ellis from 3 Jan 2012 (a couple posts above this):
>
> parfor idx = 1:(12*12)
> [i, j] = ind2sub([12, 12], idx);
> A(idx)=someFunction(B(i).someField,B(j).someField);
> end
>
> When the total data in the structure array B was about 1.5GB, the 12
> workers grabbed a copy of this 1.5GB structure array and applied
> calculations to it in parallel.
>
> Now I have 3.7GB of data and this is a disaster...

Unfortunately, it's not currently supported to transfer that much data
into a PARFOR loop.

> 12 workers opened, however, my task manager looks like this:
>
> 1. Main matlab client is using 11% CPU, and about 3.7GB of memory
> (which is fluctuating, since calculations are happening)
> 2. 12 matlab workers are open but using 0% CPU, and 142MB of memory
> (and the amount of memory being used is NOT changing AT ALL )
>
> Basically the whole calculation seems to be happening only on the main
> client and the workers don't seem to be doing anything ='(

That can happen if the data to be sent to the PARFOR loop cannot be
transferred because it is too large. You should have seen a warning
about this.

If your 'B' array is constant and the same value is needed on each
worker, what I'd suggest is using my Worker Object Wrapper

<http://www.mathworks.com/matlabcentral/fileexchange/31972-worker-object-wrapper>

like so:

% Build B on the workers so we never need to transfer it
spmd
    if labindex == 1
        B = <generate B>;
        labBroadcast(1, B);
    else
        B = labBroadcast(1);
    end
end
% Build a WorkerObjWrapper to wrap the Composite 'B'
Bw = WorkerObjWrapper(B);

% Use 'Bw.Value' inside PARFOR
parfor ...
    B = Bw.Value;
    <use B>;
end

While it's not ideal that you need to use this workaround for now, it
has the advantage that you never need to transfer 'B' from client to
workers. In particular, if you run multiple PARFOR loops, 'B' will be
there already.

If you have an encapsulated function to generate 'B', you can actually
build the WorkerObjWrapper directly like this:

Bw = WorkerObjWrapper( @buildBFcn, {} );
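
A sketch of the same build-on-the-workers idea, assuming MATLAB R2015b or later, where the built-in parallel.pool.Constant also accepts a function handle and evaluates it on each worker. The builder below is a hypothetical stand-in for buildBFcn:

```matlab
% Assumption: R2015b+ with Parallel Computing Toolbox (parallel.pool.Constant).
n = 4;

% Hypothetical builder standing in for buildBFcn: returns a 1-by-n struct array.
buildB = @() struct('someField', num2cell((1:n) * 10));

% The handle is evaluated on each worker, so B itself is never sent from the client.
Bconst = parallel.pool.Constant(buildB);

total = zeros(1, n);
parfor k = 1:n
    B = Bconst.Value;                  % worker-local copy built by buildB
    total(k) = B(k).someField;
end
```

Only the function handle travels to the workers, so the size of the resulting array does not affect the amount of data sent into the PARFOR loop.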

Hope this helps,

Edric.

Subject: give each lab one copy of my structure array ?

From: Juliette Salexa

Date: 16 Jul, 2012 19:29:21

Message: 7 of 9

Hi Edric,
Thank you VERY much for your very useful response once again.

It's a shame that MATLAB doesn't let us do parfor with 3.5GB arrays. I paid a lot for a computer with 128GB of RAM !
In fact, ideally, each lab would be able to run the computation on the same 3.5GB (constant) array rather than making its own copy to work with.

Anyway, I have one question about your code:

parfor idx = 1:length(B)^2
    B = Bw.Value;
    [i, j] = ind2sub([length(B), length(B)], idx);
    A(idx) = someFunction(B(i).someField,B(j).someField);
end

Why does B=Bw.Value have to be done at the beginning of the loop, at each iteration ?
Can it not be done just once, before the whole loop ?

Thanks so much once again !

Subject: give each lab one copy of my structure array ?

From: Juliette Salexa

Date: 16 Jul, 2012 19:58:07

Message: 8 of 9

Hi again,
Further to my message directly above this one:

I wrapped tic; and toc; around the parfor portion, and I wrapped "open" and "close" matlabpool commands around your entire code, and got this error:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Starting matlabpool using the 'local' configuration ... connected to 12 labs.
Elapsed time is 1.759041 seconds.
Sending a stop signal to all the labs ... stopped.

Warning: The following error was caught while executing 'WorkerObjWrapper' class destructor:
Error detected on lab(s) 4
> In createMatrixInParallelWhenImpossibleToPARFORsuchAbigArray at 16 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Line 16 is:
Bwrapped = WorkerObjWrapper(B);

Also, the fact that the matlabpool('close') command got reached is surprising to me, because there should be about 5 days worth of work to do within the PARFOR portion, which only finished in 1.76 seconds (obviously the stuff in the PARFOR didn't actually happen !)

Thank you so much again !

Subject: give each lab one copy of my structure array ?

From: Bruno Luong

Date: 16 Jul, 2012 22:14:07

Message: 9 of 9

This FEX submission might be of interest to you:
http://www.mathworks.fr/matlabcentral/fileexchange/28572

Bruno
