<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129</link>
    <title>MATLAB Central Newsreader - Removing duplicates</title>
    <description>Feed for thread: Removing duplicates</description>
    <language>en-us</language>
    <copyright>&#169; 1994-2013 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.nl/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Mon, 16 Apr 2012 02:25:07 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873600</link>
      <author>Mary Thompson</author>
      <description>I was wondering if it would be possible to do the following.&lt;br&gt;
&lt;br&gt;
I have a set of data in one column with ID numbers:&lt;br&gt;
&lt;br&gt;
ID:&lt;br&gt;
22&lt;br&gt;
22&lt;br&gt;
33&lt;br&gt;
33&lt;br&gt;
44&lt;br&gt;
44&lt;br&gt;
55&lt;br&gt;
55&lt;br&gt;
66&lt;br&gt;
66&lt;br&gt;
66&lt;br&gt;
77&lt;br&gt;
77&lt;br&gt;
88&lt;br&gt;
88&lt;br&gt;
88&lt;br&gt;
&lt;br&gt;
The first and second row for each ID should be the same. However, in some cases, such as 66 and 88, the identifier and the data that come with it repeat three times. I would like to remove the middle duplicate. I am not able to do this in Excel and was wondering whether there is any way to check for and remove these rows in MATLAB?&lt;br&gt;
&lt;br&gt;
thanks.</description>
    </item>
    <item>
      <pubDate>Mon, 16 Apr 2012 02:51:38 +0000</pubDate>
      <title>Re: Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873604</link>
      <author>Nasser M. Abbasi</author>
      <description>On 4/15/2012 9:25 PM, Mary Thompson wrote:&lt;br&gt;
&amp;gt; I was wondering if it would be possible to do the following.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; I have a set of data in one column with ID numbers:&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; ID:&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; The first and second row should be the same.&lt;br&gt;
&amp;gt;However, there are scenarios like with 66 and 88 that the identifier and&lt;br&gt;
&amp;gt;the data that comes along with it repeats 3x.  I would like to remove the&lt;br&gt;
&amp;gt;middle duplicate -i am not able to do anything in excel and was wondering&lt;br&gt;
&amp;gt;if there's any type of checking/verifying in matlab?&lt;br&gt;
&lt;br&gt;
What do you mean by "middle duplicate"?&lt;br&gt;
&lt;br&gt;
You can use the unique() command in MATLAB to remove duplicates.&lt;br&gt;
&lt;br&gt;
If you want to start this after some index, say after the second&lt;br&gt;
index, then you can. But you need to be clearer about what you mean&lt;br&gt;
by "middle duplicate".&lt;br&gt;
&lt;br&gt;
Using your data:&lt;br&gt;
&lt;br&gt;
EDU&amp;gt;&amp;gt; unique(A)&lt;br&gt;
&lt;br&gt;
ans =&lt;br&gt;
&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;22&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;33&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;44&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;55&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;66&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;77&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;88&lt;br&gt;
&lt;br&gt;
--Nasser</description>
    </item>
    <item>
      <pubDate>Mon, 16 Apr 2012 03:03:06 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873606</link>
      <author>Siva </author>
      <description>"Mary Thompson" wrote in message &amp;lt;jmfvu3$sl3$1@newscl01ah.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; I was wondering if it would be possible to do the following.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I have a set of data in one column with ID numbers:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; ID:&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; The first and second row should be the same. However, there are scenarios like with 66 and 88 that the identifier and the data that comes along with it repeats 3x.  I would like to remove the middle duplicate -i am not able to do anything in excel and was wondering if there's any type of checking/verifying in matlab?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; thanks.&lt;br&gt;
&lt;br&gt;
I am not sure how big your data sets are, but for small data sets this might work:&lt;br&gt;
&lt;br&gt;
% Assume DATA is a matrix where column 1 contains ID, and the rest of the columns&lt;br&gt;
% contain the associated data.&lt;br&gt;
uniqueIDs = unique(DATA(:, 1)); % identify all the unique IDs&lt;br&gt;
for i = 1:length(uniqueIDs)&lt;br&gt;
&amp;nbsp;&amp;nbsp;% identify the rows corresponding to the i'th unique ID&lt;br&gt;
&amp;nbsp;&amp;nbsp;idx = find(DATA(:, 1) == uniqueIDs(i));&lt;br&gt;
&amp;nbsp;&amp;nbsp;if length(idx) == 3 % check if we have three of the same ID&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;DATA(idx(2), :) = []; % discard the second row for that ID&lt;br&gt;
&amp;nbsp;&amp;nbsp;end&lt;br&gt;
end&lt;br&gt;
&lt;br&gt;
At the end of this code segment, DATA should no longer contain the second row of any ID that had exactly three rows; IDs with any other number of rows are left unchanged.</description>
    </item>
    <item>
      <pubDate>Mon, 16 Apr 2012 04:33:08 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873616</link>
      <author>Roger Stafford</author>
      <description>"Mary Thompson" wrote in message &amp;lt;jmfvu3$sl3$1@newscl01ah.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; .... However, there are scenarios like with 66 and 88 that the identifier and the data that comes along with it repeats 3x.  I would like to remove the middle duplicate&lt;br&gt;
- - - - - - - - - -&lt;br&gt;
&amp;nbsp;&amp;nbsp;If 'ID' is the name of the column vector, do this:&lt;br&gt;
&lt;br&gt;
&amp;nbsp;t = [true;diff(ID)~=0];&lt;br&gt;
&amp;nbsp;ID = ID(t|[diff(t)~=0;true]);&lt;br&gt;
&lt;br&gt;
&amp;nbsp;&amp;nbsp;It should reduce any consecutive sequence of more than two like numbers to just two of them, but a sequence of two is left unchanged.  Is that what you wanted?&lt;br&gt;
&lt;br&gt;
Roger Stafford</description>
    </item>
    <item>
      <pubDate>Tue, 17 Apr 2012 04:48:08 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873787</link>
      <author>Parag Shridhar </author>
      <description>"Mary Thompson" wrote in message &amp;lt;jmfvu3$sl3$1@newscl01ah.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; I was wondering if it would be possible to do the following.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I have a set of data in one column with ID numbers:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; ID:&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; The first and second row should be the same. However, there are scenarios like with 66 and 88 that the identifier and the data that comes along with it repeats 3x.  I would like to remove the middle duplicate -i am not able to do anything in excel and was wondering if there's any type of checking/verifying in matlab?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; thanks.&lt;br&gt;
&lt;br&gt;
Roger Stafford just provided a brilliant solution.&lt;br&gt;
Alternatively, if you want a more general but comparatively slower solution, you can group the numbers using "splitvec" (a file on the MATLAB File Exchange) and then remove the ones you do not want.&lt;br&gt;
&lt;br&gt;
- Parag Shridhar Chandakkar.</description>
    </item>
    <item>
      <pubDate>Wed, 18 Apr 2012 11:12:07 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873962</link>
      <author>venkat vasu</author>
      <description>This code should be helpful for you.&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
a=[1 1 2 2 3 3 3 5 5 5 5 5 7 7 4 4 8 8 9 9 9 4 4 ];&lt;br&gt;
c=length(a);&lt;br&gt;
j=1;&lt;br&gt;
l=1;&lt;br&gt;
while j&amp;lt;c&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;d(l)=a(j);&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;j=j+1;l=l+1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;d(l)=a(j);&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;l=l+1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for k=j+1:c-1 &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if a(j)==a(k)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;continue;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;break;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;j=k;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;br&gt;
end&lt;br&gt;
disp(d);</description>
    </item>
    <item>
      <pubDate>Wed, 18 Apr 2012 11:21:50 +0000</pubDate>
      <title>Re: Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873965</link>
      <author>Nasser M. Abbasi</author>
      <description>On 4/18/2012 6:12 AM, venkat vasu wrote:&lt;br&gt;
&amp;gt; This code surely helpful for you....&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; a=[1 1 2 2 3 3 3 5 5 5 5 5 7 7 4 4 8 8 9 9 9 4 4 ];&lt;br&gt;
&amp;gt; c=length(a);&lt;br&gt;
&amp;gt; j=1;&lt;br&gt;
&amp;gt; l=1;&lt;br&gt;
&amp;gt; while j&amp;lt;c&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt;         d(l)=a(j);&lt;br&gt;
&amp;gt;         j=j+1;l=l+1;&lt;br&gt;
&amp;gt;         d(l)=a(j);&lt;br&gt;
&amp;gt;         l=l+1;&lt;br&gt;
&amp;gt;         for k=j+1:c-1&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt;             if a(j)==a(k)&lt;br&gt;
&amp;gt;                 continue;&lt;br&gt;
&amp;gt;             else&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt;                 break;&lt;br&gt;
&amp;gt;             end&lt;br&gt;
&amp;gt;         end&lt;br&gt;
&amp;gt;         j=k;&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; end&lt;br&gt;
&amp;gt; disp(d);&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
I have not examined your algorithm in detail, but it does&lt;br&gt;
not seem to work on my MATLAB 2012a.&lt;br&gt;
&lt;br&gt;
When I look at 'd' at the end, it prints the same as 'a'.&lt;br&gt;
Maybe there is a bug somewhere?&lt;br&gt;
&lt;br&gt;
By the way, why not just use the MATLAB function&lt;br&gt;
&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;unique(a)&lt;br&gt;
&lt;br&gt;
instead?&lt;br&gt;
&lt;br&gt;
--Nasser</description>
    </item>
    <item>
      <pubDate>Wed, 18 Apr 2012 11:46:08 +0000</pubDate>
      <title>Re: Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873969</link>
      <author>venkat vasu</author>
      <description>&lt;br&gt;
&lt;br&gt;
Yes, we can use the MATLAB function unique(a). It gives the following output:&lt;br&gt;
a=[1 1 2 2 3 3 3 5 5 5 5 5 7 7 4 4 8 8 9 9 9 4 4 ];&lt;br&gt;
&lt;br&gt;
&amp;nbsp;b=1     2     3     4     5     7     8     9;&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
I thought the following output was what is required:&lt;br&gt;
&amp;nbsp;&amp;nbsp;b=1     1     2     2     3     3     5     5     7     7     4     4     8     8     9     9     4     4;&lt;br&gt;
My code gives output like this.</description>
    </item>
    <item>
      <pubDate>Wed, 18 Apr 2012 12:46:10 +0000</pubDate>
      <title>Removing duplicates</title>
      <link>http://www.mathworks.nl/matlabcentral/newsreader/view_thread/319129#873978</link>
      <author>Shanmugam Kannappan</author>
      <description>"Mary Thompson" wrote in message &amp;lt;jmfvu3$sl3$1@newscl01ah.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; I was wondering if it would be possible to do the following.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I have a set of data in one column with ID numbers:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; ID:&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 22&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 33&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 44&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 55&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 66&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 77&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; 88&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; The first and second row should be the same. However, there are scenarios like with 66 and 88 that the identifier and the data that comes along with it repeats 3x.  I would like to remove the middle duplicate -i am not able to do anything in excel and was wondering if there's any type of checking/verifying in matlab?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; thanks.&lt;br&gt;
&lt;br&gt;
Hi,&lt;br&gt;
&lt;br&gt;
Try this and see if it helps.&lt;br&gt;
&lt;br&gt;
id = [2 2 4 4 5 5 5 6 6 7 7 8 8 8 8 9 9 9 9 9 10 10]';&lt;br&gt;
val = (1:length(id))';&lt;br&gt;
x =[id val];&lt;br&gt;
[a index_frst] = unique(x(:,1), 'first');&lt;br&gt;
[b index_last] = unique(x(:,1), 'last');&lt;br&gt;
x1 = zeros(2*length(index_frst), 2);&lt;br&gt;
x1(1:2:end,:) = x(index_frst,:);&lt;br&gt;
x1(2:2:end,:) = x(index_last,:);&lt;br&gt;
disp(x1)&lt;br&gt;
disp(x) % Just for comparison.&lt;br&gt;
&lt;br&gt;
It extracts the rows at the first and last index of each ID.&lt;br&gt;
&lt;br&gt;
HTH,&lt;br&gt;
Shan!</description>
    </item>
  </channel>
</rss>
