
Thread Subject:
How to use LDA and PCA as a preprocessing tool before classification ?

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Chaou

Date: 28 May, 2012 10:03:05

Message: 1 of 10

Hello everyone.
Here is what I want to do. I have an Input matrix (120x6), and its assigned classes Target (120x1).
Target matrix looks like : Target = [1 1 2 3 3 1 1 2 3 3 .... and so on].

Well, I want to use PCA and LDA to reduce the dimension of my Input matrix. The projected matrix I obtain will then serve as input to a KNN (nearest neighbours) classification.

Okay. Concerning PCA.
when I run :
[coeff score] = princomp(Input);
I obtain a score matrix (120x6). When I try to classify with the score matrix like this: class = knnclassify(sample,score,Target);
I obtain a class vector composed entirely of 3s, like this: class = [3 3 3 3 3 ..... 3 3 3 3], which is definitely wrong. Why? Shouldn't I use the score matrix?

Concerning LDA, I use the algorithm of Will Dwinnell found here : http://www.mathworks.com/matlabcentral/fileexchange/29673-lda-linear-discriminant-analysis ,
I run this code :
W=LDA(Input,Target);
I obtain a W matrix of 3x7. How should I work with this matrix? I don't know which classes it corresponds to. How can I classify with it?

How should I use the matrices obtained from PCA and LDA processing to perform a (KNN) classification?

Thanks to anyone who can help me !

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Ilya Narsky

Date: 29 May, 2012 14:18:12

Message: 2 of 10

"Chaou " <chaouahmedkhaled@gmail.com> wrote in message
news:jpvigp$fpf$1@newscl01ah.mathworks.com...
> Hello everyone.
> Here is what I want to do. I have an Input matrix (120x6), and its
> assigned classes Target (120x1).
> Target matrix looks like : Target = [1 1 2 3 3 1 1 2 3 3 .... and so on].
>
> Well, I want to use PCA and LDA to reduce the dimension of my Input
> matrix. Therefor, the projected matrix that I'll obtain, will serves me
> for a KNN classification (nearest neighbors).
>
> Okay. Concerning PCA.
> when I run : [coeff score] = princomp(Input);
> I obtain a score matrix (120x6). when I try to classify with the score
> matrix like this : class=knnclassify(sample,score,Target); I obtain a
> class vector composed fully of 3, like this : class=[3 3 3 3 3 ..... 3 3 3
> 3], which is definitely false. Why ? Shouldn't I use the score matrix ?

You have not described what your "sample" is. If these are data in the
original space, you need to transform to the PCA coordinates. princomp
centers the data, and so you need to center your "sample" too:

m = mean(Input);                                  % training mean, saved for new data
[coeff score] = princomp(Input);
centeredSample = bsxfun(@minus,sample,m);         % center sample with the TRAINING mean
centeredSampleInPCAcoords = centeredSample*coeff; % project into PCA coordinates

Now you can call knnclassify(centeredSampleInPCAcoords,score,Target).
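A NumPy sketch of the same centering-and-projection logic, for illustration (princomp is emulated here with an SVD of the centered data; all data are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
Input = rng.normal(size=(120, 6))    # stand-in for the 120x6 training matrix
sample = rng.normal(size=(10, 6))    # stand-in for new observations

# princomp centers the data, so the training mean must be saved
m = Input.mean(axis=0)
centered = Input - m

# principal component coefficients via SVD of the centered data;
# columns of coeff play the role of princomp's coeff output
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
coeff = Vt.T
score = centered @ coeff             # plays the role of princomp's score output

# a new sample must be centered with the TRAINING mean, then projected
centered_sample_in_pca_coords = (sample - m) @ coeff
```

The projected sample and score then go to the classifier in place of the raw data.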

>
> Concerning LDA, I use the algorithm of Will Dwinnell found here :
> http://www.mathworks.com/matlabcentral/fileexchange/29673-lda-linear-discriminant-analysis ,
> I run this code : W=LDA(Input,Target);
> I obtain a W matrix of 3x7. How should I work with this matrix ? I don't
> know the assigned classes of it .... How can I classify with it ?
>
> How should I use matrix obtained from PCA and LDA processing to perform a
> classification (knn) ?

Different implementations of LDA do different things. LDA for classification
typically finds K*(K-1)/2 hyperplanes separating K classes. LDA for variable
transformation typically finds K-1 directions that define new variables for
K classes. I don't know what this particular implementation does.

Function CLASSIFY in Statistics Toolbox does the first thing
(classification). The new (2011b and later) class ClassificationDiscriminant
also targets classification, but variable transformation can be easily done
using the between-class and within-class covariance matrices saved in a
ClassificationDiscriminant object. By the way, since you are using princomp,
you do have access to Statistics Toolbox. -Ilya

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Greg Heath

Date: 29 May, 2012 18:56:24

Message: 3 of 10

On May 28, 6:03

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Chaou

Date: 29 May, 2012 22:27:11

Message: 4 of 10

> Different implementations of LDA do different things. LDA for classification
> typically finds K*(K-1)/2 hyperplanes separating K classes. LDA for variable
> transformation typically finds K-1 directions that define new variables for
> K classes. I don't know what this particular implementation does.
>
> Function CLASSIFY in Statistics Toolbox does the first thing
> (classification). The new (2011b and later) class ClassificationDiscriminant
> also targets classification, but variable transformation can be easily done
> using the between-class and within-class covariance matrices saved in a
> ClassificationDiscriminant object. By the way, since you are using princomp,
> you do have access to Statistics Toolbox. -Ilya

Hello Ilya. Thank you for your help, I really appreciate it !!!

Well, for PCA, I'm trying to show that by combining PCA with KNN (or another classifier) I obtain a better recognition rate than with the classifier alone. I'm far from obtaining such results....

Concerning LDA: yes, I do have access to the Statistics Toolbox. LDA in this Toolbox is performed by running: class = classify(sample,training,group).

Well, I'm not looking for classification through LDA; I want the scores of the training data (just as with PCA: get the scores, then use them with another KNN classifier). How do I obtain the scores of the training data through LDA?

Please, feel free to ask me if I wasn't clear.

I really want to thank you very much !

I deeply appreciate your help. Thanks Ilya :)

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Greg Heath

Date: 30 May, 2012 00:25:07

Message: 5 of 10

Greg Heath <g.heath@verizon.net> wrote in message <048d4679-712b-4644-9e93-647204aebe83@j25g2000yqn.googlegroups.com>...
> On May 28, 6:03

Please read my reply in Google Groups.

Hope this helps.

Greg

Subject: How to use LDA and PCA as a preprocessing tool before classification

From: Ilya Narsky

Date: 30 May, 2012 01:58:18

Message: 6 of 10

On 5/29/2012 6:27 PM, Chaou wrote:
>> Different implementations of LDA do different things. LDA for
>> classification typically finds K*(K-1)/2 hyperplanes separating K
>> classes. LDA for variable transformation typically finds K-1
>> directions that define new variables for K classes. I don't know what
>> this particular implementation does.
>>
>> Function CLASSIFY in Statistics Toolbox does the first thing
>> (classification). The new (2011b and later) class
>> ClassificationDiscriminant also targets classification, but variable
>> transformation can be easily done using the between-class and
>> within-class covariance matrices saved in a ClassificationDiscriminant
>> object. By the way, since you are using princomp, you do have access
>> to Statistics Toolbox. -Ilya
>
> Hello Ilya. Thank you for your help, I really appreciate it !!!
>
> Well, for PCA, I'm traying to proof that by combining PCA to knn (or
> other classifier), I obtain better recognition rate than when using the
> classifier without PCA. I'm far from obtaining such results....
>
> Concerning LDA. Yes, I do have access to Statistics Toolbox. LDA in this
> Toolbox is performed by runing : class = classify(sample,training,group).
>
> Well, I'm not looking for classification through LDA, but I want to have
> the scores of the training data (just like I did for the PCA, I want the
> score, and then using it for another knn classifier). How should I do to
> obtain the score of the training data through LDA ?
>
> Please, feel free to ask me if I wasn't clear.
>
> I really want to thank you very much !
> I deeply appreciate your help. Thanks Ilya :)

First you need to compute within-class and between-class covariance
matrices. This is easy, and you can find their definitions for LDA in
many places. Here is what you do next. In the example below, I assume
that the data has 3 classes and 4 predictors:

[coeff,lambda] = eig(BetweenSigma,WithinSigma,'chol');
[lambda,sorted] = sort(diag(lambda),'descend') % sort by eigenvalues
coeff = coeff(:,sorted);
coeff(:,[3 4]) = [] % get rid of zero eigenvalues

The rank of BetweenSigma is at most K-1 for K classes. EIG does not know
that and returns two very small eigenvalues. You need to get rid of them.

To transform your original data to the new coordinates, do the usual
thing - center and multiply by coeff, just like I did for princomp in my
previous email.
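A sketch of those covariance (scatter) matrices and the eigendecomposition, in NumPy for illustration, using the same 3-class, 4-predictor sizes as above (the generalized eigenproblem is solved here via inv(Sw) @ Sb rather than a Cholesky-based EIG; data are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(90, 4))           # stand-in data: 3 classes, 4 predictors
y = np.repeat([0, 1, 2], 30)

grand_mean = X.mean(axis=0)
p = X.shape[1]
Sw = np.zeros((p, p))                  # within-class scatter
Sb = np.zeros((p, p))                  # between-class scatter
for k in np.unique(y):
    Xk = X[y == k]
    mk = Xk.mean(axis=0)
    Sw += (Xk - mk).T @ (Xk - mk)      # scatter around each class mean
    d = (mk - grand_mean)[:, None]
    Sb += len(Xk) * (d @ d.T)          # class means around the grand mean

# generalized eigenproblem  Sb v = lambda Sw v
w, v = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
w, v = w.real, v.real
order = np.argsort(w)[::-1]            # sort by eigenvalue, descending
w, v = w[order], v[:, order]

coeff = v[:, :2]                       # rank(Sb) <= K-1 = 2: drop zero eigenvalues
new_coords = (X - grand_mean) @ coeff  # center, then project, as with princomp
```

Note that both scatter matrices come out p-by-p, so the retained coeff columns can multiply centered p-column data directly.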

What I described could be perhaps viewed as the most standard "vanilla"
variable transformation by LDA. There are other possibilities. Read
about them in

Ye, J., & Xiong, T. (2006). Computational and Theoretical Analysis of
Null Space and Orthogonal Linear Discriminant Analysis. Journal of
Machine Learning Research, 7, 1183–1204

Zhang, L.-H., Liao, L.-Z., & NG, M. (2010). FAST ALGORITHMS FOR THE
GENERALIZED FOLEY-SAMMON DISCRIMINANT ANALYSIS. SIAM journal on matrix
analysis and applications, 31, 1584-1605

-Ilya

Subject: How to use LDA and PCA as a preprocessing tool before classification ?

From: Greg Heath

Date: 31 May, 2012 09:03:08

Message: 7 of 10

Newsgroups: comp.soft-sys.matlab
From: Greg Heath <g.he...@verizon.net>
Date: Tue, 29 May 2012 11:56:24 -0700 (PDT)
Local: Tues, May 29 2012 2:56 pm
Subject: Re: How to use LDA and PCA as a preprocessing tool before classification ?
On May 28, 6:03 am, "Chaou " <chaouahmedkha...@gmail.com> wrote:

> Hello everyone.
> Here is what I want to do. I have an Input matrix (120x6), and its assigned classes Target (120x1).
> Target matrix looks like : Target = [1 1 2 3 3 1 1 2 3 3 .... and so on].
> Well, I want to use PCA and LDA to reduce the dimension of my Input matrix. Therefor, the projected matrix that I'll obtain, will serves me for a KNN classification (nearest neighbors).
> Okay. Concerning PCA.
> when I run :
> [coeff score] = princomp(Input);
> I obtain a score matrix (120x6). when I try to classify with the score matrix like this : class=knnclassify(sample,score,Target);
> I obtain a class vector composed fully of 3, like this : class=[3 3 3 3 3 ..... 3 3 3 3], which is definitely false. Why ? Shouldn't I use the score matrix ?
> Concerning LDA, I use the algorithm of Will Dwinnell found here :http://www.mathworks.com/matlabcentral/fileexchange/29673-lda-linear-...,
> I run this code :
> W=LDA(Input,Target);
> I obtain a W matrix of 3x7. How should I work with this matrix ? I don't know the assigned classes of it .... How can I classify with it ?
> How should I use matrix obtained from PCA and LDA processing to perform a classification (knn) ?
> Thanks to anyone who can help me !

It is interesting to compare the above with the results of the following simple approach for a c-class linear classifier:

[ I N ] = size(x) % N I-dimensional column vectors
[ 1 N ] = size(class) % Corresponding N integer class indices (1,2,...c)

xa = [ ones(1,N) ; x ]; % Augment with row of ones to accommodate a bias
t = ind2vec(class); % Conversion to a target matrix of N columns of eye(c)

W = t / xa; % Linear classifier weight vector. size(W)= [ 3 7 ]
y = W*xa; % Linear classifier output: min (mse( t - y))
classy = vec2ind(y); % Conversion to class indices

errors = (classy ~= class); % {0,1}
Nerrors = sum(errors)
Pcterr= 100*Nerrors/N
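A rough NumPy translation of this least-squares linear classifier, with made-up data (MATLAB's W = t / xa corresponds to the least-squares solve below):

```python
import numpy as np

rng = np.random.default_rng(2)
c, I, N = 3, 6, 120                      # classes, input dimension, samples
x = rng.normal(size=(I, N))              # N I-dimensional column vectors
cls = rng.integers(1, c + 1, size=N)     # N class indices in 1..c

xa = np.vstack([np.ones((1, N)), x])     # augment with a row of ones (bias)
t = np.eye(c)[:, cls - 1]                # target matrix: N columns of eye(c)

# MATLAB's  W = t / xa  is the least-squares solution of  W @ xa ~ t
W = np.linalg.lstsq(xa.T, t.T, rcond=None)[0].T
y = W @ xa                               # linear classifier output
classy = y.argmax(axis=0) + 1            # back to 1-based class indices

errors = classy != cls
pct_err = 100 * errors.mean()
```

With 6 inputs, 3 classes, and a bias row, W comes out 3x7, matching the size reported for the LDA W above.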

If input dimensionality reduction is desired consider the following four approaches:

1. Standardize x and rank the coefficients of W. Delete the variable corresponding to
the component of W that has the smallest absolute value. Stop when the degradation
in performance is no longer insignificant.
2. Use the function STEPWISE or STEPWISEFIT for each of the classes
3. Use the function SEQUENTIALFS
4. Use the function RELIEFF

This can be extended to a quadratic classifier that is still linear in its coefficients by augmenting x with cross-products and squares.

Hope this helps.

Greg

Subject: How to use LDA and PCA as a preprocessing tool before classification

From: Chaou

Date: 1 Jun, 2012 23:26:22

Message: 8 of 10

Hello Ilya. I computed the within-class and between-class covariance matrices. I don't know if I did it right.
I have 3 groups (3 classes). The first contains a training matrix of 60 rows and 6 columns, the second a training matrix of 30 rows and 6 columns, and the third a training matrix of 60 rows and 6 columns.

I obtained within-class and between-class covariance matrices of 60 rows and 60 columns.

Then I followed your instructions and ran:

[coeff,lambda] = eig(BetweenSigma,WithinSigma,'chol');
[lambda,sorted] = sort(diag(lambda),'descend') % sort by eigenvalues
coeff = coeff(:,sorted);

My coeff has 60 rows and 60 columns.

You suggest multiplying centeredSample by coeff, but I can't: coeff is 60x60 while centeredSample is 30x6 (inner matrix dimensions must agree).

I really don't get this. It worked perfectly with PCA, but not with LDA.

Are my within-class and between-class covariance matrices wrong?

Thank you so much Ilya !!!

Subject: How to use LDA and PCA as a preprocessing tool before classification

From: Ilya Narsky

Date: 2 Jun, 2012 02:08:58

Message: 9 of 10

On 6/1/2012 7:26 PM, Chaou wrote:
> Hello Ilya. I computed the within-class and between-class covariance
> matrices. I don't know if I did it right.
> I have 3 groups (3 classes). The first one contains a training matrice
> of 60 rows and 6 columns. The second one contains a training matrice of
> 30 rows and 6 columns, and the third one contains a training matrices of
> 60 rows and 6 columns.
>
> I obtained a within-class and between-class covariance matrices of 60
> rows and 60 columns.
>
> Then, I followed your instructions, I runed :
> [coeff,lambda] = eig(BetweenSigma,WithinSigma,'chol'); [lambda,sorted] =
> sort(diag(lambda),'descend') % sort by eigenvalues coeff = coeff(:,sorted);
> my coeff has 60 rows and 60 columns.
> You suggest me to multiplie coeff * centeredSample, but I can't. coeff
> has 60 x 60, while centredSample has 30 x 6 (Inner matrix dimensions
> must agree).
>
> I really don't get this. It worked perfectly with PCA, but not with LDA.
>
> Are my within-class and between-class covariance matrices false ?
>
> Thank you so much Ilya !!!

doc cov

A covariance matrix for p predictors is of size p-by-p. Sounds like you
have 6 predictors.

Subject: How to use LDA and PCA as a preprocessing tool before classification

From: Chaou

Date: 2 Jun, 2012 19:33:27

Message: 10 of 10

Okay Ilya. I got my within-class and between-class covariance matrices, size 6x6.

I ran your code :

[coeff,lambda] = eig(sb,sw,'chol');
[lambda,sorted] = sort(diag(lambda),'descend') % sort by eigenvalues
coeff = coeff(:,sorted);
coeff(:,[3 4]) = [] % get rid of zero eigenvalues

I obtained a coeff of 6x4. I don't understand how to call knnclassify with this coeff.

With PCA, I obtained a score matrix (120x6), which I used as the training matrix in knnclassify, like this: knnclassify(centeredSampleInPCAcoords,score,Target).

I know how to transform my Sample matrix: I center it, then multiply by coeff.

I don't understand what to do with coeff. How can coeff train my knn classifier?

Can you please help me for this final part ?
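Putting the two recipes in this thread together: the coeff from LDA is used exactly like the coeff from princomp. Center both the training data and the sample with the training mean, multiply by coeff, and pass the transformed training matrix to the classifier in place of score. With 6 predictors and 3 classes, only the first K-1 = 2 columns of coeff carry signal. A NumPy sketch, with a hand-rolled 1-nearest-neighbour standing in for knnclassify and all data made up:

```python
import numpy as np

rng = np.random.default_rng(3)
Input = rng.normal(size=(120, 6))      # stand-in 120x6 training matrix
Target = np.repeat([1, 2, 3], 40)      # stand-in class labels
sample = rng.normal(size=(5, 6))       # stand-in new observations
coeff = rng.normal(size=(6, 2))        # stand-in LDA coefficients, K-1 = 2 columns kept

m = Input.mean(axis=0)
train_lda = (Input - m) @ coeff        # plays the role of score for the classifier
sample_lda = (sample - m) @ coeff      # sample centered with the TRAINING mean

# minimal 1-nearest-neighbour, standing in for knnclassify(sample_lda, train_lda, Target)
dist2 = ((sample_lda[:, None, :] - train_lda[None, :, :]) ** 2).sum(axis=2)
predicted = Target[dist2.argmin(axis=1)]
```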
