
PCA_PP.cpp File Reference

Multidimensional analysis and dimension reduction by Principal Component Analysis and Projection Pursuit.

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cassert>
#include <ctime>


Namespaces

namespace  std

Defines

#define PIX   float
#define inf   1e20

Functions

void usage (string argv0)
 Usage function.

float Evaluate_PI (float **x, unsigned long int lenght)
 Evaluation of the projection index by a chi-square distance.

void base_ortho_plan_finder (float **plan_init, float **new_base, unsigned long int nb_dim)
 Procedure finding an orthogonal basis that contains the projection plane found.

void projec_on_base (float **Y, float **newbase, unsigned long int dim, unsigned long int nb_dim)
 Procedure projecting the data onto a new orthogonal basis.

void projec_on_oldbase (float **Y, float **newbase, unsigned long int dim, unsigned long int nb_dim)
 Procedure projecting the data back onto the original orthogonal basis.

void destructuration (float **Y, unsigned long int dim)
 Destructuring procedure: transforms the distribution so that it becomes a Normal distribution.

double Total (float *a, unsigned long int lenght)
 Returns the sum of an array.

void eigsrt (float d[], float **v, int n)
 Imported from Numerical Recipes. See the Numerical Recipes documentation for further details.

void jacobi (float **a, int n, float d[], float **v, int *nrot)
 Imported from Numerical Recipes. See the Numerical Recipes documentation for further details.

float gasdev (long *idum)
 Imported from Numerical Recipes: Normal(0,1) distributed random number generator. See the Numerical Recipes documentation for further details.

void gaussj (float **a, int n, float **b, int m)
 Imported from Numerical Recipes: Gauss-Jordan linear equation solver. See the Numerical Recipes documentation for further details.

int main (int argc, char *argv[])
 Main procedure where PCA and PP analysis are chained.


Detailed Description

Multidimensional analysis and dimension reduction by Principal Component Analysis and Projection Pursuit.

This procedure :

Author:
P.Héas (IRIT / ENSAE / DLR / CNES)
Since:
January 2005

Definition in file PCA_PP.cpp.


Function Documentation

void base_ortho_plan_finder (float **plan_init, float **new_base, unsigned long int nb_dim)
 

Procedure finding an orthogonal basis that contains the projection plane found.

The search is based on Gauss-Jordan solution of linear equations and uses a Numerical Recipes function.

Parameters:
plan_init  input plane that will be contained in the orthogonal basis
new_base  output orthogonal basis
nb_dim  number of dimensions of the basis

Definition at line 741 of file PCA_PP.cpp.
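
The idea of completing a plane into a full orthogonal basis can be illustrated with a short sketch. Note that it swaps in Gram-Schmidt orthonormalization instead of the Gauss-Jordan approach used by base_ortho_plan_finder, and that the names Vec, dot, orthonormalize_against and complete_basis are hypothetical and do not appear in PCA_PP.cpp. It assumes the two input vectors are linearly independent.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two vectors of equal length.
double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Hypothetical helper (not from PCA_PP.cpp): removes from `v` its components
// along the orthonormal vectors in `basis`, then normalizes what is left.
// Returns false when the remainder is numerically zero.
bool orthonormalize_against(const std::vector<Vec>& basis, Vec& v) {
    for (const Vec& b : basis) {
        const double c = dot(b, v);
        for (std::size_t i = 0; i < v.size(); ++i) v[i] -= c * b[i];
    }
    const double norm = std::sqrt(dot(v, v));
    if (norm < 1e-9) return false;
    for (double& x : v) x /= norm;
    return true;
}

// Builds an orthonormal basis of R^n whose first two vectors span the plane
// defined by `u` and `v` (assumed linearly independent).
std::vector<Vec> complete_basis(Vec u, Vec v) {
    const std::size_t n = u.size();
    std::vector<Vec> basis;
    bool ok = orthonormalize_against(basis, u);
    assert(ok);
    basis.push_back(u);
    ok = orthonormalize_against(basis, v);
    assert(ok);
    (void)ok;
    basis.push_back(v);
    // Try each canonical axis in turn; keep those that add a new direction.
    for (std::size_t k = 0; k < n && basis.size() < n; ++k) {
        Vec e(n, 0.0);
        e[k] = 1.0;
        if (orthonormalize_against(basis, e)) basis.push_back(e);
    }
    return basis;
}
```

Calling complete_basis with two vectors of R^4 returns four mutually orthogonal unit vectors, the first two of which span the input plane.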

void destructuration (float **Y, unsigned long int dim)
 

Destructuring procedure: transforms the distribution so that it becomes a Normal distribution.

The procedure uses a Numerical Recipes function.

Parameters:
Y  input bidimensional data to modify
dim  number of data samples

Definition at line 858 of file PCA_PP.cpp.

float Evaluate_PI (float **x, unsigned long int dim)
 

Evaluation of the projection index by a chi-square distance.

The non-Gaussianity of the distribution in the plane is evaluated using a chi-square projection index: a distance between the data distribution and a normalized Gaussian distribution. The integration required to evaluate this distance (see Theoretical note) is performed by dividing the integration domain into 432 small boxes. For further details, we refer the reader to the thesis of Posse (1993).

Parameters:
x  bidimensional data
dim  number of data samples
Returns:
projection index

Definition at line 678 of file PCA_PP.cpp.
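
As an illustration of a box-based chi-square index, the sketch below compares the fraction of samples falling in each box with the probability a standard bivariate Gaussian assigns to it. It is a simplified version under stated assumptions, not the 432-box scheme of Evaluate_PI: the boxes are rings of equal width split into equal angular sectors, and the names Point and chi2_index and the parameters n_rings, n_sectors and r_max are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::array<double, 2>;

// Simplified chi-square index (illustrative only). The plane is cut into
// `n_rings` rings of equal width up to `r_max`, plus one unbounded outer
// ring; every ring is split into `n_sectors` equal angular sectors.
double chi2_index(const std::vector<Point>& pts, int n_rings, int n_sectors,
                  double r_max) {
    assert(!pts.empty() && n_rings > 0 && n_sectors > 0 && r_max > 0.0);
    const double pi = std::acos(-1.0);
    const int total_rings = n_rings + 1;
    const double width = r_max / n_rings;
    std::vector<double> count(
        static_cast<std::size_t>(total_rings) * static_cast<std::size_t>(n_sectors), 0.0);

    // Count the samples falling in each (ring, sector) box.
    for (const Point& p : pts) {
        const double r = std::hypot(p[0], p[1]);
        const int ring = (r >= r_max) ? n_rings : static_cast<int>(r / width);
        const double angle = std::atan2(p[1], p[0]) + pi;  // in [0, 2*pi]
        int sector = static_cast<int>(angle / (2.0 * pi) * n_sectors);
        if (sector >= n_sectors) sector = n_sectors - 1;
        count[static_cast<std::size_t>(ring * n_sectors + sector)] += 1.0;
    }

    // For a standard bivariate Gaussian, P(radius <= r) = 1 - exp(-r^2 / 2)
    // and the angle is uniform, so each box probability has a closed form.
    const double n = static_cast<double>(pts.size());
    double index = 0.0;
    for (int ring = 0; ring < total_rings; ++ring) {
        const double r_in = ring * width;
        const double r_out = (ring + 1) * width;
        const double tail_out =
            (ring < n_rings) ? std::exp(-0.5 * r_out * r_out) : 0.0;
        const double expected =
            (std::exp(-0.5 * r_in * r_in) - tail_out) / n_sectors;
        for (int sector = 0; sector < n_sectors; ++sector) {
            const double observed =
                count[static_cast<std::size_t>(ring * n_sectors + sector)] / n;
            const double diff = observed - expected;
            index += diff * diff / expected;
        }
    }
    return index;
}
```

The index is zero when the box frequencies match the Gaussian probabilities exactly and grows as the projected data concentrates in boxes the Gaussian considers unlikely.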

int main (int argc, char *argv[])
 

Main procedure where PCA and PP analysis are chained.

After a step of memory allocation and initialisation, the input data is read according to the cutting and sampling input parameters.
Then the data is analysed by PCA. The analysis relies on finding the eigenvalues and eigenvectors of the data covariance matrix. The eigenvalue decomposition is performed using the Numerical Recipes functions "jacobi(...)" and "eigsrt(...)". The principal components are then saved to disk.
The PP optimisation procedure is then launched on a reduced space spanned by the principal components that together hold a user-provided percentage of the signal energy (input parameter "PCAEnergiePercentage"):

  • 1) First, a plane (bidimensional projection) is initialized randomly and the data is projected onto it. The non-Gaussianity of the distribution within this plane is evaluated by the chi-square projection index function "Evaluate_PI(...)".
  • 2) Then, the search for the projection index minima is performed simultaneously by 4 different sub-routines, using an iterative algorithm. For each sub-routine, 2 different moves in the feature space are generated randomly; they reveal, in parallel, 2 different planes, onto which the data is projected and for which the non-Gaussianity of the projections is evaluated; the better candidate of the two moves is kept for each of the 4 sub-routines and the procedure is iterated.
    For 3 of the 4 sub-routines, the scale of the search shrinks as the algorithm iterates: the size of the moves is divided by a given factor (user provided: input parameter "factor_decreasing"). The search scale changes when the number of iterations reaches a limit fixed by the user (input parameter "nb_maximum_of_iterations"). The remaining sub-routine always searches at the same scale. Once the scale has changed, the algorithm is reiterated with the new search scale. The algorithm stops iterating when the size of the moves falls below a given value (user provided: input parameter "epsilon").
  • 3) The best projection plane found is then saved to disk and the structure found in the projection is removed: the data is projected onto an orthogonal basis comprising the bidimensional plane; the distribution in this plane is then transformed into a normalized Gaussian distribution and the data is projected back into the original feature space. Note that even though the data has been modified by structure removals, the saved projections are not affected because they are generated from the original data.
    Step 2) is then reiterated a given number of times (user provided: input parameter "rand_departure").
  • 4) As long as the projection index limit provided by the user (input parameter "quantile") is not reached, that is to say while interesting projections remain in the feature space, steps 2) and 3) are iterated.
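
The shrinking-scale search of step 2) can be sketched as follows. This is a minimal illustration of one search thread only, not the four parallel sub-routines of PCA_PP.cpp. It assumes that the better of the two random moves replaces the current point only when it lowers the index, which the description above does not state. The names SearchParams and shrinking_random_search, the initial_scale field and the seed argument are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical parameter bundle mirroring the user inputs named in the text.
struct SearchParams {
    double initial_scale;      // starting size of the random moves (assumed)
    double factor_decreasing;  // divisor applied to the scale, must be > 1
    int nb_max_iterations;     // iterations spent at each scale
    double epsilon;            // stop once the scale falls below this value
};

// Minimizes `index` by random moves whose size shrinks over time.
std::vector<double> shrinking_random_search(
    const std::function<double(const std::vector<double>&)>& index,
    std::vector<double> current, const SearchParams& p, std::uint32_t seed) {
    assert(p.initial_scale > 0.0 && p.factor_decreasing > 1.0 && p.epsilon > 0.0);
    std::mt19937 rng(seed);
    // Uniform draw in [-1, 1) built directly from the raw 32-bit output.
    auto uniform = [&rng]() {
        return 2.0 * (static_cast<double>(rng()) / 4294967296.0) - 1.0;
    };

    double current_value = index(current);
    for (double scale = p.initial_scale; scale >= p.epsilon;
         scale /= p.factor_decreasing) {
        for (int it = 0; it < p.nb_max_iterations; ++it) {
            // Generate two random moves and remember the better one.
            std::vector<double> best_candidate;
            double best_value = 0.0;
            for (int move = 0; move < 2; ++move) {
                std::vector<double> candidate = current;
                for (double& c : candidate) c += scale * uniform();
                const double value = index(candidate);
                if (move == 0 || value < best_value) {
                    best_value = value;
                    best_candidate = candidate;
                }
            }
            // Assumption: accept the move only if it improves the index.
            if (best_value < current_value) {
                current_value = best_value;
                current = best_candidate;
            }
        }
    }
    return current;
}
```

In the program itself the quantity being optimised would be the projection index of the plane, while here any function of a parameter vector can stand in for it.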
Parameters:
argv[1]  Text file name of input image bands
argv[2]  Text file name of output image bands created by PP
argv[3]  Text file name of output image bands created by PCA, which represent a given energy percentage
argv[4]  Number of lines of the images
argv[5]  Number of columns of the images
argv[6]  Signal energy percentage to preserve with PCA
argv[7]  Quantile: projection index limit at which the search stops
Optional parameters for selecting a spatial window:
Parameters:
argv[8]  Column offset when reading the image file
argv[9]  Line offset when reading the image file
argv[10]  Number of columns of the image file
Optional parameters for Projection Pursuit:
Parameters:
argv[11]  Number of rand_departures
argv[12]  Maximum number of iterations
argv[13]  Precision of the convergence (epsilon)
argv[14]  Factor decreasing the scale of the search at each iteration

Definition at line 127 of file PCA_PP.cpp.
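
The selection of principal components by energy percentage described for main can be sketched as below. It assumes that "signal energy" means the share of the total eigenvalue sum, which is the usual PCA convention but is not stated in this page, and that the eigenvalues are already sorted in decreasing order (as eigsrt would leave them). The function name components_for_energy is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the smallest number of leading components whose eigenvalues sum
// to at least `energy_percentage` percent of the total eigenvalue sum.
// `sorted_eigenvalues` is assumed non-negative and sorted in decreasing order.
std::size_t components_for_energy(const std::vector<double>& sorted_eigenvalues,
                                  double energy_percentage) {
    assert(energy_percentage >= 0.0 && energy_percentage <= 100.0);
    double total = 0.0;
    for (double ev : sorted_eigenvalues) total += ev;
    if (total <= 0.0) return 0;  // no energy to preserve

    double cumulative = 0.0;
    for (std::size_t k = 0; k < sorted_eigenvalues.size(); ++k) {
        cumulative += sorted_eigenvalues[k];
        if (100.0 * cumulative / total >= energy_percentage) return k + 1;
    }
    return sorted_eigenvalues.size();
}
```

For eigenvalues 5, 3, 1.5 and 0.5, asking for 80 percent keeps the first two components, since 5 + 3 is 80 percent of the total of 10.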

void projec_on_base (float **Y, float **newbase, unsigned long int dim, unsigned long int nb_dim)
 

Procedure projecting the data onto a new orthogonal basis.

Parameters:
Y  input data to project
newbase  basis used for the projection
dim  number of data samples
nb_dim  number of dimensions of the basis

Definition at line 797 of file PCA_PP.cpp.

void projec_on_oldbase (float **Y, float **newbase, unsigned long int dim, unsigned long int nb_dim)
 

Procedure projecting the data back onto the original orthogonal basis.

The projection is based on a matrix inversion and uses a Numerical Recipes function.

Parameters:
Y  input data, expressed in the new coordinate system, to project
newbase  basis that was used for the projection
dim  number of data samples
nb_dim  number of dimensions of the basis

Definition at line 824 of file PCA_PP.cpp.
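
The change of basis and its inverse can be illustrated with a short round-trip sketch. It assumes an orthonormal basis, for which the inverse is simply the transpose, whereas projec_on_oldbase performs a general matrix inversion. The storage layout (one sample per row, one basis vector per row) and the names Matrix, project_on_basis and project_back are assumptions made for the example and may not match PCA_PP.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Expresses each sample (a row of `samples`) in the basis whose vectors are
// the rows of `basis`: coordinate k is the dot product with basis vector k.
Matrix project_on_basis(const Matrix& samples, const Matrix& basis) {
    Matrix out(samples.size(), std::vector<double>(basis.size(), 0.0));
    for (std::size_t i = 0; i < samples.size(); ++i) {
        for (std::size_t k = 0; k < basis.size(); ++k) {
            assert(samples[i].size() == basis[k].size());
            for (std::size_t j = 0; j < basis[k].size(); ++j)
                out[i][k] += samples[i][j] * basis[k][j];
        }
    }
    return out;
}

// Maps coordinates back to the original axes. Assumes `basis` is
// orthonormal, so that the inverse transform is the transpose.
Matrix project_back(const Matrix& coords, const Matrix& basis) {
    const std::size_t n = basis.empty() ? 0 : basis[0].size();
    Matrix out(coords.size(), std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < coords.size(); ++i) {
        assert(coords[i].size() == basis.size());
        for (std::size_t k = 0; k < basis.size(); ++k)
            for (std::size_t j = 0; j < n; ++j)
                out[i][j] += coords[i][k] * basis[k][j];
    }
    return out;
}
```

Projecting a sample and then projecting it back returns the original sample, which is what lets the structure removal step work in the new basis and then return to the original feature space.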

void usage (string argv0)
 

Usage function.

Parameters:
argv0  String containing the input/output files and parameters

Definition at line 35 of file PCA_PP.cpp.


Generated on Thu Feb 17 10:58:49 2005 for Dimension Reduction by Principal Component Analysis and Projection Pursuit by doxygen1.2.14 written by Dimitri van Heesch, © 1997-2002