Adaptive Algorithms for PCA

Review of Principal Component Analysis
PCA solves the matrix equation R V = V \Lambda; equivalently, it diagonalizes the covariance matrix R of the input signal: V^T R V = \Lambda.
The eigenvectors become eigenfilters, and they span the frequency spectrum of the input signal. This is true only if the dimensionality of the data is very high (from the spectral decomposition property of eigenvectors).
The eigenvectors, being orthonormal and hence linearly independent, form an optimal basis for the vector space. This is the motivation for using eigenvectors in data compression (the Karhunen-Loeve Transform, KLT).

How do we solve PCA using adaptive structures? Although there are many numerical techniques (e.g., the SVD) to solve PCA, real-time applications need iterative algorithms that solve PCA using one sample of data at a time.
MATLAB programmers simply use the eig function.

METHOD I - Minimization of the mean-square reconstruction error
The desired response is the input itself. An N-dimensional input x is projected onto M < N outputs, y = W^T x, and the reconstruction is \hat{x} = W y.
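As a concrete baseline, here is a hypothetical numpy sketch (not from the slides) of batch PCA via eigendecomposition of the covariance, the numpy counterpart of MATLAB's eig, followed by the reconstruction \hat{x} = W y with y = W^T x:

```python
# Hypothetical numpy sketch: batch PCA by eigendecomposition of the
# covariance matrix, then reconstruction with the top-M eigenvectors.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
X = rng.standard_normal((2000, 5)) @ A       # correlated samples in rows
X -= X.mean(axis=0)                          # zero mean, as PCA assumes

R = (X.T @ X) / len(X)                       # covariance estimate
evals, V = np.linalg.eigh(R)                 # eigh, since R is symmetric
order = np.argsort(evals)[::-1]              # sort descending
evals, V = evals[order], V[:, order]

# V^T R V is (numerically) diagonal: PCA diagonalizes the covariance
D = V.T @ R @ V
print(np.max(np.abs(D - np.diag(np.diag(D)))))  # ~0

# Keep the top M eigenvectors: y = W^T x, xhat = W y
M = 2
W = V[:, :M]
Xhat = (X @ W) @ W.T
mse = np.mean(np.sum((X - Xhat) ** 2, axis=1))
print(mse, evals[M:].sum())                  # MSE = sum of discarded eigenvalues
```

The last line illustrates why PCA is optimal for compression: the mean-square reconstruction error equals the sum of the discarded eigenvalues.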
J = E\left[\|x - \hat{x}\|^2\right]

Minimize the cost function using a gradient method with the constraint W^T W = I. The weight matrix will converge to a rotation of the eigenvector matrix, W = V T, where T is a square orthogonal matrix.

METHOD II - Hebbian and anti-Hebbian learning
If there are two units (neurons) A and B, and there is a connection (synapse) between the units, adjust the strength of the connection (weight w) in proportion to the product of their
activations. If x denotes the input excitation and y denotes the output of the neuron, then the synaptic weight vector w is updated using

w(n+1) = w(n) + \eta \, y(n) x(n)

Note that the weight vector w can easily blow up to infinity if there is no restriction on its norm.

[Figure: a single linear neuron with inputs x_1(n), x_2(n), ..., x_m(n) and output y(n)]

Let us define a cost function
J = \frac{w^T R w}{w^T w} = \frac{E\left[w^T x x^T w\right]}{w^T w} = \frac{E\left[y^2\right]}{w^T w}

This cost function is nothing but the variance of the output divided by the squared norm of the weight vector that produced it. We are interested in finding the weight vector that produces the maximum output variance under the constraint of unit norm on w. Setting the gradient to zero,

\nabla_w J = \frac{2}{w^T w}\left(R w - \frac{w^T R w}{w^T w}\, w\right) = 0 \quad\Rightarrow\quad R w = \lambda w, \qquad \lambda = \frac{w^T R w}{w^T w}

So, if we maximize the output variance subject to the constraint of unit norm, then w is nothing but the principal eigenvector, and the eigenvalue itself is the output variance! In the case of the Hebbian rule, we just maximize the output variance J = w^T R w without putting any restriction on the norm.
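The claim above can be checked numerically. In this hypothetical numpy sketch (not from the slides), the Rayleigh quotient J(w) = (w^T R w)/(w^T w) attains its maximum, the largest eigenvalue, at the principal eigenvector:

```python
# Hypothetical numpy check: the Rayleigh quotient is maximized by the
# principal eigenvector, and the maximum value is the largest eigenvalue.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
R = A @ A.T                                  # a valid covariance: symmetric PSD

evals, V = np.linalg.eigh(R)                 # ascending eigenvalues
w_star, lam_max = V[:, -1], evals[-1]

def J(w):
    return (w @ R @ w) / (w @ w)

print(J(w_star), lam_max)                    # equal: the variance IS the eigenvalue
# No random direction does better than the principal eigenvector:
best_random = max(J(w) for w in rng.standard_normal((1000, 4)))
print(best_random <= lam_max + 1e-9)         # True
```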
Thus, the Hebbian rule will still give us the principal component, but it is an unstable algorithm.

Anti-Hebbian learning
Hebbian learning effectively correlates input and output. A simple minus sign in the learning rule gives a decorrelator:

w(n+1) = w(n) - \eta \, y(n) x(n)

A simple learning rule that uses both the Hebbian and anti-Hebbian rules is the LMS!

w(n+1) = w(n) + \eta \, e(n) x(n)
       = w(n) + \eta \left(d(n) - y(n)\right) x(n)
       = w(n) + \eta \, d(n) x(n) - \eta \, y(n) x(n)

The d(n)x(n) term is Hebbian (correlating the input with the desired response), while the -y(n)x(n) term is anti-Hebbian.

Stabilizing Hebb's rule - Oja's rule
The simplest way to stabilize Hebb's rule is to normalize the weights after every iteration:
\tilde{w}(n+1) = w(n) + \eta \, x(n) y(n), \qquad w(n+1) = \frac{\tilde{w}(n+1)}{\|\tilde{w}(n+1)\|}

For a small step size \eta, and using w^T(n) w(n) = 1,

\|\tilde{w}(n+1)\|^2 = \tilde{w}^T(n+1)\,\tilde{w}(n+1) = 1 + 2\eta y^2(n) + O(\eta^2)

w(n+1) = \tilde{w}(n+1)\left(1 + 2\eta y^2(n) + O(\eta^2)\right)^{-1/2} \approx \tilde{w}(n+1)\left(1 - \eta y^2(n)\right)
       = w(n) + \eta x(n) y(n) - \eta y^2(n) w(n) - \eta^2 y^3(n) x(n)

Dropping the O(\eta^2) term gives Oja's rule:

w(n+1) = w(n) + \eta \, y(n)\left(x(n) - y(n) w(n)\right)

So, Oja's rule will give us the first eigenvector for a small enough step size. This is the first learning algorithm for PCA.

How do we compute the other eigenvectors? A deflation procedure, similar to the Gram-Schmidt procedure, is adopted:

Step I: Find the first principal component v_1 using Oja's rule.
Step II: Compute the projection of the first eigenvector onto the input: y = v_1^T x.
Step III: Generate the modified input \tilde{x} = x - v_1 y = (I - v_1 v_1^T) x, with \tilde{R} = (I - v_1 v_1^T) R.
Step IV: Repeat Oja's rule on the modified data.

The steps can be repeated to generate all the eigenvectors.

Putting it all together: Oja's rule + deflation = Sanger's rule
Sanger extended Oja's rule to extract multiple eigenvectors using the deflation procedure.
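The Oja-plus-deflation procedure can be sketched as follows. This is a hypothetical numpy illustration (step size, epoch count, and the synthetic data are assumptions, not from the slides):

```python
# Hypothetical sketch: Oja's rule extracts the first principal component one
# sample at a time; one deflation step then lets the same rule find the second.
import numpy as np

rng = np.random.default_rng(2)
# Zero-mean data with known variances 9, 2.25, 0.25 along the axes
X = rng.standard_normal((20000, 3)) * np.array([3.0, 1.5, 0.5])

def oja(X, eta=0.001, epochs=5, seed=0):
    w = np.random.default_rng(seed).standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += eta * y * (x - y * w)       # Oja's rule
    return w / np.linalg.norm(w)

v1 = oja(X)                                  # Step I: first principal component
X_tilde = X - np.outer(X @ v1, v1)           # Step III: x~ = (I - v1 v1^T) x
v2 = oja(X_tilde)                            # Step IV: Oja on the deflated data

# Check against the batch eigenvectors (the sign of an eigenvector is arbitrary)
evals, V = np.linalg.eigh((X.T @ X) / len(X))
print(abs(v1 @ V[:, -1]), abs(v2 @ V[:, -2]))  # both close to 1
```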
The rule is simple to implement.
The algorithm is on-line.

[Figure: single-layer linear network; input data x_1(n), ..., x_M(n), weight matrix entries w_{j1}(n), ..., w_{jM}(n), outputs y_j(n)]

w_{ji} represents the scalar weight between input node i and output node j. The network has M inputs and L outputs.

y_j(n) = \sum_{i=1}^{M} w_{ji}(n) \, x_i(n), \qquad j = 1, 2, \ldots, L

\Delta w_{ji}(n) = \eta \, y_j(n) \left( x_i(n) - \sum_{k=1}^{j} w_{ki}(n) \, y_k(n) \right), \qquad i = 1, \ldots, M, \quad j = 1, \ldots, L
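Sanger's rule has a compact matrix form: \Delta W = \eta \left( y x^T - \mathrm{LT}[y y^T] W \right), where LT keeps the lower triangle, which reproduces the per-weight update above. A hypothetical numpy sketch (data and hyperparameters are assumptions, not from the slides):

```python
# Hypothetical sketch of Sanger's rule (the Generalized Hebbian Algorithm)
# in matrix form. The lower-triangular term implements the built-in
# deflation sum_{k<=j} w_ki y_k for each output j.
import numpy as np

rng = np.random.default_rng(3)
M, L = 4, 2                                  # M inputs, L outputs
# Zero-mean data with variances 9, 4, 1, 0.25 along the axes
X = rng.standard_normal((20000, M)) * np.array([3.0, 2.0, 1.0, 0.5])

eta = 0.001
W = 0.1 * rng.standard_normal((L, M))        # row j holds w_j
for _ in range(5):                           # epochs
    for x in X:
        y = W @ x                            # y_j = sum_i w_ji x_i
        # dW_ji = eta * y_j * (x_i - sum_{k<=j} w_ki y_k)
        W += eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

# Rows of W approach the top-L unit-norm eigenvectors of the covariance
evals, V = np.linalg.eigh((X.T @ X) / len(X))
a1 = abs(W[0] @ V[:, -1]) / np.linalg.norm(W[0])
a2 = abs(W[1] @ V[:, -2]) / np.linalg.norm(W[1])
print(a1, a2)                                # both close to 1
```

Note that row j only receives deflation from rows k <= j, which is exactly what makes row 1 converge to the first eigenvector, row 2 to the second, and so on.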