
Recently, I was reading the paper Skeleton Based Action Recognition Using Spatio Temporal Graph Convolution. The authors claim (below equation (\ref{9})) that we can perform graph convolution with the following formula

$$ \mathbf{f}_{o u t}=\mathbf{\Lambda}^{-\frac{1}{2}}(\mathbf{A}+\mathbf{I}) \mathbf{\Lambda}^{-\frac{1}{2}} \mathbf{f}_{i n} \mathbf{W} \label{9}\tag{9} $$

using a standard 2D convolution with kernels of shape $1 \times \Gamma$ (where $\Gamma$ is defined under equation 6 of the paper), and then multiplying the result by the normalised adjacency matrix

$$\mathbf{\Lambda}^{-\frac{1}{2}}(\mathbf{A}+\mathbf{I}) \mathbf{\Lambda}^{-\frac{1}{2}}$$
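To make the claim concrete, here is a minimal NumPy sketch of how I currently read it, restricted to the simplest case $\Gamma = 1$. Everything in it is my own assumption rather than the authors' code: the `(C, V, T)` tensor layout, the function names, the toy 3-node graph, and the reading of $\mathbf{\Lambda}$ as the diagonal matrix of row sums of $\mathbf{A}+\mathbf{I}$. The "convolution" is written as an `einsum`, since a $1 \times 1$ convolution is just the same linear map $\mathbf{W}$ applied to the channels at every (node, frame) position.

```python
import numpy as np

def normalized_adjacency(A):
    """Lambda^{-1/2} (A + I) Lambda^{-1/2}, assuming Lambda_ii = sum_j (A + I)_ij."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def direct_formula(x, A_hat, W):
    """Apply f_out = A_hat f_in W frame by frame. x: (C_in, V, T), W: (C_in, C_out)."""
    C_in, V, T = x.shape
    out = np.empty((W.shape[1], V, T))
    for t in range(T):
        f_in = x[:, :, t].T           # (V, C_in) features of one frame
        f_out = A_hat @ f_in @ W      # (V, C_out)
        out[:, :, t] = f_out.T
    return out

def conv_then_adjacency(x, A_hat, W):
    """Step 1: 1x1 convolution over channels. Step 2: multiply by A_hat on the node axis."""
    y = np.einsum('co,cvt->ovt', W, x)         # same linear map at every (v, t)
    return np.einsum('uv,ovt->out', A_hat, y)  # mix features across neighbouring nodes

# Toy example (hypothetical sizes): path graph 0 - 1 - 2, 4 input channels, 5 frames.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 5))   # (C_in, V, T)
W = rng.standard_normal((4, 6))      # (C_in, C_out)

A_hat = normalized_adjacency(A)
assert np.allclose(direct_formula(x, A_hat, W), conv_then_adjacency(x, A_hat, W))
```

In this restricted case the two computations agree because $\mathbf{W}$ acts only on the channel axis and the normalised adjacency matrix acts only on the node axis, so the order of the two multiplications does not matter. What I cannot see is how this carries over to a general $1 \times \Gamma$ kernel.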

For the past few days, I have been thinking about this claim, but I can't work out why it holds. Has anyone read this paper who can help me figure it out, please?

nbro
Swakshar Deb

0 Answers