In this blog, I will discuss several papers on attack methods against Graph Convolutional Networks for Node Classification, although there are many other tasks worth digging into, such as Community Detection and Link Prediction.
I will explore adversarial attacks on graph neural networks and the related background, and summarize these attacks and defenses. There is also a useful GitHub repo that collects related papers and code links.
1. Concepts
The definition of Graph Adversarial Attack
2. Attack
Graph adversarial attacks can be classified from different perspectives.
Attacker’s Ability:
- Evasion Attack: Attacking happens after the GNN model is trained.
- Poisoning Attack: Attacking happens during the training by adding poison data into the training set.
Perturbation Type:
- Modifying Features
- Adding or Deleting Edges
- Injecting Nodes
Adversary’s Goal:
- Targeted Attack
- Un-targeted Attack
Adversary’s Knowledge:
- White-box Attack: the attacker has full knowledge of the model and its parameters.
- Black-box Attack: the attacker has no knowledge of the model.
- Gray-box Attack: the attacker has limited knowledge.
1. Adversarial Attack on Graph Structured Data
Author-Time-arXiv: Hanjun Dai et al.; ICML 2018; 1806.02371;
Key Points:
- Proposing RL-S2V, a reinforcement-learning-based attack.
1. A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models
Author-Time-arXiv: Heng Chang et al.; AAAI 2020; 1908.01297;
Key Points:
- Proposing GF Attack, a black-box attack which constructs adversarial examples from the graph filter and the feature matrix.
Graph embedding models map a graph, including its vertices, edges and other related information, into a lower-dimensional vector space.
1. Adversarial Attacks on Graph Neural Networks Via Meta Learning
Author-Time: Daniel Zügner et al.; ICLR 2019;
Key Points:
- Meta Learning
- Poisoning Attack
1.1 Problem Formulation
The paper considers semi-supervised node classification. Given a set of labeled nodes \(\nu_L\subseteq\nu\), each of which belongs to exactly one class in \(C=\{c_1,c_2,\dots,c_K\}\), the goal is to learn a function \(f_\theta\) mapping each node \(v\in \nu\) to one of the \(K\) classes in \(C\) (or, in a probabilistic formulation, to the \(K\)-simplex). The parameters \(\theta\) of the function \(f_\theta\) are generally learned by minimizing a loss function \(L_{train}\), e.g. cross-entropy, on the labeled training nodes of a single attributed graph \(G\):
\(\theta^*=\mathop{\arg\min}\limits_{\theta}L_{train}(f_{\theta}(G))\).
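A minimal sketch of this training objective in PyTorch, assuming a hypothetical model `f_theta` that maps the whole graph to per-node class logits; the cross-entropy loss is restricted to the labeled nodes \(\nu_L\):

```python
import torch
import torch.nn.functional as F

def train_node_classifier(f_theta, graph, labels, labeled_idx, epochs=200, lr=0.01):
    """Learn theta* by minimizing L_train (cross-entropy) on the labeled nodes only."""
    optimizer = torch.optim.Adam(f_theta.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        logits = f_theta(graph)                      # per-node class logits for the whole graph
        loss = F.cross_entropy(logits[labeled_idx],  # restrict the loss to nu_L
                               labels[labeled_idx])
        loss.backward()
        optimizer.step()
    return f_theta
```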
1.2 Attack Model
Goal: The adversary aims to increase the misclassification rate of a node classification algorithm that is trained on the data modified by the attack.
Knowledge: The adversary has no knowledge of the classification model or its trained weights. However, the adversary can observe all nodes’ attributes, the graph structure and the labels of the subset \(\nu_L\), and uses a surrogate model when modifying these data.
Capability: The adversary is subject to several constraints when attacking graph models.
- The budget constraint \(\Delta\) on the attacks: \(\Vert A-\hat A\Vert_0\le\Delta\).
- No node becomes disconnected.
- The graph’s degree distribution can only marginally be modified by the attacker. All constraints and admissible perturbations on the data are denoted as \(\Phi(G)\).
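A small sketch of an admissibility check for \(\Phi(G)\), assuming dense NumPy adjacency matrices for an undirected graph; it covers the edge-flip budget and the no-disconnected-node rule, while the paper's degree-distribution unnoticeability test is left out for brevity:

```python
import numpy as np

def is_admissible(A, A_hat, budget):
    """Check two of the constraints defining Phi(G):
    the L0 budget on edge flips and that no node becomes disconnected.
    (The degree-distribution test from the paper is omitted here.)"""
    num_flips = int(np.abs(A - A_hat).sum())              # ||A - A_hat||_0 counts changed entries
    within_budget = num_flips <= budget
    no_isolated_nodes = bool((A_hat.sum(axis=1) > 0).all())
    return within_budget and no_isolated_nodes
```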
1.3 Overall Goal
Poisoning attacks can be formulated as a bilevel optimization problem: \(\mathop{\min}\limits_{\hat G\in\Phi(G)}L_{atk}(f_{\theta^*}(\hat G))\quad s.t.\quad\theta^*=\mathop{\arg\min}\limits_{\theta}L_{train}(f_\theta(\hat G))\)
\(L_{atk}\) is the loss function the attacker optimizes; it aims to decrease the generalization performance of the model on the unlabeled nodes. \(L_{train}\) is the loss on the labeled training nodes.
The first option is to choose \(L_{atk}=-L_{train}\).
The second option is to choose \(L_{atk}=-L_{self}\), where \(L_{self}=L(\nu_U,\hat C_U)\).
The attacker computes the loss \(L_{self}\) on the unlabeled nodes \(\nu_U=\nu \backslash \nu_L\) using their predicted labels \(\hat C_U\), obtained from a model trained on the labeled data.
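A sketch of the two attacker-loss choices with PyTorch tensors; `logits` would come from the surrogate model, and `pseudo_labels` stands for the predictions \(\hat C_U\) of a model trained on the labeled data (the names are illustrative, not from the paper):

```python
import torch.nn.functional as F

def attacker_loss(logits, labels, labeled_idx, unlabeled_idx, pseudo_labels=None):
    """L_atk, option 1: -L_train on the labeled nodes.
    L_atk, option 2: -L_self on the unlabeled nodes, using pseudo-labels C_hat_U."""
    if pseudo_labels is None:
        return -F.cross_entropy(logits[labeled_idx], labels[labeled_idx])
    return -F.cross_entropy(logits[unlabeled_idx], pseudo_labels)
```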
1.4 Poisoning via Meta-Gradients
The essence of the attack is to treat the graph structure matrix as a hyper-parameter and compute the gradient of the attacker’s loss with respect to it after training.
\[\nabla_G^{meta}:=\nabla_G L_{atk}(f_{\theta^*}(G))\quad s.t.\quad\theta^*=opt_\theta(L_{train}(f_\theta(G)))\] where \(opt_\theta(\cdot)\) is a differentiable optimization procedure such as gradient descent and its stochastic variants.
The parameters are updated as follows:
\(\theta_{t+1}=\theta_t-\alpha\nabla_{\theta_t}L_{train}(f_{\theta_t}(G))\).
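A minimal sketch of these differentiable inner updates in PyTorch, assuming a hypothetical closure `train_loss(theta)` that computes \(L_{train}\) from the current weights (and, through the surrogate model, from the graph), and an initial `theta0` that requires gradients; `create_graph=True` keeps every update in the computation graph so that \(L_{atk}\) can later be backpropagated all the way to \(G\):

```python
import torch

def unroll_training(theta0, train_loss, steps=100, lr=0.1):
    """Unrolled gradient descent: theta_{t+1} = theta_t - alpha * grad_theta L_train.
    Each update remains differentiable, which is what makes the meta-gradient possible."""
    theta = theta0
    for _ in range(steps):
        grad, = torch.autograd.grad(train_loss(theta), theta, create_graph=True)
        theta = theta - lr * grad   # a new, non-leaf tensor that still depends on the graph
    return theta
```

After the unrolling, calling `torch.autograd.grad` on \(L_{atk}(f_{\theta_T}(G))\) with respect to the adjacency matrix yields \(\nabla_G^{meta}\), because the adjacency enters every evaluation of `train_loss`.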
The paper uses a two-layer graph convolutional network as the surrogate model.
\[f_\theta(A,X)=softmax(\hat A^2XW)\] where \(\hat A=D^{-\frac{1}{2}}\tilde AD^{-\frac{1}{2}},~\tilde A=A+I\) and \(\theta=\{W\}\) is the set of learnable parameters.
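A sketch of this surrogate forward pass with a dense PyTorch adjacency matrix (`A`, `X` and `W` are assumed tensors):

```python
import torch

def surrogate(A, X, W):
    """Linearized two-layer GCN surrogate: softmax(A_hat^2 X W),
    with A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + torch.eye(A.shape[0])
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return torch.softmax(A_hat @ (A_hat @ X) @ W, dim=1)
```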
The score function is defined as follows:
\[S(u,v)=\nabla_{a_{uv}}^{meta}\cdot(-2\cdot a_{uv}+1)\] where \(a_{uv}\) is the entry at position \((u,v)\) in the adjacency matrix \(A\). Inserting the edge \(e=(i,j)\) sets \(a_{ij}=1\); deleting it sets \(a_{ij}=0\).
They pick the perturbation \(e'=(u',v')\) with the highest score, one at a time:
\[e'=\mathop{\arg\max}\limits_{e=(u,v):M(A,e)\in\Phi(G)}S(u,v)\]
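A sketch of this greedy selection step, assuming a dense meta-gradient matrix and a Boolean mask `admissible` marking the entries whose flip stays within \(\Phi(G)\) (the mask itself is an assumption, e.g. produced by the admissibility check sketched earlier):

```python
import torch

def pick_perturbation(meta_grad, A, admissible):
    """Score every potential flip with S(u, v) = grad_{a_uv}^meta * (-2 * a_uv + 1)
    and return the admissible entry with the highest score."""
    scores = meta_grad * (-2 * A + 1)   # +grad for missing edges, -grad for existing ones
    scores = scores.masked_fill(~admissible, float("-inf"))
    idx = int(torch.argmax(scores))
    u, v = divmod(idx, A.shape[1])
    return u, v
```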
1.5 Approximation
The paper also assumes \(\hat \theta_t\) is independent of \(\hat \theta_{t-1}\). The heuristic meta gradient, which achieves faster convergence, is taken with respect to the initial weights \(\theta_0\): \(\nabla_{\theta_0}^{meta}\approx \sum_{t=1}^T\nabla_{\hat \theta_t} L_{train}(f_{\hat\theta_t}(A;X))\). They find that this heuristic meta gradient on average yields the strongest increase in the training loss.
Based on this approximation, the meta gradient can be computed as follows.
\(\nabla_A^{meta}\approx \sum_{t=1}^T\nabla_AL_{train}(f_{\hat\theta_t}(A;X))\).
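A sketch of this approximation, assuming the per-step weights \(\hat\theta_t\) (here `W_list`) were stored during training of the linearized surrogate; the weights are treated as constants, so no second-order terms are needed:

```python
import torch
import torch.nn.functional as F

def approx_meta_gradient(A, X, labels, labeled_idx, W_list):
    """Accumulate grad_A L_train(f_{theta_t}(A; X)) over the stored per-step weights."""
    A = A.clone().requires_grad_(True)
    total = torch.zeros_like(A)
    for W in W_list:                                  # one weight matrix per training step t
        A_tilde = A + torch.eye(A.shape[0])
        d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
        A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
        logits = A_hat @ (A_hat @ X) @ W.detach()     # surrogate A_hat^2 X W, weights held fixed
        loss = F.cross_entropy(logits[labeled_idx], labels[labeled_idx])
        total = total + torch.autograd.grad(loss, A)[0]
    return total
```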
Besides, when considering the