In this blog, I will discuss several papers on attack methods against Graph Convolutional Networks for Node Classification, although there are many other tasks worth digging into, such as Community Detection and Link Prediction.
I will explore adversarial attacks on graph neural networks and the related background, and summarize these attacks and defenses. There is also a useful GitHub repo that collects related papers and code links.
1. Concepts
The definition of Graph Adversarial Attack
2. Attack
Graph adversarial attacks can be classified from different perspectives.
Attacker’s Ability:
- Evasion Attack: Attacking happens after the GNN model is trained.
- Poisoning Attack: Attacking happens during the training by adding poison data into the training set.
Perturbation Type:
- Modifying Features
- Adding or Deleting Edges
- Injecting Nodes
Adversary’s Goal:
- Targeted Attack
- Un-targeted Attack
Adversary’s Knowledge:
- White-box Attack: the attacker has full knowledge of the model and its parameters.
- Black-box Attack: the attacker has no knowledge of the model.
- Gray-box Attack: the attacker has limited knowledge.
1. Adversarial Attack on Graph Structured Data
Author-Time-arXiv: Hanjun Dai et al.; ICML 2018; 1806.02371;
Key Points:
- Proposing RL-S2V, a reinforcement-learning-based attack.
1. A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models
Author-Time-arXiv: Heng Chang et al.; AAAI 2020; 1908.01297;
Key Points:
- Proposing GF Attack, a black-box attack which constructs adversarial examples from the graph filter and the feature matrix.
Graph embedding models map a graph, including its vertices, edges and other related information, into a lower-dimensional vector space.
1. Adversarial Attacks on Graph Neural Networks Via Meta Learning
Author-Time: Daniel Zügner et al.; ICLR 2019;
Key Points:
- Meta Learning
- Poisoning Attack
1.1 Problem Formulation
The paper considers semi-supervised node classification. Given a set of labeled nodes \(\nu_L\subseteq\nu\), each of which belongs to exactly one class in \(C=\{c_1,c_2,\dots,c_K\}\), the goal is to learn a function \(f_\theta\) mapping each node \(v\in \nu\) to one of the \(K\) classes in \(C\) (or, in a probabilistic formulation, to the \(K\)-simplex). The parameters \(\theta\) of the function \(f_\theta\) are generally learned by minimizing a loss function \(L_{train}\), e.g. cross-entropy, on the labeled training nodes of a single attributed graph \(G\):
\(\theta^*=\mathop{\arg\min}\limits_{\theta}L_{train}(f_{\theta}(G))\).
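A minimal sketch of this training objective in PyTorch, assuming a hypothetical model `f_theta` that maps the whole graph to per-node class logits; the cross-entropy loss is restricted to the labeled nodes \(\nu_L\):

```python
import torch
import torch.nn.functional as F

def train_node_classifier(f_theta, graph, labels, labeled_idx, epochs=200, lr=0.01):
    """Learn theta* by minimizing L_train (cross-entropy) on the labeled nodes only."""
    optimizer = torch.optim.Adam(f_theta.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        logits = f_theta(graph)                      # per-node class logits for the whole graph
        loss = F.cross_entropy(logits[labeled_idx],  # restrict the loss to nu_L
                               labels[labeled_idx])
        loss.backward()
        optimizer.step()
    return f_theta
```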
1.2 Attack Model
Goal: The adversary aims to increase the misclassification rate of a node classification algorithm that is trained on the data modified by the attack.
Knowledge: The adversary has no knowledge of the classification model or its trained weights. However, the adversary can observe all nodes’ attributes, the graph structure and the labels of the subset \(\nu_L\), and uses a surrogate model when modifying these data.
Capability: The adversary is subject to several constraints when attacking graph models.
- The budget constraint \(\Delta\) on the attacks: \(\Vert A-\hat A\Vert_0\le\Delta\).
- No node becomes disconnected.
- The graph’s degree distribution can only marginally be modified by the attacker. All constraints and admissible perturbations on the data are denoted as \(\Phi(G)\).
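A small sketch of an admissibility check for \(\Phi(G)\), assuming dense NumPy adjacency matrices for an undirected graph; it covers the edge-flip budget and the no-disconnected-node rule, while the paper's degree-distribution unnoticeability test is left out for brevity:

```python
import numpy as np

def is_admissible(A, A_hat, budget):
    """Check two of the constraints defining Phi(G):
    the L0 budget on edge flips and that no node becomes disconnected.
    (The degree-distribution test from the paper is omitted here.)"""
    num_flips = int(np.abs(A - A_hat).sum())              # ||A - A_hat||_0 counts changed entries
    within_budget = num_flips <= budget
    no_isolated_nodes = bool((A_hat.sum(axis=1) > 0).all())
    return within_budget and no_isolated_nodes
```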
1.3 Overall Goal
Poisoning attacks can be formulated as a bilevel optimization problem: \(\mathop{\min}\limits_{\hat G\in\Phi(G)}L_{atk}(f_{\theta^*}(\hat G))\quad s.t.\quad\theta^*=\mathop{\arg\min}\limits_{\theta}L_{train}(f_\theta(\hat G))\)
\(L_{atk}\) is the loss function the attacker optimizes; it aims to decrease the generalization performance of the model on the unlabeled nodes. \(L_{train}\) is the loss on the labeled training nodes.
The first option is to choose \(L_{atk}=-L_{train}\).
The second option is to choose \(L_{atk}=-L_{self}\), where \(L_{self}=L(\nu_U,\hat C_U)\).
The attacker computes the loss \(L_{self}\) on the unlabeled nodes \(\nu_U=\nu \backslash \nu_L\) using their predicted labels \(\hat C_U\), obtained from a model trained on the labeled data.
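A sketch of the two attacker-loss choices with PyTorch tensors; `logits` would come from the surrogate model, and `pseudo_labels` stands for the predictions \(\hat C_U\) of a model trained on the labeled data (the names are illustrative, not from the paper):

```python
import torch.nn.functional as F

def attacker_loss(logits, labels, labeled_idx, unlabeled_idx, pseudo_labels=None):
    """L_atk, option 1: -L_train on the labeled nodes.
    L_atk, option 2: -L_self on the unlabeled nodes, using pseudo-labels C_hat_U."""
    if pseudo_labels is None:
        return -F.cross_entropy(logits[labeled_idx], labels[labeled_idx])
    return -F.cross_entropy(logits[unlabeled_idx], pseudo_labels)
```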
1.4 Poisoning via Meta-Gradients
The essence of the attack is to treat the graph structure matrix as a hyper-parameter and compute the gradient of the attacker’s loss with respect to it after training.
\[\nabla_G^{meta}:=\nabla_G L_{atk}(f_{\theta^*}(G))\quad s.t.\quad\theta^*=opt_\theta(L_{train}(f_\theta(G)))\] where \(opt_\theta(\cdot)\) is a differentiable optimization procedure such as gradient descent and its stochastic variants.
The parameters are updated as follows:
\(\theta_{t+1}=\theta_t-\alpha\nabla_{\theta_t}L_{train}(f_{\theta_t}(G))\).
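A minimal sketch of these differentiable inner updates in PyTorch, assuming a hypothetical closure `train_loss(theta)` that computes \(L_{train}\) from the current weights (and, through the surrogate model, from the graph), and an initial `theta0` that requires gradients; `create_graph=True` keeps every update in the computation graph so that \(L_{atk}\) can later be backpropagated all the way to \(G\):

```python
import torch

def unroll_training(theta0, train_loss, steps=100, lr=0.1):
    """Unrolled gradient descent: theta_{t+1} = theta_t - alpha * grad_theta L_train.
    Each update remains differentiable, which is what makes the meta-gradient possible."""
    theta = theta0
    for _ in range(steps):
        grad, = torch.autograd.grad(train_loss(theta), theta, create_graph=True)
        theta = theta - lr * grad   # a new, non-leaf tensor that still depends on the graph
    return theta
```

After the unrolling, calling `torch.autograd.grad` on \(L_{atk}(f_{\theta_T}(G))\) with respect to the adjacency matrix yields \(\nabla_G^{meta}\), because the adjacency enters every evaluation of `train_loss`.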
The paper uses a two-layer graph convolutional network as the surrogate model.
\[f_\theta(A,X)=softmax(\hat A^2XW)\] where \(\hat A=D^{-\frac{1}{2}}\tilde AD^{-\frac{1}{2}},~\tilde A=A+I\) and \(\theta=\{W\}\) is the set of learnable parameters.
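A sketch of this surrogate forward pass with a dense PyTorch adjacency matrix (`A`, `X` and `W` are assumed tensors):

```python
import torch

def surrogate(A, X, W):
    """Linearized two-layer GCN surrogate: softmax(A_hat^2 X W),
    with A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + torch.eye(A.shape[0])
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return torch.softmax(A_hat @ (A_hat @ X) @ W, dim=1)
```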
The score function is defined as follows:
\[S(u,v)=\nabla_{a_{uv}}^{meta}\cdot(-2\cdot a_{uv}+1)\] where \(a_{uv}\) is the entry at position \((u,v)\) in the adjacency matrix \(A\). Inserting the edge \(e=(i,j)\) sets \(a_{ij}=1\); deleting it sets \(a_{ij}=0\).
They pick the perturbation \(e'=(u',v')\) with the highest score, one at a time:
\[e'=\mathop{\arg\max}\limits_{e=(u,v):M(A,e)\in\Phi(G)}S(u,v)\]
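A sketch of this greedy selection step, assuming a dense meta-gradient matrix and a Boolean mask `admissible` marking the entries whose flip stays within \(\Phi(G)\) (the mask itself is an assumption, e.g. produced by the admissibility check sketched earlier):

```python
import torch

def pick_perturbation(meta_grad, A, admissible):
    """Score every potential flip with S(u, v) = grad_{a_uv}^meta * (-2 * a_uv + 1)
    and return the admissible entry with the highest score."""
    scores = meta_grad * (-2 * A + 1)   # +grad for missing edges, -grad for existing ones
    scores = scores.masked_fill(~admissible, float("-inf"))
    idx = int(torch.argmax(scores))
    u, v = divmod(idx, A.shape[1])
    return u, v
```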
1.5 Approximation
The paper also assumes \(\hat \theta_t\) is independent of \(\hat \theta_{t-1}\). The heuristic meta gradient, which achieves faster convergence, is taken with respect to the initial weights \(\theta_0\): \(\nabla_{\theta_0}^{meta}\approx \sum_{t=1}^T\nabla_{\hat \theta_t} L_{train}(f_{\hat\theta_t}(A;X))\). They find that this heuristic meta gradient on average yields the strongest increase in the training loss.
Based on this approximation, the meta gradient can be computed as follows.
\(\nabla_A^{meta}\approx \sum_{t=1}^T\nabla_AL_{train}(f_{\hat\theta_t}(A;X))\).
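A sketch of this approximation, assuming the per-step weights \(\hat\theta_t\) (here `W_list`) were stored during training of the linearized surrogate; the weights are treated as constants, so no second-order terms are needed:

```python
import torch
import torch.nn.functional as F

def approx_meta_gradient(A, X, labels, labeled_idx, W_list):
    """Accumulate grad_A L_train(f_{theta_t}(A; X)) over the stored per-step weights."""
    A = A.clone().requires_grad_(True)
    total = torch.zeros_like(A)
    for W in W_list:                                  # one weight matrix per training step t
        A_tilde = A + torch.eye(A.shape[0])
        d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
        A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
        logits = A_hat @ (A_hat @ X) @ W.detach()     # surrogate A_hat^2 X W, weights held fixed
        loss = F.cross_entropy(logits[labeled_idx], labels[labeled_idx])
        total = total + torch.autograd.grad(loss, A)[0]
    return total
```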
Besides, when considering the