
Paper Summary 2022

In this blog post, I summarize papers I find interesting and record some thoughts while reading them.

Rethinking Graph Neural Networks for Anomaly Detection

paper/code

Main Idea

Based on the observation that anomalies cause a "right-shift" in the spectral energy distribution from low frequencies to high frequencies, the paper proposes a Beta Wavelet GNN (BWGNN) to achieve better performance (Figure 1).

Key insights

  • Spectral Analysis of Graph Anomaly
    • They define the anomaly degree as $\sigma / \mu$, where the graph features are independently drawn from a Gaussian distribution: $x\sim N(\mu e_N,\sigma^2I_N)$.
    • They observe that as the anomaly degree of nodes increases, the spectral energy concentrates less on the low-frequency eigenvalues.
    • They redefine the high-frequency area to avoid the high cost of eigen-decomposition:
      • \[S_{high}=\int_0^{\lambda_N} \big(1-f(t)\big)\,dt\]
  • Hammond’s Graph Wavelet
    • It is composed of a mother wavelet and a group of wavelet bases: $W=(W_{\psi_1}, W_{\psi_2},\cdots)$
    • $W_{\psi_i}x=Ug_i(\Lambda)U^T x$, where $g_i(\cdot)$ is a kernel function applied to the eigenvalues and $g_i(\Lambda)=\mathrm{diag}(g_i(\lambda_1),\cdots,g_i(\lambda_N))$.
  • Beta Wavelet on Graph
  • Beta Wavelet Graph Neural Network
    • $Z_i=W_{i, C-i}(MLP(X))$
    • $H=AGG([Z_0, Z_1, \cdots, Z_C])$
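To make the "no eigen-decomposition" point concrete: because the Beta kernel is a polynomial in the eigenvalues, the wavelet operator can be computed directly from powers of the normalized Laplacian. A minimal numpy sketch of this polynomial form (the function name is mine, not the paper's):

```python
import numpy as np
from math import gamma

def beta_wavelet_operator(L, p, q):
    """Beta wavelet operator W_{p,q} = (L/2)^p (I - L/2)^q / (2 B(p+1, q+1)).

    Since the Beta kernel is polynomial in the eigenvalues, the operator is
    built from powers of the normalized Laplacian L (eigenvalues in [0, 2])
    without any eigen-decomposition.
    """
    B = gamma(p + 1) * gamma(q + 1) / gamma(p + q + 2)  # Beta function B(p+1, q+1)
    n = L.shape[0]
    half = L / 2.0
    return (np.linalg.matrix_power(half, p)
            @ np.linalg.matrix_power(np.eye(n) - half, q)) / (2.0 * B)
```

With $C+1$ such operators $W_{i, C-i}$, each $Z_i$ above is just this operator applied to $\mathrm{MLP}(X)$.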

Experiments

Datasets:

  • YelpChi
  • Amazon

Comparisons:

  • MLP/SVM
  • GCN/ChebyNet/GAT/GraphSAGE…
  • GraphConsis/CAREGNN…

A Comparative Study for Unsupervised Network Representation Learning

arxiv

Main Idea

The paper summarizes several unsupervised network representation learning (UNRL) approaches over graphs:

  • Random Walk based
  • Matrix Factorization
  • Deep Learning based

They consider 9 UNRL methods and 11 datasets on two tasks, node classification and link prediction, and analyze all of them to show a clear blueprint of the topic.
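For the matrix-factorization family, the core idea can be sketched in a few lines: factor the adjacency matrix and use singular vectors, scaled by singular values, as node embeddings. This is a simplified stand-in for that family, not any specific surveyed method:

```python
import numpy as np

def svd_embeddings(A, dim):
    """Matrix-factorization-style node embeddings via truncated SVD of the
    adjacency matrix (a toy stand-in for the MF family of UNRL methods)."""
    U, S, _ = np.linalg.svd(A)
    # Scale each retained singular vector by the square root of its singular value.
    return U[:, :dim] * np.sqrt(S[:dim])
```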

The algorithms involved are listed in the table (Figure 2).

Experiments

Datasets: see Figure 3.

Graph Structure of Neural Networks

arxiv/code (ICML 2020)

Main Idea

The paper treats neural networks as graphs and develops a graph-based representation called the relational graph. The authors then design a graph generator, WS-flex, to systematically explore the design space of neural networks.


In the Average Path Length vs. Clustering Coefficient figures, they identify sweet spots where the best-performing models lie. They further provide a quick way to identify a sweet spot.
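The two axes of those figures can be computed directly from a relational graph's adjacency matrix; a small self-contained sketch (function names are mine):

```python
import numpy as np
from collections import deque

def avg_path_length(adj):
    """Mean shortest-path length over all reachable node pairs (BFS, unweighted)."""
    n = len(adj)
    total, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in np.nonzero(adj[u])[0]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(d for node, d in dist.items() if node != s)
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """Mean local clustering coefficient: fraction of neighbor pairs that are linked."""
    n = len(adj)
    coefs = []
    for u in range(n):
        nbrs = np.nonzero(adj[u])[0]
        k = len(nbrs)
        if k < 2:
            coefs.append(0.0)
            continue
        links = sum(adj[a][b] for i, a in enumerate(nbrs) for b in nbrs[i + 1:])
        coefs.append(2 * links / (k * (k - 1)))
    return float(np.mean(coefs))
```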

SIGL: Securing Software Installations Through Deep Graph Learning

arXiv

Main Idea

The paper designs a tool, SIGL, to detect malicious behavior during software installation. It first builds provenance graphs from system-call activities. It then uses a Graph LSTM as an autoencoder to generate embeddings. System overview:

  • Data Collection and Featurization
  • Model Training and Validation
  • Anomaly Detection and Prioritization

Key insights

  • Formalize the problem of detecting malicious software installation.
  • Define the Software Installation Graphs and use an autoencoder to generate graph-level embeddings.

Threat model

They define the installation behavior of a software package as a chain of system events which can be displayed as a directed acyclic graph. Nodes represent processes and objects while edges represent interactions. The operating system and audit framework are benign.
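Under this model, a software installation graph is just a node set plus labeled directed edges built from event triples. A toy sketch with made-up event tuples (illustrative only, not SIGL's actual schema):

```python
# Hypothetical system events as (source, action, target) triples; all names
# here are invented for illustration.
events = [
    ("installer.exe", "fork", "setup.exe"),
    ("setup.exe", "write", "app.dll"),
    ("setup.exe", "connect", "203.0.113.7:443"),
]

def build_sig(events):
    """Build a software-installation graph: nodes are processes/objects,
    edges are action-labeled interactions between them."""
    nodes, edges = set(), []
    for src, action, dst in events:
        nodes.update([src, dst])
        edges.append((src, dst, action))
    return nodes, edges
```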

Experiments

Dataset: synthetic datasets based on Windows ETW and Linux Audit

It also considers poisoning and adversarial attacks. For poisoning attacks, malware data is added at ratios from 5% to 25%. For adversarial attacks, they design restricted black-box attacks (RBA), covering feature and structure attacks, and practical black-box attacks (PBA), including a hierarchical reinforcement-learning attack.

You are what you do: Hunting Stealthy Malware via Data Provenance Analysis

paper/slide/video

Main idea

The paper designs a provenance-based approach ProvDetector to detect stealthy malware.

  • Provenance graph building
  • Representation Extraction
    • Rareness-based path selection
  • Embedding
  • Anomaly Detection

Key insights

  • Detection of marginal deviation: Impersonation-based stealthy malware blends into benign programs. ProvDetector needs to identify and isolate the marginal outlier events from the benign behaviors.
  • Scalable model building: The size of provenance graphs is super large. ProvDetector works on the suspicious subgraphs by ranking the top $K$ uncommon causal paths.
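The top-$K$ uncommon-path idea can be sketched as scoring each causal path by how rare its edges are in historical data; here rarity is the sum of negative log edge frequencies, a deliberate simplification of the paper's regularity score:

```python
import math

def rareness(path_edges, edge_freq):
    """Score a causal path by summing -log(frequency) over its edges, so
    paths made of rarely seen edges get higher scores. Unseen edges get a
    tiny default frequency (an assumption of this sketch)."""
    return sum(-math.log(edge_freq.get(e, 1e-6)) for e in path_edges)

def top_k_rare_paths(paths, edge_freq, k):
    """Keep only the K most uncommon causal paths for downstream embedding."""
    return sorted(paths, key=lambda p: rareness(p, edge_freq), reverse=True)[:k]
```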

Threat Model

The paper concentrates on stealthy malware that leverages legitimate methods or machine services from the victim’s host to exploit and execute malicious activities.

  • The operating systems and the loggers are benign or in the trusted computing base (TCB) and the adversary cannot change the records.
  • The side channel attack is also not considered

Experiments

They collected process instances from 23 programs. In total, they tested on 1,150 malware samples that hijack benign processes.

They also evaluated the interpretation of detection results.

  • Simple models do not perform well because they can only consider one-hop neighborhoods.
  • Whole-graph modeling is not a good feature according to their experiments using graph2vec to detect hijacked-process attacks.

Log2vec: A Heterogeneous Graph Embedding Based Approach for Detecting Cyber Threats within Enterprise

paper

Main idea

The paper designs an embedding generator to detect malicious logs. It first builds a heterogeneous graph from logs, in which each node represents a log event, and they design 10 rules to decide the edges. Basically, the edges capture common hosts or temporal sequences between nodes. In general, the heterogeneous graph encodes self-defined connections rather than the raw node-interaction structure. After generating the graph, they use random walks and word2vec to generate the node embeddings, and a clustering algorithm to detect anomalous nodes.

Key insights

  • The graph definition is totally different from provenance graphs. It looks like a multi-dimensional log sequence.
  • It does not use GNN models; instead, random walk + word2vec is used.
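The random-walk stage can be sketched as follows; each walk is a "sentence" of node IDs that a skip-gram model such as word2vec would then consume (a generic sketch, not log2vec's exact sampling scheme):

```python
import random

def random_walks(adj, num_walks, walk_len, seed=0):
    """Generate uniform random walks over a graph given as an adjacency
    dict {node: [neighbors]}. Each walk is a list of node IDs suitable as
    a 'sentence' for skip-gram training."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            for _ in range(walk_len - 1):
                nbrs = adj[walk[-1]]
                if not nbrs:        # dead end: stop the walk early
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks
```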

Experiments

datasets:

  • CERT
  • LANL

Learning Causal Effects on Hypergraphs

paper/code

Main Idea

The paper concentrates on individual treatment effect (ITE) estimation on hypergraphs. One example of ITE estimation is how much an intervention (e.g., wearing a face covering) will causally affect an outcome (COVID-19 infection) for each individual node. They use a hypergraph neural network to learn high-order interference.

Key insights

  • It is the first work on the problem of ITE estimation under high-order interference on hypergraphs.
  • It designs a framework that models confounders and high-order interference via representation learning and hypergraph neural networks.
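High-order interference on a hypergraph can be modeled by propagating node features through the incidence matrix: aggregate nodes into hyperedges, then hyperedges back into nodes. A minimal mean-aggregation sketch (a simplified stand-in for the paper's hypergraph-NN layer, not its exact formulation):

```python
import numpy as np

def hypergraph_propagate(H, X):
    """One mean-aggregation step over a hypergraph: nodes -> hyperedges -> nodes.

    H: (n_nodes, n_edges) binary incidence matrix; X: (n_nodes, d) features.
    """
    edge_deg = H.sum(0, keepdims=True)        # number of nodes per hyperedge
    node_deg = H.sum(1, keepdims=True)        # number of hyperedges per node
    E = (H / np.maximum(edge_deg, 1)).T @ X   # mean node feature per hyperedge
    return (H / np.maximum(node_deg, 1)) @ E  # mean incident-edge feature per node
```

Stacking such steps (with learned weights and nonlinearities in between) is what lets a hypergraph NN capture interference among all members of a hyperedge at once, rather than only pairwise.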

Experiments

Dataset: Semi-synthetic data from Contact, Goodreads and Microsoft Teams.

Multi-Dimensional Network Embedding with Hierarchical Structure

paper

Main Idea

The paper designs a multi-dimensional graph model that considers edge types and node types in different layers/dimensions. For example, if the nodes are users/products and the edges can be viewing or purchasing, we can draw two graphs that represent the viewing and purchasing information. In that case, the two graphs have the same nodes but different edges.

Model Structure

  • For different edge types, use a dimension-specific embedding to record the dimension information.
  • For different node types, use an embedding to record the hierarchical structure information.
  • Use a skip-gram model to learn the embeddings, and simply add the embeddings together.
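The additive scheme above can be sketched directly: the representation of a node in a given dimension is its shared embedding plus that dimension's embedding (node and dimension names below are illustrative):

```python
import numpy as np

def combined_embeddings(base, dim_specific):
    """Final embedding for every (node, dimension) pair by addition:
    shared node embedding + dimension-specific embedding."""
    return {(n, d): base[n] + dim_specific[d]
            for n in base for d in dim_specific}
```

Both embedding tables would be learned jointly with skip-gram; the addition is what ties the dimensions to a shared node representation.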

GraphMAE: Self-Supervised Masked Graph Autoencoders

paper/code

Main Idea

The paper focuses on Self-supervised Learning (SSL) by graph autoencoders (GAEs). It lists four common problems of the current models:

  • The structure information is over-emphasized.
  • Feature reconstruction without corruption is not robust.
  • The mean square error is sensitive and unstable.
  • The decoder architectures are of little expressiveness.

They improve the following parts:
  • Masked feature reconstruction.
  • Scaled cosine error.
  • Re-mask decoding.

Background of Self-supervised Learning [1]


  • Generative: Use an encoder to generate a middle vector $z$ and use a decoder to rebuild the original input $x$.
  • Contrastive: After getting the middle vector $z$, compare the similarity.
  • Generative-Contrastive: Based on the generative model, use a large model like ResNet to compare them.

Model Techniques

  • Use feature reconstruction as the objective.
  • Reconstruct the masked features and apply a uniform random sampling strategy.
  • Use a GNN-decoder with re-mask decoding.
  • Use scaled cosine error as the criterion.
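The scaled cosine error is easy to write down: for each masked node, take $(1-\cos(x_i, z_i))^\gamma$ with $\gamma \ge 1$, which down-weights samples that are already well reconstructed. A numpy sketch (assuming nonzero feature rows):

```python
import numpy as np

def scaled_cosine_error(X, Z, gamma=2.0):
    """Mean of (1 - cos(x_i, z_i))^gamma over rows of the original features X
    and reconstructions Z; gamma >= 1 focuses the loss on hard samples."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    cos = (Xn * Zn).sum(1)
    return float(np.mean((1.0 - cos) ** gamma))
```

A perfect reconstruction gives error 0; an orthogonal one gives 1 regardless of $\gamma$.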


Reference:

    1. Self-supervised Learning: Generative or Contrastive
This post is licensed under CC BY 4.0 by the author.
