In this blog post, I summarize the accepted papers from S&P 2022 that are related to my research interests.
Papers:
- BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
- DEPCOMM: Graph Summarization on System Audit Logs for Attack Investigation
- DeepCASE: Semi-Supervised Contextual Analysis of Security Events
- Effective Seed Scheduling for Fuzzing with Graph Centrality Analysis
- LinkTeller: Recovering Private Edges from Graph Neural Networks via Influence Analysis
- Membership Inference Attacks from First Principles
- Model Stealing Attacks Against Inductive Graph Neural Networks
- ShadeWatcher: Recommendation-guided Cyber Threat Analysis using System Audit Records
- WtaGraph: Web Tracking and Advertising Detection using Graph Neural Networks
DEPCOMM: Graph Summarization on System Audit Logs for Attack Investigation
Main Idea
DepComm is a provenance graph summarization framework that uses InfoPaths to capture attack-related processes and assist attack investigation.
- Graph Summarization:
- Process-based Community Detection: a hierarchical random walk considering local neighbors and global process lineage trees.
- Community Compression
- Community Summarization: includes the master process, the time span, and the top-ranked InfoPath.
- Attack Investigation: find the events close to the POI (Point of Interest)
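The POI-driven investigation step can be sketched as a backward traversal over the provenance graph. This is a minimal illustration, not DepComm's implementation; the event tuples and entity names below are made up:

```python
from collections import deque

# Hypothetical provenance events: (source entity, sink entity, timestamp).
# Edges point from data source to data sink.
events = [
    ("firefox", "/tmp/payload", 1),
    ("/tmp/payload", "bash", 2),
    ("bash", "/etc/passwd_copy", 3),
    ("cron", "/var/log/cron", 2),   # unrelated background activity
]

def backward_slice(events, poi):
    """Collect all events that the POI entity transitively depends on."""
    incoming = {}
    for src, dst, ts in events:
        incoming.setdefault(dst, []).append((src, dst, ts))
    related, frontier, seen = [], deque([poi]), {poi}
    while frontier:
        node = frontier.popleft()
        for src, dst, ts in incoming.get(node, []):
            related.append((src, dst, ts))
            if src not in seen:
                seen.add(src)
                frontier.append(src)
    return related

# Starting from the POI file, the slice keeps the firefox -> bash chain
# and drops the unrelated cron event.
slice_ = backward_slice(events, "/etc/passwd_copy")
```

The sketch only follows data-flow edges backward; DepComm additionally ranks the surviving events by their summarized communities.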
Key insight (Why is the paper better than others?)
The challenges:
- Provenance graphs are heterogeneous, while general summarization techniques treat all nodes equally.
- Provenance graphs contain plenty of trivial and irrelevant dependencies.
- There are no suitable graph summarization techniques for provenance graphs.
The insights:
- It partitions graphs into process-centric communities (a group of processes and resources accessed by the processes) based on the observations:
- Cooperating process nodes (intimate processes) either have strong correlations or have data dependencies through resource nodes, which means:
- They have parent-child relationships.
- They share the same parent process and have data dependencies.
- The paper summarizes 8 different schemes for parent-child and resource nodes to specify how the hierarchical random walk works.
- It compresses the edges by process and resource patterns.
- Process-based patterns: the intermediate processes between the begin and end processes are parallel.
- Resource-based patterns: the intermediate resources between the begin and end processes are parallel.
- It prioritizes the InfoPaths so that attack steps or major system activities appear at the top.
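The two "intimate process" conditions above can be expressed as a small predicate. This is a sketch under assumed data structures (a parent map and a set of data-dependency pairs, both invented for illustration), not DepComm's actual code:

```python
# Hypothetical process tree and data dependencies through shared resources.
parent = {"sh": "sshd", "curl": "sh", "tar": "sh"}
data_deps = {("curl", "tar")}  # e.g. curl writes a file that tar later reads

def intimate(p, q):
    """Two processes cooperate if (1) one is the parent of the other, or
    (2) they share a parent and have a data dependency via a resource."""
    if parent.get(p) == q or parent.get(q) == p:
        return True
    same_parent = parent.get(p) is not None and parent.get(p) == parent.get(q)
    dep = (p, q) in data_deps or (q, p) in data_deps
    return same_parent and dep
```

Communities can then be formed by grouping processes that are transitively intimate, together with the resources they access.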
Experiments
Datasets:
- DARPA TC Engagement 3
Details:
- Compared with NoDoze.
- Cooperates with HOLMES.
Comments:
- It does not consider temporal information.
- The main insight is the graph summarization.
- POI (Point of interest) is required.
DeepCASE: Semi-Supervised Contextual Analysis of Security Events
Main Idea
The paper designs DeepCASE, a semi-supervised, context-based suspicious event detector that reduces false positive alerts. The model is composed of the following parts:
- Context Builder
- Encoder: Embedding layers+Recurrent layers
- Attention Decoder
- Interpreter
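A minimal sketch of the attention step over a context of events, using plain NumPy with randomly initialized embeddings in place of DeepCASE's trained embedding and recurrent layers (the vocabulary size, dimension, and event IDs are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 50, 8                  # hypothetical event-type vocabulary
embed = rng.normal(size=(vocab_size, dim))

def attend(context_ids, target_id):
    """Score each context event against the target event and return
    softmax attention weights, the quantity an interpreter can inspect
    to explain which context events drove the decision."""
    ctx = embed[context_ids]             # (n, dim) context embeddings
    tgt = embed[target_id]               # (dim,) target embedding
    scores = ctx @ tgt                   # dot-product attention scores
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

# Attention over a 4-event context window preceding event type 7.
w = attend([3, 7, 7, 12], target_id=7)
```

In the real system the context embeddings pass through recurrent layers before attention; this sketch only shows why the weights are interpretable: they sum to one over the context events.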
Key Insights
The previous methods concentrate on:
- reducing the false positive rate by improving individual detectors
- prioritizing alerts (alert triage)
As a result, they miss relatively benign-looking events that are part of a complicated attack.
DeepCASE addresses three challenges (complex relations, evolving threats, explainability):
- handling complex relations within sequences of events produced by evolving threats
- remaining explainable to security operators.
Experiments
Datasets:
- Lastline
- HDFS
Comparisons:
- Cluster N-gram…
ShadeWatcher: Recommendation-guided Cyber Threat Analysis using System Audit Records
Main Idea
The paper leverages user-item interactions from recommendation systems and designs a new framework, ShadeWatcher, that predicts system entities' preferences on the entities they interact with, detecting threats in audit records with the help of high-order information. High-order information here means side information such as the genre of a movie or the type of a file, i.e., non-topological information.
ShadeWatcher has three parts:
- Knowledge Builder
- Recommendation Model
- Threat Detector and Adaptor
- Retrains with new negative instances.
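As a rough sketch of the recommendation analogy: score how much a system entity "prefers" an interaction, and flag low-preference interactions as threats. ShadeWatcher actually learns its embeddings with a knowledge-graph model; the cosine-similarity scoring and entity names below are simplified stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
entities = ["bash", "/etc/shadow", "/tmp/log", "sshd"]
# Hypothetical learned entity embeddings (random here for illustration).
emb = {e: rng.normal(size=16) for e in entities}

def preference(src, dst):
    """Cosine-similarity preference score in [-1, 1]: a low score means
    the source entity rarely interacts with entities like the destination."""
    a, b = emb[src], emb[dst]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_threat(src, dst, threshold=-0.5):
    """Flag an interaction whose preference falls below a threshold."""
    return preference(src, dst) < threshold
```

The adaptor component then feeds analyst-confirmed false positives back as new training instances, shifting these scores over time.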
Key insight
The problems of the current methods:
- Statistics-based detection: high false positive rate.
- Specification-based detection: time-consuming and domain expertise needed.
- Learning-based detection:
- No interpretable results or insights into the essential indicators or root causes of attacks.
- Extra manual efforts needed.
ShadeWatcher leverages high-order information in the knowledge graphs to help the model detect malicious interactions.
Experiments
Datasets:
- TRACE from DARPA TC Engagement 3
- Simulated Dataset
Evaluation
- on normal workloads
- on classification
Compared with:
- Poirot
- Morse
- Unicorn
Efficiency