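The previous post described, from a personal perspective, how to write the introduction of an English paper. This post does the same for the Model Design section, using intrusion detection systems (IDS) as the running example. My English is weak, so I can only improve it slowly in the most old-fashioned way; these are also my personal study notes, shared in the hope of receiving criticism and corrections. I hope the post helps you. The authors of these papers are truly worth learning from, and I take my hat off to them. Fighting!
This part reviews and draws on Professor Zhou's doctoral course; thanks to the professor for sharing. A typical paper framework (the typical "anatomy" of a paper) comes in two forms, as shown below:
Mathematics and algorithms
When assessing the scientific value of a research result, what matters most is novelty and significance. Novelty means that the researcher does not merely follow or repeat the work of others but makes an original contribution. Research is said to pass through three stages: "me too", "me better", and "me only". Novelty can be described with the same three stages.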
(1) Chuanpu Fu, et al. Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis. CCS.
4 DESIGN DETAILS
- 4.1 Frequency Feature Extraction Module
- 4.2 Automatic Parameters Selection Module
- 4.3 Statistical Clustering Module
(2) Sunwoo Ahn, et al. Hawkware: Network Intrusion Detection based on Behavior Analysis with ANNs on an IoT Device. DAC.
III. HAWKWARE DESIGN
- A. Overview
- B. Threat models and assumptions
- C. Monitor Module
- D. Detector Module
(3) Ning Wang, et al. MANDA: On Adversarial Example Detection for Network Intrusion Detection System. IEEE INFOCOM.
III. SYSTEM MODEL AND THREAT MODEL
- A. Notations
- B. System Model
- C. Threat Model
IV. THE MANDA SYSTEM
- A. Problem-Space AE Attack for IDS
- B. Properties of AE
- C. MANDA
(4) Mohammed A. Ambusaidi, et al. Building an Intrusion Detection System Using a Filter-Based Feature Selection Algorithm. IEEE TRANSACTIONS ON COMPUTERS
4 INTRUSION DETECTION FRAMEWORK BASED ON LEAST SQUARE SUPPORT VECTOR MACHINE
- 4.1 Data Collection
- 4.2 Data Preprocessing
- 4.3 Classifier Training
- 4.4 Attack Recognition
(5) Jun Zeng, et al. WATSON: Abstracting Behaviors from Audit Logs via Aggregation of Contextual Semantics. NDSS.
III. WATSON DESIGN
- A. Approach Overview
- B. Knowledge Graph Construction
- C. Event Semantics Inference
- D. Behavior Summarization
- E. Behavior Semantics Aggregation
- F. Behavior Clustering
(6) Ron Bitton, et al. A Machine Learning-Based Intrusion Detection System for Securing Remote Desktop Connections to Electronic Flight Bag Servers. IEEE TDSC.
3 AN OVERVIEW OF THE PROPOSED NIDS FOR SECURING RDP CONNECTIONS
- 3.1 Anomaly Detection for RFB Protocol
- 3.2 Illustrating the Proposed Fine-Grained Algorithm
4 A DETAILED DESCRIPTION OF THE PROPOSED NIDS FOR SECURING RDP CONNECTIONS
- 4.1 Network Monitoring
- 4.2 Feature Extraction
- 4.3 Anomaly Detection
– 4.3.1 Model Construction
– 4.3.2 Detecting Anomalous TCP Packets
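I personally like to describe the model design alongside the framework figure; criticism and corrections are welcome. The examples below are drawn mainly from CCF-A conference and journal papers, with a focus on the field of intrusion detection systems (IDS).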
In this section, we present the design details of Whisper, i.e., the design of three main modules in Whisper.
We will first describe the design goals of our provenance-based intrusion detection system, then we elaborate (describe in detail) the details on design and implementation of this system.
In this section, we outline our SDN-based system to enforce MUD policies and dynamically inspect exception traffic which is a small fraction of total packets to/from IoT devices. Our system uses as input MUD profiles of 28 consumer IoT devices that we have automatically generated by the MUDgee tool using packet traces collected over several months. We next begin with the architecture of our system.
In this section, we present the design of MANDA, the proposed AE detector for ML-based IDS, and explain the rationale behind each design choice. The valid input to an IDS system is real network traffic flows in the problem-space. Therefore, the generated AE should also lie in the same problem-space of IDS. We adapt existing feature-space AE generation algorithms to problem-space algorithms in order to generate AEs that can map back to valid real network events. The key insight for detecting AEs is to identify the discrepancy between true benign samples and AEs. Such an intuition motivates us to investigate AE’s position to the decision-boundary of the IDS model and its position in the traffic manifolds formed by training samples.
In this section, the sensor data are analyzed in the frequency domain based on the NPGF. First, the second-order NPGF is introduced in Section III-A to reconstruct the IoT sensor data. To conduct the frequency-domain analysis for the sensor data reconstruction, GFT and its inverse transform are presented in Section III-B, and the frequency-domain analysis for the second-order NPGF is presented in Section III-C.
In this section, we describe how the disagreement-based semi-supervised learning works, and introduce how to use it to construct a false alarm filter (simply called DASSL false alarm filter). Then, we present the framework of DAS-CIDS and show how to combine semi-supervised learning with CIDSs.
In this section, we first present the intuition of a conventional LSTM architecture. Then, we explain in detail the bidirectional LSTM (BiDLSTM) architecture. We further describe the NSL-KDD dataset used to train our model.
In this section, we give a technical description of the features in the data sets used in our experiments. We then explain the details of the techniques employed for adversarial example generation. This leads to the layout of our computational setting.
The overall architecture of Hawkware is depicted in Figure 1. Hawkware is divided into two main modules: the monitor module (MM) and the detection module (DM). MM monitors both the network and device behaviors and extracts the relevant features for the ANN named as Hawknet. DM detects any suspicious behaviors indicating network intrusions by utilizing Hawknet.
Hawkware consists of five components and their functions are summarized as follows: (1) the packet analyzer (PA) analyzes network packet headers and extracts relevant features; (2) the system call logger (SCL) records the device behavior and extracts features related to incoming/outgoing network packets; (3) the feature preprocessor (FP) aggregates both extracted features and transfers them as inputs to Hawknet; (4) the Hawknet controller (HC) examines the Hawknet’s outputs and determines the existence of intrusions; (5) the Hawknet quantifies the degree of anomaly.
The proposed anomaly detection method is designed to prevent malicious entities from exploiting vulnerabilities in the remote desktop server. The proposed solution specifically focuses on the remote framebuffer protocol, which is one of the common protocols used for connecting and interacting with computers remotely.
The framework of the proposed intrusion detection system is depicted in Fig. 1. The detection framework is comprised of four main phases: (1) data collection, where sequences of network packets are collected, (2) data preprocessing, where training and test data are preprocessed and important features that can distinguish one class from the others are selected, (3) classifier training, where the model for classification is trained using LS-SVM, and (4) attack recognition, where the trained classifier is used to detect intrusions on the test data.
Support Vector Machine is a supervised learning method. It studies a given labeled dataset and constructs an optimal hyperplane in the corresponding data space to separate the data into different classes. Instead of solving the classification problem by quadratic programming, Suykens and Vandewalle suggested re-framing the task of classification into a linear programming problem. They named this new formulation the Least Squares SVM (LS-SVM). LS-SVM is a generalized scheme for classification and also incurs low computation complexity in comparison with the ordinary SVM scheme. One can find more details about calculating LS-SVM in Appendix B, available in the online supplemental material. The following sections explain each phase in detail.
Figure 1 summarises the MAGPIE architecture. Its collection phase captures and decodes the data coming from cyber (computation, communication) or physical feeds (e.g., audio, signal strength). It can dynamically activate or deactivate interfaces and decode the corresponding raw feeds, such as sensor readings or network datagrams.
Smart homes generate large volumes of usually encrypted data that may differ considerably between different environments. In the transcription phase, MAGPIE considers only meta-data that are consistent across different smart homes. … Moreover, by reading only smart home network communication flow meta-data, MAGPIE is better positioned to preserve privacy. MAGPIE extracts meta-data streams (MDS) based on specific interface datastream parsing logic (e.g., communication/application/sensor protocol) (Figure 2).
Fig. 1 shows the architecture of Pagoda. It consists of six components, namely, Provenance collection, Provenance pruning, Provenance storage and maintenance, Rule building and deduplication, Detection process and Forensic analysis. The Provenance collection component (i.e., part) is responsible for monitoring the behaviors of the normal/intrusion applications, intercepting the system calls invoked by them and translating these system calls to causality-based provenance records. Then the provenance pruning module omits the provenance records that are not related to intrusion detection to improve the detection accuracy and save the storage space simultaneously. The Provenance storage and maintenance component uses key-value memory database (e.g., Redis) to store rule database and run the provenance-based intrusion detection algorithm to make real-time detection. The Rule building and deduplication module constructs the rule sets for intrusion detection and removes the duplicated strings to make the rule database as small as possible. The Detection process component judges whether the intrusion has happened according to the rule sets and also updates the rule sets according to the detection results. At last, the Forensic analysis module looks for the system vulnerability and intrusion sources by making forward and backward queries.
Fig. 3 illustrates the overall process of applying the proposed fine-grained algorithm for evaluating an incoming packet. As can be seen, the process starts with a new incoming packet. First, two types of features, contextual and content-based, are extracted from the packet (step 1). The contextual features are extracted from the header of the packet, and the content-based features are extracted from the packet’s payload. Next, for dimensionality reduction, as well as for improving detection rate, a ‘message type classification’ PCA model is applied on the content-based features (step 2). The resulting PCA features, along with the contextual features, are used as the input vector to a decision tree model.
The decision tree model is applied on the packet’s feature vector to classify the packet based on its message type (step 3). Note that in the example provided in Fig. 3 the packet is classified by the decision tree model as a ‘Pointer Event’ packet.
Based on the derived message type (‘Pointer Event’), in the next step (step 4) a second PCA model which was initially created for each specific message type, is applied on the original content-based features of the packet.
Finally, the resulting PCA features are used by a k-means algorithm and CBLOF model of the specific message type (‘Pointer Event’) in order to assign an anomaly score for the packet.
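The four steps above amount to a two-stage dispatch: first decide the message type, then score the packet with a model that was built only for that type. The following is a minimal Python sketch of that control flow only, not the authors' implementation. It leaves out both PCA transformations, and a plain nearest-centroid distance stands in for the k-means/CBLOF scoring. The packet layout, the `classify_type` rule and the centroid values are all hypothetical.

```python
import math


def anomaly_score(packet, classify_type, centroids_by_type):
    """Score one packet with the model of its predicted message type.

    Stage 1: classify_type(packet) returns a message type label
             (a stand-in for the decision tree of step 3).
    Stage 2: the score is the distance from the packet's content features
             to the nearest centroid learned for that message type
             (a simplified stand-in for the k-means + CBLOF score).
    """
    msg_type = classify_type(packet)
    centroids = centroids_by_type[msg_type]
    score = min(math.dist(packet["content"], c) for c in centroids)
    return msg_type, score


# Hypothetical per-type centroids, as if learned from benign traffic.
centroids_by_type = {
    "PointerEvent": [[0.0, 0.0], [1.0, 1.0]],
    "KeyEvent": [[5.0, 5.0]],
}


def toy_classifier(packet):
    # Hypothetical rule used only to make the sketch runnable.
    return "PointerEvent" if packet["header"]["length"] <= 6 else "KeyEvent"


packet = {"header": {"length": 6}, "content": [0.0, 3.0]}
msg_type, score = anomaly_score(packet, toy_classifier, centroids_by_type)
print(msg_type, round(score, 3))
```

A larger score means the packet lies further from anything seen for its message type; how that score is turned into an alert (for example, a threshold) is not covered by the excerpt and is left open here.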
A typical architecture of a ML-based IDS is shown in Fig. 1. Usually, IDS is a passive infrastructure which rarely interferes with the network traffic under monitoring. An IDS sniffs the internal interface of the firewall in a read-only mode and sends alerts to an IDS management server via a read-and-write network interface. As Fig. 1 shows, a ML-based IDS is composed of the following modules:
Network Traffic Monitor keeps tracking the ongoing network traffic of a communication and networking system.
Feature Extractor processes the raw traffic data as feature vectors in a pre-defined form.
Training Phase. In the training phase, an ML model is trained with both benign and malicious traffic instances. We refer to the ML model as IDS model.
Detection Phase. In the detection phase, processed runtime traffic instances are fed into the learned model. An alert will be generated if an input instance is classified as positive by IDS model.
Ning Wang, et al. MANDA: On Adversarial Example Detection for Network Intrusion Detection System. IEEE INFOCOM.
The data obtained during the phase of data collection are first processed to generate the basic features such as the ones in KDD Cup 99 dataset. This phase contains three main stages shown as follows.
Data Transferring. The trained classifier requires each record in the input data to be represented as a vector of real number. Thus, every symbolic feature in a dataset is first converted into a numerical value. For example, the KDD CUP 99 dataset contains numerical as well as symbolic features. These symbolic features include the type of protocol (i.e., TCP, UDP and ICMP), service type (e.g., HTTP, FTP, Telnet and so on) and TCP status flag (e.g., SF, REJ and so on). The method simply replaces the values of the categorical attributes with numeric values.
Data Normalisation. An essential step of data preprocessing after transferring all symbolic attributes into numerical values is normalisation. Data normalisation is a process of scaling the value of each attribute into a well-proportioned range, so that the bias in favor of features with greater values is eliminated from the dataset. Data used in Section 5 are standardised. Every feature within each record is normalised by the respective maximum value and falls into the same range of [0-1]. The transferring and normalisation process will also be applied to test data.
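The two stages quoted above (symbolic-to-numeric conversion, then division by the per-feature maximum) can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the paper's code: the integer codes are assigned in order of first appearance, the records are toy values rather than rows from KDD Cup 99, and features are assumed to be non-negative.

```python
def encode_symbolic(records, symbolic_columns):
    """Replace each categorical value with an integer code, per column."""
    mappings = {col: {} for col in symbolic_columns}
    encoded = []
    for record in records:
        row = list(record)
        for col in symbolic_columns:
            codes = mappings[col]
            # Assign the next unused integer the first time a value is seen.
            row[col] = codes.setdefault(row[col], len(codes))
        encoded.append([float(v) for v in row])
    return encoded, mappings


def normalise_by_max(rows, maxima=None):
    """Divide each feature by its maximum so training values fall in [0, 1]."""
    if maxima is None:
        maxima = [max(column) for column in zip(*rows)]
    scaled = [
        [v / m if m else 0.0 for v, m in zip(row, maxima)]
        for row in rows
    ]
    return scaled, maxima


# Toy records: (protocol, service, flag, src_bytes, dst_bytes).
records = [
    ("tcp", "http", "SF", 181, 5450),
    ("udp", "domain_u", "SF", 105, 146),
    ("tcp", "ftp", "REJ", 0, 0),
]
encoded, mappings = encode_symbolic(records, symbolic_columns=[0, 1, 2])
scaled, maxima = normalise_by_max(encoded)
print(mappings[0])
print(scaled[1])
```

Passing the training `maxima` back into `normalise_by_max` is one way to apply the same transformation to test data, as the excerpt requires; test values larger than the training maximum would then exceed 1.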
As mentioned in Section 4.3 above, the NSL-KDD dataset comes with 38 numeric and 3 non-numeric features. However, just as any RNN, the proposed BiDLSTM model only handles numerical data inputs. As a result, there is a need for us to convert all non-numeric features to numeric representations. The features (protocol type, service and flag) are the non-numeric features in the NSL-KDD dataset that require transformation into numeric form. These three features are encoded and assigned integer values unique to each of them. After successfully transforming these features into numeric form, the next appropriate thing is feature scaling. Feature scaling ensures that the dataset is in the normalized form. The values of some features in the NSL-KDD dataset (e.g., src_bytes and dst_bytes) have uneven distribution, so we scale every feature’s values within the range of (0, 1) using Min–Max scaling. By this, we ensure that our classifier does not produce biased outcomes. The expression for the Min–Max feature scaling is as follows:
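For reference, the standard Min–Max formula maps a value x of a feature to x' = (x - min) / (max - min), where min and max are taken over that feature. A minimal Python sketch follows; the sample values are made up for illustration, and mapping a constant column to zeros is a choice made here, not something the excerpt specifies.

```python
def min_max_scale(column):
    """Scale one feature column with x' = (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: avoid division by zero (assumed convention).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Made-up byte counts with a very uneven spread, in the spirit of src_bytes.
src_bytes = [0, 181, 239, 54540]
scaled = min_max_scale(src_bytes)
print(scaled)
```

Note that with this formula the smallest value maps to exactly 0 and the largest to exactly 1, so the result covers the closed interval [0, 1].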
Note that we apply PCA transformation twice in the overall proposed process. In the first task, namely the message type classification, we apply the PCA transformation on all packet payloads in order to bring out the patterns that are useful for differentiating packets by its message types, as well as for dimensionality reduction. In the second task, namely the fine-grained anomaly detection, we apply the PCA transformation on subsets of packets belonging to the same message type; this is performed separately for each message type.
Thus, we need a measure capable of analysing the relation between two variables no matter whether they are linearly or nonlinearly dependent. For these reasons, this work intends to explore a means of selecting optimal features from a feature space regardless of the type of correlation between them.
Even though every connection in a dataset is represented by various features, not all of these features are needed to build an IDS. Therefore, it is important to identify the most informative features of traffic data to achieve higher performance. In the previous section using Algorithm 1, a flexible method for the problem of feature selection, FMIFS, is developed. However, the proposed feature selection algorithms can only rank features in terms of their relevance but they cannot reveal the best number of features that are needed to train a classifier. Therefore, this study applies the same technique proposed in  to determine the optimal number of required features. To do so, the technique first utilises the proposed feature selection algorithm to rank all features based on their importance to the classification processes.
Then, incrementally the technique adds features to the classifier one by one. The final decision of the optimal number of features in each method is taken once the highest classification accuracy in the training dataset is achieved.
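The incremental procedure described above can be sketched as a simple loop over prefixes of the ranked feature list. This is a hedged illustration: `evaluate` is a hypothetical callback that would train the classifier on the given features and return its accuracy, and the feature names and accuracy values below are invented so that the example runs.

```python
def choose_feature_count(ranked_features, evaluate):
    """Add ranked features one by one and keep the best-scoring prefix."""
    best_k, best_accuracy = 0, float("-inf")
    for k in range(1, len(ranked_features) + 1):
        accuracy = evaluate(ranked_features[:k])
        # Strict comparison keeps the smallest subset when accuracies tie.
        if accuracy > best_accuracy:
            best_k, best_accuracy = k, accuracy
    return ranked_features[:best_k], best_accuracy


# Invented ranking and accuracies, standing in for real training runs.
ranked = ["feat_a", "feat_b", "feat_c", "feat_d", "feat_e"]
toy_accuracy = {1: 0.81, 2: 0.90, 3: 0.95, 4: 0.95, 5: 0.93}

subset, accuracy = choose_feature_count(
    ranked, lambda features: toy_accuracy[len(features)]
)
print(subset, accuracy)
```

Preferring the smaller subset on a tie is a design choice made in this sketch; the excerpt only says the decision is taken once the highest training accuracy is reached.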
The selected features for all datasets are depicted in Tables 1 [a, b, c], where each row lists the number and the indexes of the selected features with respect to the corresponding feature selection algorithm. In addition, for KDD Cup 99 and to make a comparison with those systems that have been evaluated on different types of attacks (discussed in Sections 5.5 and 5.6), we construct five classes. One of these classes contains purely the normal records and the other four hold different types of attacks (i.e., DoS, Probe, U2R, R2L), respectively. The proposed feature selection algorithm is applied to the aforementioned classes. The selected features are shown in Table 3.
The proposed solution (i.e., the fine-grained anomaly detection algorithm) for detecting anomalous TCP packets in RFB traffic combines supervised classification techniques, together with unsupervised cluster analysis and anomaly detection methods. The solution consists of two primary phases, the Model Construction Phase in which we utilize legitimate RFB traffic to construct a predictive model which represents the normal protocol behavior; and the Detection Phase in which we apply these models in order to detect anomalous protocol behavior.
Now we utilize the statistical clustering algorithm to learn the patterns of the frequency domain features obtained from the feature extraction module with the selected parameters. We train the statistical clustering algorithm with only benign traffic. In the training phase, this module calculates the clustering centers of the frequency domain features and the averaged training loss. In order to improve the robustness of Whisper and reduce false positive caused by the extreme values, we segment the frequency domain feature matrix R with a sampling window of length. We use … to denote the number of samples and … to denote the start points. We average the sampling window on the dimension of the feature sequence and use … to indicate the input of the clustering algorithm. We can obtain:
In this module, we extract the frequency domain features from high speed traffic. We acquire the per-packet features of packets from the same flow by polling the high speed packet parser module. We use the mathematical representation similar to Bartos et al. to denote the features.
Once the optimal subset of features is selected, this subset is then taken into the classifier training phase where LS-SVM is employed. Since SVMs can only handle binary classification problems and because for KDD Cup 99 five optimal feature subsets are selected for all classes, five LS-SVM classifiers need to be employed. Each classifier distinguishes one class of records from the others. For example the classifier of Normal class distinguishes Normal data from nonNormal (all types of attacks). The DoS class distinguishes DoS traffic from non-DoS data (including Normal, Probe, R2L and U2R instances) and so on. The five LS-SVM classifiers are then combined to build the intrusion detection model to distinguish all different classes.
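The one-classifier-per-class arrangement described above can be sketched as follows. The excerpt does not say how the five outputs are combined, so this sketch assumes the common one-vs-rest rule of picking the class with the highest decision score. The linear scorers, their weights and the feature indexes are hypothetical stand-ins for trained LS-SVM classifiers, and only three of the five classes are shown to keep it short.

```python
def make_classifier(feature_indexes, weights, bias):
    """Build a toy binary scorer that reads only its own feature subset."""
    def score(sample):
        selected = [sample[i] for i in feature_indexes]
        return sum(w * x for w, x in zip(weights, selected)) + bias
    return score


def predict_one_vs_rest(classifiers, sample):
    """Return the class whose binary classifier gives the highest score."""
    scores = {label: clf(sample) for label, clf in classifiers.items()}
    return max(scores, key=scores.get), scores


# Hypothetical classifiers; each uses its own selected features.
classifiers = {
    "Normal": make_classifier([0, 1], [-1.0, -1.0], 1.0),
    "DoS": make_classifier([0], [2.0], -1.0),
    "Probe": make_classifier([1, 2], [2.0, 2.0], -1.0),
}

label, scores = predict_one_vs_rest(classifiers, [0.9, 0.1, 0.0])
print(label)
```

Giving each scorer its own `feature_indexes` mirrors the point in the excerpt that a separate optimal feature subset is selected for every class.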
(1) Chuanpu Fu, et al. Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis. CCS.
(2) Mohammed A. Ambusaidi, et al. Building an Intrusion Detection System Using a Filter-Based Feature Selection Algorithm. IEEE TRANSACTIONS ON COMPUTERS.
(3) Ning Wang, et al. MANDA: On Adversarial Example Detection for Network Intrusion Detection System. IEEE INFOCOM.
(1) Chuanpu Fu, et al. Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis. CCS.
(2) Ron Bitton, et al. A Machine Learning-Based Intrusion Detection System for Securing Remote Desktop Connections to Electronic Flight Bag Servers. IEEE TDSC.
(3) Tohid Shekari, et al. RFDIDS: Radio Frequency-based Distributed Intrusion Detection System for the Power Grid. NDSS.
(4) Ryan Heartfield, et al. Self-Configurable Cyber-Physical Intrusion Detection for Smart Homes Using Reinforcement Learning. IEEE TIFS.
(5) Congyuan Xu, et al. A Method of Few-Shot Network Intrusion Detection Based on Meta-Learning Framework. IEEE TIFS.
(6) Jun Zeng, et al. WATSON: Abstracting Behaviors from Audit Logs via Aggregation of Contextual Semantics. NDSS.
(7) Ning Wang, et al. MANDA: On Adversarial Example Detection for Network Intrusion Detection System. IEEE INFOCOM.
(8) Sunwoo Ahn, et al. Hawkware: Network Intrusion Detection based on Behavior Analysis with ANNs on an IoT Device. DAC.
(9) Xinghua Li, et al. Sustainable Ensemble Learning Driving Intrusion Detection Model. IEEE TDSC.
(10) Neha Gupta, et al. LIO-IDS: Handling class imbalance using LSTM and improved one-vs-one technique in intrusion detection system. Computer Networks.
(By: Eastmount, 2021-12-07, midnight, http://blog.csdn.net/eastmount/)