An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system.
As networked environments grow, almost everyone has access to a network, so there is a need to protect information against attacks that attempt to compromise the confidentiality, integrity, or availability of a resource. Abnormal network traffic, especially denial of service (DoS), is a serious problem from which networks suffer heavily.
Four attack categories are detected:
DoS, Probe, R2L, U2R
Denial-of-service attack (DoS): a class of attacks in which an attacker makes a computing or memory resource too busy or too full to respond to requests, e.g. smurf, neptune, back, teardrop, pod, and land.
Probing (Probe): a class of attacks in which an attacker scans a network to gather information about its potential vulnerabilities, e.g. satan, ipsweep, portsweep, and nmap.
Remote to User attacks (R2L): a class of attacks in which an attacker without an account on the target system sends packets to it over the network and exploits a vulnerability to gain local access as a user, e.g. warezclient, guess_passwd, warezmaster, ftp_write, multihop, phf, spy, and imap.
User to Root attacks (U2R): a class of attacks in which an attacker starts with access to a normal user account on the system and then exploits a vulnerability to gain root access, e.g. buffer_overflow, rootkit, loadmodule, and perl.
Five classes are considered:
DoS, Probe, R2L, U2R, Normal
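As a minimal sketch of how raw record labels might be collapsed into these five classes, the mapping below uses only the attack names listed as examples above. The helper name `to_class` and the `"Unknown"` fallback are assumptions made for illustration; the real dataset contains further attack names that would need to be added.

```python
# Attack names taken from the examples listed above; this is not an exhaustive list.
ATTACK_CLASSES = {
    "DoS": ["smurf", "neptune", "back", "teardrop", "pod", "land"],
    "Probe": ["satan", "ipsweep", "portsweep", "nmap"],
    "R2L": ["warezclient", "guess_passwd", "warezmaster", "ftp_write",
            "multihop", "phf", "spy", "imap"],
    "U2R": ["buffer_overflow", "rootkit", "loadmodule", "perl"],
}

# Invert the table so each attack name points at its class.
LABEL_TO_CLASS = {
    name: cls for cls, names in ATTACK_CLASSES.items() for name in names
}
LABEL_TO_CLASS["normal"] = "Normal"


def to_class(label):
    """Map a raw label to one of the five classes (hypothetical helper).

    Labels not listed above fall back to "Unknown" (an assumption).
    """
    key = label.strip().rstrip(".").lower()
    return LABEL_TO_CLASS.get(key, "Unknown")


print(to_class("neptune"))   # DoS
print(to_class("normal."))   # Normal
```

Stripping a trailing period lets the same helper accept labels written as `normal.` as well as `normal`.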
The input dataset is NSL-KDD, a refined version of the KDD Cup 99 dataset.
The KDD Cup 99 dataset was derived in 1999 from the DARPA98 network traffic dataset by assembling individual TCP packets into TCP connections. It was the benchmark dataset used in the International KDD Tools Competition, and it has been one of the most widely used datasets in the intrusion detection field. The KDD Cup 99 dataset includes a set of 41 features derived for each connection and a label that marks each connection record as either normal or a specific attack type.
The original dataset contains about 744 MB of data with roughly 4.9 million records. However, most researchers work with only a small part of it, the 10% subset, which contains 494,021 records. Each connection record has 41 features plus one class label.
The features are grouped into four categories:
- Basic Features: Basic features can be derived from packet headers without inspecting the payload.
- Content Features: Domain knowledge is used to assess the payload of the original TCP packets. This includes features such as the number of failed login attempts.
- Time-based Traffic Features: These features are designed to capture properties observed over a 2-second time window. One example of such a feature is the number of connections to the same host over the 2-second interval.
- Host-based Traffic Features: These features use a historical window measured in number of connections instead of time. They are designed to assess attacks that span intervals longer than 2 seconds.
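The grouping above can be sketched as four lists of column names used to split a parsed record. The feature names are the ones used in this document; the column order, the comma-separated layout, and the sample record are assumptions made for illustration, and `parse_record` and `split_by_group` are hypothetical helpers.

```python
import csv
import io

BASIC = ["duration", "protocol_type", "service", "flag", "src_bytes",
         "dst_bytes", "land", "wrong_fragment", "urgent"]
CONTENT = ["hot", "num_failed_logins", "logged_in", "num_compromised",
           "root_shell", "su_attempted", "num_root", "num_file_creations",
           "num_shells", "num_access_files", "num_outbound_cmds",
           "is_hot_login", "is_guest_login"]
TIME_BASED = ["count", "srv_count", "serror_rate", "srv_serror_rate",
              "rerror_rate", "srv_rerror_rate", "same_srv_rate",
              "diff_srv_rate", "srv_diff_host_rate"]
HOST_BASED = ["dst_host_count", "dst_host_srv_count",
              "dst_host_same_srv_rate", "dst_host_diff_srv_rate",
              "dst_host_same_src_port_rate", "dst_host_srv_diff_host_rate",
              "dst_host_serror_rate", "dst_host_srv_serror_rate",
              "dst_host_rerror_rate", "dst_host_srv_rerror_rate"]

GROUPS = {"basic": BASIC, "content": CONTENT,
          "time_based": TIME_BASED, "host_based": HOST_BASED}
COLUMNS = BASIC + CONTENT + TIME_BASED + HOST_BASED  # 41 names, assumed order


def parse_record(line):
    """Split one comma-separated record into (features, label)."""
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) < len(COLUMNS) + 1:
        raise ValueError("expected 41 features plus a label")
    features = dict(zip(COLUMNS, fields[:len(COLUMNS)]))
    label = fields[len(COLUMNS)].rstrip(".")  # trailing fields are ignored
    return features, label


def split_by_group(features):
    """Return the features of one record organised by category."""
    return {g: {n: features[n] for n in names} for g, names in GROUPS.items()}


# Illustrative record only, built group by group; not taken from the dataset.
SAMPLE = ",".join([
    "0,tcp,http,SF,181,5450,0,0,0",                 # 9 basic features
    "0,0,1," + ",".join(["0"] * 10),                # 13 content features
    "8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00",       # 9 time-based features
    "9,9,1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00",  # 10 host-based features
    "normal.",
])

features, label = parse_record(SAMPLE)
groups = split_by_group(features)
print(label, {name: len(cols) for name, cols in groups.items()})
```

Values are kept as strings here; converting numeric columns to `int` or `float` would be a natural next step before training a classifier.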
Basic features of individual TCP connections
feature name        description
duration            length (number of seconds) of the connection
protocol_type       type of the protocol, e.g. tcp, udp, etc.
service             network service on the destination, e.g. http, telnet, etc.
src_bytes           number of data bytes from source to destination
dst_bytes           number of data bytes from destination to source
flag                normal or error status of the connection
land                1 if connection is from/to the same host/port; 0 otherwise
wrong_fragment      number of “wrong” fragments
urgent              number of urgent packets

Content features within a connection, suggested by domain knowledge
feature name        description
hot                 number of “hot” indicators
num_failed_logins   number of failed login attempts
logged_in           1 if successfully logged in; 0 otherwise
num_compromised     number of “compromised” conditions
root_shell          1 if root shell is obtained; 0 otherwise
su_attempted        1 if “su root” command attempted; 0 otherwise
num_root            number of “root” accesses
num_file_creations  number of file creation operations
num_shells          number of shell prompts
num_access_files    number of operations on access control files
num_outbound_cmds   number of outbound commands in an ftp session
is_hot_login        1 if the login belongs to the “hot” list; 0 otherwise
is_guest_login      1 if the login is a “guest” login; 0 otherwise
Traffic features computed using a two-second time window
feature name        description
count               number of connections to the same host as the current connection in the past two seconds
serror_rate         % of same-host connections that have “SYN” errors
rerror_rate         % of same-host connections that have “REJ” errors
same_srv_rate       % of same-host connections to the same service
diff_srv_rate       % of same-host connections to different services
srv_count           number of connections to the same service as the current connection in the past two seconds
srv_serror_rate     % of same-service connections that have “SYN” errors
srv_rerror_rate     % of same-service connections that have “REJ” errors
srv_diff_host_rate  % of same-service connections to different hosts
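To make the two-second window concrete, the sketch below computes a few of these features from a list of earlier connections. It is an illustration under stated assumptions, not the procedure used to build the dataset: the `Conn` record and `time_window_features` function are hypothetical, the history is assumed to exclude the current connection, a “SYN” error is approximated by the flag value `S0` and a “REJ” error by `REJ`, and rates are returned as fractions between 0 and 1 rather than percentages.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    time: float     # start time of the connection, in seconds
    dst_host: str   # destination host
    service: str    # destination service, e.g. "http"
    flag: str       # connection status, e.g. "SF", "S0", "REJ"


def time_window_features(history, current, window=2.0):
    """Compute time-based traffic features for `current` from earlier connections."""
    recent = [c for c in history if 0 <= current.time - c.time <= window]
    same_host = [c for c in recent if c.dst_host == current.dst_host]
    same_srv = [c for c in recent if c.service == current.service]

    def rate(conns, pred):
        # Fraction of `conns` satisfying `pred`; 0.0 for an empty window.
        return sum(1 for c in conns if pred(c)) / len(conns) if conns else 0.0

    return {
        "count": len(same_host),
        "serror_rate": rate(same_host, lambda c: c.flag == "S0"),
        "rerror_rate": rate(same_host, lambda c: c.flag == "REJ"),
        "same_srv_rate": rate(same_host, lambda c: c.service == current.service),
        "diff_srv_rate": rate(same_host, lambda c: c.service != current.service),
        "srv_count": len(same_srv),
        "srv_diff_host_rate": rate(same_srv, lambda c: c.dst_host != current.dst_host),
    }


history = [
    Conn(0.0, "h1", "http", "SF"),   # older than two seconds, ignored
    Conn(9.0, "h1", "http", "S0"),
    Conn(9.5, "h1", "ftp", "SF"),
    Conn(9.8, "h2", "http", "REJ"),
]
current = Conn(10.0, "h1", "http", "SF")
print(time_window_features(history, current))
```

A burst of half-open connections to one host, as in a neptune-style flood, would show up here as a large `count` together with a `serror_rate` close to 1.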
Host-based Traffic Features
feature name                 description
dst_host_count               count of connections having the same destination host
dst_host_srv_count           count of connections having the same destination host and using the same service
dst_host_same_srv_rate       % of connections having the same destination host and using the same service
dst_host_diff_srv_rate       % of different services on the current host
dst_host_same_src_port_rate  % of connections to the current host having the same src port
dst_host_srv_diff_host_rate  % of connections to the same service coming from different hosts
dst_host_serror_rate         % of connections to the current host that have an S0 error
dst_host_srv_serror_rate     % of connections to the current host and specified service that have an S0 error
dst_host_rerror_rate         % of connections to the current host that have an RST error
dst_host_srv_rerror_rate     % of connections to the current host and specified service that have an RST error
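Because host-based features use a window counted in connections rather than seconds, a fixed-length queue is one way to sketch them. The `HostWindow` class below is a hypothetical helper, the default window size of 100 connections is an assumption not stated in this document, and only four of the ten features are shown, with rates expressed as fractions between 0 and 1.

```python
from collections import deque


class HostWindow:
    """Sliding window over the most recent `size` connections (hypothetical helper)."""

    def __init__(self, size=100):
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self.recent = deque(maxlen=size)

    def features(self, dst_host, service, src_port):
        """Return host-based features for a new connection, then record it."""
        same_host = [c for c in self.recent if c[0] == dst_host]
        same_host_srv = [c for c in same_host if c[1] == service]
        same_port = [c for c in same_host if c[2] == src_port]
        n = len(same_host)
        feats = {
            "dst_host_count": n,
            "dst_host_srv_count": len(same_host_srv),
            "dst_host_same_srv_rate": len(same_host_srv) / n if n else 0.0,
            "dst_host_same_src_port_rate": len(same_port) / n if n else 0.0,
        }
        self.recent.append((dst_host, service, src_port))
        return feats


window = HostWindow(size=3)
window.features("h1", "http", 1000)
window.features("h1", "http", 1001)
print(window.features("h1", "ftp", 1000))
```

Since the window depends on the number of connections and not on elapsed time, a slow scan that spreads its probes over minutes still accumulates in `dst_host_count`, which is the motivation given above for this feature group.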