Packet Inspection for Unauthorized OS Detection in Enterprises

This article first appeared in IEEE Security & Privacy magazine. IEEE Security & Privacy offers solid, peer-reviewed information about today's strategic technology issues. To meet the challenges of running reliable, flexible enterprises, IT managers and technical leads rely on IT Pro for state-of-the-art solutions.

 

Many recent malware implementations employ virtual machines to carry out malicious activities. These are hard to detect because antivirus software running in the native OS can’t detect virtual machines’ system states. An approach that uses TCP SYN packets for OS fingerprinting can detect the presence of unauthorized OSs.

Many modern malware implementations carry out their activities using virtual machines to escape detection from antivirus software running on the host OS. 1 This malware can also be used as part of a botnet to transfer information from the infected machine to a command-and-control center. Such malware is hard to detect because the context data and state of the programs run by a virtual machine can’t be accessed by antivirus software installed on the native OS.

Due to increasing OS vulnerabilities, enterprise network administrators must regularly perform OS audits. These help determine the services running on different systems and identify OSs with flaws that might cause vulnerabilities in the enterprise network. Audits also help in configuring network-based intrusion detection systems and maintaining an adaptive enterprise security policy. If the OSs running in virtual machines differ from the native OSs recorded in the audit, these malicious OSs can be identified and the infected machines can be cleaned. Although various RFCs specify definitions and interpretations of different TCP/IP packet fields, 2–7 many fail to specify a standard set of initial values for these fields. As a consequence, developers of various OSs implement the protocol stack with different initial values for these fields.

Similar to the way a human fingerprint serves as a tool to uniquely identify a person, an OS can be uniquely identified on a network by its packet fingerprint. Packet fingerprints are derived from the implementation dissimilarities of various OSs’ communication protocols. By analyzing initial values of certain protocol flags, options, packet fields, and data in the packets that a host sends over a network, we can determine the OS installed in a host. If the OS determined from the packet generated by an enterprise host differs from the original OS installed in the host machine, an unauthorized OS is likely present. An unauthorized OS might have been installed either by a malicious user without obtaining permission from a network administrator or by a virtual machine installed by malware over a native host OS.

In this article, we present an approach to passively fingerprint OSs using the information available from TCP packets as well as a system to use this approach to detect unauthorized OSs in an enterprise network. For more information on related work in OS fingerprinting, see the sidebar.

Proposed Unauthorized OS Detection System for Enterprises

Figure 1 shows the outline of our proposed system. The enterprise network connects to the Internet through a firewall that is typically combined with a router. The enterprise router captures outgoing TCP SYN packets from the network, which are forwarded to an OS fingerprinting analyzer that extracts and analyzes their headers. Our OS fingerprinting analyzer applies a Euclidean distance estimation algorithm using specific header fields and determines the OS of the machine generating the packet. The system also contains an OS database that includes information about all the authorized OSs installed in an enterprise’s machines. Such OS databases might be created through the OS and software audit process. The classifier compares the database’s OS information with the OS fingerprinting analyzer’s results and makes a decision about the presence of an unauthorized OS.

OS Fingerprinting Strategy

Our system’s OS fingerprinting approach uses a TCP SYN packet’s TCP/IP headers to identify the OS generating the packet. IP assigns logical addresses to hosts to enable communication, and TCP ensures reliable packet delivery. OSs must conform to the rules specified in the relevant RFCs to implement these protocols.

Again, many RFCs fail to specify a standard set of initial values for the TCP header fields; thus, OS developers implemented the TCP/IP protocol stack and TCP/IP header fields with differing initial values. For example, a TCP/IP header’s window size and time to live (TTL) fields don’t require any specific initial values. The concept of passive OS fingerprinting arises from the disparity among OSs in the implementation of these protocol fields’ initial values. In our application, we use TTL, total length, window size, and TCP options fields.
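The four fields named above can be read directly from the raw bytes of a captured packet. The following is a minimal sketch, assuming an IPv4 packet without link-layer framing; the function name `extract_syn_fields` and the dictionary keys are hypothetical and are not taken from the article.

```python
import struct
from typing import Optional


def extract_syn_fields(packet: bytes) -> Optional[dict]:
    """Return fingerprinting fields from a raw IPv4 packet.

    Returns None unless the packet is a TCP SYN with the ACK bit clear.
    """
    # Need at least a 20-byte IP header and a 20-byte TCP header,
    # IP version 4, and protocol number 6 (TCP).
    if len(packet) < 40 or packet[0] >> 4 != 4 or packet[9] != 6:
        return None
    ip_header_len = (packet[0] & 0x0F) * 4          # IHL is in 32-bit words
    (total_length,) = struct.unpack_from("!H", packet, 2)
    (flags_frag,) = struct.unpack_from("!H", packet, 6)
    ttl = packet[8]

    tcp = packet[ip_header_len:]
    if len(tcp) < 20:
        return None
    tcp_header_len = (tcp[12] >> 4) * 4             # data offset, in words
    if tcp[13] & 0x12 != 0x02:                      # SYN set, ACK clear
        return None
    (window_size,) = struct.unpack_from("!H", tcp, 14)

    return {
        "ttl": ttl,
        "total_length": total_length,
        "df": bool(flags_frag & 0x4000),            # Don't Fragment bit
        "window_size": window_size,
        "options": tcp[20:tcp_header_len],          # raw TCP options bytes
    }
```

The sketch ignores IPv6 and does not validate checksums; a deployed capture path would need to handle both.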

Time to live. The TTL field indicates the maximum time a datagram can remain in the Internet system. If the TTL field contains the value zero, the datagram must be dropped by a router. Table 1 shows typical TTL values for various OSs.
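Because each router on the path decrements the TTL, the value seen at the capture point can be lower than the sender's initial value. The article does not say how this is handled; one common convention, shown here purely as an assumption, is to round the observed value up to the nearest typical initial TTL (64, 128, or 255). The name `estimate_initial_ttl` is hypothetical.

```python
# Typical initial TTL values mentioned in the article (64, 128, 255).
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    # Unreachable: 255 is the largest value an 8-bit TTL can hold.
    raise AssertionError("unexpected TTL")
```

This heuristic misclassifies a host whose packets cross enough hops to fall below the next lower initial value, so it is best suited to capture points close to the sender.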

Total length. The total length field indicates the length of the packet and includes the TCP header, IP header, and payload. This field plays an important role in passive OS fingerprinting. For a SYN packet, the total length is OS specific. Each OS sets its own TCP NOP (No Operation) options, which affect the total length, making each OS unique in total packet length. In some cases, we can identify an OS by the SYN packet length alone. Every OS uses at least one TCP option field, typically maximum segment size, so the TCP SYN packet length should be at least 44 bytes. Thus, a 40-byte TCP SYN packet is a strong indication of a crafted packet.
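The 44-byte lower bound follows from the minimum header sizes: a 20-byte IPv4 header, a 20-byte TCP header, and a 4-byte maximum segment size option. A minimal sketch of the resulting sanity check, with hypothetical names, might look like this:

```python
IP_HEADER_MIN = 20    # bytes, IPv4 header without options
TCP_HEADER_MIN = 20   # bytes, TCP header without options
MSS_OPTION_LEN = 4    # bytes: kind, length, and a 2-byte value

# Smallest SYN length expected from a real OS that sends at least an MSS option.
MIN_SYN_LENGTH = IP_HEADER_MIN + TCP_HEADER_MIN + MSS_OPTION_LEN


def looks_crafted(total_length: int) -> bool:
    """Return True when a SYN packet is too short to carry any TCP option."""
    return total_length < MIN_SYN_LENGTH
```

A 40-byte SYN is flagged by this check, while a 44-byte SYN is not.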

Window size. This field tells us how many octets of data the receiving computer is ready to accept. Once a connection is established and data is transferred, the window size changes depending on how much data occupies the receiver buffer. The initial value for window size can be different for each OS. Table 1 also shows popular OSs’ initial window size.

Figure 1. Proposed system for packet-based unauthorized OS fingerprinting in enterprise networks. The OS fingerprint analyzer utilizes TCP header information to detect the OS that originated the packet. The classifier verifies the detected OS with the authorized enterprise OS database. Deviation of the detected OS from the authorized OS indicates the presence of an unauthorized OS.

Table 1

TCP options. At times, the TCP options field alone can identify OSs. Typically, it includes the following:

  • Maximum segment size. This field communicates the maximum receivable segment size at the TCP endpoint sending this segment.
  • Time stamp options. The time stamp options field carries two four-byte time stamp fields. The time stamp value (TSval) field contains the current value of the time stamp clock of the TCP endpoint sending the option. The time stamp echo reply (TSecr) field is valid only if the ACK (acknowledgment) bit is set in the TCP header. If TSecr is valid, it echoes the TSval sent by the remote TCP endpoint in the TSval field of a time stamp option field. When TSecr isn’t valid, its value must be zero. The TSecr value will generally be from the most recent time stamp option received; however, there are exceptions as explained in the RFCs. 2–7
  • SACK OK (selective acknowledgment). Selective acknowledgment informs the sender of all received data so that the sender can retransmit only data that wasn’t received.
  • NOP. NOPs are one byte in size and used to pad the TCP options field to increase the packet length to a number divisible by four.
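The options listed above are encoded as a sequence of kind, length, and value bytes in the TCP header. The following is a minimal sketch of a decoder for that layout; the option kind numbers come from the TCP specifications, while the function name `parse_tcp_options` and the returned tuple format are assumptions made for illustration.

```python
import struct

# Option kind numbers defined in the TCP specifications.
END_OF_LIST, NOP, MSS, SACK_OK, TIMESTAMP = 0, 1, 2, 4, 8


def parse_tcp_options(raw: bytes) -> list:
    """Decode the options area of a TCP header into (name, value) pairs."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == END_OF_LIST:
            break
        if kind == NOP:                      # single-byte padding option
            options.append(("NOP", None))
            i += 1
            continue
        if i + 1 >= len(raw):
            break                            # truncated option, stop parsing
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break                            # malformed length, stop parsing
        body = raw[i + 2 : i + length]
        if kind == MSS and len(body) == 2:
            options.append(("MSS", struct.unpack("!H", body)[0]))
        elif kind == SACK_OK:
            options.append(("SACK_OK", True))
        elif kind == TIMESTAMP and len(body) == 8:
            # (TSval, TSecr); TSecr is zero in a SYN without ACK.
            options.append(("TIMESTAMP", struct.unpack("!II", body)))
        else:
            options.append(("KIND_%d" % kind, body))
        i += length
    return options
```

Counting the `NOP` entries and checking for `SACK_OK` and `TIMESTAMP` in the result yields the option-related components of the fingerprint vector used later in the article.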

Table 2

The OSs are fingerprinted based on the data shown in Tables 1 and 2. The data is limited to specific versions of popular OSs; most of today’s enterprise networks work only with these sets of OSs. Different versions of Linux vary only in window size. Therefore, if we fail to obtain an exact version of Linux, our system just estimates the OS to be Linux. Similarly, multiple OS versions from the same vendor might have identical fields; in this case, our system returns the names of both versions.

We obtained the data for different versions of Windows, Linux, Mac OS, Cisco IOS, and AIX 4.3 through manual examination of their TCP SYN packets. The rest of the data was obtained from RFC 1323. 6 To identify OSs based on Tables 1 and 2, we used Euclidean distance to obtain the deviation from the different OSs’ defined values. The Euclidean distance for an n-dimensional vector is

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{1}$$

where d is the distance between two vectors x and y, and i varies from 1 to n where n is the size of the vector. The OS fields in Tables 1 and 2 are used as an n-dimensional vector, for example, x = [TTL, packet size, window size, NOP, SACK OK, Don’t Fragment (DF), time stamp], n = 7, and any new outgoing TCP SYN packet from the enterprise network is used to create another n-dimensional vector using the same fields. We compute the Euclidean distance between two vectors using Equation 1 for every set of OSs and choose the one with the least distance (ideally zero) as the OS generating that TCP SYN packet.
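The nearest-match step can be sketched as follows. The reference vectors below are hypothetical placeholders in the field order [TTL, packet size, window size, NOP, SACK OK, DF, time stamp]; they are not the values from Tables 1 and 2, and the names `euclidean_distance`, `closest_os`, and `REFERENCE_VECTORS` are assumptions.

```python
import math

# Hypothetical reference fingerprints, NOT the article's table values.
# Field order: [TTL, packet size, window size, NOP, SACK OK, DF, time stamp]
REFERENCE_VECTORS = {
    "example-os-a": [64, 60, 5840, 1, 1, 1, 1],
    "example-os-b": [128, 52, 8192, 3, 1, 1, 0],
}


def euclidean_distance(x, y):
    """Distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def closest_os(packet_vector, references=REFERENCE_VECTORS):
    """Return (os_name, distance) for the reference with the least distance."""
    return min(
        ((name, euclidean_distance(packet_vector, ref))
         for name, ref in references.items()),
        key=lambda pair: pair[1],
    )
```

With these placeholder values, `closest_os([128, 52, 8192, 3, 1, 1, 0])` returns `("example-os-b", 0.0)`, the ideal zero-distance case. On untransformed values the window size dominates the sum, which is why the fields are rescaled before comparison.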

To give equal importance to all fields in the distance calculation, we mathematically transform certain fields’ values to comparable values (see Table 3). For example, we take the packet length’s logarithm to the base 4 and window size logarithm to the base 10, and we transform the TTL value to 1, 2, or 3 for the values 64, 128, and 255, respectively. We don’t transform the other fields.
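A minimal sketch of these transformations, assuming the field order used for the vector x and a hypothetical function name `transform_fields`:

```python
import math

# TTL values 64, 128, and 255 map to 1, 2, and 3, as described in the text.
TTL_CODES = {64: 1, 128: 2, 255: 3}


def transform_fields(ttl, packet_length, window_size, nop, sack_ok, df, timestamp):
    """Rescale TTL, packet length, and window size; pass other fields through.

    packet_length and window_size must be positive, since a logarithm is taken.
    A TTL outside the three known initial values raises KeyError.
    """
    return [
        TTL_CODES[ttl],
        math.log(packet_length, 4),   # logarithm to the base 4
        math.log10(window_size),      # logarithm to the base 10
        nop,
        sack_ok,
        df,
        timestamp,
    ]
```

After this rescaling, a 64-byte packet contributes about 3 and a window size of 1,000 contributes 3, so neither field swamps the small integer and flag fields in the distance calculation.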

We compare the fingerprinted OS result with the listed OS in the enterprise database of OSs; if a match fails to occur, an alert is generated indicating the presence of an unauthorized or unregistered OS in the enterprise. The alert could be used for manual verification of a possibly malware-controlled virtual machine.
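The comparison against the enterprise database can be sketched as a simple lookup. The article does not describe the database schema, so this example assumes a mapping from source IP address to the set of OS names authorized for that host; `check_host` and its parameters are hypothetical names.

```python
from typing import Dict, Optional, Set


def check_host(
    source_ip: str,
    detected_candidates: Set[str],
    authorized_os_db: Dict[str, Set[str]],
) -> Optional[str]:
    """Return an alert message, or None when the detected OS is authorized.

    detected_candidates holds every OS name the fingerprinting step returned,
    since versions with identical fields cannot be told apart.
    """
    authorized = authorized_os_db.get(source_ip)
    if authorized is None:
        return "ALERT: %s is not registered in the OS database" % source_ip
    if detected_candidates.isdisjoint(authorized):
        return "ALERT: %s sent packets matching %s, expected %s" % (
            source_ip,
            sorted(detected_candidates),
            sorted(authorized),
        )
    return None
```

For a host registered as Linux, a detection of `{"Windows 7", "Windows 8"}` produces an alert for manual verification, while a detection of `{"Linux"}` returns `None`.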

Table 3

Performance Results

Here we describe the OS fingerprinting results we obtained by applying the algorithm in Figure 2 to a large set of network packets. We installed a Windows 7 virtual machine on a machine with a native Linux OS, then captured and analyzed the packets generated by both OSs using our proposed technique (see Table 4). The packets generated by the Linux OS were successfully identified as coming from Linux. The packets generated by the Windows 7 machine were identified as coming from either Windows 7 or Windows 8, because the initial values of the TCP header fields used by both OSs are almost the same.

We also applied our proposed solution to a subsection of a university’s network. Using our algorithm, we analyzed 2,000 SYN packets captured from a common router. We correctly fingerprinted 95.5 percent of the systems. Approximately 86.3 percent of the systems exactly matched the reference OSs from the enterprise database and were determined with zero distance; 9.2 percent were determined correctly using the minimum distance approach even though there wasn’t an exact match. Approximately 4.5 percent of the results were incorrect.

Figure 2. Packet-based OS fingerprinting algorithm. Our algorithm first extracts the TCP SYN packet fields and then estimates the Euclidean distance with known OSs’ TCP SYN fields. The closest distance is used for detecting the OS. Once a detection is made, it is verified against the enterprise’s OS audit database to confirm the presence of an unauthorized OS.

Table 4

Table 5

Because we captured packets from a common router, some packets were from mobile phone OSs whose information wasn’t incorporated in our database. Others were packets whose OSs were incorrectly identified. Table 5 summarizes the results.

Our OS fingerprinting approach can also be used to detect the OS distribution in enterprise networks or public network environments. For example, Figure 3 shows the OS distribution of a typical university dormitory building, where close to 85 percent of OSs reported were identified as Windows. Furthermore, the fingerprinted OS can be verified against the database of the OSs in the enterprise to evaluate the results.

Although our system can be useful, it can also be attacked by

  • poisoning the database of an enterprise’s authorized OSs,
  • spoofing packet field contents to avoid detection by our solution, or
  • performing a distributed denial-of-service (DDoS) attack on the computer that runs our solution.

Existing database authorization solutions can protect the database we need to use for authorized OS information in enterprises. Furthermore, we haven’t come across malware that manipulates TCP headers generated by a virtualized OS to match the native OS in order to evade OS fingerprinting. Finally, because it’s a passive monitoring system, it’s less vulnerable to DDoS attacks.

Figure 3. OS distribution in a university network. Our technique can be used to identify the distribution of the OSs in a campus network or a public network.

Our system can be deployed for unauthorized OS detection in large enterprise networks. The solution requires information on authorized OSs installed on different enterprise network machines. Because OS audits in enterprises are becoming an important process, our system can be very effective for enterprise network security.

References

  1. S.T. King and P.M. Chen, “SubVirt: Implementing Malware with Virtual Machines,” IEEE Symp. Security and Privacy, 2006; doi:10.1109/SP.2006.38.
  2. RFC 791—Internet Protocol, Sept. 1981.
  3. RFC 792—Internet Control Message Protocol, Sept. 1981.
  4. RFC 793—Transmission Control Protocol, Sept. 1981.
  5. RFC 1122—Requirements for Internet Hosts—Communication Layers, Oct. 1989.
  6. RFC 1323—TCP Extensions for High Performance, May 1992.
  7. RFC 1349—Type of Service in the Internet Protocol Suite, July 1992.

Related Work in OS Fingerprinting

In “HTTP Fingerprinting and Advanced Assessment Techniques,” Saumil Shah proposed a solution to distinguish HTTP server software and the OS by using information included in the HTTP responses. 1

In “Remote OS Detection via TCP/IP Stack Fingerprinting,” Gordon “Fyodor” Lyon used the Network Mapper (Nmap) to fingerprint servers. 2 The Nmap tool has a remote OS fingerprinting function; it sends probe packets to target devices and monitors the response. The application then determines the target’s OS from the response packets, using a database of OS fingerprints. Because Nmap usually initiates a sequence of conversations with the target system, the target system can detect such attempts and disable the responses. Active OS detection also consumes bandwidth and reveals information about the testing process. In comparison, our solution is a passive OS detection approach using a nonexact matching of fingerprints.

In “Passive OS Fingerprinting by DNS Traffic Analysis,” Takashi Matsunaka and his colleagues utilized characteristics of DNS queries specific to each OS, such as unique domain names, query patterns, and time interval. 3 The Passive OS Fingerprinting Tool estimates the number of devices with each OS from the number of queries by using the characteristics of the time interval patterns. This tool has limited passive OS fingerprinting capability. Similarly, tools such as Ettercap and Siphon are passive OS fingerprinting tools that depend on exact matches of fingerprint values.

In “Passive Operating System Identification from TCP/IP Packet Headers,” Richard Lippmann and his colleagues proposed a solution to identify OSs by using TCP header fingerprinting. 4 Their solution relies on an exact matching of the header’s fingerprinted value with the example values.

In contrast, our work focuses on OS fingerprinting by looking into enterprise networks’ TCP packet header fields, transforming the default header values, and comparing them using a Euclidean distance method to avoid the need for exact matches. Furthermore, in our solution, when a TCP packet is detected as originating from a particular OS, this information is verified with the enterprise’s OS database to determine the presence of an unauthorized OS.

References

  1. S. Shah, “HTTP Fingerprinting and Advanced Assessment Techniques,” BlackHat, 2003.
  2. G.F. Lyon, “Remote OS Detection via TCP/IP Stack Fingerprinting,” Nmap, 2011.
  3. T. Matsunaka, A. Yamada, and A. Kubota, “Passive OS Fingerprinting by DNS Traffic Analysis,” Proc. 27th IEEE Int’l Conf. Advanced Information Networking and Applications (AINA 13), 2013, pp. 243–250.
  4. R.P. Lippmann et al., “Passive Operating System Identification from TCP/IP Packet Headers,” Proc. Workshop on Data Mining for Computer Security (DMSEC 03), 2003.

About the Authors

Rohit Tyagi is a scientist/engineer at the Indian Space Research Organisation. His research interests include designing technologies for information security. Tyagi received a B.Tech in avionics from the Indian Institute of Space Science and Technology. Contact him at rohit01011992@yahoo.com.

Tuhin Paul is a scientist/engineer focusing on microwave remote sensors at the Indian Space Research Organisation. His research interests include computer network security using network analysis. Paul received a B.Tech in avionics from the Indian Institute of Space Science and Technology. Contact him at tsagar50@gmail.com.

B.S. Manoj is an associate professor in the Avionics Department of the Indian Institute of Space Science and Technology. Manoj received a PhD in computer science and engineering from the Indian Institute of Technology Madras. Contact him at bsmanoj@ieee.org.

B. Thanudas is a scientist and engineer at Vikram Sarabhai Space Centre and a research scholar at Dr. M.G.R. Educational and Research Institute University. Thanudas received an M.Tech in computer science from the Indian Institute of Technology Madras. Contact him at b_thanudas@vssc.gov.in.

 
