How TCP/IP Protocol Works - Part 1
By Gabriel Torres on March 28, 2012
TCP/IP is the most widely used network protocol today. In this tutorial we will explain how it works in easy-to-follow language.
So, what is a network protocol anyway? A protocol is like a language that two computers use to talk to each other. As in the real world, if they are not speaking the same language, they cannot communicate.
Before going further, we recommend that you read our tutorial The OSI Reference Model for Network Protocols, which is a primer for understanding how network protocols work. Consider the present tutorial a sequel to our OSI Reference Model tutorial.
TCP/IP is not really a protocol, but a set of protocols – a protocol stack, as it is most commonly called. Its name, for example, already refers to two different protocols, TCP (Transmission Control Protocol) and IP (Internet Protocol). There are several other protocols related to TCP/IP like FTP, HTTP, SMTP and UDP – just to name a few. Don’t worry about this for now; we will explain all you need to know about them later.
TCP/IP architecture can be seen in Figure 1.
Figure 1: TCP/IP architecture.
As you can see, TCP/IP has four layers. Programs talk to the Application layer. On the Application layer you will find Application protocols such as SMTP (for e-mail), FTP (for file transfer) and HTTP (for web browsing). Each kind of program talks to a different Application protocol, depending on the program’s purpose.
After processing the program request, the protocol on the Application layer talks to a protocol from the Transport layer, usually TCP. This layer is in charge of taking the data sent by the upper layer, dividing it into packets and sending them to the layer below, Internet. During data reception, this layer is also in charge of putting the packets received from the network in order (because they can arrive out of order) and checking whether the contents of the packets are intact.
On the Internet layer we have IP (Internet Protocol), which takes the packets received from the Transport layer and adds virtual address information, i.e., the address of the computer that is sending the data and the address of the computer that will receive it. These virtual addresses are called IP addresses. On this layer packets are called datagrams. The datagram is then sent to the lower layer, Network Interface.
The Network Interface layer takes the datagrams sent by the Internet layer and sends them over the network (or receives them from the network, if the computer is receiving data). What is inside this layer depends on the type of network your computer is using. Nowadays almost all computers use a type of network called Ethernet, which is available in several speed grades, or its wireless relative, Wi-Fi (IEEE 802.11), which is organized in the same way. Inside the Network Interface layer you will therefore find the Ethernet layers, which are, from top to bottom, Logical Link Control (LLC), Media Access Control (MAC) and Physical. Packets transmitted over the network are called frames.
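The way each layer wraps the data handed down by the layer above can be sketched in a few lines of Python. This is only an illustration: the bracketed headers are placeholder strings, not real protocol headers, and the function name encapsulate is made up for this example.

```python
# Illustrative only: each layer prepends its own (fake) header to whatever
# it receives from the layer above; the MAC layer also appends a trailer.
def encapsulate(app_data: bytes) -> bytes:
    segment = b"[TCP header]" + app_data            # Transport layer
    datagram = b"[IP header]" + segment             # Internet layer
    llc_unit = b"[LLC header]" + datagram           # Network Interface: LLC
    frame = b"[MAC header]" + llc_unit + b"[CRC]"   # Network Interface: MAC
    return frame


print(encapsulate(b"GET / HTTP/1.1"))
```

Reading the printed frame from left to right shows the headers in the reverse order in which they were added, with the program data in the middle.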
Let’s now talk more in depth about the TCP/IP layers and protocols.
The Application layer handles the communication between programs and the transport protocols. There are several different protocols that work on the Application layer. The best known are HTTP (HyperText Transfer Protocol), SMTP (Simple Mail Transfer Protocol), FTP (File Transfer Protocol), SNMP (Simple Network Management Protocol), DNS (Domain Name System) and Telnet. You may have already seen these names before.
When you ask your e-mail program (called an e-mail client) to send an e-mail, it hands this task to the TCP/IP Application layer, where it is served by the SMTP protocol (downloading e-mails stored on an e-mail server is handled by other Application protocols, POP3 or IMAP). When you type a www address into your web browser to open a web page, your browser hands this task to the TCP/IP Application layer, where it is served by the HTTP protocol (that is why web page addresses start with “http://”). And so on.
The Application layer talks to the Transport layer through a port. Ports are numbered, and standard applications use the same port by default. For example, the SMTP protocol uses port 25, the HTTP protocol uses port 80 and the FTP protocol uses ports 20 (for data transmission) and 21 (for control).
The port number allows the Transport protocol (typically TCP) to know which kind of content is inside the packet (for example, that the data being transported is an e-mail), and therefore to know, on the receiving side, to which Application protocol it should deliver the received data. So, when receiving a packet targeted at port 25, TCP knows that it must deliver the data to the protocol connected to this port, usually SMTP, which in turn delivers the data to the program that requested it (the e-mail program).
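The role of the port number on the receiving side can be sketched as a simple lookup. The table below contains only the default ports named in this tutorial, and the names WELL_KNOWN_PORTS and protocol_for_port are hypothetical, chosen just for this example.

```python
# Default ports mentioned in the text; a real system knows many more.
WELL_KNOWN_PORTS = {
    20: "FTP (data)",
    21: "FTP (control)",
    25: "SMTP",
    80: "HTTP",
}


def protocol_for_port(port: int) -> str:
    """Return the Application protocol that normally listens on a port."""
    return WELL_KNOWN_PORTS.get(port, "unknown")


print(protocol_for_port(25))  # SMTP
print(protocol_for_port(80))  # HTTP
```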
In Figure 2 we illustrate how the Application layer works.
When transmitting data, the Transport layer takes data from the Application layer and divides it into several data packets. TCP (Transmission Control Protocol) is the most used protocol on the Transport layer. When receiving data, TCP takes the packets delivered by the Internet layer and puts them in order, because packets can arrive at the destination out of order. It also checks whether the contents of each received packet are intact and sends an acknowledgment to the transmitter, letting it know that the packet arrived intact. If no acknowledgment is received (either because the packet did not reach the destination or because TCP found that the data was corrupted), the transmitter re-sends the lost packet.
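The reordering and acknowledgment idea can be sketched as follows. This is a deliberate simplification: packets are numbered 1, 2, 3 and so on and each one is acknowledged individually, whereas real TCP numbers bytes and uses cumulative acknowledgments. The function names are made up for this example.

```python
def receive(packets):
    """Reorder (sequence_number, data) pairs; return the data and the acks."""
    ordered = sorted(packets, key=lambda p: p[0])
    acks = [seq for seq, _ in ordered]
    data = b"".join(chunk for _, chunk in ordered)
    return data, acks


def to_retransmit(sent_seqs, acks):
    """Sequence numbers that were sent but never acknowledged."""
    return sorted(set(sent_seqs) - set(acks))


# Packets 1, 2 and 4 arrive out of order; packet 3 was lost on the way.
arrived = [(2, b"lo "), (1, b"hel"), (4, b"ld")]
data, acks = receive(arrived)
print(acks)                               # [1, 2, 4]
print(to_retransmit([1, 2, 3, 4], acks))  # [3]
```

The transmitter sees that packet 3 was never acknowledged and sends it again.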
While TCP reorders packets and uses the acknowledgment system we just described, which is desirable when transmitting data, there is another protocol on this layer that does not have these two features. This protocol is called UDP (User Datagram Protocol).
Thus TCP is considered a reliable protocol, while UDP is considered an unreliable protocol. UDP is typically used when low overhead matters more than guaranteed delivery, for example on DNS (Domain Name System) requests. Because it implements neither reordering nor acknowledgments, UDP has less overhead than TCP and is usually faster.
When UDP is used, the application that requested the transmission is in charge of checking whether the data arrived and whether it is intact, and also of reordering the received packets, i.e., the application does the job of TCP.
Both UDP and TCP take the data from the Application layer and add a header to it when transmitting. When receiving data, the header is removed before the data is delivered to the proper port. This header carries several pieces of control information, in particular the source port number, the target port number and a checksum (a calculation used to check whether the data arrived intact at the destination); the TCP header also carries a sequence number (used by its acknowledgment and reordering systems). The UDP header has 8 bytes, while the TCP header has 20 bytes when the options field is not used and up to 60 bytes when it is.
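As an illustration of what such a header looks like, the sketch below packs the four 16-bit fields of the 8-byte UDP header (source port, target port, length and checksum) and implements the 16-bit one's-complement sum on which the checksum is based. It is simplified on purpose: the real UDP checksum also covers a pseudo-header taken from the IP layer, so here the checksum is simply passed in as a parameter. The function names are our own.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum of the given bytes."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_header(src_port: int, dst_port: int, payload: bytes,
               checksum: int = 0) -> bytes:
    """Pack the 8-byte UDP header; length counts header plus payload."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


header = udp_header(5000, 53, b"abc")
print(len(header))  # 8
```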
In Figure 3 we illustrate the data packet generated on the transport layer. This data packet will be sent to the Internet layer (if we are transmitting data) or was sent from the Internet layer (if we are receiving data).
On TCP/IP networks each computer is identified by a unique virtual address, called an IP address. The Internet layer is in charge of adding a header to the data packet received from the Transport layer. Among other control data, this header contains the source IP address and the target IP address, i.e., the IP address of the computer that is sending the data and the IP address of the computer that should receive it.
The network card of each computer has a physical address assigned to it. This address is written in the network card’s read-only memory (ROM) and is called the MAC address. So on a local area network, whenever computer A wants to send data to computer B, it has to know computer B’s MAC address. While computers on a small local area network can easily discover each other’s MAC addresses, this isn’t an easy task on a global network like the Internet.
If no virtual addressing were used, you would have to know the MAC address of the destination computer, which is not only hard to find out but also does not help packet routing, because MAC addresses do not follow a tree-like structure. In other words, with virtual addresses the computers on the same network have sequential addresses, while with MAC addresses the computer with the next sequential MAC address to yours may be in Russia.
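The tree-like structure of IP addresses is easy to see with Python's standard ipaddress module. The network and addresses below are arbitrary private-range examples, and same_network is a helper name made up for this sketch.

```python
import ipaddress


def same_network(addr_a: str, addr_b: str, network: str) -> bool:
    """Check whether two IP addresses both belong to the given network."""
    net = ipaddress.ip_network(network)
    return (ipaddress.ip_address(addr_a) in net
            and ipaddress.ip_address(addr_b) in net)


# Two computers on the example network 192.168.1.0/24, and one outside it.
print(same_network("192.168.1.10", "192.168.1.11", "192.168.1.0/24"))  # True
print(same_network("192.168.1.10", "10.0.0.7", "192.168.1.0/24"))      # False
```

No comparable test exists for MAC addresses, since the address says nothing about where the network card is located.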
Routing is the process of choosing the path that a data packet should take in order to arrive at its destination. When you request data from an Internet server, for example, this data passes through several locations (called routers) before arriving at your computer. If you want to see this in action on Windows, try this: click on Start, Run, Cmd. Then, at the command prompt, type tracert www.google.com. The output will be the path between your computer and Google’s web server. See how the data packet passes through several different routers before arriving at its destination. Each router along the way is also called a hop.
On every network that is connected to the Internet there is a device called a router, which connects the computers on your local area network to the Internet. Every router has a table with its known networks and also a setting called the default gateway, which points to another router on the Internet. When your computer sends a data packet to the Internet, the router connected to your network first checks whether it knows the target computer, in other words, whether the target computer is located on the same network or on a network that the router knows the path to. If it doesn’t, it sends the packet to its default gateway, i.e., to another router. The process then repeats until the data packet arrives at its destination.
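The decision each router makes can be sketched as a table lookup with a fallback. The routing table, the next-hop labels and the gateway address below are all invented for this example; when more than one known network matches, the sketch prefers the most specific one, which is how real routers usually behave.

```python
import ipaddress

# Hypothetical routing table: (known network, where to send the packet).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "deliver locally"),
    (ipaddress.ip_network("10.0.0.0/8"), "router 10.0.0.1"),
]
DEFAULT_GATEWAY = "router 203.0.113.1"


def next_hop(destination: str) -> str:
    """Pick the next hop for a destination, or fall back to the gateway."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    if not matches:
        return DEFAULT_GATEWAY
    # Prefer the most specific (longest prefix) matching network.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop("192.168.1.5"))  # deliver locally
print(next_hop("8.8.8.8"))      # router 203.0.113.1
```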
There are several protocols that work on the Internet layer: IP (Internet Protocol), ICMP (Internet Control Message Protocol), ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol). Data packets are sent using the IP protocol, so that is the protocol we will explain.
The IP protocol takes the data packets received from the Transport layer (from the TCP protocol if you are transmitting real data like e-mails or files) and divides them into datagrams. A datagram is a packet that does not have any kind of acknowledgment system, meaning that IP does not implement acknowledgments and is thus an unreliable protocol.
Note that when transferring data the TCP protocol is used on top of IP, and TCP implements an acknowledgment system. Thus even though the IP protocol does not check whether the datagram arrived at the destination, the TCP protocol does. The connection is therefore reliable, even though the IP protocol alone isn’t.
Each IP datagram can have a maximum size of 65,535 bytes, including its header, which takes 20 bytes when a field called “options” is not used and up to 60 bytes when it is. Thus an IP datagram can carry at most 65,515 bytes of data. If the data received from the Transport layer is bigger than that, it has to be split across as many datagrams as necessary.
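The arithmetic behind these limits is simple enough to check in a few lines. The sketch below assumes the header length is given in bytes; the 24-byte case corresponds to a header carrying 4 bytes of options, and the function names are ours.

```python
import math

MAX_DATAGRAM = 65_535  # largest size an IP datagram can have, header included


def max_payload(header_len: int = 20) -> int:
    """Bytes of data that fit in one datagram for a given header size."""
    return MAX_DATAGRAM - header_len


def datagrams_needed(data_len: int, header_len: int = 20) -> int:
    """Number of maximum-size datagrams required to carry data_len bytes."""
    return math.ceil(data_len / max_payload(header_len))


print(max_payload(20))           # 65515
print(max_payload(24))           # 65511
print(datagrams_needed(65_516))  # 2
```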
In Figure 4 we illustrate the datagram generated on the Internet layer by the IP protocol. It is interesting to notice that what the Internet layer sees as “data” is the whole packet it got from the Transport layer, which includes the TCP or UDP header. This datagram will be sent to the Network Interface layer (if we are transmitting data) or was sent from the Network Interface layer (if we are receiving data).
As we mentioned before, the header added by the IP protocol includes the source IP address, the target IP address and several other control fields.
If you pay close attention, we didn’t say that the IP datagram has 65,535 bytes, but that it can have up to 65,535 bytes. This means that the data field of the datagram does not have a fixed size. Since datagrams are sent over the network inside frames produced by the Network Interface layer, the operating system usually configures the size of the IP datagram to match the maximum size of the data area of the frames used on your network. The maximum size of the data field of the frames that are sent over the network is called the MTU, Maximum Transmission Unit.
Ethernet networks, which are the most common type of network available (wireless networks normally use the same value), can carry up to 1,500 bytes of data per frame, i.e., their MTU is 1,500 bytes. Thus the operating system usually configures the IP protocol to create IP datagrams that are at most 1,500 bytes long, instead of 65,535 (which wouldn’t fit in the frame). Later in this tutorial we will see that the real size is 1,497 or 1,492 bytes, as the LLC layer “eats” 3 bytes for its header, or 8 bytes when the 5-byte SNAP extension is also used.
A clarification: you may wonder how a network can be classified as TCP/IP and Ethernet at the same time. TCP/IP is a set of protocols that deals with layers 3 to 7 from the OSI reference model. Ethernet is a set of protocols that deals with layers 1 and 2 from the OSI reference model, meaning that Ethernet deals with the physical aspect of the data transmission. So they complement each other, as we need the full seven layers (or their equivalents) to establish a network connection. We will explain more about this relationship later in this tutorial.
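To get a feeling for what a 1,500-byte limit means in practice, the sketch below estimates how many datagrams are needed to send a given amount of program data over TCP. It assumes the full 1,500 bytes are available to IP and that both the IP and the TCP header have their minimum size of 20 bytes; the names are made up for this example.

```python
import math

MTU = 1500       # data area of an Ethernet frame, in bytes
IP_HEADER = 20   # minimum IP header size (no options)
TCP_HEADER = 20  # minimum TCP header size (no options)


def payload_per_datagram(mtu: int = MTU) -> int:
    """Program data that fits in one datagram after both headers."""
    return mtu - IP_HEADER - TCP_HEADER


def datagrams_for(n_bytes: int, mtu: int = MTU) -> int:
    """Datagrams needed to carry n_bytes of program data."""
    return math.ceil(n_bytes / payload_per_datagram(mtu))


print(payload_per_datagram())    # 1460
print(datagrams_for(1_000_000))  # 685
```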
Another feature that IP protocol allows is fragmentation. As we mentioned, until arriving at its destination, the IP datagram will probably pass through several other networks in the middle of the road. If all networks in the path between the transmitting computer and the receiving one use the same kind of network (e.g., Ethernet) then everything is fine, as all routers will work with the same frame structure (i.e., the same MTU size).
However, if those other networks are not Ethernet networks, they may use a different MTU size. If that happens, the router that receives the frames built for a 1,500-byte MTU cuts the IP datagram inside each frame into as many fragments as necessary to cross the network with the smaller MTU. In standard IPv4 the fragments are then re-assembled into the original datagram by the destination computer, not by the routers along the path.
In Figure 5, you can see an example of this. The original frame uses an MTU of 1,500 bytes. When it arrives at a network with an MTU of 620 bytes, each frame has to be broken into three frames (two with 600 bytes and one with 300 bytes). The figure shows the original datagram being re-assembled by the router at the exit of this network (router 2); in practice this step is normally left to the receiving computer.
The IP header has fields for controlling fragmentation: an identification number, flags and a fragment offset.
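How a datagram is cut into fragments can be sketched as below. The sketch assumes a 20-byte IP header on every fragment and works on the data part of the datagram only, so a 1,500-byte datagram contributes 1,480 bytes of data. In the real header the fragment offset is stored in units of 8 bytes, which is why every fragment except the last must carry a multiple of 8 bytes. The function name fragment is our own.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset, length, more_fragments) for each fragment."""
    max_data = (mtu - header_len) // 8 * 8  # round down to a multiple of 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len  # False only on the last one
        fragments.append((offset, length, more))
        offset += length
    return fragments


# 1,480 bytes of data crossing a network with a 620-byte MTU.
for piece in fragment(1480, 620):
    print(piece)
```

With these numbers the datagram is split into three fragments carrying 600, 600 and 280 bytes of data.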
Datagrams generated on the Internet layer will be sent down to the Network Interface layer, if we are sending data, or the Network Interface layer will get data from the network and send it to the Internet layer, if we are receiving data.
This layer is defined by the type of physical network your computer is connected to. Almost always your computer will be connected to an Ethernet network or to a wireless (Wi-Fi) network, which is organized in the same way, as we will explain.
As we said before, TCP/IP is a set of protocols that deals with layers 3 to 7 from the OSI reference model, while Ethernet deals with layers 1 and 2, i.e., with the physical aspect of the data transmission. They complement each other, as we need the full seven layers (or their equivalents) to establish a network connection.
Ethernet has three layers: Logical Link Control (LLC), Media Access Control (MAC) and Physical. The LLC and MAC layers correspond, together, to the second layer from the OSI reference model. You can see the Ethernet architecture in Figure 6.
The Logical Link Control layer (LLC) is in charge of adding information about which protocol on the Internet layer delivered the data to be transmitted, so that, when a frame is received from the network, this layer on the receiving computer knows to which Internet layer protocol it should deliver the data. This layer is defined by the IEEE 802.2 standard.
The Media Access Control layer (MAC) is in charge of assembling the frame that will be sent over the network. This layer adds the source MAC address and the target MAC address; as we explained before, the MAC address is the physical address of a network card. Frames that are targeted at another network use the router’s MAC address as the target address. This layer is defined by the IEEE 802.3 standard, if a cabled network is being used, or by the IEEE 802.11 standard, if a wireless network is being used.
The Physical layer is in charge of converting the frame generated by the MAC layer into electrical signals (if a cabled network is being used) or into electromagnetic waves (if a wireless network is being used). This layer is also defined by the IEEE 802.3 standard, if a cabled network is being used, or by the IEEE 802.11 standard, if a wireless network is being used.
The LLC and MAC layers add their own headers to the datagram they receive from the Internet layer. So a complete structure of the frames generated by these two layers can be seen in Figure 7. Notice that the headers added by the upper layers are seen as “data” by the LLC layer. The same thing happens with the header inserted by the LLC layer, which will be seen as data by the MAC layer.
The LLC layer adds a 3-byte header (8 bytes when the 5-byte SNAP extension is also used), and the resulting LLC data unit has a maximum total size of 1,500 bytes, leaving a maximum of 1,497 or 1,492 bytes for data. The MAC layer adds a 22-byte header at the beginning and a 4-byte CRC (used for error detection) at the end of the data unit received from the LLC layer, forming the Ethernet frame. Thus the maximum size of an Ethernet frame is 1,526 bytes.
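The frame sizes can be checked with a short calculation, and the idea behind the CRC can be illustrated at the same time. In this sketch the 8-byte case assumes an LLC header followed by the SNAP extension, and Python's zlib.crc32 is used only as a stand-in to show how a trailing CRC reveals corrupted data; the sketch does not reproduce the exact bit layout of a real Ethernet frame. All function names are made up for this example.

```python
import zlib

MTU = 1500       # maximum size of the data unit handed to the MAC layer
MAC_HEADER = 22  # bytes added by the MAC layer in front of the data
CRC_LEN = 4      # bytes added by the MAC layer after the data


def max_frame_size() -> int:
    """Largest Ethernet frame: MAC header + data unit + CRC."""
    return MAC_HEADER + MTU + CRC_LEN


def llc_payload(llc_header: int) -> int:
    """Bytes left for the IP datagram after the LLC header."""
    return MTU - llc_header


def append_crc(data: bytes) -> bytes:
    """Append a 4-byte CRC so the receiver can detect corruption."""
    return data + zlib.crc32(data).to_bytes(CRC_LEN, "little")


def is_intact(frame: bytes) -> bool:
    """Recompute the CRC and compare it with the one in the frame."""
    data, received = frame[:-CRC_LEN], frame[-CRC_LEN:]
    return zlib.crc32(data) == int.from_bytes(received, "little")


print(max_frame_size())                # 1526
print(llc_payload(3), llc_payload(8))  # 1497 1492
frame = append_crc(b"hello")
print(is_intact(frame))                # True
print(is_intact(b"jello" + frame[5:])) # False
```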
To learn more about other TCP/IP protocols and functionalities, read the second part of this tutorial.