A protocol is a set of rules that lets computers communicate. A useful analogy is the language we use to communicate with each other. A computer can only understand the protocols it has been programmed to support.
There are many network protocols. Let’s look at the most important ones in depth, and then cover more briefly the others that are also useful when developing systems.
IP – Internet Protocol
IP stands for Internet Protocol. It’s a set of rules that govern how data is transmitted over the internet. Every device connecting to the internet has an IP address, a unique numerical identifier assigned to it. This address allows data to be sent over the internet from one device to another.
IP addresses come in either IPv4 or IPv6 format. An IPv4 address consists of four numbers from 0 to 255 separated by periods (for example, 192.168.1.10), while an IPv6 address is longer and is written as groups of hexadecimal digits separated by colons (for example, 2001:db8::1).
IP also governs how data is broken up into packets, how packets are addressed and routed, and how they are reassembled into the original data at their destination. All internet communication, including email, web browsing, and file transfers, relies on IP.
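To make the two address formats concrete, here is a minimal sketch using Python’s standard ipaddress module. The helper name describe and the sample addresses are assumptions chosen for illustration; they are not part of any protocol.

```python
import ipaddress

def describe(address: str) -> dict:
    """Parse an address string and report a few basic properties."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    return {
        "version": ip.version,        # 4 or 6
        "bits": ip.max_prefixlen,     # 32 for IPv4, 128 for IPv6
        "is_private": ip.is_private,  # True for ranges such as 192.168.0.0/16
        "normalized": str(ip),        # canonical text form
    }

# Illustrative addresses only.
v4 = describe("192.168.1.10")
v6 = describe("2001:0db8:0000:0000:0000:0000:0000:0001")

print(v4)
print(v6)
```

The IPv6 example is written in its long form on purpose: the normalized value shows how runs of zero groups are usually compressed with a double colon.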
TCP
Transmission Control Protocol (TCP) is a communication protocol that reliably transmits data over the internet. It establishes a connection between two devices, breaks the data into smaller segments, detects lost or corrupted segments and retransmits them, delivers the data in order, and regulates the data flow so that the receiver is not overwhelmed. It’s like a postal service that guarantees that letters are delivered correctly and resends any letter that gets lost along the way.
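The connection-oriented behavior described above can be sketched with Python’s standard socket module. This is a minimal loopback echo, assuming a single client and a free local port; the function names run_echo_server and echo_once are hypothetical, and a real server would handle many connections and errors.

```python
import socket
import threading

def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever it receives."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # empty bytes: the client closed its sending side
                break
            conn.sendall(chunk)

def echo_once(message: bytes) -> bytes:
    # Port 0 asks the operating system for any free port on loopback.
    with socket.create_server(("127.0.0.1", 0)) as server_sock:
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()
        # create_connection performs the TCP handshake with the server.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal "no more data"
            received = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                received += chunk
        worker.join()
    return received

reply = echo_once(b"hello over TCP")
print(reply)
```

Note that the application only sees a stream of bytes. Segmentation, acknowledgements, and retransmission are handled by the operating system’s TCP implementation.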
HTTP
HTTP (Hypertext Transfer Protocol) is the protocol used for transferring data over the World Wide Web. It allows your web browser to communicate with web servers and retrieve web pages, images, videos, and other resources. When you type a URL into your web browser, it sends an HTTP request to the web server asking for the resource you want to access.
The server then responds with an HTTP response containing the requested resource or an error message if the resource cannot be found or any other error happens. HTTP is a client-server protocol that involves communication between a client (your web browser) and a server (the web server hosting the resource you want to access).
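The request and response cycle can be tried locally with Python’s standard http.server and http.client modules. This is only a sketch: the HelloHandler class, the /hello path, and the fetch helper are assumptions made for the example, and http.server is meant for experiments rather than production use.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Serves one hypothetical resource, /hello, and a 404 for anything else."""

    def do_GET(self):
        if self.path == "/hello":
            body = b"<h1>Hello</h1>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=UTF-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch(path: str):
    """Start a throwaway server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        thread.join()
        server.server_close()

print(fetch("/hello"))
print(fetch("/missing")[0])
```

Here the script plays both roles: HelloHandler is the server side, and the http.client call is the client that a web browser would normally be.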
HTTP Request Header
When sending an HTTP request to a server, we need to specify the URL or endpoint we want to access and the request method. More complex requests may also need to carry authentication credentials or declare the type of data being sent.
In the example below, the client is sending a GET request to the server to retrieve the file “example.html.” The request header contains the request method (GET), the requested resource (“/example.html”), the HTTP version (HTTP/1.1), and the hostname of the server being requested (www.example.com). Additionally, the request header includes information about the client making the request, such as the user agent (Firefox web browser on a Windows 10 computer) and the types of content it can accept.
GET /example.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

The server could answer with a response header like this one:

HTTP/1.1 200 OK
Date: Mon, 24 Jul 2023 16:30:00 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
Content-Type: text/html; charset=UTF-8
Content-Length: 1234
Connection: close
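Since an HTTP header is plain text, the GET request shown above can be taken apart with a few string operations. The sketch below is a simplified parser, assuming well-formed input with CRLF line endings; the function name parse_request_header is hypothetical, and real servers handle many more edge cases.

```python
RAW_REQUEST = (
    "GET /example.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) "
    "Gecko/20100101 Firefox/89.0\r\n"
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,"
    "image/webp,*/*;q=0.8\r\n"
    "\r\n"  # a blank line ends the header section
)

def parse_request_header(raw: str):
    """Split a raw request into method, target, version, and a header dict."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

method, target, version, headers = parse_request_header(RAW_REQUEST)
print(method, target, version)
print(headers["host"])
```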
Now let’s see an example of an HTTP request that sends an authorization Bearer token:
GET /API/v1/users/123 HTTP/1.1
Host: example.com
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0
Accept: application/json
The Authorization field contains the authorization token required to access the resource. In this case, it uses the Bearer authentication scheme, and the token is a JSON Web Token (JWT) whose payload carries the user ID, the user name, and the timestamp at which the token was issued.
As in the previous example, the request header also includes information about the client making the request, such as the user agent (Firefox web browser on a Windows 10 computer) and the types of content it can accept (JSON in this case).
The HTTP response header for such a request typically includes the HTTP version (HTTP/1.1), the date and time the response was sent, the server software being used (for example, Apache/2.4.6 on Red Hat Enterprise Linux), the type of content being returned (JSON), the character encoding (UTF-8), and the length of the content being returned (for example, 256 bytes). The authorization token is sent by the client in the request and is normally not repeated in the response. The response body would contain the requested data in JSON format.
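A JWT is three base64url-encoded segments separated by dots: a header, a payload, and a signature. The sketch below decodes the first two segments of the token from the request above using only the standard library. It does not verify the signature, so it is for inspection only; the helper name decode_segment is an assumption for this example.

```python
import base64
import json

TOKEN = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)

def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment into a dictionary."""
    # JWT segments drop the trailing "=" padding, so restore it first.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

header_b64, payload_b64, signature_b64 = TOKEN.split(".")
header = decode_segment(header_b64)
payload = decode_segment(payload_b64)

print(header)   # signing algorithm and token type
print(payload)  # claims: sub (user ID), name, iat (issued-at timestamp)
```

Because anyone can decode these segments, a JWT payload should never carry secrets. A server trusts the token only after checking the signature with its key.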
HTTP Methods
HTTP defines a set of methods, also known as verbs or actions, that describe the desired action to be performed on a resource identified by a URI (Uniform Resource Identifier).
Here are the most commonly used HTTP methods:
GET: retrieves information or data from the server for the resource specified by the URL. A practical example is fetching the details of a user of a system.
POST: sends data to the server to create or update a resource. Even though it can be used for both, it’s more common to use POST to create a new record in the database, for example. POST is also non-idempotent, which means that sending the same request several times may create several records.
PUT: sends data to the server to replace an existing resource or create a new one. PUT is most commonly used to update a record in the database. Another important point is that PUT is idempotent: sending the same request with the same data any number of times leaves the server in the same state as sending it once. PUT also sends the whole resource, including the fields that weren’t changed. Of course, the server-side implementation has to respect these rules for PUT to actually be idempotent.
DELETE: deletes the specified resource from the server. We should use it to delete a record in the database, for example. DELETE is also considered idempotent. Suppose we delete the user with the id of ‘1’. After this user is deleted, we can invoke the same endpoint as many times as we want and the state of the server will stay the same: nothing else is deleted because the user is already gone. The status code may differ, though, since a repeated call often returns 404 instead of a success code.
PATCH: sends data to the server to modify an existing resource. It’s used for partial updates, which means that the request carries only the fields that changed and the rest of the resource is left untouched.
HEAD: retrieves the header information associated with a resource without actually retrieving the resource itself.
OPTIONS: returns the HTTP methods, headers, and other options supported by the server for a specified resource.
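The difference between idempotent and non-idempotent methods is easier to see with a toy example. The sketch below models a server-side user store in memory; the class UserStore, its method names, and the status codes it returns are assumptions made for illustration and are not tied to any web framework.

```python
import itertools

class UserStore:
    """In-memory stand-in for a REST resource collection such as /users."""

    def __init__(self):
        self.users = {}
        self._ids = itertools.count(1)

    def post(self, data):
        # POST: the server picks a new id, so every call creates a record.
        user_id = next(self._ids)
        self.users[user_id] = dict(data)
        return 201, user_id

    def put(self, user_id, data):
        # PUT: replace (or create) the whole record at a known id.
        created = user_id not in self.users
        self.users[user_id] = dict(data)
        return (201 if created else 200), user_id

    def patch(self, user_id, changes):
        # PATCH: update only the supplied fields.
        if user_id not in self.users:
            return 404, user_id
        self.users[user_id].update(changes)
        return 200, user_id

    def delete(self, user_id):
        # DELETE: after the first call the record is gone; repeats change nothing.
        if self.users.pop(user_id, None) is None:
            return 404, user_id
        return 204, user_id

store = UserStore()

# The same POST twice creates two separate records.
store.post({"name": "Ana"})
store.post({"name": "Ana"})
users_after_post = len(store.users)

# The same PUT twice leaves exactly one record at id 10.
store.put(10, {"name": "Bob"})
store.put(10, {"name": "Bob"})
users_after_put = len(store.users)

store.patch(10, {"email": "bob@example.com"})

first_delete = store.delete(1)
second_delete = store.delete(1)

print(users_after_post, users_after_put, first_delete, second_delete)
```

In this sketch the second DELETE returns a different status code than the first, yet the stored data is identical after both calls, which is what idempotency refers to.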
HTTP Response Codes
There are some HTTP response codes that you will see very often when developing software. Let’s see some of them:
200 – OK: This response code means that the request was successful and the server has returned the requested data.
201 – Created: This response code means that the server has successfully created a new resource as a result of the request.
202 – Accepted: This response code means that the request has been accepted for processing, but the processing has not yet been completed.
204 – No Content: This response code means that the server has successfully processed the request, but there is no data to return to the client.
301 – Moved Permanently: This response code means that the requested resource has been moved permanently to a new location, and the client should update its records accordingly.
302 – Found: This response code means that the requested resource has been temporarily moved to a new location. The client should follow the redirect for this request but keep using the original URL for future requests.
304 – Not Modified: This response code means that the requested resource has not been modified since the last time the client requested it, and the server is instructing the client to use its cached version of the resource.
401 – Unauthorized: This response code means that the requested resource is restricted and requires authentication. The server is telling the client that it needs to provide valid credentials (such as a username and password, or a token) in order to access the resource. It is often returned for pages or endpoints that require users to log in first.
403 – Forbidden: This response code means that the server understood the request and knows who the client is, but the client does not have permission to access the requested resource.
404 – Not Found: This response code means that the requested resource could not be found on the server.
500 – Internal Server Error: This response code means that the server ran into an unexpected condition that prevented it from fulfilling the request. It’s a very generic error, often caused by an unhandled exception in the application or a misconfigured server. Response codes from 500 onwards indicate a problem on the server side rather than in the request, such as a failing application, an overloaded server, or an unreachable upstream service, and they shouldn’t happen very often.
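The first digit of a response code tells you its general category. Python’s standard library exposes the registered codes through the http.HTTPStatus enum, which the following sketch uses; the helper name describe_status and the category labels are choices made for this example.

```python
from http import HTTPStatus

def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its category."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if code < 200:
        category = "informational"
    elif code < 300:
        category = "success"
    elif code < 400:
        category = "redirection"
    elif code < 500:
        category = "client error"
    else:
        category = "server error"
    return f"{code} {status.phrase} ({category})"

for code in (200, 201, 204, 301, 302, 304, 401, 403, 404, 500):
    print(describe_status(code))
```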
If you want to see all the HTTP response codes and have fun, I recommend the following website: https://http.cat
Other Internet Protocols
Simple Mail Transfer Protocol (SMTP) – used to send email messages and relay them between mail servers.
Post Office Protocol (POP) – used to retrieve email messages from a mail server.
Internet Message Access Protocol (IMAP) – used to access email messages stored on a mail server.
Transmission Control Protocol (TCP) – used to establish and maintain connections between devices on a network.
User Datagram Protocol (UDP) – used for low-latency, loss-tolerant communication between applications, without establishing a connection first.
Domain Name System (DNS) – used to translate domain names to IP addresses.
Dynamic Host Configuration Protocol (DHCP) – used to assign IP addresses to devices on a network automatically.
Simple Network Management Protocol (SNMP) – used to manage and monitor network devices.
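To contrast UDP with TCP, here is a minimal loopback sketch using Python’s standard socket module. There is no handshake and no delivery guarantee: the sender simply addresses a datagram to the receiver. The function name send_datagram and the two-second timeout are assumptions for this example.

```python
import socket

def send_datagram(message: bytes) -> bytes:
    """Send one UDP datagram over loopback and return what was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: any free port
        receiver.settimeout(2.0)         # UDP may drop data, so never wait forever
        # No connect or accept step: just send to the receiver's address.
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(1024)
        return data

print(send_datagram(b"ping"))
```

On a real network the datagram could be lost, duplicated, or arrive out of order, so applications that use UDP decide for themselves how much of that to tolerate.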
Conclusion
Understanding the basics of network protocols is crucial to ace the systems design interview. With so many technologies available nowadays, understanding the fundamentals first will dramatically accelerate your learning.
IP, TCP, and HTTP are the most widely used network protocols. As a software engineer, you should master HTTP in particular, since almost every web application relies on it.