Web Security with HTTPS for Systems Design Interview

Security rarely comes up in Systems Design interviews, but we still need to know the difference between HTTP and HTTPS to avoid common attacks.

This article won’t go deep into security because it’s a vast topic. However, as software engineers, we need to know the basics of security to prevent systems from being hacked because of a basic mistake.

Therefore, let’s explore the essence of HTTP and HTTPS to know why they’re important and why we should use HTTPS instead of HTTP.

The Problem with HTTP

HTTP (Hypertext Transfer Protocol) is the foundation of communication on the World Wide Web. While HTTP has been widely successful in enabling the exchange of information between clients (such as web browsers) and servers, it does have a few limitations and shortcomings:

Lack of security: HTTP is inherently insecure because it does not encrypt the transmitted data. As a result, any information sent over HTTP, including sensitive data like passwords or credit card details, can be intercepted and read by malicious actors. HTTPS (HTTP Secure) addresses this issue by adding a layer of encryption using TLS (formerly SSL).

Lack of state: HTTP is a stateless protocol, which means that each request from a client to a server is independent and does not carry any context or memory of previous requests. This poses challenges when building web applications that require maintaining session information or user authentication across multiple requests. Developers often use techniques like cookies or server-side sessions to work around this limitation.

Performance overhead: HTTP relies on a request-response model, where each client request results in a separate response from the server. In early versions of HTTP, every request opened and closed its own connection, which added overhead; persistent connections in HTTP/1.1 and multiplexing in HTTP/2 reduce it. Additionally, the text-based nature of HTTP/1.x messages can lead to larger payload sizes, which can impact performance, especially on low-bandwidth or high-latency networks.

Limited flexibility: HTTP has a fixed set of methods (GET, POST, PUT, DELETE, etc.) that define the operations that can be performed on resources. While these methods cover many common use cases, they might be insufficient for more complex interactions or custom requirements. This limitation has led to extensions such as WebDAV, which adds new methods, and to conventions such as REST, which describe how to model richer APIs on top of the existing methods.

Lack of real-time communication: HTTP is primarily designed for request-response communication, which makes it less suitable for real-time or bidirectional communication scenarios, such as instant messaging or live streaming. To address this, technologies like WebSockets (a separate protocol that starts from an HTTP upgrade request) and Server-Sent Events (SSE) have been introduced to enable real-time communication on top of HTTP.

It’s worth noting that while HTTP has its limitations, it has been the foundation of the modern web and has undergone significant improvements over the years. Many of these limitations have been addressed through various extensions, protocols, and best practices, making HTTP a widely adopted and reliable protocol for web communication.
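
The cookie workaround mentioned under “Lack of state” can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the in-memory SESSIONS dictionary and the login and current_user helpers are hypothetical names chosen for this example, not part of any framework.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> session data.
SESSIONS = {}

def login(username):
    """Create a session and return the Set-Cookie header the server would send."""
    session_id = secrets.token_urlsafe(16)
    SESSIONS[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # hide the cookie from page scripts
    cookie["session_id"]["secure"] = True    # only send it over HTTPS
    return cookie.output(header="Set-Cookie:")

def current_user(cookie_header):
    """Look up the user from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    session = SESSIONS.get(morsel.value)
    return session["user"] if session else None

# First request: the server answers with a Set-Cookie header.
set_cookie = login("alice")
# Later request: the browser echoes the name=value pair in its Cookie header.
cookie_header = set_cookie.split(":", 1)[1].split(";", 1)[0].strip()
print(current_user(cookie_header))  # the server recognises the user again
```

Each request is still independent at the protocol level; the state lives in the server-side store, and the cookie is just the lookup key.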

Man in the Middle Attack

As mentioned, HTTP is insecure, which makes it vulnerable to man-in-the-middle (MitM) attacks.

To understand this attack, imagine you want to visit a website using your web browser. Usually, your browser communicates directly with the website to exchange data. However, in a Man-in-the-Middle attack, an unauthorized third party intercepts and manipulates the communication between your browser and the website.

Here’s how it works:

Normal Communication: In a regular HTTP connection, your browser sends requests to a website, which responds with the requested data. This communication happens directly between your browser and the website’s server.

Interception: In a Man-in-the-Middle attack, an attacker positions themselves between your browser and the website. An attacker can intercept the request through various means, such as by compromising a Wi-Fi network or by gaining control over a network router.

Impersonation: Once the attacker is in the middle, they can impersonate your browser and the website. The attacker pretends to be your browser when communicating with the website and pretends to be the website when communicating with your browser. This allows them to intercept and manipulate the data being exchanged.

Data Manipulation: With control over the communication, the attacker can read, modify, or even inject their own data into the messages passing between your browser and the website. For example, they could steal sensitive information like login credentials or credit card details by capturing them before they reach the legitimate website.

Transparent Relay: To make the attack less noticeable, the attacker typically relays the intercepted data between your browser and the website in real-time. This means that your browser may still receive responses from the website, creating the illusion of a normal connection, while the attacker is secretly manipulating the data in between.

MitM attacks can be particularly dangerous because they allow attackers to eavesdrop on sensitive information or carry out fraudulent activities without the knowledge of the victim or the website. That’s why it’s important to use secure protocols like HTTPS, which encrypts the communication between your browser and the website, making it much harder for attackers to perform successful MitM attacks.

Remember to exercise caution when connecting to public Wi-Fi networks or when accessing websites that handle sensitive information. Always look for the secure padlock icon and use websites with HTTPS to help protect against Man-in-the-Middle attacks.
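
To see why plain HTTP makes interception so easy, here is a small simulation with no real network involved. The request bytes, the example.com host, and the function names are made up for illustration; the only point is that anyone relaying unencrypted bytes can read them (and could just as easily rewrite them).

```python
from urllib.parse import parse_qs, urlencode

def build_login_request(username, password):
    """Return the raw bytes a browser would send for a form login over plain HTTP."""
    body = urlencode({"username": username, "password": password})
    return (
        "POST /login HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    ).encode("ascii")

def attacker_relay(raw_request):
    """Simulated man in the middle: read the credentials, then forward the bytes."""
    _headers, _, body = raw_request.partition(b"\r\n\r\n")
    form = parse_qs(body.decode("ascii"))
    stolen = {key: values[0] for key, values in form.items()}
    return stolen, raw_request  # the request is relayed unchanged

request = build_login_request("alice", "s3cret")
stolen, forwarded = attacker_relay(request)
print(stolen)                # the attacker sees the credentials in clear text
print(forwarded == request)  # the server receives an untouched request
```

With HTTPS, the relay would only see encrypted bytes, so the parsing step in attacker_relay would have nothing readable to work with.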

Before exploring HTTPS…

Let’s first learn a few basic security elements so that HTTPS makes sense. We will cover:

  • Symmetric Encryption Key – one-key cryptography
  • Asymmetric Encryption Key – two-key cryptography
  • SSL certificate (Secure Sockets Layer) – certificate issued by a Certificate Authority that binds a domain to a public key
  • TLS Handshake – establish a secret key for communication
  • Certificate Authority – trusted authority that verifies domain ownership and signs certificates

Symmetric Encryption Key

In cryptography, a symmetric key is a type of encryption where the same key is used for both the encryption and decryption of data. It’s like having a single key that can lock and unlock a door.

Here’s how it works:

Key Generation: To use symmetric key encryption, a secret key is generated. This key is a string of bits or characters that can be randomly generated or derived from a password or passphrase.

Encryption: When you want to encrypt a message or data using a symmetric key, you take the plain text (the original, unencrypted data) and the secret key. Using an encryption algorithm, the secret key is applied to the plain text to produce the encrypted data, also known as the cipher text.

Decryption: To decrypt the encrypted data and retrieve the original plain text, you use the same secret key that was used for encryption. Applying the key with a decryption algorithm reverses the encryption process, transforming the cipher text back into the plain text.

Key Sharing: One challenge with symmetric key encryption is securely sharing the secret key between the sender and the recipient. If an attacker intercepts the key during transmission, they can also decrypt the encrypted data. Secure key distribution methods, such as using secure channels or exchanging keys in person, are crucial to protect against unauthorized access.

Symmetric key encryption is generally faster than other encryption methods, such as asymmetric key encryption, because the algorithms used are computationally less complex. It’s commonly used for encrypting large amounts of data and for securing communication channels.

However, one limitation of symmetric key encryption is that the same key needs to be securely shared between the sender and the recipient. This can be a challenge, especially when communicating over untrusted networks or between multiple parties.
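
The generate, encrypt, and decrypt steps above can be sketched with a toy cipher. Python’s standard library does not ship AES, so this example assumes a home-made XOR stream cipher whose keystream comes from SHA-256; it is only meant to show that one shared key does both jobs and must not be used to protect real data.

```python
import hashlib
import secrets

def keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from the key and nonce (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])

def encrypt(key, plaintext):
    """Return (nonce, ciphertext); a fresh nonce is drawn for every message."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce, ciphertext

def decrypt(key, nonce, ciphertext):
    """XOR with the same keystream restores the plain text."""
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

shared_key = secrets.token_bytes(32)                             # Key Generation
nonce, ciphertext = encrypt(shared_key, b"transfer 100 to Bob")  # Encryption
print(decrypt(shared_key, nonce, ciphertext))                    # Decryption with the same key
```

Anyone who obtains shared_key can run decrypt as well, which is exactly the key sharing problem described above.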

AES Symmetric Encryption

AES is a widely used symmetric encryption algorithm that is used to secure sensitive information. It is considered one of the most secure encryption algorithms available today.

Here’s how it works:

Key Generation: To use AES, a secret key is generated. The key is a string of bits or characters that must be kept secret and shared only between the sender and the recipient.

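Block Encryption: AES operates on fixed-size blocks of data that are always 128 bits (16 bytes) long. The plain text (the original, unencrypted data) is padded if necessary and divided into these blocks. How the blocks are combined depends on the mode of operation: encrypting each block independently (ECB mode) reveals repeated patterns in the data, so in practice modes such as CBC, CTR, or GCM are used, which mix an initialization vector or counter into each block.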

Encryption Rounds: AES uses a series of encryption rounds to transform the plain text block into the encrypted form, known as the cipher text. These rounds involve several operations, including substitution, permutation, and mixing of the data, based on the key.

Key Expansion: The original secret key is expanded to create a set of round keys, which are used in the encryption rounds. This key expansion process ensures that each round uses a different key, increasing the security of the encryption.

Decryption: To decrypt the cipher text and retrieve the original plain text, the same secret key is used with the inverse operations applied in reverse order. The cipher text is divided into blocks, and each block is decrypted with the round keys according to the same mode of operation that was used for encryption.

AES provides a high level of security due to its robust encryption process and the use of multiple rounds. It is resistant to various known attacks, making it suitable for a wide range of applications, including securing data in transit and at rest.

The strength of AES depends on the key length. The standard defines exactly three key sizes: 128, 192, and 256 bits, which use 10, 12, and 14 rounds respectively. The larger the key size, the stronger the encryption, but also the more computationally intensive the encryption and decryption processes become.
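
Since the Python standard library has no AES implementation, the sketch below does not encrypt anything. It only illustrates the bookkeeping around the Block Encryption step: padding the plain text (PKCS#7 padding is assumed here as a common choice), splitting it into 16-byte blocks, and looking up the round count for each key size. The helper names are hypothetical.

```python
BLOCK_SIZE = 16  # AES always works on 128-bit (16-byte) blocks

# Number of rounds defined by the AES standard for each key size (in bits).
ROUNDS_BY_KEY_BITS = {128: 10, 192: 12, 256: 14}

def pkcs7_pad(data):
    """Pad data to a multiple of the block size (PKCS#7 padding)."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([pad_len]) * pad_len

def pkcs7_unpad(padded):
    """Remove PKCS#7 padding, checking that it is well formed."""
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]

def split_blocks(padded):
    """Split padded data into the 16-byte blocks the cipher processes."""
    return [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]

message = b"AES works on 16-byte blocks."
blocks = split_blocks(pkcs7_pad(message))
print(len(blocks), [len(block) for block in blocks])
print(ROUNDS_BY_KEY_BITS[256], "rounds for a 256-bit key")
```

For real encryption, a vetted library that provides AES in an authenticated mode such as GCM is the usual choice.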

Asymmetric Encryption Key

In cryptography, asymmetric key encryption (also known as public-key encryption) is a method that uses two separate but mathematically related keys: a public key and a private key. A common example is the SSH (Secure Shell) key pair used to authenticate with GitHub: you upload the public key and keep the private key on your machine.

Now, let’s see how it works:

Key Generation: First, a user generates a pair of keys: a public key and a private key. The keys are mathematically linked, but it is computationally infeasible to derive the private key from the public key.

Public Key Distribution: The user shares their public key freely with others. The public key is used to encrypt data and verify digital signatures. It can be openly distributed without compromising the security of the encryption.

Encryption: When someone wants to send a secure message to the user, they use the recipient’s public key to encrypt the message. The encryption process uses the public key to transform the original message into an encrypted form, known as the cipher text.

Private Key Decryption: Only the user in possession of the private key can decrypt the cipher text. The private key is kept secret and should not be shared. To decrypt the encrypted message, the user applies their private key, which reverses the encryption process and retrieves the original plain text.

One of the significant advantages of asymmetric key encryption is that the public key can be freely distributed, allowing anyone to encrypt messages for the intended recipient. However, decryption can only be done using the private key, which remains secret and known only to the recipient.

Asymmetric key encryption is commonly used for various purposes, including secure communication, digital signatures, and key exchange protocols. It enables secure communication even when the sender and recipient have no prior shared secret.

That’s a simplified explanation of asymmetric key encryption! It’s a cryptographic method that uses two related but distinct keys: a public key for encryption and a private key for decryption.
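
To make the public/private split concrete, here is textbook RSA with deliberately tiny primes. The numbers are chosen for readability only; real keys are thousands of bits long and real libraries add padding, so treat this purely as an illustration of the idea.

```python
# Textbook RSA with tiny primes: for illustration only, never for real data.
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)          # can be shared with anyone
private_key = (d, n)         # must stay secret

def encrypt(message, key):
    """Encrypt an integer smaller than n with the recipient's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(cipher, key):
    """Recover the integer with the matching private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)

message = 65
cipher = encrypt(message, public_key)
print(cipher)                        # 2790
print(decrypt(cipher, private_key))  # 65
```

Anyone can compute cipher from the public key, but recovering the message requires d, which in turn requires knowing the prime factors of n.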

SSL Certificate (Secure Sockets Layer)

An SSL certificate is like a digital passport that confirms the identity of a website and enables secure communication between your web browser and the website. It helps ensure that the information you exchange with the website remains private and cannot be easily intercepted or tampered with by attackers.

Here’s how it works:

Website Identity: When a website wants to obtain an SSL certificate, it goes through a process to prove its identity. This involves providing information about the website and its ownership to a trusted certificate authority (CA).

Certificate Issuance: The certificate authority verifies the website’s information and if everything checks out, it issues an SSL certificate specific to that website. The certificate contains the website’s public key, its domain name, and other details.

Certificate Installation: The website installs the SSL certificate on its web server together with the matching private key, which never leaves the server. This lets the server prove that it owns the public key listed in the certificate.

Secure Communication: When you visit a website with an SSL certificate, your web browser checks the certificate to ensure it’s valid. It verifies that a trusted CA has signed the certificate, that it has not expired, and that the domain name in the certificate matches the site you asked for. This step helps confirm that you are indeed connecting to the genuine website.

Encryption: Once the certificate is validated, your browser and the website set up an encrypted connection. The public key from the certificate is used only during the handshake, to agree on a shared symmetric session key; the data you send and receive is then encrypted with that session key. This ensures that even if someone intercepts the data, they won’t be able to understand it without the session key, which only your browser and the website possess.

Using an SSL certificate, websites can establish a secure connection with your browser, protecting sensitive information such as login credentials, credit card details, and other personal data from unauthorized access.

You can identify a website that has an SSL certificate by looking for the padlock icon in your browser’s address bar or seeing “https://” at the beginning of the website’s URL, where the “s” stands for “secure.”

That’s a simplified explanation of an SSL certificate! It’s an important tool in ensuring secure communication and protecting your sensitive information online.
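
The browser-side checks described above can be reproduced with Python’s standard ssl module. This is a minimal sketch: build_context and fetch_certificate are hypothetical helper names, and the host you pass in is up to you. The default context already verifies the CA signature chain, the expiry date, and the hostname.

```python
import socket
import ssl

def build_context():
    """Default client settings: verify the chain against trusted CAs and check the hostname."""
    return ssl.create_default_context()

def fetch_certificate(hostname, port=443, timeout=5.0):
    """Connect, let the ssl module validate the certificate, and return its fields."""
    context = build_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # The handshake raises ssl.SSLCertVerificationError if the certificate is
        # expired, not signed by a trusted CA, or issued for another domain.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()

context = build_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate validation is mandatory
print(context.check_hostname)                    # the domain name must match
```

With network access, calling fetch_certificate on a public HTTPS host returns a dictionary with entries such as subject, issuer, notAfter, and subjectAltName.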

Certificate Authority

A Certificate Authority (CA) is a trusted entity that issues digital certificates to verify the authenticity of websites, servers, or individuals. It plays a crucial role in establishing secure communication over the Internet.

When a website wants to use HTTPS and secure its communication with users, it must obtain an SSL/TLS certificate from a trusted CA. The CA verifies the identity and ownership of the website before issuing the certificate. The certificate contains information about the website, including its domain name, public key, and the CA’s digital signature.

When a user visits a website secured with HTTPS, their web browser checks the validity of the SSL/TLS certificate presented by the website. The browser verifies the certificate’s digital signature and checks its authenticity by cross-referencing it with a list of trusted CA certificates pre-installed in the browser or operating system. If the certificate is valid and issued by a trusted CA, the browser establishes an encrypted connection with the website.

The role of a CA is crucial in maintaining the trust and security of the certificate ecosystem. CAs are responsible for verifying the identity of the certificate requestor before issuing a certificate, ensuring that only legitimate entities receive valid certificates. They follow industry standards and security practices to safeguard the integrity and confidentiality of the certificate issuance process.

However, it’s important to note that CAs can make mistakes or be compromised, leading to fraudulent or malicious certificates being issued. For example, in 2011 the Dutch CA DigiNotar was compromised and fraudulent certificates were issued for domains such as google.com; browsers subsequently removed it from their trusted lists. To mitigate this risk, browser vendors and operating system manufacturers maintain a list of trusted CAs and regularly update it to remove compromised or untrustworthy CAs.

In summary, a Certificate Authority is a trusted entity that issues digital certificates to validate the identity and authenticity of websites, servers, or individuals, enabling secure online communication.
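
The sign-and-verify relationship between a CA and a browser can be sketched with a toy RSA signature. Everything here is an assumption made for illustration: the key is built from two small, publicly known primes, the certificate is just a byte string, and real CAs use X.509 certificates with much larger keys and padded signatures.

```python
import hashlib

# Toy CA key pair built from two known Mersenne primes (far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
CA_N = P * Q
CA_E = 65537                              # public exponent, shipped with browsers
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the CA

def digest(certificate_body):
    """Hash the certificate fields and reduce the hash into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(certificate_body).digest(), "big") % CA_N

def ca_sign(certificate_body):
    """The CA signs the hash with its private key."""
    return pow(digest(certificate_body), CA_D, CA_N)

def browser_verify(certificate_body, signature):
    """The browser checks the signature with the CA's public key."""
    return pow(signature, CA_E, CA_N) == digest(certificate_body)

certificate = b"domain=www.example.com;public_key=abc123"
signature = ca_sign(certificate)
print(browser_verify(certificate, signature))  # the genuine certificate is accepted
print(browser_verify(b"domain=evil.example;public_key=abc123", signature))  # a forged one is rejected
```

The browser never needs the CA’s private key: trusting the CA means trusting its public key, which is why the pre-installed list of CA certificates matters so much.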

Certificate Authority Examples

Let’s explore a few examples of well-known certificate authorities:

Let’s Encrypt: Let’s Encrypt is a free, automated, and open certificate authority. It aims to make it easy for website owners to obtain and install SSL/TLS certificates. Let’s Encrypt is widely used and supported by many hosting providers and web browsers.

DigiCert: DigiCert is a leading global certificate authority that provides a wide range of SSL/TLS certificates and related security solutions. They offer certificates for various purposes, including website encryption, code signing, and document signing.

Sectigo (formerly Comodo CA): Sectigo is a prominent certificate authority offering a comprehensive range of SSL/TLS certificates, including domain validation, organization validation, and extended validation certificates. They also provide other security solutions such as secure email and code signing certificates.

GlobalSign: GlobalSign is a trusted certificate authority that offers a variety of SSL/TLS certificates, including wildcard certificates, multi-domain certificates, and extended validation certificates. They cater to different types of organizations, from small businesses to large enterprises.

GoDaddy: GoDaddy is a well-known provider of domain registration and web hosting services, and they also offer SSL/TLS certificates. They provide a range of certificate options and support for various needs, including single-domain, multi-domain, and wildcard certificates.

These are just a few examples of certificate authorities, and there are many other trusted CAs available in the market. When choosing a certificate authority, it’s important to consider factors such as their reputation, compatibility with web browsers and devices, customer support, and pricing.

TLS Handshake

Imagine you want to have a secure conversation with someone over the internet. The TLS handshake is like establishing a secret code between you and the other person so that you can communicate securely.

Here’s how it works:

Client Hello: The client (your web browser, for example) starts the handshake by sending a “hello” message to the server. This message includes information like the supported encryption algorithms and other details.

Server Hello: The server responds with its own “hello” message. It selects a cipher suite from those the client offered, usually following its own preference order. The server also sends its digital certificate, which includes a public key.

Certificate Validation: The client checks the server’s digital certificate to make sure it’s valid. It verifies the certificate’s authenticity by checking if it’s been signed by a trusted certificate authority (CA). This step ensures that the client is communicating with the genuine server.

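Key Exchange: The client generates a random session key (in practice, a secret from which the session keys are derived), which is a secret code that will be used to encrypt and decrypt the messages during the session. In the classic RSA key exchange described here (TLS 1.2 and earlier), the client encrypts this key using the server’s public key obtained from the certificate and sends it to the server. TLS 1.3 removed this method: both sides derive the key with an ephemeral Diffie-Hellman exchange, and the server’s certificate key is used to sign the handshake.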

Server Key Decryption: The server receives the encrypted session key from the client. It uses its private key (which is paired with the public key in the certificate) to decrypt the session key.

Session Established: Now, both the client and server have the same session key. They use this key to encrypt and decrypt their messages during the session. This ensures that the communication is secure and cannot be easily intercepted or tampered with by attackers.

Once the handshake is complete, the client and server can start exchanging encrypted data, such as web pages, securely. The TLS handshake typically happens once at the beginning of a session, and a new handshake is performed if the session needs to be re-established or if a new connection is made.
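
The handshake steps above can be walked through in a small simulation. It follows the RSA-style key exchange described in this section, uses a toy key built from two small known primes, and skips certificate validation entirely; the function names and the suite lists are assumptions for illustration, not how a real TLS library is structured.

```python
import secrets

# Server's toy RSA key pair (Mersenne primes; illustration only).
P, Q = 2**61 - 1, 2**89 - 1
SERVER_N = P * Q
SERVER_E = 65537
SERVER_D = pow(SERVER_E, -1, (P - 1) * (Q - 1))

# Cipher suites the server accepts, in its own order of preference.
SERVER_PREFERENCE = [
    "TLS_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_RSA_WITH_AES_128_GCM_SHA256",
]

def server_hello(client_suites):
    """Pick the first suite in the server's preference order that the client offers."""
    for suite in SERVER_PREFERENCE:
        if suite in client_suites:
            # The public key stands in for the certificate in this sketch.
            return suite, (SERVER_E, SERVER_N)
    raise ValueError("no cipher suite in common")

def client_key_exchange(server_public_key):
    """Generate a random session key and encrypt it with the server's public key."""
    e, n = server_public_key
    session_key = secrets.randbits(128)
    return session_key, pow(session_key, e, n)

def server_decrypt(encrypted_key):
    """Recover the session key with the server's private key."""
    return pow(encrypted_key, SERVER_D, SERVER_N)

# Client Hello -> Server Hello -> Key Exchange -> Server Key Decryption
suite, server_public_key = server_hello(
    ["TLS_RSA_WITH_AES_128_GCM_SHA256", "TLS_RSA_WITH_AES_128_CBC_SHA"]
)
client_key, encrypted = client_key_exchange(server_public_key)
server_key = server_decrypt(encrypted)
print(suite)
print(client_key == server_key)  # both sides now hold the same session key
```

Only the encrypted value crosses the network, so an eavesdropper without the server’s private key cannot recover the session key.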

HTTPS

HTTPS, or Hypertext Transfer Protocol Secure, is a protocol used for secure communication over the internet. It ensures that the data sent between a web browser and a website is encrypted and cannot be easily intercepted or tampered with by malicious actors.

When you visit a website that uses HTTPS, your browser establishes a secure connection with the website’s server. This connection is encrypted, meaning that any data transmitted between your browser and the server is encoded in a way that only the intended recipient can understand it.

The encryption is provided by TLS (Transport Layer Security), the successor to SSL (Secure Sockets Layer). SSL itself is deprecated, although the name is still widely used for certificates. TLS uses cryptographic algorithms to encrypt the data and verify the identity of the website.

Here’s a simplified explanation of how HTTPS works:

Client Request: You type in a website’s address (e.g., https://www.example.com) in your browser.

Connection Request: Your browser opens a connection to the server (port 443 by default) and starts the SSL/TLS handshake, before any page content is requested.

SSL/TLS Handshake: The server responds by sending its SSL/TLS certificate, which contains a public key. Your browser validates the certificate and uses this public key to continue the handshake process.

Key Exchange: Your browser generates a random symmetric encryption key and encrypts it using the server’s public key. This encrypted key is sent to the server, which decrypts it with its private key.

Encrypting Data: Now, your browser and the server have established a secure connection. Any data transmitted between them, such as web pages, form submissions, or personal information, is encrypted using the symmetric encryption key.

Data Exchange: Encrypted data is exchanged between your browser and the server. Even if someone intercepts this data, they won’t be able to read its contents without the encryption key.

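Secure Session: The secure connection remains active until the browser or the server closes it, for example when it has been idle for a while. A later connection performs a new handshake, which can be shortened by resuming the previous session.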

The use of HTTPS provides several advantages. It helps protect sensitive information, such as login credentials, credit card details, and personal data, from being intercepted and stolen. It also ensures the integrity of the data, as any tampering with the encrypted content would be detected.

In summary, HTTPS is a secure version of the HTTP protocol that encrypts data during transmission, providing a safer and more private browsing experience for users.
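
From application code, the whole flow is usually a single call, because the HTTP client runs the handshake for us. The sketch below uses Python’s standard http.client and ssl modules; make_context and https_get are hypothetical helper names, and requiring at least TLS 1.2 is a choice made for this example.

```python
import http.client
import ssl

def make_context():
    """Client context that verifies certificates and refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context

def https_get(host, path="/", timeout=5.0):
    """Fetch a page over HTTPS and report the negotiated protocol and cipher."""
    conn = http.client.HTTPSConnection(host, timeout=timeout, context=make_context())
    try:
        conn.connect()                      # TCP connection plus TLS handshake
        tls_version = conn.sock.version()   # for example "TLSv1.3"
        cipher_name = conn.sock.cipher()[0]
        conn.request("GET", path)           # sent only after the handshake, encrypted
        response = conn.getresponse()
        response.read()
        return response.status, tls_version, cipher_name
    finally:
        conn.close()
```

With network access, https_get("www.example.com") returns the HTTP status together with the TLS version and cipher suite that were negotiated, which is a quick way to confirm that a connection is really encrypted.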

Conclusion

Let’s recap the concepts we explored in this article so we remember the content more easily:

HTTP (Hypertext Transfer Protocol):

  • Protocol used for transmitting data over the internet.
  • Operates over unencrypted connections, making it vulnerable to eavesdropping and tampering.

HTTPS (Hypertext Transfer Protocol Secure):

  • Secure version of HTTP that uses encryption to protect data transmitted over the internet.
  • Adds a layer of security by using SSL/TLS protocols.

TLS (Transport Layer Security):

  • Cryptographic protocol that provides secure communication over a network.
  • Encrypts data and ensures its integrity during transmission.
  • Successor to SSL (Secure Sockets Layer) and commonly used for secure communication on the web.

SSL (Secure Sockets Layer):

  • Cryptographic protocol used for secure communication over a network.
  • Predecessor to TLS and widely used in the past for securing web connections.
  • Provides encryption, data integrity, and authentication.

Certificate Authority (CA):

  • Trusted entity that issues digital certificates.
  • Verifies the identity of individuals, organizations, or servers.
  • Certificates are used for authentication and encryption in SSL/TLS.

Symmetric Key:

  • Encryption method that uses a single shared key for both encryption and decryption.
  • Fast and efficient for bulk data encryption.
  • Requires secure key distribution between communicating parties.

Asymmetric Key:

  • Encryption method that uses a pair of mathematically related keys: a public key and a private key.
  • Public key is used for encryption, while the private key is kept secret and used for decryption.
  • Enables secure key exchange and digital signatures.

AES (Advanced Encryption Standard):

  • Symmetric encryption algorithm widely used for secure data transmission.
  • Provides strong encryption and is considered secure for most practical purposes.
  • Adopted as a standard by the U.S. government and used globally.
Written by
Rafael del Nero