Application Layer Protocols

1. HTTP Evolution

1.1 HTTP/1.1

First published in 1997 and still heavily used today; for many years it was the most widely deployed version.

Core features:

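| Feature | Description |
| --- | --- |
| Persistent connections (Keep-Alive) | Reuses TCP connections by default, avoiding repeated handshakes |
| Pipelining | Sends multiple requests back to back without waiting for each response (but responses must return in request order) |
| Chunked transfer encoding | Streams the body in length-prefixed chunks, so Content-Length need not be known in advance |
| Host header | Supports virtual hosting (multiple sites on one IP address) |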
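
As an illustration of chunked transfer encoding, here is a minimal sketch of the wire format: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names `encode_chunked` and `decode_chunked` are made up for this example, and chunk extensions and trailers are ignored.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would terminate the body early
        out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer section
    return bytes(out)


def decode_chunked(body):
    """Decode a chunked body (simplified: no extensions, no trailers)."""
    data, pos = bytearray(), 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return bytes(data)
        start = line_end + 2
        data += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

The sender can start transmitting before it knows the total length, which is why no Content-Length header is needed.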

Problem -- Head-of-Line Blocking (HOL):

  • In pipelining, if the first response is not yet complete, subsequent responses are blocked
  • In practice, most browsers disable pipelining
  • Workaround: Browsers typically open up to 6 parallel TCP connections per host
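
To make the blocking concrete, the sketch below assumes a server that works on pipelined requests in parallel but must emit responses in request order. The function `pipelined_delivery` and the timings are illustrative, not taken from any real server.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """Return when each response can actually be delivered.

    ready_times[i] is when response i is finished on the server.
    Because responses must leave in request order, response i cannot
    be delivered before every earlier response is ready as well.
    """
    return list(accumulate(ready_times, max))
```

With `ready_times = [5.0, 1.0, 1.0]`, the two fast responses are held until the slow first one completes, so all three are delivered at time 5.0.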

1.2 HTTP/2

Standardized in 2015, evolved from Google's SPDY.

Core improvements:

| Feature | Mechanism |
| --- | --- |
| Binary framing | Splits HTTP messages into binary frames, replacing text format |
| Multiplexing | Parallel transmission of multiple request/response streams over a single TCP connection |
| Header compression (HPACK) | Static table + dynamic table + Huffman encoding |
| Server push | Server proactively pushes resources the client may need |
| Stream prioritization | Client specifies resource priorities |

HTTP/1.1:                    HTTP/2:
Conn 1: GET /a  → Response a  Single connection:
Conn 2: GET /b  → Response b   Stream 1: GET /a  → Response a
Conn 3: GET /c  → Response c   Stream 2: GET /b  → Response b
Conn 4: GET /d  → Response d   Stream 3: GET /c  → Response c
(Multiple TCP connections)      Stream 4: GET /d  → Response d
                               (Single TCP connection, interleaved frames)

Remaining issue: TCP-layer head-of-line blocking -- a single lost packet causes all streams on the TCP connection to stall.
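
The binary framing mentioned above can be sketched with the 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, 8-bit flags, and a reserved bit followed by a 31-bit stream identifier. The helper names `pack_frame` and `unpack_frame_header` are assumptions for this example, and only a few frame types are listed.

```python
import struct

# Subset of the frame types defined by HTTP/2.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM", 0x4: "SETTINGS"}


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with a 9-byte HTTP/2 frame header."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit length split as 8 + 16 bits
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,         # top bit is reserved and sent as 0
    )
    return header + payload


def unpack_frame_header(data):
    """Parse the first 9 bytes of a frame into its header fields."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    return {
        "length": (hi << 16) | lo,
        "type": FRAME_TYPES.get(frame_type, hex(frame_type)),
        "flags": flags,
        "stream_id": stream_id & 0x7FFFFFFF,
    }
```

Because every frame carries its stream ID, frames from different streams can be interleaved on one connection and reassembled per stream by the receiver.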

1.3 HTTP/3 (QUIC)

Standardized in 2022 (RFC 9114) and built on QUIC, a transport protocol that runs over UDP.

Core improvements:

| Feature | Description |
| --- | --- |
| Based on UDP | Bypasses TCP limitations; transport protocol implemented in user space |
| No HOL blocking | Each stream recovers from loss independently; one stream's loss doesn't affect others |
| 0-RTT / 1-RTT connection establishment | Transport handshake + TLS handshake merged |
| Connection migration | Uses Connection ID instead of 4-tuple to identify connections; network switching doesn't break connections |
| Header compression (QPACK) | Improved HPACK adapted for out-of-order delivery |

TCP + TLS 1.3:                QUIC:
  SYN                           QUIC Initial (with TLS ClientHello)
  SYN-ACK                       QUIC Handshake (with TLS ServerHello)
  ACK + ClientHello             → 1-RTT to send data
  ServerHello
  → 2-RTT before sending data   0-RTT: Can send data immediately on session resumption
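
As a rough worked example of the diagram above, the setup delay before the first request can be sent is the round-trip time multiplied by the number of handshake round trips. The 50 ms figure in the usage note and the names `HANDSHAKE_RTTS` and `setup_delay_ms` are assumptions chosen for illustration; real latency also depends on loss, server processing, and DNS.

```python
# Round trips spent on connection setup before the first request can be sent.
HANDSHAKE_RTTS = {
    "TCP + TLS 1.3": 2,            # 1 RTT for TCP, 1 RTT for TLS 1.3
    "QUIC (1-RTT)": 1,             # transport and TLS handshakes combined
    "QUIC (0-RTT resumption)": 0,  # data rides along with the first flight
}


def setup_delay_ms(rtt_ms):
    """Map each stack to its connection-setup delay for a given RTT."""
    return {name: rtts * rtt_ms for name, rtts in HANDSHAKE_RTTS.items()}
```

On a hypothetical 50 ms path this gives 100 ms, 50 ms, and 0 ms respectively.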

1.4 HTTP Methods

| Method | Semantics | Idempotent | Safe |
| --- | --- | --- | --- |
| GET | Retrieve resource | Yes | Yes |
| POST | Create resource / submit data | No | No |
| PUT | Replace entire resource | Yes | No |
| PATCH | Partially update resource | No | No |
| DELETE | Delete resource | Yes | No |
| HEAD | Get headers (without body) | Yes | Yes |
| OPTIONS | Query supported methods | Yes | Yes |
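
Idempotency means that repeating a request leaves the server in the same state as sending it once. The sketch below uses a hypothetical in-memory `UserStore` (not a real framework) to contrast POST with PUT and DELETE.

```python
class UserStore:
    """Toy in-memory resource collection used to illustrate idempotency."""

    def __init__(self):
        self.users = {}
        self._next_id = 1

    def post(self, data):
        # Not idempotent: every call creates a new resource.
        user_id = self._next_id
        self._next_id += 1
        self.users[user_id] = dict(data)
        return user_id

    def put(self, user_id, data):
        # Idempotent: replaces the whole resource, so repeats change nothing.
        self.users[user_id] = dict(data)

    def delete(self, user_id):
        # Idempotent: deleting an already-deleted resource leaves state as is.
        self.users.pop(user_id, None)
```

This is why a client can safely retry a PUT or DELETE after a timeout, while retrying a POST may create duplicates.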

1.5 HTTP Status Codes

| Range | Category | Common Examples |
| --- | --- | --- |
| 1xx | Informational | 100 Continue |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found |
| 5xx | Server Error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable |
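
Since the category is determined by the first digit, a client can classify any code with integer division. The sketch below pairs that with the reason phrases from Python's standard `http.HTTPStatus`; the `describe` helper is an assumption for this example.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return a short description such as '404 Not Found [Client Error]'."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(unregistered)"  # valid range, but no standard reason phrase
    return f"{code} {phrase} [{category}]"
```

Falling back to the range lets a client handle codes it does not recognize, for example treating an unknown 4xx code like a generic client error.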

2. HTTPS and TLS

2.1 TLS Handshake (TLS 1.3)

sequenceDiagram
    participant C as Client
    participant S as Server

    C->>S: ClientHello (supported cipher suites, key share parameters)
    S->>C: ServerHello (selected cipher suite, key share parameters)
    Note over C,S: Both sides compute shared key (ECDHE)
    S->>C: {EncryptedExtensions}
    S->>C: {Certificate}
    S->>C: {CertificateVerify}
    S->>C: {Finished}
    C->>S: {Finished}
    Note over C,S: 1-RTT complete, encrypted communication begins

TLS 1.3 improvements (compared to TLS 1.2):

  • Handshake reduced from 2-RTT to 1-RTT (supports 0-RTT resumption)
  • Removed insecure algorithms (RC4, 3DES, SHA-1, etc.)
  • Only AEAD ciphers supported (e.g., AES-GCM, ChaCha20-Poly1305)
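  • Perfect Forward Secrecy (PFS) is mandatory for certificate-based handshakes: static RSA and static Diffie-Hellman key exchange were removed in favor of ephemeral (EC)DHE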
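
A client can require TLS 1.3 with Python's standard `ssl` module, as in the minimal sketch below. It assumes Python 3.7+ linked against an OpenSSL build with TLS 1.3 support; the helper names `make_tls13_context` and `negotiated_version` are made up for this example.

```python
import socket
import ssl


def make_tls13_context():
    """Build a client context that refuses anything older than TLS 1.3."""
    ctx = ssl.create_default_context()  # certificate + hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def negotiated_version(host, port=443, timeout=5.0):
    """Connect to host and report the negotiated TLS version and cipher name."""
    ctx = make_tls13_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling `negotiated_version` needs network access; against a server that only speaks TLS 1.2 the handshake fails with an `ssl.SSLError` instead of silently downgrading.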

2.2 Certificate System

Certificate chain:

Root CA certificate (pre-installed in OS/browser)
  └── Intermediate CA certificate
        └── Server certificate (bound to domain name)

Verification flow:

  1. Server sends certificate chain
  2. Client verifies signature chain up to a trusted root CA
  3. Client checks the domain name match, validity period, and revocation status (OCSP/CRL)
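
The verification flow above can be sketched with a toy model. This is deliberately simplified and not how a real TLS library works: issuer-name matching stands in for cryptographic signature verification, the hostname check is an exact match, and revocation is not modeled. The `Cert` type and `verify_chain` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime


def verify_chain(chain, trusted_roots, hostname, now):
    """chain[0] is the server certificate; each cert is issued by the next."""
    if not chain or chain[0].subject != hostname:
        return False  # domain mismatch
    for cert in chain:
        if not (cert.not_before <= now <= cert.not_after):
            return False  # expired or not yet valid
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False  # broken link (a real client verifies the signature)
    return chain[-1].issuer in trusted_roots  # must end at a trusted root CA
```

In practice this work is delegated to the TLS library and the operating system or browser trust store rather than implemented by hand.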

Let's Encrypt: Free, automated certificate issuance (ACME protocol), driving HTTPS adoption.

3. DNS (Domain Name System)

3.1 Domain Name Hierarchy

Root domain "."
└── Top-Level Domains (TLD): .com, .org, .cn, .io
    └── Second-level domains: google.com, example.org
        └── Subdomains: mail.google.com, www.example.org

3.2 Query Types

| Type | Process |
| --- | --- |
| Recursive query | Client → Local DNS server (recursive resolver) → Queries level by level on the client's behalf → Returns final result |
| Iterative query | The queried server does not resolve on the querier's behalf; it returns a referral ("ask this server instead") |

Typical resolution flow (for www.example.com):

  1. Client → Recursive resolver (e.g., 8.8.8.8)
  2. Recursive resolver → Root server: "What are the NS for .com?"
  3. Recursive resolver → .com TLD server: "What are the NS for example.com?"
  4. Recursive resolver → example.com authoritative server: "What is the A record for www.example.com?"
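  5. Recursive resolver → Client: returns the IP address

DNS caching exists at every level (browser, operating system, recursive resolver); each record's TTL controls when the cached entry expires.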
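
The resolution flow and TTL-based caching can be sketched with a toy recursive resolver. Everything here is simulated: the server names in `REFERRALS` and `AUTHORITATIVE`, the 300-second TTL, and the `RecursiveResolver` class are assumptions for illustration, and no real DNS traffic is sent.

```python
import time

# Hypothetical zone data: which server each level refers the resolver to.
REFERRALS = {
    "root": {"com.": "tld-com"},
    "tld-com": {"example.com.": "auth-example"},
}
# Final answers held by the authoritative server: name -> (address, TTL).
AUTHORITATIVE = {"auth-example": {"www.example.com.": ("93.184.216.34", 300)}}


class RecursiveResolver:
    def __init__(self, clock=time.monotonic):
        self.cache = {}            # name -> (address, expires_at)
        self.clock = clock
        self.upstream_queries = 0  # counts queries sent to other servers

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit and hit[1] > self.clock():
            return hit[0]          # still fresh: answer from cache
        server = "root"
        while server in REFERRALS:  # follow referrals: root -> TLD -> authoritative
            self.upstream_queries += 1
            server = next(ns for zone, ns in REFERRALS[server].items()
                          if name.endswith(zone))
        self.upstream_queries += 1  # final query to the authoritative server
        address, ttl = AUTHORITATIVE[server][name]
        self.cache[name] = (address, self.clock() + ttl)
        return address
```

The first lookup costs three upstream queries (root, TLD, authoritative); repeat lookups within the TTL cost none.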

3.3 Record Types

| Type | Meaning | Example |
| --- | --- | --- |
| A | Domain → IPv4 address | example.com → 93.184.216.34 |
| AAAA | Domain → IPv6 address | example.com → 2606:2800:220:1:... |
| CNAME | Alias → canonical name | www.example.com → example.com |
| MX | Mail exchange server | example.com → mail.example.com |
| NS | Name server | example.com → ns1.example.com |
| TXT | Text information | SPF, DKIM verification |
| SRV | Service locator | _sip._tcp.example.com |
| SOA | Start of authority | Serial number, refresh intervals, etc. |
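
On the wire, each record type is identified by a numeric code in the question section of a DNS message. The sketch below builds a minimal query packet; the `build_query` helper and the default query ID are assumptions for this example, and it handles only plain ASCII names.

```python
import struct

# Numeric type codes for the record types listed above.
QTYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6,
          "MX": 15, "TXT": 16, "AAAA": 28, "SRV": 33}


def build_query(name, qtype, query_id=0x1234):
    """Build a DNS query message with a single question."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack(">HH", QTYPES[qtype], 1)  # class 1 = IN
```

Sending the resulting bytes in a UDP datagram to port 53 of a resolver would return a response in the same message format; parsing it is left out here.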

3.4 DNS Security

  • DNS hijacking: Tampers with DNS responses, redirecting users to malicious sites
  • DNSSEC: Digitally signs DNS responses to prevent tampering
  • DoH (DNS over HTTPS): Encrypts DNS queries to prevent eavesdropping
  • DoT (DNS over TLS): Same purpose, port 853

4. WebSocket

4.1 Motivation

HTTP is a request-response model; the server cannot proactively push data. WebSocket provides full-duplex communication.

4.2 Handshake

Based on the HTTP Upgrade mechanism:

Client:
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

Server:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

After the handshake, communication switches to the WebSocket protocol, and either side can send messages at any time.
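
The server derives Sec-WebSocket-Accept from the client's key: it appends a fixed GUID defined by the WebSocket specification, takes the SHA-1 hash, and Base64-encodes the result. A minimal sketch follows, with `websocket_accept` as a name chosen for this example.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol specification (RFC 6455).
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key):
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

For the key in the handshake above, `websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`, matching the server response. The check proves the server understood the WebSocket handshake; it is not a security mechanism.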

4.3 Use Cases

| Scenario | Reason |
| --- | --- |
| Real-time chat | Bidirectional instant messaging |
| Real-time data push | Stock quotes, sports scores |
| Online gaming | Low-latency bidirectional communication |
| Collaborative editing | Multi-user real-time synchronization |

4.4 Alternatives

| Technology | Mechanism | Comparison |
| --- | --- | --- |
| SSE (Server-Sent Events) | Server → client unidirectional push | Simpler, unidirectional only |
| Long Polling | Client sends request, server holds until new data | Good compatibility but low efficiency |
| WebSocket | Full duplex | Most flexible but more complex to implement |
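
Part of why SSE is simpler is its wire format: plain text lines over an ordinary HTTP response with the `text/event-stream` content type, where a blank line ends each event. A minimal sketch of a server-side formatter, with `format_sse` as a hypothetical helper name:

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one Server-Sent Event as text for a text/event-stream body."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" line per line of text.
    lines.extend(f"data: {line}" for line in data.split("\n"))
    return "\n".join(lines) + "\n\n"  # the blank line terminates the event
```

A server would write these strings to a long-lived response; in the browser, the `EventSource` API parses them and reconnects automatically.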

5. API Design Paradigms

5.1 REST (Representational State Transfer)

| Principle | Description |
| --- | --- |
| Resource-oriented | URLs represent resources (nouns), HTTP methods represent operations (verbs) |
| Stateless | Each request contains all necessary information |
| Uniform interface | GET/POST/PUT/DELETE carry the same semantics for every resource |
| Cacheable | Leverages HTTP caching mechanisms |

GET    /api/users          → List users
GET    /api/users/123      → Get user 123
POST   /api/users          → Create user
PUT    /api/users/123      → Replace user 123
DELETE /api/users/123      → Delete user 123
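
The mapping from method and URL to operation can be sketched as a small route table. The handler names and the `match_route` function are hypothetical; real frameworks provide their own routing.

```python
import re

# (method, URL pattern, handler name): the URL names the resource,
# the method names the operation performed on it.
ROUTES = [
    ("GET", re.compile(r"/api/users"), "list_users"),
    ("GET", re.compile(r"/api/users/(\d+)"), "get_user"),
    ("POST", re.compile(r"/api/users"), "create_user"),
    ("PUT", re.compile(r"/api/users/(\d+)"), "replace_user"),
    ("DELETE", re.compile(r"/api/users/(\d+)"), "delete_user"),
]


def match_route(method, path):
    """Return (handler_name, path_params), or None if nothing matches."""
    for route_method, pattern, handler in ROUTES:
        match = pattern.fullmatch(path)
        if route_method == method and match:
            return handler, tuple(int(g) for g in match.groups())
    return None
```

The same URL `/api/users/123` maps to three different operations depending only on the method, which is the uniform interface in practice.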

5.2 GraphQL

A query language developed by Facebook:

| Feature | Compared to REST |
| --- | --- |
| Client specifies return fields | Avoids over-fetching / under-fetching |
| Single endpoint | No need for multiple URLs |
| Strongly typed Schema | Self-describing API |
| Nested queries | Fetch related data in a single request |

query {
  user(id: 123) {
    name
    email
    posts {
      title
      createdAt
    }
  }
}

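Limitations: HTTP caching is harder (queries are typically sent as POST to a single endpoint), naive resolvers suffer from the N+1 query problem, and servers need query depth/complexity limits to guard against expensive queries.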
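
The N+1 problem arises when a nested field (such as `posts` on each user) is resolved with one database query per parent object. The sketch below uses a hypothetical in-memory `CountingDB` to compare a naive resolver with a batched one; it is not tied to any GraphQL library.

```python
class CountingDB:
    """In-memory stand-in for a database that counts round trips."""

    def __init__(self, posts_by_user):
        self.posts_by_user = posts_by_user
        self.queries = 0

    def user_ids(self):
        self.queries += 1
        return list(self.posts_by_user)

    def posts_for(self, user_id):
        self.queries += 1
        return self.posts_by_user[user_id]

    def posts_for_many(self, user_ids):
        self.queries += 1  # one query covering all requested users
        return {uid: self.posts_by_user[uid] for uid in user_ids}


def resolve_naive(db):
    # 1 query for the users + N queries, one per user's posts.
    return {uid: db.posts_for(uid) for uid in db.user_ids()}


def resolve_batched(db):
    # 1 query for the users + 1 batched query for all posts.
    return db.posts_for_many(db.user_ids())
```

Batching of this kind is the idea behind the DataLoader pattern commonly used with GraphQL servers.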

5.3 gRPC

A high-performance RPC framework developed by Google:

| Feature | Description |
| --- | --- |
| Protocol Buffers | Binary serialization, more compact and faster than JSON |
| HTTP/2 | Multiplexing, header compression |
| Strongly typed | .proto files define interfaces, code generation |
| Streaming | Unidirectional and bidirectional streaming |
| Multi-language support | Auto-generates client/server code |

service UserService {
  rpc GetUser (UserRequest) returns (UserResponse);
  rpc ListUsers (ListRequest) returns (stream UserResponse);  // Server-side streaming
}

Use cases: Microservice communication, low-latency requirements, multi-language environments.
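
To see why Protocol Buffers output is more compact than JSON, here is a minimal hand-rolled sketch of its varint encoding for a single integer field. It covers only non-negative integers and is not a substitute for generated code; the function names and the `id` field are assumptions for this example.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F  # low 7 bits of what remains
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number, value):
    """Encode one integer field: tag (field number + wire type 0), then value."""
    tag = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)


proto_bytes = encode_int_field(1, 150)                        # b"\x08\x96\x01"
json_bytes = json.dumps({"id": 150}, separators=(",", ":")).encode()
```

Here the Protobuf encoding takes 3 bytes while the compact JSON `{"id":150}` takes 10, because field names are replaced by numbers defined in the .proto file.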

5.4 Comparison Summary

| Dimension | REST | GraphQL | gRPC |
| --- | --- | --- | --- |
| Data format | JSON | JSON | Protobuf (binary) |
| Transport | HTTP/1.1+ | HTTP/1.1+ | HTTP/2 |
| Type system | Weak (OpenAPI optional) | Strong (Schema) | Strong (.proto) |
| Real-time | Not natively supported | Subscriptions | Bidirectional streaming |
| Browser support | Native | Native | Requires gRPC-Web proxy |
| Best for | Public APIs, simple CRUD | Complex frontend queries | Internal microservice communication |

6. CDN (Content Delivery Network)

6.1 Principle

Cache content at globally distributed edge nodes; users access the nearest node:

User → DNS resolution → CDN intelligent routing → Nearest edge node
                                                  ├── Cache hit → Return directly
                                                  └── Cache miss → Fetch from origin → Cache → Return
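
The hit/miss flow at a single edge node can be sketched as a small TTL cache in front of an origin fetch. The `EdgeNode` class, its parameters, and the injected clock are assumptions for illustration; real CDNs add eviction, request coalescing, and header-driven freshness rules.

```python
class EdgeNode:
    """Toy edge cache: serve from cache while fresh, else fetch from origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> bytes
        self.ttl_seconds = ttl_seconds
        self.clock = clock                          # callable returning seconds
        self.cache = {}                             # url -> (body, expires_at)

    def get(self, url):
        now = self.clock()
        entry = self.cache.get(url)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"                  # cache hit: return directly
        body = self.fetch_from_origin(url)          # cache miss: go to origin
        self.cache[url] = (body, now + self.ttl_seconds)
        return body, "MISS"
```

Only misses reach the origin, so a high hit ratio reduces both user latency and origin load.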

6.2 CDN Functions

| Function | Description |
| --- | --- |
| Static acceleration | Cache images, JS, CSS, and other static resources |
| Dynamic acceleration | Optimize routing paths, reduce origin latency |
| DDoS protection | Distributed nodes absorb attack traffic |
| HTTPS termination | Offload TLS at edge nodes |
| Edge computing | Run logic at edge nodes (e.g., Cloudflare Workers) |

6.3 Caching Strategies

Controlled via HTTP headers:

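| Header | Function |
| --- | --- |
| Cache-Control: max-age=3600 | Response may be reused for 1 hour without revalidation |
| Cache-Control: no-cache | May be stored, but must be revalidated with the origin before each reuse |
| Cache-Control: no-store | Never stored by any cache |
| ETag | Resource fingerprint used in conditional requests (If-None-Match) |
| Last-Modified | Last modification time, used in conditional requests (If-Modified-Since) |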
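
A conditional request lets a cache revalidate without downloading the body again: the client sends the stored ETag in If-None-Match, and the server answers 304 Not Modified if it still matches. A minimal sketch, in which deriving the ETag from a truncated SHA-256 of the body is just one possible choice and `make_etag` and `respond` are hypothetical helpers:

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, body) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is still valid: send no body
    return 200, headers, body
```

Combined with `no-cache`, every reuse costs one round trip but transfers the body only when the resource has actually changed.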

Cache invalidation is famously one of the hard problems in computing. Common strategies:

  • URL versioning: /app.v2.js or /app.js?v=abc123
  • Purge API: Proactively clear CDN cache
  • Short TTL + conditional requests
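
URL versioning is often automated by embedding a content hash in the file name, so that changed content gets a new URL and old cache entries simply stop being requested. A minimal sketch, where `versioned_url` and the 8-character digest length are assumptions for this example:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_url(path, content, digest_len=8):
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

For `/static/app.js` this yields a URL of the form `/static/app.<hash>.js`. Such assets can be served with a very long `max-age`, since any change to the content produces a different URL.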
