Border Gateway Protocol (BGP)

Configuring Autonomous Systems and BGP Peering Relationships

Understand how independent networks exchange routing data using Autonomous System Numbers and established peering sessions.

Networking & Hardware | Advanced | 15 min read

The Architecture of a Decentralized Internet

The internet is not a single, unified entity managed by one organization. It is instead a massive federation of independent networks known as Autonomous Systems that must cooperate to move traffic globally. This cooperation relies on a shared understanding of how to reach specific IP address blocks across different administrative boundaries.

Without a standardized method for these networks to communicate, your data would never leave your local service provider. The Border Gateway Protocol serves as the glue that binds these thousands of individual networks together into a single global fabric. It allows a router in Tokyo to know exactly which path to take to reach a server sitting in a data center in London.

An Autonomous System is defined as a set of internet routable IP prefixes under the control of one or more network operators that presents a common routing policy to the internet. Each of these systems is assigned a unique number known as an Autonomous System Number which acts as its identity on the global stage. These numbers are critical for tracking the path a packet takes through the web of interconnected routers.

BGP is often described as the protocol of trust in an environment where no single entity has total control over the infrastructure.

Autonomous System Numbers and Identification

Autonomous System Numbers come in two sizes: 16-bit and 32-bit. The original 16-bit space allowed only 65,536 values, and as it approached exhaustion the protocol was extended to carry 32-bit numbers. This expansion ensures that every new data center, cloud provider, and regional carrier can have a unique identity for global routing.

Private ASNs also exist for use within internal network architectures where global visibility is not required. These are similar to private IP addresses and must be stripped before a route is announced to the public internet. Managing the transition between internal identifiers and public identities is a core responsibility of the network edge.
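The number ranges involved can be made concrete with a short sketch. The function names are hypothetical, and the private ranges used are the ones reserved by RFC 6996 (64512 to 65534 and 4200000000 to 4294967294). Real routers apply extra conditions before removing private ASNs, so treat the stripping function as a simplification.

```python
# Private-use ASN ranges reserved by RFC 6996 (upper bounds are exclusive here).
PRIVATE_16BIT = range(64512, 65535)            # 64512..65534
PRIVATE_32BIT = range(4200000000, 4294967295)  # 4200000000..4294967294


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in a private-use range."""
    return asn in PRIVATE_16BIT or asn in PRIVATE_32BIT


def fits_in_16_bits(asn: int) -> bool:
    """Return True if the ASN can be carried in the original 2-byte field."""
    return 0 <= asn <= 0xFFFF


def strip_private_asns(as_path: list[int]) -> list[int]:
    """Drop private ASNs from a path before it is announced publicly (simplified)."""
    return [asn for asn in as_path if not is_private_asn(asn)]
```

For example, `strip_private_asns([64496, 64512, 65001])` keeps only 64496, because the other two values sit inside the 16-bit private range.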

The Concept of Routing Policy

Unlike interior routing protocols, which pick paths from a link metric such as cost or bandwidth, BGP is driven by business logic and policy. A network administrator might choose a longer path through a specific partner because it costs less than a shorter path through a competitor. This flexibility allows organizations to enforce complex transit agreements and traffic engineering goals.

Policy is implemented using filters that examine incoming and outgoing route advertisements. These filters can modify attributes, block certain prefixes, or prioritize specific neighbors based on the needs of the organization. This level of control is what makes BGP suitable for the complex geopolitical and economic landscape of the internet.
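A minimal sketch of an import filter shows the idea of blocking prefixes and setting a per-neighbor preference. Everything here is an assumption made for illustration: the deny list, the neighbor addresses, the preference values, and the `apply_import_policy` name. It handles IPv4 prefixes only.

```python
import ipaddress
from typing import Optional

# Hypothetical policy data: prefixes we never accept, and a preference per neighbor.
DENIED_PREFIXES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]
NEIGHBOR_LOCAL_PREF = {"203.0.113.5": 200, "203.0.113.9": 100}
DEFAULT_LOCAL_PREF = 100


def apply_import_policy(route: dict, neighbor: str) -> Optional[dict]:
    """Return a modified copy of the route, or None if the filter rejects it."""
    prefix = ipaddress.ip_network(route["prefix"])
    # Block anything that falls inside a denied range.
    if any(prefix.subnet_of(denied) for denied in DENIED_PREFIXES):
        return None
    accepted = dict(route)
    # Prioritize specific neighbors by assigning a higher local preference.
    accepted["local_pref"] = NEIGHBOR_LOCAL_PREF.get(neighbor, DEFAULT_LOCAL_PREF)
    return accepted
```

Returning a copy rather than mutating the input keeps the received advertisement intact, which mirrors how routers retain the unmodified route alongside the post-policy version.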

Establishing Peering and Adjacency

Before two networks can exchange information, they must establish a formal communication session. This process is known as peering and occurs over a standard TCP connection on port 179. Using a reliable transport protocol ensures that routing updates are delivered without loss and in the correct order.

The relationship between two BGP routers is referred to as a peer or neighbor relationship. Once the TCP connection is established, the routers perform a handshake to negotiate capabilities and confirm identities. This stateful connection remains open indefinitely to allow for real-time updates as network conditions change.

BGP Peer Configuration Example
```
! Cisco IOS-style configuration for an edge router.
! FRRouting uses the same structure but writes the prefix in CIDR form:
!   network 198.51.100.0/24
router bgp 65001
  bgp router-id 192.0.2.1
  ! Define a neighbor in a different Autonomous System
  neighbor 203.0.113.5 remote-as 65002
  neighbor 203.0.113.5 description PEERING_PARTNER_A
  ! Specify which local networks to advertise to the peer
  address-family ipv4 unicast
    network 198.51.100.0 mask 255.255.255.0
    neighbor 203.0.113.5 activate
  exit-address-family
```

The stability of these sessions is paramount for internet uptime. If a session flaps or disconnects repeatedly, the resulting churn can propagate well beyond the two peers as other routers react to each change. Implementations use keepalive and hold timers to detect failures, and mechanisms such as route flap damping to keep minor glitches from causing massive traffic shifts.

The BGP Finite State Machine

A BGP session moves through a series of specific states before it is considered fully operational. It starts in the Idle state where the router waits for a start event to begin the connection process. It then moves through Connect and Active states as it attempts to establish the underlying TCP session with its neighbor.

Once the TCP connection is successful, the router sends an Open message and transitions to the OpenSent state. After receiving a valid Open message from the neighbor, it replies with a Keepalive and moves to OpenConfirm. When the neighbor's Keepalive arrives in turn, the session reaches the Established state. Only when in the Established state can the routers actually exchange routing information via Update messages.
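The state progression can be sketched as a lookup table. This is a simplified model rather than a full RFC 4271 state machine: the event names are invented for illustration, timers are ignored, and any unexpected event simply returns the session to Idle. Note that the final step from OpenConfirm to Established is triggered by a Keepalive from the neighbor.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_connected"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_connected"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Advance the session; anything unexpected drops it back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)
```

Feeding the events `start`, `tcp_connected`, `open_received`, and `keepalive_received` in order walks a session from Idle to Established.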

Message Types and Communication

There are four primary message types used to maintain the relationship between peers. Open messages initiate the session, while Update messages carry the actual routing data including new paths or withdrawn routes. Keepalive messages are sent periodically to ensure the neighbor is still active and the link is healthy.

Notification messages are used when an error occurs during the session. If a router receives an invalid message or experiences a configuration mismatch, it sends a Notification and immediately closes the session. This fail-fast mechanism prevents incorrect routing data from corrupting the local routing table.
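All four message types share a fixed 19-byte header defined in RFC 4271: a 16-byte marker of all ones, a 2-byte length, and a 1-byte type code. The sketch below builds and parses that header with the standard `struct` module. The function names are hypothetical, and message bodies are not decoded.

```python
import struct

MARKER = b"\xff" * 16
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
# Network byte order: 16-byte marker, 2-byte length, 1-byte type.
HEADER = struct.Struct("!16sHB")


def build_keepalive() -> bytes:
    """A Keepalive is just the header, so its length equals the header size."""
    return HEADER.pack(MARKER, HEADER.size, 4)


def parse_header(data: bytes) -> tuple[str, int]:
    """Return (message type name, total message length) or raise ValueError."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    marker, length, type_code = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("bad marker")
    if not HEADER.size <= length <= 4096:
        raise ValueError("length out of range")
    if type_code not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return MESSAGE_TYPES[type_code], length
```

A real speaker would answer each `ValueError` case with a Notification message and close the session, matching the fail-fast behavior described above.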

Path-Vector Logic and Route Selection

BGP is classified as a path-vector protocol because it includes a list of all Autonomous Systems that a route has passed through. This list is known as the AS-Path attribute and is the primary tool for preventing infinite routing loops. If a router sees its own number in the path of an incoming advertisement, it knows a loop has occurred and rejects the route.
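The loop check amounts to a membership test on the AS-Path. In this minimal sketch the two function names are hypothetical and the ASNs come from the documentation range.

```python
def export_to_ebgp_peer(local_asn: int, as_path: list[int]) -> list[int]:
    """Each AS adds its own number to the front of the path when advertising."""
    return [local_asn, *as_path]


def accept_from_peer(local_asn: int, as_path: list[int]) -> bool:
    """Reject any advertisement whose path already contains our own ASN."""
    return local_asn not in as_path


# A route originates in AS 64496 and is passed on by AS 64497.
path = export_to_ebgp_peer(64496, [])
path = export_to_ebgp_peer(64497, path)

print(accept_from_peer(64498, path))  # True: AS 64498 is not yet in the path
print(accept_from_peer(64496, path))  # False: the route has looped back to its origin
```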

The protocol does not use a simple metric like hop count or link speed to determine the best path. Instead, it uses a complex tie-breaking algorithm that evaluates several attributes in a specific order. This allows network engineers to fine-tune exactly how traffic enters and exits their infrastructure.

  • Weight: A local, vendor-specific value used to prefer one path over another on a single router; the highest weight wins.
  • Local Preference: Used to tell all routers within an AS which exit point is preferred; the highest value wins.
  • AS-Path Length: Prefers the route that passes through the fewest number of networks.
  • Multi-Exit Discriminator: Used to suggest a preferred entry point to an external neighbor; the lowest value wins.
  • Router ID: A final tie-breaker based on the unique identity of the advertising router; the lowest ID wins.
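The ordering in this list can be expressed as a sort key. This is a sketch covering only the five attributes listed here. Real implementations evaluate additional steps (origin type, eBGP versus iBGP, IGP cost to the next hop) and normally compare MED only between routes from the same neighboring AS. The `Route` class and its default values are assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    as_path: list
    weight: int = 0
    local_pref: int = 100
    med: int = 0
    router_id: str = "0.0.0.0"


def preference_key(route: Route) -> tuple:
    """Smaller tuples are better, so 'higher wins' attributes are negated."""
    return (
        -route.weight,                             # highest weight wins
        -route.local_pref,                         # highest local preference wins
        len(route.as_path),                        # shortest AS-Path wins
        route.med,                                 # lowest MED wins
        int(ipaddress.IPv4Address(route.router_id)),  # lowest router ID wins
    )


def best_path(routes: list) -> Route:
    return min(routes, key=preference_key)
```

Because tuples compare element by element, a later attribute is consulted only when every earlier one ties, which is exactly how the tie-breaking sequence behaves.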

Developers interacting with network APIs often need to understand these attributes to debug latency or reachability issues. For example, a sudden increase in the AS-Path length usually indicates that a primary link has failed and traffic is taking a sub-optimal backup route.

AS-Path Prepending

Network engineers sometimes intentionally manipulate the AS-Path to influence how the rest of the world sends traffic to them. By repeating their own number multiple times in the path, they make a specific route look longer and therefore less desirable. This technique is known as prepending and is a common way to manage inbound traffic across multiple providers.
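A small sketch shows the effect on a remote network that compares only AS-Path length. The provider names, ASNs, and function names are hypothetical, and a real remote network may override path length with its own local preference.

```python
def prepend(as_path: list[int], local_asn: int, times: int) -> list[int]:
    """Repeat our own ASN at the front of the path to make it look longer."""
    return [local_asn] * times + as_path


def remote_choice(candidates: dict[str, list[int]]) -> str:
    """Pick the candidate with the shortest AS-Path (path length only)."""
    return min(candidates, key=lambda name: len(candidates[name]))


OUR_ASN = 64496
own_path = [OUR_ASN]

# Announce normally via provider A, and with two extra copies via provider B.
seen_by_remote = {
    "provider_b": [64501] + prepend(own_path, OUR_ASN, 2),
    "provider_a": [64500] + own_path,
}

print(remote_choice(seen_by_remote))  # provider_a
```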

While prepending is effective, it is a blunt instrument that can have unintended consequences. If everyone prepends their routes, the global routing table becomes more complex and harder to optimize. It is often considered a last resort when other attribute modifications are not possible.

Traffic Engineering with Communities

BGP Communities are tags attached to routes that allow for more granular control over how they are handled by other networks. These tags act as metadata that can signal a provider to reduce the priority of a route or prevent it from being advertised to certain regions. They enable a more collaborative approach to traffic management between independent entities.

Standard communities are 32-bit values, conventionally written as two 16-bit halves in the form ASN:value, which leaves no room for a 32-bit ASN. Large communities, made of three 32-bit fields, remove that limitation. This tagging system is essential for building complex routing policies that scale across thousands of peers. Without communities, network operators would have to manually coordinate every minor policy change with their partners.
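The packing of a standard community into 32 bits can be sketched as follows. The helper names are hypothetical. `NO_EXPORT` is the well-known value defined in RFC 1997.

```python
# Well-known community from RFC 1997: do not advertise outside the local AS.
NO_EXPORT = 0xFFFFFF01


def encode_community(text: str) -> int:
    """Pack 'ASN:value' into one 32-bit integer (high 16 bits are the ASN)."""
    asn, value = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("each half of a standard community must fit in 16 bits")
    return (asn << 16) | value


def decode_community(raw: int) -> str:
    """Unpack a 32-bit community back into 'ASN:value' notation."""
    return f"{raw >> 16}:{raw & 0xFFFF}"


print(decode_community(NO_EXPORT))  # 65535:65281
```

Passing a 32-bit ASN such as `4200000000:100` raises `ValueError`, which is the limitation that large communities were introduced to remove.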

Internal vs External BGP

While BGP is famous for connecting different networks, it is also used within a single network to manage routing information. When two routers in different Autonomous Systems talk, they use External BGP (eBGP). When two routers within the same Autonomous System talk, they use Internal BGP (iBGP).

The rules for these two modes differ significantly to ensure internal stability and prevent loops. For instance, routes learned via iBGP are not typically passed on to other iBGP peers to avoid loops within the network core. This behavior creates a requirement for a full mesh of connections where every router must talk to every other router.

Simulating a BGP Update Packet
```python
# A conceptual representation of a BGP Update message structure
bgp_update = {
    "withdrawn_routes": ["192.0.2.0/24"],
    "path_attributes": {
        "origin": "IGP",
        "as_path": [65001, 64512, 64496],
        "next_hop": "203.0.113.1",
        "multi_exit_disc": 100,
        "local_pref": 200
    },
    "network_layer_reachability_info": ["198.51.100.0/24", "203.0.113.0/24"]
}


def process_route(update):
    # Business logic to evaluate the AS-Path length
    path_length = len(update["path_attributes"]["as_path"])
    return f"Processing route with path length: {path_length}"


print(process_route(bgp_update))  # Processing route with path length: 3
```

Scaling an iBGP network requires architectural changes to avoid the complexity of a full mesh. Technologies like Route Reflectors or Confederations allow a network to grow without requiring thousands of individual peering sessions. These tools act as centralized hubs that distribute routing information efficiently across the entire organization.

The Next-Hop Problem

In eBGP, the next-hop address for a route is typically the address of the neighbor that sent the advertisement. However, when that route is passed into the internal network via iBGP, the next-hop address remains unchanged by default. This can lead to issues where internal routers do not know how to reach the external gateway.

To solve this, engineers use a feature called next-hop-self which forces the edge router to update the next-hop attribute to its own internal address. This ensures that every router within the system has a valid path to exit the network. Understanding this behavior is critical for troubleshooting internal connectivity gaps.
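The effect of next-hop-self can be modeled in a few lines. This is a toy model: the internal address space, the loopback address, and both function names are assumptions, and a real router resolves next hops through its routing table rather than a static list.

```python
import ipaddress

# Hypothetical internal address space known to the interior routing protocol.
IGP_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def next_hop_reachable(next_hop: str) -> bool:
    """An internal router can use a route only if it can resolve the next hop."""
    address = ipaddress.ip_address(next_hop)
    return any(address in network for network in IGP_NETWORKS)


def advertise_to_ibgp(route: dict, edge_loopback: str, next_hop_self: bool) -> dict:
    """Copy the route for an iBGP peer, optionally rewriting the next hop."""
    advertised = dict(route)
    if next_hop_self:
        advertised["next_hop"] = edge_loopback
    return advertised


learned = {"prefix": "198.51.100.0/24", "next_hop": "203.0.113.5"}

unchanged = advertise_to_ibgp(learned, "10.0.0.1", next_hop_self=False)
rewritten = advertise_to_ibgp(learned, "10.0.0.1", next_hop_self=True)

print(next_hop_reachable(unchanged["next_hop"]))  # False: external address is unknown inside
print(next_hop_reachable(rewritten["next_hop"]))  # True: the edge router's own address
```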

Route Reflectors and Scaling

A Route Reflector serves as a focal point for routing updates within an Autonomous System. Instead of every router peering with every other router, they all peer with the reflector which then rebroadcasts the best routes to everyone else. This significantly reduces the number of sessions and the memory overhead on smaller routers.

While reflectors simplify the topology, they can introduce a single point of failure if not designed with redundancy. Most production networks deploy redundant pairs of reflectors to ensure that the internal control plane remains operational even if one device fails. This design pattern is standard for large-scale cloud and service provider environments.
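The scaling difference is easy to quantify. The sketch below assumes every client peers with every reflector and that the reflectors peer with each other in a small full mesh. The function names are illustrative.

```python
def full_mesh_sessions(routers: int) -> int:
    """Every router peers with every other router: n * (n - 1) / 2 sessions."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers: int, reflectors: int = 2) -> int:
    """Clients peer with each reflector; reflectors are meshed among themselves."""
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(100))           # 4950
print(route_reflector_sessions(100, 2))  # 197
```

With 100 routers, a redundant pair of reflectors needs 197 sessions instead of 4950, and each client maintains only two sessions rather than 99.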

Security and Modern Challenges

The original design of BGP was built on a foundation of implicit trust between network operators. In the modern internet, this lack of built-in security has led to numerous incidents where traffic was redirected or intercepted through malicious or accidental route announcements. These incidents, known as BGP hijacks, can disrupt services for millions of users simultaneously.

A hijack occurs when a network announces IP space that it is not authorized to originate. Other routers may prefer the bogus announcement, for example because it covers a more specific prefix or offers a shorter AS-Path, and begin sending traffic to the wrong destination. This can be used for denial-of-service attacks or for more sophisticated man-in-the-middle interception.

The industry has responded with the Resource Public Key Infrastructure which provides a way to cryptographically verify route ownership. By using digital signatures, network operators can prove they are authorized to announce specific IP prefixes. This adds a much-needed layer of validation to the global routing table.

Security in BGP is transitioning from a manual, social process of verification to an automated, cryptographic standard.

Route Origin Authorizations

A Route Origin Authorization (ROA) is a signed digital object that links an IP prefix, together with a maximum prefix length, to the Autonomous System Number authorized to originate it. ROAs are published in RPKI repositories. Validator software fetches and verifies them, then supplies the resulting prefix-to-origin table to routers. If a router receives a route that contradicts the signed authorization, it can automatically discard the invalid path.
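The validation outcome for a single announcement can be sketched using the three states from RFC 6811: valid, invalid, and not-found. The `ROA` tuple and `validate_origin` function are illustrative names, and the list of ROAs stands in for data that a validator would have already verified cryptographically.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    prefix: str      # authorized prefix
    max_length: int  # longest prefix length the origin may announce
    asn: int         # authorized origin ASN


def validate_origin(prefix: str, origin_asn: int, roas: list) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs for the other address family or for unrelated space.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none means invalid.
    return "invalid" if covered else "not-found"


roas = [ROA("198.51.100.0/24", 24, 64496)]
print(validate_origin("198.51.100.0/24", 64496, roas))  # valid
print(validate_origin("198.51.100.0/24", 64511, roas))  # invalid
print(validate_origin("203.0.113.0/24", 64496, roas))   # not-found
```

An announcement of `198.51.100.0/25` from AS 64496 would also be invalid, because it is more specific than the ROA's maximum length allows.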

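Implementing this validation stops accidental mis-originations and simple origin hijacks, which are among the most common incidents. It does not verify the rest of the AS-Path, so on its own it cannot catch route leaks or forged paths that preserve the legitimate origin. Adoption is growing, but protection improves only as more networks both publish ROAs and drop invalid routes. Modern network engineers must be familiar with creating and maintaining these digital records as part of standard operations.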

The Impact of Route Flapping

Route flapping occurs when a network connection goes up and down rapidly, causing a flood of update and withdrawal messages. This creates a high CPU load on routers and can cause significant latency for users as the network constantly recalculates paths. To combat this, routers use a technique called flap damping to temporarily ignore unstable routes.

Damping works by assigning a penalty to a route every time it changes state. If the penalty exceeds a certain threshold, the route is suppressed for a specific period to allow the link to stabilize. While this protects the global control plane, it can lead to longer recovery times for legitimate network issues if not configured carefully.
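The penalty mechanism can be sketched with exponential decay. The numeric values (1000 per flap, suppress at 2000, reuse below 750, 15-minute half-life) are illustrative assumptions chosen for this example, not recommended settings, and the class name is hypothetical.

```python
class DampedRoute:
    """Tracks a flap penalty that decays exponentially over time (in seconds)."""

    def __init__(self, per_flap=1000.0, suppress=2000.0, reuse=750.0, half_life=900.0):
        self.per_flap = per_flap
        self.suppress = suppress
        self.reuse = reuse
        self.half_life = half_life
        self.penalty = 0.0
        self.suppressed = False
        self.last_update = 0.0

    def _decay(self, now: float) -> None:
        # Halve the penalty for every half-life that has elapsed.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record a state change and suppress the route if the penalty is too high."""
        self._decay(now)
        self.penalty += self.per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True

    def is_usable(self, now: float) -> bool:
        """Release the route once its penalty has decayed below the reuse limit."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return not self.suppressed
```

With these values, three flaps ten seconds apart suppress the route, and it stays suppressed for roughly half an hour afterward. That delay is the longer recovery time the paragraph above warns about.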
