Internet-Draft bw-aware-bypass-no-reservation January 2024
Szarecki & Mitchell Expires 26 July 2024
Workgroup:
Traffic Engineering Architecture and Signaling
Internet-Draft:
draft-szarecki-teas-bw-aware-bypass-no-reservation
Published:
Intended Status:
Informational
Expires:
Authors:
R.J. Szarecki
Google LLC
J. Mitchell
Google LLC

MPLS FRR bypass path calculation with bandwidth awareness without reservation

Abstract

RFC 4090 documents facility backup Fast Reroute (FRR) in MPLS-TE networks. This document describes methods that allow the Point of Local Repair (PLR) to find a bypass path with sufficient available bandwidth to accommodate protected traffic, without making undesired reservations that would require additional capacity. It covers determination of the bandwidth a bypass tunnel requires, bypass path computation, and bypass tunnel signaling.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 26 July 2024.

1. Introduction

Network operators use MPLS-TE technology, as described in [RFC3209], to optimize the use of network resources (bandwidth over the network graph) while providing prioritized treatment - high bandwidth and low loss. This is often done by combining bandwidth reservation for LSPs with protection of the more critical LSPs by the facility backup technique [RFC4090]. The facility backup technique uses a pre-signaled bypass tunnel (instantiated as an MPLS LSP) to temporarily carry the impacted LSPs around a faulty resource, such as a link or node, immediately after a failure.

In many networks, operators have not configured the LSPs to indicate that bandwidth protection is desired, and therefore the bypass tunnels are provisioned without a matching bandwidth reservation constraint. This prevents bandwidth from being booked for tunnels that carry traffic only briefly, during failure events, until global convergence. Without a bandwidth constraint that matches the protected LSPs' required bandwidth, the bypass tunnels are typically routed on the shortest path between the Point of Local Repair (PLR) and the Merge Point (MP), as computed by CSPF subject to any other constraints such as shared risk link groups (SRLG). This practice has the limitation that the bypass tunnel could be routed over interfaces whose available bandwidth is much lower than the reservations on the protected resource. This limitation is especially common in networks that use IGP metric values based on distance or latency rather than bandwidth. Such networks may have a large difference between the highest and lowest capacity of adjacencies between various locations.

This document provides procedures that the head-end routing node of the bypass tunnel (the PLR) can use to optimize bypass tunnel path computation, so that the tunnel is routed over links that are more likely to have sufficient available bandwidth, while still not making bandwidth reservations. The procedures are local to the PLR; hence they do not impose any special requirements on other nodes in the network and can be deployed incrementally. They can be deployed whether the bypass tunnel's Merge Point (MP) is explicitly provisioned via local configuration or the PLR automatically determines the necessary MP from the Traffic Engineering Database (TED) and inspection of the RRO of protected LSPs.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Finding of bypass required bandwidth

In order for the PLR to compute a bypass path that has sufficient available bandwidth, the volume of traffic expected to be protected by a given bypass tunnel needs to be determined, even if it will not be used in RSVP signaling. In this document the term protected bandwidth (PBW) refers to this value.

2.1. Static Protected Bandwidth

The PBW for the bypass tunnel can be computed by an offline tool and configured on the PLR per bypass tunnel, per protected resource, or globally. The accuracy of this method is limited by the offline model and by how closely it tracks the current network state, so it may not be as reactive as the dynamic approach described in Section 2.2. The exact procedure by which the offline tool calculates PBW is out of scope of this document.
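
As an illustration only, the three configuration scopes could be resolved as sketched below in Python. The function name `static_pbw`, the dictionary-based configuration, and in particular the most-specific-wins precedence (tunnel, then protected resource, then global) are assumptions of this sketch; this document does not define a precedence order.

```python
from typing import Mapping, Optional


def static_pbw(tunnel: str,
               resource: str,
               per_tunnel: Mapping[str, int],
               per_resource: Mapping[str, int],
               global_pbw_bps: Optional[int] = None) -> Optional[int]:
    """Return the configured PBW in bps, or None if nothing is configured.

    Assumed precedence (not specified by this document):
    per-tunnel value, then per-protected-resource value, then global value.
    """
    if tunnel in per_tunnel:
        return per_tunnel[tunnel]
    if resource in per_resource:
        return per_resource[resource]
    return global_pbw_bps
```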

2.2. Dynamically Computed Protected Bandwidth

Since the PLR is aware of the bandwidth reserved by the LSPs that traverse each protected resource, such as a link or node, the PLR can compute the protected bandwidth per bypass tunnel as the sum of these protected LSPs' bandwidth reservations, and can update the bandwidth used for CSPF on a bypass tunnel at regular intervals. By setting the interval at which the value is reset, referred to as the PBW timer, the operator can balance accuracy against the computational resources and churn induced by this calculation. To reduce the error from relying on a single measurement per interval, multiple samples of the aggregate PBW value can be stored, and only the largest sample over several sample periods used for the next CSPF run triggered by the PBW timer.
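
The sampling described above can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: the class name `PbwSampler`, the default window of three sample periods, and the use of bits per second as the unit are assumptions.

```python
from collections import deque
from typing import Iterable


class PbwSampler:
    """Retains recent PBW samples for one bypass tunnel (hypothetical helper)."""

    def __init__(self, window: int = 3) -> None:
        # Number of sample periods retained; the default of 3 is an assumption.
        self._samples = deque(maxlen=window)

    def take_sample(self, lsp_reservations_bps: Iterable[int]) -> int:
        """Record the sum of the protected LSPs' bandwidth reservations."""
        sample = sum(lsp_reservations_bps)
        self._samples.append(sample)
        return sample

    def pbw(self) -> int:
        """Return the largest retained sample, or 0 if none has been taken."""
        return max(self._samples, default=0)
```

When the PBW timer expires, `pbw()` supplies the value handed to path computation; older samples fall out of the window automatically, so a past peak stops influencing the result after `window` sample periods.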

Optionally, a percentage scaling factor can be applied to the largest sample to control how much of the sum of the protected LSPs' bandwidth reservations the PBW accounts for. An implementation should allow a scaling factor in the range from 0% to at least 200% of PBW; the default value should be 100%.
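
A minimal sketch of the scaling step, in Python. The function name `scaled_pbw` and the truncation to a whole number of bits per second are assumptions of this sketch; no upper bound is enforced because the text above only requires support up to at least 200%.

```python
def scaled_pbw(top_sample_bps: int, scale_percent: float = 100.0) -> int:
    """Apply a percentage scaling factor to the largest PBW sample.

    The default of 100% leaves the sample unchanged.
    """
    if scale_percent < 0:
        raise ValueError("scale_percent must not be negative")
    # Truncate to whole bits per second (an assumption of this sketch).
    return int(top_sample_bps * scale_percent / 100)
```

For example, a largest sample of 4 Gbps with a scaling factor of 150% yields a PBW of 6 Gbps, while a factor of 50% yields 2 Gbps.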

Implementations may apply logic that prevents negligible changes from causing churn through re-routing of the tunnel, by comparing the new value with the value used at the previous PBW timer expiry, if the implementation retains that value. The periodic computation interval is expected to be shorter than or equal to the reoptimization timer of the LSPs that implement the bypass tunnel.
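
One possible form of such logic is a relative-change threshold, sketched below in Python. The function name `pbw_change_is_significant` and the default threshold of 10% are assumptions; this document does not prescribe a particular comparison.

```python
def pbw_change_is_significant(previous_bps: int,
                              candidate_bps: int,
                              threshold_percent: int = 10) -> bool:
    """Return True if the candidate PBW differs from the previous value
    by at least threshold_percent of the previous value."""
    if previous_bps == 0:
        # No meaningful baseline: any non-zero candidate counts as a change.
        return candidate_bps != 0
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return abs(candidate_bps - previous_bps) * 100 >= threshold_percent * previous_bps
```

With the assumed 10% threshold, a move from 1000 to 1050 is ignored, while a move from 1000 to 1200 results in the new value being used for the next path computation.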

This overall approach is conceptually very similar to many existing vendor implementations of the "auto-bandwidth" feature, a generalized summary of which is given in Section 4 of [RFC8733]. Because the number of bypass tunnels in many networks is low relative to the number of protected LSPs, the cost of re-computation and the associated churn is likely to be low in comparison to the cost incurred when the network uses auto-bandwidth for its protected LSPs.

2.3. Hybrid Approach

Even when the dynamically computed PBW is used, the initial PBW value for a bypass tunnel may be incorrect for a long time after the tunnel is created if the PBW timer period is long. The number of protected LSPs carried over a newly created bypass tunnel is likely to increase rapidly after certain network events, such as when a new protected resource (for example, a new interface) becomes operational. Initially only very few LSPs (or just one) require protection, so the PBW may initially be very low. Implementations SHOULD allow configuration of a minimum PBW value, which is used when a bypass tunnel is initialized and in place of the dynamically calculated PBW whenever the dynamically calculated PBW is lower than the configured minimum.
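
The hybrid rule amounts to a floor on the dynamic value. A minimal Python sketch follows; the function name `effective_pbw` and the use of `None` to represent "no dynamic value computed yet" are assumptions.

```python
from typing import Optional


def effective_pbw(dynamic_pbw_bps: Optional[int], minimum_pbw_bps: int) -> int:
    """Return the PBW to use for path computation.

    dynamic_pbw_bps is None while the bypass tunnel is being initialized
    and no dynamic value exists yet (a convention assumed by this sketch).
    """
    if dynamic_pbw_bps is None:
        return minimum_pbw_bps
    # The configured minimum acts as a floor on the dynamic value.
    return max(dynamic_pbw_bps, minimum_pbw_bps)
```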

3. Bypass path computation

The network path for the bypass tunnel is computed using the same logic as described in Section 6.2 of [RFC4090]. There are four classes of events that trigger the computation of a bypass path:

  1. Expiration of the reoptimization timer of an existing bypass LSP
  2. Receipt of a PathErr due to a network failure along the existing bypass path, or failure of the existing bypass egress interface on the PLR
  3. Initialization of a new bypass tunnel
  4. Change of the reserved bandwidth of a protected LSP, or signaling of a new LSP that is to be protected by an existing bypass tunnel

The last class of event can occur quite frequently, especially in networks that use automatic bandwidth determination for protected LSPs with aggressive intervals. As such, these events SHOULD NOT trigger bypass path re-computation: reacting to them could prevent the network from ever converging and would consume computational resources, namely the control plane CPU of the network device.
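
The event handling above can be summarized in a short Python sketch. The enumeration `BypassEvent`, its member names, and the function `triggers_recomputation` are hypothetical names chosen for illustration.

```python
from enum import Enum, auto


class BypassEvent(Enum):
    """The four event classes listed above (names are illustrative)."""
    REOPTIMIZATION_TIMER_EXPIRED = auto()
    BYPASS_PATH_FAILURE = auto()        # PathErr or PLR egress interface failure
    BYPASS_TUNNEL_INITIALIZED = auto()
    PROTECTED_LSP_CHANGE = auto()       # bandwidth change or newly protected LSP


# Only the first three classes lead to a new bypass path computation.
RECOMPUTE_EVENTS = frozenset({
    BypassEvent.REOPTIMIZATION_TIMER_EXPIRED,
    BypassEvent.BYPASS_PATH_FAILURE,
    BypassEvent.BYPASS_TUNNEL_INITIALIZED,
})


def triggers_recomputation(event: BypassEvent) -> bool:
    """Return True if the event should start a bypass path computation."""
    return event in RECOMPUTE_EVENTS
```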

For events of the first three classes, the PBW value determined as per Section 2 should be used to derive the bandwidth constraint used by CSPF. This value, together with any other configured constraints (e.g., SRLG), is the input to the CSPF path computation. The result of this process is an ordered list of network interfaces in the form of an Explicit Route Object (ERO) for the bypass tunnel, which is used to signal the LSP that instantiates the bypass tunnel.
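
A common way to realize a bandwidth-constrained CSPF is to prune links that cannot satisfy the constraints and then run a shortest-path algorithm on what remains. The Python sketch below illustrates that idea under several assumptions: the `Link` record and the `cspf` function are hypothetical, `available_bps` stands for whatever available-bandwidth figure the implementation takes from the TED, and the caller is assumed to have already removed the protected resource from `links`.

```python
import heapq
from typing import NamedTuple


class Link(NamedTuple):
    src: str
    dst: str
    metric: int
    available_bps: int
    srlgs: frozenset = frozenset()


def cspf(links, plr, mp, pbw_bps, excluded_srlgs=frozenset()):
    """Return the lowest-metric list of links from plr to mp, or None."""
    # Prune links that cannot carry PBW or that share an excluded SRLG.
    adjacency = {}
    for link in links:
        if link.available_bps >= pbw_bps and not (link.srlgs & excluded_srlgs):
            adjacency.setdefault(link.src, []).append(link)

    # Dijkstra's algorithm over the pruned topology.
    best = {plr: 0}
    previous = {}
    queue = [(0, plr)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == mp:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for link in adjacency.get(node, []):
            new_cost = cost + link.metric
            if new_cost < best.get(link.dst, float("inf")):
                best[link.dst] = new_cost
                previous[link.dst] = link
                heapq.heappush(queue, (new_cost, link.dst))

    if mp not in best:
        return None  # no path satisfies the constraints
    path, node = [], mp
    while node != plr:
        link = previous[node]
        path.append(link)
        node = link.src
    return list(reversed(path))
```

The returned ordered list of links corresponds to the hops that would be encoded in the ERO. Note that PBW is used here only to prune the topology; nothing in this step reserves bandwidth.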

4. Bypass tunnel signaling

A bypass tunnel is instantiated as an LSP that has its head-end on the PLR and its tail-end on the MP. It is signaled by RSVP just like any other MPLS LSP. Specifically, the PLR inserts an ERO into PATH messages to enforce the network path that the bypass traverses. The other important object in the RSVP PATH message is the Traffic Specification (T-SPEC), which encodes the bandwidth that needs to be reserved for the signaled LSP.

Under this procedure, the T-SPEC bandwidth is independent of the bandwidth used by the CSPF algorithm as described in Section 2. Since the purpose of the procedures in this document is to provide a path for the bypass tunnel that is less likely to be congested, the LSP bandwidth for bypass tunnels should be configured to a value smaller than the PBW (Section 2) and the value used for CSPF (Section 3); otherwise, reservations are more likely to fail due to lack of available bandwidth on the computed path. Implementations SHOULD support static configuration of a signaled bandwidth that is independent of the PBW, including a signaled bandwidth of zero bps.

The PLR MUST insert the ERO into the RSVP PATH message. Its value MUST be the ERO computed as per Section 3.

Note: If the ERO returned by the path computation described in Section 3 is equal to the network path the bypass tunnel currently traverses, and the signaled bandwidth has not changed (e.g., because it has a statically configured value), there is no need to signal a new LSP - the existing one can be refreshed and used.
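
The check in the note above can be expressed as a simple comparison. In this Python sketch the function name `needs_resignaling` and the representation of an ERO as a sequence of hop identifiers are assumptions made for illustration.

```python
from typing import Sequence


def needs_resignaling(current_ero: Sequence[str],
                      new_ero: Sequence[str],
                      current_signaled_bps: int,
                      new_signaled_bps: int) -> bool:
    """Return True if a new LSP has to be signaled for the bypass tunnel.

    The existing LSP can be kept (and refreshed) only when both the path
    and the signaled bandwidth are unchanged.
    """
    path_changed = list(current_ero) != list(new_ero)
    bandwidth_changed = current_signaled_bps != new_signaled_bps
    return path_changed or bandwidth_changed
```

A change in PBW alone does not appear in this comparison: PBW influences the result only through the ERO that path computation returns.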

5. IANA Considerations

This memo includes no request to IANA.

6. Security Considerations

This document does not introduce new security issues. The security considerations listed in Section 9 of [RFC4090] remain relevant.

7. References

7.1. Normative References

[RFC3209]
Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, <https://www.rfc-editor.org/info/rfc3209>.
[RFC4090]
Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, DOI 10.17487/RFC4090, <https://www.rfc-editor.org/info/rfc4090>.

7.2. Informative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8733]
Dhody, D., Ed., Gandhi, R., Ed., Palle, U., Singh, R., and L. Fang, "Path Computation Element Communication Protocol (PCEP) Extensions for MPLS-TE Label Switched Path (LSP) Auto-Bandwidth Adjustment with Stateful PCE", RFC 8733, DOI 10.17487/RFC8733, <https://www.rfc-editor.org/info/rfc8733>.

Acknowledgements

TBD

Contributors

TBD

Authors' Addresses

Rafal Jan Szarecki
Google LLC
1600 Amphitheatre Parkway
Mountain View, California 94043
United States of America
Jon Mitchell
Google LLC
1600 Amphitheatre Parkway
Mountain View, California 94043
United States of America