Throttling in SharePoint Online is a mechanism implemented to prevent overuse of resources and maintain optimal performance and reliability. When throttling occurs, SharePoint Online returns HTTP status codes, such as 429 (“Too many requests”) or 503 (“Server Too Busy”). This article focuses on the best practices for handling throttling by honoring the Retry-After and RateLimit headers provided in the response.
Understanding Throttling in SharePoint Online
Throttling in SharePoint Online can occur at both the user and application levels. User throttling restricts the number of calls and operations made by applications on behalf of a user, while application throttling imposes limits on applications within a tenant based on the number of licenses purchased per organization.
Handling 429 and 503 Errors with Retry
To handle throttling effectively, it is crucial to honor the Retry-After and RateLimit headers provided in the response. The following Python sample retries requests that fail with 429 or 503, waits for the interval given in the Retry-After header when one is present, and gives up after three attempts:
Sample Python Code
import time

import requests


class GraphAPIHandler:
    def __init__(self, tenant_id, client_id, client_secret, user_agent):
        self.tenant_id = tenant_id
        self.client_id = client_id
        self.client_secret = client_secret
        self.user_agent = user_agent
        self.access_token = None

    def get_access_token(self):
        # Request an app-only token with the client credentials flow
        token_url = (
            f"https://login.microsoftonline.com/{self.tenant_id}/oauth2/v2.0/token"
        )
        payload = {
            "client_id": self.client_id,
            "client_secret": self.client_secret,
            "scope": "https://graph.microsoft.com/.default",
            "grant_type": "client_credentials",
        }
        headers = {
            "Content-Type": "application/x-www-form-urlencoded"
        }
        response = requests.post(token_url, data=payload, headers=headers)
        response_data = response.json()
        if "access_token" in response_data:
            self.access_token = response_data["access_token"]
        else:
            raise Exception("Failed to obtain access token")

    def make_graph_api_request(self, url):
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Content-Type": "application/json",
            "User-Agent": self.user_agent
        }
        max_retries = 3  # Total number of attempts
        retry_wait_time = 1  # Default wait time in seconds between attempts
        for _ in range(max_retries):
            response = requests.get(url, headers=headers)
            if response.status_code == 200:
                return response.json()
            elif response.status_code in (429, 503):
                # Retry-After provides the recommended wait time in seconds.
                # Fall back to the default when the header is missing, zero,
                # or not a whole number of seconds.
                try:
                    wait_time = int(response.headers.get("Retry-After", 0))
                except ValueError:
                    wait_time = 0
                if wait_time <= 0:
                    wait_time = retry_wait_time
                print(
                    f"Received {response.status_code} status code. "
                    f"Waiting for {wait_time} seconds before retrying..."
                )
                time.sleep(wait_time)
            else:
                # Other error occurred, handle it accordingly
                raise Exception(
                    f"Request failed with status code: {response.status_code}"
                )
        # Reached the maximum number of attempts without a successful response
        raise Exception(
            "Maximum retry attempts reached. Unable to complete the request."
        )


# Usage example
tenant_id = "<your_tenant_id>"
client_id = "<your_client_id>"
client_secret = "<your_client_secret>"
sp_target_host = "<your_tenant_name>.sharepoint.com"  # e.g. "contoso.sharepoint.com"
sp_site_path = "teams/<your_site_name>"  # e.g. "teams/hr" (no leading slash)
# If ISV Application type, use ISV|CompanyName|AppName/Version
user_agent = "NONISV|CompanyName|AppName/Version"

graph_handler = GraphAPIHandler(tenant_id, client_id, client_secret, user_agent)
try:
    graph_handler.get_access_token()
    graph_api_url = f"https://graph.microsoft.com/v1.0/sites/{sp_target_host}:/{sp_site_path}"
    site_data = graph_handler.make_graph_api_request(graph_api_url)
    print("Site data:", site_data)
except Exception as e:
    print("An error occurred:", str(e))
By incorporating the above code into your SharePoint Online application, you can handle 429 and 503 errors effectively. The code waits for the interval given in the Retry-After header, or for retry_wait_time seconds when the header is not provided, and then retries the request. It makes at most three attempts in total before raising an exception if a successful response is not received. Adjust the retry_wait_time and max_retries variables as per your application’s requirements.
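If you want the wait calculation to be more tolerant, one option is to separate it from the request loop. The sketch below is not part of the sample above: the helper names parse_retry_after, backoff_delay, and next_wait are hypothetical, and it uses only the Python standard library. It assumes the Retry-After value may arrive either as a number of seconds or as an HTTP date (both forms are allowed by the HTTP specification), and it falls back to exponential backoff with jitter when no usable value is present.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the wait in seconds described by a Retry-After value, or None."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay expressed directly as whole seconds
        return float(value)
    try:
        target = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if target.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption)
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter for attempt 0, 1, 2, ..."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def next_wait(retry_after_value, attempt):
    """Prefer the server's hint; otherwise fall back to backoff."""
    parsed = parse_retry_after(retry_after_value)
    # A missing, unparsable, or zero hint falls through to backoff
    return parsed if parsed else backoff_delay(attempt)
```

Inside a retry loop, next_wait(response.headers.get("Retry-After"), attempt) would take the place of a fixed default wait; the base and cap values here are placeholders to tune for your workload.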
Wrap-Up
By following the best practices mentioned in this article and honoring the Retry-After and RateLimit headers in SharePoint Online, you can handle throttling more effectively. Remember to reduce concurrent requests, avoid request spikes, utilize Microsoft Graph APIs, and implement proper retry mechanisms for 429 and 503 errors. With these practices in place, you can optimize your application’s performance and ensure a reliable and responsive experience for SharePoint Online users.
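Honoring the RateLimit headers means slowing down before a 429 is ever returned. The following is a minimal sketch of that idea, under stated assumptions: the header names RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset and their meanings (budget for the current window, budget left, seconds until the window resets) should be checked against Microsoft's current documentation, and the pacing_delay function and its threshold parameter are hypothetical.

```python
def pacing_delay(headers, threshold=0.2):
    """Suggest a pause in seconds before the next request.

    Returns 0.0 when the RateLimit headers are absent, malformed,
    or show that plenty of budget is left in the current window.
    """
    try:
        limit = int(headers["RateLimit-Limit"])
        remaining = int(headers["RateLimit-Remaining"])
        reset = int(headers["RateLimit-Reset"])
    except (KeyError, TypeError, ValueError):
        return 0.0
    if limit <= 0 or reset <= 0:
        return 0.0
    if remaining <= 0:
        # Budget exhausted: wait for the window to reset
        return float(reset)
    if remaining / limit > threshold:
        # More than `threshold` of the budget is left: no pause needed
        return 0.0
    # Spread the remaining budget evenly over the time left in the window
    return reset / remaining
```

A caller could pass response.headers from the requests library (which matches header names case-insensitively) and sleep for the returned value before sending the next request. This smooths out request spikes rather than replacing the retry logic for 429 and 503 responses.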
Although the best way to avoid SharePoint Online throttling is to optimize your code so that it stays within the service limits, it also helps to know where to look when throttling does occur. Microsoft's guidance on avoiding throttling in SharePoint Online describes the most common throttling scenarios and possible solutions, and the service health view in the Microsoft 365 admin center shows whether a wider service issue is affecting your tenant.
Even with these resources, it can be difficult to detect and address throttling issues in your SharePoint Online environment. To manage and troubleshoot them, monitor your applications continuously: track how often requests are throttled and identify any application that generates excessive traffic. Also keep in mind that throttling can occur for a number of reasons, ranging from spikes in request volume to limited server resources and degraded service performance.
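One lightweight way to approach continuous monitoring on the application side is to record the status code of every response and watch the share of throttled ones over a recent time window. The sketch below is only an illustration: the ThrottleMonitor class, its window length, and the idea of alerting on a ratio are assumptions, not a SharePoint Online feature.

```python
import time
from collections import deque


class ThrottleMonitor:
    """Track the share of throttled responses in a sliding time window."""

    def __init__(self, window_seconds=300.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._events = deque()  # (timestamp, was_throttled) pairs

    def record(self, status_code):
        """Record one response; 429 and 503 count as throttled."""
        now = self._clock()
        self._events.append((now, status_code in (429, 503)))
        self._trim(now)

    def _trim(self, now):
        # Drop events that have fallen out of the window
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def throttled_ratio(self):
        """Return throttled / total for the current window (0.0 if empty)."""
        self._trim(self._clock())
        if not self._events:
            return 0.0
        throttled = sum(1 for _, flag in self._events if flag)
        return throttled / len(self._events)
```

Calling monitor.record(response.status_code) after each request and logging monitor.throttled_ratio() periodically gives an early signal that an application is approaching its limits; the level at which to raise an alert depends on your workload.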