Debugging generic API Gateway errors with access logs
1. Problem statement
I was playing around with the Amazon API Gateway HTTP API the other day (more on that in a future post) and added a Lambda integration to one of the routes.
The route should have had the /pets/{proxy+} format.
When I invoked the API from Postman, I received a generic 500 Internal server error.
As with some other AWS errors, the message on its own didn’t help much.
2. Debugging process
The request hadn’t invoked the Lambda integration: nothing showed up in the function’s CloudWatch log group.
Something must have happened inside API Gateway (unlikely) or between the gateway and the function.
2.1. Create a log group
It’s a good idea to turn on access logging for the API, especially when we are in the development phase.
First we’ll need a log group, so let’s create one in CloudWatch to which API Gateway will send its access logs.
2.2. Turn on access logging
We can turn on access logging at the bottom of the left menu in the AWS Console.
After switching on Access logging with the slider, we should add the ARN of the log group we created above. Out of the available log formats, select JSON.
API Gateway will log the following object to CloudWatch:
{
"requestId": "$context.requestId",
"ip": "$context.identity.sourceIp",
"requestTime": "$context.requestTime",
"httpMethod": "$context.httpMethod",
"routeKey": "$context.routeKey",
"status": "$context.status",
"protocol": "$context.protocol",
"responseLength": "$context.responseLength"
}
These are the default logging variables.
2.3. Add error messages
This object only provides basic information, most of which we already know.
But the context object contains a lot more properties. I have found two of them particularly useful.
The $context.integration.error property returns the error message from the integration, in this case, from the Lambda function. Although the request didn’t trigger a function invocation, Lambda can still return an error. API Gateway will log that error message here.
The second property I added is $context.error.message, which returns an error message from API Gateway itself.
So I ended up adding these properties to the log JSON:
{
"integrationError": "$context.integration.error",
"apiGatewayError": "$context.error.message"
}
Indeed, the next invocation revealed the issue.
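For anyone who prefers to assemble the log format in code rather than paste it into the console, here is a minimal Python sketch. It merges the default fields from section 2.2 with the two error fields from section 2.3 and serializes them as one compact line. The function name build_access_log_settings and the placeholder ARN are my own illustrations, not something API Gateway provides.

```python
import json

# Default fields from the console's JSON log format (section 2.2).
DEFAULT_FIELDS = {
    "requestId": "$context.requestId",
    "ip": "$context.identity.sourceIp",
    "requestTime": "$context.requestTime",
    "httpMethod": "$context.httpMethod",
    "routeKey": "$context.routeKey",
    "status": "$context.status",
    "protocol": "$context.protocol",
    "responseLength": "$context.responseLength",
}

# The two error fields added in section 2.3.
ERROR_FIELDS = {
    "integrationError": "$context.integration.error",
    "apiGatewayError": "$context.error.message",
}


def build_access_log_settings(log_group_arn: str) -> dict:
    """Return the log destination plus a single-line JSON log format."""
    fields = {**DEFAULT_FIELDS, **ERROR_FIELDS}
    return {
        "DestinationArn": log_group_arn,
        # Compact separators keep the whole format on one line.
        "Format": json.dumps(fields, separators=(",", ":")),
    }


# Placeholder ARN: substitute the log group created in section 2.1.
settings = build_access_log_settings(
    "arn:aws:logs:us-east-1:ACCOUNT_ID:log-group:LOG_GROUP_NAME"
)
print(settings["Format"])
```

The returned dictionary uses the DestinationArn and Format keys, which should match the shape that boto3’s apigatewayv2 update_stage call accepts as AccessLogSettings. Treat that as something to verify against the boto3 documentation before relying on it.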
2.4. Create the entire route in one go
apiGatewayError returned Internal Server Error, which was not new to me.
But the integrationError property showed me the real problem:
The IAM role configured on the integration or API Gateway doesn't have
permission to call the integration. Check the permissions and try again.
At first, the error message seemed odd, because when you create a Lambda integration for an HTTP API route, API Gateway generates the permission needed to invoke the function. More accurately, it adds the following statement to the Lambda function’s resource-based policy:
{
"Effect": "Allow",
"Principal": {
"Service": "apigateway.amazonaws.com"
},
"Action": "lambda:InvokeFunction",
"Resource": "arn:aws:lambda:us-east-1:ACCOUNT_ID:function:FUNCTION_NAME",
"Condition": {
"ArnLike": {
// /{proxy+} is missing from the end!!!
"AWS:SourceArn": "arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID/*/*/pets"
}
}
}
The problem is in the Condition block: the /{proxy+} is missing from the end of the ARN. So when I invoked /pets/SOMETHING, Lambda responded with an error.
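To see why the stale condition rejects the request, here is a rough Python sketch of the wildcard comparison. The arn_like function is a simplified stand-in for IAM’s ArnLike operator (real IAM evaluation has more rules), and the source ARN is my assumption of what API Gateway presents for a GET request on the edited route in the $default stage.

```python
import re


def arn_like(pattern: str, arn: str) -> bool:
    """Simplified stand-in for IAM's ArnLike operator.

    '*' matches any run of characters and '?' matches exactly one.
    This only illustrates the wildcard comparison, not full IAM logic.
    """
    regex = "".join(
        ".*" if ch == "*" else ("." if ch == "?" else re.escape(ch))
        for ch in pattern
    )
    return re.fullmatch(regex, arn) is not None


# Pattern API Gateway generated while the route was still /pets.
stale_pattern = "arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID/*/*/pets"
# The same pattern with the missing suffix appended.
fixed_pattern = stale_pattern + "/{proxy+}"
# Assumed source ARN for the edited route (hypothetical stage and method).
source_arn = (
    "arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID/$default/GET/pets/{proxy+}"
)

print(arn_like(stale_pattern, source_arn))  # False: pattern stops at /pets
print(arn_like(fixed_pattern, source_arn))  # True: suffix now matches
```

The wildcards cover the stage and method segments, but nothing in the stale pattern can absorb the extra /{proxy+} segment, so the condition fails and the invocation is denied.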
The reason was that I created the integration for the /pets route first, so API Gateway added the relevant permission to the resource-based policy. Then I realized it wouldn’t work for me this way, so I edited the route and added /{proxy+} to it. Of course, API Gateway didn’t update the permission.
The solution is to either add the missing bit to the Lambda permission manually, or create the route in its final form in API Gateway, or use Infrastructure as Code that creates and updates the permissions.
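As a sketch of the first option, the missing permission could be added with boto3’s Lambda add_permission call. The helper below only builds the arguments, so it runs without AWS credentials. Its name, the statement ID, and the placeholder values are hypothetical.

```python
def build_invoke_permission(
    function_name: str, api_arn_prefix: str, route_path: str, statement_id: str
) -> dict:
    """Build keyword arguments for the Lambda add_permission API call."""
    return {
        "FunctionName": function_name,
        "StatementId": statement_id,
        "Action": "lambda:InvokeFunction",
        "Principal": "apigateway.amazonaws.com",
        # Any stage, any method, and the full route path including {proxy+}.
        "SourceArn": f"{api_arn_prefix}/*/*{route_path}",
    }


permission = build_invoke_permission(
    function_name="FUNCTION_NAME",
    api_arn_prefix="arn:aws:execute-api:us-east-1:ACCOUNT_ID:API_ID",
    route_path="/pets/{proxy+}",
    statement_id="allow-apigw-pets-proxy",  # hypothetical statement ID
)
print(permission["SourceArn"])

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("lambda").add_permission(**permission)
```

This adds a new statement next to the stale one rather than replacing it, so the old /pets statement stays in the policy until it is removed separately.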
Finding the root cause was much easier with access logs. They can uncover other issues too, and they cut down the time we spend hunting for the cause of a problem. It’s worth giving them a go!
3. Summary
API Gateway access logging can be helpful when we have to debug generic error messages.
The default log object only contains a fraction of the available properties. We can add error messages to the access log by extracting them from the context object.
4. Further reading
Customizing HTTP API access logs - More available logging variables
The Missing Guide to AWS API Gateway Access Logs - Alex DeBrie’s excellent guide on access logs in REST API