Where to get EMR cluster failure logs after steps start running

My EMR cluster starts and the step runs for a while, but then the step gets cancelled and I get a cluster error message next to the cluster name like this:

Terminated with errors The master failed: Connect timed out

However, I have not been able to find the error logs anywhere, not even under the S3 log URI shown in the EMR cluster summary, though I may be overlooking them there. Does anyone know where I can find the error log?

According to this link, how you access the logs depends on how the job was submitted:

  • client mode on the command line: Redirect standard output and standard error into log files when submitting the Spark job:
spark-submit [--deploy-mode client] ... 1>output.log 2>error.log
  • client mode using an EMR step: Download the (compressed) log files from the associated S3 location. The link above contains a detailed description of how to identify the correct S3 location, which depends on the cluster ID, the instance ID, and the application.

  • cluster mode (I): Identify the YARN applicationId from the client logs (the logs of the process that started the Spark job), then download the log files from the associated S3 location. The same link as above describes the details.

  • cluster mode (II): Following this answer, you can also fetch the logs directly from the YARN cluster using yarn logs -applicationId <app ID>, as described in more detail in the Spark documentation. This is the standard way to access the logs in a non-EMR environment. As with cluster mode (I), the applicationId is taken from the client logs.
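
To make the cluster-mode steps a bit more concrete, here is a rough sketch of pulling the applicationId out of a client log and building the S3 prefixes to download from. The function names are hypothetical. The steps/<step ID>/ and containers/<application ID>/ layout under the log URI is an assumption based on how EMR usually archives logs, so check it against your own bucket before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the S3 layout used below is an assumption, not a guarantee.

# Print the first YARN application ID found in a client log file.
# IDs look like application_<cluster timestamp>_<sequence number>.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' "$1" | head -n 1
}

# Print the assumed S3 prefix holding the logs of one EMR step.
# Arguments: log URI, cluster ID, step ID.
step_log_prefix() {
  local log_uri="${1%/}"   # drop one trailing slash, if present
  printf '%s/%s/steps/%s/\n' "$log_uri" "$2" "$3"
}

# Print the assumed S3 prefix holding the YARN container logs of one application.
# Arguments: log URI, cluster ID, application ID.
container_log_prefix() {
  local log_uri="${1%/}"
  printf '%s/%s/containers/%s/\n' "$log_uri" "$2" "$3"
}

# Example usage (commented out: needs the AWS CLI, YARN, and real IDs):
#   app_id=$(extract_app_id error.log)
#   aws s3 cp --recursive "$(container_log_prefix "$LOG_URI" "$CLUSTER_ID" "$app_id")" ./container-logs/
#   yarn logs -applicationId "$app_id" > yarn-app.log
```

Here error.log is the file written by the spark-submit redirect above, and LOG_URI and CLUSTER_ID are placeholders for the values shown in the EMR summary. Note that EMR copies logs to S3 with some delay, so recent files may not be there yet.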