{"id":2535439,"date":"2023-04-07T11:43:11","date_gmt":"2023-04-07T15:43:11","guid":{"rendered":"https:\/\/platoai.gbaglobal.org\/platowire\/how-to-transfer-amazon-emr-step-logs-from-amazon-ec2-instances-to-amazon-cloudwatch-logs\/"},"modified":"2023-04-07T11:43:11","modified_gmt":"2023-04-07T15:43:11","slug":"how-to-transfer-amazon-emr-step-logs-from-amazon-ec2-instances-to-amazon-cloudwatch-logs","status":"publish","type":"platowire","link":"https:\/\/platoai.gbaglobal.org\/platowire\/how-to-transfer-amazon-emr-step-logs-from-amazon-ec2-instances-to-amazon-cloudwatch-logs\/","title":{"rendered":"How to Transfer Amazon EMR Step Logs from Amazon EC2 Instances to Amazon CloudWatch Logs"},"content":{"rendered":"
Amazon EMR (Elastic MapReduce) is a managed big data platform that allows users to process large amounts of data using open-source tools such as Apache Hadoop, Spark, and Hive. Amazon EC2 (Elastic Compute Cloud) is a web service that provides resizable compute capacity in the cloud. Amazon CloudWatch Logs is a monitoring service that allows users to monitor, store, and access log files from Amazon EC2 instances, AWS CloudTrail, and other sources.<\/p>\n
In this article, we will discuss how to transfer Amazon EMR step logs from Amazon EC2 instances to Amazon CloudWatch Logs.<\/p>\n
Step 1: Create an Amazon S3 bucket<\/p>\n
The first step is to create an Amazon S3 bucket where the EMR step logs will be stored. To create an S3 bucket, follow these steps:<\/p>\n
1. Log in to the AWS Management Console.<\/p>\n
2. Navigate to the S3 service.<\/p>\n
3. Click on the “Create bucket” button.<\/p>\n
4. Enter a unique name for your bucket and select the region where you want to create it.<\/p>\n
5. Leave the default settings for the rest of the options and click on the “Create bucket” button.<\/p>\n
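The same bucket can also be created from a script. The following is a minimal sketch, assuming the boto3 SDK is installed and credentials are configured; the function names and the bucket name used in the usage note are hypothetical and not part of the original walkthrough.<\/p>\n

```python
def build_create_bucket_args(bucket_name, region):
    # Arguments for the S3 CreateBucket call.
    args = {'Bucket': bucket_name}
    # us-east-1 is the default location and must not be passed as a
    # LocationConstraint; every other region has to be named explicitly.
    if region != 'us-east-1':
        args['CreateBucketConfiguration'] = {'LocationConstraint': region}
    return args


def create_log_bucket(bucket_name, region):
    # boto3 is imported lazily so the helper above stays usable without it.
    import boto3
    s3 = boto3.client('s3', region_name=region)
    return s3.create_bucket(**build_create_bucket_args(bucket_name, region))
```

Calling create_log_bucket('my-emr-step-logs', 'eu-west-1') would send the request. Building the arguments in a separate function keeps the region handling easy to check without touching AWS.<\/p>\n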
Step 2: Configure EMR to write step logs to S3<\/p>\n
The next step is to configure EMR to write step logs to the S3 bucket that you created in step 1. To do this, follow these steps:<\/p>\n
1. Log in to the AWS Management Console.<\/p>\n
2. Navigate to the EMR service.<\/p>\n
3. Click on the “Create cluster” button.<\/p>\n
4. Enter a name for your cluster and select the region where you want to create it.<\/p>\n
5. Select the appropriate software configuration for your cluster.<\/p>\n
6. Under “Edit software settings”, expand “Advanced options”.<\/p>\n
7. In the “Classification” field, enter “emrfs-site”.<\/p>\n
8. In the “Properties” field, enter the following:<\/p>\n
fs.s3.consistent.retryPeriodSeconds: 10<\/p>\n
fs.s3.consistent: true<\/p>\n
fs.s3.consistent.retryCount: 5<\/p>\n
fs.s3.consistent.metadata.tableName: emrfs-metadata<\/p>\n
fs.s3.consistent.metadata.region: us-east-1<\/p>\n
fs.s3.consistent.retryPolicyType: exponential<\/p>\n
9. Under “Edit software settings”, expand “Bootstrap actions”.<\/p>\n
10. Click on the “Add bootstrap action” button.<\/p>\n
11. Enter a name for your bootstrap action and select “Custom action”.<\/p>\n
12. In the “Script location” field, enter the following URL:<\/p>\n
s3:\/\/elasticmapreduce\/bootstrap-actions\/configure-hadoop<\/p>\n
13. In the “Arguments” field, enter the following:<\/p>\n
--mapred-config-file<\/p>\n
s3:\/\/your-bucket-name\/emrfs-site.xml<\/p>\n
14. Replace “your-bucket-name” with the name of the S3 bucket that you created in step 1.<\/p>\n
15. Click on the “Create cluster” button.<\/p>\n
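Separately from the console settings listed in this step, the EMR RunJobFlow API accepts a LogUri parameter that tells EMR where in S3 to archive cluster logs, step logs included. The following is a minimal sketch of setting it with boto3; the instance types, instance count, release label, log prefix, and function names are illustrative assumptions, and the two default role names must already exist in the account.<\/p>\n

```python
def build_cluster_request(cluster_name, log_bucket, release_label='emr-6.10.0'):
    # Keyword arguments for the EMR RunJobFlow API call.
    return {
        'Name': cluster_name,
        # EMR archives cluster logs, including step logs, under this prefix.
        'LogUri': 's3://' + log_bucket + '/emr-logs/',
        'ReleaseLabel': release_label,
        'Instances': {
            # Instance types and count are placeholders for illustration.
            'MasterInstanceType': 'm5.xlarge',
            'SlaveInstanceType': 'm5.xlarge',
            'InstanceCount': 3,
            'KeepJobFlowAliveWhenNoSteps': True,
        },
        # Default role names; they must already exist in the account.
        'JobFlowRole': 'EMR_EC2_DefaultRole',
        'ServiceRole': 'EMR_DefaultRole',
    }


def create_cluster(cluster_name, log_bucket, region):
    # boto3 is imported lazily so the request builder can be used without it.
    import boto3
    emr = boto3.client('emr', region_name=region)
    response = emr.run_job_flow(**build_cluster_request(cluster_name, log_bucket))
    return response['JobFlowId']
```

create_cluster returns the new cluster ID, which can be used to find the matching log prefix in the bucket.<\/p>\n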
Step 3: Configure CloudWatch Logs agent on EC2 instances<\/p>\n
The next step is to configure the CloudWatch Logs agent on the EC2 instance that holds the step logs. EMR writes step logs on the cluster's primary (master) node, so that is the instance to configure. To do this, follow these steps:<\/p>\n
1. Log in to the EC2 instance that you want to configure.<\/p>\n
2. Download and install the CloudWatch Logs agent by running the following commands:<\/p>\n
sudo yum install -y awslogs<\/p>\n
sudo service awslogs start<\/p>\n
3. Edit the CloudWatch Logs agent configuration file by running the following command:<\/p>\n
sudo nano \/etc\/awslogs\/awslogs.conf<\/p>\n
4. Add the following lines to the end of the file:<\/p>\n
[\/var\/log\/hadoop\/steps\/*]<\/p>\n
datetime_format = %Y-%m-%d %H:%M:%S,%f<\/p>\n
file = \/var\/log\/hadoop\/steps\/application.log<\/p>\n
buffer_duration = 5000<\/p>\n
log_stream_name = {instance_id}<\/p>\n
initial_position = start_of_file<\/p>\n
log_group_name = your-log-group-name<\/p>\n
5. Replace “your-log-group-name” with the name of the CloudWatch Logs log group that you want to use.<\/p>\n
6. Save and close the file.<\/p>\n
7. Restart the CloudWatch Logs agent by running the following command:<\/p>\n
sudo service awslogs restart<\/p>\n
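When several instances or clusters need the same agent settings, the stanza from this step can be generated rather than typed by hand. The following is a minimal sketch using Python's standard configparser module; the function name and the log group name in the usage note are hypothetical, and the keys and values simply mirror the ones listed in this step.<\/p>\n

```python
import configparser
import io


def render_awslogs_stanza(section_name, log_file, log_group):
    # interpolation=None keeps the literal % signs in datetime_format intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser[section_name] = {
        'datetime_format': '%Y-%m-%d %H:%M:%S,%f',
        'file': log_file,
        'buffer_duration': '5000',
        'log_stream_name': '{instance_id}',
        'initial_position': 'start_of_file',
        'log_group_name': log_group,
    }
    # Serialize to text that can be appended to awslogs.conf.
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

For example, render_awslogs_stanza('/var/log/hadoop/steps/*', '/var/log/hadoop/steps/application.log', 'emr-step-logs') returns the stanza as text that can be appended to the configuration file before the agent is restarted.<\/p>\n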
Step 4: Verify logs are being transferred to CloudWatch Logs<\/p>\n
The final step is to verify that the EMR step logs are being transferred to CloudWatch Logs. To do this, follow these steps:<\/p>\n
1. Log in to the AWS Management Console.<\/p>\n
2. Navigate to the CloudWatch service.<\/p>\n
3. Click on the “Logs” menu item.<\/p>\n
4. Select the log group that you specified in step 3.<\/p>\n
5. Verify that a log stream named after the instance ID exists for each EC2 instance on which you configured the agent.<\/p>\n
6. Click on a log stream to view the EMR step logs.<\/p>\n
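The console check above can also be scripted. The following is a minimal sketch, assuming boto3 and the CloudWatch Logs DescribeLogStreams call; the function names and instance IDs are hypothetical, and pagination is left out for brevity, so only the first page of streams is inspected.<\/p>\n

```python
def stream_names(response):
    # Pull the stream names out of a DescribeLogStreams response.
    return [s['logStreamName'] for s in response.get('logStreams', [])]


def instances_without_stream(response, expected_instance_ids):
    # The agent config names each stream after its instance ID, so any
    # expected ID that is absent has not shipped logs yet.
    seen = set(stream_names(response))
    return sorted(i for i in expected_instance_ids if i not in seen)


def fetch_streams(log_group, region):
    # boto3 is imported lazily so the helpers above can be used without it.
    import boto3
    logs = boto3.client('logs', region_name=region)
    return logs.describe_log_streams(logGroupName=log_group)
```

Passing the result of fetch_streams to instances_without_stream, together with the IDs of the instances you configured, lists any instance that has not yet created a log stream.<\/p>\n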
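In conclusion, transferring Amazon EMR step logs from Amazon EC2 instances to Amazon CloudWatch Logs involves configuring EMR to write step logs to an Amazon S3 bucket, configuring the CloudWatch Logs agent on the EC2 instances, and verifying that the logs appear in the chosen log group.<\/p>\n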
Amazon EMR (Elastic MapReduce) is a managed big data platform that allows users to process large amounts of data using open-source tools such as Apache Hadoop, Spark, and Hive. Amazon EC2 (Elastic Compute Cloud) is a web service that provides resizable compute capacity in the cloud. Amazon CloudWatch Logs is a monitoring service that allows […]<\/p>\n","protected":false},"author":2,"featured_media":2527035,"menu_order":0,"template":"Default","format":"standard","meta":[],"aiwire-tag":[560,2141,933,1570,1975,10509,11,213,3090,28516,28496,2773,132,18,28505,2158,20,7302,1388,21,17135,23,140,1584,21189,2551,2169,29,3024,9000,3983,3984,2454,577,28531,11967,7314,19651,2336,15667,15668,15669,3338,5743,731,1324,6925,591,11575,3038,1325,4107,1783,25795,28518,2574,1206,381,5913,5499,3237,7134,743,6048,3653,2671,50,51,1794,28513,167,537,57,15787,1643,2220,2817,60,61,391,2951,4524,6647,5264,16350,2233,614,28523,4918,28514,2961,4924,3769,23853,822,759,760,75,78,261,184,80,5,10,7,8,299,6180,88,704,5883,2510,2982,414,6187,3076,634,11058,1457,635,5789,10045,778,710,1291,779,2861,3130,6494,103,108,109,207,111,1377,3082,13291,2870,2302,117,7127,307,118,430,5303,645,309,1474,9,1838,124,125,3956,1742,1382,6],"aiwire":[722],"_links":{"self":[{"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/platowire\/2535439"}],"collection":[{"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/platowire"}],"about":[{"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/types\/platowire"}],"author":[{"embeddable":true,"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/users\/2"}],"version-history":[{"count":0,"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/platowire\/2535439\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/media\/2527035"}],"wp:attachment":[{"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/media?parent=2535439"}],"wp:term":[{"taxonomy":"aiwire-tag","embeddable":true,"href":"https:\
/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/aiwire-tag?post=2535439"},{"taxonomy":"aiwire","embeddable":true,"href":"https:\/\/platoai.gbaglobal.org\/wp-json\/wp\/v2\/aiwire?post=2535439"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}