Last active: January 17, 2016 20:48
Running out of memory locally when launching multiple Spark jobs on YARN via spark-submit from a shell script.
I launch around 30-60 of these jobs, each defined like start-job.sh below, in the background from a wrapper script. I wait about 30 seconds between launches, and the wrapper then monitors YARN to decide when to launch more. The limit is set at around 60 jobs, but even if I lower it to 30, the host submitting the jobs runs out of memory. Why does my approach to using spark-submit run out of memory? I have about 6 GB free, and I don't feel I should be exhausting that just by submitting jobs.
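For concreteness, here is a minimal sketch of the wrapper loop described above. This is an assumption about its shape, not the actual script: the MAX_JOBS cap, the app-name match, and passing input paths as arguments are all hypothetical; `yarn application -list` is the stock Hadoop CLI.

```shell
#!/bin/bash
# Hypothetical wrapper: submit one job per input path passed on the command
# line, keeping at most MAX_JOBS applications in YARN at once.
MAX_JOBS=30

running_jobs() {
  # Count applications currently known to YARN by our app name.
  yarn application -list 2>/dev/null | grep -c wh_reader_sp
}

for path in "$@"; do
  while [ "$(running_jobs)" -ge "$MAX_JOBS" ]; do
    sleep 30                 # back off until YARN frees a slot
  done
  ./start-job.sh "$path" "2015-04-19T00:00:00+00:00" &
  sleep 30                   # stagger launches, as described above
done
wait
```

Note that each backgrounded start-job.sh still forks a full SparkSubmit JVM on the submitting host, even with spark.yarn.submit.waitAppCompletion=false, until the application is handed off to YARN.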
start-job.sh

export HADOOP_CONF_DIR=/etc/hadoop/conf
spark-submit \
  --class sap.whcounter.WarehouseCounter \
  --master yarn-cluster \
  --num-executors 1 \
  --driver-memory 1024m \
  --executor-memory 1024m \
  --executor-cores 4 \
  --queue hlab \
  --conf spark.yarn.submit.waitAppCompletion=false \
  --conf spark.app.name=wh_reader_sp \
  --conf spark.streaming.receiver.maxRate=1000 \
  --conf spark.streaming.concurrentJobs=2 \
  --conf spark.eventLog.dir="hdfs:///user/spark/applicationHistory" \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.overwrite=true \
  --conf spark.yarn.historyServer.address="http://spark-history.local:18080/" \
  --conf spark.yarn.jar="hdfs:///user/spark/share/lib/spark-assembly.jar" \
  --conf spark.yarn.dist.files="hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar" \
  hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar \
  "$1" "$2"
ps aux | grep java

/usr/java/latest/bin/java -cp ::/usr/lib/spark/conf:/usr/lib/spark/lib/spark-assembly.jar:/etc/hadoop/conf:/usr/lib/hadoop/client/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop/../hadoop-hdfs/./:/usr/lib/hadoop/../hadoop-hdfs/lib/*:/usr/lib/hadoop/../hadoop-hdfs/.//*:/usr/lib/hadoop/../hadoop-yarn/lib/*:/usr/lib/hadoop/../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/spark/lib/scala-library.jar:/usr/lib/spark/lib/scala-compiler.jar:/usr/lib/spark/lib/jline.jar -XX:MaxPermSize=128m -Xms1024m -Xmx1024m org.apache.spark.deploy.SparkSubmit --class sap.whcounter.WarehouseCounter --master yarn-cluster --num-executors 1 --driver-memory 1024m --executor-memory 1024m --executor-cores 4 --queue hlab --conf spark.yarn.submit.waitAppCompletion=false --conf spark.app.name=wh_reader_sp --conf spark.streaming.receiver.maxRate=1000 --conf spark.streaming.concurrentJobs=2 --conf spark.eventLog.dir=hdfs:///user/spark/applicationHistory --conf spark.eventLog.enabled=true --conf spark.eventLog.overwrite=true --conf spark.yarn.historyServer.address=http://spark-history.local:18080/ --conf spark.yarn.jar=hdfs:///user/spark/share/lib/spark-assembly.jar --conf spark.yarn.dist.files=hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs:///wh/2015/04/19/*
free -m

                   total       used       free     shared    buffers
Mem:                7873        992       6881          0         62
-/+ buffers/cache:   500       7373
Swap:              14947        574      14373
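The numbers above line up with the crash: each SparkSubmit launcher JVM in the ps output runs with -Xms1024m, so its full heap is committed at startup, plus native overhead (PermGen, thread stacks, code cache). A back-of-the-envelope check, where the 300 MB per-JVM overhead is an assumed estimate, not a measured figure:

```shell
#!/bin/sh
# Rough capacity check: how many submitter JVMs fit in the free RAM shown?
free_mb=6881          # "free" column from `free -m` above
heap_mb=1024          # -Xms1024m: the whole heap is committed immediately
overhead_mb=300       # assumed per-JVM native overhead (estimate)

fit=$(( free_mb / (heap_mb + overhead_mb) ))
echo "$fit concurrent submitter JVMs fit before malloc starts failing"
```

That suggests only a handful of concurrent submissions fit in 6.8 GB, far below a cap of 30-60, which is consistent with the failed 716177408-byte (roughly 683 MB) commit reported in the hs_err log below.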
hs_err_pid7433.log

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 716177408 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2747), pid=7357, tid=140414250673920
#
# JRE version: (7.0_60-b19) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.60-b09 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
VM Arguments:
jvm_args: -XX:MaxPermSize=128m -Xms1024m -Xmx1024m
java_command: org.apache.spark.deploy.SparkSubmit --class
sap.whcounter.WarehouseCounter --master yarn-cluster --num-executors 1
--driver-memory 1024m --executor-memory 1024m --executor-cores 4 --queue hlab
--conf spark.yarn.submit.waitAppCompletion=false --conf
spark.app.name=wh_reader_sp --conf spark.streaming.receiver.maxRate=1000 --conf
spark.streaming.concurrentJobs=2 --conf
spark.eventLog.dir=hdfs:///user/spark/applicationHistory --conf
spark.eventLog.enabled=true --conf spark.eventLog.overwrite=true --conf
spark.yarn.historyServer.address=http://spark-history.local:18080/ --conf
spark.yarn.dist.files=hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar
hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar
hdfs://wh/2015/04/10/* 2015-04-10T00:00:00+00:00

Launcher Type: SUN_STANDARD

Environment Variables:
JAVA_HOME=/usr/java/latest
CLASSPATH=::/usr/lib/spark/conf:/usr/lib/spark/lib/spark-assembly.jar:/etc/hadoop/conf:/usr/lib/hadoop/client/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop/../hadoop-hdfs/./:/usr/lib/hadoop/../hadoop-hdfs/lib/*:/usr/lib/hadoop/../hadoop-hdfs/.//*:/usr/lib/hadoop/../hadoop-yarn/lib/*:/usr/lib/hadoop/../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/spark/lib/scala-library.jar:/usr/lib/spark/lib/scala-compiler.jar:/usr/lib/spark/lib/jline.jar
PATH=/home/colin.williams/bin:/home/colin.williams/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/rvm/bin
SHELL=/bin/bash