Why does a Spark Streaming application receiving data from Kafka use more memory than (executorMemory * executorCount + driverMemory)?


I submitted a Spark Streaming application to a YARN cluster in client mode as follows:

./spark-submit \
  --jars $jars \
  --class $appcls \
  --master yarn-client \
  --driver-memory 64m \
  --executor-memory 64m \
  --conf spark.shuffle.service.enabled=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 6 \
  /data/apps/app.jar

executorMemory * executorCount + driverMemory = 64m * 6 + 64m = 448m,

but the application used 3968 MB. Why did this happen, and how can I reduce the memory use?

There are the Spark configuration parameters spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead, which both default to 384 MB in your case (see the docs).
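
That overhead is added on top of what you request, so each executor container YARN allocates is already much bigger than 64 MB. As a rough sketch using the defaults quoted above (the exact overhead fraction differs slightly between Spark versions):

  executor container = --executor-memory + max(384 MB, ~10% of executor memory)
                     = 64 MB + 384 MB
                     = 448 MB per executor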

Then there is the fact that YARN has a memory allocation granularity (yarn.scheduler.increment-allocation-mb), which defaults to 512 MB, so every request is rounded up to a multiple of that.

There is also a minimum allocation size (yarn.scheduler.minimum-allocation-mb), which defaults to 1 GB. Either it has been set lower in your case, or you are not looking at the memory allocation correctly.
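
Put together, a plausible back-of-the-envelope breakdown for the 3968 MB you observed looks like this. It assumes the 512 MB increment above and the default spark.yarn.am.memory of 512 MB for the client-mode ApplicationMaster container; both are assumptions about your cluster, and the exact total depends on how each individual container gets rounded:

  6 executor containers: 6 * (64 MB + 384 MB overhead, rounded up to 512 MB)  = 3072 MB
  ApplicationMaster container: 512 MB + 384 MB overhead                       ~  896 MB
  total                                                                       ~ 3968 MB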


All of this overhead should be negligible compared to your real memory use. You should set --executor-memory to 20 GB or more. Why are you trying to configure such a ridiculously low amount of memory?
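
For example, something along these lines (the 20g figure is just the suggestion above; the driver size and the explicit overhead value here are illustrative, so size them for your actual workload):

./spark-submit \
  --jars $jars \
  --class $appcls \
  --master yarn-client \
  --driver-memory 2g \
  --executor-memory 20g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  --conf spark.shuffle.service.enabled=false \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 6 \
  /data/apps/app.jar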

