Code execution takes a long time and sometimes fails



Hi @itversity1, @itversity,

I am trying to solve this scenario:
“Get count of customers in each city who have placed order of amount more than 100 and
whose order status is not PENDING.”

rd=sc.textFile("/user/rakeshdey0018/pract/practice3/problem3/orders")
rd1=rd.filter(lambda s:s.split("\t")[3] != 'PENDING').map(lambda s:(s.split("\t")[0],s.split("\t")[2],s.split("\t")[3]))
rd1.toDF(['o_id','o_c_id','o_stat']).registerTempTable("ord")

rd2=sc.textFile("/user/rakeshdey0018/pract/practice3/problem3/order_items")
rd3=rd2.filter(lambda s:float(s.split("\t")[4])>100).map(lambda s:(s.split("\t")[0],float(s.split("\t")[4])))
rd3.toDF(['oi_id','price']).registerTempTable("item")

rd4=sc.textFile("/user/rakeshdey0018/pract/practice3/problem3/customers")
rd5=rd4.map(lambda s:(s.split("\t")[0],s.split("\t")[6]))
rd5.toDF(['c_id','city']).registerTempTable("cust")
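
The three parsing steps above call split several times per line. A minimal sketch of the same parsing that splits each line once is given here. The helper names (parse_order, parse_item, parse_customer) are made up for illustration, and the field positions are simply the ones used in the lambdas above. This mainly tidies the code and is not assumed to fix the slowness by itself.

```python
def parse_order(line):
    """Split an orders line once and return (o_id, o_c_id, o_stat)."""
    f = line.split("\t")
    return (f[0], f[2], f[3])


def parse_item(line):
    """Split an order_items line once and return (oi_id, price)."""
    f = line.split("\t")
    return (f[0], float(f[4]))


def parse_customer(line):
    """Split a customers line once and return (c_id, city)."""
    f = line.split("\t")
    return (f[0], f[6])


# Usage in the pyspark shell (rd, rd2, rd4 are the RDDs read above):
# rd1 = rd.map(parse_order).filter(lambda t: t[2] != 'PENDING')
# rd3 = rd2.map(parse_item).filter(lambda t: t[1] > 100)
# rd5 = rd4.map(parse_customer)
```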

df=sqlContext.sql("select a.city,count(a.c_id) from cust a,item b,ord c where a.c_id=c.o_c_id and b.oi_id=c.o_id group by a.city")
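
One thing that may be worth checking, though it is not confirmed here: the FROM list above names cust and item next to each other with no condition linking them directly, and some Spark versions may plan such a pair as a cartesian product. Below is a sketch of the same query with explicit JOIN ... ON clauses, starting from ord so that every join has its own condition. The QUERY variable and the cnt alias are names chosen for illustration.

```python
# Same tables and columns as the query above, rewritten with explicit joins.
# Each join carries its own ON condition, so no two tables are combined
# without a predicate between them.
QUERY = """
select a.city, count(a.c_id) as cnt
from ord c
join cust a on a.c_id = c.o_c_id
join item b on b.oi_id = c.o_id
group by a.city
"""

# Usage in the pyspark shell (assumes the temp tables registered above):
# df = sqlContext.sql(QUERY)
# df.explain()   # print the physical plan to see how the joins are executed
# df.show()
```

If the plan printed by df.explain() for the original query shows a cartesian product and the plan for this version does not, the join order is a likely suspect.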

df.show() takes a long time and sometimes fails.

This happens even though I have used 2 executors with 2 cores each and 2 GB for each core.
