We are considering how to use Impala in our production environment. There are two choices:
1. Building a dedicated cluster
2. Installing impalad on the existing Hadoop clusters
I prefer #2, but it raises another question: how many impalad daemons should I install on the existing Hadoop clusters? One impalad per datanode?
Yes, one impalad per datanode. We generally recommend co-locating the impalads with the HDFS datanodes so that queries can be scheduled for local short-circuit reads, which gives better performance.
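
To make the locality argument concrete, here is a rough Python sketch. It is a toy model only, not how Impala's scheduler is implemented. The host names (dn1..dn4, impala1, impala2), the block layout, and the helper local_read_fraction are all hypothetical. It counts how many HDFS blocks have at least one replica on a host that also runs impalad, since only those blocks can be read locally.

```python
def local_read_fraction(block_replicas, impalad_hosts):
    """Fraction of blocks with at least one replica on a host running impalad.

    block_replicas: list of lists, each inner list holding the datanode
                    hosts that store one block's replicas (hypothetical data).
    impalad_hosts:  hosts where an impalad daemon is running.
    """
    if not block_replicas:
        return 0.0
    hosts = set(impalad_hosts)
    # A block can be read locally if any replica host also runs impalad.
    local = sum(1 for replicas in block_replicas if hosts & set(replicas))
    return local / len(block_replicas)


# Hypothetical 4-datanode cluster with a replication factor of 2.
blocks = [
    ["dn1", "dn2"],
    ["dn2", "dn3"],
    ["dn3", "dn4"],
    ["dn4", "dn1"],
]

# Option 2 with one impalad per datanode.
all_nodes = local_read_fraction(blocks, ["dn1", "dn2", "dn3", "dn4"])
# Option 2 with impalad on a single datanode only.
one_node = local_read_fraction(blocks, ["dn1"])
# Option 1: a dedicated cluster that holds none of the HDFS blocks.
dedicated = local_read_fraction(blocks, ["impala1", "impala2"])

print(all_nodes, one_node, dedicated)
```

In this toy layout, only the one-impalad-per-datanode setup leaves every block with a local reader. Running impalad on fewer nodes, or on a separate cluster, forces some or all reads over the network.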
 