We are considering how to use Impala in our production
environment. There are two choices:
1. Build a dedicated cluster
2. Install impalad on the existing Hadoop clusters
I prefer #2, but it raises another question: how many impalad daemons should I install on
the existing Hadoop clusters? One impalad per DataNode?
Yes, one impalad per DataNode. We generally recommend co-locating the impalads with the HDFS DataNodes so that queries can be scheduled to use local short-circuit reads, which gives better performance.
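One way to check the "one impalad per DataNode" layout is to compare the two host lists. The following is only a minimal sketch: the function name `colocation_gaps` and all host names are hypothetical, and it assumes you have already collected both lists yourself (for example from your cluster manager); it does not query Impala or HDFS.

```python
def colocation_gaps(datanodes, impalads):
    """Return (DataNode hosts with no impalad, impalad hosts with no DataNode)."""
    # Normalize host names so "DN01 " and "dn01" compare as equal.
    dn = {h.strip().lower() for h in datanodes}
    im = {h.strip().lower() for h in impalads}
    return sorted(dn - im), sorted(im - dn)


# Hypothetical host names, for illustration only.
datanodes = ["dn01.example.com", "dn02.example.com", "dn03.example.com"]
impalads = ["dn01.example.com", "dn02.example.com", "edge01.example.com"]

missing, remote_only = colocation_gaps(datanodes, impalads)
print("DataNodes without a local impalad:", missing)
print("impalads without a local DataNode:", remote_only)
```

Hosts in the first list hold HDFS blocks that no local impalad can read through a short-circuit read; hosts in the second list run an impalad that has to read all of its data over the network.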