Is there any approach to support the QUALIFY SQL key word, or any workaround?
Like the SQL statement below,
SELECT id, col1, col2, ROW_NUMBER() OVER(PARTITION BY col3 ORDER BY col1 ASC) AS Rwn FROM table_name QUALIFY Rwn=1;
Currently impala supports the row_number() over(...) analytic function,
However, if I want to filter the result of the row_number() over(...), is there any approach or workaround?
I tried to replace the QUALIFY with WHERE
SELECT id, col1, col2, ROW_NUMBER() OVER(PARTITION BY col3 ORDER BY col1 ASC) AS Rwn FROM table_name WHERE Rwn=1;
I got exception "AnalysisException: couldn't resolve column reference: Rwn"
It seems that the code public SlotDescriptor registerColumnRef(TableName tblName, String colName) is checking the specific column_name from table, obviouslly can not find the Rwn, and then throw the exception when parse the SQL.
Anyway, is there any approach / workaround to support QUALIFY key word?
The where clause function is working fine for me as a substitution for "QUALIFY".Could you check if the RowNum function alone is working for you.
Thanks Pranay, the RowNum function alone is working well.
By checking the document, http://www.cloudera.com/ content/cloudera/en/ documentation/core/latest/ topics/impala_analytic_ functions.html
it seems that the analytic function can only work with select and order by now
"Analytic function calls are only allowed in the SELECT list and in the outermost ORDER BY clause of the query."
Also by checking the code, when analyze the SQL statement, the where clause can not contain analytic expressions.
public void analyze(Analyzer analyzer) throws AnalysisException
229 Expr e = whereClause_.findFirstOf( AnalyticExpr.class);
230 if (e != null) {
231 throw new AnalysisException(
232 "WHERE clause must not contain analytic expressions: " + e.toSql());
233 }
However, I am thinking whether can work if utilize the alias, and put the analytic function alias in where clause. But it failed in my test...
And I am not sure whether there are any other workaround way if want to achieve similar goal.
btw, I am using the version 2.0. Which version you used?
I tried the following with the rank as an alias in the inner query :
SELECT
Co1,
Col2,
FROM
(
SELECT
Col1,
Col2,
CASE WHEN MAX(TNS_8Wk_Amt) OVER (PARTITION BY Consumer_Id) <> 0
THEN RANK() OVER (PARTITION BY Col1 ORDER BY Col2 DESC)
ELSE RANK() OVER (PARTITION BY Col1 ORDER BY Col2 DESC) END AS RANK
FROM
Table1;
) r where RANK=1
I am using Impala version 2.0
Thanks Pranay, yes it works if utilize the sub-query! This could be a workaround way now :)
댓글 없음:
댓글 쓰기