I had this column named age_band which will have values like "55-64|65-74|75+"
As you can see it contains age groups stored in as a string concatenated with '|' and each age group needs to be compared separately.
I had this taken care in PostgreSQL using:
which results:
# unnest
1 55-64
2 65-74
3 75+
Now, I have to perform the same in Spark SQL and here it is:
and here is the result:
As you can see it contains age groups stored in as a string concatenated with '|' and each age group needs to be compared separately.
I had this taken care in PostgreSQL using:
select unnest(string_to_array('55-64|65-74|75+', '|'));
which results:
# unnest
1 55-64
2 65-74
3 75+
Now, I have to perform the same in Spark SQL and here it is:
select explode(split('55-64|65-74|75+', '[|]'));
and here is the result:
col |
---|
55-64 |
65-74 |
75+ |
Update: 11/27/2019
And in SnowFlake:
select col1, c.value::string as age_band
from san_test,
lateral flatten(input=>split(col2, '|')) c;
col2 is the column where the age_band has concatenated values with | delimited.
select c.value::string as age_band from lateral flatten(input=>split('55-64|65-74|75+', '|')) c;
AGE_BAND
55-64
65-74
75+