Spark Column: How to deal with null values when using mathematical functions or Spark expressions? (Java)
I am trying to apply mathematical functions to multiple column values and create a new column in the same DataFrame. For instance:
id | name | Quantity | price
_____________________________
1  | Luke | 10       | 5
should become
id | name | Quantity | price | derivedCol
__________________________________________
1  | Luke | 10       | 5     | 50
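For reference, the input DataFrame can be reproduced with something like the following (a sketch: the schema types are my guess from the example above, and I have added a second row with a null price to show the case that fails):

    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructType;

    SparkSession spark = SparkSession.builder()
            .appName("derived-cols")
            .master("local[*]")
            .getOrCreate();

    StructType schema = new StructType()
            .add("id", DataTypes.IntegerType)
            .add("name", DataTypes.StringType)
            .add("Quantity", DataTypes.IntegerType)
            .add("price", DataTypes.IntegerType);

    List<Row> rows = Arrays.asList(
            RowFactory.create(1, "Luke", 10, 5),    // the row from the example
            RowFactory.create(2, "Han", 3, null));  // a row with a null, where things break

    Dataset<Row> df = spark.createDataFrame(rows, schema);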
So the Spark expression is: derivedCol = (Quantity * price), and I am executing it in a loop so that multiple derived columns can be created.
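For a single, hard-coded column the derivation would look roughly like this (a minimal sketch of what the loop builds dynamically; the column names are the ones from the example above):

    import static org.apache.spark.sql.functions.col;

    import org.apache.spark.sql.Column;

    // derivedCol = Quantity * price, attached to the same DataFrame
    Column derivedCol = col("Quantity").multiply(col("price"));
    df = df.withColumn("derivedCol", derivedCol);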
Now the problem I am facing is that if a column value is null, the program throws a NullPointerException and stops; if all values are present, it runs smoothly. The calculations happen at runtime and I have not found a way to check the column values for null. I don't want to filter out the rows that contain nulls.
Here is the relevant part of the loop that builds the derived column (the final closing brace belongs to the enclosing loop, which is not shown here):

    if (null != derivedColNxtObj) {
        switch (operatorStr) {
            case "+":
                derivedCol = derivedCol.plus(derivedColNxtObj);
                break;
            case "-":
                derivedCol = derivedCol.minus(derivedColNxtObj);
                break;
            case "*":
                derivedCol = derivedCol.multiply(derivedColNxtObj);
                break;
            case "/":
                derivedCol = derivedCol.divide(derivedColNxtObj);
                break;
        }
    }
    }
    df = df.withColumn(fieldname, derivedCol);
So how can I just ignore null values and execute my expression smoothly?
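What I have in mind is something along the lines of the sketch below, applied to each operand, though I do not know whether coalesce is the right tool for this or whether a default value of 0 even makes sense (both are just my guesses):

    import static org.apache.spark.sql.functions.coalesce;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.lit;

    import org.apache.spark.sql.Column;

    // replace a null operand with a default value before applying the operator,
    // so that the derived column can still be computed at runtime
    Column safePrice = coalesce(col("price"), lit(0));
    Column derivedCol = col("Quantity").multiply(safePrice);
    df = df.withColumn("derivedCol", derivedCol);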