How to use Spark map with .NET for Spark
How can I use Spark's map function with .NET for Apache Spark? For example, in Python:
nums = sc.parallelize([1, 2, 3, 4])
squared = nums.map(lambda x: x * x).collect()
And in Scala:
val input = sc.parallelize(List(1, 2, 3, 4))
val res = input.map(x => x * x)
But the .NET DataFrame doesn't have a function named Map.
This documentation page (https://docs.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.functions.map?view=spark-dotnet) has no example.
The map function belongs to the RDD API, while .NET for Apache Spark implements the DataFrame API (aka Spark SQL). You need to use the corresponding DataFrame functions, such as Select, to transform data. So, if you get your data into a DataFrame df, you can achieve the equivalent of your map with df.Select(df["col"] * df["col"]), etc.
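A minimal C# sketch of that Select-based transformation, assuming the Microsoft.Spark NuGet package and a running Spark session. The column name "_1" is an assumption here: it is the default name I would expect CreateDataFrame to give when building a DataFrame from a plain list of integers, so check your actual schema with PrintSchema first.

```csharp
using Microsoft.Spark.Sql;

class Program
{
    static void Main()
    {
        // Entry point to the DataFrame API (the .NET equivalent of SparkSession).
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Build a one-column DataFrame from a local collection,
        // analogous to sc.parallelize([1, 2, 3, 4]) in the RDD API.
        DataFrame df = spark.CreateDataFrame(new[] { 1, 2, 3, 4 });
        df.PrintSchema(); // verify the generated column name (assumed "_1" below)

        // The DataFrame counterpart of nums.map(lambda x: x * x):
        // Select projects each row through a column expression.
        DataFrame squared = df.Select(df["_1"] * df["_1"]);

        squared.Show();
    }
}
```

Note that a .NET for Spark app is not run with plain dotnet run; it is submitted through spark-submit with the DotnetRunner class and the microsoft-spark jar, as described in the getting-started docs.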
See the examples in the Spark.Net repo.