How to tune Spark performance for ML needs

In Russian

Apache Spark is a popular choice for machine learning on large volumes of data. Writing Spark code is fairly easy, but improving an application's performance requires understanding not only Spark internals but also what data you are dealing with and in what volumes. Artem will describe a set of techniques, tried on a live project, that made some jobs run 5 to 20 times faster.

#big data
#ml

Speakers

Artem Shutak
Grid Dynamics
