Talk type: Talk

How to tune Spark performance for ML needs

  • Talk in Russian
Presentation pdf

When it comes to machine learning on large volumes of data, Apache Spark is a popular choice. Writing Spark code is fairly easy, but to improve your application's performance you need to understand not only Spark internals but also what data you are dealing with and in what volumes. Artem will present a set of techniques, tested on a live project, that cut the execution time of some jobs by a factor of 5 to 20.

  • #big data
  • #ml
