Artem Shutak
Company: Grid Dynamics
Apache Spark is a popular choice for machine learning on large data volumes. While writing code for Spark is fairly easy, improving your application's performance requires understanding not only Spark internals, but also what data you are dealing with and in what volumes. Artem will present a set of techniques tried on a live project that made some jobs run 5-20 times faster.