In Part 1 of this series, we learnt how to set up a Hadoop cluster on Azure HDInsight and run a Spark job to process huge volumes of data. In most practical scenarios, however, such jobs are executed as part of an orchestrated process or workflow, unless the need is only for one-time processing. In our specific use case, we had to derive different metrics related to error patterns and usage scenarios from the log data and report them on a daily basis.