> Extract, transform, and load (ETL) large sets of structured, semi-structured, and unstructured data
> Data ingestion using Spark Streaming, Spark SQL, and Sqoop from RDBMS, NoSQL databases, and file sources
> Memory tuning, Hadoop administration, and cluster configuration
> Data analysis and reporting using Hive
> Data visualization using Tableau and ReactJS
> Practical experience with Hive ETL and Impala
> Data warehousing using Vertica and Cassandra
> Stream processing with Kafka and Spark Streaming