Data Scientist with a strong math background and 3+ years of experience using predictive modeling, data processing, and data mining algorithms to solve challenging business problems.
I have worked on projects that involved researching hard data problems, analyzing complex, label-poor datasets, building novel production ETL and ML platforms, helping to bring intelligent capabilities to fruition for users, and creating functional data tooling.
- Machine Learning: Building supervised and unsupervised classification and regression pipelines with state-of-the-art algorithms; devising high-performance statistical and numerical methods that run in production clusters; time series analysis and forecasting; architecting high-volume ETL and machine learning pipelines.
- Software engineering: Building projects from prototypes to production using Python and Java. Experienced in SQL and in building functional front-end prototypes.
- Soft skills: Cross-team collaboration, project management, leadership, mentoring, and advising.
Technical Skills
Programming Languages: Python, Java, PySpark, SQL, Shell Scripting
Data Science: Apache Spark, NumPy, Pandas, scikit-learn, FastAPI, TensorFlow and Keras
Database and Visualization Tools: MySQL, PostgreSQL, Power BI, Matplotlib, Flask and Streamlit
Cloud, DevOps and Other Tools: Azure, Docker, Jenkins, Jira, Git, Bitbucket, Cloudera HDP and Airflow