I am an Information Systems graduate student at Northeastern University, concentrating in Data Science and Machine Learning with a focus on helping businesses make data-driven decisions. After graduation, I intend to build a career in data analytics and data science. I enjoy using R, Python, SQL, Tableau, and Power BI to tell compelling stories with data and to identify trends and anomalies.
Prior to my master's, I worked at a startup for about 2.5 years and at a multinational company for 1 year as a Business Intelligence Engineer. I have experience in data analytics, ETL, reporting, and software development. I am currently looking for opportunities in data analytics, business intelligence, and data science.
Data Cleaning, Data Transformation, Exploratory Data Analysis, K-Means Clustering
Performed behavioral customer segmentation using k-means clustering. Grouped 500k customers of an e-commerce company by their RFM (Recency, Frequency, Monetary) attributes to identify effective, tailored marketing strategies for each segment.
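To illustrate what the RFM attributes look like, here is a minimal sketch using only the Python standard library. It is not the project's code: the `rfm_table` function, the sample transactions, and the `as_of` reference date are all hypothetical. In a segmentation workflow, the resulting triples would typically be scaled and then passed to a k-means implementation.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    as_of: reference date used to measure recency (an assumption here).
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        # Track the most recent purchase date per customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1      # number of purchases
        monetary[customer_id] += amount  # total amount spent
    return {
        cid: ((as_of - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical sample data, for illustration only.
transactions = [
    ("c1", date(2020, 1, 5), 40.0),
    ("c1", date(2020, 3, 1), 60.0),
    ("c2", date(2020, 2, 10), 15.0),
]
rfm = rfm_table(transactions, as_of=date(2020, 3, 31))
```

Here `rfm["c1"]` holds the days since that customer's last purchase, the purchase count, and the total spend.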
Web Scraping, Data Analysis, Deep Learning, Convolutional Neural Networks, Transfer Learning
Built a deep learning convolutional neural network (CNN) that predicts a book's genre from its cover, classifying 57,000 cover images. Applied transfer learning to this image classification problem and improved accuracy to 72%.
Database Design and Modeling
Designed and developed the database for an online food ordering system, simulating the operations performed by customers and restaurants. Used stored procedures, views, triggers, and role-based privileges to insert and modify database records. Performed data analysis to generate insightful reports.
Machine Learning, Statistics & Python
Implemented a machine learning model to predict whether a patient has diabetes based on diagnostic measurements. Performed dataset preprocessing, feature selection, and regularization using Lasso and Ridge regression. Verified the results using several evaluation metrics, including the confusion matrix, AUC-ROC, RMSE, and cross-validation scores, and achieved an accuracy of 83.83%.
Machine Learning using Python
Classified transactions as fraudulent or non-fraudulent in a highly imbalanced dataset using various regression techniques.
Machine Learning using Python
Developed regression models to predict house prices and compared the results of Linear Regression, Lasso Regression, Ridge Regression, and third-degree Polynomial Regression. Performed exploratory data analysis on a large dataset to understand patterns and detect anomalies, and used recursive feature elimination for feature selection.
Logistic Regression, Python
Developed statistical models to decide whether to grant a loan based on the likelihood of it being repaid. Built Random Forest, Logistic Regression, and Decision Tree models and selected the best one based on accuracy and cross-validation score. Performed categorical variable analysis and distribution analysis by plotting histograms and box plots of variable combinations and their probabilities using NumPy, Pandas, and Matplotlib.
Web Scraping, Data Transformation, Data Analysis and Interactive Dashboard
Scraped stock fundamentals for 502 stocks, then transformed the data into a suitable form and loaded it into MS SQL Server. Carried out data analysis and built an interactive visualization to support value investing. Predicted stock prices using 4 different machine learning models.
Big Data Analysis, Recommender System, Hadoop, MapReduce, Hive, Apache Mahout
Performed big data analysis on 130M+ records. Derived various numerical and summarization patterns using Hadoop MapReduce. Built a recommender system that suggests similar products based on items bought together by other customers ("People who bought this also bought...").
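The "also bought" idea can be illustrated with a small single-machine sketch based on item co-occurrence counts. This is an assumption-laden illustration, not the project's Hadoop or Mahout implementation: the function names `build_cooccurrence` and `also_bought` and the sample baskets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(baskets):
    """Count, for each item, how often every other item shares a basket with it."""
    co = defaultdict(Counter)
    for basket in baskets:
        # set() ignores duplicate items inside one basket.
        for a, b in permutations(set(basket), 2):
            co[a][b] += 1
    return co


def also_bought(co, item, k=3):
    """Return up to k items most often bought together with `item`."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(co[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical purchase baskets, for illustration only.
baskets = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["book", "mug"],
]
co = build_cooccurrence(baskets)
suggestions = also_bought(co, "book")
```

At scale, the same pair counting maps naturally onto a MapReduce job that emits item pairs per basket and sums them in the reducer.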
Big Data Analysis, Hadoop, MapReduce, Hive, Pig
Performed data analysis on big data using datasets ranging from 1M to 100M records, including the MovieLens dataset. Covers basic and advanced MapReduce using Hadoop, with MapReduce algorithms such as sorting, filtering, and summarization.
Big Data Analysis, MongoDB, MapReduce
Performed data analysis on big data using datasets ranging from 1M to 100M records, including the MovieLens dataset. Covers basic and advanced MapReduce using MongoDB.
Data Analysis, Data Visualization
Visualize the launches in an informative way, and write a few words describing your work and what you found. Identify any "outlier" launches that an engineer should look over, and why. Identify any interesting patterns in the data, e.g. weather seasonality, poorly performing parts, etc.
REST API, CRUD Operations
Developed a REST-based web service that served as a student portal for the university. It provided features for students and professors such as adding or removing a course, making announcements for an assignment, sending email notifications to professors and students, and registering for courses. Used multiple AWS services, including DynamoDB, Step Functions, Lambda, and SNS, and deployed it on AWS Elastic Beanstalk.