Sayali Walke

Data Analyst - BI Engineer - Data Scientist

Portfolio


About Me


I am an Information Systems graduate student at Northeastern University, with majors in Data Science and Machine Learning, focused on helping businesses make data-driven decisions. Upon graduation, I intend to build a career in data analytics and data science. I have a passion for data analytics using R, Python, SQL, Tableau, Power BI, and similar tools to tell compelling stories and identify trends and anomalies.

Prior to my master's, I worked at a startup for ~2.5 years and at a multinational company for 1 year as a Business Intelligence Engineer. I have experience in data analytics, ETL, reporting, and software development. Currently, I am looking for opportunities in data analytics, business intelligence, and data science.






Work Experience




  • Company: Vidya Technology Solutions
  • Location: Aurangabad, India
  • Duration: June 2015 - Sept 2017
  • Position: Data Analyst

  • Built a Python and Internet of Things (IoT) based Automated Machine Monitoring & Alarming System. Collected real-time data from sensors attached to CNC machines. Pre-processed the sensor values and performed diagnostic data analysis to identify patterns in the machines' key performance indicators. This reduced average machine maintenance cost by 20%.


  • Company: Tata Consultancy Services
  • Location: Pune, India
  • Duration: Oct 2013 - Oct 2014
  • Position: Business Intelligence Engineer

  • Solved a data reconciliation problem by creating 12 types of business reports from customer usage and payments data, which prevented losses of $1.5 million. Created multiple visualizations in Tableau and SSRS to present business operation KPIs to management, giving a unified view of data from multiple databases.
  • Integrated data from multiple sources by designing ETL packages. Performed data analysis by building complex MySQL queries, functions, and stored procedures as part of the reconciliation solution.








Customer Segmentation


Data Cleaning, Data Transformation, Exploratory Data Analysis, k-means clustering

Performed behavioral customer segmentation using k-means clustering. Grouped customers by RFM (Recency, Frequency, Monetary) attributes to determine effective, tailored marketing strategies for an e-commerce company's 500k customers.
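As a rough illustration of the RFM plus k-means idea, here is a minimal standard-library sketch. The transaction rows, the reference date, the min-max scaling step, and the choice of k = 2 are all assumptions made for the example; the project itself may have used different features, scaling, and tooling.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2020, 1, 5), 40.0),
    ("c1", date(2020, 3, 1), 60.0),
    ("c2", date(2019, 11, 20), 15.0),
    ("c3", date(2020, 2, 27), 300.0),
    ("c3", date(2020, 3, 2), 250.0),
    ("c3", date(2020, 3, 9), 275.0),
]

def rfm_table(rows, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen, freq, spend = {}, defaultdict(int), defaultdict(float)
    for cust, day, amount in rows:
        last_seen[cust] = max(last_seen.get(cust, day), day)
        freq[cust] += 1
        spend[cust] += amount
    return {c: ((today - last_seen[c]).days, freq[c], spend[c]) for c in last_seen}

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single attribute dominates distances."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, seeded with the first k points for repeatability."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

rfm = rfm_table(transactions, today=date(2020, 3, 10))
customers = sorted(rfm)
segments = dict(zip(customers, kmeans(min_max_scale([rfm[c] for c in customers]), k=2)))
print(rfm)
print(segments)
```

In this toy data the recent, frequent buyers (c1 and c3) end up in one segment and the lapsed, low-spend customer (c2) in the other.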


Judge a book by its cover


Web scraping, Data analysis, Deep learning, Convolutional Neural Networks, Transfer Learning

Created a deep learning Convolutional Neural Network (CNN) to predict a book's genre from its cover, classifying 57,000 images. Applied transfer learning to this image classification problem and improved accuracy to 72%.


Online Food ordering System


Database design and Modeling

Designed and developed the database for an online food ordering system to simulate the functions performed by customers and restaurants. Used stored procedures, views, triggers, and role-based privileges to insert and modify database records. Performed data analysis to generate insightful reports.


Diabetes Prediction


Machine Learning, Statistics & Python

Implemented a machine learning model to predict whether a patient has diabetes based on diagnostic measurements. Performed data set preprocessing, feature selection, and regularization using Lasso and Ridge regression. Verified the results using various evaluation metrics, including confusion matrix, AUC-ROC, RMSE, and cross-validation scores, and achieved an accuracy of 83.83%.
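To make the evaluation metrics concrete, the sketch below computes a confusion matrix, accuracy, and AUC-ROC by hand. The labels, predicted scores, and the 0.5 decision threshold are invented for illustration and are not results from the project.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth (1 = diabetic) and model scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
auc = roc_auc(y_true, scores)
print(tn, fp, fn, tp, accuracy, auc)
```

Accuracy depends on the chosen threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is one reason to report both.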


Credit Card Fraud Detection


Machine Learning using Python

Classified transactions as fraudulent or non-fraudulent in a highly unbalanced dataset using various regression techniques.
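One reason unbalanced data needs care: a model that always predicts the majority class scores high accuracy while catching no fraud. The sketch below shows that baseline and one common remedy, inverse-frequency class weights (the same heuristic scikit-learn exposes as class_weight="balanced"). The 1% fraud rate is an assumed figure, and the project does not state which balancing technique it used.

```python
from collections import Counter

# Hypothetical labels: 0 = legitimate, 1 = fraudulent (assumed 1% fraud rate).
labels = [0] * 990 + [1] * 10

counts = Counter(labels)
n_samples, n_classes = len(labels), len(counts)

# Always predicting the majority class looks accurate but catches no fraud.
majority = counts.most_common(1)[0][0]
baseline_accuracy = counts[majority] / n_samples
baseline_fraud_recall = 1.0 if majority == 1 else 0.0

# Inverse-frequency weights: n_samples / (n_classes * class_count).
class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}

print(baseline_accuracy, baseline_fraud_recall, class_weight)
```

With these weights each fraudulent example counts roughly 100 times as much as a legitimate one during training, so the two classes contribute equally to the loss.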


House Price Prediction


Machine Learning using Python

Developed regression models to predict house prices and compared the results of Linear Regression, Lasso Regression, Ridge Regression, and 3rd-degree Polynomial Regression. Performed exploratory data analysis on a large data set to understand patterns and detect anomalies, and used recursive feature elimination for feature selection.
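The difference between plain linear regression and Ridge regression can be seen in a one-feature sketch: the L2 penalty alpha shrinks the slope toward zero. The data points and the alpha value are made up for the example, and the single-feature closed form is a simplification of the multi-feature models compared in the project.

```python
import math

def fit_ridge_1d(xs, ys, alpha):
    """Fit y = intercept + slope * x with an L2 penalty alpha on the slope only."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)  # alpha = 0 gives ordinary least squares
    intercept = y_mean - slope * x_mean
    return intercept, slope

def rmse(xs, ys, model):
    """Root mean squared error of a fitted (intercept, slope) model."""
    intercept, slope = model
    errors = [(intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical data: house size (arbitrary units) versus price (arbitrary units).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.0, 4.0, 6.0, 8.0, 10.0]

ols = fit_ridge_1d(sizes, prices, alpha=0.0)
ridge = fit_ridge_1d(sizes, prices, alpha=10.0)
print(ols, rmse(sizes, prices, ols))
print(ridge, rmse(sizes, prices, ridge))
```

On training data the penalized fit always has an error at least as large as the unpenalized one; the benefit of regularization shows up on held-out data, which is why the comparison is usually made with a validation set or cross-validation.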


Bank Loan Prediction


Logistic Regression, Python

Developed statistical models to determine whether to grant a loan based on the likelihood of it being repaid. Built several models, including Random Forest, Logistic Regression, and Decision Tree, and selected the best algorithm based on accuracy and cross-validation score. Performed categorical variable analysis and distribution analysis by plotting histograms and box plots of variable combinations and their probabilities using NumPy, Pandas, and Matplotlib.
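Selecting a model by cross-validation score can be sketched as follows. The two candidate "models" here (a majority-class baseline and a single income threshold) are deliberately tiny stand-ins for the Random Forest, Logistic Regression, and Decision Tree models named above, and the applicant data is invented.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cross_val_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy of fit(X_train, y_train) -> predict(x)."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

def fit_majority(X, y):
    """Baseline: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def fit_threshold(X, y):
    """Predict repayment when income exceeds the midpoint of the class means."""
    mean0 = sum(x for x, t in zip(X, y) if t == 0) / y.count(0)
    mean1 = sum(x for x, t in zip(X, y) if t == 1) / y.count(1)
    cut = (mean0 + mean1) / 2
    return lambda x: 1 if x > cut else 0

# Hypothetical applicants: income (in thousands) and whether the loan was repaid.
income = [20, 25, 30, 35, 60, 65, 70, 75, 28, 68]
repaid = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

candidates = {"majority": fit_majority, "threshold": fit_threshold}
cv_scores = {name: cross_val_accuracy(fit, income, repaid) for name, fit in candidates.items()}
best = max(cv_scores, key=cv_scores.get)
print(cv_scores, best)
```

Because every score is computed on rows the model did not see during fitting, the comparison rewards models that generalize rather than ones that memorize the training set.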


Stock Analysis & Price Prediction


Web scraping, Data Transformation, Data Analysis and Interactive Dashboard

Scraped the fundamentals of 502 stocks from the web, then transformed the data into a form suitable for loading into MS SQL Server. Carried out data analysis and built a visualization for value investing. Predicted stock prices using 4 different machine learning models.


Amazon Recommender System


Big Data Analysis, Recommender System, Hadoop, MapReduce, Hive, Apache Mahout

Performed big data analysis on 130M+ records. Derived various numerical and summarization patterns using Hadoop MapReduce. Built a recommender system that recommends similar products based on items bought by other customers ("People who bought this also bought this").
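The "people who bought this also bought" idea can be sketched with simple item co-occurrence counts. This is a plain-Python stand-in written for illustration, not the Hadoop and Apache Mahout pipeline the project used; the baskets and product names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase history: customer -> set of items bought.
baskets = {
    "u1": {"kindle", "case", "charger"},
    "u2": {"kindle", "case"},
    "u3": {"kindle", "charger"},
    "u4": {"case", "screen protector"},
    "u5": {"kindle", "case"},
}

# Count how often each pair of items appears in the same customer's basket.
co_counts = defaultdict(Counter)
for items in baskets.values():
    for a, b in combinations(sorted(items), 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

def also_bought(item, n=2):
    """Top-n items most often bought with `item`; ties broken alphabetically."""
    ranked = sorted(co_counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]

print(also_bought("kindle"))
print(also_bought("case"))
```

At the scale of 130M+ records the pair counting would be distributed, for example as a MapReduce job that emits item pairs in the map step and sums them in the reduce step.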


Big Data Analysis Hadoop


Big Data Analysis, Hadoop, MapReduce, Hive, Pig

Data analysis on big data. Used various datasets ranging from 1M to 100M, including the MovieLens dataset. Covers basic and advanced MapReduce using Hadoop, including MR algorithms such as sorting, filtering, and summarization.
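The summarization pattern mentioned above can be illustrated with an in-memory imitation of the map, shuffle, and reduce phases. This is a teaching sketch in plain Python rather than Hadoop code, and the (movie_id, rating) records are invented rather than taken from MovieLens.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (movie_id, (rating, 1)) so sums and counts can be combined later."""
    for movie_id, rating in records:
        yield movie_id, (rating, 1)

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Summarize one key: average rating from the partial (sum, count) pairs."""
    total = sum(rating for rating, _ in values)
    count = sum(n for _, n in values)
    return key, total / count

# Hypothetical input records: (movie_id, rating).
records = [("m1", 4.0), ("m2", 3.0), ("m1", 5.0), ("m2", 2.0), ("m1", 3.0)]

grouped = shuffle(map_phase(records))
averages = dict(reduce_phase(key, values) for key, values in grouped.items())
print(averages)
```

Emitting (rating, 1) pairs instead of bare ratings keeps the intermediate values mergeable, which is what allows a combiner to pre-aggregate on each mapper in a real cluster.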


Big Data Analysis MongoDB


Big Data Analysis, MongoDB, Map Reduce

Data analysis on big data. Used various datasets ranging from 1M to 100M, including the MovieLens dataset. Covers basic and advanced map-reduce using MongoDB.


Aircraft Data Analysis


Data Analysis, Data Visualization

Visualized the launches in an informative way and summarized the work and findings. Identified "outlier" launches that an engineer should look over, and explained why. Identified interesting patterns in the data, e.g. weather seasonality and poorly performing parts.


University Portal


REST API, CRUD Operations

Developed a REST-based web service that served as a student portal for the university. It provided functionality for students and professors such as adding or removing a course, making announcements for an assignment, sending email notifications to professors and students, and registering for courses. Used multiple AWS services, including DynamoDB, Step Functions, Lambda, and SNS, and deployed it on AWS Elastic Beanstalk.
