About Me


From a young age, I’ve had a sense of motivation and passion driving me toward big dreams. Whether it’s exploring unique opportunities, learning additional skills, or meeting new people, I bring these values to every experience in my life, both personal and professional. I'm an Internet Marketing Expert & Data Scientist with a strong math background and years of experience applying data modeling, engineering, processing, and mining algorithms to solve critical challenges. Experienced in SEO/SEM/SMO, data analysis, and product research. Expert in operations management, market research, business analysis, big data, machine learning, AWS, and web development (Django + Flask).
To learn more about me, keep exploring my site or reach out directly.

You will be happy to know that I have successfully completed more than 20 projects!

Download Resume
What I Do

Here are some of my areas of expertise

Innovative Ideas Generator

My brainchild was wireless health monitoring, which I realized through SleepDoc, a product I developed myself.

Python and Machine Learning

Years of experience in both Python and ML. Worked on regression, SVM, KNN, RF, PCA, DL, and NLP.

Deep Learning and NLP

Neural networks are my favorite. I have used CNN, LSTM, BLSTM, and CNN-LSTM models, along with NLTK and spaCy, in my projects.

Exploratory Data Visualization

EDA helps me understand the data before applying any algorithm. I have used it in almost all of my projects.

Enhanced Feature Extraction

I have extracted features in the temporal, spectral, energy, harmonic, and cepstral domains to improve accuracy.

Intelligent Feature Selection

Not all features are important; sometimes a few need to be removed. Removing them can increase both training speed and accuracy.

Database Management and Architecture (AWS / Azure)

Designing, deploying, and maintaining databases in support of high volume, complex data transactions for specific services or groups of services.

IoT, Git, AWS, and GCP

I manage code with a large team using Git. Large ML models are trained on AWS and GCP.

Signal Processing and Analysis

I understand the math behind ML models, which helps me decide which model to use.

Skills


I have built a diverse set of skills through hard work. Apart from my embedded systems projects, I have worked in several additional roles, including web development, machine learning, and mobile app development. These additional skill sets have helped me become a team player and team builder. I also specialize in electronics and related domains.


A few of my skills are:

Embedded C/C++

50%

Python

90%

Internet of Things

90%

Machine Learning Algorithms

80%

Deep Learning Algorithms

50%

MySQL/MongoDB

80%

GitHub

50%

Web Development (Django / Flask)

75%

Cloud Computing (AWS / Azure)

70%

Digital Marketing

90%

Work Experience


Accunite Solutions Pvt Ltd, Sep 2020 - Present

Data Analyst & Product Researcher
As a Data Analyst & Product Researcher, I direct the end-to-end product development lifecycle in the influencer marketing domain. Defined and realized the product vision for the Influencer Dashboard. Consistently carried out product and brand analyses to drive upsells. Implemented marketing strategies that produced a 32% increase in qualified leads.

Weblinkindia.Net Pvt Ltd, Oct 2017 - Aug 2020

SEO Analyst
As an SEO Specialist & Analyst, I managed a team of 7 executives. I was responsible for planning, implementing, and managing the website's overall on-page and off-page SEO strategy, with a focus on content-driven SEO, email marketing, and paid advertising campaigns across social media channels and search engine platforms. I regularly used Google Analytics to produce performance reports.

Concentrix IBM, Dec 2017 - Aug 2018

Customer Relationship Manager
I worked as a Customer Relationship Manager in the OYO Rooms process, on the international process serving clients in Malaysia. I handled sales calls and client queries related to booked rooms.

Data Science with Machine Learning in Python Internship


More than 6 months of internship training experience. Contributor to the data science community at TechVision.

I started the course with an analytic mindset and came away with a predictive one.
I no longer ask the data questions like "how many clicks did this link get?".
Now I ask "based on the previous history of links on this publisher’s site, can I predict how many people from France will read this in the next three hours?".

Learned & Explored: Python Programming, Artificial Intelligence, Machine Learning, Deep Learning, OpenCV, NLP, Automation, FastAPI, REST APIs, Django, Flask, Cloud Computing (AWS), Apache Spark, Tableau, Business Analysis, Business Sense, Gathering Data, Data Preprocessing, Modeling Data, Exploring Data, Feature Engineering, Statistics, Python Libraries, Algorithms, Interpreting Data, Predicting Phenomena.

Projects


  • SleepDoc uses an FMCW radar system to measure respiration and heart rate. I designed the Android application and the ML algorithms (SVM, RF, CNN, LSTM, CNN-LSTM, Inception) that measure the vital parameters. [Filed Patent Ref No. 201731038573]. 99% accuracy was achieved in respiration and heart rate measurement with 30-second epochs. The deep learning models were trained on Google Cloud Platform (GCP) with TensorFlow-GPU.
  • The trained model was converted to TensorFlow Lite for use on Android, and the data was then uploaded to Google’s Firebase.
  • Made kernel changes to Android OS to read data on Qualcomm’s Snapdragon processor over SPI, I2C, and UART. Interfaced Java with a custom module written in C on Android 6.0.
  • Designed my own email marketing software for marketing SleepDoc, using Boto3 (the AWS SDK for Python), AWS SES, SNS, SQS, AWS Lambda, DynamoDB, and PyQt5. Wrote a crawler that automatically extracts email addresses and phone numbers.

I measured respiration and heart rate using an array of pressure sensors and deep learning (CNN, RNN-LSTM) algorithms. [Filed Patent Ref No. 201731038573]. 96% accuracy was achieved. The project had three stages: 1) preprocessing with analog filters, 2) digital signal processing, and 3) a machine learning model, after which the data is uploaded to the cloud for further processing.

I collected audio using a microphone and a custom-built board with Qualcomm’s Snapdragon processor running Android 6.0. The pipeline covered audio cleaning, feature extraction (more than 700 features) in the temporal, spectral, energy, harmonic, and cepstral domains, feature selection for faster training and better accuracy, and ML models (CNN, LSTM, RF, SVM) designed to classify snoring and other sounds and assess their impact on sleep quality. I used the Google Audio dataset to train the models.

As a research scholar at IIT Kharagpur, I worked on a project that used CNN, Inception, LSTM, and CNN-LSTM models to estimate sleep apnea and sleep stages from the respiration signal alone. The training data was obtained from MESA (National Sleep Research Resource, USA). I cleaned and balanced the data, then performed feature engineering and selection before applying the ML models. I achieved 82.7% accuracy in sleep apnea detection, which is higher than any known method applied to the respiration signal in the MESA dataset, and more than 80% accuracy in sleep stage classification, which can be considered good accuracy.

I designed my own automated emailing software for marketing. It uses AWS SES, SNS, SQS, Lambda, and DynamoDB together with PyQt5. I then used web scraping to collect email addresses and mobile numbers for more than 10 lakh (1 million) people across the globe. Later, I had to delete all of that data because of privacy policy. This automated emailing software could yield substantial savings during marketing campaigns.

I performed feature selection and data balancing on the Santander dataset to achieve better accuracy (>96%) than the first-ranked Kaggle entry (82.907%). I applied two approaches: 1) removal of constant, quasi-constant, duplicate, and correlated features, and 2) Principal Component Analysis (PCA) and LDA. In both cases, I achieved better accuracy. Feature selection reduced the feature dimension from 370 to fewer than 100, which increased accuracy and reduced training time. I also tried feature selection with ROC_AUC, the univariate ANOVA test, Fisher score, Chi2 (χ2), step forward and backward selection, L1 and L2 regularization, and recursive feature elimination.
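The first approach (removing constant, quasi-constant, duplicate, and correlated features) can be sketched in plain Python. This is only an illustration, not the code used in the project: the `select_features` helper, the `dominance` and `max_corr` thresholds, and the toy columns are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, dominance=0.99, max_corr=0.95):
    """Return the names of the columns that survive all filter steps.

    columns: dict mapping feature name -> list of numeric values.
    dominance and max_corr are illustrative thresholds, not tuned values.
    """
    kept = []
    for name, values in columns.items():
        # Constant / quasi-constant: a single value covers (almost) every row.
        top_share = max(values.count(v) for v in set(values)) / len(values)
        if top_share >= dominance:
            continue
        # Duplicate: identical to a column that was already kept.
        if any(values == columns[k] for k in kept):
            continue
        # Correlated: nearly collinear with a column that was already kept.
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical toy data standing in for a real feature table.
toy = {
    "a": [1, 2, 3, 4, 5, 6],
    "const": [7, 7, 7, 7, 7, 7],
    "quasi": [0, 0, 0, 0, 0, 1],
    "a_copy": [1, 2, 3, 4, 5, 6],
    "a_scaled": [2, 4, 6, 8, 10, 12.5],
    "b": [3, 1, 4, 1, 5, 9],
}
selected = select_features(toy, dominance=0.8)
print(selected)
```

On a real table with hundreds of columns, the same steps are usually done with vectorized library routines; the loop above is only meant to show the order of the checks.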

I used the IMDB movie poster image dataset and a CNN for multi-label image prediction. Each movie poster is classified into multiple genres such as Drama, Romance, Music, and Comedy. A learning curve is plotted to show how the model is performing. In another project, I used an LSTM for week-ahead household electricity consumption prediction. Exploratory Data Analysis (EDA) was done to understand the data graphically, which helped me design the algorithms. The RMSE was less than the standard deviation of the true power consumption.
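Comparing RMSE against the standard deviation is a useful sanity check because a forecast that always predicts the mean has an RMSE equal to the (population) standard deviation of the true values. The sketch below shows this with made-up numbers; the consumption values and the `model_forecast` list are invented for illustration and are not results from the project.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up daily consumption values for one week (illustrative only).
actual = [310.0, 295.0, 330.0, 350.0, 305.0, 290.0, 320.0]

# Baseline: always predict the mean of the true values.
mean_forecast = [statistics.fmean(actual)] * len(actual)
baseline_rmse = rmse(actual, mean_forecast)
spread = statistics.pstdev(actual)

# Hypothetical model output that tracks the true values more closely.
model_forecast = [305.0, 300.0, 325.0, 340.0, 310.0, 295.0, 315.0]
model_rmse = rmse(actual, model_forecast)

print(baseline_rmse, spread, model_rmse)
```

An RMSE below the standard deviation therefore means the model does better than this constant-mean baseline.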

Class imbalance was the major problem in the credit card fraud detection project. I used data balancing techniques to improve accuracy to as much as 92% using a CNN with 2 layers plus Dropout and Batch Normalization layers.
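The text does not say which balancing technique was used, so the sketch below assumes the simplest one, random oversampling of the minority class. The `random_oversample` helper and the toy rows are hypothetical and only illustrate the idea.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies (with replacement) to fill the gap to the largest class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 8 legitimate transactions (label 0) and 2 fraudulent ones (label 1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [9.5], [9.9]]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Balancing of this kind is normally applied to the training split only, so that the test set keeps the real class distribution.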

I used the WISDM_v1 dataset for human activity recognition (HAR). The dataset was highly unstructured, so I had to clean and impute it before working with it. EDA helped me decide which algorithms might work. Since this is time-series data, both RNNs and CNNs were a good fit. I used a 2D CNN to recognize activities from accelerometer data alone with more than 90% accuracy. In another project, given images of blood cells infected by the malaria parasite, I used image processing to detect malaria with more than 92% accuracy.

This project presents real-time sentiment analysis using NLTK and speech recognition techniques. Phone call audio was automatically transcribed into long sentences in real time using Google's speech-to-text API. Sentiment analysis was then applied to each sentence.

This project presents spam text classification using spaCy. I did Exploratory Data Analysis (EDA) to understand how the total number of characters in a text relates to whether it is SPAM or HAM. I found that text messages with a larger number of characters are mostly SPAM. After cleaning the raw text, I used TF-IDF to convert the messages into numerical values. An SVM classifier was trained and tested on the TF-IDF features, and I also tested it on some real messages received on my mobile phone.
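To make the TF-IDF step concrete, here is a from-scratch sketch in plain Python. It is an illustration only: the `tfidf` helper, the whitespace tokenizer, the smoothed IDF formula, and the sample messages are assumptions, and the SVM training step is not shown. In practice a library vectorizer would normally be used instead.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, vectors) for a list of already-cleaned text messages."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: the number of messages in which each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency, so rare tokens get larger weights.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)  # guard against empty messages
        vectors.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, vectors


# Made-up messages: the first two look like SPAM, the last two like HAM.
messages = [
    "win a free prize now",
    "free entry win cash now",
    "are we meeting for lunch",
    "see you at lunch",
]
vocab, vectors = tfidf(messages)
print(len(vocab), len(vectors))
```

Each message becomes one numeric vector with one entry per vocabulary word, which is the kind of input a classifier such as an SVM expects.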

I have experience in feature extraction from text files and text extraction from PDF files. I have mostly used spaCy for text processing, along with some custom rules that I wrote and inserted into spaCy's processing pipeline so that the whole algorithm works correctly. I have used PyPDF2 to extract text data from PDF files.

I built a real-time graph of Twitter users' sentiment toward Donald Trump and Elizabeth Warren, candidates in the 2020 US presidential election. The process involves four steps: 1) Twitter data extraction, 2) feature extraction and text cleaning, 3) sentiment analysis of each tweet, separately for Donald Trump and Elizabeth Warren, and 4) a real-time dynamic plot using matplotlib.

Contact


If you would like to connect with me, whether to discuss innovation or business or simply to say hello, don't hesitate to send me an email at rkr.datascientist@gmail.com.

Visit my blog: RKR Data Science Blog Central