As a follow-up to my previous post, I will be applying transfer learning to the RAVDESS Audio Dataset in hopes of improving the model’s accuracy. To review, transfer learning is a deep learning approach in which a model that has been trained on one task is used as a starting point to train a model for a similar task. This post by DJ Sarkar provides a great guide to understanding transfer learning, with examples.
We will first try to use the pretrained VGG-16 model as a feature extractor on our dataset, which means freezing the convolutional blocks of the pretrained model and modifying the dense layers. …
Through all the available senses, humans can sense the emotional state of their communication partner. This emotional detection is natural for humans, but it is a very difficult task for computers. Although they can easily understand content-based information, accessing the depth behind the content is difficult, and that’s what speech emotion recognition (SER) sets out to do. It is a system through which computers classify audio speech files into different emotions such as happy, sad, angry, and neutral. Speech emotion recognition can be used in areas such as the medical field or customer call centers. …
Classification using ensemble methods to achieve an overall accuracy score of ~92%
As a follow-up to my previous article (found here), I will be demonstrating the steps I took to build a classification model using UCI’s Heart Disease Dataset, as well as using ensemble methods to achieve a better accuracy score.
Creating a machine learning model that can classify heart disease more accurately would be highly beneficial to health organizations as well as to patients.
Let’s get started!
First, I imported the necessary libraries and read in the cleaned .csv file:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from collections import Counter
from sklearn.preprocessing import StandardScaler
# data splitting
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
# data modeling
from sklearn.metrics import confusion_matrix, accuracy_score, roc_curve, roc_auc_score, classification_report, f1_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from mlxtend.classifier import StackingCVClassifier
import xgboost as xgb
from sklearn.dummy import DummyClassifier
from sklearn import…
With the rapid growth of data, the demand for data scientists grows as well. According to Smith Hanley Associates, data scientists are being sought for positions in a variety of fields such as healthcare, pharmaceuticals, retail, and other industries. This is great news for those interested in becoming data scientists, especially for those whose jobs were affected by COVID-19 and for whom having a secure job is important.
Cardiovascular disease, or heart disease, is the leading cause of death among women and men and among most racial/ethnic groups in the United States. Heart disease describes a range of conditions that affect your heart. Diseases under the heart disease umbrella include blood vessel diseases, such as coronary artery disease. According to the CDC, roughly 1 in 4 deaths each year is due to heart disease. The WHO states that lifestyle is the main reason behind this heart problem. …
One of the key concepts in data science is time-series analysis, which involves using a statistical model to predict future values of a time series (e.g. financial prices, weather, COVID-19 positive cases/deaths) based on past results. Some components that might be seen in a time-series analysis are:
Regular expressions, also known as Regex, come in handy in a multitude of text processing scenarios. You can search for patterns of numbers, letters, punctuation, and even whitespace. Regex is fast and helps avoid unnecessary loops in your program to match and extract desired information. Until recently, I felt that Regex was very complicated: the syntax looked frustrating, and I thought that I would not be able to learn it. Many others share this same feeling.
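To make the idea of searching for patterns of numbers, letters, and whitespace concrete, here is a minimal sketch using Python's built-in `re` module. The sample strings and variable names are made up for illustration and do not come from the article.

```python
import re

# Hypothetical sample text, invented for this illustration
text = "Order 1042 shipped on 2021-03-15; order 1043 ships on 2021-04-02."

# \d{4}-\d{2}-\d{2} matches a date written as four digits, two digits, two digits
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# With a capture group, findall returns only the part inside the parentheses
order_ids = re.findall(r"[Oo]rder (\d+)", text)

# \s+ matches any run of whitespace, so this collapses each run to one space
tidy = re.sub(r"\s+", " ", "too   many \t spaces\n here").strip()

print(dates)
print(order_ids)
print(tidy)
```

Each call replaces what would otherwise be a hand-written loop over characters or words, which is the kind of loop the paragraph above says Regex helps avoid.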
Python is an object-oriented programming language, which focuses on dividing a program into objects, whereas procedure-oriented programming focuses on dividing a program into functions. An object is simply a collection of attributes (variables) and methods (functions) that act on that data, and a class is a blueprint for that object. This article by Vipul J does a great job explaining how Python classes can be thought of as blueprints of a house, and objects can be thought of as a particular instance of that house (there can be multiple objects for one class, while they all may differ in number of bedrooms/bathrooms/etc., …
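The house analogy can be sketched in a few lines of Python. The `House` class, its attributes, and the two example objects below are hypothetical names chosen for illustration, not code from the referenced article.

```python
class House:
    """Blueprint: every House has the same attributes, but each object holds its own values."""

    def __init__(self, bedrooms, bathrooms):
        # Attributes (variables) stored on this particular object
        self.bedrooms = bedrooms
        self.bathrooms = bathrooms

    def total_rooms(self):
        # Method (function) that acts on this object's own data
        return self.bedrooms + self.bathrooms

    def add_bedroom(self):
        # Changes only the object it is called on
        self.bedrooms += 1


# Two objects built from the same class, differing in their attribute values
cottage = House(bedrooms=2, bathrooms=1)
villa = House(bedrooms=5, bathrooms=4)

cottage.add_bedroom()  # the villa is unaffected
print(cottage.total_rooms(), villa.total_rooms())
```

Both objects share the structure defined by the class, while a change to one object leaves the other untouched.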
In my previous article, found here, I provided a step-by-step guide on how to perform topic modeling and sentiment analysis using VADER on Amazon Alexa reviews. From my analysis, I realized that there were multiple Alexa devices, which I should’ve compared from the beginning to see how the negative and positive feedback differs among models. That insight is more specific and would be more beneficial to Amazon (*insert embarrassed face here*). …
In statistics and data analysis, hypothesis testing is very important because, when we perform experiments, we typically do not have access to all members of a population, so we take samples of measurements to make inferences about the population. These inferences are hypotheses. In essence, a statistical hypothesis test is a method for testing a hypothesis about a parameter in a population using data measured in a sample.
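As a rough illustration of testing a hypothesis about a population parameter from a sample, here is a minimal sketch of a two-sided one-sample z-test using only the Python standard library. The function name, the sample values, and the assumption that the population standard deviation `sigma` is known are all hypothetical choices made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, assuming sigma is known."""
    n = len(sample)
    # Test statistic: distance of the sample mean from mu0, in standard errors
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value falls below the significance level
    return z, p_value, p_value < alpha


# Hypothetical measurements, invented for this illustration
sample = [52, 48, 55, 51, 53, 49, 54, 50, 56, 52]
z, p_value, reject = one_sample_z_test(sample, mu0=50, sigma=3)
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

With these made-up numbers the sample mean is 52, giving z of about 2.11 and a p-value of about 0.035, so the null hypothesis of a population mean of 50 would be rejected at the 5% level. Other tests swap in a different statistic and reference distribution but follow the same pattern.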
In this article, I will review the steps in hypothesis testing, define key terminology, and use examples to show the different types of hypothesis tests.
Regardless of the type of statistical hypothesis test you are performing, there are five main steps to executing…