Blog Posts

Can Python displace R for Data Science?

R and Python are two of the most popular data science languages, but which one is better? And will Python replace R in the near future? Let’s find out! R vs. Python: The Basics. First, some history. R first appeared in 1993; it was derived from S, a programming language developed for statisticians. It was (and still is) commonly used in educational settings and is a favorite among biostatisticians. At the end of the day, R excels

Learn SQL over the summer

Think summer is reserved for flying to warm places and hanging out at the beach? Sure! But it’s also a great time to learn new skills that you haven’t had time for. If you recently graduated from high school and want to get a head start on computer programming for college, learning SQL over the summer is a great opportunity. You have nothing to lose and everything to gain—SQL is actually really easy to learn, especially with so much free

R Programming: What It Means to a Data Scientist

Programmers commonly have many questions about R, a popular programming language in data science and analysis. R is used all over the world by professionals in the fields of data science, data visualization, data mining, and statistical analysis. But what exactly is R? Where did it come from? And why is it being used specifically by data science professionals? This article attempts to answer all these questions, including the most important of them all: Should you be learning R as

Exploring Summer Solstice With SQL

Why is June 21 the official start of summer? Let’s see how SQL can help us answer this question. The Summer Solstice. Officially, June 21 is recognized as the summer solstice, the day with the most daylight of the entire year in the Northern Hemisphere. Why? Because on this day, the sun rises early and sets quite late. People in the Northern Hemisphere celebrate the summer solstice with feasts, bonfires, picnics, and traditional dances and songs. In ancient times, the summer solstice was

High-Performance Statistical Queries: Dependencies Between Continuous and Discrete Variables

In my previous articles, I explained how you can check for associations between two continuous variables and between two discrete variables. This time, we’ll check for linear dependencies between a continuous variable and a discrete variable. You can do this by measuring the variance between the means of the continuous variable in the different groups of the discrete variable. The null hypothesis here is that all of the variance between the group means is a result of the variance within each group. If you reject this null hypothesis, then

Why Data Visualization Is Important: Two Perspectives

Have you ever wondered how you can deal with an overwhelming amount of data? How do you use it? How do you understand what it’s saying? And last but not least, how do you present your data to the world such that everyone understands your point? In this article, we’ll explore these questions to understand the importance of data visualization. Where are the data? When I want someone to understand my perspective, I try to visualize it precisely so I

