For Data Scientists, and in data analysis more generally, loading and saving data (DataFrames) is almost systematic.

Usually, I use df.to_csv('path/df.csv'), and I think that almost all Pandas users do the same or have done so at least once.

Python users know that saving data in CSV format is very practical and very easy, but it has some drawbacks, which we will detail below, that can turn this usage into a bad habit.

For the rest of this article, I will generate a Pandas DataFrame, as follows, and I will use it as an example.
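The exact DataFrame used in the article is not reproduced here, so the following is only a minimal sketch. The column names (`id`, `amount`, `category`, `created_at`), the row count and the temporary file path are all assumptions made for illustration. It builds a small DataFrame, saves it with `to_csv`, reloads it with `read_csv`, and compares the column types before and after the round trip.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Hypothetical example data: these columns are illustrative, not the article's.
rng = np.random.default_rng(seed=42)
n_rows = 1_000

df = pd.DataFrame(
    {
        "id": np.arange(n_rows),
        "amount": rng.normal(loc=100.0, scale=15.0, size=n_rows),
        "category": rng.choice(["a", "b", "c"], size=n_rows),
        "created_at": pd.date_range("2021-01-01", periods=n_rows, freq="D"),
    }
)

# Save to CSV (in a temporary directory here) and load it back.
path = Path(tempfile.gettempdir()) / "df.csv"
df.to_csv(path, index=False)
reloaded = pd.read_csv(path)

# Compare the column types before and after the round trip.
print(df.dtypes)
print(reloaded.dtypes)
```

On reload, `created_at` is no longer a datetime column unless `parse_dates` is passed to `read_csv`. This is one commonly cited limitation of CSV, shown here only as an illustration.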


Knowing early enough how much money a customer is going to bring to the company can be very helpful for deciding how much the company can spend to retain them.
Indeed, if a customer has only brought in €10 over their lifetime as a customer of the company, we will not spend more than €10 to retain them.

There are many techniques out there for estimating Customer Lifetime Value, and we are going to introduce two of them that will allow you to estimate its value without waiting to have a significant purchase history.


In this post, we will learn how to create a contact form using Flask. The result will look like this:

I. Build Contact Form

All scripts are available on:

First of all, you need to install flask and flask_wtf using pip: pip install flask flask_wtf.

a. Create our HTML contact page

Here we will add all the fields that we want to retrieve.

<!-- here we add our contact forms -->
{% block content %}
<div class="contact">
<h3>Contact us</h3>
<!-- Just here I use "url_for" to call the function get_contact.
This function returns the html…

Here’s a complete tutorial on how to create, from scratch with Flask, a responsive navigation bar that adapts its layout to the viewport size.

In this tutorial, we’ll look into how to use Bootstrap to create a responsive NavBar in Flask. Our navigation will have three different layouts, depending on the viewport size:

  1. a mobile layout in which only the logo and a toggle button will be visible by default and users can open and close the menu using the toggle,
  2. a tablet layout in which we will show two call-to-action buttons between the logo and toggle in the…

For a lot of data scientists and data analysts, the print command is a very common debugging tool, but it is a bad habit for several reasons that we will explain here.

  • The first limitation of using print is that when you move your script to production, you move all the prints too, and sometimes they are not needed.
    All these prints just add noise to your script and make it less readable.
  • You can’t disable all print outputs, or only some of them
  • You can’t create different severity levels for your messages

To change this bad habit…
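One common way to address these three limitations is Python's standard `logging` module; that this is the replacement the article goes on to recommend is an assumption. The sketch below uses a hypothetical logger name (`data_pipeline`) and a hypothetical `clean` function to show severity levels and how debug output can be switched off without touching the code that emits it.

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("data_pipeline")
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)
logger.addHandler(handler)


def clean(values):
    """Drop missing entries, reporting what happened at two severity levels."""
    logger.debug("received %d values", len(values))
    cleaned = [v for v in values if v is not None]
    dropped = len(values) - len(cleaned)
    if dropped:
        logger.warning("dropped %d missing values", dropped)
    return cleaned


# While developing: DEBUG and WARNING messages are both emitted.
clean([1, None, 3])

# In production: raise the threshold so DEBUG messages are silenced,
# without deleting a single line from clean().
logger.setLevel(logging.WARNING)
clean([4, 5, None])
```

The second call still reports the dropped value as a warning, but the debug line no longer appears.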

In this article, I will share some tips that I learned from senior Data Scientists and throughout my own career, that you should know, and that will save you a lot of time.

Some of the tips that I am going to share with you will surely be familiar to some of you, depending on how far along you are in your career.

I. Tips For Univariate Analysis

When I started my journey as a data scientist, I spent a lot of time on univariate analysis. …

If you are starting your life as a data scientist, you will hear things around you like “Why didn’t we get the expected performance?”, “This model is a failure!”,
“We must find a solution to improve this model” or “We must start from the beginning.”

If you want to avoid ending up in this kind of situation, you must avoid some beginner mistakes. Here, we will discuss some that we have already made ourselves and that you can avoid.

  1. Understand The Client’s Need

Understanding the underlying need, and not just what you are being asked to achieve, is essential.

The client or…

Allowing users to connect to their personal space is one of the most important features of your web application. In this article, I will explain how you can add authentication by adding a login page using the Flask-Login Python package.

All scripts are available on my GitHub here.

Login Page Flask Example

So here, we will create sign-up, sign-in and logout pages that will allow users to connect to their personal private spaces, which they cannot see unless they are logged in. …

High Cardinality

When you start a machine learning or data science project, you begin with an exploratory analysis to extract interesting information.
One of the most common problems that we encounter is high cardinality in qualitative variables (too many unique values).
High cardinality can introduce instability in your model.

So how can we treat high cardinality?
Common solutions are:

  • Label Encoder: Replace string values with integer classes [0, 1, 2, 3…]
  • Dummy Encoder: This method consists of creating n new variables with values in
    {0, 1}. This method is not very practical when there are many unique values, since processing time will increase significantly.
  • Aggregating Values…
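The first two options can be sketched with pandas. The `city` column below is made-up data, and `pd.factorize` is used as a simple stand-in for a label encoder (scikit-learn's `LabelEncoder` would be another option); this is an illustration under those assumptions, not necessarily how the article implements either method.

```python
import pandas as pd

# Made-up qualitative variable with 3 unique values.
cities = pd.Series(
    ["Paris", "Lyon", "Paris", "Nice", "Lyon", "Paris"], name="city"
)

# How many unique values are we dealing with?
n_unique = cities.nunique()

# Label encoding: each distinct string becomes an integer class,
# numbered in order of first appearance.
codes, uniques = pd.factorize(cities)

# Dummy encoding: one new {0, 1} column per distinct value,
# so the number of columns grows with the cardinality.
dummies = pd.get_dummies(cities, prefix="city", dtype=int)

print(n_unique)
print(list(codes))
print(dummies)
```

With only three cities the dummy table stays small, but a variable with thousands of unique values would produce thousands of columns, which is where the processing cost mentioned above comes from.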

Missing values and outliers are frequently encountered when dealing with data. So the big question in this kind of case is: how should we treat these missing values or outliers?

In this article, we will present some methods to identify and treat missing values as well as outliers.

For this article, we will use this data:

Univariate analysis

To start any data project, you need to know the data you have. Are there missing values? What about outliers? Do some variables have too many values, which will introduce instability in your model? …

So univariate analysis can answer all these questions…
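A minimal sketch of how these three questions could be answered column by column with pandas. The data is made up, the helper name `iqr_outlier_count` is hypothetical, and the 1.5 × IQR rule is just one common convention for flagging outliers, assumed here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data: one numeric and one qualitative variable.
df = pd.DataFrame(
    {
        "age": [25, 31, np.nan, 42, 38, 29, 120],
        "city": ["Paris", "Lyon", "Paris", None, "Nice", "Lyon", "Paris"],
    }
)


def iqr_outlier_count(series: pd.Series) -> int:
    """Count values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return int(mask.sum())


numeric_columns = df.select_dtypes("number").columns

summary = pd.DataFrame(
    {
        # Are there missing values?
        "missing": df.isna().sum(),
        # Does a variable have too many distinct values?
        "unique": df.nunique(),
        # What about outliers? (numeric columns only, NaN elsewhere)
        "outliers": pd.Series(
            {col: iqr_outlier_count(df[col]) for col in numeric_columns}
        ),
    }
)

print(summary)
```

Each row of `summary` describes one variable, which makes it easy to spot the columns that need attention before any modelling starts.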


