Below, we show how you can build a function that interacts with users by asking for their input. The function returns their Body Mass Index (BMI) and accepts measurements in either the metric or the imperial system. Let’s do it. Here is the bmi.py file:
We will give as input:
And we get a BMI of 21.63:
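The bmi.py listing and the sample input are not reproduced in this excerpt. As a minimal sketch of what such a function could look like, assuming the standard BMI formulas (kilograms divided by meters squared, or 703 times pounds divided by inches squared), the names `compute_bmi` and `ask_bmi` and the prompt wording below are hypothetical and not taken from the original file:

```python
def compute_bmi(weight, height, system="metric"):
    """Return the BMI rounded to two decimals.

    metric:   weight in kilograms, height in meters
    imperial: weight in pounds, height in inches
    """
    if weight <= 0 or height <= 0:
        raise ValueError("weight and height must be positive")
    if system == "metric":
        bmi = weight / height ** 2
    elif system == "imperial":
        # 703 converts lb/in^2 to the kg/m^2 scale
        bmi = 703 * weight / height ** 2
    else:
        raise ValueError("system must be 'metric' or 'imperial'")
    return round(bmi, 2)


def ask_bmi(read=input):
    """Ask the user for the unit system and measurements, then return the BMI.

    `read` defaults to the built-in input(); it is a parameter only so that
    the function can be exercised without a keyboard.
    """
    units = {
        "metric": ("kilograms", "meters"),
        "imperial": ("pounds", "inches"),
    }
    system = read("Which system do you use, metric or imperial? ").strip().lower()
    if system not in units:
        raise ValueError("system must be 'metric' or 'imperial'")
    weight_unit, height_unit = units[system]
    weight = float(read(f"Weight in {weight_unit}: "))
    height = float(read(f"Height in {height_unit}: "))
    return compute_bmi(weight, height, system)
```

Calling `ask_bmi()` from an interactive session prompts for the three values in turn. Keeping the arithmetic in `compute_bmi` separate from the prompts makes the formula easy to check on its own.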
In this tutorial, we will provide basic examples of the conditional statements in bash scripting.
In the script, we refer to the variables passed on the command line with the dollar sign $ followed by a number. The number is the position of the variable that we want to pass. So, $1 is the first variable, $2 is the second variable, and so on. Let's see an example. Consider the following myexample.sh script.
echo "Hello, I pass the first variable $1 and the second one $2"
And let’s run it by typing:
bash myexample.sh var1 var2
Here, var1 is assigned to $1 and…
The novel Anna Karenina begins with the quote:
All happy families are alike; each unhappy family is unhappy in its own way. Leo Tolstoy
This is the Anna Karenina principle and personally, I have found it very suitable to describe the world of Data Science. Think about a Machine Learning pipeline:
We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed the previous installments:
Vol. 1:
Vol. 2:
Vol. 3:
Vol. 4:
This function returns the first non-null value between two columns:
import pandas as pd
import numpy as np
df = pd.DataFrame({"A": [1, 2, np.nan, 4, np.nan], "B": ["A", "B", "C", "D", "E"]})
df
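The name of the function is not shown in this excerpt. One way to get the first non-null value between two columns, offered as a sketch that assumes pandas' `Series.combine_first` is an acceptable stand-in, is shown below. The column name `first_non_null` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan, 4, np.nan],
                   "B": ["A", "B", "C", "D", "E"]})

# Take the value from column A where it is present,
# otherwise fall back to the value in column B.
df["first_non_null"] = df["A"].combine_first(df["B"])

result = df["first_non_null"].tolist()
print(result)
```

Because column A holds numbers and column B holds strings, the combined column has a mixed (object) dtype. For more than two columns, the calls can be chained in order of priority.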
In the first article, we learned how to start running Airflow in Docker, and in this post, we will provide an example of how you can run a DAG in Docker. We assume that you have already followed the steps of running Airflow in Docker and you are ready to run the compose.
The first thing that you need to do is to set your working directory to your airflow directory which most probably consists of the following folders.
The simplest and fastest way to start Airflow is to run it with CeleryExecutor in Docker. We assume you have a basic understanding of Docker and you’ve already installed the Docker Community Edition (CE) on your computer.
It’s convenient to create an Airflow directory where you’ll have your folders (like dags, etc.). So open your terminal, and run:
mkdir airflow-docker
cd airflow-docker
I created a folder called airflow-docker.
To deploy Airflow on Docker Compose, you should fetch the docker-compose.yaml file. So let’s download it with the curl command.
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml'
Some directories in the container are mounted, which means…
We have started a series of articles on tips and tricks for data scientists, mainly in Python and R. (In case you missed them, here are Vol. 1, Vol. 2, and Vol. 3.)
Google Colab is becoming more and more popular in the data science community. Working with Colab Jupyter Notebooks, you are able to mount your Google Drive so that you can get data from it directly. Let’s see how you can do that. You should open your notebook and type the following commands:
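The commands themselves are not reproduced in this excerpt. A minimal sketch, assuming the `google.colab.drive.mount` helper that Colab provides and the conventional `/content/drive` mount point, could look like this. The wrapper name `mount_google_drive` is hypothetical, and the import guard is only there so that the snippet does not fail outside Colab.

```python
def mount_google_drive(mount_point="/content/drive"):
    """Mount Google Drive in a Colab notebook.

    Returns the mount point, or None when not running inside Colab.
    """
    try:
        # This module is only available inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return None
    # Colab asks you to authorize access to your Google account here.
    drive.mount(mount_point)
    return mount_point


mounted_at = mount_google_drive()
print(mounted_at)
```

Once the drive is mounted, its files can be read with ordinary file paths under the mount point.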
Let’s say that we want to search for Python images. We can run the command:
docker search python
As we can see, we get the NAME, the DESCRIPTION, the STARS, and whether the image is OFFICIAL and/or AUTOMATED. Always try to get images with many stars, and prefer official ones.
How to get images from Docker Hub
You can get an image from a public repository by running the command docker pull followed by the name of the image. Let’s say that we want to get the “hello-world” image.
docker pull hello-world:latest
There are many different approaches to predicting the winner of a race. The race can be any distance and the runners can be dogs, horses, or humans. Also, apart from trying to predict the winner, it may be possible to answer other questions, such as the probability of a runner finishing on the podium (top three positions).
Personally, I prefer to approach this kind of problem with Monte Carlo simulation instead of trying to build Machine Learning models. Let's describe the Monte Carlo approach.
Let’s say that we want to predict the probability of each runner…
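The rest of the description is cut off in this excerpt, so the following is only a sketch of the general idea, not the author's method. It assumes each runner's finishing time is drawn independently from a normal distribution with a known mean and standard deviation. The runner names, the numbers, and the function `simulate_race` are all hypothetical.

```python
import random


def simulate_race(runners, n_sims=10_000, seed=0):
    """Estimate win and podium probabilities by repeated simulation.

    runners: dict mapping a runner's name to (mean_time, std_time).
    Returns two dicts: win probability and top-three probability per runner.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in runners}
    podiums = {name: 0 for name in runners}

    for _ in range(n_sims):
        # Draw one finishing time per runner for this simulated race.
        times = {name: rng.gauss(mu, sigma)
                 for name, (mu, sigma) in runners.items()}
        # The lowest time finishes first.
        ranking = sorted(times, key=times.get)
        wins[ranking[0]] += 1
        for name in ranking[:3]:
            podiums[name] += 1

    win_prob = {name: count / n_sims for name, count in wins.items()}
    podium_prob = {name: count / n_sims for name, count in podiums.items()}
    return win_prob, podium_prob


# Hypothetical runners: (mean finishing time, standard deviation) in seconds.
runners = {"A": (10.0, 0.2), "B": (10.1, 0.2), "C": (10.5, 0.2), "D": (11.0, 0.2)}
win_prob, podium_prob = simulate_race(runners, n_sims=5_000, seed=1)
print(win_prob)
print(podium_prob)
```

The same loop answers other questions with little extra work: counting how often a runner lands in the top three gives the podium probability, and any other event defined on the simulated ranking can be counted the same way.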
Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/