In probability theory, the **birthday problem** or **birthday paradox** concerns the probability that, in a set of *n* randomly chosen people, some pair of them will have the same birthday. In a group of 23 people, the probability of a shared birthday exceeds 50%, while a group of 70 has a 99.9% chance of a shared birthday.

You can calculate the probability that at least two of **n** people share a birthday explicitly by applying the mathematical formulas. Let’s try to estimate these probabilities numerically with a Monte Carlo simulation. …
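As a rough illustration of the idea, the sketch below estimates the shared-birthday probability by simulation and compares it with the closed-form value. It assumes 365 equally likely birthdays and ignores leap years; the function names, the number of trials, and the seed are illustrative choices, not taken from the original article.

```python
import random


def has_shared_birthday(n, rng):
    """Return True if at least two of n random birthdays coincide."""
    birthdays = [rng.randrange(365) for _ in range(n)]
    return len(set(birthdays)) < n


def simulate_shared_birthday(n, trials=10_000, seed=0):
    """Monte Carlo estimate of P(at least one shared birthday among n people)."""
    rng = random.Random(seed)
    hits = sum(has_shared_birthday(n, rng) for _ in range(trials))
    return hits / trials


def exact_shared_birthday(n):
    """Closed form: 1 - (365/365) * (364/365) * ... * ((365 - n + 1)/365)."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365
    return 1.0 - p_all_distinct


for n in (10, 23, 70):
    print(n, round(exact_shared_birthday(n), 4), round(simulate_shared_birthday(n), 4))
```

With more trials the simulated value should move closer to the exact one, at the cost of a longer run time.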

Get a list of all of the Docker commands:

`docker -h`

- `builder`: Manage builds
- `config`: Manage Docker configs
- `container`: Manage containers
- `engine`: Manage the docker engine
- `image`: Manage images
- `network`: Manage networks
- `node`: Manage Swarm nodes
- `plugin`: Manage plugins
- `secret`: Manage Docker secrets
- `service`: Manage services
- `stack`: Manage Docker stacks
- `swarm`: Manage Swarm
- `system`: Manage Docker
- `trust`: Manage trust on Docker images
- `volume`: Manage volumes

`docker image`

- `build`: Build an image from a Dockerfile
- `history`: Show the history of an image
- `import`: Import the contents from a tarball to create a filesystem image
- `inspect`: Display detailed information on one or more images
- `…`

In this post, we will provide some examples of what you can do with OpenCV.

We will walk through an example of how to blend images using Python OpenCV. Below we show the target and the filter images.

**Target Image**

**Filter Image**

Sometimes we need to generate correlated data for exhibition purposes, technical assessments, testing, etc. We have provided a walk-through example of how to generate correlated data in Python using the `scikit-learn` library. In R, as far as I know, there is no equivalent library for generating correlated data. For that reason, we will work with simulated data from the Multivariate Normal Distribution. I would suggest having a look at the variance-covariance matrix and the relationship between correlation and covariance.

We will generate 1000 observations from the Multivariate Normal Distribution of 3 Gaussians as follows:

- V1~N(10,1), V2~N(5,1)…
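To make the link between correlation and the Multivariate Normal concrete, here is a minimal Python sketch (the original walk-through uses R) that draws two correlated unit-variance Gaussians with the means 10 and 5 listed above. The correlation `rho=0.5`, the seed, and the function names are hypothetical values chosen for illustration; the remaining variables and the actual covariance matrix of the original example are not reproduced here.

```python
import math
import random


def correlated_normals(n, mean1, mean2, rho, seed=0):
    """Draw n pairs (v1, v2) of unit-variance normals with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        v1 = mean1 + z1
        # Mixing z1 into v2 induces the correlation; the weights keep Var(v2) = 1.
        v2 = mean2 + rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        pairs.append((v1, v2))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# rho=0.5 is an assumed value, not one given in the article.
pairs = correlated_normals(1000, mean1=10.0, mean2=5.0, rho=0.5)
v1, v2 = zip(*pairs)
print(round(sum(v1) / len(v1), 2), round(sum(v2) / len(v2), 2), round(pearson(v1, v2), 2))
```

Because both variables have unit variance, the covariance and the correlation coincide in this sketch; with other variances the covariance would be the correlation multiplied by the two standard deviations.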

In this tutorial, we will show how you can get the Power of Test when you apply Hypothesis Testing with the Binomial Distribution. Before we provide the example, let’s recall what the Type I and Type II errors are.

**Type I error**

This is the probability of rejecting the null hypothesis given that the null hypothesis is true. It equals the significance level α, which in statistics is usually set to 5%.

**Type II error**

This is the probability of failing to reject the null hypothesis given that the null hypothesis is false. …
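The power of a test is one minus the Type II error. The sketch below shows one way to compute it for a one-sided (upper-tail) binomial test. The sample size `n=100`, the null value `p0=0.5`, and the alternative `p1=0.6` are hypothetical numbers picked for illustration, and the helper names are not from the original tutorial.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


def critical_value(n, p0, alpha):
    """Smallest c such that P(X >= c | p0) <= alpha."""
    for c in range(n + 2):
        if upper_tail(c, n, p0) <= alpha:
            return c


# Hypothetical setup: H0: p = 0.5 versus H1: p = 0.6 with 100 trials.
n, p0, p1, alpha = 100, 0.5, 0.6, 0.05
c = critical_value(n, p0, alpha)   # reject H0 when X >= c
type_i = upper_tail(c, n, p0)      # actual Type I error, at most alpha
power = upper_tail(c, n, p1)       # P(reject H0 | p = p1)
type_ii = 1 - power                # P(fail to reject H0 | p = p1)
print(c, round(type_i, 4), round(type_ii, 4), round(power, 4))
```

Because the binomial is discrete, the actual Type I error is usually somewhat below the nominal α rather than exactly equal to it.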

A Luddite can be characterized as a person opposed to new technology or ways of working. The word comes from the “Luddites”, a secret oath-based organization of English textile workers in the 19th century, a radical faction that destroyed textile machinery as a form of protest.

Back in 2008, I got my first job in the banking sector, in the Sales Department as a Sales Analyst. Our department dealt with Car Loans, and some of the KPIs that we monitored were:

- Number of Loan Applications from the Car Dealers
- Number of Approved Loan Applications
- …

Medium publications are very important for Medium writers. As a writer, if you want to grow your audience you should publish your stories in a Medium publication. As a rule of thumb, you should choose popular publications, and by popular we mean publications with many followers.

It is frustrating that there is no easy way to get the number of followers per publication. Let’s go to “Start it up”, which is the publication with the most followers. I type the URL “https://medium.com/swlh” and I land on this page:

Docker has two main categories of data storage, **persistent** and **non-persistent**.

Persistent data storage is provided by volumes, which are decoupled from the containers.

- Use a volume for persistent data: Create the volume first, then create your container.
- Mounted to a directory in the container
- Data is written to the volume
- Deleting a container does not delete the volume
- First-class citizens
- Uses the local driver
- Third-party drivers: Block storage, File storage, Object storage
- Storage locations: Linux: /var/lib/docker/volumes/ , Windows: C:\ProgramData\Docker\volumes

**Non-Persistent**

- Local storage
- Data that is ephemeral
- Every container has it
- Tied to the lifecycle of the container

**By…**

When building data workflows and machine learning pipelines, we often check for the existence of specific files and directories (folders). In this article, we will provide some hands-on examples of how you can check for files or directories in R, Python, and Bash.

For this example, we have created a file called `myfile.txt` and a directory called `my_test_folder`.

We can easily check if a file exists with the `file.exists()` command from the base package. Let's have a look at the following example:

    if (file.exists("myfile.txt")) {
      print("The file exists")
    } else {
      print("The file does not exist")
    }

And we get:

`…`
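For comparison, a similar check can be sketched in Python with `pathlib`. The snippet creates `myfile.txt` and `my_test_folder` inside a temporary directory so that it runs anywhere; the helper name `describe_path` and the extra `no_such_file.txt` entry are illustrative assumptions, not part of the original article.

```python
import tempfile
from pathlib import Path


def describe_path(path):
    """Classify a path as 'file', 'directory', or 'missing'."""
    p = Path(path)
    if p.is_file():
        return "file"
    if p.is_dir():
        return "directory"
    return "missing"


# Recreate the article's file and folder names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "myfile.txt").write_text("hello")
    (base / "my_test_folder").mkdir()
    names = ("myfile.txt", "my_test_folder", "no_such_file.txt")
    results = {name: describe_path(base / name) for name in names}

print(results)
```

`Path.exists()` alone would return `True` for both files and directories, so `is_file()` and `is_dir()` are used here to tell the two apart.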

We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed the previous installments:

Vol. 1:

Vol. 2:

Vol. 3:

Vol. 4:

Vol. 5:

Assume that we have the following list:

`mylist = [1,1,1,2,2,3,3]`

And we want to get the mode (i.e. the most frequent element). We can use the following trick, passing `mylist.count` as the key of `max`:

`max(mylist, key = mylist.count)`

And we get `1` since this was the mode in our list. In the case where there is a draw in the mode and you want to get…
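When several values are tied for the highest count, `max` with a key returns only the first of them that it encounters. One possible way to see every tied value, sketched here with `collections.Counter`, is shown below; the helper `all_modes` and the list `tied` are hypothetical and not necessarily the approach the original article goes on to describe.

```python
from collections import Counter


def all_modes(values):
    """Return every value that attains the highest count, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


mylist = [1, 1, 1, 2, 2, 3, 3]
tied = [1, 1, 2, 2, 3]  # hypothetical list where 1 and 2 are tied

print(max(tied, key=tied.count))  # only the first tied value
print(all_modes(mylist))
print(all_modes(tied))
```

From the returned list you can then pick the smallest or largest tied value with `min` or `max`, depending on what you need.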

Data Scientist @ Persado | Co-founder of the Data Science blog: https://predictivehacks.com/