Generate datasets to understand the behavior of some clustering algorithms

Posted on Sun 11 November 2018 in machine learning • Tagged with clustering, R, machine learning • 7 min read

Good sample datasets help show how a clustering algorithm behaves under specific conditions. This post shows how to generate nine datasets that will be used in other posts of this series on clustering.


Continue reading

Data classes in Python

Posted on Sat 27 October 2018 in coding • Tagged with Python • 5 min read

This blog post presents the brand-new Python data classes feature and its benefits.


Continue reading

Static typing in Python

Posted on Sat 13 October 2018 in coding • Tagged with Python • 5 min read

This article covers static typing in Python: how and why to annotate Python code with types, how to type-check it statically, and how to enable powerful IDE features.


Continue reading

Encoding in Python

Posted on Sat 29 September 2018 in coding • Tagged with Python • 6 min read

The transition from Python 2 to Python 3 caused some problems because the two versions handle text differently. We first look at how text is represented in Python 2 and Python 3, then at how to convert between the different representations, and finally at the places where encoding comes into play: the encoding of the source code, implicit conversions, the encoding of inputs and outputs, and the file system encoding.


Continue reading

XXVth Meeting of the Société Francophone de Classification

Posted on Sun 16 September 2018 in Meeting • Tagged with Clustering • 4 min read

Last week, I was at the XXVth Meeting of the Société Francophone de Classification, both as a participant and a member of the steering committee.


Continue reading

Key differences between commonly used languages for data science

Posted on Sat 01 September 2018 in Coding • Tagged with Javascript, Python, Scala, C • 6 min read

This blog post introduces the notions of strong and weak typing on the one hand, and of static and dynamic typing on the other. It is illustrated with four languages commonly used in data science pipelines.


Continue reading

Purpose of this blog

Posted on Sat 01 September 2018 in misc • 1 min read

This article is an introduction to this blog.


Continue reading