
Creating intuitive, interactive web applications to visualize Big Data with Streamlit, Dask, and Coiled


  1. a heavier groupby computation,
  2. interactive widgets to scale the Coiled cluster up or down, and
  3. an option to shut down the cluster on demand.

Using unsupervised NLP topic modelling and clustering to build a machine learning classifier that can identify misinformation in short-text documents

image by author


A comparative analysis of two NLP topic modelling approaches for short-text documents, using Arabic Twitter data

image under license to author via iStock


How to create delayed objects around functions you only want to run once per worker

Photo by Joshua Sortino on Unsplash


Use Case

Pre-processing Arabic text for machine learning using the camel-tools Python package

image under license to Richard Pelgrim


  1. The form of characters and spelling of words can vary depending on their context (fancy…

  1. Use distributed processing to work with larger-than-memory datasets hosted in the cloud,
  2. Work with Arabic Natural Language Processing, and
  3. To do that, I’ll have to wrap my brain cells around working with deep-learning networks using TensorFlow.

The Dataset

Data for Change

Building a machine learning model to predict the intensity of conflicts using a century of climate change data

Image via iStock under license to Richard Pelgrim


…dots. LOTS of dots.

Richard Pelgrim

Data Scientist & Fermentation Fanatic | Data Science Evangelist Intern @
