For most data scientists, the path from raw, messy data to clear narratives and actionable insights passes through visualization. Giving concrete shape to your data makes it more approachable for human brains; it also opens up the space for discussion and interpretation.
We’ve published numerous tutorials on basic and effective data visualization techniques over the years, so this week we’re digging a bit deeper. We’ve selected some excellent recent articles that cover more advanced and more specific use cases, so dive in if you’d like to expand your toolkit and pick up new ideas along the way.
- Leverage a visualization tool to better understand your model. The importance of explainable AI has become very clear in recent years. Zolzaya Luvsandorj’s tutorial introduces partial dependence plots (PDPs), shows how to implement them in scikit-learn, and explains how they can help us "understand how different values of a particular feature impact a model’s predictions."
- A powerful way to troubleshoot embeddings. In a helpful deep dive, Francisco Castillo Carrasco turns to SNE, t-SNE, and UMAP—three popular dimensionality reduction techniques—and unpacks how visualizing embeddings (and unstructured data more broadly) can help us detect issues and recognize unexpected patterns and changes.
- How to add a custom touch to your NLP workflow. Bringing together NLTK and spaCy, Leonie Monigatti’s quick, handy tutorial walks us through the process of creating colorful, customized part-of-speech tags that make a text-analysis project easier to digest and draw conclusions from, which is, ultimately, the end goal of any solid visualization.
- When your slides need a bit more punch. There’s absolutely nothing wrong with a clean, straightforward bar chart. For those times when you want to really grab your audience’s attention, though, Boriharn K proposes several other eye-catching options that might just suit the specific needs of your next presentation or report.
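To make the idea behind the partial dependence plots mentioned in the first item more concrete, here is a minimal pure-Python sketch of the underlying calculation. It is not the code from the tutorial and it does not use scikit-learn; the function `partial_dependence`, the `toy_model`, and the sample rows are all hypothetical names and values chosen for illustration. The idea is to force one feature to each value on a grid, leave every other feature as observed, and average the model’s predictions.

```python
from statistics import mean


def partial_dependence(predict, rows, feature_index, grid):
    """Return the average prediction for each grid value of one feature.

    predict: any callable that maps a single row (list of floats) to a number.
    rows: the observed data, one list of feature values per row.
    feature_index: position of the feature being examined.
    grid: the values that feature is forced to take.
    """
    averages = []
    for value in grid:
        # Copy each row, overwriting only the feature of interest,
        # so the original data is left untouched.
        modified = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row in rows
        ]
        averages.append(mean(predict(row) for row in modified))
    return averages


# Hypothetical toy model (an assumption for this sketch): 2 * x0 + x1.
def toy_model(row):
    return 2 * row[0] + row[1]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
grid = [0.0, 1.0, 2.0]

pd_values = partial_dependence(toy_model, rows, 0, grid)
print(pd_values)  # [20.0, 22.0, 24.0]
```

Plotting `grid` against `pd_values` gives the partial dependence curve for that feature. In this toy case it rises by 2 for every unit increase, matching the coefficient on the first feature.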
If you’re looking for some compelling reads on other topics in data science and machine learning, you’re in the right spot. Below are some of our recent favorites.
- We enjoyed chatting with Himalaya Bir Shrestha about his work as an energy system analyst, the benefits of technical skills for roles outside of traditional data or developer teams, and how industry data scientists can get involved in projects around sustainability and climate change.
- If you’re looking for ways to make your neural networks more efficient, Sabrina Göllner’s debut TDS post explains how to reduce training parameters without sacrificing accuracy.
- A key step in any ML workflow is to assess how distinct features in your model contribute to its output. Stacy Li’s thorough guide covers best practices for calculating and interpreting feature importance.
- Roll up your sleeves and learn some new SQL tips and tricks—Zoumana Keita recently shared a concise guide on the four JOIN clauses every data pro should know.
- In the market for an illuminating and fun longread? Bradley Stephen Shaw shared an engaging and easy-to-follow walkthrough of a recent project where he used DBSCAN, a spatial-clustering algorithm, to analyze crime data from the UK.
Thank you for spending time with us this week, and for offering your support, whether by reading our authors’ work, sharing it with your network, or (for those of you who’ve just taken the leap) becoming a Medium member.
Until the next Variable,
TDS Editors






