Deep Learning | Towards Data Science https://towardsdatascience.com/category/artificial-intelligence/deep-learning/ Publish AI, ML & data-science insights to a global community of data professionals. Fri, 11 Jul 2025 18:37:06 +0000 en-US

June Must-Reads: AI Agents, Dashboards, and More https://towardsdatascience.com/june-must-reads-ai-agents-dashboards-and-more/ Thu, 03 Jul 2025 11:00:00 +0000 https://towardsdatascience.com/?p=606559 A selection of our most-read and -shared articles of the past month.

The post June Must-Reads: AI Agents, Dashboards, and More appeared first on Towards Data Science.

Taking ResNet to the Next Level https://towardsdatascience.com/taking-resnet-to-the-next-level/ Thu, 03 Jul 2025 04:11:06 +0000 https://towardsdatascience.com/?p=606487 Understanding how ResNeXt improves upon ResNet, with a comprehensive PyTorch implementation guide

How I Automated My Machine Learning Workflow with Just 10 Lines of Python https://towardsdatascience.com/how-i-automated-my-machine-learning-workflow-with-just-10-lines-of-python/ Fri, 06 Jun 2025 13:11:46 +0000 https://towardsdatascience.com/?p=606226 Use LazyPredict and PyCaret to skip the grunt work and jump straight to performance.

Detecting Malicious URLs Using LSTM and Google’s BERT Models https://towardsdatascience.com/detecting-malicious-urls-using-lstm-and-googles-bert-models/ Wed, 28 May 2025 18:38:05 +0000 https://towardsdatascience.com/?p=606123 A progressive approach to implementing AI-powered webpage detection applications into production

Bayesian Optimization for Hyperparameter Tuning of Deep Learning Models https://towardsdatascience.com/bayesian-optimization-for-hyperparameter-tuning-of-deep-learning-models/ Tue, 27 May 2025 21:02:02 +0000 https://towardsdatascience.com/?p=606118 Explore how Bayesian Optimization outperforms Grid Search in efficiency and performance on binary classification tasks.

Why Regularization Isn’t Enough: A Better Way to Train Neural Networks with Two Objectives https://towardsdatascience.com/why-regularization-isnt-enough-a-better-way-to-train-neural-networks-with-two-objectives/ Tue, 27 May 2025 18:09:12 +0000 https://towardsdatascience.com/?p=606111 Why splitting your objectives and your model might be the key to better performance and clearer trade-offs in deep learning.

Demystifying Policy Optimization in RL: An Introduction to PPO and GRPO https://towardsdatascience.com/demystifying-policy-optimization-in-rl-an-introduction-to-ppo-and-grpo/ Mon, 26 May 2025 18:25:23 +0000 https://towardsdatascience.com/?p=606095 A beginner-friendly guide to PPO and GRPO: simplifying policy optimization in reinforcement learning

The CNN That Challenges ViT https://towardsdatascience.com/the-cnn-that-challenges-vit/ Tue, 06 May 2025 01:44:13 +0000 https://towardsdatascience.com/?p=605914 A PyTorch implementation of the ConvNeXt architecture

Why Are Convolutional Neural Networks Great For Images? https://towardsdatascience.com/why-are-convolutional-neural-networks-great-for-images/ Thu, 01 May 2025 01:00:07 +0000 https://towardsdatascience.com/?p=605856 How data symmetry informs neural network architectures

Adding Training Noise To Improve Detections In Transformers https://towardsdatascience.com/adding-training-noise-to-improve-detections-in-transformers/ Mon, 28 Apr 2025 17:56:52 +0000 https://towardsdatascience.com/?p=605817 Denoising, explained

The Art of Noise https://towardsdatascience.com/the-art-of-noise/ Thu, 03 Apr 2025 01:12:22 +0000 https://towardsdatascience.com/?p=605395 Understanding and implementing a diffusion model from scratch with PyTorch

Show and Tell https://towardsdatascience.com/show-and-tell-e1a1142456e2/ Mon, 03 Feb 2025 16:30:24 +0000 https://towardsdatascience.com/show-and-tell-e1a1142456e2/ Implementing one of the earliest neural image caption generator models with PyTorch.

DeepSeek-V3 Explained 1: Multi-head Latent Attention https://towardsdatascience.com/deepseek-v3-explained-1-multi-head-latent-attention-ed6bee2a67c4/ Fri, 31 Jan 2025 10:02:05 +0000 https://towardsdatascience.com/deepseek-v3-explained-1-multi-head-latent-attention-ed6bee2a67c4/ Key architecture innovation behind DeepSeek-V2 and DeepSeek-V3 for faster inference

The Three Phases of Learning Machine Learning https://towardsdatascience.com/the-three-phases-of-learning-machine-learning-df0a53148dd3/ Tue, 28 Jan 2025 13:02:11 +0000 https://towardsdatascience.com/the-three-phases-of-learning-machine-learning-df0a53148dd3/ Part One: The beginner phase

Understanding the Evolution of ChatGPT: Part 3- Insights from Codex and InstructGPT https://towardsdatascience.com/understanding-the-evolution-of-chatgpt-part-3-insights-from-codex-and-instructgpt-04ece2967bf7/ Tue, 21 Jan 2025 18:19:27 +0000 https://towardsdatascience.com/understanding-the-evolution-of-chatgpt-part-3-insights-from-codex-and-instructgpt-04ece2967bf7/ Mastering the art of fine-tuning: Learnings for training your own LLMs.

Influential Time-Series Forecasting Papers of 2023-2024: Part 1 https://towardsdatascience.com/influential-time-series-forecasting-papers-of-2023-2024-part-1-1b3d2e10a5b3/ Fri, 17 Jan 2025 12:02:18 +0000 https://towardsdatascience.com/influential-time-series-forecasting-papers-of-2023-2024-part-1-1b3d2e10a5b3/ Exploring the latest advancements in time series

Why Data Scientists Can’t Afford Too Many Dimensions and What They Can Do About It https://towardsdatascience.com/why-data-scientists-cant-afford-too-many-dimensions-and-what-they-can-do-about-it-653230d50f9c/ Thu, 16 Jan 2025 13:32:00 +0000 https://towardsdatascience.com/why-data-scientists-cant-afford-too-many-dimensions-and-what-they-can-do-about-it-653230d50f9c/ An in-depth article about dimensionality reduction and its most popular methods

Understanding Flash Attention: Writing the Algorithm from Scratch in Triton https://towardsdatascience.com/understanding-flash-attention-writing-the-algorithm-from-scratch-in-triton-5609f0b143ea/ Wed, 15 Jan 2025 17:01:59 +0000 https://towardsdatascience.com/understanding-flash-attention-writing-the-algorithm-from-scratch-in-triton-5609f0b143ea/ Find out how Flash Attention works. Afterward, we'll refine our understanding by writing a GPU kernel of the algorithm in Triton.

LossVal Explained: Efficiently Estimate the Importance of Your Training Data https://towardsdatascience.com/lossval-explained-efficiently-estimate-the-importance-of-your-training-data-cef557434bf8/ Wed, 15 Jan 2025 14:01:59 +0000 https://towardsdatascience.com/lossval-explained-efficiently-estimate-the-importance-of-your-training-data-cef557434bf8/ How to Exploit the Loss Function for Efficient Data Valuation

From Darwin to Deep Work https://towardsdatascience.com/from-darwin-to-deep-work-0db4bf1761d8/ Tue, 14 Jan 2025 13:02:21 +0000 https://towardsdatascience.com/from-darwin-to-deep-work-0db4bf1761d8/ Focus Strategies for Machine Learning Practitioners
