Machine Learning with cuML and cuDF

Tejas Kamble
April 5, 2025
cuDF,cuML,Data Science,GPU,Machine learning

GPU Acceleration: cuDF as a pandas Accelerator

cuDF Now Offers Zero-Code-Change Acceleration for pandas

cuDF (pronounced “KOO-dee-eff”) is a GPU-accelerated DataFrame library for data manipulation. It builds on NVIDIA’s libcudf C++/CUDA backend and the Apache Arrow columnar format to provide a pandas-like API, often with large speedups on sizable datasets.

Two Powerful Implementation Options

Option 1: Direct cuDF Usage

Import cuDF directly and utilize its pandas-compatible API for GPU-accelerated data operations:

import cudf

# Load data directly to GPU memory
tips_df = cudf.read_csv("https://github.com/plotly/datasets/raw/master/tips.csv")

# Perform calculations with GPU acceleration
tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"] * 100

# Execute GPU-accelerated groupby operations
print(tips_df.groupby("size").tip_percentage.mean())

Option 2: Transparent Pandas Acceleration

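The cudf.pandas extension accelerates existing pandas workflows without code changes. In a Jupyter or IPython session, load the extension before importing pandas, as shown here (a standalone script can instead be run with python -m cudf.pandas script.py):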

# Enable GPU acceleration for pandas operations
%load_ext cudf.pandas

# Use familiar pandas syntax - now powered by GPU!
import pandas as pd

tips_df = pd.read_csv("https://github.com/plotly/datasets/raw/master/tips.csv")
tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"] * 100

# Same syntax, dramatically faster execution
print(tips_df.groupby("size").tip_percentage.mean())

The cudf.pandas extension supports 100% of the pandas API: it uses cuDF for the operations cuDF supports and falls back to pandas on the CPU for everything else.
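The dispatch-and-fallback behaviour described above can be pictured with a small pure-Python sketch. This is not how cudf.pandas is implemented; FallbackProxy, fast_sum, and slow_sum are hypothetical names used only to illustrate trying an accelerated path first and reverting to a reference implementation when the accelerated one does not support the request.

```python
class FallbackProxy:
    """Try an accelerated implementation first; fall back to a reference one.

    Illustration only: this class and the functions below are hypothetical
    and do not mirror cudf.pandas internals.
    """

    def __init__(self, fast, slow):
        self.fast = fast
        self.slow = slow
        self.calls = []  # records which path served each call

    def __call__(self, *args, **kwargs):
        try:
            result = self.fast(*args, **kwargs)
            self.calls.append("fast")
        except NotImplementedError:
            # The accelerated path declined the request; use the slow path.
            result = self.slow(*args, **kwargs)
            self.calls.append("slow")
        return result


def fast_sum(values):
    # Stand-in for an accelerated kernel that only handles floats.
    if not all(isinstance(v, float) for v in values):
        raise NotImplementedError("only float input is supported")
    return sum(values)


def slow_sum(values):
    # Stand-in for the general-purpose reference implementation.
    return sum(values)


total = FallbackProxy(fast_sum, slow_sum)
print(total([1.0, 2.0, 3.0]), total.calls[-1])  # served by the fast path
print(total([1, 2, 3]), total.calls[-1])        # falls back to the slow path
```

The caller sees one function and one result type either way, which is the property that lets existing code run unchanged.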

Comprehensive Resources

Get Started Immediately:

Interactive Demo: Explore cudf.pandas on a free GPU-enabled Google Colab instance
Installation Guide: Detailed instructions for setting up cuDF and the entire RAPIDS ecosystem
Documentation: Access comprehensive guides for both Python cuDF and the underlying C++/CUDA libcudf
Community Support: Join the RAPIDS community to get assistance, contribute to development, and collaborate with other users

Technical Requirements and Installation Options

Hardware and CUDA Requirements

CUDA Version: 11.2 or newer
NVIDIA Driver: 450.80.02 or newer
GPU Architecture: Volta or newer (Compute Capability 7.0+)
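Version strings such as 450.80.02 do not compare correctly as plain text, so a check against the minimums above is easier with numeric tuples. The following is a minimal sketch; parse_version, meets_minimum, and check_system are hypothetical helpers, and the detected values are made up for illustration rather than read from a real system.

```python
def parse_version(text):
    """Turn '450.80.02' into (450, 80, 2) so tuples compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(found, required):
    """Return True when version string `found` is at least `required`."""
    return parse_version(found) >= parse_version(required)


# Minimums copied from the requirements list above.
MINIMUMS = {
    "cuda": "11.2",
    "driver": "450.80.02",
    "compute_capability": "7.0",
}


def check_system(found):
    """Map each requirement name to True/False for a dict of detected versions."""
    return {
        name: meets_minimum(found[name], minimum)
        for name, minimum in MINIMUMS.items()
    }


# Hypothetical detected values, not real measurements.
example = {"cuda": "12.4", "driver": "550.54.15", "compute_capability": "8.6"}
print(check_system(example))
```

Comparing tuples of integers avoids the trap where the string "11.10" sorts before "11.2".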

Installation Methods

Via pip: Choose the appropriate package based on your CUDA environment:

For CUDA 11.x environments:

pip install --extra-index-url=https://pypi.nvidia.com cudf-cu11

For CUDA 12.x environments:

pip install --extra-index-url=https://pypi.nvidia.com cudf-cu12
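For setup scripts, the choice between the two packages can be automated from the CUDA version string. This is a small sketch assuming only the two package names shown above; cudf_pip_command is a hypothetical helper, not part of RAPIDS.

```python
def cudf_pip_command(cuda_version):
    """Build the pip command for a CUDA version string such as '12.4'."""
    major = int(cuda_version.split(".")[0])
    # Package names taken from the two commands above.
    packages = {11: "cudf-cu11", 12: "cudf-cu12"}
    if major not in packages:
        raise ValueError(f"no cuDF package listed here for CUDA {major}.x")
    return (
        "pip install --extra-index-url=https://pypi.nvidia.com "
        + packages[major]
    )


print(cudf_pip_command("12.4"))
```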

Via conda: Install through the rapidsai channel using miniforge:

conda install -c rapidsai -c conda-forge -c nvidia \
    cudf=25.06 python=3.12 cuda-version=12.8

Nightly builds from the latest development branch are also available for those seeking cutting-edge features.

Important Compatibility Notes:

cuDF currently supports Linux operating systems only
Compatible with Python 3.10 and later versions
For detailed compatibility information, consult the comprehensive RAPIDS installation documentation

Transform your data processing workflow today with GPU-accelerated pandas operations!

cuML: Powering Machine Learning with GPU Acceleration

Unleash Lightning-Fast Machine Learning with RAPIDS cuML Library

cuML delivers a comprehensive suite of GPU-accelerated machine learning algorithms that maintain API compatibility with popular frameworks like scikit-learn. This powerful library enables data scientists, researchers, and engineers to leverage GPU computing for traditional ML tasks without requiring CUDA programming expertise.

Performance That Transforms Workflows

On large datasets, cuML’s GPU implementations can complete 10-50x faster than their CPU equivalents; the actual speedup depends on the algorithm, the data size, and the hardware. The cuML Benchmarks Notebook documents the measurements.
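A speedup figure is only meaningful for a specific workload, so it is worth measuring on your own data. The harness below is a minimal sketch using only the standard library; best_time and speedup are hypothetical helpers, and cpu_fit and gpu_fit are placeholder workloads standing in for, say, a scikit-learn fit and the matching cuML fit.

```python
import time


def best_time(func, repeats=5):
    """Return the fastest wall-clock time of func() over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # The minimum discards runs inflated by one-time start-up costs.
    return min(timings)


def speedup(baseline, candidate, repeats=5):
    """How many times faster candidate is than baseline."""
    return best_time(baseline, repeats) / best_time(candidate, repeats)


# Placeholder workloads; replace with real CPU and GPU calls.
def cpu_fit():
    sum(i * i for i in range(200_000))


def gpu_fit():
    sum(i * i for i in range(20_000))


print(f"speedup: {speedup(cpu_fit, gpu_fit):.1f}x")
```

Taking the best of several runs matters on GPUs in particular, because the first call typically pays for device initialisation.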

Simple Implementation with Familiar Syntax

Here’s how easily you can implement DBSCAN clustering using cuML with GPU-accelerated cuDF DataFrames:

import cudf
from cuml.cluster import DBSCAN

# Create and populate a GPU DataFrame
gdf_float = cudf.DataFrame()
gdf_float['0'] = [1.0, 2.0, 5.0]
gdf_float['1'] = [4.0, 2.0, 1.0]
gdf_float['2'] = [4.0, 2.0, 1.0]

# Setup and fit clusters
dbscan_float = DBSCAN(eps=1.0, min_samples=1)
dbscan_float.fit(gdf_float)

print(dbscan_float.labels_)

Output:

0    0
1    1
2    2
dtype: int32
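The three distinct labels follow from the geometry of the data: every pair of rows is more than eps=1.0 apart, and with min_samples=1 each point qualifies as a core point on its own. The pure-Python sketch below reproduces that reasoning on the CPU. It is a simplified textbook DBSCAN, not cuML's implementation, and it assumes the scikit-learn convention that a point counts toward its own neighbourhood.

```python
import math


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or -1 for noise."""
    n = len(points)
    # A point's neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        labels[i] = cluster
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                frontier.extend(neighbours[j])  # j is a core point: expand
        cluster += 1
    return labels


# Rows of the DataFrame above: one value from each of columns '0', '1', '2'.
rows = [(1.0, 4.0, 4.0), (2.0, 2.0, 2.0), (5.0, 1.0, 1.0)]
print(dbscan(rows, eps=1.0, min_samples=1))  # [0, 1, 2]
```

The closest pair of rows is 3.0 apart, so raising eps above that value would start merging points into shared clusters.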

Multi-GPU and Distributed Computing Capability

cuML also scales beyond a single GPU: using Dask, it can spread work across multiple GPUs and multiple nodes. The example below starts a single-node cluster with one worker per GPU and runs a distributed nearest-neighbors query:

# Initialize UCX for high-speed transport of CUDA arrays
from dask_cuda import LocalCUDACluster

# Create a Dask single-node CUDA cluster w/ one worker per device
cluster = LocalCUDACluster(protocol="ucx",
                           enable_tcp_over_ucx=True,
                           enable_nvlink=True,
                           enable_infiniband=False)

from dask.distributed import Client
client = Client(cluster)

# Read CSV file in parallel across workers
import dask_cudf
df = dask_cudf.read_csv("/path/to/csv")

# Fit a NearestNeighbors model and query it
from cuml.dask.neighbors import NearestNeighbors
nn = NearestNeighbors(n_neighbors=10, client=client)
nn.fit(df)
neighbors = nn.kneighbors(df)
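To make the query concrete, here is a brute-force CPU sketch of what a k-nearest-neighbors lookup computes. It assumes the scikit-learn convention of returning a pair of distances and indices; the kneighbors function and the four sample points are illustrative only and say nothing about how cuML distributes the work.

```python
import math


def kneighbors(data, queries, n_neighbors):
    """Return (distances, indices) of the nearest rows in data per query."""
    all_distances, all_indices = [], []
    for query in queries:
        # Rank every row by its Euclidean distance to the query.
        ranked = sorted(
            (math.dist(query, row), i) for i, row in enumerate(data)
        )
        nearest = ranked[:n_neighbors]
        all_distances.append([d for d, _ in nearest])
        all_indices.append([i for _, i in nearest])
    return all_distances, all_indices


# Made-up sample points for illustration.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
distances, indices = kneighbors(data, data, n_neighbors=2)
print(indices)  # [[0, 1], [1, 0], [2, 0], [3, 2]]
```

Because the model is queried with the same rows it was fitted on, each row's nearest neighbor is itself at distance 0. The brute-force cost grows with the product of the query and data sizes, which is the workload that benefits from being split across GPUs.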

Extensive Algorithm Support

cuML provides GPU-accelerated implementations across diverse machine learning categories:

Clustering

DBSCAN with multi-node multi-GPU support via Dask
HDBSCAN for hierarchical density-based clustering
K-Means with distributed computing capabilities
Single-Linkage Agglomerative Clustering

Dimensionality Reduction

PCA and Incremental PCA
Truncated SVD
UMAP with multi-node GPU inference
Random Projection and t-SNE

Linear Models

Linear Regression (OLS)
Regularized regression (Lasso, Ridge, ElasticNet)
LARS Regression (experimental)
Logistic Regression
Naive Bayes with distributed support

Non-linear Models

Random Forest Classification and Regression
Forest Inference Library (FIL)
K-Nearest Neighbors Classification and Regression
Support Vector Machine Classifier (SVC)
Epsilon-Support Vector Regression (SVR)

Time Series Analysis

Holt-Winters Exponential Smoothing
ARIMA/SARIMA models

Model Explanation

SHAP Kernel Explainer
SHAP Permutation Explainer

Additional Features

Comprehensive preprocessing tools
Flexible device interoperability between CPU and GPU
Multi-node accelerated KNN search with Faiss integration

Resources and Installation

cuML’s complete documentation includes detailed API references and example notebooks demonstrating implementations across various use cases. The extensive notebooks-contrib repository provides end-to-end application examples.

For installation instructions and compatibility details, consult the RAPIDS Release Selector to find the appropriate command line for installing either nightly builds or official releases.