r/DataCentricAI is a subreddit with 569 members. The most common kinds of discussions are solution requests and pain & anger, and the community frequently discusses ai, data, machine learning, datasets, and ml_models.
If "80% of Machine Learning is simply data cleaning", perhaps we should focus on the data.
A community for discussions on how to make the most of our datasets.
Resource hub: https://mindkosh.com/data-centric-ai
Popular Themes in r/DataCentricAI
#1 Solution Requests: "Data-centric AI resources" (12 posts)
#2 Pain & Anger: "The diversity problem plaguing the Machine Learning community" (8 posts)
#3 Advice Requests: "Understanding Gradient based adversarial attacks." (5 posts)
#4 Self-Promotion: "Great Expectations - An open source tool for Data validation and profiling" (5 posts)
#5 News: "DeepMind buys Physics simulator MuJuCo, will open-source it soon!" (2 posts)
#6 Ideas: "We built AI to do human things." (1 post)
#7 Money Talk: "What is healthcare data analyst salary?" (1 post)
#8 Opportunities: "Some jobs disappear, but new ones always sneak in." (1 post)
Popular Topics in r/DataCentricAI
#1 Ai: "Checkout the latest issue of our Ai and ML newsletter - Mindkosh Ai Review" (13 posts)
#2 Data: "We test models. We never test Datasets." (7 posts)
#3 Machine Learning: "MLEM - The First Open, Git-based Machine Learning Model Deployment and Management Tool Introduced" (6 posts)
#4 Datasets: "Distilling Datasets into smaller, synthetic Datasets" (6 posts)
#5 Ml_models: "The breakdown of Zillow's price prediction Machine Learning models due to COVID." (6 posts)
#6 Terraform: "Managing AI/ML workloads for cloud providers with TPI Terraform plugin" (4 posts)
#7 Cloud: "Managing AI/ML workloads for Cloud providers with TPI Terraform plugin" (4 posts)
#8 Adversarial_attacks: "Understanding Gradient based adversarial attacks." (3 posts)
#9 Model: "Moral of the story- your Model learns what you feed it." (3 posts)
#10 Data_augmentation: "AutoAugment - Automatically augment training datasets using Reinforcement Learning" (2 posts)
Flair Used in r/DataCentricAI
#1 Discussion: "The breakdown of Zillow's price prediction Machine Learning models due to COVID." (22 posts)
#2 Resource: "Updated list of Open source tools in Data Centric AI" (19 posts)
#3 Research Paper Shorts: "A few hundred data samples might be worth billions of parameters" (18 posts)
#4 AI/ML: "DeepMind buys Physics simulator MuJuCo, will open-source it soon!" (14 posts)
#5 Tool: "Great Expectations - An open source tool for Data validation and profiling" (8 posts)
#6 Meme: "Its 2AM and you just received your 100th out of memory error." (2 posts)
#7 Concept Explainer: "Understanding Gradient based adversarial attacks." (2 posts)
#8 How do I do this?: "Any good libraries for dataset validation?" (2 posts)
Member Growth in r/DataCentricAI
Yearly: +42 members (8.0%)
Similar Subreddits to r/DataCentricAI
r/ArtificialInteligence: 1.8M members, 23.2% / yr
r/CIO: 17k members, 23.5% / yr
r/DataScienceMemes: 6k members, 5.0% / yr
r/devops: 496k members, 23.0% / yr
r/kaggle: 23k members, 27.6% / yr
r/learnmachinelearning: 650k members, 24.2% / yr
r/MachineLearning: 3.1M members, 2.6% / yr
r/MLQuestions: 107k members, 37.9% / yr
r/neuralnetworks: 40k members, 43.3% / yr
r/SAP: 59k members, 35.9% / yr
About
GummySearch helps people research Reddit communities by organizing activity, growth, themes, and post-level signals into one place.
This page gives a focused view of r/DataCentricAI, including current member size, discussion patterns, product reviews, and related communities to explore.
This data is synced periodically so insights stay current and useful for ongoing research.
Last updated: June 16, 2026