r/datasets

218k members
r/datasets is a subreddit with 218k members. The most common kinds of discussion are solution requests and advice requests. The most frequently discussed topics are "dataset", "looking for", "looking", "looking for a", and "datasets", and the most frequently recommended or reviewed product category is "database".
A place to share, find, and discuss Datasets.

Popular Themes in r/datasets

#1
Solution Requests
: "Need dataset for global monthly oil prices"
11 posts
#2
Advice Requests
: "Looking for realistic datasets for analytics + ML projects after running into synthetic data issues"
10 posts
#3
Pain & Anger
: "tested some proxy providers for city-level geotargeting and most of them lied to me"
2 posts
#4
Ideas
: "Need fun project ideas for a 3 node physical cluster (Uni Project)"
1 post

Popular Topics in r/datasets

#1

Dataset

: "I got tired of checking Kaggle, HuggingFace, data.gov, and other sites every time I needed a dataset, so I built a tool that searches all of them at once"
95 posts
#2

Looking For

: "Looking for a character network dataset for Dracula by Bram Stoker"
28 posts
#3

Looking

25 posts
#4

Looking For A

19 posts
#5

Datasets

: "Open, self-hostable pipeline for U.S. financial datasets — SEC filings (full-text), 13F holdings, insider and congressional trades, FINRA short data, FRED, CFTC, CBOE"
13 posts
#6

AI

: "We just captured 1800+ human motion sequences for AI model training. Here's what 4 days of continuous motion capture looks like."
13 posts
#7

Data

9 posts
#8

API

: "[Tool] Built an API to instantly extract any public HTML table or Wikipedia page into a clean JSON data matrix"
8 posts
#9

Synthetic

: "Do you consider synthetic datasets useful for real-world data work?"
8 posts
#10

Us

: "hand-drawn music scores paired with their digital vector and text representations"
6 posts

Products Discussed in r/datasets

Database

1 review
#1
Stanford
5.0 from 1 review

Flair Used in r/datasets

#1
resource
: "I got tired of checking Kaggle, HuggingFace, data.gov, and other sites every time I needed a dataset, so I built a tool that searches all of them at once"
41 posts
#2
dataset
: "I built an open-source dataset of every major US layoff"
39 posts
#3
question
: "I can scrape/aggregate pretty much any fragmented public data. What datasets are missing"
28 posts
#4
request
: "Looking for realistic datasets for analytics + ML projects after running into synthetic data issues"
20 posts
#5
discussion
: "tested some proxy providers for city-level geotargeting and most of them lied to me"
12 posts
#6
API
: "[Tool] Built an API to instantly extract any public HTML table or Wikipedia page into a clean JSON data matrix"
2 posts
#7
mock dataset
: "UK GDPR Small Business Q&A — 5,000 synthetic pairs with article-level citations [Synthetic]"
1 post

Member Growth in r/datasets

Yearly
+14k members (6.6%)

Similar Subreddits to r/datasets

r/artificial
1.3M members
17.4% / yr

r/digital_marketing
355k members
37.6% / yr

r/fintech
86k members
95.7% / yr

r/LanguageTechnology
64k members
14.3% / yr

r/MachineLearning
3.1M members
2.6% / yr

r/Python
1.5M members
9.0% / yr

r/research
58k members
51.4% / yr

r/SEO_Digital_Marketing
73k members
40.7% / yr

r/smallbusiness
2.5M members
13.8% / yr

r/writing
3.4M members
6.4% / yr

About

GummySearch helps people research Reddit communities by organizing activity, growth, themes, and post-level signals into one place.

This page gives a focused view of r/datasets, including current member size, discussion patterns, product reviews, and related communities to explore.

This data is synced periodically so insights stay current and useful for ongoing research.

Last updated: June 1, 2026