r/datasets

219k members
r/datasets is a subreddit with 219k members. The most common kinds of discussion are solution requests and advice requests. The most frequent topics are "dataset", "looking for", "looking", "looking for a", and "AI", and the product category members most often recommend or review is "database".
A place to share, find, and discuss Datasets.

Popular Themes in r/datasets

#1
Solution Requests
: "exercisedb down? Anyone know alternatives?"
14 posts
#2
Advice Requests
: "What is a dataset that you can’t believe is available to the public?"
12 posts
#3
Self-Promotion
: "[Self Promotion] Feature Extracted Human and Synthetic Voice datasets - free research use, legally clean, no audio."
2 posts
#4
Pain & Anger
: "tested some proxy providers for city-level geotrgeting and most of them lied to me"
1 post

Popular Topics in r/datasets

#1

Dataset

: "I scraped over 2 million job postings across 100,000+ company career sites into a unified, daily-updated Dataset."
141 posts
#2

Looking For

: "Looking For a character network dataset for Dracula by Bram Stoker"
28 posts
#3

Looking

25 posts
#4

Looking For A

19 posts
#5

AI

: "jobdatapool is a forever free dataset validated by humans and curated by humans for Ai"
18 posts
#6

API

: "What is the best travel search Api (flights, hotels, etc) today?"
14 posts
#7

Datasets

: "Open, self-hostable pipeline for U.S. financial Datasets — SEC filings (full-text), 13F holdings, insider and congressional trades, FINRA short data, FRED, CFTC, CBOE"
13 posts
#8

Research

: "African Countries: A Curated Dataset on Africa Indicators for Education and Data Science"
12 posts
#9

Data

10 posts
#10

Open Source

: "Built an alternative to OpenCorporates using strictly first-party government data. Looking for feedback."
9 posts

Products Discussed in r/datasets

Database

1 review
#1
Stanford
5.0 from 1 review

Flair Used in r/datasets

#1
dataset
: "I scraped over 2 million job postings across 100,000+ company career sites into a unified, daily-updated dataset."
63 posts
#2
resource
: "I got tired of checking Kaggle, HuggingFace, data.gov, and other sites every time I needed a dataset, so I built a tool that searches all of them at once"
50 posts
#3
question
: "would anyone use a voice interface for querying the 3.5M epstein files pages?"
37 posts
#4
request
: "[Slef-promotion][Synthetic] I built a 100K-row sleep health dataset from scratch - it just earned a Kaggle Silver Medal (7,800 views, 1,700+ downloads in 2 weeks)"
34 posts
#5
discussion
: "tested some proxy providers for city-level geotrgeting and most of them lied to me"
8 posts
#6
API
: "Business profile data API — looking for feedback on fields, samples, and data quality"
3 posts
#7
survey
: "I built a free tool that lets you click anywhere on a map and get weather, terrain, vegetation, and hazard data. Looking for honest feedback from GIS professionals"
2 posts
#8
mock dataset
: "Open-source tool for schema-driven synthetic data generation for testing data pipelines"
1 post

Member Growth in r/datasets

Yearly
+14k members (6.8%)
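
As a rough check on how the yearly figure relates to the member count, the sketch below assumes the percentage is measured against the member count one year earlier (current members minus the yearly gain). That is an assumption about how the page computes it, not something the page states, and the inputs are the rounded figures shown here (219k and 14k). The function name `yearly_growth_pct` is hypothetical.

```python
def yearly_growth_pct(current_members: int, gained: int) -> float:
    """Percent growth relative to the member count one year earlier.

    Assumes the baseline is (current_members - gained); the page does not
    say which baseline it uses.
    """
    previous = current_members - gained
    if previous <= 0:
        raise ValueError("gained must be smaller than current_members")
    return 100.0 * gained / previous


# Rounded figures from this page: 219k members now, +14k over the year.
growth = yearly_growth_pct(219_000, 14_000)
print(f"{growth:.1f}%")  # prints 6.8%
```

Under that assumption the rounded inputs reproduce the 6.8% shown above; with unrounded counts the last digit could differ.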

Similar Subreddits to r/datasets

r/artificial
1.3M members
17.4% / yr

r/data
52k members
16.4% / yr

r/digital_marketing
356k members
37.4% / yr

r/LanguageTechnology
64k members
14.1% / yr

r/learnpython
1.0M members
10.7% / yr

r/MachineLearning
3.1M members
2.6% / yr

r/Python
1.5M members
9.0% / yr

r/research
59k members
52.1% / yr

r/SEO
488k members
23.0% / yr

r/smallbusiness
2.5M members
13.9% / yr

About

GummySearch helps people research Reddit communities by organizing activity, growth, themes, and post-level signals into one place.

This page gives a focused view of r/datasets, including current member size, discussion patterns, product reviews, and related communities to explore.

This data is synced periodically so insights stay current and useful for ongoing research.

Last updated: June 11, 2026