Best ETL tool on Reddit

243 reviews from r/dataengineering, r/ETL, r/BusinessIntelligence and 4 more subreddits

#1

Fivetran

4.0
(22)
"Fivetran + dbt will solve it for most use cases."
·
"Fivetran is very popular - they are an industry leader and have a ton of prebuilt connectors. Fivetran just works..."
·
"Handles data loading from various services"
·
"Fivetran / airbyte / dlt for data extraction and ingestion"
·
"Fivetran for the repository and data replication."
·
"The most common tools are Fivetran and Airbyte."
·
"Fivetran is only for Extract and Load and it is simple to use so gets wider adoption."
·
"Fivetran/Qlik are not ETL per se, but great ELT enablers."
·
"It's good"
·
"Fivetran is the easiest to use with the most available sources and targets."
·
#2

Talend

3.7
(21)
"Talend also offers a free version."
·
"I loved using Talend Open Studio and orchestrating with Rundeck back in the day"
·
"Pretty terrific"
·
"Best bets"
·
"Top free tool right now"
·
"Talend is easy to use and has a lot of built-in integrations."
·
"If you are a small team I can suggest Talend, you can quickly build up ETL with minimum infrastructure setup."
·
"Cost effective"
·
"Talend, Mulesoft, Informatica"
·
"Try Talend Open Studio (open source); it will suit your needs well."
·
#3

dbt

4.5
(17)
"Dbt simplifies the data transformation process."
·
"DBT is just a build/modeling tool run on straight sql"
·
"We actually like our work because of them."
·
"Dbt is a good option."
·
"Dbt."
·
"Run snapshots on raw or staged tables"
·
"A lot of buzz about dbt lately, it’s focused on SQL, look into it."
·
"I personally love DBT for a framework."
·
"There are many transformation tools available such as dbt."
·
"Dbt has been around longer and is a solid choice for data transformation."
·
#4

Matillion

4.0
(12)
"Great program"
·
"I would suggest giving Matillion Productivity Cloud a go (Cloud SaaS). I'm yet to come across an equivalent ELT Cloud hosted tool that covers both integration and transformation."
·
"Fantastic"
·
"GUI based tooling"
·
"If the datasets are average and transformation are not complex, Matillion works and goes above a simple EL solution."
·
"Matillion is a top choice for its user-friendly design."
·
"Your credits are universal within Matillion, so you can start building out jobs in Matillion Data Productivity Cloud any time you like."
·
"Matillion Cloud is a scalable solution for ETL."
·
"GUI-based and powerful"
·
"Good performance and reasonable cost"
·
#5

Informatica

3.1
(14)
"Depends on your budget, timeframe and your skillset."
·
"Informatica provides robust features for data transformation."
·
"Cost effective"
·
"Informatica is a good option for file-based integrations."
·
"Talend, Mulesoft, Informatica"
·
"DataStage, Informatica, Talend could be some traditional ETL choices."
·
"You may want a standard ETL tool like Informatica."
·
"Cost effective"
·
"Informatica has modernized for the cloud, but it's for large enterprises."
·
"Informatica was good, but it's INCREDIBLY dated"
·
#6

Apache Airflow

4.0
(8)
"Airflow should be mentioned. Situation needs to be right though."
·
"Dbt or sqlmesh + Airflow"
·
"dbt and Apache Airflow"
·
"I’ve suggested using a mixture of Lambda + Airflow to build a somewhat 'Event-Driven' system."
·
"Apache Airflow."
·
"This may evolve into Airflow in the future."
·
"You can check **Airflow**. It's a task scheduler."
·
"Look at Apache Airflow."
#7

PySpark

4.0
(8)
"PySpark has become a pretty standard part of many DE stacks, and it's pretty battle-tested"
·
"Widely applicable and a more generalized skillset in addition to allowing ETL"
·
"Just use Pyspark."
·
"Going with something like Pyspark is great if you have a strong team."
·
"For pure ELT big things PySpark."
·
"Learning PySpark is valuable, but most ETL work can be automated."
·
"PySpark is fine if you have large datasets to process using Spark/Databricks clusters."
·
"Haven't used pyspark but we don't have big data requirements (yet)."
#8

Airbyte

4.4
(7)
"Just started exploring airbyte."
·
"The pricing is fair and per use"
·
"Both an open-source version and a cloud one"
·
"The most common tools are Fivetran and Airbyte."
·
"**Open-source**: sling-cli, airbyte, peerdb"
·
"Airbyte is a powerful tool for data ingestion."
#9

Airflow

4.3
(7)
"Using airflow for orchestration"
·
"Python-based"
·
"Consider running your scripts as Airflow tasks."
·
"Good for orchestration."
·
"Airflow for ingress with DBT for transformations."
·
"Airflow or dagster may be something you'd be interested in."
·
"If, however, you're going to deal with multiple independent data sources, you might want to look into more complex solutions like Airflow or Nifi."
#10

Pentaho

4.3
(7)
"Best bets"
·
"Free edition is available and very full featured"
·
"I'm quite fond of Pentaho personally - free edition is available and very full featured."
·
"Big fan of Pentaho. Open source and free goes a long way, if you don't mind the occasional bug."
·
"Pentaho is also a strong contender in the free ETL tools space."
·
"More user-friendly and easier to use"
·
"It is free and has lots of connectors."
#11

Apache

4.8
(6)
"Open source. Visual low code / no code. Metadata driven. It is a fork of Pentaho Kettle. It is easy to learn but with a lot of features."
·
"Pyspark is an excellent tool for big data processing."
·
"Apache Beam makes this pretty easy with the documentation."
·
"Lots of good choices mentioned here already, but one more is Apache Hop."
·
"Apache Airflow is scalable, extensible, and dynamic with configuration-as-code in Python."
·
"You'll still want to get to know Apache Spark in general."
#12

Snowflake

4.2
(6)
"One of the best etl tools"
·
"Snowflake, Fivetran for the repository and data replication."
·
"Snowflake is a lot easier to use compared to other solutions."
·
"Snowflake is among the top technologies to consider."
·
"Snowflake also has some integrations that work directly with Azure Blob storage."
·
"Snowflake is one of the best ETL tools."
#13

Meltano

4.4
(5)
"Meltano + dbt is a great option."
·
"Use airbyte or meltano, or the best tool ever python, but if you wanna give a good platform, use meltano + python and dbt with airflow or dagster as orchestrator."
·
"Using meltano for staging/replication"
·
"Can probably make things work"
·
"Meltano is a handy ETL tool, I’ve used it for E&L before following up with dbt for T&L within the warehouse."
#14

MuleSoft

3.8
(5)
"No compromises"
·
"Depends on your budget, timeframe and your skillset."
·
"Talend, Mulesoft, Informatica"
·
"MuleSoft however data pipelines or CRMA may suffice."
·
"Mulesoft is overkill and extremely expensive for this integration."
#15

Hevo

4.8
(4)
"Really like it"
·
"Simple ETL that I find easier to use than Stitcher or Fivetran."
·
"We use Hevo for E&L for easy built-in integrations and webhooks."
·
"If you want a no-code kind of thing, go for Hevo."
#16

Python

4.5
(4)
"The best ETL tool is Python."
·
"A great library for boilerplate loading!"
·
"For pure ELT small things Python."
·
"Much as I love R I'd recommend Python and Airflow for ETL stuff."
#17

Databricks

4.3
(4)
"Databricks is the best for all kind of ETL operations."
·
"Databricks"
·
"Databricks is a popular technology in data engineering."
·
"I'm on Databricks"
#18

Ab Initio

4.7
(3)
"Ab initio is the best ETL tool."
·
"Can do such things if it makes sense price wise. Will run on your hardware and it's quite easy to parallelize graphs. Also supports metaprogramming so you might find a smart way to handle different file formats."
·
"Abinitio has very good parallel processing when your source data is huge."
#19

Integrate.io

4.7
(3)
"Integrate.io has done the job for my team for 3 years. No complaints."
·
"We did a pilot with Integrate.io three months ago and found they make data pipelines work faster than anyone else's."
·
"Gave Integrate.io a try and was straightforward overall, which is a win when I have a billion-and-one-things on our plate."
#20

Alteryx

4.7
(3)
"Huge wealth of information online in their forums"
·
"Alteryx is excellent for data engineering and analytics, allowing for seamless ETL processes."
·
"We use a tool called Alteryx for this I think."
#21

Etlworks

4.7
(3)
"Look at Etlworks. Linux, Windows, Docker, cloud, self-hosted. Hundreds of connectors."
·
"It does what you need."
·
"Not very well known but allows us to do almost everything ETL/ELT related."
#22

Stitch

4.3
(3)
"They now support more than 65 integrations, the most of any ETL vendor we’re aware of."
·
"Segment, Stitch, and Fivetran are all different but affordable"
·
"Doesn’t FiveTran, Stitch and Supermetrics (to a minor degree) do the same thing?"
#23

ETL Tools

4.0
(3)
"Ultimately, the best ETL tool for your business will depend on your specific needs, budget, and technical expertise."
·
"I would lean towards a coded etl solution, like python script, glue, what have you."
·
"Have you considered writing a custom tool in Go or Python?"
#24

Microsoft

5.0
(2)
"If you're already using Azure I'd recommend checking Azure Data Factory."
·
"Cheap, effective and battle tested."
#25

Dagster

4.5
(2)
"Great too"
·
"Dagster and dbt."
#26

Prefect

4.5
(2)
"Highly recommend it"
·
"A solid alternative for orchestration."
#27

Conduit

4.5
(2)
"Check out conduit. It's a fantastic ETL tool."
·
"Conduit is open-source so you can use it on your infrastructure... It focuses on real-time and CDC."
#28

dlt

4.5
(2)
"Has a small learning curve compared to a gui tool, but absolutely was worth the investment for me."
·
"+1 to dlt, dbt, and sqlmesh."
#29

PeerDB

4.0
(2)
"I've found PeerDB for Postgres - > Snowflake the best free and open source solution yet."
·
"**Open-source**: sling-cli, airbyte, peerdb"
#30

IBM DataStage

4.0
(2)
"DataStage"
·
"DataStage, Informatica, Talend could be some traditional ETL choices."
#31

Azure Data Factory

4.0
(2)
"Azure Data Factory is the best one. Easy to maintain, easy to develop, easy to create dynamic pipelines."
·
"No one uses azure data factory?? It's common where I've been..."
#32

n8n

4.0
(2)
"N8n"
·
"A bit more visual, but still quite simple would be n8n."
#33

Jitterbit

4.0
(2)
"Jitterbit has a free version available."
·
"Cost effective"
#34

Apache Hop

4.0
(2)
"Have you checked out Apache Hop?"
·
"Apache hop."
#35

AWS Glue

4.0
(2)
"I like AWS Glue because it’s very flexible."
·
"You could potentially replace Informatica with AWS Glue."
#36

Apache NiFi

4.0
(2)
"Apache Nifi"
·
"Have you considered Apache NiFi?"
#37

Estuary

4.0
(2)
"Estuary's CDC integration tool is cost-effective and easy to set up."
·
"Estuary is an up-and-coming option for affordable data extraction and loading."
#38

Mage

3.5
(2)
"I’ve been pleasantly surprised with Mage after a few days of playing around with it."
·
"I was hoping that as time goes on I'd start seeing more advocates for Mage."
#39

Mulesoft

3.0
(2)
"Mulesoft has integrations specifically built for Salesforce applications."
·
"Mulesoft is the 'no compromises' Middleware, but there are more cost-effective offerings."
#40

Singer

5.0
(1)
"We use Singer taps running in docker containers for the extract/load and DBT for the transform."
#41

dagster

5.0
(1)
"We use it for our data warehouse and are very happy with it."
#42

AirByte

5.0
(1)
"AirByte offers an open source version that can be easily set up with docker-compose."
#43

Ask On Data

5.0
(1)
"Ask On Data is a cutting-edge NLP-based ETL tool designed to streamline and optimize data processing workflows."
#44

Megalada

5.0
(1)
"Low-code + Visual Design + Performance is the best bunch for ETL."
#45

BigQuery

5.0
(1)
"BigQuery is recommended for data transformation."
#46

Data Studio

5.0
(1)
"Data Studio is free and excellent for dashboards and reports."
#47

Skyvia

5.0
(1)
"Skyvia is a standout option for simplifying data ingestion into Amazon Redshift."
#48

Google

5.0
(1)
"It's extremely easy to run on GCP with Dataflow."
#49

ClickHouse

5.0
(1)
"ClickHouse is great for analytical queries."
#50

Spark

5.0
(1)
"Spark is highly scalable and works well with iceberg tables."
#51

prophecy.io

5.0
(1)
"It supports both ETL and ELT; allows for visual development; gives you clean & editable Dbt Core and Airflow code."
#52

Metabase

5.0
(1)
"Personally quite fond of Metabase since it's easy to put in existing infrastructure."
#53

GetDBT.com

5.0
(1)
"Great tool for data transformation and integration with delta lake and airflow."
#54

Sprinkle Data

5.0
(1)
"I can recommend a tool which has a free plan and it might suit your use case perfectly. Personally, I have been using it for more than a couple of years now."
#55

SQL Server Integration Services

2.0
(2)
"I would think you could solve this in SSIS with a query to get the config data set and then a foreach loop."
·
"SSIS 🤡🤡🤡🤡🤡🤡"
#56

IBM

4.0
(1)
"IBM Sterling is also a strong choice for file-based integrations."
#57

GCP

4.0
(1)
"For GCP Pub/Sub."
#58

MinIO

4.0
(1)
"MinIO provides excellent object storage capabilities."
#60

Postgres

4.0
(1)
"Postgres is a reliable choice for data storage."
#61

Trino

4.0
(1)
"Stand up Trino and configure its PostgreSQL connector to point at each database."
#62

KNIME

4.0
(1)
"I'm using KNIME Analytics for different kinds of things."
#63

sqlmesh

4.0
(1)
"Sqlmesh seems like a more sanely thought-out tool set."
#64

portable.io

4.0
(1)
"Portable.io is a lower cost option that can extract and load data easily."
#65

Rivery

4.0
(1)
"Rivery is another affordable option for data extraction and loading."
#66

Coalesce

4.0
(1)
"Coalesce is great for transformation requirements, especially on Snowflake."
#67

Mage.ai

4.0
(1)
"Great for coding in Python."
#68

Rundeck

4.0
(1)
"Orchestrating with Rundeck back in the day was great."
#69

Coalesce.io

4.0
(1)
"Coalesce.io is a great transformation tool but requires an integration tool as well thus more vendors to deal with."
#70

Xplenty

3.0
(1)
"Decent performance but slightly expensive"
#71

Stitcher

3.0
(1)
"Trialed Stitcher"
#72

AWS

2.0
(1)
"AWS solutions like Redshift require a lot of maintenance."
#73

CloverETL

2.0
(1)
"CloverETL is not free but nothing really is."
#74

VM

1.0
(1)
"Why it isn't effective?"
