ETL Tool Reviews from Reddit
Summary
We analyzed 333 Reddit reviews from 45 posts across 12 subreddits, including r/dataengineering, r/ETL, r/salesforce, r/BusinessIntelligence, and r/datascience, to rank the ETL tool brands that redditors recommend. The top-ranked brands are Talend (3.8/5), Fivetran (4.0/5), and dbt (4.4/5).
Stats
Reviews: 333
Subreddits: 12
Posts: 45
Brands: 84
Products: 19
By Brand
#1
Talend
3.8
(28)
"Has been a game-changer for our team."
"Talend also offers a free version."
"I loved using Talend Open Studio and orchestrating with Rundeck back in the day"
"Pretty terrific"
"Best bets"
"Top free tool right now"
"Talend is easy to use, have lot of integrations inbuilt."
"If you are a small team I can suggest Talend, you can quickly build up ETL with minimum infrastructure setup."
"Talend offers an intuitive interface and strong data transformation features."
"Cost effective"
#2
Fivetran
4.0
(26)
"Fivetran + dbt will solve it for most use cases."
"Fivetran is very popular - they are an industry leader and have a ton of rebuilt connectors. Fivetran just works..."
"Fivetran is my favorite, but also i believe it is probably the most costly of those options."
"Handles data loading from various services"
"Fivetran / airbyte / dlt for data extraction and ingestion"
"Fivetran for the repository and data replication."
"The most common tools are Fivetran and Airbyte."
"Fivetran is only for Extract and Load and it is simple to use so gets wider adoption."
"Fivetran/ Qliks are not ETL per say, but great ELT enablers."
"It's good"
#3
dbt
4.4
(21)
"Dbt simplifies the data transformation process."
"DBT is just a build/modeling tool run on straight sql"
"We actually like our work because of them."
"Dbt is a good option."
"Dbt."
"Run snapshots on raw or staged tables"
"A lot of buzz about dbt lately, it’s focused on SQL, look into it."
"Dbt is a great tool for modeling data, all SQL driven."
"I personally love DBT for a framework."
"DBT for modeling"
#4
Airbyte
4.1
(17)
"Just started exploring airbyte."
"The pricing is fair and per use"
"Both an open-source version and a cloud one"
"I’ve been running Airbyte in a commercial setting for years now. It has gotten substantially better over that time and works well enough for reasonable use cases where data volume is not massive and transfer speed is not expected to be near real time."
"The most common tools are Fivetran and Airbyte."
"Airbyte and Fivetran should be seen as EL tools, not transformation tools"
"**Open-source**: sling-cli, airbyte, peerdb"
"We currently use Airbyte + dbt and built custom connectors for ERP’s."
"https://airbyte.com is worth checking out. Kind of like FiveTran except open source"
"I would recommend to look into Airbyte."
#5
Apache Airflow
4.0
(15)
"Open source job orchestrator would do the trick for you. Something like Apache Airflow or Prefect if you're a Python user."
"Go for open-source tools like Apache Airflow for data extraction"
"Consider Apache Airflow for orchestration"
"Airflow (orchestration)"
"Airflow should be mentioned. Situation needs to be right though."
"Dbt or sqlmesh + Airflow"
"Dbt and Apache air flow"
"For open-source ETL tools compatible with GCP, consider Apache NiFi, Talend Open Studio, and Airflow."
"I’ve suggested using a mixture of Lambda + Airflow to build a somewhat 'Event-Driven' system."
"Surprised to not have seen Airflow mentioned for scheduling yet. GCP has a lot of connections available for Airflow (e.g. GCS, BigQuery) and you can deploy managed Airflow via Cloud Composer too so you only have to deal with writing Python."
#6
Informatica
3.1
(16)
"Depends on your budget, timeframe and your skillset."
"Informatica all day."
"Cost effective"
"Informatica is a good option for file-based integrations."
"Talend, Mulesoft, Informatica"
"DataStage, Informatica, Talend could be some traditional ETL choices."
"You may want a standard ETL tool like Informatica."
"Some people will have a lot of business people involved, where it can make sense to use visual tools, like Talend and Informatica."
"Cost effective"
"Informatica have modernized for cloud, but, it's for large enterprise."
#7
Matillion
4.0
(12)
"Great program"
"I would suggest giving Matillion Productivity Cloud a go (Cloud SaaS). I'm yet to come across an equivalent ELT Cloud hosted tool that covers both integration and transformation."
"Fantastic"
"GUI based tooling"
"If the datasets are average and transformation are not complex, Matillion works and goes above a simple EL solution."
"Matillion is a top choice for its user-friendly design."
"Your credits are universal within Matillion, so you can start building out jobs in Matillion Data Productivity Cloud any time you like."
"Matillion Cloud is a scalable solution for ETL."
"GUI-based and powerful"
"Good performance and reasonable cost"
#8
Apache
4.9
(8)
"Pyspark is the way to go."
"Airflow"
"Open source. Visual low code / no code. Metadata driven. It is a fork of Pentaho Kettle. It is easy to learn but with a lot of features."
"Pyspark is an excellent tool for big data processing."
"Apache Beam makes this pretty easy with the documentation."
"Lots of good choices mentioned here already, but one more is Apache Hop."
"Apache Airflow is scalable, extensible, and dynamic with configuration-as-code in Python."
"You'll still want to get to know Apache Spark in general."
#9
Pentaho
4.1
(9)
"Best bets"
"Free edition is available and very full featured"
"I'm quite fond of Pentaho personally - free edition is available and very full featured."
"Big fan of Pentaho. Open source and free goes a long way, if you don't mind the occasional bug."
"We have been analyzing Hevo, Talend, Pentaho, Airbyte, etc. They suit very well for SMEs."
"Pentaho is also a strong contender in the free ETL tools space."
"More user-friendly and easier to use"
"It is free and has lots of connectors."
"Pentaho is a good solution if the team building and maintaining the ETL jobs aren't comfortable with Python."
#10
Apache NiFi
4.0
(8)
"Also would recommend checking out Apache NiFi. It’s easy to start with, pretty powerful out of the box and you can customize where you want/need."
"For open-source ETL tools compatible with GCP, consider Apache NiFi, Talend Open Studio, and Airflow."
"If you want a low code versatile data broker, NiFi is a great choice, lots of features and capabilities."
"Apache NiFi could be a good option for you."
"Apache Nifi"
"Have you considered Apache NiFi?"
"You might want to give a tool like nifi a try"
"Apache Nifi"
#11
Airflow
4.3
(7)
"Using airflow for orchestration"
"Python-based"
"Consider running your scripts as Airflow tasks."
"Good for orchestration."
"Airflow for ingress with DBT for transformations."
"Airflow or dagster may be something you'd be interested in."
"If, however, you're going to deal with multiple independent data sources, you might want to look into more complex solutions like Airflow or Nifi."
#12
PySpark
4.0
(7)
"PySpark has become a pretty standard part of many DE stacks, and it's pretty battle-tested"
"Widely applicable and a more generalized skillset in addition to allowing ETL"
"Just use Pyspark."
"Going with something like Pyspark is great if you have a strong team."
"Learning PySpark is valuable, but most ETL work can be automated."
"Pyspark is fine if you have large datasets to process using spark/databrick clusters."
"Haven't used pyspark but we don't have big data requirements (yet)."
#13
dlt
4.5
(6)
"Dlt worked great for me getting salesforce to big query."
"Has a small learning curve compared to a gui tool, but absolutely was worth the investment for me."
"Open source, external (us) maintenance, runs on any env, means you cannot beat it in terms of price and privacy."
"Dlt (the python library, dlthub.com not Databricks Delta Live Tables) is great, but has a pretty steep learning curve."
"+1 to dlt, dbt, and sqlmesh."
"Gonna give this a try."
#14
Apache Hop
4.0
(6)
"Apache Hop (ETL pipeline)"
"Have you checked out Apache Hop?"
"Apache hop."
"Check out Apache Hop!"
"You might want to also check out Matt Casters' latest project, Apache Hop."
"Apache hop easily does all that"
#15
Integrate.io
4.6
(5)
"Streamlined our data processes significantly."
"Integrate.io has done the job for my team for 3 years. No complaints."
"We did a pilot with Integrate.io three months ago and found they make data pipelines work faster than anyone else's."
"Is user-friendly."
"Gave Integrate.io a try and was straightforward overall, which is a win when I have a billion-and-one-things on our plate."
#16
Meltano
4.4
(5)
"Meltano + dbt is a great option."
"Use airbyte or meltano, or the best tool ever python, but if you wanna give a good platform, use meltano + python and dbt with airflow or dagster as orchestrator."
"Using meltano for staging/replication"
"Meltano for ingestion; DBT (Core) for transformation, if you're good with ELT"
"Can probably make things work"
#17
Hevo
4.4
(5)
"Really like it"
"We use Hevo for E&L for easy built-in integrations and webhooks."
"If you want a no-code kind of thing, go for Hevo."
"We have been analyzing Hevo, Talend, Pentaho, Airbyte, etc. They suit very well for SMEs."
"If you want to use a single platform for both of your needs, I recommend using Hevo Data."
#18
Azure Data Factory
3.5
(6)
"Azure data factory the best one. Easy to maintain, easy to develop, easy create dynamic pipelines."
"Azure Data Factory is a solid choice"
"Then now using azure data factory. But that’s not open source."
"No one uses azure data factory?? It's common where I've been..."
"I’ve heard of Data Factory, and a tool hosted by any of the cloud services (AWS, GCP, Azure) is going to have better support and lifespan than something like the version of Pentaho that’s still around today."
"IICS, GCP Datafusion , azure data Factory"
#19
Snowflake
4.2
(5)
"One of the best etl tools"
"Snowflake, Fivetran for the repository and data replication."
"Snowflake is a lot easier to use compared to other solutions."
"Snowflake is among the top technologies to consider."
"Snowflake also has some integrations that work directly with Azure Blob storage."
#20
Databricks
4.2
(5)
"Databricks is the best for all kind of ETL operations."
"Databricks"
"Databricks"
"Databricks is a popular technology in data engineering."
"I'm on Databricks"
#21
Alteryx
3.8
(5)
"Huge wealth of information online in their forums"
"Alteryx."
"We use a tool called Alteryx for this I think."
"Alteryx"
"Alteryx, is the logical modern successor, but I've had problems in their server environment (Desktop was fine)"
#22
MuleSoft
3.8
(5)
"No compromises"
"Depends on your budget, timeframe and your skillset."
"Talend, Mulesoft, Informatica"
"MuleSoft however data pipelines or CRMA may suffice."
"Mulesoft is overkill and extremely expensive for this integration."
#23
Dagster
4.5
(4)
"Dagster is great, and it’s open source."
"Great too"
"Dagster for orchestration"
"Dagster and dbt."
#24
KNIME
4.3
(4)
"I like KNIME"
"KNIME all the way."
"If open-source is appealing, be sure to check out KNIME"
"KNIME. You need to install a separate node to allow python scripting but it is a great mix of both worlds IMO."
#25
Stitch
4.0
(4)
"They now support more than 65 integrations, the most of any ETL vendor we’re aware of."
"Segment, Stitch, and Fivetran are all different but affordable"
"Doesn’t FiveTran, Stitch and Supermetrics (to a minor degree) do the same thing?"
"Otherwise I would look for an ETL tool that already has connectors to the Saas tools they use. Stitch is one possibility."
#26
Ab Initio
4.7
(3)
"Ab initio is the best ETL tool."
"Can do such things if it makes sense price wise. Will run on your hardware and it's quite easy to parallelize graphs. Also supports metaprogramming so you might find a smart way to handle different file formats."
"Abinitio has very good parallel processing when your source data is huge."
#27
Python
4.3
(3)
"The best ETL tool is Python."
"Much as I love R I'd recommend Python and Airflow for ETL stuff."
"Personally I would just script it in a language like python."
#28
Prefect
4.3
(3)
"Highly recommend it"
"In a small company I setup prefect with python for Pipelines info a basic on prem SQL server."
"A solid alternative for orchestration."
#29
n8n
4.0
(3)
"No-code/low-code tools like n8n, fivetran, make.com could probably be pretty helpful."
"N8n"
"A bit more visual, but still quite simple would be n8n."
#30
ETL Tools
4.0
(3)
"Ultimately, the best ETL tool for your business will depend on your specific needs, budget, and technical expertise."
"I would lean towards a coded etl solution, like python script, glue, what have you."
"Have you considered writing a custom tool in Go or Python?"
#31
Talend Open Studio
4.0
(3)
"For open-source ETL tools compatible with GCP, consider Apache NiFi, Talend Open Studio, and Airflow."
"Talend Open Studio is what my team uses, it has a GUI and I like it"
"Talend Open Studio (based on java eclipse). Log4j enabled or custom component (tLogcatcher),Build and simply schedule with crontab."
#32
DuckDB
4.0
(3)
"DuckDB for processing"
"Use DuckDB or Postgres for your data warehouse."
"I might consider DuckDB instead of PostgreSQL depending on what he or she’s looking to do"
#33
Microsoft
5.0
(2)
"If you're already using Azure I'd recommend checking Azure Data Factory."
"Cheap, effective and battle tested."
#34
Sprinkle
4.5
(2)
"Really satisfied with their services."
"Sprinkle makes the overall process of data ingestion and transformation simple."
#35
Conduit
4.5
(2)
"Check out conduit. It's a fantastic ETL tool."
"Conduit is open-source so you can use it on your infrastructure... It focuses on real-time and CDC."
#36
DltHub
4.5
(2)
"Have you tried using dlthub?"
"Or Dlthub."
#37
Sprinkle Data
4.5
(2)
"I can recommend a tool which has a free plan and it might suit your use case perfectly. Personally, I have been using it for more than a couple of years now."
"You should definitely have a peek at Sprinkle Data, I think they meet all of your requirements and pretty much easy to use too."
#38
Estuary
4.5
(2)
"We're a startup real-time ETL tool by the name of Estuary.dev"
"Estuary is an up-and-coming option for affordable data extraction and loading."
#39
Rivery
4.0
(2)
"Offers automated data pipelines that are easy to setup."
"Rivery is another affordable option for data extraction and loading."
#40
StackWizard
4.0
(2)
"Stackwizard.com for this. Put in your ETL requirements and it will show you the most compatible ETL tools to evaluate."
"Try selecting your specific requirements and integrations in stackwizard.com and it’ll show you the most compatible ETL tool."
#41
AWS Glue
4.0
(2)
"I like AWS Glue because it’s very flexible."
"You could potentially replace Informatica with AWS Glue."
#42
PeerDB
4.0
(2)
"I've found PeerDB for Postgres - > Snowflake the best free and open source solution yet."
"**Open-source**: sling-cli, airbyte, peerdb"
#43
IBM DataStage
4.0
(2)
"DataStage"
"DataStage, Informatica, Talend could be some traditional ETL choices."
#44
Jitterbit
4.0
(2)
"Jitterbit has a free version available."
"Cost effective"
#45
Apache SeaTunnel
4.0
(2)
"I have had my eyes on Apache SeaTunnel for a while."
"Apache SeaTunnel"
#46
Skyvia
4.0
(2)
"If you need low-code ETL tools for SaaS integrations, Skyvia could fit well."
"Skyvia, as an example, is pretty simple to use (no code) ETL tool."
#47
NiFi
4.0
(2)
"We are using Nifi, open source, hosted in EC2"
"If, however, you're going to deal with multiple independent data sources, you might want to look into more complex solutions like Airflow or Nifi."
#48
PostgreSQL
4.0
(2)
"PostgreSQL/MySQL for storage"
"PostgreSQL (database)"
#49
Etlworks
4.0
(2)
"Check etlworks.com."
"Not very well known but allows us to do almost everything ETL/ELT related."
#50
Metabase
4.0
(2)
"Personally quite fond of Metabase since it's easy to put in existing infrastructure."
"Metabase, Pentaho, Apache Superset"
#51
Mage
3.5
(2)
"I’ve been pleasantly surprised with Mage after a few days of playing around with it."
"I was hoping that as time goes on I'd start seeing more advocates for Mage."
#52
Amazon Redshift
2.5
(2)
"Redshift is missing a lot of functionality inherent in Postgres."
"The issue with Redshift, and probably the main driver for us moving to Azure, is that it's really designed to ingest from S3 only."
#53
Atlan
5.0
(1)
"Made tracking lineage and ownership way smoother."
#54
Pentaho Data Integration (PDI)
5.0
(1)
"My engineer built a very complex and fully automated ETL solution with PDI, a VM, and a cron job scheduler. Currently migrating it to Apache Hop. He won’t use anything else. He is obsessed with PDI/Apache Hop."
#55
SQL Server Integration Services
2.0
(2)
"I would think you could solve this in SSIS with a query to get the config data set and then a foreach loop."
"SSIS 🤡🤡🤡🤡🤡🤡"
#56
Epitech Integrator
4.0
(1)
"If you are looking for an easy to use data integration tool with a short learning curve, I suggest you go to https://epitechintegrator.com/."
#57
Tabsdata
4.0
(1)
"If you are comfortable with python, you can use tabsdata to model, transform and export data of interest within the system."
#58
Wayfare
4.0
(1)
"There is also https://wayfare.ai if you want e2e with enterprise controls and security-first workflows"
#59
Stack Wizard
4.0
(1)
"Check [stack wizard](https://stackwizard.com) for the most compatible tool based on what you’re running."
#60
Dixer
4.0
(1)
"Try with https://dixer.stgo.do"
#61
DataOps Live
4.0
(1)
"DataOps.live"
#62
Preswald
4.0
(1)
"Preswald is a solid choice for cleaning, enriching, and visualizing your data without breaking the bank."
#63
Oracle Data Integrator
4.0
(1)
"Oracle data integrator, SAP Data services"
#64
Debezium
4.0
(1)
"Take a look at Debizium on Kafka or Pulsar, it handles and maintains schema changes."
#65
AskOnData
4.0
(1)
"A chat based GenAI powered data engineering tool."
#66
Matia.io
4.0
(1)
"Use Matia.io. It’s handy for ETL/RETL and they offer some cool data observability features."
#67
Conduit.io
4.0
(1)
"Conduit.io for me - not only does it have lots of connectors (both source and destination), but its the full realtime ETL pipeline all in one golang binary."
#68
Palantir Foundry
4.0
(1)
"Palantir Foundry"
#69
Node-RED
4.0
(1)
"Personally I would use nodered."
#70
TimeXtender
4.0
(1)
"It's been a few years now, but for a GUI application I found timextender to be quite nice."
#71
Apache Beam
4.0
(1)
"Apache Beam / Dataflow (GCP)"
#72
HVR
4.0
(1)
"My third choice is not on this list but I think would be worth checking out: HVR"
#73
Supermetrics
4.0
(1)
"Doesn’t FiveTran, Stitch and Supermetrics (to a minor degree) do the same thing?"
#74
Microsoft SSIS
4.0
(1)
"SSIS"
#75
QuickTable
4.0
(1)
"We are building QuickTable which is low code. It can generate SQL and udf functions for different data warehouses automatically."
#76
Databricks Delta Live Tables
4.0
(1)
"I would checkout delta live tables by databricks"
#77
Dremio
4.0
(1)
"Looks like you need dremio"
#78
Prophecy
4.0
(1)
"I came across prophecy.io and have to say that they are fun 🤩"
#79
Presto
3.0
(1)
"[Presto](https://prestodb.io/) does this, but I'm honestly uncertain how performant it is."
#80
Google Cloud Data Fusion
3.0
(1)
"IICS, GCP Datafusion , azure data Factory"
#81
IICS
3.0
(1)
"IICS, GCP Datafusion , azure data Factory"
#82
Jitterbit Salesforce Dataloader
3.0
(1)
"I used Jitterbit Salesforce Dataloader for simple tasks"
#83
Data360 Analyze
3.0
(1)
"I use Data360 Analyze. It has connectors to a lot of the applications you mentioned. You can get a free download of the desktop product."
#84
R
2.0
(1)
"I've used R as an ETL tool to build a data warehouse from scratch."
