External reviews
External reviews are not included in the AWS star rating for the product.
Databricks for job creation, cluster management, and managing Spark jobs effectively
What do you like best about the product?
Easy to schedule and run jobs and to integrate with Airflow and Azure storage accounts.
Easy to execute code cell-wise and debug the errors because of its interpreter.
What do you dislike about the product?
It doesn't give auto-complete suggestions while coding the way other IDEs do.
What problems is the product solving and how is that benefiting you?
We use it for our data engineering projects on large-scale datasets.
Great Collaborative Platform for Data Science Projects
What do you like best about the product?
I have been using the Databricks platform for business research projects and building ML models for almost a year. It has been a great experience to be able to run analysis and model testing for big data projects in a single platform, without switching between SQL Server and a development environment with Python, R, or Stata. Also, I like the fact that MLflow can track data ingestion for any data shift in real time for model retraining purposes.
What do you dislike about the product?
We have had issues using MLflow and feature store on Databricks for ML projects, which slows down the development process. Wish there was better documentation on these tools or more diverse examples to demonstrate different use cases. Also, the test-train split with MLflow does not support time series time interval test-train split for model validation purposes.
What problems is the product solving and how is that benefiting you?
The Databricks Lakehouse Platform allows the data science team to work better with the development team in a single platform, which helps improve ML project development in the long run.
Best data all in one solution
What do you like best about the product?
PySpark, Delta Lake, the way it integrates seamlessly with AWS services, and how they managed to open source everything. It provides a great managed Spark infrastructure.
What do you dislike about the product?
Harder to integrate with more legacy data sets. Requires you to move data into AWS to use.
What problems is the product solving and how is that benefiting you?
Databricks is creating a solution that allows us to query and manage our data lake with immense performance. Delta Lake ensures ACID transactions on data, and the query performance from Databricks is unmatched.
Good experience so far!
What do you like best about the product?
Great unification of functions & features and data sharing across the organization.
What do you dislike about the product?
There's still a lot to learn and make sure that all the functions I use work well and properly. Nothing bad, just more to find out.
What problems is the product solving and how is that benefiting you?
It's helping me do my job and unifying data sources across all my different work streams.
User friendly and intuitive platform
What do you like best about the product?
As a Cloud Operation Specialist, I deploy the Databricks workspace and set up and manage the clusters. It's easy to set up and manage the users within the workspace.
UI is very user friendly and intuitive.
What do you dislike about the product?
Error messages could be more detailed and better explained.
What problems is the product solving and how is that benefiting you?
Highly efficient in executing queries and analysing data.
Powerful platform
What do you like best about the product?
The platform is powerful and flexible enough to do almost anything you want to do, like ETL, ML models, data mining, simple ad hoc queries, etc. It is also easy to switch between languages such as Python, SQL, R, and Scala anytime you want.
What do you dislike about the product?
The search function is not my favorite. I often like to use the browser's search function, but it doesn't work well with scripts in a big cell. Also, the clusters take a while to start.
What problems is the product solving and how is that benefiting you?
It meets all my data mining and data science project needs. Simple and easy to use.
The most flexible and potent data platform available, without a doubt
What do you like best about the product?
The most reliable and user-friendly option for creating ELT pipelines that employ Python, Spark, and SQL is Databricks. Configuring and deploying it doesn't take much labour, and it frees developers from having to worry about setting up the infrastructure.
What do you dislike about the product?
Using the same cluster to perform several streaming tasks.
Since shutdown immediately following the job run/fail is configured by default, job clusters cannot be reused even for the same retry in PRODUCTION. Checking potential ways to raise this limit.
What problems is the product solving and how is that benefiting you?
Comprehensive batch and streaming pipelines
Delta Lake
History and versioning
Delta log transactions with ACID guarantees
Data curation through validation and quarantine
Data ingestion using Auto Loader
Intuitive and Powerful
What do you like best about the product?
As a frequent user of Databricks, it has made my life so much easier by simplifying processes and allowing me to develop proof-of-concept designs rapidly. The orchestration of notebooks via workflows provides excellent visualization and enables me to conduct real-time demos for members on the business side. In addition, the integration with Azure and AWS makes it so that Databricks does not operate in isolation and allows me and other engineering team members to transform large amounts of data that is ingested via our enterprise pipelines.
What do you dislike about the product?
There can sometimes be issues integrating Databricks workflows with open source frameworks, often requiring lots of debugging and trial and error. Additionally, I've been told that the platform can be pretty expensive.
What problems is the product solving and how is that benefiting you?
The Databricks Lakehouse Platform allows me to create and deploy workflows to orchestrate and test proof-of-concept ideas in our organization. This will enable us to validate ideas and develop presentations for the organization's business side.
Progressing in the right direction
What do you like best about the product?
Being able to quickly get the environment up and running for any kind of workload. The support for all three languages, catering to the needs of both Data Engineering and ML.
What do you dislike about the product?
Too many customizations are needed to achieve the right mix of parameterization for optimal performance. Snowflake, on the other hand, provides lots of features out of the box without the developer having to worry about these things.
What problems is the product solving and how is that benefiting you?
It handles the intermediate layers and data engineering activities such as wrangling, mashing, slicing, and dicing of the data well. It gives greater control of the data via data frames.
Excellent solution to unlock the full power of data analytics
What do you like best about the product?
The infrastructure is pretty straightforward. I started out using the Community edition before switching to the premium version, but if you're a student or working on one-off projects, the Community edition should be more than sufficient.
What do you dislike about the product?
Finding some answers can be challenging at times because there aren't many PySpark users, forums, or resources available.
What problems is the product solving and how is that benefiting you?
People who are not very proficient in coding can nonetheless gain useful insights from the data by using notebooks prepared by data scientists. I got accustomed to Databricks and to writing PySpark programs pretty easily. Databricks has a great ability to manipulate data and perform any desired action.