Alvin Endratno · alvinend.hashnode.net · Sep 9, 2022
Setup Jupyter in EC2 and Apache Spark with Delta Lake connection to S3
Delta Lake has been booming for the last two years, since Databricks announced it as the "New Generation Data Lakehouse," but behind the boom there are not enough examples and posts about it. I want to change that by adding one article about it. This time we w...
2 likes · 203 reads · Delta Lake · lakehouse

Alvin Endratno · alvinend.hashnode.net · Sep 22, 2022
Using SQL to Query Data with Delta Lake
Last time, we set up Jupyter in EC2 and Apache Spark with Delta Lake connection to S3. We will import data from the dataset and query it with SQL this time. About Dataset: For this experiment, we will use a dataset about courses, students, and their i...
68 reads · Delta Lake · big data

Mike Kenneth Houngbadji · mikekenneth.hashnode.net · Feb 4, 2023
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset
"Every act of conscious learning requires the willingness to suffer an injury to one's self-esteem. That is why young children, before they are aware of their own self-importance, learn so easily." (Thomas Szasz) Motivation: A Data Lakehouse is a modern d...
apache-airflow

Jonathan Reis · jreissup.hashnode.net · Feb 23, 2023
Implementing a Data Lakehouse Architecture in AWS — Part 3 of 4
Introduction: In our previous article, part 2 of the series, we walked through the extraction, processing, and creation of some data mart, using the New York City taxi trip data which is publicly available to do consumption. We used some of the princi...
Exploring the Data Lakehouse and Its Implementation in AWS · Data-lake

Jonathan Reis · jreissup.hashnode.net · Feb 23, 2023
Implementing a Data Lakehouse Architecture in AWS — Part 2 of 4
Introduction: In part 1 of this article series, we walked through how to feed a Data Lake built on top of Amazon S3, based on streaming data, using Amazon Kinesis. In part 2, we will cover all of the steps needed to build a Data Lakehouse, using trip ...
Exploring the Data Lakehouse and Its Implementation in AWS · Data-lake

Jonathan Reis · jreissup.hashnode.net · Nov 4, 2022
Implementing a Data Lakehouse Architecture in AWS — Part 1 of 4
Introduction: Numerous applications in today's world accumulate significant amounts of data to build insight and knowledge. Adding value and improving functionality is essential, but at what cost? A critical factor in the "Big Data" era's arrival is v...
36 reads · Exploring the Data Lakehouse and Its Implementation in AWS · Data Architecture