Data lakes have become the industry norm for storing critical business data. The primary rationale for a data lake is to store all types of data, from raw data to preprocessed and postprocessed data, without first having to structure or transform it. Data lakes give many different types of analytics and machine learning (ML) processes controlled access to that data in order to guide better decision-making.
Join our webinar to learn how to use Amazon SageMaker Studio notebooks to easily load and transform data stored in the Delta Lake format. We will use a standard Jupyter notebook to run Apache Spark commands that read and write table data in CSV and Parquet format. The open-source library delta-spark lets you access this data directly in its native format and use its API operations to apply data transformations, make schema modifications, and use time-travel or as-of-timestamp queries to pull a particular version of the data.
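The time-travel reads mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the code shown in the webinar: the helper names (`time_travel_options`, `build_session`, `read_delta`) and the S3 path in the usage comment are hypothetical, and it assumes the `pyspark` and `delta-spark` packages are installed in the notebook kernel. `versionAsOf` and `timestampAsOf` are the reader options Delta Lake uses for time travel.

```python
from typing import Any, Dict, Optional


def time_travel_options(version: Optional[int] = None,
                        timestamp: Optional[str] = None) -> Dict[str, str]:
    """Build Delta reader options for a time-travel read (hypothetical helper).

    Pass at most one of `version` or `timestamp`; with neither, no option
    is set and the latest snapshot of the table is read.
    """
    if version is not None and timestamp is not None:
        raise ValueError("pass either version or timestamp, not both")
    if version is not None:
        return {"versionAsOf": str(version)}
    if timestamp is not None:
        return {"timestampAsOf": timestamp}
    return {}


def build_session(app_name: str = "delta-demo"):
    """Create a SparkSession with the Delta Lake extensions enabled."""
    # Imported lazily so the other helpers work without Spark installed.
    from delta import configure_spark_with_delta_pip
    from pyspark.sql import SparkSession

    builder = (
        SparkSession.builder.appName(app_name)
        .config("spark.sql.extensions",
                "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    return configure_spark_with_delta_pip(builder).getOrCreate()


def read_delta(spark: Any, path: str,
               version: Optional[int] = None,
               timestamp: Optional[str] = None):
    """Read a Delta table, optionally pinned to a version or a timestamp."""
    opts = time_travel_options(version, timestamp)
    return spark.read.format("delta").options(**opts).load(path)


# Example usage (the table path is an assumption, not from the webinar):
#   spark = build_session()
#   latest = read_delta(spark, "s3a://example-bucket/delta/my_table")
#   first = read_delta(spark, "s3a://example-bucket/delta/my_table", version=0)
```

Reading from Amazon S3 this way typically also requires the Hadoop S3 connector and suitable credentials on the Spark classpath, which this sketch leaves out.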
Learning Objectives:
* Objective 1: How to leverage the new data lake capabilities for transactions, version control, and indexing, and how to enforce strict data access controls via Delta Lake.
* Objective 2: How to use SageMaker Studio notebooks to easily load and transform data stored in the Delta Lake format.
* Objective 3: How to integrate with Jupyter notebooks to scale ML models.
To learn more about the services featured in this talk, please visit: https://aws.amazon.com/sagemaker/studio/
To download a copy of the slide deck from this webinar, visit: https://pages.awscloud.com/Easily-connect-your-ML-models-to-data-lakes-using-Amazon-SageMaker-Studio_2022_1019-MCL_OD
Subscribe to AWS Online Tech Talks on AWS:
https://www.youtube.com/@AWSOnlineTechTalks?sub_confirmation=1
Follow Amazon Web Services:
Official Website: https://aws.amazon.com/what-is-aws
Twitch: https://twitch.tv/aws
Twitter: https://twitter.com/awsdevelopers
Facebook: https://facebook.com/amazonwebservices
Instagram: https://instagram.com/amazonwebservices
☁️ AWS Online Tech Talks cover a wide range of topics and expertise levels through technical deep dives, demos, customer examples, and live Q&A with AWS experts. Builders can choose from bite-sized 15-minute sessions, insightful fireside chats, immersive virtual workshops, and interactive office hours, or watch on-demand tech talks at their own pace. Join us to fuel your learning journey with AWS.
#AWS
Category: AWS Developers
Tags: SageMaker, data lake, data processing
