Build with Amazon SageMaker
Amazon SageMaker makes it easy to build machine learning (ML) models at scale and get them ready for training, by providing everything you need to label training data, access and share notebooks, and use built-in algorithms and frameworks.
Features
Collaborative notebook experience
Amazon SageMaker Notebooks provide one-click Jupyter notebooks with elastic compute that can be spun up quickly. Notebooks contain everything needed to run or recreate a machine learning workflow and are integrated within Amazon SageMaker Studio. Notebooks are pre-loaded with all the common CUDA and cuDNN drivers, Anaconda packages, and framework libraries.
The notebook environment lets you explore and visualize your data and document your findings in reusable workflows. From within the notebook, you can bring in your data stored in Amazon S3. You can also use AWS Glue to move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis.
Fully managed data processing at scale
Data processing and analytics workloads for machine learning often run on self-managed infrastructure that is difficult to allocate and scale as business requirements change. Combining different tools to do this becomes cumbersome, resulting in sub-optimal performance and increased capital and operating expenses. Amazon SageMaker Processing addresses this challenge by extending the ease, scalability, and reliability of SageMaker to a fully managed experience for running data processing workloads at scale. SageMaker Processing connects to existing storage or file system data sources, spins up the resources required to run your job, saves the output to persistent storage, and provides logs and metrics. You can also bring your own containers, built with the frameworks of your choice, to run data processing and analytics workloads.
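As a rough illustration of the kind of script such a job might run, the sketch below reads CSV files from an input directory, drops incomplete rows, and writes the cleaned files to an output directory. The helper names `clean_rows` and `process_directory`, the choice of CSV, and the cleaning rule are assumptions made for this example and are not part of the SageMaker API. In a managed job, the input and output locations would be supplied by the job configuration rather than hard-coded.

```python
import csv
from pathlib import Path


def clean_rows(rows):
    """Keep only non-blank rows in which every field is non-empty."""
    return [row for row in rows if row and all(field.strip() for field in row)]


def process_directory(input_dir, output_dir):
    """Clean every CSV file in input_dir and write it to output_dir.

    Returns the number of rows kept across all files (header rows included).
    """
    input_path, output_path = Path(input_dir), Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    kept = 0
    for source in sorted(input_path.glob("*.csv")):
        # Read and filter the source file.
        with source.open(newline="") as handle:
            rows = clean_rows(csv.reader(handle))
        # Write the surviving rows under the same file name.
        with (output_path / source.name).open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        kept += len(rows)
    return kept
```

Calling `process_directory("input", "output")` with two hypothetical local folders is enough to try the logic before packaging it into a container.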
Built-in, high-performance algorithms
Amazon SageMaker provides high-performance, scalable machine learning algorithms, optimized for speed, scale, and accuracy, that can perform training on petabyte-scale data sets. You can choose supervised algorithms, for which the correct answers are known during training, so the model can be shown where it made mistakes. SageMaker includes supervised algorithms such as XGBoost and linear/logistic regression or classification to address recommendation and time series prediction problems. SageMaker also supports unsupervised learning (i.e., the algorithms must discover the correct answers on their own), such as k-means clustering and principal component analysis (PCA), to solve problems like identifying customer groupings based on purchasing behavior.
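To make the unsupervised case concrete, here is a minimal k-means sketch in plain Python. It is not SageMaker's built-in implementation: the function name, the initialization from the first k points, and the sample customer data are all assumptions chosen for illustration.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster n-dimensional points with Lloyd's algorithm.

    Returns (centroids, labels). Illustrative only: centroids start at the
    first k points, which is simpler and less robust than the seeding a
    production library would use.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical customers: (orders per month, average basket value).
customers = [(1, 20), (2, 22), (1, 18), (9, 95), (10, 100), (8, 90)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

The two groups that emerge (occasional low-spend buyers and frequent high-spend buyers) are the kind of customer grouping described above, found without any labels.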
Broad framework support
Amazon SageMaker supports many popular frameworks for deep learning such as TensorFlow, Apache MXNet, PyTorch, Chainer, and more. These frameworks are automatically configured and optimized for high performance. You don’t need to manually set up these frameworks and can use them within the built-in containers. You can also bring any framework you like to SageMaker by building it into a Docker container that you store in Amazon Elastic Container Registry (Amazon ECR).
Test and prototype locally
The open source Apache MXNet and TensorFlow Docker containers used in Amazon SageMaker are available on GitHub. You can download these containers to your local environment and use the SageMaker Python SDK to test your scripts before deploying to SageMaker training or hosting environments. When you’re ready to go from local testing to production training and hosting, a change to a single line of code is all that’s needed.
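The single-line change can be pictured as follows. In the SageMaker Python SDK, local mode is selected by passing `instance_type="local"` to an estimator. The sketch below only builds the argument dictionary, so it runs without the SDK or AWS credentials. The function name, the `train.py` script name, and the `ml.m5.xlarge` choice are assumptions for illustration.

```python
def estimator_settings(run_locally):
    """Return keyword arguments for a framework estimator (illustrative)."""
    return {
        "entry_point": "train.py",  # hypothetical training script
        "instance_count": 1,
        # The one line that changes between local testing and managed training.
        "instance_type": "local" if run_locally else "ml.m5.xlarge",
    }


local_settings = estimator_settings(run_locally=True)
cloud_settings = estimator_settings(run_locally=False)
changed = {key for key in local_settings if local_settings[key] != cloud_settings[key]}
print(changed)  # → {'instance_type'}
```

Everything else, including the training script itself, stays the same between the two runs.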
Reinforcement learning
Amazon SageMaker supports reinforcement learning (RL) in addition to traditional supervised and unsupervised learning. SageMaker has built-in, fully managed reinforcement learning algorithms, including some of the newest and best performing in the academic literature. SageMaker supports RL in multiple frameworks, including TensorFlow and MXNet, as well as newer frameworks designed from the ground up for reinforcement learning, such as Intel Coach and Ray RLlib. Multiple 2D and 3D physics simulation environments are supported, including environments based on the open source OpenAI Gym interface. Additionally, SageMaker RL lets you train using virtual 3D environments built in Amazon Sumerian and AWS RoboMaker. To help you get started, SageMaker also provides a range of example notebooks and tutorials.
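For readers new to RL environments, the sketch below shows the classic Gym-style contract: `reset()` returns an initial observation, and `step(action)` returns an observation, a reward, a done flag, and an info dictionary. The `CorridorEnv` class, the reward values, and the fixed policy are hypothetical and written only for this example. They are not shipped with SageMaker RL, and no learning algorithm is involved.

```python
class CorridorEnv:
    """Toy environment following the classic Gym-style reset/step contract.

    The agent starts at cell 0 and must reach the last cell of a corridor.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action: 1 = move right, anything else = move left (clamped to the corridor).
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # small step penalty, bonus at the goal
        return self.position, reward, done, {}


def run_episode(env, policy, max_steps=50):
    """Roll out one episode and return the total reward collected."""
    observation, total = env.reset(), 0.0
    for _ in range(max_steps):
        observation, reward, done, _info = env.step(policy(observation))
        total += reward
        if done:
            break
    return total


def always_right(observation):
    """Fixed policy that ignores the observation and always moves right."""
    return 1


total_reward = run_episode(CorridorEnv(), always_right)
print(round(total_reward, 2))
```

An RL algorithm would replace `always_right` with a policy that improves from the rewards it observes, but the environment loop keeps the same shape.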