Posted On: Oct 25, 2021
We are announcing support for using Apache Spark SQL to update Apache Hive metadata tables with the Amazon EMR integration with Apache Ranger.
This January, we launched Amazon EMR integration with Apache Ranger, a feature that allows you to define and enforce database, table, and column-level permissions when Apache Spark users access data in Amazon S3 through the Hive Metastore. Previously, with Apache Ranger enabled, you could only read data using Spark SQL statements such as SHOW DATABASES and DESCRIBE TABLE. Now, you can also insert data into or update the Apache Hive metadata tables with these statements: INSERT INTO, INSERT OVERWRITE, and ALTER TABLE.
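As a rough illustration of the three statement types named above, here is a minimal Python sketch that builds one example of each and, given an existing Spark session, runs them through `SparkSession.sql`. The database, table, and staging-table names, the column values, and the helper functions `write_statements` and `run_statements` are hypothetical and not part of this announcement. Whether a given statement succeeds depends on the Apache Ranger policies defined for the user.

```python
from typing import List

# Hypothetical names, used for illustration only.
DATABASE = "sales_db"
TABLE = "orders"


def write_statements(database: str, table: str) -> List[str]:
    """Build one example each of INSERT INTO, INSERT OVERWRITE, and ALTER TABLE."""
    target = f"{database}.{table}"
    staging = f"{database}.{table}_staging"  # hypothetical staging table
    return [
        f"INSERT INTO {target} VALUES (1, 'widget')",
        f"INSERT OVERWRITE TABLE {target} SELECT * FROM {staging}",
        f"ALTER TABLE {target} SET TBLPROPERTIES ('comment' = 'updated via Spark SQL')",
    ]


def run_statements(spark, statements: List[str]) -> None:
    """Run each statement with spark.sql; `spark` is assumed to be an existing SparkSession."""
    for statement in statements:
        spark.sql(statement)


if __name__ == "__main__":
    # Print the statements without needing a cluster.
    for statement in write_statements(DATABASE, TABLE):
        print(statement)
```

On a cluster, you would pass the session that your Spark application already holds, for example `run_statements(spark, write_statements("sales_db", "orders"))`.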
This feature is available on Amazon EMR 6.4 in the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD.
To get started, see the following list of resources:
- AWS Big Data Blog post:
- Amazon EMR Management Guide: Using Apache Spark SQL with Apache Ranger plugin