
A healthcare company uses AWS data and analytics tools to collect, ingest, and store electronic health record (EHR) data about its patients. The raw EHR data is stored in Amazon S3 in JSON format partitioned by hour, day, and year and is updated every hour. The company wants to maintain the data catalog and metadata in an AWS Glue Data Catalog to be able to access the data using Amazon Athena or Amazon Redshift Spectrum for analytics.

When defining tables in the Data Catalog, the company has the following requirements:

• Choose the catalog table name and do not rely on the catalog table naming algorithm.
• Keep the table updated with new partitions loaded in the respective S3 bucket prefixes.

Which solution meets these requirements with minimal effort?


A. Run an AWS Glue crawler that connects to one or more data stores, determines the data structures, and writes tables in the Data Catalog.


B. Use the AWS Glue console to manually create a table in the Data Catalog and schedule an AWS Lambda function to update the table partitions hourly.


C. Use the AWS Glue API CreateTable operation to create a table in the Data Catalog. Create an AWS Glue crawler and specify the table as the source.


D. Create an Apache Hive catalog in Amazon EMR with the table schema definition in Amazon S3, and update the table partition with a scheduled job. Migrate the Hive catalog to the Data Catalog.
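
For illustration only, the API-based approach described in option C (a CreateTable call followed by a crawler that uses the existing catalog table as its source) might be sketched with boto3 roughly as below. The table, database, crawler, role, and bucket names, the column list, and the hourly schedule are all hypothetical placeholders that do not come from the question, and the sketch is not presented as the reference answer.

```python
# A minimal sketch, assuming boto3's Glue client is passed in as `glue`.
# All names, columns, and the S3 location are hypothetical placeholders.

def build_table_input(table_name, s3_location):
    """Build the TableInput for CreateTable with a caller-chosen table name."""
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "json"},
        # Partition keys named after the partitioning mentioned in the question.
        "PartitionKeys": [
            {"Name": "year", "Type": "string"},
            {"Name": "day", "Type": "string"},
            {"Name": "hour", "Type": "string"},
        ],
        "StorageDescriptor": {
            # Assumed columns; the real EHR schema is not given in the question.
            "Columns": [
                {"Name": "patient_id", "Type": "string"},
                {"Name": "record", "Type": "string"},
            ],
            "Location": s3_location,
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
            },
        },
    }


def build_crawler_request(crawler_name, role_arn, database, table_name):
    """Build CreateCrawler arguments that target an existing catalog table."""
    return {
        "Name": crawler_name,
        "Role": role_arn,
        "Targets": {
            "CatalogTargets": [
                {"DatabaseName": database, "Tables": [table_name]}
            ]
        },
        # Crawlers with catalog targets are expected to log deletions
        # rather than remove or deprecate tables.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
        # Assumed schedule: top of every hour, matching the hourly updates.
        "Schedule": "cron(0 * * * ? *)",
    }


def create_table_and_crawler(glue, database, table_name, s3_location,
                             crawler_name, role_arn):
    """Create the named table, then a crawler that keeps its partitions current."""
    glue.create_table(
        DatabaseName=database,
        TableInput=build_table_input(table_name, s3_location),
    )
    glue.create_crawler(
        **build_crawler_request(crawler_name, role_arn, database, table_name)
    )
```

With real credentials this would be called as `create_table_and_crawler(boto3.client("glue"), ...)`. The request dictionaries are built by separate functions so they can be inspected without touching an AWS account.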

Published: 2024-09-27 12:14:27