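I have a zipped file in S3 that I want to insert into a Redshift database. The only way my research has found is launching an EC2 instance, moving the file there, unzipping it, and sending it back to S3, then inserting it into the Redshift table. However, I am trying to do all of this with the Java SDK from an outside machine and do not want to have to use an EC2 instance. Is there a way to have an EMR job unzip the file? Or to insert the zipped file directly into Redshift?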
The files are .zip, not .gzip.
You cannot directly insert a zipped file into Redshift, per Guy's comment.
Assuming this is not a one-time task, I would suggest using AWS Data Pipeline to perform this work. See the example of copying data between S3 buckets. Modify the example to unzip and then gzip your data instead of simply copying it.
Use the ShellCommandActivity to execute a shell script that performs the work. I would assume this script could invoke Java if you choose an appropriate AMI as your EC2 resource (YMMV).
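If the script does invoke Java, the unzip-then-gzip step itself needs nothing beyond the standard java.util.zip package. Below is a minimal sketch of that step only, not a complete pipeline activity. The class name ZipToGzip and the method convert are hypothetical names made up for illustration, and the sketch assumes each entry is small enough to hold in memory; for large files you would stream to a file or straight back to S3 instead. Downloading from and uploading to S3 are deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: re-compresses every file inside a .zip archive as gzip.
public class ZipToGzip {

    /**
     * Reads a ZIP archive from the given stream and returns one gzip-compressed
     * byte array per file entry, keyed by the entry name with ".gz" appended.
     * Assumption: entries fit in memory.
     */
    public static Map<String, byte[]> convert(InputStream zipped) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zin = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no data to load
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                // Closing the GZIPOutputStream writes the gzip trailer.
                try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                    int n;
                    // read() returns -1 at the end of the current entry
                    while ((n = zin.read(buffer)) != -1) {
                        gz.write(buffer, 0, n);
                    }
                }
                result.put(entry.getName() + ".gz", out.toByteArray());
            }
        }
        return result;
    }
}
```

In practice the input stream would come from the S3 object and each resulting byte array would be uploaded back to S3. Redshift's COPY command can then load the files with its GZIP option.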
Data Pipeline is highly efficient for this type of work because it will start and terminate the EC2 resource automatically, plus you do not have to worry about discovering the name of the new instance in your scripts.