StorageException: Anonymous caller does not have storage.objects.get access - google-cloud-platform

When trying to run the below code on CircleCI
fun getJsonFromCloudStorage(): ByteArrayInputStream {
val blobId = BlobId.of("my-company", "creds/my-company-creds.json")
val storage = StorageOptions.getDefaultInstance().service
val get = storage.get(blobId)
return get.getContent().inputStream()
}
it will throw the below error during the integration tests.
> Task :test FAILED
function.GetMetadataFromYouTubeTest > extractIncorrectId FAILED
java.lang.ExceptionInInitializerError
at function.GetMetadataFromYouTube.expand(GetMetadataFromYouTube.kt:17)
at function.GetMetadataFromYouTube.expand(GetMetadataFromYouTube.kt:14)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:491)
at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:299)
at function.GetMetadataFromYouTubeTest.extractIncorrectId(GetMetadataFromYouTubeTest.kt:71)
Caused by:
com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.get access to cni-analytics/creds/cni-awesome.json.
at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:220)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:414)
at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:198)
at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:195)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89)
at com.google.cloud.RetryHelper.run(RetryHelper.java:74)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51)
at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:195)
at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:209)
at storage.CredentialHelper$Companion.getJsonFromCloudStorage(CredentialHelper.kt:18)
at service.YoutubeService.initialiseYouTube(YoutubeService.kt:50)
at service.YoutubeService.<init>(YoutubeService.kt:19)
at MainKt.<clinit>(main.kt:15)
... 6 more
Caused by:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 401 Unauthorized
{
"code" : 401,
"errors" : [ {
"domain" : "global",
"location" : "Authorization",
"locationType" : "header",
"message" : "Anonymous caller does not have storage.objects.get access to my-company/creds/my-company-creds.json.",
"reason" : "required"
} ],
"message" : "Anonymous caller does not have storage.objects.get access to my-company/creds/my-company-creds.json."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:411)
... 17 more
I followed their documentation.

They said this in their documentation:
Note: To use certain services (like Google Cloud Datastore), you will also need to set the CircleCI $GOOGLE_APPLICATION_CREDENTIALS environment variable to ${HOME}/gcloud-service-key.json.
Instead I set $GOOGLE_APPLICATION_CREDENTIALS in the CircleCI UI to /home/circleci/gcloud-service-key.json and it worked.
I'm assuming this is because I was trying to reference an environment variable from the UI so ${HOME} had not been set when it was setting this environment variable. Perhaps if this environment variable was set in the config.yml ${HOME} would resolve.

Related

AWS SecretsManager works in Eclipse, can't connect to Service Endpoint in ColdFusion

I have the following class written in Java using Eclipse on my Amazon EC2 instance.
import java.nio.ByteBuffer;
import com.amazonaws.auth.*;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.secretsmanager.*;
import com.amazonaws.services.secretsmanager.model.*;
public class SMtest {
public static void main(String[] args) {
String x = getSecret();
System.out.println(x);
}
public SMtest()
{
}
public static String getSecret() {
String secretName = "mysecret";
String endpoint = "secretsmanager.us-east-1.amazonaws.com";
String region = "us-east-1";
String result = "";
//BasicAWSCredentials awsCreds = new BasicAWSCredentials("mypublickey", "mysecretkey");
AwsClientBuilder.EndpointConfiguration config = new AwsClientBuilder.EndpointConfiguration(endpoint, region);
AWSSecretsManagerClientBuilder clientBuilder = AWSSecretsManagerClientBuilder.standard()
.withCredentials(new InstanceProfileCredentialsProvider(false));
//.withCredentials(new AWSStaticCredentialsProvider(awsCreds));
clientBuilder.setEndpointConfiguration(config);
AWSSecretsManager client = clientBuilder.build();
String secret;
ByteBuffer binarySecretData;
GetSecretValueRequest getSecretValueRequest = new GetSecretValueRequest()
.withSecretId(secretName).withVersionStage("AWSCURRENT");
GetSecretValueResult getSecretValueResult = null;
try {
getSecretValueResult = client.getSecretValue(getSecretValueRequest);
} catch(ResourceNotFoundException e) {
System.out.println("The requested secret " + secretName + " was not found");
} catch (InvalidRequestException e) {
System.out.println("The request was invalid due to: " + e.getMessage());
} catch (InvalidParameterException e) {
System.out.println("The request had invalid params: " + e.getMessage());
}
if(getSecretValueResult == null) {
result = "";
}
// Depending on whether the secret was a string or binary, one of these fields will be populated
if(getSecretValueResult.getSecretString() != null) {
secret = getSecretValueResult.getSecretString();
result = secret;
//System.out.println(secret);
}
else {
binarySecretData = getSecretValueResult.getSecretBinary();
result = binarySecretData.toString();
}
return result;
}
}
When I execute it from within Eclipse, it works just fine. When I compile the class to use it in ColdFusion (2021) from the same EC2 with the following code:
<cfscript>
obj = CreateObject("java","SMtest");
obj.init();
result = obj.getSecret();
</cfscript>
<cfoutput>#result#</cfoutput>
I get a "Failed to connect to service endpoint" error. I believe I have all the IAM credentials set up properly since it is working in straight Java. However, when I change the credentials to use the Basic Credentials with my AWS Public and Secret Key (shown in comments in the code above), it works in both Java and ColdFusion.
I created an AWS Policy that manages the Secrets Manager permissions. I also added AmazonEC2FullAccess. I also tried to create a VPC Endpoint, but this had no effect.
Why would it be working in Java and not in ColdFusion (which is based on Java) ? What roles/policies would I have to add to get it to work in ColdFusion when it is already working in Java?
STACK TRACE ADDED:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:69)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:165)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1266)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:842)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:792)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
at com.amazonaws.services.secretsmanager.AWSSecretsManagerClient.doInvoke(AWSSecretsManagerClient.java:2454)
at com.amazonaws.services.secretsmanager.AWSSecretsManagerClient.invoke(AWSSecretsManagerClient.java:2421)
at com.amazonaws.services.secretsmanager.AWSSecretsManagerClient.invoke(AWSSecretsManagerClient.java:2410)
at com.amazonaws.services.secretsmanager.AWSSecretsManagerClient.executeGetSecretValue(AWSSecretsManagerClient.java:943)
at com.amazonaws.services.secretsmanager.AWSSecretsManagerClient.getSecretValue(AWSSecretsManagerClient.java:912)
at SMtest.getSecret(SMtest.java:52)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at coldfusion.runtime.java.JavaProxy.invoke(JavaProxy.java:106)
at coldfusion.runtime.CfJspPage._invoke(CfJspPage.java:4254)
at coldfusion.runtime.CfJspPage._invoke(CfJspPage.java:4217)
at cfsmtest2ecfm1508275519.runPage(D:\ColdFusion2021\cfusion\wwwroot\smtest.cfm:4)
at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:257)
at coldfusion.tagext.lang.IncludeTag.handlePageInvoke(IncludeTag.java:749)
at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:578)
at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)
at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:573)
at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:43)
at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)
at coldfusion.filter.PathFilter.invoke(PathFilter.java:162)
at coldfusion.filter.IpFilter.invoke(IpFilter.java:45)
at coldfusion.filter.LicenseFilter.invoke(LicenseFilter.java:30)
at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:97)
at coldfusion.filter.BrowserDebugFilter.invoke(BrowserDebugFilter.java:81)
at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)
at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)
at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:60)
at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)
at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)
at coldfusion.CfmServlet.service(CfmServlet.java:231)
at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:311)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:228)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:163)
at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:46)
at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:47)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:190)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:163)
at coldfusion.inspect.weinre.MobileDeviceDomInspectionFilter.doFilter(MobileDeviceDomInspectionFilter.java:57)
at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:47)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:190)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:163)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:190)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:163)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:373)
at org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:462)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1723)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.MalformedURLException: Unknown protocol: http
at java.base/java.net.URL.<init>(URL.java:708)
at java.base/java.net.URL.fromURI(URL.java:748)
at java.base/java.net.URI.toURL(URI.java:1139)
at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:83)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:80)
... 80 more
Caused by: java.lang.IllegalStateException: Unknown protocol: http
at org.apache.felix.framework.URLHandlersStreamHandlerProxy.parseURL(URLHandlersStreamHandlerProxy.java:373)
at java.base/java.net.URL.<init>(URL.java:703)
... 84 more
After noticing that the bottom end of the Stack Trace says "unknown protocol: http" and "Illegal State Exception: Unknown Protocol", I changed the endpoint in my class above to:
String endpoint = "https://secretsmanager.us-east-1.amazonaws.com";
and I am running the web-site via https now. It still works with no problem running it in Eclpse, and I am still getting the same error when using ColdFusion (i.e., when using a browser)
Since the basic credentials are working and since the error is not access denied, it suggests there is no issue with your IAM setup. Instead, likely your process failed to fetch the EC2 instance metadata. For example, timeout when calling the metadata endpoint, hence the Failed to connect to service endpoint error.
One way around this is to retrieve the accessKey and secretKey manually by calling the instance metadata API. For example, using cfhttp to populate variables.
Example curl command from docs: retrieve /api/token and use it to retrieve /meta-data/iam/security-credentials/<rolename>.
[ec2-user ~]$ TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"` \
&& curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access
Output:
{
"Code" : "Success",
"LastUpdated" : "2012-04-26T16:39:16Z",
"Type" : "AWS-HMAC",
"AccessKeyId" : "ASIAIOSFODNN7EXAMPLE",
"SecretAccessKey" : "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"Token" : "token",
"Expiration" : "2017-05-17T15:09:54Z"
}
From here, if you still face any errors, at least it should be more explicit to you which step has failed and why.
Just a note, AWS recommends caching the credentials until near expiry instead of querying for every transaction to avoid throttling.

python, google cloud platform: unable to overwite a file from google bucket: CRC32 does not match

I am using python3 client to connect to google buckets and trying to the following
download 'my_rules_file.yaml'
modify the yaml file
overwrite the file
Here is the code that i used
from google.cloud import storage
import yaml
client = storage.Client()
bucket = client.get_bucket('bucket_name')
blob = bucket.blob('my_rules_file.yaml')
yaml_file = blob.download_as_string()
doc = yaml.load(yaml_file, Loader=yaml.FullLoader)
doc['email'].clear()
doc['email'].extend(["test#gmail.com"])
yaml_file = yaml.dump(doc)
blob.upload_from_string(yaml_file, content_type="application/octet-stream")
This is the error I get from the last line for upload
BadRequest: 400 POST https://storage.googleapis.com/upload/storage/v1/b/fc-sandbox-datastore/o?uploadType=multipart: {
"error": {
"code": 400,
"message": "Provided CRC32C \"YXQoSg==\" doesn't match calculated CRC32C \"EyDHsA==\".",
"errors": [
{
"message": "Provided CRC32C \"YXQoSg==\" doesn't match calculated CRC32C \"EyDHsA==\".",
"domain": "global",
"reason": "invalid"
},
{
"message": "Provided MD5 hash \"G/rQwQii9moEvc3ZDqW2qQ==\" doesn't match calculated MD5 hash \"GqyZzuvv6yE57q1bLg8HAg==\".",
"domain": "global",
"reason": "invalid"
}
]
}
}
: ('Request failed with status code', 400, 'Expected one of', <HTTPStatus.OK: 200>)
why is this happening. This seems to happen only for ".yaml files".
The reason for your error is because you are trying to use the same blob object for both downloading and uploading this will not work you need two separate instances... You can find some good examples here Python google.cloud.storage.Blob() Examples
You should use a seperate blob instance to handle the upload you are trying with only one...
.....
blob = bucket.blob('my_rules_file.yaml')
yaml_file = blob.download_as_string()
.....
the second instance is needed here
....
blob.upload_from_string(yaml_file, content_type="application/octet-stream")
...

GoogleStorageException - 401 Unauthorized / Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket

I want to transfer data from GCS to BigQuery by embulk and digdag.
But error occurs.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 401 Unauthorized
.......
Error: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
↓ Details
command :
embulk run XXXX.yaml
XXXX.yaml :
in:
type: gcs
bucket: <bucket name>
path_prefix: <file path>
auth_method: compute_engine
parser:
type: poi_excel
sheets: <sheet name>
skip_header_lines: 4
columns:
- {name: 'name', type: string}
.
.
.
out:
type: bigquery
mode: replace
project: <project name>
dataset: <dataset name>
table: <table name>
auth_method: compute_engine
schema_file: <file name of json type>
gcs_bucket: <gcs tmp bucket name>
output :
$ embulk run target_item_bottoms_config.yaml
2020-07-22 14:27:36.559 +0900: Embulk v0.9.23
2020-07-22 14:27:37.609 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2020-07-22 14:27:40.577 +0900 [INFO] (main): Gem's home and path are set by default: "/Users/oniki/.embulk/lib/gems"
2020-07-22 14:27:41.662 +0900 [INFO] (main): Started Embulk v0.9.23
2020-07-22 14:27:41.853 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-gcs (0.3.2)
2020-07-22 14:27:46.263 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-bigquery (0.6.4)
2020-07-22 14:27:46.369 +0900 [INFO] (0001:transaction): Loaded plugin embulk-parser-poi_excel (0.1.7)
org.embulk.exec.PartialExecutionException: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
at org.embulk.cli.Main.main(Main.java:64)
Suppressed: java.lang.NullPointerException
at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:463)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:397)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:394)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:394)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:245)
... 5 more
Caused by: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at org.embulk.input.gcs.AuthUtils.newClient(AuthUtils.java:81)
at org.embulk.input.gcs.GcsFileInput.listFiles(GcsFileInput.java:49)
at org.embulk.input.gcs.GcsFileInputPlugin.transaction(GcsFileInputPlugin.java:59)
at org.embulk.spi.FileInputRunner.transaction(FileInputRunner.java:62)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:507)
... 11 more
Caused by: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:226)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:366)
at com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:338)
at com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:335)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.StorageImpl.listBlobs(StorageImpl.java:334)
at com.google.cloud.storage.StorageImpl.list(StorageImpl.java:290)
at org.embulk.input.gcs.AuthUtils.newClient(AuthUtils.java:77)
... 15 more
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 401 Unauthorized
{
"code" : 401,
"errors" : [ {
"domain" : "global",
"location" : "Authorization",
"locationType" : "header",
"message" : "Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.",
"reason" : "required"
} ],
"message" : "Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:401)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1097)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:499)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:356)
... 23 more
Error: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
my environment :
$ gcloud config list
[compute]
region = us-east1
zone = us-east1-c
[core]
account = myname#xxx.com
disable_usage_reporting = False
project = <project ID>
Your active configuration is: [default]
$ gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* myname#xxxx.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
$ gsutil ls
gs://<bucket name>
my gcp IAM role :
owner
I understand that the solution to this error is authorization.
But my preferences seem to be fine.
what's wrong?
As the documentation [1], if we have 401- Unauthorized error then there could be many reasons, please have a related list of reasons listed below [followed the link 1], which could be helpful for troubleshooting:
Reason:AuthenticationRequiredRequesterPays
Access to a Requester Pays bucket requires authentication.
Reason: authError
This error indicates a problem with the authorization provided in the request to Cloud Storage. The following are some situations where that will occur:
The OAuth access token has expired and needs to be refreshed. This can be avoided by refreshing the access token early, but code can also catch this error, refresh the token and retry automatically.
Multiple non-matching authorizations were provided; choose one mode only.
The OAuth access token's bound project does not match the project associated with the provided developer key.
The Authorization header was of an unrecognized format or uses an unsupported credential type.
reason:lockedDomainExpired
When downloading content from a cookie-authenticated site, e.g., using the Storage Browser, the response will redirect to a temporary domain. This error will occur if access to said domain occurs after the domain expires. Issue the original request again, and receive a new redirect.
Reason: push.webhookUrlUnauthorized
Requests to storage.objects.watchAll will fail unless you verify you own the domain.
Reason: required
Access to a non-public method that requires authorization was made, but none was provided in the Authorization header or through other means.
[1] https://cloud.google.com/storage/docs/json_api/v1/status-codes#401_Unauthorized
I try locally , and create Service Account Key and save at local .
◾️XXXX.yaml
before
auth_method: compute_engine
after
auth_method: json_key
json_keyfile: /path/to/json_keyfile.json

Dataprep Bigquery running in different region

So I get the following error when running dataprep.
java.io.IOException: Query job beam_job_9e016180fbb74637b35319c89b6ed6d7_clouddataprepleads6085795bynick-query-d23eb37a1bee4a788e7b16c1de1f92e6 failed, status: { "errorResult" : { "message" : "Not found: Dataset lumi-210601:attribution was not found in location US", "reason" : "notFound" }, "errors" : [ { "message" : "Not found: Dataset lumi-210601:attribution was not found in location US", "reason" : "notFound" } ], "state" : "DONE" }
I've tried doing the following already:
Under dataprep "Project Settings" I've et the regional endpoint to asia-east1 and the zone to australia-southeast-1a
Have set under "Profile" all of the upload/job run dir/temp dir to new directories in a bucket that belongs to the australia south east
In the flow output I'm just doing a basic CSV output to test with dataprep execution settings to be australia region as well
I can't seem to find any other references to US anywhere. If I go into the dataflow UI I can see that the jobs ran and failed on the asia-east region as well

Pyspark - read data from elasticsearch cluster on EMR

I am trying to read data from elasticsearch from pyspark. I was using the elasticsearch-hadoop api in Spark. The es cluster sits on aws emr, which requires credential to sign in. My script is as below:
from pyspark import SparkContext, SparkConf sc.stop()
conf = SparkConf().setAppName("ESTest") sc = SparkContext(conf=conf)
es_read_conf = { "es.host" : "vhost", "es.nodes" : "node", "es.port" : "443",
"es.query": '{ "query": { "match_all": {} } }',
"es.input.json": "true", "es.net.https.auth.user": "aws_access_key",
"es.net.https.auth.pass": "aws_secret_key", "es.net.ssl": "true",
"es.resource" : "index/type", "es.nodes.wan.only": "true"
}
es_rdd = sc.newAPIHadoopRDD( inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
keyClass="org.apache.hadoop.io.NullWritable",
valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable", conf=es_read_conf)
Pyspark keeps throwing error:
py4j.protocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [HEAD] on
[index] failed; servernode:443] returned [403|Forbidden:]
I checked everything which all made sense except for the user and pass entries, would aws access key and secret key work here? We don't want to use the console user and password here for security purpose. Is there a different way to do the same thing?