cloudbuild.yaml does not unmarshal when using a base64-encoded value on a build trigger - google-cloud-platform

In my cloudbuild.yaml definition, I used to have a secrets section to get environment values from Google KMS. The secretEnv fields had keys mapping to 'encrypted + base64-encoded' values:
- kmsKeyName: <API_PATH>
I've tried to put this value in a substitution instead, which is replaced when a build trigger is used:
- kmsKeyName: <API_PATH>
<KEY>: ${_VALUE}
With that I intend to keep the file generic.
However, the build process keeps failing with the message "failed unmarshalling build config cloudbuild.yaml: illegal base64 data at input byte 0". I've checked several times, and the base64 value was copied correctly into the substitution on the trigger.
Thank you in advance.
After carefully reading the "Using user-defined substitutions" section, I saw that:
The length of a parameter key is limited to 100 bytes and the length of a parameter value is limited to 4000 bytes.
Mine was a 253-character long string.

I managed to reproduce an error similar to yours (exactly this one: "Failed to trigger build: failed unmarshalling build config cloudbuild.yaml: json: cannot unmarshal string into Go value of type map[string]json.RawMessage"), but only when my variable was written as "name:content" instead of "name: content". Note the whitespace after the colon; it matters.
Then, going back to your point... user-defined substitutions are limited to 255 characters (yes, docs are currently wrong and this has been reported). But, for example, if you use something like:
variable_name: cool_really_long_content_but_still_no_255_chars
And then you do this:
- name: ""
args: ["build", "-t", "$PROJECT_ID/$cool_really_long_content_but_still_no_255_chars", "."]
It will still fail if the expanded string "$PROJECT_ID/$cool_really_long_content_but_still_no_255_chars" is longer than 255 characters, even when your really long content on its own is under 255 characters. That error appears in Build details > Logs, not in the popup you see when you click "Run trigger" in the Build triggers section of Google Cloud Build. The error you reported is of the popup kind, because in that case the logs are shown as disabled in the Build details section.
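To make the point about the expanded length concrete, here is a minimal Python sketch that expands placeholders locally and compares the result against a limit. It is not part of Cloud Build: the helper names expand and too_long are hypothetical, Python's string.Template is only used because it happens to understand the same $NAME and ${NAME} syntax, and the 255 figure is the one reported in the answer above rather than a value taken from the documentation.

```python
from string import Template

# Limit reported in the answer above. The documentation quoted earlier gives
# a different number, so treat this value as an assumption.
MAX_EXPANDED_LEN = 255


def expand(arg: str, substitutions: dict) -> str:
    """Expand $NAME and ${NAME} placeholders, leaving unknown ones untouched."""
    return Template(arg).safe_substitute(substitutions)


def too_long(arg: str, substitutions: dict, limit: int = MAX_EXPANDED_LEN) -> bool:
    """Return True if the argument exceeds the limit after expansion."""
    return len(expand(arg, substitutions)) > limit
```

For example, too_long("$PROJECT_ID/$_IMAGE", {"PROJECT_ID": "my-project", "_IMAGE": "x" * 250}) is True even though the 250-character value alone is under the limit, because the project prefix pushes the expanded string past it.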


Filebeat regex for tomcat

I'm having trouble getting the correct regex for filebeat when using tomcat and log4j.
For this log:
21/10/2022 16:04:37 ERROR en Clase: ExceptionLogger - MSN: test
ErrorCode: 0
Usuario: test
I've configured this pattern: '^[[:space:]]' with negate=false and match=after (as the documentation says) but it doesn't work.
Even when I test it in the Go playground, it looks like it should work.
Here's what we have for configuration for log4j-based files with a slightly different pattern, but you should be able to adapt it to your situation:
multiline.type: pattern
multiline.pattern: '^\d{4}-\d{2}-'
multiline.negate: true
multiline.match: after
Here's an example standard log4j log line:
2022-10-22 13:55:34,932 [pool-8-thread-1] TRACE fully.qualified.class.Name- Here's the raw message
Here's an example exception message:
2022-10-21 20:14:42,442 [catalina-exec-6] ERROR fully.qualified.class.Name- Main error message
fully.qualified.exception.Type: Exception error message
at stack.trace.class.method(
at stack.trace.class.method(
at stack.trace.class.method(
at stack.trace.class.method(
So we are just looking for log lines starting with dddd-dd- and assuming that those are always "new log entries". We could certainly confuse things with a log line that was a continuation of something previous which started with that same pattern, but that's very rare.
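The grouping rule described above can be tried out offline with a small Python sketch. This is an emulation for experimenting, not Filebeat itself: group_multiline is a hypothetical helper, Python's re module is not the regex engine Filebeat uses, and the dd/mm/yyyy pattern in the sample is an assumed adaptation to the log format from the question.

```python
import re


def group_multiline(lines, pattern, negate=True):
    """Group lines into events the way match: after is described above.

    With negate=True, a line that does NOT match the pattern is appended to
    the previous event and a matching line starts a new event. With
    negate=False the roles are swapped.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matched = regex.search(line) is not None
        is_continuation = (not matched) if negate else matched
        if is_continuation and events:
            events[-1].append(line)
        else:
            events.append([line])
    return ["\n".join(event) for event in events]


# Sample taken from the question; every event starts with a dd/mm/yyyy date.
SAMPLE = [
    "21/10/2022 16:04:37 ERROR en Clase: ExceptionLogger - MSN: test",
    "ErrorCode: 0",
    "Usuario: test",
]
merged = group_multiline(SAMPLE, r"^\d{2}/\d{2}/\d{4}", negate=True)
```

As the sample is shown in the question, the continuation lines do not begin with whitespace, so a pattern that looks for leading whitespace with negate=false would leave each line as its own event in this emulation, while the date-prefix pattern with negate=true merges all three lines into one.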

Checks not running on every file (when running -a) even though they are parsed

Summary of the problem
I've set up a repository with the checks I developed, designed to be used exclusively in another repository.
The problem I've been facing is that when I run pre-commit run -a -v the checks don't go through every file, and from time to time the set of files they do check even changes!
A little more details
I've run the identity check and it prints every file in the repo, meaning the files are read by pre-commit (per my understanding).
When I execute pre-commit run, the staged files are correctly parsed.
Why do I think that the checks don't run on every file?
I always print something to the standard output for every check, and when I run it verbosely it only prints 5 to 7 lines, meaning it checked only those.
If I edit a file that is not on the short list (breaking a check), the check still passes.
Some code
The structure of the folder containing the files to be checked
Summary of the .pre-commit-hooks.yaml
- id: team-check-name
name: blabla
description: blabla
entry: team-checkname
files: 'team/.*/'
types: [markdown]
language: python
- id: article-check-name
name: blabla
description: blabla
entry: article-checkname
files: 'articles/.*/'
types: [markdown]
language: python
Essentially the checks supposed to run on articles/.*/ start with article- and they all have this property: files: 'articles/.*/'.
Similarly every check supposed to run on team/.*/ starts with team- and they all have files: 'team/.*/'
Summary of the .pre-commit-config.yaml
- repo:
rev: commit_hash
- id: team-check-name
- id: article-check-name
But as you can see, I don't override any settings, so the config shouldn't interfere.
the hook tools you've written are incorrect -- they only process sys.argv[1] rather than positional arguments
your code uses a lot of global variables so it's not a straightforward refactor -- typically you'd either use argparse to collect nargs='*' or loop over sys.argv[1:] (if you don't have any options)
as to why this is the convention -- it's very wasteful to start a linter process over and over to lint a single file (often times executable startup cost dwarfs the actual linting / formatting process)
disclaimer: I wrote pre-commit
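A minimal sketch of the argparse approach mentioned in the answer, assuming the hook has no options of its own. The check_file function is a hypothetical stand-in for the real check (here it simply fails on empty files); the part that matters is that main loops over every filename pre-commit passes, not just the first one.

```python
import argparse


def check_file(path: str) -> bool:
    """Hypothetical check: pass if the file has non-whitespace content."""
    with open(path, encoding="utf-8") as f:
        return bool(f.read().strip())


def main(argv=None) -> int:
    parser = argparse.ArgumentParser()
    # pre-commit appends the matching filenames as positional arguments,
    # possibly many per invocation, so collect all of them.
    parser.add_argument("filenames", nargs="*")
    args = parser.parse_args(argv)

    exit_code = 0
    for filename in args.filenames:
        print(f"checking {filename}")
        if not check_file(filename):
            print(f"{filename}: check failed")
            exit_code = 1
    return exit_code
```

If the entry names in .pre-commit-hooks.yaml are console scripts, each would point at a main like this one, and a non-zero return value marks the hook as failed.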

What does the GridMix input format look like?

I used Rumen to mine job-history files, which produced job-trace.json and job-topology.json.
GridMix usage looks like:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-gridmix-2.7.3.jar -libjars $HADOOP_HOME/share/hadoop/tools/lib/hadoop-rumen-2.7.3.jar -Dgridmix.compression-emulation.enable=false <iopath> <trace>
Here <iopath> is the working directory for GridMix, so I passed file:///home/hadoop/input, and <trace> is the trace file extracted from the log files, so I passed file:///home/hadoop/rumen/job-trace-1hr.json.
In the end, I got the following errors:
2019-03-07 16:37:12,495 ERROR [main] gridmix.Gridmix ( - Startup failed. Found no satisfactory file in file:/home//hadoop/input
2019-03-07 16:37:13,040 INFO [main] util.ExitUtil ( - Exiting with status 2
2019-03-07 16:37:13,041 INFO [Thread-1] gridmix.Gridmix ( - Exiting...
So what should this parameter look like, and how is it used?
Does anyone have any ideas?
I found it was my own incorrect usage.
I checked the GridMix parameter documentation, and the cause was that my input data was too small:
gridmix.min.file.size | The minimum size of the input files. The default limit is 128 MiB. Tweak this parameter if you see an error-message like "Found no satisfactory file" while testing GridMix with a relatively-small input data-set.
So I generated a larger input data set using -generate 10G.
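Before rerunning, a quick local check like the Python sketch below can show whether any file in the input directory reaches the size threshold. It only inspects a local filesystem path (not HDFS), the helper name satisfactory_files is hypothetical, and GridMix's actual file selection logic may apply further criteria; the 128 MiB default is the value quoted from the documentation above.

```python
from pathlib import Path

# Default of gridmix.min.file.size as quoted above: 128 MiB.
DEFAULT_MIN_FILE_SIZE = 128 * 1024 * 1024


def satisfactory_files(directory, min_size=DEFAULT_MIN_FILE_SIZE):
    """Return files under a local directory that are at least min_size bytes."""
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and p.stat().st_size >= min_size
    )
```

An empty result for the input directory would be consistent with the "Found no satisfactory file" message.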

GATE_Using for Thesis_Run-time Error

When I try to run a corpus pipeline on my language resources, it throws the exception below (even though I follow the order Document Reset, English Tokeniser, Sentence Splitter).
Can someone help me with the process to debug this run-time error?
gate.creole.ExecutionException: No sentences or tokens to process in document Password_Safe-window1.txt_0003E
Please run a sentence splitter and tokeniser first!
at gate.creole.POSTagger.execute(
at gate.util.Benchmark.executeWithBenchmarking(
at gate.creole.SerialController.runComponent(
at gate.creole.SerialController.executeImpl(
at gate.creole.SerialAnalyserController.executeImpl(
at gate.creole.SerialAnalyserController.execute(
at gate.util.Benchmark.executeWithBenchmarking(
at gate.gui.SerialControllerEditor$RunAction$
The files are not empty. When I applied @dedek's suggestion, it threw no errors, but it raised another problem:
Exception in thread "ApplicationViewer1" java.lang.OutOfMemoryError: Java heap space
I think it is because your document is empty.
Can you confirm that?
The POSTagger has a run-time parameter, failOnMissingInputAnnotations. Set it to false and it should be OK.
See also the docs:
failOnMissingInputAnnotations - if set to false, the PR will not fail with an ExecutionException if no input Annotations are found and instead only log a single warning message per session and a debug message per document that has no input annotations (run-time, default = true).
Concerning the OutOfMemoryError: Java heap space
See following questions:
Getting OOM while using GATE on large data set
GATE PersistenceManager.loadObjectFromFile outofmemory error while loading .gapp files
JAVA PermGem memory

Using file.managed for downloading a file in Salt

salt.states.file.managed takes source_hash as an argument to verify a downloaded file. This blocks me from using file.managed for a file on an online server I don't have control over. The file also changes regularly. My configuration looks like this.
- name: localfile.tar.gz
- source:
- source_hash: ???
I don't want to use curl or wget because this would always download the file, even when it's already on the local machine.
I would like to know if one of the options below is possible/exists:
online md5 calculation service. Is there any way of getting an md5 hash of the file, using a free web service? I'm thinking of something like{url-to-file}.
salt-internal conversion or workaround. Is it possible to handle this in Salt? Maybe by leaving out source_hash somehow?
alternative state. Is there another state in Salt for doing something like this, without losing the benefit of only downloading the file when needed?
If you can't control the other server, please make sure that you can trust it to download its content. Not using a hash will prevent you from detecting partial or corrupted downloads. There's also no way to work with a file that has changed on the remote server.
Nevertheless you could use a state like this to circumvent the hashcode. The creates part will prevent a second download once the file has been downloaded:
- name: curl -L -o /etc/salt/cloud.deploy.d/
- creates: /etc/salt/cloud.deploy.d/
Downloading a file with file.managed has been possible since version 2016.3.0, even if you don't have access to the hash, by adding skip_verify: True. For the example given, it would be:
- name: localfile.tar.gz
- source:
- skip_verify: True
From the docs:
If True, hash verification of remote file sources (http://, https://, ftp://) will be skipped, and the source_hash argument will be ignored.