task.py: error: unrecognized arguments: --job-dir when launching training with gcloud ai-platform - google-cloud-ml

I keep running into the unrecognized arguments issue when submitting a training job. This post is similar to mine, but I don't understand what the accepted answer was about.
I have tried adding --job-dir as a user-defined argument and popping it from my arguments in my task.py:
args = parser.parse_args()
arguments = args.__dict__
arguments.pop('job-dir')
arguments.pop('job_dir')
but that didn't work.
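For reference, a minimal sketch of that approach with argparse (the other arguments mirror the ones in my submit command below; note that argparse exposes --job-dir as the attribute job_dir, so that is the only key that exists):
import argparse

parser = argparse.ArgumentParser()
# Declare --job-dir so parse_args() accepts the flag that gcloud appends
parser.add_argument('--job-dir', default=None)
parser.add_argument('--config_path')
parser.add_argument('--mode')
parser.add_argument('--look_forward', type=int)
args = parser.parse_args()

arguments = args.__dict__
# argparse normalizes --job-dir to job_dir, so pop that key (default avoids KeyError)
job_dir = arguments.pop('job_dir', None)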
This is my command to submit the training:
gcloud ai-platform jobs submit training model2_60days_23 \
--scale-tier basic \
--package-path C:/Users/me/ml/trainer \
--module-name trainer.task \
--job-dir=gs://my_bucket/ML/job_output \
--region us-east1 \
--python-version 3.5 \
--runtime-version 1.13 \
-- \
--config_path="gs://my_bucket/ML/config/model_params.json" \
--mode=train \
--look_forward=60
How can I resolve this?

Related

Unable to create a Windows Core VM instance using gcloud

I wanted to create a Windows VM using the gcloud command line.
I tried the "Equivalent Command Line" syntax, but it failed.
After some trial and error, I discovered that the --create-disk list of parameters needs to be repeated (see the script below).
gcloud compute instances create ifworker-0 \
--project=ceng-test \
--zone=us-east4-c \
--machine-type=n2-standard-2 \
--network-interface=nic-type=VIRTIO_NET \
--network-tier=PREMIUM \
--maintenance-policy=MIGRATE \
--provisioning-model=STANDARD \
--service-account=the-service-account \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--tags=ifworker-net-0 \
--create-disk=mode=rw \
--create-disk=size=40GB \
--create-disk=type=projects/ceng-test/zones/us-central1-a/diskTypes/pd-balanced \
--create-disk=boot=yes \
--create-disk=auto-delete=yes \
--create-disk=image=projects/windows-cloud/global/images/windows-server-2022-dc-core-v20220513 \
--no-shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--reservation-affinity=any
However, even then the script is failing - the error is reproduced below.
ERROR: (gcloud.compute.instances.create) Could not fetch resource:
- Invalid value for field 'resource.disks[0]': '{
"type": "PERSISTENT",
"mode": "READ_WRITE",
"boot": true,
"initializeParams": { },
"autoDele...'.
Boot disk must have a source specified.
Need some guidance here. Thanks for your attention and time.
Looking at your command, the boot and image properties should be on the same line.
It should look like this:
--create-disk=boot=yes,image=projects/windows-cloud/global/images/windows-server-2022-dc-core-v20220513
Per GCP's documentation, the image property should be included in the same --create-disk=[PROPERTY=VALUE,…] flag, specifying the name of the image the disk will be initialized from.
Below is the command that worked on my end:
gcloud compute instances create ifworker-0 \
--project=<project_name> \
--zone=us-east4-c \
--machine-type=n2-standard-2 \
--network-interface=nic-type=VIRTIO_NET \
--network-tier=PREMIUM \
--maintenance-policy=MIGRATE \
--provisioning-model=STANDARD \
--service-account=the-service-account \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--tags=ifworker-net-0 \
--create-disk=mode=rw \
--create-disk=size=40GB \
--create-disk=type=projects/ceng-test/zones/us-central1-a/diskTypes/pd-balanced \
--create-disk=boot=yes,image=projects/windows-cloud/global/images/windows-server-2022-dc-core-v20220513 \
--create-disk=auto-delete=yes \
--no-shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--reservation-affinity=any
Note:
Change <project_name> and/or service account details.
For "gcloud compute instances create" there should be only one --create-disk line per disk. In other cases multiple disks are created.
As we want only one disk, there should be only one line, with all parameters delimited by ",".
The correct example follows.
gcloud compute instances create ifworker-0 \
--project=<project_name> \
--zone=us-east4-c \
--machine-type=e2-micro \
--network-interface=nic-type=VIRTIO_NET \
--network-tier=PREMIUM \
--maintenance-policy=MIGRATE \
--provisioning-model=STANDARD \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--tags=ifworker-net-0 \
--create-disk=mode=rw,size=40GB,type=projects/<project_name>/zones/us-central1-a/diskTypes/pd-balanced,boot=yes,auto-delete=yes,image=projects/windows-cloud/global/images/windows-server-2022-dc-core-v20220513 \
--no-shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--reservation-affinity=any

Purge parachain issue

I want to purge my parachain collator node, but I got this error
Input("Error parsing spec file: missing field `relay_chain` at line 143 column 1")(cannot purge parachain)
This is the command I used to purge my parachain
./target/release/parachain-collator purge-chain --base-path /tmp/parachain/alice --chain rococo-custom.json
This is the command I used to run this parachain-collator
./target/release/parachain-collator \
--alice \
--collator \
--force-authoring \
--parachain-id 2000 \
--base-path /tmp/parachain/alice \
--port 40333 \
--ws-port 8844 \
-- \
--execution wasm \
--chain rococo-custom.json \
--port 30343 \
--ws-port 9977
Thank you so much for your help!
./XXX/parachain-collator purge-chain --base-path <your collator DB path set above>
There is no need to pass the chain spec arguments; with the base path from the question above, that would be ./target/release/parachain-collator purge-chain --base-path /tmp/parachain/alice.

CloudML job + verbosity == Error

Running the dataeng-machine-learning codelab, on step 9 (4. Feature Engineering).
The notebook cell for running a training job is:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100
That works great no matter how many times I run it.
However if I run:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100 \
--verbosity DEBUG
The job fails after about 40 seconds with this in the logs:
The replica master 0 exited with a non-zero status of 2. Termination reason: Error.
I've found this usage here:
https://cloud.google.com/ml-engine/docs/how-tos/getting-started-training-prediction#cloud-train-single
So I guess it's OK to use.
What am I doing wrong?
Note that every argument after the "-- \" line is passed through to the TensorFlow code and is therefore dependent on the individual sample code.
In this case, the --verbosity flag isn't supported by the sample you are running. Looking at the samples repo, it looks like the only sample that has that flag is the census estimator sample.
The taxifare example currently hardcodes the log level to INFO, and its code doesn't parse a --verbosity flag.
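If you do want the job to honor that flag, you would have to declare it in the trainer's own argument parser and apply it to TF logging yourself. A minimal sketch of that idea, assuming TF 1.x-style logging as used in this codelab (the wiring below is illustrative, not the taxifare sample's actual code):
import argparse
import tensorflow as tf

parser = argparse.ArgumentParser()
# Accept --verbosity as a user argument passed after the "--" separator
parser.add_argument('--verbosity',
                    default='INFO',
                    choices=['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'])
args, _ = parser.parse_known_args()
# Map the string onto the matching tf.logging constant instead of hardcoding INFO
tf.logging.set_verbosity(getattr(tf.logging, args.verbosity))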

Texture Packer with Xcode

I added a TexturePacker script to export a sprite sheet and it's working. I would like to know how to set 'Pre Multiply Alpha' and 'NPot any size' while exporting the sheet through the Xcode script.
Here is my present Code:
TP="/usr/local/bin/TexturePacker"
${TP} --smart-update \
--format cocos2d \
--padding 2 \
--main-extension "-ipadhd" \
--autosd-variant 0.5:-ipad \
--autosd-variant 0.5:-hd \
--autosd-variant 0.25: \
--opt RGBA8888 \
--data iOS/Resources/Game_SpriteSheet/CBirdSpriteSheet_1-ipadhd.plist \
--sheet iOS/Resources/Game_SpriteSheet/CBirdSpriteSheet_1-ipadhd.pvr.ccz \
SpriteSheet/Sprite_Sheet_1/*.png
Attached is a screenshot from the standalone TexturePacker app; I want the same settings in the script.
Have you tried adding the --premultiply-alpha and --size-constraints <value> options to the command? [1]
TP="/usr/local/bin/TexturePacker"
${TP} --smart-update \
--format cocos2d \
--padding 2 \
--main-extension "-ipadhd" \
--autosd-variant 0.5:-ipad \
--autosd-variant 0.5:-hd \
--autosd-variant 0.25: \
--opt RGBA8888 \
--premultiply-alpha \
--size-constraints NPOT \
--data iOS/Resources/Game_SpriteSheet/CBirdSpriteSheet_1-ipadhd.plist \
--sheet iOS/Resources/Game_SpriteSheet/CBirdSpriteSheet_1-ipadhd.pvr.ccz \
SpriteSheet/Sprite_Sheet_1/*.png
[1] http://www.codeandweb.com/texturepacker/documentation

Issue with receiving imap email in redmine

My incoming emails keep getting ignored and not filed into the correct project. What am I missing here?
rake -f /home/kickapps/redmine/Rakefile redmine:email:receive_imap \
RAILS_ENV="production" \
host=imap.gmail.com \
ssl=SSL \
port=993 \
move_on_success=FILED \
move_on_failure=IGNORED \
username=redmine#kitops.com \
password=*************** \
unknown_user=accept \
no_permission_check=1 \
project=test \
allow_override=project,tracker
If you don't see the email become read in Gmail, try adding --trace at the end of your rake parameters (you should get a rake error). The email must be unread/new in the Gmail mailbox, or the rake task won't pick it up because it thinks it has already read it.
Another gotcha: port 993 blocked by a firewall between Redmine and Gmail.
Check rails log/production.log right after running the rake task to see whether there's an error message about the mail.
Assuming the rake task is reading the mail and changing its status in Gmail, then it might be the parameters. I notice your ssl setting is different from what I had: it's 1 on my install, so double-check that. Also note that project must be the project identifier, not the name, and status must be a status used in the project (check the drop-downs in Redmine). I haven't tried unknown_user, no_permission_check, or allow_override.
rake -f /home/kickapps/redmine/Rakefile redmine:email:receive_imap \
RAILS_ENV="production" \
host=imap.gmail.com \
ssl=1 \
port=993 \
username=redmine#kitops.com \
password=*************** \
project=test \
status=assigned \
unknown_user=accept \
no_permission_check=1 \
allow_override=project,tracker