Installing Scrapy in Apache Airflow causes INVALID_ARGUMENT - google-cloud-platform

I'm trying to install Scrapy from PyPI using the command below.
gcloud composer environments update $(AIRFLOW_ENVIRONMENT_NAME) \
--update-pypi-packages-from-file requirements.txt \
--location $(AIRFLOW_LOCATION)
My requirements.txt looks like this:
google-api-python-client==1.7.*
google-cloud-datastore==1.7.*
Scrapy==2.0.0
After running the gcloud command, it fails with an invalid argument, even though the same file installs successfully in my local environment.
gcloud composer environments update xxxx \
--update-pypi-packages-from-file requirements.txt \
--location asia-northeast1
ERROR: (gcloud.composer.environments.update) INVALID_ARGUMENT: Found 1 problem:
1) Error validating key Scrapy. PyPi dependency name is not formatted properly. It must be lowercase and follow the format of 'identifier' specified in PEP-508.
Is there any way to install it?

As the previous answer stated, the error you are receiving is quite clear: it is caused by the wrong formatting of the dependency. It should be scrapy==2.0.0 instead of Scrapy==2.0.0 inside requirements.txt.
I would like to add that, to avoid an installation error once you fix the formatting, you should add one more dependency to your list, and that is attrs==19.2.0. I was able to install your requirements in my environment by specifying the following list:
google-api-python-client==1.7.*
google-cloud-datastore==1.7.*
scrapy==2.0.0
attrs==19.2.0
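With the corrected file in place, re-running the same update command from the question (environment name and location as in the original post) should now pass validation:
gcloud composer environments update xxxx \
--update-pypi-packages-from-file requirements.txt \
--location asia-northeast1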

Even after you adjust the package name in the requirements.txt file according to PEP-508, formatting the package name in lowercase as scrapy==2.0.0, the issue will most probably remain the same and the update process will get stuck with the error:
Failed to install PyPI packages
Generally, this kind of error appears when the source PyPI package has external dependencies, or when the package is sensitive to system-level libraries that GCP Composer doesn't support.
In this case the vendor recommends two approaches: either use KubernetesPodOperator to build your own custom image and run it in a dedicated Kubernetes Pod, or deploy the PyPI package as a local Python library, uploading the shared object libraries for the dependency to the Airflow /plugins directory; find more info here.
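For the /plugins route, a minimal sketch of the upload step (the .so file name is hypothetical; use whatever shared libraries your dependency actually needs):
# upload a shared object library into the environment's plugins folder
gcloud composer environments storage plugins import \
--environment xxxx \
--location asia-northeast1 \
--source libfoo.so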

Related

Unable to execute a step on a running EMR

I have an EMR cluster (5.28.1) running in AWS, but I forgot to install some Python libraries as part of the bootstrap action. Now that the cluster is running, I was simply attempting to add a step via the EMR console. Here are my settings:
JAR: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
Main class: None
Arguments: s3://xxxx/install_python_libraries.sh
Unfortunately, I get the following error.
Cannot run program "s3://xxxxx/install_python_libraries.sh" (in directory "."): error=2, No such file or directory
I am not sure what I am doing wrong. The shell script looks like this.
#!/bin/bash -xe
# Non-standard and non-Amazon Machine Image Python modules:
sudo pip-3.6 install boto3
sudo pip-3.6 install xmltodict
I also tried this by simply using 'command-runner.jar', but I get the same error. Can you please help me figure out the problem so I can do this via the console? I would like to install the libraries on all nodes - master and core.
Thanks
The issue is the xxx.sh file's EOL/carriage-return type.
In other words, if it is Windows ("\r\n") then it will not work and will return the 'file not found' error.
Convert it to unix type ("\n") using something like notepad++ and it will run fine.
(In notepad++ edit>EOL Conversion>Unix(LF) hit save and try again)
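If you prefer the terminal over Notepad++, a quick sketch of the same conversion (script name taken from the question) is:
# strip Windows carriage returns in place (GNU sed)
sed -i 's/\r$//' install_python_libraries.sh
# or, if dos2unix is available
dos2unix install_python_libraries.sh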

Composer outdated shows drupal/core has a newer version, but composer update says "nothing to update"

This is similar to an unanswered question from a year ago. Supposedly I have an update for drupal/core:
$ composer outdated "drupal/*"
drupal/core 8.6.10 8.6.12 Drupal is an open source content ...
But when I run update ...
$ composer update drupal/core --with-dependencies
Dependency "asm89/stack-cors" is also a root requirement, but is not explicitly whitelisted. Ignoring.
Dependency "composer/semver" is also a root requirement, but is not explicitly whitelisted. Ignoring.
[ ... ]
Loading composer repositories with package information
Updating dependencies (including require-dev)
Nothing to install or update
Package phpunit/phpunit-mock-objects is abandoned, you should avoid using it. No replacement was suggested.
Generating autoload files
> Drupal\Core\Composer\Composer::preAutoloadDump
> Drupal\Core\Composer\Composer::ensureHtaccess
I'm trying to follow the instructions to update drupal 8 via composer found here: https://www.drupal.org/docs/8/update/update-core-via-composer
I had the same issue today when updating Drupal, and the following process helped me solve it.
Run the composer update command using the specific version you are trying to update to. In this instance it would be composer require drupal/core:8.6.12 --update-with-dependencies. If there is an issue blocking the update, this should show you a list of problems. In my case I tried to update to version 8.6.11 and it output the following:
Problem 1
- Installation request for drupal/core 8.6.11 -> satisfiable by drupal/core[8.6.11].
- Can only install one of: twig/twig[1.x-dev, v1.35.3].
- Can only install one of: twig/twig[v1.35.3, 1.x-dev].
- Can only install one of: twig/twig[1.x-dev, v1.35.3].
- drupal/core 8.6.11 requires twig/twig ^1.38.2 -> satisfiable by twig/twig[1.x-dev, v1.38.2].
- Conclusion: don't install twig/twig v1.38.2
- Installation request for twig/twig (locked at v1.35.3, required as ^1.35.0) -> satisfiable by twig/twig[v1.35.3].
If there is no problem listed, try clearing the composer cache with composer clearcache and then try the update command again.
You can also try running the why-not composer command to see if that highlights any issues: composer why-not drupal/core:8.6.12
In my case the issue was that the twig component required for 8.6.12 was v1.38.2, but it was capped at a lower version (1.35) in the composer file. I used the following command to update the twig version, and that allowed me to update to Drupal 8.6.12 using my normal update process.
composer require twig/twig:^1.38.2
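Putting those steps together, a minimal troubleshooting sequence (versions are the ones from this thread; substitute your own) looks like:
# see what is blocking the target version
composer why-not drupal/core:8.6.12
# clear the cache if nothing obvious shows up
composer clearcache
# raise the blocking constraint (twig, in this case)
composer require twig/twig:^1.38.2
# retry the core update with dependencies
composer require drupal/core:8.6.12 --update-with-dependencies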
I hope this helps.

GCP Deployment Manager error

When I try to use the project creation template which is on GitHub, even after changing the appropriate values in config.yaml, I am getting the following error.
location: /deployments/projectcreation000/manifests/manifest-1534790908361
message: 'Manifest expansion encountered the following errors: Error compiling Python code: No module named apis Resource: project.py Resource: config'
You can find the repo link here: https://github.com/GoogleCloudPlatform/deploymentmanager-samples/tree/master/examples/v2/project_creation
Please help, as I need it for a production workflow. I have tried "sudo pip install apis" in Cloud Shell, but it does not help, even after successful installation of an apis module.
You either need to fix the import or move the file so that apis.py will be found.
The apis module in this context refers to the apis.py file that ships with the sample, not a pip package. Ensure you have all the files in the same relative paths to each other when deploying these samples.
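As a sketch, cloning the whole samples repo and deploying from the example's own directory keeps those relative paths intact (the deployment name is illustrative):
git clone https://github.com/GoogleCloudPlatform/deploymentmanager-samples.git
cd deploymentmanager-samples/examples/v2/project_creation
# project.py, apis.py, and config.yaml now sit side by side
gcloud deployment-manager deployments create projectcreation000 --config config.yaml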

Elastic Beanstalk: incremental git push

When I would like to push incremental changes to the AWS Elastic Beanstalk solution I get the following:
$ git aws.push
Updating the AWS Elastic Beanstalk environment None...
Error: Failed to get the Amazon S3 bucket name
I've already added FULLS3Access to my AWS user's policies.
I had a similar issue today, and here are the steps I followed to investigate:
I modified line 133 of .git/AWSDevTools/aws/dev_tools.py to print the exception, like so:
except Exception, e:
    print e
(Be careful with the spacing: Python is whitespace-sensitive.)
I ran the command git aws.push again, and here is the exception printed:
BotoServerError: 403 Forbidden
{"Error":{"Code":"SignatureDoesNotMatch","Message":"Signature not yet current: 20150512T181122Z is still later than 20150512T181112Z (20150512T180612Z + 5 min.)","Type":"Sender"},"
The issue was a time difference between the server and my machine; once I corrected the clock, it started working fine.
The printed exception helps you find the exact root cause; it may be related to the secret key as well.
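For the clock-skew case specifically, a minimal fix on Linux (assuming ntpdate is installed; the NTP host is just a common default) is:
# re-sync the system clock so request signatures carry a valid timestamp
sudo ntpdate pool.ntp.org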
It may have something to do with the boto library (related thread). If you are on Ubuntu/Debian, try this:
Remove old version of boto
sudo apt-get remove python-boto
Install newer version
sudo apt-get install python-pip
sudo pip install -U boto
Other systems (e.g. Mac)
Via easy_install
sudo easy_install pip
pip install boto
Or simply build from source
git clone git://github.com/boto/boto.git
cd boto
python setup.py install
Had the same problem a moment ago.
Note:
I just noticed your environment is called None. Did you follow all the instructions and execute eb config/eb init?
One more try:
Add export PATH=$PATH:<path to unzipped eb CLI package>/AWSDevTools/Linux/ to your path and execute AWSDevTools-RepositorySetup.sh; maybe something is wrong with your repository setup (notice the None weirdness). Other possible solutions:
Double-check your AWS credentials (maybe you are using different key IDs or the wrong credentials-file format)
Old/mismatching versions of the eb client and Python (check with eb -v and python --version) (the current client is this)
Use Amazon's policy validator to double-check that your AWS user is allowed to perform all actions
If all that doesn't help, I'm out of options. Good luck.

Setting Content-Type for static website hosted on AWS S3

I'm hosting a static website on S3. To push my site to Amazon I use the s3cmd command line tool. All works fine except setting the Content-Type to text/html;charset=utf-8.
I know I can set the charset in the meta tag in the HTML file, but I would like to avoid it.
Here is the exact command I'm using:
s3cmd --add-header='Content-Encoding':'gzip'
--add-header='Content-Type':'text/html;charset=utf-8'
put index.html.gz s3://www.example.com/index.html
Here is the error I get:
ERROR: S3 error: 403 (SignatureDoesNotMatch): The request signature we calculated does not match the signature you provided. Check your key and signing method.
If I remove the ;charset=utf-8 part from the above command it works, but the Content-Type gets set to text/html not text/html;charset=utf-8.
A two-step process will solve your problem.
(1) Upgrade your installation of s3cmd. Version 1.0.x does not have the capability to set the charset. Install from master on GitHub. Master includes fixes for this (1) bug and this (2) bug, which caused the failure to recognize the format of the content-type and the "called before definition" problem in earlier versions.
To install s3cmd from master on OSX do the following:
git clone https://github.com/s3tools/s3cmd.git
cd s3cmd/
sudo python setup.py install (sudo optional based on your setup)
Make sure your Python libraries are in your path by adding the following to your .profile, .bashrc, or .zshrc (again, depending on your system).
export PATH="/Library/Frameworks/Python.framework/Versions/2.7/bin:$PATH"
But if you use Homebrew, this might cause conflicts, so just symlink to the executable instead:
ln -s /Library/Frameworks/Python.framework/Versions/2.7/bin/s3cmd /usr/local/bin/s3cmd
Close terminal and reopen.
s3cmd --version
will still output
s3cmd version 1.5.0-alpha3 - but it's the patched version.
(2) Once upgraded, use:
s3cmd --acl-public --no-preserve --add-header="Content-Encoding:gzip" --add-header="Cache-Control:public, max-age=86400" --mime-type="text/html; charset=utf-8" put index.html s3://www.example.com/index.html
If the upload succeeds and sets the Content-Type to "text/html; charset=utf-8" but you see this error in the process:
WARNING: Module python-magic is not available...
I prefer to live without python-magic: I find that if you don't specifically set the mime-type, python-magic often guesses wrong. If you do install python-magic, be sure to set --mime-type="application/javascript" in s3cmd, or python-magic will guess gzipped JS to be "application/x-gzip" if you gzip your js locally.
Install python-magic:
sudo pip install python-magic
pip broke with the recent OSX upgrade, so you may need to update pip:
sudo easy_install -U pip
That will do it. All this works with s3cmd sync too, not just put. I suggest you wrap s3cmd sync in a thor-type task so you don't forget to set the mime-type on any particular file (if you are using python-magic on gzipped files).
This is a gist of an example thor task for deploying a static Middleman site to s3. This task allows you to rename files locally and use s3cmd sync rather than using S3cmd put to rename them one-by-one.
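As a closing sketch of that advice (bucket and file names are illustrative, not from the gist), syncing gzipped assets with explicit mime-types might look like:
# gzipped HTML, served as UTF-8 text
s3cmd --acl-public --no-preserve --add-header="Content-Encoding:gzip" --mime-type="text/html; charset=utf-8" put index.html.gz s3://www.example.com/index.html
# gzipped JS: set the type explicitly so python-magic cannot guess application/x-gzip
s3cmd --acl-public --no-preserve --add-header="Content-Encoding:gzip" --mime-type="application/javascript" put app.js.gz s3://www.example.com/app.js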