Where to write DAG files in Apache Airflow? - google-cloud-platform

I just started learning Apache Airflow. I created an environment in Composer on GCP, and the web server is working fine, but I am confused about where to write the DAG file. I want to write the file somewhere I can test it multiple times. The web UI shows me a bucket where I can store the file, but I don't understand where to write the code. Do I have to install Airflow on my machine?
P.S. I know this is a stupid question; any help will be appreciated.

You can install it locally, yes. If you want to test it locally, this is the only way, I think.
There are a couple of tools you could use for that. One is the Astro CLI, published by Astronomer, for managing your DAG development environment: https://github.com/astronomer/astro-cli
MWAA has its own tool too, I think. As far as I know, Composer has no Composer-specific one.
However, for "generic" Airflow (which should be enough to start), you can use the community-managed quick start (either with a local venv or Docker Compose):
https://airflow.apache.org/docs/apache-airflow/stable/start/index.html
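
A DAG is just a Python file, so you can write it in any editor on your machine and upload it to the dags/ folder of the environment's bucket once it works. As a rough sketch of a cheap first check before uploading (not a substitute for running the DAG in a local Airflow), the script below writes a tiny example DAG to a local folder and confirms that it at least parses as Python. The file name, dag_id and task are made up for illustration, and the DAG source assumes Airflow 2.4 or a later 2.x release (older versions use schedule_interval instead of schedule).

```python
import ast
import tempfile
from pathlib import Path

# Illustrative DAG source; assumes Airflow 2.4+ (hypothetical dag_id and task).
EXAMPLE_DAG = '''\
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hello_composer",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # run only when triggered manually
    catchup=False,
) as dag:
    BashOperator(task_id="say_hello", bash_command="echo hello")
'''


def parses_as_python(path: Path) -> bool:
    """Return True if the file is syntactically valid Python.

    This only catches syntax errors; it does not import Airflow or
    validate the DAG itself.
    """
    try:
        ast.parse(path.read_text(), filename=str(path))
        return True
    except SyntaxError:
        return False


def write_example_dag(dags_dir: Path) -> Path:
    """Write the example DAG into dags_dir and return its path."""
    dags_dir.mkdir(parents=True, exist_ok=True)
    dag_file = dags_dir / "hello_composer.py"
    dag_file.write_text(EXAMPLE_DAG)
    return dag_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dag_file = write_example_dag(Path(tmp) / "dags")
        print(dag_file.name, "parses:", parses_as_python(dag_file))
```

With Airflow installed locally, running the DAG file itself with python additionally checks that its imports resolve, which the syntax check above cannot do.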

Related

Selenium cloud execution on a machine without code or IDE

I set up my Selenium project (Maven, Java, TestNG) in a GitHub repo that is connected to Jenkins. I am able to execute the Maven project via Jenkins and do the testing. This requires all dependent tools (Maven, Java, Jenkins) to be set up on my local machine.
But we have a requirement to do this in the cloud. I know we can use Selenium Grid with Docker, BrowserStack or GCP to execute the tests in the cloud, but what we need is to have everything installed in the cloud, with any external user who has access being able to execute any test via a UI or an executable file without installing anything on the user's local machine.
Is this possible at all? If yes, how?
I searched a lot and couldn't find anything. One of my friends said it can be done using AWS but doesn't know how. I just need guidance on the path to take here and I'm willing to learn and implement it myself.
Solved this by deploying the code to AWS EC2.
Here's what I did.
I created a TestNG-Maven project and uploaded it to GitHub. Then I created an AWS EC2 t2.micro Linux instance and installed Chrome and Jenkins on it. I accessed Jenkins from my local machine and connected it to the GitHub repo. When I built the project from Jenkins, everything was downloaded to EC2 and the execution happened on EC2. This is a headless Chrome execution.

How to set up a local development environment for PySpark ETL to run in AWS Glue?

PyCharm Professional supports connecting, deploying and remote debugging of an AWS Glue developer endpoint (https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-pycharm.html), but I can't figure out how to use VS Code (my code editor of choice) for this purpose. Does VS Code support any of these functionalities? Or is there another free alternative to PyCharm Professional with the same capabilities?
I have not used PyCharm, but I have set up a local development endpoint with Zeppelin for developing and testing my Glue jobs. Please see my related posts and references for setting up a local development endpoint. You could try it and, if it is useful, try using PyCharm instead of Zeppelin.
Reference : Is it possible to use Jupyter Notebook for AWS Glue instead of Zeppelin & Link for zeppelin local development endpoint SO discussions

Developing in Adobe CQ5 with jetty?

We use Maven to deploy code changes to the CQ internal server / CRX Lite, and the problem is that it takes a long time even though the change itself is often only one line of code.
Does anybody have experience with CQ5 with Jetty and can point me to a good guide?
I am not sure I understand the relationship with Jetty (which ships as the servlet container of later versions of AEM/CQ5), but I will answer the code deployment part:
Deploying a full content package should be done using the maven-content-package plugin.
For smaller deployments of content, when you can't use integrated dev environments like the Sling Eclipse dev tools, I'd suggest you use the excellent repo command, which basically zips the current folder and deploys it. I'm using it as an external tool command in IntelliJ and it's really fast.
Finally, if the deployment you're referring to is an OSGi deployment, the Maven Sling plugin can help you with that (it will still compile and package the whole OSGi bundle, though).

How to use vagrant to develop on django locally and then deploy to EC2/Azure?

I chose Vagrant so that other developers on my team can quickly start contributing to the project. Is there any way we can also make it easy to deploy the developed code to EC2 or Azure servers? If there are any articles on the optimal setup, please point me to them. Thanks!
The first video of Getting Started with Django shows how to use Vagrant for local Django development and how to deploy to Heroku; you may want to use the first part of the tutorial (the part about local development). For the second part, it depends on how you are going to deploy, but as long as your code is in a Git repository, you can clone it onto EC2/Azure from Git.

Moving from runserver to a production server

I am quite new to programming, and all of my development has been on my local runserver using TextMate and Terminal. I have written a small app with a few hundred and I'd like to push it to an EC2 server. My only knowledge in terms of 'developing tools' is Django's local runserver, TextMate and Terminal.
What tools or programs should I look into learning to have an effective workflow? Should I be using some sort of IDE instead of TextMate for this? My main concern is being able to develop on my local runserver and then painlessly push that to my production server.
As @isbadawi said, use Fabric. It's better than just using the terminal because you can automate things. As far as deployments go, you can simplify it down to: fab -H your.host.com deploy. Inside the fabfile you write the commands; a simple deploy might go:
Cause the server to download the most recent code from SCM
Update the database (syncdb / migrations / what have you)
Cause Apache or whatever you're using to reload the configuration
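
The three steps above can be sketched roughly as follows. This is not Fabric itself: to stay dependency-free it shells out to ssh with the standard library, as a stand-in for what a fabfile task would do. The project path, the git and manage.py commands, and the Apache reload command are all assumptions to be replaced with your own.

```python
import shlex
import subprocess

# Hypothetical values; replace with your own host and paths.
HOST = "your.host.com"
PROJECT_DIR = "/srv/myapp"

# One remote shell command per deployment step (all commands are assumptions).
DEPLOY_STEPS = [
    f"cd {PROJECT_DIR} && git pull",                  # fetch the latest code from SCM
    f"cd {PROJECT_DIR} && python manage.py migrate",  # update the database
    "sudo service apache2 reload",                    # reload the web server config
]


def build_ssh_commands(host, steps):
    """Return one local argv list per step, each running that step over ssh."""
    return [["ssh", host, step] for step in steps]


def deploy(host=HOST, steps=DEPLOY_STEPS, dry_run=True):
    """Run each step in order, stopping at the first failure.

    With dry_run=True the commands are only printed, not executed.
    """
    for argv in build_ssh_commands(host, steps):
        print("$", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    deploy(dry_run=True)
```

Keeping a dry-run mode as the default makes it easy to review exactly what would run on the server before doing a real deployment.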
As far as some more general tips go:
If you're using WSGI, put the WSGI file under source control
The same goes for local settings files; have your deploy script rename them to local_settings.py as part of the build
If you're really going for painless, look into Django hosting services like Gondor or Ep.io. Those will have clients that you can just deploy to pseudo-painlessly, although you will have to change some settings on your side to match theirs better, as there are many many ways to deploy a Django app.
Update: Ep.io is no longer in business as a hosting service. My new go-to is Heroku.
Update 2: I used to link local_settings.py in deployments, but now I'm leaning towards using the DJANGO_SETTINGS_MODULE config variable. See rdegges' "django-skel" settings for a good way to do this.
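
A minimal sketch of the environment-variable approach, assuming a hypothetical project with separate myproject.settings.dev and myproject.settings.prod modules (the layout is not taken from django-skel; only the DJANGO_SETTINGS_MODULE variable itself is Django's). Each server sets the variable, and the entry point falls back to a default only when it is unset:

```python
import os

# Hypothetical module name for illustration.
DEFAULT_SETTINGS = "myproject.settings.dev"


def select_settings(default=DEFAULT_SETTINGS):
    """Use DJANGO_SETTINGS_MODULE if the environment sets it, else the default."""
    return os.environ.setdefault("DJANGO_SETTINGS_MODULE", default)


if __name__ == "__main__":
    # On the production server one would export, for example:
    #   DJANGO_SETTINGS_MODULE=myproject.settings.prod
    print(select_settings())
```

This way the same code is deployed everywhere and no settings file has to be renamed or linked during the build.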
A DVCS such as git or Mercurial will allow you to develop and test locally, and then push the changes to a remote system for staging and production.