AWS CodeBuild buildspec.yml get all files and subfolders recursively - amazon-web-services

I'm trying to use AWS CodeBuild to get all files and subfolders inside a nested public folder and deploy them to an S3 bucket using CodePipeline. I was able to hook them all together, but I'm struggling to configure the buildspec.yml file to get the output I want.
My folder structure:
<path>/public/
├── 404.html
├── css
│   ├── ...
├── fonts
│   ├── bootstrap
│   │   ├── ...
│   ├── icomoon
│   │   ├── icomoon
│   │   │   ├── ...
│   └── simple-line-icons
│       ├── ...
├── images
│   ├── ...
├── index.html
├── index.xml
├── js
│   ├── ...
└── tags
    └── index.xml
I need to put everything (including the subfolders) inside the public folder into the root of an S3 bucket.
So far I've tried following the docs here, here and here. I've tried using:
**/* to get everything recursively inside the folder public but the S3 bucket will have the path to the folder so the index.html is not in the root.
discard-paths: yes to remove the path to the public folder but inside the S3 bucket, all files are there in the root, no sub-folder structure is kept.
base-directory: as described here.
artifacts:
  secondary-artifacts:
    artifact1:
      files:
        - directory/file
    artifact2:
      files:
        - directory/file2
I used this to keep the folder structure, but my build failed.
Different combinations of all the syntaxes above but my build just failed.

TL;DR You probably don't need to use "discard-paths: yes". So, drop "discard-paths: yes" and instead use "base-directory" with "**/*" for glob pattern. Read on for more reasoning why.
So yeah, as it usually happens, the culprit was pretty trivial. In my case the directory structure looked like this:
dist/
|-- favicon.ico
|-- index.html
|-- js
| `-- app.js
`-- revision.txt
So in order for the artifact to contain everything from the "dist" directory whilst preserving the nested directories structure, the buildspec had to look like this:
artifacts:
  files:
    - '**/*'
  base-directory: 'dist'
Essentially, the "base-directory" parameter behaves like Unix's "cd" command: CodeBuild first changes into that directory and only then resolves the glob pattern, in this case **/*. If you also use "discard-paths", then presumably CodeBuild changes into "dist" and resolves the file structure correctly, but then strips all path prefixes from the resolved paths, so the result is a flat directory with no nesting.
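To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is an illustration under assumptions, not CodeBuild's actual implementation: resolve_artifacts and make_demo_tree are hypothetical helpers, and the glob semantics are those of Python's pathlib, which may differ from CodeBuild's in edge cases.

```python
import tempfile
from pathlib import Path


def resolve_artifacts(root, base_directory, pattern, discard_paths=False):
    """Return the artifact paths a glob would yield, relative to base_directory.

    Hypothetical helper that mimics the described behaviour; not CodeBuild code.
    """
    base = Path(root) / base_directory
    # Step 1: "cd" into base_directory, then expand the glob from there.
    files = [p for p in base.glob(pattern) if p.is_file()]
    # Step 2: record paths relative to base_directory, so the "dist/" prefix is gone.
    relative = [p.relative_to(base).as_posix() for p in files]
    if discard_paths:
        # Step 3 (optional): drop every directory prefix, flattening the tree.
        relative = [Path(r).name for r in relative]
    return sorted(relative)


def make_demo_tree(root):
    """Create the dist/ layout from the answer inside a scratch directory."""
    for name in ["favicon.ico", "index.html", "js/app.js", "revision.txt"]:
        target = Path(root) / "dist" / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(tmp)
    kept = resolve_artifacts(tmp, "dist", "**/*")
    flat = resolve_artifacts(tmp, "dist", "**/*", discard_paths=True)

print(kept)
print(flat)
```

In this sketch the first list keeps js/app.js nested, while the second collapses it to app.js, which is the flattening the question ran into.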
Hope that helps!

I've figured out a way to work around it. I don't think it's the best way, but it works. I hope you find this solution useful. buildspec.yml:
version: 0.2
phases:
  build:
    commands:
      - mkdir build-output
      - cp -R <path>/public/ build-output
  post_build:
    commands:
      - mv build-output/**/* ./
      - mv build-output/* ./
      - rm -R build-output *.yml LICENSE README* .git*
artifacts:
  files:
    - '**/*'
In words:
I copy everything inside the nested <path>/public folder out to a folder called ./build-output.
Then in post_build I move everything out of folder build-output.
I delete the files pulled from my GitHub repo that aren't needed to host the static website in the S3 bucket.
Then I get the result I want: all files inside public in the root of the S3 bucket with the right folder tree.
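For anyone who would rather script this step than chain mv commands, the copy-to-root idea can be sketched in Python. This is only a rough sketch under assumptions: flatten_into_root is a hypothetical helper, the site/public path is a stand-in for the elided <path>/public, and it copies instead of moving.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_into_root(nested_public, root):
    """Copy the contents of nested_public into root, keeping sub-folders."""
    # dirs_exist_ok=True (Python 3.8+) lets the copy merge into an existing root.
    shutil.copytree(nested_public, root, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    public = root / "site" / "public"  # hypothetical stand-in for <path>/public
    (public / "css").mkdir(parents=True)
    (public / "index.html").write_text("<html></html>")
    (public / "css" / "main.css").write_text("body {}")

    flatten_into_root(public, root)

    # List what now sits at the root, ignoring the original nested copy.
    copied = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and "site" not in p.relative_to(root).parts
    )

print(copied)
```

Here index.html ends up at the root and css/main.css keeps its sub-folder, which matches the layout the steps above are aiming for.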
Update:
You can also use my buildspec.yml example file here. Since the example is out of the scope of this question, I don't paste the code here.
I explain it in detail here.

I was running into a similar issue, and after many permutations the syntax below worked for me.
artifacts:
  files:
    - './**/*'
  base-directory: public
  name: websitename

This worked for me.
/
buildspec.yml
/public
index.html
/js
/img
appspec.yml
/src
in buildspec.yml
artifacts:
  files:
    - 'public/*'
  discard-paths: yes

For those who are facing this issue while using CodeBuild in CodePipeline and have already tried the correct configuration for the buildspec.yml file: the culprit for me was in the deploy stage, where I had set the Source artifact as the Input Artifact instead of the Build artifact.

Related

Is there a way to write a glob pattern that matches to either a file within a directory or a whole directory within that same directory?

I'm trying to write a glob pattern that will match to either the cloudbuild.yaml file or anything in the API directory in this structure:
Server/
├── API/
│ ├── script.js
│ ├── somethingelse.html
├── Excluded Directory/
│ ├── bootstrap.js
│ └── bootstrap.min.js
└── cloudbuild.yaml
I've tried this so far and while I can get individual parts of it to match I can't seem to get it to match either pattern with the same pattern:
Server/{API/**, cloudbuild.yaml}
In case it's relevant, this is so my Cloud Build trigger will only trigger a redeploy if files within the API folder are updated, or if the cloudbuild.yaml file itself is.
In case anyone else is wondering, it's actually pretty easy - I was just missing a few slashes and this works:
/Server/{cloudbuild.yaml,API/*}
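To see how a brace group splits into alternatives, here is a small Python sketch. It is not Cloud Build's matcher: expand_braces and matches are hypothetical helpers, nested braces are not handled, and Python's fnmatch lets * match across "/" as well, which the real trigger filter may not do.

```python
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand each {a,b,...} group into one pattern per alternative."""
    start = pattern.find("{")
    end = pattern.find("}", start)
    if start == -1 or end == -1:
        return [pattern]
    head, tail = pattern[:start], pattern[end + 1:]
    expanded = []
    for option in pattern[start + 1:end].split(","):
        # Recurse so that later brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches(path, pattern):
    """True if path matches any alternative of the brace pattern."""
    return any(fnmatchcase(path, p) for p in expand_braces(pattern))


working = "/Server/{cloudbuild.yaml,API/*}"
first_try = "Server/{API/**, cloudbuild.yaml}"

print(expand_braces(working))
print(expand_braces(first_try))
print(matches("/Server/API/script.js", working))
print(matches("/Server/Excluded Directory/bootstrap.js", working))
```

In this sketch the space after the comma in the first attempt becomes part of the second alternative ("Server/ cloudbuild.yaml"). Whether Cloud Build treats that space the same way is an assumption, but it is a cheap thing to rule out.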

Packer file provisioner doesn't copy

I have a File Provisioner configured on my packer template json:
"provisioners": [{
"type": "file",
"source": "packer/api-clients/database.yml",
"destination": "/tmp/api-clients-database.yml"
},
This configuration doesn't work when I try to build an AMI on Amazon AWS; it always says:
Bad source 'packer/api-clients/database.yml': stat packer/api-clients/database.yml: no such file or directory
If I do this:
"source": "api-clients/database.yml",
It works like a charm. But I must have all my Packer files inside of a packer folder within my app folder for organization purposes.
What am I doing wrong?
My app folder is like this:
api_v1
├── template.json
├── app
│   ├── bin
│   ├── config
│   ├── packer
│   │   ├── api-clients
│   │   │   └── database.yml
│   ├── lib
│   ├── log
│   ├── ...
It seems that it has something to do with Relative Paths / Absolute Paths on Packer but I couldn't figure out what is wrong...
Thanks in advance,
Since the path doesn't start with a /, it's a relative path. Relative paths are resolved against the current working directory when executing packer build.
With source packer/api-clients/database.yml you have to run packer from the app directory, i.e.
packer build ../template.json
With source api-clients/database.yml you have to run packer from the packer directory, i.e.
packer build ../../template.json
For more info see Packer documentation - File provisioner: source.
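The effect of the working directory can be checked with a short Python sketch. It only models ordinary relative-path resolution and is not Packer's code: resolve_source is a hypothetical helper, and placing the project at /api_v1 is an assumption made for illustration.

```python
from pathlib import PurePosixPath


def resolve_source(cwd, source):
    """Resolve a source path against the directory the command is run from."""
    source_path = PurePosixPath(source)
    if source_path.is_absolute():
        # Absolute paths ignore the working directory entirely.
        return source_path
    return PurePosixPath(cwd) / source_path


# Assumption: the project is checked out at /api_v1.
expected = PurePosixPath("/api_v1/app/packer/api-clients/database.yml")

from_app = resolve_source("/api_v1/app", "packer/api-clients/database.yml")
from_packer = resolve_source("/api_v1/app/packer", "api-clients/database.yml")
from_root = resolve_source("/api_v1", "packer/api-clients/database.yml")

print(from_app == expected)
print(from_packer == expected)
print(from_root)
```

Running from api_v1 itself with the first source value yields /api_v1/packer/api-clients/database.yml, which does not exist in the tree shown in the question and is consistent with the "no such file or directory" error.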
As you surmised, it is a path issue.
You don't say which folder you run packer from or what command you use, nor whether, when "source": "api-clients/database.yml" works, you had moved the api-clients folder or were running packer from a different location.
If your folder structure will always look like that, use a full path for the source to avoid confusion; it will work no matter where you run packer from, e.g.
/api_v1/app/packer/api-clients/database.yml
If you must use relative paths, make sure the source path is always relative to the folder in which packer is run.

How to deploy a Go web application in Beanstalk with custom project folder structure

I'm new to Go.
I am trying to deploy a simple web project to EB without success.
I would like to deploy a project with the following local structure to Amazon EB:
$GOPATH
├── bin
├── pkg
└── src
    ├── github.com
    │   ├── AstralinkIO
    │   │   └── api-server <-- project/repository root
    │   │       ├── bin
    │   │       ├── cmd <-- main package
    │   │       ├── pkg
    │   │       ├── static
    │   │       └── vendor
But I'm not sure how to do that. When building, Amazon treats api-server as the $GOPATH, and of course the import paths break.
I read that most of the time it's best to keep all repos under the same workspace, but that makes deployment harder.
I'm using a Procfile and a Buildfile to customize the output path, but I can't find a solution for the dependencies.
What is the best way to deploy such a project to EB?
A long time has passed since I used Beanstalk, so I'm a bit rusty on the details, but the basic idea is as follows. AWS Beanstalk's support for Go is a bit odd by design. It basically extracts your source files into a folder on the server, declares that folder as GOPATH and tries to build your application assuming that your main package is at the root of your GOPATH, which is not a standard layout for Go projects. So your options are:
1) Package your whole GOPATH as "source bundle" for Beanstalk. Then you should be able to write build.sh script to change GOPATH and build it your way. Then call build.sh from your Buildfile.
2) Change your main package to be a regular package (e.g. github.com/AstralinkIO/api-server/cmd). Then create an application.go file at the root of your GOPATH (yes, outside of src, while all actual packages are in src as they should be). Your application.go will become your "package main" and will only contain a main function (which will call your current Main function from github.com/AstralinkIO/api-server/cmd). Should do the trick. Though your mileage might vary.
3) A bit easier option is to use Docker-based Go Platform instead. It still builds your go application on the server with mostly same issues as above, but it's better documented and possibility to test it locally helps a lot with getting configuration and build right. It will also give you some insights into how Beanstalk builds go applications thus helping with options 1 and 2. I used this option myself until I moved to plain EC2 instances. And I still use skills gained as a result of it to build my current app releases using docker.
4) Your best option though (in my humble opinion) is to build your app yourself and package it as a ready-to-run binary file. See the second bullet point paragraph here.
Well, whichever option you choose, good luck!

How to make CloudFront never cache index.html on S3 bucket

I have a React app hosted on an S3 bucket. The code is minified using yarn build (it's a create-react-app based app). The build folder looks something like:
build
├── asset-manifest.json
├── favicon.ico
├── images
│   ├── map-background.png
│   └── robot-icon.svg
├── index.html
├── js
│   ├── fontawesome.js
│   ├── packs
│   │   ├── brands.js
│   │   ├── light.js
│   │   ├── regular.js
│   │   └── solid.js
│   └── README.md
├── service-worker.js
└── static
    ├── css
    │   ├── main.bf27c1d9.css
    │   └── main.bf27c1d9.css.map
    └── js
        ├── main.8d11d7ab.js
        └── main.8d11d7ab.js.map
I never want index.html to be cached, because if I update the code (causing the hex suffix in main.*.js to update), I need the user's next visit to pick up on the <script src> change in index.html to point to the updated code.
In CloudFront, I can only seem to exclude paths, and excluding "/" doesn't seem to work properly. I'm getting strange behavior where I change the code, and if I hit refresh, I see it, but if I quit Chrome and go back, I see very outdated code for some reason.
I don't want to have to trigger an invalidation on every code release (via CodeBuild). Is there some other way? I think one of the challenges is that since this is an app using React Router, I'm having to do some trickery by setting the error document to index.html and forcing an HTTP status 200 instead of 403.
A solution based on CloudFront configuration:
Go to your CloudFront distribution, under the "Behavior" tab and create a new behavior.
Specify the following values:
Path Pattern: index.html
Object Caching: customize
Maximum TTL: 0 (or another very small value)
Default TTL: 0 (or another very small value)
Save this configuration.
CloudFront will not cache index.html anymore.
If you never want index.html to be cached, set the Cache-Control: max-age=0 header on that file only. CloudFront will make a request back to your origin S3 bucket on every request, but it sounds like this is desired behavior.
If you want to set longer expiry times and invalidate the CloudFront cache manually, you can use * or /* as your invalidation path (not / as you mentioned). However, it can take up to 15 minutes for all CloudFront edge nodes around the world to reflect the changes in your origin.
Here is the command I ran to set Cache-Control on my index.html file after uploading new files to S3 and invalidating CloudFront:
aws s3 cp s3://bucket/index.html s3://bucket/index.html --metadata-directive REPLACE --cache-control max-age=0 --content-type "text/html"
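If the release is scripted, the same command can be assembled in Python. This is a minimal sketch, assuming the AWS CLI is installed and configured: build_cache_control_command is a hypothetical helper, "bucket" is the placeholder name from the command above, and the flags are exactly the ones already shown.

```python
import shlex


def build_cache_control_command(bucket, key="index.html", max_age=0):
    """Build the in-place `aws s3 cp` call that rewrites an object's headers."""
    location = f"s3://{bucket}/{key}"
    return [
        "aws", "s3", "cp", location, location,
        # REPLACE tells S3 to store the new headers instead of copying the old ones.
        "--metadata-directive", "REPLACE",
        "--cache-control", f"max-age={max_age}",
        # The content type is fixed to HTML, so only use this for HTML objects.
        "--content-type", "text/html",
    ]


command = build_cache_control_command("bucket")
print(shlex.join(command))
```

Passing the list to subprocess.run(command, check=True) after the upload step would execute it. Building a list rather than a single string avoids shell quoting problems if the bucket or key ever contains unusual characters.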
It's much better to run an invalidation for index.html on every release than to defeat CloudFront's purpose and serve it (which is basically the entry point of your app) from S3 every single time.

How to use Hugo with GitHub Pages to automatically update content

I am using Hugo to deploy a static page to GitHub Pages.
I have a git repo in the /public folder, but the contents of the /static folder are not part of the repository. Therefore they are not uploaded to the username.github.io page.
The /static folder contains images and CSS files. This is why my page does not look good after pushing to GitHub.
My workaround is to manually copy the /static folder into the /public folder each time I build the site.
I think there should be a better solution, and I am probably missing something in the config.toml file of the Hugo workflow.
I am following the instructions from this site.
Any ideas how to automatically include the /static files in the repository?
Hugo copies all the files in the static/ directory into the public/ directory when your site is rendered. For example, if you have a static/ folder that looks like this:
.
├── css
│   └── main.css
└── img
    ├── favicon.png
    └── avatar.png
Then when you build your site, the public/ folder will look like this:
.
├── css
│ ├── main.css
│ └── <theme css files>
├── img
│ ├── favicon.png
│ ├── avatar.png
│ └── <theme images>
<more content folders>
So the files in your static folder are probably being included already. The problem is likely to be that your theme is looking for your static files in the wrong place. Take a look at your theme documentation and see if it says anything about static assets.