Download Last 24 hour files from S3 using PowerShell

I have an S3 bucket with various filenames. I need to download the specific files (filenames that start with "impression") that were created or modified in the last 24 hours from the S3 bucket to a local folder using PowerShell.
$items = Get-S3Object -BucketName $sourceBucket -ProfileName $profile -Region 'us-east-1' |
    Sort-Object LastModified -Descending |
    Select-Object -First 1 |
    Select-Object Key
Write-Host "$($items.Length) objects to copy"
$index = 1
$items | % {
    Write-Host "$index/$($items.Length): $($_.Key)"
    $fileName = $Folder + ".\$($_.Key.Replace('/','\'))"
    Write-Host "$fileName"
    Read-S3Object -BucketName $sourceBucket -Key $_.Key -File $fileName -ProfileName $profile -Region 'us-east-1' > $null
    $index += 1
}

A workaround might be to turn on access logging: since access logs contain timestamps, you can get all access logs from the past 24 hours, de-duplicate the repeated S3 objects, then download them all.
You can enable S3 access logging in the bucket settings; the logs will be stored in another bucket.
If you end up writing a script for this, just bear in mind that downloading the S3 objects will itself create new access logs, making the operation irreversible.
If you want something fancy, you could even query the logs and de-duplicate them using AWS Athena.
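For comparison, since Get-S3Object already returns a LastModified value for each object, the 24-hour filter can also be applied directly. A minimal sketch, assuming the AWS Tools for PowerShell module and that the target files share an 'impression' key prefix (the -KeyPrefix value, $Folder, $sourceBucket and $profile are taken from the question and may need adjusting):

$cutoff = (Get-Date).AddHours(-24)
Get-S3Object -BucketName $sourceBucket -KeyPrefix 'impression' -ProfileName $profile -Region 'us-east-1' |
    Where-Object { $_.LastModified -gt $cutoff } |
    ForEach-Object {
        # Mirror the key's folder structure under the local target folder
        $fileName = Join-Path $Folder ($_.Key.Replace('/', '\'))
        Read-S3Object -BucketName $sourceBucket -Key $_.Key -File $fileName -ProfileName $profile -Region 'us-east-1' | Out-Null
    }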

Related

Scheduling a Powershell script to run weekly in AWS

So, I've got the following powershell script to find inactive AD users and disable their accounts, creating a log file containing a list of what accounts have been disabled:
Import-Module ActiveDirectory
# Set the number of days since last logon
$DaysInactive = 60
$InactiveDate = (Get-Date).Adddays(-($DaysInactive))
# Get AD Users that haven't logged on in xx days
$Users = Get-ADUser -Filter { LastLogonDate -lt $InactiveDate -and Enabled -eq $true } -Properties LastLogonDate |
    Select-Object @{ Name="Username"; Expression={$_.SamAccountName} }, Name, LastLogonDate, DistinguishedName
# Export results to CSV
$Users | Export-Csv C:\Temp\InactiveUsers.csv -NoTypeInformation
# Disable Inactive Users
ForEach ($Item in $Users) {
    $DistName = $Item.DistinguishedName
    Disable-ADAccount -Identity $DistName
    Get-ADUser -Filter { DistinguishedName -eq $DistName } | Select-Object @{ Name="Username"; Expression={$_.SamAccountName} }, Name, Enabled
}
The script works and is doing everything it should. What I am trying to figure out is how to automate this in an AWS environment.
I'm guessing I need to use a Lambda function in AWS to trigger this script to run on a schedule but don't know where to start.
Any help greatly appreciated.
I recommend creating a Lambda function with the dotnet (PowerShell) environment: https://docs.aws.amazon.com/lambda/latest/dg/lambda-powershell.html
Use a CloudWatch Events rule on a scheduled basis to trigger the function:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html
Alternatively, if you would like to have a more pipeline-style execution, you could use CodePipeline and CodeBuild to run the script. Again, use CloudWatch to trigger the CodePipeline on a scheduled basis!
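As a rough sketch of the Lambda route described in those docs, the AWSLambdaPSCore module scaffolds and publishes PowerShell scripts as Lambda functions; the script and function names below are placeholders, and the scheduled CloudWatch Events rule is then pointed at the published function as per the second link:

# One-time: install the tooling for PowerShell-based Lambda functions
Install-Module AWSLambdaPSCore -Scope CurrentUser
# Scaffold a script from the basic template, then paste the AD-cleanup logic into it
New-AWSPowerShellLambda -ScriptName DisableInactiveUsers -Template Basic
# Package and publish the script as a Lambda function
Publish-AWSPowerShellLambda -ScriptPath .\DisableInactiveUsers\DisableInactiveUsers.ps1 -Name DisableInactiveUsers -Region us-east-1

Bear in mind that the ActiveDirectory module is not part of the Lambda runtime, so the function would need the module bundled with it and VPC network access to a domain controller.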

aws s3 sync missing to create root folders

I am archiving some folders to S3
Example: C:\UserProfile\E21126\data ....
I expect to have a folder structure in S3 like UserProfiles\E21126.
The problem is that it creates the folders that are under \E21126 but misses creating the root folder \E21126 itself.
Folds1.txt contains these folders to sync:
G:\UserProfiles\E21126
G:\UserProfiles\E47341
G:\UserProfiles\C68115
G:\UserProfiles\C30654
G:\UserProfiles\C52860
G:\UserProfiles\E47341
G:\UserProfiles\C68115
G:\UserProfiles\C30654
G:\UserProfiles\C52860
My code is below:
ForEach ($Folder in (Get-content "F:\scripts\Folds1.txt")) {
aws s3 sync $Folder s3://css-lvdae1cxfs003-archive/Archive-Profiles/ --acl bucket-owner-full-control --storage-class STANDARD
}
It will upload all the folders with their names but exclude the rest of the path. If you want to include UserProfiles in the S3 bucket, then you will need to include it in the key, i.e. upload them to the S3 bucket while specifying that key name:
aws s3 sync $Folder s3://css-lvdae1cxfs003-archive/Archive-Profiles/UserProfiles --acl bucket-owner-full-control --storage-class STANDARD
And if your folders have a different name instead of the UserProfiles string, then you can get the parent path and fetch its leaf to get the name from the string:
PS C:\> Split-Path -Path "G:\UserProfiles\E21126"
G:\UserProfiles
PS C:\> Split-Path -Path "G:\UserProfiles" -Leaf -Resolve
UserProfiles
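Putting the two ideas together, a sketch (reusing the paths and bucket from the question) that takes each folder's leaf name with Split-Path and appends it to the destination so the root folder is preserved in the key:

ForEach ($Folder in (Get-Content "F:\scripts\Folds1.txt")) {
    # e.g. G:\UserProfiles\E21126 -> E21126
    $leaf = Split-Path -Path $Folder -Leaf
    aws s3 sync $Folder s3://css-lvdae1cxfs003-archive/Archive-Profiles/$leaf/ --acl bucket-owner-full-control --storage-class STANDARD
}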
If you were to modify the text file to contain:
E21126
E47341
C68115
Then you could use the command:
ForEach ($Folder in (Get-content "F:\scripts\Folds1.txt")) {
aws s3 sync G:\UserProfiles\$Folder s3://css-lvdae1cxfs003-archive/Archive-Profiles/$Folder/ --acl bucket-owner-full-control --storage-class STANDARD
}
Note that the folder name is included in the destination path.

Powershell writing to AWS S3

I'm trying to get powershell to write results to AWS S3 and I can't figure out the syntax. Below is the line that is giving me trouble. If I run this without everything after the ">>" the results print on the screen.
Write-host "Thumbprint=" $i.Thumbprint " Expiration Date="$i.NotAfter " InstanceID ="$instanceID.Content" Subject="$i.Subject >> Write-S3Object -BucketName arn:aws:s3:::eotss-ssl-certificatemanagement
Looks like you have an issue with >>: be aware that you can't pass the Write-Host output into another command like that.
In order to do that, you need to assign the string you want to a variable and then pass it to the -Content parameter.
Take a look at the following code snippet:
Install-Module AWSPowerShell
Import-Module AWSPowerShell
#Set AWS Credential
Set-AWSCredential -AccessKey "AccessKey" -SecretKey "SecretKey"
#File upload
Write-S3Object -BucketName "BucketName" -Key "File upload test" -File "FilePath"
#Content upload
$content = "Thumbprint= $($i.Thumbprint) Expiration Date=$($i.NotAfter) InstanceID = $($instanceID.Content) Subject=$($i.Subject)"
Write-S3Object -BucketName "BucketName" -Key "Content upload test" -Content $content
How to create new AccessKey and SecretKey - Managing Access Keys for Your AWS Account.
AWSPowerShell Module installation.
AWS Tools for PowerShell - S3 Documentation.

How can I retrieve all data from a given AWS Powershell cmdlets/alias?

When running a cmdlet like Get-WKSWorkspaces, it will return a set of properties about your workspaces (e.g. WorkspaceID, Username, SubnetID, BundleID, etc.), but not everything you see in the AWS GUI. I am specifically trying to pull things like Running Mode, Compute Type, and Creation Time as well, but can't seem to find where to pull it.
In my research, I got up to the point where I was using $AWSHistory to try and dig deeper into the data returned from my previous cmdlets, but have definitely hit a wall and can't seem to get around it.
I do have a partial command that is giving me most of the output I need:
$region = Get-DefaultAWSRegion
$lastuserconnect = Get-WKSWorkspacesConnectionStatus | Select LastKnownUserConnectionTimestamp
Get-WKSWorkspace -ProfileName ITSLayer1-053082227562-Profile | Select WorkspaceID, UserName, BundleID, DirectoryID,
    @{Name="Region"; Expression={$region.Region}},
    @{Name="LastKnownUserConnect"; Expression={$lastuserconnect.LastKnownUserConnectionTimestamp}}
Update for posterity: Actually got something decent to come out here. It's slow, but it renders in a table format pretty well and includes a bit at the start to select your AWS region.
Suggestions for improvement include:
- Automatically switching the Region selection to get all workspaces from the main Regions we use
- Cleaning the lines up so it's easier to read
- Getting the region to automatically append to the filename so it doesn't overwrite your file every time (it's in there but broken at the moment... it still pops out a file with 'workspace_properties.csv' as the name)
- Optimizing the script because it's pretty slow
$lastuserconnect = Get-WKSWorkspacesConnectionStatus -ProfileName $profile
$defaultregion = Get-DefaultAWSRegion
$showallregions = Get-AWSRegion
$exportpath = "" + $env:USERPROFILE + "\workspace_properties" + $defaultregion.Region + ".csv"
$showallregions | Format-Table
$setregion = Read-Host -Prompt 'AWS Region'
Clear-DefaultAWSRegion
Set-DefaultAWSRegion $setregion
Get-WKSWorkspace -ProfileName $profile |
    Select WorkspaceID, UserName, BundleID, DirectoryID,
        @{Name="ComputeType"; Expression={$_.WorkspaceProperties.ComputeTypeName}},
        @{Name="RunningMode"; Expression={$_.WorkspaceProperties.RunningMode}},
        @{Name="Region"; Expression={$defaultregion.Region}},
        @{Name="LastKnownUserConnect"; Expression={ (Get-WKSWorkspacesConnectionStatus -ProfileName $profile -WorkspaceId $_.WorkspaceId).LastKnownUserConnectionTimestamp }} |
    Export-Csv $exportpath
Here is an example of fetching those properties you are looking for:
Get-WKSWorkspace | foreach {
$connectionStatus = Get-WKSWorkspacesConnectionStatus -WorkspaceId $_.WorkspaceId;
echo "";
echo "==> About $($_.WorkspaceId)";
echo "Last State Check: $($connectionStatus.ConnectionStateCheckTimestamp)";
echo "User Last Active: $($connectionStatus.LastKnownUserConnectionTimestamp)";
echo "Directory: $($_.DirectoryId)";
echo "Compute: $($_.WorkspaceProperties.ComputeTypeName)";
echo "Running mode $($_.WorkspaceProperties.RunningMode)";
echo "State $($_.State)"
}
I don't see a 'Creation Time' for a workspace in the console either.
[edit]
I believe you are looking for a way to export this info; maybe the code below will help:
[System.Collections.ArrayList]$output = @()
Get-WKSWorkspace | foreach {
    $connectionStatus = Get-WKSWorkspacesConnectionStatus -WorkspaceId $_.WorkspaceId
    $bunch = [pscustomobject]@{
        WorkspaceId    = $_.WorkspaceId
        LastStateCheck = $connectionStatus.ConnectionStateCheckTimestamp
        UserLastActive = $connectionStatus.LastKnownUserConnectionTimestamp
        Directory      = $_.DirectoryId
        Compute        = $_.WorkspaceProperties.ComputeTypeName
        Runningmode    = $_.WorkspaceProperties.RunningMode
        State          = $_.State
    }
    $output.Add($bunch) | Out-Null
}
$output | Export-Csv -NoType c:\dd.csv
From looking at the docs, it appears that what you are looking for is in the WorkspaceProperties property, which contains an Amazon.WorkSpaces.Model.WorkspaceProperties object with the following properties:
ComputeTypeName Amazon.WorkSpaces.Compute
RootVolumeSizeGib System.Int32
RunningMode Amazon.WorkSpaces.RunningMode
RunningModeAutoStopTimeoutInMinutes System.Int32
UserVolumeSizeGib System.Int32
Not sure about the CreationTime though...
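As a quick sanity check against your own account, you can also inspect the nested object directly instead of relying on the docs (a small sketch using only the cmdlets already shown above):

# List the members of the nested WorkspaceProperties object on the first workspace returned
(Get-WKSWorkspace | Select-Object -First 1).WorkspaceProperties | Get-Member -MemberType Property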

Quickly finding the size of an S3 'folder'

We have s3 'folders' (objects with a prefix under a bucket) with millions and millions of files and we want to figure out the size of these folders.
Writing my own .net application to get the lists of s3 objects was easy enough but the maximum number of keys per request is 1000, so it's taking forever.
Using S3Browser to look at a 'folder's' properties is taking a long time too. I'm guessing for the same reasons.
I've had this .NET application running for a week - I need a better solution.
Is there a faster way to do this?
The AWS CLI's ls command can do this:
aws s3 ls --summarize --human-readable --recursive s3://$BUCKETNAME/$PREFIX --region $REGION
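If you want to stay in PowerShell, a rough equivalent is sketched below, assuming the AWS Tools for PowerShell module and placeholder $BucketName/$Prefix variables; note that Get-S3Object still pages through keys 1000 at a time under the hood, so this is no faster than the CLI, just more scriptable:

# Sum the sizes of all objects under a given prefix
$objects = Get-S3Object -BucketName $BucketName -KeyPrefix $Prefix
$totalBytes = ($objects | Measure-Object -Property Size -Sum).Sum
"{0:n0} objects, {1:n2} GB" -f $objects.Count, ($totalBytes / 1GB)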
Seems like AWS has also added a console menu item where it's possible to see the size.
I prefer using the AWS CLI. I find that the web console often times out when there are too many objects.
Replace s3://bucket/ with wherever you want to start from.
This relies on awscli, awk, tail, and some bash-like shell:
start=s3://bucket/ && \
for prefix in `aws s3 ls $start | awk '{print $2}'`; do
echo ">>> $prefix <<<"
aws s3 ls $start$prefix --recursive --summarize | tail -n2
done
or in one line form:
start=s3://bucket/ && for prefix in `aws s3 ls $start | awk '{print $2}'`; do echo ">>> $prefix <<<"; aws s3 ls $start$prefix --recursive --summarize | tail -n2; done
Output looks something like:
$ start=s3://bucket/ && for prefix in `aws s3 ls $start | awk '{print $2}'`; do echo ">>> $prefix <<<"; aws s3 ls $start$prefix --recursive --summarize | tail -n2; done
>>> extracts/ <<<
Total Objects: 23
Total Size: 10633858646
>>> hackathon/ <<<
Total Objects: 2
Total Size: 10004
>>> home/ <<<
Total Objects: 102
Total Size: 1421736087
I think the ideal solution does not exist, but I can offer some ideas you can develop further:
- Is the app the only means by which files are written to S3? If so, you can store the file sizes (in a db, a file, or whatever) and sum them when necessary.
- Make concurrent calls to the LIST API.
- Can you switch from an organisation based on folders to one based on buckets? If so, you could query the billing API (yes, the billing) and calculate the size (or an approximation of it) from the cost...
If they're throttling you to 1000 keys per request, I'm not certain how PowerShell is going to help, but if you want the size of a bunch of folders, something like this should do it.
Save the following in a file called Get-FolderSize.ps1:
param
(
    [Parameter(Position=0, ValueFromPipeline=$True, Mandatory=$True)]
    [ValidateNotNullOrEmpty()]
    [System.String]
    $Path
)
function Get-FolderSize ($_ = (get-item .)) {
    Process {
        $ErrorActionPreference = "SilentlyContinue"
        #? { $_.FullName -notmatch "\\email\\?" } <-- Exclude folders.
        $length = (Get-ChildItem $_.fullname -recurse | Measure-Object -property length -sum).sum
        $obj = New-Object PSObject
        $obj | Add-Member NoteProperty Folder ($_.FullName)
        $obj | Add-Member NoteProperty Length ($length)
        Write-Output $obj
    }
}
Function Class-Size($size)
{
    IF ($size -ge 1GB)
    {
        "{0:n2}" -f ($size / 1GB) + " GB"
    }
    ELSEIF ($size -ge 1MB)
    {
        "{0:n2}" -f ($size / 1MB) + " MB"
    }
    ELSE
    {
        "{0:n2}" -f ($size / 1KB) + " KB"
    }
}
Get-ChildItem $Path | Get-FolderSize | Sort-Object -Property Length -Descending | Select-Object -Property Folder, Length | Format-Table -Property Folder, @{ Label="Size of Folder" ; Expression = {Class-Size($_.Length)} }
Usage: .\Get-FolderSize.ps1 -Path \path\to\your\folders