Amazon S3 - ColdFusion's fileExists breaks when a file was deleted by s3cmd

I'm running a site on ColdFusion 9 that stores cached information on Amazon S3.
The ColdFusion app builds the files and puts them into Amazon S3. Every N hours, the cache gets flushed with a bash script that executes s3cmd del, because it's much more efficient than ColdFusion's fileDelete or directoryDelete.
However, after the file has been deleted by s3cmd, ColdFusion will still flag it as an existing file, even though it won't be able to read its contents.
For the ColdFusion app, I provide the S3 credentials in Application.cfc, and they are the same authentication keys used by s3cmd, so I don't think it's a permissions issue.
Let's run through the process:
// Create an S3 directory with 3 files
fileWrite( myBucket & 'rabbits/bugs-bunny.txt', 'Hi there, I am Bugs Bunny' );
fileWrite( myBucket & 'rabbits/peter-rabbit.txt', 'Hi there, I am Peter Rabbit' );
fileWrite( myBucket & 'rabbits/roger-rabbit.txt', 'Hi there, I am Roger Rabbit' );
writeDump( var = directoryList(myBucket & 'rabbits/', 'true', 'name' ), label = 'Contents of the rabbits/ folder on S3' );
// Delete one of the files with ColdFusion's fileDelete
fileDelete( myBucket & 'rabbits/roger-rabbit.txt' );
writeDump( var = directoryList(myBucket & 'rabbits/', 'true', 'name' ), label = 'Contents of the rabbits/ folder on S3' );
// Now, let's delete a file using the command line:
[~]$ s3cmd del s3://myBucket/rabbits/peter-rabbit.txt
File s3://myBucket/rabbits/peter-rabbit.txt deleted
writeDump( var = directoryList(myBucket & 'rabbits/', 'true', 'name' ), label = 'Contents of the rabbits/ folder on S3' );
// So far, so good!
// BUT!... ColdFusion still thinks that peter-rabbit.txt exists, even
// though it cannot display its contents
writeOutput( 'Does bugs-bunny.txt exist?: ' & fileExists(myBucket & 'rabbits/bugs-bunny.txt') );
writeOutput( 'Then show me the content of bugs-bunny.txt: ' & fileRead(myBucket & 'rabbits/bugs-bunny.txt') );
writeOutput( 'Does peter-rabbit.txt exist?: ' & fileExists(myBucket & 'rabbits/peter-rabbit.txt') );
writeOutput( 'Then show me the content of peter-rabbit.txt: ' & fileRead(myBucket & 'rabbits/peter-rabbit.txt') );
// Error on fileRead(peter-rabbit.txt) !!!

I agree with the comment by @MarkAKruger that the problem here is latency.
Given that ColdFusion can't consistently tell whether a file exists, but it DOES consistently read its up-to-date contents (and consistently fails to read them when they are not available), I've come up with this solution:
string function cacheFileRead(
    required string cacheFileName
){
    var strContent = '';
    try {
        strContent = fileRead( ARGUMENTS.cacheFileName );
    } catch (any e) {
        strContent = '';
    }
    return strContent;
}

This answer assumes latency is your problem as I have asserted in the comments above.
I think I would keep track of when s3cmd is run. If you are running it via cfexecute, then store a timestamp in the Application scope, a file, or a DB table. Then, when checking for a file, if the command has been run in the last N minutes (you'll have to experiment to figure out what makes sense), recache automatically. Once N minutes have passed, you can treat your existence checks as reliable again.
If you are not running s3cmd from cfexecute, try creating a script that updates the timestamp in the Application scope, and then add a curl command to your s3cmd script that hits your CF script, keeping the two processes in sync.
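To make the timestamp idea concrete, here is a minimal sketch of the freshness check (in Python rather than CFML, purely to illustrate the logic; the function names and the N-minute window are hypothetical and would need tuning):

import time

FLUSH_GRACE_SECONDS = 10 * 60   # hypothetical "N minutes" window; tune experimentally
last_flush_at = 0.0             # updated whenever the s3cmd flush runs

def record_flush():
    """Call this from whatever wraps the s3cmd del flush (or from the script your curl command hits)."""
    global last_flush_at
    last_flush_at = time.time()

def existence_checks_are_reliable():
    """Only trust fileExists-style checks once the grace window after a flush has passed."""
    return (time.time() - last_flush_at) > FLUSH_GRACE_SECONDS

# Usage when serving a cached file:
#   if existence_checks_are_reliable() and file_exists(path):
#       return file_read(path)
#   else:
#       rebuild the cache entry unconditionally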
Your other option is to call fileExists() constantly (not a good idea; it's very expensive) or to keep track of what is and isn't cached some other way that can be updated in real time, a DB table for example. You would then need to clear the table from your s3cmd script (perhaps using the mysql command line).
I may think of something else for you. That's all I have for now. :)

Related

Using split/join to create a new AWS key

Alright, so I have the file transfer part working, but what I'm dealing with is on a huge scale (hundreds of thousands of potential uploads), so what I'm trying to do is this:
Trigger a Lambda to move the uploaded source object to a new location
The location should be a named key that includes the object's name (in a different bucket)
I have it moving the files from one S3 bucket to another, I just can't figure out how to get it to create a new key in my destination bucket based on the name of the uploaded file.
Example: uploaded file grandkids.jpg -> the Lambda put trigger moves the file to /grandkids/grandkids.jpg
Thank you all in advance. (It doesn't help that I only know the little bit of Node.js/Python I've picked up through Lambda; I am not an experienced coder at all.)
You just want to split the filename and use that as the prefix, like below.
fn = 'grandkids.jpg'
folder = fn.split('.')[0]
newkey = folder + '/' + fn
print(newkey)
grandkids/grandkids.jpg
But what if you have a filename with more than one '.'? Use rsplit with a maxsplit of 1 so you only split on the rightmost '.':
fn = 'my.awesome.grandkids.jpg'
folder = fn.rsplit('.', 1)[0].replace('.', '_') #personal preference to use underscores in folder names
newkey = folder + '/' + fn
print(newkey)
my_awesome_grandkids/my.awesome.grandkids.jpg
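If it helps to see where that key ends up, below is a minimal sketch of a Lambda handler (Python with boto3) that builds the key this way and copies the object into a destination bucket. The bucket name and the copy-then-delete step are assumptions to round out the example; the event parsing follows the standard S3 put-trigger event shape.

import boto3

s3 = boto3.client('s3')
DEST_BUCKET = 'my-destination-bucket'  # assumption: replace with your real bucket

def lambda_handler(event, context):
    # Standard S3 put-trigger event: one record per uploaded object
    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        src_key = record['s3']['object']['key']
        # note: keys with spaces arrive URL-encoded in the event; unquote if needed
        fn = src_key.split('/')[-1]                      # bare filename, e.g. grandkids.jpg
        folder = fn.rsplit('.', 1)[0].replace('.', '_')  # folder name derived from the filename
        new_key = folder + '/' + fn
        # Copy into the destination bucket under the new key, then remove the source
        s3.copy_object(Bucket=DEST_BUCKET, Key=new_key,
                       CopySource={'Bucket': src_bucket, 'Key': src_key})
        s3.delete_object(Bucket=src_bucket, Key=src_key)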

How to copy files from Google Drive to an S3 bucket using Google Apps Script?

I created a Google Form with a linked Google Spreadsheet. I would like the spreadsheet to be copied to an S3 bucket in AWS every time someone submits the form. To do so, I just got started with Google Apps Script. I managed to get the trigger part working on form submit, but I am struggling to understand the readme of this GitHub project for uploading to S3.
function setUpTrigger() {
  ScriptApp.newTrigger('copyDataS3')
    .forForm('1SK-2Ow63vs_TaoF54UjSgn35FL7F8_ANHDTOOiTabMM')
    .onFormSubmit()
    .create();
}
function copyDataS3() {
  // https://github.com/viuinsight/google-apps-script-for-aws
  // I do not understand where I should place aws.js and util.js.
  // Should I do File -> New -> Script file and copy-paste the contents? Should the file be .js or .gs?
  S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY");
  // If I want to copy a spreadsheet with the following id, what should go into "object" below?
  var ssID = "SPREADSHEET_ID";
  S3.putObject(bucketName, objectName, object, region)
}
I believe your goal is as follows.
You want to send a Google Spreadsheet to an S3 bucket as CSV data using Google Apps Script.
Modification points:
Looking at the google-apps-script-for-aws library you are using, I noticed that it sends the data as a string. In that case, your CSV data can be sent directly, but when you want to send binary data, an error occurs. So in this answer, I would like to propose modified scripts for 2 patterns.
I thought the situation might be similar to this thread, but I noticed that you are using a different library from that thread, so I am posting this answer.
Pattern 1:
This pattern supposes that only text data is sent, like the CSV data in your reply. In this case, I think the library does not need to be modified.
Modified script:
S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY"); // Please set this.
var spreadsheetId = "###"; // Please set the Spreadsheet ID.
var sheetName = "Sheet1"; // Please set the sheet name.
var region = "###"; // Please set this.
var csv = SpreadsheetApp
.openById(spreadsheetId)
.getSheetByName(sheetName)
.getDataRange()
.getValues() // or .getDisplayValues()
.map(r => r.join(","))
.join("\n");
var blob = Utilities.newBlob(csv, MimeType.CSV, sheetName + ".csv");
S3.putObject("bucketName", "test.csv", blob, region);
Pattern 2:
This pattern supposes that both text data and binary data are sent. In this case, the library side also needs to be modified.
For google-apps-script-for-aws
Please modify line 110 in s3.js as follows.
From:
var content = object.getDataAsString();
To:
var content = object.getBytes();
And please modify line 146 in s3.js as follows.
From:
Utilities.DigestAlgorithm.MD5, content, Utilities.Charset.UTF_8));
To:
Utilities.DigestAlgorithm.MD5, content));
For Google Apps Script:
In this case, please give the blob to S3.putObject as follows.
Script:
S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY"); // Please set this.
var fileId = "###"; // Please set the file ID.
var region = "###"; // Please set this.
var blob = DriveApp.getFileById(fileId).getBlob();
S3.putObject("bucketName", blob.getName(), blob, region);
References:
viuinsight/google-apps-script-for-aws
Class UrlFetchApp
computeDigest(algorithm, value)
PutObject

How can I display a PDF file in a region in Oracle APEX?

I want to display a PDF file in a region (a PL/SQL dynamic content region). I tried to do that by calling an application process using the code below, but the same file always opens.
DECLARE
  V_URL VARCHAR2(2500);
BEGIN
  V_URL := 'f?p=&APP_ID.:1:&APP_SESSION.:APPLICATION_PROCESS=display_emp_blob:::FILE_ID:' || :P6_ID;
  sys.htp.p('<p align="center">');
  sys.htp.p('<iframe src="' || V_URL || '" width="99%" height="1000">');
  sys.htp.p('</iframe>');
  sys.htp.p('</p>');
END;
and the application process code is below:
CREATE OR REPLACE PROCEDURE OPEN_FILE (P_ID NUMBER)
IS
  vBlob     blob;
  vmimetype varchar2(50);
BEGIN
  SELECT ORG_FILES.FILE_CONTENT, MIME_TYPE
    INTO vBlob, vmimetype
    FROM ORG_FILES
   WHERE ID = P_ID;
  sys.HTP.init;
  owa_util.mime_header(vmimetype, false);
  htp.p('Content-Length: ' || dbms_lob.getlength(vBlob));
  owa_util.http_header_close;
  wpg_docload.download_file(vBlob);
  apex_application.stop_apex_engine;
EXCEPTION
  WHEN no_data_found THEN
    NULL;
END;
How can I open a different PDF file in the region based on the value of the item (P6_ID)?
I think the problem you have is that the browser caches the file.
You can control how long the browser caches the file with the "Cache-Control" header. Below is the code that I use (I have this code in the application process, not in the database):
sys.htp.init;
sys.owa_util.mime_header( 'application/pdf', FALSE );
sys.htp.p('Content-length: ' || sys.dbms_lob.getlength( v_blob));
sys.htp.p('Content-Disposition: inline; filename="'|| v_filename || '"' ); -- "attachment" for download, "inline" for display
sys.htp.p('Cache-Control: max-age=3600'); -- in seconds. Tell the browser to cache for one hour, adjust as necessary
sys.owa_util.http_header_close;
sys.wpg_docload.download_file( v_blob );
apex_application.stop_apex_engine;
You can also try lazy loading, which is the way I access my files (it may be that the way you access your file is also part of the problem). This way the page loads without making the user wait, and the file is then loaded and shown afterwards. I don't use the iframe tag but the embed tag. The way to do it is as follows:
Create a static content region with this HTML:
<div id="view_pdf"></div>
Create a dynamic action on page load that executes JavaScript, and add the following code:
$('#view_pdf').html('');
var url = 'f?p=&APP_ID.:1:&APP_SESSION.:APPLICATION_PROCESS=display_emp_blob:::FILE_ID:' + apex.item('P6_ID').getValue();
var preview = document.createElement('embed');
preview.type = "application/pdf";
preview.width="100%";
preview.height="625px";
preview.src = url;
$("#view_pdf").append(preview);
You can modify the values depending on what you need. The embed tag uses the browser's default PDF viewer.
Also, if you want to change the PDF without reloading the page, you must run the previous JavaScript in a dynamic action on change of the item's value.
I hope you find it useful.
My apologies, on the rare occasion I use this region type, I always think it can be refreshed.
https://spendolini.blogspot.com/2015/11/refreshing-plsql-regions-in-apex.html
The solution is to create a classic report that calls a PL/SQL function that returns your HTML.
SELECT package_name.function_name(p_item => :P1_ITEM) result FROM dual

Postman: accessing the stored results in the LevelDB database

So I have a set of results in Postman from a runner on a collection using a data file for iterations. I have the stored data from the runner in the Postman app on Linux, but I want to know how I can get hold of the data. There seems to be a database hidden away in the ~/.config directory (/Desktop/file__0.indexeddb.leveldb) that looks like it has the data from the results.
Is there any way that I can get hold of the raw data? I want to be able to save the results from the database and not faff around with running newman or hacking a server to post the results and then save them; I already have 20,000 results in a collection. I want to be able to get the responseData from each post and save it to a file. I will not execute the posts again; I just need to work out a way to extract what is already stored.
I've tried KeyLord, FastNoSQL (this crashes), and levelDBViewer (Jar), but I'm not having any luck here.
Any suggestions?
In line 25024 of runner.js, as a simple hack for small numbers of results, I can do the following:
RunnerResultsRequestListItem = __WEBPACK_IMPORTED_MODULE_2_pure_render_decorator___default()(_class = class RunnerResultsRequestListItem extends __WEBPACK_IMPORTED_MODULE_0_react__["Component"] {
  constructor(props) {
    super(props);
    var text = props.request.response.body,
        blob = new Blob([text], { type: 'text/plain' }),
        anchor = document.createElement('a');
    anchor.download = props.request.ref + ".txt";
    anchor.href = (window.webkitURL || window.URL).createObjectURL(blob);
    anchor.dataset.downloadurl = ['text/plain', anchor.download, anchor.href].join(':');
    anchor.click();
It allows me to save, but obviously I have to click save for now. Does anyone know how to automate the saving part? Please add something here!

How to create a C/C++ program that generates an XML and runs a DOS command afterwards?

I need to come up with a program that generates an xml file like this:
<?xml version="1.0"?>
<xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
<xc:XmlCacheArea xc:target="AllSubFields" xc:value="MarketParameters">
<mp:nickName xmlns:mp="mx.MarketParameters" xc:value="MDS">
<mp:date xc:value="TODAY">
<fx:forex xmlns:fx="mx.MarketParameters.Forex">
<fxsp:spot xmlns:fxsp="mx.MarketParameters.Forex.Spot">
<fxsp:pair type="Fields" value="USD/BRL">
<mp:ask xc:type="Field" xc:keyFormat="N">1.890</mp:ask>
<mp:bid xc:type="Field" xc:keyFormat="N">1.800</mp:bid>
</fxsp:pair>
</fxsp:spot>
</fx:forex>
</mp:date>
</mp:nickName>
</xc:XmlCacheArea>
</xc:XmlCache>
with the values in the nodes mp:ask and mp:bid randomly generated but between two predefined values (1.65 and 1.99).
After the XML is generated in the same directory as the program, the program should run a command on the cmd command line that states:
cachetool.bat -i cacheBody.xml -u REALTIME
where cachetool.bat is an existing batch script that cannot be changed and is also placed in the same directory as the program, and where cacheBody.xml is the previously generated XML.
The trick here is that this should run repeatedly, overwriting the XML file with new values each time and then running the command again with the new values.
There should be a way to easily interrupt the loop, but besides that, this should run indefinitely.
Note: there isn't a strict rule to use C or C++; if it isn't feasible in these languages, or if there are other ways to do it easily, please feel free to suggest them. My initial proposal is in these languages because they are the two I'm a little used to dealing with.
I'm learning how to use JavaScript for Windows local scripting, so here's a solution in JavaScript.
It looks like you don't really need to generate the XML dynamically, but rather the XML structure is static and only a couple data fields are dynamic. With that in mind, I approached the problem with search-and-replace using a template file.
The template file (template.xml) contains XML content with some variables to search and replace. The variable format is $RANDOM_X_Y$, where X and Y are the lower and upper bounds for the random number. To help the example, I generated the ask and bid prices slightly differently in the template file:
<?xml version="1.0"?>
<xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
<xc:XmlCacheArea xc:target="AllSubFields" xc:value="MarketParameters">
<mp:nickName xmlns:mp="mx.MarketParameters" xc:value="MDS">
<mp:date xc:value="TODAY">
<fx:forex xmlns:fx="mx.MarketParameters.Forex">
<fxsp:spot xmlns:fxsp="mx.MarketParameters.Forex.Spot">
<fxsp:pair type="Fields" value="USD/BRL">
<mp:ask xc:type="Field" xc:keyFormat="N">1.$RANDOM_65_99$0</mp:ask>
<mp:bid xc:type="Field" xc:keyFormat="N">1.$RANDOM_650_990$</mp:bid>
</fxsp:pair>
</fxsp:spot>
</fx:forex>
</mp:date>
</mp:nickName>
</xc:XmlCacheArea>
</xc:XmlCache>
The javascript file is called replace.js. All versions of Windows should be able to execute it natively without installing any extra components.
if( WScript.Arguments.Count() != 2 || WScript.Arguments.Item(0) == WScript.Arguments.Item(1) )
{
    WScript.Echo("Usage: replace.js <template> <output filename>");
    WScript.Quit();
}
var template_filename = WScript.Arguments.Item(0);
var output_filename = WScript.Arguments.Item(1);
var fso = new ActiveXObject("Scripting.FileSystemObject");
var ForReading = 1;
var file, file_contents, lower, upper;
var var_regex = /\$RANDOM_(\d+)_(\d+)\$/g;
if( fso.FileExists(template_filename) )
{
    file = fso.OpenTextFile(template_filename, ForReading, false);
    file_contents = file.ReadAll().replace(var_regex,
        function(str, lower, upper) {
            return Math.floor(
                Math.random() * (+upper - +lower + 1)) + +lower;
        });
    file.Close();
    file = fso.CreateTextFile(output_filename, true);
    file.Write(file_contents);
    file.Close();
}
else
{
    WScript.Echo("Template does not exist: " + template_filename);
}
Now to run your scripts indefinitely, just create a batch file called run.bat or whatever and have it run the javascript and batch files in a loop. CTRL-C will exit the script.
@echo off
echo Starting. Press CTRL-C to exit.
:loop
replace.js template.xml cacheBody.xml
cachetool.bat -i cacheBody.xml -u REALTIME
goto loop
Well, to create the random value, you can use the rand() function, and just scale it so it's between the two values you want.
To call the command line, try system("cachetool.bat -i cacheBody.xml -u REALTIME");
And for the XML, if it's all the same except for the numbers, you can just hardcode it. If not, you'll need an XML library.
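Since the question says other languages are fine, here is a minimal sketch of the whole loop in Python: the XML is hardcoded as a template, the ask/bid values are drawn from the range given in the question, the file is rewritten as cacheBody.xml, and the batch tool is run until you press Ctrl-C. The one-second pause and the use of shell=True are assumptions; adjust as needed.

import random
import subprocess
import time

# XML body hardcoded as a template; only the ask/bid values change each run.
TEMPLATE = """<?xml version="1.0"?>
<xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
 <xc:XmlCacheArea xc:target="AllSubFields" xc:value="MarketParameters">
  <mp:nickName xmlns:mp="mx.MarketParameters" xc:value="MDS">
   <mp:date xc:value="TODAY">
    <fx:forex xmlns:fx="mx.MarketParameters.Forex">
     <fxsp:spot xmlns:fxsp="mx.MarketParameters.Forex.Spot">
      <fxsp:pair type="Fields" value="USD/BRL">
       <mp:ask xc:type="Field" xc:keyFormat="N">{ask:.3f}</mp:ask>
       <mp:bid xc:type="Field" xc:keyFormat="N">{bid:.3f}</mp:bid>
      </fxsp:pair>
     </fxsp:spot>
    </fx:forex>
   </mp:date>
  </mp:nickName>
 </xc:XmlCacheArea>
</xc:XmlCache>
"""

try:
    while True:
        # random values between the two predefined bounds from the question
        ask = random.uniform(1.65, 1.99)
        bid = random.uniform(1.65, 1.99)
        with open("cacheBody.xml", "w") as f:
            f.write(TEMPLATE.format(ask=ask, bid=bid))
        # run the existing batch tool against the freshly written file
        subprocess.call("cachetool.bat -i cacheBody.xml -u REALTIME", shell=True)
        time.sleep(1)  # assumption: small pause between iterations
except KeyboardInterrupt:
    pass  # Ctrl-C stops the loop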