Increasing concurrent requests in DropNet - dropnet

I have some giant folders saved in Dropbox with more than 10k files in them. I want to check whether a list of files exists there, but I can't get the metadata on the parent folder because I am over the 10k limit.
So I have written code to check whether each file in a list of files is present.
What I can't figure out is how many requests will run concurrently, and how I can increase this number to the maximum my machine can handle.
foreach (string f in files)
{
    client.GetMetaDataAsync("/blah/blah/" + f,
        (response) =>
        {
            found.Add(f);
            count++;
            Console.WriteLine("{0} of {1} found - {2}", count, files.Count, f);
        },
        (error) =>
        {
            if (error.Message.Contains("[NotFound]"))
            {
                missing.Add(f);
                count++;
                Console.WriteLine("{0} of {1} missing - {2}", count, files.Count, f);
            }
            else
            {
                Console.WriteLine("Unknown error");
            }
        });
}
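For what it's worth, outgoing HTTP concurrency in the .NET Framework is capped per host by ServicePointManager.DefaultConnectionLimit (System.Net), which defaults to 2 for client applications, so raising that before starting the loop is usually the first knob to turn. Below is a minimal sketch reusing the client, files, found, missing and count variables from the question; the SemaphoreSlim throttle and the maxParallel value are illustrative additions, not part of DropNet, and the locking is only there because the callbacks fire on worker threads.

// Requires System.Net and System.Threading.
// Allow more simultaneous connections per host (the client default is 2).
ServicePointManager.DefaultConnectionLimit = 20;

int maxParallel = 20;                      // illustrative cap, tune for your machine
var throttle = new SemaphoreSlim(maxParallel);

foreach (string f in files)
{
    throttle.Wait();                       // block until a request slot is free
    string file = f;                       // per-iteration copy for the callbacks
    client.GetMetaDataAsync("/blah/blah/" + file,
        (response) =>
        {
            lock (found) { found.Add(file); }
            Interlocked.Increment(ref count);
            throttle.Release();
        },
        (error) =>
        {
            if (error.Message.Contains("[NotFound]"))
            {
                lock (missing) { missing.Add(file); }
                Interlocked.Increment(ref count);
            }
            throttle.Release();
        });
}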

Related

Tracking failed jobs in SAS

I'm looking to improve the way to track a server that uses SAS to collect information from several databases, the challenge here is that it's one of those servers that a company used for a while without any governance, folders are everywhere, no actual folder structure, anyone was doing whatever they pleased with it. You know the drill.
The option that I found for now is to put a macro on the jobs that are failing that will track the logs, and if it finds an error, then it should send out an email to a distribution list.
But this solution isn't very good software architecture: every time a new job is scheduled in the crontab, I need to slap the macro there, and if things move to another directory it stops working, and so on; a lot of manual work cascading down.
The solution I'm looking for would be something that can read all logs from a list of directories (keeping that list would then be the biggest manual update it needs, and only because I don't have authorization to move the folders), fetch the ones that have errors, and output this to a web page, so we only have to check that web page for the centralized information (when it's scheduled, when it last ran, etc.).
By any chance does anyone have any suggestions on how to do this? I thought of running a SAS script to just output a txt file that I can fetch somewhere and then reading that with JavaScript on a web page, but I ran into a few difficulties looping through all of the folders, and the language doesn't really feel all that great for this purpose. But I'm out of ideas for now. Recommendations?
If you are using Node server side to deliver web content, you can write a function that scans the folders listed in a control file for log files containing ERROR: messages and reports them.
For example:
log folders.txt (control file)
d:\temp\logs\job-1
d:\temp\logs\tuesday jobs
d:\jobs\wednesday\job set 1
d:\jobs\wednesday\job set 2\prod
node app (test version)
const express = require('express');
const fs = require('fs');
const readline = require('readline');
const path = require('path');

const app = express();
const port = 8081;
const listOfFoldersFile = 'log folders.txt';

app.get('/sas/logs/errors', (req, res) => {
  res.setHeader('Content-Type', 'text/html');
  processFolders(res);
  res.end();
});

app.listen(port, () => console.log(`Listening on port ${port}`));

function processFolders(res) {
  var level1Count = 0;   // ERROR: lines found overall
  var level2Count = 0;   // ERROR: lines found in the current folder
  var level3Count = 0;   // ERROR: lines found in the current log file
  var content;
  try {
    content = fs.readFileSync(listOfFoldersFile, 'UTF-8');
  }
  catch (err) {
    console.log(err.toString());
    res.write('<p>Configuration error in processFolders.</p>');
    return;
  }
  // Ignore blank lines (e.g. a trailing newline) in the control file.
  const folders = content.split(/\r?\n/).filter(folder => folder.trim().length > 0);
  for (let folder of folders) {
    scanForLogs(folder);
  }
  if (level1Count == 0) {
    res.write('<p>No error messages found in SAS logs.</p>');
    return;
  }
  return;

  function scanForLogs(folder) {
    //console.log(`scan folder ${folder}`);
    var logfiles;
    try {
      logfiles = fs.readdirSync(folder).filter(filename => filename.match(/\.log$/));
    }
    catch (err) {
      console.log(err.toString());
      res.write('<p>Problem scanning a log folder.</p>');
      return;
    }
    level2Count = 0;
    for (let logfile of logfiles) {
      parseSASLog(folder, logfile);
    }
  }

  function parseSASLog(folder, filename) {
    //console.log(`parse log ${path.join(folder, filename)}`);
    var content;
    try {
      content = fs.readFileSync(path.join(folder, filename), 'UTF-8');
    }
    catch (err) {
      console.log(err.toString());
      res.write(`<p>Problem reading file ${filename}</p>`);
      return;
    }
    const lines = content.split(/\r?\n/);
    level3Count = 0;
    var linenum = 0;
    for (const line of lines) {
      linenum++;
      if (line.match(/^ERROR:/)) {
        reportErrorMessage(folder, filename, line, linenum);
      }
    }
    if (level3Count) {
      res.write('</pre>');
    }
  }

  function reportErrorMessage(folder, filename, line, linenum) {
    //console.log(`reportErrorMessage`);
    level1Count++;
    level2Count++;
    level3Count++;
    if (level1Count == 1) {
      res.write('<h1>SAS errors</h1>');
    }
    if (level2Count == 1) {
      res.write(`<h2>folder: ${folder}</h2>`);
    }
    if (level3Count == 1) {
      res.write(`<h3>file: ${filename}</h3><pre>`);
    }
    res.write(`${linenum}: ${line}\n`);
  }
}
Errors reported (example)
Other approaches could include
asynchronous server-side scans delivering output to the client side piecewise through SignalR
delivering the error report as JSON data which the client side uses to render the report (see the sketch below)
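For the JSON variant, a minimal sketch under the same assumptions as above (same control file, same ERROR: convention; the /sas/logs/errors.json route and the scanLogsForErrors helper are illustrative names, not an existing API):

const express = require('express');
const fs = require('fs');
const path = require('path');
const app = express();

// Read the control file and collect every ERROR: line as a plain object.
function scanLogsForErrors(controlFile) {
  const errors = [];
  const folders = fs.readFileSync(controlFile, 'utf-8')
    .split(/\r?\n/)
    .filter(folder => folder.trim().length > 0);
  for (const folder of folders) {
    let logfiles = [];
    try {
      logfiles = fs.readdirSync(folder).filter(f => f.endsWith('.log'));
    } catch (err) {
      continue; // unreadable folder: skip it, or push a warning object instead
    }
    for (const logfile of logfiles) {
      const lines = fs.readFileSync(path.join(folder, logfile), 'utf-8').split(/\r?\n/);
      lines.forEach((line, index) => {
        if (/^ERROR:/.test(line)) {
          errors.push({ folder, file: logfile, line: index + 1, text: line });
        }
      });
    }
  }
  return errors;
}

// The client fetches this and renders the report however it likes.
app.get('/sas/logs/errors.json', (req, res) => {
  res.json(scanLogsForErrors('log folders.txt'));
});

app.listen(8081);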

How to delete mass records using Map/reduce script?

I have created a Map/Reduce script which will fetch customer invoices and delete them. If I create a saved search in the UI based on the criteria below, it shows 4 million records. Now, if I run the script, execution stops before completing the "getInputData" stage, as the maximum storage limit of that stage is 200 MB. So I want to fetch the first 4000 records out of 4 million, process them, and schedule the script to run every 15 minutes. Here is the code of the first stage (getInputData):
var count = 0;
var counter = 0;
var result = [];
var testSearch = search.create({
    type: 'customrecord1',
    filters: [ 'custrecord_date_created', 'notonorafter', 'startOfLastMonth' ],
    columns: [ 'internalid' ]
});
do {
    var resultSearch = testSearch.run().getRange({
        start: count,
        end: count + 1000
    });
    for (var arr = 0; arr < resultSearch.length; arr++) {
        result.push(resultSearch[arr]);
    }
    counter = count + counter;
} while (resultSearch.length >= 1000 && counter != 4000);
return result;
Creating the saved search takes a long time; is there any workaround where we can filter the first 4000 records during saved search creation?
Why not a custom mass update?
It would be a 5-10 line script that grabs the internal id and record type of the current record in the criteria of the mass update then deletes the record.
I believe this is what search.runPaged() and pagedData.fetch() is for.
search.runPaged runs the current search and returns summary information about paginated results - it does not give you the result set or save the search.
pagedData.fetch retrieves the data within the specified page range.
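A rough sketch of that pattern, assuming the same customrecord1 search as in the question (the 1000-per-page size, the 4000 cap, and the variable names are illustrative):

// Inside getInputData: collect up to 4000 internal ids, one page at a time.
var pagedData = search.create({
    type: 'customrecord1',
    filters: [ 'custrecord_date_created', 'notonorafter', 'startOfLastMonth' ],
    columns: [ 'internalid' ]
}).runPaged({ pageSize: 1000 });

var ids = [];
pagedData.pageRanges.forEach(function (pageRange) {
    if (ids.length >= 4000) return;                      // already have the batch
    var page = pagedData.fetch({ index: pageRange.index });
    page.data.forEach(function (result) {
        if (ids.length < 4000) ids.push(result.id);
    });
});
return ids;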
If you are intent on the Map/Reduce you can just return your created search. NetSuite will run it and pass each line to the next phase. You can even use a saved search where you limit the number of lines, and then in your summarize phase re-trigger the script if there's anything left to do.
The syntax for grabbing the first 4k records, though, is:
var toDelete = [];
search.run().each(function (r) {
    toDelete.push(r.id);
    return toDelete.length < 4000;
});
return toDelete;
Finally, I normally do this as a scheduled mass update. It tends to interfere less with any production scheduled and map/reduce scripts.
/**
 * @NApiVersion 2.x
 * @NScriptType MassUpdateScript
 */
define(["N/log", "N/record"], function (log, record) {
    function each(params) {
        try {
            record.delete({
                type: params.type,
                id: params.id
            });
            log.audit({ title: 'deleted ' + params.type + ' ' + params.id, details: '' });
        }
        catch (e) {
            log.error({
                title: 'deleting: ' + params.type + ' ' + params.id,
                details: (e.message || e.toString()) + (e.getStackTrace ? (' \n \n' + e.getStackTrace().join(' \n')) : '')
            });
        }
    }
    return {
        each: each
    };
});

How can I find corrupt files in a SharePoint 2013 ListItemCollection using CSOM

I have a CSOM program that transfers hundreds of PDF files into SharePoint 2013 libraries. Once in a while one of those transferred files will be corrupt and can't be opened. The source file is good, and the same file can be opened after having been transferred to other libraries, but in one random library it will be corrupt.
I want to loop through the libraries, find the corrupt files and delete them but how can I tell using CSOM if the file is corrupt? I tried looping through and using File.OpenBinaryStream() but that succeeds on the corrupt files. Below is the code that reads the library and loops through the files. Any suggestions would be appreciated.
using (ClientContext destContext = new ClientContext(clientSite.Value))
{
    destContext.AuthenticationMode = ClientAuthenticationMode.FormsAuthentication;
    destContext.FormsAuthenticationLoginInfo = new FormsAuthenticationLoginInfo(ClientSiteUsername, ClientSitePassword);

    // get the new list
    Web destWeb = destContext.Web;
    ListCollection lists = destWeb.Lists;
    List selectedList = lists.GetByTitle(clientLibraryName);
    destContext.Load(lists);
    destContext.Load(selectedList);

    ListItemCollection clientCurrentItemsList = selectedList.GetItems(CamlQuery.CreateAllItemsQuery());
    destContext.Load(clientCurrentItemsList,
        eachItem => eachItem.Include(
            item => item,
            item => item["ID"],
            item => item["FileLeafRef"]));

    try
    {
        destContext.ExecuteQuery();
    }
    catch (Exception ex)
    {
        log.Warn(String.Format("Error in VerifyClientDocuments. Could not read client library: {0}", clientSite.Value), ex);
        continue; // 'continue' implies this block sits inside an outer loop over client sites (not shown)
    }

    foreach (ListItem item in clientCurrentItemsList)
    {
        try
        {
            item.File.OpenBinaryStream();
        }
        catch (Exception ex)
        {
            var val = ex.Message;
            //delete here
        }
    }
}
What worked was checking the size of the new file against the expected file size. The corrupt files actually report a 0 byte size so I deleted those then re-added them later.
for (var counter = clientCurrentItemsList.Count; counter > 0; counter--)
{
    var clientFileSize = clientCurrentItemsList[counter - 1].FieldValues.
        Where(x => x.Key == "File_x0020_Size").FirstOrDefault().Value.ToString();
    var fileName = clientCurrentItemsList[counter - 1].FieldValues.
        Where(x => x.Key == "FileLeafRef").FirstOrDefault().Value.ToString();
    var serverFileSize = approvedDocumentList.Where(x => x.FieldValues["FileLeafRef"].ToString() ==
        fileName).FirstOrDefault().FieldValues["File_x0020_Size"].ToString();

    if (clientFileSize != serverFileSize)
    {
        clientCurrentItemsList[counter - 1].DeleteObject();
        destContext.ExecuteQuery();
        log.Info(String.Format("File [{0}] deleted from {1} File type {2} - Reason: Client file was corrupt",
            fileName, clientSiteURL, documentType.ToString()));
    }
}
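As an aside, the OpenBinaryStream() check in the question never actually runs: CSOM calls are deferred, so nothing is sent to the server until ExecuteQuery() is called, and the loop shown never does that for the stream. A hedged sketch of an executed version, if a per-file download check is ever needed (slow, since it round-trips once per file; the 0-byte test mirrors the size observation above):

foreach (ListItem item in clientCurrentItemsList)
{
    // OpenBinaryStream is deferred; the stream is only populated after ExecuteQuery.
    ClientResult<Stream> data = item.File.OpenBinaryStream();
    destContext.ExecuteQuery();

    using (var ms = new MemoryStream())
    {
        data.Value.CopyTo(ms);
        if (ms.Length == 0)
        {
            // corrupt upload: delete and queue for re-upload
        }
    }
}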

How to use casperjs evaluate to diff 2 strings and assert

I have a simple test that checks that a user's quota correctly changes after they upload a file.
casper.then(function() {
    quota_begin = this.evaluate(function() {
        return document.querySelector('.storage_used p').textContent;
    });
});

casper.then(function() {
    common.ACTIONS.uploadFile(casper);
});

casper.then(function() {
    quota_changed = this.evaluate(function() {
        return document.querySelector('.storage_used p').textContent;
    });
    this.echo('Storage quota change: ' + quota_begin + ' => ' + quota_changed);
});
That last echo's output gives me:
Storage quota change: Upload quota 0B of 1GB used => Upload quota 192 KB of 1GB used
I'd like to include an assert in the test that fails when quota_begin and quota_changed do not actually change.
Something like:
test.assert(parseFloat(quota_changed) > parseFloat(quota_begin), "Quota was increased by file");
(doesn't work)
Is there an easy way to assert a diff on the two? regex?
Writing a simple function to parse the used bytes out of that string will do the task (note the \s* in the regex, so it matches both "0B" and "192 KB"):

function get_used_bytes(input) {
    var unit_dict = { 'B': 1, 'KB': 1024, 'MB': 1024 * 1024, 'GB': 1024 * 1024 * 1024 };
    var ret = /Upload quota ([\d.]+)\s*(\S+) of ([\d.]+)\s*(\S+) used/g.exec(input);
    return ret[1] * unit_dict[ret[2]];
}
// get_used_bytes("Upload quota 192 KB of 1GB used")
// 196608
test.assert(get_used_bytes(quota_changed) > get_used_bytes(quota_begin), "Quota was increased by file");

How to read a JSON file containing multiple root elements?

If I had a file whose contents looked like:
{"one": 1}
{"two": 2}
I could simply parse each separate line as a separate JSON object (using JsonCpp). But what if the structure of the file was less convenient like this:
{
    "one": 1
}
{
    "two": 2
}
No one has mentioned arrays:
[
{"one": 1},
{"two": 2}
]
This is valid JSON and might do what the OP wants.
Neither example in your question is a valid JSON object; a JSON object may only have one root. You have to split the file into two objects, then parse them.
You can use http://jsonlint.com to see if a given string is valid JSON or not.
So I recommend either changing whatever is dumping multiple JSON objects into a single file so that it writes separate files, or putting each object as a value in one JSON root object.
If you don't have control over whatever is creating these, then you're stuck parsing the file yourself to pick out the different root objects.
Here's a valid way of encoding those data in a JSON object:
{
    "one": 1,
    "two": 2
}
If you really need separate objects, you can do it like this:
{
    "one":
    {
        "number": 1
    },
    "two":
    {
        "number": 2
    }
}
Rob Kennedy is right: calling the parser a second time would extract the next object, and so on. Most JSON libraries cannot handle multiple roots in a single file for you, unless you are using a higher-level framework such as Qt's.
You can also use this custom function to parse multiple root elements even if you have complex objects.
function getParsedJson(jsonString) {
    const parsedJsonArr = [];
    let tempStr = '';
    let isObjStartFound = false;
    for (let i = 0; i < jsonString.length; i += 1) {
        if (isObjStartFound) {
            tempStr += jsonString[i];
            if (jsonString[i] === '}') {
                // Try to parse what has accumulated so far; an inner '}' simply
                // fails to parse and the loop keeps accumulating characters.
                try {
                    const obj = JSON.parse(tempStr);
                    parsedJsonArr.push(obj);
                    tempStr = '';
                    isObjStartFound = false;
                } catch (err) {
                    // not a complete JSON object yet
                }
            }
        }
        if (!isObjStartFound && jsonString[i] === '{') {
            tempStr += jsonString[i];
            isObjStartFound = true;
        }
    }
    return parsedJsonArr;
}
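For example, feeding it the second layout from the question (illustrative usage; the template-literal input is just test data):

const input = `
{
    "one": 1
}
{
    "two": 2
}
`;

console.log(getParsedJson(input));
// [ { one: 1 }, { two: 2 } ]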