Replacement Character inserted between each letter of CSV Dataset. How to Replace? - regex

I'm working on importing a CSV dataset into a Google Sheet from my Drive. The script runs, but whenever the data imports it looks like this.
After Import
var file = DriveApp.getFileById(url);
var csvString = file.getBlob().getDataAsString('UTF-8').replace(/\uFFFD/g, '');
var csvData = Utilities.parseCsv(csvString);
var sheet = SpreadsheetApp.openById(sheetid);
var s = sheet.getSheetByName('Data');
s.getRange(1, 1, csvData.length, csvData[0].length).setValues(csvData);
I've tried a number of different regular expressions to replace the unknown characters, but after a few days of trying to figure it out, I figured I'd post here and get a bit of help. (I didn't include the .replace() in the code because I couldn't get it to work. This is the code that works only to paste the data into my sheet.)
Edit: Here is the expected output. I've whited out the email addresses and usernames to keep the information private.
Expected Output
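An unknown character between every letter is a typical sign of UTF-16 text being decoded as UTF-8. The sketch below reproduces that symptom in plain JavaScript. It assumes the file is UTF-16LE with a byte order mark, which the question does not confirm, and the sample bytes are made up. If that assumption holds, passing the matching charset to getDataAsString (for example 'UTF-16') would likely be the equivalent fix in Apps Script.

```javascript
// Hypothetical sample: the text "a,b" stored as UTF-16LE with a byte order mark.
const bytes = new Uint8Array([0xFF, 0xFE, 0x61, 0x00, 0x2C, 0x00, 0x62, 0x00]);

// Wrong charset: the two BOM bytes are invalid UTF-8 and each becomes U+FFFD,
// and every zero high byte becomes a NUL character between the letters.
const wrong = new TextDecoder('utf-8').decode(bytes);

// Matching charset: the BOM is consumed and the text comes out clean.
const right = new TextDecoder('utf-16le').decode(bytes);

console.log(JSON.stringify(wrong));
console.log(JSON.stringify(right));
```

Under this assumption, stripping only \uFFFD would leave the NUL characters behind, which may be why the .replace() attempts appeared to do nothing.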

Related

How to copy files from gdrive to s3 bucket using google scripts?

I created a Google Form with a linked Google Spreadsheet. I would like the spreadsheet to be copied to an S3 bucket in AWS every time someone submits the form. To do so, I just got started with Google Scripts. I managed to get the trigger part working on form submit, but I am struggling to understand the readme of this GitHub project to upload to S3.
function setUpTrigger() {
ScriptApp.newTrigger('copyDataS3')
.forForm('1SK-2Ow63vs_TaoF54UjSgn35FL7F8_ANHDTOOiTabMM')
.onFormSubmit()
.create();
}
function copyDataS3() {
// https://github.com/viuinsight/google-apps-script-for-aws
// I do not understand where I should place aws.js and util.js.
// Should I do File -> New -> Script file and copy and paste the contents? Should the file be .js or .gs?
S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY");
// If I want to copy a spreadsheet with the following ID, what should go into "object" below?
var ssID = "SPREADSHEET_ID";
S3.putObject(bucketName, objectName, object, region)
}
I believe your goal is as follows.
You want to send a Google Spreadsheet to an S3 bucket as CSV data using Google Apps Script.
Modification points:
When I looked at google-apps-script-for-aws, the library you are using, I noticed that the data is sent as a string. In that case, I think your CSV data can be sent directly. However, if you try to send binary data, for example, an error will occur. So in this answer, I would like to propose two patterns of the modified script.
I thought the situation might be similar to this thread, but I noticed that you are using a different library from the one in that thread, so I am posting this answer.
Pattern 1:
This pattern assumes that only text data is sent, like the CSV data mentioned in your reply. In this case, I think the library does not need to be modified.
Modified script:
S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY"); // Please set this.
var spreadsheetId = "###"; // Please set the Spreadsheet ID.
var sheetName = "Sheet1"; // Please set the sheet name.
var region = "###"; // Please set this.
var csv = SpreadsheetApp
.openById(spreadsheetId)
.getSheetByName(sheetName)
.getDataRange()
.getValues() // or .getDisplayValues()
.map(r => r.join(","))
.join("\n");
var blob = Utilities.newBlob(csv, MimeType.CSV, sheetName + ".csv");
S3.putObject("bucketName", "test.csv", blob, region);
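Joining each row with "," works as long as no cell contains a comma, a double quote, or a line break. The sketch below shows one way to quote such cells before joining. The names toCsvCell and toCsv and the sample rows are hypothetical and are not part of the answer or the library.

```javascript
// Quote a single cell following the usual CSV convention:
// wrap in double quotes when needed and double any embedded quotes.
function toCsvCell(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Convert a 2D array (such as the result of getValues()) to a CSV string.
function toCsv(rows) {
  return rows.map(row => row.map(toCsvCell).join(',')).join('\n');
}

// Hypothetical sample data.
const rows = [['name', 'note'], ['Smith, J', 'said "hi"']];
const csv = toCsv(rows);
console.log(csv);
```

In the script above, toCsv(values) could stand in for the .map(r => r.join(",")).join("\n") step if the sheet may contain such cells.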
Pattern 2:
This pattern assumes that both text data and binary data are sent. In this case, the library side also needs to be modified.
For google-apps-script-for-aws
Please modify line 110 of s3.js as follows.
From:
var content = object.getDataAsString();
To:
var content = object.getBytes();
And please modify line 146 of s3.js as follows.
From:
Utilities.DigestAlgorithm.MD5, content, Utilities.Charset.UTF_8));
To:
Utilities.DigestAlgorithm.MD5, content));
For Google Apps Script:
In this case, please pass the blob to S3.putObject as follows.
Script:
S3.init("MY_ACCESS_KEY", "MY_SECRET_KEY"); // Please set this.
var fileId = "###"; // Please set the file ID.
var region = "###"; // Please set this.
var blob = DriveApp.getFileById(fileId).getBlob();
S3.putObject("bucketName", blob.getName(), blob, region);
References:
viuinsight/google-apps-script-for-aws
Class UrlFetchApp
computeDigest(algorithm, value)
PutObject

Extract Hyperlinks Google Apps Script

For a long time the following code worked perfectly to extract hyperlinks from a text using a regular expression:
var text = "this is a http://google.de link!";
var link = text.match("(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+üäö&@#/%=~_|$?!:,.]*\)|[-A-Z0-9+üäö&@#/%=~_|$?!:,.])*(?:\([-A-Z0-9+üäö&@#/%=~_|$?!:,.]*\)|[A-Z0-9+üäö&@#/%=~_|$])", "gi");
The result should be "http://google.de" for the variable link, but it doesn't work anymore. I suspect Google has changed something in GAS.
Can you please tell me which expression I can use to extract hyperlinks from a string?
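One possible explanation: the standard String.prototype.match takes a single argument, so a second "gi" argument is ignored and a string pattern is compiled without flags. Without the i flag, the A-Z character classes no longer match lowercase letters. Whether this is what changed in Apps Script is an assumption. The sketch below uses a regex literal with the flags attached. The pattern is a simplified version of the one in the question (the parenthesised alternatives are dropped), not a complete URL matcher.

```javascript
const text = "this is a http://google.de link!";

// Flags belong to the RegExp itself, not to a second argument of match().
// Simplified pattern: scheme or www./ftp. prefix, then URL characters,
// ending on a character that is not trailing punctuation.
const urlPattern = /(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+üäö&@#\/%=~_|$?!:,.]*[A-Z0-9+üäö&@#\/%=~_|$]/gi;

// With the g flag, match() returns every match, or null when there is none.
const links = text.match(urlPattern) || [];
console.log(links);
```

If the pattern has to stay a string, new RegExp(patternString, "gi") is the standard way to supply flags, keeping in mind that backslashes must be doubled inside a string literal.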

Regex to grab data from massive HTML string

I am grabbing an HTML source dump that includes some sort of JSON props created by React.
I'm trying to grab data in a form like this: "siteName":"Example Site". I want to grab that "Example Site" text without the quotation marks.
I know I could be using an HTML parser but this is actually within some JS code in the source.
Any thoughts on how I could do this? Thanks
This regex gets it, but I would use something else, such as a JSON parser:
var regex = /"siteName":"(.+?)"/g;
var str = `{"siteName":"ABC Example Business","contactName":"Jeff","siteKey":"abcexample","tabKey":"service","entityKey":"1192289","siteId":152285976,"entityId":13123055221,"phone":"","mobile":"0100 000 000",}`;
var result = regex.exec(str);
console.log(result[1]);
How about this:
\"siteName\":\"(.+)\"
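The two suggestions above differ in one detail: (.+) is greedy and runs to the last quotation mark in the input, while (.+?) stops at the first. The sketch below compares them on a shortened, well-formed version of the sample string (an assumption made for illustration; the sample in the first answer ends with a trailing comma, which JSON.parse would reject).

```javascript
// Shortened, valid-JSON version of the sample data (hypothetical).
const str = '{"siteName":"ABC Example Business","contactName":"Jeff","siteKey":"abcexample"}';

// Greedy: captures up to the LAST double quote in the string.
const greedy = /"siteName":"(.+)"/.exec(str)[1];

// Lazy: captures up to the FIRST following double quote.
const lazy = /"siteName":"(.+?)"/.exec(str)[1];

// Negated class: same result as lazy, and it also allows an empty value.
const negated = /"siteName":"([^"]*)"/.exec(str)[1];

// When the fragment really is valid JSON, parsing avoids regex pitfalls.
const parsed = JSON.parse(str).siteName;

console.log({ greedy, lazy, negated, parsed });
```

None of the regex variants handle escaped quotes inside the value, which is one reason a JSON parser is preferable when the props can be isolated as valid JSON.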

Parsing email with Google Apps Script, regex issue?

I used to be quite proficient in VBA with excel, but I'm currently trying to do something with Google Scripts and I am well and truly stuck.
Basically, I am trying to extract data from a standardised Gmail email into a Google Sheet. There are a couple of other threads on the subject which I have consulted so far, and I can get the body of the email into the sheet but cannot parse it.
I am new to regex, but my pattern tests OK on regex101.
I am also brand new to Google Script, and even the debugger seems to have stopped working now (it worked before, so I would be grateful if anyone can suggest why).
Here is my basic function:
function processInboxToSheet() {
var label = GmailApp.getUserLabelByName("NEWNOPS");
var threads = label.getThreads();
// Set destination sheet
var sheet = SpreadsheetApp.getActiveSheet();
// Get all emails labelled NEWNOPS
for (var i = 0; i < threads.length; i++) {
var tmp,
message = threads[i].getMessages()[1], // second message in thread
content = message.getPlainBody(); // remove html markup
if (content) {
// search email for 'of:' and capture next line of text as address
// tests OK at regex101.com
property = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
// if no match, display error
var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
sheet.appendRow([property]);
} // End if
// remove label to avoid duplication
threads[i].removeLabel(label)
} // End for loop
}
I can append 'content' to the sheet OK, but cannot extract the address text required by the regex. Content displays as follows:
NOPS for the purchase of:
123 Any Street, Anytown, AN1 1AN
DATE: 05/05/2017
PRICE: £241,000
Seller’s Details
NAME: Mrs Seller
Thanks for reading :)
The return value of .match() is an array. The first captured group, containing the address, will be at index 1.
Based on the following line after your call to .match(), it looks like the tmp variable should have been assigned that array, not the property variable.
var property = (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
That line says, if .match() returned something that isn't null and has a value at index 1, then trim that value and assign to property, otherwise assign it the string 'No property'.
So, try changing this line:
property = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
To this:
tmp = content.match(/of:[\n]([^\r\n]*)[\r\n]/);
Thanks Kevin, I think I must have changed it while debugging.
The problem was with my regexp in the end. After a bit of trial and error the following worked:
tmp = content.match(/of:[\r\n]+([^\r\n]+)/);
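The two fixes above can be sketched together in plain JavaScript. The sample assumes that the plain body uses \r\n line endings, which would explain why a pattern requiring \n directly after 'of:' found nothing; the thread does not confirm this. The function name extractProperty is hypothetical.

```javascript
// Return the line that follows 'of:', or a fallback when nothing matches.
function extractProperty(content) {
  // match() returns an array; index 1 holds the first captured group.
  const tmp = content.match(/of:[\r\n]+([^\r\n]+)/);
  return (tmp && tmp[1]) ? tmp[1].trim() : 'No property';
}

// Sample body modelled on the email in the question, with assumed CRLF endings.
const content = "NOPS for the purchase of:\r\n" +
                "123 Any Street, Anytown, AN1 1AN\r\n" +
                "DATE: 05/05/2017\r\n";

// The original pattern requires '\n' immediately after 'of:' and fails on '\r\n'.
const strict = content.match(/of:[\n]([^\r\n]*)[\r\n]/);

console.log(strict);
console.log(extractProperty(content));
```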

How to search for a specific word in nodejs and return the column number

I am trying to find out which column contains "source ip" in an Excel file. I want the code to search the whole file rather than having me specify the location manually.
How can I do this? I want to know whether it is possible and whether any modules are available for this.
My Node.js code:
var Regex = require("regex");
var regex = new Regex(/(S|s)(O|o)(U|u)(R|r)(C|c)(E|e)( )*(I|i)(P|p)/);
var parseXlsx = require('excel');
parseXlsx('ISFWREQ-373_update.xlsx', function(err, data) {
console.log(regex.test(data[5][0]));
});
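A minimal sketch of one possible approach, assuming the parser hands back the sheet as an array of rows, each an array of cell values (as data[5][0] above suggests). It scans every cell with a built-in RegExp and returns the first position that matches. The function name findCell and the sample rows are hypothetical.

```javascript
// Scan a 2D array of cells and return the zero-based position of the
// first cell matching the pattern, or null when nothing matches.
function findCell(rows, pattern) {
  for (let r = 0; r < rows.length; r++) {
    for (let c = 0; c < rows[r].length; c++) {
      if (pattern.test(String(rows[r][c]))) {
        return { row: r, column: c };
      }
    }
  }
  return null;
}

// Hypothetical sample standing in for the parsed spreadsheet.
const data = [
  ['Request', 'ISFWREQ'],
  ['Name', 'Source IP', 'Destination IP'],
  ['web', '10.0.0.1', '10.0.0.2']
];

// The i flag replaces the (S|s)(O|o)... alternations; no g flag is used,
// so test() keeps no state between calls.
const hit = findCell(data, /source *ip/i);
console.log(hit);
```

Inside the parseXlsx callback, findCell(data, /source *ip/i) could be called on the parsed data in the same way.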