I'm new to Google Scripts and am looking for a good place to start and educate myself on being able to write a script that accomplishes the following:
It scans a number of different google docs and does a find and replace, essentially a mail merge, as instructed by the code. So basically each doc would have a [REPLACE TEXT HERE] already in it and I'd tell the script that I'd like each [REPLACE TEXT HERE] changed to [WORD A] for a number of different documents.
I understand this may be a basic request, so if there's a place you can point me towards that walks me through Google Apps Script basics and covers this, that'd be great; my initial research didn't turn up anything that homed in on this specific need.
Thanks for your help everyone!
Here's how I did it. (I learned how to use the API from the documentation.)
function myFunction() {
  var files = DriveApp.getFiles(); // Note: this gets *every* file in your Google Drive
  while (files.hasNext()) {
    var file = files.next();
    Logger.log(file.getName());
    // Skip anything that isn't a Google Doc, or openById() will throw
    if (file.getMimeType() !== MimeType.GOOGLE_DOCS) continue;
    var doc = DocumentApp.openById(file.getId());
    // replaceText lives on the document body, not on Document itself
    doc.getBody().replaceText("My search string or regex", "My replacement string");
    doc.saveAndClose();
  }
  Logger.log("Done");
}
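Stripped of the Drive plumbing, the mail-merge part is just a regex replace over each document's text. A minimal plain-JavaScript sketch of that idea (the document bodies and placeholder here are made up for illustration):

```javascript
// Simulated document bodies; in Apps Script these would come from Drive
const docs = [
  "Dear [REPLACE TEXT HERE], your order has shipped.",
  "Invoice for [REPLACE TEXT HERE] is attached.",
];

// Escape the placeholder so its brackets are treated literally in a regex
const placeholder = "[REPLACE TEXT HERE]";
const pattern = new RegExp(
  placeholder.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"),
  "g"
);

// Replace every occurrence in every document
const merged = docs.map((body) => body.replace(pattern, "WORD A"));
console.log(merged);
```

The escaping step matters because `[` and `]` are regex metacharacters; without it the pattern would be interpreted as a character class.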
I have a script to search for files in Google Drive with Apps Script based on a keyword.
e.g. I want to find files that contain "FX.ABCDEF".
I'm using fullText contains {text}.
Here's the code:
var searchFiles = DriveApp.searchFiles('fullText contains "FX.ABCDEF" ')
On the other hand, many keywords are similar to FX.ABCDEF; the characters after FX. are dynamic, so I don't want to define keywords like that over and over again. In cases like this, a simplification using a regex is needed.
So we can write :
/FX\.[A-Z]+/g
I'm still confused about implementing a regex in searchFiles.
Searching Drive file contents with a regex is not currently supported. Here is the list of the available search syntax for Drive file contents.
If this feature is really important for your application consider filing a Feature Request here.
Get all the files, put their names in an array, and do a regex search over that array:
function myFunction() {
  var files = DriveApp.getFiles();
  var file_names = [];
  while (files.hasNext()) file_names.push(files.next().getName());
  // No /g flag: a global regex keeps lastIndex state between test() calls
  // and would silently skip matches inside filter()
  var reg = /FX\.[A-Z]+/;
  var found_names = file_names.filter(n => reg.test(n));
  Logger.log(found_names);
}
This way you will get an array with the wanted names. You can take a name from this array and get the file by its name. But keep in mind: there can be many different files with the same name, so you have to handle that collision somehow.
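One subtle pitfall when filtering with a regex in JavaScript: a /g-flagged regex keeps lastIndex state between test() calls, so inside filter() it silently skips matches. A quick plain-JavaScript demonstration (the sample names are made up):

```javascript
const names = ["FX.ABCDEF", "FX.XYZ", "report.txt", "FX.QWERTY"];

// Stateful: with /g, test() resumes from lastIndex, so matches get skipped
const statefulRe = /FX\.[A-Z]+/g;
const buggy = names.filter((n) => statefulRe.test(n));

// Stateless: without /g, every call starts from the beginning of the string
const statelessRe = /FX\.[A-Z]+/;
const correct = names.filter((n) => statelessRe.test(n));

console.log(buggy);   // "FX.XYZ" is missing
console.log(correct); // all three FX names
```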
I'm new to creating Sublime Text 3 Plugins and I'm trying to understand the Plugin API that's offered. I want to be able to "grab" text I highlight with my mouse and move it somewhere else. Is there a way I can do this using the Sublime Text Plugin API?
So far, all I've done is be able to create a whole region:
allcontent = sublime.Region(0, self.view.size())
I've tried to grab all of the text in the region and put it into a log file:
logfile = open("logfile.txt", "w")
logfile.write(allcontent)
But without success, of course, as the log file is blank after it runs.
I've searched Google and there is not a lot of documentation beyond the unofficial docs, in which I can't find a way to grab the text. Nor are there many tutorials on this.
Any help is greatly appreciated!
A Region just represents a region of text (i.e. from position 0 to position 10), and isn't tied to any specific view.
To get the underlying text from the view's buffer, you need to call the view.substr method with the region as a parameter.
import os
logpath = os.path.join(sublime.cache_path(), 'logfile.txt')
allcontent = self.view.substr(sublime.Region(0, self.view.size()))
with open(logpath, 'w') as logfile:
logfile.write(allcontent)
print('written buffer contents to', logpath)
To get the region represented by the first selection, you can use self.view.sel()[0] in place of sublime.Region(0, self.view.size()).
I've seen no clear documentation on using CSV files. What is the intended purpose: to read in variables?
For example, I'd like to run this many times with different vars. Do CSV files let me do this? The problem is that I can't see what my actual requests are in the collection window; I can only see the request with the var name, right?
localhost/api/{{var}}
Yes, you can do this with a CSV. If your CSV was:
var,name
1,name
2,name2
In the Collection Runner, if you iterate twice with localhost/api/{{var}}, the URLs would be the following:
localhost/api/1
localhost/api/2
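The substitution the Collection Runner performs for each row can be sketched in plain JavaScript (the CSV parsing here is deliberately naive, and the template is the one from the question):

```javascript
const csv = "var,name\n1,name\n2,name2";

// Naive CSV parse: fine for this sample, not for quoted or escaped fields
const [headerLine, ...rowLines] = csv.split("\n");
const headers = headerLine.split(",");
const rows = rowLines.map((line) => {
  const cells = line.split(",");
  return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
});

// One iteration per row: replace each {{name}} with that row's value
const template = "localhost/api/{{var}}";
const urls = rows.map((row) =>
  template.replace(/\{\{(\w+)\}\}/g, (_, key) => row[key])
);
console.log(urls);
```

Each CSV row drives one iteration, and any column name can be referenced as a {{placeholder}} in the request.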
Let me know if I haven't explained this well and I can add a little more detail a bit later.
Could you post a Snippet that does find & replace across all files within a folder please?
I did find something similar,
Google Scripts - Find and Replace Function that searches multiple documents
but that one searches every file in Google Drive, not just within a folder. Since I'm only learning Google Apps Script by example for now, I can't make that tiny step myself.
EDIT: Further on the replacing part.
DocumentApp.openById returns a Document, right?
However, I can't find the replaceText method in the Document type doc:
https://developers.google.com/apps-script/reference/document/document
Fine, I then dig deeper, and found another example,
Google Apps Script document not accessible by replaceText(), which indicates that the replaceText method should be invoked from var body = doc.getActiveSection(). However, I can't find the getActiveSection method in the Document type doc either.
Please help. thx
The first thing is to get hold of your folder; this can be done in a few ways.
If you are writing a script for a limited audience and are working with only one particular folder, you can use DriveApp.getFolderById(id). The id can be found in the URL of the Drive web app when you navigate to the folder, as such:
https://drive.google.com/a/yourcompany.com/?tab=mo#folders/0B5eEwPQVn6GOaUt6Vm1GVjZmSTQ
Once you have that id, you can use the same code as in the answer you referenced and iterate through the file iterator for that particular folder, as such:
function myFunction() {
  var files = DriveApp.getFolderById("0B5eEwPQVn6GOaUt6Vm1GVjZmSTQ").getFiles();
  while (files.hasNext()) {
    var file = files.next();
    Logger.log(file.getName());
    var doc = DocumentApp.openById(file.getId());
    // replaceText lives on the document body, not on Document itself
    doc.getBody().replaceText("My search string or regex", "My replacement string");
    doc.saveAndClose();
  }
  Logger.log("Done");
}
An alternative way of getting the folder, if you only know its name, is to use DriveApp.getFoldersByName(name), which returns a folder iterator. If you know you have only one folder with that name, you simply need to get the first and only element in the iterator, as such:
function myFunction() {
  var folders = DriveApp.getFoldersByName("myfoldername");
  var myFolder = null;
  if (folders.hasNext()) myFolder = folders.next();
  if (myFolder !== null) {
    // do stuff
  } else {
    // exit gracefully
  }
}
Further, if you have multiple folders with the same name, you would have to iterate through them in a while loop (similar to the file iterator in the code you linked) and look for a marker that proves a given folder is the one you want (e.g. it could contain an empty file with a particular name).
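That marker-file disambiguation can be sketched in plain JavaScript, with stub objects standing in for DriveApp's folder and file iterators (all names here are made up for illustration):

```javascript
// Stub folders: each has a name and a list of file names
const folders = [
  { name: "myfoldername", files: ["notes.txt"] },
  { name: "myfoldername", files: [".merge-target", "doc1", "doc2"] },
];

// Return the first candidate folder that contains the marker file
function findFolderByMarker(candidates, marker) {
  for (const folder of candidates) {
    if (folder.files.includes(marker)) return folder;
  }
  return null; // no folder carried the marker
}

const target = findFolderByMarker(folders, ".merge-target");
console.log(target ? target.files : "not found");
```

In real Apps Script the inner check would itself be an iterator loop (getFilesByName(marker).hasNext()), but the selection logic is the same.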
OK, I asked this already, but I guess I didn't ask it in the way Stack Overflow expects. Hopefully I will have more luck this time and get an answer.
I am trying to run nutch to crawl this site: http://www.tigerdirect.com/
I want it to crawl that site and all sublinks.
The problem is that it's not working. In my regex file I tried a couple of things, but none of them worked:
+^http://([a-z0-9]*\.)*tigerdirect.com/
+^http://tigerdirect.com/([a-z0-9]*\.)*
my urls.txt is:
http://tigerdirect.com
Basically what I am trying to accomplish is to crawl all the product pages on their website so I can create a search engine (I am using solr) of electronic products. Eventually I want to crawl bestbuy.com, newegg.com and other sites as well.
BTW, I followed the tutorial from here: http://wiki.apache.org/nutch/NutchTutorial and I am using the script mentioned in session 3.3 (after fixing a bug it had).
I have a background in java and android and bash so this is a little new to me. I used to do regex in perl 5 years ago, but that is all forgotten.
Thanks!
According to your comments, I see that you have crawled something before, and this is why your Nutch starts crawling Wikipedia.
When you crawl something with Nutch, it records some metadata in a table (if you use HBase, it is a table named webpage). When you finish a crawl and start a new one, that table is scanned, and if a record's metadata says "this record can be fetched again because its next fetch time has passed", Nutch fetches those URLs as well as your new ones.
So if you want only http://www.tigerdirect.com/ crawled on your system, you have to clean up that table first. If you use HBase, start the shell:
./bin/hbase shell
and disable table:
disable 'webpage'
and finally drop it:
drop 'webpage'
You could also truncate that table, but I simply removed it.
Next, put this into your seed.txt:
http://www.tigerdirect.com/
Then open regex-urlfilter.txt, which is located at:
nutch/runtime/local/conf
and add this line to it:
+^http://([a-z0-9]*\.)*tigerdirect\.com/
Put it in place of the default +. rule, which accepts every URL. This pattern also allows the subdomains of tigerdirect.com (including www); whether you want that is up to you.
After that you can send the crawl to Solr to index it and search on it. I have tried it and it works; however, you may get some errors on the Nutch side, but that is another topic.
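You can sanity-check a candidate url-filter pattern outside Nutch before a long crawl. Here is a quick plain-JavaScript check of a pattern that allows tigerdirect.com and its subdomains (Nutch's regex-urlfilter uses Java regexes, but this subset behaves the same in both):

```javascript
// The filter body without Nutch's leading "+": host may carry any subdomains
const filter = /^http:\/\/([a-z0-9]*\.)*tigerdirect\.com\//;

const urls = [
  "http://www.tigerdirect.com/",
  "http://tigerdirect.com/products/",
  "http://example.com/tigerdirect.com/",
];

// Only URLs whose host is tigerdirect.com or a subdomain of it pass
const allowed = urls.filter((u) => filter.test(u));
console.log(allowed);
```

Note the escaped dots (\.); with bare dots, a host like tigerdirectXcom could slip through.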
You've got a / at the end of both of your regexes, but your URL doesn't have one.
http://tigerdirect.com/ will match; http://tigerdirect.com will not.
+^http://tigerdirect.com/([a-z0-9]*\.)*
Try moving that trailing slash inside the parens:
+^http://tigerdirect.com(/[a-z0-9]*\.)*
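The difference is easy to verify in plain JavaScript (the dots are escaped here, which the Nutch patterns should also do):

```javascript
const urls = ["http://tigerdirect.com", "http://tigerdirect.com/page1"];

// Slash outside the group: the bare URL (no trailing slash) fails to match
const slashOutside = /^http:\/\/tigerdirect\.com\/([a-z0-9]*\.)*/;

// Slash inside the group: the group can match zero times, so the bare URL passes
const slashInside = /^http:\/\/tigerdirect\.com(\/[a-z0-9]*\.)*/;

console.log(urls.map((u) => slashOutside.test(u))); // [ false, true ]
console.log(urls.map((u) => slashInside.test(u)));  // [ true, true ]
```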