Drupal 8 unable to get path location from custom module - drupal-8

I created a module in Drupal 8 that needs to load a CSV file from the module folder, but I haven't been able to make it work. I have already tried:
$directory = drupal_get_path('module', 'my_module');
$file = 'source.csv';
$path = $directory . '/' . $file;
kint($path);
// Open the CSV file
$handle = fopen($path, 'r');
if (!$handle) {
    // ...
}
But fopen() returns false, so it looks like this is not the correct way.

I found a way to get it working using the following code:
$file = 'source.csv';
$path = __DIR__ . '/' . $file;
// Open the CSV file
$handle = @fopen($path, 'r');
if (!$handle) {
    // ...
}
If there is a better way just let me know.

Basically:
$moduleDir = drupal_get_path('module', 'my_module');
is the right way.
So if your source.csv file is located under modules/MY_MODULE/files/sources.csv,
then you should be able to do something like the following in your my_module.module file or elsewhere:
$file = $moduleDir . DIRECTORY_SEPARATOR . 'files' . DIRECTORY_SEPARATOR . 'sources.csv';
if (file_exists($file)) {
    // do your stuff
}

Related

Generate CSV import file for AutoML Vision from an existing bucket

I already have a GCloud bucket divided by label as follows:
gs://my_bucket/dataset/label1/
gs://my_bucket/dataset/label2/
...
Each label folder has photos inside. I would like to generate the required CSV – as explained here – but I don't know how to do it programmatically, considering that I have hundreds of photos in each folder. The CSV file should look like this:
gs://my_bucket/dataset/label1/photo1.jpg,label1
gs://my_bucket/dataset/label1/photo12.jpg,label1
gs://my_bucket/dataset/label2/photo7.jpg,label2
...
You need to list all files inside the dataset folder with their complete path and then parse it to obtain the name of the folder containing the file, as in your case this is the label you want to use. This can be done in several different ways. I will include two examples from which you can base your code on:
gsutil has a command that lists bucket contents; you can then parse the output with a bash script:
# Create csv file and define bucket path
bucket_path="gs://buckbuckbuckbuck/dataset/"
filename="labels_csv_bash.csv"
touch $filename
IFS=$'\n' # Internal field separator variable has to be set to separate on new lines
# List of every .jpg file inside the buckets folder. ** searches for them recursively.
for i in `gsutil ls $bucket_path**.jpg`
do
    # Cut the address on the / delimiter and get the second item counting from the end.
    label=$(echo $i | rev | cut -d'/' -f2 | rev)
    echo "$i, $label" >> $filename
done
IFS=' ' # Reset to original value
gsutil cp $filename $bucket_path
It can also be done using the Google Cloud client libraries provided for different languages. Here is an example using Python:
# Imports the Google Cloud client library
import os
from google.cloud import storage
# Instantiates a client
storage_client = storage.Client()
# The name for the new bucket
bucket_name = 'my_bucket'
path_in_bucket = 'dataset'
blobs = storage_client.list_blobs(bucket_name, prefix=path_in_bucket)
# Reading blobs, parsing information and creating the csv file
filename = 'labels_csv_python.csv'
with open(filename, 'w+') as f:
    for blob in blobs:
        if '.jpg' in blob.name:
            bucket_path = 'gs://' + os.path.join(bucket_name, blob.name)
            label = blob.name.split('/')[-2]
            f.write(', '.join([bucket_path, label]))
            f.write("\n")
# Uploading csv file to the bucket
bucket = storage_client.get_bucket(bucket_name)
destination_blob_name = os.path.join(path_in_bucket, filename)
blob = bucket.blob(destination_blob_name)
blob.upload_from_filename(filename)
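If you'd rather not build the lines by hand, the standard csv module writes the same rows and handles delimiters and quoting for you. A minimal sketch, with made-up paths standing in for the (bucket_path, label) pairs the loop above collects:

```python
import csv

# Hypothetical rows, matching the (gs path, label) pairs built above.
rows = [('gs://my_bucket/dataset/label1/photo1.jpg', 'label1'),
        ('gs://my_bucket/dataset/label2/photo7.jpg', 'label2')]

with open('labels_csv_python.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

Note that csv.writer emits a plain comma with no trailing space, which matches the `path,label` format shown in the question.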
For those, like me, who were looking for a way to create the .csv file for batch processing in Google AutoML, but don't need the label column:
# Create csv file and define bucket path
bucket_path="gs://YOUR_BUCKET/FOLDER"
filename="THE_FILENAME_YOU_WANT.csv"
touch $filename
IFS=$'\n' # Internal field separator variable has to be set to separate on new lines
# List of every [YOUR_EXTENSION] file inside the bucket's folder - change in the next line, i.e. **.png becomes **.your_extension. ** searches for them recursively.
for i in `gsutil ls $bucket_path**.png`
do
    echo "$i" >> $filename
done
IFS=' ' # Reset to original value
gsutil cp $filename $bucket_path

Bash & Perl script to convert relative paths to absolute paths

I have a top-level dir path and I want to convert all the relative paths to absolute paths in all files inside this directory, recursively.
e.g. I have this dir structure:
$ tree
.
|-- DIR
| |-- inner_level.ext1
| `-- inner_level.ext2
|-- top_level.ext1
`-- top_level.ext2
Content of top_level.ext1:
../../a/b/c/filename_1.txt
../../a/d/e/filename_2.txt
Assume the top level dir path is /this/is/the/abs/dir/path/
Want to convert the content of top_level.ext1 to:
/this/is/the/abs/a/b/c/filename_1.txt
/this/is/the/abs/a/d/e/filename_2.txt
Content of top_level.ext2:
cc_include+=-I../../util1/src/module1/moduleController -I../../util/src/module1/module2Controller;
cc_include+=-I../../util2/src/module2/moduleUtility;
Want to convert the content of top_level.ext2 to:
cc_include+=-I/this/is/the/abs/util1/src/module1/moduleController -I/this/is/the/abs/util/src/module1/module2Controller;
cc_include+=-I/this/is/the/abs/util2/src/module2/moduleUtility;
Also, want to apply this same conversion over the files inside DIR.
e.g.
Content of DIR/inner_level.ext1:
../../../a/b/c/filename_1.txt
../../../a/d/e/filename_2.txt
Want to convert the content of DIR/inner_level.ext1 to:
/this/is/the/abs/a/b/c/filename_1.txt
/this/is/the/abs/a/d/e/filename_2.txt
Same for the DIR/inner_level.ext2 also.
I have written these two scripts.
The conversion of top_level.ext1 works successfully.
file_manager.sh:
#!/usr/bin/bash
file='resolve_path.pl'
basedir='/this/is/the/abs/dir/path'
run_perl(){
    echo -e "\n File getting modified: $1"
    cp $1 tmp.in
    perl $file
    mv tmp.out $1
    rm tmp.in
}
find $basedir -type f | while read inputfile
do
    run_perl $inputfile
done
resolve_path.pl:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use 5.010;
use Switch;
#******************************************************
# Set-up Directory And Input/Output File Names
#******************************************************
our $in_file = glob('tmp.in');
my $out_file1 = 'tmp.out';
print "Input file: $in_file\n";
#************************************
# Local and Global Variables
#*************************************
my $current_path = "/this/is/the/abs/dir/path";
my $temp_path = $current_path;
#************************************
# Open Read and Write File
#************************************
open(READ, $in_file) || die "cannot open $in_file";
open(WRITE, ">$out_file1") || die "cannot open $out_file1";
#******************************************************
# Read The Input [*.out] File Line By Line
#******************************************************
while (<READ>) {
    if(/^(\.\.\/){1,}(\w+\/)*(\w+).(\w+)/){
        my $file_name = $3;
        my $file_ext = $4;
        my @count = ($_ =~ /\.\.\//g);
        my $cnt = @count;
        my @prev_dir = ($_ =~ /\w+\//g);
        my $prev_dir_cnt = @prev_dir;
        my $file_prev_dir = join('', @prev_dir);
        $temp_path = $current_path;
        for(my $i=0; $i<$cnt; $i++){
            if($temp_path =~ m/(\/.*)\/\w+/){
                $temp_path = $1;
            }
        }
        print WRITE "$temp_path"."\/"."$file_prev_dir"."$file_name"."\."."$file_ext"."\n";
    } else {
        print WRITE "$_";
    }
}
Issues I am facing:
1. No conversion is applied over top_level.ext2 & DIR/inner_level.ext2, as my Perl script does not parse the ../'s properly when something else (i.e. cc_include+=-I) comes at the beginning of the line.
2. The conversion from relative path to absolute path is not working properly for DIR/inner_level.ext1, and a wrong path gets appended.
It would be helpful if someone could suggest the changes needed in my scripts to solve these two issues.
Why the two scripts? That's inefficient.
Perl is perfectly capable of retrieving the list of files, and it has modules which simplify that process as well as modules to parse and alter the paths.
File::Find - Traverse a directory tree.
File::Find::Rule - Alternative interface to File::Find
File::Basename - Parse file paths into directory, filename and suffix.
File::Spec - portably perform operations on file names

use shell command tesseract in perl script to print a text output

Hi, I have a script I want to write: first I take the image from the HTML, and then I want to use tesseract to get the output text from it.
I can't really figure out how to do it.
Here is the code:
#!/usr/bin/perl -X
##########
$user = ''; # Enter your username here
$pass = ''; # Enter your password here
###########
# Server settings (no need to modify)
$home = "http://37.48.90.31";
$url = "$home/c/test.cgi?u=$user&p=$pass";
# Get HTML code
$html = `GET "$url"`;
#### Add code here:
# Grab img from HTML code
if ($html =~ /<img[^>]* src=\"([^\"]*)\"[^>]*/) {
$takeImg = $1;
}
@dirs = split m!/!, $takeImg;
$img = $dirs[2];
#########
die "<img> not found\n" if (!$img);
# Download img to server (save as: ocr_me.img)
print "GET '$img' > ocr_me.img\n";
system "GET '$img' > ocr_me.img";
#### Add code here:
# Run OCR (using shell command tesseract) on img and save text as ocr_result.txt
system ("tesseract", "tesseract ocr_me.img ocr_result");
###########
die "ocr_result.txt not found\n" if (!-e "ocr_result.txt");
# Check OCR results:
$txt = `cat ocr_result.txt`;
Did I take the image correctly from the HTML, or do I need another regex?
And how do I display 'ocr_result.txt'?
Thanks for all who will help!

How to copy files using subprocess in python?

I have a list of files:
file_list=['test1.txt','test2.txt','test3.txt']
I want to find and copy these files to a destination folder. I have the following code:
for files in file_list:
    subprocess.call(["find", "test_folder/",
                     "-iname", files,
                     "-exec", "cp", "{}",
                     "dest_folder/",
                     "\;"])
But I keep getting the error:
find: missing argument to `-exec
The shell command looks something like this:
$ find test_folder/ -iname 'test1.txt' -exec cp {} dest_folder/ \;
Is there anything I am doing wrong?
You don't need to escape the semicolon. Here's what is working for me:
import shlex
import subprocess
file_list = ['test1.txt','test2.txt','test3.txt']
cmd = 'find test_folder -iname %s -exec cp {} dest_folder ;'
for files in file_list:
    subprocess.Popen(shlex.split(cmd % files))
Also see:
Python equivalent to find -exec
find command with exec in python subprocess gives error
Hope that helps.
You don't need to escape the arguments; the subprocess module calls the find command directly, without the shell. Replace "\;" with ";" and your command will work as is.
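To illustrate, here is the original loop with the plain ";" in place. The directory layout is fabricated for the demo (a throwaway temp dir) so the command has something to act on:

```python
import os
import subprocess
import tempfile

file_list = ['test1.txt', 'test2.txt', 'test3.txt']

# Throwaway demo layout so find/cp have files to work with.
base = tempfile.mkdtemp()
src = os.path.join(base, 'test_folder')
dest = os.path.join(base, 'dest_folder')
os.makedirs(src)
os.makedirs(dest)
for name in file_list:
    open(os.path.join(src, name), 'w').close()

for files in file_list:
    subprocess.call(["find", src,
                     "-iname", files,
                     "-exec", "cp", "{}", dest,
                     ";"])  # plain ";": no shell is involved, so nothing to escape
```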
You could combine the search into a single command:
from subprocess import call
expr = [a for file in file_list for a in ['-iname', file, '-o']]
expr.pop() # remove last `-o`
rc = call(["find", "test_folder/", "("] + expr + [")", "-exec",
"cp", "-t", "dest_folder/", "--", "{}", "+"])
You could also combine expr list into a single -iregex argument if desired.
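A rough sketch of that single-regex variant, assuming GNU find and cp (for -regextype and cp -t); the demo directories are made up so the command can actually run:

```python
import os
import re
import tempfile
from subprocess import call

file_list = ['test1.txt', 'test2.txt', 'test3.txt']

# Throwaway demo layout with a subdirectory, to show the recursive match.
base = tempfile.mkdtemp()
src = os.path.join(base, 'test_folder')
dest = os.path.join(base, 'dest_folder')
os.makedirs(os.path.join(src, 'sub'))
os.makedirs(dest)
for name in file_list:
    open(os.path.join(src, 'sub', name), 'w').close()

# One case-insensitive regex matching any of the names;
# -iregex matches against the whole path, hence the leading .*/
pattern = '.*/(%s)' % '|'.join(map(re.escape, file_list))
rc = call(["find", src, "-regextype", "posix-extended", "-iregex", pattern,
           "-exec", "cp", "-t", dest, "--", "{}", "+"])
```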
You don't need the find command; you could implement the copying in pure Python using os.walk, re.match, and shutil.copy:
import os
import re
import shutil
found = re.compile('(?i)^(?:%s)$' % '|'.join(map(re.escape, file_list))).match
for root, dirs, files in os.walk('test_folder/'):
    for filename in files:
        if found(filename):
            shutil.copy(os.path.join(root, filename), "dest_folder/")

How can I parse an excel file within a zip file?

I want to be able to parse an Excel file within a zip file. I've been able to parse the zip file to return the files inside that compressed file, and if a regex match turns up an Excel file, I'd like to parse it.
Here's the script that parses the zip file for the name of the excel spreadsheet...
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip;
use Spreadsheet::ParseExcel;
my $zipFile = Archive::Zip->new();
my $xl_file = "";
#open zipfile
$zipFile->read( '/home/user/Desktop/test.zip' ) == 0 || die "cannot read zip file\n";
#find all files within zipfile
my @files = $zipFile->memberNames('/home/user/Desktop/test.zip');
foreach my $file (sort @files) {
    # find all excel files
    if ($file =~ m/(.*xls)/) {
        $xl_file = $1;
        print "excel file found.\n";
    }
}
And this is the script that parses for the value in cells.
#!/usr/bin/perl
use strict;
use warnings;
my $filename = "/home/user/worksheet.xls";
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse("$filename");
if ( !defined $workbook ) {
    die $parser->error(), ".\n";
}
open(FILE, '>', "parse.txt")||die "cannot open parse.txt!\n";
for my $worksheet ( $workbook->worksheets() ) {
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();
    my $s = $worksheet->get_cell(2,2);
    my $p = $worksheet->get_cell(2,3);
    print FILE $s->value()."\n";
    print FILE $p->value()."\n";
}
close FILE;
How do I integrate these together?
According to the documentation of Archive::Zip, it's possible to get the contents of a compressed file member as a string:
$xls_content = $zipFile->contents($file);
And according to the documentation of Spreadsheet::ParseExcel, it's possible to parse a string containing the contents of an Excel file by passing the string as a reference:
my $workbook = $parser->parse(\$xls_content);
So you should be able to combine both together.
Another possibility is to extract the zip file member into a temporary file.
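For comparison, the same read-a-member-into-memory idea in Python's zipfile module. This is only an illustrative sketch; the zip and its member name are fabricated for the demo:

```python
import io
import zipfile

# Build a small zip in memory to stand in for test.zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('worksheet.xls', b'fake spreadsheet bytes')

# Read the member's contents without extracting to disk,
# the same idea as Archive::Zip's contents() call.
with zipfile.ZipFile(buf) as zf:
    data = zf.read('worksheet.xls')
```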