With the standard dataproc image 1.5 (Debian 10, Hadoop 2.10, Spark 2.4), a dataproc cluster cannot be created. Region is set to europe-west-2.
The stack-driver log says:
"Failed to initialize node <name of cluster>-m: Component hdfs failed to activate See output in: gs://.../dataproc-startup-script_output"
Scanning through the output (gs://.../dataproc-startup-script_output), I can see the hdfs activation has failed:
Aug 18 13:21:59 activate-component-hdfs[2799]: + exit_code=1
Aug 18 13:21:59 activate-component-hdfs[2799]: + [[ 1 -ne 0 ]]
Aug 18 13:21:59 activate-component-hdfs[2799]: + echo 1
Aug 18 13:21:59 activate-component-hdfs[2799]: + log_and_fail hdfs 'Component hdfs failed to activate' 1
Aug 18 13:21:59 activate-component-hdfs[2799]: + local component=hdfs
Aug 18 13:21:59 activate-component-hdfs[2799]: + local 'message=Component hdfs failed to activate'
Aug 18 13:21:59 activate-component-hdfs[2799]: + local error_code=1
Aug 18 13:21:59 activate-component-hdfs[2799]: + local client_error_indicator=
Aug 18 13:21:59 activate-component-hdfs[2799]: + [[ 1 -eq 2 ]]
Aug 18 13:21:59 activate-component-hdfs[2799]: + echo 'StructuredError{hdfs, Component hdfs failed to activate}'
Aug 18 13:21:59 activate-component-hdfs[2799]: StructuredError{hdfs, Component hdfs failed to activate}
Aug 18 13:21:59 activate-component-hdfs[2799]: + exit 1
What am I missing?
EDIT
As #Dagang suggested, I ssh-ed into the master node and ran grep "activate-component-hdfs" /var/log/dataproc-startup-script.log. The output is here.
So the problem is there is an user name called "pete{" on which the hadoop fs -mkdir -p command failed. These kind of user names with special chars especially open parenthesis e,g,"()[]{}" will potentially fail the HDFS activation step during cluster creation.
So the easy solution is just to remove those accidentally created user.
Related
I am using gcp vm machine instance N1-standard 8V-30GB and N1-standard 4V-15GB
os-Debian
version - Debian GNU/Linux 10(buster)
this issue i am facing from last 1 month.
public access permission denied is one of message i am seeing while trying to access from cloud shell
I had run command chmod 777 <home directory> earlier.
I've tried to reproduce your steps and was able to solve this issue.
Please have a look at my steps below:
create VM instances:
gcloud compute instances create instance-1 --zone=europe-west3-a --machine-type=e2-medium --image=ubuntu-1804-bionic-v20200701 --image-project=ubuntu-os-cloud
gcloud compute instances create instance-2 --zone=europe-west3-a --machine-type=e2-medium --image=ubuntu-1804-bionic-v20200701 --image-project=ubuntu-os-cloud
change permissions recursively on my home directory at the VM instance instance-1:
instance-1:~$ chmod -R 777 ~
instance-1:~$ ls -la
...
drwxrwxrwx 2 username username 4096 Jul 15 07:50 .ssh
create snapshot of the VM instance instance-1 boot disk:
gcloud compute disks snapshot instance-1 --snapshot-names instance-1-snapshot --zone=europe-west3-a
create a new disk with the snapshot:
gcloud compute disks create instance-1-snapshot-disk --zone=europe-west3-a --source-snapshot=instance-1-snapshot
attach created disk instance-1-snapshot-disk to the VM instance instance-2:
instance-2:~$ ls -l /dev/ | grep sd
brw-rw---- 1 root disk 8, 0 Jul 15 07:39 sda
brw-rw---- 1 root disk 8, 1 Jul 15 07:39 sda1
brw-rw---- 1 root disk 8, 14 Jul 15 07:39 sda14
brw-rw---- 1 root disk 8, 15 Jul 15 07:39 sda15
instance-2:~$ mount | grep sda
/dev/sda1 on / type ext4 (rw,relatime)
/dev/sda15 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
then
gcloud compute instances attach-disk instance-2 --disk=instance-1-snapshot-disk --zone=europe-west3-a
after that
instance-2:~$ ls -l /dev/ | grep sd
brw-rw---- 1 root disk 8, 0 Jul 15 07:39 sda
brw-rw---- 1 root disk 8, 1 Jul 15 07:39 sda1
brw-rw---- 1 root disk 8, 14 Jul 15 07:39 sda14
brw-rw---- 1 root disk 8, 15 Jul 15 07:39 sda15
brw-rw---- 1 root disk 8, 16 Jul 15 08:04 sdb
brw-rw---- 1 root disk 8, 17 Jul 15 08:04 sdb1
brw-rw---- 1 root disk 8, 30 Jul 15 08:04 sdb14
brw-rw---- 1 root disk 8, 31 Jul 15 08:04 sdb15
instance-2:~$ sudo mkdir /mnt/instance-1-snapshot-disk
instance-2:~$ sudo mount /dev/sdb1 /mnt/instance-1-snapshot-disk
instance-2:~$ ls -la /mnt/instance-1-snapshot-disk
total 104
drwxr-xr-x 23 root root 4096 Jul 15 07:56 .
drwxr-xr-x 3 root root 4096 Jul 15 08:05 ..
drwxr-xr-x 2 root root 4096 Jul 1 19:14 bin
drwxr-xr-x 4 root root 4096 Jul 1 19:19 boot
drwxr-xr-x 4 root root 4096 Jul 1 19:11 dev
drwxr-xr-x 93 root root 4096 Jul 15 07:55 etc
drwxr-xr-x 4 root root 4096 Jul 15 07:50 home
lrwxrwxrwx 1 root root 30 Jul 1 19:18 initrd.img -> boot/initrd.img-5.3.0-1030-gcp
lrwxrwxrwx 1 root root 30 Jul 1 19:18 initrd.img.old -> boot/initrd.img-5.3.0-1030-gcp
drwxr-xr-x 22 root root 4096 Jul 1 19:17 lib
drwxr-xr-x 2 root root 4096 Jul 1 19:01 lib64
drwx------ 2 root root 16384 Jul 1 19:13 lost+found
drwxr-xr-x 2 root root 4096 Jul 1 19:01 media
drwxr-xr-x 2 root root 4096 Jul 1 19:01 mnt
drwxr-xr-x 2 root root 4096 Jul 1 19:01 opt
drwxr-xr-x 2 root root 4096 Apr 24 2018 proc
drwx------ 3 root root 4096 Jul 15 07:36 root
drwxr-xr-x 4 root root 4096 Jul 1 19:19 run
drwxr-xr-x 2 root root 4096 Jul 1 19:17 sbin
drwxr-xr-x 6 root root 4096 Jul 15 07:36 snap
drwxr-xr-x 2 root root 4096 Jul 1 19:01 srv
drwxr-xr-x 2 root root 4096 Apr 24 2018 sys
drwxrwxrwt 7 root root 4096 Jul 15 07:56 tmp
drwxr-xr-x 10 root root 4096 Jul 1 19:01 usr
drwxr-xr-x 13 root root 4096 Jul 1 19:12 var
lrwxrwxrwx 1 root root 27 Jul 1 19:18 vmlinuz -> boot/vmlinuz-5.3.0-1030-gcp
lrwxrwxrwx 1 root root 27 Jul 1 19:18 vmlinuz.old -> boot/vmlinuz-5.3.0-1030-gcp
change permissions:
.ssh directory: 700 drwx------
public key (.pub file): 644 -rw-r--r--
private key (id_rsa): 600 -rw-------
lastly your home directory should not be writeable by the group or others: 755 drwxr-xr-x
instance-2:~$ chmod -R 755 /mnt/instance-1-snapshot-disk/home/username/
instance-2:~$ chmod -R 700 /mnt/instance-1-snapshot-disk/home/username/.ssh/
instance-2:~$ chmod 644 /mnt/instance-1-snapshot-disk/home/username/.ssh/authorized_keys
unmount the disk when you finish:
instance-2:~$ sudo umount /mnt/instance-1-snapshot-disk/
detach disk instance-1-snapshot-disk from the VM instance instance-2:
gcloud compute instances detach-disk instance-2 --disk=instance-1-snapshot-disk --zone=europe-west3-a
create a new instance from the repaired disk:
gcloud compute instances create instance-3 --zone=europe-west3-a --machine-type=e2-medium --disk=name=instance-1-snapshot-disk
check SSH connection to at the VM instance instance-1.
In addition, please have a look at the documentation Troubleshooting SSH section Inspect the VM instance without shutting it down to find more details.
From owner's account i tried to access instance-1 but owner is also not able to connect to the instance-1.
owner of project got this pop-up on ssh window
[1]: https://i.stack.imgur.com/y2fzC.jpg
I observe that in fresh new created instance if i add add some file like git clone repo, after that if i restart it then i am able to connect SSH again.
so here's my problem: I have big log files and want a script to grep certain periods of time and safe them to a file (sorted), basically
bash script.sh Jul 4 Sep 30
will return for example
Sep 30 user0 logged in
Sep 15 user1 logged in
Aug 6 user0 logged in
Aug 3 user1 logged in
Jul 28 user2 logged in
Jul 27 user2 logged in
Jul 4 user0 logged in
My first attempt was that every month and date gets his own variable like
bash script.sh Jul 4 Sep 3 0
so I can use $1 for start month (July), $2 for start date (4) and so on in grep like
for logs in logs*
do
grep -qEe "^\"$1\" [\"$2\"-9]\s" $messages >> result.txt
done
to get all logs from July 4 to 9 but I don't know how to get logs from the entire time period that aren't in the same month nor in a period like 1-9 or 10-19 and so on
Any help greatly appreciated!
EDIT:
As some people asked, here's how my log files look like (just much bigger and not sorted):
Sep 30 user0 logged in
Jul 27 user2 logged in
Aug 6 user0 logged in
Aug 31 user1 logged in
Jul 8 user2 logged in
Sep 5 user1 logged in
Jul 27 user2 logged in
Jul 14 user0 logged in
[...]
Here's my take:
#/bin/bash
year="$(date +"%Y")"
start="$(date -d"$1 $2, $year" +'%s')"
end="$(($(date -d"$3 $4, $year" +'%s')+86400))"
for log in logs*; do
while IFS= read -r line; do
d="$(date -d"$(cut -d' ' -f1,2 <<< "$line"), $year" +'%s')"
if (( $start <= $d && $d < $end )); then
echo "$s"
fi
done < "$log"
done
You run it like that: ./script.sh Jul 04 Sep 03. Since no year is included in the logs, it assumes that all dates (including the ones in the command line) are for the current year. It's probably not the most optimal solution but it works. It relies on date which it repeatedly calls to parse dates into a unix timestamp. unix timestamps are nice because they are just numbers and thus can be used in numeric comparisons.
$ range="Jul 4 Sep 30"
$ awk -v range="$range" '
BEGIN {
numMths = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",m)
for (i in m) {
mths[m[i]] = i
}
split(range,r)
beg = sprintf("%02d%02d", mths[r[1]], r[2])
end = sprintf("%02d%02d", mths[r[3]], r[4])
}
{ cur = sprintf("%02d%02d", mths[$1], $2) }
(cur >= beg) && (cur <= end) { vals[$1,$2] = $0 }
END {
for (mthNr=numMths; mthNr>0; mthNr--) {
for (dayNr=31; dayNr>0; dayNr--) {
date = m[mthNr] SUBSEP dayNr
if (date in vals) {
print vals[date]
}
}
}
}
' file
Sep 30 user0 logged in
Sep 5 user1 logged in
Aug 31 user1 logged in
Aug 6 user0 logged in
Jul 27 user2 logged in
Jul 14 user0 logged in
Jul 8 user2 logged in
Here are some sample input I obtain from doing ls -l :
-rwxr-xr-x 1 root root 1779 Jan 10 2014 zcmp
-rwxr-xr-x 1 root root 5766 Jan 10 2014 zdiff
-rwxr-xr-x 1 root root 142 Jan 10 2014 zegrep
-rwxr-xr-x 1 root root 142 Jan 10 2014 zfgrep
-rwxr-xr-x 1 root root 2133 Jan 10 2014 zforce
-rwxr-xr-x 1 root root 5940 Jan 10 2014 zgrep
lrwxrwxrwx 1 root root 8 Dec 5 2015 ypdomainname -> hostname
I would like to print out the last column and 5th column using ONLY sed like this:
zcmp 1779
zdiff 5766
zegrep 142
zfgrep 142
zforce 2133
zgrep 5940
ypdomainname -> hostname 8
I'm trying to find a regex to match but have not succeeded. And I'm not allowed to use awk or cut either.
Thank you in advance.
Try this;
ls -l | sed -r 's/^(\S+\s+){5}(\S+\s+){3}/\1/' | sed 's/^\(.*\) \(.*\)$/\2\ \1/g'
I would like to filter the output of the utility last based on a variable set of usernames.
This is sample output from last unfiltered,
reboot system boot server Wed Apr 6 13:15 - 14:24 (01:09)
user1 pts/0 server Wed Apr 6 13:08 - 13:15 (00:06)
reboot system boot system Wed Apr 6 13:08 - 13:15 (00:06)
user1 pts/0 server Wed Apr 6 13:06 - down (00:01)
reboot system boot system Wed Apr 6 13:06 - 13:07 (00:01)
user1 pts/0 server Wed Apr 6 12:59 - down (00:06)
What I would like to do is pipe the output of last to sed. Then, using sed I would print the first occurrence of each specified user name i.e. their last log entry in wtmp. The output should appear as so,
reboot system boot server Wed Apr 6 13:15 - 14:24 (01:09)
user1 pts/0 server Wed Apr 6 13:08 - 13:15 (00:06)
The sed expression that I particularly like is,
last|sed '/user1/{p;q;}'
Unfortunately this only gives me the ability to match the first occurrence of one username. Using this syntax is there a way I could specify a multiple of usernames? Thanks in advance!
awk is better fit here than sed due to awk's ability to use associative arrays:
last | awk '!seen[$1]++'
reboot system boot server Wed Apr 6 13:15 - 14:24 (01:09)
user1 pts/0 server Wed Apr 6 13:08 - 13:15 (00:06)
This seems so simple, yet I cannot find a single source that provides a working solution.
I have a project like this:
src
-main
-resources
-bpmns
-nnn.bpmn
and I want gradle to rename all files that end with bpmn to bpmn20.xml, leaving all files that already have bpmn20.xml extensions untouched.
This is my approach:
task renameBPMN(type: Copy) {
outputs.upToDateWhen { false } //to pevent gradle from not executing this task
from('src/main/resources/bpmn/')
into('src/main/resources/bpmn/')
rename ('/^.*\\.(bpmn)$/', '$1.bpmn20.xml')
}
According to some regex testers online, the regex should match nnn.bpmn, gradle however does nothing at all.
Any ideas anyone?
This is what he needs or actually this script would do it.
Here it'll change a file: filenamebpmn or filename.bpmn to filenamebpmn20.xml or filename.bpmn20.xml respectively. It'll ignore all other files whether it's a filenamebpmn.xml or filename.bpmn.xml or any other file.
$ cat build.gradle
task renameBPMN() << {
fileTree("bpmnfolder").matching { include "*bpmn" }.each { fileName ->
println "Will be changed to new name --- " + fileName
file("${fileName}").renameTo("${fileName}20.xml")
}
}
$ ls -l bpmnfolder
total 3
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:14 1bpmn
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 14:50 2bpmn
-rw-r--r-- 1 c400093 Domain Users 6 Aug 6 14:50 3bpmn.xml
$ /cygdrive/c/gradle-2.0/bin/gradle renameBPMN
:renameBPMN
Will be changed to new name --- C:\cygwin\home\c400093\giga\goga\giga\bpmnfolder\1bpmn
Will be changed to new name --- C:\cygwin\home\c400093\giga\goga\giga\bpmnfolder\2bpmn
BUILD SUCCESSFUL
Total time: 3.153 secs
$ ls -l bpmnfolder
total 3
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:14 1bpmn20.xml
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 14:50 2bpmn20.xml
-rw-r--r-- 1 c400093 Domain Users 6 Aug 6 14:50 3bpmn.xml
$
AND
If you want to include filenames which end with bpmn as extension i.e. filename.bpmn or filenamebpmn, then use the following code.
$ ls -l bpmnfolder
total 5
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 1bpmn
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 2bpmn
-rw-r--r-- 1 c400093 Domain Users 6 Aug 6 15:39 3bpmn.xml
-rw-r--r-- 1 c400093 Domain Users 8 Aug 6 15:39 4.bpmn
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 5.bpmn.xml
$ cat build.gradle
task renameBPMN() << {
fileTree("bpmnfolder").matching { include "*bpmn" include "*.bpmn" }.each { fileName ->
println "Will be changed to new name --- " + fileName
file("${fileName}").renameTo("${fileName}20.xml")
}
}
$ /cygdrive/c/gradle-2.0/bin/gradle renameBPMN
:renameBPMN
Will be changed to new name --- C:\cygwin\home\c400093\giga\goga\giga\bpmnfolder\1bpmn
Will be changed to new name --- C:\cygwin\home\c400093\giga\goga\giga\bpmnfolder\2bpmn
Will be changed to new name --- C:\cygwin\home\c400093\giga\goga\giga\bpmnfolder\4.bpmn
BUILD SUCCESSFUL
Total time: 4.271 secs
$ ls -l bpmnfolder
total 5
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 1bpmn20.xml
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 2bpmn20.xml
-rw-r--r-- 1 c400093 Domain Users 6 Aug 6 15:39 3bpmn.xml
-rw-r--r-- 1 c400093 Domain Users 8 Aug 6 15:39 4.bpmn20.xml
-rw-r--r-- 1 c400093 Domain Users 3 Aug 6 15:39 5.bpmn.xml
$
Modify your task to:
task rename(type: Copy) {
outputs.upToDateWhen { false } //to pevent gradle from not executing this task
from('src/main/resources/bpmn/')
into('src/main/resources/bpmn/')
exclude { resource -> !resource.file.name.endsWith('.bpmn')}
rename '(.*).bpmn', '$1.bpmn20.xml'
}
According to the gradle documentation, AbstractCopyTask::rename()'s regular expression does not need delimiters:
rename '(.*)_OEM_BLUE_(.*)', '$1$2'
Try replacing:
rename ('/^.*\\.(bpmn)$/', '$1.bpmn20.xml')
With:
rename ('^.*\\.(bpmn)$', '$1.bpmn20.xml')