How to remove lines which match lines in another document - centos7

So I am running CentOS 6.9, but could switch to CentOS 7 if needed.
If I have 2 txt files, one contains
Gold
Silver
Copper
Aluminum
Titanium
And my second contains
Gold
Silver
Titanium
How can I run a command to have a file that contains
Copper
Aluminum
Summarized: How can I remove lines in a file which match those of another file

This can easily be done with grep, using -v (invert the match) and -f (read the patterns from a file):
laforge@ncc1701d:~ $ cat file1.txt
Gold
Silver
Copper
Aluminum
Titanium
laforge@ncc1701d:~ $ cat file2.txt
Gold
Silver
Titanium
laforge@ncc1701d:~ $ grep -vf file2.txt file1.txt
Copper
Aluminum
laforge@ncc1701d:~ $ grep -f file2.txt file1.txt
Gold
Silver
Titanium
laforge@ncc1701d:~ $
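One caveat worth illustrating: grep -f treats every line of the pattern file as a regular expression and matches it anywhere in a line, so a pattern such as Gold would also remove a line like Goldfish. The sketch below assumes a grep that supports -F and -x (GNU and POSIX grep both do) and uses a throwaway directory; the extra Goldfish line is a made-up addition to the question's data, there only to show the difference.

```shell
#!/bin/sh
# Work in a throwaway directory; the file names mirror the question.
tmp=$(mktemp -d)
# "Goldfish" is a hypothetical extra line, not part of the original data.
printf '%s\n' Gold Silver Copper Aluminum Titanium Goldfish > "$tmp/file1.txt"
printf '%s\n' Gold Silver Titanium > "$tmp/file2.txt"

# -F: treat patterns as literal strings, not regular expressions
# -x: a pattern must match the whole line
# -v: print the lines that do NOT match
# -f: read the patterns from file2.txt
result=$(grep -Fxvf "$tmp/file2.txt" "$tmp/file1.txt")
printf '%s\n' "$result"

rm -rf "$tmp"
```

To keep the result, redirect the output to a new file (for example > file3.txt) rather than to file1.txt itself, since the shell would truncate file1.txt before grep reads it.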

Related

Access the word in the file with SSH command

I have a conf file and I use grep to read data from it, but the output is not very useful to me.
How can I get just the matching word?
I am using:
grep "HelloWorld" /etc/VDdatas.conf
Print:
export: HelloWorld
I want: (without "export: ")
HelloWorld
How can I do that?
Try the -o (or --only-matching) option if you're using GNU grep. It prints only the part of each line that matches.
grep -o "HelloWorld" /etc/VDdatas.conf

sed: can't read book/ : No such file or directory

I ran the following one line code on Red Hat Enterprise Linux Server release 6.6 (Santiago)
grep -rl 'room' book/ | xargs sed -i 's/room/equipment/g'
And I got the following message
sed: can't read book/
book/del_entry_ajax.php: No such file or directory
Actually, I can run
grep -rl 'room' book/del_entry_ajax.php | xargs sed -i 's/room/equipment/g'
successfully and then run the first command again, I got
sed: can't read book/
: No such file or directory
Why is that and how can I fix it?
The GNU guys really messed up when they gave grep an option to find files. There is a perfectly good UNIX tool to find files and it has a perfectly obvious name - find. Try this:
find book -type f -print0 | xargs -0 sed -i 's/room/equipment/g'
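Note that the find command above hands every file under book/ to sed, whether or not it contains the word. If only files that contain room should be touched, find can run grep as a filter itself. This is a minimal sketch, assuming GNU find and GNU sed (sed -i without a backup suffix); the temporary directory and the file names in it are made up for the demonstration.

```shell
#!/bin/sh
# Throwaway tree standing in for book/ (file names are hypothetical).
book=$(mktemp -d)
printf 'the room is free\n' > "$book/del entry.php"
printf 'nothing to change\n' > "$book/other.php"

# The first -exec acts as a test: grep -q exits 0 only when the file
# contains "room". Only those files reach the second -exec, which passes
# them to sed in batches ({} +). File names never travel through a pipe,
# so spaces or newlines in names cannot split them.
find "$book" -type f -exec grep -q 'room' {} \; -exec sed -i 's/room/equipment/g' {} +

changed=$(cat "$book/del entry.php")
untouched=$(cat "$book/other.php")
printf '%s\n%s\n' "$changed" "$untouched"

rm -rf "$book"
```

The trade-off is one grep process per file, which is slower on large trees than a single grep -r, but it avoids rewriting files that have nothing to replace.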

How to call clang-format over a cpp project folder?

Is there a way to call something like clang-format --style=Webkit for an entire cpp project folder, rather than running it separately for each file?
I am using clang-format.py and vim to do this, but I assume there is a way to apply this once.
Unfortunately, there is no way to apply clang-format recursively. *.cpp will only match files in the current directory, not subdirectories. Even **/* doesn't work.
Luckily, there is a solution: grab all the file names with the find command and pipe them in. For example, if you want to format all .h and .cpp files in the directory foo/bar/ recursively, you can do
find foo/bar/ -iname '*.h' -o -iname '*.cpp' | xargs clang-format -i
See here for additional discussion.
What about:
clang-format -i -style=WebKit *.cpp *.h
in the project folder. The -i option edits the files in place (by default the formatted output is written to stdout). Note that this only covers files directly in that folder, not in subdirectories.
First create a .clang-format file if it doesn't exist:
clang-format -style=WebKit -dump-config > .clang-format
Choose whichever predefined style you like, or edit the resulting .clang-format file.
clang-format configurator is helpful.
Then run:
find . -regex '.*\.\(cpp\|hpp\|cc\|cxx\)' -exec clang-format -style=file -i {} \;
Other file extensions than cpp, hpp, cc and cxx can be used in the regular expression, just make sure to separate them with \|.
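The find-based approaches above can also hand all files to a single clang-format call, in a way that survives spaces in path names. The following is only a sketch: it assumes a find and xargs that support -print0 and -0 (GNU and BSD do), it builds a throwaway tree with made-up file names, and it runs clang-format only if it is installed.

```shell
#!/bin/sh
# Throwaway tree with hypothetical file names, one of them in a path with a space.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub dir"
touch "$demo/src/a.cpp" "$demo/src/sub dir/b.h" "$demo/src/notes.txt"

# Parentheses group the -o alternatives so that -type f and -print0 apply
# to both of them; -print0 ends each name with a NUL byte instead of a newline.
list_sources() {
    find "$1" -type f \( -iname '*.h' -o -iname '*.cpp' \) -print0
}

# Count the NUL terminators to see how many files were selected.
found=$(( $(list_sources "$demo" | tr -cd '\0' | wc -c) ))
echo "selected $found files"

# xargs -0 splits on NUL only, so "sub dir/b.h" stays one argument.
if command -v clang-format >/dev/null 2>&1; then
    list_sources "$demo" | xargs -0 clang-format -i
fi

rm -rf "$demo"
```

Compared with -exec clang-format ... {} \;, this starts far fewer clang-format processes; find's own -exec ... {} + form achieves the same batching without xargs.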
I recently found a bash-script which does exactly what you need:
https://github.com/eklitzke/clang-format-all
This is a bash script that will run clang-format -i on your code.
Features:
Finds the right path to clang-format on Ubuntu/Debian, which encode the LLVM version in the clang-format filename
Fixes files recursively
Detects the most common file extensions used by C/C++ projects
On Windows, I used it successfully in Git Bash and WSL.
For the Windows users: If you have Powershell 3.0 support, you can do:
Get-ChildItem -Path . -Directory -Recurse |
foreach {
cd $_.FullName
&clang-format -i -style=WebKit *.cpp
}
Note1: Use pushd . and popd if you want to have the same current directory before and after the script
Note2: The script operates in the current working directory
Note3: This can probably be written in a single line if that was really important to you
When you use Windows (CMD) but don't want to use the PowerShell cannon to shoot this fly, try this:
for /r %t in (*.cpp *.h) do clang-format -i -style=WebKit "%t"
Don't forget to double the % signs (%%t instead of %t) if you use this in a cmd script.
The below script and process:
works in Linux
should work on MacOS
works in Windows inside Git For Windows terminal with clang-format downloaded and installed.
Here's how I do it:
I create a run_clang_format.sh script and place it in the root of my project directory, then I run it from anywhere. Here's what it looks like:
run_clang_format.sh
#!/bin/bash
THIS_PATH="$(realpath "$0")"
THIS_DIR="$(dirname "$THIS_PATH")"
# Find all files in THIS_DIR which end in .ino, .cpp, etc., as specified
# in the regular expression just below
FILE_LIST="$(find "$THIS_DIR" | grep -E ".*(\.ino|\.cpp|\.c|\.h|\.hpp|\.hh)$")"
echo -e "Files found to format = \n\"\"\"\n$FILE_LIST\n\"\"\""
# Format each file.
# - NB: do NOT put quotes around `$FILE_LIST` below or else the `clang-format` command will
# mistakenly see the entire blob of newline-separated file names as a SINGLE file name instead
# of as a new-line separated list of *many* file names!
clang-format --verbose -i --style=file $FILE_LIST
Using --style=file means that I must also have a custom .clang-format clang-format specifier file at this same level, which I do.
Now, make your newly-created run_clang_format.sh file executable:
chmod +x run_clang_format.sh
...and run it:
./run_clang_format.sh
Here's a sample run and output for me:
~/GS/dev/eRCaGuy_PPM_Writer$ ./run_clang_format.sh
Files found to format =
"""
/home/gabriel/GS/dev/eRCaGuy_PPM_Writer/examples/PPM_Writer_demo/PPM_Writer_demo.ino
/home/gabriel/GS/dev/eRCaGuy_PPM_Writer/examples/PPM_Writer_demo2/PPM_Writer_demo2.ino
/home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/eRCaGuy_PPM_Writer.h
/home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/eRCaGuy_PPM_Writer.cpp
/home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/timers/eRCaGuy_TimerCounterTimers.h
"""
Formatting /home/gabriel/GS/dev/eRCaGuy_PPM_Writer/examples/PPM_Writer_demo/PPM_Writer_demo.ino
Formatting /home/gabriel/GS/dev/eRCaGuy_PPM_Writer/examples/PPM_Writer_demo2/PPM_Writer_demo2.ino
Formatting /home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/eRCaGuy_PPM_Writer.h
Formatting /home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/eRCaGuy_PPM_Writer.cpp
Formatting /home/gabriel/GS/dev/eRCaGuy_PPM_Writer/src/timers/eRCaGuy_TimerCounterTimers.h
You can find my run_clang_format.sh file in my eRCaGuy_PPM_Writer repository, and in my eRCaGuy_CodeFormatter repository too. My .clang-format file is there too.
References:
My repository:
eRCaGuy_PPM_Writer repo
run_clang_format.sh file
My notes on how to use clang-format in my "git & Linux cmds, help, tips & tricks - Gabriel.txt" doc in my eRCaGuy_dotfiles repo (search the document for "clang-format").
Official clang-format documentation, setup, instructions, etc! https://clang.llvm.org/docs/ClangFormat.html
Download the clang-format auto-formatter/linter executable for Windows, or other installers/executables here: https://llvm.org/builds/
Clang-Format Style Options: https://clang.llvm.org/docs/ClangFormatStyleOptions.html
[my answer] How can I get the source directory of a Bash script from within the script itself?
Related:
[my answer] Indenting preprocessor directives with clang-format
See also:
[my answer] https://stackoverflow.com/questions/67678531/fixing-a-simple-c-code-without-the-coments/67678570#67678570
Here is a solution that searches recursively and pipes all files to clang-format as a file list in one command. It also excludes the "build" directory (I use CMake), but you can just omit the "grep" step to remove that.
shopt -s globstar extglob failglob && ls **/*.@(h|hpp|hxx|c|cpp|cxx) | grep -v build | tr '\n' ' ' | xargs clang-format -i
You can use this inside a Makefile. It uses git ls-files --exclude-standard to get the list of files, which means untracked files are automatically skipped. Because of -style=file, it assumes that you have a .clang-format file at your project root.
format:
ifeq ($(OS), Windows_NT)
pwsh -c '$$files=(git ls-files --exclude-standard); foreach ($$file in $$files) { if ((get-item $$file).Extension -in ".cpp", ".hpp", ".c", ".cc", ".cxx", ".hxx", ".ixx") { clang-format -i -style=file $$file } }'
else
git ls-files --exclude-standard | grep -E '\.(cpp|hpp|c|cc|cxx|hxx|ixx)$$' | xargs clang-format -i -style=file
endif
Run with make format
Notice that I escaped $ using $$ for make.
If you use go-task instead of make, you will need this:
format:
- |
{{if eq OS "windows"}}
powershell -c '$files=(git ls-files --exclude-standard); foreach ($file in $files) { if ((get-item $file).Extension -in ".cpp", ".hpp", ".c", ".cc", ".cxx", ".hxx", ".ixx") { clang-format -i -style=file $file } }'
{{else}}
git ls-files --exclude-standard | grep -E '\.(cpp|hpp|c|cc|cxx|hxx|ixx)$' | xargs clang-format -i -style=file
{{end}}
Run with task format
If you want to run the individual scripts, then use these
# powershell
$files=(git ls-files --exclude-standard); foreach ($file in $files) { if ((get-item $file).Extension -in ".cpp", ".hpp", ".c", ".cc", ".cxx", ".hxx", ".ixx") { clang-format -i -style=file $file } }
# bash
git ls-files --exclude-standard | grep -E '\.(cpp|hpp|c|cc|cxx|hxx|ixx)$' | xargs clang-format -i -style=file
I'm using the following command to format all objective-C files under the current folder recursively:
$ find . -name "*.m" -o -name "*.h" | sed 's| |\\ |g' | xargs clang-format -i
I've defined the following alias in my .bash_profile to make things easier:
# Format objC files (*.h and *.m) under the current folder, recursively
alias clang-format-all="find . -name \"*.m\" -o -name \"*.h\" | sed 's| |\\ |g' | xargs clang-format -i"
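In modern bash (4.0 or later, with the globstar option enabled) you can recursively crawl the file tree:
shopt -s globstar  # make ** match files in all subdirectories
for file_name in ./src/**/*.{cpp,h,hpp}; do
if [ -f "$file_name" ]; then
printf '%s\n' "$file_name"
clang-format -i "$file_name"
fi
done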
Here the source is assumed to be located in ./src and the .clang-format contains the formatting information.
As @sbarzowski touches on in a comment above, in bash you can enable globstar which causes ** to expand recursively.
If you just want it for this one command you can do something like the following to format all .h, .cc and .cpp files.
(shopt -s globstar; clang-format -i **/*.{h,cc,cpp})
Or you can add shopt -s globstar to your .bashrc and have ** goodness all the time in bash.
As a side note, you may want to use --dry-run with clang-format the first time to be sure it's what you want.
I had a similar issue with clang-format: we have a huge project with a lot of files to check and reformat.
Scripts were an OK solution, but they were too slow.
So I wrote an application that recursively goes through the files in a folder and runs clang-format on them in a fast, multithreaded manner.
The application also supports ignoring directories and files that you might not want the formatter to touch (like third-party directories).
You can check it out here: github.com/GloryOfNight/clang-format-all
I hope it will be useful for other people too.
P.S. I know the app is huge overkill, but it is super fast at its job.
A bit <O/T>, but when I googled "how to feed a list of files into clang-format" this was the top hit. In my case, I don't want to recurse over an entire directory for a specific file type. Instead, I want to apply clang-format to all the files I edited before I push my feature/bugfix branch. The first step in our pipeline is clang-format, and it almost always fails, so I wanted to run this "manually" on my changes just to take care of that step instead of nearly always dealing with a quickly failing pipeline. You can get a list of all the files you changed with
git diff <commitOrTagToCompareTo> --name-only
And borrowing from Antimony's answer, you can pipe that into xargs and finally clang-format:
git diff <commitOrTagToCompareTo> --name-only | xargs clang-format -i
Running git status will now show which files changed (git diff(tool) will show you the changes), and you can commit and push this up, hopefully moving on to more important parts of the pipeline.
The first step is to find out header and source files, we use:
find . -path ./build -prune -o -iname "*.hpp" -o -iname "*.cpp" -o -iname "*.c" -o -iname "*.h"
The -o means "or" and -iname matches names case-insensitively. In your case you may add more extensions, such as -o -iname "*.cc". Another trick here is excluding the ./build/ directory: -path ./build -prune tells find not to descend into the directory "./build".
If you run the command above, you will find that it still prints "./build" itself, so we use sed to replace "./build" with an empty string, something like:
sed 's/.\/build//' <in stream>
At last, we call clang-format to do formatting:
clang-format -i <file>
Combine them, we have:
find . -path ./build -prune -o -iname "*.hpp" -o -iname "*.cpp" -o -iname "*.cc" -o -iname "*.cxx" -o -iname "*.c" -o -iname "*.h"|sed 's/.\/build//'|xargs clang-format -i
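The reason "./build" still shows up is that, when no action is given, find applies an implicit -print to the whole expression, including the pruned branch. As a sketch of an alternative (not the answer's own method), grouping the name tests in parentheses and adding an explicit -print makes the sed step unnecessary. The directory tree below is made up for the demonstration.

```shell
#!/bin/sh
# Throwaway project tree (hypothetical file names).
demo=$(mktemp -d)
mkdir -p "$demo/build" "$demo/src"
touch "$demo/build/generated.cpp" "$demo/src/main.cpp" "$demo/src/util.h" "$demo/README.md"

# Left of the first -o: prune ./build and print nothing for it.
# Right of it: print only regular files whose name matches one of the
# grouped patterns. The explicit -print turns off the implicit one.
files=$(
    cd "$demo" &&
    find . -path ./build -prune -o \
        -type f \( -iname '*.hpp' -o -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) -print |
    LC_ALL=C sort
)
printf '%s\n' "$files"

rm -rf "$demo"
```

To format instead of list, -print can be swapped for -exec clang-format -i {} +, which also avoids the unanchored sed pattern altering unrelated paths that happen to contain "/build".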
I had a similar issue where I needed to check for formatting errors, but I wanted to do it with a single clang-format invocation on both Linux and Windows.
Here are my one-liners:
Bash:
find $PWD/src -type f \( -name "*.h" -o -name "*.cpp" \) -exec clang-format -style=file --dry-run --Werror {} +
Powershell:
clang-format -style=file --dry-run --Werror $(Get-ChildItem -Path $PWD/src -Recurse | Where Name -Match '\.(?:h|cpp)$' | Select-Object -ExpandProperty FullName)

Find & replace recursively except for certain files

With regards to this post, how would I exclude one or more files from applying the string replacement? By using the aforementioned post as an example, I would like to be able to replace "apples" with "oranges" in all descendant files of a given directory except, say, ./fpd/font/symbol.php.
My idea was using the -regex switch in the find command but unfortunately it does not have a -v option like the grep command hence I can't negate the regex to not match the files where the replacement must occur.
I use this in my Git repository:
grep -ilr orange . | grep -v ".git" | grep -e "\\.php$" | xargs sed -i 's/orange/apple/g'
It will:
Run find and replace only in files that actually have the word to be replaced;
Not process the .git folder;
Process only .php files.
Needless to say you can include as many grep layers you want to filter the list that is being passed to xargs.
Known issues:
At least in my Windows environment it fails to open files that have spaces in the path or name. Never figured that one out. If anyone has an idea of how to fix this I would like to know.
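The likely cause of that failure is that xargs splits its input on whitespace, so a path containing a space arrives at sed as two separate arguments. A possible workaround, sketched here under the assumption that GNU grep, GNU xargs and GNU sed are available (as in Git Bash or most Linux systems), is to let grep do the filtering itself and emit NUL-terminated names. The directory layout is made up for the demonstration.

```shell
#!/bin/sh
# Throwaway repository-like tree (hypothetical names, one path with spaces).
repo=$(mktemp -d)
mkdir -p "$repo/.git" "$repo/my pages"
printf 'one orange\n' > "$repo/my pages/fruit list.php"
printf 'one orange\n' > "$repo/.git/config.php"
printf 'one orange\n' > "$repo/notes.txt"

# --include keeps only *.php files, --exclude-dir skips the .git folder,
# -l lists matching files, -Z terminates each name with a NUL byte.
# xargs -0 splits on NUL only; -r skips sed when nothing matched.
grep -rlZ --include='*.php' --exclude-dir=.git 'orange' "$repo" |
    xargs -0 -r sed -i 's/orange/apple/g'

php_file=$(cat "$repo/my pages/fruit list.php")
git_file=$(cat "$repo/.git/config.php")
txt_file=$(cat "$repo/notes.txt")
printf '%s\n' "$php_file" "$git_file" "$txt_file"

rm -rf "$repo"
```

This replaces the two intermediate grep filters with options on the first grep, so nothing in the pipeline has to parse file names as lines of text.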
Haven't tested this but it should work:
find . -path ./fpd/font/symbol.php -prune -o -type f -exec sed -i 's/apple/orange/g' {} \;
You can negate with ! (or -not) combined with -name:
$ find .
.
./a
./a/b.txt
./b
./b/a.txt
$ find . -name \*a\* -print
./a
./b/a.txt
$ find . ! -name \*a\* -print
.
./a/b.txt
./b
$ find . -not -name \*a\* -print
.
./a/b.txt
./b
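Applied to the question, the same negation works with -path to leave out one specific file. This is a sketch, assuming GNU sed for the -i option; the tree is a throwaway copy of the path named in the question, and index.php is a made-up second file.

```shell
#!/bin/sh
# Throwaway tree mirroring ./fpd/font/symbol.php from the question.
demo=$(mktemp -d)
mkdir -p "$demo/fpd/font"
printf 'apples\n' > "$demo/fpd/font/symbol.php"
printf 'apples\n' > "$demo/fpd/index.php"   # hypothetical second file

(
    cd "$demo" || exit 1
    # -path is compared against the path exactly as find prints it, so it
    # has to start with ./ when the search starts from "."
    find . -type f ! -path ./fpd/font/symbol.php \
        -exec sed -i 's/apples/oranges/g' {} +
)

kept=$(cat "$demo/fpd/font/symbol.php")
replaced=$(cat "$demo/fpd/index.php")
printf '%s\n%s\n' "$kept" "$replaced"

rm -rf "$demo"
```

More files can be excluded by repeating the test, for example ! -path ./a.php ! -path ./b.php.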

Can't get grouping to work with sed

This works as expected:
INPUT FILE src.txt:
ffmpeg -i uno.3gp
ffmpeg -i dos.3gp
ffmpeg -i tres.3gp
COMMAND:
sed 's/-i .*\./XXX/' <src.txt
RESULT AS EXPECTED:
ffmpeg XXX3gp
ffmpeg XXX3gp
ffmpeg XXX3gp
Then why don't these work as expected:
COMMAND:
sed 's/-i (.*)\./XXX/' <src.txt
EXPECTED:
ffmpeg XXX3gp
ffmpeg XXX3gp
ffmpeg XXX3gp
ACTUAL RESULT:
ffmpeg -i uno.3gp
ffmpeg -i dos.3gp
ffmpeg -i tres.3gp
COMMAND:
sed 's/-i (.*)\.3gp/\1.mp3/' <src.txt
EXPECTED:
ffmpeg uno.mp3
ffmpeg dos.mp3
ffmpeg tres.mp3
ACTUAL RESULT
sed: -e expression #1, char 18: invalid reference \1 on `s' command's RHS
The parentheses don't seem to work for grouping, but all the tutorials and examples I've found seem to assume they should...
In Classic sed (not GNU sed necessarily), the grouping commands use \( and \) (and the counts use \{ and \}) rather than unescaped.
Thus, you should try:
sed 's/-i \(.*\)\./XXX/' <src.txt
sed 's/-i \(.*\)\.3gp/\1.mp3/' <src.txt
Or, if you've got GNU sed, add -r or --regexp-extended to 'use extended regular expressions in the script' (quoting from sed --help).
sed -r 's/-i (.*)\./XXX/' <src.txt
sed -r 's/-i (.*)\.3gp/\1.mp3/' <src.txt
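Both spellings can be checked side by side on the sample input. The sketch below uses -E rather than -r for the extended form, on the assumption that the installed sed accepts it (GNU sed and BSD sed both do); the variable names are only for illustration.

```shell
#!/bin/sh
# The three sample lines from the question.
input='ffmpeg -i uno.3gp
ffmpeg -i dos.3gp
ffmpeg -i tres.3gp'

# Basic regular expressions: a group is written \( ... \)
bre=$(printf '%s\n' "$input" | sed 's/-i \(.*\)\.3gp/\1.mp3/')

# Extended regular expressions: a group is written ( ... )
ere=$(printf '%s\n' "$input" | sed -E 's/-i (.*)\.3gp/\1.mp3/')

printf '%s\n' "$bre"
```

Both commands produce the same three lines, ffmpeg uno.mp3, ffmpeg dos.mp3 and ffmpeg tres.mp3.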
Jonathan Leffler has already explained the source of your error. I would add that a backreference is not always the best tool; sometimes it really slows the script down.
Furthermore, in your case you don't need a backreference at all:
sed 's/-i //;s/\.3gp/.mp3/' <src.txt
will do the job.