Merging multiple csv files

Merging multiple csv files - c++

I have many 1GB csv files. What is the easiest way to merge them? Can this be done using shell commands, or do I have to write a C++ program for it?

cat *.csv > mega-merged.csv2
mv mega-merged.csv2 mega-merged.csv
(The use of the .csv2 is so that the *.csv doesn't catch it.)
Re Joce's comment, if you have headers, you can trim off all the headers (on GNU/Linux or any other platform with GNU tools) using something like:
tail -qn +2 *.csv > mega-merged.csv2
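For the C++ route the question asks about, here is a minimal sketch. Note that the tail command above drops every header line, including the first file's, so this version keeps one header instead. The function name merge_csv is hypothetical, and the code assumes every input starts with the same single-line header.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: copies all inputs into `output`, keeping the header
// line of the first non-empty file only. Returns the number of data lines.
// Assumption: each input begins with the same one-line header.
std::size_t merge_csv(const std::vector<fs::path>& inputs, const fs::path& output) {
    std::ofstream out(output, std::ios::binary);
    std::size_t rows = 0;
    bool header_written = false;
    for (const auto& in_path : inputs) {
        std::ifstream in(in_path, std::ios::binary);
        std::string line;
        if (!std::getline(in, line)) continue;   // empty or unreadable file
        if (!header_written) {
            out << line << '\n';                 // header from the first file
            header_written = true;
        }
        while (std::getline(in, line)) {         // data lines of every file
            out << line << '\n';
            ++rows;
        }
    }
    return rows;
}
```

Like the .csv2 trick above, the output should get a name that the input list does not pick up. Because each line is written with its own newline, a file that lacks a final newline does not get glued to the next file's first row.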

Related

Use sed/regex to rename a file - bash with macOS

I have a list of files that have had a date added to the end of their names.
ex: Chorus Left Octave (consolidated) (2020_10_14 20_27_18 UTC). The files will end with .wav or .mp3
I want to leave the (consolidated) but take out the date. I have come up with the regex and tested with regexr.com. It does format the text correctly there.
The regex is: /(\([0-9]+(.*)(?=.wav|.mp3))+/g
Now, I am trying to actually rename the files. In my terminal I have cd'ed into the folder with the files. Based on other answers here I have tried:
rename -n '/(\([0-9]+(.*)(?=.wav|.mp3))+/g' *.wav|*.mp3 - using rename installed with homebrew
sed '/(\([0-9]+(.*))+/g' *.wav|*.mp3
for f in *.wav|*.mp3; do mv "$f" "${f/(\([0-9]+(.*)(?=.wav|.mp3))+/g}” done
The first two do not throw any errors, but do not do any renames (I know that the -n after rename just prints out the files that will be changed, it doesn't actually change the files)
The last one starts a bash session.
I'd rather use rename or sed, since they seem simpler to me. But what am I doing wrong?

In plain bash:
#!/bin/bash
pat='([0-9][0-9][0-9][0-9]_[0-9][0-9]_[0-9][0-9] [0-9][0-9]_[0-9][0-9]_[0-9][0-9] UTC)'
for f in *.mp3 *.wav; do echo mv "$f" "${f/$pat}"; done
Remove the echo preceding the mv after making sure it will work as intended. You may also consider adding the -i option to the mv in order to avoid clobbering an existing file unintentionally.
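The same pattern can be written as a regular expression. The sketch below uses C++ std::regex only to show which part of the name gets removed; strip_utc_stamp is a hypothetical name, and unlike the bash pattern it also drops the single space in front of the date so no gap is left before the extension.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes a "(YYYY_MM_DD HH_MM_SS UTC)" stamp, plus one
// optional space before it, from a file name. Other text is left untouched.
std::string strip_utc_stamp(const std::string& name) {
    static const std::regex stamp(
        R"re( ?\(\d{4}_\d{2}_\d{2} \d{2}_\d{2}_\d{2} UTC\))re");
    return std::regex_replace(name, stamp, "");
}
```

A name without a stamp comes back unchanged, so the function can be applied to every file in the folder and the rename skipped whenever the result equals the input.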

Is there a way for changing text file names in a folder using C++

I am working with a bunch of txt files (thousands) in my project. Each txt file contains CSV data. The problem is that each file has a random name, so I cannot write code to load them in my project. I want to rename them following a particular pattern to make loading them easier. I will use C++ to accomplish this task.
I put all the txt files in one folder, but I cannot see a way of renaming them using C++. How can I do this? Is there a way to do it? Can someone help me?

You can use std::filesystem::directory_iterator and std::filesystem::rename (C++17).

Disclaimer
The validity of this answer is based on a comment where the author specified they were not bound to the C++ language (it may be worth editing the question, the C++ tag, and the OS). This solution should work on UNIX systems that have bash, that is most Linux distributions and macOS (since macOS Catalina the default shell is zsh, but bash is still installed).
Bash command line
Using the following bash command should rename all the files in a folder with increasing numbers, that is:
toto.csv -> 1.csv
titi.csv -> 2.csv etc
It assumes the ordering is not important.
a=1; for i in *; do mv -n "$i" "$a.csv" ; let "a +=1"; done
To test it, you can prepare a test folder by opening a terminal and typing:
mkdir test
cd test
touch toto.csv titi.csv tata.csv
ls
Output:
tata.csv titi.csv toto.csv
Then you can run the following command:
a=1; for i in *; do mv -n "$i" "$a.csv" ; let "a +=1"; done
ls
Output:
1.csv 2.csv 3.csv
Explanation:
a=1 declares a counter variable
for i in *; begins iterating over all files in the folder
do mv renames the current file (the variable $i) to a new name, $a.csv
let "a +=1" increments the counter a, and done closes the loop
the option -n makes sure no file gets overwritten by the mv command
I assumed there was no specific criterion to rename the files. If there is a specific structure (pattern) in the renaming, the bash command can probably accommodate it, but the question should then give more details about these requirements :)

Regex to add an extension to a directory full of files

I am new to regular expressions.
I have many irregularly numbered ascii files with no extension: g000554, g000556, g000558, g000561, g000563 ... g001979 etc
I would like to type a regex at the terminal (or in a short script) to add a .dat to all of these files.
So I would like to change them to become: g000554.dat, g000556.dat, g000558.dat, g000561.dat, g000563.dat ... g001979.dat etc
p.s. Sorry I should have provided more info: by terminal I meant a mac terminal and I cannot use the 'rename' command.

I think you're using a Linux system, so here is a bash solution. It works only if your files start with g and no other files starting with g are in that directory.
for i in g*; do mv "$i" "$i.dat"; done
The command below adds the .dat extension to all files in the current directory:
for i in *; do mv "$i" "$i.dat"; done

Automatically fix filename cases in C++ codebase?

I am porting a C++ codebase which was developed on a Windows platform to Linux/GCC. It seems that the author didn't care for the case of filenames, so he used
#include "somefile.h"
instead of
#include "SomeFile.h"
to include the file which is actually called "SomeFile.h". I was wondering if there is any tool out there to automatically fix these includes? The files are all in one directory, so it would be easy for the tool to find the correct names.

EDIT: Before doing anything, note that I'm assuming you either have copies of the files off to the side or, preferably, a baseline version in source control should you need to roll back for any reason.
You should be able to do this with sed: Something like sed -i 's/somefile\.h/SomeFile.h/I' *.[Ch]
This means take a case-insensitive somefile.h (trailing /I) and do an in-place (same file) replacement (-i) with the other text, SomeFile.h.
You can even do it in a loop (totally untested):
for file in *.[Ch]
do
sed -i "s/$file/$file/I" *.[Ch]
done
I should note that although I don't believe this applies to you, Solaris sed doesn't support -i and you'd have to install GNU sed or redirect to a file and rename.
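The per-line rewrite that the sed loop performs can also be sketched in C++, which sidesteps shell quoting. This is an illustration under assumptions: fix_include_case and equal_ignoring_case are hypothetical names, only the #include "..." form is handled, and file names are assumed to be ASCII.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Case-insensitive comparison of two ASCII file names.
bool equal_ignoring_case(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// Hypothetical helper: if `line` is an #include "..." directive whose name
// matches one of `real_names` apart from case, returns the line with the
// real name substituted. Any other line is returned unchanged.
std::string fix_include_case(const std::string& line,
                             const std::vector<std::string>& real_names) {
    static const std::regex directive(R"re(^(\s*#\s*include\s*")([^"]+)")re");
    std::smatch m;
    if (!std::regex_search(line, m, directive)) return line;
    for (const auto& real : real_names) {
        if (equal_ignoring_case(m[2].str(), real))
            return m[1].str() + real + "\"" + m.suffix().str();
    }
    return line;
}
```

The list of real names would come from listing the header directory once, after which each source file is read line by line, passed through the function, and written back.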

Forgive me, I'm away from my Linux environment right now so I can't test this myself, but I can tell you what utilities you would need to use to do it.
Open a terminal and use cd to navigate to the correct directory.
cd ~/project
Get a list of all of the .h files you need. You should be able to accomplish this with the shell's wildcard expansion without any effort.
ls include/*.h libs/include/*.h
Get a list of all of the files in the entire project (.c, .cpp, .h, .whatever), anything that can #include "header.h". Again, wildcard expansion.
ls include/*.h libs/include/*.h *.cpp libs/*.cpp
Iterate over each file in the project with a for loop
for f in ... # wildcard file list
do
echo "Looking in $f"
done
Iterate over each header file with a for loop
for h in ... # wildcard header list
do
echo "Looking for $h"
done
For each header in each project file, use sed to search for #include "headerfilename.h", and replace with #include "HeaderFileName.h" or whatever the correct case is.
Warning: Untested and probably dangerous: This stuff is a place to start and should be thoroughly tested before use.
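h_escaped=$(printf '%s\n' "$h" | sed -e 's/[]\/$*.^[]/\\&/g') # escapes regex metacharacters in the file name
argument="^\(\s*#\s*include\s*\"\)$h_escaped\(\"\s*\)\$" # I think this is right
sed -i -e "s/$argument/\1$h\2/I" "$f" # GNU sed; \1 and \2 are the captured groups, and $h must be a bare file name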
Yes, I know it looks awful.
Things to consider:
Rather than going straight to running this on your production codebase, test it thoroughly first.
sed can eat files like a VCR can eat tapes.
Make a backup.
Make another backup.
This is an O(N^2) operation involving hard disk access, and if your project is large it will run slowly. If your project is not gigantic, don't bother, but if it is, consider doing something to pipe sed's output to other seds.
Your search should be case insensitive: it should match #include, #INCLUDE, #iNcLuDe, and any combination of case present in the existing header filename, as well as any amount of whitespace between the include and the header. Bonus points if you preserve whitespace.

Use Notepad++ to do a 'Find in Files' and replace.
From toolbar:
Search - Find in Files.
Then complete the 'Find what' and 'Replace with'.

Best way to join group .mp3 files together

I have several .mp3 files, such as:
t001.mp3 t002.mp3 t003.mp3 .....
e001.mp3 e002.mp3 e003.mp3 .....
I would like to merge :
t001.mp3 and e001.mp3 ->>>> r001.mp3
t002.mp3 and e002.mp3 ->>>> r002.mp3
t003.mp3 and e003.mp3 ->>>> r003.mp3
something like this.
What is the best way to do this? Is there an application or a batch command for it?

If you are on Linux, you can simply use the cat file1 file2 > file3 command to concatenate the files and get a merged mp3 file that plays them in sequence.
Similar functionality is available in other operating systems, including Windows (e.g. type file1 file2 > file3).
More info is available in the following related question.
Using cat to join mp3 files. What is this black sorcery?
Cheers!!!
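To apply that byte-level concatenation to every t/e pair from the question, a sketch along these lines might work. The names join_pairs and append_bytes are hypothetical, the tNNN/eNNN/rNNN naming is taken from the question, and the result has the same limits as cat: the files are simply placed back to back without re-encoding.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Appends the raw bytes of `src` to `out`.
void append_bytes(std::ofstream& out, const fs::path& src) {
    std::ifstream in(src, std::ios::binary);
    std::vector<char> buffer(64 * 1024);
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) ||
           in.gcount() > 0) {
        out.write(buffer.data(), in.gcount());
    }
}

// Hypothetical helper: for every t<rest>.mp3 in `dir` with a matching
// e<rest>.mp3, writes r<rest>.mp3 holding the t bytes followed by the e bytes.
// Returns the number of r files written.
std::size_t join_pairs(const fs::path& dir) {
    std::vector<fs::path> t_files;   // collect before creating new files
    for (const auto& entry : fs::directory_iterator(dir)) {
        const fs::path& p = entry.path();
        if (p.extension() == ".mp3" && p.filename().string().front() == 't')
            t_files.push_back(p);
    }
    std::size_t written = 0;
    for (const auto& t_path : t_files) {
        const std::string rest = t_path.filename().string().substr(1);   // "001.mp3"
        const fs::path e_path = dir / ("e" + rest);
        if (!fs::exists(e_path)) continue;                                // no partner
        std::ofstream out(dir / ("r" + rest), std::ios::binary);
        append_bytes(out, t_path);
        append_bytes(out, e_path);
        ++written;
    }
    return written;
}
```

A t file without a partner is skipped and left untouched.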

I've used MP3Wrap, a command-line tool, successfully. It can be used at the command prompt or in batch files. Some commands for its use:
mp3wrap combinedfiles.mp3 file*
To wrap all the files in a directory:
mp3wrap combinedfiles.mp3 *
or
mp3wrap combinedfiles.mp3 *.*
Two or more files:
mp3wrap combinedfiles.mp3 file1.mp3 file2.mp3 etc.
