boost::algorithm::split_regex - c++

For the function
boost::split_regex(std::vector<std::string>, std::string, std::string);
I end up with empty tokens and I would like to compress them, but unlike boost::split, I cannot find a token_compress_on option for split_regex. As it is still undocumented (see below), I was wondering if anyone had any pointers on how to go about this?
From: https://www.boost.org/doc/libs/1_81_0/libs/algorithm/doc/html/index.html
Not-yet-documented Other Algorithms
Reference
Header <boost/algorithm/algorithm.hpp>
Header <boost/algorithm/apply_permutation.hpp>
Header <boost/algorithm/clamp.hpp>
Header <boost/algorithm/cxx11/all_of.hpp>
Header <boost/algorithm/cxx11/any_of.hpp>
Header <boost/algorithm/cxx11/copy_if.hpp>
Header <boost/algorithm/cxx11/copy_n.hpp>
Header <boost/algorithm/cxx11/find_if_not.hpp>
Header <boost/algorithm/cxx11/iota.hpp>
Header <boost/algorithm/cxx11/is_partitioned.hpp>
Header <boost/algorithm/cxx11/is_permutation.hpp>
Header <boost/algorithm/cxx14/is_permutation.hpp>
Header <boost/algorithm/cxx11/is_sorted.hpp>
Header <boost/algorithm/cxx11/none_of.hpp>
Header <boost/algorithm/cxx11/one_of.hpp>
Header <boost/algorithm/cxx11/partition_copy.hpp>
Header <boost/algorithm/cxx11/partition_point.hpp>
Header <boost/algorithm/cxx14/equal.hpp>
Header <boost/algorithm/cxx14/mismatch.hpp>
Header <boost/algorithm/cxx17/exclusive_scan.hpp>
Header <boost/algorithm/cxx17/for_each_n.hpp>
Header <boost/algorithm/cxx17/inclusive_scan.hpp>
Header <boost/algorithm/cxx17/reduce.hpp>
Header <boost/algorithm/cxx17/transform_exclusive_scan.hpp>
Header <boost/algorithm/cxx17/transform_inclusive_scan.hpp>
Header <boost/algorithm/cxx17/transform_reduce.hpp>
Header <boost/algorithm/find_backward.hpp>
Header <boost/algorithm/find_not.hpp>
Header <boost/algorithm/gather.hpp>
Header <boost/algorithm/hex.hpp>
Header <boost/algorithm/is_clamped.hpp>
Header <boost/algorithm/is_palindrome.hpp>
Header <boost/algorithm/is_partitioned_until.hpp>
Header <boost/algorithm/minmax.hpp>
Header <boost/algorithm/minmax_element.hpp>
Header <boost/algorithm/searching/boyer_moore.hpp>
Header <boost/algorithm/searching/boyer_moore_horspool.hpp>
Header <boost/algorithm/searching/knuth_morris_pratt.hpp>
Header <boost/algorithm/sort_subrange.hpp>
Header <boost/algorithm/string.hpp>
Header <boost/algorithm/string_regex.hpp>

There's not much use for it with a delimiter pattern, because you can always just use (pattern)+ instead of pattern to have the desired effect:
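#include <boost/algorithm/string_regex.hpp>
#include <string>
#include <string_view>
#include <vector>
// Splits input on every match of the regular expression delim.
auto tokenize(std::string_view input, std::string delim) {
boost::regex re(delim);
std::vector<std::string> tokens;
boost::split_regex(tokens, input, re);
return tokens;
}
#include <fmt/ranges.h>
int main() {
fmt::print("{}\n", tokenize("a,,b,c,,,d", ","));  // one delimiter per match
fmt::print("{}\n", tokenize("a,,b,c,,,d", ",+")); // runs of delimiters collapsed
}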
Prints
["a", "", "b", "c", "", "", "d"]
["a", "b", "c", "d"]
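Note that a (pattern)+ delimiter may still leave an empty token at the very start or end when the input begins or ends with a delimiter. If those should go too, one option is to filter the result after splitting. A minimal sketch using only the standard library; drop_empty_tokens is a hypothetical helper name, not part of Boost:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost function): removes every empty
// string from the token list in place, keeping the order of the rest.
inline void drop_empty_tokens(std::vector<std::string>& tokens) {
    tokens.erase(std::remove_if(tokens.begin(), tokens.end(),
                                [](const std::string& t) { return t.empty(); }),
                 tokens.end());
}
```

It would be called on the vector that split_regex filled, right after the split.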

Related

Multiple definitions when using #ifdef

I am having a problem when compiling: Multiple definitions of "myFunction()"
I will greatly simplify the problem here. Basically, I have 3 files: "main", "header", and "myLibrary".
main.cpp
#include "header.hpp"
int main() { }
header.hpp
#ifndef HEADER_HPP
#define HEADER_HPP
#include "myLibrary.hpp"
// ...
#endif
header.cpp
#include "header.hpp"
// ...
myLibrary.hpp
#ifndef LIB_HPP
#define LIB_HPP
#if defined(__unix__)
#include <dlfcn.h>
std::string myFunction() { return std::string(); }
#endif
#endif
myLibrary.cpp
#include "myLibrary.hpp"
//...
So, why does the compiler say that I have Multiple definitions of "myFunction()"?
One clue I found: when I take header.cpp and erase the line that says #include "header.hpp", the program compiles without complaining. On the other hand, if I erase myFunction (from myLibrary.hpp) instead, the program also compiles without complaint.
You are defining the body of the function inside the header file. So every translation unit that you include that header in (in this case, main.cpp and header.cpp) will end up with its own copy of that function body. And when you try to link those multiple units together, you get the "duplicate definition" error.
The function needs to be declared in the hpp file, and defined in the cpp file:
myLibrary.hpp
#ifndef LIB_HPP
#define LIB_HPP
#if defined(__unix__)
#include <dlfcn.h>
#include <string>
std::string myFunction();
#endif
#endif
myLibrary.cpp
#include "myLibrary.hpp"
#if defined(__unix__)
std::string myFunction()
{
return std::string();
}
#endif
//...
Include guards only prevent the same header from being included twice within the same translation unit, which in practice is usually a single .cpp file. For example, it prevents errors when doing this:
#include "header.h"
#include "header.h"
int main()
{
}
But, more generally, it means that it doesn't matter if you include a header that has already been included as a dependency of another header.
However, if you have two .cpp files include the same header, and that header contains the definition of a function (such as your myLibrary.hpp) then each .cpp file will have its own definition (the include guard won't help because the header is being included in two separate translation units / .cpp files).
The simplest thing to do is to declare the function in the header, which tells every file that includes your header that the function exists somewhere, and then define it in the .cpp file so that it is only defined once.
You should define functions in the .cpp files, not in the header files; the header files only declare them. What you're doing is defining the function in the header file, so when the header gets included into multiple files, the function is duplicated. Duplicate symbols across files cause a linker error unless they're static or inline.
myLibrary.cpp:
#include "myLibrary.hpp"
#if defined(__unix__)
std::string myFunction() { return std::string(); }
#endif
myLibrary.hpp:
#ifndef LIB_HPP
#define LIB_HPP
#if defined(__unix__)
#include <dlfcn.h>
#include <string>
std::string myFunction();
#endif
#endif
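If the function has to stay in the header (for example in a header-only library), marking it inline is another option: the linker then accepts the identical definition coming from several translation units. A minimal sketch of myLibrary.hpp under that assumption, with the __unix__ check and <dlfcn.h> left out for brevity:

```cpp
#ifndef LIB_HPP
#define LIB_HPP

#include <string>

// inline allows this definition to appear in every translation unit
// that includes the header without a "multiple definition" link error.
inline std::string myFunction() { return std::string(); }

#endif
```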

#include inside function body or reduce its visibility

Is there a way to reduce the scope of #include directive?
I mean for example to do something like this
void functionOneDirective()
{
#include "firstInstructionSet.h"
registerObject();
// cannot access instantiate()
}
void functionSecondDirective()
{
#include "secondInstructionSet.h"
instantiate();
// cannot access registerObject()
}
void main()
{
//cannot access instantiate() && registerObject()
}
It is not possible to restrict "includes" to a subset of the using file. It is always visible for anything following the include.
If you are interested in providing "different" views on certain functionality, consider using different namespaces to express these different "views".
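A minimal sketch of the namespace idea, assuming the two instruction sets can be wrapped in namespaces. The names first_set and second_set and the placeholder return values are hypothetical. A using-declaration at block scope makes a name usable unqualified only inside that function:

```cpp
// Hypothetical namespaces standing in for the two headers.
namespace first_set {
inline int registerObject() { return 1; }  // placeholder body
}
namespace second_set {
inline int instantiate() { return 2; }     // placeholder body
}

int functionOneDirective() {
    using first_set::registerObject;  // unqualified name valid in this block only
    return registerObject();
    // an unqualified instantiate() would not compile here
}

int functionSecondDirective() {
    using second_set::instantiate;    // unqualified name valid in this block only
    return instantiate();
    // an unqualified registerObject() would not compile here
}
```

This is a convention rather than access control: any code can still write first_set::registerObject() explicitly.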
#include directly inserts the file contents at that spot. So it depends on the contents of the header, but generally the answer is no. If it's a C header, surrounding the inclusion with a namespace might work, but I'm not sure.
#include is resolved by the preprocessor before the code is compiled, so you can't change it at run time. But you can use preprocessor directives to control which files are included:
#if defined(A) && A > 3
#include "somefile.h"
#else
#include "someotherfile.h"
#endif
A possible solution to your question (and the correct way to organize your source code) is to create a separate .c file for each function or group of related functions. For each .c file you also write a .h file that contains the declarations of the elements (types, constants, variables, functions) from the .c file that are published by that file.
The variables declared in the .h file need to be prefixed with the extern keyword (to let the compiler know they reside in a different .c file).
Then, let each .c file include only the .h files it needs (those that declare the functions/types/variables that are used in this .c file).
Example
File firstInstructionSet.h
typedef int number;
void registerObject();
File firstInstructionSet.c
void registerObject()
{
/* code to register the object here */
}
File oneDirective.h
void functionOneDirective();
File oneDirective.c
#include "firstInstructionSet.h"
void functionOneDirective()
{
registerObject();
// cannot access instantiate() because
// the symbol 'instantiate' is not known in this file
}
File secondDirective.h
extern int secondVar;
void functionSecondDirective();
File secondDirective.c
#include "secondInstructionSet.h"
int secondVar = 0;
void functionSecondDirective()
{
instantiate();
// cannot access registerObject() because
// the symbol 'registerObject' is not known in this file
}
File secondInstructionSet.h
void instantiate();
File secondInstructionSet.c
void instantiate()
{
/* code to instantiate here */
}
File main.c
#include "oneDirective.h"
#include "secondDirective.h"
int main(void)
{
// cannot access instantiate() && registerObject()
// because these symbols are not known in this file.
// but can access functionOneDirective() and functionSecondDirective()
// because the files that declare them are included (i.e. these
// symbols are known at this point)
// it can also access the variable "secondVar"; the type "number" would
// be known here only if firstInstructionSet.h were included as well
return 0;
}

.cpp vs .h and where should I put function definitions [duplicate]

I've been writing in C++ lately and I'm getting confused with .cpp vs .h — when to use them and what should go in them. I've been reading that you should put function definitions in a separate .cpp file, and headers should be used for declarations, but how do I use the separate .cpp file? Do I #include it or what? I'm looking for clarification on .h and .cpp and what should go where and how to include separate .cpp files.
You should use the .h file for function prototypes, data type declarations and pre-processor directives, and .cpp files for definitions. For example, test.h might look like
#define CONSTANT 123 // pre-processor directive
void myfunction(const char* str);
and your test.cpp might look like
#include <stdio.h>
#include "test.h"
int main(int argc, char **argv)
{
myfunction("Hello World");
return 0;
}
void myfunction(const char* str)
{
printf("%s and constant %d", str, CONSTANT);
return;
}
Usually the class declaration goes into the (.h) header file, and the implementation goes in the .cpp file.
You include the header file in the .cpp file so that all the functions are recognized, and you should remember to use #ifndef include guards in the header file to avoid errors such as include loops.
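As a rough illustration of that split, here is a sketch with a hypothetical Counter class. Both parts are shown in one listing; in a real project the first part would live in counter.h and the second in counter.cpp, which would start with #include "counter.h":

```cpp
// ---- counter.h (hypothetical file): class declaration with include guard ----
#ifndef COUNTER_H
#define COUNTER_H

class Counter {
public:
    void increment();   // declared here, defined in the .cpp part
    int value() const;

private:
    int count_ = 0;
};

#endif // COUNTER_H

// ---- counter.cpp (hypothetical file): member function definitions ----
void Counter::increment() { ++count_; }
int Counter::value() const { return count_; }
```

Other source files include only counter.h; the .cpp file is compiled once and linked in, never #include-d.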

How to define function in other source file:C++ CodeBlocks

I am trying to separate my functions into another source file, but I am getting a "multiple definition" error on the add function.
Main source file
Main.cpp
#include<iostream>
#include "myHeader.h"
using namespace std;
int main()
{
int result = add(1,2);
}
Header file "myHeader.h"
#include "calc.cpp"
int add(int, int);
Other Source file "calc.cpp"
int add(int a, int b)
{
return a+b;
}
What you need is:
"myHeader.h"
#ifndef MY_HEADER
#define MY_HEADER
int add(int, int);
#endif
calc.cpp
#include "myHeader.h"
int add(int a, int b)
{
return a+b;
}
main.cpp
#include "myHeader.h"
int main()
{
int result = add(1,2);
return 0;
}
You don't include the .cpp in the .h. The header file tells the compiler that a function with the specified prototype exists, and the linker takes care of matching up each call to that function with its implementation.
Also, it's usually a good idea to give your header file and .cpp the same name, so calc.h and calc.cpp rather than myHeader.h.
Don't include calc.cpp from myHeader.h. Except for that one line, your example is right as far as headers go. (main() should return a value).
calc.cpp and main.cpp are two different "compilation units" which will be compiled separately into object files. The two object files are then combined into one executable by a linker.
Your problem is that you include a code file (.cpp) in a header. You should do the inverse: include your header "myHeader.h" in calc.cpp. And to be consistent, you should give your header the same name as your code file, so calc.h for the header and calc.cpp for your code.
This is pretty simple. Do not include your "calc.cpp" file in the "myHeader.h" file.
Also take a look at C/C++ include guards.
This is a fundamental of C/C++ programming. You will need to use it many times.
Protect your "myHeader.h"
#ifndef ADD_HEADER
#define ADD_HEADER
int add(int, int);
#endif // ADD_HEADER
#include "calc.cpp"
Don't do that - it includes the function definition in any translation unit that includes the header, so you'll end up with multiple definitions.
Without that, your code should be fine if you build a program from both source files. main.cpp will include a declaration, so that it knows that function exists; the definition from the other source file will be included in the program by the linker.
Do not include calc.cpp. This is causing the redefinition.
You can include myHeader.h in calc.cpp.

Trying to create an error log ofstream -- getting "one or more multiply defined symbols found"

I'm trying to do this:
#pragma once
#include <fstream>
#include <string>
static std::ofstream ErrorLog;
void InitErrorLog(std::string FileName) {
ErrorLog.open(FileName);
}
but am getting a "One or more multiply defined symbols found" error when #include-ing in multiple CPP files. What is the STL doing (to provide cout, cin, cerr, etc. -- this approach originates as an alternative to redirecting cerr) that I'm not?
You are providing the definition of ErrorLog in a header file. Instead, define it in a source file and leave an extern declaration in the header.
source
std::ofstream ErrorLog;
void InitErrorLog(std::string FileName) {
ErrorLog.open(FileName);
}
header
extern std::ofstream ErrorLog;
void InitErrorLog(std::string FileName);
Additionally, in order to keep your function in the header you have to make it inline.
You're breaking the one definition rule. You need to make the method inline.
inline void InitErrorLog(std::string FileName) {
ErrorLog.open(FileName);
}
Also, note that by declaring your variable static, you'll have a copy per translation unit - i.e. it's not a global. To make it global, you need to declare it extern in the header and define it in a single implementation file.
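With a C++17 compiler there is also a header-only alternative: an inline variable gives a single ErrorLog object shared by all translation units, without a separate source file. A sketch under that assumption; taking the file name by const reference is a small change from the original signature:

```cpp
#pragma once
#include <fstream>
#include <string>

// C++17 inline variable: every translation unit that includes this
// header refers to the same ErrorLog object (unlike a static one).
inline std::ofstream ErrorLog;

// inline so the definition may appear in several translation units.
inline void InitErrorLog(const std::string& FileName) {
    ErrorLog.open(FileName);
}
```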