Concatenating string_view objects - c++

I've been adding std::string_view to some old code to represent string-like config params, since it provides a read-only view that is faster because nothing needs to be copied.
However, two string_views cannot be concatenated, because operator+ isn't defined for them. I see this question has a couple of answers stating it's an oversight and that there is a proposal to add it. However, that proposal is for adding a string and a string_view; presumably, if it gets implemented, the resulting concatenation would be a std::string.
Would adding two string_view also fall in the same category? And if not, why shouldn't adding two string_view be supported?
Sample
std::string_view s1{"concate"};
std::string_view s2{"nate"};
std::string_view s3{s1 + s2};
And here's the error
error: no match for 'operator+' (operand types are 'std::string_view' {aka 'std::basic_string_view<char>'} and 'std::string_view' {aka 'std::basic_string_view<char>'})

A view is similar to a span in that it does not own the data; as the name implies, it is just a view of the data. To concatenate two string views, you first need to construct a std::string from them, and then you can concatenate.
std::string s3 = std::string(s1) + std::string(s2);
Note that s3 will be a std::string, not a std::string_view, since it owns its data.
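If this comes up often, the conversion can be wrapped in a small helper. The following is only a sketch: the name concat is made up for illustration, and it assumes C++17 so that std::string::append accepts a std::string_view. It reserves the final size up front, so the result is built with a single allocation instead of going through two temporary strings.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: builds an owning std::string from two views.
std::string concat(std::string_view a, std::string_view b)
{
    std::string result;
    result.reserve(a.size() + b.size()); // one allocation for the whole result
    result.append(a);                    // copy the characters of the first view
    result.append(b);                    // then those of the second view
    return result;
}
```

With s1 and s2 from the question, std::string s3 = concat(s1, s2); gives an owning string, which can itself be viewed through a std::string_view for as long as s3 is alive.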

std::string_view is an alias for std::basic_string_view<char>, i.e. std::basic_string_view instantiated for a specific character type, char.
But what does it look like?
Besides a fairly large number of useful member functions such as find, substr, and others (perhaps an ordinary number, compared with the other container-like and string-like types in the standard library), std::basic_string_view<_CharT>, with _CharT being the generic char-like type, has just 2 data members,
// directly from my /usr/include/c++/12.2.0/string_view
size_t _M_len;
const _CharT* _M_str;
i.e. a pointer to const _CharT that indicates where the view starts, and a size_t (an unsigned integer type) that indicates how long the view is, counting from _M_str's pointee.
In other words, a string view just knows where it starts and how long it is, so it represents a sequence of char-like entities that are consecutive in memory. With just two such members, you can't represent a string made up of non-contiguous substrings.
Put differently, to create a std::string_view you need to be able to say where it starts and how many chars long it is. Can you tell where s1 + s2 would have to start and how many characters long it should be? Think about it: you can't, because s1 and s2 are not adjacent.
Maybe a diagram can help.
Assume these lines of code
std::string s1{"hello"};
std::string s2{"world"};
s1 and s2 are totally unrelated objects, as far as their memory location is concerned; here is what they look like:
                           &s2[0]
                             |
                             | &s2[1]
                             |   |
&s1[0]                       |   | &s2[2]
  |                          |   |   |
  | &s1[1]                   |   |   | &s2[3]
  |   |                      |   |   |   |
  |   | &s1[2]               |   |   |   | &s2[4]
  |   |   |                  |   |   |   |   |
  |   |   | &s1[3]           v   v   v   v   v
  |   |   |   |            +---+---+---+---+---+
  |   |   |   | &s1[4]     | w | o | r | l | d |
  |   |   |   |   |        +---+---+---+---+---+
  v   v   v   v   v
+---+---+---+---+---+
| h | e | l | l | o |
+---+---+---+---+---+
I've intentionally drawn them misaligned to mean that &s1[0], the memory location where s1 starts, and &s2[0], the memory location where s2 starts, have nothing to do with each other.
Now, imagine you create two string views like this:
std::string_view sv1{s1};
std::string_view sv2(s2.begin() + 1, s2.begin() + 4);
Here's what they will look like, in terms of the two implementation-specific members _M_str and _M_len:
                               &s2[0]
                                 |
                                 | &s2[1]
                                 |   |
    &s1[0]                       |   | &s2[2]
      |                          |   |   |
      | &s1[1]                   |   |   | &s2[3]
      |   |                      |   |   |   |
      |   | &s1[2]               |   |   |   | &s2[4]
      |   |   |                  |   |   |   |   |
      |   |   | &s1[3]           v   v   v   v   v
      |   |   |   |            +---+---+---+---+---+
      |   |   |   | &s1[4]     | w | o | r | l | d |
      |   |   |   |   |        +---+---+---+---+---+
      v   v   v   v   v            · ^         ·
    +---+---+---+---+---+          · |         ·
    | h | e | l | l | o |        +---+         ·
    +---+---+---+---+---+        | ·           ·
    · ^                 ·        | ·           ·
    · |                 ·        | <-----------> sv2._M_len
  +---+                 ·        |
  | ·                   ·        +-- sv2._M_str
  | ·                   ·
  | <-------------------> sv1._M_len
  |
  +-------- sv1._M_str
Given the above, can you see what's wrong with expecting that
std::string_view s3{sv1 + sv2};
works?
How could you possibly define s3._M_str and s3._M_len (based on sv1._M_str, sv1._M_len, sv2._M_str, and sv2._M_len) such that they represent a view on "hello" followed by "orl"?
You can't, because the characters of s1 and s2 are located in two unrelated areas of memory.
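The only case in which two views can be merged into one view is when the second happens to start exactly where the first ends. As a rough sketch (the function name joinIfAdjacent is made up, and this is only meaningful when both views refer to the same underlying character array, e.g. two substr results of one string), the check could look like this:

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper: returns one view spanning both inputs, but only if
// `b` begins at the exact position where `a` ends. Assumes both views were
// taken from the same underlying character array.
std::optional<std::string_view> joinIfAdjacent(std::string_view a, std::string_view b)
{
    if (a.data() + a.size() != b.data())
        return std::nullopt; // there is a gap or overlap: no single (start, length) pair fits
    return std::string_view{a.data(), a.size() + b.size()};
}
```

For two views into unrelated strings, as in the diagram, no start pointer and length can describe the result, so copying into a std::string is the only option.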

std::string_view does not own any data; it is only a view. If you want to join two views into a joined view, you can use boost::join() from the Boost library. The result type will not be a std::string_view, though.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string_view>
#include <boost/range.hpp>
#include <boost/range/join.hpp>
void test()
{
    std::string_view s1{"hello, "}, s2{"world"};
    auto joined = boost::join(s1, s2);
    // print joined string (ostream_iterator needs its element type spelled out)
    std::copy(joined.begin(), joined.end(), std::ostream_iterator<char>(std::cout, ""));
    std::cout << std::endl;
    // other method to print
    for (auto c : joined) std::cout << c;
    std::cout << std::endl;
}
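With the standard library alone, C++20 can produce the same lazy result by placing the views in an array and flattening it with std::views::join. (C++23's std::views::join_with is a different tool: it inserts a delimiter between the elements of a range of ranges. C++26 adds std::views::concat for concatenating ranges directly.)
#include <array>
#include <iostream>
#include <ranges>
#include <string_view>
void test()
{
    std::string_view s1{"hello, "}, s2{"world"};
    std::array parts{s1, s2};
    // lazily yields the chars of s1, then those of s2; nothing is copied
    auto joined = parts | std::views::join;
    for (auto c : joined) std::cout << c;
    std::cout << std::endl;
}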


How is deque implemented in c++ stl

I just wanted to know how deque is implemented and how basic operations like push_front and the random access operator are provided in that implementation.
"I just wanted to know how deque is implemented"
It's always good to have an excuse for doing ASCII art:
+-------------------------------------------------------------+
| std::deque<int>                                             |
|                                                             |
|  subarrays:                                                 |
| +---------------------------------------------------------+ |
| |           |           |           |         |           | |
| | int(*)[8] | int(*)[8] | int(*)[8] |int(*)[8]|int(*)[8]  | |
| |           |           |           |         |           | |
| +---------------------------------------------------------+ |
|                        /            \                       |
|                       /              \                      |
|                      /                \                     |
|                     /                  \                    |
|                    /                    \                   |
|                   /                      \                  |
|                  /                        \                 |
|                 /                          \                |
|                -                            -               |
|               +------------------------------+              |
|               |  ?, ?, 42, 43, 50, ?, ?, ?   |              |
|               +------------------------------+              |
|                                                             |
| additional state:                                           |
|                                                             |
| - pointer to begin of the subarrays                         |
| - current capacity and size                                 |
| - pointer to current begin and end                          |
+-------------------------------------------------------------+
"How are the basic operations like push_front and the random access operator provided in that implementation?"
First, std::deque::push_front, from libc++ (libcxx):
template <class _Tp, class _Allocator>
void
deque<_Tp, _Allocator>::push_front(const value_type& __v)
{
allocator_type& __a = __base::__alloc();
if (__front_spare() == 0)
__add_front_capacity();
__alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v);
--__base::__start_;
++__base::size();
}
This checks whether the memory already allocated at the front can hold an additional element. If not, it allocates. Then the main work is shifted to the iterator: _VSTD::addressof(*--__base::begin()) goes one location before the current front element of the container, and this address is passed to the allocator to construct a new element in place by copying __v (the default allocator does this with a placement new).
Now random access. Again from libcxx, std::deque::operator[] (the non-const version) is
template <class _Tp, class _Allocator>
inline
typename deque<_Tp, _Allocator>::reference
deque<_Tp, _Allocator>::operator[](size_type __i) _NOEXCEPT
{
size_type __p = __base::__start_ + __i;
return *(*(__base::__map_.begin() + __p / __base::__block_size) + __p % __base::__block_size);
}
This computes an index relative to some start index, and then determines the subarray and the index relative to the start of that subarray. __base::__block_size should be the number of elements in one subarray here.
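Stripped of the library internals, the index arithmetic can be sketched on its own. This is not libc++ code: BlockPosition, locate, and their parameters are names invented for illustration, with blockSize standing in for __block_size and start for __start_.

```cpp
#include <cstddef>

// Hypothetical result type: where an element lives in a block-based deque.
struct BlockPosition {
    std::size_t block;   // which subarray holds the element
    std::size_t offset;  // index of the element inside that subarray
};

// start:     position of the first element, counted from the beginning of
//            the first subarray (slots before it are unused front capacity)
// i:         logical index as passed to operator[]
// blockSize: number of elements per subarray (8 in the diagram above)
constexpr BlockPosition locate(std::size_t start, std::size_t i, std::size_t blockSize)
{
    const std::size_t p = start + i;        // absolute slot number
    return {p / blockSize, p % blockSize};  // subarray, then slot inside it
}
```

For instance, with two unused slots at the front and subarrays of 8 elements, locate(2, 6, 8) lands on the first slot of the second subarray. Both steps are constant-time, which is why random access into a deque is O(1) even though its storage is not contiguous.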

Mockito verifying method invocation without using equals method

While using Spock I can do something like this:
when:
12.times {mailSender.send("blabla", "subject", "content")}
then:
12 * javaMailSender.send(_)
When I tried to do the same in Mockito:
verify(javaMailSender,times(12)).send(any(SimpleMailMessage.class))
I got an error that SimpleMailMessage has null values, so I had to initialize it in the test:
SimpleMailMessage simpleMailMessage = new SimpleMailMessage()
simpleMailMessage.setTo("blablabla")
simpleMailMessage.subject = "subject"
simpleMailMessage.text = "content"
verify(javaMailSender,times(12)).send(simpleMailMessage)
Now it works, but it's a lot of work and I really don't care about equality. What if SimpleMailMessage had many more arguments, or contained other objects with arguments of their own? Is there any way to check just that the send method was called X times?
EDIT: added implementation of send method.
private fun sendEmail(recipient: String, subject: String, content: String)
{
val mailMessage = SimpleMailMessage()
mailMessage.setTo(recipient)
mailMessage.subject = subject
mailMessage.text = content
javaMailSender.send(mailMessage)
}
There are 2 senders: mailSender is my custom object and javaMailSender is from another library.
Stacktrace:
Mockito.verify(javaMailSender,
Mockito.times(2)).send(Mockito.any(SimpleMailMessage.class))
| | | | |
| | | | null
| | | Wanted but not invoked:
| | | javaMailSender.send(
| | | <any org.springframework.mail.SimpleMailMessage>
| | | );
| | | -> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
| | |
| | | However, there were exactly 2 interactions with this mock:
| | | javaMailSender.send(
| | | SimpleMailMessage: from=null; replyTo=null; to=blabla; cc=; bcc=; sentDate=null; subject=subject; text=content
| | | );
| | | -> at MailSenderServiceImpl.sendEmail(MailSenderServiceImpl.kt:42)
| | |
| | | javaMailSender.send(
| | | SimpleMailMessage: from=null; replyTo=null; to=blabla; cc=; bcc=; sentDate=null; subject=subject; text=content
| | | );
If you don't care about the parameter of send, leave any() empty:
verify(javaMailSender,times(12)).send(any())

Casting buffers as structures

I'm doing this as a learning exercise. The C++ book I'm studying from casts a buffer as a structure for easy manipulation and streaming. Everything seems fine until I try using an array (body) and look at the binary data in the buffer after assigning values. It doesn't match what I expect.
#include <iostream>
#include <bitset>
#include <netinet/in.h>
using namespace std;
struct dataStruct
{
uint32_t header;
uint32_t *body;
};
int main(int argc, char* argv[])
{
int size, streamSize;
// 4 bytes per size + 4 bytes for header
size = 1;
streamSize = (size * 4) + 4;
// Create a stream of bytes of appropriate size
uint8_t *buffer = new uint8_t[streamSize];
// Cast stream as structure
dataStruct *sStream = (dataStruct *)buffer;
// Populate structure with nice 101010... binary patterns
sStream->header = 2863311530;
sStream->body = new uint32_t[1];
sStream->body[0] = 2863311530;
cout << "Struct: " << sStream->header << ", " << sStream->body[0] << endl;
// Look at raw data in stream
for (int i=0; i<sizeof(buffer); i++)
{
std::bitset<8> x(buffer[i]);
cout << "[" << i << "]->" << x << endl;
}
return 0;
}
The output is:
Struct: 2863311530, 2863311530
[0]->10101010
[1]->10101010
[2]->10101010
[3]->10101010
[4]->00000000
[5]->00000000
[6]->00000000
[7]->00000000
Why is index 4-7 not the same as 0-3? Both sStream->header and sStream->body contain the same values. They are mapped to the buffer. Is this because body is an array? If so how would I manipulate the stream for this to work when using an array?
Thanks
You are using the uninitialized variable size in:
streamSize = (size * 4) + 4;
Everything after that which depends on streamSize is suspect and is a cause of undefined behavior.
Update
Even after size is initialized to 1, there are problems. Let me walk through the code and how it affects the memory you have allocated.
After you execute the line:
uint8_t *buffer = new uint8_t[streamSize];
you have buffer pointing to memory like this:
buffer
|
v
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
After you have executed the line:
dataStruct *sStream = (dataStruct *)buffer;
you have sStream pointing to the same memory like:
sStream
|
v
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
If your compiler does not add any padding between the members of dataStruct (the best-case scenario), you'll have:
sStream.header  sStream.body
|               |
v               v
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
If your compiler adds padding after the header member, sStream.body will live somewhere else. Worst-case scenario: you have a 64-bit compiler, which typically adds 32 bits of padding after header so that the 8-byte pointer is properly aligned. In that case, you will have:
sStream.header                  sStream.body
|                               |
v                               v
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
Then you will end up using memory you don't own when you try to assign anything to sStream.body, like:
sStream->body = new uint32_t[1];
Best-case scenario, you have a 32-bit compiler and no padding is added after header. It looks like you have a 64-bit compiler. Even if your compiler does not add any padding after header, you are still looking at a memory overrun if pointers are 64 bits wide (sizeof(void*) == 8), which I think is the case for you.
Let's take the best-case scenario of a 32-bit compiler that doesn't add any padding, so that the members of sStream map onto the allocated memory like:
sStream.header  sStream.body
|               |
v               v
+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+
After you execute the line:
sStream->header = 2863311530;
the memory looks like:
sStream.header  sStream.body
|               |
v               v
+---+---+---+---+---+---+---+---+
| 2863311530    |               |
+---+---+---+---+---+---+---+---+
After you execute the line:
sStream->body = new uint32_t[1];
the memory looks like:
sStream.header  sStream.body
|               |
v               v
+---+---+---+---+---+---+---+---+
| 2863311530    | SomeMemory    |
+---+---+---+---+---+---+---+---+
SomeMemory
|
v
+---+---+---+---+
|               |
+---+---+---+---+
After you execute the line:
sStream->body[0] = 2863311530;
SomeMemory gets populated and looks like:
SomeMemory
|
v
+---+---+---+---+
| 2863311530    |
+---+---+---+---+
I think you were surprised to see that the memory pointed to by buffer does not look like:
buffer
|
v
+---+---+---+---+---+---+---+---+
| 2863311530    | 2863311530    |
+---+---+---+---+---+---+---+---+
I hope it makes sense now why it does not.
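If the goal is to have the header and the body values sitting next to each other in one buffer, the values themselves have to be copied into it, rather than a pointer to them. A minimal sketch, assuming the body is available as a std::vector<uint32_t> (packMessage is a made-up name, and byte order is left as whatever the host uses):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: lays out the header followed by the body words in
// one contiguous byte buffer, in host byte order.
std::vector<std::uint8_t> packMessage(std::uint32_t header,
                                      const std::vector<std::uint32_t>& body)
{
    const std::size_t bodyBytes = body.size() * sizeof(std::uint32_t);
    std::vector<std::uint8_t> buffer(sizeof header + bodyBytes);

    // Bytes 0..3: the header value itself.
    std::memcpy(buffer.data(), &header, sizeof header);

    // Bytes 4..: the body values themselves, not a pointer to them.
    if (!body.empty())
        std::memcpy(buffer.data() + sizeof header, body.data(), bodyBytes);

    return buffer;
}
```

With a header of 2863311530 and a one-element body holding the same value, all eight bytes of the returned buffer hold the 10101010 pattern, which is the layout the question expected. Using memcpy also sidesteps the alignment and padding questions that come with casting the buffer to a struct pointer.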

Sorting Vector Alphabetically by Index Value

I have a vector that I want to sort alphabetically. I have been able to sort it alphabetically by the values of one index, but when I do, it only changes the order of that index and not of the entire vector. How can I get it to apply the order change to the entire vector?
This is my current code I am running:
std::sort (myvector[2].begin(), myvector[2].end(), compare);
bool icompare_char(char c1, char c2)
{
return std::toupper(c1) < std::toupper(c2);
}
bool compare(std::string const& s1, std::string const& s2)
{
if (s1.length() > s2.length())
return true;
if (s1.length() < s2.length())
return false;
return std::lexicographical_compare(s1.begin(), s1.end(),
s2.begin(), s2.end(),
icompare_char);
}
My general structure for this vector is vector[row][column] where:
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
For example if I had a vector:
myvector[0][0] = 'One' AND myvector[2][0]='b'
myvector[0][1] = 'Two' AND myvector[2][1]='a'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
And I sort it I get:
myvector[0][0] = 'One' AND myvector[2][0]='a'
myvector[0][1] = 'Two' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| a | b | c |
and not what I want:
myvector[0][0] = 'Two' AND myvector[2][0]='a'
myvector[0][1] = 'One' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| Two | One | Three |
| 2 | 1 | 3 |
| a | b | c |
I looked around for a good approach but could not find anything that worked... I was thinking something like:
std::sort (myvector.begin(), myvector.end(), compare);
Then handle the sorting of the third index within my compare function so the whole vector would get edited... but when I passed my data I either only changed the order in the function and still did not change the top layer or got errors. Any advice or help would be greatly appreciated. Thank you in advance.
Ideally, merge the 3 data fields into a struct, so that you have just one vector and can sort it simply.
#include <algorithm>
#include <string>
#include <vector>
struct DataElement{
    std::string str;
    char theChar;
    int num;
    // order elements by their character field
    bool operator<(const DataElement& other) const { return theChar < other.theChar; }
};
std::vector<DataElement> myvector;
std::sort (myvector.begin(), myvector.end());
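If the vector[row][column] layout has to stay, another option is to sort a list of column indices by the key row and then apply that order to every row. This is only a sketch under a few assumptions: all rows have the same length, every cell is a std::string, and the names Table and sortColumnsByRow are invented for illustration. It uses plain operator< on the key values; the case-insensitive compare from the question could be passed to the sort instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

using Table = std::vector<std::vector<std::string>>; // table[row][column]

// Hypothetical helper: reorders the columns of every row so that the
// values in row `keyRow` end up sorted. Assumes all rows have equal length.
void sortColumnsByRow(Table& table, std::size_t keyRow)
{
    assert(keyRow < table.size());
    const std::vector<std::string>& key = table[keyRow];

    // order[i] is the index of the column that should move to position i.
    std::vector<std::size_t> order(key.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&key](std::size_t a, std::size_t b) { return key[a] < key[b]; });

    // Apply the same column order to every row, including the key row.
    for (std::vector<std::string>& row : table) {
        std::vector<std::string> reordered;
        reordered.reserve(row.size());
        for (std::size_t column : order) reordered.push_back(row[column]);
        row = std::move(reordered);
    }
}
```

Called as sortColumnsByRow(myvector, 2) on the example from the question, this moves the "Two"/"2"/"a" column to the front while keeping each column's values together.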

Displays nothing when debugged [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to pass objects to functions in C++?
Main class
#include "List.h"
#include "Car.h"
#include "Worker.h"
#include "Queue.h"
#include <iostream>
#include <string>
using namespace std;
void initWorkerList(List<Worker>);
void initCarList(List<Car>, Queue, Queue);
int main() {
List<Worker> WorkerList;
List<Car> CarList;
Queue q1, q2;
initWorkerList(WorkerList);
initCarList(CarList, q1, q2); // Error here
//..... e.g cout << "Successful!"; but it does not displays it...
}
void initWorkerList(List<Worker> WorkerList) {
Worker w1 = Worker("Ben Ang", "Ben123", "pass123", 'M');
WorkerList.add(w1);
Worker w2 = Worker("Grace Eng", "Gr4ce", "loveGrace", 'W');
WorkerList.add(w2);
Worker w3 = Worker("Rebecca Xuan", "Xuanz", "Rebecca Xuan", 'W');
WorkerList.add(w3);
}
void initCarList(List<Car> CarList, Queue q1, Queue q2) {
Car c1 = Car("SJS1006Z","Toyota", "Saloon car");
Car c2 = Car("SFW6666E", "hyundai", "Taxi (Saloon)");
Car c3 = Car("SCF1006G","Mercedes", "Large Van");
Car c4 = Car("SBQ1006Z", "Puma", "Saloon Car");
q1.enqueue(c1);
q2.enqueue(c1);
q2.enqueue(c3);
q1.enqueue(c4);
q1.enqueue(c1);
q1.enqueue(c1);
q1.enqueue(c1);
q2.enqueue(c2);
q2.enqueue(c2);
}
There is no error at all, but nothing is displayed when I debug it. I have tried a few things, and my guess is that something is wrong with initCarList(CarList,q1,q2);, because the code after that call doesn't seem to work at all. Is there anything wrong with it? Thanks
You are passing the Queue variables by value rather than by reference.
initCarList(CarList, q1, q2); // Error here
So any change made in initCarList won't be reflected back to the caller.
Change your function signature to
void initCarList(List<Car> CarList, Queue& q1, Queue& q2) {
and the declaration to
void initCarList(List<Car>, Queue&, Queue&);
If you pass parameter by value, any change within initCarList is local to the function scope and does not get reflected back.
Pass by value
    Caller                 Callee
   |------|               |------|
   |wList |               |wList |
   | ___  |               | ___  |
   ||   | |-------->      ||   | |  <------
   ||___| |               ||___| |        |
   |      |               |      |        |
   |q1    |               |q1    |   (Changing any of these variables
   | ___  |               | ___  |    won't be reflected back)
   ||   | |-------->      ||   | |        |
   ||___| |               ||___| |        |
   |      |               |      |        |
   |q2    |               |q2    |        |
   | ___  |               | ___  |        |
   ||   | |-------->      ||   | |        |
   ||___| |               ||___| |  <------
   |______|               |______|
Pass by reference
    Caller                 Callee
   |------|               |------|
   |wList |               |wList |
   | ___  |               | ____ |
   ||   | |-------->      ||    ||  <------
   ||___|<|---------------||-*__||        |
   |      |               |      |        |
   |q1    |               |q1    |   (Changing any of these variables
   | ___  |               | ____ |    will be reflected back)
   ||   | |-------->      ||    ||        |
   ||___|<|---------------||-*__||        |
   |      |               |      |        |
   |q2    |               |q2    |        |
   | ___  |               | ____ |        |
   ||   | |-------->      ||    ||        |
   ||___|<|---------------||-*__||  <------
   |______|               |______|
You're passing your variables in by value, which means that the function's parameters hold a copy of them, which you modify and discard when the function ends. Pass by reference instead to modify the original variable. For example, initCarList would become:
void initCarList(List<Car> CarList, Queue &q1, Queue &q2)
You also don't use the CarList parameter, so you might as well take it out if this is how it is in your code.
Your functions take their parameters by value, which means they make copies of the variables passed in and operate on those copies. To modify the originals, you need to pass the parameters by reference.
Change:
void initWorkerList(List<Worker>);
void initCarList(List<Car>, Queue, Queue);
To:
void initWorkerList(List<Worker> &);
void initCarList(List<Car>&, Queue&, Queue&);
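The difference is easy to reproduce in isolation. The sketch below does not use the List and Queue classes from the question (their definitions aren't shown); it substitutes std::queue<int>, and the two function names are made up for the comparison.

```cpp
#include <queue>

// Takes the queue by value: the function works on its own private copy,
// which is destroyed when the function returns.
void fillByValue(std::queue<int> q)
{
    q.push(1);
    q.push(2);
}

// Takes the queue by reference: the pushes land in the caller's object.
void fillByReference(std::queue<int>& q)
{
    q.push(1);
    q.push(2);
}
```

After calling fillByValue(a) the caller's queue a is still empty, whereas after fillByReference(b) the queue b holds two elements. The same applies to WorkerList, CarList, q1 and q2 in the question.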