Filling map with 2 keys from a string. Character and frequency c++

I am new to maps, so I am a little unsure of the best way to do this. This task relates to compression with Huffman coding. Here's what I have.
#include <map>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
typedef map<char,int> huffmanMap;
void getFreq(string file, map<char, int> map)
{
map.clear();
for (string::iterator i = file.begin(); i != file.end(); ++i) {
++map[*i];
}
}
Above is one method I found online, but I was unable to print anything with it.
int main()
{
map<char, int> huffmanMap;
string fileline;
ifstream myfile;
myfile.open("text.txt",ios::out);
while(!myfile.eof()) {
getline(myfile, fileline); //get the line and put it in the fileline string
}
myfile.close();
I read from a text file to populate the string fileline.
for (int i=0; i<fileline.length(); i++) {
char t = fileline[i];
huffmanMap[i]? huffmanMap[i]++ : huffmanMap[i]=1;
}
Here is a second method I tried for populating the map; the char values are incorrect (symbols and smiley faces).
getFreq(fileline,huffmanMap);
huffmanMap::iterator position;
for (position = huffmanMap.begin(); position != huffmanMap.end(); position++) {
cout << "key: \"" << position->first << endl;
cout << "value: " << position->second << endl;
}
This is how I tried to print map
system("pause");
return 0;
}
When I run my getFreq method, the program crashes. I don't get any errors with either method. With the second method the char values are nonsense. Note that I have not had both methods running at the same time; I just included them both to show what I have tried.
Any insight would be appreciated. Thanks. Be lenient, I'm a beginner ;)

Your code is all over the place; it's not very coherent, so the flow is difficult to understand.
Here are some low-lights:
This is wrong: myfile.open("text.txt",ios::out); - why would you open an input stream with the out flag? It should simply be:
string fileline;
ifstream myfile("text.txt");
while(getline(myfile, fileline)) {
// now use fileline.
}
In the while loop, what you want to do is to iterate over the content and add it to your map? So now the code looks like:
string fileline;
ifstream myfile("text.txt");
while(getline(myfile, fileline)) {
getFreq(fileline, huffmanMap);
}
Next fix, this is wrong: you have a typedef and a variable of the same name!
typedef map<char,int> huffmanMap;
map<char, int> huffmanMap;
Use sensible naming
typedef map<char,int> huffmanMap_Type;
huffmanMap_Type huffmanMap;
Next fix: your getFreq method signature is wrong. You are passing the map by value (i.e. a copy) rather than by reference, so your modification in the function is made to a copy, not the original!
wrong: void getFreq(string file, map<char, int> map)
correct: void getFreq(string file, huffmanMap_Type& map)
Next: why clear() in the above method? What if there is more than one line? No need for that surely?
That's enough for now, clean up your code and update your question if there are more issues.

One fix and One improvement.
Fix is : make second parameter in getFreq reference:
void getFreq(string file, map<char, int> & map); //notice `&`
Improvement is : just write
huffmanMap[i]++;
instead of
huffmanMap[i]? huffmanMap[i]++ : huffmanMap[i]=1;
After all, by writing huffmanMap[i]? you're checking whether it's zero or not. If zero, then you make it one, which is same as huffmanMap[i]++.
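Neither answer spells out why the second method printed symbols and smiley faces: the loop indexes the map with the position i instead of the character t, so the keys are the control characters 0, 1, 2, and so on. A minimal sketch that combines both fixes (pass the map by reference, use the character as the key) might look like this. The names countChars and totalCount are made up for illustration and do not appear in the question:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper (not from the original post): adds the characters of
// `text` to `freq`. The map is taken by reference, so the caller's map is
// updated, and it is not cleared, so counts accumulate across several lines.
void countChars(const std::string& text, std::map<char, int>& freq) {
    for (char c : text) {
        ++freq[c];  // the key is the character itself, not its position
    }
}

// Hypothetical sanity check: total number of characters counted so far.
int totalCount(const std::map<char, int>& freq) {
    int total = 0;
    for (const auto& entry : freq) {
        assert(entry.second > 0);  // operator[] followed by ++ never leaves a zero
        total += entry.second;
    }
    return total;
}
```

Inside the getline loop from the first answer this would be called as countChars(fileline, huffmanMap); once per line.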

An answer using language features from C++20.
But first, you were asking about getting the count (frequency) of letters in a text.
There is a nearly universal approach for this: we can use std::unordered_map. It is described in the C++ reference here.
It is std::unordered_map's very convenient index operator [] that makes counting simple. The operator searches for the key and returns a reference to the value mapped to it. If the key does not exist, it first inserts a new key/value pair with a value-initialized value (0 for integers) and returns a reference to that value.
So either way, a reference to the value is returned, and it can be incremented. Example:
With a "std::unordered_map<char, int> mymap{};" and the text "aba", the following happens with the index operator:
mymap['a'] searches for 'a' in the map. It is not found, so a new entry for 'a' with value 0 is created and a reference to that value is returned. If we now increment it, we increment the counter. So mymap['a']++ inserts the key/value pair 'a'/0 and then increments it, resulting in 'a'/1.
For 'b', the same happens.
For the next 'a', an entry is found in the map, so a reference to its value (1) is returned. This is incremented and becomes 2.
And so on.
By using some modern language elements, a whole file can be read and its characters counted with one simple for-loop:
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
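One caveat, stated here as a general property of stream extraction rather than something taken from the answer: istream_view<char> reads with operator>>, which skips whitespace by default, so spaces and newlines never reach the counter. If those should be counted too (usually the case for compression), an unformatted read is one option. A minimal sketch using std::istreambuf_iterator follows; the function names countAll and countSkippingWhitespace are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <unordered_map>

using Counter = std::unordered_map<char, std::size_t>;

// Hypothetical helper: counts every character in the stream, including
// spaces and newlines, by reading the stream buffer directly.
Counter countAll(std::istream& is) {
    Counter counter;
    for (std::istreambuf_iterator<char> it(is), end; it != end; ++it) {
        ++counter[*it];
    }
    return counter;
}

// The same count with formatted extraction, for comparison:
// operator>> skips whitespace, so ' ' and '\n' are never seen.
Counter countSkippingWhitespace(std::istream& is) {
    Counter counter;
    char c;
    while (is >> c) ++counter[c];
    return counter;
}
```

Both functions accept any std::istream, so an opened std::ifstream can be passed in the same way as the std::istringstream used for testing.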
Additional information:
For building a Huffman tree, we can use a min-heap, which can easily be implemented with the existing std::priority_queue. Please read about it here.
With 4 lines of code, the complete Huffman tree can be built.
At the end, we put the result in a code book, again a std::unordered_map, and show the result to the user.
This could then be implemented, for example, like this:
#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <algorithm>
#include <queue>
#include <ranges>
#include <vector>
#include <utility>
namespace rng = std::ranges; // Abbreviation for the ranges namespace
using namespace std::string_literals; // And we want to use string literals
// The Node of the Huffman tree
struct Node {
char letter{ '\0' }; // The letter that we want to encode
std::size_t frequency{}; // The letters frequency in the source text
Node* left{}, *right{}; // The pointers to the children of this node
};
// Some abbreviations to reduce typing work and make code more readable
using Counter = std::unordered_map<char, std::size_t>;
struct Comp { bool operator ()(const Node* n1, const Node* n2) { return n1->frequency > n2->frequency; } };
using MinHeap = std::priority_queue<Node*, std::vector<Node*>, Comp>;
using CodeBook = std::unordered_map<char, std::string>;
// Traverse the Huffman tree and build the code book
void buildCodeBook(Node* root, std::string code, CodeBook& cb) {
if (root == nullptr) return;
if (root->letter != '\0') cb[root->letter] = code;
buildCodeBook(root->left, code + "0"s, cb);
buildCodeBook(root->right, code + "1"s, cb);
}
// Get the top most two Elements from the Min-Heap
std::pair<Node*, Node*> getFrom(MinHeap& mh) {
Node* left{ mh.top() }; mh.pop();
Node* right{ mh.top() }; mh.pop();
return { left, right };
}
// Demo function
int main() {
if (std::ifstream ifs{ "r:\\lorem.txt" }; ifs) {
// Define the most important resulting work products
Counter counter{};
MinHeap minHeap{};
CodeBook codeBook{};
// Read complete text from source file and count all characters ---------
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
// Build the Huffman tree -----------------------------------------------
// First, create a min heap, based on the letters' frequency
for (const auto& p : counter) minHeap.push(new Node{p.first, p.second});
// Compress the nodes
while (minHeap.size() > 1u) {
auto [left, right] = getFrom(minHeap);
minHeap.push(new Node{ '\0', left->frequency + right->frequency, left, right });
}
// And last but not least, generate the code book -----------------------
buildCodeBook(minHeap.top(), {}, codeBook);
// And, as debug output, show the code book -----------------------------
for (const auto& [letter, code] : codeBook) std::cout << '\'' << letter << "': " << code << '\n';
}
else std::cerr << "\n\n***Error: Could not open source text file\n\n";
}
You may notice that we use new to allocate memory but never delete it afterwards.
We could add the delete statements at the appropriate positions, but instead I will show a modified solution using smart pointers.
Please see here:
#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <map>
#include <algorithm>
#include <queue>
#include <ranges>
#include <vector>
#include <utility>
#include <memory>
namespace rng = std::ranges; // Abbreviation for the ranges namespace
using namespace std::string_literals; // And we want to use string literals
struct Node; // Forward declaration
using UPtrNode = std::unique_ptr<Node>; // Using smart pointer for memory management
// The Node of the Huffman tree
struct Node {
char letter{ '\0' }; // The letter that we want to encode
std::size_t frequency{}; // The letters frequency in the source text
UPtrNode left{}, right{}; // The pointers to the children of this node
};
// Some abbreviations to reduce typing work and make code more readable
using Counter = std::unordered_map<char, std::size_t>;
struct CompareNode { bool operator ()(const UPtrNode& n1, const UPtrNode& n2) { return n1->frequency > n2->frequency; } };
using MinHeap = std::priority_queue<UPtrNode, std::vector<UPtrNode>, CompareNode>;
using CodeBook = std::map<Counter::key_type, std::string>;
// Traverse the Huffman tree and build the code book
void buildCodeBook(UPtrNode&& root, std::string code, CodeBook& cb) {
if (root == nullptr) return;
if (root->letter != '\0') cb[root->letter] = code;
buildCodeBook(std::move(root->left), code + "0"s, cb);
buildCodeBook(std::move(root->right), code + "1"s, cb);
}
// Get the top most two elements from the Min-Heap
std::pair<UPtrNode, UPtrNode> getFrom(MinHeap& mh) {
UPtrNode left = std::move(const_cast<UPtrNode&>(mh.top()));mh.pop();
UPtrNode right = std::move(const_cast<UPtrNode&>(mh.top()));mh.pop();
return { std::move(left), std::move(right) };
}
// Demo function
int main() {
if (std::ifstream ifs{ "r:\\lorem.txt" }; ifs) {
// Define the most important resulting work products
Counter counter{};
MinHeap minHeap{};
CodeBook codeBook{};
// Read complete text from source file and count all characters ---------
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
// Build the Huffman tree -----------------------------------------------
// First, create a min heap, based on the letters' frequency
for (const auto& p : counter) minHeap.push(std::make_unique<Node>(Node{ p.first, p.second }));
// Compress the nodes
while (minHeap.size() > 1u) {
auto [left, right] = getFrom(minHeap);
minHeap.push(std::make_unique<Node>(Node{ '\0', left->frequency + right->frequency, std::move(left), std::move(right) }));
}
// And last but not least, generate the code book -----------------------
buildCodeBook(std::move(const_cast<UPtrNode&>(minHeap.top())), {}, codeBook);
// And, as debug output, show the code book -----------------------------
for (std::size_t k{}; const auto & [letter, code] : codeBook) std::cout << ++k << "\t'" << letter << "': " << code << '\n';
}
else std::cerr << "\n\n***Error: Could not open source text file\n\n";
}

Related

I don't understand why my sort on a string breaks everything

I have the following code:
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
#include <unordered_map>
using namespace std;
vector<vector<string>> findAnagrams(vector<string> wordlist) {
vector<vector<string>> result;
unordered_map<string, vector<string>*> indexes;
for (const string& word : wordlist) {
string wordSorted = word;
sort(wordSorted.begin(), wordSorted.end()); // <= This line break everything
auto index = indexes.find(wordSorted);
if (index == indexes.end()) {
vector<string> vec = { word };
result.push_back(vec);
indexes[wordSorted] = &vec;
} else {
index->second->push_back(word);
}
}
return result;
}
int main()
{
vector<string> wordlist = {"eat", "tea", "tan", "ate", "nat", "bat", "test", "estt"};
auto result = findAnagrams(wordlist);
for (const auto& vec : result) {
for (const auto& word : vec) {
cout << word << " ";
}
cout << endl;
}
return 0;
}
This code detects all anagrams in a list of given words.
As my comment says, when I sort wordSorted using std::sort, it breaks everything and my code ends with a bad_alloc, as if std::sort manipulated memory outside of wordSorted. If I remove this specific line, the code "works" (the result is obviously wrong, but it does what it should do).
How is this possible? What am I missing?

I'm guessing these lines are the main cause of your problem:
{
vector<string> vec = { word };
result.push_back(vec);
indexes[wordSorted] = &vec;
}
Here you store a pointer to the local variable vec in the indexes map. When the block ends at } the life-time of vec also ends, and the pointer you just stored will become invalid.
Any use of this pointer will lead to undefined behavior.
It seems to me that the solution is to simply not store pointers to the vector (pointers to containers are seldom, if ever, needed), and instead store a copy.
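One way to avoid the dangling pointer, sketched here as a variation on the advice above rather than as the answerer's own code, is to store the index of the group inside result instead of a pointer. An index stays valid when result grows, whereas a pointer to an element of result would be invalidated by a reallocation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Groups anagrams. The map stores an index into `result`
// instead of a pointer to a block-local vector.
std::vector<std::vector<std::string>> findAnagrams(const std::vector<std::string>& wordlist) {
    std::vector<std::vector<std::string>> result;
    std::unordered_map<std::string, std::size_t> indexes;  // sorted word -> position in result
    for (const std::string& word : wordlist) {
        std::string wordSorted = word;
        std::sort(wordSorted.begin(), wordSorted.end());
        auto found = indexes.find(wordSorted);
        if (found == indexes.end()) {
            indexes[wordSorted] = result.size();  // index of the group about to be added
            result.push_back(std::vector<std::string>{word});
        } else {
            assert(found->second < result.size());
            result[found->second].push_back(word);
        }
    }
    return result;
}
```

With the word list from the question this yields four groups, in order of first appearance.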

Find X-largest values in a large file with optional input file command line parsing method in C++

I have a file in the following fixed format:
<unique record identifier> <white_space> <numeric value>
e.g.
1426828011 9
1426828028 350
1426828037 25
1426828056 231
1426828058 109
1426828066 111
.
.
.
I want to write a program that reads from 'stdin' the contents of a file, and optionally accepts
the absolute path of a file from the command line. The file/stdin stream is expected
to be in the above format. The output should be a list of the unique ids associated
with the X-largest values in the rightmost column, where X is specified by an input
parameter.
For example, given the input data above and X=3, the following would be
valid output:
1426828028
1426828066
1426828056
Note that the output does not need to be in any particular order. Multiple instances
of the same numeric value count as distinct records of the total X. So if we have 4
records with values: 200, 200, 115, 110 and X=2 then the result must consist of the two
IDs that point to 200 and 200 and no more.
Notice: take into account extremely large files.
My idea and brief implementation:
Sorting by k-largest values
1st way: I want to read file content into multimap then iterate k elements to output
2nd way: Read file data into a vector<pair<int, int>> then use heap sort (priority queue).
I'm wondering which way has better time complexity & higher performance? Time complexity of 2nd way should be nlog(n). Is time complexity of 1st way log(n)? Please tell me both time & space complexity of the above methods and suggest any other better methods.
Besides, the input file is huge, so I think of using external sort. But I haven't done it before. I'd appreciate if someone can instruct me or write sample code of it for my better understanding.
Anyways, it's not required to sort output. We only need to print X-largest values in any order. So I'm wondering whether I need to do any sorting algorithm. The requirement to print the X-largest values in any order is weird, because we must sort it in descending order before printing. So I even don't know why it says "in any order" as if it makes the problem easier.
My brief code:
#include <algorithm>
#include <cassert>
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>
//#include "stdafx.h"
using namespace std;
std::multimap<int, int> mp;
typedef std::pair<std::string, int> mypair;
struct IntCmp {
bool operator()(const mypair &lhs, const mypair &rhs) {
return lhs.second < rhs.second;
}
};
void printK(const std::map<std::string,int> &mymap, int k) {
std::vector<mypair> myvec(mymap.begin(), mymap.end());
assert(myvec.size() >= k);
std::partial_sort(myvec.begin(), myvec.begin() + k, myvec.end(), IntCmp());
for (int i = 0; i < k; ++i) {
std::cout << i << ": " << myvec[i].first
<< "-> " << myvec[i].second << "\n";
}
}
void readinfo(std::istream& in)
{
std::string s, ID, Value;
//while(getline(in, s))
while(in >> ID >> Value)
std::cout << s << '\n';
}
int main (int argc, char **argv) {
if (argc > 1) { /* read from file if given as argument */
std::ifstream fin (argv[1]);
if (fin.is_open())
readinfo(fin);
else
{
std::cerr << "error: Unable to open file " << argv[1] << '\n';
return 1;
}
}
else
// No input file has been passed in the command line.
// Read the data from stdin (std::cin).
{
readinfo(std::cin);
}
return 0;
}
But I don't know how to split the huge file to sort and combine back together. Please tell me how to fix my code for this problem.

Maybe you could use a min-heap via std::priority_queue:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <queue>
#include <string>
#include <vector>
struct IdAndValue {
std::string id;
int value;
};
struct ValueCmp {
bool operator()(const IdAndValue &lhs, const IdAndValue &rhs) {
return lhs.value > rhs.value;
}
};
void PrintTopK(std::istream &in, long k) {
std::priority_queue<IdAndValue, std::vector<IdAndValue>, ValueCmp> largest_k;
std::string id;
int value;
while (in >> id >> value) {
if (largest_k.size() < k) {
largest_k.push({.id = id, .value = value});
} else {
if (value > largest_k.top().value) {
largest_k.pop();
largest_k.push({.id = id, .value = value});
}
}
}
std::cout << "Top " << k << " IDs with largest values:\n";
while (!largest_k.empty()) {
IdAndValue id_and_value = largest_k.top();
largest_k.pop();
std::cout << id_and_value.id << '\n';
}
}
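int main(int argc, char **argv) {
if (argc < 2) { // k is mandatory; argv[1] would be null otherwise.
std::cerr << "Usage: " << argv[0] << " [file] <k>\n";
return 1;
}
const long k = std::strtol(argv[argc > 2 ? 2 : 1], nullptr, 10);
if (k <= 0) { // PrintTopK calls top() and needs room for at least one entry.
std::cerr << "Error: k must be a positive integer\n";
return 1;
}
if (argc > 2) { // Read from file if given as argument.
std::ifstream fin(argv[1]);
if (fin.is_open())
PrintTopK(fin, k);
else {
std::cerr << "Error: Unable to open file " << argv[1] << '\n';
return 1;
}
} else { // Read the data from stdin (std::cin).
PrintTopK(std::cin, k);
}
return 0;
}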
Usage from stdin (Ctrl + D to send EOF on unix):
./PrintTopK 3
1426828011 9
1426828028 350
1426828037 25
1426828056 231
1426828058 109
1426828066 111
Top 3 IDs with largest values:
1426828066
1426828056
1426828028
Usage when passed in a file:
$ ./PrintTopK /Users/shash/CLionProjects/PrintTopK/data.txt 3
Top 3 IDs with largest values:
1426828066
1426828056
1426828028
With data.txt:
1426828011 9
1426828028 350
1426828037 25
1426828056 231
1426828058 109
1426828066 111

I think we can come up with a better approach that has a lower space and time complexity.
One requirement is to get the X largest values, so we only need to store X values, not more. The others are of no interest. Every value is read, and if it is not larger than the values already collected, it is discarded. That saves a lot of memory.
Next, how to store them?
If we have an already sorted container, the smallest element is always at the beginning. So when we read a new value, we only need to compare it with the first element in the container: if the new value is not bigger than the smallest stored value, we can discard it. If it is bigger, we need to add it to our container and remove the smallest element.
A function like std::lower_bound gives us the exact position at which to insert the new element, so no re-sorting is needed. After the insert and the removal of the front element, we have a new smallest value.
To select the type of container, we think about the operations that we need:
- We want to eliminate the first element (without shifting all other data).
- We want to add an element at a given position, without the need to shift all following values to the right.
This leads us to std::list, which offers both operations in constant time once the position is known.
So, how will we implement the solution?
Define a struct to hold the data: the unique id and the associated value
Add extraction >> and insertion << operators for easier IO
Add two operator< overloads, one for std::list::sort and one for std::lower_bound
In main, get a std::istream either to a given source file or to std::cin
Read the first X values and store them in the list as is. If there are only these X values or fewer, then we have the solution already
Sort the values in the list. The smallest value is now at the front of the list
If the std::istream still contains data, then continue to read values
If the new value is greater than the smallest value in the list, then add the value to the list at its sorted position
Delete the smallest value at the front.
After the initial sort, values that are not larger than the current smallest are rejected with a single comparison. An accepted value costs at most O(X) work: std::lower_bound needs only O(log X) comparisons, but it has to step through the list linearly because std::list iterators are not random access; the insert and the pop_front themselves take constant time. The cost per record therefore does not depend on the size of the file. In any case, most of the time will be spent reading data from disk if the file is huge. For small files with 100'000 values or so, any other algorithm would be fine as well.
Please see one of many potential solutions below.
#include <iostream>
#include <fstream>
#include <random>
#include <string>
#include <list>
#include <limits>
#include <algorithm>
#include <iterator>
#include <memory>
const std::string fileName{ "r:\\big.txt" };
// ----------------------------------------------------------------------------
// Create a big file for Test purposes
void createBigFile() {
if (std::ofstream ofs(fileName); ofs) {
constexpr size_t uniqueIDStart = 1'426'828'028;
constexpr size_t numberOfRecords = 10'000'000;
constexpr size_t endRecords = uniqueIDStart + numberOfRecords;
std::random_device randomDevice;
std::mt19937 randomEngine(randomDevice());
std::uniform_int_distribution<int> uniformDistribution(1, 10'000'000);
for (size_t k{ uniqueIDStart }; k < endRecords; ++k) {
ofs << k << ' ' << uniformDistribution(randomEngine) << '\n';
}
}
}
// ----------------------------------------------------------------------------
// Here we will store our data
struct Data {
unsigned int ID{};
int value{ std::numeric_limits<int>::max() };
// Sorting operators
bool operator < (const int& i) const { return value < i; } // For lower_bound
bool operator < (const Data& other) const { return value < other.value; } // For sort
// Simple extractor and inserter
friend std::istream& operator >> (std::istream& is, Data& d) { return is >> d.ID >> d.value; }
friend std::ostream& operator << (std::ostream& os, const Data& d) { return os << d.ID << ' ' << d.value; }
};
// Whatever number of X you need for the X-largest values
constexpr size_t Rank = 50;
// We will use a list to store the X-largest data
using DList = std::list<Data>;
using ListIter = DList::iterator;
// For faster reading we will increase the input buffer size
constexpr size_t ifStreamBufferSize = 500'000u;
static char buffer[ifStreamBufferSize];
int main(int argc, char* argv[]) {
// If you want to create test data, then uncomment the following line
//createBigFile();
//We will either read from std::cin or from a file
std::shared_ptr<std::istream> input{};
if (argc == 2) {
// Try to open the source file, given by command line arguments
input.reset(new std::ifstream(argv[1]));
input->rdbuf()->pubsetbuf(buffer, ifStreamBufferSize);
}
else {
// Use std::cin for input. Handover a NoOp custom deleter. We do not want to close std::cin
input.reset(&std::cin, [](...) {});
}
// If file could be opened or if std::cin is OK
if (input) {
// Here we will store all data
DList dList;
// Read the first X values as is
size_t numberOfElementsInArray{};
Data data;
while (*input >> data) {
if (numberOfElementsInArray < Rank) {
dList.push_front(std::move(data));
++numberOfElementsInArray;
}
if (numberOfElementsInArray >= Rank) break;
}
// And do a first time (and the only sort)
dList.sort();
// For comparison
int smallest{ dList.front().value };
// Read all data from file
while (*input >> data) {
// If the value just read is bigger than the smallest in the list, we need to add it
if (data.value > smallest) {
// Find the position where to insert the new element
ListIter dIter = std::lower_bound(dList.begin(), dList.end(), data.value);
// Insert the new value where it belongs, which keeps the list sorted.
// dIter is end() when the value is larger than everything stored;
// inserting at end() is valid and must not be skipped.
dList.insert(dIter, std::move(data));
// We now have one value too many. Drop the smallest one and fetch the new smallest value.
dList.pop_front();
smallest = dList.front().value;
}
}
std::copy(dList.rbegin(), dList.rend(), std::ostream_iterator<Data>(std::cout, "\n"));
}
else std::cerr << "*** Error with input file (" << fileName << ") or with std::cin\n\n";
}
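If the linear stepping of std::lower_bound over a std::list becomes noticeable for large X, a std::multiset is one alternative: it keeps its elements ordered and inserts in logarithmic time. This is a different container from the one used in the answer above, shown only as a sketch. The function name topIds and the choice to keep ids as strings are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the ids of the x largest values read from `in`.
// Each input record is "<id> <value>".
std::vector<std::string> topIds(std::istream& in, std::size_t x) {
    // (value, id) pairs ordered by value first; duplicate values are allowed.
    std::multiset<std::pair<int, std::string>> best;
    std::string id;
    int value;
    while (in >> id >> value) {
        if (x == 0) continue;  // nothing to keep
        if (best.size() < x) {
            best.emplace(value, id);
        } else if (value > best.begin()->first) {
            best.erase(best.begin());  // drop the current smallest
            best.emplace(value, id);
        }
    }
    assert(best.size() <= x);  // memory use is bounded by x, not by the file size
    std::vector<std::string> ids;
    for (const auto& entry : best) ids.push_back(entry.second);
    return ids;  // ordered from the smallest to the largest kept value
}
```

As in the list version, only X records are held in memory at any time, so the size of the input file does not matter.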

How can I order a map that has duplicated values in C++?

I am trying to reorder the map in descending order of its values. I have been trying to create a new map and insert the entry with the biggest value first, but the map keeps being ordered by its keys.
I have also tried swapping key and value so that the map is ordered by the count, but then I lose some data, because more than one key has the same value.
#include <iostream>
#include "SymbolFreq.h"
#include <string>
#include <fstream>
#include <streambuf>
#include <map>
using namespace std;
int main()
{
map <char, int> mymap;
map <char, int> realmap;
ifstream infile{ "ToCompress.txt" };
std::string str((std::istreambuf_iterator<char>(infile)),
std::istreambuf_iterator<char>());
std::map<char, int>::iterator itera;
for (auto it = str.begin(); it != str.end(); ++it)
{
itera = mymap.find(*it);
if (itera != mymap.end())
{
itera->second++;
}
else
{
mymap.insert({ *it, 1 });
}
}
int max = 0;
char provisionalChar;
int provisionalInt;
while (mymap.empty() == false)
{
for (auto it = mymap.cbegin(); it != mymap.cend(); ++it)
{
if (it->second > max)
{
max = it->second;
provisionalChar = it->first;
provisionalInt = it->second;
}
//cout << it->first << "\t" << it->second << "\n";
}
mymap.erase(provisionalChar);
realmap.insert({ provisionalChar, provisionalInt });
max = 0;
}
for (auto it = realmap.cbegin(); it != realmap.cend(); ++it)
{
cout << it->first << "\t" << it->second << "\n";
}
return 0;
}

If I understand the question properly, you'd like to count how many times each char appears in the file and then produce a map sorted with the char that appeared most time first.
Here's one idea:
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <unordered_map>
int main() {
std::ifstream infile{"ToCompress.txt"};
// "mymap" is only used for counting how many times each char appears.
std::unordered_map<char, std::uintmax_t> mymap;
// Loop directly over the file. No vector needed:
std::for_each(std::istreambuf_iterator<char>(infile),
std::istreambuf_iterator<char>(), [&mymap](char ch) {
// No need to find first - operator[] inserts an element
// for the key ("ch") if it's missing.
++mymap[ch];
});
// Transform the unordered_map into a multimap where the count is the key
// and in which we use a descending sort order (std::greater):
std::multimap<std::uintmax_t, char, std::greater<std::uintmax_t>> realmap;
std::transform(mymap.begin(), mymap.end(),
std::inserter(realmap, realmap.end()),
[](const auto& ch_co) -> std::pair<std::uintmax_t, char> {
// return a pair with key and value swapped
return {ch_co.second, ch_co.first};
});
// Print the result
for(auto& [count, ch] : realmap) {
std::cout << count << '\t' << ch << '\n';
}
}
Possible output:
537479
120204 t
113285 e
80681
80670 i
79862 n
77984 r
77464 s
69994 o
67377 a
...
Apparently, <space>, t, e and \n are the most common characters in my C++ programs (which is what I used as input).

Your question may be ill-posed; take a step back and state what you are really trying to accomplish.
That said, I'll attempt an answer based on what you've written.
It looks like you're trying to sort an std::map by value, in which case your question is a duplicate of either this or this question.
Regarding your initial attempt:
Take a look at this table. Only sequence containers allow you to directly influence order. As with priority queue you have limited control over the order of associative containers and almost zero control for unordered containers.
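Since a sequence container gives direct control over order, one common way to get a std::map's entries ordered by value is to copy them into a std::vector and sort that. The following is only a sketch of this idea; the function name sortedByCount is made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: copies the map's entries into a vector and orders them
// by count, largest first. Entries with equal counts keep the map's key order
// because std::stable_sort preserves the relative order of equal elements.
std::vector<std::pair<char, int>> sortedByCount(const std::map<char, int>& freq) {
    std::vector<std::pair<char, int>> entries(freq.begin(), freq.end());
    // Unlike a second map keyed by count, no entry is lost to duplicate counts.
    assert(entries.size() == freq.size());
    std::stable_sort(entries.begin(), entries.end(),
                     [](const std::pair<char, int>& a, const std::pair<char, int>& b) {
                         return a.second > b.second;
                     });
    return entries;
}
```

The original map is left untouched and can still be used for lookups by character.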

Choose an element from a std::vector other than the given element

I have a std::vector of strings say
std::vector<std::string> MyVect = {
"CIRCLE","SQUARE","RECTANGLE","TRIANGLE","LINE"
};
And I am generating a string. My question is if the generated string is "SQUARE", the code should choose any element other than "SQUARE" from MyVect(like choosing CIRCLE or RECTANGLE or TRIANGLE but not SQUARE).
I am very new to cocos2d-x and c++.
Please Help!
Thanks

// OutputVect ends up holding every element other than the generated string.
#include <iostream>
#include <string>
#include <vector>
int main() {
    const std::string generated = "SQUARE";
    const std::vector<std::string> MyVect = {
        "CIRCLE", "SQUARE", "RECTANGLE", "TRIANGLE", "LINE"
    };
    std::vector<std::string> OutputVect;
    for (const auto& str : MyVect) {
        if (str != generated) {
            OutputVect.push_back(str);
        }
    }
    // Any element of OutputVect is now a valid choice.
    for (const auto& str : OutputVect) std::cout << str << '\n';
}

One solution is to keep another vector of ints choicesUsed and every time you select a random element, add the vector index to it. Then when next choosing your random element, ignore any choices where the index you've selected is inside your choicesUsed vector.

You could use a function template provided by the <algorithm> header like this:
#include <algorithm>
#include <cassert>
#include <iostream>
#include <string>
std::string generated = "...";
// MyVect is the vector from the question.
const auto firstNonEqual = std::find_if(MyVect.cbegin(), MyVect.cend(),
[&generated](const auto& element){ return element != generated; });
assert(firstNonEqual != MyVect.cend());
std::cout << "selected: " << *firstNonEqual << "\n";
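The answers above either collect all non-matching elements or take the first one. If a random element other than the generated string is wanted (an assumption about the intent, since the question says "any element"), the two ideas can be combined. A minimal sketch, where pickOtherThan is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: returns a random element of `items` that differs from
// `excluded`. Precondition: at least one such element exists.
std::string pickOtherThan(const std::vector<std::string>& items,
                          const std::string& excluded,
                          std::mt19937& rng) {
    std::vector<std::string> candidates;
    for (const std::string& item : items) {
        if (item != excluded) candidates.push_back(item);
    }
    assert(!candidates.empty());
    // Both bounds of the distribution are inclusive.
    std::uniform_int_distribution<std::size_t> dist(0, candidates.size() - 1);
    return candidates[dist(rng)];
}
```

The generator is passed in by reference so that it can be seeded once, for example with std::random_device, and reused across calls.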

How to use vector and struct?

I need to count letters from the string, sort them by count and cout results. For this purpose I'm trying to use vector and struct. Here is part of my code, but it's not working, because I don't know how to implement something:
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
struct int_pair{
int key;
int value;
};
bool sort_by_value(int_pair left, int_pair right){
return left.value < right.value;
}
int main() {
string characters = "aasa asdfs dfh f ukjyhkh k wse f sdf sdfsdf";
vector<int_pair> most_frequent;
for (string::size_type i = 0; i <= characters.length(); i++) {
int int_char = (int)characters[i];
most_frequent[int_char]++; <-- I want to do something like this, but it's not working
}
sort(most_frequent.begin(), most_frequent.end(), sort_by_value);
for (vector<int_pair>::iterator it = most_frequent.begin(); it != most_frequent.end(); ++it) <-- is this call correct?
cout << " " << it->key << ":" << it->value << endl;
return 0;
}
In this code there are two parts that I don't know how to deal with:
most_frequent[int_char]++; <-- I want to do something like this, but it's not working
and
for (vector<int_pair>::iterator it = most_frequent.begin(); it != most_frequent.end(); ++it) <-- is this call correct?
Maybe you can see other mistakes and potential issues in this code.

I would use a std::map to determine the frequency of each letter, then copy that into a multimap while reversing the key and value to get them in order.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <utility>
template<class T, class U>
std::pair<U,T> flip_pair(const std::pair<T,U>& p) {
return std::make_pair(p.second,p.first);
}
int main(){
std::string characters = "zxcvopqiuweriuzxchajksdui";
std::map<char,int> freq;
std::multimap<int,char> rev_freq;
// Calculate the frequency of each letter.
for(char c: characters){
freq[c]++;
}
// Copy the results into a multimap with the key and value flipped
std::transform(std::begin(freq), std::end(freq),
std::inserter(rev_freq, rev_freq.begin()),
flip_pair<char,int>);
// Print out the results in order.
for(std::pair<int,char> p : rev_freq){
std::cout << p.first << ": " << p.second << std::endl;
}
}

This should do what you need:
most_frequent[int_char].key = int_char;
most_frequent[int_char].value++;
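Yes, it sets the key many times, even though it doesn't need to. Note that this only works if the vector already has an element at every index used, for example when it is created as vector<int_pair> most_frequent(256); indexing into an empty vector is undefined behavior.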

When accessing the container with the key (vector is indexed with an integer, which is "the key" in your case), you don't have to store the key in the value field of the container again.
So you don't need your struct, since you only need the value field and can store the number of occurrences directly in the vector.
The idea is to fill the vector with 256 integers in the beginning, all initialized to zero. Then, use the vector index as your "key" (character code) to access the elements (number of occurrences).
This results in code similar to this:
// initialize with 256 entries, one for each character:
vector<int> counts(256);
for (string::size_type i = 0; i < characters.length(); i++) {
// for each occurrence of a character, increase the value in the vector
// (cast through unsigned char so the index is never negative):
int int_char = (unsigned char)characters[i];
counts[int_char]++;
}
Once filling of the vector is done, you can find the maximum value (not only the value but also the key where it is stored) using the std::max_element algorithm:
vector<int>::iterator most_frequent =
std::max_element(counts.begin(), counts.end());
// getting the character (index within the container, "key"):
std::cout << (char)(most_frequent - counts.begin());
// the number of occurrences ("value"):
std::cout << (*most_frequent);
Here is your example with the changes (only printing the most frequent character, here it is the space so you don't see it): http://ideone.com/94GfZz
You can sort this vector, but you will of course lose the key, since the elements move and their indices change. There is a nice trick for processing statistics like this: use a reversed (multi)map, with key and value swapped:
multimap<int,int> keyForOccurrence;
for (vector<int>::iterator i = counts.begin(); i != counts.end(); ++i) {
int occurrences = *i;
int character = i - counts.begin();
keyForOccurrence.insert(std::pair<int,int>(occurrences, character));
}
Updated code: http://ideone.com/Ub5rnL
The last thing you should now sort out by yourself is how to access and process the data within this map. The fancy thing about this reversed map is that it is now automatically sorted by occurrence, since maps are sorted by key.
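One possible way to process the reversed map, sketched under the assumption that keyForOccurrence was filled as shown above: walk it with reverse iterators so the highest counts come first, and stop at the first entry with zero occurrences. The function name format_by_occurrence is hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: formats "char: count" lines, most frequent first.
std::string format_by_occurrence(const std::multimap<int, int>& keyForOccurrence) {
    std::ostringstream out;
    // A multimap is sorted ascending by key, so walk it in reverse.
    for (std::multimap<int, int>::const_reverse_iterator it = keyForOccurrence.rbegin();
         it != keyForOccurrence.rend(); ++it) {
        if (it->first == 0) {
            break;  // every remaining entry also has zero occurrences
        }
        out << static_cast<char>(it->second) << ": " << it->first << '\n';
    }
    return out.str();
}
```

Printing the returned string with std::cout lists only the characters that actually occur, in descending order of frequency.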

I find it more natural to use a std::map container to store each character's occurrence count. The character is the map's key and its occurrence count is the map's value.
It's easy to scan the source string and build this map using std::map::operator[], and ++ to increase the occurrence count.
Then, you can build a second map from the above map, with key and value inverted: so this map will be sorted by occurrences, and then you can print this second map.
Note that you have to use a std::multimap as this second map, since its keys (i.e. the occurrences) can be repeated.
Sample code follows (I tested it with VS2010 SP1/VC10):
#include <stddef.h> // for size_t
#include <algorithm> // for std::transform
#include <functional> // for std::greater
#include <iostream> // for std::cout
#include <iterator> // for std::inserter
#include <map> // for std::map, std::multimap
#include <ostream> // for std::endl
#include <string> // for std::string
#include <utility> // for std::pair
using namespace std;
int main()
{
string str = "aasa asdfs dfh f ukjyhkh k wse f sdf sdfsdf";
// Build the occurrences map (char -> occurrences)
map<char, size_t> freq;
for (size_t i = 0; i < str.length(); ++i)
freq[ str[i] ]++;
// Build a new map from previous map with inverted <key, value> pairs,
// so this new map will be sorted by old map's value (i.e. char's
// occurrences), which is new map's key.
// Use the std::greater comparator to sort in descending order.
multimap<size_t, char, greater<size_t>> sorted_freq;
transform(
freq.begin(), freq.end(), // source
inserter(sorted_freq, sorted_freq.begin()), // destination
[](const pair<char, size_t>& p) // invert key<->value
{
return pair<size_t, char>(p.second, p.first);
}
);
// Print results
for (auto it = sorted_freq.begin(); it != sorted_freq.end(); ++it)
cout << it->second << ": " << it->first << endl;
}
Output:
: 9
s: 7
f: 7
d: 5
a: 4
k: 3
h: 3
u: 1
w: 1
y: 1
j: 1
e: 1
If you don't want to print the space character occurrences, you can easily filter that out.
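For instance, the space could be skipped while counting. This is a minimal sketch that reuses the freq and sorted_freq names from the sample above. The SortedFreq typedef and the function name sorted_frequencies_without_spaces are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

typedef std::multimap<std::size_t, char, std::greater<std::size_t> > SortedFreq;

// Hypothetical helper: counts characters, ignoring spaces,
// and returns them sorted by descending occurrence count.
SortedFreq sorted_frequencies_without_spaces(const std::string& str) {
    std::map<char, std::size_t> freq;
    for (std::size_t i = 0; i < str.length(); ++i) {
        if (str[i] != ' ') {  // filter out the space character
            freq[str[i]]++;
        }
    }
    SortedFreq sorted_freq;
    for (std::map<char, std::size_t>::const_iterator it = freq.begin();
         it != freq.end(); ++it) {
        sorted_freq.insert(std::make_pair(it->second, it->first));
    }
    return sorted_freq;
}
```

Filtering before counting keeps the space out of both maps. Filtering in the print loop instead would work equally well.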
Note that std::map/std::multimap also scale better than a fixed-size std::vector for non-ASCII characters, e.g. if you use Unicode UTF-32: there are far more than 256 Unicode code points, and a map only stores entries for the characters that actually occur.
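A small sketch of the same counting loop over UTF-32 text, assuming a C++11 compiler and that the input is already decoded into a std::u32string. The function name count_code_points is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: counts UTF-32 code points.
// Only code points that actually occur take up space in the map.
std::map<char32_t, std::size_t> count_code_points(const std::u32string& text) {
    std::map<char32_t, std::size_t> freq;
    for (char32_t c : text) {
        freq[c]++;
    }
    return freq;
}
```

A vector indexed by code point would need more than a million entries to cover the full Unicode range, whereas the map grows only with the number of distinct characters in the text.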
