How to Stop a Brute Force Attack When Key is Found - c++

I am writing a program that encrypts/decrypts a message using Caesar Cipher and a shift number given by the user. I got that part down, more or less.
However, the final part of the program performs a brute force attack on an encrypted message, and we are attempting to find the shift key that was used to encrypt it. I am having trouble determining the best way to stop the program when it finds the match.
I will paste my code below, and what it comes up with running it now. I have tested with pencil and paper, and can see the key is 17 for our particular input, so I put a crap condition just to get it working. I am wondering how to stop it when it finds the correct value and/or a sentence that is actual English, so that it works for every input, not just our test case.
P.S. I apologize if this is answered elsewhere, the only posts I found on the topic were more related to stopping brute force attacks and/or having secure passwords.
char decrypt(char letter, int key);
int main(){
// Part 3 Brute-Force Attack
std::string ciphertext;
std::cout << "Enter in the ciphertext: ";
std::getline(std::cin, ciphertext);
std::cout << "\nBrute-Force Decryptions:" << std::endl;
int i = 0;
int keyCode = 0;
std::string gang = "";
int alphaArray[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24};
// The algorithm has nested for-each loops: the first one keeps track of the key, initialized to 0,
// and the second loop goes through the given string and converts it to the proper value based
// on the shift key. The reason we want to find 't', 'h' or 'e' is because this signifies the
// beginning of an actual English sentence, and most likely our cracked message. I was able to
// determine this would begin the correct string we are looking for by brute-forcing the first
// few letters using pencil and paper
for(auto key : alphaArray){
std::cout << "\nkey = " << key << ":\t";
for(auto letter : ciphertext){
letter = decrypt(letter, i);
gang += letter;
}
std::cout << gang << std::endl;
if (tolower(ciphertext[0]) - key + 26 == 't' && tolower(ciphertext[1]) - key == 'h' && tolower(ciphertext[2]) - key == 'e'){
keyCode = key;
break;
}
i++;
gang = "";
}
// Output results
std::cout << "\nPart 3 KeyCode: " << keyCode << std::endl;
std::cout << "\n\n";
return 0;
}
//////////////////////////////////////////////////////////////////////////////////////
// Function: Attempts to decrypt letter by shifting it up the alphabet 'key' times /
// Params: Letter as a char and key as an int value /
// Return: Letter shifted 'key' times /
//////////////////////////////////////////////////////////////////////////////////////
char decrypt(char letter, int key){
int position = (letter - key - 'A') % 26;
if(position < 0)
position += 26;
return (char)position + 'a';
}
Output:
Brute-Force Decryptions:
key = 0: kyvtjyzwkttzgyvitzjtrcjftbefnetrjtkyvttrvjrittzgyviftrwkvitalczljttrvjriftnyftivglkvucptljvutkyvttzgyvitkftgrjjtdvjjrxvjtkftyzjtkiffgjtulizextyzjtdzczkripttrdgrzxejtzetxrlchtefkvtkyrktkyvttrvjrittzgyvitzjtvrjzcpttirtbvutsptkyvtsilkvgwfitvtrkkrtbh
key = 1: jxusixyvjssyfxuhsyisqbiesademdsqisjxussquiqhssyfxuhesqvjuhszkbykissquiqhesmxeshufkjutboskiutsjxussyfxuhsjesfqiiscuiiqwuisjesxyisjheefistkhydwsxyiscybyjqhossqcfqywdisydswqkbgsdejusjxqjsjxussquiqhssyfxuhsyisuqiybosshqsautsrosjxusrhkjufvehsusqjjqsag
key = 2: iwtrhwxuirrxewtgrxhrpahdrzcdlcrphriwtrrpthpgrrxewtgdrpuitgryjaxjhrrpthpgdrlwdrgtejitsanrjhtsriwtrrxewtgridrephhrbthhpvthridrwxhrigddehrsjgxcvrwxhrbxaxipgnrrpbepxvchrxcrvpjafrcditriwpiriwtrrpthpgrrxewtgrxhrtphxanrrgprztsrqnriwtrqgjiteudgrtrpiiprzf
key = 3: hvsqgvwthqqwdvsfqwgqozgcqybckbqogqhvsqqosgofqqwdvsfcqothsfqxizwigqqosgofcqkvcqfsdihsrzmqigsrqhvsqqwdvsfqhcqdoggqasggousgqhcqvwgqhfccdgqrifwbuqvwgqawzwhofmqqoadowubgqwbquoizeqbchsqhvohqhvsqqosgofqqwdvsfqwgqsogwzmqqfoqysrqpmqhvsqpfihsdtcfqsqohhoqye
key = 4: gurpfuvsgppvcurepvfpnyfbpxabjapnfpgurppnrfneppvcurebpnsgrepwhyvhfppnrfnebpjubperchgrqylphfrqpgurppvcurepgbpcnffpzrffntrfpgbpuvfpgebbcfpqhevatpuvfpzvyvgnelppnzcnvtafpvaptnhydpabgrpgungpgurppnrfneppvcurepvfprnfvylppenpxrqpolpgurpoehgrcsbeprpnggnpxd
key = 5: ftqoeturfooubtqdoueomxeaowzaizomeoftqoomqemdooubtqdaomrfqdovgxugeoomqemdaoitaodqbgfqpxkogeqpoftqooubtqdofaobmeeoyqeemsqeofaotueofdaabeopgduzsotueoyuxufmdkoomybmuszeouzosmgxcozafqoftmfoftqoomqemdooubtqdoueoqmeuxkoodmowqponkoftqondgfqbradoqomffmowc
key = 6: espndstqenntaspcntdnlwdznvyzhynldnespnnlpdlcnntaspcznlqepcnufwtfdnnlpdlcznhszncpafepowjnfdponespnntaspcneznalddnxpddlrpdneznstdneczzadnofctyrnstdnxtwtelcjnnlxaltrydntynrlfwbnyzepneslenespnnlpdlcnntaspcntdnpldtwjnnclnvponmjnespnmcfepaqzcnpnleelnvb
key = 7: dromcrspdmmszrobmscmkvcymuxygxmkcmdrommkockbmmszrobymkpdobmtevsecmmkockbymgrymbozedonvimeconmdrommszrobmdymzkccmwocckqocmdymrscmdbyyzcmnebsxqmrscmwsvsdkbimmkwzksqxcmsxmqkevamxydomdrkdmdrommkockbmmszrobmscmokcsvimmbkmuonmlimdromlbedozpybmomkddkmua
key = 8: cqnlbqrocllryqnalrbljubxltwxfwljblcqnlljnbjallryqnaxljocnalsdurdblljnbjaxlfqxlanydcnmuhldbnmlcqnllryqnalcxlyjbblvnbbjpnblcxlqrblcaxxyblmdarwplqrblvrurcjahlljvyjrpwblrwlpjduzlwxcnlcqjclcqnlljnbjallryqnalrblnjbruhllajltnmlkhlcqnlkadcnyoxalnljccjltz
key = 9: bpmkapqnbkkqxpmzkqakitawksvwevkiakbpmkkimaizkkqxpmzwkinbmzkrctqcakkimaizwkepwkzmxcbmltgkcamlkbpmkkqxpmzkbwkxiaakumaaiomakbwkpqakbzwwxaklczqvokpqakuqtqbizgkkiuxiqovakqvkoictykvwbmkbpibkbpmkkimaizkkqxpmzkqakmiaqtgkkziksmlkjgkbpmkjzcbmxnwzkmkibbiksy
key = 10: aoljzopmajjpwolyjpzjhszvjruvdujhzjaoljjhlzhyjjpwolyvjhmalyjqbspbzjjhlzhyvjdovjylwbalksfjbzlkjaoljjpwolyjavjwhzzjtlzzhnlzjavjopzjayvvwzjkbypunjopzjtpspahyfjjhtwhpnuzjpujnhbsxjuvaljaohajaoljjhlzhyjjpwolyjpzjlhzpsfjjyhjrlkjifjaoljiybalwmvyjljhaahjrx
key = 11: znkiynolziiovnkxioyigryuiqtuctigyiznkiigkygxiiovnkxuiglzkxiparoayiigkygxuicnuixkvazkjreiaykjiznkiiovnkxizuivgyyiskyygmkyizuinoyizxuuvyijaxotminoyisorozgxeiigsvgomtyiotimgarwituzkizngziznkiigkygxiiovnkxioyikgyoreiixgiqkjiheiznkihxazkvluxikigzzgiqw
key = 12: ymjhxmnkyhhnumjwhnxhfqxthpstbshfxhymjhhfjxfwhhnumjwthfkyjwhozqnzxhhfjxfwthbmthwjuzyjiqdhzxjihymjhhnumjwhythufxxhrjxxfljxhythmnxhywttuxhizwnslhmnxhrnqnyfwdhhfrufnlsxhnshlfzqvhstyjhymfyhymjhhfjxfwhhnumjwhnxhjfxnqdhhwfhpjihgdhymjhgwzyjuktwhjhfyyfhpv
key = 13: xligwlmjxggmtlivgmwgepwsgorsargewgxliggeiwevggmtlivsgejxivgnypmywggeiwevsgalsgvityxihpcgywihgxliggmtlivgxsgtewwgqiwwekiwgxsglmwgxvsstwghyvmrkglmwgqmpmxevcggeqtemkrwgmrgkeypugrsxigxlexgxliggeiwevggmtlivgmwgiewmpcggvegoihgfcgxligfvyxitjsvgigexxegou
key = 14: wkhfvkliwfflskhuflvfdovrfnqrzqfdvfwkhffdhvdufflskhurfdiwhufmxolxvffdhvdurfzkrfuhsxwhgobfxvhgfwkhfflskhufwrfsdvvfphvvdjhvfwrfklvfwurrsvfgxulqjfklvfplolwdubffdpsdljqvflqfjdxotfqrwhfwkdwfwkhffdhvdufflskhuflvfhdvlobffudfnhgfebfwkhfeuxwhsirufhfdwwdfnt
key = 15: vjgeujkhveekrjgtekuecnuqempqypecuevjgeecgucteekrjgtqechvgtelwnkwueecguctqeyjqetgrwvgfnaewugfevjgeekrjgtevqercuueoguuciguevqejkuevtqqruefwtkpiejkueoknkvctaeecorckipuekpeicwnsepqvgevjcvevjgeecgucteekrjgtekuegcuknaeetcemgfedaevjgedtwvgrhqtegecvvcems
key = 16: uifdtijguddjqifsdjtdbmtpdlopxodbtduifddbftbsddjqifspdbgufsdkvmjvtddbftbspdxipdsfqvufemzdvtfeduifddjqifsdupdqbttdnfttbhftdupdijtdusppqtdevsjohdijtdnjmjubszddbnqbjhotdjodhbvmrdopufduibuduifddbftbsddjqifsdjtdfbtjmzddsbdlfedczduifdcsvufqgpsdfdbuubdlr
key = 17: thecshiftccipherciscalsocknowncasctheccaesarccipherocaftercjuliusccaesarocwhocreputedlycusedctheccipherctocpasscmessagesctochisctroopscduringchiscmilitaryccampaignscincgaulqcnotecthatctheccaesarcciphercisceasilyccrackedcbycthecbrutepforcecattackq
Part 3 KeyCode: 17

Brute-force breaking of a Caesar cipher can easily be automated with a heuristic. A simple and effective approach is to use the average frequency of each letter in a given language. Such tables can be found on the internet for many languages.
Let us look at the English language. I found for example the following:
a: 0.0684
b: 0.0139
c: 0.0146
d: 0.0456
e: 0.1267
f: 0.0234
g: 0.0180
h: 0.0701
i: 0.0640
j: 0.0033
k: 0.0093
l: 0.0450
m: 0.0305
n: 0.0631
o: 0.0852
p: 0.0136
q: 0.0004
r: 0.0534
s: 0.0659
t: 0.0850
u: 0.0325
v: 0.0084
w: 0.0271
x: 0.0007
y: 0.0315
z: 0.0004
Later we need to look up this occurrence value for each letter of the alphabet. We will use lowercase only. We can define a compile-time array for that.
constexpr std::array<double, 26> LetterWeight{ .0684,.0139,.0146,.0456,.1267,
.0234,.0180,.0701,.0640,.0033,.0093,.0450,.0305,.0631,.0852,.0136,.0004,.0534,
.0659,.0850,.0325,.0084,.0271,.0007,.0315,.0004 };
Then we do the following. With the brute-force approach, we try all possible keys for decryption in a loop.
So we decrypt the encrypted message 25 times. We can omit key 0, because with key 0 the message would already be in clear text.
In each of the 25 iterations, we additionally analyze each letter of the decrypted message (whether the key is the right one does not matter at this point).
We build a score for the decrypted string by adding up the frequency value defined above for each of its letters.
In the end, we have one sum per key. Additionally, we store the key that was used to produce each sum.
For this, we define a std::array of std::pair with 26 elements, one for each tested key (index 0 stays unused). Each element holds the sum of letter-frequency values for that key and the key itself.
std::array<std::pair<double, int>, 26> score{};
We need to store the key in the array as well, because we will sort the std::array later to find the highest score and its corresponding key.
Regarding the encryption/decryption function, I gave a detailed answer with a long explanation here. Please, kindly check this.
Now, with all the above wisdom, we can come up with the following code cracker:
#include <iostream>
#include <array>
#include <cctype>
#include <string>
#include <algorithm>
// Some test string
const std::string test{ R"(Kyv rcxfizkyd yrj evjkvu wfi vrty cffgj, kyv wzijk fev bvvgj kirtb fw kyv bvp zezkzrczqvu kf 0 reu kyv
jvtfeu cffg xfvj kyiflxy kyv jkizex xzmve reu tfemvikj zk kf kyv gifgvi mrclv srjvu fe kyv jyzwk bvp.
Kyv ivrjfe nv nrek kf wzeu 'k', 'y' fi 'v' zj svtrljv kyzj jzxezwzvj kyv svxxzeezex fw re rtklrc Vexczjy
jvekvetv,reu dfjk czbvcp fli tirtbvu dvjjrxv.Z nrj rscv kf uvkvidzev kyzj nflcu svxze kyv tfiivtk jkizex
nv riv cffbzex wfi sp silkvwfitzex kyv wzijk wvn cvkkvij ljzex gvetzc reu grgvi.)" };
// Letter frequencies in the English language
constexpr std::array<double, 26> LetterWeight{ .0684,.0139,.0146,.0456,.1267,.0234,.0180,.0701,.0640,.0033,.0093,.0450,.0305,.0631,.0852,.0136,.0004,.0534,.0659,.0850,.0325,.0084,.0271,.0007,.0315,.0004 };
// Simple function for Caesar encryption/decryption (decryption by simply using a negative key)
std::string caesar(const std::string& in, int key) {
    std::string res(in.size(), ' ');
    std::transform(in.begin(), in.end(), res.begin(), [&](char c) {return std::isalpha(c) ? (char)((((c & 31) - 1 + ((26 + (key % 26)) % 26)) % 26 + 65) | (c & 32)) : c; });
    return res;
}
// Test code
int main() {
    // We will try all possible keys 1..25
    std::array<std::pair<double, int>, 26> score{};
    for (int key = 1; key < 26; ++key) {
        // Store the key. This is necessary because we will sort later and do not want to lose the key information
        score[key].second = key;
        // Get one possible deciphered text
        for (const char c : caesar(test, key)) {
            // Build the score for this key by accumulating the letter weights
            score[key].first += (std::isalpha(c)) ? LetterWeight[(c & 31) - 1] : 0.0;
        }
    }
    // Now sort to get the entry with the highest score
    std::sort(score.begin(), score.end());
    // And show the most probable result to the user.
    std::cout << "Decrypted with key: " << score.back().second - 26 << "\n\n" << caesar(test, score.back().second) << "\n\n";
}
All the code breaking can be done with ~15 lines of code. So, rather simple.
The resulting output will be:

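As a hedged variation on the cracker above, the same frequency heuristic can be wrapped in a function that returns the key directly, which is what the question asks for ("stop when the right key is found"). This is only a sketch: the names shiftText, englishScore and crackCaesar are made up for illustration, the frequency table is the one listed earlier, and the heuristic is assumed to see a reasonably long English text (very short messages can fool it).

```cpp
#include <array>
#include <cctype>
#include <string>

// English letter frequencies for a..z (same table as listed above).
constexpr std::array<double, 26> kLetterWeight{ .0684,.0139,.0146,.0456,.1267,
    .0234,.0180,.0701,.0640,.0033,.0093,.0450,.0305,.0631,.0852,.0136,.0004,.0534,
    .0659,.0850,.0325,.0084,.0271,.0007,.0315,.0004 };

// Shift every letter forward by `key` positions (negative keys shift backward).
// Case is preserved and non-letters are copied unchanged.
std::string shiftText(const std::string& in, int key) {
    const int k = ((key % 26) + 26) % 26;  // normalize to 0..25
    std::string out = in;
    for (char& c : out) {
        const unsigned char u = static_cast<unsigned char>(c);
        if (std::isupper(u)) c = static_cast<char>('A' + (c - 'A' + k) % 26);
        else if (std::islower(u)) c = static_cast<char>('a' + (c - 'a' + k) % 26);
    }
    return out;
}

// Sum of the English letter frequencies over all letters of a candidate text.
double englishScore(const std::string& text) {
    double score = 0.0;
    for (char c : text) {
        const unsigned char u = static_cast<unsigned char>(c);
        if (std::isalpha(u)) score += kLetterWeight[std::tolower(u) - 'a'];
    }
    return score;
}

// Try all 26 shifts and return the encryption shift (0..25) whose
// reversal produces the most English-looking text.
int crackCaesar(const std::string& ciphertext) {
    int bestKey = 0;
    double bestScore = -1.0;
    for (int key = 0; key < 26; ++key) {
        const double s = englishScore(shiftText(ciphertext, -key));
        if (s > bestScore) { bestScore = s; bestKey = key; }
    }
    return bestKey;
}
```

The plaintext is then shiftText(ciphertext, -crackCaesar(ciphertext)). Keeping only the running maximum avoids the sort and the need to store the key next to each score.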

How to implement Cryptarithmetic using Constraint Satisfaction in C++

I'll start by explaining what a cryptarithmetic problem is, through an example:
  T W O
+ T W O
-------
F O U R
We have to assign a digit [0-9] to each letter such that no two letters share the same digit and it satisfies the above equation.
One solution to the above problem is:
  7 6 5
+ 7 6 5
-------
1 5 3 0
There are two ways to solve this problem, one is brute force, this will work but it's not the optimal way. The other way is using constraint satisfaction.
Solution using Constraint Satisfaction
We know that R will always be even because it is the last digit of 2 * O;
this narrows down R's domain to {0, 2, 4, 6, 8}.
We also know that F can't be anything but 1: since F isn't the sum of two letters, it must get its value from the carry generated by T + T = O.
This also implies that T + T > 9, because only then can it generate a carry for F.
This tells us that T > 4, i.e. T is in {5, 6, 7, 8, 9}.
And as we go on doing this, we keep on narrowing down the domain and this helps us reduce time complexity by a considerable amount.
The concept seems easy, but I'm having trouble implementing it in C++. Especially the part where we generate constraints/domain for each variable. Keep in mind that there are carries involved too.
EDIT: I'm looking for a way to generate a domain for each variable using the concept I stated.
This kind of problem is a good application for generic constraint programming packages like Google's open source OR-Tools. (See https://developers.google.com/optimization and https://developers.google.com/optimization/cp/cryptarithmetic).
The package is written in C++, so it should be a good match for you.
Then programming the problem is as simple as this (sorry, since I work with OR-Tools in C#, this is C# code, but the C++ code will look pretty much the same):
public void initModel(CpModel model)
{
// Make variables
T = model.NewIntVar(0, 9, "T");
W = model.NewIntVar(0, 9, "W");
O = model.NewIntVar(0, 9, "O");
F = model.NewIntVar(0, 9, "F");
U = model.NewIntVar(0, 9, "U");
R = model.NewIntVar(0, 9, "R");
// Constrain the sum
model.Add((2 * (100 * T + 10 * W + O)) == (1000 * F + 100 * O + 10 * U + R));
// Make sure the variables are all different
model.AddAllDifferent(decisionVariables);
// The leading digit shouldn't be 0
model.Add(T != 0);
model.Add(F != 0);
}
and then calling the Solve method.
In the constraint for the sum, the operators *, + and == are all overloaded in the package to create objects that can be used by the model to enforce the constraint.
This is the start of the output which enumerates the solution
Solution #0: time = 0,00 s;
T = 8
W = 6
O = 7
F = 1
U = 3
R = 4
Solution #1: time = 0,01 s;
T = 8
W = 4
O = 6
F = 1
U = 9
R = 2
Solution #2: time = 0,01 s;
T = 8
W = 3
O = 6
F = 1
U = 7
R = 2
Solution #3: time = 0,01 s;
T = 9
W = 3
O = 8
F = 1
U = 7
R = 6
And here's the complete code including solution printing and Main method for the execution:
using Google.OrTools.Sat;
using System;
using System.IO;
namespace SO69626335_CryptarithmicPuzzle
{
class Program
{
static void Main(string[] args)
{
try
{
Google.OrTools.Sat.CpModel model = new CpModel();
ORModel myModel = new ORModel();
myModel.initModel(model);
IntVar[] decisionVariables = myModel.decisionVariables;
// Creates a solver and solves the model.
CpSolver solver = new CpSolver();
VarArraySolutionPrinter solutionPrinter = new VarArraySolutionPrinter(myModel.variablesToPrintOut);
solver.SearchAllSolutions(model, solutionPrinter);
Console.WriteLine(String.Format("Number of solutions found: {0}",
solutionPrinter.SolutionCount()));
}
catch (Exception e)
{
Console.WriteLine(e.Message);
Console.WriteLine(e.StackTrace);
throw;
}
Console.WriteLine("OK");
Console.ReadKey();
}
}
class ORModel
{
IntVar T;
IntVar W;
IntVar O;
IntVar F;
IntVar U;
IntVar R;
public void initModel(CpModel model)
{
// Make variables
T = model.NewIntVar(0, 9, "T");
W = model.NewIntVar(0, 9, "W");
O = model.NewIntVar(0, 9, "O");
F = model.NewIntVar(0, 9, "F");
U = model.NewIntVar(0, 9, "U");
R = model.NewIntVar(0, 9, "R");
// Constrain the sum
model.Add((2 * (100 * T + 10 * W + O)) == (1000 * F + 100 * O + 10 * U + R));
// Make sure the variables are all different
model.AddAllDifferent(decisionVariables);
// The leading digit shouldn't be 0
model.Add(T != 0);
model.Add(F != 0);
}
public IntVar[] decisionVariables
{
get
{
return new IntVar[] { T, W, O, F, U, R };
}
}
public IntVar[] variablesToPrintOut
{
get
{
return decisionVariables;
}
}
}
public class VarArraySolutionPrinter : CpSolverSolutionCallback
{
private int solution_count_;
private IntVar[] variables;
public VarArraySolutionPrinter(IntVar[] variables)
{
this.variables = variables;
}
public override void OnSolutionCallback()
{
// using (StreamWriter sw = new StreamWriter(@"C:\temp\GoogleSATSolverExperiments.txt", true, Encoding.UTF8))
using (TextWriter sw = Console.Out)
{
sw.WriteLine(String.Format("Solution #{0}: time = {1:F2} s;",
solution_count_, WallTime()));
foreach (IntVar v in variables)
{
sw.Write(
String.Format(" {0} = {1}\r\n", v.ShortString(), Value(v)));
}
solution_count_++;
sw.WriteLine();
}
if (solution_count_ >= 10)
{
StopSearch();
}
}
public int SolutionCount()
{
return solution_count_;
}
}
}
A full solution is way out of scope for a simple SO question, but I can sketch what you would need.
First, generate new letters for the carries:
  0 T W O
  0 T W O
+ Z Y X V
---------
  F O U R
You can then generate a std::map<char, std::set<int>> containing all the options. The letters have the standard range {0..9}, V is {0}, Z is {1} and Y and X have {0..1}.
Next, you need to encode the additions into a set of clauses.
using Var = char; // a variable is identified by its letter
enum class Op { Equal, SumMod10, SumDiv10, Even, Odd };
struct clause { Op op; std::vector<Var> children; };
std::vector<clause> clauses{
    {Op::Equal,    {'Z', 'F'}},
    {Op::SumMod10, {'O', 'T', 'T', 'Y'}}, // O = (T+T+Y) mod 10
    {Op::SumMod10, {'U', 'W', 'W', 'X'}},
    {Op::SumMod10, {'R', 'O', 'O', 'V'}},
    {Op::SumDiv10, {'F', 'T', 'T', 'Y'}}, // F is the carry of T+T+Y
    {Op::SumDiv10, {'Y', 'W', 'W', 'X'}}, // Y is the carry of W+W+X
    {Op::SumDiv10, {'X', 'O', 'O', 'V'}}, // X is the carry of O+O+V
};
Then the fun part begins: you need to create a calculation that will try to simplify the constraints using the knowledge it has.
For example, {SumMod10, {'R', 'O', 'O', 'V'}} can be simplified to {SumMod10, {'R', 'O', 'O', 0}} since V=0.
Sometimes a clause can reduce the range of a variable, for example the {Equal, {'Z', 'F'}} constraint can immediately reduce the range of F to {1}, since Z is {1}.
Next, you need to teach your system about basic algebraic equalities for further simplification, such as:
{SumMod10, {A, 0, C}} === {SumMod10, {A, C, 0}} === {Equal, {A,C}}
and even more abstract things like "if A >= 5 and B >= 5 then A+B >= 10" or "if A is even and B is even then A + B is also even".
Finally, your system needs to be able to assume hypotheses and disprove them, or prove that a hypothesis is true no matter what, like you did in your post.
For example, assuming that R is odd would mean that O + O is odd, which can only happen if O is odd and even at the same time. Therefore R must be even.
At the end of the day, you will have implemented not only a formal system for describing and evaluating boolean clauses in the numbers domain, you will also have a goal-driven solution engine to go with it. If this is more than just an idle musing, I would strongly recommend looking into adopting an SMT solver to solve this for you, or at least learning Prolog and expressing your problem there.
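To make the domain-narrowing idea from the question concrete (R is even, F is 1, T > 4), here is a minimal sketch of a propagation loop for TWO + TWO = FOUR. It is not a full solver: it treats each column as the constraint a + b + carryIn = result + 10 * carryOut and drops any value that has no supporting combination in that column, repeating until nothing changes. The Column struct, the function names and the lowercase carry variables 'v', 'x' and 'y' are assumptions made for this illustration, and the all-different constraint is deliberately left out.

```cpp
#include <map>
#include <set>
#include <vector>

// One addition column: a + b + carryIn == result + 10 * carryOut.
// Uppercase names are puzzle letters, lowercase names are carry variables.
struct Column { char a, b, carryIn, result, carryOut; };

using Domains = std::map<char, std::set<int>>;

// Keep only the values that take part in at least one assignment satisfying
// this column. Returns true if any domain shrank.
// Assumption: the only repeated variable within a column is a == b.
bool pruneColumn(const Column& c, Domains& d) {
    Domains supported;
    for (int va : d[c.a])
        for (int vb : d[c.b]) {
            if (c.a == c.b && va != vb) continue;  // same letter, same digit
            for (int vin : d[c.carryIn])
                for (int vres : d[c.result])
                    for (int vout : d[c.carryOut]) {
                        if (va + vb + vin != vres + 10 * vout) continue;
                        supported[c.a].insert(va);
                        supported[c.b].insert(vb);
                        supported[c.carryIn].insert(vin);
                        supported[c.result].insert(vres);
                        supported[c.carryOut].insert(vout);
                    }
        }
    bool changed = false;
    for (char v : {c.a, c.b, c.carryIn, c.result, c.carryOut}) {
        if (supported[v] != d[v]) { d[v] = supported[v]; changed = true; }
    }
    return changed;
}

// Build the initial domains for TWO + TWO = FOUR and prune to a fixed point.
Domains narrowTwoTwoFour() {
    const std::set<int> digits{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    const std::set<int> nonZero{1, 2, 3, 4, 5, 6, 7, 8, 9};  // leading digits
    Domains d{
        {'T', nonZero}, {'W', digits}, {'O', digits},
        {'F', nonZero}, {'U', digits}, {'R', digits},
        {'v', std::set<int>{0}},     // no carry into the units column
        {'x', std::set<int>{0, 1}},  // carry into the tens column
        {'y', std::set<int>{0, 1}},  // carry into the hundreds column
    };
    const std::vector<Column> columns{
        {'O', 'O', 'v', 'R', 'x'},  // units
        {'W', 'W', 'x', 'U', 'y'},  // tens
        {'T', 'T', 'y', 'O', 'F'},  // hundreds; the carry out is F itself
    };
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Column& c : columns) changed = pruneColumn(c, d) || changed;
    }
    return d;
}
```

With these starting domains the loop ends with F = {1}, T = {5, 6, 7, 8, 9} and R = {0, 2, 4, 6, 8}, which matches the manual reasoning in the question. A search such as backtracking over the reduced domains is still needed to find actual solutions.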
Here is how I solved it using backtracking
My approach here was to brute-force it smartly: I recursively assign every possible value [0-9] to each letter and check whether there is any contradiction.
Contradictions can be one of the following:
Two or more letters end up having the same value.
The sum of the letters doesn't match the value of the result letter.
The sum of the letters is already assigned to some other letter.
As soon as a contradiction occurs, the recursion for that particular combination ends.
#include <bits/stdc++.h>
using namespace std;
vector<string> words, wordOg;
string result, resultOg;
bool solExists = false;
void reverse(string &str){
reverse(str.begin(), str.end());
}
void printProblem(){
cout<<"\n";
for(int i=0;i<words.size();i++){
for(int j=0;j<words[i].size();j++){
cout<<words[i][j];
}
cout<<"\n";
}
cout<<"---------\n";
for(int i=0;i<result.size();i++){
cout<<result[i];
}
cout<<"\n";
}
void printSolution(unordered_map<char, int> charValue){
cout<<"\n";
for(int i=0;i<words.size();i++){
for(int j=0;j<words[i].size();j++){
cout<<charValue[wordOg[i][j]];
}
cout<<"\n";
}
cout<<"---------\n";
for(int i=0;i<result.size();i++){
cout<<charValue[resultOg[i]];
}
cout<<"\n";
}
void solve(int colIdx, int idx, int carry, int sum,unordered_map<char, int> charValue, vector<int> domain){
if(colIdx<words.size()){
if(idx<words[colIdx].size()){
char ch = words[colIdx][idx];
if(charValue.find(ch)!=charValue.end()){
solve(colIdx + 1, idx, carry, sum + charValue[ch], charValue, domain);
}
else{
for(int i=0;i<10;i++){
if(i==0 && idx==words[colIdx].size()-1) continue;
if(domain[i]==-1){
domain[i] = 0;
charValue[ch] = i;
solve(colIdx + 1, idx, carry, sum + i, charValue, domain);
domain[i] = -1;
}
}
}
}
else solve(colIdx + 1, idx, carry, sum, charValue, domain);
}
else{
if(charValue.find(result[idx])!=charValue.end()){
if(((sum+carry)%10)!=charValue[result[idx]]) return;
}
else{
if(domain[(sum + carry)%10]!=-1) return;
domain[(sum + carry)%10] = 0;
charValue[result[idx]] = (sum + carry)%10;
}
carry = (sum+carry)/10;
if(idx==result.size()-1 && (charValue[result[idx]]==0 || carry == 1)) return;
if(idx+1<result.size()) solve(0, idx+1, carry, 0, charValue, domain);
else{
solExists = true;
printSolution(charValue);
}
}
}
int main() {
unordered_map<char, int> charValue;
vector<int> domain(10,-1);
int n;
cout<<"\nEnter number of input words: ";
cin>>n;
cout<<"\nEnter the words: ";
for(int i=0;i<n;i++){
string inp;
cin>>inp;
words.push_back(inp);
}
cout<<"\nEnter the resultant word: ";
cin>>result;
printProblem();
wordOg = words;
resultOg = result;
reverse(result);
for(auto &itr: words) reverse(itr);
solve(0, 0, 0, 0, charValue, domain);
if(!solExists) cout<<"\nNo Solution Exists!";
return 0;
}

Can't find the values I stored into a map

I'm basically reading a file character by character, and when I encounter a new number I store it in a std::map as a key with the element equal to 1. The more times I encounter the same number, I increment that key's element:
    if(intMap.count(singleCharacter)){
        // key exists, just increment its count
        //cout<<"exist"<<endl;
        int newValue = intMap.at(singleCharacter) + 1; // grab the value that already exists and increment it by 1
        std::map<int, int>::iterator it = intMap.find(singleCharacter); // find the entry for this character
        if (it != intMap.end())
            it->second = newValue; // store the incremented value in the element
    }else{
        // key doesn't exist, so create it with a count of 1
        //cout<<"doesnt exist"<<endl;
        intMap.insert(pair<int, int>(singleCharacter,1));
        cout<<singleCharacter <<" new made : "<<intMap.at(singleCharacter) <<endl;
    }
}
for (auto& p : intMap ) {
    cout << p.first<<": "<< p.second <<endl;
}
The only problem is, when I try to print out all of the values in the map, it gives me random numbers, and I don't understand where they are coming from.
This is the file that I'm reading:
An International Standard Book Number (ISBN) is a code of 10 characters, referred to as ISBN-10,
separated by dashes such as 0-7637-0798-8. An ISBN-10 consists of four parts: a group code, a publisher code,
a code that uniquely identifies the book among those published by a particular publisher, and a check character.
The check character is used to validate an ISBN. For the ISBN 0-7637-0798-8, the group code is 0,
which identifies the book as one from an English-speaking country. The publisher code 7637 is for "Jones and Bartlett Publishers
The output I'm getting:
48: 8
49: 3
51: 3
54: 3
55: 8
56: 4
57: 2
The output I should be getting should be like:
1: and the amount of times it was seen
That goes the same for any number.
You get those "random" numbers because you are using std::map<int,int> instead of std::map<char,int>. What you get printed are the ASCII numeric codes for character symbols, and their counts.
So, you need to either change the map's type, or cast the keys back to char:
for (auto& p : intMap ) {
cout << static_cast<char>(p.first) << ": "<< p.second <<endl;
}
Note: std::map::operator[] is designed precisely for this case: it value-initializes a missing entry (to 0 here), so all of your conditions can be replaced with:
intMap[singleCharacter]++;
Details can be found in this documentation.
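Putting both suggestions together, a minimal sketch of the counting step might look like the following. The function name countDigits and the choice to read from a std::string instead of a file are assumptions made for illustration.

```cpp
#include <cctype>
#include <map>
#include <string>

// Count how often each digit character occurs in the text.
// The key type is char, so printing a key shows '0'..'9' and not
// the numeric character codes 48..57.
std::map<char, int> countDigits(const std::string& text) {
    std::map<char, int> counts;
    for (char c : text) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            ++counts[c];  // operator[] creates a missing entry with value 0
        }
    }
    return counts;
}
```

Iterating over the returned map with `for (const auto& p : counts)` and printing p.first and p.second then shows each digit followed by its count.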

Getting a "terminate called after throwing an instance of 'nlohmann::detail::type_error' what(): invalid UTF-8 byte at index 0: 0x81"

I'm working on a coding problem on a website and when I compile my code it gives me:
terminate called after throwing an instance of
'nlohmann::detail::type_error' what():
[json.exception.type_error.316] invalid UTF-8 byte at index 0: 0x81
Aborted exit status 134
However, when I compile on Sublime it works just fine with the correct output. Is there something wrong with how I'm using the ASCII values to store into the string variable answer? Here is my code:
string caesarCypherEncryptor(string str, int key) {
string answer = "";
for(char letter : str) {
// if it goes over 'z': get amount pass 'z' and start at 'a'
if(letter + key > int('z')) {
// push back char into answer string
answer += ((letter + key) % int('z') + int('a'));
continue;
}
// else just add key from current position
answer += letter + key;
}
return answer;
}
int main() {
cout << caesarCypherEncryptor("mvklahvjcnbwqvtutmfafkwiuagjkzmzwgf", 7) << endl;
return 0;
}
I also tried out this question, probably on the same website as you mentioned.
To anyone in the future who is faced with this issue, the problem is in the comparison line:
if(letter + key > int('z'))
In the line above, the ASCII value goes beyond the range that the JSON library accepts as valid characters when key is greater than 26. Such a key also causes useless calculation, because a key > 26, say 27, is equivalent to key = 1.
Hence, the first line of the solution should be key = (key % 26).
This prevents the above given JSON Exception error.
I faced this issue too.
This basically happens when we add an int to a char and the resulting value exceeds 127. When such a value is stored back into a signed 8-bit char it wraps around: for example, 'z' + 7 = 129, which becomes 129 - 256 = -127, i.e. the byte 0x81.
Basically you need to use below condition:
if((unsigned int)(letter + key) > 'z')
Please try this out and let me know if this helps.
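For reference, here is a hedged sketch of the encryptor with the key reduced modulo 26 as suggested above, and with the wrap-around computed relative to 'a' so that the result can never leave the range 'a'..'z'. It assumes the input contains only lowercase ASCII letters, as in the question's example.

```cpp
#include <string>

// Caesar-shift a lowercase string by `key` positions.
// Assumption: `str` contains only the characters 'a'..'z'.
std::string caesarCypherEncryptor(const std::string& str, int key) {
    const int shift = ((key % 26) + 26) % 26;  // 27 behaves like 1, -1 like 25
    std::string answer;
    answer.reserve(str.size());
    for (char letter : str) {
        // Position in the alphabet (0..25), shifted and wrapped, then back to a letter.
        answer += static_cast<char>('a' + (letter - 'a' + shift) % 26);
    }
    return answer;
}
```

Because every intermediate value stays between 0 and 25 before 'a' is added back, no byte above 127 can end up in the result, whatever the key.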

Time optimize C++ function to find number of decoding possibilities

This is an interview practice problem from CodeFights. I have a solution that's working except for the fact that it takes too long to run for very large inputs.
Problem Description (from the link above)
A top secret message containing uppercase letters from 'A' to 'Z' has been encoded as numbers using the following mapping:
'A' -> 1
'B' -> 2
...
'Z' -> 26
You are an FBI agent and you need to determine the total number of ways that the message can be decoded.
Since the answer could be very large, take it modulo 10^9 + 7.
Example
For message = "123", the output should be
mapDecoding(message) = 3.
"123" can be decoded as "ABC" (1 2 3), "LC" (12 3) or "AW" (1 23), so the total number of ways is 3.
Input/Output
[time limit] 500ms (cpp)
[input] string message
A string containing only digits.
Guaranteed constraints:
0 ≤ message.length ≤ 105.
[output] integer
The total number of ways to decode the given message.
My Solution so far
We have to implement the solution in a function int mapDecoding(std::string message), so my entire solution is as follows:
/*0*/ void countValidPaths(int stIx, int endIx, std::string message, long *numPaths)
/*1*/ {
/*2*/ //check out-of-bounds error
/*3*/ if (endIx >= message.length())
/*4*/ return;
/*5*/
/*6*/ int subNum = 0, curCharNum;
/*7*/ //convert substr to int
/*8*/ for (int i = stIx; i <= endIx; ++i)
/*9*/ {
/*10*/ curCharNum = message[i] - '0';
/*11*/ subNum = subNum * 10 + curCharNum;
/*12*/ }
/*13*/
/*14*/ //check for leading 0 in two-digit number, which would not be valid
/*15*/ if (endIx > stIx && subNum < 10)
/*16*/ return;
/*17*/
/*18*/ //if number is valid
/*19*/ if (subNum <= 26 && subNum >= 1)
/*20*/ {
/*21*/ //we've reached the end of the string with success, therefore return a 1
/*22*/ if (endIx == (message.length() - 1) )
/*23*/ ++(*numPaths);
/*24*/ //branch out into the next 1- and 2-digit combos
/*25*/ else if (endIx == stIx)
/*26*/ {
/*27*/ countValidPaths(stIx, endIx + 1, message, numPaths);
/*28*/ countValidPaths(stIx + 1, endIx + 1, message, numPaths);
/*29*/ }
/*30*/ //proceed to the next digit
/*31*/ else
/*32*/ countValidPaths(endIx + 1, endIx + 1, message, numPaths);
/*33*/ }
/*34*/ }
/*35*/
/*36*/ int mapDecoding(std::string message)
/*37*/ {
/*38*/ if (message == "")
/*39*/ return 1;
/*40*/ long numPaths = 0;
/*41*/ int modByThis = static_cast<int>(std::pow(10.0, 9.0) + 7);
/*42*/ countValidPaths(0, 0, message, &numPaths);
/*43*/ return static_cast<int> (numPaths % modByThis);
/*44*/ }
The Issue
I have passed 11/12 of CodeFight's initial test cases, e.g. mapDecoding("123") = 3 and mapDecoding("11115112112") = 104. However, the last test case has message = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211", and my program takes too long to execute:
Expected_output: 782204094
My_program_output: <empty due to timeout>
I wrote countValidPaths() as a recursive function, and its recursive calls are on lines 27, 28 and 32. I can see how such a large input would cause the code to take so long, but I'm racking my brain trying to figure out what more efficient solutions would cover all possible combinations.
Thus the million dollar question: what suggestions do you have to optimize my current program so that it runs in far less time?
A couple of suggestions.
First this problem can probably be formulated as a Dynamic Programming problem. It has that smell to me. You are computing the same thing over and over again.
The second is the insight that long contiguous sequences of "1"s and "2"s are a Fibonacci sequence in terms of the number of possibilities. Any other value terminates the sequence. So you can split the string into runs of ones and twos, each terminated by any other digit. You will need special logic for a terminating zero, since zero on its own does not correspond to a character. So split the string, take the length of each segment, look up the Fibonacci number for that length (which can be pre-computed) and multiply the values. Your example "11115112112" yields "11115" and "112112", with f(5) = 8 and f(6) = 13, and 8 * 13 = 104.
Your long string is a sequence of 1's and 2's that is 100 digits long. The following Java (Sorry, my C++ is rusty) program correctly computes its value by this method
import java.math.BigInteger;
public class FibPaths {
private static final int MAX_LEN = 105;
private static final BigInteger MOD_CONST = new BigInteger("1000000007");
private static BigInteger[] fibNum = new BigInteger[MAX_LEN];
private static void computeFibNums() {
fibNum[0] = new BigInteger("1");
fibNum[1] = new BigInteger("1");
for (int i = 2; i < MAX_LEN; i++) {
fibNum[i] = fibNum[i-2].add(fibNum[i-1]);
}
}
public static void main(String[] argv) {
String x = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211";
computeFibNums();
BigInteger val = fibNum[x.length()].mod(MOD_CONST);
System.out.println("N=" + x.length() + " , val = " + val);
}
}
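The dynamic-programming formulation mentioned as the first suggestion can be sketched in C++ as follows. This is one possible way to write it, not the only one: the number of decodings of a prefix depends only on the counts for the two shorter prefixes, so a single pass with two running values is enough. The signature takes the string by const reference, which is an assumption; the site may require pass-by-value.

```cpp
#include <cstddef>
#include <string>

// Number of ways to decode `message` ('A' -> 1 ... 'Z' -> 26), modulo 10^9 + 7.
// Runs in O(n) time and O(1) extra space.
int mapDecoding(const std::string& message) {
    const long long MOD = 1000000007LL;
    long long prev2 = 1;  // ways to decode the prefix of length i - 1
    long long prev1 = 1;  // ways to decode the prefix of length i (1 for the empty prefix)
    for (std::size_t i = 0; i < message.size(); ++i) {
        long long cur = 0;
        // The last digit stands alone if it is 1..9.
        if (message[i] != '0') cur = prev1;
        // The last two digits form one letter if they are 10..26.
        if (i > 0) {
            const int two = (message[i - 1] - '0') * 10 + (message[i] - '0');
            if (two >= 10 && two <= 26) cur = (cur + prev2) % MOD;
        }
        prev2 = prev1;
        prev1 = cur;
    }
    return static_cast<int>(prev1);
}
```

Each character is visited once, so the 100-digit test case from the question is handled immediately instead of exploring every path recursively.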

How to generate all variations with repetitions of a string?

I want to generate all variations with repetitions of a string in C++ and I'd highly prefer a non-recursive algorithm. I've come up with a recursive algorithm in the past but due to the complexity (r^n) I'd like to see an iterative approach.
I'm quite surprised that I wasn't able to find a solution to this problem anywhere on the web or on StackOverflow.
I've come up with a Python script that does what I want as well:
import itertools
variations = itertools.product('ab', repeat=4)
for variation in variations:
    variation_string = ""
    for letter in variation:
        variation_string += letter
    print(variation_string)
Output:
aaaa
aaab
aaba
aabb
abaa
abab
abba
abbb
baaa
baab
baba
babb
bbaa
bbab
bbba
bbbb
Ideally I'd like a C++ program that can produce the exact output, taking the exact same parameters.
This is for learning purposes, it isn't homework. I wish my homework was like that.
You could think of it as counting, in a radix equal to the number of characters in the alphabet (taking special care of multiple equal characters in the alphabet if that's a possible input). The aaaa aaab aaba ... example for instance, is actually the binary representation of the numbers 0-15.
Just do a search on radix transformations, implement a mapping from each "digit" to the corresponding character, and then simply do a for loop from 0 to alphabet_size^word_length - 1.
Such algorithm should run in time linearly proportional to the number of strings that needs to be produced using constant amount of memory.
Demonstration in Java
public class Test {
public static void main(String... args) {
// Limit imposed by Integer.toString(int i, int radix) which is used
// for the purpose of this demo.
final String chars = "0123456789abcdefghijklmnopqrstuvwxyz";
int wordLength = 3;
char[] alphabet = { 'a', 'b', 'c' };
for (int i = 0; i < Math.pow(alphabet.length, wordLength); i++) {
String str = Integer.toString(i, alphabet.length);
String result = "";
while (result.length() + str.length() < wordLength)
result += alphabet[0];
for (char c : str.toCharArray())
result += alphabet[chars.indexOf(c)];
System.out.println(result);
}
}
}
output:
aaa
aab
aac
aba
abb
abc
aca
acb
acc
baa
bab
bac
bba
bbb
bbc
bca
bcb
bcc
caa
cab
cac
cba
cbb
cbc
cca
ccb
ccc
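Since the question asks for C++, the counting idea above can be sketched there as well. The version below is a hedged illustration: it keeps one "digit" per position and increments it like an odometer, which avoids converting integers to strings and does not overflow for long words. The function name variations and the choice to collect the results in a std::vector are assumptions; printing each word as it is built would keep memory use constant.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// All variations with repetition of `alphabet` with the given length,
// in the same order as itertools.product(alphabet, repeat=length).
std::vector<std::string> variations(const std::string& alphabet, std::size_t length) {
    std::vector<std::string> out;
    if (alphabet.empty()) return out;           // nothing to build from
    std::vector<std::size_t> digit(length, 0);  // digit[i] is an index into alphabet
    while (true) {
        std::string word(length, ' ');
        for (std::size_t i = 0; i < length; ++i) word[i] = alphabet[digit[i]];
        out.push_back(word);
        // Increment the counter, rightmost position first, carrying as needed.
        std::size_t pos = length;
        while (pos > 0) {
            --pos;
            if (++digit[pos] < alphabet.size()) break;  // no carry, next word is ready
            digit[pos] = 0;                             // carry into the next position
            if (pos == 0) return out;                   // counter overflowed: done
        }
        if (length == 0) return out;  // the only variation is the empty string
    }
}
```

Calling variations("ab", 4) yields the sixteen strings from "aaaa" to "bbbb" listed in the question.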
Here is a general recipe, not specific to C++, for implementing the product:
Take the product of the input string "abc.." with itself to generate the matrix "abc.." x "abc..". This has N^2 complexity.
Represent the matrix as a vector and repeat the multiplication by "abc..", giving complexity (N^2)*N; repeat as needed.
Here is an STL-like next_variation function. It accepts iterators of any container of a number-like type. You can also use floats/doubles.
The algorithm itself is very simple. The iterator only needs to be a forward iterator, so even a std::list<...> can be used.
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
template<class _Tnumber, class _Titerator >
bool next_variation
(
_Titerator const& _First
, _Titerator const& _Last
, _Tnumber const& _Upper
, _Tnumber const& _Start = 0
, _Tnumber const& _Step = 1
)
{
_Titerator _Next = _First;
while( _Next != _Last )
{
*_Next += _Step;
if( *_Next < _Upper )
{
return true;
}
(*_Next) = _Start;
++_Next;
}
return false;
}
int main( int argc, char *argv[] )
{
std::string s("aaaa");
do{
std::cout << s << std::endl;
}while (next_variation(s.begin(), s.end(), 'c', 'a'));
std::vector< double > dd(3,1);
do{
std::cout << dd[0] << "," << dd[1] << "," << dd[2] << "," << std::endl;
}while( next_variation<double>( dd.begin(), dd.end(), 5, 1, 0.5 ) );
return EXIT_SUCCESS;
}