How do I use the String#tr method correctly? - crystal-lang

I'm new to Crystal and am trying to code a Caesar cipher.
The problem is that when I enter a string to encode, the program shows the same string without modification.
if ARGV.size < 3
  puts "./caesarcipher [ed] [text] [num]"
  exit
end

letter = ARGV[0]
str = ARGV[1]
n = ARGV[2].to_i
alphabet = ("A".."Z").to_a

case letter
when "e" then puts str.tr(alphabet.join, alphabet.rotate(n).join)
when "d" then puts str.tr(alphabet.join, alphabet.rotate(n * -1).join)
else puts "./caesarcipher [ed] [text] [num]"
end
Since the two arguments to the tr method contain what I want and tr must return a value, I don't understand why nothing changes.

Welcome to Stack Overflow!
The reason this didn't work for you is that the alphabet array contains only capital letters, so your translation only affects upper-case characters. If you instead change it to
alphabet = ("a".."z").to_a
it will translate the lower-case characters.
If you want to do both, I would suggest creating two "alphabets", one with upper-case and one with lower-case letters, and then applying the translation to the string twice, once with each alphabet.
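For comparison, here is a minimal sketch of the two-alphabet idea in Python rather than Crystal. The function name `caesar` is made up for illustration. Instead of translating twice, it puts both alphabets into one translation table, which should be equivalent as long as the two alphabets don't overlap.

```python
import string

def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave everything else as-is."""
    shift %= 26  # a negative shift (decoding) wraps around to the inverse rotation
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    # One table covers both alphabets, so a single translate() pass is enough.
    table = str.maketrans(
        upper + lower,
        upper[shift:] + upper[:shift] + lower[shift:] + lower[:shift],
    )
    return text.translate(table)

encoded = caesar("Hello, World!", 3)
decoded = caesar(encoded, -3)
print(encoded)  # Khoor, Zruog!
print(decoded)  # Hello, World!
```

Characters that are not in either alphabet (digits, punctuation, spaces) pass through unchanged, which is also what tr does.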

Related

How to compare two arrays of chars and store different value between the two

I'm at my wits' end here (which isn't saying very much). My program takes two forms of user input, a sentence and a word to be stripped from that sentence. The sentence has a fixed size of 101 characters, and the word to be stripped, has a fixed size of 5. Although strings and vectors would make this much easier, I would like to only use char arrays.
The problem that I seem to be having is comparing arrays of two different sizes. I try to loop through the sentence length and compare each individual element in the sentence and key word to be removed. The issue is that I can't get the key word to compare with every word in the sentence, but only the first (if there's a match). The code to compare these two char arrays is
for(int i = 0; i < strlen(sentence); i++)
{
    if (sentence[i] == keyword[i])
        sentence[i] = ' ';
}
The thought process here is to directly compare each individual element, and if there's a match, replace that instance with a space, to 'deconstruct' the sentence. That way, the user can see how their sentence looks after removing all instances of their 'key' word entered. For example, the keyword 'anti' compared to the sentence 'antivenom is antinice' would yield 'venom is nice.'
Assume that all user input is converted to lowercase and punctuation is removed to ease this comparison.
What am I doing wrong here? How can I compare the 'key' word to every single word found in the user sentence? Any advice would be greatly appreciated. Thank you.
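One possible reading of the problem: the loop uses the same index i for both arrays, so the keyword is only ever compared against the start of the sentence (and keyword[i] runs past the keyword's end once i exceeds its length). A rough sketch of a sliding-window comparison follows, written in Python with a list of characters standing in for the char array. The name `blank_out` is hypothetical, and the same two nested loops would carry over to C.

```python
def blank_out(sentence: str, keyword: str) -> str:
    chars = list(sentence)  # mutable sequence, playing the role of the char array
    n, k = len(chars), len(keyword)
    i = 0
    while k > 0 and i + k <= n:
        # Compare the keyword against the window starting at i. The keyword is
        # indexed from 0 every time (keyword[j]), not with the sentence index.
        if all(chars[i + j] == keyword[j] for j in range(k)):
            for j in range(k):
                chars[i + j] = " "
            i += k  # continue after the blanked-out match
        else:
            i += 1
    return "".join(chars)

print(repr(blank_out("antivenom is antinice", "anti")))
```

This keeps the sentence length unchanged and leaves spaces where the keyword was; squeezing those spaces out would be a separate step.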

Can anybody explain what is happening in this code?

I'm new to python so I don't know much about it.
So here's the code.
import string

def ispangram(string1, alphabet=string.ascii_lowercase):
    alphaset = set(alphabet)
    alpha = set(string1.lower())
    return alphaset <= alpha

ispangram("The quick brown fox jumps over the lazy dog")
output:
True
A pangram is a sentence which has every letter of the alphabet in it.
This code asks "are (all the letters in the alphabet) in (the input)?".
All the letters in the alphabet come from string.ascii_lowercase, a constant in the standard library's string module.
To make the comparison work for things like The and the, the input is converted to lowercase using lower().
The comparison is done using sets - one of Python's collection types. When the operator <= is used on sets, it checks if every element of the left hand set is in the right hand set ( https://docs.python.org/2/library/stdtypes.html#set.issubset ).
So it literally says "is (the set of (all the letters in the alphabet in lowercase)) a subset of (the set of (characters in the input - after converting those to lowercase))?".
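To see the subset check at work, here is a small sketch. The helper `missing_letters` is not part of the original code; it is only there to show which letters make the check fail.

```python
import string

alphaset = set(string.ascii_lowercase)

def missing_letters(text: str) -> set:
    """Letters of the alphabet that do not occur in `text` (case-insensitive)."""
    return alphaset - set(text.lower())

sentence = "The quick brown fox jumps over the lazy dog"
print(alphaset <= set(sentence.lower()))     # True
print(alphaset.issubset(sentence.lower()))   # True, issubset accepts any iterable
print(sorted(missing_letters("Hello world")))
```

The operator form needs a set on both sides, while the issubset method will take the lowercased string directly.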

Regex: "password must have at least 3 of the 4 of the following"

I'm a Regex newbie, and so far have only used it for simple things, like "must be a number or letter". Now I have to do something a bit more complex.
I need to use it to validate a password, which must be 8-16 characters, free of control/non-printing/non-ASCII characters, and must have at least three of the following:
one capital letter
one lowercase letter
one number 0-9
one symbol character ($, %, &, etc.)
I'm thinking what I have to do is write something like "one capital letter, lowercase letter and number, OR one capital letter, lowercase letter and one symbol, OR one capital letter, one number or one symbol, OR...." to cover all possible "3 out of 4" combinations, but that seems excessive. Is there a simpler solution?
The correct way to do this is to check all five conditions separately. However, I assume there is a reason you want a regex, so here you go:
/^((?=.*[A-Z])(?=.*[a-z])(?=.*\d)|(?=.*[a-z])(?=.*\d)(?=.*[\$\%\&])|(?=.*[A-Z])(?=.*\d)(?=.*[\$\%\&])|(?=.*[A-Z])(?=.*[a-z])(?=.*[\$\%\&])).{8,16}$/
Explanation:
We want to match the whole thing, hence we surround it with ^$
.{n,m} matches between n and m characters (8 and 16 in our case).
The general way you can check if a string contains something, without actually matching it is by using positive lookahead (?=.*X), where X is the thing you want to check. For example, if you want to make sure the string contains a lowercase letter you can do (?=.*[a-z]).
If you want to check if a string contains X, Y and Z, but without actually matching them, you can use the previous recipe by appending the three lookaheads (?=.*X)(?=.*Y)(?=.*Z)
We use the above to match three of the four things mentioned. We go through all possible combinations with | (or): CcD|cDS|CDS|CcS (c = lowercase letter, C = capital letter, D = digit, S = special)
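If you would rather not type the four alternatives by hand, the same pattern can be generated. The sketch below uses Python's re module. The `CLASSES` dictionary and its names are made up for illustration, and the symbol class only covers the three symbols used in the pattern above.

```python
import re
from itertools import combinations

CLASSES = {
    "upper": "[A-Z]",
    "lower": "[a-z]",
    "digit": r"\d",
    "symbol": r"[$%&]",  # assumption: same three symbols as above; extend as needed
}

# One alternative per 3-of-4 combination, each made of three lookaheads.
alternatives = [
    "".join(f"(?=.*{CLASSES[name]})" for name in combo)
    for combo in combinations(CLASSES, 3)
]
PATTERN = re.compile(r"^(?:" + "|".join(alternatives) + r").{8,16}$")

for pw in ["%5abCdefg", "&5ab", "5Bcdwefg", "BCADLKJSDSDFlk"]:
    print(pw, bool(PATTERN.match(pw)))
```

Like the hand-written regex, this only checks the length and the 3-of-4 rule; it does not reject control or non-ASCII characters.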
The best way to do this is by checking each condition separately. Performance will suffer if you try to fit all conditional criteria into one expression (see the accepted answer). I also highly recommend against limiting the length of the password to 16 chars; that is extremely insecure by modern standards. Try something more like 64 chars or, even better, 128, assuming your hashing architecture can handle the load.
You also didn't specify a language, but this is one way to do it in JavaScript:
var pws = [
  "%5abCdefg",
  "&5ab",
  "%5abCdef",
  "5Bcdwefg",
  "BCADLKJSDSDFlk"
];

function pwCheck(pw) {
  var criteria = 0;
  if (pw.toUpperCase() != pw) {
    // has lower case letters
    criteria++;
  }
  if (pw.toLowerCase() != pw) {
    // has upper case letters
    criteria++;
  }
  if (/^[a-zA-Z0-9]*$/.test(pw) === false) {
    // has special characters
    criteria++;
  }
  if (/\d/.test(pw) === true) {
    // has numbers
    criteria++;
  }
  // returns true if 3 or more criteria were met and length is appropriate
  return (criteria >= 3 && pw.length >= 8 && pw.length <= 16);
}

pws.forEach(function(pw) {
  console.log(pw + ": " + pwCheck(pw).toString());
});
Not sure if it's an iOS thing, but the regex with "\d" for digits [0-9] wasn't working as expected. An example string that had issues: "AAAAAA1$"
The fix below works fine in Objective-C and Swift 3:
^((?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])|(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[^a-zA-Z0-9])|(?=.*?[A-Z])(?=.*?[0-9])(?=.*?[^a-zA-Z0-9])|(?=.*?[a-z])(?=.*?[0-9])(?=.*?[^a-zA-Z0-9])).{8,16}$

How can I parse a char array with octal values in Python?

EDIT: I should note that I want a general case for any hex array, not just the google one I provided.
EDIT BACKGROUND: Background is networking: I'm parsing a DNS packet and trying to get its QNAME. I'm taking in the whole packet as a string, and every character represents a byte. Apparently this problem looks like a Pascal string problem, and using the struct module seems like the way to go.
I have a char array in Python 2.7 which includes octal values. For example, let's say I have an array
DNS = "\03www\06google\03com\0"
I want to get:
www.google.com
What's an efficient way to do this? My first thought would be iterating through the DNS char array and adding chars to my new array answer. Every time I see a '\' char, I would ignore the '\' and the two chars after it. Is there a way to get the resulting www.google.com without using a new array?
My disgusting implementation (my answer is an array of chars, which is not what I want; I want just the string www.google.com):
DNS = "\\03www\\06google\\03com\\0"
answer = []
i = 0
while i < len(DNS):
    if DNS[i] == '\\' and DNS[i+1] != 0:
        i += 3
    elif DNS[i] == '\\' and DNS[i+1] == 0:
        break
    else:
        answer.append(DNS[i])
        i += 1
Now that you've explained your real problem, none of the answers you've gotten so far will work. Why? Because they're all ways to remove sequences like \03 from a string. But you don't have sequences like \03, you have single control characters.
You could, of course, do something similar, just replacing any control character with a dot.
But what you're really trying to do is not replace control characters with dots, but parse DNS packets.
DNS is defined by RFC 1035. The QNAME in a DNS packet is:
a domain name represented as a sequence of labels, where each label consists of a length octet followed by that number of octets. The domain name terminates with the zero length octet for the null label of the root. Note that this field may be an odd number of octets; no padding is used.
So, let's parse that. If you understand how labels consisting of "a length octet followed by that number of octets" relate to "Pascal strings", there's a quicker way. Also, you could write this more cleanly and less verbosely as a generator. But let's do it the dead-simple way:
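import struct

def parse_qname(packet):
    components = []
    offset = 0
    while True:
        # Each label starts with a one-byte length.
        length, = struct.unpack_from('B', packet, offset)
        offset += 1
        if not length:
            # A zero length is the root label, which ends the name.
            break
        # unpack_from returns a tuple, so unpack its single element.
        component, = struct.unpack_from('{}s'.format(length), packet, offset)
        offset += length
        components.append(component)
    return components, offset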
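To get from the list of components to www.google.com, the labels still need to be joined with dots. Below is a sketch of the same loop for Python 3, where the packet is a bytes object rather than a str. The function name `qname_to_hostname` and the `offset` parameter are assumptions, and it does not handle the compression pointers that RFC 1035 also allows in names.

```python
import struct

def qname_to_hostname(packet: bytes, offset: int = 0):
    """Return (dotted name, offset just past the QNAME)."""
    labels = []
    while True:
        (length,) = struct.unpack_from("B", packet, offset)
        offset += 1
        if length == 0:  # zero-length root label ends the name
            break
        (label,) = struct.unpack_from(f"{length}s", packet, offset)
        offset += length
        labels.append(label.decode("ascii"))
    return ".".join(labels), offset

name, end = qname_to_hostname(b"\x03www\x06google\x03com\x00")
print(name, end)  # www.google.com 16
```

If the packet is truncated, struct.unpack_from raises struct.error instead of silently reading past the end, which is the validation benefit of parsing the lengths rather than pattern-matching.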
import re
DNS = "\\03www\\06google\\03com\\0"
m = re.sub("\\\\([0-9,a-f]){2}", "", DNS)
print(m)
Maybe something like this?
#!/usr/bin/python3
import re

def convert(adorned_hostname):
    result1 = re.sub(r'^\\03', '', adorned_hostname)
    result2 = re.sub(r'\\0[36]', '.', result1)
    result3 = re.sub(r'\\0$', '', result2)
    return result3

def main():
    adorned_hostname = r"\03www\06google\03com\0"
    expected_result = 'www.google.com'
    actual_result = convert(adorned_hostname)
    print(actual_result, expected_result)
    assert actual_result == expected_result

main()
For the question as originally asked, replacing the backslash-hex sequences in strings like "\\03www\\06google\\03com\\0" with dots…
If you want to do this with a regular expression:
\\ matches a backslash.
[0-9A-Fa-f] matches any hex digit.
[0-9A-Fa-f]+ matches one or more hex digits.
\\[0-9A-Fa-f]+ matches a backslash followed by one or more hex digits.
You want to find each such sequence, and replace it with a dot, right? If you look through the re docs, you'll find a function called sub which is used for replacing a pattern with a replacement string:
re.sub(r'\\[0-9A-Fa-f]+', '.', DNS)
I suspect these may actually be octal, not hex, in which case you want [0-7] rather than [0-9A-Fa-f], but nothing else would change.
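Trying both character classes on the sample string shows one difference worth knowing about. With the hex class, the + also swallows a label's first letter when it happens to be a to f, as with the c in com. A quick sketch:

```python
import re

DNS = "\\03www\\06google\\03com\\0"  # literal backslashes, as in the question

hex_version = re.sub(r"\\[0-9A-Fa-f]+", ".", DNS)
octal_version = re.sub(r"\\[0-7]+", ".", DNS)

print(hex_version)               # .www.google.om.
print(octal_version)             # .www.google.com.
print(octal_version.strip("."))  # www.google.com
```

Either way there is a leading and a trailing dot to strip, and a label that starts with a digit would still confuse the octal version, which is one more argument for parsing the lengths properly.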
A different way to do this is to recognize that these are valid Python escape sequences. And, if we unescape them back to where they came from (e.g., with DNS.decode('string_escape')), this turns into a sequence of length-prefixed (aka "Pascal") strings, a standard format that you can parse in any number of ways, including the stdlib struct module. This has the advantage of validating the data as you read it, and not being thrown off by any false positives that could show up if one of the string components, say, had a backslash in the middle of it.
Of course that's presuming more about the data. It seems likely that the real meaning of this is "a sequence of length-prefixed strings, concatenated, then backslash-escaped", in which case you should parse it as such. But it could be just a coincidence that it looks like that, in which case it would be a very bad idea to parse it as such.
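As a rough sketch of that second route in Python 3: 'string_escape' is a Python 2 codec, so this assumes 'unicode_escape' as its nearest Python 3 counterpart. The helper name `split_pascal` is made up, and the whole thing only makes sense if the data really is a backslash-escaped sequence of length-prefixed strings.

```python
escaped = "\\03www\\06google\\03com\\0"  # literal backslashes

# Turn the escape sequences back into single control characters.
raw = escaped.encode("ascii").decode("unicode_escape")

def split_pascal(data: str) -> list:
    """Split a run of length-prefixed strings, stopping at a zero length."""
    parts, i = [], 0
    while i < len(data):
        length = ord(data[i])
        if length == 0:
            break
        parts.append(data[i + 1 : i + 1 + length])
        i += 1 + length
    return parts

print(".".join(split_pascal(raw)))  # www.google.com
```

Because the lengths drive the parsing, a backslash or digit inside one of the components would not throw it off the way it would a regex.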

Strategy to replace spaces in string

I need to store a string with its spaces replaced by some character. When I retrieve it, I need to turn that character back into spaces. The strategy I have thought of: when storing, replace (space with _a) and (_a with _aa); when retrieving, replace (_a with space) and (_aa with _a). That way, even if the user enters _a in the string, it will be handled. But I don't think this is a good strategy. Does anyone have a better one?
Replacing spaces with something is a problem when something is already in the string. Why don't you simply encode the string - there are many ways to do that, one is to convert all characters to hexadecimal.
For instance
Hello world!
is encoded as
48656c6c6f20776f726c6421
The space is 0x20. Then you simply decode back (hex to ascii) the string.
This way there are no spaces in the encoded string.
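A minimal sketch of the hex idea in Python, assuming the text is encoded as UTF-8 first (the function names are made up):

```python
def to_hex(text: str) -> str:
    # Every byte becomes two hex digits, so the result has no spaces.
    return text.encode("utf-8").hex()

def from_hex(encoded: str) -> str:
    return bytes.fromhex(encoded).decode("utf-8")

print(to_hex("Hello world!"))  # 48656c6c6f20776f726c6421
print(from_hex("48656c6c6f20776f726c6421"))  # Hello world!
```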
-- Edit - optimization --
You replace all % and all spaces in the string with %xx where xx is the hex code of the character.
For instance
Wine having 12% alcohol
becomes
Wine%20having%2012%25%20alcohol
%20 is space
%25 is the % character
This way, neither % nor (space) are a problem anymore - Decoding is easy.
Encoding algorithm
- replace all `%` with `%25`
- replace all ` ` with `%20`
Decoding algorithm
- replace all `%xx` with the character having `xx` as hex code
(You may even optimize more since you need to encode only two characters: use %1 for % and %2 for space. But I recommend the %xx solution as it is more portable, and it can be reused later on if you need to encode more characters.)
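The encoding and decoding algorithms above could look like this in Python. This is a sketch with made-up function names; the order of the two replacements in `encode` matters.

```python
import re

def encode(text: str) -> str:
    # '%' must be escaped first, otherwise the '%' of a freshly written
    # '%20' would itself be turned into '%25'.
    return text.replace("%", "%25").replace(" ", "%20")

def decode(encoded: str) -> str:
    # Single pass: each %xx becomes the character with hex code xx.
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        encoded,
    )

stored = encode("Wine having 12% alcohol")
print(stored)          # Wine%20having%2012%25%20alcohol
print(decode(stored))  # Wine having 12% alcohol
```

Decoding in one pass is what keeps a literal "%20" typed by the user safe: it is stored as "%2520" and comes back as "%20", not as a space.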
I'm not sure your solution will work. When reading, how would you
distinguish between strings that were originally " a" and strings that
were originally "_a"? If I understand correctly, both will end up as
"_aa".
In general, given a situation where a specific set of characters cannot
appear as such, but must be encoded, the solution is to choose one of
allowed characters as an "escape" character, remove it from the set of
allowed characters, and encode all of the forbidden characters
(including the escape character) as a two (or more) character sequence
starting with the escape character. In C++, for example, a new line is
not allowed in a string or character literal. The escape character is
\; because of that, it must be encoded as an escape sequence as well.
So we have "\n" for a new line (the choice of n is arbitrary), and
"\\" for a \. (The choice of \ for the second character is also
arbitrary, but it is fairly usual to use the escape character, escaped,
to represent itself.) In your case, if you want to use _ as the
escape character, and "_a" to represent a space, the logical choice
would be "__" to represent a _ (but I'd suggest something a little
more visually suggestive, maybe ^ as the escape, with "^_" for
a space and "^^" for a ^). When reading, anytime you see the escape
character, the following character must be mapped (and if it isn't one
of the predefined mappings, the input text is in error). This is simple
to implement, and very reliable; about the only disadvantage is that in
an extreme case, it can double the size of your string.
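Here is a sketch of that escape-character scheme with ^ as the escape, "^_" for a space and "^^" for a ^. It is written in Python rather than C++, and the table and function names are only for illustration.

```python
ESCAPE = "^"
ENCODE = {" ": "^_", "^": "^^"}  # forbidden character -> escape sequence
DECODE = {"_": " ", "^": "^"}    # character after the escape -> original

def escape(text: str) -> str:
    return "".join(ENCODE.get(ch, ch) for ch in text)

def unescape(encoded: str) -> str:
    out = []
    chars = iter(encoded)
    for ch in chars:
        if ch != ESCAPE:
            out.append(ch)
            continue
        # The character after the escape must be one of the known mappings.
        nxt = next(chars, None)
        if nxt not in DECODE:
            raise ValueError(f"invalid escape sequence: {ESCAPE}{nxt}")
        out.append(DECODE[nxt])
    return "".join(out)

stored = escape("a ^b_c")
print(stored)            # a^_^^b_c
print(unescape(stored))  # a ^b_c
```

An unknown sequence such as "^x", or an escape character at the very end of the input, raises an error instead of being guessed at.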
Do you want to implement this using C/C++? I think you should split your string into multiple parts, separated by spaces.
If your string is like this: "a__b" (multiple consecutive spaces, shown here as underscores), it will be split into:
sub[0] = "a";
sub[1] = "";
sub[2] = "b";
Hope this helps!
If your strings can use X different characters, you cannot encode them with only X-1 characters while using a single output character per input character.
You can use a combination of 2 chars to replace a given character (this is exactly what you are trying in your example).
To do this, loop through your string and count the spaces; together with the string's length, that tells you how large the new character array must be. Then copy the string over, replacing each space with "//" (this is just an example). The problem with this approach is that you cannot have "//" in your input string.
Another approach would be to use a rarely used char, for example "^", to replace the spaces.
The last approach, which is popular, is a combination of these two. It is used in Unix and PHP to put a syntax character into a string as a literal. If you want a " inside a quoted string, you simply write it as \", and so on.
Why don't you use the Replace function?
String* stringWithoutSpace= stringWithSpace->Replace(S" ", S"replacementCharOrText");
So now stringWithoutSpace contains no spaces. When you want to put those spaces back,
String* stringWithSpacesBack= stringWithoutSpace ->Replace(S"replacementCharOrText", S" ");
I think just coding to ascii hexadecimal is a neat idea, but of course doubles the amount of storage needed.
If you want to do this using less memory, then you will need two-letter sequences, and have to be careful that you can go back easily.
You could e.g. replace blank by _a, but you also need to take care of your escape character _. To do this, replace every _ by __ (two underscores). You need to scan through the string once and do both replacements simultaneously.
This way, in the resulting text all original underscores will be doubled, and the only other occurrence of an underscore will be in the combination _a. You can safely translate this back. Whenever you see an underscore, you need a lookahead of 1 to see what follows. If an a follows, then this was a blank before. If _ follows, then it was an underscore before.
Note that the point is to replace your escape character (_) in the original string, and not the character sequence to which you map the blank. Your idea of replacing _a breaks, as you do not know if _aa was originally "_a" or " a" (blank followed by a).
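Both claims are easy to check with a few lines. The sketch below is in Python, the function names are made up, and each replacement is done in a single pass as described above.

```python
import re

def asker_scheme(text: str) -> str:
    # The scheme from the question: " " -> "_a" and "_a" -> "_aa".
    return re.sub(r"_a| ", lambda m: "_aa" if m.group() == "_a" else "_a", text)

def doubled_escape(text: str) -> str:
    # The scheme from this answer: "_" -> "__" and " " -> "_a".
    return re.sub(r"[_ ]", lambda m: "__" if m.group() == "_" else "_a", text)

def undo_doubled_escape(encoded: str) -> str:
    # After an underscore, look one character ahead to decide what it was.
    return re.sub(r"_([_a])", lambda m: "_" if m.group(1) == "_" else " ", encoded)

print(asker_scheme(" a"), asker_scheme("_a"))      # _aa _aa  (collision)
print(doubled_escape(" a"), doubled_escape("_a"))  # _aa __a  (distinct)
print(undo_doubled_escape("_aa"), undo_doubled_escape("__a"))
```

With the doubled underscore, every stored string decodes to exactly one original, which is what the question's scheme cannot guarantee.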
I'm guessing that there is more to this question than appears; for example, that the strings you are storing must not only be free of spaces, but must also look like words or some such. You should be clear about your requirements (and you might consider satisfying the curiosity of the spectators by explaining why you need to do such things).
Edit: As JamesKanze points out in a comment, the following won't work in the case where you can have more than one consecutive space. But I'll leave it here anyway, for historical reference. (I modified it to compress consecutive spaces, so it at least produces unambiguous output.)
std::string out;
char prev = 0;
for (char ch : in) {
    if (ch == ' ') {
        if (prev != ' ') out.push_back('_');
    } else {
        if (prev == '_' && ch != '_') out.push_back('_');
        out.push_back(ch);
    }
    prev = ch;
}
if (prev == '_') out.push_back('_');