Let's say there is a certain way of encrypting strings:
Append the character $, which is the first character in the alphabet, at the end of the string.
Form all the strings we get by continuously moving the first character to the end of the string.
Sort all the strings we have gotten into alphabetical order.
Form a new string by appending the last character of each sorted string to it.
For example, the word FRUIT is encrypted in the following manner:
We append the character $ at the end of the word:
FRUIT$
We then form all the strings by moving the first character at the end:
FRUIT$
RUIT$F
UIT$FR
IT$FRU
T$FRUI
$FRUIT
Then we sort the new strings into alphabetical order:
$FRUIT
FRUIT$
IT$FRU
RUIT$F
T$FRUI
UIT$FR
The encrypted string:
T$UFIR
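The steps above can be sketched in Python as follows. This is only an illustration, not part of the original question: the function name `bwt_encode` is made up here, and it assumes the marker `$` sorts before every other character in the input under Python's default string ordering (true for letters and digits).

```python
def bwt_encode(text, marker="$"):
    """Apply the four steps: append marker, rotate, sort, take the last column."""
    s = text + marker
    # Every rotation obtained by moving leading characters to the end.
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort()
    # The result is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


print(bwt_encode("FRUIT"))  # prints T$UFIR
```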
Now my problem is obvious: how do I decrypt a given string into its original form?
I've been pounding my head for half a week now and I've finally run out of paper.
How should I get on with this?
What I have discovered:
if we have the last step of the encryption:
$FRUIT
FRUIT$
IT$FRU
RUIT$F
T$FRUI
UIT$FR
We can know the first and last character of the original string, since the rightmost column is the encrypted string itself, and the leftmost column is always in alphabetical order. The last character is the first character of the encrypted string, because $ is always first in the alphabet, and it only exists once in a string. Then, if we find the $ character from the rightmost column, and look up the character on the same row in the leftmost column, we get the first character.
So what we can know about the encrypted string T$UFIR is that the original string is F***T$, where * is an unknown character.
There end my ideas. Now I have to turn to the world wide web and ask another human being: how?
You could say this is homework, and, being familiar with my tutor, I'd bet this is a dynamic programming problem.
This is the Burrows-Wheeler transform.
It's an algorithm typically used for aiding compression algorithms, as it tends to group together common repeating phrases, and is reversible.
To decode your string:
Number each character:
T$UFIR
012345
Now sort, retaining the numbering. If characters repeat, you use the indices as a secondary sort-key, such that the indices for the repeated characters are kept in increasing order, or otherwise use a sorting algorithm that guarantees this.
$FIRTU
134502
Now we can decode. Start at the '$', and use the associated index as the position, in the sorted row, of the next character to output ('$' = 1, so the next char is the one at position 1, 'F'. 'F' is 3, so the next char is 'R', etc...)
The result:
$FRUIT
So just remove the marker character, and you're done.
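A minimal Python sketch of this decoding procedure, assuming the marker `$` occurs exactly once and sorts first. The name `bwt_decode` is made up for illustration. Python's `sorted` is stable, so repeated characters keep their indices in increasing order, as required above.

```python
def bwt_decode(encoded, marker="$"):
    """Invert the transform by following the index chain from the marker."""
    # order[k] is the original index of the k-th character after a stable sort.
    order = sorted(range(len(encoded)), key=lambda i: encoded[i])
    sorted_chars = [encoded[i] for i in order]

    pos = sorted_chars.index(marker)  # start at the '$'
    out = []
    for _ in range(len(encoded) - 1):  # every character except the marker
        pos = order[pos]               # the index tells us where to look next
        out.append(sorted_chars[pos])
    return "".join(out)


print(bwt_decode("T$UFIR"))  # prints FRUIT
```

The marker is left out of the result here, so no separate removal step is needed.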
I'm at my wits' end here (which isn't saying very much). My program takes two forms of user input: a sentence and a word to be stripped from that sentence. The sentence has a fixed size of 101 characters, and the word to be stripped has a fixed size of 5. Although strings and vectors would make this much easier, I would like to use only char arrays.
The problem that I seem to be having is comparing arrays of two different sizes. I try to loop through the sentence length and compare each individual element in the sentence and key word to be removed. The issue is that I can't get the key word to compare with every word in the sentence, but only the first (if there's a match). The code to compare these two char arrays is
for(int i = 0; i < strlen(sentence); i++)
{
if (sentence[i] == keyword[i])
sentence[i] = ' ';
}
The thought process here is to directly compare each individual element, and if there's a match, replace that instance with a space, to 'deconstruct' the sentence. That way, the user can see how their sentence looks after removing all instances of their 'key' word entered. For example, the keyword 'anti' compared to the sentence 'antivenom is antinice' would yield 'venom is nice.'
Assume that all user input is converted to lowercase and punctuation is removed to ease this comparison.
What am I doing wrong here? How can I compare the 'key' word to every single word found in the user sentence? Any advice would be greatly appreciated. Thank you.
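The loop above uses the same index `i` for both arrays, so the keyword is only ever lined up with the start of the sentence. One possible fix is to try the keyword at every starting position, using a second index for the keyword. The sketch below shows that idea in Python on a list of characters, standing in for the char array. The name `blank_keyword` is hypothetical, and it blanks matches with spaces as the question's code does rather than closing the gap.

```python
def blank_keyword(sentence, keyword):
    """Replace every occurrence of keyword in sentence with spaces."""
    chars = list(sentence)  # mutable, like a char array
    n, k = len(chars), len(keyword)
    if k == 0:
        return sentence
    i = 0
    while i + k <= n:
        # Compare keyword[j] against sentence[i + j], not sentence[j].
        if all(chars[i + j] == keyword[j] for j in range(k)):
            for j in range(k):
                chars[i + j] = " "
            i += k  # skip past the blanked-out match
        else:
            i += 1
    return "".join(chars)


print(blank_keyword("antivenom is antinice", "anti"))
```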
I have these strings:
fghs13412asdf
dfs234245gk
and want to return the position of the last numeric character, like so:
5
3
Perhaps there is something different in LibreOffice than Excel, where I'm seeing all the examples. Here's one that should be straightforward, and is returning an error.
Do you need the position of the first numeric character (as in the heading) or of the last one (as in the body of your question)?
If it's the first one, a simple SEARCH() function using regular expressions should do the trick, e.g. =SEARCH("([:digit:])";A1).
If it's the last one, counted from the start of the string, you can use a different regex (adapted from an answer in the OpenOffice forums by gerard24): =SEARCH("[0-9][^[0-9]]+$";A1).
If you need the position of the last numeric character, counted from the end of the string, just subtract the value calculated in step 2 from the LEN() of the entire string: =LEN(A1)-(SEARCH("[0-9][^[0-9]]+$";A1)).
You'll get a #VALUE! error if there's no numeric character, or if the last character of the input string is numeric. Note that whitespace in the string will be ignored.
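To cross-check the spreadsheet results outside LibreOffice, the "last digit, counted from the start" position can be computed with a short Python sketch. This is not a LibreOffice formula: the function name is made up, and the lookahead pattern is a different regex from the one above, chosen so that a trailing digit is also handled.

```python
import re


def last_digit_position(text):
    """Return the 1-based position of the last ASCII digit, or None if there is none."""
    # A digit followed only by non-digits up to the end of the string.
    match = re.search(r"[0-9](?=[^0-9]*\Z)", text)
    return match.start() + 1 if match else None


print(last_digit_position("fghs13412asdf"))  # prints 9
print(last_digit_position("dfs234245gk"))    # prints 9
```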
I have a field whose value is a concatenated set of fields delimited by | (pipe),
Note:- escape character is also a pipe.
Given:
AB|||1|BC||DE
Required:
["AB|","1","BC|DE"]
How can I split the given string into an array or list without iterating character by character (i.e. using regex or any other method) to get what is required?
If there's an unused character you can substitute for the doubled-pipe you could do this:
groovy:000> s = "AB|||1|BC||DE"
===> AB|||1|BC||DE
groovy:000> Arrays.asList(s.replaceAll('\\|\\|', '#').split('\\|'))*.replaceAll(
'#', '|')
===> [AB|, 1, BC|DE]
Cleaned up with a magic char sequence and using tokenize it would look like:
pipeChars = 'ZZ' // or whatever
s.replaceAll('\\|\\|', pipeChars).tokenize('|')*.replaceAll(pipeChars, '|') // tokenize takes delimiter characters, not a regex
Of course this assumes that it's valid to go left-to-right across the string grouping the pipes into pairs, so each pair becomes a single pipe in the output, and the left-over pipes become the delimiters. When you start with something like
['AB|', '|1', 'BC|DE']
which gets encoded as
AB|||||1|BC||DE
then the whole encoding scheme falls apart: it's entirely unclear how to group the pairs of pipes in order to recover the original values. 'X|||||Y' could have been generated by ['X|','|Y'] or ['X||', 'Y'] or ['X', '||Y'], and there is no way to know which it was.
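The same substitution trick, and the ambiguity just described, can be sketched in Python. The helper names `split_escaped` and `encode` are made up for illustration, and the sketch assumes the NUL character never appears in the data, so it can serve as the placeholder.

```python
def split_escaped(s, placeholder="\0"):
    """Split on single pipes, treating each left-to-right '||' pair as a literal pipe."""
    parts = s.replace("||", placeholder).split("|")
    return [part.replace(placeholder, "|") for part in parts]


def encode(fields):
    """Double every pipe inside a field, then join the fields with single pipes."""
    return "|".join(field.replace("|", "||") for field in fields)


print(split_escaped("AB|||1|BC||DE"))  # prints ['AB|', '1', 'BC|DE']

# Three different inputs produce the same encoded string, so decoding is ambiguous.
print(encode(["X|", "|Y"]), encode(["X||", "Y"]), encode(["X", "||Y"]))
```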
How about using the split('|') method - but from what you provided, it looks like you can also have the '|' character in the field value. Any chance you can change the delimiter character to something that is not in the resulting values?
I'm trying to find any occurrences of a character repeating more than 2 times in a user entered string. I have this, but it doesn't go into the if statement.
password = asDFwe23df333
s = re.compile('((\w)\2{2,})')
m = s.search(password)
if m:
print ("Password cannot contain 3 or more of the same characters in a row\n")
sys.exit(0)
You need to prefix your regex with the letter 'r', like so:
s = re.compile(r'((\w)\2{2,})')
If you don't do that, then you'll have to double up on all your backslashes, since Python treats a backslash as an escape character in its normal strings. Since that makes regexes even harder to read than they normally are, most regexes in Python include that prefix.
Also, in your included code your password isn't in quotes, but I'm assuming it has quotes in your code.
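A small sketch of why the prefix matters: in a normal string, `\2` is an octal escape for the single character with code 2, so the regex engine never sees a backreference. The variable names below are only for illustration.

```python
import re

non_raw = "\2"  # one character: chr(2)
raw = r"\2"     # two characters: a backslash and '2'

pattern = re.compile(r"((\w)\2{2,})")
match = pattern.search("asDFwe23df333")
repeated = match.group(1) if match else None

print(repr(non_raw), repr(raw), repeated)  # prints '\x02' '\\2' 333
```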
Can't you simply go through the whole string and, every time you find a character equal to the previous one, increment a counter until it reaches the value of 3? If the character is different from the previous one, it's only a matter of setting the counter back to 0.
EDIT:
Or, you can use:
s = 'aaabbb'
re.findall(r'((\w)\2{2,})', s)
And check if the list returned by the second line has any elements.
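The counter idea from the start of this answer could look roughly like the sketch below. The name `has_run` is made up, and unlike the `\w` regex it counts runs of any character, including spaces and punctuation. The counter here tracks the length of the current run, so it restarts at 1 rather than 0.

```python
def has_run(text, limit=3):
    """Return True if any character occurs `limit` or more times in a row."""
    run = 0
    previous = None
    for ch in text:
        # Extend the current run, or start a new run of length 1.
        run = run + 1 if ch == previous else 1
        if run >= limit:
            return True
        previous = ch
    return False


print(has_run("asDFwe23df333"))  # prints True
```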
Let's say that we have a string declared...
string paragraphy = "This is a really really long string containing a paragraph long content. I want to wrap this text by using a for loop to do so.";
I want to wrap the text in this string variable whenever a line would be more than 60 characters wide, breaking at a space near the 60-character mark.
Can someone please provide me with the code or any help in creating something like this.
A basic approach to solving this is to keep track of the last space in a segment of the string before the 60th character in that segment.
Since this is homework, I'll let you come up with the code, however here's some rough pseudo-code of the above suggestion:
- current_position = start of the string
- WHILE current_position NOT past the end of the string
  - LOOP 1...60 from the current_position (also don't go past the end of the string)
    - IF the character at the loop position is a space, set the space_position to this position
  - Replace the character (the space) at the space_position with a newline
  - Set the current_position to the next character after the space_position
- If you're printing the string rather than inserting newline characters into it, you would print any remaining part of the string here.
You might also want to consider the case where you don't have any spaces in a block of 60 characters.
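For reference, here is one way the pseudo-code could be turned into Python. It is a sketch under a few assumptions, not the only reading of the pseudo-code: the name `wrap` is made up, `str.rfind` stands in for the inner loop that remembers the last space, and a block with no space at all is simply cut at the width.

```python
def wrap(text, width=60):
    """Break text into lines of at most `width` characters, preferring spaces."""
    lines = []
    pos = 0
    while len(text) - pos > width:
        # Last space among the next width + 1 characters (the inner loop above).
        space = text.rfind(" ", pos, pos + width + 1)
        if space == -1:
            # No space in this block: fall back to a hard break.
            lines.append(text[pos:pos + width])
            pos += width
        else:
            lines.append(text[pos:space])
            pos = space + 1  # continue after the space
    lines.append(text[pos:])  # whatever remains fits on one line
    return "\n".join(lines)


paragraph = ("This is a really really long string containing a paragraph "
             "long content. I want to wrap this text by using a for loop "
             "to do so.")
print(wrap(paragraph))
```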