Python application cannot search for strings that contain utf-8 characters - python-2.7

I have created a small tool using Tkinter that takes a string from an Entry widget, searches for that string in multiple files, and displays the names of the files that contain it in a Listbox. All the files are already UTF-8 encoded.
The problem is this: when I run my code from the IDE (PyCharm) and enter a search string that contains a non-ASCII (UTF-8) character in the tool's UI, it works fine and finds all the files that contain it.
But if I build an exe from that code (using py2exe), launch the tool, and enter the same string, it cannot find the string and the code keeps searching non-stop. (With plain ASCII strings it works fine.)
In the application code, I import codecs and open each file with:
codecs.open(SourceFile, encoding='utf-8')
Please help me work out how to make the exe behave correctly as well and search for these strings successfully.
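A minimal sketch of the search step described above, under some assumptions: the helper names to_text and find_files_containing are hypothetical, io.open stands in for codecs.open (both decode while reading), and the search term is decoded to a unicode string first in case the frozen build hands it over as UTF-8 bytes. Whether that is what actually goes wrong under py2exe is not established by the question, so treat this as something to try rather than a confirmed fix.

```python
import io


def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string, decoding bytes if needed."""
    # On Python 2, bytes is str, so a plain str search term is decoded here.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def find_files_containing(term, paths, encoding="utf-8"):
    """Return the paths whose decoded contents contain the search term."""
    needle = to_text(term, encoding)
    matches = []
    for path in paths:
        # Decode while reading so both sides of the comparison are text.
        with io.open(path, encoding=encoding) as handle:
            if needle in handle.read():
                matches.append(path)
    return matches
```

With this shape, the Listbox would be filled from the returned list once the loop has finished, so the search always terminates after the last file.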

Related

I have special characters in list and it breaks SikuliX

I am trying to collect paths into a list, and everything works fine until I hit special characters like ä or ö. In the string they are represented as bytes; for example, ä is \xe4. If I run the same Python script in a terminal, all the paths are printed correctly even though the paths in the list contain these bytes instead of the actual letters.
Here is my code where I extract all the filenames:
from os import listdir
from os.path import isfile, join

def read_files(path):
    """
    Read all files in folder specified by path
    :param path: Path to folder which contents will be read
    :return: List of all files in folder specified by path
    """
    files = []
    for f in listdir(path):
        if isfile(join(path, f)):
            files.append(make_unicode(join(path, f)))
    return files

def make_unicode(string):
    # Decode byte strings only; unicode objects pass through unchanged.
    if type(string) != unicode:
        string = string.decode('utf-8')
    return string
I have no idea where to go from here. I have tried practically everything I could find on Google. This is more of a SikuliX problem than a Python one, because the Python code works just fine outside SikuliX.
I use Python 2.7 and SikuliX 1.1.1.
So I got this sorted. The problem was that read_files(path) was called again later, and when the path was a unicode string with special characters represented as bytes, the whole thing broke. I changed my code so that this function is called only once, and after that I was able to work with special characters.

How do you print unicode text to an output file?

I'm writing a C++ program in Visual Studio for class. I am using certain Unicode characters within my program like:
╚, █, ╗, ╝, & ║
I have figured out how to print these characters to the console properly, but I have yet to find a way to output them to a file properly.
In Visual Studio, choosing [OEM United States - Codepage 437] encoding when saving the .cpp file allows it to display properly onto the console.
Now I just need a way to output these characters to a file without errors.
Hopefully someone knows how. Thank You!
Create the file using a wofstream, which uses wide (wchar_t) characters, instead of an ofstream, which uses char.

Encoding problems C++

I have this program that is supposed to load everything from a .txt file into a string and then display it. The problem is that when I import the contents of the file, they look different from what I see when I view the file in a simple text editor. This is what it looks like in a text editor:
bvwÅ.wÅ.Å}.ÅsqÄsÇ.sÑs|.]po{o.r}sÅ|Ç.y|}Ö.op}ÉÇ.wÇ
And this is what it looks like when it's imported and printed in my program:
bvw\201.w\201.\201}.\201sq\200s\202.s\204s|.]po{o.r}s\201|\202.y|}\205.op}\203\202.w\202
It seems like some characters are being encoded in a strange way, e.g. Swedish "å" is stored as "\201". I want all of the text that my program handles to be Unicode, so that I can convert characters back and forth between chars and ints. This is how I import the text file:
//Imports the entire file as a string
string toBeDecrypted;
string line;
bool firstLine = true;
while (getline(inputFile, line)) {
    //Append each line instead of overwriting what was read so far
    if (!firstLine) toBeDecrypted.append("\n");
    toBeDecrypted.append(line);
    firstLine = false;
}
inputFile.close();
My program also writes to files, so I want it to write in Unicode too.
EDIT
I solved the problem by changing the way the input file is created; it no longer contains any extended-ASCII characters.

Writing an interpreter in C++

I'm working on a C++ project which should do the following operations:
Open a .txt file which contains list of strings
(for example String1: "Hi,name_1_is,;Ondrej,age24;year,,88;") with optional values determined by empty commas ",,".
After this check each string using regular expressions for valid input
(like "Hi" shouldn't be a number or "1" must be a number and everything with ",," is optional and can be skipped or user can enter this value as well).
Then evaluate the result and save it to variable or new .txt generated file.
This result shows if whole string is correct with an "ok" message attached to it or it will attach "not ok" message right to the parameter with wrong input.
I have already finished the part with opening a .txt file, checking the whole string and saving the right strings to the new file (using Qt and Visual Studio 2010 Express).
I need to do the part where each parameter is checked, but I don't know exactly how, because I should not build a parser; the whole program must be built like an interpreter.
Actually I'm stuck at this point because I have no idea how to start building this as an interpreter.
All my attempts ended up with a structure similar to a parser
(that means: I split the string, then checked each token or char using regex, then put the string back together again, etc.)
Could you please provide me with some useful links or tips on how to achieve that, or at least on where to start?

Folder with 1300 png files into html images list

I've got folder with about 1300 png icons. What I need is html file with all of them inside like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?
If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro-ability like notepad++ to record modifying a single line like you desire, then hit play macro for the remainder of the file. If it's dynamic, use PHP.
I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file containing all the file names, generated by dir or ls at the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace are all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
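The steps above (check and strip the extension, turn dashes into spaces, concatenate) can be sketched as follows. This is written in Python rather than C++, purely as an illustration; the function names image_tag and image_list and the src_dir prefix "icons" are made up for the example. It assumes the alt text gets spaces while the id keeps the dashes, as in the sample line from the question, and that file names contain no characters that would need HTML escaping.

```python
import os


def image_tag(file_name, src_dir="icons"):
    """Turn one file name such as 'arrow-left.png' into an <img> line."""
    # Check that the last four characters exist and are ".png".
    if len(file_name) <= 4 or file_name[-4:].lower() != ".png":
        raise ValueError("not a .png file name: %r" % (file_name,))
    stem = file_name[:-4]               # strip the extension
    alt_text = stem.replace("-", " ")   # dashes become spaces for alt
    return '<img src="%s/%s" alt="%s" id="%s" class="icon"/>' % (
        src_dir, file_name, alt_text, stem)


def image_list(folder, src_dir="icons"):
    """Return one <img> line per .png file found in folder, sorted by name."""
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(".png"))
    return "\n".join(image_tag(n, src_dir) for n in names)
```

Printing image_list("icons") and redirecting the output to a file would give the body of the HTML page; as suggested above, it is worth getting image_tag right on a single name first.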