Python: Grab text printed to console - python-2.7

I'm using a complicated python module called CAMFR. In one function, it calculates a value that I want to use (to plot or otherwise), but unfortunately the module prints this value to the python console but doesn't return it as a variable!
(I have poked through the source code to see if I can recompile the module to return the values, but this looks excessively difficult at my programming level, considering it's written in C++ and uses Boost etc. I just don't get it unfortunately.)
SO, option number two is to grab the text printed to the console and parse out the value I need.
How can I intercept or otherwise acquire this function's console output text in Python (2.7)? (I will RegEx it afterwards.)
Thanks!
Here is an example of the text printed to the Python console:
<<a lot of other output, and then at the end of the function:>>
...
# 1.05554 4.65843e-05 5.54592 0.0903205 1
# 1.05554 2.87907e-05 3.42757 0.0903205 1
# 1.05554 2.87907e-05 3.42757 0.0903205 1
Done pass 1: lambda 1.05554, gain 3.42757
I ultimately want to grab the Lambda=1.05554 & Gain=3.42757 values, for example, and store them in variables. Grabbing the entire console output of this one function would allow me to do that via a subsequent RegEx search, so I'm looking for a way to do that.
I apologize if there is another thread with the answer I need; I could not figure out Google search terms that got me what I needed. Thanks for your patience & generous help!
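One possible approach, sketched under assumptions: because the module is compiled C++, its output may bypass Python's sys.stdout entirely, so this sketch redirects the process-level file descriptor 1 into a temporary file instead. The names capture_fd_stdout and parse_lambda_gain are made up for illustration, and the regular expression assumes the final line always looks like the "Done pass" line in the sample output. If the C++ side buffers its output and does not flush before returning, some text could still be missed.

```python
import os
import re
import sys
import tempfile
from contextlib import contextmanager

# Assumed shape of the final line: "Done pass 1: lambda 1.05554, gain 3.42757"
DONE_LINE = re.compile(r"Done pass \d+: lambda ([-+.\deE]+), gain ([-+.\deE]+)")


@contextmanager
def capture_fd_stdout():
    """Redirect file descriptor 1 to a temp file; the text ends up in result['text']."""
    result = {"text": ""}
    sys.stdout.flush()              # push out anything Python has buffered
    saved_fd = os.dup(1)            # remember the real stdout
    tmp = tempfile.TemporaryFile(mode="w+b")
    try:
        os.dup2(tmp.fileno(), 1)    # fd 1 now points at the temp file
        yield result
    finally:
        sys.stdout.flush()
        os.dup2(saved_fd, 1)        # restore the real stdout
        os.close(saved_fd)
        tmp.seek(0)
        result["text"] = tmp.read().decode("utf-8", "replace")
        tmp.close()


def parse_lambda_gain(text):
    """Return (lambda, gain) from the last 'Done pass' line, or None if absent."""
    matches = DONE_LINE.findall(text)
    if not matches:
        return None
    lam, gain = matches[-1]
    return float(lam), float(gain)
```

The function call would go inside a "with capture_fd_stdout() as out:" block, after which parse_lambda_gain(out["text"]) gives the two values.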

Related

How do syntax highlighting tools implement automated testing?

How do syntax highlighting tools such as Pygments and TextMate bundles do automated testing?
Tools like these often rely on a large collection of text snippets, each pairing a chosen input with its expected output. For instance, if you look at the Pygments GitHub repository, you can see large sets of text files, each divided into an input section and a tokens section like so:
---input---
f'{"quoted string"}'
---tokens---
'f' Literal.String.Affix
"'" Literal.String.Single
'{' Literal.String.Interpol
'"' Literal.String.Double
'quoted string' Literal.String.Double
'"' Literal.String.Double
'}' Literal.String.Interpol
"'" Literal.String.Single
'\n' Text
Since a highlighting tool reads a piece of code and then has to identify which bits of text are parts of which bits of code (is this the start of a function? is this a comment? is it a variable name?), they usually perform various processing steps that will result in a list of tokens as above, which they can then feed into the next step (insert highlights from the first Literal.String.Interpol to the next, bold any Literal.String.Single, etc. by generating the appropriate HTML or CSS or other markup relevant to the system). Checking that these tokens are generated properly from the input text is key.
Then, depending on the language the tool is built in, you might use an existing testing framework or build your own (Pygments seems to use a Python-based tool called pytest). Either way, the test essentially consists of running each of the inputs through your tool in a loop, reading the output, and comparing it to the expected values. If the output doesn't match, you can display a message showing which test failed and what the input, output, expected, and error values were. If an output passes, you could simply signal with a happy green checkmark. Then, when the test finishes, the developer can hopefully reason out what they broke by looking over the results.
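As a rough illustration of that loop, here is a minimal sketch that parses one snippet in the input/tokens layout shown above and compares it with a tokenizer's output. It is not how Pygments itself implements its tests. The names parse_snippet, check_snippet and toy_tokenize are made up, and toy_tokenize is only a stand-in lexer (digits become Number, everything else Text) so that the example is self-contained.

```python
import ast
import re

SNIPPET = r"""---input---
x 42
---tokens---
'x ' Text
'42' Number
'\n' Text
"""


def parse_snippet(text):
    """Split a snippet into its input text and a list of (value, type) pairs."""
    _, _, rest = text.partition("---input---\n")
    source, _, token_block = rest.partition("---tokens---\n")
    expected = []
    for line in token_block.splitlines():
        if not line.strip():
            continue
        # The token type is the last whitespace-separated field; the rest is a
        # quoted Python string literal holding the token's text.
        value_repr, token_type = line.rsplit(None, 1)
        expected.append((ast.literal_eval(value_repr), token_type))
    return source, expected


def toy_tokenize(source):
    """Hypothetical stand-in lexer: runs of digits are Number, the rest is Text."""
    return [(m.group(), "Number" if m.group().isdigit() else "Text")
            for m in re.finditer(r"\d+|\D+", source)]


def check_snippet(tokenize, text):
    """Return None if the tokens match, else a message describing the mismatch."""
    source, expected = parse_snippet(text)
    actual = list(tokenize(source))
    if actual == expected:
        return None
    return "input %r: expected %r but got %r" % (source, expected, actual)
```

A runner would call check_snippet for every snippet file and collect the non-None messages as the failure report.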
It is often a good idea to randomize the order of these inputs so that you can be sure that no test has side effects that carry over to the next test and cause it to pass or fail incorrectly. It might also be a good idea to time the complete test run. If the whole thing was taking 12 seconds yesterday but now takes two minutes, we may have broken something even if all the tests technically "pass".
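A small sketch of both ideas, assuming each test case can be checked by a function that returns True or False. The name run_shuffled and the seed handling are illustrative choices: returning the seed makes it possible to replay an ordering that exposed a problem.

```python
import random
import time


def run_shuffled(cases, check, seed=None):
    """Run check(case) over the cases in random order.

    Returns (failures, elapsed_seconds, seed).
    """
    if seed is None:
        seed = random.randrange(10 ** 6)
    order = list(cases)
    random.Random(seed).shuffle(order)   # reproducible for a given seed
    start = time.time()
    failures = [case for case in order if not check(case)]
    elapsed = time.time() - start
    return failures, elapsed, seed
```

Comparing the elapsed time against the previous run's value is then a one-line check in whatever script drives the tests.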
In tools like a code highlighter, you often have a good idea of what many of the inputs and outputs will look like before you can code everything up, for instance if some spec document already exists. In that case, it may be a good idea to include tests that you know won't pass right away, mark them with some tag (perhaps a text marker within the file that says "NOT PASSING", or a particular file naming scheme), and tell your testing suite to expect those tests to fail. Then, as you fix bugs and add features, say you fix Bug X in your attempt to make test #144 pass. Now when you run the tests, the suite also alerts you that 10 other tests that should be failing are now passing. Congrats! You just saved yourself a lot of work trying to fix several separate problems that were actually caused by the same root issue.
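The bookkeeping for expected failures can be sketched as follows. This assumes each result is a (name, passed, expected_to_fail) tuple; how a test gets marked as expected to fail (file marker, file name, and so on) is left open, and the outcome labels are made up for illustration.

```python
def classify(passed, expected_to_fail):
    """Map one test result onto one of four outcome labels."""
    if passed:
        return "unexpected pass" if expected_to_fail else "pass"
    return "expected failure" if expected_to_fail else "FAIL"


def summarize(results):
    """Group test names by outcome: {label: [names]}."""
    summary = {}
    for name, passed, expected_to_fail in results:
        summary.setdefault(classify(passed, expected_to_fail), []).append(name)
    return summary
```

The "unexpected pass" group is the one that reports tests a bug fix happened to repair along the way.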
As the codebase is updated, a developer would run and rerun the tests to ensure that any changes they make don't break tests that were working before, and would then add new tests to the collection so that each new feature, fixed edge case, etc., has a known expected output that nobody can accidentally break in the future without noticing.

how to support output paging in C++ application

Our application can generate some fairly long report files interactively. We use C++ to generate all the output, but redirected through a TCL console and TCL channel so we can take advantage of output logging etc.
Is there any common way to support paging of output in C++? I've cast around but can't find anything.
Best
Sam
OK, so the situation is that you're writing to a Tcl_Channel that a Tcl interpreter is also writing to. That should work. The simplest way to put paging on top of that is to make that channel be one of the standard channels (I'd pick stdout) and feed the whole lot through a pager program like more or less. That'll only take you a few seconds to get working.
Otherwise, it's possible to write a channel in Tcl 8.5 using just Tcl code; that's what a reflected channel is (that's the Tcl 8.6 documentation, but it works the same way in 8.5). However, using that to do a pager is going to be quite a lot of work; channels work with bytes not characters. It's probably also possible to do it using a stacked channel transformation (8.6 only).
However, if sending the output to a Tk text widget is acceptable (I know it isn't precisely what you asked for…) there's already a package in Tcllib for it.
package require Tk
package require tcl::chan::textwindow
pack [text .t]
set channel [tcl::chan::textwindow .t]
puts $channel "This is a simple test."
That (write-only) channel will work fine if you pass it to your C++ code to use. (You can inspect the source to see how it is done if you wish; the code is pretty short.)

Adding a script in a C++ application

I'm working on a project simulating a stock market. People buy and sell a stock, and each turn I would like to call a script to try a strategy against the market.
What I want is a function in C++ which sends a vector of integers as an argument to a VBA or Python script, which returns an array of 3 integers.
I've searched for a solution, but all I could find is a way to execute a Python script; I don't know how to send an argument to the script or get a result back from it.
I think my problem is common, but I don't know where to look for a solution.
Thank you!
(I'm not a native english speaker so sorry if I made grammar error)
On Windows you use the function CreateProcess() to start another program. Use the full path of the Python interpreter as the first argument. Start the second argument with the path to the Python script.
If you can fit a string representing your vector in 32768 characters, you can supply the vector in the second argument to CreateProcess.
A more flexible option is to create a child process with redirected input and output, as shown here. You can then write the vector to the standard input of the Python process and read the answer back from its standard output.
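On the Python side, the script for that approach could look roughly like this. The wire format (one line of space-separated integers in, three space-separated integers out) is an assumption, as is the placeholder strategy in choose_action, which just returns the minimum, maximum and last price.

```python
def choose_action(prices):
    """Hypothetical strategy: return three integers (min, max, last price)."""
    if not prices:
        return (0, 0, 0)
    return (min(prices), max(prices), prices[-1])


def reply_for(line):
    """Turn one request line such as '3 1 4 1 5' into the reply '1 5 5'."""
    prices = [int(tok) for tok in line.split()]
    return "%d %d %d" % choose_action(prices)


# In the real script, one request per turn would be answered like this
# (after "import sys"):
#     sys.stdout.write(reply_for(sys.stdin.readline()) + "\n")
#     sys.stdout.flush()
```

Flushing after each reply matters when the C++ parent keeps the child process alive across turns, since otherwise the reply can sit in Python's output buffer.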
There are many ways to do this.
The way I would do it is to popen() your "script" [which would be something like "python myscript.py -arg1 -arg2"]. Depending on how large your vector is, you could either store it in a file or pass it as part of the arguments [there is a limit in Windows of something like 8KB for the argument string].
The output would then appear as the result from popen()'s pipe.
Use Boost.Python. It will help you to embed python in your app.

Python: How do you make an area code finder?

def areacode():
    code={}
    cont='Yes'
    while cont == 'Yes':
        num=int(raw_input('Type in a zip code:')
        if num==407:
            print "Found in Florida"
        elif num==718:
            print "Found in New York"
        elif num==201:
            print "Found in New Jersey"
        elif num==408:
            print "Found in California"
        else:
            print "Zip code not found."
        cont=raw_input("Would you like to continue? Yes or No?: ")
I am stuck on how to continue and what to do next. I know what I have is not much, but any direction as to where to go next would be nice. How would I make this into a nested dictionary?
And there happens to be an error in my if statement, it's telling me that there is an invalid syntax. I don't seem to see what's wrong though.
It seems like you have a good start and a few options ahead of you.
Make sure you're reading the proper documentation when you look at the following advice.
1.) You could check the site's information in real time. I wouldn't recommend this method as, though it is facilitated in Python, it's still the most difficult option and, at your presumed level, I'd assume it's overkill.
If you wanted to go this route however, I would check here for more information -- there's a module to help you out!
2.) Grab the data yourself, stick it in a text file (or CSV, which is a type of data file prime for this type of activity) and then have your program grab data from the text file. It's much easier to grab information in the format you want when you're doing the "heavy lifting" as it were, of getting the information from the website. I'd suggest this method because the state-zip code relation is not likely to change in the time span that you'll be using this program.
3.) Hardcode the zip code - state combinations. This is not recommended and would take a very, very, long time.
Basically, your options are between difficulty in coding and difficulty at run time. 3 is the longest to code, but the easiest to use (don't do 3). 1 is the theoretically easiest (when talking about program length) to code but the hardest to run (as it has to grab the data each time).
I would, as you've probably gathered, suggest 2. Take the data how you want it, put it in a text file in the same folder as the program, and use this documentation to get you in the right direction.
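A minimal sketch of option 2, assuming the data file holds one "code,state" pair per line. The function names and the file name codes.csv mentioned below are placeholders. As a side note on the question itself, the invalid syntax reported at the if statement most likely comes from the line above it, where int(raw_input(...) is missing its closing parenthesis.

```python
import csv


def load_codes(lines):
    """Build a {code: state} dictionary from CSV rows of the form code,state."""
    table = {}
    for row in csv.reader(lines):
        if len(row) >= 2:
            table[row[0].strip()] = row[1].strip()
    return table


def lookup(table, code):
    """Return the state for a code, or a not-found message."""
    return table.get(str(code).strip(), "Code not found.")
```

With a real file this would be used as "with open('codes.csv') as f: table = load_codes(f)", and the long if/elif chain in the question collapses into a single lookup(table, num) call.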
Good luck!

How to run a dictionary search against a large text file?

We're in the final stages of shipping our console game. On the Wii we're having the most problems with memory of course, so we're busy hunting down sloppy coding, packing bits, and so on.
I've done a dump of memory and used strings.exe (from sysinternals) to analyze it, but it's coming up with a lot of gunk like this:
''''$$$$ %%%%
''''$$$$%%%%####&&&&
''''$$$$((((!!!!$$$$''''((((####%%%%$$$$####((((
''))++.-$$%&''))
'')*>BZf8<S]^kgu[faniwkzgukzkzkz
'',,..EDCCEEONNL
I'm more interested in strings like this:
wood_wide_end.bmp
restroom_stonewall.bmp
...which mean we're still embedding some kinds of strings that need to be converted to ID's.
So my question is: what are some good ways of finding the stuff that's likely our debug data that we can eliminate?
I can do some rx's to hack off symbols or just search for certain kinds of strings. But what I'd really like to do is get a hold of a standard dictionary file and search my strings file against that. Seems slow if I were to build a big rx with aardvaark|alimony|archetype etc. Or will that work well enough if I do a .NET compiled rx assembly for it?
Looking for other ideas about how to find stuff we want to eliminate as well. Quick and dirty solutions, don't need elegant. Thanks!
First, I'd get a good word list. This NPL page has a good list of word lists of varying sizes and sources. What I would do is build a hash table of all the words in the word list, and then test each word that is output by strings against the word list. This is pretty easy to do in Python:
import sys

# Load the word list into a set for fast membership tests.
with open('your-word-list') as dictfile:
    wordlist = frozenset(word.strip() for word in dictfile)

for line in sys.stdin:
    # If any word in the line is in our list, print out the whole line.
    for word in line.split():
        if word in wordlist:
            sys.stdout.write(line)  # line already ends with a newline
            break
Then use it like this:
strings myexecutable.elf | python myscript.py
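One caveat with splitting on whitespace is that a string like wood_wide_end.bmp is tested as a single token, so it would not match a plain word list. A possible variation, sketched here with a made-up helper name, breaks each line into alphabetic fragments first. It assumes the word list is lowercase, and the minimum fragment length of 3 is an arbitrary choice to cut down on accidental matches in binary junk.

```python
import re


def dictionary_hits(line, wordlist, min_len=3):
    """Return the dictionary words found among the alphabetic fragments of line."""
    fragments = re.findall(r"[A-Za-z]+", line)
    return [frag.lower() for frag in fragments
            if len(frag) >= min_len and frag.lower() in wordlist]
```

A line would then be printed whenever dictionary_hits(line, wordlist) is non-empty.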
However, I think you're focusing your attention in the wrong place. Eliminating debug strings has very diminishing returns. Although eliminating debugging data is a Technical Certification Requirement that Nintendo requires you to do, I don't think they'll bounce you for having a couple of extra strings in your ELF.
Use a profiler and try to identify where you're using the most memory. Chances are, there will be a way to save huge amounts of memory with little effort if you focus your energy in the right place.
This sounds like an ideal task for a quick-and-dirty script in something supporting regex's. I'd probably do something in python real quick if it was me.
Here's how I would proceed:
Every time you encounter a string (from the strings.exe output), prompt the user as to whether they'd like to remember it in the dictionary or permanently ignore it. If the user chooses to permanently ignore the string, then whenever it's encountered in the future, throw it away without prompting. You can optionally keep an anti-dictionary file around to remember this for future runs of your script. Build up the dictionary file, and for each string keep a count or any other info you'd like. Optionally sort by the number of times the string occurs, so you can focus on the most egregious offenders.
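The counting and sorting part of that procedure might be sketched like this. The interactive prompting is left out, and the ignored argument stands in for the contents of the anti-dictionary file. The function name is hypothetical.

```python
from collections import Counter


def rank_strings(lines, ignored=()):
    """Count each non-ignored string; return (string, count) pairs, most frequent first."""
    ignored = set(ignored)
    stripped = (line.strip() for line in lines)
    counts = Counter(s for s in stripped if s and s not in ignored)
    return counts.most_common()
```

Feeding it the lines of the strings.exe output gives the worst offenders at the top of the list.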
This sounds like an ideal task for learning a scripting language. I wouldn't bother messing with C#/C++ or anything real fancy to implement this.