iMacros formatting random numbers as currency, does not affect the same field each time - imacros

This has me stumped. Running a basic data input macro taking data from a CSV file. I have tried formatting the numbers as both general and numbers in excel. It seems to affect random fields - for example, on one run it will affect boxes 1, 3, and 7, and on run 2 it will affect boxes 2, 5, and 8.
For example, this is a line of code:
TAG POS=1 TYPE=INPUT:TEXT ATTR=ID:Zip CONTENT={{!COL1}}
In half of the boxes, the entry will be formatted as 58123, and the other half $58,123.00.
I believe this may be a bug related to the implementation of the firefox plugin I am using, as I just ran the code on the same form twice in a row. Same data, same page, no reloads or anything. It had the same issue both times, however the boxes it affected were different. Boxes that were formatted correctly the first time were formatted incorrectly the second, and vice versa. It may also be caused by the plugin being thrown off by boxes that format numbers as currency automatically, as the macro will just "paste" the value in the box without triggering the formatting script, so I would get results like this:
cost: 150
cost: $150
cost: 150
when all of them should read $150 and there is a JS currency formatting function applied to the box. The end result being that numbers that should be formatted as currency are not, and numbers that should not be formatted as currency are.
#chivracq
I slowed timing down to one event every 2s and it did not improve the issue. I am using iMacros 8.9.7 on firefox 47. Windows 10. iMacros FF v10 breaks support for several commands that I need, and Chrome has it's own particular issues with the sites I have to use. Unfortunately finance tends to be picky - surprised we got off of IE6, really. The first version of iMacros I purchased was v8 if I remember correctly, and most of the basic ideas/code were written back then and updated as needed. With iMacros being discontinued now I don't see a point in fixing the issues newer versions have - if I can't get this working I suppose I will have to learn a different automation framework.
My apologies for missing the wiki.
A snippet from my CSV would look like this:
12345678,1407 W Random St,Arlington,22205,800000,800000,10/10/2022,10/20/2022
I could snag some of their code from element inspector perhaps?

Related

Find All String Occurrences, Except The Last One Found, and Remove Them

I am using Google Docs to open Walmart receipts that I email to myself. The Walmart store that I use 99.9% of the time seems to have made some firmware update to the Ingenico POS terminal that makes it display a running SUBTOTAL after each item is identified by the scanner. Here are some images to support my question..
The POS terminal looks like this:
Second image is the is the electronic receipt which I email myself from their IOS app. It is presumably taken from the POS terminal because it has the extra running SUBTOTAL lines after each item like the POS terminal screen shows. It has been doing this for a few months and I've been given no reason to believe, by management, that it will be corrected any time soon.
The final image is my actual paper receipt. This is printed from the register, its the one that you walk out with it and show the greeter/exit person to check your buggy and the items you've purchased.
Note that it does not show the extra SUBTOTAL.
I open the electronic receipt in a Google Document and their automatic OCR spits out the text of the receipt. It does a pretty darn good job, I'd say its 95%+ accurate with these receipts. I apply a very crude little regex that reformats these electronic receipts so that I can enter them into a database and use that data for my family's budgeting, taxes, and so forth. That has been working very well for me, albeit I would like to further automate that process but thats for a different question some day perhaps.
Right now, that little crude regex no longer formats the receipt into something usable for me.
What I would like to do is to remove the extra SUBTOTALS from the (broken) electronic receipt but leave the last SUBTOTAL alone. I highlighted the last SUBTOTAL on the receipt, which is always there, and should remain.
I have seen two other questions that are similar but I could not apply them to my situation. One of them was:
Remove all occurrences except the last one
What have I tried?
The following regex works in the online tester at regex101.com:
\nSUBTOTAL\t\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})
It took me a while to come up with that regex from searching around but essentially I want it to find all of the SUBTOTAL literals with a preceding new-line and any decimal number amount from 0.01 to 999.99) and I just want to replace what that finds with a new-line and then I can allow my other regex creation to work on that like it used to before the firmware update to the POS terminal.
The regex correctly identifies every SUBTOTAL (including the last one) on the regex101.com site. I can apply a substitution of "\n" and I am back to seeing the receipt data I can work with but there were two issues:
1) I cant replicate this using Google Apps Script.
Here is my example:
function myFunction() {
var body = DocumentApp.getActiveDocument().getBody();
var newText = body.getText()
.match('\nSUBTOTAL\t\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})')[1]
.replace(/%/mgi, "%\n");
body.clear();
body.setText(newText);
}
2) If I were to get the above code to work, I still have the issue of wanting to leave the last SUBTOTAL intact.
Here is a Google Doc that I have set up to experiment with:
https://docs.google.com/document/d/11bOJp2rmWJkvPG1FCAGsQ_n7MqTmsEdhDQtDXDY-52s/edit?usp=sharing
I use this regular expresion.
// JavaScript Syntax
'/\nSUBTOTAL\s\d{1,3}\.\d{2}| SUBTOTAL\n\d{1,3}\.\d{2}/g'
Also I make a script for google docs. You can use this Google Doc and see the results.
function deleting_subs() {
var body = DocumentApp.getActiveDocument().getBody();
var newText = body.getText();
var out = newText.replace(/\nSUBTOTAL\s\d{1,3}\.\d{2}|` SUBTOTAL\n\d{1,3}\.\d{2}/g, '');
// This is need to become more readable the resulting text.
out = out.replace(/R /g, 'R\n');
body.clear();
body.setText(out);
}
To execute the script, open the google doc file and click on:
Add ons.
Del_subs -> Deleting Subs.
Tip: After execute the complement/add on (Deleting Subs), undo the document edition, in that way other users can return to previous version of the text.
Hope this help to you.

Google vision fails to identify numbers in a table

I am aiming to extract a table of text & numbers from a document using Google's Vision API. The results are far from satisfactory - Vision seems to completely miss the contents of 2 columns in my table.
Recognition rate improves when I manually erase the column border but I cannot pre-process each file which I intend to process.
Cropping the column text, MOVing the column text to a new location dont seem to make a difference.
Increasing the brightness/contrast of the document seems to help a little bit but not enough to be satisfactory.
I'm using the "Try-It" web interface at cloud.google.com/vision/docs/drag-and-drop to test all my experiments... It mimics the results of running my code on the document.
I'm uploading JPG images, created from scanned PDF originals (converted in photoshop).
I dont have any code since the problem shows up just using the web-tool.
many of the numbers are single digits but many are not.
The numbers missed are 1,3,4,8,500,1,16,100,10
Other columns (which ARE read ok) contain decimal numbers
Perhaps there are some tricks/tips that I've not found that I can use?

How can I resolve INDEX MATCH errors caused by discrepancies in the spelling of names across multiple data sources?

I've set up a Google Sheets workbook that synthesizes data from a few different sources via manual input, IMPORTHTML and IMPORTRANGE. Once the data is populated, I'm using INDEX MATCH to filter and compare the information and to RANK each data set.
Since I have multiple data inputs, I'm running into a persistent issue of names not being written exactly the same between sources, even though they're the same person. First names are the primary culprit (i.e. Mary Lou vs Marylou vs Mary-Lou vs Mary Louise) but some last names with special symbols (umlauts, accents, tildes) are also causing errors. When Sheets can't recognize a match, the INDEX MATCH and RANK functions both break down.
I'm wondering how to better unify the data automatically so my Sheet understands that each occurrence is actually the same person (or "value").
Since you can't edit the results of an IMPORTHTML directly, I've set up "helper columns" and used functions like TRIM and SPLIT to try and fix instances as I go, but it seems like there must be a simpler path.
It feels like IFS could work but I can't figure how to integrate it. Also thinking this may require a script, which I'm just beginning to study.
Here's a simplified example of what I'm trying to achieve and the corresponding errors: Sample Spreadsheet
The first tab is attempting to pull and RANK data from tabs 2 and 3. Sample formulas from the Summary tab, row 3 (Amelia Rose):
Cell B3: =INDEX('Q1 Sales'!B:B, MATCH(A3,'Q1 Sales'!A:A,0))
Cell C3: =RANK(B3,$B$2:B,1)
Cell D3: =INDEX('Q2 Sales'!B:B, MATCH(A3,'Q2 Sales'!A:A,0))
Cell E3: =RANK(D3,$D$2:D,1)
I'd be grateful for any insight on how to best index 'Q2Sales'!B3 as the correct value for 'Summary'!D3. Thanks in advance - the thoughtful answers on Stack Overflow have gotten me this far!
to counter every possible scenario do it like this:
=ARRAYFORMULA(IFERROR(VLOOKUP(LOWER(REGEXREPLACE(A2:A, "-|\s", )),
{REGEXEXTRACT(LOWER(REGEXREPLACE('Q2 Sales'!A2:A, "-|\s", )),
TEXTJOIN("|", 1, LOWER(REGEXREPLACE(A2:A, "-|\s", )))), 'Q2 Sales'!B2:B}, 2, 0)))

Unable to see whole output list on screen in C++ (Not even using scroll)

Ok, I just added a new feature to my student manager program (a console program written in c++) , it is a console application and my program print dates on which particular student is absent in a list format
by saying list format i mean 1 date on 1 line
this is how output looks like for student who is absent 5 times
1. 04/05/2016,Monday
2. 05/05/2016/Tuesday
3. 06/05/2016/Wednesday
(Assume dates are correct)
now since these are only 3 records that are printed on the screen, a user would not require scrolling down,
but in a case of 300 dates , it prints all dates but when i scroll up I'm not able to reach back to 1st date, for example, I'm only able to see last 147 Records only
and I'm not able to scroll my console window to 1st record.
I know many other ways to solve this problem (like displaying 10 records or 100 records at a time) but I want to know how can i solve this particular problem
please give answer considering me as a beginner. :)
Thank You.
(as far as code is concerned, i can assure you its nothing special, just a while loop that keeps on printing dates until certain terminating condition is met )

How to iterate over all the page breaks in an Excel 2003 worksheet via COM

I've been trying to retrieve the locations of all the page breaks on a given Excel 2003 worksheet over COM. Here's an example of the kind of thing I'm trying to do:
Excel::HPageBreaksPtr pHPageBreaks = pSheet->GetHPageBreaks();
long count = pHPageBreaks->Count;
for (long i=0; i < count; ++i)
{
Excel::HPageBreakPtr pHPageBreak = pHPageBreaks->GetItem(i+1);
Excel::RangePtr pLocation = pHPageBreak->GetLocation();
printf("Page break at row %d\n", pLocation->Row);
pLocation.Release();
pHPageBreak.Release();
}
pHPageBreaks.Release();
I expect this to print out the row numbers of each of the horizontal page breaks in pSheet. The problem I'm having is that although count correctly indicates the number of page breaks in the worksheet, I can only ever seem to retrieve the first one. On the second run through the loop, calling pHPageBreaks->GetItem(i) throws an exception, with error number 0x8002000b, "invalid index".
Attempting to use pHPageBreaks->Get_NewEnum() to get an enumerator to iterate over the collection also fails with the same error, immediately on the call to Get_NewEnum().
I've looked around for a solution, and the closest thing I've found so far is http://support.microsoft.com/kb/210663/en-us. I have tried activating various cells beyond the page breaks, including the cells just beyond the range to be printed, as well as the lower-right cell (IV65536), but it didn't help.
If somebody can tell me how to get Excel to return the locations of all of the page breaks in a sheet, that would be awesome!
Thank you.
#Joel: Yes, I have tried displaying the user interface, and then setting ScreenUpdating to true - it produced the same results. Also, I have since tried combinations of setting pSheet->PrintArea to the entire worksheet and/or calling pSheet->ResetAllPageBreaks() before my call to get the HPageBreaks collection, which didn't help either.
#Joel: I've used pSheet->UsedRange to determine the row to scroll past, and Excel does scroll past all the horizontal breaks, but I'm still having the same issue when I try to access the second one. Unfortunately, switching to Excel 2007 did not help either.
Experimenting with Excel 2007 from Visual Basic, I discovered that the page break isn't known unless it has been displayed on the screen at least once.
The best workaround I could find was to page down, from the top of the sheet to the last row containing data. Then you can enumerate all the page breaks.
Here's the VBA code... let me know if you have any problem converting this to COM:
Range("A1").Select
numRows = Range("A1").End(xlDown).Row
While ActiveWindow.ScrollRow < numRows
ActiveWindow.LargeScroll Down:=1
Wend
For Each x In ActiveSheet.HPageBreaks
Debug.Print x.Location.Row
Next
This code made one simplifying assumption:
I used the .End(xlDown) method to figure out how far the data goes... this assumes that you have continuous data from A1 down to the bottom of the sheet. If you don't, you need to use some other method to figure out how far to keep scrolling.
Did you set ScreenUpdating to True, as mentioned in the KB article?
You may want to actually toggle it to True to force a screen repaint. It sounds like the calculation of page breaks is a side-effect of actually rendering the page, rather than something Excel does on demand, so you have to trigger a page rendering on the screen.