Automating Monthly Small Business Task (VBA, VC++, Excel, Access, Quickbooks, etc) - c++

Let me start by giving a quick background on myself (please forgive me). I have an intense interest in programming and computers/technical things in general. I took a year of C/C++ in college and a semester of assembly. I have messed around with Visual BASIC. So, almost all of my programming knowledge is limited to these three languages in order of proficiency:
C/C++
Assembly
Visual BASIC
I have a job at a small business that can't justify hiring a trained/"certified" programmer where I have tasked myself with automating a process that must be completed on a monthly basis. It involves:
Sending faxes that are to be filled out with numbers
Receiving those faxes that are returned (all incoming faxes go to network folder as PDF)
Collecting the numbers from received faxes and entering these numbers into Excel (some are Word format for some reason) and then into QuickBooks after calculations
Sending emails
Receiving replies to these emails that contain numbers
Manually entering these numbers into Excel and then QuickBooks after calculations
Collecting numbers from a website written in JavaScript. Numbers from the website can be exported to a *.csv file.
Finally, printing invoices out from QuickBooks using the calculated numbers that have been entered.
My goal is to automate this entire process. As of now, everything is done manually. Emails and faxes are sent one at a time. Numbers from website are read and entered into Excel one at a time. Numbers are put into QB and invoices are printed one at a time.
So far I have added an email scheduling add-on to Outlook that automatically sends the emails every month. I am working on setting up faxes to be sent automatically (the only thing I can think of off the top of my head is manipulating Windows Scan/Fax with API library in either VB or VC++).
Also, I am automating the calculations that must be performed in order to prep the collected numbers for entry into QB using VBA/Excel and, potentially, Access.
Right now I'm brainstorming a way to automatically collect the numbers (along with the customer name) from the returned faxes. My idea was to create a new fax sheet that forces the customer to "bubble in" the numbers like a Scantron sheet. This way I could write a program (perhaps in C++) to parse the PDF, looking for a certain colored pixel in a specific spot in order to piece together the number. (I wonder if I could automatically OCR the PDFs and collect the customer name simply by extracting text from each PDF?) The number could then be sent to a database or perhaps directly to an Excel sheet (the Excel sheets have to stay so that hard copies of data can be printed--though I suppose this could be accomplished without Excel).
And lastly, since some customers refuse to use any of the methods available to them, we have to manually call some of them. Once I am finished with all of the aforementioned work, I would like to develop a way to allow customers to call a specific phone number and key in the information via voice prompt, which would then deposit the information in a database somewhere. This will be complicated and require special equipment, so it will be last and lowest priority. Not worried about this right now.
Since my experience with programming is only moderate (though I'm sure my working knowledge will expand quickly once I get started since a lot of it is already in my brain somewhere) I wanted to give myself the best advantage and tools possible to tackle this project before I got so far into it that changing my methods would waste a lot of time/work. To sum up, I need to make an outline of exactly what I need to do/learn and what techniques/applications to use.
This is the site I always come to when searching for my programming questions and I have come to the conclusion that the people here are generally extremely knowledgeable, patient and helpful. I will appreciate any contribution of information, advice and/or insights no matter how small. I realize that in this situation I am the "beggar" and thus will be grateful for whatever I get.
Thanks in advance.
P.S. Before anyone says anything: I have "UTFSE" extensively and have assimilated lots of info from it. However, we all know that there's no equal to a human's problem solving capabilities--especially when proficient in the specific field.

Nice work! You are definitely on the right track. That was a lot of information so I apologize if I repeat anything you already know.
1) Faxes - Microsoft has an excellent resource for learning how to send faxes (they even provide the code). Check this out: http://msdn.microsoft.com/en-us/library/windows/desktop/ms693482(v=vs.85).aspx
2) You will have to OCR the PDFs (as you mentioned) and then you can extract the information. Keep in mind that C++ has no built-in support for reading or modifying PDFs, so you will need a third-party library or tool for that step.
3) C++ does allow you to save (and open) a file in Excel format. However, it's a very complicated format and will probably cause some problems. One of them is that all of your data may end up in a single cell. A way to get around this is to do your I/O with Excel through .csv files. A comma separates the columns and a new line separates the rows. For example,
A1, B1, C1
A2, B2, C2
A3, B3, C3
Excel will open and read these files correctly. However, you won't be able to format font, borders, etc... automatically.
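To make the .csv route concrete, here is a minimal sketch in standard C++. The function names (csv_escape, write_csv) are made up for this example. It writes one line per row, separates columns with commas, and quotes any field that itself contains a comma, a quote, or a line break, so that a value like a "Last, First" customer name stays in one cell.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Quote a field only when it contains a comma, a double quote, or a line break.
// Embedded double quotes are doubled, which is the usual CSV convention.
std::string csv_escape(const std::string& field) {
    if (field.find_first_of(",\"\r\n") == std::string::npos) {
        return field;
    }
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') out += '"';
        out += c;
    }
    out += '"';
    return out;
}

// Write each row as one line: commas between columns, newline after each row.
void write_csv(std::ostream& os,
               const std::vector<std::vector<std::string>>& rows) {
    for (const auto& row : rows) {
        for (std::size_t i = 0; i < row.size(); ++i) {
            if (i != 0) os << ',';
            os << csv_escape(row[i]);
        }
        os << '\n';
    }
}
```

Passing a std::ofstream opened on a file ending in .csv instead of a string stream gives you something Excel can open directly.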
This is the extent of my knowledge; I have never worked with emails or QuickBooks. Hope it helps!

Related

C++ Reading data from excel file and treating them as variables

I am using C++ & CLR to create a GUI in Visual studio. The whole app is like a calculator for the price of transport and the costs associated with it for my work. At work, we are using Helios and we are able to export certain data to Excel. Specifically, the prices of transport, packaging materials, etc. And what I need is for my program to be able to read certain cells in Excel where prices and other variables are recorded and calculate with it so that I don't have to rewrite all the values manually in the source code.
I spent a lot of time looking for a solution but couldn't find anything that referred to my problem. Is it even possible to build such a program? I don't want anyone to solve it for me. Maybe I just missed some banality and just need some direction. And as far as I understand, the CSV format is irrelevant for me, because I need to work with a few specific cells that Helios pre-fills for me in the Excel sheet.

Cycling through URLs to download csv files

I have a list of URLs which will access a web service. The web service downloads .csv files. I want to be able to cycle through them using a date field which is in a specific format in the URL itself, thereby downloading the data day by day. The access seems fairly slow, as even a manual URL entry can take a couple of minutes to complete, and I suspect the issue is at the web service's end.
The URL is in the format:
http://web.service.com/ws/XYZ/data/?key=mysecretkeyf&field1=X&start=YYYY-MM-DD 00:00&end= YYYY-MM-DD 00:00&field=Y&format=csv
So the way I envision it (and I am keen to take advice) is using a variable for the start year, month and day fields cycling onto the next URL as the previous .csv is downloaded, with the code ending when the current date is accessed.
Any ideas most welcome.
The coding is really straightforward, which makes me wonder if you are looking to code this yourself or asking if there is a service out there that would help do this. If you are coding, I'd choose a language that works well for you. @Vivek mentioned Python, which is what I would choose as well.
If you do not want to go the coding approach, you could check out DownThemAll. I have used this utility for batch downloading where you have to increment numbers in parts of the URL. Check it out; it may be a good non-programming solution.
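If you do code it yourself, the date-cycling part might look roughly like the sketch below (in C++, to match the rest of this page). It only builds the list of URLs, one per day; the download itself is left out. The helper names are invented for this example, the prefix and suffix arguments stand in for the fixed parts of your URL, and it assumes the service accepts the space in "YYYY-MM-DD 00:00" encoded as %20.

```cpp
#include <cstdio>
#include <string>
#include <tuple>
#include <vector>

struct Date { int y, m, d; };

bool is_leap(int y) { return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0; }

int days_in_month(int y, int m) {
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (m == 2 && is_leap(y)) ? 29 : days[m - 1];
}

// Advance by one calendar day, rolling over month and year ends.
Date next_day(Date dt) {
    if (++dt.d > days_in_month(dt.y, dt.m)) {
        dt.d = 1;
        if (++dt.m > 12) { dt.m = 1; ++dt.y; }
    }
    return dt;
}

bool before(const Date& a, const Date& b) {
    return std::tie(a.y, a.m, a.d) < std::tie(b.y, b.m, b.d);
}

// Format as YYYY-MM-DD.
std::string iso(const Date& dt) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d-%02d-%02d", dt.y, dt.m, dt.d);
    return buf;
}

// One URL per day in [first, last): start is midnight of that day,
// end is midnight of the following day.
std::vector<std::string> daily_urls(const std::string& prefix,
                                    const std::string& suffix,
                                    Date first, Date last) {
    std::vector<std::string> urls;
    for (Date d = first; before(d, last); d = next_day(d)) {
        urls.push_back(prefix + "&start=" + iso(d) + "%2000:00" +
                       "&end=" + iso(next_day(d)) + "%2000:00" + suffix);
    }
    return urls;
}
```

Passing today's date as last makes the loop stop once the current date is reached, which is the stopping rule described in the question.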

How do you open a file in C++ from HTTP where the URL is NOT the file location

I'm a first year comp sci student with a moderate knowledge of C++ and for a job I'm trying to put together a utility using a new U.S. Census Bureau API. It takes ID codes for things like state/county/census tract and the desired table and spits back out the desired table for the desired location.
Here's an example of a query for population stats for California and New York.
More examples can be found here: http://www.census.gov/developers/
My snag is that I've both never worked with files from HTTP and also I'm not sure how to handle a URL that outputs plain text but doesn't actually lead to the file location. Would it be possible to just use stdin? I don't understand how to handle the output given by one of the census query URLs.
Right now I'm using infile, which I know isn't correct, but I'm not sure what a correct solution is either.
Thanks
The fact that the data you're receiving is (apparently) generated on the fly rather than coming from a file doesn't really make any difference to you -- you get the same stream of bytes either way.
My immediate advice would be to use cURL for the job. Most of your work is generating a correct URL, which is what cURL specializes in. It'll then make it pretty easy to grab the data. From there, you can use any of quite a few JSON parser libraries (e.g., yajl), or you can parse it on your own (JSON is simple enough to make that fairly practical). A quick Google indicates that a fair number of people have already done this, and have various blog posts and such giving information about how to do it (though I suspect most of that is probably unnecessary).
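Since most of the work is generating a correct URL, here is a small sketch of just that part in standard C++, with no networking in it. The function names and the parameter names in the usage note are placeholders for this example, not the actual Census API parameters; check the API documentation for those. The resulting string is what you would hand to libcurl (for instance via CURLOPT_URL) to do the actual fetch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Percent-encode everything except the RFC 3986 "unreserved" characters.
std::string percent_encode(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

// Append key=value pairs to an endpoint as a query string.
std::string build_url(
    const std::string& endpoint,
    const std::vector<std::pair<std::string, std::string>>& params) {
    std::string url = endpoint;
    char sep = '?';
    for (const auto& kv : params) {
        url += sep;
        url += percent_encode(kv.first);
        url += '=';
        url += percent_encode(kv.second);
        sep = '&';
    }
    return url;
}
```

For example, build_url("http://example.invalid/data", {{"q", "a b&c"}, {"for", "state:06"}}) produces http://example.invalid/data?q=a%20b%26c&for=state%3A06.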

Moving, renaming huge amount of text files based on content and size

*Update July 4*
I ended up doing the following:
Sort on date
Check if last sentence is the same
If Yes: If bigger -> this is the new message to be chosen. If smaller: remove. If no more of the same can be found, choose this one and move to another folder.
If No: move on. Loop this again until all files with certain date have been checked.
Thanks all for the help!!
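For anyone landing here later, the steps above could be sketched roughly like this in C++. This is a reconstruction from the description, not the script that was actually used. It assumes each file has already been read into memory with its date taken from the file name, and it treats the last non-empty line as the "last sentence". The Email struct and function names are invented for the sketch.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Email {
    std::string date;  // e.g. "20110102", taken from the file name
    std::string text;  // full file contents, newest message first
};

// Last non-empty line of the text: the oldest message in the quoted thread.
std::string last_line(const std::string& text) {
    const std::size_t end = text.find_last_not_of(" \t\r\n");
    if (end == std::string::npos) return "";
    std::size_t start = text.find_last_of('\n', end);
    start = (start == std::string::npos) ? 0 : start + 1;
    return text.substr(start, end - start + 1);
}

// For each (date, last line) pair keep only the biggest file, since it
// contains the whole conversation. Results come back sorted by that key.
std::vector<Email> keep_longest(const std::vector<Email>& files) {
    std::map<std::pair<std::string, std::string>, Email> best;
    for (const auto& f : files) {
        const auto key = std::make_pair(f.date, last_line(f.text));
        auto it = best.find(key);
        if (it == best.end()) {
            best.emplace(key, f);
        } else if (f.text.size() > it->second.text.size()) {
            it->second = f;
        }
    }
    std::vector<Email> kept;
    for (const auto& kv : best) kept.push_back(kv.second);
    return kept;
}
```

One caveat: two separate conversations on the same day that happen to start with the same sentence would be merged by this key, so a stricter match may be needed for real data.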
I'm busy with a big project where I have a huge number of emails that I have to filter, imported from gmail through thunderbird. There is a big problem though.
Because Gmail uses conversations but Thunderbird doesn't format them as such, what I have is a text file for each email that also contains the complete previous conversation, and so a whole new text file for each reply. To clarify, here is an example of a conversation:
Me: Hi, how are you?
You, replying: Good!
Me: Great!
In Gmail this looks exactly as above, but for me these are now 3 files:
file 1:
Me, sent at 11:41:
Hi, how are you?
file 2:
You, sent at 11:42:
Good!
Me, sent at 11:41:
Hi, how are you?
file 3:
Me, sent at 11:43:
Great!
You, sent at 11:42:
Good!
Me, sent at 11:41:
Hi, how are you?
As you can understand, this is no problem with 3 files: I just throw away file 1 and 2 and only use file 3. That's precisely what I want to do. But considering in total there are around 30k files, I would very much like to automate that.
Unfortunately, this cannot be done completely by file name, though it can be done partially. The files are named after their date, for instance 20110102 for Jan 2, 2011. However, as there are multiple email conversations on a day, I would lose a lot if I just sorted by date and kept only the largest.
I hope the problem is clear and you can help me with this.
I work on Mac OSX 10.7. I've tried using Applescript, but either my script is not good or Applescript can't handle the amount of files.
Maybe you have a recommendation for software or a script in some way? I'm open for all and not unfamiliar with programming.
Thanks in advance!
As your task is basically just text processing, any language you're familiar with, including AppleScript, PHP, bash, or C, should be able to do the job. I think perhaps @inTide's breaking the problem down into discrete steps is what you need to do, building one portion at a time in the language of your choice.
Pick a language that you're familiar with and start by writing the code for the first step and making sure it works as you expect. Then expand, adding a small bit of new functionality at each point and making sure that functionality works before moving on. Without an example of the code you've written or a better description of how AppleScript is failing for you, additional advice is difficult.

Looking for Ideas: How would you start to write a geo-coder?

Because the open source geo-coders cannot begin to compare to Google's or even Yahoo's, I would like to start a project to create a good open source geo-coder. Just to clarify, a geo-coder takes some text (usually with some constraints) and returns one or more lat/lon pairs.
I realize that this is a difficult and gargantuan task, so I am wondering how you might get started. What would you read? What algorithms would you familiarize yourself with? What code would you review?
And also, assuming you were going to develop this very agilely, what would you want the first prototype to be able to do?
EDIT: Let's set aside the data question for now. I am going to use OpenStreetMap data, along with a database of waypoints that I have. I would later plan to include other data sets as well, and I realize the geo-coder would be inherently limited by the quality of the original data.
The first (and probably blocking) problem would be: where do you get your data from? (unless you are willing to pay thousands of dollars for proprietary sets).
You could build a geocoding-api on top of OpenStreetMap (they publish their data in dumps on a regular basis) I guess, but that one was still very incomplete last time I checked.
Algorithms are easy. Good mapping data, however, is expensive. Very expensive.
Google drove their cars all over the world, collecting this data among other things.
From a .NET point of view these articles might be interesting for you:
Writing Your Own GPS Applications: Part I
Writing Your Own GPS Applications: Part 2
Writing GIS and Mapping Software for .NET
I've only glanced at the articles but they've been on CodeProject's 'Most Popular' list for a long time.
And maybe this CodePlex project which the author of the articles above made available.
I would start at the absolute beginning by figuring out how you're going to get the data that matches a street address with a geocode. Either Google had people going around with GPS units, OR they got the information from some existing source. That existing source may have been... (all guesses)
The Postal Service
Some existing maps (printed)
A bunch of enthusiastic users who were early adopters of GPS technology and were more than willing to enter street addresses and GPS coordinates
Some government entity (or entities)
Their own satellites
etc
I guess what I'm getting at is the information was either imported from somewhere or was input by someone via some interface. As my starting point I would look at how to get that information. In an open source situation, you may be able to get a bunch of enthusiastic people to enter information.
So for my first prototype, boring as it would be, I would create a form for entering information.
Then you need to know the math for figuring out the shortest distance between two points (as the crow flies). From there, try to figure out how to include roads. (My guess is you would have to have a data point for each and every curve, where you hold the geocode location of the curve and the angle of the road on a north/south and east/west vector. You'd probably need to take incline into account, too, to get accurate road measurements.)
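For the as-the-crow-flies part, one common choice is the haversine formula, which gives the great-circle distance between two lat/lon pairs on a sphere. The sketch below is in C++ and assumes a spherical Earth with a mean radius of 6371 km, so expect errors of a fraction of a percent compared with a proper ellipsoid model. The function name is made up for this example.

```cpp
#include <algorithm>
#include <cmath>

// Great-circle distance in kilometres between two points given in degrees.
// Assumes a spherical Earth (mean radius 6371 km).
double haversine_km(double lat1, double lon1, double lat2, double lon2) {
    const double kPi = 3.14159265358979323846;
    const double kEarthRadiusKm = 6371.0;
    const double to_rad = kPi / 180.0;

    const double phi1 = lat1 * to_rad;
    const double phi2 = lat2 * to_rad;
    const double dphi = (lat2 - lat1) * to_rad;
    const double dlambda = (lon2 - lon1) * to_rad;

    const double sin_dphi = std::sin(dphi / 2.0);
    const double sin_dlambda = std::sin(dlambda / 2.0);
    const double a = sin_dphi * sin_dphi +
                     std::cos(phi1) * std::cos(phi2) * sin_dlambda * sin_dlambda;

    // Clamp to 1 so rounding error cannot push asin out of its domain.
    return 2.0 * kEarthRadiusKm * std::asin(std::min(1.0, std::sqrt(a)));
}
```

Road distance is a different problem (routing over a graph of road segments); this only covers the straight-line case.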
That's just where I'd start.
But in all honesty, I wouldn't even start on this. Other programmers have done it already; I'm more interested in what hasn't already been done.
get my free raw data from somewhere like http://ipinfodb.com/ip_database.php
load it into a database, denormalizing for fast lookups
design my API
build it out as a RESTful web service
return results in varying formats: JSON, XML, CSV, raw text
The first prototype should accept a ZIP code and return lat/lon in raw text.
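A first prototype along those lines could be as small as the C++ sketch below. It assumes the raw data has been boiled down to plain text lines of the form zip,lat,lon (that layout, the type alias, and the function names are assumptions for this example, not the format of any particular data source). ZIP codes are kept as strings so leading zeros survive.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Maps a ZIP code to its "lat,lon" text exactly as it appeared in the input.
using ZipTable = std::unordered_map<std::string, std::string>;

// Load lines of the assumed form "zip,lat,lon"; lines without a comma are skipped.
ZipTable load_zip_table(std::istream& in) {
    ZipTable table;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t comma = line.find(',');
        if (comma == std::string::npos) continue;
        table[line.substr(0, comma)] = line.substr(comma + 1);
    }
    return table;
}

// Return "lat,lon" as raw text, or "NOT FOUND" for an unknown ZIP code.
std::string geocode_zip(const ZipTable& table, const std::string& zip) {
    const auto it = table.find(zip);
    return it == table.end() ? std::string("NOT FOUND") : it->second;
}
```

Wrapping geocode_zip in a web endpoint and adding JSON, XML, or CSV output would be later steps on the list above.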