Fetch particular pattern from file and read file

I have a text file that contains some patterns. My first thought was to run an awk command on the text file from C++. The solution below works fine for a single file:
https://stackoverflow.com/questions/15151055/truncate-file-in-linux
But we receive multiple files, each containing different patterns, so we would need to write a separate awk command for each file, which is not a generic solution.
After that I found the lexer generator Flex for pattern matching in C++ (Linux environment), but I ran into some issues and could not write the lexer file. So I wondered whether there is an open source library on Linux for matching patterns in a text file and converting the result into an XML file. (I am still searching on Google but have no concrete solution yet.)
In brief:
1) Search for a pattern in a text file (currently we call an awk command from C++; we are looking for a more general solution).
2) Read a tabular-format file into C++.
I hope I have conveyed my question clearly.
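
One possible direction for both points, sketched with only the C++11 standard library (std::regex instead of awk or Flex). The function names matching_lines and read_table are made up for this illustration, and the table is assumed to be whitespace-separated, which the question does not state:

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Return every line of `in` that contains a match for `pattern`
// (roughly what `awk '/pattern/'` would print).
std::vector<std::string> matching_lines(std::istream& in, const std::string& pattern) {
    const std::regex re(pattern);  // ECMAScript syntax by default
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(in, line)) {
        if (std::regex_search(line, re)) hits.push_back(line);
    }
    return hits;
}

// Split each non-blank line of a whitespace-separated table into its fields.
// Assumption: fields never contain spaces or tabs themselves.
std::vector<std::vector<std::string>> read_table(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<std::string> row;
        std::string field;
        while (fields >> field) row.push_back(field);
        if (!row.empty()) rows.push_back(row);
    }
    return rows;
}
```

Both functions take a std::istream&, so a std::ifstream opened on the real file can be passed in directly. Because the pattern is an ordinary string, it could be kept in a small per-file configuration rather than hard-coded, which may avoid writing new code for each file.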

Related

Writing text into and retrieving text from the text file using command line arguments

Hello, I want to write the output of my C++ program to a text file using a command line argument and afterward retrieve it from the file. Can you help me, please? I cannot find satisfactory code.

If you are at a Unix-type command prompt (i.e. bash or similar), you can generally just redirect your output to a text file using '>'. So if your program is called "MyCommand", just do:
./MyCommand > file.txt
After that, you can use a text editor to open the file, or anything else you'd do with a plain text file.
If you are trying to write out to a file programmatically, you need to open a file handle. There are many ways to do this; one would be to use the standard file handling functions built into the C library. See here for a tutorial:
http://www.tutorialspoint.com/cprogramming/c_file_io.htm
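
Since the question is about C++, here is a minimal sketch of the programmatic route using std::ofstream and std::ifstream rather than the C library functions from the tutorial. The helper names write_text and read_text are invented for this example:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Write `text` to the file at `path`, replacing any previous contents.
// Returns false if the file could not be opened or written.
bool write_text(const std::string& path, const std::string& text) {
    std::ofstream out(path);
    if (!out) return false;
    out << text;
    out.flush();  // push buffered data out so the stream state reflects the write
    return static_cast<bool>(out);
}

// Read the whole file at `path` back into a string.
// Returns an empty string if the file cannot be opened.
std::string read_text(const std::string& path) {
    std::ifstream in(path);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

From main(int argc, char* argv[]), the file name would come from argv[1] after checking that argc > 1, so the program could be started as ./MyCommand file.txt.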

Applescript to extract the Digital Object Identifier (DOI) from a PDF file

I looked for an applescript to extract the DOI from a PDF file, but could not find it. There is enough information available on the actual format of the DOI (i.e. the regular expression), but how could I use this to get the identifier from the PDF file?
(It would be no problem if some external program were used, such as Hazel.)

If you're ok with using an app, I'd recommend Skim. Good AppleScript support. I'd probably structure it like this (especially if the document might be large):
set DOIFound to false
tell application "Skim"
    set pp to pages of document 1
    repeat with p in pp
        set t to text of p
        -- look for DOI and set DOIFound to true
        if DOIFound then exit repeat -- if it's not found then use url?
    end repeat
end tell
I'm assuming a DOI would always sit on a single page (not split across two). It looks like they are invariably (?) on the first page of an article, which would make this quick of course, even with a large doc.
[edit]
Another way would be to get the Xpdf OSX binaries from http://www.foolabs.com/xpdf/download.html and use pdftotext in the command line (just tested this; it works well) and parse the text using AppleScript. If you want to stay in AppleScript, you can do something like:
do shell script "path/to/pdftotext 'path/to/pdf/file.pdf'"
which would output a file in the same directory with a txt file extension -- you parse that for DOI.
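
The parsing step itself is not shown above. As a rough sketch of what it could look like outside AppleScript, here is a small C++ function that pulls the first DOI-like string out of the extracted text. The function name find_doi is hypothetical, and the pattern is an assumption based on the common modern DOI shape ("10.", a registrant code of 4 to 9 digits, "/", then a suffix); older or unusual DOIs may not match:

```cpp
#include <regex>
#include <string>

// Return the first DOI-like substring in `text`, or an empty string if none.
// Assumed pattern: "10." + 4-9 digits + "/" + suffix characters.
std::string find_doi(const std::string& text) {
    static const std::regex doi_re(R"(10\.[0-9]{4,9}/[-._;()/:A-Za-z0-9]+)");
    std::smatch m;
    if (std::regex_search(text, m, doi_re)) return m.str();
    return std::string();
}
```

Because '.' and ')' are allowed inside the suffix, a DOI that ends a sentence or sits in parentheses may come back with trailing punctuation attached, so some trimming may still be needed.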

Have you tried it with pdfgrep? It works really well on the command line:
pdfgrep -n --max-count 1 --include "*.pdf" "DOI"
I have no idea how to build an AppleScript, though, but I would be interested in one too, so that if I drop a PDF into that folder it automatically extracts the DOI and renames the file with the DOI in the filename.

Folder with 1300 png files into html images list

I've got a folder with about 1300 png icons. What I need is an html file with all of them inside, like:
<img src="path-to-image.png" alt="file name without .png" id="file-name-without-.png" class="icon"/>
It's easy as hell, but with that number of files it's a pure waste of time to do it manually. Do you have any ideas how to automate it?

If you need it just once, then do a "dir" or "ls" and redirect it to a file, then use an editor with macro-ability like notepad++ to record modifying a single line like you desire, then hit play macro for the remainder of the file. If it's dynamic, use PHP.

I would not use C++ to do this. I would use vi, honestly, because running regular expressions repeatedly is all that is needed for this.
But you can do this in C++. I would start with a plain text file with all the file names, generated by dir or ls at the command prompt.
Then write code that takes a line of input and turns it into a line formatted the way you want. Test this and get it working on a single line first.
The RE engine of C++ is probably overkill (and is not all that well supported in compilers), but substr and basic find and replace is all you need. Is there a string library you are familiar with? std::string would do.
To generate the file name without PNG, check the last four characters and see if they exist and are .PNG (if not report an error). Then strip them. To remove dashes, copy characters to a new string but if you are reading a dash write a space. Everything else is just string concatenation.
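
The per-line step described above could look roughly like this with std::string alone. The function name img_tag is invented for the example, and it assumes a lowercase ".png" suffix and that src is just the bare file name (no directory prefix):

```cpp
#include <string>

// Build one <img> line from a file name such as "arrow-left.png".
// Returns an empty string when the name does not end in ".png"
// (the caller can treat that as the error case).
std::string img_tag(const std::string& file_name) {
    const std::string ext = ".png";
    if (file_name.size() <= ext.size() ||
        file_name.compare(file_name.size() - ext.size(), ext.size(), ext) != 0) {
        return std::string();
    }
    // Strip the extension; the result is used as-is for the id attribute.
    const std::string stem = file_name.substr(0, file_name.size() - ext.size());

    // alt text: the stem with dashes turned into spaces.
    std::string alt = stem;
    for (char& c : alt) {
        if (c == '-') c = ' ';
    }

    return "<img src=\"" + file_name + "\" alt=\"" + alt +
           "\" id=\"" + stem + "\" class=\"icon\"/>";
}
```

Calling this once per line of the dir/ls listing and writing each result to the output file gives the image list; file names containing quotes or other HTML-special characters would need escaping, which this sketch leaves out.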

how to read from a text file contents file name extension in c++?

Hello I want to read from a text file full of directory contents
Here's my example:
below is my text file called MyText.txt
MyText.txt
title.txt,image.png,sound.mp3
I want to read only the file extensions (for example .txt or .mp3), not the file names. How would I do that in C++?
By "read" I mean being able to test for an extension in an if statement like this:
if(.mp3 exists in a text file)
{
fprintf(stderr,"sees the mp3 extensions");
}
I'm running Windows 7 32-bit.
I need a more cross platform approach.

May I suggest you read a tutorial on C++ file handling and another one on C++ strings?
There is no quick solution: you have to read the file using the ifstream class.
After reading the file and storing it in one or more strings, you can then use the find and substr string methods to create a queue of discrete filenames. Using the same methods, you can then split the queued elements again, in order to find the extensions and add them to a set. A set does not allow duplicates, so you are sure all the extensions will appear only once.
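
A compact sketch of that idea, assuming the line has already been read from the file into a string. It uses std::getline with a ',' delimiter in place of manual find/substr splitting, and the function name extensions is made up for this example:

```cpp
#include <set>
#include <sstream>
#include <string>

// Collect the distinct extensions (including the dot) from a comma-separated
// list of file names such as "title.txt,image.png,sound.mp3".
std::set<std::string> extensions(const std::string& listing) {
    std::set<std::string> found;
    std::istringstream names(listing);
    std::string name;
    while (std::getline(names, name, ',')) {
        // The extension is everything from the last dot onward.
        const std::string::size_type dot = name.rfind('.');
        if (dot != std::string::npos && dot + 1 < name.size()) {
            found.insert(name.substr(dot));
        }
    }
    return found;
}
```

The check from the question then becomes something like if (extensions(line).count(".mp3")) { ... }. This relies only on the standard library, so it should behave the same on Windows and Linux; it does assume the string carries no trailing newline, as is the case when each line is read with std::getline.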

Programmatically search + replace in a .doc

If I'm given a .doc file with special tags in it such as [first_name], how do I go about replacing all occurrences of it with something like "Clark"? A simple binary replacement only works if the replacement string is the exact same length.
Haskell, C, and C++ answers would be best, but any compiled language would do. I'd also prefer to do this without an external library since it has to be deployed on Windows and Linux and cross-platform dependency handling is a bitch.
To summarize...
.doc -> magic program -> .doc with strings replaced

You could use the Word COM component ("Word.Application") on Windows to open the file, do the replacements, save the file, and close it. However, this is Windows-only and can be buggy.
Another thing you could do is use the OpenOffice.org command line interface to convert the file to the ODF format, unzip the file (ODF is mostly zipped XML), do the replacements with the files inside, re-zip the file, and re-convert it to .doc format. However, OpenOffice.org doesn't always read Word files correctly (especially if there is a lot of complex formatting) and it can make it harder to distribute (users must either have OpenOffice.org or you must distribute it with your program).
Also, if you have a file in the .docx format, you can unzip it, do the replacements, and re-zip it.

First read the Word Document Specification.
If that hasn't terrified you, then you should find it fairly straightforward to figure out how to read and write it. It must be possible; Word manages to do it most of the time.

You probably have to use .NET programming (VB or C#) to create a Word.Application object and then use the MS Word object model to manipulate your document.

Why do you want to be using C/C++/Haskell or another compiled language? I'm not too familiar with Haskell, but in general I would say that C is not a great language for performing text processing. A lot of interpreted languages (Perl, Python, etc.) also have powerful regular expression libraries that are suited for finding and replacing phrases.
With that said, as the other posters have noted, you will still have to deal with the eccentricities of the .doc format.
