Elegant way to distinguish a path from an entry key - regex

I have an application that loads CAD data (custom format), either from the local filesystem by specifying an absolute path to a drawing, or from a database.
Database access is realized through a library function taking the drawing's identifier as a parameter.
The identifiers have a format like ABC 01234T56-T, while my paths are typical Windows paths (e.g. x:\Data\cadfiles\cadfile001.bin).
I would like to write a wrapper function taking a String argument, which can be either a path or an identifier, and which calls the appropriate function to load my data.
Like this:
Function CadLoader(nameOrPath : String):TCadData;
My question: how can I elegantly decide whether my string is an identifier or a path to a file?
Use a regexp? Or just search for '\' and ':', which do not appear in the identifiers?

Try this one
Function CadLoader(nameOrPath : String):TCadData;
begin
if FileExists(nameOrPath) then
<Load from file>
else
<Load from database>
end;

I would do something like this:
function CadLoader(nameOrPath : String) : TCadData;
begin
if ((Pos('\\',NameOrPath) = 1) {UNC} or (Pos(':\',NameOrPath) = 2) { Path })
and FileExists(NameOrPath) then
begin
// Load from File
end
else
begin
// Load From Name
end;
end;
The regex to do the same thing would be ^(\\\\|.:\\) (anchored at the start of the string, like the Pos checks). I think the first version is more readable.
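For comparison, here is a minimal sketch of the regex variant. It is written in C++ with std::regex rather than Delphi, purely for illustration. The helper name looksLikePath is hypothetical, and the drive part is narrowed from "." to a single letter, which is an assumption and not something the original answer states. Unlike the Delphi code it does not call FileExists; it only checks the shape of the string.

```cpp
#include <regex>
#include <string>

// Returns true when the text starts like a Windows path:
// either a UNC prefix (\\server\...) or a drive prefix (x:\...).
// looksLikePath is a hypothetical helper name used only for this sketch.
bool looksLikePath(const std::string& text) {
    // Raw string literal, so the pattern is exactly: ^(\\\\|[A-Za-z]:\\)
    static const std::regex pathPrefix(R"(^(\\\\|[A-Za-z]:\\))");
    return std::regex_search(text, pathPrefix);
}
```

Anything that does not match would then be treated as a database identifier.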

In my opinion, the K.I.S.S. principle applies (Keep It Simple, Stupid!). Sounds harsh, but if you're absolutely certain that the combination :\ will never be in your identifiers, I'd just look for it at the 2nd position of the string. That keeps things understandable and readable. Also, one more quote:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski

You should pass in an additional parameter that says exactly what the identifier actually represents, ie:
type
CadLoadType = (CadFromPath, CadFromDatabase);
Function CadLoader(aType: CadLoadType; const aIdentifier: String): TCadData;
begin
case aType of
CadFromPath: begin
// aIdentifier is a file path...
end;
CadFromDatabase: begin
// aIdentifier is a database ID ...
end;
end;
end;
Then you can do this:
Cad := CadLoader(CadFromPath, 'x:\Data\cadfiles\cadfile001.bin');
Cad := CadLoader(CadFromDatabase, 'ABC 01234T56-T');

Related

display the content of a file split by a delimiter character

I am trying to display the content of a file, split by a delimiter character.
More exactly, starting from this topic, I am trying to display the result as:
bbb
aaa
qqq
ccc
but with the data source taken from a file.
Until now, I tried:
DECLARE
l_bfile bfile;
BEGIN
l_bfile := bfilename(my_dir, my_file);
dbms_lob.fileopen(l_bfile);
FOR i IN
(SELECT TRIM(regexp_substr(TO_CHAR(l_bfile),'[^;]+',1,level) ) AS q
FROM dual
CONNECT BY regexp_substr(TO_CHAR(l_bfile),'[^;]+',1,level) IS NOT NULL
ORDER BY level
)
LOOP
dbms_output.put_line(i.q);
END LOOP;
EXCEPTION
WHEN No_Data_Found THEN
NULL;
END;
As result, I got
PL/SQL: ORA-00932: inconsistent datatypes: expected NUMBER got FILE
Can anyone give me a hint, please?
I have to write this as a new answer since it is too big for a comment to @SmartDumb:
Be advised the regex of the form '[^;]+' (commonly used for parsing delimited lists) fails when NULL elements are found in the list. Please see this post for more information: https://stackoverflow.com/a/31464699/2543416
Instead please use this form of the call to regexp_substr (note I removed the second element):
SELECT TRIM(regexp_substr('bbb;;qqq;ccc','(.*?)(;|$)',1,level, null, 1) ) AS q
FROM dual
CONNECT BY regexp_substr('bbb;;qqq;ccc','(.*?)(;|$)',1,level) IS NOT NULL
ORDER BY level
It may or may not be important in this example; it depends on whether the order of the elements in the string matters to you or whether you need to preserve the NULL, i.e. if you need to know that the second element is NULL, then this will work.
P.S. Do a search for external tables and see if that is a solution you could use. That would let you query a file as if it were a table.
You could possibly try this if your file contains a single line (hence the question about file structure):
DECLARE
utlFileHandle UTL_FILE.FILE_TYPE;
vLine varchar2(100);
BEGIN
-- Open the file for reading and fetch its first line
utlFileHandle := UTL_FILE.FOPEN(my_dir, my_file, 'r');
utl_file.get_line(utlFileHandle, vLine);
FOR i IN
(SELECT TRIM(regexp_substr(vLine,'[^;]+',1,level) ) AS q
FROM dual
CONNECT BY regexp_substr(vLine,'[^;]+',1,level) IS NOT NULL
ORDER BY level
)
LOOP
dbms_output.put_line(i.q);
END LOOP;
utl_file.fclose(utlFileHandle);
EXCEPTION
WHEN No_Data_Found THEN
utl_file.fclose(utlFileHandle);
null;
END;

Stata: Efficient way to replace numerical values with string values

I have code that currently looks like this:
replace fname = "JACK" if id==103
replace lname = "MARTIN" if id==103
replace fname = "MICHAEL" if id==104
replace lname = "JOHNSON" if id==104
And it goes on for multiple pages like this, replacing an ID name with a first and last name string. I was wondering if there is a more efficient way to do this en masse, perhaps by using the recode command?
I will echo the other answers that suggest a merge is the best way to do this.
But if you absolutely must code the lines item-wise (again, messy) you can generate a long list ("pages") of replace commands by using MS Excel to "help" you write the code. Here is a picture of your Excel sheet with one example, showing the MS Excel formula:
        A        B       C     D
row 1   last     first   id    code
row 2   MARTIN   JACK    103   ="replace fname=^"&B2&"^ if id=="&C2
You type that in, make sure it looks like Stata code when the formula calculates (aside from the carets), and copy the formula in column D down to the end of your list. Then copy the whole block of Stata code in column D generated by the formulas into your do-file, and do a find and replace (be careful here if you are using the caret elsewhere for mathematical uses!!) for all ^ to be replaced with ", which will end up generating proper Stata syntax.
(This is truly a brute force way of doing this, and is less dynamic in the case that there are subsequent changes to your generation list. All--apologies in advance for answering a question here advocating use of Excel :) )
You don't explain where the strings you want to add come from, but what is generally the best technique is explained at
http://www.stata.com/support/faqs/data-management/group-characteristics-for-subsets/index.html
Create an associative array mapping each id to its fname and lname:
103 => JACK,MARTIN
104 => MICHAEL,JOHNSON
...
Then replace:
id => hash{id} ( fname & lname )
The efficiency of doing this will be taken care of by the programming language used.
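As a rough illustration of the associative-array idea, here is a sketch in C++ (not Stata). The Person struct and the applyNames function are hypothetical names made up for this example; in Stata itself the equivalent would be the merge the other answers recommend.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// One data row; a hypothetical stand-in for an observation in the dataset.
struct Person {
    int id;
    std::string fname;
    std::string lname;
};

// Fill in fname and lname for every row whose id appears in the lookup table.
// Rows with an unknown id are left untouched.
void applyNames(std::vector<Person>& rows,
                const std::map<int, std::pair<std::string, std::string>>& names) {
    for (Person& row : rows) {
        const auto it = names.find(row.id);
        if (it != names.end()) {
            row.fname = it->second.first;
            row.lname = it->second.second;
        }
    }
}
```

The lookup table is built once (for example {103, {"JACK", "MARTIN"}}), so adding a person means adding one entry instead of two more replace lines.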

How to read semicolon separated certain values from a QString?

I am developing an application using Qt/KDE. While writing code for this, I need to read a QString that contains values like ( ; delimited)
<http://example.com/example.ext.torrent>; rel=describedby; type="application/x-bittorrent"; name="differentname.ext"
I need to read every attribute like rel, type and name into a different QString. The approach I have taken so far is something like this:
if (line.contains("describedby")) {
m_reltype = "describedby" ;
}
if (line.contains("duplicate")) {
m_reltype = "duplicate";
}
That is, if I only care about the presence of an attribute (and not its value), I look for the text manually and set a member if the attribute is present. This approach, however, fails for attributes like type and name, whose actual values need to be stored in a QString. I know this can be done by splitting the entire string at the delimiter ; and then searching for the attribute or its value, but I wanted to know whether there is a cleaner and more efficient way of doing it.
As I understand it, the data is not always a URL.
So,
1: Split the string
2: For each substring, separate the identifier from the value:
id = str.mid(0,str.indexOf("="));
value = str.mid(str.indexOf("=")+1);
You can also use a RegExp:
regexp = "^([a-z]+)\s*=\s*(.*)$";
id = \1 of the regexp;
value = \2 of the regexp;
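The split-then-separate steps above can be sketched as follows. This version uses std::string instead of QString so that it stays self-contained; that swap, the function names parseAttributes and trim, and the stripping of surrounding double quotes are all assumptions for illustration. It would misparse a leading URL that itself contains ; or =.

```cpp
#include <map>
#include <sstream>
#include <string>

// Trim spaces and tabs from both ends of a string.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Split the line on ';' and collect name=value pairs.
// Segments without '=' (such as the leading <...> part) are skipped.
std::map<std::string, std::string> parseAttributes(const std::string& line) {
    std::map<std::string, std::string> attrs;
    std::istringstream stream(line);
    std::string segment;
    while (std::getline(stream, segment, ';')) {
        const auto eq = segment.find('=');
        if (eq == std::string::npos) continue;
        std::string value = trim(segment.substr(eq + 1));
        // Strip one pair of surrounding double quotes, if present.
        if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
            value = value.substr(1, value.size() - 2);
        attrs[trim(segment.substr(0, eq))] = value;
    }
    return attrs;
}
```

With the example line from the question, the map would hold the keys rel, type and name, so each value can be copied into its own member.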
I need to read every attribute like rel, type and name into a different QString.
Is there a guarantee that this string will always be a URL?
I wanted to know is there a cleaner and a more efficient way of doing it.
Don't reinvent the wheel! You can use QUrl::queryItems, which parses these query variables and returns a list of name-value pairs.
However, make sure that your string is a well-formed URL (so that QUrl does not reject it).

Delphi - User specified string manipulation

I have a problem in Delphi 7. My application creates mpg video files according to a set naming convention, i.e.
\000_A_Title_YYYY-MM-DD_HH-mm-ss_Index.mpg
In this filename the following rules are enforced:
The 000 is the video sequence. It is incremented whenever the user presses stop.
The A (or B,C,D) specifies the recording camera - so video files are linked with up to four video streams all played simultaneously.
Title is a variable length string. In my application it cannot contain a _.
The YYYY-MM-DD_HH-mm-ss is the starting time of the video sequence (not the single file)
The Index is the zero based ordering index and is incremented within 1 video sequence. That is, video files are a maximum of 15 minutes long, once this is reached a new video file is started with the same sequence number but next index. Using this, we can calculate the actual start time of the file (Filename decoded time + 15*Index)
Using this method my application can extract the starting time that the video file started recording.
Now we have a further requirement to handle arbitrarily named video files. The only thing I know for certain is that there will be a YYYY-MM-DD HH-mm-ss somewhere in the filename.
How can I allow the user to specify the filename convention for the files he is importing? Something like regular expressions? I understand there must be a pattern to the naming scheme.
So if the user inputs ?_(Camera)_*_YYYY-MM-DD_HH-mm-ss_(Index).mpg into a text box, how would I go about getting the start time? Is there a better solution? Or do I just have to handle every single possibility as we come across them?
(I know this is probably not the best way to handle such a problem, but we cannot change the issue - the new video files are recorded by another company)
I'm not sure if you're trying to parse the user input '?_(Camera)_*_YYYY-MM-DD_HH-mm-ss_(Index).mpg' into components, but if you're just trying to grab the date and time, something like this would do it; the date is in group 1 and the time in group 2:
(\d{4}-\d{2}-\d{2})_(\d{2}-\d{2}-\d{2})
Otherwise, I'm not sure what you're trying to do.
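A minimal sketch of using that pattern, written in C++ with std::regex for illustration (Delphi 7 would need a third-party regex unit). The Stamp struct and findStamp function are hypothetical names, and accepting either an underscore or a space between date and time is an assumption based on the two spellings in the question.

```cpp
#include <regex>
#include <string>

// Result of the search; found is false when no date/time pair is present.
struct Stamp {
    std::string date;  // e.g. "2011-03-08"
    std::string time;  // e.g. "14-05-59"
    bool found;
};

// Search anywhere in the file name for YYYY-MM-DD followed by HH-mm-ss.
Stamp findStamp(const std::string& fileName) {
    static const std::regex pattern(
        R"((\d{4}-\d{2}-\d{2})[_ ](\d{2}-\d{2}-\d{2}))");
    std::smatch m;
    if (std::regex_search(fileName, m, pattern))
        return {m[1].str(), m[2].str(), true};
    return {"", "", false};
}
```

Because the search is not anchored, the rest of the file name can follow any convention.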
Possibly you can use the underscores "_" as your positional indicator since you smartly don't allow them in the title.
In your example of a filename convention:
?_(Camera)_*_YYYY-MM-DD_HH-mm-ss_(Index).mpg
you can parse this user-specified string to see that the date YYYY-MM-DD is always between the 3rd and 4th underscore and the time HH-mm-ss is between the 4th and 5th.
Then it becomes a simple matter when getting the actual filenames following this convention, to find the 3rd underscore and know the date and time follow it.
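The positional idea can be sketched like this, in C++ for illustration. The function names splitOnUnderscore and dateByPosition are made up for the example, and it assumes the user-entered convention contains the literal placeholder YYYY-MM-DD as its own underscore-delimited field.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split text on '_' into fields.
std::vector<std::string> splitOnUnderscore(const std::string& text) {
    std::vector<std::string> fields;
    std::istringstream stream(text);
    std::string field;
    while (std::getline(stream, field, '_')) fields.push_back(field);
    return fields;
}

// Find the field holding "YYYY-MM-DD" in the convention, then return the
// field at the same position in the file name (empty string if there is none).
std::string dateByPosition(const std::string& convention,
                           const std::string& fileName) {
    const std::vector<std::string> pattern = splitOnUnderscore(convention);
    const std::vector<std::string> actual = splitOnUnderscore(fileName);
    for (std::size_t i = 0; i < pattern.size() && i < actual.size(); ++i)
        if (pattern[i] == "YYYY-MM-DD") return actual[i];
    return "";
}
```

The same lookup with the placeholder HH-mm-ss would give the time field.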
If you want phone-calls 24/7, then you should go for the RegEx-thing and let the user freely enter some cryptography in a TEdit.
If you want happy users and a good night's sleep, then be creative and drop the boring RegEx approach. Create your own filename decoder by using an Angry Birds approach.
Here's the idea:
Create some birds with different string manipulation personalities.
Let the user select and arrange these birds.
Execute the user generated string manipulation.
Sample code:
program AngryBirdFilenameDecoder;
{$APPTYPE CONSOLE}
uses
SysUtils;
procedure PerformEatUntilDash(var aStr: String);
begin
if Pos('-', aStr) > 0 then
Delete(aStr, 1, Pos('-', aStr));
WriteLn(':-{ > ' + aStr);
end;
procedure PerformEatUntilUnderscore(var aStr: String);
begin
if Pos('_', aStr) > 0 then
Delete(aStr, 1, Pos('_', aStr));
WriteLn(':-/ > ' + aStr);
end;
function FetchDate(var aStr: String): String;
begin
Result := Copy(aStr, 1, 10);
Delete(aStr, 1, 10);
WriteLn(':-) > ' + aStr);
end;
var
i: Integer;
FileName: String;
TempFileName: String;
SelectedBirds: String;
MyDate: String;
begin
Write('Enter a filename to decode (eg. ''01-ThisIsAText-Img_01-Date_2011-03-08.png''): ');
ReadLn(FileName);
if FileName = '' then
FileName := '01-ThisIsAText-Img_01-Date_2011-03-08.png';
repeat
TempFileName := FileName;
WriteLn('Now, select some birds:');
WriteLn('Bird No.1 :-{ ==> I''ll eat letters until I find a dash (-)');
WriteLn('Bird No.2 :-/ ==> I''ll eat letters until I find an underscore (_)');
WriteLn('Bird No.3 :-) ==> I''ll remember the date before I eat it');
WriteLn;
Write('Choose your birds: (eg. 112123):');
ReadLn(SelectedBirds);
if SelectedBirds = '' then
SelectedBirds := '112123';
for i := 1 to Length(SelectedBirds) do
case SelectedBirds[i] of
'1': PerformEatUntilDash(TempFileName);
'2': PerformEatUntilUnderscore(TempFileName);
'3': MyDate := FetchDate(TempFileName);
end;
WriteLn('Bird No.3 found this date: ' + MyDate);
WriteLn;
WriteLn;
Write('Check filename with some other birds? (Y/N): ');
ReadLn(SelectedBirds);
until (Length(SelectedBirds)=0) or (Uppercase(SelectedBirds[1])<>'Y');
end.
When you do this in Delphi with a GUI, you'll add more birds and more checking, of course. And find some nice bird glyphs.
Use two list boxes: one on the left with all possible birds, and one on the right with all the selected birds. Drag and drop birds from left to right. Rearrange (and remove) birds in the list on the right.
The user should be able to test the setup by entering a filename and seeing the result of the process. Internally you store the script by using enumerated types etc.

String extraction

Currently I am working on a very basic game in C++. The game used to be a school project, but now that I am done with that programming class, I want to expand my skills and put some more flourish on this old assignment.
I have already made a lot of changes that I am pleased with. I have centralized all the data into folder hierarchies and I have gotten the code to read those locations.
However my problem stems from a very fundamental flaw that has been stumping me.
In order to access the image data that I am using I have used the code:
string imageLocation = "..\\DATA\\Images\\";
string bowImage = imageLocation + "bow.png";
The problem is that when the player picks up an item on the gameboard my code is supposed to use the code:
hud.addLine("You picked up a " + (*itt)->name() + "!");
to print to the command line, "You picked up a Bow!". But instead it shows "You picked up a ..\DATA\Images\!".
Before I centralized my data I used to use:
name_(item_name.substr(0, item_name.find('.')))
in my Item class constructor to chop the item name to just something like bow or candle. After I changed how my data was structured I realized that I would have to change how I chop the name down to the same simple 'bow' or 'candle'.
I have changed the above code to reflect my changes in data structure to be:
name_(item_name.substr(item_name.find("..\\DATA\\Images\\"), item_name.find(".png")))
but unfortunately as I alluded to earlier this change of code is not working as well as I planned it to be.
So now that I have given that real long winded introduction to what my problem is, here is my question.
How do you extract the middle of a string between two sections that you do not want? Also that middle part that is your target is of an unknown length.
Thank you so very much for any help you guys can give. If you need anymore information please ask; I will be more than happy to upload part or even my entire code for more help. Again thank you very much.
In all honesty, you're probably approaching this from the wrong end.
Your Item class should have a string "bow" in a private member. The function Item::GetFilePath would then (at runtime) do "..\\DATA\\Images\\" + this->name + ".png".
The fundamental property of the "bow" item object isn't the filename bow.png, but the fact that it's a "bow". The filename is just a derived property.
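A minimal sketch of that design, assuming the directory and extension from the question are fixed. The member names name_ and filePath are chosen for illustration and are not taken from the asker's actual class.

```cpp
#include <string>

class Item {
public:
    explicit Item(const std::string& name) : name_(name) {}

    // The fundamental property: what the item is.
    const std::string& name() const { return name_; }

    // Derived property: built from the name whenever it is needed.
    std::string filePath() const {
        return "..\\DATA\\Images\\" + name_ + ".png";
    }

private:
    std::string name_;
};
```

With this layout the HUD message uses name() directly and no string chopping is needed.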
Assuming I understand you correctly, the short version of your question is: how do I split a string containing a file path so I have removed the path and the extension, leaving just the "title"?
You need the find_last_of method. This gets rid of the path:
std::string::size_type lastSlash = filePath.find_last_of('\\');
if (lastSlash == std::string::npos)
fileName = filePath;
else
fileName = filePath.substr(lastSlash + 1);
Note that you might want to define a constant as \\ in case you need to change it for other platforms. Not all OS file systems use \\ to separate path segments.
Also note that you also need to use find_last_of for the extension dot as well, because filenames in general can contain dots, throughout their paths. Only the very last one indicates the start of the extension:
std::string::size_type lastDot = fileName.find_last_of('.');
if (lastDot == std::string::npos)
{
title = fileName;
}
else
{
title = fileName.substr(0, lastDot);
extension = fileName.substr(lastDot + 1);
}
See http://msdn.microsoft.com/en-us/library/3y5atza0(VS.80).aspx
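Put together, the two find_last_of steps could look like the sketch below. The function name titleOf is hypothetical, and accepting / as well as \ as a separator is an extra assumption added for portability.

```cpp
#include <string>

// Strip the directory part and the extension from a file path.
std::string titleOf(const std::string& filePath) {
    // Everything after the last separator is the file name.
    const std::string::size_type lastSlash = filePath.find_last_of("\\/");
    const std::string fileName =
        (lastSlash == std::string::npos) ? filePath
                                         : filePath.substr(lastSlash + 1);

    // Only the last dot of the file name starts the extension.
    const std::string::size_type lastDot = fileName.find_last_of('.');
    return (lastDot == std::string::npos) ? fileName
                                          : fileName.substr(0, lastDot);
}
```

Searching for the dot in fileName rather than in the full path is what keeps a dot in a directory name from being mistaken for the extension.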
Using Boost.Filesystem:
#include <string>
#include "boost/filesystem.hpp"
namespace fs = boost::filesystem;
void some_function(void)
{
std::string imageLocation = "..\\DATA\\Images\\";
std::string bowImage = imageLocation + "bow.png";
fs::path image_path( bowImage );
// stem() is the file name without its extension
hud.addLine("You picked up a " + image_path.stem().string() + "!"); // on Windows, prints: You picked up a bow!
}
So combining Paul's and my thoughts, try something like this (broken down for readability):
const string prefix = "..\\DATA\\Images\\";
string::size_type start = item_name.find(prefix);
// Skip past the prefix if it is present, otherwise start at the beginning
start = (start == string::npos) ? 0 : start + prefix.size();
string::size_type dot = item_name.find_last_of('.'); // start of the extension
name_ = item_name.substr(start, dot - start); // second argument is a length
You could simplify it a bit if you know that the item name always starts with "..\DATA\Images\" (you could store it in a constant and not need to search for it in the string).
Edit: Changed extension finding part to use find_last_of, as suggested by EarWicker, (this avoids the case where your path includes '.png' somewhere before the extension)
item_name.find("..\\DATA\\Images\\") will return the index at which the substring "..\DATA\Images\" starts, but it seems you want the index where it ends, so you should add the length of "..\DATA\Images\" to the index returned by find.
Also, as hamishmcn pointed out, the second argument to substr should be the number of chars to return, which would be the index where ".png" starts minus the index where "..\DATA\Images\" ends, I think.
One thing that looks wrong is that the second parameter to substr should be the number of chars to copy, not the position.
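To make the position-versus-count point concrete, here is a small sketch of extracting the text between a known prefix and suffix. The helper name between is hypothetical, and returning an empty string when either marker is missing is just one possible choice.

```cpp
#include <string>

// Return the text between prefix and suffix, or "" if either is missing.
std::string between(const std::string& text,
                    const std::string& prefix,
                    const std::string& suffix) {
    const std::string::size_type begin = text.find(prefix);
    if (begin == std::string::npos) return "";
    // Index where the prefix ends, i.e. where the wanted part starts.
    const std::string::size_type start = begin + prefix.size();
    // Index where the suffix starts, searching only after the prefix.
    const std::string::size_type end = text.find(suffix, start);
    if (end == std::string::npos) return "";
    // substr takes a start position and a character count, not an end position.
    return text.substr(start, end - start);
}
```

In the question's case the prefix would be "..\\DATA\\Images\\" and the suffix ".png".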