Extract pattern substring from NSString - regex

I've created my own UITabBarController.
Additionally I've written a few lines of code to determine the current user.
E.g. if I am the current user do/display this, otherwise do/display this etc...
The format pattern is (firstname Lastname).
The Full name of the current user is in "displayName".
This is how I set the title of the tab depending on whether I am looking at 'my' tabs or someone else's tabs.
[activities setTitle:[viewingUser objectForKey:@"displayName"]];
I now want to extract only the firstname and display it like so:
"firstname's".
I do know of substringToIndex and substringWithRange, but I just can't seem to work it out myself. I reckon I just need to find the first space and extract the part before it, together with that ['s]. Can anybody please point me in the right direction?
Cheers

If the first name and last name are separated by a space, then simply execute the following statement, which returns an NSArray that in your case will contain the first and last names.
[displayName componentsSeparatedByString:@" "]
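To make the idea concrete outside of Objective-C, here is a minimal Python sketch of the same split-on-space approach. The function name `possessive_first_name` is made up for this example, and it assumes the display name really is "Firstname Lastname" with a space between the two parts.

```python
def possessive_first_name(display_name: str) -> str:
    """Return the first space-separated word of the name followed by 's."""
    # Assumption: the first word of the display name is the first name.
    parts = display_name.strip().split(" ")
    return f"{parts[0]}'s"
```

The same shape carries over to NSString: split on the space, take the first element, then append the suffix.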

Take a look at the NSScanner documentation and associated sample code. There are simpler ways to do it if that is your only dataset, however, the moment you start getting into even semi-complex sequences, you'll need other, more powerful solutions. This is why I'm recommending NSScanner off the top.

Related

Regex: How to extract dialogue tags from fiction, with speaker information

Totally stumped on this. I need help extracting dialogue from a story so I can hand it off for narration.
Basically, this is a problem where I have a big chunk of text (a novel), and I want to extract all the dialogue from the text in a format I can pipe into a spreadsheet.
But, I also want, if it exists, the speaker information as well. So, given a string like:
'"I'm really hungry," she said.'
I would like the values returned as:
[ "I'm really hungry", "she said" ]
If there is no speaker information, as in this example:
"I'm not hungry."
the result would just be:
["I'm not hungry."]
Is this madness? Is it even possible? I have fooled around with this regex (am not a regex guru, knowing only enough to be dangerous):
"([^"]*)"
Which seems to get the dialogue tags, but doesn't get the speaker info. Any advice on how to get the speaker info as well would be greatly appreciated. I've been wrestling with this for a while now.
Maybe a better approach would be to get the dialogue in one field, and the entire paragraph it is found in as the second field. That could also work, but I have no idea where to start with this.
Basically I want to put these all into a spreadsheet so I can hand them off to a narrator with enough context that they know whose dialogue is whose in the story.
Any help is greatly appreciated!
It definitely is possible
Look at this regex: ^.*?'?(?P<line>\".*\")(?P<actor>[^'\n]*)'?.*?$
demo here: https://regex101.com/r/UCRZwY/5
It basically marks the outer quotes as optional and stores whatever follows the quoted dialogue as '$actor' (and the dialogue itself as '$line'). These are of course just names I've given the groups; feel free to change them.
Note: updated to handle such text when it appears as part of a regular sentence; see the example in the demo.
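For anyone who wants to try this outside regex101, here is a rough Python sketch that applies the same pattern line by line and trims the result into the shape asked for in the question. The function name `extract_dialogue` and the trimming of the trailing comma and period are my own assumptions, not part of the regex above.

```python
import re

# Same pattern as in the answer, compiled so that ^ and $ work per line.
PATTERN = re.compile(
    r"""^.*?'?(?P<line>".*")(?P<actor>[^'\n]*)'?.*?$""",
    re.MULTILINE,
)


def extract_dialogue(text):
    """Return [dialogue] or [dialogue, speaker info] for each matching line."""
    rows = []
    for match in PATTERN.finditer(text):
        # Assumption: drop the surrounding quotes and a trailing comma.
        line = match.group("line").strip('"').rstrip(",")
        # Assumption: drop surrounding spaces and a trailing period.
        actor = match.group("actor").strip().rstrip(".")
        rows.append([line, actor] if actor else [line])
    return rows
```

Each row can then be written out with the `csv` module to get something a spreadsheet will open.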

How to fix contacts whose First names were saved as Last names?

I'm trying to fix someone else's contacts (which I retrieved from their phone), but it seems that many, many contacts have their First names also written in the Last name field. Probably the phone was asking in turn for each piece of information and the Last name was the first.
Could someone please help me sort this database? I can export from Address Book a large .vcf, which I can open in Text Wrangler, but the application is quite new to me and I don't think Excel (can't use it with vcf) formulas help.
I haven't used Text Wrangler much but tried to look in the manual. Since I couldn't find something through "search", I gave up skimming the manual.
Could someone please help me make Text Wrangler detect the space between the first and last names and move the last name, if it's the case, before a semicolon?
Edit: There also are some cards without last names, but, again, the first name is written instead in there. So if there is one word (name) in the last name field, it should be moved instead to first name. If there are two (names/words separated by a space), then the first should be moved in another label.
This is what one such card looks like (first name in the last name field):
BEGIN:VCARD
VERSION:3.0
N:Alex Instal;;;;
FN:Alex Instal
TEL;type=CELL;type=pref:nananana
X-ABUID:F9246772-nana-nana-nnana-nananana\:ABPerson
END:VCARD
And a correctly formatted one
BEGIN:VCARD
VERSION:3.0
N:Reynold;Adrian;;;
FN:Adrian Reynold
TEL;type=CELL;type=pref:nananan
X-ABUID:221697DB-3960-nana-nana-nanananana\:ABPerson
END:VCARD
Here is a regular expression with back-references that will work for the Alex Instal case you described. First make sure that the grep option on the TextWrangler search dialog is checked. Search for this:
^N:(\w*)(\s)(\w*);;;;$
and replace with this:
N:\3;\1;;;
Whether this will work for you of course depends on how consistent your file format is and whether there are any special cases like middle names, etc. But this will at least work for the case described and you could tweak it if necessary.
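If you would rather script this than run it in TextWrangler, the same search pattern can be sketched in Python. This is only an illustration: the function name `fix_name_fields` is made up, it assumes the .vcf uses plain LF line endings, and the replacement writes the family name first because that is the order the well-formed card in the question uses (N:Reynold;Adrian;;;).

```python
import re

# Same search pattern as above; MULTILINE makes ^ and $ match at each line.
N_LINE = re.compile(r"^N:(\w*)(\s)(\w*);;;;$", re.MULTILINE)


def fix_name_fields(vcf_text):
    """Rewrite 'N:First Last;;;;' lines as 'N:Last;First;;;'."""
    # Assumption: the word before the space is the first name.
    return N_LINE.sub(r"N:\3;\1;;;", vcf_text)
```

Cards whose N line already has the names split by a semicolon do not match the pattern, so they pass through unchanged.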

String replacement in Perl after comparing with the original string

I want to compare a string in Perl. Below is a detailed description.
original string
../db/proj/upload/1/22352/eng_wall_paper.jpg
I need to extract the file name eng_wall_paper.jpg from the string
and compare it with variable and append the new variable to the string.
new required string
../db/proj/upload/1/22352/new string.jpg
How can it be done? Thanks in advance.
As you can see from the comment on your question, you really need to be specific about where you are actually stuck when you ask a question on SO. However, after looking at your other questions on SO, it looks like you are completely new to the programming world and need a serious helping hand. Hence I think it would be helpful for you to get some useful references to solve your problem.
If I break down your main problem statement, I can see three parts:
1. Need to extract the file name,
The most recommended way to do it is to use File::Basename. However, it's possible to use a regex too. You need to learn how to write a regex and how to capture a group. This link should be really helpful.
2. Compare it with variable,
Whichever method you use from above, you should get the filename in a variable or in $1. Just compare that with whatever you want to compare it with. Make sure to use eq instead of ==. Find more details here.
3. Append the new variable to the string,
By now you should have the old filename in a variable, say $oldname, from first step and the new filename in another variable, say $newfilename. This question is already answered several times in SO, like this one. Hope that helps.
The actual solution is merely 3 or 4 lines of code, which I want you to figure out. Good luck.
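For readers who just want to see the shape of the three steps, here is a hedged sketch in Python rather than Perl, so the Perl version is still left as the exercise. The function name `replace_filename` and its parameters are hypothetical, and it assumes forward-slash paths like the one in the question.

```python
import posixpath


def replace_filename(path, expected_name, new_name):
    """Swap the file name in path for new_name if it equals expected_name."""
    # Step 1: split off the file name (the File::Basename step in Perl).
    directory, old_name = posixpath.split(path)
    # Step 2: string comparison (Perl's eq, not ==).
    if old_name != expected_name:
        return path
    # Step 3: put the new name back onto the directory part.
    return posixpath.join(directory, new_name)
```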

Regexp to parse out a person's name?

This might be a hard one (if not impossible), but can anyone think of a regular expression that will find a person's name, in say, a resume? I know this won't be 100% accurate, but I can't come up with something.
Let's assume the name only shows up once in the document.
No, you can't use regular expressions for this. The only chance you have is if the document is always in the same format and you can find the name based on the context surrounding it. But this probably isn't the case for you.
If you are asking your applicants to submit their résumé online you could provide a separate field for them to enter their name and any other information you need instead of trying to automatically parse résumés.
Forget it - seriously.
Or expect to get a lot of applications from a Mr C Vitae
In my experience, having written something very similar (but a very long time ago), about 95% of resumes have the person's name as the very first line. You could probably have a pretty loose regex checking for alpha, hyphens, periods, and assume that's the name.
Obviously there's no way to do this 100% accurately, as you said, but this would be close.
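A loose first-line check along those lines might look like the Python sketch below. Treat it as a guess at the idea, not a tested parser: the function name `guess_name` is made up, and the pattern only allows ASCII letters, spaces, hyphens, periods and apostrophes, which is an assumption that will reject plenty of real names.

```python
import re

# Assumption: a name line holds only letters, spaces, periods,
# apostrophes and hyphens, and starts with a letter.
NAME_LINE = re.compile(r"^[A-Za-z][A-Za-z .'-]*$")


def guess_name(resume_text):
    """Return the first non-blank line if it looks like a name, else None."""
    for line in resume_text.splitlines():
        line = line.strip()
        if line:
            return line if NAME_LINE.match(line) else None
    return None
```

A first line such as "Curriculum Vitae" also passes this check, which is exactly the Mr C Vitae problem mentioned above.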
Unless you wanted to build an expression that contained every possible name, or-ed together, the expression you are referring to is not "Regular," with a capital R. A good guess might be to go looking for the largest-font words in the document. If they follow a pattern that looks like firstname-lastname, name-initial-name, etc., you could call it a good guess...
That's a really hairy problem to tackle. The regex has to match two words that could be someone's name. The problem with that is that some people, of Hispanic origin, for example, might have a name that's more than 2 words. Also, how would you define two words to match for a name? Would you use a database of common first and last name fields? That might work unless someone has an uncommon name.
I'm reminded of a story a COBOL teacher in college told me about an individual of Asian origin whose name would break every rule the programmers defined for a bank's internal system. His first name was "O.", just the letter O.
The only remotely dependable way to nail down the regex would be if you had something to set off your search with; maybe if a line of text in the resume began with "Name: " then you'd know where to start looking.
tl;dr: People's names and individual resumes are too heavily varied for a regular expression to pick apart.
You could do something like Amazon does for book overviews: SIPs. This would require some after-the-fact double checking by humans but you might find the person's name(s) in there.

Use cases for regular expression find/replace

I recently discussed editors with a co-worker. He uses one of the less popular editors and I use another (I won't say which ones since it's not relevant and I want to avoid an editor flame war). I was saying that I didn't like his editor as much because it doesn't let you do find/replace with regular expressions.
He said he's never wanted to do that, which was surprising since it's something I find myself doing all the time. However, off the top of my head I wasn't able to come up with more than one or two examples. Can anyone here offer some examples of times when they've found regex find/replace useful in their editor? Here's what I've been able to come up with since then as examples of things that I've actually had to do:
1. Strip the beginning of a line off of every line in a file that looks like:
Line 25634 :
Line 632157 :
2. Taking a few dozen files with a standard header which is slightly different for each file and stripping the first 19 lines from all of them all at once.
3. Piping the result of a MySQL select statement into a text file, then removing all of the formatting junk and reformatting it as a Python dictionary for use in a simple script.
4. In a CSV file with no escaped commas, replace the first character of the 8th column of each row with a capital A.
5. Given a bunch of GDB stack traces with lines like
#3 0x080a6d61 in _mvl_set_req_done (req=0x82624a4, result=27158) at ../../mvl/src/mvl_serv.c:850
strip out everything from each line except the function names.
Does anyone else have any real-life examples? The next time this comes up, I'd like to be more prepared to list good examples of why this feature is useful.
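For anyone who wants to try a few of the tasks listed in the question, they can be sketched as ordinary regex substitutions. The Python below is only an illustration; the function names are made up, and the patterns assume input shaped exactly like the samples shown (for instance, the GDB pattern assumes each frame line starts with `#` and a frame number).

```python
import re


def strip_line_prefix(text):
    """Remove a leading 'Line <number> :' from every line."""
    return re.sub(r"^Line \d+ :[ \t]*", "", text, flags=re.MULTILINE)


def capitalize_eighth_column(csv_text):
    """Replace the first character of the 8th column with 'A'."""
    # Group 1 is the first seven columns, commas included.
    pattern = r"^((?:[^,\n]*,){7})[^,\n]"
    return re.sub(pattern, r"\g<1>A", csv_text, flags=re.MULTILINE)


def gdb_function_names(trace):
    """Return only the function name from each stack frame line."""
    pattern = r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\w+) \("
    return re.findall(pattern, trace, flags=re.MULTILINE)
```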
Just last week, I used regex find/replace to convert a CSV file to an XML file.
Simple enough to do really, just chop up each field (luckily it didn't have any escaped commas) and push it back out with the appropriate tags in place of the commas.
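Roughly, the find/replace looks like the Python sketch below. The three columns and the tag names `id`, `name` and `city` are invented for the example, and like the original it assumes no escaped commas (it also does no XML escaping of `&` or `<`).

```python
import re

# One capture group per column; assumes exactly three columns per row.
ROW = re.compile(r"^([^,\n]*),([^,\n]*),([^,\n]*)$", re.MULTILINE)


def csv_to_xml(csv_text):
    """Wrap each comma-separated field of every row in a tag."""
    template = r"<row><id>\1</id><name>\2</name><city>\3</city></row>"
    return ROW.sub(template, csv_text)
```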
Regex make it easy to replace whole words using word boundaries.
(\b\w+\b)
So you can replace unwanted words in your file without disturbing words like Scunthorpe
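A small Python sketch of the whole-word idea, with a made-up helper name `replace_word`: the `\b` anchors on both sides mean the word is only replaced when it stands alone, not when it is buried inside a longer word.

```python
import re


def replace_word(text, word, replacement):
    """Replace word only where it appears as a whole word."""
    pattern = rf"\b{re.escape(word)}\b"
    # A function is used so the replacement is inserted literally.
    return re.sub(pattern, lambda m: replacement, text)
```

For example, replacing "cat" this way leaves "concatenate" alone.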
Yesterday I took a create table statement I made for an Oracle table and converted the fields to setString() method calls using JDBC and PreparedStatements. The table's field names were mapped to my class properties, so regex search and replace was the perfect fit.
Create Table text:
...
field_1 VARCHAR2(100) NULL,
field_2 VARCHAR2(10) NULL,
field_3 NUMBER(8) NULL,
field_4 VARCHAR2(100) NULL,
....
My Regex Search:
/^([a-z0-9_]+) .*$/
My Replacement:
pstmt.setString(1, \1);
The result:
...
pstmt.setString(1, field_1);
pstmt.setString(1, field_2);
pstmt.setString(1, field_3);
pstmt.setString(1, field_4);
....
I then went through and manually set the position int for each call and changed the method to setInt() (and others) where necessary, but that worked handy for me. I actually used it three or four times for similar field to method call conversions.
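The same conversion can be sketched in Python. This is an illustration rather than the exact editor regex: the function name `ddl_to_setters` is made up, the pattern assumes lowercase field names at the start of each line, and a counter fills in the position argument that was set by hand above. The method is still always `setString`, so numeric columns would need the same manual follow-up.

```python
import itertools
import re

# Assumption: each column line starts with a lowercase field name
# followed by a space and the type.
FIELD = re.compile(r"^([a-z0-9_]+) .*$", re.MULTILINE)


def ddl_to_setters(ddl):
    """Turn each column line into a numbered pstmt.setString(...) call."""
    position = itertools.count(1)

    def to_call(match):
        return f"pstmt.setString({next(position)}, {match.group(1)});"

    return FIELD.sub(to_call, ddl)
```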
I like to use regexps to reformat lists of items like this:
int item1
double item2
to
public void item1(int item1){
}
public void item2(double item2){
}
This can be a big time saver.
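As a rough idea of the search and replace involved, here is a Python sketch. The helper name `to_method_stubs` is made up, and it assumes every line is exactly "type name" separated by one space.

```python
import re

# Group 1 is the type, group 2 is the item name.
DECL = re.compile(r"^(\w+) (\w+)$", re.MULTILINE)


def to_method_stubs(declarations):
    """Turn 'type name' lines into empty Java-style method stubs."""
    return DECL.sub(
        lambda m: f"public void {m[2]}({m[1]} {m[2]}){{\n}}",
        declarations,
    )
```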
I use it all the time when someone sends me a list of patient visit numbers in a column (say 100-200) and I need them in a '0000000444','000000004445' format. works wonders for me!
I also use it to pull out email addresses in an email. I send out group emails often and all the bounced returns come back in one email. So, I regex to pull them all out and then drop them into a string var to remove from the database.
I even wrote a little dialog prog to apply regex to my clipboard. It grabs the contents applies the regex and then loads it back into the clipboard.
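The bounced-address case can be sketched in a few lines of Python. The pattern is deliberately loose and is my own assumption about what counts as an address; the function name `extract_emails` is made up too.

```python
import re

# Loose pattern: local part, @, then at least two dot-separated labels.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_emails(text):
    """Return the unique addresses found in text, sorted."""
    return sorted(set(EMAIL.findall(text)))
```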
One thing I use it for in web development all the time is stripping some text of its HTML tags. This might need to be done to sanitize user input for security, or for displaying a preview of a news article. For example, if you have an article with lots of HTML tags for formatting, you can't just do LEFT(article_text,100) + '...' (plus a "read more" link) and render that on a page at the risk of breaking the page by splitting apart an HTML tag.
Also, I've had to strip img tags in database records that link to images that no longer exist. And let's not forget web form validation. If you want to make sure a user has entered a correct email address (syntactically speaking) into a web form, this is about the only way of checking it thoroughly.
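For the article preview case, a minimal Python sketch could look like this. The function name `preview` is made up, and the tag pattern is the simplest one that works on well-formed markup; it is meant for previews and makes no attempt at thorough sanitizing.

```python
import re

# Anything from '<' up to the next '>' is treated as a tag.
TAG = re.compile(r"<[^>]+>")


def preview(article_html, length=100):
    """Strip tags first, then cut, so no tag is ever split in half."""
    text = TAG.sub("", article_html)
    if len(text) > length:
        return text[:length] + "..."
    return text
```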
I've just pasted a long character sequence into a string literal, and now I want to break it up into a concatenation of shorter string literals so it doesn't wrap. I also want it to be readable, so I want to break only after spaces. I select the whole string (minus the quotation marks) and do an in-selection-only replace-all with this regex:
/.{20,60} /
...and this replacement:
/$0"¶ + "/
...where the pilcrow is an actual newline, and the number of spaces varies from one incident to the next. Result:
String s = "I recently discussed editors with a co-worker. He uses one "
+ "of the less popular editors and I use another (I won't say "
+ "which ones since it's not relevant and I want to avoid an "
+ "editor flame war). I was saying that I didn't like his "
+ "editor as much because it doesn't let you do find/replace "
+ "with regular expressions.";
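The same replace can be sketched in Python, using the regex from above as is. The helper name `wrap_literal` and the fixed eight-space indent in `SEPARATOR` are assumptions made for the example.

```python
import re

# Up to 60 characters followed by a space, so breaks fall after spaces.
BREAK = re.compile(r".{20,60} ")
# Close the literal, start a new line, and open the next literal.
SEPARATOR = '"\n        + "'


def wrap_literal(text):
    """Insert a literal break after each chunk matched by the regex."""
    return BREAK.sub(lambda m: m.group(0) + SEPARATOR, text)
```

Wrapping the result in a leading and trailing quotation mark gives the concatenation shown above.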
The first thing I do with any editor is try to figure out its Regex oddities. I use it all the time. Nothing really crazy, but it's handy when you've got to copy/paste stuff between different types of text - SQL <-> PHP is the one I do most often - and you don't want to fart around making the same change 500 times.
Regex is very handy any time I am trying to replace a value that spans multiple lines. Or when I want to replace a value with something that contains a line break.
I also like that you can match things in a regular expression and not replace the full match using the $# syntax to output the portion of the match you want to maintain.
I agree with you on points 3, 4, and 5 but not necessarily points 1 and 2.
In some cases 1 and 2 are easier to achieve using an anonymous keyboard macro.
By this I mean doing the following:
Position the cursor on the first line
Start a keyboard macro recording
Modify the first line
Position the cursor on the next line
Stop record.
Now all that is needed to modify the next line is to repeat the macro.
I could live without support for regex but could not live without anonymous keyboard macros.