Negating a regex - regex

How do I negate the regex [a-zA-Z]+[0-9]*? That is, from the string '12-mar-14, 21, 123_4, Value123, USER, 12/2/13' I need to match everything other than Value123 and USER. Can someone please explain how?
I'm trying to turn the string '12-mar-14, 21, 123_4, Value123, USER, 12/2/13' into '%Value123%USER%' in Java. Anything that doesn't match [a-zA-Z]+[0-9]* should be replaced with %.
I need a regex that gives the following outputs for the corresponding inputs.
Input: '12-mar-14, 21, 123_4, Value123, USER, 12/2/13'
Output: '%Value123%USER%'
Input: '12-mar-14, 21, 123_4'
Output: '%'
Input: 'New, 12-Mar-14, 123, dat_123, Data123'
Output: '%New%Data123%'

Use this method:
//********** MODIFIED *************//
public static void getSentence(String line) {
String text[] = line.split(",");
String res = "";
for (int i = 0; i < text.length; i++) {
String word = text[i].trim();
if (word.matches("[a-zA-Z]+[0-9]*")){
if (!"".equals(res))
res = res + "%";
res = res + word;
}
}
if ("".equals(res))
res = "%";
else
res = "%" + res + "%";
System.out.println(res);
}
...
// getSentence is static, so it can be called without an instance.
getSentence("New, 12-Mar-14, 123, dat_123, Data123");
getSentence("12-mar-14, 21, 123_4, Value123, USER, 12/2/13");
Output:
%New%Data123%
%Value123%USER%
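The same split-and-filter idea can also be written so that it returns the result instead of printing it. The following is only a sketch: the class name TokenFilter and the method name keepMatching are made up for illustration, and it assumes the tokens are always separated by commas, as in the sample inputs.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class TokenFilter {

    // Keeps only the comma-separated tokens that fully match [a-zA-Z]+[0-9]*
    // and joins them with '%', with a leading and trailing '%'.
    static String keepMatching(String line) {
        String joined = Arrays.stream(line.split(","))
                .map(String::trim)
                .filter(word -> word.matches("[a-zA-Z]+[0-9]*"))
                .collect(Collectors.joining("%"));
        // No matching token at all collapses to a single '%'.
        return joined.isEmpty() ? "%" : "%" + joined + "%";
    }

    public static void main(String[] args) {
        System.out.println(keepMatching("12-mar-14, 21, 123_4, Value123, USER, 12/2/13"));
        System.out.println(keepMatching("12-mar-14, 21, 123_4"));
    }
}
```

Returning the string makes the method easier to reuse and to test than printing inside it.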

Related

Java regular expression: escaping multi-line comments containing the character $

final Pattern PATTERN = Pattern.compile("\"[^\"]*\"");
@Test
public void parseCsvTest() {
StringBuffer result = new StringBuffer();
Matcher m = null;
String csv="\"foo$\n" + "bar\"";
try {
m = PATTERN.matcher(csv);
while (m.find()) {
m.appendReplacement(result, m.group().replaceAll("\\R+", ""));
}
m.appendTail(result);
} catch (Exception e) {
e.printStackTrace();
}
String escaped_csv = result.toString();
log.info(escaped_csv);
}
With String csv="\"foo\n" + "bar\"";
I'm getting the expected result that is: "foobar"
But with String csv="\"foo$\n" + "bar\""; (notice the $ char after foo), the pattern doesn't identify the group. Note: $ here is a literal character, not the end-of-line anchor, although it can be followed by a line break.
Tried with PATTERN = Pattern.compile("\"[^\"]*^$?\""); without success: it returns foo and bar on two lines.
Any ideas?
Got it to work with: Pattern.compile("\"*[^$]|\"[^\"]*\"");
Results
csv = "\"foo\n" + "bar\n" + "doe\"" => foobardoe
csv = "\"foo$\n" + "bar\n" + "doe\"" => foo$bardoe
csv = "\"foo$\n" + "bar$\n" + "doe\"" => foo$bar$doe
csv = "\"foo$\n" + "bar$\n" + "doe$\"" => foo$bar$doe$
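A possible alternative, which the thread itself does not confirm as the cause: Matcher.appendReplacement treats $ in the replacement string as a group reference, so a matched field that contains a literal $ can throw IllegalArgumentException even though the pattern matched. Wrapping the replacement in Matcher.quoteReplacement avoids that while keeping the original pattern. The class and method names below are assumptions made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not code from the original post.
final class QuotedFieldCleaner {

    // Same pattern as in the question: a double-quoted run without inner quotes.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Removes line breaks inside each quoted field and leaves everything else as is.
    static String stripLineBreaksInQuotes(String csv) {
        Matcher m = QUOTED.matcher(csv);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String cleaned = m.group().replaceAll("\\R+", "");
            // quoteReplacement escapes '$' and '\' so they are copied literally.
            m.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripLineBreaksInQuotes("\"foo$\n" + "bar\""));
    }
}
```

Unlike the results listed above, this sketch keeps the surrounding double quotes in the output.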

Finding more than one word in a file and then extracting the values present after those words?

Note: I asked a similar question, but it was put "on hold" because I didn't provide my code (I guess). I have now written the code, but I am facing a different problem.
From my .bench file, I have to read the values written in brackets (), which I managed to do. The problem is that I have to read the values in brackets after INPUT, OUTPUT, and NAND.
.bench file
INPUT(1)
INPUT(2)
INPUT(3)
INPUT(6)
INPUT(7)
OUTPUT(22)
OUTPUT(23)
10 = NAND(1, 3)
11 = NAND(3, 6)
16 = NAND(2, 11)
19 = NAND(11, 7)
22 = NAND(10, 16)
23 = NAND(16, 19)
So far, I have written code to find the values inside brackets after INPUT, OUTPUT, and NAND, but as you can see, I am repeating similar lines of code again and again. How can I generalize the code to find the values after OUTPUT, NAND, etc.?
int Circuit::readBenchFile(string filename) //read the benchfile and generate inputs, outputs and gates accordingly
{
//Reading the .bench file
// Open directly from the std::string; no manually allocated copy is needed.
ifstream input_file(filename.c_str());
if(input_file.fail())
{
cout << "Failed to open Bench file.\n";
return 1;
}
///////
string line;
string guard_str("#");
string input_str ("INPUT"), output_str ("OUTPUT"), nand_str("NAND");
while (getline( input_file, line ))
{
std::size_t guard_found = line.find(guard_str);
if (guard_found ==std::string::npos)
{
///Input
std::size_t found = line.find(input_str);
if (found!=std::string::npos)
{
found = line.find_first_of('(', found + 1);
//Getting our input name and printing it.
string out = line.substr( found + 1, ( line.find_first_of(')', found) - found - 1) );
std::cout << out << std::endl;
}
///Output
std::size_t found1 = line.find(output_str);
if (found1!=std::string::npos)
{
found1 = line.find_first_of('(', found1 + 1);
//Getting our input name and printing it.
string out = line.substr( found1 + 1, ( line.find_first_of(')', found1) - found1 - 1) );
std::cout << out << std::endl;
}
///NAND
std::size_t found_2 = line.find(nand_str);
if (found_2!=std::string::npos)
{
found_2 = line.find_first_of('(', found_2 + 1);
//find first input
string first_input = line.substr( found_2 + 1, ( line.find_first_of(',', found_2) - found_2 - 1) );
//Second input
found_2 = line.find_first_of(',', found_2 + 2);
string second_input = line.substr( found_2 + 1, ( line.find_first_of(')', found_2) - found_2 - 1) );
cout<<"\nInputs to NAND gate are: "<<( first_input + string(" & ") + second_input );
}
}
}
return 0; // success: the function is declared to return int
}
I guess the best way to do it is to use regular expressions. A good option for that is the Boost Regex library: http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/index.html.
If you are not familiar with regular expressions, here is a great page that will get you started very quickly: http://www.regular-expressions.info/. The first paragraph on the main page will give you the idea.
In short: regular expressions make it possible to quickly find patterns in text. You can build one regular expression, and a function that uses it, that returns true if any of the words you are looking for is found.
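To illustrate the idea only, here is a sketch in Java rather than C++ or Boost, using java.util.regex. The pattern, the class name BenchLineParser and the method name parse are assumptions made for this example; it handles just the three keywords from the question and skips any line containing '#', as the question's code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing one pattern for all three keywords.
final class BenchLineParser {

    // Group 1: the keyword. Group 2: everything between the parentheses.
    private static final Pattern GATE =
            Pattern.compile("(INPUT|OUTPUT|NAND)\\(([^)]*)\\)");

    // Returns the keyword followed by its arguments, or an empty list
    // for comment lines and lines without a known keyword.
    static List<String> parse(String line) {
        List<String> parts = new ArrayList<>();
        if (line.contains("#")) {
            return parts;
        }
        Matcher m = GATE.matcher(line);
        if (m.find()) {
            parts.add(m.group(1));
            for (String arg : m.group(2).split(",")) {
                parts.add(arg.trim());
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("INPUT(1)"));
        System.out.println(parse("10 = NAND(1, 3)"));
    }
}
```

Supporting another gate type would then mean adding its name to the alternation instead of copying a block of search code.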
Well if you are looking for genericity I would suggest using boost::split.
vector<string> result;
vector<string> value2;
vector<string> nand_case;
boost::split(result , myline, boost::is_any_of("("));
boost::split(value2, result[1], boost::is_any_of(")"));
if (result[0].find("NAND") != string::npos)
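// Split on commas and spaces, merging adjacent separators, so "16, 19" yields "16" and "19".
boost::split(nand_case, value2[0], boost::is_any_of(", "), boost::token_compress_on);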
will give you for INPUT(23):
result[0] : INPUT
result[1] : 23)
value2[0] : 23
will give you for OUTPUT(18):
result[0] : OUTPUT
result[1] : 18)
value2[0] : 18
will give you for 23 = NAND(16, 19):
result[0] : 23 = NAND
result[1] : 16, 19)
value2[0] : 16, 19
nand_case[0] : 16
nand_case[1] : 19
Hope I understood correctly and this can help.

splitting text using Pattern.compile

Here is the line of text:
003 STATE BANK OF BIK & JAI A/C.1 2 1,01,500.00 1 3,160.00 98,340.00+
Here is my code snippet to split it:
Pattern pat = Pattern.compile("[ ]");
String strs[] = pat.split(s);
for (int i = 0; i < strs.length; i++) {
System.out.println("Next Token = " + strs[i]);
}
Here is what I get:
003,STATE,BANK,OF,BIK,*,JAI...etc.
What I really want is:
003,STATE BANK OF BIK & JAI,A/C.1,2.1,01,500.00...etc
Which pattern or metacharacter do I use to accomplish this?
For your case this split call will work:
String data = "003 STATE BANK OF BIK & JAI A/C.1 2 1,01,500.00 1 3,160.00 98,340.00+";
String[] arr = data.split(" +(?=\\S*\\d)|(?<![A-Z&]) +");
System.out.println(Arrays.toString(arr));
OUTPUT:
[003, STATE BANK OF BIK & JAI, A/C.1, 2, 1,01,500.00, 1, 3,160.00, 98,340.00+]

Flash AS3 count capital letters?

How could I count the number of capital letters in a string using flash as3?
for example
var thestring = "This is The String";
should return int 3
Thank you
// Starting string.
var thestring:String = "This is The String";
// Match all capital letters and check the length of the returned match array.
var caps:int = thestring.match(/[A-Z]/g).length;
trace(caps); // 3
One way to solve this is to convert the string to lower case and count the characters affected. That means you don't have to specify which characters to include in the category of "uppercase letters", which isn't trivial. This method supports accented characters such as É.
// Starting string.
var theString:String = "'Ö' is actually the Swedish word for 'island'";
var lowerCase : String = theString.toLowerCase();
var upperCount : int = 0;
for (var i:int = 0; i < theString.length; i++) {
if (theString.charAt(i) != lowerCase.charAt(i)) {
upperCount++;
}
}
trace(upperCount); // prints 2
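The same lower-case comparison can be sketched in Java rather than AS3. This is an illustration only: the class name UpperCaseCounter and the method name countUpper are made up here, and the comparison is done one char at a time so that the two strings cannot get out of step.

```java
// Hypothetical helper, not part of the original answer.
final class UpperCaseCounter {

    // Counts the chars that change when converted to lower case.
    static int countUpper(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.toLowerCase(c) != c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // \u00D6 is the letter O with diaeresis from the example above.
        System.out.println(countUpper("'\u00D6' is actually the Swedish word for 'island'"));
        System.out.println(countUpper("This is The String"));
    }
}
```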
Each letter in a string has a value that corresponds with that letter:
var myString:String = "azAZ";
trace(myString.charCodeAt(0));
trace(myString.charCodeAt(1));
trace(myString.charCodeAt(2));
trace(myString.charCodeAt(3));
// Output is 97, 122, 65, 90
myString.charCodeAt(x) returns the code of the character at position x in the string, counting from 0.
From this output we know that a - z have values ranging from 97 to 122, and that A - Z have values ranging from 65 to 90.
With that, we can now make a For Loop to find capital letters:
var myString:String = "This is The String";
var tally:int = 0;
for (var i:int = 0; i < myString.length; i++)
{
if (myString.charCodeAt(i) >= 65 && myString.charCodeAt(i) <= 90)
{
tally += 1;
}
}
trace(tally);
// Output is 3.
The variable "tally" is used to keep track of the number of capital letters found. In the For Loop, we are seeing if the value of the current letter it is analyzing is between the values 65 and 90. If it is, it adds 1 to tally and then traces the total amount when the For Loop finishes.
Why be succinct? I say, processing power is made to be used. So:
const VALUE_0:uint = 0;
const VALUE_1:uint = 1;
var ltrs:String = "This is JUST some random TexT. How many Caps?";
var cnt:int = 0;
for(var i:int = 0; i < ltrs.length; i++){
cnt += processLetter(ltrs.substr(i,1));
}
trace("Total capital letters: " + cnt);
function processLetter(char:String):int{
var asc:int = char.charCodeAt(0);
if(asc >= Keyboard.A && asc <= Keyboard.Z){
return VALUE_1;
}
return VALUE_0;
}
// Heh heh!

Count how many times new line is present?

For example,
string="help/nsomething/ncrayons"
Output:
String word count is: 3
This is what I have, but the program is looping through the method several times and it looks like I am only getting the last string created. Here's the code block:
Regex regx = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
MatchCollection matches = regx.Matches(output);
//int counte = 0;
foreach (Match match in matches)
{
//counte = counte + 1;
links = links + match.Value + '\n';
if (links != null)
{
string myString = links;
string[] words = Regex.Split(myString, @"\n");
word_count.Text = words.Length.ToString();
}
}
It is \n for newline.
Not sure if regex is a must for your case but you could use split:
string myString = "help/nsomething/ncrayons";
string[] separator = new string[] { "/n" };
string[] result = myString.Split(separator, StringSplitOptions.None);
MessageBox.Show(result.Count().ToString());
Another way using regex:
string myString = "help/nsomething/ncrayons";
string[] words = Regex.Split(myString, @"/n");
word_count.Text = words.Length.ToString();