reader.ReadString does not strip out the first occurrence of delim

I wrote a simple go program and it isn't working as it should:
package main
import (
    "bufio"
    "fmt"
    "os"
)
func main() {
    reader := bufio.NewReader(os.Stdin)
    fmt.Print("Who are you? \n Enter your name: ")
    text, _ := reader.ReadString('\n')
    if aliceOrBob(text) {
        fmt.Printf("Hello, ", text)
    } else {
        fmt.Printf("You're not allowed in here! Get OUT!!")
    }
}
func aliceOrBob(text string) bool {
    if text == "Alice" {
        return true
    } else if text == "Bob" {
        return true
    } else {
        return false
    }
}
It should ask the user for their name and, if they are either Alice or Bob, greet them; otherwise it should tell them to get out.
The problem is that even when the entered name is Alice or Bob, it tells the user to get out.
Alice:
/usr/lib/golang/bin/go run /home/jcgruenhage/go/workspace/src/github.com/jcgruenhage/helloworld/greet/greet.go
Who are you?
Enter your name: Alice
You're not allowed in here! Get OUT!!
Process finished with exit code 0
Bob:
/usr/lib/golang/bin/go run /home/jcgruenhage/go/workspace/src/github.com/jcgruenhage/helloworld/greet/greet.go
Who are you?
Enter your name: Bob
You're not allowed in here! Get OUT!!
Process finished with exit code 0

This is because text contains Bob\n, not Bob: ReadString returns the input up to and including the delimiter.
One way to solve this is to use strings.TrimSpace to trim the newline, e.g.:
import (
....
"strings"
....
)
...
if aliceOrBob(strings.TrimSpace(text)) {
...
Alternatively, you can also use ReadLine instead of ReadString, eg:
...
text, _, _ := reader.ReadLine()
if aliceOrBob(string(text)) {
...
The string(text) conversion is needed because ReadLine returns a []byte instead of a string.
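Putting the TrimSpace suggestion together, a minimal sketch could look like the following. The helper name cleanName is made up for this illustration and is not part of the original program; aliceOrBob is a shortened version of the function from the question.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// cleanName removes surrounding whitespace, including the trailing "\n"
// (or "\r\n" on Windows) that ReadString leaves in place.
// Hypothetical helper, not part of the original program.
func cleanName(text string) string {
	return strings.TrimSpace(text)
}

// aliceOrBob reports whether name is one of the two allowed names.
func aliceOrBob(name string) bool {
	return name == "Alice" || name == "Bob"
}

func main() {
	reader := bufio.NewReader(os.Stdin)
	fmt.Print("Who are you?\nEnter your name: ")
	text, err := reader.ReadString('\n')
	if err != nil && text == "" {
		fmt.Println("could not read a name:", err)
		return
	}
	name := cleanName(text)
	if aliceOrBob(name) {
		fmt.Printf("Hello, %s\n", name)
	} else {
		fmt.Println("You're not allowed in here! Get OUT!!")
	}
}
```

Trimming once, right after reading, keeps the comparison in aliceOrBob free of any knowledge about line endings.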

I think the source of confusion here is that:
text, _ := reader.ReadString('\n')
does not strip out the \n; it keeps it as the last character of the returned string and stops reading there. From the documentation:
ReadString reads until the first occurrence of delim in the input,
returning a string containing the data up to and including the
delimiter.
https://golang.org/src/bufio/bufio.go?s=11657:11721#L435
And then you end up comparing Alice and Alice\n. So the solution is either to compare against Alice\n in your aliceOrBob function, or to read the input differently, as pointed out by @ch33hau.
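To see what ReadString actually hands back, here is a small sketch that reads from an in-memory string instead of os.Stdin (that substitution, and the helper name rawReads, are assumptions made purely for illustration).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// rawReads calls ReadString('\n') until the input is exhausted and returns
// each result exactly as ReadString delivered it.
// Hypothetical helper for illustration only.
func rawReads(input string) []string {
	reader := bufio.NewReader(strings.NewReader(input))
	var chunks []string
	for {
		text, err := reader.ReadString('\n')
		if text != "" {
			chunks = append(chunks, text)
		}
		if err != nil { // io.EOF once nothing is left to read
			break
		}
	}
	return chunks
}

func main() {
	for _, chunk := range rawReads("Alice\nBob\n") {
		// %q makes the trailing newline visible.
		fmt.Printf("%q\n", chunk) // prints "Alice\n", then "Bob\n"
	}
}
```

Each call returns one line with its delimiter still attached, and whatever follows the delimiter stays buffered for the next call.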

I don't know anything about Go, but you might want to strip the string of leading or trailing spaces and other whitespace (tabs, newline, etc) characters.

reader.ReadLine()
strips the trailing '\n', but reader.ReadString('\n') does not.

Related

Remove inline styles from markdown in SwiftUI

I'm pulling in JSON data and displaying it with Text in my SwiftUI App, but some of the text contains inline styles in the MD. Is there a way to remove this or even apply the styles?
Example:
if !ticket.isEmpty {
    Text(self.ticket.first?.notes.first?.prettyUpdatedString ?? "")
        .padding()
    Text(self.ticket.first?.notes.first?.mobileNoteText ?? "")
        .padding()
        .fixedSize(horizontal: false, vertical: true)
}
The prettyUpdatedString prints out "Last updated by < strong >Seth Duncan</ strong>"
Update:
In an attempt to apply this fix to the Ticket Short Detail, an exception is being thrown.
Exception NSException * "*** -[NSRegularExpression enumerateMatchesInString:options:range:usingBlock:]: Range or index out of bounds" 0x0000600003f06370
I'm not sure what's going on here. Any ideas?
Example of Data shortDetail is being pulled from
{
id: ID,
type: "Ticket",
lastUpdated: "2020-07-23T08:19:12Z",
shortSubject: null,
shortDetail: "broken screen - # CEH Tina Desk",
displayClient: "STUDENT",
updateFlagType: 0,
prettyLastUpdated: "6 days ago",
latestNote: {
id: ID,
type: "TechNote",
mobileListText: "<b>S. Duncan: </b> Sent to AGI for repair.",
noteColor: "aqua",
noteClass: "bubble right"
}
},
ERROR CODE
Screen I'm attempting to access...short detail is below name

In my opinion the trick here is to use a regular expression; in this case I would create a function that clears the markdown.
UPDATE
Based on what I understood from your comment, you want to replace &nbsp; with a white space, not an empty string.
To achieve that I just replace all occurrences of &nbsp; with a space " ":
.replacingOccurrences(of: "&nbsp;", with: " ")
Leaving you with this code:
// original answer had an incorrect regex
func clearMarkdown(on str: String) -> String {
    // we build one regular expression for opening and closing tags (and runs of newlines);
    // if the pattern is not valid we return the string unchanged
    guard let match = try? NSRegularExpression(pattern: "<[^>]+>|\\n+") else { return str }
    // we get the range of the string to analyze, in this case the whole string
    let range = NSRange(location: 0, length: str.lengthOfBytes(using: .utf8))
    // we find all matches
    let matches = match.matches(in: str, range: range)
    // we replace them with empty strings, last match first so the earlier ranges stay valid
    return matches.reversed().reduce(into: str) { current, result in
        let range = Range(result.range, in: current)!
        current.replaceSubrange(range, with: "")
    }.replacingOccurrences(of: "&nbsp;", with: " ")
}
This function will clear all markdown styling from your strings, but it will not format the strings or give you any information about the markdown. Taking your example into account, the usage would be something like this:
var str = "Last updated by < strong >Seth Duncan</ strong>"
str = clearMarkdown(on: str) // str is now "Last updated by Seth Duncan"
If you need the styling to be applied, this will not work, but I can write something that will.
UPDATE 2
After looking at your problem I found that some of your strings contain non-ASCII characters, in this case ’ (a typographic apostrophe) where plain text would normally use '. NSRange counts UTF-16 code units, but lengthOfBytes(using: .utf8) counts UTF-8 bytes, so for such strings the range is longer than the string and NSRegularExpression throws the out-of-bounds exception. That being said, you just need to build the range from the UTF-16 length:
// Replacing this line
let range = NSRange(location: 0, length: str.lengthOfBytes(using: .utf8))
// With
let range = NSRange(location: 0, length: str.utf16.count)
Switching the encoding to .windowsCP1254 or .ascii also seemed to avoid the crash when I tested it, but that only sidesteps the mismatch; the UTF-16 count is the length NSRange actually expects.

How to remove newlines inside csv cells using regex/terminal tools?

I have a csv file where some of the cells have newline character inside. For example:
id,name
01,"this is
with newline"
02,no newline
I want to remove all the newline characters inside cells.
How to do it with regex or with other terminal tools generically without knowing number of columns in advance?

This is actually a harder problem than it looks, and in my opinion, means that regex isn't the right solution. Because you're dealing with quoting/escaped strings, spanning multiple 'lines' you end up with a complicated and difficult to read regex. (It's not impossible, it's just messy).
I would suggest instead - use a parser. Perl has one in Text::CSV and it goes a bit like this:
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new( { binary => 1, eol => "\n" } );
while ( my $row = $csv->getline( \*ARGV ) ) {
    s/\n/ /g for @$row;
    $csv->print( \*STDOUT, $row );
}
This will take files as piped in/specified on command line - that's what \*ARGV does - it's a special file handle that lets you do ... basically what sed does:
somecommand.sh | myscript.pl
myscript.pl filename_to_process
The ARGV filehandle does either automagically. (You could explicitly open a file or use \*STDIN if you prefer.)

I suspect that instead of removing the newline you actually want to replace it with a space. If your input file is as simple as it looks this should do it for you:
$ awk '{ORS=( (c+=gsub(/"/,"&"))%2 ? FS : RS )} 1' file
id,name
01,"this is with newline"
02,no newline

If you are using this xlsx2csv tool, it has this option:
-e, --escape Escape \r\n\t characters
Use it, and then replace \n as needed, like (if \n should be replaced by the empty string):
sed 's/\\n//g' filein.csv > fileout.csv
In one pass:
PATH/TO/xlsx2csv.py -e filein.xlsx | sed 's/\\n//g' > fileout.csv

How to do it with regex or with other terminal tools generically without knowing number of columns in advance?
I don't think a regex is the most appropriate approach and might end up being quite complicated. Instead, I think a separate program to process the files might be easier to maintain in the long-term.
Since you're OK with any terminal tools, I've chosen python, and the code's below:
#!/usr/bin/python3 -B
import csv
import sys
with open(sys.argv[1]) as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        stripped = [col.replace('\n', ' ') for col in row]
        print(','.join(stripped))
I think the code above is very straightforward and easy to understand, without a need for complicated regular expressions.
The input file here has the following contents:
id,name
01,"this is
with newline"
02,no newline
To prove it works, its output is reproduced below:
➜ ~ ./test.py input.csv
id,name
01,this is with newline
02,no newline
You could call the python script from some other program and feed filenames to it. You just need to add a minor update for the python program to write out files, if that's what you really need.
I've replaced the newlines with spaces to avoid a potentially unwanted concatenation (e.g. this iswith newline), but you can replace the newline with whatever you want, including the empty string ''.
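The same parser-based idea can also be sketched in Go with the standard encoding/csv package. This is an illustration rather than something taken from the answers here: the names flattenCSV and flattenCSVString are made up, and reading from standard input is an assumption.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"strings"
)

// flattenCSV copies CSV data from in to out, replacing newlines that are
// embedded in cells with a single space. Hypothetical helper.
func flattenCSV(in io.Reader, out io.Writer) error {
	r := csv.NewReader(in)
	r.FieldsPerRecord = -1 // do not require a fixed number of columns
	w := csv.NewWriter(out)
	replacer := strings.NewReplacer("\r\n", " ", "\n", " ")
	for {
		record, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		for i, field := range record {
			record[i] = replacer.Replace(field)
		}
		if err := w.Write(record); err != nil {
			return err
		}
	}
	w.Flush()
	return w.Error()
}

// flattenCSVString is a convenience wrapper for in-memory data.
func flattenCSVString(s string) (string, error) {
	var sb strings.Builder
	err := flattenCSV(strings.NewReader(s), &sb)
	return sb.String(), err
}

func main() {
	if err := flattenCSV(os.Stdin, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "flatten:", err)
		os.Exit(1)
	}
}
```

Because the csv writer quotes a field only when it needs to, a cell that no longer contains a newline is written without quotes, while cells containing commas or quotes are still escaped correctly.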

I have written a method to remove the embedded new line inside the cell. The method below returns a java.util.List object that contains all rows in the CSV file
List<String> getAllRowsInCSVFileAsList(File selectedCSVFile){
    FileReader fileReader = null;
    BufferedReader reader = null;
    List<String> values = new ArrayList<String>();
    try{
        fileReader = new FileReader(selectedCSVFile);
        reader = new BufferedReader(fileReader);
        String line = reader.readLine();
        String previousLine = "";
        // true while we are inside a quoted cell that continues on the next line
        boolean intendLineInCell = false;
        while(line != null){
            if(intendLineInCell){
                if(line.indexOf("\"") != -1 && line.indexOf("\"") == line.lastIndexOf("\"")){
                    previousLine += line;
                    values.add(previousLine);
                    previousLine = "";
                    intendLineInCell = false;
                } else if(line.indexOf("\"") != -1 && line.indexOf("\"") != line.lastIndexOf("\"")){
                    if(getTotalNumberOfCharacterSequenceOccurrenceInString("\"", line) % 2 == 0){
                        previousLine += line;
                    }else{
                        previousLine += line;
                        values.add(previousLine);
                        previousLine = "";
                        intendLineInCell = false;
                    }
                } else{
                    previousLine += line;
                }
            }else{
                if(line.indexOf("\"") == -1){
                    values.add(line);
                }else if ((line.indexOf("\"") == line.lastIndexOf("\"")) && line.indexOf("\"") != -1){
                    intendLineInCell = true;
                    previousLine = line;
                }else if(line.indexOf("\"") != line.lastIndexOf("\"") && line.indexOf("\"") != -1){
                    values.add(line);
                }
            }
            line = reader.readLine();
        }
    }catch(IOException ie){
        ie.printStackTrace();
    }finally{
        if(fileReader != null){
            try {
                fileReader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if(reader != null){
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return values;
}
int getTotalNumberOfCharacterSequenceOccurrenceInString(String characterSequence, String text){
    int count = 0;
    while(text.indexOf(characterSequence) != -1){
        text = text.replaceFirst(characterSequence, "");
        count++;
    }
    return count;
}
Imagine a CSV file with one row and five columns, where the 4th cell contains an embedded newline (Enter pressed inside the cell).
The data will look like this (there is really only one row, but opened in Notepad it looks like two):
dinesh,kumar,24,"23
tambaram india",green
A cell with an embedded newline looks like this:
"23
tambaram india"
The cell starts with a double quote (") and ends with a double quote (").
So while reading a line, an unmatched double quote (") tells us that the cell contains an embedded newline.
The code concatenates the next line to that line and checks whether it contains the closing double quote ("). If it does, the joined row is added to the java.util.List object; otherwise it concatenates the following line, checks again for the closing double quote ("), and so on. I have explained this for one cell, but the method also works if the row has several cells with embedded newlines.

Open the *.csv file with Notepad++ and press Ctrl+H. On the Replace tab, enter the newline in the search box, then enter the text you want to replace it with, or leave the replacement empty if you just want to remove it.

Save next word, if a given word is found (C++)

I'm pretty new to C++. I have a text doc that looks like this:
InputFile.txt
...
.
..
.
.
....
TIME/DISTANCE = 500/ 0.1500E+05
..
..
.
...
TIME/DISTANCE = 500/ 1.5400E+02
.
...
...
.
TIME/DISTANCE = 500/ 320.0565
..
..
.
.
...
The one line shown keeps repeating throughout the file. My objective is to save all the numbers after the 500/ into an array/vector/another file/anything. I know how to read a file and get a line:
string line;
vector <string> v1;
ifstream txtfile ("InputFile.txt");
if (txtfile.is_open())
{
    while (txtfile.good())
    {
        while( getline( txtfile, line ) )
        {
            // ?????
            // if(line.find("500/") != string::npos)
            // ?????
        }
    }
    txtfile.close();
}
Does anybody have a solution? Or point me in the right direction?
Thanks in advance.
Edit: Both proposed solutions (Jerry's and Galik's) work perfectly. I love this community. :)

This is one of those rare cases that (IMO) it may make sense to use sscanf in C++.
std::string line;
std::vector<double> numbers;
while (std::getline(txtfile, line)) {
    double d;
    if (1 == sscanf(line.c_str(), " TIME/DISTANCE = 500 / %lf", &d))
        numbers.push_back(d);
}
This takes each line, and attempts to treat it as having the format you care about. Where that succeeded, the return value from sscanf will be 1 (the number of items converted). Where it fails, the return value will be 0 (i.e., it didn't convert anything successfully). Then we save it if (and only if) there was a successful conversion.
Also note that sscanf is "smart" enough to treat a single space in the format string as matching an arbitrary amount of white-space in the input, so we don't have to try to match the amount of white space precisely.
We could vary this somewhat. If there has to be a number before the '/', but it could be something other than 500, we could replace that part of the format string with %*d. That means sscanf will look for a number (specifically an integer) there, but not assign it to anything. If it finds something other than an integer, conversion will fail, so (for example) TIME/DISTANCE = ABC/1.234 would fail, but TIME/DISTANCE = 234/1.234 would succeed.

When processing your line you can use line.find() to check that it's the right line and to find your data:
if(line.find("TIME/DISTANCE") != std::string::npos)
{
    // this is the correct line
}
Once you have the correct line you can get the position of the data like this:
std::string::size_type pos = line.find("500/");
if(pos != std::string::npos)
{
    // pos holds the position of "500/"; the numbers you want start 4 characters later
    std::string wanted_numbers = line.substr(pos + 4); // get only the numbers in a string
}
Hope that helps
EDIT: Fixed bug (adding 4 to pos to skip over the "500/" part)

std::getline removes whitespaces?

So I am creating a command line application and I am trying to allow commands with parameters, or if the parameter is enclosed with quotations, it will be treated as 1 parameter.
Example: test "1 2"
"test" will be the command, "1 2" will be a single parameter passed.
Using the following code snippet:
while(getline(t, param, ' ')) {
    if (param.find("\"") != string::npos) {
        ss += param;
        if (glue) {
            glue = false;
            params.push_back(ss);
            ss = "";
        }
        else {
            glue = true;
        }
    }
    else {
        params.push_back(param);
    }
}
However std::getline seems to auto remove whitespace which is causing my parameters to change from "1 2" to "12"
I've looked around but results are flooded with "How to remove whitespace" answers rather than "How to not remove whitespace"
Anybody have any suggestions?

However std::getline seems to auto remove whitespace
That's exactly what you are telling getline to do:
getline(t, param, ' ');
The third argument in getline is the delimiter. If you want to parse the input line, you should read it until '\n' is found and then process it:
while(getline(t, param)) {
/* .. */
}

Umm, you are telling it to use ' ' as a delimiter in std::getline. Of course it's going to strip the whitespace.
http://www.cplusplus.com/reference/string/getline/

Capitalize every word in actionScript using a regular expression

I'm trying to do initial caps in ActionScript without loops, but I'm stuck. I wanted to select the first letter of every word and then apply uppercase to that letter. I got the selection part right, but I'm at a dead end right now. Any ideas? I was trying to do this without loops and without cutting up strings.
// replaces with x since I can't figure out how to replace with
// the found result as uppercase
public function initialcaps():void
{
    var pattern:RegExp=/\b[a-z]/g;
    var myString:String="yes that is my dog dancing on the stage";
    var nuString:String=myString.replace(pattern,"x");
    trace(nuString);
}

You can also use this to avoid the compiler warnings.
myString.replace(pattern, function():String
{
return String(arguments[0]).toUpperCase();
});

Try to use a function that returns the uppercase letter:
myString.replace(pattern, function($0){return $0.toUpperCase();})
This works at least in JavaScript.

Just thought I'd throw in my two cents for strings that may be all caps:
var pattern:RegExp = /\b[a-zA-Z]/g;
myString = myString.toLowerCase().replace(pattern, function($0){return $0.toUpperCase();});

This answer does not throw any kind of compiler errors under strict and I wanted it to be a little more robust, handling edge cases like hyphens (ignore them), underscores (treat them like spaces) and other special non-word characters such as slashes or dots.
It's really important to note the /g switch at the end of the regular expression. Without it, the rest of the function is pretty useless, because it will only address the first word, and not any subsequent ones.
for each ( var myText:String in ["this is your life", "Test-it", "this/that/the other thing", "welcome to the t.dot", "MC_special_button_04", "022s33FDs"] ){
var upperCaseEveryWord:String = myText.replace( /(\w)([-a-zA-Z0-9]*_?)/g, function( match:String, ... args ):String { return args[0].toUpperCase() + args[1] } );
trace( upperCaseEveryWord );
}
Output:
This Is Your Life
Test-it
This/That/The Other Thing
Welcome To The T.Dot
MC_Special_Button_04
022s33FDs
For the copy-and-paste artists, here's a ready-to-roll function:
public function upperCaseEveryWord( input:String ):String {
return input.replace( /(\w)([-a-zA-Z0-9]*_?)/g, function( match:String, ... args ):String { return args[0].toUpperCase() + args[1] } );
}
