How to implement syntax highlighting using the sed command in the gnome-terminal? - regex

I want to highlight the function names in a C program using the sed command in a Linux terminal.
I can do it by using tput to color the function name (the first command below).
I cannot do it when I use printf, echo, or another command substitution to produce the colors (the second command below). I guess this is because I am not able to reference the matched strings with \1 and \2: the output shows some other characters instead of the function names.
The regular expression says that the first character of a function name is a letter or an underscore, the following characters are alphanumeric or underscores, and the name is followed by an open parenthesis. I want to reference these groups as \1, \2 and \3 and color everything except \3. This is the idea I have come up with.
My question: is there another way to avoid coloring the open parenthesis, or a way to use printf and still color the function name?
sed -E "s,([a-zA-Z_])([a-zA-Z0-9_]*)(\(),$(tput setaf 1)\1\2$(tput sgr0)\3," Sample.c
sed -E "s,([a-zA-Z_])([a-zA-Z0-9_]*)(\(),$(printf "\033[0;36m\1\2\033[0m\3")," Sample.c
Sample.c:
#include <stdio.h>
int main()
{
int array[100], maximum, size, c, location = 1;
printf("Enter the number of elements in array\n");
scanf("%d", &size);
printf("Enter %d integers\n", size);
for (c = 0; c < size; c++)
scanf("%d", &array[c]);
return 0;
}
Expected result -> main, printf, scanf should be coloured in Sample.c.

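The tput version works because tput only produces the escape sequences, while \1 and \2 stay in the sed replacement. In the printf version the backreferences sit inside the command substitution, so printf interprets \1, \2 and \3 as octal escapes and emits the bytes 0x01, 0x02 and 0x03 before sed ever runs; that is why other characters show up instead of the function names.
In bash you can use the var=$'...' (ANSI-C quoting) syntax to store the escape sequences in variables. Three capture groups did not seem necessary, so I used two: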
BEGC=$'\033[0;36m' ENDC=$'\033[0m'; \
sed -E "s,([a-zA-Z_][a-zA-Z0-9_]*)(\(),${BEGC}\1${ENDC}\2," Sample.c
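If you would rather stay with printf, as the question asks, one option is to let printf generate only the color codes and keep the backreferences in the sed script. The following is a minimal sketch, not a definitive solution; the function name highlight_calls is made up for illustration, and the color (cyan) is taken from the question.

```shell
# printf only ever sees the color codes, never \1 or \2.
BEGC=$(printf '\033[0;36m')   # cyan foreground
ENDC=$(printf '\033[0m')      # reset attributes

# highlight_calls is a hypothetical helper name used for illustration.
# It colors the first identifier on each line that is directly followed by "(".
highlight_calls() {
    sed -E "s,([a-zA-Z_][a-zA-Z0-9_]*)(\(),${BEGC}\1${ENDC}\2,"
}

printf 'int main()\n' | highlight_calls
```

Because the variables are filled with plain command substitution, this form does not depend on the bash-only $'...' quoting.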
However, there is a further issue: nested function calls will not be highlighted. Notice that in the updated Sample.c below, the call to the fictitious getSize() function is not highlighted:
#include <stdio.h>
int main()
{
int array[100], maximum, size, c, location = 1;
printf("Enter the number of elements in array\n");
scanf("%d", &size);
printf("Enter %d integers\n", getSize(size));
for (c = 0; c < size; c++)
scanf("%d", &array[c]);
return 0;
}
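Recursion is not required here, though: without the g flag, s replaces only the first match on each line, so printf is colored and getSize is not. Appending g to the s command highlights every call on the line. The pattern still cannot tell a call from text inside a string literal or comment, or from a keyword written directly before a parenthesis such as while(; for that you need something that tokenizes C, for example an awk script or a real highlighter.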
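One thing worth checking before reaching for awk is the g flag, since s replaces only the first match per line unless it is given. Here is a small sketch under that assumption, using the getSize line from the updated Sample.c; highlight_all_calls is a hypothetical name chosen for this example.

```shell
BEGC=$(printf '\033[0;36m')   # cyan foreground
ENDC=$(printf '\033[0m')      # reset attributes

# highlight_all_calls is a hypothetical helper name used for illustration.
# The trailing g makes sed replace every match on the line, not just the first.
highlight_all_calls() {
    sed -E "s,([a-zA-Z_][a-zA-Z0-9_]*)(\(),${BEGC}\1${ENDC}\2,g"
}

# Single quotes keep the \n inside the C string literal as two plain characters.
printf '%s\n' 'printf("Enter %d integers\n", getSize(size));' | highlight_all_calls
```

With g, both printf and getSize are wrapped in the color codes, because sed continues scanning after each replaced match.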

Related

Deleting multiple types of comments with sed command [duplicate]

This question already has answers here:
Remove comments from C/C++ code
I have a directory of C files. I want to remove all types of comments from these source files.
For example, let's say I have a source code that is similar to the following file.
#include <stdio.h>
int main() {
int number;
/* Sample Multiline Comment
* Line 1
* Line 2
*/
printf("Enter an integer: ");
// reads and stores input
scanf("%d", &number);
printf("You entered: %d", number); //display output
return 0;
/* Comment */
}
I want to remove all types of comments in this code. This includes,
//
/* */
/*
*
*/
I have attempted to carry out this task by using a sed command.
find . -type f |xargs sed -i 's,/\*\*,,g;s,\*/,,g;s,/\*,,g;s,//,,g'
This only removes the comment symbols themselves, not the comments. I would like to remove each entire comment along with the comment symbols.
How can I achieve this?
Approach this from two perspectives.
You delete a line that starts with your matching criteria
You delete content that starts with some criteria and ends with different criteria.
To delete a line that starts with criteria:
sed '/^\/\// d'
To delete a comment that starts and ends on the same line, use:
sed 's/\/\*.*\*\///'
Warning. Be careful when you have other lines that may start with the applicable characters.
I hope this is what you are looking for.
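The two approaches above can be combined into one sed invocation that also covers comments spanning several lines. This is only a sketch under simplifying assumptions: strip_comments is a made-up helper name, comment markers inside string literals are not recognized, code sharing a line with the start or end of a multi-line comment is deleted with it, and blank lines are dropped.

```shell
# strip_comments is a hypothetical helper name used for illustration.
strip_comments() {
    # 1. remove /* ... */ comments that open and close on the same line
    # 2. delete whole lines from an opening /* to the closing */
    # 3. remove // comments together with the whitespace before them
    # 4. delete lines that are now empty
    sed -e 's,/\*.*\*/,,g' \
        -e '/\/\*/,/\*\//d' \
        -e 's,[[:space:]]*//.*$,,' \
        -e '/^[[:space:]]*$/d'
}

printf '%s\n' 'int a; // note' '/* one */' '/* two' 'still two */' 'int b;' | strip_comments
```

For anything beyond simple files, the preprocessor approach is likely the safer choice, because it understands string literals.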
This is a quick attempt with awk, but maybe it helps:
#! /usr/bin/env bash
awk '
function remove_comments(line)
{
    # a multi-line comment is active: drop everything up to its end
    if (flag_c == 1) {
        if (sub(/.*[*][\/]/, "", line)) {
            flag_c = 0
        }
        else {
            # the whole line is inside the comment
            return ""
        }
    }
    # remove /* */ comments that open and close on the same line
    gsub(/[\/][*].*[*][\/]/, "", line)
    # remove a single-line comment, if any
    sub(/[\/][\/].*$/, "", line)
    # set flag_c=1 if a multi-line comment starts on this line
    if (sub(/[\/][*].*/, "", line))
    {
        flag_c = 1
    }
    return line
}
##
# MAIN
##
{
    $0 = remove_comments($0)
    # skip lines that are empty after comment removal
    if ($0 == "")
        next
    print
}
' file.c
You are best off using the C preprocessor for this, as in the answer for Remove comments from C/C++ code.
You can ask the preprocessor to remove comments by running gcc -fpreprocessed -dD -E foo.c.
$ cat foo.c
#include <stdio.h>
int main() {
int number;
/* Sample Multiline Comment
* Line 1
* Line 2
*/
printf("Enter an integer: ");
// reads and stores input
scanf("%d", &number);
printf("You entered: %d", number); //display output
return 0;
/* Comment */
}
$ gcc -fpreprocessed -dD -E foo.c
# 1 "foo.c"
#include <stdio.h>
int main() {
int number;
printf("Enter an integer: ");
scanf("%d", &number);
printf("You entered: %d", number);
return 0;
}

Ignore C comments and include statements using sed (//, /**, **/, #)

So I've got some example C code
/**
example text
**/
#include <stdio.h>
int main(){
int example = 0;
// example text
return;
}
How would I specifically use sed to ignore all lines starting with // or # while also ignoring lines in the range of /** to **/?
I've tried things along the lines of sed -E '/(^#|\/\*/,/\*\/|^\/\/)/!s/example/EXAMPLE/g' but I have a feeling I'm not using the | correctly as it pops an error saying "unmatched ("
My desired final output should be
/**
example text
**/
#include <stdio.h>
int main(){
int EXAMPLE = 0;
// example text
return;
}
The change from the sed command would have changed instances of the word "example" in the program to the uppercase version "EXAMPLE", and what I'm trying to do is make sure words on commented lines are not being changed.
sed may not be the right tool for this job in general, as sin and melpomene mention in the comments, but the command below will do the trick for your particular exercise:
sed -E '/(#|\/\/)/b ; /\/\*\*/,/\*\*\//b; s/example/EXAMPLE/g' file
/**
example text
**/
#include <stdio.h> example
int main(){
int EXAMPLE = 0;
// example text
return;
}
The sed command b branches to a label:
'b LABEL'
Unconditionally branch to LABEL. The LABEL may be omitted, in
which case the next cycle is started.
In other words, instead of negating a pattern with /pattern/! you can use /pattern/b without a label: when /pattern/ matches, sed jumps (because of b) to the end of the script and starts the next cycle, skipping the s/example/EXAMPLE/g command.
Your attempt does not work because | alternates only within a single regex, such as # or //. A range like /\/\*\*/,/\*\*\// is two separate addresses joined by a comma, so it cannot be placed inside the parentheses; sed reads /(^#|\/\*/ as a complete regex, hence the "unmatched (" error.
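Since the question asks about lines starting with # or //, here is a variant of the same idea with the first address anchored to the start of the line. It is a sketch, not the only way to write it: rename_outside_comments is a hypothetical name, leading whitespace before # or // is allowed as an assumption, and each command is passed with its own -e so that the b commands do not rely on a semicolon ending the label.

```shell
# rename_outside_comments is a hypothetical helper name used for illustration.
rename_outside_comments() {
    # 1. lines starting with # or // : branch to the end of the script, no substitution
    # 2. lines from /** to **/       : same
    # 3. every other line            : replace example with EXAMPLE
    sed -E -e '/^[[:space:]]*(#|\/\/)/b' \
           -e '/\/\*\*/,/\*\*\//b' \
           -e 's/example/EXAMPLE/g'
}

printf '%s\n' '/**' 'example text' '**/' '#include <stdio.h>' 'int example = 0;' '// example text' | rename_outside_comments
```

A // comment that follows code on the same line is still subject to the substitution, because the line does not start with //.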

how scanf handles standard input vs pipelined input

In the following code:
#include <cstdio>
using namespace std;
int N;
char x[110];
int main() {
scanf("%d\n", &N);
while (N--) {
scanf("0.%[0-9]...\n", x);
printf("the digits are 0.%s\n", x);
}
}
when I enter the input through file using command
./a.out < "INPUTFILE"
it works as expected, but when I type the input on standard input, the output of printf is delayed.
Why is that?
Input
3
0.1227...
0.517611738...
0.7341231223444344389923899277...
Output through file
the digits are 0.1227
the digits are 0.517611738
the digits are 0.7341231223444344389923899277
output through standard input
the digits are 0.1227
the digits are 0.517611738
The code is from a book, Competitive Programming 3, ch1_02_scanf.cpp.
Because there is an extra \n in your scanf format string, scanf reads and discards all whitespace (spaces, newlines and tabs, if present) at that point. It can only tell that the whitespace sequence has ended when it sees something else, so you should either hit Ctrl-D for EOF or enter another non-whitespace character that lets scanf stop.
As Sid S suggests, you can alternatively change your format string like this:
scanf(" 0.%[0-9]...", x);
The program works when you redirect its stdin from a file because scanf reaches EOF after the last line, which ends the final whitespace match. When you type at the console you have to signal EOF yourself. That is the difference.
Change the second scanf() line to
scanf(" 0.%[0-9]...", x);

What's wrong with my program (scanf, C++)?

What is wrong with this program?
#define _CRT_SECURE_NO_WARNINGS
#include <cstdio>
using namespace std;
int N;
char x[110];
int main() {
scanf("%d", &N);
while (N--) {
scanf("0.%[0-9]...", &x);
printf("the digits are 0.%s\n", x);
}
return 0;
}
this is a console application in VS 2013, here is a sample input and output:
input :
1
0.123456789...
output
the digits are 0.123456789
But when I enter this input, the output I get is: The digits are 0
I have tried entering the input both manually and from a text file, in Code::Blocks and VS 2013, but neither worked. And when entering it manually, the program doesn't wait for me to enter the numbers after I have entered N.
What should I do?
The scanf() in the loop fails, but you didn't notice. It fails because it can't handle a newline before the literal 0. So, fix the format string by adding a space, and the code by testing the return value from scanf():
if (scanf(" 0.%[0-9]...", x) != 1)
…oops — wrong format…
The blank in the format string skips zero or more white space characters; newlines (and tabs and spaces, etc) are white space characters.
You also don't want the & in front of x. Strictly, it causes a type violation: %[0-9] expects a char * but you pass a char (*)[110], which is quite a different type. However, it happens to be the same address, so you get away with it, but correct code shouldn't do that.
Replace
scanf("0.%[0-9]...", &x);
with
scanf(" 0.%[0-9]...", x);
You need to provide the starting location of the array which is x, not &x. Also, the space at the beginning will read all the whitespace characters and ignore it. So the \n left by scanf("%d", &N); is ignored.

printf("something\n") outputs "something " (additional space) (g++/linux/reading output file with gedit)

I have a simple C++ program that reads stdin using scanf and returns results to stdout using printf:
#include <iostream>
using namespace std;
int main()
{
int n, x;
int f=0, s=0, t=0;
scanf("%d",&n); scanf("%d",&x);
for(int index=0; index<n; index++)
{
scanf("%d",&f);
scanf("%d",&s);
scanf("%d",&t);
if(x < f)
{
printf("first\n");
}
else if(x<s)
{
printf("second\n");
}
else if(x<t)
{
printf("third\n");
}
else
{
printf("empty\n");
}
}
return 0;
}
I am compiling with g++ and running under linux. I execute the program using a text file as input, and pipe the output to another text file as follows:
program < in.txt > out.txt
The problem is that out.txt looks like this:
result1_
result2_
result3_
...
Where '_' is an extra space at the end of each line. I am viewing out.txt in gedit.
How can I produce output without the additional space?
My input file looks like this:
2 123
123 123 123
123 234 212
Edit: I was able to find a workaround for this issue: printf("\rfoo");
Thanks for your input!
Try removing the '\n' from your printf() statements, and run the code again. If the output file looks like one long word (no spaces), then you know that the only thing being inserted after the text is that '\n'.
I assume that the editor you are using to read the out.txt file just makes it look like there is an extra space after the output.
If you are still unsure, you can write a quick program to read in out.txt and determine the ASCII code of each character.
The end of line chars are:
System  Hex     Decimal  Type
Mac     0D      13       CR
DOS     0D 0A   13 10    CR LF
Unix    0A      10       LF
For an end of line on each system you can use:
printf("%c", 13);
printf("%c%c", 13, 10);
printf("%c", 10);
You can use this like
printf("empty");
printf("%c", 10);
See the Wikipedia article on Newline.
Okay, it's a little hard to figure this out, as the example program has numerous errors:
g++ -o example example.cc
example.cc: In function 'int main()':
example.cc:19: error: 'k' was not declared in this scope
example.cc:22: error: 'o' was not declared in this scope
example.cc:24: error: 'd' was not declared in this scope
make: *** [example] Error 1
But it's not going to be your input file; your scanf will be loading whatever you're typing into ints. This example, though:
/* scan -- try scanf */
#include <stdio.h>
int main(){
int n ;
(void) scanf("%d",&n);
printf("%d\n", n);
return 0;
}
produced this result:
bash $ ./scan | od -c
42
0000000 4 2 \n
0000003
on Mac OS/X. Get us a copy of the code you're actually running, and the results of od -c.
More information is needed here, as timhon asked, which environment are you working under? Linux, Windows, Mac? Also, what text editor are you using which displays these extra spaces?
My guess is that your space isn't really a space. Run
od -hc out.txt
to double check that it is really a space.
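To see what such a check looks like, here is a small sketch that dumps a line ending in \r\n and tests for a carriage return. It assumes the stray character might be a CR, which the question does not confirm; has_cr is a hypothetical helper name.

```shell
# has_cr is a hypothetical helper name used for illustration.
# It succeeds if the data on stdin contains a carriage return,
# which od -c prints as the two characters \r.
has_cr() {
    od -c | grep -q '\\r'
}

# Dump a DOS-style line: the listing shows \r right before \n.
printf 'first\r\n' | od -c

# After deleting carriage returns, only \n is left at the end of the line.
printf 'first\r\n' | tr -d '\r' | od -c
```

Running has_cr < out.txt would then tell you whether the output file contains any carriage returns at all.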
First, the code sample you've given doesn't compile as o and d are not defined...
Second, you've probably got whitespace at the end of the line you're reading in from the input file. Try opening it in vi to see. Otherwise, you can call a trim function on each line prior to output and be done with it.
Good luck!
Make sure you're looking at the output of the program you expect; this has a syntax error (no ";" after int n).
I feel like it's not even close to this, but if you run this on Windows, you'll get \r\n as line terminators, and, maybe, under *nix, under a non-Windows-aware text editor, you'll get \r as a common blank space, since \r is not printable.
It's a long shot; the best way to test this is to open the file in a hexadecimal editor and see for yourself.