It's been nearly four decades since I wrote much Fortran code. If I have a string:
"04L "
How do I extract the numeric part (always 2 digits), the single letter, and ignore trailing whitespace? I want to end up with two strings, "04" and "L".
I'm modifying a Fortran 90 program.
Just use indexing: str(1:2) gives the first two characters and str(3:3) the third. Here is a minimal example:
program test
character(len=*),parameter :: str = "04L "
print *,str(1:2)
print *,str(3:3)
end program
Related
A simple date string needs to be tokenized. I'm using this sample XSLT code:
fn:tokenize(date, '[ .\s]+')
All variants of badly formatted dates (e.g. "10.10.2020", "10. 10 .2020", "10 . 10. 2020") are tokenized correctly by the function above, except when a leading space is present (e.g. " 10.10.2020"). In that case the first token comes back as a blank.
Is there an option to ignore these leading spaces as well so no matter how bad the format is, only delimiter "." means another token and all spaces are stripped as well?
The right solution seems to be:
fn:tokenize(normalize-space(date), '[ .\s]+')
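To see why normalizing whitespace before tokenizing helps, here is a rough Python analogue of the same idea. It is only a sketch: the helper names normalize_space and tokenize_date are made up for illustration, and Python's re.split is standing in for fn:tokenize.

```python
import re

def normalize_space(text):
    # Rough analogue of XPath normalize-space(): strip both ends and
    # collapse internal whitespace runs to single spaces.
    return " ".join(text.split())

def tokenize_date(date):
    # Split on runs of spaces and dots, as the pattern in the answer does.
    return re.split(r"[ .\s]+", normalize_space(date))

if __name__ == "__main__":
    for raw in ["10.10.2020", "10. 10 .2020", " 10.10.2020"]:
        print(repr(raw), "->", tokenize_date(raw))
```

Without the normalization step, a leading delimiter produces an extra empty first token, which is the symptom described in the question.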
I'm having real trouble finding a succinct solution to this simple problem. Currently I have cells which contain many comma-separated items. I just want the first 5.
i.e. cell A1 =
text, another string, something else, here's another one, guess what another string here, and another, hello i'm another string, another string etc, etc, etccccc
and I'm trying to grab just the first 5 strings.
Beyond that, I wonder if I can incorporate a formula such as =LEN(A1)>20
Currently I do this with numerous formulas: =IFERROR(INDEX( SPLIT(C31,","),1)), then =IFERROR(INDEX( SPLIT(C31,","),2)), etc., and then run the LEN formula above.
Is there a simpler solution? Thanks so much.
Try,
=split(replace(A1, find("|", SUBSTITUTE(A1, ", ", "|", 5)), len(A1), ""), ", ", false)
For Excel, with data in A1, in B1 enter:
=TRIM(MID(SUBSTITUTE($A1,",",REPT(" ",999)),COLUMNS($A:A)*999-998,999))
and copy across.
To get all 5 substrings into a single cell, use:
=LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,",",CHAR(1),5))-1)
=ARRAY_CONSTRAIN(SPLIT(A1,","),1,5)
=REGEXEXTRACT(A1,"((?:.*?,){5})")
=REGEXEXTRACT(A1,REPT("(.*?),",5))
SPLIT to split by delimiter
ARRAY_CONSTRAIN to constrain the array
REGEX1 to extract 5 comma separated values
. Any character
.*?, Any character repeated unlimited number of times (? as little as possible) followed by a ,
{5} Quantifier
REPT to repeat strings
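For comparison, the same "split, then keep the first five" logic can be sketched outside the spreadsheet. This is a hedged Python illustration rather than a replacement for the formulas above; first_items and longer_than are hypothetical helper names, and the limit of 20 comes from the LEN(A1)>20 check in the question.

```python
def first_items(cell, count=5, delimiter=","):
    # Split on the delimiter, keep the first `count` pieces, trim spaces.
    return [item.strip() for item in cell.split(delimiter)[:count]]

def longer_than(items, limit=20):
    # One boolean per item, mirroring a LEN(...)>20 check on each piece.
    return [len(item) > limit for item in items]

if __name__ == "__main__":
    cell = ("text, another string, something else, here's another one, "
            "guess what another string here, and another, "
            "hello i'm another string, another string etc, etc, etccccc")
    items = first_items(cell)
    print(items)
    print(longer_than(items))
```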
I wrote the code below, which first reads N, the number of strings, then concatenates the N strings and prints the result. I set the advance="no" option, but the output still goes to a new line.
I run this code on this site (https://yukicoder.me/problems/no/597)
program main
implicit none
integer::i,j,k,n
character(1000)::str,ans
read*,n
do i=1,n
read(*,'(a)')str
str=trim(str)
write(*,'(a)',advance='no')str
end do
print*,""
end program
The str=trim(str) has no effect: the trimmed string is assigned back to str and padded with blanks again (the length of str remains 1000).
Unlike C, for example, Fortran has no string termination character; it fills the rest of the string with spaces.
The trailing blanks should instead be dropped when writing the string to the output:
write(*,'(a)',advance='no') trim(str)
I have a string, and I want to extract, using regular expressions, groups of characters that are between the character : and the other character /.
typically, here is a string example I'm getting:
'abcd:45.72643,4.91203/Rou:hereanotherdata/defgh'
and so I want to retrieve 45.72643,4.91203 and also hereanotherdata,
as they are both between the characters : and /.
I tried this syntax on a simpler string where the pattern occurs only once,
[tt]=regexp(str,':(\w.*)/','match')
tt = ':45.72643,4.91203/'
but it works only if the pattern occurs once. If I use it on a string containing the pattern multiple times, I get everything between the first : and the last /.
How can I specify that the pattern will occur multiple times, and how can I retrieve each match?
Use lookaround and a lazy quantifier:
regexp(str, '(?<=:).+?(?=/)', 'match')
Example (Matlab R2016b):
>> str = 'abcd:45.72643,4.91203/Rou:hereanotherdata/defgh';
>> result = regexp(str, '(?<=:).+?(?=/)', 'match')
result =
1×2 cell array
'45.72643,4.91203' 'hereanotherdata'
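The difference between the greedy attempt in the question and the lazy, lookaround-based pattern can be checked in any regex engine that supports lookbehind. Below is a small Python sketch (Python's re module rather than Matlab's regexp, so treat it as an analogy; the function names are illustrative only).

```python
import re

def extract_between_lazy(text):
    # Lookbehind for ':', lazy match, lookahead for '/'.
    return re.findall(r"(?<=:).+?(?=/)", text)

def extract_between_greedy(text):
    # The greedy form from the question: '.*' runs to the LAST '/'.
    return re.findall(r":(\w.*)/", text)

if __name__ == "__main__":
    s = "abcd:45.72643,4.91203/Rou:hereanotherdata/defgh"
    print(extract_between_lazy(s))
    print(extract_between_greedy(s))
```

The lazy version returns one entry per ':' ... '/' pair, while the greedy version returns a single entry spanning from the first ':' to the last '/'.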
In most languages this is hard to do with a single regexp. Ultimately you'll only ever get back the one string, and you want to get back multiple strings.
I've never used Matlab, so it may be possible in that language, but based on other languages, this is how I'd approach it...
I can't give you the exact code, but a search indicates that in Matlab there is a function called strsplit, example...
C = strsplit(data,':')
That should break your original string up into an array of strings, using the ":" as the break point. You can then ignore the first array element (as it contains the text before the first ":"), loop over the rest of the array, and use a regexp to extract everything that comes before a "/".
So for instance...
'abcd:45.72643,4.91203/Rou:hereanotherdata/defgh'
Breaks down into an array with parts...
1 - 'abcd'
2 - '45.72643,4.91203/Rou'
3 - 'hereanotherdata/defgh'
Then ignore 1, and extract everything before the "/" in 2 and 3.
As John Mawer and Adriaan mentioned, strsplit is a good place to start. You can use it with both ':' and '/' as delimiters at once, but then you cannot tell which delimiter produced each piece. If you apply strsplit twice, you can tell where each ':' started:
A='abcd:45.72643,4.91203/Rou:hereanotherdata/defgh';
B=cellfun(@(x) strsplit(x,'/'),strsplit(A,':'),'uniformoutput',0);
Now each cell of B holds one piece of the ':' split, itself split on '/'. Any piece that contained a '/' therefore holds more than one cell. You can extract the wanted text by selecting the cells of B that hold more than one cell and taking the first element of each:
C=cellfun(@(x) x{1},B(cellfun('length',B)>1),'uniformoutput',0)
C =
1×2 cell array
'45.72643,4.91203' 'hereanotherdata'
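The two-stage split described in these answers (split on ':' first, then keep what precedes the '/') can also be written out as a short Python sketch. It is an illustration of the approach, not Matlab code, and split_between is a hypothetical name.

```python
def split_between(text, start=":", end="/"):
    # Drop the piece before the first start delimiter, then keep the text
    # before the end delimiter in every remaining piece that contains one.
    pieces = text.split(start)[1:]
    return [piece.split(end)[0] for piece in pieces if end in piece]

if __name__ == "__main__":
    a = "abcd:45.72643,4.91203/Rou:hereanotherdata/defgh"
    print(split_between(a))
```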
Starting in R2016b you can use extractBetween:
>> str = 'abcd:45.72643,4.91203/Rou:hereanotherdata/defgh';
>> result = extractBetween(str,':','/')
result =
2×1 cell array
{'45.72643,4.91203'}
{'hereanotherdata' }
If all your text elements have the same number of delimiters this can be vectorized too.
program Test
implicit none
character (LEN=100) :: input
character (LEN=100) :: output
print *,"Please input your message: "
read *, input
! For every character, I encrypt it with a Caesar cipher
! (calculations)
print *,"This is the output: "
write (*,"(2a)") "Message = ", output
end program Test
This doesn't work entirely.
For every character in the input, I convert it using the modulo(iachar()) functions. It works up until the print; I stepped through it in the debugger and the encryption is fine.
But the issue with the output lies in LEN=100. The do loop runs 100 times, converting nonexistent characters into garbage and breaking the program at output with UNDEFINED TYPE.
So if I input "test", it will encrypt CBNC*GARBAGE-TO-100* and not output anything. If I define the length as 4, it works, but I want to be able to do it without defining a length. Is there any way around this?
The read statement should pad input out to the full length of the variable (100 characters) with blanks, rather than adding "garbage". The LEN_TRIM intrinsic function will give the significant length of the variable's value - i.e. the length excluding trailing blanks. You may need to remember this significant length of the input string for when you print the output string.
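As a rough illustration of the "process only the significant length" idea, here is a Python sketch that mimics a blank-padded, fixed-length field. Everything in it is an assumption made for the example: the shift of 3, the helper names, and the use of str.ljust to imitate Fortran's blank padding. It is not the asker's cipher.

```python
def len_trim(field):
    # Analogue of Fortran's LEN_TRIM: length without trailing blanks.
    return len(field.rstrip(" "))

def caesar_shift(text, shift=3):
    # Shift ASCII letters only; leave every other character unchanged.
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def encrypt_field(field, shift=3):
    # Encrypt only the significant part; keep the blank padding as-is.
    n = len_trim(field)
    return caesar_shift(field[:n], shift) + field[n:]

if __name__ == "__main__":
    padded = "test".ljust(100)  # imitates character(len=100)
    result = encrypt_field(padded)
    print(len(result), repr(result.rstrip()))
```

The output field keeps its full length of 100, but only the first four characters are transformed, so nothing past the significant length is touched.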
(Note the rules on list-directed input (indicated by the * in the read statement) can be a little surprising - a format of "(A)" may be more robust, depending on the behaviour you want.)
In terms of avoiding fixed-length strings in the context of reading input - Fortran 2003 introduces deferred-length character, which greatly helps here. Otherwise see Reading a character string of unknown length for Fortran 95 possibilities. One complication is that you are reading from the console, so the backspace statement may not work. The workaround for that follows a similar approach to that linked, but necessitates piecewise building the input string into an allocatable array of character at the same time as the input record length is being determined. Sequence association is then used to convert that array into a scalar of the right length. Comment or ask again if you want more details.
The following code reads a user input string of unspecified length. Be aware that it requires a compiler that supports deferred-length character strings: character(len = :). Deferred-length character strings were introduced in Fortran 2003.
program test
use iso_fortran_env, only : IOSTAT_EOR
implicit none
integer :: io_number
character(len = 1) :: buffer
character(len = :), allocatable :: input, output
input = ""
print *, "Please input your message."
do
read(unit = *, fmt = '(a)', advance = "no", iostat = io_number) buffer
select case (io_number)
case(0)
input = input // buffer
case(IOSTAT_EOR)
exit
case default
! End of file or a read error: stop instead of looping forever.
exit
end select
end do
allocate(character(len=(len(input))) :: output)
! Now use "input" and "output" with the ciphering subroutine/function.
end program test
Explanation
The idea is to read in a single character at a time while looking for the end-of-record (eor) condition. The eor condition is caused by the user pressing the "return" key. The "iostat" option can be used to look for eor. The value returned by "iostat" is equal to the integer constant "IOSTAT_EOR" located in the module "iso_fortran_env":
use iso_fortran_env, only : IOSTAT_EOR
We declare a deferred-length character string to grab user input of an unknown length:
character(len = :), allocatable :: input
In the "read" statement, "advance = 'no'" allows a few characters to be read in at a time. The size of "buffer" determines the number of characters to be read in (1 in our case).
read(unit = *, fmt = '(a)', advance = "no", iostat = io_number) buffer
If "iostat" returns a "0", then there were no errors and no eor. In this case the "buffer" character should be added to the "input" string. Ultimately this step allocates a "new" input that has the size of the "old" input + the buffer character. The newly allocated input contains the characters from the old input + the buffer character.
select case (io_number)
case(0)
input = input // buffer
If "iostat" returns an eor value, then exit the do loop.
case(IOSTAT_EOR)
exit
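The same read-one-character-until-end-of-record loop can be sketched in Python for readers who want to experiment with the control flow outside Fortran. This is only an analogy: a newline stands in for the end-of-record condition, an empty read stands in for end of file, and read_record is a hypothetical name.

```python
import io

def read_record(stream):
    # Read one character at a time and grow the result, stopping at the
    # end of the record (newline) or at the end of the stream.
    chars = []
    while True:
        ch = stream.read(1)
        if ch == "" or ch == "\n":
            break
        chars.append(ch)
    return "".join(chars)

if __name__ == "__main__":
    fake_console = io.StringIO("Please encrypt me\nsecond line")
    message = read_record(fake_console)
    print(len(message), repr(message))
```

The result has exactly the length of the typed record, which is what the deferred-length input variable achieves in the Fortran program.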
The standard Fortran string is fixed length, padded on the right with blanks. If your input string will never have trailing blanks the solution is easy: use the Fortran intrinsic function len_trim to find the nonblank length of the string and process only those characters. Another approach is to use a new feature, allocatable string ... this provides variable length strings. If disallowing blanks at the end of the string is acceptable, you will probably find using len_trim easier.