Issue on GMPL code - linear-programming

Issue on GMPL code - linear-programming

I was trying to solve the following problem, using the GLPSOL solver:
Fred has $5000 to invest over the next five years. At the beginning of each year he can invest money in one- or two-year time deposits. The bank pays 4% interest on one-year time deposits and 9 percent (total) on two-year time deposits. In addition, West World Limited will offer three-year certificates starting at the beginning of the second year. These certificates will return 15% (total). If Fred reinvest his money that is available every year, formulate a linear program to show him how to maximize his total cash on hand at the end of the fifth year.
I came up with the following LP model:
Being xij the amount invested in option i at year j, we look to
maximize z = 1,04x15 + 1,09x24 + 1,15x33,
subject to:
x11 + x12 <= 5000
x31 = x34 = x35 = 0
x12 + x22 + x32 <= 1,04 x11
x13 + x23 + x33 <= 1,04 x12 + 1,09 x21
x14 + x24 <= 1,04 x13 + 1,09 x22
x15 <= 1,04 x14 + 1,09 x23 + 1,15 x32
xij >= 0
And tried to write it in GMPL:
/* Variables */
var x{i in 1..3, j in 1..5} >= 0;
/* Objective */
maximize money: 1.04*x[1,5] + 1.09*x[2,4] + 1.15*x[3,3];
/* Constraints */
s.t. x[1,1] + x[2,1] <= 5000;
s.t. x[3,1] = x[3,4] = x[3,5] = 0;
s.t. x[1,2] + x[2,2] + x[3,2] <= 1.04 * x[1,1];
s.t. x[1,3] + x[2,3] + x[3,3] <= 1.04 * x[1,2] + 1.09 * x[2,1];
s.t. x[1,4] + x[2,4] <= 1.04 * x[1,3] + 1.09 * x[2,2];
s.t. x[1,5] <= 1.04 * x[1,4] + 1.09 * x[2,3] + 1.15 * x[3,2];
/* Resolve */
solve;
/* Results */
printf{j in 1..5}:"\n* %.2f %.2f %2.f \n", x[1,j], x[2,j], x[3,j];
end;
However, I'm getting the following error:
inv.mod:14: x multiply declared
Context: ...[ 1 , 5 ] + 1.09 * x [ 2 , 4 ] + 1.15 * x [ 3 , 3 ] ; s.t. x
MathProg model processing error
Does anyone have any thoughts about this?

You have to give a unique name to each constraint. Multiple assignments are not allowed.
This works on my machine:
/* Variables */
var x{i in 1..3, j in 1..5} >= 0;
/* Objective */
maximize money: 1.04*x[1,5] + 1.09*x[2,4] + 1.15*x[3,3];
/* Restrições */
s.t. c1: x[1,1] + x[2,1] <= 5000;
s.t. c2: x[3,1] = 0;
s.t. c3: x[3,4] = 0;
s.t. c4: x[3,5] = 0;
s.t. c5: x[1,2] + x[2,2] + x[3,2] <= 1.04 * x[1,1];
s.t. c6: x[1,3] + x[2,3] + x[3,3] <= 1.04 * x[1,2] + 1.09 * x[2,1];
s.t. c7: x[1,4] + x[2,4] <= 1.04 * x[1,3] + 1.09 * x[2,2];
s.t. c8: x[1,5] <= 1.04 * x[1,4] + 1.09 * x[2,3] + 1.15 * x[3,2];
/* Resolve */
solve;
/* Results */
printf{j in 1..5}:"\n* %.2f %.2f %2.f \n", x[1,j], x[2,j], x[3,j];
end;
It prints:
* 0.00 5000.00 0
* 0.00 0.00 0
* 0.00 0.00 5450
* 0.00 0.00 0
* 0.00 0.00 0
Good luck!

Related

Debugging old Fortran code for sediment dynamics

I am looking at some Fortran code from an old scanned paper. The scan quality is not great so I may have copied it wrong. I tried to run this using an online Fortran compiler but it bombs out. Not being familiar with Fortran, I was wondering if someone can point out where the syntax does not make sense? The code is from a paper on sediment dynamics:
Komar, P.D. and Miller, M.C., 1975. On the comparison between the threshold of sediment motion under waves and unidirectional currents with a discussion of the practical evaluation of the threshold: Reply. Journal of Sedimentary Research, 45(1).
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (6O,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
15 J = 1.20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM //
1 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC //
2 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl //
3 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH //
4 22X, 2HCM, 8X, 2HCM /)
C INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1.60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.0
1 + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
C CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
I0 FORMAT(IH0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END

The problem is more likely that old Fortran requires fixed form code formatting where the number of spaces before a statement is very important.
Here are some general rules
Normal statements start at column 7 and beyond
Lines cannot exceed 72 columns
Any character placed on column 6 indicates the line is a continuation from the line above. I see that on the code above in the lines following 9 FORMAT(..
A number placed between columns 1-5 indicates a label, which can be a target of a GO TO statement, a DO statement or a formatting specification.
The character C on the first column, and sometimes any character on the first column indicate the line is a comment line.
see https://people.cs.vt.edu/~asandu/Courses/MTU/CS2911/fortran_notes/node4.html for more info.
Based on the rules above, here is how to enter the code, with the correct spacing. I run the F77 code through a converter to make it compatible with F90 and F77 at the same time. The code below might compile with the online compiler now.
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (60,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
DO 15 J=1,20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM // &
& 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC // &
& 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl // &
& 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH // &
& 22X, 2HCM, 8X, 2HCM /)
! INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1,60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.&
& + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
! CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
10 FORMAT(G14.4, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END
I found several transcription errors, replacing commas with dots, zeros with the letter O, and a missing DO statement.

why does my circle anti aliasing algorithm in C++ give asymmetrical results?

Some background:
So my plan is to create a stippling algorithm in C++ and I basically just plan on storing a whole bunch of data for each radius of a circle to write onto a texture map in OpenGL I'm not sure if this is the right thing to do but I feel like it
would be quicker than the computer dynamically calculating the radius for each circle especially if lots of circles are the same size, my plan is to create a function that just writes a whole text document full of radiuses up to a certain size and this data will be stored bitwise inside an array of long's std::array <long> bit = {0x21, 0x0A ect... } so that I can encode 4X4 arrays of values with 2 bits assigned to the antialiasing value of each pixel however to create this database of ant-aliased circles I need to write a function that I keep getting wrong;
The actual question:
So this may seem lazy but I can promise I have tried everything to wrap my head around what I am getting wrong here basically i have written this code to anti=alias by dividing up the pixels into sub pixels however it seems to be returning values greater than 1 which shouldn't be possible as i have divided each pixel into 100 pixels of size 0.01
float CircleConst::PixelAA(int I, int J)
{
float aaValue = 0;
for (float i = (float) I; i < I + 1; i += 0.1f)
{
for (float j = (float) J; j < J + 1; j += 0.1f)
{
if ((pow((i - center), 2) + pow((j - center), 2) < pow(rad, 2)))
aaValue += 0.01f;
}
}
return aaValue;
}
also here is the code that writes the actual circle
CircleConst::CircleConst(float Rad)
{
rad = Rad;
dataSize = (unsigned int) ceil(2 * rad);
center = (float) dataSize/2;
arrData.reserve((int) pow(dataSize, 2));
for (int i = 0; i < dataSize; i++)
{
for (int j = 0; j < dataSize; j++)
{
if ( CircleBounds(i, j, rad-1) )
arrData.push_back(1);
else if (!CircleBounds(i, j, rad - 1) && CircleBounds(i, j, rad + 1))
{
arrData.push_back(PixelAA(i,j));
}
else
arrData.push_back(0);
}
}
}
so I noticed without the antialiasing that the way the circle is written is shifted over by one line, but this could be fixed by changing the value of the centre of the circle todataSize/2 - 0.5f but this causes problems later on when the circle is asymmetrical with the antialiasing, here is an example of radius 3.5
0.4 1.0 1.1 1.1 1.1 0.4 0.0
1.0 1.0 1.0 1.0 1.0 1.1 0.2
1.1 1.0 1.0 1.0 1.0 1.0 0.5
1.1 1.0 1.0 1.0 1.0 1.0 0.5
1.1 1.0 1.0 1.0 1.0 1.0 0.2
0.4 1.1 1.0 1.0 1.0 0.5 0.0
0.0 0.2 0.5 0.5 0.2 0.0 0.0
as you can see some of the values are over 1.0 which should not be possible, I'm sure there is an obvious answer to why this is but I'm completely missing it.

The problem lies with lines such as this one:
for (float i = (float) I; i < I + 1; i += 0.1f)
Floating point numbers cannot be stored or manipulated with infinite precision. By repeatedly adding one floating point number to another, the inaccuracies accumulate. This is why you're seeing values higher than 1.0.
The solution is to iterate using an integer type and compute the desired floating point numbers. For example:
for (unsigned i = 0U; i < 10U; ++i)
{
float x = 0.1F * static_cast<float>(i);
printf("%f\n", x);
}

In addition to what #Yun (the round-off error of floating point numbers) indicates, you must also pay attention to the sampling point (which must be at the pixel center).
Here your code, with some modification and addition:
#include <iostream>
#include <vector>
#include <iomanip>
#include <math.h>
float rad, radSquared, center;
const int filterSize = 8;
const float invFilterSize = 1.0f / filterSize;
// Sample the circle returning 1 when inside, 0 otherwise.
int SampleCircle(int i, int j) {
float di = (i + 0.5f) * invFilterSize - center;
float dj = (j + 0.5f) * invFilterSize - center;
return ((di * di + dj * dj) < radSquared) ? 1 : 0;
}
// NOTE: This sampling method works with any filter size.
float PixelAA(int I, int J)
{
int aaValue = 0;
for (int i = 0; i < filterSize; ++i)
for (int j = 0; j < filterSize; ++j)
aaValue += SampleCircle(I + i, J + j);
return (float)aaValue / (float)(filterSize * filterSize);
}
// NOTE: This sampling method works only with filter sizes that are power of two.
float PixelAAQuadTree(int i, int j, int filterSize)
{
if (filterSize == 1)
return (float)SampleCircle(i, j);
// We sample the four corners of the filter. Note that on left and bottom corners
// 1 is subtracted to avoid sampling overlap.
int topLeft = SampleCircle(i, j);
int topRight = SampleCircle(i + filterSize - 1, j);
int bottomLeft = SampleCircle(i, j + filterSize - 1);
int bottomRight = SampleCircle(i + filterSize - 1, j + filterSize - 1);
// If all samples have same value we can stop here. All samples lies outside or inside the circle.
if (topLeft == topRight && topLeft == bottomLeft && topLeft == bottomRight)
return (float)topLeft;
// Half the filter dimension.
filterSize /= 2;
// Recurse.
return (PixelAAQuadTree(i, j, filterSize) +
PixelAAQuadTree(i + filterSize, j, filterSize) +
PixelAAQuadTree(i, j + filterSize, filterSize) +
PixelAAQuadTree(i + filterSize, j + filterSize, filterSize)) / 4.0f;
}
void CircleConst(float Rad, bool useQuadTree)
{
rad = Rad;
radSquared = rad * rad;
center = Rad;
int dataSize = (int)ceil(rad * 2);
std::vector<float> arrData;
arrData.reserve(dataSize * dataSize);
if (useQuadTree)
{
for (int i = 0; i < dataSize; i++)
for (int j = 0; j < dataSize; j++)
arrData.push_back(PixelAAQuadTree(i * filterSize, j * filterSize, filterSize));
}
else
{
for (int i = 0; i < dataSize; i++)
for (int j = 0; j < dataSize; j++)
arrData.push_back(PixelAA(i * filterSize, j * filterSize));
}
for (int i = 0; i < dataSize; i++)
{
for (int j = 0; j < dataSize; j++)
std::cout << std::fixed << std::setw(2) << std::setprecision(2)
<< std::setfill('0') << arrData[i + j * dataSize] << " ";
std::cout << std::endl;
}
}
int main() {
CircleConst(3.5f, false);
std::cout << std::endl;
CircleConst(4.0f, false);
std::cout << std::endl;
std::cout << std::endl;
CircleConst(3.5f, true);
std::cout << std::endl;
CircleConst(4.0f, true);
return 0;
}
Which gives these results (the second ones with use of quad-tree to optimize number of samples required to compute the AA value):
0.00 0.36 0.84 1.00 0.84 0.36 0.00
0.36 1.00 1.00 1.00 1.00 1.00 0.36
0.84 1.00 1.00 1.00 1.00 1.00 0.84
1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.84 1.00 1.00 1.00 1.00 1.00 0.84
0.36 1.00 1.00 1.00 1.00 1.00 0.36
0.00 0.36 0.84 1.00 0.84 0.36 0.00
0.00 0.16 0.70 0.97 0.97 0.70 0.16 0.00
0.16 0.95 1.00 1.00 1.00 1.00 0.95 0.16
0.70 1.00 1.00 1.00 1.00 1.00 1.00 0.70
0.97 1.00 1.00 1.00 1.00 1.00 1.00 0.97
0.97 1.00 1.00 1.00 1.00 1.00 1.00 0.97
0.70 1.00 1.00 1.00 1.00 1.00 1.00 0.70
0.16 0.95 1.00 1.00 1.00 1.00 0.95 0.16
0.00 0.16 0.70 0.97 0.97 0.70 0.16 0.00
0.00 0.36 0.84 1.00 0.84 0.36 0.00
0.36 1.00 1.00 1.00 1.00 1.00 0.36
0.84 1.00 1.00 1.00 1.00 1.00 0.84
1.00 1.00 1.00 1.00 1.00 1.00 1.00
0.84 1.00 1.00 1.00 1.00 1.00 0.84
0.36 1.00 1.00 1.00 1.00 1.00 0.36
0.00 0.36 0.84 1.00 0.84 0.36 0.00
0.00 0.16 0.70 0.97 0.97 0.70 0.16 0.00
0.16 0.95 1.00 1.00 1.00 1.00 0.95 0.16
0.70 1.00 1.00 1.00 1.00 1.00 1.00 0.70
0.97 1.00 1.00 1.00 1.00 1.00 1.00 0.97
0.97 1.00 1.00 1.00 1.00 1.00 1.00 0.97
0.70 1.00 1.00 1.00 1.00 1.00 1.00 0.70
0.16 0.95 1.00 1.00 1.00 1.00 0.95 0.16
0.00 0.16 0.70 0.97 0.97 0.70 0.16 0.00
As further notes:
you can see how quad-trees work here https://en.wikipedia.org/wiki/Quadtree
you can further modify the code and implement fixed-point math (https://en.wikipedia.org/wiki/Fixed-point_arithmetic) which has no round-off problems like floats because numbers are always represented as integers
given that these data are part of a pre-calculation phase, privilege the simplicity of the code over performance

LPSolve IDE cannot find solution

I have following problem that I try to solve with LPSolve IDE:
min: x1;
r_1: 1.08 - k <= x1;
r_2: -1.08 + k <= x1;
c_1: y1 + y2 + y3 = k;
c_2: 2.29 a1 y1 + 2.28 a2 y1 + 2.27 a3 y1 = 1;
c_3: 1.88 b1 y2 + 1.89 b2 y2 + 1.9 b3 y2 = 1;
c_4: 8.98 c1 y3 + 8.99 c2 y3 + 9.0 c3 y3 = 1;
c_14: a1+a2+a3=1;
c_15: b1+b2+b3=1;
c_16: c1+c2+c3=1;
bin a1,a2,a3,b1,b2,b3,c1,c2,c3;
Not sure why I get output from LPSolve as INFEASIBLE when I can use following param values to solve this:
a1=0, a2=1, a3=0
b1=0, b2=1, b3=0
c1=0, c2=1, c3=0
0 + 2.28 0.438596491 + 0 = 1
0 + 1.89 0.529100529 + 0 = 1
0 + 8.99 0.111234705 + 0 = 1
0.438596491 + 0.529100529 + 0.111234705 = 1.0789 (this is k)
1.08 - 1.0789 == 0.0011 <= x1
-1.08 + 1.0789 == -0.0011 <= x1
x1 = 0.0011
Am I formulating the problem in a wrong way, or doing something else wrong? If I relax that =1 constraint to >=1 there are some results, but I need it to be 1 (as it is in my solution).

Lpsolve is for linear models only. You have products of variables in the model such as 2.29 a1 y1. Lpsolve can not solve such quadratic models.
Too bad you don't get a good error message. I guess they never expected this input.
It is noted that products of binary and continuous variables can be linearized resulting in so-called big-M constraints (see link).
This is really a duplicate of lpsolve - unfeasible solution, but I have example of 1. Embarrassingly, this was an earlier question from the same poster!

Dynamic list in netlogo

I was unsure how to title this question, if somebody knows a more specific title, I'm happy to change it
I'm trying to develop a model in netlogo for my thesis where turtles buy water from 18 different wells. Each of the turtles has its distance to the individual wells stored in the breeds-own variable(s) adist_w_1 ,adist_w_2, ... etc.
What I want to do is model the consumption decisions of commercial establishments by first calculating the price of water (constant + x*adist_w_y), then figure out what the demand is at this price and subtract the demands at cheaper wells from this specific well.
price = 0.75 +0.15 * distance;
individual_demand_at_well_x = f (price_at_well) - demands_at_cheaper_wells
What I've done so far to this regard looks like this:
to calc-price-at-well
set earlier-demands 0
ask commercials [
let price_w_1 ((adist_w_1 / 1000) * 0.15 + 0.75 )
let price_w_2 ((adist_w_2 / 1000) * 0.15 + 0.75 )
let price_w_3 ((adist_w_3 / 1000) * 0.15 + 0.75 )
let price_w_4 ((adist_w_4 / 1000) * 0.15 + 0.75 )
let price_w_5 ((adist_w_5 / 1000) * 0.15 + 0.75 )
let price_w_6 ((adist_w_6 / 1000) * 0.15 + 0.75 )
let price_w_7 ((adist_w_7 / 1000) * 0.15 + 0.75 )
let price_w_8 ((adist_w_8 / 1000) * 0.15 + 0.75 )
let price_w_9 ((adist_w_9 / 1000) * 0.15 + 0.75 )
let price_w_10 ((adist_w_10 / 1000) * 0.15 + 0.75 )
let price_w_11 ((adist_w_11 / 1000) * 0.15 + 0.75 )
let price_w_12 ((adist_w_12 / 1000) * 0.15 + 0.75 )
let price_w_13 ((adist_w_13 / 1000) * 0.15 + 0.75 )
let price_w_14 ((adist_w_14 / 1000) * 0.15 + 0.75 )
let price_w_15 ((adist_w_15 / 1000) * 0.15 + 0.75 )
let price_w_16 ((adist_w_16 / 1000) * 0.15 + 0.75 )
let price_w_17 ((adist_w_17 / 1000) * 0.15 + 0.75 )
let price_w_18 ((adist_w_18 / 1000) * 0.15 + 0.75 )
let demand_w_1 5 * price_w_1 ;DUMMY! include demand function of price
let demand_w_2 5 * price_w_2
let demand_w_3 5 * price_w_3
let demand_w_4 5 * price_w_4
let demand_w_5 5 * price_w_5
let demand_w_6 5 * price_w_6
let demand_w_7 5 * price_w_7
let demand_w_8 5 * price_w_8
let demand_w_9 5 * price_w_9
let demand_w_10 5 * price_w_10
let demand_w_11 5 * price_w_11
let demand_w_12 5 * price_w_12
let demand_w_13 5 * price_w_13
let demand_w_14 5 * price_w_14
let demand_w_15 5 * price_w_15
let demand_w_16 5 * price_w_16
let demand_w_17 5 * price_w_17
let demand_w_18 5 * price_w_18
let demand-list (list demand_w_1 demand_w_2 demand_w_3 demand_w_4 demand_w_5 demand_w_6 demand_w_7 demand_w_8 demand_w_9 demand_w_10 demand_w_11 demand_w_12 demand_w_13 demand_w_14 demand_w_15 demand_w_16 demand_w_17 demand_w_18)
foreach sort demand-list
[
let cheapest-well item 0 demand-list
set final_t_dem (cheapest-well - earlier-demands)
set earlier-demands (earlier-demands + cheapest-well)
set demand-list remove cheapest-well demand-list
]
]
*whereas earlier-demands is a global, final_t_dem a commercials-own
This hopefully yields the cheapest well and reiterates until all wells are "treated". But now I'm clueless as to how I can extract the information which well gets which demand from individual turtles and, this is the goal, to sum them up.
In the end, I'd like to be able to say that commercial xy has an individual demand of x for well 1 and all commercials together have a demand of y.
I'd be glad to get advice and since I'm new to netlogo also an idea if this is a smart way to go with regard to model efficiency. Many thanks!

Need to create data frame from patterns in a string

I have the following string I need to extract the patterns into a single column data frame named SIZE
str <- "N · 0.1 [mm]: N · 0.1 + 0.02 [mm]: N · 0.1 + 0.05 [mm] N · 0.1 + 0.08 [mm] M · 1 [mm]: M · 1 + 0.5 [mm] M · 1 + 0.75 [mm]"
The patterns are either followed by : or whitespace and always ends in [mm]
The regex I am using to match my patterns is and it works, but i'm not sure how to extract the matches to create a column as a data frame.
\S\W+\d\.?\d?\s\+?\s?\d?\.?\d?\d?\s?\[mm\]
Output expected: 1 column named SIZE
N · 0.1 [mm]
N · 0.1 + 0.02 [mm]
N · 0.1 + 0.05 [mm]
N · 0.1 + 0.08 [mm]
M · 1 [mm]
M · 1 + 0.5 [mm]
M · 1 + 0.75 [mm]
Any help appreciated. Thanks..

Perhaps, strsplit would make things easier here..
str <- "N · 0.1 [mm]: N · 0.1 + 0.02 [mm]: N · 0.1 + 0.05 [mm] N · 0.1 + 0.08 [mm] M · 1 [mm]: M · 1 + 0.5 [mm] M · 1 + 0.75 [mm]"
vals <- strsplit(str, '(?<=\\])[\\s:]*', perl = T)
data.frame(SIZE = unlist(vals))
Output
SIZE
1 N · 0.1 [mm]
2 N · 0.1 + 0.02 [mm]
3 N · 0.1 + 0.05 [mm]
4 N · 0.1 + 0.08 [mm]
5 M · 1 [mm]
6 M · 1 + 0.5 [mm]
7 M · 1 + 0.75 [mm]

Here's one approach to get the data in: replace any instances of "[mm] " with "[mm]: " and scan the text in with ":" as your separator. No fussing with regexes....
scan(what = "", text = gsub("[mm] ", "[mm]: ", str, fixed=TRUE),
sep = ":", strip.white=TRUE)
# Read 7 items
# [1] "N · 0.1 [mm]" "N · 0.1 + 0.02 [mm]" "N · 0.1 + 0.05 [mm]"
# [4] "N · 0.1 + 0.08 [mm]" "M · 1 [mm]" "M · 1 + 0.5 [mm]"
# [7] "M · 1 + 0.75 [mm]"
Just assign the result there to a column in a data.frame or create a data.frame with the output. Or, all in one:
data.frame(
SIZE = scan(text = gsub("[mm] ", "[mm]: ", str, fixed=TRUE),
sep = ":", strip.white=TRUE, what = ""))

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Issue on GMPL code - linear-programming

Related

Debugging old Fortran code for sediment dynamics

why does my circle anti aliasing algorithm in C++ give asymmetrical results?

LPSolve IDE cannot find solution

Dynamic list in netlogo

Need to create data frame from patterns in a string

Categories

Resources