Are these the same? if v1 & v2==1 and if v1==1 & v2==1 - if-statement

In my mind this should be the same thing but it is not! These two blocks of code don't come to the same result and I don't know why.
The only difference is how I phrased the if condition, so it boils down to whether
command var if v1 & v2 ==1
and
command var if v1==1 & v2==1
are the same.
Below is my original code. Everything is copy pasted except for the if condition.
forvalues i = 35/99 {
if inlist(`i', 45, 50, 61, 68, 72, 88) continue
recode A`i'_a A`i'_b A`i'_c (.=0) if A`i'_a & A`i'_b & A`i'_c
}
is not the same as
forvalues i = 35/99 {
if inlist(`i', 45, 50, 61, 68, 72, 88) continue
recode A`i'_a A`i'_b A`i'_c (.=0) if A`i'_a==. & A`i'_b==. & A`i'_c==.
}
after I run
forvalues i = 1/99 {
if inlist(`i', 45, 50, 56, 57, 58, 61, 68 , 72) continue
drop if A`i'_a==. | A`i'_b==. | A`i'_c==.
}
I count the observations, and they are not the same.

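if v1 is true if v1 is not zero. That includes missing values, because Stata treats missing as a nonzero value.
if v1 == 1 is true if and only if v1 is 1.
Note also that if v1 & v2 == 1 is not understood as a contraction of if v1 == 1 & v2 == 1. It is evaluated as if v1 & (v2 == 1), that is, v1 is nonzero and v2 equals 1.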
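The difference between the two conditions can be illustrated outside Stata. The sketch below is a Python analogy, not Stata code. It assumes a missing value can be modelled as a very large number (the hypothetical MISSING constant), which is nonzero and therefore counts as true; the function names are made up for the illustration.

```python
# Python analogy of the two Stata conditions; MISSING is a hypothetical
# stand-in for Stata's "." (modelled here as a very large, nonzero number).
MISSING = float("inf")


def is_true(value):
    """Stata-style truth test: anything that is not zero counts as true."""
    return value != 0


def cond_truthy(v1, v2):
    """Models `if v1 & v2 == 1`: v1 is nonzero and v2 equals 1."""
    return is_true(v1) and v2 == 1


def cond_equal(v1, v2):
    """Models `if v1 == 1 & v2 == 1`: both values equal 1."""
    return v1 == 1 and v2 == 1


# Compare the two conditions for several values of v1, with v2 fixed at 1.
for v1 in (0, 1, 2, MISSING):
    print(v1, cond_truthy(v1, 1), cond_equal(v1, 1))
```

The two conditions agree when v1 is 0 or 1 and disagree when v1 is 2 or the missing stand-in, which is why the two recode loops touch different observations.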
Note also this technique:
. numlist "36/93"
. local possible `r(numlist)'
. local exclude 45 50 61 68 72 88
. local wanted : list possible - exclude
. di "`wanted'"
36 37 38 39 40 41 42 43 44 46 47 48 49 51 52 53 54 55 56 57 58 59 60 62 63 64 65 66 67 69 70 71 73 74
75 76 77 78 79 80 81 82 83 84 85 86 87 89 90 91 92 93
Then the technique would be to loop, say
foreach w of local wanted {
}

Related

call and print function which return array in c++

I have this functions which return a random array in c++:
int* randomArray(int countOfRows){
int test1 [countOfRows] = {};
int insertValue;
int check;
for (int n=0; n < countOfRows; ++n){
srand(time (NULL) );
while (test1[n] == NULL){
insertValue = (rand () %100 + 1 );
for(int i = 0; i < countOfRows; i++){
if (test1[i] == insertValue){
check = 1;
break;
}
else{
check = 0;
}
}
if (check == 0){
test1[n] = insertValue;
}
}
}
return test1;
}
How can I call that array?
what is the difference between int* and int[]
thank you :)
Your code has four significant problems: one critical, one non-standard and implementation-dependent, and two general algorithmic ones.
First, and most important, you're returning the address of an automatic variable, so the returned pointer is useless and dereferencing it in the caller invokes undefined behavior. Declared at the top of your function is:
int test1 [countOfRows] = {};
which itself brings up the second point: this is non-standard for two reasons. Variable-length arrays are not supported by the C++ standard and, by inference, neither is initializing one. Then later...
return test1;
The caller of your function will receive an address, but that address is useless. It no longer addresses anything concrete, as test1 no longer exists once the function returns. This can be remedied in a number of ways, and considering this is C++, the easiest is to use a std::vector<int>, which supports return by value.
The two significant algorithm problems are:
Your seeding with srand should not be in the for loop. In fact, if you're using srand and rand, the seeding should be done once in your entire process.
The exhaustive search to see whether the current random pick has already been used, done to avoid duplicates, is needless if you simply use a different algorithm, which I'll cover later.
Therefore, the simplest fix for your code will be to do this:
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>
std::vector<int> randomArray(int countOfRows)
{
    // value-initialized to zero; zero marks a slot that is not filled yet
    std::vector<int> test1(countOfRows);
    int check = 0;
    for (int n=0; n < countOfRows; ++n)
    {
        while (test1[n] == 0)
        {
            int insertValue = (rand () %100 + 1 );
            // scan for a duplicate of the candidate value
            for(int i = 0; i < countOfRows; i++)
            {
                if (test1[i] == insertValue){
                    check = 1;
                    break;
                }
                else{
                    check = 0;
                }
            }
            if (check == 0){
                test1[n] = insertValue;
            }
        }
    }
    return test1;
}
int main()
{
    // seed once for the whole process
    std::srand(static_cast<unsigned>(std::time(NULL)));
    std::vector<int> vec = randomArray(20);
    for (auto x : vec)
        std::cout << x << ' ';
    std::cout.put('\n');
}
Output (varies, obviously)
8 50 74 59 31 73 45 79 24 10 41 66 93 43 88 4 28 30 13 70
A Finite Set Algorithm
What you're really trying to generate here is a finite set of integers in the range of 1..100. I.e., there are no duplicate values used, and the number of items being returned could be anything from 1..100 as well. To do this, consider this algorithm:
1. Generate a sequence of 1..100 in a std::vector<int>
2. Using a pseudorandom generator from the standard library, shuffle the sequence using std::shuffle
3. Resize the resulting vector to be the number of elements you want to return.
Regarding #3 above, consider a small example. Suppose you wanted just ten elements. Initially you build a sequence vector that looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13... ...99 100
Now you shuffle this vector using std::shuffle and a pseudorandom generator like std::mt19937 (the first twenty elements shown for brevity):
48 39 31 44 68 84 98 40 57 76 70 16 30 93 9 51 63 65 45 81...
Now you simply resize the vector down to the size you want, in this case ten elements:
48 39 31 44 68 84 98 40 57 76
And that is your result. If this sounds complicated, you may be surprised to see how little code it actually takes:
Code
#include <iostream>
#include <algorithm>
#include <vector>
#include <numeric>
#include <random>
std::vector<int> randomSequence100(std::size_t count)
{
    // clamp the request to the size of the domain
    if (count > 100)
        count = 100;
    static std::random_device rd;
    std::vector<int> result(100);
    std::iota(result.begin(), result.end(), 1);   // fill with 1..100
    std::shuffle(result.begin(), result.end(), std::mt19937(rd()));
    result.resize(count);                         // keep the first count elements
    return result;
}
int main()
{
    // run twenty tests of random shuffles.
    for (int i=0; i<20; ++i)
    {
        auto res = randomSequence100(20);
        for (auto x : res)
            std::cout << x << ' ';
        std::cout.put('\n');
    }
}
Output
27 71 58 6 74 65 56 37 53 44 25 91 10 86 51 75 31 79 18 46
6 61 92 74 30 20 91 89 64 55 19 12 28 13 5 80 62 71 29 43
92 42 2 1 78 89 65 39 37 64 96 20 62 33 6 12 85 34 29 19
46 63 8 44 42 80 70 2 68 56 86 84 45 85 91 33 20 83 16 93
100 99 4 20 47 32 58 57 11 35 39 43 87 55 77 51 80 7 46 83
48 39 31 44 68 84 98 40 57 76 70 16 30 93 9 51 63 65 45 81
32 73 97 83 56 49 39 29 3 59 45 89 43 78 61 5 57 51 82 8
21 46 25 29 48 37 77 74 32 56 87 91 94 86 57 67 33 9 23 36
27 46 66 40 1 72 41 64 53 26 31 77 42 38 81 47 58 73 4 11
79 77 46 48 70 82 62 87 8 97 51 99 53 43 47 91 98 81 64 26
27 55 28 12 49 5 70 94 77 29 84 23 52 3 25 56 18 45 74 48
95 33 25 80 81 53 55 11 70 2 38 77 65 13 27 48 40 57 87 93
70 95 66 84 15 87 94 43 73 1 13 89 44 96 10 58 39 2 23 72
43 53 93 7 95 6 19 89 37 71 26 4 17 39 30 79 54 44 60 98
63 26 92 64 83 84 30 19 12 71 95 4 81 18 42 38 87 45 62 70
78 80 95 64 71 17 14 57 54 37 51 26 12 16 56 6 98 45 92 85
89 73 2 15 43 65 21 55 14 27 67 31 54 52 25 72 41 6 85 33
4 87 19 95 78 97 27 13 15 49 3 17 47 10 84 48 37 2 94 81
15 98 77 64 99 68 34 79 95 48 49 4 59 32 17 24 36 53 75 56
78 46 20 30 29 35 87 53 84 61 65 85 54 94 68 75 43 91 95 52
Each row above is a set of twenty elements taken from the sequence 1..100. No single row has duplicates (check if you want).
Caveat
This technique works wonderfully for either small domains or large result sets from larger domains, but it has limits to consider.
For example: once your potential domain reaches a significant size (say, numbers in 1...1000000) and you want only small result sets (say, no larger than 100 elements), you're better off using a std::unordered_set and iterative probing similar to what you're doing now. The technique you use depends entirely on your performance goals and your usage patterns.
Counterexample: If you wanted a half-million unique elements shuffled from a million-element domain, the load/shuffle/resize technique will work well.
Ultimately you have to decide, and measure to confirm.
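The set-and-probe alternative mentioned in the caveat can be sketched briefly. The version below is in Python rather than C++, purely as an illustration; the function name sample_by_probing and its parameters are hypothetical, and a Python set plays the role that std::unordered_set would play in C++.

```python
import random


def sample_by_probing(count, low, high, rng=random):
    """Draw `count` distinct integers from low..high by probing a set."""
    size = high - low + 1
    if count > size:
        raise ValueError("cannot draw more unique values than the domain holds")
    seen = set()      # values already picked (fast membership test)
    picked = []       # picks in the order they were drawn
    while len(picked) < count:
        candidate = rng.randint(low, high)  # inclusive on both ends
        if candidate not in seen:
            seen.add(candidate)
            picked.append(candidate)
    return picked


# A small result set from a large domain, the case the caveat describes.
values = sample_by_probing(100, 1, 1_000_000)
print(len(values), len(set(values)))
```

Memory use here grows with the number of picks rather than with the size of the domain, which is the reason to prefer it when the domain is large and the result set is small. As the result set approaches the size of the domain, collisions become frequent and the shuffle approach is the better fit.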
Some useful links about some of the things used here (bookmark this site, as it is absolute gold for information about C++):
std::vector
std::iota
std::random_device
std::mt19937
std::shuffle
From my point of view, this function has a problem.
It returns a pointer to test1, which is allocated on the stack and is no longer valid outside the scope of randomArray.
If you switch to malloc, the array is allocated on the heap and remains valid after randomArray returns.
int *test1 = (int*) malloc(countOfRows* sizeof(int));
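You can then use test1[x] to read each int; the caller has to know that the length of test1 is countOfRows. Note that malloc does not zero the memory, so clear the array (or use calloc) before the loop that tests test1[n] against zero.
Don't forget to release the pointer with free (not delete) when it is no longer used.
Calling the function is simple: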
int* values = randomArray(1000);
printf("%d\r\n",values[0]);
In the function randomArray(), declare test1[] as a static int[]. This requires a fixed, compile-time array size rather than countOfRows.
Return the array as a pointer:
return test1;
and in the main function use a pointer to access the return value:
int *ptr = randomArray(n);

Using linregress to calculate slope and intercept

I have a text file with two columns (x,y) of data. I want to calculate the slope and intercept of the values in this file using linregress, as I think this would be easiest.
This list, which is read in, looks like this
95 68
94 67.99028594
93 67.98057049
92 67.97085365
91 67.96113542
90 67.9514158
89 67.94169479
88 67.93197239
87 67.9222486
86 67.91252341
85 67.90279683
84 67.89306886
83 67.8833395
82 67.87360874
81 67.86387658
80 67.85414303
79 67.84440809
78 67.83467174
77 67.824934
76 67.81519486
75 67.80545432
74 67.79571238
73 67.78596905
72 67.77622431
71 67.76647817
70 67.75673062
69 67.74698168
68 67.73723133
67 67.72747958
66 67.71772642
65 67.70797186
64 67.6982159
63 67.68845852
62 67.67869974
61 67.66893956
60 67.65917796
59 67.64941496
58 67.63965055
57 67.62988473
56 67.62011749
55 67.61034885
54 67.6005788
53 67.59080733
52 67.58103445
51 67.57126015
50 67.56148445
49 67.55170732
48 67.54192879
47 67.53214883
46 67.52236746
45 67.51258468
44 67.50280047
43 67.49301485
42 67.4832278
41 67.47343934
40 67.46364946
39 67.45385816
38 67.44406543
37 67.43427129
36 67.42447572
35 67.41467872
34 67.40488031
33 67.39508047
32 67.3852792
31 67.37547651
30 67.36567239
29 67.35586684
28 67.34605987
27 67.33625147
26 67.32644164
25 67.31663038
24 67.30681769
23 67.29700357
22 67.28718801
21 67.27737103
20 67.26755261
19 67.25773276
18 67.24791148
17 67.23808876
16 67.2282646
15 67.21843901
14 67.20861199
13 67.19878352
12 67.18895362
11 67.17912228
10 67.1692895
9 67.15945528
8 67.14961963
7 67.13978253
6 67.12994399
5 67.120104
I am new to python, and have written some trivial code that I had hoped would do this:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.stats import linregress
fileNameIN_pres = '/Somepathhere/prescription.txt'
with open(fileNameIN_pres) as f:
    for l in f:
        x, y = l.split()
        X1 = np.array(x)
        X2 = np.array(y)
        print X1,X2
print linregress(X1,X2)
When I print X1 and X2, I get the columns I want, so the code does what I want up to that point. When I ask it to print linregress I get an error:
File "/usr/local/lib/python2.7/site-packages/scipy/stats/_stats_mstats_common.py", line 87, in linregress
n = len(x)
TypeError: len() of unsized object
[Finished in 0.5s with exit code 1]
Any suggestions how to fix this would be greatly appreciated.
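Though I don't know the file exactly, it would be something like this. Collect every row into a list, and convert the values to float before passing them to linregress:
X = []
Y = []
with open(fileNameIN_pres) as f:
    for l in f:
        x, y = l.split()
        # split() returns strings, so convert them to numbers
        X.append(float(x))
        Y.append(float(y))
X = np.asarray(X)
Y = np.asarray(Y)
model = linregress(X, Y)
slope, intercept = model.slope, model.intercept
You can predict Y for a new_x with
predict = slope*new_x + intercept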
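For reference, the slope and intercept in question are the ordinary least-squares estimates. The sketch below computes them in plain Python, without SciPy, from the first few rows of the posted data. The helper names parse_columns and least_squares are hypothetical, and the code is meant as an illustration of the arithmetic, not as a replacement for linregress.

```python
def parse_columns(lines):
    """Turn whitespace-separated 'x y' lines into two lists of floats."""
    xs, ys = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        x, y = line.split()
        xs.append(float(x))
        ys.append(float(y))
    return xs, ys


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# First rows of the data shown in the question.
sample = ["95 68", "94 67.99028594", "93 67.98057049", "92 67.97085365"]
xs, ys = parse_columns(sample)
slope, intercept = least_squares(xs, ys)
print(slope, intercept)
```

The key point for the original error is the same as above: the regression needs two sequences of numbers with one entry per row, not the last pair of strings read from the file.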

looking for the most accurate to sort double numbers [closed]

I need to sort small floating-point numbers.
When I use std::sort() from the <algorithm> library,
I found that it is inaccurate for very small numbers.
How can I sort this array in the most accurate way?
Edit: my friend suggested these lines of code, which I don't understand and which don't seem to work properly for the second items in the pairs.
bool is_smaller(pair<double,int> a, pair <double , int> b)
{
return (b.first - a.first) > 1e9;
}
sort(a.begin(), a.end(), is_smaller);
#include <bits/stdc++.h>
using namespace std;
int main()
{
string s;
cin >> s;
vector <pair<double,int> > a;
double x = 0, y = 1, k, d;
for(int i = 0;i < s.size();i++)
{
k = (x + y)/2;
d = abs(k - y);
//printf("[%.3lf %0.3lf] %.3lf %.3lf \n",x, y, k, d);
a.push_back({k,i+1});
if(s[i] == 'l')
y = k, x = k - d;
else
y = k + d, x = k;
}
sort(a.begin(), a.end());
for (int i =0;i < a.size();i++)
printf("%d\n",a[i].second);
return 0;
}
input : rrlllrrrlrrlrrrlllrlrlrrrlllrllrrllrllrrlrlrrllllrlrrrrlrlllrlrrrlrlrllrlrlrrlrrllrrrlrlrlllrrllllrl
code's output :
1
2
6
7
8
10
11
13
14
15
19
21
23
24
25
29
32
33
36
39
40
42
44
45
50
52
53
51
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
49
48
47
46
43
41
38
37
35
34
31
30
28
27
26
22
20
18
17
16
12
9
5
4
3
expected output :
1
2
6
7
8
10
11
13
14
15
19
21
23
24
25
29
32
33
36
39
40
42
44
45
50
52
53
54
55
57
61
63
64
65
67
69
72
74
76
77
79
80
83
84
85
87
89
93
94
99
100
98
97
96
95
92
91
90
88
86
82
81
78
75
73
71
70
68
66
62
60
59
58
56
51
49
48
47
46
43
41
38
37
35
34
31
30
28
27
26
22
20
18
17
16
12
9
5
4
3
comment :wrong answer 28th numbers differ - expected: '54', found: '51'
Floating-point arithmetic has limited precision. The precision of a double is high, but it is still limited.
Your algorithm generates a sequence of numbers K(i), with positions numbered from 1, where
|K(i+1) - K(i)| = 2^-(i+1).
The differences form a geometric sequence, so they decrease exponentially. Therefore, at some value of i, the difference becomes so small that it can no longer be represented in the floating-point format.
I ran your code with exactly the same input, but I also printed the numbers beside the indices, and I did not apply the sorting. I printed the numbers to 50 decimal digits (%.50f, just to see!). What did I observe?
The numbers for positions i > 53 are all equal (within the precision that a double can achieve). Because the doubles are equal, std::sort orders those pairs by their second member, the index, rather than by the intended value.
If you print the floats with enough precision:
printf("%03d %.18f\n",a[i].second,a[i].first);
then you'll see that the computations lead to the same floating point value for the rank 51 to 100...
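The collapse can be reproduced outside C++. The sketch below is a Python port of the midpoint loop from the question (Python floats are also IEEE doubles), run once with floats and once with exact rationals from the standard fractions module. Using exact rationals is an assumption on my part about one possible remedy, not something proposed in the thread, and the helper names are hypothetical.

```python
from fractions import Fraction

# Input string copied from the question.
MOVES = ("rrlllrrrlrrlrrrlllrlrlrrrlllrllrrllrllrrlrlrrllllrlrrrrlrlllrlrrrl"
         "rlrllrlrlrrlrrllrrrlrlrlllrrllllrl")


def midpoints(moves, low, high):
    """Port of the question's loop: record the midpoint, then narrow the interval."""
    x, y = low, high
    keys = []
    for move in moves:
        k = (x + y) / 2
        d = abs(k - y)
        keys.append(k)
        if move == "l":
            x, y = k - d, k
        else:
            x, y = k, k + d
    return keys


def sorted_indices(keys):
    """1-based indices ordered by key, ties broken by index (like sorting pairs)."""
    return [idx for _, idx in sorted(zip(keys, range(1, len(keys) + 1)))]


float_keys = midpoints(MOVES, 0.0, 1.0)
exact_keys = midpoints(MOVES, Fraction(0), Fraction(1))

# Distinct keys with doubles versus with exact arithmetic.
print(len(set(float_keys)), len(set(exact_keys)))
exact_order = sorted_indices(exact_keys)
print(exact_order[:3], exact_order[-3:])
```

With exact arithmetic every midpoint is distinct, so the sort order is fully determined by the values. With doubles, several midpoints share the same value and the order among them is decided by the index instead.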

How do I extract data as a data frame from a text file in R? The data has names in it and the middle names are messing with my method

I have a text file where strings are separated by whitespace. I can easily extract these into R as a data frame, by first using the scan command and then noting that each record consists of 15 strings.
So data[1:15] is one row, data[16:30] is the next row and so on. In each of these records, the name is composed of two strings, say FOO and BAR. But some records have names such as FOO BOR BAR or even FOO BOR BOO BAR. This obviously breaks my 15-string assumption. How can I easily extract the data into a data frame?
So my data is in my working directory called results.txt.
I use this to scan my data:
mech <- scan("results.txt", "")
Then I can make the data frames like this:
d1 <- t(data.frame(mech[1:15]))
d2 <- t(data.frame(mech[16:30]))
d3 <- t(data.frame(mech[31:45]))
My plan was to iterate this in a for loop and rbind the data into one consolidated data frame.
d1 results in something like
1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
d2 results in
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
Here, FOO and BAR are first and last names, respectively. Most records are like this. But d3:
3 FOO3 BOR BAR3 2K12/ME/03 72 83 61 75 44 88 75 165 91 30
Because of the extra middle name, I lose the final string of the text, the part right after 30. This then spills over to the next record. So row 46:60, instead of starting with 4, begins with the omitted data from the previous record.
How can I extract the data by treating the names as a single string?
EDIT: Stupid of me for not providing the data frame itself. Here is a sample.
1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93
s1 <- "1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93"
s2 <- readLines(textConnection(s1)) #read from your file here
s2 <- strsplit(s2, "\\s+") #splits by white space
s3 <- lapply(s2, function(s) {
n <- length(s)
s[2] <- paste(s[2:(2 + (n - 14))], collapse = " ")
s[-(3:(2 + (n - 14)))]
})
DF <- do.call(rbind, s3)
DF <- as.data.frame(DF, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)
str(DF)
#'data.frame': 4 obs. of 14 variables:
# $ V1 : int 1 2 3 4
# $ V2 : chr "FOO BAR" "FOO2 BAR2" "FOO3 BOR BAR3" "FOO4 BOR BAR4"
# $ V3 : chr "2K12/ME/01" "2K12/ME/02" "2K12/ME/03" "2K12/ME/04"
# $ V4 : int 96 72 63 89
# $ V5 : int 86 83 84 88
# $ V6 : int 86 61 62 74
# $ V7 : int 92 75 62 79
# $ V8 : int 73 44 50 77
# $ V9 : int 86 88 79 83
# $ V10: int 72 75 74 68
# $ V11: int 168 165 157 182
# $ V12: int 82 91 85 82
# $ V13: int 30 30 30 30
# $ V14: num 84.9 72.6 69.1 81.9
One approach is to use regex to enclose the names in quotes and then a simple read table. This approach has the advantage of allowing for cases with any number of names.
s1 <- "1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93"
s2 <- gsub("^ *|(?<= ) | *$", "", s1, perl = T)
read.table(text=gsub("(?<=[[:digit:]] )(.*)(?= 2K12)", "'\\1'", s2, perl = T), header = F)
Which gives:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93
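Both answers rely on the same observation: only the name varies in length, while the fields after it are fixed. A minimal sketch of that idea, written in Python rather than R for illustration, counts the fixed fields from the end of each line. The constant FIXED_TAIL and the function parse_record are hypothetical names, and the count of 12 trailing fields is taken from the sample rows above.

```python
# Number of fields after the name in the sample rows:
# the roll number (e.g. 2K12/ME/01) plus 11 numeric fields.
FIXED_TAIL = 12


def parse_record(line):
    """Split one record, joining however many name tokens it has."""
    tokens = line.split()
    record_id = int(tokens[0])
    name = " ".join(tokens[1:-FIXED_TAIL])      # everything between id and roll
    roll = tokens[-FIXED_TAIL]
    numbers = [float(t) for t in tokens[-FIXED_TAIL + 1:]]
    return [record_id, name, roll] + numbers


rows = [
    "1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93",
    "3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13",
]
for row in rows:
    print(parse_record(row))
```

Counting from the end means the number of name tokens never has to be known in advance, so a single-word or four-word name is handled the same way.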

Getting WA in ANUGCD from Codechef March Long Contest

I am getting WA (wrong answer) on the problem GCD Condition from the CodeChef March Long Contest.
Kindly tell me what I've done wrong, or give a test case where the code produces a wrong answer.
Link for the Question
I have used RMQ (range maximum query) for every prime number:
for(i=0;i<limit;i++)
{
int sz=b[i].size();
if(!sz)continue;
int level=0;
cc[i].resize(sz);
for(j=0;j<sz;j++)cc[i][j].push_back(b[i][j]);//level 0
for(level=1;(1<<level)<=sz;level++)
{
for(j=0;j+(1<<level)<=sz;j++)
{
int c1=cc[i][j][level-1];
int c2=cc[i][j+(1<<(level-1))][level-1];
int mx=(a[c1]<a[c2])?c2:c1;
cc[i][j].push_back(mx);
}
}
}
First, I converted the input into a structure like the following:
Example input:- 10 6 20 15 8
(b[i] stores the indices of the elements that have i as a factor)
b[2]--> 1,2,3,5
b[3]--> 2,4
b[5]--> 1,3,4
After building the RMQ tables, the structure is as follows, listed level by level:
(cc[i][j][k] stores the index of the largest element among b[i][j] ... b[i][j+(2^k)-1])
cc[2][0]-->1,2,3,5
cc[2][1]-->1,3,3
cc[2][2]-->3
cc[3][0]-->2,4
cc[3][1]-->4
cc[5][0]-->1,3,4
cc[5][1]-->3
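The table for one prime can be reproduced with a short sparse-table sketch. It is written in Python for illustration and is not the submitted C++ code. The names build_sparse_table and query_max are hypothetical, and the table is stored as levels[k][j], which is transposed relative to cc[i][j][k] in the post.

```python
def build_sparse_table(values, positions):
    """levels[k][j] = position of the largest value among positions[j : j + 2**k]."""
    levels = [list(positions)]          # level 0: each position on its own
    k = 1
    while (1 << k) <= len(positions):
        prev = levels[-1]
        half = 1 << (k - 1)
        row = []
        for j in range(len(positions) - (1 << k) + 1):
            c1, c2 = prev[j], prev[j + half]
            # same tie rule as the post: keep c1 unless c2 is strictly larger
            row.append(c2 if values[c1] < values[c2] else c1)
        levels.append(row)
        k += 1
    return levels


def query_max(values, levels, lo, hi):
    """Position of the largest value among positions[lo..hi] (inclusive list indices)."""
    k = (hi - lo + 1).bit_length() - 1   # largest power of two that fits the range
    c1 = levels[k][lo]
    c2 = levels[k][hi - (1 << k) + 1]
    return c2 if values[c1] < values[c2] else c1


# Example input from the post, with 1-based positions.
values = {1: 10, 2: 6, 3: 20, 4: 15, 5: 8}
positions_for_2 = [1, 2, 3, 5]           # elements divisible by 2
levels = build_sparse_table(values, positions_for_2)
print(levels)
print(query_max(values, levels, 1, 3))
```

A query covers its range with two overlapping blocks of the same power-of-two length, which is valid for a maximum because counting an element twice does not change the result.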
My Code
100 1
88 33 23 56 97 54 8 74 43 95 91 63 38 13 7 7 52 29 6 85 70 15 52 18 78 9 85 51 28 43 4 68 75 78 75 23 32 34 48 74 28 90 36 66 2 95 24 54 23 29 90 45 96 93 14 73 2 99 75 81 93 31 100 19 8 75 93 39 60 41 64 88 30 100 5 84 46 28 89 20 56 30 64 3 22 78 75 75 76 2 8 20 32 7 38 39 33 82 30 93
95 95 97
The output is -1 -1, but gcd(38, 95) = 19, so ans should be 38 1.
Replacing 'break' by 'continue' on line 75 gave AC :)