please help, python table file find max value - python-2.7

please help
I'm a beginner to python programming and my problem is this:
I have to make a program which first reads a text file like this one->
A a 1 2 (line one)
A b 3 5 (line two)
A c 9 1
B d 2 4
B e 9 2
C r 3 4
...
and find out: for each First Value (A, B, C, ...), which second value (a, b, c, ...) has max (third value)*(fourth value) (1*2, 3*5, ...) value.
that is, in this example the result should be b, e, r.
And I need to do it 1) without using dictionary class and saving each data
or 2) devise a class and object and do the same thing.
(actually I have to make this program twice by using either methods)
What I'm really confused about is this: I first wrote the program using a dictionary, but I have no idea how to do it with either of the two methods mentioned above.
I did it with a dictionary[dictionary[value]] structure (saving each line's data) and then found which entry has the max value for each first value.
How can I do this some other way?
In particular, is it even possible with method 1) (without using the dictionary class and without saving each line's data)?
thank you for reading my question
I'm really just beginning to learn about this programming and if any of you could give me some advice it would be really appreciated
here is what I've done so far:

The code below works by storing the maximum seen so far and comparing it with the values currently being read from the file. It is intentionally incomplete: it does not handle the case where two products are equal, and it misses an edge case that you should be able to find using your example input. I've left those for you to complete.
max_vals = []
with open('FILE.TXT', 'r') as f:
    max_first_val = None
    max_second_val = None
    max_prod = 0
    for line in f:
        vals = line.strip('\n').split(' ')
        curr_prod = int(vals[2]) * int(vals[3])
        if vals[0] != max_first_val and max_first_val is not None:
            max_vals.append(max_second_val)
            max_first_val = vals[0]
            max_prod = 0
        if curr_prod > max_prod:
            max_first_val = vals[0]
            max_second_val = vals[1]
            max_prod = curr_prod
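One possible way to finish the idea is sketched here. It assumes that rows sharing the same first value are contiguous in the file, as in the example. best_per_group is a hypothetical name, ties keep the first row seen, and no dictionary is used: only the current best row of the current group is kept.

```python
def best_per_group(lines):
    """Return, for each group, the second-column label with the largest product.

    Assumes rows of the same group (first column) are contiguous.
    """
    results = []
    current_group = None
    best_label = None
    best_prod = None
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        group, label = parts[0], parts[1]
        prod = int(parts[2]) * int(parts[3])
        if group != current_group:
            # A new group starts: record the winner of the previous one.
            if current_group is not None:
                results.append(best_label)
            current_group, best_label, best_prod = group, label, prod
        elif prod > best_prod:
            best_label, best_prod = label, prod
    # Record the winner of the last group, which no later row triggers.
    if current_group is not None:
        results.append(best_label)
    return results


rows = ["A a 1 2", "A b 3 5", "A c 9 1", "B d 2 4", "B e 9 2", "C r 3 4"]
print(best_per_group(rows))
```

An open file object can be passed instead of the list, since both are iterated line by line.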

Related

How to use a string or a char vector (containing any chemical composition respectively formula) and calculate its molar mass?

I try to write a simple console application in C++ which can read any chemical formula and afterwards compute its molar mass, for example:
Na2CO3, or something like:
La0.6Sr0.4CoO3, or with brackets:
Fe(NO3)3
The problem is that I don't know in detail how I can deal with the input stream. I think that reading the input and storing it into a char vector may be in this case a better idea than utilizing a common string.
My very first idea was to check all elements (stored in a char vector), step by step: When there's no lowercase after a capital letter, then I have found e.g. an element like Carbon 'C' instead of "Co" (Cobalt) or "Cu" (Copper). Basically, I've tried with the methods isupper(...), islower(...) or isalpha(...).
// first idea, but it seems to be definitely the wrong way
// read input characters from char vector
// check if element contains only one or two letters
// ... and convert them to a string, store them into a new vector
// ... finally, compute the molar mass elsewhere
// but how to deal with the numbers... ?
for (unsigned int i = 0; i < char_vec.size()-1; i++)
{
    if (islower(char_vec[i]))
    {
        char arr[] = { char_vec[i - 1], char_vec[i] };
        string temp_arr(arr, sizeof(arr));
        element.push_back(temp_arr);
    }
    else if (isupper(char_vec[i]) && !islower(char_vec[i+1]))
    {
        char arrSec[] = { char_vec[i] };
        string temp_arrSec(arrSec, sizeof(arrSec));
        element.push_back(temp_arrSec);
    }
    else if (!isalpha(char_vec[i]) || char_vec[i] == '.')
    {
        char arrNum[] = { char_vec[i] };
        string temp_arrNum(arrNum, sizeof(arrNum));
        stoechiometr_num.push_back(temp_arrNum);
    }
}
I need a simple algorithm that can handle both letters and numbers. Working with pointers may also be an option, but currently I am not very familiar with that technique. Anyway, I am open to it if someone would like to explain how I could use them here.
I would highly appreciate any support and, of course, some code snippets for this problem, since I have been thinking about it for many days without progress… Please keep in mind that I am a beginner rather than an intermediate programmer.
This problem is surely not for a beginner but I will try to give you some idea about how you can do that.
Assumption: I am not considering Isotopes case in which atomic mass can be different with same atomic number.
Model it to real world.
How will you solve that in real life?
Say, if I give you Chemical formula: Fe(NO3)3, What you will do is:
Convert this to something like this:
Total Mass => [1 of Fe] + [3 of NO3] => [1 of Fe] + [ 3 of [1 of N + 3 of O ] ]
=> 1 * Fe + 3 * (1 * N + 3 * O)
Then, you will search for individual masses of elements and then substitute them.
Total Mass => 1 * 56 + 3 * (1 * 14 + 3 * 16)
=> 242
Now, come to programming.
Trust me, you have to do the same in programming also.
Convert your chemical formula to the form discussed above, i.e. convert Fe(NO3)3 to Fe*1+(N*1+O*3)*3. I think this is the hardest part of the problem, but it can be done by breaking it down into steps.
Check whether every element has a number after it. If not, add "1" after it. For example, in this case O has a number after it, which is 3, but Fe and N don't.
After this step, your formula should change to Fe1(N1O3)3.
Now convert each number, say num, of the above formula to:
*num+ if an element or '(' follows the current number.
*num if ')' or the end of the formula follows it.
After this, your formula should change to Fe*1+(N*1+O*3)*3.
Now, your problem is to solve the above formula. There is a very easy algorithm for this. Please refer to: https://www.geeksforgeeks.org/expression-evaluation/. In your case, your operands can be either a number (say 2) or an element (say Fe). Your operators can be * and +. Parentheses can also be present.
For finding individual masses, you may maintain a std::map<std::string, int> containing element name as key and its mass as value.
Hope this helps a bit.
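As a rough illustration of the same idea, here is a minimal sketch in Python rather than C++. Instead of rewriting the formula into an expression string, it keeps one running total per open parenthesis on a stack, which has the same effect. The MASSES table contains only the rounded values from the worked example above, and molar_mass and TOKEN are hypothetical names; a real program would need a full table of atomic masses.

```python
import re

# Rounded masses from the worked example; extend this table as needed.
MASSES = {'Fe': 56.0, 'N': 14.0, 'O': 16.0}

# An element symbol, '(', ')', or a (possibly fractional) count.
TOKEN = re.compile(r'([A-Z][a-z]?)|(\()|(\))|(\d+(?:\.\d+)?)')


def molar_mass(formula, masses=MASSES):
    stack = [0.0]  # one running total per open parenthesis level
    last = 0.0     # mass of the most recent element or closed group
    pos = 0
    while pos < len(formula):
        m = TOKEN.match(formula, pos)
        if not m:
            raise ValueError('unexpected character at position %d' % pos)
        pos = m.end()
        element, open_paren, close_paren, number = m.groups()
        if element:
            last = masses[element]
            stack[-1] += last
        elif open_paren:
            stack.append(0.0)
        elif close_paren:
            if len(stack) == 1:
                raise ValueError('unbalanced parentheses')
            last = stack.pop()
            stack[-1] += last
        else:
            # The previous element or group was already added once,
            # so add the remaining (count - 1) copies.
            stack[-1] += last * (float(number) - 1)
            last = 0.0
    if len(stack) != 1:
        raise ValueError('unbalanced parentheses')
    return stack[0]


print(molar_mass('Fe(NO3)3'))
```

Because counts are parsed as floats, fractional subscripts such as those in La0.6Sr0.4CoO3 go through the same path once the corresponding elements are in the table.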

How can I create an array from a messy text file

I have a text file in the form below...
Some line of text
Some line of text
Some line of text
--
data entry 0 (i = 0, j = 0); value = 1.000000
data entry 1 (i = 0, j = 1); value = 1.000000
data entry 2 (i = 0, j = 2); value = 1.000000
data entry 3 (i = 0, j = 3); value = 1.000000
etc for quite a large number of lines. The total array ends up being 433 rows x 400 columns. There is a line of hyphens -- separating each new i value. So far I have the following code:
f = open('text_file_name', 'r')
lines = f.readlines()
which simply opens the file and converts it to a list with each line as a separate string. I need to be able to create an array with the given values at the i and j positions; let's call the array A. The value of A[0,0] should be 1.000000. I don't know how to get from a messy text file (or, at the stage I am at, a messy list) to a usable array.
EDIT:
The expected output is a NumPy array. If I can get to that point, I can work through the rest of the tasks in the problem
UPDATE:
Thank you, Lukasz, for the suggestion below. I sort of understand the code you wrote, but I don't understand it well enough to use it. However, you have given me some good ideas on what to do. The data entries begin on line 12 of the text file. Values for i are within the 22nd and 27th character places, values for j are within the 33rd and 39th character places, and values for value are within the 49th and 62nd character places. I realize this is overly specific for this particular text file, but my professor is fine with that.
Now, I've written the following code using the formatting of this text file
for x in range(12,len(lines)):
    if not lines[x].startswith(' data entry'):
        continue
    else:
        i = int(lines[x][22:28])
        j = int(lines[x][33:39])
        r = int(lines[x][49:62])
        matrix[i,j] = r
print matrix
and the following ValueError message is given:
r = int(lines[x][49:62])
ValueError: invalid literal for int() with base 10: '1.000000'
Can anyone explain why this is given (I should be able to convert the string '1.000000' to integer 1) and what I can do to correct the issue?
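On the error itself: int() only parses integer literals such as '1', so a string containing a decimal point is rejected even when the fractional part is zero. float() does accept it, and the result can then be passed to int() if an integer is really what is wanted. A minimal sketch (to_int is a hypothetical helper name):

```python
def to_int(text):
    # int('1.000000') raises ValueError, so parse as float first
    # and then truncate to an integer.
    return int(float(text))


print(to_int('1.000000'))
```

If the values in the file are not all whole numbers, keeping the float (r = float(...)) is probably the better choice, since int() would silently drop the fractional part.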
You may simply skip all lines that do not look like data lines.
A simple regular expression is used to retrieve the indices and the value.
import numpy as np
import re
def parse(line):
    m = re.search(r'\(i = (\d+), j = (\d+)\); value = (\S+)', line)
    if not m:
        raise ValueError("Invalid line", line)
    return int(m.group(1)), int(m.group(2)), float(m.group(3))
R = 433
C = 400
data_file = 'file.txt'
matrix = np.zeros((R, C))
with open(data_file) as f:
    for line in f:
        if not line.startswith('data entry'):
            continue
        i, j, v = parse(line)
        matrix[i, j] = v
print matrix
The main trouble here is the hardcoded matrix size. Ideally you'd somehow detect the size of the destination matrix before reading the data, or use another data structure and build the NumPy array from it afterwards.
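One way to avoid the hardcoded size is sketched below: collect the entries first, then size the grid from the largest indices seen. It uses only the standard library and returns nested lists; read_grid and ENTRY are hypothetical names, and cells that never appear in the file are assumed to be 0.0.

```python
import re

ENTRY = re.compile(r'\(i = (\d+), j = (\d+)\); value = (\S+)')


def read_grid(lines):
    # First pass: remember every (i, j) -> value pair that looks like data.
    cells = {}
    for line in lines:
        m = ENTRY.search(line)
        if m:
            cells[(int(m.group(1)), int(m.group(2)))] = float(m.group(3))
    if not cells:
        return []
    # Size the grid from the largest indices that were seen.
    rows = max(i for i, _ in cells) + 1
    cols = max(j for _, j in cells) + 1
    grid = [[0.0] * cols for _ in range(rows)]
    for (i, j), value in cells.items():
        grid[i][j] = value
    return grid


sample = [
    "Some line of text",
    "--",
    "data entry 0 (i = 0, j = 0); value = 1.000000",
    "data entry 1 (i = 0, j = 1); value = 1.000000",
]
print(read_grid(sample))
```

The nested list can then be handed to numpy.array(...) to get the NumPy array the question asks for.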

Memory Error when trying to brute force a key

def bruteForce( dictionary = {}):
    key = 0
    for i in range(len(dictionary)):
        keyRank = 0
        for k in range(68719476736):
            attempt = decrypt(dictionary[i], k)
            if(i != attempt):
                keyRank = 0
                break
            else:
                keyRank += 1
                key = k
                print 'key attempt: {0:b}'.format(key)
        if(keyRank == len(dictionary)):
            print 'found key: {0:b}'.format(key)
            break
The key is 36 bits
I get a memory error on the for k in range() line of code
Why is this a memory issue? Does python build an actual list of ints before running this line? Is there a better way to write this loop?
I'm brand new to Python and this wouldn't be a problem in C or Java.
This is a known-plaintext/ciphertext attack. dictionary is a mapping of P:C pairs.
It's on a VM, and I can increase the memory if needed, but I want to know both why it's failing and a code-based workaround or more idiomatic approach.
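In Python 2, range() builds the entire list in memory, so range(68719476736) tries to allocate a list of 2**36 ints before the loop even starts.
xrange() is a sequence object that evaluates lazily, producing one value at a time.
In Python 3, range() does what xrange() did.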
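If you would rather not depend on which of range()/xrange() is available, a small generator gives the same lazy behaviour in both Python 2 and Python 3. This is only a sketch: candidate_keys and brute_force are hypothetical names, and is_correct stands in for whatever check you build on top of your decrypt function.

```python
def candidate_keys(bits):
    # Yield 0 .. 2**bits - 1 one at a time; no list is ever built.
    k = 0
    limit = 1 << bits
    while k < limit:
        yield k
        k += 1


def brute_force(is_correct, bits=36):
    # is_correct is a caller-supplied predicate (an assumption here),
    # e.g. one that checks a key against every known P:C pair.
    for k in candidate_keys(bits):
        if is_correct(k):
            return k
    return None


# Tiny demonstration with a 4-bit key space.
print(brute_force(lambda k: k == 5, bits=4))
```

Memory use stays constant this way, although walking a 36-bit key space in pure Python will still take a long time.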

What is causing the syntax error here?

I'm trying to implement this algorithm but I keep getting a syntax error on the 12th line but I cannot pinpoint what is causing it. I'm new to ocaml and any help would be greatly appreciated.
"To find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:
Create a list of consecutive integers from 2 through n: (2, 3, 4, ..., n).
Initially, let p equal 2, the first prime number.
Starting from p, enumerate its multiples by counting to n in increments of p, and mark them in the list (these will be 2p, 3p, 4p, ... ; the p itself should not be marked).
Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this new number (which is the next prime), and repeat from step 3."
let prime(n) =
  let arr = Array.create n false in
  let set_marks (arr , n , prime ) = Array.set arr (n*prime) true in
  for i = 2 to n do
    set_marks(arr,i,2) done
  let findNextPrimeNumberThatIsNotMarked (arr, prime , index ) =
    let nextPrime = Array.get arr index in
    let findNextPrimeNumberThatIsNotMarkedHelper (arr, prime, index) =
      if nextPrime > prime then nextPrime
      else prime in
;;
Adding to Jeffrey's answer:
As I already said in my answer to "What exactly is the syntax error here?",
what you absolutely need to do right now is install and use a proper OCaml indentation tool and auto-indent your code. Unexpected auto-indent results often indicate syntactic mistakes such as a forgotten ;. Without such tools, it is very hard even for talented OCaml programmers to write OCaml code without syntax errors.
There are a bunch of auto-indenters available for OCaml:
ocp-indent for Emacs and Vim https://github.com/OCamlPro/ocp-indent
Caml mode and Tuareg mode for Emacs
Vim should have some other indenters, but I do not know them.
OCaml has an expression let a = b in c. Your code ends with in, but where is c? It looks like maybe you should just remove the in at the end.
Looking more closely I see there are more problems than this, sorry.
A function in OCaml is going to look like this roughly:
let f x =
  let a = b in
  let c = d in
  val
Your definition for prime looks exactly like this, except that it ends at the for loop, i.e., with the keyword done.
The rest of the code forms a second, independent, function definition. It has a form like this:
let f x =
  let a = b in
  let g x = expr in
The syntactic problem is that you're missing an expression after in.
However, your use of indentation suggests you aren't trying to define two different functions. If this is true, you need to rework your code somewhat.
One thing that may be useful (for imperative style programming) is that you can write expr1; expr2 to evaluate two expressions one after the other.
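Separately from the syntax problem, the quoted steps of the sieve are language-independent. The sketch below writes them in Python rather than OCaml, purely to show the control flow you are aiming for: one outer loop that finds the next unmarked number and one inner loop that marks its multiples. primes_up_to is a hypothetical name.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes less than or equal to n."""
    if n < 2:
        return []
    marked = [False] * (n + 1)  # marked[k] is True once k is known composite
    primes = []
    for p in range(2, n + 1):
        if marked[p]:
            continue  # p is a multiple of a smaller prime
        primes.append(p)
        # Mark 2p, 3p, 4p, ... up to n; p itself stays unmarked.
        for multiple in range(2 * p, n + 1, p):
            marked[multiple] = True
    return primes


print(primes_up_to(30))
```

In OCaml the same shape would be a for loop over a bool array, with expr1; expr2 sequencing inside the loop body.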

Solving a linear equation in one variable

What would be the most efficient algorithm to solve a linear equation in one variable given as a string input to a function? For example, for input string:
"x + 9 - 2 - 4 + x = - x + 5 - 1 + 3 - x"
The output should be 1.
I am considering using a stack and pushing each string token onto it as I encounter spaces in the string. If the input was in polish notation then it would have been easier to pop numbers off the stack to get to a result, but I am not sure what approach to take here.
It is an interview question.
Solving the linear equation is (I hope) extremely easy for you once you've worked out the coefficients a and b in the equation a * x + b = 0.
So, the difficult part of the problem is parsing the expression and "evaluating" it to find the coefficients. Your example expression is extremely simple, it uses only the operators unary -, binary -, binary +. And =, which you could handle specially.
It is not clear from the question whether the solution should also handle expressions involving binary * and /, or parentheses. I'm wondering whether the interview question is intended:
to make you write some simple code, or
to make you ask what the real scope of the problem is before you write anything.
Both are important skills :-)
It could even be that the question is intended:
to separate those with lots of experience writing parsers (who will solve it as fast as they can write/type) from those with none (who might struggle to solve it at all within a few minutes, at least without some hints).
Anyway, to allow for future more complicated requirements, there are two common approaches to parsing arithmetic expressions: recursive descent or Dijkstra's shunting-yard algorithm. You can look these up, and if you only need the simple expressions in version 1.0 then you can use a simplified form of Dijkstra's algorithm. Then once you've parsed the expression, you need to evaluate it: use values that are linear expressions in x and interpret = as an operator with lowest possible precedence that means "subtract". The result is a linear expression in x that is equal to 0.
If you don't need complicated expressions then you can evaluate that simple example pretty much directly from left-to-right once you've tokenised it[*]:
x
x + 9
// set the "we've found minus sign" bit to negate the first thing that follows
x + 7 // and clear the negative bit
x + 3
2 * x + 3
// set the "we've found the equals sign" bit to negate everything that follows
3 * x + 3
3 * x - 2
3 * x - 1
3 * x - 4
4 * x - 4
Finally, solve a * x + b = 0 as x = - b/a.
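The left-to-right evaluation above can be sketched in a few lines of Python. This assumes tokens are separated by spaces, as in the example, and that the only tokens are +, -, =, x and non-negative integers; solve_linear is a hypothetical name, and the two "bits" from the walkthrough become the variables sign and side.

```python
def solve_linear(equation):
    a = 0     # coefficient of x in a * x + b = 0
    b = 0     # constant term
    side = 1  # becomes -1 once '=' has been seen (negate everything after it)
    sign = 1  # -1 right after a minus sign (negate the next operand only)
    for tok in equation.split():
        if tok == '+':
            sign = 1
        elif tok == '-':
            sign = -1
        elif tok == '=':
            side, sign = -1, 1
        elif tok == 'x':
            a += side * sign
            sign = 1
        else:
            b += side * sign * int(tok)
            sign = 1
    if a == 0:
        raise ValueError('no unique solution')
    return -b / float(a)


print(solve_linear("x + 9 - 2 - 4 + x = - x + 5 - 1 + 3 - x"))
```

For the example this accumulates 4 * x - 4, matching the last line of the trace.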
[*] example tokenisation code, in Python:
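def tokenise(input):
    acc = None  # digits of the number currently being read, if any
    for idx, ch in enumerate(input):
        if ch in '1234567890':
            if acc is None: acc = 0
            acc = 10 * acc + int(ch)
            continue
        if acc is not None:
            yield acc
            acc = None
        if ch in '+-=x':
            yield ch
        elif ch == ' ':
            pass
        else:
            raise ValueError('illegal character "%s" at %d' % (ch, idx))
    # emit a number that ends the input
    if acc is not None:
        yield acc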
Alternative example tokenisation code, also in Python, assuming there will always be spaces between tokens as in the example. This leaves token validation to the parser:
return input.split()
OK, here is some simple pseudocode that you could use to solve this problem:
function(stringToParse){
    arrayoftokens = stringToParse.match(RegexMatching);
    foreach(arrayoftokens as token)
    {
        //now step through the tokens and determine what they are
        //and store the necessary information.
    }
    //Use the above information to do the arithmetic.
    //count the number of times a variable appears positive and negative
    //do the arithmetic.
    //add up the numbers both positive and negative.
    //return the result.
}
The first thing is to parse the string and identify the various tokens (numbers, variables and operators), so that an expression tree can be built that gives each operator its proper precedence.
Regular expressions can help, but they are not the only method (parser libraries like boost::spirit are good too, and you can even roll your own: it's all "find and recurse").
The tree can then be reduced by evaluating the nodes whose operations involve only constants and by grouping the operations that involve the variable.
This goes on recursively until you are left with one variable node and one constant node.
At that point the solution is trivial to calculate.
These are basically the same principles that lead to an interpreter or a compiler.
Consider:
from operator import add, sub
def ab(expr):
    a, b, op = 0, 0, add
    for t in expr.split():
        if t == '+': op = add
        elif t == '-': op = sub
        elif t == 'x': a = op(a, 1)
        else : b = op(b, int(t))
    return a, b
Given an expression like 1 + x - 2 - x... this converts it to a canonical form ax+b and returns a pair of coefficients (a,b).
Now, let's obtain the coefficients from both parts of the equation:
le, ri = equation.split('=')
a1, b1 = ab(le)
a2, b2 = ab(ri)
and finally solve the trivial equation a1*x + b1 = a2*x + b2:
x = (b2 - b1) / (a1 - a2)
Of course, this only solves this particular example, without operator precedence or parentheses. To support the latter you'll need a parser, presumably a recursive descent one, which would be simpler to code by hand.
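For what it's worth, a recursive descent version might look roughly like the sketch below. It is an illustration under assumptions, not a complete solution: it supports only +, -, *, unary minus, parentheses, x and non-negative integers, represents every value as a pair (a, b) meaning a * x + b, and silently ignores characters its tokenising regex does not recognise. parse_linear and solve are hypothetical names.

```python
import re


def parse_linear(text):
    """Parse an expression into (a, b), meaning a * x + b."""
    tokens = re.findall(r'\d+|[x+\-*()]', text)
    pos = [0]  # list so the nested functions can update it (Python 2 and 3)

    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None

    def advance():
        tok = peek()
        pos[0] += 1
        return tok

    def expr():  # expr := term (('+' | '-') term)*
        a, b = term()
        while peek() in ('+', '-'):
            sign = 1 if advance() == '+' else -1
            ta, tb = term()
            a, b = a + sign * ta, b + sign * tb
        return a, b

    def term():  # term := factor ('*' factor)*
        a, b = factor()
        while peek() == '*':
            advance()
            fa, fb = factor()
            if a != 0 and fa != 0:
                raise ValueError('x * x is not linear')
            a, b = a * fb + fa * b, b * fb
        return a, b

    def factor():  # factor := '-' factor | '(' expr ')' | 'x' | number
        tok = advance()
        if tok == '-':
            a, b = factor()
            return -a, -b
        if tok == '(':
            result = expr()
            if advance() != ')':
                raise ValueError('missing )')
            return result
        if tok == 'x':
            return 1, 0
        if tok is None or not tok.isdigit():
            raise ValueError('unexpected token %r' % (tok,))
        return 0, int(tok)

    result = expr()
    if peek() is not None:
        raise ValueError('unexpected token %r' % (peek(),))
    return result


def solve(equation):
    left, right = equation.split('=')
    a1, b1 = parse_linear(left)
    a2, b2 = parse_linear(right)
    if a1 == a2:
        raise ValueError('no unique solution')
    return (b2 - b1) / float(a1 - a2)


print(solve("2 * (x + 3) = x + 10"))
```

The float() in solve avoids integer division under Python 2; the same caveat applies to the x = (b2 - b1) / (a1 - a2) line above.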