I'm preparing for KOI 2022, so I'm solving the KOI 2021 and 2020 problems. This one is KOI 2020 contest 1, 1st period, problem 5 (see problem 5 here).
I want to write a function vector<vector<int>> minesweeper(vector<vector<int>> &v) that solves a 5*5 minesweeper puzzle.
argument
vector<vector<int>> &v
The numbers on the minesweeper board, converted to a vector. A cell is -1 if it is blank.
e.g. {{0, 0, 1, -1, 1}, {-1, 3, 3, -1, 1}, {-1, -1, -1, -1, 0}, {2, 5, -1, 3, -1}, {-1, -1, -1, -1, -1}}
return value
A vector of the same size as the argument. A cell is 1 if it holds a mine and 0 otherwise.
English translation of KOI 2020 contest 1 1st period problem 5
There is a 5*5 minesweeper puzzle.
There are 11 numbers, and the others are blank.
0
0
1
가
1
나
3
3
1
다
0
2
5
3
라
마
Where is the mine?
A. 가
B. 나
C. 다
D. 라
E. 마
How can I write the minesweeper function? I would like a description of the algorithm, too.
There are two simple rules for solving Minesweeper:
If a field already sees all its mines, then none of its remaining blank neighbours have mines and they can be uncovered.
If a field has exactly as many blank neighbours as it is missing mines, then they all contain mines.
Keep applying those rules over and over until nothing changes.
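A minimal Python sketch of these two rules, repeated until nothing changes. The names UNKNOWN, SAFE, MINE, neighbors and apply_basic_rules are made up for illustration, and the board uses the question's convention of -1 for a blank cell. It is Python rather than the C++ signature from the question, so treat it as an outline of the idea rather than a drop-in solution.

```python
UNKNOWN, SAFE, MINE = -1, 0, 1  # hypothetical cell states for this sketch


def neighbors(r, c, rows, cols):
    """Yield the coordinates of the up to 8 cells around (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc


def apply_basic_rules(clues):
    """Apply the two rules until a full pass changes nothing.

    Returns a grid of MINE / SAFE / UNKNOWN; numbered cells count as SAFE.
    """
    rows, cols = len(clues), len(clues[0])
    state = [[UNKNOWN if v < 0 else SAFE for v in row] for row in clues]
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if clues[r][c] < 0:
                    continue  # blank cell: carries no number to reason from
                nbrs = list(neighbors(r, c, rows, cols))
                mines = sum(state[y][x] == MINE for y, x in nbrs)
                unknown = [(y, x) for y, x in nbrs if state[y][x] == UNKNOWN]
                if not unknown:
                    continue
                missing = clues[r][c] - mines
                if missing == 0:
                    fill = SAFE   # rule 1: all mines already seen
                elif missing == len(unknown):
                    fill = MINE   # rule 2: every remaining blank is a mine
                else:
                    continue
                for y, x in unknown:
                    state[y][x] = fill
                changed = True
    return state


board = [[0, 0, 1, -1, 1],
         [-1, 3, 3, -1, 1],
         [-1, -1, -1, -1, 0],
         [2, 5, -1, 3, -1],
         [-1, -1, -1, -1, -1]]
result = apply_basic_rules(board)
for row in result:
    print(row)
```

On the example board from the question these two rules are already enough to settle every cell, so no UNKNOWN entries remain in the result.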
Now it gets complex because you have to look at combinations of fields. Instead of figuring out lots of special cases for 2, 3, 4 known fields I suggest going straight to a brute force algorithm:
For every blank field next to a known field:
create two copies of the map, one with a mine in that field and one without
go back to the top to solve each copy
if one of the copies leaves any known field seeing the wrong number of mines, then the other case must be the actual solution; apply it and go back to the start
If no progress was made then you have to guess. The loop above can give you probabilities for where mines are, and if you know the total number of mines you also have a probability for the other blank fields. Pick a field that is least likely to have a mine.
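The trial-and-contradiction loop could be sketched in Python as below. This is only an outline under some assumptions: the names propagate and solve_with_hypotheses are invented here, -1 marks a blank cell, and the guessing step with probabilities is left out, so cells that cannot be decided stay UNKNOWN.

```python
UNKNOWN, SAFE, MINE = -1, 0, 1  # hypothetical cell states for this sketch


def neighbors(r, c, rows, cols):
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc


def propagate(clues, state):
    """Apply the two basic rules in place; return False on a contradiction."""
    rows, cols = len(clues), len(clues[0])
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if clues[r][c] < 0:
                    continue
                nbrs = list(neighbors(r, c, rows, cols))
                mines = sum(state[y][x] == MINE for y, x in nbrs)
                unknown = [(y, x) for y, x in nbrs if state[y][x] == UNKNOWN]
                missing = clues[r][c] - mines
                if missing < 0 or missing > len(unknown):
                    return False  # too many mines, or not enough room for them
                if unknown and missing in (0, len(unknown)):
                    fill = MINE if missing else SAFE
                    for y, x in unknown:
                        state[y][x] = fill
                    changed = True
    return True


def solve_with_hypotheses(clues):
    """Basic rules plus 'assume one case, keep the other if it contradicts'."""
    rows, cols = len(clues), len(clues[0])
    state = [[UNKNOWN if v < 0 else SAFE for v in row] for row in clues]
    if not propagate(clues, state):
        raise ValueError("clues are inconsistent")
    progress = True
    while progress:
        progress = False
        # Blank cells that touch at least one numbered cell.
        frontier = [(r, c) for r in range(rows) for c in range(cols)
                    if state[r][c] == UNKNOWN
                    and any(clues[y][x] >= 0
                            for y, x in neighbors(r, c, rows, cols))]
        for r, c in frontier:
            if state[r][c] != UNKNOWN:
                continue  # settled by an earlier deduction in this pass
            for guess, other in ((MINE, SAFE), (SAFE, MINE)):
                trial = [row[:] for row in state]  # work on a copy of the map
                trial[r][c] = guess
                if not propagate(clues, trial):
                    state[r][c] = other  # the guess failed, so the other case holds
                    if not propagate(clues, state):
                        raise ValueError("clues are inconsistent")
                    progress = True
                    break
    return state


# A 1-2-1 pattern, where the two basic rules alone make no progress.
tricky = [[-1, -1, -1],
          [1, 2, 1],
          [0, 0, 0]]
print(solve_with_hypotheses(tricky))
```

Copying the whole map for every hypothesis is wasteful but harmless on a 5*5 board; a larger board would want something smarter.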
I've got a time series of temperature data with some wrong values that I want to sort out. The problem is that I want to sort out only the points in a certain period of time.
If I sort out the wrong points by their temperature value, ALL of the points with this temperature value are sorted out (through the whole measuring period).
This is a very easy version of my code (in reality, there are many more values)
laketemperature <- c(15, 14, 14, 12, 11, 9, 9, 8, 6, 4, 15, 14, 3) #only want to sort out the last 14 and 15
out <- c(15, 14)
laketemperature_clean <- laketemperature [- out] # the 15 and 14s at the beginning are sorted out, too :(
I want to have the whole laketemperature-series in the end, only without the second 15.
I already tried with ifelse, but it didn't work out.
I am trying to find the Jaccard similarity between two documents. However, I am having a hard time understanding how the function sklearn.metrics.jaccard_similarity_score() works behind the scenes. As I understand it, Jaccard similarity = (intersection of the terms in the docs) / (union of the terms in the docs).
Consider below example:
My DTM for the two documents is:
array([[1, 1, 1, 1, 2, 0, 1, 0],
[2, 1, 1, 0, 1, 1, 0, 1]], dtype=int64)
The above function gives me the Jaccard similarity score:
print(sklearn.metrics.jaccard_similarity_score(tf_matrix[0,:],tf_matrix[1,:]))
0.25
I am trying to compute the score on my own as:
intersection of terms in both the docs = 4
total terms in doc 1 = 6
total terms in doc 2 = 6
Jaccard = 4/(6+6-4)= .5
Can someone please help me understand whether I am missing something obvious here?
As stated here:
In binary and multiclass classification, the Jaccard similarity coefficient score is equal to the classification accuracy.
Therefore in your example it is calculating the proportion of matching elements. That's why you're getting 0.25 as the result.
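To make the difference concrete, here is a small plain-Python sketch that computes both quantities for the two rows from the question. It does not call scikit-learn at all; the variable names are made up, and "term present" is assumed to mean a count greater than zero.

```python
doc1 = [1, 1, 1, 1, 2, 0, 1, 0]
doc2 = [2, 1, 1, 0, 1, 1, 0, 1]

# Proportion of positions where the two rows hold exactly the same value,
# i.e. what an accuracy-style score measures.
matching = sum(a == b for a, b in zip(doc1, doc2))
accuracy = matching / len(doc1)

# Set-based Jaccard on term presence (count > 0), which is what the question
# computes by hand.
terms1 = {i for i, n in enumerate(doc1) if n > 0}
terms2 = {i for i, n in enumerate(doc2) if n > 0}
set_jaccard = len(terms1 & terms2) / len(terms1 | terms2)

print(accuracy)     # 0.25
print(set_jaccard)  # 0.5
```

Only positions 1 and 2 hold identical values in both rows, which gives 2/8, while four terms occur in both documents out of eight terms overall, which gives 4/8.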
In my view,
the intersection of terms in both the docs = 2.
This is a position-by-position intersection: values are compared at the same index, the way predictions are compared against the correct values for a model.
The ordinary set intersection, which ignores what value sits at each index, is 4.
# so,
jaccard_score = 2/(6+6-4) = 0.25
I have two lists, the first of which represents times of observation and the second of which represents the observed values at those times. I am trying to find the maximum observed value and the corresponding time given a rolling window of various lengths. For example's sake, here are the two lists.
# observed values
linspeed = [280.0, 275.0, 300.0, 475.2, 360.1, 400.9, 215.3, 323.8, 289.7]
# times that correspond to observed values
time_count = [4.0, 6.0, 8.0, 8.0, 10.0, 10.0, 10.0, 14.0, 16.0]
# actual dataset is of size ~ 11,000
The missing times (ex: 3.0) correspond to an observed value of zero, whereas duplicate times correspond to multiple observations to the floored time. Since my window will be rolling over the time_count (ex: max value in first 2 hours, next 2 hours, 2 hours after that; max value in first 4 hours, next 4 hours, ...), I plan to use an array-reshaping routine. However, it's important to set up everything properly before, which entails finding the maximum value given duplicate times. To solve this problem, I tried the code just below.
def list_duplicates(data_list):
    seen = set()
    seen_add = seen.add
    seen_twice = set(x for x in data_list if x in seen or seen_add(x))
    return list(seen_twice)
# check for duplicate values
dups = list_duplicates(time_count)
print(dups)
>> [8.0, 10.0]
# get index of duplicates
for dup in dups:
    print(time_count.index(dup))
>> 2
>> 4
When checking for the index of the duplicates, it appears that this code will only return the index of the first occurrence of the duplicate value. I also tried using OrderedDict via module collections for reasons concerning code efficiency/speed, but dictionaries have a similar problem. Given duplicate keys for non-duplicate observation values, the first instance of the duplicate key and corresponding observation value is kept while all others are dropped from the dict. Per this SO post, my second attempt is just below.
for dup in dups:
    indexes = [i for i, x in enumerate(time_count) if x == dup]
print(indexes)
>> [4, 5, 6] # indices correspond to duplicate time 10s but not duplicate time 8s
I should be getting [2,3] for time in time_count = 8.0 and [4,5,6] for time in time_count = 10.0. From the duplicate time_counts, 475.2 is the max linspeed that corresponds to duplicate time_count 8.0 and 400.9 is the max linspeed that corresponds to duplicate time_count 10.0, meaning that the other linspeeds at leftover indices of duplicate time_counts would be removed.
I'm not sure what else I can try. How can I adapt this (or find a new approach) to find all of the indices that correspond to duplicate values in an efficient manner? Any advice would be appreciated. (PS - I made numpy a tag because I think there is a way to do this via numpy that I haven't figured out yet.)
Without going into the details of how to implement an efficient rolling-window-maximum filter: reducing the duplicate values can be seen as a grouping problem, for which the numpy_indexed package (disclaimer: I am its author) provides efficient and simple solutions:
import numpy_indexed as npi
unique_time, unique_speed = npi.group_by(time_count).max(linspeed)
For large input datasets (i.e., where it matters), this should be a lot faster than any non-vectorized solution. Memory consumption is linear and performance is O(N log N) in general; but since time_count appears to be sorted already, performance should be linear too.
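If installing an extra package is not an option, the same grouping idea can be sketched in plain Python with a dictionary. This is not the numpy_indexed implementation, just an illustration of the reduction; the function name max_per_time is made up, and it will be slower than a vectorized version on large inputs.

```python
def max_per_time(times, values):
    """Return the sorted unique times and the maximum value seen at each."""
    best = {}
    for t, v in zip(times, values):
        if t not in best or v > best[t]:
            best[t] = v
    unique_times = sorted(best)
    return unique_times, [best[t] for t in unique_times]


linspeed = [280.0, 275.0, 300.0, 475.2, 360.1, 400.9, 215.3, 323.8, 289.7]
time_count = [4.0, 6.0, 8.0, 8.0, 10.0, 10.0, 10.0, 14.0, 16.0]

unique_time, unique_speed = max_per_time(time_count, linspeed)
print(unique_time)   # [4.0, 6.0, 8.0, 10.0, 14.0, 16.0]
print(unique_speed)  # [280.0, 275.0, 475.2, 400.9, 323.8, 289.7]
```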
OK, if you want to do this with numpy, best is to turn both of your lists into arrays:
import numpy as np
l = np.array(linspeed)
tc = np.array(time_count)
Now, finding unique times is just an np.unique call:
u, i, c = np.unique(tc, return_inverse = True, return_counts = True)
u
Out[]: array([ 4., 6., 8., 10., 14., 16.])
i
Out[]: array([0, 1, 2, 2, 3, 3, 3, 4, 5], dtype=int32)
c
Out[]: array([1, 1, 2, 3, 1, 1])
Now you can either build your maximums with a for loop
m = np.array([np.max(l[i == j]) for j in range(u.size)])
m
Out[]: array([ 280. , 275. , 475.2, 400.9, 323.8, 289.7])
Or try some 2d method. This could be faster, but it would need to be optimized. This is just the basic idea.
np.max(np.where(i[None, :] == np.arange(u.size)[:, None], linspeed, 0),axis = 1)
Out[]: array([ 280. , 275. , 475.2, 400.9, 323.8, 289.7])
Now your m and u vectors are the same length and include the output you want.
I haven't been able to find a solution to this problem anywhere (I've checked many of the custom sort questions) and I've only just started learning Python, so apologies if this is a repeat or too specific.
I'm coding a card game that has the user playing against the program. The list that needs to be sorted is the hand of cards, each card represented as two characters (eg. 7c for the seven of clubs, or Td for the ten of diamonds). I want to arrange the cards so that the ranks are in the following order: 3, 4, 5, 6, 7, 8, 9, J, Q, K, A, 2 and T.
So if I had a hand that was ['3d', 'Ac', '6h', 'Kd', '2s'], it would be presented as ['3d', '6h', 'Kd', 'Ac', '2s'].
First, define the order. Then sort your hand by that order:
In [39]: order = "3456789JQKA2T"
In [40]: hand = ['3d', 'Ac', '6h', 'Kd', '2s']
In [41]: hand.sort(key=lambda c:order.index(c[0]))
In [42]: hand
Out[42]: ['3d', '6h', 'Kd', 'Ac', '2s']
For a programming class, I have to convert a range of values to a switch statement without using if/else ifs. Here are the values that I need to convert to cases:
0 to 149 ........... $10.00
150 to 299 ......... $15.00
300 to 449 ......... $25.00
550 to 749 ......... $40.00
750 to 1199 ........ $65.00
2000 and above ..... $85.00
I am having difficulty finding a way to separate the values since the range boundaries are so close together (like 149 and 150).
I have tried several approaches, such as dividing the input by 2000 and then multiplying that by 10 to get a whole number, but the results are too close to each other to create a separate case for each.
The first thing to do is to figure out your granularity. It looks like in your case you do not deal with increments less than 50.
Next, convert each range to a range of integers resulting from dividing the number by the increment (i.e. 50). In your case, this would mean
0, 1, 2 --> 10
3, 4, 5 --> 15
6, 7, 8 --> 25
... // And so on
This maps to a very straightforward switch statement.
Note: This also maps to an array of values, like this:
10, 10, 10, 15, 15, 15, 25, 25, 25, ...
Now you can get the result by doing array[n/50].
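As a rough illustration of the array idea, here is a short Python sketch (the assignment is presumably in another language, but the lookup works the same way). Only the first three ranges are filled in, matching the mapping spelled out above; STEP, FEE_TABLE and fee_for are made-up names, and the remaining ranges would be appended to the table in the same way.

```python
STEP = 50  # granularity: the boundaries used here are all multiples of 50

# One entry per block of 50 units: indices 0-2 -> 10, 3-5 -> 15, 6-8 -> 25.
FEE_TABLE = [10.00] * 3 + [15.00] * 3 + [25.00] * 3


def fee_for(n):
    """Look up the fee for 0 <= n <= 449 by integer division, no if/else."""
    return FEE_TABLE[n // STEP]


print(fee_for(149))  # 10.0
print(fee_for(150))  # 15.0
print(fee_for(449))  # 25.0
```

The integer division is what separates 149 from 150: 149 // 50 is 2 and 150 // 50 is 3, so neighbouring inputs land in different table slots (or different switch cases).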