Regarding the combit List & Label Version 24 Designer:
I want specific objects on specific layers, and those layers on specific pages.
I use page breaks on the objects ("break before ...") and I want the layers connected to specific pages like this:
Layer 1, Page = 1
Layer 2, Page = 2
Layer 3, Page = 3
Layer 4, Page = 4
Layer 5, Page = 5
... but I do not get a working result with it. With this "page = number" expression I get no result at all.
For now I have 3 pages and have organized it like this:
Layer 1 = Page() = 1
Layer 2 = Page() >= 1
Layer 3 = Page() > 1
The result shows 3 pages, and the second object (which is connected to layer 2) shows up on pages 2 and 3.
Is there a simple way to connect every object to exactly one layer and each layer to only one page?
Thank you very much in advance.
Problem solved.
After checking the Barcode example demo, it now works.
The page connections to the layers work fine with
Page() = <number>
On previous tries it did not …
Anyway, all fine now.
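For reference, the appearance conditions that ended up working look roughly like this (one object per layer, one layer per page):
Layer 1: Page() = 1
Layer 2: Page() = 2
Layer 3: Page() = 3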
I'm trying to read the discharge data of 346 US rivers stored online in text files. The files are more or less in this format:
Measurement_number Date Gage_height Discharge_value
1 2017-01-01 10 1000
2 2017-01-20 15 2000
# etc.
I only want to read the gage height and discharge value columns.
The problem is that in most files additional metadata columns are added in front of the 'Gage_height' column, so I cannot simply read the 3rd and 4th columns because their indices vary.
I'm trying to find a way to say 'read the columns named Gage_height and Discharge_value', but I haven't succeeded yet.
I hope someone can help. I'm currently trying to load the text files with numpy.genfromtxt, so it would be great to find a solution with that package, but other solutions are also more than welcome.
This is my code so far:
import urllib2
import numpy as np
data_url = urllib2.urlopen(...)  # the URL of this specific site
data = np.genfromtxt(data_url, skip_header=1, comments='#', usecols=[2, 3])
You can use the names=True option to genfromtxt, and then use the column names to select which columns you want to read with usecols.
For example, to read 'Gage_height' and 'Discharge_value' from your data file:
data = np.genfromtxt(filename, names=True, usecols=['Gage_height', 'Discharge_value'])
Note that you don't need to set skip_header=1 if you use names=True.
You can then access the columns using their names:
gage_height = data['Gage_height'] # == array([ 10., 15.])
discharge_value = data['Discharge_value'] # == array([ 1000., 2000.])
See the docs here for more information.
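For a self-contained illustration, here is a minimal sketch based on the sample layout from the question, with io.StringIO standing in for the downloaded file (real USGS files may have extra commented '#' metadata lines before the header):
import io
import numpy as np

sample = """Measurement_number Date Gage_height Discharge_value
1 2017-01-01 10 1000
2 2017-01-20 15 2000
"""

# names=True reads the header line, so columns can be selected by name
data = np.genfromtxt(io.StringIO(sample), names=True,
                     usecols=['Gage_height', 'Discharge_value'])

print(data['Gage_height'])       # [10. 15.]
print(data['Discharge_value'])   # [1000. 2000.]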
So I recently downloaded the yahoo_finance API, version 1.4.0. I got it a few days ago, and .get_historical() was working fine. Now, however, it doesn't. Here's what it's doing:
import yahoo_finance as yf
apple=yf.Share('AAPL')
apple_price=apple.get_price()
print apple.get_historical('2016-02-15', '2016-04-29')
The error I get is: YQLResponseMalformedError: Response malformed. Is there a bug in the API, or am I forgetting something?
Unfortunately, the Yahoo stock price API, which a lot of modules are based on, doesn't work anymore.
Alternatively, you could use Google's API:
https://www.google.com/finance/getprices?q=1101&x=TPE&i=86400&p=3d&f=d,c,h,l,o,v
q=1101 is the stock quote
x=TPE is the exchange (List of Exchanges here: https://www.google.com/googlefinance/disclaimer/ )
i=86400 is the interval in seconds (86400 sec = 1 day)
p=3d is how far back the data goes
f= are the fields of data (d=date, c=close, h=high, l=low, o=open, v=volume)
Data would look like this:
EXCHANGE%3DTPE
MARKET_OPEN_MINUTE=540
MARKET_CLOSE_MINUTE=810
INTERVAL=86400
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
DATA=
TIMEZONE_OFFSET=480
a1496295000,24.4,24.75,24.35,24.75,11782000
1,24.5,24.5,24.3,24.4,10747000
a1496295000 is the Unix timestamp of the first row of data
the 1 at the start of the second row is the offset from the first row, measured in intervals (here, one day)
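As a rough sketch of how that response could be fetched and parsed (assuming the endpoint still responds and that the requests package is available; this is not code from the original answer):
import requests

url = ('https://www.google.com/finance/getprices'
       '?q=1101&x=TPE&i=86400&p=3d&f=d,c,h,l,o,v')
lines = requests.get(url).text.splitlines()

interval = 86400   # fallback to the i= parameter if no INTERVAL header is seen
base_ts = None
rows = []
for line in lines:
    if line.startswith('INTERVAL='):
        interval = int(line.split('=')[1])
    parts = line.split(',')
    # Data rows have exactly six comma-separated fields and no '=' header prefix
    if len(parts) != 6 or '=' in parts[0]:
        continue
    date, close, high, low, open_, volume = parts
    if date.startswith('a'):
        base_ts = int(date[1:])              # absolute Unix timestamp row
        ts = base_ts
    else:
        ts = base_ts + int(date) * interval  # offset row, measured in intervals
    rows.append((ts, float(close), float(high), float(low), float(open_), int(volume)))

for row in rows:
    print(row)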
How do I unit test Python DataFrames?
I have functions whose inputs and outputs are DataFrames; almost every function I have does this. If I want to unit test them, what is the best way to do it? It seems like a lot of effort to create a new DataFrame (with values populated) for every function.
Are there any materials you can refer me to? Should you write unit tests for these functions?
While Pandas' test functions are primarily used for internal testing, NumPy includes a very useful set of testing functions that are documented here: NumPy Test Support.
These functions compare NumPy arrays, but you can get the array that underlies a Pandas DataFrame using the values property. You can define a simple DataFrame and compare what your function returns to what you expect.
One technique you can use is to define one set of test data for a number of functions. That way, you can use Pytest Fixtures to define that DataFrame once, and use it in multiple tests.
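As a rough sketch of that pattern (add_totals() here is a hypothetical function under test, not something from the question):
import numpy as np
import pandas as pd
import pytest

def add_totals(df):
    # Hypothetical function under test: appends a 'total' column
    out = df.copy()
    out['total'] = out['a'] + out['b']
    return out

@pytest.fixture
def input_df():
    # The shared test DataFrame, defined once and reused across tests
    return pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

def test_add_totals(input_df):
    result = add_totals(input_df)
    # Compare the underlying NumPy array, as described above
    np.testing.assert_array_equal(result['total'].values, np.array([4, 6]))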
In terms of resources, I found this article on Testing with NumPy and Pandas to be very useful. I also did a short presentation about data analysis testing at PyCon Canada 2016: Automate Your Data Analysis Testing.
You can use pandas' testing functions:
They give you the flexibility to compare your result with the expected result in different ways.
For example:
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'a': [6, 7, 8, 9, 10]})
expected_res = pd.Series([7, 9, 11, 13, 15])
pd.testing.assert_series_equal((df1['a'] + df2['a']), expected_res, check_names=False)
For more details, refer to this link.
If you are using pytest, PandasSnapshot will be useful.
# use with pytest
import pandas as pd
from snapshottest_ext.dataframe import PandasSnapshot

def test_format(snapshot):
    df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                      columns=['col 1', 'col 2'])
    snapshot.assert_match(PandasSnapshot(df))
One big con is that the snapshot is not readable anymore. (Storing the content as CSV is more readable, but it is problematic.)
PS: I am the author of pytest snapshot extension.
I don't think it's hard to create small DataFrames for unit testing:
import pandas as pd
from nose.tools import assert_dict_equal

input_df = pd.DataFrame.from_dict({
    'field_1': [some, values],
    'field_2': [other, values]
})
expected = {
    'result': [...]
}
assert_dict_equal(expected, my_func(input_df).to_dict(), "oops, there's a bug...")
You could use snapshottest and do something like this:
def test_something_works(snapshot):  # snapshot is a pytest fixture from snapshottest
    data_frame = calc_something_and_return_pandas_dataframe()
    snapshot.assert_match(data_frame.to_csv(index=False), 'some_module_level_unique_name_for_the_snapshot')
This will create a snapshots folder containing a file with the CSV output, which you can update with --snapshot-update when your code changes.
It works by comparing the data_frame variable to what is saved to disk.
Might be worth mentioning that your snapshots should be checked in to source control.
I would suggest writing the values as CSV in docstrings (or in separate files if they're large) and parsing them using pd.read_csv(). You can parse the expected output from CSV too and compare, or use df.to_csv() to write a CSV out and diff it.
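A minimal sketch of that approach, with double_values() as a purely hypothetical function under test:
import io
import pandas as pd

def double_values(df):
    # Hypothetical function under test: doubles every numeric column
    return df * 2

def test_double_values():
    input_csv = """a,b
1,10
2,20
"""
    expected_csv = """a,b
2,20
4,40
"""
    # Parse both the input and the expected output from inline CSV
    input_df = pd.read_csv(io.StringIO(input_csv))
    expected = pd.read_csv(io.StringIO(expected_csv))
    pd.testing.assert_frame_equal(double_values(input_df), expected)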
Pandas has built-in testing functions, but I don't find their output easy to parse, so I created an open source project called beavis with functions that output error messages that are easier for humans to read.
Here's an example of one of the built-in testing methods:
df = pd.DataFrame({"col1": [1042, 2, 9, 6], "col2": [5, 2, 7, 6]})
pd.testing.assert_series_equal(df["col1"], df["col2"])
Here's the error message:
> ???
E AssertionError: Series are different
E
E Series values are different (50.0 %)
E [index]: [0, 1, 2, 3]
E [left]: [1042, 2, 9, 6]
E [right]: [5, 2, 7, 6]
Not very easy to see which rows are mismatched because the output isn't aligned.
Here's how you can write the same test with beavis.
import beavis
beavis.assert_pd_column_equality(df, "col1", "col2")
This will give you a more readable error message.
The built-in assert_frame_equal doesn't give a readable error message either. Here's how you can compare DataFrame equality with beavis.
df1 = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df2 = pd.DataFrame({'col1': [5, 2], 'col2': [3, 4]})
beavis.assert_pd_equality(df1, df2)
The frame-fixtures Python package (of which I am an author) is designed to make it easy to "create a new dataframe (with values populated)" for unit or performance tests.
For example, if you want to test against a DataFrame of floats and strings with a numerical index, you can use a compact string declaration to generate a DataFrame.
>>> import frame_fixtures as ff
>>> ff.Fixture.to_frame('i(I,int)|v(float,str)|s(4,2)').to_pandas()
0 1
34715 1930.40 zaji
-3648 -1760.34 zJnC
91301 1857.34 zDdR
30205 1699.34 zuVU
>>> ff.Fixture.to_frame('i(I,int)|v(float,str)|s(8,3)').to_pandas()
0 1 2
34715 1930.40 zaji 694.30
-3648 -1760.34 zJnC -72.96
91301 1857.34 zDdR 1826.02
30205 1699.34 zuVU 604.10
54020 268.96 zKka 1080.40
129017 3511.58 zJXD 2580.34
35021 1175.36 zPAQ 700.42
166924 2925.68 zyps 3338.48
I have a large string of characters and would like to extract certain information from it by matching a pattern:
str(input)
chr [1:109094] "{'asin': '0981850006', 'description': 'Steven Raichlen\'s Best of Barbecue Primal Grill DVD. The first three volumes of the si"| truncated ...
I get the following content for input[1] (the product metadata description):
[1] ("{'asin': '144072007X', 'related': {'also_viewed': ['B008WC0X0A', 'B000CPMOVG', 'B0046641AE', 'B00J150GAO', 'B00005AMCG', 'B005WGX97I'],
'bought_together': ['B000H85WSA']},
'title': 'Sand Shark Margare Maron Audio CD',
'price': 577.15,
'salesRank': {'Patio, Lawn & Garden': 188289},
'imUrl': 'http://ecx.images-amazon.com/images/I/31B9X0S6dqL._SX300_.jpg',
'brand': 'Tesoro',
'categories': [['Patio, Lawn & Garden', 'Lawn Mowers & Outdoor Power Tools', 'Metal Detectors']],
'description': \"The Tesoro Sand Shark metal combines time-proven PI circuits with the latest digital technology creating the first.\"}")
Now I would like to iterate over each element of the large string and extract asin, title, price, salesRank, brand, and categories, which should be saved in a data.frame for easier handling.
The data is originally from a JSON file, as you might notice. I tried to import it using the stream_in command, but it didn't help, so I just imported it using readLines. Please help, I'm a bit desperate... any hint is appreciated!
The jsonlite package shows the following problem:
lexical error: invalid char in json text.
{'asin': '0981850006', 'descript
(right here) ------^
closing fileconnectionoldClass input connection.
Any new ideas on that?
Given the many unanswered questions on this issue, it must be very relevant for newbies ;)