python/pandas: moved a script, get attribute error - python-2.7

I have a working script on one system (Python 2.7, pandas 0.16), but when I moved it to a different system with Python 2.7 and pandas 0.17, the following line:
df["DATE"] = df["DATE"].map(lambda x: pd.datetools.parse(x))
generates the following error
AttributeError: 'module' object has no attribute 'parse'
I tried to remove pandas 0.17 and install 0.16, but the wheel file pandas.0.16.2-cp27-none-win32.whl "is not a supported wheel on this platform".
It looks like a versioning issue. Is there anything else I can try?
Thanks

Use to_datetime:
df["DATE"] = pd.to_datetime(df["DATE"])
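A minimal sketch of that replacement, assuming the DATE column holds date strings; the sample values below are made up for illustration. pd.to_datetime converts the whole column in one call, so the per-row lambda is no longer needed.

```python
import pandas as pd

# Hypothetical sample data standing in for the original DATE column.
df = pd.DataFrame({"DATE": ["2015-01-02", "2015-03-04"]})

# Convert the whole column at once instead of mapping a parser over each row.
df["DATE"] = pd.to_datetime(df["DATE"])

print(df["DATE"].dtype)
```

If some values cannot be parsed, passing errors="coerce" to pd.to_datetime turns them into NaT instead of raising.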

set_glue_version exception after upgrading aws-glue-sessions

Interactive Glue sessions in a Jupyter notebook worked correctly with aws-glue-sessions version 0.32 installed. After upgrading to version 0.35 with pip3 install --upgrade jupyter boto3 aws-glue-sessions, the kernel would not start. It failed in GlueKernel.py, line 443, in set_glue_version, with Exception: Valid Glue versions are {'3.0', '2.0'}.
Reverting to version 0.32 resolves the issue. I tried installing 0.35, 0.34, and 0.33 and got the same error each time, which makes me think I'm doing something wrong or misunderstanding something, rather than this being a problem in the product. Is there anything additional I need to do to upgrade aws-glue-sessions?
Obviously this is not a good workaround, but it worked for me (I'm on Windows).
I went into the file GlueKernel.py in the directory \site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark and hard-coded the second line of this function to set the version to "3.0":
def set_glue_version(self, glue_version):
    # Workaround: ignore the passed-in value and force Glue 3.0
    glue_version = str("3.0")
    if glue_version not in VALID_GLUE_VERSIONS:
        raise Exception(f"Valid Glue versions are {VALID_GLUE_VERSIONS}")
    self.glue_version = glue_version
I am a bit lost here as well, and confused. I'll add that I am a Python newbie and that I am running the whole thing on Windows. AWS has an article that describes the installation, so I am assuming it's supported. I get the same error as #theOtherOne:
line 443 in set_glue_version Exception: Valid Glue versions are {'3.0', '2.0'}
I checked GlueKernel.py of glue_pyspark and found this code:
def _retrieve_os_env_variable(self, key):
    _, output = subprocess.getstatusoutput(f"echo ${key}")
    return output or os.environ.get(key)
When I run the code below manually, the final result is the literal string $GLUE_VERSION, which obviously doesn't match '2.0' or '3.0'. The syntax for reading environment variables in the Windows shell is different (cmd.exe expects %GLUE_VERSION% rather than $GLUE_VERSION), so echo returns its argument unchanged. If my understanding is correct, this will never work on Windows. Maybe I am the only one who wants to run it on Windows and no one else cares? I got it to work on WSL, but I still lost quite some time trying to fix something that cannot be fixed (or can it?).
import subprocess
import os
_, output = subprocess.getstatusoutput(f"echo $GLUE_VERSION")
osoutput = os.environ.get("GLUE_VERSION")
print(output)  # $GLUE_VERSION
print(osoutput)  # '3.0'
print(output or osoutput)  # $GLUE_VERSION
So the issue seems to be that GLUE_VERSION is not set in the environment variables. Once it is set, it works.
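For comparison, a shell-free lookup might look like the sketch below. This is an illustration only, not the code shipped in aws-glue-sessions; retrieve_env_variable and its default parameter are hypothetical names. On Windows the echo call returns the literal text $GLUE_VERSION, which is truthy, so output or os.environ.get(key) never reaches the os.environ lookup. Reading os.environ directly avoids the shell altogether.

```python
import os


def retrieve_env_variable(key, default=None):
    """Hypothetical helper: look up an environment variable without a shell."""
    # os.environ is populated by Python itself, so the lookup behaves the
    # same on Windows, Linux, and macOS.
    value = os.environ.get(key)
    return value if value else default


os.environ["GLUE_VERSION"] = "3.0"  # set here only for the demonstration
print(retrieve_env_variable("GLUE_VERSION"))
print(retrieve_env_variable("SOME_UNSET_VARIABLE_FOR_DEMO", "not set"))
```

The variable still has to be set before the kernel starts; the sketch only shows a lookup that does not depend on shell syntax.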

Getting "unmarshal failed" when trying to create first website post in Hugo after installation

I'm following the instructions at Hugo's Quickstart guide (https://gohugo.io/getting-started/quick-start/) but I keep getting this error message when I try to create a post:
unmarshal failed: Near line 1 (last key parsed 'theme'): expected value but found '\\' instead
I've posted some lines of my code below. The error message appears at the bottom. Could anyone help point out what I am doing wrong?
C:\Users\Scott\quickstart\MyHugoBlog\themes>git init
Initialized empty Git repository in C:/Users/Scott/quickstart/MyHugoBlog/themes/.git/
C:\Users\Scott\quickstart\MyHugoBlog\themes>git submodule add https://github.com/dashdashzako/paperback.git
Cloning into 'C:/Users/Scott/quickstart/MyHugoBlog/themes/paperback'...
remote: Enumerating objects: 16, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 194 (delta 3), reused 9 (delta 1), pack-reused 178
Receiving objects: 100% (194/194), 466.30 KiB | 5.62 MiB/s, done.
Resolving deltas: 100% (93/93), done.
warning: LF will be replaced by CRLF in .gitmodules.
The file will have its original line endings in your working directory
C:\Users\Scott\quickstart\MyHugoBlog\themes>echo theme = \"paperback\" >> config.toml
C:\Users\Scott\quickstart\MyHugoBlog\themes>hugo new posts/my-first-post.md
Error: "C:\Users\Scott\quickstart\MyHugoBlog\themes\config.toml:1:1": unmarshal failed: Near line 1 (last key parsed 'theme'): expected value but found '\\' instead
It looks like you're following instructions meant for Unix-like systems on Windows. This command isn't doing what you want:
echo theme = \"paperback\" >> config.toml
Using Bash on Linux, for example, this appends
theme = "paperback"
to your config.toml file, creating it if necessary. That's what Hugo expects to find in the file.
However, using cmd.exe on Windows I get the backslashes included:
theme = \"paperback\"
And using PowerShell, I get something even stranger:
theme
=
\paperback\
Neither of these looks like valid TOML to me, and both contain extraneous backslashes as referenced in your error message. I suggest you simply edit config.toml using your favourite text editor and add the expected
theme = "paperback"
line manually.
The issue on my end was that the file wasn't created as UTF-8.
Delete the config.toml file, recreate it manually in your text editor, and paste in the content, for example: theme = "ananke"
That should work.
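If you would rather script this than use an editor, both problems (shell quoting and file encoding) can be sidestepped by writing the line from Python. A minimal sketch, assuming a config.toml that only needs the theme line appended; append_theme is a hypothetical helper, and the demonstration writes to a temporary directory instead of a real site.

```python
from pathlib import Path
import tempfile


def append_theme(config_path, theme):
    """Hypothetical helper: append a theme line to a Hugo config.toml."""
    line = f'theme = "{theme}"\n'
    # Append with an explicit encoding so the result is UTF-8 regardless
    # of the platform default.
    with open(config_path, "a", encoding="utf-8", newline="\n") as handle:
        handle.write(line)


# Demonstrate on a throwaway directory rather than a real site.
with tempfile.TemporaryDirectory() as site_dir:
    config = Path(site_dir) / "config.toml"
    append_theme(config, "paperback")
    written = config.read_text(encoding="utf-8")

print(written)
```

Because no shell is involved, the quotes reach the file exactly as written, with no backslashes.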

Kaggle API in colab datasets `!kaggle datasets list` error

I get an error I don't understand when trying to list Kaggle datasets in Google Colab.
Notebook config: Python 3.x, no hardware acceleration.
#to upload my kaggle token
from google.colab import files
files.upload()
#setting up the token
!pip install --upgrade kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
#and taking a look at datasets
!kaggle datasets list
Traceback (most recent call last):
File "/usr/local/bin/kaggle", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/kaggle/cli.py", line 51, in main
out = args.func(**command_args)
File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 940, in dataset_list_cli
max_size, min_size)
File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 905, in dataset_list
return [Dataset(d) for d in datasets_list_result]
File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 905, in <listcomp>
return [Dataset(d) for d in datasets_list_result]
File "/usr/local/lib/python3.6/dist-packages/kaggle/models/kaggle_models_extended.py", line 67, in __init__
self.size = File.get_size(self.totalBytes)
File "/usr/local/lib/python3.6/dist-packages/kaggle/models/kaggle_models_extended.py", line 107, in get_size
while size >= 1024 and suffix_index < 4:
TypeError: '>=' not supported between instances of 'NoneType' and 'int'
I would like to understand what happened and how to fix it. Thanks in advance.
jet.
I am encountering this problem as well. I noticed that this call works:
kaggle datasets list --min-size 1
Note that you will need version 1.5.6. I had 1.5.4 on a Colab instance, and that version didn't support that argument.
The problem seems to be that bigquery/crypto-litecoin has no data. As a consequence, it looks like totalBytes is None in Dataset.
I've opened an issue on GitHub and will create a PR. If you want a temporary workaround, you can grab the file from my fork; use your traceback to determine where to put it. Alternatively, just use --min-size 1 so that datasets with no data files are ignored.
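To illustrate the failure, here is a minimal sketch of a size formatter with a None guard. It is modeled on the loop visible in the traceback (while size >= 1024 and suffix_index < 4) but is not the actual patch from the fork or the PR; format_size is a hypothetical name, and treating None as zero bytes is an assumption.

```python
def format_size(size, precision=0):
    """Hypothetical None-safe variant of a byte-size formatter."""
    suffixes = ["B", "KB", "MB", "GB", "TB"]
    # Assumption: treat a missing size (None) as zero bytes.
    if size is None:
        size = 0
    suffix_index = 0
    while size >= 1024 and suffix_index < 4:
        suffix_index += 1
        size /= 1024.0
    return "%.*f%s" % (precision, size, suffixes[suffix_index])


print(format_size(None))
print(format_size(2048))
```

Without the guard, the comparison None >= 1024 raises the TypeError shown in the question.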
I ran into the same problem.
Generate the Kaggle JSON API file: click the widget/icon in the top right corner -> "Account" -> scroll down to the "API" subsection -> click "Expire API Token" -> click "Create New API Token".
In Google Colab, upload your JSON file.
Run the following code:
# first upload the Kaggle API file "kaggle.json"
import os
# this path contains the JSON file
os.environ['KAGGLE_CONFIG_DIR'] = "/content"
# Find the competition or dataset under Data. Like this:
!kaggle competitions download -c jane-street-market-prediction
This worked for me after a lot of banging my head against the wall.
If you get errors still, you may need to link your Colab and Kaggle accounts. You can do this in the account settings portion of kaggle.

Tensorflow- bidirectional_dynamic_rnn: Attempt to reuse RNNCell

The following code (taken from - https://github.com/dennybritz/tf-rnn/blob/master/bidirectional_rnn.ipynb)
import tensorflow as tf
import numpy as np
tf.reset_default_graph()
# Create input data
X = np.random.randn(2, 10, 8)
# The second example is of length 6
X[1,6:] = 0
X_lengths = [10, 6]
cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
outputs, states = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=cell,
    cell_bw=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)
output_fw, output_bw = outputs
states_fw, states_bw = states
gives the following error with TensorFlow 1.1 on both Python 2.7 and 3.5:
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.LSTMCell object at 0x10ce0c2b0>
with a different variable scope than its first use. First use of cell was with scope
'bidirectional_rnn/fw/lstm_cell', this attempt is with scope 'bidirectional_rnn/bw/lstm_cell'.
Please create a new instance of the cell if you would like it to use a different set of weights.
If before you were using: MultiRNNCell([LSTMCell(...)] * num_layers), change to:
MultiRNNCell([LSTMCell(...) for _ in range(num_layers)]). If before you were using the same cell
instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances
(one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use
existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation,
so this error will remain until then.)
But it works with TensorFlow 1.0.1 on Python 3.5 (I did not test on Python 2.7).
I tried multiple code examples I found online, but tf.nn.bidirectional_dynamic_rnn gives the same error with TensorFlow 1.1.
Is there a bug in TensorFlow 1.1, or am I just missing something?
Sorry you ran into this. I can confirm that the error appears in 1.1 (docker run -it gcr.io/tensorflow/tensorflow:1.1.0 python) but not in 1.2 RC0 (docker run -it gcr.io/tensorflow/tensorflow:1.2.0-rc0 python).
So it looks like either 1.2-rc0 or 1.0.1 are your options for the moment.
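The advice in the error message comes down to object identity in Python. A minimal sketch, using a hypothetical Cell class in place of LSTMCell: list multiplication repeats a reference to one object, whereas a comprehension builds a new object on each iteration. Passing the same cell as both cell_fw and cell_bw shares a single instance in the same way.

```python
class Cell:
    """Hypothetical stand-in for an RNN cell that owns its own weights."""


shared = [Cell()] * 2                  # one object referenced twice
separate = [Cell() for _ in range(2)]  # two independent objects

print(shared[0] is shared[1])
print(separate[0] is separate[1])
```

Going by the error message, creating two LSTMCell instances (one for cell_fw, one for cell_bw) should also avoid the error on 1.1, although the thread does not confirm that this was tested.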

TensorFlow unsupported operand

I want to use the TensorFlow package to run models/rnn/ptb/ptb_word_lm.py.
However, I get this error in seq2seq.py, line 653, in sequence_loss_by_example:
log_perps /= total_size
TypeError: unsupported operand type(s) for /=: 'Tensor' and 'Tensor'
I am using Ubuntu and Python 2.7.
Are you running the released version of TensorFlow with a post-release model, by chance? This sounds a lot like GitHub issue 293. My suggestion would be one of the following: (a) update your install; (b) try removing the from __future__ import division line from the top of the file; or (c) change the line to call the underlying function directly: log_perps = tf.div(log_perps, total_size).
(b) or (c) is the fastest fix, but in the long run, I'd go with (a).
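Why (b) can help: with from __future__ import division, Python 2 routes / and /= to __truediv__ instead of __div__. The sketch below assumes the installed Tensor class defined only __div__; that is an assumption, not something confirmed in this thread. OldStyleTensor is a hypothetical stand-in. Python 3 always uses true division, so running the sketch there reproduces the same kind of failure.

```python
class OldStyleTensor:
    """Hypothetical class that defines only the classic division hook."""

    def __init__(self, value):
        self.value = value

    def __div__(self, other):
        # Used by `/` on Python 2 only when true division is NOT enabled.
        return OldStyleTensor(self.value / other.value)


a, b = OldStyleTensor(6.0), OldStyleTensor(3.0)
try:
    a /= b  # true division looks for __itruediv__ / __truediv__
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

Calling a named function such as tf.div, as in (c), bypasses operator dispatch entirely, which is why it works regardless of the division mode.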