IOError: [Errno 13] Permission denied: while exporting to .csv - python-2.7

I'm getting this error while trying to export data to a .csv file.
I've tried running the application as administrator, but it did not work.
Please help a rookie!
Here's the code:
import pandas as pd
from pandas_datareader import data as wb  # DataReader lives in pandas_datareader, not pandas itself

tickers = ['AAPL', 'MSFT', 'XOM', 'BP']
portfolio_selection = pd.DataFrame()
for t in tickers:
    # fetch each ticker's closing prices as one column (the original overwrote the frame on every pass)
    portfolio_selection[t] = wb.DataReader(t, 'google', start='2005-1-1')['Close']
portfolio_selection
portfolio_selection.to_csv('C:\Users\PC\Documents\Lucas\Random_Folder')
Here's what I've got:
--------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-6-0b1cec90f143> in <module>()
----> 1 portfolio_selection.to_csv('C:\Users\PC\Documents\Lucas\Random_Folder')
C:\Users\Pichau\Anaconda2\lib\site-packages\pandas\core\frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
1411 doublequote=doublequote,
1412 escapechar=escapechar, decimal=decimal)
-> 1413 formatter.save()
1414
1415 if path_or_buf is None:
C:\Users\Pichau\Anaconda2\lib\site-packages\pandas\io\formats\format.pyc in save(self)
1566 f, handles = _get_handle(self.path_or_buf, self.mode,
1567 encoding=self.encoding,
-> 1568 compression=self.compression)
1569 close = True
1570
C:\Users\Pichau\Anaconda2\lib\site-packages\pandas\io\common.pyc in _get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text)
374 if compat.PY2:
375 # Python 2
--> 376 f = open(path_or_buf, mode)
377 elif encoding:
378 # Python 3 and encoding
IOError: [Errno 13] Permission denied: 'C:\Users\PC\Documents\Lucas\Random_Folder'

I'm not sure what the error would look like on Windows, but I imagine it's because you need a file name. (On a Mac, your code would throw an IsADirectoryError: [Errno 21] Is a directory: Random_Folder.)
Something like this should fix it:
portfolio_selection.to_csv('C:\Users\PC\Documents\Lucas\Random_Folder\portfolio_selection.csv')
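If you also want to sidestep backslash-escape surprises in the path, a raw string plus os.path.join works as well (the file name here is just an example):
import os

# r'' keeps the backslashes literal; the directory must already exist
out_path = os.path.join(r'C:\Users\PC\Documents\Lucas\Random_Folder', 'portfolio_selection.csv')
portfolio_selection.to_csv(out_path)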

Related

Querying BigQuery table in AI platform notebooks

I'm stuck using a query in my Jupyter notebook on GCP.
The query works fine in BigQuery when I run it there.
When I run it in my notebook using this code:
query = """SELECT *, FROM [kaggle-competition-datasets:geotab_intersection_congestion.train] LIMIT 4"""
import google.datalab.bigquery as bq
train = bq.Query(query).execute().result().to_dataframe()
I get this error.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/usr/local/lib/python3.5/dist-packages/google/datalab/bigquery/_query.py in execute_async(self, output_options, sampling, context, query_params)
278 try:
--> 279 destination = query_result['configuration']['query']['destinationTable']
280 table_name = (destination['projectId'], destination['datasetId'], destination['tableId'])
KeyError: 'destinationTable'
During handling of the above exception, another exception occurred:
Exception Traceback (most recent call last)
<ipython-input-1-85e1ffdadde6> in <module>
1 query = """SELECT *, FROM [kaggle-competition-datasets:geotab_intersection_congestion.train] LIMIT 4"""
2 import google.datalab.bigquery as bq
----> 3 train = bq.Query(query).execute().result().to_dataframe()
/usr/local/lib/python3.5/dist-packages/google/datalab/bigquery/_query.py in execute(self, output_options, sampling, context, query_params)
337 """
338 return self.execute_async(output_options, sampling=sampling, context=context,
--> 339 query_params=query_params).wait()
340
341 #staticmethod
/usr/local/lib/python3.5/dist-packages/google/datalab/bigquery/_query.py in execute_async(self, output_options, sampling, context, query_params)
281 except KeyError:
282 # The query was in error
--> 283 raise Exception(_utils.format_query_errors(query_result['status']['errors']))
284
285 execute_job = _query_job.QueryJob(job_id, table_name, sql, context=context)
Exception: invalidQuery: Syntax error: Unexpected "[" at [1:16]. If this is a table identifier, escape the name with `, e.g. `table.name` rather than [table.name].
Of course I modified the query as suggested by the traceback, but nothing works. What is the problem? Do notebooks on GCP access BigQuery tables in a different manner?
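For what it's worth, the error message itself points at the fix: the bracketed [project:dataset.table] form is legacy SQL, while the datalab client expects standard SQL. A sketch of the rewritten query (backticks, a dot instead of the colon, and the stray comma after * removed), assuming nothing else is wrong with the setup:
import google.datalab.bigquery as bq

# standard-SQL form of the same query
query = """
SELECT *
FROM `kaggle-competition-datasets.geotab_intersection_congestion.train`
LIMIT 4
"""
train = bq.Query(query).execute().result().to_dataframe()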

Accessing a bz2 file in S3 from Sagemaker notebook

I am able to read and write CSV files to and from an S3 bucket from a SageMaker notebook, but when I try to read a bz2 file using the same path style I use for the CSV files, I get a "no such file or directory" error:
IOErrorTraceback (most recent call last)
<ipython-input-19-d14d47a702e1> in <module>()
2 # Create corpus
3 #%time wiki = WikiCorpus("resources/articles1.xml.bz2", tokenizer_func=spacy_tokenize)
----> 4 wiki = WikiCorpus("s3://sagemakerq/enwiki.xml.bz2", tokenizer_func=spacy_tokenize)
/home/ec2-user/anaconda3/envs/amazonei_mxnet_p27/lib/python2.7/site-packages/gensim/corpora/wikicorpus.pyc in __init__(self, fname, processes, lemmatize, dictionary, filter_namespaces, tokenizer_func, article_min_tokens, token_min_len, token_max_len, lower, filter_articles)
634
635 if dictionary is None:
--> 636 self.dictionary = Dictionary(self.get_texts())
637 else:
638 self.dictionary = dictionary
/home/ec2-user/anaconda3/envs/amazonei_mxnet_p27/lib/python2.7/site-packages/gensim/corpora/dictionary.pyc in __init__(self, documents, prune_at)
82
83 if documents is not None:
---> 84 self.add_documents(documents, prune_at=prune_at)
85
86 def __getitem__(self, tokenid):
/home/ec2-user/anaconda3/envs/amazonei_mxnet_p27/lib/python2.7/site-packages/gensim/corpora/dictionary.pyc in add_documents(self, documents, prune_at)
195
196 """
--> 197 for docno, document in enumerate(documents):
198 # log progress & run a regular check for pruning, once every 10k docs
199 if docno % 10000 == 0:
/home/ec2-user/anaconda3/envs/amazonei_mxnet_p27/lib/python2.7/site-packages/gensim/corpora/wikicorpus.pyc in get_texts(self)
676 ((text, self.lemmatize, title, pageid, tokenization_params)
677 for title, text, pageid
--> 678 in extract_pages(bz2.BZ2File(self.fname), self.filter_namespaces, self.filter_articles))
679 pool = multiprocessing.Pool(self.processes, init_to_ignore_interrupt)
680
IOError: [Errno 2] No such file or directory: 's3://sagemakerq/enwiki.xml.bz2'
It looks like you are using the Python gensim package to construct a corpus from a wiki database dump stored in S3. The package does not support reading directly from S3; instead, you can download the file locally and work with it:
import boto3
from gensim.corpora.wikicorpus import WikiCorpus

# download the dump from S3 to local disk, then let gensim open it from there
s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')
wiki = WikiCorpus('FILE_NAME')
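With the bucket and key from your traceback, that would look roughly like this (the local file name is arbitrary, and tokenizer_func is the one you already pass):
s3.download_file('sagemakerq', 'enwiki.xml.bz2', 'enwiki.xml.bz2')
wiki = WikiCorpus('enwiki.xml.bz2', tokenizer_func=spacy_tokenize)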

Keep Getting Permission denied when using fastai library on AWS setting

I'm learning deep learning by taking a course that uses fastai. I'm running the fastai library on an AWS p2.xlarge instance. When I run some fastai functions I get this error:
Traceback (most recent call last)
<ipython-input-12-1d86fc0ece07> in <module>()
1 arch = resnet34
2 data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch,sz ))
----> 3 learn = ConvLearner.pretrained(arch, data, precompute = True)
4 learn.fit(0.01, 2)
~/fastai/fastai/conv_learner.py in pretrained(cls, f, data, ps, xtra_fc, xtra_cut, custom_head, precompute, pretrained, **kwargs)
112 models = ConvnetBuilder(f, data.c, data.is_multi, data.is_reg,
113 ps=ps, xtra_fc=xtra_fc, xtra_cut=xtra_cut, custom_head=custom_head, pretrained=pretrained)
--> 114 return cls(data, models, precompute, **kwargs)
115
116 #classmethod
~/fastai/fastai/conv_learner.py in __init__(self, data, models, precompute, **kwargs)
95 def __init__(self, data, models, precompute=False, **kwargs):
96 self.precompute = False
---> 97 super().__init__(data, models, **kwargs)
98 if hasattr(data, 'is_multi') and not data.is_reg and self.metrics is None:
99 self.metrics = [accuracy_thresh(0.5)] if self.data.is_multi else [accuracy]
~/fastai/fastai/learner.py in __init__(self, data, models, opt_fn, tmp_name, models_name, metrics, clip, crit)
35 self.tmp_path = tmp_name if os.path.isabs(tmp_name) else os.path.join(self.data.path, tmp_name)
36 self.models_path = models_name if os.path.isabs(models_name) else os.path.join(self.data.path, models_name)
---> 37 os.makedirs(self.tmp_path, exist_ok=True)
38 os.makedirs(self.models_path, exist_ok=True)
39 self.crit = crit if crit else self._get_crit(data)
~/anaconda3/envs/fastai/lib/python3.6/os.py in makedirs(name, mode, exist_ok)
218 return
219 try:
--> 220 mkdir(name, mode)
221 except OSError:
222 # Cannot rely on checking for EEXIST, since the operating system
PermissionError: [Errno 13] Permission denied: 'data/dogscats/tmp'
I think the AWS console has no permission to make the directory.
I did sudo mkdir tmp data/dogscats/, but I got another error that I couldn't understand.
I think I have to give AWS some permission, but I have no clue how to do that.
I hope you can give me a clear idea of how to solve this kind of problem.
Fastai saves data such as the current loss in a folder that it creates. By default the folder is created in the working directory, but you can pass a path argument pointing to a location where you have the privileges to create a folder, as in the sketch below.
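For example, a minimal sketch based on the Learner signature visible in your traceback (tmp_name and models_name accept absolute paths; /home/ubuntu/scratch is just a hypothetical directory you can write to):
# point fastai's scratch folders at a writable location
learn = ConvLearner.pretrained(arch, data, precompute=True,
                               tmp_name='/home/ubuntu/scratch/tmp',
                               models_name='/home/ubuntu/scratch/models')
learn.fit(0.01, 2)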

IOError: [Errno 22] loading parquet file

I have parquet data like the sample data below. I'm trying to load it into a dataframe using the code below. The engine I'm using is pyarrow. I have other files that it works fine for, but when I try to load this file I get the error below. I'm new to parquet; does anyone see what the issue might be?
Code:
pd.read_parquet('/tmp/dt=20/09_0')
Error:
ArrowIOErrorTraceback (most recent call last)
<ipython-input-20-23dfd4ca529a> in <module>()
----> 1 view_df=pd.read_parquet('/data_tmp/view_coremetrics/dt=20180402/000119_0')
2 # view_df=pd.read_parquet('/data_tmp/000031_0')
3 print view_df.shape
4 view_df.head()
/data2/user1/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read_parquet(path, engine, columns, **kwargs)
255
256 impl = get_engine(engine)
--> 257 return impl.read(path, columns=columns, **kwargs)
/data2/user1/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read(self, path, columns, **kwargs)
128 kwargs['use_pandas_metadata'] = True
129 return self.api.parquet.read_table(path, columns=columns,
--> 130 **kwargs).to_pandas()
131
132 def _validate_write_lt_070(self, df):
/data2/user1/anaconda2/lib/python2.7/site-packages/pyarrow/parquet.pyc in read_table(source, columns, nthreads, metadata, use_pandas_metadata)
937 return fs.read_parquet(source, columns=columns, metadata=metadata)
938
--> 939 pf = ParquetFile(source, metadata=metadata)
940 return pf.read(columns=columns, nthreads=nthreads,
941 use_pandas_metadata=use_pandas_metadata)
/data2/user1/anaconda2/lib/python2.7/site-packages/pyarrow/parquet.pyc in __init__(self, source, metadata, common_metadata)
62 self.reader = ParquetReader()
63 source = _ensure_file(source)
---> 64 self.reader.open(source, metadata=metadata)
65 self.common_metadata = common_metadata
66 self._nested_paths_by_prefix = self._build_nested_paths()
_parquet.pyx in pyarrow._parquet.ParquetReader.open()
error.pxi in pyarrow.lib.check_status()
ArrowIOError: Arrow error: IOError: [Errno 22] Invalid argument
Data:
PAR1??x??xLҢ924217587908548913115647362798388396398451534680690245436174253535301832948328446784820483655304337818520249518423095384646626994297369124175421698306711617314169483532812118925257912118483068693626684028851435422618056045560553866002671256164797432939995779833592483738186675911756683298492596228339721443259180385356757426207851989658054881511280641692601503861637822470631692909600167537024514

Python2: the meaning of '!../'

Hi, I am studying Caffe with this tutorial (http://nbviewer.jupyter.org/github/BVLC/caffe/blob/tutorial/examples/00-caffe-intro.ipynb).
I don't know the meaning of '!../' in code like the following:
import os
import numpy as np  # needed for np.loadtxt below

# caffe_root is defined earlier in the tutorial notebook
if os.path.isfile(caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'):
    print 'CaffeNet found.'
else:
    print 'Downloading pre-trained CaffeNet model...'
    !../scripts/download_model_binary.py ../models/bvlc_reference_caffenet

# load ImageNet labels (for understanding the output)
labels_file = 'synset_words.txt'
if not os.path.exists(labels_file):
    print 'begin'
    !../home2/challege98/caffe/data/ilsvrc12/get_ilsvrc_aux.sh
    print 'finish'
labels = np.loadtxt(labels_file, str, delimiter='\t')
Could you explain it in detail? When I run the code, I get this error:
Downloading pre-trained CaffeNet model...
/bin/sh: 1: ../scripts/download_model_binary.py: not found
begin
/bin/sh: 1: ../home2/challege98/caffe/data/ilsvrc12/get_ilsvrc_aux.sh: not found
finish
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-19-8534d29d47f5> in <module>()
12 get_ipython().system(u'../home2/challege98/caffe/data/ilsvrc12/get_ilsvrc_aux.sh')
13 print 'finish'
---> 14 labels = np.loadtxt(labels_file, str, delimiter='\t')
15
16
/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.pyc in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack, ndmin)
856 fh = iter(bz2.BZ2File(fname))
857 elif sys.version_info[0] == 2:
--> 858 fh = iter(open(fname, 'U'))
859 else:
860 fh = iter(open(fname))
IOError: [Errno 2] No such file or directory: 'synset_words.txt'
In an IPython/Jupyter notebook, a leading exclamation point runs the rest of the line as a shell command.
The error you are seeing is because the file synset_words.txt does not exist, and it is not being created because the shell cannot find the script that creates it. Check that this path is correct: ../home2/challege98/caffe/data/ilsvrc12/get_ilsvrc_aux.sh
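For illustration, the ! prefix in a notebook cell is shorthand for the get_ipython().system(...) call you can see in your own traceback (ls .. is just an example command, not part of the tutorial):
# in a notebook cell, this...
!ls ..
# ...is roughly equivalent to this:
get_ipython().system(u'ls ..')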