How to delete non-contiguous rows using xlwings 0.18.0?

How can I delete non-contiguous rows using xlwings 0.18.0?
Using VBA I can do this: Range("1:7, 9:9").EntireRow.Delete
'''
Sub Delete_NonContigous_Rows_Using_VBA()
    'Delete rows 1,2,3 ... 7 and 9
    Range("1:7, 9:9").EntireRow.Delete
End Sub
'''
How can I do this in xlwings==0.18.0?
import xlwings as xw
xw.Range("1:7", "9:9").api.Delete()

I found this
xw.Range(xw.Range('1:7'), xw.Range('9:9')).delete()
But I'm stuck on how to count the columns:
xw.api.UsedRange.Columns.count
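One possible approach (an untested sketch, Windows only since it goes through the pywin32 .api layer; the workbook and sheet names below are placeholders) is to hand the same multi-area address to the underlying COM Range object, mirroring the VBA call:
import xlwings as xw

wb = xw.Book('example.xlsx')   # placeholder workbook name
sht = wb.sheets['Sheet1']      # placeholder sheet name

# The COM Range object accepts the same multi-area address string as VBA,
# so this deletes rows 1-7 and row 9 in a single call (Windows only, via pywin32).
sht.api.Range("1:7, 9:9").EntireRow.Delete()

# Column count of the used range, if that is still needed:
n_cols = sht.api.UsedRange.Columns.Count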

Related

Is there a code for separating alphabets from integers from a string in an excel sheet using pandas? [duplicate]

This question already has answers here:
How to split a column into alphabetic values and numeric values from a column in a Pandas dataframe?
I'm working on a company project; the team collected data and put it in an Excel sheet. They want me to separate the integers from the alphabetic characters in the Barcode_Number column using regex. Is there a way I can do that for all the values under the Barcode_Number column?
import pandas as pd
import numpy as np
import re

data = pd.read_excel(r'C:\Users\yanga\Gaussian\SEC - 6. Yanga Deliverables\Transmission\Raw\3000_2- processed.xlsx')
data.head()

# Extract the column you want to work with
df = pd.DataFrame(data, columns=['Barcode_Number'])

# Identify the null values
df.isnull().sum()

# Remove all the null values
df.dropna(how='all', inplace=True)

# Select cells that contain non-digit values
df1 = df[df['Barcode_Number'].str.contains(r'^\D', na=False)]
For example, if I have a list of values under the column Barcode_Number:
Barcode_Number
'VQA435'
'KSR436'
'LAR437'
'ARB438'
and I want the output to be like this:
'VQA', '435'
'KSR', '436'
'LAR', '437'
'ARB', '438'
import pandas as pd
df = pd.read_csv(filename)  # filename: path to your data file
# Split the letters and the digits into two new columns using capture groups
df[["Code", "Number"]] = df["Barcode_Number"].str.extract(r"([A-Z]+)([0-9]+)")
print(df)
Output:
Barcode_Number Code Number
0 VQA435 VQA 435
1 KSR436 KSR 436
2 LAR437 LAR 437
3 ARB438 ARB 438
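Since the data in the question actually lives in an .xlsx file rather than a CSV, the same str.extract call can be applied after pd.read_excel. A small sketch, reusing the path from the question (the Code/Number column names are just illustrative):
import pandas as pd

# Read the workbook directly (path taken from the question)
df = pd.read_excel(r'C:\Users\yanga\Gaussian\SEC - 6. Yanga Deliverables\Transmission\Raw\3000_2- processed.xlsx')

# Drop rows with no barcode, then split letters and digits into two new columns
df = df.dropna(subset=['Barcode_Number'])
df[['Code', 'Number']] = df['Barcode_Number'].str.extract(r'([A-Z]+)([0-9]+)')
print(df[['Barcode_Number', 'Code', 'Number']].head())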

How to get a formula result in Excel using xlwings

What I want to do is 1) get a formula result in Excel and 2) update the values in the existing Excel file. I created and wrote the formula using "xlsxwriter", but when I tried openpyxl (or pandas) to retrieve the formula result, it returned 0. I want to use "xlwings" to solve this problem, but I have no idea how to do it. Can anyone help?
#openpyx
wb = openpyxl.load_workbook(filename=xlsx_name,data_only=True)
ws = wb.get_sheet_by_name("sheet1")
print "venn_value",(ws.cell('X2').value)
#pandas
fold_merge_data=pd.read_excel(xlsx_name,sheetname=1)
print fold_merge_data['Venn diagram'][:10]
Yes, xlwings can solve this problem for you because it uses pywin32 objects to interact with Excel, rather than just reading/writing xlsx or csv documents like openpyxl and pandas. This way, Excel actually executes the formula, and xlwings grabs the result.
In order to get the value you can do:
import xlwings as xw
sheet = xw.sheets.active # if the document is open
#otherwise use sheet = xw.Book(r'C:/path/to/file.xlsx').sheets['sheetname']
result = sheet['X2'].value
Also, note that you can set the formula using, for example
sheet['A1'].value = '=1+1' # or ='B1*2' if you want to reference other cells
import xlwings as xw
sheet = xw.sheets['Sheet1']  # sheet of the active workbook
a2_formula = sheet.range('A2').formula
sheet.range('A2:A300').formula = a2_formula  # it copies the formula with relative references
You can use this method to copy a formula or a value.
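Putting the two answers together, a minimal end-to-end sketch (the path, sheet name, and target cell are placeholders) that opens the workbook, lets Excel evaluate the formula, reads the computed result, and writes it back to the existing file:
import xlwings as xw

wb = xw.Book(r'C:/path/to/file.xlsx')   # placeholder path, as in the first answer
sheet = wb.sheets['sheet1']             # sheet name from the question

result = sheet['X2'].value              # Excel has already evaluated the formula, so this is the value
print('venn_value', result)

sheet['Y2'].value = result              # hypothetical target cell for the computed value
wb.save()                               # update the existing file
wb.close()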

How to Pass Arguments from xlwings to VBA Excel Macro?

I was looking at How do I call an Excel macro from Python using xlwings?, and I understand it's not fully supported, but I would like to know if there is a way to do this.
Something like:
from xlwings import Workbook, Application
wb = Workbook(...)
Application(wb).xl_app.Run("your_macro("%Args%")")
This can be done by doing what you propose. However, please keep in mind that this solution will not be cross-platform (Win/Mac). I'm on Windows, so the code below has to be adjusted to appscript on Mac: http://docs.xlwings.org/en/stable/missing_features.html
The VBA script can be called as follows:
linked_wb.xl_workbook.Application.Run("vba_script", variable_to_pass)
Example:
Let's say you have a list of strings that should be used in a Data Validation list in Excel.
Python:
from xlwings import Workbook
linked_wb = Workbook.caller()
animals = ['cat', 'dog', 'snake', 'bird']
animal_list = ""
for animal in animals:
    animal_list += animal + "|"
linked_wb.xl_workbook.Application.Run("create_validation", animal_list)
Excel VBA:
Public Sub create_validation(validation_list)
    Dim validation_split() As String
    validation_split = Split(validation_list, "|")
    'The drop-down validation can only be created with 1-dimension array.
    'We get 1-D from the Split above
    With Sheet1.Range("A1").Validation
        .Delete
        .Add Type:=xlValidateList, AlertStyle:=xlValidAlertStop, _
            Operator:=xlBetween, Formula1:=Join(validation_split, ",")
    End With
End Sub
Python Example:
import xlwings as xw
wb_path = r"set_wb_path"                    # path to the workbook
wb = xw.Book(wb_path)
app = wb.app
variable_to_pass = 'test'
macro = wb.macro('moduleName.macroName')    # module and macro names passed as a string
macro(variable_to_pass)
wb.app.quit()
# or
wb.close()
As long as your VBA function accepts a variable and you pass it the same type of variable (str, int, list), this will work.
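For example (a sketch under the assumption that a macro with this name and signature exists in the workbook), the callable returned by wb.macro simply forwards its positional arguments to the VBA parameters:
import xlwings as xw

wb = xw.Book(r"set_wb_path")                  # placeholder path, as above
# Hypothetical macro: Public Sub fill_cell(addr As String, val As Variant)
fill_cell = wb.macro('Module1.fill_cell')
fill_cell("A1", 42)                           # positional arguments map onto the VBA parameters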

Pandas: Iterate on a column one row at a time to automate a google search?

I am trying to automate 100 Google searches (one per string in each row, returning the URLs for each query) on a specific column in a csv (via Python 2.7); however, I am unable to get pandas to feed the row contents to the Google Search automator.
*GoogleSearch source = https://breakingcode.wordpress.com/2010/06/29/google-search-python/
Overall, I can print Urls successfully for a query when I utilize the following code:
from google import search
query = "apples"
for url in search(query, stop=5, pause=2.0):
    print(url)
However, when I add pandas (to read each "query"), the rows are not read and queried as intended, i.e. the literal string "data.irow(n)" is queried instead of the row contents, one at a time.
from google import search
import pandas as pd
from pandas import DataFrame

query_performed = 0
querying = True
query = 'data.irow(n)'

# read the excel file at column 2 (i.e. "Fruit")
df = pd.read_csv('C:\Users\Desktop\query_results.csv', header=0, sep=',', index_col='Fruit')

# need to specify "Column2" and one "data.irow(n)" queried at a time
while querying:
    if query_performed <= 100:
        print("query")
        query_performed += 1
    else:
        querying = False
        print("Asked all 100 query's")

# prints initial urls for each "query" in a google search
for url in search(query, stop=5, pause=2.0):
    print(url)
Incorrect output I receive at the command line:
query
Asked all 100 query's
query
Asked all 100 query's
Asked all 100 query's
http://www.irondata.com/
http://www.irondata.com/careers
http://transportation.irondata.com/
http://www.irondata.com/about
http://www.irondata.com/public-sector/regulatory/products/versa
http://www.irondata.com/contact-us
http://www.irondata.com/public-sector/regulatory/products/cavu
https://www.linkedin.com/company/iron-data-solutions
http://www.glassdoor.com/Reviews/Iron-Data-Reviews-E332311.htm
https://www.facebook.com/IronData
http://www.bloomberg.com/research/stocks/private/snapshot.asp?privcapId=35267805
http://www.indeed.com/cmp/Iron-Data
http://www.ironmountain.com/Services/Data-Centers.aspx
FYI: My Excel .CSV format is the following:
B
1 **Fruit**
2 apples
2 oranges
4 mangos
5 mangos
6 mangos
...
101 mangos
Any advice on next steps is greatly appreciated! Thanks in advance!
Here's what I got. Like I mentioned in my comment, I couldn't get the stop parameter to work like I thought it should. Maybe I'm misunderstanding how it's used. I'm assuming you only want the first 5 urls per search.
A sample df:
import pandas as pd
d = {"B": ["mangos", "oranges", "apples"]}
df = pd.DataFrame(d)
Then
stop = 5
urlcols = ["C", "D", "E", "F", "G"]
# Here I'm using an apply() to call the google search for each 'row',
# and a list is built from the urls returned by search()
df[urlcols] = df["B"].apply(lambda fruit: pd.Series([url for url in
    search(fruit, stop=stop, pause=2.0)][:stop]))  # get 5 by slicing
which gives you (the formatting is a bit rough here):
B C D E F G
0 mangos http://en.wikipedia.org/wiki/Mango http://en.wikipedia.org/wiki/Mango_(disambigua... http://en.wikipedia.org/wiki/Mangifera http://en.wikipedia.org/wiki/Mangifera_indica http://en.wikipedia.org/wiki/Purple_mangosteen
1 oranges http://en.wikipedia.org/wiki/Orange_(fruit) http://en.wikipedia.org/wiki/Bitter_orange http://en.wikipedia.org/wiki/Valencia_orange http://en.wikipedia.org/wiki/Rutaceae http://en.wikipedia.org/wiki/Cherry_Orange
2 apples https://www.apple.com/ http://desmoines.citysearch.com/review/692986920 http://local.yahoo.com/info-28919583-apple-sto... http://www.judysbook.com/Apple-Store-BtoB~Cell... https://tr.foursquare.com/v/apple-store/4b466b...
If you'd rather not specify the columns (i.e. ["C", "D", ...]) you could do the following:
df.join(df["B"].apply(lambda fruit : pd.Series([url for url in
search(fruit, stop=stop, pause=2.0)][:stop])))
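If you only need to print the URLs per row rather than store them in new columns, a plain loop over the column works as well. A small sketch, assuming the same search function from the google package referenced in the question and the Fruit column from the CSV (the path is hypothetical):
from google import search
import pandas as pd

df = pd.read_csv('query_results.csv')   # hypothetical path; the "Fruit" column holds the queries

for fruit in df["Fruit"]:
    print(fruit)
    for url in search(fruit, stop=5, pause=2.0):
        print(url)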

Python - Create An Empty Pandas DataFrame and Populate From Another DataFrame Using a For Loop

Using: Python 2.7 and Pandas 0.11.0 on Mac OSX Lion
I'm trying to create an empty DataFrame and then populate it from another dataframe, based on a for loop.
I have found that when I construct the DataFrame and then use the for loop as follows:
data = pd.DataFrame()
for item in cols_to_keep:
    if item not in dummies:
        data = data.join(df[item])
This results in an empty DataFrame, but with the headers of the appropriate columns that were to be added from the other DataFrame.
That's because you are using join incorrectly: joining onto an empty DataFrame aligns on its (empty) index, so none of the rows from the other DataFrame survive, only the column labels.
You can use a list comprehension to restrict the DataFrame to the columns you want:
df[[col for col in cols_to_keep if col not in dummies]]
What about just creating a new frame based off of the columns you know you want to keep, instead of creating an empty one first?
import pandas as pd
import numpy as np

df = pd.DataFrame({'a': np.random.randn(5),
                   'b': np.random.randn(5),
                   'c': np.random.randn(5),
                   'd': np.random.randn(5)})
cols_to_keep = ['a', 'c', 'd']
dummies = ['d']
not_dummies = [x for x in cols_to_keep if x not in dummies]
data = df[not_dummies]
data
a c
0 2.288460 0.698057
1 0.097110 -0.110896
2 1.075598 -0.632659
3 -0.120013 -2.185709
4 -0.099343 1.627839