How to correctly use the Table.Repeat function? - powerbi

I'm having some trouble getting the Table.Repeat function to work properly... I'm very new to PowerQuery/BI, so am just about getting my head wrapped around all the coding.
Following the syntax and it would appear everything is correct, given that the addition of columns is optional.
What I am aiming to achieve is to have the entire repeated a specific number of times, the repeat function described here sounds like it fits the bill. But when I have attempted to implement it, it results in an error.
I was previously using the Append Function, however, as I'm trying to append the query several thousand times, this results in the query crashing excel and has become uneditable after the initial setup.
I've tried implementing the Repeat code halfway through the query, where it's needed; and on a new sheet. Halfway through gave me an error stating: that it could not find a Value and that a table.
When I tired it on a new sheet, though I did not get an error, the applied sets disappeared and the data wasn't repeated. I tried the repeating tables, much lower I needed to test out, but this still went into error.
#"Repeat" = Table.Repeat(#"8-1", 2)
Essentially the entire table repeated X number of times.

Related

How to automatically feed a cell value from a range of values, based on its matching condition with other cell value

I'm making a time-spending tracker based on the work I do every hour of the day.
Now, suppose I have 28 types of work listed in my tracker (which I also have to increase from time to time), and I have about 8 significance values that I have decided to relate to these 28 types of work, predefined.
I want that, as soon as I enter a type of work in cell 1 - I want the adjacent cell 2 to get automatically populated with a significance value (from a range of 8 values) that is pre-definitely set by me.
Every time I input a new or old occurrence of a type of work, the adjacent cell should automatically get matched with its relevant significance value & automatically get populated in real-time.
I know how to do it using IF, IFS, and IF_OR conditions, but I feel that based on the ever-expanding types of work & significance values, the above formulas will be very big, complicated, and repetitive in the future. I feel there's a more efficient way to achieve it. Also, I don't want it to be selected from a drop-down list.
Guys, please help me out with the most efficient way to handle this. TUIA :)
Also, I've added a snapshot and a sample sheet describing the problem.
Sample sheet
XLOOKUP() may work. Try-
=XLOOKUP(D2,A2:A,B2:B)
Or FILTER() function like-
=FILTER(B2:B,A2:A=D2)
You can use this formula for a whole column:
=INDEX(IFERROR(VLOOKUP(C14:C,A2:B9,2,0)))
Adapt the ranges to your actual tables in order to include in the second argument all the potential values and their significances
This is the formula, that worked for me (for anybody's reference):
I created another reference sheet, stating the types of work & their significance. From that sheet, I'm using either vlookup, filter, xlookup.Using gforms for inputting my data.
=ARRAYFORMULA(IFS(ROW(D:D)=1,"Significance",A:A="","",TRUE,VLOOKUP(D:D,Reference!$A:$B,2,0)))

Getting the error: "missing 1 required positional argument: 'row'" when using Dataframe.apply()

I am trying to improve performance of my stock order placer algorithm (1000's of lines) by switching from using iterrows() to using apply(), but I am getting an error:
TypeError: ("place_orders() missing 1 required positional argument: 'row'", 'occurred at index 2008-01-14 00:00:00')
Below is an example of the orders file I am reading in (short list for simplicity):
Next...below is my code both my attempt at implementing apply() and the slower iterrows()
I apologize if this is a newbie question, but I need to use the index and the rows inside the function, as the index is a bunch of dates.
Update: Below is an example of my prices_table.
When switching from iterrows to apply you need to change your mindset a little bit. Instead of a looping over the dataframe and taking every row from top to bottom, you just specify what you want to happen in every row. Mostly just let go of row numbers.
So when using apply it's usually a good idea to let go of of row numbers (in you case i). Try using a function like this in your apply:
orders_df.apply(lambda row: place_orders(row), axis=1)
I realize that inside your place_orders function you are using specific (sets of) rows of the prices_table. To overcome this part you might want to merge the dataframes before calling apply, since apply is not really intended to work on multiple dataframes at once.
This forces you to rewrite some of your code, but in my experience the performance increase you gain from not using iterrows is always worth it.

AWS Quicksight calculated fields gives incorrect result for simple division

I have a dataset with fields targeted and opens and I need to add calculated field opens per targeted which essentially means doing simple devision of those 2 values.
My calculated field is as follows
{opens}/{targeted}
but then displaying simple table with values they are completely incorrect
If I try any other operator like + * etc calculations are correct.
I'm completely out of ideas on how to debug this. I've simplified the dataset to just columns of targeted and opens, can't get any simpler.
Had the same problem, I fixed it by wrapping the columns with the sum() function. Like this:
sum({opens})/sum({targeted})
I think you need to make AWS understand that you are working with float numbers.
1.0*{opens}/{targeted}
if still not working try also
(1.0*{opens})/({targeted}*1.0)
it should give you the desired output (not tested, let me know if it doesnt work)

Xlrd list index out of range

I'm just starting to explore Xlrd, and to be honest am pretty new to programming altogether, and have been working through some of their simple examples, and can't get this simple code to work:
import xlrd
book=open_workbook('C:\\Users\\M\\Documents\\trial.xlsx')
sheet=book.sheet_by_index(1)
cell=sheet.cell(0,0)
print cell
I get an error: list index out of range (referring to the 2nd to last bit of code)
I cut and pasted most of the code from the pdf...any help?
You say:
I get an error: list index out of range (referring to the 2nd to last
bit of code)
I doubt it. How many sheets are there in the file? I suspect that there is only one sheet. Indexing in Python starts from 0, not 1. Please edit your question to show the full traceback and the full error message. I suspect that it will show that the IndexError occurs in the 3rd-last line:
sheet=book.sheet_by_index(1)
I would play around with it in the console.
Execute each statement one at a time and then view the result of each. The sheet indexes count from 0, so if you only have one worksheet then you're asking for the second one, and that will give you a list index out of range error.
Another thing that you might be missing is that not all cells exist if they don't have data in them. Some do, but some don't. Basically, the cells that exist from xlrd's standpoint are the ones in the matrix nrows x ncols.
Another thing is that if you actually want the values out of the cells, use the cell_value method. That will return you either a string or a float.
Side note, you could write your path like so: 'C:/Users/M/Documents/trial.xlsx'. Python will handle the / vs \ on the backend perfectly and you won't have to screw around with escape characters.

Comparing two documents

I have two very large lists. They both were originally in excel, but the larger one is a list of emails (about 160,000) of them with other information like their name and address etc. And the smaller one is a list of just 18,000 emails.
My question is what would be the easiest way to get rid of all 18,000 rows from the first document that contain the email addresses from the second?
I was thinking regex or maybe there is another application I can use? I have tried searching online but it seems like there isn't much specific to this. I also tried notepad++ but it freezes when I try to compare these large files.
-Thank You in Advance!!
Good question. One way I would tackle this is making a C++ program [you could extrapolate the idea to the language of your choice; You never mentioned which languages you were proficient in] that read each item of the smaller file into a vector of strings. First, of course, use Excel to save the files as CSV instead of XLS or XLSX, which will comma-separate the values so you can work with them easier. For the larger list, "Save As" a copy of just email addresses, deleting the other rows for now.
Then, you could open the larger list and use a nested loop to check if you should output to an output file. Something like:
bool foundMatch=false;
for(int y=0;y<LargeListVector.size();y++) {
for(int x=0;x<SmallListVector.size();x++) {
if(SmallListVector[x]==LargeListVector[y]) foundMatch=true;
}
if(!foundMatch) OutputVector.append(LargeListVector[y]);
foundMatch=false;
}
That might be partially pseudo-code, but do you get the idea?
So I read a forum post at : Here
=MATCH(B1,$A$1:$A$3,0)>0
Column B would be the large list, with the 160,000 inputs and column A was my list of things I needed to delete of 18,000.
I used this to match everything, and in a separate column pasted this formula. It would print out either an error or TRUE. If the data was in both columns it printed out true.
Then because I suck with excel, I threw this text into Notepad++ and searched for all lines that contained TRUE (match case, because in my case some of the data had the word true in it without caps.) I marked those lines, then under search, bookmarks, I removed all lines with bookmarks. Pasted that back into excel and voila.
I would like to thank you guys for helping and pointing me in the right direction :)