List the first few rows in Stata

I have used R quite a bit and I know I can use head(data[,"column"]) or head(data) to see the first few rows/cells of data.
How can I do that in Stata?

You can use the list command for this:
list column in 1/6
or
list in 1/6
If you have a look at help list, you will discover plenty of options to customize the display.

Related

Maximum over 7 columns in Quicksight

I have a very simple problem, but unable to find a way to solve it in Quicksight.
I have data in the below form. I just want to compare the different columns (marks for different weeks) and print the highest value in a new column.
I have tried using the "Max" function, but it seems we can only use it on a single column.
The output should look like:
Any help is appreciated.
I don't think we can apply aggregate functions across columns the way Excel lets us take a max over a range, e.g. =max(c1:c5).
I have this work-around, which is manageable as long as there are not too many columns.
ifelse(
marks_week1>marks_week2,
ifelse(marks_week1>marks_week3,
ifelse(marks_week1>marks_week4,
ifelse(marks_week1>marks_week5, marks_week1, marks_week5),
ifelse(marks_week4>marks_week5, marks_week4, marks_week5)),
ifelse(marks_week3>marks_week4,
ifelse(marks_week3>marks_week5, marks_week3, marks_week5),
ifelse(marks_week4>marks_week5, marks_week4, marks_week5)))
, ifelse(marks_week2>marks_week3,
ifelse(marks_week2>marks_week4,
ifelse(marks_week2>marks_week5, marks_week2, marks_week5),
ifelse(marks_week4>marks_week5,marks_week4, marks_week5)),
ifelse(marks_week3>marks_week4,
ifelse(marks_week3>marks_week5,marks_week3, marks_week5),
ifelse(marks_week4>marks_week5,marks_week4,marks_week5)))
)
I have also posted this on the QuickSight community as a feature request, and to ask for pointers in case such a function already exists.
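The comparison logic of the work-around can also be read as a chain of pairwise maxima, which is easier to check than the fully nested form. The sketch below only illustrates that logic in Python; the `ifelse` helper is a stand-in for the calculation of the same name, the marks are made-up values, and whether QuickSight lets you chain intermediate calculated fields this way is an assumption to verify.

```python
from functools import reduce


def ifelse(condition, if_true, if_false):
    # Stand-in for the ifelse(condition, then, else) calculation used above.
    return if_true if condition else if_false


def pairwise_max(a, b):
    # One comparison step: the larger of two values.
    return ifelse(a > b, a, b)


def row_max(marks):
    # Fold the pairwise step over all columns of one row.
    return reduce(pairwise_max, marks)


# Hypothetical marks for week 1 to week 5 of one row.
row = [55, 72, 68, 90, 81]
highest = row_max(row)
print(highest)
```

Each step needs only one comparison, so adding a sixth week means one more pairwise step rather than another layer of nesting.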

Extract list from range Google Sheets

I have data from several workplaces, each with different work areas. I need to extract, for each workplace, a list of its available work areas. I have an example of an attempt that comes close to what I want. I currently use the formula =IF(D2=$G$1, "Yes", "No"), but with more data this will take a long time. I would like to automate it with formulas, but I don't know where to start.
Give the formula below a try. Put it in cell G1, then drag down as needed.
=TRANSPOSE(IFERROR(FILTER($D$2:$D$16,$A$2:$A$16=F2,$D$2:$D$16<>""),""))

How can I resolve INDEX MATCH errors caused by discrepancies in the spelling of names across multiple data sources?

I've set up a Google Sheets workbook that synthesizes data from a few different sources via manual input, IMPORTHTML and IMPORTRANGE. Once the data is populated, I'm using INDEX MATCH to filter and compare the information and to RANK each data set.
Since I have multiple data inputs, I'm running into a persistent issue of names not being written exactly the same between sources, even though they're the same person. First names are the primary culprit (i.e. Mary Lou vs Marylou vs Mary-Lou vs Mary Louise) but some last names with special symbols (umlauts, accents, tildes) are also causing errors. When Sheets can't recognize a match, the INDEX MATCH and RANK functions both break down.
I'm wondering how to better unify the data automatically so my Sheet understands that each occurrence is actually the same person (or "value").
Since you can't edit the results of an IMPORTHTML directly, I've set up "helper columns" and used functions like TRIM and SPLIT to try and fix instances as I go, but it seems like there must be a simpler path.
It feels like IFS could work but I can't figure how to integrate it. Also thinking this may require a script, which I'm just beginning to study.
Here's a simplified example of what I'm trying to achieve and the corresponding errors: Sample Spreadsheet
The first tab is attempting to pull and RANK data from tabs 2 and 3. Sample formulas from the Summary tab, row 3 (Amelia Rose):
Cell B3: =INDEX('Q1 Sales'!B:B, MATCH(A3,'Q1 Sales'!A:A,0))
Cell C3: =RANK(B3,$B$2:B,1)
Cell D3: =INDEX('Q2 Sales'!B:B, MATCH(A3,'Q2 Sales'!A:A,0))
Cell E3: =RANK(D3,$D$2:D,1)
I'd be grateful for any insight on how to best index 'Q2Sales'!B3 as the correct value for 'Summary'!D3. Thanks in advance - the thoughtful answers on Stack Overflow have gotten me this far!
To cope with differences in capitalization, hyphens, and spacing, do it like this:
=ARRAYFORMULA(IFERROR(VLOOKUP(LOWER(REGEXREPLACE(A2:A, "-|\s", )),
{REGEXEXTRACT(LOWER(REGEXREPLACE('Q2 Sales'!A2:A, "-|\s", )),
TEXTJOIN("|", 1, LOWER(REGEXREPLACE(A2:A, "-|\s", )))), 'Q2 Sales'!B2:B}, 2, 0)))
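Since the question mentions that a script might be needed, here is a minimal Python sketch of the normalization idea behind the formula: lowercase each name and strip hyphens and whitespace before matching. Removing accents and umlauts is an extra step that the formula does not perform, and the function name, the sample names, and the sales figure are all hypothetical.

```python
import re
import unicodedata


def normalize_name(name):
    # Decompose accented letters (e.g. "ü" becomes "u" plus a combining mark),
    # then drop the combining marks. This step goes beyond the formula above.
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Lowercase and remove hyphens and whitespace, like the "-|\s" pattern.
    return re.sub(r"[-\s]", "", without_marks.lower())


# Hypothetical lookup table keyed by the normalized name.
q2_sales = {"Mary-Lou Müller": 120}
index = {normalize_name(name): value for name, value in q2_sales.items()}

# A differently spelled name from another source still finds the same row.
print(index.get(normalize_name("Marylou Muller")))
```

This does not resolve variants such as "Mary Louise", which normalizes to a different key; those would need an explicit alias table or a fuzzy match.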

How to report a list in Behaviorspace NetLogo?

I am running a NetLogo model in BehaviorSpace, with a varying number of runs each time. I have a turtle breed, pigs, and each pig accumulates a table with patch-types as keys and the number of visits to each patch-type as values.
In the end I calculate a list of mean number of visits from all pigs. The list has the same length as long as the original table has the same number of keys (number of patch-types). I would like to export this mean number of visits to each patch-type with BehaviorSpace.
Perhaps I could write a separate csv file (tried - creates many files, so lots of work later on putting them together). But I would rather have everything in the same file output after a run.
I could make a global variable for each patch-type but this seems crude and wrong. Especially if I upload a different patch configuration.
I tried just exporting the list, but then in Excel I see it with brackets e.g. [49 0 31.5 76 7 0].
So my question Q1: is there a proper way to export a list of values so that in BehaviorSpace table output csv there is a column for each value?
Q2: Or perhaps there is an example of how to output a single csv that looks exactly as I want it from BehaviorSpace?
PS: In my case the patch types are costs. And I might change those in the future and rerun everything. Ideally I would like to have as output: a graph of costs vs frequency of visits.
Thanks
If the lists are a fixed length that doesn't vary from run to run, you can get the items into separate columns by using one metric for each item. So in your BehaviorSpace experiment definition, instead of putting mylist, put item 0 mylist and item 1 mylist and so on.
If the lists aren't always the same length, you're out of luck. BehaviorSpace isn't flexible that way. You would have to write a separate program (in the programming language of your choice, perhaps NetLogo itself, perhaps an Excel macro, perhaps something else) to postprocess the BehaviorSpace output and make it look how you want.
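As one possible shape for such a postprocessing program, here is a minimal Python sketch that splits a bracketed list column into one column per item. The column names `run` and `mean-visits` and the sample rows are hypothetical, and the sketch assumes a flat list of numbers. A real BehaviorSpace table file is assumed to contain extra header lines before the column names, which would need to be skipped first.

```python
import csv
import io


def expand_list_column(rows, column):
    """Replace a bracketed list such as "[49 0 31.5]" with one column per item."""
    expanded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        items = row[column].strip().strip("[]").split()
        for i, item in enumerate(items):
            new_row[f"{column}_{i}"] = float(item)
        expanded.append(new_row)
    return expanded


# Hypothetical stand-in for the data part of a BehaviorSpace table file.
sample = io.StringIO(
    "run,mean-visits\n"
    "1,[49 0 31.5 76 7 0]\n"
    "2,[12 3 8]\n"
)
rows = list(csv.DictReader(sample))
result = expand_list_column(rows, "mean-visits")

# Collect every column name in first-seen order so shorter lists leave blanks.
fieldnames = []
for row in result:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(result)
print(out.getvalue())
```

Runs with shorter lists simply get empty cells in the trailing columns, so lists of different lengths can share one output file.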

How to get the natural indices in pandas?

I wanted to get the unique values in a DataFrame. I used drop_duplicates(), which gave me the unique values, but the index is no longer in natural order. It looks something like 0, 1, 5, 9, 15, etc. I want a natural index like 0, 1, 2, 3, 4, etc. after calling drop_duplicates(). How can I do that?
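Just reset the index after dropping the duplicates, i.e.:
df = df.reset_index(drop=True)
or, equivalently:
df.index = range(len(df))
Either line should do the trick! Without drop=True, reset_index() keeps the old index as a new column named index. For more information on what this does and its arguments, you can read the documentation below.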
http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.reset_index.html
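A minimal sketch of the renumbering described in this answer, using a made-up DataFrame (the column name `value` and the data are assumptions for illustration):

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"value": [3, 3, 7, 7, 7, 9]})

# drop_duplicates keeps the original labels of the surviving rows (0, 2, 5).
unique = df.drop_duplicates()
labels_before = list(unique.index)

# drop=True discards the old labels instead of adding them as a column.
unique = unique.reset_index(drop=True)

print(labels_before)
print(unique)
```

After the reset, the rows are labeled 0, 1, 2 and the frame still has only the `value` column.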