Stata - repeated time values within panel

I have a dataset that has the following format:
Company|Dependent var|Independent vars|Company ID|Date|dummy1|dummy2
A|Values|Values|1|01/01/2015|0|1
A|Values|Values|1|01/01/2015|1|0
A|Values|Values|1|01/01/2014|1|0
B|Values|Values|2|01/01/2015|0|1
B|Values|Values|2|01/01/2014|0|1
As you can see, a company can have multiple values for the same period (because it is rated by two different agencies). The problem arises when I use xtset to define my panel data: Stata throws the "repeated time values within panel" error. I wish to cluster errors by company, so I define the panel data set using "xtset CompanyID Date". Is there a way to get around the error?
I wish to distinguish between the two entries that Stata perceives as the same (they are not, as the dummy variables differentiate between them) but still cluster errors based on company (using the company ID). Do I need to create a new ID? Will this lose clustering by company?
Any help would be appreciated.
Laurence
Follow up: I found that I am dealing with what is known as a multidimensional panel (e.g. y_i_j_k), not a two-dimensional panel (y_i_j), and you can't run two-dimensional commands on a panel with more than two dimensions. I therefore needed to reframe the panel into two dimensions by creating a new ID (egen newID = group(CompanyID dummy1 dummy2)). This then allows you to use two-dimensional commands. I think you can then cluster by company later using vce(cluster clustervar). Thanks
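A minimal sketch of the approach described in the follow-up, under a few assumptions that are not in the original post: y, x1 and x2 are hypothetical stand-ins for the dependent and independent variables, Date is assumed to be stored as a string such as "01/01/2015", and a fixed-effects xtreg is used only as an example of a two-dimensional panel command. Variable names are spelled as in the xtset call and the table header above; Stata is case-sensitive, so adjust them to match the actual dataset.

```stata
* y, x1, x2 are hypothetical names for the dependent and independent variables.
* Assumes Date is a string like "01/01/2015"; skip this step if it is already
* a numeric Stata date. "DMY" vs "MDY" is a guess, check it against the data.
gen date_num = date(Date, "DMY")
format date_num %td

* One panel per company-agency combination, so each (newID, date) pair is unique
egen newID = group(CompanyID dummy1 dummy2)
xtset newID date_num

* Each newID belongs to exactly one company, so the panels are nested within
* the clusters and standard errors can still be clustered by company
xtreg y x1 x2, fe vce(cluster CompanyID)
```

If no time-series operators (lags, leads, differences) are needed, another option is to declare only the panel variable with "xtset CompanyID". The repeated-time-values check applies only when a time variable is given to xtset.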

Related

Take user input to complete missing values

I need some guidance on the best way to accomplish this kind of task in the SAS EG environment (on a 9.4 server). Let's say I have a table named ITEM_EVALUATION like in the following example. The missing evaluations (rows 4, 5 and 6) should be filled in by the user. Although there may be better solutions, I would prefer that SAS iterate over the missing rows, show the user the row information (item), take the input (evaluation), and then update the table with that input.
Since this task is going to be part of another SAS EG project (egp), I need to do it within this project, so please advise...
ITEM_EVALUATION.sas7bdat
ROW|ITEM|EVALUATION
1|car|owned
2|house|rent
3|cat|none
4|phone|
5|job|
6|vacation|
7||
Make sure your dataset occurs in your project. (Drag it to a flow, for instance.)
Ask your user to double-click any empty cell in the table and agree to switch to edit mode.
If the empty cells are rare, the user can enter a filter missing(ITEM) or missing(EVALUATION) on top of the data to find them.
If that is too complex for your user, Enterprise Guide is not the tool for this person.

DAX function to get specific value from dimension?

I have a model with DimCustomer, DimSegment and FactRevenue.
For most cases each customer is associated with a single segment.
There are a handful of Customers that have two Segments. Therefore, in FactRevenue, a few Customers have two Segments associated with them.
I want to “override” it and display 'current segment' only using FactRevenue (I don’t have a connection between DimCustomer and DimSegment)
Do I have any tools in DAX to achieve this?
Is it viable to keep a mapped list of these exceptions and hardcode it somehow?

Copying data from one sheet to another based on value in a cell

I have a Google sheet with multiple sheets.
The Ambassador users sheet has a list of multiple users (ID, Email, Coupon, and three more irrelevant columns).
Each new user is updated to the sheet via Zapier.
I can have three users with coupon 1234, four with ABCD and two with XYZ.
I then create a unique sheet for each type of coupon (also via Zapier) and want to update each sheet only with the users that have the correct coupon for that sheet.
The coupon is also listed in cell J1 on each sheet.
I need the update to happen automatically without pressing any buttons.
I do not know how to use the functions in Google Sheets (I understand it's different from VBA), and I thought using a function would be the best solution.
I tried using the IF function in conjunction with the INDEX function and it worked, however, it requires me to copy the function into each row, and thus reduces the automation option.
=if('Ambassador users'!$C3=$J$1, index('Ambassador users'!A3:G3),"")
Then I tried to use the IMPORTRANGE function, and this worked, but not in conjunction with the IF
=if('Ambassador users'!$C2=$J$1, importrange("1QHGSCR_pVepNlMtjFshvGnI-vSPzgqi3g9jz98","'Ambassador users'!A2:G11"),"")
This gave me all the rows in the Ambassador users sheet.
I think I'm doing something wrong with the IF statement: the initial range I'm setting is probably wrong.
I also tried to set a range in the IF, but that didn't work at all.
Try it like this with ARRAYFORMULA:
=ARRAYFORMULA(IF('Ambassador users'!C3:C=J1, 'Ambassador users'!A3:G, ))
or perhaps FILTER:
=FILTER('Ambassador users'!A3:G, 'Ambassador users'!C3:C=J1)

Merge cells with similar but different data, different spelling

I am trying Tableau with data extracted from Salesforce. The input includes a "Country" field where the rows have different spellings for the same thing.
Example: Cananda, CANADA, CAnada etc.
Is there a way to fix this in Tableau?
The easiest solution is to create a group field based on your Country field.
Select Country in the data pane on the left sidebar, right-click and choose Create Group. Select the elements that you want to group together and put them into a single group, say Canada, that contains all variations of spelling.
This new group field initially has a name of Country (group). You may want to rename it Country_Corrected. (Or even better, rename the first field, Country_Original, and call the group field simply Country. Then you can hide Country_Original)
Groups are implemented using SQL case statements. They have many uses, but one application is to easily tolerate some inconsistent spellings in your data source without having to change your data. In general, you can specify several transformations like this that take effect at query and visualization time. For very large data sets, or for very complicated transformations, you may eventually want to push some of them upstream in your data pipeline to get better performance. But make those optimizations later when you've proven the necessity.
If the differences are just in case (upper vs lower), you can right-click the Country dimension, and create a calculated field called something like "New Country", and use the following formula to make the case consistent:
upper([Country])
Use this new "New Country" calc dimension instead of your "Country" dimension, and it will group them all without case sensitivity, and display as uppercase. Or you can use "lower" instead of "upper" if preferred.

How to Query Large Sharepoint 2013 Lists in Infopath 2010?

I'm designing an Infopath form to help guide people in a data creation process. The form needs to draw from a Sharepoint list that contains around 19,000 rows, each with six columns that contain attributes (Column 1 = Attribute A, Column 2 = Attribute B, etc.). I've reduced the first three columns to their own lists, which contain only a few hundred unique entries each, if that. When I get to Column 4, there are 8,000 unique entries, which makes querying the list outright impossible.
In an attempt to get around the item limitation, I've created an Infopath form with a data connection to the list (which does not automatically query when the form is loaded). Additionally, I've added drop-downs that set values for the queryFields of the secondary data source (one for Column 1, another for Column 2, and another for Column 3). On the last drop-down, I set an action to query the database, but I still get the error regarding limitations and that rules cannot be applied.
Is there any way to "pre-filter" the data connection so that I can bypass the limitation by only drawing the data I need? Am I going about this the right way?
Any guidance would be greatly appreciated.
Are you able to add indexes to your list columns that you intend to query on? I've found that I can get around the error message on list limits if I go to the list and add an index for the columns that I will be setting as query fields prior to running my query data connection.