I'm trying to add a dataset whose values are stored in hidden inputs in a form.
Here are the inputs, which are set using Ajax. I already extract them to build a table, but now I want to add this dataset to a Chart.js chart.
<input type="hidden" name="graphlabels" id="graphlabels" value="'Job 1 Year 1 $51,390.00','Job 1 Year 2 $91,845.00','Job 1 Year 3 $98,667.00','Job 2 Year 1 $88,886.40','Job 2 Year 2 $124,406.40','Job 2 Year 3 $136,675.20','Job 2 Promotion Year 4 $222,585.60','Job 2 Promotion Year 5 $224,688.00','Job 2 Promotion Year 6 $226,944.00','Job 2 Promotion Year 7 $229,670.40','Job 2 Promotion Year 8 $234,566.40','Job 2 Promotion Year 9 $239,664.00','Job 2 Promotion Year 10 $246,556.80','Job 2 Promotion Year 11 $250,310.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40'"></input>
<input type="hidden" name="graphdata" id="graphdata" value="51390,91845,98667,88886.4,124406.4,136675.2,222585.6,224688,226944,229670.4,234566.4,239664,246556.8,250310.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4"></input>
Then I use this function to add the dataset but it doesn't work.
function addgraphdata()
{
var labeld= "[" + $('#graphlabels').val() + "]";
var datad = "[" + $('#graphdata').val() + "]";
console.log ("Labels:"+labeld);
console.log ("Data:"+datad);
var newDataset = {
labels: labeld,
data: datad
}
data.datasets.push(newDataset);
console.log(newDataset);
myPayChart.update();
};
I've toyed around with it for a while now and got nowhere. Does anyone have any ideas?
The labels must not be placed inside the dataset but one level higher, inside data. Also, the hidden values must be converted into proper arrays. Note that for the labels, I first remove the leading and trailing quotes, then split the string on the separator ',' (quote, comma, quote), so that the commas inside the dollar amounts are left alone.
Please take a look at your amended code below and see how it can be done.
function addgraphdata() {
var labeld = $('#graphlabels').val().replace(/^'(.+(?='$))'$/, '$1').split('\',\'');
var datad = $('#graphdata').val().split(',').map(Number); // convert each entry to a number
var newDataset = {
label: 'Amount',
data: datad,
backgroundColor: 'rgb(0, 0, 255)'
}
myPayChart.data.labels = labeld;
myPayChart.data.datasets.push(newDataset);
myPayChart.update();
};
const myPayChart = new Chart('chart', {
type: 'bar',
data: {
datasets: []
},
options: {
scales: {
y: {
beginAtZero: true
}
}
}
});
addgraphdata();
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/3.7.0/chart.min.js"></script>
<canvas id="chart"></canvas>
<input type="hidden" name="graphlabels" id="graphlabels" value="'Job 1 Year 1 $51,390.00','Job 1 Year 2 $91,845.00','Job 1 Year 3 $98,667.00','Job 2 Year 1 $88,886.40','Job 2 Year 2 $124,406.40','Job 2 Year 3 $136,675.20','Job 2 Promotion Year 4 $222,585.60','Job 2 Promotion Year 5 $224,688.00','Job 2 Promotion Year 6 $226,944.00','Job 2 Promotion Year 7 $229,670.40','Job 2 Promotion Year 8 $234,566.40','Job 2 Promotion Year 9 $239,664.00','Job 2 Promotion Year 10 $246,556.80','Job 2 Promotion Year 11 $250,310.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40','Job 2 Promotion Year 12 $255,638.40'"></input>
<input type="hidden" name="graphdata" id="graphdata" value="51390,91845,98667,88886.4,124406.4,136675.2,222585.6,224688,226944,229670.4,234566.4,239664,246556.8,250310.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4,255638.4"></input>
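The parsing step can also be tried on its own, without jQuery or a chart. The following is a minimal sketch in plain JavaScript; the helper names parseQuotedList and parseNumberList are hypothetical, and the sample strings are shortened copies of the hidden-input values.

```javascript
// Sketch only: parseQuotedList and parseNumberList are hypothetical helper names.

// Turn "'a, b','c'" into ["a, b", "c"]. Splitting on the three-character
// separator ',' (quote, comma, quote) keeps commas inside a label intact.
function parseQuotedList(raw) {
  const trimmed = raw.trim();
  if (trimmed === '') return [];
  return trimmed.replace(/^'/, '').replace(/'$/, '').split("','");
}

// Turn "1,2.5,3" into [1, 2.5, 3] so the chart receives numbers, not strings.
function parseNumberList(raw) {
  const trimmed = raw.trim();
  if (trimmed === '') return [];
  return trimmed.split(',').map(Number);
}

const labels = parseQuotedList("'Job 1 Year 1 $51,390.00','Job 1 Year 2 $91,845.00'");
const values = parseNumberList('51390,91845');
console.log(labels, values);
```

The two arrays could then be assigned to the chart's data.labels and to a dataset's data, as in the function above.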
I want to generate a variable max_count that, within each group ID, holds the highest value of count observed up to and including the current year. If count for the current year is higher than every earlier value, max_count takes the current value; that value is then carried forward to succeeding years until a higher value occurs. For instance, in the example below for ID 2, count in 2001 is 10, but the succeeding years (2002 and 2003) have lower values (2 and 4), so max_count for 2002 and 2003 stays at 10 (the highest value so far).
I used this Stata code but it doesn't work:
bysort id (Year): gen max_count=max(count, count[_n-1])
The highest value is only applied to the immediately succeeding year and not to all succeeding years.
ID Year count max_count
1 2000 5 5
1 2001 0 5
1 2002 3 5
1 2003 7 7
2 2000 5 5
2 2001 10 10
2 2002 2 10
2 2003 4 10
3 2000 2 2
3 2001 5 5
3 2002 9 9
3 2003 6 9
clear
input ID Year count max_count
1 2000 5 5
1 2001 0 5
1 2002 3 5
1 2003 7 7
2 2000 5 5
2 2001 10 10
2 2002 2 10
2 2003 4 10
3 2000 2 2
3 2001 5 5
3 2002 9 9
3 2003 6 9
end
bysort ID (Year) : gen wanted = count[1]
by ID : replace wanted = max(wanted[_n-1], count) if _n > 1
list, sepby(ID)
+---------------------------------------+
| ID Year count max_co~t wanted |
|---------------------------------------|
1. | 1 2000 5 5 5 |
2. | 1 2001 0 5 5 |
3. | 1 2002 3 5 5 |
4. | 1 2003 7 7 7 |
|---------------------------------------|
5. | 2 2000 5 5 5 |
6. | 2 2001 10 10 10 |
7. | 2 2002 2 10 10 |
8. | 2 2003 4 10 10 |
|---------------------------------------|
9. | 3 2000 2 2 2 |
10. | 3 2001 5 5 5 |
11. | 3 2002 9 9 9 |
12. | 3 2003 6 9 9 |
+---------------------------------------+
There is a detailed discussion of how to get such records (the maximum or minimum so far is the "record", as in sport) in this Stata FAQ.
For a one-line solution, install rangestat from SSC and then
rangestat (max) WANTED = count, int(Year . 0) by(ID)
The problem of when the record occurred is naturally related:
by ID : gen when = Year[1]
by ID : replace when = cond(wanted > wanted[_n-1], Year, when[_n-1]) if _n > 1
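To check the logic outside Stata, the running-maximum rule and the "when" rule above can be sketched in plain JavaScript. This is only an illustration: runningMax is a hypothetical helper, the field names mirror the example data, and missing values are assumed not to occur.

```javascript
// Sketch only: runningMax is a hypothetical helper; no missing values assumed.
function runningMax(rows) {
  // Sort a copy by ID, then Year, mirroring bysort ID (Year).
  const sorted = [...rows].sort((a, b) => a.ID - b.ID || a.Year - b.Year);
  let prev = null;
  return sorted.map((row) => {
    const firstInGroup = prev === null || prev.ID !== row.ID;
    // A new record occurs on the first row of a group or when count beats the record so far.
    const isRecord = firstInGroup || row.count > prev.wanted;
    const out = {
      ...row,
      wanted: isRecord ? row.count : prev.wanted,
      when: isRecord ? row.Year : prev.when,
    };
    prev = out;
    return out;
  });
}

const example = [
  { ID: 2, Year: 2000, count: 5 },
  { ID: 2, Year: 2001, count: 10 },
  { ID: 2, Year: 2002, count: 2 },
  { ID: 2, Year: 2003, count: 4 },
  { ID: 3, Year: 2000, count: 2 },
  { ID: 3, Year: 2001, count: 5 },
];
const records = runningMax(example);
console.log(records.map((r) => [r.ID, r.Year, r.wanted, r.when]));
```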
I'm trying to figure out a concise way to keep only the two years before and after the year in which an event takes place using daily panel data in Stata. The panel is unbalanced. Ultimately, I'm trying to conduct an event study but I experienced issues because the unique groups report inconsistent years.
The data looks something like this:
ID year month day event
1 1999 1 1 0
1 1999 1 2 0
1 1999 1 3 0
1 1999 1 4 0
1 1999 1 5 0
1 1999 1 6 0
1 1999 1 7 0
1 1999 1 8 0
1 1999 1 9 0
1 1999 1 10 0
1 1999 1 11 0
1 1999 1 12 0
1 1999 1 13 0
1 1999 1 14 0
1 1999 1 15 0
1 1999 1 16 0
1 1999 1 17 0
1 1999 1 18 0
1 1999 1 19 0
1 1999 1 20 0
1 1999 1 21 0
1 1999 1 22 0
1 1999 1 23 0
1 1999 1 24 0
1 1999 1 25 0
1 1999 1 26 0
1 1999 1 27 0
1 1999 1 28 0
1 1999 1 29 0
1 1999 1 30 0
1 1999 1 31 0
1 1999 2 1 1
1 1999 2 2 1
In this case, the event takes place in February 1999. The event is monthly, but I need the daily data for a later part of the analysis. I want to somehow tag the 24 months before February 1999 and the 24 months after February 1999. However, I need to do this in a way that won't codify any months in 2002 if group 1 reported no data in 2000.
I got the following to work on a similar set of monthly data but I can't figure out a way to do it with daily data. Furthermore, if anyone has suggestions for a less clunky solution, I would be very appreciative.
bys ID year (month) : egen year_change = max(event)
bys ID (year month) : replace year_change = 2 if ///
(year_change[_n+24] == 1 & year[_n] == year[_n+24] - 2) | ///
(year_change[_n+12] == 1 & year[_n] == year[_n+12] - 1) | ///
(year_change[_n-12] == 1 & year[_n] == year[_n-12] + 1) | ///
(year_change[_n-24] == 1 & year[_n] == year[_n-24] + 2)
keep if year_change >= 1
It seems that your event date is the first date with event 1. So,
gen dailydate = mdy(month, day, year)
bysort ID : egen key = min(cond(event == 1, dailydate, .))
gen wanted = inrange(dailydate, key - 730, key + 730)
Check that wanted gives the dates you want and then modify the rule or keep accordingly.
This code doesn't assume that the event date is the same for each panel, but that would not be a problem.
See this paper for a review of related techniques.
For your task, I suggest working with actual Stata dates instead of relying on separate year, month, and day variables. That makes it easier to add or subtract 24 months without relying on the sort order of the data (the "_n+24" part of your code), and the coding would not suffer from the missing-data issue that you outline in the question.
I see a straightforward solution, which relies on an assumption about your setting (you did not specify it, but it is the general form of event studies): the event date is the same for all IDs, so there is no group-specific "treatment" date.
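The idea of the first answer (find each panel's earliest event date, then flag every day within 730 days of it) can be sketched in plain JavaScript as a cross-check. The names dayNumber and tagEventWindow are hypothetical, and the sketch assumes that an ID with no event at all should be flagged 0.

```javascript
// Sketch only: dayNumber and tagEventWindow are hypothetical helper names.
const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days since the Unix epoch; UTC avoids daylight-saving shifts.
function dayNumber(year, month, day) {
  return Math.round(Date.UTC(year, month - 1, day) / DAY_MS);
}

function tagEventWindow(rows, halfWidthDays = 730) {
  // First pass: earliest event day per ID.
  const key = new Map();
  for (const r of rows) {
    if (r.event !== 1) continue;
    const d = dayNumber(r.year, r.month, r.day);
    if (!key.has(r.ID) || d < key.get(r.ID)) key.set(r.ID, d);
  }
  // Second pass: flag rows inside the window.
  // Assumption: IDs that never have an event are flagged 0.
  return rows.map((r) => {
    const d = dayNumber(r.year, r.month, r.day);
    const k = key.get(r.ID);
    const wanted = k !== undefined && Math.abs(d - k) <= halfWidthDays ? 1 : 0;
    return { ...r, wanted };
  });
}

const panel = [
  { ID: 1, year: 1997, month: 1, day: 1, event: 0 },
  { ID: 1, year: 1998, month: 6, day: 1, event: 0 },
  { ID: 1, year: 1999, month: 2, day: 1, event: 1 },
  { ID: 1, year: 1999, month: 2, day: 2, event: 1 },
  { ID: 2, year: 1999, month: 1, day: 1, event: 0 },
];
const tagged = tagEventWindow(panel);
console.log(tagged.map((r) => r.wanted));
```

Because the window is defined on calendar dates rather than row offsets, gaps in the panel do not shift it.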
g stata_date = mdy(month, day, year) // generate variable with Stata date
/* Unique event on Feb 1, 1999 */
bys ID: egen treat_group = max(event) // indicator for an ID to ever be "treated"
g event_window = (stata_date >= td(01Feb1997) & stata_date < td(01Feb2001)) // indicator for event window - 2 years before and after Feb 1, 1999
g event_treatment = treat_group * event_window // indicator for a treated ID during the event window
I have a tricky question about a conditional sum in SAS. It is hard for me to explain in words, so I will show an example:
A B
5 3
7 2
8 6
6 4
9 5
8 2
3 1
4 3
As you can see, I have a dataset with two columns. First, I calculated the conditional cumulative sum of column A (I can do this myself, so I need no help with that step):
A B CA
5 3 5
7 2 12
8 6 18
6 4 8 ((12+8)-18)+6
9 5 17
8 2 18
3 1 10 ((17+8)-18)+3
4 3 14
My condition value is 18. If the cumulative sum exceeds 18, it is set to 18, and the next cumulative value is the amount by which the sum exceeded 18 plus the next value of A. (As I said, I can do this part myself.)
So the tricky part is I have to calculate the cumulative sum of column B according to column A:
A B CA CB
5 3 5 3
7 2 12 5
8 6 18 9.5 (5+(6*((18-12)/8)))
6 4 8 5.5 ((5+6)-9.5)+4
9 5 17 10.5 (5.5+5)
8 2 18 10.75 (10.5+(2*((18-17)/8)))
3 1 10 2.75 ((10.5+2)-10.75)+1
4 3 14 5.75 (2.75+3)
As you can see from the example, the cumulative sum of column B is very specific. When column CA reaches the condition value (18), we calculate the proportion of the current value of A that was needed to reach 18 and apply the same proportion to the current value of B when computing the cumulative sum of column B. The remainder of B carries over to the next row.
It looks like when the sum of A reaches 18 or more, you want to split the values of A and B between the current and the next record. One way is to remember the leftover values for A and B and carry them forward in your new cumulative variables. Just make sure to output the observation before resetting those variables.
data want ;
set have ;
ca+a;
cb+b;
if ca >= 18 then do;
extra_a=ca - 18;
extra_b=b - b*((a - extra_a)/a) ;
ca=18;
cb=cb-extra_b ;
end;
output;
if ca=18 then do;
ca=extra_a;
cb=extra_b;
end;
drop extra_a extra_b ;
run;
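As a cross-check of the arithmetic, the same split-and-carry logic can be sketched in plain JavaScript. The function name cappedCumulative is hypothetical, and the sketch assumes every A is positive and that a single row never overshoots the cap by a full cap or more.

```javascript
// Sketch only: cappedCumulative is a hypothetical helper.
// Assumes a > 0 on every row and an overshoot smaller than the cap.
function cappedCumulative(rows, cap = 18) {
  let ca = 0;
  let cb = 0;
  return rows.map(({ a, b }) => {
    ca += a;
    cb += b;
    let extraA = 0;
    let extraB = 0;
    const hitCap = ca >= cap;
    if (hitCap) {
      extraA = ca - cap;          // part of A that belongs to the next run
      extraB = b * (extraA / a);  // same proportion of B
      ca = cap;
      cb -= extraB;
    }
    const out = { a, b, ca, cb }; // record the row before resetting
    if (hitCap) {
      ca = extraA;
      cb = extraB;
    }
    return out;
  });
}

const have = [
  { a: 5, b: 3 }, { a: 7, b: 2 }, { a: 8, b: 6 }, { a: 6, b: 4 },
  { a: 9, b: 5 }, { a: 8, b: 2 }, { a: 3, b: 1 }, { a: 4, b: 3 },
];
const want = cappedCumulative(have);
console.log(want.map((r) => [r.ca, r.cb]));
```

With the question's data this gives the CA and CB columns shown in the question.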
I have a dataset that shows how much was paid ("cenoz", cents per ounce) per product category during a specific week in a specific store.
clear
set more off
input week store cenoz category
1 1 2 1
1 1 4 2
1 1 3 3
1 2 5 1
1 2 7 2
1 2 8 3
2 1 4 1
2 1 1 2
2 1 10 3
2 2 3 1
2 2 4 2
2 2 7 3
3 1 5 1
3 1 3 2
3 2 5 1
3 2 4 2
end
I create a new variable cenoz3 that holds the average amount paid for category 3 in a given week and store, and likewise cenoz1 and cenoz2.
egen cenoz1 = mean(cenoz/ (category == 1)), by(week store)
egen cenoz2 = mean(cenoz/ (category == 2)), by(week store)
egen cenoz3 = mean(cenoz/ (category == 3)), by(week store)
It turns out that category 3 was not sold in any of the stores (1 and 2) in week 3. As a result, missing values are generated.
week store cenoz category cenoz1 cenoz2 cenoz3
1 1 2 1 2 4 3
1 1 4 2 2 4 3
1 1 3 3 2 4 3
1 2 5 1 5 7 8
1 2 7 2 5 7 8
1 2 8 3 5 7 8
2 1 4 1 4 1 10
2 1 1 2 4 1 10
2 1 10 3 4 1 10
2 2 3 1 3 4 7
2 2 4 2 3 4 7
2 2 7 3 3 4 7
3 1 5 1 5 3 .
3 1 3 2 5 3 .
3 2 5 1 5 4 .
3 2 4 2 5 4 .
I would like to replace missing values of a particular week with values of the previous week and matching store. That's to say:
replace missing values for category 3 in week 3 in store 1
with values for category 3 in week 2 in store 1
and
replace missing values for category 3 in week 3 in store 2
with values for category 3 in week 2 in store 2
Can I use command replace or is it something more complicated than that?
Something like:
replace cenoz1 = cenoz1[_n-1] if missing(cenoz1)
But I also need the stores to match, not just the time variable week.
I found this code provided by Nicholas Cox at
http://www.stata.com/support/faqs/data-management/replacing-missing-values/:
by id (time), sort: replace myvar = myvar[_n-1] if myvar >= .
Do you think
by store (week), sort: replace cenoz1 = cenoz1[_n-1] if missing(cenoz1)
is sufficient?
UPDATE:
When I use the code
by store (week category), sort: replace cenoz3 = cenoz3[_n-1] if missing(cenoz3)
It seems it delivers correct values:
week store cenoz category cenoz1 cenoz2 cenoz3
1 1 2 1 2 4 3
1 1 4 2 2 4 3
1 1 3 3 2 4 3
1 2 5 1 5 7 8
1 2 7 2 5 7 8
1 2 8 3 5 7 8
2 1 4 1 4 1 10
2 1 1 2 4 1 10
2 1 10 3 4 1 10
2 2 3 1 3 4 7
2 2 4 2 3 4 7
2 2 7 3 3 4 7
3 1 5 1 5 3 10
3 1 3 2 5 3 10
3 2 5 1 5 4 7
3 2 4 2 5 4 7
Is there any way to double-check this code, given that my dataset is quite large?
How can I make this code less specific, so that it applies to any cenoz variable with missing values (cenoz1, cenoz2, cenoz3, cenoz4, ..., cenoz12)?
If you want to use the previous information for the same store and the same category, that should be
by store category (week), sort: replace cenoz3 = cenoz3[_n-1] if missing(cenoz3)
A generalization could be
sort store category week
forval j = 1/12 {
by store category: replace cenoz`j' = cenoz`j'[_n-1] if missing(cenoz`j')
}
However this carrying forward is a fairly crude method of interpolation. Consider linear, cubic, cubic spline, PCHIP methods of interpolation. Use search to find Stata programs.
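For readers who want to test the carry-forward rule on a small sample outside Stata, here is a minimal sketch in plain JavaScript. The function name carryForward is hypothetical, missing values are represented by null, and groups are defined by store and category, ordered by week.

```javascript
// Sketch only: carryForward is a hypothetical helper; null stands for missing.
function carryForward(rows, field) {
  // Sort a copy by store, category, week, mirroring by store category (week).
  const sorted = [...rows].sort(
    (x, y) => x.store - y.store || x.category - y.category || x.week - y.week
  );
  const lastSeen = new Map(); // "store|category" -> last non-missing value
  return sorted.map((row) => {
    const groupKey = `${row.store}|${row.category}`;
    const filled = row[field] ?? lastSeen.get(groupKey) ?? null;
    if (filled !== null) lastSeen.set(groupKey, filled);
    return { ...row, [field]: filled };
  });
}

const sample = [
  { week: 3, store: 1, category: 1, cenoz3: null },
  { week: 1, store: 1, category: 1, cenoz3: 3 },
  { week: 2, store: 1, category: 1, cenoz3: 10 },
  { week: 1, store: 1, category: 2, cenoz3: null },
  { week: 2, store: 1, category: 2, cenoz3: 10 },
];
const filledRows = carryForward(sample, 'cenoz3');
console.log(filledRows.map((r) => [r.store, r.category, r.week, r.cenoz3]));
```

A missing value in the first week of a group stays missing, because there is nothing earlier in that group to carry forward.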
A quick note on why your code
by store (category week), sort: replace cenoz3 = cenoz3[_n-1] if missing(cenoz3)
won't work in general.
It will work for the example dataset you give. But a slight modification can give unexpected results. Consider the following example:
clear all
set more off
input week store cenoz category
1 1 2 1
1 1 4 2 /*
1 1 3 3 deleted observation */
1 2 5 1
1 2 7 2
1 2 8 3
2 1 4 1
2 1 1 2
2 1 10 3
2 2 3 1
2 2 4 2
2 2 7 3
3 1 5 1
3 1 3 2
3 1 999 3 // new observation
3 2 5 1
3 2 4 2
end
egen cenoz1 = mean(cenoz/ (category == 1)), by(week store)
egen cenoz2 = mean(cenoz/ (category == 2)), by(week store)
egen cenoz3 = mean(cenoz/ (category == 3)), by(week store)
order store category week
sort store category week
list, sepby(store category)
*----- method 1 (your code) -----
gen cenoz3x1 = cenoz3
by store (category week), sort: replace cenoz3x1 = cenoz3x1[_n-1] if missing(cenoz3x1)
*----- method 2 (Nick's code) -----
gen cenoz3x2 = cenoz3
by store category (week), sort: replace cenoz3x2 = cenoz3x2[_n-1] if missing(cenoz3x2)
list, sepby(store category)
Method 1 will assign the price of a category 1 article to a category 2 article (observation 4 of cenoz3x1). Presumably, something you don't want. If you want to avoid this, then the groups should be based on store category and not just store.
The best place to start reading is help and the manuals.
At the moment my code reads: gen lateFirms = 1 if firmage0 != .
So at the moment the dataset which I get looks like this:
firm_id lateFirms firmage0
1
1
1
1
1
3
3
3
3
3
4
4
4
4
4
5
5
6 1 110
6
6
6
6
7
7
7
7
7
8 1 90
8
8
8
8
But what I want is this:
firm_id lateFirms firmage0
1
1
1
1
1
3
3
3
3
3
4
4
4
4
4
5
5
6 1 110
6 1
6 1
6 1
6 1
7
7
7
7
7
8 1 90
8 1
8 1
8 1
8 1
NOTE: All blank entries are missing values!
So "lateFirms" should equal 1 for every observation of a "firm_id" if at least one observation of that firm has a non-missing value of firmage0.
bysort firm_id : egen present = count(firmage0)
replace lateFirms = present > 0
The count() function of egen counts non-missings and assigns the count to all values for each firm.
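The count-then-flag idea can be sketched in plain JavaScript for a quick sanity check. The function name flagLateFirms is hypothetical, null stands for a missing value, and firms with no non-missing firmage0 get 0 here, matching the replace statement above.

```javascript
// Sketch only: flagLateFirms is a hypothetical helper; null stands for missing.
function flagLateFirms(rows) {
  // Count non-missing firmage0 values per firm.
  const present = new Map();
  for (const r of rows) {
    const has = r.firmage0 !== null && r.firmage0 !== undefined ? 1 : 0;
    present.set(r.firm_id, (present.get(r.firm_id) || 0) + has);
  }
  // Every row of a firm gets 1 if that firm's count is positive, else 0.
  return rows.map((r) => ({ ...r, lateFirms: present.get(r.firm_id) > 0 ? 1 : 0 }));
}

const firms = [
  { firm_id: 5, firmage0: null },
  { firm_id: 5, firmage0: null },
  { firm_id: 6, firmage0: null },
  { firm_id: 6, firmage0: 110 },
  { firm_id: 6, firmage0: null },
];
const flagged = flagLateFirms(firms);
console.log(flagged.map((r) => [r.firm_id, r.lateFirms]));
```

Because the count is taken over the whole firm, the position of the non-missing value within the firm does not matter.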
Maybe this helps:
bysort firm_id: gen dum = 1 if sum(firmage0) != 0
To get exactly what you want, you can use replace instead of generate:
bysort firm_id: replace lateFirms = 1 if sum(firmage0) != 0
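As @NickCox pointed out, this solution is specific to the example dataset you provided: sum() is a running sum, so it only works when the non-missing (and non-zero) value of firmage0 is in the first observation of each firm.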